~jojo/Carth

bf37062008990c2958909f1d69b3a4433fe65ebc — JoJo 3 months ago adac674
update TODO
1 files changed, 93 insertions(+), 18 deletions(-)

M TODO.org
M TODO.org => TODO.org +93 -18
@@ 813,10 813,86 @@ Features and other stuff to do/implement in/around Carth.

  Ooh, this seems cool:
  https://mapping-high-level-constructs-to-llvm-ir.readthedocs.io/en/latest/README.html
* INACTIVE Alternative backends
  LLVM is big, complex, and moves fast. Can we use something simpler?
* INACTIVE Var pattern syntax, comparison
  What if we did

  #+BEGIN_SRC carth
  (define (foo x pair)
    (match pair
      (case [x (let y)] (Some y))
      (case [_ _] None)))
  #+END_SRC

  instead of

  #+BEGIN_SRC carth
  (define (foo x pair)
    (match pair
      (case [x' y] (if (= x x')
                       (Some y)
                     None))))
  #+END_SRC
* TODO Move from LLVM to alternative backend
  LLVM is kind of not great in some ways. It's often not trivial to
  debug errors stemming from displeasing LLVM. It updates frequently,
  but the Haskell bindings lag behind, so I have to use an older
  version or start maintaining llvm-hs myself. The project is
  *massive*, and most of the stuff I don't need. Sure, it's nice being
  able to target practically any backend, but I don't *actually* care
  about most of them. And there exist *so many* optimization passes,
  but most of them actually improve the performance of the binary very
  little, while bumping the compiletime a not insignificant bit.

  I want to use something simpler.

  To make the transition smooth, and to allow for easier debugging of
  codegen in the future, I think it would be a good idea to add an
  interpreter, like the one we had before, but now supporting FFI
  calls so that std-rs can be used as well. Really, the amount of code
  would not be huge, and it would be incredibly nice to have something
  to compare to when debugging low-level stuff. Also, I want to get
  rid of LLVM right away, but I'm not sure about what to replace it
  with just yet, so an interpreter is needed in the meantime.

** INACTIVE Optional step: Add low-level intermediate representation in Carth
   Would require less work to change backend or add multiple ones if I
   just have to translate from a low-level IR to the backend code,
   instead of all the way from an AST. Might also be good for the
   interpreter to run at a lower level, but not sure.

** TODO Step 1: Re-add interpreter for pure Carth code
   Fairly self-explanatory. Just operate on whatever is returned by
   the Optimize pass. Make sure to add / translate as many test-cases
   as possible to work without ~extern~ declarations, so that I can
   ensure as few correctness regressions as possible.

** NEXT Step 2: Support ~extern~ in interpreter
   This may not be trivial, but I think it won't be too hard. Can get
   some stuff from the codegen.

   Use [[https://hackage.haskell.org/package/libffi][libffi]] for dynamic FFI calls with runtime type info.

   How to convert data from Haskell to C? Functions for primitive
   types in libffi. For complex datatypes, I'm sure there are libraries
   for converting to bytes directly.

   Use sizeof and alignmentof from codegen module.
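
   For reference, a rough sketch of what size/alignment helpers like
   these have to compute when laying out a struct for a C callee. This
   is not the code in the codegen module: the names ~Layout~,
   ~align_up~ and ~struct_layout~ are made up for illustration, and it
   assumes ordinary C struct layout rules (each field at the next
   offset that satisfies its alignment, total size padded to the
   largest field alignment).

```c
#include <stddef.h>

/* Hypothetical description of a type: size and alignment in bytes.
   The alignment is assumed to be a power of two. */
typedef struct {
    size_t size;
    size_t align;
} Layout;

/* Round n up to the next multiple of align (a power of two). */
static size_t align_up(size_t n, size_t align) {
    return (n + align - 1) & ~(align - 1);
}

/* Compute the layout of a C-style struct from its field layouts.
   If offsets is non-NULL, the byte offset of each field is stored in it. */
static Layout struct_layout(const Layout *fields, size_t n, size_t *offsets) {
    Layout out = { 0, 1 };
    for (size_t i = 0; i < n; i++) {
        /* Place the field at the next offset satisfying its alignment. */
        size_t off = align_up(out.size, fields[i].align);
        if (offsets != NULL) {
            offsets[i] = off;
        }
        out.size = off + fields[i].size;
        /* The struct is as strictly aligned as its strictest field. */
        if (fields[i].align > out.align) {
            out.align = fields[i].align;
        }
    }
    /* Tail padding, so that arrays of the struct stay aligned. */
    out.size = align_up(out.size, out.align);
    return out;
}
```

   With fields of (size, align) = (1, 1), (8, 8), (4, 4), the offsets
   come out as 0, 8, 16, and the whole struct is 24 bytes with
   alignment 8.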

** NEXT Step 3: Remove LLVM support
   yeah

** INACTIVE Step 4: Add new native codegen backend
  Investigate QBE, Cranelift, GNU Lightning, libgccjit, GCC, MIR.

  #+BEGIN_QUOTE Candidates
  - C :: I.e., spit out C source and call out to ~cc~. Very portable
    (every platform has a C compiler). Not very elegant. Does not
    natively support tail call elimination, so would have to do that
    myself (true for pretty much everything except llvm though). Used
    by respectable languages like Nim and Haskell (sort of).
  - C-- :: Similar to C, but even more "portable assembly
    language". Created by SPJ and friend, specifically for being
    generated by compilers. Fork called Cmm used by GHC.
  - LLVM :: Approx 5 million LOC. Many targets, OK usability, but
    breaking changes sometimes and big and scary.
  - GCC :: Even bigger than LLVM. Also many targets. Not very good


@@ 843,22 919,21 @@ Features and other stuff to do/implement in/around Carth.
    atm, including AMD64 and Aarch64, and it seems relatively easy to
    add a new one. I've found 2 languages that make use of MIR to
    study: [[https://github.com/grame-cncm/faust][Faust]] and [[https://github.com/dibyendumajumdar/ravi][Ravi]].
* INACTIVE Var pattern syntax, comparison
  What if we did
  #+END_QUOTE

  #+BEGIN_SRC carth
  (define (foo x pair)
    (match pair
      (case [x (let y)] (Some y))
      (case [_ _] None)))
  #+END_SRC
  In the end, I most like the look of MIR. It seems to make good
  tradeoffs.

  instead of
  Compiling to C comes at second place. Incredibly portable, and .c
  files would be a lot more readable than .ll files. Would lose the
  GDB source-line from DWARF stuff though, but that shit kinda sucked
  anyways. Function names would work as well, if not better than in
  LLVM, since the names would be kept in the C, and C compilers
  probably output much better DWARF than I ever could.

  #+BEGIN_SRC carth
  (define (foo x pair)
    (match pair
      (case [x' y] (if (= x x')
                       (Some y)
                     None))))
  #+END_SRC
  Maybe I'll do both? If I just add a low-level IR that's just above the
  level of the union of C and MIR it ought to be quite simple to
  translate from that to whatever backendest backend.
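
  On the tail call point from the C candidate: one common way to do
  tail call elimination "myself" in generated C is a trampoline, where
  a function returns a description of the next call instead of making
  it. A minimal sketch, assuming that approach (nothing is decided
  here, and ~Step~, ~run~, ~is_even~ and ~is_odd~ are just
  illustrative names):

```c
#include <stddef.h>
#include <stdint.h>

/* One step of a trampolined computation: either the next function to
   call with its argument, or (fn == NULL) a finished result. */
typedef struct Step {
    struct Step (*fn)(int64_t arg);
    int64_t arg;
    int64_t result;
} Step;

static Step is_odd(int64_t n);

/* Mutually recursive functions in tail position. Instead of calling
   each other, they return the call they would have made. */
static Step is_even(int64_t n) {
    if (n == 0) {
        return (Step){ NULL, 0, 1 };
    }
    return (Step){ is_odd, n - 1, 0 };
}

static Step is_odd(int64_t n) {
    if (n == 0) {
        return (Step){ NULL, 0, 0 };
    }
    return (Step){ is_even, n - 1, 0 };
}

/* The trampoline: a plain loop, so the C stack does not grow no
   matter how long the chain of tail calls is. */
static int64_t run(Step s) {
    while (s.fn != NULL) {
        s = s.fn(s.arg);
    }
    return s.result;
}

/* Entry point with an ordinary C signature. */
static int64_t is_even_entry(int64_t n) {
    return run((Step){ is_even, n, 0 });
}
```

  The cost is an indirect call per step and a uniform calling
  convention for every function that can be tail called, which is
  something the low-level IR would have to account for.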

** References
   - [[https://gist.github.com/zeux/3ce4fcc3a43072b4315abde95319ecb6][How does clang 2.7 hold up in 2021?]]