~jojo/Carth

ref: 4787862839d89f42711d9e708a5af3e5f5f7311a Carth/TODO.org -rw-r--r-- 39.7 KiB
47878628JoJo Simplify Codegen by using classes for the tail-call stuff 3 months ago
                                                                                
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
#+TITLE: The Carth programming language

Features and other stuff to do/implement in/around Carth.

* INACTIVE Package system

* NEXT Module system
  Postfix syntax for module paths? A bit like web-domains -
  "sub.main.top". E.g. "vector.collections.std".  Most relevant
  information floats to the left. Maybe a good idea, maybe
  not. Consider it.

  Look at ML modules.

** INACTIVE Allow conflicting imports if unambiguous?
   I'm thinking something that would allow the following. It would be
   less annoying than having to qualify everything. Also, gotta think
   about how this relates to overloading à la C++.

   #+BEGIN_SRC carth
   (module Foo
           (data FooThing First Second)
           (define: isFirst
               (Fun FooThing Bool)
             (fun-match
               [First True]
               [Second False])))

   (module Bar
           (data BarThing First Second)
           (define: isFirst
               (Fun BarThing Bool)
             (fun-match
               [First True]
               [Second False])))

   ;; First, there should be no error for just importing modules with conflicting
   ;; defs. This is ok in Haskell, unless one of the conflicting defs is used.
   (import Foo)
   (import Bar)

   ;; Second, it should be allowed to use one of a set of conflicting defs if the
   ;; type makes it unambiguous....

   ;; either explicitly
   (define: x FooThing First)
   (define: y BarThing First)

   ;; or implicitly
   (define t (isFirst x))
   (define u (isFirst y))
   #+END_SRC

* NEXT Typeclasses
  Need some kind of system like type classes for ad hoc
  polymorphism. Maybe Haskell style type classes, Agda style
  implicits, or Ocaml style modules. Not sure.

  "Type classes are functions from types to expressions"
  https://youtu.be/5QQdI3P7MdY?t=920. Interesting thought! Can we view
  type families the same way, but functions from types to types or
  smth? Maybe we can come up with more intuitive terminology.

** Agda style classes w implicit args
   https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#implicit-parameters

   In Haskell, you can only have a single instance of a specific
   typeclass for a specific type. This doesn't always make
   sense. Consider Semigroup for Int. Both + and * make sense, but we
   can only have one unless we goof around with newtypes etc, and that
   kinda sucks.

   Consider an approach more like agda. That model is more lika basic
   Hindley-Milner + dictionsry passing, except the "typeclass"
   argument can be passed implicitly with the {} syntax! That seems
   really cool.

   I'm not sure how implicit arguments work though. Does the compiler
   just look at all available bindings and pick the first/only
   available variable of that type?

   https://agda.readthedocs.io/en/v2.5.2/language/implicit-arguments.html

   https://agda.readthedocs.io/en/v2.5.2/language/instance-arguments.html

   Or just do it kind of Haskell style, but give the instances names
   and allow multiple, overlapping instances, raisi g an error if the
   instance is ambiguous somehow.

   Problem with instances as implicit arguments:
   https://youtu.be/2EdQFCP5mZ8?t=1259.  We'd have to know exactly
   which instances exist for the same type, and from where they're
   imported and what scoping they'll have. That sucks. Another
   horrible thing: imagine creating a sorted list with one instance, and doing
   a sorted lookup with another (accidentally or not), you could an incorrect
   result with no error from the compiler!

   Maybe an alternative could be to have both ~primary~ and
   ~secondary~ instances, where the primary instances may not overlap
   or be orphaned, like Rust, but may be passed implicitly, while
   secondary instances may overlap and be orphaned, but must be
   "overriden"/passed explicitly.

   But that may also not work. For the following code,

   #+BEGIN_SRC haskell
   foo :: Foo a => a -> a
   foo = bar

   bar :: Foo a => a -> a
   bar = ...
   #+END_SRC

   consider that we call ~foo~ with an explicit secondary
   instance. What instance will ~bar~ be given? If we must pass
   secondary instances explicitly, it seems ~bar~ would get the
   primary instance, and ~foo~ and ~bar~ would be called with
   different instances. BAD!

   Probably last update for this section: [[https://old.reddit.com/r/haskell/comments/765ogm/multiple_type_class_instances_for_the_same_type/][this thread]] has convinced me
   that Haskell-/Rust-style typeclasses is the best idea.

* NEXT Linear types
  Linear types would allow predictable performance and behaviour of
  e.g. IO tasks. Force a single manual file-close or
  buffer-flush. Force a single free for malloc.  Affine types would
  allow better performance.  E.g. pure, in-place modification of
  array.  If noone else points to it, value can be consumed and
  modified rather than cloned. Something like: ~fn push(mut v:
  Vec<i32>, x: i32) -> Vec<i32> { v.push(x); v }~ Implemented as maybe
  a wrapper, or an interface?  Maybe like in haskell with lolly
  operator?

  Things to consider: Linear arrow vs. `kind` approach or similar?

  Check out Idris Uniqueness types, Linear Haskell's linear arrows,
  and however Blodwen does it (linear arrows kind of I think).

* NEXT Higher kinded types

* INACTIVE Type families / functional dependencies and multi-param classes / Dependent types
  I'm on the fence here, but the consensus seems to be that type
  families are better than fundeps. Also, it might be possible to
  avoid needing to implement Multi-parameter typeclasses if type
  families are available to compensate. Seems that would reduce
  ambiguities and mental overhead a bit.

  Neither type families or fundeps are necessary if we have dependent
  types, but that would likely bring difficulties of it's own.

  Type families in Haskell vs Dependent types in a pseudo-Haskell vs
  Dependent types in Agda:

** Sketch
   The wiki page is
   good. https://en.wikipedia.org/wiki/Type_family. Haskell wiki also
   has some interesting notes
   https://wiki.haskell.org/GHC/Type_families.

   https://en.wikipedia.org/wiki/Lambda_cube

   Does it complicate typechecking? It's not obvious to me how it
   would?

   In haskell, type families and data families are always
   open. Probably fine to keep it that way? Not sure the complexity of
   having both open and closed versions are worth it?

   Relations:
   - Function :: Value -> Value
   - Typeclass :: Type -> Values
   - Typefamily :: Type -> Type
   - Dependent type :: Value -> Type

   I don't love the names "family" and "class". Could we use something
   that makes more clear the relations above? Like "type function" or
   something? Although, I guess at least "class" wouldn't be so bad to
   keep, for familiarity reasons.

   Do we need data families as well? I'd prefer not to have to add
   them also. A little bit of inconvenience remaining is worth it if
   we can avoid a lot of complexity in the language.

   Observation: Type families are just type aliases, but we can
   pattern match on the input.

   Observation: A typeclass with associated types is basically an
   extension of normal typeclasses that makes it (Type -> (Type,
   Value)). Defining an associated type in an instance of a typeclass
   is basically a way of allowing one to add cases to the pattern
   matching after definition. Consider this:

   #+BEGIN_SRC carth
   (type (Foo a)
     (Match a
            (case Bar Int)
            (case Baz Bool)))
   #+END_SRC

   this is the same as

   #+BEGIN_SRC carth
   (class (Foo' a)
     (type (Foo a)))

   (instance (Foo' Bar)
     (type (Foo Bar) Int))

   (instance (Foo' Baz)
     (type (Foo Baz) Bool))
   #+END_SRC

   The difference being that with the typeclass version of
   typefamilies, cases/definitions can be separated from the
   declaration, and user modules can extend the type family by adding
   another instance.

   #+BEGIN_SRC carth
   ;; Warning: some pseudocode and unimplemented features

   ;; The different possible forms, which would be basically
   ;; equivalent. Each could be convenient, but not sure if
   ;; it's a good idea to implement all.

   ;; Single case

   ;; Alias form
   (type (Option a) (Maybe a))

   ;; <=> closed case form
   (type (Option a)
     (case (_) (Maybe a)))

   ;; <=> open case form
   (type (Option a))
   (type case (Option _) (Maybe a))

   ;; <=> class form
   (class (Foo a)
     (type Option))
   (class case (Foo a)
          (type Option (Maybe a)))


   ;; Multiple cases

   ;; Can't be described as alias
   ...

   ;; closed case form
   (type (Result ok err)
     (case (_ Unit) (Maybe ok))
     (case (_ _)    (Either err ok)))

   ;; <=> open case form
   ;;
   ;; Unlike value pattern matching, order shouldn't matter, as
   ;; we could be defining each case in a different
   ;; package. Some other algorithm for handling overlapping
   ;; instances would have to be used.
   (type (Result ok err))
   (type case (Result ok err)  (Either err ok))
   (type case (Result ok Unit) (Maybe ok))

   ;; <=> class form
   (class (Foo ok err)
     (type Result))
   (class case (Foo ok err)
          (type Result (Either err ok)))
   (class case (Foo ok Unit)
          (type Result (Maybe ok)))
   #+END_SRC

   Typeclass (Type, Values) vs Type family + normal typeclass:

   #+BEGIN_SRC carth
   ;; 1

   ;; should implicitly create namespace `Iter`, so it's `Iter/Item` and `Iter/next`
   (class (Iter it)
     (type Item)
     (: next (Fun it (Maybe [Item it]))))

   (class case (Iter (Array a))
          (type Item a)
          (define (next arr) ...))

   ;; 2
   ;; <=> (except for namespacing)

   (type (Iter-item it))
   (type case (Iter-item (Array a)) a)

   (class (Iter it)
     (: next (Fun it (Maybe [(Iter-item it) it]))))

   (class case (Iter (Array a))
          (define (next arr) ...))
   #+END_SRC

   And in real Haskell that compiles, for comparison:

   #+BEGIN_SRC haskell
   -- 1

   class Iter i where
       type Item i
       next :: i -> Maybe (Item i, i)

   instance Iter [a] where
       type Item [a] = a
       next = \case
           [] -> Nothing
           a : as -> Just (a, as)

   -- 2

   type family Item' i
   class Iter' i where
       next' :: i -> Maybe (Item' i, i)

   type instance Item' [a] = a
   instance Iter' [a] where
       next' = \case
           [] -> Nothing
           a : as -> Just (a, as)
   #+END_SRC

** Type families, Haskell
   #+BEGIN_SRC haskell
   class Iter c where
       type Item c
       next :: c -> Maybe (Item c, c)

   nextList :: [a] -> Maybe (a, [a])
   nextList = \case
       [] -> Nothing
       a : as -> Just (a, as)

   instance Iter [a] where
       type Item [a] = a
       next = nextList
   #+END_SRC

** Dependent types, pseudo-Haskell
   #+BEGIN_SRC haskell
   class Iter c where
       item :: Type
       next :: c -> Maybe (item, c)

   nextList :: [a] -> Maybe (a, [a])
   nextList = \case
       [] -> Nothing
       a : as -> Just (a, as)

   instance Iter [a] where
       item = a
       next = nextList
   #+END_SRC

** Dependent types, Agda
   #+BEGIN_SRC agda2
   record Iter (C : Set) : Set1 where
     field
       item : Set
       next : C -> Maybe (item × C)

   nextList : {A : Set} -> List A -> Maybe (A × List A)
   nextList [] = nothing
   nextList (x ∷ xs) = just (x , xs)

   listIter : {A : Set} -> Iter (List A)
   listIter {a} = record
     { item = a
     ; next = nextList
     }
   #+END_SRC

* INACTIVE Custom GC
  Until we get linear types, and even then, we'll need some form of
  GC. Boehm's seems to be working well enough, but a conservative
  collector is not ideal, and I think it would be a fun project to
  write my own GC.

  There are many problems with refcounting: Generated llvm ir/asm gets
  polluted; While performance is more predictable, it's typically
  worse overall; Cycle breaking would either require using weak refs
  where appropriate, which would in turn require user input or an
  advanced implementation, or a periodic cycle breaker, which would be
  costly performance wise. So tracing GC is probably a good idea.

  GHC seems to prefer throughput over latency, so very long pauses are
  possible when you're working with a nontrial amount of data. "You're
  actually doing pretty well to have a 51ms pause time with over 200Mb
  of live data.".

  It could be interesting to add ways of controlling when GC happens
  so you can reduce spikes of latency. Haskell has ~performGC :: IO
  ()~ that does this. [[https://old.reddit.com/r/haskell/comments/6d891n/has_anyone_noticed_gc_pause_lag_in_haskell/di0vqb0/][Here is a gameboy]] who eliminates spikes at the
  cost of overall performance by calling ~performGC~ every frame.

  [[https://github.com/rust-lang/rfcs/blob/master/text/1598-generic_associated_types.md][Some inspiration here]].

  A tracing GC would be quite separate from the rest of the
  program. The only pollution would be calls to the allocator (not
  much different from the current sitch w malloc) and
  (de)registrations of local variables in Let forms (a total of two
  function calls per heap allocated variable).

  Implementing a tracing GC would be a fun challenge, and I'm sure it
  could be fun to try different algorithms etc.

  Look at https://github.com/mkirchner/gc.

* INACTIVE Effect system
  Seems like it could be more elegant than monad transformers,
  although maybe not as fast?

  Effect fusion seems to make it faster?

  Read Wu, Schrijvers 2014, 2015, 2016. I think their papers basically
  present the concept of fused effects.

  github.com/fused-effects/fused-effects

  https://youtu.be/vfDazZfxlNs?t=1730

  ^ det makear sense. Bygg basically upp ett träd av den här datatype,
  och interpreta det med alla handlers. Varje handler kollar om det är
  dens variant, och isf kör effekten. För varje handler blir trädet
  simplare, och till sist är det bara Pure kvar.

  Naiv implementering ineffektiv. Bara tänk -- måste interpreta ett
  träd ist för att bara *göra* effekterna direkt!

  Man kan använda free monads för att bygga upp trädet, men detta är
  inte så effektivt.

  Grundidén med papret "fusion for free" är att man vill bara traversa
  trädet en gång, och inte en gång per effect handler.

  Med "fusion" verkar de syfta på funktionaliteten i GHC, att man kan
  fusionera ihop funktionsanrop av specifika mönster till mer
  effektiva varianter. E.g., ~map f . map g~ fusioneras till ~map (f
  . g)~. På liknande vis fusioneras ~fold handleState . build . fold
  handleReader~ till bara ~fold (handleState . handleReader)~. Kan vi
  lösa detta utan kompilatorstöd, eller är det kanske värt att lägga
  till?

  See the talk on polysemy, it's a good complement and alternative to
  the fused effects one. https://youtu.be/-dHFOjcK6pA.

  We need type-level lists or sets, and a way to implement Member on
  that thing. If tuple types could contain higher kinded types, I
  think we only need classes.

* INACTIVE Macros?

* INACTIVE Property system
  I'm thinking of a system where you annotate functions in a source
  file with pre- and postconditions, which can then be checked in
  different modes depending on how much time you've got etc.

  - Proof-mode. Exchaustive checking of conditions. All possible
     inputs are generated, and the system checks that the precondition
     always implies the postcondition.
  - Test-mode. Statistical, random testing. Generate enough inputs
    such that the precondition is fulfilled for a statistically
    significant subset of the complete set of possible inputs.
  - Debug-mode. Functions are not tested ahead of time, instead
     assertions are inserted and checked at runtime.
  - Release-mode. Conditions are completely ignored.

* NEXT Consider using lib for pretty printing
  https://hackage.haskell.org/package/pretty-1.1.1.1

* INACTIVE Hoogle equivalent
  https://wiki.haskell.org/Hoogle

* INACTIVE Playground
  Like play.rustlang.org

  https://play.rust-lang.org/help
  https://github.com/google/nsjail

  Might actually be pretty easy by making use of Guix
  containers. Sandboxes the filesystem, and doesn't give network
  access unless `--network` is provided.

  #+BEGIN_EXAMPLE
  guix environment --container --ad-hoc coreutils clang carth
  #+END_EXAMPLE

* INACTIVE Language server protocol
  [[https://github.com/Microsoft/language-server-protocol]]
  [[https://internals.rust-lang.org/t/introducing-rust-language-server-source-release/4209]]

* INACTIVE HTML documentation generation
  Like [[https://www.haskell.org/haddock/][haddock]] and [[https://www.haskell.org/haddock/][rustdoc]].

* INACTIVE Documentation checker
  Like a typechecker-pass but for generated documentation. Verify that
  all links are alive, that examples compile and produce the expected
  output, etc.
* Standard library (std, stdlib)
  Prefer somewhat big / wide stdlib. Small / bad standard library +
  good package manager => npm / cargo situation, where everything has
  sooo many dependencies. Having a dep is not bad per say, but when
  the numbers completely blow up, like in rust- and javascript-land,
  things can get messy. The best way to avoid this, I think, is having
  a standard library that has you covered for most common things.

  Examples of libraries in other ecosystems that should be part of the
  stdlib: `is-even` in JavaScript, `composition` in Haskell, `rand` in
  Rust.

  Go seems to have done this relatively well. Their stdlib has
  everything from JPEG codec, to a webserver. The stdlib shouldn't
  have everything though, as that will add a bunch of legacy cruft
  over time, like in Java. Would not be as much of a problem if we're
  not afraid of releasing new major versions removing deprecated
  stuff.

** INACTIVE Concurrency / parallelism primitives
   Mutex, semaphore, etc.

   Look at how Rust and Haskell do it.

   Also, look at the crate [[https://crates.io/crates/parking_lot][parking_lot]], which does replaces the
   standard Rust primitives with smarter ones. E.g. the mutex does a
   small number of spins first, to avoid expensive thread juggling by
   the OS when the critical section is very short, but resort to the
   usual process interrupts in case it goes on for longer, to avoid
   priority inversion which is a problem with spinlocks.
   https://matklad.github.io/2020/01/02/spinlocks-considered-harmful.html
   https://matklad.github.io/2020/01/04/mutexes-are-faster-than-spinlocks.html

** INACTIVE Random number generation
   References:
   - [[https://arxiv.org/abs/1910.06437][It is high time we let go of the Mersenne Twister]]
* NEXT Some algorithms & data structures
  We need good collections & algs for sorting etc. if Carth is going
  to be of any use to anyone. Would also be a good way to add to the
  set of test-programs & find the worst pain points of current Carth.

  Many of these have implementations to look at and compare to on
  [[rosettacode.org]].

  This list is sort of off the top of my head, so some might not be
  good fits in a purely functional language. Look at some resource on
  persistend data structures as well.

  - Priority queue
  - Binary tree
  - B-tree
  - Random number generator
  - Binary search
  - bubble, insertion, selection sort
  - quicksort

* INACTIVE "Global" memoization
  This is just an idea I had, and may or may not be wise to implement.

  Add a special function for "memoized application" that acts like the
  application function (in Haskell, ($) :: (a -> b) -> a -> b), the
  difference being that it stores the result in a global, hidden Map
  from function pointers and arguments to results. The user can then
  selectively memoize certain functions (or even just certain
  applications of the function), and not others -- the wise choice
  would be to not memoize cheap functions, but do memoize computation
  heavy functions. This is perfectly legal if the language is
  completely pure, as there can be no side-effects that are not
  repeated properly yada yada.

  An alternative could be that the user can mark a function definition
  as memoized, and then it's always memoized, not just certain
  applications. Also, there could then be a unique Map for each such
  function.
* INACTIVE Async I/O
  Zig seems to have a smart solution that doesn't require a separate
  `async` version of the standard library, unlike Rust with
  `async-std`.

  https://ziglang.org/download/0.6.0/release-notes.html#Async-IO

  Also look at how Haskell does it. It's probably smart.

* INACTIVE Boxing to allow for dynamic linking
  Boxing vs monomorphization. Boxing results in smaller binary and
  dynamically-linkable interface, but results in slower code.

  When compiling a library, especially a dynamically linked one, how
  do we allow the export of polymorphic functions? We can't really use
  monomorphization, as we can't predict which types there should be
  instantiations for. Boxing would solve this problem and result in a
  smaller binary, but the code would most likely be slower, and the
  FFI would become more complicated.

  Maybe monomorphize all package-internal code, and require boxing for
  all public-facing polymorphic functions? Could require some keyword
  or special form, like `boxed`, to make it clear when the FFI will be
  affected.
* NEXT Add separate pass before Codegen to compile SrcPos:s
  I think it could be done purely and independently from rest of codegen. Would be more clean.
* NEXT Refactor & document Codegen & Gen
  It's getting big, complex, and unwieldy. Probably buggy as
  well. There's also a distinct lack of documentation. Always takes a
  sec for me to remember what some badly named function actually does.
* INACTIVE Use GADTs in Infer
* NEXT Have a look at LLVM.IRBuilder
  Might simplify my Codegen

  https://hackage.haskell.org/package/llvm-hs-pure-9.0.0/docs/LLVM-IRBuilder-Module.html#v:function

* INACTIVE Add basic repl
  Add a basic repl based on the JIT. Something very similar to
  http://www.stephendiehl.com/llvm/.

  Could maybe be the starting point for an on-demand architechture?
  Would probably require some memoization mechanism so that we don't
  unnecessarily check, monomorphise, and compile stuff we don't need
  to.
* NEXT Un-generalize module Selections
  Since we now use JIT instead of interpreter, only Codegen uses
  Selections, and we could make it simpler by inlining it.
* NEXT Type aliases
  Like ~type String = [Char]~ in Haskell.

* INACTIVE Compute glob vars at compiletime
  I'm not 100% sure this is a good idea anymore. Need to think about
  this.
* INACTIVE Compile-time eval in arbitrary contexts
  As a step towards something like dependent types
* INACTIVE Benchmark, profile, optimize
  Check out
  https://ollef.github.io/blog/posts/speeding-up-sixty.html. Great
  tips!
* INACTIVE Streamline learning the language
  Not that getting users is a primary concern, but if someone is
  indeed curious, I don't want them to be scared off by the process of
  getting started seeming complex.

  https://news.ycombinator.com/item?id=23347357
  https://www.hillelwayne.com/post/learning-a-language/
* NEXT Unify the different ASTs / IRs
  It's just kinda messy right now. Many files must be changed when
  touching just about any part of the AST representation. Also, takes
  up a lot of lines for not much apparent gain. Use some kind of
  attribute-tag to change the AST for different stages. Like:

  #+BEGIN_SRC haskell
  type Expr attr = Expr attr (Expr' attr)

  type ParsedExpr = Expr (Type, SrcPos)
  type CheckedExpr = Expr CheckedType
  #+END_SRC
* INACTIVE Use algebraic effects instead of mtl
  Not 100% about this one -- maybe my monad use is simple enough that
  there wouldn't actually be any gain? But still, I'd like to learn
  effects, so maybe it's worth trying out.

  Polysemy seems like the best one, but I'd have to do a little
  research. https://github.com/thma/PolysemyCleanArchitecture/tree/3a9354a5c31eaf03009e389ce49b318881a2460f#readme
* INACTIVE GRIN as alternative to LLVM and some of my own Codegen
  https://github.com/grin-compiler/grin

  GRIN seems promising. I wouldn't have to perform as complex
  transformations from Carth IR to LLVM, instead transforming to this
  more functional IR. GRIN might also be able to perform more
  optimizations.
* INACTIVE Optimize away zero-sized types before codegen
  It's bad that many operations on zero-sized types are currently
  actually compiled to, in practice, a ton of no-ops. I think it might
  be a good idea to add a dedicated optimization pass after
  monomorphization but before codegen that just gets rid of all
  zero-sized types and operations on them. For example, a type like
  ~(data Foo (Foo Bar Unit Baz))~ can be changed to ~(data Foo (Foo
  Bar Baz))~ without affecting the size of the generated struct
  etc. Also, a store of a ~{}~ into a ~{}*~ is really a no-op -- just
  noise in the generated LLVM. Being able to assume no zero-sized
  types in Gen/Codegen would also be really nice, I think.

  One issue: If you get rid of all ZSTs, what happens to a function
  with return-type Unit? What does it now return? One option could be
  to have add a special LLVM-Void type that just marks that the
  function should return void later. Another, more interesting option,
  would be to simply remove all functions and function-calls where the
  only remaining return type is a ZST, since, in purely functional
  programming, such a function can't do anything anyways. This would
  work, as long as *all* functions with side-effects are marked with
  IO & the RealWorld of IO is not a ZST & unsafePerformIO is known to
  the compiler and is (one of) the only (potentially) ZST-returning
  functions not optimized away, or unsafePerformIO returns something
  like ~(data (UnsafeIOResult a) (UnsafeIOResult a SizedMarker))~ to
  ensure the result is sized.

  Maybe do the flattening thing so there is only one zero sized type,
  but don't optimize away operations returning Unit completely. It
  would still be nice to be able to expect side effects and panics to
  happen. Also, RealWorld wouldn't have to have a size and actually
  impact performance.

* INACTIVE Builtin parsing of C header files
  I think Zig has this, and in Rust you can use the external tool
  ~bindgen~ to generate Rust declarations for C headers ahead of time.

  I just think it would be nice to not need to manually translate
  header files to use external libraries like OpenGL or SDL or
  whatever.
* INACTIVE Investigate alternative linkers
  Linking is one of the bottlenecks. However much caching etc I do in
  the parser & typechecker etc, the linker still has to do everything
  from scratch each time. I read somewhere that "gold" is a new GCC
  linker? Try using that maybe, unless it's already in use?

  https://news.ycombinator.com/item?id=24615916

  This is a new one: *mold*. It has as goal to be really fast. Seems promising!
  https://github.com/rui314/mold

* INACTIVE Produce .so:s for debug builds
  Linking is slow, so for debug builds we could try to split the
  output by module into separate .so:s. Then we'd only have to rebuild
  the .so of the affected module in incremental compilation.

  https://news.ycombinator.com/item?id=24615916

* INACTIVE Build Future into IO, or have both IO and AsyncIO?

* NEXT Some algorithms & data structures
  We need good collections & algs for sorting etc. if Carth is going
  to be of any use to anyone. Would also be a good way to add to the
  set of test-programs & find the worst pain points of current Carth.

  Many of these have implementations to look at and compare to on
  [[rosettacode.org]].

  This list is sort of off the top of my head, so some might not be
  good fits in a purely functional language. Look at some resource on
  persistend data structures as well.

  - Priority queue
  - Binary tree (2-3 tree better?)
  - B-tree (specifically 2-3 tree?)
  - Random number generator
  - bubble, insertion, selection sort
  - quicksort
* NEXT Don't actually define stuff like Str in the compiler
  Just assume they're defined by the user. Would mean less stuff in
  the compiler, and more in carth source. Both positives and
  negatives. I feel it would be nice as a user to be able to inspect
  the .carth source of the stdlib and actually see all the types and
  stuff though.
* INACTIVE Union types
  Like Typescript (I think, I'm not all that familiar with it). Could
  be nice for error handling, for example. That's one of the problems
  in Rust -- you have to use all these fancy crates or write a bunch
  of boilerplate just to allow a function to return two different
  types of errors.

  Java, where exceptions can be combined as a union, essentially:
  #+BEGIN_SRC java
  public Foo foo() throws SomeException, OtherException {
      bar(); // throws SomeException
      baz(); // throws OtherException
  }
  #+END_SRC

  and Rust, where you have to combine the different types somehow:
  #+BEGIN_SRC rust
  fn foo() -> Result<Foo, MyErr> {
      bar().map_err(MySomeErr)?;
      baz().map_err(MyOtherErr)?;
  }

  enum MyErr {
      MySomeErr(SomeErr),
      MyOtherErr(OtherErr)
  }
  #+END_SRC
* INACTIVE Hygienic macros
* INACTIVE Destructors
  System to register a function as a destructor for a value, which can
  be used to destroy / close resources when the value is no longer
  used and garbage collection happens. It's not optimal that resources
  may stay open for quite a while after last usage, but it's better
  than *never* being closed.

  Example use case: We don't want to have to use linear types to
  manually destroy Lazy values when we're done with them, but we still
  need to make sure that their mutexes are destroyed at some point.

  https://www.hboehm.info/gc/finalization.html
* NEXT "Use ptrtoint/inttoptr sparingly, prefer GEPs"
  https://llvm.org/docs/Frontend/PerformanceTips.html#other-things-to-consider

  I don't think I use ptrtoint/inttoptr much or at all in the compiler
  itself, but the ~ptr/+~ function in the stdlib transmutes to int for
  addition. Should add a builtin virtual function that uses gep to
  offset pointer.
* INACTIVE Is my llvm representation of unions causing problems?
  Just had a bug which I haven't quite fixed yet. My current guess is
  that it's caused by an (Either (Fun Unit (Maybe Int)) (Maybe Int))
  being represented as a (Maybe Int) when generic in LLVM. This should
  not be a problem, as both variants are equally (some basic testing
  with equivalent structs in Rust and C seems to confirm this), but
  maybe it's a problem that a function pointer is cast to integer. The
  reference mentioned that LLVM has a harder time doing pointer
  analysis if pointers are cast to integers and back.

  Check out that approach that Troels used in Futhark, with
  deduplicating but otherwise laying out all the members of all
  variants in a single sequence. Why did he pick that approach? I
  remember that I asked when he had a zoom presentation, but I don't
  remember his answer.

  Ooh, this seems cool:
  https://mapping-high-level-constructs-to-llvm-ir.readthedocs.io/en/latest/README.html
* INACTIVE Var pattern syntax, comparison
  What if we did

  #+BEGIN_SRC carth
  (define (foo x pair)
    (match pair
      (case [x (let y)] (Some y))
      (case [_ _] None)))
  #+END_SRC

  instead of

  #+BEGIN_SRC carth
  (define (foo x pair)
    (match pair
      (case [x' y] (if (= x x')
                       (Some y)
                     None))))
  #+END_SRC
* TODO Move from LLVM to alternative backend
  LLVM is kind of not great in some ways. It's often not trivial to
  debug errors stemming from displeasing LLVM. It updates frequently,
  but the Haskell bindings lag behind, so I have to use an older
  version or start maintainin llvm-hs myself. The project is
  *massive*, and most of the stuff I don't need. Sure, it's nice being
  able to target practically any backend, but I don't *actually* care
  about most of them. And there exists *so many* optimization passes,
  but most of them actually improve the performance of the binary very
  little, while bumping the compiletime a not insignificant bit.

  I want to use something simpler.

  To make the transition smooth, and to allow for easier debugging of
  codegen in the future, I think it would be a good idea to add an
  interpreter, like the one we had before, but now supporting FFI
  calls so that std-rs can be used as well. Really, the amount of code
  would not be huge, and it would be incredibly nice to have something
  to compare to when debugging low-level stuff. Also, I want to get
  rid of LLVM right away, but I'm not sure about what to replace it
  with just yet, so an interpreter is needed in the meantime.

** INACTIVE Optinal step: Add low-level intermediate representation in Carth
   Would require less work to change backend or add multiple ones of I
   just have to translate from a low-level IR to the backend code,
   instead of all the way from an AST. Might also be good for the
   interpreter to run at a lower lever, but not sure.

   *UPDATE*: I'm warming up to focusing on this rather than the
   interpreter.

   Features the LIR should have (or maybe lack, rather):
   - Switches with sub-value extraction instead of pattern match.
   - No closures, but their representation in the form of function +
     environment instead.
   - Tail call optimized. (Replace tail-recursion &
     sibling-tail-recursion with loops or smth).
   - Beta reduction.
   - Detect fully saturated calls & have special ways of directly
     calling builtin virtuals, externs, and normal functions
     saturatedly.

   Thinking about alloca:s (stack allocations) in generated loops for
   e.g. tail call optimization. Is it fine to simply generate all the
   alloca:s as we do in the LLVM codegen, but maybe instead of placing
   the statements at the point of use, output them with a Writer monad
   and place them at the function entry. As long as the register names
   used are good, it should work out fine right? Similar to how we
   currently generate strings.

   Thinking about to what level we should lower the IR. Remain at
   nested expressions, or move on to blocks and goto:s? Blocks with
   parameters vs. Phi-nodes? If remain with if-expressions,
   translation to C would be much cleaner, but how do we create the
   loop for tail-recursion? If we go to block level, might be easier
   to generate MIR, LLVM, or even ASM, but what if we want to generate
   for some slightly higher level target like C?

   Mutual tail recursion and/or sibling calls seem more difficult to
   optimize, so maybe just guarantee optimization of tail-recursive
   calls for all backends & platforms, but rely on the backend for
   general sibling call optimization when supported. LLVM can do
   sibling calls, for example.

   http://web.eecs.umich.edu/~mahlke/courses/483f06/lectures/483L17.pdf

   I think I'll start with a very simplified version of Monomorphic,
   and possibly change it or add an additional even lower step
   afterwards.

   Detect tail recursive functions in lowering & mark the tail
   recursive calls. Should then be able to generate an efficient loop
   in LLVM / whatever, and should be able to not generate anything
   unnecessary.

   #+BEGIN_EXAMPLE
   f x y =
     if foo x y
     then f (x - 1, y)
     else g x y
   #+END_EXAMPLE

   becomes

   #+BEGIN_EXAMPLE
   @recursive=yes
   f x y =
     if foo x y
     then @recurse (x - 1, y)
     else g x y
   #+END_EXAMPLE

   If function is marked as recursive, the codegen knows to stack
   allocate the parameters so they can be modified for each iteration
   (could consider block-params / phi-nodes as alt., but this solution
   seems relatively simple). If special instruction to recurse is
   encountered, just set the parameter stack variables and jump to the
   entry label kept in Reader.

** DONE Step 1: Re-add interpreter for pure Carth code
   Fairly self explanatory. Just operate on whatever is returned by
   the Optimize pass. Make sure to add / translate as many test-cases
   as possible to work without ~extern~ declarations, so that I can
   ensure as few correctness regressions as possible.

** INACTIVE Step 2: Support ~extern~ in interpreter
   This may not be trivial, but I think it won't be too hard. Can get
   some stuff from the codegen.

   Use [[https://hackage.haskell.org/package/libffi][libffi]] for dynamic FFI calls with runtime type info.

   How to convert data from Haskell to C? Functions for primitive
   types in libffi. For complex datatypes, I'm sure there's libraries
   for converting to bytes directly.

   Use sizeof and alignmentof from codegen module.

   *UPDATE*: Actually, this got complicated. How to handle GC roots,
   Haskell GC vs. Boehm GC. Allowing arbitrary extern calls, including
   those that might unsafely mutate memory. When we add our own GC
   with user-defineable destructor functions, how can we pass the
   user-defined function via FFI if it's a Haskell function basically?
   It all just gets really messy. Might not be much point in trying to
   do this after all... Focusing on adding our own LIR and using MIR
   for JIT/compilation seems like a better route at this point.

** NEXT Step 3: Remove LLVM support
   yeah

** INACTIVE Step 4: Add new native codegen backend
  Investigate QBE, Cranelift, GNU Lightning, libgccjit, GCC, MIR.

  #+BEGIN_QUOTE Candidates
  - C :: I.e., spit out C source and call out to ~cc~. Very portable
    (every platform has a C compiler). Not very elegant. Does not
    natively support tail call elimination, so would have to do that
    myself (true for pretty much everything except llvm though). Used
    by respectable languages like Nim and Haskell (sort of).
  - C-- :: Similar to C, but even more "portable assembly
    language". Created by SPJ and friend, specifically for being
    generated by compilers. Fork called Cmm used by GHC.
  - LLVM :: Approx 5 million LOC. Many targets, OK usability, but
    breaking changes sometimes and big and scary.
  - GCC :: Even bigger than LLVM. Also many targets. Not very good
    usability. Probably quite stable. GPL.
  - libgccjit :: Despite the name, also AOT. Basically an easier to
    use frontend for GCC with additional functionality to leverage GCC
    for JITting. Most points of GCC apply, but easier to use, and JIT
    included.
  - GNU Lightning :: JIT (only). Used by some schemes. Disjoint from
    GCC.
  - Cranelift :: Small-ish atm, but not sure it has any goals to stay
    that way. Seems more like an effort to replace LLVM, including
    much of its "bloat". Written in Rust. Maybe not all that
    standalone? Seems to be meant to be called from Rust. Performance
    of generated code seems bad atm, but should be improved.
  - QBE :: Small! 10k LOC. Goals to be 70% as fast as
    GCC/LLVM. Generates ASM instead of machine code for some
    reason. Seems like it hasn't seen much update this last
    year. However, one [[https://github.com/michaelforney/qbe][Michael Forney is actively maintaining a fork]],
    for his own language I think, so that might be interesting.
  - [[https://github.com/vnmakarov/mir][MIR]] :: This one looks the most interesting! Similarly to QBE, very
    small at 15k LOC and 70% the performance of GCC. Primarily a
    JIT(?), but seems to be able to to AOT as well. Has a 4 backends
    atm, including AMD64 and Aarch64, and it seems relatively easy to
    add a new one. I've found 2 languages that make use of MIR to
    study: [[https://github.com/grame-cncm/faust][Faust]] and [[https://github.com/dibyendumajumdar/ravi][Ravi]].
  #+END_QUOTE

  In the end, I most like the look of MIR. It seems to make good
  tradeoffs.

  Compiling to C comes at second place. Incredibly portable, and .c
  files would be a lot more readable than .ll files. Would lose the
  GDB source-line from DWARF stuff though, but that shit kinda sucked
  anyways. Function names would work as well, if not better than in
  LLVM, since the names would be kept in the C, and C compilers
  probable output much better dwarf than I ever could.

  Maybe I'll do both? If I just a low-level IR that's just above the
  level of the union of C and MIR it ought to be quite simple to
  translate from that to whatever backendest backend.

** References
   - [[https://gist.github.com/zeux/3ce4fcc3a43072b4315abde95319ecb6][How does clang 2.7 hold up in 2021?]]
* NEXT Try our an alternative prelude, like relude