~vdupras/duskos

c8aa95ec0f86efd6c360cdf4246da10e3ba6804f — Virgil Dupras 29 days ago 7ab521d
Simplify iterator boilerplate

Sometimes I have good ideas! See doc/iter.
3 files changed, 26 insertions(+), 33 deletions(-)

M fs/doc/hal.txt
M fs/doc/iter.txt
M fs/xcomp/bootlo.fs
M fs/doc/hal.txt => fs/doc/hal.txt +1 -0
@@ 214,6 214,7 @@ addr,    op --      Store the effective address of the operand in dest

ps+,    n --      Add n to PSP
rs+,    n --      Add n to RSP
LIT>W,  n --      Set W to n
W+n,    n --   Z  Add n to W
A+n,    n --   Z  Add n to A
W>A,    --        Copy W to A

M fs/doc/iter.txt => fs/doc/iter.txt +19 -26
@@ 9,7 9,7 @@ Before diving into the gory details of creating new iterators (because that's
what's great about them: it's easy to create new ones), let's see how to use the
two iterators that are built-in: "for" and "for2".

"for" takes a single arguments and counts down to 0 from that argument. In the
"for" takes a single argument and counts down to 0 from that argument. In the
loop body, you can refer to the value-like word "i":

    : foo 5 for i . spc> next ;


@@ 35,20 35,26 @@ The heavy lifting is done by ":iterator", which is a does word generating
immediate compiling words (in this instance, "for"). When that word is called, a
few things happen:

1. 12 bytes are reserved on RS for "i" and "j". It is always 12 bytes for all
   iterators, and it's always "i", "j" and "k", even when they aren't used.
2. A call to "for"'s body is written.
3. Two intertwined ahead jumps are written in a way that allow "unyield" to exit
   the loop in cases where the iterator has no yield.
1. 12 bytes are reserved on RS for "i", "j" and "k". It is always 12 bytes for
   all iterators, and it's always "i", "j" and "k", even when they aren't used.
2. Push "for"'s address to RS.
3. Write a forward jump that targets the "yield" that the "next" word is about
   to write.
4. We continue compiling the loop body.
5. When "next" (an immediate too) is called, a "yield" is compiled, followed by
   a backward jump to the beginning of the loop, followed by a forward target
   for the exit jump compiled at "for".
5. When "next" (an immediate too) is called, we close the forward jump opened at
   step 3, then a "yield" is compiled, followed by a backward jump to the
   beginning of the loop.
6. De-allocate the 12 bytes reserved for i/j/k.

Iterators are expected to keep PS and RS balanced between yields. For this
reason, iteration values should exclusively be passed through i/j/k.

One can wonder why we push the iterator's address to RS and defer its call to
the following "yield" rather than calling it directly. In most cases, it would
work fine, but in cases where no iteration takes place, we would end up
returning in the middle of the loop body rather than at the end of it. For this
reason, we always begin an iterator loop by jumping to the end of it.
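
The reasoning above can be sketched with a toy model. This is Python, not Dusk
OS code, and it rests on assumptions: every compiled element occupies one slot,
BRSZ is 1, and the names `run` and `naive_empty_exit` are hypothetical helpers
invented for the illustration. It only shows why starting at the trailing
"yield" lets an empty iterator land on the cleanup code.

```python
# Toy model, not Dusk OS code: one list slot per compiled element.
JUMP, BODY, YIELD, AGAIN, CLEANUP = range(5)
NAMES = ["jump", "body", "yield", "again", "cleanup"]
BRSZ = 1  # assumed size of the "again" branch in this toy layout


def run(items):
    """Trace a loop that starts by jumping to its trailing yield."""
    it = iter(items)      # stands in for the iterator coroutine
    trace, seen = [], []
    current = None
    pc = JUMP
    while True:
        trace.append(NAMES[pc])
        if pc == JUMP:
            pc = YIELD                # forward jump to the yield
        elif pc == YIELD:
            rs0 = YIELD + 1           # resume address left on RS by yield
            try:
                current = next(it)    # the iterator produced a value
                pc = rs0              # resume right after the yield
            except StopIteration:
                pc = rs0 + BRSZ       # "unyield": skip the backward branch
        elif pc == AGAIN:
            pc = BODY                 # backward jump to the loop start
        elif pc == BODY:
            seen.append(current)      # the loop body consumes the value
            pc = YIELD
        else:                         # CLEANUP: the loop is over
            return trace, seen


def naive_empty_exit():
    """Where an empty iterator lands if it is called directly up front."""
    return_addr = BODY                # address right after the initial call
    return NAMES[return_addr + BRSZ]  # unyield still adds BRSZ
```

In this model, `run([])` visits only the jump, the yield and the cleanup, while
`naive_empty_exit()` lands on a slot inside the loop instead of the cleanup.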

## i, j and k

"i", "j" and "k" are value-like words (obey "to" semantics) that live on RS.


@@ 65,13 71,13 @@ We use RS for those variables for multiple reasons:
   and it can't push directly to it because RS+0, the coroutine swapping
   address, has to stay there. It's awkward.

The easiest and simplest way to be directly on RS.
The easiest and simplest way is to be directly on RS through i/j/k.

## Breaking

It's a common pattern to break from an iterator early. To exit the loop early,
you can use the "break" word (again, an immediate). This words de-allocates "i"
and "j" RS slots, the coroutine address RS slot and then jumps out of the loop
you can use the "break" word (again, an immediate). This word de-allocates the
i/j/k RS slots and the coroutine address RS slot, then jumps out of the loop
in a way that is similar to a begin..repeat, that is, to a following then. Yes,
when you use a "break", you need to add a "then" (and optionally an "else")
after the "next". This allows you to conditionally execute code based on whether


@@ 91,7 97,7 @@ a global variable. This means that:
   loop that's going to process the "break" and it won't have the expected
   results.
3. These limitations, of course, are at compile time, which means that "break"
   works fine when the look calls a word that has a "next" loop inside it.
   works fine when the loop calls a word that has a "next" loop inside it.
4. "break" only works in "next" loops, not in other loops.

## unyield


@@ 111,16 117,3 @@ We already know how many bytes such a jump takes: CALLSZ + CELLSZ.
Therefore, if we want to exit the iterator loop, all we need to do is to add
CALLSZ + CELLSZ to RS+0. That's what "unyield" does. Then, we exit the loop,
execute the loop cleanup code and go on with our lives.

Simple right? I'm glad you agree! ... but there's a caveat, a small wart in this
otherwise gorgeous scheme: it's possible that the iterator was empty and no
actual yield ever took place. In that case, at the time "unyield" is called,
RS+0 point to the address right after the initial iterator call, right before
the loop body. If we add CALLSZ + CELLSZ to that, we'll end in the middle of
nowhere.

To that end, ":iterator" compiles a forward jump right after the initial call
(which has a size of... CALLSZ + CELLSZ!). That jump goes to the loop body.
However, in between that jump and the loop body is another forward jump, but
this time to the loop's exit. This way, if "unyield" is called without a yield,
we end up on that jump, and then at loop cleanup. All good.

M fs/xcomp/bootlo.fs => fs/xcomp/bootlo.fs +6 -7
@@ 221,14 221,13 @@ alias execute | immediate
: xtcomp [compile] ] begin word runword compiling not until ;
: ivar, ( off -- ) RSP) swap +) toptr@ execute ;
: i 4 ivar, ; immediate : j 8 ivar, ; immediate : k 12 ivar, ; immediate
: :iterator doer immediate xtcomp does>
  -12 rs+, execute, -4 [rcnt] +!
  [compile] ahead \ jump to loop
  [compile] ahead \ exit jump
  swap [compile] then [compile] begin ( loop ) ;
: :iterator doer immediate xtcomp does> ( w -- yieldjmp loopaddr )
  -16 rs+, RSP) !, LIT>W, RSP) @!,
  [compile] ahead \ jump to yield
  [compile] begin ( loop ) ;
0 value _breaklbl
: next
  [compile] yield [compile] again [compile] then
: next ( yieldjmp loopaddr -- )
  swap [compile] then [compile] yield [compile] again
  12 rs+, 4 [rcnt] +! 0 to@! _breaklbl ?dup drop ; immediate
: unyield BRSZ RSP) [+n], ; immediate
: break 16 rs+, [compile] ahead to _breaklbl ; immediate
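
The RS bookkeeping in the words above can be checked with a little arithmetic.
The sketch below is Python, not Dusk OS code. The 16/12/16 byte counts and the
4/8/12 offsets are taken from the diff; the 4-byte cell size is inferred from
those offsets, and the idea that the iterator's final return consumes the RS+0
slot is an assumption, not something the diff states. `rs_balance` is a
hypothetical helper name.

```python
CELL = 4  # assumed cell size, inferred from the 4/8/12 offsets of i/j/k

# Offsets from RSP while inside the loop body.
OFFSETS = {"coroutine": 0, "i": 1 * CELL, "j": 2 * CELL, "k": 3 * CELL}


def rs_balance(path):
    """Net RS movement in bytes for one loop, by exit path."""
    depth = -16              # ":iterator" compiles "-16 rs+,"
    if path == "next":
        depth += CELL        # assumption: the iterator's return pops RS+0
        depth += 12          # "next" compiles "12 rs+,"
    elif path == "break":
        depth += 16          # "break" compiles "16 rs+,"
    else:
        raise ValueError(path)
    return depth
```

Under these assumptions both exit paths leave RS balanced, which matches the
description of "break" freeing the i/j/k slots plus the coroutine address slot.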