# System dictionary

This is the list of words provided by the kernel and its boot sequence,
organized by topics.

You should read doc/usage first to understand concepts mentioned here.

## Glossary

Stack notation: "<stack before> -- <stack after>". Rightmost is top of stack
(TOS). For example, in "a b -- c d", b is TOS before, d is TOS after. "R:" means
that the Return Stack is modified.

Some words have a variable stack signature, most often paired with a flag.
These are indicated with "?" to tell that the argument might not be there. For
example, "-- n? f" means that "n" might or might not be there.

Some elements are surrounded in "" quotes, for example ""name" --". This
indicates an element that isn't read from PS, but from the input stream (with
"word").

Letters have consistent meanings:

a -- address
n -- number
f -- flag
s -- string
w -- word
e -- entry
m -- entry metadata
r -- range
u -- count
c -- character, an 8-bit value
a b c -- order of elements matters

The description of the words can contain letters in between "*" characters,
indicating a special attribute:

*I* indicates an immediate word.
*B* indicates that it is "binary modulable".
*C* indicates that the word can be compiled. Its description is its behavior in
    interpret mode, but when in compile mode, it will "do the right thing" to
    compile the same behavior.

## Symbols

Across words, different symbols are used in different contexts,
but we try to be consistent in their use. Here are their definitions:

! - Store
@ - Fetch
$ - Initialize
^ - Opposite
( - Lower boundary
) - Upper boundary
' - Address of
? - As a suffix, means "Is it ...?". As a prefix, "do ... if flag"
[...] - Indicates immediateness

## System variables and constants

### Constants

sysdict    Pointer to last word of the system dictionary. Is a linked list.
nextmeta   Pointer to the metadata that will be assigned to the next entry to
           be created. Reset to 0 after each entry. Is a linked list.
curword    Address of last read word (with "word" word). Is a string.
[rcnt]     Contains the current "R level" which is changed when compiling RS-
           related words during compilation.
HERE       Address containing the value of "here".
HEREMAX    Address containing the upper bound for "here".
MAIN       Address containing the value of "main".
ABORT      Address containing the value of "abort".
EMIT       Address containing the value of "emit".
IN<        Address containing the value of "in<".
CELLSZ     Size of a cell: 4
CALLSZ     The size in bytes of a native call.

SPC        $20
CR         $0d
LF         $0a
BS         $08

### Values

Obey "to" semantics

here       Address where the "write" words write.

### Aliases

Obey "to" semantics. See description in sections below.

in<        -- c
emit       c --
abort      --
main       --

## Stack

Stack words move elements around in PS and RS.

drop       a --
2drop      a b --
dup        a -- a a
?dup       a -- a? a       dup if a is nonzero.
2dup       a b -- a b a b
swap       a b -- b a
over       a b -- a b a
nip        a b -- b
tuck       a b -- b a b
rot        a b c -- b c a
rot>       a b c -- c a b

rdrop      --              *I* Compile a RS shrink of 4 bytes.
r@         --              *I* Compile a push of current RS top to PS.
r>         --              *I* Equivalent to r@ rdrop
>r         --              *I* Compile a RS grow of 4 bytes followed by a pop
                           of PS into that new RS space.
rfree      --              *I* Shrink RS by the current [rcnt] level and reset
                           [rcnt] to 0.
r+,        n --            Compile a RS grow (n is negative) or shrink (n is
                           positive) operation by n bytes.
r',        off --          Compile the yield of RSP with "off" offset applied to
                           it. At runtime, this number will be pushed to PS.
p+,        n --            Same as r+, but for PS.
p',        off --          Same as r', but for PS.
scnt       -- n            Number of elements in PS, excluding "n".
rcnt       -- n            Number of elements in RS, excluding this call.
stack?     --              Error out if scnt < 0.
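
The signatures above can be read as plain list manipulations. The sketch below
is not Dusk OS code: it is a Python model that assumes PS is a list whose last
element is TOS. The function names (rot_back for "rot>", qdup for "?dup") are
made up for illustration.

```python
# Hypothetical model of PS as a Python list; the last element is TOS.

def rot(ps):
    # rot: a b c -- b c a
    c = ps.pop(); b = ps.pop(); a = ps.pop()
    ps.extend([b, c, a])

def rot_back(ps):
    # rot>: a b c -- c a b
    c = ps.pop(); b = ps.pop(); a = ps.pop()
    ps.extend([c, a, b])

def tuck(ps):
    # tuck: a b -- b a b
    b = ps.pop(); a = ps.pop()
    ps.extend([b, a, b])

def nip(ps):
    # nip: a b -- b
    b = ps.pop(); ps.pop()
    ps.append(b)

def qdup(ps):
    # ?dup: a -- a? a  (duplicate only when a is nonzero)
    if ps[-1] != 0:
        ps.append(ps[-1])

ps = [1, 2, 3]
rot(ps)        # ps is now [2, 3, 1]
rot_back(ps)   # rot> undoes rot: ps is [1, 2, 3] again
```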

## Memory

Memory words fetch, store or write to an address in memory. "bw" means "binary
width", a value that can be 1, 2 or 4 depending on the width of the operation.

@          a -- n        *B* Fetch n at address a.
!          n a --        *B* Store n at address a.
+!         n a --        *B* Add n to value stored at address a.
@!         n1 a -- n2    *B* Fetch n2 at address a, then store n1 at address a.
@+         a -- a+bw n   *B* Fetch n at address a and increase a.
!+         n a -- a+bw   *B* Store n at address a and increase a.
@@+        a -- n        *B* Do an indirect fetch from a and then increase the
                             pointer contained in a by "bw".
@!+        n a --        *B* Do an indirect store to a and then increase the
                             pointer contained in a by "bw".
,          n --          *B* Write n to here and increase here by "bw".
8b         --            *I* Set binary width to 8-bit.
16b        --            *I* Set binary width to 16-bit.

allot      u --          Increase here by u.
allot0     u --          Allot u and fill this space with zeroes.
move       src dst u --  Copy u bytes from address src to address dst, moving
                         upwards.
move,      src u --      Copy u bytes to "here" and increase "here" by u.
fill       a u c --      Fill range [a, a+u] with byte c.
align4     n --          Allot 0, 1, 2 or 3 bytes so that "here+n" is divisible
                         by 4.
nc,        n --          Parse n numbers from input stream and write them as
                         8-bit values.
[c]?       c a u -- i    Search for character c in range [a, a+u] and yield its
                         index or -1 if not found.

Convenience shortcuts:

c@  --> 8b @
c!  --> 8b !
c,  --> 8b ,
w@  --> 16b @
w!  --> 16b !
c@+ --> 8b @+
c!+ --> 8b !+
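
To make the role of "bw" concrete, here is a rough Python model of "@+" and
"!+". It is only a sketch: memory is assumed to be a bytearray, byte order is
assumed to be little-endian, and the names fetch_inc and store_inc are
hypothetical.

```python
# Hypothetical model: memory is a bytearray, "bw" is 1, 2 or 4 bytes.
# Little-endian byte order is an assumption made for this sketch.

def fetch_inc(mem, a, bw=4):
    # @+ : a -- a+bw n
    n = int.from_bytes(mem[a:a + bw], "little")
    return a + bw, n

def store_inc(mem, n, a, bw=4):
    # !+ : n a -- a+bw
    mask = (1 << (8 * bw)) - 1          # keep only the low bw bytes of n
    mem[a:a + bw] = (n & mask).to_bytes(bw, "little")
    return a + bw

mem = bytearray(8)
a = store_inc(mem, 0x11223344, 0)   # like "!+", advances a by 4
a = store_inc(mem, 0x41, a, bw=1)   # like "c!+" (8b !+), advances a by 1
```

Returning the updated address mirrors the "a+bw" in the stack signatures, which
is what lets these words be chained when walking a buffer.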

## Arithmetics

1+         a -- a+1
1-         a -- a-1
+          a b -- a+b
-          a b -- a-b
-^         a b -- b-a
*          a b -- a*b
/          a b -- a/b
/mod       a b -- r q  r is remainder, q is quotient
mod        a b -- n    n is the remainder of a divided by b
neg        a -- -a
and        a b -- n    n is the result of a binary "and" of a and b
or         a b -- n    n is the result of a binary "or" of a and b
xor        a b -- n    n is the result of a binary "exclusive or" of a and b
^          a -- n      n is the result of "a -1 xor", flipping all bits.
lshift     n u -- n    Shift n by u bits to the left
rshift     n u -- n    Shift n by u bits to the right
<<         n -- n      Shift n left by 1
>>         n -- n      Shift n right by 1
<<c        n -- n c    Shift n left by 1, yielding c as the 32-bit carry bit
>>c        n -- n c    Shift n right by 1, yielding c as the 32-bit carry bit
upcase     c -- c      If c is between "a" and "z", yield its upcase version.
                       Otherwise, yield c.
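
A few of the less common signatures above, modeled in Python as a sketch. The
function names are hypothetical, values are assumed to be unsigned 32-bit
cells, and the carry "c" is assumed to be the bit shifted out of the cell.
Behavior of "/mod" with negative operands is not modeled.

```python
MASK = 0xFFFFFFFF  # cells are 4 bytes wide (CELLSZ)

def divmod_word(a, b):
    # /mod : a b -- r q   (remainder first, quotient on TOS)
    # Only non-negative operands are modeled here.
    return a % b, a // b

def sub_rev(a, b):
    # -^ : a b -- b-a
    return (b - a) & MASK

def invert(a):
    # ^ : a -- n   same as "a -1 xor"
    return a ^ MASK

def shl_carry(n):
    # <<c : n -- n c   c is assumed to be the bit shifted out of bit 31
    return (n << 1) & MASK, (n >> 31) & 1

def shr_carry(n):
    # >>c : n -- n c   c is assumed to be the bit shifted out of bit 0
    return n >> 1, n & 1
```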

## Logic

"f" is a flag that can be either 0 or 1. We describe conditions for "f" to be 1.
Other conditions yield f=0.

<          a b -- f     f=1 if a is lower than b
>          a b -- f     f=1 if a is higher than b
=          a b -- f     f=1 if a equals b
<>         a b -- f     f=1 if a is not equal to b
>=         a b -- f     f=1 if a is higher than or equal to b
<=         a b -- f     f=1 if a is lower than or equal to b
0<         a -- f       f=1 if a is negative
0>=        a -- f       f=1 if a is not negative
not        a -- f       f=1 if a is zero
bool       a -- f       f=1 if a is nonzero
min        a b -- n     n is the lower of a and b.
max        a b -- n     n is the higher of a and b.
=><=       n l h -- f   f=1 if n >= l and n <= h.
?swap      a b -- l h   Sort a and b, putting the higher number on TOS.
[]=        r1 r2 u -- f f=1 if memory ranges [r1, r1+u] and [r2, r2+u] have the
                        same content.
s=         s1 s2 -- f   f=1 if s1 and s2 have the same length and content.
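
As an illustration of the 0/1 flag convention, here is a small Python sketch of
"=><=", "?swap", "not" and "bool". The function names are hypothetical, and
only small non-negative numbers are considered, so nothing is assumed about
signed versus unsigned comparison.

```python
def between(n, l, h):
    # =><= : n l h -- f   f=1 if n >= l and n <= h
    return 1 if l <= n <= h else 0

def qswap(a, b):
    # ?swap : a b -- l h   the higher value ends up on TOS (second item)
    return (a, b) if a <= b else (b, a)

def not_(a):
    # not : a -- f   f=1 if a is zero
    return 1 if a == 0 else 0

def bool_(a):
    # bool : a -- f   f=1 if a is nonzero
    return 1 if a != 0 else 0
```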

## Flow

noop       --        Do nothing
bye        --        Halt the machine.
quit       --        Reset RS and return to the "main" loop.
abort      --        Reset PS and quit.
execute    a --      Call address a.
exit       --        *I* Compile a return from call.
main       --        The mainloop. Repeatedly call "word" and "runword".
leave      --        *I* Set "next" counter so that we leave the loop at the
                     next "next" branch.
[if]       f --      If f=0, skip all following words until a "[then]" word is
                     reached.
[then]     --        No-op.

Structured flow words are better explained in their interaction together rather
than their individual effect and this is done in doc/usage. Below is a list of
all the flow words for reference. All these words are immediate.

if .. else .. then
begin .. again
begin .. until
n >r begin .. next
begin .. while .. repeat
begin .. while .. while .. repeat .. else .. then
n case .. of condition .. endof .. endcase

## Linked list

llnext    ll -- ll     Yield next LL element.
llend     ll -- ll     Iterate LL until we reach the last element.
llprev    tgt ll -- ll From "ll", iterate LL until we reach the element whose
                       LL pointer points to "tgt".
lladd     ll -- ll     Write a new LL element to here, yield it, then write its
                       address to the last element of "ll".
llinsert  'll -- ll    Given a *pointer* to a LL, write a new LL to here and
                       replace that first element in that pointer with the new
                       LL element.
llcnt     ll -- n      Yield the number of elements in "ll".
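
The traversal words can be pictured with the Python sketch below. It assumes
that each LL element starts with a pointer to the next element and that a
pointer of 0 ends the list; the document does not spell out this layout, so
treat it as an assumption. Memory is modeled as a dict from element address to
next address, and the addresses are made up.

```python
# mem maps an element's address to the address of the next element.
# A next pointer of 0 is assumed to mark the end of the list.

def llnext(mem, ll):
    # llnext : ll -- ll
    return mem[ll]

def llend(mem, ll):
    # llend : ll -- ll   walk until the element whose next pointer is 0
    while mem[ll] != 0:
        ll = mem[ll]
    return ll

def llprev(mem, tgt, ll):
    # llprev : tgt ll -- ll   assumes tgt is reachable from ll
    while mem[ll] != tgt:
        ll = mem[ll]
    return ll

def llcnt(mem, ll):
    # llcnt : ll -- n
    n = 0
    while ll != 0:
        n += 1
        ll = mem[ll]
    return n

mem = {0x100: 0x200, 0x200: 0x300, 0x300: 0}  # three chained elements
```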

## Dictionary

find       name 'dict -- word-or-0
                     Find name in dictionary pointer 'dict and yield the found
                     word or 0 if no match was found.
'          "x" -- w  Find x in system dictionary and error out if not found.
[']        "x" --    *I* Find x and compile its address as a literal.
w>e        w -- e    Yield an entry (linked list pointer) from a word reference.
e>w        e -- w    Yield a word reference (executable) from an entry.
entry      'dict s --
                     Create entry with name s in dictionary 'dict.
code       "x" --    Same as "entry", but reads name from input stream.
current    -- w      Yield the last word to be added to the system dictionary.

## Structure

struct[    "x" --     Create new struct "x" and begin defining it.
]struct    --         Exit current struct definition.
extends    "x" --     Find struct "x" in system dictionary and make the next
                      defined struct extend it.
sfield     "x" --     Add a new struct 4b field named "x".
sfieldw    "x" --     Add a new struct 2b field named "x".
sfieldb    "x" --     Add a new struct 1b field named "x".
sconst     "x" --     Add a new struct read-only 4b field named "x".
sfield'    sz "x" --  Add a new struct buffer of size sz named "x".
smethod    "x" --     Add a new struct method named "x".
structbind 'data "x y" --
                      Create a new binding named "x" that binds 'data to struct
                      named "y".
rebind     'data 'bind --
                      Bind 'data to structbind 'bind.

## I/O

key        -- c        Read next character from system interactive input source.
in<        -- c        Read next character from system input source.
"<         -- c        Read from in< and apply literal escapes. c=-1 when " is
                       read (end of string).
emit       c --        Emit character to system output destination.
nl>        --          Emit CR then LF.
spc>       --          Emit SPC.
rtype      a u --      Emit characters in range [a, a+u].
stype      s --        Emit all characters in s.
,"         x" --       Read from in< until " is reached and write it to here.
."         x" --       *IC* Emit string x.
abort"     x" --       *IC* Emit string x then abort.
maybeword  -- s-or-0   Try to read word from system input source and yield it as
                       a string if it could be read. If EOF (c = -1) is reached,
                       yield 0.
word       -- s        Try to read a word and error out if EOF is reached.
\          --          Skip input stream until end of line (LF is the mark).
(          --          Skip input stream until " ) " is read.

## Parsing

parse      s -- n? f Try to parse string s as a number. f=1 and n exists if
                     parsing was successful.
compword   s --      Compile s regardless of "compiling" flag. That is: try to
                     parse as a number. Write a literal if it's a number.
                     Otherwise, find word in system dict and check if it's an
                     immediate. If yes, execute, otherwise, write a call to its
                     address.
runword    s --      Execute string s according to our general logic: if
                     "compiling" flag is set, run "compword". Otherwise, try to
                     parse s as a number. If it is one, push it to PS. Otherwise
                     find in system dict and then execute.
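
The dispatch described for "runword" and "compword" can be sketched in Python
as follows. This is a simplified model, not the kernel's code: the Interp
class, its fields and the tuple-based "code" list are hypothetical, and parse
only handles decimal numbers here.

```python
def parse(s):
    # parse : s -- n? f   returns (n, 1) on success, (None, 0) on failure.
    # Only decimal literals are handled in this sketch.
    try:
        return int(s, 10), 1
    except ValueError:
        return None, 0

class Interp:
    def __init__(self):
        self.ps = []            # parameter stack
        self.compiling = False  # the "compiling" flag
        self.code = []          # stand-in for code written to "here"
        self.words = {}         # name -> (callable, is_immediate)

    def find(self, s):
        if s not in self.words:
            raise KeyError(f"word not found: {s}")
        return self.words[s]

    def compword(self, s):
        n, f = parse(s)
        if f:
            self.code.append(("lit", n))   # write a literal
            return
        fn, immediate = self.find(s)
        if immediate:
            fn(self)                       # immediates run right away
        else:
            self.code.append(("call", s))  # write a call

    def runword(self, s):
        if self.compiling:
            self.compword(s)
            return
        n, f = parse(s)
        if f:
            self.ps.append(n)
        else:
            fn, _ = self.find(s)
            fn(self)

it = Interp()
it.words["+"] = (lambda i: i.ps.append(i.ps.pop() + i.ps.pop()), False)
for w in "2 3 +".split():
    it.runword(w)      # it.ps is now [5]
```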

## Compiling

Compiling words operate on a higher plane: they write native code to "here",
which can then be executed to have the desired effect.

[          --       *I* Stop compiling. The following words will be interpreted.
]          --       Begin compiling. The following words will be compiled.
:          "x" --   Create entry x and begin compiling.
;          --       *I* Compile a return from call and then stop compiling.
litn       n --     Compile a literal with value n.
execute,   a --     Compile a call to address a.
exit,      --       Compile a return from call.
compile    "x" --   *I* Find word x and compile code that, when run, compiles
                    a call to it.
[compile]  "x" --   *I* Find immediate word x and instead of executing it
                    immediately as we would normally do, compile it as if it
                    wasn't immediate.
compiling  -- f     f=1 if we're currently compiling.
immediate  --       Make last added entry into an immediate entry.
create     "x" --   Create a new entry named "x" of type "cell".
const      n "x" -- Create a new constant named "x" of value n.
doer       "x" --   Create a new "doer" word, to be paired with does>.
does>      --       Begin compiling the runtime behavior of a doer word.
does'      w -- a   Yields the address of the "data" part of a doer word w.
value      n "x" -- Create a new entry of type "value" with n as its initial
                    value.
ivalue     a "x" -- Create a new entry of type "indirect value" with a being the
                    address holding the pointer to the value.
alias      "x y" -- Find word "x" in system dictionary and create entry "y" of
                    type "alias" pointing to it.
ialias     a "x" -- Create a new entry of type "indirect alias" with a being the
                    address holding the pointer to the word to execute.
S"         x" --    *IC* Yield string literal with contents "x".
chain      "x y" -- Create a new "chain" targeting alias "x" and chaining "y" to
                    it.

## "to" words

The way "to" words work is that they compile their associated word right
after a literal that yields the address of the word which obeys "to" semantics.

to    --> !
to+   --> +!
to'   --> noop
to@   --> @
to@!  --> @!
to@+  --> @@+
to!+  --> @!+
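
The mapping above can be pictured as "push the address of the target, then run
the associated memory word". The Python sketch below models three of the
entries under that reading. The dict-based memory, the apply_to helper and the
address X are all hypothetical.

```python
mem = {}  # hypothetical memory: address -> cell contents

def store(n, a):
    # ! : n a --
    mem[a] = n

def add_store(n, a):
    # +! : n a --
    mem[a] += n

def fetch_store(n, a):
    # @! : n1 a -- n2
    old = mem[a]
    mem[a] = n
    return old

# Subset of the table above: each "to" word maps to a memory word.
TO_WORDS = {"to": store, "to+": add_store, "to@!": fetch_store}

def apply_to(to_word, addr, n):
    # Models "n <to_word> x": the address of x is supplied as a literal,
    # then the associated memory word runs on it.
    return TO_WORDS[to_word](n, addr)

X = 0x1000             # made-up address of the cell behind a value "x"
mem[X] = 0
apply_to("to", X, 42)  # like "42 to x"
apply_to("to+", X, 8)  # like "8 to+ x"; the cell now holds 50
```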