~vdupras/tumbleforth

f9ca86bf707e28f2742a16a34a92f516c14f91a5 — Virgil Dupras 10 months ago e8b7ad1
01-duskcc/09-dusktillc: add mtools requirements

Thanks again Steve!
1 files changed, 16 insertions(+), 21 deletions(-)

M 01-duskcc/09-dusktillc.md
M 01-duskcc/09-dusktillc.md => 01-duskcc/09-dusktillc.md +16 -21
@@ 64,6 64,10 @@ Let’s prepare our stuff.  First, make sure that you can run the i386 version o
Dusk. Running `make pcrun` should result in QEMU launching with a Dusk OS
prompt.

+In addition to that, you should install [mtools][mtools]. Dusk's PC image
+builds and runs fine without it, but it won't pick up extra files in `fs/`
+root, as you're about to do. With `mtools` installed, it will.
+
For this story arc, we’ll make the filesystem root our workspace. We’ll write
our work in files at the root filesystem and then load them in Dusk. Let’s test
that this works by creating a file called `myasm.fs` in the `fs/` subfolder in


@@ 71,9 75,8 @@ Dusk’s source code, with this content:

    ." Hello Dusk!\n"

-Then, run `make pcrun`[^1] again and at prompt, type `”f<< myasm.fs”`. You get
-the message? Good! We’re ready to build our barebone i386
-assembler.
+Then, run `make pcrun` again and at prompt, type `”f<< myasm.fs”`. You get the
+message? Good! We’re ready to build our barebone i386 assembler.

## Assembler scope



@@ 196,7 199,7 @@ structure:
The naming is confusing, but to Intel’s defense, I don’t see what other names
I’d use.  Those fields are too flexible for a more specific naming scheme.

-The `mod` field selects one of the 4 general addressing modes[^2] documented at
+The `mod` field selects one of the 4 general addressing modes[^1] documented at
p. 244:

* `00 [reg]`


@@ 208,12 211,12 @@ In other words, whenever the modr/m byte fits the `C0` mask, it means that both
operands are “direct”, otherwise, one of the operands is “indirect”.

The next two fields select the operands for the instructions using register IDs
-also documented in p. 244[^3]. The `reg` field is generally used for the
+also documented in p. 244[^2]. The `reg` field is generally used for the
"direct register" operand and is not affected by `mod`. In the `“add eax,
[esi]”` instance, this field would contain a reference to `EAX`, thus 0. This
field is **not** always the destination. For example, in `“add [esi], eax”`,
the `reg` field would **also** contain `EAX`. It’s the instruction opcode that
-determines operands order (`01` vs `03`)[^4][^5].
+determines operands order (`01` vs `03`)[^3][^4].

The last field, `r/m`, is the ID of the register that will be affected by
`mod`. For example, under a mod `00`, we would set `r/m` to `ESI` (6) to


@@ 262,7 265,7 @@ problem: it leaks an element to `PS`. Instead of the signature `“a b -- n”`
that we want, it has the signature `“a b -- a n”`. To drop the `a`, we need to
“nip” it from the stack with `“add esi, 4”`.

-The instruction form that allows this is the “r/m32, imm32”[^6] form. This one
+The instruction form that allows this is the “r/m32, imm32”[^5] form. This one
also has a modr/m byte, but also a 32-bit immediate that will follow it. The
modr/m byte in this case is special: it only has one operand. In these cases,
only the `mod` and `r/m` fields are used, `reg` stays empty. **However**, to


@@ 305,27 308,18 @@ tackle the first part of a C compiler by building a tokenizer.

*[Next: Feeding the beast][nextup]*

-[^1]: PC build takes a little while. This is because I insist on using Dusk’s
-tools to build the destination FAT12 filesystem rather than POSIX ones, and
-those tools have to run through Dusk’s POSIX VM which is pretty slow. Every
-time you make a change to the code, this build process will have to repeat.
-Sorry about that, there isn’t much to do about it. Dusk is designed to work
-from within itself (which means not rebuilding it all the time), but I don’t
-want to force you to learn to use Dusk’s text editor, which is a bit rough
-around the edges. You can speed up the build process a little bit by deleting
-the “fs/doc” directory from your source tree.
-[^2]: I ignore special cases for simplicity reasons and because we won’t use
+[^1]: I ignore special cases for simplicity reasons and because we won’t use
them in this story arc.
-[^3]: EAX=0 ECX=1, etc.
-[^4]: It’s also this opcode that determines that the operation is a 32-bit one.
+[^2]: EAX=0 ECX=1, etc.
+[^3]: It’s also this opcode that determines that the operation is a 32-bit one.
If we wanted to do the same operation with the same operands in 8-bit, we’d
change the opcode to `00` or `02`. What about 16-bit? It’s complicated and out
of scope of this article.
-[^5]: It’s a bit confusing, but one realization about i386 will help you grok
+[^4]: It’s a bit confusing, but one realization about i386 will help you grok
the logic. In i386, it’s impossible to have an instruction with two “mod”
operands. For example, `“add [esi], [eax]”` is impossible. Therefore, one of
the operands is always “direct”. `reg` is the one.
-[^6]: Let’s ignore the imm8 optimization for now.
+[^5]: Let’s ignore the imm8 optimization for now.

[src]: https://git.sr.ht/~vdupras/tumbleforth
[prev]: 08-immediate.html


@@ 339,3 333,4 @@ the operands is always “direct”. `reg` is the one.
[brad]: https://www.bradrodriguez.com/papers/moving1.htm
[i386ref]: https://css.csail.mit.edu/6.858/2014/readings/i386.pdf
[x86enc]: https://www-user.tu-chemnitz.de/~heha/hsn/chm/x86.chm/x86.htm
+[mtools]: https://www.gnu.org/software/mtools/
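
Reader's note on the patched text: the modr/m layout described in the hunks above (`mod` in the top two bits, then `reg`, then `r/m`) can be checked with a few lines of code. What follows is only a minimal sketch in Python, which is not the language the article uses; the helper names (`modrm`, `add_reg_from_mem`, `add_mem_from_reg`, `add_reg_imm32`) are made up for illustration and are not part of Dusk or of the article's assembler. It assumes the register IDs from the footnote (EAX=0, ECX=1, ...) and, like the article, ignores the special cases (under mod `00`, `r/m` values 4 and 5 do not mean `[esp]` and `[ebp]`).

```python
# Register IDs as listed in the i386 reference (EAX=0, ECX=1, ...).
REGS = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
        "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def modrm(mod, reg, rm):
    """Pack the three modr/m fields: mod (2 bits), reg (3 bits), r/m (3 bits)."""
    assert 0 <= mod <= 3 and 0 <= reg <= 7 and 0 <= rm <= 7
    return (mod << 6) | (reg << 3) | rm


def add_reg_from_mem(dst, src):
    """add dst, [src]: opcode 03, mod 00, reg=dst, r/m=src."""
    return bytes([0x03, modrm(0b00, REGS[dst], REGS[src])])


def add_mem_from_reg(dst, src):
    """add [dst], src: opcode 01, mod 00. reg still holds the direct register."""
    return bytes([0x01, modrm(0b00, REGS[src], REGS[dst])])


def add_reg_imm32(dst, imm):
    """add dst, imm32: opcode 81, mod 11 (direct), r/m=dst, reg field 0 for ADD."""
    imm_bytes = (imm & 0xFFFFFFFF).to_bytes(4, "little")
    return bytes([0x81, modrm(0b11, 0, REGS[dst])]) + imm_bytes


if __name__ == "__main__":
    print(add_reg_from_mem("eax", "esi").hex())  # add eax, [esi]
    print(add_mem_from_reg("esi", "eax").hex())  # add [esi], eax
    print(add_reg_imm32("esi", 4).hex())         # add esi, 4
```

Note how `add eax, [esi]` and `add [esi], eax` end up with the same modr/m byte and differ only in the opcode, which is the point the patched paragraph makes about `01` vs `03`.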