@@ -64,6 +64,10 @@ Let’s prepare our stuff. First, make sure that you can run the i386 version o
Dusk. Running `make pcrun` should result in QEMU launching with a Dusk OS
prompt.
+In addition to that, you should install [mtools][mtools]. Dusk's PC image
+builds and runs fine without it, but it won't pick up extra files placed in the
+`fs/` root, which is what you're about to do. With `mtools` installed, it will.
+
For this story arc, we’ll make the filesystem root our workspace. We’ll write
our work in files at the root filesystem and then load them in Dusk. Let’s test
that this works by creating a file called `myasm.fs` in the `fs/` subfolder in
@@ -71,9 +75,8 @@ Dusk’s source code, with this content:
." Hello Dusk!\n"
-Then, run `make pcrun`[^1] again and at prompt, type `”f<< myasm.fs”`. You get
-the message? Good! We’re ready to build our barebone i386
-assembler.
+Then, run `make pcrun` again and, at the prompt, type `”f<< myasm.fs”`. You get
+the message? Good! We’re ready to build our barebone i386 assembler.
## Assembler scope
@@ -196,7 +199,7 @@ structure:
The naming is confusing, but to Intel’s defense, I don’t see what other names
I’d use. Those fields are too flexible for a more specific naming scheme.
-The `mod` field selects one of the 4 general addressing modes[^2] documented at
+The `mod` field selects one of the 4 general addressing modes[^1] documented at
p. 244:
* `00 [reg]`
@@ -208,12 +211,12 @@ In other words, whenever the modr/m byte fits the `C0` mask, it means that both
operands are “direct”, otherwise, one of the operands is “indirect”.
The next two fields select the operands for the instructions using register IDs
-also documented in p. 244[^3]. The `reg` field is generally used for the
+also documented in p. 244[^2]. The `reg` field is generally used for the
"direct register" operand and is not affected by `mod`. In the `“add eax,
[esi]”` instance, this field would contain a reference to `EAX`, thus 0. This
field is **not** always the destination. For example, in `“add [esi], eax”`,
the `reg` field would **also** contain `EAX`. It’s the instruction opcode that
-determines operands order (`01` vs `03`)[^4][^5].
+determines operands order (`01` vs `03`)[^3][^4].
The last field, `r/m`, is the ID of the register that will be affected by
`mod`. For example, under a mod `00`, we would set `r/m` to `ESI` (6) to
@@ -262,7 +265,7 @@ problem: it leaks an element to `PS`. Instead of the signature `“a b -- n”`
that we want, it has the signature `“a b -- a n”`. To drop the `a`, we need to
“nip” it from the stack with `“add esi, 4”`.
-The instruction form that allows this is the “r/m32, imm32”[^6] form. This one
+The instruction form that allows this is the “r/m32, imm32”[^5] form. This one
also has a modr/m byte, but also a 32-bit immediate that will follow it. The
modr/m byte in this case is special: it only has one operand. In these cases,
only the `mod` and `r/m` fields are used, `reg` stays empty. **However**, to
@@ -305,27 +308,18 @@ tackle the first part of a C compiler by building a tokenizer.
*[Next: Feeding the beast][nextup]*
-[^1]: PC build takes a little while. This is because I insist on using Dusk’s
-tools to build the destination FAT12 filesystem rather than POSIX ones, and
-those tools have to run through Dusk’s POSIX VM which is pretty slow. Every
-time you make a change to the code, this build process will have to repeat.
-Sorry about that, there isn’t much to do about it. Dusk is designed to work
-from within itself (which means not rebuilding it all the time), but I don’t
-want to force you to learn to use Dusk’s text editor, which is a bit rough
-around the edges. You can speed up the build process a little bit by deleting
-the “fs/doc” directory from your source tree.
-[^2]: I ignore special cases for simplicity reasons and because we won’t use
+[^1]: I ignore special cases for simplicity reasons and because we won’t use
them in this story arc.
-[^3]: EAX=0 ECX=1, etc.
-[^4]: It’s also this opcode that determines that the operation is a 32-bit one.
+[^2]: EAX=0 ECX=1, etc.
+[^3]: It’s also this opcode that determines that the operation is a 32-bit one.
If we wanted to do the same operation with the same operands in 8-bit, we’d
change the opcode to `00` or `02`. What about 16-bit? It’s complicated and out
of scope of this article.
-[^5]: It’s a bit confusing, but one realization about i386 will help you grok
+[^4]: It’s a bit confusing, but one realization about i386 will help you grok
the logic. In i386, it’s impossible to have an instruction with two “mod”
operands. For example, `“add [esi], [eax]”` is impossible. Therefore, one of
the operands is always “direct”. `reg` is the one.
-[^6]: Let’s ignore the imm8 optimization for now.
+[^5]: Let’s ignore the imm8 optimization for now.
[src]: https://git.sr.ht/~vdupras/tumbleforth
[prev]: 08-immediate.html
@@ -339,3 +333,4 @@ the operands is always “direct”. `reg` is the one.
[brad]: https://www.bradrodriguez.com/papers/moving1.htm
[i386ref]: https://css.csail.mit.edu/6.858/2014/readings/i386.pdf
[x86enc]: https://www-user.tu-chemnitz.de/~heha/hsn/chm/x86.chm/x86.htm
+[mtools]: https://www.gnu.org/software/mtools/