#DuskBSD: Run Dusk as a NetBSD kernel module

DuskBSD is a NetBSD kernel module designed to run Dusk OS from within the kernel's memory space and thus have direct access to driver memory.

The broad idea (read on for details) is that this kernel module exposes a device that we call /dev/dusk. That device is memory mappable and readable/writable. We load Dusk's binary contents through the mmap and we interact with its console by reading/writing to the device.

We thus have an OS within an OS, both being able to poke at the same hardware.

The goal of DuskBSD isn't really to casually run Dusk OS: it's too dangerous. It's meant as a development tool to extract driver code from NetBSD and port it to Dusk OS.

#Getting DuskBSD

DuskBSD is available as a git repository (no SSL) at git://git.duskos.org/duskbsd.git.

#Warning

With great powers come great footguns. This mechanism specifically works around all kinds of memory protection mechanisms built into NetBSD so that we can poke at its innards directly. This means that the smallest misstep results in the host machine catastrophically crashing. There are no safety nets.

Also, because of the way Dusk's filesystem is wrapped around NetBSD's VFS, all "vnode" structures being allocated within Dusk are leaked. They're re-used (so you don't leak them at each usage), but they're never reclaimed.

#Requirements

To use this, a NetBSD 10.0 i386 system is needed. It needs to be i386, not x86_64, because otherwise the internal structures that are passed around between NetBSD and Dusk won't match.

You also need NetBSD's source code in /usr/src because building a module requires the Makefile includes in there. If the source is at another location, you have to change the appropriate paths in Makefile.

#Usage

You need to be root.

Running make will yield dusk.kmod, your kernel module. Running make install will install the module as dusk.

This module hosts a character driver. The device has a "major" value of "351" and needs to be created with mknod /dev/dusk c 351 0. Change "major" if that one is already taken on your system.

You can then load the module with modload dusk (a message will show in the console) and unload it with modunload dusk.

This kernel module exposes an API to the code it runs through the DuskAPI struct. Dusk OS, in its netbsd/ directory, wraps this API to run a usable Dusk. Refer to instructions in netbsd/README.md for running Dusk in this kernel module.

#How it works

The idea behind this module is similar to the one for "kexec" DuskBSD. However, recent work in Dusk's kernel has made it auto-relocate, making things much easier on the kmod side. Therefore, the kmod is designed to run code that can run at any address.

There's also the problem of exposing the kernel's API to Dusk, but that is done exactly in the same manner as with "kexec" DuskBSD.

This DuskBSD is a module that hosts a character driver that can be memory mapped.

The "character" part is for interacting with Dusk and the "mmap" part is to load its initial code in memory. Before you "boot" the device, you need to open a mmap to it and write executable contents to it.

To initiate a "boot", write a single character to the device. The kmod then calls its mmap's first address.

At that point, that device becomes Dusk's console.
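From the userland side, the load-then-boot sequence described above can be sketched roughly as follows. This is only an illustration: the helper names (dusk_open, dusk_load_image, dusk_boot) are made up for this example, mapping from offset 0 is an assumption, and the actual procedure is the one described in Dusk's netbsd/README.md.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: open the device (normally "/dev/dusk") read/write. */
static int dusk_open(const char *path)
{
    return open(path, O_RDWR);
}

/* Hypothetical helper: copy `len` bytes of executable contents to the
 * start of the device's memory mapping. Returns 0 on success, -1 on error.
 * Mapping at offset 0 is an assumption of this sketch. */
static int dusk_load_image(int fd, const void *image, size_t len)
{
    assert(image != NULL && len > 0);
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (mem == MAP_FAILED)
        return -1;
    memcpy(mem, image, len);
    return munmap(mem, len);
}

/* Hypothetical helper: writing a single character initiates the "boot".
 * Which character is written is arbitrary in this sketch. */
static int dusk_boot(int fd)
{
    char c = '\n';
    return write(fd, &c, 1) == 1 ? 0 : -1;
}
```

After dusk_boot() returns, the same file descriptor would be used as the console: write to send input, read to get output.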

One peculiarity of this module is that we don't expect Dusk to run in an infinite loop, but rather to run only up to the point where it needs its next input character. When it does that, it returns to NetBSD. Then, we "poke" it back with an input character and Dusk carries on.

This is possible because Dusk is expected to maintain two sets of return stacks and swap between the two at "reentry" and "exit" time. This means that when Dusk is waiting for a keystroke, it's entirely static with its memory accessible through the mmap. This allows for ludicrous debuggability!

A few little behaviors to remember:

  • Memory for Dusk is allocated during modload and the address stays the same until modunload.
  • The module is loaded in a "shutdown" state. Whenever a byte is written to it, it "boots".
  • The poke pointer in the Dusk API determines whether the machine is running or not. To "shut down" the machine, the running code sets poke to NULL.
  • At modload, reading /dev/dusk yields the address once.
  • At "boot", the output buffer is flushed.
  • Creating a mmap doesn't affect the output buffer or the "running state" of the device. The idea is for this mmap to allow ludicrous debuggability.

#Execution model

As outlined above, we run Dusk's binary in short "bursts" in dusk.c:dusk_write(). The execution model is articulated around the concept of "machine status" and "resume function".

The machine starts in "off" status. When we're in that status and a character is written to the device, we "boot" the machine by calling its address 0. It is expected that the code being called there will set the resume() pointer, set a new status and then return.

At that point, we enter a loop where at each step, we act depending on the machine status.

STATUS_OFF: If status is "off" after we have called into Dusk's code, this means that Dusk has shut itself off. In this case, we exit the loop and return the EBUSY error to tell whatever wrote to the device that it should "acknowledge" the shutdown in some way. In the duskcon utility, this means closing the console.

STATUS_WAITKEY: The machine is waiting for a keypress. This means that we're expected to put a new value in arg and then resume(). If our write buffer isn't empty, we do this right away. Otherwise, we return: the "write" operation is completed.

STATUS_SLEEP: Sleep arg microseconds and then resume.
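To make the loop more concrete, here is a userland simulation of it. It is not the code in dusk.c: the struct layout, the boot function pointer standing in for "address 0", the toy guest, and the choice to consume the byte that triggers the boot are all assumptions made for this sketch. The sleep is only accounted for, not performed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum status { STATUS_OFF, STATUS_WAITKEY, STATUS_SLEEP };

struct machine {
    enum status status;
    int arg;                           /* input char, or microseconds to sleep */
    void (*boot)(struct machine *);    /* stands in for "address 0" */
    void (*resume)(struct machine *);  /* set by the guest at boot */
    long slept_us;                     /* demo bookkeeping */
    int keys_seen;                     /* demo bookkeeping */
};

/* Simulated write handler: run the guest in bursts until it needs input. */
static int machine_write(struct machine *m, const char *buf, size_t len)
{
    size_t i = 0;

    assert(m != NULL && m->boot != NULL);
    if (len == 0)
        return 0;
    if (m->status == STATUS_OFF) {
        i++;            /* assumption: the byte that triggers boot is consumed */
        m->boot(m);
    }
    for (;;) {
        switch (m->status) {
        case STATUS_OFF:               /* guest shut itself off */
            return EBUSY;
        case STATUS_WAITKEY:
            if (i == len)              /* write buffer empty: write is done */
                return 0;
            m->arg = (unsigned char)buf[i++];
            m->resume(m);
            break;
        case STATUS_SLEEP:             /* a real host would sleep here */
            m->slept_us += m->arg;
            m->resume(m);
            break;
        }
    }
}

/* Toy guest: 'z' asks for a 1000us sleep, 'q' shuts down. */
static void guest_resume(struct machine *m)
{
    if (m->status == STATUS_SLEEP) {   /* woke up: wait for a key again */
        m->status = STATUS_WAITKEY;
        return;
    }
    m->keys_seen++;
    if (m->arg == 'q') {
        m->status = STATUS_OFF;
    } else if (m->arg == 'z') {
        m->arg = 1000;
        m->status = STATUS_SLEEP;
    }
}

static void guest_boot(struct machine *m)
{
    m->resume = guest_resume;
    m->status = STATUS_WAITKEY;
}
```

The guest never loops on its own: every call into it returns quickly with a status telling the host what to do next.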

#Structure proxying

The whole point to having this module is to poke at kernel memory, which is organized through structures. My first idea was to simply mirror C structs from the kernel code into Dusk OS, but this approach has several problems:

  1. NetBSD kernel structures are often pretty big and complex and Dusk only needs to poke at small parts of them.
  2. 64-bit fields are a hassle to pad.
  3. Compiler auto-alignment is sometimes tricky to predict.
  4. Those structures change quite often, making mirrors tightly coupled to a particular NetBSD release.

To solve these problems, we don't try to directly mirror NetBSD's kernel structures in Dusk. Instead, we create what we call proxy structures right here in the kmod. Those structures contain the fields that we care about in a stable interface that we control right here. It's this interface that is mirrored in Dusk.

Each proxy structure has associated "copy in" and "copy out" functions that copy values to/from the original structure from/to the proxy structure.

The kmod doesn't allocate proxy structures itself. Dusk manages proxy memory.

The proxy structures aren't generally passed around; only the real structures are passed in the different functions that the Dusk API provides. The proxy structures are only there to facilitate field access.

Also, proxying isn't recursive. If we expose a link to another structure in a proxy, that link will be one to the real structure. Proxying is always shallow.

When proxying arrays, we set the proxy field as a pointer to the array. Therefore it is not copied back in the "copy out" phase.

The naming convention for proxied structures is to prefix them with a capital P.
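As an illustration of these rules, here is a small userland sketch. The real_dev structure is a made-up stand-in for a kernel structure, and Pdev with its two copy functions is hypothetical too. The functions are named after the direction they copy in so that the sketch doesn't have to guess which one the kmod calls "copy in" and which one "copy out".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Made-up stand-in for a big kernel structure with awkward layout. */
struct real_dev {
    char      pad0[13];    /* fields we don't care about */
    uint64_t  flags;       /* 64-bit field, padded by the compiler */
    void     *parent;      /* link to another real structure */
    int       unit;
    uint8_t   regs[8];     /* an array */
    char      pad1[40];
};

/* Hypothetical proxy: only the fields of interest, in a layout we control. */
struct Pdev {
    uint32_t  unit;
    uint32_t  flags_lo;    /* low 32 bits of flags */
    void     *parent;      /* shallow: points to the real structure */
    uint8_t  *regs;        /* arrays are exposed as a pointer */
};

/* real structure -> proxy */
static void Pdev_from_real(struct Pdev *p, struct real_dev *r)
{
    assert(p != NULL && r != NULL);
    p->unit = (uint32_t)r->unit;
    p->flags_lo = (uint32_t)r->flags;
    p->parent = r->parent;
    p->regs = r->regs;
}

/* proxy -> real structure */
static void Pdev_to_real(const struct Pdev *p, struct real_dev *r)
{
    assert(p != NULL && r != NULL);
    r->unit = (int)p->unit;
    r->flags = (r->flags & ~(uint64_t)0xffffffffu) | p->flags_lo;
    r->parent = p->parent;
    /* regs: nothing to copy, it was modified in place through the pointer */
}
```

Whatever happens to real_dev's padding or field order, the layout of Pdev stays the same, and that is the only layout the other side has to mirror.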

#Strings

Dusk's strings are counted strings, whereas NetBSD's strings are null-terminated. Whenever a string is passed to or from the kmod API, it has to be a null-terminated string.

It is Dusk's responsibility to convert strings to the proper format.
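The conversion itself is simple. The sketch below shows it in C for illustration only (in practice it happens on the Dusk side), and it assumes the usual Forth-style layout of one length byte followed by the characters. The function names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Counted string (assumed: 1 length byte + chars) to null-terminated.
 * Returns 0 on success, -1 if `out` is too small. */
static int counted_to_cstr(const uint8_t *counted, char *out, size_t outsz)
{
    assert(counted != NULL && out != NULL);
    size_t len = counted[0];
    if (outsz < len + 1)
        return -1;
    memcpy(out, counted + 1, len);
    out[len] = '\0';
    return 0;
}

/* Null-terminated to counted string.
 * Returns 0 on success, -1 if the string doesn't fit. */
static int cstr_to_counted(const char *s, uint8_t *out, size_t outsz)
{
    assert(s != NULL && out != NULL);
    size_t len = strlen(s);
    if (len > 255 || outsz < len + 1)
        return -1;
    out[0] = (uint8_t)len;
    memcpy(out + 1, s, len);
    return 0;
}
```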

#NetBSD driver extraction technique

With this kernel module, it becomes possible to extract drivers from NetBSD and into Dusk OS without deep knowledge of the target hardware. Through careful steps, the process becomes a pure code transformation process. This can save us precious time. Here's how it goes.

#Duplicate the target driver in the kernel module

For example, if we want to extract ukbd, we copy /usr/src/sys/dev/usb/ukbd.c in this project's repository and add it to the Makefile's SRCS.

Getting it to compile generally requires very few modifications because the Makefile is already set up to link against the kernel.

However, when you try to load the resulting module, you'll get symbol name clashes with the original driver.

Therefore, you need to rename all public symbols by adding a dusk_ prefix.

#Plugging duplicated code in

At that point, we have two broad options: having the duplicated driver take over entirely or gradually monkey patching the original.

If the target hardware is something that attaches itself to the driver tree dynamically, it might be simpler to mess with the match() function to have a better priority than the original driver and thus have your duplicated code take over.

If the target hardware attaches itself at boot and isn't designed to be hot pluggable, things get more tricky because your Dusk module isn't present at boot time.

However, what you can do is to monkey patch the vectors in the drivers dynamically. For example, if you duplicated ukbd, you can replace the pollc vector in the ukbd_consops structure by your duplicated copy.

Because you have exact duplicates, this override/monkey patch operation isn't supposed to break anything: it's the exact same code.
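The vector swap can be pictured with the following userland sketch. The consops structure here is a simplified stand-in, not the kernel's real definition, and patch_pollc is a made-up helper. In the real kernel, the ops table may well be declared const or be otherwise protected, which this sketch doesn't model.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a console ops table (not the kernel's layout). */
struct consops {
    void (*getc)(void *cookie, unsigned *type, int *data);
    void (*pollc)(void *cookie, int on);
};

typedef void (*pollc_fn)(void *cookie, int on);

static int orig_calls, dup_calls;    /* demo bookkeeping */

/* "Original" driver function. */
static void ukbd_cnpollc(void *cookie, int on)
{
    (void)cookie; (void)on;
    orig_calls++;
}

/* Our duplicate, renamed with the dusk_ prefix. */
static void dusk_ukbd_cnpollc(void *cookie, int on)
{
    (void)cookie; (void)on;
    dup_calls++;
}

static struct consops ukbd_consops = { NULL, ukbd_cnpollc };

/* Hypothetical helper: swap the pollc vector, return the previous one
 * so that it can be restored before unloading the module. */
static pollc_fn patch_pollc(struct consops *ops, pollc_fn replacement)
{
    assert(ops != NULL && replacement != NULL);
    pollc_fn old = ops->pollc;
    ops->pollc = replacement;
    return old;
}
```

Keeping the old pointer around matters: if the module is unloaded while the table still points into it, the next call through that vector lands in freed memory.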

#Find an ideal place to begin digging

What you'll do afterwards is to begin sending some of that duplicated code to Dusk through "callbacks". Callbacks are words in Dusk OS that are "SysV aware", that is, they know they're being called with a SysV calling convention and fiddle with arguments and return values accordingly.

So, you choose a bit of code in your duplicated code that you know you'll need in Dusk and that is as "unwebbed" from the rest of the code as possible. That is, that doesn't call on other code. If it does, you're better off beginning with that code.

That will be your first target.

#Prepare your Dusk API

It is likely that this piece of code you targeted operates on specific structures. You'll need to expose these structures to Dusk through an API like DeviceAPI. Identify those structures, build proxies, test them in Dusk. You have access to the values you need? Good.

#Vectorize your target

Take that piece of code you targeted and vectorize it, that is, extract it into a separate function and create a function pointer that points to it. Have the place where you extracted the code call that vector.

Then, expose the address of that vector to Dusk.
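Here is what vectorizing can look like, as a userland sketch. All the names are made up and the extracted piece (checking whether a key code falls in the 0xe0-0xe7 modifier range) is only an example, not something taken from ukbd.c. The stub at the end stands in for whatever Dusk would later write into the vector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The extracted piece of code, now a function of its own. */
static int dusk_is_modifier(uint8_t code)
{
    return code >= 0xe0 && code <= 0xe7;
}

/* The vector: a function pointer that initially points to the C code. */
int (*dusk_is_modifier_vec)(uint8_t) = dusk_is_modifier;

/* The place where the code was extracted from now calls the vector. */
static int dusk_count_modifiers(const uint8_t *codes, int n)
{
    assert(codes != NULL);
    int count = 0;
    for (int i = 0; i < n; i++)
        if (dusk_is_modifier_vec(codes[i]))
            count++;
    return count;
}

/* What gets exposed: the address of the vector, not of the function,
 * so that the other side can overwrite it. */
static void *dusk_vector_address(void)
{
    return &dusk_is_modifier_vec;
}

/* Stand-in for a replacement installed from the other side. */
static int stub_never_modifier(uint8_t code)
{
    (void)code;
    return 0;
}
```

As long as nobody touches the vector, behavior is unchanged, which makes this step safe to test on its own before any code moves to Dusk.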

#Copy vectorized code to Dusk

Copy the piece of code you vectorized and have it compiled through DuskCC. Does it compile? Good. Do you have a way to test it with dummy data, just to be sure it doesn't crash and burn your machine? Even better. Otherwise, cross your fingers.

What you'll do then is to "callback-ify" your code. The code you've just compiled is expecting a Dusk calling convention. Create a proxy word using callback[ and ]callback to wrap your code.

Then, have Dusk update the vector so that it points to your proxy. Does the device still work as expected? Hurray! You now have Dusk driving a tiny part of your target hardware!

Doesn't work? Debug, but don't make the mistake of debugging at the hardware level. The code you've moved to Dusk is already correct! It's possible that DuskCC limitations require you to re-organize the code a little bit, but try to stay as close as possible to the original code and keep the logic being deployed by that code exactly the same. If you change it, then you break something that requires deep knowledge of the target hardware and that's a whole other ballpark of effort.

When the driver is fully ported and comfy in Dusk, then you can explore the possibility of improving it by looking at the hardware's datasheet.

#Keep digging

What next? Keep digging. Initially, the "surface" between the Dusk kernel and Dusk OS will grow rather large and ugly (the callback mechanism is ugly and fragile), but after a while, you'll see it shrink as the code you've already moved references itself more and more, until, at one point, the whole code is independent from NetBSD. Porting complete!