~mrlee/www.leemeichin.com

3d0a122e5439a07bf14da03fef79c29001ec6635 — Lee Meichin 16 days ago e576922
Replace inline codes
M posts/a-decade-of-work.org => posts/a-decade-of-work.org +2 -2
@@ 20,11 20,11 @@ That was about the age that I went to sixth form, and I was lucky enough to get 

I essentially got into programming as a joke, because a friend in my new social circle wanted a website, or at least hinted at it. I can't fully remember what was on the site, but I used part of my part-time income from Tesco to buy a .co.uk domain in his name and point it to a little HTML thing I made and hosted through the registrar's free web hosting service. All I needed was an FTP client and a bit of dragging and dropping.

Before I even knew it I had 'PHP4 for dummies' and 'MySQL for dummies' on the desk under my weird bunk-bed setup, and I only found out about this stuff through faffing around with those phpBB forums and looking at the configs. I remember /why/ I sought that out though: I had a different website and noticed that it always displayed the current time when you refreshed it. I searched for how to do it and found examples in PHP, mostly from the comments section that each page of PHP docs had. It was literally as simple as changing the file extension from ◊code{html} (or ◊code{htm}) to ◊code{php} and then adding ◊code{<?php echo date(); ?>} wherever you wanted it. Deploying it was a case of dragging and dropping through FTP as most of these shared hosts offered PHP by default.
Before I even knew it I had 'PHP4 for dummies' and 'MySQL for dummies' on the desk under my weird bunk-bed setup, and I only found out about this stuff through faffing around with those phpBB forums and looking at the configs. I remember /why/ I sought that out though: I had a different website and noticed that it always displayed the current time when you refreshed it. I searched for how to do it and found examples in PHP, mostly from the comments section that each page of PHP docs had. It was literally as simple as changing the file extension from ~html~ (or ~htm~) to ~php~ and then adding ~<?php echo date(); ?>~ wherever you wanted it. Deploying it was a case of dragging and dropping through FTP as most of these shared hosts offered PHP by default.
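That snippet is small enough to reconstruct. Here's a sketch of what that first dynamic page would have looked like (the filename and surrounding markup are my guesses, not the original):

```shell
# Recreate the trick: a plain HTML file renamed to .php, with one
# dynamic line dropped in. The markup around it is hypothetical.
cat > index.php <<'EOF'
<html>
  <body>
    <p>The time right now: <?php echo date('H:i:s'); ?></p>
  </body>
</html>
EOF

# On those shared hosts, "deploying" this was just FTPing it up.
grep 'echo date' index.php
```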

That was literally my first line of dynamic code.

Skip ahead a few years, all the way to 2012 when I moved to London. I'd worked a PHP job full time before then but it was only at New Bamboo where I would find my form. They wrote everything in Ruby on Rails, and my experience in that was extremely minimal. Somehow I'd set up a Redis server and had Ruby communicating with it on my own hardware, but it didn't do much and I couldn't really figure out the code a few months after I wrote it. This required learning a lot of new things in terms of building application servers and deploying code. Capistrano◊^[4] was the tool of choice in Ruby-land for deploying to a VPS and it was essentially a DSL over a bunch of shell scripts. In all honesty this DSL was great, but were I not made to use it, I would be a lot more intimate with the power of SSH and tools like ◊code{scp}, and with the issues that arise when forwarding your SSH agent because you pull from a private git repo on your server.
Skip ahead a few years, all the way to 2012 when I moved to London. I'd worked a PHP job full time before then but it was only at New Bamboo where I would find my form. They wrote everything in Ruby on Rails, and my experience in that was extremely minimal. Somehow I'd set up a Redis server and had Ruby communicating with it on my own hardware, but it didn't do much and I couldn't really figure out the code a few months after I wrote it. This required learning a lot of new things in terms of building application servers and deploying code. Capistrano◊^[4] was the tool of choice in Ruby-land for deploying to a VPS and it was essentially a DSL over a bunch of shell scripts. In all honesty this DSL was great, but were I not made to use it, I would be a lot more intimate with the power of SSH and tools like ~scp~, and with the issues that arise when forwarding your SSH agent because you pull from a private git repo on your server.

I won't talk much about the code, although my years at New Bamboo were truly formative. One thing has stuck with me since then though, over the 8 years since I was told it. My boss at the time saw I was struggling with managing the expectations of the client I was working with, and I was trying too hard to do things alone and hoping for the best instead of reaching out for the help that was readily available. I must have only been about 5 months into the job at that point. My boss took me into our boardroom, the table of which doubled up as a pingpong table, asked if I was alright, and then said something I've never forgotten since:


M posts/blogging-in-haskell.org => posts/blogging-in-haskell.org +3 -3
@@ 12,11 12,11 @@ Eventually I stumbled across Hakyll◊^[5] and, after finding a CSS 'framework' 

The major appeal so far has been the immense ease of customisation. Hakyll itself isn't a static site generator in the same sense that others are, and as a result it offers a layer of customisation that other generators generally defer to templating languages for.

The main difference is that you don't pull down a ◊code{hakyll} binary and then throw a ◊code{yaml} file together in order to configure a few pre-defined properties; you're instead given a basic implementation of a generator, using hakyll's own library, and thus have complete control over routing, page generation, templating, and so on. This generally lives in a ◊code{site.hs} file and it's not difficult to follow even for relative newbies to Haskell. The structure of everything else is entirely up to you.
The main difference is that you don't pull down a ~hakyll~ binary and then throw a ~yaml~ file together in order to configure a few pre-defined properties; you're instead given a basic implementation of a generator, using hakyll's own library, and thus have complete control over routing, page generation, templating, and so on. This generally lives in a ~site.hs~ file and it's not difficult to follow even for relative newbies to Haskell. The structure of everything else is entirely up to you.

Once you compile this file, you end up with a nice binary, e.g. ◊code{site}, and /that/ is what you use to generate your site. It is beautiful in its elegance and I'm eager to see what I can add to this site while also learning some more Haskell at the same time.
Once you compile this file, you end up with a nice binary, e.g. ~site~, and /that/ is what you use to generate your site. It is beautiful in its elegance and I'm eager to see what I can add to this site while also learning some more Haskell at the same time.

As an example, on the home page, there is a ◊code{git log} output section. It's fairly primitive, although I intend to build out the functionality a bit more. Writing the functionality was fairly effortless, with the help of some other authors on the net:
As an example, on the home page, there is a ~git log~ output section. It's fairly primitive, although I intend to build out the functionality a bit more. Writing the functionality was fairly effortless, with the help of some other authors on the net:

◊codeblock['haskell]{
  data GitLog = Hash | Commit | Full

M posts/can-you-crack-the-code.org => posts/can-you-crack-the-code.org +5 -5
@@ 60,7 60,7 @@ Mr Merritt is, to put it professionally, *god damn right*. Here's a valid Prolog

What we have here are some facts, both true and technically true. It's a fact that Obama is a president, as is Trump. It's also a fact that there is a brand of cheese in the UK called President. This is quite ambiguous as a result so some extra facts are supplied, namely that brie is a cheese as much as it is a President-brand cheese, and that Wensleydale is also a cheese. It goes without saying that Trump and Obama are people, so with those facts we should be able to do some querying.

If you're doing this on your own machine, you can save those facts into a file (say, ◊code{example.pl}) and then import it inside a console, like so: ◊code{[example].}. Otherwise, you can load up the Swish notebook◊^[2] and follow along using an online console, no installation needed!
If you're doing this on your own machine, you can save those facts into a file (say, ~example.pl~) and then import it inside a console, like so: ~[example].~. Otherwise, you can load up the Swish notebook◊^[2] and follow along using an online console, no installation needed!
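If you'd rather not retype the facts from the prose above, here's a sketch of what ~example.pl~ might contain (my reconstruction; the queries later in the post assume facts along these lines):

```shell
# Write the facts from this post into example.pl; inside a Prolog
# console you'd then load it with:  ?- [example].
cat > example.pl <<'EOF'
president(obama).
president(trump).
president(brie).        % President is also a brand of cheese
cheese(brie).
cheese(wensleydale).
person(obama).
person(trump).
EOF

grep -c '^president' example.pl   # → 3
```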

Let's do some querying then, which will show you how Prolog might seem a bit back to front compared to what you're used to.



@@ 75,12 75,12 @@ So far, so boring. We stated `president(trump)` as a fact in our first prolog fi
}

◊aside{
  With the online editor you can click 'Next' to see all of the results, and in the console you can type ◊code{;}. This can be done repeatedly until the input ends with ◊code{.}, which says there are no more facts that fit the query.
  With the online editor you can click 'Next' to see all of the results, and in the console you can type ~;~. This can be done repeatedly until the input ends with ~.~, which says there are no more facts that fit the query.
}

The fuck? What is ◊code{X}?
The fuck? What is ~X~?

◊code{X} is a variable, or a placeholder if you like. Any word starting with a capital letter is a variable, and when you pass one in a query Prolog will supply the results of the query to those variables. In this case, we're essentially saying /"who are all the presidents? I don't know their names so put them all in ◊code{X} for me"/.
~X~ is a variable, or a placeholder if you like. Any word starting with a capital letter is a variable, and when you pass one in a query Prolog will supply the results of the query to those variables. In this case, we're essentially saying /"who are all the presidents? I don't know their names so put them all in ~X~ for me"/.

Let's try one more thing, which should explain enough about Prolog to be dangerous.



@@ 88,7 88,7 @@ Let's try one more thing, which should explain enough about Prolog to be dangero
  president(X), cheese(X). % brie.
}

/Now we're cookin' wi' gas!/ as we'd say back up north. A lot of what you do in Prolog is chain little sentences like this together (using the comma operator ◊code{,}, which means ◊code{and}), and in this instance we're asking Prolog to get all the presidents, put them in ◊code{X}, and then show me only the presidents that are also a cheese. The ◊code{.} finishes the sentence, or the query. Let's do a similar query to wrap this intro up, and you can see if your guess at the answer is the same as what this produces.
/Now we're cookin' wi' gas!/ as we'd say back up north. A lot of what you do in Prolog is chain little sentences like this together (using the comma operator ~,~, which means ~and~), and in this instance we're asking Prolog to get all the presidents, put them in ~X~, and then show me only the presidents that are also a cheese. The ~.~ finishes the sentence, or the query. Let's do a similar query to wrap this intro up, and you can see if your guess at the answer is the same as what this produces.

◊codeblock['prolog]{
  president(X), person(X). % trump, obama.

M posts/floc-off.org => posts/floc-off.org +2 -2
@@ 12,11 12,11 @@ Browsing this site will not opt you into this latest experiment in large-scale p

You might notice that the site does gather analytics using plausible.io◊^[1], who themselves go into some more detail about this and how to opt-out◊^[2].

You can see the analytics for yourself, as I have made them public - you and I see the same thing on that page. It's a glorified hit-counter that lets me see which posts land better than others, and it is very easily adblockable. In fact, go ahead and block JavaScript on this site - if there's any feature I ever add that depends on it, there will always be an accessible ◊code{<noscript>}◊^[3] fallback if it actually matters to me.
You can see the analytics for yourself, as I have made them public - you and I see the same thing on that page. It's a glorified hit-counter that lets me see which posts land better than others, and it is very easily adblockable. In fact, go ahead and block JavaScript on this site - if there's any feature I ever add that depends on it, there will always be an accessible ~<noscript>~◊^[3] fallback if it actually matters to me.

I don't have any issue with that kind of technology, for what it's worth. You're only seeing how people use your site so you can figure out how you might tweak things, or understand what you need to do less of if you're scaring people away. It has practically nothing in common with the invasive tracking and advertising that follows you all across the internet, the likes that Google and Facebook involve themselves with at a scale beyond human comprehension.

Anyway, every page here is served with the ◊code{Permissions-Policy: interest-cohort=()} header set. There is a valid argument that this still presents a datapoint that can be tracked, but since the change is happening server-side, it is less useful than if you sent the same thing from your browser in every request, adding to your unique fingerprint (as with ◊code{Do-Not-Track}, an abject failure of a standard◊^[4]).
Anyway, every page here is served with the ~Permissions-Policy: interest-cohort=()~ header set. There is a valid argument that this still presents a datapoint that can be tracked, but since the change is happening server-side, it is less useful than if you sent the same thing from your browser in every request, adding to your unique fingerprint (as with ~Do-Not-Track~, an abject failure of a standard◊^[4]).
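For reference, this is roughly what setting that header looks like if your server happens to be nginx (a sketch only; I'm not claiming this is the exact config behind this site):

```nginx
# Sent with every response so the browser excludes visitors from
# FLoC interest-cohort calculation.
add_header Permissions-Policy "interest-cohort=()" always;
```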

If you're curious, you can also check out the Security Headers report for this site◊^[5].


M posts/gettin-ziggy-with-it-pi-zero.org => posts/gettin-ziggy-with-it-pi-zero.org +20 -20
@@ 14,7 14,7 @@ I've been interested in giving Zig a spin for quite a while, and once my Raspber

With that out of the way, let's see if we can write something in Zig to power this little display. It's going to be a simple program that simply fills the entire screen by turning the pixels from black (off) to white (on). As an extra challenge, we will do this without pulling in dependencies like WiringPi◊^[5], or relying on existing drivers, as lovely as they are.

Instead, we will be directly using the i◊sup{2}c dev interface◊^[6]. If you're using Debian and/or Ubuntu on your Pi and your own machine, you can grab these libraries with a simple ◊code{sudo apt install i2c-dev}. You will need to enable i◊sup{2}c on your Pi separately though, through ◊code{sudo raspi-config}◊^[7].
Instead, we will be directly using the i◊sup{2}c dev interface◊^[6]. If you're using Debian and/or Ubuntu on your Pi and your own machine, you can grab these libraries with a simple ~sudo apt install i2c-dev~. You will need to enable i◊sup{2}c on your Pi separately though, through ~sudo raspi-config~◊^[7].

Ready to... get Ziggy with it? Oh, I bet you are. 😋 If you want to skip to the end and just grab the code, though, you can find this all on GitHub◊^[8]. I called it Stardust, like /Zig/gy Stardust. Get it?



@@ 26,7 26,7 @@ Ready to... get Ziggy with it? Oh, I bet you are. 😋 If you want to skip to th

The first and most complicated part of any low-level project is the bit where you try and establish a build system of some sort. We're going to forget about that completely for now and apply some elbow-grease to the situation.

The next step is to define a ◊code{main} function that grabs a file descriptor (or handle) corresponding to our OLED display. According to the aforementioned dev interface docs, we'll need to open a file and check it with ◊code{ioctl}.
The next step is to define a ~main~ function that grabs a file descriptor (or handle) corresponding to our OLED display. According to the aforementioned dev interface docs, we'll need to open a file and check it with ~ioctl~.

◊codeblock['zig]{
  const std = @import("std");


@@ 56,11 56,11 @@ The next step is to define a ◊code{main} function that grabs a file descriptor

You might have noticed something odd: we're not really writing much Zig here, it's practically 95% interop with C. The beauty of Zig is that this interop is so simple and intuitive that it's the /easiest/ way to get started if you're going to be linking against existing C libraries. Get the software working first, abstract it later, as they say, and you might already start to get an idea of what we could convert into idiomatic Zig libraries in future.

The actual Zig code you see though, is quite different to the C stuff. That ◊code{defer fd.close()}, for example, /ensures/ that the file descriptor we opened up will be closed when we're done. If we don't do that, then it'll stay open and there'll be a leak.
The actual Zig code you see though, is quite different to the C stuff. That ~defer fd.close()~, for example, /ensures/ that the file descriptor we opened up will be closed when we're done. If we don't do that, then it'll stay open and there'll be a leak.

There's also the ◊code{try} keyword, used in combination with the ◊code{!void} return type, which will be super familiar if you've written some Rust and have dealt with option types. It's shorthand for executing the code and catching/dealing with the error, with ◊code{!void} being another shorthand for ◊code{anyerror!void}, namely: this function returns either nothing, or an error if there is one.
There's also the ~try~ keyword, used in combination with the ~!void~ return type, which will be super familiar if you've written some Rust and have dealt with option types. It's shorthand for executing the code and catching/dealing with the error, with ~!void~ being another shorthand for ~anyerror!void~, namely: this function returns either nothing, or an error if there is one.

What we've actually done, however, is open the device file ◊code{/dev/i2c-1}, and then used the ◊code{ioctl} library to specify which device in particular we want to talk to. You can find out this value by running ◊code{i2cdetect -y 1}, like so:
What we've actually done, however, is open the device file ~/dev/i2c-1~, and then used the ~ioctl~ library to specify which device in particular we want to talk to. You can find out this value by running ~i2cdetect -y 1~, like so:

◊codeblock['text]{
  pi@raspberrypi:~ $ i2cdetect -y 1


@@ 76,7 76,7 @@ What we've actually done, however, is open the device file ◊code{/dev/i2c-1}, 
}

◊aside{
  In my case, the device can be accessed at address ◊code{0x3C}, which is how I defined ◊code{i2c_addr} above.
  In my case, the device can be accessed at address ~0x3C~, which is how I defined ~i2c_addr~ above.
}

We're at a good point now to try and compile this thing and then run it on the Pi. If we get the message 'Init successful.' then we're golden.


@@ 91,26 91,26 @@ Are you writing this code on the Pi itself? Probably not, I imagine, and nor do 

◊q["Andrew Kelley" 2020]{Cross-compiling is a first-class use case}

Let's build a binary, then. Save your code into a file, say, ◊code{stardust.zig} and then proceed.
Let's build a binary, then. Save your code into a file, say, ~stardust.zig~ and then proceed.

◊codeblock['bash]{
  zig build-exe stardust.zig  -target arm-linux-musleabihf -mcpu arm1176jzf_s -O ReleaseSafe -lc
}

To unpack that a little, the ◊code{target} is a triplet stating that we want to build this using the musl◊^[9] libc ABI, on a 32-bit ARM architecture. ◊code{mcpu} goes along with that to make sure the resulting binary will work on our Pi Zero. I grabbed these values from an issue on Zig's GitHub repo◊^[10], so credit goes to the author of that issue for unintentionally guiding me forward.
To unpack that a little, the ~target~ is a triplet stating that we want to build this using the musl◊^[9] libc ABI, on a 32-bit ARM architecture. ~mcpu~ goes along with that to make sure the resulting binary will work on our Pi Zero. I grabbed these values from an issue on Zig's GitHub repo◊^[10], so credit goes to the author of that issue for unintentionally guiding me forward.

Passing the optimiser flag (◊code{-O}) isn't strictly necessary, so you can omit this if you require a debug build and stack traces with errors.
Passing the optimiser flag (~-O~) isn't strictly necessary, so you can omit this if you require a debug build and stack traces with errors.

◊code{-lc} basically says that this binary needs to be linked against libc.
~-lc~ basically says that this binary needs to be linked against libc.

Once the build finishes, you should find a shiny new executable called ◊code{stardust} in the same directory as your code. You can get it onto your Pi with ◊code{scp}, like so:
Once the build finishes, you should find a shiny new executable called ~stardust~ in the same directory as your code. You can get it onto your Pi with ~scp~, like so:

◊codeblock['bash]{
  scp stardust pi@raspberrypi:~/stardust
}

◊aside{
  You will need to change ◊code{pi@raspberrypi} to whatever else you've configured if you've changed the defaults.
  You will need to change ~pi@raspberrypi~ to whatever else you've configured if you've changed the defaults.
}

SSH into your Pi after that, and try and run it! Does it return successfully? I hope so!


@@ 194,9 194,9 @@ Next we'll want to init the display and get it into a clean state, with the curs
  }
}

Wow, actual Zig code! The formatting may look a little odd because that's what ◊code{zig fmt} decides is appropriate.
Wow, actual Zig code! The formatting may look a little odd because that's what ~zig fmt~ decides is appropriate.

◊code{init_display} is quite a complex beast that issues a whole series of commands that sets up the display for further use. A more detailed explanation of that will be in another post, for the sake of brevity, but in essence it was adapted from Adafruit's CircuitPython driver, written in Python◊^[15].
~init_display~ is quite a complex beast that issues a whole series of commands that sets up the display for further use. A more detailed explanation of that will be in another post, for the sake of brevity, but in essence it was adapted from Adafruit's CircuitPython driver, written in Python◊^[15].

The recurring theme in all of these new functions is that the entire basis of their existence is to create an array of two bytes, and then write them to the file descriptor we opened right at the start. The data structure looks something like this:



@@ 205,9 205,9 @@ The recurring theme in all of these new functions is that the entire basis of th
  buf[1] = 0x??; // the value to assign to that register
}

The file opened in ◊code{main} isn't a traditional file as you know it, but it points to all of the devices connected to your GPIO header on the Pi. Therefore, if you know enough about the hardware at a low enough level, you can control all of them by writing the right bytes to the right register, at the right address.
The file opened in ~main~ isn't a traditional file as you know it, but it points to all of the devices connected to your GPIO header on the Pi. Therefore, if you know enough about the hardware at a low enough level, you can control all of them by writing the right bytes to the right register, at the right address.
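To make the 'two bytes' idea concrete without any hardware attached, you can build the same payload on any machine (the register and value below are placeholders, not taken from the driver):

```shell
# buf[0] is the register, buf[1] the value. 0x00 and 0xAE here are
# purely illustrative; octal escapes (\000, \256) keep printf portable.
printf '\000\256' > payload.bin

# Two bytes, exactly the shape that gets written to the i2c
# file descriptor.
od -An -tx1 payload.bin   # → 00 ae
```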

The rest of the code, e.g. ◊code{reset_cursor}, resets the state of the display in such a way that you can write a pixel and the cursor will advance, linearly, to the next one.
The rest of the code, e.g. ~reset_cursor~, resets the state of the display in such a way that you can write a pixel and the cursor will advance, linearly, to the next one.

◊codeblock['zig]{
  fn fill(fd: fs.File) !void {


@@ 220,7 220,7 @@ The rest of the code, e.g. ◊code{reset_cursor}, resets the state of the displa
  }
}

This ◊code{fill} function will (rather quickly) turn the display solid white, updating each pixel one at a time. Before we continue though, let's go through some more Zig specifics; namely, ◊code{inline}.
This ~fill~ function will (rather quickly) turn the display solid white, updating each pixel one at a time. Before we continue though, let's go through some more Zig specifics; namely, ~inline~.

◊hr{}



@@ 230,11 230,11 @@ This ◊code{fill} function will (rather quickly) turn the display solid white, 
  Reach out to me at pleasemakeitstop@mrlee.dev if this is too much for you.
}

Zig has some nice language features intended to replace and improve upon C/C++ preprocessor macros. The ◊code{inline} keyword is one such thing, and when applied to a ◊code{for} or ◊code{while} loop it'll unroll it at compile time. A simple optimisation but a useful one. We don't use it, but you also have ◊code{comptime}, which is powerful enough to be able to implement generics, if you so desire. We're not going to go into that here though, and you can read more about it from a certain Loris Cro◊^[16].
Zig has some nice language features intended to replace and improve upon C/C++ preprocessor macros. The ~inline~ keyword is one such thing, and when applied to a ~for~ or ~while~ loop it'll unroll it at compile time. A simple optimisation but a useful one. We don't use it, but you also have ~comptime~, which is powerful enough to be able to implement generics, if you so desire. We're not going to go into that here though, and you can read more about it from a certain Loris Cro◊^[16].

◊hr{}

This post is getting pretty long-winded, and all I wanted to do was show how to set some pixels on a tiny display. Let's wrap this up then, since we're almost ready to recompile. Just one finishing touch, which is to call the functions we defined. Update ◊code{main} to look like this:
This post is getting pretty long-winded, and all I wanted to do was show how to set some pixels on a tiny display. Let's wrap this up then, since we're almost ready to recompile. Just one finishing touch, which is to call the functions we defined. Update ~main~ to look like this:

◊codeblock['zig]{
  pub fn main() !void {


@@ 261,7 261,7 @@ This post is getting pretty long-winded, and all I wanted to do was show how to 
  }
}

Once you're done, rebuild the binary and ◊code{scp} it over, like you did the first time. SSH into your Pi and run it again (i.e. ◊code{./stardust}), and see your display light up! 🥳
Once you're done, rebuild the binary and ~scp~ it over, like you did the first time. SSH into your Pi and run it again (i.e. ~./stardust~), and see your display light up! 🥳

◊hr{}


M posts/hakyll-on-devops-pipelines.org => posts/hakyll-on-devops-pipelines.org +11 -11
@@ 13,9 13,9 @@ In a way, this is total overkill for a static site. If I have the repo cloned on
  scp -r _site/ deploy@mrlee.dev:/var/www/www.mrlee.dev/
}

It's flawed compared to using ◊code{rsync}, as it won't remove existing files, but it does the job in less than a second or two.
It's flawed compared to using ~rsync~, as it won't remove existing files, but it does the job in less than a second or two.

The thing is, this isn't so quick if I want to publish a post from a different computer that doesn't have any programming tools installed. I would have to install ◊code{stack}◊^[1], which is a build tool for Haskell, and then I would have to run ◊code{stack build}. This can take at least half an hour as the command will pull down the correct version of ◊code{GHC} and a 'snapshot' (basically a huge collection of all the Hackage◊^[2] libraries available for that build) before it even /thinks/ about compiling my ◊code{site.hs} file. It also means committing a few gigs of storage space for all of that.
The thing is, this isn't so quick if I want to publish a post from a different computer that doesn't have any programming tools installed. I would have to install ~stack~◊^[1], which is a build tool for Haskell, and then I would have to run ~stack build~. This can take at least half an hour as the command will pull down the correct version of ~GHC~ and a 'snapshot' (basically a huge collection of all the Hackage◊^[2] libraries available for that build) before it even /thinks/ about compiling my ~site.hs~ file. It also means committing a few gigs of storage space for all of that.

I like to write from my little Surface Pro when I'm out and about, so I'd rather not do a full-blown compilation on that for the sake of my battery. Enter Azure DevOps Pipelines◊^[3].



@@ 32,7 32,7 @@ Let's do a step-by-step walk through my setup.
    vmImage: 'ubuntu-latest'
}

This is pretty much CI boilerplate. The build will run on any PR that targets ◊code{master}, and it uses Ubuntu as the underlying image. I'm not doing any Docker stuff here.
This is pretty much CI boilerplate. The build will run on any PR that targets ~master~, and it uses Ubuntu as the underlying image. I'm not doing any Docker stuff here.

◊codeblock['yaml]{
  jobs:


@@ 60,7 60,7 @@ Won't get far without grabbing the latest stable Stack binary.
      cacheHitVar: 'STACK_SNAPSHOT_RESTORED'
}

Later on there will be a step that runs ◊code{stack build}, which will take about 40 minutes in CI. It would be a waste to repeatedly download all of that, so I'm caching the root stack folder for good measure. The ◊code{cacheHitVar} is something we will reference later.
Later on there will be a step that runs ~stack build~, which will take about 40 minutes in CI. It would be a waste to repeatedly download all of that, so I'm caching the root stack folder for good measure. The ~cacheHitVar~ is something we will reference later.

◊codeblock['yaml]{
  - task: Cache@2


@@ 81,7 81,7 @@ This is the same as the last step, but it's for the dependencies my static site 
    condition: ne(variables.STACK_SNAPSHOT_RESTORED, 'true')
}

Notice the ◊code{STACK_SNAPSHOT_RESTORED} condition at the bottom there? This step sets up GHC and the Stack snapshot, but only if one wasn't restored from the cache. If the cache has it, then it will have already been fetched.
Notice the ~STACK_SNAPSHOT_RESTORED~ condition at the bottom there? This step sets up GHC and the Stack snapshot, but only if one wasn't restored from the cache. If the cache has it, then it will have already been fetched.

◊codeblock['yaml]{
  - script: |


@@ 100,7 100,7 @@ This is the same as above, but for the project dependencies. So far so good. We'
    displayName: Build Site Executable
}

Since I've already run ◊code{stack build}, this just copies the binary to a different location, which I use to store it as a build artifact. ◊code{Build.BinariesDirectory} is a special place on the VM to store compiled build artifacts. It doesn't matter where specifically that is, only that it's the same across steps.
Since I've already run ~stack build~, this just copies the binary to a different location, which I use to store it as a build artifact. ~Build.BinariesDirectory~ is a special place on the VM to store compiled build artifacts. It doesn't matter where specifically that is, only that it's the same across steps.

◊codeblock['yaml]{
  - task: PublishBuildArtifacts@1


@@ 121,7 121,7 @@ So, that's the first step done, but what about actually publishing a post? I hav
    steps: ...
}

The key to this step is the condition. This will run only if the ◊code{build} job was successful, /and/ the branch being built is the master branch. Practically, this only runs if I push straight to master or merge a PR. The staging version runs only on PRs.
The key to this step is the condition. This will run only if the ~build~ job was successful, /and/ the branch being built is the master branch. Practically, this only runs if I push straight to master or merge a PR. The staging version runs only on PRs.

◊codeblock['yaml]{
  - task: DownloadBuildArtifacts@0


@@ 141,7 141,7 @@ Time to put that binary I compiled to good use. It downloads it into the main wo
    displayName: Build with published posts
}

This is the same as running ◊code{stack exec site build} on my local machine. It compiles the static site, so finally I'll have a new version to upload.
This is the same as running ~stack exec site build~ on my local machine. It compiles the static site, so finally I'll have a new version to upload.

◊codeblock['yaml]{
  - task: InstallSSHKey@0


@@ 151,7 151,7 @@ This is the same as running ◊code{stack exec site build} on my local machine. 
      sshKeySecureFile: 'nexus_deploy'
}

I host this blog on my own little VPS, which means that the server needs to know that the CI is authorised to connect to it with its SSH key. This is the same as having a deploy key on GitHub, and requires generating a keypair to be stored in CI, with the public key being added to your ~authorized_keys~ file of the appropriate user on the server.

◊aside{
  At this point I'll say that if you're doing this yourself, make sure to properly harden your server. I'll describe this more in a follow-up post.


@@ 171,9 171,9 @@ There's only step left now, and that's to deploy!
      readyTimeout: '20000'
}

This is similar to running ~rsync~ to deploy, except that it knows where to get your private key from and where to connect to. This is defined elsewhere in Azure DevOps, through the UI, rather than in the YAML file.

To solve the issue I first mentioned, ~cleanTargetFolder~ makes sure to delete the previous deployment before copying the new one over. Problem solved!

To see the pipeline in full, you can check out the full YAML file◊^[5]. I've been using it with success for the past couple of weeks now.


M posts/rewrite-it-in-lisp.org => posts/rewrite-it-in-lisp.org +3 -3
@@ 14,7 14,7 @@ Going through Beautiful Racket led me to the technology used to build the book, 

The best way to evaluate a solution is to build a proof of concept, and soon enough I'd embarked on a several-week long project to convert this blog to be a Pollen-based publication. The end result is what you're reading now: a successful rewrite, operating in production without a hitch.

In order to make that happen, I had to convert all the Markdown-based posts to use the Pollen Markup processor. This is a minimal syntax that uses a 'lozenge' (~◊"◊"~) to apply formatting to text where needed. This makes it much easier to publish posts under multiple formats, like LaTeX or PDF, as each format can render the same tags in an appropriate way, not just as HTML.

That was the easy, albeit boring, part. The real challenge was taking the bits and pieces I'd written in Haskell and porting them over to this new system, such as the changelog that is rendered at the bottom of each post and the estimated reading time at the top. Footnote references are not a built-in feature either, so that also required some thinking.



@@ 24,7 24,7 @@ This is where Pollen gets interesting: each source file is essentially converted
  (root (div ((class "main")) (p "Hello" (span "world"))))
}

There is no templating language, per se, in Pollen. The ~◊"◊"~ syntax is a thin layer of sugar over normal Lisp syntax, and all you're ultimately doing is calling a function that you've defined somewhere in your code (or in the same file). These functions can do anything--there's nothing special about them--so long as they return a valid X-expression, which means that you don't really have to learn how to extend a language or figure out how to integrate with one.

As such, a functional approach is still the easiest way to add new capabilities to your project, although it may not always be enough if you need to handle state across pages. An example is my implementation for links with footnotes:



@@ 83,7 83,7 @@ You can see the rest of what I scripted in the main `pollen.rkt` file for this s

In the end, the only thing I traded off was the post category in the URL itself. Hakyll made that easy to add, but ultimately it wasn't that important a feature. There was no point getting clever as there are only a dozen or so published posts, so I constructed a flat list of permanent redirects for Caddy to serve◊^[5].

Overall, I'm happy with the change. It doesn't take 45 minutes to re-compile my Hakyll build if I add new functionality, and I don't have to host my own Debian repo to install the compiled binary through ~apt~. Rendering the site and publishing it is still completed in a matter of seconds. And, now that I've gone through the effort of deploying it for this site, I'm a lot more confident about using Pollen again for other projects.

◊footnotes{
  ◊^[1]{◊<>["https://beautifulracket.com/"]}

M posts/ruby-sorcery-ractor-2.org => posts/ruby-sorcery-ractor-2.org +7 -7
@@ 55,13 55,13 @@ The anatomy of an HTTP request is divided, essentially, into three parts:
                                   | Final empty line to denote end of request
}

The first line states the request method (i.e. ~GET~, ~POST~, etc.) and the target of the request, which will often be a relative path on the server but can also be a full URL.

◊aside{Note that there is also a concept of HTTP 'trailers', which are headers that appear /after/ the body◊^[2]. These are only used for certain kinds of chunked requests and are way out of scope for this post.}

What follows is a list of headers, which are key/value pairs used to provide extra information about the request. For the sake of simplicity, most of these will be ignored, and there are /many/ of them.

Finally, there is a place for the body of the request. This is optional, but it must have an empty line both before and afterwards if it is present. A typical request body may contain URL-encoded form data, which is how your typical HTML forms work, but there is not much of a restriction provided that the ~Content-Type~ header describes the format of the payload, e.g. if it's JSON, XML, or perhaps even something like an image or a video.
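
The three-part anatomy above can be sketched with plain string handling. This is a hypothetical toy parse, not the Ractor-based parser built later in the post:

```ruby
# Split a raw HTTP request into its three parts: the request line,
# the headers, and the (optional) body after the blank line.
raw = "POST /submit HTTP/1.1\r\n" \
      "Host: example.com\r\n" \
      "Content-Type: application/json\r\n" \
      "\r\n" \
      "{\"ok\":true}"

head, body = raw.split("\r\n\r\n", 2)        # blank line separates head from body
request_line, *header_lines = head.split("\r\n")
verb, target, version = request_line.split(" ", 3)
headers = header_lines.to_h { |line| line.split(": ", 2) }
```

A real parser has far more to worry about (continuation lines, duplicate headers, chunked bodies), but the split-on-CRLF structure is the core of it.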

In this chapter, the main focus is on the first section, and some of the second. Since the purpose of the post is to demonstrate Ractor, and in the Actor model everything is an actor... the thing that parses the request will be an actor too.



@@ 108,11 108,11 @@ Something like this should do the trick, and provide a foundation to build on.
  end
}

◊aside{Why all the ~loop~s and ~while~ loops? Ractors behave a bit like Enumerators◊^[5], which means that if they stop yielding values or actually return a value, the Ractors close and can no longer be used.}

What we have here is a Ractor that waits for incoming HTTP request messages, and then parses them into something that the server can more easily work with by pulling out important info like the request location, the HTTP method, the content type, and the body. In this example, Ruby's pattern matching features are liberally employed to handle the parsing in some places; this is more for the sake of demonstration to show that it /can/ be done, not necessarily that it always /should/ be.

In any case, once the ~HttpRequest~ object is constructed, it is yielded so that another Ractor can use the object, and therefore it will sit in a queue (or a mailbox, in actor model parlance) until it is taken from it. As a final housekeeping step, the string scanner instance used to parse the request is terminated. It's always a good idea to clean up after yourself if the language provides you the mechanism to do so.
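
The yield-then-take handoff can be seen in miniature. This is a hypothetical toy, separate from the post's server code:

```ruby
# A tiny parser-like Ractor: each message it receives is transformed
# and yielded, where it sits in the outgoing queue until taken.
parser = Ractor.new do
  loop do
    raw = Ractor.receive
    Ractor.yield(raw.strip.upcase)
  end
end

parser.send("get / http/1.1\r\n")
result = parser.take # => "GET / HTTP/1.1"
```

Because the block loops rather than returning, the Ractor stays open and can keep servicing messages.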

Going back to the functionality at hand; this basically shunts the parsing of HTTP requests into another thread, which means that the Ractors responsible for managing the TCP layer can stay responsible for that, and hand over the application-layer responsibilities to other actors/processes/Ractors.



@@ 133,13 133,13 @@ The TCP server now requires an upgrade: it's going to read input but it can no l
  end
}

The most significant change, here, is that the innermost Ractor sends input over to the new ~HttpRequestParser~ Ractor. It then immediately waits for a response. That seems a bit weird - why not just do it inline? - but that's only because the job of the HTTP Parser is pretty basic right now, whereas in future a whole bunch of things can happen in between the TCP layer reading in some data, and the TCP layer sending back a bunch of HTML or JSON or some such.

◊aside{This works for basic requests with no body element, but consider why it fails if a body is also supplied. Would the connection not have already closed?}

In our toy examples, this works fine, but try this with many clients at once and you will experience chaos. This is because we're using a single global ractor to parse input from any number of connections. Perhaps it shouldn't be a ractor at all, or it should work a little differently. This will be addressed in another chapter, as it becomes clear that building a concurrent HTTP server isn't as simple as it looks even when your concurrency primitives are threadsafe.

Note that this won't work with ~curl~ yet, because the server isn't returning an appropriate response.

With that said, it's a good time to combine these two things, to make a functioning server:



@@ 208,7 208,7 @@ It's gonna take a little bit more work to turn this into a workable HTTP server,
  ◊li{It primarily uses Ractors for communication}
}

The next chapter will focus on creating a valid response, something that ~curl~ will like. Keep in mind that the primary goal is to get something that works, warts and all, and it will be revisited later, once we've learned more.

◊footnotes{
  ◊^[1]{◊<>["https://www.kamelasa.dev/posts/ruby-sorcery-ractor.html"]}

M posts/ruby-sorcery-ractor-3.org => posts/ruby-sorcery-ractor-3.org +4 -4
@@ 6,7 6,7 @@
:CATEGORY: ruby
:END:

In chapter 2 of this exploration, a basic Ractor-based TCP server was refactored into a partly-functional HTTP server.◊^[1] It can handle super-basic requests, but it doesn't send back a valid HTTP response. That means that it's difficult to use tools like ~curl~ to interact with the server and, thanks to that, previous demonstrations have depended on hand-crafting requests inside a ~telnet~ session.

This chapter is therefore going to focus on constructing a valid HTTP response. As with the previous chapter, it won't cover 100% of the spec, but it will lay down some decent foundations.



@@ 56,15 56,15 @@ A response has a similar structure to a request, comprising a status line, an ar
  <h1>Hello world!</h1>                 | 3. Body
}

◊aside{Most responses will contain a body, but there are certain status codes where it wouldn't make sense to provide one, e.g. with ~3xx~ codes for temporary and permanent redirects, and the ~204 No Content~ response.}

Most responses, and in particular the ones being handled by this basic Ractor server, will be sent all at once. But how does the client know when it's received all of the response from the server? This is important to know, because it won't be possible to render HTML or parse JSON until the client knows that it has all of the data. It can't rely on two consecutive carriage-returns, as with a request, because the response itself may legitimately contain those characters. Neither can it rely on the server closing the TCP connection, because that can happen for many other reasons.

The ~Content-Length~ header is therefore required for such responses, as it informs the client that a response body will be present and it will be of a certain size in bytes. Given that information, the client can read in the same number of bytes and expect to have received the whole response.
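
Assembling such a response by hand makes the role of the header obvious. A minimal sketch, not the server's actual response builder:

```ruby
# Content-Length tells the client exactly how many bytes of body to
# read after the blank line, so it knows when the response is complete.
body = "<h1>Hello world!</h1>"
response = [
  "HTTP/1.1 200 OK",
  "Content-Type: text/html",
  "Content-Length: #{body.bytesize}",
  "",
  body
].join("\r\n")
```

Note the use of ~bytesize~ rather than ~length~: the header counts bytes, and the two differ as soon as the body contains multi-byte characters.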

This is not the only way to define a response, as it can also be split up into chunks and delivered in parts: this is generally what happens when downloading large files. The same method can also be used for streaming data where the total length is unknown, but the client is expected to process the response on-the-fly instead of waiting for a completed payload.

For the sake of simplicity, the server will handle only the simple case for now: given a request to the server, it will respond with ~200 OK~ and a string of HTML.

◊h2{Building an HTTP Response in Ractor}


M posts/ruby-sorcery-ractor.org => posts/ruby-sorcery-ractor.org +5 -5
@@ 8,9 8,9 @@

This is part two of a series of posts about Ruby and its more experimental features. The first part is about pattern matching.◊^[1]

Ractor is a new addition to Ruby's core library, and it is essentially an implementation of the Actor model. More importantly, it offers a more lightweight approach to concurrency that might feel more at home to those familiar with Go's channels, or have perhaps worked with Elixir. Note that this isn't a wholesale replacement of Ruby's existing multithreading implementations, namely ~Thread~◊^[2] and ~Fiber~◊^[3], and is still highly experimental. As such, there is no guarantee that it would remain stable across future Ruby versions.

◊aside{Speaking of ~Fiber~s, they've received some upgrades in Ruby 3 too. You can now create non-blocking fibers and provide your own scheduler to run them automatically.}

◊h1{The actor model}



@@ 20,7 20,7 @@ In a nutshell, it's a way of handling concurrent communication. If, in OOP, ever

An actor is something that has one or more addresses, and receives messages; but for anything useful to happen, it also has to do something with those messages.

◊aside{Imagine a ~no-reply~ email inbox, where every cry for help sent to it is routinely ignored. That is a version of an actor model that has an address, can receive messages, but can do nothing else.}

In addition to receiving messages, then, an actor can also send messages to another actor. On top of that, it can also create new actors as children of itself.
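
All three abilities - receiving, sending, and spawning children - fit in a few lines of Ractor code. A hypothetical toy, ahead of the real examples below:

```ruby
# A parent actor spawns a child actor, forwards a message to it, and
# relays the child's reply back out to the caller.
parent = Ractor.new do
  child = Ractor.new do
    Ractor.yield(Ractor.receive.to_s.reverse)
  end
  child.send(Ractor.receive) # pass the incoming message down
  Ractor.yield(child.take)   # pass the child's answer back up
end

parent.send("stressed")
answer = parent.take # => "desserts"
```

The string is copied at each hop rather than shared, which is exactly the isolation property discussed below.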



@@ 63,7 63,7 @@ First things first, quick recap on Actors:
  ◊li{An actor can mutate its own state but not another actor's state}
}

In order to guarantee thread-safety, some aspects of the language have had to change. Most objects in Ruby are unshareable by default, which is different to how a ~Thread~ behaves, and this means that code inside a ractor essentially cannot read /anything/ outside of its own scope, which includes global variables and constants.
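
That isolation is enforced up front. A small sketch (assuming Ruby 3.0+): a block handed to ~Ractor.new~ may not capture surrounding locals, and Ruby rejects it the moment the Ractor is created:

```ruby
# Referencing an outer local from inside a Ractor block is rejected
# at creation time, before the Ractor ever runs.
greeting = "hello"

raised =
  begin
    Ractor.new { puts greeting } # captures an outer variable
    nil
  rescue ArgumentError => e
    e # "can not isolate a Proc because it accesses outer variables"
  end
```

To get data into a Ractor you instead pass it explicitly, either as arguments to ~Ractor.new~ or via ~send~.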

Rather than rewording the Ruby manual on Ractors◊^[11], let's dig into a practical example and build a basic echo server over TCP.



@@ 89,7 89,7 @@ Rather than rewording the Ruby manual on Ractors◊^[11], let's dig into a pract

This example demonstrates how one Ractor can create more Ractors: whenever a new connection is established to the TCP server, a new Ractor is spawned and a TCP client is moved into it. This new Ractor listens on the connection and when input is received, it echoes it back but in uppercase.

Try it for yourself by running that code in an IRB console, and then open up ~telnet~ in another session.

◊script[#:id "asciicast-438705" #:src "https://asciinema.org/a/438705.js" #:async "true" #:data-cols "190"]{}


M posts/ruby-sorcery.org => posts/ruby-sorcery.org +5 -5
@@ 13,7 13,7 @@ The first part of this series of posts is all about /Pattern Matching/.

◊h2{Pattern matching}

Ruby's pattern matching support, introduced experimentally in 2.7, is a lot more powerful than you may expect. All you need is to replace ~when~ with ~in~ and your ~case~ statements become capable of matching against /anything/.

◊codeblock['ruby]{
  require 'base64'


@@ 72,7 72,7 @@ You can deeply match any object in Ruby so long as you define a method to repres
  end
}

This ~PlayingCard~ class is now capable of pattern matching.

◊codeblock['ruby]{
  def face_card?(playing_card)


@@ 133,7 133,7 @@ This particular solution depends on the hand being ordered, but that's fine, a l
  # => true
}

The clever bit here is that the first part of the match (~[1, c, s]~) is used to constrain the rest of the pattern. So if ~c~ is ~:red~, then ~^c~ also has to be ~:red~ in order to match.
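
The pin in action can be sketched with a tiny two-card example (hypothetical cards, separate from the full hand above):

```ruby
# The pin operator ^ reuses a value bound earlier in the same pattern:
# the second card only matches if its colour equals the first card's.
pair_of_aces =
  case [[1, :red, :hearts], [1, :red, :diamonds]]
  in [[1, c, _], [1, ^c, _]]
    true
  else
    false
  end
```

Swap the second card's colour to ~:black~ and the ~in~ clause no longer matches, falling through to the ~else~.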

◊h2{Pattern guards}



@@ 160,7 160,7 @@ Building on the poker example, maybe it's valid to play the Joker, but only if t
  # => true
}

◊h2{Destructuring assignment without ~case~}

One of the odd side-effects of this pattern matching functionality is that you get a new kind of assignment. In fact, in Ruby 3 this gets a syntax of its own with the rightward assignment operator, but you can still use something similar in 2.7.
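
As a quick sketch of the Ruby 3 form (hypothetical data, assuming Ruby 3.0+), rightward assignment destructures a value against a pattern in one statement:

```ruby
# The value on the left is matched against the pattern on the right;
# any names in the pattern become local variables on success.
user = { name: "Lee", langs: [:ruby, :racket] }
user => { name:, langs: [first_lang, *] }

name       # => "Lee"
first_lang # => :ruby
```

If the pattern doesn't match, ~NoMatchingPatternError~ is raised, so this suits cases where the shape of the data is already guaranteed.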



@@ 185,7 185,7 @@ You also have to be absolutely sure you're matching the right thing.

◊h2{Optimisations}

If you recall earlier examples, I defined ~destructure_keys(*)~, which meant that I was explicitly ignoring the arguments normally passed to the method. This is useful in simple cases, but when dealing with complex objects you might want to be a bit more thoughtful about how you return a value. For example, converting the entire structure of the object into a hash might not be appropriate.

◊codeblock['ruby]{
  # When used in pattern matching, this class will only destructure into the provided keys

M posts/to-simpler-times.org => posts/to-simpler-times.org +2 -2
@@ 61,11 61,11 @@ The site has moved once again, back to a VPS hosted somewhere in the UK. Caddy
  }
}

Deploying to this server is a case of firing off a couple of ~ssh~, ~scp~ or ~rsync~ requests using a separate user with its own SSH key, and as soon as the command is finished running the changes are visible online.◊^[5]

This leads me to the final bit. Modern tech feels more complicated as it tends towards distributed solutions: put thing /x/ here, deploy service /y/ there, sync them up with webhooks, and hope the network holds up to the task. Earlier tech feels more complicated because the documentation is intricate and detailed and requires some fidgeting around with.

It took me just about a day to figure out how to host my own ~apt~ repository for Debian◊^[6], compiling information from various manuals, blog posts and examples. It was mostly a case of creating a GPG key and setting up a correct directory structure for ~apt-ftparchive~◊^[7] to do its business, with a little bit of extra config. I'll go into detail about that another time, but let it be said it does the job tremendously in any Debian-based CI pipeline.

◊codeblock['bash]{
cd www.kamelasa.dev

M posts/using-ruby-c-in-ruby.org => posts/using-ruby-c-in-ruby.org +14 -14
@@ 10,7 10,7 @@ A thought occurred to me in my mask-wearing, lockdown-addled brain last night: w

One of those thoughts stuck out in particular, because of how ridiculous it sounded: could you optimise your Ruby code by using FFI with Ruby's C bindings? I'm not talking about making a native extension in pure C, I'm talking about making Ruby talk to itself through a foreign function interface using the ffi gem◊^[1].

Let's apply some method to this madness and set up some bindings, otherwise we're dead in the water. Let's be descriptive and call our FFI module ~LibRuby~. No naming conflicts at all there, /no sirree/!

◊codeblock['ruby]{
  require 'ffi'


@@ 31,17 31,17 @@ Let's apply some method to this madness and set up some bindings, otherwise we'r
  end
}

If you look at the code in this module, you'll notice that I used ~attach_variable~ to get access to the Kernel module, and ~attach_function~ for the method calls. The ~:id~ and ~:value~ types are just aliases for ~:pointer~, because ~VALUE~ and ~ID~ in the C API are themselves pointers. It's for the sake of documentation, so it's clearer what order you pass arguments in.

Ruby's built in modules and classes are already defined globally with a naming scheme. In this case, ~Kernel~ is a variable called ~rb_mKernel~, where ~rb~ is a prefix that every C function has in common (so you know it's for Ruby as C doesn't have namespaces), and the letter ~m~ means ~module~. If it was ~c~ instead it would mean ~class~.

Anyway this boilerplate should give us enough to do a hello world using Ruby's C API but at runtime, in Ruby, so it's time to fire up ~irb~.

◊aside{
It should go without saying that at this point, you're not just playing with fire, you're inviting it to burn down your house. Be careful lest the ~segfault~s creep up on you.
}

Let's take it from the top and talk through this ungodly incantation. Go ahead and copy that little module into your console! If it fails, make sure you've got the ~ffi~ gem installed◊^[2].

Once you're done, you can save some keystrokes by importing that module.



@@ 49,7 49,7 @@ Once you're done, you can save some keystrokes by importing that module.
  include LibRuby
}

In order to call ~puts~ in Ruby through the C API, we'll need to get a reference to the module it's defined in (~Kernel~), and also get the method name as a symbol (like you might normally do with ~.to_sym~).

◊codeblock['ruby]{
  kernel = LibRuby.rb_mKernel


@@ 62,35 62,35 @@ Oh, before we continue, better disable the garbage collector. This is a simple w
  GC.disable
}

We can't just pass in a normal string to ~puts~ without things going 💥, as everything is an object in Ruby and therefore we need to
get a pointer to a ~String~ instance (or in internal Ruby lingo, one of those ~VALUE~s).

◊codeblock['ruby]{
  str = rb_str_new_cstr('welcome, mortals')
}

Now we have all of the ingredients to make the actual call, which syntactically and aesthetically blows idiomatic Ruby out of the water. Delicately paste this into your console and you should see the string printed out. You'll also get a return value like ~#<FFI::Pointer address=0x0000000000000008>~, which will refer to ~Qnil~. ~Qnil~ is a pointer to Ruby's ~nil~ object.

◊codeblock['ruby]{
  rb_funcall(kernel, puts_method, 1, :value, str)
}

Run it again a few times, and with different strings. If you're feeling experimental, attach more functions in ~LibRuby~ and see what else you can print out! Ruby's extension documentation should be a good place to start◊^[3].

◊h3{So, why disable the GC?}

For every step in this post up to creating a ~String~ object, we've been using function bindings and global variables. Global variables and constants won't be garbage collected, because the global scope will always maintain a reference to them; besides which, it would be quite bad if your classes and modules suddenly disappeared after a GC pass.

The string object is different, however, as on the C side of things Ruby is taking a pointer to a C string (a ~const char *~), allocating memory, and giving back a pointer to the new object. Eventually the GC will run and free up the memory at the pointer's address, and the string will no longer exist. You'll probably find something else at that address instead, or just garbage.

Disabling the GC in this instance is a *shitty hack* because it's a direct admission that the code is /not memory safe/. Hopefully you didn't need me to tell you that, though, and the quality of the code in this post was self-evident.

How would you fix it? Well, now that we've found out we /can/ write Ruby with itself, we'll explore that next time. And there'll be benchmarks, too.

Until then, I'll see you further into the abyss.

◊footnotes{
  ◊^[1]{◊<>["https://github.com/ffi/ffi"]}
  ◊^[2]{~gem install ffi -- --enable-system-libffi~}
  ◊^[3]{◊<>["https://ruby-doc.org/core-2.7.0/doc/extension_rdoc.html"]}
}