~subsetpark/subsetpark

fa1be551287aad1e1aeeb5602c90de2a1e010b07 — Zach Smith 5 months ago 320c89d
Add zettelkasten post
2 files changed, 249 insertions(+), 1 deletions(-)

A posts/zettelkasten.md
M temple/post.temple
A posts/zettelkasten.md => posts/zettelkasten.md +248 -0
@@ 0,0 1,248 @@
{:title "A Specification of a note-taking program"
 :date "2022-02-19"
 :status :post}

%%%

I recently watched Conrad Barski's [video introducing zek][zek], a program he
wrote to take Zettelkasten-style notes on the command line.

[zek]: https://www.youtube.com/watch?v=4W_m176PIdU

This is the perfect spur to a programming project: another, already perfectly
fine, program that does almost what you want but not exactly. In my case, I was
intrigued by the idea of a Zettelkasten-style note-taking app on the command
line, but the whole line-editor approach seemed to me to be a bridge too far.
So of course I started thinking about how I'd write my *own*.

I came up with an idea that I like, and that is extremely simple, deferring as
much as possible to my text editor, and could conceivably be written as a
single shell script calling widely-available tools.

I thought about it during a couple showers, and then I wrote a [Pantagruel][]
spec for what I came up with. I reproduce that here.

[Pantagruel]: https://pantagruel-language.com

<div style="text-align: center; margin: 2em">&#128607;</div>

module ZK.

> Notes on a note-taking system, inspired by Zettelkasten.

Line ⇐ String.  
Note ⇐ [Line].

> The basic operation we're describing is *indexing*: taking all the notes in
> our set and updating them with references to other notes. That's all this
> system does; the notes themselves can be written and read with a standard
> text editor, and most text editors have a "go to file" procedure, so even
> the navigation from note to note can be done within the editor.

> The only thing it needs help with is finding the connections between notes.

index notes: [Note].

---

> Indexing does two things:

> 1. It updates each note, in place. All backlinks to each note are placed at
> the beginning of it, and all references contained within it are placed at
> the end.

all n: notes ⇒  
&nbsp;&nbsp;&nbsp;nʹ = backlinks n + body n + references n  
&nbsp;&nbsp;&nbsp;and name nʹ = name n.

> 2. It maintains an Index. Indices aren't notes - they aren't affected by
> indexing, in particular. They are simply a list of links to each note in the
> set, ordered by creation time.

some1 i: Index ⇒  
&nbsp;&nbsp;&nbsp;all n: notesʹ ⇒ ref-note n in i  
&nbsp;&nbsp;&nbsp;and all (n, m): notesʹ ⇒  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;created-at n < created-at m → i (ref-note n) < i (ref-note m).

;

Index ⇐ [Line].

> A Reference is a line of text that's read and written by the system. Both
> backlinks and (forward) references are simply collections of References.

Reference ⇐ Line.  
backlinks n: Note ⇒ {Reference}.  
references n: Note ⇒ [Reference].

name n: Note ⇒ String.  
body n: Note ⇒ [Line].  
created-at n: Note ⇒ Date.

ref s: String ⇒ Reference.  
ref-note n: Note ⇒ Reference.

---

> To refer to a given name, we simply insert a line with a recognized string
> in front of it, followed by the name. For instance: `%ref:scifi\ authors`.
> In particular, it's important to escape any delimiters in the name; this is
> to allow the "go to file under cursor" command to recognize the whole name
> rather than the first part, in the case of names with spaces.

ref s = `%ref:` + escape s + `\n`.

> It follows then that a reference to a note is simply a reference to its
> filename.

ref-note n = ref (name n).

body n = [all line: n, ~(line in Reference) ⇒ line].

> As mentioned above, to *index* a note is to take the note body, prepend all
> of the backlinks to that note, and append all the references to other notes
> (or other potential notes) contained in the body. A backlink is a reference
> to any note that links to this one; a (forward) reference is a reference to
> any note which is linked to by this one.

backlinks n = {all m: Note, (bracketed (name n)) in m ⇒ ref-note m}.  
references n = [all s: String, (bracketed s) in n ⇒ ref s].

> It's worth noting that references can be ordered as they appear in the note
> body; but backlinks have no defined order.

all (r, s): references n ⇒  
&nbsp;&nbsp;&nbsp;n (bracketed (name r)) < n (bracketed (name s)) → references r < references s.

;

> To link to any name, simply enclose it in square brackets in the body of a note.

escape s: String ⇒ String.  
bracketed s: String ⇒ String.

---

bracketed s = `[` + s + `]`.

<div style="text-align: center; margin: 2em">&#128607;</div>

I invite you to read the above; hopefully you like the idea of such a program.
I intend to write it. Hopefully, in fact, you could write it too; hopefully the
spec is descriptive enough that we could write two programs which did the same
thing.

There's English prose in it, of course, and no doubt you'd refer to that when
you wrote it. And it doesn't constitute the entire documentation for such a
program; there's no description of how the thing is intended to be used. And
there's certainly no description of the rationale for the program or its
values.

---

The rationale is this: I very much like the core idea of the note system, as
it's presented in `zek`: a flat directory of text files, linked to other ones
(a Wiki, so far), and comprehensively and automatically *backlinked* as well
(not so much a Wiki); and a time-ordered index to bind them all together.

But in my mind, vim is already a wonderful tool for writing text. There's
absolutely no need to reimplement a text editor if text files are your medium
of choice. And with `gf`, it's not a half-bad one for navigating the links
either. 

What's left is actually quite small. You simply need a program that will scan
all the text files in a directory, find all the [square brackets references],
create a quasi-organized list of filenames corresponding to those square
brackets as metadata in the files themselves, and create a quasi-organized list
of all the filenames containing references to *this* file in them too. You can
do all that with awk. The distinctive `%ref:` syntax is just to make it easy to
filter those lines out when you generate a new generation of the file.

---

Nevertheless, I do think there's a lot of worth in actually writing a
specification of some sort, as I've done above. 

One reason is: there's an entire largely-unpracticed art of *describing*
programs rather than *writing* them. The most successful example that comes to
mind is [`redo`](http://cr.yp.to/redo.html), as conceived by the brilliant D.
J. Bernstein. `redo` is a program that Bernstein didn't (or we might say,
didn't bother) writing; he merely described a hypothetical program with enough
clarity that there are [at least](https://github.com/apenwarr/redo)
[two](https://github.com/leahneukirchen/redo-c) existing implementations of the
system, written by others. I like this idea; it would tickle me pink to see
others write the program that I've described and compare it to the program in
my head.

Another, though, is that I maintain there's an enormous worth in thinking
critically and deeply about a program *before* one writes it. There is an
extent to which idle shower thoughts fulfill this role. But there are other,
deeper ways to think about parts of a program like its core primitives, its
vocabulary, its invariants, its edge cases, its sequence of operations, and so
on, and we rarely as a field avail ourselves of them.

There's the whole subfield of [formal methods][fm], of course, of course, but I
think you'll agree that the above document is business casual methods at best.
This is Formal Methods Ultralite, and I think it has some worth; in fact I
think part of its worth is how lightweight it is. I banged that out in an hour.

[fm]: https://www.hillelwayne.com/

---

And yet: I understand the program more deeply now than when I wrote it. 

The first step is, add the English last. The English is for the reader, but
it's best written only after one is confident that one fully understands what
they're about to describe. Otherwise, natural language has a tendency to infect
thought; because we have a name or a phrase to encompass something, we think we
understand it; but natural language is full of ambiguities, pronouns without
clear referents, double meanings, unclear phrasing, and more.

Indeed, the style that Pantagruel encourages is a naturally explanatory one. I
start by describing, quasi-formally, the behaviour of the program in a
"sentence" or two. In doing so, two things happen:

1. The formality begins to work its magic. Things that are easy to express in
   English are harder to express in the sentence structures of first-order
   logic, and that's often because the English thoughts refer to things without
   introducing them, and the like. So I have to introduce things and then I
   realize I haven't thought through all of their contours.

2. I introduce some new terminology. To explain my program at a high level, I
   necessarily refer to certain other qualities or concepts. When I run `pant`,
   I get this error message:

   ```
   error: Unglossed symbols:

   name, references, created_at, body, backlinks, ref_note, Index
   ```

   So, this is the next layer down. These are the concepts I must elaborate on;
   left to my own devices, I might have remembered to explain some of them, but
   probably not all. 

So I introduce a new chapter and continue the process. I recurse, until I hit
the leaves. That is: I run `pant` and I see no error message. There are no
terms left to explain. I'm sure I don't need to mention that it is through the
act of explanation that we stumble on every notion we don't actually
understand, even though we think we did, and are forced to clarify it for
ourselves, and in doing so are often shown exactly which two concepts are
subtly incompatible, which definition is used differently in different places,
and so forth.

---

All of this without writing any code. All, in fact, without even having decided
what language I'll use to write it. I say this *not* in the sense of "it's just
that easy!" I am not a believer in the no-code revolution. I like code. Indeed,
that's the problem: code is powerful and seductive and has its own gravity.
Putting ideas into code makes them *real* and gives them an inertia, even if
they're the wrong ideas. Thinking about things beforehand is a valuable
practice. It's valuable because it can be a much higher-leverage way to figure
them out than to write a whole program until you realize you didn't understand
the assignment; it's also valuable because they *remain ideas*. And in a
language like Pantagruel, they're short little sentences, little bits of
pseudo-math. They're easy to edit. They're easy to type, to write on a
whiteboard or notebook, and easy to erase and change.

M temple/post.temple => temple/post.temple +1 -1
@@ 11,7 11,7 @@
      <p class="post-info">
        {{ (get-in args [:_item :date]) }}
      </p>
      {- (-> (get-in args [:_item :body]) (bagatto/markdown->html :exts [:table])) -}
      {- (-> (get-in args [:_item :body]) (bagatto/markdown->html :opts [:unsafe] :exts [:table])) -}
    </div>
    <div class="col-md-3">
      <h3>Recent Posts</h3>