Z. D. Smith, Brooklyn, NY, 2020.
Bagatto operates in two phases: the data phase and the site phase.
In the data phase, Bagatto will read a data specification and use it to create a table containing all the input data. Data specifications that consist of literal attributes will result in site data containing those same attributes. Specifications of a single file path will result in site data pertaining to the one file. Specifications of a file wildcard will result in an array of data objects, one for each file that matches the wildcard.
In the site phase, Bagatto will read the site specification and use it to generate a sequence of files. Each site specification will ultimately specify the path and contents of one or more files. Bagatto will then ensure the presence of each file path and ensure that its contents are as specified.
We can enter a REPL environment that allows us to explore the index module by using the --repl flag to bag. This enters a Janet REPL with three helper functions injected: eval-data, eval-site, and write-site. These represent the three main steps of executing Bagatto: generating site data, generating site "write specifications", and writing the specified files.
Here's a short example using the basic-site demo:
code-src/bagatto/demo/basic-site [master !] ⊕ bag --repl index.janet
repl:1:> (eval-data data)
Reading config data spec...
Reading pages data spec...
Reading css data spec...
Beginning 3 jobs...
Loaded config
Loading pages...
[pages] Loading 2 files
Loading css (styles.css)...
Finished jobs.
@{:config {:author "Z. D. Smith"
:description "A minimal website"
:title "Bagatto Demo"}
:pages @[@{:basename "about"
:path "pages/about.md"
:contents @"..."}
@{:basename "bagatto"
:path "pages/bagatto.md"
:contents @"..."}]
:css @{:path "styles.css"
:contents @"..."}}
The bag command accepts a single filename as an argument. This is known as the index module, and it should be syntactically correct Janet. One of the principles of Bagatto is to go as far as is practicable to make the operation of the Janet language inside the index module as similar as possible to any other use of the Janet interpreter or compiler.
Thus, there's only one real difference between programming inside an index module and writing a normal Janet module: Bagatto inserts a couple useful libraries into the namespace so that we, as the site authors, don't need to manage these libraries in order to use them inside the module.
They are:

- the bagatto library itself, a collection of useful functions designed to reduce boilerplate in Bagatto modules, whose API is listed below;
- the path library, which exposes functions for manipulating file paths;
- the janet-sh library, which exposes a useful DSL for shelling out to the command line.

One of Bagatto's principles is to expose as much of its API as possible in the form of ordinary functions to be used to produce the data structures you define in your index module. There are a couple of places where that isn't possible, and where we have to expose a "global" API instead.
In addition to the helper functions exposed in the bagatto/ namespace, there are a few features that can be accessed directly inside of index modules:
Bagatto exposes the bagatto/set-defaults! function, which can be called at any point inside an index module. It takes a single argument: a struct or dictionary specifying the default value for any of the specification attributes: :src, :attrs, :dest, :out.
By calling bagatto/set-output-dir!, you can specify the directory that Bagatto should write its generated file tree into. In principle this is exactly the same as appending that directory name to every path that you generate; however, if you use this feature then you can re-use paths in your business logic (for instance, you can define a path value and use it when generating a file, and when rendering a link in your site), as the additional file hierarchy will be transparently dealt with.
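Sketched together, the two global calls might sit at the top of an index module like this (the "site" directory name and the choice of default parser are just illustrations):

```janet
# Set a default :attrs parser for any data entry that doesn't specify one.
(bagatto/set-defaults! {:attrs bagatto/parse-base})

# Write the generated file tree under site/ rather than the current directory.
(bagatto/set-output-dir! "site")
```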
Your Bagatto module should expose a data specification like this:
(def data ... )
This value will be used as the starting point by the Bagatto application. Its job is to specify all the inputs that should go into the system.
The data value should be a struct where the keys are data specification names and the values are the specifications themselves. The data specification names are meaningful, as they are referred to by the site specifications, as we'll see.
The simplest form of a data specification is a literal struct or table, like this:
(def data {:config {:attrs {:title "A Demo Bagatto Config"}}})
When Bagatto creates the site data for this specification, it will consist of a single key-value pair:
repl:13:> (eval-data data)
@{:config {:title "A Demo Bagatto Config"}}
The next type of data specification is a reference to a single file in the project. These will consist of two attributes: :src, which specifies the location of the file with respect to the current working directory, and :attrs, which contains a function that will be called with the file contents, like this:
(def data {:config-json {:src "config.json"
                         :attrs bagatto/parse-json}})
(Theoretically, you could pass in a data literal as above in this case too, but in that case the file would be ignored and there wouldn't be much point.)
In this case, Bagatto will look for a file called config.json in the current directory, load its contents, and then call bagatto/parse-json on them. The resulting attributes will then be the content of the site data associated with :config-json.
repl:17:> (eval-data data)
@{:config-json @{"subtitle" "A Very Good Blog."
:path "config.json"
:contents @"{\"subtitle\":\"A Very Good Blog.\"}\n"}}
We see that the resulting site data has a single entry, :config-json. The table associated with this entry has the two attributes we get for free---:path and :contents, which are the file path and contents, respectively---but the call to parse-json has also resulted in the key/value pairs inside the JSON file being parsed and put into the site data.
The last way to specify data inputs is with wildcard references to multiple (potential) files. Under the hood, this relies on the glob function of janet-sh. There are two wildcard methods: bagatto/* and bagatto/slurp-*.
Use bagatto/* to provide all the filenames that match the wildcard.
repl:24:> (bagatto/eval-loader (bagatto/* "demo/static/*"))
@["demo/static/hello.png"]
Thus, we can use it as a data specification:
(def data {:static {:src (bagatto/* "demo/static/*")
:attrs bagatto/parse-base}})
repl:27:> (eval-data data)
@{:static @[@{:path "demo/static/hello.png"}]}
Since we specified the parse-base parser, and used the basic form bagatto/* (which only lists files), we get an array of tables with only the :path attribute.
This is the minimal case for listing files, but for files like the above, that only need to be copied into place, it's all we need.
bagatto/slurp-* has the same wildcard functionality, but it also includes the contents of the matching files. We can use this to process files in more interesting ways.
repl:28:> (bagatto/eval-loader (bagatto/slurp-* "demo/posts/*.md"))
@[("demo/posts/post.md" @"## A Post That You Might Be Interested In...") ...]
The return value of the loader for each matching file is a two-tuple of the file's path and contents. Notice that the contents of the markdown files include their own metadata in the form of YAML frontmatter.
We can define a data specification based on this loader. In this case we'll specify the multimarkdown parser as the attrs callback, which will be able to extract the YAML frontmatter as additional metadata.
(def data {:posts {:src (bagatto/slurp-* "demo/posts/*.md")
:attrs parse-mmarkdown}})
repl:33:> (eval-data data)
@{:posts @[@{:path "demo/posts/post.md" :contents @"..."}
@{"status" "post"
:path "demo/posts/post2.md"
:contents @"..."
"title" "A Really Good Title"}]}
Having evaluated the data specification, we can see that :posts is an array with one element for each file that matched the wildcard. Unlike with the single-file example above, parse-mmarkdown was then called for each post. I've collapsed the full contents so we can see that, since post2.md included a title: and status: attribute in its metadata, the parse-mmarkdown function has pulled that out and put it in the attributes for that post. post.md didn't have any frontmatter, so it has no additional attributes.
Since the wildcard loaders offer the ability to load multiple files, and the attrs callback operates on each file individually, Bagatto exposes one more element of the data specification: the transform callback. A transform, if specified, is called on the whole set of elements after each one has been parsed. This allows us to, for instance, sort a list of blog posts after they've been loaded and parsed.
(def data {:notes {:src (bagatto/slurp-* "notes/*.md")
:attrs parse-note
:transform (bagatto/attr-sorter "topic")}})
bagatto/attr-sorter is exposed as a part of the Bagatto library and allows us to specify a key present in all the items, and sort the collection by it.
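A transform is just a function of the whole collection, so we can also write our own. Here's a hypothetical sketch that puts the newest posts first, assuming each post's frontmatter provides a "date" attribute (parse-mmarkdown is the parser from the example above):

```janet
# Hypothetical transform: sort posts by their "date" attribute, newest first.
(defn newest-first [posts]
  (reverse (sort-by |(get $ "date") posts)))

(def data {:posts {:src (bagatto/slurp-* "posts/*.md")
                   :attrs parse-mmarkdown
                   :transform newest-first}})
```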
The second and last value that your index module should define is site:
(def site ...)
This is the site specification, which defines all the outputs of the system. Every site specification entry specifies either a single file, or a sequence of files, to be generated. To specify a file we must output the path it should be created at and the contents of the created file.
The structure of the site specification is quite similar to the data specification: it's an association between names and specification values. However, in this case the names don't have any effect on the generated site; they're just useful for the site author to organize their code.
The relationship between data and site is simple but important to understand. The site specification is evaluated in the context of the site data, which is the output of the data specification (it's what we see when we run eval-data above).
A site specification isn't actually a mapping of data entries to pages; in most websites of any size, any given page will require data from more than one input (for instance, to display a recent-posts sidebar on every page), and may well create more than one page out of the same input. Thus it's useful to understand the overall flow of data in the system: Bagatto uses the data specification to create the site data, and then iterates through the site specification, using the site data as the context, or evaluation environment, as it evaluates each entry in the specification. Each entry results in a sequence of one or more files to be written.
As noted above, every site specification specifies the path and contents for one or more files to be created. Therefore, perhaps the simplest possible site is one consisting of a static path and contents:
repl:2:> (def site {:_ {:dest "out.txt" :out "Welcome to my website"}})
{:_ {:dest "out.txt" :out "Welcome to my website"}}
We can use eval-site to get an output of the path and contents of each file to be created. This is useful for debugging; in reality, site contents are generated using fibers, so this lets us peek under the hood to understand what our site specifications produce.
Site specifications are evaluated in the context of site data, but in this case our only specification is completely static. Therefore we can pass in an empty struct as the site data.
repl:4:> (eval-site site {})
@[(:write "out.txt" "Welcome to my website")]
We can see that it plans to write a single file, with the specified path and contents.
More useful is to pass a function as the specification contents, rather than a static value. This allows us to dynamically act on the input data in useful ways.
A renderer function is simply any function which takes in the site data and outputs some file contents. We can write our own extremely simple one, which looks for a secret in the site data and outputs it in JDN format to a file. Then we can define a simple site specification with a static path that passes that function in directly as the :out attribute.
repl:6:> (defn renderer [data] (string/format "%j" (data :secret)))
<function renderer>
repl:7:> (def site {:_ {:dest "out.txt" :out renderer}})
{:_ {:dest "out.txt" :out <function renderer>}}
Now, of course, we need to ensure that :secret is present in the site data. While, in practice, we'd have a data entry that defined :secret, it's useful to note that for the purposes of inspecting our functions, we don't need to use the output of the eval-data command. We can construct a struct directly.
repl:8:> (eval-site site {:secret "p@ssw0rd"})
@[(:write "out.txt" "\"p@ssw0rd\"")]
The fact that the site data is a simple key-value structure, and the renderer output is just a string, makes it very simple to understand how data flows through the application and to extend it.
Perhaps a slightly more realistic example would be one that combines data from more than one source.
repl:12:> (defn renderer [data] (string/format "%s:%j:%f"
(get-in data [:config :prefix])
(get-in data [:personal :password])
(math/random)))
<function renderer>
repl:14:> (def site {:_ {:dest "out.txt" :out renderer}})
{:_ {:dest "out.txt" :out <function renderer>}}
repl:15:> (eval-site site {:personal {:password "p@ssw0rd"}
:config {:prefix "md5"}})
@[(:write "out.txt" "md5:\"p@ssw0rd\":0.487181")]
Here we see the very common case of combining data from multiple sources into a single file's contents.
We saw that, using data specifications, we can select source files to be included in our site data with wildcards. A very common example of this would be building a blog; in addition to all the static content and any config files, we'd want to load up all the blog posts in a directory, and to be able to add a new post simply by adding a file to the directory, without changing the config.
Thus, it will be very common that in addition to rendering pages based on static config data or files, we'll want to iterate through all the files that match a wildcard and render one output file for each (e.g., rendering a post.html for each source file).
For that we can use site selectors.
:each
Given some site data with a series of named data entries, we can use the each attribute to refer to one of those entries. Bagatto will then call the dest and out functions on each file in the entry.
Because there's now an additional piece of data in addition to the site data, the renderer and path generator functions in an each specification take two arguments: the site data data, and then the individual element item.
We can see an example. First let's define a data specification with a wildcard, so we have something to iterate over:
repl:55:> (def data {:users {:src (bagatto/slurp-* "users/*")
:attrs bagatto/parse-json}
:config {:attrs {:prefix "pw::"}}})
{:config {:attrs {:prefix "pw::"}} :users {:attrs <function parse-json> :src <fiber 0x55F89DCCE930>}}
The users directory will have two JSON files in it. Therefore, since we specify bagatto/parse-json as the parser for users, we can expect the users site data to contain an array of 2 tables that have been decoded from the JSON.
Next we'll define a renderer function. Like above, it will draw from multiple sources; but this time, it will take two arguments, because we intend it to be called on each element in users.
repl:34:> (defn renderer [data item] (string/format "%s%s"
(get-in data [:config :prefix])
(item "password")))
As before we expect data, but now we expect item as well. For each call, data will be the same site data, and item will be a different element.
Finally we will define a site specification that uses :each to refer to the users site data.
repl:35:> (def site {:_ {:each :users
:dest (bagatto/path-copier "passwords/")
:out renderer}})
{:_ {:dest <function 0x55F89DCBC880> :out <function renderer> :each :users}}
:each :users will cause Bagatto to call the renderer once for each item in :users. In addition, we now need to specify an actual function for :dest. If we left it as a static value, the contents would be repeatedly written to the same file, which is obviously not what we want. Here we use the bagatto/path-copier helper, which gives us a function that will accept any file and return a new path with the base we specify.
We can evaluate the data spec, and use that to evaluate the site spec:
repl:56:> (eval-site site (eval-data data))
@[(:write "passwords/alice.json" "pw::1234")
(:write "passwords/bob.json" "pw::snoopy")]
It's produced two write plans, one for each user file, whose contents are interpolated from the contents of their respective source files.
A very common operation when generating a website is to copy a source file without touching it. If Bagatto receives a site specification with a site selector and a :dest entry, but no :out entry, it will interpret that as a copy operation. It will read the :path of whatever item or items it receives (this attribute is always present), and copy it to the :dest attribute of the site specification.
Here's a super simple data spec:
repl:62:> (def data {:users {:src (bagatto/* "users/*") :attrs bagatto/parse-base}})
{:users {:attrs <function parse-base> :src <fiber 0x55F89DD25A00>}}
We use bagatto/* instead of bagatto/slurp-*, which just lists the files, but doesn't read them. We also use bagatto/parse-base as our parser, which just returns the base :path attribute.
We can now define a site that simply refers to :users and specifies a path without specifying contents.
repl:58:> (def site {:_ {:each :users :dest (bagatto/path-copier "passwords/")}})
{:_ {:dest <function 0x55F89DCD8B30> :each :users}}
Evaluating the site produces two copy instructions to the new paths:
repl:63:> (eval-site site (eval-data data))
@[(:copy "users/alice.json" "passwords/alice.json")
(:copy "users/bob.json" "passwords/bob.json")]
Of course, most websites are not made by string/format-ing HTML together; they use HTML templates. The template system used by Bagatto is Temple. Temple is a wonderfully powerful and simple templating system that should be very enjoyable to use.
Here's the contents of post.temple in the Bagatto demo directory:
{$ (import ./base_top) $}
{% (base_top/render-dict args) %}
<h1>{{ (get-in args [:_item :title]) }}</h1>
<p class="post-info">
{{ (get-in args [:_item :date]) }}
</p>
{- (bagatto/mmarkdown->html (get-in args [:_item :contents])) -}
{$ (import ./base_bottom :as base_bottom) $}
{% (base_bottom/render-dict args) %}
The appeal of Temple is in its simplicity. It consists of four types of expression, all of which are seen here.
- {$ ... $}: Evaluate the expression between the $s at compile time;
- {% ... %}: Evaluate the expression between the %s at runtime, escape and interpolate the output;
- {- ... -}: Evaluate the expression between the -s at runtime, interpolate the output without escaping it;
- {{ ... }}: Evaluate and interpolate the expression inside the curly braces.

While many other templating languages differentiate between capturing and non-capturing by differentiating between their escape brace types (which means having to change brace types from line to line, even within the same syntactic expression), Temple is non-capturing by default, and we interpolate into the surrounding template by printing to stdout. In other words, to interpolate something into a Temple template, simply use print:
Welcome to my web page. Here's a pretty-printed example
of one of my favorite data structures:
{% (print (string/format "%q" {:name "Bowler Cat"
:species "Felis Domesticus"})) %}
Ain't she a beaut?
We can think of {{ foo }} as syntactic sugar for {% (print foo) %}.
Temple templates accept a single dictionary of arguments, which is bound inside the template to args.
Bagatto adds a very thin layer of functionality and convenience on top of Temple. The first thing it does is extend the Temple environment with the same libraries that are listed at the beginning of this manual. Thus we can call bagatto/ helper functions from within a template.
The only other change it makes is to ensure the presence, if applicable, of the item passed in as the second argument to site spec functions, which contains the attributes of the individual element of an :each selection. Those attributes are made available at (args :_item).
For instance, in the example above, we expect the attributes of the specific blog post being rendered to be present in the :_item value, and so we refer to it to get the title, date and contents of the post.
The basic call to render a template is bagatto/render. This allows us to directly invoke a template by name, with site data and an optional item, and returns the fully rendered template. For instance, if we have a simple template at templates/simple.temple:
I am known for my {{ (args "topic") }} skills.
Then we can render out page contents like so:
repl:5:> (bagatto/render "templates/simple" {"topic" "Web Design"})
@"I am known for my Web Design skills.\n"
In a proper web page, of course, our template file would contain HTML with placeholders for the values to be interpolated.
Because bagatto/render is such a common operation, Bagatto offers a convenience function that will generate a renderer that will make the above call. For instance, if I wanted to specify the above template in a site specification, I'd probably write this:
repl:6:> (def site {:_ {:dest "out.txt"
:out (bagatto/renderer "templates/simple")}})
{:_ {:dest "out.txt" :out <function 0x55DAC976C660>}}
Thus I avoid having to write a new renderer function for each :out entry, if I'm just going to pass on the data to a specific template. Evaluating the site we get the same thing:
repl:7:> (eval-site site {"topic" "Web Design"})
@[(:write "out.txt" @"I am known for my Web Design skills.\n")]
A site spec with an :each can include a :filter attribute, too. This can be any predicate function which takes the site data and an individual item from the spec's site selector, and returns true or false. If the return value is false, the site spec will skip that element.
This can be very useful when handling an input of mixed files. For instance, with a static/ directory that contains both CSS and supplementary HTML files, we might want to have different render steps for each. We could then write two site specs that both take that data entry in their :each, but have different :filter attributes (we could also have written two different wildcards in two different data specs, but hopefully you get my point).
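Such a pair of specs might be sketched like this (the file-extension predicates and output directories here are our own illustrations, not part of Bagatto):

```janet
# One data entry for the whole mixed directory.
(def data {:static {:src (bagatto/* "static/*")
                    :attrs bagatto/parse-base}})

# Hypothetical predicates: route items by file extension.
(defn css? [_data item] (string/has-suffix? ".css" (item :path)))
(defn html? [_data item] (string/has-suffix? ".html" (item :path)))

# Two site specs over the same :each, distinguished only by :filter.
(def site {:css {:each :static
                 :filter css?
                 :dest (bagatto/path-copier "out/css/")}
           :html {:each :static
                  :filter html?
                  :dest (bagatto/path-copier "out/")}})
```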
Bagatto bills itself as a "transparent" static site generator. By this we mean: we should favor first-class functions over configuration, and native terms and data structures over indirect control flow whenever possible.
Here's a simple example: Bagatto creates files by combining a file path with some file contents. The values that can go in the :out section of a site specification can either be strings, or functions which produce strings.
We might be tempted as application authors to introduce a layer of abstraction in front of the render process and ask the user to specify the name of a render function built into Bagatto. This would provide a simple, convenient DSL. Unfortunately, it has the side effect of effectively walling off that function from a site author. If---when---the author needs to understand what specifically is being passed into the render function, or needs to tweak its output slightly, they're out of luck. The logic that reads this name, translates it into a render function, calls the function with some inputs and uses the output is all stuck within the belly of Bagatto, and the author might need to recompile the whole application to get into it.
Similarly, if they want to introduce a new renderer---a new template language for instance---they can only do so by introducing the function directly into Bagatto, giving it a name, and then passing the name in a site specification.
Therefore we keep the operation of the renderer within inspection of the author. By specifying a literal function, we can easily wrap other functions and debug their output or change it. Similarly, we do attempt to offer an author the same level of convenience as the above DSL; but instead of offering them the ability to name a function that we control, we offer them the ability to call a function that outputs the renderer function itself, so that they still have access to its inputs and outputs.
Thus, we have a pretty straightforward way to write our own loaders, attributes, path-generator and renderer functions.
Each of the below entries will have a typespec describing the signature of the functions that can be implemented. This isn't meaningful Janet, but hopefully gives a succinct picture of the types that will be meaningful.
(let [element (or 'source-path '(source-path file-contents))]
  (defn loader []
    (or '{:each (element ...)} '{:some element})))
The :src attribute in a data spec can take a 0-arity function which, when called, returns one of two types of values:
{:each values}

values is any indexable data structure, the elements of which are either two-tuples or single values. Two-tuples will be treated by the base attribute parser as [source-path file-contents]. Single values will be treated as source-path only.
{:some value}

value is a single instance of the above value type: either a two-tuple or a single path value.
We could, for instance, write a custom loader function that accepted a URL, made a web request, and returned a (file-url file-contents) tuple.
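Such a loader might be sketched with janet-sh, which Bagatto already injects into the module namespace; the use of curl (and the URL itself) are assumptions for illustration only:

```janet
(import sh)

# Hypothetical loader: fetch a single remote file by shelling out to curl.
# It returns a 0-arity function, as :src requires, which yields a
# {:some [source-path file-contents]} value.
(defn url-loader [url]
  (fn []
    {:some [url (sh/$< curl -s ,url)]}))

(def data {:remote {:src (url-loader "https://example.com/config.json")
                    :attrs bagatto/parse-json}})
```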
(defn parser [contents attrs] attrs)
:attrs can take any parser function. The purpose of a parser is to transform the individual outputs of a data loader into an attributes table. There are two attributes that are guaranteed to be present when the parser is called, :path and :contents. A parser function shouldn't remove either of these attributes, but can use them to generate new ones. For instance, if contents is unparsed Markdown with YAML frontmatter, then a parser function could extract metadata from the frontmatter and return an updated attributes table with that metadata.
contents and the :contents attribute can be expected to be identical; the former is provided as a convenience.
An example of a custom parser would be one that shelled out to Asciidoctor to extract attributes from an asciidoc document.
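As a simpler hypothetical sketch, here is a parser that follows the signature above, preserves the free attributes, and merely adds a word count:

```janet
# Hypothetical parser: keep :path and :contents, add a :word-count key.
(defn parse-word-count [contents attrs]
  (merge attrs {:word-count (length (string/split " " contents))}))
```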
(defn each-dest [data item] path)
(defn some-dest [data] path)
In the site specification, :dest can take any function which returns a file path string. If the spec has an :each, the generator function should take the site data and the individual item as arguments, and return the destination path for the individual item.
Otherwise, it should take the site data as a single argument and return the destination path for its entry output.
(defn each-renderer [data item] file-contents)
(defn some-renderer [data] file-contents)
:out takes any renderer function---these work along exactly the same lines as path generators. If the site spec has an :each, the function should take two arguments; otherwise it should take one. The return value of the function will be written directly to the file path in its site spec.
Following from the parser example above, an example custom renderer could take an asciidoc document and shell out to Asciidoctor to render it into HTML.
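Such a renderer might be sketched with janet-sh (assuming asciidoctor is installed; the flags here are an illustration, not a tested invocation):

```janet
(import sh)

# Hypothetical renderer: render the item's source file to HTML on stdout.
# :path is always present in an item's attributes, so we pass it to the tool.
(defn asciidoc-renderer [data item]
  (sh/$< asciidoctor -o - ,(item :path)))
```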