# The Bagatto Manual

Z. D. Smith, Brooklyn, NY, 2020.

# Usage

Bagatto operates in two phases: the **data** phase and the **site** phase.

In the data phase, Bagatto will read a data specification and use it to create a table containing all the input data. Data specifications that consist of literal attributes will result in site data containing those same attributes. Specifications of a single file path will result in site data pertaining to the one file. Specifications of a file wildcard will result in an array of data objects, one for each file that matches the wildcard.

In the site phase, Bagatto will read the site specification and use it to generate a sequence of files. Each site specification will ultimately specify the **path** and **contents** of one or more files. Bagatto will then ensure the presence of each file path and ensure that its contents are as specified.

## REPL Mode

We can enter a REPL environment that allows us to explore the index module by using the `--repl` flag to `bag`. This enters a Janet REPL with three helper functions injected: `eval-data`, `eval-site`, and `write-site`. These represent the three main steps of executing Bagatto: generating site data, generating site "write specifications", and writing the specified files.

Here's a short example using the `basic-site` demo:

```clj
code-src/bagatto/demo/basic-site [master !] ⊕ bag --repl index.janet
repl:1:> (eval-data data)
Reading config data spec...
Reading pages data spec...
Reading css data spec...
Beginning 3 jobs...
Loaded config
Loading pages...
[pages] Loading 2 files
Loading css (styles.css)...
Finished jobs.
@{:config {:author "Z. D. Smith" :description "A minimal website" :title "Bagatto Demo"}
  :pages @[@{:basename "about" :path "pages/about.md" :contents @"..."}
           @{:basename "bagatto" :path "pages/bagatto.md" :contents @"..."}]
  :css @{:path "styles.css" :contents @"..."}}
```

## The Bagatto API

The `bag` command accepts a single filename as an argument.
This is known as the **index module**, and it should be syntactically correct Janet. One of the principles of Bagatto is to go as far as is practicable to make the operation of the Janet language inside the index module as similar as possible to any other use of the Janet interpreter or compiler. Thus, there's only one real difference between programming inside an index module and writing a normal Janet module: Bagatto inserts a couple of useful libraries into the namespace so that we, as the site authors, don't need to manage these libraries in order to use them inside the module. They are:

- The [`bagatto`](./api.html#bagatto-api) library itself, a collection of useful functions designed to reduce boilerplate in Bagatto modules, whose API is listed below;
- The [`path`](https://github.com/janet-lang/path) library, which exposes functions for manipulating file paths;
- The [`janet-sh`](https://github.com/andrewchambers/janet-sh) library, which exposes a useful DSL for shelling out to the command line.

One of Bagatto's principles is to expose as much of its API as possible in the form of ordinary functions to be used to produce the data structures you define in your index module. There are a couple of places where that isn't possible, and where we have to expose a "global" API instead.

In addition to the helper functions exposed in the `bagatto/` namespace, there are a few features that can be accessed directly inside of index modules:

### Defaults Handling

Bagatto exposes the `bagatto/set-defaults!` function, which can be called at any point inside an index module. It takes a single argument: a struct or dictionary specifying the default value for any of the specification attributes: `:src`, `:attrs`, `:dest`, `:out`.

### Output Directory

By calling `bagatto/set-output-dir!`, you can specify the directory that Bagatto should write its generated file tree into.
In principle, this is the same as prepending that directory name to every path that you generate; however, if you use this feature then you can re-use paths in your business logic (for instance, you can define a path value and use it both when generating a file and when rendering a link in your site), as the additional file hierarchy will be transparently dealt with.

## Data

Your Bagatto module should expose a **data specification** like this:

```clj
(def data ... )
```

This value will be used as the starting point by the Bagatto application. Its job is to specify all the inputs that should go into the system.

The `data` value should be a struct where the keys are **data specification names** and the values are the specifications themselves. The data specification names are meaningful, as they are referred to by the site specifications, as we'll see.

### Data Literals

The simplest form of a data specification is a literal struct or table, like this:

```clj
(def data {:config {:attrs {:title "A Demo Bagatto Config"}}})
```

When Bagatto creates the *site data* for this specification, it will consist of a single key-value pair:

```clj
repl:13:> (eval-data data)
@{:config {:title "A Demo Bagatto Config"}}
```

### File References

The next type of data specification is a reference to a single file in the project. These will consist of two attributes: `:src`, which specifies the location of the file with respect to the current working directory, and `:attrs`, which contains a function that will be called with the file contents, like this:

```clj
(def data {:config-json {:src "config.json"
                         :attrs bagatto/parse-json}})
```

(Theoretically, you could pass in a data literal as above in this case too, but in that case the file would be ignored and there wouldn't be much point.)

In this case, Bagatto will look for a file called `config.json` in the current directory, load its contents, and then call `bagatto/parse-json` on them.
The resulting attributes will then be the content of the site data associated with `:config-json`.

```clj
repl:17:> (eval-data data)
@{:config-json @{"subtitle" "A Very Good Blog."
                 :path "config.json"
                 :contents @"{\"subtitle\":\"A Very Good Blog.\"}\n"}}
```

We see that the resulting site data has a single entry, `:config-json`. The table associated with this entry has the two attributes we get for free---`:path` and `:contents`, which are the file path and contents, respectively---and the call to `parse-json` has also parsed the key/value pairs inside the JSON file and put them in the site data.

### File Wildcards

The last way to specify data inputs is with wildcard references to multiple (potential) files. Under the hood, this relies on the `glob` function of [janet-sh][sh]. There are two wildcard methods: `bagatto/*` and `bagatto/slurp-*`.

[sh]: https://github.com/andrewchambers/janet-sh

#### List matching files

Use `bagatto/*` to provide all the filenames that match the wildcard. `bagatto/*` will return a new function, which will run the file wildcard when evaluated, so we can simply call the resulting value in the REPL to see it at work:

```clj
repl:24:> ((bagatto/* "demo/static/*"))
{:each @["demo/static/hello.png"]}
```

The output is an `:each` struct, which lets Bagatto know that when used as a data source, it should iterate over the contents and create a new output for each one.

We can use it as a data specification:

```clj
(def data {:static {:src (bagatto/* "demo/static/*")
                    :attrs bagatto/parse-base}})
```

```clj
repl:27:> (eval-data data)
@{:static @[@{:path "demo/static/hello.png"}]}
```

Since we specified the `parse-base` parser, and used the basic form `bagatto/*` (which only lists files), we get an array of tables with the `:path` attribute only. This is the minimal case for listing files, but for files like the above, that only need to be copied into place, it's all we need.
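To make the shape of a listing loader concrete, here is a minimal plain-Janet sketch of a function that behaves like the one described above: calling it returns a 0-arity function, and calling that returns an `:each` struct of paths. This is not how `bagatto/*` is implemented. The name `suffix-loader` and its suffix-matching rule are hypothetical, and it uses `os/dir` from Janet's core instead of a real glob.

```clj
# Hypothetical stand-in for a listing loader; NOT Bagatto's implementation.
# Returns a 0-arity function, so the directory is only read when the
# loader is actually called.
(defn suffix-loader [dir suffix]
  (fn []
    {:each (seq [entry :in (sorted (os/dir dir))
                 :when (string/has-suffix? suffix entry)]
             (string dir "/" entry))}))

# Calling the outer function builds the loader without touching the disk.
(def list-pngs (suffix-loader "demo/static" ".png"))
```

Calling `(list-pngs)` in a directory layout like the demo's would then produce an `:each` struct of matching paths, which is the same kind of value shown in the REPL session above.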
#### Slurp matching files

`bagatto/slurp-*` has the same wildcard functionality, but it also includes the contents of the matching files. We can use this to process files in more interesting ways.

```clj
repl:28:> ((bagatto/slurp-* "demo/posts/*.md"))
{:each @[("demo/posts/post.md" @"{:title \"Post 1\"}\n%%%\n## A Post That You Might Be Interested In...") ...]}
```

Each output is a two-tuple of the file's path and contents.

In this example, the post markdown files are formatted with [Mago][], an extremely simple way to add frontmatter metadata to any text file.

[Mago]: https://git.sr.ht/~subsetpark/mago

We can define a data specification based off of this loader. In this case we'll specify the Mago parser as the `attrs` callback. That will be able to extract the Janet frontmatter as additional metadata.

```clj
(def data {:posts {:src (bagatto/slurp-* "demo/posts/*.md")
                   :attrs parse-mago}})
```

```clj
repl:33:> (eval-data data)
@{:posts @[@{:path "demo/posts/post.md" :title "Post 1" :body @"..." :contents @"..."}
           @{:path "..." ...}
           ...]}
```

Having evaluated the data specification, we can see that `:posts` is an array with one element for each file that matched the wildcard. Unlike with the single-file example above, `parse-mago` was then called for *each* post.

#### Transforms

Since the wildcard loaders offer the ability to load multiple files, and the `attrs` callback operates on each file individually, Bagatto exposes one more element of the data specification: the `transform` callback. A transform, if specified, is called on the whole set of elements after each one has been parsed. This allows us to, for instance, sort a list of blog posts after they've been loaded and parsed.
```clj
(def data {:notes {:src (bagatto/slurp-* "notes/*.md")
                   :attrs parse-note
                   :transform (bagatto/attr-sorter "topic")}})
```

[`bagatto/attr-sorter`](./api.html#bagattoattr-sorter) is exposed as a part of the Bagatto library and allows us to specify a key present in all the items, and sort the collection by it.

## Site

The second and last value that your index module should define is `site`:

```clj
(def site ...)
```

This is the **site specification**, which defines all the outputs of the system. Every site specification entry specifies either a single file, or a sequence of files, to be generated. To specify a file we must output the **path** it should be created at and the **contents** of the created file.

The structure of the site specification is quite similar to the data specification: it's an association between names and specification values. However, in this case the names don't have any effect on the generated site; they're just useful for the site author to organize their code.

The relationship between `data` and `site` is simple but important to understand. The site specification is evaluated in the *context* of the site data, which is the output of the data specification (it's what we see when we run `eval-data` above). A site specification isn't actually a mapping of data entries to pages; in most websites of any size, any given page will require data from more than one input (for instance, to display a *recent posts* sidebar on every page), and a single input may well be used to create more than one page.

Thus it's useful to understand the overall flow of data in the system: Bagatto uses the data specification to create the site data, and then iterates through the site specification using the site data as the context, or evaluation environment, as it evaluates each entry in the specification. Each entry results in a sequence of one or more files to be written.
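The flow described above can be sketched as a toy model in plain Janet. This is only an illustration of the idea that each site entry is evaluated against the same site data and yields a write instruction; it is not Bagatto's implementation, it ignores site selectors entirely, and the names `realize` and `toy-eval-site` are hypothetical.

```clj
# Toy model of the site phase; NOT Bagatto's implementation.

# A spec value is either a literal or a function of the site data.
(defn realize [value site-data]
  (if (function? value) (value site-data) value))

# Evaluate every entry of `site` in the context of `site-data`,
# producing one (:write path contents) tuple per entry.
(defn toy-eval-site [site site-data]
  (seq [[_name spec] :pairs site]
    [:write
     (realize (spec :dest) site-data)
     (realize (spec :out) site-data)]))

# Site data as it might come out of the data phase.
(def site-data {:config {:title "Demo"}})

# One entry with a literal path and a function for its contents.
(def site
  {:index {:dest "index.txt"
           :out (fn [data] (string "Title: " (get-in data [:config :title])))}})

(pp (toy-eval-site site site-data))
```

Note that the entry name `:index` plays no part in the result, which matches the point above that site specification names exist only to help the author organize their code.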
### Path and Contents Literals

As noted above, every site specification specifies the path and contents for one or more files to be created. Therefore, perhaps the simplest possible site is one consisting of a static path and contents:

```clj
repl:2:> (def site {:_ {:dest "out.txt" :out "Welcome to my website"}})
```

We can use `eval-site` to get an output of the path and contents of each file to be created. Site specifications are evaluated in the context of site data, but in this case our only specification is completely static. Therefore we can pass in an empty struct as the site data.

```clj
repl:4:> (eval-site site {})
@[(:write "out.txt" "Welcome to my website")]
```

We can see that it plans to write a single file, with the specified path and contents.

### Renderer functions

More useful is to pass a function as the specification contents, rather than a static value. This allows us to dynamically act on the input data in useful ways.

A renderer function is simply any function which takes in the site data and outputs some file contents. We can write our own extremely simple one, which looks for a secret in the site data and outputs it in JDN format to a file. Then we can define a simple site specification with a static path that passes that function in directly as the `:out` attribute.

```clj
repl:6:> (defn renderer [data] (string/format "%j" (data :secret)))
repl:7:> (def site {:_ {:dest "out.txt" :out renderer}})
{:_ {:dest "out.txt" :out <function renderer>}}
```

Now, of course, we need to ensure that `:secret` is present in the site data. While, in practice, we'd have a data entry that defined `:secret`, it's useful to note that for the purposes of inspecting our functions, we don't need to use the output of the `eval-data` command. We can construct a struct directly.

```clj
repl:8:> (eval-site site {:secret "p@ssw0rd"})
# ...Some output...
@[(:write "out.txt" "\"p@ssw0rd\"")]
```

The fact that the site data is a simple key-value structure, and the renderer output is just a string, makes it very simple to understand how data flows through the application and to extend it.

Perhaps a slightly more realistic example would be one that combines data from more than one source.

```clj
repl:12:> (defn renderer [data]
            (string/format "%s:%j:%f"
                           (get-in data [:config :prefix])
                           (get-in data [:personal :password])
                           (math/random)))
repl:14:> (def site {:_ {:dest "out.txt" :out renderer}})
{:_ {:dest "out.txt" :out <function renderer>}}
repl:15:> (eval-site site {:personal {:password "p@ssw0rd"} :config {:prefix "md5"}})
@[(:write "out.txt" "md5:\"p@ssw0rd\":0.487181")]
```

Here we see the very common case of combining data from multiple sources into a single file's contents.

### Site Selectors

We saw that, using data specifications, we can select *source files* to be included in our site data with wildcards. A very common example of this would be building a blog; in addition to all the static content and any config files, we'd want to load up all the blog posts in a directory, and to be able to add a new post simply by adding a file to the directory, without changing the config.

Thus, it will be very common that in addition to rendering pages based on static config data or files, we'll want to iterate through all the files that match a wildcard and render one output file for each (e.g., rendering a `post.html` for each source file). For that we can use **site selectors**.

#### `:each`

Given some site data with a series of named data entries, we can use the `each` attribute to refer to one of those entries. Bagatto will then call the `dest` and `out` functions on each file in the entry.

Because there's now an additional piece of data in addition to the site data, the renderer and path generator functions in an `each` specification take two arguments: the site data `data`, and then the individual element `item`.

We can see an example.
First let's define a data specification with a wildcard, so we have something to iterate over:

```clj
repl:55:> (def data {:users {:src (bagatto/slurp-* "users/*")
                             :attrs bagatto/parse-json}
                     :config {:attrs {:prefix "pw::"}}})
```

The `users` directory will have two JSON files in it. Therefore, since we specify `bagatto/parse-json` as the parser for `users`, we can expect the `users` site data to contain an array of 2 tables that have been decoded from the JSON.

Next we'll define a renderer function. Like above, it will draw from multiple sources; but this time, it will take two arguments, because we intend it to be called on each element in `users`.

```clj
repl:34:> (defn renderer [data item]
            (string/format "%s%s"
                           (get-in data [:config :prefix])
                           (item "password")))
```

As before we expect `data`, but now we expect `item` as well. For each call `data` will be the same site data, and `item` will be a different element.

Finally we will define a site specification that uses `:each` to refer to the `users` site data.

```clj
repl:35:> (def site {:_ {:each :users
                         :dest (bagatto/path-copier "passwords/")
                         :out renderer}})
```

`:each :users` will cause Bagatto to call the renderer once for each item in `:users`. In addition, we now need to specify an actual function for `:dest`. If we left it as a static value, the contents would be repeatedly written to the same file, which is obviously not what we want. Here we use the [`bagatto/path-copier`](./api.html#bagattopath-copier) helper, which gives us a function that will accept any file and return a new path with the base we specify.

We can evaluate the data spec, and use that to evaluate the site spec:

```clj
repl:56:> (eval-site site (eval-data data))
@[(:write "passwords/alice.json" "pw::1234")
  (:write "passwords/bob.json" "pw::snoopy")]
```

It's produced two write plans, one for each user file, whose contents are interpolated from the contents of their respective source files.
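Since `:dest` in an `:each` specification can be any function of the site data and the item, we could also write a path generator by hand instead of using a helper. The sketch below is one hypothetical example in plain Janet: `html-path` is an invented name, and the rule it applies (keep the file name, swap the extension for `.html`, place the result under `passwords/`) is an assumption chosen for illustration, not something Bagatto provides.

```clj
# Hypothetical path generator for an :each spec: (data item) -> path string.
(defn html-path [_data item]
  # Keep only the final path segment, e.g. "users/alice.json" -> "alice.json".
  (def base (last (string/split "/" (item :path))))
  # Index of the last "." in the file name, or nil if there is none.
  (def dot (last (string/find-all "." base)))
  # Drop the extension when there is one (a leading dot is not an extension).
  (def stem (if (and dot (> dot 0)) (string/slice base 0 dot) base))
  (string "passwords/" stem ".html"))

(print (html-path {} {:path "users/alice.json"}))
```

It relies only on the `:path` attribute, which every item carries, so the same function could be passed as `:dest` for any wildcard-backed data entry.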
#### Copying Files

A very common operation when generating a website is to copy a source file without touching it. If Bagatto receives a site specification with a site selector and a `:dest` entry, but no `:out` entry, it will interpret that as a **copy** operation. It will read the `:path` of whatever item or items it receives (this attribute is always present), and copy it to the `:dest` attribute of the site specification.

Here's a super simple data spec:

```clj
repl:62:> (def data {:users {:src (bagatto/* "users/*")
                             :attrs bagatto/parse-base}})
```

We use `bagatto/*`, which just lists the files but doesn't read them, instead of `bagatto/slurp-*`. We also use `bagatto/parse-base` as our parser, which just returns the base `:path` attribute.

We can now define a `site` that simply refers to `:users` and specifies a path without specifying contents.

```clj
repl:58:> (def site {:_ {:each :users
                         :dest (bagatto/path-copier "passwords/")}})
```

Evaluating the site produces two copy instructions to the new paths:

```clj
repl:63:> (eval-site site (eval-data data))
@[(:copy "users/alice.json" "passwords/alice.json")
  (:copy "users/bob.json" "passwords/bob.json")]
```

## Templates

Of course, most websites are not made by `string/format`ing HTML together; they use HTML templates. The template system used by Bagatto is [Temple](https://git.sr.ht/~bakpakin/temple). Temple is a wonderfully powerful and simple templating system that should be very enjoyable to use.

### Temple Basics

Here are the contents of a simple blog post template:

```html
{:} {% (bagatto/include "/templates/base_top") %}

{{ (get-in args [:config :title]) }}

{{ (get-in args [:_item :title]) }}

{- (bagatto/markdown->html (get-in args [:_item :contents])) -}
{% (bagatto/include "/templates/base_bottom") %}
```

The appeal of Temple is in its simplicity. It consists of four types of expression:

- `{$ ... $}`: Evaluate the expression between the `$`s at compile time;
- `{% ... %}`: Evaluate the expression between the `%`s at runtime, escape and interpolate the output;
- `{- ... -}`: Evaluate the expression between the `-`s at runtime, interpolate the output without escaping it.
- `{{ ... }}`: Evaluate and interpolate the expression inside the curly braces.

While many other templating languages differentiate between capturing and non-capturing by differentiating between their escape brace types (which means having to change brace types from line to line, even within the same syntactic expression), Temple is non-capturing by default, and we interpolate into the surrounding template by printing to stdout. In other words, to interpolate something into a Temple template, simply use `print`:

```html
Welcome to my web page.

Here's a pretty-printed example of one of my favorite data structures:

{% (print (string/format "%q" {:name "Bowler Cat" :species "Felis Domesticus"})) %}

Ain't she a beaut?
```

We can think of `{{ foo }}` as syntactic sugar for `{% (print foo) %}`.

Temple templates accept a single dictionary of arguments, which is bound inside the template to `args`.

### Using Temple in Bagatto

Bagatto adds a very thin layer of functionality and convenience on top of Temple. The first thing it does is it extends the Temple environment with the same libraries that are listed at the beginning of this manual. Thus we can call `bagatto/` helper functions from within a template.

The only other change it makes is to ensure the presence, if applicable, of the `item` passed in as the second argument to site spec functions, which contains the attributes of the individual element of an `:each` selection. Those attributes are made available at `(args :_item)`.
For instance, in the example above, we expect the attributes of the specific blog post being rendered to be present in the `:_item` value, and so we refer to it to get the title, date and contents of the post.

#### Special attributes

In addition to ensuring the presence of the special `:_item` attribute in `args`, Bagatto ensures the presence of a special attribute, `:_dest`. If `:_item` is present, then `(get-in args [:_item :_dest])` will return the fully rendered path string that the current `item` will be rendered to. If `:_item` isn't present (i.e., the renderer function passed to `:out` only has access to `args`), then `:_dest` will be present in `args` itself.

### Rendering a Template

The basic call to render a template is [`bagatto/render`](./api.html#bagattorender). This allows us to directly invoke a template by name, with site data and an optional item, and returns the fully rendered template.

For instance, if we have a simple template at `templates/simple.temple`:

```
I am known for my {{ (args "topic") }} skills.
```

Then we can render out page contents like so:

```
repl:5:> (bagatto/render "/templates/simple" {"topic" "Web Design"})
@"I am known for my Web Design skills.\n"
```

In a proper web page, of course, our template file would contain HTML with placeholders for the values to be interpolated.

### Renderer Generators

Because `bagatto/render` is such a common operation, Bagatto offers a [convenience function](./api.html#bagattorenderer) that will generate a **renderer** that will make the above call. For instance, if I wanted to specify the above template in a site specification, I'd probably write this:

```clj
repl:6:> (def site {:_ {:dest "out.txt"
                        :out (bagatto/renderer "/templates/simple")}})
```

Thus I avoid having to write a new renderer function for each `:out` entry, if I'm just going to pass on the data to a specific template.
Evaluating the site we get the same thing:

```clj
repl:7:> (eval-site site {"topic" "Web Design"})
@[(:write "out.txt" @"I am known for my Web Design skills.\n")]
```

## Filters

A site spec with an `:each` can include a `:filter` attribute, too. This can be any predicate function which takes the site data and an individual item from the spec's site selector, and returns true or false. If the return value is false, the site spec will skip that element.

This can be very useful when handling an input of mixed files. For instance, with a `static/` directory that contains both CSS and supplementary HTML files, we might want to have different render steps for each. We could then write two site specs that both take that data entry in their `:each`, but have different `:filter` attributes (we could also have written two different wildcards in two different data specs, but hopefully you get my point).

# Extending Bagatto

Bagatto bills itself as a "transparent" static site generator. By this we mean: we should favor *first-class functions* over *configuration*, and *native terms and data structures* over *indirect control flow* whenever possible.

Here's a simple example: Bagatto creates files by combining a *file path* with some *file contents*. The values that can go in the `:out` section of a site specification can either be strings, or functions which produce strings.

We might be tempted as application authors to introduce a layer of abstraction in front of the render process and ask the user to specify the *name* of a render function built into Bagatto. This would provide a simple, convenient DSL. Unfortunately, it has the side effect of effectively walling off that function from a site author. If---when---the author needs to understand what specifically is being passed into the render function, or needs to tweak its output slightly, they're out of luck.
The logic that reads this name, translates it into a render function, calls the function with some inputs and uses the output is all stuck within the belly of Bagatto and the author might need to recompile the whole application to get into it. Similarly, if they want to introduce a new renderer---a new template language for instance---they can only do so by introducing the function directly into Bagatto, giving it a name, and then passing the name in a site specification.

Therefore we keep the operation of the renderer open to inspection by the author. By specifying a literal function, we can easily wrap other functions and debug their output or change it. Similarly, we do attempt to offer an author the same level of convenience as the above DSL; but instead of offering them the ability to name a function that we control, we offer them the ability to call a function that outputs the renderer function itself, so that they still have access to its inputs and outputs.

Thus, we have a pretty straightforward way to write our own loaders, attributes, path-generator and renderer functions.

Each of the below entries will have a typespec describing the signature of the functions that can be implemented. This isn't meaningful Janet, but hopefully gives a succinct picture of the types that will be meaningful.

## Loaders

```
(let [element (or 'source-path '(source-path file-contents))]
  (defn loader []
    (or '{:each (element ...)}
        '{:some element})))
```

The `:src` attribute in a data spec can take a 0-arity function which, when called, returns one of two types of values:

### `{:each values}`

`values` is any indexable data structure, the elements of which are either two-tuples or single values. Two-tuples will be treated by the base attribute parser as `[source-path file-contents]`. Single values will be treated as `source-path` only.

### `{:some value}`

`value` is a single instance of the above value type: either a two-tuple or a single path value.
We could, for instance, write a custom loader function that accepted a URL, made a web request, and returned a `(file-url file-contents)` tuple.

## Parsers

```
(defn parser [contents attrs] attrs)
```

`:attrs` can take any parser function. The purpose of a parser is to transform the individual outputs of a data loader into an attributes table.

There are two attributes that are guaranteed to be present when the parser is called, `:path` and `:contents`. A parser function shouldn't remove either of these attributes, but can use them to generate new ones. For instance, if `contents` is unparsed Markdown with YAML frontmatter, then a parser function could extract metadata from the frontmatter and return an updated attributes table with those arbitrary metadata. `contents` and the `:contents` attribute can be expected to be identical, and the former is provided as a convenience.

An example of a custom parser would be one that shelled out to [Asciidoctor](https://asciidoctor.org/) to extract attributes from an asciidoc document.

## Path Generators

```
(defn each-parser [data item] path)
(defn some-parser [data] path)
```

In the site specification, `:dest` can take any function which returns a file path string. If the spec has an `:each`, the generator function should take the site data and the individual item as arguments, and return the destination path for the individual item. Otherwise, it should take the site data as a single argument and return the destination path for its entry output.

## Renderers

```
(defn each-renderer [data item] file-contents)
(defn some-renderer [data] file-contents)
```

`:out` takes any renderer function---these work along exactly the same lines as path generators. If the site spec has an `:each`, the function should take two arguments, otherwise it should take one. The return value of the function will be written directly to the file path in its site spec.
Following from the parser example above, an example custom renderer could take an asciidoc document and shell out to Asciidoctor to render it into HTML.
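
Because renderers are ordinary functions, wrapping one to inspect its output takes only a few lines. The following is a minimal sketch in plain Janet, assuming nothing beyond the renderer signatures given above: `traced` and `password-renderer` are hypothetical names invented for this illustration and are not part of the Bagatto library.

```clj
# Hypothetical helper: wrap any renderer so that each call is logged.
# Works for both one-argument and two-argument (:each) renderers.
(defn traced [label renderer]
  (fn [& args]
    (def out (renderer ;args))
    # Log to stderr so the rendered contents themselves are untouched.
    (eprintf "[%s] produced %d bytes" label (length out))
    out))

# An :each-style renderer like the one used earlier in this manual.
(defn password-renderer [data item]
  (string (get-in data [:config :prefix]) (item "password")))

# The wrapped function can be used anywhere the original could.
(def debug-renderer (traced "passwords" password-renderer))
```

In a site specification we could then write `:out debug-renderer` in place of `:out password-renderer`, and remove the wrapper again once we've seen what we needed to see.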