@@ 1,6 1,6 @@
title: Scheme Static Site Generators Review
-date: 2023-05-10 12:00
-tags: architecture, tech
+date: 2023-05-23 12:00
+tags: architecture, tech, scheme
abstract: An overview of Scheme ecosystem in the field of static site generators (SSGs), review of the Haunt SSG architecture and possible ways to improve it and Guile ecosystem.
---
@@ 248,16 248,19 @@ Providing default values for them is convinient, but making them to be
fields of `site` records incorporates unecessary assumptions about
blog nature of the site, which can negatively impact the rest of the
implementation by adding unwanted coupling and reducing composability.
+One option to avoid this is to make them values in
+default-metadata rather than fields of the record.
-TODO: Write about possible alternative for site fields.
-#### Builders
-Builders are functions, which accept `site` and `posts` and returns a
-list of artifacts. Artifacts are records, which have
-`artifact-writer` field, containing a closure writing actual output
-file. There are a number of different builders provided out of the
-box, but the most basic one (static-page) is missing, luckily it's not
-hard to implement it, so let's do it.
+#### Builders, Themes and Readers
+Builders are functions that accept `site` and `posts`, apply a series
+of transformations and return a list of artifacts. Themes and
+Readers are basically transformations used somewhere in the build
+process. Artifacts are records with an `artifact-writer` field,
+which contains a closure that writes the actual output file. A number
+of different builders are provided out of the box, but the most basic
+one (static-page) is missing. Luckily, it's not hard to implement, so
+let's do it.
```scheme
(define* (page-theme #:key (footer %default-footer))
@@ 300,68 303,186 @@ transmorations happens here:
- `serialized-artifact` creates a closure, which wraps `sxml->html`
and will later serialize obtained SXML for the page to HTML.
-### Readers
-There is a concept of readers, small functions
-
-
-### Guile-Commonmark
-It used in haunt by default to parse markdown files in SXML, it
-doesn't support embeded html, tables, footnotes and comments, so it
-can be quite inconvinient for many use cases.
-
-TODO: It was something else important that was missing in
-guile-commonmark?
-
-### Mix of imlicit and explicit things
+The implementation using the already existing API is quite easy, but
+unfortunately not perfect. While functions and records are composable
+enough to produce the desired results, the names are quite confusing:
+they are tightly tied to blogs and don't make much sense in the
+context of other site types.
-### Metadata
-Accepts only one-line metadata. Doesn't accept files without metadata.
-Metadata is not a part of the html grammar -> post is not a valid html.
-
-register-metadata-parser! is a reimplementation of multimethods.
+Every builder always accepts a list of posts, which were read and
+transformed into SXML beforehand. This is implicit and again blog
+related, which makes the implementation less generic. Reading could be
+implemented in the `blog` builder, but then other builders like
+atom-feed wouldn't be able to reuse the posts read by the `blog`
+builder and would need to read them again. This is due to the fact
+that the build process has 3 primary steps and looks like this:
+```scheme
+;; 1. Prepare site and posts
-### Site Alist
+;; 2. Build artifacts
+(builder1 site posts) ;; => artifacts-1
+(builder2 site posts) ;; => artifacts-2
+(builder3 site posts) ;; => artifacts-3
+;; 3. Produce actual site:
+(serialize-artifacts
+ (append artifacts-1 artifacts-2 artifacts-3))
```
-`((posts-directory . "pages/posts")
- (build-directory . "target/site"))
-```
-
-### Theme
-Layout for posts and collections is the same. the same layout is
-coupled to both of them.
-### TODOs
-Linking to md files are not converted to apropriate urls, do we want
-to implement it and if want then how?
+This makes the build process rigid and makes it harder to compose
+procedures. An alternative, more streamlined process could look like
+this:
-Org-roam workflow
+```scheme
+(define readers (list ...))
+;; threading macro passes the result of the form
+;; as a first argument to the next form
+(->
+ (make-site ...) ;;=> ((site . <site-record>))
+ (read-posts
+ "posts/" readers) ;;=> ((posts . <list-of-posts>) (site . <site-record>))
+ (static-page "index.md" "index.html") ;;=> ((artifacts <index-artifact>) ...)
+ (blog-posts theme) ;;=> ((artifacts <post1-artifact> <index-artifact>) ...)
+ (collection "main") ;;=> ((artifacts <coll-artifact> <post1-artifact> ...) ...)
+ (atom) ;; takes value from posts and appends a few more artifacts
+ (atom-by-tags)
+ (serialize-artifacts!))
+```
-### Workflows
-- ox-haunt
-- md
-- citations
-- one file multiple post
+It is just a series of transformations that enrich one associative
+data structure. Moreover, it makes the implementation of such
+transformations much more composable:
-### Build/Deploy
-Rebuild and redeploy do it for the whole site every time.
+```scheme
+(define (read-posts o dir readers)
+  (let* (;; (dir (site-posts-dir (assoc-ref o 'site))) ; could be
+ (posts (map (read-with-readers readers) (files-in dir))))
+ (alist-update o 'posts (lambda (x) (append x posts)))))
+(define* (static-page o file destination
+ #:key
+ (reader commonmark-reader)
+ (page-layout default-page-layout))
+  (let* ((sxml-body (get-sxml (reader file)))
+ (sxml-page (page-layout sxml-body))
+ (page (serialized-artifact destination sxml-page sxml->html)))
+ (alist-update o 'artifacts (lambda (x) (append x (list page))))))
+
+(define* (blog-posts o destination-dir
+ #:key
+ (page-layout default-page-layout)
+                     (post-layout post-layout))
+  "Only the first post is handled here, to demonstrate the idea of
+reusability more clearly."
+  (let* ((post (first (assoc-ref o 'posts)))
+         (destination (string-append destination-dir (post-file post)))
+         (sxml-content (get-sxml post))
+         (sxml-body (post-layout sxml-content))
+ (sxml-page (page-layout sxml-body))
+ (page (serialized-artifact destination sxml-page sxml->html)))
+ (alist-update o 'artifacts (lambda (x) (append x (list page))))))
+
+(define* (collection o name
+ #:key
+ (filter-function identity)
+ (collection-generator default-collection-generator)
+ (page-layout default-page-layout))
+ (let* ((posts (filter-function (assoc-ref o 'posts)))
+ (file (string-append name ".html"))
+ (sxml-body (collection-generator posts))
+ (sxml-page (page-layout sxml-body))
+ (collection (serialized-artifact file sxml-page sxml->html)))
+ (alist-update o 'artifacts (lambda (x) (append x (list collection))))))
+```
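+
+The snippets above rely on two helpers that are not part of Guile or
+Haunt: the `->` threading macro and `alist-update`. A minimal sketch
+of both could look like this, assuming that the state is a plain alist
+and that a missing key is treated as an empty list:
+
+```scheme
+(use-modules (srfi srfi-1)) ; alist-delete
+
+;; Hypothetical helper, not provided by Guile or Haunt.
+(define* (alist-update alist key proc #:optional (default '()))
+  "Return a new alist in which the value of KEY is (PROC old-value).
+DEFAULT is passed to PROC when KEY is absent."
+  (let* ((pair (assq key alist))
+         (old (if pair (cdr pair) default)))
+    (acons key (proc old) (alist-delete key alist eq?))))
+
+;; Thread-first macro: the result of each form becomes the first
+;; argument of the next one.
+(define-syntax ->
+  (syntax-rules ()
+    ((_ x) x)
+    ((_ x (f args ...) rest ...) (-> (f x args ...) rest ...))
+    ((_ x f rest ...) (-> (f x) rest ...))))
+
+(-> '()
+    (alist-update 'artifacts (lambda (x) (append x '(index))))
+    (alist-update 'artifacts (lambda (x) (append x '(post1)))))
+;; => ((artifacts index post1))
+```
+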
-TODO:
+The naming of intermediate transformations is much more suitable (no
+notion of a post in the static-page builder). The transformations are
+more atomic and easier to reuse (page-layout and similar), and there
+is no need to combine them into records like `theme`. It's easier to
+restructure complex transformations: for example, a collection can be
+a part of the `blog` builder or a separate step as in the example
+above. There is no need to special-case the read and serialize steps:
+the read step can skip posts flagged as drafts or have some other
+advanced logic. It's also possible to build a page that relies on the
+content of previous steps, for example a collection of generated
+rss/atom links.
+
+However, such an implementation has its own flaws. More flexibility
+and a less rigid structure can lead to more user mistakes and a
+steeper learning curve. The original implementation could
+theoretically run builders in parallel, but here one would need to
+implement that on the user or builder side.
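+
+As a rough illustration of that parallelism, independent builders of
+the form `(builder site posts)` could be run with `par-map` from
+`(ice-9 threads)`. The `run-builders-in-parallel` name and the toy
+builders are made up for this sketch; this is not how Haunt builds
+sites:
+
+```scheme
+(use-modules (ice-9 threads)  ; par-map
+             (srfi srfi-1))   ; concatenate
+
+;; Hypothetical helper: each builder returns a list of artifacts
+;; and must not depend on the output of another builder.
+(define (run-builders-in-parallel builders site posts)
+  (concatenate
+   (par-map (lambda (builder) (builder site posts)) builders)))
+
+;; Toy builders that return symbols instead of real artifacts.
+(define (builder1 site posts) '(index))
+(define (builder2 site posts) (map (lambda (p) (list 'page p)) posts))
+
+(run-builders-in-parallel (list builder1 builder2) 'site '(a b))
+;; => (index (page a) (page b))
+```
+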
-How to build page with links to rss?
-How to build a collection with different template?
+### Readers
+As a component of the build process we encountered a step where a
+file written in a markup language is read by readers. It has two
+parts: reading metadata and reading the actual content. Let's cover
+the implementation details of both.
+
+#### Metadata
+As shown in the example code snippet in the section related to
+transformation, one can provide additional metadata in a simple
+key-value format delimited by `---` from the content of the markup
+file. There are two main issues with the implementation, let's
+discuss them.
+
+The metadata is required for built-in readers and even if one doesn't
+want to set any values, they have to add `---` at the beginning of the
+file. This requirement is unnecessary and could be easily avoided.
+
+The metadata reader accepts only simple `:` delimited key-value
+pairs. It may not be as flexible as YAML frontmatter. Metadata in
+such a format is usually not a part of the markup grammar, which means
+the files are written in invalid markup. However, it's not a big
+deal, as readers can use custom metadata parsers.
+
+#### Guile-Commonmark and Tree-Sitter
+Guile-Commonmark is used in Haunt by default to parse markdown files
+into SXML. It doesn't support embedded HTML, tables, footnotes and
+comments, so it can be quite inconvenient for many use cases. It
+somehow works and serves basic needs, and more advanced use cases can
+potentially be implemented with more feature-rich libraries like a
+hypothetical `guile-ts-markdown`
+([tree-sitter](https://tree-sitter.github.io/) based markdown parser).
## Conclusion
-The conclusion here
+Haunt is the primary player in the field of Scheme static site
+generators at the moment of writing. It gives all the basics to get
+up and running. The number of available learning resources in the
+wild is much smaller than for similar solutions from other language
+ecosystems, but the provided documentation and source code are enough
+for a seasoned schemer to start with it and, more importantly, to
+learn everything about it in a matter of hours, which is not possible
+for projects like `hugo` or `jekyll`.
+
+The functionality can be lacking in some cases, but due to the
+hackable nature of the project it's possible to gradually build upon
+the basics and add all the things needed. Unfortunately, the current
+state of the Scheme ecosystem, and Guile in particular, feels behind
+more mainstream languages, but luckily the popularity of Guile reached
+the critical level and the ecosystem will start growing in the nearest
+future.
### Future Work
-Possible future steps are improving SXML/HTML ecosystem in Guile,
-producing tree-sitter based parsers for various formats
+There are a number of possible improvements for Haunt in particular
+and Guile and Scheme in general. One is more complete tooling for
+working with markup languages: org, md, html, yaml, etc. As a generic
+solution, tree-sitter seems a good candidate to quickly cover this
+huge area.
+
+A more streamlined and composable build process for Haunt, described
+in the Builders section, could be a good thing as well, making the SSG
+more flexible and its components more reusable.
+
+Possible integrations with other tools like Guix, REPL, Emacs for
+easier deployment, better caching, more interactive development and
+other goodies.
+More documentation, materials and tools for possible workflows and use
+cases, from citation capabilities and automatic URL resolution to
+one-huge-file workflows and org-roam integration.
-**Aknowledgments.** Thank you to [David
+**Acknowledgments.** Kudos to [David
Thompson](https://dthompson.us/about.html) for making Haunt.