blep
Project licensed under the CNPLv7+ terms.
A parser for the GLS format, a format for writing Gay Little Stories.
I'm making this document format to have a saner base on which I can easily write stories.
Its aim is not to fully cover what a "serious" / "professional" / whatever format would cover, but to provide me with the base tools I need for myself.
The document file extension is expected to be .gls by default (gls standing for gay little story uwu).
A story document is comprised of a metadata block and a content block. All elements cited here are required for the document to be valid. The compiler doesn't need (and should probably not need) to handle invalid documents.
METADATA
CONTENT
The document always ends with a line return as last character.
The base format for this will then be <METADATA>\n\n<CONTENT>\n.
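To make the base format concrete, here is a minimal Python sketch of splitting a document into its two blocks. The function name `split_document` is hypothetical, and raising `ValueError` on invalid input is my assumption (the compiler isn't required to handle invalid documents at all).

```python
def split_document(source: str) -> tuple[str, str]:
    """Split a GLS document into its (metadata, content) blocks."""
    if not source.endswith("\n"):
        raise ValueError("a GLS document must end with a line return")
    # Drop the final line return, then cut at the first empty line:
    # everything before it is metadata, everything after it is content.
    metadata, separator, content = source[:-1].partition("\n\n")
    if not separator:
        raise ValueError("missing empty line between metadata and content")
    return metadata, content
```

Cutting at the first empty line only means that later empty lines (paragraph separators) stay inside the content block.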
The metadata block must hold all important data related to the document. The following data is made available.
Document type (required): series or oneshot.
Publication date (required): draft if not published yet; otherwise a publication date of format YYYY-MM-DD, with an optional suffix of .X where X is a number of at least 1 digit character. This suffix may be used to publish multiple documents on the same day (e.g. 2023-03-02.1, 2023-03-02.2, etc.). For ordering, it is to be interpreted as a numerical value.
For oneshots, the following metadata is made available: story title.
For series, the following metadata is made available: story title, chapter number, chapter title.
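The publication date rules can be sketched as below, in Python. This is only an illustration: `parse_pubdate` is a hypothetical name, returning `None` for drafts is my choice, and sorting a date without suffix before `.1` (by treating the missing suffix as 0) is an assumption the format doesn't state.

```python
import re
from datetime import date

# YYYY-MM-DD with an optional ".X" suffix, X being one or more digits.
PUBDATE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})(?:\.([0-9]+))?")


def parse_pubdate(value: str):
    """Return None for a draft, else a sortable (date, suffix) tuple."""
    if value == "draft":
        return None
    match = PUBDATE.fullmatch(value)
    if match is None:
        raise ValueError(f"invalid publication date: {value!r}")
    year, month, day, suffix = match.groups()
    # Assumption: a missing suffix counts as 0, so it sorts before ".1".
    # The suffix is compared as a number, so ".10" comes after ".2".
    return date(int(year), int(month), int(day)), int(suffix or 0)
```

For example, `sorted(["2023-03-02.10", "2023-03-02.2", "2023-03-02"], key=parse_pubdate)` orders the suffixes numerically rather than as text.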
The formatting is defined as follows.
<DOCUMENT TYPE> <PUBLICATION DATE>
<STORY METADATA>
<EXTRA METADATA>
cw: <list of cws>
The format for the cw list is "keywords separated by ,".
prompt: prompt text
The formatting for oneshots is defined as follows.
<STORY TITLE>
The formatting for series is defined as follows.
<STORY TITLE>
<CHAPTER NUMBER>: <CHAPTER TITLE>
An example for a story document using all described parts here is shown below.
series 2022-11-09
A pretty lengthy story title
2: A new chapter
prompt: You're trying to get something on the page, anything honestly
cw: test story, lots of gayness
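As an illustration of the metadata formatting, here is a rough Python sketch that reads a metadata block into a dictionary. The name `parse_metadata` and the dictionary keys are hypothetical, and trimming the whitespace around each cw keyword is my assumption (the example above writes ", " between keywords).

```python
def parse_metadata(block: str) -> dict:
    """Parse a GLS metadata block (without the trailing empty line)."""
    lines = block.split("\n")
    doc_type, _, pubdate = lines[0].partition(" ")
    if doc_type not in ("series", "oneshot"):
        raise ValueError(f"unknown document type: {doc_type!r}")
    meta = {"type": doc_type, "published": pubdate, "title": lines[1]}
    rest = lines[2:]
    if doc_type == "series":
        # Series carry one more line: "<CHAPTER NUMBER>: <CHAPTER TITLE>".
        number, _, chapter_title = rest[0].partition(": ")
        meta["chapter"] = {"number": int(number), "title": chapter_title}
        rest = rest[1:]
    for line in rest:
        key, _, value = line.partition(": ")
        if key == "cw":
            # Assumption: spaces around each keyword are not significant.
            meta["cw"] = [item.strip() for item in value.split(",")]
        elif key == "prompt":
            meta["prompt"] = value
        else:
            raise ValueError(f"unknown extra metadata: {line!r}")
    return meta
```

Since the story title is taken by position (second line), it can contain ": " without being mistaken for extra metadata.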
There are three distinct groups of "rules" (ie lines).
Since I want the document to be pretty easy to parse on a per-line basis (ie to be able to identify what a line is right from its first character), comment and meta rules have their own prefix.
// (two slashes, then a space) is a comment
# is a meta rule
--- is a time skip rule
[, followed by a person identifier, then closed by ], is a dialogue line rule
A person identifier is a single character used to identify a person (a speaker).
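The prefixes above are enough to identify a line from its first characters. A small Python sketch of that idea follows; `classify_line` and the returned labels are hypothetical names, not part of the format.

```python
def classify_line(line: str) -> str:
    """Identify what kind of rule a content line is, from its prefix."""
    if line == "":
        return "empty"      # paragraph separator
    if line.startswith("// "):
        return "comment"
    if line.startswith("#"):
        return "meta"
    if line.startswith("---"):
        return "timeskip"
    # "[", one person identifier character, then "]".
    if len(line) >= 3 and line[0] == "[" and line[2] == "]":
        return "dialogue"
    return "text"
```

Anything that matches no prefix falls through to a plain text rule, which is what keeps the per-line parsing simple.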
A special case is the empty line / paragraph separator logic.
If an empty line appears, it is counted as a paragraph separator. In other words, the previous paragraph (text rule set) is closed. A new paragraph (text rule set) is then only opened when a text line is found; any subsequent empty line or non-text rule line will not open a new paragraph.
A paragraph is automatically closed at end of document.
This is a test --beginning of paragraph
Another word
--empty line, end of paragraph
// This is a comment -- not a text rule, not opening a paragraph
This is another test --first text rule since end of paragraph, a new one is opened
[E] I am a dialogue line --dialogue line rule, closes the previous paragraph and opens+closes a paragraph with "I am a dialogue line" and attached metadata "spoken by E"
--- Later on --this is a time skip rule, with a label
[P] I am another line --dialogue line rule, doesn't close anything since the previous rule closed the paragraph it made; opens+closes a paragraph with "I am another line" and attached metadata "spoken by P"
And I am a final paragraph --text rule, opening a paragraph
This example may generate the following HTML.
<p>This is a test<br/>
Another word</p>
<!-- This is a comment -->
<p>This is another test</p>
<p class="speaker speaker-E" data-speaker="E">I am a dialogue line</p>
<hr data-label="Later on"/>
<p class="speaker speaker-P" data-speaker="P">I am another line</p>
<p>And I am a final paragraph</p>
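To show how the paragraph logic could produce that HTML, here is a minimal Python sketch of a line-by-line renderer. It is not a reference compiler: `render_content` is a hypothetical name, and several behaviours are my assumptions (comments, meta rules and time skips all close an open paragraph; meta rules produce no output; inline rules are not handled and their characters are simply HTML-escaped).

```python
from html import escape


def render_content(content: str) -> str:
    """Render a GLS content block to HTML, one line at a time."""
    out = []        # finished HTML fragments, in document order
    paragraph = []  # text lines of the currently open paragraph

    def close_paragraph():
        if paragraph:
            out.append("<p>" + "<br/>\n".join(paragraph) + "</p>")
            paragraph.clear()

    for line in content.split("\n"):
        if line == "":
            close_paragraph()
        elif line.startswith("// "):
            close_paragraph()  # assumption: a comment closes the paragraph
            out.append("<!-- " + escape(line[3:], quote=False) + " -->")
        elif line.startswith("#"):
            close_paragraph()  # assumption: meta rules produce no output
        elif line.startswith("---"):
            close_paragraph()
            label = line[3:].strip()
            attr = f' data-label="{escape(label)}"' if label else ""
            out.append(f"<hr{attr}/>")
        elif len(line) >= 4 and line[0] == "[" and line[2:4] == "] ":
            close_paragraph()
            who = escape(line[1])
            text = escape(line[4:], quote=False)
            out.append(f'<p class="speaker speaker-{who}" data-speaker="{who}">{text}</p>')
        else:
            # Text rule: opens a paragraph if none is open, else extends it.
            paragraph.append(escape(line, quote=False))

    close_paragraph()  # a paragraph is automatically closed at end of document
    return "\n".join(out)
```

Fed the example content above (without the `--` annotations), this sketch yields the same HTML as the example output.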
Comments are "just that": a compiler may safely ignore them, as they only carry metadata for the writer and not the reader.
Their prefix is three characters: two forward slashes (/) and an ASCII standard space (code 0x20).
Meta rules are rules modifying the compiler's behaviour on the fly. The first use case I have in mind is to add or change metadata on speakers/actors.
A meta rule always starts with a # and no space afterwards; its rule identifier is put right after.
That means the meta rule format is #<RULE ID>, for example #&.
Text rules are, at their core, text. They're your story.
Dialogue lines are made to stand out / be focused, as they're a single unit. This implies that each dialogue line is its own paragraph (ie parsing a dialogue line will close any open paragraph, and open/close one with its content).
Text lines can have inline components, for example they're useful / needed for inline dialogue lines (to provide the metadata of who's speaking).
An inline rule cannot be opened when another rule is already opened (no nested rules, and no overlapping rules).
Inline dialogue: <PERSON: some text> (eg <E: This is my line>).
Reference: {PERSON: the word(s)} (eg {E: they}).
Historically, I started to write this EBNF. I may draw from it to write a real parser now.
<document> ::= <metadata> <eol> <eol> <content>? <eol>
/* METADATA */
<metadata> ::= <header> <eol> <extra>
| <header>
<header> ::= <seriesheader> <eol> <series>
| <oneshotheader> <eol> <oneshot>
<seriesheader> ::= "series " <pubdate>
<oneshotheader> ::= "oneshot " <pubdate>
<pubdate> ::= <date> | <draft>
<date> ::= <d> <d> <d> <d> "-" <d> <d> "-" <d> <d>
<draft> ::= "draft"
<oneshot> ::= <storytitle>
<series> ::= <storytitle> <eol> <chapterline>
<storytitle> ::= <text>
<chapterline> ::= <chapterno> ": " <chaptertitle>
<chapterno> ::= <dpos>
<chaptertitle> ::= <text>
<extra> ::= <extraline> <eol> <extra> | <extraline>
<extraline> ::= (<cws> | <prompt>)
<prompt> ::= "prompt: " <text>
<cw> ::= <text>
<cwlist> ::= <cw> | <cw> "," <cwlist>
<cws> ::= "cw: " <cwlist>
/* CONTENT */
<content> ::= <contentline>+
<contentline> ::=
<comment>
| <meta>
| <timeskip>
| <dialogueline>
| <genericline>
| <eol>
<comment> ::= "// " <text>
<meta> ::= "#"
<timeskip> ::= "---" (" " <text>)?
<dialogueline> ::= "[" <utf> "] " <text>
<genericline> ::= <fragment> (" " <fragment> | " ")*
<fragment> ::= <dialogue>
| <ref>
| <word>
<dialogue> ::= "<" <utf> ": " <text> ">"
<ref> ::= "{" <utf> ": " <text> "}"
/* basics */
<text> ::= " "* <utf> (<utf> | " ")*
<word> ::= <utf>+
<dpos> ::= "0"* [1-9] <d>*
<d> ::= [0-9]
<eol> ::= "\n"
/* meant as a generic placeholder for any kind of utf8 sequence; */
/* obviously not utf8 here */
<utf> ::= [a-z] | [A-Z] | <d>
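As a companion to the <dialogue> and <ref> fragments in the grammar, here is a rough Python sketch that splits a text line into plain text and inline rules. `parse_inline` and the tuple layout are hypothetical; the "no nested, no overlapping rules" constraint is approximated by forbidding any of `<`, `>`, `{`, `}` inside an inline rule, which is my assumption.

```python
import re

# <P: text> is an inline dialogue, {P: words} is a reference.
# P is a single-character person identifier.
INLINE = re.compile(
    r"<(?P<dp>.): (?P<dt>[^<>{}]+)>"
    r"|\{(?P<rp>.): (?P<rt>[^<>{}]+)\}"
)


def parse_inline(line: str) -> list:
    """Split a text line into (kind, person, text) tuples."""
    parts = []
    position = 0
    for match in INLINE.finditer(line):
        if match.start() > position:
            # Plain text sitting between two inline rules.
            parts.append(("text", None, line[position:match.start()]))
        if match.group("dp") is not None:
            parts.append(("dialogue", match.group("dp"), match.group("dt")))
        else:
            parts.append(("ref", match.group("rp"), match.group("rt")))
        position = match.end()
    if position < len(line):
        parts.append(("text", None, line[position:]))
    return parts
```

A malformed inline rule (say, a `<` that is never closed) just stays in the surrounding plain text here; a stricter parser might reject the line instead.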