  1. Specifications
    1. Controlled Vocabulary
      1. CV File Format
    2. Filename
      1. Minimum
      2. Full Filename Specification
    3. Filesystem
    4. Timestamp-ID
      1. ostrta-id-N

#Specifications

Here follow (in alphabetical order) some more detailed notes on implementing some of the general concepts.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

#Controlled Vocabulary

For a conceptual overview, see the Controlled Vocabulary section in General Concepts.

  1. A CV item is defined as a contiguous word or term used as an additional axis of metadata. It is commonly referred to as a "tag", but that is only one usage, so here we use the more general term.
    1. By contiguous, we mean that spaces MUST NOT be used.

    2. Underscores, camelCase, PascalCase, etc. MAY be used within CV items instead of spaces.

#CV File Format

The CV file format is a simple plain text implementation of one idea: disambiguation notes should live in the same place you choose the CV item from.

Using the common example of selecting tag(s), the plain text CV file we propose looks like:

tag1
tag2    <- tag3
tag3    use tag2 instead
  1. Where:

    1. The CV item (i.e., "tag") MUST appear at the beginning of each line.

    2. CV items MUST be separated by newlines.

    3. A CV item MAY be followed by OPTIONAL disambiguation notes. If notes follow, they MUST be separated from the CV item by at least one space character.

      1. This makes discarding the disambiguation notes from the desired tag (after selection) trivial in many different programming languages.
    4. Redirection from one CV item to another MAY be accomplished by way of a simple arrow glyph made of a less-than sign and a hyphen (<-).

    5. Other than the above extremely simple requirements, you are not only free but actually encouraged to use whatever terms, glyphs, etc. make sense to you personally.

  2. In addition to the above:

    1. Implementations SHOULD provide a user-selectable option to either limit selections strictly to the choices in the CV file, or allow adding new items "on the fly."
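
The rules above are simple enough to sketch in a few lines. What follows is only an illustration in Python, not part of any existing implementation: the names `CVEntry`, `parse_cv_text` and `accept_selection` are made up for this example, and ignoring blank lines is an assumption the spec does not state. Notes (including any `<-` redirection) are kept as free-form text, since the spec deliberately leaves their content up to you.

```python
from typing import List, NamedTuple


class CVEntry(NamedTuple):
    item: str   # the CV item itself; never contains whitespace
    notes: str  # free-form disambiguation notes; empty if none were given


def parse_cv_text(text: str) -> List[CVEntry]:
    """Parse the contents of a CV file into a list of entries."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # assumption: blank lines carry no meaning
        if line[0].isspace():
            raise ValueError(f"CV item must start the line: {line!r}")
        # Everything up to the first run of whitespace is the item;
        # whatever follows is the optional notes.
        parts = line.split(maxsplit=1)
        notes = parts[1].strip() if len(parts) > 1 else ""
        entries.append(CVEntry(parts[0], notes))
    return entries


def accept_selection(choice: str, entries: List[CVEntry], strict: bool) -> bool:
    """Decide whether a chosen item may be used.

    With strict=True only items listed in the CV file are accepted;
    with strict=False new items may be added "on the fly".
    """
    if not choice or any(ch.isspace() for ch in choice):
        return False  # CV items MUST NOT contain spaces
    return (not strict) or any(entry.item == choice for entry in entries)
```

Discarding the notes after selection is then just a matter of reading the `item` field and ignoring `notes`.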

#Filename

The filename spec is based upon (and closely related to) the timestamp-ID spec.

#Minimum

The minimum file name considered to be following the spec would be a simple ostrta-id-4 with no extension:

YYYY-MM-DD-HHMM

In the Elisp implementation, this simple check is performed by the function ostrta-filename-p, which in turn uses the variable ostrta-id-4-regexp.
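
For readers who do not use Emacs, here is a rough Python equivalent of that minimum check. It is a sketch only: the pattern below is my own guess at a reasonable expression and is not copied from `ostrta-id-4-regexp`, and the names `ID4_PATTERN` and `is_minimum_filename` are hypothetical. It also goes one step further than a pure pattern match by rejecting impossible dates and times, which the Elisp function may or may not do.

```python
import re
from datetime import datetime

# Hypothetical pattern for YYYY-MM-DD-HHMM; not taken from the Elisp source.
ID4_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{4}")


def is_minimum_filename(name: str) -> bool:
    """Return True if name is exactly one valid ostrta-id-4 and nothing else."""
    if ID4_PATTERN.fullmatch(name) is None:
        return False
    try:
        # Reject well-formed but impossible values such as month 13.
        datetime.strptime(name, "%Y-%m-%d-%H%M")
    except ValueError:
        return False
    return True
```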

#Full Filename Specification

A simple example (in this case, a photo filename):

YYYY-MM-DD-HHMM_description_text_here--tag1-tag2-tag3_with_spaces.jpg

A much more detailed definition:

timestamp-id [_description...] [--[tag...]-another_tag...] [.ext]
  1. timestamp-id is the only strictly required part and therefore MUST follow ostrta-id-4 (at minimum) but MAY achieve higher resolution by following ostrta-id-6, ostrta-id-8, etc. See the timestamp-ID specification for further detail.

  2. description is OPTIONAL but if present MUST start with an underscore (_) delimiter to clearly mark its separation from the timestamp.

    1. The initial delimiter (_) is not considered a part of the description. It is a delimiter.

    2. Illegal characters throughout the file name depend on the file system. Having said that, I think the project SHOULD endeavour to develop a short list which any implementation SHOULD check against when implementing any sort of (re-)naming function(s).

      1. exFAT (common on larger SD cards), for example, does not allow any of / \ : * ? " < > |
    3. Besides the above, I think we SHOULD NOT use spaces (personally I use underscores instead), but I guess that does not have to be part of the spec.

    4. Note that periods (.) MAY be present in description. N.B. how we define filename extension (.ext) below!

  3. tags are also OPTIONAL but if present MUST start with a double hyphen (--) delimiter to clearly separate them from the description.

    1. The initial delimiter (--) SHALL NOT be considered a part of any tags. It is a delimiter.

    2. Within tags, there MAY be spaces, but again, underscores SHOULD be used instead.

    3. Different tags MUST be separated by a hyphen (-) as delimiter.

      1. Corollary to this, individual tags MUST NOT contain hyphens (-).
    4. Note that periods (.) MAY be present in tags. N.B. how we define filename extension (.ext) below!

  4. We define filename extension (.ext) as the last group of legal characters (including letters, numbers, symbols) at the end of the file name after the last period (.).

    1. This means that extensions MAY be arbitrary length. I get a headache just thinking about the potential implications here, so I would welcome feedback from anyone who has more experience dealing with something like this. In particular I wonder if we should limit it to some number of characters.

    2. At the moment nothing really relies on this anyway, but some day it might, hence me trying to come up with a good definition here.

  5. Editing filename after initial creation or processing:

    1. The optional parts of file name (description, tags, etc.) MAY (and should) change!

    2. The timestamp-id portion MUST NOT change (after initial assignment / processing).

    3. The intention of this rule is to ensure the timestamp-id portion of the filename remains a reliable identifier.

Alternatively, you MAY leave the base timestamp-id there by itself (perhaps only along with the extension) and implement your metadata in another index file or even a database (although plain text files are always preferred). [1]
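
To make the filename rules in this section more concrete, here is a minimal Python sketch that splits a name into its parts and puts it back together. Several things in it are assumptions rather than spec: only ostrta-id-4 and ostrta-id-6 are recognised, the first `--` after the timestamp is taken to start the tags, and the illegal character set is just the exFAT example from above (the project has no agreed list yet). The names `ParsedName`, `parse_filename` and `build_filename` are invented for this example.

```python
import re
from typing import List, NamedTuple, Optional

# Assumption: only ostrta-id-4 and ostrta-id-6 are recognised here.
TIMESTAMP_ID = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{4}(?:[0-9]{2})?")

# Characters from the exFAT example in the text; not an official project list.
ILLEGAL_CHARS = set('/\\:*?"<>|')


class ParsedName(NamedTuple):
    timestamp_id: str
    description: Optional[str]
    tags: List[str]
    extension: Optional[str]


def parse_filename(name: str) -> Optional[ParsedName]:
    """Split a file name into its parts, or return None if it does not conform."""
    match = TIMESTAMP_ID.match(name)
    if match is None:
        return None
    rest = name[match.end():]
    # The extension is whatever follows the last period, if there is one.
    extension = None
    if "." in rest:
        rest, extension = rest.rsplit(".", 1)
    # Assumption: the first "--" after the timestamp starts the tag list.
    tags: List[str] = []
    if "--" in rest:
        rest, tag_part = rest.split("--", 1)
        tags = [tag for tag in tag_part.split("-") if tag]
    if rest == "":
        description = None
    elif rest.startswith("_"):
        description = rest[1:]  # the leading "_" is a delimiter, not content
    else:
        return None  # something other than "_", "--" or "." follows the timestamp
    return ParsedName(match.group(0), description, tags, extension)


def build_filename(parts: ParsedName) -> str:
    """Compose a file name from its parts, enforcing the tag and character rules."""
    for tag in parts.tags:
        if "-" in tag:
            raise ValueError(f"tag must not contain a hyphen: {tag!r}")
    name = parts.timestamp_id
    if parts.description is not None:
        name += "_" + parts.description
    if parts.tags:
        name += "--" + "-".join(parts.tags)
    if parts.extension is not None:
        name += "." + parts.extension
    if ILLEGAL_CHARS & set(name):
        raise ValueError(f"illegal character in file name: {name!r}")
    return name
```

Renaming a file then amounts to parsing the old name, replacing the description and/or tags, and rebuilding. For instance, parsing `2021-01-01-2029_draft--todo.txt`, swapping in the description `final` and the tag `done`, and rebuilding gives `2021-01-01-2029_final--done.txt`. The timestamp-id is carried over untouched, which is the whole point of the rule above.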

#Filesystem

I have a lot of ideas about how to organize my home dir. I am sure other people do, too, and therefore I am not sure how many of these ideas are appropriate for this project.

Having said that, at a minimum I think we need to have one or more of the all-important timeline structures defined therein. Consider the following as an example to spur discussion rather than any sort of "standard", certainly for the time being.

One thing in particular I noticed so far is that having the intermediate month folders seemed to be more trouble than it was worth in the ~/tmp directory. So I did away with them there. However in ~/timeline, items are much more numerous, so it's useful to have folders for months because each of those could contain hundreds (or more) of files and additional directories.

~
├── timeline
│   ├── 2016
│   │   ├── 01-Jan
│   │   ├── 02-Feb
│   │   ├── 03-Mar
│   │   ├── 04-Apr
│   │   ├── 05-May
│   │   ├── 06-Jun
│   │   ├── 07-Jul
│   │   ├── 08-Aug
│   │   ├── 09-Sep
│   │   ├── 10-Oct
│   │   ├── 11-Nov
│   │   └── 12-Dec
│   ├── 2017
│   │   └── [...]
│   └── 2018
│       └── [...]
└── tmp
    ├── 2019
    │   ├── 2019-06-08_software_download
    │   └── 2019-12-31_experimental_project
    └── 2020
        ├── 2020-04-04_another_temp_dir
        └── 2020-12-18_you_get_the_idea
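
If you wanted to generate paths matching the example layout above, something like the following Python sketch would do. It is purely illustrative and no more of a "standard" than the tree itself: the function names, the `root` parameter and the hard-coded English month abbreviations are all my own choices for this example.

```python
from datetime import date
from pathlib import Path
from typing import Optional

# Hard-coded so the directory names do not depend on the system locale.
MONTH_ABBREVIATIONS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
                       "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def timeline_dir(day: date, root: Optional[Path] = None) -> Path:
    """Return the month folder for a date, e.g. ~/timeline/2016/03-Mar."""
    base = root if root is not None else Path.home() / "timeline"
    month = f"{day.month:02d}-{MONTH_ABBREVIATIONS[day.month - 1]}"
    return base / f"{day.year:04d}" / month


def tmp_dir(day: date, label: str, root: Optional[Path] = None) -> Path:
    """Return a dated temp folder with no month level, e.g. ~/tmp/2019/2019-06-08_label."""
    base = root if root is not None else Path.home() / "tmp"
    return base / f"{day.year:04d}" / f"{day.isoformat()}_{label}"
```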

#Timestamp-ID

Related closely to the base filename spec, and vice-versa.

#ostrta-id-N

The notion of -4 and -6 comes from the size of the last group of digits in the timestamp:

| Spec name   | Format            | Example           | Resolution |
|-------------|-------------------|-------------------|------------|
| ostrta-id-4 | YYYY-MM-DD-HHMM   | 2021-01-01-2029   | minute     |
| ostrta-id-6 | YYYY-MM-DD-HHMMSS | 2021-01-01-202938 | second     |
Therefore it is an expression of the level of time resolution (minute and second, respectively).
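
Producing either form from a point in time is a one-liner in most languages. Here is a small Python sketch, with the caveat that `ostrta_id` and the `FORMATS` table are names I made up for illustration and only the two resolutions defined so far are supported.

```python
from datetime import datetime

# One strftime format per defined resolution; N is the size of the last digit group.
FORMATS = {
    4: "%Y-%m-%d-%H%M",    # ostrta-id-4, minute resolution
    6: "%Y-%m-%d-%H%M%S",  # ostrta-id-6, second resolution
}


def ostrta_id(moment: datetime, n: int = 4) -> str:
    """Format a datetime as an ostrta-id-N string (N is 4 or 6)."""
    if n not in FORMATS:
        raise ValueError(f"unsupported resolution: ostrta-id-{n}")
    # Finer-grained components are simply dropped, not rounded.
    return moment.strftime(FORMATS[n])
```

Because every field is fixed-width and zero-padded, IDs of the same resolution sort chronologically when sorted as plain strings.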

I suppose there MAY eventually be -8 (or further) but I personally have not come across the need as of yet.

  • Then we would also need to get into discussion of whether to use period, etc. for fractional seconds or what. So I suppose we cross that bridge when we come to it.

Historical note: At one point early on, I was using an underscore between day and time. But then I realized we are still just talking about degrees of time. And since they are all similar (time), I think we should simply stick with hyphens throughout.

#Footnotes

[1] In fact, this is the approach I took in the (as yet unreleased) Meme Manager, as some memes have far too much metadata to comfortably store in the filename.