~trs-80/ostrta-spec

ref: 36ae36caee7df529c2a8182ca3cb605550d34238 ostrta-spec/Specifications.org -rw-r--r-- 9.6 KiB
36ae36caTRS-80 Remove stupid gigantic ToC heading 10 months ago
                                                                                
* Specifications
  :PROPERTIES:
  :CUSTOM_ID:            specifications
  :END:

Here follow (in alphabetical order) some more detailed notes on implementing some of the [[file:README.org::#general-concepts][general concepts]].

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [[https://tools.ietf.org/html/rfc2119][RFC 2119]].

** Controlled Vocabulary
   :PROPERTIES:
   :CUSTOM_ID:            controlled-vocabulary
   :END:

For a conceptual overview, see the [[file:README.org::#controlled-vocabulary][Controlled Vocabulary]] section in General Concepts.

1. A *CV item* is defined as a contiguous word or term used as an additional axis of metadata.  It is commonly referred to as a "tag", but that is only one usage, so here we use the more general term.

   1. By contiguous, we mean that spaces MUST NOT be used.

   2. Under_scores, camelCase, PascalCase, etc. MAY be used instead within CV items.

*** CV File Format
    :PROPERTIES:
    :CUSTOM_ID:            cv-file-format
    :END:

This is an implementation, in a simple plain text file format, of the [[file:README.org::#disambiguation-notes-live-directly-with-cv-items][concept]] of keeping additional disambiguation notes directly in the same place you choose the CV item from.

Using the common example of selecting tag(s), the plain text CV file format we propose looks like this:

#+begin_example
  tag1
  tag2    <- tag3
  tag3    use tag2 instead
#+end_example

1. Where:

   1. The CV item (i.e., "tag") MUST appear at the beginning of each line.

   2. CV items MUST be separated by newlines.

   3. A CV item MAY be followed by OPTIONAL disambiguation notes.  If notes follow, they MUST be separated from the CV item by at least one space character.

      1. This makes discarding the disambiguation notes from the desired tag (after selection) trivial in many different programming languages.

   4. Redirection from one CV item to another MAY be accomplished by way of a simple arrow glyph made of a less-than sign and a hyphen (=<-=).

   5. Other than the extremely simple requirements above, you are not only free but actually encouraged to use whatever terms, glyphs, etc. make sense /to you personally/.

2. In addition to the above:

   1. Implementations SHOULD provide a user-selectable option for whether to limit selections strictly to the choices in the CV file or to allow adding new items "on the fly."
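
As a rough illustration of the rules above, the following Python sketch splits each line of a CV file into the CV item and its optional notes.  The function names =parse_cv_line= and =load_cv= are hypothetical and not part of any existing implementation.  The sketch also assumes that blank lines and leading indentation can be ignored, accepts any whitespace (not only spaces) as the separator, and treats the notes, including any =<-= redirection, as opaque text.

#+begin_src python
  def parse_cv_line(line):
      """Split one CV file line into (item, notes).

      The item is everything up to the first run of whitespace; the
      notes are whatever follows, or None when there are none.
      """
      parts = line.strip().split(None, 1)
      if not parts:
          return None  # blank line (assumption: skipped)
      item = parts[0]
      notes = parts[1] if len(parts) > 1 else None
      return item, notes


  def load_cv(text):
      """Return a dict mapping each CV item to its notes (or None)."""
      entries = {}
      for line in text.splitlines():
          parsed = parse_cv_line(line)
          if parsed is not None:
              item, notes = parsed
              entries[item] = notes
      return entries
#+end_src

Because the item never contains spaces, discarding the notes after selection amounts to keeping only the first element of the pair.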

** Filename
   :PROPERTIES:
   :CUSTOM_ID:            filename
   :END:

The filename spec is based upon (and closely related to) the [[#timestamp-id][timestamp-ID]] spec.

*** Minimum
    :PROPERTIES:
    :CUSTOM_ID:            minimum
    :END:

The minimum file name considered to be following the spec would be a simple [[#ostrta-id-n][ostrta-id-4]] with no extension:

#+begin_example
  YYYY-MM-DD-HHMM
#+end_example

In the Elisp implementation, this simple check is performed by the function =ostrta-filename-p=, which in turn uses the variable =ostrta-id-4-regexp=.
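
For readers who do not use Emacs, a comparable check might look like the Python sketch below.  The pattern is an approximation written for this example and is not the actual value of =ostrta-id-4-regexp=; the function name =starts_with_ostrta_id_4= is likewise hypothetical.  It assumes that a name qualifies as soon as it begins with an ostrta-id-4, and it checks only the digit layout, not whether the date and time are valid.

#+begin_src python
  import re

  # Assumed pattern: YYYY-MM-DD-HHMM, digits only, at the start of the name.
  OSTRTA_ID_4 = re.compile(r"\d{4}-\d{2}-\d{2}-\d{4}")


  def starts_with_ostrta_id_4(filename):
      """Return True if the file name begins with an ostrta-id-4."""
      return OSTRTA_ID_4.match(filename) is not None
#+end_src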

*** Full Filename Specification
    :PROPERTIES:
    :CUSTOM_ID:            full-filename-specification
    :END:

A simple example (in this case, a photo filename):

#+begin_example
  YYYY-MM-DD-HHMM_description_text_here--tag1-tag2-tag3_with_spaces.jpg
#+end_example

A much more detailed definition:

#+begin_example
  timestamp-id [_description...] [--[tag...]-another_tag...] [.ext]
#+end_example

1. *timestamp-id* is the only strictly required part and therefore MUST follow ostrta-id-4 (at minimum) but MAY achieve higher resolution by following ostrta-id-6, ostrta-id-8, etc.  See the [[#timestamp-id][timestamp-ID]] specification for further detail.

2. *description* is OPTIONAL but if present MUST start with an underscore (=_=) delimiter to clearly mark its separation from the timestamp.

   1. The initial delimiter (=_=) is not considered a part of the description.  It is a delimiter.

   2. Illegal characters throughout the file name depend on the file system.  Having said that, I think the project SHOULD endeavour to develop a short list which any implementation SHOULD check against when implementing any sort of (re-)naming function(s).

      1. exFAT (common on larger SD cards), for example, does not allow {=/\:*?"<>|=}

   3. Besides the above, I think we SHOULD NOT use spaces (personally I use underscores instead) but I guess that does not have to be part of spec.

   4. Note that periods (=.=) MAY be present in description.  N.B. how we define filename extension (.ext) below!

3. *tags* are also OPTIONAL but if present MUST start with a double hyphen (=--=) delimiter to clearly separate them from the description.

   1. The initial delimiter (=--=) SHALL NOT be considered a part of any tags.  It is a delimiter.

   2. Within tags, there MAY be spaces, but again, underscores SHOULD be used instead.

   3. Different tags MUST be separated by a hyphen (=-=) as delimiter.

      1. Corollary to this, individual tags MUST NOT contain hyphens (=-=).

   4. Note that periods (=.=) MAY be present in tags.  N.B. how we define filename extension (.ext) below!

4. We define filename extension (*.ext*) as the last group of legal characters (including letters, numbers, symbols) at the end of the file name after the last period (=.=).

   1. This means that extensions MAY be arbitrary length.  I get a headache just thinking about the potential implications here, so I would welcome feedback from anyone who has more experience dealing with something like this.  In particular I wonder if we should limit it to some number of characters.

   2. At the moment nothing really relies on this anyway, but some day it might, hence me trying to come up with a good definition here.

5. Editing filename after initial creation or processing:

   1. The optional parts of the file name (description, tags, etc.) MAY (and /should/) change!

   2. The timestamp-id portion MUST NOT change (after initial assignment / processing).

   3. The intention of this rule is to ensure the timestamp-id portion of the filename remains a reliable identifier.

Alternatively, you MAY leave the base timestamp-id there by itself (perhaps only along with the extension) and implement your metadata in another index file or even a database (although plain text files are always [[file:README.org::#relying-strictly-on-floss-and-lowest-common-denominator-formats][preferred]]).[fn:1]
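
To make the full filename definition more concrete, here is a minimal Python sketch that splits a name into its four parts.  It is only one possible reading of the spec: the function name =parse_filename= and the timestamp pattern are hypothetical, the extension is taken as everything after the last period, and the tag section is assumed to start at the /last/ double hyphen (individual tags cannot contain hyphens, so this keeps any earlier hyphens in the description).

#+begin_src python
  import re

  # Assumed layout of ostrta-id-4 / ostrta-id-6 at the start of the name.
  TIMESTAMP_ID = re.compile(r"\d{4}-\d{2}-\d{2}-\d{4}(?:\d{2})?")


  def parse_filename(name):
      """Split a file name into timestamp-id, description, tags and ext."""
      match = TIMESTAMP_ID.match(name)
      if match is None:
          raise ValueError("no timestamp-id at start of: " + name)
      timestamp_id = match.group(0)
      rest = name[match.end():]

      # Extension: everything after the last period, if there is one.
      stem, dot, ext = rest.rpartition(".")
      if not dot:
          stem, ext = rest, None

      # Tags: everything after the last "--" (an assumption of this sketch).
      head, dashes, tag_part = stem.rpartition("--")
      if not dashes:
          head, tag_part = stem, ""
      tags = tag_part.split("-") if tag_part else []

      # Description: must begin with "_", which is only a delimiter.
      if head == "":
          description = None
      elif head.startswith("_"):
          description = head[1:]
      else:
          raise ValueError("description must start with '_': " + name)

      return {"timestamp_id": timestamp_id, "description": description,
              "tags": tags, "ext": ext}
#+end_src

For =2021-01-01-2029_description_text_here--tag1-tag2-tag3_with_spaces.jpg= this yields the timestamp-id =2021-01-01-2029=, the description =description_text_here=, the tags =tag1=, =tag2= and =tag3_with_spaces=, and the extension =jpg=.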

** Filesystem
   :PROPERTIES:
   :CUSTOM_ID:            filesystem
   :END:

I have a lot of ideas about how to organize my home dir.  I am sure other people do, too, and therefore I am not sure how many of these ideas are appropriate for this project.

Having said that, at a minimum I think we need to have one or more of the all important timeline structures defined therein.  Consider the following as an example to spur discussion, rather than any sort of "standard", certainly for the time being.

One thing in particular I noticed so far is that having the intermediate month folders seemed to be more trouble than it was worth in the =~/tmp= directory.  So I did away with them there.  However in =~/timeline=, items are much more numerous, so it's useful to have folders for months because each of those could contain hundreds (or more) of files and additional directories.

#+begin_example
  ~
  ├── timeline
  │   ├── 2016
  │   │   ├── 01-Jan
  │   │   ├── 02-Feb
  │   │   ├── 03-Mar
  │   │   ├── 04-Apr
  │   │   ├── 05-May
  │   │   ├── 06-Jun
  │   │   ├── 07-Jul
  │   │   ├── 08-Aug
  │   │   ├── 09-Sep
  │   │   ├── 10-Oct
  │   │   ├── 11-Nov
  │   │   └── 12-Dec
  │   ├── 2017
  │   │   └── [...]
  │   └── 2018
  │       └── [...]
  └── tmp
      ├── 2019
      │   ├── 2019-06-08_software_download
      │   └── 2019-12-31_experimental_project
      └── 2020
          ├── 2020-04-04_another_temp_dir
          └── 2020-12-18_you_get_the_idea
#+end_example
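
Purely as a discussion aid, the layout above could be computed from a date as in the following Python sketch.  The function names =timeline_dir= and =tmp_dir= and the default roots are hypothetical; the month abbreviations are hard-coded in English so the result does not depend on the system locale.

#+begin_src python
  from datetime import date
  from pathlib import PurePosixPath

  # Fixed English abbreviations, independent of the current locale.
  MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
            "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


  def timeline_dir(day, root="~/timeline"):
      """Return the year/month directory for a date, e.g. 2016/03-Mar."""
      month = "{:02d}-{}".format(day.month, MONTHS[day.month - 1])
      return PurePosixPath(root) / str(day.year) / month


  def tmp_dir(day, label, root="~/tmp"):
      """Return a dated temp directory directly under the year folder."""
      name = "{}_{}".format(day.isoformat(), label)
      return PurePosixPath(root) / str(day.year) / name
#+end_src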

** Timestamp-ID
   :PROPERTIES:
   :CUSTOM_ID:            timestamp-id
   :END:

The timestamp-ID spec is closely related to the base [[#filename][filename]] spec, and vice versa.

*** ostrta-id-N
    :PROPERTIES:
    :CUSTOM_ID:            ostrta-id-n
    :END:

The notion of =-4= and =-6= comes from the size of the last group of digits in the timestamp:

|-------------+-------------------+-------------------+------------|
| Spec name   | Format            | Example           | Resolution |
|-------------+-------------------+-------------------+------------|
|             |                   | <l>               |            |
| ostrta-id-4 | YYYY-MM-DD-HHMM   | 2021-01-01-2029   | minute     |
| ostrta-id-6 | YYYY-MM-DD-HHMMSS | 2021-01-01-202938 | second     |
|-------------+-------------------+-------------------+------------|

The suffix is therefore an expression of the level of time resolution (minute and second, respectively).
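
As a small illustration of the two formats in the table, the Python sketch below builds either ID from a =datetime=.  The function name =make_ostrta_id= and the =FORMATS= mapping are hypothetical and exist only for this example.

#+begin_src python
  from datetime import datetime

  # strftime patterns for the two resolutions defined in the table.
  FORMATS = {4: "%Y-%m-%d-%H%M", 6: "%Y-%m-%d-%H%M%S"}


  def make_ostrta_id(moment, n=4):
      """Format a datetime as ostrta-id-4 (default) or ostrta-id-6."""
      if n not in FORMATS:
          raise ValueError("unsupported resolution: {}".format(n))
      return moment.strftime(FORMATS[n])
#+end_src

An ostrta-id-6 always begins with the ostrta-id-4 for the same moment, so moving to the higher resolution only appends digits.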

I suppose there MAY eventually be =-8= (or further) but I personally have not come across the need as of yet.

- Then we would also need to get into discussion of whether to use period, etc. for fractional seconds or what.  So I suppose we cross that bridge when we come to it.

Historical note: At one point early on, I was using an underscore between day and time.  But then I realized we are still just talking about degrees of time.  And since they are all similar (time), I think we should simply stick with hyphens throughout.

** Footnotes
   :PROPERTIES:
   :CUSTOM_ID:            footnotes
   :END:

[fn:1] In fact this is the approach I took in the (as yet unreleased) Meme Manager as some memes have far too much metadata to comfortably store in the filename.