~ivilata/gwit-spec

448417a8cce90050e5d1a20b4b46bc101011b7aa — Ivan Vilata-i-Balaguer 2 months ago f5520a7
Add "then" when it helps with readability.
1 files changed, 8 insertions(+), 8 deletions(-)

M README.md => README.md +8 -8
@@ 51,7 51,7 @@ The word "gwit" (pronounced [gu̯it]) reflects the "Web in Git" concept, but it 

A gwit site is just *a Git repository branch associated with a PGP public key* whose private keys are used to sign commits in that branch. While that **site key** is public, its private keys are only owned by the **site author**. Since that **site branch** can be fetched from different locations managed by people other than the author, a location is not used to identify the site. Instead, *the full fingerprint of the site key is used as its identifier*.

-This means that the relation between site and site key is one-to-one: if an author wants to create another site, a new site key MUST be created. Thus, using one's day-to-day PGP key as a site key is NOT RECOMMENDED. The mechanisms to relate a site (and its key) to a particular identity outside of gwit are out of the scope of this specification.
+This means that the relation between site and site key is one-to-one: if an author wants to create another site, then a new site key MUST be created. Thus, using one's day-to-day PGP key as a site key is NOT RECOMMENDED. The mechanisms to relate a site (and its key) to a particular identity outside of gwit are out of the scope of this specification.

As gwit site identifiers are not meaningful nor memorable to humans, some support is provided to allow using **petnames** for sites. This specification uses the concepts of petname, edge name, and (self-)proposed name from the paper [Petnames: A humane approach to secure, decentralized naming][petnames].



@@ 91,7 91,7 @@ No restrictions are placed upon content files themselves, but it is RECOMMENDED 

`_gwit/self.ini` has the same format as [Git configuration files][git-config-file], which can be summarized as an [INI file][ini-file] where subsection definitions have a `[section-name "subsection-name"]` format. It MUST be encoded using UTF-8, all its values MUST be considered as simple strings (i.e. no special parsing of integers or pathnames), and includes MUST be disabled. Each single encoded value MUST NOT exceed 1000 bytes (unless otherwise stated below), values with multiple occurrences MUST NOT have more than 10 single values, and the whole file MUST NOT exceed 65536 bytes.

-Recognized sections and values are described below, and unknown ones SHOULD be ignored. If a value marked as "single" is assigned more than once in the file, the last assignment is used.
+Recognized sections and values are described below, and unknown ones SHOULD be ignored. If a value marked as "single" is assigned more than once in the file, then the last assignment is used.
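
The size limits and the "last assignment wins" rule for `_gwit/self.ini` described above could be checked roughly as follows. This is a minimal sketch, not part of the specification: it assumes the file has already been parsed elsewhere into a mapping from key to the list of assigned values, the helper names (`check_limits`, `single`) and the example keys are hypothetical, and it reads the 10-value limit as applying only to keys that accept multiple occurrences. It also ignores any per-value exceptions to the 1000-byte limit.

```python
MAX_FILE_BYTES = 65536   # whole-file limit
MAX_VALUE_BYTES = 1000   # limit for each single encoded value
MAX_OCCURRENCES = 10     # limit for values with multiple occurrences


def check_limits(raw, values, multi_keys):
    """Raise ValueError if the raw file or its parsed values break a limit.

    raw: the file contents as bytes.
    values: parsed assignments, key -> list of strings in file order.
    multi_keys: keys that are allowed to occur more than once.
    """
    if len(raw) > MAX_FILE_BYTES:
        raise ValueError("file exceeds 65536 bytes")
    raw.decode("utf-8")  # UnicodeDecodeError is a subclass of ValueError
    for key, occurrences in values.items():
        if key in multi_keys and len(occurrences) > MAX_OCCURRENCES:
            raise ValueError(f"{key}: more than 10 values")
        for value in occurrences:
            if len(value.encode("utf-8")) > MAX_VALUE_BYTES:
                raise ValueError(f"{key}: value exceeds 1000 bytes")


def single(values, key):
    """Return the last assignment of a "single" value, or None if unset."""
    occurrences = values.get(key)
    return occurrences[-1] if occurrences else None
```

Note that the 1000-byte limit is measured on the UTF-8 encoding of each value, not on its length in characters.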

[git-config-file]: https://git-scm.com/docs/git-config#_configuration_file
    "CONFIGURATION FILE (git-config Documentation)"


@@ 169,7 169,7 @@ remote = https://lab.example.org/s.one/gwit-site.git

### Initial retrieval

-If someone wants to use a client program to retrieve a gwit site for the first time, the client MUST know:
+If someone wants to use a client program to retrieve a gwit site for the first time, then the client MUST know:

- The site identifier, i.e. the site key fingerprint. This MUST be a string of hexadecimal digits.
- The location of an existing copy of the site, accessible to it (either locally or remotely). This MUST be a local file system path or other URL format supported by Git for a remote.


@@ 196,7 196,7 @@ After the previous steps, the client MAY access the `_gwit/self.ini` file in the

### Site updates

-If someone wants to retrieve updates to a gwit site identified by `<SITE-ID>` for which they already have a Git clone in persistent client storage, the gwit client MUST choose one of its remotes `<REMOTE>`, fetch new items from it (including site key updates), verify that the new head of the site branch `<SITE-BRANCH>` (derived from `<SITE-ID>` as described further above) is signed by the key matching `<SITE-ID>` and a successor of its current head, and then point the site branch to its new head. An implementation may follow the steps below, or some others with equivalent results:
+If someone wants to retrieve updates to a gwit site identified by `<SITE-ID>` for which they already have a Git clone in persistent client storage, then the gwit client MUST choose one of its remotes `<REMOTE>`, fetch new items from it (including site key updates), verify that the new head of the site branch `<SITE-BRANCH>` (derived from `<SITE-ID>` as described further above) is signed by the key matching `<SITE-ID>` and a successor of its current head, and then point the site branch to its new head. An implementation may follow the steps below, or some others with equivalent results:

1. Get the commit hash of the current head of `<SITE-BRANCH>` as `<OLD-HEAD>` (e.g. `git show-ref --verify --hash refs/heads/<SITE-BRANCH>`).
2. Try to fetch new objects from `<REMOTE>` (e.g. `git fetch --atomic --no-write-fetch-head <REMOTE> '+refs/heads/*:refs/remotes/<REMOTE>/*'`; this preserves all fetch heads for each remote).
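
The two steps visible in this hunk could be driven from a client roughly as sketched below. The command lines for steps 1 and 2 are taken from the text above. The ancestry check with `git merge-base --is-ancestor` is an assumption of this sketch for the "successor of its current head" requirement, not a step quoted from the specification, and signature verification is left out entirely. All function names are hypothetical.

```python
import subprocess


def head_cmd(site_branch):
    """Step 1: command that prints the current head of the site branch."""
    return ["git", "show-ref", "--verify", "--hash",
            f"refs/heads/{site_branch}"]


def fetch_cmd(remote):
    """Step 2: command that fetches all branch heads from the remote."""
    return ["git", "fetch", "--atomic", "--no-write-fetch-head", remote,
            f"+refs/heads/*:refs/remotes/{remote}/*"]


def is_successor_cmd(old_head, new_head):
    """Assumed check: exits with status 0 if old_head is an ancestor of new_head."""
    return ["git", "merge-base", "--is-ancestor", old_head, new_head]


def run(cmd, repo):
    """Run one of the commands above inside the clone at path `repo`."""
    return subprocess.run(cmd, cwd=repo, capture_output=True, text=True)
```

Building each command as an argument list, rather than a shell string, avoids quoting problems with the refspec and with remote names.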


@@ 339,14 339,14 @@ Once the client has established the value of `<COMMIT>`, it MUST check that `<CO

The client MUST then resolve the path `<PATH>` in the URI (which has already been percent-decoded if necessary) to a file or directory in the Git tree associated with the commit `<COMMIT>`, by following the steps below, so as to produce some output:

-1. If `_gwit/self.ini` exists as a file (blob) in the desired commit `<COMMIT>` (e.g. `git ls-tree --format='%(objecttype) %(objectname)' <COMMIT> _gwit/self.init` succeeds and reports `blob <CONF-FILE-HASH>`), parse it (e.g. `git cat-file blob <CONF-FILE-HASH> | git config -f- …`). If it does not exist, treat site configuration as empty for the next steps.
+1. If `_gwit/self.ini` exists as a file (blob) in the desired commit `<COMMIT>` (e.g. `git ls-tree --format='%(objecttype) %(objectname)' <COMMIT> _gwit/self.init` succeeds and reports `blob <CONF-FILE-HASH>`), then parse it (e.g. `git cat-file blob <CONF-FILE-HASH> | git config -f- …`). If it does not exist, treat site configuration as empty for the next steps.
2. Compute `<RELPATH>` by replacing repetitions of the forward slash (`/`) in `<PATH>` by a single slash, then removing leading and trailing slashes, then removing dot segments according to the `remove_dot_segments` algorithm described in Section 5.2.4 of RFC3986 (e.g. `/foo//../bar/` becomes `bar`).

   The resulting `<RELPATH>` is relative to the site's root directory `<ROOT>` (as per site configuration) and either empty (meaning `<ROOT>` itself), or it consists of one or more non-empty path components separated by a single slash (for other files or directories).
3. Check that `<ROOT>/<RELPATH>` exists in the commit tree, that it resolves (via any symbolic links) to a `<TARGET>` path also within the tree, and get its type (e.g. `echo '<COMMIT>:<ROOT>/<RELPATH>' | git cat-file --batch-check='%(objecttype) %(objectname)' --follow-symlinks` reports `<TARGET-TYPE> <TARGET-HASH>`).
-4. If `<TARGET>` refers to a file (e.g. `<TARGET-TYPE>` is `blob`), produce its contents (e.g. `git cat-file blob <TARGET-HASH>`).
+4. If `<TARGET>` refers to a file (e.g. `<TARGET-TYPE>` is `blob`), then produce its contents (e.g. `git cat-file blob <TARGET-HASH>`).

-   Else, if `<TARGET>` refers to a directory (e.g. `<TARGET-TYPE>` is `tree`), the client SHOULD test if the site configuration defines an index file `<INDEX>`; if it does, and `<TARGET>/<INDEX>` resolves to a file (blob) in the commit tree (e.g. `echo '<TARGET-HASH>:<INDEX>' | git cat-file --batch-check='%(objecttype) %(objectname)' --follow-symlinks` reports `blob <INDEX-HASH>`), produce its contents (e.g. `git cat-file blob <INDEX-HASH>`); if the client does not allow index files, or the index file is undefined, missing or unreadable, the client SHOULD produce some form of directory listing for the entries in `<TARGET>` (e.g. from `git ls-tree <TARGET-HASH>`).
+   Else, if `<TARGET>` refers to a directory (e.g. `<TARGET-TYPE>` is `tree`), the client SHOULD test if the site configuration defines an index file `<INDEX>`; if it does, and `<TARGET>/<INDEX>` resolves to a file (blob) in the commit tree (e.g. `echo '<TARGET-HASH>:<INDEX>' | git cat-file --batch-check='%(objecttype) %(objectname)' --follow-symlinks` reports `blob <INDEX-HASH>`), then produce its contents (e.g. `git cat-file blob <INDEX-HASH>`); if the client does not allow index files, or the index file is undefined, missing or unreadable, then the client SHOULD produce some form of directory listing for the entries in `<TARGET>` (e.g. from `git ls-tree <TARGET-HASH>`).

   Else fail.
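
The computation of `<RELPATH>` in step 2 above could look like the following sketch. The function names are hypothetical. One detail is an assumption of this sketch rather than something the text states: applied literally, `remove_dot_segments` from RFC 3986 can leave a leading slash behind (for instance `foo/../bar` yields `/bar`), so slashes are stripped once more at the end in order to reproduce the example given in the step.

```python
import re


def remove_dot_segments(path):
    """Remove "." and ".." segments as in RFC 3986, Section 5.2.4."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]          # "/./x" becomes "/x"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]          # "/../x" becomes "/x"
            if output:
                output.pop()         # drop the previous segment
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading slash, if any)
            # from the input to the output.
            end = path.find("/", 1 if path.startswith("/") else 0)
            if end == -1:
                output.append(path)
                path = ""
            else:
                output.append(path[:end])
                path = path[end:]
    return "".join(output)


def compute_relpath(path):
    """Collapse slashes, strip them at both ends, remove dot segments."""
    collapsed = re.sub(r"/+", "/", path)
    stripped = collapsed.strip("/")
    # The final strip is this sketch's assumption (see the note above).
    return remove_dot_segments(stripped).strip("/")
```

With this sketch, `compute_relpath("/foo//../bar/")` returns `bar`, matching the example in step 2, and `compute_relpath("/")` returns the empty string, which stands for `<ROOT>` itself.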



@@ 358,7 358,7 @@ When producing or displaying contents on URI retrieval, the gwit client MAY make

One of gwit's goals is to make existing Web or Gemini static sites easy to publish in parallel as gwit sites. This may be as simple as distributing site files in a Git repository, along with `_gwit/self.key` and `_gwit/self.ini` files, and using the key in `_gwit/self.key` to sign commits.

-For a more seamless integration, it should be possible to use the other protocols supported by such a **combined site** to both identify it as such and get the information needed to then access it over gwit. This information may be found in the files in the `_gwit` directory. However, since this is always found in the Git repository's top directory, if the site is configured in the other protocol to use some subdirectory `<SITE-ROOT>` as a root, those files may not be available via the other protocol's URIs.
+For a more seamless integration, it should be possible to use the other protocols supported by such a **combined site** to both identify it as such and get the information needed to then access it over gwit. This information may be found in the files in the `_gwit` directory. However, since this is always found in the Git repository's top directory, if the site is configured in the other protocol to use some subdirectory `<SITE-ROOT>` as a root, then those files may not be available via the other protocol's URIs.

A Well-Known URI ([RFC8615][]) MAY be used to provide such site metadata, accessible via the other protocol's `/.well-known/gwit.ini` URI path, mapping to the repository file `<SITE-ROOT>/.well-known/gwit.ini`. The format and features of this file are those of a site introduction file (see further above), where the site introduces itself. The file MUST contain exactly one `[site "<ID>"]` subsection. As with any introduction, the only truly relevant pieces of information are the site ID and the value(s) of `site.<ID>.remote` (e.g. `git config -f … --get-regexp '^site\.0x[0-9a-f]+\.remote$'`).
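
A client that has fetched `/.well-known/gwit.ini` and run the `git config --get-regexp` command mentioned above might process its output as in this sketch. It assumes the usual `--get-regexp` output of one `key value` pair per line. The function names and the example identifier in the usage note are hypothetical. Since only `remote` values are listed by that command, a `[site "<ID>"]` subsection without any remote is invisible here, so the "exactly one subsection" check is only approximate.

```python
import re

# Same key pattern as in the command above, followed by the value.
LINE_RE = re.compile(r"^site\.(0x[0-9a-f]+)\.remote (.+)$")


def parse_remotes(output):
    """Map each site ID to its remotes, in order of appearance."""
    remotes = {}
    for line in output.splitlines():
        match = LINE_RE.match(line)
        if match:
            remotes.setdefault(match.group(1), []).append(match.group(2))
    return remotes


def self_introduction(output):
    """Return (site_id, remotes) if exactly one site is introduced."""
    remotes = parse_remotes(output)
    if len(remotes) != 1:
        raise ValueError("expected exactly one site subsection with remotes")
    (site_id, urls), = remotes.items()
    return site_id, urls
```

For example, given the single output line `site.0xabc123.remote https://lab.example.org/s.one/gwit-site.git`, `self_introduction` returns the identifier `0xabc123` together with a one-element list holding that URL.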