~ivilata/gwit-spec

05c96d6c8d249212cd16480d274e91a622512404 — Ivan Vilata-i-Balaguer 3 months ago 252edb8
Clients retrieving gwit URIs must parse site configuration files.

Otherwise, a URI in a site with a configured root (different than the top
directory) may end up resolving to different repository paths depending on
whether the client supports site configuration files or not, with no errors.

A client without config file support may still work fine with a site that has
no such file. But as soon as the site added one, even one with no root value,
the client could not be certain of the site's root, so URI retrieval would
then have to fail.

Requiring support for the config files for URI retrieval looks like the most
consistent and least surprising way to go.
1 files changed, 2 insertions(+), 2 deletions(-)

M README.md
M README.md => README.md +2 -2
@@ 320,7 320,7 @@ Alice eventually sets the petname "Carol's blog" for Carol's site, thus the latt

## URI retrieval

-Let `gwit://[<VERSION>@]<SITE><PATH>` be the URI which identifies a particular file or directory in a gwit site. A gwit client that is to retrieve that resource MUST first obtain the site identifier `<SITE-ID>` by removing `0x` or `0X` from the beginning of `<SITE>`. There MUST be a Git clone of the site with that identifier in persistent client storage; to that end, the client MUST follow the procedures for initial site retrieval and updates described further above; Git operations described below will operate on that clone.
+Let `gwit://[<VERSION>@]<SITE><PATH>` be the URI which identifies a particular file or directory in a gwit site. A gwit client that is to retrieve that resource MUST be able to parse site configuration files (see further above). It MUST first obtain the site identifier `<SITE-ID>` by removing `0x` or `0X` from the beginning of `<SITE>`. There MUST be a Git clone of the site with that identifier in persistent client storage; to that end, the client MUST follow the procedures for initial site retrieval and updates described further above; Git operations described below will operate on that clone.

The client MUST then establish which Git commit `<COMMIT>` to use, according to the `<VERSION>` in the URI (which has already been percent-decoded if necessary), by following the first of the steps below whose condition applies:



@@ 339,7 339,7 @@ Once the client has established the value of `<COMMIT>`, it MUST check that `<CO

The client MUST then resolve the path `<PATH>` in the URI (which has already been percent-decoded if necessary) to a file or directory in the Git tree associated with the commit `<COMMIT>`, by following the steps below, so as to produce some output:

-1. If the client supports site configuration files (see further above), try to parse `_gwit/self.ini` in the desired commit `<COMMIT>` (e.g. `git cat-file blob <COMMIT>:_gwit/self.ini | git config -f- …`); if that fails, treat site configuration as empty for the next steps.
+1. Try to parse `_gwit/self.ini` in the desired commit `<COMMIT>` (e.g. `git cat-file blob <COMMIT>:_gwit/self.ini | git config -f- …`); if that fails, treat site configuration as empty for the next steps.
2. Compute `<RELPATH>` by replacing repetitions of the forward slash (`/`) in `<PATH>` by a single slash, then removing leading and trailing slashes, then removing dot segments according to the `remove_dot_segments` algorithm described in Section 5.2.4 of RFC3986 (e.g. `/foo//../bar/` becomes `bar`).

   The resulting `<RELPATH>` is relative to the site's root directory `<ROOT>` (as per site configuration) and either empty (meaning `<ROOT>` itself), or it consists of one or more non-empty path components separated by a single slash (for other files or directories).
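
As an illustration only, the `<RELPATH>` computation in step 2 could be sketched in Python as below. The function name `compute_relpath` is hypothetical and not part of the specification. The sketch also assumes that, once slashes are collapsed and trimmed, walking the path components with a stack produces the same components as `remove_dot_segments` from RFC3986, minus any leading or trailing slash the RFC algorithm might leave behind; that assumption is chosen so the result matches the `/foo//../bar/` example above.

```python
import re


def compute_relpath(path: str) -> str:
    """Sketch of step 2: turn an already percent-decoded <PATH> into <RELPATH>."""
    # Replace repetitions of the forward slash by a single slash.
    collapsed = re.sub(r"/+", "/", path)
    # Remove leading and trailing slashes.
    trimmed = collapsed.strip("/")
    # Assumption: a component stack stands in for remove_dot_segments here.
    components = []
    for component in trimmed.split("/"):
        if component in ("", "."):
            # Empty input or a "." segment contributes nothing.
            continue
        if component == "..":
            # ".." drops the previous component, if there is one.
            if components:
                components.pop()
            continue
        components.append(component)
    # Empty string means <ROOT> itself; otherwise components joined by "/".
    return "/".join(components)


print(compute_relpath("/foo//../bar/"))  # bar
```

An empty result stands for `<ROOT>` itself, and `..` segments that would climb above `<ROOT>` are simply dropped in this sketch, so the result never escapes the site's root directory.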