~ivilata/gwit-spec

d01d9070df7463f678081cfadd0bad0d88b5e7fb — Ivan Vilata-i-Balaguer 5 months ago b0e52a0
Also take into account Git SHA-256 hashes.

Git is transitioning to these more secure hashes (see
<https://git-scm.com/docs/hash-function-transition>).

There is an attack scenario where a string of 40 hex digits is used as the
`<VERSION>` field in a gwit URI for a repo using SHA-256. Since the string
looks like a full SHA-1 hash, a client may take it as a commit name instead
of a prefix (shortened hash), and proceed to signature verification with
that string without checking for malicious branches or tags.

Fortunately, if the implementation follows the recommendations in "Security
considerations", such a branch or tag should have been removed anyway after
fetching new Git objects, thus foiling the attack.
1 files changed, 2 insertions(+), 2 deletions(-)

M README.md => README.md +2 -2
@@ 229,7 229,7 @@ Although site history rewrites (and subsequent cleanups) should be accepted in t

- Depending on the implementation of Git, some operations expecting a commit or object name (hash) may instead act upon a tag or branch with the same name. This behavior may allow certain attacks, e.g. the site author may craft signed tags to avoid history rewrite detection in a client when retrieving site updates, or to trick a client into importing a `_gwit/self.key` file in a commit different from the head of the site branch on initial site retrieval; other attackers may insert unsigned tags or branches in their public clones that cause errors in clients using them as remotes.

-  As a way to fend off these attacks, clients SHOULD warn about and remove Git tags and branches with names matching the format of a SHA-1 hash (40 hexadecimal digits, lower or upper case) right after cloning a repository (initial retrieval step 1) or fetching new objects (site update step 2), as those tags and branches are certainly malicious.
+  As a way to fend off these attacks, clients SHOULD warn about and remove Git tags and branches with names matching the format of a SHA-1 or SHA-256 hash (40 or 64 hexadecimal digits, lower or upper case) right after cloning a repository (initial retrieval step 1) or fetching new objects (site update step 2), as those tags and branches are certainly malicious.
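
A minimal sketch of the name check described above, not part of the spec itself. The function name `hash_like_refs` is hypothetical, and it assumes the client has already listed full reference names (for instance with something like `git for-each-ref --format='%(refname)' refs/tags refs/heads`); warning about and deleting the reported references is left to the caller.

```python
import re

# Exactly 40 (SHA-1) or 64 (SHA-256) hexadecimal digits, lower or upper case.
HASH_LIKE = re.compile(r"(?:[0-9a-fA-F]{40}|[0-9a-fA-F]{64})")


def hash_like_refs(ref_names):
    """Return the tags and branches whose short name looks like a full hash.

    `ref_names` is an iterable of full reference names such as
    "refs/tags/v1.0" or "refs/heads/main" (an assumption of this sketch).
    """
    suspicious = []
    for ref in ref_names:
        for prefix in ("refs/tags/", "refs/heads/"):
            if ref.startswith(prefix) and HASH_LIKE.fullmatch(ref[len(prefix):]):
                suspicious.append(ref)
    return suspicious
```

A client following the recommendation would run such a check right after cloning or fetching, and remove whatever it returns before doing anything else with the repository.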

## URI format



@@ 323,7 323,7 @@ Let `gwit://[<VERSION>@]<SITE><PATH>` be the URI which identifies a particular f
The client MUST then establish which Git commit `<COMMIT>` to use, according to the `<VERSION>` in the URI (which has already been percent-decoded if necessary), by following the first of the steps below whose condition applies:

1. If `<VERSION>` is missing or empty, get the commit hash of the head of the site branch `<SITE-BRANCH>` (derived from `<SITE-ID>` as described further above) as `<COMMIT>` (e.g. `git show-ref --verify --hash refs/heads/<SITE-BRANCH>`).
-2. Else, if `<VERSION>` matches the format of a SHA-1 hash (40 hexadecimal digits, lower or upper case), use it as `<COMMIT>`. This is the case for a permanent link.
+2. Else, if `<VERSION>` matches the format of a SHA-1 or SHA-256 hash (40 or 64 hexadecimal digits, lower or upper case), use it as `<COMMIT>`. This is the case for a permanent link.
3. Else, if `<VERSION>` consists only of hexadecimal digits (lower or upper case), check that there is neither tag nor branch with that name (e.g. `git show-ref --tags --heads <VERSION>` reports nothing), then check that it is the prefix of a single commit object, and use its complete name as `<COMMIT>` (e.g. `git rev-parse --verify <VERSION>^{commit}` only reports the `<COMMIT>`).

   **Note:** The check for tags or branches named after `<VERSION>` prevents an attacker from using such a named reference in their public clone to confuse other gwit clients which try to access a URI with a certain abbreviated commit name version, and tricking them into accessing a different commit. Should the check fail, the client SHOULD report the situation as a potential attack (e.g. to help neutralize the problematic references or remotes). As a side effect of this check, tag and branch names which are to be used in gwit URIs MUST NOT consist only of hexadecimal digits (lower or upper case).
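
The first three steps above decide by the format of `<VERSION>` alone, which can be sketched as follows. This is an illustration only: `classify_version` and its return labels are hypothetical names, the Git lookups each step requires (`git show-ref`, `git rev-parse`) are not performed, and any steps beyond those shown in this hunk fall under `"other"`.

```python
import re

# Step 2: exactly 40 (SHA-1) or 64 (SHA-256) hexadecimal digits, either case.
FULL_HASH = re.compile(r"(?:[0-9a-fA-F]{40}|[0-9a-fA-F]{64})")
# Step 3: any non-empty run of hexadecimal digits, either case.
HEX_ONLY = re.compile(r"[0-9a-fA-F]+")


def classify_version(version):
    """Return which of the steps shown applies to an already percent-decoded version."""
    if not version:
        return "site-branch-head"  # step 1: missing or empty
    if FULL_HASH.fullmatch(version):
        return "full-hash"  # step 2: used directly as the commit
    if HEX_ONLY.fullmatch(version):
        return "hash-prefix"  # step 3: needs the tag/branch and prefix checks
    return "other"  # not covered by the steps shown here
```

Note that a 40-digit string is classified as a full hash by format alone, even in a SHA-256 repository where it can only be a prefix. That is the scenario the commit message describes, and the reason the cleanup of hash-like tags and branches after fetching matters.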