Time Machine, a simple version control system.
Note: WIP, expect major breakage.
POSIX, sha512sum(1).
TODO: do we want to do compression?
TODO: drop dependency on sha512sum(1)
TODO: rewrite performance-critical commands in C
Support for the following might be removed in the future:
AGPLv3.
(While tm is unlikely to come in contact with a network, there's no reason not to protect it from SaaS.)
Internals are similar to git, except where I thought I could get away with something simpler.
All text is UTF-8. All files are text files, and are newline-terminated.
Objects are identified by 512-bit SHA-512 hashes. A "pointer" is the hexadecimal representation of a hash, encoded in UTF-8.
.tm
| objects
| | [hex SHA-512 hash]
| | ...
| refs
| | index
| | HEAD
| | ...
There are three types of objects: blobs, commits, and trees.
The first line of the object is the type of the object. Unlike in git, the size of the object is not stored in the object, and the object type is terminated with a newline instead of a NUL.
The SHA-512 hash is of the contents of the object, including the object type.
An object with hash $HASH will be stored at .tm/objects/$HASH.
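As a rough sketch of the storage rule above, the following shell function writes an object and prints its pointer. The name `store_object` and its interface are hypothetical, not part of tm. It assumes sha512sum(1) and mktemp(1) are available and that `.tm/objects` already exists.

```sh
# Hypothetical helper, not part of tm: store stdin as an object of the
# type given in $1 under .tm/objects, then print the resulting pointer.
store_object() {
	objtype=$1
	tmp=$(mktemp) || return 1
	# The first line of the object is its type; the payload follows.
	{ printf '%s\n' "$objtype"; cat; } > "$tmp"
	# The hash covers the whole object, including the type line.
	hash=$(sha512sum < "$tmp" | cut -d ' ' -f 1)
	mv "$tmp" ".tm/objects/$hash"
	printf '%s\n' "$hash"
}
```

For example, `printf 'hello\n' | store_object blob` would create a blob and print its 128-character pointer.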
A blob is just a flat array of bytes. tm doesn't care about its contents.
blob
The remainder of this object can be anything at all, though we can't
put an invalid UTF-8 character in for demonstration because it messes up
sourcehut.
A commit is a tagged tree. More specifically, a commit encapsulates the following information:
The format of a commit is:
commit
tree deadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef
parent cafebabecafebabecafebabecafebabecafebabecafebabecafebabecafebabecafebabecafebabecafebabecafebabecafebabecafebabecafebabecafebabe
parent deafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbead
author J. Random Hacker <jrh@example.org>
committer K. Random Hacker <krh@example.org>
date SECS
This line will become the subject of the patch
These lines will become the body of the patch.
This is another line. It serves no purpose except demonstrating that the
body can have multiple lines.
This commit tags the tree
deadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef
and has two parents:
cafebabecafebabecafebabecafebabecafebabecafebabecafebabecafebabecafebabecafebabecafebabecafebabecafebabecafebabecafebabecafebabe
and
deafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbeaddeafbead.
Because the full commit hash is extremely long, it is permitted to use any unique prefix in commands.
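One way prefix lookup could work is sketched below. The function name `resolve_prefix` is hypothetical, and the sketch makes two assumptions: it matches against every object in `.tm/objects`, not just commits, and it treats a prefix with no match or several matches as an error.

```sh
# Hypothetical helper, not part of tm: expand a unique hash prefix to the
# full pointer. Prints the pointer if exactly one object matches; returns 1
# if the prefix matches no object or more than one.
resolve_prefix() {
	prefix=$1
	# Let the shell expand the glob into the positional parameters.
	set -- .tm/objects/"$prefix"*
	# An unmatched glob is left as the literal pattern, so also check
	# that the single result really exists.
	if [ "$#" -ne 1 ] || [ ! -e "$1" ]; then
		return 1
	fi
	# Strip the directory part to leave only the hash.
	printf '%s\n' "${1##*/}"
}
```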
Commits must have one tree, any number of parents, any number of authors, one committer, one date, one subject line, and any number of body lines. These lines MUST occur in the order specified here.
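Because the field order is fixed, a reader can stop scanning at the date line without worrying that the subject or body happens to look like a header. A minimal sketch, with a hypothetical function name, that prints the parents of a commit:

```sh
# Hypothetical helper, not part of tm: print the parent pointers of the
# commit object $1, one per line.
commit_parents() {
	# Parents always come before the date line, so quit there; anything
	# after it is subject or body text and must not be parsed as a header.
	sed -n -e '/^date /q' -e 's/^parent //p' ".tm/objects/$1"
}
```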
SECS is the number of seconds since 1970-01-01 00:00 UTC at which the commit occurred.
The subject line MUST be shorter than 72 characters, and SHOULD be shorter than 50 characters. The body SHOULD be hard-wrapped at 72 characters, except for lines which contain logs or other data which must be copied verbatim.
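The subject length rule could be checked along these lines. The function name `check_subject` and the message wording are hypothetical. Note that `${#1}` counts bytes rather than characters in some shells, so this sketch is only exact for ASCII subjects.

```sh
# Hypothetical helper, not part of tm: reject subjects that break the MUST
# limit and warn about those that only break the SHOULD limit.
check_subject() {
	# Caveat: some shells count bytes here, not characters.
	len=${#1}
	if [ "$len" -ge 72 ]; then
		echo "error: subject is $len characters, must be less than 72" >&2
		return 1
	elif [ "$len" -ge 50 ]; then
		echo "warning: subject is $len characters, should be less than 50" >&2
	fi
	return 0
}
```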
A tree represents a directory. It contains a list of blobs or trees, each of which is associated with a name and a set of permissions.
The format of a tree is:
tree
775 deadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeef docs
664 42424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242 README.md
Each line is of the form $mode $hash $filename, where $mode is the octal mode of the file, $hash points to the contents of the file (either a tree or a blob), and $filename is the name of the file.
Note that we can't tell from this whether docs and README.md are trees or blobs. We can get that information from the objects referenced. (Though in this case, it's obvious that docs is a tree and README.md is a blob.)
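To illustrate how the type can be recovered from the referenced objects, here is a sketch that lists a tree along with the type of each entry. The function name `list_tree` and its output format are hypothetical.

```sh
# Hypothetical helper, not part of tm: list the entries of the tree object
# $1, looking up each entry's type in the object it points to.
list_tree() {
	# Skip the "tree" type line, then read "$mode $hash $filename" entries.
	# read assigns the rest of the line to the last variable, so file
	# names containing spaces are kept whole.
	tail -n +2 ".tm/objects/$1" | while read -r mode hash name; do
		# The first line of any object is its type.
		objtype=$(head -n 1 ".tm/objects/$hash")
		printf '%s %s %s\n' "$mode" "$objtype" "$name"
	done
}
```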
Refs are pointers to commits. Refs may be used anywhere an object is required, and are equivalent to specifying $(cat .tm/refs/$REF).
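A command that accepts either a ref or a pointer might resolve its argument roughly as follows. The function name `resolve_name` is hypothetical, and checking refs before treating the argument as a hash is an assumption of this sketch, not documented tm behaviour.

```sh
# Hypothetical helper, not part of tm: turn a ref name or a pointer into a
# pointer.
resolve_name() {
	if [ -f ".tm/refs/$1" ]; then
		# A ref is a file holding the pointer of a commit.
		cat ".tm/refs/$1"
	else
		# Assumption: anything that is not a ref is already a pointer.
		printf '%s\n' "$1"
	fi
}
```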
The HEAD ref will always exist, and points to the commit currently checked out. New commits are parented to HEAD.
The index ref will always exist, and points to the commit which is currently being built on top of HEAD.
Avoid "best practices". If you think some way of doing something is better, make everything else illegal.
Similarly, avoid configuration. tm should do the right thing, and only the right thing.
tm has a small number of fundamental abstractions -- currently only objects and patches. There are a few non-trivial operations each one supports. Make sure you understand these, and try your hardest to frame new features in terms of these operations.
Avoid doing anything complicated. If you have to do something complicated, make sure to get the most out of it. This goes doubly for the plumbing.
Like in git, each remote gets a set of branches under $remotename/ dedicated to it.
Considerations
For the initial clone, the second case happens immediately and the client just gets the entire contents of the repo.