Arbor is a decentralizable, tree-based real-time collaboration platform. It consists of a set of data structures, protocols, and applications that together provide security, flexibility, and integrity. This document explores each component of Arbor and explains how they are combined to create the system that exists today.
Arbor stores conversation as "nodes" organized into many "trees."
Trees are rooted at either an "Identity" or "Community" node. Identity nodes represent a single user account. They combine a user's name in Arbor with a public key. Community nodes serve as the root nodes for trees of conversation relevant to a group of users. A community node primarily contains a name for the community.
The final node type is a "Reply" node. Reply nodes are messages to other users, and they are child nodes of either other Reply nodes or Community nodes.
Here's a comparison between conventional linear chat (left) and Arbor (right). Differing from the simpler diagram on the overview page, this one includes references to the reply's author node and is more technically accurate.
Every node is referenced by a "node ID" that is a cryptographic hash of that node's content. Every node has a node ID for its own parent and author embedded inside of itself. Additionally, every node is signed by the public key embedded within its author Identity node.
Since every node is referenced by a hash of its contents and every node is signed by its author, we gain the following properties:
Given a node from an untrusted source:
author
node referenced in the new node:
author
node referenced in the new node:
This means that we can safely acquire Arbor nodes from any source and validate that they are either legitimate (if sent by trusted authors) or internally-consistent elaborate scams.
The fact that Arbor nodes can be freely exchanged over any method of communication makes the ecosystem as a whole very flexible. During development, we have exchanged nodes via email, DropBox, Syncthing, http, and custom protocols. The great thing is that none of these methods are mutually exclusive with the others. This gives us the capability to build a robust system that can survive the failure of core components.
There is another doc for the Forest specification, and we have a Go library for creating and manipulating nodes in the Forest.
We store Forest nodes on disk in a file-system hierarchy called a "Grove". Think of it like a git repository, but for conversations. The specification for the layout of a Grove is in flux, so we don't have a formal document for it right now.
Groves have to follow two simple rules in order to be valid:
If this principle is obeyed, multiple programs can concurrently access the grove and can trust that insertions into the grove are valid.
An existing implementation of the logic for managing a grove can be found in ~whereswaldon/forest-go.
Sprout is a simple connection-oriented protocol for exchanging updates with the Forest. It uses a request-response architecture, but (unlike HTTP) you can make multiple requests on the same connection without waiting for responses. This allows sending many requests for different parts of the Forest and then waiting until the requested data becomes available.
At a high level, the different sprout requests are:
For reference, there is a specification for Sprout as well as a Go implementation.
An Arbor "relay" is analogous to a server in traditional client-server architectures, but our relays are designed to be used more widely than as solely server-side infrastructure. Relays speak Sprout, and can both listen on a local address and connect to a set of peer relays. This means that they can both act as a server and connect to a group of other servers in a mesh topology.
We do not currently have any fancy algorithm for building optimal P2P meshes, but one might be introduced later. For now, we operate mostly in a semicentralized star topology where each Arbor user runs a local relay connected to a central relay on sprout://arbor.chat:7117
. Over time we hope to expand the central infrastructure so that we can all connect to a set of peer relays that run in a geographically distributed fashion to ensure that they're unlikely to all go down at the same time. We also hope to build mechanisms for relays subscribed to similar communities to dynamically find and connect directly to one another.
Here is the current Go relay implementation.
There are many possible ways to visualize and interact with a tree-based chat application. Arbor clients are the programs that read Forest nodes (typically from a local grove) and render their relationships so that a human can understand them.
We hope to explore many of the possibilities, and we encourage anyone with an idea for a new way to visualize the tree to try it out or describe it to us!
Our current terminal client for Arbor is called wisteria
and is written in Go.