~mht/cmr

ref: f7a9c8fc9186b939e409c5fe6734244d4fc78241 cmr/README.md -rw-r--r-- 3.8 KiB View raw
f7a9c8fc — Martin Hafskjold Thoresen Remove wrong data from lngrad 80 raw 1 year, 6 months ago

Memory Reclamation for Rust

CMR is a memory management system for concurrent applications. The system is described in Martin Hafskjold Thoresens masters thesis. This file offers a quick outline of the reasoning behind CMR as well as a rough sketch on how it works.

Setup

Install Rust here.

Clone the repo, enter it, and run cargo build, and you're done.

$ git clone git@github.com:ist-daslab/rust-drop-box cmr
$ cd cmr
$ cargo build

Why

This project implements memory management for concurrent systems in Rust. This is not in of itself novel work; Crossbeam, a popular umbrella project for concurrency, has crossbeam-epoch, which uses the popular EBR scheme for memory reclamation. There are also implementations of Hazard Pointers. CMR is ment to be a concept of an alternative to these schemes.

Main issues with EBR, at least the way it is implemented in Crossbeam, is that the programmer has to explicitly retire memory, and that operations using shared memory often include a fair amount of unsafe code. In CMR, programmers do not have to retire (and currently, cannot retire) memory. In addition, users of CMR rarely have to write unsafe code (with occational exceptions with initialization); see data-structures for examples.

How

CMR acts as a tracing garbage collector. Whenever a thread wants to read a shared pointer, the thread registers the address in which that pointer is to be stored (which is usually on the stack). This way CMR has access to all roots in the program at any time. This also allows us to guarantee that pointers read are either null or valid. When it is time to reclaim memory, which all thread attempt to do every once in a while, we fork() off a new process, and find all memory allocations that are not reachable from any root in the program. This is possible to do safely, as all types stored in a shared pointer must implement the Trace trait, which writes out other pointers to a buffer.

An an example, we look at MsQueue::push:

pub fn push(&self, t: T) {
    guards!(_new_node, _tail, _next);
    let new_node = cmr::alloc(_new_node, Node::new(t));
    loop {
        let tail = cmr::guard(_tail, &self.tail).ptr().unwrap();
        let next_ptr = &tail.next;
        let ptr = cmr::guard(_next, next_ptr);
        if ptr::addr(ptr) != 0 {
            let _ = self.tail.cas(tail, ptr, SeqCst);
        } else if next_ptr.cas(ptr::null(), new_node, SeqCst).is_ok() {
            let _ = self.tail.cas(tail, new_node, SeqCst);
            break;
        }
    }
}

guards! declare that we need three memory locations in which to store shared pointers. We allocate a new node with cmr::alloc. When loading the tail, we know that it will not be null, since the MsQueue is non-empty, so we may convert the NullablePtr to a Ptr. If the next ptr is not null (its address is not 0), we cas the tail of the queue to our new suspected tail. If not, we cas the tail to the newly allocated node. Note that despite reading shared memory, this function does not contain any unsafe code.

Current Shortcommings

We do not run destructors when memory is freed. This turned out to be difficult, as we needed to decide at runtime whether the type dropped implenented Drop or not. This might be possible with some trick.

Most operations that CMR exposes uses atomics; these always use the SeqCst ordering. This is mainly done for simplicity, although correctness of the system as a whole may suffer if the user is allowed to specify more relaxed orderings. Experimental benchmarks show that the performance of CMR does not suffer from this, at least not on x86.