Example repo demonstrating automated white-box tests



Example project showing a simple test harness for small projects in C that uses white-box testing. Based on the basic test harness for C at https://git.sr.ht/~akkartik/basic-test

What is white-box testing?

Conventional automated tests tend to make assertions on the results returned by programs in different situations.

White-box tests instead make assertions on a global trace emitted by programs. The trace is a global append-only log of domain-specific events encountered and facts deduced by the program.
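As a rough illustration of the idea (not this repository's actual implementation), a trace can be as little as a global vector that the program appends to and that tests inspect afterwards. The names `TraceLine`, `append_to_trace` and `transform` are made up for this sketch, and the arithmetic inside `transform` is an assumption.

```cpp
#include <string>
#include <vector>

// One recorded event or deduced fact.
struct TraceLine {
  int depth;             // 0 = most important; larger = more detail
  std::string label;     // namespace, e.g. the name of a sub-system
  std::string contents;  // the domain-specific event or fact
};

// The global log. The program only ever appends to it; tests read it later.
std::vector<TraceLine> Trace;

void append_to_trace(int depth, const std::string& label,
                     const std::string& contents) {
  Trace.push_back({depth, label, contents});
}

// Hypothetical program under test: it records what it is doing as it goes.
int transform(int n) {
  append_to_trace(0, "app", "transforming " + std::to_string(n));
  int incremented = n + 1;
  append_to_trace(1, "app", std::to_string(n) + " + 1 is " +
                                std::to_string(incremented));
  int result = incremented * 2;
  append_to_trace(0, "app", std::to_string(n) + " transformed to " +
                                std::to_string(result));
  return result;
}
```

A test would call `transform(3)` and then make assertions on the contents of `Trace` rather than only on the return value.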

Traces provide 4 major benefits:

  1. Traces allow us to write tests for more diverse situations:

    1. Performance. Pick a robust, deterministic domain-specific metric, track it in the trace, make assertions on its values in tests.
    2. Concurrency. Track reads and writes to shared values in the trace, make assertions on their ordering and behavior.
    3. Graphics. Pick a robust, deterministic metric like triangles or squares rendered, track it in the trace as you perform painting operations.
    4. ...and so on.

    In each case, you could write such tests by managing the book-keeping in special variables and checking their values in different situations. In practice, every such dimension we want to add checks for adds new complexity. It can also slow down our programs if we aren't careful to disable book-keeping in production, which adds further complexity. For these reasons, we rarely write such tests. Traces reduce such overheads by providing a single data structure that can perform all the book-keeping we need, in the context of which we can evolve a shared vocabulary of ways to slice and dice the trace to support diverse use cases.

  2. Traces allow us to write tests oblivious to implementation details.

    Since conventional tests don't have access to internal details, we tend to write each test at the finest granularity possible, testing sub-systems independently. Radical reorganizations of a system (say replacing two sub-systems with three others that share work very differently) then require rewriting lots of tests. As a result, while tests encourage small-scale refactoring, they tend to hinder radical reorganizations that require changing sub-system interfaces.

    With traces, tests can check internal details at will. This allows us to write all our tests in a similar form, running the entire system regardless of the sub-system of interest. Labels help segment and namespace the trace so that the assertions in tests can focus on specific sub-systems even while running the entire system. This also helps convey the rationale for sub-system design; readers can see how the results are used, and why specific values matter in specific situations (how they're consumed downstream).

  3. Traces are great for debugging. It is common wisdom, any time a bug is encountered, to write a test demonstrating the bug before fixing it. However, in practice we often skip this step. It's tempting to just make the one-line fix and move on, even if that gives future readers no access to the manual process by which we exercised the bug and verified the fix. Traces can support the process of tracking down the bug, and so reduce the overhead in turning it into a test.

    Debugging using traces is the culmination of a parallel evolutionary track to traditional interactive debuggers: debugging by adding 'print' statements to your program. Interactively modifying a program to print out intermediate steps has several benefits over interactive debuggers:

    1. Free time-travel debugging. A mythical feature rarely encountered by most becomes trivial in the presence of 'print' statements.
    2. Applicability in far more situations. If you debug using debuggers you're restricted to languages that provide mature debuggers, something that takes person-years of effort. Even when debuggers are available, they're finicky things, apt to perturb the program they aim to observe, particularly in the presence of low-level bugs. On the other hand, 'print' statements can be added anywhere, and their level of perturbation is usually far lower than a debugger's.

    The drawback of 'print' statements, of course, is that they don't scale. It's easy to get swamped in too much detail, unless you add a little bit of structure and dump all your prints into some sort of repository where they can be sliced and diced in sophisticated ways, as done for the traces in this repository. Each line of a trace is tagged with a label and a depth, and these permit a sophisticated trace browser that hides 'deep' details until one chooses to drill down into them (see the browse_trace/ directory).

    (One situation where interactivity is peerless: diagnosing a misbehaving program while it's still running. Traces don't help here. They're usually turned off on production runs. Low-overhead production logging can be comparable to debuggers in complexity/effort.)

  4. Traces emitted by tests are a ready repository for newcomers to understand your system. They give you the option to answer questions by pointing people at a specific test (for example, look here for what happens when the text editor encounters a keypress). The trace browser mentioned above lets readers skim the sequence of operations your program performs in different situations, drilling down into details at will.
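To make the first benefit (performance tests) a little more concrete, here is a minimal sketch that uses the number of comparisons as a deterministic metric. Everything in it (`binary_find`, the "compare" label, `trace_count`) is hypothetical and written for this illustration; it isn't code from this repository.

```cpp
#include <string>
#include <vector>

struct TraceLine {
  std::string label;
  std::string contents;
};

std::vector<TraceLine> Trace;  // global append-only log

// Number of trace lines recorded under a given label.
int trace_count(const std::string& label) {
  int n = 0;
  for (const TraceLine& line : Trace)
    if (line.label == label) ++n;
  return n;
}

// Binary search that records every comparison it makes in the trace.
// Returns the index of target in sorted, or -1 if it isn't there.
int binary_find(const std::vector<int>& sorted, int target) {
  int lo = 0;
  int hi = static_cast<int>(sorted.size()) - 1;
  while (lo <= hi) {
    int mid = lo + (hi - lo) / 2;
    Trace.push_back({"compare", std::to_string(sorted[mid])});
    if (sorted[mid] == target) return mid;
    if (sorted[mid] < target)
      lo = mid + 1;
    else
      hi = mid - 1;
  }
  return -1;
}
```

A test can then assert that searching 1000 sorted elements records at most 10 "compare" lines. Unlike a wall-clock measurement, that number doesn't vary between runs or machines.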

These benefits are so compelling that the author is building a system that provides them at all levels of abstraction. The eventual goal is to be able to, for example, understand precisely what got written to disk in response to a mutation in a database, regardless of how many applications, languages and libraries were involved.

(More details on traces.)

Try it out

First build the project, using either (preferred):

$ ./build

or (more conventional):


Then try running tests.

$ ./a.out  # run some example tests
..  # emits one dot per passing test
$ ./a.out test_2  # run a single test
.  # single dot for single test

This repo is just an example. To use this approach in your own programs:

  1. Set up the basic test harness as described at https://git.sr.ht/~akkartik/basic-test

  2. Designate one function as the top-level that all your tests will call.

  3. Perform all assertions in tests using the CHECK_TRACE* family rather than raw CHECK* macros. This will require adding trace lines in your program using trace(...).
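Steps 2 and 3 might look something like the sketch below. It doesn't use the real harness or the CHECK_TRACE* macros. The two sub-systems, the top-level `run`, and the helper `lines_for` are all invented for illustration. The point is only that every test calls the same top-level function and then looks at the trace lines for the label it cares about.

```cpp
#include <sstream>
#include <string>
#include <vector>

struct TraceLine {
  std::string label;
  std::string contents;
};

std::vector<TraceLine> Trace;  // global append-only log

void trace_line(const std::string& label, const std::string& contents) {
  Trace.push_back({label, contents});
}

// Sub-system 1: split the input into whitespace-separated words.
std::vector<std::string> tokenize(const std::string& input) {
  std::vector<std::string> words;
  std::istringstream in(input);
  std::string word;
  while (in >> word) {
    trace_line("token", word);
    words.push_back(word);
  }
  return words;
}

// Sub-system 2: add up the words, treating each one as an integer.
int sum(const std::vector<std::string>& words) {
  int total = 0;
  for (const std::string& w : words) total += std::stoi(w);
  trace_line("sum", std::to_string(total));
  return total;
}

// The designated top-level function. Every test calls this one.
void run(const std::string& input) {
  Trace.clear();
  sum(tokenize(input));
}

// Contents of all trace lines recorded under one label, in order.
std::vector<std::string> lines_for(const std::string& label) {
  std::vector<std::string> result;
  for (const TraceLine& line : Trace)
    if (line.label == label) result.push_back(line.contents);
  return result;
}
```

A test of the tokenizer and a test of the summing step both call `run("1 2 39")`. They differ only in which label they inspect, so merging or splitting the two sub-systems later wouldn't change the shape of either test.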

Trace primitives

  • Add a line to the trace: trace(depth, label) << ... << end();

    Don't forget the end(); without it, the line will be forgotten.

    The label is a namespace. Other helpers below focus on the lines in a trace for a given label.

    The depth is an indicator of importance. It isn't used by tests, but it's helpful for debugging.

  • Check that the trace contains certain lines:

      "label1: abc\n"
      "label2: def\n"

    This will match a trace with intervening lines between these two. But both lines must be present, and the first line must be present before the second.

  • Check that the trace doesn't contain a given line:

    CHECK_TRACE_DOESNT_CONTAIN("label1: abc");

  • Check that the trace contains exactly n lines with a given label:

    CHECK_TRACE_COUNT("label", 3);
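For readers curious how primitives like the ones above could work, here is one possible sketch. It is an assumption-laden toy, not the code in this repository. `TraceStream`, `trace_contains_in_order` and `trace_count` are hypothetical names, and only the `trace(depth, label) << ... << end();` shape is taken from the description above. A line is built up in a scratch buffer and is only committed to the log when `end()` is streamed in, which is why forgetting `end()` loses the line.

```cpp
#include <sstream>
#include <string>
#include <vector>

struct TraceLine {
  int depth;
  std::string label;
  std::string contents;
};

struct TraceStream {
  std::vector<TraceLine> lines;  // committed lines (append-only)
  std::ostringstream curr;       // the line currently being built
  int curr_depth = 0;
  std::string curr_label;

  // Start a new line. Anything left uncommitted is dropped here.
  std::ostream& begin(int depth, const std::string& label) {
    curr.str("");
    curr.clear();
    curr_depth = depth;
    curr_label = label;
    return curr;
  }

  // Move the pending line into the log.
  void commit() {
    lines.push_back({curr_depth, curr_label, curr.str()});
    curr.str("");
  }
};

TraceStream Trace;

std::ostream& trace(int depth, const std::string& label) {
  return Trace.begin(depth, label);
}

// Marker type: streaming an end() commits the pending line.
struct end {};
std::ostream& operator<<(std::ostream& os, end) {
  Trace.commit();
  return os;
}

// True if each "label: contents" line of expected (separated by '\n')
// occurs in the trace in that order, possibly with other lines in between.
bool trace_contains_in_order(const std::string& expected) {
  std::istringstream in(expected);
  std::string want;
  bool have_want = static_cast<bool>(std::getline(in, want));
  for (const TraceLine& line : Trace.lines) {
    if (!have_want) break;
    if (line.label + ": " + line.contents == want)
      have_want = static_cast<bool>(std::getline(in, want));
  }
  return !have_want;
}

// Number of committed lines with the given label.
int trace_count(const std::string& label) {
  int n = 0;
  for (const TraceLine& line : Trace.lines)
    if (line.label == label) ++n;
  return n;
}
```

The depth is stored but never consulted by the checks, matching the note above that depth is for debugging rather than for tests.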

Using the trace browser

First create an example trace:

$ ./build
$ ./a.out test_2
dumping trace to file 'last_run'

Check out its contents:

$ cat last_run
0 app: transforming 3
1 app: 3 + 1 is 4
0 app: 3 transformed to 8

There are three lines, and each line in the trace starts with a depth and a label before the :. Two lines are at depth 0, while one is at depth 1.
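If you want to process such a file with your own tools, the line format just described is easy to pick apart. The following is a minimal sketch, assuming each line looks exactly like `<depth> <label>: <contents>`. The function `parse_trace_line` is hypothetical and not part of this repository, and the real file may use different padding.

```cpp
#include <sstream>
#include <string>

struct TraceLine {
  int depth = 0;
  std::string label;
  std::string contents;
};

// Parse "<depth> <label>: <contents>" into out.
// Returns false if the text doesn't have that shape.
bool parse_trace_line(const std::string& text, TraceLine& out) {
  std::istringstream in(text);
  if (!(in >> out.depth)) return false;  // must start with a number
  in >> std::ws;
  if (!std::getline(in, out.label, ':')) return false;
  if (in.eof()) return false;  // ran off the end without finding ':'
  in >> std::ws;
  out.contents.clear();
  std::getline(in, out.contents);  // the rest of the line, possibly empty
  return true;
}
```

For the second line of the example above, this yields depth 1, label `app` and contents `3 + 1 is 4`.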

Now let's open last_run using the trace browser:

$ ./browse_trace/browse_trace last_run

It'll take a few seconds to compile the first time around, before taking over the screen to display the following:

0 app: transforming 3 (2)
0 app: 3 transformed to 8

Only two lines are shown, the ones at depth 0. The first line is highlighted. It also ends in (2) which indicates that there's an extra line hidden below it.

Hit the <Enter> key. The screen now displays the complete file, including the line at depth 1.

Now hit the <Backspace> key (<delete> on Mac computers) to collapse the first line again, and hide the line at depth 1.
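The expand/collapse behavior in this walkthrough can be modeled in a few lines. This is a deliberately simplified guess at the idea, not the browser's actual algorithm: it only distinguishes depth 0 from everything deeper, and `visible_lines` and the `expanded` set are names invented for the sketch.

```cpp
#include <set>
#include <string>
#include <vector>

struct TraceLine {
  int depth;
  std::string text;
};

// Indices of the lines that should be on screen. Depth-0 lines are always
// shown. A deeper line is shown only while the nearest depth-0 line above
// it is in the expanded set.
std::vector<int> visible_lines(const std::vector<TraceLine>& trace,
                               const std::set<int>& expanded) {
  std::vector<int> result;
  int parent = -1;  // index of the most recent depth-0 line, if any
  for (int i = 0; i < static_cast<int>(trace.size()); ++i) {
    if (trace[i].depth == 0) {
      parent = i;
      result.push_back(i);
    } else if (expanded.count(parent) > 0) {
      result.push_back(i);
    }
  }
  return result;
}
```

In this model <Enter> on the first line corresponds to adding index 0 to `expanded`, and <Backspace> to removing it again.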

For a complete list of shortcuts available in the trace browser, consult browse_trace/Readme.md.