Example project showing a simple test harness for small projects in C that uses white-box testing. Based on the basic test harness for C at https://git.sr.ht/~akkartik/basic-test
Conventional automated tests tend to make assertions on the results returned by programs in different situations.
White-box tests instead make assertions on a global trace emitted by programs. The trace is a global append-only log of domain-specific events encountered and facts deduced by the program.
Traces provide 4 major benefits:
Traces allow us to write tests for more diverse situations:
In each case, you could write such tests by managing the book-keeping in special variables and checking their values in different situations. In practice, every such dimension we want to add checks for adds new complexity. It can also slow down our programs if we aren't careful to disable book-keeping in production, which adds further complexity. For these reasons, we rarely write such tests. Traces reduce such overheads by providing a single data structure that can perform all the book-keeping we need, in the context of which we can evolve a shared vocabulary of ways to slice and dice the trace to support diverse use cases.
Traces allow us to write tests oblivious to implementation details.
Since conventional tests don't have access to internal details, we tend to write each test at the finest granularity possible, testing sub-systems independently. As a result, radical reorganizations of a system (say replacing two sub-systems with three others, sharing work very differently) require rewriting lots of tests. So while tests encourage small-scale refactoring, they tend to hinder radical reorganizations that require changing sub-system interfaces.
With traces, tests can check internal details at will. This allows us to write all our tests in a similar form, running the entire system regardless of the sub-system of interest. Labels help segment and namespace the trace so that the assertions in tests can focus on specific sub-systems even while running the entire system. This also helps convey the rationale for sub-system design; readers can see how the results are used, and why specific values matter in specific situations (how they're consumed downstream).
Traces are great for debugging. It is common wisdom, any time a bug is encountered, to write a test demonstrating the bug before fixing it. However, in practice we often skip this step. It's tempting to just make the one-line fix and move on, even if that gives future readers no access to the manual process by which we exercised the bug and verified the fix. Traces can support the process of tracking down the bug, and so reduce the overhead in turning it into a test.
Debugging using traces is the culmination of a parallel evolutionary track to traditional interactive debuggers: debugging by adding 'print' statements to your program. Interactively modifying a program to print out intermediate steps has several benefits over interactive debuggers:
The drawback of 'print' statements, of course, is that they don't scale. It's easy to get swamped in too much detail unless you add a little bit of structure and dump all your prints into some sort of repository where they can be sliced and diced in sophisticated ways, as done for the traces in this repository. The label and depth that each line of a trace is tagged with permit a sophisticated trace browser that hides 'deep' details until one chooses to drill down into them (see the
(One situation where interactivity is peerless: diagnosing a misbehaving program while it's still running. Traces don't help here. They're usually turned off on production runs. Low-overhead production logging can be comparable to debuggers in complexity/effort.)
Traces emitted by tests are a ready repository for newcomers to understand your system. They give you the option to answer questions by pointing people at a specific test (for example, "look here for what happens when the text editor encounters a keypress"). The trace browser mentioned above lets readers skim the sequence of operations your program performs in different situations, drilling down into details at will.
These benefits are so compelling that the author is building a system that provides them at all levels of abstraction. The eventual goal is to be able to, for example, understand precisely what got written to disk in response to a mutation in a database, regardless of how many applications, languages and libraries were involved.
First build the project, using either (preferred):
or (more conventional):
Then try running tests.
    $ ./a.out         # run some example tests
    ..                # emits one dot per passing test
    $ ./a.out test_2  # run a single test
    .                 # single dot for single test
This repo is just an example. To use this approach in your own programs:
Set up the basic test harness as described at https://git.sr.ht/~akkartik/basic-test
Designate one function as the top-level that all your tests will call.
Perform all assertions in tests using the CHECK_TRACE* family rather than the CHECK* macros. This will require adding trace lines to your program.
Add a line to the trace:
trace(depth, label) << ... << end();
Don't forget the end(); without it, the line will be forgotten.
The label is a namespace. Other helpers below focus on the lines in a trace for a given label.
The depth is an indicator of importance. It isn't used by tests, but it's helpful for debugging.
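One way a streaming interface like this could work is sketched below. This is an illustration under assumptions, not the harness's actual implementation: the names `sketch`, `Line`, `Pending` and `g_trace` are hypothetical. It shows why a line without `end()` is lost: text is only buffered until the sentinel arrives and commits it to the log.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

namespace sketch {  // hypothetical namespace, to avoid clashing with real code

struct Line {
  int depth;
  std::string label;
  std::string contents;
};

// The global append-only log (hypothetical name).
std::vector<Line> g_trace;

// Sentinel type: streaming an end() commits the pending line.
struct end {};

// A trace line under construction. Text streamed into it is buffered and
// only reaches g_trace when an end() arrives.
class Pending {
 public:
  Pending(int depth, std::string label)
      : depth_(depth), label_(std::move(label)) {
    assert(depth >= 0);
  }

  // Buffer any streamable value.
  template <typename T>
  Pending& operator<<(const T& value) {
    buf_ << value;
    return *this;
  }

  // Non-template overload; chosen over the template when given an end().
  void operator<<(end) {
    g_trace.push_back(Line{depth_, label_, buf_.str()});
  }

 private:
  int depth_;
  std::string label_;
  std::ostringstream buf_;
};

Pending trace(int depth, const std::string& label) {
  return Pending(depth, label);
}

}  // namespace sketch
```

With this sketch, `sketch::trace(1, "app") << 3 << " + " << 1 << " is " << 4 << sketch::end();` appends one line to `g_trace`, while the same statement without the final `sketch::end()` appends nothing.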
Check that the trace contains certain lines:
    CHECK_TRACE_CONTENTS(
        "label1: abc\n"
        "label2: def\n"
    );
This will match a trace with intervening lines between these two. But both lines must be present, and the first line must be present before the second.
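The matching rule just described (all expected lines present, in order, with gaps allowed) amounts to a subsequence check. A minimal sketch, assuming the trace is available as a list of "label: contents" strings; `contains_in_order` is a hypothetical helper, not part of the harness:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if every expected line occurs in the trace,
// in the given order, possibly with other lines in between.
bool contains_in_order(const std::vector<std::string>& trace_lines,
                       const std::vector<std::string>& expected) {
  std::size_t next = 0;  // index of the next expected line still to be found
  for (const std::string& line : trace_lines) {
    if (next < expected.size() && line == expected[next]) ++next;
  }
  return next == expected.size();
}

// Usage example with made-up trace lines.
void example_contains_in_order() {
  const std::vector<std::string> trace_lines = {
      "label1: abc", "label3: xyz", "label2: def"};
  // An intervening line is allowed.
  assert(contains_in_order(trace_lines, {"label1: abc", "label2: def"}));
  // The order of the expected lines matters.
  assert(!contains_in_order(trace_lines, {"label2: def", "label1: abc"}));
  // Every expected line must be present.
  assert(!contains_in_order(trace_lines, {"label1: abc", "label2: zzz"}));
}
```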
Check that the trace doesn't contain a given line:
Check that the trace contains exactly n lines with a given label:
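A minimal sketch of the logic behind these last two checks, assuming the trace is held as a list of (label, contents) pairs. The names `Entry`, `trace_contains` and `count_label` are hypothetical and are not the harness's macros:

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Entry {
  std::string label;
  std::string contents;
};

// Hypothetical helper: is there a line with exactly this label and contents?
// A "doesn't contain" check is the negation of this.
bool trace_contains(const std::vector<Entry>& entries,
                    const std::string& label,
                    const std::string& contents) {
  for (const Entry& e : entries) {
    if (e.label == label && e.contents == contents) return true;
  }
  return false;
}

// Hypothetical helper: how many lines carry this label?
int count_label(const std::vector<Entry>& entries, const std::string& label) {
  assert(!label.empty());
  int n = 0;
  for (const Entry& e : entries) {
    if (e.label == label) ++n;
  }
  return n;
}
```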
First create an example trace:
    $ ./build
    $ ./a.out test_2
    dumping trace to file 'last_run'
    .
Check out its contents:
    $ cat last_run
    0 app: transforming 3
    1 app: 3 + 1 is 4
    0 app: 3 transformed to 8
There are three lines, and each line in the trace starts with a depth and a label before the ':'. Two lines are at depth 0, while one is at depth 1.
Now let's open last_run using the trace browser:
$ ./browse_trace/browse_trace last_run
It'll take a few seconds to compile the first time around, before taking over the screen to display the following:
    0 app: transforming 3 (2)
    0 app: 3 transformed to 8
Only two lines are shown, the ones at depth 0. The first line is highlighted.
It also ends in (2) which indicates that there's an extra line hidden below it. Hit the <Enter> key. The screen now displays the complete file, including the line at depth 1.
Now hit the <Backspace> key (<delete> on Mac computers) to collapse the first line again, and hide the line at depth 1.
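The expand and collapse behavior walked through above can be approximated with a visibility flag per trace line. This is a simplified sketch, not the browser's code: `TraceLine`, `collapsed_view`, `expand` and `collapse` are hypothetical names, and the hidden-line indicator such as (2) is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct TraceLine {
  int depth;
  std::string text;  // "label: contents"
};

// Initial view: only top-level (depth 0) lines are visible.
std::vector<bool> collapsed_view(const std::vector<TraceLine>& trace) {
  std::vector<bool> visible;
  for (const TraceLine& line : trace) visible.push_back(line.depth == 0);
  return visible;
}

// Expand line i: reveal the lines exactly one level deeper that follow it,
// stopping at the next line that is no deeper than line i.
void expand(const std::vector<TraceLine>& trace, std::vector<bool>& visible,
            std::size_t i) {
  assert(i < trace.size() && visible.size() == trace.size());
  for (std::size_t j = i + 1;
       j < trace.size() && trace[j].depth > trace[i].depth; ++j) {
    if (trace[j].depth == trace[i].depth + 1) visible[j] = true;
  }
}

// Collapse line i: hide every deeper line that follows it.
void collapse(const std::vector<TraceLine>& trace, std::vector<bool>& visible,
              std::size_t i) {
  assert(i < trace.size() && visible.size() == trace.size());
  for (std::size_t j = i + 1;
       j < trace.size() && trace[j].depth > trace[i].depth; ++j) {
    visible[j] = false;
  }
}
```

Applied to the three lines of 'last_run', `collapsed_view` marks the first and third lines visible, `expand` on the first line reveals the depth-1 line, and `collapse` hides it again.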
For a complete list of shortcuts available in the trace browser, consult