~luxferre/nntrac

No-nonsense TRAC T-64 reimplementation under 1000 SLOC of ANSI C

#nntrac: no-nonsense TRAC implementation in ANSI C

The nntrac language (all-lowercase) is a portable, lightweight (under 1000 SLOC of ANSI C) and embeddable derivative of the TRAC T-64 language originally designed by Calvin N. Mooers in the 1960s. Unlike the original, nntrac was written from scratch with modern systems in mind and adds several useful features for interacting with current operating environments.

#Building

Just invoke your C compiler like this (replacing cc with the specific command and updating the flags if necessary):

cc [-static] -std=c99 -Os -s nntrac.c -o nntrac [-DNNT_EMBED] [-DNNT_NO_SHELLEXT] [-DNNT_SHARP="#"] [-DNNT_SYMNAMELEN=32]

Here, cc stands for the C compiler of your choice, and the flags in square brackets are optional. Compilation was tested with GCC, Clang, Zig cc, TCC (dynamic linking only), Cproc and chibicc.

Supported compiler flags (all optional):

  • -DNNT_EMBED: build nntrac without the main function (entry point). See the "Embedding nntrac into your projects" section for more information on embedded usage.
  • -DNNT_NO_SHELLEXT: disable the os primitive. See the "New primitives" section for more information.
  • -DNNT_SHARP: change the # character (start of active or neutral function call) to something else.
  • -DNNT_SYMNAMELEN: maximum length of TRAC form or primitive function names (default 32).

#Usage

The nntrac binary can be run interactively (with nntrac or echo 'script' | nntrac) or in the script invocation mode (./nntrac script.trac param1 param2 ...). In the first case, the default "idling program" #(ps,#(rs)) is run, and the interpreter exits as soon as the input up to the first meta character (' by default) has been read and evaluated. In the second case, the script file is read and executed directly, with the command-line parameters made available through special forms (see below).

#Main differences from the T-64 standard

  • the ps (print string) primitive can accept multiple arguments (concatenating them on the output);
  • default call results can be neutral and don't differ from explicit call results in any way;
  • up to 255 segment ordinals are supported (from 1 to 255);
  • extended operation mode (#(mo,E)) is the default one;
  • arithmetic primitives operate on 63-bit signed integers (if the target architecture allows, otherwise it's 31-bit signed integers) and accept numbers in any base supported by C's strtoll function in base autodetection mode (i.e. decimal, octal and hexadecimal) but they don't support prepending any string prefixes to the output and always return the result in base-10;
  • in arithmetic primitives, the overflow argument is optional (if it is omitted, a null value is returned when an overflow occurs);
  • all bitwise primitive results are returned in base-10 as well and truncated to unsigned 32-bit values, and br bit rotations are also done on a 32-bit width;
  • form storage primitives (fb, sb, eb) directly accept a filename instead of a form name with the address (like in Nat Kuhn's Trac-in-Python);
  • the trace mode (tn, tf) is fully non-interactive and just prints every function call to stderr; calls to the tn, tf and hl primitives themselves are not traced;
  • other diagnostic functions (ln, pf) output to stdout;
  • 9 new primitives have been added (see below).
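
The number handling described in the list above can be illustrated with a small standalone sketch. It is not taken from the nntrac source: the demo_* names are made up for this example, and the rotation direction (left) is an assumption. It only shows what strtoll does in base autodetection mode, how a result is rendered back in base 10, and what a rotation on a 32-bit width looks like.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse an integer with strtoll in base autodetection mode (base 0):
   "42" is decimal, "052" is octal, "0x2A" is hexadecimal.
   Returns 0 on success, -1 if the text is not a valid in-range number. */
int demo_parse(const char *text, long long *out) {
  char *end;
  errno = 0;
  *out = strtoll(text, &end, 0);
  if (end == text || *end != '\0' || errno == ERANGE) return -1;
  return 0;
}

/* Render a value in base 10, whatever base the input used.
   Returns the number of characters snprintf reports. */
int demo_render(long long value, char *buf, size_t buflen) {
  return snprintf(buf, buflen, "%lld", value);
}

/* Rotate v left by n bits within a 32-bit width (direction is assumed). */
uint32_t demo_rotl32(uint32_t v, unsigned n) {
  n &= 31u; /* keep the shift count inside the 32-bit width */
  return n ? (uint32_t)((v << n) | (v >> (32u - n))) : v;
}
```

With these helpers, demo_parse accepts 42, 052 and 0x2A as the same value, and demo_render always prints it as 42.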

#Original T-64 primitives implementation status

Name  Args  Implemented?  UTF-8 safe?  Meaning
rs    1     yes           yes          Read string
rc    1     yes           no           Read char
cm    2     yes           no           Change meta
ps    var   yes           yes          Print string
ds    3     yes           yes          Define string
dd    var   yes           yes          Delete definition
da    1     yes           yes          Delete all
ss    var   yes           yes          Segment string
cl    var   yes           yes          Call string
cs    3     yes           yes          Call segment
cc    3     yes           no           Call character
cn    4     yes           no           Call N characters
cr    2     yes           yes          Call [pointer] restore
in    4     yes           yes          Initial
eq    5     yes           yes          String equality
gr    5     yes           yes          Greater than
ad    3, 4  yes           yes          Add
su    3, 4  yes           yes          Subtract
ml    3, 4  yes           yes          Multiply
dv    3, 4  yes           yes          Divide
bu    3     yes           yes          Bitwise union (OR)
bi    3     yes           yes          Bitwise intersection (AND)
bc    2     yes           yes          Bitwise complement (NOT)
br    3     yes           yes          Bitwise rotation
bs    3     yes           yes          Bitwise shift
sb    var   yes           yes          Store block (file): #(sb,fname,f1,f2...)
fb    2     yes           yes          Fetch block (file): #(fb,fname)
eb    2     yes           yes          Erase block (file): #(eb,fname)
ln    2     yes           yes          List names
pf    2     yes           yes          Print form
tn    1     yes           yes          Trace on
tf    1     yes           yes          Trace off
hl    1     yes           yes          Halt
mo    2, 3  yes           yes          Mode (see below)

#New primitives

  • bx: bitwise XOR. 3 arguments: #(bx,A,B). Returns the operation result. Invocation is the same as for the bi or bu primitives.
  • ac: ASCII code. 2 arguments: #(ac,S). Returns the numeric unsigned value of the first character in S.
  • av: ASCII value. 2 arguments: #(av,N). Returns the single byte corresponding to the ASCII code N (unsigned).
  • fn: format number. 3 arguments: #(fn,fmt,N). Returns the sprintf-formatted string representation of the number N according to the format fmt.
  • sf: store (raw) file. 3 arguments: #(sf,fname,form). Stores the raw value from form into a named file. The value is always written in its entirety (the form pointer is ignored) and segment gap bytes, if there are any, are written "as is". Returns a null value. The form doesn't get deleted from the internal form storage. The file is fully overwritten if it already exists.
  • ff: fetch (raw) file. 3 arguments: #(ff,fname,form). Reads a raw string from the named file into form. Returns a null value (check the target form afterwards).
  • tm: local/UTC/Epoch time. 2 or 3 arguments: #(tm,fmt[,U]). Returns the local (or UTC, if the third parameter U is specified) time formatted according to the strftime-compatible fmt string. If the format string is just E (Epoch), returns the number of seconds since 00:00:00 UTC, January 1, 1970.
  • rn: random number. 3 arguments: #(rn,n1,n2). Returns a (pseudo-)random integer number in the range n1 (included) to n2 (not included). Implemented using the double-pass xorshift64* algorithm.
  • os: run an OS command. 2 arguments: #(os,cmd). Runs a command in the external OS shell (the one determined by the system() C call) and returns the command exit code. The output is not captured.
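
To make the tm description above more concrete, here is a rough standalone sketch of the same idea in plain C. It is an illustration only, not the code nntrac uses: demo_time and its parameters are invented for this example, and it assumes time_t is an integer count of seconds, as on POSIX systems.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Format the time t according to fmt.
   - If fmt is exactly "E", write the seconds since the Unix epoch
     (assumes time_t counts seconds, which is true on POSIX systems).
   - Otherwise pass fmt to strftime, using UTC when use_utc is nonzero
     and local time when it is zero.
   Returns the number of characters written, or 0 on failure. */
size_t demo_time(char *buf, size_t buflen, const char *fmt,
                 time_t t, int use_utc) {
  struct tm *parts;
  if (strcmp(fmt, "E") == 0) {
    int n = snprintf(buf, buflen, "%lld", (long long)t);
    return (n > 0 && (size_t)n < buflen) ? (size_t)n : 0;
  }
  parts = use_utc ? gmtime(&t) : localtime(&t);
  if (parts == NULL) return 0;
  return strftime(buf, buflen, fmt, parts);
}
```

Passing time(NULL) as t gives the current time; a fixed t is handy for checking a format string.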

For self-contained nntrac environments that have no external shell (or where the shell is nntrac itself), you can disable the os primitive by building nntrac with the -DNNT_NO_SHELLEXT flag.
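
For the rn primitive listed above, the general idea of drawing a number from a half-open range can be sketched as follows. This uses the standard single-pass xorshift64* step; how nntrac's "double-pass" variant differs is not covered here, so treat this as an approximation. The demo_* names are hypothetical, and the sketch assumes n2 - n1 fits in a long long.

```c
#include <stdint.h>

/* One step of the standard xorshift64* generator.
   The state must be seeded with a nonzero value. */
uint64_t demo_xorshift64s(uint64_t *state) {
  uint64_t x = *state;
  x ^= x >> 12;
  x ^= x << 25;
  x ^= x >> 27;
  *state = x;
  return x * UINT64_C(0x2545F4914F6CDD1D);
}

/* Return a value in [lo, hi): lo included, hi not included.
   Requires lo < hi and hi - lo representable as long long.
   The modulo reduction has a slight bias, which this sketch ignores. */
long long demo_rand_range(uint64_t *state, long long lo, long long hi) {
  uint64_t span = (uint64_t)hi - (uint64_t)lo;
  return lo + (long long)(demo_xorshift64s(state) % span);
}
```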

#Modes

The nntrac processor can run in one of the three modes that can be switched with the mo primitive:

  • E (extended): all primitives are available, including the custom ones — this mode is the default one (unlike the original spec);
  • L (legacy): only the original 34 primitives from the T-64 standard are available, no (built-in) extensions are permitted;
  • S (secure): all primitives are available except those that can interact with the filesystem and the outside operating environment (sb, fb, eb, sf, ff, os).

To lock the mode set with #(mo,L) or #(mo,S) until the end of the program, use the third L parameter (#(mo,S,L) or #(mo,L,L)). This way, no code inside will be able to extend nntrac's privileges back to an unsafe level.
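
The locking rule described above behaves like a one-way switch. The sketch below models it in a few lines of C; it is a conceptual illustration, not nntrac's internal code. The demo_modes type and both functions are hypothetical, and treating a lock request on E as a no-op is an assumption, since only locking L or S is described.

```c
#include <string.h>

/* Hypothetical mode state; the names are illustrative only. */
typedef struct {
  char mode;  /* 'E', 'L' or 'S' */
  int locked; /* nonzero once a restricted mode has been locked */
} demo_modes;

/* Try to switch modes. Returns 1 if the switch happened, 0 if refused. */
int demo_set_mode(demo_modes *m, char mode, int lock) {
  if (m->locked) return 0; /* a locked mode stays until the program ends */
  if (mode != 'E' && mode != 'L' && mode != 'S') return 0;
  m->mode = mode;
  if (lock && mode != 'E') m->locked = 1; /* assumption: E is never locked */
  return 1;
}

/* Returns 0 for the primitives that secure mode excludes, 1 otherwise. */
int demo_allowed_in_secure(const char *name) {
  static const char *blocked[] = {"sb", "fb", "eb", "sf", "ff", "os"};
  size_t i;
  for (i = 0; i < sizeof blocked / sizeof blocked[0]; i++)
    if (strcmp(name, blocked[i]) == 0) return 0;
  return 1;
}
```

Once demo_set_mode has been called with lock set for S or L, every later call is refused, which mirrors the effect of #(mo,S,L) or #(mo,L,L).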

#Accessing command-line parameters from nntrac scripts

On normal script invocation (nntrac script.trac [param1 param2 ...]), nntrac automatically creates two forms: nnt-argc and nnt-argv. The nnt-argc form is a number containing the count of command-line parameters (the script file name + everything after it, akin to Python). The nnt-argv form contains the parameters themselves and is already segmented so that you can use the cs primitive for easier parameter access.

#Embedding nntrac into your projects

Besides being lightweight, nntrac is also fully embeddable. You can use it as the smallest scripting engine that can be tailored for the specific needs of your own software.

#Invoking nntrac from other C code

Place the nntrac.c and nntrac-embed.h files inside your project. In your C source code, add #include "nntrac-embed.h" at the top. Then, the following prototypes are available to you:

  • void nnt_init(): allocate the forms and primitives storage, register the basic primitives and prepare the engine for work. You must start every session with the nnt_init(); call before being able to call other functions in this list.
  • void nnt_regprimitive(const char *name, void *handler): register your own primitive function to the nntrac interpreter. See below for details.
  • void nnt_proc(char *prog, unsigned int len): run a script contained in the string prog of length len. Any #(hl) call will exit this function.
  • void nnt_finish(): free all forms and primitive function resources. Must be called when you no longer need the nntrac engine.

Then, you must build your project along with the nntrac.c file with the -DNNT_EMBED compiler flag. This flag disables the main() function in the nntrac source code itself.

#Extending nntrac with your own primitives

For your project scripting needs, you might have to introduce your own primitive functions to nntrac. First, define your functions according to the following prototype: char* handler(char *arglist, char *res, int *reslen);. A handler must do two things to be a valid primitive: return the res pointer and, if necessary, update the *reslen field (0 by default) with the actual result string length. To resize the res buffer, you must use only realloc! Doing otherwise will eventually lead to segfaults or memory leaks.

E.g. the function pr_custom might look like this:

char* pr_custom(char *arglist, char *res, int *reslen) {
  /* ...some actions that update the result string... */
  *reslen = strlen(res);
  return res;
}

Then, in your main code, somewhere between the calls to nnt_init and nnt_proc, you register the pointer to your primitive function with the name of your choice using the nnt_regprimitive call:

nnt_init();
/* ... */
nnt_regprimitive("my-custom", &pr_custom);

And the #(my-custom) call becomes available in your nntrac script code.

Now, how do we process the function arguments in our custom primitive definition? The arglist parameter is a string of function arguments (starting with the registered primitive name itself) delimited with a special NNT_ADEL character. For use with the C strtok function, it's more convenient to use a predefined null-terminated string with the same delimiter, called NNT_ADEL_S. Both NNT_ADEL and NNT_ADEL_S definitions are available in the nntrac-embed.h header, which also includes stdlib.h and string.h for your convenience.

Here's an example of how we would implement some RGB light API for nntrac, returning the status:

char *pr_rgbled(char *arglist, char *res, int *reslen) {
  char *arg = strtok(arglist, NNT_ADEL_S), *r, *g, *b;
  r = strtok(NULL, NNT_ADEL_S); /* get the first parameter */
  g = strtok(NULL, NNT_ADEL_S); /* get the second parameter */
  b = strtok(NULL, NNT_ADEL_S); /* get the third parameter */
  if(r != NULL && g != NULL && b != NULL) { /* all read successfully */
    int val_r = atoi(r), val_g = atoi(g), val_b = atoi(b); /* convert to int */
    rgbled_set_lights(val_r, val_g, val_b); /* call your internal API */
    rgbled_get_lights(&val_r, &val_g, &val_b); /* read back the status */
    char *fmt = "R=%d, G=%d, B=%d\n"; /* set the formatting string */
    /* estimate the size and initialize the resulting buffer */
    res = realloc(res, (*reslen) = 1 + snprintf(NULL, 0, fmt, val_r, val_g, val_b));
    memset(res, 0, *reslen); /* zero it out */
    snprintf(res, *reslen, fmt, val_r, val_g, val_b); /* render */
    res = realloc(res, (*reslen) = strlen(res)); /* resize to the actual written size */
  }
  return res; /* return the result pointer */
}

Then you can register this primitive with nnt_regprimitive("rgb", &pr_rgbled); in your main C code, and then, calling ##(rgb,43,67,133) in your script will return the string R=43, G=67, B=133 if the API succeeds.
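
If you want to try the handler contract without the rest of nntrac, a smaller variant can be compiled on its own. The sketch below is hypothetical: pr_upper is not part of nntrac, and the "\x1f" delimiter is only a placeholder used when nntrac-embed.h is not included (the real NNT_ADEL_S value comes from that header). It also checks the realloc result, which the example above leaves out for brevity.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder delimiter so this sketch builds without nntrac-embed.h.
   The real definition lives in the header; "\x1f" is an assumption. */
#ifndef NNT_ADEL_S
#define NNT_ADEL_S "\x1f"
#endif

/* Hypothetical primitive: return the first argument in upper case.
   Works byte by byte, so it is not UTF-8 aware. */
char *pr_upper(char *arglist, char *res, int *reslen) {
  char *arg, *grown;
  size_t i, len;
  strtok(arglist, NNT_ADEL_S);    /* skip the primitive name itself */
  arg = strtok(NULL, NNT_ADEL_S); /* get the first parameter */
  if (arg == NULL) return res;    /* nothing to do: *reslen stays as is */
  len = strlen(arg);
  grown = realloc(res, len + 1);  /* resize with realloc only */
  if (grown == NULL) return res;  /* allocation failed: keep res intact */
  res = grown;
  for (i = 0; i < len; i++)
    res[i] = (char)toupper((unsigned char)arg[i]);
  res[len] = '\0';                /* terminator kept for convenience */
  *reslen = (int)len;             /* report the actual result length */
  return res;                     /* always hand the pointer back */
}
```

Registered with nnt_regprimitive("up", &pr_upper);, it would be called from a script as #(up,hello).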

#FAQ

#Why reimplement TRAC and not another scripting language?

Because this is probably the only functional scripting language that can be fully implemented, even with some useful extensions, in under 1000 SLOC of ANSI C in a truly portable manner. Besides, C implementations of other embeddable scripting languages are easy to find and pick up, but for TRAC, at the time nntrac was created, nothing like that existed except a GPL-ed T-84 version that's hard to build with any modern C compiler.

#Why was T-64 standard chosen as the basis, not T-84 or T2001?

While having more "batteries included", T-84/T2001 had diverged from the original elegant design by switching to name suffixes to decide what to do with the function return value. This is much less flexible and less convenient for large-scale programs. T-64, on the other hand, can be easily extended (when really necessary) to do all the same things as T-84 allowed out of the box without sacrificing its core simplicity and flexibility.

#Is nntrac binary-safe?

In general, no. All nntrac programs are expected to not contain null bytes and the bytes from 248 to 255. Since nntrac, like all other TRAC dialects, is fully homoiconic and any piece of data can be treated as code, your data must not contain these bytes either. Emitting bytes with these values using the av primitive can and most probably will result in undefined behavior.

To process arbitrary binary data in an nntrac script, it is mandatory to convert it into a readable format (like hex or dec) with external tools (like od or xxd) before feeding it to the script. The nntrac interpreter contains all features required for your script to be able to process this kind of data.
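
When nntrac is embedded, the host program can do the same check and conversion itself before handing data to a script. The following sketch shows one possible way; the demo_* helpers are hypothetical and not part of nntrac. The first one tests a buffer for the byte values named above (null bytes and 248 to 255), the second one turns arbitrary bytes into lowercase hex text.

```c
#include <stddef.h>

/* Returns 1 if none of the n bytes is a null byte or in the range
   248..255, and 0 otherwise. */
int demo_is_safe(const unsigned char *data, size_t n) {
  size_t i;
  for (i = 0; i < n; i++)
    if (data[i] == 0 || data[i] >= 248) return 0;
  return 1;
}

/* Hex-encode n bytes into out as lowercase text.
   out must have room for 2 * n + 1 characters. */
void demo_hex(const unsigned char *data, size_t n, char *out) {
  static const char digits[] = "0123456789abcdef";
  size_t i;
  for (i = 0; i < n; i++) {
    out[2 * i] = digits[data[i] >> 4];      /* high nibble */
    out[2 * i + 1] = digits[data[i] & 15];  /* low nibble */
  }
  out[2 * n] = '\0';
}
```

The encoded text consists only of the characters 0-9 and a-f, so it always passes demo_is_safe.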

#Is nntrac UTF-8-safe?

Mostly. All the internal meta characters are chosen so that they never occur in any valid UTF-8 sequence. All primitives, however, operate on individual bytes, so the primitives that let you input, output or manipulate a single byte or a given number of bytes are not UTF-8-safe. These are the rc, cm, cc, cn, ac and av primitives.

#Why do rc and rs primitives require pressing Return (Enter) even after the metacharacter (') was entered?

They don't require it; your OS does. If you absolutely need per-character input, you need to set your terminal into the unbuffered input mode. On Unix-like systems, this can be done using a wrapper shell script with the stty command.

#Why doesn't the os primitive capture the shell command output?

Because there is no truly portable way to do this. To capture the output, it's recommended to redirect the command into a file (usually with the > or >> shell operator) and then read the file contents with the ff primitive.

#Credits

Implemented by Luxferre in 2023, released into public domain with no warranty.

Based on the original specification according to "Definition and Standard for TRAC T-64 Language" by Calvin N. Mooers (1972).