NRJ is a single-instruction computer architecture designed by Luxferre in 2022 and released into the public domain. This architecture is meant to be portable and easily reimplementable in both software and hardware. The project includes a specification (this README), a reference emulator, a reference assembler and a minimal standard library.
This document is a work in progress, and the specification and reference implementation may change at any time without notice.
(`nrj.c`)
NRJ features:
Depending on the word size, particular NRJ machine variants are called NRJ8, NRJ16, NRJ32 etc.
NRJ emulation is remarkably simple to implement. The entire algorithm for running a program on an NRJ machine is as follows:
For I/O routines, the operand is passed in cell 0 (input) or cell 1 (output), and the context ID is passed in cell 2. The current NRJ reference implementation only provides I/O routines for the standard context (ID 0), which are:
The standard context input/output should always be implemented as non-blocking and unbuffered.
By convention, NRJ binary files should (but aren't required to) have either the `.nrj` suffix or a suffix that includes the word size, like `.nrj16` or `.nrj32`. If the word size isn't explicitly stated in the suffix, a 16-bit word size should be assumed.
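The suffix convention above can be sketched as a small helper. This is only an illustration: the function name `word_bits_from_filename` is hypothetical and not part of the reference tools, and the sketch assumes the number in the suffix is written in decimal.

```python
import re

DEFAULT_WORD_BITS = 16  # assumed when the suffix carries no explicit size


def word_bits_from_filename(filename):
    """Guess the NRJ word size (in bits) from a binary file name.

    Hypothetical helper: `.nrj16` -> 16, `.nrj32` -> 32, anything else -> 16.
    """
    match = re.search(r"\.nrj(\d+)$", filename)
    if match:
        return int(match.group(1))
    return DEFAULT_WORD_BITS
```

For example, `word_bits_from_filename("hello.nrj32")` gives 32, while a plain `hello.nrj` falls back to 16.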
(`nrjasm.py`)
To start writing software for any new OISC architecture, some toolchain is required. The first reference assembly language and the corresponding assembler program for NRJ is called nrjasm.
By convention, nrjasm source files should have the `.nrjasm` suffix. All numeric values in the code are treated as hexadecimal.
Besides the only possible 3-operand machine instruction (with the basic form `[hex_addr] [hex_addr] [hex_addr]`), the nrjasm language has just 10 core directives, as well as a comment operator (`;`) and two dereferencing operators (`@` and `'`). Everything else is built on top of them in the standard library and your own code.
The directives can be divided into two classes: those written with a leading dot (`.bit`, `.org`, `.inc`, `.var`, `.set`, `.def`, `.end`) and the pseudo-location directives (`FREE`, `NXT`, `HLT`).
Here, the nrjasm directives and operators are described in the order they are processed by the assembler.
Only single-line comments are supported in nrjasm. Everything after `;` in the source line, as well as all the remaining leading or trailing whitespace, is stripped away at the first stage of assembly.
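As a rough sketch of this first stage (not the actual `nrjasm.py` code), comment stripping could look like the following. The names `strip_line` and `preprocess` are hypothetical, and dropping lines that end up empty is an assumption. The sketch also treats every `;` as a comment start, so it does not handle a literal `;` character.

```python
def strip_line(line):
    """Remove a trailing `;` comment and surrounding whitespace."""
    return line.split(";", 1)[0].strip()


def preprocess(source):
    """Return the non-empty, comment-free lines of an nrjasm source text."""
    stripped = (strip_line(line) for line in source.splitlines())
    # Assumption: lines that become empty carry no meaning and are dropped.
    return [line for line in stripped if line]
```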
You can include another source file at any place in your main code or in any other included file with the `.inc` directive, which accepts a file path that can be either relative or absolute. Note that if the path is relative, the current implementation resolves it against the directory you run the nrjasm executable from. There is also basic cyclic inclusion protection, so every source file can only be included once.
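The include-once behaviour can be illustrated with a short recursive sketch. It is an assumption-laden model rather than the reference implementation: `expand_includes` and the `read_lines` callback are hypothetical names, and paths are compared as plain strings (a real tool would probably normalize them first).

```python
def expand_includes(path, read_lines, seen=None):
    """Inline `.inc` files recursively, including each path at most once.

    `read_lines` is a hypothetical callback returning the lines of a file.
    """
    seen = set() if seen is None else seen
    if path in seen:
        return []  # already included: this also breaks inclusion cycles
    seen.add(path)
    result = []
    for line in read_lines(path):
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == ".inc":
            result.extend(expand_includes(parts[1].strip(), read_lines, seen))
        else:
            result.append(line)
    return result
```

Passing the reader as a callback keeps the sketch testable without touching the file system, for instance with a dictionary that maps paths to lists of lines.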
Although NRJ16 looks like the most practical NRJ variant (and is the default one), nrjasm can build for any word size divisible by 8. To override the default, supply the necessary size with the `.bit` directive, like `.bit 8`. Note that this can only be done once per source: any subsequent `.bit` occurrences will be ignored.
Macros are probably the most important nrjasm feature. They allow you to define repeatable, callable and parameterized pieces of code that are expanded quite early at build time. Every macro definition starts with `.def [macro name]` and finishes with `.end`. Each macro can accept up to 3 parameters. Within the macro, you can refer to these parameters with the pseudo-variables `%A`, `%B` and `%C`. If the macro uses more pseudo-variables than the number of parameters actually passed to it, the missing ones are replaced as follows: `%A` and `%B` are set to `HLT`, and `%C` is set to `NXT`. `HLT` and `NXT` are pseudo-location directives we'll cover a bit later.
Nesting macros is not allowed, i.e. you cannot define a macro inside another macro.
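The parameter defaulting rule can be sketched as follows. This only models substitution inside an already collected macro body. The function name `expand_macro_body` is hypothetical, and the sketch assumes pseudo-variables appear as whitespace-separated tokens.

```python
# Defaults described above: missing %A and %B become HLT, missing %C becomes NXT.
DEFAULTS = {"%A": "HLT", "%B": "HLT", "%C": "NXT"}


def expand_macro_body(body_lines, args):
    """Substitute %A, %B and %C in a macro body with the given arguments."""
    if len(args) > 3:
        raise ValueError("a macro accepts at most 3 parameters")
    values = dict(DEFAULTS)
    for name, arg in zip(("%A", "%B", "%C"), args):
        values[name] = arg
    # Replace whole tokens only, so arguments are never substituted twice.
    return [
        " ".join(values.get(token, token) for token in line.split())
        for line in body_lines
    ]
```

With a single argument, `expand_macro_body(["%A %B %C"], ["10"])` yields `["10 HLT NXT"]`.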
To denote where exactly in the binary the code should be placed by the assembler, you can use the `.org` directive. Normally, you want to use at least the `.org 3` directive at the start of your code to reserve the first three words for the input, output and I/O context ID ports if you don't plan on pre-populating their values. Unlike `.bit`, the `.org` directive can be used multiple times in the nrjasm code, so you can place your instructions however you see fit.
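A minimal model of `.org` placement, assuming operands are already plain hexadecimal words and that placement starts at address 0 when no `.org` has been seen (both are assumptions of this sketch, as is the function name `place`):

```python
def place(lines):
    """Map addresses to words, honouring `.org` origin changes."""
    image = {}
    address = 0  # assumed default origin
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        if parts[0] == ".org":
            address = int(parts[1], 16)  # all numbers are hexadecimal
            continue
        for word in parts:  # an instruction occupies consecutive words
            image[address] = int(word, 16)
            address += 1
    return image
```

Here `place([".org 3", "10 11 12"])` leaves cells 0 to 2 untouched and puts the three operands at addresses 3, 4 and 5.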
The `.var` directive allows you to set a named alias to a hexadecimal word constant value. To retrieve the value from the alias, we prepend the `@` dereferencing operator to it. Yes, that's it.
However, most of the time we use these hexadecimal constants as variable addresses (hence the name `.var`, not `.const`). And most of the time, we don't care what the address actually is, we just need to reserve some space for our variable. This is when we can use the `FREE` pseudo-location directive (and it can only be used here, in conjunction with `.var`). E.g. instead of writing `.var mycoolvariable 12EF` we can write `.var mycoolvariable FREE` and the assembler will reserve the first location not yet used by other variables and make `@mycoolvariable` resolve to its address.
In fact, the `FREE` directive makes the assembler fill in the address values only after all other `.var`s are processed. So the addresses taken with `FREE` are guaranteed to be larger than any strictly defined ones, and are calculated relative to the maximum used address. For this reason, you must define at least one "strict" `.var` if you want `FREE` to work correctly. Also, keep in mind that nrjasm doesn't know where the code part ends and the variable part begins (it only knows where the variable part ends), so you must adjust the minimum variable address manually in case the two start overlapping.
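The two-pass idea behind `FREE` can be sketched like this. It is an illustration under assumptions, not the reference code: `allocate_vars` is a hypothetical name, and the sketch assumes `FREE` variables get consecutive addresses starting right after the largest strictly defined one.

```python
def allocate_vars(declarations):
    """Resolve (name, value) pairs from `.var` lines into addresses.

    `value` is either a hexadecimal string or the literal "FREE".
    """
    table = {
        name: int(value, 16)
        for name, value in declarations
        if value != "FREE"
    }
    if not table:
        raise ValueError("at least one strictly defined .var is required")
    next_free = max(table.values()) + 1  # assumption: allocate upwards by one
    for name, value in declarations:
        if value == "FREE":
            table[name] = next_free
            next_free += 1
    return table
```

With `[("a", "12EF"), ("b", "FREE")]`, the variable `b` ends up at `12F0`, whatever order the two declarations appear in.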
Besides the `@` operator that dereferences a name into its memory address, there is also the `'` operator that dereferences a character into its ASCII code. This can be useful in some cases when you need to output or compare something to a character value. E.g. in NRJ16, `'M` is fully equivalent to the `004D` hex constant but easier to write and remember.
Note that the dereferencing operators don't work in `.var` itself: it only accepts either `FREE` or a plain hexadecimal value as the second parameter.
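The character operator amounts to a one-line conversion. A small sketch, with the hypothetical helper name `char_to_hex` and zero-padding to the word width assumed for readability:

```python
def char_to_hex(token, word_bits=16):
    """Turn a token like 'M into its ASCII code as a hexadecimal word."""
    if len(token) != 2 or token[0] != "'":
        raise ValueError("expected a quote followed by one character")
    digits = word_bits // 4  # four bits per hexadecimal digit
    return format(ord(token[1]), "0{}X".format(digits))
```

For NRJ16, `char_to_hex("'M")` returns `"004D"`, matching the example above.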
To just set a word value at some address (plain or named), you can use the `.set` directive. This is a common scenario to initialize a variable with a non-zero value at build time, when we write `.set @myvar 1234` after declaring `myvar` with, for instance, `.var myvar FREE`. Instead of a plain hex value, the `.set` directive can also take the pseudo-location directives `HLT` and `NXT`, which we're going to describe right now.
Since an NRJ machine cannot simply proceed to the next instruction and instead needs a memory cell that refers to it, a lookup table is necessary. In nrjasm, this table is constructed right after the last allocated variable address, or in the middle of the memory space in the highly unlikely event that no variables were declared at all.
The very first entry of the lookup table always refers to the last machine address (`FFFF` in the case of NRJ16) to allow halting. This is why the `HLT` pseudo-location directive always gets substituted with the actual lookup table start offset.
The `NXT` pseudo-location directive, on the other hand, is resolved to the address of the lookup table entry corresponding to the following machine instruction in the actual code. This allows you to write machine instructions in a linear fashion by just using `NXT` as the third operand instead of an absolute address.
Besides direct machine instructions, the `HLT` and `NXT` pseudo-locations can also be used in the `.set` directive. Their logic is processed differently but they essentially give the same effect.
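To make the lookup table idea concrete, here is a simplified sketch. It is not the layout `nrjasm.py` necessarily produces: `build_lookup` is a hypothetical name, and the sketch assumes the table starts at the word right after the last variable, that each instruction takes three words, and that every instruction gets its own entry.

```python
def build_lookup(instruction_addresses, last_var_address, word_bits=16):
    """Return (table start, table words, NXT address per instruction)."""
    base = last_var_address + 1            # HLT resolves to this address
    halt_target = (1 << word_bits) - 1     # last machine address, e.g. FFFF
    table = [halt_target]                  # entry 0 points at the halt address
    nxt = {}
    for address in instruction_addresses:
        table.append(address + 3)          # address of the following instruction
        nxt[address] = base + len(table) - 1
    return base, table, nxt
```

In this model, an instruction at address 3 with `NXT` as its third operand would get the address of a table cell holding 6, and `HLT` would be the table start, whose cell holds `FFFF` on NRJ16.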
(under construction)