~ashn/sunder

Replace no_sanitize("function") attribute with further comments on ABI expectations

After further thought, disabling sanitizers in generated C functions
masks a very real ISO C conformance issue, masks actual unexpected UB
that could be discovered in the future, and makes generated code uglier.
Expand on previous comments on function pointer casting with further
explanation about the ABI expectations of the target platform instead of
disabling -fsanitize=function checks altogether.
Disable -fsanitize=function checks within generated C function definitions

This changeset disables -fsanitize=function for function definitions in
generated C. Sunder allows implicit casting of a function with parameter
types and/or a return type of type `*T` to a function type where those
same parameter types and/or return type are of type `*any`. This form of
casting matches the Sunder definition of compatible types, in which all
pointers are considered to be word-sized integers with the same size and
alignment.
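
As a rough illustration of the ISO C rule involved, the sketch below
converts a function pointer taking `int*` to one taking `void*` and
back. All names are hypothetical and do not come from generated Sunder
code. ISO C permits the conversion itself, but only defines the call
when it goes through the original function type, so the sketch converts
back before calling. Calling directly through the `void*`-typed pointer
relies on the platform ABI and is the kind of call that
-fsanitize=function reports.

```c
#include <assert.h>
#include <stddef.h>

/* ABI expectation: object pointers share one size and one alignment. */
_Static_assert(sizeof(int*) == sizeof(void*), "pointer sizes differ");
_Static_assert(_Alignof(int*) == _Alignof(void*), "pointer alignments differ");

typedef int (*fn_ptr_int)(int*);  /* stands in for a Sunder func(*T) */
typedef int (*fn_ptr_any)(void*); /* stands in for a Sunder func(*any) */

static int deref_plus_one(int* p)
{
    assert(p != NULL);
    return *p + 1;
}

/* Converting between function pointer types is allowed by ISO C. */
static fn_ptr_any as_any(fn_ptr_int f)
{
    return (fn_ptr_any)f;
}

/* Conformant call: convert back to the original type before calling.
 * Writing f(arg) here instead would call through an incompatible type. */
static int call_conformant(fn_ptr_any f, int* arg)
{
    return ((fn_ptr_int)f)(arg);
}
```
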
Clean up `#define`s in lib/sys/sys.h

Rewrites some constant `#define`s as `static const` definitions and
prefixes the `IEEE754_MATH_DEFINITIONS*` macro templates with the
`__SUNDER_` mangle prefix.
Use sed instead of cut in examples.build.sh

Usage of `cut` with the field delimiter set to `=` would not permit
setting `SUNDER_CC` and `SUNDER_CFLAGS` to a string containing the `=`
character (e.g. "-fsanitize=address"). This changeset cuts from the
initial `=` character to the end of the line using `sed` to overcome
this limitation.
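
The difference between the two approaches can be sketched in C rather
than shell. This is only an analogy with hypothetical helper names, not
the contents of examples.build.sh: selecting one `=`-delimited field
stops at the second `=`, while taking everything after the first `=`
keeps the whole value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the text after the first '=' in `line`, or NULL if there is
 * none. Mirrors "cut from the initial `=` to the end of the line". */
static const char* value_after_first_equals(const char* line)
{
    assert(line != NULL);
    const char* eq = strchr(line, '=');
    return eq != NULL ? eq + 1 : NULL;
}

/* Copy only the second '='-delimited field of `line` into `out`,
 * truncating to fit `cap`. Mirrors selecting a single field. */
static void second_field(const char* line, char* out, size_t cap)
{
    assert(line != NULL && out != NULL && cap > 0);
    const char* start = value_after_first_equals(line);
    if (start == NULL) {
        out[0] = '\0';
        return;
    }
    size_t n = strcspn(start, "="); /* length up to the next '=' */
    if (n >= cap) {
        n = cap - 1;
    }
    memcpy(out, start, n);
    out[n] = '\0';
}
```
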
Revamp FFI examples

Make the C<->Sunder FFI examples a little bit cleaner and easier to
understand. These examples should serve as a starter template for how to
build mixed C/Sunder applications.
Move the "calling-sunder-from-c" example into its own directory
Move the "calling-c-from-sunder" example into its own directory
Update README

The README program listing for `examples/greet.sunder` was not up to
date with the actual source under the examples directory.
Add the mangle prefix to language-internal identifiers

During the initial refactor work on mangled identifiers in generated C,
the mangle prefix, `__sunder_`, was removed from all mangled names that
did not otherwise conflict with identifiers used by the C preprocessor,
the C language, or the C runtime (for a small set of special cases).

This changeset reintroduces the mangle prefix for identifiers used to
hold temporary objects (e.g. `__sunder_lhs` or `__sunder_idx`) as well
as language-internal builtins expected to be present within the sys.h
header file (e.g. `__sunder_fatal` or `__sunder_add_wrapping_ssize`).
Rename T___MIN and T___MAX to __sunder_T_MIN and __sunder_T_MAX in sys.h
Separate "parameter" and the parameter number in mangled function typenames

Updates to match other forms of mangling and code generation which
separate words and numbers by underscores.
Remove unnecessary underscore added during function type mangling
Additional readability improvements to mangling

Followup to the previous commit. This changeset makes further
improvements to name mangling, providing additional context for member
variables within mangled anonymous struct and enum definitions.

This changeset also replaces the GCC-style
`<symbol-part-A>.<symbol-part-B>.<unique-id>` format for static symbol
addresses with `<symbol-part-A>::<symbol-part-B>.<unique-id>`, allowing
for name mangling to work uniformly across mangled/generated addresses
and identifiers. Given that Sunder no longer directly outputs
NASM-flavored assembly, the old GCC-style behavior was no longer
necessary.
Add bespoke readability improvements to mangling

This changeset updates the `mangle` function with the explicit goal of
improving the general readability of generated C code.

Previously, most non-alphanumeric characters in mangled text would be
replaced with an underscore, leading to mangled strings that were
somewhat difficult for humans to parse:

        typedef struct std_optional__std_result____byte___std_error_info____ std_optional__std_result____byte___std_error_info____; // std::optional[[std::result[[[]byte, *std::error_info]]]]

With the updated `mangle` function, known text strings produced by the
Sunder compiler within type names, such as "[[" or ", ", are translated
into meaningful textual representations that better convey the meaning
of the pre-mangled text:

        typedef struct std_optional_TEMPLATE_BGN_std_result_TEMPLATE_BGN_slice_of_byte_COMMA_pointer_to_std_error_info_TEMPLATE_END_TEMPLATE_END std_optional_TEMPLATE_BGN_std_result_TEMPLATE_BGN_slice_of_byte_COMMA_pointer_to_std_error_info_TEMPLATE_END_TEMPLATE_END; // std::optional[[std::result[[[]byte, *std::error_info]]]]

Given that these changes are targeted exclusively at type mangling,
there is potential future work to be done by moving individual type
mangling transformations into subtype-specific portions of the
`mangle_type_recursive` or `mangle_type` functions.
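
A minimal sketch of this kind of translation follows. It is not the
compiler's `mangle` function: the table is an assumption inferred from
the single before/after example in this message, and `sketch_mangle` is
a hypothetical name. Known strings are translated first, and any other
non-identifier character falls back to an underscore.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical translation table, inferred from the example above. */
struct translation {
    const char* from;
    const char* to;
};

static const struct translation TRANSLATIONS[] = {
    {"[[", "_TEMPLATE_BGN_"},
    {"]]", "_TEMPLATE_END"},
    {", ", "_COMMA_"},
    {"[]", "slice_of_"},
    {"::", "_"},
    {"*", "pointer_to_"},
};

/* Write the mangled form of `text` into `out` (capacity `cap`, including
 * the NUL terminator). Returns 0 on success, -1 if `out` is too small. */
static int sketch_mangle(const char* text, char* out, size_t cap)
{
    assert(text != NULL && out != NULL);
    if (cap == 0) {
        return -1;
    }
    size_t len = 0;
    while (*text != '\0') {
        const char* piece = NULL;
        size_t consumed = 1;
        for (size_t i = 0; i < sizeof(TRANSLATIONS) / sizeof(TRANSLATIONS[0]); ++i) {
            size_t n = strlen(TRANSLATIONS[i].from);
            if (strncmp(text, TRANSLATIONS[i].from, n) == 0) {
                piece = TRANSLATIONS[i].to;
                consumed = n;
                break;
            }
        }
        char single[2] = {0};
        if (piece == NULL) {
            /* Fallback: keep identifier characters, replace the rest. */
            int keep = isalnum((unsigned char)*text) || *text == '_';
            single[0] = keep ? *text : '_';
            piece = single;
        }
        size_t piece_len = strlen(piece);
        if (len + piece_len + 1 > cap) {
            return -1;
        }
        memcpy(out + len, piece, piece_len);
        len += piece_len;
        text += consumed;
    }
    out[len] = '\0';
    return 0;
}
```
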
Remove extraneous #define in generated code
Include the name mangling prefix in generated code for local variables

Local variables use a fabricated identifier in generated C which has the
potential to cause a name collision with a static object or function.
Even if this scenario is unlikely, we specifically choose to prefix C
identifiers corresponding to local variables with the name mangle prefix
as a preventative measure.
Add `mangle_address`

Adds `mangle_address` as a shorthand way to generate the lvalue name
associated with an address instead of needing to write:

        mangle_name(address->data.local.name)

or

        mangle_name(address->data.static_.name)

every time.
Improve name mangling by only prefixing symbols when necessary

This changeset improves the name mangling algorithm used during C code
generation by only adding the `__sunder_` prefix to identifiers that
would otherwise conflict with the C preprocessor, language, or runtime.
With this changeset in place, generated C code becomes slightly more
readable, and C `extern` declarations binding to Sunder symbols no
longer require adding `asm("<name>")` for Sunder symbol `<name>`.
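
The idea of prefixing only on conflict can be sketched as follows. The
reserved list here is a small hypothetical subset chosen for
illustration, and `sketch_mangle_name` is not the compiler's actual
function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical subset of names that would clash with the C language,
 * preprocessor, or runtime. The real list is not shown in this log. */
static const char* const RESERVED[] = {"int", "char", "return", "errno", "NULL"};

static int needs_prefix(const char* name)
{
    for (size_t i = 0; i < sizeof(RESERVED) / sizeof(RESERVED[0]); ++i) {
        if (strcmp(name, RESERVED[i]) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Write `name` into `out`, adding the `__sunder_` prefix only when the
 * plain name would conflict. Returns 0 on success, -1 on truncation. */
static int sketch_mangle_name(const char* name, char* out, size_t cap)
{
    assert(name != NULL && out != NULL);
    const char* prefix = needs_prefix(name) ? "__sunder_" : "";
    int n = snprintf(out, cap, "%s%s", prefix, name);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

With names left unprefixed, a C declaration such as `extern` for a
symbol named `point` can refer to the symbol by its plain name.
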
Greatly improve the performance of `std::read_all_with_allocator`

The previous behavior of `std::read_all_with_allocator` would re-size
the output buffer in 4096 byte chunks after every buffered read, leading
to truly horrendous performance when reading from large data sources.
This changeset doubles the output buffer size when the total amount of
data read would exceed the current allocated size of the output buffer.
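
The growth strategy can be sketched in C as follows. This is not the
Sunder standard library code: `struct source`, `source_read`, and
`read_all_doubling` are hypothetical stand-ins, with an in-memory
source taking the place of a reader and `realloc` taking the place of
the allocator. The reallocation counter is only there to make the
effect of doubling visible.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory data source standing in for a reader. */
struct source {
    const unsigned char* data;
    size_t size;
    size_t pos;
};

/* Copy up to `cap` bytes into `buf`. Returns the number of bytes
 * copied, which is 0 once the source is exhausted. */
static size_t source_read(struct source* s, unsigned char* buf, size_t cap)
{
    size_t n = s->size - s->pos;
    if (n > cap) {
        n = cap;
    }
    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return n;
}

/* Read everything from `s`, doubling the output buffer whenever the
 * next chunk would not fit. Returns a malloc'd buffer owned by the
 * caller, or NULL on allocation failure. `*grow_count` receives the
 * number of reallocations performed. */
static unsigned char* read_all_doubling(struct source* s, size_t* out_len, size_t* grow_count)
{
    assert(s != NULL && out_len != NULL && grow_count != NULL);
    unsigned char chunk[4096];
    size_t cap = sizeof(chunk);
    size_t len = 0;
    unsigned char* out = malloc(cap);
    if (out == NULL) {
        return NULL;
    }
    *grow_count = 0;
    for (;;) {
        size_t n = source_read(s, chunk, sizeof(chunk));
        if (n == 0) {
            break;
        }
        while (len + n > cap) {
            unsigned char* grown = realloc(out, cap * 2);
            if (grown == NULL) {
                free(out);
                return NULL;
            }
            out = grown;
            cap *= 2;
            *grow_count += 1;
        }
        memcpy(out + len, chunk, n);
        len += n;
    }
    *out_len = len;
    return out;
}
```

Under these assumptions a 1 MiB source needs 8 reallocations, where
growing by a fixed 4096 bytes per read would need 255.
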

Performance improvements were tested by reading a 64MB file with
`std::read_all`, which implicitly calls `std::read_all_with_allocator`
using the global general allocator.

```
import "std";

func main() void {
    var result = std::file::open("/home/ashn/test/64mb.txt", std::file::OPEN_READ);
    var f = result.value();
    defer f.close();

    var r = std::reader::init[[typeof(f)]](&f);

    var result = std::read_all(r);
    var text = result.value();
    defer std::slice[[byte]]::delete(text);
}
```

Testing on my desktop machine with `/usr/bin/time`, the runtime
characteristics of the test program before this changeset were:

```
$ /usr/bin/time -v ./test
        Command being timed: "./test"
        User time (seconds): 1534.94
        System time (seconds): 35.78
        Percent of CPU this job got: 99%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 26:11.24
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 133148
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 7142976
        Voluntary context switches: 1
        Involuntary context switches: 1880
        Swaps: 0
        File system inputs: 67072
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
```

and the runtime characteristics after this changeset are:

```
$ /usr/bin/time -v ./test
        Command being timed: "./test"
        User time (seconds): 0.24
        System time (seconds): 0.03
        Percent of CPU this job got: 100%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.27
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 100412
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 3738
        Voluntary context switches: 1
        Involuntary context switches: 0
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
```

Due to buffer reallocations no longer exactly fitting the size of read
data, this changeset will break the previously allowed behavior where a
source read with an `std::linear_allocator` exactly large enough to fit
the source's data was guaranteed to succeed if the read calls succeeded.
Fortunately, this behavior can still be emulated using standard library
interfaces, and the trade-off to make the simple/easy path performant is
well worth the cost of requiring extra work for exact-size reads.
Fix the install target (again) to behave uniformly on macOS and Linux

Followup to the previous commit. On macOS, using `cp -r DIR_A/ DIR_B`
unpacks and copies the contents of `DIR_A` into `DIR_B`. On Linux, using
`cp -r DIR_A/ DIR_B` copies the directory `DIR_A` into `DIR_B`.
Substituting the command with `cp -r DIR_A DIR_B` seems to make the
behavior identical on macOS and Linux.