~q3cpma/misc-tools

Various useful C tools

clone

read-only
https://git.sr.ht/~q3cpma/misc-tools
read/write
git@git.sr.ht:~q3cpma/misc-tools

                            misc-tools
                            ==========

    Overview
    --------

A collection of miscellaneous tools made in POSIX C99:
    * genhtab     Generate static C99 hash tables (cf genhtab_bench/ for
                  a performance comparison with gperf)
    * htmldecode  HTML decoding to UTF-8
    * htmlencode  HTML encoding from UTF-8
    * mbcut       Multibyte aware string trimming
    * natsort     Natural sorting for UTF-8
    * urldecode   URL decoding
    * urlencode   URL encoding
    * textwidth   Like wcswidth(3), but tabs and backspaces are treated as
                  cursor movements
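
As an illustration of what treating tabs and backspaces as cursor movements
can mean, here is a minimal sketch. It is not the code of textwidth: the
function name ascii_cursor_width, the tabstop parameter, the ASCII-only
restriction and the choice to return the final cursor column are all
assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of misc-tools): returns the cursor column
 * reached after "printing" s, starting from column 0.
 * Assumptions: ASCII input only, tab stops every `tabstop` columns. */
size_t ascii_cursor_width(const char *s, size_t tabstop)
{
    assert(s != NULL && tabstop > 0);

    size_t col = 0;
    for (; *s != '\0'; ++s) {
        if (*s == '\t')
            col += tabstop - col % tabstop; /* jump to the next tab stop */
        else if (*s == '\b')
            col -= (col > 0);               /* step back, never before column 0 */
        else if (*s >= 0x20 && *s < 0x7f)
            ++col;                          /* printable ASCII: one column */
        /* other control characters and non-ASCII bytes are ignored here */
    }
    return col;
}
```

With tab stops every 8 columns, "ab\tc" ends at column 9: two letters, a jump
to column 8, then one more letter. A real implementation would use wcwidth(3)
for the width of each non-control character instead of assuming one column.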

Note: for simplicity, Unicode handling is limited to code points, treating
combining characters and emoji as a sequence of code points instead of a
complete grapheme.
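
To make the limitation concrete, the sketch below counts code points in a
UTF-8 string. The helper name utf8_codepoints is hypothetical and the input
is assumed to be valid UTF-8; it is not taken from the repository.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of misc-tools): counts the code points of a
 * valid UTF-8 string by counting every byte that is not a continuation
 * byte (bit pattern 10xxxxxx). */
size_t utf8_codepoints(const char *s)
{
    assert(s != NULL);

    size_t n = 0;
    for (; *s != '\0'; ++s)
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    return n;
}
```

The precomposed "\xC3\xA9" (U+00E9) counts as one code point, while the
visually identical "e\xCC\x81" (e followed by U+0301 COMBINING ACUTE ACCENT)
counts as two, even though a reader sees a single grapheme in both cases.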


    Dependencies
    ------------

A POSIX environment with the following additions at build time:
    * Internet access and curl(1), wget(1) or fetch(1)
      for htmldecode, mbcut, natsort (to get Unicode data) and
      genhtab if USE_XXHASH=true


    Building and installation
    -------------------------

Building and installation (variables in square brackets are optional; values
in curly brackets are the defaults):
    $ [BIN=<tool>] {CC=c99} {LTO=false} {NATIVE=false} ./build.sh
    $ [BIN=<tool>] {DESTDIR=} {PREFIX=/usr/local} ./build.sh install

For genhtab, you can set USE_XXHASH=true to switch from FNV-1a to XXH3/XXH32.
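
For reference, the FNV-1a scheme itself is short. The sketch below is the
standard 32-bit variant, written independently of genhtab: the function name
fnv1a_32 is hypothetical, and the 32-bit width is an assumption, since the
hash size genhtab actually uses is not stated here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not genhtab's code): standard 32-bit FNV-1a.
 * Each input byte is XORed into the hash, which is then multiplied by the
 * FNV prime; the multiplication wraps modulo 2^32. */
uint32_t fnv1a_32(const void *data, size_t len)
{
    assert(data != NULL || len == 0);

    const unsigned char *p = data;
    uint32_t h = 2166136261u;       /* 32-bit FNV offset basis */
    for (size_t i = 0; i < len; ++i) {
        h ^= p[i];
        h *= 16777619u;             /* 32-bit FNV prime */
    }
    return h;
}
```

Hashing the empty input returns the offset basis, 0x811c9dc5, which is a
quick sanity check for any implementation.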

Cleanup:
    $ [BIN=<tool>] ./build.sh clean (or mrproper if you want the binaries gone too)

Uninstall:
    $ [BIN=<tool>] ./build.sh uninstall

Test:
    $ ./test.sh [<tool>...]

For all operations, if no <tool> is specified, all tools are processed.
LTO=true is strongly recommended for mbcut and htmldecode, to avoid binary
bloat due to utf8.c containing big Unicode lookup tables (LUTs).