@@ 0,0 1,107 @@
++++
+title = "The Problem With GNU getopt; Or, On Standards"
+draft = true
++++
+
+GNU `getopt(3)` is broken. The term that they would use is 'nonstandard', but
+nonstandard interfaces make software less robust, less portable and less
+maintainable. Generally speaking, such interfaces are present to create vendor
+lock-in. No matter how much GNU "respects your freedom" or allows you to do
+anything with their software, they are still software vendors who are
+incentivized to make it harder for us to use other versions of software. To be
+clear, I prefer GNU's attempts at lock-in over proprietary lock-in, but the
+consequences are often the same anyway: non-portable and fragile programs.
+
+The [`getopt(3)`](https://pubs.opengroup.org/onlinepubs/9699919799/functions/getopt.html)
+function is a POSIX system interface implementing argument parsing according to
+the [Utility Syntax Guidelines]. It's widely implemented across programming
+languages as a sane way to handle command line options like the `-l` in `ls -l`.
+Those guidelines require that POSIX conformant utilities place all their
+options before their operands. A conformant `getopt` will stop parsing when the
+first non-option value is encountered. Needless to say, glibc's `getopt` is
+non-compliant.
+
+[Utility Syntax Guidelines]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html#tag_12_02
+
+The glibc implementation of `getopt` does not conform to the Utility Syntax
+Guidelines. In fact, when trying to write a command line C utility, it has some
+_extremely surprising_ behavior. To illustrate, I present an example that I ran
+into while implementing the `env(1)` command for **ctools**{{fn(id=1)}}. Here,
+`env` is linked to glibc:
+
+```txt
+$ cat /dev/urandom | base64 | env -i PATH=/usr/bin head -n 15
+env: invalid option -- 'n'
+env [-i] [name=value]... [utility [argument...]]
+```
+
+What? The `-n` option was clearly for `head`. What happened? It turns out that
+glibc's `getopt` _permutes_ the elements of `argv` as it scans. Why? To quote
+glibc's [docs](https://www.gnu.org/software/libc/manual/html_node/Using-Getopt.html#Using-Getopt):
+
+> * The default is to permute the contents of argv while scanning it so that
+> eventually all the non-options are at the end. This allows options to be
+> given in any order, even with programs that were not written to expect this.
+>
+> * POSIX demands the following behavior: the first non-option stops option
+> processing. This mode is selected by either setting the environment variable
+> POSIXLY\_CORRECT or beginning the options argument string with a plus sign
+> (‘+’).
+>
+
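+As a rough sketch of the second mode in the quote above, the hypothetical
+helper `parse_env_options` below (it is not taken from ctools) passes an option
+string beginning with `+`, which asks glibc to stop at the first non-option.
+The leading `+` is a glibc extension rather than part of POSIX, so how other C
+libraries treat it is an assumption to check against their documentation.
+
+```c
+#include <assert.h>
+#include <unistd.h>
+
+/*
+ * Hypothetical helper, for illustration only: parse env-style options and
+ * return the index of the first operand, or -1 on an unrecognized option.
+ * It relies on getopt's initial state, so it is meant to be called once.
+ */
+static int parse_env_options(int argc, char *argv[], int *clear_env)
+{
+    int opt;
+
+    assert(clear_env != NULL);
+    *clear_env = 0;
+
+    /* "+i": the '+' requests stop-at-first-non-option parsing in glibc;
+     * 'i' is the only option this sketch accepts. */
+    while ((opt = getopt(argc, argv, "+i")) != -1) {
+        switch (opt) {
+        case 'i':
+            *clear_env = 1;
+            break;
+        default:
+            /* getopt returned '?' for an option not in the string. */
+            return -1;
+        }
+    }
+
+    /* optind now indexes the first argument getopt did not consume. */
+    return optind;
+}
+```
+
+Given the arguments from the failing invocation (`env -i PATH=/usr/bin head -n
+15`), this should return 2 under glibc, leaving `head -n 15` untouched for the
+utility being run.
+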
+To summarize: to write a utility that conforms to the syntax guidelines with no
+dependencies other than the system interfaces under GNU, POSIXLY\_CORRECT must
+be set; otherwise, _glibc may break the program_{{fn(id=2)}}. An attempt to
+write portable, robust software instead yields a broken program that doesn't
+work on the most widely installed system in the world.
+
+Why do we even try to standardize? We write standards because we disagree. In
+software, we disagree on what algorithm to use to sort a list, or what
+programming language to use to write a client to send and read an email.
+Standards mean that I don't have to care about what you use in order to know
+that my message will be readable, or the list I give you will be sorted. It
+doesn't matter if `/bin/true` is an empty file with the execute bit set or
+a C program the entire text of which is `int main(void) { return 0; }`. They
+both return a true value to my shell that I can rely on in a script. When
+I use `getopt` to implement option parsing for a utility, I expect it to
+behave, not _break my program_.
+
+It's not in GNU's interest to break my program, but it's also not not in GNU's
+interest to break my program. GNU wants us to write programs for GNU.
+A proliferation of programs that only work under GNU makes it more attractive to
+install GNU, and less attractive to install alternatives. There may have been
+historical reasons for this, GNU's not UNIX after all, but these days GNU/Linux
+is far more of an 800-pound *NIX gorilla than any official
+UNIX{{fn(id=3)}}.
+
+Without standards, even de facto ones, alternatives proliferate. When
+programmers can't rely on the properties of the system, they implement their
+own. We end up with POSIX compatibility implementations for GNU and
+GNU-behaving implementations for POSIX systems. These all get fewer eyes on
+them than standard interfaces and so are often buggier than their standard
+counterparts. Standard interfaces can furthermore be validated against
+specifications and against different implementations.
+
+We need to both push on standards and work within them. They are both an
+artifact (like ISO C 1999) and a living process, at once frustratingly ossified
+and a rock-solid base on which to build programs. We should be collaborative in
+developing interfaces, actively working to prevent fragmentation in our
+ecosystems while always being open to innovation. If we do this we can make
+using standards obvious and freeing, rather than difficult and limiting. In
+making and using standards, we free ourselves and we free our users, as we and
+they can be confident that our utilities and interfaces will work, robustly,
+predictably, and everywhere.
+
+{{renderfootnotes(items=[
+ '<a href="https://git.sr.ht/~sircmpwn/ctools" target="_blank">ctools</a> is an
+ implementation of strictly conformant core POSIX utilities written in C. The
+ self-contained nature of every utility makes it a really easy project to
+ contribute to. You can find the source of the `env` command I mentioned
+ <a href="https://git.sr.ht/~sircmpwn/ctools/tree/master/src/env.c"
+ target="_blank">here</a>.',
+ 'I sincerely hope that no other utilities on my system rely on behavior that
+ will change when POSIXLY\_CORRECT is set, but who am I kidding? 100% chance
+ that some other program would break.',
+ 'Except perhaps macOS, which is often worse as we have to deal with
+ proprietary bullshit _and_ GNU bullshit (from 2006!).',
+])}}