~cdv/chris.vittal.dev

790a48bfa06c67a126348cbb2414efeb6b5d2067 — Chris Vittal 5 months ago 38e6a07
A few article edits.
2 files changed, 24 insertions(+), 26 deletions(-)

M content/posts/2019-10-28_gnu-getopt-sucks.md
M sass/style.scss
M content/posts/2019-10-28_gnu-getopt-sucks.md => content/posts/2019-10-28_gnu-getopt-sucks.md +23 -26
@@ 18,8 18,7 @@ the [Utility Syntax Guidelines]. It's widely implemented across programming
languages as a sane way to handle command line options like the `-l` in `ls -l`.
Those guidelines require that POSIX conformant utilities place all their
options before their operands. A conformant `getopt` will stop parsing when the
-first non-option value is encountered. Needless to say glibc's `getopt` is
-not-compliant.
+first non-option value is encountered.

[Utility Syntax Guidelines]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html#tag_12_02



@@ 27,7 26,7 @@ The glibc implementation of `getopt` does not conform to the Utility Syntax
Guidelines. In fact, when trying to write a command line C utility, it has some
_extremely surprising_ behavior. To illustrate, I present an example that I ran
into while implementing the `env(1)` command for **ctools**{{fn(id=1)}}.  Here,
-`env` is linked to glibc:
+my version of `env` is linked to glibc:

```txt
$ cat /dev/urandom | base64 | env -i PATH=/usr/bin head -n 15


@@ 37,7 36,9 @@ env [-i] [name=value]... [utility [argument...]]

What? The `-n` option was clearly for `head`. What happened? It turns out that
glibc's `getopt` _permutes_ the elements of `argv` as it scans. Why? To quote
-glibc's [docs](https://www.gnu.org/software/libc/manual/html_node/Using-Getopt.html#Using-Getopt):
+[glibc's docs]:
+
+[glibc's docs]: https://www.gnu.org/software/libc/manual/html_node/Using-Getopt.html#Using-Getopt

> * The default is to permute the contents of argv while scanning it so that
>   eventually all the non-options are at the end. This allows options to be


@@ 47,19 48,17 @@ glibc's [docs](https://www.gnu.org/software/libc/manual/html_node/Using-Getopt.h
>   processing. This mode is selected by either setting the environment variable
>   POSIXLY\_CORRECT or beginning the options argument string with a plus sign
>   (‘+’).
->

To summarize, to write a utility conforming to the syntax guidelines with no
dependencies other than the system interfaces under GNU, POSIXLY\_CORRECT must
be set, otherwise, _glibc may break the program_{{fn(id=2)}}. In trying to write
-portable, robust, software, there is instead a broken program that doesn't work
-on the most installed system in the world.
+portable and robust software, there is instead a broken program that doesn't work.
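
The behavior described in the quoted manual text can be sketched in C. This is a minimal sketch under stated assumptions, not the ctools implementation: `first_operand` is a hypothetical helper, and the leading `+` in the option string is the glibc extension from the quote above, which other C libraries may interpret differently.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from ctools): scan an env-style command
 * line for the -i flag and return the index of the first operand, or -1 if
 * an unknown option appears before it.
 *
 * The leading '+' in the option string asks glibc's getopt to stop at the
 * first non-option argument instead of permuting argv.
 */
static int first_operand(int argc, char *argv[], int *ignore_env)
{
    int c;

    assert(argc > 0 && argv != NULL && ignore_env != NULL);

    *ignore_env = 0;
    optind = 1; /* start scanning after argv[0] */
    opterr = 0; /* keep getopt from printing its own diagnostics */

    while ((c = getopt(argc, argv, "+i")) != -1) {
        if (c == 'i')
            *ignore_env = 1;
        else
            return -1; /* unknown option before the first operand */
    }

    /* optind now indexes the first argument getopt did not consume. */
    return optind;
}
```

Called with the `env -i PATH=/usr/bin head -n 15` argument vector from the example, a scan like this should stop at `PATH=/usr/bin` and leave `-n 15` in place for `head`, with no need to set POSIXLY\_CORRECT in the environment.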

Why do we even try to standardize? We write standards because we disagree. In
software, we disagree on what algorithm to use to sort a list, or what
-programming language to use to write a client to send and read an email.
-Standards mean that I don't have to care about what you use in order to know
-that my message will be readable, or the list I give you will be sorted. It
+programming language to use to write a client to send or fetch and read an
+email. Standards mean that I don't have to care about what you use in order to
+know that my message will be readable, or the list I give you will be sorted. It
doesn't matter if `/bin/true` is an empty file with the execute bit set or
a C program the entire text of which is `int main(void) { return 0; }`. They
both return a true value that I can rely on in a shell script. When


@@ 70,9 69,11 @@ It's not in GNU's interest to break my program, but it's also not not in GNU's
interest to break my program. GNU wants us to write programs for GNU.
A proliferation of programs that only work under GNU makes it more attractive to
install GNU, and less attractive to install alternatives. There may have been
-historical reasons for this, GNU's not UNIX after all, but these days GNU/Linux
-is way more the 800 pound *NIX gorilla compared to any official
-UNIX{{fn(id=3)}}.
+historical reasons for this, GNU's not UNIX after all, but our software should
+be better. We should expect to be able to use more programs in more environments
+without modifications. A free program is strictly better than a proprietary one
+as it can be modified to run in a different environment, but not everyone can do
+that, and those users deserve to be able to use software as they like as well.

Without standards, even de facto ones, alternatives proliferate. When
programmers can't rely on the properties of the system, they implement their


@@ 83,25 84,21 @@ often be buggier than their standard counterparts. Standards can furthermore be
validated against specifications and different implementations.

We need to both push on standards and work within them. They are both an
-artifact (like ISO C 1999) and a living process, frustratingly ossified and
-a rock solid base to build programs. We should be collaborative in
-developing interfaces, actively working to prevent fragmentation in our
-ecosystems while always being open to innovation. If we do this we can make
-using standards obvious and freeing, rather than difficult and limiting. In
-making and using standards, we free ourselves and we free our users, as we and
-they can be confident that our utilities and interfaces will work, robustly,
-predictably, and everywhere.
+artifact and a living process, frustratingly ossified and a rock solid base to
+build with. We should be collaborative in developing interfaces, actively
+working to prevent fragmentation in our ecosystems while always being open to
+innovation. If we do this we can make using standards obvious and freeing,
+rather than difficult and limiting. In making and using standards, we free
+ourselves and we free our users, as we and they can be confident that our
+utilities and interfaces will work robustly, predictably, and everywhere.

{{renderfootnotes(items=[
  '<a href="https://git.sr.ht/~sircmpwn/ctools" target="_blank">ctools</a> is an
  implementation of strictly conformant core POSIX utilities written in C. The
  self contained nature of every utility makes it a really easy project to
-  contribute to. You can find the source of the `env` command I mentioned
-  <a href="https://git.sr.ht/~sircmpwn/ctools/tree/master/src/env.c"
-     target="_blank">here</a>.',
+  contribute to. <a href="https://git.sr.ht/~sircmpwn/ctools/tree/master/src/env.c"
+  target="_blank">Here is the source of my env implementation</a>.',
  'I sincerely hope that no other utilities on my system rely on behavior that
  will change when POSIXLY\_CORRECT is set, but who am I kidding? 100% chance
  that some other program would break.',
-  'Except perhaps macOS, which is often worse as we have to deal with
-  proprietary bullshit _and_ GNU bullshit (from 2006!).',
])}}

M sass/style.scss => sass/style.scss +1 -0
@@ 137,6 137,7 @@ article {

    margin: (1.5*0.83em) 0;
    display: flex;
    flex-wrap: wrap;
    justify-content: space-between;

    time {