This blog post documents my understanding of how the conventions for Unix command line syntax have evolved over time. It’s not properly sourced, and may well be quite wrong. I’ve not been using Unix until 1989, so I wasn’t there for the early years. Maybe someone has written a proper essay on this, with citations. I’m too lazy to dig them up.
In the beginning, in the first year or so of Unix, an ideal was formed for what a Unix program would be like: it would be given some number of filenames as command line arguments, and it would read those. If no filenames were given, it would read the standard input. It would write its output to the standard output. There might be a small number of other, fixed, command line arguments. Options didn’t exist. This allowed programs to be easily combined: one program’s output could be the input of another.
There were, of course, variations. The echo
command didn’t read anything. The cp
, mv,
and rm
commands didn’t output anything. However, the “filter” was the ideal.
In the example above, the cat
program reads all files with names with a .txt
suffix, writes them to its standard output, which is then piped to the wc
program, which reads its standard input (it wasn’t given any filenames) to count words. In short, the pipeline above counts words in all text files.
This was quite powerful. It was also very simple.
Fairly quickly, the developers of Unix found that many programs would be more useful if the user could choose between minor variations of function. For example, the sort
program could provide the option to order input lines without consideration to upper and lower case of text.
The command line option was added. This seems to have resulted in a bit of a philosophical discussion among the developers. Some were adamant against options, fearing the complexity it would bring, and others really liked them, for the convenience. The side favoring options won.
To make command line parsing easy to implement, options always started with a single dash, and consisted of a single character. Multiple options could be packed after one dash, so that foo -a -b -c
could be shortened to foo -abc
.
If not immediately, then soon after, an additional twist was added: some options required a value. For example, the sort
program could be given the -kN
option, where N
is an integer specifying which word in a line would be used for sorting. The syntax for values was a little complicated: the value could follow the option letter as part of the same command line argument, or be the next argument. The following two commands thus mean the same thing:
At this point, command line parsing became more than just iterating over the command line arguments. The dominant language for Unix was C, and a lot of programs implemented the command line parsing themselves. This was unfortunate, but at this stage the parsing was still sufficiently simple that most of them did it in sufficiently similar ways that it didn’t cause any serious problem