So David Albert
wrote a tweetstorm about Plan 9 and about generality. I’ve reassembled some
paragraphs for ease of quoting:
There is a ton of symmetry between messaging and late binding at the
core of OOP, and private name spaces in Plan 9. With messaging in
OOP, the decision about what code to run is made dynamically, as
late as possible. With private name spaces, each process sees its
own file system hierarchy. The /foo/bar/baz that I see might not be
the same one you see. In a sense, private name spaces late bind
file contents. This is a big deal when all system functions are
accessed using files.
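The late binding described above can be modeled as a toy sketch in Python — a loose analogy, not real Plan 9, and every name in it is invented:

```python
# Toy model (not real Plan 9): each "process" carries its own name
# space, so the same path can late-bind to different contents.

class Process:
    def __init__(self, namespace):
        # namespace maps a path to whatever serves that path
        self.namespace = dict(namespace)

    def bind(self, path, server):
        # like Plan 9's bind(1): mount a different server at a path
        self.namespace[path] = server

    def read(self, path):
        # what /foo/bar/baz "is" gets decided here, at read time,
        # per process -- as late as possible
        return self.namespace[path]()

real_network = lambda: "packets from eth0"
recorded_network = lambda: "packets replayed from a capture file"

p1 = Process({"/net": real_network})
p2 = Process({"/net": real_network})
p2.bind("/net", recorded_network)   # only p2 sees the replay

print(p1.read("/net"))  # packets from eth0
print(p2.read("/net"))  # packets replayed from a capture file
```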
There’s a great quote from Kay in the Early History of Smalltalk,
that I still don’t fully understand, but I think applies here.
“Smalltalk is a recursion on the notion of computer itself. Instead
of dividing ‘computer stuff’ into things each less strong than the
whole — like data structures, procedures, and functions which are the
usual paraphernalia of programming languages — each Smalltalk object
is a recursion on the entire possibilities of the computer.”
This seems pretty reasonable descriptively, but not really great prescriptively: often I’d rather remove power from my code than add it.
Recently I submitted a bug fix
which illustrates one case of this: jgit was willing to write git tree
entries with zero-length names. These entries represent, roughly,
filenames. So by removing power, I was able to reduce bugs. This is
sort of a small case of a power reduction — previously, the domain of
the function was approximately all strings; now it’s all-but-one.
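A sketch of the shape of that fix — the names here are hypothetical, and the real change lives in jgit’s Java, not in this toy:

```python
# Shrink the function's domain from "all strings" to
# "all strings except the empty one", so the invalid case
# is rejected at the boundary instead of written to disk.

def write_tree_entry(name: str, sha: str) -> str:
    if not name:
        raise ValueError("tree entry name must be non-empty")
    # illustrative layout only, not git's exact on-disk format
    return f"100644 {name}\0{sha}"

write_tree_entry("README", "abc123")    # fine
try:
    write_tree_entry("", "abc123")      # previously written silently
except ValueError:
    pass                                # now a bug that can't happen
```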
But let’s look at a stronger case: OpenSSL. OpenSSL famously had a
wide surface area which allowed all sorts of use cases.
Unfortunately, most of those use cases were wrong, from a security
perspective. Maybe there’s room in the world for a security library
where everything is permitted. But mostly I would rather use the
library where only correct things are possible.
I guess this isn’t always true — I use a lot of Python, and when I’m
writing Python to write SVG files, I don’t bother with an interface
that would prevent me from making formatting errors. I just use print
statements. But I probably would prefer the interface if I were
programming for external consumption, as opposed to hacking together
some throw-away code to get something else done.
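For concreteness, the throw-away print-statement style of SVG writing meant here might look like this (my own toy example — nothing checks the markup until a browser chokes on it):

```python
# Quick-and-dirty SVG via print: nothing stops me from emitting a
# malformed attribute, and for a one-off hack that's fine.

def circle(cx, cy, r):
    return f'<circle cx="{cx}" cy="{cy}" r="{r}" fill="black"/>'

print('<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">')
print(circle(50, 50, 40))
print('</svg>')
```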
Those are some special cases, but the most general reason for limiting
what your code can do, is that limits make analysis easier. Valgrind
has to do a tremendous amount of work to show that one particular run
of your C code doesn’t have memory errors. Java simply never has that
problem (C++ references don’t either). Regular expressions are far
less powerful than full parsers, so it’s easier for a human reader to
understand what they’re doing. Pure functions and immutable data
structures are weaker than impure/mutable — but if you use a lot of
them, it’s easier to track down where that stupid variable got
changed. You can also build abstractions like map-reduce on top of them.
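A minimal sketch of that last point — the word-count example and names are mine:

```python
# Because the functions are pure and the data immutable, the
# reduction can be reordered or parallelized without changing the
# result -- the weakness is what makes the analysis easy.
from functools import reduce

words = ("to", "be", "or", "not", "to", "be")

def count(acc, word):
    # pure: returns a new dict instead of mutating acc
    return {**acc, word: acc.get(word, 0) + 1}

counts = reduce(count, words, {})
print(counts)  # {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```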
Which I guess gets to a point that David makes later:
I think the key idea is find uniform interfaces (the message, the
file), make them as dynamic as possible, and build a system around
that. Another striking thing about Plan 9 is that everything uses
9P – the remote file system protocol – both locally and remotely.
If you didn’t have to interact with the outside world, you’d
basically have only one network protocol for all services.
But this also reminds me of the STEPS project to build a complete
system in 20,000 lines of code (also Alan Kay, et al). To do that,
you have to discover powerful abstractions and use them
everywhere. Having just one network protocol is a good start.
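A toy sketch of the “one protocol for all services” idea — invented names, and nothing like real 9P framing:

```python
# Every service answers the same two messages, read and write,
# so one dispatcher (and one client library) covers all of them.

class ClockService:
    def read(self, path):
        return "12:00"

class EchoService:
    def __init__(self):
        self.last = ""
    def read(self, path):
        return self.last
    def write(self, path, data):
        self.last = data

mounts = {"/dev/time": ClockService(), "/dev/echo": EchoService()}

def dispatch(op, path, data=None):
    # the only protocol: (op, path[, data]) -> reply
    service = mounts[path]
    return service.read(path) if op == "read" else service.write(path, data)

dispatch("write", "/dev/echo", "hello")
print(dispatch("read", "/dev/echo"))  # hello
print(dispatch("read", "/dev/time"))  # 12:00
```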
[rearranged from earlier]
Consider the Plan 9 window manager. It consumes a screen, a mouse,
and a keyboard from the computer (/dev/draw, /dev/mouse, etc.)…
…and then re-exports virtual screens, mice, and keyboards to each
of the windows that it makes. The programs in each window don’t
know they’re in a window. You could run them w/o the window manager
and they’d take up the whole screen.
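That multiplexing can be modeled in a few lines — a toy model, not the real /dev/draw protocol:

```python
# The window manager consumes the real screen and re-exports a
# virtual screen per window; client code can't tell which it got.

class Screen:
    def __init__(self):
        self.pixels = []
    def draw(self, x, y):
        self.pixels.append((x, y))

class WindowScreen:
    """Same interface as Screen, but translated into a window's rectangle."""
    def __init__(self, real, dx, dy):
        self.real, self.dx, self.dy = real, dx, dy
    def draw(self, x, y):
        self.real.draw(x + self.dx, y + self.dy)

def client(screen):
    # runs identically whether it has the whole screen or a window
    screen.draw(0, 0)
    screen.draw(1, 1)

real = Screen()
client(real)                          # full-screen
client(WindowScreen(real, 100, 50))   # same code, inside a window

print(real.pixels)  # [(0, 0), (1, 1), (100, 50), (101, 51)]
```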
In indexed-color (e.g. 256-color) graphics, which Plan 9 supported,
there is a difference between being full-screen and being windowed;
when you are full-screen, you have full control over the palette.
When you aren’t, you have a sad negotiation problem.
Also, in a windowed mode, you can be partially covered up and then
exposed, while in a full-screen mode, you can’t. So either the
full-screen interface has to contemplate this possibility, or the
windowed interface has to be artificially weakened.
Anyway, a file (or series of files) is the wrong interface to a
screen. You want a higher-level interface that can do things like
scrolling, or playing movies, or drawing textured triangles. These
are often hardware-accelerated, and this matters a lot for smooth
graphics. This sort of rich interface is best accessed through a
series of functions, which communicate, in part, by reifying objects
(“a window”, or “a button”) so that they can be referenced.
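As a sketch of the kind of reified interface meant here — all names hypothetical:

```python
# The display hands back objects ("a window") that later calls can
# reference, instead of anonymous bytes written to a file.

class Window:
    def __init__(self, title):
        self.title = title
        self.scroll_y = 0
    def scroll(self, dy):
        self.scroll_y += dy
    def draw_triangles(self, vertices):
        # in a real system this is where hardware acceleration lives
        return f"drew {len(vertices) // 3} triangle(s) in {self.title!r}"

class Display:
    def create_window(self, title):
        return Window(title)   # the reified object

d = Display()
w = d.create_window("game")
w.scroll(10)
print(w.draw_triangles([(0, 0), (1, 0), (0, 1)]))  # drew 1 triangle(s) in 'game'
```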
Because I can write any old string to a file, there is nothing that
will check for me whether I have written a string that does something
meaningful (until I run my program). Plan 9’s use of C’s file reading
APIs makes this even worse: are short reads or short writes possible?
What do they mean? Sure, you could document that, but you shouldn’t
have to; a good API is the documentation about what’s possible.
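A sketch of what that unstated contract forces on every caller, using POSIX-style reads via Python’s os module (the helper name is mine):

```python
# With a byte-stream interface, the caller must know that short
# reads are possible and loop; a typed API would encode that in
# its signature instead of in documentation.
import os

def read_exactly(fd, n):
    """Keep reading until n bytes arrive or EOF; os.read may return less."""
    chunks = []
    while n > 0:
        chunk = os.read(fd, n)
        if not chunk:               # EOF before we got everything
            break
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

r, w = os.pipe()
os.write(w, b"hello")
os.close(w)
print(read_exactly(r, 5))  # b'hello'
os.close(r)
```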
And to a reader of code, uniformity makes navigation difficult. What’s
this piece of code doing? The same thing as all of the other code:
reading and writing some files. At this point, strace is a more useful
debugging tool than grep, since at least I can see which file is being
read/written by a particular piece of code. Larry Wall once said,
“Lisp has all the visual appeal of oatmeal with fingernail clippings
mixed in.” There’s more to life than visual appeal, but I do think
there’s something to the idea that different tools should look
different so you don’t accidentally grab the scalpel when you
wanted the cautery pen.
I also don’t believe that local resources should be treated the same
as remote resources. This is a seductive idea — they’re just streams
of bytes, who cares where they’re stored? And sometimes, it’s
reasonable: when you’re building casual software where you’re not
going to think too hard about failure cases. But when engineering
something that will see heavy use, it matters whether a read failed
because of a network failure vs a disk failure. Network failures are
recoverable; disk failures more-or-less aren’t. And often a stream
isn’t the interface that you want for network communication anyway —
something that’s datagram-based and best-effort is better for games.
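A sketch of what best-effort looks like, with standard UDP sockets and invented names:

```python
# Fire-and-forget game state over UDP. If a packet is lost we don't
# retry: the next tick's state supersedes it, which is exactly the
# behavior a reliable byte stream would fight against.
import json
import socket

def send_state(sock, addr, tick, positions):
    packet = json.dumps({"tick": tick, "pos": positions}).encode()
    sock.sendto(packet, addr)       # no acks, no retransmits

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))     # any free port on loopback
receiver.settimeout(2.0)
addr = receiver.getsockname()

send_state(sender, addr, 42, [[1, 2]])
data, _ = receiver.recvfrom(4096)
print(json.loads(data)["tick"])  # 42
```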
And this is why basically nothing is 20,000 lines of code, and if
anything is 20,000 lines of code, it’s “by shaving off as many
requirements of every imaginable kind as you can”.
As programmers, we deal with extremely heterogeneous systems. A
carpenter might pound a thousand identical nails; we just write a
nail-pounding function. So it’s not surprising that we end up with
specialized rather than uniform interfaces, and it’s not bad either.