So David Albert wrote a tweetstorm about Plan 9 and about generality. I’ve reassembled some paragraphs for ease of quoting:

There is a ton of symmetry between messaging and late binding at the core of OOP, and private name spaces in Plan 9. With messaging in OOP, the decision about what code to run is made dynamically, as late as possible. With private name spaces, each process sees its own file system hierarchy. The /foo/bar/baz that I see might not be the same one you see. In a sense, private name spaces late-bind file contents. This is a big deal when all system functions are accessed using files.

There’s a great quote from Kay in the Early History of Smalltalk that I still don’t fully understand, but that I think applies here.

“Smalltalk is a recursion on the notion of computer itself. Instead of dividing ‘computer stuff’ into things each less strong than the whole — like data structures, procedures, and functions which are the usual paraphernalia of programming languages — each Smalltalk object is a recursion on the entire possibilities of the computer.”

This seems pretty reasonable descriptively, but not really great software engineering: when each object is as powerful as the whole computer, each object can also go wrong in all the ways a whole computer can.

Recently I submitted a bug fix which illustrates one case of this: jgit was willing to write git tree entries with zero-length names. These entries represent, roughly, filenames. So by removing power, I was able to reduce bugs. This is sort of a small case of a power reduction — previously, the domain of the function was approximately all strings; now it’s all-but-one.
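
To make the shape of that fix concrete, here’s a minimal sketch in Python (not jgit’s actual code; the function name and the extra checks are mine):

```python
def check_tree_entry_name(name: bytes) -> bytes:
    """Shrink the domain from 'any string' to 'valid tree entry names'.

    A toy illustration of the idea, not jgit's real check. The
    zero-length case is the bug described above; git also rejects
    NUL and '/' in entry names, so they're excluded while we're here.
    """
    if len(name) == 0:
        raise ValueError("tree entry name must not be empty")
    if b"\x00" in name or b"/" in name:
        raise ValueError("tree entry name contains a forbidden byte")
    return name
```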

But let’s look at a stronger case: OpenSSL. OpenSSL famously had a wide surface area that allowed all sorts of use cases. Unfortunately, most of those use cases were wrong, from a security perspective. Maybe there’s room in the world for a security library where everything is permitted. But mostly I would rather use a library where only correct things are possible.
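
For a concrete taste of the only-correct-things style, here’s libsodium’s Python binding, PyNaCl (my example, not one from David’s thread):

```python
from nacl.secret import SecretBox
from nacl.utils import random as random_bytes

key = random_bytes(SecretBox.KEY_SIZE)  # the only key size that exists
box = SecretBox(key)
ct = box.encrypt(b"attack at dawn")     # nonce generated and attached for you
pt = box.decrypt(ct)                    # authenticated; tampering raises an error
```

As long as you never pass an explicit nonce, you can’t reuse one, and there’s no knob for choosing a weak cipher or skipping the authentication check.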

I guess this isn’t always true — I use a lot of Python, and when I’m writing Python to write SVG files, I don’t bother with an interface that would prevent me from making formatting errors. I just use print statements. But I would probably prefer such an interface if I were programming for external consumption, as opposed to hacking together some throwaway code to get something else done.
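
Concretely, the throwaway version looks something like this:

```python
# Nothing stops me from emitting malformed SVG here; for a one-off
# script, the simplicity is worth it.
print('<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">')
for i in range(1, 5):
    print(f'<circle cx="50" cy="50" r="{10 * i}" fill="none" stroke="black"/>')
print('</svg>')
```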

Those are some special cases, but the most general reason for limiting what your code can do is that limits make analysis easier. Valgrind has to do a tremendous amount of work to show that one particular run of your C code doesn’t have memory errors. Java simply never has that problem (C++ references don’t either). Regular expressions are far less powerful than full parsers, so it’s easier for a human reader to understand what they’re doing. Pure functions and immutable data structures are weaker than impure/mutable ones — but if you use a lot of them, it’s easier to track down where that stupid variable got changed. You can also build abstractions like map-reduce on top of them.
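
A small Python illustration of that last point:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Point:
    x: float
    y: float

    def moved(self, dx: float, dy: float) -> "Point":
        # Pure: builds a new Point instead of mutating this one.
        return Point(self.x + dx, self.y + dy)

p = Point(1.0, 2.0)
q = p.moved(3.0, 0.0)  # p is untouched; the only way a Point "changes"
                       # is at an explicit call site you can search for
# p.x = 99.0           # would raise dataclasses.FrozenInstanceError

# Because moved() has no side effects, mapping it over a batch is
# trivially safe; that's the seed of map-reduce-style abstractions.
shifted = [pt.moved(0.0, 1.0) for pt in (Point(i, 0.0) for i in range(4))]
```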

Which I guess gets to a point that David makes later:

I think the key idea is find uniform interfaces (the message, the file), make them as dynamic as possible, and build a system around that. Another striking thing about Plan 9 is that everything uses 9P – the remote file system protocol – both locally and remotely. If you didn’t have to interact with the outside world, you’d basically have only one network protocol for all services.

But this also reminds me of the STEPS project to build a complete system in 20,000 lines of code (also Alan Kay et al.). To do that, you have to discover powerful abstractions and use them everywhere. Having just one network protocol is a good start.

[rearranged from earlier]

Consider the Plan 9 window manager. It consumes a screen, a mouse, and a keyboard from the computer (/dev/draw, /dev/mouse, etc.), and then re-exports virtual screens, mice, and keyboards to each of the windows that it makes. The programs in each window don’t know they’re in a window. You could run them w/o the window manager and they’d take up the whole screen.

In indexed-color (e.g. 256-color) graphics, which Plan 9 supported, there is a difference between being full-screen and being windowed; when you are full-screen, you have full control over the palette. When you aren’t, you have a sad negotiation problem, because the palette is global hardware state shared by every window on the screen.

Also, in a windowed mode, you can be partially covered up and then exposed, while in a full-screen mode, you can’t. So either the full-screen interface has to contemplate this possibility, or the windowed interface has to be artificially weakened.

Anyway, a file (or series of files) is the wrong interface to a screen. You want a higher-level interface that can do things like scrolling, or playing movies, or drawing textured triangles. These are all often hardware-accelerated, and this matters a lot for smooth graphics. This sort of rich interface is best accessed through a series of functions, which communicate, in part, by reifying objects (“a window”, or “a button”) so that they can be referenced.
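
To sketch the difference (every name here is invented for illustration, not any real API):

```python
# File-as-interface: any bytes are accepted, so a valid draw command
# and line noise have the same type; mistakes surface only when the
# device parses them, if it reports them at all.
def file_style(dev) -> None:
    dev.write(b"rect 10 10 50 50\n")  # meaningful only by convention

# Function-as-interface: "a window" is reified as an object you can
# hold a reference to, and only well-formed operations exist.
class Window:
    def scroll(self, dy: int) -> None: ...
    def play_movie(self, path: str) -> None: ...
    def draw_textured_triangle(self, verts, texture) -> None: ...
```

The second style can also route straight to hardware acceleration; the byte-stream style can only get there by parsing the stream back into calls like these.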

Because I can write any old string to a file, there is nothing that will check for me whether I have written a string that does something meaningful (until I run my program). Plan 9’s use of C’s file-reading APIs makes this even worse: are short reads or short writes possible? What do they mean? Sure, you could document that, but you shouldn’t have to; a good API is itself the documentation of what’s possible.
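
To be concrete about short reads: POSIX read() (and Python’s os.read(), which wraps it) may legally return fewer bytes than requested, so careful callers end up writing a loop like this:

```python
import os

def read_exactly(fd: int, n: int) -> bytes:
    """Loop until exactly n bytes arrive, or fail loudly at EOF.

    Whether short reads can happen, and what they mean, is exactly
    the kind of thing a file interface makes every caller rediscover.
    """
    chunks, remaining = [], n
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:  # EOF before we got everything
            raise EOFError(f"wanted {n} bytes, got {n - remaining}")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```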

And to a reader of code, uniformity makes navigation difficult. What’s this piece of code doing? The same thing as all of the other code: reading and writing some files. At this point, strace is a more useful debugging tool than grep, since at least I can see which file is being read/written by a particular piece of code. Larry Wall once said, “Lisp has all the visual appeal of oatmeal with fingernail clippings mixed in.” There’s more to life than visual appeal, but I do think there’s something to the idea that different tools should look different so you don’t accidentally grab the scalpel when you wanted the cautery pen.

I also don’t believe that local resources should be treated the same as remote resources. This is a seductive idea — they’re just streams of bytes, who cares where they’re stored? And sometimes, it’s reasonable: when you’re building casual software where you’re not going to think too hard about failure cases. But when engineering something that will see heavy use, it matters whether a read failed because of a network failure vs a disk failure. Network failures are recoverable; disk failures more-or-less aren’t. And often a stream isn’t the interface that you want for network communication anyway — something that’s datagram-based and best-effort is better for games and telephony.
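
Here’s the kind of policy difference I mean, sketched in Python (the retry strategy is invented for illustration):

```python
import time

# ConnectionError and TimeoutError are the network-ish, plausibly
# transient failures; any other OSError (e.g. EIO from a failing
# disk) propagates immediately, because retrying won't un-break it.
def read_with_retries(do_read, attempts: int = 3):
    last = None
    for attempt in range(attempts):
        try:
            return do_read()
        except (ConnectionError, TimeoutError) as e:
            last = e
            time.sleep(2 ** attempt)  # back off; the network may recover
    raise last
```

With a uniform read-some-bytes-or-fail interface, the caller can’t even express that distinction.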

And this is why basically nothing is 20,000 lines of code, and when something is, it’s “by shaving off as many requirements of every imaginable kind as you can”. As programmers, we deal with extremely heterogeneous systems. A carpenter might pound a thousand identical nails; we just write a nail-pounding function. So it’s not surprising that we end up with specialized rather than uniform interfaces, and it’s not bad either.

