2.3 Looking at a bug

Let's take a moment to talk about a bug—not a software bug, but the other kind of bug.

A peculiar species of flying insect inhabits the area around Lake Victoria, a source of the Nile River. Spectacular video footage by the late explorer Jacques Cousteau shows this insect, known as a lake fly, congregating in thick, fog-like masses on the lake and in a nearby jungle. Similar to mosquitoes in size and appearance, they sometimes form such dense clouds over the lake that one could easily mistake them for waterspouts or small tornadoes.

Large flocks of birds often swoop down into these "bug-spouts" to feast on nature's abundance, consuming millions of insects in a single air raid. Despite the predatory onslaught, though, millions more insects persist, their shadowy swarms exhibiting few signs of attrition.

When Cousteau examined the life cycle of these unusual insects more closely, he discovered that the adults live for an exceedingly short period, about 6 to 12 hours. Even if they don't wind up as lunch for one of our feathered friends, their adult existence consists of little more than a brief flutter in the sunlight.

Just what does an insect that spends its entire adult life stage in a single day do with all that time? It attempts to propagate the species. It tries to squeeze what we human beings spend years doing into a few hours. Evidently it succeeds, for the species continues and in respectable numbers. If survival of the species is their goal, then these tiny winged nothings have set their priorities straight.

These flies have but one thing to do in life, and they do it well. Unix developers believe that software should do the same.

2.4 Tenet 2: Make each program do one thing well

The best program, like Cousteau's lake fly, performs but one task in its life and does it well. The program is loaded into memory, accomplishes its function, and then gets out of the way to allow the next single-minded program to begin. This sounds simple, yet it may surprise you how many software developers have difficulty sticking to this singular goal.

Software engineers often fall prey to "creeping featurism," as it's called in the industry. A programmer may write a simple application, only to find his creative urges taking over, causing him to add a feature here or an option there. Soon he has a veritable hodgepodge of capabilities, many of which add little value beyond the original intent of the program. Some of these inventions may have merit. (We're not talking about stifling creativity here!) But the writer must consider whether they belong in this chunk of code. The following group of questions would be a good starting point for deciding.

  • Does the program require user interaction? Could the user supply the necessary parameters in a file or on the command line?

  • Does the program require input data to be formatted in a special way? Are there other programs on the system that could do the formatting?

  • Does the program require the output data to be formatted in a special way? Is plain ASCII text sufficient?

  • Does another program exist that performs a similar function without your having to write a new program?

For the first three items, the answer to the opening question is usually no. Applications that truly demand direct user interaction are rare. Most programs get along fine without having to incorporate dialogue parsers into their routines. Similarly, most programs can satisfy most needs by using standard input and output data formats. For those cases where a special format is desired, a different general-purpose program can be used to make the conversion. Otherwise, each new program must reinvent the wheel, so to speak.
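To make the questions above concrete, here is a minimal sketch in Python of a program that needs no dialogue with the user. The program name `keep`, the functions `select_lines` and `main`, and the sample input are all hypothetical, chosen only for illustration. Its one parameter arrives on the command line, its input is plain text, and its output is plain text.

```python
import io
import sys


def select_lines(lines, needle):
    """Yield only the lines that contain the text `needle`."""
    for line in lines:
        if needle in line:
            yield line


def main(argv, stdin, stdout):
    """Copy matching lines from stdin to stdout; return an exit status."""
    # The parameter comes from the command line, not from a prompt.
    if len(argv) != 2:
        print(f"usage: {argv[0]} TEXT", file=sys.stderr)
        return 2
    # Plain text in, plain text out: no special formats on either side.
    for line in select_lines(stdin, argv[1]):
        stdout.write(line)
    return 0


# Demonstration with in-memory streams standing in for stdin and stdout.
demo_out = io.StringIO()
main(["keep", "fly"], io.StringIO("lake fly\nbird\nfirefly\n"), demo_out)
print(demo_out.getvalue(), end="")
```

In real use the last three lines would presumably give way to a call such as `main(sys.argv, sys.stdin, sys.stdout)`, so that the program reads whatever another program hands it and leaves any reformatting of the result to a third.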

The Unix ls command is an excellent example of a Unix application gone astray. At last count, it had more than twenty options, with no end in sight. It seems as if the number of options grows with each new version of Unix. Rather than pick on an esoteric feature, however, let's look at one of its more basic functions, specifically, the way it formats its output. ls in its purest form should list the names of the files in a directory (in no particular order) like this:

/home/gancarz -> ls

However, most versions of ls format output like this:

Listing the files in neat columns seems like a sensible thing to do, at first. But now ls contains code that does column formatting, a task that has little to do with listing the contents of a directory. Column formatting can be simple or complex, depending on the environment in which it is used. For example, ls assumes that the user is using an old-style character terminal 80 characters wide. What happens to the columns when invoking ls on, say, a window system in which the terminal window is 132 characters wide? Suppose the user would prefer to view the output in two columns instead of four? What if the terminal uses a variable-width character set? Suppose the user would prefer to follow every fifth line of file names with a solid line? The list goes on.

In all fairness, ls retains the ability to list the contents of a directory one file per line. That is about all it should do, leaving the column work to other commands better suited to formatting tasks. ls would then be a much smaller command: easier to understand, easier to maintain, lighter on system resources, and so on.
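The division of labor described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any real ls is built: `list_names` and `columnize` are hypothetical names, the sample file names are made up, and the columns are filled row by row for simplicity.

```python
import os


def list_names(directory):
    """Return the names in a directory, in no particular order."""
    return os.listdir(directory)


def columnize(names, columns, gap=2):
    """Lay names out in a fixed number of columns, filling row by row."""
    if not names:
        return ""
    # Every column is as wide as the longest name plus a small gap.
    width = max(len(name) for name in names) + gap
    rows = []
    for start in range(0, len(names), columns):
        row = names[start:start + columns]
        rows.append("".join(name.ljust(width) for name in row).rstrip())
    return "\n".join(rows)


# The lister knows nothing about columns; the formatter knows nothing
# about directories. The names below are made up for the demonstration.
sample_names = ["notes", "bin", "src", "todo"]
print(columnize(sample_names, columns=2))
```

A user who prefers four columns to two changes one argument to the formatter. The listing code is untouched, and the same formatter works on any list of names, whether or not it came from a directory.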

Since writing an application that does one thing well results in a smaller program, this tenet and the earlier tenet of keeping programs small complement each other. Small programs tend to be unifunctional. Unifunctional programs tend to be small.

A hidden benefit of this approach is that you remain focused on the current task, distilling it to the essence of what you're trying to accomplish. If you cannot make the program do one thing well, then you probably don't comprehend the problem you're trying to solve. In a later chapter, we'll discuss how to acquire that understanding the Unix way. For now, think small. Do one thing well.

If a lake fly on the Nile can do it, how hard can it be?