and Elementary Actions
Entities should not be multiplied unnecessarily.
William of Occam
The hardware that makes up a computer interface has become formulaic: one or more text input devices (keyboard, handwriting tablet, speech recognizer), a GID, and a two-dimensional display. The formula, not a bad one, has a few variations; for example, a touch screen can be the text input device, the GID, and the display. Microphones for sound, video inputs, and other electronic interfaces are devices that are not, except experimentally, part of the usual human-machine interface. Indeed, we use the interface to control the functioning of these devices.
If you were to watch a person operating today's standard hardware, without seeing what is being displayed and without any idea of what the operator is trying to accomplish, you would not be able to guess what he is doing. There are exceptions: A player's fixated stare at the display, accompanied by manic gyrations of a joystick to the beat of a persistent and repetitive sound track, is a strong hint that a game is being played. But in general, user actions in one application, such as word processing, look much like user actions in others, such as a data entry task or a spreadsheet manipulation.
This uniformity of user actions across applications is a clue that interfaces for various applications are not as different as they might seem when you are involved in using the computer yourself. Applications seem more different than they are because you are attending to the content of what you are doing, that is, to the widely varying semantics of each action. In particular, you are not paying attention to the physical manipulations you are performing.
Another way in which applications are similar is that nearly all of them require text entry. (Even games have you type your name when you do well.) It is thus worth ensuring that text handling, whether in the small, as when the user is entering the string on which to execute a Find command, or in the large, as when the user is writing a long document, is accomplished with a smooth and efficient set of operations.
We and our software are not perfect; not all text-entry keystrokes, pen strokes, or acts of speech will cause the desired characters to be displayed. Therefore, an essential provision of an interface is that keystrokes can be erased with an immediate tap of the Backspace (or Delete) key, with analogous provisions for other forms of input. Larger deletions, such as removing a paragraph, require that the user be able to select regions of text and to delete them. Another fundamental requirement, except for brief inputs, is that the user be able to move the cursor to any point in the text and to insert additional characters. In short, whenever text is to be entered, the user expects to have available many of the capabilities of a word processor.
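The elementary editing operations listed above (erase the last character, delete a selected region, move the cursor, insert at the cursor) can be gathered into one small component that every text field shares. The following is a minimal sketch in Python, not a description of any existing system; the class `TextBuffer` and all of its method names are assumptions made for illustration.

```python
class TextBuffer:
    """A tiny editable text buffer with a cursor (hypothetical sketch)."""

    def __init__(self, text=""):
        self.text = text
        self.cursor = len(text)  # insertion point, always in 0..len(text)

    def move_to(self, position):
        # Clamp so the cursor always stays inside the text.
        self.cursor = max(0, min(position, len(self.text)))

    def insert(self, chars):
        # Insert at the cursor and leave the cursor after the new text.
        self.text = self.text[:self.cursor] + chars + self.text[self.cursor:]
        self.cursor += len(chars)

    def backspace(self):
        # Erase the character just before the cursor, if there is one.
        if self.cursor > 0:
            self.text = self.text[:self.cursor - 1] + self.text[self.cursor:]
            self.cursor -= 1

    def delete_range(self, start, end):
        # Delete a selected region; the two ends may be given in either order.
        lo, hi = sorted((start, end))
        lo = max(0, min(lo, len(self.text)))
        hi = max(0, min(hi, len(self.text)))
        self.text = self.text[:lo] + self.text[hi:]
        self.cursor = lo
```

If a file-name field, a search box, and a document body were all backed by the same buffer type, the editing rules could not differ from one to the next.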
Whenever you are entering text, you are placing it into a document or a field, such as an area provided for you to enter your name on a form. With present systems, the rules about what editing functions are available change from field to field and from one document type, such as a word processor document, to another, such as a spreadsheet. The editing rules can change within a single document in systems that allow the document to include portions generated by separate applications (in Section 5-7, we look at a solution to this problem).
Two different but similar pieces of software on a system form a prime breeding ground for user confusion and frustration. Yet that is precisely what we do have on nearly every personal computer system. My computer has 11 text editors, each with its own set of behaviors, and there may be a few editors that I missed. The situation is unnecessarily confusing.
One important step in creating a humane interface for computers or for computerlike systems, such as a Palm Pilot, is to ensure that the same rules apply whenever text is being entered and when editable text is selected. For example, on the Macintosh or in Windows, you cannot spell-check a file name in situ, so if you are not sure of the spelling of a word and you want to use that word as a file name, you will have to guess at the spelling or launch a word processor and retype or drag the word into that word processor to check it. I suspect that, if I suggested to software developers that users should be able to check file name spellings, they might well add a new feature. Such an ad hoc addition, which would probably take the form of a new menu item in one of the desktop menus, probably the Edit menu, would only increase the already absurd complexity of the software. It is better to simplify by means of the idea presented here, that one command for spell-checking should suffice for any text, whatever role that text might be playing at the moment.
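The idea of one spell-checking command for all text can be sketched in a few lines. This is an illustration under stated assumptions rather than a real spell-checker: the function `misspelled_words` and the tiny `KNOWN_WORDS` set are hypothetical, and a word counts as misspelled simply when it is absent from that set.

```python
import re

# Hypothetical stand-in for a real dictionary.
KNOWN_WORDS = {"the", "quarterly", "report", "is", "attached", "budget"}

def misspelled_words(text, known_words=KNOWN_WORDS):
    """Return the words in text that are not in known_words, in order."""
    words = re.findall(r"[A-Za-z]+", text)
    return [w for w in words if w.lower() not in known_words]

# The same command is applied to text playing three different roles.
file_name = "Quartely budget"
document_body = "The quarterly report is atached."
form_field = "budget"

for role, text in [("file name", file_name),
                   ("document", document_body),
                   ("form field", form_field)]:
    print(role, misspelled_words(text))
```

The function never asks where the text came from. That is the design choice being illustrated: the role of the text is irrelevant to the command.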
Interface design should be such that any objects that look the same are the same. Insisting on this principle results in a boon of simplicity for user and programmer alike and is a concept that extends far beyond text. Every object for which this can be done is an affordance. If a user cannot tell what he may and may not do with an on-screen object by looking at it, your interface fails to meet the criterion of visibility as discussed in Section 3-4. You put the user in the position of having to guess what operations are possible and to guess what will happen when a given operation is performed. Requiring the user to guess at what a piece of software will do is an interface technique more suited to games than to tools.
The ideal of having appearance invariably indicate function is not, in general, achievable. For example, one object can mimic or spoof another. A bitmap of text looks exactly like text, but in current systems, text-editing operations fail on bitmaps. This kind of problem can be partially surmounted if the system always attempts to transform the object into a type to which the operation applies, a notion we discuss in Section 5-8.
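One way to picture "attempt to transform the object into a type to which the operation applies" is a table of converters consulted before an operation runs. The sketch below is a loose illustration, not the mechanism described in Section 5-8. The names `Text`, `Bitmap`, `TO_TEXT`, `as_text`, and `uppercase` are all hypothetical, and the `recognized_text` field merely stands in for what a character-recognition step might produce from real pixel data.

```python
from dataclasses import dataclass

@dataclass
class Text:
    content: str

@dataclass
class Bitmap:
    # Stand-in for pixel data: recognized_text simulates the result of a
    # character-recognition step (an assumption made for this sketch).
    recognized_text: str

# Converters to Text, keyed by source type (hypothetical registry).
TO_TEXT = {
    Text: lambda obj: obj,
    Bitmap: lambda obj: Text(obj.recognized_text),
}

def as_text(obj):
    """Try to transform obj into Text; fail clearly if no converter exists."""
    converter = TO_TEXT.get(type(obj))
    if converter is None:
        raise TypeError(f"cannot treat {type(obj).__name__} as text")
    return converter(obj)

def uppercase(obj):
    """A text operation that first tries to transform its argument into Text."""
    return Text(as_text(obj).content.upper())
```

With this arrangement, `uppercase(Bitmap("hello"))` and `uppercase(Text("hello"))` give the same result, so two objects that look alike also behave alike for this operation. The transformation can still fail for types with no converter, which is why the problem is only partially surmounted.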