Be a driver, not a mechanic.
A desktop-based interface has very low efficiency because you accomplish none of your tasks while you are in the desktop. A humane design that has neither desktop nor applications should leave the user involved with content at all times.
As we have seen, we can eliminate files and file names, leaving only one universe, or content space. You do not need to open or close documents; you zoom to them and just start working. You do not open applications; you duplicate a blank document (or whatever). You do not launch a game; you zoom in on it (a multiuser game may even be in progress). Text can be separated into user-defined content areas by means of separators from the character set, by user-chosen words or codes, or by positional distinctions.
As the testing that led to SwyftWare and the Canon Cat demonstrated, neophytes find a system without a desktop or named files very easy to use. However, if experienced users of present systems are not given a few minutes of explanation, they can find these interface concepts so alien to their experience that they seem puzzling. An extreme example of this was when IBM reviewed the Canon Cat for possible interest in the interface. To avoid having to execute a nondisclosure agreement, IBM chose two ex-IBM employees who were experienced personal computer users to review the interface. Their self-chosen modus operandi was to see whether they could quickly learn the interface without manuals, online help screens, or tutoring.
At the end of the time, they reported failure. They had experimented with a range of command techniques, from the slashes of the IBM punch-card-based Job Control Language (an ugly style of delimiter that, amazingly enough, persists to this day in web addresses) to IBM DOS commands, in their attempt to get the word processor started. They were, of course, typing and erasing text as they made these attempts. But they assumed that they were doing this at a "system level."
An interface, as far as is possible, should be self-teaching. This does not mean that an interface is, or can be, intuitive but that a user can readily find comprehensible explanations or instructions when she needs them.
To facilitate learning, a display of instructional text should be presented the first time the product is activated. A tutorial and the complete reference manual should be part of the interface, accessible at any time. Help displays are simply part of the content. No special mechanisms or techniques are required to use them. A LEAP to any keyword and perhaps a few taps of a LEAP Again key will suffice to find the desired information. Alternatively, you can zoom to a help area and then into the particular portion you need. Help systems in GUIs are an addition to the rest of the interface and have their own rules and implementation, which are an additional burden on the user and on the system's implementers. To where do you turn if you need help learning to use the help system in a conventional GUI: the help system's help system? As with lists of commands, which were removed from the realm of programmer-only creations and became a species of ordinary text (in Section 5-4), the help feature of a humane interface can be operated without any special methods.
You're in Locked Text and Try to Type
You can move the cursor into locked text, say, by LEAPing there or just positioning the cursor directly. Now, if you try to type, what should the system do?
The more or less traditional computer method is to give you a beep and possibly flash the screen or the menu bar to indicate that you are doing something wrong and perhaps offer a dialog box with a message saying that you are doing an illegal operation. This is not a humane interface: It gives you an error message that you have to dismiss, and any characters you typed are lost, violating our parallel to Asimov's first law, that all user content is sacred.
It is also unacceptable for the system to move your input point to somewhere else in the document or universe. You have determined what you want to look at, and the machine shall not change it. A number of options do not violate the cognitive principles of humane-interface designing. For example, the screen could split, one part showing the cursor in the locked text, with the material you are typing in the other part, say, just after the locked text. Another solution might be to provide a transparent overlay with the message that you are trying to modify locked text, that same overlay accepting the text you do type, and selecting it so that you can readily move it to wherever you wish. As soon as you move the text to where you wish it to be, the overlay disappears.
Transmitting e-mail is a separate application in most systems. In the system being discussed here, sending e-mail consists of selecting the text, selecting the address, and executing a Send command. Because nearly every operation begins by making a selection and ends by executing a command, the standard sequence of operations becomes habitual, and the operation of sending e-mail does not feel like three distinct steps but only one.
Your e-mail directory is simply your address list or any list containing names and e-mail addresses. It is just ordinary text, not a special mechanism. Say that you want to send an e-mail to your Uncle Herman. You type the note and select it. You then LEAP to Herman Jackson or however you have him listed and then select his e-mail address. Lastly, you type or select from a menu the Send command and execute it as previously described, with the Command key. Instead of a single name, of course, you could have selected a list of names for an e-mail broadcast.
Another approach is to have an e-mail command use the first line of the selection as an address, the second line as the subject heading, and the remainder as the content of the message. A command would be used for making a selection an attachment to an e-mail. No doubt various vendors would supply different methods for sending e-mail. A wise user would buy only one.
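The first-line-address convention just described can be sketched in a few lines. This is an illustration only: the function name, the sample address, and the decision to return the message rather than transmit it are all choices made for the example; a real system would hand the result to a mail transport.

```python
from email.message import EmailMessage

def send_command(selection: str) -> EmailMessage:
    # Interpret a selection as an outgoing e-mail: first line is the
    # address, second line the subject, and the remainder the content.
    address, subject, *body = selection.splitlines()
    message = EmailMessage()
    message["To"] = address.strip()
    message["Subject"] = subject.strip()
    message.set_content("\n".join(body))
    return message  # a real system would pass this to an SMTP transport

note = send_command(
    "herman@example.com\nSunday dinner\nDear Uncle Herman,\nSee you at six."
)
print(note["To"])       # herman@example.com
print(note["Subject"])  # Sunday dinner
```

Because the whole message is just a selection of ordinary text, attaching a different vendor's Send command changes nothing about how the user prepares it.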
Here is a seemingly peculiar method of receiving e-mail that, in practice, works much better than it sounds. When an e-mail arrives, two document characters, which sandwich the growing contents of the e-mail, are inserted immediately before your current cursor location. You can, of course, continue to work as the e-mail streams in. No special window or "You have mail" message is needed. You can ignore the e-mail or read it in progress. At any time, during transmission or afterward, you can select the e-mail and then move it to wherever you wish. An experienced user is likely to have a spot labeled "E-Mail Repository," "E-Mail Goes Here," or some such so that it can be LEAPed to and the e-mail (deposited there automatically as well as being presented at the focus as it arrived) can be read at leisure. The value is that the e-mail, with any attachments, comes in automatically and visibly and becomes part of your text; you already know how to move text to where you want it. Attachments download in the background so that your work is not interrupted and, if there is fear of software viruses, can be placed into a nonexecutable software quarantine. A command that gathers recently received e-mail and puts it into a particular area in your universe is a feature that software vendors might supply.
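A minimal sketch of this arrival mechanism, using a plain string as the text and the brackets ⟦ ⟧ as stand-ins for the two document characters (the class, its method names, and the chosen delimiters are all illustrative):

```python
class Buffer:
    # E-mail streams into the text just before the cursor, sandwiched
    # between two document characters (shown here as the stand-ins ⟦ and ⟧).
    def __init__(self, text=""):
        self.text = text
        self.cursor = len(text)   # insertion point, as an index into text
        self.mail_end = None      # index of the closing ⟧, if mail is arriving

    def type(self, s):            # ordinary typing at the cursor
        self.text = self.text[:self.cursor] + s + self.text[self.cursor:]
        if self.mail_end is not None and self.cursor <= self.mail_end:
            self.mail_end += len(s)
        self.cursor += len(s)

    def mail_begins(self):        # an empty sandwich appears before the cursor
        self.text = self.text[:self.cursor] + "⟦⟧" + self.text[self.cursor:]
        self.mail_end = self.cursor + 1
        self.cursor += 2          # the user keeps working after it

    def mail_chunk(self, s):      # the next chunk arrives, in the background
        self.text = self.text[:self.mail_end] + s + self.text[self.mail_end:]
        if self.cursor >= self.mail_end:
            self.cursor += len(s)
        self.mail_end += len(s)

b = Buffer("Dear diary: ")
b.mail_begins()
b.type("more of my own words")
b.mail_chunk("Lunch Friday? -H")
print(b.text)  # Dear diary: ⟦Lunch Friday? -H⟧more of my own words
```

The point the sketch makes is that nothing modal happens: the user's typing and the incoming mail interleave in one ordinary text.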
Encryption is a matter of selecting some text and applying the encryption command. Encrypted text can be left in place, e-mailed, or otherwise used. Decryption is similar.
Any abilities now provided by applications also fit into the same mental model. Consider spreadsheets: If a selection has the syntax of an algebraic expression and if a Calculate command is executed, the expression is stored, and the results of the expression are displayed in its place. Another command reverses the process and shows the stored expression, which, if necessary, can be edited and recalculated. If expressions are permitted to have variables, if values can be given names, if a syntax that allows position-relative references is provided, if there is some notation for ranges, and if the expressions are put into a table, a fully functional spreadsheet has been created. But it is more than just a spreadsheet, because references to the spreadsheet values can be incorporated anywhere in text, and any value in the text can be incorporated, by name, in the spreadsheet. That is, the results from a spreadsheet in a report can be used in the introduction to the report, and values in the text can be used in the spreadsheet without having to set up any special mechanisms. There are no sharp demarcations among text, mail, and spreadsheets or, for that matter, any other content.
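Such a Calculate command can be sketched by walking the expression's syntax tree. The function names, the restriction to the four basic operators, and the use of a dictionary for named values are assumptions made for this example, not a description of any shipped system:

```python
import ast
import operator as op

# operators permitted in a selection that "has the syntax of an
# algebraic expression" (a deliberate restriction for this sketch)
OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul, ast.Div: op.truediv}

def calculate(selection: str, names: dict) -> float:
    # names maps values defined anywhere in the surrounding text
    # to their numbers, so text and "spreadsheet" freely intermix
    def ev(node):
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.Name):
            return names[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("selection is not an arithmetic expression")
    return ev(ast.parse(selection, mode="eval").body)

# the value "subtotal" could live anywhere in the report's text
print(calculate("subtotal * (1 + tax)", {"subtotal": 40.0, "tax": 0.05}))
```

Walking the tree, rather than handing the string to a general evaluator, is what keeps a selection that is *not* arithmetic from doing anything surprising.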
6-4-1 Cut and Paste
Another problem with conventional interfaces is the cut-and-paste paradigm. Most users of cut and paste have experienced the loss of some work when they inadvertently performed a second cut before pasting the first. When text is deleted, it should not disappear into limbo, especially not into an invisible cut buffer. One solution is to have the cut text appear as the last item in a document that collects deletions. The deletion document is just an ordinary document and can be treated as ordinary text. (It may be useful to have a special command that deletes in a more permanent way.) The essential point here is that nothing mysterious happens to deleted text, and no special command or knowledge is needed to find the document containing deleted text, beyond having to know that the plaintext words, such as "This document contains deleted text," are in it. The user could type in any phrase to use as a target for this document. Of course, any phrase in the recently deleted text can also serve to find it, just as if it hadn't been deleted.
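The deletion document can be sketched as ordinary text that every deletion is appended to. The class, the wording of the marker phrase, and the crude linear search standing in for LEAP are all illustrative:

```python
class Universe:
    MARKER = "This document contains deleted text."

    def __init__(self):
        # the deletion document is ordinary text, findable by its marker phrase
        self.deletion_document = [self.MARKER]

    def delete(self, selection: str) -> str:
        self.deletion_document.append(selection)  # nothing vanishes into limbo
        return selection

    def leap(self, target: str) -> str:
        # crude stand-in for LEAP: find the first line containing the target
        return next(line for line in self.deletion_document if target in line)

u = Universe()
u.delete("first cut paragraph")
u.delete("second cut paragraph")   # the first cut is still safe
print(u.leap("first cut"))         # first cut paragraph
```

A second cut never destroys the first, which is exactly the failure of the invisible cut buffer that the text describes.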
Any humane design for deletion
Has deletion operating no differently from other commands
Puts nothing at risk when text is deleted or moved
Creates no special buffer or other "system-level" or hidden place to where text is moved
Treats single-character deletions no differently from multiple-character deletions
Can be undone and redone
6-4-2 Messages to the User
Always do right. This will gratify some people, and astonish the rest.
Whenever you find yourself specifying an error message, please stop; then redesign the interface so that the condition that generated the error message does not arise. In other words, an error message signals an error, to be sure, but the error is usually in the design of the system or its interface, not on the part of the user. On occasion, the work my associates and I have done in trying to eliminate an error message has resulted in our realizing that fundamental design decisions were incorrect and that design revisions were in order. In this regard, and in this regard only, error messages are a great boon. For example, in designing a package to do arithmetic, there seemed, at first, to be no way to avoid giving an error message when the user tried to divide by 0, but a better solution was the creation of a value called undefined. The IEEE (Institute of Electrical and Electronics Engineers) 754 standard for arithmetic uses NaN, or "not a number," for this. Arithmetic on undefined is well defined (undefined + 3 = undefined, for example), and results involving it are more useful and diagnostic than simply stopping the calculation. Using undefined also solves the problem of what to do when the Calculate command is applied to an object that does not have the syntax of an arithmetic expression. The information that something has gone wrong appears as the user's locus of attention, namely, as the result desired. More informative is to replace the simple undefined by divide-by-0 and other messages as appropriate. Arithmetically, they all behave as does undefined.
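The propagating value can be sketched as a class whose arithmetic always yields itself, mirroring IEEE 754's NaN; the class name and the diagnostic labels are choices made for this example:

```python
class Undefined:
    # Arithmetic on an undefined value stays undefined, so a calculation
    # continues instead of stopping with an error message; the stored
    # reason makes the result diagnostic ("divide-by-0" rather than a beep).
    def __init__(self, reason="undefined"):
        self.reason = reason

    def _propagate(self, other):
        return self

    __add__ = __radd__ = __sub__ = __rsub__ = _propagate
    __mul__ = __rmul__ = __truediv__ = __rtruediv__ = _propagate

    def __repr__(self):
        return self.reason

def divide(a, b):
    return Undefined("divide-by-0") if b == 0 else a / b

print(divide(6, 3))      # 2.0
print(divide(6, 0) + 3)  # divide-by-0
```

The offending result appears exactly where the user is looking, and the rest of the calculation carries on.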
Another example of where the desire to eliminate an error message affected a hardware decision arose in the design of the Macintosh; it is also an example of a "bet the farm" kind of decision that often arises when you are creating a new product. At the time, we were choosing the storage medium for the Macintosh. Hard drives were too expensive to be considered. The 5¼-inch floppy was in nearly universal use, and a number of technologies were competing to replace it. The Macintosh team decided on the 3½-inch floppy, which turned out to be the right decision, as the rest of the personal computing world went the same way. Had IBM, for example, chosen otherwise, users might have found it difficult to buy diskettes for the Mac.
But the choice of the Sony drive was, for us, sealed by a humane interface consideration: Most of the brands of drives we examined allowed you to eject a disk by pressing a button on the disk drive. Unfortunately, nothing prevented you from doing this before your current work had been saved on the disk. In that case, we would need a warning message to let users know that they had made an error: "You have removed the diskette that this document came from. To save your work back onto the diskette, please replace the diskette in the drive." Of course, if the user had already moved away from the computer, taking the disk, the next user would be stuck. Then I learned of a drive that did not have an Eject button but that ejected the disk only on a command from the computer. Now, when you wanted to eject the disk, you signaled the computer, and the computer had time to check that your command was appropriate before ejecting the disk. If your command would lose some data or would otherwise cause a problem, the system could correct the situation before ejecting the disk. We chose the self-ejecting disk drives, which, during the reign of the floppy, was one of the many factors that made the Mac easier to use than its competition.
If a message must be given and no user response is required, the message can usually be eliminated. If, for some reason, the message absolutely must appear, it can be presented as an overlaying transparency, as discussed in Section 5-2-3. With a transparent layer for messages, the underlying screen can be seen and used as if the transparency were not there. In other words, no special action need be taken to dismiss the message. You just work through the message, and the message disappears when you perform any operation on the underlying layer; moving the cursor is not an operation in this sense. Unlike a standard dialog box, no extra user actions are required to dismiss a message, and a message never totally obscures any information on the display, a problem common in today's interfaces.
The claim is sometimes made that responding to a message is necessary to meet a legal standard. This presumption could, I believe, be challenged in court by pointing out that a user closes such message boxes habitually and that closing the message box does not imply that the user has read, or even consciously seen, the message (see Section 2-3-2).
Messages: A Case Study
One of the more remarkable designs that came from eliminating a set of error messages was the set of methods developed by Dr. James Winter for storing and retrieving information from mass storage on the Canon Cat. Users of GUIs are familiar with the many messages associated with storing and retrieving information; for example, you are warned if you try to close a file that has not been saved. Originally, I had proposed that we have a pair of commands, each activated by a dedicated button, for saving and retrieving the user's universe; there was, as I have discussed, no file structure.
Dr. Winter showed that only one command was needed and that this interface would be more secure. My first reaction was that it could not possibly work, a common response to good, radical ideas. After all, what could be simpler than my two commands? Dr. Winter's method did all that he had claimed, and it was implemented in the commercial product, where it proved successful and popular. In particular, many errors made in saving and loading information, which can so often devastate a user's data, could not occur.
To follow his idea requires a bit of background. The Cat had already dispensed with the concept of files in the usual sense, so that all work was stored in a user's workspace. The removable storage in those days, a floppy disk, contained the same amount of information as memory, so the entire workspace was kept whole whether in memory or on disk. One then had the mental model of swapping entire workspaces, a concept users found very comfortable. Once a workspace was swapped in, the computer kept track of whether it had been modified, and every disk was given a probably unique serial number derived from a checksum on its contents.
Winter's idea was this: Only one command, which he called DISK, was needed. It would consider the state of affairs and automatically do the right thing. To demonstrate that this would work, he created a simple chart that showed the action the system would take when the DISK command was activated under various conditions. Here is the central portion of that chart.
                      State of memory
  State of disk     Unchanged     Changed    Empty
  Same              no action     save       no action
  Different         load          warn       load
  Blank             save (dup)    save       no action
If you move a workspace from the disk into memory and have not changed it, the memory state is "unchanged." If that same disk was in the machine and you invoked the DISK command, there is no need for the system to do anything. If the disk had been removed and another inserted into its place (the state of the disk is now "different"), the system knows that no information would be lost by loading the new disk's information into memory. This is because there was a copy of the workspace on the disk from which it had been taken, namely, the disk that was just removed. In the third case, there might be a blank, unused diskette, in which case the user probably wants to make a copy of the workspace, and that's what the system does; diskettes did not have to be formatted, as this was done "on the fly," as necessary. If the system has guessed wrong in any of these cases, no harm has been done.
The rest of the chart is self-explanatory. The term "warn" means that a message was made available that told the user that because the workspace had been changed and the disk in the drive had also been changed, the system could not do anything without losing information. It recommended that if the user wanted to retain the changes, he or she return to the original disk or save the workspace to a blank disk. After that was done, the DISK command would load the new workspace. There was a way to force a load. The system also automatically performed a DISK command if the system were left unused with the workspace in the changed state for a few minutes, adding further safety to the system.
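The chart translates directly into a lookup table. The state names and the function below are illustrative, not the Cat's actual code:

```python
# Action taken when DISK is invoked, keyed by (disk state, memory state),
# transcribed from Dr. Winter's chart.
ACTIONS = {
    ("same", "unchanged"):      "no action",
    ("same", "changed"):        "save",
    ("same", "empty"):          "no action",
    ("different", "unchanged"): "load",
    ("different", "changed"):   "warn",
    ("different", "empty"):     "load",
    ("blank", "unchanged"):     "save (duplicate)",
    ("blank", "changed"):       "save",
    ("blank", "empty"):         "no action",
}

def disk_command(disk_state: str, memory_state: str) -> str:
    # "consider the state of affairs and automatically do the right thing"
    return ACTIONS[(disk_state, memory_state)]

print(disk_command("different", "unchanged"))  # load: no information can be lost
print(disk_command("different", "changed"))    # warn: anything else loses work
```

Note that "warn" is the only cell in which the system cannot proceed safely on its own, which is why it is the only one that involves the user at all.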
Winter had originally proposed an even easier method, based on making use of a disk drive that could detect when a disk was inserted and eject it under program control. Unfortunately, these drives were made by Sony, and we could not convince Canon, a competitor, to use them: an example of making users suffer in the name of corporate pride. Given the Sony hardware, we would have needed no disk-related commands at all, just an Eject button on the drive that would have been read by the system and the disk ejected or not, depending on the chart. At Apple, which did not make disk drives, we were able to specify the Sony drive; however, I did not have the benefit of Dr. Winter's insight at the time I was writing the specifications for the Macintosh project.
In classroom testing, the DISK command was often praised by educators. Aside from requiring almost no class time to learn and wasting no time on formatting diskettes, it prevented one of the most frequent excuses for lost work: If a first student left without saving her or his work and a second student put a disk in and tapped the DISK command, the warning would appear. The second student would then either rush to find the first student or would save the material on a blank disk for the first student, because the second student's disk would not load until that had been done!
Winter's DISK command was easy to use. The instructions reduced to: Whenever you want to do something with the disk, tap this button. It also made every attempt to preserve the users' work. The one-disk-equals-one-workspace concept also vastly simplified the user's mental model; the Cat became a "window" that apparently looked at the information on the diskette.
It has been suggested that the DISK command created a modal situation. The objection, however, occurs only if the DISK command is thought of as Load and Save commands hidden behind the same key. But if you don't know about those commands, as on conventional systems, you may just think of the DISK button as the "do-what-needs-to-be-done-with-the-disk" command. In the event, no mode errors were observed in testing.
If your computer still uses floppy disks, you are probably aware that you must either "format" a floppy before you can use it or purchase "preformatted" floppy disks. On the Canon Cat, disk formatting was done as data was stored, so that it seemed to take no time and the user did not have to be aware that the process existed. Because the user gets no utility from being able to control the formatting process, there is no reason for the user to know that it exists. This is another example of the general principle: If a control must always (or never) be operated, don't provide it.
 Ideally, there would be no commands for saving your work whatsoever. Your material, including all intermediate states of its development, should be automatically preserved, unless you deliberately permanently deleted any or all of it. Our hardware resources at the time did not allow such an implementation.
6-4-3 Simplified Sign-Ons
Users are doing more work than necessary when signing on to most systems. You first state who you are (your "handle," "online name," or "system name") and then you provide a password. The name presumably tells the system who you are, and the password prevents unauthorized persons from using your account.
In fact, you are telling the system who you are twice. All that is logically required is that you type in a password. There is no loss of system security: The probability of guessing someone's name and password depends on how the password was chosen, its length, and the like. Finding the user's online name is usually trivial; in fact, it is commonly made public so that she can be communicated with. A badly chosen password, such as your dog's name, is the most common reason for poor security.
The technical argument that typing two separate strings of characters gives more security is false. If the online name is j characters and the password is k characters, the user, to sign on, must type j + k characters, of which only k characters are unknown to a potential interloper. If the password was chosen randomly (this is the best you can do) from a character set with q characters, the probability of breaking into the account on a single guess is 1/q^k.
Requiring a password of even one additional character and eliminating the requirement that a user supply a name increases the difficulty of guessing the password by a factor of q and saves the user from having to type j - 1 characters and from having to wait for, recognize, and respond to two prompts, or fill in two fields, instead of one. We get greater security, use less screen real estate, and gain greater ease of use by increasing the minimum password length by one character and eliminating the name field. We lose nothing by just dropping the name. Less bothersome security techniques, such as voiceprinting or using fingerprints or another unmodifiable physical characteristic of a user, would be better yet for some applications, although you could not tell a trusted associate how to sign on to your account unless there was also an alternative way to pass the security check.
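The arithmetic can be checked directly. The character-set size and the lengths below are assumptions chosen only to make the example concrete:

```python
# With a character set of size q, only the k password characters are secret,
# so a single random guess succeeds with probability q**-k; the j-character
# name contributes nothing, because it is effectively public.
q = 62        # e.g. upper- and lowercase letters plus digits (an assumption)
j, k = 8, 8   # name length and password length (assumptions)

space_with_name       = q ** k        # the name is known to the interloper
space_longer_password = q ** (k + 1)  # drop the name, add one character

print(space_longer_password // space_with_name)  # security grows by a factor of q
print((j + k) - (k + 1))                         # the user types j - 1 fewer characters
```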
The question arises: How can you ensure that everybody has a unique password in a password-only system? What if two or more users choose the same password? One option is to have the system assign them. Badly implemented, the passwords are unmemorable, such as 2534-788834-003PR7 or ty6*>fj`d%d.
There are many ways of creating memorable passwords, and in such quantity that you can give the user a choice of five or six. For example, you can have the computer choose two adjectives and a noun at random from a large dictionary, and present a list such as
savory manlike oracle
exclusive malformed seal
old free papaya
blooming small labyrinth
rotten turnip sob story
from which the user can choose her favorite. In English, there are at least two trillion such combinations. With a million tries a day, it would take an average of over 2,500 years to guess such a password. That's reasonably secure.
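A generator in this style is easy to sketch. The tiny word lists below, taken from the sample list above, stand in for the large dictionary the scheme assumes:

```python
import random

ADJECTIVES = ["savory", "manlike", "exclusive", "malformed", "old",
              "free", "blooming", "small", "rotten"]
NOUNS = ["oracle", "seal", "papaya", "labyrinth", "turnip"]

def candidate_passwords(count=5, rng=random):
    # two adjectives and a noun, chosen at random, per candidate
    return [" ".join([rng.choice(ADJECTIVES), rng.choice(ADJECTIVES),
                      rng.choice(NOUNS)])
            for _ in range(count)]

for candidate in candidate_passwords():
    print(candidate)

# With realistic list sizes the space is enormous: for instance,
# 20,000 adjectives and 5,000 nouns give two trillion combinations.
print(20_000 * 20_000 * 5_000)  # 2000000000000
```

A production version would draw its randomness from Python's `secrets` module rather than `random`, which is not intended for security-sensitive choices.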
When the idea of improving the interface to a web site or a computer system by simplifying the sign-on process to require only a password is suggested, it is usually rejected on one of two grounds. Either the programmers say that that's just not the way it's done, or they say that they have no control over the sign-on procedure. But someone, of course, does have that control.
6-4-4 Time Delays and Keyboard Tricks
We are likely to have conventional alphanumeric keyboards attached to our computers for a while. In spite of many attempts at reform, such as the Dvorak arrangement of keys, the inertia of the mass of millions of people trained to touch-type on the QWERTY keyboard has proved impossible to overcome. All we can do as interface designers is to peck at the edges and make small improvements that do not require tedious retraining. Here are some small improvements that we might get away with.
Initiating automatic repeat on most keyboards requires that you hold a key for 500 msec, after which the repetition begins. This is an example of a fixed delay; however, there are good reasons to avoid using fixed timed delays in interface design. Any choice of delay is likely to be both too long and too short, depending on the user and the circumstance. In this case, 500 msec is too short when you linger while holding down a key, perhaps because you are thinking about what you want to type next; you may awake from your reverie to discover a few lines of ssssssssssssssssssss on your page. (My cat is adept at eliciting this behavior from my computer.) A slow typist or a person who suffers from any of a number of neurological or physiological problems may also find the 500 msec autorepeat delay too short.
Nevertheless, 500 msec is also too long. For one thing, delays are just that: delays. The user has to wait for an effect to take place. An example that users find particularly annoying occurs in the Macintosh interface: To change a file name after you open a volume or a folder, you click on the name and then wait for about half a second for a special border or color to appear, indicating that it is now editable. The reason for doing this was to allow you to select a file name with a simple click, without risking accidentally editing it. Once the file name changes to the preeditable state, you must click on the name again to put the system in the editing state. Evidence that the delay annoys users comes both from interviews and from the number of times that magazines mention tricks for getting around the delay. Users do not like to be forced to wait.
John Bumgarner, while working at Information Appliance, came up with an elegant solution to the autorepeat problem. He began with the observation that in most phonetic languages, the same character is almost never typed three times in a row. He also observed that autorepeat is rarely used unless more than five instances of the repeated character are required; otherwise, the user simply types the character the desired number of times. His autorepeat software started to autorepeat if a key was held down for more than 100 msec after the third consecutive instance of a key being typed. In other words, to get a line of equal signs, you would type
= = =
holding the key down on the third press. You would then release the Equal-Sign key once a run of equal signs of the desired length had been produced.
Pressing the same key repeatedly is faster than typing different letters, and a GOMS analysis shows that the expected time for starting autorepeat drops from the conventional method's 700 msec to 400 msec. Bumgarner's autorepeat proved easy to learn and, in our testing, was never activated accidentally. (It won't autorepeat even if your cat sits on the keyboard.) One negative that it shares with the standard autorepeat method: It is an invisible function, labeled nowhere on the computer.
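Bumgarner's rule can be sketched as a predicate over recent key events. The event representation, the function name, and the sample times are all illustrative:

```python
def should_autorepeat(events, now_ms):
    # events: (key, press_ms, release_ms) in typing order; release_ms is None
    # while the key is still held. Repeat begins only when the same key has
    # been struck three times running and the third press is held > 100 ms.
    if len(events) < 3:
        return False
    (k1, _, _), (k2, _, _), (k3, press3, release3) = events[-3:]
    return (k1 == k2 == k3
            and release3 is None
            and now_ms - press3 > 100)

strokes = [("=", 0, 60), ("=", 150, 210), ("=", 300, None)]
print(should_autorepeat(strokes, now_ms=420))  # True: a line of = begins to grow
print(should_autorepeat(strokes, now_ms=350))  # False: the third press just landed
```

Because three identical strokes in a row almost never occur in ordinary prose, the predicate stays quiet during normal typing, which matches the testing result that it was never activated accidentally.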
Well-designed computers and information appliances have chord keyboards, so that the software can recognize multiple keys held down simultaneously. Older and more primitive computers had keyboards that could recognize only a few special keys, such as Shift, as being pressed at the same time as other keys. Chord keyboards allow us to solve a number of otherwise difficult interface problems. For example, consider overstriking: a logical method is needed for creating two symbols that appear in the same character location. If you wanted to create a dollar sign by overstriking an s with a vertical bar ( | ), you should be able to use a temporal "overstrike" on the keyboard, pressing both keys together, to represent the physical overstrike.
This would not conflict with the overlapping keystrokes that occur in normal high-speed typing, whereby the key pressed first is released only after one or more other keys have been struck. The word the is often typed not as
t-down t-up h-down h-up e-down e-up
but, to give one of many possible sequences, as
t-down h-down t-up e-down h-up e-up
Modern keyboards and their enabling software can accommodate these overlapped keystrokes, a phenomenon called rollover. Most keyboards have n-key rollover, which means that the system will not get confused if as many as n keys are held down at once while typing continues. Given human anatomy, there is little need for n to be greater than 10, although there is no technical reason for n to be limited at all if the computer has a chord keyboard.
Given the convention of creating overstrikes by holding one key while tapping another, accents and diacritical marks could be treated as overstrikes and handled in a uniform way. For example, é, as in the name Dupré, is produced on the Macintosh with the unguessable key sequence Option-e followed by e.
Note that this is an example of a modal, verb-noun method, a violation of Apple's own guidelines. It also operates inconsistently. If you type Option-e followed by t, you get a quote followed by a t, not a t with an acute accent, as you might expect.[*]
[*] If you guessed Option-e t t to get an accented t, you'd still be wrong.
Typing accents and diacritical marks becomes simpler and more logical if overstriking is accomplished with a quasimode:
You hold down e and, while holding it down, tap the accent key. You could also type é by overstriking in the reverse order, holding down the accent key and tapping the e.
It makes no difference, logically, in which order you perform the operations.
Overstriking is also useful in creating mathematical and other special symbols and for some computer languages, such as APL. You might argue that rather than accommodate overstriking, we should just add whatever characters we need to the character set; after all, we have fully bitmapped displays. True, but not all of us want to take the time, or have the skills, to design and install a new character and add it to every font in which we want the new character to appear. Also, it seems absurd that on a modern computer we cannot accomplish what we used to be able to do with the lowly mechanical typewriter!
Overstriking need not be limited to two characters; any number can be overlaid. For example, holding down s while tapping the vertical bar ( | ) and the slash ( / ) would produce a dollar sign with a slash through it. Considerations of esthetics and readability, rather than hardware or software concerns, should form the only limits to overstriking.
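Order independence falls out naturally if an overstrike is represented as the *set* of keys held together. The composition table below is illustrative, not a proposal for any particular character set:

```python
# An overstrike is the set of keys held down together, so the order in
# which they were pressed cannot matter; frozenset captures that directly.
OVERSTRIKES = {
    frozenset(["s", "|"]): "$",
    frozenset(["e", "´"]): "é",
}

def compose(held_keys):
    # fall back to the keys themselves if no composition is defined
    return OVERSTRIKES.get(frozenset(held_keys), "".join(held_keys))

print(compose(["e", "´"]))  # é
print(compose(["´", "e"]))  # é: logically, order makes no difference
print(compose(["q"]))       # q
```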
To give immediate feedback to the user as each key is typed, assuming that n-key rollover and this book's overstriking technique are both operating, the interface may have to temporarily display a pair of overstruck characters as adjacent characters. The reason is that it cannot tell the difference between overstriking and rollover until the keys are released, at which time overstruck characters would coalesce automatically. I will mention here an essential keyboard reform that I touched on earlier: the elimination of the Caps-Lock key, which introduces a mode.