5-4 String Searches and Find Mechanisms | The Humane Interface. New Directions for Designing Interactive Systems ACM Press Series


Team-Fly

	Human Interface, The: New Directions for Designing Interactive Systems By Jef Raskin
	Table of Contents

	Chapter Five. Unification

One giant leap for mankind.
?span>Neil Armstrong (1969)

Before going further into LEAP, it is useful to treat the subject of the interface to searches with somewhat more precision. A string is a sequence^[6] of characters; ordinary English words and sentences are examples of strings. String searches look through a (usually lengthy) string, called the text, for an instance of a (usually brief) string that the user specifies, called the pattern. Each occurrence of a substring of the text that matches the pattern is called a target. For example, if you were trying to find where you had written of a cat called "little Tatsu" in a long letter, little Tatsu is a good choice of target, and you might choose the briefer string Tatsu as the pattern to use in the search. The match may be exact, may be case-independent, or may represent another relationship between the pattern and the target; they might rhyme, for example. A commonly used matching criterion, one that tests well, is that lowercase letters in the pattern match either uppercase or lowercase letters in the text but that uppercase characters in the pattern match only uppercase characters in the text. Searches usually start from the current cursor location and continue forward through the text. In most systems, a modal user preference setting can direct the search to proceed backward through the text (Figure 5.3).

^[6] I use the term sequence as mathematicians do, to mean a set of objects having a first object, a second, and so forth.

Figure 5.3. A modal-search dialog box with modal-search types and direction settings.

graphics/05fig03.gif

Interfaces to searches are typically based on either of two search interface strategies. The most common strategy is the delimited search, found in most word processors. In a typical delimited search, the user enters a mode in which typing or other method of text input is regarded not as text but as a pattern. This is usually accomplished by using a dialog box containing a field into which the user can enter characters. After summoning the dialog box, the user types the pattern, followed by a delimiter, which is usually a character, such as Return, that is not permitted to occur in the pattern. In most dialog boxes, the user may also limit the pattern by using the GID to click a button with a label such as OK, Search, Find, or Find Next. When it is found in the text, a target is selected, and the cursor is placed immediately at the end of the selection.

This traditional method is rather punishing to the user, although most computer aficionados have become so accustomed to it that they no longer feel the pain. For example, an individual can type in a search string, make a typo unnoticed until too late, of course, because he pressed Return by habit and sit there, waiting for a search that he knows will fail. Most searches are not interruptible: a serious design error. Because the computer waits until the user has finished the pattern before beginning its search, delimited searches often keep him waiting unnecessarily.

The less common strategy is the incremental search, a popular example of which is found in EMACS, an editor used with the UNIX operating system (Stallman 1993). In most implementations of incremental searches, as with the delimited search, the user first summons a dialog box that contains a field in which the user can enter the pattern. When he types the first character of the pattern, however, the system uses this character alone as a complete pattern and immediately begins to search for the first instance of that character in the chosen search direction. If an instance of that first character is found before the next character of the pattern is typed, the instance is selected and the cursor placed immediately after the end of the selection. If the next character of the pattern is typed before an instance is found, that character is added to the pattern and the search continues, now looking for an instance of the now extended pattern. This method is repeated as each character is added to the pattern.

Using the Backspace or Delete key to delete a character from an incremental search pattern should return the search to the previously found instance that matched the pattern, as it was before the deleted character had been added to the pattern. The user can then add to the pattern, and the search continues without his losing the search already accomplished on the partial pattern. Many implementations do not have this desirable characteristic.

Incremental searching has a number of other advantages over delimited searching. Incremental searching wastes less of a user's time: The search begins when the first character of the pattern is typed; the system does not wait until the pattern is complete. With a delimited search, the computer waits for the user to type the pattern and delimit it, after which it is the user who waits while the computer does the search. When using a delimited search, the user must guess, beforehand, at how much of a pattern the computer needs to distinguish the desired target from other, similar targets; with an incremental search, he can tell when he has typed enough to disambiguate the desired instance, because the target has appeared on the display. Thus, as soon as he sees that he has reached the point desired, he can stop entering the pattern. If he types too many letters of the pattern if the hand is quicker than the search the pattern will still match, and the cursor will stay approximately where he wanted it. If he mistypes a pattern in a delimited search, he must wait until the search for the incorrect pattern is complete or, at best, he can operate a mechanism for stopping the search before he can correct his error. In a large text, the search can take a considerable period of time. In a well-implemented incremental search, the user can backspace at any time and be returned to the last match found. Because backspacing after an error is habitual, the process of correcting an error is very fast, and the search stops immediately. He can also resume the search by typing the correct character.

Another advantage of incremental search is that, as the user enters the pattern, it provides constant feedback as to the results of the search. With a delimited search, he does not find out whether his pattern is suitable or even if it was typed correctly until it is fully entered and a search attempt has been made. From the point of view of interface engineering, the advantages of incremental searching are so numerous and the advantages of delimited searches so few that I can see almost no occasions when a delimited search would be preferred.^[7] In spite of near universal agreement about the desirability of incremental searches on the part of both designers and users, almost all interface-building tools make it easy to implement delimited searches and difficult or impossible to implement incremental searches. JavaScript and Visual BASIC are two examples of such.

^[7] A search is either incremental or excremental.

Building a pattern incrementally allows the user to adjust the pattern interactively during the search, which leads the user to improve his search strategies through the feedback received. Even building a Boolean search pattern is made more effective when the early results of the pattern appear as the user adds to the pattern's specificity. The found instance should appear in the middle of a display area, not at the top or the bottom. Thus, material both before and after the instance is visible; that is, the found instance is displayed in context. The instance of the pattern found should always appear in the same place relative to the display or window so that the user soon learns where to look for the result of the search. In the Canon Cat, it always appeared at the vertical center of the display. It should not appear adjacent to any edge of the display area, so that material on all sides of the found instance is shown.

If an instance of the pattern does not occur in the text, the search fails. Many systems cease operating in this event and remain unusable until a particular key typically Enter or Return is pressed or a particular on-screen button is clicked. A modal message is placed on the display to let you know that you must make the required obeisance before you will be allowed to continue using the computer. In multidisplay systems or if the screen is visually busy, this message may be nowhere near your locus of attention. You may not notice it at all. It then seems to you that the computer will not respond to the keyboard; it seems to have crashed. In an incremental search, it is clear, without any special indication, that a search has failed; the cursor returns to its original location, and additional keystrokes do not have any effect. A short beep or a screen flash can also be helpful, especially if the search exceeded the duration of short-term memory, say, 10 seconds, so that the user has forgotten the appearance of the display prior to the search. The beep is also a useful cue for visually impaired users.

5-4-1 Search-Pattern Delimiters

Another major deficiency of delimited searches is that the delimiter used to end the pattern cannot be typed into the pattern. Often, other delimiters are excluded as well. I looked at four popular word processors, one did not allow a Return in a search pattern at all. In the second word processor, the user types ^r to insert a return into the search pattern. In the third, \\is used; and in the fourth, a Return is put into a search pattern by the use of a special dialog box with a pull-down menu of delimiters (Figure 5.4). It is easier to simply press the Return key when creating a pattern. After all, that's how you enter it into text; why should it be different in entering a pattern? The general principle is that the same sequence of characters should always be typed the same way; you should not have to use one method here and another method there. Another way of saying this is that there should be nothing special about special characters.^[8]

^[8] A parallel problem is the use of reserved words in programming languages.

Figure 5.4. The Word search box opened to show the list of characters that can be inserted. With a better search design, the user could put, say, a tab in a search string just by tapping the Tab key. Note the two codes for tab characters inserted into the pattern (the Find What box).

graphics/05fig04.gif

Although incremental searches are better than delimited searches, further interface improvement over the EMACS style of incremental search is possible. For example, the search should leave the cursor on the first character of the target rather than on the last. In general, you have little control over what will be the last character in your pattern, because you type just enough pattern to find your target. Thus, you do not know exactly where the cursor will be when you are finished. If the cursor lands on the first character of the pattern, you can predict how the target will appear. This also means that the search mechanism can also be used to quickly position the cursor in text, because your locus of attention is the character on which you want the cursor to be placed. Trying to invent a pattern where this is the last character in the pattern is far more difficult than typing the desired character and whatever follows it on the display.

In conventional GUIs, both delimited and incremental searches are initiated modally, by means of a dialog box. LEAP is modeless. The concept of using a quasimode for searches can be extended to a button on a microphone (or GID) that is held to create a quasimode during which words, sketches, or handwritten letters are used to create a search pattern. Other input techniques have analogous means for creating a search quasimode (Raskin and Winter 1991).

The speed of incremental searches is increased when a few implementation tricks are used; for example, when you type the first character of the search string, the computer instantly proceeds to find the first instance of that character in the text, which is highlighted and, if it is not there already, moved with its context into the display window. This is usually a fast process because there are a lot of potential matches, and one is likely to be close in text. While waiting for the user to type the next character, a search can be started for the next instance of the first letter, followed by each of the next possible characters, working in decreasing frequency-of-use order to maximize the probability of an early hit. The program can save pointers to each as they are found. When the second character is typed, the computer will often be ready to display the found target, seemingly instantly.

String searches can also be speeded by such methods as the Boyer-Moore algorithm for fast string search (Moore and Boyer 1977), whereby search time decreases as search-string length increases. In case the user backspaces through the pattern, keeping a pointer to the last-found location one for each character in the search string can make going back blindingly fast. Indexing all local mass-storage devices can make searching over a local system or network take only milliseconds. Interaction at high speeds, with wide area networks or the web, also depends on indexing methods. Incremental searching, in various versions, has been available in such commercial products as Borland's IDE, Global Village's fax program, the Canon Cat, and SwyftWare.

5-4-2 Units of Interaction

Incremental searches are just one example of a broader humane-interface design principle: A program should interact with you on the basis of the smallest meaningful unit of input. Interaction with input from a keyboard should be on a character-by-character basis, not on a line-by-line basis. Voice input should interact on a word-by-word basis; for some applications, interaction could be on a morpheme-by-morpheme basis, and so on.

Interaction by means of entering an entire line of text, that is, text delimited by a Return, at a time is a holdover from the days of the Teletype and should be relegated to a museum alongside the hardware of that time.^[9] Today, we can and should have interfaces able to react to each character as you enter it, whenever such a reaction could improve the quality of the interaction. As always, designers must be considerate: Character-at-a-time interaction should not be used to deliver spelling-error messages in the middle of typing a word, correctly judged a nuisance by touch typists.

^[9] There is an ever increasing need for a museum of interaction (MOI), as we have museums for hardware and software. The Internet might be an appropriate home.

In the small texts for which string searches were originally designed, searching usually proceeded from the current cursor position to the end of the text. In larger texts, it is generally a good idea to continue the search, if it has not found a target by the time it has reached the end, circularly from the beginning back to the cursor location, in case the user has forgotten that the desired target is farther back in the text. Unpublished testing at Information Appliance demonstrated that, if the search is fast, the circular search is especially favored by users. Fast means here that there is not enough time for a user action between the time the search is launched and the time it succeeds or fails: as usual, about 250 msec. In many systems, the user can also choose to have the search stop at the end of the file or search circularly. This causes a typical mode problem: If the search is set to noncircular and the user is not aware of the setting, a "string not found" message can give the misimpression that there is no instance of the pattern in the text. Or, what I have seen happen on many occasions, the user repeats the search a number of times because he distinctly recalls that there was an instance of the pattern in the text, and he is puzzled and frustrated by the search's repeated failure. It can be many seconds or even minutes before he either figures out the problem or gives up in frustration. If it is necessary to have different kinds of searches, this can be accomplished modelessly by using a different command or on-screen button to launch each kind of search.

In many circumstances, modality can be avoided by having a set of launch buttons, one for each variety of an operation, instead of having means for setting up the desired variety and providing only one launch button to initiate the operation under conditions you have established. This improvement not only eliminates a mode but also saves keystrokes. Also, the user's locus of attention is the task that he wants done, not the setting up for the task. Figure 5.5, a typical set-up-then-launch dialog box found in Microsoft Word, exhibits a second interface problem: Should the radio buttons be always in the state shown when the user opens the box? Should they be in the state that he last left them, or should there be a preference setting as to which is the default?

Figure 5.5. Dialog box with check boxes for different kinds of searches, and one launch button.

graphics/05fig05.gif

All three interface options are wrong. If the user is always updating the entire table, the first option forces two clicks or a click and a Return every time. If the box is left in the last-used position, the user cannot use the box habitually, because he has to stop and check on its state each time. If there is a user preference (see Section 3-2-2), then a mode has been set up.

The dialog box of Figure 5.6 solves all of these problems at once, and the larger buttons have a Fitts' law advantage over the radio buttons. The Cancel button could be eliminated if this box were transparent, as discussed in Section 5-2-3. The two buttons would not be transparent, showing that they are active.

Figure 5.6. More efficient dialog box with different launch buttons.

graphics/05fig06.gif

Delimited-search dialog boxes usually provide a facility that allows you to reuse the current pattern in a search that begins just after the last instance was found. This can be thought of as a "look again" or "find next" facility; in some implementations, it is invoked by operating the same button used to launch the initial search. With an incremental search facility, a command dedicated to searching again for the same target is required, because no command was necessary to launch the original search. It is dangerous to use a tap of the Search quasimode key to repeat the search, as a user might intend to do a search, press the key, and then change his mind and release it. Under such circumstances, an unwanted search would be started, and he could lose his place as a result. Although a complete undo facility can ameliorate the danger, it is better not to create the problem. A specific method for repeated searches is discussed in Section 5-6.

In larger texts, a search can proceed circularly not only through the local document but also afterward in automatically expanding domains, up to and including the entire Internet.^[10] After the local document has been completely searched, the search then proceeds through the following documents until the end of the folder is reached, when the search continues from the beginning of the first document in the folder until the current, already searched document is reached. After cycling through the folder, the next larger domain is treated similarly, and so forth. When, during an incremental search, a user sees that his search has gone too far afield, he can stop searching, confident that any nearer instance of the pattern does not exist. It is usually easy to tell in what domain you are searching, because you see the results in context; what you see is not just a list of file names, as with many present search systems.

^[10] It is entirely possible, by such techniques as lookahead and indexing, to maintain the 250 msec response time.

Generally, people will use more efficient strategies than simply relying on this expanding sweep of the hierarchy. For example, if you are looking for a particular document in the current folder, you would be likely to do a repeated search for document characters and would thus quickly be able to review the beginning of, and any heading on, each document. When the desired document is found, an incremental search for the target is initiated. This guarantees that the chosen document is searched first; the advantage of this is that you can typically use a shorter pattern if the area to be searched is smaller. If you did not know in what document the target resided or if you did not want to look for the document, you could perform a specific search from the beginning, which would find the target anyway.


Team-Fly

Top