5.2 Pervasive Dialog Elements

In addition to choosing a dialog strategy, in the high-level design phase you need to make decisions about how to handle several pervasive dialog elements. We will discuss the choice of error recovery strategy and universals, which are relevant to all applications. We will also discuss login strategies, a high-level design choice that must be made for some applications.

5.2.1 Error Recovery Strategies

There are two reasons error recovery is needed: The technology is not perfect, and neither are humans. The recognizer may not be very confident that it understood the caller, it may not have been ready to accept the caller's utterance, it may not have heard the caller, or it may be taking too long to generate a recognition hypothesis. As for humans, users often misspeak their responses, slur their words, stay silent, provide irrelevant information, and push buttons on the keypad at times when the system is expecting speech.

Applications must be armed with the logic to handle these situations so that callers can get back on track, no matter what the problem was, and succeed in completing their task. The challenge for the designer is to understand the nature of the caller's problem and provide the information needed to recover.

The error recovery strategies discussed here involve designing how the system should behave when the recognizer does not return a recognition hypothesis. The two most common messages the recognizer returns when it cannot find a result are reject (no good match was found) and no-speech timeout (no speech was heard). In this chapter we cover the choice of strategy for handling such error messages. The actual prompts used in response to error messages are highly dependent on the dialog state and are worked out as part of the detailed design (see Chapter 13). How you detect and handle recognition mistakes (when the recognizer returns a wrong hypothesis) depends on the dialog state context and is also discussed in Chapter 13.

To define the error recovery strategy for your application, you must consider both the type of error and the number of errors the caller has encountered. First, let's look at the most common error type: rejections.

Escalating Detail

A common approach to reprompting following a rejection is to give more detailed instruction or examples of what the caller should say (or both). With successive rejects, even more detail is provided. This approach has been variously referred to as escalating detail or progressive prompting (Yankelovich, Levow, and Marx 1995; Weinschenk and Barker 2000). Here's an example from a travel planning application:

(4)

SYSTEM:	When would you like to leave?
CALLER:	Well, um, I need to be in New York in time for the first World Series game.
SYSTEM:	<reject>. Sorry, I didn't get that. Please say the month and day you'd like to leave.
CALLER:	I wanna go on October fifteenth.

Instead of asking again when the caller wants to leave, the error prompt explicitly tells the caller to say the month and day.

Another aspect of dialog 4 is the phrase, "Sorry, I didn't get that." This is a common technique in which the application provides feedback on the exact nature of the problem: The system did not understand the caller. This technique is often combined with escalating detail.

The first error prompt the caller hears in a dialog state is sometimes referred to as the first-level error prompt. When the caller encounters a second error in a row, the second-level error prompt plays. Upon the second error, some designers like to begin with a phrase such as, "Sorry, I still didn't get that," followed by either a more detailed instruction or a suggestion to try an alternative approach (e.g., "Please key in your account number"). In general, during detailed design, you should craft the specific error recovery prompts for each state. In this way, you will maximize the chance of recovery; the best approach will vary depending on the state.

One disadvantage of using the escalating detail approach is that often the caller knows exactly what to say to the system and only needs another chance. As a result, callers sometimes get impatient with detailed error messages. The rapid reprompt approach, discussed next, is an alternative strategy that is often effective.

Rapid Reprompt

The rapid reprompt approach does not provide detailed information right away. Instead, the system replies to the error with a short prompt such as, "I'm sorry?" or "What was that?" This is similar to the kinds of statements people often use in conversation to indicate that they did not understand the speaker. If another error occurs, the behavior usually follows the escalating detail strategy by providing detailed information about what to say. Table 5-1 compares the two strategies.

A number of experiments at Nuance compared the two strategies. When all variables are controlled and when only the prompt wording is varied, the rapid reprompt strategy is unquestionably preferred by users. Figure 5-1 shows the results of comparing the two error recovery strategies.

Figure 5-1. Preference for error recovery strategies.

The major disadvantage of rapid reprompt, however, is that it does not provide the caller with detailed information right away. As a result, callers who are unsure of what to say often take an extra step to recover from an error, because they must wait for the next error prompt to get explicit instructions. On a positive note, callers do not seem to mind this extra step because they do not perceive it as an error.

The best place to use the rapid reprompt strategy is in directed dialogs whose structures are obvious or with repeat users of a system. It is not recommended for cases with open-ended prompts, such as, "How may I help you?"

Table 5-1. Escalating Detail Versus Rapid Reprompt
EVENT	ESCALATING DETAIL	RAPID REPROMPT
Initial prompt	What's your account number?	What's your account number?
First error	Sorry, I didn't understand. Please say your 10-digit account number.	I'm sorry?
Second error	Sorry, I still didn't understand. Your 10-digit account number appears on your monthly statement at the top right corner. Please say your account number now, or for more information, say, "Help."	Sorry, I didn't understand. Please say your 10-digit account number.
Third error	Sorry, I still didn't understand. Please key in your account number, or say, "I don't know it," and I'll connect you to someone who can help you.	Sorry, I still didn't understand. Your 10-digit account number appears on your monthly statement at the top right corner. Please say your account number now, or for more information, say, "Help."
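
The relationship between the two columns of Table 5-1 can be expressed as a short Python sketch. The prompt wording comes from the table; the function name, the strategy labels, and the choice to keep replaying the last prompt once the list runs out are our own assumptions for illustration, not part of any particular platform.

```python
# Escalating detail prompts for the account-number state, from Table 5-1.
ESCALATING_DETAIL = [
    "Sorry, I didn't understand. Please say your 10-digit account number.",
    "Sorry, I still didn't understand. Your 10-digit account number appears "
    "on your monthly statement at the top right corner. Please say your "
    "account number now, or for more information, say, \"Help.\"",
    "Sorry, I still didn't understand. Please key in your account number, "
    "or say, \"I don't know it,\" and I'll connect you to someone who can "
    "help you.",
]

# Rapid reprompt starts with a short conversational prompt and then follows
# the escalating detail sequence, shifted back by one error.
RAPID_REPROMPT = ["I'm sorry?"] + ESCALATING_DETAIL[:-1]

# Hypothetical strategy labels, used only in this sketch.
STRATEGIES = {
    "escalating_detail": ESCALATING_DETAIL,
    "rapid_reprompt": RAPID_REPROMPT,
}


def reject_prompt(strategy: str, error_count: int) -> str:
    """Return the prompt for the Nth successive rejection (1-based)."""
    if error_count < 1:
        raise ValueError("error_count starts at 1")
    prompts = STRATEGIES[strategy]
    # Assumption: past the last defined level, keep replaying the last prompt.
    return prompts[min(error_count, len(prompts)) - 1]
```

Seen this way, rapid reprompt costs the unsure caller exactly one extra step: its second-error prompt is the first-error prompt of escalating detail.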

Variation on Rapid Reprompt

Some designers prefer a variation of the rapid reprompt strategy that quickly restates what the system is expecting but does not go into detail or provide examples. Compare the following sample dialogs.

Here's a sample dialog with rapid reprompt:

(5)

SYSTEM:	What's your account number?
CALLER:	Uh, it's five five five … wait.
SYSTEM:	I'm sorry?

Here's a sample dialog with the rapid reprompt variation:

(6)

SYSTEM:	What's your account number?
CALLER:	Uh, it's five five five … wait.
SYSTEM:	Sorry, what was that account number?

Rapid reprompt and escalating detail are the two most common error recovery strategies used for rejections. Which one you use will depend on the application and the specific task. Even if you choose rapid reprompt as the general strategy, you may find that you need to override this behavior in certain dialog states.
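
One way to record this decision in a design is a general default plus a small table of per-state overrides. The following Python sketch assumes hypothetical state names and strategy labels; the only point taken from the text is that an open-ended state such as "How may I help you?" is a poor fit for rapid reprompt.

```python
# General strategy chosen for the application (an illustrative choice).
DEFAULT_STRATEGY = "rapid_reprompt"

# Hypothetical dialog state names. The open-ended opening prompt overrides
# the default because rapid reprompt is not recommended there.
STATE_OVERRIDES = {
    "how_may_i_help_you": "escalating_detail",
}


def strategy_for(state: str) -> str:
    """Return the rejection strategy for a dialog state."""
    return STATE_OVERRIDES.get(state, DEFAULT_STRATEGY)
```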

No-Speech Timeouts

No-speech timeouts are the second most common type of error. Although much of the error recovery behavior for rejections can be adopted for no-speech timeouts, it is a good idea to treat them separately. For example, if the reject prompt begins, "Sorry, I didn't understand …" you might substitute "Sorry, I didn't hear you" even if the remainder of the message is the same. Furthermore, the causes of no-speech timeouts are likely to be different from the causes of rejects, so you should consider alternative strategies. Given that callers often remain silent when they are not sure what to say, a rapid reprompt strategy may not be as effective for timeouts as it is for rejects.
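
As a minimal sketch of treating the two error types separately, the Python function below picks the opening phrase by error type and applies rapid reprompt to rejects only. The names are hypothetical, and always giving the instruction after a first timeout is one possible policy, not a rule.

```python
from enum import Enum


class ErrorType(Enum):
    REJECT = "reject"
    NO_SPEECH_TIMEOUT = "no_speech_timeout"


# Opening phrases that tell the caller what went wrong.
OPENING = {
    ErrorType.REJECT: "Sorry, I didn't understand.",
    ErrorType.NO_SPEECH_TIMEOUT: "Sorry, I didn't hear you.",
}


def first_error_prompt(error: ErrorType, instruction: str,
                       rapid_reprompt: bool = True) -> str:
    """Build the first-level error prompt for a dialog state."""
    # Rapid reprompt is applied to rejects only. A silent caller may be
    # unsure what to say, so a timeout gets the instruction right away
    # (an assumed policy for this sketch).
    if error is ErrorType.REJECT and rapid_reprompt:
        return "I'm sorry?"
    return f"{OPENING[error]} {instruction}"
```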

State-Specific and Global Error Counts

After two or three successive errors in a single dialog state (regardless of the type of error), it is unlikely the caller is going to recover. In fact, hang-up rates are very high after a few rejects. For this reason, every application should have defined behavior for what to do when a maximum number of errors has been reached. For example, if live agents are available, the application might transfer callers to an agent after they have experienced three successive rejects. In the high-level design phase you should establish three thresholds: the maximum number of successive errors in a single dialog state (the state-specific error count), the maximum number of errors counted globally, and the maximum number of disconfirmations (responses of "No" in confirmation states).

A typical number for the state-specific error count is three. Designers differ on whether the three errors should be of the same type. We recommend that errors of any type count toward the maximum error threshold. An exception might be in an initial dialog state for a newly deployed system. Sometimes callers are caught off guard by the new system, especially one that is expecting the caller to talk to it. In these situations it is sometimes better to customize the threshold and corresponding behavior for no-speech timeout errors.

For disconfirmations, a default limit of two is recommended (this is based on hang-up rates from real deployments). When the threshold is reached, it is a good idea to send callers directly to an agent if one is available.
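
The three thresholds can be tracked with a few counters, as in the Python sketch below. The state-specific limit of three and the disconfirmation limit of two are the values discussed above. The global limit of six is purely a placeholder, because no number is given for it here, and the class and method names are hypothetical. The sketch also assumes that disconfirmations are counted over the whole call.

```python
from dataclasses import dataclass


@dataclass
class ErrorTracker:
    max_state_errors: int = 3      # typical state-specific error count
    max_global_errors: int = 6     # placeholder value, chosen for this sketch
    max_disconfirmations: int = 2  # recommended default limit
    state_errors: int = 0
    global_errors: int = 0
    disconfirmations: int = 0

    def reset_state_count(self) -> None:
        """Call after a successful recognition or on entering a new state."""
        self.state_errors = 0

    def record_error(self) -> bool:
        """Count a reject or timeout; return True if a limit is reached."""
        # Any error type counts toward both thresholds.
        self.state_errors += 1
        self.global_errors += 1
        return (self.state_errors >= self.max_state_errors
                or self.global_errors >= self.max_global_errors)

    def record_disconfirmation(self) -> bool:
        """Count a "No" in a confirmation state; True if the limit is hit."""
        self.disconfirmations += 1
        return self.disconfirmations >= self.max_disconfirmations
```

When either method returns True, the application would carry out its maximum-error behavior, such as transferring the caller to an agent if one is available.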

5.2.2 Universals

Universals are commands that are always available in an application. The most common universal is "Help." Most applications allow the caller to say "Help" at any time, in any dialog state. In response to "Help," the application provides the caller with detailed instructions specific to the current dialog state.

There is a set of six universals that we recommend for all applications: help, repeat, main-menu/start-over, go-back, operator, and good-bye. Chapter 9 covers the motivation for this standard set of universals and explains how to use them. During high-level design, your task is to decide whether to use these standard universals and whether to supplement them with any universals that have particular relevance for the application. For example, a stock trading application may use "broker" as a universal so that the caller can always get directly to a live broker.
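
The outcome of this decision can be written down as a simple command table. In the Python sketch below, the six command identifiers follow the list above, while the spoken phrases mapped to them and the function name are assumptions for illustration. A deployed system would recognize universals through its grammars rather than by string lookup.

```python
from typing import Mapping, Optional

# Assumed spoken phrases mapped to the six standard universals.
STANDARD_UNIVERSALS = {
    "help": "help",
    "repeat": "repeat",
    "main menu": "main-menu/start-over",
    "start over": "main-menu/start-over",
    "go back": "go-back",
    "operator": "operator",
    "goodbye": "good-bye",
}


def match_universal(utterance: str,
                    extra: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the universal command for an utterance, or None."""
    # Normalize case and whitespace before the lookup.
    phrase = " ".join(utterance.lower().split())
    # Application-specific universals supplement the standard set.
    commands = {**STANDARD_UNIVERSALS, **(extra or {})}
    return commands.get(phrase)
```

A stock trading application would pass `{"broker": "broker"}` as `extra`, so that "broker" is matched in every dialog state alongside the standard set.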

5.2.3 Login

Although not as pervasive as error handling and universals, login strategies can have an effect on many parts of certain applications. (Login is the part of some applications that gathers information such as account numbers and PINs from callers.) The decision to be made is whether to log in callers at the beginning of the interaction or wait until they want to do something that requires secure information (often called delayed login). There are pros and cons to each approach.

The benefit of up-front login is that the designer can customize the system's behavior for that individual or class of caller. For example, if the caller has not set up automatic payments, the system does not have to offer automatic payment options related to managing an existing account. The application can also be intelligent about the help strategy and can even adjust the prompt detail depending on whether the caller is a novice or expert user.

A disadvantage of this type of login is that callers must provide personal information even if they want to do something generally available, such as getting a stock quote. This takes time. Often, users forget their PINs and have a difficult time getting into their accounts. This can be frustrating if all they want to do is a simple task that does not require account information.

The opposite is true for delayed login. Although callers are not bothered with remembering account numbers, PINs, and passcodes, they may have to wade through irrelevant options in menus, and often experts must use dialogs designed for novices.
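
The difference between the two approaches comes down to when the login dialog is triggered. The Python sketch below assumes hypothetical task names and strategy labels; which tasks count as secure is an application decision.

```python
# Hypothetical tasks that require secure information.
SECURE_TASKS = {"transfer_funds", "account_balance"}


def must_login_now(strategy: str, task: str, logged_in: bool) -> bool:
    """Decide whether to run the login dialog before starting a task."""
    if logged_in:
        return False
    if strategy == "up_front":
        # Every caller logs in before doing anything else.
        return True
    if strategy == "delayed":
        # Only tasks that need secure information trigger the login.
        return task in SECURE_TASKS
    raise ValueError(f"unknown login strategy: {strategy}")
```

With delayed login, a caller who only wants a stock quote never reaches the login dialog. With up-front login, the same caller must log in first.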

Your choice of login approach depends greatly on the application's key design criteria and how easy or difficult it is for users to remember or access the required login information.