Typically there is one person who oversees (or in some cases does) all the usability testing. This person's sole role is to evaluate the quality of the design as it works for people. The usability tester doesn't have to be the designer (even though it often works out that way); he or she just needs an intimate knowledge of how the system works and familiarity with how to run a usability test. There are two popular ways to test a speech-recognition application: the Wizard of Oz approach and the classical approach.

Wizard of Oz Usability Testing

Named for the classic film's "man behind the curtain" who had no real power or wizardry, Wizard of Oz [2] testing is performed without a real system, only a real design.
Simply put, one person pretends to be the computer, while another person acts as the caller. The caller picks up a phone and calls the other person (who is pretending to be the computer). The computer person can either speak the prompts just as the computer would ("I'm sorry, I didn't understand you. Please say 'Red,' 'Green,' or 'Blue'") or use a slightly more sophisticated variation of the Wizard of Oz test involving some technology. In this variation, the computer person controls and plays a set of audio files in response to the caller at the other end of the phone. This method sounds more like a real system (because actual audio prompts are used), but it requires additional time to record the files and set up the test, as well as a way for the computer person to quickly choose the right audio prompt to play.

Classical Usability Testing

The classical approach calls for the designer to recruit the services of a usability-testing laboratory. Each caller/test subject is placed inside a room with a phone, a chair, a desk, and a video camera to capture audio and facial expressions. In this method, the test subject calls into a real speech-recognition system. The sounds on the phone line (as well as the ambient room sounds) are also recorded.

Wizard of Oz versus Classical Usability Testing

Which method of testing is better? Well, they both use the right population. And both of them ask callers to perform real tasks that mimic how the system will be used in real life. But the classical method will give the designer/tester more insight. Because it uses a real system, the classical usability test also enables the designer to test the quality of the recognition engine and whether it understands the caller's responses correctly. This can't happen in a Wizard of Oz test, where a real person pretends to be the computer.
The Wizard of Oz approach is a double-edged sword, since it enables the tester to ignore system performance: beneficial when the system hasn't been built yet, but detrimental if performance issues aren't discovered before deployment. If test subjects are using a real system, they will encounter the problems associated with using a real system, such as delays while the database is looking something up. In a real system, a caller might have to wait up to 30 seconds for the database to come back with an answer. (In that case, the designer might want to prepare users to wait for a while, so they don't think the system is broken.) But with the Wizard of Oz approach, the person who is pretending to be the computer might simply say, "Please hold while I get that information," then quickly come back and say the next line, "OK, I've got your checking account information," not knowing that the real (and yet to be built) system may have long or unpredictable database delays. No matter which testing method you use, you will gain insight into whether people understand the prompts, know what to say, and get their tasks accomplished.
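
The audio-file variation of the Wizard of Oz test needs some way for the person playing the computer to pick the right prompt quickly. One possible approach, sketched below in Python, is a small console that maps single-key shortcuts to prerecorded files. Everything here is an assumption made for illustration: the WizardConsole class, the key shortcuts, and the file names are hypothetical, and playback is left as a pluggable function rather than tied to any particular audio library. The optional pause is one way the tester might imitate a slow database lookup so that waiting time is not overlooked.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class WizardConsole:
    """Lets the human 'wizard' pick a prerecorded prompt with one keystroke."""

    prompts: dict[str, str]            # shortcut key -> audio file (hypothetical names)
    fallback_key: str = "error"        # prompt used when the key is not recognized
    simulated_delay_s: float = 0.0     # optional pause imitating a slow lookup
    log: list[tuple[str, str]] = field(default_factory=list)

    def choose(self, key: str) -> str:
        """Return the file for a key, falling back to the error prompt."""
        filename = self.prompts.get(key, self.prompts[self.fallback_key])
        self.log.append((key, filename))   # keep a record for later review
        return filename

    def respond(
        self,
        key: str,
        play: Callable[[str], None] = print,
        sleep: Callable[[float], None] = time.sleep,
    ) -> str:
        """Wait for the simulated delay (if any), then play the chosen prompt."""
        filename = self.choose(key)
        if self.simulated_delay_s > 0:
            sleep(self.simulated_delay_s)
        play(filename)                     # default just prints the file name
        return filename


# Hypothetical prompt set for a color-selection dialog.
PROMPTS = {
    "r": "confirm_red.wav",
    "g": "confirm_green.wav",
    "b": "confirm_blue.wav",
    "h": "please_hold.wav",
    "error": "error_retry.wav",
}

if __name__ == "__main__":
    console = WizardConsole(PROMPTS, simulated_delay_s=0.5)
    console.respond("h")   # pauses briefly, then "plays" the hold prompt
    console.respond("?")   # unknown key falls back to the error prompt
```

In a real session the play argument would be replaced with whatever playback mechanism the test setup provides. The log gives the tester a record of which prompts were played and in what order.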