17.14. Microsoft Agent

Microsoft Agent is a technology used to add interactive animated characters to Windows applications or Web pages. Microsoft Agent characters can speak and respond to user input via speech recognition and synthesis. Microsoft employs its Agent technology in applications such as Word, Excel and PowerPoint. Agents in these programs aid users in finding answers to questions and in understanding how the applications function.

The Microsoft Agent control provides programmers with access to four predefined characters: Genie (a genie), Merlin (a wizard), Peedy (a parrot) and Robby (a robot). Each character has a unique set of animations that programmers can use in their applications to illustrate different points and functions. For instance, the Peedy character-animation set includes different flying animations, which the programmer might use to move Peedy on the screen. Microsoft provides basic information on Agent technology at

    www.microsoft.com/msagent

Microsoft Agent technology enables users to interact with applications and Web pages through speech, the most natural form of human communication. To understand speech, the control uses a speech-recognition engine, an application that translates vocal sound input from a microphone to language that the computer understands. The Microsoft Agent control also uses a text-to-speech engine, which generates characters' spoken responses. A text-to-speech engine is an application that translates typed words into audio sound that users hear through headphones or speakers connected to a computer. Microsoft provides speech-recognition and text-to-speech engines for several languages at

    www.microsoft.com/msagent/downloads/user.asp

Programmers can even create their own animated characters with the help of the Microsoft Agent Character Editor and the Microsoft Linguistic Sound Editing Tool.
These products are available free for download from

    www.microsoft.com/msagent/downloads/developer.asp

This section introduces the basic capabilities of the Microsoft Agent control. For complete details on downloading this control, visit

    www.microsoft.com/msagent/downloads/user.asp

The following example, Peedy's Pizza Palace, was developed by Microsoft to illustrate the capabilities of the Microsoft Agent control. Peedy's Pizza Palace is an online pizza shop where users can place their orders via voice input. The Peedy character interacts with users by helping them choose toppings and calculating the totals for their orders. You can view this example at

    www.microsoft.com/agent2/sdk/samples/html/peedypza.htm

To run the example, you must go to www.microsoft.com/msagent/downloads/user.asp and download and install the Peedy character file, a text-to-speech engine and a speech-recognition engine.

When the window opens, Peedy introduces himself (Fig. 17.28), and the words he speaks appear in a cartoon bubble above his head. Note that Peedy's animations correspond to the words he speaks.

Figure 17.28. Peedy introducing himself when the window opens.

Programmers can synchronize character animations with speech output to illustrate a point or to convey a character's mood. For instance, Fig. 17.29 depicts Peedy's Pleased animation. The Peedy character-animation set includes 85 different animations, each of which is unique to the Peedy character.

Figure 17.29. Peedy's Pleased animation.

Look-and-Feel Observation 17.1
Peedy also responds to input from the keyboard and mouse. Figure 17.30 shows what happens when a user clicks Peedy with the mouse pointer. Peedy jumps up, ruffles his feathers and exclaims, "Hey, that tickles!" or "Be careful with that pointer!" Users can relocate Peedy on the screen by dragging him with the mouse. However, even when the user moves Peedy to a different part of the screen, he continues to perform his preset animations and location changes.

Figure 17.30. Peedy's reaction when he is clicked.

Many location changes involve animations. For instance, Peedy can hop from one screen location to another, or he can fly (Fig. 17.31).

Figure 17.31. Peedy flying animation.

Once Peedy completes the ordering instructions, a tool tip appears beneath him indicating that he is listening for a voice command (Fig. 17.32). You can enter the type of pizza to order either by speaking the style name into a microphone or by clicking the radio button corresponding to your choice.

Figure 17.32. Peedy waiting for speech input.

If you choose speech input, a box appears below Peedy displaying the words that Peedy "heard" (i.e., the words translated to the program by the speech-recognition engine). Once he recognizes your input, Peedy gives you a description of the selected pizza. Figure 17.33 shows what happens when you choose Seattle as the pizza style.

Figure 17.33. Peedy repeating a request for Seattle-style pizza.

Peedy then asks you to choose additional toppings. Again, you can either speak or use the mouse to make a selection. Checkboxes corresponding to toppings that come with the selected pizza style are checked for you. Figure 17.34 shows what happens when you choose anchovies as an additional topping. Peedy makes a wisecrack about your choice.

Figure 17.34. Peedy repeating a request for anchovies as an additional topping.

You can submit the order either by pressing the Place My Order button or by speaking "Place order" into the microphone.
Peedy recounts the order while writing down the order items on his notepad (Fig. 17.35). He then calculates the figures on his calculator and reports the total price (Fig. 17.36).

Figure 17.35. Peedy recounting the order.

Figure 17.36. Peedy calculating the total.

Creating an Application That Uses Microsoft Agent

[Note: Before running this example, you must first download and install the Microsoft Agent control, a speech-recognition engine, a text-to-speech engine and the four character definitions from the Microsoft Agent Web site, as we discussed at the beginning of this section.]

The following example (Fig. 17.37) demonstrates how to build a simple application with the Microsoft Agent control. This application contains two drop-down lists from which the user can choose an Agent character and a character animation. When the user chooses from these lists, the chosen character appears and performs the selected animation. The application uses speech recognition and synthesis to control the character animations and speech: you can tell the character which animation to perform by pressing the Scroll Lock key, then speaking the animation name into a microphone.

Figure 17.37. Microsoft Agent demonstration.
The example also allows you to switch to a new character by speaking its name and creates a custom command, MoveToMouse. In addition, when you press the Speak button, the character speaks any text that you type in the TextBox.

To use the Microsoft Agent control, you must add it to the Toolbox. Select Tools > Choose Toolbox Items... to display the Choose Toolbox Items dialog. In the dialog, select the COM Components tab, then scroll down and select the Microsoft Agent Control 2.0 option. When this option is selected properly, a small check mark appears in the box to the left of the option. Click OK to dismiss the dialog. The icon for the Microsoft Agent control now appears at the bottom of the Toolbox. Drag the Microsoft Agent Control 2.0 control onto your Form and name the object mainAgent.

In addition to the Microsoft Agent object mainAgent (of type AxAgent) that manages the characters, you also need a variable of type IAgentCtlCharacter to represent the current character. We create this variable, named speaker, in line 11.

When you execute this program, class FrmAgent's constructor (lines 10–34) loads the character descriptions for the predefined animated characters (lines 16–23). If the specified location of the characters is incorrect, or if any character is missing, a FileNotFoundException is thrown. By default, the character descriptions are stored in C:\Windows\msagent\chars. If your system uses another name for the Windows directory, you'll need to modify the paths in lines 16–23.

Lines 26–28 set Genie as the default character, obtain all animation names via our utility method GetAnimationNames and call IAgentCtlCharacter method Show to display the character. We access characters through property Characters of mainAgent, which contains all the characters that have been loaded. We use the indexer of the Characters property to specify the name of the character we wish to load (Genie).
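
Because a wrong character path surfaces only as a FileNotFoundException at load time, it can help to check the paths before running the application. The following is a minimal sketch, not part of Fig. 17.37: it assumes the character files are named after the characters with an .acs extension and that the windir environment variable identifies the Windows directory. The module and function names are hypothetical.

```vb
Imports System
Imports System.IO

Module CharacterPathCheck
    ' Builds the expected path of a character file below the Windows directory.
    ' Assumption: each file is named <character>.acs inside msagent\chars.
    Function CharacterPath(ByVal windowsDirectory As String, _
        ByVal characterName As String) As String

        Dim charactersFolder As String = _
            Path.Combine(windowsDirectory, "msagent\chars")
        Return Path.Combine(charactersFolder, characterName & ".acs")
    End Function

    Sub Main()
        ' Fall back to the default location named in the text if windir is unset.
        Dim windowsDirectory As String = _
            Environment.GetEnvironmentVariable("windir")
        If windowsDirectory Is Nothing Then windowsDirectory = "C:\Windows"

        ' Report whether each predefined character file is present.
        For Each characterName As String In _
            New String() {"Genie", "Merlin", "Peedy", "Robby"}

            Dim fullPath As String = _
                CharacterPath(windowsDirectory, characterName)

            If File.Exists(fullPath) Then
                Console.WriteLine("{0}: found", fullPath)
            Else
                Console.WriteLine("{0}: missing", fullPath)
            End If
        Next
    End Sub
End Module
```

If any file is reported as missing, adjust the paths used in lines 16–23 or reinstall the character definitions.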

Responding to the Agent Control's ClickEvent

When a user clicks the character (i.e., pokes it with the mouse), event handler mainAgent_ClickEvent (lines 50–57) executes. First, speaker method Play plays an animation. This method accepts as an argument a String representing one of the predefined animations for the character (a list of animations for each character is available at the Microsoft Agent Web site; each character provides over 70 animations). In our example, the argument to Play is "Confused"; this animation is defined for all four characters, each of which expresses this emotion in a unique way. The character then speaks, "Why are you poking me?" via a call to method Speak. Finally, we play the RestPose animation, which returns the character to its neutral, resting pose.

Obtaining a Character's List of Animations and Defining Its Commands

The list of valid commands for a character is contained in property Commands of the IAgentCtlCharacter object (speaker, in this example). The commands for an Agent character can be viewed in the Commands pop-up window, which displays when the user right-clicks an Agent character (the last screenshot in Fig. 17.37). Method Add of property Commands adds a new command to the command list. Method Add takes three String arguments and two Boolean arguments. The first String argument identifies the name of the command, which we use to identify the command programmatically. The second String defines the command name as it appears in the Commands pop-up window. The third String defines the voice input that triggers the command. The first Boolean specifies whether the command is active, and the second Boolean indicates whether the command is visible in the Commands pop-up window. A command is triggered when the user selects the command from the Commands pop-up window or speaks the voice input into a microphone. Command logic is handled in the Command event handler of the AxAgent control (mainAgent, in this example).
In addition, Agent defines several global commands that have predefined functions (for example, speaking a character name causes that character to appear).

Method GetAnimationNames (lines 79–109) fills the cboActions ComboBox with the current character's animation listing and defines the valid commands that can be used with the character. The method contains a SyncLock block to prevent errors resulting from rapid character changes. The method uses an IEnumerator (lines 83–84) to obtain the current character's animations. Lines 89–90 clear the existing items in the ComboBox and the character's Commands property. Lines 93–103 iterate through all the items in the animation-name enumerator. For each animation, line 95 assigns the animation name to String voiceString. Line 96 replaces any underscore characters (_) with the String "underscore"; this changes the String so that a user can pronounce and employ it as a command activator. Line 98 adds the animation's name to the cboActions ComboBox.

The Add method of the Commands property (lines 101–102) adds a new command to the current character. In this example, we add every animation name as a command. Each call to Add receives the animation name as both the name of the command and the string that appears in the Commands pop-up window. The third argument is the voice command, and the last two arguments enable the command but indicate that it is not available via the Commands pop-up window. Thus, the command can be activated only by voice input. Lines 106–107 create a new command, named MoveToMouse, which is visible in the Commands pop-up window.

Responding to Selections from the cboActions ComboBox

After the GetAnimationNames method has been called, the user can select a value from the cboActions ComboBox. Event handler cboActions_SelectedIndexChanged (lines 112–119) stops any current animation, then plays the animation that the user selects from the ComboBox, followed by the RestPose animation.
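
The way GetAnimationNames turns animation names into commands can be modeled without the Agent control. The sketch below is an illustration only, not the code of Fig. 17.37: structure CommandEntry is a hypothetical stand-in for the five values that Commands.Add receives, "Sample_Name" is an invented animation name used to show the underscore replacement, and the voice string chosen for MoveToMouse is an assumption.

```vb
Imports System
Imports System.Collections.Generic

Module CommandListSketch
    ' Hypothetical record of the five values passed to Commands.Add.
    Structure CommandEntry
        Public Name As String      ' programmatic name of the command
        Public Caption As String   ' text shown in the Commands pop-up window
        Public Voice As String     ' spoken phrase that triggers the command
        Public Enabled As Boolean  ' whether the command is active
        Public Visible As Boolean  ' whether the pop-up window lists it
    End Structure

    ' Replaces each underscore with the word "underscore" so that the
    ' name can be spoken, as the text describes for line 96.
    Function ToVoiceString(ByVal animationName As String) As String
        Return animationName.Replace("_", "underscore")
    End Function

    ' Builds one voice-only command per animation plus a visible MoveToMouse.
    Function BuildCommands( _
        ByVal animationNames As IEnumerable(Of String)) As List(Of CommandEntry)

        Dim commands As New List(Of CommandEntry)()

        For Each animationName As String In animationNames
            Dim entry As New CommandEntry()
            entry.Name = animationName
            entry.Caption = animationName
            entry.Voice = ToVoiceString(animationName)
            entry.Enabled = True
            entry.Visible = False ' reachable only by voice input
            commands.Add(entry)
        Next

        Dim moveCommand As New CommandEntry()
        moveCommand.Name = "MoveToMouse"
        moveCommand.Caption = "MoveToMouse"
        moveCommand.Voice = "MoveToMouse" ' assumed voice string
        moveCommand.Enabled = True
        moveCommand.Visible = True ' listed in the Commands pop-up window
        commands.Add(moveCommand)

        Return commands
    End Function

    Sub Main()
        Dim sampleNames As String() = {"Confused", "RestPose", "Sample_Name"}

        For Each entry As CommandEntry In BuildCommands(sampleNames)
            Console.WriteLine("{0} | voice: {1} | visible: {2}", _
                entry.Name, entry.Voice, entry.Visible)
        Next
    End Sub
End Module
```

In the real application, each entry would instead be passed to the Add method of speaker.Commands in the argument order described above.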

Speaking the Text Typed by the User

You can also type text in the TextBox and click Speak. This causes event handler btnSpeak_Click (lines 37–47) to call speaker method Speak, supplying as an argument the text in speechTextBox. If the user clicks Speak without providing text, the character speaks, "Please, type the words you want me to speak."

Changing Characters

At any point in the program, the user can choose a different character from the cboCharacters ComboBox. When this happens, the SelectedIndexChanged event handler for cboCharacters (lines 60–65) executes. The event handler calls method ChangeCharacter (declared in lines 68–76) with the text in cboCharacters as an argument. Method ChangeCharacter stops any current animation, then calls the Hide method of speaker (line 70) to remove the current character from view. Line 71 assigns the newly selected character to speaker, line 74 generates the character's animation names and commands, and line 75 displays the character via a call to method Show.

Responding to Commands

Each time a user presses the Scroll Lock key and speaks into a microphone or selects a command from the Commands pop-up window, event handler mainAgent_Command (lines 122–147) is called. This method receives an argument of type AxAgentObjects._AgentEvents_CommandEvent, which contains a single property, userInput. This property returns an Object that can be converted to type AgentObjects.IAgentCtlUserInput. Lines 126–127 assign the userInput object to an IAgentCtlUserInput object named command, which is used to identify the command, so that the program can respond appropriately. Lines 130–134 use method ChangeCharacter to change the current Agent character if the user speaks a character name. Microsoft Agent always shows a character when a user speaks its name; however, by controlling the character change, we can ensure that only one Agent character is displayed at a time.
Lines 137–142 move the character to the current mouse location if the user invokes the MoveToMouse command. Agent method MoveTo takes x- and y-coordinate arguments and moves the character to the specified screen position, applying appropriate movement animations. For all other commands, we Play the command name as an animation in line 146.
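
The three-way decision made in mainAgent_Command can be summarized in a small stand-alone model. This is a sketch under stated assumptions rather than the handler from Fig. 17.37: function DescribeAction is hypothetical, it receives the command name as a plain String instead of an IAgentCtlUserInput object, and it returns a description of the action instead of calling the Agent control.

```vb
Imports System

Module CommandDispatchSketch
    ' Names of the four predefined characters listed in this section.
    Private ReadOnly characterNames As String() = _
        {"Genie", "Merlin", "Peedy", "Robby"}

    ' Describes which action the handler would take for a command name.
    Function DescribeAction(ByVal commandName As String) As String
        If Array.IndexOf(characterNames, commandName) >= 0 Then
            ' A spoken character name switches the displayed character.
            Return "ChangeCharacter(""" & commandName & """)"
        ElseIf commandName = "MoveToMouse" Then
            ' The custom command moves the character to the mouse pointer.
            Return "MoveTo(mouse x, mouse y)"
        Else
            ' Every other command name is treated as an animation name.
            Return "Play(""" & commandName & """)"
        End If
    End Function

    Sub Main()
        For Each commandName As String In _
            New String() {"Merlin", "MoveToMouse", "Confused"}

            Console.WriteLine("{0} -> {1}", _
                commandName, DescribeAction(commandName))
        Next
    End Sub
End Module
```

Checking for character names first mirrors the order described above, which ensures that only one Agent character is displayed at a time.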