Configuring Speech Preferences | Mac OS X Bible, Panther Edition

Many works of science fiction have depicted humans interfacing with computers by means of natural verbal communication. Macs have been capable of speaking text aloud since 1984, but it wasn’t until 1993 that Apple introduced the ability for humans to speak to Macintoshes, although some would argue that Macs are still not listening.

Mac OS X’s built-in English Speech Recognition is designed to understand a few dozen commands for controlling your computer. You can add and remove some of the commands that the speech recognition system understands, but you can’t turn it into a general dictation system. If voice dictation interests you, check out IBM’s ViaVoice for Macintosh (www.ibm.com/software/speech/mac/) and MacSpeech’s iListen (www.macspeech.com).

Note

Speech recognition does not work in the Classic environment. Even if you install the Mac OS 9 version of the Apple speech recognition software, which is an optional component, it does not work in the Classic environment. You are not able to set it up in the Classic Speech control panel.

Getting a speech recognition microphone

Typically, speech recognition requires a special microphone. If you use a current-shipping iMac, eMac, iBook, or PowerBook, the built-in microphone may work for speech recognition. However, the microphone built into some older iMacs and PowerBooks does not work for speech recognition. Apple is characteristically vague about exactly which iMac and PowerBook microphones don’t work, so you just have to try your own.

Consider upgrading your input hardware; you’ll probably get the best results with speech recognition if you use a headset that includes a noise-canceling microphone designed for voice recognition. Many brands and models are available. Some have a standard 3.5mm mini-plug for Macs with microphone jacks, and some have a USB connector for Macs with USB ports. If you don’t have any luck with your Mac’s built-in microphone, try a headset instead.

Configuring speech recognition

Open the Speech preferences pane as depicted in Figure 13-41, click the Speech Recognition button, and then click the On/Off button. Now you can turn speech recognition on and off and specify the kind of feedback you get when you speak commands. You can also click a button to see the contents of the Speakable Items folder, which we cover in more detail later in this chapter.

Figure 13-41: Whether you live on the Aleutian Islands or you’re just a lonely soul, turning on speech recognition is not the answer.

Turning on speech recognition

When you turn on speech recognition, a round feedback window appears. The first time you turn on speech recognition, it displays a welcome message that explains how speech recognition works. (You can read this message again by clicking the Helpful Tips button in Speech Preferences.)

If you want to have speech recognition turned on automatically every time you log in, select the option labeled Open Speakable Items at log in. This setting, like all others in the Speech Recognition panel, is user account–specific and does not have an effect on other user accounts.

Setting feedback options

Below the On/Off controls in the Speech Recognition panel are settings that affect the feedback you get while using speech recognition. Select the Speak confirmation checkbox to have Mac OS X speak an acknowledgment of your commands; clear the checkbox to turn spoken confirmation off.

You can also change the sound that indicates the computer has recognized a spoken command. Choose a sound from the pop-up menu or choose None if you don’t want to hear a sound signifying recognition. This pop-up menu lists the sounds from /System/Library/Sounds; the sounds from the Sounds folder of the main Library folder (/Library/Sounds); and two special sounds, Single Click and Whit, which are part of the speech recognition software. Sounds that you put in ~/Library/Sounds are not included in the speech feedback pop-up menu.
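As a rough illustration of how such a menu could be assembled, the following Python sketch gathers sound names from the two folders named above and adds the two special sounds. The function name, the alphabetical ordering, and the decision to strip file extensions are assumptions made for this example; the text does not describe how Mac OS X actually builds the menu.

```python
from pathlib import Path

# Folders the text says the pop-up menu reads from. The per-user folder
# ~/Library/Sounds is deliberately left out, matching the behavior described.
MENU_SOUND_FOLDERS = [Path("/System/Library/Sounds"), Path("/Library/Sounds")]

# The two special sounds that belong to the speech recognition software.
SPECIAL_SOUNDS = ["Single Click", "Whit"]


def feedback_sound_choices(folders=MENU_SOUND_FOLDERS):
    """Return a hypothetical list of menu choices, starting with "None"."""
    names = set(SPECIAL_SOUNDS)
    for folder in folders:
        if not folder.is_dir():
            continue  # skip folders that don't exist on this machine
        for item in folder.iterdir():
            # Ignore subfolders and hidden files such as .DS_Store.
            if item.is_file() and not item.name.startswith("."):
                names.add(item.stem)  # file name without its extension
    # Alphabetical order is an assumption, chosen only for predictability.
    return ["None"] + sorted(names)


if __name__ == "__main__":
    for choice in feedback_sound_choices():
        print(choice)
```

On a computer without either folder, the sketch still returns None, Single Click, and Whit.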

Using the feedback window

The speech feedback window appears when you turn on speech recognition and has several unusual attributes. First, the window is round. It floats above most other windows. What’s more, this window has no close button, minimize button, or zoom button. Figure 13-42 shows the speech feedback window in several states.

Figure 13-42: A feedback window indicates when speech recognition is idle (left), listening for a command (middle), or hearing a command (right).

Interpreting feedback

The speech feedback window provides the following information about speech recognition:

Attention mode: Indicates whether the computer is listening for or recognizing spoken commands, as follows:
- Not listening for spoken commands: The small microphone at the top of the feedback window looks dim.
- Listening for a command: The small microphone at the top of the feedback window looks dark.
- Recognizing a command you are speaking: You see arrowheads move from the edges of the feedback window toward the microphone picture.
Listening method: Indicates how to make the computer listen for spoken commands. You may see the name of a key you must press or a word you must speak to let the computer know that you want it to interpret what you are saying as a command.
Loudness: Colored bars on the bottom part of the window theoretically measure the loudness of your voice. In practice, there seems to be no relationship between the indicated loudness level and successful speech recognition. If you see no bars or one blue bar, you’re speaking relatively quietly; a blue bar and one or two green bars mean you’re speaking louder; and these three bars plus a red bar mean that you’re speaking very loudly. Apple recommends that you speak loudly enough to keep the green bars showing, with the red bar appearing only rarely.
Recognition results: When the computer recognizes a command that you have spoken, it displays the recognized command in a help tag above the speech feedback window. The displayed command may not exactly match what you said, because speech recognition interprets what you say with some degree of flexibility. For example, if you say, “Close window,” speech recognition probably displays the command it recognized as “Close this window.” If speech recognition has a response to your command, it generally displays it in a help tag below the feedback window. Figure 13-43 shows how speech recognition displays the recognized command and its feedback in help tags.

Figure 13-43: Help tags above and below the speech feedback window display the command that the computer recognized and its response, if any.
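The loudness meter described above can be summarized as a small lookup. The Python sketch below is only a reading aid: the function name and the wording of the returned labels are made up for this example, and the bar counts follow the description in the text (one blue bar, up to two green bars, then one red bar).

```python
def describe_loudness(bars_lit):
    """Translate the number of lit bars (0 to 4) into a plain description.

    Bar order, per the text: one blue bar, then two green bars, then one red bar.
    """
    if not 0 <= bars_lit <= 4:
        raise ValueError("the meter shows between 0 and 4 bars")
    if bars_lit <= 1:
        return "relatively quiet"           # no bars, or the blue bar only
    if bars_lit <= 3:
        return "louder, recommended range"  # blue bar plus one or two green bars
    return "very loud"                      # red bar lit; should be rare


if __name__ == "__main__":
    for bars in range(5):
        print(bars, describe_loudness(bars))
```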

Using feedback window controls

The only control in the speech feedback window is a pop-up menu, which you can see by clicking the small arrow at the bottom of the window. One menu command opens Speech Preferences, previously shown in Figure 13-41. Another command in the pop-up menu opens a window that lists available speech commands.

You can move the feedback window by clicking it almost anywhere and dragging. You can’t drag from the bottom of the window because clicking there makes the pop-up menu appear.

Minimizing the feedback window

Although the speech feedback window has no minimize button, you can minimize it with a spoken command, which is “Minimize speech feedback window.” While minimized in the Dock, this window continues to provide most of the same feedback as it does when it is not minimized. While the feedback window is minimized, you don’t see help tags containing the recognized command and response. In addition, the window’s pop-up menu is not available in the Dock. Figure 13-44 shows the speech feedback window in the Dock.

Figure 13-44: While minimized in the Dock, the speech feedback window continues to indicate speech recognition status.

After minimizing the speech feedback window, you can open the window by clicking it in the Dock. You can also open the window by speaking the command, “Open speech feedback window.”

Looking at the Speech Commands window

Instead of seeing your spoken command and the computer’s response to it displayed briefly in help tags, you can see a list of all your recent spoken commands and the responses to them. This list appears in the Speech Commands window, which also lists the commands you can speak in the current context. You can display this window by choosing Open Speech Commands Window from the pop-up menu at the bottom of the round speech feedback window. You can also open the Speech Commands window with the spoken command, “Open Speech Commands window.” Figure 13-45 shows the Speech Commands window when the Finder is the active application.

Figure 13-45: The Speech Commands window lists commands you have spoken, responses to them, and commands you can speak.

The commands you have spoken appear at the top of the speech commands window in bold, and any responses appear below each command in plain text. The bottom part of the window displays the commands you can speak in the current context. The list is organized in the following categories:

Name of current application: Appears only if the application you’re currently using has its own speakable commands.
Speakable Items: Includes commands that are available no matter which application you’re currently using.
Applications: Lists commands for switching to specific applications. (Switching to an application opens it if necessary.)

You can hide or show the commands for a category by clicking the disclosure triangle next to the category name. You can adjust the relative sizes of the top and bottom parts of the speech commands window by dragging the handle located on the bar between the two parts of the window. You find out how to add speakable commands later in this chapter.

Setting the listening method

In the Listening panel of the Speech preferences pane, you specify when the computer should listen for spoken commands. You can choose between two methods:

Push-to-talk method: This is the most reliable method because the computer listens for commands only when you are pressing a key that you designate.
Code name method: The computer listens for its code name and tries to interpret the words that follow it as a command.

Setting the push-to-talk method

To set the push-to-talk method and the key that makes speech recognition listen for spoken commands, follow these steps:

Click the Speech Recognition button. The Speech Recognition panel is organized into three smaller subpanels: On/Off, Listening, and Commands.
Click the Listening button. You see the options for setting the speech recognition listening method. Figure 13-46 shows the listening panel set for the push-to-talk method with the Esc key, which is the initial setting.

Figure 13-46: Speech recognition can be set for push-to-talk listening, but be warned: prolonged use of speech may also push your buttons.
Set the Listening Method option to Listen only while key is pressed. This setting means you can hold down the listening key to make speech recognition recognize your spoken commands.
If you want to change the listening key, click the Change Key button. A dialog appears in which you can type the key or combination of keys that you want to use as the listening key. You can use the Esc key, Delete key, any key on the numeric keypad, one of the function keys F5 through F12, or most punctuation keys. You can combine one of these keys with any one or more of the Shift, Option, Control, or Command keys. You can’t use letter keys or number keys on the main part of the keyboard.
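The rules for an acceptable listening key can be restated as a short check. This Python sketch is an illustration only: the function and its parameters are hypothetical, and because the text says only that "most" punctuation keys work, the sketch simply accepts any ASCII punctuation character, which is more permissive than the real dialog may be.

```python
import string

MODIFIER_KEYS = {"Shift", "Option", "Control", "Command"}
NAMED_KEYS = {"Esc", "Delete"}
FUNCTION_KEYS = {f"F{n}" for n in range(5, 13)}  # F5 through F12


def is_valid_listening_key(key, modifiers=(), on_keypad=False):
    """Return True if the key (plus optional modifiers) fits the stated rules.

    key        -- key name or single character, such as "Esc", "F7", or ";"
    modifiers  -- any combination of Shift, Option, Control, and Command
    on_keypad  -- True if the key is on the numeric keypad
    """
    if not set(modifiers) <= MODIFIER_KEYS:
        return False  # unknown modifier
    if on_keypad:
        return True   # any key on the numeric keypad is allowed
    if key in NAMED_KEYS or key in FUNCTION_KEYS:
        return True
    # Assumption: treat every ASCII punctuation character as acceptable.
    if len(key) == 1 and key in string.punctuation:
        return True
    # Letters and numbers on the main part of the keyboard are rejected.
    return False


if __name__ == "__main__":
    print(is_valid_listening_key("Esc"))                # True
    print(is_valid_listening_key("F5", ["Command"]))    # True
    print(is_valid_listening_key("a"))                  # False
    print(is_valid_listening_key("7", on_keypad=True))  # True
```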

Setting the code name method

If you prefer to have the computer listen for a code name that you say before speaking a command, use these steps:

Click the Speech Recognition button. The Speech Recognition panel is organized into three subpanels: On/Off, Listening, and Commands.
Click the Listening button. You see the options for setting the speech recognition listening method. Figure 13-47 shows the listening panel set for the code name method.

Figure 13-47: Speech recognition set for code name listening.
Set the Listening Method option to Key toggles listening on and off. This setting means pressing the listening key alternately turns listening on and off. Turning listening off puts speech recognition on standby, which may improve the performance of the computer.
If you want to change the listening key, click the Change Key button. A dialog appears in which you can type the key or combination of keys that you want to use as the listening key. You can use the Esc key, Delete key, any key on the numeric keypad, one of the function keys F5 through F12, or most punctuation keys. You can combine one of these keys with any one or more of the Shift, Option, Control, or Command keys. You can’t use letter keys or number keys on the main part of the keyboard.
Specify a code name. Type a code name for speech recognition in the text box; the default name is Computer. (The text box is not case-sensitive.) Use the nearby pop-up menu to specify when you must speak the name.

In the pop-up menu that specifies when you must speak the code name, one of the choices makes the code name optional but not without risk: The computer could interpret something you say in conversation as a voice command.

Other choices in the pop-up menu make the code name optional if you spoke the last command less than 15 seconds ago or 30 seconds ago. The idea is that once you have the computer’s attention, you shouldn’t have to get its attention again immediately after the previous command. You can tell whether you need to speak the code name by looking at the round speech feedback window. If you see the code name displayed in the middle of the feedback window, you have to speak the name before the next command.
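The pop-up menu choices amount to a simple timing rule, sketched here in Python. The function name and the policy values ("always", "optional", or a number of seconds) are invented for this illustration, and the "always" policy is assumed to be one of the menu choices; the sketch does not reflect how Mac OS X stores the setting.

```python
def must_speak_code_name(policy, seconds_since_last_command=None):
    """Decide whether the code name is needed before the next command.

    policy -- "always", "optional", or a grace period in seconds (for example 30)
    seconds_since_last_command -- None if no command has been spoken yet
    """
    if policy == "optional":
        return False  # never required, at the risk of accidental commands
    if policy == "always":
        return True   # assumed choice: required before every command
    # Otherwise the policy is a grace period: the code name is optional only
    # if the last command was spoken less than that many seconds ago.
    if seconds_since_last_command is None:
        return True
    return seconds_since_last_command >= policy


if __name__ == "__main__":
    print(must_speak_code_name(30, seconds_since_last_command=10))  # False
    print(must_speak_code_name(30, seconds_since_last_command=45))  # True
```

With a 30-second grace period, a command spoken 10 seconds after the previous one needs no code name, while one spoken 45 seconds later does.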

Specifying what commands to listen for

Speech recognition organizes its commands into groups. In the Commands panel, shown in Figure 13-48, you choose which groups of commands it listens for. To choose commands by group:

Figure 13-48: Speech recognition organizes commands by group.

Click the Speech Recognition button.
Click the Commands button.
Click the checkboxes to select and deselect groups of commands. To activate the Front Windows commands group and the Menu Bar commands group, you must select Using Assistive Features in the Universal Access preferences pane. As you add and remove groups of commands, the groups appear and disappear in the Speech Commands window.
Select or clear the checkbox next to Require exact wording of Speakable Item command names. Requiring the speaker to use the exact name of the command improves recognition accuracy and response time. If this is deselected, Mac OS X will attempt to identify a command from more relaxed, casual speech.
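To make the difference between exact and relaxed wording concrete, here is a toy Python sketch. It is not Apple's recognition algorithm: the relaxed rule used here (the spoken words must appear, in order, within a command name, and exactly one command must qualify) is an assumption chosen only because it reproduces the earlier example of “Close window” being recognized as “Close this window.”

```python
def _words(phrase):
    """Split a phrase into lowercase words, ignoring commas and periods."""
    return phrase.lower().replace(",", " ").replace(".", " ").split()


def _in_order(spoken_words, name_words):
    """True if every spoken word appears in name_words, in the same order."""
    remaining = iter(name_words)
    return all(word in remaining for word in spoken_words)


def match_command(spoken, command_names, require_exact):
    """Return the matching command name, or None if nothing matches clearly."""
    spoken_words = _words(spoken)
    if not spoken_words:
        return None
    # An exact match always wins, whichever setting is in effect.
    for name in command_names:
        if _words(name) == spoken_words:
            return name
    if require_exact:
        return None
    # Relaxed matching (hypothetical rule): accept a single unambiguous candidate.
    candidates = [n for n in command_names if _in_order(spoken_words, _words(n))]
    return candidates[0] if len(candidates) == 1 else None


if __name__ == "__main__":
    names = ["Close this window", "Open speech feedback window"]
    print(match_command("Close window", names, require_exact=True))   # None
    print(match_command("Close window", names, require_exact=False))  # Close this window
```

The sketch also hints at why exact wording can be faster and more accurate: with the option selected, there is only one comparison to make per command and no ambiguity to resolve.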

Specifying which microphone to use

If your Mac has more than one microphone connected, such as a built-in microphone and an external microphone, you can specify which one you want speech recognition to use. Follow these steps:

Click the Speech Recognition button.
Click the Listening button.
Choose an available microphone from the Microphone pop-up menu. If your computer has a microphone jack, the pop-up menu includes it as a choice even if no microphone is plugged into the jack.

Specifying the spoken user interface

Speech in Mac OS X is not just a one-way conversation. Mac OS X can also use speech to vocalize its responses back to you. You configure this capability in the Spoken User Interface panel, as shown in Figure 13-49. Click the Spoken User Interface button within the Speech preferences pane. You can choose a signal phrase, such as “Alert!” or “Excuse me,” to be spoken before an alert message. You can edit the list of alert phrases and add your own phrases, too. You can choose a voice just for alert messages that differs from the default voice, and you can specify the time delay before the alert is spoken. You can also choose to have Speech read the text under the mouse cursor as you move the cursor over text.

Figure 13-49: Before closing the Spoken User Interface panel, try out your settings by using the Demonstrate Settings button.

Choosing a voice for Mac OS X

The voices that the Speech preferences pane employs are highly configurable. You set the default voice and speaking rate for Mac OS X in the Default Voice panel as shown in Figure 13-50.

Figure 13-50: Although the options appear to be bountiful, try as you may, no combination of options here will get you close to having your computer sound like HAL.

To set a voice and a speaking rate for Mac OS X, follow these steps:

Click the Default Voice button.
Select any voice from the list on the left side of the Speech preferences pane. You hear a sample of the voice, and a description of it appears next to the list.
Optionally, change the speaking rate by adjusting the Rate slider. Each voice has a preset speaking rate. You can hear a sample of the voice at the current rate by clicking the Play button.

Reading documents aloud

If you are using an OS X application, Speech may be available as a Service. Services are available to OS X applications from the Application menu. An example is shown in Figure 13-51. Refer to Chapter 11 for more on Services.

Figure 13-51: Speech is available as a Service to many OS X applications.