6.8. Speech Recognition: All Versions

For years, there's been quite a gulf between the promise of computer speech recognition (as seen on Star Trek) and the reality (as seen just about everywhere else). You say "oxymoron," it types "ax a moron." (Which is often just what you feel like doing, actually.)

Microsoft has had a speech-recognition department for years. But until recently, it never got the funding and corporate backing it needed to do a really bang-up job.

The speech recognition in Windows Vista, however, is another story. It can't match the accuracy of its chief rival, Dragon NaturallySpeaking, but you might be amazed to discover how elegant its design is now, and how useful it can be to anyone who can't, or doesn't like to, type.

In short, Speech Recognition lets you not only control your PC by voice (open programs, click buttons, click Web links, and so on) but also dictate text a heck of a lot faster than you can type.

To make this all work, you need a PC with a microphone. The Windows Speech Recognition program can handle just about any kind of mike, even the one built into your laptop's case. But a regular old headset mike ("anything that costs over $20 or so," says Microsoft) will give you the best accuracy.

6.8.1. Take the Tutorial

The easiest way to fire up Speech Recognition for the first time is to open the Start menu and then type spee. Using the mouse takes way too much time (Start → All Programs → Accessories → Ease of Access → Windows Speech Recognition).

In any case, the first time you open Speech Recognition, you arrive at a very slick, very impressive full-screen tutorial/introduction, featuring a 20-something model in, judging by the gauzy whiteness, what appears to be heaven.

Click your way through the screens. Along the way, you're asked to:

  • Specify what kind of microphone you have. Headset, desktop, array, or built-in?

  • Read a sample sentence, about how much Peter loves speech recognition, so your PC can gauge the microphone's volume.

  • Give permission to Vista to study your documents and email collection. Needless to say, there's no human rooting through your stuff, and none of what Speech Recognition finds is reported back to Microsoft. But granting this permission is a great way to improve your ultimate accuracy, since the kinds of vocabulary and turns of phrase you actually use in your day-to-day work will be built right into Speech Recognition's understanding of your voice.

  • Print the reference card. This card is critical when you're first learning how to operate Windows by voice. Truth is, however, you don't really need to print it. The same information appears in this chapter, and you can always call the reference card up on the screen by saying into your microphone, "What can I say?"

  • Practice. The tutorial is excellent; it'll take you about half an hour to complete. It teaches you how to dictate and how to operate buttons, menus, windows, programs, and so on.


Tip: At the outset, Windows is just simulating its responses to what you say. But behind the scenes, it's actually studying your real utterances, learning about your voice, and shaping your voice profile. This, in other words, is the "voice training" session you ordinarily have to perform with commercial dictation programs.

Now you're ready to roll. Operating Windows by voice entails knowing three sets of commands:

  • Controlling Speech Recognition itself,

  • Controlling Windows and its programs, and

  • Dictating.

The following sections cover these techniques one at a time.

6.8.2. Controlling Windows Speech Recognition

Slip on your headset, open Windows Speech Recognition, and have a gander at these all-important spoken commands:

  • "Start listening"/"Stop listening." These commands tell your PC to start and stop listening to you. That's important, because you don't want it to interpret everything you say. It would not be so great if it tried to act when you said to your roommate, "Hey, Chris, close the window."

    So say "Start listening" to turn on your mike; you'll see the microphone button on the speech status palette (Figure 6-5) darken. Say "Stop listening" when you have to take a phone call.

    Figure 6-5. The Speech palette is how Windows holds up its end of the conversation. If it doesn't understand something you said, for example, its text says, "What was that?" The Speech shortcut menu opens when you say "Show Speech Options." It's as though you right-clicked the little palette.



    Tip: Once you've opened the Speech Recognition program, you can hit a keystroke to turn listening on and off instead. That key combo is Ctrl+Windows key. Get it? "Control Windows"?

  • "What can I say?" This one's incredibly important. If you can't figure out how to make Windows do something, look it up by saying this. You get the Speech Recognition page of the Vista Help system, complete with a collapsible list of the things you can say.

  • "Show Speech Options." This command opens the shortcut menu for the Speech palette, as shown in Figure 6-5. From this menu, you can leap into further training, open the "What can I say?" card, go to the Speech Recognition Web site, and so on.

  • "Hide Speech Recognition"/"Show Speech Recognition" hides or shows the Speech palette itself when screen real estate is at a premium.

6.8.3. Controlling Windows and Its Programs

The beauty of controlling Windows by voice is that you don't have to remember what to say; you just say whatever you would click with the mouse.

For example, to open the little Calculator program using the mouse, you'd click the Start button (to open the Start menu), then All Programs, then Accessories, and finally Calculator. To do the same thing using speech recognition, you just say all that: "Start...All Programs...Accessories...Calculator." And presto: the Calculator appears.

Actually, that's a bad example; you can open any program just by saying "Start Calculator" (or whatever its name is). But you get the idea.

Here's the cheat sheet for manipulating programs. In this list, any word in italics is meant as an example (and other examples that work just as well are in parentheses):

  • Start Calculator (Word, Excel, Internet Explorer...). Opens the program you named, without you having to touch the mouse. Super convenient.

  • Switch to Word (Excel, Internet Explorer...). Switches to the program you named.

  • File. Open. You operate menus by saying whatever you would have clicked with the mouse. For example, say "Edit" to open the Edit menu, then "Select All" to choose that command, and so on.

  • Print (Cancel, Desktop...). You can also click any button by saying its name, or any tab name in a dialog box.

  • Contact us (Archives, Home page...). You can also click any link on a Web page just by saying its name.

  • Double-click Recycle Bin. You can tell Windows to "double-click" or "right-click" anything you see.

  • Go to Subject (Address, Body...). In an email message, Web browser, or dialog box, "Go to" puts the insertion point into the text box you name. "Address," for example, means the Address bar.

  • Close that. Closes the frontmost window. Also "Minimize that," "Maximize that," "Restore that."

  • Scroll up (down, left, right). Scrolls the window. You can say "up," "down," "left," or "right," and you can also append any number from one to 20 to indicate how many lines: "Scroll down 10."

  • Press F (Shift-F, capital B, down arrow, X three times...). Makes Windows press the key you named.


Tip: You don't have to say "Press" before certain critical keys: Delete, Home, End, Space, Tab, Enter, Backspace. Just say the key's name: "Tab."
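
To make the shape of these commands concrete, here's a minimal Python sketch of how a program might interpret the "Scroll" command from the cheat sheet. It is not how Windows does it; the function name parse_scroll is made up for illustration, the sketch assumes the spoken number has already been transcribed as digits, and treating a missing number as 1 is an assumption.

```python
import re

# Hypothetical pattern for utterances such as "Scroll down 10".
SCROLL = re.compile(r"^scroll (up|down|left|right)(?: (\d+))?$", re.IGNORECASE)

def parse_scroll(utterance):
    """Return (direction, lines) for a scroll command, or None if it isn't one."""
    match = SCROLL.match(utterance.strip())
    if match is None:
        return None
    direction = match.group(1).lower()
    # Assumption: saying no number means "one line".
    count = int(match.group(2)) if match.group(2) else 1
    # The cheat sheet allows any number from one to 20.
    if not 1 <= count <= 20:
        return None
    return direction, count

print(parse_scroll("Scroll down 10"))
```

Anything that doesn't fit the pattern, or asks for more than 20 lines, simply comes back as None, which a real program might answer with "What was that?"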

6.8.3.1. Show numbers

It's great to know that you can click any button or tab by saying its name. But what if you don't know its name? What if it's some cryptic little icon on a toolbar? You can't exactly say, "Click the little thing that looks like a guy putting his head between two rollers."

For this purpose, Microsoft has created a clever command called "Show numbers." When you say that, the program overlays every clickable thing with superimposed colorful numbers; see Figure 6-6.

Figure 6-6. When you say a number, that number turns green and changes into an OK logo, your clue that you must now say "OK" to confirm the selection. (You can run these utterances together without pausing; for example, "three OK.") Not all programs respond to the "Show numbers" command, alas.


The numbers appear automatically if there's more than one button of the same name on the screen, too: several Settings buttons in a dialog box, for example. Say "One OK."


Tip: This trick also works great on Web pages. Say "Show numbers" to see a number label superimposed on every clickable element of the page.

6.8.4. Controlling Dictation

The real Holy Grail for speech recognition, of course, is dictation: you speak, and Windows transcribes your words, typing them into any document. (This feature is especially important on Tablet PCs that don't have keyboards.)

Vista's dictation accuracy isn't as good as, say, Dragon NaturallySpeaking's. But it's a close second, it's free, and it's a lot of fun.

It's also very easy. You just talk, at regular speed, into any program where you can type. The only real difference is that you have to say the punctuation. You know: "Dear Mom (comma, new line): How are things going (question mark)? Can't believe I'll be home for Thanksgiving in only 24 more weeks (exclamation mark)!"
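
If you're curious what "saying the punctuation" amounts to, here's a rough Python sketch. It is only an illustration, not Microsoft's method: the PUNCTUATION table and the transcribe function are invented for this example, and the sketch assumes the recognizer has already split your speech into chunks.

```python
# Hypothetical table of spoken punctuation names; not the real Windows list.
PUNCTUATION = {
    "comma": ",",
    "period": ".",
    "question mark": "?",
    "exclamation mark": "!",
    "new line": "\n",
}

def transcribe(chunks):
    """Join dictated chunks, turning spoken punctuation names into symbols."""
    out = ""
    for chunk in chunks:
        symbol = PUNCTUATION.get(chunk.lower())
        if symbol is not None:
            out += symbol              # punctuation hugs the previous word
        elif out and not out.endswith("\n"):
            out += " " + chunk         # ordinary words get a separating space
        else:
            out += chunk               # start of the text or of a new line
    return out

print(transcribe(["Dear Mom", "comma", "new line",
                  "How are things going", "question mark"]))
```

The call at the bottom mirrors the "Dear Mom" example: the words come through as typed text, and each spoken punctuation name becomes the matching symbol.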

6.8.4.1. Correcting errors

Sooner or later (probably sooner), Speech Recognition is going to misunderstand you and type out the wrong thing. It's very important that you correct such glitches, for two reasons. First, you don't want your boss/family/colleagues to think you're incoherent. Second, each time you make a correction, Windows learns. It won't make that mistake again. Over time, over hundreds of corrections, Speech Recognition gets more and more accurate.

Suppose, then, that you said, "I enjoyed the ceremony," and Speech Recognition typed out, "I enjoyed this era money." Here's how you'd proceed:

  1. Say, "Correct this era money."

    Instantly, the Alternates panel pops up (Figure 6-7).

    Figure 6-7. You make your corrections in the Alternates panel. It shows a numbered list of other possible interpretations of what you said. To choose one of these alternates, say its number and then OK (no pause needed); for example, "Two OK."


  2. If the correct transcription is among the choices in the list, say its number and then OK.

    As noted in Figure 6-7, you don't have to pause before "OK."

  3. If the correct transcription doesn't appear in the list, speak the correct text again.

    In this example, you'd say, "the ceremony." Almost always, the version you wanted now appears in the list. Say its number and then OK.

  4. If the correct transcription still doesn't appear in the list, say "Spell it."

    You arrive at the Spelling panel; see Figure 6-8.

    Figure 6-8. Just spell out the word you really wanted: "F-I-S-H," for example. For greater clarity, you can also use the "pilot's alphabet": Alpha, Bravo, Charlie, Delta, and so on, or even "A as in alligator" (any word you like). If it mishears a letter you've spoken, say the number over it ("three") and then repronounce the letter. Say "OK" once you've gotten the word right.


When you finally exit the Alternates panel, Speech Recognition replaces the corrected text and learns from its mistake.
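
The spelling rules from the Spelling panel are simple enough to sketch in a few lines of Python. Treat this as a toy model under stated assumptions, not the real thing: the function names spell and correct are invented, and the sketch assumes the numbers over the letters count from 1, left to right.

```python
def spell(tokens):
    """Turn spoken spelling tokens into a word.

    Each token is a bare letter ("F"), a code word ("Foxtrot"),
    or a phrase like "A as in alligator".
    """
    letters = []
    for token in tokens:
        words = token.lower().split()
        if len(words) == 4 and words[1:3] == ["as", "in"]:
            letters.append(words[0])       # trust the letter spoken first
        elif len(words) == 1:
            letters.append(words[0][0])    # letter, or first letter of a code word
        else:
            raise ValueError("not a spelling token: " + token)
    return "".join(letters)

def correct(word, position, token):
    """Replace the letter under the given number with a re-spoken letter."""
    letter = spell([token])
    return word[:position - 1] + letter + word[position:]

print(spell(["F", "India", "S as in sugar", "Hotel"]))
print(correct("fash", 2, "India"))
```

Because every code word starts with the letter it stands for, the sketch never needs a lookup table; that is also why "any word you like" works after "as in."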

6.8.4.2. More commands

Here are the other things you can say when you're dictating text. The first few are extremely important to learn.

  • Select next (previous) two (10, 14, 20...) words (sentences, paragraphs). Highlights whatever you just specified; for example, "Select previous five sentences."

    At this point, you're ready to copy, change the font or style, say "Cap that" to capitalize the first words, or just redictate to replace what you wrote.


    Tip: If the phrase you want to highlight is long, you can say, "Select My country through land of liberty." Windows highlights all of the text from the first phrase through and including the second one.

  • Correct ax a moron. Highlights the transcribed phrase and opens the Alternates panel, as described above. (You can say a whole phrase or just one word.)

  • Undo. Undoes the last action.

    GEM IN THE ROUGH
    Mousegrid

    The voice commands described in this section are all well and good when it comes to clicking onscreen objects. But what about dragging them?

    When you say "Mousegrid," Speech Recognition superimposes an enormous 3 x 3-square grid on your screen, its squares numbered 1 through 9.

    Say "five," and a new, much smaller 3 x 3-square grid, also numbered, appears in the space previously occupied by the five square. You can keep shrinking the grid in this way until you've pinpointed a precise spot on the screen.

    Dragging something (say, an icon across the desktop) is a two-step process.

    First, use Mousegrid to home in on the exact spot on the screen where the icon lies; on your last homing-in, say "four mark." (In this example, the icon you want lies within the four square. "Mark" means "This is what I'm going to want to drag.")

    When you say "mark," the Mousegrid springs back to full-screen size; now you're supposed to home in on the destination point for your drag. Repeat the grid-shrinking exercise, but in the last step, say "seven click." Watch in amazement as Windows magically grabs the icon at the "mark" position and drags it to the "click" position.

    You can use Mousegrid as a last resort for any kind of click or drag when the other techniques (like saying button or menu names, or saying "show numbers") don't quite cut it.


  • Scratch that. Deletes the last thing you dictated. ("Delete that" works, too.)

  • Delete your stupid parents. Instantly deletes the text that you identified.


    Tip: If you use commands like Delete, Select, Capitalize, or Add hyphens to on a word that occurs more than once in the open window, Speech Recognition doesn't try to guess. It puts colorful numbered squares on every occurrence of that word. Say "one OK" (or whatever the number is) to tell it which occurrence you meant.

  • Go to little. Puts the insertion point right before the word "little."

  • Go after lamb. Puts the insertion point right after the word "lamb."

  • Go to the start (end) of the sentence (paragraph, document). Puts the insertion point where you said.

  • Caps. Capitalizes the next word you dictate (no pause is necessary). Saying "All caps" puts the next word ENTIRELY in caps.

  • Ready no space Boost. Types ReadyBoost, with no space.

  • He typed the word literal comma. The command "literal" tells Speech Recognition to type out the word that follows it ("comma"), rather than transcribing it as a symbol.

  • Add hyphens to 3D. Puts a hyphen in the word ("3-D").

  • Start typing I, P, C, O, N, F, I, G; stop typing. When you say "Start typing" (and then pause), you enter Typing mode. Now you can spell out anything, letter by letter, in any program on earth. It's a handy way to dictate into programs that don't take dictation well, like PowerPoint and Excel.
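
The Mousegrid trick described in the sidebar above boils down to repeatedly dividing a rectangle into nine. The Python sketch below shows the arithmetic; it is an illustration only, the function names are invented, and it assumes the squares are numbered left to right, top to bottom.

```python
def mousegrid_point(width, height, choices):
    """Return the center (x, y) of the square reached by a series of 1-9 choices."""
    left, top, w, h = 0.0, 0.0, float(width), float(height)
    for n in choices:
        if not 1 <= n <= 9:
            raise ValueError("grid squares are numbered 1 through 9")
        # Assumption: 1-3 are the top row, 4-6 the middle, 7-9 the bottom.
        row, col = divmod(n - 1, 3)
        w, h = w / 3, h / 3                        # each step shrinks the grid to one square
        left, top = left + col * w, top + row * h
    return (left + w / 2, top + h / 2)

def drag_path(width, height, mark_choices, click_choices):
    """Return (start, end) points for a drag: 'mark' first, then 'click'."""
    # After "mark," the grid springs back to full screen, so both
    # points are computed from the whole screen.
    return (mousegrid_point(width, height, mark_choices),
            mousegrid_point(width, height, click_choices))

print(mousegrid_point(900, 900, [5, 1]))
```

Each number you say cuts the target area to one-ninth of its previous size, which is why only a few utterances are enough to pinpoint a small icon.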

GEM IN THE ROUGH
Text to Speech

The big news in Vista is that speech-to-text feature. But Windows can also convert typed text back to speech, using a new set of voices of its very own.

To hear them, choose Start → Control Panel. In Classic view, open Speech Recognition Options. In the task pane at left, click "Text to speech." Click Preview Voice to hear the astonishing realism of Microsoft Anna, the new computer voice that debuts with Vista. You can even control her speaking rate using the "Voice speed" slider.

Unfortunately, you won't have many chances to hear Microsoft Anna. She's available in Narrator (page 269) to read error messages aloud (woo-hoo!), and Speech Recognition's Speech Options shortcut menu offers a "Speak text in correction dialog" option that makes her read the choices in the Alternates panel.

But Windows offers no way for her to read back whatever you want, like stuff you've written or articles you find on the Web.

Note to Microsoft: let Anna free!


6.8.4.3. Speech Recognition tips

There are zillions of secrets, tips, and tricks lurking in Speech Recognition, but here are a few of the most useful:

  • You can teach Speech Recognition new words (unusual last names, oddball terminology) by adding them directly to its dictionary. Say "Show speech options" to open the shortcut menu, and then click (or say) "Open the Speech Dictionary." You're offered the chance to add words, change existing words, or stop certain words from being transcribed.

  • When you want to spell out a word, say "Spell it," and then launch right into the spelling: "F, R, E, A, K, A, Z, O, I, D." You don't have to pause between letters or commands.

  • In the Spelling window, say the digit over the wrong letter, then say A, or Alpha, or "A as in alligator" (or any word that starts with that letter).

  • Beginning any utterance with "How do I" opens up Windows Help; the next part of your sentence goes into the Search box.

  • "Computer" forces the interpretation of your next utterance as a command; "Insert" forces it to be transcribed.

  • Out of the box, Speech Recognition puts two spaces after every period, a very 1980s thing to do. Nowadays, that kind of gap looks a little amateurish. Fortunately, you can tell Speech Recognition to use only one space.

    Making this change requires you to visit the little-known Advanced Speech Options dialog box. Choose Start → Control Panel. In Classic view, open Speech Recognition Options. In the task pane at left, click "Advanced speech options" (Figure 6-9).

    Figure 6-9. In this dialog box, you can find the "Number of spaces to insert after punctuation" (meaning "periods") pop-up menu near the bottom. The other controls here let you create new voice files ("speech profiles"): one for your quiet home office, for example, and another for use in a busy, humming office.





Windows Vista: The Missing Manual
ISBN: 0596528272
Year: 2006
Pages: 284
Authors: David Pogue
