Chapter 6. Creating Persona, by Design


In helping customers design their VUIs, we ask them many questions about their corporate identity and brand and about typical users of the system (see Chapter 4). The intent of such questions is to help the designer arrive at the most appropriate "character" or "personality" for the interface, which the user is bound to infer in even a single interaction. In response to these questions, however, we sometimes hear, "Oh, this app isn't going to have a personality." A substantive body of research, though, shows that we humans cannot help inferring personality traits and social information from the voices we hear, even if we encounter them as brief, recorded samples.

We infer not only the speaker's gender but also age, ethnicity, socioeconomic status, geographic background, level of education, and emotional state. We can also infer personal qualities, perceiving the person to be, for example, trustworthy, punctual, generous, hospitable, even-tempered, or romantic. In U.S. culture, breathiness in male voices is perceived as young, but in female voices as sexy. In other words, the entire field of research that sociolinguists call speech evaluation strongly suggests that there is no such thing as a voice user interface with no personality.

Many speech evaluation studies center on the social cues that are transmitted through a speaker's accent. These studies show the significant extent to which people rely on certain linguistic cues in other people's speech to infer nonlinguistic attributes and to make value judgments about the speakers as individuals. These attributes and judgments are based on preconceived notions about the social group speakers are associated with. For example, if we think of people from Brooklyn, New York, as being tough, and then we hear someone speaking with a Brooklyn accent, then we judge the individual as being tough.

In an article titled "Y'all Come Back Now, Y'Hear?" Soukup (2000) explores language attitudes in the United States toward Southern accents. In this study, some 300 college students from New England and Tennessee were asked to evaluate four speakers' samples: two with a Southern accent and two with a "neutral" accent, with male and female voices represented for each. The setting for this study was a job interview situation in sales. Statistical analysis of the data strongly confirmed that having a Southern accent is a strike against job applicants. Southerners categorically lost on "competence" (e.g., intelligence, education, determination), a quality the informants deemed most important for job performance. There were some positive associations with this accent: Notably, the Southern female voice took the lead in social attractiveness (e.g., friendliness, good humor) but not enough to endorse the speaker as a desirable employee. Language attitudes toward Southern American English were rather negative overall.

This study corroborates the findings of Giles and Smith (1979), in which subjects rated Cockney-accented speakers lower than standard-accented speakers, and Labov (1966), where listeners tended to downgrade speakers with a New York City accent in terms of their job suitability.

Not all speech evaluation studies focus on regional differences. In an older study (Giles and Powesland 1975) whose findings are now relevant to the design of commercially deployed VUIs, student teachers were asked to assess eight hypothetical students with respect to intelligence, enthusiasm, self-confidence, gentleness, and "being privileged." These fictional students were defined by (1) a photograph, (2) a taped sample of speech, and (3) school work. The research question posed in this study was, What would happen if information from one of the three sources gave a favorable impression, but information from another source gave an unfavorable one? The findings of this study consistently point to the central role that speech plays in our evaluation of others. Favorable impressions of the speech sample overrode unfavorable impressions from the photograph and the work sample. Unfavorable impressions from the speech samples overrode favorable impressions from the other sources.

These findings provide one of the most compelling arguments for incorporating explicit persona design into the larger design and production effort of a VUI, especially in the case of commercially deployed applications. Simply put, speech is more powerful than the written word or a visual image in making a good impression.

Voices can even suggest physical characteristics. Most people have had the experience of meeting in person someone who until that point had only existed, so to speak, as a voice on the telephone. When finally we meet the person face-to-face, either we feel validated that our mental image of the person was on target, or we are taken aback by the mismatch. ("He seemed taller over the phone.") Either way, we cannot help imagining what people must look like based on what they sound like. The relevance for VUI design is that the people who call your application are likely to follow the universally human impulse to evaluate the voice they hear and draw on it for nonlinguistic information.

The research on speech evaluation contradicts assertions that a particular voice user interface is "neutral," that it "doesn't have a personality," or that "there's no persona." Even when a designer chooses not to deal with personality, the user is still likely to perceive one. The voice of such a system might be perceived as an announcer reading awkwardly worded messages at the user, or perhaps a robot who speaks "computerese." In fact, we have found that people sometimes mistake prerecorded audio for an artificial, synthesized voice, especially in applications whose prompts are poorly concatenated or awkwardly worded. In any case, it is not advisable to leave such perceptions to chance, especially because branding and image are at stake. The creation of persona should not be haphazard but the object of an explicit design effort.

To fully leverage the power of the spoken word, voices alone are not enough. Both favorable and unfavorable mental images that we infer from speech signals also depend on what the voice is saying and how it is said (see Chapters 10 and 11, respectively). Taking all these elements together, we can think of speech application design as creating for users a total language experience in which they interact with an ideal employee if we get the details right. This is the world of persona.



Voice User Interface Design 2004
ISBN: 321185765
Year: 2005
Pages: 117
