Animating Dialogue
Animating dialogue can really frighten the beginning animator, and rightly so, because it is one of the most difficult techniques for an animator to master. Like any hard task, however, animating dialogue is grounded in some very simple techniques: in this case, matching the character's mouth to the audio track and then animating the entire character to match.
Animating Lip Sync
The first step in animating dialogue is to sync the mouth to the dialogue track. This is called lip sync. Lip sync involves moving the lips to match an audio track by reading the audio track a frame at a time and then animating the character's mouth so that it speaks in rhythm with the track.
In animation, dialogue is almost always recorded before the characters are drawn. Dialogue looks more natural when the animator follows the natural rhythms of speech. Voice actors will have difficulty matching previously animated dialogue and still sounding natural, which is why recording the speech before animation begins is essential.
The animator is responsible for breaking down the dialogue track frame by frame into individual phonemes to be animated. This process is known as reading the track. The easiest way to picture a phoneme is to think of each discrete sound that makes up a word. The word "funny," for example, has four phonemes: the "f" sound, the "uh" sound, an "n" sound, and finally, a long "ee." Reading a dialogue track can be a tedious task, as you'll have to break down the dialogue frame by frame. Computer animators have the advantage of using digital audio software to help them visualize their dialogue tracks.
The word "funny" contains four phonemes.
The Eight Basic Mouth Positions
In order to animate speech, you must first understand how the mouth moves when it speaks. Dozens of different mouth shapes are made during the course of normal speech. Animators usually boil these down to a handful of standard shapes that are used repeatedly. Depending on the style of animation, some animators get away with as few as three or four shapes, and some may use dozens. For most situations, you can get away with approximately eight basic mouth positions. These eight positions usually provide adequate coverage and give you the ability to animate most dialogue effectively.
To really see how these positions work, watch yourself in the mirror while you talk. Make the sounds used by each position. If you talk naturally, you'll begin to see how the shapes work and how they all fit together into a continuous stream. The shapes and the rules that govern them are certainly not strict. Different accents and speech patterns may cause you to substitute one shape for another in order to achieve a more convincing look.
You may notice that some of these positions are not in the standard library of shapes you modeled as morph targets. They can, however, be created as separate targets or created at animation time by mixing the appropriate sliders. One good example is the sound "oh," which is created by mixing an open jaw with the "ooo" sound. In fact, for speech, most of the grunt work is done by manipulating only these two shapes, with a possible "ffff" thrown in when needed. Other sliders, such as the smile, frown, and sneer, are used mainly to add character to the face.
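The slider mixing described above can be sketched in code. This is only a minimal illustration, assuming each morph target stores the same vertex list as the neutral face and that each slider runs from 0 to 1. The function name `blend_morph_targets`, the slider names `jaw_open` and `ooo`, and all vertex values and weights are hypothetical and not taken from any particular animation package.

```python
def blend_morph_targets(base, targets, weights):
    """Return the base vertices plus the weighted offset of each morph target.

    base    -- list of (x, y, z) tuples for the neutral face
    targets -- dict mapping slider name to a vertex list shaped like base
    weights -- dict mapping slider name to a slider value (assumed 0.0 to 1.0)
    """
    blended = []
    for i, (bx, by, bz) in enumerate(base):
        x, y, z = bx, by, bz
        for name, weight in weights.items():
            tx, ty, tz = targets[name][i]
            # Add this target's displacement from neutral, scaled by its slider.
            x += weight * (tx - bx)
            y += weight * (ty - by)
            z += weight * (tz - bz)
        blended.append((x, y, z))
    return blended


# Hypothetical two-vertex "mouth": a lip corner and the lower lip.
base = [(1.0, 0.0, 0.0), (0.0, -0.5, 0.0)]
targets = {
    "jaw_open": [(1.0, 0.0, 0.0), (0.0, -1.5, 0.0)],  # lower lip drops
    "ooo": [(0.5, 0.0, 0.0), (0.0, -0.5, 0.0)],       # lip corner moves inward
}

# An "oh" mixed from an open jaw and the "ooo" shape (weights are illustrative).
oh_shape = blend_morph_targets(base, targets, {"jaw_open": 0.5, "ooo": 0.5})
```

Because each target contributes only its offset from the neutral face, the two sliders can be mixed freely without one shape overwriting the other.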
Position 1 is the closed mouth used for consonants made by the lips, specifically the M, B, and P sounds. In this position, the lips are usually their normal width. For added realism, you could mix in an additional shape to get the lips slightly pursed, for sounds following an "ooo" sound, such as in the word "room."
Position 2 has the lips open with the teeth closed. This position is a common shape and is used for consonants made within the mouth, specifically the sounds made by C, D, G, K, N, R, S, TH, Y, and Z. All of these sounds can also be made with the teeth slightly open, particularly in fast speech.
Position 3 is used for the wide-open vowels such as A and I. It is essentially the same as the fundamental shape for an open jaw.
Position 4 is used primarily for the vowel E, but it can also be used on occasion for C, K, or N during fast speech.
Position 5 has the mouth wide open in an elliptical shape. This is the position used for the vowel O, as in the word "flow." It is created by mixing together an open jaw and the "ooo" sound. Sometimes, particularly when the sound is at the end of a word, you can overlap this shape with the one in position 6 to close the mouth.
Position 6 has the mouth smaller but more pursed. It is used for the "oooo" sound, as in "food," and for the vowel U. It is one of the fundamental mouth shapes.
Position 7 has the mouth wide open with the tongue against the teeth. This position is reserved for the letter L. It can also be used for the D or TH sounds, particularly when preceded by A or I. It is essentially an open jaw with the tongue moved up against the top teeth. If the speech is particularly rapid, this shape may not be necessary, and you can substitute position 2.
Position 8 has the bottom lip tucked under the teeth to make the sound of the letters F or V. In highly pronounced speech, this shape is necessary, but the shape could also be replaced with position 2 for more casual or rapid speech. This shape is one of the extra shapes modeled previously.
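The eight positions above can be summarized as a simple lookup table. The sketch below records only the primary sound for each position and leaves out the fast-speech substitutions. The names `MOUTH_POSITIONS`, `SOUND_TO_POSITION`, and `positions_for`, and the label "OO" for the "oooo" sound, are illustrative choices rather than part of any standard.

```python
# Sounds assigned to each of the eight positions, following the descriptions above.
MOUTH_POSITIONS = {
    1: ["M", "B", "P"],
    2: ["C", "D", "G", "K", "N", "R", "S", "TH", "Y", "Z"],
    3: ["A", "I"],
    4: ["E"],
    5: ["O"],
    6: ["OO", "U"],  # "OO" is a label chosen here for the "oooo" sound
    7: ["L"],
    8: ["F", "V"],
}

# Invert the table so a sound can be looked up directly.
SOUND_TO_POSITION = {
    sound: position
    for position, sounds in MOUTH_POSITIONS.items()
    for sound in sounds
}


def positions_for(sounds):
    """Return the mouth position number for each sound in a word."""
    return [SOUND_TO_POSITION[sound.upper()] for sound in sounds]


flow = positions_for(["F", "L", "O"])   # the word "flow"
room = positions_for(["R", "OO", "M"])  # the word "room"
```

A table like this is only a starting point. As noted earlier, accents and speech patterns may call for substituting one shape for another.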
Reading the Track
Now that you understand the basic mouth positions, it's time to break down the track. If you have animator's exposure sheet paper, use it. Otherwise, get a pad of lined paper on which to write your track, using one line per frame. Load the dialogue into a sound editing program. A number of sound editing packages are available, and you should choose one that enables you to display the time in frames, as well as select and play portions of the track. The ability to label sections in the editor is also handy.
The first thing you should do is match your sound editing program's timebase to the timebase of your animation (30, 25, or 24 frames per second, for example). After your timebase is set, selecting a snippet of dialogue should enable you to listen to the snippet and read its exact length in the editor's data window. The visual readout of the dialogue gives you clues as to where the words start and stop. Work your way through the track, and write down each phoneme as it occurs on your exposure sheet, frame by frame. This is a tedious but necessary chore.
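To make the bookkeeping concrete, here is a minimal sketch of an exposure sheet built in code: one entry per frame, filled in from phoneme start and end times read off the audio editor. The function names and the timings for the word "funny" are hypothetical and chosen only for illustration. The sketch assumes the segments are sorted and do not overlap.

```python
def seconds_to_frame(seconds, fps):
    """Convert a time in seconds to the nearest frame number at the given timebase."""
    return round(seconds * fps)


def build_exposure_sheet(segments, fps):
    """Return a list with one entry per frame naming the phoneme heard on that frame.

    segments -- list of (phoneme, start_seconds, end_seconds), assumed sorted
                and non-overlapping
    """
    total_frames = max(seconds_to_frame(end, fps) for _, _, end in segments)
    sheet = [""] * total_frames  # an empty string marks silence
    for phoneme, start, end in segments:
        first = seconds_to_frame(start, fps)
        last = seconds_to_frame(end, fps)
        for frame in range(first, last):
            sheet[frame] = phoneme
    return sheet


# Hypothetical timings for the word "funny", read at 24 frames per second.
funny = [
    ("f", 0.0, 0.125),
    ("uh", 0.125, 0.25),
    ("n", 0.25, 0.375),
    ("ee", 0.375, 0.75),
]
sheet = build_exposure_sheet(funny, fps=24)
for frame, phoneme in enumerate(sheet):
    print(frame, phoneme)
```

Changing `fps` to 25 or 30 gives the same track read against a different timebase, which is why the editor and the animation need to agree before you start writing anything down.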
Some packages give you the ability to play back audio in sync with the animation. This feature is particularly helpful because you may be able to skip the step of reading the track and just eyeball the sync. Still, it's always a good idea to have read the track methodically before animating so that you completely understand it and know exactly where all the sounds occur.
You can use audio editing programs to read dialogue tracks.
When reading the track, be sure to represent the sounds accurately. In human speech, most consonants are short and usually don't take up more than one or two frames. Vowels, however, can be of any length. If a person is shouting, for instance, you may have vowels topping 30 frames in length. In these cases, it is important that you don't simply hold the mouth in the exact same position for more than a second; it would look unnatural. Instead, create two slightly different mouth positions and keep the mouth moving between them so that the character looks alive.
Most advanced 3D applications also allow you to view audio as you animate.
Animating the Mouth
Once you've read the track properly, the phonemes and their locations are pretty much known, and you can simply adjust the morph sliders to match the dialogue. While it may sound easy, the actual task of creating convincing lip sync is an art; not only does the character's mouth have to match the dialogue, it must also be fluid and seamless. Success requires practice and experience, but there are a few rules of thumb.
If the lips are generally in sync, the audience accepts the character. If the lips are out of sync, the audience senses that something is wrong. Lips that are overanimated also stand out like a sore thumb.
Vowels are the points in speech at which the mouth opens. When animating a vowel, you need two positions: The first position is the accent pose, when the vowel is first uttered. The second position is the cushion pose, which happens toward the middle to the end of the vowel sound. The accent usually has the mouth open wider than the cushion. One good way to do this is to animate the jaw so that it closes slightly as the vowel progresses. For fast vowels that happen over only two frames, this may not be much of an issue, but this rule applies to anything that takes four frames or more.
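The accent and cushion poses can be expressed as a pair of jaw keyframes. The following is a rough sketch only: the function name, the slider values of 1.0 and 0.7, and the choice to place the cushion about two-thirds of the way through the vowel are all assumptions made for illustration.

```python
def vowel_jaw_keys(start_frame, length, accent=1.0, cushion=0.7):
    """Return (frame, jaw_open) keys for one vowel.

    The accent key opens the jaw when the vowel is first uttered. For vowels
    of four frames or more, a cushion key closes the jaw slightly later on.
    The accent and cushion values are illustrative slider settings.
    """
    keys = [(start_frame, accent)]
    if length >= 4:
        # Assumption: place the cushion about two-thirds of the way through.
        cushion_frame = start_frame + (2 * length) // 3
        keys.append((cushion_frame, cushion))
    return keys


long_vowel = vowel_jaw_keys(start_frame=10, length=6)
short_vowel = vowel_jaw_keys(start_frame=10, length=2)
```

In practice you would tune both values by eye against the track, since a shouted vowel and a mumbled one need very different amounts of jaw movement.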
Consonants occur when the mouth closes. With the possible exception of a long M, F, or V sound, most consonants are only a few frames in length, and some can be less than one frame long. With this in mind, leave each position on the screen long enough for the audience to read it. Consonants must be on the screen for at least two frames in order to be read. If the consonant is too short, steal time from a vowel or combine two consonants into one.
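The "steal time from a vowel" rule can be sketched as a small pass over the track. This is one possible approach, not a standard algorithm: it assumes each segment is recorded as a sound, a kind, and a length in frames, and it only borrows from a neighboring vowel that would still be longer than two frames afterward. The function names and the sample lengths are hypothetical.

```python
MIN_FRAMES = 2  # a consonant needs at least two frames to be read


def find_vowel_donor(segments, index):
    """Return the index of an adjacent vowel that can spare a frame, or None."""
    for neighbor in (index - 1, index + 1):
        if 0 <= neighbor < len(segments):
            _, kind, length = segments[neighbor]
            if kind == "vowel" and length > MIN_FRAMES:
                return neighbor
    return None


def enforce_min_consonant_length(segments):
    """Lengthen short consonants by borrowing frames from adjacent vowels.

    segments -- list of (sound, kind, length_in_frames), where kind is
                "vowel" or "consonant". The total length is preserved.
    """
    result = [list(segment) for segment in segments]
    for i, segment in enumerate(result):
        if segment[1] != "consonant":
            continue
        while segment[2] < MIN_FRAMES:
            donor = find_vowel_donor(result, i)
            if donor is None:
                break  # nothing to borrow from; leave the consonant short
            result[donor][2] -= 1
            segment[2] += 1
    return [tuple(segment) for segment in result]


# Hypothetical frame counts for the word "funny".
track = [("f", "consonant", 1), ("uh", "vowel", 5),
         ("n", "consonant", 1), ("ee", "vowel", 6)]
fixed = enforce_min_consonant_length(track)
```

When no vowel can spare a frame, the sketch leaves the consonant alone. That is the case where combining two consonants into one, as suggested above, becomes the better option.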
When animating a vowel, you open the mouth quickly and close it slowly.
The best way to achieve smooth mouth motion is to concentrate on phrases rather than individual phonemes. An extreme example is rapid dialogue. In this case, the phonemes occur more frequently than once per frame. Animating even at one phoneme per frame makes the character's mouth appear to strobe or stutter. Typically, you should keep most mouth poses on the screen for at least two frames so that they are readable, so if the dialogue is more rapid than one phoneme every two frames, you'll need to approximate the track by picking the most dominant phoneme. Vowels are always the loudest and longest sounds. Even when you're working with the fastest dialogue, accent the vowels.
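One way to approximate a rapid track is to divide it into two-frame windows and keep a single dominant phoneme in each, preferring vowels. The sketch below assumes each phoneme is tagged as a vowel or consonant and carries a start position measured in frames (which may be fractional). The function name, the sample data, and the tie-breaking rule of keeping the earliest candidate are assumptions.

```python
def approximate_rapid_track(phonemes, hold=2):
    """Keep one dominant phoneme per window of `hold` frames.

    phonemes -- list of (sound, kind, start_frame), where kind is "vowel" or
                "consonant" and start_frame may be fractional
    Returns a list of (frame, sound) keys. A vowel replaces a consonant in
    the same window; otherwise the earliest sound in the window is kept.
    """
    windows = {}
    for sound, kind, start_frame in phonemes:
        key = int(start_frame // hold)
        current = windows.get(key)
        if current is None or (kind == "vowel" and current[1] != "vowel"):
            windows[key] = (sound, kind)
    return [(key * hold, sound) for key, (sound, _) in sorted(windows.items())]


# Hypothetical rapid delivery: five phonemes packed into about four frames.
rapid = [
    ("g", "consonant", 0.0),
    ("eh", "vowel", 0.6),
    ("t", "consonant", 1.4),
    ("ih", "vowel", 2.2),
    ("t", "consonant", 3.1),
]
keys = approximate_rapid_track(rapid)
```

The result is a starting point for the mouth keys. You would still check it against the audio, because a strongly hit consonant can sometimes read better than the vowel beside it.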
One of the most common mistakes in animating the mouth is to create the phoneme shapes and let the computer do the inbetweens. This winds up looking incredibly mechanical. Just like motions of the body, motions in the face need to overlap. If you're animating your mouth shapes using muscle-based morph targets, you can adjust the curves slightly so that the motions overlap by a frame or two. This smooths out the mouth motion, so that the resulting animation looks more natural.
Another mistake is using a single set of stock phoneme shapes for all of a character's lip sync. This will provide no variety in the mouth shapes, which will make the character look mechanical.
As with overlap, muscle-based morph targets can help a lot. Go back over the animation and mix up the curves. If the character has a loud vowel sound, open the jaw more. Add some asymmetry by turning up the smile on one side of the face. In general, do whatever it takes to make the mouth match the character of the dialogue track on any given frame.
Consonants break up the vowels and are usually shorter in length.
Animating the Character
Lip sync gets the mouth to move, but dialogue involves the whole character, not just the face and mouth. When a character in a film is speaking dialogue, the audience sees an entire character composed of a body, arms, legs, and a head. Although the mouth is important, it is usually small in relation to the entire character. While the lips should be synced, much more important to the audience is the movement of the body and head, as well as the expression in the face and eyes. This is where you truly bring the character to life.
Mouth or Body First?
With the body so important to dialogue, one of the questions you might have is whether to animate the mouth or the body first. In cel animation, animators are forced to draw the mouths last, since it makes no sense to draw mouths on a character until the animation of the body is drawn. In stop motion, the mouths are done at the same time as the body. In computer graphics, it's really not that big of an issue, because any part of the animation can be tweaked independently of the others.
Some animators do the mouth first just to get the tedious task out of the way. It also is easier to get the mouth animated first on a still head rather than one that is moving. Other animators like to concentrate on the body first, and then get to the mouth. Both approaches work equally well, and since you can always go back and tweak the body and the lips independently, where you start is up to you.
Listening to the Track
Before you animate the character's body, you need to listen to the dialogue track, not for phonemes but for mood. As you listen, close your eyes and try to picture yourself as the character. Pretty soon, you'll have an idea of what the character should be doing as the dialogue is spoken. This is where acting really enters the picture: you'll need to place yourself in the character's frame of mind to understand how the character will act.
As you listen, you'll also get a sense of the rhythm of the track. Certain words are emphasized more than others; note these on your script, because these are the major beats of the dialogue. Your character's major gestures usually happen near the beats, and this process gives you the timing of your animation. After you have the poses and the timing, you can begin to block out the animation, pose to pose.
Blocking Out Poses
The best way to animate dialogue is using standard pose-to-pose animation techniques, as outlined in Chapter 5. Posing to a dialogue track is like creating a dance to the spoken word instead of to music. Listening to the track will suggest a string of poses; try to make each one flow into the next. Once you have a mental picture of your character speaking the dialogue, you can thumbnail the poses or create them in the computer. A library of stock poses can help a lot with this process.
Keep your animation simple and direct. Try to animate one motion and emotion at a time. In any given frame, the character will have one thought and emotion going through its head, so illustrate that single thought in that single frame.
When you animate to dialogue, the timing of the track is fixed, so don't try to squeeze too many poses into this fixed amount of time. A pose generally needs at least 6 to 12 frames to be read by the audience, but most poses will be held longer. To keep the number of poses under control, hit the major beats of the track first and then, if the animation needs more, fill in the holes with secondary poses.
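A quick check for overcrowded poses can be sketched as follows. It treats the gap between one pose key and the next as the time that pose has on screen, which is a simplification, since part of that gap is spent moving into the next pose. The function name, the sample frame numbers, and the default threshold of 6 frames are illustrative.

```python
def short_holds(pose_frames, last_frame, min_hold=6):
    """Return (pose_frame, hold_length) for each pose with fewer than min_hold frames.

    pose_frames -- frame numbers at which each pose is keyed
    last_frame  -- final frame of the shot, which ends the last pose
    """
    frames = sorted(pose_frames)
    ends = frames[1:] + [last_frame]
    return [
        (start, end - start)
        for start, end in zip(frames, ends)
        if end - start < min_hold
    ]


# Hypothetical blocking for a 60-frame line of dialogue.
flagged = short_holds([0, 14, 18, 40], last_frame=60)
```

Any pose this flags is a candidate for removal or for merging with its neighbor before you move on to secondary poses.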
A sequence of poses for a line of dialogue.
Timing the Poses
Once you have your poses blocked out, you need to sync them to the track. Find the major beats of the track and adjust the poses so that they "hit" a few frames before the beat. A thought manifests itself in the motions of the hands and body before it is voiced: think of a character who's having a difficult time saying something, perhaps a confession of love. As he searches for the right thing to say, his hands reveal the thought several seconds before he spits the words out. This is true even in normal conversation.
The body usually anticipates the dialogue by a few frames.
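Placing poses ahead of the beats amounts to subtracting a lead from each beat frame. The sketch below assumes the beats have already been noted as frame numbers. The function name, the sample beats, and the default lead of three frames are hypothetical, and the right lead depends on the character and the delivery.

```python
def pose_frames_for_beats(beat_frames, lead=3):
    """Return the frame at which each pose should hit, `lead` frames before its beat.

    Frames are clamped at zero so that a beat near the start of the shot
    does not produce a negative frame number.
    """
    return [max(0, beat - lead) for beat in beat_frames]


# Hypothetical beats at frames 2, 24, and 51.
pose_keys = pose_frames_for_beats([2, 24, 51], lead=3)  # [0, 21, 48]
```

Treating the lead as a parameter makes it easy to try different amounts of anticipation against the same track.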
Finishing the Animation
Once you've blocked out the basic poses, you can finish the rest of the animation using standard animation principles. When a character moves from one pose to the next, for example, he must anticipate the change and overshoot the final pose before settling in. In addition to the main poses, dialogue animation requires a lot of secondary gestures and poses. These little details are what add life to the character.
Eyes and Dialogue
When animating eyes with dialogue, be sure you understand where the character needs to be looking. Ask yourself who the character is talking to, and try to keep the eyes focused on the subject.
Of course, there are also places where a character may need to look away. People who are nervous tend to give darting glances. A dishonest person's eyes may be somewhat shifty. Don't be afraid to change the shape of the eyes and brows as the dialogue requires. A character whose eyes remain the same shape throughout a line of dialogue will appear lifeless.
The eyes can add a whole new level of meaning to a line of dialogue.
Blinks are also very important. They accompany most major head motions, so if the head turns or bobs to accent a phrase, blink the eyes as well. Dead spots in the dialogue are also good places to sprinkle in a blink or two.
Head Motion and Dialogue
The head moves quite a bit when people talk, bobbing, nodding, and shaking to emphasize certain words. When your character makes a loud sound, it usually raises its head to help open the throat, and this is helpful to keep in mind when you're animating loud sounds or emphasis in speech.
When a character starts speaking, the head and body lift four to six frames before the mouth starts talking. This happens partly because the character needs to take a breath before talking. Lifting the character helps emphasize the start of speech and also draws the audience's attention to the character.
When animating a beat in which the head rises, it's always a good idea to anticipate the motion by lowering the head three or four frames before the accent, then popping the head up on the accented syllable. This is also known as a head bob, and it is usually accompanied by a blink. To get more action into the head bob, you can also involve the body: As the head moves down in anticipation of the accent, raise the shoulders a bit. When the head pops up, lower the shoulders. Taken to an extreme, this type of motion is the same as is used in the classic cartoon "take."
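The head bob can be written out as a small set of keyframes. This is a sketch under several assumptions: the channel names `head_pitch` and `blink`, the rotation values in degrees, the blink placed on the accent frame, and the six-frame settle back to rest are all hypothetical. Only the three-frame anticipation comes from the description above.

```python
def head_bob_keys(accent_frame, anticipation=3, dip=-5.0, rise=8.0, settle=6):
    """Return keyframes for a head bob around an accented syllable.

    The head dips `anticipation` frames before the accent, pops up on the
    accent, and returns to rest `settle` frames later. Values are rotations
    in degrees (an assumed convention); the blink lands on the accent frame.
    """
    down_frame = accent_frame - anticipation
    rest_frame = accent_frame + settle
    return {
        "head_pitch": [(down_frame, dip), (accent_frame, rise), (rest_frame, 0.0)],
        "blink": [(accent_frame, 1.0)],
    }


bob = head_bob_keys(accent_frame=30)
```

A shoulder channel could be added in the same way, raised on the dip and lowered on the pop, to involve the body as described.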
When a character shouts, its head tilts up to open the throat.
Hand Gestures and Dialogue
When talking, many people use their hands to clarify and emphasize the major points of their speech. Getting this part of the animation correct is a lesson in acting. If you want to see how not to animate the hands, watch some really nervous or first-time actors. They usually are very self-conscious and stuff their hands in their pockets, wring them nervously, or hang them loose at their sides.
In real life, body language precedes the dialogue by anywhere from a few frames to as many as 20. Generally, a slow, dim-witted character has more time between his gestures and his dialogue than a sharp, quick character. Speedy Gonzales has considerably less lead time in his gestures than Forrest Gump. Someone giving a long, boring speech will be much slower than a fire-and-brimstone evangelist.
You should also make an effort to ensure that your gestures fit the dialogue smoothly. The first gesture every animator learns is the ubiquitous finger point for emphasis, followed soon after by the fist pounding into the palm. These gestures certainly have their place, but within a much larger palette. As with most types of motions, simply watching people in their natural habitat is always your best reference.
Little details like hand gestures can add life to an animation.
This first exercise will get you familiar with creating facial animation. Using a character that has been rigged for facial animation, do a series of tests that make the character go through a range of emotions:
- Make the character go from happy to angry.
- Make the character go from surprised to sad.
- Make the character go from disgusted to fearful.
Although the face will play an important part in depicting these emotions, you should animate the entire character. As the character goes from one emotion to the next, he will anticipate the change. The animation may include a breakdown pose to help guide the transitions.
Now that you're comfortable animating facial expressions, move on to animating dialogue. For this exercise, find about 10 seconds of dialogue and animate your character speaking the line. You can record your own voice or have an actor record the line. If you don't have recording facilities, take a line from a movie or TV show and animate to that.
Make sure you read the track first, and then animate the lip sync. Once the mouth is in sync, animate the basic poses to match the track. Flesh out the animation with additional gestures and poses.