Before we can talk about tape formats, cabling and connectors, or capture settings, you have to understand the basics of the video signal. From our digital vantage point of 2005, a lot of video's technical details look downright nonsensical: what's with 59.94 Hz, 7.5 IRE setup, and dot-crawl? Yet they all resulted from careful design decisions made back in analog days, and most (if not all) make perfect sense once you know their background.
When you know how an image is structured, and how it's broken down for storage and transmission, you'll be able to choose the right kinds of connections for moving pictures around, and be able to diagnose arcane but common problems associated with interlacing and field order.
Frames, Scanning, and Sync
Video is a sequence of pictures; but the pictures themselves can be structured in a variety of ways. Fortunately, the need for transmitter and receiver to work together resulted in some fundamental principles that all modern video systems follow.
When you read this book, you start at the upper-left corner of a page, scan across a line of text, then return to the left edge to read the next line of text (assuming, of course, that this is not the Japanese or Hebrew version of this book). Reading a line at a time, you traverse the entire two-dimensional page, breaking each page down into lines of text.
Video works the same way, reading a single scanline across the image, then moving down to read another, and another, until an entire frame of video has been scanned. The basic video signal is an analog waveform (we'll get to digital later on), where zero voltage corresponds to black, and voltage increases in proportion to scene brightness.
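If it helps to see the scanning idea in concrete terms, here is a minimal Python sketch. The tiny three-line "frame" and the function name raster_scan are made up for illustration; real video has hundreds of scanlines and a continuous analog signal rather than a list of numbers.

```python
def raster_scan(frame):
    """Yield (line_number, sample) pairs in scan order:
    left to right along each scanline, top scanline first."""
    for y, scanline in enumerate(frame):
        for sample in scanline:
            yield y, sample

# Hypothetical 3-line image: 0.0 stands for black, 1.0 for full brightness.
frame = [
    [0.0, 0.5, 1.0],
    [1.0, 0.5, 0.0],
    [0.2, 0.2, 0.2],
]

# The two-dimensional picture becomes a one-dimensional stream of brightness values.
signal = [sample for _, sample in raster_scan(frame)]
print(signal)
```

The point is only that a two-dimensional picture turns into a single stream of brightness values, one scanline after another.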
In NTSC television, the brightness scale is calibrated in IRE units, with 100 divisions between zero voltage and the nominal white level (the level at which the display shows a full-brightness white image). In PAL, brightness is measured in millivolts, mV.
IRE units were standardized by the Institute of Radio Engineers, hence the designation. The IRE became the IEEE (Institute of Electrical and Electronics Engineers), and IRE units nowadays should be called IEEE units, but nobody ever calls them that.
FCP's internal Waveform monitor is calibrated in percent, from 0 percent to 100 percent nominal peak white. This is not the same as IRE units, as you'll see.
The structure of scanlines all stacked one atop another to form a complete frame is called a raster, hence the term raster graphics for images broken down into a regular pattern of scanlines, or scanning raster for the image structure displayed on the face of a CRT (cathode-ray tube).
Scanlines are separated by horizontal sync pulses, short negative voltage spikes inserted by the camera to tell the video display where one scanline ends and another is about to begin. The camera also generates a vertical sync pulse at the bottom of the image so that the display knows when to retrace to the top of the screen and start a new scan.
Vertical sync pulses have the same amplitude (height) as horizontal sync pulses, but are many times wider and have additional complications called serrations and equalizing pulses, which are interesting in their own right but beyond the scope of this lesson.
This waveform display shows scanlines complete with horizontal sync and colorburst.
FCP's waveform display shows only picture, not sync or blanking.
Setup, Black, and Blanking
Early televisions were simple things: the video signal controlled the CRT's electron beam directly. You set the black level of the screen by fiddling with a "black level" or "picture" control; that control set the point at which the voltage of the signal just barely started to make the CRT's phosphors glow. If you misadjusted the picture control even slightly towards the "bright" side, you risked seeing a handful of diagonal lines running across your picture: the retrace of the electron beam as it completes one field and returns to the top of the screen to start the next one.
The NTSC standard fixed the problem by adding a setup or pedestal to the picture: a small voltage offset equal to 7.5 percent of the total brightness level. Setup raises the nominal black level a bit above the zero-voltage blanking level (the level at which the electron beam should be completely off, or blanked). Setup gave you room for error; by the time you turned the brightness up enough to see the retrace lines, the blacks in your picture were milky gray.
Setup was a sensible solution in the early days of television, but modern TV sets blank the retrace completely, regardless of black level. Outside of North and Central America, setup is unknown, even in NTSC-using countries like Japan and South Korea. For most of the world, the black level is the same as the blanking level. But setup remains part of North American NTSC, and black is 7.5 IRE units higher than blanking, a detail that will return to haunt us in later lessons.
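To get a feel for what setup does to the numbers, here is a rough sketch that maps a 0 to 100 percent picture scale onto IRE units. It assumes the percent scale runs linearly from black (7.5 IRE with setup, 0 IRE without) up to nominal white at 100 IRE. That linear mapping and the function name percent_to_ire are assumptions for illustration, not a description of how any particular waveform monitor is calibrated.

```python
SETUP_IRE = 7.5    # North American NTSC black level, from the text
WHITE_IRE = 100.0  # nominal peak white

def percent_to_ire(percent, setup=SETUP_IRE):
    """Map a 0-100 percent picture level onto IRE units.
    Assumes a linear scale from black (at `setup`) to white (at 100 IRE)."""
    return setup + (percent / 100.0) * (WHITE_IRE - setup)

for level in (0, 50, 100):
    with_setup = percent_to_ire(level)
    without_setup = percent_to_ire(level, setup=0.0)
    print(f"{level:3d}% -> {with_setup:6.2f} IRE with setup, "
          f"{without_setup:6.2f} IRE without")
```

Under these assumptions, the two scales agree only at peak white; everywhere else the same picture level sits at a higher IRE value when setup is present.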
Fields and Frames
The more scanlines a frame has, the more vertical detail it shows. Of course, the more scanlines you have, the more time it takes to trace out the picture. Unfortunately, the phosphors in CRTs have a very short persistence: they don't remain bright for very long after the scanning electron beam passes by, and early television pioneers found that by the time they had enough scanlines to form a decent picture, they couldn't refresh the images quickly enough to prevent the displays from flickering abominably. Even though the frame rate was high enough to adequately represent motion, flicker made the pictures unwatchable.
The solution to the flicker problem was interlace: instead of progressively scanning the entire frame a line at a time, the way you'd read a page, the frame was divided into two fields, one composed of all the even scanlines, the other of all the odd scanlines. Each field contains half the picture, so fields can be scanned (and displayed) twice as quickly as frames. Even though the total amount of information is the same as in progressive frame scanning, interlaced field scanning allows the CRT to be refreshed twice as quickly, so the flicker problem was overcome. The eye handily integrates the two interlaced fields into a full frame, and the net result is that you get enough scanlines to form a detailed picture while repainting the screen fast enough to minimize flicker.
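The even/odd split can be sketched in a few lines of Python. The names split_fields and weave are hypothetical, and counting scanlines from zero is a convenience of the sketch; the broadcast standards number their lines differently.

```python
def split_fields(frame):
    """Divide a frame (a list of scanlines) into its two fields."""
    even_field = frame[0::2]  # scanlines 0, 2, 4, ...
    odd_field = frame[1::2]   # scanlines 1, 3, 5, ...
    return even_field, odd_field

def weave(even_field, odd_field):
    """Interleave two fields back into a full frame."""
    frame = []
    for i in range(len(even_field) + len(odd_field)):
        source = even_field if i % 2 == 0 else odd_field
        frame.append(source[i // 2])
    return frame

frame = [f"scanline {n}" for n in range(6)]
even_field, odd_field = split_fields(frame)
print(even_field)  # half the picture
print(odd_field)   # the other half
print(weave(even_field, odd_field) == frame)
```

Each field holds half the scanlines, and weaving the two back together recovers the original frame, which is what your eye does for you on an interlaced display.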
Back in the 1930s, television was an inherently real-time process; what the camera saw was transmitted as it occurred to the receiver, and displayed just as it was received; there was no means to store a frame and repeatedly display it to overcome flicker.
Film projection overcomes 24 fps flicker by flashing each frame onscreen two or three times. The resulting 48 Hz or 72 Hz flashing, like video's 50 Hz or 59.94 Hz interlaced field rate, keeps "the flicks" from flickering to excess.
Unfortunately, interlace introduces its own unique problems.
The first is simple terminological confusion: the same interlaced signal can be referred to by either its field rate or by its frame rate. NTSC television runs at 59.94 Hz, about 60 fields per second, or 29.97 Hz, about 30 frames per second. PAL runs at 50 Hz (field rate) or 25 Hz (frame rate).
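The relationship between the two numbers is just a factor of two, as this small sketch shows. It uses the exact fraction 60000/1001 for NTSC's field rate, which is where the rounded 59.94 figure comes from; the function name frame_rate is made up for the example.

```python
from fractions import Fraction

NTSC_FIELD_RATE = Fraction(60000, 1001)  # the exact value behind "59.94 Hz"
PAL_FIELD_RATE = Fraction(50)

def frame_rate(field_rate):
    """Two interlaced fields make one frame, so the frame rate is half the field rate."""
    return field_rate / 2

for name, fields in (("NTSC", NTSC_FIELD_RATE), ("PAL", PAL_FIELD_RATE)):
    frames = frame_rate(fields)
    print(f"{name}: {float(fields):.2f} fields/sec = {float(frames):.2f} frames/sec")
```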
NTSC's field rate of 59.94 Hz (originally 60 Hz) and PAL's 50 Hz were chosen to match their native countries' powerline frequencies so that interference and "hum" from AC power wouldn't ripple annoyingly through the picture.
It's not difficult to figure out what the numbers mean when you're dealing with NTSC and PAL, but once you start looking at HDTV, it's sometimes difficult to tell whether the field rate or frame rate is being used to describe a clip.
Final Cut Pro always uses the frame rate for its Vid Rate and Editing Timebase values, never the field rate.
Second, the two separate fields don't really make up a proper frame, because they're not captured at the same time. Although each fills in the gaps of the other, the first field happens a field's time before the second. For static images, the two fields interlace smoothly, but if there's motion in the image, it shows up on still frames as "interlace combing" or tearing, a displacement of the moving objects in one field compared to the other.
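Here is a toy illustration of how combing arises. It draws a one-pixel-wide vertical bar as text, "captures" it at two different moments, and weaves the even lines of the first capture with the odd lines of the second. The frame size, the bar positions, and the function names are all invented for the sketch.

```python
WIDTH, HEIGHT = 8, 4

def capture(bar_x):
    """Return a full picture of a vertical bar at horizontal position bar_x."""
    return ["." * bar_x + "#" + "." * (WIDTH - bar_x - 1) for _ in range(HEIGHT)]

def weave_interlaced(first_moment, second_moment):
    """Build a frame from the even lines of one moment and the odd lines of the next."""
    return [
        (first_moment if y % 2 == 0 else second_moment)[y]
        for y in range(HEIGHT)
    ]

still = weave_interlaced(capture(2), capture(2))   # bar did not move between fields
moving = weave_interlaced(capture(2), capture(4))  # bar moved two pixels between fields

print("\n".join(still))
print()
print("\n".join(moving))
```

When the bar stands still, the two fields line up and the frame looks solid. When it moves between fields, alternate scanlines show it in different places, which is the comb-tooth pattern you see on a paused interlaced frame.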
The stills were extracted from the two short clips 60i.mov and 30p.mov in the media directory for this lesson. You might want to load them into a DV/DVCPRO NTSC sequence in FCP and see how they look in the Viewer and Canvas. To see what the clips really look like, set the Canvas to 100 percent, with the Show As Sq. Pixels setting off. (Both settings appear in the Zoom pop-up menu in the Viewer and Canvas, or in the View > Level menu.)
FCP's Viewer and Canvas show you a true pixel-for-pixel representation only when Show As Sq. Pixels is turned off and the scale is set to 100 percent. You'll find that larger scales will seem to work with raw clips, but once you apply filters, the only way to see what you really have is to use these settings.
Of course, if you can output the sequence to an NTSC monitor, you'll also see the difference in still frames there.
Because the two fields are really separate images, they have to be processed as such. That's why FCP has the Field Dominance setting in Sequence Settings: the setting tells FCP which field comes first in time. (If you get the setting wrong, rendered images show exaggerated combing artifacts when they're moving.)
Many filters and transitions work on a field basis for interlaced material, not a frame basis, to avoid combing artifacts. FCP also has to resize interlaced images using the two half-resolution fields separately instead of the entire full-resolution frame. This is one reason folks like progressive scanning when upconverting or downconverting or printing out to film: progressive images allow much crisper and smoother resizing with fewer artifacts.
Interlaced images also have slightly less vertical resolution than progressive images, even on static pictures. Interlaced cameras apply slight vertical blurring during image capture to avoid twitter (frame-rate flicker) of fine details that only appear in one field or the other, and that, combined with various perceptual factors, means that the actual vertical resolution of interlaced images is about 0.7 x the expected resolution based on the line count.
A picture's aspect ratio is the ratio of its width to its height. In TV terms, aspect ratio is most often expressed in integers: 4x3, or 16x9. Most standard-definition pictures worldwide are 4x3 (4 units wide by 3 units tall), but the 16x9 widescreen format is increasingly popular, especially in Europe, and has been chosen as the aspect ratio for high-definition television. All current TV standards use one of these two ratios, though others have been used in the past.
4x3 can also be expressed as 12x9, for direct comparison with 16x9. 16x9 can be shot with a camera that captures 16x9 directly, or by adding an anamorphic lens to a 4x3 camera, which squeezes the wider image to fit in the narrower frame.
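The arithmetic behind both statements is easy to check. This sketch (with made-up names) scales each ratio to a common height of 9 units, then works out how much an anamorphic lens has to squeeze a 16x9 image horizontally to fit it into a 4x3 frame.

```python
from fractions import Fraction

FOUR_BY_THREE = Fraction(4, 3)
SIXTEEN_BY_NINE = Fraction(16, 9)

def width_at_height(aspect, height):
    """Width of a picture with the given aspect ratio at the given height."""
    return aspect * height

print(width_at_height(FOUR_BY_THREE, 9))    # 4x3 at a height of 9 units
print(width_at_height(SIXTEEN_BY_NINE, 9))  # 16x9 at the same height

# Horizontal squeeze needed to fit a 16x9 image into a 4x3 frame.
squeeze = FOUR_BY_THREE / SIXTEEN_BY_NINE
print(squeeze)
```

The squeeze works out to three-quarters: the widescreen picture is recorded at 75 percent of its proper width and stretched back out on display.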
FCP normally displays standard-definition pictures as 4x3. If a clip or sequence has the anamorphic flag set, that picture will be shown as 16x9 within FCP. Depending on your video output, you may have to switch your picture monitor to 16x9 manually.
Most 16x9-capable cameras set the flag automatically when shooting in 16x9, and FCP reads that flag during FireWire capture. Footage shot with an anamorphic lens won't have the flag set, and capture through other capture cards may not sense the widescreen flag. In these cases, use a Capture Preset that sets the anamorphic flag explicitly.
You can change the state of the anamorphic setting for clips you've already captured by using Edit > Item Properties > Format, and for sequences with Sequence > Settings.
In film, aspect ratio is normally given as a ratio to a picture height of 1: a 4x3 picture is 1.33:1, a 16x9 picture is 1.78:1. A full 8mm, Super8, 16mm, or 35mm frame is 1.33:1, but most "flat" (non-anamorphic) films are released in the USA as 1.85:1 (which is only about 4 percent wider than 16x9, making 16x9 the favored format for digital filmmakers). "Scope" films (from "Cinemascope"), normally shot with an anamorphic lens, have an aspect ratio of 2.38:1.
FCP handles both 4x3 and 16x9 material. FCP calls all standard-definition 16x9 material "anamorphic" because 16x9 images, viewed on a normal 4x3 monitor, are horizontally squished regardless of whether they were shot with a 16x9-native camera or with an anamorphic lens.
Anamorphic means "not (iso)morphic", or not equally scaled horizontally and vertically.
Widescreen pictures are often displayed on "narrowscreen" televisions by letterboxing, shrinking the image to fit and filling the space above and below with black bands. 4x3 images can be inserted into a 16x9 program by pillarboxing, adding black side panels on the edges of the 4x3 picture.
The odd terminology refers to British mailboxes, which are tall vertical pillars with thin horizontal slots for the mail.
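If you want to know how big the black bands or side panels will be, the calculation is simple. The sketch below assumes a display with square pixels, and the 640x480 and 1280x720 sizes are just convenient examples; the function names are hypothetical.

```python
from fractions import Fraction

def letterbox(display_w, display_h, image_aspect):
    """Fit a wider image to the display width.
    Returns (image height, height of each black band)."""
    image_h = int(display_w / image_aspect)
    band = (display_h - image_h) // 2
    return image_h, band

def pillarbox(display_w, display_h, image_aspect):
    """Fit a narrower image to the display height.
    Returns (image width, width of each side panel)."""
    image_w = int(display_h * image_aspect)
    panel = (display_w - image_w) // 2
    return image_w, panel

# 16x9 picture on a hypothetical 640x480 (4x3) square-pixel display.
print(letterbox(640, 480, Fraction(16, 9)))

# 4x3 picture on a hypothetical 1280x720 (16x9) square-pixel display.
print(pillarbox(1280, 720, Fraction(4, 3)))
```

In both cases the picture fills three-quarters of the screen in one direction, and the remaining quarter is split between the two black bars.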
Bandwidth, Resolution, and Line Counts
Bandwidth determines the resolution of an analog television signal, and line count is what measures the resolution. You'll see both terms used in equipment specifications, and in the never-ending arguments about which camera, VTR, or tape format is supposedly better than another.
An NTSC television picture has about 483 active lines, scanlines carrying picture information. (There are 525 lines in an entire frame; the rest carry the vertical blanking and vertical sync. PAL has 576 active lines out of 625 total.) You might think, then, that the vertical resolution of an NTSC picture would be 483 lines, but interlace reduces the effective resolution to only about 70 percent of the line count (the Kell factor), or about 338 lines.
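The 70 percent rule is simple enough to try on both standards. In this sketch the PAL result is an extrapolation: it applies the same factor to PAL's 576 active lines, a figure the text gives without working out the effective resolution. The function name is made up.

```python
KELL_FACTOR = 0.7  # the "about 70 percent" figure from the text

def effective_vertical_resolution(active_lines, factor=KELL_FACTOR):
    """Estimate effective vertical resolution from the active line count."""
    return active_lines * factor

for name, active_lines in (("NTSC", 483), ("PAL", 576)):
    lines = effective_vertical_resolution(active_lines)
    print(f"{name}: {active_lines} active lines -> about {round(lines)} lines")
```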
In television, resolution is measured by individual TV lines: a black line beside a white line counts as two lines. In film, print, and most other media, resolution is measured in line pairs or cycles: a black line and a white line count as only one line pair.
Horizontal resolution is measured over a width equal to the height of the picture, so you can compare horizontal and vertical resolution values on an equal basis. The formal measurement is thus TVl/ph (TV lines per picture height), although you'll often see it listed as "TV lines."
It turns out that each MHz of analog signal bandwidth results in about 80 TVl/ph. NTSC brightness signals have 4.2 MHz of bandwidth, or 336 TVl/ph. Thus an NTSC picture has essentially equal horizontal and vertical luma resolutions.
NTSC chroma's bandwidth of 0.5 to 1.5 MHz (depending on the exact color) yields an effective chroma resolution of only 40 to 120 TVl/ph. This low resolution doesn't bother us much because our eyes are so much less sensitive to fine detail in chroma than in luma.
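You can reproduce these figures with the 80 TVl/ph per MHz rule of thumb. The sketch below is only that rule turned into code, with a made-up function name; it doesn't model filtering, the display, or anything else that affects real-world resolution.

```python
TVL_PER_MHZ = 80  # rule of thumb from the text

def tvl_per_picture_height(bandwidth_mhz):
    """Approximate horizontal resolution, in TVl/ph, for a given analog bandwidth."""
    return bandwidth_mhz * TVL_PER_MHZ

luma = tvl_per_picture_height(4.2)
chroma_low = tvl_per_picture_height(0.5)
chroma_high = tvl_per_picture_height(1.5)

print(f"NTSC luma:   about {round(luma)} TVl/ph")
print(f"NTSC chroma: about {round(chroma_low)} to {round(chroma_high)} TVl/ph")
```

Compare the luma figure with the roughly 338 lines of effective vertical resolution, and you can see why an NTSC picture is about equally sharp in both directions.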