Choosing Audio Encoding Parameters


Once you move past encoding templates and start setting parameters manually, you'll find yourself frequently challenged by a confusing array of audio compression options: not only bitrate, but also foreign concepts such as sampling rate and bit depth. When choosing these parameters, it's important to know what comprises an audio file.

Most audio starts out as analog, meaning the spoken word or music. When an analog signal is converted to digital, whether by using your DV camera or your computer's sound card, the signal has three characteristics, each of which affects the ultimate size of the digital audio file.

The first characteristic is sampling frequency, or the number of times per second an incoming signal is "sampled." When an audio file has a sampling frequency of 44.1kHz, it means that each second of audio is split into 44,100 chunks during digitization. As you might expect, the higher the sampling frequency, the more accurate the recording, but each chunk must be stored separately, which increases the size of the digital file. Files recorded at 44.1kHz (the standard for CD-Audio) are twice as large as files recorded at 22kHz (considered FM-quality) and four times larger than files sampled at 11kHz (AM-quality). According to Nyquist's theorem, the governing principle of digital audio sampling, an analog audio clip can be reconstructed accurately as a digital waveform only if the sampling rate is at least twice the highest audio frequency in the clip. The good news is, your video editor will almost always spare you such calculations simply by limiting your sampling-rate options to the 11 to 44.1kHz (or higher) range.
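
To make the arithmetic concrete, here is a minimal Python sketch of the sampling-rate math described above; the frequencies used are illustrative assumptions, not values from any particular editing tool.

    # Minimal sketch of the sampling-rate arithmetic described above.
    # The example frequencies are illustrative assumptions.

    def nyquist_minimum_rate(highest_frequency_hz: float) -> float:
        """Nyquist's theorem: sample at least twice the highest frequency."""
        return 2 * highest_frequency_hz

    # Human hearing tops out around 20,000Hz, which is why 44.1kHz works for CD-Audio.
    print(nyquist_minimum_rate(20_000))        # 40000.0 -- 44,100Hz clears it

    # Samples captured (and stored) per second at the three common rates.
    for rate_hz in (11_025, 22_050, 44_100):
        print(f"{rate_hz}Hz sampling -> {rate_hz} samples per second to store")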

The next characteristic is bit depth, which describes the amount of data allocated to each sample, generally either 8 bits or 16 bits. Obviously, a 16-bit recording will be twice as large as an 8-bit recording. However, when analog audio is recorded at 8 bits, there are only 256 (2^8) possible values available to store the signal. In contrast, at 16 bits, there are 65,536 (2^16) possible values.

To put this in context, imagine you were recording an orchestra complete with strings, woodwinds, brass, and percussion, with many instruments capable of an incredible range of subtle tones. If you record that orchestra at 8 bits, all samples, from alto flute to xylophone, must be slotted into one of 256 possible values. Not much room for subtlety there. At 16 bits, the number expands to 65,536, which is much more reasonable. As you would expect, CD-Audio discs use 16 bits; with a sampling rate of 44.1kHz, that means 705,600 bits for each second of sound per channel, which is ample breathing room.
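
If you want to check these figures yourself, the bit-depth math is simple enough to reproduce in a few lines of Python (a sketch of the arithmetic only, not of how any codec works):

    # Possible values per sample at each bit depth.
    for bit_depth in (8, 16):
        print(f"{bit_depth}-bit audio: {2 ** bit_depth} values per sample")
    # 8-bit audio: 256 values per sample
    # 16-bit audio: 65536 values per sample

    # One channel of CD-quality audio: 44,100 samples/second x 16 bits/sample.
    print(44_100 * 16)   # 705600 bits per second, per channel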

The last characteristic is the number of channels: stereo, with separate left and right channels, or monaural, with one signal containing all sound. Assuming sampling frequency and bit depth are the same for both channels, a stereo signal is twice as large as a monaural (or mono) signal.

CD-quality digital audio is 44.1kHz, 16-bit stereo, with a data rate of roughly 176 kilobytes per second (about 1,411 kilobits per second). That's far smaller than the uncompressed data rate of video, but CD-quality audio is still huge, especially when you compare it to the bandwidths used to deliver video to certain target viewers.
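
A quick back-of-the-envelope calculation (here in Python, purely for illustration) shows where those numbers come from:

    # Uncompressed data rate = sampling rate x bit depth x number of channels.
    sampling_rate_hz = 44_100
    bit_depth = 16
    channels = 2    # stereo

    bits_per_second = sampling_rate_hz * bit_depth * channels
    print(bits_per_second)              # 1411200 bits/sec (~1,411Kbps)
    print(bits_per_second / 8 / 1000)   # 176.4 -- roughly 176 kilobytes/sec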

For a visual explanation of these digital audio concepts, go to: www.animemusicvideos.org/guides/avtech/audio1.html

Subsampling for Quality

Let's put this background to work. Note that many encoding schemes let you change these parameters during encoding. For example, Figure 7.6 shows QuickTime Pro's audio encoding screen, with the QDesign Music 2 codec selected. To set the audio bitrate, click Options on the screen on the left, which opens the screen on the right, showing a data rate of 24 kilobits per second (Kbps).

Figure 7.6. Whenever possible, adjust sampling rate, bitrate, and number of channels before choosing a data rate (on the right).


In addition to setting the bitrate, I can also adjust the sampling rate and choose between mono and stereo. In essence, if I reduce the sampling rate from 44.1kHz to 11kHz, I reduce the number of audio samples by a factor of four. Therefore, QDesign should be able to allocate four times as much compressed data to each sample.
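
The sketch below makes that trade-off explicit. It simply divides a fixed audio bitrate by the sampling rate; this ignores how QDesign (or any codec) actually distributes bits, but it shows why fewer samples per second leaves more compressed data for each one.

    # Illustrative only: compressed bits available per sample at a fixed bitrate.
    audio_bitrate_bps = 24_000   # the 24Kbps setting shown in Figure 7.6

    for sampling_rate_hz in (44_100, 22_050, 11_025):
        bits_per_sample = audio_bitrate_bps / sampling_rate_hz
        print(f"{sampling_rate_hz}Hz: {bits_per_sample:.2f} bits per sample")
    # Dropping from 44.1kHz to 11kHz leaves roughly four times as many
    # bits for each sample.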

If you're encoding high-quality music, you may find it worthwhile to experiment with different sampling rates and numbers of channels to see if overall quality improves. At a high level, this is almost identical to your decision to reduce the video resolution from 720x480 and frame rate from 30 to 15. In both cases, you're decreasing the amount of information that the compressed data rate must describe, which should boost overall quality.

Remember, though, converting from 16-bit to 8-bit audio is generally a bad idea when it comes to music, as the trained ear can pick up the vastly reduced subtlety of the sound. For this reason, many programs, like QuickTime Pro with the QDesign plug-in, simply won't permit you to change these parameters.

As we'll see below, most compression technologies have different codecs optimized for voice and music. Obviously, you should choose the codec best suited to your source material.

Allocating Between Audio and Video

At low bitrates, audio quality is generally more important than video quality, since most viewers expect poor-quality video at these delivery rates. So at low bitrates such as 32Kbps (to stream to a 56Kbps modem), allocating as much as 8Kbps to audio, which is 25 percent of total bandwidth, is a good decision.

As data rates increase, the allocation is largely content-driven. With most audio codecs, the quality of compressed speech doesn't really improve when you boost the data rate beyond 32 to 64Kbps. So even if I were producing a 1Mbps Windows Media stream, I would probably limit my audio data rate to 32Kbps, or around 3 percent of the total.

However, music is more difficult to compress than speech because the range of sounds is greater; when music is a significant component of the video, as in a concert, music quality becomes paramount. For both these reasons, when encoding a music video at 1Mbps, I might allocate 192Kbps of bandwidth to audio, perhaps even more.
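
The following sketch summarizes the splits discussed above; the figures mirror the examples in the text and are starting points, not rules.

    # Rough audio/video bitrate splits from the examples above (in Kbps).
    scenarios = {
        "56Kbps modem, voice": (32, 8),
        "1Mbps stream, voice": (1000, 32),
        "1Mbps music video":   (1000, 192),
    }

    for name, (total_kbps, audio_kbps) in scenarios.items():
        video_kbps = total_kbps - audio_kbps
        share = 100 * audio_kbps / total_kbps
        print(f"{name}: audio {audio_kbps}Kbps ({share:.0f}%), video {video_kbps}Kbps")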

Ultimately, the optimal data rate allocation between audio and video will always be project- and content-specific. Now that you know the parameters and tradeoffs involved, I'm sure that the best allocation for your project will quickly become apparent.

With this as background, let's visit our various output options.


