Chapter 9. Multimedia
This chapter is about multimedia on Linux.
Multimedia
is a rather vague and much abused
Multimedia has historically been one of the more challenging areas of Linux, both for developers and users, and one that did not receive as much attention from Linux distributions as it should have, perhaps because Linux was initially embraced by so many as a server operating system. It was only recently that Linux has been seriously
The good news is that, unlike a few
We start off this chapter with a quick overview of multimedia concepts such as digital audio and video, and a description of the different types of multimedia hardware devices. Those familiar with the technology may wish to skip over this section. If you don't really care about how it all works or get lost in the first
We then discuss some of the issues related to multimedia support at the kernel level, which is a prerequisite for using the hardware. We then move on to applications, first those offered by some of the popular desktop environments, and then a sampling of more specialized applications broken down into different categories. If you want to develop your own applications, we
Keep in mind that multimedia is an area where Linux development moves
There are minor differences among Linux distributions. Although most of the information in this chapter is generic and
|
9.1. Multimedia ConceptsThis section very quickly covers some concepts relevant to digital audio , video , and sound cards . Understanding these basics will help you follow the rest of the material in this chapter. 9.1.1. Digital Sampling
Sound is produced when waves of varying pressure travel though a medium, usually air. It is
Modern computers are
digital
, meaning they
A hardware device called an
The process of converting analog signals to digital form consists of taking measurements, or
samples
, of the values at regular periods of time, and storing these samples as
The
sample size
is the range of values of numbers that is used to represent the digital samples, usually
The sample rate is the speed at which the analog signals are periodically measured over time. It is properly expressed as samples per second, although sometimes informally but less accurately expressed in Hertz (Hz) . A lower sample rate will lose more information about the original analog signal, a higher sample rate will more accurately represent it. The sampling theorem states that to accurately represent an analog signal it must be sampled at at least twice the rate of the highest frequency present in the original signal.
The range of human hearing is from approximately 20 to 20,000 Hz under ideal situations. To accurately represent sound for human listening, then, a sample rate of twice 20,000 Hz should be adequate. CD player technology uses 44,100 samples per second, which is in agreement with this simple calculation. Human speech has little information above 4000 Hz. Digital telephone systems typically use a sample rate of 8000 samples per second, which is
Other issues that arise when storing sound in digital format are the number of channels and the encoding format. To support stereo sound, two channels are required. Some audio systems use four or more channels. Often sounds need to be combined or changed in volume. This is the process of mixing , and can be done in analog form (e.g., a volume control) or in digital form by the computer. Conceptually, two digital samples can be mixed together simply by adding them, and volume can be changed by multiplying by a constant value.
Up to now we've discussed storing audio as digital samples. Other techniques are also commonly used.
FM synthesis
is an older technique that produces sound using hardware that manipulates different waveforms such as sine and triangle waves. The hardware to do this is quite simple and was popular with the first generation of computer sound cards for generating music. Many sound cards still support FM synthesis for backward compatibility. Some
MIDI
stands for Musical Instrument Digital Interface. It is a standard protocol for allowing electronic musical instruments to communicate. Typical MIDI devices are music keyboards, synthesizers, and drum machines. MIDI works with events representing such things as a key on a music keyboard being pressed, rather than storing actual sound samples. MIDI events can be stored in a MIDI file, providing a way to represent a song in a very compact format. MIDI is most popular with professional
9.1.2. File Formats
We've talked about sound samples, which typically come from a sound card and are stored in a computer's memory. To store them permanently, they need to be represented as files. There are various
The most straightforward method is to store the samples directly as bytes in a file, often referred to as
raw sound files
. The samples
A problem with raw sound files is that the file itself does not indicate the sample size, sampling rate, or data representation. To interpret the file correctly, this information needs to be known.
Storing the sound samples in the file has the advantage of making the sound data easy to work with, but has the
We've focused
Compression of image files uses various techniques. Standard compression schemes such as zip and gzip can be used. Run-length encoding, which describes sequences of pixels having the same color, is a good choice for images that contain areas having the same color, such as line drawings. As with audio, there are lossy compression schemes, such as JPEG compression, which is optimized for photographic-type images and designed to provide high compression with little noticeable effect on the image.
To extend still images to video, one can imagine simply stringing together many images arranged in time sequence. Clearly, this quickly generates extremely large files. Compression schemes such as that used for DVD movies use sophisticated algorithms that store some complete images, as well as a mathematical representation of the differences between adjacent
We mentioned Compact Disc Digital Audio, which stores about 600 MB of sound samples on a disc. The ubiquitous CD-ROM uses the same physical format to store computer data, using a filesystem known as the ISO 9660 format. This is a simple directory structure, similar to MS-DOS. The Rock Ridge extensions to ISO 9660 were developed to allow storing of longer filenames and more attributes, making the format suitable for Unix-compatible systems. Microsoft's Joliet filesystem
CD-ROMs are produced in a manufacturing facility using expensive equipment. CD-R (compact disc recordable) allows recording of data on a disc using an inexpensive drive, which can be read on a standard CD-ROM drive. CD-RW (compact disc rewritable) extends this with a disc that can be blanked (erased) many times and rewritten with new data.
DVD-ROM drives allow storing of about 4.7 GB of data on the same physical format used for DVD movies. With suitable decoding hardware or software, a PC with a DVD-ROM drive can also view DVD movies. Recently, dual-layer DVD-ROM
Like CD-R, DVD has been extended for recording, but with two different formats, known as DVD-R and DVD+R. At the time of writing, both formats were popular, and some combo drives supported both formats. Similarly, a rewritable DVD has been developed or rather, two different formats, known as DVD-RW and DVD+RW. Finally, a format known as DVD-RAM offers a random-access read/write media similar to hard disk storage.
DVD-ROM drives can be formatted with a (large) ISO 9660 filesystem,
For applications where multimedia is to be sent live via the Internet, often broadcast to multiple users, sending entire files is not suitable.
Streaming media
refers to systems where audio, or other media, is sent and
9.1.3. Multimedia Hardware
Now that we've discussed digital audio concepts, let's look at the hardware used. Sound cards follow a similar history as other peripheral cards for PCs. The first-generation cards used the ISA bus, and most aimed to be compatible with the Sound Blaster series from Creative Labs. The introduction of the ISA Plug and Play (PNP) standard allowed many sound cards to adopt this format and simplify configuration by eliminating the need for hardware
Some sound cards now support higher-end features such as surround sound using as many as six sound channels, and digital inputs and outputs that can connect to home theater systems. This is beyond the scope of what can be covered in this chapter. In the realm of video, there is obviously the ubiquitous video card, many of which offer 3D acceleration, large amounts of on-board memory, and sometimes more than one video output (multi-head). TV tuner cards can decode television signals and output them to a video monitor, often via a video card so the image can be mixed with the computer video. Video capture cards can record video in real time for storage on hard disk and later playback. Although the mouse and keyboard are the most common input devices, Linux also supports a number of touch screens, digitizing tablets, and joysticks.
Many scanners are supported on Linux. Older models
Digital cameras have had some support under Linux, improving over time as more drivers are developed and
|