Chapter 2: Video Basics

Before discussing the fundamentals of video compression, let us look at how video signals are generated. Their characteristics will help us understand how they can be exploited for bandwidth reduction without introducing perceptible distortions. In this regard, we will first look at image formation and colour video. Interlaced/progressive video is explained, and its impact on the signal bandwidth and display units is discussed. Representation of video in digital form and the need for bit rate reduction will be addressed. Finally, the image formats to be coded for various applications and their quality assessment will be analysed.

2.1 Analogue video

2.1.1 Scanning

Video signals are normally generated at the output of a camera by scanning a two-dimensional moving scene and converting it into a one-dimensional electric signal. A moving scene is a collection of individual pictures or images, and each scan of a picture generates a frame. Scanning starts at the top left corner of the picture and ends at the bottom right.

The choice of number of scanned lines per picture is a trade-off between the bandwidth, flicker and resolution. Increasing the number of scanning lines per picture increases the spatial resolution. Similarly, increasing the number of pictures per second will increase the temporal resolution. There is a lower limit to the number of pictures per second, below which flicker becomes perceptible. Hence, flicker-free, high resolution video requires larger bandwidth.
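The relationship above can be made concrete with a small sketch. The helper name is ours, and the figures used are those of the common 625-line, 25 frames/s system: the product of lines per frame (spatial resolution) and frames per second (temporal resolution) fixes the number of lines scanned per second, and hence the bandwidth.

```python
def line_rate(lines_per_frame: int, frames_per_sec: float) -> float:
    """Lines scanned per second: spatial resolution x temporal resolution.

    Raising either factor raises the line rate, and with it the
    bandwidth the signal occupies.
    """
    return lines_per_frame * frames_per_sec

print(line_rate(625, 25))   # 625-line, 25 Hz system: 15625 lines/s
print(line_rate(1250, 25))  # doubling the lines doubles the line rate
```

This is why flicker-free, high-resolution video is expensive: both factors multiply directly into the line rate.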

If a frame is formed by a single scan of the picture, the scanning is called progressive. Alternatively, two pictures may be scanned at two different times, with their lines interleaved so that two consecutive lines of a frame belong to alternate scans. In this case each scanned picture is called a field, and the scanning is called interlaced. Figure 2.1 shows progressive and interlaced frames.

Figure 2.1: Progressive and interlaced frames

The concept behind interlaced scanning is to trade vertical (spatial) resolution for temporal resolution. For instance, slow-moving objects can be perceived with higher vertical resolution, since there are not many changes between successive fields. At the same time, the human eye does not perceive flicker, since the picture is refreshed at the field rate. For fast-moving objects, although the vertical resolution is reduced, the human eye is less sensitive to fine spatial detail at high rates of motion. Therefore, the bandwidth of television signals is halved without significant loss of perceived picture resolution. Usually, in interlaced video, the number of lines per field is half the number of lines per frame, and the number of fields per second is twice the number of frames per second. Hence, the number of lines per second remains fixed.
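The splitting of a frame into two fields can be illustrated with a short sketch (the function name and the list-of-lines representation are ours, not from the text): alternate lines of the frame go to alternate fields, so each field carries half the vertical resolution while the field rate is twice the frame rate.

```python
def split_fields(frame):
    """Split an interlaced frame (a list of scan lines) into its two fields.

    Lines 0, 2, 4, ... form one field; lines 1, 3, 5, ... form the other.
    """
    top_field = frame[0::2]     # first field: even-numbered lines
    bottom_field = frame[1::2]  # second field: odd-numbered lines
    return top_field, bottom_field

frame = [f"line {n}" for n in range(6)]
top, bottom = split_fields(frame)
# each field holds half the frame's lines, so the total number of
# lines per second is unchanged when fields are shown at twice the
# frame rate
```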

It should be noted that if high spatio-temporal resolution is required, for example in high definition television (HDTV), then the progressive mode should be used. Although interlaced video is a good trade-off for television, it may not be suitable for computer displays, owing to the closeness of the screen to the viewer and the type of material normally displayed, such as text and graphics. If television pictures were used with computers, the result would be annoying interline flicker, line crawling etc. To avoid these problems, computers use noninterlaced (also called progressive or sequential) displays with refresh rates higher than 50/60 frames per second, typically 72 frames/s.

2.1.2 Colour components

During the scanning, a camera generates three primary colour signals called red, green and blue, the so-called RGB signals. These signals may be further processed for transmission and storage. For compatibility with black and white video and because of the fact that the three colour signals are highly correlated, a new set of signals at different colour space is generated. These are called colour systems, and the three standards are NTSC, PAL and SECAM [1]. We will concentrate on the PAL system as an example, although the basic principles involved in the other systems are very similar.

The colour space in PAL is represented by YUV, where Y represents the luminance and U and V represent the two colour components. The basic YUV colour space can be generated from gamma-corrected RGB (referred to in equations as R'G'B') components as follows:

(2.1)
Y = 0.299 R' + 0.587 G' + 0.114 B'
U = 0.492 (B' - Y) = -0.147 R' - 0.289 G' + 0.436 B'
V = 0.877 (R' - Y) = 0.615 R' - 0.515 G' - 0.100 B'

In the PAL system the luminance bandwidth is normally 5 MHz, although in PAL system-I, used in the UK, it is 5.5 MHz. The bandwidth of each colour component is only 1.5 MHz, because the human eye is less sensitive to colour resolution. For this reason, in most image processing applications, such as motion estimation, decisions on the types of block to be coded or not coded (see Chapter 6) are made on the luminance component only. The decision is then extended to the corresponding colour components. Note that for higher quality video, such as high definition television (HDTV), the luminance and chrominance components may have the same bandwidth, but nevertheless all the decisions are made on the luminance components. In some applications the chrominance bandwidth may be reduced much further than the ratio of 1.5/5 MHz.
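The colour conversion of equation 2.1 can be sketched directly in code. This is a minimal illustration (the function name is ours); it uses the standard PAL coefficients, with U and V formed as scaled colour-difference signals.

```python
def rgb_to_yuv(r, g, b):
    """PAL YUV from gamma-corrected R'G'B' (coefficients of equation 2.1)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luminance
    u = 0.492 * (b - y)                    # scaled colour difference B' - Y
    v = 0.877 * (r - y)                    # scaled colour difference R' - Y
    return y, u, v

# A grey-scale input has no colour content: U and V vanish, which is
# what makes the representation compatible with black and white video.
y, u, v = rgb_to_yuv(1.0, 1.0, 1.0)
```

Because U and V are differences from the luminance, a monochrome receiver can simply display Y and ignore the colour components.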



Standard Codecs: Image Compression to Advanced Video Coding (IET Telecommunications Series)
ISBN: 0852967101
Year: 2005
Pages: 148
Authors: M. Ghanbari
