7.2 Preprocessing

The source material for video coding may exist in a variety of forms, such as computer files or live video in CCIR-601 format [4]. Since MPEG-1 is intended for coding video at VCR resolutions, a CCIR-601 source is normally converted to SIF format. These source pictures must be processed prior to coding. Chapter 2 explained how CCIR-601 video is converted to SIF format, and also discussed the conversion methodology for film sources. Computer source files that are not already in SIF format have to be converted too. MPEG-1 requires one further preprocessing step: the input pictures must be reordered for coding. This is called picture reordering.

7.2.1 Picture reordering

Because of the conflicting requirements of random access and highly efficient coding, MPEG suggested that not all pictures of a video sequence should be coded in the same way. They identified four types of picture in a video sequence. The first type is the I-picture, which is coded without reference to any other picture. I-pictures provide access points to the coded sequence for decoding. They are intraframe coded as in JPEG, with moderate compression. The second type is the P-picture, which is predictively coded with reference to the previous I or P-coded picture. P-pictures are themselves used as a reference (anchor) for coding future pictures. Coding of these pictures is very similar to H.261. The third type is the B-picture, or bidirectionally coded picture, which may use a past picture, a future picture or a combination of both in its prediction. This increases the motion compensation efficiency, since occluded parts of moving objects may be better compensated for from the future frame. B-pictures are never used for predictions. This property, which is unique to MPEG, has two important implications:

Since B-pictures are not used for predictions of future frames, they can be coded with the highest possible compression without any side effects. In contrast, if a picture that is coarsely coded is used as a prediction, its coding distortions are transferred to the next frame. That frame then needs more bits to clear the previous distortions, and the overall bit rate may increase rather than decrease.
In applications such as transmission of video over packet networks, B-pictures may be discarded (e.g. due to buffer overflow) without affecting the next decoded pictures [5]. Note that if any part of an H.261 picture, or of an I or P-picture in MPEG, is corrupted during transmission, the effect will propagate until the affected area is refreshed [6].
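
The difference between losing a B-picture and losing an anchor picture can be illustrated with a small sketch. It assumes a simplified model that is not part of the standard text: picture types are given as a string in display order, every P-picture is predicted from the nearest preceding anchor, and every B-picture uses both the nearest preceding and the nearest following anchor. The function name affected_pictures and the example pattern are hypothetical.

```python
def affected_pictures(types, lost):
    """Return the display-order indices whose decoding is affected
    when the picture at index `lost` is discarded or corrupted.

    `types` is a string of 'I', 'P' and 'B' in display order
    (a simplified model, assumed for illustration only).
    """
    anchors = [i for i, t in enumerate(types) if t in "IP"]
    damaged = {lost}

    # A P-picture inherits damage from the anchor it is predicted from.
    # An I-picture is coded on its own, so it stops the propagation.
    prev = None
    for i in anchors:
        if types[i] == "P" and prev in damaged:
            damaged.add(i)
        prev = i

    # A B-picture is damaged if either of its neighbouring anchors is
    # damaged, but it never passes damage on to another picture.
    for i, t in enumerate(types):
        if t != "B":
            continue
        before = [a for a in anchors if a < i]
        after = [a for a in anchors if a > i]
        refs = before[-1:] + after[:1]
        if any(r in damaged for r in refs):
            damaged.add(i)

    return sorted(damaged)


if __name__ == "__main__":
    pattern = "IBBPBBPBBIBBP"
    print(affected_pictures(pattern, 4))  # a lost B-picture
    print(affected_pictures(pattern, 3))  # a lost P-picture
```

Under this model, losing a B-picture affects only that picture, whereas losing a P-picture affects every picture that depends on it, directly or indirectly, until the next I-picture refreshes the prediction chain.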

Figure 7.3 illustrates the relationship between these three types of picture. Since B-pictures use I and P-pictures as predictions, including pictures that follow them in display order, they have to be coded after those reference pictures. This requires reordering the incoming pictures, which is carried out at the preprocessor.

Figure 7.3: An example of MPEG-1 GOP
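
The reordering carried out at the preprocessor can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any particular encoder: the picture types are supplied as a string in display order, each run of B-pictures is held back until the anchor that follows it has been emitted, and trailing B-pictures with no following anchor are simply appended at the end. The function name coding_order and the example pattern are hypothetical.

```python
def coding_order(types):
    """Convert a display-order string of picture types ('I', 'P', 'B')
    into a list of (display_index, type) pairs in coding order.

    Each B-picture needs the anchor that follows it in display order,
    so B-pictures are held back until that anchor has been emitted.
    """
    order = []
    pending_b = []
    for i, t in enumerate(types):
        if t == "B":
            pending_b.append((i, t))
        else:
            # Emit the anchor first, then the B-pictures that waited for it.
            order.append((i, t))
            order.extend(pending_b)
            pending_b = []
    # Assumption for this sketch: trailing B-pictures with no future
    # anchor are emitted last rather than being retyped or held over.
    order.extend(pending_b)
    return order


if __name__ == "__main__":
    for index, picture_type in coding_order("IBBPBBP"):
        print(picture_type, index)
```

For the display-order pattern I B B P B B P (pictures 0 to 6), the sketch yields the coding order 0, 3, 1, 2, 6, 4, 5: each anchor is coded before the B-pictures that lie between it and the previous anchor.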

The fourth picture type is the D-picture. These pictures are intraframe coded, with only the DC coefficients retained. Hence the picture quality is poor, and they are normally used for applications such as fast forward. D-pictures are not part of the GOP; hence they are not present in a sequence containing any other picture type.
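
To see why a picture built from DC coefficients alone is of poor quality, recall that the DC coefficient of a block DCT is proportional to the mean of the block. Retaining only that coefficient therefore reconstructs each block as a flat area at its mean value. The sketch below imitates this effect directly in the pixel domain, assuming 8 x 8 blocks, picture dimensions that are multiples of the block size, and no quantisation; the function name dc_only is hypothetical and no actual DCT or bit stream coding is performed.

```python
def dc_only(picture, block=8):
    """Approximate a DC-only picture by replacing every block with
    its mean value.

    `picture` is a list of rows of pixel values whose dimensions are
    assumed to be multiples of `block` (an assumption of this sketch).
    """
    rows, cols = len(picture), len(picture[0])
    out = [[0] * cols for _ in range(rows)]
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            pixels = [picture[r][c]
                      for r in range(r0, r0 + block)
                      for c in range(c0, c0 + block)]
            mean = round(sum(pixels) / len(pixels))
            # Every pixel of the block takes the same value, so all
            # detail inside the block is lost.
            for r in range(r0, r0 + block):
                for c in range(c0, c0 + block):
                    out[r][c] = mean
    return out


if __name__ == "__main__":
    # Left block: alternating rows of 10 and 20; right block: flat 100.
    picture = [[10 if r % 2 == 0 else 20] * 8 + [100] * 8 for r in range(8)]
    flat = dc_only(picture)
    print(flat[0][0], flat[1][0], flat[0][8])
```

The striped left-hand block collapses to a single value, which shows how all detail finer than the block size disappears.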