The Stream Format Chunk

The stream header is followed by another chunk, strf, which defines the format of the data in the stream. Because that data is format-dependent, it can have any of a number of legal formats. The two most common formats are BITMAPINFO, a descriptor for video streams, and WAVEFORMATEX, a descriptor for audio streams. Here's the structure of BITMAPINFOHEADER, which comprises the first fields in the BITMAPINFO structure:

typedef struct tagBITMAPINFOHEADER {
    DWORD biSize;
    LONG  biWidth;
    LONG  biHeight;
    WORD  biPlanes;
    WORD  biBitCount;
    DWORD biCompression;
    DWORD biSizeImage;
    LONG  biXPelsPerMeter;
    LONG  biYPelsPerMeter;
    DWORD biClrUsed;
    DWORD biClrImportant;
} BITMAPINFOHEADER;

The size of the structure is defined in biSize. The biWidth field specifies the width of the bitmap. For RGB and YUV bitmaps, this figure is given in pixels, except where the YUV depth is not an even power of two, in which case it's given in bytes. The biHeight field specifies the height of the image and the image direction. If the image is uncompressed RGB and biHeight is positive, a bottom-up image is specified; if biHeight is negative, a top-down image is specified. YUV images are always top down, regardless of the sign of biHeight.
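As a minimal sketch of how an application might interpret biHeight, the helpers below assume the usual DIB convention in which a negative biHeight marks a top-down uncompressed RGB image. The function names is_top_down and image_rows are hypothetical and are not part of any Windows API.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if rows are stored top-down, 0 if bottom-up.
   Assumption: for uncompressed RGB, a negative biHeight means top-down;
   YUV images are treated as top-down regardless of sign. */
static int is_top_down(int32_t biHeight, int is_uncompressed_rgb)
{
    if (!is_uncompressed_rgb)
        return 1;
    return biHeight < 0;
}

/* Hypothetical helper: number of rows in the image, ignoring the sign
   that encodes direction. Unsigned arithmetic avoids overflow on INT32_MIN. */
static uint32_t image_rows(int32_t biHeight)
{
    if (biHeight < 0)
        return (uint32_t)0 - (uint32_t)biHeight;
    return (uint32_t)biHeight;
}
```

A caller would typically use image_rows to size its buffers and is_top_down to decide whether to walk the rows forward or backward.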

The number of bits per pixel is specified in biBitCount, and if the bitmap is compressed, the FOURCC field biCompression specifies the compression format. Otherwise, the legal values are BI_RGB for RGB bitmaps and BI_BITFIELDS for RGB bitmaps with color masks. The BI_BITFIELDS flag is valid for 16-bit-depth and 32-bit-depth RGB images and when set means that the R, G, and B (red, green, and blue) values for each pixel can be found using three bitmasks. You can get the R, G, and B portions of a pixel by applying the appropriate mask to the value of the pixel. For 16-bit RGB, BI_BITFIELDS allows you to distinguish between 565 RGB and 555 RGB. BI_RGB is valid for all uncompressed RGB bit depths. For 16-bit color depths, BI_RGB always means 555. For other bit depths, there's no ambiguity. For 24-bit and 32-bit color, BI_RGB always means 8 bits per color. Anything less than or equal to 8 bits per pixel is always palettized.
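To illustrate the masking step, here is a small sketch that pulls one color component out of a pixel value. The mask constants are the conventional 565 and 555 layouts; the names mask_shift and extract_channel are hypothetical, and in a real file the masks would be read from the three values stored after the header rather than hard-coded.

```c
#include <stdint.h>

/* Conventional 16-bit layouts (assumed here, normally read from the file). */
#define MASK565_R 0xF800u
#define MASK565_G 0x07E0u
#define MASK565_B 0x001Fu
#define MASK555_R 0x7C00u
#define MASK555_G 0x03E0u
#define MASK555_B 0x001Fu

/* Hypothetical helper: position of the lowest set bit in a mask. */
static unsigned mask_shift(uint32_t mask)
{
    unsigned shift = 0;
    if (mask == 0)
        return 0;
    while ((mask & 1u) == 0) {
        mask >>= 1;
        shift++;
    }
    return shift;
}

/* Hypothetical helper: apply the mask, then shift the component down to bit 0.
   The result keeps its native width (5 or 6 bits for 16-bit RGB). */
static uint32_t extract_channel(uint32_t pixel, uint32_t mask)
{
    return (pixel & mask) >> mask_shift(mask);
}
```

The same 16-bit value decodes differently under the two layouts, which is why the masks matter: green occupies 6 bits in 565 but only 5 bits in 555.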

The biSizeImage field defines the image size in bytes, while biXPelsPerMeter and biYPelsPerMeter specify the horizontal and vertical resolution, respectively, in pixels per meter, of the target device. Therefore, an image can be scaled up or down to meet the output capabilities of a display device. The biClrUsed field defines the number of palette entries used by the video sequence. For palettized video formats (with 8 or fewer bits of color per pixel), if this value is zero, it means the maximum size color table (two raised to the power of the number of bits per pixel, so an 8-bit color table would have 256 entries). For non-palettized formats (16 or more bits of color per pixel), the color table is optional, so zero really means zero. In these days of supercomputer-class graphics cards and cinema-sized displays, we don't see much palettized video. The biClrImportant field defines how many of the colors in the color table are important, a feature used by some palette-processing software.
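The rule for biClrUsed can be sketched as a small function. This is an illustration of the description above rather than code from any SDK, and the name color_table_entries is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: number of color table entries implied by the header.
   - A nonzero biClrUsed is taken at face value.
   - Zero with 8 or fewer bits per pixel means the maximum table, 2^biBitCount.
   - Zero with a deeper format means no color table at all. */
static uint32_t color_table_entries(uint16_t biBitCount, uint32_t biClrUsed)
{
    if (biClrUsed != 0)
        return biClrUsed;
    if (biBitCount >= 1 && biBitCount <= 8)
        return (uint32_t)1 << biBitCount;
    return 0;
}
```

For example, an 8-bit format with biClrUsed set to zero yields 256 entries, while a 24-bit format with biClrUsed set to zero yields none.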

Following the BITMAPINFOHEADER might be a color table and/or the three color masks for BI_BITFIELDS formats. It's up to the application to calculate the size for these additional elements because they're not included in the biSize field. The BITMAPINFO structure is a BITMAPINFOHEADER plus a field for the first entry in the color table, which is simply a convenience for easily accessing the color table from a structure definition.
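One way to work out the total size of the video format data is sketched below. It assumes the plain 40-byte BITMAPINFOHEADER, three 4-byte masks when biCompression is BI_BITFIELDS, and 4 bytes per color table entry (the size of an RGBQUAD). The function name video_format_size and the SKETCH_ constants are hypothetical; the numeric values 0 and 3 match the standard BI_RGB and BI_BITFIELDS definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BI_RGB        0u  /* same value as BI_RGB */
#define SKETCH_BI_BITFIELDS  3u  /* same value as BI_BITFIELDS */
#define SKETCH_RGBQUAD_BYTES 4u  /* blue, green, red, reserved */

/* Hypothetical helper: bytes occupied by the header plus whatever follows it.
   color_entries is the number of palette entries the caller has already
   derived from biBitCount and biClrUsed. */
static size_t video_format_size(uint32_t biSize,
                                uint32_t biCompression,
                                uint32_t color_entries)
{
    size_t total = biSize;

    if (biCompression == SKETCH_BI_BITFIELDS)
        total += 3 * sizeof(uint32_t);   /* R, G, and B masks */

    total += (size_t)color_entries * SKETCH_RGBQUAD_BYTES;
    return total;
}
```

An 8-bit palettized stream with a full 256-entry table would therefore need 40 + 256 * 4 = 1064 bytes of format data, and a 16-bit BI_BITFIELDS stream with no color table would need 52.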

The WAVEFORMATEX structure, used to describe audio data, looks like this:

typedef struct {
    WORD  wFormatTag;
    WORD  nChannels;
    DWORD nSamplesPerSec;
    DWORD nAvgBytesPerSec;
    WORD  nBlockAlign;
    WORD  wBitsPerSample;
    WORD  cbSize;
} WAVEFORMATEX;

The wFormatTag field is a value that describes the audio format type. For PCM data, this value will be WAVE_FORMAT_PCM. (Other values are registered with Microsoft.) The number of audio channels (2 for stereo) is defined in nChannels, and the number of samples per second is defined in nSamplesPerSec. The average data-transfer rate required to read and play the sample in real time is given in nAvgBytesPerSec, and the byte alignment of the sample data is specified in nBlockAlign. The number of bits per sample (generally 8 or 16, but could be 12, 24, or even 32) is given in wBitsPerSample. If the audio stream is not a WAVE_FORMAT_PCM type, additional format data might follow the WAVEFORMATEX structure. In this case, the cbSize field will define how many bytes of additional format data follow. If there's no additional data or if the audio stream is WAVE_FORMAT_PCM, this field is set to zero.
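For plain PCM audio, the dependent fields follow from the channel count, sample rate, and sample size. The sketch below fills in a stand-in structure using the usual PCM relationships (block alignment is channels times bytes per sample, and the average byte rate is the sample rate times the block alignment). The type WaveFormatSketch and the function make_pcm_format are hypothetical mirrors of WAVEFORMATEX built from fixed-width types so the example compiles without the Windows headers; it assumes wBitsPerSample is a multiple of 8.

```c
#include <stdint.h>

#define SKETCH_WAVE_FORMAT_PCM 1u  /* same value as WAVE_FORMAT_PCM */

/* Hypothetical stand-in for WAVEFORMATEX (field names kept the same). */
typedef struct {
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
    uint16_t wBitsPerSample;
    uint16_t cbSize;
} WaveFormatSketch;

/* Hypothetical helper: describe uncompressed PCM audio.
   Assumes bits_per_sample is a multiple of 8 (8, 16, 24, 32). */
static WaveFormatSketch make_pcm_format(uint16_t channels,
                                        uint32_t samples_per_sec,
                                        uint16_t bits_per_sample)
{
    WaveFormatSketch fmt;

    fmt.wFormatTag      = SKETCH_WAVE_FORMAT_PCM;
    fmt.nChannels       = channels;
    fmt.nSamplesPerSec  = samples_per_sec;
    fmt.wBitsPerSample  = bits_per_sample;
    /* One block holds one sample for every channel. */
    fmt.nBlockAlign     = (uint16_t)(channels * (bits_per_sample / 8));
    /* Bytes consumed per second of playback. */
    fmt.nAvgBytesPerSec = samples_per_sec * fmt.nBlockAlign;
    /* PCM carries no extra format data after the structure. */
    fmt.cbSize          = 0;
    return fmt;
}
```

For CD-quality stereo (2 channels, 44,100 samples per second, 16 bits per sample), this gives a block alignment of 4 bytes and an average rate of 176,400 bytes per second.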

An AVI file will have at least one stream format chunk and will most likely have two, one for audio and one for video. These streams, in the order they are presented in the stream header list, are used to reference entries in another list, which actually contains the stream data. Figure 14-3 shows how the lists and chunks should look inside the AVI header.

Figure 14-3. The AVI header list, which is composed of stream headers and stream formats



Programming Microsoft DirectShow for Digital Video and Television
Programming Microsoft DirectShow for Digital Video and Television (Pro-Developer)
ISBN: 0735618216
Year: 2002
Pages: 108
Authors: Mark D. Pesce
