Calculating Source and Target Rectangles | Fundamentals of Audio and Video Programming for Games (Pro-Developer)

When the compositor mixes a video frame, it must calculate two rectangles: the source rectangle, which describes the portion of the video frame that should be drawn onto the render target, and the destination rectangle, which describes the area of the render target that should receive the video. These rectangles are derived from several pieces of information:

The format of the source video
The format of the render target
The stream position that was set by the application through the IVMRMixerControl9::SetOutputRect method.

A video format is described with the AM_MEDIA_TYPE structure. This structure provides a generic way to describe any media format, whether it's audio, video, or a format that nobody has invented yet. Here is the declaration of the AM_MEDIA_TYPE structure.

    typedef struct _MediaType {
        GUID      majortype;
        GUID      subtype;
        BOOL      bFixedSizeSamples;
        BOOL      bTemporalCompression;
        ULONG     lSampleSize;
        GUID      formattype;
        IUnknown  *pUnk;
        ULONG     cbFormat;
        BYTE      *pbFormat;
    } AM_MEDIA_TYPE;

The majortype and subtype GUIDs are used to identify formats without needing to include all of the specifics. For uncompressed video, majortype is always MEDIATYPE_Video. The subtype roughly corresponds to the D3DFORMAT enumeration in Direct3D. For example, 24-bit RGB is MEDIASUBTYPE_RGB24. You can find a list of subtype GUIDs in the DirectShow SDK documentation.

The details of the format, such as the width and height, are contained in another structure called the format block. This structure is allocated separately from the AM_MEDIA_TYPE structure, and is stored in the pbFormat field as a byte array. The layout of the format block is defined by a third GUID value, the formattype field. For uncompressed video, the format block is one of two structures, either VIDEOINFOHEADER or VIDEOINFOHEADER2. These are identified by the GUID values FORMAT_VideoInfo and FORMAT_VideoInfo2, respectively. The VIDEOINFOHEADER2 structure contains all of the information in the VIDEOINFOHEADER structure, as well as some additional fields. However, because VIDEOINFOHEADER2 is newer than VIDEOINFOHEADER, not all DirectShow filters support it.
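The format-block lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the Guid, Rect, VideoInfo, VideoInfo2, and MediaType types are trimmed stand-ins for the SDK structures so that the sample compiles without the Windows headers, the two GUID constants are placeholder values rather than the real FORMAT_VideoInfo and FORMAT_VideoInfo2, and DescribeVideo and VideoDescription are hypothetical names. Falling back to the pixel dimensions as the aspect ratio for VIDEOINFOHEADER is also an assumption (square pixels), since that structure carries no aspect ratio fields.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Stand-ins for Windows SDK types, trimmed to the members discussed above.
struct Guid { std::uint32_t a; std::uint16_t b, c; std::uint8_t d[8]; };

inline bool SameGuid(const Guid& x, const Guid& y) {
    return std::memcmp(&x, &y, sizeof(Guid)) == 0;
}

// Placeholder values for this sketch, NOT the real FORMAT_VideoInfo GUIDs.
const Guid kFormatVideoInfo  = {1, 0, 0, {0}};
const Guid kFormatVideoInfo2 = {2, 0, 0, {0}};

struct Rect { long left, top, right, bottom; };
struct BitmapHeader { long biWidth, biHeight; };

struct VideoInfo {             // stand-in for VIDEOINFOHEADER
    Rect rcSource, rcTarget;
    BitmapHeader bmiHeader;
};
struct VideoInfo2 {            // stand-in for VIDEOINFOHEADER2
    Rect rcSource, rcTarget;
    unsigned long dwPictAspectRatioX, dwPictAspectRatioY;
    unsigned long dwControlFlags;
    BitmapHeader bmiHeader;
};
struct MediaType {             // stand-in for AM_MEDIA_TYPE
    Guid formattype;
    unsigned long cbFormat;
    const std::uint8_t* pbFormat;
};

// Hypothetical result type: the fields needed for the rectangle math.
struct VideoDescription {
    Rect rcSource, rcTarget;
    long width, height;
    unsigned long aspectX, aspectY;
    unsigned long controlFlags;
};

// Returns false if the format block is missing, too small, or of an
// unrecognized type. The block is copied out with memcpy so the byte
// array's alignment does not matter.
bool DescribeVideo(const MediaType& mt, VideoDescription* out) {
    assert(out != nullptr);
    if (mt.pbFormat == nullptr) return false;

    if (SameGuid(mt.formattype, kFormatVideoInfo2) &&
        mt.cbFormat >= sizeof(VideoInfo2)) {
        VideoInfo2 v;
        std::memcpy(&v, mt.pbFormat, sizeof(v));
        out->rcSource = v.rcSource;
        out->rcTarget = v.rcTarget;
        out->width = v.bmiHeader.biWidth;
        out->height = std::labs(v.bmiHeader.biHeight);
        out->aspectX = v.dwPictAspectRatioX;
        out->aspectY = v.dwPictAspectRatioY;
        out->controlFlags = v.dwControlFlags;
        return true;
    }
    if (SameGuid(mt.formattype, kFormatVideoInfo) &&
        mt.cbFormat >= sizeof(VideoInfo)) {
        VideoInfo v;
        std::memcpy(&v, mt.pbFormat, sizeof(v));
        out->rcSource = v.rcSource;
        out->rcTarget = v.rcTarget;
        out->width = v.bmiHeader.biWidth;
        out->height = std::labs(v.bmiHeader.biHeight);
        // Assumption: no aspect ratio fields here, so treat pixels as square.
        out->aspectX = static_cast<unsigned long>(out->width);
        out->aspectY = static_cast<unsigned long>(out->height);
        out->controlFlags = 0;
        return true;
    }
    return false;
}
```

Checking cbFormat before reading the block guards against a media type whose formattype GUID and buffer size disagree. In real code the same checks would be applied to the SDK structures directly.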

The following list describes the structure members that are relevant for calculating the source and target rectangles:

bmiHeader is a BITMAPINFOHEADER structure. It contains the width and height of the video surface.
rcTarget is the target rectangle. In the video stream, it defines the size of the image. This may be smaller than the size defined in bmiHeader. For example, the surface may be wider than the image, due to stride requirements. In the render target, rcTarget defines the area of the surface that should receive the video. If rcTarget is an empty rectangle, the entire surface should be used.
rcSource is the source rectangle. It defines the portion of the source video that should be placed into the render target. Decoders can set this value in order to crop a portion of the image. If this field is empty, the entire image should be used.
dwPictAspectRatioX and dwPictAspectRatioY define the picture aspect ratio. The aspect ratio may not match the physical image size. For example, standard consumer-format DV video is 720 x 480 pixels (1.5:1) but should be displayed at a 4:3 aspect ratio (approximately 1.33:1). These fields are defined only in the VIDEOINFOHEADER2 structure.
The dwControlFlags field may contain one of the following flags:
- AMCONTROL_PAD_TO_4x3: Pad the image to a 4 x 3 area.
- AMCONTROL_PAD_TO_16x9: Pad the image to a 16 x 9 area.

These flags are only defined in the VIDEOINFOHEADER2 structure, and are sometimes used with DVD video content. If either of these flags is present, the target rectangle should be squeezed along one axis so that the video image retains the correct aspect ratio when the render target is displayed inside the specified area.
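Two of the rules above are easy to get wrong in practice: an empty rcSource or rcTarget means "use the whole surface," and the padding flags name a display area rather than a rectangle. The following sketch shows one way to handle both. The Rect type is a stand-in for the SDK's RECT, ResolveRect and PadAspect are hypothetical helper names, and the two flag constants are illustrative values only; real code should use the AMCONTROL_PAD_TO_4x3 and AMCONTROL_PAD_TO_16x9 constants from the SDK headers.

```cpp
#include <cassert>

struct Rect { long left, top, right, bottom; };   // stand-in for RECT

// Illustrative values for this sketch; use the SDK constants in real code.
const unsigned long kPadTo4x3  = 0x4;
const unsigned long kPadTo16x9 = 0x8;

// A rectangle with no area counts as empty.
inline bool IsEmptyRect(const Rect& rc) {
    return rc.right <= rc.left || rc.bottom <= rc.top;
}

// An empty rcSource or rcTarget stands for the entire surface described
// by bmiHeader, so substitute the full surface in that case.
Rect ResolveRect(const Rect& rc, long surfaceWidth, long surfaceHeight) {
    assert(surfaceWidth > 0 && surfaceHeight > 0);
    if (IsEmptyRect(rc)) {
        return Rect{0, 0, surfaceWidth, surfaceHeight};
    }
    return rc;
}

// Reports the display area named by the padding flags, if either is set.
// Returns false when no padding flag is present.
bool PadAspect(unsigned long controlFlags, unsigned long* x, unsigned long* y) {
    assert(x != nullptr && y != nullptr);
    if (controlFlags & kPadTo4x3)  { *x = 4;  *y = 3; return true; }
    if (controlFlags & kPadTo16x9) { *x = 16; *y = 9; return true; }
    return false;
}
```

For example, a 768-pixel-wide surface holding a 720 x 480 image would pass its rcTarget through unchanged, while a media type with an empty rcTarget would resolve to the full 768 x 480 surface.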

To summarize, here are the transformations needed to calculate the source and target rectangles:

Get rcTarget from the video media type (T1).
Get rcTarget from the render target media type (T2).
Find the video aspect ratio from the video media type (AR).
Adjust T1 for the aspect ratio by scaling horizontally (T1').
Map T1' into T2 by finding the largest rectangle that fits in T2 and has the same aspect ratio as T1' (T3).
If the AMCONTROL_PAD_TO_4x3 or AMCONTROL_PAD_TO_16x9 flag is present, the target rectangle is scaled in the opposite direction so that the image retains the correct aspect ratio (T3') when the image is stretched to the final dimensions.
Map T3' into the normalized rectangle.
Set the texture coordinates based on the source rectangle.
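The geometric steps in this list can be sketched in a few small functions. This is a simplified illustration, not the compositor's actual code. RectF and all of the function names are hypothetical; centering the fitted rectangle inside T2 is an assumption, because the steps only require the largest rectangle that fits; and the mapping into the normalized rectangle assumes that rectangle is expressed as fractions (0.0 to 1.0) of the render target. The padding-flag adjustment is left out because the steps above do not give an exact formula for it.

```cpp
#include <algorithm>
#include <cassert>

struct RectF { float left, top, right, bottom; };

inline float Width(const RectF& r)  { return r.right - r.left; }
inline float Height(const RectF& r) { return r.bottom - r.top; }

// T1 -> T1': keep the height and rescale the width so that
// width / height equals arX / arY.
RectF AdjustForAspect(const RectF& t1, float arX, float arY) {
    assert(arX > 0 && arY > 0);
    float w = Height(t1) * arX / arY;
    return RectF{t1.left, t1.top, t1.left + w, t1.bottom};
}

// T1' into T2 -> T3: the largest rectangle with the aspect ratio of T1'
// that fits inside T2. Centering is an assumption of this sketch.
RectF FitInside(const RectF& t1p, const RectF& t2) {
    assert(Width(t1p) > 0 && Height(t1p) > 0);
    float scale = std::min(Width(t2) / Width(t1p), Height(t2) / Height(t1p));
    float w = Width(t1p) * scale;
    float h = Height(t1p) * scale;
    float x = t2.left + (Width(t2) - w) / 2;
    float y = t2.top + (Height(t2) - h) / 2;
    return RectF{x, y, x + w, y + h};
}

// Express T3 as fractions of the render target, then place it inside
// the normalized rectangle n chosen by the application.
RectF MapToNormalized(const RectF& t3, float targetW, float targetH,
                      const RectF& n) {
    assert(targetW > 0 && targetH > 0);
    return RectF{n.left + t3.left   / targetW * Width(n),
                 n.top  + t3.top    / targetH * Height(n),
                 n.left + t3.right  / targetW * Width(n),
                 n.top  + t3.bottom / targetH * Height(n)};
}

// Texture coordinates: the source rectangle as fractions of the
// video surface size.
RectF TextureCoords(const RectF& src, float surfaceW, float surfaceH) {
    assert(surfaceW > 0 && surfaceH > 0);
    return RectF{src.left / surfaceW, src.top / surfaceH,
                 src.right / surfaceW, src.bottom / surfaceH};
}
```

Taking the DV example from earlier, a 720 x 480 image with a 4:3 picture aspect ratio becomes 640 x 480 after AdjustForAspect. Fitting that into a 1280 x 720 render target yields a 960 x 720 rectangle, which under the centering assumption leaves 160 pixels free on each side.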

Figure 11.3 shows the intermediate results for the target rectangle calculations.

Figure 11.3: Calculating the destination rectangle.