Exploring the Histogram Application

The Histogram application transforms an incoming stream of video samples. The word histogram means that the application counts the frequency of every pixel value in each frame (or image) of the video stream. Once that information has been gathered, it's put to use in one of the following three ways, depending on which section of code is enabled (by commenting or uncommenting code):

  • A histogram stretch changes the values of each pixel in each frame so that the pixels utilize an entire range of possible values. A histogram stretch has the visible effect of changing the brightness of the image, and it's also known as automatic gain control. (The upper and lower range values can be set within the code.)

  • A histogram equalization brings out details in an image that might otherwise be invisible because they're too dim or too bright.

  • A logarithmic stretch applies a logarithm operator to every pixel in the frame, which boosts the contrast in the darker regions of the image.
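Each of these operations can be expressed as a lookup table over 8-bit luma values. As a rough sketch of the third operation (this function and its constants are illustrative, not taken from the Histogram sample itself), a logarithmic stretch table might be built like this:

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Build a 256-entry lookup table for a logarithmic stretch:
// out = 255 * log(1 + in) / log(1 + 255). Dark input values are spread
// apart, which boosts contrast in the shadow regions of the image.
// (Illustrative sketch only; the book's sample uses its own classes.)
std::vector<uint8_t> BuildLogStretchLUT()
{
    std::vector<uint8_t> lut(256);
    const double scale = 255.0 / std::log(256.0);
    for (int i = 0; i < 256; ++i)
        lut[i] = static_cast<uint8_t>(scale * std::log(1.0 + i) + 0.5);
    return lut;
}
```

Applying the transform to a frame then reduces to one table lookup per pixel, which is why all three histogram operations can share the same conversion loop.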

One of these three mathematical transformations is applied to every frame of the video. After it passes through the Sample Grabber, the video is DV (digital video) encoded and rendered to an output file. The resulting AVI file can be viewed with Windows Media Player. (The algorithms implemented in these histogram operations were found in Handbook of Image and Video Processing, Al Bovik, ed., Academic Press, 2000.)

Creating a Filter Graph Containing a Sample Grabber

The Histogram is a console-based application with hard-coded file names; it produces an output AVI file at C:\Histogram_Output.AVI. (You'll need to have Sample.AVI in the root of drive C.) The application's simple filter graph is built, run, and torn down in one function, RunFile:

HRESULT RunFile(LPTSTR pszSrcFile, LPTSTR pszDestFile)
{
    HRESULT hr;
    USES_CONVERSION;    // For TCHAR -> WCHAR conversion macros

    CComPtr<IGraphBuilder>         pGraph;  // Filter Graph Manager
    CComPtr<IBaseFilter>           pGrabF;  // Sample grabber
    CComPtr<IBaseFilter>           pMux;    // AVI Mux
    CComPtr<IBaseFilter>           pSrc;    // Source filter
    CComPtr<IBaseFilter>           pDVEnc;  // DV Encoder
    CComPtr<ICaptureGraphBuilder2> pBuild;

    // Create the Filter Graph Manager.
    hr = pGraph.CoCreateInstance(CLSID_FilterGraph);
    if (FAILED(hr))
    {
        printf("Could not create the Filter Graph Manager (hr = 0x%X.)\n", hr);
        return hr;
    }

    // Create the Capture Graph Builder.
    hr = pBuild.CoCreateInstance(CLSID_CaptureGraphBuilder2);
    if (FAILED(hr))
    {
        printf("Could not create the Capture Graph Builder (hr = 0x%X.)\n", hr);
        return hr;
    }
    pBuild->SetFiltergraph(pGraph);

    // Build the file-writing portion of the graph.
    hr = pBuild->SetOutputFileName(&MEDIASUBTYPE_Avi, T2W(pszDestFile),
                                   &pMux, NULL);
    if (FAILED(hr))
    {
        printf("Could not hook up the AVI Mux / File Writer (hr = 0x%X.)\n", hr);
        return hr;
    }

    // Add the source filter for the input file.
    hr = pGraph->AddSourceFilter(T2W(pszSrcFile), L"Source", &pSrc);
    if (FAILED(hr))
    {
        printf("Could not add the source filter (hr = 0x%X.)\n", hr);
        return hr;
    }

    // Create some filters and add them to the graph.
    // DV Video Encoder
    hr = AddFilterByCLSID(pGraph, CLSID_DVVideoEnc, L"DV Encoder", &pDVEnc);
    if (FAILED(hr))
    {
        printf("Could not add the DV video encoder filter (hr = 0x%X.)\n", hr);
        return hr;
    }

Although the Histogram source code uses the Active Template Library (ATL) features of Visual C++, this code should be very familiar by now. A Filter Graph Manager and a Capture Graph Builder are both instantiated and appropriately initialized. The Capture Graph Builder method SetOutputFileName establishes the file name for the output AVI file, and AddSourceFilter is used to add the source file. Next a DV Encoder filter is added (the output stream is DV-encoded). At this point, the Sample Grabber filter is instantiated, added to the filter graph, and initialized before it's connected to any other filters.

    // Sample Grabber.
    hr = AddFilterByCLSID(pGraph, CLSID_SampleGrabber, L"Grabber", &pGrabF);
    if (FAILED(hr))
    {
        printf("Could not add the sample grabber filter (hr = 0x%X.)\n", hr);
        return hr;
    }
    CComQIPtr<ISampleGrabber> pGrabber(pGrabF);
    if (!pGrabber)
    {
        return E_NOINTERFACE;
    }

    // Configure the Sample Grabber.
    AM_MEDIA_TYPE mt;
    ZeroMemory(&mt, sizeof(AM_MEDIA_TYPE));
    mt.majortype = MEDIATYPE_Video;
    mt.subtype = MEDIASUBTYPE_UYVY;
    mt.formattype = FORMAT_VideoInfo;

    // Note: I don't expect the next few methods to fail...
    hr = pGrabber->SetMediaType(&mt);        // Set the media type
    _ASSERTE(SUCCEEDED(hr));
    hr = pGrabber->SetOneShot(FALSE);        // Disable "one-shot" mode
    _ASSERTE(SUCCEEDED(hr));
    hr = pGrabber->SetBufferSamples(FALSE);  // Disable sample buffering
    _ASSERTE(SUCCEEDED(hr));
    hr = pGrabber->SetCallback(&grabberCallback, 0);  // Set our callback
    // '0' means 'use the SampleCB callback'
    _ASSERTE(SUCCEEDED(hr));

Using the ATL smart pointers, a QueryInterface call is made to the Sample Grabber filter object requesting its ISampleGrabber interface. Once that interface has been acquired, the Sample Grabber can be configured. First the Sample Grabber has to be given a media type, so an AM_MEDIA_TYPE structure is instantiated and filled with a major type of MEDIATYPE_Video and a subtype of MEDIASUBTYPE_UYVY. (As covered in Chapter 10, this subtype means that the media samples in the stream are in YUV format, specifically UYVY.) A call to the ISampleGrabber method SetMediaType establishes the media type for the Sample Grabber. The Sample Grabber will accept any media type until a call is made to SetMediaType, at which point it will accept only matching media types. It's vital that the media type be set before the Sample Grabber is connected to other filters so that they can negotiate to meet the needs of the Sample Grabber.

A bit more initialization needs to be done to ensure the proper operation of the Sample Grabber. The Sample Grabber has the option of stopping the execution of the filter graph after it receives its first media sample, which comes in handy when grabbing still frames from a video file but would be a detriment to the operation of the histogram. To prevent this one-shot behavior from happening (we want continuous operation of the filter graph), we pass FALSE to the ISampleGrabber method SetOneShot. The Sample Grabber can also buffer samples internally as they pass through the filter; buffering is disabled by passing FALSE to SetBufferSamples. Finally the callback function within the Histogram application is initialized with a call to SetCallback. The Sample Grabber will call this function every time it receives a media sample. (The format of the object passed to SetCallback is discussed in the next section.) With the Sample Grabber filter fully initialized, construction of the filter graph can continue.

    // Build the graph.
    // First connect the source to the DV Encoder,
    // through the Sample Grabber.
    // This should connect the video stream.
    hr = pBuild->RenderStream(0, 0, pSrc, pGrabF, pDVEnc);
    if (SUCCEEDED(hr))
    {
        // Next, connect the DV Encoder to the AVI Mux.
        hr = pBuild->RenderStream(0, 0, pDVEnc, 0, pMux);
        if (SUCCEEDED(hr))
        {
            // Maybe we have an audio stream.
            // If so, connect it to the AVI Mux.
            // But don't fail if we don't...
            HRESULT hrTmp = pBuild->RenderStream(0, 0, pSrc, 0, pMux);
            SaveGraphFile(pGraph, L"C:\\Grabber.grf");
        }
    }
    if (FAILED(hr))
    {
        printf("Error building the graph (hr = 0x%X.)\n", hr);
        return hr;
    }

    // Find out the exact video format.
    hr = pGrabber->GetConnectedMediaType(&mt);
    if (FAILED(hr))
    {
        printf("Could not get the video format. (hr = 0x%X.)\n", hr);
        return hr;
    }
    VIDEOINFOHEADER *pVih;
    if ((mt.subtype == MEDIASUBTYPE_UYVY) &&
        (mt.formattype == FORMAT_VideoInfo))
    {
        pVih = reinterpret_cast<VIDEOINFOHEADER*>(mt.pbFormat);
    }
    else
    {
        // This is not the format we expected!
        CoTaskMemFree(mt.pbFormat);
        return VFW_E_INVALIDMEDIATYPE;
    }

Once the graph has been connected together with a few calls to RenderStream, the Sample Grabber is queried for its media type using the ISampleGrabber method GetConnectedMediaType. If the returned subtype isn't MEDIASUBTYPE_UYVY, the function returns an error because the Histogram application can't process video frames in any format except UYVY. The function also checks that the format type is FORMAT_VideoInfo, which defines the format structure as a VIDEOINFOHEADER type. This check is performed because the Histogram application wasn't written to handle VIDEOINFOHEADER2 format types. (The Sample Grabber filter doesn't accept VIDEOINFOHEADER2 formats either.) The VIDEOINFOHEADER2 structure is similar to VIDEOINFOHEADER, but it adds support for interlaced fields and image aspect ratios.

    g_stretch.SetFormat(*pVih);
    CoTaskMemFree(mt.pbFormat);

    // Turn off the graph clock.
    CComQIPtr<IMediaFilter> pMF(pGraph);
    pMF->SetSyncSource(NULL);

    // Run the graph to completion.
    CComQIPtr<IMediaControl> pControl(pGraph);
    CComQIPtr<IMediaEvent> pEvent(pGraph);
    long evCode = 0;
    printf("Processing the video file... ");
    pControl->Run();
    pEvent->WaitForCompletion(INFINITE, &evCode);
    pControl->Stop();
    printf("Done.\n");
    return hr;
}

A global object, g_stretch, implements the methods that perform the mathematical transformation on the media sample. (One of three different instances of g_stretch can be created, depending on which line of code is uncommented. These three instances correspond to the three types of transforms implemented in the Histogram application.) A call to the g_stretch SetFormat method allows it to initialize with the appropriate height, width, and bit depth information needed for successful operation.

In a final step, a call is made to the IMediaFilter interface method SetSyncSource. This call sets the reference clock for the filter graph, which is used to preserve synchronization among all graph components and to define the presentation time for the filter graph. When passed NULL, as it is here, SetSyncSource turns off the filter graph's reference clock, allowing the filter graph components to run at their own rates. This is desirable because you want the filter graph to process each sample as quickly as possible. If a reference clock were active, some filter graph components might choose to slow the graph down to maintain synchronization across the filter graph. With the reference clock turned off, this won't happen. (There aren't any filters in this particular filter graph that keep track of the reference clock, so we're being a bit overzealous here, for the sake of the example.)

Now that everything has been initialized, the IMediaControl and IMediaEvent interfaces are acquired from the Filter Graph Manager. Finally the filter graph is run to completion.

Defining the Sample Grabber Callback Object

The Sample Grabber was initialized with a callback object, which acts as a hook between the Sample Grabber and the Histogram application. Although instantiated by the Histogram application, the Sample Grabber doesn't know anything about the application; specifically, the Sample Grabber doesn't know how to pass media samples to the application. The interface between the application and the Sample Grabber is managed with the callback object GrabberCB, which is defined and implemented as follows:

class GrabberCB : public ISampleGrabberCB
{
private:
    BITMAPINFOHEADER m_bmi;  // Holds the bitmap format
    bool m_fFirstSample;     // True if the next sample is the first one

public:
    GrabberCB();
    ~GrabberCB();

    // IUnknown methods
    STDMETHODIMP_(ULONG) AddRef() { return 1; }
    STDMETHODIMP_(ULONG) Release() { return 2; }
    STDMETHOD(QueryInterface)(REFIID iid, void** ppv);

    // ISampleGrabberCB methods
    STDMETHOD(SampleCB)(double SampleTime, IMediaSample *pSample);
    STDMETHODIMP BufferCB(double, BYTE *, long) { return E_NOTIMPL; }
};

GrabberCB::GrabberCB() : m_fFirstSample(true) { }
GrabberCB::~GrabberCB() { }

// Support querying for the ISampleGrabberCB interface.
HRESULT GrabberCB::QueryInterface(REFIID iid, void** ppv)
{
    if (!ppv) { return E_POINTER; }
    if (iid == IID_IUnknown)
    {
        *ppv = static_cast<IUnknown*>(this);
    }
    else if (iid == IID_ISampleGrabberCB)
    {
        *ppv = static_cast<ISampleGrabberCB*>(this);
    }
    else
    {
        return E_NOINTERFACE;
    }
    AddRef();  // We don't actually ref count,
               // but in case we change the implementation later.
    return S_OK;
}

// SampleCB: This is where we process each sample.
HRESULT GrabberCB::SampleCB(double SampleTime, IMediaSample *pSample)
{
    HRESULT hr;

    // Get the pointer to the buffer.
    BYTE *pBuffer;
    hr = pSample->GetPointer(&pBuffer);
    if (FAILED(hr))
    {
        OutputDebugString(TEXT("SampleCB: GetPointer FAILED\n"));
        return hr;
    }

    // Tell the image processing class about it.
    g_stretch.SetImage(pBuffer);

    // Scan the image on the first sample.
    // Re-scan if there is a discontinuity.
    // (This will produce horrible results if there are big scene
    // changes in the video that are not associated with
    // discontinuities. Might be safer to re-scan each image,
    // at a higher perf cost.)
    if (m_fFirstSample)
    {
        hr = g_stretch.ScanImage();
        m_fFirstSample = false;
    }
    else if (S_OK == pSample->IsDiscontinuity())
    {
        hr = g_stretch.ScanImage();
    }

    // Convert the image.
    hr = g_stretch.ConvertImage();
    return S_OK;
}

The class declaration for GrabberCB inherits the methods of the ISampleGrabberCB interface. This interface defines two methods that must be implemented by the callback object, GrabberCB::BufferCB and GrabberCB::SampleCB. However, the Sample Grabber filter can use only one of these callback methods at a time, so you can simply return E_NOTIMPL from the one you aren't going to use. The GrabberCB::BufferCB method would receive a sample buffer from the Sample Grabber, but because sample buffering was disabled when the Sample Grabber was instantiated, this method simply returns the error code E_NOTIMPL. The GrabberCB::QueryInterface implementation ensures that the appropriate interface is returned when handling QueryInterface calls made on the GrabberCB object.

The GrabberCB::AddRef and GrabberCB::Release methods implement fake reference counting. Normally, COM requires that an object keep a reference count and delete itself when the reference count goes to zero. In this case, we don't keep a reference count of this object, which means that the callback object must have a wider scope than the DirectShow filter graph so that the object doesn't accidentally get deleted while the Sample Grabber is still using it. That is why the grabberCallback object is implemented as a global variable. The variable stays in scope for the duration of the Histogram application's execution, and the object is automatically deleted when the application terminates execution.

Now we come to the heart of the GrabberCB object, the implementation of GrabberCB::SampleCB. This method is called every time the Sample Grabber filter has a sample to present to the Histogram application. The method communicates with g_stretch, the object that manages the histogram transformation, passing it a pointer to the media sample's memory buffer, which contains the raw sample data. On the first pass through the method (that is, when the first media sample is ready for processing by the Histogram application), the m_fFirstSample flag is true, so the method calls ScanImage, which allows the histogram object to learn the particulars of the video frames that it will be processing, and then clears the flag. (ScanImage is also called whenever a discontinuity is detected, which happens when the filter graph pauses and restarts.) Finally the method calls the ConvertImage method of the histogram object, which performs the in-place transformation of the data in the buffer.

Although this might seem like a minimal implementation, this is all the implementation that's required to add the Sample Grabber filter to your DirectShow applications. The interesting part is what your application does with the media samples delivered to it by the Sample Grabber.

Processing Sample Grabber Media Samples

There are only a few major restrictions on the kind of processing that can take place on a sample presented by the Sample Grabber. The transformations must be performed in-place, within the same buffer presented by the Sample Grabber to the application. So, media format translations are not permissible, nor is any other operation that affects the media type or buffer size of the sample. If the Sample Grabber is upstream of a video renderer, performance of the filter graph will suffer because it will probably require hardware reads of the video graphics card, which can slow processing dramatically, as explained in Chapter 10.
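To make the in-place constraint concrete, here is a minimal sketch (not taken from the Histogram sample) of a transform that a SampleCB-style callback could legally perform on a UYVY buffer: it modifies luma values in the buffer it was handed, without changing the buffer's size or format.

```cpp
#include <cstddef>
#include <cstdint>

// In-place point operation on a UYVY buffer. In UYVY, bytes are ordered
// U0 Y0 V0 Y1, so every byte at an odd offset is a luma (Y) sample.
// This sketch inverts the luma of each pixel while leaving chroma and
// the buffer layout untouched -- the kind of transform permitted inside
// a Sample Grabber callback. (Illustrative; assumes a tightly packed
// buffer with no stride padding.)
void InvertLumaInPlace(uint8_t* pBuffer, size_t cbBuffer)
{
    for (size_t i = 1; i < cbBuffer; i += 2)  // Y bytes at odd offsets
        pBuffer[i] = 255 - pBuffer[i];
}
```

A real callback would obtain pBuffer from IMediaSample::GetPointer and the length from IMediaSample::GetActualDataLength, then write its results back into that same buffer.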

In the Histogram application, the histogram image processing object CImagePointOp uses three methods for data translation: SetFormat, ScanImage, and ConvertImage. Because there are three possible implementations of ScanImage and ConvertImage, depending on which global histogram object is uncommented in the source code, we present the default case, CEqualize (which brings out detail across the image), as if it were the only implementation.

HRESULT CImageOp::SetFormat(const VIDEOINFOHEADER& vih)
{
    // Check if UYVY.
    if (vih.bmiHeader.biCompression != 'YVYU')
    {
        return E_INVALIDARG;
    }
    int BytesPerPixel = vih.bmiHeader.biBitCount / 8;

    // If the target rectangle (rcTarget) is empty,
    // the image width and the stride are both biWidth.
    // Otherwise, the image width is given by rcTarget
    // and the stride is given by biWidth.
    if (IsRectEmpty(&vih.rcTarget))
    {
        m_dwWidth = vih.bmiHeader.biWidth;
        m_lStride = m_dwWidth;
    }
    else
    {
        m_dwWidth = vih.rcTarget.right;
        m_lStride = vih.bmiHeader.biWidth;
    }
    // Stride for UYVY is rounded to the nearest DWORD.
    m_lStride = (m_lStride * BytesPerPixel + 3) & ~3;

    // biHeight can be < 0, but YUV is always top-down.
    m_dwHeight = abs(vih.bmiHeader.biHeight);
    m_iBitDepth = vih.bmiHeader.biBitCount;
    return S_OK;
}

HRESULT CEqualize::_ScanImage()
{
    DWORD iRow, iPixel;               // looping variables
    BYTE *pRow = m_pImg;              // pointer to the first row in the buffer
    DWORD histogram[LUMA_RANGE];      // basic histogram
    double nrm_histogram[LUMA_RANGE]; // normalized histogram
    ZeroMemory(histogram, sizeof(histogram));

    // Create a histogram.
    // For each pixel, find the luma and increment the count for that
    // pixel. Luma values are translated
    // from the nominal 16-235 range to a 0-219 array.
    for (iRow = 0; iRow < m_dwHeight; iRow++)
    {
        UYVY_PIXEL *pPixel = reinterpret_cast<UYVY_PIXEL*>(pRow);
        for (iPixel = 0; iPixel < m_dwWidth; iPixel++, pPixel++)
        {
            BYTE luma = pPixel->y;
            luma = static_cast<BYTE>(clipYUV(luma)) - MIN_LUMA;
            histogram[luma]++;
        }
        pRow += m_lStride;
    }

    // Build the cumulative histogram.
    for (int i = 1; i < LUMA_RANGE; i++)
    {
        // The i'th entry is the sum of the previous entries.
        histogram[i] = histogram[i-1] + histogram[i];
    }

    // Normalize the histogram.
    DWORD area = NumPixels();
    for (int i = 0; i < LUMA_RANGE; i++)
    {
        nrm_histogram[i] =
            static_cast<double>(LUMA_RANGE * histogram[i]) / area;
    }

    // Create the LUT.
    for (int i = 0; i < LUMA_RANGE; i++)
    {
        // Clip the result to the nominal luma range.
        long rounded = static_cast<long>(nrm_histogram[i] + 0.5);
        long clipped = clip(rounded, 0, LUMA_RANGE - 1);
        m_LookUpTable[i] = static_cast<BYTE>(clipped) + MIN_LUMA;
    }
    return S_OK;
}

HRESULT CImagePointOp::_ConvertImage()
{
    DWORD iRow, iPixel;   // looping variables
    BYTE *pRow = m_pImg;  // pointer to the first row in the buffer

    for (iRow = 0; iRow < m_dwHeight; iRow++)
    {
        UYVY_PIXEL *pPixel = reinterpret_cast<UYVY_PIXEL*>(pRow);
        for (iPixel = 0; iPixel < m_dwWidth; iPixel++, pPixel++)
        {
            // Translate luma back to the 0-219 range.
            BYTE luma = (BYTE)clipYUV(pPixel->y) - MIN_LUMA;
            // Convert from the LUT.
            // The result is already in the correct 16-235 range.
            pPixel->y = m_LookUpTable[luma];
        }
        pRow += m_lStride;
    }
    return S_OK;
}

The first method, CImageOp::SetFormat, is called from RunFile when all the connections have been made across the filter graph and the Sample Grabber is fully aware of the media type it will be presenting to the histogram object. (Although the Sample Grabber is initialized with a media type and subtype, the histogram object needs more detailed information to process each frame of video.) From the passed VIDEOINFOHEADER parameter, the method learns the width, height, stride, and bit depth of the image, information that the histogram object will need for processing.

When the Sample Grabber receives its first media sample from an upstream filter, it calls CEqualize::ScanImage from GrabberCB::SampleCB. This method walks through the image, pixel by pixel, building a histogram of luma values. (Luma is covered in the sidebar on YUV formats in Chapter 10.) This histogram is then accumulated and normalized, and the resulting values are placed into a lookup table used by CImageOp::ConvertImage. When CImageOp::ConvertImage is called, the method extracts the old luma value of each pixel, looks it up in the table created by CEqualize::ScanImage, and writes the new luma value back to the pixel.

The differences between the three histogram techniques offered by this application are entirely based in the three implementations of the ScanImage method. Each method generates a unique array of luma values, which thereby changes the output of CImageOp::ConvertImage. You could easily write your own ScanImage implementations, adapting the code in this example program, to produce unique video effects.
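As a sketch of such a custom variant (the function below is hypothetical, not part of the sample; only the MIN_LUMA and LUMA_RANGE constants follow the book's description of the 16-235 luma range and the 0-219 table index range), a gamma-correction effect would only need to fill the lookup table differently:

```cpp
#include <array>
#include <cmath>
#include <cstdint>

// Hypothetical alternative "ScanImage" step: fill the luma lookup table
// with a gamma curve. The table is indexed by luma shifted into the
// 0-219 range and produces output back in the nominal 16-235 range,
// mirroring how the sample's equalization LUT is used by ConvertImage.
std::array<uint8_t, 220> BuildGammaLUT(double gamma)
{
    const int MIN_LUMA = 16;    // bottom of the nominal luma range
    const int LUMA_RANGE = 220; // number of luma levels (16..235)
    std::array<uint8_t, 220> lut{};
    for (int i = 0; i < LUMA_RANGE; ++i)
    {
        double norm = static_cast<double>(i) / (LUMA_RANGE - 1);
        double mapped = std::pow(norm, gamma) * (LUMA_RANGE - 1);
        lut[i] = static_cast<uint8_t>(mapped + 0.5) + MIN_LUMA;
    }
    return lut;
}
```

With gamma below 1.0 the curve brightens midtones; above 1.0 it darkens them. The shared conversion loop would apply this table exactly as it applies the equalization table.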

This concludes the exploration of the Sample Grabber from the application designer's point of view. Now we'll examine the source code of the Grabber Sample, which is an alternate implementation of the Sample Grabber, with its own unique class ID and interface ID. The Grabber Sample source code is included in the DirectX SDK. It has a unique design that illuminates many internal features of DirectShow.



Programming Microsoft DirectShow for Digital Video and Television
ISBN: 0735618216
Year: 2002
Pages: 108
Authors: Mark D. Pesce
