Working with WinCap

Mike Wasson on the DirectShow Documentation Team created a very useful application that presents all the basic principles involved in capture from and recording to a digital camcorder. (It does a lot more than that, but you'll have to wait until Chapter 7 before we cover the rest of its features.) This program, known as WinCap, uses the Windows GUI to present an interface to all the video capture devices on a user's system, and it gives you a wide range of control over these devices. (At this point, you should use Microsoft Visual Studio .NET to build an executable version of the WinCap sample program included on the companion CD.) When you launch WinCap, you should see a window on your display that looks something like Figure 6-1.

In the lower left corner of the WinCap window, WinCap presents a list of all video capture devices detected on your computer. (If you don't see your digital camcorder in this list, check to see that it's powered on. When powered on, it should show up in the list of available devices.) On my computer, I have three video capture devices installed: the webcam (which we covered in Chapter 5), an ATI TV tuner card (which we'll cover in Chapter 7), and a Sony DV Device, which is a generic name for my Sony TRV900 digital camcorder.

To select a video capture device, click its entry in the list of video capture devices and then click Select Device. When I select the Logitech QuickCam Web entry, which corresponds to my webcam, and then click Select Device, the upper portion of the WinCap window displays a live video stream captured from the webcam, as shown in Figure 6-2.

Figure 6-1. On startup, WinCap showing all available video capture devices

Figure 6-2. A live video stream in the WinCap window

We've already covered webcams in some detail, so we'll move along to the digital camcorder. When I select the Sony DV Device entry from the list of video capture devices and then click Select Device, I don't immediately get a video image. Instead, a new window opens on the screen, as shown in Figure 6-3.

Figure 6-3. The new window that opens when the digital camcorder is selected for video capture

This new window, labeled VTR, has a number of familiar buttons, with the industry-standard icons for play, stop, rewind, fast forward, and eject. (There's a sixth icon, which we'll come to shortly.) Why did this window open? Why can't I see a live image from the camcorder? The reason is a function of how camcorders are designed. Camcorders generally have at least two operating modes: camera mode, in which images are recorded to tape, and VTR (video tape recorder) mode, in which images stored on tape can be played back through the camera. At the time I clicked the Select Device button, my camera was in VTR mode, so I got a window with VTR controls on the display. If I were to switch modes from VTR to camera and then select the device, I'd see what we had been expecting to see initially, a live picture from the device, as shown in Figure 6-4.

Figure 6-4. A live picture from the camcorder with the VTR controls disabled

The WinCap window displays a live image from the camcorder. Although the VTR window is open, all its controls are disabled. When a camcorder is in camera mode, it won't respond to commands such as play, record, or fast-forward. This functionality won't matter for our purposes because it's possible to record a live image from a camera in VTR mode; all you need to do is send the appropriate commands to it.

WinCap allows you to capture video and audio streams to an AVI file; you can select the file name for the capture file with the Set Capture File command on the File menu. (If you don't set a name, it will default to C:\CapFile.avi.) To begin capture, click Start Capture; to stop the capture, click Stop Capture (the same button). The AVI file created from the captured video stream can be played in Windows Media Player or any other compatible application, such as Adobe Premiere.

Finally, when the camcorder is in VTR mode, WinCap can write a properly formatted AVI file to the camcorder, making a video recording of the movie in the AVI file. (This AVI file must contain DV-encoded video; if the video is in any other format, it won't be written to the camcorder and the application will fail.) That's the sixth button in the VTR window, which shows a file icon pointing to a tape icon. When you click the icon, you'll be presented with a file selection dialog box that filters out everything except AVI files. Select an appropriate AVI file (such as Sunset.AVI, included on the companion CD), and it will be written out to the camcorder. Now that we've covered some of the basic features of WinCap, let's begin a detailed exploration of how it works.

Modular C++ Programming and WinCap

All the DirectShow programming examples presented thus far in this book have been very simply structured. They barely qualify as C++ programs. Objects are instantiated, and method calls are made on those objects, but that's about as far as we've gotten into the subtleties of C++ programming. WinCap is a formally structured C++ program with separate code modules for each of the user interface elements (such as the main window, the VTR controls, and so on) as well as a separate module for program initialization, device control, and so forth.

This modularity makes the code somewhat more difficult to follow; you'll need to noodle through the various modules (there are 13 of them) as we go through the portions of the code relevant to understanding how to manipulate a digital camcorder in DirectShow. To make this an easier process, program listings in this portion of the book will be modified slightly. The object that contains a method will be presented in the method's implementation, and any time a method is referenced in the text, it will be referenced through its containing object. For example, the method that processes UI messages for the main WinCap window will be identified as CMainDialog::OnReceiveMsg. This notation should help you find the source code definitions and implementation of any method under discussion. Table 6-1 describes the function of the 13 WinCap C++ modules.

Table 6-1. The WinCap C++ Modules

Module Name

Description

AudioCapDlg.cpp

Dialog box handling code for the Select An Audio Capture Device dialog box; populates list with audio capture devices

CDVProp.cpp

Dialog box and device handling code for the VTR dialog box that appears when a DV device is selected

CTunerProp.cpp

Dialog box and device handling code for the TV dialog box that appears when a TV tuner is selected (discussed in Chapter 7)

ConfigDlg.cpp

Handles display of property pages associated with a capture device

Device.cpp

Code to populate the device list with video capture devices and to handle selection of a video capture device

Dialog.cpp

Basic C++ class infrastructure for WinCap dialog boxes

ExtDevice.cpp

Handles the IAMExtDevice, IAMExtTransport, and IAMExtTimecode interfaces for a video capture device

MainDialog.cpp

Event handling for the WinCap main dialog box window

SampleGrabber.cpp

Code for taking a snapshot from the video stream, placing it into the C:\Test.bmp file on disk

WinCap.cpp

Program entry point

graph.cpp

Code to instantiate and handle the filter graph manager and capture graph builder interfaces

stdafx.cpp

Standard C++ file to handle all includes for the precompiled header files

utils.cpp

Various convenience and utility functions

Initializing WinCap

WinCap begins with some basic Microsoft Foundation Classes (MFC) initialization sequences and calls CoInitialize, which sets up the Component Object Model (COM) for use by the application. Shortly after that, the main dialog box window (the WinCap window) is instantiated, which results in a call to CMainDialog::OnInitDialog. This method does a bit more initialization of the user interface elements in the dialog box, such as enabling and disabling buttons, and calls CMainDialog::RefreshDeviceList. This function creates the video capture device list with a call to CDeviceList::Init, which is passed the value CLSID_VideoInputDeviceCategory. CDeviceList::Init should look a bit familiar by now, as it enumerates all the video input devices known to DirectShow, a technique we've already covered.

HRESULT CDeviceList::Init(const GUID& category)
{
    m_MonList.clear();

    // Create the System Device Enumerator.
    HRESULT hr;
    CComPtr<ICreateDevEnum> pDevEnum;
    hr = pDevEnum.CoCreateInstance(CLSID_SystemDeviceEnum, NULL,
        CLSCTX_INPROC_SERVER);
    if (FAILED(hr))
    {
        return hr;
    }

    // Obtain a class enumerator for the video capture category.
    CComPtr<IEnumMoniker> pEnum;
    hr = pDevEnum->CreateClassEnumerator(category, &pEnum, 0);
    if (hr == S_OK)
    {
        // Enumerate the monikers.
        CComPtr<IMoniker> pMoniker;
        while (pEnum->Next(1, &pMoniker, 0) == S_OK)
        {
            m_MonList.push_back(pMoniker);
            pMoniker.Release();
        }
    }
    return hr;
}

The monikers for each enumerated video input device are placed into a list object managed by CDeviceList. When control returns to CMainDialog::RefreshDeviceList, it issues a call to CDeviceList::PopulateList, which queries for the description of the device (supported by some DV camcorders). If a description is found for the device, it is used. Otherwise, the routine retrieves the friendly name for the item. Each entry is added to the list box in the lower left corner of the WinCap window. This technique, in just a few lines of code, is exactly what you'll need in your own DirectShow applications if you want to present the user with a list of possible DirectShow devices or filters: video, audio, compressors, and so on.
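The name lookup itself goes through the moniker's property bag. The following sketch is my own simplified version of the pattern, not WinCap's actual PopulateList code; the function name is invented, but the properties and interfaces are the standard DirectShow ones:

```cpp
#include <dshow.h>
#include <atlbase.h>   // CComPtr, CComVariant

// Sketch: get a display name for a capture device moniker, preferring
// the "Description" property (supplied by some DV camcorders) and
// falling back to the "FriendlyName" that every device provides.
HRESULT GetDeviceName(IMoniker *pMoniker, CComVariant& varName)
{
    CComPtr<IPropertyBag> pPropBag;
    HRESULT hr = pMoniker->BindToStorage(0, 0, IID_IPropertyBag,
        (void**)&pPropBag);
    if (FAILED(hr))
    {
        return hr;
    }
    hr = pPropBag->Read(L"Description", &varName, 0);
    if (FAILED(hr))
    {
        hr = pPropBag->Read(L"FriendlyName", &varName, 0);
    }
    return hr;    // On success, varName holds a BSTR with the name.
}
```

The returned string is what you would add to a list box entry, keeping the moniker itself around so the device can be bound later.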

Using the Active Template Library (ATL)

In the preceding code sample, some variable type declarations probably look unfamiliar, such as CComPtr<ICreateDevEnum>. This is our first encounter with the Microsoft Active Template Library (ATL), a set of template-based C++ classes that handle much of the housekeeping associated with object allocation. ATL itself is vast, and the smart pointer classes it offers (as shown in the preceding code) are just the tip of the iceberg.

What is this CComPtr? In COM, the lifetime of an object is managed by its reference count. The object deletes itself whenever its reference count goes to zero (which means it is no longer being used in the application). Conversely, if the reference count never goes to zero, the object is never deleted, at least not until the application terminates its execution. If you discard a COM interface pointer without releasing it, you have created a ref count leak, which means your application's memory space will become polluted with objects that are unused but still allocated.

One of the hardest things in COM client programming is avoiding ref count leaks. (The only thing harder is tracking them down!) This is especially true when functions have multiple exit points. When you leave a function, you must release every valid interface pointer (but not any that are still NULL). Traditionally, you'd design functions with a single exit point, which is good programming practice (it's what we all learned in school) but hard to carry out in many real-world situations.

Smart pointers supported by ATL are an elegant solution to the reference count problem, and if used correctly, smart pointers can save you a lot of debugging time. CComPtr is an ATL template that manages a COM interface pointer. For example, consider this line of code from the snippet given earlier:

CComPtr<ICreateDevEnum> pDevEnum;

This declaration creates a smart pointer wrapping an ICreateDevEnum interface pointer and initializes its value to NULL. Now the CoCreateInstance call can be made as a method of the COM smart pointer. That's one big difference between user-declared pointers and ATL smart pointers. Additionally, ATL smart pointers release their interface pointers when they go out of scope, which means that if you've defined an ATL smart pointer nested deep in several pairs of braces, it will release its interface automatically the next time you hit a closing brace. If it's defined within the main body of a function, the pointer will be released when the function exits.

In true C++ style, ATL smart pointers overload two basic C++ operators, & and ->. You can take &pDevEnum, and it will work correctly, even though you're not pointing to an ICreateDevEnum object but to an ATL wrapper around it. The pointer operation pDevEnum-> also works as it should, allowing you to make method calls and variable references through an ATL smart pointer as if it were a real pointer.
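The mechanics are easy to see in miniature. The toy classes below are not ATL (CComPtr does considerably more, including an operator& designed for COM out-parameters), but they illustrate the take-a-reference-on-attach, release-on-scope-exit behavior described above; all names here are invented for illustration:

```cpp
// Toy illustration (not ATL itself) of the pattern CComPtr automates:
// take a reference when attaching to an object, release it when the
// wrapper goes out of scope.
struct RefCounted {                     // stand-in for a COM object
    int refs = 1;                       // the creator holds one reference
    void AddRef()  { ++refs; }
    void Release() { --refs; }          // a real COM object deletes itself at 0
};

template <class T>
class SmartPtr {                        // minimal CComPtr-like wrapper
    T* p;
public:
    explicit SmartPtr(T* q) : p(q) { if (p) p->AddRef(); }
    ~SmartPtr() { if (p) p->Release(); } // automatic Release at scope exit
    T* operator->() const { return p; }  // behaves like the raw pointer
};
```

However deeply the wrapper is nested, its destructor runs when the enclosing scope closes, so the reference is returned without any explicit Release call in the client code.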

These ATL pointers are used liberally throughout WinCap; most of the time, you won't have any trouble understanding their operation. If you do have more questions, there's extensive documentation on ATL within Visual Studio.

Selecting a Digital Camcorder

Once the list of video capture devices has been created and populated with entries known to DirectShow, it is possible to select a device and control it. Although any valid video capture device can be selected, in this section, we ll consider only what happens when a digital camcorder is selected.

When a video capture device is selected and the Select Device button is clicked, the CMainDialog::OnSelectDevice method is invoked, which immediately invokes the CMainDialog::InitGraph method. This is where the DirectShow Filter Graph Manager directly interacts with WinCap. The application maintains communication with the DirectShow Filter Graph Manager through a CGraph object. The CMainDialog object has a private variable named m_pGraph that is a CCaptureGraph object, derived from the CGraph class but specifically used to build a capture filter graph. These classes are application defined, created by WinCap, and shouldn't be confused with either the IGraphBuilder or ICaptureGraphBuilder2 interfaces offered by DirectShow. CMainDialog::InitGraph creates a new CCaptureGraph object, which in turn creates a new Filter Graph Manager object.

Next, CMainDialog::OnSelectDevice calls CCaptureGraph::AddCaptureDevice, passing it the moniker for the video capture device. CCaptureGraph::AddCaptureDevice creates the DirectShow filter that manages the capture device and then adds it to the filter graph after clearing the filter graph of any other filters with a call to CCaptureGraph::TearDownGraph. At this point, CMainDialog::OnSelectDevice calls its CMainDialog::StartPreview method. This method builds a fully functional filter graph that will display a preview from the selected video capture device (in this case, a DV camcorder) in the WinCap window. CMainDialog::StartPreview calls CCaptureGraph::RenderPreview. Here's the implementation of that method:

HRESULT CCaptureGraph::RenderPreview(BOOL bRenderCC)
{
    HRESULT hr;
    OutputDebugString(TEXT("RenderPreview()\n"));
    if (!m_pCap) return E_FAIL;

    // First try to render an interleaved stream (DV).
    hr = m_pBuild->RenderStream(&PIN_CATEGORY_PREVIEW,
        &MEDIATYPE_Interleaved, m_pCap, 0, 0);
    if (FAILED(hr))
    {
        // Next try a video stream.
        hr = m_pBuild->RenderStream(&PIN_CATEGORY_PREVIEW,
            &MEDIATYPE_Video, m_pCap, 0, 0);
    }

    // Try to render CC. If it fails, we still count preview as successful.
    if (SUCCEEDED(hr) && bRenderCC)
    {
        // Try VBI pin category, then CC pin category.
        HRESULT hrCC = m_pBuild->RenderStream(&PIN_CATEGORY_VBI, 0,
            m_pCap, 0, 0);
        if (FAILED(hrCC))
        {
            hrCC = m_pBuild->RenderStream(&PIN_CATEGORY_CC, 0,
                m_pCap, 0, 0);
        }
    }

    if (SUCCEEDED(hr))
    {
        InitVideoWindow();       // Set up the video window
        ConfigureTVAudio(TRUE);  // Try to get TV audio going
    }
    return hr;
}

This method is straightforward. It repeatedly calls the RenderStream method on m_pBuild, an ICaptureGraphBuilder2 COM object. In the first RenderStream call, the stream is specified as MEDIATYPE_Interleaved; that is, DV video, where the video and audio signals are woven together into a single stream. Specifying this stream is a bit of a trick to test whether the video capture device is a digital camcorder or something else altogether, because a webcam or a TV tuner will not produce interleaved video but a camcorder will. So, if the first call to RenderStream is successful, we know we're dealing with interleaved video, which indicates a digital camcorder. There's a bit of code to deal with TV tuners and closed captioning, which we'll cover extensively in a later chapter. Finally, a call to CGraph::InitVideoWindow sends the video preview to a specific region in the WinCap window. This is done by the DirectShow Video Renderer filter, which we've used several times before.

Once again, back in CMainDialog::OnSelectDevice, we determine whether we were successful in producing a preview of the video stream. If so, we call CMainDialog::SaveGraphFile. This method, which I added to WinCap, is identical to the function we've used in our previous DirectShow applications, and it's been incorporated here so that you can see the kinds of filter graphs generated by WinCap. Once the filter graph has been written out, we call the Filter Graph Manager's Run method, and the filter graph begins its execution.

Although we've begun execution of the filter graph, we're not quite done yet. We want to do more than just get a preview from the digital camcorder; we want to be able to control it. We need to display a control panel appropriate to the device we're controlling, so we'll need to determine the type of the device and then open the appropriate dialog box. Two public variables of CMainDialog, DVProp and TVProp, manage the interface to the video capture device. DVProp is designed to work with DV camcorders, and TVProp is designed to interface with TV tuners. Which one do we use? How can we tell whether we have a camcorder or a TV tuner? DVProp is an instance of the CDVProp class, which is derived from the CExtDevice class, which furnishes the implementation of the InitDevice method, as follows:

void CExtDevice::InitDevice(IBaseFilter *pFilter)
{
    m_pDevice.Release();
    m_pTransport.Release();
    m_pTimecode.Release();

    // Query for the external transport interfaces.
    CComPtr<IBaseFilter> pF = pFilter;
    pF.QueryInterface(&m_pDevice);
    if (m_pDevice)
    {
        // Don't query for these unless there is an external device.
        pF.QueryInterface(&m_pTransport);
        pF.QueryInterface(&m_pTimecode);

        LONG lDeviceType = 0;
        HRESULT hr = m_pDevice->GetCapability(ED_DEVCAP_DEVICE_TYPE,
            &lDeviceType, 0);
        if (SUCCEEDED(hr) && (lDeviceType == ED_DEVTYPE_VCR) &&
            (m_pTransport != 0))
        {
            m_bVTR = true;
            StartNotificationThread();
        }
    }
}

In this method, a QueryInterface call is made, requesting the IAMExtDevice interface on the capture filter object, if it exists. If it does exist, DirectShow has determined that this device is an external device. That's good, but it's not quite all that's needed. Two more calls to QueryInterface are made. The first call requests the IAMExtTransport interface. If this call succeeds, the device has VTR-like controls and is a digital camcorder in VTR mode. The second call requests the IAMTimecodeReader interface, which, if successful, means that you can read the timecode on the tape inside the camcorder.

Timecode

DV camcorders embed a timestamp in every frame captured by the device. This data is encoded in HH:MM:SS:FF format, where FF is the frame number (0 through 29 for standard 30-fps video). It's very useful information because with timecode you can, for example, know when to stop rewinding a tape if you're searching for a specific position within it. While it hardly qualifies as random access (stopping a tape takes time, and in general you won't be able to stop it at precisely the location you want), timecode is very useful and provides a guide to the medium.
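As a concrete illustration of the format, here's a small helper of my own (not WinCap code) that converts an absolute frame count into HH:MM:SS:FF fields. It assumes simple non-drop counting at exactly 30 fps; real NTSC DV actually runs at 29.97 fps and commonly uses drop-frame timecode, which this sketch ignores:

```cpp
// Convert an absolute frame count into HH:MM:SS:FF timecode fields.
// Assumes non-drop timecode at exactly 30 frames per second.
void FramesToTimecode(long frames, int& hh, int& mm, int& ss, int& ff)
{
    ff = (int)(frames % 30);            // frame number, 0 through 29
    long seconds = frames / 30;
    ss = (int)(seconds % 60);
    mm = (int)((seconds / 60) % 60);
    hh = (int)(seconds / 3600);
}
```

For example, frame 109,859 of a tape works out to timecode 01:01:01:29.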

Through the three QueryInterface calls, we've learned enough to know what kind of device is attached to the computer and what its capabilities are. If the device is external and has VTR-like characteristics, it must be a digital camcorder in VTR mode. If the device does not have VTR-like characteristics, it must be a digital camcorder in camera mode. In either case, the device should be able to provide timecode; if it doesn't, that might mean there's a problem with the device or that there's no tape in it! To be entirely sure that we have a VTR-capable device, a call is made to the GetCapability method of IAMExtDevice, and the result is compared to ED_DEVTYPE_VCR. If they match, the device has VCR capabilities.

CExtDevice::InitDevice creates a separate program thread that tracks all state changes in the external device. The CExtDevice::DoNotifyLoop method is called repeatedly whenever a DV-type device is selected. CExtDevice::DoNotifyLoop issues a call to the GetStatus method on the IAMExtTransport interface with ED_MODE_CHANGE_NOTIFY passed as a parameter. When a change in state occurs, a private message is sent to the message queue of WinCap's main window. If the device doesn't support notification, the PollDevice method is executed, which will poll the device repeatedly, looking for changes in the device's status. These state changes will be passed along to WinCap.

Finally, back in CMainDialog::OnSelectDevice, if CExtDevice::HasDevice returns true, meaning there is a digital camcorder connected, CMainDialog::OnSelectDevice calls DVProp::ShowDialog, and the VTR window is drawn on the display. The status of the buttons in the VTR window depends on whether the device is in VTR mode; that is, on whether CExtDevice::HasTransport is true. If it is, the buttons are enabled with a call to CDVProp::EnableVtrButtons; if not, they're disabled with the same call.

Issuing Transport Commands

Now that WinCap has acquired all the interfaces needed both to control and monitor a digital camcorder, there's a straightforward mapping between user interface events (button presses) and commands issued to the device. In the case of rewind or fast forward, a method call is made to CExtDevice::Rewind or CExtDevice::FastForward, respectively, which translates into a call to CExtDevice::SetTransportState with one of two values, either ED_MODE_REW or ED_MODE_FF. CExtDevice::SetTransportState has a little bit of overhead, but its essence is a single line of code that calls the put_Mode method of the IAMExtTransport interface with either ED_MODE_REW or ED_MODE_FF, depending on the user request. That single command is all that's required to put the camcorder into either mode.
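Stripped of WinCap's overhead, the heart of that mapping can be sketched as a free function (my own simplification, not the CExtDevice member itself) that takes the IAMExtTransport pointer acquired in InitDevice:

```cpp
#include <dshow.h>

// Sketch of the essential line inside CExtDevice::SetTransportState:
// a single put_Mode call on the transport interface selects the mode.
HRESULT SetTransportState(IAMExtTransport *pTransport, long lMode)
{
    if (!pTransport)
        return E_NOINTERFACE;
    // lMode is one of ED_MODE_REW, ED_MODE_FF, ED_MODE_PLAY,
    // ED_MODE_STOP, ED_MODE_RECORD, and so on.
    return pTransport->put_Mode(lMode);
}
```

The camcorder takes it from there; the mode change is reported back asynchronously through the notification thread described earlier.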

It's not much more complicated to put the device into play, stop, or record mode. In these cases, however, there's a bit of overhead inside the CDVProp methods in case the device is in another mode (for example, if the play button is pressed after a recording has been made to tape). In the case of the play button, it's all quite simple, as shown here:

void CDVProp::OnVtrPlay()
{
    if (m_pGraph)
    {
        // If previously we were transmitting from file to tape,
        // it's time to rebuild the graph.
        if (m_bTransmit)
        {
            m_pGraph->TearDownGraph();
            m_pGraph->RenderPreview();
            m_bTransmit = false;
        }
        m_pGraph->Run();
    }
    Play();
}

The structure of a DirectShow filter graph for tape playback is different from a filter graph used to write an AVI file to tape, so if the filter graph exists (that is, if m_pGraph is non-NULL), a test is made of m_bTransmit. If this is true, the filter graph is destroyed with a call to CCaptureGraph::TearDownGraph and rebuilt with a call to the CCaptureGraph::RenderPreview method. Once that process is complete, the CExtDevice::Play method is invoked, which, as in the case of rewind and fast forward, results in a call to CExtDevice::SetTransportState with a passed value of ED_MODE_PLAY. Stopping tape playback is performed in a nearly identical way; in that case, the value sent to CExtDevice::SetTransportState is ED_MODE_STOP.

Recording an AVI File

Now that we've gone through the more basic controls for a camcorder, we'll move on to the more involved procedure of writing an AVI file to a digital camcorder. As discussed earlier in this chapter, AVI isn't the same thing as DV format, but DV data can be stored inside an AVI file. DV-formatted AVI files come in two varieties, known as Type 1 and Type 2. Type 1 AVI files store the DV audio and video together as chunks, while Type 2 AVI files store the DV audio in a separate stream from the DV video. Most modern AVI tools and programs produce Type 2 AVI files, so those will be the ones we'll cover here, but both types are discussed in detail in Chapter 14.

When the user clicks the File Transmit icon, the CDVProp::OnTransmit method is invoked. Here the user selects an AVI file for transfer to the camcorder. Once the selection is made, a call is made to CCaptureGraph::RenderTransmit. That method calls CCaptureGraph::TearDownGraph, destroying any existing filter graph, and then calls CCaptureGraph::RenderType2Transmit (assuming a Type 2 AVI file for clarity's sake), which builds a new filter graph. Here's the source code for that method:

HRESULT CCaptureGraph::RenderType2Transmit(const WCHAR* wszFileName)
{
    HRESULT hr;

    // Target graph looks like this:
    // File Src -> AVI Split -> DV Mux -> Inf Pin Tee -> MSDV (transmit)
    //                                                -> renderers
    CComPtr<IBaseFilter> pDVMux;
    CComPtr<IBaseFilter> pFileSource;

    // Add the DV Mux filter.
    hr = AddFilter(m_pGraph, CLSID_DVMux, L"DV Mux", &pDVMux);
    if (FAILED(hr))
    {
        return hr;
    }

    // Add the File Source.
    hr = m_pGraph->AddSourceFilter(wszFileName, L"Source", &pFileSource);
    if (FAILED(hr))
    {
        return hr;
    }

    // Connect the File Source to the DV Mux. This will add the splitter
    // and connect one pin.
    hr = ConnectFilters(m_pGraph, pFileSource, pDVMux);
    if (SUCCEEDED(hr))
    {
        // Find the AVI Splitter, which should be the filter downstream
        // from the File Source.
        CComPtr<IBaseFilter> pSplit;
        hr = FindConnectedFilter(pFileSource, PINDIR_OUTPUT, &pSplit);
        if (SUCCEEDED(hr))
        {
            // Connect the second pin from the AVI Splitter to the DV Mux.
            hr = ConnectFilters(m_pGraph, pSplit, pDVMux);
        }
    }
    if (FAILED(hr))
    {
        return hr;
    }

    // Add the Infinite Pin Tee.
    CComPtr<IBaseFilter> pTee;
    hr = AddFilter(m_pGraph, CLSID_InfTee, L"Tee", &pTee);
    if (FAILED(hr))
    {
        return hr;
    }

    // Connect the DV Mux to the Tee.
    hr = ConnectFilters(m_pGraph, pDVMux, pTee);
    if (FAILED(hr))
    {
        return hr;
    }

    // Connect the Tee to MSDV.
    hr = ConnectFilters(m_pGraph, pTee, m_pCap);
    if (FAILED(hr))
    {
        return hr;
    }

    // Render the other pin on the Tee, for preview.
    hr = m_pBuild->RenderStream(0, &MEDIATYPE_Interleaved, pTee, 0, 0);
    return hr;
}

This function should look very familiar to you because here a filter graph is built up, step by step, using available DirectShow filters. Figure 6-5 shows the filter graph created by WinCap to record the AVI file to the camera.

Figure 6-5. The filter graph created by WinCap

First, a DV Mux filter is added to the filter graph with a call to AddFilter; this filter takes separate audio and video streams and multiplexes them into a single DV stream. The filters are intelligently connected using the ConnectFilters function (a function global to the application). This function adds an AVI Splitter filter between the two filters. Next, the FindConnectedFilter function (another global function) locates the AVI Splitter filter and, once it's located, connects the second pin from the AVI Splitter to the DV Mux. This splitting and recombining keeps the video and audio channels together as they pass from the file into the digital camcorder. If you don't split and recombine the stream, you'll end up with a video image but no audio. That's because we're dealing with a Type 2 AVI file here: the DV video and DV audio reside in separate streams and must be manipulated separately by DirectShow.

Next, the method instantiates an Infinite Pin Tee filter. This filter is like a Smart Tee (which could have been used in its place), except that it has an unlimited number of output pins. (Also, we don't need the Smart Tee's ability to drop frames to conserve processor time.) In other words, the incoming stream can be replicated any number of times (limited by computer horsepower, of course), and every time an output pin on the Infinite Pin Tee receives a connection from another filter, another output pin is added to the Infinite Pin Tee. It truly has infinite pins, or at least one more than you'll use! The DV Mux is then connected to the Infinite Pin Tee.

At this point, the filter graph is nearly complete. The Infinite Pin Tee is connected to the renderer filter, which in this case is the Microsoft DV Camcorder and VCR filter, with an input pin to receive the stream. In addition, the Infinite Pin Tee is passed along in a call to the RenderStream method of the ICaptureGraphBuilder2 interface on the capture graph builder. This call will add another output pin on the Infinite Pin Tee and send that stream along to the WinCap window so that the recording process can be monitored by the user as it happens.

As the last step in CDVProp::OnTransmit, the CExtDevice::Record method is invoked, which results in the invocation of CExtDevice::SetTransportState with a value of ED_MODE_RECORD. Notice that the filter graph has not been started and data is not being transmitted across the filter graph. This is why it was important to set up a monitoring thread when we initialized the CDVProp object that manages the digital camcorder. As stated in the "Issuing Transport Commands" section earlier in this chapter, a camcorder does not enter record mode until some time after the record command has been issued. However, once the device has entered record mode, the CDVProp::OnDeviceNotify method receives a message from the thread, and the Filter Graph Manager executes the Run method, beginning execution of the filter graph.

Capturing Video to an AVI File

WinCap makes capture to an AVI file very easy; the user only needs to click the Start Capture button to begin the capture process. (AVI is a generic container format, like ASF; just because a file has an AVI extension doesn't imply that there's DV video information contained within it.) This action produces a call to the CMainDialog::OnCapture method. If capturing has already begun, the button reads Stop Capture, so the first thing the method does is determine whether capturing has already been going on. If so, capture ceases. If not, a call is made to CCaptureGraph::RenderAviCapture. This function builds a simple filter graph with a call to the ICaptureGraphBuilder2 method SetOutputFileName, followed by a call to RenderStream. Once the filter graph has been built, a call is made to CCaptureGraph::StopCapture, which translates into a call to the ICaptureGraphBuilder2 method ControlStream. ControlStream allows you to set the capture parameters of a filter graph so that you can get frame-accurate capture from a stream. In this case, the parameters force any capture in progress to cease immediately. Once the filter graph has been set up, the Filter Graph Manager invokes the Run method and filter graph execution begins.
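The shape of that capture graph can be sketched as follows. This is illustrative only, not WinCap's exact code: the function and variable names are assumed, the file name is the default mentioned earlier, and the ControlStream times shown are placeholders for the values the application computes:

```cpp
#include <dshow.h>
#include <atlbase.h>

// Sketch: build a DV capture-to-file graph with ICaptureGraphBuilder2.
// pBuild is assumed to be a configured capture graph builder and pCap
// the camcorder's capture filter, both already in the graph.
HRESULT BuildAviCaptureGraph(ICaptureGraphBuilder2 *pBuild,
                             IBaseFilter *pCap)
{
    CComPtr<IBaseFilter> pMux;
    // Creates the AVI Mux and File Writer filters, returning the mux.
    HRESULT hr = pBuild->SetOutputFileName(&MEDIASUBTYPE_Avi,
        L"C:\\CapFile.avi", &pMux, NULL);
    if (FAILED(hr))
        return hr;

    // Route the device's interleaved DV capture stream into the mux.
    hr = pBuild->RenderStream(&PIN_CATEGORY_CAPTURE,
        &MEDIATYPE_Interleaved, pCap, NULL, pMux);
    if (FAILED(hr))
        return hr;

    // Gate the capture stream; the start/stop times determine when
    // samples may flow to the file (placeholder values shown here).
    REFERENCE_TIME rtStart = 0, rtStop = 0;
    return pBuild->ControlStream(&PIN_CATEGORY_CAPTURE, NULL, pCap,
        &rtStart, &rtStop, 0, 0);
}
```

The design point is that ControlStream gates only the capture pin, so a preview stream (if one is rendered) keeps running while capture to disk is held off or released.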

Back again in CMainDialog::OnCapture, CCaptureGraph::StartCapture is invoked, which translates into another call to ControlStream, this time with parameters that force capture to begin. With that step complete, WinCap begins to write an AVI file to disk, translating the video frames from the digital camcorder into a disk file.

When the user clicks the Stop Capture button, CCaptureGraph::StopCapture is invoked a final time, stopping the capture process. At this point, there should be an AVI file on the user's hard disk containing the entire content of the captured video stream. This file is likely to be very large because DV-based AVI files grow at something close to 240 MB per minute!
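That figure is easy to sanity-check with back-of-the-envelope arithmetic. An NTSC DV frame occupies 120,000 bytes, and a Type 2 file adds a separate 48-kHz 16-bit stereo audio stream; the rough estimate below (which ignores AVI container overhead) lands in the same neighborhood:

```cpp
// Rough estimate of DV-AVI growth per minute of NTSC video:
// 120,000 bytes per DV frame at ~30 frames/sec, plus a separate
// 48 kHz, 16-bit, stereo PCM audio stream. AVI overhead ignored.
long long DvAviBytesPerMinute()
{
    long long video = 120000LL * 30 * 60;   // DV video bytes per minute
    long long audio = 48000LL * 2 * 2 * 60; // PCM audio bytes per minute
    return video + audio;                   // roughly 227 million bytes
}
```

Add the AVI index and chunk overhead on top of that and you arrive at the "close to 240 MB per minute" figure quoted above.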

DirectShow doesn't sense any difference between capture from a live camcorder and capture from a miniDV tape being played through a camcorder in VTR mode. As far as DirectShow is concerned, it's a DV stream and that's it. However, for the programmer, the difference is significant because in VTR mode, the programmer has to start the filter graph, start the tape, play the tape (sending the DV stream through the filter graph), stop the tape, and then, after a second or two has passed to allow for device latency in responding to commands, stop the filter graph.

Clocks and the Filter Graph

Every filter graph has to find some way to synchronize the media streams as they pass through the filter graph. Synchronization is particularly important if multiple streams are passing through a filter graph. If they become desynchronized, the results will very likely be unacceptable. For this reason, the Filter Graph Manager chooses a clock source that it uses to synchronize its operations across the entire filter graph.

Selection of a clock source is based on the available sources. The Filter Graph Manager first looks for live capture sources, such as the Microsoft DV Camera and VCR capture source filter. That filter is connected to the IEEE 1394 bus and its associated drivers, which have their own clock. Because the stream flowing across the filter graph is being generated by the capture source filter, using that filter as the clock source will help keep all streams in the filter graph properly synchronized.

If there isn't a usable capture source filter, the Filter Graph Manager will try to use a time source from the Microsoft DirectSound Renderer filter, which is generally an interface to some hardware on the computer's sound card. The filter graph must be using the DirectSound Renderer filter if it is to become the clock source for the filter graph. The Filter Graph Manager will not add the DirectSound Renderer filter to the filter graph just to acquire a clock source. Using a DirectSound-generated clock is more than convenient; the clocking circuitry on a PC's sound card is highly accurate, and providing a clocking source through the audio hardware keeps the video synchronized with the audio. If the DirectSound Renderer isn't the clock for the filter graph, the audio renderer works through DirectSound to synchronize the clock generated by the sound card with the DirectShow system clock.

As a last resort, if the Filter Graph Manager can't acquire a clock source from a capture source filter or the DirectSound Renderer filter, it uses the system time as its clock source. The system time is the least optimal clock source because the stream passing through the graph isn't being synchronized to any clock involved in capturing the stream, which could lead to a situation where two streams passing through different portions of the graph progressively desynchronize with respect to one another.

One way to explore clock assignments within the filter graph is to create a few capture and rendering filter graphs in GraphEdit. If a filter can act as the clock source for the filter graph, it's branded with a small clock icon. You'll also see the Select Clock menu item enabled when you right-click a filter that can be used as the clock source for a filter graph. Selecting that menu item sets the filter graph's clock source to the selected filter.

From the programmer's perspective, any filter object that exposes the IReferenceClock interface can be used as the reference clock for a filter graph. To change the reference clock on a filter graph, use the IMediaFilter interface, exposed by the Filter Graph Manager object. The interface has a method, SetSyncSource, which is passed a pointer to the new IReferenceClock to be used as the reference clock.

Monitoring Timecode

In the Timecode sidebar earlier in this chapter, we learned that digital videotape has a timestamp, known as timecode, that tracks the tape's position. The VTR window displays the tape's timecode in an inset area underneath the VTR buttons. Whenever the VTR window needs to be updated (which is driven by a simple timer set up when the window initializes), it makes a call to CDVProp::DisplayTimecode, which in turn calls CExtDevice::GetTimecode. Here's the implementation of that method:

HRESULT CExtDevice::GetTimecode(DWORD *pdwHour, DWORD *pdwMin,
    DWORD *pdwSec, DWORD *pdwFrame)
{
    if (!HasTimecode())
    {
        return E_FAIL;
    }
    if (!(pdwHour && pdwMin && pdwSec && pdwFrame))
    {
        return E_POINTER;
    }

    // Initialize the struct that receives the timecode.
    TIMECODE_SAMPLE TimecodeSample;
    ZeroMemory(&TimecodeSample, sizeof(TIMECODE_SAMPLE));
    TimecodeSample.dwFlags = ED_DEVCAP_TIMECODE_READ;

    HRESULT hr = m_pTimecode->GetTimecode(&TimecodeSample);
    if (SUCCEEDED(hr))
    {
        // Coerce the BCD value to our own timecode struct,
        // in order to unpack the value.
        DV_TIMECODE *pTimecode =
            (DV_TIMECODE*)(&(TimecodeSample.timecode.dwFrames));
        *pdwHour  = pTimecode->Hours10   * 10 + pTimecode->Hours1;
        *pdwMin   = pTimecode->Minutes10 * 10 + pTimecode->Minutes1;
        *pdwSec   = pTimecode->Seconds10 * 10 + pTimecode->Seconds1;
        *pdwFrame = pTimecode->Frames10  * 10 + pTimecode->Frames1;
    }
    return hr;
}

If the device has timecode capabilities (that is, if CExtDevice::HasTimecode returns true), a structure defined as TIMECODE_SAMPLE is initialized and then passed along as a parameter to the IAMTimecodeReader method GetTimecode. GetTimecode returns the tape's current timecode, not in decimal notation, but in binary-coded decimal, an antique notation that's rarely used anymore. Because of this notation, there's a bit of arithmetic to convert the binary-coded decimal to decimal, and that value is returned in four parts: hours, minutes, seconds, and frames. Don't attempt to use the raw, unconverted values returned by GetTimecode; they're not proper decimal numbers and will lead to all sorts of errors.
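
The BCD arithmetic is easy to get wrong, so here's a compact, self-contained sketch of the unpacking step. The packed layout below (two BCD digits per byte, frames in the low byte) mirrors the idea of DV_TIMECODE but is not its exact bitfield definition.

```cpp
// Each timecode field is stored as two binary-coded-decimal digits,
// so the byte 0x45 means decimal 45, not 69.
struct Timecode { int hour, min, sec, frame; };

Timecode unpackBcdTimecode(unsigned long packed)
{
    // Convert one BCD byte (two packed decimal digits) to binary.
    auto bcd = [](unsigned v) { return (int)((v >> 4) * 10 + (v & 0x0F)); };
    Timecode tc;
    tc.frame = bcd(packed & 0xFF);
    tc.sec   = bcd((packed >> 8)  & 0xFF);
    tc.min   = bcd((packed >> 16) & 0xFF);
    tc.hour  = bcd((packed >> 24) & 0xFF);
    return tc;
}
```

For example, unpackBcdTimecode(0x01234515UL) yields the timecode 01:23:45:15.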

DV Devices, Timecode, Synchronization, and Dropped Frames

Some DV devices (in particular, Sony camcorders) will create nonsequential timecodes, which means that as your tape advances through the play heads, the timecode value could actually return to 00:00:00:00. This is a convenience feature (as Sony sees it): the camera decides, based on its own internal calculations, that an entirely new sequence of video has begun, at which point the timecode is zeroed and the count begins again.

This situation is a possible gotcha for programmers searching for a particular timecode on a tape. For example, you might find more than one point on a tape that has a 00:00:00:00 timecode. If you're looking for the start of the tape, it's probably best to rewind the tape to its beginning instead of searching for a particular timecode. (To get around this problem, some DV filmmakers take a blank tape and simply record onto it with no source video to establish a continuous timecode from the beginning to the end of the tape.) This timecode problem is one reason that professionals working with digital video often opt for a DVCAM-capable camcorder. Although DVCAM records the same 25 Mbps stream as miniDV and Digital8 camcorders, the timecode is maintained continuously from the start of the tape through to its end.

Another issue that a DirectShow programmer might encounter when working with DV timecode is that the timestamps on DV media samples do not match the computer clock. DirectShow uses the IEEE 1394 bus clock contained within the IEEE 1394 data packets to create a timestamp value that's added to the media sample data as it passes through the filter graph. This clock isn't the same as the computer's clock (the REFERENCE_TIME value given in 100-nanosecond increments), so over a long enough time period, the system time and the DV timestamp will drift apart. Is this a problem? It might be if you're relying on perfect synchronization between the system clock and media sample timestamps.

DirectShow provides an interface on every video capture source filter known as IAMDroppedFrames. This interface provides a way for the filter graph to report frames dropped by the capture filter. Frames can get dropped if the computer gets too busy performing some other task to service the capture source filter or if the computer's hard disk isn't fast enough to write the entire DV stream to disk. The IAMDroppedFrames interface presents the GetNumDropped and GetDroppedInfo methods so that the DirectShow application can detect dropped frames and, perhaps, recover any lost data. (An application could detect dropped frames and, in the case of a VTR source, rewind the input DV stream to a position before the lost frames and capture the source stream once again.) Dropped frames can also be the fault of the camcorder; errors on a DV tape playing through a camcorder can produce dropped frames.

Getting Device Information

It's often important to know specific parameters about an external device connected to a computer. In the case of a digital camcorder, for example, it might be very important to know whether there's a tape in the unit. If there isn't, you might need to alert users before they attempt to record to a nonexistent tape! Although it's not a part of WinCap, the following function can be integrated into your own DirectShow applications to give you the ability to detect a tape-not-present situation, as well as tell you something about the capabilities of the device attached to the user's computer:

HRESULT DV_GetTapeInfo(void)
{
    HRESULT hr;
    LONG lMediaType = 0;
    LONG lInSignalMode = 0;

    // Query the media type of the transport.
    hr = MyDevCap.pTransport->GetStatus(ED_MEDIA_TYPE, &lMediaType);
    if (SUCCEEDED(hr))
    {
        if (ED_MEDIA_NOT_PRESENT == lMediaType)
        {
            // We want to return failure if no tape is installed.
            hr = S_FALSE;
        }
        else
        {
            // Now let's query for the signal mode of the tape.
            hr = MyDevCap.pTransport->GetTransportBasicParameters(
                ED_TRANSBASIC_INPUT_SIGNAL, &lInSignalMode, NULL);
            Sleep(_MAX_SLEEP);  // Give the device time to respond.
            if (SUCCEEDED(hr))
            {
                // Determine whether the camcorder supports NTSC or PAL.
                switch (lInSignalMode)
                {
                    case ED_TRANSBASIC_SIGNAL_525_60_SD :
                    case ED_TRANSBASIC_SIGNAL_525_60_SDL :
                        g_AvgTimePerFrame = 33;  // 33 ms (29.97 fps)
                        printf("VCR Mode - NTSC\n");
                        break;
                    case ED_TRANSBASIC_SIGNAL_625_50_SD :
                    case ED_TRANSBASIC_SIGNAL_625_50_SDL :
                        g_AvgTimePerFrame = 40;  // 40 ms (25 fps)
                        printf("VCR Mode - PAL\n");
                        break;
                    default :
                        printf("Unsupported or unrecognized tape format type.\n");
                        g_AvgTimePerFrame = 33;  // Default to NTSC timing.
                        break;
                }
                printf("Avg time per frame is %d ms\n", g_AvgTimePerFrame);
            }
            else
            {
                printf("GetTransportBasicParameters Failed (0x%x)\n", hr);
            }
        }
    }
    else
    {
        printf("GetStatus Failed (0x%x)\n", hr);
    }
    return hr;
}

The DV_GetTapeInfo function, from the sample program DVApp (an excellent sample that covers many DV features not explored in WinCap), begins with a call to the IAMExtTransport method GetStatus, passing a parameter of ED_MEDIA_TYPE. This call translates into a request for the media type inserted into the external device, which in this case is a digital camcorder. If the method returns a value of ED_MEDIA_NOT_PRESENT, there's no tape in the device.

Next, a call to the IAMExtTransport method GetTransportBasicParameters, passed with a parameter of ED_TRANSBASIC_INPUT_SIGNAL, will return a value that indicates the format of the video signal encoded on the tape. A value of ED_TRANSBASIC_SIGNAL_525_60_SD or ED_TRANSBASIC_SIGNAL_525_60_SDL indicates a video format of approximately 30 fps (29.97 fps), the NTSC standard used in the USA and Japan. A value of ED_TRANSBASIC_SIGNAL_625_50_SD or ED_TRANSBASIC_SIGNAL_625_50_SDL indicates a video format of 25 fps, the PAL standard used widely across Europe, Africa, and Asia.
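
The 33 ms and 40 ms values assigned to g_AvgTimePerFrame fall straight out of these frame rates; this small helper (not part of DVApp) shows the rounding:

```cpp
// NTSC runs at 30000/1001 (about 29.97) frames per second; PAL runs
// at exactly 25 fps. Rounding the per-frame period to the nearest
// millisecond gives the familiar 33 ms and 40 ms values.
int avgMsPerFrame(bool isNtsc)
{
    double fps = isNtsc ? 30000.0 / 1001.0 : 25.0;
    return (int)(1000.0 / fps + 0.5);
}
```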

Several other properties can be queried with the GetTransportBasicParameters method call. Information on these properties can be found in the DirectShow SDK documentation.

Issuing Raw AV/C Commands

In some situations, the hardware control options offered by DirectShow filters are not enough to handle the particulars of your digital camcorder-based application. DirectShow provides a clean and concise interface to the camcorder for such basic operations as play, record, fast forward, and rewind. But what if you want to eject the tape from the camcorder? As you see in the VTR dialog box, there's a button featuring the eject glyph, which attempts to eject the tape from the selected camcorder. Ejection can't be done with any of the basic functionality offered by the IAMExtTransport interface. Instead, it has to be done through a raw Audio Video Control (AV/C) command sent directly to the device.

DirectShow does offer a mechanism to issue raw AV/C commands to attached IEEE 1394 devices. The IAMExtTransport interface's GetTransportBasicParameters method can be used to issue these raw commands. Here's the implementation of the CDVProp::Eject method, which sends a raw AV/C command to the camcorder:

void CDVProp::Eject()
{
    BYTE AvcCmd[512];  // Big enough for the command and any response
    ZeroMemory(AvcCmd, sizeof(AvcCmd));

    // The raw AV/C eject command for the tape transport.
    BYTE EjectCommand[] = { 0x00, 0x20, 0xC1, 0x60 };
    memcpy(AvcCmd, EjectCommand, sizeof(EjectCommand));
    long cbCmd = sizeof(EjectCommand);

    HRESULT hr = GetTransport()->GetTransportBasicParameters(
        ED_RAW_EXT_DEV_CMD, &cbCmd, (LPOLESTR*)AvcCmd);
    return;
}

To issue a raw AV/C command, a buffer is created, cleared, and then loaded with the command to be sent to the device. This buffer has to be big enough to handle any return message sent from the device, so here we've created a buffer of 512 bytes, the maximum message length. The buffer is passed in a call to the GetTransportBasicParameters method with a value of ED_RAW_EXT_DEV_CMD passed as the first parameter, indicating that this is a raw AV/C command. The call is processed synchronously; control will not return until the command has been completed. At this point, the returned buffer should have whatever response codes were issued by the device, and the method's return value should be examined for any Windows error code.
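
For reference, here's the same four-byte eject command with each byte annotated. The field meanings are my reading of the AV/C command set (CONTROL command type, tape transport subunit, LOAD MEDIUM opcode, OPEN/eject operand); verify them against the 1394 Trade Association specifications before depending on them.

```cpp
#include <cstddef>
#include <cstring>

// Build the raw AV/C eject command into a caller-supplied buffer and
// return the command length, or -1 if the buffer is too small.
long buildEjectCommand(unsigned char *buf, size_t bufSize)
{
    const unsigned char cmd[] = {
        0x00,  // command type: CONTROL
        0x20,  // subunit: tape recorder/player, subunit ID 0
        0xC1,  // opcode: LOAD MEDIUM
        0x60   // operand: OPEN (eject the tape)
    };
    if (bufSize < sizeof(cmd))
        return -1;
    std::memset(buf, 0, bufSize);        // clear the response area too
    std::memcpy(buf, cmd, sizeof(cmd));  // load the command bytes
    return (long)sizeof(cmd);
}
```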

The topic of raw AV/C commands is a big one, too rich to be covered in detail in this book. The relevant documentation on AV/C commands can be found on the Web site of the IEEE 1394 Trade Association, www.1394ta.org, under Specifications. Here you'll find links to documentation describing the format and definition of the call and return values of each of the AV/C commands implemented by IEEE 1394 devices.

A final warning: you should use a raw AV/C command only when you have no other choice. The eject function is not offered by Windows and it isn't implemented in all camcorders, so we need to use raw AV/C functionality to implement it. Raw AV/C commands are not managed by Windows, and the program will wait for a response to the command before continuing program execution. Therefore, if the device fails to respond for some reason, your program could be waiting a long, long time.

Multiple DV Camcorders on a Computer

IEEE 1394 allows multiple camcorders to be connected to a host computer. Although you might be tempted to think that this would allow you to work with multiple DV video streams simultaneously in your filter graph, the truth of the matter is that it is at best unreliable. Here are some technical comments from the DirectShow engineering group at Microsoft explaining why.

CMP (Connection Management Procedure) issues. IEEE 1394 devices need to establish a peer-to-peer connection for streaming. This is different from USB, where the host is always in control. To start streaming, the device and the PC establish a connection, which is an agreement for the data producer and consumer to transmit and listen on the same isochronous channel. (Isochronous channels allow the transmitter to reserve a portion of bandwidth on a network.) This is done via the CMP protocol. IEEE 1394 uses 64 channels (0 through 63), with channel 63 reserved for broadcast. Many DV camcorders are not CMP compliant, so the PC does not initiate the connection. It relies on the camcorder to make a broadcast connection on channel 63. Because there is only one broadcast channel, DirectShow can work with only one such noncompliant DV device on a 1394 bus.

Bandwidth. Although IEEE 1394a supports up to 400 Mbps networking, most consumer DV camcorders use the 100 Mbps rate. Bandwidth is measured in bandwidth units of 20.3 nanoseconds (ns). There are 6144 units per 125-microsecond (µs) cycle. Of each cycle, 25 µs (20 percent) is reserved for asynchronous control data, leaving 80 percent (4915 units) for isochronous (streaming) transfer. A typical consumer DV camcorder has a 25-Mbps video stream, but there is additional overhead for audio and things like 1394 packet headers. The calculation for one DV stream according to IEC 61883-1 is 2512 units, which is more than half of the 4915 available. If the DV device is the isochronous resource manager (IRM), it will report the available bandwidth value as slightly higher (greater than 5024 units, or 82 percent) to allow two DV streams, although this does not comply with the IEEE 1394 specification. If the PC is the IRM, it reserves exactly 80 percent of bandwidth, following the specification, with the result that only one camcorder will be visible to the DirectShow application: the one that's turned on first.
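
The bandwidth figures in that note reduce to a few lines of integer arithmetic; the constants below simply restate the values quoted from the text and IEC 61883-1.

```cpp
// Bandwidth arithmetic from the engineering note: 6144 allocation
// units per 125-microsecond bus cycle, 80 percent usable for
// isochronous traffic, and 2512 units needed per DV stream.
const int kUnitsPerCycle    = 6144;
const int kIsochUnits       = kUnitsPerCycle * 80 / 100;  // 4915 units
const int kUnitsPerDvStream = 2512;

// With the PC as isochronous resource manager, integer division shows
// why only one DV stream fits inside the 80 percent reservation.
int maxDvStreams()
{
    return kIsochUnits / kUnitsPerDvStream;  // 4915 / 2512 = 1
}
```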

Camera mode. If two legacy (non-CMP) camcorders are connected to the computer, the PC does not establish the connection. The devices will compete to use the broadcast channel (63) for streaming, and probably the first one to stream will get the broadcast channel. If you have a mix of one legacy and one CMP-compliant device, the situation gets even more complicated, so it's not a reliable scenario.

VTR mode. If both cameras are in VTR mode and the application plays a tape on only one device, then it should work.

If you do have multiple cameras of the same type (same manufacturer and model) on your system, you're going to need some way to distinguish them. Their monikers will have the same Description and FriendlyName values, so those can't be used for this purpose. Instead, you'll need to get the DevicePath property from the property bag. (We covered how to use the property bag when enumerating devices in Chapter 4.) This isn't a human-readable string, but it is unique to each camcorder, so it can be used to differentiate between multiple camcorders of the same type on a single system. How you choose to represent these camcorders to the user ("Sony Camcorder 1," "Sony Camcorder 2," and so on) is up to you.
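
One way to put the DevicePath to work is sketched below. The approach (key each device by DevicePath and number the duplicates by FriendlyName) is an assumption about UI design, not WinCap code, and the path strings in the usage note are invented.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Tell identical camcorders apart: FriendlyName repeats, so key each
// device by its opaque-but-unique DevicePath and build a numbered
// display label per duplicate name.
std::map<std::string, std::string> labelByDevicePath(
    const std::vector<std::pair<std::string, std::string>> &devices)
{
    std::map<std::string, std::string> labels;  // DevicePath -> label
    std::map<std::string, int> counts;          // FriendlyName -> count
    for (const auto &dev : devices)
    {
        int n = ++counts[dev.first];
        labels[dev.second] = dev.first + " " + std::to_string(n);
    }
    return labels;
}
```

For example, two entries named "Sony DV Device" with (made-up) paths "path-a" and "path-b" come back labeled "Sony DV Device 1" and "Sony DV Device 2".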



Programming Microsoft DirectShow for Digital Video and Television (Pro-Developer)
ISBN: 0735618216
Year: 2002
Authors: Mark D. Pesce