DirectX Audio Building Blocks

Even the simplest DirectX Audio program works with three objects: the Segment, the Performance, and the Loader. It also helps to understand AudioPaths and the basics of COM programming.


A Segment represents any playable audio. It might be a MIDI or wave file, or it might be a DirectMusic Segment file authored in DirectMusic Producer. You can load any number of Segments from disk, and you can play any number of them at the same time. You can even play the same Segment multiple times overlapping itself.


An AudioPath represents the journey from Segment to synthesizer/mix engine to audio channel to final mix. A DirectSound Buffer manages the audio channel, since a Buffer really represents one hardware (or software) audio channel. There can be any number of AudioPaths active at any time and any number of Segments can play on the same AudioPath at once.


The Performance manages the scheduling of Segments. Think of it as the heart of the DirectX Audio system. It allocates the AudioPaths and connects them, and when a Segment plays, it connects the Segment to an AudioPath and schedules its playback.


Finally, the Loader manages all file input/output. This is intentionally separate from the Performance, so applications can override the loading functionality, should they desire.


COM is short for Component Object Model. Like all DirectX objects, the Segment, AudioPath, Performance, and Loader are COM objects. DirectX Audio uses COM at only the most elementary level, so a few key points are all you need:

  • Each object is exposed through one or more interfaces. An interface is a table of predefined functions that you can call. There is no direct access to the internal data and functions of the object.

  • Interfaces and objects are not the same. An interface provides an understood way to communicate with an object. Importantly, it is not limited to one object. For example, there is an interface, IPersistStream, that provides methods for reading and writing data to a file. Any COM object that includes an IPersistStream interface can receive commands to read and write data from any component that understands IPersistStream.

  • A unique identifier called an IID (interface ID) identifies each COM interface. An IID is really a GUID (globally unique ID), a 16-byte number generated so that it is, for all practical purposes, unique. For example, IID_IDirectMusicPerformance represents the IDirectMusicPerformance interface.

  • Every COM object also has a unique identifier called a CLSID (class ID), which is also a GUID. Continuing our example, CLSID_DirectMusicPerformance identifies the DirectMusic Performance object.

  • COM has a special function called CoCreateInstance() that you can use to create most COM objects. CoCreateInstance() finds the DLL (dynamic-link library) that contains the object's code, loads it, and calls a standard entry function to create the object and then return it to your program. The caller provides the CLSID of the object and the IID of the desired interface, and CoCreateInstance() returns the matching object and interface.

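  • With few exceptions, all COM methods return error codes in a standardized format called HRESULT. When a method succeeds, it typically returns S_OK, although other success codes such as S_FALSE exist, so it is safer to test results with the SUCCEEDED() and FAILED() macros than to compare against S_OK. There are failure codes for everything from out of memory to errors specific to the particular object. The SDK documents the return codes for each interface.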

  • Each object supports the base COM interface, IUnknown, with its three methods: AddRef(), Release(), and QueryInterface(). These methods provide a standardized way to manage the existence of the object and access its interfaces.

    A program might reference an object with more than one pointer in more than one place. This easily happens when objects indirectly reference other objects (for example, a program references several Segments that all reference a particular wave file). It is important for the object to know how many pointers are currently referencing it. Every time something new points to the object, it calls AddRef() to let the object know. The object's AddRef() implementation increments an internal reference counter and ensures that the object stays valid until the last external reference is released via Release(). AddRef() is automatically invoked when an object is created, so you rarely call it directly.

    When a pointer to an object is no longer needed, the referencing owner calls Release() to tell the object that a reference has gone away. When the reference count drops to zero, indicating that nothing is using the object anymore, the object must go away for good. Typically, it cleans up any variables of its own and then frees its memory.

    A COM object can support one or more interfaces. Because all interfaces are based on IUnknown and must implement the QueryInterface() method, it is possible to use one interface to hopscotch to the next by calling QueryInterface() and naming the desired interface (using the IID). For example, the Segment object supports both the IDirectMusicSegment interface, which you use to set Segment parameters, and the IPersistStream interface, which the Loader uses to read data into the Segment.
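
To make the AddRef(), Release(), and QueryInterface() rules concrete, here is a minimal, portable C++ sketch. It is a toy model rather than real COM: the names HResult, InterfaceId, IUnknownLike, ISegmentLike, IPersistStreamLike, ToySegment, and DemoLifetime are all hypothetical stand-ins invented for illustration, a plain enum takes the place of GUID-based IIDs, and nothing here touches the Windows SDK or DirectX Audio headers. Only the two numeric result values are chosen to match S_OK and E_NOINTERFACE.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for COM types; the real ones come from the Windows SDK.
using HResult = std::int32_t;
constexpr HResult kOk = 0;                                           // same value as S_OK
constexpr HResult kNoInterface = static_cast<HResult>(0x80004002u);  // same value as E_NOINTERFACE
inline bool Succeeded(HResult hr) { return hr >= 0; }                // mirrors the SUCCEEDED() macro

// Assumption: a simple enum replaces 16-byte IIDs to keep the sketch short.
enum class InterfaceId { Segment, PersistStream, Loader };

struct IUnknownLike {
    virtual HResult QueryInterface(InterfaceId iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    ~IUnknownLike() = default;  // callers never delete; they call Release()
};

struct ISegmentLike : IUnknownLike {
    virtual void SetRepeats(int count) = 0;
};

struct IPersistStreamLike : IUnknownLike {
    virtual bool Load(const char* data) = 0;
};

// One object, two interfaces, one shared reference count.
class ToySegment final : public ISegmentLike, public IPersistStreamLike {
public:
    // The object is handed out with its first reference already counted.
    static ISegmentLike* Create(int* liveObjects) { return new ToySegment(liveObjects); }

    HResult QueryInterface(InterfaceId iid, void** out) override {
        if (iid == InterfaceId::Segment) {
            *out = static_cast<ISegmentLike*>(this);
        } else if (iid == InterfaceId::PersistStream) {
            *out = static_cast<IPersistStreamLike*>(this);
        } else {
            *out = nullptr;
            return kNoInterface;  // this object does not support the requested interface
        }
        AddRef();  // every pointer handed out counts as a new reference
        return kOk;
    }

    unsigned long AddRef() override { return ++refs_; }

    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;  // last reference gone: clean up and free memory
        return left;
    }

    void SetRepeats(int count) override { repeats_ = count; }
    bool Load(const char* data) override { loaded_ = (data != nullptr); return loaded_; }

private:
    explicit ToySegment(int* liveObjects) : live_(liveObjects) { ++*live_; }
    ~ToySegment() { --*live_; }

    int* live_;               // external counter so the demo can observe destruction
    unsigned long refs_ = 1;  // starts at 1 for the creator's reference
    int repeats_ = 0;
    bool loaded_ = false;
};

// Walks through the lifetime rules; returns true if every step behaved as described.
bool DemoLifetime() {
    int live = 0;
    ISegmentLike* seg = ToySegment::Create(&live);  // created holding one reference
    bool ok = (live == 1);

    void* raw = nullptr;
    HResult hr = seg->QueryInterface(InterfaceId::PersistStream, &raw);
    ok = ok && Succeeded(hr) && raw != nullptr;
    auto* stream = static_cast<IPersistStreamLike*>(raw);  // second reference, same object

    void* missing = nullptr;
    hr = seg->QueryInterface(InterfaceId::Loader, &missing);
    ok = ok && !Succeeded(hr) && missing == nullptr;  // unsupported interface fails cleanly

    ok = ok && seg->Release() == 1 && live == 1;     // one reference left, object still alive
    ok = ok && stream->Release() == 0 && live == 0;  // last Release() destroys the object
    return ok;
}
```

Calling DemoLifetime() from your own main() exercises the whole sequence: the object survives the first Release() because the pointer obtained through QueryInterface() still references it, and it is destroyed only when that second pointer is released as well.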

Therefore, from your perspective, COM programming in DirectX Audio means you use CoCreateInstance() to create a few objects (the rest are created automatically by other DirectX Audio objects), Release() to get rid of them, and on the rare occasion QueryInterface() to access additional features.

That's enough explanation about COM. Let's get on with the program.