Chapter 11: AudioPaths


Todor Fay

AudioPaths bring a whole new level of flexibility and control to your music and sound effects. So far, everything we have done plays through one AudioPath. This has worked well for our programming examples up to this point; however, a single AudioPath becomes limiting once you add sound effects or more complex music. Let's take a look at what is missing:

  • No way to address 3D: For sound effects work, it is critical that one or more sound-producing Segments can be routed to a specific 3D position in space. It is also required that the 3D position be directly accessible to the host application so it can be controlled during gameplay.

  • Pchannel collision: Segments authored with the same pchannels conflict when played at the same time. For example, instrument choices from one Segment are applied to another, and a piano part is played by a bassoon. It should be possible to play two Segments in their own "virtual" pchannel spaces, so they cannot overlap.

  • No independent audio processing: It should be possible to set up different audio processors to affect different sound effects paths in different ways, such as applying echo to sounds going under a tunnel or distortion to an engine. Likewise, different musical instruments as well as sections can benefit from individualized audio processing (reverb, chorus, compression, etc.).

  • No independent MIDI processing: In the same vein, it should be possible to set up different MIDI, or pmsg, processors (tools) to manipulate individual musical parts and sound effects.

In fact, this was the status quo with DirectX 7.0. AudioPaths, introduced in DirectX 8.0, set out to solve these issues.

Anatomy of an AudioPath

So what is an AudioPath? An AudioPath is a DirectMusic object that manages the route that a musical element or sound effect takes. It defines everything that is needed to get from the point when a Segment Track emits a pmsg (Performance message) to the point when the rendered audio leaves DirectSound as PCM wave data. This includes everything from how many pchannels to use to the placement of audio effects and their configuration.

An AudioPath can be as simple or sophisticated as you want it to be. It can be a 3D path with nothing more than a DirectSound3D module at the end for positioning, or it can be a complex path with different pchannels assigned to different audio channels with realtime audio effects processing.

Let's dissect the AudioPath from top to bottom and learn about all the possible steps. Because there's so much that it can do, you might find this a bit overwhelming. It's not important that you memorize all the steps. Just read this and get a feel for the possibilities. Later, if you want to zero in on a particular feature, you can read through it again.

The AudioPath can be broken down into two phases, before and after the synth (see Figure 11-1). Before the synth, the play data is managed in pmsg form inside DirectMusic's Performance Layer. Each pmsg identifies a particular sound to start or stop or perhaps a control, such as which instrument to play or its volume at any point in time. The synth converts the pmsg into wave data, at which point we are in the second phase of the AudioPath where raw wave data is fed through multiple streaming DirectSound buffers to be processed via effects, optionally positioned in 3D, and finally mixed to generate the output.

Figure 11-1: AudioPath phases.

AudioPath Performance Phase

Figure 11-2 shows the Performance phase in greater detail. This is the journey a pmsg takes as it travels through the AudioPath from Track to synth:

Figure 11-2: AudioPath Performance phase.

Wow! Does that look complicated. Is there danger of the pmsg making it through in time for lunch, let alone alive? Have no fear; it's not as crazy as it looks. If you are worried about extra latency or CPU overhead, don't be. The internal implementation is lightweight and efficient. Also, most of the objects in the path are optional, so their steps are simply skipped when they don't exist.

Let's walk through the steps:

  1. Track emits a pmsg. The pmsg is filled with appropriate data and time stamped. There is a wide range of available pmsg types, from MIDI note to wave.

  2. The pmsg enters the Segment's Tool Graph. This is optional. The Graph holds a set of one or more Tools, which are pmsg processors. These can alter the pmsg in real time. Simple examples of Tools would be echo (make multiple copies at later timestamps) and transpose (shift the pitch up or down). Every Tool is represented by an IDirectMusicTool interface. Tools are very easy to write and use. In fact, there's a wonderful Tool wizard that ships with DX9, so if you have wild ideas for things you'd like to do in real time to your music or sounds as they play, I highly recommend giving it a try. Unfortunately, there isn't enough room in this book to walk through the process of creating full-featured Tools. Tools in the Segment are typically authored directly into the Segment and only process pmsgs in the Segment itself.

  3. The pmsg leaves the Segment and enters the AudioPath's Tool Graph. Tools in this Graph process all pmsgs coming from all Segments playing in the AudioPath. Think of these as partially global tools, in that they process all pmsgs that flow through a specific AudioPath, regardless of the Segment. These Tools are also optional and typically authored via the AudioPath configuration file.

  4. The AudioPath maps the pmsg pchannel from the local AudioPath defined pchannel range to a unique pchannel in the Performance. This ensures that Segments played on different AudioPaths cannot collide on the same pchannels.

  5. The Performance also has an optional Tool Graph. Tools in this Graph process pmsgs from all Segments on all AudioPaths. These are truly global Tools.

  6. Then, the Performance maps the pmsg's pchannel to a real MIDI channel and channel group number because the synthesizer operates completely in the MIDI domain. Additionally, the pmsg is converted into MIDI format data. For example, a note pmsg is converted to a MIDI note on and a MIDI note off. Or, a volume curve is converted into a series of discrete MIDI volume control change commands. The resulting MIDI events are shipped down to the synth as a block of MIDI commands.

  7. Finally, the synth receives the MIDI data and renders it into one or more channels of wave data. Note that even the type of synthesizer and which pchannels it connects to is managed by the AudioPath.
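
To make steps 2 through 6 a little more concrete, here is a minimal sketch of the idea in plain C++. It is a toy model and not the DirectMusic API: NoteMsg, Tool, ToolGraph, runGraph, and toMidi are hypothetical names invented for this illustration, and the rule that pchannel N lands on channel group N / 16 + 1, MIDI channel N % 16 is simply an assumed linear mapping. The sketch shows a note message passing through a chain of Tools (transpose, then echo) and then being converted into a MIDI note on and note off pair.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Toy stand-in for a note pmsg. This is NOT the real DirectMusic structure.
struct NoteMsg {
    long time;      // start time in arbitrary ticks
    long duration;  // length in ticks
    int  pchannel;  // performance channel
    int  pitch;     // MIDI note number, 0..127
    int  velocity;  // 1..127
};

// A Tool takes one message and returns zero or more messages.
using Tool = std::function<std::vector<NoteMsg>(const NoteMsg&)>;
using ToolGraph = std::vector<Tool>;

// Transpose: shift the pitch up or down, clamped to the MIDI range.
Tool makeTranspose(int semitones) {
    return [semitones](const NoteMsg& m) {
        NoteMsg out = m;
        out.pitch = std::max(0, std::min(127, m.pitch + semitones));
        return std::vector<NoteMsg>{out};
    };
}

// Echo: keep the original and add quieter copies at later timestamps.
Tool makeEcho(int repeats, long delay) {
    return [repeats, delay](const NoteMsg& m) {
        std::vector<NoteMsg> out{m};
        for (int i = 1; i <= repeats; ++i) {
            NoteMsg copy = m;
            copy.time += i * delay;
            copy.velocity = std::max(1, m.velocity / (i + 1));
            out.push_back(copy);
        }
        return out;
    };
}

// Feed every message through each Tool of a graph, in order.
std::vector<NoteMsg> runGraph(const ToolGraph& graph, std::vector<NoteMsg> msgs) {
    for (const Tool& tool : graph) {
        std::vector<NoteMsg> next;
        for (const NoteMsg& m : msgs) {
            std::vector<NoteMsg> produced = tool(m);
            next.insert(next.end(), produced.begin(), produced.end());
        }
        msgs = std::move(next);
    }
    return msgs;
}

struct MidiEvent {
    long time;
    int  group;   // channel group, 1-based (assumed mapping)
    int  status;  // 0x90 | channel = note on, 0x80 | channel = note off
    int  data1;   // note number
    int  data2;   // velocity
};

// Convert one note message into a MIDI note on / note off pair.
std::vector<MidiEvent> toMidi(const NoteMsg& m) {
    int group   = m.pchannel / 16 + 1;  // assumed linear mapping
    int channel = m.pchannel % 16;
    return {
        {m.time,              group, 0x90 | channel, m.pitch, m.velocity},
        {m.time + m.duration, group, 0x80 | channel, m.pitch, 0},
    };
}
```

In this model, a Segment graph holding makeTranspose(12) followed by an AudioPath graph holding makeEcho(1, 100) turns one note into two notes an octave higher, the second one delayed and quieter. Chaining runGraph calls mirrors the Segment, AudioPath, and Performance graphs running one after another.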

The cool thing is that the AudioPath has access to and control over almost all of the steps in this process. This means that via the AudioPath API, you can access most of these as well.

Let's continue with the second phase of the AudioPath.

AudioPath DirectSound Phase

The synth, which is managed via DirectMusic's IDirectMusicPort interface, generates multiple streams of audio data, which are fed into one or more DirectSound buffers (see Figure 11-3). Again, the AudioPath retains access to and control over each of the steps on the way.

Figure 11-3: AudioPath DirectSound phase.

Let's walk through these steps:

  1. The synth emits one or more streams of audio data. The AudioPath defines for each pchannel how many audio outputs should be created and where they should connect. These are called "buses." This means that you can literally have every single pchannel routed to a different set of effects if you so desire.

  2. Each bus is fed into a DirectSound Buffer. These Buffers are called "synth-in Buffers" because they receive their input from the synth.

  3. Each Buffer can hold any number of DMOs. The audio data is fed through each DMO in order. A DMO is an audio effects processor. As with Tools, you can create DMOs and insert them into your AudioPath configuration. DirectMusic ships with a great set of DMOs, from Reverb to Compression, which you can use. But be careful; each DMO uses CPU to continuously process the audio. So the more you add, the more CPU gets gobbled.

  4. If the Buffer is set up as a 3D Buffer, the audio data next enters the 3D engine where it is placed in 3D space and sent to the final mix. Otherwise, the data goes directly to the final mix.

  5. A send can be placed in the Buffer to route audio data to another Buffer. The destination Buffer takes its input from other Buffers rather than from the synth, so it is called a "Mix-in Buffer."

  6. The Mix-in Buffer also has one or more DMOs, which can process the data before sending to the final mix. These are often called global Buffers because they can take input from any number of regular Buffers. This is a great way to economize on CPU. When creating the AudioPath configuration in Producer, place Send Effects in your regular Buffers to send to a shared Mix-in Buffer, which has some CPU-expensive processing to do, and just that one instance can do the work for all.
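
The routing in the steps above can be sketched with a small toy model. Again, this is not DirectSound code: Buffer, Effect, processSynthIn, and processMixIn are hypothetical names made up for the illustration, an Effect is just a function standing in for a DMO, the send is assumed to sit after all other effects, and 3D positioning is left out. What the sketch does show is why a shared Mix-in Buffer saves CPU: several synth-in Buffers send into it, and its expensive effect runs once for all of them.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Samples = std::vector<float>;
using Effect  = std::function<void(Samples&)>;  // stand-in for a DMO

// Toy model of a Buffer: an ordered effect chain plus an optional send.
struct Buffer {
    std::vector<Effect> effects;       // processed in order
    Buffer* sendTarget = nullptr;      // optional Mix-in Buffer
    float   sendLevel  = 0.0f;         // gain applied to the send
    Samples pending;                   // audio received from sends (Mix-in only)
};

// Add src, scaled by gain, into dst, growing dst if needed.
void mixInto(Samples& dst, const Samples& src, float gain) {
    if (dst.size() < src.size()) dst.resize(src.size(), 0.0f);
    for (std::size_t i = 0; i < src.size(); ++i) dst[i] += gain * src[i];
}

// Synth-in Buffer: run the effects, feed the send, then go to the final mix.
void processSynthIn(Buffer& buf, Samples bus, Samples& finalMix) {
    for (Effect& fx : buf.effects) fx(bus);
    if (buf.sendTarget) mixInto(buf.sendTarget->pending, bus, buf.sendLevel);
    mixInto(finalMix, bus, 1.0f);
}

// Mix-in Buffer: process everything the sends delivered, exactly once.
void processMixIn(Buffer& buf, Samples& finalMix) {
    for (Effect& fx : buf.effects) fx(buf.pending);
    mixInto(finalMix, buf.pending, 1.0f);
    buf.pending.clear();
}
```

To use the model, point the sendTarget of two synth-in Buffers at the same Mix-in Buffer, call processSynthIn for each bus, and then call processMixIn once. However many Buffers send to it, the Mix-in Buffer's effect chain runs a single time per block of audio.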

Pretty amazing, eh? With AudioPaths, you can create sophisticated routing and processing of your audio all the way from the Segment to the final mix. There are many cool things you can stick in it and many ways to configure it. You can create any number of instances of any AudioPath and each has its own virtual pchannel space, so your Segments won't collide.
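
The "virtual pchannel space" idea is easy to picture with a few lines of code. The sketch below is only a model under stated assumptions: the Performance and AudioPathInstance classes are hypothetical stand-ins, and handing out consecutive blocks of channels is an assumed allocation scheme chosen for simplicity, not a description of how DirectMusic actually assigns them. The point is that two instances can both use local pchannel 0 and still land on different channels in the Performance.

```cpp
#include <cassert>

// Hypothetical stand-in for the Performance: hands out blocks of
// performance-wide pchannels so that no two blocks overlap.
class Performance {
public:
    // Reserve `count` consecutive pchannels and return the first one.
    int reserve(int count) {
        int first = next_;
        next_ += count;
        return first;
    }
private:
    int next_ = 0;
};

// Hypothetical stand-in for one AudioPath instance with its own
// local pchannel range 0..count-1.
class AudioPathInstance {
public:
    AudioPathInstance(Performance& perf, int pchannelCount)
        : base_(perf.reserve(pchannelCount)), count_(pchannelCount) {}

    // Map a local pchannel to its unique pchannel in the Performance.
    // Returns -1 if the local pchannel is outside this path's range.
    int toPerformanceChannel(int local) const {
        if (local < 0 || local >= count_) return -1;
        return base_ + local;
    }
private:
    int base_;
    int count_;
};
```

Two instances created against the same Performance, each with 16 pchannels, map local pchannel 3 to two different Performance pchannels, so a piano part on one path can no longer be hijacked by a bassoon on the other.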


One word about terminology: although we use the term "Buffer," it is anything but that. Historically, a Buffer in DirectSound represented a genuine memory buffer. You'd create it, copy your audio data into it, and then play it. As we discussed earlier, this is very inefficient because it binds the data to the playback device. But, by DX8, Buffers had many new features, including 3D and a kludge to get around the inefficiency via the DuplicateSoundBuffer method. So, we continue with the name "Buffer," but think of it really as "AudioStream" or something like that.