Okay, we have covered everything you need to know to get DirectX Audio making sound. As you start to work with this, though, you will find that there are some big picture issues that need to be sorted out for optimal performance.
Minimize latency: By default, the time between when you start playing a sound and when you hear it can be too long for sudden sound effects. Clearly, that needs fixing.
Manage dynamic 3D resources: What do you do when you have 100 objects flying in space and ten hardware AudioPaths to share among them all?
Keep music and sound effects separate: How to avoid clashing volume, tempo, groove level, and more.
Latency is the number one concern for sound effects. For music, it has not been as critical, since the responsiveness of a musical score can be measured in beats and sometimes measures. Although the
Fortunately, DX9 introduces dramatically improved latency. With this release, the latency has dropped as low as 5ms, depending on the sound card and driver. Suddenly, even the hair-trigger sound effects work very well.
But there still are a few things you need to do to get optimal performance. Although the latency is capable of dropping insanely low, it is by default still kept pretty high — between 55 and 85ms, depending on the sound card. Why? In order to guarantee 100 percent compatibility for all applications on all sound cards on all systems, Microsoft decided that the very
Fortunately, if you know your application is running on a half-decent machine, you can override the settings and drive the latency way, way down. There are two commands for doing this. Because the audio is streamed from the synthesizer and through the effects chains, the mechanism that does this needs to wake up at a regular interval to process another batch of sound. You can determine both how close to the output the write cursor sits and how frequently it wakes up. The lower these numbers, the lower the overall latency. But there's a cost: If the write cursor is too close to the output time, you can get glitches in the sound when it simply doesn't wake up in time. If the write period is too frequent, the CPU overhead goes up. Indeed, these commands actually always existed with DirectMusic from the very start, but they could not reliably deliver significant results because it was very easy to drive the latency down too far and get horrible
The two commands set the write period and write latency and are implemented via the IKsControl mechanism. IKsControl is a general-purpose mechanism for talking to kernel modules and low-level drivers. The two commands are represented by GUIDs:
GUID_DMUS_PROP_WritePeriod: This sets how frequently (in ms) the rendering process should wake up. By default, this is 10 milliseconds. The only cost in lowering this is increased CPU overhead. Dropping to 5ms, though, is still very reasonable. The latency contribution of the write period is, on average, half the write period. So, dropping to 5ms is an average latency increase of 2.5ms — worst case 5ms.
GUID_DMUS_PROP_WriteLatency: This sets the latency, in milliseconds, to add to the sound card driver's latency. For example, if the sound card has a latency of 5ms and this is set to 5ms, the real write latency ends up being 10ms.
So, total latency ends up being WritePeriod/2 + WriteLatency + DriverLatency. Here's the code to use this. This example sets the write latency to 5ms and the write period to 5ms.
IDirectMusicAudioPath *pPath; if (SUCCEEDED(m_pPerformance->GetDefaultAudioPath(&pPath)) { IKsControl *pControl; pPath->GetObjectInPath(0, DMUS_PATH_PORT,0, // Talk to the synth GUID_All_Objects,0, // Any type of synth IID_IKsControl, // IKsControl interface (void **)&pControl); if (pControl) { KSPROPERTY ksp; DWORD dwData; ULONG cb; dwData = 5; // Set the write period to 5ms. ksp.Set = GUID_DMUS_PROP_WritePeriod ; // The command ksp.Id = 0; ksp.Flags = KSPROPERTY_TYPE_SET; pControl->KsProperty(&ksp, sizeof(ksp), &dwData, sizeof(dwData), &cb); dwData = 5; // Now set the latency to 5ms. ksp.Set = GUID_DMUS_PROP_WriteLatency ; ksp.Id = 0; ksp.Flags = KSPROPERTY_TYPE_SET; pControl->KsProperty(&ksp, sizeof(ksp), &dwData, sizeof(dwData), &cb); pControl->Release(); } pPath->Release(); }
Before you put these calls into your code, remember that it needs to be running on DX9. If not, latency
Unfortunately, there is no simple way to find out which version of DirectX is installed. There was some religious reason for not exposing such an obvious API call, but I can't remember for the life of me what it was. Fortunately, there is a sample piece of code, GetDXVersion, that ships with the SDK. GetDXVersion sniffs around making calls into the various DirectX APIs, looking for specific features that would
MingleDX8.dsp
, that you should compile with if you still have the DX8 SDK.
| Note |
The low latency takes advantage of hardware acceleration in a big way. As long as all of the buffers are allocated from hardware, the worst-case driver latency stays low. Once hardware buffers are used up and software emulation comes into play, additional latency is typically added as the software emulation kicks in. However, today's cards usually have at least 64 3D buffers available, which, as we discuss, is typically more than you'll want. Nevertheless, the best low latency strategy makes sure that the number of buffers allocated never exceeds the hardware capabilities. |
So we get to the
DirectSound does have a dynamic voice management system whereby each playing sound can be assigned a priority and voices can swap out as defined by any combination of priority, time playing, and distance from the listener. It's not perfect, though. It only terminates sounds. So, if a pig that is flying away from the listener is
For better or worse, the AudioPath mechanism doesn't even try to handle this. You must roll your own. In some ways, this is a win because you can completely define the algorithm to suit your needs, and you have the extra flexibility of being able to restart with Segments or call into script routines or whatever is most appropriate for your application. This does mean quite a bit of work. With that in mind, the major
There are two sets of items that we need to track:
Things that emit sound: Each "thing" represents an independently movable object in the world that makes a noise and so needs to be playing via one instance of an AudioPath. Each thing is assigned a priority, which is a criteria for figuring out which things can be heard, should there not be enough AudioPaths to go around. There is no limit to how large the set of things can be. There is no restriction to how priority should be calculated, though distance from the listener tends to be a good one.
3D AudioPaths: This set is limited to the largest amount of AudioPaths that can be running at one time. Each AudioPath is assigned to one thing that plays through the AudioPath.
Since we have a finite set of AudioPaths, we need to match these up with the things that have the highest priorities. But priorities can change over time,
Figure 13-1:
Three AudioPaths manage sounds for four things, sorted by priority.
Once a frame, we do the following:
Scan through the list of things and recalculate their positions and priorities. However, do not assign their
Look for things with newly high priorities that outrank things currently assigned to AudioPaths. Stop the older things from playing and assign their AudioPaths to the new, higher priority things.
Get the new 3D positions from the things assigned to AudioPaths and set the AudioPath positions.
Start the new things, which have just been assigned AudioPaths, to make sound.
If you think about it, sound effects and music follow very different purposes as well as
Groove level interference:
Groove level is a great way to set intensity for both music and sound effects, but you might want to have separate intensity controls for music and sound effects. You can accomplish this by having the groove levels on separate
Volume interference: Games often offer separate music and sound effects volume controls to the user. Global control of the volume is most easily managed using the global volume control on the Performance (GUID_PerfMasterVolume). That affects everything, so it can't be used. One solution is to call SetVolume() on every single AudioPath. Again, there has to be an easier way …
Okay, I get the hint. Indeed, there is an easier way. It's really quite simple. Create two separate Performance objects, one for music and one for sound effects. Suddenly, all the interference issues go out the window. The only downside is that you end up with a little extra work if the two
There are other bonuses as well:
Each Performance has its own synth and effects architecture. This means that they can run at different sample rates. The downside is that any sounds or instruments shared by both are downloaded twice, once to each synth.
You can do some sound effects in music time, which has better authoring support in some areas of DirectMusic Producer. Establish a tempo and write everything at that rate. This is how the sound effects for Mingle were authored. Of course, Segments
Mingle takes this approach. It creates two CAudio objects, each with its own Performance. The sound effects Performance runs at the default tempo of 120BPM without ever changing. 120BPM is