DirectX Audio for Software Engineers

This is short and sweet. DirectX 9 Audio Exposed: Interactive Audio Development contains a comprehensive tutorial-style look at the inner workings of the DirectX Audio libraries. Everything you'd like to know about the DirectMusic API is within these pages. Todor Fay, the originator of the cornerstone DirectMusic technology, has laid out much of his wisdom for all of us to learn from. So take a look at the examples and discussions and get some ideas about how you can build your own software applications upon the DirectX Audio backbone. We look forward to seeing/hearing/using what you come up with!

Unit I: Producing Audio with DirectMusic

Chapter List

Chapter 1: DirectMusic Concepts
Chapter 2: Linear Playback
Chapter 3: Variation
Chapter 4: Interactive and Adaptive Audio
Chapter 5: DirectMusic Producer
Chapter 6: Working with Chord Tracks and ChordMaps
Chapter 7: DirectX Audio Scripting

Chapter 1: DirectMusic Concepts


Scott Selfon

DirectMusic and DirectMusic Producer (the tool used to author music and soundscapes for DirectMusic) introduce some new concepts in audio production. Learn to appreciate what follows, and you will see that there is a new musical world waiting to be conquered. But first, a quick explanation of terms.

We use the term "audio producer" throughout this book. An audio producer is anyone who creates audio to be used as part of a DirectMusic project: a sound designer, producer, recording engineer, or composer; we don't care which. If your focus is creating audio to be placed in a project, you are an audio producer. If instead your job is integrating sounds into a project using code, building tools or extensions for DirectX Audio, developing playback software, or programming games, then you should perk up when you see the term "programmer."


"Interactivity" is both the most overused and misused term in game audio. Often, when someone speaks of interactive audio, he is referring either to adaptive audio (audio that changes characteristics, such as pitch, tempo, instrumentation, etc., in reaction to changes in game conditions) or to the art and science of game audio in general. A better definition for interactive audio is any sound event resulting from action taken by the audience (or a player in the case of a game or a surfer in the case of a web page, etc.). When a game player presses the "fire" button, his weapon sounds off. The sound of the gun firing is interactive. If a player rings a doorbell in a game world, the ringing of the bell is also an interactive sound. If someone rolls the mouse over an object on a web site and triggers a sound, that sound is interactive. Basically, any sound event, whether it be a one-off, a musical event, or even an ambient pad change, if it comes into play because of something the audience does (directly or indirectly), it is classified as interactive audio.


An interesting side effect of recorded music production is the absence of variation in playback. Songs on a CD play back the same every single time. Music written to take advantage of DirectMusic's variability properties can be different every time it plays. Variation is particularly useful in producing music for games, since a small amount of music often needs to stretch over many hours of gameplay. Chapter 21 discusses applications of variation in music production outside the realm of games.

The imperfection of the living, breathing musician creates the human element of live musical performances. Humans are incapable of reproducing a musical performance with 100 percent fidelity. Therefore, every time you hear a band play your favorite song, no matter how much they practice, it differs from the last time they played it live, however subtle the differences. Repetition is perceived as something unnatural (not necessarily undesirable, but unnatural nonetheless) and is easily detectable by the human ear. Variability also plays a role in song form and improvisation. Again, when a song is committed to a recording, it has the same form and solos every time someone plays that recording. However, when musicians perform the song at a live venue, they may choose to alter the form or improvise in a way that differs from the recording.

DirectMusic allows a composer to inject variability into a prepared piece of digital audio, whether a violin concerto or an audio design modeled to mimic the sounds of a South American rain forest. Composers and sound designers can introduce variability on different levels, from the instrument level (altering characteristics such as velocity, timbre, and pitch) to the song level (manipulating overall instrumentation choices, musical style choices, song form, and so on). Using DirectMusic's power of variability, composers can create stand-alone pieces of music that reinvent themselves every time the listener plays them, creating a very different listening experience compared with a mixed and mastered version of the same music. Because the music changes with every listening session, its replay value increases as well.

Avoiding repetition of audio content is often important in games. When asked about music for games, someone once said, "At no time in history have so few notes been heard so many times." Repetition is arguably the single biggest deterrent to the enjoyment of audio (both sound effects and music) in games. Unlike in traditional linear media such as film, there are typically no set times or durations for specific game events. A "scene" might take five minutes in one instance and hours in another. Furthermore, there is no guarantee that particular events will occur in a specific order, will not repeat, or will not be skipped entirely. Coupled with the modest storage space (also known as the "footprint") budgeted for audio on the media and in memory, this leaves the audio producer in a bit of a quandary. For the issue of underscore, a game title with hours of gameplay might only be budgeted for a few minutes' worth of linear music. The audio director must develop creative ways to keep this music fresh, such as alternative versions and version branching. Audio programmers can investigate and implement these solutions using DirectMusic.

Events that trigger ambience, dialog, and specific sound effects may repeat numerous times, which adds the challenge of avoiding the kind of obvious repetition that can spoil a game's realism for the player. As we discuss in more detail in later chapters, DirectMusic provides numerous methods for helping to avoid repetition. On the most basic level, audio programmers can specify variations in pitch and multiple versions of wave, note, and controller data. Even game content (specific scripted events in the game, for instance) can specify orderings for playback (such as shuffling, no repeats, and so on) that DirectMusic tracks as the game progresses. Using advanced features, chord progressions can maintain numerous potential progression paths, allowing a limited amount of source material to remain fresh even longer.
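
For programmers, the playback orderings mentioned above (plain random choice, no immediate repeats, and shuffling) can be pictured with a small sketch. This is not DirectMusic code and does not use the DirectMusic API. The names VariationPicker and vary_pitch, the mode strings, and the pitch spread parameter are all hypothetical, chosen only to illustrate the idea in Python.

```python
import random


class VariationPicker:
    """Chooses which of several variations to play next.

    Hypothetical helper for illustration only; not part of DirectMusic.
    Modes:
      "random"    - any variation, repeats allowed
      "no_repeat" - any variation except the one just played
      "shuffle"   - play every variation once in random order, then reshuffle
    """

    MODES = ("random", "no_repeat", "shuffle")

    def __init__(self, count, mode="shuffle", rng=None):
        if count < 1:
            raise ValueError("count must be at least 1")
        if mode not in self.MODES:
            raise ValueError("unknown mode: " + mode)
        self.count = count
        self.mode = mode
        self.rng = rng or random.Random()
        self.last = None   # index of the variation played most recently
        self.queue = []    # remaining indices in the current shuffle pass

    def next(self):
        """Return the index of the next variation to play."""
        if self.mode == "random":
            choice = self.rng.randrange(self.count)
        elif self.mode == "no_repeat":
            # Exclude the previous choice; fall back to 0 if only one exists.
            candidates = [i for i in range(self.count) if i != self.last] or [0]
            choice = self.rng.choice(candidates)
        else:
            if not self.queue:
                self.queue = list(range(self.count))
                self.rng.shuffle(self.queue)
                # Avoid a back-to-back repeat across the reshuffle boundary.
                if self.count > 1 and self.queue[0] == self.last:
                    self.queue[0], self.queue[-1] = self.queue[-1], self.queue[0]
            choice = self.queue.pop(0)
        self.last = choice
        return choice


def vary_pitch(base_semitones, spread, rng):
    """Return the base pitch offset by a random amount within +/- spread."""
    return base_semitones + rng.uniform(-spread, spread)


if __name__ == "__main__":
    picker = VariationPicker(4, mode="shuffle", rng=random.Random(1))
    print([picker.next() for _ in range(8)])
```

With four footstep recordings, for example, the "shuffle" mode plays all four before any one is heard a second time, while a small pitch spread applied to each playback makes even a repeated recording sound slightly different.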