Hack 24. Create Telephony Sounds with SoX
Use the Swiss Army knife of sound-conversion utilities for your VoIP setup.
Though dozens of utilities are available for converting and tweaking audio files, the cross-platform open source audio tool called SoX really stands out. If you've got a Linux or BSD PC, chances are pretty good that you've got SoX installed. Windows and Mac users will have to download a compatible version of SoX from http://sox.sourceforge.net/. And since SoX is a command-line utility, you'll need to be at least a mediocre typist to get through this hack and the next two hacks. You'll also need to know how to get to a command line on your particular platform. On Windows, this means running the MS-DOS prompt. On the Mac, it's the Terminal. Linux and BSD users need only to fire up xterm. This hack will show you the ins and outs of using SoX to convert audio files from one format to another, add audio effects, and telephonize your audio through downsampling.
2.18.1. File Format Conversion
File format conversion is perhaps SoX's biggest strength. You can use SoX to convert from one format to another (WAV, AIFF, etc.) and from one encoding to another (uLaw, MP3, etc.). It even supports some fossilized sound formats like 8SVX and .voc. All of this format support is helpful if you want to use a file that you have only in some oddball format that your telephony software can't use.
In most telephony applications, like voicemail and interactive voice response (IVR), where recorded voice prompts are the user interface, you'll encounter sounds in one of a few encoding formats.
To convert a sound file from one format to another, there are two ways to go. SoX can recognize the input and desired output formats merely by parsing the filename extensions you provide, as in the following example:
$ sox basic_instructions.ulaw basic_instructions.gsm
This syntax takes basic_instructions.ulaw and creates a GSM-encoded file called basic_instructions.gsm. Of course, if the file you're converting doesn't have a file extension in its name, you can express your intentions more explicitly:
$ sox -t gsm another_brick -t aif another_brick_in_aiff_format
By specifying the encoding type with the -t option, you can tell SoX specifically how to convert the file, regardless of filenames and extensions. But that's not all you'll find SoX useful for.
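Extension-based conversion drops easily into a loop when a whole set of prompts needs converting. What follows is only a sketch: the helper name gsm_command and the three filenames are made up for illustration, and the script prints the SoX commands rather than running them, so you can review the list first.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the SoX command that would convert
# one prompt file to GSM, naming the output after the input.
gsm_command() {
    in=$1
    out="${in%.*}.gsm"    # replace the last extension with .gsm
    printf 'sox %s %s\n' "$in" "$out"
}

# Example prompt names (assumed for illustration, not from the text).
for f in welcome.wav menu.ulaw goodbye.aif; do
    gsm_command "$f"
done
```

Once the printed commands look right, piping the script's output to sh would execute them.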
2.18.2. Adding Sound Effects
Aside from converting files between formats, SoX can add some cool effects, too. Equalization, reversal, chorus, reverb, time shifting, and vibrato are some of the most commonly used effects options. Some of these effects are probably more useful in a pro audio environment than in VoIP, but there are uses for audio effects even in telephony, like an on-hold message that hypes a particular product or event. Such an announcement might benefit from a little reverb or delay. Just think about some of the sound effects used in monster-truck advertisements beckoning your attendance on Sunday, Sunday, Sunday! Consider the following syntax, which adds reverb to a sound:
$ sox bigFootSunday.aif bigFootSundayVerb.aif reverb 1 1000 15
This example takes bigFootSunday.aif and adds 1,000 milliseconds of reverb with a 15-millisecond delay before saving the file as bigFootSundayVerb.aif. You can combine sound effects, too. So, for instance, you can place a reverb and an EQ effect together:
$ sox gilmour.aif gilmour_fx.aif reverb 1 1200 30 highpass 1000
The reverb effect is followed by a high-pass filter, which is an EQ technique that attenuates content below a certain frequency (in this case, 1 kHz). You can experiment with the high-pass and low-pass filters to trim frequencies, letting you obtain a number of cool effects. Make your music sound like it's coming through a megaphone or, with a little reverb, make it sound like you're singing in the shower. Now, it's up to you to find an appropriate venue for all this aural awesomeness in your VoIP setup, perhaps in your Skype answering machine [Hack #37].
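Chaining a high-pass and a low-pass filter is one way to approach that megaphone or telephone-handset sound. The sketch below only prints the command it would run. The helper name bandlimit_command, the filenames, and the 300 Hz and 3400 Hz cutoffs (a rough approximation of the telephone voice band) are all assumptions chosen for illustration, so tune them by ear.

```shell
#!/bin/sh
# Hypothetical helper: print a SoX command that chains a high-pass and a
# low-pass filter so only a band of frequencies is kept.
bandlimit_command() {
    in=$1; out=$2; low_hz=$3; high_hz=$4
    printf 'sox %s %s highpass %s lowpass %s\n' "$in" "$out" "$low_hz" "$high_hz"
}

# Assumed cutoffs: attenuate below 300 Hz and above 3400 Hz.
bandlimit_command hold_message.aif hold_message_phone.aif 300 3400
```

Narrowing the gap between the two cutoffs makes the result sound thinner and more megaphone-like.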
2.18.3. Resample and Re-Level Sounds
The SoX bag of tricks has many compartments. Aside from EQ, effects, and format conversion, you can use SoX to downsample sounds, as in the Cacophony example earlier in this chapter [Hack #22]. SoX can also alter a sound's volume level (amplitude level).
To alter the sample rate, use the -r option and specify the desired sample rate in hertz. Of course, you can decrease (downsample) or increase the sample rate, but increasing the sample rate won't result in a higher-fidelity sound. This example takes a file called bytor.wav and downsamples it to 8 kHz:
$ sox bytor.wav -r 8000 bytor_8khz.wav
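Downsampling gives audio that telephone sound because a sample rate can represent frequencies only up to half its value (the Nyquist limit). A small sketch of the arithmetic follows. The function name nyquist_hz is made up for illustration, and the three rates are just common examples.

```shell
#!/bin/sh
# Highest frequency, in hertz, that a given sample rate can represent
# (half the sample rate, known as the Nyquist limit).
nyquist_hz() {
    echo $(( $1 / 2 ))
}

for rate in 44100 16000 8000; do
    echo "$rate Hz sample rate -> content up to $(nyquist_hz "$rate") Hz"
done
```

At 8 kHz, everything above 4 kHz is gone. Resampling back up cannot restore it, which is why raising the sample rate adds no fidelity.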
To alter a sound's amplitude, or volume level, use the -v option, which multiplies the amplitude by the factor you supply. This example increases the volume of the sound by 25% (a factor between 0 and 1 will decrease the volume):
$ sox -v 1.25 bytor.wav bytor_loud.wav
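Because SoX's -v option takes a multiplier rather than a percentage, it can help to do the conversion explicitly. This is a minimal sketch; the helper name volume_factor is hypothetical, and it relies on awk for the floating-point math.

```shell
#!/bin/sh
# Hypothetical helper: turn a percentage change in volume into the
# multiplier that SoX's -v option expects. (The -v passed to awk here
# just sets an awk variable; it is unrelated to SoX's -v.)
volume_factor() {
    awk -v pct="$1" 'BEGIN { printf "%.2f\n", 1 + pct / 100 }'
}

volume_factor 25     # 25% louder  -> prints 1.25
volume_factor -50    # 50% quieter -> prints 0.50
```

The result can then be supplied on the command line, as in sox -v "$(volume_factor 25)" followed by the input and output filenames.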