5.3 Audio Over Universal Serial Bus (USB)

5.3.1 Basic USB Principles

The universal serial bus is not the same as IEEE 1394, but it has some similar implications for desktop multimedia systems, including audio peripherals. USB has been jointly supported by a number of manufacturers including Microsoft, Digital, IBM, NEC, Intel and Compaq. It is a copper interface that, in its basic version, runs at a lower speed than 1394 (typically either 1.5 or 12 Mbit/s) and is designed to act as a low cost connection between computers and multiple input devices such as joysticks, keyboards, scanners and so on. The data rate is, however, high enough for it to be used for transferring limited audio information if required. A recent revision of the USB standard enables newer interfaces to operate at a high rate of up to 480 Mbit/s.

USB supports up to 127 devices for both isochronous and asynchronous communication and can carry data over distances of up to 5 m per hop (similar to 1394). A hub structure is required for multiple connections to the host connector. Like 1394 it is hot pluggable and reconfigures the addressing structure automatically. When new devices are connected to a USB setup the host device assigns each a unique address. Limited power is available over the interface, and some devices can be powered solely from this source. These are known as 'bus-powered' devices, which can be useful for field operation of, say, a simple A/D convertor with a laptop computer.

Data transmissions are grouped into frames of 1 ms duration in USB 1.0 but a 'micro-frame' of one-eighth of 1 ms was also defined in USB 2.0. A start-of-frame packet indicates the beginning of a cycle and the bus clock is normally at 1 kHz if such packets are transmitted every millisecond. So the USB frame rate is substantially slower than the typical audio sampling rate. The transport structure and different layers of the network protocol will not be described in detail as they are long and complex and can be found in the USB 2.0 specification [5]. However, it is important to be aware that transactions are set up between sources and destinations over so-called 'pipes' and that numerous 'interfaces' can be defined and run over a single USB cable, only dependent on the available bandwidth. Some salient features of the audio specification will be described.
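Because the frame rate is so much lower than the sampling rate, each 1 ms frame has to carry many samples, and at rates such as 44.1 kHz the number per frame cannot be constant. The following is a minimal sketch of that arithmetic. The function name and the policy of spreading samples as evenly as possible are assumptions made for illustration, not something taken from the USB specification.

```python
def samples_per_frame(sample_rate_hz, frames, frame_rate_hz=1000):
    """Return how many whole samples fall into each of `frames` consecutive
    USB frames, assuming samples are spread as evenly as possible.

    frame_rate_hz defaults to 1000 (1 ms frames); 8000 would model
    USB 2.0 micro-frames.
    """
    counts = []
    sent = 0
    for n in range(1, frames + 1):
        # Total number of samples that should have been sent by the end
        # of frame n, rounded down to a whole sample.
        due = sample_rate_hz * n // frame_rate_hz
        counts.append(due - sent)
        sent = due
    return counts


if __name__ == "__main__":
    print(samples_per_frame(44100, 10))  # nine frames of 44, then one of 45
    print(samples_per_frame(48000, 3))   # 48 samples in every frame
```

At 48 kHz every frame carries 48 samples, whereas at 44.1 kHz this policy gives 44 samples in nine frames out of ten and 45 in the tenth, so that 441 samples are carried every 10 ms.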

5.3.2 Audio Over USB

The way in which audio is handled on USB is well defined and somewhat more clearly explained than the 1394 audio/music protocol [6]. It defines three types of communication: audio control, audio streaming and MIDI streaming. We are concerned primarily with audio streaming applications.

Audio data transmissions fall into one of three types. Type 1 transmissions consist of channel-ordered PCM samples in consecutive subframes, whilst Type 2 transmissions typically contain non-PCM audio data that does not preserve a particular channel order in the bitstream, such as certain types of multichannel data-reduced audio stream. Type 3 transmissions are a hybrid of the two such that non-PCM data is packed into pseudo-stereo data words in order that clock recovery can be made easier. This method is in fact very much the same as the way data-reduced audio is packed into audio subframes within the IEC 61937 format described in Chapter 4, and follows much the same rules.

Audio samples are transferred in subframes, each of which can be one to four bytes long (up to 24 bits resolution). An audio frame consists of one or more subframes, each of which represents a sample of a different channel in the cluster (see below). As with 1394, a USB packet can contain a number of frames in succession, each containing a cluster of subframes. Frames are described by a format descriptor header that contains a number of bytes describing the audio data type, number of channels and subframe size, as well as information about the sampling frequency and the way it is controlled (for Type 1 data). An example of a simple audio frame would be one containing only two subframes of 24-bit resolution for stereo audio.
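The subframe, frame and packet nesting described above can be sketched as follows for the stereo 24-bit example. The function names are hypothetical, and the little-endian byte order is an assumption made here (it matches USB's general convention but should be checked against the audio data format specification).

```python
def pack_subframe(sample, subframe_bytes=3):
    """Pack one signed PCM sample into a two's complement subframe.

    Little-endian byte order is an assumption of this sketch.
    Raises OverflowError if the sample does not fit the subframe.
    """
    if not 1 <= subframe_bytes <= 4:
        raise ValueError("subframe size must be one to four bytes")
    return sample.to_bytes(subframe_bytes, byteorder="little", signed=True)


def pack_audio_frame(channel_samples, subframe_bytes=3):
    """One audio frame: one subframe per channel, in channel order."""
    return b"".join(pack_subframe(s, subframe_bytes) for s in channel_samples)


def pack_packet(frames, subframe_bytes=3):
    """One packet payload: a number of audio frames in succession."""
    return b"".join(pack_audio_frame(f, subframe_bytes) for f in frames)


if __name__ == "__main__":
    # A single stereo frame: left = 1, right = -1, 24-bit subframes.
    print(pack_audio_frame([1, -1]).hex())
    # 48 stereo frames (1 ms of audio at 48 kHz) occupy 48 * 2 * 3 bytes.
    print(len(pack_packet([[0, 0]] * 48)))
```

With three-byte subframes a stereo frame occupies six bytes, so 1 ms of 48 kHz stereo audio needs a 288-byte payload in each USB frame.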

Audio of a number of different types can be transferred in Type 1 transmissions, including PCM audio (two's complement, fixed point), PCM-8 format (compatible with original eight-bit WAV, unsigned, fixed point), IEEE floating point, A-law and μ-law (companded audio corresponding to relatively old telephony standards). Type 2 transmissions typically contain data-reduced audio signals such as MPEG or AC-3 streams. Here the data stream contains an encoded representation of a number of channels of audio, formed into encoded audio frames that relate to a large number of original audio samples. An MPEG encoded frame, for example, will typically be longer than a USB packet (a typical MPEG frame might be 8 or 24 ms long), so it is broken up into smaller packets for transmission over USB rather like the way it is streamed over the IEC 60958 interface described in Chapter 4. The primary rule is that no USB packet should contain data for more than one encoded audio frame, so a new encoded frame should always be started in a new packet. The format descriptor for Type 2 is similar to Type 1 except that it replaces the subframe size and number of channels indications with the maximum bit rate and the number of audio samples per encoded frame. Currently only MPEG and AC-3 audio are defined for Type 2.
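The packetization rule for Type 2 data can be illustrated with a short sketch. The function name and the fixed maximum payload size are assumptions for illustration only; a real implementation would also have to respect the timing and bandwidth constraints of the bus, which are ignored here.

```python
def packetise_encoded_frames(encoded_frames, max_packet_bytes):
    """Split encoded audio frames into packet payloads.

    Each encoded frame is cut into chunks of at most max_packet_bytes.
    A new encoded frame always starts in a new packet, so no packet
    ever holds data from more than one encoded frame.
    """
    if max_packet_bytes <= 0:
        raise ValueError("max_packet_bytes must be positive")
    packets = []
    for frame in encoded_frames:
        for offset in range(0, len(frame), max_packet_bytes):
            packets.append(frame[offset:offset + max_packet_bytes])
    return packets


if __name__ == "__main__":
    frames = [b"A" * 5, b"B" * 3]  # two dummy encoded frames
    for payload in packetise_encoded_frames(frames, max_packet_bytes=4):
        print(payload)
```

In this example the last packet of the first frame is left short (one byte) rather than being topped up with the start of the second frame, which is the behaviour the rule requires.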

Rather like the compound data blocks possible in 1394 (see above), audio data for closely related synchronous channels can be clustered for USB transmission in Type 1 format. Up to 254 streams can be clustered and there are 12 defined spatial positions for reproduction, to simplify the relationship between channels and the loudspeaker locations to which they relate. (This is something of a simplification of the potentially complicated formatting of spatial audio signals and assumes that channels are tied to loudspeaker locations, but it is potentially useful.) The first six defined streams follow the internationally standardized order of surround sound channels for 5.1 surround, that is left, right, centre, LFE (low frequency effects), left surround, right surround. Subsequent streams are allocated to other loudspeaker locations around a notional listener. Not all the spatial location streams have to be present but they are supposed to be presented in the defined order. Clusters are defined in a descriptor field that includes 'bNrChannels' (specifying how many logical audio channels are present in the cluster) and 'wChannelConfig' (a bit field that indicates which spatial locations are present in the cluster). If the relevant bit is set then the relevant location is present in the cluster. The bit allocations are shown in Table 5.1.

Table 5.1: Channel identification in USB audio cluster descriptor

Data bit    Spatial location
D0          Left Front (L)
D1          Right Front (R)
D2          Center Front (C)
D3          Low Frequency Enhancement (LFE)
D4          Left Surround (LS)
D5          Right Surround (RS)
D6          Left of Center (LC)
D7          Right of Center (RC)
D8          Surround (S)
D9          Side Left (SL)
D10         Side Right (SR)
D11         Top (T)
D15..12     Reserved
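
As an illustration of how the bit field in Table 5.1 might be interpreted, the following sketch lists the spatial locations present in a cluster, in the defined order. The function and constant names are hypothetical; only the bit allocations come from the table, and the reserved bits D15..12 are simply ignored here.

```python
# Spatial locations in bit order D0..D11, as listed in Table 5.1.
SPATIAL_LOCATIONS = [
    "L", "R", "C", "LFE", "LS", "RS",
    "LC", "RC", "S", "SL", "SR", "T",
]


def decode_channel_config(w_channel_config):
    """Return the spatial locations flagged in a 16-bit wChannelConfig value.

    Locations are returned in the defined order (D0 first).
    Reserved bits D15..12 are ignored by this sketch.
    """
    if not 0 <= w_channel_config <= 0xFFFF:
        raise ValueError("wChannelConfig is a 16-bit field")
    return [
        name
        for bit, name in enumerate(SPATIAL_LOCATIONS)
        if w_channel_config & (1 << bit)
    ]


if __name__ == "__main__":
    print(decode_channel_config(0x0003))  # stereo: L, R
    print(decode_channel_config(0x003F))  # 5.1: L, R, C, LFE, LS, RS
```

A value of 0x003F therefore describes a 5.1 cluster, with bits D0 to D5 set and the six channels in the standardized order described above.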

5.3.3 Clock Synchronization

Audio devices transferring signals over USB can have sample clocks that are either asynchronous with the USB data transfer, or that are locked in some way to the USB start-of-frame (SOF) identifier (that occurs every 1 ms). Asynchronous devices would typically use free-running or externally synchronized audio clocks, whereas synchronous devices would either have a means of locking their sample clocks to the 1 ms SOF point or (perhaps unusually) have a means of controlling the USB clock rate so that it became locked to the audio sampling frequency. It is up to host applications to ensure that groups of audio channels that belong together and are supposed to be sample-aligned are kept so through any buffering that is employed. Buffering of at least one frame is normally required at the receiver and the management and reporting of delays is an inherent feature of the recommendations.



Digital Interface Handbook, Third Edition
ISBN: 0240519094
EAN: 2147483647
Year: 2004
Pages: 120
