In addition to normal end systems, RTP supports middle boxes that can operate on a media stream within a session. Two classes of middle boxes are defined: translators and mixers.

Translators

A translator is an intermediate system that operates on RTP data while maintaining the synchronization source and timeline of a stream. Examples include systems that convert between media-encoding formats without mixing, that bridge between different transport protocols, that add or remove encryption, or that filter media streams. A translator is invisible to the RTP end systems unless those systems have prior knowledge of the untranslated media. There are a few classes of translators:
The defining characteristic of a translator is that each input stream produces a single output stream, with the same SSRC. The translator itself is not a participant in the RTP session (it does not have an SSRC and does not generate RTCP itself) and is invisible to the other participants.

Mixers

A mixer is an intermediate system that receives RTP packets from a group of sources and combines them into a single output, possibly changing the encoding, before forwarding the result. Examples include the networked equivalent of an audio mixing deck, or a video picture-in-picture device.

Because the timing of the input streams generally will not be synchronized, the mixer will have to make its own adjustments to synchronize the media before combining them, and hence it becomes the synchronization source of the output media stream. A mixer may use playout buffers for each arriving media stream to help maintain the timing relationships between streams. A mixer has its own SSRC, which is inserted into the data packets it generates. The SSRC identifiers from the input data packets are copied into the CSRC list of the output packet.

A mixer has a unique view of the session: it sees all sources as synchronization sources, whereas the other participants see some synchronization sources and some contributing sources. In Figure 4.5, for example, participant X receives data from three synchronization sources (Y, Z, and M), with A and B contributing sources in the mixed packets coming from M. Participant A sees B and M as synchronization sources, with X, Y, and Z contributing to M. The mixer generates RTCP sender and receiver reports separately for each half of the session, and it does not forward them between the two halves. It forwards RTCP source description and BYE packets so that all participants can be identified (RTCP is discussed in Chapter 5, RTP Control Protocol).

Figure 4.5. Mixer M Sees All Sources as Synchronization Sources; Other Participants (A, B, X, Y, and Z) See a Combination of Synchronization and Contributing Sources.
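The way a mixer labels its output can be made concrete with a short sketch. The Python below packs the 12-byte RTP fixed header followed by a CSRC list, placing the mixer's own SSRC in the SSRC field and the SSRCs of the mixed inputs in the CSRC list. The function names `build_mixed_header` and `parse_sources`, their parameters, and the choice to de-duplicate and truncate the input list are assumptions made for illustration. The header layout follows the standard RTP format, in which the 4-bit CC field limits the CSRC list to 15 entries.

```python
import struct

RTP_VERSION = 2
MAX_CSRC = 15  # the CC header field is 4 bits wide, so at most 15 CSRCs fit


def build_mixed_header(mixer_ssrc, input_ssrcs, seq, timestamp,
                       payload_type, marker=False):
    """Pack an RTP header for a mixer's output packet (hypothetical helper).

    The mixer's own SSRC goes in the SSRC field; the SSRCs of the input
    streams that were mixed are copied into the CSRC list.
    """
    # Illustrative policy: drop duplicates, keep order, truncate to 15.
    csrcs = list(dict.fromkeys(input_ssrcs))[:MAX_CSRC]
    byte0 = (RTP_VERSION << 6) | len(csrcs)      # V=2, P=0, X=0, CC=count
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, mixer_ssrc & 0xFFFFFFFF)
    return header + b"".join(struct.pack("!I", c) for c in csrcs)


def parse_sources(packet):
    """Return (ssrc, [csrc, ...]) as a receiver would read them."""
    cc = packet[0] & 0x0F
    ssrc = struct.unpack_from("!I", packet, 8)[0]
    csrcs = list(struct.unpack_from("!%dI" % cc, packet, 12))
    return ssrc, csrcs
```

A receiver on the far side of the mixer that calls `parse_sources` on such a packet sees the mixer as the synchronization source and the original senders only as contributing sources, which matches the view of participant X in Figure 4.5.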
A mixer is not required to use the same SSRC for each half of the session, but it must send RTCP source description and BYE packets into both sessions for all SSRC identifiers it uses. Otherwise, participants in one half will not know that the SSRC is in use in the other half, and they may collide with it.

It is important to track which sources are present on each side of the translator or mixer, to detect when incorrect configuration has produced a loop (for example, if two translators or mixers are connected in parallel, forwarding packets in a circle). A translator or mixer should cease operation if a loop is detected, logging as much diagnostic information about the cause as possible. The source IP address of the looped packets is most helpful because it identifies the host that caused the loop.
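One simple way to track sources per side is sketched below. It is a deliberately reduced illustration, not the full loop-detection procedure of the RTP specification: the class `LoopDetector`, the side labels, and the logger name are all hypothetical. The sketch records the side on which each SSRC was first seen and flags any packet that later carries the same SSRC on the other side, logging the source IP address of the suspect packet. An SSRC appearing on both sides can also be an ordinary collision between independent participants, so a real middle box would need further checks before shutting down.

```python
import logging

log = logging.getLogger("rtp.middlebox")  # hypothetical logger name


class LoopDetector:
    """Track which side of a translator or mixer each SSRC was first seen on."""

    def __init__(self):
        self._first_seen = {}  # SSRC -> (side, source IP) of first packet

    def observe(self, ssrc, side, source_ip):
        """Record a packet; return True if it looks like a looped packet."""
        first_side, first_ip = self._first_seen.setdefault(
            ssrc, (side, source_ip))
        if first_side == side:
            return False
        # Same SSRC arriving from the opposite side: possible loop.
        log.error(
            "possible loop: SSRC %#010x first seen on side %s from %s, "
            "now arriving on side %s from %s",
            ssrc, first_side, first_ip, side, source_ip)
        return True
```

The caller decides what to do with a positive result; following the text above, the conservative response is to stop forwarding and keep the logged addresses for diagnosis.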