9.6 Time and Synchronization


Above the media coding layer is the media synchronization layer. In iTV, the time model can be viewed as having two components: methods by which one encodes and decodes time, and methods by which the various components of a program are related and integrated. Each iTV program has a single timeline whose encoding is based on a clock model, and component relationships are defined with respect to that clock rather than with respect to each other; a component's relationship to the clock is asynchronous, synchronized, or synchronous.

9.6.1 IP-based

RTP is the foundation for IP-based delivery of audio and video content [RTP]. Recent activity has focused on RTP payload formats that support media delivery in general, and delivery of digital video in particular (see RFC 3189). These payload formats constrain the values of fields in the RTP header. Typically, the following fields are constrained (see RFC 1889 for the RTP header description):

  • Payload Type: This 7-bit field is often constrained to specify the formats supported by a specific encapsulation format.

  • Time Stamp: In the context of MPEG, this 32-bit field is often set to a 90 kHz time stamp representing the time at which the first data in the frame was sampled.

  • Marker Bit: This 1-bit field is intended to be profile and encapsulation dependent.
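
The three fields listed above can be read from the 12-byte fixed RTP header with a few bit operations. The following is a minimal sketch, assuming the RFC 1889 header layout; the function name `parse_rtp_header` and the example packet values are hypothetical and chosen only for illustration.

```python
import struct

def parse_rtp_header(packet: bytes) -> dict:
    """Decode the 12-byte fixed RTP header (layout as described in RFC 1889)."""
    if len(packet) < 12:
        raise ValueError("the fixed RTP header is 12 bytes long")
    # Two flag bytes, 16-bit sequence number, 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "padding": (b0 >> 5) & 1,
        "extension": (b0 >> 4) & 1,
        "csrc_count": b0 & 0x0F,
        "marker": b1 >> 7,           # 1-bit, profile dependent
        "payload_type": b1 & 0x7F,   # 7-bit payload type
        "sequence_number": seq,
        "timestamp": timestamp,      # 32-bit media timestamp
        "ssrc": ssrc,
    }

# Hypothetical packet: version 2, marker set, payload type 33,
# timestamp of 90000 ticks (one second on a 90 kHz clock).
example = struct.pack("!BBHII", 0x80, 0x80 | 33, 1, 90000, 0x1234)
header = parse_rtp_header(example)
seconds = header["timestamp"] / 90_000
```

Dividing the timestamp by 90,000 converts it to seconds only when the payload format actually uses a 90 kHz clock, as the MPEG payload formats typically do; other payload formats may define a different clock rate.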

TCP is not appropriate for carrying real-time data because it provides delivery guarantees at the expense of timeliness [TCP]. Therefore, to preserve synchronization, RTP is carried over UDP (itself carried over IP), which provides the flexibility needed to meet real-time constraints [UDP]. Note, however, that although RTSP can be used in conjunction with RTP to control the stream delivered by RTP (i.e., to implement VCR-like controls), the carriage of RTP over UDP is independent of the carriage of RTSP over TCP; RTSP does not provide real-time delivery guarantees (see Section 9.1.6 on RTSP).

9.6.2 MPEG-based

MPEG's time model is clock-driven and synchronous [MPEG2]. A 27 MHz clock is used as the receiver's reference clock. Each 42-bit sample of the 27 MHz clock, the Program Clock Reference (PCR), is partitioned into two parts defined by the MPEG-2 specification: the 33-bit program_clock_reference_base and the 9-bit program_clock_reference_extension. The former is equivalent to a sample of a 90 kHz clock locked to the 27 MHz clock, referred to as the System Time Clock (STC), and is used by the audio and video source encoders when encoding the PTS and the DTS.
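
The relationship between the two parts of the clock sample can be sketched as follows. This assumes the MPEG-2 Systems definition in which the extension counts from 0 to 299 and the full value equals base × 300 + extension; the function names are hypothetical.

```python
SYSTEM_CLOCK_HZ = 27_000_000   # 27 MHz reference clock
BASE_CLOCK_HZ = 90_000         # 27 MHz / 300

def split_pcr(ticks_27mhz: int):
    """Split a count of 27 MHz ticks into (33-bit base, 9-bit extension)."""
    base = (ticks_27mhz // 300) % (1 << 33)   # 90 kHz part, wraps at 2**33
    extension = ticks_27mhz % 300             # remaining 27 MHz ticks, 0..299
    return base, extension

def join_pcr(base: int, extension: int) -> int:
    """Rebuild the 27 MHz tick count from its two parts."""
    return base * 300 + extension

def pcr_seconds(base: int, extension: int) -> float:
    """Convert a PCR sample to seconds."""
    return join_pcr(base, extension) / SYSTEM_CLOCK_HZ

base, extension = split_pcr(27_000_301)   # one second plus 301 ticks
```

Because the base alone advances at 90 kHz, it shares its time scale with the PTS and DTS values, which is why those timestamps can be compared with it directly.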

The receiver's STC is synchronized to the emitter's PCR as follows. Periodically, about every 100 ms, the value of the emitter's PCR counter is sampled and inserted into the transport stream packets in the appropriate location. The receiver's clock is also based on a 27 MHz clock, divided by 300 to obtain a 90 kHz clock that is fed into its STC counter. The values of the emitter's PCR and the receiver's STC are compared. If a difference is found, it is used to adjust the receiver's clock as needed. When the difference is too large, or a discontinuity flag is set, the PCR value is copied into the STC counter (see Figure 9.22). Such situations typically occur at edit points (e.g., commercial insertion).
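
The decision logic of this feedback loop can be sketched as below. This is only an illustration of the compare, adjust, or reload steps: the function name, the jump threshold, and the gain are hypothetical values, and a real receiver would typically steer an oscillator in hardware rather than return a number.

```python
def recover_clock(stc: int, pcr: int, discontinuity: bool = False,
                  max_error: int = 2_700_000, gain: float = 0.25):
    """Return (new_stc, clock_trim) for one received PCR sample.

    stc and pcr are tick counts on the same clock. max_error and gain are
    hypothetical tuning values, not taken from any specification.
    """
    error = pcr - stc
    if discontinuity or abs(error) > max_error:
        # Too far apart, or the stream signaled a discontinuity:
        # load the PCR value straight into the STC counter.
        return pcr, 0.0
    # Otherwise keep counting and nudge the local clock toward the emitter's.
    return stc, gain * error

small_drift = recover_clock(1_000, 1_100)
edit_point = recover_clock(1_000, 50_000_000)
flagged = recover_clock(1_000, 1_100, discontinuity=True)
```

The first call models ordinary drift, where only a proportional correction is produced; the other two model an edit point, where the counter is simply reloaded.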

Figure 9.22. Simplified MPEG synchronization feedback loops.


As explained earlier, MPEG transport streams support the notion of a PTS and a DTS, both available in the PES header. Their presence in the PES header is signaled using the 2-bit PTS_DTS_flags field: if '10', only a PTS is provided; if '11', both a PTS and a DTS are provided; if '00', neither is provided; the value '01' is forbidden. The PTS and DTS apply to the content of the PES packet immediately following the header, namely the data access units or frames carried by the PES packet, regardless of the content of that packet. Further, there is no requirement that the data_alignment_indicator be nonzero when the PTS_DTS_flags field signals the presence of a PTS or a DTS. As a result, it is possible for a PTS or a DTS to be scoped to a fraction of a data access unit or frame, rendering PTS/DTS assignment for border frames ambiguous.
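
Reading these fields from a PES header can be sketched as follows. The sketch assumes the ISO/IEC 13818-1 layout, in which the presence flags are the top two bits of the eighth header byte and each 33-bit timestamp is packed into 5 bytes with a 4-bit prefix and interleaved marker bits. The function names and the example header are hypothetical.

```python
def decode_timestamp(field: bytes) -> int:
    """Unpack a 33-bit PTS or DTS from its 5-byte encoding, skipping marker bits."""
    return (((field[0] >> 1) & 0x07) << 30
            | field[1] << 22
            | (field[2] >> 1) << 15
            | field[3] << 7
            | field[4] >> 1)

def encode_timestamp(value: int, prefix: int) -> bytes:
    """Pack a 33-bit timestamp (helper used only to build the example)."""
    return bytes([
        (prefix << 4) | (((value >> 30) & 0x07) << 1) | 1,
        (value >> 22) & 0xFF,
        (((value >> 15) & 0x7F) << 1) | 1,
        (value >> 7) & 0xFF,
        ((value & 0x7F) << 1) | 1,
    ])

def parse_pes_timestamps(pes: bytes) -> dict:
    """Return the alignment flag and any PTS/DTS carried in a PES header."""
    if pes[:3] != b"\x00\x00\x01":
        raise ValueError("missing PES start code prefix")
    flags = pes[7] >> 6                      # two presence bits
    if flags == 0b01:
        raise ValueError("flag value '01' is forbidden")
    result = {"aligned": bool(pes[6] & 0x04), "pts": None, "dts": None}
    if flags & 0b10:
        result["pts"] = decode_timestamp(pes[9:14])
    if flags == 0b11:
        result["dts"] = decode_timestamp(pes[14:19])
    return result

# Hypothetical video PES header with both timestamps and no alignment flag.
example = (b"\x00\x00\x01\xE0" + b"\x00\x00" + bytes([0x80, 0xC0, 0x0A])
           + encode_timestamp(183_000, 0b0011)
           + encode_timestamp(180_000, 0b0001))
parsed = parse_pes_timestamps(example)
```

Note that `aligned` is false in the example even though both timestamps are present, which is exactly the case in which the start of the payload need not coincide with the start of a frame.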

The need to distinguish the two originates from the video compression techniques used by MPEG. Specifically, due to data dependencies, video frames are typically transmitted and decoded such that an I-frame is followed by a P-frame, followed by two B-frames. Rendering these frames, however, may require a different order (see Section 9.5.4.1), in which the I-frame is rendered first, the two B-frames are rendered next, and the P-frame is rendered last. To facilitate lower-cost decoders that do not perform dependency analysis, the order of decoding is simply specified using the DTS, while the order of presentation is specified by the PTS.

In contrast to video, audio frames are decoded and rendered in the same order. Therefore, there is no need to distinguish between the DTS and the PTS. Nevertheless, in practice, one may observe anomalies because audio frames can cross a PES packet boundary, rendering their DTS assignment ambiguous.

Table 9.6. The Difference Between the Order of Decoding and Rendering

Frame Type    Dependency    Decoding/DTS Order    Presentation/PTS Order
I             none          1                     1
P             I             2                     4
B             I, P          3                     2
B             I, P          4                     3
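
The two orderings in Table 9.6 amount to sorting the same frames by different timestamps. A minimal sketch, using the order numbers from the table in place of real DTS and PTS values (the function names are hypothetical):

```python
# (frame type, DTS order, PTS order), taken from Table 9.6
FRAMES = [("I", 1, 1), ("P", 2, 4), ("B", 3, 2), ("B", 4, 3)]

def decoding_order(frames):
    """Frame types in the order the decoder processes them (by DTS)."""
    return [kind for kind, _dts, _pts in sorted(frames, key=lambda f: f[1])]

def presentation_order(frames):
    """Frame types in the order they are displayed (by PTS)."""
    return [kind for kind, _dts, _pts in sorted(frames, key=lambda f: f[2])]

decoded = decoding_order(FRAMES)        # I, P, B, B
displayed = presentation_order(FRAMES)  # I, B, B, P
```

A decoder following this scheme never needs to inspect frame dependencies; it only compares timestamps against its clock.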

9.6.3 Asynchronous Streaming

An asynchronous media stream carries no information about the time at which it needs to be displayed with respect to the MPEG-2 clock. Data is emitted by servers and processed by receivers on a best-effort basis, without a requirement that transmission or display be complete at specific times. There is, however, a buffer-model requirement that data does not pile up at the receiver, which means that on average (over some period of time) the data is processed roughly as fast as it is transmitted. Many of the scenarios described in Chapter 2 can be effectively implemented using asynchronous timing models, provided that the lag between transmission and display is known and sufficiently controlled (e.g., 1 ± 0.1 seconds).
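
The buffer-model requirement can be illustrated with a toy occupancy calculation. This is a sketch under simplifying assumptions, not a model from any specification: time is divided into equal intervals, the receiver processes a fixed amount per interval, and the function name and numbers are hypothetical.

```python
def backlog_profile(arrivals, capacity_per_interval):
    """Return (peak_backlog, final_backlog) for a sequence of arrivals.

    arrivals lists the amount of data received in each interval;
    capacity_per_interval is the amount the receiver can process per interval.
    Units are arbitrary but must match.
    """
    backlog = 0
    peak = 0
    for amount in arrivals:
        # Data not processed in this interval carries over to the next one.
        backlog = max(0, backlog + amount - capacity_per_interval)
        peak = max(peak, backlog)
    return peak, backlog

# A burst in the second interval is absorbed because the average
# arrival rate (12.5 per interval) stays below the capacity (15).
peak, final = backlog_profile([10, 30, 5, 5], capacity_per_interval=15)
```

A final backlog of zero with a bounded peak corresponds to the requirement that data is, on average, processed as fast as it arrives.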

9.6.4 Synchronized Streaming

A synchronized stream carries information regarding the time at which its content is to be displayed. Synchronization accuracy is usually expected to be frame-level, meaning that one can specify exactly which video frame an item is displayed with; at 30 frames per second, this translates to an accuracy of about 33 ms. Video and audio streams are synchronized and contain PTS and DTS.
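
The frame-level figure follows from simple arithmetic on the 90 kHz timestamp clock. A small sketch, assuming exactly 30 frames per second and hypothetical function names:

```python
PTS_CLOCK_HZ = 90_000

def ticks_per_frame(fps: int = 30) -> int:
    """Number of 90 kHz ticks spanned by one video frame."""
    return PTS_CLOCK_HZ // fps

def frame_index(pts: int, first_pts: int = 0, fps: int = 30) -> int:
    """Index of the video frame on screen at the given PTS."""
    return (pts - first_pts) * fps // PTS_CLOCK_HZ

frame_ms = 1000 / 30            # about 33.3 ms per frame
ticks = ticks_per_frame()       # 3000 ticks per frame
```

Any two timestamps that fall within the same 3000-tick window map to the same frame, which is the sense in which synchronization is accurate to one frame.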

The need for synchronizing data with video stems from the need to display information at specific times when it is relevant. In the horse-racing example, one would expect that all receivers stop receiving bets at the same time. In the interactive game show example, it is critical that viewers are prompted and their responses are collected at the right time.

Data streams can be synchronized only with respect to the PCR, not with respect to any other component of the MPEG-2 program (e.g., video or audio). This model relies on the notion of a Data Access Unit (DAU), e.g., an HTML file, which is to be decoded and displayed. DAUs may be delivered at variable bit rates and may vary in size. Their transmissions may be separated by unspecified time periods. The challenge is, therefore, to predict, or at least bound, the time it takes to decode and render the content of DAUs that may have arbitrary complexity.

9.6.5 Synchronous Streaming

Synchronous data represents the regular delivery of bits at a constant average bit rate. For example, a synchronous stream may deliver bits such that each second 192 Kbits are transmitted, received, decoded, and rendered. Like synchronized data, synchronous MPEG-2 program elements require the use of a PCR and PTSs. However, unlike synchronized data, synchronous data streams carry a small, fixed-length DAU of 2 bytes, which significantly simplifies their implementation: it is driven only by the PCR and the bit rate.
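
Because the DAU size and the bit rate are both fixed, the timing of every DAU follows from a single division. The sketch below assumes that 192 Kbits means 192,000 bits per second and that time is measured in ticks of the 27 MHz clock; the function names are hypothetical.

```python
SYSTEM_CLOCK_HZ = 27_000_000
DAU_BITS = 2 * 8                      # fixed 2-byte data access unit

def daus_per_second(bitrate_bps: int) -> int:
    """How many 2-byte DAUs are delivered each second."""
    return bitrate_bps // DAU_BITS

def dau_time_ticks(index: int, bitrate_bps: int) -> int:
    """Offset of DAU number `index` from the stream start, in 27 MHz ticks."""
    return index * DAU_BITS * SYSTEM_CLOCK_HZ // bitrate_bps

rate = daus_per_second(192_000)       # 12000 DAUs per second
spacing = dau_time_ticks(1, 192_000)  # 2250 ticks between consecutive DAUs
```

With these assumptions the receiver needs no per-DAU timestamp: given a starting point on the PCR time base, the position of each DAU is determined by its index alone.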

Synchronous streams are useful in scenarios where the data displayed on the screen complements the video but is not directly related to it. For example, synchronous data streams may be useful for stock tickers, weather information, and headline news banners.



ITV Handbook. Technologies and Standards
ISBN: 0131003127
Year: 2003
Pages: 170
