9.7 Synchronizing Data with Video


The key technical challenge in synchronizing data with video is the lack of guarantees on data transmission and decoding times, which are needed to achieve frame-accurate synchronization of the data. For example, a problem occurs when complex data objects with long decoding times are combined with time-line (PCR) discontinuities. In this circumstance, the meaning of the PTS value in the encapsulation can be ambiguous.

The ATSC A93 Synchronized/Asynchronous Trigger Standard was designed to address this situation, as well as other complications [A93]. It allows an arbitrarily complex DAU to be activated by an arbitrarily simple receiver to achieve tight synchronization in the context of a discontinuous time-line. The remainder of this section describes ATSC A93; at the time of writing this book, I am not aware of a parallel DVB or ISO standard.

A key property of the ATSC trigger design (see Figure 9.23), which enables synchronizing data with video at frame-level accuracy, is its ability to decouple the asynchronous delivery (and decoding) of DAUs from their (synchronized or asynchronous) activation. Synchronized triggers carry a PTS that indicates the point along the content time-line at which the DAU is to be presented. The trigger and the DAU it references are emitted, received, and decoded before the time at which the DAU is to be presented. Both the DAU and the trigger can be transmitted repeatedly to enable acquisition in the presence of random tuning; that is, viewers may tune to a trigger-carrying channel at random points in time.

Figure 9.23. Overall trigger structure.


Triggers are small data structures with bounded size and limited complexity. The key information in a trigger is the identity of the intended consuming application, the trigger's PTS, a reference to a DAU to be activated at the PTS, and, optionally, private application data. Each trigger fits into a single MPEG packet and can typically be decoded within the time required to receive the following packet. In contrast, a DAU can be of arbitrary complexity, for example, an XHTML page referring to a JavaTV Xlet. Note that MPEG terminology, however, regards both the preload data and the trigger as DAUs.

Figure 9.24 depicts a simplified trigger timing scenario. In this example, the STC is discontinuous at time t due to, for example, the insertion of an advertisement clip. The DAU transmission is repeated because a DSM-CC data carousel may be used to deliver the preload data that enhances the advertisement with interactivity. The PTS of the trigger indicates when to activate the referenced preload data. To remove ambiguity regarding the activation time, ATSC A93 requires that the packet carrying a synchronized trigger be transmitted after t and after the packet carrying the first PCR value of the new time-line. The PTS must also be sufficiently later than t (at least 33 ms) to enable the receiver to place the trigger in the buffer, decode it, and activate the preload data at the activation time, denoted t_pts.
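The 33 ms margin can be expressed in PTS units, since a PTS counts ticks of a 90 kHz clock. The following is a minimal sketch of such a check, not part of ATSC A93 itself; the function names are hypothetical, the discontinuity time t is assumed to be expressed in ticks of the new time-line, and counter wraparound is ignored for brevity.

```python
PTS_CLOCK_HZ = 90_000   # PTS values count ticks of a 90 kHz clock
MIN_DELAY_MS = 33       # minimum delay after the discontinuity, as stated in the text


def min_delay_ticks(delay_ms: int = MIN_DELAY_MS) -> int:
    """Convert a delay in milliseconds to 90 kHz ticks."""
    return delay_ms * PTS_CLOCK_HZ // 1000


def trigger_pts_is_safe(pts: int, new_timeline_start: int) -> bool:
    """Return True if the trigger PTS lies at least 33 ms after t.

    Assumption: both values are 90 kHz ticks on the new time-line and
    no wraparound occurs between them.
    """
    return pts - new_timeline_start >= min_delay_ticks()
```

Under these assumptions, 33 ms corresponds to 2970 ticks, so a trigger whose PTS is fewer than 2970 ticks after t would be rejected by an authoring tool that uses this check.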

Figure 9.24. Simplified ATSC trigger timing diagram.


The preload data is transmitted asynchronously, possibly using a data carousel. To ensure that the preload data arrives and is decoded sufficiently early, consideration must be given to the bandwidth allocated to the carrying carousel as well as to its repeat rate. Decoding should begin immediately following the acquisition of a preload DAU carrying the executable code of an application (e.g., an ECMAScript fragment or an XHTML page with JavaTV Xlets).
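To illustrate how carousel bandwidth and repeat rate bound the arrival time of preload data, the sketch below estimates the worst-case acquisition time of one module. It rests on simplifying assumptions that are not taken from the standard: the carousel is sent at a constant bit rate, each module appears exactly once per cycle, and packetization overhead is ignored. The function names are hypothetical.

```python
def carousel_cycle_seconds(total_bytes: int, bitrate_bps: int) -> float:
    """Time to transmit one full carousel cycle at a constant bit rate."""
    return total_bytes * 8 / bitrate_bps


def worst_case_acquisition_seconds(module_bytes: int,
                                   total_bytes: int,
                                   bitrate_bps: int) -> float:
    """Worst-case wait for a receiver that tunes in just after the module began.

    The receiver waits up to one full cycle for the module to start again,
    then needs the module's own transmission time to receive it completely.
    """
    module_seconds = module_bytes * 8 / bitrate_bps
    return carousel_cycle_seconds(total_bytes, bitrate_bps) + module_seconds
```

For example, a 1,000,000-byte carousel at 1 Mbit/s cycles every 8 seconds, so a 125,000-byte module could take up to about 9 seconds to acquire under these assumptions; the decoding time must then be added before the activation time.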

9.7.1 Activation

Trigger activation is the process of enabling the preload DAU referenced by a trigger. This process may cause the DAU to be rendered or a script it contains to be executed. For asynchronous triggers, activation occurs as soon as the trigger decoding is complete; decoding may start as soon as the trigger has been fully received. For synchronized triggers, activation occurs when the value of the 90 kHz portion of the receiver STC reaches the PTS carried in the trigger; as before, the trigger decoding starts as soon as it is fully received.
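The activation rule for the two trigger types can be sketched as follows. This is an illustrative sketch rather than receiver code from any standard: the function names are hypothetical, and the comparison assumes the usual MPEG-2 conventions of a 33-bit PTS and a 27 MHz STC whose 90 kHz portion is the count divided by 300. Because the 33-bit counter wraps, the sketch treats a difference of less than half the counter range as "the PTS has been reached".

```python
from typing import Optional

PTS_MODULUS = 1 << 33   # PTS and the 90 kHz STC base are 33-bit counters


def stc_base(stc_27mhz: int) -> int:
    """Return the 90 kHz portion of a 27 MHz STC count."""
    return (stc_27mhz // 300) % PTS_MODULUS


def is_due(pts: int, stc_90khz: int) -> bool:
    """True once the STC has reached or passed the PTS, allowing for wraparound."""
    diff = (stc_90khz - pts) % PTS_MODULUS
    return diff < PTS_MODULUS // 2


def should_activate(synchronized: bool, decoded: bool,
                    pts: Optional[int], stc_90khz: int) -> bool:
    """Decide whether a trigger should be activated now."""
    if not decoded:
        return False            # nothing can be activated before decoding completes
    if not synchronized:
        return True             # asynchronous: activate as soon as decoded
    if pts is None:
        raise ValueError("a synchronized trigger must carry a PTS")
    return is_due(pts, stc_90khz)
```

A receiver loop would call should_activate on each STC update for every decoded, pending trigger; whether it then actually presents the DAU remains a receiver policy decision.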

Receivers may have arbitrary behaviors at activation times, ranging from completely ignoring the trigger to unconditional presentation of the DAU at any cost. Intelligent resource allocation may prove to have a critical impact on the viewer's experience in many situations.

Triggers may be transmitted repeatedly to ensure capture through random tuning. On repeated transmission, however, it is critical that the identities of the triggers be generated so as to guarantee their uniqueness. In particular, duplication of the same PTS value over several triggers on the same MPEG-2 program element is not allowed.

9.7.2 Ramifications for Content Authoring

At authoring time, the activation time, namely the instant at which data is to be synchronized with a video, audio, or data elementary stream, is captured relative to a story time-line (e.g., SMPTE 12M time code). Trigger instances need to be generated before any simulated or real emission. When generating a trigger and its associated set of preload data, the authoring system may need to compute an arrival time relative to the trigger activation time. Furthermore, trigger generation requires converting trigger activation times and target acquisition times from storyboard time-line units to STC units.
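The conversion from storyboard units to STC units can be sketched as below. This is a simplified illustration, not a procedure defined by ATSC A93: it assumes a non-drop-frame HH:MM:SS:FF time code at an integer frame rate, and a hypothetical stc_origin parameter giving the 90 kHz STC value that corresponds to time code 00:00:00:00. Fractional rates such as 30000/1001 would need additional handling.

```python
PTS_CLOCK_HZ = 90_000
PTS_MODULUS = 1 << 33   # PTS values are 33-bit counters


def timecode_to_ticks(timecode: str, fps: int = 30, stc_origin: int = 0) -> int:
    """Convert a non-drop-frame HH:MM:SS:FF time code to a 90 kHz tick value.

    Assumptions: integer frame rate, and stc_origin is the tick value that
    corresponds to time code 00:00:00:00.
    """
    hh, mm, ss, ff = (int(part) for part in timecode.split(":"))
    if not 0 <= ff < fps:
        raise ValueError(f"frame number {ff} is out of range for {fps} fps")
    frames = (hh * 3600 + mm * 60 + ss) * fps + ff
    ticks = frames * PTS_CLOCK_HZ // fps
    return (stc_origin + ticks) % PTS_MODULUS
```

At 30 frames per second each frame spans 3000 ticks, so an activation point authored at 00:00:01:15 maps to 135,000 ticks after the origin.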

The duration between the arrival time of a preload DAU and the activation time should be greater than or equal to the decoding time needed by any receiver to decode the preload data. This duration may be derived by the authoring system from a set of meta-data associated with the preload data to be synchronized. Since the preload data could be delivered asynchronously, for example via a data or object carousel, the authoring system needs to compute a maximum target acquisition period during which the receiver should not purge the preload data from the cache. This maximum target acquisition period, specified via acquisition descriptors, is the period between the earliest target acquisition time and the latest activation time for a specific target, taken over all emitted triggers referring to that target.
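The two quantities described above can be sketched as simple calculations. This is a minimal illustration with hypothetical function names; all times are assumed to be 90 kHz ticks on a single continuous time-line, so discontinuities and counter wraparound are not handled.

```python
from typing import Iterable


def decode_margin_ok(arrival: int, activation: int, worst_decode_ticks: int) -> bool:
    """True if the gap between arrival and activation covers the slowest decode."""
    return activation - arrival >= worst_decode_ticks


def max_acquisition_period(earliest_acquisition: int,
                           activations: Iterable[int]) -> int:
    """Period from the earliest target acquisition time to the latest activation.

    'activations' holds the activation times of all emitted triggers that
    refer to the same target.
    """
    return max(activations) - earliest_acquisition
```

For example, a target first acquirable at tick 1000 and referenced by triggers activating at ticks 5000, 9000, and 7000 would need to be retained for 8000 ticks under these assumptions.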

The calculation of time delays needs to be robust in order to avoid synchronization errors introduced by re-multiplexers that do not parse the trigger content to analyze activation times. Unpredictable behavior may occur if the preload data is not available (or accessible or decoded) at the activation time.

9.7.3 Signaling

MPEG-2 transports that contain triggers need to include one or more taps; different extensions of the MPEG-2 standard may define different specific locations. These taps, called the trigger taps, refer to the (asynchronous or synchronized) trigger streams through matching the association_tag in the PMT.

The presence of an MPEG program element carrying triggers (i.e., using a distinct PID) needs to be signaled by means of a content type descriptor specifying the appropriate MIME type (application/atsc-trigger). To signal to a receiver where to find and how to acquire a preload DAU independently of the acquisition of any triggers referring to it, a reference to that preload DAU is signaled by means of an acquisition descriptor. Receivers should acquire signaled targets as soon as they are received. A late binding mechanism is provided to support situations in which the trigger refers to a target that is not a data module, for example, a URI whose binding is unknown at the time the descriptor is received. The location of the acquisition descriptor affects its scope as follows:

  • Placing the acquisition descriptor in the DSI groupInfoByte descriptor loop renders the descriptor applicable to an entire group within the data carousel.

  • Placing the acquisition descriptor in the DII moduleInfoByte descriptor loop renders it applicable to that module regardless of what applications reference it.

When an acquisition descriptor occurs in multiple locations that could apply, the descriptor with the narrowest scope is applicable; the other descriptors are ignored. For example, when the acquisition descriptor appears both in a DII moduleInfoByte descriptor loop and in the DSI groupInfoByte descriptor loop, the value of the max_age field in the DII descriptor loop is used (as it is the most specific), and the other max_age field values are ignored.
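The narrowest-scope rule can be sketched as a simple precedence lookup. The function and parameter names are hypothetical, and the descriptor parsing itself is assumed to have happened elsewhere; None stands for "no acquisition descriptor present at that location".

```python
from typing import Optional


def effective_max_age(dii_max_age: Optional[int],
                      dsi_max_age: Optional[int]) -> Optional[int]:
    """Pick the max_age that applies to a module.

    The module-level value (DII moduleInfoByte loop) is narrower in scope
    than the group-level value (DSI groupInfoByte loop), so it wins
    whenever it is present.
    """
    if dii_max_age is not None:
        return dii_max_age
    return dsi_max_age
```

The explicit comparison with None matters: a module-level max_age of 0 is still a present value and would override the group-level one in this sketch.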



ITV Handbook. Technologies and Standards
ISBN: 0131003127
Year: 2003
Pages: 170
