5. Inter-Stream Synchronization

A multimedia presentation consists of parallel and sequential presentations of streams. Inter-stream synchronization is usually classified as either fine-grained or coarse-grained. Fine-grained synchronization concerns the synchronization of parallel streams at fine granules. Coarse-grained synchronization, on the other hand, concerns how streams are connected to each other and when they start and end. Coarse-grained synchronization is important for supporting consistent presentations. Because data is delayed and lost over networks, synchronization models that support flexible presentations are necessary. In flexible synchronization models, the relationships among streams are declared, rather than fixed start and end times for each stream. In this section, we explain a flexible synchronization model based on ECA (Event-Condition-Action) rules [15] that manages coarse-grained synchronization in NetMedia.

The synchronization specification lays out media objects in a presentation. The synchronization model should have the following properties:

It should support composition of media objects in various ways.
It should support VCR-type user interactions, such as skip or backward, which are helpful in browsing multimedia presentations.
The user interactions should not complicate specifications.
It should support both time-based and event-based operations.

It has been shown that event-based models are more robust and flexible for multimedia presentations. The bottleneck of event-based models is that they become inapplicable when the course of the presentation changes (for example, on a backward or skip operation). Most previous models are based on event-action relationships. However, the condition of the presentation and of the participating streams also influences the actions to be executed. Thus ECA rules, which have been successfully employed in active database systems, are applied to multimedia presentations. Since these rules are used for synchronization, they are termed synchronization rules. Synchronization rules are used in both the synchronization and the organization of streams. Since the structure of a synchronization rule is simple, the rules can be manipulated easily in the presence of user interactions. The synchronization model uses the Receiver-Controller-Actor (RCA) scheme to execute the rules. In the RCA scheme, receivers, controllers, and actors are objects that receive events, check conditions, and execute actions, respectively. The synchronization rules can easily be regenerated from SMIL expressions [3].

Synchronization rules form the basis of the management of relationships among the streams. A synchronization rule is composed of an event expression, a condition expression, and an action expression, and can be formulated as:

on event expression if condition expression do action expression

A synchronization rule can be read as: when the event expression is satisfied, if the condition expression is valid, then the actions in the action expression are executed. The event expression allows events to be composed with the Boolean operators && and ||. The condition expression allows conditions to be composed with && and ||. The action expression is the list of actions to be executed when the condition expression is satisfied.
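The on/if/do structure above can be illustrated with a minimal Python sketch. It is not the NetMedia implementation: the names SyncRule, ev, and_ and or_ are hypothetical, events are assumed to be plain strings, and the presentation state is assumed to be a dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

# An event expression is modelled as a predicate over the set of events
# received so far (an assumption made for this sketch).
EventExpr = Callable[[Set[str]], bool]


def ev(name: str) -> EventExpr:
    """Atomic event expression: satisfied once `name` has been received."""
    return lambda received: name in received


def and_(a: EventExpr, b: EventExpr) -> EventExpr:
    """Composition with &&: both sub-expressions must be satisfied."""
    return lambda received: a(received) and b(received)


def or_(a: EventExpr, b: EventExpr) -> EventExpr:
    """Composition with ||: either sub-expression may be satisfied."""
    return lambda received: a(received) or b(received)


@dataclass
class SyncRule:
    event_expr: EventExpr                    # the "on" part
    condition: Callable[[Dict], bool]        # the "if" part
    actions: List[str]                       # the "do" part

    def fire(self, received: Set[str], state: Dict) -> List[str]:
        """Return the actions to execute, or an empty list if the rule does not apply."""
        if self.event_expr(received) and self.condition(state):
            return list(self.actions)
        return []


# on V1(EndPoint) && A1(EndPoint) if direction=FORWARD do start(V2)
rule = SyncRule(
    event_expr=and_(ev("V1(EndPoint)"), ev("A1(EndPoint)")),
    condition=lambda s: s["direction"] == "FORWARD",
    actions=["start(V2)"],
)
```

Calling rule.fire with only one of the two events received yields no actions; once both events are present and the direction is forward, the action list is returned.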

5.1 Events, Conditions and Actions for a Presentation

In a multimedia system, the events may be triggered by a media stream, user or the system. Each media stream is associated with events along with its data and it knows when to signal events. When events are received, the corresponding conditions are checked. If a condition is satisfied, the corresponding actions are executed.

The goal in inter-stream synchronization is to determine when to start and end streams. The start and end of streams depend on multimedia events. The hierarchy of multimedia events is given in [3]. The user has to specify information related to the stream events. Allen [1] specifies 13 temporal relationships. The relationships meets, starts and equals require the InitPoint event for a stream. The relationships finishes and equals require the EndPoint event for a stream. The relationships overlaps and during require a realization event to start (or end) another stream in the middle of a stream. The relationships before and after require temporal events, since the gap between two streams can only be determined by time. Temporal events may be absolute with respect to a specific point in a presentation (e.g., the beginning of a presentation). Temporal events may also be relative with respect to another event.

Definition 1. An event is represented with source(event_type[,event_data]) where source identifies the source of the event, event_type represents the type of the event and event_data contains information about the event.

The event source can be the user or a stream. The optional event data contains information such as a realization point. For a stream event, the event type indicates whether the event is InitPoint, EndPoint or realization. Each stream has a series of events. Users can also cause events such as start, pause, resume, forward, backward and skip. These events have two kinds of effects on the presentation: skip and backward change the course of the presentation, while the others only affect its duration.
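As an illustration of Definition 1, the following Python sketch models an event as a small immutable record. The class name Event, the use of the string "USER" as the user source, and the changes_course helper are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Optional

# Event types named in the text.
STREAM_EVENT_TYPES = {"InitPoint", "EndPoint", "realization"}
USER_EVENT_TYPES = {"start", "pause", "resume", "forward", "backward", "skip"}
# User events that change the course of the presentation.
COURSE_CHANGING = {"skip", "backward"}


@dataclass(frozen=True)
class Event:
    source: str                       # a stream name, or "USER" (assumed label)
    event_type: str
    event_data: Optional[str] = None  # e.g. a realization point

    def __str__(self) -> str:
        """Render in the source(event_type[,event_data]) notation."""
        if self.event_data is None:
            return f"{self.source}({self.event_type})"
        return f"{self.source}({self.event_type},{self.event_data})"

    def changes_course(self) -> bool:
        """True for user events that alter the course of the presentation."""
        return self.source == "USER" and self.event_type in COURSE_CHANGING
```

For instance, str(Event("V1", "EndPoint")) renders as V1(EndPoint), and a user skip event reports that it changes the course of the presentation while a pause does not.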

Definition 2. A condition in a synchronization rule is a 3-tuple

C=condition(t₁,θ,t₂)

where θ is a relation from the set {=,≠,<,≤,>,≥} and each t_i is either a state variable that determines the state of a stream or presentation, or a constant.

A condition indicates the status of the presentation and its media objects. The most important condition is whether the direction of the presentation is forward, because how the receipt of an event is handled depends on whether the direction is forward or backward. Other types of conditions include the states of the media objects.
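Definition 2 can be sketched in Python as a function that builds a predicate from the 3-tuple. This is only an illustration: the state is assumed to be a dictionary, and a term is treated as a state variable exactly when it is a key of that dictionary (otherwise it is taken as a constant). The names RELATIONS, resolve and condition are hypothetical.

```python
import operator

# The six relations allowed for theta in Definition 2.
RELATIONS = {
    "=": operator.eq,
    "≠": operator.ne,
    "<": operator.lt,
    "≤": operator.le,
    ">": operator.gt,
    "≥": operator.ge,
}


def resolve(term, state):
    """Look up a state variable; any term not found in `state` is a constant."""
    if isinstance(term, str) and term in state:
        return state[term]
    return term


def condition(t1, theta, t2):
    """Build a predicate over the presentation state from (t1, theta, t2)."""
    relation = RELATIONS[theta]
    return lambda state: relation(resolve(t1, state), resolve(t2, state))


# The direction condition used throughout this section.
is_forward = condition("direction", "=", "FORWARD")
```

A condition built this way can be passed wherever a rule needs its "if" part, for example is_forward({"direction": "FORWARD"}).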

Definition 3. An action is represented with

action_type(stream[,action_data], sleeping_time)

where action_type is the action to be executed on stream, using action_data as parameters, after waiting for sleeping_time. The action_data can be, for example, the parameter for speeding or skipping.

An action indicates what to execute when conditions are satisfied. Starting and ending a stream, and displaying or hiding images, slides and text are sample actions. For backward presentation, backwarding is used to play a stream backward and backend is used to end it in the backward direction. There are two kinds of actions: immediate actions and deferred actions. An immediate action is applied as soon as the conditions are satisfied. A deferred action is associated with a specific sleeping_time and can only start after this sleeping_time has elapsed. An action that has started and has not yet finished is considered an alive action.

For example, the following synchronization rule,

on V1(EndPoint) if direction=FORWARD do start(V2, 4s),

means that when stream V1 ends, if the direction is forward, stream V2 is started 4 seconds later.
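The action part of the example rule can be sketched as follows. The Action class and its fields mirror Definition 3, but the class itself, the kind property and the start_time helper are assumptions for illustration, and the trigger time of 30 seconds is an invented value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Action:
    action_type: str                   # e.g. "start", "end"
    stream: str
    sleeping_time: float = 0.0         # seconds to wait before executing
    action_data: Optional[str] = None  # e.g. a skip or speed parameter

    @property
    def kind(self) -> str:
        """Deferred if there is a positive sleeping time, immediate otherwise."""
        return "deferred" if self.sleeping_time > 0 else "immediate"

    def start_time(self, trigger_time: float) -> float:
        """Time at which the action starts, given when its conditions were satisfied."""
        return trigger_time + self.sleeping_time


# do start(V2, 4s)
start_v2 = Action("start", "V2", sleeping_time=4.0)
```

If the conditions of the rule are satisfied at, say, 30 seconds into the presentation, start_v2.start_time(30.0) gives 34.0, so V2 starts 4 seconds after V1 ends.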

5.2 Receivers, Controllers and Actors

The synchronization model is composed of three layers, the receiver layer, the controller layer and the actor layer. Receivers are objects to receive events. Controllers check composite events and conditions about the presentation such as the direction. Actors execute the actions once their conditions are satisfied.

Definition 4. A receiver is a pair R=(e, C), where e is the event that will be received and C is a set of controller objects.

Receiver R can query the event source through its event e. When e is signaled, receiver R receives e and notifies all of its controllers in C of the receipt. A receiver object is depicted in Figure 33.7. There is a receiver for each single event. The receivers can be set and reset by the system at any time.

Figure 33.7: The relationships among receiver, controller and actor objects.

Definition 5. A controller is a 4-tuple C = (R, ee, ce, A) where R is a set of receivers; ee is an event expression; ce is a condition expression; and A is a set of actors.

Controller C has two components to verify: the composite events ee and the conditions ce about the presentation. When controller C is notified, it first checks whether the event composition condition ee is satisfied by querying the receivers of the events. Once ee is satisfied, it verifies the conditions ce about the states of the media objects or the presentation. After the conditions ce are satisfied, the controller notifies its actors in A. A controller object is depicted in Figure 33.7. Controllers can be set or reset by the system at any time.

Definition 6. An actor is a pair A = (a, t) where a is an action that will be executed after time t has passed.

Once actor A is informed, it checks whether it has some sleeping time t to wait for. If t is greater than 0, actor A sleeps for t and then starts action a; in this case a is a deferred action. If t is 0, action a is an immediate action and starts at once. An actor object is depicted in Figure 33.7.
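The flow from receiver to controller to actor described in Definitions 4 to 6 can be sketched in Python as below. This is a simplified illustration, not the NetMedia code: sleeping is simulated by recording the time at which the action would start instead of blocking, and all class and method names are hypothetical.

```python
class Actor:
    """A = (a, t): action `a` starts after sleeping time `t`."""

    def __init__(self, action, sleeping_time=0.0):
        self.action = action
        self.sleeping_time = sleeping_time

    def inform(self, now, log):
        # Simulated sleep: log when the action would start.
        log.append((now + self.sleeping_time, self.action))


class Controller:
    """C = (R, ee, ce, A): checks event expression ee, then condition ce."""

    def __init__(self, event_expr, condition_expr, actors):
        self.receivers = []            # filled in as receivers attach
        self.event_expr = event_expr
        self.condition_expr = condition_expr
        self.actors = actors

    def notify(self, now, state, log):
        # Query the receivers to find which events have arrived.
        received = {r.event for r in self.receivers if r.received}
        if self.event_expr(received) and self.condition_expr(state):
            for actor in self.actors:
                actor.inform(now, log)


class Receiver:
    """R = (e, C): receives event e and notifies every controller in C."""

    def __init__(self, event, controllers):
        self.event = event
        self.controllers = controllers
        self.received = False
        for controller in controllers:
            controller.receivers.append(self)

    def receive(self, now, state, log):
        self.received = True
        for controller in self.controllers:
            controller.notify(now, state, log)

    def reset(self):
        self.received = False


# on V1(EndPoint) if direction=FORWARD do start(V2, 4s)
log = []
controller = Controller(
    event_expr=lambda got: "V1(EndPoint)" in got,
    condition_expr=lambda s: s["direction"] == "FORWARD",
    actors=[Actor("start(V2)", sleeping_time=4.0)],
)
receiver = Receiver("V1(EndPoint)", [controller])
receiver.receive(now=30.0, state={"direction": "FORWARD"}, log=log)
```

After the call, log holds the single entry (34.0, "start(V2)"). A real implementation would presumably run actors on timers or threads; the log keeps the sketch deterministic.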

5.3 Timeline

If multimedia presentations are declared in terms of constraints, synchronization expressions or rules, the relationships among streams are not explicit. Those expressions only keep the relationships that are temporally adjacent or overlapping. The status of the presentation must be known at any instant. The timeline object keeps track of all temporal relationships among streams in the presentation.

Definition 7. A timeline object is a 4-tuple T =(receiverT, controllerT, actorT, actionT) where receiverT, controllerT, actorT and actionT are time-trackers for receivers, controllers, actors and actions, respectively.

The time-trackers receiverT, controllerT, actorT and actionT keep the expected times of the receipt of events by receivers, the expected times of the satisfaction of the controllers, the expected times of the activation of the actors and the expected times of the start of the actions, respectively. Since skip and backward operations are allowed, alive actions, received or not-received events, sleeping actors and satisfied controllers must be known for any point in the presentation. The time of actions can be retrieved from the timeline object.

The information that is needed to create the timeline is the duration of streams and the relationships among the streams. The expected time for the receipt of realization, InitPoint and EndPoint stream events only depends on the duration of the stream and the start time of the action that starts the stream. Since the nominal duration of a stream is already known, the problem is the determination of the start time of the action. The start of the action depends on the activation of its actor. The activation of the actor depends on the satisfaction of the controller. The expected time when the controller will be satisfied depends on the expected time when the event composition condition of the controller is satisfied.
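The dependency chain above (controller, then actor, then action, then stream events) can be illustrated with a short Python sketch. It assumes that an actor is activated at the instant its controller is satisfied and that realization points are given as offsets from the start of the stream; the function names and the durations are hypothetical.

```python
def stream_event_times(start_time, duration, realization_offsets=None):
    """Expected times of a stream's events from its start time and nominal duration."""
    times = {"InitPoint": start_time, "EndPoint": start_time + duration}
    for label, offset in (realization_offsets or {}).items():
        times[f"realization({label})"] = start_time + offset
    return times


def action_start_time(controller_time, sleeping_time=0.0):
    """Expected start of an action: controller satisfaction plus the actor's sleeping time."""
    return controller_time + sleeping_time


# V1 starts with the presentation and lasts 30 s (assumed values).
v1 = stream_event_times(start_time=0.0, duration=30.0, realization_offsets={"r1": 10.0})

# V2 is started 4 s after V1(EndPoint) and lasts 20 s (assumed values).
v2_start = action_start_time(v1["EndPoint"], sleeping_time=4.0)
v2 = stream_event_times(v2_start, duration=20.0)
```

With these values V2 is expected to start at 34 seconds and end at 54 seconds, and each further stream can be placed on the timeline in the same way.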

The expected time for the satisfaction of an event composition condition is handled in the following way: In our model, events can be composed using && and || operators. Assume that ev₁ and ev₂ are two event expressions where time(ev₁) and time(ev₂) give the expected times of satisfaction of ev₁ and ev₂, respectively. Then, the expected time for composite events is found according to the predictive logic for WBT (will become true) in [5]:

time(ev₁ && ev₂)=maximum(time(ev₁),time(ev₂))
time(ev₁ || ev₂)=minimum(time(ev₁),time(ev₂))

where the maximum and minimum functions return the maximum and minimum of the two values, respectively. The expected time of the first receiver and the first controller is 0, since they depend only on the user event that starts the presentation.
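The two formulas can be applied recursively to nested event expressions. The Python sketch below assumes an expression is either an event name or a tuple (operator, left, right); this encoding and the function name expected_time are illustrative choices, and the times are invented.

```python
def expected_time(expr, times):
    """Expected satisfaction time of an event expression.

    `expr` is an event name (str) or a tuple (op, left, right) with op
    being "&&" or "||"; `times` maps event names to expected times.
    """
    if isinstance(expr, str):
        return times[expr]
    op, left, right = expr
    left_time = expected_time(left, times)
    right_time = expected_time(right, times)
    if op == "&&":
        return max(left_time, right_time)   # both events must have occurred
    if op == "||":
        return min(left_time, right_time)   # the earlier event suffices
    raise ValueError(f"unknown operator: {op!r}")


times = {"V1(EndPoint)": 30.0, "A1(EndPoint)": 25.0, "USER(start)": 0.0}
both = ("&&", "V1(EndPoint)", "A1(EndPoint)")
either = ("||", "V1(EndPoint)", "A1(EndPoint)")
```

Here expected_time(both, times) is 30.0 and expected_time(either, times) is 25.0, matching the maximum and minimum rules.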

Another important issue is enforcing the smooth presentation of multimedia streams at client sites. This is critical for building an adaptive presentation system that can handle high rate variance of stream presentations and delay variance of networks. In [32], we observed that dynamic playout scheduling can greatly enhance the smoothness of the presentations of the media streams. We have thus formulated a framework for various presentation scheduling algorithms to support the various Quality of Service (QoS) requirements specified in the presentations [32]. We designed a scheduler within the NetMedia client to support smooth presentations of multimedia streams at the client site in the distributed environment. Our primary goal in the design of playout scheduling algorithms is to create smooth and relatively hiccup-free presentations in delay-prone environments.

Our algorithms dynamically compensate for rate variations that arise during the transmission of the data. We define the rendition rate to be the instantaneous rate of presentation of a media stream. Each stream must periodically report its own progress to the scheduler (which implements the synchronization algorithm). The scheduler in turn reports to each stream the rendition rate required to maintain the desired presentation. The individual streams must try to follow this rendition rate. The client has several threads running concurrently to achieve the presentation. The details of these algorithms can be found in [32].
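The report-and-adjust loop between the streams and the scheduler might be sketched as follows. This is not the algorithm of [32], whose details are not given here; it is a simple catch-up rule chosen for illustration, and the names Scheduler and rendition_rate, the 4-second horizon and the rate bounds are all assumptions.

```python
def rendition_rate(target, progress, horizon=4.0, min_rate=0.5, max_rate=1.5):
    """Rate at which a stream would reach the target position after `horizon` seconds.

    `target` is the desired presentation position and `progress` the position
    reported by the stream, both in media seconds. A rate of 1.0 is nominal speed.
    """
    required = (target + horizon - progress) / horizon
    # Clamp so that playback never speeds up or slows down too abruptly.
    return max(min_rate, min(max_rate, required))


class Scheduler:
    """Collects periodic progress reports and answers with rendition rates."""

    def __init__(self):
        self.progress = {}

    def report(self, stream, position):
        """Called periodically by each stream with its current position."""
        self.progress[stream] = position

    def rates(self, target):
        """Rendition rate for every reporting stream, given the target position."""
        return {s: rendition_rate(target, p) for s, p in self.progress.items()}


scheduler = Scheduler()
scheduler.report("V1", 9.0)    # one second behind
scheduler.report("A1", 11.0)   # one second ahead
scheduler.report("T1", 10.0)   # on time
rates = scheduler.rates(target=10.0)
```

Under these assumptions the lagging stream is asked to play at 1.25 times nominal speed, the leading stream at 0.75, and the on-time stream at 1.0.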