4. End-to-End Flow Control

Various delays and congestion in the network may cause the client buffer underflow/overflow, which may result in discontinuities in stream playback. We propose the DSD and PLUS schemes to dynamically monitor the network situation and adapt our system to the current situation of the network. The DSD scheme provides full flow control on stream transmission, and the PLUS scheme utilizes probing of the network status and an effective adjustment mechanism to data loss to prevent network congestion. Both schemes work hand in hand to find the optimum adjustment considering network delay and data loss rate, and ensures no overflow/underflow in the client buffer.

4.1 DSD Scheme

The basic dynamic end-to-end control scheme used in NetMedia is the DSD algorithm presented in [50]. The DSD scheme is based on the Distance in time between Sending and Display (DSD) media units in a stream. The distance between sending and display at time t is defined to be the difference between the nominal display time of the media unit that is being sent and the media unit that is being displayed. Thus, at a given time, let the media unit being displayed have nominal display time T and the media unit being sent have nominal display time t. Then, DSD=t-T. Figure 33.5 illustrates the meaning of the DSD in a stream.

click to expand
Figure 33.5: Definition of DSD in a stream.

The connection between the DSD and the network delay is as follows. Comparing with the network delay, if DSD is too large, then the overflow data loss at the client buffer could be high; if DSD is too small, then the underflow data loss at the client buffer could also be high. Thus, DSD should be large enough to cover the network delay experienced by most media units; it also should be small enough to make sure the client buffer won't overflow.

By adjusting the current DSD in the system to respond to the current network delays, we can monitor the best DSD dynamically. We can then adjust the sending rate to avoid underflow/overflow in the client buffers and thus alleviate data loss.

Initially, a DSD is selected based on the up-to-date network load, and the sending and display rates are fixed as nominal display rates. The calculation of the current DSD, the best DSD and the adjustment of the DSD is given below.

Calculation of current DSD

At time t, let t_sent be the nominal display time of the media unit being sent at the server site and t_displayed be the nominal display time of the media unit being displayed at the client site. By definition, the DSD is:

(33.2)

Suppose at time t, the media unit being displayed has unit number u^t_displayed, the media unit being sent has unit number u^t_sent and nominal duration of one unit is UnitDuration. We can then calculate DSD as follows:

(33.3)

Client can always know u^t_displayed directly, but it can only estimate u^t_sent from the information carried by currently arriving media units. Suppose that the current arriving media unit u was sent at time SendingTime_u and its unit number is n_u. Assume that the server is sending with nominal display rate. At time t, we have

(33.4)

We then obtain:

(33.5)

Calculation of the best DSD

We define an allowable range for DSD:

MINDSD ≤ DSD ≤ MAXDSD,

where MINDSD = 0 and MAXDSD = MAX_DELAY. We then evenly divide interval [MINDSD,MAXDSD] into k-1 subintervals, with interval points, d₁=MINDSD,d₂,d3,…d_k=MAXDSD. Here k can be 10, 20 or even 100. The tradeoff is that if k is too small, the best DSD to be found might not be good enough; if k is too big, the overhead in calculating best DSD is too large. For each d_i, we keep track of data loss with respect to it. Following is the definition of relative data loss with respect to d_i.

Let d be the current DSD. Suppose we have a virtual client, which is the same as the actual client, with the same buffer size and the same displaying time for any media unit. The difference is that, suppose for the real client, a real media unit is sent at time t. It then should be displayed at time t+d. For the virtual client, this unit is sent at time t and is displayed at time t+d_i. So the virtual DSD for this unit is d_i. Upon this assumption, the loss rate of the virtual client is defined as the relative loss rate with respect to d_i. When a media unit arrives, virtual clients first check if the unit is late or not. Suppose current time is currentT, the DSD of the virtual client is d and the sending time of this media unit is sendT. If sendT+d < currentT, then the unit is late. If the unit is late, it's counted as a relative loss for this virtual client.

The best DSD finder keeps track of all these virtual clients. The calculation of the best DSD is launched once in a fixed time period (200 milliseconds in the implementation of NetMedia). When a new (best) DSD is needed, the best DSD finder browses over all these virtual clients, and the DSD in the virtual client with minimum loss rate is reported as the best DSD. Note that DSD only captures relative packet loss, which means that the packet arrived at the client site but was too late for display. Data loss due to network congestion is captured by the PLUS protocol, which will be introduced below.

Adjustment to network delay

At the beginning of an application the DSD is chosen based on the current network load. If the best DSD is different from the current DSD, there are two ways of adjustment. One is to change sending rate and another is to change display rate. The display rate is adjusted by the timing routine at the client site. Only video streams are adjusted.

At any calculation point, if the client calls for adjustment, then a feedback message is sent to the server side to adjust sending rate, so that the system DSD can be changed to the best DSD. The feedback messages will carry the Δd - the difference of the best DSD and the current DSD. A negative Δd means the sending rate should slow down and a positive Δd means the sending rate should speed up. The feedback information also contains information about the loss rate to initiate frame dropping at the server site if necessary.

To slow down, the server stops sending for |Δd| time. To speed up, the server always assumes that the current display rate is nominal display rate r. Let the maximum available sending rate over the network be R. The server will speed up using the maximum data rate for a time period. After the reaction time is finished, the server will use the nominal display rate r as the sending rate.

4.2 PLUS Scheme

The DSD scheme introduced above does not address the congestion problem in the network. In addition, it does not consider the effect of data compression which may introduce burstiness into the data streams. The burstiness results in different bandwidth requirements during transmission of the stream. This makes it difficult to come up with a resource and adaptation scheme because the bandwidth requirements always change.

To address these issues we developed a new flow and congestion control scheme, termed PLUS (Probe-Loss Utilization Streaming protocol), for distributed multimedia presentation systems. This scheme utilizes probing of the network situation and an effective adjustment mechanism to data loss to prevent network congestion. The proposed scheme is also designed to scale with increasing number of PLUS-based streaming traffic and to live in harmony with TCP-based traffic. With the PLUS protocol we address the need to avoid congestion rather than react to it with the help of a novel probing scheme.

The PLUS protocol eases bandwidth fluctuations by grouping together some number of media units (frames for video) in a window interval ΔW (Figure 33.6). Easing bandwidth in a given window is defined as smoothing [20]. One way of smoothing is done by sending out the data at the average bandwidth requirement for the window (Equation 6):

(33.6)

click to expand
Figure 33.6: PLUS probing scheme.

By using this method, the client can guarantee that the bandwidth needed is minimal and constant through the interval. The disadvantage of this method is that the feedback received only applies to the current or past bandwidth requirements. It does not take into account that there might be a higher bandwidth request in the future, which the network may not be able to process.

For each interval ΔW, we identify a critical interval. The critical interval is an interval in the future playback time that contains the maximum number of packets. The aim of the PLUS probing scheme is to test if the network can support the critical interval. We set ΔW to be a 5 second smoothing window in the implementation of NetMedia. This is a reasonable time to reduce the bandwidth through smoothing. It is also granular enough to detect sequences with fast movements (which result in a higher numbers of packets per frame and therefore provide the bottleneck bandwidth of a stream).

Once the critical interval for each sequence is determined we apply our smoothing and probing scheme. The critical bandwidth in the future, at a given interval, is provided by its critical interval. To find the minimal bandwidth requirement for the critical interval we apply the averaging algorithm, which spreads the constant sized packets evenly. This leads to a sending difference between consecutive packets, which is defined by:

(33.7)

According to Keshev [33], the bottleneck bandwidth of a connection can be estimated by the packet pair approach at the receiver site. The essential idea is that the inter-packet spacing of two packets will be proportional to the time required for the bottleneck router to process the second packet. The bottleneck bandwidth is calculated as:

(33.8)

In MPEG encoded streams the loss of I or P frames results in a further loss of dependent frames. Only the loss of B frames does not result in further loss. The loss of a B frame in presentation results only in a short degradation of the quality of service in playback. We use B frames to probe the network if the critical interval can be supported and thereby protect loss of the critical frames like I or P frames (as seen in the lower part of Figure 33.6).

Instead of spreading each packet evenly within our 5 seconds smoothing window we use the bottleneck bandwidth for sending out all B frames. Critical packets (belonging to I or P) will still be sent out with the average bandwidth calculated in Equation 33.6. Our scheme will thereby punctually probe the network while acknowledgments will give direct feedback ahead of time if the critical bandwidth can be supported. In case of congestion we will initiate a multiplicative back off of the sending rate. B frames have in our scheme a less chance of survival in case of congestion. This is not a disadvantage because we can proactively provide a bandwidth reduction at the server site by dropping non-critical B frames in time to increase the survival rate of critical packets (and thereby astonishingly increase the survival number of subsequent B frames).

Concurrent probing of different PLUS streams reports a conservative estimation of the current network situation. The estimation is based on the bandwidth need of the critical intervals of the streams and not necessary of the current average interval. This behavior allows PLUS streams to be very aware of its surrounding and responsive to protect critical packets when the maximal capacity of a connection is reached. If the concurrent probing causes packet loss, PLUS streams back off, to allow living in harmony with TCP and other PLUS streams. Sequential probing of different PLUS streams could report an estimation which leads to data loss in case that multiple PLUS streams send their critical interval at exactly the same time, each assuming the network can support it. The probability of such a situation is neglectably low and can be further reduced with the number of B frames in a stream.