3.7 Developing Delay Requirements

For applications that have delay requirements, we will use the terms end-to-end delay, round-trip delay, and delay variation as measures of delay in the network.

We begin by introducing some useful general thresholds and limits for delay: interaction delay, human response time, and network propagation delay. These thresholds and limits help distinguish low- and high-performance delay requirements for your network.

Interaction delay (INTD) is an estimate of how long a user is willing to wait for a response from the system during an interactive session. Here a session is the time period during which an application is active. The INTD will depend on user behavior, the environment, and the types of applications being used. INTDs may range from hundreds of milliseconds to a minute or more. In general, a useful range is 10 to 30 seconds.

INTD is important when building a network targeted toward interactive applications. An INTD estimate is useful in characterizing applications that are loosely interactive—that is, those in which waiting to receive information is expected. This applies to transaction-processing applications, as well as Web, file transfer, and some database processing. For applications such as these, the user will notice some degree of delay, and INTD is an estimate of how long users are willing to wait. An upper limit to INTD is a measure of the tolerance level of users.

Human response time (HRT) is an estimate of the time threshold at which users begin to perceive delays in the system. Thus, when the system response time is less than the HRT, users generally do not perceive any delay in the system. For delays greater than the HRT, users will notice the system delay. A good estimate of HRT, based on experience and observation, is approximately 100 ms. This is an important delay threshold because it marks the boundary between delays that users will notice and delays that they will not. Note that when users notice a system delay, they become aware of the system. When architecting and designing a network in which one of the goals is to hide the system from the user, you should consider HRT a vital requirement for the system. An example of this is grid computing, in which the system is abstracted from the users.

HRT is particularly important for highly interactive applications because users of these applications should not perceive wait times. This is usually the case when the application supports an interactive environment for users, such as in visualization, virtual reality, and collaborative applications, but it may also apply to applications for which system delays greater than HRT result in loss of productivity.

Network propagation delay is an estimate of how long it takes for a signal to cross a physical medium or link, for example, the propagation delay across a DS3 link. This provides a lower limit to the end-to-end and round-trip network and system delays. Propagation delay depends on distance and technology. It is useful as a lower delay limit because it tells you when an application may not work well across the network: namely, when the application's delay requirement is more stringent than the actual or planned propagation delay of the network.

These delay thresholds and limits are shown in Figure 3.14. Any or all of these delay estimates may be applied to a network. For example, we can use HRT as a limit for all services, constraining the architecture and design to provide delay characteristics less than HRT for that network. Or we can choose a value for INTD that defines and limits interactive service. In all cases, network propagation delay will provide a lower limit for delay. Any of these delay estimates may also be used as guarantees on service.

Figure 3.14: Delay estimates for user requirements.

These delay estimates come from experience and are presented here for you to consider. You may disagree with their values or find other useful estimates for delay. You are encouraged to develop your own estimates or improve on those presented here. Since you know the network, system, and environment for your network project, you are in the best position to apply these estimates to your network.

We can use the estimates for HRT and INTD to help us distinguish between interactive-burst and interactive-bulk applications. When both the responsiveness of the application (how frequently and quickly the application interacts with the user) and the end-to-end or round-trip delay (whichever is available to measure the system) are limited by one of these delay characteristics, we estimate that the application is interactive-burst or interactive-bulk.

In Figure 3.15, HRT and INTD are used as thresholds to separate interactive-burst and interactive-bulk applications. Between these two delay estimates, ranging from 100 ms to 10 to 30 seconds, is a gray area where the application could be considered either interactive-burst or interactive-bulk.

Figure 3.15: Performance regions for interactive-burst and interactive-bulk applications.

The use of INTD and HRT is probably the most straightforward way to distinguish between interactive-burst and interactive-bulk applications.
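The thresholds described above can be sketched as a simple classifier. This is only an illustration under stated assumptions: the function name and the region labels are hypothetical, the mapping (below HRT is interactive-burst, above INTD is interactive-bulk) is one reading of Figure 3.15, and the INTD value of 10 seconds is just the lower end of the 10 to 30 second range given earlier. You would substitute the estimates that fit your environment.

```python
HRT_S = 0.1    # human response time, approximately 100 ms (from the text)
INTD_S = 10.0  # interaction delay; assumed here as the low end of the 10 to 30 s range


def classify_interactive(delay_s, hrt_s=HRT_S, intd_s=INTD_S):
    """Place an end-to-end or round-trip delay (in seconds) into a performance region."""
    if delay_s < 0:
        raise ValueError("delay must be non-negative")
    if delay_s <= hrt_s:
        return "interactive-burst"   # users should not perceive the delay
    if delay_s >= intd_s:
        return "interactive-bulk"    # users expect to wait for the response
    return "gray area"               # could be considered either type


for delay in (0.05, 2.0, 20.0):
    print(delay, classify_interactive(delay))
# 0.05 interactive-burst
# 2.0 gray area
# 20.0 interactive-bulk
```

Passing a different `intd_s` (for example, 30.0) widens the gray area, which is one way to reflect how tolerant your users are.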

3.7.1 End-to-End and Round-Trip Delays

End-to-end and round-trip delays are composed of many sources of delay, including propagation, queuing, transmission, input/output, switching, and processing. Although it would be useful to be able to monitor and measure each source of delay, it is not practical for most networks. Therefore, totals such as end-to-end and round-trip delay are used. For many networks, especially IP networks, the round-trip delay is measured using various versions of the utility ping. Ping provides a useful, easily measured, and readily available form of round-trip delay.

Recall that we used HRT, INTD, and network propagation delay as thresholds and limits to distinguish between performance levels for delay requirements. They are based on combinations of the following:

Physical limits of the network—for example, the size of the network and the distances between applications, users, and/or devices
Device hardware and software performance
Network protocol performance
Application behavior at particular delay thresholds
User interaction with the system at particular delay thresholds

A guideline in developing delay requirements is to determine the limiting factor from the delay thresholds and limits. The limiting factor will be the ultimate delay bottleneck within the system. As limiting factors are found, they can often be reduced or eliminated, revealing the next limiting factor. This process is repeated until a limiting factor is found that cannot be reduced or eliminated or until system delay is at an acceptable level.
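The iterative process above can be sketched in a few lines. This is a minimal illustration, not a measurement method: the function names, the delay sources, and every number in the example are hypothetical, and it assumes you already have per-source delay estimates along with a floor below which each source cannot be reduced.

```python
def limiting_factor(delays_ms):
    """Return the (source, delay) pair that currently contributes the most delay."""
    return max(delays_ms.items(), key=lambda item: item[1])


def reduce_until_acceptable(delays_ms, floors_ms, target_ms):
    """Repeatedly reduce the current limiting factor to its floor.

    Stops when the total delay meets the target or when the current
    limiting factor cannot be reduced any further.
    Returns the sources that were reduced, in order, and the final total.
    """
    delays = dict(delays_ms)
    reduced = []
    while sum(delays.values()) > target_ms:
        source, value = limiting_factor(delays)
        floor = floors_ms.get(source, value)  # no floor given: treat as irreducible
        if floor >= value:
            break  # limiting factor cannot be reduced or eliminated
        delays[source] = floor
        reduced.append(source)
    return reduced, sum(delays.values())


# Hypothetical delay estimates and floors, in milliseconds.
delays = {"propagation": 10, "device processing": 60, "queuing": 20, "transmission": 10}
floors = {"propagation": 10, "device processing": 15, "queuing": 5, "transmission": 10}

print(reduce_until_acceptable(delays, floors, target_ms=40))
# (['device processing', 'queuing'], 40)
```

In this made-up case, reducing device processing reveals queuing as the next limiting factor, and the loop ends once the total reaches the target.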

Given that physical limits on delay are absolute limits (the speeds of electromagnetic radiation through various media are well known), they may have an impact on or negate other delay thresholds. For example, if an application requires a round-trip delay of 80 ms to support an interactive virtual reality (VR) environment and the application session is being distributed between Los Angeles and Tokyo, the round-trip delay of the network (approximately 120 ms) will exceed the application delay requirement, regardless of the network architecture and design. Thus, either the application has to take into account the round-trip delay between Los Angeles and Tokyo, possibly by modifying the code, algorithms, or usage model, or the application sessions cannot be distributed between these sites. Knowing this early in the architecture process allows the application developers or network engineers to adapt to the physical limits of the network.
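A rough check of this kind can be done before any design work begins. The sketch below makes two assumptions that are not from the text: a signal speed in optical fiber of about 200,000 km/s (roughly two-thirds of the speed of light in a vacuum) and a Los Angeles to Tokyo great-circle distance of about 8,800 km. Actual cable routes are longer than the great-circle path, so the real round-trip delay is higher than this bound, which is consistent with the approximate 120 ms figure above. The function names are hypothetical.

```python
SPEED_IN_FIBER_KM_PER_S = 200_000  # assumed: about two-thirds of c in a vacuum


def round_trip_propagation_ms(path_km, speed_km_per_s=SPEED_IN_FIBER_KM_PER_S):
    """Lower bound on round-trip delay from propagation alone, in milliseconds."""
    return 2 * path_km / speed_km_per_s * 1000


def physically_feasible(required_rtt_ms, path_km):
    """True if the propagation bound alone does not already exceed the requirement."""
    return round_trip_propagation_ms(path_km) <= required_rtt_ms


la_tokyo_km = 8_800  # assumed great-circle distance; real cable paths are longer

print(round(round_trip_propagation_ms(la_tokyo_km)))  # 88
print(physically_feasible(80, la_tokyo_km))           # False
```

Even under these best-case assumptions, propagation alone exceeds the 80 ms requirement, so no choice of architecture or devices can satisfy it over this distance.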

Now let's say that the distance of the aforementioned network is reduced and the physical limitation on the delay is now 10 ms (e.g., the network is now between Los Angeles and San Francisco). However, through testing the application on the current network, a round-trip delay of 100 ms is expected within the system (primarily because of the devices used). What can be done? Device hardware and software performance can be difficult to identify, understand, and improve. In architecting and designing a network, we need to try to consider each piece in the end-to-end path of communications (traffic flows), which may consist of a variety of devices. What can we look for that will reduce the likely sources of delay in the system, without an unreasonable amount of time, effort, and cost? Areas to check for sources of delay that can be modified or tuned include computer operating systems (OSs), network protocols, and device peripherals.

What we can look for in these areas are OSs that are notoriously slow and poorly written, protocols that are poorly implemented, and devices and/or peripherals that are mismatched or misconfigured. This type of information obviously will not come from the manufacturer but can often be found in mailing lists, newsgroups, independent testing, and academic research, all of which are increasingly available on the Internet. By doing some research, you can rapidly learn a lot about performance problems with devices. Make your own observations and analyses, and draw your own conclusions. As we will see later in this book, support services such as network management, routing, and security will also play a role in network delay.

Delay thresholds based on application behavior and user interaction are generally more flexible than physical or device delays. Recall that values for INTD and HRT were estimates (estimated range for INTD), so there is some flexibility to tailor them to your environment. In the previous example, the round-trip physical delay was the ultimate limit. When the distance of the network was reduced so that the round-trip delay was on the order of 10 ms, however, the application behavior (interactive VR environment), with its 80-ms round-trip delay requirement, may become the limiting factor. If this is acceptable to the users, then the delay estimates are (from our high-level view) optimized for that environment.

3.7.2 Delay Variation

Delay variation is often coupled with end-to-end or round-trip delay to give an overall delay performance requirement for applications that are sensitive to the interarrival time of information. Some examples of such applications are those that produce or use video, audio, and telemetry information. For delay variation coupled with end-to-end or round-trip delay, when no information is available about the delay variation requirement, a good rule of thumb is to use 1% to 2% of the end-to-end delay as the delay variation.

For example, an estimate for delay variation (in the absence of any known requirements), when the end-to-end delay is 40 ms, is approximately 400 to 800 µs. This would be a rough approximation, however, and should be verified when possible.
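The rule of thumb of 1% to 2% of the end-to-end delay is simple to compute. The helper below is a hypothetical sketch (the function name and parameters are not from the text) that returns the estimated range in microseconds so that small values remain readable.

```python
def delay_variation_estimate_us(end_to_end_delay_ms, low_pct=1.0, high_pct=2.0):
    """Estimate a delay variation range (in microseconds) as a percentage of end-to-end delay."""
    delay_us = end_to_end_delay_ms * 1000  # milliseconds to microseconds
    return delay_us * low_pct / 100, delay_us * high_pct / 100


low, high = delay_variation_estimate_us(40)
print(f"{low:.0f} to {high:.0f} microseconds")  # 400 to 800 microseconds
```

As with the rule itself, the result is only a starting point and should be replaced by a measured or specified requirement when one is available.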