3.8 Developing Capacity Requirements

In evaluating application capacity requirements, we want to consider applications that require large amounts of capacity and those that require a specific value (e.g., peak, minimum, sustained) or a range of capacities. When an application requires a large amount of capacity, it is important to know when the application has a real (measurable and verifiable) requirement for high capacity and when it (or the transport protocol it is using) will just attempt to use whatever capacity is available.

Applications that use TCP as their transport mechanism, without any additional conditions or modifications from the higher-layer protocols (or from the application itself), will receive performance levels based on the current state of the network (i.e., best-effort). This is done by communicating and adjusting TCP transmission parameters between the two devices throughout the session, reacting to conditions of apparent congestion in the network.
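As a rough illustration of how the achievable rate tracks network conditions, the sketch below computes the classic upper bound on TCP throughput (window size divided by round-trip time); the window and round-trip-time values are assumptions for illustration, not figures from the text.

    # Illustrative only: TCP throughput is bounded by the amount of data the
    # sender may have in flight (the window) divided by the round-trip time.
    # The window and RTT values below are assumptions, not measurements.

    def tcp_throughput_bound_bps(window_bytes: int, rtt_seconds: float) -> float:
        """Upper bound on TCP throughput, in bits per second."""
        return window_bytes * 8 / rtt_seconds

    # Example: a 64-KB window over a 50-ms round trip caps throughput at
    # roughly 10 Mb/s; a longer round trip lowers the bound.
    print(f"{tcp_throughput_bound_bps(64 * 1024, 0.050) / 1e6:.1f} Mb/s")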

3.8.1 Estimating Data Rates

Estimating a data (or, perhaps more accurately, information) rate depends on how much you know about the transmission characteristics (e.g., traffic flow) of the application and on how accurate the estimate needs to be. Commonly used data rates include the peak data rate, sustained data rate, and minimum data rate, or combinations of these. These data rates may be measured at one or more layers in the network (e.g., physical, data link, network, transport, session).

The data rates of all applications are bound by some limiting factor. This may be the application itself, the line rate of the network interface on the device on which the application is running, or the performance level of a server the application uses. So the real question is where the limits are in the system for that application. If the application itself supports a very fast data rate (which can be determined by running the application on a test network), then you can go through a process of elimination to determine which component, on the device or elsewhere, is causing the performance bottleneck. In a sense, this is a local optimization of the network; in this case, you are doing it for a single application in a closed network environment.
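As a simple way to picture this process of elimination, the following sketch (with hypothetical component names and rate limits) takes the achievable rate for an application to be the minimum of the rate limits of the components in its path.

    # Hypothetical rate limits, in Mb/s, for the components an application
    # depends on; the achievable rate is bounded by the slowest of them.
    component_limits_mbps = {
        "application (measured on a test network)": 800,
        "network interface line rate":              100,
        "server used by the application":           250,
    }

    bottleneck = min(component_limits_mbps, key=component_limits_mbps.get)
    print(f"Bottleneck: {bottleneck} "
          f"at {component_limits_mbps[bottleneck]} Mb/s")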

Many applications rely on the transport protocol, such as TCP, to provide an "as fast as possible" data rate. We have an intuitive feel for some of these applications (e.g., file transfer protocol [FTP], telnet); for example, we can be fairly certain that a telnet session will not have a data rate of 100 Mb/s and that an FTP session should not run at 10 Kb/s if there is greater capacity available.

To estimate data rates for applications, we can consider their data sizes and estimated completion times. Data sizes and completion times are either based on what users want or expect or measured on the existing network.

For an application that has nebulous capacity and delay characteristics, such as Web access, we can estimate a data rate from user- and application-provided data sizes and completion times (e.g., INTD). We may ask a number of users for examples of pages they expect to access or information they would download, and how long they are willing to wait for each event. From this information, we could create entries in a table like the one shown in Figure 3.16. From estimates such as those in Figure 3.16, we can estimate upper and lower limits for an application's data rates or an average rate.

Application                           Average Completion Time (Seconds)   Average Data Size (Bytes)
Distributed Computing (Batch Mode)    10^3                                10^7
Web Transactions                      10                                  10^4
Database Entries/Queries              2–5                                 10^3
Payroll Entries                       10                                  10^2
Teleconference                        10^3                                10^5

Figure 3.16: Completion times and data sizes for selected applications.
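One way to turn such estimates into rates, sketched below, is simply to divide each data size in Figure 3.16 by its completion time; the entries mirror the figure, and treating size over time as an average rate is an assumption about how the estimates would be used rather than a prescribed method.

    # Approximate average rates from Figure 3.16:
    # rate (b/s) ~= data size (bytes) * 8 / completion time (seconds).
    estimates = {
        # application:                        (time [s], size [bytes])
        "Distributed Computing (Batch Mode)": (1e3, 1e7),
        "Web Transactions":                   (10,  1e4),
        "Database Entries/Queries":           (5,   1e3),  # upper end of 2-5 s
        "Payroll Entries":                    (10,  1e2),
        "Teleconference":                     (1e3, 1e5),
    }

    for app, (time_s, size_bytes) in estimates.items():
        rate_kbps = size_bytes * 8 / time_s / 1e3
        print(f"{app}: ~{rate_kbps:g} Kb/s")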

For other applications, the characteristics of the data transfers may be better known, such as the sizes of the data sets being transferred and the frequency and duration of such transfers. This may apply to transaction-based applications, such as credit card processing, banking applications, and distributed computing.

Consider a remote interactive data-processing application that connects to retail stores and processes customer information, such as credit card entries. We can define an event (task) as the processing of a single customer's credit card information. The completion time of the task is then on the order of the INTD discussed earlier, approximately 10 to 30 seconds, although the users (store personnel) may expect it to be much smaller, say, on the order of 5 to 10 seconds. The data size for each task is fairly small, on the order of 100 to 1000 bytes.
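A back-of-the-envelope calculation using these numbers (the ranges repeat the paragraph above; pairing the extremes this way is simply one plausible way to bound the rate) gives the per-task data rate range:

    # Per-task rate range for the credit card example:
    # 100-1000 bytes per task, expected completion in 5-10 seconds.
    min_size_bytes, max_size_bytes = 100, 1000
    min_time_s, max_time_s = 5, 10

    low_bps = min_size_bytes * 8 / max_time_s    # smallest task, longest time
    high_bps = max_size_bytes * 8 / min_time_s   # largest task, shortest time
    print(f"Estimated per-task rate: {low_bps:.0f} to {high_bps:.0f} b/s")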

Another example is a computing environment in which multiple devices are sharing the processing for a task. At each iteration of the task, data are transferred between devices. Here we may know the frequency of data transfer, the size of each transfer (which may also be constant), and how much time is required to process the data (which may indicate how much time a transfer may take). A shared, multiprocessor computing network is shown in Figure 3.17.

Figure 3.17: Example of a shared, multiprocessor computing network.
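When the transfer size, transfer frequency, and per-iteration processing time are known, a capacity estimate for each data exchange follows directly; the sketch below uses assumed values purely for illustration.

    # Illustrative values for the shared-computing case (all assumed):
    transfer_bytes = 2e6        # data exchanged per iteration
    compute_time_s = 0.5        # processing time per iteration
    transfer_budget_s = 0.1     # time allowed for each transfer

    # The transfer must complete within its budget, which sets the rate;
    # compute time plus transfer time sets the iteration frequency.
    required_rate_mbps = transfer_bytes * 8 / transfer_budget_s / 1e6
    iterations_per_s = 1 / (compute_time_s + transfer_budget_s)
    print(f"Required rate: {required_rate_mbps:.0f} Mb/s, "
          f"~{iterations_per_s:.2f} iterations/s")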

For some applications, the capacity characteristics are well known, and estimating a data rate can be relatively easy. Applications that involve voice and/or video usually have known capacity requirements. The rates of such applications are likely to be constant, so their peak and minimum data rates are the same, or they fall within a known range of values. For example, an MPEG-2 low-level constrained-parameter bitstream (CPB) will have a rate of 4 Mb/s, and a main-level CPB will have a rate between 15 and 20 Mb/s.

There are currently no general thresholds or limits on capacity for distinguishing low- from high-performance capacity. It is not obvious why this should be so, except that RMA and delay have a direct impact on the user's perception of system performance, whereas capacity has a secondary impact through its effect on RMA and delay. There will often be environment-specific thresholds on capacity, however, that determine low and high performance for that network.

Generic performance thresholds can be added to the performance envelope we developed in Chapter 1. These thresholds, as shown in Figure 3.18, are used to separate the envelope into low- and high-performance regions. This gives us a visual interpretation of the relative performance requirements for the network and can be used to relate performance requirements to management.

Figure 3.18: Performance envelope with generic thresholds.
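Once environment-specific thresholds exist, applying them is straightforward; the sketch below classifies a requirement against assumed thresholds for capacity, delay, and RMA, in the spirit of Figure 3.18 (the threshold values and the sample requirement are hypothetical).

    # Hypothetical thresholds separating low- and high-performance regions.
    thresholds = {"capacity_mbps": 100, "delay_ms": 40, "reliability": 0.999}

    def is_high_performance(req: dict) -> bool:
        """High performance: meets or exceeds the capacity and reliability
        thresholds and stays at or below the delay threshold."""
        return (req["capacity_mbps"] >= thresholds["capacity_mbps"]
                and req["delay_ms"] <= thresholds["delay_ms"]
                and req["reliability"] >= thresholds["reliability"])

    print(is_high_performance(
        {"capacity_mbps": 150, "delay_ms": 25, "reliability": 0.9995}))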



