10.3 General methods of data extraction

In the previous section we examined a system used to test system concepts before the final target system is constructed. Often we are faced with analyzing an existing system. This requires the computer systems analyst to develop methods for extracting information from a running system and for running experiments on an existing system.

There are three methods for extracting information from an existing system: hardware monitors, software monitors, and accounting software (e.g., the operating system). Measurements for performance analysis are typically extracted using either hardware or software monitors specifically set up for the measurements of interest. Depending on what parameters are of interest, we may be able to measure them directly, or we may need to obtain the measures from intermediate measurements.

Most computer systems, even your PC, provide means to determine resource utilization and a variety of other useful measurements. If the system is a timesharing system, one can typically determine how much CPU time was used by a process, how much memory it consumed, and possibly even how much time was spent in I/O processing. Information such as the number of users logged on to the system, number of I/O accesses performed by a user, page faults in memory, and active time for a process can be obtained.
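As a rough illustration of the kind of accounting data an operating system exposes, the sketch below reads the CPU time charged to the current process before and after a piece of work. It uses Python's standard `os.times()` call; the helper names `cpu_accounting_snapshot`, `measure`, and `busy_loop` are hypothetical and exist only for this example.

```python
import os
import time


def cpu_accounting_snapshot():
    """Return CPU seconds charged to this process so far, plus a wall-clock reading."""
    t = os.times()
    return {"user": t.user, "system": t.system, "elapsed": time.perf_counter()}


def measure(work):
    """Run work() and return how much user, system, and elapsed time it consumed."""
    before = cpu_accounting_snapshot()
    work()
    after = cpu_accounting_snapshot()
    return {key: after[key] - before[key] for key in before}


def busy_loop(n=200_000):
    """Hypothetical CPU-bound workload used only to have something to measure."""
    total = 0
    for i in range(n):
        total += i * i
    return total


usage = measure(busy_loop)
print(usage)
```

The split between user and system time hints at how much of the work was done by the application and how much by the operating system on its behalf, although the resolution of these counters varies from platform to platform.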

A problem with software developed for system accounting purposes is that it may not provide information about system software components, such as the operating system itself. In many systems, the time the operating system spends on its own tasks may actually represent the lion's share of resource utilization. For example, most PCs spend the majority of their time in the operating system's idle process. Because of this limitation, most system accounting packages are not sufficient for analyzing a system's performance. Some newer operating systems provide users with many more tools for determining system resource utilization. For example, the Task Manager included with most Microsoft operating systems provides fairly good capabilities for monitoring resource use and system performance. This tool, however, is more closely related to software monitoring than to accounting software.

Software monitors use a collection of code fragments embedded in the operating system, or in an application program, to gather performance data. The monitoring software must be kept to a minimum so that its impact on the running system being measured is minimal. Software monitors are constructed using one of two main approaches: sampling or event tracing. In the sampling approach, monitor code fragments are invoked periodically to determine the status of the system. Each of the monitoring code fragments has a specific function. One may examine memory use, and another CPU or I/O activity. Using the retrieved information, the monitor can over time construct a fairly accurate image of the system's behavior. The problem with sampling is that some parameters or events of interest may be missed if they occur entirely between two sampling instants. The advantage of this approach, however, is its simplicity and lower system impact. By changing the sampling frequency, the load on the system can be increased or reduced as needed. The main design issue when developing a sampling scheme is to determine the appropriate sampling frequency and measurement points within the system.
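The sampling approach can be sketched in a few lines. The example below is a minimal illustration, not a production monitor: the class `SamplingMonitor` and the canned probe built by `make_queue_length_probe` are hypothetical names, and a real monitor would read actual system state rather than replaying fixed values.

```python
import time


class SamplingMonitor:
    """Periodically invokes probe functions and stores their readings."""

    def __init__(self, probes):
        # probes: dict mapping a measurement name to a zero-argument callable
        self.probes = probes
        self.samples = []

    def sample_once(self):
        """Invoke every probe once and record the readings with a timestamp."""
        record = {"t": time.perf_counter()}
        for name, probe in self.probes.items():
            record[name] = probe()
        self.samples.append(record)
        return record

    def run(self, period_s, count):
        """Take `count` samples, sleeping `period_s` seconds after each one."""
        for _ in range(count):
            self.sample_once()
            time.sleep(period_s)

    def mean(self, name):
        """Average of all recorded readings for one measurement."""
        values = [s[name] for s in self.samples]
        return sum(values) / len(values) if values else 0.0


def make_queue_length_probe(readings):
    """Hypothetical probe that replays canned queue lengths, for illustration."""
    it = iter(readings)
    return lambda: next(it)


monitor = SamplingMonitor({"queue_length": make_queue_length_probe([2, 4, 6])})
monitor.run(period_s=0.01, count=3)
print(monitor.mean("queue_length"))  # mean of the three sampled readings
```

The `period_s` argument is the tuning knob described above: a shorter period captures more detail but costs more probe invocations, while a longer period lowers overhead at the risk of missing short-lived activity.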

The event design approach for a software monitor requires that the designers of the monitor understand the events within the system with which they can synchronize monitoring. For example, CPU task switching is an important event, as are I/O operations and memory allocation and deallocation. In order for the events to be monitored, the operating system's code must be altered. The code must be adjusted so that when the event occurs, the required information can be extracted by the operating system and recorded to a file. For example, we may wish to record which process was allocated or deallocated memory, which process is acquiring the CPU, and the times associated with these events. The event files can then be processed at some later time to extract the performance measures for the system. If we can define all events of interest and provide hooks into the operating system's code to extract information about them, then we can construct a fairly good model of the system under study. Using event traces, one can determine the duration of every CPU service period and the resources consumed during each of these periods.
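To make the post-processing step concrete, the sketch below derives CPU service times from a recorded event trace. The trace format, a list of `(timestamp, event, pid)` tuples with the event names `"dispatch"` and `"release"`, is an assumption made for this example and not the format of any particular operating system.

```python
from collections import defaultdict


def cpu_service_times(trace):
    """Return {pid: [durations]} of CPU service periods found in the trace.

    trace: iterable of (timestamp, event, pid) sorted by timestamp, where
    event is "dispatch" (process acquires the CPU) or "release".
    """
    started = {}                # pid -> time it last acquired the CPU
    bursts = defaultdict(list)  # pid -> completed service durations
    for timestamp, event, pid in trace:
        if event == "dispatch":
            started[pid] = timestamp
        elif event == "release" and pid in started:
            bursts[pid].append(timestamp - started.pop(pid))
    return dict(bursts)


# Hypothetical trace: process 1 runs, then process 2, then process 1 again.
trace = [
    (0.0, "dispatch", 1),
    (4.0, "release", 1),
    (4.0, "dispatch", 2),
    (7.0, "release", 2),
    (7.0, "dispatch", 1),
    (9.0, "release", 1),
]
bursts = cpu_service_times(trace)
print(bursts)  # {1: [4.0, 2.0], 2: [3.0]}
```

A release with no matching dispatch is ignored here; a real trace processor would need a policy for such incomplete records, for example when tracing starts in the middle of a service period.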

If we do not have access to the operating system's code, then this approach is not feasible. One could instead augment application code and extract timings for that code. This would provide at least a measure of how long an application holds a resource and can be used as a means to assess system performance, if the application is designed appropriately. The problem with all these approaches is that they have their own impact on system performance. The monitoring software will consume resources and add delays to the performance measurements, possibly causing them to indicate erroneous values. Studies have shown that a software monitor can consume as much as 20 percent of the system's resources, making the performance results questionable. If we choose the types of events carefully and limit added code to the minimum required to capture information, the overhead can be reduced to approximately 5 percent. The tradeoff is fidelity of information versus the overhead of the measurement software.
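Augmenting application code can be as light as wrapping the region of interest in a timer. The sketch below is one possible way to do this in Python; the context manager `timed`, the list `timing_log`, and the workload `process_request` are hypothetical names introduced for illustration.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, log):
    """Append (label, elapsed_seconds) to log when the wrapped block finishes."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the wrapped code raises, so no timing is lost.
        log.append((label, time.perf_counter() - start))


def process_request(n):
    """Hypothetical application work whose duration we want to observe."""
    return sum(i * i for i in range(n))


timing_log = []
with timed("process_request", timing_log):
    result = process_request(10_000)

print(timing_log)
```

Keeping the probe to two clock reads and one list append follows the advice above: the less the added code does, the less it distorts the timings it reports.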

Besides their impact on system operation, software monitors have other problems. The trace method of data collection can produce large volumes of information to store and process, making it hard to use effectively. Software monitors also must be configured to fit a system's architecture, making them one-of-a-kind implementations. As a result, there are no commercially available general-purpose software monitor architectures. In addition, implementing software monitors requires significant expertise in operating system coding, which is not an everyday capability for most programmers. For these reasons, this technique is not used very often, and we are left to use the monitoring capabilities delivered with an operating system.

Hardware monitors provide another means to access performance information. A hardware monitor is composed of a collection of digital hardware connected to the system under measurement. For example, an oscilloscope is a general-purpose hardware monitoring device constructed to allow for the monitoring of hardware activities. Hardware monitors can range from a few gates to entire computer systems, including all the peripherals. Hardware monitors are readily available from commercial sources.

Hardware monitors must be connected in some way to our system in order to collect data, which are in the form of signals. The points to which we attach the hardware monitor are our test points or probe points. The test points are places in the computer system under examination that are accessible for measurement. For example, we may wish to probe the interrupt lines of the CPU so that we can determine when task switches are occurring. We may wish to examine specific memory locations to test when values cross some threshold or are altered. By attaching the monitor's test probes at these points, we can observe the system's behavior over some time frame. We can also use multiple test points in conjunction with each other, so that the detection of a condition at one test point triggers the capture of signals at another. The measurements are typically made without adding any overhead to the measured system, a distinct advantage over the software monitoring approach.
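The idea of using one test point to decide when to capture another can be illustrated in software. The sketch below simulates it on two recorded signal sequences; the function `capture_on_trigger` and the sample values are hypothetical, and a real hardware monitor would perform this comparison in logic rather than in code.

```python
def capture_on_trigger(trigger, data):
    """Return (index, value) pairs of data captured at each rising edge of trigger.

    trigger: sequence of 0/1 levels from one probe (e.g., an interrupt line)
    data:    sequence of values from a second probe, sampled at the same instants
    """
    captured = []
    previous = 0
    for index, (level, value) in enumerate(zip(trigger, data)):
        if level and not previous:  # 0 -> 1 transition on the trigger probe
            captured.append((index, value))
        previous = level
    return captured


# Hypothetical recordings from two probes, taken at the same sampling instants.
trigger = [0, 1, 1, 0, 0, 1, 0]
data = [10, 11, 12, 13, 14, 15, 16]
print(capture_on_trigger(trigger, data))  # [(1, 11), (5, 15)]
```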

A difficulty with hardware monitors is knowing when to use them and where to place the test points. For example, where do we measure to know whether a CPU is busy? How do we know whether it is busy with an operating system function or an application? Most system vendors have developed their components and systems with ready-to-use monitoring points to aid in system debugging and repair. This makes it relatively easy to determine where to place our test points. If such test points are not available, then hardware monitoring will be very difficult to implement.

The monitoring devices must be able to collect measured test point data and store these data for future processing and analysis. This is necessary so that we can determine the utilization of tested components within a system, such as the number of measured units that pass a point over some time frame. For example, we may want to know how many jobs are presented to the CPU for processing over some period of time, and what percentage of this time the CPU was busy or idle.
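Once the stored test point data are available, measures such as these reduce to simple arithmetic. The sketch below assumes, purely for illustration, that the monitor recorded a list of `(timestamp, level)` transitions on a CPU busy signal; the function names and the recorded values are hypothetical.

```python
def utilization(transitions, t_end):
    """Fraction of the observation window during which the signal was busy.

    transitions: sorted list of (timestamp, level), level 1 = busy, 0 = idle
    t_end:       time at which observation stopped
    """
    if not transitions:
        return 0.0
    busy = 0.0
    following = transitions[1:] + [(t_end, None)]
    for (t0, level), (t1, _) in zip(transitions, following):
        if level:
            busy += t1 - t0
    total = t_end - transitions[0][0]
    return busy / total if total > 0 else 0.0


def arrivals(transitions):
    """Count idle-to-busy transitions, i.e., jobs presented to the component."""
    levels = [level for _, level in transitions]
    return sum(1 for prev, cur in zip([0] + levels, levels) if cur and not prev)


# Hypothetical recording: busy from 0 to 3 and from 5 to 9, observed until 10.
recorded = [(0, 1), (3, 0), (5, 1), (9, 0)]
print(utilization(recorded, t_end=10))  # 0.7
print(arrivals(recorded) / 10)          # 0.2 jobs per time unit
```

Treating each idle-to-busy transition as one job is a simplification; back-to-back jobs that never let the signal drop would be counted as a single arrival.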

The limitation with hardware monitoring is that we can only measure hardware signals and possibly the contents of registers or memory (if allowed). We typically will not know what the operating system is specifically doing at the point we are measuring. Due to this limitation, hardware monitors are usually used in conjunction with some form of event trace software in order to allow for later interpretation of hardware operations.
