Technical and business administrators can use instrumentation in different ways to make operational and strategic management decisions. There are two primary instrumentation modes: activating trip wires and taking time slices. A combination of trip wires and time-sliced measurements is used for supporting operational and strategic service management tasks.

Figure 4-2 shows the use of trip wires, which can generate real-time alerts by comparing a behavior to a static threshold value or by tracking deviations from a normal behavioral envelope. Time slices are repetitive measurements of the same variables over time.

Figure 4-2. Trip Wires

NOTE
Trip wires and time slices are used for real-time alerts; time slices also help with longer-term functions, such as planning.

Trip Wires

Trip wires provide simple real-time alerting for operational decision making. Management tools compare the collected information to established thresholds. An alert is sent to a management application when the value is higher or lower than the established threshold. Further processing of the alert determines whether it is a valid problem, whom to notify, and which tools to activate.

A series of thresholds can be established. Consider an SLA requiring response times of five seconds or less. A warning level (2.5 seconds, for instance) gives administrators ample time to investigate a performance shift and take appropriate action. A three-second threshold denotes a performance level that is getting closer to unacceptable values, and a four-second threshold is used to bring an urgent response from the management system.

Determining when a trip wire should be triggered is fairly simple. However, the simplicity introduces difficulties because a threshold is usually a static value and the environment is dynamic. There may be peaks and valleys of activity, and a threshold set too low will trigger a rash of alerts that do not really indicate a problem.
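The series of thresholds in the SLA example can be illustrated with a minimal sketch. The threshold values (2.5, 3, and 4 seconds) come from the example; the level names, the function name `classify_response_time`, and the choice to trigger at or above each value are assumptions made for illustration only.

```python
from typing import Optional

# Threshold values (in seconds) from the SLA example in the text.
# The level names are hypothetical labels, not terms from the source.
THRESHOLDS = [
    (4.0, "urgent"),
    (3.0, "elevated"),
    (2.5, "warning"),
]


def classify_response_time(seconds: float) -> Optional[str]:
    """Return the most severe level whose threshold is reached, or None.

    Assumption: a trip wire fires when the measurement is at or above
    the threshold value.
    """
    # THRESHOLDS is ordered from most to least severe, so the first
    # match is the highest level that applies.
    for limit, level in THRESHOLDS:
        if seconds >= limit:
            return level
    return None
```

A measurement of 3.2 seconds would be classified as "elevated" here, while 2.0 seconds trips no wire at all. Because the values are static, the same function will over-alert or under-alert as the normal load shifts.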
Raising the threshold reduces the alert volumes when the normal load is high, but introduces the risk of missing situations when normal volumes are lighter.

One key for effective instrumentation is selecting realistic thresholds to ensure accurate warnings. Many product specifications are based on a set of optimum conditions, and actual performance can be quite different. Realistic load testing is a practical means for determining accurate threshold values. Load testing is discussed in Chapter 11, "Load Testing."

Time Slices

Time slices are repetitive measurements of the same variables over longer time intervals. They track changes in normal behavior over an extended period of time. Baselines are an example of a time-sliced measurement. Baselines are also used as trip wires because they provide a more accurate assessment of dynamic behavior.

Repetitive measurements are used to set the initial baseline for normal behavior as an envelope with high, low, and average values. Statistical techniques such as those mentioned in Chapter 2, "Service Level Management," can be used to set the baseline values. A baseline approach sends an alert whenever measurements fall outside the normal envelope. Current measures are compared to the baseline and deviations can reveal conditions such as the following:
Baselines are most effective when the environment is stable long enough to take the measurements and make the calculations. Baselines must be recalculated as normal loads grow or newly added services alter the environment.

Time slices require consistent measurements over time and some processing to determine the trends. Trends revealed with time-sliced measurements are used for longer-term planning and optimization functions.
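The baseline envelope described in this section can be sketched as follows. This is a minimal illustration, assuming the envelope is simply the observed minimum, maximum, and mean of a set of repeated measurements; the names `Baseline`, `build_baseline`, and `outside_envelope` are hypothetical, and a real deployment would more likely use the statistical techniques referenced in Chapter 2.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass(frozen=True)
class Baseline:
    """Normal-behavior envelope built from repeated measurements."""
    low: float
    high: float
    average: float


def build_baseline(samples: Sequence[float]) -> Baseline:
    """Derive the envelope from time-sliced samples.

    Assumption: observed min, max, and mean stand in for the
    statistically derived values a production tool would use.
    """
    if not samples:
        raise ValueError("at least one measurement is required")
    return Baseline(low=min(samples), high=max(samples), average=mean(samples))


def outside_envelope(baseline: Baseline, value: float) -> bool:
    """Return True when a current measurement falls outside the envelope."""
    return value < baseline.low or value > baseline.high
```

Calling `build_baseline` again on a fresh window of samples corresponds to recalculating the baseline as normal loads grow or new services are added.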