Understanding How System Monitor Works

Although a full discussion of how System Monitor works is beyond the scope of this chapter, this section covers some of its basic concepts and briefly describes how System Monitor works. Because the bulk of your performance-tuning activities will involve the Windows 2000 operating system, our discussion will focus on monitoring Windows 2000.

Performance Monitoring Concepts

Before beginning our discussion, we first need to briefly cover some basic concepts and terms. One thing you'll notice right off the bat is that we are using the terms "performance monitoring" and "System Monitor." Performance monitoring is the activity of gathering measurements and data from individual counters that show how a server is performing its activities. System Monitor is the snap-in utility in MMC that is used to gather this data.

More specifically, performance monitoring looks at how the Windows 2000 operating system and installed applications use the resources of the system. The four main subsystems that are monitored are the disks, memory, processors, and network components. Later in this chapter, we will look at each of these components and highlight some important counters and measurements for them. In connection with performance monitoring, we need to discuss four concepts: throughput, queue, bottleneck, and response time.

MORE INFO
If you need a more detailed discussion of System Monitor, consult Chapters 5 through 10 of the Windows 2000 Server Operating Guide, one of the volumes in the Microsoft Windows 2000 Server Resource Kit (Microsoft Press, 2000).

Throughput

Throughput is a measurement of the amount of work done in a given unit of time. Most often, we think of throughput as the amount of data that can be transmitted from one point to another in a given time period. However, the concept of throughput is also applied to data movement within a computer. Throughput can either increase or decrease. As the load, which represents the amount of data that the system is attempting to transmit, increases, throughput rises until it reaches the point at which no additional data can be transmitted. This is called the peak level. If the load begins to decrease, which means that less and less data needs to be transmitted, the throughput will also fall.

When data is being sent from one point to another, or in any end-to-end system, the throughput depends on how each component along the path performs. The slowest point in the overall data path sets the throughput for the entire path. If the slowest point is too slow (which is defined differently in each situation), and a queue begins to develop, that point is referred to as a bottleneck, a concept we'll discuss in more detail in just a moment. Oftentimes, the resource that shows the highest use is the bottleneck, and a bottleneck is often the result of an overconsumption of that resource.

Generally, we do not define a heavily used resource in data transmission as a bottleneck unless a queue is also developing for the resource. For instance, if a router is being heavily used but shows little or no queue length, it is not thought of as a bottleneck. On the other hand, if that router develops a long queue (which is defined differently in each situation for each router), it could be said to be a bottleneck.
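The two ideas above, that the slowest point sets the throughput of the whole path and that a busy resource counts as a bottleneck only when a queue is also developing, can be sketched in a few lines of Python. This is only an illustration: the stage names, the rates, and both thresholds are hypothetical values chosen for the example, because "too slow" and "long queue" are defined differently in each situation.

```python
def path_throughput(stage_rates):
    """Return (slowest_stage, rate) for a path of named stages.

    stage_rates maps a stage name to the amount of work it can handle
    per second. The end-to-end throughput cannot exceed the slowest stage.
    """
    slowest = min(stage_rates, key=stage_rates.get)
    return slowest, stage_rates[slowest]


def is_bottleneck(utilization, queue_length, busy_threshold, queue_threshold):
    """Heavy use alone is not enough; a queue must also be developing.

    Both thresholds are assumptions that must be chosen per situation.
    """
    return utilization >= busy_threshold and queue_length >= queue_threshold


# Hypothetical path: megabytes per second that each stage can move.
stages = {"network card": 100.0, "router": 40.0, "disk": 60.0}
slowest_stage, end_to_end_rate = path_throughput(stages)

# A router that is 95 percent busy but has no queue is not a bottleneck.
busy_without_queue = is_bottleneck(0.95, 0, busy_threshold=0.80, queue_threshold=5)
# The same router with 12 waiting requests is.
busy_with_queue = is_bottleneck(0.95, 12, busy_threshold=0.80, queue_threshold=5)
```

Here the router is the slowest stage, so the path as a whole moves no more than 40 units per second, no matter how fast the other stages are.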

Queue

A queue is a place where a request for a service sits until it can be processed. For instance, when a file needs to be written to a disk, the request to write that file is first placed in the queue for the disk. The driver for the disk then reads the information out of the queue and writes that information to the disk. Long queues are rarely considered a good thing.

Queues develop under various circumstances. When requests for a service arrive at a rate faster than the resource's throughput, or if certain requests take a long time to fulfill, queues can develop. When a queue becomes long, the work is not being handled efficiently. Windows 2000 reports queue development on disks, processors, server work queues, and server message block (SMB) calls of the server service.
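To see why a queue develops when requests arrive faster than the resource can service them, consider the following simplified model. It assumes a fixed number of arrivals and a fixed service capacity per time interval, which real workloads rarely have, and the numbers are invented for the illustration.

```python
def simulate_queue(arrivals_per_tick, serviced_per_tick, ticks):
    """Return the queue length at the end of each tick.

    Simplified model (an assumption, not how Windows 2000 measures queues):
    every tick, new requests join the queue and the resource removes as
    many as it can service. The queue can never be shorter than zero.
    """
    queue_length = 0
    history = []
    for _ in range(ticks):
        queue_length = max(0, queue_length + arrivals_per_tick - serviced_per_tick)
        history.append(queue_length)
    return history


# Requests arrive faster than they are serviced: the queue keeps growing.
overloaded = simulate_queue(arrivals_per_tick=12, serviced_per_tick=10, ticks=5)

# Requests arrive more slowly than they are serviced: no queue develops.
keeping_up = simulate_queue(arrivals_per_tick=8, serviced_per_tick=10, ticks=5)
```

In the overloaded case the queue grows by two requests every tick, which is the pattern to look for in queue-length counters.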

Response Time

Response time is the amount of time required to perform a unit of work from start to finish. Generally speaking, response time increases as stress on the resource increases. It can be measured by dividing the queue length for a given resource by the resource throughput. By using the new trace log feature in Windows 2000, you can track a unit of work from start to finish to determine its response time.
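The division described above can be written out as a short function. This is a rough estimate rather than a measurement, and the queue length and throughput values are hypothetical.

```python
def estimated_response_time(queue_length, throughput_per_second):
    """Estimate response time in seconds as queue length / throughput."""
    if throughput_per_second <= 0:
        raise ValueError("throughput must be greater than zero")
    return queue_length / throughput_per_second


# Hypothetical disk: 20 requests waiting, 50 requests serviced per second.
seconds = estimated_response_time(queue_length=20, throughput_per_second=50)
```

With these numbers the estimate is 0.4 second. If the queue doubles to 40 requests while throughput stays the same, the estimate doubles as well, which matches the observation that response time increases as stress on the resource increases.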

Bottleneck

As we mentioned earlier, a bottleneck represents overconsumption of a resource. You will experience this as a slow response time, but you should think of it as overconsumption. Finding bottlenecks is a key goal in performance tuning because eliminating bottlenecks makes your system run more efficiently. Moreover, if you can predict when a bottleneck will occur, you can do much to proactively solve a problem before it affects your users. Factors that contribute to bottlenecks are the number of requests for the services of a resource, the frequency with which those requests occur, and the duration of each request.

Collecting Data with System Monitor

Before you can properly tune your Exchange 2000 server, you must first collect data that shows how the server is presently running. Data collection involves three distinct elements: objects, counters, and instances. An object is any resource, application, or service that can be monitored and measured. You will select various objects for which you want to collect data.

Each object has multiple counters that measure various aspects of the object. Examples include the number of packets that a network card has sent or received in a given time period or the amount of time the processor has spent processing kernel-mode threads. The counters are where the data is actually measured and collected.

Finally, a counter might have multiple instances. The most common use of multiple instances is to monitor multiple processors on a server or multiple network cards. For example, if a server has two processors, you can either measure the amount of time each processor is spending processing nonidle threads or you can measure the two processors as one unit, and look at the average. Instances allow greater granularity in measuring performance. It is important to note that not all object types support multiple instances.
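Objects, counters, and instances come together in the counter path that Windows performance tools use to identify what is being measured, in the general form \\Computer\Object(Instance)\Counter. The helper below is a hypothetical sketch that only assembles such a string; it does not query any data. The computer name SERVER1 is a placeholder invented for the example.

```python
def counter_path(obj, counter, instance=None, computer=None):
    """Build a counter path string from its parts.

    obj       the performance object, for example "Processor"
    counter   the counter name, for example "% Processor Time"
    instance  optional instance, for example "0" or "_Total"
    computer  optional computer name; omitted for the local computer
    """
    parts = []
    if computer is not None:
        parts.append("\\\\" + computer)  # two leading backslashes
    instance_part = "" if instance is None else "(" + instance + ")"
    parts.append("\\" + obj + instance_part)
    parts.append("\\" + counter)
    return "".join(parts)


# Each processor measured separately, then both measured as one unit.
first_cpu = counter_path("Processor", "% Processor Time", instance="0")
second_cpu = counter_path("Processor", "% Processor Time", instance="1")
all_cpus = counter_path("Processor", "% Processor Time", instance="_Total")

# The same counter on a remote computer (placeholder name).
remote = counter_path("Processor", "% Processor Time", "_Total", "SERVER1")
```

The _Total instance corresponds to treating the two processors as one unit, while instances 0 and 1 give the finer granularity described above.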

Each counter is assigned a counter type, which determines how the counter data is calculated, averaged, and displayed. In general, counters can be categorized according to their generic type, as outlined in Table 26-1. System Monitor supports more than 30 counter types. However, many of these types are not implemented in Windows 2000 and so are not listed in the table.

Table 26-1. Generic counter types

Counter Type     Description
Average          Measures a value over time and displays the average of the last two measurements.
Difference       Subtracts the last measurement from the previous measurement and displays the difference, if the result is a positive number. If the result is negative, the display is zero.
Instantaneous    Displays the most recent measurement.
Percentage       Displays the result as a percentage.
Rate             Samples an increasing count of events over time and divides the change in count values by the change in time to display a rate of activity.
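The arithmetic behind the generic types in Table 26-1 can be sketched as follows. These functions are simplified illustrations, not the formulas that System Monitor actually uses for any specific counter type. In particular, the difference function assumes that the value of interest is the change from the previous sample to the most recent one.

```python
def instantaneous(samples):
    """Display the most recent measurement."""
    return samples[-1]


def average(samples):
    """Average of the last two measurements."""
    return (samples[-2] + samples[-1]) / 2


def difference(samples):
    """Change between the last two measurements, never below zero.

    Assumption: computed as most recent sample minus previous sample.
    """
    return max(0, samples[-1] - samples[-2])


def percentage(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


def rate(count_prev, count_last, time_prev, time_last):
    """Change in an increasing event count divided by the change in time."""
    return (count_last - count_prev) / (time_last - time_prev)


# Hypothetical samples: an event count of 100 at 0 seconds and 160 at 2 seconds.
events_per_second = rate(100, 160, 0.0, 2.0)
```

In this example the counter would display 30 events per second, because 60 new events were counted over 2 seconds.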

MORE INFO
For more information on each counter type—its name, its description, and how the formulas are calculated—consult the Windows 2000 Performance Counters Reference, one of the reference files installed with the Microsoft Windows 2000 Server Resource Kit (Microsoft Press, 2000).

Viewing Collected Data

When you first open System Monitor, you see a blank screen called a chart view, which displays selected counters in real time as a graph (Figure 26-1). To see data displayed in the chart, you have to add some counters. Choose Add from the toolbar to open the Add Counters dialog box (Figure 26-2).

Figure 26-1. System Monitor chart view.

Figure 26-2. Add Counters dialog box.

By default, the computer that you monitor is the computer on which you launched System Monitor, but you can monitor remote computers as well. In fact, you can select different counters from multiple computers at the same time. You might do so, for instance, to monitor how a distributed application is running. You can also choose to monitor the same counter on multiple computers for comparative purposes. For instance, in Figure 26-3, we've chosen to monitor the same counter on four servers—Indianapolis, Minneapolis, Tucson, and Folsom—while installing Microsoft Office 2000 on Folsom from the source files on Indianapolis. The two graph lines showing higher levels of processor activity represent the Indianapolis and Folsom servers.

Figure 26-3. Monitoring the same counter on four servers.

As you can see in the figure, you can also view numeric values for the selected counter by selecting a counter from the list at the bottom and reading the values just under the graph. You can obtain the last, average, minimum, and maximum values for each counter selected. In this case, the Folsom server's processor activity was averaging a bit over 14 percent for a period of 1 minute and 40 seconds.
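The four values shown under the graph are simple statistics over the samples collected so far. The sketch below shows how they relate to one another; the sample values are invented for the illustration and are not the data behind Figure 26-3.

```python
def summarize(samples):
    """Return the last, average, minimum, and maximum of the samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    return {
        "last": samples[-1],
        "average": sum(samples) / len(samples),
        "minimum": min(samples),
        "maximum": max(samples),
    }


# Hypothetical processor-activity samples, in percent.
summary = summarize([10, 20, 12, 14])
```

If we assume one sample per second, a period of 1 minute and 40 seconds corresponds to 100 samples, so the average in that case would be taken over 100 values rather than the four used here.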



Microsoft Exchange 2000 Server Administrator's Companion
Year: 1999
Pages: 193
