Evaluating the Four Main Subsystems in Windows 2000

Earlier we mentioned the four subsystems that you should always monitor: memory, processor, disk, and network. In this section, we'll briefly discuss each element and offer some advice on tuning these parts of Windows 2000 to optimize their work with Exchange 2000 Server.

One point that applies to all four of these areas is this: Current data is not that helpful unless you have a baseline against which to compare it. This argues in favor of setting up regular monitoring schedules for all of your servers and then regularly compiling that data to form a baseline of how your servers operate at off-peak, normal, and peak periods of usage. As an example, if you find that one server is averaging 53 pages per minute, that number won't mean much unless you know the period of time that the average represents and whether it depicts abnormal behavior or is an expected result. The only way to know this comparative information is to have conducted regular monitoring of the server.
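As a minimal sketch of the baseline idea, the following Python fragment summarizes historical readings by usage period and then checks whether a new reading departs from the expected value. The function names, the sample readings, and the 25 percent tolerance are all assumptions made for illustration; they are not part of System Monitor or any Microsoft tool.

```python
from statistics import mean


def build_baseline(samples):
    """Summarize historical samples per usage period.

    `samples` maps a period label (for example "off-peak", "normal", "peak")
    to a list of counter readings collected during that period.
    """
    return {
        period: {"mean": mean(values), "max": max(values)}
        for period, values in samples.items()
    }


def compare_to_baseline(baseline, period, current, tolerance=0.25):
    """Return True when `current` exceeds the period's mean by more than `tolerance`."""
    expected = baseline[period]["mean"]
    return current > expected * (1 + tolerance)


# Hypothetical pages-per-minute readings, for illustration only.
history = {
    "off-peak": [10, 12, 11],
    "normal": [30, 35, 31],
    "peak": [50, 55, 54],
}
baseline = build_baseline(history)
```

With this invented history, a reading of 53 pages per minute is unremarkable during the peak period but stands out during the off-peak period, which is exactly why the number means little without a baseline.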

Evaluating Memory Usage

Use the counters in Table 26-2 to set up a baseline for your system's memory. When you're monitoring these counters, you will see occasional spikes that you can exclude from your baseline because these short-term values will not be representative of your servers. However, do not ignore these spikes if they are occurring with increasing frequency. This increase could indicate that a resource is becoming too heavily utilized.
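One way to tell occasional spikes from spikes that are becoming more frequent is to count them per monitoring window and compare the counts over time. The sketch below assumes you have exported counter readings to a simple list; the function names, the threshold, and the window size are hypothetical choices, not recommended values.

```python
def spike_counts(samples, threshold, window):
    """Count readings above `threshold` in each consecutive window of `window` samples."""
    counts = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        counts.append(sum(1 for value in chunk if value > threshold))
    return counts


def spikes_increasing(counts):
    """Return True when spike counts never fall from one window to the next
    and the last window has more spikes than the first."""
    never_falls = all(a <= b for a, b in zip(counts, counts[1:]))
    return len(counts) > 1 and never_falls and counts[-1] > counts[0]
```

A result of True from spikes_increasing is only a prompt to look more closely at the resource; it does not by itself prove that the resource is overutilized.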

Table 26-2. Essential memory counters

Counter Name Description
Memory\Pages/Sec Shows the rate at which pages are read from or written to the disk to resolve hard page faults. This counter is a primary indicator of the type of page faults that can significantly slow down your system. It is the sum of Memory\Pages Input/Sec and Memory\Pages Output/Sec. Microsoft recommends keeping this value below 20.
Memory\Available Bytes Shows the amount of physical memory, in bytes, available to processes running on the computer. Microsoft recommends keeping this value above 4000 KB.
Paging File(_Total)\% Usage Shows the amount of the paging file in use during the sample interval, as a percentage. A high value indicates that you may need to increase the size of your Pagefile.sys file or add more RAM. Microsoft recommends keeping this value below 75 percent.
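
The guideline values in Table 26-2 can be applied to a set of readings with a few comparisons. The following sketch is illustrative only: the function name and message wording are hypothetical, and it assumes that 1 KB equals 1024 bytes when converting the 4000 KB guideline to bytes.

```python
# Guideline values quoted in Table 26-2.
MAX_PAGES_PER_SEC = 20
MIN_AVAILABLE_KB = 4000
MAX_PAGEFILE_USAGE_PCT = 75


def check_memory(pages_per_sec, available_bytes, pagefile_usage_pct):
    """Return a list of warnings for readings outside the Table 26-2 guidelines."""
    warnings = []
    if pages_per_sec >= MAX_PAGES_PER_SEC:
        warnings.append("Memory\\Pages/Sec is not below 20")
    # Assumption: 1 KB = 1024 bytes.
    if available_bytes <= MIN_AVAILABLE_KB * 1024:
        warnings.append("Memory\\Available Bytes is not above 4000 KB")
    if pagefile_usage_pct >= MAX_PAGEFILE_USAGE_PCT:
        warnings.append("Paging File(_Total)\\% Usage is not below 75 percent")
    return warnings
```

Treat the warnings as prompts to compare against your baseline rather than as proof of a problem.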

Recall from Chapter 2 that the Extensible Storage Engine (ESE) automatically checks the system's performance and allocates to itself all of the available memory that it anticipates it will need. This allocation means that when you monitor the Memory\Available Bytes counter, it may hover around 4000 KB, even if there isn't much activity on the server. In addition, you'll find that the Store.exe process allocates a large amount of memory to itself. This action is by design and does not represent a memory leak or a memory bottleneck.

You'll also recall that the DSAccess cache automatically allocates 4 MB to itself (set in the registry) for holding LDAP lookup results from the Global Catalog server. Its size can be monitored with the MSExchangeDSAccess Caches\Total Memory Size counter.

To see memory allocations on a per-process basis, use the Memsnap or Process Monitor tool found under Windows 2000 Support Tools on your Windows 2000 CD-ROM. The Memsnap tool records system memory usage to a log file for later review. It gives just a snapshot of your memory usage, not an ongoing logging of how each process is using memory. Figure 26-4 illustrates what the log file looks like.

Figure 26-4. Memsnap log file.

The Process Monitor (Figure 26-5) provides total and per-process values for nonpaged and paged pool memory. To look for a memory leak, watch the committed memory values in the Pmon display over time: a process with a memory leak should report a steadily increasing value under Commit Charge.

Figure 26-5. Process Monitor output in a command-prompt window.
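
If you record the Commit Charge value for each process at regular intervals, the leak pattern described above (a value that keeps increasing) can be checked mechanically. This is a rough sketch under stated assumptions: the readings are typed in or parsed by you, the process names are invented, and requiring at least five readings is an arbitrary choice. It does not read Pmon or Memsnap output.

```python
def looks_like_leak(commit_samples, min_samples=5):
    """Return True when commit charge never drops and ends higher than it started."""
    if len(commit_samples) < min_samples:
        return False
    never_drops = all(a <= b for a, b in zip(commit_samples, commit_samples[1:]))
    return never_drops and commit_samples[-1] > commit_samples[0]


def suspect_processes(snapshots):
    """Return the names of processes whose readings look like a leak.

    `snapshots` maps a process name to its commit charge readings, oldest first.
    """
    return sorted(name for name, series in snapshots.items() if looks_like_leak(series))
```

Note that a process holding a large but steady amount of memory, as Store.exe does by design, is not flagged by this check.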

Evaluating Processor Usage

Use the counters listed in Table 26-3 to set up a baseline for your processor usage. The processor always has a thread to run. When no active thread is ready, the system supplies an idle thread for the processor to run while it waits. The Processor\% Processor Time counter does not factor in the idle thread when calculating its value.

Table 26-3. Essential processor counters

Counter Name Description
Processor\% Processor Time Shows the percentage of elapsed time that the processor spends executing non-idle threads. An instruction is the basic unit of execution in a computer, a thread is the object that executes instructions, and a process is the object created when a program is run. Microsoft recommends keeping this value at 80 percent or below (sustained).
System\Processor Queue Length Shows the number of threads in the processor queue. There is a single queue for processor time, even on computers with multiple processors. This counter shows ready threads only, not threads that are currently running. Microsoft recommends keeping this value to 2 or less.
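
Because the guidelines in Table 26-3 concern sustained values rather than momentary peaks, it helps to look for runs of consecutive samples above the limit. The sketch below is one possible interpretation: treating three consecutive samples as "sustained" is our assumption, and the function names are hypothetical.

```python
def longest_run_above(samples, limit):
    """Length of the longest run of consecutive samples strictly above `limit`."""
    longest = current = 0
    for value in samples:
        current = current + 1 if value > limit else 0
        longest = max(longest, current)
    return longest


def processor_pressure(cpu_pct, queue_len, min_run=3):
    """Flag sustained readings above the Table 26-3 guidelines (80 percent, queue of 2)."""
    return {
        "cpu_sustained": longest_run_above(cpu_pct, 80) >= min_run,
        "queue_sustained": longest_run_above(queue_len, 2) >= min_run,
    }
```

Choose min_run to match your sampling interval; three samples at a 15-second interval and three samples at a 15-minute interval describe very different situations.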

The most common causes of processor bottlenecks are insufficient memory and excessive numbers of interrupts from disk or network I/O components. During periods of low activity, the only source of processor interrupts might be the processor's timer ticks. Timer ticks increment the processor's timer. These interrupts occur every 10 to 15 milliseconds, or about 70 to 100 times per second. Use the Processor(_Total)\Interrupts/Sec counter to measure this value. The normal range is in the thousands of interrupts per second for a Windows 2000 server and can vary from processor to processor. Installing a new application may cause a dramatic rise in this value.
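
The timer-tick figures above follow from simple arithmetic, which the following sketch spells out. It also shows a rough way to estimate how much of an Interrupts/Sec reading is not explained by timer ticks. Both helper names are hypothetical, and the default 10-millisecond interval is only an example; the actual interval depends on the hardware.

```python
def ticks_per_second(interval_ms):
    """Number of timer interrupts per second for a tick interval in milliseconds."""
    return 1000 / interval_ms


def non_timer_interrupts(interrupts_per_sec, interval_ms=10):
    """Rough share of Interrupts/Sec that is not explained by timer ticks."""
    return max(0.0, interrupts_per_sec - ticks_per_second(interval_ms))
```

A 10-millisecond tick yields 100 interrupts per second, and a 15-millisecond tick yields about 67, which is the lower figure rounded to "about 70" above.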

If you want to improve processor response time or throughput, you can schedule processor-intensive applications to run at a time when system stress is usually low. Use the Scheduled Tasks tool in Control Panel to do this. You can also upgrade to a faster processor with a larger L2 cache, which generally improves system performance for processor-bound workloads, or use multiple processors instead of a single processor to balance the processing load.

Two resource kit tools that will help you understand and experiment with processor performance are CPU Stress (Cpustres.exe), which simulates processor workload, and QuickSlice (Qslice.exe), which provides a graphical display of processor usage by a process.

The CPU Stress utility (Figure 26-6) allows you to run up to four threads that endlessly loop. These threads can be run at different priorities and levels of activity. You can use this tool to test how well your processor will sustain a utilization of over 80 percent. Select a number of threads to run, a Thread Priority and Activity level for each, and then use System Monitor to check the processor utilization. You can then test normal server usage to see if it is acceptable while these test threads are running.

The QuickSlice tool (Figure 26-7) gives a graphical representation of how much of the CPU each process is using at any given instant. This tool has no logging feature; it provides only real-time information. Use this tool to see quickly which process is monopolizing the CPU.

Figure 26-6. CPU Stress resource kit tool.

Figure 26-7. QuickSlice resource kit tool.

Evaluating Disk Usage

Windows 2000 includes counters that monitor the activity of the physical disk and logical volumes. The PhysicalDisk object provides counters that report physical-disk activity, while the LogicalDisk object provides counters that report statistics for logical disks and storage volumes. By default, the Windows 2000 operating system activates only the PhysicalDisk performance counters. To activate the LogicalDisk counters, go to the command prompt and type diskperf -yv. The counters will be activated when you reboot your server.

MORE INFO
For more information on the various switches used with the Diskperf command, see Chapter 8 of the Windows 2000 Server Operations Guide, one of the volumes in the Microsoft Windows 2000 Server Resource Kit (Microsoft Press, 2000).

Table 26-4 lists the counters for evaluating disk performance. They are the same for both the LogicalDisk and PhysicalDisk objects. We've chosen to use the PhysicalDisk object in the table.

Table 26-4. Essential disk counters

Counter Name Description
PhysicalDisk\Avg. Disk Sec/Transfer Indicates how fast data is being moved, in seconds. A high value might mean that the system is retrying requests due to lengthy queuing or, less commonly, a disk failure. There are no benchmark recommendations from Microsoft. Watch for significant variances from baseline data.
PhysicalDisk\Avg. Disk Queue Length Shows the number of requests that are queued and waiting for the disk to process. Microsoft recommends that this value be 2 or less.
PhysicalDisk\Disk Bytes/Sec Indicates the rate at which bytes are transferred. It is the primary measurement of disk throughput.
PhysicalDisk\Disk Transfers/Sec Shows the number of completed read and write operations per second. This counter reports a rate rather than a percentage, so use it as a measure of disk utilization by comparing it with the transfer rate the disk is rated to sustain. Values over 50 percent of that rate might indicate that the disk is becoming a bottleneck.

Diagnosing a disk as a bottleneck is a tricky process that requires both time and experience. We give some helpful tips here, but for a fuller discussion of this topic, please see the works cited earlier in this chapter.

What you want to see in order to diagnose a disk as a bottleneck in your system is either a sustained rate of disk activity that is well above your baseline or an increasing rate of disk activity that represents a dramatic departure from your baseline statistics. In addition, you'll want to see persistent disk queues that are either steadily increasing or that are significantly above your baseline statistics, coupled with the absence of a significant amount of paging (less than 20 pages per second). If these factors combine in any other way than those described here, it is unlikely that your disk is a bottleneck. For example, if your system doesn't have enough RAM to accommodate its load, you will find that paging occurs more frequently, creating unnecessary disk activity. If you monitor only the PhysicalDisk object, you might see this activity as evidence that your disk is a bottleneck. Therefore, you must also monitor memory counters to determine the real source of this type of problem.
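
The reasoning in the preceding paragraph can be expressed as a small decision function. This is a simplified sketch, not a diagnostic tool: the function name, the return labels, and the 1.5 multiplier used for "well above baseline" are our assumptions, and averaging the samples glosses over the distinction between sustained and steadily increasing activity.

```python
from statistics import mean


def diagnose_disk(disk_bytes, queue_len, pages_per_sec,
                  baseline_bytes, baseline_queue,
                  margin=1.5, paging_limit=20):
    """Classify averaged readings as 'check-memory', 'disk-suspect', or 'no-bottleneck'."""
    # Heavy paging generates disk activity of its own, so rule out memory first.
    if mean(pages_per_sec) >= paging_limit:
        return "check-memory"
    busy = mean(disk_bytes) > baseline_bytes * margin
    queued = mean(queue_len) > baseline_queue * margin
    # Only high activity and long queues together point at the disk.
    if busy and queued:
        return "disk-suspect"
    return "no-bottleneck"
```

The order of the tests matters: paging is checked before disk activity so that a memory shortage is not mistaken for a slow disk.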

If you do determine that your disk is too slow, consider taking one or more of the following steps.

  1. Rule out a memory shortage, for the reasons just discussed.
  2. Defragment the disk, using Disk Defragmenter. For information about using Disk Defragmenter, see the online help for Windows 2000.
  3. Use Diskpar.exe from the Microsoft Windows 2000 Server Resource Kit to reduce performance loss due to misaligned disk tracks and sectors.
  4. Consider implementing a stripe set to process I/O requests concurrently over multiple disks. If you need data integrity, implement a stripe set with parity.
  5. Place multiple drives on different I/O buses.
  6. Limit the use of file compression or encryption.
  7. Be sure you're using the best and fastest controller, disk, and I/O bus that you can realistically afford.

Evaluating Network Usage

Windows 2000 provides two utilities for monitoring network performance: System Monitor and Network Monitor. We will not discuss Network Monitor here. For more information on Network Monitor, see the resource kit books cited earlier in this chapter.

You should monitor other resources, such as disk, memory, and processor objects, along with network objects to obtain an overall perspective on the network objects' results. In addition, you can select which layer of the Open Systems Interconnection (OSI) model you want to monitor. Table 26-5 summarizes the counters and their corresponding OSI layer.

MORE INFO
For more information on the OSI model, see Appendix A in the TCP/IP Core Networking Guide, part of the Microsoft Windows 2000 Server Resource Kit (Microsoft Press, 2000).

Table 26-5. Essential network counters and their OSI layer

Counter Name Description OSI Layer
Network Interface\Output Queue Length Indicates the length of the output packet queue. A queue length of 1 or 2 is often satisfactory. Longer queues indicate that the adapter is waiting for the network and thus cannot keep pace with the server. Physical
Network Interface\Packets Outbound Discarded A high value indicates that the network segment is saturated. An increasing value means that the network buffers cannot keep pace with the outbound flow of packets. Physical
Network Interface\Bytes Total/Sec A high value indicates a large number of successful transmissions. Physical
Network Segment\Broadcast Frames Received/Sec You'll need to develop a baseline for this counter and then compare subsequent measurements against it. Since every computer processes every broadcast, frequent broadcasts mean lower overall performance. Physical
Network Segment\% Network Utilization Reflects the percentage of network bandwidth used for the local network segment. A lower value is preferred. For an unswitched Ethernet network, a value under 30 percent is best. Collisions will become a problem at 40 percent. Physical
IP\Datagrams/Sec Shows the rate at which datagrams are received from or sent to each interface. Network
TCP\Segments Received/Sec Shows the rate at which segments are received, including those received in error. This count includes segments received on currently established connections. A low value means that you have too much broadcast traffic. Transport
TCP\Segments Retransmitted/Sec Gives the rate at which segments containing one or more previously transmitted bytes are retransmitted. A high value might indicate either a saturated network or a hardware problem. Transport
Redirector\Network Errors/Sec Measures serious network errors that indicate the Redirector and one or more servers are having serious communication problems. Application
Server\Pool Paged Failures Indicates the number of times that allocations from the paged pool have failed. If this number is high, either the amount of RAM is too little or the pagefile is too small or both. If this number is consistently increasing, increase the physical RAM and the size of the pagefile. Application
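
Two of the guidelines in Table 26-5 are specific enough to encode directly. The sketch below classifies a % Network Utilization reading for an unswitched Ethernet segment and checks an Output Queue Length reading. The function names and the intermediate "watch" band between 30 and 40 percent are our own labels, added for illustration.

```python
def classify_utilization(pct):
    """Classify % Network Utilization for an unswitched Ethernet segment."""
    if pct < 30:
        return "good"
    if pct < 40:
        return "watch"  # assumed band between the two figures in Table 26-5
    return "collisions likely"


def output_queue_ok(queue_len):
    """A queue length of 1 or 2 is often satisfactory; longer queues need attention."""
    return queue_len <= 2
```

The remaining counters in the table have no fixed guideline, so compare them against your own baseline instead.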

The Network Interface object is installed when you install TCP/IP. The Network Segment object is installed when you install Network Monitor. To monitor the TCP/IP protocol suite, use the IP, TCP, UDP, and ICMP objects. (You no longer need to install SNMP to get the IP counters as you did with Windows NT.) Use the NBT Connection object to track session-layer packets between computers. You can also use this object to monitor routed servers that use NetBIOS name resolution.

Application-layer objects include the Browser, Redirector, Server, and Server Work Queue objects on computers running Windows 2000. These objects will help you understand how your file and print services are performing, using the Server Message Block (SMB) protocol.



Microsoft Exchange 2000 Server Administrator's Companion
Year: 1999
Pages: 193
