Optimizing System Resources


Objective:

Monitor and optimize a server environment for application performance.

  • Monitor memory performance objects.

  • Monitor network performance objects.

  • Monitor process performance objects.

  • Monitor disk performance objects.

Before you can optimize a server, you must understand its characteristics, including how it operates under a normal load and what areas are stressed when the load increases. These characteristics depend largely on the type of application the server hosts and the load placed on it. For example, a web server reacts differently to a load condition than a server that is hosting Terminal Services.

The first step in optimization should be to establish a baseline. To establish a baseline for a server, you should log performance data for the server when it is under a normal load for an established period of time. You typically want to log at least a day, and sometimes even a week or more. This allows you to observe the various components of your server under normal load and stress circumstances. You should have a large enough sample so that you can observe all the highs and lows and determine what figures are averages for your server.
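
As a rough illustration of what "observing the highs and lows and determining the averages" can look like once the data has been logged, the following Python sketch summarizes a list of samples for one counter. The function name and the sample readings are hypothetical; real values would come from your own performance log.

```python
from statistics import mean

def summarize_baseline(samples):
    """Return the low, high, and average of a list of logged counter values."""
    if not samples:
        raise ValueError("need at least one sample to build a baseline")
    return {"low": min(samples), "high": max(samples), "average": mean(samples)}

# Hypothetical % Processor Time readings taken across a normal working day.
cpu_samples = [12.0, 18.0, 35.0, 80.0, 55.0, 22.0, 16.0]
baseline = summarize_baseline(cpu_samples)
print(baseline)
```

Later measurements taken under load can then be compared against these three figures rather than against a guess.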

After you establish this baseline, the next step is to observe your server under load and to identify any components that are limiting its overall performance. The four main components that cause the majority of the bottlenecks in a server are the memory, disk, processor, and network interface. The following four subsections discuss optimizing resources associated with these vital server objects.

Monitoring Memory Performance Objects

The Windows Server 2003 memory system uses a combination of physical memory and a paging file (also called a swap file) stored on the hard disk to provide space for applications to run. Data in memory is written to the paging file through a process called paging. Windows Server 2003 performs paging to make it seem to applications that the computer has more physical memory than is installed. The amount of virtual memory available on a computer is equal to its physical memory plus whatever hard disk space is configured for use as paging files.
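
The virtual memory arithmetic described above can be sketched in a few lines of Python. The function name and the sizes are made up for illustration only.

```python
def virtual_memory_mb(physical_mb, paging_file_sizes_mb):
    """Virtual memory = physical RAM plus all configured paging file space."""
    return physical_mb + sum(paging_file_sizes_mb)

# Hypothetical server: 2,048MB of RAM and paging files on two drives.
total_mb = virtual_memory_mb(2048, [3072, 1024])
print(total_mb)
```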

Because accessing data from a hard disk is many times slower than accessing it from memory, you want to minimize the frequency with which the server has to swap data to the hard drive. This can usually be accomplished simply by adding more physical memory.

Here are some counters to watch to monitor memory performance:

  • Memory: Pages Input/sec When this counter remains at a low value (2 or less), it indicates that all operations are occurring within physical RAM. This means that paging is not occurring and therefore is not the cause of the performance degradation.

  • Memory: Cache Faults/sec Indicates how frequently the system is unable to locate data in the cache and must search for it on disk. If this number grows steadily over time, your system is headed into constant thrashing, a state in which every piece of information required by the system must be retrieved directly from the disk. This condition usually indicates an insufficient amount of RAM on your system. However, it can also be caused by running a combination of applications, such as a read-intensive application (typically a database that is performing a large number of queries) at the same time as an application that is using an excessive amount of memory. In this case, you can either schedule the applications so that they do not run at the same time or move one of them to another system.

  • Memory: Page Faults/sec Similar to Cache Faults/sec, except that it also measures faults when a requested memory page is in use by another application. If this counter averages above 200 for low-end systems or above 600 for high-end systems, excess paging is occurring.

  • Memory: Available Bytes Indicates the amount of free memory available for use. If this number is less than 4MB, you do not have sufficient RAM on your system, so the system performs excessive paging.

  • Paging File: % Usage Peak Indicates the level of paging file usage. If this number nears 100% during normal operations, the maximum size of your paging file is too small, and you probably need more RAM. If you have multiple drives with multiple paging files, be sure to view the _Total instance of this counter.
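
The thresholds in the list above can be collected into a simple check. The following Python sketch is only an illustration: the function and parameter names are hypothetical, the inputs are assumed to be averages read from a performance log, and the 90% cutoff used for "nears 100%" is an assumption rather than a figure from the text. Cache Faults/sec is left out because it is judged by its trend over time, not by a fixed value.

```python
FOUR_MB = 4 * 1024 * 1024  # the 4MB floor for Memory: Available Bytes

def check_memory(pages_input, page_faults, available_bytes,
                 paging_file_peak_pct, high_end=False, near_full_pct=90):
    """Return a list of warnings for averaged memory counter readings.

    near_full_pct is an assumed stand-in for "nears 100%".
    """
    warnings = []
    if pages_input > 2:
        warnings.append("Pages Input/sec is above 2: paging may be a factor")
    fault_limit = 600 if high_end else 200
    if page_faults > fault_limit:
        warnings.append(f"Page Faults/sec is above {fault_limit}: excess paging")
    if available_bytes < FOUR_MB:
        warnings.append("Available Bytes is below 4MB: insufficient RAM")
    if paging_file_peak_pct >= near_full_pct:
        warnings.append("% Usage Peak is near 100%: paging file is too small")
    return warnings

# Hypothetical readings from a healthy server and from a starved one.
healthy = check_memory(1, 150, 64 * 1024 * 1024, 40)
starved = check_memory(25, 450, 3 * 1024 * 1024, 97)
print(healthy)
print(starved)
```

An empty list means none of the listed thresholds was crossed. It does not prove that memory is healthy, only that these particular counters look normal.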

Monitoring Disk Performance Objects

The disk subsystem can be a bottleneck, either directly or indirectly. If the access speed of the disk is slow, it negatively affects the load time of applications and the read and write time of application data. In addition, because Windows Server 2003 relies on virtual memory, a slow disk subsystem indirectly affects memory performance.

Note: No More DISKPERF

In previous versions of Windows, you were required to use the DISKPERF utility to enable the disk counters. Windows Server 2003 enables the counters by default.


Here are some key performance counters for the disk subsystem:

  • PhysicalDisk: Avg. Disk Queue Length Tracks the number of system requests waiting for disk access. The number of queued requests should not exceed the number of spindles in use plus 2. Most drives have only a single spindle, but RAID arrays have more (and Performance Monitor views RAID arrays as a single logical drive). A large number of waiting items indicates that a drive or an array is not operating fast enough to support the system's demands for input and output. When this occurs, you need a faster drive system.

  • PhysicalDisk: % Disk Time Represents the percentage of time that the disk is actively handling read and write requests. It is not uncommon for this counter to regularly hit 100% on active servers. Sustained percentages of 90% or higher, however, might indicate that a storage device is too slow. This is usually the case when its Avg. Disk Queue Length counter is constantly above 2.

  • PhysicalDisk: Avg. Disk sec/Transfer Indicates the average time in seconds of a disk transfer.
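
The two rules of thumb above (a queue no longer than the number of spindles plus 2, and sustained % Disk Time of 90% or higher combined with a queue above 2) can be expressed as a short Python sketch. The function name is hypothetical, and the inputs are assumed to be sustained averages taken from a log, not momentary peaks.

```python
def check_disk(avg_queue_length, pct_disk_time, spindles=1):
    """Return findings for sustained PhysicalDisk counter averages."""
    findings = []
    queue_limit = spindles + 2  # rule of thumb: spindles in use plus 2
    if avg_queue_length > queue_limit:
        findings.append(
            f"Avg. Disk Queue Length {avg_queue_length} exceeds {queue_limit}"
        )
    # A busy disk alone is normal; a busy disk with a standing queue is not.
    if pct_disk_time >= 90 and avg_queue_length > 2:
        findings.append("% Disk Time is 90% or higher with a queue above 2")
    return findings

# Hypothetical readings: the same load on a single drive and on a 4-spindle array.
single_drive = check_disk(4.5, 95)
raid_array = check_disk(4.5, 95, spindles=4)
print(single_drive)
print(raid_array)
```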

Note: Keep Your Data Accurate

When you're recording a log file for the Disk objects, be sure not to write the log file to the same drive that is being measured. If you do, the recorded values are not accurate, because the act of reading the object and writing to the drive adds a significant amount of workload.


Monitoring Process Performance Objects

The processor is the heart of your server. Most operations in the server are controlled either directly or indirectly by the processor. Most processor bottlenecks are caused by multiple processes running at the same time, requiring more cycles than the processor can deliver efficiently. This can be alleviated by replacing the processor with a faster model or by adding another processor in a multiprocessor-capable server.

To identify problems with the processor, monitor the following counters:

  • Processor: % Processor Time Indicates the amount of time the CPU spends on non-idle work. It's common for this counter to reach 100% during application launches or kernel-intensive operations (such as SAM synchronization). If this counter remains above 80% for an extended period, you should suspect a CPU bottleneck. (There will be an instance of this counter for each processor in a multiprocessor system.)

  • Processor: % Total Processor Time Applies only to multiprocessor systems. This counter should be used the same way as the single CPU counter. If any value remains consistently higher than 80%, at least one of your CPUs is a bottleneck.

  • System: Processor Queue Length Indicates the number of threads waiting for processor time. A sustained value of 2 or higher for this counter indicates processor congestion. This counter is a snapshot taken at the time of measurement, not an average value over time.
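
Because the processor thresholds above apply to sustained values and not to brief spikes, one way to test them is to look for a run of consecutive samples that cross the limit. The Python sketch below is an assumption-laden illustration: the function names are hypothetical, and the choice of five consecutive samples as "an extended period" is arbitrary and should be tuned to your sampling interval.

```python
def longest_run(samples, predicate):
    """Length of the longest streak of consecutive samples matching predicate."""
    best = run = 0
    for value in samples:
        run = run + 1 if predicate(value) else 0
        best = max(best, run)
    return best

def check_processor(cpu_pct_samples, queue_samples, min_consecutive=5):
    """Flag sustained % Processor Time above 80 or queue length of 2 or higher."""
    return {
        "cpu_sustained_above_80": longest_run(
            cpu_pct_samples, lambda v: v > 80) >= min_consecutive,
        "queue_sustained_2_or_more": longest_run(
            queue_samples, lambda v: v >= 2) >= min_consecutive,
    }

# Hypothetical samples: a long CPU plateau, but only a brief queue spike.
result = check_processor(
    [50, 85, 90, 95, 88, 91, 60],
    [0, 1, 3, 2, 0, 1, 0],
)
print(result)
```

Sampling Processor Queue Length repeatedly matters here, because each reading is a snapshot and a single high value says little on its own.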

Monitoring Network Performance Objects

Thanks to the preponderance of high-performance 100Mbps and even 1,000Mbps NICs, network bottlenecks are not as common as processor, disk, or memory bottlenecks. There are still occasions, however, when the network card is a bottleneck. This is most likely to occur on web servers or terminal servers.

To identify performance problems with the network interface, monitor the following counters:

  • Network Interface: Bytes Total/sec Indicates the rate at which data is sent and received by a NIC (including framing characters). Compare this value with the expected capacity of the device. If the highest observed average is less than 75% of the expected value, communication errors or slowdowns might be occurring that keep the NIC from reaching its rated speed.

  • Network Interface: Current Bandwidth Estimates a NIC's current bandwidth, measured in bits per second (bps). This counter is useful only for NICs with variable bandwidth.

  • Network Interface: Output Queue Length Indicates the number of packets waiting to be transmitted by a NIC. If this averages above 2, you are experiencing delays.

  • Network Interface: Packets/sec Indicates the number of packets handled by a NIC. Watch this counter over a long interval of constant or normal activity. Sharp declines that occur while the queue length remains nonzero can indicate protocol-related or NIC-related problems.

Other network-related counters that may be worth monitoring include protocol-specific objects, such as ICMP, IP, TCP, and UDP.
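
Comparing Bytes Total/sec with a NIC's capacity requires a unit conversion, because the throughput counter is in bytes per second while Current Bandwidth is in bits per second. The following Python sketch shows that conversion along with the 75% and queue-length checks from the Network Interface counters described earlier. The function name and sample figures are hypothetical, and the result is only a hint to investigate further, since a lightly loaded server also shows low throughput.

```python
def check_network(peak_avg_bytes_per_sec, current_bandwidth_bps,
                  avg_output_queue_length):
    """Return findings for Network Interface counter averages."""
    findings = []
    # Bytes Total/sec is in bytes; Current Bandwidth is in bits per second.
    peak_bps = peak_avg_bytes_per_sec * 8
    utilization = peak_bps / current_bandwidth_bps
    if utilization < 0.75:
        findings.append(
            f"peak throughput is {utilization:.0%} of capacity (below 75%)"
        )
    if avg_output_queue_length > 2:
        findings.append("Output Queue Length averages above 2: delays")
    return findings

# Hypothetical 100Mbps NIC: one struggling interface and one healthy one.
struggling = check_network(6_000_000, 100_000_000, 3.5)
healthy_nic = check_network(10_000_000, 100_000_000, 0.5)
print(struggling)
print(healthy_nic)
```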

Make sure you understand the common performance counters and their meanings. In addition, know what ranges are normal and what values indicate that a specific hardware component needs to be upgraded.




MCSA/MCSE 70-290 Exam Prep: Managing and Maintaining a Microsoft Windows Server 2003 Environment (2nd Edition)
ISBN: 0789736489
Year: 2006
Pages: 219
Authors: Lee Scales
