Monitoring System Performance for Bottlenecks

Identify system bottlenecks, including memory, processor, disk, and network-related bottlenecks.

Identify system bottlenecks by using System Monitor.

Administering a Windows Server 2003 network is not simply about making sure that people have access to resources and that information is secure. The question administrators ask themselves is not just, "Is it running?" but also, "Is it running well?" Of course, "well" is relative and can't really be quantified outside your individual context.

This section examines the assessment of server performance using the System Monitor, a tool located within the Performance console. In addition, it discusses some tips on how to tune your server before problems occur and which devices should be monitored when they do.

Periodic monitoring of your Windows Server 2003 network is important to the process of optimization. Monitoring helps to overcome the feeling-based assessment of your users. For example, by comparing current network performance against a previously established baseline, you have more information than the anecdotal "The network is slow today!" on which to base your actions. By gathering current information and comparing it against established norms for your systems (a baseline), you can detect bottlenecks, identify those system components that are slowing down server performance, and fix them before they become a problem for your users.

Although the actual procedure for creating a baseline is outlined later in this section, it is important to discuss the concept of a baseline here because it is the first thing you will do in practice, even if it is not the first concept that is presented here. A baseline is an established norm for the operation of your server as determined by normal load. This baseline can then be used as a basis of comparison for future performance to see whether repairable problems exist. As the configuration of your server changes (when, for example, a processor is added or RAM is added), new baselines are established to reflect the new expected performance.

The importance of establishing a baseline before beginning to monitor performance can't be overstated. Although there are some guidelines as to what absolute performance numbers indicate, it is by comparing current performance against past performance (the baseline) that you will really be able to evaluate how well current demand is being met and whether you require more resources on your server. In addition, it is imperative that a baseline be established before problems begin to occur. If users are already beginning to complain, "The network is slow," it is too late to establish a baseline because the statistics gathered will include whatever performance factors are contributing to the dissatisfaction.

At a minimum, you must perform periodic monitoring on the following areas of your Windows Server 2003 computers: the hard disk(s), processor(s), memory, and network adapter(s). Regardless of which type of services the server is providing, these four areas interact to make your server efficient (thereby appearing fast) or inefficient. The actual speed or efficiency of each of the components varies in importance depending on the application. In some applications, memory is more important than processor speed or availability; in other applications, disk speed and availability are more important than fast network access.

The Performance Console

Recognizing the need to be able to monitor the performance (and thus the health) of servers and client computers, Microsoft built the Performance console into Windows Server 2003. Whether you are looking for real-time graphical views or a log you can peruse at your convenience, the Performance console can provide the type of data you need to evaluate performance and recommend system modification if necessary.

Monitoring performance begins with the collection of data. The Performance console, shown in Figure 6.1, allows you various methods of working with data, although all methods use the same means of collecting data. Data collected by the Performance console is broken down into objects, counters, and instances. An object is the software or device being monitored, such as the memory or a processor. A counter is a specific statistic for an object. Memory has a counter called Available Bytes, and a processor has a counter called % Processor Time. An instance is the specific occurrence of an object you are watching; in a multiprocessor server with two processors, you have three instances: 0, 1, and _Total.
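To make the object, instance, and counter breakdown concrete, the following Python sketch splits a counter path into its three parts. It assumes the `Object(Instance)\Counter` notation, with the instance in parentheses and an optional leading backslash; the `CounterPath` type and `parse_counter_path` function are hypothetical names introduced only for this illustration and are not part of the Performance console.

```python
import re
from typing import NamedTuple, Optional


class CounterPath(NamedTuple):
    """One object/instance/counter combination (hypothetical helper type)."""
    obj: str
    instance: Optional[str]  # None when the object has no instances (e.g. Memory)
    counter: str


# Assumed notation: [\]Object[(Instance)]\Counter
_PATH_RE = re.compile(
    r"^\\?(?P<obj>[^\\(]+)(?:\((?P<instance>[^)]*)\))?\\(?P<counter>.+)$"
)


def parse_counter_path(path: str) -> CounterPath:
    """Split a counter path into object, instance, and counter."""
    match = _PATH_RE.match(path)
    if match is None:
        raise ValueError(f"not a counter path: {path!r}")
    return CounterPath(match["obj"], match["instance"], match["counter"])


if __name__ == "__main__":
    print(parse_counter_path(r"\Processor(_Total)\% Processor Time"))
    print(parse_counter_path(r"Memory\Available Bytes"))
```

For a two-processor server, the instance part would be `0`, `1`, or `_Total`; for Memory, which has no instances, it is `None`.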

Figure 6.1. The Performance console consists of four different tools.

The primary difference between using the System Monitor and Counter Logs or Trace Logs is that you typically watch performance in real time in System Monitor (or play back saved logs), whereas you use Counter Logs and Trace Logs to record data for later analysis. Alerts function in real time by notifying you when a user-defined threshold is exceeded. Counter Logs, Trace Logs, and Alerts are beyond the scope of the 70-293 exam; if you want to learn more about them, however, be sure to see www.microsoft.com/technet/prodtechnol/windowsserver2003/proddocs/entserver/sag_MPtopnode.asp.

Introduction to System Monitor

You can access the System Monitor from the Administrative Tools folder by selecting Start, Programs, Administrative Tools, Performance. When you initially open the Performance console, it looks like the console shown previously in Figure 6.1. The System Monitor enables you to view statistical data either live or from a saved log. You can view the data in three formats: graph, histogram, or report. Graph data is displayed as a line graph, histograms are displayed as bar graphs, and text-based reports show the current numerical information available from the statistics.

The basic use of the System Monitor is straightforward. You decide which object/instance/counter combinations you want to display and then configure the monitor accordingly. At that point, information begins to appear. You can also change the properties of the monitor to display information in different ways.

The best way to become familiar with the System Monitor is to start using it. Step by Step 6.1 lets you do just that by configuring monitoring for some network counters.

STEP BY STEP

6.1 Using the System Monitor

Select Start, Programs, Administrative Tools, Performance to open the Performance console.
Click the System Monitor node.
To add counter items to the System Monitor, click the + icon on the toolbar, as shown in Figure 6.2.

Figure 6.2. To start adding counter items, you need to click the + icon on the toolbar.
The Add Counters dialog box, shown in Figure 6.3, opens, allowing you to begin adding counters.

Figure 6.3. You can add counters to begin monitoring the Processor statistics.
Select Network Interface from the Performance Object drop-down list box. The list of counters that relate to network interfaces and are available for selection appears. If you need to know what a counter means, select the counter and click the Explain button.
Be sure that you are adding counters for the correct network interface in your server and not for the loopback interface by selecting the Select Instances from This List option and then selecting the correct network interface, as shown in Figure 6.4.

Figure 6.4. Make sure you are adding the counters for the correct instance when multiple instances exist.
After you have decided what counter you want to monitor, click Add. You can add multiple counters either by selecting each counter and clicking Add or by holding down the Ctrl key while you select all the counters you want to monitor and then clicking Add.
Click Close when you are finished. Your counters are now actively graphed, like those shown in Figure 6.5.

Figure 6.5. You can monitor the selected network interface statistics in real-time.

Working with Counters

Now that you've seen how the System Monitor works from a high level, let's dig in a little and examine the basic building block of performance monitoring: counters. Figure 6.6 shows a typical Add Counters dialog box. At the top of the dialog box is a set of radio buttons with which you can obtain statistics from the local machine or a remote machine. This feature is useful when you want to monitor a computer in a location that is not within reasonable physical distance from you. Under the radio buttons is a pull-down list naming the performance objects that can be monitored. Which performance objects are available depends on the features (and applications) you have installed on your server. Also, some counters come with specific applications. These performance counters enable you to monitor statistics relating to that application from the Performance console.

Figure 6.6. The Add Counters dialog box contains many options you need to understand.

Under the performance object is a list of counters. When applied to a specific instance of an object, counters are what you are really after, and the object just narrows down your search. The counters are the actual statistical information you want to monitor. Each object has its own set of counters from which you can choose. Counters enable you to move from the abstract concept of an object to the concrete events that reflect that object's activity. For example, if you choose to monitor the processor, you can watch for the average processor time and how much time the processor spent doing non-idle activity. In addition, you could watch for % User Time (time spent executing user application processes) versus % Privileged Time (time spent executing system processes).

To the right of the counter list is the instances list. If applicable, instances enumerate the physical objects that fall under the specific object class you have chosen. In some cases, the instances list is not applicable. For example, no instances list is available with memory. In cases in which the instances list is applicable, you see multiple instance variables (refer to Figure 6.4). One variable represents the average of all the instances, and the rest of the variables represent the values for each physical object (numbered 0, 1, and so on). For example, if you have two processors in your server, you see (and can choose from) three instance variables: _Total, 0, and 1. This way, you can watch each processor individually and watch them as a collective unit.

Using System Monitor to Discover Bottlenecks

Every chain, regardless of its strength, has its weakest link. When pulled hard enough, some point gives before all the others. Your server is similar to a chain. When it's under stress, some component cannot keep up with the others. This results in a degradation of overall performance. The weak link in the server is referred to as a bottleneck because it's the component that slows down everything else. As an administrator responsible for ensuring efficient operation of a Windows Server 2003 computer, you need to determine the following two things:

Which component is causing the bottleneck?
Is the stress on the server typical enough that action is warranted either now or in the future?

As was mentioned previously, under normal operation, only four system components typically affect system performance: memory, processors, disks, and network adapters. Therefore, you should monitor the counters that tell you the most about how those four components affect system performance. The information from these counters is critical because you can determine the answer to the two diagnostic questions listed here.

The biggest monitoring problem is not collecting the data, but interpreting it. Not only is it difficult to determine what a specific value for a particular counter means, it is also difficult to determine what that value means in the context of other counters. The biggest difficulty is that no subsystem (disk, network, processor, or memory) exists in isolation. As a result, weaknesses in one might show up as weaknesses in another. Unless you take them all into consideration, you might end up adding another processor when all you need is more RAM.

Understanding how the subsystems interact is important to understanding the significance of the counter values that are recorded. For example, if you detect that your processor is constantly running at 90%, you might be tempted to purchase a faster processor (or another processor if you have a system board that can accommodate more than one). However, it is important to look at memory utilization and disk utilization as well because the problem could be originating there instead. If you do not have enough memory, the processor must swap pages to the disk frequently. This page swapping results in high memory utilization, high disk utilization, and higher processor utilization. By purchasing more RAM, you could alleviate all these problems.

This one example illustrates how no one piece of information is enough to analyze your performance problems or your solution. You must monitor the server as a whole unit by putting together the counters from a variety of objects. Only then can you see the big picture and solve problems that might arise.
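The reasoning in the preceding example can be sketched as a small Python function that checks for memory pressure before blaming the processor. This is only an illustration: the function name is hypothetical, the order of the checks is a simplifying assumption, and the cutoffs (20 pages per second, 90% disk time, 75% processor time) are the rule-of-thumb figures quoted later in this section rather than hard limits.

```python
def likely_bottleneck(cpu_pct: float, pages_per_sec: float, disk_time_pct: float) -> str:
    """Suggest which subsystem to investigate first (illustrative heuristic only).

    Memory is checked first because heavy paging also drives up disk and
    processor utilization, so those readings can be symptoms rather than causes.
    """
    if pages_per_sec >= 20:   # sustained hard paging: suspect a RAM shortage
        return "memory"
    if disk_time_pct > 90:    # disk busy without paging: suspect the disk itself
        return "disk"
    if cpu_pct > 75:          # processor busy with no paging or disk pressure
        return "processor"
    return "none apparent"


if __name__ == "__main__":
    # Processor at 90%, but heavy paging and a busy disk point at memory.
    print(likely_bottleneck(cpu_pct=90, pages_per_sec=45, disk_time_pct=95))
    # Processor at 90% with little paging and an idle disk.
    print(likely_bottleneck(cpu_pct=90, pages_per_sec=2, disk_time_pct=30))
```

A real diagnosis would look at more counters than these three, and at logged data rather than a single reading.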

The recommended method of monitoring is to use a counter log, which captures data over a period of time. This type of monitoring helps you eliminate questions of whether the current stress on the server is typical. If you log over a period of a week or a month and consistently see a certain component under excessive load, you can be sure the stress is typical.
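One way to turn "consistently under excessive load" into something measurable is to compute the fraction of logged samples that exceed a threshold. The Python sketch below does this for a list of values taken from a counter log. The function names and the default cutoff of 80% of samples are assumptions chosen for illustration, not values taken from Microsoft guidance.

```python
from typing import Sequence


def fraction_over(samples: Sequence[float], threshold: float) -> float:
    """Return the fraction of samples that are above the threshold."""
    if not samples:
        raise ValueError("no samples to evaluate")
    return sum(1 for value in samples if value > threshold) / len(samples)


def is_typical_load(samples: Sequence[float], threshold: float,
                    min_fraction: float = 0.8) -> bool:
    """Treat the load as typical when most samples exceed the threshold.

    min_fraction is an assumed cutoff; adjust it to your own environment.
    """
    return fraction_over(samples, threshold) >= min_fraction


if __name__ == "__main__":
    cpu_samples = [80, 90, 60, 95]  # hypothetical % Processor Time readings
    print(fraction_over(cpu_samples, 75))
    print(is_typical_load(cpu_samples, 75))
```

A brief spike produces a low fraction, whereas a component that is overloaded all week produces a fraction close to 1.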

Baselining Servers

As we've already discussed, monitoring the present operation of your servers and network presents you with only half of the picture. You need to create a baseline of the server performance that you can use to compare against future performance statistics to locate problematic areas. If you create baselines on servers, you can compare the present-day performance to a known value. This comparison can be very useful when you're troubleshooting, and it also aids during periods when you are modifying configurations.

A baseline is a set of typical readings that define "normal" for your servers, client computers, or network under various operating conditions, such as no load, moderate load, and heavy load. Of course, what is normal is obviously open to interpretation, but you could say that normal is a server providing users with what they want in a time frame that they think is reasonable. By creating baselines early on, you have something that you can later look back at and compare current server operating conditions to. If your system is already to the point where you are seeing system degradation, it is really too late to establish a baseline.
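Comparing current readings against a baseline can be sketched in a few lines of Python. The example below assumes that both sets of samples have already been exported from counter logs into dictionaries keyed by counter name. The function name and the 25% tolerance are hypothetical choices made for illustration.

```python
from statistics import mean
from typing import Dict, Sequence


def compare_to_baseline(baseline: Dict[str, Sequence[float]],
                        current: Dict[str, Sequence[float]],
                        tolerance: float = 0.25) -> Dict[str, float]:
    """Return the relative change for counters that drifted beyond the tolerance.

    A value of 3.0 means the current average is 300% above the baseline average.
    """
    report = {}
    for name, base_samples in baseline.items():
        if name not in current:
            continue  # counter was not logged this time
        base_avg = mean(base_samples)
        if base_avg == 0:
            continue  # relative change is undefined for a zero baseline
        change = (mean(current[name]) - base_avg) / base_avg
        if abs(change) > tolerance:
            report[name] = change
    return report


if __name__ == "__main__":
    baseline = {r"Memory\Pages/sec": [4, 6], r"Processor\% Processor Time": [30, 50]}
    current = {r"Memory\Pages/sec": [18, 22], r"Processor\% Processor Time": [42, 46]}
    print(compare_to_baseline(baseline, current))
```

In this made-up data set, only the paging counter is reported, because processor time moved by just 10% relative to its baseline.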

EXAM TIP

Creating a monitoring station. Most administrators recommend that you do not run performance monitoring locally on the server from which you are collecting data. When you run the System Monitor directly on a server, you can skew the results because the act of monitoring itself consumes system resources. It's generally better to run System Monitor on a workstation pointed at the server you want to monitor, which also lets you monitor multiple servers from a single console.

To establish a baseline, you pick a time (or duration of time) that represents typical user interaction with the server. Then you create a log of important counters for the duration you have determined. Some of the more commonly used (and recommended) counters are summarized in Table 6.1. For a complete reference to these counters, be sure to see www.microsoft.com/technet/prodtechnol/windowsserver2003/proddocs/deployguide/counters1_lkxw.asp.

Table 6.1. Counters to Monitor for Baselining and Bottleneck Troubleshooting

Server Component	Recommended Counters
Memory	Memory\Page Faults/sec; Memory\Page Reads/sec; Memory\Page Writes/sec; Memory\Pages Input/sec; Memory\Pages Output/sec; Memory\Available Bytes; Memory\Pool Nonpaged Bytes; Process\Page Faults/sec; Process\Working Set; Process\Private Bytes; Process\Page File Bytes
Processor	Processor\% Processor Time; System\Processor Queue Length; Process\% Privileged Time; Process\% Processor Time; Process\% User Time; Process\Priority Base; Thread\% Privileged Time; Thread\% Processor Time; Thread\% User Time; Thread\Context Switches/sec; Thread\Priority Base; Thread\Priority Current; Thread\Thread State
Disk	PhysicalDisk\% Disk Time; PhysicalDisk\Avg. Disk Queue Length; PhysicalDisk\Current Disk Queue Length; PhysicalDisk\Avg. Disk Sec/Read; PhysicalDisk\Avg. Disk Sec/Write; PhysicalDisk\Disk Read Bytes/sec; PhysicalDisk\Disk Write Bytes/sec; PhysicalDisk\Avg. Disk Bytes/Write; PhysicalDisk\Disk Reads/sec; PhysicalDisk\Disk Writes/sec; LogicalDisk\% Disk Time; LogicalDisk\Avg. Disk Queue Length; LogicalDisk\Current Disk Queue Length; LogicalDisk\Avg. Disk Sec/Read; LogicalDisk\Avg. Disk Sec/Write; LogicalDisk\Disk Read Bytes/sec; LogicalDisk\Disk Write Bytes/sec; LogicalDisk\Avg. Disk Bytes/Write; LogicalDisk\Disk Reads/sec; LogicalDisk\Disk Writes/sec
Network	Network Interface\Bytes Total/sec; Network Interface\Bytes Sent/sec; Network Interface\Bytes Received/sec; TCPv4\Segments Received/sec; TCPv4\Segments Sent/sec; TCPv4\Frames Sent/sec; TCPv4\Frames Received/sec; Server\Bytes Total/sec; Server\Bytes Received/sec; Server\Bytes Transmitted/sec

EXAM TIP

Performance counters. Don't worry too much about memorizing all the different counters and their uses. The 70-293 exam is looking more for your ability to use the tools available to identify, troubleshoot, and correct problem situations.

Don't be tricked! When taking the 70-293 exam, carefully read each question, especially those that deal with performance monitoring, to ensure that you select the correct answer. You might find several very similar answers presented, but only one of them is correct.

The log you create should be stored in a safe place to ensure that you can refer to it in the future. Every time you perform a major hardware upgrade (such as increasing RAM or adding a processor), you should create a new set of baselines and consider deleting the old ones.

Which counters you monitor depends on the particular applications running on your server and the requirements you have for the server. Although some recommendations are given in Table 6.1, you might want to watch other objects as well if you have specific applications installed.

Creating Baseline Counter Logs

To create a baseline, you create a counter log from the Counter Logs option of the Performance Logs and Alerts node of the Performance console. The creation of a counter log is outlined in Step by Step 6.2.

STEP BY STEP

6.2 Creating and Using a Baseline Counter Log

Select Start, Programs, Administrative Tools, Performance to open the Performance console.
Expand the Performance Logs and Alerts node and click the Counter Logs entry, as shown in Figure 6.7.

Figure 6.7. A counter log is created to keep baseline (historical) statistics about performance.
To add a new counter log, right-click the Counter Logs entry and select New Log Settings from the context menu.
Enter the name of the new counter log, as shown in Figure 6.8, and click OK. The Properties dialog box for this log appears (see Figure 6.9).

Figure 6.8. You should enter a descriptive name for the new counter log.

Figure 6.9. You must configure objects or counters on the new counter log before proceeding.
Configure new entire objects to be monitored by clicking the Add Objects button to open the Add Objects dialog box (see Figure 6.10). Alternatively, add individual counters by clicking the Add Counters button to open the Add Counters dialog box (see Figure 6.11). Add the objects or counters you want to the counter log.

Figure 6.10. You can add entire objects to the counter log if you want.

Figure 6.11. You can add specific individual counters to the counter log if you want.
After you have added some objects or counters, you can configure the sampling interval. The default is every 15 seconds, meaning that while the counter log is running, data will be collected every 15 seconds from the computers being monitored. The sampling interval has a trade-off associated with it: Smaller intervals mean larger, more accurate log files; larger intervals result in smaller, less accurate log files. In addition, the smaller the sampling interval, the more resources monitoring consumes. You can also configure the account to be used to run the counter log.
On the Log Files tab of the Properties dialog box, shown in Figure 6.12, select the type of log file to be created as well as the numbering system to be used. The default settings are Binary File and nnnnnn. The available options are explained after this Step by Step in Tables 6.2 and 6.3. In addition, you can enter a comment that will help identify the counter log.

Figure 6.12. You can use the Log Files tab to configure how the log is to be saved.
Click the Configure button to open the Configure Log Files dialog box, shown in Figure 6.13. From here, you can change the path of the log file and its maximum allowed size. Click OK after you make your changes.

Figure 6.13. You can use the Configure Log Files dialog box to configure the maximum log file size as well as its path.
On the Schedule tab of the Properties dialog box, shown in Figure 6.14, you can configure how and when the counter log is to run. Typically, you will run it manually, although you might opt to have it run automatically at a configured time.

Figure 6.14. You can configure the counter log to start and stop automatically if you want.
After you make all your configurations, click OK to save the counter log settings. After a moment, the counter log starts running if you have not configured it for manual starting.
After the counter log has run for a period of time adequate to capture your baseline data, click the square stop icon from the Counter Logs node to stop it.
Click the System Monitor node to switch back to the System Monitor. Click the database icon to open a log file, as shown in Figure 6.15.

Figure 6.15. You need to load the counter log file into the System Monitor to be able to examine it.
The System Monitor Properties dialog box opens, showing the Source tab, as shown in Figure 6.16. Click the Add button to locate and add the counter log file.

Figure 6.16. You can add one or more counter logs to the System Monitor.
Switch to the Data tab, shown in Figure 6.17, and ensure that all counters for which you are interested in seeing data are added. You will see data displayed only for counters that are selected here. Click OK when you are done.

Figure 6.17. You need to ensure that all counters have been added for viewing.
The counter log data is now displayed in the System Monitor. If you need to display only a certain portion of the data, you can do so by configuring the Time Range option on the Source tab of the System Monitor Properties dialog box (refer to Figure 6.16).
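The sampling-interval trade-off described in the preceding steps comes down to simple arithmetic, sketched here in Python. The sample count follows directly from the logging duration and the interval. The size estimate depends on `bytes_per_value`, a hypothetical figure assumed only for illustration; actual log sizes depend on the file format and its overhead.

```python
def sample_count(duration_seconds: int, interval_seconds: int = 15) -> int:
    """Number of samples collected over the duration at the given interval."""
    return duration_seconds // interval_seconds


def estimated_log_bytes(duration_seconds: int, counters: int,
                        interval_seconds: int = 15,
                        bytes_per_value: int = 8) -> int:
    """Rough size estimate; bytes_per_value is an assumption, not a measured figure."""
    return sample_count(duration_seconds, interval_seconds) * counters * bytes_per_value


if __name__ == "__main__":
    one_week = 7 * 24 * 60 * 60          # 604,800 seconds
    print(sample_count(one_week))        # default 15-second interval
    print(sample_count(one_week, 60))    # one-minute interval
    print(estimated_log_bytes(one_week, counters=20))
```

Logging for a week at the default 15-second interval yields 40,320 samples per counter; stretching the interval to 60 seconds cuts that to 10,080 at the cost of coarser data.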

When creating log files, you have several different file formats available to you. Table 6.2 outlines the available file formats.

Table 6.2. Counter Log File Format Options

Option	Description
Text File (Comma Delimited)	This option defines a comma-delimited log file (with a `.csv` extension).
Text File (Tab Delimited)	This option defines a tab-delimited log file (with a `.tsv` extension).
Binary File	This option defines a sequential, binary-format log file (with a `.blg` extension). You should use this file format if you want to be able to record data instances that are intermittent, that is, stopping and resuming after the log has begun running. Only binary file formats can accommodate instances that are not persistent throughout the duration of the log.
Binary Circular File	This option defines a circular, binary-format log file (with a `.blg` extension). You should use this file format to continuously record data to the same log file, overwriting previous records with new data when the file reaches its maximum size.
SQL Database	This option allows you to save logs to an SQL database.
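The comma-delimited format is convenient when you want to analyze a counter log outside the Performance console. The Python sketch below reads such a log into one list of values per counter. It assumes the layout in which the first column holds the timestamp and each remaining column header is a full counter path; the sample data, the server name `SERVER1`, and the `load_counter_log` function are all made up for illustration.

```python
import csv
import io
from typing import Dict, List

# Hypothetical sample in the assumed comma-delimited layout.
SAMPLE_LOG = r'''"(PDH-CSV 4.0) (Eastern Standard Time)(300)","\\SERVER1\Memory\Available Bytes","\\SERVER1\Processor(_Total)\% Processor Time"
"04/20/2003 11:26:00.000","52428800","35.5"
"04/20/2003 11:26:15.000","50331648"," "
'''


def load_counter_log(text: str) -> Dict[str, List[float]]:
    """Return {counter path: [values]} from comma-delimited log text."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    names = header[1:]  # column 0 is the timestamp
    series: Dict[str, List[float]] = {name: [] for name in names}
    for row in reader:
        for name, value in zip(names, row[1:]):
            if value.strip():  # skip blank cells (no sample was recorded)
                series[name].append(float(value))
    return series


if __name__ == "__main__":
    for counter, values in load_counter_log(SAMPLE_LOG).items():
        print(counter, values)
```

To read a real log, you would pass the contents of the `.csv` file instead of `SAMPLE_LOG`; check the header row of your own file first, because the layout here is an assumption.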

When creating log files, you have several different numbering formats available to you. Table 6.3 outlines the available numbering formats.

Table 6.3. Counter Log Numbering Systems

System	Example
nnnnnn	`Network Adapter Performance_000007.blg`
mmddhh	`Network Adapter Performance_042011.blg`
mmddhhmm	`Network Adapter Performance_04201126.blg`
yyyyddd	`Network Adapter Performance_2003111.blg`
yyyymm	`Network Adapter Performance_200304.blg`
yyyymmdd	`Network Adapter Performance_20030420.blg`
yyyymmddhh	`Network Adapter Performance_2003042011.blg`
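The numbering systems in Table 6.3 map naturally onto date format codes. The following Python sketch builds a log file name for a given numbering system. The `log_file_name` function and the `SUFFIX_FORMATS` mapping are hypothetical, and the mapping reflects an assumed reading of each pattern (for example, `ddd` as the day of the year).

```python
from datetime import datetime
from typing import Optional

# Assumed interpretation of each numbering system as a strftime pattern.
SUFFIX_FORMATS = {
    "mmddhh": "%m%d%H",
    "mmddhhmm": "%m%d%H%M",
    "yyyyddd": "%Y%j",
    "yyyymm": "%Y%m",
    "yyyymmdd": "%Y%m%d",
    "yyyymmddhh": "%Y%m%d%H",
}


def log_file_name(base: str, system: str, when: Optional[datetime] = None,
                  sequence: int = 1, extension: str = "blg") -> str:
    """Build a counter log file name for the given numbering system."""
    if system == "nnnnnn":
        suffix = f"{sequence:06d}"  # six-digit, zero-padded serial number
    else:
        suffix = (when or datetime.now()).strftime(SUFFIX_FORMATS[system])
    return f"{base}_{suffix}.{extension}"


if __name__ == "__main__":
    when = datetime(2003, 4, 20, 11, 26)
    print(log_file_name("Network Adapter Performance", "mmddhh", when))
    print(log_file_name("Network Adapter Performance", "nnnnnn", sequence=7))
```

With a timestamp of 11:26 on April 20, 2003, the `mmddhh` system yields `Network Adapter Performance_042011.blg`, and sequence number 7 under `nnnnnn` yields `Network Adapter Performance_000007.blg`.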

Daily Monitoring for Usage

On a daily basis, you may not want to monitor the full group of counters that were listed previously in Table 6.1. The counters in Table 6.4 present a smaller, and thus easier-to-manage, group of counters that you might consider monitoring on a daily basis to get a quick snapshot of your system and network performance.

Table 6.4. Counters to Monitor on a Daily Basis

Server Component	Recommended Counters
Memory	Memory\Available Bytes; Memory\Cache Bytes; Memory\Pages/sec; Memory\Page Reads/sec; Memory\Pool Paged Bytes; Memory\Pool Nonpaged Bytes
Processor	Processor\% Processor Time (all instances); System\Processor Queue Length (all instances); Processor\Interrupts/sec
Disk	PhysicalDisk\Disk Reads/sec; PhysicalDisk\Disk Writes/sec; LogicalDisk\% Free Space; LogicalDisk\% Disk Time; PhysicalDisk\Current Disk Queue Length (all instances); PhysicalDisk\Split IO/sec
Network	Network Interface\Bytes Total/sec

As they pertain to Table 6.4, the following are descriptions of the counters:

PhysicalDisk\Disk Reads/sec: The number of disk reads that occur per second. This value is a measure of the read activity on the disk. The transfer rate of the hard drive being used determines what values you should be looking for here, so check the vendor's supplied documentation. In general, Ultra Wide SCSI disks can handle about 50 to 70 I/O operations per second.
PhysicalDisk\Disk Writes/sec: The number of disk writes that occur per second. This value is a measure of the write activity on the disk. The same guidelines apply for this counter as for the PhysicalDisk\Disk Reads/sec counter.
LogicalDisk\% Free Space: The ratio of free space to total disk space on a logical drive. This value is a measurement of remaining capacity on your logical drives. You generally should track this value for each logical drive. To prevent excessive fragmentation, you should not allow the value here to drop below 15%.
LogicalDisk\% Disk Time: The ratio of busy time to the total elapsed time. This value represents the percentage of time the disk is servicing read or write requests. You generally should track this value for each drive. If one drive is being used a lot more than another, it might be time to balance the content between the drives. The lower this number, the greater the capacity a disk has to do additional work. This value should typically not exceed 90%.
PhysicalDisk\Current Disk Queue Length: The number of read and write requests waiting in the queue at the moment the sample is collected. Optimally, this number should be no more than 2; a larger number suggests that the disk is a bottleneck because it cannot keep up with the requests placed on it.
PhysicalDisk\Split IO/sec: The rate at which the operating system divides I/O requests to the disk into multiple requests. A split I/O request might occur if the program requests data in a size that is too large to fit into a single request or if the disk is fragmented. Factors that influence the size of an I/O request can include application design, the file system, or drivers. A high rate of split I/O might not, in itself, represent a problem. However, on single-disk systems, a high rate for this counter tends to indicate disk fragmentation.
Memory\Available Bytes: The total amount of physical memory available to processes running on the computer. This number's significance varies as the amount of memory in the computer varies, but if this number is less than 4MB, you generally have a memory deficiency.
Memory\Cache Bytes: The amount of cache memory available to processes running on the computer. This counter indicates growth or shrinking of the cache. The value includes not only the size of the cache but also the size of the paged pool and the amount of pageable driver and kernel code. Lower values indicate a problem, although there is no agreed-upon standard value.
Memory\Pages/sec: The number of hard page faults occurring per second. A hard page fault occurs when data or code is not in memory and must be retrieved from the hard drive. Each time this happens, disk activity is required, and the process is temporarily halted (because disk access is much slower than RAM access). A bottleneck in memory is likely when this number is 20 or greater.
Memory\Page Reads/sec: The number of times the disk needed to be read to resolve a hard page fault. Unlike Memory\Pages/sec, this counter is not an indicator of the quantity of data being retrieved but rather the number of times the disk had to be consulted. This counter can give a general feeling of a memory bottleneck, whereas Memory\Pages/sec gives a more quantifiable value to the bottleneck.
Memory\Pool Paged Bytes: The number of bytes of memory taken up by system tasks that can be swapped out to disk if needed. Although this counter is not a direct indicator of a memory bottleneck, if the number of Pool Paged Bytes is large, it can indicate a lot of system processes. If this number is a significant percentage of total memory, you might need to increase RAM to allow for these tasks to remain in RAM instead of being swapped out.
Memory\Pool Nonpaged Bytes: The number of bytes of memory taken up by system tasks that can't be swapped out to disk. This figure can indicate a bottleneck in memory, especially if the figure is a significant percentage of the total amount of RAM. Because these processes can't be swapped out, they continue to take up RAM for as long as they are running.
Network Interface\Bytes Total/sec: An indication of the total throughput of the network interface. This figure can be used for general capacity planning and does not necessarily indicate a network bottleneck.
Processor\% Processor Time: The amount of time the processor spends executing non-idle threads. This figure is an indication of how busy the processor is. The processor for a single-processor system should not exceed 75% capacity for a significant period of time. The processors in a multiple-processor system should not exceed 50% for a significant period of time. High processor utilization can be an indication of processor bottlenecks, but it could also indicate lack of memory.
System\Processor Queue Length: The number of threads that are ready but waiting to be serviced by the processor(s). There is a single queue for all processors, even in a multiprocessor environment. A sustained queue of more than 2 generally indicates processor congestion.
Processor\Interrupts/sec: The number of hardware requests the processor is servicing per second. This is not necessarily an indicator of system health but, when compared against the baseline, it can help to determine hardware problems. Hardware problems are sometimes indicated by a device dramatically increasing the number of interrupts it sends.
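The rule-of-thumb values in these descriptions can be collected into a small table and checked against a set of readings, as in the Python sketch below. The `GUIDELINES` list and the `exceeded_guidelines` function are hypothetical names, the processor figure is the single-processor guideline, and the check treats each value as a point reading even though the text speaks of sustained levels.

```python
import operator
from typing import Dict, List

# (counter, comparison that signals trouble, guideline value from the descriptions above)
GUIDELINES = [
    (r"Memory\Available Bytes", operator.lt, 4 * 1024 * 1024),   # less than 4MB
    (r"Memory\Pages/sec", operator.ge, 20),                      # 20 or greater
    (r"LogicalDisk\% Free Space", operator.lt, 15),              # below 15%
    (r"LogicalDisk\% Disk Time", operator.gt, 90),               # above 90%
    (r"PhysicalDisk\Current Disk Queue Length", operator.gt, 2), # more than 2
    (r"Processor\% Processor Time", operator.gt, 75),            # single-processor guideline
    (r"System\Processor Queue Length", operator.gt, 2),          # more than 2
]


def exceeded_guidelines(sample: Dict[str, float]) -> List[str]:
    """Return the counters in the sample that breach their guideline value."""
    return [
        name
        for name, breached, limit in GUIDELINES
        if name in sample and breached(sample[name], limit)
    ]


if __name__ == "__main__":
    reading = {
        r"Memory\Pages/sec": 35,
        r"Processor\% Processor Time": 60,
        r"System\Processor Queue Length": 3,
    }
    print(exceeded_guidelines(reading))
```

A flagged counter is a prompt to compare against your baseline, not a verdict; as noted earlier, absolute numbers matter less than how they differ from normal for your server.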

System Monitor Tips and Tricks

Microsoft provides some helpful tips and tricks that you should keep in mind when working with the Performance console to solve performance-related problems. Through careful analysis of data, you might be able to determine problems with the network that are not otherwise seen, such as excessive demands on resources that result in bottlenecks, and therefore slow network performance to the degree that users begin to notice that something is wrong.

EXAM TIP

Saving time by saving your configuration. No one wants to reinvent the wheel over and over, and this holds true when you are configuring System Monitor with a set of counters. You can save a good amount of time and effort by setting up System Monitor with the counters and options you want on one server, saving the configuration file, and then distributing that file to each of your other servers instead of re-creating the configuration each time.

Don't forget about the Task Manager. Even though the Task Manager is a very simple tool, don't underestimate its usefulness. You can quickly launch the Task Manager to get a real-time look at network utilization (and process performance) without having to open System Monitor and configure counters.

The following are some of the most common causes of bottlenecks that you might encounter while troubleshooting your network:

The current level of provided resources is inadequate, thus requiring additional or upgraded resources to be added to the network.
The available resources are not utilized evenly, thus requiring some form of load balancing to be implemented.
An available resource is malfunctioning or stopped and needs to be repaired or restarted.
An available resource is incorrectly configured, thus requiring a configuration correction.

After you have identified a problem, you should take care to avoid creating new problems while correcting the old one. You should make one change at a time to avoid masking the impact of changes. After each change, you should perform additional monitoring to determine the result and the effect of the change and reevaluate the status and condition of the previously identified problem(s). In addition, you can compare the performance of applications that are run over the network to their performance when run locally to determine how the network is affecting performance.
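
The "one change at a time, then monitor again" approach lends itself to a simple numeric comparison between the baseline and the samples gathered after a change. The following Python sketch is one possible way to do that, not a feature of the Performance console: the function name, the dictionary layout, and the sample values are all assumptions made for the example.

```python
from statistics import mean


def compare_to_baseline(baseline, current):
    """Return the percent change in each counter's average relative to the baseline.

    baseline, current -- dictionaries that map a counter name to a list of samples
    Counters missing from `current`, or whose baseline average is zero, are skipped
    because no meaningful percent change can be computed for them.
    """
    changes = {}
    for counter, base_samples in baseline.items():
        if counter not in current:
            continue
        base_avg = mean(base_samples)
        if base_avg == 0:
            continue
        cur_avg = mean(current[counter])
        changes[counter] = (cur_avg - base_avg) / base_avg * 100.0
    return changes


if __name__ == "__main__":
    # Made-up sample data for illustration.
    baseline = {"Processor\\Interrupts/sec": [1000, 1000]}
    after_change = {"Processor\\Interrupts/sec": [1500, 1500]}
    for counter, pct in compare_to_baseline(baseline, after_change).items():
        print(f"{counter}: {pct:+.1f}% versus baseline")
```

Running the same comparison after each individual change keeps the effect of that change visible instead of letting it be masked by others.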

With our discussion of performance monitoring and baselining out of the way, let's move forward and examine the second topic of this chapter: disaster recovery operations.

The Performance Console

Figure 6.1. The Performance console consists of four different tools.

Introduction to System Monitor

STEP BY STEP

Figure 6.2. To start adding counter items, you need to click the + icon on the toolbar.

Figure 6.3. You can add counters to begin monitoring the Processor statistics.

Figure 6.4. Make sure you are adding the counters for the correct instance when multiple instances exist.

Figure 6.5. You can monitor the selected network interface statistics in real-time.

Working with Counters

Figure 6.6. The Add Counters dialog box contains many options you need to understand.

Using System Monitor to Discover Bottlenecks

Baselining Servers

Table 6.1. Counters to Monitor for Baselining and Bottleneck Troubleshooting

Creating Baseline Counter Logs

STEP BY STEP

Figure 6.7. A counter log is created to keep baseline (historical) statistics about performance.

Figure 6.8. You should enter a descriptive name for the new counter log.

Figure 6.9. You must configure objects or counters on the new counter log before proceeding.

Figure 6.10. You can add entire objects to the counter log if you want.

Figure 6.11. You can add specific individual counters to the counter log if you want.

Figure 6.12. You can use the Log Files tab to configure how the log is to be saved.

Figure 6.13. You can use the Configure Log Files dialog box to configure the maximum log file size as well as its path.

Figure 6.14. You can configure the counter log to start and stop automatically if you want.

Figure 6.15. You need to load the counter log file into the System Monitor to be able to examine it.

Figure 6.16. You can add one or more counter logs to the System Monitor.

Figure 6.17. You need to ensure that all counters have been added for viewing.

Table 6.2. Counter Log File Format Options

Table 6.3. Counter Log Numbering Systems

Daily Monitoring for Usage

Table 6.4. Counters to Monitor on a Daily Basis

System Monitor Tips and Tricks