Troubleshooting Physical Server Performance


When troubleshooting the performance of your physical server, you are going to be relying more on your operating system and hardware knowledge than on your SQL Server 2005 knowledge. That’s why we have always tried to keep abreast of at least the basics of every version of the Windows operating system as it has been released. The same goes for hardware.

In most enterprises, a DBA will typically pass on operating system– and networking-related tasks and problems to specialized infrastructure information technology (IT) professionals, but the DBA still needs to be conversant with operating systems and networking at some level. Troubleshooting a physical server’s performance where SQL Server 2005 is installed will be very different from where Exchange or Internet Information Services (IIS) is installed. That’s where your expertise is invaluable!

Troubleshooting Tools

System Monitor is the main tool you’ll use to troubleshoot the performance of your physical servers. If you were inclined to capture data for a longer period of time, you might want to use the Performance Logs and Alerts tool.

Troubleshooting Methodology

You will need to adopt some sort of structured technique or methodology for finding the performance bottleneck at the server level in your SQL Server 2005 solution.

A bottleneck is generally one subsystem that has limited capacity and thus reduces the capacity of the entire system. In software systems, bottlenecks are caused by several factors such as insufficient resources, malfunctioning components, incorrectly configured resources, and workloads that are not distributed evenly.

The following are the major subsystems that reflect bottleneck areas that can affect server performance and subsequently need to be investigated:

  • Processor

  • Memory

  • Disk I/O

You’ll now look at how you can troubleshoot these physical server subsystems.

Troubleshooting Processor Problems

Consistently high processor utilization may indicate the need for tuning your queries and ultimately the need for a processor upgrade. However, before you decide to buy new hardware, you should determine the cause of processor performance problems and find a possible resolution. It might be an operating system–related problem!

Tip 

Generally, you will find that most SQL Servers are more likely to be I/O bound than computer bound. Given today’s processor hardware and the relatively inexpensive cost of multiple processors or even multicore processors, the memory or I/O subsystems will generally be more likely candidates for bottlenecks, especially memory, because the amount of data (and to a degree the number of concurrent users) is generally always growing.

Your main goal with troubleshooting processor-related problems is to isolate whether it is SQL Server 2005 or some other process (be it the operating system, a service, or some other software package) that is consuming the processor resources.

Detecting Processor Problems

As we have discussed, several tools, such as System Monitor, Performance Logs and Alerts, and Task Manager, can help you detect processor performance problems. Let’s start with the performance object counters that allow you to determine the processor usage:

  • The Processor : %Processor Time performance object counter represents the amount of time your processor is spending executing a nonidle thread. Your processors are generally considered to be a bottleneck if this performance object counter is consistently greater than 80 percent. Please note that for multiprocessor systems you can monitor a separate instance of this performance object counter for each processor. If you need the average value for all processors, you can use the System : %Total Processor Time performance object counter.

  • The System : Processor Queue Length performance object counter indicates the number of threads waiting for processor time. A value greater than 2 might indicate a processor bottleneck.

Tip 

Don’t forget that you can also quickly determine whether your SQL Server 2005 instance is consuming all the processor resources by examining the SQLSERVR.EXE process in Task Manager.

The Process : %Processor Time performance object counter represents the amount of time spent by a particular process executing on the processor resources. You can monitor the SQLSERVR.EXE process instance to determine whether it is the operating system or SQL Server 2005 that needs to be investigated further.

Resolving Processor Problems

You can employ a number of techniques to reduce the identified processor utilization before you need to consider purchasing more and/or faster processors. These techniques include the following:

  • Shutting down superfluous services and other start-up processes that are consuming both processor and memory resources.

  • Ensuring that you have the latest service packs and patches installed for both the operating system and the various applications running on your server.

  • Restarting the errant process. We have all seen how Internet Explorer can hog all the processor and memory resources of a computer. Restarting it releases all that memory, thereby improving performance.

Troubleshooting Memory Performance

Insufficient memory is one of the major causes for SQL Server performance degradation, because it will generate excessive paging, generate increased I/O activity, and slow down the system.

To diagnose and monitor memory problems, you should have a good understanding of memory architecture in SQL Server 2005 as well as understand the concept of memory pressure. (There are different types of memory pressure, such as internal and external and physical and virtual.)

Detecting Memory Problems

The virtual memory architecture and management system of the Windows operating system is extremely complex. This is reflected by the plethora of performance object counters that are available in System Monitor. We really don’t recommend anyone trying to understand it completely. There are much better things in life to spend your time doing.

For us mere mortal mortals, it is sufficient to monitor the following performance object counters because they give a good indication of a potential memory problem:

  • For Memory : Available Bytes (the number of bytes available), when memory available drops in the interval of 50–100MB, then you should investigate whether the system has memory problems. When the value is less than 10MB, then it’s a certain sign of external memory pressure.

  • For Process : Working Set (the amount of memory used by a process), if the value is consistently less than the amount of memory that is set by the min server memory and max server memory server options, you need to identify the underlying root cause.

The Task Manager utility is a quick and easy tool to use. You can use the Task Manager’s Performance tab to check the Physical Memory section and get the available memory. (Its output is similar to that of the Memory : Available Bytes performance object counter.)

Resolving Memory Problems

Generally when you have memory pressure, the first task you need to perform is to determine whether it is related to SQL Server (internal memory pressure) or to the operating system (external memory pressure).

So, check for external memory pressure first. If you lack physical memory, find major system memory consumers, such as unnecessary services, and try to eliminate them if possible. Otherwise, you have to consider adding more random access memory (RAM), although poor application design and inefficient indexing strategies can consume more memory.

If the external pressure is due to a lack of virtual memory, consider increasing the swap file size and, if possible, again find and eliminate the major consumers of virtual memory.

Troubleshooting I/O Bottlenecks

To detect and solve I/O bottlenecks, you need to understand how they will manifest themselves, such as through slow response times, timeout error messages, and so on. It is also important to understand the various factors that can contribute to excessive I/O activity such as through paging, transaction log file operations, or heavy tempdb activity.

Although you might have both disk and network I/O bottlenecks, you will generally find that disk I/O tends to be the sole problem, because disk drive technology has not dramatically improved in throughput over the past decade, unlike processors and network cards. Consequently, we will focus on disk I/O bottlenecks.

Detecting I/O Problems

The detection of disk I/O bottlenecks has been well documented in various Windows and SQL Server resources, so you should already be familiar with the performance object counters and what to watch.

You should monitor the following well-known performance object counters:

  • The PhysicalDisk Object : % Disk Time performance object counter represents the percentage of time a disk drive was busy servicing read or write requests. A value greater than 50 percent indicates a potential I/O bottleneck.

  • The PhysicalDisk Object : Avg. Disk Queue Length performance object counter tracks the average number of physical read and write requests queued on the selected physical disk. Look out for a value greater than 2 to indicate an I/O problem.

  • The PhysicalDisk Object : Avg. Disk Sec/Read and PhysicalDisk Object : Avg. Disk Sec/ Write performance object counters count the average time, in seconds, of disk read and write operations, respectively. As a guideline, a value less than 10 milliseconds (ms) is good, a value of 10–20 ms is OK, a value of 20–50 ms is slow, and a value greater than 50 ms indicates a serious I/O problem.

Tip 

When monitoring these performance object counters, do not forget to take into account the number of disk drives if you are using a redundant array of inexpensive disks (RAID), and adjust accordingly.

Resolving I/O Problems

Resolving disk I/O problems is typically done in hardware. It’s simply more cost effective to spread the load across multiple disk drives, for example, than to spend hours tuning the operating system and hardware. Remember, time equals money! However, you do have a number of possible resolution methods:

  • Use faster disk drives (10,000 rpm versus 7,200 rpm).

  • Use a “faster” disk drive interface (SCSI versus ATA).

  • Use a “faster” RAID technology.

    Note 

    The performance impact of RAID-5 on write operations, for example, has been well documented.

  • Use a “faster” network storage solution (SAN versus NAS).

  • Use additional disk drives to distribute the I/O load.

  • Tune your disk array (how much memory you allocate for reads versus writes).

  • Separate the operating system from the applications on the disk drives they use.



MCITP Administrator. Microsoft SQL Server 2005 Optimization and Maintenance Study Guide (70-444)
MCITP Administrator: Microsoft SQL Server 2005 Optimization and Maintenance (Exam 70-444) Study Guide
ISBN: 0470127457
EAN: 2147483647
Year: 2004
Pages: 146

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net