Understanding Scale-Up


How do you grow a system? The first thing that comes to the mind of most system administrators is to throw additional CPU bandwidth at the machine. In other words, if you are using a 3.00 GHz CPU and need more power, add a faster CPU. If the hardware (the motherboard) can take additional processors, or supports processor cards that can be inserted into slots, then by all means add more CPUs. After all, if the machine is performing poorly with, say, a 3.00 GHz processor, add another one for an effective processor bandwidth of 6.00 GHz. Right? Wrong.

Unfortunately, no system (I don’t care what operating system or hardware brand) scales linearly with the processors you add to it. Linear scale-up would mean that if you double the number of processors, throughput doubles. You can also look at it another way: if you double the number of processors, response time should be cut in half (linear speed-up). But linear scale-up cannot be sustained in practice, because we live in a world governed by the laws of gravity, friction, inertia, and so on. One such law you would do well to understand is Amdahl’s Law of Scalability.

In 1967, Gene Amdahl stated that the potential speedup to be obtained by applying multiple CPUs is bounded by the program’s “inherently sequential” computations. In other words, there are always going to be some segments of the program that have no alternative but to execute serially. Examples are reading input parameters, characters entered into the system by the serial machine known as a human being, or writing output to files. The time to execute these serial segments cannot be eliminated, even if the parallelizable work could be executed in an infinitely small amount of time. In that sense, the upper bound on performance improvement is independent of the number of CPUs that can be applied.

Put another way, some results simply cannot be produced by parallel computation alone. An example from the data processing world is the standard SQL query that can only derive its result set from the result set of a subquery. Obviously, the two queries cannot run in parallel. All we can do is ensure that the inner query completes as fast as possible on one processor while the outer query waits milliseconds behind it for the inner result set to compute.
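
As a simple illustration (the table and column names here are hypothetical), consider a query of this shape:

  -- The outer filter cannot be evaluated until the inner query
  -- has computed the average over the whole table.
  SELECT ProductID, UnitPrice
  FROM dbo.Products
  WHERE UnitPrice > (SELECT AVG(UnitPrice) FROM dbo.Products);

However many processors are available, the comparison in the WHERE clause must wait for the single scalar value produced by the subquery.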

In computer science or engineering terms, we can only squeeze more performance and bandwidth out of our systems by increasing the proportion of the processing that can be parallelized. This relationship can be expressed in the following equation:

  TN = T1 × ((1 − p) + p / N)

TN is the computing time using N CPUs, T1 is the computing time using one CPU, and p is the proportion of the processing that can be parallelized. The speed-up is therefore T1 / TN = 1 / ((1 − p) + p / N), and its limit as the number of processors (N) approaches infinity is 1 / (1 − p). Because p will always be below 100 percent, we will never be able to achieve perfect or constant scalability.
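
For example, if 90 percent of the work can be parallelized (p = 0.9), eight CPUs yield a speed-up of only 1 / (0.1 + 0.9/8) ≈ 4.7, and no number of CPUs, however large, can push the speed-up beyond 1 / (1 − 0.9) = 10.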

Quite apart from the laws of physics that come to bear on our computer systems, constant scale-up linearity is an impossible objective because a computer system is not just the sum of its processor cycles. Just as the human brain cannot exist without the heart and nervous system, a computer system also comprises system buses, hard disks, random access memory of a given speed, the I/O capabilities of the operating system, and so on. These are all elements that we identified earlier as possible sources of bottlenecks in a system.

Database operations apply pressure to all areas of a computer system, and performance depends on many factors: the nature of the applications; the construction of queries; the condition (such as fragmentation) of files, indexes, and filegroups; the use of hard disks; and the list can go on. This is why it makes no sense to regard MIPS as any meaningful measure of relative database system performance. To sum it up, there is substantially more to transaction processing than just the processor. Hence the advent of TPC testing, as discussed in Chapter 4, which even takes into account the cost of systems, client access, and maintenance.

Arguing that one platform or product is better than another on this score is also an exercise in futility, because no one has achieved linear processor scalability; it is a myth and will remain one for a long time to come. Vendors such as IBM and Sun might spend millions trying to achieve the ultimate, but I consider it safe to say that 99.99 percent of businesses in the world do not have the budget or access to the technology that these vendors pour into their public relations exercises.

Scaling Up: The Shared Memory Model and SMP

But now let’s assume that your system has more than enough memory, that your hard disks are working perfectly, that your code is highly optimized, that no significant bottleneck can be identified on the system buses, and that only the processor is the bottleneck. You still need to improve system performance by increasing the number of threads and fibers executing concurrently, in parallel, and the smartest way to cater to that need is to add additional CPUs, as many as the system can take without impacting collateral areas such as memory and hard-disk bandwidth.

This is known as symmetric multiprocessing, or SMP. You get vertical growth of the system’s processing capability by adding more processors. The growth may at first continue in a straight line going up, but there comes a point at which the various factors discussed earlier begin to pull the processing bandwidth or capacity down and the line starts to curve.

We call such an SMP system a shared memory system, one based on the shared memory model. Regardless of the number of CPUs in the system, there is only one contiguous memory space shared by all processors. The system runs a single copy of the operating system, with the application executing oblivious to the additional processors. The DBMS software operates no differently than it does on a single CPU; it just gets a lot more done because more tasks can be executed in parallel.
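
If you want to see how many processors an instance of SQL Server 2005 actually has at its disposal, the dynamic management views expose this; the following is a minimal sketch:

  -- Logical CPUs reported to SQL Server by the operating system
  SELECT cpu_count FROM sys.dm_os_sys_info;

  -- Schedulers that are online and available for user work
  SELECT COUNT(*) AS online_schedulers
  FROM sys.dm_os_schedulers
  WHERE status = 'VISIBLE ONLINE';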

Shared memory SMP systems vary from operating system to operating system. In the server market, all vendors use the same or very similar processing architectures. We will not get into the differences between CISC and RISC and so on, because that discussion has little to offer in a book on SQL Server and the Microsoft operating systems.

Some operating systems scale better in the SMP arena than others; however, as mentioned earlier, the ability to add more processors or grow huge SMP systems, such as 96-CPU machines, is not the only factor to consider, for these reasons:

  • First in many cases is cost (boiled down to the $tpmC, for want of a scale that has an expense factor). Once you begin to scale past eight processors, cost begins to escalate rapidly. Non-Intel platforms are known to rocket into the millions of dollars when the number of processors climbs into the high double digits. A $12 million 64+ CPU system is beyond the budget of most companies. On the other hand, when compared to a legacy mid-range or mainframe system, 12 big ones might not be a lot of money.

  • As SMP scales, so do the collateral system components. A large SMP system will likely consume large amounts of memory. Storage is another critical area: on a large SMP system you will need huge arrays of hard disks, and the number of disks will explode as you configure for redundancy. Remember that RAID 5 requires three or more disks. Other areas that add to the cost include cooling systems, fans, duplexed controllers, redundant components, and so on.

  • The parallelism achieved with multiple CPUs does not necessarily mean that all threads have equal and concurrent access to the shared resources. In fact, when you start going past the 12-CPU level, serialization in the accessing of resources can cause severe bottlenecks. In other words, you might have 32 CPUs going, but if all 32 threads operating in parallel need access to the same rows in a table, 31 of the threads have to watch their manners. All it takes is one transaction to lock the table for some time, and you have $11,999,000 of equipment standing idle for a few seconds. I am one CTO who would hate to have to explain what those seconds cost.

  • Management becomes more costly. The more components you have in an SMP system, the higher the management and monitoring cost. Large arrays, multiple controllers, scores of CPUs, and miles more cabling also bring additional chances for bottlenecks.

  • The single-point-of-failure risk increases. Probably the most important factor detracting from a large SMP system is that the system is only as strong as its weakest link. If that link breaks, whatever it might be, the system crashes, all 96 CPUs shut down, and a $12 million system goes south. Sure, you can fail over to another 96-CPU system; just throw another $12 million at the project.

However, probably the most unattractive aspect of the high-SMP systems and the shared memory model is that you get very little more for your money and effort. If large-SMP systems showed performance and processing power greater than any other system by an order of magnitude, there might be a case for them, but they don’t. What model thus works best for SQL Server?

There is obviously a need to scale up a Windows Server 2003 and SQL Server 2005 system. Windows Server 2003 scales very well, far beyond the "sweet spot" for a system. On a database management system, high transaction bandwidth or an extensive drill-down for analysis purposes works better when the application is able to spawn multiple concurrent threads across several CPUs. Parallelism has been engineered not only into the operating system but also into SQL Server.
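
The degree of parallelism is also configurable, both server-wide and per query. The following sketch shows both knobs; the value of 4 and the table name dbo.OrderDetails are merely illustrative:

  -- Server-wide cap on the number of CPUs used by a parallel plan
  EXEC sp_configure 'show advanced options', 1;
  RECONFIGURE;
  EXEC sp_configure 'max degree of parallelism', 4;
  RECONFIGURE;

  -- Per-query override using a hint
  SELECT OrderID, SUM(Quantity) AS TotalQuantity
  FROM dbo.OrderDetails
  GROUP BY OrderID
  OPTION (MAXDOP 4);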

The price/performance curve shows a significant increase in performance for a given price when you go from one CPU to four, and from four to eight. For the most part, your money is spent adding the additional CPUs. Hard disks, memory, controllers, and PCI buses (the same components the entire industry uses) are relatively inexpensive on a quad system. Even going to eight processors is relatively easy, and companies like HP routinely build such systems. Going beyond eight is when things start to go awry: prices begin to rise sharply and returns begin to diminish.

For Windows Server 2003 and SQL Server 2005, we thus find that practical gains for the very high-end systems cap out at eight processors. Beyond that, it starts to make more sense to scale out, as discussed later in this chapter.

Scaling Up and Availability

When planning for high availability, you cannot simply grow a single machine and then place all your money in that one vertical system. All it takes is for a single point of failure to fail, and no matter how much the system costs, it is dead in the water. All mission-critical, 24×7 operations require some form of redundancy model that can ensure that, if a system goes, the enterprise does not come to a grinding halt.

There are several means of providing redundancy. The more effective the redundancy (the faster the fail-over capability), the higher the cost and the more complex the solution. On the budget scale, the cheapest fail-over system is an idle instance of SQL Server running on the same computer as the active one. The problem with this is that a hardware crash can take out the idle instance as well. Nevertheless, I will go into idle instance configuration later in this chapter.

The next cheapest fail-over option, which does cater to hardware redundancy, is a warm standby server. The standby is a secondary server that takes over the connections for the primary if the primary fails. But there are problems with this scenario that make the model less than desirable in a mission-critical, 24×7 operation that cannot afford more than a few minutes offline.

Our next option, then, is to move to the shared-disk system model, which is provided by the Microsoft Cluster Service (MSCS) available in Windows Server 2003. The Cluster Service requires a single disk array shared by at least two Windows Server 2003 servers, although it is possible to create a single-node virtual server. All databases and log files that each instance of SQL Server requires are stored on the shared disk array. The array is usually a RAID-5 or RAID-10 storage unit. The two servers are connected together and communicate via a high-speed interconnect, known as the heartbeat.

A Cluster Service configuration of this kind is an active/passive cluster, which means that one server stands idle and lets the active server take all the hits at the hardware level (more about active/active later). The passive server waits until a fail-over event occurs from the Cluster Service and then takes over as the active server. The takeover event is known as the fail-over. The fail-over is automatic and does not require any operator intervention; however, it is not entirely seamless: clients will have to reconnect, but service can be restored in under a minute in most situations. The databases are shared between the nodes on the shared storage unit, so there is no need to restore databases and transaction logs, a process that would otherwise take considerable time.
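
From the SQL Server side, you can confirm whether an instance is clustered and which physical node it is currently running on; a quick sketch using built-in server properties:

  -- Returns 1 if this instance is installed on a failover cluster
  SELECT SERVERPROPERTY('IsClustered') AS IsClustered;

  -- The physical computer (node) currently hosting the instance;
  -- after a fail-over this value changes to the other node's name
  SELECT SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS CurrentNode;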

I will not go too deeply into the actual Cluster Service here, but some background on how the Cluster Service works is a good idea.



