7.5 Removing Bottlenecks

Let's assume that the system bottleneck has been revealed using the techniques described earlier in this chapter. The next logical question to ask is, "What should be done about it?" Some of the methods for effecting an improvement in system performance will be obvious, and many have been discussed already. In this section we will explore some of the ways in which bottlenecks may be alleviated and some of the pitfalls that may be encountered.

It may seem that identifying the bottleneck and planning the fix should be the most difficult part of improving system performance, and they usually are. Additional frustration may arise, however. We never really eliminate bottlenecks completely; we just improve the throughput of one aspect of a system. A truism of information technology seems to be that the load placed on servers increases over time. As the load on an email server grows, it is inevitable that we will eventually encounter another bottleneck that must be removed. Perhaps this next bottleneck lurks just around the corner, rearing its ugly head after our capacity increases just a few percentage points from the level at which the last obstacle was removed. If we have a set of disks on a SCSI-2 interface that is saturated at peak times while delivering 8 Mbps of data to our applications, we won't have long to wait after upgrading the disks before the SCSI controller becomes a bottleneck. Sometimes, as in this example, the next hurdle that will need to be overcome is easy to see. At other times, it's almost invisible. With experience comes better instincts about where the next problem lurks, and with some support from the folks who control budgets, perhaps some of these roadblocks can be eliminated before they slow down the system again. No matter how much experience a person has, no one can anticipate everything. This uncertainty is just one of the things that makes the job so challenging.

7.5.1 CPU-Bound Systems

With a CPU-bound system, the first step is to see if anything currently running on the server can be stopped or moved to another server. If that's possible, it would be a fortunate fix. Of course, one cannot eliminate unnecessary tasks indefinitely. Sometimes a shortage of CPU power really masks another problem; for example, the system may be working very hard to move processes in and out of swap space. The danger in interpreting utilities that seem to report CPU utilization but also report I/O utilization, such as some versions of top or vmstat, has already been discussed.

By some measures, having a CPU-bound system is a good thing. It usually indicates that the rest of the system is well tuned and operating efficiently. Besides, CPU is often the easiest component to upgrade. Even in the worst-case scenario, we can expect the new chip released in the next quarter to offer a larger percentage improvement over the current product line than for any other computer component of the next-generation system.

Finally, the email applications discussed in this book all run many processes simultaneously to handle multiple requests. Thus they parallelize very nicely on multiprocessor computers. If an email server with two processors becomes CPU bound, it's almost certain that the same vendor has an upgrade plan to a four-CPU box, and upgrading to a system with more CPUs is almost always easier to plan and execute than upgrading I/O controllers, software, or storage systems. Not only is there typically less to configure, but it's also straightforward to estimate the actual improvement in CPU capability between the old and new systems. This improvement is generally much easier to predict than the effects of upgrading a storage system or adding RAM.

7.5.2 Memory-Bound Systems

As we've already learned, some amount of paging on a system is normal. Excessive paging, or "thrashing," causes problems, however. This condition is not always easy to detect when it is mild, but it is patently obvious when severe. If the system is truly memory bound, the only solution is to add more RAM. Fortunately, memory is relatively cheap when it comes to system costs. For most applications, having extra memory will help reduce I/O by providing more filesystem cache space. Email is helped less than many other applications by surplus RAM, but extra memory does help, sometimes a great deal. Rarely will a recently constructed email system require more memory storage than can fit in that machine. That is, CPU, networking, and I/O all tend to make an entire computer chassis obsolete before it needs to be completely filled with RAM.

Another pitfall may become evident: It's very easy to be fooled into thinking a system problem is a memory problem when that's not the case. In point of fact, every time a server runs out of a finite resource, if the load keeps coming, the system will always run out of memory. Consider the following scenario:

A computer is relaying email from the Internet to an internal email server. The internal server can handle anything the gateway can throw at it.
The gateway's queue resides on a single disk, and just today, the load has reached the point where metadata operation contention exhausts the I/O capability of the disk.
The server is now processing as much email as it possibly can, but the rest of the Internet won't be sympathetic and back off. Instead, email keeps getting sent, and at a faster rate than data can be moved into and out of the queue. Putting some numbers to this scenario, let us suppose that email comes into the server at a rate of 5 Mbps, but the queue is processed at 4 Mbps.
Consequently, more sendmail processes are spawned on the server than exit in a given time period, causing the number of processes to start to increase.
While they share a single text image in memory, each process has its own data image that starts eating away at available RAM.
This shortfall in memory causes the system to reduce the size of the buffer cache, placing more I/O demands on the queue disk that cannot be satisfied. The total number of processes increases further as each process takes longer to complete its work and exit.
Eventually, real memory pages become exhausted by all of these surplus processes, and the system starts to thrash.

When email administrators come to this machine and start running diagnostics, they will see that the server is out of memory and thrashing. Stopping there, they will erroneously conclude that the system needs more memory. They can obtain more and install it, but next time this situation occurs the server will merely flail around for a longer period before it begins to thrash. Adding memory will not correct the real problem.
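To put rough numbers on the scenario above, the following Python sketch estimates how long a server can absorb the 1 Mbps gap between arrival and processing rates before surplus processes consume its RAM. Only the 5 Mbps and 4 Mbps rates come from the example; the average message size, the per-process memory footprint, and the function name are assumptions chosen purely for illustration. The model is deliberately simple and ignores the feedback effect of a shrinking buffer cache, which would make matters worse in practice.

```python
def seconds_until_thrash(ram_mb, inflow_mbps=5.0, outflow_mbps=4.0,
                         msg_kbits=80.0, proc_mb=1.0):
    """Estimate how long until surplus sendmail processes exhaust RAM.

    Assumptions (not from the text): msg_kbits is the average message
    size in kilobits, proc_mb is the private data memory per process,
    and every message that cannot be processed leaves one extra
    process waiting on the queue disk.
    """
    surplus_mbps = inflow_mbps - outflow_mbps
    if surplus_mbps <= 0:
        # The queue keeps up with the load, so processes never pile up.
        return float("inf")
    # Convert the surplus bandwidth into surplus processes per second.
    surplus_procs_per_sec = surplus_mbps * 1000.0 / msg_kbits
    # RAM divided by the rate at which surplus processes consume it.
    return ram_mb / (surplus_procs_per_sec * proc_mb)


if __name__ == "__main__":
    for ram in (512, 1024, 2048):
        print(f"{ram} MB of spare RAM lasts about "
              f"{seconds_until_thrash(ram):.0f} seconds")
```

Under these assumptions, doubling the RAM only doubles the time before thrashing begins. As long as the arrival rate exceeds the rate at which the queue disk can be serviced, the time remains finite no matter how much memory is installed.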

In fact, this example leads to a general maxim about Internet servers. Surplus RAM acts as a buffer against temporary resource shortages. More RAM does not eliminate the problem, but it does buy the server more time in which the shortage might become resolved, or at least be abated. In our example, the resource shortage was disk I/O, but the same sort of scenario plays out for an email server that communicates directly with other servers around the Internet when the organization's Internet link is severed. Email backs up on the server filling the queue. As the queue grows deeper, the amount of time any one process spends in the queue increases, leading to resource contention that may become apparent to POP or IMAP users. Having more RAM on the server means that a longer outage may be tolerated before intervention becomes necessary.

Similar sorts of outages occur frequently and can be mitigated by good planning and architecture, but cannot be completely eliminated no matter how much effort is expended. DNS server outages, routers being given bad information or rebooting, "backhoe fade," or even unusual transient spikes in load can all cause these sorts of problems. A good server will be resilient against these sorts of situations, but it can never be made impervious to them. For this reason, it's more difficult to provide reliable Internet services than it is to provide many other utility services such as reliable dial-tone phone service. When a phone switch runs out of circuits, it can say "no" to the next entity wanting to use its resources; the process of saying "no" does not significantly drain the switch's resources. This result is much harder to achieve in the Internet case. Even saying "no" takes more resources, and the load will keep coming despite the refusal.

If a server is running normally at full capacity without any problems, but occasionally runs out of resources without having to process either more email messages (transactions) or larger messages (overall volume), then running out of memory is not the cause of the problem, but rather a symptom. The server is running out of "something else," where that something could be network bandwidth, I/O, CPU, or any number of other possibilities. If a server running out of memory is correlated with a higher demand being placed on that machine, that condition may indeed be a memory shortage.
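One way to apply this rule of thumb to monitoring data is sketched below in Python. It computes the Pearson correlation between load samples and memory usage samples and flags a possible genuine memory shortage only when the two rise together. The function names, the sample data, and the 0.8 threshold are all assumptions made for illustration, not values taken from any particular tool, and a real diagnosis would need far more than a single statistic.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant series cannot explain changes in the other one.
        return 0.0
    return cov / sqrt(var_x * var_y)


def memory_shortage_likely(msgs_per_min, mem_used_mb, threshold=0.8):
    """True when memory use tracks load closely (threshold is an assumption)."""
    return pearson(msgs_per_min, mem_used_mb) >= threshold


if __name__ == "__main__":
    # Hypothetical samples: memory climbs while the load stays flat,
    # which suggests memory exhaustion is a symptom of something else.
    flat_load = [300, 300, 300, 300]
    rising_mem = [500, 650, 800, 950]
    print(memory_shortage_likely(flat_load, rising_mem))
```

When memory usage climbs while the load stays constant, the check returns False, which matches the case described above where the server is really running out of "something else."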

7.5.3 I/O Controller-Bound Systems

On most disk systems, the data may be accessed by only a single I/O controller. If this controller becomes saturated, few remedies exist. Splitting up the load onto multiple storage systems using different controllers is the first thing to try, but that can't happen beyond the limit of one disk/SSD/RAID system per controller. If a controller with one device becomes saturated, the only option is to upgrade to a faster controller. However, this upgrade is a solution only if the storage device on the other end of the bus can support the faster speed. If a SCSI-2 controller is saturated talking to a single SCSI-2 disk, upgrading the controller isn't enough, because the disk will still speak SCSI-2 and the faster controller won't make a difference. In this case, the disk must be upgraded as well.

The use of system NVRAM can help mask controller saturation, but disks and controllers aren't that expensive, so upgrading shouldn't impose any special burden. It's generally a good idea to buy a high-end SSD or RAID system that comes with the highest-speed interface supported by the device, even if one has to buy a new controller to match. Even if the bandwidth isn't needed now, it would be a tragedy to purchase an ultra-fast storage system that must be completely replaced at a later date solely because its controller runs out of bandwidth before the storage system does. Only the very highest-end storage devices (SSDs and the most powerful RAID systems) can saturate the fastest controllers on the market by themselves in typical email environments. If this event comes to pass, the only solution is to divide the load over multiple disk systems on separate controllers, either with multiple mount points or by using software RAID to create a single storage image out of multiple devices by striping them together.

Because email data access patterns tend to be small and random, one can usually place several, or even many, disks on a single controller with confidence. In an environment where only a single large file will be read at a time, two or three disks per controller might be the maximum supportable. For email servers, it's usually safe to put several disk drives on the same SCSI bus, perhaps as many as six on a SCSI-2 chain, or even a dozen on high-speed controllers. It's still a good idea, though, to dedicate controllers to email tasks. Controllers can become saturated on email servers, especially if solid state disks or high-performance storage systems are employed.
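The reasoning behind these disk counts can be illustrated with a back-of-the-envelope calculation, sketched here in Python. A disk servicing small random requests moves far less data per second than one streaming a large file, so more such disks fit within a controller's bandwidth. The controller bandwidth, operations per second, and request sizes used below are assumed round numbers for illustration only, and the function name is hypothetical. The result is a bandwidth-only upper bound; it ignores bus arbitration overhead and device-count limits, which is one reason to stay below it in practice.

```python
def disks_per_controller(controller_mb_per_s, iops_per_disk, io_kb):
    """Upper bound on disks a controller can feed, by bandwidth alone.

    controller_mb_per_s: usable controller bandwidth in MB/s (assumed)
    iops_per_disk:       I/O operations per second each disk sustains (assumed)
    io_kb:               average size of each operation in KB (assumed)
    """
    per_disk_mb_per_s = iops_per_disk * io_kb / 1024.0
    if per_disk_mb_per_s <= 0:
        raise ValueError("per-disk throughput must be positive")
    return int(controller_mb_per_s // per_disk_mb_per_s)


if __name__ == "__main__":
    # Email-like workload: many small random operations per disk.
    print(disks_per_controller(10, iops_per_disk=100, io_kb=8))
    # Streaming workload: fewer but much larger operations per disk.
    print(disks_per_controller(10, iops_per_disk=80, io_kb=64))
```

With these assumed figures, the small random workload allows roughly six times as many disks per controller as the streaming workload, which is the same qualitative gap the text describes.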

7.5.4 Disk-Bound Systems

Conceptually, upgrading disk systems is fairly easy. Get faster disks, get faster controllers, and get more disks. The problem is predicting how much of an improvement one might expect from a given upgrade.

If the system is truly spindle bound, and the load is parallelizable such that adding more disks is practical, this route is almost always the best way to go. When a straightforward upgrade path exists, there's no more likely or predictable way to improve a system's I/O than by increasing the number of disks. The problem is that a straightforward path for this sort of upgrade isn't always obvious. As an example, assume we have one state-of-the-art disk on its own controller storing sendmail's message queue, and the system has recently started to slow down. There are two ways to effectively add a second disk to a sendmail system. First, we could add the disk as its own filesystem and use multiple queues to divide the load between the disks. This upgrade will work, but will become more difficult to maintain and potentially unreliable if it is repeated too many times. Second, we could take a more hardware-centric approach: create a hardware RAID system, install a software RAID system to stripe the two disks together, or add NVRAM to accelerate the disk's performance. With any of these solutions, upgrading the filesystem might also become necessary. None of these steps is a trivial task, and with the addition of so many variables, there's no way to be nearly as certain about the ultimate effect on performance.

Obviously, we can't add disks without considering the potential effect on the I/O controller, and sometimes limits restrict the number of controllers that can be made available in a system. While we rarely push the limits of controller throughput with a small number of disks because email operations are so small and random, it's possible to add enough disks on a system such that we run out of chassis space in which to install controller cards.

Any time a system has I/O problems, it would be a mistake to quickly dismiss the potential benefits of running a high-performance filesystem. This solution is usually cheap and effective, and where available can offer the best bang for the buck in terms of speed improvement. If I am asked to specify the hardware for an email server, in situations where I have complete latitude in terms of the hardware vendors, I know I can get fast disks, controllers, RAID systems, and processors for any operating system. The deciding factor for the platform then usually amounts to which high-performance filesystems are supported. This consideration is that important.

If a RAID system is already in use, performance might be improved by rethinking its setup. If the storage system is running out of steam using RAID 5, but has plenty of disk space, perhaps going to RAID 0+1 will give the box some more life. If it is having problems with write bandwidth, lowering the number of disks per RAID group, and thus having a larger percentage of the disk space devoted to parity, may help. Losing unused space is certainly preferable to buying a new storage system. Changing the configuration of the storage system is especially worth consideration if it wasn't set up by someone who really understood performance tuning. The vendor could very easily have given some advice that wasn't optimal for email applications.
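The space cost of these reconfigurations is easy to estimate. The Python sketch below assumes single-parity RAID 5 groups, in which one disk's worth of capacity per group holds parity, and a mirrored RAID 0+1 layout, in which half the disks hold copies. The twelve-disk array and the function names are hypothetical, and the sketch says nothing about how much performance a given layout recovers; it only shows how much usable space each option gives up.

```python
def raid5_usable_disks(total_disks, disks_per_group):
    """Disks' worth of usable space when total_disks are split into RAID 5 groups."""
    if disks_per_group < 3:
        raise ValueError("RAID 5 needs at least three disks per group")
    groups = total_disks // disks_per_group
    # Each group gives up one disk's worth of capacity to parity.
    return groups * (disks_per_group - 1)


def raid5_parity_fraction(disks_per_group):
    """Fraction of each RAID 5 group's capacity devoted to parity."""
    if disks_per_group < 3:
        raise ValueError("RAID 5 needs at least three disks per group")
    return 1.0 / disks_per_group


def raid01_usable_disks(total_disks):
    """Disks' worth of usable space when every disk is mirrored."""
    return total_disks // 2


if __name__ == "__main__":
    total = 12  # hypothetical array size
    for group_size in (6, 4, 3):
        print(f"RAID 5, groups of {group_size}: "
              f"{raid5_usable_disks(total, group_size)} disks usable, "
              f"{raid5_parity_fraction(group_size):.0%} parity")
    print(f"RAID 0+1: {raid01_usable_disks(total)} disks usable")
```

For the hypothetical twelve-disk array, moving from groups of six to groups of four costs one disk's worth of space, while moving to RAID 0+1 costs four. Whether that space is truly unused is the question to answer before reconfiguring.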

If a RAID system has been set up suboptimally, it may also be possible to improve its performance via upgrading. Vendors often provide upgrade solutions to their RAID systems that can improve their throughput, both in terms of hardware components and the software that manages the system. Also, to save money, the system might have originally included insufficient NVRAM or read cache; performance might improve dramatically if more, or any, is installed.

7.5.5 Network-Bound Systems

Two networks must be considered: the network(s) under one's control, typically one or more LANs, and the network connection(s) to the Internet, which are usually much more difficult and expensive to upgrade. If the problem lies with the latter, one can do little except to upgrade the server or add an off-site spillover MX host to ride out the times when network contention causes email to back up. Unfortunately, this tactic doesn't really solve the problem, but merely mitigates it. Further, it's fairly costly in terms of server hardware, maintenance, and potential rack space at a better-connected site. When the problem lies with an internal network, the solution is usually much more tractable.

Reducing the number of other servers contending for time on a cramped network and going to a switched topology are the first things to try if the email server resides on a shared network. If the network is already switched, upgrading speeds and NICs will be necessary, and one will want to make sure the switch itself isn't overloaded.