Performance Tuning




qmail is a high-performance MTA. On a modern computer, a qmail installation that hasn't been carefully configured for optimum performance should be able to handle at least a million messages per day. For some sites, though, that's not good enough. Many things can affect the performance of a qmail installation, including system software and configuration; hardware type and configuration; network latency, bandwidth and configuration; and qmail configuration.

Is There a Problem?

Before you charge off and start tuning your qmail installation, you should determine whether you actually have a performance problem. The old adage applies here: "If it ain't broke, don't fix it." There's no point in configuring a system that will never handle more than a couple thousand messages per day to handle 10 million messages per day.

If you are having problems, you're probably already aware of them. Here are some of the potential indicators of a poorly performing system:

  • Delivery rate is too low. You need to deliver a message to a list of 10,000 recipients in 15 minutes, but it's taking an hour.

  • Deliveries take too long. It takes half an hour for a message to be sent between two people on the same system.

  • Unprocessed message count is too high. The number of unprocessed messages reported by qmail-qstat stays above zero for a long time or never goes down.

  • Load average is too high. The system load average goes through the roof when delivering to large lists or stays high all the time.

  • Local/remote concurrencies never reach their limits. qmail-send is unable to start more deliveries, even though there are messages waiting to be delivered now.

  • Local/remote concurrencies are often at their limits. Deliveries consistently run at the configured local or remote maximum.

  • Response to incoming connections is slow. Connections to your SMTP, POP3, or IMAP services take so long that errors are reported.

Is It a Performance Problem?

Once you've determined there's a problem, the next step is to find out whether it's a local performance problem, a local configuration problem, or a remote problem. For example, the unprocessed message count might be growing because qmail-send isn't running. Or the response to incoming SMTP sessions might be slow because the connecting host's reverse DNS configuration is incorrect. See Chapter 6, "Troubleshooting qmail," for guidance in determining the problem's nature. If you find that the problem is local and not because of a configuration error, it's probably time to start tuning.

Tuning qmail

Probably the easiest problems to fix are those that can be fixed by adjusting qmail's configuration. There are two main areas where qmail can be tuned: the queue and the local and remote concurrencies.

Tuning the Queue

In a default qmail installation, several of the queue subdirectories are split into 23 subdirectories. This reduces the number of files in each directory. Many common Unix file systems exhibit poor performance on directories that contain more than 1,000 files or so because directories are searched linearly. Some modern, high-performance file systems such as XFS and ReiserFS use hashing to speed lookups. For queues on such file systems, splitting the queue subdirectories is unnecessary and could even be slightly detrimental to performance.
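One way to see whether any split subdirectory is approaching that size is to count the files in each one. The following is a minimal sketch, not part of qmail: the function name split_counts is hypothetical, and the default path assumes a stock installation with messages stored under /var/qmail/queue/mess. Reading the queue typically requires root.

```shell
#!/bin/sh
# split_counts [AREA]: print "count path" for every split subdirectory
# under one queue area, busiest first.
# AREA defaults to the message store of a stock install (an assumption).
split_counts() {
    area=${1:-/var/qmail/queue/mess}
    for dir in "$area"/*/; do
        [ -d "$dir" ] || continue
        # Count regular files directly inside this split subdirectory.
        n=$(find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' ')
        echo "$n ${dir%/}"
    done | sort -rn
}

# Example (run as root on a live system):
#   split_counts /var/qmail/queue/mess | head -5
```

If the busiest subdirectories regularly hold far more than 1,000 files on a non-hashing file system, a larger split is worth considering.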

The conf-split compile-time configuration setting (see Chapter 2, "Installing qmail") can be used to adjust the queue split. The big-todo patch can be applied to add splitting to more of the queue subdirectories. Another thing that can be done to improve queue performance is to use more than one queue.

Adjusting conf-split

Adjusting conf-split requires setting a new value for the number of split sub-directories in the conf-split file, rebuilding the qmail binaries, and installing the new binaries:

  1. Choose a new conf-split.

When qmail places a file in a split subdirectory, it takes the message's queue ID, which is the inode number of the file used to store the message in the queue/pid directory. The queue ID is divided by the conf-split value, and the remainder identifies the split directory. For example, if the queue ID is 29 and conf-split is the default, 23, the split directory used is 6 because 29 divided by 23 is 1 with a remainder of 6. If the inode numbers were random, all of the split sub-directories would average about the same number of files.

However, because inode numbers are assigned by the file system, there's no guarantee they're assigned randomly. In fact, they're often assigned sequentially. Combined with the fact that each message in the queue uses a few inodes, the distribution of queue IDs sometimes contains many multiples of two, three, or four. If conf-split also happens to be a multiple of two, three, or four, qmail could end up putting most of the messages in a few of the split subdirectories while the rest remain nearly empty.

For this reason, conf-split should be a prime number.
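The effect is easy to demonstrate with shell arithmetic. The sketch below is illustrative only: split_dir and buckets_used are hypothetical helper names, and the queue IDs are simulated by stepping from an arbitrary starting value rather than read from a real file system.

```shell
#!/bin/sh
# split_dir ID SPLIT: the split subdirectory for a queue ID (ID mod SPLIT).
split_dir() {
    echo $(( $1 % $2 ))
}

# buckets_used SPLIT STEP COUNT: how many distinct split subdirectories are
# hit by COUNT queue IDs that advance by STEP (a stand-in for inode numbers
# handed out sequentially, several per message).
buckets_used() {
    split=$1 step=$2 count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        split_dir $(( 1000 + i * step )) "$split"
        i=$(( i + 1 ))
    done | sort -un | wc -l | tr -d ' '
}

split_dir 29 23          # the worked example: prints 6
buckets_used 24 4 1000   # composite split, IDs advancing by 4: prints 6
buckets_used 23 4 1000   # prime split, same IDs: prints 23
```

With a split of 24, queue IDs that advance by 4 land in only 6 of the 24 subdirectories; with a split of 23, the same IDs spread across all 23.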

Another rule of thumb is that for non-hashing file systems, each split subdirectory should contain no more than about 1,000 files. If your queue typically contains fewer than 23,000 messages, the default conf-split should be fine. If your queue peaks at around 400,000 messages, a conf-split of 401 should be used: 400,000 / 1,000 = 400, and the first prime number over 400 is 401.
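The same calculation can be scripted. This is a rough sketch under stated assumptions: suggest_split and is_prime are hypothetical helpers, the 1,000-files-per-directory figure is the rule of thumb above, and the result is never allowed to drop below the default of 23.

```shell
#!/bin/sh
# is_prime N: succeed if N is prime (trial division; fine for small N).
is_prime() {
    n=$1
    [ "$n" -ge 2 ] || return 1
    d=2
    while [ $(( d * d )) -le "$n" ]; do
        [ $(( n % d )) -ne 0 ] || return 1
        d=$(( d + 1 ))
    done
    return 0
}

# suggest_split PEAK [PER_DIR]: smallest prime at or above PEAK / PER_DIR
# (rounded up), but never below the default conf-split of 23.
suggest_split() {
    peak=$1 per_dir=${2:-1000}
    c=$(( (peak + per_dir - 1) / per_dir ))
    [ "$c" -ge 23 ] || c=23
    until is_prime "$c"; do
        c=$(( c + 1 ))
    done
    echo "$c"
}

suggest_split 400000   # prints 401
suggest_split 5000     # prints 23
```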

To set conf-split to 401, do this:

 # cd /usr/local/src/qmail-1.03
 # echo 401 > conf-split
 #

  2. Make sure the queue is empty:

     # qmailctl stop
     Stopping qmail...
       qmail-smtpd
       qmail-send
     # qmailctl stat
     /service/qmail-send: down 113 seconds, normally up
     /service/qmail-send/log: up (pid 274) 494966 seconds
     /service/qmail-smtpd: down 113 seconds, normally up
     /service/qmail-smtpd/log: up (pid 279) 494965 seconds
     messages in queue: 0
     messages in queue but not yet preprocessed: 0
     #

    Note

    Changing conf-split while there are messages in the queue will almost certainly corrupt the queue. The preferred solution is to wait until the queue is empty to change conf-split. Another option is to temporarily install qmail under a different conf-home (such as /var/qmail2) with the new conf-split and run both copies until the old queue is empty. Then shut down the old qmail, move /var/qmail2 to /var/qmail, and rebuild qmail with conf-home set to /var/qmail and the new conf-split.

  3. Remove the old queue:

     # rm -rf /var/qmail/queue
     #

  4. Rebuild qmail with the new conf-split:

     # make setup check
     ./auto-int auto_split `head -1 conf-split` > auto_split.c
     ./compile auto_split.c
     ./load qmail-clean fmtqfn.o now.o getln.a sig.a stralloc.a \
     ...lots of output followed by something like:
     auto_uids.o strerr.a substdio.a error.a str.a fs.a
     ./install
     ./instcheck
     #

  5. Restart qmail:

     # qmailctl start
     Starting qmail
     #

  6. Verify that qmail is working correctly. Send some test messages and check the logs for queue-related errors.
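Because removing a queue that still holds mail loses that mail, it can be worth guarding the removal with a check of the two counters that qmail-qstat reports. The following is a minimal sketch, not part of qmail: queue_is_empty is a hypothetical helper that reads qmail-qstat-style output on standard input.

```shell
#!/bin/sh
# queue_is_empty: read qmail-qstat style output on stdin and succeed only if
# both counters are present and zero.
queue_is_empty() {
    awk -F': ' '
        /^messages in queue:/                          { total = $2; seen++ }
        /^messages in queue but not yet preprocessed:/ { todo = $2; seen++ }
        END { exit !(seen == 2 && total == 0 && todo == 0) }
    '
}

# Example (as root, after stopping qmail):
#   if /var/qmail/bin/qmail-qstat | queue_is_empty; then
#       rm -rf /var/qmail/queue
#   else
#       echo "queue not empty; leaving it alone" >&2
#   fi
```

The check fails closed: if either counter is missing or nonzero, the function returns a nonzero status and the queue is left in place.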

The big-todo Patch

This patch adds splitting to the todo and intd queue subdirectories, which can improve performance on very busy servers. See the earlier "Modifying the Source Code" section for more information about this patch.

Multiple Queues

qmail-send is single-threaded and has to perform two major functions: processing new messages and passing them off to qmail-lspawn or qmail-rspawn. A qmail system that's trying to deliver mail rapidly to a large number of recipients can be severely impacted by a relatively low level of incoming mail, such as bounces. Installing another copy of qmail with its own queue just for handling mail coming in from remote sites will allow the sending qmail installation to run at full speed.

Because all messages in the queue are considered equally important, on a system that hosts large, busy mailing lists, regular users might find that their messages are sitting in the queue while qmail grinds away on bulk mail. One fix is to install another copy of qmail dedicated to local users. For example, you could install qmail under /var/qmail2 and instruct users to configure their MUAs to inject messages using /var/qmail2/bin/qmail-inject.

Also, because qmail is often bound by the input/output (I/O) performance of the queue, a server with multiple processors or disk interfaces can use multiple installations to achieve higher total performance than it can with a single queue.

Tuning the Concurrencies

By default, qmail will spawn up to 10 local delivery processes and 20 remote delivery processes. This is adequate for single-user systems and small servers, but larger, busier servers will need higher limits. A mailing list server, for example, can dramatically improve sending performance by raising concurrencyremote to 200 or more. The big-concurrency patch discussed earlier in "Source-Code Modifications" allows concurrencies of up to 65,000, though in practice little is gained by raising it beyond 500 in most cases.

Care should be taken not to raise the concurrencies beyond the capabilities of the system, or a burst of messages could cause qmail to spawn processes until some system resource is critically starved. Even on a dedicated mail server, you should leave some head room. If the system can handle a concurrency of 200 before it starts straining, limit it to 180. If mail is just one of many functions the system supports, restrict the concurrencies even more: You don't want a mail surge combined with, for example, an untimely Web server surge, to bring the system to its knees.

You might find that your system is never able to reach the concurrency limits you've set, even when you know there are messages waiting to be sent immediately—not just sitting in the queue waiting for their next retry time. If that happens, you'll have to look at tuning other parts of the system as described in the following sections.
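The qmail-send log records the current and maximum concurrencies in lines of the form "status: local 0/10 remote 3/20", so the peak actually reached can be pulled from a log. The sketch below is illustrative: peak_concurrency is a hypothetical helper, and the log path in the example is an assumption that depends on how logging was set up.

```shell
#!/bin/sh
# peak_concurrency KIND: read qmail-send log lines on stdin and print
# "peak/limit" for KIND (local or remote), taken from the status: entries.
peak_concurrency() {
    awk -v kind="$1" '
        /status: / {
            for (i = 1; i < NF; i++) {
                if ($i == kind) {
                    # The field after "local" or "remote" looks like 3/20.
                    split($(i + 1), a, "/")
                    if (a[1] + 0 > peak) peak = a[1] + 0
                    limit = a[2] + 0
                }
            }
        }
        END { if (limit) printf "%d/%d\n", peak, limit }
    '
}

# Example (log location is an assumption):
#   peak_concurrency remote < /var/log/qmail/current
```

A peak well below the limit during a busy period, with mail still waiting, points to a bottleneck elsewhere in the system.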

To change concurrencylocal or concurrencyremote, simply place the desired setting in /var/qmail/control/concurrencylocal or concurrencyremote and restart qmail-send, perhaps using qmailctl restart or svc -t /service/qmail-send. Check the qmail-send logs to verify that the new values are reported in the status: entries. Monitor the logs for a while to determine the effect of the change and adjust as necessary.
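As a sketch of that procedure, the helpers below apply a headroom margin to a measured maximum and write the result to a control file. Both function names are hypothetical, and the 10 percent default simply mirrors the 200-to-180 example given earlier; pick a margin that suits your system.

```shell
#!/bin/sh
# with_headroom MAX [PERCENT]: reduce a measured maximum by PERCENT
# (default 10), for example 200 -> 180.
with_headroom() {
    echo $(( $1 * (100 - ${2:-10}) / 100 ))
}

# set_concurrency NAME VALUE [CONTROL_DIR]: write a concurrency control file.
# CONTROL_DIR defaults to /var/qmail/control.
set_concurrency() {
    name=$1 value=$2 dir=${3:-/var/qmail/control}
    case $name in
        concurrencylocal|concurrencyremote) ;;
        *) echo "unknown control file: $name" >&2; return 1 ;;
    esac
    echo "$value" > "$dir/$name"
}

# Example (as root):
#   set_concurrency concurrencyremote "$(with_headroom 200)" &&
#       svc -t /service/qmail-send
```

The control files are read only when qmail-send starts, which is why the example restarts the service after writing the new value.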

Tuning the System Software

Sure, there are the various kernel settings that can be adjusted to eke out modest performance gains. Before you do that, though, you might want to consider some choices that can have a dramatic effect on performance: the choice of the operating system (OS) and file system used to hold the queue.

Choosing an Operating System

Most systems are capable of running under more than one operating system: a proprietary Unix variant provided by the manufacturer and one or more free operating systems such as Linux or a BSD (Berkeley Software Distribution) Unix like OpenBSD, NetBSD, or FreeBSD. It's easy to dismiss the free operating systems as amateur, hobbyist efforts without the support network provided by the major proprietary Unix vendors, but that might not be wise. Free operating systems are now widely used in production environments. They've proven to be powerful, reliable, efficient, and, perhaps most surprisingly, maintainable. Free operating systems are especially attractive on PC-compatible systems where they often outperform their commercial cousins while supporting a much wider range of hardware.

Choosing a File System

Traditional Unix file systems, such as those derived from the Berkeley Fast File System (FFS), perform well in most situations. An exception is directories containing thousands of entries. When searching a directory for a particular file or subdirectory, these file systems read directory entries sequentially until they find a match or have scanned the entire directory.

Modern file systems use sophisticated algorithms to improve performance while maintaining reliability. Some use a technique called hashing to rapidly look for entries in directories. Others store entries in special data structures that enable high-speed lookups.

qmail's queue splitting mechanism, discussed earlier in "Tuning the Queue," can be used to keep queue directories small, so it's not necessary to use one of these newer file systems for that reason alone. Of course, they have other performance advantages and features that make them attractive. See "Requirements for the Location of the Queue," in Chapter 2, "Installing qmail," for more information about selecting a file system for the queue.

Mailboxes are another area where large directories are sometimes encountered. Using the maildir format, each message in a mailbox or mail folder resides in a separate file. A maildir mailbox with 2,000 messages is also a directory with 2,000 files, and if it's stored on a slow file system, accessing the mailbox could be annoyingly slow.
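To find mailboxes that have grown large enough to matter, count the message files in a maildir's new and cur subdirectories. This is a minimal sketch: maildir_count is a hypothetical helper, and the path in the example assumes the common convention of a Maildir directory in the user's home.

```shell
#!/bin/sh
# maildir_count DIR: number of messages in a maildir (new/ plus cur/).
# Files in tmp/ are deliveries still in progress and are not counted.
maildir_count() {
    find "$1/new" "$1/cur" -type f 2>/dev/null | wc -l | tr -d ' '
}

# Example (path is an assumption):
#   maildir_count "$HOME/Maildir"
```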

Tuning the System Hardware

Hardware tuning falls into two broad categories: selection and configuration. In other words, you tune your disk performance either by buying a faster disk drive or controller or by altering the configuration of your drives. There are a few exceptions, such as enabling or disabling the write cache on a disk drive, where you can actually tweak the performance of a piece of hardware without otherwise altering its configuration, but those are rare.

We'll look at each of the major components of the system that determine its performance: the CPU (Central Processing Unit), RAM (Random-Access Memory), and disk I/O, and examine ways to tune each for maximum qmail performance.

Tuning the CPU

This is the first thing most novices think of when they think of speeding up a slow computer. It's also one of the least likely bottlenecks on a qmail system. Sending and receiving mail just doesn't require a lot of CPU power. Unless system monitoring utilities show that all available CPU cycles are going to the "user" state most of the time, the CPU isn't your bottleneck. If the CPU is the bottleneck, the fix is to replace it with a faster one (if that's an option) or to add additional processors—if the hardware and operating system support multiple processors.

Tuning the RAM

A simple and often inexpensive performance boost—depending on the current state of the volatile RAM market—can be achieved by installing additional memory. Certainly, if monitoring tools show frequent virtual memory paging activity, adding memory will improve overall performance. Another less obvious reason to have excess RAM installed is that many modern operating systems will use it for a disk cache. Files and directories that are regularly accessed are copied from the relatively slow disk drives into high speed RAM. For example, on a busy qmail server, it's likely that most of the queue directories will be accessed from cache, if it's available, which will dramatically speed up queue operations.

Tuning Disk I/O

Disk I/O—particularly for the queue—is the most common bottleneck on busy MTAs. Because qmail guarantees that the queue is crash proof, it tends to be even more demanding than other MTAs. Luckily, there are many ways to improve disk performance.

Isolation

Whenever possible, locate the queue on disks used only for the queue. Even better: Locate the queue disks on interfaces reserved for the queue. You don't want the disk to have to divide its attention between queuing activities and writing log files or mailboxes. And you don't want the interface to the queue disks to be shared with other non-queue-related activity.

Interface

Obviously, higher-performing disk interfaces will improve disk I/O. The two most common disk interfaces are Integrated Drive Electronics (IDE) and Small Computer System Interface (SCSI). Both have improved dramatically in the last few years, but SCSI still has the edge. It particularly outshines IDE when multiple drives are used on an interface. IDE is fine for most applications, but SCSI should be used for high-performance servers.

Single-Drive Performance

Another reason for choosing SCSI over IDE is that the fastest drives are always available with SCSI first. The primary indicator of disk drive performance is the speed at which the disk platters rotate. Faster rotation means higher bandwidth and lower latency. At the time of this writing, the fastest SCSI disks run at 15,000 revolutions per minute (RPM), and 10,000 RPM drives are typical. For IDE, the fastest are 7,200 RPM and 5,400 RPM is typical.

RAID

Redundant Array of Inexpensive Disks (RAID) technology combines multiple disk drives into a single logical drive for improved performance, capacity, or fault tolerance. The following levels classify RAID systems:

  • 0—Striping. Data is spread across multiple drives, often on different interfaces. Provides high bandwidth by spreading the I/O load across drives and interfaces, but doesn't provide fault tolerance.

  • 1—Mirroring. Data is written simultaneously to two or more drives. Provides high fault tolerance and read bandwidth (due to round-robin reads).

  • 2—Hamming Code ECC. A fault-tolerant configuration with high overhead; rarely, if ever, implemented in practice.

  • 3—Striping plus Parity Disk. Like RAID 0 with an additional disk used for storing calculated error detection/correction information (parity).

  • 4—Independent Disks plus Parity Disk. Like RAID 3, except that data is striped in blocks rather than bytes, so the data disks can be read independently.

  • 5—Independent Disks with Distributed Parity. Like RAID 4, except the parity information doesn't reside on a separate disk; it's distributed across the data disks.

There are also two common combined RAID levels:

  • 1+0—Striped Mirrors. A RAID 0 (stripe) of RAID 1 (mirrored) components. Combines the high performance of RAID 0 with the high fault tolerance of RAID 1.

  • 0+1—Mirrored Stripes. A RAID 1 (mirror) of RAID 0 (striped) components. Yields the high performance of RAID 0 but only the fault tolerance of RAID 3 or 5 because a single disk failure will revert it to a RAID 0.

For critical applications requiring high performance, either RAID 0, RAID 5, or RAID 1+0 is recommended. However, RAID 0 doesn't provide fault tolerance. RAID 3 and 5 don't provide the highest I/O performance, but RAID 5 is faster because the parity information is distributed.

RAID can be implemented in hardware disk controllers or via operating system software. Software RAID will use some CPU cycles, but because CPU rarely limits qmail, that shouldn't be a problem on most systems. Be sure to test performance before putting any RAID configuration into production.

Tuning the Network

qmail is a network server, so naturally it's sensitive to network performance. The network includes local and Internet connectivity. qmail's performance is also sensitive to the performance of the DNS.

Local Connectivity

qmail's performance on the Local Area Network (LAN) depends on the performance (bandwidth and latency) of the LAN. To improve local performance:

  • Use a faster physical network. For example, 100BASE-T (Fast Ethernet) or 1000BASE-T (Gigabit Ethernet) instead of 10BASE-T (Ethernet).

  • Use switches instead of hubs. Hubs share the bandwidth with all of the systems connected and don't allow full-duplex connections.

  • Use full-duplex connections instead of half-duplex. Full duplex provides full bandwidth in both directions at the same time.

  • Use multiple network interfaces, if necessary.

Internet Connectivity

Local 1000BASE-T connectivity won't help much if you're trying to pump a million messages to remote hosts over a 64 Kbps Integrated Services Digital Network (ISDN) link. Calculate the bandwidth you'll need based on the size of messages and the delivery rate you want to achieve, then add overhead for protocols such as SMTP and Transmission Control Protocol (TCP), which will be proportionally higher for smaller messages, and leave some headroom for DNS and other traffic. Also consider expansion to meet future needs.
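A back-of-the-envelope version of that calculation can be done with shell arithmetic. This is only a sketch: required_kbps is a hypothetical helper, and the 30 percent default padding for protocol overhead and headroom is a placeholder, not a measured figure. Measure your own traffic before sizing a link.

```shell
#!/bin/sh
# required_kbps MESSAGES AVG_BYTES SECONDS [OVERHEAD_PCT]
# Sustained bandwidth, in kilobits per second (1 Kb = 1,000 bits), needed to
# move MESSAGES of AVG_BYTES each within SECONDS, padded by OVERHEAD_PCT
# (default 30, an assumed placeholder). The result is rounded up.
required_kbps() {
    msgs=$1 bytes=$2 secs=$3 pct=${4:-30}
    bits=$(( msgs * bytes * 8 * (100 + pct) / 100 ))
    echo $(( (bits + secs * 1000 - 1) / (secs * 1000) ))
}

# A million 10,000-byte messages spread evenly over one day (86,400 seconds):
required_kbps 1000000 10000 86400      # prints 1204
required_kbps 1000000 10000 86400 0    # no padding: prints 926
```

Even with no padding, that load needs roughly 926 Kbps sustained, more than fourteen times what a 64 Kbps link can carry.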

DNS Caching

A busy mail server will be constantly sending DNS queries to the local name server. Running a caching-only DNS server, such as dnscache from the djbdns package, directly on the qmail server can dramatically improve DNS performance by storing the results of queries locally. The initial lookup of a domain name will still require sending a DNS query over the network to the local name server, but subsequent lookups of the same domain name will be answered immediately from the data in the local cache. See Appendix B, "Related Packages," for more information about djbdns.






The qmail Handbook
ISBN: 1893115402
EAN: 2147483647
Year: 2001
Pages: 186
Authors: Dave Sill
