5.4 Another Mailing List Strategy | sendmail Performance Tuning

A strategy that several sites have used to expedite email delivery deserves some mention, and this is as good a place to do it as any. The idea is to split up the reception of email messages and queue processing tasks with the notion that sending email can become more efficient by aggregating messages in the queue bound for a given destination rather than trying to send out email as it arrives. Basically, the sendmail daemon listening to port 25 will run in queue-only mode. That is, as messages come into the server, they are accepted and written to the queue, but no immediate delivery attempt is made. At the same time, several sendmail processes run that simply process the queue. Instead of initiating a sendmail daemon at start-up that does both,

 /usr/sbin/sendmail -bd -q30m

two sets of processes would be started:

 /usr/sbin/sendmail -bd -odq
 /usr/sbin/sendmail -q30m

As part of this strategy, when a queue run starts, the queue is sorted by recipient host, which corresponds to QueueSortOrder=host in the .cf file. Therefore, all of the qf files whose first recipient is bound for the same domain will be processed together, ideally minimizing the number of SMTP sessions required to deliver all queued messages. The intent is that spawning the minimal number of SMTP sessions will minimize the amount of effort expended by the server to get messages to their destinations.
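The grouping idea can be illustrated with a minimal sketch. This is a toy model, not sendmail's own sorting code: the queue IDs, the addresses, and the function names are hypothetical, and it assumes that one SMTP session per distinct first-recipient host is the ideal case.

```python
from collections import defaultdict

def first_recipient_host(recipients):
    """Return the lowercased domain of the first recipient address."""
    return recipients[0].rsplit("@", 1)[1].lower()

def group_by_first_host(queue):
    """Group queued messages by the host of their first recipient.

    `queue` maps a (hypothetical) queue ID to its list of recipients.
    """
    groups = defaultdict(list)
    for qid, recipients in queue.items():
        groups[first_recipient_host(recipients)].append(qid)
    return dict(groups)

# Four queued messages; only the first recipient of each is considered.
queue = {
    "qfA": ["alice@example.com"],
    "qfB": ["bob@example.org"],
    "qfC": ["carol@Example.com"],
    "qfD": ["dave@example.org", "erin@example.net"],
}
groups = group_by_first_host(queue)
for host in sorted(groups):
    print(host, groups[host])
```

In this model the four messages collapse into two groups, so in the ideal case two sessions would replace four. Note that the second recipient of qfD is invisible to the grouping, which is the weakness of sorting on the first recipient alone.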

When using this strategy, several other parameters are often altered so as to further improve performance. First, MinQueueAge is usually set to ensure that queued messages sit in the queue for some period of time before being sent. This idea is exactly the opposite of the strategy usually advocated in this book. The rationale is that we want messages bound for the same domain to accumulate in the queue before they are sent, so that all can be transferred in one connection. In this example, we might fire off a new queue runner every few minutes, maybe even every minute. To accomplish this task, we can invoke sendmail with

 /usr/sbin/sendmail -q5m

but set up the configuration file in the following manner:

 define(`confQUEUE_SORT_ORDER', `host')
 define(`confMIN_QUEUE_AGE', `30m')
 define(`confMAX_QUEUE_CHILDREN', `100')

Because queue runners are started so frequently, a sudden flurry of messages can keep earlier runners from finishing before new ones start. As a result, more processes than is optimal may end up working on the queues at the same time. Therefore, we cap this number by using confMAX_QUEUE_CHILDREN, just to make sure things don't get too far out of hand.
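A rough back-of-the-envelope sketch shows why the cap matters. It assumes, purely for illustration, that a new runner starts at every interval and that every run takes the same fixed time; the function name and the run times are hypothetical, and real run times vary with load.

```python
import math

def peak_queue_runners(run_minutes, interval_minutes, max_children):
    """Return (uncapped, capped) estimates of simultaneously active queue runners.

    Assumes a new runner starts every `interval_minutes` and that every run
    takes exactly `run_minutes`.
    """
    uncapped = math.ceil(run_minutes / interval_minutes)
    return uncapped, min(uncapped, max_children)

# Runs that finish in 12 minutes overlap only a little when started every 5 minutes.
normal = peak_queue_runners(run_minutes=12, interval_minutes=5, max_children=100)
# A backlog that stretches each run to 10 hours would exceed the cap without the limit.
backlog = peak_queue_runners(run_minutes=600, interval_minutes=5, max_children=100)
print(normal, backlog)
```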

With sendmail 8.12, another optimization uses the queue groups feature, namely SplitAcrossQueueGroups, to split envelopes so that the recipients at a given domain tend to be placed within the same qf file. A good example of how this approach might be used is available in the file sendmail/TUNING in the 8.12 source distribution. Splitting matters because the sort considers only the first recipient. If the recipient list in a qf file contains nine email addresses from a popular domain such as aol.com and one recipient from the domain very-obscure-smtp-site.com, but the latter recipient is listed first in the qf file, a queue runner won't know that associating this message with others bound for aol.com might be a good idea. Therefore, this strategy will suffer if none of the following conditions are met:

Each message is bound for a single recipient.
The messages are grouped by destination when injected into the machine.
sendmail can use queue groups to split up the recipients by domain.
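The effect of splitting an envelope by recipient domain can be sketched as follows. This is a simplified model of the idea, not sendmail's implementation; the function name and the addresses are hypothetical and follow the aol.com example above.

```python
def split_by_domain(recipients):
    """Split one recipient list into per-domain lists, preserving order."""
    envelopes = {}
    for addr in recipients:
        domain = addr.rsplit("@", 1)[1].lower()
        envelopes.setdefault(domain, []).append(addr)
    return envelopes

# One obscure recipient listed first, followed by nine recipients at aol.com.
recipients = ["postmaster@very-obscure-smtp-site.com"]
recipients += [f"user{i}@aol.com" for i in range(1, 10)]

envelopes = split_by_domain(recipients)
for domain, addrs in envelopes.items():
    print(domain, len(addrs))
```

After the split, the nine aol.com recipients form their own envelope, so a host-sorted queue run can group them with other aol.com traffic regardless of which address came first in the original message.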

In its totality, this strategy runs counter to just about everything mentioned so far in this book. Let's look at it more closely and see what it is really designed to accomplish. First, because messages will always sit in the queue for some time before being processed, this strategy will by no means minimize the amount of time needed to deliver the message to its final destination. In fact, if lowering the average or maximum transit time of the email is a critical factor, this strategy should absolutely not be used. Second, every message will sit in the queue for some period of time before its first delivery attempt is made. That wait is bounded by the queue run interval rather than by confMIN_QUEUE_AGE, however: if delivery has never been attempted for a message, then the first time a queue runner encounters it in the queue, the runner will attempt to deliver it regardless of whether confMIN_QUEUE_AGE is set. In any case, we'll still have more queued messages at any one time, possibly orders of magnitude more, than with the other strategies advocated in this book. While using version 8.12 queue groups with a large number of queues and envelope splitting based on recipient domain can reduce this number, this strategy will almost certainly result in relatively deep queues, something the rest of this book warns against as being a tremendous hazard.

So, what is being saved? This strategy reduces two things. First, the total number of SMTP sessions that need to be spawned is drastically reduced. This change decreases the number of DNS lookups that need to be performed and likely also reduces memory consumption, the number of fork()s that need to occur, and so on. Second, the total number of write operations in the queue is minimized. While queues may be deep, a fairly small number of writes will occur in them, as each qf file should never need to be rewritten; rather, it will be written and then unlinked. The number of reads of qf files shouldn't increase much, as the files need not be opened to determine when they entered a queue. That time can be discerned by stat()ing the file and reading its ctime. Strictly speaking, ctime is the time of the last inode change, but for a qf file that is never rewritten it is effectively the creation time.
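As a minimal sketch of that last point, a file's age can be read from its metadata alone, without opening the file. The function name is hypothetical, and the example uses a temporary file as a stand-in for a qf file. On Unix, `st_ctime` is the inode change time, which matches the file's creation only if nothing has modified the inode since.

```python
import os
import tempfile
import time

def seconds_in_queue(path, now=None):
    """Return the file's age in seconds based on st_ctime, using stat() only."""
    if now is None:
        now = time.time()
    return now - os.stat(path).st_ctime

# Stand-in for a qf file: a temporary file that is removed on exit.
with tempfile.NamedTemporaryFile() as tmp:
    ctime = os.stat(tmp.name).st_ctime
    # Pretend 90 seconds have passed since the file was written.
    age = seconds_in_queue(tmp.name, now=ctime + 90)
    print(round(age))
```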

This technique might be effective if larger numbers of queued messages don't pose a problem, rapid message delivery isn't really needed, and either the total number of SMTP connections should be minimized or queue runners using another policy get through their queue runs slowly. On the busiest email servers, this strategy likely would not be useful. The large number of queued messages would tend to be too high a price to pay for any optimization in SMTP connections. Instead, in those circumstances, aggressive envelope splitting, rapid queue rotation, and multiple queue runners using a random sort order will usually achieve higher throughput than the methods described above, although some systems do use this method effectively. Finally, note that this strategy will usually be more effective for outbound email sending. In most cases, it would not be a useful way to handle inbound email.