4.2 Recipient Verification

Unless it has been configured not to do so, when sendmail is told that a message is bound for a recipient in a local domain, it will look up the user name to make sure that the recipient actually exists on the server before it attempts to deliver the message. This step allows a misaddressed message to be rejected during the portion of the SMTP dialogue where the envelope is exchanged. If a message can be rejected during the envelope exchange, the server won't have to expend resources associated with processing and queueing the message itself. High-performance email servers often store very large numbers of individual mailboxes. Operating systems sometimes don't perform very well, or even correctly, if the /etc/passwd file contains hundreds of thousands, much less millions, of entries. Furthermore, for some large systems, it would be better to not locally store any email account information.

If an alternative authentication system is used and bypassing the system's normal authentication mechanisms becomes necessary, there are four ways of dealing with this issue in the MTA:

1. Don't have the MTA check for account existence upon delivery.
2. Modify the sendmail source to make the check using the new mechanism.
3. Use Pluggable Authentication Modules (PAM) or some other operating system-supported mechanism to make the traditional system-provided user lookup calls access the external information source.
4. Use the sendmail Mailbox Database (mbdb) API.

4.2.1 Don't Check

The first case is the easiest to accomplish. Simply exclude the "w" flag from the local mailer definition. Note that this step also disables checking for .forward files. This is necessary because sendmail cannot gain information about the location of a user's home directory. Removing the user lookup can be effected using the MODIFY_MAILER_FLAGS macro in the same manner as in an earlier example:

 MODIFY_MAILER_FLAGS(`LOCAL', `-w')

This flag isn't modifiable using the LOCAL_MAILER_FLAGS variable, so if the MODIFY_MAILER_FLAGS macro didn't exist, we would have to make some major changes to the M4 files that come with sendmail to accomplish this goal without hand-hacking the sendmail.cf file.

The downside to this method is that a message must be received and handed off to the LDA before we might discover that its recipient isn't valid.

4.2.2 sendmail Source Modification

Modification of the sendmail source to use an alternative authentication method isn't terribly difficult for an experienced programmer. Simply create a routine that provides an interface to the authentication system that mimics the UNIX getpwnam() library call, and then make the appropriate insertion into the definitions of sm_getpwnam() and sm_getpwuid() in the sendmail/conf.c file in the source distribution. These routines have remained remarkably stable over the last few major revisions of sendmail; only the directory in which the source code for the MTA resides in the distribution has changed, from src to sendmail during the change from version 8.9 to version 8.10.
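As a rough illustration of what a getpwnam()-style replacement routine might look like, here is a minimal sketch. The function name site_getpwnam(), the struct site_account type, and the in-memory sample table are all hypothetical; a real site would query its own account store at that point, and this is not code from the sendmail distribution.

```c
#include <assert.h>
#include <pwd.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical account record; a real site would query its own data source. */
struct site_account {
    const char *name;
    uid_t uid;
    gid_t gid;
};

/* Hypothetical sample data standing in for the external account store. */
static const struct site_account site_accounts[] = {
    { "npc", 1001, 1001 },
    { "jim", 1002, 1002 },
};

/*
 * Mimic getpwnam(): return a pointer to a static struct passwd for a
 * known user, or NULL if the user is not found. As with getpwnam(),
 * the result is overwritten by the next call.
 */
struct passwd *
site_getpwnam(const char *name)
{
    static struct passwd pw;
    static char namebuf[64];
    static char locked[] = "*";
    static char empty[] = "";
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(site_accounts) / sizeof(site_accounts[0]); i++) {
        if (strcmp(site_accounts[i].name, name) != 0)
            continue;
        memset(&pw, 0, sizeof(pw));
        strncpy(namebuf, name, sizeof(namebuf) - 1);
        namebuf[sizeof(namebuf) - 1] = '\0';
        pw.pw_name = namebuf;
        pw.pw_passwd = locked;      /* never expose a real password here */
        pw.pw_uid = site_accounts[i].uid;
        pw.pw_gid = site_accounts[i].gid;
        pw.pw_gecos = empty;        /* no full name available */
        pw.pw_dir = empty;          /* no home directory */
        pw.pw_shell = empty;
        return &pw;
    }
    return NULL;
}
```

Because the routine keeps the same return convention as getpwnam(), the call site that consumes the result would not need to change; only the place where the lookup is made does.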

Of course, once these changes are made they must be maintained, applied, and tested against future versions of sendmail. A major advantage of a widely adopted Open Source software package such as sendmail is the comfort that comes from knowing that many experienced programmers analyze the significant changes made to such a package, and that many more people are involved in the thorough testing process that results from its wide use. Once a site starts modifying the source, it can never again be completely certain that the global testing that the package constantly endures will remain entirely applicable to the code being run. A considerable amount of time will be required to keep one's changes in sync with a package that changes as often as sendmail. This issue should provide a strong incentive to avoid making site-specific changes to the source code unless absolutely necessary. When changes are necessary, a minimalist approach is best.

4.2.3 Alternative Authentication Method

The third possibility for using an alternative authentication scheme with sendmail is to alter the way the system performs its authentication. The mechanism by which this goal is most easily achieved is by writing one's own Pluggable Authentication Module (PAM) and configuring the system so that this module handles sendmail's authentication requests. Of course, not all operating systems support PAM, but this strategy is a reasonable option for those that do. A nice side benefit with PAM is that other applications may use the new module without modification. The modules should also be straightforward to create for an experienced software developer. Many already exist that may be used or modified for use at a given site. A useful list of many modules that can be adapted to various systems is available [PAM].

4.2.4 mbdb

The fourth method, using the mbdb interface, was introduced to sendmail starting with version 8.12. The mbdb interface avoids the necessity of modifying arbitrary pieces of the sendmail code. Of course, code modifications remain necessary, but now the authentication mechanisms have been abstracted out to make this new code more maintainable, and to allow a site to configure this parameter in the sendmail.cf file.

The sendmail version 8.12 source tree, in the libsm directory, contains a file called mbdb.c. This file holds a set of authentication routines: sm_mbdb_initialize(), sm_mbdb_lookup(), and sm_mbdb_terminate(). Support for two different authentication models is coded into this file: (1) support for authentication via the default getpwent() family of library calls, and (2) a sample Lightweight Directory Access Protocol (LDAP) authentication implementation. A "NULL" authentication method is also defined, but it is presumably intended just for testing. It should not be used in actual practice.

The sm_mbdb_initialize() and sm_mbdb_terminate() routines perform any pre- and post-authentication configuration activity required by a particular authentication system. When using getpwent()-style authentication, sm_mbdb_initialize() does nothing, and sm_mbdb_terminate() calls endpwent(), which merely closes the passwd file. However, LDAP requires some setup before an authentication request can be made. The database needs to be contacted, parameters need to be defined and passed to it, and so on. These actions would be performed in sm_mbdb_initialize(). Also, after processing the authentication request, the connection to the database needs to be closed. These actions would be performed in sm_mbdb_terminate(). The actual authentication request(s) occur within sm_mbdb_lookup().
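The initialize/lookup/terminate split can be pictured as a table of function pointers selected by name. The following is a simplified sketch of that general pattern, not the actual mbdb.c code: struct lookup_type, the "demo" backend, find_lookup_type(), and verify_recipient() are all hypothetical names. It also runs the three phases back to back for a single recipient purely for illustration; sendmail itself need not call them in that pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sysexits.h>

/* Hypothetical, simplified analogue of one mailbox database type. */
struct lookup_type {
    const char *name;
    int  (*initialize)(const char *arg);
    int  (*lookup)(const char *user);
    void (*terminate)(void);
};

static int demo_connected = 0;      /* pretend connection state */

static int
demo_initialize(const char *arg)
{
    (void)arg;                      /* a real backend would parse this */
    demo_connected = 1;
    return EX_OK;
}

static int
demo_lookup(const char *user)
{
    if (!demo_connected)
        return EX_TEMPFAIL;         /* backend not ready */
    return strcmp(user, "npc") == 0 ? EX_OK : EX_NOUSER;
}

static void
demo_terminate(void)
{
    demo_connected = 0;
}

/* Table of known types, terminated by an entry with a NULL name. */
static const struct lookup_type lookup_types[] = {
    { "demo", demo_initialize, demo_lookup, demo_terminate },
    { NULL, NULL, NULL, NULL }
};

/* Find a lookup type by its configured name, or return NULL. */
const struct lookup_type *
find_lookup_type(const char *name)
{
    const struct lookup_type *t;

    assert(name != NULL);
    for (t = lookup_types; t->name != NULL; t++)
        if (strcmp(t->name, name) == 0)
            return t;
    return NULL;
}

/* Run setup, one lookup, and teardown for a single recipient. */
int
verify_recipient(const char *type_name, const char *user)
{
    const struct lookup_type *t = find_lookup_type(type_name);
    int status;

    if (t == NULL)
        return EX_UNAVAILABLE;
    status = t->initialize(NULL);
    if (status != EX_OK)
        return status;
    status = t->lookup(user);
    t->terminate();
    return status;
}
```

Adding another backend in this scheme means writing three functions and one table entry, which is the property that makes the abstraction easier to maintain than scattered source changes.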

Suppose we wanted to add a new user lookup interface. For example, assume we wanted to continue with delivery if a 7th Edition mailbox already exists for that user in the /var/mail directory and refuse the message if it doesn't. For the sake of argument, let's call this authentication mechanism "me," which stands for "mailbox exists." Here are the steps needed to activate this mechanism in the sendmail 8.12 source for the FreeBSD operating system version 4.5. Someone attempting to thoroughly understand this example may want to have a copy of the mbdb.c file available while reading through this code.

As we plan to use the stat() system call to determine whether the mailbox exists, we include the following header file at the top of libsm/mbdb.c:

 #include <sys/stat.h>

Next, also at the top of the mbdb.c file, we need to define the new authentication functions. We take our cue from the LDAP example already there, and add the following lines:

 static int  mbdb_me_initialize __P((char *));
 static int  mbdb_me_lookup __P((char *name, SM_MBDB_T *user));
 static void mbdb_me_terminate __P((void));

Just below the LDAP function definitions, the various authentication methods are defined. The ones already present are "pw," "ldap," and "NULL." Between the #endif terminating the "ldap" method and the "NULL" entry, we'll add our new one with the following line:

 { "me", mbdb_me_initialize, mbdb_me_lookup, mbdb_me_terminate },

In our case, mbdb_me_initialize and mbdb_me_terminate will do nothing, so we can blatantly steal the code from the "pw" functions, modify the "terminate" routine slightly, and add them to the file:

 static int
 mbdb_me_initialize(char *arg)
 {
         return EX_OK;
 }
 static void
 mbdb_me_terminate()
 {
         return;
 }

Next, we write a very simple module for the "lookup" function that checks whether /var/mail/username exists. It takes two arguments: the user name and a pointer to a structure in which we're supposed to fill in information such as the user's full name. We can't know this information for certain without looking it up with getpwent(), which would defeat the purpose of this module. However, we can obtain the proper UID and GID of the file and assume that they are set correctly by the system, so we'll partially fill in the structure:

 static int
 mbdb_me_lookup(char *name, SM_MBDB_T *user)
 {
         char path[MAXPATHLEN]; /* MAXPATHLEN defined in sm/conf.h */
         struct stat sb;        /* Status structure                */
         errno = 0;
         /* Form the path to the file to be stat()ed     */
         /* _PATH_MAILDIR defined in sm/conf.h           */
         if (sm_snprintf(path, sizeof(path), "%s/%s",
                         _PATH_MAILDIR, name) >= sizeof(path))
         {
                 return ENAMETOOLONG;
         }
         if (stat(path, &sb) < 0)
         {
                 /* Can't stat mailbox file.                 */
                 switch (errno)
                 {
                   /* It doesn't exist.                      */
                   case ENOENT:
                         return EX_NOUSER;
                   /* We can't stat the file.                */
                   default:
                         return EX_TEMPFAIL;
                 }
         }
         else
         {
                 /* Mailbox exists!                     */
                 user->mbdb_uid = sb.st_uid;
                 user->mbdb_gid = sb.st_gid;
                 (void) sm_strlcpy(user->mbdb_name, name,
                                   sizeof(user->mbdb_name));
                 /* But home directory, etc., don't.    */
                 user->mbdb_homedir[0] = '\0';
                 user->mbdb_fullname[0] = '\0';
                 user->mbdb_shell[0] = '\0';
                 /*
                  * We can do other things here, such as define a
                  * quota and check sb.st_size against it if we
                  * want.
                  */
                 return EX_OK;
         }
 }

Finally, we need to tell sendmail to use the new authentication procedure. It's also a good idea to turn off .forward file lookups, as no home directories exist.

We add the following lines to our .mc file:

 define(`confFORWARD_PATH', `')
 define(`confMAILBOX_DATABASE', `me')

Recompile sendmail to include these source code changes, reinstall the new sendmail.cf file, and this mechanism should work.

Much more should be done in this code before it goes "live," such as checking whether the mailbox file is a plain file and not a symbolic link or a directory, and making sure that the LDA is aware of this authentication mechanism (the sendmail 8.12 mail.local uses mbdb). Nevertheless, this suffices for a quick-and-dirty example. In this case, creating an email mailbox for holding mail for the user "npc" would be as simple as running "touch /var/mail/npc" and properly setting ownership and permission.

I'm not advocating the use of this mechanism as a real email authentication method, as it makes no provision for a user to retrieve email. The purpose of this exercise is simply to provide a trivial example of how to use mbdb; in that sense, it meets its goals.

All of these methods of modifying how sendmail performs its authentication involve considerable risk. If any of these mechanisms is adopted, exhaustive testing is an absolute requirement before the system goes live. None of these strategies (except, perhaps, removing the "w" from the local mailer flags) should be considered as a quick hack by any means. Regardless of the extent to which a new authentication system has been tested, once the new system goes live, it must be monitored vigilantly, and a rapid backout strategy should be prepared just in case.

In examining the mail.local code (or the code for /bin/mail or other LDAs), it becomes apparent that, like sendmail, the LDA validates a local user before continuing with email delivery. On a dedicated email server with only administrative user accounts, if we set "F=w" in the local mailer flags, then this step really shouldn't be necessary, as the MTA has performed the same check. If sendmail determines that the user is valid, then the LDA really shouldn't have to do so, too. On servers authenticating against small passwd files or larger files stored in a hashed format on local disk, the cost of the LDA performing the extra lookup will be small, so this sort of optimization probably isn't necessary. If the authentication database is large or is stored on another system, however, this extra delay in delivering messages might not be completely benign. If the LDA can figure out from the user name where the message should be delivered, then it would be safe to assume that the account is supposed to receive email. In this case, the second authentication check becomes unnecessary. If all recipient mailboxes have the same UID, the LDA simply would be run as this user. Otherwise, the LDA would be run as root and change its EUID to that of the owner of the file, assuming that ID is set correctly on the system.

4.2.5 Mailbox Quotas

On systems where disk space is scarce or where mailboxes may grow to very large sizes, it's often advantageous to set a quota limit on mailbox sizes. The most common way to do so is to use the operating system's quota mechanism on the mail spool disk system. The quota mechanism available will likely differ from system to system, but generally running man quota will provide information on where to start. On large systems where the users don't have direct access to the server and must read their email via POP or IMAP access, all email may be stored under the UID of a single user responsible for the entire message store. In this case, an operating system's quota mechanisms won't be useful as they are universally based on tracking data storage via UID. Instead, it's often easiest to build quota-checking functionality into the LDA. The MTA could also perform quota checking using the mbdb interface discussed earlier, but this tactic requires more work. If this functionality is added to the LDA, then it becomes the LDA's responsibility to determine whether an email account is over quota, and, if it is, to refuse to deliver more email until this situation is rectified. This problem breaks into two parts: (1) determining what quota is associated with which mailbox, and (2) determining whether a user is over quota.

The easiest solution to the first problem is to force a universal mailbox quota limit and to code this value into the LDA. This option might be a reasonable response at many sites. The other possibility is to do a table lookup, falling back to a default value if the user name doesn't appear in the table. These tables can be constructed simply, for example, in two columns with the user name on the left and the quota (in megabytes, for example) on the right, such as the following:

 npc     100
 jim     10
 scott   10
 philip  20
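A table lookup with a default value, as described above, might be sketched like this. The function name lookup_quota_mb() is hypothetical, and the sketch assumes whitespace-separated columns, user names of at most 63 characters, and lines shorter than 256 characters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan a two-column "user quota" table from the beginning and return
 * the named user's quota in megabytes, or default_mb if the user does
 * not appear in the table. Malformed lines are skipped.
 */
long
lookup_quota_mb(FILE *table, const char *user, long default_mb)
{
    char line[256];
    char name[64];
    long quota;

    assert(table != NULL && user != NULL);
    rewind(table);
    while (fgets(line, sizeof(line), table) != NULL) {
        if (sscanf(line, "%63s %ld", name, &quota) == 2 &&
            strcmp(name, user) == 0)
            return quota;
    }
    return default_mb;
}
```

A linear scan like this is adequate for a small flat file; its cost grows with the number of entries on every delivery.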

If this file is large, it can be stored as a database. The most straightforward way to do so is to use the same sorts of Berkeley Database files that sendmail uses for its maps, such as the virtusertable or the access table. Berkeley DB is an Open Source product from Sleepycat Software [OBS99], and some version of it typically comes with most UNIX-like operating systems. These database files can be constructed from "flat" text files using the makemap utility that comes with the sendmail distribution. For example, on a system whose DB library supports the btree format, we'd run the following command on a flat file called emailquota:

 makemap btree emailquota < emailquota

On a FreeBSD system, or any other operating system with a recent version of Berkeley DB, this command will create a file called emailquota.db in the current directory that stores the same information as the text file from which it was generated, except in a btree format readable by the Sleepycat libraries. It's easy to write programs that perform simple queries against these database files. Besides the documentation at the Sleepycat Web site [SLE], a good place to start to understand how this interface works is to run "man db".

The Sleepycat DB package has advanced considerably in functionality, but few operating systems have incorporated the latest versions into their releases. Sleepycat DB version 3 has been in wide use for several years. It's rock-solid and provides considerable improvements over earlier versions. Sleepycat DB version 4 recently came out, but is still relatively new. Many operating systems ship with Sleepycat version 2 or even later releases of version 1. It is worthwhile to consider upgrading a system that uses Sleepycat DBs extensively.

At those sites where the email system performs user authentication requests against remote data repositories, such as LDAP, that repository would be a natural place to store quota information. As an LDA will check whether an account is valid before completing a message delivery, having it request quota information from this database is a logical way to accomplish both tasks at once.

The second aspect of quota enforcement is to verify that a mailbox isn't over quota. For a one-file-per-mailbox storage method, such as the 7th Edition mailbox format, the quota conformance can be checked by the stat() system call to verify the file's size, after a lock has been established on the mailbox but before delivery begins. If the mailbox uses a one-file-per-message format, this task becomes trickier. In this case, each file in the directory (or directory tree) must be found and a stat() performed on each file. These sizes are then summed to see whether the quota has been exceeded. This task involves much more work than in the single-file mailbox case, but it's basically the same amount of work that's performed if one were to run "ls -lR" in the mailbox directory, which will also stat() every file in a directory hierarchy. To expedite this process, some commercial email servers that provide quota support keep a separate database containing the sizes of the email messages in a given mailbox, updating this record if a new message is delivered or an old one deleted. It's more work to keep this extra data repository in sync with the actual mailbox, and this database must be made available to both the LDA and the POP and/or IMAP daemons. Overall, it is probably not necessary to keep such a database, but it would be straightforward to implement one if necessary.
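For the one-file-per-message case, the walk-and-sum step might look like the following sketch. The names mailbox_usage_bytes() and mailbox_over_quota() are hypothetical, the 4096-byte path buffer is an assumption, and the code presumes the caller already holds whatever lock the message store requires, so that files don't vanish mid-scan.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Sum the sizes of all regular files under dir, descending into
 * subdirectories. Symbolic links are not followed. Returns the total
 * in bytes, or -1 on any error.
 */
long long
mailbox_usage_bytes(const char *dir)
{
    char path[4096];
    struct dirent *ent;
    struct stat sb;
    long long total = 0;
    long long sub;
    DIR *dp;

    assert(dir != NULL);
    dp = opendir(dir);
    if (dp == NULL)
        return -1;
    while ((ent = readdir(dp)) != NULL) {
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;
        if (snprintf(path, sizeof(path), "%s/%s", dir, ent->d_name)
                >= (int)sizeof(path) || lstat(path, &sb) < 0) {
            closedir(dp);
            return -1;
        }
        if (S_ISREG(sb.st_mode)) {
            total += (long long)sb.st_size;
        } else if (S_ISDIR(sb.st_mode)) {
            sub = mailbox_usage_bytes(path);
            if (sub < 0) {
                closedir(dp);
                return -1;
            }
            total += sub;
        }
    }
    closedir(dp);
    return total;
}

/* Return 1 if usage exceeds quota_bytes, 0 if not, -1 on error. */
int
mailbox_over_quota(const char *dir, long long quota_bytes)
{
    long long used = mailbox_usage_bytes(dir);

    if (used < 0)
        return -1;
    return used > quota_bytes ? 1 : 0;
}
```

For a single-file mailbox the same comparison needs only one stat() call on the mailbox file, which is why that case is so much cheaper.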

One final comment on the LDA and quotas: One of the big questions that many people ask concerning email quota implementation is what should be done with the next message if a user's mailbox is over quota. Specifically, should the delivery be rejected with a temporary or permanent error status? If the error reported back to the MTA indicates a permanent error, the message will be bounced as undeliverable. If it's a temporary error, the message will be queued for redelivery. At first glance, the temporary error might seem to achieve the best result, but this may not be the case, especially if the system struggles with performance issues. A mailbox that is over quota takes up too much space because of either somebody's carelessness (not cleaning out email or forgetting to deactivate unused accounts) or somebody's malice (sending larger email messages than are appropriate, such as mail bombing). In either case, if we don't want to store that email message in the message store, then we certainly don't want it (and its cousins) clogging the mail queue, especially if the server is the target of a mail bomb. The only way to make sure an overfull mailbox doesn't turn into an overfull mail queue is to bounce the message by indicating a permanent delivery failure. Of course, if the server in question is lightly loaded or if a user's quota might be exceeded due to files not related to email, returning a temporary failure may indeed be more appropriate. This book, however, focuses on systems with heavy use, where resources are in short supply. A policy of bouncing may seem unkind or even draconian, but it's almost certainly the lesser of two evils under these circumstances.

Unfortunately, the mail.local LDA that ships with the sendmail source distribution has historically considered mail quota violations to be a temporary failure condition. In older source distributions, one can easily patch mail.local to change this behavior by removing the following lines from mail.local.c:

 #ifdef EDQUOT
         case EDQUOT:        /* Disc quota exceeded */
 #endif

In the source code, this case is part of a list of error conditions that will result in the LDA returning a temporary failure. After removing it from that list, a filesystem error condition of EDQUOT will implicitly result in mail.local returning a permanent error.
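The structure being described, a list of error conditions that map to a temporary failure with everything else falling through to a permanent one, can be sketched as below. This is an illustration only and is not copied from mail.local.c: the function name delivery_exit_status(), the bounce_on_quota parameter, the particular set of "temporary" errno values, and the choice of EX_UNAVAILABLE for the permanent case are all assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <sysexits.h>

/*
 * Map the errno from a failed delivery attempt to an LDA exit status.
 * When bounce_on_quota is nonzero, a quota violation is reported as a
 * permanent failure so the MTA bounces the message rather than
 * queueing it for another try.
 */
int
delivery_exit_status(int err, int bounce_on_quota)
{
    assert(err >= 0);
    switch (err) {
    case 0:
        return EX_OK;
#ifdef EDQUOT
    case EDQUOT:                /* Disc quota exceeded */
        return bounce_on_quota ? EX_UNAVAILABLE : EX_TEMPFAIL;
#endif
    case EAGAIN:                /* conditions likely to clear on retry */
    case EBUSY:
    case ENOMEM:
    case ENOSPC:
        return EX_TEMPFAIL;
    default:                    /* anything else is treated as permanent */
        return EX_UNAVAILABLE;
    }
}
```

Seen this way, taking EDQUOT out of the temporary list is equivalent to always passing a nonzero bounce_on_quota here: the condition simply falls through to the permanent default.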

Starting with sendmail version 8.10, this behavior can be mandated more easily by having sendmail invoke mail.local with the -b flag. For a system such as FreeBSD, simply add the following M4 code to the appropriate .mc file:

 define(`LOCAL_MAILER_ARGS', `mail.local -d -b $u')

Under the sendmail source distribution, look at the files in cf/ostype to find out which flags are appropriate for various operating systems. The flags can also be changed here, but it is probably more appropriate to make these changes in a domain-specific file.

To accomplish this task, go into the cf/domain directory under the top-level sendmail source distribution directory. In this directory, create an appropriate domain file, for example, example.com.m4. Place the change given earlier in this file, and include it in the configuration by replacing the line

 DOMAIN(generic)

with

 DOMAIN(example.com)

4.2.6 Other LDAs

Some sites use delivery agents other than the traditional mail.local or /bin/mail. One common LDA replacement is procmail [PRO], which is used by default in many Linux distributions. It provides a great deal of functionality, especially in the realm of email filtering, that other LDAs don't possess. Despite its large feature set, procmail performs remarkably well as an LDA.

We can configure sendmail to use procmail as the LDA by adding the following line to a .mc file:

 FEATURE(`local_procmail')

Returning to the CPU-bound email server introduced in Chapter 1, we can rerun our standard test case (sendmail 8.12, background delivery mode, delivery over IPC, sending many messages, each about 1KB in size, to one randomly selected recipient out of 50) using mail.local and procmail as delivery agents. Recall that using mail.local we achieved a delivery rate of about 242 messages/minute. Under the same conditions using procmail version 3.22, about 247 messages/minute are achieved. This rate represents a 2% increase in throughput, although this number is probably within the margin of error for this particular measurement. Nonetheless, it is a respectable showing for procmail.

Recent versions of procmail can be compiled to deliver via LMTP as well. To do so, uncomment the #define LMTP declaration in the conf.h file that comes with the source code bundle. However, procmail's LMTP implementation suffers in the performance department. Using the same test methodology as in the previous experiment, we achieved a delivery rate of 236 messages/minute using mail.local and LMTP. With procmail using LMTP, the delivery rate is highly variable, but on average it drops to 183 messages/minute, a decline of more than 22%. To configure procmail and LMTP together, use the following code:

 FEATURE(`local_lmtp', `/usr/local/bin/procmail')
 define(`LOCAL_MAILER_ARGS', `procmail -Y -a $h -z')

I wouldn't recommend taking this step on a performance-sensitive email server without first improving procmail's LMTP delivery performance.

Another LDA worth mentioning is the deliver LDA used by the Cyrus IMAP [CYR] system. The Cyrus mail server software is discussed in more detail later in this chapter. On a system running Cyrus IMAP, one must use the LDA provided with this package, deliver, as the LDA because Cyrus's message store has a unique layout. Cyrus IMAP is a good example of a system with a one-file-per-message message store in which all mailboxes are owned by the same UID. It's unlikely that someone would want to use deliver in a non-Cyrus context.