5.2 Spam-Checking During SMTP

If you want to refuse spam before it reaches your recipients, or set up a spam-checking gateway in front of an internal email server, you need a way to perform spam-checking during the SMTP transaction. If a message is found to be spam, you may want to refuse it and end the SMTP session, or accept it and add headers that users can use in their mail client filters. sendmail provides a general-purpose filtering interface, called milter, for use during the SMTP transaction.

5.2.1 The Milter Interface

In sendmail's parlance, milter refers to several things. Milter is an application programming interface (API) for writing filters for sendmail, and a protocol for communication between sendmail and a filter. A milter is also a filter program written using this API that listens for connections from a sendmail process and defines functions to call at different points of the SMTP transaction to accept, reject, discard, temporarily refuse, or modify a message. The milter library, libmilter, provides most of the code required to set up a milter and manage the work of calling your filtering functions during an SMTP transaction.

A milter can provide functions that sendmail will call at the following points in an SMTP transaction:

When a mail client connects to sendmail
After the SMTP HELO or EHLO commands
After the SMTP MAIL FROM command
After the SMTP RCPT TO command
After each message header is transmitted during the DATA step
After all message headers are transmitted
After each piece of the message body is transmitted
At the end of the DATA step, after the entire message has been transmitted
When the SMTP transaction is aborted
When the client connection is closed

Milter functions can perform the following operations on a message:

Add, change, or delete a header
Add or remove a recipient
Replace the message body
Reject a connection, message, or recipient
Temporarily fail a connection, message, or recipient
Accept and discard a message
Accept a message

Milters operate as daemons. They are typically started before sendmail during system startup and listen for connections from a sendmail process on a TCP or Unix domain socket. Milters do not have to be run as root. For more information about writing milters, visit http://www.milter.org.

You configure sendmail to use a milter by adding an INPUT_MAIL_FILTER( ) macro to the sendmail.mc configuration file and generating a new sendmail.cf file. Example 5-2 shows parts of a sendmail.mc file that includes a milter.

Example 5-2. A sendmail.mc file with a milter

 divert(0)dnl
 VERSIONID(`example mc')dnl
 OSTYPE(linux)dnl
 DOMAIN(generic)dnl
 ...
 INPUT_MAIL_FILTER(`mymilter', `S=unix:/var/run/mymilter.sock, F=T, T=S:60s;R:60s;E:5m')dnl
 ...
 MAILER(smtp)dnl
 MAILER(local)dnl
 MAILER(procmail)dnl

The INPUT_MAIL_FILTER macro takes two arguments. The first provides the name of the milter (mymilter in Example 5-2), and the second tells sendmail how to interact with the milter. The second argument in turn consists of several instructions, separated by commas:

S= socket description: This argument describes how sendmail should connect to the milter. The socket description consists of a protocol (unix for a Unix domain socket, inet for a TCP/IP socket, inet6 for a TCP/IPv6 socket), a colon, and a protocol-specific address. For Unix domain sockets, the address is the path to the socket file. For TCP sockets, the address is in the form port@host.
F= failure mode: This argument determines how sendmail will behave if it fails to connect to the milter process. Use F=T to cause sendmail to temporarily refuse email when it can't contact the milter. Use F=R to cause sendmail to reject connections when it can't contact the milter. Omit the F= argument to cause sendmail to accept messages without filtering when it can't contact the milter.
T= timeout list: This argument determines how long sendmail should wait for the milter at each stage before treating the milter as having failed. It consists of a set of states and the amount of time to allow for each, separated by semicolons. In Example 5-2, sendmail uses a 60-second timeout for sending data to the milter (S:60s), a 60-second timeout for reading replies from the milter (R:60s), and a 5-minute timeout for waiting for the milter's final acknowledgment after sending the message (E:5m). There is also a C timeout for connecting to the milter. If you leave any timeouts unspecified, sendmail uses its default timeouts: 10 seconds for sending and reading, and 5 minutes for connecting and final acknowledgment.
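
As a minimal sketch of how the parts of the second argument fit together, the shell function below assembles an INPUT_MAIL_FILTER line from a milter name, a socket description, a failure mode, and a timeout list. The function name milter_mc_line is hypothetical and not part of sendmail; it only prints text that you could paste into sendmail.mc. The TCP port and host in the second call are made-up values.

```shell
# Hypothetical helper (not part of sendmail): print an INPUT_MAIL_FILTER
# line for sendmail.mc. Arguments: name, socket, failure mode, timeouts.
# An empty failure mode or timeout list is simply left out of the line.
milter_mc_line() {
    mc_args="S=$2"
    if [ -n "$3" ]; then mc_args="$mc_args, F=$3"; fi
    if [ -n "$4" ]; then mc_args="$mc_args, T=$4"; fi
    printf "INPUT_MAIL_FILTER(\`%s', \`%s')dnl\n" "$1" "$mc_args"
}

# Reproduces the macro line from Example 5-2.
milter_mc_line mymilter unix:/var/run/mymilter.sock T 'S:60s;R:60s;E:5m'

# A TCP milter with no F= or T= settings (port and host are made up).
milter_mc_line tcpmilter inet:8890@localhost '' ''
```

Leaving out F= in the second call means sendmail would accept mail unfiltered if that milter were unreachable, which is rarely what you want for a spam filter.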

The INPUT_MAIL_FILTER macro results in the following lines being added to the sendmail.cf file when you generate it:

 O InputMailFilters=mymilter
 ...
 Xmymilter, S=unix:/var/run/mymilter.sock, F=T, T=S:60s;R:60s;E:5m

Milter in sendmail 8.11

The milter interface was formally announced in sendmail 8.12 but is available as an experimental feature in sendmail 8.11. To use milter in sendmail 8.11, add the following line to your sendmail.mc file:

 define(`_FFR_MILTER')dnl

Milter support in sendmail 8.11 is not as complete as in sendmail 8.12, however, and I strongly encourage you to upgrade to sendmail 8.12 or later rather than use sendmail 8.11's milter subsystem.

Older versions of sendmail do not provide milter. If you must use one of these versions, you are limited to integrating SpamAssassin through procmail.

SpamAssassin itself is not a milter. However, several milters have been written that invoke SpamAssassin on messages and then take action during the SMTP transaction.

5.2.2 MIMEDefang

MIMEDefang is one of the most popular sendmail milters. It provides a general framework for performing milter functions in Perl and comes with a default configuration that performs several functions:

Messages can be checked with a virus scanner, and messages carrying viruses can be refused, discarded, or quarantined.
MIME attachments can be examined, and messages can be refused, discarded, or quarantined if they contain attached files with given filename extensions (e.g., extensions that denote executable Windows files).
The HTML attachment in a message of type multipart/alternative (containing both text and HTML versions of the same message) can be dropped.
SpamAssassin can be invoked on the message, and spam can be refused, discarded, quarantined, or tagged.

MIMEDefang is developed by Roaring Penguin Software and is available as free software at http://www.mimedefang.org. Roaring Penguin also produces commercial products, CanIt and CanIt-PRO, which are based on MIMEDefang and SpamAssassin and add several other features including web-based interfaces for administrators and users.

The rest of this section details the installation, operation, and customization of MIMEDefang 2.42 as an example of a full-scale, milter-based approach to using SpamAssassin. MIMEDefang's other functions, such as virus-checking, are mentioned but not covered in detail; read the MIMEDefang documentation for more information.

Use the latest available version of MIMEDefang. In particular, only versions 2.42 and later support SpamAssassin 3.0.

5.2.2.1 Installing MIMEDefang

MIMEDefang is written in Perl and invokes SpamAssassin through the Mail::SpamAssassin Perl modules. Because MIMEDefang itself is a daemon, you do not need to run spamd. It's easiest to install SpamAssassin (and your antivirus software) first and then install MIMEDefang.

A good way to begin a MIMEDefang installation is to verify that you have the prerequisites on hand. MIMEDefang requires sendmail 8.12 (or later). It also requires several Perl modules, including MIME::Tools, IO::Stringy, MIME::Base64, MailTools, Digest::SHA1, and HTML::Parser. Most of them can be installed using CPAN.

MIMEDefang will not work correctly with the standard version of MIME::Tools 5.411a. Either install MIME::Tools 6 or later, or install the special version of MIME::Tools 5.411a available from Roaring Penguin's web site.
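
One way to check the Perl prerequisites before you start is to ask perl to load each module. The sketch below is only an illustration: missing_perl_modules is a hypothetical helper name, and Mail::Internet is used in the usage line as a stand-in for the MailTools distribution, which is an assumption about how you might want to test for it.

```shell
# Print the name of every Perl module given as an argument that the
# perl on your PATH cannot load. Prints nothing if all are available.
missing_perl_modules() {
    for mod in "$@"; do
        if ! perl -M"$mod" -e 1 >/dev/null 2>&1; then
            printf '%s\n' "$mod"
        fi
    done
}

# Usage (not run here):
#   missing_perl_modules MIME::Tools IO::Stringy MIME::Base64 \
#       Mail::Internet Digest::SHA1 HTML::Parser
```

Loading a module successfully does not tell you its version, so you still need to check the MIME::Tools version separately.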

You should create a new user account and group for running MIMEDefang; the usual name for both the user and group is defang. This user will own MIMEDefang's files, and the user (or group) must have access to SpamAssassin's configuration and database files as well.

MIMEDefang uses two important directories. It uses /var/spool/MIMEDefang as a working directory for unpacking email messages and scanning them. For optimal performance, place this directory on a fast disk, or even a RAM disk if your operating system supports it and you have enough memory to spare. MIMEDefang stores quarantined email messages in /var/spool/MD-Quarantine. Speed is not so critical with this directory, and it should never be located on a RAM disk because you will want to be sure that you can access quarantined files. Create these directories before you install MIMEDefang. The directories should be owned by user and group defang and should not be world-readable or world-searchable.
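
The directory setup described above can be sketched as a small shell function. The names make_md_dirs, SPOOL_ROOT, and MD_OWNER are hypothetical; the two override variables exist only so that you can try the function somewhere other than /var/spool. It assumes the owning user and group share one name, as with defang.

```shell
# Sketch: create MIMEDefang's working and quarantine directories with
# mode 700. SPOOL_ROOT and MD_OWNER are hypothetical overrides.
make_md_dirs() {
    spool_root=${SPOOL_ROOT:-/var/spool}
    md_owner=${MD_OWNER:-defang}
    for dir in "$spool_root/MIMEDefang" "$spool_root/MD-Quarantine"; do
        mkdir -p "$dir" || return 1
        chmod 700 "$dir" || return 1
        # Only root can hand the directory to another account, and the
        # account has to exist first.
        if [ "$(id -u)" -eq 0 ] && id "$md_owner" >/dev/null 2>&1; then
            chown "$md_owner:$md_owner" "$dir" || return 1
        else
            echo "note: ownership of $dir left unchanged" >&2
        fi
    done
}
```

Run as root with no overrides, make_md_dirs would create both directories under /var/spool and give them to defang.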

Next, download the MIMEDefang source code from http://www.roaringpenguin.com, unpack it, run the configure script, run make, and perform a make install as root. Example 5-3 shows this process from the point of running the configure script:

Example 5-3. Compiling MIMEDefang

 $ ./configure
 creating cache ./config.cache
 ...
 creating config.h
 *** Virus scanner detection results:
 H+BEDV   'antivir'   NO (not found)
 Vexira   'vexira'    NO (not found)
 NAI      'uvscan'    NO (not found)
 BDC      'bdc'       NO (not found)
 Sophos   'sweep'     NO (not found)
 TREND    'vscan'     NO (not found)
 CLAMSCAN 'clamav'    YES - /usr/bin/clamscan
 AVP      'AvpLinux'  NO (not found)
 FSAV     'fsav'      NO (not found)
 FPROT    'f-prot'    NO (not found)
 SOPHIE   'sophie'    NO (not found)
 NVCC     'nvcc'      NO (not found)
 CLAMD    'clamd'     YES - /usr/sbin/clamd
 File::Scan           NO (not found)
 TROPHIE  'trophie'   NO (not found)
 Found Mail::SpamAssassin.  You may use spam_assassin_* functions
 Did not find Anomy::HTMLCleaner.  Do not use anomy_clean_html( )
 Found HTML::Parser.  You may use append_html_boilerplate( )
 Note: SpamAssassin, File::Scan, HTML::Parser and Anomy::HTMLCleaner are
 detected at run-time, so if you install or remove any of those modules, you
 do not need to re-run ./configure and make a new mimedefang.pl.
 $ make
 gcc -g -O2 -Wall -Wstrict-prototypes -pthread -D_POSIX_PTHREAD_SEMANTICS -DPERL_PATH=\"/usr/local/bin/perl\" -DMIMEDEFANG_PL=\"/usr/local/bin/mimedefang.pl\" -DRM=\"/bin/rm\" -DVERSION=\"2.42\" -DSPOOLDIR=\"/var/spool/MIMEDefang\" -DQDIR=\"/var/spool/MD-Quarantine\" -DCONFDIR=\"/etc/mail\" -I../sendmail-8.12.11/include -c -o mimedefang.o mimedefang.c
 ...
 $ su
 Password: XXXXXX
 # make install
 mkdir -p /etc/mail && chmod 755 /etc/mail
 ...
 Please create the spool directory, '/var/spool/MIMEDefang',
 if it does not exist.  Give it mode 700, and make it owned
 by the user you intend to run MIMEDefang as.
 Please do the same with the quarantine directory, '/var/spool/MD-Quarantine'.
 #

The following programs and files are installed:

mimedefang: The milter itself. This program receives requests from sendmail to filter messages and passes them on to mimedefang-multiplexor, which performs the checks. It then communicates the results back to sendmail.
mimedefang-multiplexor: A program to receive requests from mimedefang and farm them out to a pool of mimedefang.pl Perl processes for scanning. It is responsible for maintaining the process pool, creating and destroying processes as necessary. This approach minimizes the time and CPU overhead required in starting new processes for each scan.
mimedefang.pl: A Perl script to perform all of the message-checking functions of MIMEDefang. During the several stages of checking a message, this script calls functions defined in /etc/mail/mimedefang-filter.
md-mx-ctrl: A command-line tool for viewing the status of the multiplexor or for ordering it to reload its slave processes.
watch-mimedefang: A graphical interface, based on Tcl/Tk, for monitoring the multiplexor.
/etc/mail/spamassassin/sa-mimedefang.cf: A sitewide SpamAssassin configuration file used when MIMEDefang invokes SpamAssassin. By default, MIMEDefang's install process generates a simple file, with few options.
/etc/mail/mimedefang-filter: A file containing Perl subroutines called by mimedefang.pl at different stages of message-processing. These subroutines check messages or message parts, and direct MIMEDefang to accept, quarantine, discard, or bounce a message. MIMEDefang installs a default mimedefang-filter that invokes SpamAssassin to add an X-Spam-Score header and a SpamAssassin report to all messages. To implement more complex spam-checking behavior, you'll edit mimedefang-filter. This file is discussed in greater detail in Section 5.3.3, later in this chapter.

5.2.2.2 Starting the MIMEDefang multiplexor

To run MIMEDefang, you must start two processes: the multiplexor (mimedefang-multiplexor) and the milter (mimedefang). You should start the multiplexor first because the milter process will connect to it. Start each process as root; each changes its uid to the defang user after startup.

mimedefang-multiplexor has over a dozen command-line options, but you will typically need to use only a few of them. The most common are described here; for complete information, see the manpage.

-U user: Instructs mimedefang-multiplexor to run as the given user (e.g., defang). Running as a non-root user is an important security measure.
-s /path/to/socket: Specifies the path to the Unix domain socket that the multiplexor will use to listen for requests from the milter process. It defaults to /var/spool/MIMEDefang/mimedefang-multiplexor.sock.
-p filename: Causes the multiplexor to write its process ID to the specified file. You can use this ID to signal the multiplexor to reread the filter when you change it or to stop the multiplexor (these operations are discussed later in this section).
-m number-of-slaves: Specifies the minimum number of slave mimedefang.pl processes that should be running at any given time. This value defaults to 0, but on most systems, you want to have at least two slave processes running at all times to minimize startup overhead.
-x number-of-slaves: Specifies the maximum number of slave mimedefang.pl processes that should be running at any given time. This value defaults to 2, but busy mail servers will require more than two processes to be available at any given time. You should plan to increase this value to 5, 10, or higher, depending on your needs.
-q number-of-requests: Causes the multiplexor to queue up to the given number of incoming requests when no slave process is immediately available to service them. Without this option, the multiplexor causes sendmail to temporarily fail a message when all slave processes are busy (returning a 4xx SMTP status code to the sending MTA, which should retain the message in its queue and try to deliver it again later).
-D: Causes the multiplexor to run in the foreground, for debugging purposes. Without this option, the multiplexor detaches from the terminal and runs in the background.

A typical invocation of mimedefang-multiplexor might be:

 /usr/local/bin/mimedefang-multiplexor -U defang \
     -p /var/run/mimedefang-multiplexor.pid -m 2 -x 10

5.2.2.3 Checking multiplexor status

Once the multiplexor is running, use the md-mx-ctrl command to examine its status. md-mx-ctrl status provides a human-readable status report on the multiplexor's slave processes; md-mx-ctrl msgs shows the total number of messages processed by the multiplexor. If you're using a nondefault socket for the multiplexor, you can specify that socket to md-mx-ctrl using the -s /path/to/socket command-line option. Example 5-4 shows these md-mx-ctrl invocations and their output. On the system in the example, the multiplexor has been configured with a minimum of two slaves (both of which are idle) and a maximum of ten, and has processed 17,366 messages.

Example 5-4. Invoking md-mx-ctrl

 # md-mx-ctrl status
 Max slaves: 10
 Slave 0: stopped
 Slave 1: stopped
 Slave 2: idle
 Slave 3: stopped
 Slave 4: stopped
 Slave 5: stopped
 Slave 6: idle
 Slave 7: stopped
 Slave 8: stopped
 Slave 9: stopped
 # md-mx-ctrl msgs
 17366
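
If you want to watch slave usage from a script, the status report is easy to summarize. The function below is a sketch that assumes the "Slave N: state" line format shown in Example 5-4; count_slaves is a hypothetical name, not a MIMEDefang tool.

```shell
# Count slaves in a given state. Reads `md-mx-ctrl status` output on
# stdin, assuming the "Slave N: state" line format of Example 5-4.
count_slaves() {
    awk -v state="$1" '$1 == "Slave" && $3 == state { n++ } END { print n + 0 }'
}

# Usage (not run here):
#   md-mx-ctrl status | count_slaves idle
```

A count of idle slaves that is often zero at busy times suggests raising the -x limit.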

5.2.2.4 Starting the MIMEDefang milter

mimedefang performs a simpler task than the multiplexor. Its job is to receive filtering requests from sendmail and pass them on to the multiplexor to handle. Accordingly, it has fewer command-line options. Here are the most commonly used options.

-p /path/to/socket: Specifies the path to the Unix domain socket that the milter process will listen on for requests from sendmail. This path must match the path you specify in sendmail's INPUT_MAIL_FILTER( ) macro. This option is required; a typical choice is /var/spool/MIMEDefang/mimedefang.sock.
-m /path/to/multiplexor/socket: Specifies the Unix domain socket on which the multiplexor is listening for requests. mimedefang sends requests to the multiplexor on this socket. This option is required, and the value should match that of the multiplexor's -s option (typically /var/spool/MIMEDefang/mimedefang-multiplexor.sock).
-U user: Instructs mimedefang to run as the given user (e.g., defang). You must provide the same user to mimedefang-multiplexor and mimedefang.
-P filename: Directs mimedefang to write its process ID to the specified file. Note that this option uses a capital P.

A typical invocation of mimedefang might be:

 /usr/local/bin/mimedefang -U defang -P /var/run/mimedefang.pid \
     -p /var/spool/MIMEDefang/mimedefang.sock \
     -m /var/spool/MIMEDefang/mimedefang-multiplexor.sock

A sample boot script for automatically starting and stopping MIMEDefang can be found in the examples directory of MIMEDefang's source code. Editing this script and installing it with your other system boot scripts is an easy way to properly configure MIMEDefang, as it lists all of the multiplexor and milter process options as shell variables. Ideally, the script should run before sendmail's startup script so that the milter socket is in place before sendmail starts. Likewise, you should stop sendmail before you stop MIMEDefang's processes.
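
The startup order can be sketched as follows. This is not the sample script shipped with MIMEDefang; it is a minimal illustration that reuses the options discussed in this section, with pid-file locations following Example 5-5. The names md_start, MD_BIN, MD_SPOOL, MD_USER, and RUN are all hypothetical. RUN is a dry-run hook: leave it empty to execute the commands, or set RUN=echo to print them.

```shell
# Sketch of the start half of a boot script. RUN is a hypothetical hook:
# leave it empty to execute the commands, set RUN=echo to print them.
RUN=${RUN:-}
MD_BIN=/usr/local/bin
MD_SPOOL=/var/spool/MIMEDefang
MD_USER=defang

md_start() {
    # Multiplexor first, because the milter connects to its socket.
    $RUN "$MD_BIN/mimedefang-multiplexor" -U "$MD_USER" \
        -p "$MD_SPOOL/mimedefang-multiplexor.pid" \
        -s "$MD_SPOOL/mimedefang-multiplexor.sock" \
        -m 2 -x 10 || return 1
    # Then the milter, listening on the socket named in INPUT_MAIL_FILTER.
    $RUN "$MD_BIN/mimedefang" -U "$MD_USER" \
        -P "$MD_SPOOL/mimedefang.pid" \
        -p "$MD_SPOOL/mimedefang.sock" \
        -m "$MD_SPOOL/mimedefang-multiplexor.sock"
}
```

Calling md_start with RUN=echo is a quick way to confirm that the milter's -m socket and the multiplexor's -s socket agree before you start anything for real.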

5.2.2.5 Verifying the MIMEDefang processes

You can use the ps command to verify that all your MIMEDefang processes are running. Example 5-5 shows the process listing and the contents of /var/spool/MIMEDefang and /var/spool/MD-Quarantine on a typical system running sendmail and MIMEDefang. MIMEDefang's processes include one mimedefang-multiplexor process, three slave mimedefang.pl processes started by the multiplexor for scanning messages, and four mimedefang milter processes servicing connections from sendmail. All processes are running as user defang. The /var/spool/MIMEDefang directory contains working directories used temporarily by MIMEDefang (names starting with "mdefang"), as well as Unix domain sockets and pid files. The /var/spool/MD-Quarantine directory includes subdirectories holding quarantined messages.

Example 5-5. Processes and layout of a typical MIMEDefang system

 # ps auxw | egrep 'mime'
 defang   27145  0.0  0.0  1312  688 ?        S    Jan15   0:42 /usr/local/bin/mimedefang-multiplexor -p /var/spool/MIMEDefang/mimedefang-multiplexor.pid -m 2 -x 10 -U defang -b 300 -l -T -s /var/spool/MIMEDefang/mimedefang-multiplexor.sock
 defang   27162  0.0  0.1  2552  856 ?        S    Jan15   0:00 /usr/local/bin/mimedefang -P /var/spool/MIMEDefang/mimedefang.pid -U defang -m /var/spool/MIMEDefang/mimedefang-multiplexor.sock -p /var/spool/MIMEDefang/mimedefang.sock
 defang   20548  1.0  2.8 23464 22416 ?       S    12:05   1:43 perl -w /usr/local/bin/mimedefang.pl -server
 defang   25089  0.0  0.1  2552  856 ?        S    13:57   0:00 /usr/local/bin/mimedefang -P /var/spool/MIMEDefang/mimedefang.pid -U defang -m /var/spool/MIMEDefang/mimedefang-multiplexor.sock -p /var/spool/MIMEDefang/mimedefang.sock
 defang   25142  0.0  0.1  2552  856 ?        S    13:59   0:00 /usr/local/bin/mimedefang -P /var/spool/MIMEDefang/mimedefang.pid -U defang -m /var/spool/MIMEDefang/mimedefang-multiplexor.sock -p /var/spool/MIMEDefang/mimedefang.sock
 defang   25589  0.0  0.1  2552  856 ?        S    14:11   0:00 /usr/local/bin/mimedefang -P /var/spool/MIMEDefang/mimedefang.pid -U defang -m /var/spool/MIMEDefang/mimedefang-multiplexor.sock -p /var/spool/MIMEDefang/mimedefang.sock
 defang   26616  0.3  2.6 21588 20572 ?       S    14:35   0:04 perl -w /usr/local/bin/mimedefang.pl -server
 defang   26617  0.2  2.6 21492 20492 ?       S    14:35   0:03 perl -w /usr/local/bin/mimedefang.pl -server
 # ls -l /var/spool/MIMEDefang
 drwx------   3 defang   defang        149 Jan 28 14:47 mdefang-i0SKkoMD027104
 drwx------   3 defang   defang        149 Jan 28 14:48 mdefang-i0SKlwMB027198
 -rw-------   1 defang   defang          6 Jan 15 10:40 mimedefang-multiplexor.pid
 srw-------   1 defang   defang          0 Jan 15 10:40 mimedefang-multiplexor.sock
 -rw-------   1 defang   defang          6 Jan 15 10:40 mimedefang.pid
 srwx------   1 defang   defang          0 Jan 15 10:40 mimedefang.sock
 # ls -l /var/spool/MD-Quarantine
 drwx------   2 defang   defang        212 Dec 27 10:37 qdir-2004-01-28-10.37.35-001
 drwx------   2 defang   defang        212 Dec 27 16:25 qdir-2004-01-28-16.25.03-001

5.2.2.6 Customizing MIMEDefang

Use the mimedefang-filter file to configure the actions that MIMEDefang takes when filtering messages. The file is written in Perl. MIMEDefang distributes and installs a working sample file, typically in /etc/mail, but you will need to modify several settings in the file for your local environment. Example 5-6 shows the configuration settings near the beginning of this file. You should always change $AdminAddress, $AdminName, and $DaemonAddress. Generally, $AddWarningsInline and md_graphdefang_log_enable( ) can be left unchanged, and $MaxMIMEParts should be uncommented to prevent denial-of-service attacks.

Example 5-6. Configuration section of mimedefang-filter

 #***********************************************************************
 # Set administrator's e-mail address here.  The administrator receives
 # quarantine messages and is listed as the contact for site-wide
 # MIMEDefang policy.  A good example would be 'defang-admin@mydomain.com'
 #***********************************************************************
 $AdminAddress = 'postmaster@localhost';
 $AdminName = "MIMEDefang Administrator's Full Name";
 #***********************************************************************
 # Set the e-mail address from which MIMEDefang quarantine warnings and
 # user notifications appear to come.  A good example would be
 # 'mimedefang@mydomain.com'.  Make sure to have an alias for this
 # address if you want replies to it to work.
 #***********************************************************************
 $DaemonAddress = 'mimedefang@localhost';
 #***********************************************************************
 # If you set $AddWarningsInline to 1, then MIMEDefang tries *very* hard
 # to add warnings directly in the message body (text or html) rather
 # than adding a separate "WARNING.TXT" MIME part.  If the message
 # has no text or html part, then a separate MIME part is still used.
 #***********************************************************************
 $AddWarningsInline = 0;
 #***********************************************************************
 # To enable syslogging of virus and spam activity, add the following
 # to the filter:
 # md_graphdefang_log_enable( );
 # You may optionally provide a syslogging facility by passing an
 # argument such as:  md_graphdefang_log_enable('local4');  If you do this, be
 # sure to setup the new syslog facility (probably in /etc/syslog.conf).
 # An optional second argument causes a line of output to be produced
 # for each recipient (if it is 1), or only a single summary line
 # for all recipients (if it is 0.)  The default is 1.
 # Comment this line out to disable logging.
 #***********************************************************************
 md_graphdefang_log_enable('mail', 1);
 #***********************************************************************
 # Uncomment this to block messages with more than 50 parts.  This will
 # *NOT* work unless you're using Roaring Penguin's patched version
 # of MIME tools, version MIME-tools-5.411a-RP-Patched-02 or later.
 #
 # WARNING: DO NOT SET THIS VARIABLE unless you're using at least
 # MIME-tools-5.411a-RP-Patched-02; otherwise, your filter will fail.
 #***********************************************************************
 # $MaxMIMEParts = 50;
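
Because the three settings you should always change ship with placeholder values, a simple search can tell you whether a filter file is still carrying them. The sketch below looks for the exact placeholder strings shown in Example 5-6; find_filter_placeholders is a hypothetical name, and the check is only as good as the assumption that your copy of the sample file uses the same strings.

```shell
# Print each line of a mimedefang-filter file that still carries one of
# the placeholder values shown in Example 5-6. Exit status is 0 if any
# placeholder remains and 1 if none do (grep's usual convention).
find_filter_placeholders() {
    grep -n -F \
        -e "'postmaster@localhost'" \
        -e "'mimedefang@localhost'" \
        -e "MIMEDefang Administrator's Full Name" \
        "$1"
}

# Usage (not run here):
#   find_filter_placeholders /etc/mail/mimedefang-filter
```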

The remainder of the mimedefang-filter file is a set of Perl functions that mimedefang.pl will call when checking a message. You can modify these functions to customize MIMEDefang's behavior. The functions include:

filter_begin( ): Called with no arguments at the start of filtering. Suitable for setting variables that you expect to use throughout the filter, or for performing whole-message checks like virus-scanning immediately.
filter_multipart( entity,name,extension,type ): Called for each MIME part of the message that contains other MIME parts within it. The entity is a MIME::Entity object, name is the suggested filename of the part, extension is the file extension, and type is the MIME type. Suitable for validating MIME parts or refusing specific multipart types (e.g., message/partial).
filter( entity,name,extension,type ): Called for each MIME part of the message that does not contain other MIME parts within it. Arguments are the same as for filter_multipart( ). Suitable for validating filenames, virus-scanning individual MIME parts, or refusing specific MIME types.
filter_end( entity ): Called at the end of filtering with the MIME::Entity object representing the entire message to be returned to sendmail. Suitable for checking variables that you set elsewhere in the filter and performing computationally expensive whole-message checks like spam-tagging if necessary.

These functions can make decisions about the disposition or modification of individual message parts by calling one of the MIMEDefang action functions. In most cases, actions should be taken only by the filter( ) or filter_multipart( ) functions. The most commonly used action functions are:

action_accept( ), action_accept_with_warning( string ): Accept the current message part, possibly adding a warning to the message.
action_drop( ), action_drop_with_warning( string ): Drop the current message part, possibly adding a warning to the message.
action_replace_with_warning( string ): Replace the current message part with a warning message.
action_quarantine( entity,string ): Drop and quarantine the current message part, and add a warning to the message.
action_quarantine_entire_message( string ): Quarantine the entire message, and add a warning to the administrator notification if one is generated. This action only quarantines; it does not also discard or bounce the message. You must call action_discard( ) or action_bounce( ) afterward.
action_bounce( string[,SMTP reply code[,DSN code]] ): Instruct sendmail to reject the message with string returned to the sender as the reason for rejection. You can optionally specify an SMTP reply code (which defaults to 554) and a DSN code (which defaults to 5.7.1). Bouncing a message does not stop MIMEDefang from continuing to process other message parts; the bounce occurs after all parts have been processed.
action_tempfail( string[,SMTP reply code[,DSN code]] ): Instruct sendmail to temporarily reject the message with string returned to the sender as the reason for rejection. You can optionally specify an SMTP reply code (which defaults to 450) and a DSN code (which defaults to 4.7.1).
action_discard( ): Discard the entire message silently once all parts have been processed.
action_notify_sender( string ): Generate an email notification back to the message sender containing the given string, which may consist of multiple lines.
action_notify_administrator( string ): Generate an email notification back to the MIMEDefang administrator containing the given string, which may consist of multiple lines.
action_add_part( entity,type,encoding,data,fname,disposition[,offset] ): Add a new MIME part to the message represented by entity. The new part will have a MIME content-type of type and content-encoding of encoding. The new part itself should be stored in data and its associated filename in fname. The MIME content-disposition is given by disposition. The optional offset specifies where to add the part; it defaults to -1 (add at end). This action may be performed in filter_end( ).
action_add_header( header,value ): Add a new header to the message. The header's name is given in header, without a trailing colon, and the value to set the header to is given in value. It is possible to add multiple headers with the same name.
action_change_header( header,value[,index] ): Change a header in the message. The header's name is given in header, without a trailing colon, and the new value to set the header to is given in value. If index is given, changes the index'th header with that name. Changing a header that does not exist will add a new header.
action_delete_header( header[,index] ): Delete a header in the message. The header's name is given in header, without a trailing colon. If index is given, deletes the index'th header with that name instead of the first one.
action_delete_all_headers( header ): Delete all headers in the message with a given name. The header's name is given in header, without a trailing colon.

If you call one of the notification functions (e.g., action_notify_sender), MIMEDefang creates a notification message and sends it by invoking sendmail in its deferred mode; sendmail will enqueue the notification message in its client mail queue rather than sending it immediately. You must run a sendmail process that periodically sends messages in the client queue. One way to do so is to issue the following command at system boot (via a boot script):

 /usr/sbin/sendmail -Ac -q5m

See the sendmail documentation for more information about deferred mode and client queue runners.

By calling these functions, you can configure MIMEDefang to suit nearly any email management policy you wish to institute.

When you make changes to the mimedefang-filter script, you must signal mimedefang-multiplexor to reread the configuration and restart its slave processes. The easiest way to signal the multiplexor is to use the md-mx-ctrl reread command. Another way is to use the kill -INT process-id command to send a SIGINT signal to the multiplexor process; you can identify the process ID from ps output or by examining the pid file if the multiplexor was started with the -p option.
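
Both ways of signaling the multiplexor can be wrapped in one function, as in the sketch below. The names md_reload and RUN are hypothetical, and the default pid-file path follows Example 5-5, so adjust it to whatever you passed to -p. Set RUN=echo to see which command would be used without running it.

```shell
# Sketch: tell the multiplexor to reread the filter. Prefers md-mx-ctrl
# and falls back to SIGINT via the pid file. RUN is a hypothetical hook:
# set RUN=echo to print the command instead of running it.
RUN=${RUN:-}

md_reload() {
    pidfile=${1:-/var/spool/MIMEDefang/mimedefang-multiplexor.pid}
    if command -v md-mx-ctrl >/dev/null 2>&1; then
        $RUN md-mx-ctrl reread
    elif [ -r "$pidfile" ]; then
        $RUN kill -INT "$(cat "$pidfile")"
    else
        echo "md_reload: no md-mx-ctrl and no readable $pidfile" >&2
        return 1
    fi
}
```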

5.2.3 SpamAssassin Integration

MIMEDefang expects to find a SpamAssassin configuration file called sa-mimedefang.cf in your sitewide configuration directory (usually /etc/mail/spamassassin). If that file doesn't exist, MIMEDefang looks for local.cf in the same directory. This gives you the flexibility of creating different SpamAssassin configurations to be used when SpamAssassin is invoked by MIMEDefang and when SpamAssassin is invoked by local users or scripts.

If you're going to be invoking SpamAssassin only through MIMEDefang, or if there should be no differences in the configuration based on how SpamAssassin is invoked, consider making sa-mimedefang.cf a hard or symbolic link to local.cf. MIMEDefang will find the configuration file it first looks for, and you will avoid the possibility of later creating two different configurations.
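
A cautious way to set up that link is sketched below. The function name link_sa_config and the .orig suffix are hypothetical choices; the function keeps any existing sa-mimedefang.cf (such as the one MIMEDefang's installer generates) under a new name rather than overwriting it.

```shell
# Sketch: make sa-mimedefang.cf a symbolic link to local.cf in the given
# directory (default /etc/mail/spamassassin). An existing regular file
# is kept as sa-mimedefang.cf.orig instead of being overwritten.
link_sa_config() {
    dir=${1:-/etc/mail/spamassassin}
    if [ ! -f "$dir/local.cf" ]; then
        echo "link_sa_config: no local.cf in $dir" >&2
        return 1
    fi
    if [ -e "$dir/sa-mimedefang.cf" ] && [ ! -L "$dir/sa-mimedefang.cf" ]; then
        mv "$dir/sa-mimedefang.cf" "$dir/sa-mimedefang.cf.orig" || return 1
    fi
    # Relative target, so the link survives if the directory is moved.
    ln -sf local.cf "$dir/sa-mimedefang.cf"
}
```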

When running SpamAssassin via MIMEDefang, you may not use any of SpamAssassin's configuration directives that modify a mail message. Attempting to modify the Subject header or add new headers using SpamAssassin directives will not work. All such changes must be performed by MIMEDefang in the mimedefang-filter script.

If you want SpamAssassin to perform network-based tests (such as DNSBL lookups), you must add a line to mimedefang-filter (just after the $AdminName setting works well) to set the $SALocalTestsOnly variable to 0, like this:

 $SALocalTestsOnly = 0;

The section of the default mimedefang-filter that handles spam-tagging appears in the filter_end( ) function and is agreeably easy to read. It is presented in Example 5-7.

Example 5-7. Spam-tagging section of mimedefang-filter

    # Spam checks if SpamAssassin is installed
    if ($Features{"SpamAssassin"}) {
        if (-s "./INPUTMSG" < 300*1024) {
            # Only scan messages smaller than 300kB.  Larger messages
            # are extremely unlikely to be spam, and SpamAssassin is
            # dreadfully slow on very large messages.
            my($hits, $req, $names, $report) = spam_assassin_check( );
            my($score);
            if ($hits < 40) {
                $score = "*" x int($hits);
            } else {
                $score = "*" x 40;
            }
            # We add a header which looks like this:
            # X-Spam-Score: 6.8 (******) NAME_OF_TEST,NAME_OF_TEST
            # The number of asterisks in parens is the integer part
            # of the spam score clamped to a maximum of 40.
            # MUA filters can easily be written to trigger on a
            # minimum number of asterisks...
            action_change_header("X-Spam-Score", "$hits ($score) $names");
            if ($hits >= $req) {
                md_graphdefang_log('spam', $hits, $RelayAddr);
                # If you find the SA report useful, add it, I guess...
                action_add_part($entity, "text/plain", "-suggest",
                                "$report\n",
                                "SpamAssassinReport.txt", "inline");
            } else {
                # Delete any existing X-Spam-Score header?
                action_delete_header("X-Spam-Score");
            }
        }
    }

First, the code checks to be sure that MIMEDefang detected SpamAssassin on the system when it started. It then checks to be sure that the INPUTMSG file, which contains the message to scan, is smaller than 300 kilobytes. If that's the case, the code calls MIMEDefang's spam_assassin_check( ) function, which uses Mail::SpamAssassin to check the message and returns the number of hits, number of required hits for tagging, names of tests hit, and the text of SpamAssassin's spam report for the message. The code creates a $score variable containing one asterisk for each hit (up to 40).

Next, the code in Example 5-7 calls the MIMEDefang action_change_header( ) function to change (or add) the X-Spam-Score header. The header will include the number of hits (expressed numerically and as a line of asterisks) and the names of tests that matched.
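The way the header value is assembled can be tried outside of MIMEDefang. The following is a minimal stand-alone sketch, not code from the default filter: the spam_score_header( ) helper and its optional third argument are hypothetical names introduced here for illustration, and only the clamping rule (one asterisk per whole point, at most 40) comes from Example 5-7.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MIMEDefang): build an X-Spam-Score
# header value using the same clamping rule as the default filter.
sub spam_score_header {
    my ($hits, $names, $max_stars) = @_;
    $max_stars = 40 unless defined $max_stars;

    # One asterisk per whole point, never fewer than 0 or more than $max_stars.
    my $count = int($hits);
    $count = 0          if $count < 0;
    $count = $max_stars if $count > $max_stars;

    return sprintf("%s (%s) %s", $hits, "*" x $count, $names);
}

# Print a sample header line for a message that scored 6.8.
print "X-Spam-Score: ",
      spam_score_header(6.8, "NAME_OF_TEST,NAME_OF_TEST"), "\n";
```

Because the asterisks form a contiguous run, a mail client rule that looks for a fixed number of asterisks in X-Spam-Score matches every message at or above that whole-number score.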

If the number of hits is greater than or equal to the required number to declare the message spam, the code calls MIMEDefang's md_graphdefang_log( ) function to make a log entry and then adds the SpamAssassin report text to the message as an additional MIME part using the action_add_part( ) function. If the number of hits is less than the required number for tagging, the script removes the X-Spam-Score header.

You might customize this code in filter_end( ) in several easy ways to suit your needs. By commenting out the action_delete_header( ) line, you can have the X-Spam-Score header added to all messages, spam or not. If you want to modify the Subject header of spam messages as SpamAssassin does, add the following code before the action_add_part( ) line:

 action_change_header("Subject", "*****SPAM***** $Subject");

The $Subject variable will already contain the message subject.

Remember that you must signal the MIMEDefang milter to reread mimedefang-filter whenever you change it or any Perl modules on which it depends, including SpamAssassin and its configuration. If you update SpamAssassin or modify settings in /etc/mail/spamassassin/sa-mimedefang.cf , you should signal the milter.

5.2.3.1 Adding sitewide Bayesian filtering

Adding a sitewide Bayesian filter for use with MIMEDefang is relatively easy. Use the usual SpamAssassin use_bayes and bayes_path directives in sa-mimedefang.cf , and ensure that the defang user has permission to create the databases in the directory named in bayes_path . One way to do this is to create a directory for the databases that is owned by defang , such as /var/spool/MD-Bayes . Another option is to locate the databases in a directory owned by another user but to create them ahead of time and chown them to defang . If local users need access to the databases (e.g., they will be running sa-learn ), you may have to make the databases readable or writable by a group other than defang and adjust the bayes_file_mode , or make them world-readable or world-writable. Doing so, however, puts the integrity of your spam-checking at the mercy of the good intentions and comprehension of your users.

5.2.3.2 Adding sitewide autowhitelisting

In SpamAssassin 3.0, autowhitelisting is easy to enable. You need only add the usual autowhitelist directives to sa-mimedefang.cf to determine where and how the autowhitelist database will be stored. Be sure to enable the use_auto_whitelist configuration option to turn on autowhitelisting.

Using a sitewide autowhitelist database in SpamAssassin 2.63 requires just a bit more effort. In addition to adding the SpamAssassin autowhitelist directives to sa-mimedefang.cf , you must modify mimedefang.pl to provide SpamAssassin with an address list factory, as discussed in Chapter 4. Example 5-8 shows the spam_assassin_init( ) function in mimedefang.pl . To support autowhitelisting, add the three lines that follow the Mail::SpamAssassin->new( ) call: the require statement, the DBBasedAddrList->new( ) call, and the set_persistent_address_list_factory( ) call. Don't forget to signal mimedefang-multiplexor to reread its configuration after making these changes.

Example 5-8. Adding an address list factory to mimedefang.pl

    sub spam_assassin_init (;$) {
        unless ($Features{"SpamAssassin"}) {
            md_syslog('err', "$MsgID: Attempt to call SpamAssassin function, but SpamAssassin is not installed.");
            return undef;
        }
        if (!defined($SASpamTester)) {
            my $config = shift;
            unless ($config)
            {
                if (-r "/etc/mail/spamassassin/sa-mimedefang.cf") {
                    $config = "/etc/mail/spamassassin/sa-mimedefang.cf";
                } elsif (-r "/etc/mail/spamassassin/local.cf") {
                    $config = "/etc/mail/spamassassin/local.cf";
                } else {
                    $config = "/etc/mail/spamassassin.cf";
                }
            }
            $SASpamTester = Mail::SpamAssassin->new({
                local_tests_only   => $SALocalTestsOnly,
                dont_copy_prefs    => 1,
                userprefs_filename => $config});

            # Added to support autowhitelisting:
            require Mail::SpamAssassin::DBBasedAddrList;
            my $awl = Mail::SpamAssassin::DBBasedAddrList->new( );
            $SASpamTester->set_persistent_address_list_factory ($awl);
        }
        return $SASpamTester;
    }

5.2.3.3 Adding per-domain or per-user streaming

By default, MIMEDefang processes each message once and applies SpamAssassin's spam determination to the message. This process works well if you run a small mail server for a single domain, but it presents a problem for mail gateways, virtual hosts, and larger servers. What should be done when an email message is received for multiple recipients, possibly at multiple domains? MIMEDefang provides two functions that you can use to implement solutions to this problem, stream_by_recipient( ) and stream_by_domain( ) . Each works in the same way.

If you add a call to stream_by_recipient( ) to the filter_begin( ) function, stream_by_recipient( ) checks to see if a message has only a single recipient. If so, it returns 0, and the filter should continue to work on the message. If the message has multiple recipients, stream_by_recipient( ) reinjects the message by connecting to sendmail and resubmitting the message as a series of new messages, one for each recipient of the original message. Figure 5-1 illustrates this process. In this case, stream_by_recipient( ) returns 1, and the original, multirecipient message should be discarded. When the new single-recipient messages arrive at the filter, they will pass through stream_by_recipient( ) and continue on to the rest of the filter, which can now safely perform per-recipient functions (such as using personal whitelists and blacklists or other user preferences).

Figure 5-1. Streaming by recipients

stream_by_domain( ) works similarly but only reinjects one new copy of a message for each recipient domain in the original message. The rest of the filter can behave differently for different recipient domains, which permits virtual hosting providers to apply different spam criteria for different domains they host.
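To make the difference between the two functions concrete, the following stand-alone sketch groups a recipient list the way each strategy would split it. It is only an illustration of the grouping, under the assumption that recipients may arrive wrapped in angle brackets; it does not call MIMEDefang or reinject anything, and group_by_recipient( ) and group_by_domain( ) are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical illustration only: show how many copies of a message each
# streaming strategy would produce for a given recipient list.

# One group per recipient (the idea behind stream_by_recipient).
sub group_by_recipient {
    my @recipients = @_;
    return map { [$_] } @recipients;
}

# One group per recipient domain (the idea behind stream_by_domain).
sub group_by_domain {
    my @recipients = @_;
    my (%groups, @order);
    for my $rcpt (@recipients) {
        # Strip optional angle brackets, then take the text after the last "@".
        (my $addr = $rcpt) =~ s/^<(.*)>$/$1/;
        my ($domain) = $addr =~ /\@([^\@]+)$/;
        $domain = lc(defined $domain ? $domain : '');
        push @order, $domain unless exists $groups{$domain};
        push @{ $groups{$domain} }, $rcpt;
    }
    # Return the groups in the order their domains were first seen.
    return map { $groups{$_} } @order;
}

my @rcpts = ('<alice@example.com>', '<bob@example.com>', '<carol@example.org>');
my @per_rcpt   = group_by_recipient(@rcpts);
my @per_domain = group_by_domain(@rcpts);
printf "per recipient: %d copies\n", scalar @per_rcpt;
printf "per domain:    %d copies\n", scalar @per_domain;
```

For a message with many recipients in only a few domains, the per-domain grouping yields far fewer copies, which is why streaming by domain is the cheaper choice when per-user behavior is not needed.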

Although some MIMEDefang features will work with sendmail 8.11, stream_by_domain( ) and stream_by_recipient( ) require sendmail 8.12. Moreover, locally submitted messages must be sent via SMTP for these functions to work (sendmail must be running as user smmsp rather than as user root ).

Example 5-9 shows how you could use stream_by_domain( ) to offer different policies to different recipient domains. Policies are stored in a Berkeley database file, /etc/mail/spampolicy.db , that is generated from a text file, /etc/mail/spampolicy , using the standard sendmail makemap program. Each line of the text file should contain a domain name, white space, and a policy, which should be either TAG (tag spam at SpamAssassin's default level), TAG n (tag messages with n or more hits), BLOCK (reject spam at SpamAssassin's default level), BLOCK n (reject messages with n or more hits), or IGNORE (do no spam-checking). spampolicy.db must be owned by defang .
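The policy strings are simple enough to check before they go into the database. The sketch below is a stand-alone illustration and not part of Example 5-9: parse_policy( ) is a hypothetical helper, and the default threshold of 5 passed to it merely stands in for the required score that SpamAssassin itself would report.

```perl
use strict;
use warnings;

# Hypothetical helper: split a policy string such as "TAG", "BLOCK 8", or
# "IGNORE" into an action and a threshold. $default_req stands in for the
# required score reported by SpamAssassin. Returns an empty list when the
# policy is not recognized.
sub parse_policy {
    my ($policy, $default_req) = @_;
    my ($action, $n) = uc($policy) =~ /^\s*(TAG|BLOCK|IGNORE)(?:\s+(\d+))?\s*$/
        or return;
    return ($action, undef) if $action eq 'IGNORE';
    return ($action, defined $n ? $n : $default_req);
}

for my $p ('TAG', 'tag 7', 'BLOCK 10', 'IGNORE', 'bogus') {
    my ($action, $req) = parse_policy($p, 5);
    if (!defined $action) {
        print "$p: unrecognized policy\n";
        next;
    }
    printf "%-8s -> %s%s\n", $p, $action, defined $req ? " at $req" : "";
}
```

Running a check like this over /etc/mail/spampolicy before calling makemap is one way to catch a mistyped policy, which the filter would otherwise treat as a plain tagging policy.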

Example 5-9. Using stream_by_domain( )

    use DB_File;

    sub getpolicy {
        # Where do we find the policy db?
        my $policydb = '/etc/mail/spampolicy.db';
        # If a domain isn't listed, what's the default policy?
        my $default_policy = 'TAG';
        my $host = shift;
        my %policy;
        tie %policy, 'DB_File', $policydb, O_RDONLY, 0640, $DB_HASH;
        my $policy = $policy{"\L$host"};
        untie %policy;
        return defined($policy) ? "\U$policy" : $default_policy;
    }

    sub filter_begin ( ) {
        if ($SuspiciousCharsInHeaders) {
            md_graphdefang_log('suspicious_chars');
            return action_discard( );
        }

        # Per-domain streaming is turned on here so we get the $Domain var
        # set later on.
        return if stream_by_domain( );

        ...
    }

    sub filter_end ($) {
        my($entity) = @_;
        send_quarantine_notifications( );

        # No sense doing any extra work
        return if message_rejected( );

        # Spam checks if SpamAssassin is installed
        if ($Features{"SpamAssassin"}) {
            if (-s "./INPUTMSG" < 100*1024) {
                # Spam policy selection, based on $Domain, using a Berkeley db lookup
                my $spampolicy = getpolicy($Domain);
                action_add_header("X-Spam-Policy", "$spampolicy $Domain");
                if ($spampolicy ne "IGNORE") {
                    my($hits, $req, $names, $report) = spam_assassin_check( );
                    # A number in the policy overrides SpamAssassin's threshold
                    $req = $1 if ($spampolicy =~ /(\d+)/);
                    if ($hits >= $req) {
                        md_graphdefang_log('spam', $hits, $RelayAddr);
                        if ($spampolicy =~ /BLOCK/) {
                            action_bounce("Message rejected by SpamAssassin");
                            return;
                        }

                        my($score);
                        if ($hits < 40) {
                            $score = "*" x int($hits);
                        } else {
                            $score = "*" x 40;
                        }
                        action_change_header("X-Spam-Score", "$hits ($score) $names");
                        action_add_part($entity, "text/plain", "-suggest",
                                        "$report\n",
                                        "SpamAssassinReport.txt", "inline");
                    } else {
                        action_delete_header("X-Spam-Score");
                    }
                }
            }
        }
    }

You could similarly use stream_by_recipient( ) in an environment where you want to read SpamAssassin user preferences for each recipient from an SQL database. The Mail::SpamAssassin object used in mimedefang-filter is named $SASpamTester . A simple approach is to call the load_scoreonly_sql( ) method on that object, passing the recipient's email address as an argument, like this:

    # @Recipients in mimedefang-filter is an array of recipient emails,
    # but if you're using stream_by_recipient, there should only be a single
    # recipient at this point.
    my $recip = $Recipients[0];
    # If your SQL database uses usernames rather than email addresses, uncomment:
    # $recip =~ s/@.*//;
    $SASpamTester->load_scoreonly_sql($recip);

This approach creates a new database connection for each mail message. A more complicated but more efficient approach would be to set up a database connection in filter_begin( ) and write SQL queries by hand in filter_end( ) . On the other hand, using SpamAssassin's own functions, like load_scoreonly_sql( ) , ensures that your code will be compatible with future SpamAssassin releases that might change the database format.

Although stream_by_recipient( ) and stream_by_domain( ) solve an important problem, they do so at a cost in performance. Messages that arrive for multiple recipients (or domains) will have to be split up and reinjected, considerably increasing the overall load on the mail server.
