6.4 Building a Spam-Checking Gateway


Several content-filtering daemons that call SpamAssassin are available for Postfix. This section provides a complete sample installation of amavisd-new, a particularly efficient filter that supports both spam-checking and virus-checking. amavisd-new is written in Perl and available at http://www.ijs.si/software/amavisd/. The version used in this chapter's example is 20030616-p9, which supports both SpamAssassin 2.63 and SpamAssassin 3.0.

amavisd-new is based on amavis, another virus-scanning package that is also actively developed and widely used. Although amavisd-new's most important program is also named amavisd, amavisd-new has developed separately and is a significantly different package. Some of amavisd-new's features include:

amavisd-new was specifically developed and tested for Postfix as a daemonized content filter.
Messages can be rejected based on MIME type or extensions of attached filenames.
Messages can be checked with multiple virus scanners, and messages carrying viruses can be refused, discarded, or quarantined.
SpamAssassin can be invoked on a message, and spam can be refused, discarded, quarantined, or tagged.
Per-user configuration of amavisd-new is possible through an SQL or LDAP database.

The rest of this chapter details the installation, configuration, and operation of amavisd-new as an example of a full-scale, daemonized, content filter approach to using SpamAssassin with Postfix. amavisd-new's other functions, such as virus-checking, are mentioned but not covered in detail; read the documentation to learn more about these other amavisd-new features.

6.4.1 Installing amavisd-new

amavisd-new is written in Perl, and invokes SpamAssassin through the Mail::SpamAssassin Perl modules. Because amavisd-new itself is a daemon, you do not need to run spamd. It's easiest to install SpamAssassin (and your antivirus software) first, and then install amavisd-new. amavisd-new also requires several other Perl modules, including: Archive::Tar, Archive::Zip, Compress::Zlib, Convert::TNEF, Convert::UUlib, MIME::Base64, MIME::Tools, Mail::Internet, Net::Server, Net::SMTP, Digest::MD5, IO::Stringy, Time::HiRes, and Unix::Syslog. If you plan to do per-user configuration of amavisd-new through SQL or LDAP, you'll need appropriate Perl modules for database access (DBI and a DBD:: module for SQL, or Net::LDAP for LDAP). You can install most of these Perl modules using CPAN as described in Chapter 2.

The standard version of MIME::Tools 5.411a has bugs. Install MIME::Tools 6 or later from http://search.cpan.org/dist/MIME-tools.
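Before installing amavisd-new, it can help to see which of the modules listed above Perl can already load. The following is a minimal sketch and not part of amavisd-new; the helper name missing_modules is hypothetical, and the module list is simply copied from the paragraph above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the names of the modules that cannot be loaded on this system.
# missing_modules is a hypothetical helper, not part of amavisd-new.
sub missing_modules {
    my @names = @_;
    my @missing;
    for my $name (@names) {
        # Skip anything that is not a plain module name before using string eval.
        next unless $name =~ /\A\w+(?:::\w+)*\z/;
        push @missing, $name unless eval "require $name; 1";
    }
    return @missing;
}

my @required = qw(
    Archive::Tar Archive::Zip Compress::Zlib Convert::TNEF Convert::UUlib
    MIME::Base64 MIME::Tools Mail::Internet Net::Server Net::SMTP
    Digest::MD5 IO::Stringy Time::HiRes Unix::Syslog
);

# Print one line per module that still needs to be installed.
print "missing: $_\n" for missing_modules(@required);
```

Any module reported as missing can then be installed through CPAN. The script only checks that a module loads; it does not check versions.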

Begin the install process by creating a new user account and group for running amavisd-new; the usual name for both the user and group is amavis . This user will own amavisd-new's files, and the user (or group) must have access to SpamAssassin's configuration and database files as well. The user's home directory is traditionally /var/amavis , but you can create it anywhere that fits your system's needs.

amavisd-new uses several important directories. It keeps two files in its home directory, one containing its current process ID, and the other used for locking. It uses a working directory for unpacking email messages and scanning them; by default, this is the home directory or the tmp subdirectory of the home directory. For optimal performance, this directory should be on a fast disk, or even a RAM disk if your operating system supports it and you have enough memory to spare. amavisd-new stores quarantined email messages in /var/virusmails by default, but you can select any directory for this purpose. Speed is not so critical with this directory, and it should never be located on a RAM disk because you will often want to be sure that you can access quarantined files. If you plan to physically locate these directories somewhere unusual (e.g., to mount new disk partitions or a RAM disk as /var/amavis/tmp), you should do so before you install amavisd-new. The directories should be owned by user and group amavis and should not be world-readable or world-searchable.

Next, download the amavisd-new source code from http://www.ijs.si/software/amavisd/ and unpack it. As root, copy the amavisd script to a suitable directory for executable daemons (e.g., /usr/bin, /usr/local/sbin, etc.), chown it to root, and use chmod to set its permissions to 0755 (readable and executable by all users, writable only by root).

Copy the amavisd.conf file to a suitable directory for configuration files (e.g., /etc , /etc/amavis , /usr/local/etc , etc.). By default, amavisd expects to find this file in /etc , and if you locate it anywhere else, you will have to add an extra command-line option ( -c filename ) when invoking amavisd to tell it the new location. The amavisd.conf file should also be owned by root and should have permissions 0644 (readable by all, writable only by root ).

6.4.2 Configuring amavisd-new

amavisd-new is configured through the amavisd.conf file. amavisd.conf is parsed as a Perl script and can contain any legal Perl code. Because it is parsed as Perl, you must escape any at sign (@), dollar sign ($), or backslash (\) characters that appear in double-quoted strings by prepending a backslash. For example:

 $some_email = "sample\@example.com";

Email addresses must be specified without surrounding brackets and without RFC 2821 quoting.
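Single-quoted Perl strings do not interpolate variables, so the same address can be written there without a backslash. The following sketch contrasts the two quoting styles; the variable names are illustrative only and are not amavisd-new settings.

```perl
# Illustrative variable names only; these are not amavisd-new settings.
my $addr_double = "sample\@example.com";  # double quotes: the @ must be escaped
my $addr_single = 'sample@example.com';   # single quotes: no interpolation, no escape needed
my $price_note  = "costs \$5";            # a literal dollar sign inside double quotes
```

Both address variables end up holding the same string, so single quotes are often the simpler choice for literal addresses.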

Edit amavisd.conf to set the (many) available configuration options to control amavisd . The file is organized in logical sections; the most important options are in Section I, but you'll need to read through the entire file to customize the system completely. The following sections explain commonly modified portions of the configuration file in the order that you'll encounter them.

6.4.2.1 Essential options

Example 6-1 shows the first portion of the configuration file and the settings of the essential options. Set $MYHOME to the amavis user's home directory. Set $mydomain to your domain name. Set $daemon_user and $daemon_group to the name of the amavis user and group. Set $TEMPBASE to the directory to use for unpacking messages; for improved performance, this directory should be a mounted RAM disk.

Example 6-1. Essential settings in amavisd.conf

 # Section I - Essential daemon and MTA settings
 #
 # $MYHOME serves as a quick default for some other configuration settings.
 # More refined control is available with each individual setting further down.
 # $MYHOME is not used directly by the program. No trailing slash!
 $MYHOME = '/var/amavis';     # (default is '/var/amavis')

 # $mydomain serves as a quick default for some other configuration settings.
 # More refined control is available with each individual setting further down.
 # $mydomain is never used directly by the program.
 $mydomain = 'example.com';   # (no useful default)

 # Set the user and group to which the daemon will change if started as root
 # (otherwise just keeps the UID unchanged, and these settings have no effect):
 $daemon_user  = 'amavis';    # (no default;  customary: vscan or amavis)
 $daemon_group = 'amavis';    # (no default;  customary: vscan or amavis)

 # Runtime working directory (cwd), and a place where
 # temporary directories for unpacking mail are created.
 # (no trailing slash, may be a scratch file system)
 #$TEMPBASE = $MYHOME;        # (must be set if other config vars use is)
 $TEMPBASE = "$MYHOME/tmp";   # prefer to keep home dir /var/amavis clean?

6.4.2.2 MTA options

Example 6-2 shows the settings of the MTA options. Set $forward_method to the method you will use to reinject checked mail to the MTA. For Postfix, this method should be of the form smtp:ipaddress:portnumber, where ipaddress is the IP address of the Postfix system (usually 127.0.0.1) and portnumber is the TCP port number on which the second smtpd instance is running. Because amavisd-new was designed with Postfix in mind, you may not need to change this section at all.

Example 6-2. MTA options in amavisd.conf

 # MTA SETTINGS, UNCOMMENT AS APPROPRIATE,
 # both $forward_method and $notify_method default to 'smtp:127.0.0.1:10025'

 # POSTFIX, or SENDMAIL in dual-MTA setup, or EXIM V4
 # (set host and port number as required; host can be specified
 # as IP address or DNS name (A or CNAME, but MX is ignored)
 $forward_method = 'smtp:127.0.0.1:10025';  # where to forward checked mail
 $notify_method = $forward_method;          # where to submit notifications

6.4.2.3 Daemon process options

Example 6-3 shows the daemon process settings. The most important setting is $max_servers , which you should set to the same number of smtp processes you have configured Postfix to use concurrently to send messages to amavisd-new.

Example 6-3. Daemon process settings in amavisd.conf

 # Net::Server pre-forking settings
 # You may want $max_servers to match the width of your MTA pipe
 # feeding amavisd, e.g. with Postfix the 'Max procs' field in the
 # master.cf file, like the '2' in the:  smtp-amavis unix - - n - 2 smtp
 #
 $max_servers  =  2;   # number of pre-forked children          (default 2)

6.4.2.4 Distinguishing local domains

amavisd-new distinguishes local domains from remote domains. Recipients at local domains can take advantage of several per-user features that are not directly available to remote recipients, including local customization of SpamAssassin settings. Example 6-4 shows that part of amavisd.conf that bears on per-user customization.

You can provide your local domain information in several ways. You can set the @local_domains_acl array to a list of domain names that should be considered local. You can set the %local_domains hash instead, providing local domain names as keys and 1 as their values, or use the read_hash function to read in a list of local domain names from an external file. Finally, you can define local domain names by invoking the new_RE function with a regular expression that matches the local domain names and assigning the result to $local_domains_re . No matter which method you use, adding a period (.) to the beginning of a domain name means that the domain and any subdomains should all be considered local.

Example 6-4 shows this section of the configuration file, using the @local_domains_acl variable to define local domains.

Example 6-4. Setting local domains in amavisd.conf

 # Lookup list of local domains (see README.lookups for syntax details)
 #
 # NOTE:
 #   For backwards compatibility the variable names @local_domains (old) and
 #   @local_domains_acl (new) are synonyms. For consistency with other lookups
 #   the name @local_domains_acl is now preferred. It also makes it more
 #   obviously distinct from the new %local_domains hash lookup table.
 #
 # local_domains* lookup tables are used in deciding whether a recipient
 # is local or not, or in other words, if the message is outgoing or not.
 # This affects inserting spam-related headers for local recipients,
 # limiting recipient virus notifications (if enabled) to local recipients,
 # in deciding if address extension may be appended, and in SQL lookups
 # for non-fqdn addresses. Set it up correctly if you need features
 # that rely on this setting (or just leave empty otherwise).
 #
 # With Postfix (2.0) a quick reminder on what local domains normally are:
 # a union of domains specified in: $mydestination, $virtual_alias_domains,
 # $virtual_mailbox_domains, and $relay_domains.
 #
 #@local_domains_acl = ( ".$mydomain" );  # $mydomain and its subdomains
 # @local_domains_acl = qw( );  # default is empty, no recipient treated as local
 # @local_domains_acl = qw( .example.com );
 # @local_domains_acl = qw( .example.com !host.sub.example.net .sub.example.net );
 # @local_domains_acl = ( ".$mydomain", '.example.com', 'sub.example.net' );

 @local_domains_acl = qw/ example.com example.net example.org /;

 # or alternatively(A), using a Perl hash lookup table, which may be assigned
 # directly, or read from a file, one domain per line; comments and empty lines
 # are ignored, a dot before a domain name implies its subdomains:
 #
 #read_hash(\%local_domains, '/var/amavis/local_domains');

 #or alternatively(B), using a list of regular expressions:
 # $local_domains_re = new_RE( qr'[@.]example\.com$'i );
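The distributed file shows the hash lookup only through read_hash. A direct assignment might look like the following sketch, which assumes the same placeholder domains and would be used in place of @local_domains_acl; as described above, keys are domain names, values are 1, and a leading period also covers subdomains.

```perl
# Sketch only: example.com, example.net and example.org are placeholder domains.
%local_domains = (
    '.example.com' => 1,   # example.com and all of its subdomains
    'example.net'  => 1,   # exactly example.net
    'example.org'  => 1,   # exactly example.org
);
```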

6.4.2.5 Postfix-specific options

Section II of amavisd.conf specifies options that differ by MTA and is shown in Example 6-5. Because amavisd-new was designed with Postfix in mind, you need to modify relatively few options. Set the $inet_socket_port variable to the TCP port number on which amavisd should listen for SMTP connections from Postfix. To prevent this port from being accessed by remote hosts, set $inet_socket_bind to '127.0.0.1', which will cause amavisd to listen only on the loopback interface and not on other network interfaces. If you want to allow access by a set of remote hosts (if, for example, you want to run amavisd on a different host than your Postfix MTA), don't set $inet_socket_bind but do set @inet_acl to a list of IP addresses for hosts that should be permitted to connect. This list is checked in order; the first match wins. You may specify these IP addresses as single addresses or as CIDR-style address/netmask (e.g., 192.168.1/255.255.255.0) or address/bits (e.g., 192.168.1/24) ranges. ^[1] You may prepend an IP address with an exclamation point (!) to disallow connections from that address, even if a larger range that contains the address is permitted (e.g., !192.168.0/24 192.168/16 to allow all 192.168.*.* addresses except 192.168.0.* addresses).

^[1] "CIDR" stands for Classless Interdomain Routing.

Example 6-5. Postfix-specific options in amavisd.conf

 # SMTP SERVER (INPUT) PROTOCOL SETTINGS (e.g. with Postfix, Exim v4, ...)
 #   (used when MTA is configured to pass mail to amavisd via SMTP or LMTP)
 $inet_socket_port = 10024;  # accept SMTP on this local TCP port
                             # (default is undef, i.e. disabled)
 # multiple ports may be provided: $inet_socket_port = [10024, 10026, 10028];

 # SMTP SERVER (INPUT) access control
 # - do not allow free access to the amavisd SMTP port !!!
 #
 # when MTA is at the same host, use the following (one or the other or both):
 $inet_socket_bind = '127.0.0.1';  # limit socket bind to loopback interface
                                   # (default is '127.0.0.1')
 @inet_acl = qw( 127.0.0.1 );      # allow SMTP access only from localhost IP
                                   # (default is qw( 127.0.0.1 ) )
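For the case where Postfix runs on a different host, the settings might look like the sketch below. The address ranges are the ones used as an illustration in the text above, not a recommendation. Setting $inet_socket_bind to undef to listen on all interfaces is an assumption here; check the comments in your own amavisd.conf before relying on it.

```perl
$inet_socket_port = 10024;

# Assumption: undef lets amavisd bind to all interfaces rather than loopback only.
$inet_socket_bind = undef;

# Checked in order, first match wins: localhost is allowed, 192.168.0.* is
# refused, and the rest of 192.168.*.* is allowed.
@inet_acl = qw( 127.0.0.1 !192.168.0/24 192.168/16 );
```

Because the first match wins, the negated range has to come before the larger range that contains it.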

6.4.2.6 Logging options

Section III of amavisd.conf deals with logging and is shown in Example 6-6. amavisd can log using syslog , or it can log to a file. Set $DO_SYSLOG to 1 to instruct amavisd to use syslog for logging; you can change the syslog facility and priority using the $SYSLOG_LEVEL variable. Set $DO_SYSLOG to 0 to instruct amavisd to log to a file; set $LOGFILE to specify the filename. The log file must be in a directory the amavis user can write to.

The $log_level variable controls the amount of detail that amavisd logs. A log level of 0 results in minimal logging; a log level of 5 produces highly verbose logging.

Example 6-6. Logging options

 # Section III - Logging
 #
 # true (e.g. 1) => syslog;  false (e.g. 0) => logging to file
 $DO_SYSLOG = 1;  # (defaults to false)
 #$SYSLOG_LEVEL = 'user.info';  # (defaults to 'mail.info')

 # Log file (if not using syslog)
 $LOGFILE = "$MYHOME/amavis.log";  # (defaults to empty, no log)

 #NOTE: levels are not strictly observed and are somewhat arbitrary
 # 0: startup/exit/failure messages, viruses detected
 # 1: args passed from client, some more interesting messages
 # 2: virus scanner output, timing
 # 3: server, client
 # 4: decompose parts
 # 5: more debug details
 $log_level = 1;  # (defaults to 0)
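Example 6-6 logs through syslog. The file-based alternative described above could be sketched as follows; the path and log level are illustrative choices, not defaults.

```perl
$DO_SYSLOG = 0;                     # log to a file instead of syslog
$LOGFILE   = "$MYHOME/amavis.log";  # directory must be writable by the amavis user
$log_level = 2;                     # also record virus scanner output and timing
```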

6.4.2.7 Spam-handling options

Most of Section IV of amavisd.conf focuses on detailed configuration of how amavisd will handle detected viruses and spam. Only those options related to spam handling are discussed in detail here.

When amavisd detects a spam email, it logs a message to its log file by default. It can also quarantine the email and/or notify an administrator. It can then generate a bounce message to the sender. Finally, it can either accept and deliver the message, or discard the message. Many different configuration variables are involved in these decisions. Unfortunately, the order of the variables in the file is largely the reverse of the order in which they are checked during the spam-handling process.

Enable a spam quarantine by setting the following two variables:

$QUARANTINEDIR: Set this variable to the directory or mailbox file in which to store the quarantined messages.
$spam_quarantine_method: Set this variable to "local:spam-%b-%i-%n" to specify the filename format for quarantined spam messages. In that format, %b expands to a digest of the message body, %i expands to the date and time, and %n expands to the amavisd message identifier.
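Taken together, the two settings might appear in amavisd.conf as in this sketch, which uses the default quarantine directory mentioned earlier in the chapter:

```perl
$QUARANTINEDIR = '/var/virusmails';               # where quarantined messages are stored
$spam_quarantine_method = 'local:spam-%b-%i-%n';  # body digest, date/time, amavisd message id
```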

To control the spam quarantine on a per-recipient basis, set the $spam_quarantine_to variable to a reference to a hash, keyed by the recipient's address, like this:

 $local_delivery_aliases{'sam-spam'} = '/home/sam/mail/spam';
 $spam_quarantine_to = {
   'example.net'      => undef,
   'jane@example.com' => 'spam@jane.example.com',
   'sam@example.com'  => 'sam-spam',
   'example.com'      => 'spam-quarantine',
 };

If the hash value is undefined or empty, spam is not quarantined. In this example, spam sent to example.net will not be quarantined at all. If the hash value contains an at sign (@), spam will be forwarded to that address. Spam sent to jane@example.com will be forwarded to spam@jane.example.com. Otherwise, the hash value is looked up in the %local_delivery_aliases hash, and the spam is quarantined in the file or directory returned from that lookup. If the lookup fails, amavisd logs a warning and doesn't quarantine the message. Several default local delivery aliases are defined in amavisd, including spam-quarantine, which quarantines a message in $QUARANTINEDIR. In the preceding example, spam to sam@example.com will be quarantined in the /home/sam/mail/spam mailbox (or mail directory), and other spam to example.com will be quarantined in the default directory.

You can also write your $spam_quarantine_to policies with regular expressions:

 $spam_quarantine_to = new_RE(
   [qr/^sam\@example\.com$/i  => 'sam-spam'],
   [qr/^jane\@example\.com$/i => 'spam@jane.example.com'],
   [qr/\@example\.com$/i      => 'spam-quarantine'],
   [qr/\@example\.net$/i      => undef]
 );

Because regular expressions are matched in the order that you list them, you must put the most specific matches first ( /^sam\@example\.com$/ before /\@example\.com$/ ). The at signs are escaped because qr// interpolates variables in the same way as a double-quoted string. Because regular expression matches are case sensitive by default, you should generally add the i (case-insensitive) modifier to the qr// operator.

Spam to recipients that don't match any entry in $spam_quarantine_to will not be quarantined, so if you want to quarantine all spam by default, you should either provide a rule for each domain you receive mail for, or use the regular expression approach and include a rule for the regular expression qr/.*/ at the end.
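A catch-all rule of that kind could be sketched as follows. The addresses are the placeholders used earlier in this section, and spam-quarantine is the default alias that stores messages in $QUARANTINEDIR.

```perl
$spam_quarantine_to = new_RE(
    [qr/^jane\@example\.com$/i => 'spam@jane.example.com'],  # most specific rule first
    [qr/\@example\.net$/i      => undef],                    # never quarantine for example.net
    [qr/.*/                    => 'spam-quarantine'],        # catch-all: everyone else
);
```

The catch-all has to come last; placed earlier, it would match before any of the more specific rules were consulted.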

amavisd-new is smart about per-recipient policies like $spam_quarantine_to. If some message recipients choose to quarantine spam and some do not, amavisd-new will honor those preferences. If multiple recipients choose the same quarantine destination, a message sent to two or more of those recipients is written only once to the quarantine destination.

You can also make quarantine decisions based on a spam's sender in an analogous way using $spam_quarantine_bysender_to , but this alternative is rarely useful, as spammers often falsify their sending addresses or use throwaway accounts.

To notify an administrator when spam is received, set $spam_admin to the address of the administrator. These notifications are disabled by default. Consider carefully before setting $spam_admin to the email address of a real person; given the amount of spam on the Internet today, it's easy to get hundreds of notifications or more, and difficult to know what to do about them. An alternative that might be useful for service providers is to set $spam_admin to a reference to a hash based on the spam sender's address, in order to detect outgoing spam from customers. For example, to notify the security staff about spam being sent from the example.com domain but nowhere else, use:

 $spam_admin = {
   '.example.com' => 'security@example.com',
   '.'            => undef
 };

The $final_spam_destiny variable controls the final disposition of spam recognized by amavisd . Although this variable appears first in this section of the configuration file, it is consulted last during spam-handling. When using amavisd-new with Postfix, there are three useful settings for $final_spam_destiny :

Set $final_spam_destiny to D_PASS to accept and deliver all spam. Use this strategy when your goal is simply to tag spam and let clients do their own filtering. If you set $warnspamsender to 1, you will also generate a bounce message to the sender. I don't recommend this, however, as spammers often falsify return addresses.
Set $final_spam_destiny to D_DISCARD to discard spam that scores above a "kill level" (specified in Section VII of amavisd.conf ); spam below the kill level will be tagged and accepted. Use this strategy when your goal is to reduce bandwidth or storage space by dropping messages that are very likely to be spam and tagging others.
Set $final_spam_destiny to D_BOUNCE to generate a bounce message to the sender and then discard the message. Because spammers often falsify their return addresses, you will rarely want to use this setting.
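As a sketch of the second strategy, the following fragment discards high-scoring spam and leaves sender notifications off. Setting $warnspamsender to 0 is an illustrative choice that follows the advice above about falsified return addresses.

```perl
$final_spam_destiny = D_DISCARD;  # drop spam above the kill level; tag and accept the rest
$warnspamsender     = 0;          # illustrative choice: no notices to (often forged) senders
```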

6.4.2.8 Recipient whitelists

Section V of the amavisd.conf file focuses on spam policy controls for individual recipients or recipient domains. Its function is analogous to SpamAssassin's whitelist_to feature. You can prevent any spam-checking at all, or you can continue to perform spam-checking but prevent spam-handling actions for detected spam.

To prevent any spam-checking at all for email sent to a recipient, set the @bypass_spam_checks_acl , %bypass_spam_checks , or $bypass_spam_checks_re variables. You may use domain names instead of recipient addresses to whitelist all mail sent to a given domain. Here's how you'd set the @bypass_spam_checks_acl array to a list of recipients that want to opt out of spam-checking:

 @bypass_spam_checks_acl = qw( chris@example.com robin@example.com);

To use the %bypass_spam_checks hash instead, provide recipient addresses as keys and 1 as their values. You might prefer this approach to using @bypass_spam_checks_acl if you have a very long list of recipients, because searching a hash is much faster than searching a long list. You can also use the read_hash function to read in a list of recipients from an external file and assign them to %bypass_spam_checks . This is useful when you want to keep a long list of recipients separate from the amavisd.conf file. For example:

 read_hash(\%bypass_spam_checks, '/var/amavis/bypass_spam');
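For a short list, the hash can also be assigned directly instead of being read from a file. A minimal sketch, reusing the two recipients from the array example above:

```perl
# Keys are recipient addresses; each value is 1.
%bypass_spam_checks = (
    'chris@example.com' => 1,
    'robin@example.com' => 1,
);
```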

Finally, you can define recipients to opt out by providing a list of regular expressions that match recipient addresses to the new_RE function and assigning the result to $bypass_spam_checks_re. This method is useful when you can parsimoniously specify your whitelisted recipients with a regular expression or two. For example:

 $bypass_spam_checks_re = new_RE(qr'^(chris|robin)@example\.com'i);

Spam checks are bypassed only if all of the recipients of a message have been added to one of these variables. If even one recipient is not listed, spam-checking will still be performed. To ensure that spam is still delivered to whitelisted recipients in such cases, use the "spam_lovers" features discussed next.

If spam checks are bypassed, SpamAssassin's Bayesian classifier will not have an opportunity to learn from a message, whether or not it is spam.

To prevent spam-handling (e.g., tagging or quarantine) from being performed for a recipient when a message has been checked and designated as spam, set the @spam_lovers_acl , %spam_lovers , or $spam_lovers_re variables. These variables are set analogously to the @bypass_spam_checks_acl , %bypass_spam_checks , and $bypass_spam_checks_re variables.

In Example 6-7, jane@example.com always receives every message, spam or not, and spam-tagging is skipped when messages are addressed to her alone. In addition, if a message is destined for a domain other than example.com (i.e., it's outgoing mail from our domain), spam-tagging is skipped. postmaster@example.com also receives every message, but spam-checking is still performed.

Example 6-7. Whitelisting by recipient

 # Avoid running a spam check if jane is the only recipient, or if
 # all recipients are outside of example.com
 @bypass_spam_checks_acl = ('jane@example.com', '!.example.com');

 # Even if we run a check, don't act on the results for jane or postmaster
 @spam_lovers_acl = ('jane@example.com', 'postmaster@example.com');

6.4.2.9 Sender whitelists and blacklists

amavisd can maintain whitelists and blacklists of message senders. It uses a message's envelope address (the one provided in the SMTP MAIL FROM command) as the sender address. Whitelisting ensures that amavisd will allow mail from a whitelisted sender to continue to its intended recipients; blacklisting ensures that amavisd will treat mail from a blacklisted sender as spam.

amavisd's whitelist and blacklist features do not interact in the same manner as SpamAssassin's. For example, if an address is both whitelisted and blacklisted in SpamAssassin, neither takes effect. If an address is both whitelisted and blacklisted in amavisd, both take effect: the message is marked as spam and also allowed to pass to the recipient.

As with other amavisd address-matching features, you can specify addresses to globally whitelist by an array, keys of a hash, or by a set of regular expressions. Set the @whitelist_sender_acl array to a list of sender addresses to whitelist. To use the %whitelist_sender hash instead, provide sender addresses as keys and 1 as their values, or use the read_hash function to read in a list of senders from an external file. Finally, you can specify senders to whitelist by providing a list of regular expressions that match the sender addresses to the new_RE function and assigning the result to $whitelist_sender_re . You may use domain names instead of sender addresses to whitelist all mail sent from a given domain.

You can use a similar set of variables for globally blacklisting senders. The array is @blacklist_sender_acl , the hash is %blacklist_sender , and the regular expression version is $blacklist_sender_re .
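The array forms are the simplest to write. In the sketch below every address and domain is hypothetical. The leading period on a domain, meaning the domain plus its subdomains, is assumed to work as it does for the local-domain lookups; README.lookups has the definitive syntax.

```perl
# Hypothetical senders, for illustration only.
@whitelist_sender_acl = qw( newsletter@example.org .partner.example.net );
@blacklist_sender_acl = qw( offers@bulk.example.com );
```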

The default amavisd.conf defines $blacklist_sender_re and %whitelist_sender as shown in Example 6-8. Many username patterns typical of spammers are blacklisted, such as investments ; many addresses of well-known security and vendor mailing lists are whitelisted. You can modify these definitions or use one of the other variables to add additional sender addresses to the whitelist or blacklist.

Example 6-8. Default blacklist and whitelist entries in amavisd.conf

 $blacklist_sender_re = new_RE(
     qr'^(bulkmail|offers|cheapbenefits|earnmoney|foryou|greatcasino)@'i,
     qr'^(investments|lose_weight_today|market.alert|money2you|MyGreenCard)@'i,
     qr'^(new\.tld\.registry|opt-out|opt-in|optin|saveonlsmoking2002k)@'i,
     qr'^(specialoffer|specialoffers|stockalert|stopsnoring|wantsome)@'i,
     qr'^(workathome|yesitsfree|your_friend|greatoffers)@'i,
     qr'^(inkjetplanet|marketopt|MakeMoney)\d*@'i,
 );

 map { $whitelist_sender{lc($_)}=1 } (qw(
   cert-advisory-owner@cert.org
   owner-alert@iss.net
   slashdot@slashdot.org
   bugtraq@securityfocus.com
   NTBUGTRAQ@LISTSERV.NTBUGTRAQ.COM
   security-alerts@linuxsecurity.com
   amavis-user-admin@lists.sourceforge.net
   notification-return@lists.sophos.com
   mailman-announce-admin@python.org
   owner-postfix-users@postfix.org
   owner-postfix-announce@postfix.org
   owner-sendmail-announce@Lists.Sendmail.ORG
   owner-technews@postel.ACM.ORG
   lvs-users-admin@LinuxVirtualServer.org
   ietf-123-owner@loki.ietf.org
   cvs-commits-list-admin@gnome.org
   rt-users-admin@lists.fsck.com
   clp-request@comp.nus.edu.sg
   surveys-errors@lists.nua.ie
   emailNews@genomeweb.com
   owner-textbreakingnews@CNNIMAIL12.CNN.COM
   spamassassin-talk-admin@lists.sourceforge.net
   yahoo-dev-null@yahoo-inc.com
   returns.groups.yahoo.com
 ));

amavisd-new also supports per-recipient blacklists and whitelists of senders. Per-recipient lists override the global lists. Use the $per_recip_blacklist_sender_lookup_tables and $per_recip_whitelist_sender_lookup_tables variables to specify these lists. Each variable is a reference to a hash keyed by the recipient's address (or domain). The hash value should be a reference to an array of sender addresses, a reference to a hash keyed on sender addresses (with hash values of 1), a call to the read_hash function to read the addresses from a file, or a call to new_RE with a list of regular expressions to match sender addresses against. For example, you could add the following code to amavisd.conf to maintain a list of whitelisted senders for jane@example.com in the file /etc/mail/jane-whitelist :

 $per_recip_whitelist_sender_lookup_tables = {
   'jane@example.com' => read_hash('/etc/mail/jane-whitelist')
 };
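The other value types described above follow the same pattern. The sketch below shows a per-recipient blacklist that uses an array reference for one recipient and a new_RE call for another; all addresses and patterns are hypothetical.

```perl
$per_recip_blacklist_sender_lookup_tables = {
    # array reference: an explicit list of sender addresses
    'jane@example.com' => [ qw( pest@example.org ads@example.org ) ],
    # new_RE: senders matched by regular expression
    'sam@example.com'  => new_RE( qr'^(offers|deals)\d*@'i ),
};
```

Using qr'...' with single-quote delimiters, as the distributed file does, avoids having to escape the at sign.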

6.4.2.10 SpamAssassin settings

Several variables in amavisd.conf affect the way that amavisd invokes SpamAssassin or the actions it takes based on a message's score from SpamAssassin:

$sa_local_tests_only

Set this variable to 1 if you want SpamAssassin to skip network-based tests. It defaults to 0 (perform network-based tests).

$sa_auto_whitelist

Set this variable to 1 to enable SpamAssassin's autowhitelist feature. It defaults to 0 (no autowhitelist). Specify the location of the autowhitelist database in SpamAssassin's sitewide configuration file, local.cf. Be sure that the amavis user has permission to read from and write to the database.

$sa_mail_body_size_limit

If you set this variable to a size (in bytes), messages larger than the given size will not be checked for spam. This conserves system resources, as SpamAssassin can take a long time to check large messages, and large messages are rarely spam. The variable is undefined by default, which implies no limit. A reasonable value might be 65536 (64 KB) or 102400 (100 KB).

$sa_tag_level_deflt

This variable determines the spam score at or above which X-Spam-Status and X-Spam-Level headers will be added to the message to show the spam status and level of the message. The default is 3, which is suitable for seeing which tests are and are not being triggered for suspicious messages. If you like to see the spam status of all messages, set this value to -10 or so.

This variable can be defined on a per-recipient basis much like $per_recip_blacklist_sender_lookup_tables . Set $sa_tag_level_deflt to a reference to a hash keyed on recipient addresses, with the tag level as the hash value.

$sa_tag2_level_deflt

This variable determines the spam score at or above which amavisd adds an X-Spam-Flag: YES header and an X-Spam-Report header to the message. It may also modify the Subject header to tag the message as spam. The default is 6.3.

This variable can be defined on a per-recipient basis much like $per_recip_blacklist_sender_lookup_tables . Set $sa_tag2_level_deflt to a reference to a hash keyed on recipient addresses, with the tag2 level as the hash value.

$sa_kill_level_deflt

This variable determines the spam score at or above which amavisd will perform spam-handling on the message, such as quarantining the message, discarding it, notifying administrators, etc. By default, this variable is set to the value of $sa_tag2_level_deflt so spam-handling is performed on all spam detected. If you want to discard messages that are extremely likely to be spam and tag messages that are less likely to be spam, set this variable to a higher score (e.g., 12), and only messages above that level will be subject to special handling.

The variable can be defined on a per-recipient basis much like $per_recip_blacklist_sender_lookup_tables . Set $sa_kill_level_deflt to a reference to a hash keyed on recipient addresses, with the kill level as the hash value.
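
As a minimal sketch of how the three levels might be set together in amavisd.conf, the fragment below keeps the default tag and tag2 levels described above and raises the kill level to 12 so that only high-scoring spam is subject to spam-handling. The recipient addresses and per-recipient scores in the commented-out form are hypothetical, and how recipients that are absent from the hash are treated should be checked against README.lookups.

```perl
# amavisd.conf fragment (illustrative values only)
$sa_tag_level_deflt  = 3.0;  # add X-Spam-Status and X-Spam-Level at or above this score
$sa_tag2_level_deflt = 6.3;  # add X-Spam-Flag: YES and X-Spam-Report at or above this score
$sa_kill_level_deflt = 12;   # perform spam-handling only at or above this score

# Per-recipient alternative: a reference to a hash keyed on recipient
# address, with the level as the value. Addresses are hypothetical.
# $sa_kill_level_deflt = {
#   'alice@example.com' => 8,
#   'bob@example.com'   => 15,
# };
```

With these values, a message scoring 7 would be tagged as spam and delivered, while a message scoring 14 would be handed to the configured spam-handling options.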

$sa_spam_modifies_subj

If this variable is set to 1, amavisd may modify the Subject header of messages with spam scores at or above the $sa_tag2_level_deflt setting. You can also set this variable to a reference to a list of recipients who should have their Subject headers modified, a reference to a hash table keyed on recipients who should have their headers modified (with hash values of 1), or the return value of a new_RE() call on a list of regular expressions to match against recipients who should have their headers modified. This variable is not defined by default.

$sa_spam_subject_tag

Set this variable to the string to prepend to the Subject header of spam messages when $sa_spam_modifies_subj is true. If you do not define this, Subject headers will never be modified. It is not defined by default; a common definition would be ' *****SPAM***** '.
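
The two Subject-related variables are normally set as a pair. The following amavisd.conf fragment is a sketch of one plausible combination; the recipient addresses and the regular expression in the commented-out alternatives are hypothetical and only illustrate the forms described above.

```perl
# amavisd.conf fragment (illustrative values only)
$sa_spam_modifies_subj = 1;                  # allow Subject rewriting for all recipients
$sa_spam_subject_tag   = '*****SPAM***** ';  # prepended to the Subject of spam messages

# Alternative forms of $sa_spam_modifies_subj (hypothetical recipients):
# $sa_spam_modifies_subj = [ 'alice@example.com', 'bob@example.com' ];  # list reference
# $sa_spam_modifies_subj = { 'alice@example.com' => 1 };                # hash reference
# $sa_spam_modifies_subj = new_RE( qr'@sales\.example\.com$'i );        # regular expressions
```

The trailing space in the tag keeps it visually separate from the original Subject text.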

6.4.2.11 Storing recipient preferences in external databases

It's possible to store amavisd-new recipient preferences in an SQL or LDAP database. This can be useful if you want to permit users to modify their own preferences, particularly if you already use an SQL- or LDAP-based user directory. SQL and LDAP lookups override variables defined in amavisd.conf .

Database entries indicate user preferences, including whether a user has opted out of spam-checking, whether amavisd should modify the Subject of spam messages, and user spam tag levels (tag, tag2, kill). Database entries may also specify sender addresses that the recipient wants to blacklist or whitelist.

To enable SQL lookups, define the variable @lookup_sql_dsn in amavisd.conf . This variable should contain a list of references to three-element arrays that represent database connections. The first element of each array is a Perl DBI DSN that defines the database driver to use, the database name, and the name of the database server host. The second element is a database username that amavisd will provide for identification to the database server, and the third element is the associated password for authentication. The distributed amavisd.conf file provides the following commented-out example:

 # @lookup_sql_dsn =
 #   ( ['DBI:mysql:database=mail;host=127.0.0.1;port=3306', 'user1', 'passwd1'],
 #     ['DBI:mysql:database=mail;host=host2', 'username2', 'password2'] );

In this example, amavisd will first attempt to connect to the MySQL database server on port 3306 of the local host in order to access the mail database. It will log into the database server as user1 with password passwd1 . If this connection fails, amavisd will try the next database server, a MySQL server running on host2 , using user username2 and password password2 .
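
Before enabling SQL lookups, it can help to confirm that each DSN, username, and password actually works. The standalone script below is a rough sketch of such a check, not part of amavisd-new: it walks a list with the same structure as @lookup_sql_dsn and stops at the first server that accepts a connection. It assumes the DBI and DBD::mysql modules are installed and reuses the placeholder values from the sample above.

```perl
#!/usr/bin/perl
# Sketch: try each DSN in order and report the first one that connects.
use strict;
use warnings;
use DBI;

# Same structure as @lookup_sql_dsn; values are the placeholders from the
# commented-out sample and must be replaced with real ones.
my @lookup_sql_dsn = (
  ['DBI:mysql:database=mail;host=127.0.0.1;port=3306', 'user1', 'passwd1'],
  ['DBI:mysql:database=mail;host=host2', 'username2', 'password2'],
);

for my $conn (@lookup_sql_dsn) {
  my ($dsn, $user, $password) = @$conn;

  # Return undef on failure instead of dying, so the next DSN can be tried.
  my $dbh = DBI->connect($dsn, $user, $password,
                         { PrintError => 0, RaiseError => 0 });
  if ($dbh) {
    print "connected: $dsn\n";
    $dbh->disconnect;
    last;
  }
  print "failed: $dsn ($DBI::errstr)\n";
}
```

Run the script as the amavis user so that it is subject to the same network and permission restrictions as the daemon.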

The file README_FILES/README.lookups in the amavisd-new source code provides definitions for a set of SQL tables that are suitable for configuring user policies and whitelists and blacklists in amavisd . You can add these tables to your SQL database and follow the instructions in README.lookups to add appropriate database queries to amavisd.conf .

amavisd-new's SQL support should not be confused with SpamAssassin's SQL support. Each controls different aspects of mail-processing.

The amavisd-new source code includes an LDAP schema for an auxiliary class (amavisAccount) that can be added to user accounts. The class defines attributes that determine whether a user has opted out of spam-checking, whether amavisd should modify the Subject of spam messages, a user's desired spam tag levels (tag, tag2, kill), and sender addresses to blacklist or whitelist for a user.

To enable LDAP lookups, set the $enable_ldap variable in amavisd.conf to 1, and provide LDAP server information in the $default_ldap variable as a reference to a hash:

 $default_ldap = {
   hostname => 'ldap-server-hostname',
   tls => 1,
   base => 'base DN for ldap searches',
   query_filter => '(&(objectClass=amavisAccount)(mail=%m))'
 };

For each preference for which amavisd can perform an LDAP query, you must define additional query parameters to specify (at minimum) the result attribute to be returned from the LDAP database to amavisd . Parameters left undefined will prevent LDAP queries from being performed for that preference. The amavisd source code provides the examples in Example 6-9.

Example 6-9. Defining LDAP query parameters for user preferences

 $bypass_spam_checks_ldap  = {res_attr => 'amavisBypassSpamChecks'};
 $spam_tag_level_ldap      = {res_attr => 'amavisSpamTagLevel'};
 $spam_kill_level_ldap     = {res_attr => 'amavisSpamKillLevel'};
 $spam_whitelist_sender_ldap = {
   query_filter => '(&(objectClass=amavisAccount)(mail=%m)(amavisWhitelistSender=%s))',
   res_filter => 'OK'};
 $spam_blacklist_sender_ldap = {
   query_filter => '(&(objectClass=amavisAccount)(mail=%m)(amavisBlacklistSender=%s))',
   res_filter => 'OK'};

See the file README_FILES/README.lookups in the source code for more information.
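
To check that the directory returns the expected attributes for a given recipient, you can run the same kind of search by hand. The script below is a sketch under several assumptions: the Net::LDAP module is installed, the server permits anonymous binds and StartTLS, and the hostname, base DN, and recipient address (all hypothetical) are replaced with your own. It substitutes the recipient for %m in the query filter manually and is not taken from amavisd-new.

```perl
#!/usr/bin/perl
# Sketch: look up amavisAccount attributes for one recipient.
use strict;
use warnings;
use Net::LDAP;

my $host      = 'ldap.example.com';    # hypothetical LDAP server
my $base      = 'dc=example,dc=com';   # hypothetical base DN
my $recipient = 'alice@example.com';   # hypothetical recipient address

# The query_filter from $default_ldap with %m replaced by the recipient.
my $filter = "(&(objectClass=amavisAccount)(mail=$recipient))";
my @attrs  = qw(amavisBypassSpamChecks amavisSpamTagLevel amavisSpamKillLevel);

my $ldap = Net::LDAP->new($host) or die "cannot connect to $host: $@\n";

my $mesg = $ldap->start_tls();         # corresponds to tls => 1
die "start_tls failed: " . $mesg->error . "\n" if $mesg->code;

$mesg = $ldap->bind;                   # anonymous bind
die "bind failed: " . $mesg->error . "\n" if $mesg->code;

$mesg = $ldap->search(base => $base, filter => $filter, attrs => \@attrs);
die "search failed: " . $mesg->error . "\n" if $mesg->code;

for my $entry ($mesg->entries) {
  print $entry->dn, "\n";
  for my $attr (@attrs) {
    my $value = $entry->get_value($attr);
    print "  $attr: ", (defined $value ? $value : '(not set)'), "\n";
  }
}

$ldap->unbind;
```

If the search prints nothing, no entry matched the filter; the lookups that depend on the amavisAccount class will have nothing to return for that recipient.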

6.4.3 Basic Operations

Once you've configured the options in amavisd.conf , you're ready to test amavisd . Start amavisd either as the amavis user or as root (in which case it will change its UID and GID to that of amavis during startup).

During your first test, start amavisd with the debug argument. This causes amavisd to run in the foreground and produce debugging information that you can watch to be sure that it's working correctly. Example 6-10 shows a debug startup for a properly functioning configuration:

Example 6-10. Starting amavisd with the debug argument

 # amavisd debug
 Feb  7 16:58:16 tala amavisd[924]: starting.  amavisd at tala amavisd-new-20030616-p7
 Feb  7 16:58:16 tala amavisd[924]: Perl version               5.006001
 Feb  7 16:58:16 tala amavisd[924]: Module Amavis::Conf        1.15
 Feb  7 16:58:16 tala amavisd[924]: Module Archive::Tar        1.08
 Feb  7 16:58:16 tala amavisd[924]: Module Archive::Zip        1.09
 Feb  7 16:58:16 tala amavisd[924]: Module Compress::Zlib      1.32
 Feb  7 16:58:16 tala amavisd[924]: Module Convert::TNEF       0.17
 Feb  7 16:58:16 tala amavisd[924]: Module Convert::UUlib      1.0
 Feb  7 16:58:16 tala amavisd[924]: Module MIME::Entity        6.109
 Feb  7 16:58:16 tala amavisd[924]: Module MIME::Parser        6.108
 Feb  7 16:58:16 tala amavisd[924]: Module MIME::Tools         6.110
 Feb  7 16:58:16 tala amavisd[924]: Module Mail::Header        1.60
 Feb  7 16:58:16 tala amavisd[924]: Module Mail::Internet      1.60
 Feb  7 16:58:16 tala amavisd[924]: Module Mail::SpamAssassin  2.63
 Feb  7 16:58:16 tala amavisd[924]: Module Net::Cmd            2.24
 Feb  7 16:58:16 tala amavisd[924]: Module Net::DNS            0.45
 Feb  7 16:58:16 tala amavisd[924]: Module Net::SMTP           2.26
 Feb  7 16:58:16 tala amavisd[924]: Module Net::Server         0.86
 Feb  7 16:58:16 tala amavisd[924]: Module Time::HiRes         1.54
 Feb  7 16:58:16 tala amavisd[924]: Module Unix::Syslog        0.99
 Feb  7 16:58:16 tala amavisd[924]: Found myself: /usr/local/sbin/amavisd -c /etc/amavisd.conf
 Feb  7 16:58:16 tala amavisd[924]: Lookup::SQL code       NOT loaded
 Feb  7 16:58:16 tala amavisd[924]: Lookup::LDAP code      NOT loaded
 Feb  7 16:58:16 tala amavisd[924]: AMCL-in protocol code  NOT loaded
 Feb  7 16:58:16 tala amavisd[924]: SMTP-in protocol code  loaded
 Feb  7 16:58:16 tala amavisd[924]: ANTI-VIRUS code        loaded
 Feb  7 16:58:16 tala amavisd[924]: ANTI-SPAM  code        loaded
 Feb  7 16:58:16 tala amavisd[924]: Net::Server: 2004/02/07-16:58:16 Amavis (type Net::Server::PreForkSimple) starting! pid(924)
 Feb  7 16:58:16 tala amavisd[924]: Net::Server: Binding to TCP port 10024 on host 127.0.0.1
 Feb  7 16:58:16 tala amavisd[924]: Net::Server: Setting gid to "110 110"
 Feb  7 16:58:16 tala amavisd[924]: Net::Server: Setting uid to "2013"
 Feb  7 16:58:16 tala amavisd[924]: Net::Server: Setting up serialization via flock
 Feb  7 16:58:16 tala amavisd[924]: Found $file       at /usr/bin/file
 Feb  7 16:58:16 tala amavisd[924]: No $arc,          not using it
 Feb  7 16:58:16 tala amavisd[924]: Found $gzip       at /bin/gzip
 Feb  7 16:58:16 tala amavisd[924]: Found $bzip2      at /usr/bin/bzip2
 Feb  7 16:58:16 tala amavisd[924]: Found $lzop       at /bin/lzop
 Feb  7 16:58:16 tala amavisd[924]: Found $lha        at /usr/bin/lha
 Feb  7 16:58:16 tala amavisd[924]: Found $unarj      at /usr/bin/arj
 Feb  7 16:58:16 tala amavisd[924]: Found $uncompress at /bin/uncompress
 Feb  7 16:58:16 tala amavisd[924]: No $unfreeze,     not using it
 Feb  7 16:58:16 tala amavisd[924]: Found $unrar      at /usr/bin/unrar
 Feb  7 16:58:16 tala amavisd[924]: Found $zoo        at /usr/bin/zoo
 Feb  7 16:58:16 tala amavisd[924]: Found $cpio       at /bin/cpio
 Feb  7 16:58:16 tala amavisd[924]: Using internal av scanner code for (primary) Clam Antivirus-clamd
 Feb  7 16:58:16 tala amavisd[924]: No primary av scanner: KasperskyLab AVP - aveclient
 ...many other messages about detecting av scanners...
 Feb  7 16:58:16 tala amavisd[924]: SpamControl: initializing Mail::SpamAssassin
 Feb  7 16:58:16 tala amavisd[924]: SpamControl: turning on SA auto-whitelisting
 Feb  7 16:58:23 tala amavisd[924]: SpamControl: done
 Feb  7 16:58:23 tala amavisd[924]: Net::Server: Beginning prefork (2 processes)
 Feb  7 16:58:23 tala amavisd[924]: Net::Server: Starting "2" children
 Feb  7 16:58:23 tala amavisd[924]: Net::Server: Parent ready for children.
 Feb  7 16:58:23 tala amavisd[929]: Net::Server: Child Preforked (929)
 Feb  7 16:58:23 tala amavisd[930]: Net::Server: Child Preforked (930)

After the startup messages, you should begin to see amavisd processing incoming messages (which will produce a copious amount of debugging information). When you are satisfied that messages are being properly delivered back to Postfix, hit Ctrl-C to stop amavisd debug and run amavisd with no arguments to start the daemon in the background.

If you've chosen to locate your configuration file somewhere other than /etc , you should either make a symbolic link to /etc/amavisd.conf or use the -c / path /to/amavisd.conf command-line option to amavisd .

Once amavisd is running and you confirm that ordinary email is being delivered correctly, test the SpamAssassin functions by sending a copy of the GTUBE string to yourself from a remote system. Because SpamAssassin assigns GTUBE a spam score of 1000, which should be higher than your spam kill level, you should see the message handled by amavisd 's spam-handling options.
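
One way to send the GTUBE test is with a short script run on the remote system. The sketch below uses the Net::SMTP module to submit a message whose body contains the GTUBE string on a line by itself. The gateway hostname and both addresses are hypothetical and must be replaced; the script is an illustration and is not part of amavisd-new or SpamAssassin.

```perl
#!/usr/bin/perl
# Sketch: send a GTUBE test message to the spam-checking gateway.
use strict;
use warnings;
use Net::SMTP;

my $gateway   = 'mail.example.com';        # hypothetical gateway host
my $sender    = 'tester@remote.example';   # hypothetical envelope sender
my $recipient = 'you@example.com';         # hypothetical recipient

# The GTUBE string must appear unbroken on a single line of the body.
my $gtube = 'XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X';

my $smtp = Net::SMTP->new($gateway, Port => 25, Timeout => 30)
  or die "cannot connect to $gateway: $@\n";

$smtp->mail($sender)  or die "MAIL FROM refused: " . $smtp->message;
$smtp->to($recipient) or die "RCPT TO refused: " . $smtp->message;
$smtp->data()         or die "DATA refused: " . $smtp->message;

$smtp->datasend("From: $sender\n");
$smtp->datasend("To: $recipient\n");
$smtp->datasend("Subject: GTUBE test\n");
$smtp->datasend("\n");                     # blank line ends the headers
$smtp->datasend("$gtube\n");

$smtp->dataend()      or die "message not accepted: " . $smtp->message;
$smtp->quit;

print "GTUBE test message submitted to $gateway\n";
```

Whether the test message is then refused, discarded, quarantined, or tagged depends on the spam-handling options you configured, so check the amavisd log and the quarantine as well as the recipient's mailbox.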

If amavisd appears to work, but SpamAssassin does not, you can enable SpamAssassin debugging by editing amavisd.conf and setting the $sa_debug variable to 1. This variable appears at the end of amavisd.conf . You must stop amavisd and restart it with the debug argument for SpamAssassin debugging to be performed.

Anytime you make a change to amavisd.conf , you must inform amavisd by issuing the command amavisd reload (or stopping and restarting the daemon).

The amavisd-new distribution includes a script named amavisd_init.sh that you can use as a boot script for systems based on Red Hat Linux. With a little modification, it makes a suitable boot script for other Unix systems to automatically start and stop amavisd .

6.4.4 Adding Sitewide Bayesian Filtering

You can easily add sitewide Bayesian filtering to amavisd-new. Use the usual SpamAssassin use_bayes and bayes_path directives in local.cf, and ensure that the amavis user has permission to create the databases in the directory named in bayes_path. One way to do this is to use a directory for the databases that is owned by amavis, such as /var/amavis. Another option is to locate the databases in a directory owned by another user but to create them ahead of time and chown them to amavis. If local users need to have access to the databases (e.g., they will be running sa-learn), you might have to make the databases readable or writable by a group other than amavis and adjust the bayes_file_mode, or make them world readable or writable. Doing so, however, puts the integrity of your spam-checking at the mercy of the good intentions and comprehension of your users.

If users have shell accounts on the system, you can use per-user Bayesian filtering with amavisd-new instead. Configure SpamAssassin for per-user databases as usual, but ensure that each user's databases are group-owned by the amavis group and have group read/write permissions so that amavisd-new can use them. Doing so allows users to run sa-learn themselves to train their databases, while still permitting amavisd-new to access them. With SpamAssassin 3.0, you could also store per-user Bayesian data in an SQL database.

6.4.5 Adding Sitewide Autowhitelisting

amavisd knows how to use autowhitelisting (see the discussion of $sa_auto_whitelist earlier in this chapter). Just add the usual SpamAssassin auto_whitelist_path and auto_whitelist_file_mode directives to local.cf . As with the Bayesian databases, the amavis user must have permission to create the autowhitelist database and read and write to it.

6.4.6 Routing Email Through the Gateway

Once Postfix and amavisd-new are receiving messages for the local host and performing SpamAssassin checks on them, you can start accepting email for your domain and routing it to an internal mail server after spam-checking. Figure 6-4 illustrates this topology.

Figure 6-4. Spam-checking gateway topology

6.4.6.1 Postfix changes

To configure Postfix to relay incoming mail for example.com to internal.example.com , add the following line to /etc/postfix/main.cf :

 transport_maps=hash:/etc/postfix/transport

Then, create the /etc/postfix/transport file, and add either:

 example.com   smtp:internal.example.com

or, if mail.example.com cannot resolve the name internal.example.com , you could use

 example.com   smtp:[192.168.10.55]

Run the command postmap /etc/postfix/transport to build the transport map from /etc/postfix/transport , and run postfix reload to reload Postfix's configuration.

6.4.6.2 Routing changes

Mail from the Internet for example.com should be sent to the spam-checking gateway mail.example.com . Add a DNS MX record for the example.com domain that points to mail.example.com .

Once received by mail.example.com , messages will be spam-checked and should then be relayed to internal.example.com by Postfix. No DNS records for internal.example.com need be published to the Internet, but it's useful if mail.example.com can resolve internal.example.com .

6.4.6.3 Internal server configuration

Once the external mail gateway is in place, you can configure the internal mail server to accept SMTP connections only from the gateway (for incoming Internet mail). If you don't have a separate server for outgoing mail, the internal mail server should also accept SMTP connections from hosts on the internal network. These restrictions are usually enforced by limiting access to TCP port 25 using a host-based firewall or a packet-filtering router.
