SpamAssassin is distributed with over 700 test rules defined for English-language spam. SpamAssassin 2.63 includes another 2,900 rules for spam in other languages. (Language support in SpamAssassin 3.0 is currently available only for French and German, but language support is likely to increase as SpamAssassin gets into wider release.) Reading the rules distributed with SpamAssassin is an excellent way to learn to write your own rules. SpamAssassin's rules are defined in a set of files typically installed in /usr/share/spamassassin.
Because these files are overwritten whenever SpamAssassin is upgraded, they should not be changed. Make all local rules and all changes to the scoring of distributed rules in the systemwide configuration file (or in per-user preference files) rather than in these files. Reading these files, however, provides the most information about how SpamAssassin rules are designed. The following sections describe some of the more important rule files in greater detail.

3.4.1 10_misc.cf

The 10_misc.cf file defines special rules that are not spam tests. These include templates for the spam report that SpamAssassin attaches to spam messages, definitions of headers that SpamAssassin adds to messages, and default settings for the most common configuration options (such as those described in Chapter 2).

Templates are defined with the report, unsafe_report, and spamtrap directives, and the corresponding utility directives clear_report_template, clear_unsafe_report_template, and clear_spamtrap_template. Use the report template to design the report that SpamAssassin attaches to spam messages. Use the unsafe_report template to design the report that SpamAssassin attaches to messages that contain potentially executable code. Use the spamtrap template to design the message that SpamAssassin sends back to senders who email a spam trap address that calls the spamassassin script with the --report and --warning-from options (spam reporting is discussed in Chapter 2).

Each time it encounters a template directive, SpamAssassin appends new text to the template. Accordingly, to ensure that you're starting with a clean slate when you define a new template, you must first clear the template and then add your desired text. Here's how the spam report might be defined in SpamAssassin:

    clear_report_template
    report Spam detection software, running on the system "_HOSTNAME_", has
    report identified this email as possible spam. The original message
    report is attached to this so you can view it (if it isn't spam) or block
    report similar future email. If you have any questions, see
    report _CONTACTADDRESS_ for details.
    report
    report Content preview: _PREVIEW_
    report
    report Content analysis details: (_HITS_ points, _REQD_ required)
    report
    report " pts rule name              description"
    report ---- ---------------------- ------------------------------------
    report _SUMMARY_

_HOSTNAME_, _CONTACTADDRESS_, _PREVIEW_, _HITS_, _REQD_, and _SUMMARY_ are variables that are replaced by their values when the template is generated for each message. The complete list of variables, which appears in the Mail::SpamAssassin::Conf manpage, is given in Table 3-3.

Table 3-3. Variables for use in report and header templates
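As a sketch of the clear-then-define pattern described above, a site that prefers a shorter report might put something like the following in its systemwide configuration file. The wording of the report is purely illustrative and is not taken from the distributed files; only variables that already appear in the default template are used.

```
# Empty the report template first, so that the distributed
# report text is not kept in front of the new text.
clear_report_template

# Each report line appends one line of text to the template.
report This message was flagged as possible spam by "_HOSTNAME_".
report Score: _HITS_ points (_REQD_ required)
report
report _SUMMARY_
```

Without the clear_report_template line, these four lines would be appended to the existing template rather than replacing it.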
The variables in Table 3-3 can also be added to customized message headers for messages processed by SpamAssassin by using the add_header directive, which takes the following form:

    add_header messagetype headername string

The messagetype can be spam, ham (non-spam), or all and determines which kind of messages will have the header added. The new header will be named X-Spam-headername, and string, which should be enclosed in double quotes, will be the value of the header. For example, the following directive, which appears in the distributed 10_misc.cf file, adds an X-Spam-Status header to all messages, spam or not, that shows whether or not each message is spam, the spam score, the spam threshold score, the tests that were matched, whether the message is being automatically learned (see Chapter 5), and the version of SpamAssassin:

    add_header all Status "_YESNO_, hits=_HITS_ required=_REQD_ tests=_TESTS_ autolearn=_AUTOLEARN_ version=_VERSION_"

If you want to change or remove a default header, you can use the remove_header directive:

    remove_header messagetype headername

You can remove all headers with the clear_headers directive.

3.4.2 20_fake_helo_tests.cf

This file defines a set of rules that use the eval test check_for_rdns_helo_mismatch(). This test takes two arguments: a regular expression pattern to match against the reverse DNS lookup of the connecting client's IP address, and a regular expression pattern to match against the hostname provided by the client in the SMTP HELO command. Spammers often use mail programs that forge the HELO hostname, and these tests look for such forgeries when the clients have hostnames that match those of major commercial ISPs.
Here's an example of a test from this file:

    header FAKE_HELO_AOL eval:check_for_rdns_helo_mismatch("aol\.com","aol\.com")
    describe FAKE_HELO_AOL Host HELO did not match rDNS: aol.com

This test matches if the client connects from an IP address that reverse-resolves to an aol.com hostname but claims in the HELO to have a hostname that does not match "aol.com". These tests are applied to all of the Received headers from untrusted relays.

You can use this eval test to catch messages that claim, in their HELO, to be from your own host. If your hostname is myhost.example.com, and you know that your IP address reverse-resolves to the same hostname, you could add a rule like this (to the systemwide configuration file):

    header FAKE_MY_HELO eval:check_for_rdns_helo_mismatch("(?!myhost\.example\.com).{18}$","myhost\.example\.com")
    describe FAKE_MY_HELO Host HELO faked my hostname
    score FAKE_MY_HELO 5.0

The regular expression (?!myhost\.example\.com).{18}$ matches any hostname of at least 18 characters (the length of myhost.example.com) whose final 18 characters are not myhost.example.com. It should therefore match the reverse DNS lookup of any untrusted relay host other than your own, provided that the relay's hostname is at least that long; shorter hostnames will not match. If any such host claims in its HELO to be myhost.example.com, it is forging your hostname.

3.4.3 20_body_tests.cf

This file contains most of the tests that SpamAssassin performs against message bodies. In addition to tests for regular expressions in the body, this file defines tests against spam clearinghouses and tests of message language and locale.

A spam clearinghouse is a server that maintains a database of checksums of messages reported as spam and allows clients to test a message against the checksum database. SpamAssassin supports three spam clearinghouses: Vipul's Razor (http://razor.sf.net/), Pyzor (http://pyzor.sf.net), and the Distributed Checksum Clearinghouse, or DCC (http://rhyolite.com/anti-spam/dcc/). Special client software must be installed on the system in order for SpamAssassin to use these tests.
The spamassassin --report command can be used to report confirmed spam to these clearinghouses as well. In SpamAssassin 3.0, the pyzor_options configuration directive can be set to a string of additional options to be passed to the Pyzor client on the command line when SpamAssassin invokes it. Similarly, the dcc_options directive can be set to provide additional options to the DCC client.
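To illustrate the two directives just described, a systemwide configuration file might contain lines like the following. The option values are assumptions chosen for the example: --homedir is an option of the Pyzor client and -R is an option of the DCC client (dccproc), but the directory path is hypothetical. Check the documentation for your installed Pyzor and DCC clients to see which options they accept.

```
# Extra command-line options for the Pyzor client.
# The directory shown here is a hypothetical location.
pyzor_options --homedir /etc/mail/spamassassin

# Extra command-line options for the DCC client.
dcc_options -R
```

Each directive takes a single string of options, which SpamAssassin passes on the command line when it invokes the corresponding client.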