SpamAssassin

See also Procmail + SpamAssassin.

Introduction
SpamAssassin is one of the best spam filtering tools available, and it is freely available under the Apache License.

There are three ways to take advantage of SpamAssassin within DreamHost's environment (listed in order from easiest to most advanced):
 * 1) Use DreamHost's Junk Mail system
    * Easy setup via the DreamHost panel
    * Maintained by DreamHost
    * Spam can be reviewed only via webmail, or in IMAP client with IMAP filtering option
    * Bayes rules (but no trainable Bayes db)
    * Delivery delays have been noted
 * 2) Run DreamHost's version of SpamAssassin on your assigned host
    * High level of control over spam scoring and routing
    * Straightforward setup - some knowledge of shell commands is necessary
    * DreamHost may (or may not) update the version of SpamAssassin on your server
    * You can lose mail if you configure this wrong
 * 3) Install a newer or customized version of SpamAssassin on your assigned host
    * Even higher level of control
    * Run the latest version of SpamAssassin
    * Setup is straightforward, but more involved
    * DreamHost won't update - If your version has a security issue, you must patch/update it yourself.
    * You can lose mail if you configure this wrong

The version installed on DreamHost's mail servers usually lags behind the current version (3.1.7 as of October 25, 2006, versus DreamHost's 3.0.3 or so). For the last two approaches, you therefore have two options: either use the version provided by DreamHost, or install the latest version into your account.

NOTE: Installing a custom version is only recommended on a Private Server with Courier enabled, so you can check the filtered mail on the command line. Forwarding mail back for IMAP checking is not supported on either shared hosting or a Private Server.

Some portions of the following instructions may be prepared on a local PC and transferred to the server with SFTP, but much must be done in the Shell, using Shell Commands with SSH.

DreamHost's Junk Mail system
Basic spam filtering can be set up via the control panel. Instructions can be found here: Junk_Mail

Using SpamAssassin
By default, all your mail is delivered by a program called Postfix. To use SpamAssassin, you must first tell Postfix to send mail to procmail (a mail processing program), and then configure procmail to use SpamAssassin. Sounds confusing? It's not.

~/.forward.postfix
Create a file named .forward.postfix in your home directory which contains the following line (the quotes are important!):

    "|/usr/bin/procmail -t"
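One way to create this file from the shell without losing the quotes is sketched below. The FORWARD_FILE variable is only an illustration and is not part of DreamHost's setup; it defaults to the path named above. Note that this overwrites any existing file at that path.

```shell
# Hypothetical helper variable; defaults to the file described above.
FORWARD_FILE="${FORWARD_FILE:-$HOME/.forward.postfix}"

# Single quotes keep the inner double quotes literal in the file.
printf '%s\n' '"|/usr/bin/procmail -t"' > "$FORWARD_FILE"

# Show the result so the quoting can be checked by eye.
cat "$FORWARD_FILE"
```

The printed line should start and end with a double quote character.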

~/.procmailrc
Create a file named .procmailrc in your home directory and type the following in it:

    # PROCMAIL ENVIRONMENT
    # Directory for storing procmail files
    PMDIR=$HOME/.procmail
    # Procmail log file
    LOGFILE=$PMDIR/log
    # Shell to use for recipes
    SHELL=/bin/sh

    # MAIL PROCESSING RECIPES
    # Pipe any messages under 512 K through spamassassin for scoring
    :0fw:spamassassin-lock
    * < 524288
    | spamassassin -P
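The log file above lives in a directory that may not exist yet on a fresh account. A minimal sketch for preparing it follows, on the assumption that procmail will not create the directory by itself; the restrictive permissions are a cautious choice of this example, not something the setup requires.

```shell
# Same location as PMDIR in the procmailrc above.
PMDIR="$HOME/.procmail"

# Create the directory and an empty log file if they are missing.
mkdir -p "$PMDIR"
touch "$PMDIR/log"

# Assumption: keep the log private, since it records mail subjects and senders.
chmod 600 "$PMDIR/log"

ls -l "$PMDIR/log"
```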

Note: The option -P is not needed on SpamAssassin version 3 or greater. It is needed for version 2. Some of the shell systems are running SpamAssassin 3.0.3, while others are not (See Determine Your SpamAssassin Version).

Automatic Sorting
If you want Procmail to automatically put messages that SpamAssassin identifies as spam in a specific folder (e.g. a folder named "Spam"), add the following lines to the end of your .procmailrc file:

    # Dump spam messages in the spam folder
    :0
    * ^X-Spam-Status: Yes
    $HOME/Maildir/.Spam/

This identifies the mail header X-Spam-Status that SpamAssassin adds to spam messages, and then causes that mail to be placed in the .Spam/ folder. Examining some mail I receive through a university account, I also find that the X-PMX-Spam header can hold useful information:

    # Dump more spam messages in the spam folder
    :0
    * ^X-PMX-Spam: Gauge=XXXXXX
    $HOME/Maildir/.Spam/

To test that your rules are working, try setting a test subject, then sending yourself mail with that subject from another account. It should be delivered to your Spam folder:

    # Filtering test
    :0
    * ^Subject: SPAMTEST
    $HOME/Maildir/.Spam/
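Before relying on a recipe, it can help to check by hand which messages its condition would match. The sketch below approximates the ^X-Spam-Status: Yes condition with grep against the header block of a message. The function name is_flagged_spam and both sample messages are made up for illustration, and procmail's own matching has subtleties that this does not reproduce.

```shell
# Succeeds if the message on stdin carries an "X-Spam-Status: Yes" header.
# Only the header block (everything up to the first blank line) is searched,
# and -i mirrors procmail's default case-insensitive matching.
is_flagged_spam() {
    sed '/^$/q' | grep -qi '^X-Spam-Status: Yes'
}

# Two made-up sample messages for a quick check.
spam_msg='Subject: cheap pills
X-Spam-Status: Yes, score=12.3 required=5.0

body text'
ham_msg='Subject: lunch?
X-Spam-Status: No, score=0.1 required=5.0

X-Spam-Status: Yes (quoted inside the body, must not match)'

printf '%s\n' "$spam_msg" | is_flagged_spam && echo "spam_msg -> Spam folder"
printf '%s\n' "$ham_msg"  | is_flagged_spam || echo "ham_msg -> inbox"
```

To try it on real mail, feed a single message file from your Maildir to the function on standard input.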

Shortcuts
You may find it tedious to update the Spam folder location in the above recipes. To avoid this, add

    SPAMDIR=$HOME/Maildir/.Spam/

to the top half of the file, then filter your mail to $SPAMDIR:

    :0
    * ^X-Spam-Status: Yes
    $SPAMDIR

Using these recipes, I don't have to use rewrite_header in my ~/.spamassassin/user_prefs file, or use any filters in my mail client.

~/.spamassassin/user_prefs
SpamAssassin's default settings are pretty good, but there are a few changes that you can make which will increase your spam capture rate considerably. Global settings are stored in the system-wide local.cf file. As of this writing on my server, local.cf is basically empty (comments only). Your personal preferences are stored in ~/.spamassassin/user_prefs. If you have not had any mail delivered, you may have to create this folder and file.

System documentation on SpamAssassin will help with editing the rules. This documentation is available with the commands:

    $ man spamassassin
    $ man Mail::SpamAssassin::Conf

trusted_networks
By default, SpamAssassin is a bit too trusting and will score down e-mails with the "ALL_TRUSTED" test; the result is that more spam gets through. You can use the trusted_networks configuration setting to tell SpamAssassin which networks are to be trusted. In a terminal, type

    $ dig spf.dreamhosters.com txt

to get the latest list of trusted DreamHost mail servers, then add the following lines (or similar) to your ~/.spamassassin/user_prefs file:

    trusted_networks 66.33.192.0/19 66.201.54.64/26 205.196.208.0/20
    trusted_networks 64.111.96.0/19 208.97.128.0/18 208.113.128.0/19

perhaps via e.g.,

    $ dig spf.dreamhosters.com txt|perl -lnwe 'push @m,(/ip4:(\S+)/g);END{print "trusted_networks @m"}'
    trusted_networks 66.33.192.0/19 66.201.54.64/26 205.196.208.0/20 64.111.96.0/19 208.97.128.0/18 208.113.128.0/17 67.205.0.0/19
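The same extraction can be sketched with standard shell tools instead of Perl. The function name spf_to_trusted_networks and the sample record below are made up for illustration (the sample reuses three of the ranges listed above); the real TXT record will differ and changes over time, so review the output before pasting it into user_prefs.

```shell
# Turn SPF TXT output (as printed by dig) into one trusted_networks line.
spf_to_trusted_networks() {
    grep -oE 'ip4:[0-9./]+' | cut -d: -f2 | paste -sd' ' - | sed 's/^/trusted_networks /'
}

# Made-up sample resembling a dig answer line; real records will differ.
sample='spf.dreamhosters.com. 14400 IN TXT "v=spf1 ip4:66.33.192.0/19 ip4:66.201.54.64/26 ip4:205.196.208.0/20 ~all"'

printf '%s\n' "$sample" | spf_to_trusted_networks

# Live use (needs dig and network access):
#   dig spf.dreamhosters.com txt | spf_to_trusted_networks
```

Printing the line rather than appending it directly leaves room to check it by eye first.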

rewrite_header
If you use filters in your mail client (e.g. Thunderbird or Evolution) to sort spam, it may be helpful to prepend "[SPAM]" to the subject of spam messages. Add the following line (or similar) to your ~/.spamassassin/user_prefs file:

    rewrite_header subject [SPAM]

... or use any other text you'd like.

Bayes Training
Bayesian filtering allows SpamAssassin to learn how to recognize spam messages. To do so, it has to be 'trained' with some messages that are known to be spam and ham (not spam).

If you want to use the Bayes tests:
 * 1) you must be running SpamAssassin 3.0 or newer (See Determine Your SpamAssassin Version).
 * 2) you can only train using full SHELL USER accounts (the m######## accounts exist only on DreamHost's mail server, which you do not have access to from the shell)
 * 3) you must first train the database with at least 200 spam email messages AND 200 ham email messages (non-spam messages).

Basic Training
Sort some existing e-mail into spam and non-spam folders. Let's assume you have a sub-folder of your INBOX called "Spam", another sub-folder called "Ham", and some other folders (with no spam in them).

Find out where the sa-learn utility is. If you are using DreamHost's version of SpamAssassin, this will probably be something like /usr/bin/sa-learn. If you installed your own version, it may be elsewhere. In a terminal, type:

    $ which sa-learn

Train using a folder full of spam. In a terminal, type (replace /usr/bin/sa-learn with the path you determined above; capitals are important!):

    $ /usr/bin/sa-learn --no-sync --spam ~/Maildir/.Spam/cur

Train using a single ham folder:

    $ /usr/bin/sa-learn --no-sync --ham ~/Maildir/.Ham/cur

Train using many ham folders. Typically, all your folders, with the exception of .Spam, are containers for non-spam email. To train using these as ham:

    $ /usr/bin/sa-learn --no-sync --ham `find ~/Maildir -name cur|grep -v .Spam`

Synchronize (save) the learned rules:

    $ /usr/bin/sa-learn --sync

You can also view a summary of what SpamAssassin has learned (such as how many spam and ham messages it has been trained on) by typing:

    $ /usr/bin/sa-learn --dump magic
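Since the Bayes tests need at least 200 spam and 200 ham messages, it can be worth counting what is in each folder before training. This is a minimal sketch; the function names count_messages and check_corpus are made up for illustration, and the folder paths assume the "Spam" and "Ham" layout described above.

```shell
# Count the messages (regular files) in a Maildir folder's cur directory.
# A missing directory counts as zero.
count_messages() {
    find "$1" -type f 2>/dev/null | wc -l | tr -d ' '
}

# Report whether a folder meets the 200-message minimum.
check_corpus() {
    label=$1
    dir=$2
    n=$(count_messages "$dir")
    if [ "$n" -ge 200 ]; then
        echo "$label: $n messages, enough to train"
    else
        echo "$label: $n messages, need at least 200"
    fi
}

check_corpus spam "$HOME/Maildir/.Spam/cur"
check_corpus ham  "$HOME/Maildir/.Ham/cur"
```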

Automated Training Using Cron
Create a file (I use ~/.spamassassin/learn-spam.sh) using some of the above rules:

    #!/bin/bash
    # Automated Bayesian Training

    # Train ham (ignore outgoing mail & deleted mail that may be spam or ham)
    # Do each directory in turn so that procwatch doesn't kill sa-learn
    find ~/Maildir -name cur | egrep -v '(\.Spam)|(\.Trash)|(\.Sent)' | while read i; do
        /usr/bin/sa-learn --no-sync --ham "$i"
    done

    # Train spam
    /usr/bin/sa-learn --no-sync --spam ~/Maildir/.Spam/cur

    # Save
    /usr/bin/sa-learn --sync

    # Delete learned spam
    mv ~/Maildir/.Spam/cur/* ~/Maildir/.Trash/cur

Make the script executable:

    $ chmod 700 ~/.spamassassin/learn-spam.sh

Add a job to your Crontab that runs ~/.spamassassin/learn-spam.sh daily or less often. Running it more frequently may be problematic with large mailboxes.
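A daily job could look like the sketch below. The 03:15 run time is an arbitrary choice for this example, and CRON_LINE is just a hypothetical variable holding the entry; the install step is left commented out so that nothing changes until you have reviewed the line.

```shell
# Crontab entry: run the training script every day at 03:15.
# Single quotes keep $HOME unexpanded here; cron sets HOME, so it expands when the job runs.
CRON_LINE='15 3 * * * $HOME/.spamassassin/learn-spam.sh'

# Print the entry for review.
printf '%s\n' "$CRON_LINE"

# To install it while keeping any existing jobs, uncomment:
# ( crontab -l 2>/dev/null; printf '%s\n' "$CRON_LINE" ) | crontab -
```

Without output redirection, cron will normally mail any output of the script to the account owner, which is one way to notice training errors.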

Using the DNS Block List tests and the Bayes tests, I've found SpamAssassin to be incredibly accurate at detecting both spam and ham.

Country Training
SpamAssassin comes with a plugin to add information to the headers of messages about which country the messages were relayed through. If this plugin is activated and enough spam messages are relayed through certain countries, the Bayes feature will begin to detect this as spam.

To enable the plugin:


 * 1) Use CPAN to install IP::Country::Fast (see the CPAN steps under "Installing v3.1.0 into your account").
 * 2) Uncomment the 'loadplugin' line in the ~/saetc/mail/spamassassin/init.pre file for Mail::SpamAssassin::Plugin::RelayCountry.

That's it. Now you can examine the headers of your new mail messages to see if the relay countries are added to the headers. See http://wiki.apache.org/spamassassin/RelayCountryPlugin for more information.

Installing v3.1.0 into your account
There is a good guide available, but it doesn't include instructions for installing the Perl CPAN modules necessary for the DNS-based tests to succeed (this is necessary because the DreamHost versions of the needed CPAN modules are too old). The instructions here for installing CPAN modules don't seem sufficient. These instructions were quite helpful.

After following the guide, you will also need to follow these steps for the DNS-based tests to succeed:


 * 1) SSH in to your account. (Use the same account you installed SpamAssassin to, the one where your maildir is.)
 * 2) Run cpan, then type exit. That should create ~/.cpan/CPAN/MyConfig.pm.
 * 3) Open ~/.cpan/CPAN/MyConfig.pm in your text editor (I like nano). Find the makepl_arg variable, and add PREFIX=/home/your_username to it, with a space separating any other arguments. Save and quit. The line should look like: 'makepl_arg' => q[PREFIX=/home/username INSTALLDIRS=site],
 * 4) Run export PERL5LIB=$HOME/lib/perl/5.8.4/:$HOME/share/perl/5.8.4/ so that cpan can find modules that it installs (when checking dependencies)
 * 5) Run cpan, and then install the following modules by typing install modulename (these modules may want to install additional modules to support them; in my case I answered yes to everything):
    * Net::IP
    * Net::DNS
    * Mail::DomainKeys
    * Crypt::OpenSSL::Random
    * Crypt::OpenSSL::RSA
 * 6) Exit CPAN when you've finished installing all of the modules.
 * 7) Run ln -s ~/lib/perl/5.8.4/Net/* ~/share/perl/5.8.4/Net
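Once the steps above are done, a quick way to see whether Perl can actually load the modules is sketched below. The function name check_modules is made up for illustration, and the PERL5LIB value simply repeats the one from the export step; adjust the Perl version in the path if yours differs.

```shell
# Same library path as in the export step, so Perl can see locally installed modules.
export PERL5LIB="$HOME/lib/perl/5.8.4/:$HOME/share/perl/5.8.4/"

# Print one line per module saying whether Perl can load it.
check_modules() {
    for m in Net::IP Net::DNS Mail::DomainKeys Crypt::OpenSSL::Random Crypt::OpenSSL::RSA; do
        if perl -M"$m" -e 1 >/dev/null 2>&1; then
            echo "ok      $m"
        else
            echo "MISSING $m"
        fi
    done
}

check_modules
```

Any module reported as MISSING is a candidate for another install run inside cpan.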

Now send yourself some test e-mails, preferably spams you still have. Once received, look at the headers to check their scores and see which tests were run. On most spams, you should see DNS-based tests like URIBL, SpamCop, etc. being run, and giving your spams quite high scores.

If things don't go quite right, or even if they do, you may want to look at ~/.procmail/log and see if SpamAssassin is reporting any problems. I believe these instructions are correct, but I could have left out a few things, so you may need to install some additional modules that I forgot.

You can incorporate ClamAV into your SpamAssassin installation with Clamassassin (A guide for installing ClamAV and Clamassassin on DreamHost).

Adding SA for all the domain's mail accounts
In the forum, user stoneyb writes: "I have SA set up for a whole domain (twice). Install it into the domain home, the one with the web sites, using the personal install instructions, and have the users in the domain use the full path to the domain's SA in their procmailrc files. This works quite well for me."

To be more specific:

Create the ~/procmail/spam.rc file, as instructed in the guide referenced above, but when you reach this line:


 * $HOME/sausr/bin/spamassassin

Remove the variable $HOME, substituting the full path to the home directory of the account you installed it to, i.e.:


 * /home/myaccountwithcustomsa/sausr/bin/spamassassin

You may want to actually try running "/home/myaccountwithcustomsa/sausr/bin/spamassassin -V" from the account whose spam.rc file you're modifying, just to ensure you have the appropriate permissions, but it should work as long as both accounts are under the same DreamHost account.

Determine Your SpamAssassin Version
In a terminal, type:

    $ spamassassin -V
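The version number decides whether the -P option is still needed in the procmail recipe (version 2 needs it, version 3 and later do not). The sketch below parses a version string and applies that rule. The function names sa_version and needs_P_flag are made up for illustration, and the sample text is an assumed example of what the command prints, not captured output.

```shell
# Extract the version number from `spamassassin -V` style output on stdin.
sa_version() {
    sed -n 's/^SpamAssassin version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Succeeds if the given version is older than 3, i.e. still needs -P.
needs_P_flag() {
    major=${1%%.*}
    [ "$major" -lt 3 ]
}

# Assumed sample output; run `spamassassin -V` for the real thing.
sample='SpamAssassin version 3.0.3
  running on Perl version 5.8.4'

v=$(printf '%s\n' "$sample" | sa_version)
if needs_P_flag "$v"; then
    echo "version $v: keep -P in the procmail recipe"
else
    echo "version $v: -P is not needed"
fi
```

On a real system, replace the sample with the live command, for example v=$(spamassassin -V | sa_version).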