Disclaimer: I am the current project leader and main developer for Spastic.
Without getting into all the intricacies of email RFCs, I should mention that spam can be fought in many places throughout the system. Most mail servers, or Mail Transfer Agents (MTAs), have some antispam capabilities, but most users don't have the ability or desire to run their own mail servers. Mail Delivery Agents (MDAs) are programs that take mail from an MTA and deliver it to local mailboxes. procmail is a very popular MDA and is the means by which both SpamAssassin and Spastic are usually invoked. Finally, many mail clients, or Mail User Agents (MUAs), have some antispam capabilities. One promising new trend is Bayesian filtering, which is built into the latest version of the Mozilla mail client (among others). However, this article focuses on two tools that filter at the MDA level using procmail.
SpamAssassin is a collection of Perl modules which test elements of an email message and assign a numeric ranking to it. The higher the ranking, the more likely that the message is spam. The default settings define a spam message as anything with a score of 5.0 or higher. SpamAssassin also checks Realtime Blackhole Lists and has many other advanced features. It is usually called through procmail, although newer versions come with a powerful spamd/spamc client-server interface as well.
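To make the scoring idea concrete, here is a minimal sketch of additive scoring against a threshold. The rule names and weights are invented for illustration and are not SpamAssassin's actual rules or scores; only the 5.0 cutoff comes from the default described above.

```python
# Hypothetical rule names and weights, chosen only for illustration;
# they are NOT SpamAssassin's real rules or scores.
RULE_SCORES = {
    "subject_all_caps": 1.5,
    "mentions_free_money": 2.5,
    "listed_on_blackhole_list": 3.0,
}

SPAM_THRESHOLD = 5.0  # the default cutoff mentioned in the article


def score_message(fired_rules):
    """Sum the weights of every rule that matched the message."""
    return sum(RULE_SCORES[name] for name in fired_rules)


def is_spam_by_score(fired_rules, threshold=SPAM_THRESHOLD):
    """A message counts as spam when its total score reaches the threshold."""
    return score_message(fired_rules) >= threshold
```

In this sketch no single rule is enough to condemn a message; it takes a combination of matches to reach the threshold, which is why the choice of weights matters so much.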
About two years ago, the level of spam I was receiving crossed my pain threshold, and I was motivated to take control of the problem. I tried several open source antispam solutions, including SpamAssassin. At the time, SpamAssassin's numeric ranking method of determining spam seemed counterintuitive. How do you know how to weigh each setting effectively? In time, I stumbled across SPAST, a relatively simple procmail script that matched word lists against elements of an incoming message. It was easy to set up, understand, and customize. The problem was that SPAST was no longer supported by its author, Chrissie LeMaire. I tracked Chrissie down and asked her permission to take over the SPAST project and develop it. Thus, Spastic was born.
Spastic uses procmail and common system utilities like formail, dig, and egrep to scan elements of an email message for patterns, check for valid domains and address formats, etc. One big difference between Spastic and SpamAssassin is that Spastic rules are binary. When a Spastic rule fires, the message is flagged as spam. If a message passes all the tests, it is not flagged. There is no ranking system. The Spastic distribution also includes bash scripts for reporting statistics and rotating spam archives.
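The binary approach can be sketched in a few lines. Spastic itself is a set of procmail recipes and shell utilities, so the Python below is only an illustration of the logic, and the word list, address check, and function names are all hypothetical rather than taken from Spastic.

```python
import re

# Hypothetical word list and address check, for illustration only;
# these are NOT Spastic's actual patterns.
SUBJECT_PATTERN = re.compile(r"free money|act now", re.IGNORECASE)
ADDRESS_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def subject_matches_word_list(headers):
    """Rule 1: the subject contains a phrase from the word list."""
    return bool(SUBJECT_PATTERN.search(headers.get("Subject", "")))


def sender_address_malformed(headers):
    """Rule 2: the From header is not a plausible bare address."""
    return not ADDRESS_PATTERN.match(headers.get("From", ""))


RULES = [subject_matches_word_list, sender_address_malformed]


def is_spam_binary(headers):
    """Flag the message as soon as any single rule fires; no score is kept."""
    return any(rule(headers) for rule in RULES)
```

Here one matching rule is enough to flag a message, so each rule has to be trustworthy on its own, but there are no weights to tune.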
I tested each program by setting it up to filter all incoming email for a seven-day period and logging its success rate. I made no configuration changes or tweaks to either program during the test. The main configuration I did for SpamAssassin was setting up my whitelist and a couple of cosmetic settings. Since I am on several mailing lists, I receive about 300 messages a day. In this mix is usually a small number of spam messages which come from a variety of sources. I usually receive about 10-20 spam messages a week, which I consider low by most standards today.
I tested SpamAssassin from April 14-20, 2003 and Spastic from April 21-27, 2003.
While my test results are accurate for the email I typically receive, I can't generalize my results to other email users. Please keep in mind that your results may vary.
| SpamAssassin (April 14-20, 2003) | Spastic (April 21-27, 2003) |
|---|---|
| Correctly stopped 16 spam messages. | Correctly stopped 10 spam messages. |
| 1 false positive. | 0 false positives. |
| 1 missed spam message. | 2 missed spam messages. |
| Total messages processed outside of whitelists: 51 | Total messages processed outside of whitelists: 49 |
| 2 out of 69 incorrect = 67/69 = 97.10% correct | 2 out of 65 incorrect = 63/65 = 96.92% correct |
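The percentages in the last row can be checked with a few lines of arithmetic. This sketch takes the denominators (69 and 65) and the error counts straight from the table; the function name is just a label for illustration.

```python
def percent_correct(total, false_positives, missed):
    """Share of messages handled correctly, as a percentage."""
    incorrect = false_positives + missed
    return 100.0 * (total - incorrect) / total


# Totals and error counts as reported in the results table.
spamassassin = percent_correct(total=69, false_positives=1, missed=1)
spastic = percent_correct(total=65, false_positives=0, missed=2)

print(f"SpamAssassin: {spamassassin:.2f}% correct")  # 67/69
print(f"Spastic:      {spastic:.2f}% correct")       # 63/65
```

With totals this small, a single additional error moves either figure by about a percentage and a half, which is worth remembering when comparing the two.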
Unfortunately, I realized too late that I should have saved the messages each program got wrong and cross-tested them against the other program to see whether it would have done better. I made a note of it for the next time I run a comparison test.
The results were very close, with SpamAssassin handling a slightly higher percentage of messages correctly. If false positives are your main concern, Spastic came out slightly ahead with none; many people would rather see a spam message slip through their filter than risk losing an important message. Keep in mind that these sample sets are very small, so drawing firm conclusions is difficult.
After using the programs back-to-back, I have some observations about the strengths and weaknesses of each.
SpamAssassin is the king of spam filtering for a reason. It is very sophisticated, well designed, and effective. For a sitewide filtering solution, I would strongly recommend SpamAssassin over Spastic. If you can't use SpamAssassin on a particular box (like a hosted box), or if you want a simpler solution for a small number of users, Spastic will also serve you well.
If you want to explore further, here are two other interesting antispam tools:
This is just the tip of the growing iceberg of antispam tools in circulation today. I've been very happy with SpamAssassin for the last year or so. What are you using? What's your experience been with it? What's still slipping through? Where do you think the spam war is headed?