Why is SPAM filtering in Thunderbird so terrible? - Adventures in Switching to Linux

Friday, December 14, 2007

A year or so ago, I got fed up with using Outlook. I like it feature-wise, but it runs sooo slowly and uses way too much memory (and at work I leave it up 24/7). It also locks outlook.pst, so backing up my email fails if Outlook is still running. I love Firefox (and have been using it since version 0.4, when it was still called Phoenix), so I decided to try out Thunderbird.

Thunderbird, like Firefox, is one of those great cross-platform products that let people like me switch gradually by using an application on Windows that will be the same on Linux. Thunderbird is a good email client that mostly does what I need it to.

One thing I find Thunderbird fails at is SPAM filtering. It implements what they call "Adaptive Filtering", which I am pretty sure is the same as Bayesian filtering. So every time I tell it something is SPAM (or not), it should do a better job the next time it encounters a similar email. Well, it doesn't! Fortunately, I don't get many false positives, which would be worse, but I do still get a handful of junk in my inbox.
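To make "Bayesian filtering" a bit more concrete, here is a minimal sketch of the idea in Python. It is a toy word-count classifier I am using for illustration only; it is not Thunderbird's or POPFile's actual code, and the class name, the labels ("junk" / "not junk"), and the sample messages are all made up. Each call to train() stands in for one click on a Junk or Not Junk button.

```python
import math
import re
from collections import Counter, defaultdict


class NaiveBayesFilter:
    """Toy naive Bayes mail classifier (hypothetical, for illustration)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.message_counts = Counter()          # label -> messages trained

    @staticmethod
    def tokenize(text):
        # Lowercase and split into simple word tokens.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        # Stands in for the user marking a message as junk or not junk.
        self.message_counts[label] += 1
        self.word_counts[label].update(self.tokenize(text))

    def classify(self, text):
        # Returns the most likely label, or None if nothing was trained yet.
        vocab = set()
        for counts in self.word_counts.values():
            vocab.update(counts)
        total_messages = sum(self.message_counts.values())
        words = self.tokenize(text)

        best_label, best_score = None, -math.inf
        for label, seen in self.message_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Start from the log prior: how common this label is overall.
            score = math.log(seen / total_messages)
            for word in words:
                # Add-one smoothing, with one extra slot for unseen words,
                # so an unknown word never zeroes out a label.
                score += math.log(
                    (counts[word] + 1) / (total_words + len(vocab) + 1)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


mail_filter = NaiveBayesFilter()
mail_filter.train("Cheap pills, buy now, limited offer", "junk")
mail_filter.train("Win money now, click here", "junk")
mail_filter.train("Meeting notes attached, see you Monday", "not junk")
mail_filter.train("Can you review the backup script today", "not junk")

print(mail_filter.classify("Buy cheap pills now"))
print(mail_filter.classify("Notes from the Monday meeting"))
```

The point of the sketch is that every reclassification changes the word counts, so a similar message should score differently the next time around. A real filter would also look at headers, handle HTML, and persist its counts between sessions.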

Thunderbird's poor implementation is very frustrating to me because on my home machine I still use Outlook from time to time, and to filter SPAM there I use POPFile, which has an awesome Bayesian implementation. Granted, I could set up POPFile with Thunderbird too, but just like this guy in the first comment, I would like the best of both worlds. I would like to see a well-implemented Bayesian email filter that doesn't require me to open up a web browser to reclassify something. I would like to be able to just click a button to say this email is junk or not junk and have the filter updated and the mail moved to the appropriate place.

Before I started using Thunderbird myself, I switched my mom over to it. She gets A LOT of SPAM. She is using an email account that has been around since ~1995 and came with the local ISP my parents use. That is a long time to get on a lot of junk mail lists. Before switching to Thunderbird, she was using Outlook Express + POPFile. Unfortunately, she didn't go in and classify emails, ever, so POPFile would gradually get worse and she would start getting more SPAM. Integrating the classification with the action of deleting SPAM, which she was already doing, would be a much better solution, I thought. I am not so sure now, though. I don't know exactly how much SPAM gets through these days, but I think it is more than with an undertrained POPFile.

Please, Mozilla, take a look at the POPFile implementation! While you are at it, maybe you could implement the "buckets" concept too, instead of just having Junk/Not Junk.
