Next week, when you turn on your computer and open your Inbox, you’ll notice a beautiful absence. No more – well, almost no more – spam. No more sex, drugs and bankroll come-ons from sleazy strangers crashing your by-invitation-only e-mail party.
That’s because Computing and Network Services (CNS) will have installed a new spam filter it tested this summer as a pilot project. The new filter, called Bogofilter, is more effective than any other spam-reduction method York has tried over the past couple of years, says spam-buster and UNIX systems administrator Peter Marques, who led the pilot project with Ramon Kagan, senior information security analyst. Bogofilter reduced spam by as much as 90 per cent for the 45 participants. That can only spell relief for those who receive up to 250 spam messages a day.
“Our goal with spam filtering is not to automatically filter every message you do not wish to see as this would be extremely unrealistic,” says Susan Spence, CNS director of service management and delivery. “Instead, our aim is to filter as much of the ‘pure spam’ as we possibly can. But there will always be a certain amount of ‘nuisance/noise’ e-mail – auto-reply messages, delivery failure notices – that you will still have to deal with yourself.”
CNS is planning to put the new filter to work for much of the rest of the campus community starting Aug. 10 and will notify you by e-mail when it is active on your account.
The filter will apply to faculty and staff whose e-mail addresses end in @yorku.ca. Eventually it will also be extended to student accounts ending in @yorku.ca.
It will not work on e-mail delivered via other departmental or faculty e-mail systems such as Lotus Notes in Glendon, Osgoode, Atkinson and Schulich, MS Exchange in Facilities Services and Campus Services and Business Operations, and First Class in Education.
Previous efforts by CNS to stem a growing tide of unwanted e-mails have included blocking messages from known bulk-e-mail distributors and blocking spam sources one by one as they were reported. Most recently, CNS has used SpamAssassin, a software program that applies a broad filter but requires action by users. “It doesn’t seem to be used very widely,” says Marques, who has been involved in most of the University’s anti-spam battles.
Bogofilter rated best
The new spam filter, Bogofilter, was one of two products tested this summer by Marques and Kagan. It is simpler to implement and “is by far the best method” so far of filtering out spam, says Marques.
Bogofilter classifies mail as spam or ham (non-spam) by a statistical analysis – known as the Bayesian technique – of message headers and content. It assigns a value to each word and pattern (called a token) that appears in a user’s spam and non-spam, based on how frequently it turns up in each, and uses that information to block unwanted messages. For instance, the word “viagra” would get a high value because it appears frequently in spam but rarely in non-spam, and would thus be used to identify and block incoming spam. A word like “hi,” however, would get a lower value because it is common to both and would not serve to identify spam.
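For readers curious about the arithmetic, the idea of scoring tokens by how often they appear in spam versus non-spam can be sketched in a few lines of Python. This is a simplified illustration, not Bogofilter’s actual code: the function names, the whitespace tokenizer and the four sample messages are all assumptions made for the example.

```python
from collections import Counter


def tokenize(text):
    # Assumption: a token is just a lowercase, whitespace-separated word.
    return text.lower().split()


def token_scores(spam_msgs, ham_msgs):
    """Return a value between 0 and 1 for each token.

    Values near 1 mean the token shows up mostly in spam, values near 0
    mean mostly in ham, and values near 0.5 mean it is common to both.
    """
    # Count how many messages of each kind contain each token.
    spam_counts = Counter(t for m in spam_msgs for t in set(tokenize(m)))
    ham_counts = Counter(t for m in ham_msgs for t in set(tokenize(m)))

    scores = {}
    for tok in set(spam_counts) | set(ham_counts):
        spam_rate = spam_counts[tok] / len(spam_msgs)
        ham_rate = ham_counts[tok] / len(ham_msgs)
        scores[tok] = spam_rate / (spam_rate + ham_rate)
    return scores


# Hypothetical sample mail, invented for this illustration.
spam = ["hi buy viagra now", "hi cheap viagra deal"]
ham = ["hi are we meeting today", "hi lunch tomorrow"]

scores = token_scores(spam, ham)
print(scores["viagra"], scores["hi"], scores["lunch"])
```

In this toy data set “viagra” appears only in spam and scores at the top of the range, while “hi” appears in every message of both kinds and lands in the middle, so it tells the filter nothing.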
Bogofilter can learn from the user’s classifications and corrections and can also be trained to detect changing spam trends, says Marques.
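How a filter might learn from a user’s corrections can be sketched the same way. The class below is hypothetical and much cruder than Bogofilter itself: it keeps per-token counts for spam and ham, lets a misfiled message be moved from one pile to the other, and rates a message by simply averaging its token values (an assumption made for brevity, not the statistical combination a real Bayesian filter uses).

```python
from collections import Counter


class ToyFilter:
    """Hypothetical trainable filter; not Bogofilter's implementation."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.totals = {"spam": 0, "ham": 0}

    def learn(self, text, label):
        # Register a message under "spam" or "ham".
        self.counts[label].update(set(text.lower().split()))
        self.totals[label] += 1

    def unlearn(self, text, label):
        # Undo an earlier registration.
        self.counts[label].subtract(set(text.lower().split()))
        self.totals[label] -= 1

    def correct(self, text, wrong_label):
        # Move a misfiled message to the other pile.
        right_label = "ham" if wrong_label == "spam" else "spam"
        self.unlearn(text, wrong_label)
        self.learn(text, right_label)

    def token_score(self, tok):
        spam_rate = self.counts["spam"][tok] / max(self.totals["spam"], 1)
        ham_rate = self.counts["ham"][tok] / max(self.totals["ham"], 1)
        if spam_rate + ham_rate == 0:
            return 0.5  # never-seen token: treat as neutral
        return spam_rate / (spam_rate + ham_rate)

    def score(self, text):
        # Simplification: average the token values of the message.
        toks = set(text.lower().split())
        if not toks:
            return 0.5
        return sum(self.token_score(t) for t in toks) / len(toks)


f = ToyFilter()
f.learn("cheap pills offer", "spam")
f.learn("meeting agenda attached", "ham")
f.learn("newsletter offer inside", "spam")  # filed as spam by mistake

before = f.score("newsletter offer")
f.correct("newsletter offer inside", "spam")  # the user says it was ham
after = f.score("newsletter offer")
print(before, after)
```

After the correction, a similar newsletter message rates lower than it did before, which is the sense in which the filter adapts to each user’s own mail.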