What's with bogofilter and spam

Rodd Clarkson rodd at clarkson.id.au
Thu Jun 4 11:53:25 UTC 2009


On Thu, 2009-06-04 at 12:13 +0100, Anne Wilson wrote:
> On Thursday 04 June 2009 11:26:11 Rodd Clarkson wrote:
> > On Thu, 2009-06-04 at 05:05 -0500, Mike Chambers wrote:
> > > On Thu, 2009-06-04 at 20:01 +1000, Rodd Clarkson wrote:
> > > > Recently bogofilter's spam sensing abilities seem to have gone all wrong
> > > > in evolution.
> > > > 
> > > > I was getting way too much spam in the inbox (maybe 10% of my spam
> > > > wasn't getting detected) and even though I was highlighting it and
> > > > marking it as spam the same sorts of messages kept appearing.
> > > > 
> > 
> > <snip>
> > 
> > > > Are others noticing the same issues?
> > > > 
> > > > I prefer bogofilter over spamassassin as the latter takes forever to
> > > > filter through email, especially when you've been on holidays for a week
> > > > and have to pull a couple of thousand messages.
> > > 
> > > Experienced everything you did, including the marking of my Fedora
> > > messages as spam.  It just doesn't seem that bogofilter and evo are
> > > working together like they did in F10.
> > > 
> > > I thought I was the only one experiencing this.
> > 
> > Filed as: https://bugzilla.redhat.com/show_bug.cgi?id=504112
> > 
> Spammers are getting a lot more clever/careful these days, using words that 
> won't be detected.  I've found that I have to collect spam and 'unsure' ham 
> into folders until I get a reasonable number, then every few days I run
> 
> bash /usr/share/bogofilter/contrib/contrib/trainbogo.sh -c -H /home/anne/Maildir/.INBOX.bogotrain_ham/cur/ -S /home/anne/Maildir/.INBOX.bogotrain_spam/cur/
> 
> (watch for line-wrap - it's all one line), repeating until the missed spam is 
> down to about 3.  I then delete all the tested messages and collect the next 
> batch.  I'm still seeing a number of unsures, but bogofilter is definitely 
> learning the new stuff.
> 
> If you are seeing ham messages being detected as spam, copy a large number of 
> similar messages, for instance mailing-list messages, into your ham testing 
> folder before the run.  Doing this a few times should sort out any 
> mis-training already there.  HTH
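
The "repeat until the missed spam is down to about 3" check described above
could be approximated with a small helper like the one below.  This is only a
sketch, not part of trainbogo.sh: count_missed and the BOGOFILTER override
variable are hypothetical names introduced here, and the only thing it relies
on is bogofilter's documented exit status (0 = spam, 1 = ham, 2 = unsure,
3 = error) when a single message is fed on standard input.

```shell
#!/bin/sh
# Sketch only: count how many messages in a folder bogofilter does NOT
# currently classify as spam.
# Assumption: exit status 0 means spam; 1 (ham), 2 (unsure) and 3 (error)
# are all treated as a miss here.
# BOGOFILTER is a hypothetical override so a different binary can be used.
BOGOFILTER="${BOGOFILTER:-bogofilter}"

# count_missed DIR: print the number of regular files in DIR that are
# not scored as spam.
count_missed() {
    dir=$1
    missed=0
    for msg in "$dir"/*; do
        # Skip anything that is not a regular file (including the
        # unexpanded glob when the folder is empty).
        [ -f "$msg" ] || continue
        if "$BOGOFILTER" < "$msg" >/dev/null 2>&1; then
            :   # classified as spam, nothing to count
        else
            missed=$((missed + 1))
        fi
    done
    echo "$missed"
}
```

Running count_missed on the spam training folder after each training pass
would give the number to watch; once it prints something around 3, the batch
can be deleted and the next one collected.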

Thanks Anne,

Sadly, I don't feel like doing this manually, and I guess that I
just expect my mail client to work well with the spam software and do it
for me.  After all, my mail client has a great collection of ham and
spam so if I can do something like it manually, then surely it can't be
hard for the spam software to do it without me having to think about it.

bogofilter used to work well, and I'm hoping that it can once again be
the great spam filter it was, fast and accurate.


R.




More information about the test mailing list