ob1db

SpamAssassin Failing, Bayesian + Training Needed!

Hey, the amount of spam getting through SpamAssassin is rising DAILY. With it still set to "5", I get 10-30 a day for Viagra, Penis Enlargers, Canadian Pharmacy, Breast enlargers (and my breasts are QUITE large enough, thank you, LOL!) etc. All OBVIOUS spam, but SpamAssassin gives them ratings like 2.6 or lower. (And still manages to false positive some!)

I now have nearly ALL the blocklists active. That has not helped much but HAS increased the false positives greatly.

I was just chatting with a local sysop who commented "I have used spamassassin for a while now, and very little gets through. Are you not training it?"

This stunned me! When I first enabled SpamAssassin, the spam really dropped dramatically. Now it is getting nearly intolerable. Is it possible the system is set up with no training enabled?

Oh, and my friend is running a SpamAssassin one version OLDER than the one SpamCop runs. Is there something that needs to be changed in how SpamAssassin is set up?

We don't have individual access to its settings besides the level, correct?

He says: "You collect mails that it missed that you know to be spam, and run a program called 'sa-learn' over them. You also run it over mails it got right, utilizing the --ham switch. This basically adds data to the Bayesian classifier, I think.

Those using it on a network might want to have a 'this is spam, damnit' drop so that users can report it. It's smart enough to not register a mail twice."
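
For anyone unsure what that training actually changes, here is a minimal sketch in Python of a token-based Bayesian filter. It is a toy under stated assumptions, not SpamAssassin's real code: the names ToyBayes, learn and spam_probability are made up for illustration, and the smoothing is the simplest one that works. On a real install the training would be done with sa-learn as described above.

```python
import math
from collections import Counter


class ToyBayes:
    """Tiny token-based Bayesian classifier (illustration only, hypothetical names)."""

    def __init__(self):
        # Per label: in how many trained messages each token appeared.
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}
        self.seen = set()  # trained messages, so none is registered twice

    def learn(self, text, label):
        """Train on one message; return False if it was already learned."""
        if text in self.seen:
            return False
        self.seen.add(text)
        self.messages[label] += 1
        self.counts[label].update(set(text.lower().split()))
        return True

    def spam_probability(self, text):
        """Estimate P(spam | tokens) from log-odds with add-one smoothing."""
        log_odds = math.log((self.messages["spam"] + 1) / (self.messages["ham"] + 1))
        for token in set(text.lower().split()):
            p_spam = (self.counts["spam"][token] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.counts["ham"][token] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))


clf = ToyBayes()
untrained = clf.spam_probability("cheap viagra discount")  # no data yet: 0.5

# Missed spam goes in as "spam", correctly delivered mail as "ham".
clf.learn("cheap viagra pills online", "spam")
clf.learn("viagra discount pharmacy", "spam")
clf.learn("meeting notes attached", "ham")
clf.learn("lunch on friday", "ham")

trained = clf.spam_probability("cheap viagra discount")  # now well above 0.5
```

With no training the classifier has nothing to go on and every message comes out at 0.5, which is the point my friend was making: the Bayesian part only earns its keep once it has been fed both missed spam and good mail.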

Soo, we need a column for the SpamAssassin score on the HeldMail page (web) and a checkbox for spam that was scored too low. Maybe the same on the inbox page as well?

And maybe a special maildrop for TRULY egregious failures, e.g. "scored a 2.6 for viagra with breast enlargement, all shouting and HTML, etc." Likewise a drop for ones that WERE flagged that were clean?

Waddya say?

We gotta get the levels down.

Drowning again.

David

PS: I now find out that Bayesian filtering is OFF, as is training. I am told that SpamAssassin is barely effective set up this way. See examples posted in the .spam newsgroup. Do I need to repost them in .spam up here as well?


No. What I'd suggest is to read the remarks already made in the other Topic/Thread, where JT is answering questions about the status of "Training" and "Bayesian" filtering. Jump in on that Topic/Thread if you want. Now that attention has been drawn to the issue, I believe JT's next step is probably to look at changing that, though it's not for me to say. But starting another Topic/Thread on something you say you've already found the answer to seems like the wrong way to do things.

Hey, the amount of spam getting through SpamAssassin is rising DAILY. With it still set to "5", I get 10-30 a day for Viagra, Penis Enlargers, Canadian Pharmacy, Breast enlargers (and my breasts are QUITE large enough, thank you, LOL!) etc. All OBVIOUS spam, but SpamAssassin gives them ratings like 2.6 or lower. (And still manages to false positive some!)

The question is not how much spam is getting through. The question is what percentage is getting through. You say that 10-30 are getting through daily. How many are being caught daily? Why don't you actually count and let us know?
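
To make the arithmetic concrete, here is a rough Python sketch of the percentage in question. The function name leak_rate is made up, and the counts in the example are hypothetical placeholders, not measurements from anyone's mailbox.

```python
def leak_rate(caught, missed):
    """Fraction of all incoming spam that reached the inbox."""
    total = caught + missed
    if total == 0:
        raise ValueError("no spam counted yet")
    return missed / total


# Hypothetical day: 480 spams held, 20 delivered to the inbox.
rate = leak_rate(caught=480, missed=20)
print(f"{rate:.1%} of spam got through")
```

The same 20 missed messages mean something very different if 30 were caught than if 2,000 were, which is why the caught count matters.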

PS: I now find out that Bayesian filtering is OFF, as is training. I am told that SpamAssassin is barely effective set up this way. See examples posted in the .spam newsgroup. Do I need to repost them in .spam up here as well?

You were told wrong. The current installation of SpamAssassin is catching a huge amount of spam.

JT

You were told wrong. The current installation of SpamAssassin is catching a huge amount of spam.

...but from *our* (end-users) experience, JT, it's allowing through more and more. Despite being a long-time SC website reporter (until I finally used up my quota), I've only been using SC mailboxes for about a month, and during that time, it does indeed seem that more stuff is getting past SpamAssassin, and the stuff that gets through often has surprisingly low SA values.

For example, I'll include the Subject, the X-spam headers and body (with all HTML removed and without the spammer's URL) of one that got past SA earlier today with only "X-spam-Status: hits=1.1":

Subject: FWD: Order v1[at]GRA * X[at]nax ` Va|l|ium % Fi0ric3`t _ Pnt.e.rmin + S0ma 34xKv

X-spam-Checker-Version: SpamAssassin 2.60 (1.212-2003-09-23-exp) on blade6

X-spam-Level: *

X-spam-Status: hits=1.1 tests=BIZ_TLD,HTML_50_60,HTML_MESSAGE,MIME_HTML_ONLY

All Your Meds Here

You too can now enjoy the same deep discounts offered to US residents by ordering your prescriptions directly from us.

Your choices: ^ So+m+a ' v|aGr[at] ; Va:l:ium ( _XANAX_ ; Pnterm.i.n $ A't|v[at]n

Plus: U|'tr[at]m, L`3v|tra, Pr0p3:cia, Acyc:|0vir, Pr0z[at]:c, P[at]xi'l, Busp`[at]r, Ad|p:ex, I0.nam|n, M3ridi'a, X`3nica|, Ambi3'n, S:0naTa, Fl3`xeril, C:e|3brex, F'i0ric3t, Tram[at]d:o|

No complicated formalities of any kind.

<spammer's url>Best prices here.

---

Hmmmm....let's not defend SpamAssassin's functioning until it can do something about spam like this one.
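
For anyone who wants to tally these low scorers rather than eyeball them, here is a small Python sketch that pulls the score and rule names out of a status line like the one quoted above. It assumes the header is on a single line in the form shown in this post. The name parse_status is made up, the default threshold of 5 is the setting mentioned at the top of the thread, and treating a score at or above the threshold as flagged is my assumption about how the comparison works.

```python
import re

HITS_RE = re.compile(r"hits=(-?\d+(?:\.\d+)?)")
TESTS_RE = re.compile(r"tests=(\S*)")


def parse_status(header_value, threshold=5.0):
    """Extract the score and rule names from an X-spam-Status value."""
    hits_match = HITS_RE.search(header_value)
    if hits_match is None:
        raise ValueError("no hits= field found")
    hits = float(hits_match.group(1))
    tests_match = TESTS_RE.search(header_value)
    tests = [t for t in tests_match.group(1).split(",") if t] if tests_match else []
    # Assumption: a message counts as flagged when its score reaches the threshold.
    return {"hits": hits, "tests": tests, "flagged": hits >= threshold}


status = parse_status(
    "hits=1.1 tests=BIZ_TLD,HTML_50_60,HTML_MESSAGE,MIME_HTML_ONLY"
)
print(status["hits"], status["flagged"], status["tests"])
```

Run over a folder of missed spam, something like this would show which rules are firing and how far below the threshold the misses land. A header folded across several lines would need unfolding first.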


I'm locking this thread since exactly the same comments and arguments are taking place in the other thread in this same forum.

JT

This topic is now closed to further replies.