Mike1024

Spamcop e-mail service - blocking by charset


Hi there,

I subscribe to your e-mail service and I'm pretty pleased with it.

In recent months, though, I've had a problem with spam getting through the filters. Most of the spam that gets through is Windows-1251 encoded. Here's a picture of e-mails I've deleted (rather than them going to 'held mail' like they should) over recent days:

spamcop_spam.png

So my request is: Can I have something to address this, such as a filter to block all Windows-1251 encoded spam?

Thanks,

Michael

Hi there,

I subscribe to your e-mail service and I'm pretty pleased with it.

In recent months, though, I've had a problem with spam getting through the filters. Most of the spam that gets through is Windows-1251 encoded. Here's a picture of e-mails I've deleted (rather than them going to 'held mail' like they should) over recent days:

So my request is: Can I have something to address this, such as a filter to block all Windows-1251 encoded spam?

31677[/snapback]

I don't know the likelihood of this request being honored, but have you been reporting the spam that gets through the filters (before deleting it)? Have you looked at the x-spamcop-* headers to determine why it is getting through?


Not using the e-mail side of the house, I'm sure I'm missing something. However, in general terms, the primary focus of the filtering process starts with the IP addresses involved, then you can add in some of the SpamAssassin features. My assumption is that the user filters are part of the Horde/IMP application. So I'm wondering exactly where analysis of the composition of the e-mail (to include the character set) would show up. Granted that all you showed was Deleted stuff, but I have to ask ... is any of your 'goof' e-mail also Cyrillic?


Hi guys,

I don't know the likelihood of this request being honored, but have you been reporting the spam that gets through the filters (before deleting it)? Have you looked at the x-spamcop-* headers to determine why it is getting through?

31679[/snapback]

I report all the spam that has got past the filters while I have been at my workstation, which is a big chunk of it. I don't bother with stale stuff that's arrived overnight etc.

The typical SC headers of a message might be:

X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on blade6
X-Spam-Level: ***
X-Spam-Status: hits=3.5 tests=EXTRA_MPART_TYPE,FROM_STARTS_WITH_NUMS,
    HTML_30_40,HTML_FONT_BIG,HTML_MESSAGE,RCVD_NUMERIC_HELO version=3.0.2
X-SpamCop-Checked: 192.168.1.103 218.145.97.169 218.145.97.169

I assume the problem is partly that SpamAssassin has lots of filters for English spam but fewer/none for foreign language spam.
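
For anyone who wants to look at these headers in bulk rather than by eye, here is a rough Python sketch that pulls the score and the list of tests out of the status header quoted above. It assumes the 3.0.x-style "hits=... tests=..." layout shown in this post; the function name parse_spam_status and the sample message are made up for illustration and are not part of SpamCop or SpamAssassin.

```python
import re
from email import message_from_string

# Sample message built from the headers quoted above (the body is a placeholder).
RAW = (
    "X-Spam-Level: ***\n"
    "X-Spam-Status: hits=3.5 tests=EXTRA_MPART_TYPE,FROM_STARTS_WITH_NUMS,\n"
    "\tHTML_30_40,HTML_FONT_BIG,HTML_MESSAGE,RCVD_NUMERIC_HELO version=3.0.2\n"
    "\n"
    "placeholder body\n"
)

def parse_spam_status(value):
    """Return (hits, [test names]) from a 'hits=... tests=...' header value."""
    # The header is folded across lines; drop whitespace that follows a comma.
    flat = re.sub(r",\s+", ",", value)
    hits_match = re.search(r"hits=(-?\d+(?:\.\d+)?)", flat)
    tests_match = re.search(r"tests=(\S+)", flat)
    if not hits_match or not tests_match:
        raise ValueError("unexpected X-Spam-Status layout")
    return float(hits_match.group(1)), tests_match.group(1).split(",")

msg = message_from_string(RAW)
# Header lookup is case-insensitive, so 'X-spam-Status' would match as well.
hits, tests = parse_spam_status(msg["X-Spam-Status"])
print(hits)
for name in tests:
    print(" ", name)
```

Run over a folder of missed spam, something like this would show which tests fire most often and how far below the threshold the messages typically land.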

Not using the e-mail side of the house, I'm sure I'm missing something. However, in general terms, the primary focus of the filtering process starts with the IP addresses involved, then you can add in some of the SpamAssassin features. My assumption is that the user filters are part of the Horde/IMP application. So I'm wondering exactly where analysis of the composition of the e-mail (to include the character set) would show up. Granted that all you showed was Deleted stuff, but I have to ask ... is any of your 'goof' e-mail also Cyrillic?

31686[/snapback]

Just to clarify, all the e-mail shown was 'goof' e-mail. Correctly identified spam goes to 'held mail', while spam I've had to delete manually goes to 'deleted' (which is what is shown in the screenshot).

Basically the state with the e-mail service is: Users can choose a number of settings for their mail to be filtered by, including SpamAssassin level and a choice of RBLs, plus a sender blacklist and whitelist. Any mail classified as spam under this system is put in a 'held mail' folder on the server. I use IMAP so the 'held mail' folder on the screenshot holds e-mail identified as spam.

If you're not familiar with SpamAssassin, how it works is: Each message starts with a score of 0. Tests are performed (examples) and the score is incremented by the test's weighting every time a test is failed. Users can then pick at what threshold a message should be called spam and treated as such.

For example if a message contains 'v1agra' (weighting +2.5) and the message is 40%-50% HTML obfuscation (weighting +2.6) the final score is 5.1. If the user has chosen 5 as their SpamAssassin threshold, the message will be classified as spam.
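
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the idea, not SpamAssassin's actual code: the rule names are invented for the example, the weights are the ones used above, and the real rules and scores live in SpamAssassin's configuration.

```python
# Hypothetical rule names; the weights come from the worked example above.
RULE_WEIGHTS = {
    "CONTAINS_V1AGRA": 2.5,
    "HTML_40_50": 2.6,
}

def score_message(fired_rules, weights=RULE_WEIGHTS):
    """Every message starts at 0; add the weight of each rule that fired."""
    return sum(weights[name] for name in fired_rules)

def is_spam(score, threshold):
    """Treat the message as spam once the score reaches the user's threshold."""
    return score >= threshold

score = score_message(["CONTAINS_V1AGRA", "HTML_40_50"])
print(round(score, 1), is_spam(score, threshold=5.0))
```

With only one of the two rules firing, the same message would stay under a threshold of 5 and be delivered normally.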

The SpamAssassin filters and weightings are (AFAIK) configurable by the system administrator.

Basically what I'm asking is: Could we have more SpamAssassin tests to cover the spam that's currently getting through the filters, as pictured?

Cheers,

Michael

Basically what I'm asking is: Could we have more SpamAssassin tests to cover the spam that's currently getting through the filters, as pictured?

31719[/snapback]

If you know of a good rule set (rules along the lines of EXTRA_MPART_TYPE, FROM_STARTS_WITH_NUMS, HTML_30_40, HTML_FONT_BIG, HTML_MESSAGE, RCVD_NUMERIC_HELO) that does what you are looking for, implementation is more likely to occur. The less work for JT to locate and implement, the better.

There have been requests for additions to the rule sets in the past. The last time they were shown to be implemented was when we went from v2 to v3 of SpamAssassin. There may have been additions since then, but none where I have seen a request followed by a confirmation.


http://spamassassin.apache.org/full/3.0.x/...assin_Conf.html

ok_languages xx [ yy zz ... ] (default: all)

This option is used to specify which languages are considered OK for incoming mail. SpamAssassin will try to detect the language used in the message text.

Note that the language cannot always be recognized with sufficient confidence. In that case, no points will be assigned.

The rule UNWANTED_LANGUAGE_BODY is triggered based on how this is set.

In your configuration, you must use the two or three letter language specifier in lowercase, not the English name for the language. You may also specify all if a desired language is not listed, or if you want to allow any language. The default setting is all.

.........

ok_locales xx [ yy zz ... ] (default: all)

This option is used to specify which locales (country codes) are considered OK for incoming mail. Mail using character sets used by languages in these countries will not be marked as possibly being spam in a foreign language.

If you receive lots of spam in foreign languages, and never get any non-spam in these languages, this may help. Note that all ISO-8859-* character sets, and Windows code page character sets, are always permitted by default.

Set this to all to allow all character sets. This is the default.

The rules CHARSET_FARAWAY, CHARSET_FARAWAY_BODY, and CHARSET_FARAWAY_HEADERS are triggered based on how this is set.

...........

In general, the situation appears to suggest that if you were running your own server, you could 'lock out' lots of stuff. As the service 'here' handles users world-wide, 'locking things out' isn't really an option. From reading that configuration document, it would seem that the SpamAssassin folks are the ones to ask about expanding their feature set ...???

What's interesting is the 'sudden' flood of complaints about Cyrillic text spam and the request for something to be done ... (or is it that the same folks are posting both here and in the newsgroups?)

As mentioned, there were some additional language/word packs tossed into the mix a long while back ... are there some Cyrillic packs available? http://wiki.apache.org/spamassassin/CustomRulesets doesn't list one ...


From the spamcop.help newsgroup <g>

From: "WazoO"
Newsgroups: spamcop.help
Subject: Re: Russian spam
Date: Tue, 16 Aug 2005 19:05:35 -0500
Message-ID: <ddtv0g$evv$1[at]news.spamcop.net>

"RW"  wrote in message
news:ddtr5o$c24$1[at]news.spamcop.net...
> Martin Cleaver wrote:
> > I invested in some spamcop pop accountsd for my family and that works
> > great. I also bought another one for our small software company in
> > Holland. However we are receiving about 102- Russian spams a day on
<snip>
> > Please can someone tune the filters to stop Russian crap too?
> > Rgds
> > Martin
>
> Do you have spam Assassin filtering turned on in your account?  That
> should help.
>
> Richard

Actually, http://forum.spamcop.net/forums/index.php?showtopic=4732
suggests "not that much help" <g>

ok_languages xx [ yy zz ... ] (default: all)

31721[/snapback]

The trouble with implementing this within SpamAssassin is that it affects everyone who uses the SA filter.

I wouldn't be troubled by this particular check myself, but anyone who regularly receives legitimate e-mail in this character set would be affected: their legitimate e-mail would be more likely to be captured.

So my feeling is that this would require a low score. As it stands I would be marginally inclined to speak against adding this check to SpamAssassin.
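
To put rough numbers on the "low score" point, here is a small Python sketch. The starting score of 3.5 is the hits= value from the sample headers quoted earlier in the thread and the threshold of 5 is the one used in the earlier example; the charset rule itself and the candidate weights are purely hypothetical.

```python
THRESHOLD = 5.0   # example user threshold from earlier in the thread
BASE_SCORE = 3.5  # hits= value from the sample X-Spam-Status header

def with_charset_rule(base_score, charset_weight):
    """Score after a hypothetical 'unwanted charset' rule also fires."""
    return base_score + charset_weight

# Try a few candidate weights and see which ones tip the message over.
for weight in (0.5, 1.0, 2.0):
    total = with_charset_rule(BASE_SCORE, weight)
    print(weight, total, total >= THRESHOLD)
```

A weight of 1.0 would leave that sample message under the threshold, while 2.0 would hold it. The same weight would be added to any legitimate mail in that character set, which is the trade-off being described.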

It would really be better if we could get the problem IP addresses onto the blocklist and keep them reported whilst the spew continues.

Andrew

The trouble with implementing this within SpamAssassin is that it affects everyone who uses the SA filter.

31742[/snapback]

In the long run, maybe we should allow SpamAssassin per-user settings in addition to the global rule set.


User adjustable settings would definitely be helpful.

User adjustable settings would definitely be helpful.

31761[/snapback]

The catch is referenced in the link I offered a few posts back; http://spamassassin.apache.org/full/3.0.x/...assin_Conf.html

allow_user_rules { 0 | 1 } (default: 0)

This setting allows users to create rules (and only rules) in their user_prefs files for use with spamd. It defaults to off, because this could be a severe security hole. It may be possible for users to gain root level access if spamd is run as root. It is NOT a good idea, unless you have some other way of ensuring that users' tests are safe. Don't use this unless you are certain you know what you are doing. Furthermore, this option causes spamassassin to recompile all the tests each time it processes a message for a user with a rule in his/her user_prefs file, which could have a significant effect on server load. It is not recommended.

Note that it is not currently possible to use allow_user_rules to modify an existing system rule from a user_prefs file with spamd.

The catch is referenced in the link I offered a few posts back; http://spamassassin.apache.org/full/3.0.x/...assin_Conf.html

Furthermore, this option causes spamassassin to recompile all the tests each time it processes a message for a user with a rule in his/her user_prefs file, which could have a significant effect on server load. It is not recommended.

31765[/snapback]

If my reading of that is right, the load is only increased when users have extra rules, not just by enabling user rules.

Could we enable user rules, but have a one-off $10 fee to get user_prefs editing rights? I'm sure people would pay, and the proceeds would fund getting more RAM or whatever.

Michael

Could we enable user rules, but have a one-off $10 fee to get user_prefs editing rights? I'm sure people would pay, and the proceeds would fund getting more RAM or whatever.

31784[/snapback]

I don't see this happening. It isn't just about the processing load but the security issues also referred to in Wazoo's post.

Andrew

Edited by agsteele

I don't see this happening.  It isn't just about the processing load but the security issues also referred to in Wazoo's post.

31796[/snapback]

Thanks for that ... I actually had that bit typed in, then deleted it. I'd keyed on the "when run as 'root'" comment and was going to try to search out if it was functional when run as "other than 'root'" ... time, distractions, and all that <g> .. suspecting that 'root' is actually needed due to all the (possible) system resources invoked


I've started getting a lot of spam in what appears to be Cyrillic lately. I'd like to create a filter in Webmail to automatically move that to the "spam" folder. Is there any way (Ping: JT) to get this set up?

Moderator Edit: Moved/Merged this post into an existing Topic covering the same ground. PM sent to advise of this action.


I've started getting a lot of spam in what appears to be Cyrillic lately. I'd like to create a filter in Webmail to automatically move that to the "spam" folder. Is there any way (Ping: JT) to get this set up?

Moderator Edit: Moved/Merged this post into an existing Topic covering the same ground. PM sent to advise of this action.

As I think you are aware, webmail filters only work when you are using webmail. You could probably do this currently with a custom header search looking for the character code. I don't have any here currently to test this on, however.
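
Outside of webmail, the same idea (look in the headers for the declared character set) could be sketched in Python roughly as follows. This is untested against real spam and is only an illustration: the BLOCKED set, the function name declared_charsets and the sample message are all made up for the example, and a message can of course declare one charset while using another.

```python
from email import message_from_string
from email.header import decode_header

# Hypothetical list of charsets this particular user never expects to receive.
BLOCKED = {"windows-1251", "koi8-r"}

# Made-up sample: an encoded Subject plus a Content-Type declaring the charset.
RAW = (
    "Subject: =?windows-1251?B?z/Do4uXy?=\n"
    'Content-Type: text/plain; charset="windows-1251"\n'
    "\n"
    "placeholder body\n"
)

def declared_charsets(msg):
    """Collect charsets named in Content-Type headers and encoded header words."""
    found = set()
    for part in msg.walk():
        charset = part.get_content_charset()
        if charset:
            found.add(charset.lower())
    for header_name in ("Subject", "From"):
        for value in msg.get_all(header_name, []):
            for _text, charset in decode_header(value):
                if charset:
                    found.add(charset.lower())
    return found

msg = message_from_string(RAW)
charsets = declared_charsets(msg)
print(sorted(charsets), bool(charsets & BLOCKED))
```

Anything flagged this way could then be moved to a spam folder by whatever mail client or script is doing the checking.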


As I think you are aware, webmail filters only work when you are using webmail. You could probably do this currently with a custom header search looking for the character code. I don't have any here currently to test this on, however.

Yes, but since I tend to leave a window open on the Webmail site for this very reason, it would work for me. :D I'll try your suggestion of a "custom header search" looking for that character code. Thanks for the suggestion, Steven!

