mrmaxx

How to block/filter? (cyrillic spam)

Glad I found this thread as I just discovered it was MIA in my SC Options too.

Edit later....it's back.

Edited by Ex_Brit

I used the new "Block Russian" feature for the whole of October '08

1448 valid emails reached my inbox.

5825 spams (188/day), 32 leakers (0.55%). 111 of the spams were Blocked Russian, so without this feature leaks might have been as high as 2.5%, depending on other blocklists.

3 false positives; 3 leaks were due to a forged From: matching my personal whitelist.

This was a great improvement over Sept '08 when

1963 valid emails reached my inbox.

5176 spams (173/day), 94 leakers (1.8%), of which 45 appeared to be Russian.

5 false positives; 2 leaks were due to a forged From: matching my personal whitelist.

I use the SpamCop Mail filtering service; my profile has all available blocklists selected and SpamAssassin set to 2.0. All spam is quick reported, all leaks are full reported.
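
For anyone who wants to check the percentages above, here is a minimal sketch in Python. The counts are copied from the post; the "worst case" assumes that every Blocked Russian message would otherwise have leaked, which the post itself notes depends on the other blocklists.

```python
def leak_rate(leaks: int, spams: int) -> float:
    """Percentage of spam that reached the inbox instead of being held."""
    return 100.0 * leaks / spams


# October '08 figures, copied from the post
oct_spams, oct_leaks, oct_blocked_russian = 5825, 32, 111
# September '08 figures, copied from the post
sep_spams, sep_leaks = 5176, 94

oct_rate = leak_rate(oct_leaks, oct_spams)
# Assumption: without the feature, all 111 Blocked Russian spams would have leaked
oct_worst = leak_rate(oct_leaks + oct_blocked_russian, oct_spams)
sep_rate = leak_rate(sep_leaks, sep_spams)

print(f"Oct: {oct_rate:.2f}% leaked, worst case without the feature {oct_worst:.2f}%")
print(f"Sep: {sep_rate:.2f}% leaked")
```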

Just to note my 'Russian' spam is no longer Cyrillic - quite a few little ones like this though, in its place - http://www.spamcop.net/sc?id=z2513451308za...0bfe5d56f0f458z

So maybe all that filtering folk have been doing has worked. Though I would be surprised if it was a direct cause and effect thing. But what else, why else transliterate? Я не понимаю (Ya nye ponimayu).

Interesting that there is no character-set data anywhere within the header or body, yet NAS chose to 'show' that it was X-NAS-Language: Slovenian .... ????
Good catch! Yeah, I've never understood how Norton does that - quick check shows the recent others ID as Slovenian too, except for this one which is seen as Polish - http://www.spamcop.net/sc?id=z2512607060z6...2e8ab2d944b2e3z

Must be a heck of a lot of processing going on if the contents of the bodies are being matched against (presumably a large number of) dictionaries, but yes, point is Slovenian and Polish are two (Slavic?) languages written in (modified) Latin script/alphabet - http://www.omniglot.com/writing/languages.htm#latin - not Cyrillic so no transliteration involved after all. Knowing neither, will have to trust NAS has made the right call (think it is supposed to use X-NAS-Language: Unknown if indeterminate).

There again - a Russian one http://www.spamcop.net/sc?id=z2518947839zf...1d6084c640ff93z

- and a purported Icelandic one which is actually Russian - http://www.spamcop.net/sc?id=z2519132222z3...fa47f119a21e50z (assuming Dolgaya Doroga = Долгая Дорога = "Long Road"). So, tending back to the transcription theory once more and maybe NAS just guesses about language.
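
To illustrate why transliterated spam slips past anything that looks for Cyrillic, here is a rough sketch in Python. The function `cyrillic_ratio` is a made-up helper for this post, not anything SpamCop or NAS is known to use; it only counts letters whose Unicode name starts with CYRILLIC.

```python
import unicodedata


def cyrillic_ratio(text: str) -> float:
    """Fraction of the letters in `text` that are Cyrillic (0.0 if no letters)."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    cyrillic = sum(
        1 for ch in letters if unicodedata.name(ch, "").startswith("CYRILLIC")
    )
    return cyrillic / len(letters)


# The example phrase from the post, in both forms
print(cyrillic_ratio("Долгая Дорога"))   # Cyrillic original
print(cyrillic_ratio("Dolgaya Doroga"))  # Latin transliteration
```

The transliterated form contains no Cyrillic letters at all, so a character-based or charset-based rule has nothing to match on.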

Sorry to revive this thread. SC Webmail is catching all this Cyrillic spam OK but, as I'm getting fed up with reporting or deleting it all the time, I set a filter to move it all to the trash, and it isn't doing that.

Wondered what I'm doing wrong, or is that not possible?

I've entered a filter for "koi8-r" under the various parameters mentioned elsewhere and applied the rule.

Edited by Ex_Brit

Sorry to revive this thread. SC Webmail is catching all this Cyrillic spam OK but, as I'm getting fed up with reporting or deleting it all the time, I set a filter to move it all to the trash, and it isn't doing that.

Wondered what I'm doing wrong, or is that not possible?

I've entered a filter for "koi8-r" under the various parameters mentioned elsewhere and applied the rule.

Looking back, I was successful with a filter

Filter Self-defined header/ content-type contains KOI8-R

I have the feeling that Filter and Search gave different results with Subject contains "KOI8-R", except when the subject had KOI8-R as literal text rather than as an encoding marker. So perhaps the header text that the filter sees has already been decoded and no longer contains ?KOI8-R?.

I suggest you make up a test folder with KOI8-R in every possible location and report on results.
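
The guess above can be tried outside webmail. A minimal sketch in Python, using only the standard `email.header` module: it builds a Subject encoded as KOI8-R and compares the raw header with the decoded text. Whether the webmail filter really works on the decoded form is an assumption; this only shows what each form would contain.

```python
from email.header import Header, decode_header

# Raw Subject as it travels in the message: =?koi8-r?b?...?=
raw_subject = Header("Привет", "koi8-r").encode()


def decoded_subject(raw: str) -> str:
    """Decode any encoded-words in a raw header value to plain text."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            parts.append(chunk.decode(charset or "ascii", errors="replace"))
        else:
            parts.append(chunk)
    return "".join(parts)


def contains_ci(haystack: str, needle: str) -> bool:
    """Case-insensitive 'contains', as a stand-in for a filter rule."""
    return needle.lower() in haystack.lower()


print(contains_ci(raw_subject, "KOI8-R"))                   # marker is in the raw header
print(contains_ci(decoded_subject(raw_subject), "KOI8-R"))  # but not in the decoded text
```

A subject that has KOI8-R as ordinary text would match in both forms, which fits the difference described above.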

Is the omission of the "" significant?

I've altered my rule and created a special folder to have them dumped in, if it works of course.

I've entered a filter for "koi8-r" under the various parameters mentioned elsewhere and applied the rule.
What about KOI8-U & Windows-1251? Are you trapping for those charsets as well? Firefox actually mentions many more, but I don't see them very much in e-mail.

-- rick

I have no idea. I'm not very au fait with all this terminology or these procedures.

I was merely copying from another thread when I first set up the filter.

I have no idea. I'm not very au fait with all this terminology or these procedures.

Can you post a Tracking URL for one of those that you think should have been filtered but wasn't? The thought being that we could then talk about a specific set of headers and body content, rather than simply guessing at what's there or not.

I think this was the most recent one:

Post from michaelanglo says Filter Self-defined header/ content-type contains KOI8-R

As Rick mentions, there's a difference between that Rule and your sample spam header in that the Content-Type field does use charset="windows-1251" .... so, if you used that suggested Rule, it fails, as the data it's looking for isn't there. Suggestion would seem to be to add a new Rule using the latter data as the comparison trigger.
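
A small sketch in Python of why the one rule misses that sample. The message below is a made-up stand-in (only the `charset="windows-1251"` part comes from the sample discussed above), and the set of charsets is just the three named in this thread, not a complete list.

```python
from email import message_from_string

# Charsets mentioned in this thread; not an exhaustive list
CYRILLIC_CHARSETS = {"koi8-r", "koi8-u", "windows-1251"}

# Hypothetical message; only the charset value is taken from the thread
SAMPLE = (
    "From: someone@example.com\n"
    "Subject: test\n"
    'Content-Type: text/plain; charset="windows-1251"\n'
    "\n"
    "body\n"
)


def koi8r_only_rule(raw_message: str) -> bool:
    """Mimics 'content-type contains KOI8-R'."""
    content_type = message_from_string(raw_message).get("Content-Type", "")
    return "koi8-r" in content_type.lower()


def any_cyrillic_charset_rule(raw_message: str) -> bool:
    """True if any part declares one of the charsets listed above."""
    msg = message_from_string(raw_message)
    return any(
        (part.get_content_charset() or "") in CYRILLIC_CHARSETS
        for part in msg.walk()
    )


print(koi8r_only_rule(SAMPLE))            # the KOI8-R rule has nothing to match
print(any_cyrillic_charset_rule(SAMPLE))  # a rule covering windows-1251 does
```

In webmail terms this amounts to one extra rule per charset, as suggested above.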

OK I'll look into that, thanks

Tried all the above suggestions and they failed. It's no big deal really. At least they are caught in Held Mail for me to report or delete.

I'll just make do with the status quo.

Thanks.

Edited by Ex_Brit

Is the omission of the "" significant?

Oops, I was just trying to clarify. The '"' don't appear in the filter.

You might also try piggy-backing on the webmail "Blocked Russian" feature

Filter Self-defined header/ X-SpamCop-Disposition contains Russian
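
As a rough illustration in Python of what that rule does, and why typed-in quotation marks matter: the header value `Blocked Russian` in the sample is an assumption based on the feature name, and the case-insensitive match is also an assumption about how the webmail filter compares text.

```python
from email import message_from_string

# Hypothetical held message; the header value is assumed, not copied from real mail
SAMPLE = (
    "X-SpamCop-Disposition: Blocked Russian\n"
    "Subject: test\n"
    "\n"
    "body\n"
)


def disposition_contains(raw_message: str, needle: str) -> bool:
    """Mimics 'Self-defined header X-SpamCop-Disposition contains <needle>'."""
    value = message_from_string(raw_message).get("X-SpamCop-Disposition", "")
    return needle.lower() in value.lower()


print(disposition_contains(SAMPLE, "Russian"))    # rule entered without quotes
print(disposition_contains(SAMPLE, '"Russian"'))  # quotes typed into the rule
```

If the quotation marks are taken literally as part of the text to find, the second form can never match, because the header itself contains no quote characters.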

What about KOI8-U & Windows-1251? Are you trapping for those charsets as well? Firefox actually mentions many more, but I don't see them very much in e-mail.

I observe that Windows-1251 as well as KOI8-R trigger 'Blocked Russian' in SC Mail. I found no examples of KOI8-U so I can't say for that.

Windows-1251 example

Here is your TRACKING URL

http://www.spamcop.net/sc?id=z3053524212z6...31fa4f6bac34c7z

Oops, I was just trying to clarify. The '"' don't appear in the filter.

You might also try piggy-backing on the webmail "Blocked Russian" feature

Filter Self-defined header/ X-SpamCop-Disposition contains Russian

Thanks - I removed all the ""'s and included that parameter too.

I observe that Windows-1251 as well as KOI8-R trigger 'Blocked Russian' in SC Mail. I found no examples of KOI8-U so I can't say for that.

Windows-1251 example

Thanks, I've done a bit of housecleaning in my filters and will see how all this pans out for a while. I even created a special folder to dump all these into, if they work. We shall see..... B)

Sorry to revive this thread. SC Webmail is catching all this Cyrillic spam OK but, as I'm getting fed up with reporting or deleting it all the time, I set a filter to move it all to the trash, and it isn't doing that.

Wondered what I'm doing wrong, or is that not possible?

I've entered a filter for "koi8-r" under the various parameters mentioned elsewhere and applied the rule.

I am having the same problem and have employed the same countermeasure but it is having no effect. Approximately 90% of my Held Mail is the Russian spam, characterized by Cyrillic characters in the subject and text. While it's good that Spamcop is catching all this and putting it in Held Mail, it's a time waster because I still have to sift through Held Mail to ensure no legitimate emails have been trapped. Why isn't there a simpler fix for paying customers of Spamcop?

Thanks!

I am having the same problem and have employed the same countermeasure but it is having no effect. Approximately 90% of my Held Mail is the Russian spam, characterized by Cyrillic characters in the subject and text. While it's good that Spamcop is catching all this and putting it in Held Mail, it's a time waster because I still have to sift through Held Mail to ensure no legitimate emails have been trapped. Why isn't there a simpler fix for paying customers of Spamcop?

I used the Problem button in the SC Mail interface to ask why, under SpamCop Options, Blacklists, we can't have an extra choice for Cyrillic spam of sending it to the trash rather than to Held Mail, as reporting it seems to be a totally futile endeavour. I said that Cyrillic spam is reaching pandemic proportions and the logical way would be for them to nip it in the bud, so to speak.

I doubt it will be adopted but at least I tried.

I've deleted all my filter rules now as not one of them worked. That part seems to be a major let-down in an otherwise good product. They really need to make the mail side of it function better.

Approximately 90% of my Held Mail is the Russian spam, characterized by Cyrillic characters in the subject and text. While it's good that Spamcop is catching all this and putting it in Held Mail, it's a time waster because I still have to sift through Held Mail to ensure no legitimate emails have been trapped. Why isn't there a simpler fix for paying customers of Spamcop?

This isn't a problem that affects me since in the last year Cyrillic spam is down from 100 to 12 a month.

However, besides filters, Spamcop Mail (Horde) also provides the Search facility.

Do a Search of the Held folder for 'Entire Message contains "Blocked Russian"'; you can then click 'select all' on the search result, then 'report as spam'.

If you save that Search by name as a virtual folder you can get rid of all the Cyrillic spam with only a few more keystrokes than using a webmail filter.

I currently have two such folders defined.

One (SA0-5) which selects all the items with small SpamAssassin scores or SA missing. This will include all the Blocked /blocklist/ and Blocked Russian emails unless excluded and is my folder to eyeball for "ensure no legitimate emails have been trapped".

This (SA0-5) Search list is of the form:-

Entire message contains hits=-

OR Entire message contains hits=0.

OR hits=1. =2. =3. =4. =5.

OR Entire message hits= NOT Found {eg SA text not present}

add if you wish, something like

AND Entire Message contains Blocked Russian NOT

The second (NOTSA0-5) is the reverse and is used for /select all/ /report As spam/

Entire message contains hits=- NOT

AND Entire message contains hits=0. NOT

AND Entire message contains hits=1. NOT

AND {same for =2. =3. =4. =5.}

add if you wish something like

OR Entire Message contains Blocked Russian

Hope this or the ideas will help you.
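
For readers who find the two search lists easier to follow as code, here is a sketch in Python of the logic as written above. It treats each message as one text string and each search line as a plain substring test; how Horde actually groups the AND/OR terms is an assumption, and the function names are made up for this post.

```python
# "hits=-" plus "hits=0." through "hits=5.", as in the lists above
LOW_SCORE_MARKERS = ["hits=-"] + [f"hits={n}." for n in range(6)]


def in_sa0_5(text: str, exclude_blocked_russian: bool = False) -> bool:
    """First list: low SpamAssassin score, or no SA text at all."""
    low = any(m in text for m in LOW_SCORE_MARKERS) or "hits=" not in text
    if exclude_blocked_russian:
        # optional: AND Entire Message contains Blocked Russian NOT
        low = low and "Blocked Russian" not in text
    return low


def in_notsa0_5(text: str, include_blocked_russian: bool = False) -> bool:
    """Second list: none of the low-score markers is present."""
    high = all(m not in text for m in LOW_SCORE_MARKERS)
    if include_blocked_russian:
        # optional: OR Entire Message contains Blocked Russian
        high = high or "Blocked Russian" in text
    return high


for sample in ["hits=1.3", "hits=12.4", "hits=-0.5", "no SA text here"]:
    print(sample, in_sa0_5(sample), in_notsa0_5(sample))
```

One thing this makes visible: as written, the second list has no test for a missing hits= string, so a message with no SpamAssassin text at all satisfies both lists. That is a reading of the lists above, not confirmed webmail behaviour.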

Interesting, thanks for the hints.

The strangest thing. Looked at my held mail today...about 8 Cyrillic spams. Realised that I had entered the Blocked Russian filter text with "" around it. Removed those and the Cyrillic spam vanished. Not in the virtual box or in the trash.....I guess I'll never know what happened to them.
