Recent spams using quoted-printable to duck parsing

OsakaWebbie · October 16, 2006

The last couple of days I have suddenly started getting a particular type of spam in significant numbers, which seems to be successfully ducking Spamcop's parsing of the message body (and therefore the various domains in the links) by making normal text sections of the message quoted-printable. There are both text/plain and text/html portions, which advertise the typical drugs for men, with links to a URL that changes domain every couple of messages or so. But even though both parts are perfectly readable (not gibberish denoting binary data), they are designated as "quoted-printable" and there is a "3D" before the href URL in the links, which apparently also has something to do with quoted-printable. In Thunderbird nothing at all appears in the message window (I originally thought there was no message body until I looked at the source), and when I submit them to Spamcop (using the web form, single entry of whole message source), it first announces that it has removed "quoted-printable artifacts" and then proceeds to complain that it can't parse the head (then lecturing me about the need for a full, accurate copy of the message). I don't really know how the spammer expects the average user to read their ad when it isn't displayed by email software, but they seem to have found a Spamcop loophole. Is there a way I can get Spamcop to not throw away the message body thinking it's binary junk?

I have already reported everything that has come in at the moment (and didn't record any tracking IDs), so just to give you something to look at, I put one in again for processing but didn't report it. The tracking ID is z1106533037z982cb26b627957218080eef2c706d232z.

Moderator Edit to include Tracking URL: http://www.spamcop.net/sc?id=z1106533037z9...80eef2c706d232z
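
For context on the "3D" mentioned above: in quoted-printable, a literal "=" is written as "=3D" (its hex code), and a "=" at the very end of a line marks a soft line break. A minimal Python sketch using the standard quopri module shows the decoding. The sample bytes and the decode_qp name are made up for illustration and are not taken from the spam in question.

```python
import quopri


def decode_qp(raw: bytes) -> bytes:
    """Undo quoted-printable encoding: '=3D' becomes '=', soft breaks are joined."""
    return quopri.decodestring(raw)


# Hypothetical fragment of a quoted-printable HTML part.
# The '=' followed by a newline is a soft line break, not part of the text.
encoded = b'<a href=3D"http://example.com/offer">Click he=\nre</a>'

print(decode_qp(encoded).decode("ascii"))
```

So an href that starts with 3D is normal for a part declared as quoted-printable; it only looks odd in the raw source.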

MikeRG · October 16, 2006

by making normal text sections of the message quoted-printable.

and there is a "3D" before the href URL in the links, which apparently also has something to do with quoted-printable.

I am also getting this type of mail but am sending submissions by email. I am not getting any error mails from Spamcop, but I think that is normal for mail submissions.

I have just tested a web form submission: after I removed the offending Content-Transfer-Encoding: quoted-printable lines from the spam, it was accepted and parsed with no errors.

There were some lines that had the 3D denotation but I left them as they were.

I hope this helps

Mike
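
The parts can also be inspected read-only, without deleting any Content-Transfer-Encoding lines. The sketch below uses Python's standard email package on a made-up, well-formed message to show what each part looks like once the quoted-printable encoding is undone. SAMPLE and describe_parts are illustrative names only, and this says nothing about how SpamCop's own parser works.

```python
from email import message_from_string

# Hypothetical two-part message; the header/body blank line is present here.
SAMPLE = (
    "From: sender@example.com\n"
    "Subject: test\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    'Content-Type: text/plain; charset="koi8-r"\n'
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "Visit http://example.com/?a=3Db\n"
    "--XYZ\n"
    'Content-Type: text/html; charset="koi8-r"\n'
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    '<a href=3D"http://example.com/">link</a>\n'
    "--XYZ--\n"
)


def describe_parts(source):
    """Return (content type, transfer encoding, decoded text) for each leaf part."""
    msg = message_from_string(source)
    rows = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # skip the multipart container itself
        payload = part.get_payload(decode=True)  # undoes quoted-printable/base64
        charset = part.get_content_charset() or "ascii"
        text = payload.decode(charset, errors="replace").strip()
        rows.append((part.get_content_type(),
                     part.get("Content-Transfer-Encoding"),
                     text))
    return rows


for row in describe_parts(SAMPLE):
    print(row)
```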

Wazoo · October 16, 2006

I have already reported everything that has come in at the moment (and didn't record any tracking IDs), so just to give you something to look at, I put one in again for processing but didn't report it. The tracking ID is z1106533037z982cb26b627957218080eef2c706d232z.
Moderator Edit to include Tracking URL: http://www.spamcop.net/sc?id=z1106533037z9...80eef2c706d232z

That example;

Missing blank line between header and body

Any idea where this line comes from? -> From - Mon Oct 16 22:52:58 2006

Note the MIME content description of charset="koi8-r" .. yet no cyrillic text included

Out of time here ....
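
The charset observation can be checked mechanically. A rough sketch, assuming the raw bytes of a part are already in hand (contains_cyrillic is a hypothetical helper, and the sample inputs are invented):

```python
def contains_cyrillic(raw: bytes, charset: str = "koi8-r") -> bool:
    """True if decoding raw with the given charset yields any Cyrillic letters."""
    text = raw.decode(charset, errors="replace")
    # U+0400..U+04FF is the basic Cyrillic block in Unicode.
    return any("\u0400" <= ch <= "\u04ff" for ch in text)


print(contains_cyrillic(b"Cheap pills here"))           # plain ASCII text
print(contains_cyrillic("Привет".encode("koi8-r")))     # real KOI8-R bytes
```

A part that declares koi8-r but contains only ASCII decodes to the same text under almost any charset, so the declaration by itself says little about the content.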

Farelf · October 17, 2006

Sounds like another rash of poorly constructed spam - though sometimes the problem seems to be the email application. Interesting that the parser has the facility for "Removing quoted-printable artifacts" (I had forgotten about that, if I ever knew of it) - I assume the term "artifact" means SC has decided some mangling occurs en route. As Wazoo said, the error "couldn't parse head" comes from the lack of the mandatory blank line between headers and body. Whether this is caused by the parser's "removal of artifacts" is not entirely clear, but MikeRG's helpful bit of testing suggests it may be. If so, the "pow'rs above" need to know there is a bug to be addressed. They will need copies both of the unsubmitted spam and of the resultant parse if this needs to be passed on.

Just for the record - nobody should be tempted to "help the parser" in the interim - note Material changes to spam are essentially forbidden. Use the parse for data for manual reports by all means, just don't submit SC reports on altered data, even if you think it is SC's "fault". "Integrity is tantamount," if you will.

OsakaWebbie · October 17, 2006

Just for the record - nobody should be tempted to "help the parser" in the interim - note Material changes to spam are essentially forbidden. Use the parse for data for manual reports by all means, just don't submit SC reports on altered data, even if you think it is SC's "fault". "Integrity is tantamount," if you will.

Agreed - I think MikeRG's suggestion is too much like a material change. But a different question: if the problem turns out to be the lack of a blank line between the header and the body, would it be okay to add that? I have three more spams of the same type waiting to be submitted, and they all are the same structure - including the missing blank line.

Farelf · October 17, 2006

... would it be okay to add that? (blank line) I have three more spams of the same type waiting to be submitted, and they all are the same structure - including the missing blank line.

If it is missing in the original spam source no, nobody could properly advise you to do that. If it is missing as a result of parser action - (possibly) still no but that could be fixed (in the fullness of time, subject to maintenance and development priorities of which "we" know nothing). But it doesn't sound like the parser is doing it because you are seeing the absence of the blank line before submission.

The trouble is, we are talking about the parser going after spamvertized links in the body, which is not its primary task. In "your" flavor of spam you say these URLs keep changing in any event. Some say sending reports to the hosts in such (or indeed, most) cases is of doubtful value in the scheme of things. I wouldn't agree with such a generalized assessment myself, I do think it is often worthwhile, but it comes back to a matter of SC priorities. If there is a bug, that should get some attention at least, but the refusal to treat a body as a body when it is not compliantly separated from the headers by a blank line** is not a bug, and history shows SC is not inclined to follow the popular email applications outside of/beyond compliance.

[** RFC 822, section 3.1]
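
The blank-line rule can be illustrated with a small check. This is only a sketch of the idea, not SpamCop's parser: find_body_start and HEADER_LINE are hypothetical names, and the heuristic is deliberately crude (a body line that happens to look like "word:" would be mistaken for a header).

```python
import re

# A header field name: printable ASCII except space and colon, then a colon.
HEADER_LINE = re.compile(r"^[!-9;-~]+:")


def find_body_start(source):
    """Classify a raw message by how its header block ends.

    Returns ("ok", n) when a blank line ends the headers (body starts at line n),
    ("missing-blank-line", n) when line n is neither a header, a folded
    continuation, nor blank, and ("no-body", None) when only headers are present.
    """
    for i, line in enumerate(source.splitlines()):
        if line == "":
            return ("ok", i + 1)
        if line[0] in " \t" or HEADER_LINE.match(line):
            continue  # header line or folded continuation
        return ("missing-blank-line", i)
    return ("no-body", None)


good = "From: a@example.com\r\nSubject: hi\r\n\r\nHello"
bad = "From: a@example.com\r\nSubject: hi\r\nHello there\r\n"
print(find_body_start(good))
print(find_body_start(bad))
```

A check like this only reports the problem; it does not add the missing line, in keeping with the rule against material changes.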

OsakaWebbie · October 17, 2006

The trouble is, we are talking about the parser going after spamvertized links in the body, which is not its primary task. In "your" flavor of spam you say these URLs keep changing in any event. Some say sending reports to the hosts in such (or indeed, most) cases is of doubtful value in the scheme of things. I wouldn't agree with such a generalized assessment myself, I do think it is often worthwhile, but it comes back to a matter of SC priorities. If there is a bug, that should get some attention at least, but the refusal to treat a body as a body when it is not compliantly separated from the headers by a blank line** is not a bug, and history shows SC is not inclined to follow the popular email applications outside of/beyond compliance.

So that means that if my email software doesn't display the ad links, SC won't consider them valid to trace, right?

I'm surprised at your statement that the spamvertized links are not a primary target. Yes, the URLs in "my" spams are different every couple emails or so, but the source addresses are even more random - they look like hacked machines or freemail addresses that will probably not be used twice. So I would think the most likely place to report would be the registrar(s) of the domains being used for the junk when people actually fall for the ad and click the links. Am I being naive?

Anyway, I won't add the blank line - yes, it is definitely in the original. Maybe the spammer will eventually figure out that without proper formatting their gullible potential customers can't see anything anyway, so if they want any business they'll have to fix the problem. Meanwhile I got four more since I last wrote on this forum - all with different subject lines and senders, but the same type of ad buried in the source but not visible in the email software. In fact, for the last 48 hours, on the one account that is getting these, I think that is the only type of spam I have gotten. Weird!

agsteele · October 17, 2006

I'm surprised at your statement that the spamvertized links are not a primary target.

SpamCop was established to identify the source IP addresses of spam so that straightforward DNS filtering/blocking could be implemented.

The parser subsequently also gained the ability to identify spamvertised URLs in the content. But this was not and is not the primary use of the parser. Indeed, spamvertised URLs only generate an email to the ISP concerned, letting them know that someone is using their IP space to host something that has been promoted in spam. However, many links are bogus. Many are for legitimate organisations and are included by spammers to generate false reports and to confuse spam filters into thinking a message is not spam.

So if a message content is not parsed or a spamvertised URL is not detected it really isn't a great loss. Identifying the source IP is the primary purpose of the SCBL parser.

Andrew
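
To make the "DNS filtering/blocking" part concrete: a DNS blocklist is queried by reversing the octets of the source IP and appending the list's zone, which for the SCBL is bl.spamcop.net. The sketch below only builds the query name and performs no network lookup. dnsbl_query_name is a hypothetical helper, and 192.0.2.1 is a documentation address, not a real spam source.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str = "bl.spamcop.net") -> str:
    """Build the DNS name a mail server would look up to test an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


print(dnsbl_query_name("192.0.2.1"))
```

A mail server would then resolve that name; an answer means the address is listed, and a "no such name" reply means it is not. That is why the source IP, rather than any link in the body, is what the blocklist depends on.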

Farelf · October 17, 2006

So that means that if my email software doesn't display the ad links, SC won't consider them valid to trace, right?

Lots of variables in the behavior of various software IIUC. The specific instance of the missing break between headers and body is a definite "killer", always has been and, when this malformed stuff is running hot (as it seems to from time to time), many wish it were not.

...So I would think the most likely place to report would be the registrar(s) of the domains being used for the junk when people actually fall for the ad and click the links. Am I being naive?

No, you are right on the money, that would be a very effective process in essence. But it's not what spamcop does. If you haven't caught up with it already, you might like to work through Botnet scenario (link) which is an approach to taking these pests on.

turetzsr · October 18, 2006

... would it be okay to add that? (blank line) I have three more spams of the same type waiting to be submitted, and they all are the same structure - including the missing blank line.
If it is missing in the original spam source no, nobody could properly advise you to do that.
<snip>

...But see my reply in thread "Body found within headers" and subsequent discussion on how you can do the equivalent of adding a blank line between the headers and body.

karlisma · October 24, 2006

However, many links are bogus. Many are for legitimate organisations and included by spammers to generate false reports and to confuse spam filters into thinking a message is not spam.

How many of them, and how do you know? Any statistics?

Wazoo · October 24, 2006

How many of them, and how do you know? Any statistics?

I really don't understand the question .... but am thinking that the question was asked because the previous reply wasn't understood either ...???

Example .. stock spam referencing Yahoo pages that show charts of some sort ....

... get rich scheme that references BBC, CNN, etc. sites with the "as seen on" .... usually ignoring the fact that if one could actually find the page, it was about folks getting screwed ....

... free e-mail sources that include the viral-marketing footer links and data ...

.. worst case example, the spam that intentionally contains a ton-load of bogus links (or those that were killed off yesterday due to spam activity) ...

on and on ... why do you need 'evidence' when so much spam out there contains this kind of garbage ...???

karlisma · October 24, 2006

on and on ... why do you need 'evidence' when so much spam out there contains this kind of garbage ...???

Because I see those innocent bystanders only once in a while, let's say 1 in 100 parsed links. Therefore that is not a reason to drop the parsing, tracing and reporting process.

"So much" ...just how?

Farelf · October 24, 2006

..."So much" ...just how?

There's been an upsurge recently of the "too many links" type, which from memory contain 35+ innocent links. You haven't seen these? I could probably dig one out of my "history" but that's a drag. Probably 2 or 3 of them (no more) in the last 50 or so reports. Haven't seen any bank phishes lately but when they're running they generally have real bank links to make them look more authentic. Some of the Nigerian 419 scams carry links to on-line news reports and so on. Not many of those lately either but the thing is they come and go - and come again.

Bottom line, innocent links are not altogether rare and they get more or less frequent over time. It may be a matter of your "flavor". Above comments relate to the spam I see at home. At work I wouldn't see one innocent link in many hundreds of spam. Whether that's a function of the lists my different addresses are on, or of ISP or work server (supposed) filtering I don't know.

karlisma · October 24, 2006

Probably 2 or 3 of them (no more) in the last 50 or so reports.

So, is that enough to make it an excuse for not tracking spamvertized links at all?

Farelf · October 24, 2006

So, is that enough to make it an excuse for not tracking spamvertized links at all?

Not at all. The point was that URLs are not a SC priority (the SCBL is the priority, the "product" of the enterprise) and reporters need to be aware that some of the links are innocent.

Despite spamvertized links not being a priority the parser was "improved" to handle far more of them than it did a couple of years ago before hitting the "too many links". I shudder to think what this "costs" in terms of resource in an environment where the load is over a million messages a day and spammers and hackers of all kinds are constantly trying to overload or cripple the system.

Looked at objectively, SC has bent over backwards to accommodate the "hobby" of those reporters who wish to send reports to/LART the spammers' web hosts. Sure, the spammers go to such pains to protect those sites that it seems worthwhile to give them a hard time. Enter the era of botnet hosting or whatever it is, and a whole new technique is needed to do this effectively (ref TerryNZ's Registrar / Nameserver compliance request method).

This is not just a different world (to the SCBL), it is a different universe. It makes the tacked-on "SC report of spamvertizing" model look insignificant by comparison, I think. But frankly, I don't quite see how it could be integrated into SC as a commercial proposition. Maybe Mr Bill Gates would like to fund a beneficent foundation to support the infrastructure - no, he's going to do "caller ID" for us instead. Maybe the "business community" will get sick enough of the erosion of the internet that they will do it. Can't see the myrmidons of spamdom taking it lying down though. They would evolve, get meaner and/or sneakier. And so, then on to the next era (every fiber of me is aquiver in anticipation).

Now, what is it you want?

Wazoo · October 24, 2006

So, is that enough to make it an excuse for not tracking spamvertized links at all?

Where is this coming from? There is no "refusal to track spamvertised links" anywhere in the game plan. What is being discussed here (and in countless hundreds of other similar Topics, even a FAQ entry or two) is the reasoning behind a 'failure to resolve' a particular spamvertised URL in a particular spam submittal .. and those reasons are based on all kinds of variables ....

rooster · October 25, 2006

Wazoo;

Any idea where this line comes from? -> From - Mon Oct 16 22:52:58 2006
Out of time here ....

I didn't scour the whole thread; just scanned. Someone might have mentioned it, but that line is the way Thunderbird commences its headers. It throws off Sam Spade's parser too ("I don't recognize that header"), assuming that is why you posed the question. If not, just chalk it up to another of my "Duhs".
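
If that first line is an mbox-style separator, as suggested here, it can be told apart from a real From: header because it is "From" followed by a space rather than a colon. A minimal sketch with a hypothetical helper name; it detects the line and does not strip or alter anything:

```python
def starts_with_mbox_separator(source: str) -> bool:
    """True if the first line looks like an mbox 'From ' separator line."""
    first_line = source.split("\n", 1)[0]
    # A genuine header would be 'From:' (colon); the separator is 'From ' (space).
    return first_line.startswith("From ")


sample = "From - Mon Oct 16 22:52:58 2006\nReceived: from mail.example.com\n"
print(starts_with_mbox_separator(sample))
print(starts_with_mbox_separator("From: someone@example.com\nSubject: hi\n"))
```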

Wazoo · October 25, 2006

I didn't scour the whole thread; just scanned. Someone might have mentioned it, but that line is the way Thunderbird commences its headers. It throws off Sam Spade's parser too ("I don't recognize that header"), assuming that is why you posed the question. If not, just chalk it up to another of my "Duhs".

Wow, that really sucks a lot .... wonder why it's not come up before .....

rooster · October 25, 2006

Wow, that really sucks a lot .... wonder why it's not come up before .....

I have no idea. The SC parser hasn't identified this header line as a problem except for one or two instances (out of 8-9 thousand) and that was way back when. If I'm not mistaken, this last (top) line is actually added by my ISP... and believe me, it has caused and is causing me no end of grief and frustration.

(I tried to enlist your "great throbbing brain" not long ago on this, ... but I don't think it really is a SC issue ... maybe in the ng... Mozilla longheads and MS MVPs have come a cropper trying to figure it out, and my ISP ... well; if you think I'm a bit thick...)

Since the SC Parser isn't complaining (see URL below), it appeared to me it got configured to accommodate the idiosyncrasy, whereas Sam Spade has not been brought up to speed. Just my guess, but SS appears to be a year or so outdated. I only started using it recently ... you know, "exploring".

http://www.spamcop.net/sc?id=z1115758688z3...929c27733f1f57z

Archived