
Spams with all-whitespace body


studog


Today I had two spams that had bodies that were all whitespace. This meant that the SpamSource Outlook extension I use didn't give me the "no body" error, as there was a body. However, SpamCop's parser seems to strip off all trailing space, producing the "No body" error.

I believe that is a bug in the parser.

http://www.spamcop.net/sc?id=z870673060zb4...9e943eb81236f8z

...Stu


...My guess is that it is not a bug but the parser just treating the whitespace as a header-body separator.

40128[/snapback]

Which is the bug. The header/body delimiter is exactly one blank line. Any whitespace after that constitutes the body. The parser does not handle this correctly.
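The rule described here can be sketched in a few lines of Python. This is only an illustration of the "first empty line ends the header" reading, not SpamCop's actual parser; the names `split_message` and `has_body` are made up for the example.

```python
def split_message(raw: str):
    """Split a raw message into (header, body) at the first empty line.

    Returns body=None when there is no separator line at all.
    """
    # Normalise CRLF to LF so the sketch works on either line ending.
    text = raw.replace("\r\n", "\n")
    # The first empty line shows up as two consecutive newlines.
    header, sep, body = text.partition("\n\n")
    if not sep:
        return text, None
    return header, body


def has_body(raw: str) -> bool:
    """True when anything at all, including whitespace, follows the separator."""
    _, body = split_message(raw)
    return bool(body)


# Hypothetical spam: two headers, the separator, then whitespace-only lines.
spam = "From: a@example.com\r\nSubject: hi\r\n\r\n   \r\n\t\r\n"
print(has_body(spam))
```

Under this reading, a message whose body is only spaces, tabs and line breaks still counts as having a body, while a message that ends right after the separator does not.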

...Just edit the spam body to include something like <empty body> and re-submit.

40128[/snapback]

Yes, that's what I do. But I shouldn't have to.

...Stu


... I shouldn't have to...

40227[/snapback]

Maybe not, but this is a well documented "feature", ref http://forum.spamcop.net/forums/index.php?...indpost&p=33422

- hardly a "help" item any more, if this is the way you're heading. Perhaps you would like this moved to "new feature request"?

There might be a problem with default acceptance of no body email though. I should think the parser treats attempted no body parses as submission errors because at least some of them are (mangled somewhere along the track, could be missing part of the headers even, how is the parser to know?) - but not in your instances, obviously. Okay, "whitespace" is not the same as nothing, but the insertion of additional blank lines is a known "mangle" so it would possibly not be safe to have the parser make assumptions.


Maybe not, but this is a well documented "feature", ref http://forum.spamcop.net/forums/index.php?...indpost&p=33422

40229[/snapback]

Yes, that part is a feature. That's for spams that *actually have no body*. My spams have a body.

- hardly a "help" item any more, if this is the way you're heading.  Perhaps you would like this moved to "new feature request"?

40229[/snapback]

No, this is a bug AFAIK.

- but not in your instances, obviously.  Okay, "whitespace" is not the same as nothing, but the insertion of additional blank lines is a known "mangle" so it would possibly not be safe to have the parser make assumptions.

40229[/snapback]

Can you elucidate on the "mangle"?

I suppose that insertion of blank lines into the headers themselves would cause the remaining headers to be treated as body text and not be parsed properly. However, if the parser is attempting to detect that, it would need to be able to distinguish real bodies from mangled-headers bodies, and I don't think that's possible (strikes me as a variant of the halting problem).

What it looks like to me is that somewhere (hadn't considered this, but it could be the browser stripping off the whitespace before sending to the server), all trailing whitespace is stripped, making the parser conclude incorrectly that there is no body.
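To make that suspicion concrete, here is a small Python sketch of how trimming trailing whitespace before looking for the body would produce a "no body" result. It is a guess at the failure mode, not a description of SpamCop's code; both function names are hypothetical.

```python
def body_after_strip(raw: str) -> str:
    """Hypothetical buggy path: trim the submission, then look for a body."""
    # Trimming first removes the whitespace-only body AND the separator line.
    text = raw.replace("\r\n", "\n").rstrip()
    _, _, body = text.partition("\n\n")
    return body


def body_without_strip(raw: str) -> str:
    """Same lookup, but the submission is left untrimmed."""
    text = raw.replace("\r\n", "\n")
    _, _, body = text.partition("\n\n")
    return body


# Hypothetical spam whose body is two lines containing one space each.
spam = "From: a@example.com\r\nSubject: hi\r\n\r\n \r\n \r\n"
print(repr(body_after_strip(spam)))
print(repr(body_without_strip(spam)))
```

In the first function nothing is left after the headers, so a check for an empty body would fire even though the submitted text had one.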

...Stu


What it looks like to me is that somewhere (hadn't considered this, but it could be the browser stripping off the whitespace before sending to the server), all trailing whitespace is stripped, making the parser conclude incorrectly that there is no body.

40240[/snapback]

Just did a quick check, my browser (Opera 8.51) is sending all the trailing whitespace in the POST. So it's a bug somewhere on the server side.

...Stu


Which is the bug. The header/body delimiter is exactly one blank line. Any whitespace after that constitutes the body. The parser does not handle this correctly.

Yes, that's what I do. But I shouldn't have to.

...Stu

40227[/snapback]

While I don't think this is worth devoting time to when there are much larger problems out there, Stu is correct that the parser should interpret the first blank line as the separator and everything after it as the body, as described in RFC 2822:

A message consists of header fields (collectively called "the header of the message") followed, optionally, by a body.  The header is a sequence of lines of characters with special syntax as defined in this standard. The body is simply a sequence of characters that follows the header and is separated from the header by an empty line (i.e., a line with nothing preceding the CRLF).

...Can you elucidate on the "mangle"? ...

40240[/snapback]

Well, f'rinstance Rod had a spell when a blank line seemed to be getting shoved into his headers - http://forum.spamcop.net/forums/index.php?...indpost&p=33946 which caused the rest of the headers to be treated as body, leading to some unusual results. That's a mangle.
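The effect of that kind of mangle can be reproduced with Python's standard `email` module, which (like any RFC 2822 reader) ends the header block at the first blank line. The two messages below are invented for illustration and are not taken from the linked post.

```python
from email import message_from_string

# Invented sample with the headers intact.
intact = "From: a@example.com\nSubject: hi\nTo: b@example.com\n\nBuy now\n"
# Same message with a stray blank line inserted after the first header.
mangled = "From: a@example.com\n\nSubject: hi\nTo: b@example.com\n\nBuy now\n"

good = message_from_string(intact)
bad = message_from_string(mangled)

print(good["Subject"])          # hi
print(bad["Subject"])           # None: the header was never seen as a header
print(repr(bad.get_payload()))  # the remaining headers now sit in the body
```

From the text alone, nothing marks the second message as damaged rather than simply having a body that happens to look like headers.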

More to the point, I would bet that much of the "no body" spam commented on over the years has actually had more than one blank line after the headers (just received one such myself). No body ones (however defined) are very much the minority in my experience - though filtering & discarding of "regular" spam might make it seem otherwise. You are right if you are saying that a blank line is not nothing but as to whether the parser ought to treat n-1 blank lines as body ... ask away for a fix by all means, maybe there are no problems in fixing it - but we've managed without a fix for a long time and if there is the possibility of adverse consequences then I for one could easily go a bit longer.


Just a thought: have you ever asked yourself why they (spammers) try to trip up your tool, if you (according to the philosophy spread here) think that reporting the spamvertised site is soooo seeeeecondary that you sometimes don't even bother or care whether the spamvertisement parses or finds a report address? If this reporting was useless, then they wouldn't hide themselves with different tricks, sometimes so hidden, becoming unreadable by anything.


If this reporting was useless, then they wouldn't hide themselves with different tricks, sometimes so hidden, becoming unreadable by anything.

I think there are three answers to that question. One is that the SpamCop parser will give you a reporting address for the easy ones, which are probably the ones that have slipped through the ISPs' defenses.

The second is that while reporting spamvertised sites is important to responsible ISPs, it is still not the primary focus of the SpamCop tool. It is far too complicated to keep up with all the 'tricks' unless there are reporters who do the legwork and, AFAIK, SpamCop will add reporting addresses for some of the spamvertised sites.

And the third is that while attacking on several fronts is always a good idea, the primary point of reporting is to stop the spam - either because the server admin fixes the problem or by blocking until they do. If no one ever received an email about a spam site, then they would be on an equal footing with all other sites.

And also, IMHO, a certain number of those obfuscated sites are not trying to sell anything, but just evade any filters. It's only a game.

Miss Betsy


More to the point, I would bet that much of the "no body" spam commented on over the years has actually had more than one blank line after the headers (just received one such myself).  No body ones (however defined) are very much the minority in my experience - though filtering & discarding of "regular" spam might make it seem

40251[/snapback]

My experience is exactly the opposite; I know this factually because I report all spam, and previously when I've added the "no body" string it's been right where the body should be. These new (to me) ones have hordes of whitespace, and the string goes in way at the bottom of that. So, all my previous no bodies have had no body. The whitespace-body is new to me.

otherwise.  You are right if you are saying that a blank line is not nothing but as to whether the parser ought to treat n-1 blank lines as body ...  ask away for a fix by all means, maybe there are no problems in fixing it - but we've managed without a fix for a long time and if there is the possibility of adverse consequences then I  for one could easily go a bit longer.

40251[/snapback]

It's an inconvenience to have to manually type in the no body string, especially when there clearly is a body.
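The manual workaround could in principle be scripted on the reporter's side. The sketch below appends the `<empty body>` marker suggested earlier in the thread whenever the body is missing or whitespace-only; `ensure_visible_body` is a hypothetical helper, and whether SpamCop treats the marker any differently when it is added this way is an assumption.

```python
PLACEHOLDER = "<empty body>"  # marker text suggested earlier in this thread


def ensure_visible_body(raw: str, placeholder: str = PLACEHOLDER) -> str:
    """Append a visible marker when the body is missing or only whitespace."""
    text = raw.replace("\r\n", "\n")
    header, _, body = text.partition("\n\n")
    if body.strip():
        # The body has real content, so return the submission unchanged.
        return raw
    # Keep the original whitespace and put the marker underneath it.
    return header.rstrip("\n") + "\n\n" + body + placeholder + "\n"


# Hypothetical whitespace-only spam.
spam = "Subject: hi\r\n\r\n  \r\n\r\n"
print(repr(ensure_visible_body(spam)))
```

Note that the rewritten message comes back with LF line endings, which is fine for pasting into a web form but would need adjusting for anything that expects CRLF.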

I'm of the opinion that all bugs should be reported and eventually fixed. Whether this one actually sees any action is up to the bug fixers.

...Stu


My experience is exactly the opposite; I know this factually because I report all spam, and previously when I've added the "no body" string it's been right where the body should be. These new (to me) ones have hordes of whitespace, and the string goes in way at the bottom of that. So, all my previous no bodies have had no body. The whitespace-body is new to me.

40254[/snapback]

Then we're talking about quite different experiences. I think most or maybe all of the "no body" cases I have seen had more than one blank line (but not that many more) after the headers, which would only tend to support your case for a fix. You may be the first person to make it an issue, but you certainly wouldn't be the only person to get blank spam with additional blank lines technically constituting a body. Whether these are "as sent" or, in some instances, some artifact introduced by the email application probably doesn't matter, on reflection. Maybe you want to insert a link to "new features request"?

Just a thought: have you ever asked yourself why they (spammers) try to trip up your tool, if you (according to the philosophy spread here) think that reporting the spamvertised site is soooo seeeeecondary that you sometimes don't even bother or care whether the spamvertisement parses or finds a report address? If this reporting was useless, then they wouldn't hide themselves with different tricks, sometimes so hidden, becoming unreadable by anything.

40252[/snapback]

I think the point here is there's no spamvertisement in these ones, just CR LFs - if the spammer has been successful in hiding something in that then he has hardly triumphed, has he? These are probably (in the main) "probes" or other tests or bumbled manipulation of a botnet or broken spamtools/viruses. But I like your zeal!

