
newline header problem


DontKnow


I report to SpamCop via e-mail and the original message is left untouched and it normally works great. I have a script that cats the Maildir file directly to mail to SpamCop, then runs it through spamassassin rules, and finally deletes it.

Recently a few messages have arrived with a rogue ^M (0x0D) added to a "Received: from" line. This causes SpamCop to not parse the headers correctly, and therefore SpamCop won't parse the body of the message for URLs.

Am I really expected to EDIT the e-mail so that SpamCop receives a tampered version of the e-mail? Manually removing the ^M should produce a useable e-mail, however then the message isn't what was received.

Instead I'd expect SpamCop to recognize the bad character and to ignore it appropriately.

Tracking URL of one message: http://www.spamcop.net/sc?id=z989393386z6f...eba8b03c2d9da0z

What you will notice in the tracking URL is a weird line break. Below I have copied the header as I received it and shown the ^M, which is in the last "Received: from" line.

Received: from gasconypeternixon.net [168.178.122.114^M] (helo=darwinian.vergenet.net)

by smtp7.cistron.nl with esmtp (notary)

id 3AQ6iH-9994CR-00; Sun, 02 Jul 2006 14:03:37 -0500
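To check a stored message for this kind of stray character without editing it, something like the following could be used. It is only a minimal sketch: the function name `find_bare_cr` is made up for illustration, and it assumes the header ends at the first blank line and that lines end in either LF or CRLF. It has nothing to do with how SpamCop itself parses mail.

```python
import re


def find_bare_cr(raw: bytes) -> list[tuple[int, bytes]]:
    """Return (line_number, line) pairs for header lines containing a stray CR.

    A CR that is part of a CRLF line ending is not reported.
    """
    # The header ends at the first empty line; accept LF or CRLF endings.
    match = re.search(rb"\r?\n\r?\n", raw)
    header = raw[:match.start()] if match else raw

    hits = []
    for number, line in enumerate(header.split(b"\n"), start=1):
        # Drop the CR that belongs to a CRLF line ending.
        if line.endswith(b"\r"):
            line = line[:-1]
        # Any CR still present sits in the middle of the line.
        if b"\r" in line:
            hits.append((number, line))
    return hits
```

Reading the Maildir file in binary mode (`open(path, "rb").read()`) and passing the bytes to this function leaves the message itself unmodified.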


Manually removing the ^M should produce a useable e-mail, however then the message isn't what was received.

While I see that ^M, I don't see where it is affecting the parsing. The body is defined by a blank line, which does not occur until the proper start of the body in this example. The header parse stops before that likely forged line because of your mailhost configuration.

1: Received: from gasconypeternixon.net [168.178.122.114 ] (helo=darwinian.vergenet.net) by smtp7.cistron.nl with esmtp (notary) id 3AQ6iH-9994CR-00; Sun, 02 Jul 2006 14:03:37 -0500

No unique hostname found for source: 168.178.122.114

Possible forgery. Supposed receiving system not associated with any of your mailhosts

Will not trust anything beyond this header

Have you confirmed removing that character resolves the issue? If so, I would submit both tracking URLs to the deputies.


<snip>

Am I really expected to EDIT the e-mail so that SpamCop receives a tampered version of the e-mail?

<snip>

Hi, DontKnow!

...No, exactly the contrary -- editing would violate SpamCop's rules: see SpamCop FAQ (link near top of every SpamCop Forum page) entry labeled "Material changes to spam." Thanks!


While I see that ^M, I don't see where it is affecting the parsing. The body is defined by a blank line, which does not occur until the proper start of the body in this example. The header parse stops before that likely forged line because of your mailhost configuration.

Have you confirmed removing that character resolves the issue? If so, I would submit both tracking URLs to the deputies.

I think the problem isn't the "^M" (I have many messages with ^M that parse correctly).

My buggy example (with ^M) has incorrectly wrapped long header lines: they should be indented, but the continuation starts in the first column.
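The folding rule can be checked mechanically. Below is a rough sketch, assuming the RFC 5322 convention that a continuation line starts with a space or tab and that a new field starts with a name followed directly by a colon. The function name `unindented_continuations` is hypothetical, and the check is a heuristic rather than a full header parser.

```python
import re

# A field name is one or more printable ASCII characters other than
# space and colon, followed directly by a colon.
FIELD_START = re.compile(r"^[!-9;-~]+:")


def unindented_continuations(header_text: str) -> list[int]:
    """Return 1-based numbers of header lines that are neither the start
    of a new field nor an indented continuation line."""
    bad = []
    for number, line in enumerate(header_text.split("\n"), start=1):
        line = line.rstrip("\r")  # tolerate CRLF line endings
        if not line:
            break  # blank line: end of the header
        if line[0] in " \t":
            continue  # properly folded continuation
        if FIELD_START.match(line):
            continue  # start of a new header field
        bad.append(number)
    return bad
```

Run against a header where the "by ..." and "id ..." lines start in the first column, it reports those two lines.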


I think the problem isn't the "^M" (I have many messages with ^M that parse correctly).

My buggy example (with ^M) has incorrectly wrapped long header lines: they should be indented, but the continuation starts in the first column.

That is exactly the problem. However, ^M can be (and usually is) the visual representation of Control-M, which is the control character for a carriage return (try it somewhere). If a mail system, or the application you are using to handle the source, interprets the ASCII characters as the control character, then, by definition, the headers will be messed up because the carriage return will be inserted.

The headers in the tracking URL above actually have a spamassassin-type rule picking up on the illegal characters in the headers. I don't think ^ is a legal character for headers.


Recently a few messages have arrived with a rogue ^M (0x0D) added to a "Received: from" line. This causes SpamCop to not parse the headers correctly, and therefore SpamCop won't parse the body of the message for URLs.

Tracking URL of one message: http://www.spamcop.net/sc?id=z989393386z6f...eba8b03c2d9da0z

What you will notice in the tracking URL is a weird line break. Below I have copied the header as I received it and shown the ^M, which is in the last "Received: from" line.

Received: from gasconypeternixon.net [168.178.122.114^M] (helo=darwinian.vergenet.net)

by smtp7.cistron.nl with esmtp (notary)

id 3AQ6iH-9994CR-00; Sun, 02 Jul 2006 14:03:37 -0500

Whee, I looked again at what SpamCop did. There is another place where SpamCop is causing problems. And I resubmitted the same message removing the ^M to see how SpamCop handles it.

In the beginning of the report page SpamCop says "Removing whitespace from mangled header".

The missing whitespace that SpamCop removed is on a line with a "\r" in it. In other words, something in SpamCop is parsing the 2 character string "\r" and I assume treating it as if it were the single control character ^M.

With the ^M, I'd have expected SpamCop to know what the newlines were and to know that ^M was a whitespace character and to ignore it.

With a "\r" SpamCop should be treating that the same as any two regular characters and ignoring that \r is an escape sequence. Or are there cases in headers where escapes have to be parsed?
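To make the distinction concrete, here is a small sketch that counts the two cases separately in a raw header line. The name `describe_cr` is invented for this example, and it says nothing about what SpamCop actually does internally.

```python
CONTROL_CR = b"\r"             # one byte: 0x0D, the control character ^M
LITERAL_BACKSLASH_R = b"\\r"   # two bytes: 0x5C 0x72, a backslash and an "r"


def describe_cr(line: bytes) -> dict:
    """Count real carriage returns and literal backslash-r pairs in a line."""
    return {
        "control_cr": line.count(CONTROL_CR),
        "literal_backslash_r": line.count(LITERAL_BACKSLASH_R),
    }
```

A parser that leaves escape sequences alone would see only the second kind in a line containing the two-character text \r, and no carriage return at all.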

If I remove the ^M character, SpamCop finds the url in the body, but it still reports that the header was mangled. See http://www.spamcop.net/sc?id=z992631882z04...e1d3fe7431b7f0z

If I remove the header with the \r SpamCop doesn't tell me that it is removing whitespace, but it still tells me "error: couldn't parse head" See http://www.spamcop.net/sc?id=z992633019zcd...809a730c7640f5z

If I convert the ^M and \r to spaces, SpamCop can't even get through the headers: http://www.spamcop.net/sc?id=z992638121z42...779b81ade79399z

Therefore I see three bugs in SpamCop:

1) That the two-character string \r is treated as the single character ^M, and that SpamCop removes the whitespace that marks the line as an indented continuation. This is just a minor problem and appears to only affect the display of the output and not the actual parsing.

2) That SpamCop does not recognize ^M as whitespace, but instead treats it as a newline character. SpamCop should be detecting that newlines are ^J (or ^M^J) and recognizing that something else isn't a newline.

3) That any whitespace or other illegal formatting can be added to forged received headers to confuse SpamCop. If the received header is going to be ignored because it is forged, illegal formatting shouldn't cause SpamCop to not parse the body of the message.
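What I mean by point 3 can be sketched as follows. This is only an illustration under my own assumptions: the body starts after the first blank line, line endings are LF or CRLF, and a stray CR inside a header line is not treated as a line break. The function name `body_urls` and the simple URL pattern are made up for the example and are not SpamCop's code.

```python
import re

# Deliberately simple URL pattern, for illustration only.
URL = re.compile(rb"https?://[^\s<>]+")


def body_urls(raw: bytes) -> list[bytes]:
    """Return URLs found in the body, however malformed the header is."""
    # The body begins after the first empty line (LF or CRLF endings).
    # A bare CR inside a header line does not count as a line ending here.
    match = re.search(rb"\r?\n\r?\n", raw)
    if match is None:
        return []  # no blank line, so no body to scan
    body = raw[match.end():]
    return URL.findall(body)
```

With this approach the header can still be rejected for reporting purposes while the body is scanned separately.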

--DK


That is exactly the problem. However, ^M can be (and usually is) the visual representation of Control-M, which is the control character for a carriage return (try it somewhere). If a mail system, or the application you are using to handle the source, interprets the ASCII characters as the control character, then, by definition, the headers will be messed up because the carriage return will be inserted.

The headers in the tracking URL above actually have a spamassassin-type rule picking up on the illegal characters in the headers. I don't think ^ is a legal character for headers.

Whether ^M is the newline character or not depends on the operating system. On Unix only ^J is used. On MS-DOS and Windows ^M^J is used. There are systems that only use ^M.

I was using ^M as a representation of the control character, which for Unix systems (such as my server) is just whitespace. Though in this case the whitespace is in a bad location that does produce an illegal header. My point is that SpamCop should be able to go through the body of the message, even with the bad header.

The program that picked up the illegal character was actually amavisd-new (which I use to call spamassassin and clamav for virus checking.)

And yes, I am reporting spam that gets quarantined (such as this message) to SpamCop.

--DK


My point is that SpamCop should be able to go through the body of the message, even with the bad header.

SpamCop has as one of its first rules that the message received must be RFC compliant. If the headers are bad and it cannot determine where to send the source reports, there is no reason to go through the body.

Supposed receiving system not associated with any of your mailhosts

If I remove the ^M character, SpamCop finds the url in the body, but it still reports that the header was mangled. See http://www.spamcop.net/sc?id=z992631882z04...e1d3fe7431b7f0z

Possible forgery. Supposed receiving system not associated with any of your mailhosts

This message is NOT telling you the header was mangled. It is a simple warning that the server receiving the message (smtp7.cistron.nl) is not in your mailhost configuration. If this server should be receiving email for you, then you need to complete the mailhost configuration. If not, do not worry about it; it is simply a warning.

This is a proper parse (as far as I can tell) and the ^M bug should be brought to the attention of the deputies.


SpamCop has as one of its first rules that the message received must be RFC compliant. If the headers are bad and it cannot determine where to send the source reports, there is no reason to go through the body.

Supposed receiving system not associated with any of your mailhosts

Possible forgery. Supposed receiving system not associated with any of your mailhosts

This message is NOT telling you the header was mangled. It is a simple warning that the server receiving the message (smtp7.cistron.nl) is not in your mailhost configuration. If this server should be receiving email for you, then you need to complete the mailhost configuration. If not, do not worry about it; it is simply a warning.

This is a proper parse (as far as I can tell) and the ^M bug should be brought to the attention of the deputies.

The mangled header message comes from the very top of the tracking page:

SpamCop v 1.582 Copyright © 1998-2005, IronPort Systems, Inc. All rights reserved.

Removing whitespace from mangled header

Here is your TRACKING URL - it may be saved for future reference:

http://www.spamcop.net/sc?id=z992631882z04...e1d3fe7431b7f0z

Skip to Reports

Does reporting to the deputies mean going to a nntp group or posting to the suggestions forum?

--DK


Does reporting to the deputies mean going to a nntp group or posting to the suggestions forum?

SpamCop FAQ here, several Pinned items, entries in the Announcements Forum section .... contact means, methods, even addresses are listed ....


The mangled header message comes from the very top of the tracking page:

And that also is simply a warning (note orange is a warning, not an error). The parser thinks the Received: line which is part of the X-Amavis-Alert: line is a real received line which is mangled and is removing the white space in front of it. The \r also seems to be introduced by Amavis. It is strange that while it is removing the white space, it then ignores the resulting received line in the parse.

Does reporting to the deputies mean going to a nntp group or posting to the suggestions forum?

No, it involves emailing the details to deputies[at]spamcop.net with enough information that they can handle the situation in the first pass (otherwise, it may take a long time to be resolved).

Archived

This topic is now archived and is closed to further replies.
