cannot parse header

fashon · November 1, 2007

I am using a Mac, OS 10.4.9 with Eudora in full header mode; copying the headers plus body as received into Spamcop using Safari. Lately, about half of the spam I report gives me a "cannot parse header" message. I use the same method for all reports.

I am not modifying the headers or body in any way. I compared my normal copy and paste mode with the output of the Eudora workaround using "view complete spam." The two reports were identical to my eye.

I reviewed the FAQs and forums and found nothing addressing this specific issue.

Have the spammers found a way to prevent my submissions being accepted by SpamCop? Or am I being obtuse?

The tracking URLs for two recent examples are:

http://www.spamcop.net/sc?id=z1506024443z0...e2eeba7b5162d1z

http://www.spamcop.net/sc?id=z1506028102z1...60ebcddf758cc6z

BTW: It isn't clear to me whether I should cancel or submit the report when I get this error message.

turetzsr · November 1, 2007

Hi!

...When I click on your second tracking URL (thanks for providing those!), I see the following text:

CLICK 'BACK' BUTTON TO RETURN TO SPAMCOP
################################################################################

Received: from 206.46.232.11 ([59.15.34.175])

by vms051.mailsrvcs.net (Sun Java System Messaging Server 6.2-6.01 (built Apr

3 2006)) with SMTP id <0JQU00FNUD89DG70[at]vms051.mailsrvcs.net> for

x (ORCPT x); Thu,

01 Nov 2007 14:10:10 -0500 (CDT)

Received: from 130.4.175.75 ([130.4.175.75]) by mx4.academyinternet.com

(Sun Java System Messaging Server 6.2-6.01 (built Apr 3 2006))

with SMTP id <0JO700K66CQA2TW4[at]pda942.bielomatik.de> for x;

Thu, 01 Nov 2007 15:09:54 -0500 (CDT)

Date: Thu, 01 Nov 2007 15:09:54 -0500

From: "macademyinternet.com" <hskinny2[at]washdepot.com>

X-Originating-IP: [59.15.34.175]

X-Originating-IP: [130.4.175.75]

To: <x>

Message-id: <0JQU________DG70[at]vms051.mailsrvcs.net>

MIME-version: 1.0

Content-type: multipart/alternative; boundary=1DE1E81DD7D5E2D4

Subject:

<x-html><!x-stuff-for-pete base="" src="" id="0" charset=""><HTML>

<BODY>

<P><DIV>study elderflower</DIV></P>

<DIV><font size=4>Thi<b4 />r<a4>d/Fo<b1>r<c6 />th mo<a3>nth yo<b7 />u wi<a5>ll no<c8 />ti<a4>ce<c2> a<a8>n i<f7 />ncr<c2>e<f8>a<d7>s<d1>e<c4> i<b5 />n pe<f2> n - i<a4 /> _s<b8 /> s<b3>i<e5 />ze<c8 /> o<e4>f up to<b4> 3 i<b1 />nche<d5>s<b4 /></font></DIV>

<P><DIV>

<b><font size=5>gr<f8 />e<a9 />ek<a1>y<d9>b.co<c4>m</font></b></DIV></P>

<DIV>A or pleading spidery</DIV>

</BODY></HTML>

</x-html>

The line with the word "Subject" does not look like a standard internet header to me. When I cleared out those funny characters (the ones that look like rectangles), the parser seems to accept it.

<snip>
BTW: It isn't clear to me whether I should cancel or submit the report when I get this error message.

...If there's a "Cancel" button on the parse screen then I would suggest you click it!

fashon · November 2, 2007

The squares do not appear in the incoming spam or in the pasted in copy. However, if I paste the spam into a text editor, the squares do appear but seem to be in a different place in the subject line each time. If I delete them in the text editor, then copy and paste into the report form, the report parses OK.

It does appear that the spammers have found a way to make things difficult.

turetzsr · November 2, 2007

<snip>
It does appear that the spammers have found a way to make things difficult.

...Message sent upstream on your behalf:

To: deputies[at]admin.spamcop.net
Hello, Deputies,

...Please review SpamCop Forum article ""couldn't parse head" error - with No Gnus v0.6, XEmacs 21.4.19-2 - and a pipe to sendmail -v -t" starting at http://forum.spamcop.net/forums/index.php?...ost&p=60740. It appears to me that the header "Subject" line has been corrupted somewhere along the line. I assume that what shows up in the quoted text as rectangles are some kind of unprintable character, perhaps a NULL.

...Please consider whether or not you believe this warrants attention by the programmers and I'll be happy to pass it on to the forum on your behalf.

Thanks & regards,

.Steve T

Farelf · November 3, 2007

Well, this is passing strange. On either of the reports referenced by the tracking URLs in linear post 4 above, I can't directly open the "View entire message" link - as in (link) View entire message in either Mozilla or Firefox - but IE7 is fine. What I get is the "unknown application" dialog, like:

Opening sc _____________________

The File sc is of type application/octet-stream, and Mozilla does not know how to handle this file type. This file is located at:

.http://www.spamcop.net.

What should Mozilla do with this File?

O Open it with the default application

O Open it with [_____________________] Choose ...

O Save it to disk

[ ] Always perform this action when handling files of this type

OK Cancel

Now those "little box" characters mentioned earlier, on the subject line, represent "non-printing" characters of which there are a number (ASCII 000-031 and 127 decimal), so I'm guessing they are implicated and they're not null (ASCII 000). Need a hex editor to know. I don't believe this is a spammer ploy - with just one person reporting the problem. I'm sure it will all be blindingly obvious to some technically-proficient soul out there. Though not necessarily how those characters got themselves inserted.

Farelf · November 3, 2007

OK - grabbed a freeware hex editor, the "boxes" (I see two, one either side of the Subject:, the trailing one following a space) seem to represent ASCII 0B hex (= vertical tab) and ASCII 05 (= enquiry) in both examples. Which doesn't tell me much, except they have no place in a text field. The whole "subject" line including leading and trailing CR LFs is:

0D 0A 0B 53 75 62 6A 65-63 74 3A 20 05 0A 0D 0A 0D ...Subject: .....

I'm ignoring the possibility that there could be further mangling of the lines introduced somewhere in the filing-retrieval-copying-pasting process of me getting at the data.
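For anyone without a hex editor handy, here is a minimal Python sketch of the same inspection. It only re-reads the bytes quoted in the dump above and labels each one with its standard ASCII abbreviation; the helper names `describe` and `parse_dump` are made up for this illustration.

```python
# Bytes copied from the hex dump above ("65-63" is the editor's mid-row separator).
DUMP = "0D 0A 0B 53 75 62 6A 65-63 74 3A 20 05 0A 0D 0A 0D"

# Standard ASCII abbreviations for the control codes 0x00-0x1F.
C0_NAMES = (
    "NUL SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SO SI "
    "DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US"
).split()


def describe(byte: int) -> str:
    """Return a readable label for one byte (hypothetical helper)."""
    if byte < 0x20:
        return C0_NAMES[byte]
    if byte == 0x20:
        return "SP"
    if byte == 0x7F:
        return "DEL"
    return chr(byte)


def parse_dump(dump: str) -> bytes:
    """Turn space- or hyphen-separated hex pairs into bytes."""
    return bytes.fromhex(dump.replace("-", " "))


data = parse_dump(DUMP)
labels = [describe(b) for b in data]
print(" ".join(labels))
```

The VT and ENQ labels should show up on either side of the "Subject:" text, matching what the hex editor showed.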

Wazoo · November 3, 2007

The tracking url on two recent examples are:
http://www.spamcop.net/sc?id=z1506024443z0...e2eeba7b5162d1z

This spam submittal has the same issues as discussed (a bit) in another Topic. Please see Parsing: Spamcop not finding links in email when there are links

Actually thinking that this set of posts should actually be moved out of this Topic, as this part of the discussion has nothing to do with a "programmed scripting issue" ...????

http://www.spamcop.net/sc?id=z1506028102z1...60ebcddf758cc6z

This spam submittal has exactly the same problem with the lack of BOUNDARY lines in the body.

I'm guessing that the spammer is using a piece of crap software tool and using it badly. Using Farelf's work, I'd suggest that the program in use has some "put your information here" type setup procedures and the idiot involved didn't perform that bit of work correctly, such that the garbage being output includes some of the 'visual' indicators that the 'gonna-get-rich-someday-spammer' forgot to actually delete from the 'packaged' spam-to-be-sent.

The issue of not finding reportable URLs in the body would fall back to the lack of BOUNDARY lines; the parser having an issue with trying to 'find' the body separator/end of the headers is almost definitely due to these nasty characters being involved. That they "are not seen" while manipulating the e-mail contents for a submittal doesn't mean that they aren't there; rather, they are non-printable ASCII characters, and any 'representation on screen' would depend on the system character-set/language/font-set in use while viewing it.

Wazoo · November 3, 2007

Farelf suggested that I revisit this Topic/Discussion with the intent of editing my last post. I'm not sure I see what I'd want to edit. I still believe that the 'extra' non-printable characters have been 'demonstrated' to be the issue with the parser's initial problem with these spam submittals. As to how / when these characters are 'arriving' does seem to be the actual issue ..... based on the "not seen" comments by the Topic starter. Further compounded by the "but seem to be in a different place in the subject line each time" description.

So what I 'did' do ... extracted this discussion from "couldn't parse head" error - with No Gnus v0.6, XEmacs 21.4.19-2 - and a pipe to sendmail -v -t as I can see no relevance to that Topic .. and made it into its own Topic ....

I'm of the thought that the 'extra characters that move around' might end up being a troubleshooting exercise for the Topic starter here .... as noted, this is a 'singular event on one person's computer / software mix' at present .... though the impact on the parsing action is not unknown.

fashon · November 3, 2007

I'll try to do some trouble shooting on my end. Keep in mind that about half of my reports go through with no problem, suggesting that the problem is introduced before arriving on my computer.

Farelf · November 3, 2007

I'll try to do some trouble shooting on my end. Keep in mind that about half of my reports go through with no problem, suggesting that the problem is introduced before arriving on my computer.

Thanks. Yes, that seems a reasonable assumption based on the evidence seen and the details you have supplied so resolution would potentially benefit others as well.

rconner · November 4, 2007

Been getting a lot of spam lately that seems to confuse the parser when it looks at the body (tracking link to example). I get the red error message given above, and a link that purports to explain the problem and suggests I improve my submission process. Of course, these spams are forwarded in and I never touched them, so I guess the spam was malformed when sent.

As you see, this message has an HTML body with extensive use of fake tags to break up the text (and the URL, which is not a hyperlink).

I'm suspecting a MIME issue as the cause of the error, but I'm not a MIME guru.

I'm wondering whether this is just sloppiness, or <paranoia>perhaps some trick calculated to evade filters</paranoia>.

-- rick

Wazoo · November 4, 2007

As you see, this message has an HTML body with extensive use of fake tags to break up the text (and the URL, which is not a hyperlink).
I'm suspecting a MIME issue as the cause of the error, but I'm not a MIME guru.

I'm wondering whether this is just sloppiness, or <paranoia>perhaps some trick calculated to evade filters</paranoia>.

Pulled this "new" Topic out of the Lounge area, merged it into this existing Topic as the issue is the same, in fact, appearing to be the same idiot spammer.

I tried to fire up my iBook one more time ... have to try that a bit later ... not sure I could compare notes anyway, still on 10.2.x, don't use Eudora, upgraded a bit to use FireFox ... but was going to try to work through some steps to see how close I could come to recreating some of these things ... but with rconner throwing up the same spam construct, it's pretty easy to say that it is definitely the spammer.

I think the real question here is .... are any of the e-mail clients involved actually handling this garbage without an issue?

Note: was going to send a PM, but rconner was already here <g>

So folks don't get confused ... rconner's spam sample shows the 'proper' use of the BOUNDARY lines and associated MIME bits. fashon's don't 'show' these lines due to an issue with Eudora. Technically, those lines were there in the actual spam, but Eudora does things during its 'store' and 'retrieve/display' actions. Thus begat the Eudora/Outlook work-around web-form submittal page to allow those users to continue to submit what they could.

Wazoo · November 4, 2007

For giggles, here's a sample of something that's probably being attempted, but the crappy software apparently in use, perhaps even something happening during the injection, is dropping some of the 'technical' bits involved to somehow manage to coerce "your" e-mail client to show the (alleged) proper character-set/font data ....

Subject: =?ISO-2022-JP?B?GyRCM1AkKCRGJCQkXiQ5JCsbKEIo?=
	=?ISO-2022-JP?B?QF9AKRskQiEpGyhC?=

In this case, a Japanese font is trying to be invoked. If 'your' system doesn't have that character-set/font in place, and you are configured to either 'ignore' this or decline to install this additional stuff, then those 'boxes' are typical of stuff your system will throw up when trying to display that data .... if you see anything at all ....
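The two `=?ISO-2022-JP?B?...?=` pieces are RFC 2047 "encoded-words": a declared character set, an encoding flag (B for base64), and the encoded bytes. As a rough sketch of what a mail client has to do before it can display such a subject, the Python standard library's `email.header.decode_header` can unpack them. The function name `decode_subject` is invented for this example, and no particular rendering is implied; what you see still depends on the fonts installed.

```python
from email.header import decode_header

# The folded Subject value quoted above, joined with its folding whitespace.
RAW_SUBJECT = (
    "=?ISO-2022-JP?B?GyRCM1AkKCRGJCQkXiQ5JCsbKEIo?=\n"
    "\t=?ISO-2022-JP?B?QF9AKRskQiEpGyhC?="
)


def decode_subject(raw: str) -> str:
    """Decode RFC 2047 encoded-words into one Unicode string (hypothetical helper)."""
    pieces = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            # Fall back to ASCII when no charset was declared.
            pieces.append(chunk.decode(charset or "ascii", errors="replace"))
        else:
            pieces.append(chunk)
    return "".join(pieces)


subject = decode_subject(RAW_SUBJECT)
print(repr(subject))
```

If the base64 payload is damaged on the way, or the declaration is dropped, a client is left with raw escape sequences and bytes it cannot map to glyphs, which is one way to end up with boxes on screen.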

Worst case, as fashon described, passing this data through several different applications really messes things up. The first application didn't handle this data correctly/thoroughly .. data was then 'copied' from that displayed screen and that data was then passed onto another application, which then tries to work with what it was given (which no longer came close to the original stuff) .... then yet another cut/paste/copy was performed across the net to throw the data into a web-form for a submittal .... another opportunity for applications to handle what they were given the best they can, but again, suggesting that the 'real' data has been long since lost.

Or we could guess that the "vms" really means VMS and it's set up to not be able to recognize some obnoxious foreign character-set definitions thrown at it, and it's actually the verizon.net e-mail server that's doing this ....????? (personally doubt this, but ....)

Farelf · November 5, 2007

... I'm suspecting a MIME issue as the cause of the error, but I'm not a MIME guru. ...

Nope Rick, as you've no doubt come to realize, it is as Wazoo said above. Here's the fixed parse.

rconner · November 5, 2007

Nope Rick, as you've no doubt come to realize, it is as Wazoo said above. Here's the fixed parse.

Thanks for the catch, Wazoo & Farelf. I diff'd the original spam against the one that Farelf fixed, and I see that there were indeed unprintables in the subject line (contravening sec 2.2 of RFC2822), that would certainly be reason enough for the parser to barf.

Perhaps next time I get one of these I may release it for delivery, just to see what Apple Mail makes of it.

This may be just crap software as Wazoo indicates, hard to see what advantage there is to be gained from this trick other than foiling the SpamCop parser (which would be a very small victory indeed).

-- rick

Farelf · November 5, 2007

...This may be just crap software as Wazoo indicates, hard to see what advantage there is to be gained from this trick other than foiling the SpamCop parser (which would be a very small victory indeed).

Yeah, 'specially since the head actually parses so the SC "mission" is accomplished regardless. But thanks for the offer on looking at the Apple Mail view - that is going to refine the diagnosis/aetiology somewhat.

Wazoo · November 5, 2007

Perhaps next time I get one of these I may release it for delivery, just to see what Apple Mail makes of it.

Your Tracking URL as processed by Outlook Express 6 under XP-Pro can be seen in this screen-shot.

Test e-mail - Bad Subject: line (about 115k)

Note the "(No Subject)" in the OE 'display' .... Note the bit of difference between the various 'views' of the same 'data' (scrolled around to show the "Subject:" line specifically) ..... as seen in the "Source" view, it's the blank line immediately above the Subject: line that kicks the parser into 'error mode' as this indicates the end of the header section .... leading to the "missing BOUNDARY line" issue ..... a bit of 'extra' that doesn't actually show up in the Tracking URL representation ....

So the answer to my query, it "kind of renders" in an e-mail client .... it certainly is not a clickable link, so the idiots would still have to cut / paste to actually follow the link, but .... with all the other crap showing on screen to attempt to get it past various filters, one would really have to be an idiot or desperate beyond all belief to actually go to all that work.
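To make the blank-line point concrete, here is a minimal sketch. It assumes only the usual convention that the first empty line ends the header section; how SpamCop's parser is actually written is not shown anywhere in this thread. The sample message and the helper names are made up for the illustration.

```python
from typing import List, Tuple

# Hypothetical sample: a blank line sits directly above a "Subject:" line
# that is itself wrapped in VT (0x0B) and ENQ (0x05).
SAMPLE = (
    "To: <x>\n"
    "MIME-version: 1.0\n"
    "Content-type: multipart/alternative; boundary=1DE1E81DD7D5E2D4\n"
    "\n"
    "\x0bSubject: \x05\n"
    "\n"
    "<HTML><BODY>study elderflower</BODY></HTML>\n"
)


def split_message(raw: str) -> Tuple[str, str]:
    """Split at the first empty line, treating what follows as the body."""
    normalized = raw.replace("\r\n", "\n")
    head, _, body = normalized.partition("\n\n")
    return head, body


def header_names(head: str) -> List[str]:
    """List field names, skipping folded continuation lines."""
    names = []
    for line in head.split("\n"):
        if line and line[0] not in " \t" and ":" in line:
            names.append(line.split(":", 1)[0])
    return names


head, body = split_message(SAMPLE)
print(header_names(head))   # field names seen before the blank line
print(repr(body[:12]))      # what the "body" starts with
```

With this input the Subject line lands in the body rather than the headers, and the declared boundary never appears in what is left, which fits the "missing BOUNDARY line" symptom described above.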

Hmmm, I was wondering whether there had been any response "from above" but .... in all fairness, this Topic no longer resides where Steve T. pointed to ..... blame Wazoo for confusing things again <g>

fashon · November 9, 2007

Much of this discussion is over my head, but since I seem to have started it, I thought I would weigh in. I have been copying the suspect spam into TextWrangler and deleting the spurious non-printing characters then copy and pasting into the SpamCop reporting form with no problem arising.

However, this is getting old. I'm trying to think of a way to script this process before I tire of it.
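A minimal sketch of what such a script might look like in Python, assuming the only goal is to drop control characters other than tab, carriage return and line feed, which is the same thing the manual TextWrangler step does. The pattern, the function name and the sample text are all illustrative, not a tested SpamCop workflow.

```python
import re

# Control characters other than TAB (0x09), LF (0x0A) and CR (0x0D), plus DEL.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def strip_control_chars(text: str) -> str:
    """Remove stray control characters; everything else is left untouched."""
    return CONTROL_CHARS.sub("", text)


# Hypothetical snippet shaped like the damaged Subject line in this thread.
sample = "To: <x>\r\n\x0bSubject: \x05\r\nMIME-version: 1.0\r\n"
cleaned = strip_control_chars(sample)
print(repr(cleaned))
```

Reading the pasted spam from standard input and writing the cleaned text to standard output would be the obvious next step for scripting it.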

turetzsr · November 9, 2007

<snip>
I have been copying the suspect spam into TextWrangler and deleting the spurious non-printing characters then copy and pasting into the SpamCop reporting form with no problem arising.

<snip>

...Please be aware of SpamCop FAQ: Material changes to spam and SpamCop FAQ: What if I break the rule(s)?. It would appear that your change does not violate the "Material changes to spam" rule but I think you would be well advised to get explicit permission from a SpamCop Deputy (deputies[at]admin.spamcop.net) to ensure that they agree you are not in violation. In the meantime, you can always use the SpamCop parser to find the appropriate e-mail addresses to whom you can send a manual complaint (which should have no suggestion in it that SpamCop is involved; also, please Cancel the results of the parse when you are finished with it, so that no one else can submit it).

Miss Betsy · November 10, 2007

The spamcop parser is primarily to find the source IP address. Those who want to report spamvertized sites, as turetzsr (Steve) said, can find alternate methods of reporting them, but not as spamcop reports.

There is a thread somewhere about the efficacy of reporting spamvertized sites. There are also alternatives such as Complainater (that might not be the right spelling, but it is in the software forum, I think) and the one I can never remember that begins with 'K' (knujon maybe).

IMHO, the Complainater is the better choice - it goes to registrars.

Miss Betsy

rconner · November 10, 2007

Perhaps next time I get one of these I may release it for delivery, just to see what Apple Mail makes of it.

Well, I got two of these and released them, but they never showed up in my inbox. I'm guessing that they were caught in my ISP's filter.

I did get one similar spam that was not detained (for some reason), Apple Mail refused to show a subject line and may have moved some things around in the header (hard to tell).

-- rick

Wazoo · November 10, 2007

I did get one similar spam that was not detained (for some reason), Apple Mail refused to show a subject line and may have moved some things around in the header (hard to tell).

Any way to somehow duplicate what I did with OE/XP in Linear Post #17?

rconner · November 10, 2007

Any way to somehow duplicate what I did with OE/XP in Linear Post #17?

The message I have (which got by SC unmolested) does have unprintables in the subject line. I can try to make up a pic. How do I post it to the forum? Do I put it on a web server and then use the image button?

-- rick

rconner · November 10, 2007

Any way to somehow duplicate what I did with OE/XP in Linear Post #17?

OK, I decided to post the images on my own server. Wazoo, if you want to grab them and move them to the forum server for retention, feel free.

Here's a partial ASCII dump (od -a) on the raw packet as held by Apple Mail. It shows an (illegal) ENQ after the subject line (4th line from top).

0003760	i   o   n   =   3   .   2   .   3  nl   X   -   S   p   a   m
0004000	-   L   e   v   e   l   :  sp   *   *   *   *   *   *   *   *
0004020	*   *   *   *   *   *   *   *   *   *   *   *   *   *   *   *
0004040	*  nl  nl  vt   S   u   b   j   e   c   t   :  sp enq  nl   X
0004060	-   S   p   a   m   C   o   p   -   C   h   e   c   k   e   d
0004100	:  sp  nl   X   -   S   p   a   m   C   o   p   -   D   i   s

(On edit, now that I look more closely, there appears to be a VT (vertical tab) before the "Subject:" label that probably ought not be there, but I can't be sure that it wasn't Apple Mail that put it there and not the spammer. The VT doesn't seem to show up in the raw text screen dump below.)

Here is a screen dump from Apple Mail ("Original Content" view) showing that it refuses to display a subject line on the message, and gives "(No subject)" in the window title.

Here is another screen dump from Apple Mail ("Raw Message" view, Option-Command-U keystroke) showing the raw text of the packet; the ENQ character is (of course) invisible.

These guys appear to have great success in getting past my ISP's filters, but they are usually detained by SpamCop (this message being an apparent exception). Seems like one might construct a filter that detected illegal characters in the header, but this might be too aggressive for some users' tastes.
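A rough sketch of such a filter in Python, under two loudly stated assumptions: "illegal" is taken to mean any control character other than tab, CR and LF, and the check scans a fixed number of bytes from the top of the message instead of stopping at the first blank line (since in these spams the blank line comes before the damaged Subject). The names `find_control_bytes` and `scan_limit` are made up for the example.

```python
import re
from typing import List, Tuple

# Assumption: any control byte except TAB, LF and CR counts as suspicious.
SUSPICIOUS = re.compile(rb"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def find_control_bytes(raw: bytes, scan_limit: int = 8192) -> List[Tuple[int, int]]:
    """Return (offset, byte value) pairs found in the first scan_limit bytes."""
    window = raw[:scan_limit]
    return [(m.start(), m.group()[0]) for m in SUSPICIOUS.finditer(window)]


# Hypothetical bytes shaped like the od dump above.
raw_message = b"X-Spam-Level: ****\n\n\x0bSubject: \x05\nX-SpamCop-Checked: \n"
hits = find_control_bytes(raw_message)
for offset, value in hits:
    print(f"offset {offset}: 0x{value:02X}")
```

A real filter would need a policy for what to do with a hit (tag, quarantine, reject), which is where the "too aggressive" worry comes in.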

For the record, this is Apple Mail Version 2.1 (752/752.2) running on OS X 10.4.10.

-- rick

Moderator Edit: images made 'local'

rconner · November 11, 2007

Another fresh example just reported (tracking link), showing a subject line consisting of BEL followed by ENQ. Same penis-growing perp, it appears, with same goofy obscured payload.

-- rick

(on edit: also has VT in front of "Subject:")
