
Parsing HTML part error


lb6


Hi all,

When I submitted this report, the parser gave me the classic "couldn't parse head" error.

Of course,

- my submission is full and untouched

- I don't use outlook

My understanding is that the parser is complaining that the (base64-encoded) HTML part lacks a <head> element. I did a little research, but never saw a message about this problem.

Am I missing the point? I think bogus html spam needs correct parsing and reporting, don't you (tinu)?

Thanks for your lights,

-l

Link to comment
Share on other sites

Your message seems to have MIME inside of MIME and no termination of the outer MIME message. I have no way to submit your message with modifications to test these theories.

For example:

Content-Type: multipart/mixed;

boundary="6c926133fa44a3dc18b9dafbd96e1b1e"

X-RBL-Warning: (bl.spamcop.net) Blocked - see http://www.spamcop.net/bl.shtml?66.132.253.191

X-Envelope-To: x

This is a multi-part message in MIME format.

--6c926133fa44a3dc18b9dafbd96e1b1e

Content-Type: multipart/alternative;

boundary="0921c66bb54947998ba49d19d4c2851d"

This is a multi-part message in MIME format.

--0921c66bb54947998ba49d19d4c2851d

Content-Type: text/plain;

charset="iso-8859-1"

Content-Transfer-Encoding: quoted-printable

...

--0921c66bb54947998ba49d19d4c2851d

Content-Type: text/html;

charset="iso-8859-1"

Content-Transfer-Encoding: base64

...

--0921c66bb54947998ba49d19d4c2851d--

but there is no corresponding --6c926133fa44a3dc18b9dafbd96e1b1e-- to tell the parser where that message ends.
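For anyone who wants to check a raw message for this before submitting, here is a minimal sketch in Python. It is not how the SpamCop parser works (that is not known here); it only looks for declared boundaries that never get a closing `--boundary--` line. The function name and the `outer`/`inner` boundary strings are made up for illustration, and it assumes the boundary is declared in double quotes.

```python
import re

def unterminated_boundaries(raw: str) -> list[str]:
    """Return declared MIME boundaries that have no closing '--boundary--' line."""
    # Assumption: boundaries are declared as boundary="..." (quoted form only).
    declared = re.findall(r'boundary="([^"]+)"', raw, flags=re.IGNORECASE)
    lines = {line.rstrip() for line in raw.splitlines()}
    return [b for b in declared if f"--{b}--" not in lines]

# Hypothetical message shaped like the one above: the inner part is closed,
# the outer one is not.
sample = "\n".join([
    'Content-Type: multipart/mixed;',
    ' boundary="outer"',
    '',
    '--outer',
    'Content-Type: multipart/alternative;',
    ' boundary="inner"',
    '',
    '--inner',
    'Content-Type: text/plain',
    '',
    'hello',
    '--inner--',
])

print(unterminated_boundaries(sample))
```

On the sample this reports only the outer boundary, which matches the structure described above.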

Link to comment
Share on other sites

Your message seems to have MIME inside of MIME and no termination of the outer MIME message.  I have no way to submit your message with modifications to test these theories.

Thanks, I missed that. However, I added the missing closing MIME boundary and resubmitted: the error remains.

Any other ideas?

Link to comment
Share on other sites

Ok, I finally found a way for the parser to take it, completely removing the bogus outer MIME part and adapting the headers' Content-Type accordingly.

Anyway, do you think it's more likely to be

- buggy MIME handling from spammer's stupid software

- trick from spammy to avoid reports

- anything else?

Thanks for the help.

-l

Link to comment
Share on other sites

Just a reminder that after making these types of changes you are not allowed to send your reports through SpamCop (material changes to spam and all that). You can gather the addresses and report manually, however.

When I get these types of errors, I usually forward the tracking information (pre and post fix) to the deputies to see if a change needs to be made to the parser.

deputies<at>spamcop.net

Link to comment
Share on other sites

Take a look at http://www.spamcop.net/sc?id=z562480878zb5...27aaf924157f6dz and compare to your original. Yes it was intentional. And even though it now parses correctly, this does fall under the significant change, so one would have to report the links manually. Am kicking it up to the Deputies, perhaps then to Julian, not sure how common this one is yet.

Link to comment
Share on other sites

  • 1 month later...

Got error "Finding links in message body

Recurse multipart:

Parsing HTML part

error: couldn't parse head

Message body parser requires full, accurate copy of message"

Of course, I did provide a full, accurate copy of the message :angry:

This is the second message that I've received in a week where the spammer has found a way to fool spamcop and result in spamcop being unable to report the links (and blame me for not reporting correctly :angry:). I'm reluctant to post their trick here in this forum, because I'm sure they're reading it looking for bugs in Spamcop that can help them fool the system. But since there is no other more secure reporting mechanism, here it is:

Note that the spammer has claimed that this message is multipart/alternative but provides only a text/html. The text/html does not have proper html syntax. The previous message that resulted in this error had plaintext in the supposed html part. Twice in one week: serious hole in spamcop, I'm afraid.

CLICK 'BACK' BUTTON TO RETURN TO SPAMCOP
################################################################################
Return-Path: Griffinshawl5837157260[at]rogers.com
Received: from unknown.hostname (207.14.163.37)
        by <munged in forum only> (V5.4-15D, OpenVMS V7.3-2 Alpha);
        Tue, 7 Sep 2004 23:26:07 -0400 (EDT)
Received: from 138.182.56.240 by 207.14.163.37 Tue, 07 Sep 2004 21:24:05 -0700
Message-ID: <dtbkozTmLm3Qv1[at]ae.com>
From: "Marcella " <Griffinshawl5837157260[at]rogers.com>
Reply-To: "Marcella " <Griffinshawl5837157260[at]rogers.com>
To: <munged in forum only>
Subject: Important Message.
Date: Wed, 08 Sep 2004 03:21:05 -0100
X-Mailer: Internet Mail Service (5.5.2650.21) chrysler pbs
allot-churchyard: boom borg abort
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="--4314664073955685"
[                                                             Priority: Normal                                                        ]

----4314664073955685
Content-Type: text/html;
Content-Encoding: bitBitNUM

<x-html>
<body>
<p align="center">
<a href="http://www.gr8teveryone.com/151/index.php?id=151"><img src="http://www.sdf21.biz/ads/images/60pills3.gif"></a>
<br><br>
Dear <munged in forum only> If the Picture Above does not load please go to this site for important information.
<br><br>
<a href="http://www.onlinegr8tpills.com/151/index.php?id=151">Click here for more information!</a>
<br><br>
<font color="ffffff">
<br> cambric visible stringent corinthian 
<br> tabulate embedded intercalate midwestern
</body> 
<html>2

----4314664073955685--

Link to comment
Share on other sites

There are issues with your sample, but the problem is that it's not known whether they come from the spam itself, from this application's problems with displaying white-space, or from the handling of the spam during submittal. You also don't mention what all is involved at your end. For example, there's a hint that you may be using a third-party tool to work around an Outlook issue (?) How about editing your last post, deleting the spam sample and replacing it with a Tracking URL, so that the actual submittal as seen by the SpamCop parser can be evaluated? Then at least, everyone is working on the same bit of data.

Link to comment
Share on other sites

And to add to Wazoo's request, also please state how you submitted the spam to SpamCop.

1) Cut and Paste (It could be you or your email program added a blank line in the wrong place or broke up a line because of character wrap that should not have been broken up)

2) Forward as attachment (The best and safest way to submit)

3) IMAP file transfer (another safe way to move files for submission but requires having a SpamCop email account)

4) Standard message forward (please state email program used to forward message)

5) Other method??

Link to comment
Share on other sites

there's a hint that you may be using a third party tool to work around an Outlook issue

Outlook? Friends don't let friends drive Microsoft. ;)

The tracking id is:

z647984037z9f23509ab60e80f248d757cd40b7811cz

The email program is VMSmail. The reporting mechanism is simply a forward of the message content including all RFC822 headers, unmodified. Email is never modified by VMS after receipt or during forwarding, and the idea of an attachment is unknown.

This was my 2646th submission to spamcop since I began using the service in March 2002; the problem is definitely the fact that a spammer has created a message which fools Spamcop's parsing.

The display page for this message at Spamcop displays the email exactly as the spammer sent it and as I received it, including all spacing, blank lines, etc. It looks like the fact that the spammer left off the HTML <head> tag is what confused SpamCop.

/john

Link to comment
Share on other sites

Glad to meet another VMS user ;) For those of you who don't know VMS and specifically its built-in mail functionality, it is text only, so very secure for viewing spam messages.

John, have you tried taking this message from the view message page or its source and modifying it, by either adding the <head> tag that you presume to be the problem or removing the x- from the <x-html> tag (my guess as to the problem), and manually submitting it to the web-based parser?

I have mailhosts configured and so am unable to do this myself (I would not even get through the source part of the parse).

Link to comment
Share on other sites

John, Have you tried taking this message from the view message or its source and modifying it ...


OK, I just did that. It's not the <x-html> tag or lack of a <head></head> after all.

What I had to do was pull out the bogus garbage "[", spaces, and "]" around the Priority: Normal header immediately following the Content-Type: multipart/alternative header.

SpamCop needs to be at least as forgiving as most mail readers, which I suspect (don't know since I use VMS) would simply ignore a header like that.

How do we get this put on a real bug list?

/john
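To make the bracketed Priority line easier to spot, here is a small sketch that lists lines in the header section that are neither a header field (a name made of printable ASCII without spaces, followed by a colon) nor a folded continuation line starting with a space or tab. It is an illustration under those assumptions only, not a description of SpamCop's parser; `malformed_header_lines` and the shortened sample are invented for the example.

```python
import re

# Header field name: printable ASCII except space and colon, then a colon.
HEADER_FIELD = re.compile(r"^[!-9;-~]+:")

def malformed_header_lines(raw: str) -> list[tuple[int, str]]:
    """Return (line number, text) for header-section lines that are not valid."""
    bad = []
    for n, line in enumerate(raw.splitlines(), start=1):
        if line == "":
            break  # blank line: end of the header section
        if line[0] in " \t":
            continue  # folded continuation of the previous header
        if not HEADER_FIELD.match(line):
            bad.append((n, line))
    return bad

# Shortened stand-in for the headers in the sample posted earlier.
bogus = "[     Priority: Normal     ]"
sample = "\n".join([
    "MIME-Version: 1.0",
    "Content-Type: multipart/alternative;",
    '\tboundary="--4314664073955685"',
    bogus,
    "",
    "----4314664073955685",
])

print(malformed_header_lines(sample))
```

A strict parser that stops reading headers at the first such line would treat everything after it as body, which is one plausible way a line like this could derail the parse.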

Link to comment
Share on other sites

You mentioned earlier you received another with the same error message. Do you still have that sample as well? Does it have the same characteristics?

I would send a message to deputies<at>spamcop.net with the tracking URL and your findings and ask them to bring it to Julian's attention.

I thought about that line as well, but have never noticed a priority line like it (good or bad) then forgot about it. Good catch.

Link to comment
Share on other sites

You mentioned earlier you received another with the same error message.

Just got another one. I believe this is what was going on with the one I received a few days before the last one. For some reason I haven't been able to locate it again.

This produces the same error message for a similar (but different) reason.

z663530528z4ad7999b3ef09011c0cd5e2033259226z

There are at least two problems with the formatting of this spam:

(1) The last RFC 822 header is the X-IP:186.83.67.186 header. It should be followed by a blank line before the ---205... Mime part boundary.

(2) The Content-Type is declared to be text/html when it is in fact text/plain.

There's a third problem, but it doesn't bother SpamCop: the message is supposedly multipart/alternative, but only one part is provided.

So is the right way to report bugs for _me_ to simply send the problem on to deputies, or is it required to discuss it here first and get a non-newbie to report it, or what? (I'm hardly a newbie to SpamCop, only to these forums, and I generally don't have time to get involved in web-based forums, netnews, or email lists).

/john
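Problems (1) and (3) above can be checked mechanically. The sketch below counts the parts that a given boundary opens and reports whether the first delimiter line is preceded by a blank line. It is only an illustration: `inspect_multipart`, the boundary `b1`, and the sample headers are hypothetical stand-ins, since the real boundary string is not shown in full here.

```python
def inspect_multipart(raw: str, boundary: str) -> dict:
    """Count opened parts and check for a blank line before the first one."""
    lines = raw.splitlines()
    delimiter = "--" + boundary
    # Indexes of lines that open a part (the closing '--boundary--' is excluded).
    starts = [i for i, line in enumerate(lines) if line.rstrip() == delimiter]
    blank_before_first = (
        bool(starts) and starts[0] > 0 and lines[starts[0] - 1].strip() == ""
    )
    return {"parts": len(starts), "blank_line_before_first_part": blank_before_first}

# Hypothetical message: the last header runs straight into the boundary line,
# and only one part is supplied for a multipart/alternative message.
sample = "\n".join([
    'Content-Type: multipart/alternative; boundary="b1"',
    "X-IP: 192.0.2.1",
    "--b1",
    "Content-Type: text/html",
    "",
    "plain text, not HTML",
    "--b1--",
])

print(inspect_multipart(sample, "b1"))
```

For the sample it reports one part and no blank line before it, the same two symptoms described in the list above.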

Link to comment
Share on other sites

As far as reporting goes, if you think there is a problem, you are welcome to submit it to the deputies, especially since nothing so far suggests that it is caused by your reporting configuration.

Also, if you post the entire tracking URL, it will make it easier for people to reach the page you are talking about.

i.e. http://www.spamcop.net/sc?id=z663530528z4a...cd5e2033259226z

The wrapping will take care of itself.

Link to comment
Share on other sites

http://www.spamcop.net/sc?id=z663530528z4a...&action=display

This one dies due to lack of a blank line between the headers and the body

http://www.spamcop.net/sc?id=z642123189zbb...&action=display

This one fails for the same reason, no separating blank line ...

http://www.spamcop.net/sc?id=z647984037z9f...&action=display

Same problem ... that wretched "Priority Line" starts where the separating blank line should be ...

So we have StevenUnderwood using the same app without a problem, and you've provided a list of several bad submittals that all seem to hinge on one specific glitch ... any idea as to how and where the missing blank line may be disappearing?

Link to comment
Share on other sites

any idea as to how and where the missing blank line may be disappearing?


The blank line is not "disappearing." Nor is the Priority line (Priority is a valid MIME header) having the "[" added to it.

The spammers are SENDING the mail exactly as shown.

/john

Link to comment
Share on other sites

Wazoo, I do NOT report email from this application as it is internal only at my site and does not receive spam. It also can not send messages to the internet.

Do we know if the OP has submitted MIME messages in the past? It could be the way VMS is handling the MIME attachment. I know my configuration placed a second set of "common" headers at the top of the message, which I believe I could configure away if needed. See this link for the extract of my message.

http://www.spamcop.net/sc?id=z66468420...&action=display

Link to comment
Share on other sites

Do we know if the OP has submitted MIME messages in the past.  It could be the way VMS is handling the MIME attachment.

As I said, I have submitted about 2700 MIME messages. VMSmail does not "handle" MIME attachments at all. You mention "common headers". Those are the VMS internal "From", "To:", "CC:" and "Subj" headers, completely in addition to the RFC822 headers, which are removed, leaving only the message EXACTLY as received, by the FORWARD/NOHEADER command.

I assure you, the spammer is sending the email as it is being submitted; there has been no modification at all of the email occurring during the process of the message being received from the spammer, stored in VMSmail, or being submitted by me.

As a VMS developer, and one of the original Email developers at DEC (now at HP), I can assure you that I know the above for an absolute fact.

/john

Link to comment
Share on other sites

The spammers are SENDING the mail exactly as shown.

Do you have the mail logs showing the message coming in the DATA command like that to back up this assertion? The MTA always handles the message as it has been programmed to.

As I mentioned in my previous message, VMS is adding an additional set of headers at the top of messages I receive.

http://www.spamcop.net/sc?id=z66468420...&action=display

Link to comment
Share on other sites

See this link for the extract of my message.

http://members.spamcop.net/sc?id=z66468420...&action=display


Oh, and Steven, since I don't know where the discussion of the message above is, I don't know if the reason SpamCop would have choked on it was ever determined.

As a VMS developer, it's obvious to me. The unwanted line wrapping occurs exactly at column 80. This indicates that you did a cut and paste from a terminal window that was only 80 columns wide. Although VMS mail never modifies the mail, what is displayed on the screen will be affected by your terminal size settings, and cut and paste will not put wrapped lines back together.

/john
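A quick way to test for this kind of terminal wrapping in a pasted message is to list the lines that are exactly as long as the terminal is wide. The sketch below is only a heuristic (a line can be 80 characters long by coincidence); the function name and the sample text are made up for illustration.

```python
def lines_at_width(text: str, width: int = 80) -> list[int]:
    """Return 1-based numbers of lines whose length is exactly `width`."""
    return [
        n for n, line in enumerate(text.splitlines(), start=1)
        if len(line) == width
    ]

# A 100-character line as it would look after being cut from an 80-column
# terminal window: 80 characters, a line break, then the remaining 20.
pasted = "x" * 80 + "\n" + "x" * 20 + "\n" + "short line"

print(lines_at_width(pasted))
```

If several lines in a submission come back from this check, and each is followed by what looks like the rest of the same header or URL, a cut and paste from a fixed-width window is a likely cause.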

Link to comment
Share on other sites

Do you have the mail logs showing the message coming in the DATA command like that to back up this assertion?  The MTA always handles the message as it has been programmed to.

This is why I usually don't bother with these kind of forums. You guys don't know me from Adam, and you assume that I'm wrong. It's a bit insulting, but it's the way these forums work, so I bear no malice towards you for your skepticism. But I have been an email developer; I work in the VMS development group; I've been using VMS mail and developing other mail systems to interact with it since before many of you were born; and I know exactly how the internals of VMS mail work.

As I mentioned in my previous message, VMS is adding an additional set of headers at the top of messages I receive.

VMS keeps those headers completely separate from the message; as I mentioned in a previous message, FORWARD/NOHEADER will send the mail EXACTLY as received from the spammer.

I'll tell you about the _one_ exception, just because you've all been so polite: If a mail message, in violation of RFC822, contains headers longer than 256 characters without including the required (by the RFC) return and whitespace, VMSmail will wrap that header when storing it internally. But that's not what's happening here.

/john
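To check whether a message could hit that one exception, the folded headers can be joined back together and measured. The sketch below undoes header folding by removing each line break that is followed by a space or tab, then lists headers longer than a limit. The 256 figure is taken from the post above as the VMSmail threshold; the function names and sample headers are assumptions for illustration.

```python
import re

def unfold(header_block: str) -> list[str]:
    """Join folded header lines (line break followed by space or tab)."""
    unfolded = re.sub(r"\r?\n(?=[ \t])", "", header_block)
    return unfolded.splitlines()

def overlong(header_block: str, limit: int = 256) -> list[str]:
    """Return physical header lines longer than `limit` characters."""
    return [line for line in header_block.splitlines() if len(line) > limit]

headers = "\n".join([
    "Content-Type: multipart/alternative;",
    '\tboundary="b1"',
    "X-Long: " + "a" * 300,  # a single unfolded line over the limit
])

print(unfold(headers)[0])
print(len(overlong(headers)))
```

Here `overlong` measures the physical lines as sent, since it is an unbroken line over the limit that would be wrapped on storage, while `unfold` shows what the logical header looks like once the folding is removed.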

Link to comment
Share on other sites

Archived

This topic is now archived and is closed to further replies.
