[Resolved] Several weird parser results

Farelf · February 3, 2009

http://www.spamcop.net/sc?id=z2581127020za...39262c79fb416ez

http://www.spamcop.net/sc?id=z2581137306zc...bedeb6681bb4baz

These are the same spam, the first case an e-mailed submission (forwarded as attachment in batch) the second case view page source, select-all, copied and pasted from Mozilla mail 4.79 directly into the parser box. The spam is malformed in any event (looks like cases analyzed on these pages before with "DATA" lines in the header without colons) but in the paste-in case it at least proceeds to offer reports. In the e-mailed case, 'something' has apparently broken the headers with a 'blank space' insertion so there is "Nothing to do":

X-Originating-IP: [91.171.112.7]

MAIL FROM: <standardx[at]themeltmethod.com>

Am I missing something, or should this be referred for possible engineering examination? My thought is that if it happens with other mail clients there could be a lot of missed spam (especially if happening in quick reporting). I am also mindful of Don's post that Wazoo recently resurrected concerning problems with quick reporting - maybe the (supposed) fix did something?
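
A minimal sketch of why a stray blank line could lead to "Nothing to do", assuming (this is not confirmed for SpamCop's parser) that the header block is taken to end at the first empty line, as in the usual Internet message format. The function name and the sample lines are illustrative only; just the X-Originating-IP and MAIL FROM lines come from the spam quoted above.

```python
def split_headers_body(raw: str) -> tuple[list[str], str]:
    """Split a raw message at the first empty line (assumed header/body separator)."""
    lines = raw.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == "":
            return lines[:i], "\n".join(lines[i + 1:])
    return lines, ""  # no separator found: treat everything as header


# Hypothetical sample message (pasted-in form, no stray blank line).
pasted = (
    "Received: from example.invalid by mx.example.invalid\n"
    "X-Originating-IP: [91.171.112.7]\n"
    "MAIL FROM: <standardx[at]themeltmethod.com>\n"
    "Subject: sample\n"
    "\n"
    "body text\n"
)
# Same message with a blank line inserted after X-Originating-IP (e-mailed form).
emailed = pasted.replace("]\nMAIL", "]\n\nMAIL", 1)

pasted_headers, pasted_body = split_headers_body(pasted)
emailed_headers, emailed_body = split_headers_body(emailed)
print(len(pasted_headers), len(emailed_headers))
```

In the e-mailed form everything from the MAIL FROM line onward lands in the body, so any header lines that follow it are no longer seen as headers.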

I had one other similar instance in the same batch, all others went through routinely (35 in total).

Next:
http://www.spamcop.net/sc?id=z2581290362z3...58360d1c661c25z

resolves to report being offered "To: abuse[at]pipex.net (refuses to accept this type of report)" but hitting the "Send Spam Report(s) Now" button simply loops (yes, I know it "looks" uncancelled but that's the whole issue - it is cancelled otherwise I had an "Unreported Spam Saved" link in my member page and the ensuing loop could not be bypassed). I'm fairly certain this one should be referred though it was the only case I've seen - linked to that reporting address of course but maybe others? I don't recall seeing other posts about this. Reports are unmunged by the way, it is not a question of refusing munged reports (which calls up a different wording and process). Thoughts?

[edit] - just got another 'looper', same address:

http://www.spamcop.net/sc?id=z2581497203z2...f0304a1828e75ez but I cancelled this one at the submission box, instead of using the "Cancel all reports" link used in the previous example.

Wazoo · February 3, 2009

Next:
http://www.spamcop.net/sc?id=z2581290362z3...58360d1c661c25z

resolves to report being offered "To: abuse[at]pipex.net (refuses to accept this type of report)" but hitting the "Send Spam Report(s) Now" button simply loops (yes, I know it "looks" uncancelled but that's the whole issue - it is cancelled otherwise I had an "Unreported Spam Saved" link in my member page and the ensuing loop could not be bypassed)

Perhaps you hit it at just the 'magic moment' ????

http://www.spamcop.net/sc?id=z2581660698zf...e0335887e07924z

Report spam to:

Using abuse#pipex.net[at]devnull.spamcop.net for statistical tracking.

Re: 195.137.12.72 (Administrator of network where email originates)

To: abuse[at]pipex.net (refuses to accept this type of report)

To: abuse#pipex.net[at]devnull.spamcop.net (Notes)

/dev/null'ing report for abuse#pipex.net[at]devnull.spamcop.net

Reports regarding this spam have already been sent:

Re: 195.137.12.72 (Administrator of network where email originates)

Reportid: 3836783033 To: abuse#pipex.net[at]devnull.spamcop.net

[edit] - just got another 'looper', same address:
http://www.spamcop.net/sc?id=z2581497203z2...f0304a1828e75ez but I cancelled this one at the submission box, instead of using the "Cancel all reports" link used in the previous example.

Based on the above, perhaps it will parse now that the database seems to have caught up with itself ????

Farelf · February 3, 2009

...Based on the above, perhaps it will parse now that the database seems to have caught up with itself ????

Thanks! Yes (sort of) - I was able to cancel it at last (parse still would not work - trying both www. and members. but it would cancel from members.) Remains a weird one though not really 'replicatable'. Imagine one like that in the middle of a quick report batch. That might cause something like the symptoms Don mentioned. I have yet to catch up on later posts - if still relevant I think it will be worth mentioning this case to Don. I will e-mail him if so.

Wazoo · February 4, 2009

http://www.spamcop.net/sc?id=z2581127020za...39262c79fb416ez
http://www.spamcop.net/sc?id=z2581137306zc...bedeb6681bb4baz

These are the same spam, the first case an e-mailed submission (forwarded as attachment in batch) the second case view page source, select-all, copied and pasted from Mozilla mail 4.79 directly into the parser box. The spam is malformed in any event

At issue, that these bad constructs include the RCPT TO:, MAIL FROM:, and DATA lines suggests that the situation was developed between the delivering e-mail system and the receiving system in this case .... prserv.net (kcin03) .... what makes them different can probably be pointed to the various manipulations with (character-set) data between the actual attempted e-mail delivery/receipt and your attempted submittals .... dropping, converting, altering some of the (assumed) non-printable/control-code type bits. Trying to find a good source for trying to do a binary/hex dump/analysis of the differences between the two submittals would probably show what characters disappeared, but .. to what end? Nothing 'you' can do about it, short of trying to get the folks in charge of that server to analyze it from their end, to see if they could sort out how/why it happened.
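
For anyone who does want to try the byte-level comparison, a rough sketch follows. It assumes both submittals are available as raw bytes; the function name and the sample data are invented for illustration (the vertical tab is only a stand-in for whatever character might really be involved). It lists control bytes other than TAB, LF and CR, which are the kind that could plausibly vanish in a copy and paste.

```python
def find_control_bytes(data: bytes) -> list[tuple[int, str]]:
    """Return (offset, hex) pairs for control bytes other than TAB, LF and CR."""
    allowed = {0x09, 0x0A, 0x0D}
    return [
        (offset, f"0x{byte:02x}")
        for offset, byte in enumerate(data)
        if (byte < 0x20 or byte == 0x7F) and byte not in allowed
    ]


# Hypothetical samples: a vertical tab (0x0b) hiding at the end of a header line.
emailed = b"X-Originating-IP: [91.171.112.7]\x0b\r\nMAIL FROM: <a[at]b>\r\n"
pasted = b"X-Originating-IP: [91.171.112.7]\r\nMAIL FROM: <a[at]b>\r\n"

print(find_control_bytes(emailed))
print(find_control_bytes(pasted))
```

Any offsets reported for one submittal but not the other would point at the characters that went missing.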

Farelf · February 4, 2009

...Nothing 'you' can do about it, short of trying to get the folks in charge of that server to analyze it from their end, to see if they could sort out how/why it happened.

Thanks Wazoo - this type represents just a few percent of the total spam I see through that routing which I am about to drop so I guess we just let it go. It can be looked at again if it becomes a more general problem.

I wrote to Don about the other case (being mostly the "Send Spam Report(s) Now" action looping because no (devnull) address provided at the time of submission) but haven't heard back. I guess it's fairly rare but still think it has the potential to play hob with VER and quick reporting.

Farelf · February 5, 2009

...I wrote to Don about the other case (being mostly the "Send Spam Report(s) Now" action looping because no (devnull) address provided at the time of submission) but haven't heard back. I guess it's fairly rare but still think it has the potential to play hob with VER and quick reporting.

Don responded - nothing of use to him here (he can't replicate it, everything works now). VER-quick reporting problems he commented on elsewhere also just 'went away' as quickly as they came. Well - one or two ideas occur to me but of course conjecture is fruitless (and 'don't pay the bills').

So, topic is pretty much a waste of space (sorry). Except to record these things sometimes happen (but mostly don't).

C2H5OH · February 10, 2009

Don't know if this is related, but today I've had a couple of spam submissions bounced back.

They were sent to my submit.xxx.[at]spam.spamcop.net forwarding address, but elicited a reply beginning,

"SpamCop encountered errors while saving spam for processing:

SpamCop could not find your spam message in this email:"

I re-sent both using the web interface (2-window Outlook workaround) when both parsed and reported without problems.

I can provide a link to the submissions that worked if that would help, but of course I have no record of the failed submissions other than the bounced (and un-munged) emails.

Just an observation. Two bounce/re-submits is not the end of the world ;-)

Farelf · February 10, 2009

..."SpamCop encountered errors while saving spam for processing:
SpamCop could not find your spam message in this email:"...

Sounds like a different thing. I've had that message recently when my mail application 'kindly' converted my forwarded spam into a TIF (image). This is Mozilla and I'm not at all sure this would be general behavior (hopefully not but presumably all sorts of other mail applications can handle that format, so just maybe it is). It did it that way because, after I composed the e-mail and before I sent it, I moved the spam from "Junk" to "Trash". If I send it immediately, before moving files, SC has no problem with it.

After years of using Mozilla mail and submitting SC reports, this is apparently the first time I had followed a slightly different sequence and it had a quite unexpected result. Moz also does things a little differently, depending how I move the spam messages into the mail for submission (select a batch and 'drag and drop' or select a batch and menu/right click 'forward as attachment') but SC has no problems either way. No matter how I do it, my ISP likes to silently block my submissions, which sometimes complicates working out just what is happening.

Anyway, this might show some of the complication that can attend this business. Your own case may be quite different but something prevented SC stripping the 'envelope' from your forwarded spam (individual or batch, it is much the same process with e-mail submissions). It might have even been just a momentary glitch with SC - there have been other instances where that seemed to be the case, IIRC. If it happens again, try searching these pages for either or both phrases of the error message to see what other 'wrinkles' (and, hopefully, solutions) there may have been in the past. Note, the mail application you use is an important factor and it is often seen that the errant spam has been forwarded 'in-line', as opposed to 'attached' but I'm sure you're well aware of that last bit.

Farelf · February 21, 2009

http://www.spamcop.net/sc?id=z2581127020za...39262c79fb416ez

http://www.spamcop.net/sc?id=z2581137306zc...bedeb6681bb4baz

These are the same spam, the first case an e-mailed submission (forwarded as attachment in batch) the second case view page source, select-all, copied and pasted from Mozilla mail 4.79 directly into the parser box. The spam is malformed in any event (looks like cases analyzed on these pages before with "DATA" lines in the header without colons) but in the paste-in case it at least proceeds to offer reports. In the e-mailed case, 'something' has apparently broken the headers with a 'blank space' insertion so there is "Nothing to do":
X-Originating-IP: [91.171.112.7]

MAIL FROM: <standardx[at]themeltmethod.com>
...

Just to note that I am seeing an increasing number of these, that is with the added blank line after the

"X-Originating-IP: [nnn.nnn.nnn.nnn]" line

in e-mail submissions and that it happens with SeaMonkey as well as with old (now superannuated) Mozilla mail - actually 1.7.13, not 4.79 (which would be Netscape) as I had said, FWIW. So, it may well be happening with Thunderbird as well. I can't help feeling if there are several mail clients involved it may be something of a hole in the reporting process rather than a mere quirk. Perhaps worth watching out for, after all, especially for Quick/VER reporting.

Wazoo · February 21, 2009

Just to note that I am seeing an increasing number of these, that is with the added blank line after the
"X-Originating-IP: [nnn.nnn.nnn.nnn]" line

and noting .. prior to the "shouldn't be included handshaking dialog" inserted into the middle of the headers. I'm not sure I can see how your e-mail client could be causing the problem again, as that specific data shouldn't exist in the first place. Why you're seeing it from 'different' (?) sources is the mystery to me ....

Farelf · February 21, 2009

...I'm not sure I can see how your e-mail client could be causing the problem again, as that specific data shouldn't exist in the first place. Why you're seeing it from 'different' (?) sources is the mystery to me ....

Ah, yes, sorry, I was forgetting that point. All spam currently reported is forwarded from an attglobal account through prserv.net servers to inbound iinet, most of it is okay. Those few (but increasing number and proportion) that aren't okay have numerous things wrong with them that you mentioned, in addition to the mime boundary thing.

Not hard to see why there might be parsing problems but the importance of your observations about the SMTP session slipped my mind, coming back to it. They are certainly coming from different networks (almost certainly through a distributed botnet) though quite often they come in pairs from the same IP address - (mostly) Viagra and (sometimes) diplomas being the two products currently pushed but with slightly different headers.

http://www.spamcop.net/sc?id=z2635747862z4...95128b80530cc6z as an example of the diploma 'stream' (pasted in, so the headers parsed). Similar "From:" names (relatively mainstream 'Anglo'). Looks like it can happen with different server pairs between attglobal/prserv.net and iinet which scotches the possibility of a simple solution. Maybe it is just those two networks and unlikely therefore to affect others (hard to imagine why it is not more generally observed otherwise). Anyway iinet's filtering would certainly prevent it reaching users with filters enabled which is just about all of them except me, AFAICT, and me only recently. That network uses IronPort systems for that job so the stuff would be effectively blocked on many other networks as well, should any of those have a similar issue.

Whatever it is is unlikely to be fixed by the networks. Like most such occurrences, I guess it will just go away sooner or later. It certainly won't be affecting me much longer. That attglobal account has been retired and I'm well into the complimentary 30 day period of grace in which they continue to forward stuff. Or, even before the expiry of that, SWMBO just might shoot me for spending too much time reporting. I shouldn't have taught her to shoot. Still, more merciful than the carving knife ... probably. (Maybe I didn't teach her that well).

Farelf · February 24, 2009

The broken spam continue - between 12 and 18 percent of my total. Anyone on the same 'lists' as I am might see the same (if their 'feed' is unfiltered) but it might also be specific to message handling between two specific providers (attglobal.net with no filtering for forwarding accounts-iinet.net.au with by-passed filtering). Posted for the record, while I still have access to this spam stream - some recent examples, using OE this time - the first in each pair is the email submission, the second is the same spam copied and pasted. Same with other mail clients used (Mozilla 1.7.13, SeaMonkey). Evidently not much concern which is as well, apparently little/nothing can be done about it, as discussed above.

http://www.spamcop.net/sc?id=z2644926911z2...14a629c7a95efbz

http://www.spamcop.net/sc?id=z2644937884z0...c184bfa64a3059z

http://www.spamcop.net/sc?id=z2644926914zb...a7f4d54562e7a3z

http://www.spamcop.net/sc?id=z2644940795z7...b624cdf1c194e4z

http://www.spamcop.net/sc?id=z2644926917z1...3c4579ce507f50z

http://www.spamcop.net/sc?id=z2644944794z3...f48ede722751ecz

http://www.spamcop.net/sc?id=z2644926976z9...a2e7284d49adf9z

http://www.spamcop.net/sc?id=z2644959073z1...699bef84f9eb3bz

http://www.spamcop.net/sc?id=z2644926995z5...b98af5b801702fz

http://www.spamcop.net/sc?id=z2644964040zb...dcf773865fcec4z

Don saw the first example of this I posted.

Wazoo · February 24, 2009

some recent examples, using OE this time - the first in each pair is the email submission, the second is the same spam copied and pasted. Same with other mail clients used (Mozilla 1.7.13, SeaMonkey). Evidently not much concern which is as well, apparently little/nothing can be done about it, as discussed above.

Nope, I'll stand on that you have no control over what's being received .. specifically, it's not your e-mail client(s) at fault. The disappearing extra blank line has to be due to non-printable characters not making it through the cut/paste operation.

I almost jumped on the commonality of the kcin04 server, but .. you threw in one that went through kcin03. So the only next thought would be that the common software/configuration between those two servers is allowing the same kind of hack/bad traffic to pass. To my eyes, it's the network folks in control of these prserv.net servers that need to be involved with sorting this out ... and from experiences on this side of the pond, good luck at trying to get into a meaningful discussion with them (directly) .... especially with your outgoing spam blockage issues <g> I didn't come up with a good NOC address in the various WHOIS look-ups done here, although surprised at a few of the addresses that did pop up.

Farelf · February 24, 2009

...To my eyes, it's the network folks in control of these prserv.net servers that need to be involved with sorting this out ... and from experiences on this side of the pond, good luck at trying to get into a meaningful discussion with them (directly) .... especially with your outgoing spam blockage issues <g> I didn't come up with a good NOC address in the various WHOIS look-ups done here, although surprised at a few of the addresses that did pop up.

All interesting, thanks for that cogitation and work. At the end of the day, if it is just a few (or one) reporters affected then it is no big thing - but the potential is a little worrying. I think I've seen all four of the kcin0n.prserv.net servers involved and that is a heck of a volume if they were to act up more generally with other networks.

Farelf · March 17, 2009

When a "seniors moment" extends to six weeks an unkind commentator might call that more than a moment but, comforted by the absence of such unkind folk here, I confess as follows. While I was the only user complaining about broken headers with the above spam (leading to "nothing to do" but that is "fixed" by pasting-in the full headers when the disruptive line break in the headers automagically "evaporates", being some sort of phenomenon of the mail process and the servers used or something) it seems other(s) (current News Group thread) have been receiving stuff in the same general format when one of the concerns is the "error: couldn't parse head" message.

Well, the short answer is there is no way (I can see) to fix this without coming into conflict with the (No) Material changes to spam rule. The mime boundary/delimiter needs to be fixed for the final boundary, the bogus "Data" headers need a following colon (or to be removed) and ...? I still couldn't get it to work, as an exercise in determining exactly what was wrong with the construction. Until Mike Easter had a look at it in a NG post and, quick as a flash, picked up "The mailfrom/rcptto 'header' fieldnames have spaces. No good." Yep, the original, bogus "MAIL FROM:" has to be "MAIL-FROM:" and the several "RCPT TO:" need to be "RCPT-TO:". Heck, I KNOW that stuff. Or maybe "knew". Just didn't see it for whatever reason. So, the trial "fixed" version would be http://www.spamcop.net/sc?id=z2706614109z5...115a808bc2d756z - and those changes are way beyond anything allowed, in my estimation (compare to the original http://www.spamcop.net/sc?id=z2581137306zc...bedeb6681bb4baz in the opening post).

Not to worry, the link doesn't resolve and possibly never did - apparently being one of those 14 cent auto-registered domains brought to the internet as a service from the "loyal entrepreneurs" of the People's Republic. They certainly deserve that little gold star on their flag. Other spam of similar construction push moody timepieces and fake sheepskins IIRC, the latter 'never' having a link anyway (just a handy 'phone number).

Anyway, as it seems, simply an "intellectual exercise" - and the intellect was found wanting, as confessed. But worth documenting, perhaps, as detail of just what is 'wrong' with that style of spam construction, should it persist. Quick/VER reporters are unaffected (no body parsing anyway).

StevenUnderwood · March 17, 2009

This basically sounds like broken software, since a generic SMTP session would go:

HELO host.domain.tld

MAIL FROM: <address[at]domain.tld>

RCPT TO: <address2[at]domain2.tld>

DATA

...
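
To make the point concrete: MAIL FROM, RCPT TO and DATA are commands of the SMTP conversation, not header fields, so they should never show up inside a message's header block. A rough sketch of a check for such leaked lines follows. The function name and the sample header block are hypothetical, and the pattern covers only the three commands named in this thread.

```python
import re

# SMTP envelope commands mentioned in the thread; matched case-insensitively.
SMTP_COMMAND = re.compile(r"^(MAIL FROM:|RCPT TO:|DATA\b)", re.IGNORECASE)


def leaked_smtp_lines(header_block: str) -> list[str]:
    """Return header-block lines that look like SMTP commands rather than headers."""
    return [line for line in header_block.splitlines() if SMTP_COMMAND.match(line)]


# Hypothetical header block loosely modelled on the malformed spam described above.
sample = (
    "Received: from example.invalid by mx.example.invalid\n"
    "X-Originating-IP: [91.171.112.7]\n"
    "MAIL FROM: <address[at]domain.tld>\n"
    "RCPT TO: <address2[at]domain2.tld>\n"
    "DATA\n"
    "Subject: sample\n"
)
print(leaked_smtp_lines(sample))
```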

Farelf · March 17, 2009

This basically sounds like broken software, since a generic SMTP session would go:
HELO host.domain.tld

MAIL FROM: <address[at]domain.tld>

RCPT TO: <address2[at]domain2.tld>

DATA

...

Ah, yes - not very familiar with SMTP 'sessioning' but now you post it ... certainly explains those odd 'headers'. This is the sort of thing Wazoo was driving at but I didn't glom the mechanics behind it then. Thanks.

turetzsr · March 17, 2009

When a "seniors moment" extends to six weeks an unkind commentator might call that more than a moment but, comforted by the absence of such unkind folk here, I confess as follows.
<snip>

...Assuming that this means your inquiry is resolved, I am so marking this Forum thread. Please let us know if that assumption is mistaken so that it can be unmarked. And thanks for taking the time to let us know!

Wazoo · March 18, 2009

When a "seniors moment" extends to six weeks an unkind commentator might call that more than a moment but, .......
Until Mike Easter had a look at it in a NG post and, quick as a flash, picked up "The mailfrom/rcptto 'header' fieldnames have spaces. No good."

And I'll admit that when I saw Mike's post/explanation, I looked high and low for the "extra space" and basically decided I was wasting too much time ..... only now do I understand what he was actually referring to ...

Yep, the original, bogus "MAIL FROM:" has to be "MAIL-FROM:" and the several "RCPT TO:" need to be "RCPT-TO:". Heck, I KNOW that stuff. Or maybe "knew". Just didn't see it for whatever reason.

I still disagree ... those "statements" should not be in the e-mail headers at all ....

Anyway, as it seems, simply an "intellectual exercise" - and the intellect was found wanting, as confessed. But worth documenting, perhaps, as detail of just what is 'wrong' with that style of spam construction, should it persist. Quick/VER reporters are unaffected (no body parsing anyway).

I still stand on that something is screwed up with the handshaking/handling involved of the incoming e-mail. These statements should not make it into the headers of the e-mail ....

Ah, yes - not very familiar with SMTP 'sessioning' but now you post it ... certainly explains those odd 'headers'. This is the sort of thing Wazoo was driving at but I didn't glom the mechanics behind it then. Thanks.

Just repeating the above. The (test) 'fix' isn't valid, but apparently ended up being "close enough" for the parser to somewhat/partially fly with the results ... not sure if this is actually a good thing or not.

Farelf · March 18, 2009

...Just repeating the above. The (test) 'fix' isn't valid, but apparently ended up being "close enough" for the parser to somewhat/partially fly with the results ... not sure if this is actually a good thing or not.

OK, thanks for the clarification/information. Considering the results, I think it doesn't really matter, the parser (ultimately) only accepts stuff in the headers that is configured like a header which is to say:

a title, ending in a colon, with no included spaces, together with

anything (or nothing) following the colon on the same line, or

anything indented as one or more continuations of that line.
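
The three rules above can be sketched as a small classifier. This only illustrates the rules as stated in this post, not SpamCop's actual parser; the function name and the labels are invented.

```python
import re

# A field name: one or more printable characters, no spaces or colons, then a colon.
FIELD = re.compile(r"^[!-9;-~]+:")


def classify_header_line(line: str) -> str:
    """Label a header-block line according to the three rules listed above."""
    if line[:1] in (" ", "\t"):
        return "continuation"  # indented: continues the previous header
    if FIELD.match(line):
        return "field"  # title with no spaces, ending in a colon
    return "invalid"


for text in (
    "Subject: hello",
    "\tfolded text",
    "MAIL FROM: <a[at]b>",
    "MAIL-FROM: <a[at]b>",
    "DATA",
):
    print(repr(text), "->", classify_header_line(text))
```

Under these rules "MAIL FROM:" fails because of the space in the title, while the hyphenated "MAIL-FROM:" passes, which matches what Mike Easter pointed out.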

Despite this, it parses through the delivery chain (if present), apparently unperturbed by any/some shenanigans in the headers (as long as there is a proper separation between them and the body) and only drops back to (apparently) reconsider the headers when it is done with that, the 'main mission'.

Why it needs to reconsider the headers when it is looking at the body, I have little idea - MIME information, obviously, but it is prepared now to be upset not only with the body but also with stuff in the headers not meeting the basic format requirements. Mike Easter once suggested the header and body parsing in fact might run concurrently and are only output consecutively which could explain the seemingly out-of-sync or conflicting messages seen at times (also with messages arising from the resolution of sending source domain/network owners and reporting addresses and with the de-obfuscation of links and ditto reporting address) - all as maybe semi-autonomous processes within the overall processing.

I don't know why it works like that but of course anything can exist in a header if it is properly formatted. Legitimate experimental headers of great variety (with or without the handy "X-" delimiter) are allowed by rfcs and are thus largely indistinguishable from spammer spoofs and software misbehavior to the casual observer (and are anyway invisible in normal use).

It is neat that the parser can work through to find the spam source despite the obstacles. It seems a little strange that it can't do the same with links but there are probably good reasons for that (and no value to be gained in the example examined in any event). It will be interesting to see how the new version copes in comparison. When it is finally released. I might run the original spam (with altered dates) through that, experimentally, just to see. Busted bulk-mailer (as it certainly seems) or not, there's some incidence of its products. And (partially) broken spammer tools seem to be a recurrent feature of spamdom, down through the years.

Any links presently in this mangled spam are out of reach of SC reporting and that's presently no loss.

Wazoo · March 18, 2009

the parser (ultimately) only accepts stuff in the headers that is configured like a header which is to say:
a title, ending in a colon, with no included spaces, together with

anything (or nothing) following the colon on the same line, or

anything indented as one or more continuations of that line.

OK .. I'll also add that I feel educated a bit here. I recognized that block, but admit that I wasn't sure what he was shooting at at the time .. again, I hadn't found the "extra spaces" yet ....

My eyes were seeing the "SMTP commands" which were constructed properly, and apparently I let my much more normal 'literal' mode slip away, more or less doing the 'translation' in my head (SMTP traffic to e-mail client/server output) .... which is why I "couldn't" see the "extra" spaces .. just running with that these lines shouldn't have been seen within the header to begin with. Now that I've caught on to just what 'extra spaces' were being discussed, I now also follow what Mike was trying to say and why in that additional explanation. Thanks for clearing up the clutter for me.
