Parsing problem?


Guest art101

If this is covered elsewhere, please excuse the duplication. Couldn't find anything that answers this specific problem.

I rarely use the mail side of the house. I log in directly at spamcop.net to view my held mail with the 'held mail' tab, then report it with the 'report spam' tab. Therefore, I don't in any way alter the headers of original inbound spam.

The following problem crops up fairly often, and I don't understand it. When parsing held spam, I often see the following text in the "Finding links in message body" section:

"error: couldn't parse head

Message body parser requires full, accurate copy of message

More information on this error.. [this line is a link to another page]

no links found"

The "More information on this error.." link goes to a page discussing "Problems with spam not in original format" and reviews errors that can be introduced when users don't properly submit spam - specifically, erroneous wrapping of long email header lines being submitted to SpamCop.

That page states that this is "an error introduced by the recipient (you) when copying or submitting email to spamcop." The page provides details about how this error can be induced post-receipt.
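For anyone who wants to check a raw message for the kind of bad header wrapping that page describes, here is a minimal sketch in Python. It is not how the SpamCop parser works (that isn't documented in this thread); it only applies the general email rule that every header line must either start a field ("Name:") or begin with a space or tab as a continuation of the previous field. The function name `find_suspect_header_lines` and the sample message are made up for illustration.

```python
import re

# A header field starts with a name made of printable ASCII characters
# (no spaces, no colon) followed immediately by a colon.
FIELD_START = re.compile(r"^[!-9;-~]+:")


def find_suspect_header_lines(raw_message: str) -> list[tuple[int, str]]:
    """Return (line_number, text) for header lines that are neither the
    start of a field nor a properly indented continuation line."""
    suspects = []
    for number, line in enumerate(raw_message.splitlines(), start=1):
        if line == "":
            break  # the first blank line ends the header block
        if FIELD_START.match(line):
            continue  # looks like "Name: value"
        if line[0] in " \t":
            continue  # folded continuation of the previous field
        suspects.append((number, line))  # wrapped without indentation
    return suspects


# Hypothetical message whose Subject line was wrapped without indentation.
sample = (
    "Received: from a.example by b.example;\n"
    "\tMon, 29 Nov 2004 09:27:02 +0300\n"
    "Subject: a very long subject that something\n"
    "wrapped without indentation\n"
    "To: x@example.com\n"
    "\n"
    "body text is not checked\n"
)

for number, text in find_suspect_header_lines(sample):
    print(f"line {number}: {text}")
```

A line flagged this way only shows that the header block is malformed as received; it says nothing about whether the damage came from the sender, a mail server, or the recipient.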

Since the recipient (me) has done nothing with or to the original spam or its headers, and has only just seen the spam for the first time when attempting to report it, I'm unclear about what's going on - and what, if anything, I can do about this problem. Any tips or advice is appreciated.

For a recent example from earlier this afternoon, see this tracking URL:

http://www.spamcop.net/sc?id=z697232746z0c...4c134ff24fc750z


The SpamCop Parser is excessively (IMHO) pedantic about what URLs it is willing to report on your behalf (reporting "no links found" when certain rules are broken by the spammer that would be broken by OE/IE and other mailreaders/browsers in their attempts to be "helpful"), and you shouldn't go around willy-nilly changing the spam to make the URLs reportable. You can complain (to deputies <at> admin.spamcop.net with a Tracking URL) about the excessive pedanticism, and you can file Manual Reports. Please see my reply at http://forum.spamcop.net/forums/index.php?...indpost&p=20110 for more info on this issue. Thanks!


The "scrunch" factor going on in those headers ... see http://forum.spamcop.net/forums/index.php?showtopic=2158 for a previous go-round on that problem. There are other issues discussed there and reference to other Topic discussions (linked to within this discussion) .. some stuff was brought over from some discussion going on in the newsgroups ... and once upon a time, I even thought I might have found something possible from within the Horde/IMP Forum, but JT advised that I was way off base ... To my knowledge, I don't know of "the final solution" other than it does seem to be a specific spammer (or some specific spam-tool in use) that's actually at the root of this particular issue. I believe it was within this Topic that I was conjecturing / asking for some way to figure out how to gather enough samples to find the common denominator, but .....

Other possibilities: http://forum.spamcop.net/forums/index.php?showtopic=2927

http://forum.spamcop.net/forums/index.php?showtopic=2014

http://forum.spamcop.net/forums/index.php?showtopic=1866

http://forum.spamcop.net/forums/index.php?showtopic=1471

http://forum.spamcop.net/forums/index.php?showtopic=734

http://forum.spamcop.net/forums/index.php?showtopic=229


The SpamCop Parser is excessively (IMHO) pedantic about what URLs it is willing to report on your behalf (reporting "no links found" when certain rules are broken by the spammer that would be broken by OE/IE and other mailreaders/browsers in their attempts to be "helpful"), and you shouldn't go around willy-nilly changing the spam to make the URLs reportable.

Thanks for your reply, Jeff.

I do not use OE or IE and never will. Bill Gates is probably the Antichrist. I don't go around willy-nilly changing spam to make URLs reportable. As I tried to explain in my original post, I simply log in to my SpamCop account and attempt to report held mail via SpamCop's interface... just like any other user who pays for SpamCop services. I occasionally run into the bog I tried to describe in this post.

I'm asking a question about spam I've not yet been raped by on my local server (kudos to SpamCop's excellent filters), which I have not altered in any way. I have not knowingly broken any rules. I'm on your side in this awful battle against the spam avalanche.

If I was unclear in my original post, please accept my apologies and tell me where to go. Tell me to go to hell if you want, but direct me to an answer I can use.

Andy

http://art101.com


PS:

While typing my latest, two new spams arrived at my SpamCop address:

[8526] akkxwzbtlwcq[at]emailaccount.com (Get Laid Tonight ! Preview )

Mon, 29 Nov 2004 09:27:02 +0300 (Blocked bl.spamcop.net )

[8527] gitqiqoqonj[at]verizon.net (Rolex Datejust Watch uew Preview )

Mon, 29 Nov 2004 08:44:31 -0700 (Blocked bl.spamcop.net )

I'm really beginning to hate the net. It's been hijacked by thugs, politicians, and racketeers.


Well, both appear to have been blocked by the SpamCop filtering sequence ... not sure what you'd like to have done with the references provided, there's no way for anyone here to "look" at those spams ... the second one has not been referenced by anyone before as having the header-squashed problem, but if you hit the FAQ, Marjolien's Ban spam page has a reporting address at Rolex ....


Andy, I was not criticizing you. I was formulating in my head a rant about the excessive pedanticism in the design of the parser, and your Topic just highlighted that particular deficiency sufficiently to urge me to post that rant. I'm sorry if you took my reply as a criticism of you or your actions.

I'll just take this moment to flesh out that rant a little more: IMHO, the SpamCop Parsing and Reporting Service should not be HTMLCop, MIMECop, or SMTPCop, unless it wants to complain to the spammer's ISP about the spammer's disregard of HTML, MIME, and/or SMTP standards while it's complaining about the spam, rather than refusing to deal with the URL(s) in such spam because they wouldn't be clickable in perfectly standard strict HTML-rendering mail readers.

An answer you can use is, as I wrote previously, that you can complain to the Deputies and you can send Manual Reports. I'm sorry if that's insufficient for you, but I'm in the same boat you're in.


Jeff:

I believe the code is as tight as it currently is because there were previously too many complaints about innocent bystanders' (IBs') sites being reported as spamvertized after spammers placed links to them in spam (some of which would not even show up in the message). Some people believed the visible ones were there to make the spam look legit. Some people believed they were there to try to get by the content checkers (it was checked by Norton, so it must be OK).

If you can come up with a foolproof way to automatically determine spamvertized links without excessively reporting IBs, please present that to Julian so he may use it.


I appreciate the replies here, especially Jeff G's. Sorry if I came off a little thin-skinned, Jeff - but like millions of white-hat spam fighters, I'm just looking for a reliable, seamless way to report spam - and (hopefully) make a dent in the ongoing avalanche.

Frankly, I don't know much about IBs - or the mysterious inner workings of any related stuff. All I know is that when I attempt to report spam in my 'held spam' folder, SpamCop occasionally impugns an altered header I've never, ever, seen before... let alone altered willy-nilly. I think this is a boo-boo in the parser, and I was attempting to point it out so that it can be corrected.

Maybe what I'm really getting at is this:

I'm fairly facile with the net. I sorta know my way around. I design lots of pretty pages for lots of clients who give me money to make their web pages sing. All of them are sickened and outraged at the ongoing theft of their time and bandwidth by jerkoff spammers. They don't know how to fight the assault. I often point them to SpamCop, and encourage them to buy a SpamCop email address. I ask them to spend maybe five or ten minutes a day reporting spam they might otherwise simply delete.

For run-of-the-mill Internet users, SpamCop isn't particularly seamless or easy to configure and use. Still, it's maybe one of the best, most ethical, most effective solutions I've found. But if I run into a bog and can't figure it out, no matter how hard I try, how can I hope that anyone I refer to this resource will actually use it? It's all got to be seamless and solid as a rock. And easy to use. The new web interface design is a good start, but there's much more work to be done.


Archived

This topic is now archived and is closed to further replies.
