Parser fooled by <p> tags at end of URL


neviller

For the last couple of days I've been receiving quite a few spams with <p> tags added immediately after the advertised URL, and this seems to prevent SpamCop making a report. Examples are:

http://yexokoheh.cn<p><p>

http://coredetionses.ru<p>

Can the parser be changed so that it's not fooled by these?
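Just to illustrate what seems to be happening (this is only a rough Python sketch, not SpamCop's actual code, and the sample text is made up): a scanner that grabs everything up to the next whitespace keeps the <p> tags, because "<" isn't whitespace, while trimming any trailing tags would recover the bare URL.

    import re

    # Made-up spam body text for this illustration.
    body = "Visit http://yexokoheh.cn<p><p> today"

    # A naive scanner that takes everything up to the next whitespace
    # drags the trailing tags along with the link.
    naive = re.findall(r'https?://\S+', body)
    print(naive)    # ['http://yexokoheh.cn<p><p>']

    # Stripping any trailing <...> tags recovers the bare URL.
    cleaned = [re.sub(r'(?:<[^>]*>?)+$', '', u) for u in naive]
    print(cleaned)  # ['http://yexokoheh.cn']

(Inserting a space before the <p> has the same effect, since the scan then stops at the space.)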

Please provide a tracking URL for one of these. It is more likely that the parser is simply not finding the link. Have you tested to see that the link is found if that code is removed?


...Also, please see the SpamCop FAQ entry (you'll find a FAQ link near the top left of nearly all SpamCop Forum pages) labeled "SpamCop reporting of spamvertized sites - some philosophy." The bottom-line message of that entry: finding and reporting spamvertized web sites is, at best, a secondary function of the SpamCop parser.

...Facilities designed specifically to find and report spamvertized web sites are:


As Steven says, we need the Tracking URL to know what's happening. The parser attempts to resolve links in either plain text or HTML, and it would be necessary to see the "Content-Type:" specified in the message, the actual format, and the error message (something like "Cannot resolve http://P") to be sure just what is going on. The parser doesn't handle mangled content (spammer bungling) quite as well as the browsers and mail clients from those fine folk at M$, who aim to make our internet experience a memorable one and to heck with "standards".
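If you want to poke at one of these yourself, here's a rough sketch (Python, nothing to do with the actual parser code; the file name is just a placeholder) of checking what Content-Type each text part declares and whether a simple extraction still finds the link:

    import email
    import re
    from email import policy

    # Load a raw spam message saved to disk (placeholder file name).
    with open("spam.eml", "rb") as fh:
        msg = email.message_from_binary_file(fh, policy=policy.default)

    # Walk the MIME parts, show the declared Content-Type of each text part,
    # and try a simple URL extraction on the decoded body text.
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        text = part.get_content()
        links = re.findall(r'https?://[^\s<>"]+', text)
        print(part.get_content_type(), links)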

As Steve T alludes, "spamvertized" links are not a big priority for SC for a number of reasons, so even if the parser's "problem" is identified and looks simple to correct, there is not a high probability of it actually being addressed. It is probably better to look to the specialists in the rogue-website field - which is what the links he provides are for.

But you can use the parser to find SC reporting addresses just by cleaning up the link (remove the <p> paragraph tags) and pasting that URL (by itself) into the paste-in window of the submission form. You should see results like:

http://www.spamcop.net/sc?track=http%3A%2F%2Fyexokoheh.cn

http://www.spamcop.net/sc?track=http%3A%2F...oredetionses.ru

Note: if you modify the content like that, you can't send a SpamCop report (it breaches the "material changes" rule). Also, if any of the resulting addresses are special SC ones (they contain "spamcop" in the address), they should not be used for your own ("manual") reports - those have been set up in conjunction with the provider and should only be addressed through standard SC reporting.
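For the curious, those "track" results above are just the cleaned link URL-encoded into a query string - a trivial illustration (again only a sketch, using the first example above):

    from urllib.parse import quote

    # The cleaned link, with the <p> tags already removed by hand.
    cleaned = "http://yexokoheh.cn"

    # The paste-in form effectively URL-encodes the link into a "track" query:
    lookup = "http://www.spamcop.net/sc?track=" + quote(cleaned, safe="")
    print(lookup)   # http://www.spamcop.net/sc?track=http%3A%2F%2Fyexokoheh.cn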


SteveT rightly highlights the low priority SpamCop places on reporting spamvertised URLs. Personally, I go a lot further and would say it isn't worth wasting time worrying about URL reports that don't work. I've seen no evidence that SC URL reports make any difference whatsoever.

Andrew


Please provide a tracking URL for one of these. It is more likely that the parser is simply not finding the link. Have you tested to see that the link is found if that code is removed?

Here they are:

http://www.spamcop.net/mcgi?action=gettrac...rtid=4841785349

http://www.spamcop.net/mcgi?action=gettrac...rtid=4840554718

Yes, I confirm that I tested and found that the link would generate a report if a space was inserted between the end of the URL and the <p>.


Those aren't tracking URLs - only you and SC staff can see what is behind them. You need to click the "Parse" button when you go to one of those links and copy the line near the top (underneath "Here is your TRACKING URL - it may be saved for future reference:") that looks something like

http://www.spamcop.net/sc?id=z3809926782z8...60895bfb02bc31z (mouse over to see the full link; it is not actually abbreviated).

But it probably doesn't matter - it is unlikely to show anything new, as mentioned/implied several times already. Based on history, SC developers are never going to chase after every possible permutation of message mangling to extract data that is marginal to the 'mission' - unless (maybe - just supposing) some such specific mangling becomes really widespread, or there is an error in the processing that poses some sort of threat to the SC operation.

Spamvertized URIs/URLs - when the parser does resolve them - may be added to the SURBL, but that is not actually part of SC. And if anyone alters their spam to allow resolution of the URI, they breach their agreement with SC, bring the integrity of the system into question and threaten ISP trust in it - and their reporting will certainly be suspended/banned as soon as SC detects it.

It is definitely worth raising the matter from a fellow-user point of view (and thanks for that) - both to air and remind everyone of the factors involved, and to collect any chorus of "me too" responses that might indicate a systemic problem SC just might want to look at.


Sorry, here's a tracking URL:

http://www.spamcop.net/sc?id=z3809959229zb...1a105b741e5ae2z

I have now read the advice about KnujOn and added a KnujOn reporting address into my SpamCop prefs, so I can now send my spam to them as well, with just one extra click each time. It was worth starting this thread just to pick up on that advice.

It would be nice if that extra reporting address could be ticked by default on the SpamCop "Send reports" page.

Hmmm ... well, I don't see anything particularly broken in that HTML (and the content type is correctly nominated) - absolutely any browser or mail agent is going to render it without difficulty (no attempt is made to provide a clickable link, just the text of the URI). That may not be the same as "standards compliant", but still ... I'm no expert, but I agree it would be nice for SC to resolve it, just as it would for plain text. It is almost as if the design is tailored to avoid SC parsing. Probably I'm just being paranoid.
