
Error Resolving Tracking Link with # and no /


bjornstar


I've got a spate of spam coming in where SpamCop is failing to parse the tracking links.

http://www.spamcop.net/sc?id=z4736244379z0...48e636bd808d9cz

http://www.spamcop.net/sc?id=z4736236365z9...de787834c68e9az

http://www.spamcop.net/sc?id=z4736232813z8...1019cb5e308186z

http://www.spamcop.net/sc?id=z4736230134z2...cbd16d3a59d592z

http://www.spamcop.net/sc?id=z4736223636za...1479d74418509fz

All of these tracking links contain percent-obfuscated URLs sprinkled with "#", "[at]", and "." characters.

In one example:

http://%36%37%75%35%38%65%73%79%64%6d%62.%6e%62%77%36%39.%63%6f%6d#%40.%76%61%66%6d%68%66%67%75%77%61%6c%75.%63%6f%6d

Successfully decodes to:

http://67u58esydmb.nbw69.com#[at].vafmhfguwalu.com

But then Spamcop reports "Cannot resolve http://%36%37%75%35%38%65%73%79%64%6d%62.%6e%62%77%36%39.%63%6f%6d#%40.%76%61%66%6d%68%66%67%75%77%61%6c%75.%63%6f%6d"

It looks like your attempts to resolve explode when you run into a # or a [at] or maybe both.

Hope this bug report reaches friendly ears and the issue gets resolved.
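The decode step described above can be reproduced in a few lines of Python; `urllib.parse.unquote` here just stands in for whatever SpamCop uses internally, and note that "[at]" in these posts is forum shorthand for a literal "@":

```python
from urllib.parse import unquote

# The obfuscated link from the example above: every letter is
# percent-encoded, while ".", "#" and the "%40" ("@") are structural.
obfuscated = ("http://%36%37%75%35%38%65%73%79%64%6d%62.%6e%62%77%36%39."
              "%63%6f%6d#%40.%76%61%66%6d%68%66%67%75%77%61%6c%75.%63%6f%6d")

decoded = unquote(obfuscated)
print(decoded)
# http://67u58esydmb.nbw69.com#@.vafmhfguwalu.com
```

The unescape succeeds; it is only the step after this, resolving the decoded string, that fails in the parser.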


...

It looks like your attempts to resolve explode when you run into a # or a [at] or maybe both.

Hope this bug report reaches friendly ears and the issue gets resolved.

Hi bjornstar,

Unfortunately, link parsing does not get much priority with SC; the focus is on the e-mail origin. However, as you say, the parser just about gets there and then implodes (or explodes). Perhaps it would not take much to complete the job. To spell out what you are saying:

From one of your example parses:

Resolving link obfuscation

http://%71%31%79%72%70%66%6d%62%69.%65%69%79%62%73.%63%6f%6d#%40.%79%77%63%68%74%78%79%75.%63%6f%6d

Percent unescape: http://%71%31%79%72%70%66%6d%62%69.%65%69%79%62%73.%63%6f%6d#%40.%79%77%63%68%74%78%79%75.%63%6f%6d

http://%66%74%6c%6a%32%74%6d%73%6a%72%6d.%75%75%6f%73%65.%63%6f%6d#%40.%7a%79%6b%64%6b%6b%67%6a.%63%6f%6d

Percent unescape: http://%66%74%6c%6a%32%74%6d%73%6a%72%6d.%75%75%6f%73%65.%63%6f%6d#%40.%7a%79%6b%64%6b%6b%67%6a.%63%6f%6d

Tracking link: http://%71%31%79%72%70%66%6d%62%69.%65%69%79%62%73.%63%6f%6d#%40.%79%77%63%68%74%78%79%75.%63%6f%6d

No recent reports, no history available

Unescaped: http://q1yrpfmbi.eiybs.com#[at].ywchtxyu.com

Cannot resolve http://%71%31%79%72%70%66%6d%62%69.%65%69%79%62%73.%63%6f%6d#%40.%79%77%63%68%74%78%79%75.%63%6f%6d

Tracking link: http://%66%74%6c%6a%32%74%6d%73%6a%72%6d.%75%75%6f%73%65.%63%6f%6d#%40.%7a%79%6b%64%6b%6b%67%6a.%63%6f%6d

No recent reports, no history available

Unescaped: http://ftlj2tmsjrm.uuose.com#[at].zykdkkgj.com

Cannot resolve http://%66%74%6c%6a%32%74%6d%73%6a%72%6d.%75%75%6f%73%65.%63%6f%6d#%40.%7a%79%6b%64%6b%6b%67%6a.%63%6f%6d

The "Unescaped:" lines above show the (apparently) successful obfuscation handling, but then the parser loses it.

If it passed on the un-obfuscated links for resolution it would handle them fine. You can paste the links into the webpage submission form and see output like:

Parsing input: http://ftlj2tmsjrm.uuose.com#[at].zykdkkgj.com

Routing details for 58.253.28.32

Using smaller IP block (/ 20 vs. / 13 )

Removing 2 larger (> / 20 ) route(s) from cache

[refresh/show] Cached whois for 58.253.28.32 : abuse-gd[at]china-netcom.com

Using abuse net on abuse-gd[at]china-netcom.com

abuse net china-netcom.com = abuse[at]chinaunicom.cn, abuse[at]anti-spam.cn, spam[at]ccert.edu.cn, abuse[at]cnc-noc.net

Using best contacts abuse[at]chinaunicom.cn abuse[at]anti-spam.cn spam[at]ccert.edu.cn abuse[at]cnc-noc.net

Statistics:

58.253.28.32 not listed in bl.spamcop.net

More Information..

58.253.28.32 not listed in dnsbl.njabl.org ( 127.0.0.8 )

58.253.28.32 not listed in dnsbl.njabl.org ( 127.0.0.9 )

58.253.28.32 not listed in cbl.abuseat.org

58.253.28.32 not listed in dnsbl.sorbs.net

Reporting addresses:

abuse[at]chinaunicom.cn

abuse[at]anti-spam.cn

spam[at]ccert.edu.cn

abuse[at]cnc-noc.net

... Which is the resolution you were looking for (the other link from that example resolves to the same host). Alerting those networks and authorities may or may not achieve anything (certainly they do nothing in a hurry), but there doesn't seem to be much left to fix in the parser before it could send reports.

Perhaps some passing SC staff member will look at these pleadings with a "pitiful eye" and add "parser improperly discards unescape de-obfuscation results containing valid non-alphabetic characters (viz '[at]', '#') - for investigation and fix" to the list of bug fixes/development requests.
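The fix being requested amounts to: unescape first, then stop at the fragment before resolving. A rough sketch of that pipeline in Python, using the second example URL from the parse above (the function name is mine, not anything SpamCop actually exposes):

```python
from urllib.parse import unquote, urlsplit

def resolvable_host(raw_url: str) -> str:
    """Percent-unescape the URL, then return its hostname. A
    standards-compliant split cuts the authority off at the first '#',
    so the decoy domain in the fragment never reaches DNS."""
    decoded = unquote(raw_url)
    return urlsplit(decoded).hostname

raw = ("http://%66%74%6c%6a%32%74%6d%73%6a%72%6d.%75%75%6f%73%65.%63%6f%6d"
       "#%40.%7a%79%6b%64%6b%6b%67%6a.%63%6f%6d")
print(resolvable_host(raw))
# ftlj2tmsjrm.uuose.com
```

The returned host is exactly the one the webpage submission form resolved to 58.253.28.32 in the trace above.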


You can paste the links into the webpage submission form

...And, if you do that, be sure to cancel the reports, not send them!

Posting links won't generate a parser link or ask if you want to send a report. You need to paste a msg source that begins with "Return-Path:", "Delivered-To:" or "Received:" to make that happen.


Posting links won't generate a parser link or ask if you want to send a report.

<snip>

...:blush: <blush> Oops, I was not reading carefully (a common failing of mine, as those of you who have followed my other posts will know well!). Thanks for pointing out the fallacy of my post.

Posting links won't generate a parser link or ask if you want to send a report. You need to paste a msg source that begins with "Return-Path:", "Delivered-To:" or "Received:" to make that happen.
Not sure what we're talking about here, so maybe a general caveat is appropriate.

We can use SpamCop for research by pasting any URL, IP address, domain, or email address into our web spam processing form, and SpamCop will tell you all it knows about it.

We MAY NOT put in information that will lead SpamCop to prepare a report and then send that report. That is a HUGE taboo with us. It is also improper to add information to a spam to make SpamCop "find" and report something that is not part of the original, raw, spam.

- Don D'Minion - SpamCop Admin -

- service[at]admin.spamcop.net -


A little bird told me...

I'm not sure SC is handling this any different than it ever did. I think SC has always gone back to the submitted URL in the parse, it shows the unescaped version only as background information.

Note the unescaped URL contains "#" and "[at]". Those are not legal characters in domain names or file naming. Those are what stop the parse.

This is another example of something borked in IE where it resolves a broken URL and takes you to a webpage. Firefox doesn't resolve the URL. I tried three other online (& Samspade) decoders. None of them could resolve any of the URLs.

This is one where a 'fix' could be risky because you're screwing around with TLD interpretations. It could seriously break something else.

- Don -



Thanks Don, that helps, knowing those characters are "illegal" (well, "[at]" is reserved and "#" is excluded, I now see - RFC 2396 - Uniform Resource Identifiers (URI): Generic Syntax). Yet the parser does "resolve" the unescaped URL if it is fed into the submission form - as shown:

http://www.spamcop.net/sc?track=http%3A%2F...40.zykdkkgj.com

(and using the //members.spamcop.net version of the trace gives the full detail for 58.253.28.32, which I posted earlier and which is the ftlj2tmsjrm.uuose.com part of the address). BUT from what you say this resolution may not be the actual spamvertized site, which is probably why it is not attempted in reports.

Well, I think I can confirm that - I find Firefox (3.6.12 & 3.6.13) will "resolve" that address too - the address quoted in my earlier post is deliberately broken to save it being clicked through, use the one from the above trace. That ends up at a totally different web page (different resolution - 210.59.230.60) highlighting exactly the problems you speak of. Oh, and IE8 goes to the same place which is pchome.com.tw and probably is the actual spamvertized site (they certainly don't want reports) but the parser doesn't even come close to finding them.

I don't know why Firefox/Mozilla is chasing after Microsoft in supporting broken URIs/URLs but it all works out very well for spammers. (For the O/P - sending the parser along that path has long been accepted as too risky and too flaky - and too unending - by [most of] those of the SC user community who are aware of the issue - which is very fortunate because, as Don has said, SC has no intention of treading it.)
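Per RFC 2396 (and its successor, RFC 3986), "#" starts the fragment and "@" delimits userinfo within the authority, so where each character falls determines what a compliant parser sees as the host. A small illustration with Python's `urllib.parse`, which follows the RFC splitting rules; the decoy-host example URL is mine:

```python
from urllib.parse import urlsplit

# In the spam URLs the '#' comes first, so the '@' and the decoy
# domain after it land in the fragment and never affect the host.
spam = urlsplit("http://67u58esydmb.nbw69.com#@.vafmhfguwalu.com")
print(spam.hostname)   # 67u58esydmb.nbw69.com
print(spam.fragment)   # @.vafmhfguwalu.com

# By contrast, an '@' appearing *before* any '#' is userinfo, and the
# real host is whatever follows it.
lax = urlsplit("http://decoy.example@real-host.example/")
print(lax.username)    # decoy.example
print(lax.hostname)    # real-host.example
```

A lenient parser that reads the URL differently (as some browsers evidently do) can end up at a completely different host, which is exactly the divergence described above.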


I'm not sure SC is handling this any different than it ever did. I think SC has always gone back to the submitted URL in the parse, it shows the unescaped version only as background information.

Note the unescaped URL contains "#" and "[at]". Those are not legal characters in domain names or file naming. Those are what stop the parse.

This is another example of something borked in IE where it resolves a broken URL and takes you to a webpage. Firefox doesn't resolve the URL. I tried three other online (& Samspade) decoders. None of them could resolve any of the URLs.

This is one where a 'fix' could be risky because you're screwing around with TLD interpretations. It could seriously break something else.

I understand your priorities lie in parsing the headers and you have enough trouble already. It just seems like a hole if you aren't parsing URLs the same way the major browsers do. It would appear that some spammers are encoding URLs this way because they're getting paid and not getting caught.

This URL encoding decodes successfully in Chrome, Internet Explorer, and Opera. I didn't get a chance to check Safari. My copy of Firefox did not decode it properly. I know it's a sticky situation. Maybe with a little statistical analysis you can figure out a good way to handle it.


Archived

This topic is now archived and is closed to further replies.
