bjornstar Posted December 6, 2010

I've got a spate of spam coming in where SpamCop is failing to parse the tracking links:

http://www.spamcop.net/sc?id=z4736244379z0...48e636bd808d9cz
http://www.spamcop.net/sc?id=z4736236365z9...de787834c68e9az
http://www.spamcop.net/sc?id=z4736232813z8...1019cb5e308186z
http://www.spamcop.net/sc?id=z4736230134z2...cbd16d3a59d592z
http://www.spamcop.net/sc?id=z4736223636za...1479d74418509fz

All of these have tracking links that use percent obfuscation, sprinkled with "#", "[at]" and "." characters. In one example:

http://%36%37%75%35%38%65%73%79%64%6d%62.%6e%62%77%36%39.%63%6f%6d#%40.%76%61%66%6d%68%66%67%75%77%61%6c%75.%63%6f%6d

successfully decodes to:

http://67u58esydmb.nbw69.com#[at].vafmhfguwalu.com

but then SpamCop reports: "Cannot resolve http://%36%37%75%35%38%65%73%79%64%6d%62.%6e%62%77%36%39.%63%6f%6d#%40.%76%61%66%6d%68%66%67%75%77%61%6c%75.%63%6f%6d"

It looks like your attempts to resolve the link explode when you run into a "#" or a "[at]", or maybe both. I hope this bug report reaches friendly ears and the issue gets resolved.
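To make the decoding concrete, here is a minimal sketch using only Python's standard urllib (an illustration, not SpamCop's code; the literal "@" is what the forum renders as "[at]"). It shows that the percent escapes decode exactly as described, and that a strict RFC-style split treats everything after the "#" as a fragment rather than as part of the host, which is roughly where the parser seems to give up:

```python
from urllib.parse import unquote, urlsplit

# The obfuscated link from the example above (split across two lines for width).
obfuscated = ("http://%36%37%75%35%38%65%73%79%64%6d%62.%6e%62%77%36%39.%63%6f%6d"
              "#%40.%76%61%66%6d%68%66%67%75%77%61%6c%75.%63%6f%6d")

decoded = unquote(obfuscated)
print(decoded)           # http://67u58esydmb.nbw69.com#@.vafmhfguwalu.com

parts = urlsplit(decoded)
print(parts.hostname)    # 67u58esydmb.nbw69.com  (everything after '#' is a fragment)
print(parts.fragment)    # @.vafmhfguwalu.com
```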
Farelf Posted December 6, 2010

(quoting bjornstar) "... It looks like your attempts to resolve the link explode when you run into a '#' or a '[at]', or maybe both. ..."

Hi bjornstar,

Unfortunately link parsing does not get much priority with SC; the focus is on the e-mail origin. However, as you say, the parser just about gets there and then implodes (or explodes). Perhaps it would not be a great stretch to complete the job.

To spell out what you are saying, from one of your example parses:

Resolving link obfuscation
http://%71%31%79%72%70%66%6d%62%69.%65%69%79%62%73.%63%6f%6d#%40.%79%77%63%68%74%78%79%75.%63%6f%6d
Percent unescape: http://%71%31%79%72%70%66%6d%62%69.%65%69%79%62%73.%63%6f%6d#%40.%79%77%63%68%74%78%79%75.%63%6f%6d
http://%66%74%6c%6a%32%74%6d%73%6a%72%6d.%75%75%6f%73%65.%63%6f%6d#%40.%7a%79%6b%64%6b%6b%67%6a.%63%6f%6d
Percent unescape: http://%66%74%6c%6a%32%74%6d%73%6a%72%6d.%75%75%6f%73%65.%63%6f%6d#%40.%7a%79%6b%64%6b%6b%67%6a.%63%6f%6d

Tracking link: http://%71%31%79%72%70%66%6d%62%69.%65%69%79%62%73.%63%6f%6d#%40.%79%77%63%68%74%78%79%75.%63%6f%6d
No recent reports, no history available
Unescaped: http://q1yrpfmbi.eiybs.com#[at].ywchtxyu.com
Cannot resolve http://%71%31%79%72%70%66%6d%62%69.%65%69%79%62%73.%63%6f%6d#%40.%79%77%63%68%74%78%79%75.%63%6f%6d

Tracking link: http://%66%74%6c%6a%32%74%6d%73%6a%72%6d.%75%75%6f%73%65.%63%6f%6d#%40.%7a%79%6b%64%6b%6b%67%6a.%63%6f%6d
No recent reports, no history available
Unescaped: http://ftlj2tmsjrm.uuose.com#[at].zykdkkgj.com
Cannot resolve http://%66%74%6c%6a%32%74%6d%73%6a%72%6d.%75%75%6f%73%65.%63%6f%6d#%40.%7a%79%6b%64%6b%6b%67%6a.%63%6f%6d

The "Unescaped:" lines are the (apparently) successful obfuscation handling, but then the parser loses it. If it passed the un-obfuscated links on for resolution it would handle them fine. You can paste the links into the webpage submission form and see something like:

Parsing input: http://ftlj2tmsjrm.uuose.com#[at].zykdkkgj.com
Routing details for 58.253.28.32
Using smaller IP block (/20 vs. /13)
Removing 2 larger (>/20) route(s) from cache [refresh/show]
Cached whois for 58.253.28.32 : abuse-gd[at]china-netcom.com
Using abuse net on abuse-gd[at]china-netcom.com
abuse net china-netcom.com = abuse[at]chinaunicom.cn, abuse[at]anti-spam.cn, spam[at]ccert.edu.cn, abuse[at]cnc-noc.net
Using best contacts abuse[at]chinaunicom.cn abuse[at]anti-spam.cn spam[at]ccert.edu.cn abuse[at]cnc-noc.net

Statistics:
58.253.28.32 not listed in bl.spamcop.net
58.253.28.32 not listed in dnsbl.njabl.org (127.0.0.8)
58.253.28.32 not listed in dnsbl.njabl.org (127.0.0.9)
58.253.28.32 not listed in cbl.abuseat.org
58.253.28.32 not listed in dnsbl.sorbs.net

Reporting addresses:
abuse[at]chinaunicom.cn
abuse[at]anti-spam.cn
spam[at]ccert.edu.cn
abuse[at]cnc-noc.net
...

Which is the resolution you were looking for (the other link from that example resolves to the same host). Alerting those networks and authorities may or may not achieve anything (they certainly do nothing in a hurry), but there doesn't seem to be much left to fix in the parser before it could send reports. Perhaps some passing SC staff member will look at these pleadings with a "pitiful eye" and add "parser improperly discards unescape de-obfuscation results containing valid non-alphabetic characters (viz. '[at]', '#') - for investigation and fix" to the list of bug fixes/development requests.
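To spell out the missing step in code as well, here is a hedged sketch in Python (again an illustration, not the actual parser; the hostnames are taken from the example parse above and may well no longer resolve): once the link has been unescaped, it is the host portion before the "#" that should be looked up, not the still-escaped string.

```python
import socket
from urllib.parse import unquote, urlsplit

# Second tracking link from the example parse above.
tracking = ("http://%66%74%6c%6a%32%74%6d%73%6a%72%6d.%75%75%6f%73%65.%63%6f%6d"
            "#%40.%7a%79%6b%64%6b%6b%67%6a.%63%6f%6d")

# Unescape first, then take the host that precedes the '#' fragment.
host = urlsplit(unquote(tracking)).hostname   # ftlj2tmsjrm.uuose.com

try:
    print(host, "->", socket.gethostbyname(host))
except socket.gaierror:
    print(host, "does not resolve (the domain may be long gone by now)")
```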
turetzsr Posted December 6, 2010

<snip> You can paste the links into the webpage submission form and see something like <snip>

...And, if you do that, be sure to cancel the reports, not send them!
SpamCop 98 Posted December 6, 2010

(quoting) "You can paste the links into the webpage submission form" / "...And, if you do that, be sure to cancel the reports, not send them!"

Posting links won't generate a parser link or ask if you want to send a report. You need to paste a msg source that begins with "Return-Path:", "Delivered-To:" or "Received:" to make that happen.
turetzsr Posted December 6, 2010

(quoting SpamCop 98) "Posting links won't generate a parser link or ask if you want to send a report. <snip>"

... <blush> Oops, I was not reading carefully (a common failing of mine, which those of you who have followed others of my posts should know well!). Thanks for pointing out the fallacy of my post.
SpamCopAdmin Posted December 7, 2010

(quoting SpamCop 98) "Posting links won't generate a parser link or ask if you want to send a report. You need to paste a msg source that begins with 'Return-Path:', 'Delivered-To:' or 'Received:' to make that happen."

I'm not sure what we're talking about here, so maybe a general caveat is appropriate.

We can use SpamCop for research by pasting any URL, IP address, domain, or email address into our web spam processing form, and SpamCop will tell you all it knows about it. We MAY NOT put in information that will lead SpamCop to prepare a report and then send that report. That is a HUGE taboo with us. It is also improper to add information to a spam to make SpamCop "find" and report something that is not part of the original, raw, spam.

- Don D'Minion - SpamCop Admin -
- service[at]admin.spamcop.net -

A little bird told me...

I'm not sure SC is handling this any differently than it ever did. I think SC has always gone back to the submitted URL in the parse; it shows the unescaped version only as background information.

Note that the unescaped URL contains "#" and "[at]". Those are not legal characters in domain names or file naming. Those are what stop the parse.

This is another example of something borked in IE, where it resolves a broken URL and takes you to a webpage. Firefox doesn't resolve the URL. I tried three other online decoders (and SamSpade). None of them could resolve any of the URLs.

This is one where a "fix" could be risky because you're screwing around with TLD interpretations. It could seriously break something else.

- Don -
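Don's point about illegal characters can be shown with a rough label-by-label check, offered only as an illustration of the rule (a loose reading of the hostname grammar in RFC 1035 / RFC 3986, not SpamCop's actual validation logic): "#" and "@" can never appear in a hostname, so the decoded string is rejected the moment it is treated as a single domain name.

```python
import re

# One DNS label: 1-63 characters, letters/digits/hyphens, no leading or trailing hyphen.
LABEL = re.compile(r"^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")

def looks_like_hostname(name: str) -> bool:
    return all(LABEL.match(label) for label in name.split("."))

print(looks_like_hostname("67u58esydmb.nbw69.com"))                     # True
print(looks_like_hostname("67u58esydmb.nbw69.com#@.vafmhfguwalu.com"))  # False: '#' and '@'
```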
Farelf Posted December 7, 2010

Thanks Don, that helps, knowing those characters are "illegal" (well, "[at]" is reserved and "#" is excluded, I now see - RFC 2396, Uniform Resource Identifiers (URI): Generic Syntax). Yet the parser does "resolve" the unescaped URL if it is fed into the submission form, as shown:

http://www.spamcop.net/sc?track=http%3A%2F...40.zykdkkgj.com

(and using the //members.spamcop.net version of the trace gives the full detail for 58.253.28.32, which I posted earlier and which is the ftlj2tmsjrm.uuose.com part of the address). BUT from what you say this resolution may not be the actual spamvertized site, which is probably why it is not attempted in reports.

Well, I think I can confirm that - I find Firefox (3.6.12 & 3.6.13) will "resolve" that address too - the address quoted in my earlier post is deliberately broken to save it being clicked through; use the one from the above trace. That ends up at a totally different web page (different resolution - 210.59.230.60), highlighting exactly the problems you speak of. Oh, and IE8 goes to the same place, which is pchome.com.tw and probably is the actual spamvertized site (they certainly don't want reports), but the parser doesn't even come close to finding them. I don't know why Firefox/Mozilla is chasing after Microsoft in supporting broken URIs/URLs, but it all works out very well for spammers.

(For the O/P - sending the parser along that path has long been accepted as too risky and too flaky - and too unending - by [most of] those of the SC user community who are aware of the issue - which is very fortunate because, as Don has said, SC has no intention of treading it.)
bjornstar Posted December 13, 2010

(quoting Don) "I'm not sure SC is handling this any differently than it ever did. ... This is one where a 'fix' could be risky because you're screwing around with TLD interpretations. It could seriously break something else."

I understand your priorities lie in parsing the headers, and you have enough trouble already. It just seems like a hole if you guys aren't parsing URLs the same way that the major browsers do. It would appear that some spammers are doing URLs this way because they're getting paid and not getting caught.

This URL encoding decodes successfully in Chrome, Internet Explorer, and Opera. I didn't get a chance to check Safari. My copy of Firefox did not decode it properly.

I know it's a sticky situation. Maybe with a little statistical analysis you can figure out a good way to handle it.
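One conservative way a parser might "handle it", offered purely as a sketch and not as anything SpamCop does or has committed to: after unescaping, harvest every substring that looks like a registrable hostname and treat each one as a candidate for lookup, instead of trusting any single browser's interpretation of the malformed URL.

```python
import re
from urllib.parse import unquote

# Anything that looks like "label.label...tld" after unescaping.
HOST = re.compile(r"(?:[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?\.)+[A-Za-z]{2,}")

def candidate_hosts(url: str) -> list:
    return HOST.findall(unquote(url))

print(candidate_hosts(
    "http://%36%37%75%35%38%65%73%79%64%6d%62.%6e%62%77%36%39.%63%6f%6d"
    "#%40.%76%61%66%6d%68%66%67%75%77%61%6c%75.%63%6f%6d"))
# ['67u58esydmb.nbw69.com', 'vafmhfguwalu.com']
```

That sidesteps the question of which browser's misinterpretation is the "right" one, at the cost of occasionally flagging a decoy domain the spammer never intended to be reachable.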
Archived: This topic is now archived and is closed to further replies.