A related topic has been discussed under reporting help (see Website redirectors, How to trace?), but that thread seemed to deal specifically with an obfuscated URL embedded in the Google redirect. Recently I've been getting a lot of spam that uses this Google redirect. See the following example spam parse:
http://www.spamcop.net/sc?id=z867573303z0d...88634396882315z
Now, obviously Google doesn't want to hear about this, and (arguably) they shouldn't need to for an automated redirect system (see the discussion here: tricking with google, hiding spamvertised site).
Personally, I hardly see any reason for these redirects to exist in the first place (well... laziness...), but since they do, it seems to me that for certain redirects it would take little effort for the parser to jump to the target when identifying the spamvertised site. The parser already takes the time to deobfuscate the target link. How hard would it be to check whether the initial string is "http://www.google.com/url?q=" and, if so, to examine everything after the "="? Whether or not the parser still reports to Google, it could additionally send a non-useless (or at least less useless) report to the actual site host.
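To make the idea concrete, here is a rough sketch in Python of the kind of check I mean. This is not how the SpamCop parser actually works (I have no idea what its internals look like); the function name redirect_target and the GOOGLE_HOSTS set are made up for illustration, and it only handles the plain "/url?q=" form from the example above.

```python
from urllib.parse import urlsplit, parse_qs

# Hosts treated as Google redirectors. This set is an assumption for
# illustration; a real parser would need its own maintained list.
GOOGLE_HOSTS = {"www.google.com", "google.com"}


def redirect_target(url):
    """Return the target of a Google "/url?q=" redirect, or None.

    Hypothetical helper: None means "not a recognised redirect",
    so the caller would fall back to handling the URL as usual.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return None
    if (parts.hostname or "").lower() not in GOOGLE_HOSTS:
        return None
    if parts.path != "/url":
        return None

    # parse_qs percent-decodes the value, so an encoded target
    # (http%3A%2F%2F...) comes back as a normal URL.
    targets = parse_qs(parts.query).get("q")
    if not targets:
        return None

    target = targets[0]
    # Only accept absolute http(s) targets; anything else is ignored.
    if urlsplit(target).scheme not in ("http", "https"):
        return None
    return target


if __name__ == "__main__":
    print(redirect_target("http://www.google.com/url?q=http://example.com/page"))
```

One caveat with this sketch: it splits the query string on "&", so a target that carries its own unencoded "&" parameters would be cut short. Taking literally "everything after the =" would avoid that, at the cost of possibly swallowing extra Google parameters.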
I realize this could get back to the whole argument about the usefulness of identifying/reporting spamvertised sites. Let's not go THERE again. The parser currently makes a (nontrivial) effort at ID'ing the site; this change seems like it would improve that functionality at minimal effort, redeem the currently wasted parse time, and have little (?) chance of additional errors/misreports as a result.
Thoughts?
