samrudge
Posted January 23, 2013

I run a URL shortener. We're doing pretty well (~1m clicks a month at the moment), but I'm pulling my hair out trying to deal with spammers.

Right now, we only allow a link to be shortened if:

- It is not listed in DBL, SURBL or Google Badware
- The IP the request came from is not in the SBL, XBL or SpamCop
- The target domain has a Web Of Trust 'Trustworthiness' rating of >40 (only if confidence is >=12)
- The target URL has a MIME type of text/* or image/*
- The IP of the server hosting the target domain is not listed in hpHosts
- The target domain is not in our internal list of manually blocked domains

On top of that, recently created links are regularly moderated by a human.

Before a link is shortened we send a single GET request and detect any 30x redirects or JavaScript redirects (we use Ghost.py for this). We run the spam detection on the URL redirected to, and the short link redirects straight to that final target, so the redirection can't be changed after the link is saved. We re-run all these checks every 6 hours and return `410 Gone` for any links that become spam after they're created.

We do all that, and it's still not enough. Today we've had quite a large influx of spam links being created. We've had 417 links created, each by a different IP address, with the targets split over 112 domains. All of them seem to be compromised sites which have had a single HTML file uploaded. I won't link to it, but it's branded as a Facebook page with the text "Save the file and run! It is lol", then it tries to download an .exe file (obviously something not nice).

I've now blocked all 417 links and removed them, so they're all safe. The way I ended up doing it: after our spam checker has done its MIME type and redirect checks, it MD5-sums the content, and I add that sum to a manual blocklist.
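The content-hash blocklist described above could look roughly like the following. This is only a minimal sketch, not the actual code behind the service: the names `blocked_digests`, `content_digest`, `block_content` and `is_blocked` are made up for illustration, and the in-memory set stands in for whatever storage the real blocklist uses.

```python
import hashlib

# Hypothetical in-memory blocklist of MD5 digests; a real service
# would keep this in a database or cache (assumption).
blocked_digests = set()


def content_digest(body: bytes) -> str:
    """Return the hex MD5 digest of the raw response body."""
    return hashlib.md5(body).hexdigest()


def block_content(body: bytes) -> None:
    """Add the digest of a known-bad page to the blocklist."""
    blocked_digests.add(content_digest(body))


def is_blocked(body: bytes) -> bool:
    """True if this exact body has already been blocklisted."""
    return content_digest(body) in blocked_digests
```

A page is matched only when its bytes are identical to a blocklisted page, which is why the check is cheap to run on every rescan.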
I haven't yet figured out what's linking to these, but they've had about 20-30 hits per domain so far. (The first link was created at 15:37 UTC; I'd put this system in place and re-scanned all links created since then by 16:28 UTC, so hopefully not many people have been affected.)

So here's my question: how do I stop things like this? It would only take adding <!-- <?php echo time(); ?> --> to the page to get past the system I'm using to block the links now, and even now none of the domains or source IPs are in any of the lists we check against. Say I'd been out today, it could have been 24 hours before I caught these links and managed to stop them. I've tested about 20 of the links and none of them are blocked by bit.ly or tinyurl, so the spammer just has to switch to one of those and they can keep on spamming for a bit.

I hate being a spam "enabler", but I'm pretty much out of ideas to stop people using our service for bad stuff.

P.S. I'm talking to our ISP at the moment about the best way to contact the website owners and inform them of the spam page. All the sites I checked seemed to be perfectly normal sites with this one bad HTML file in the root, so I'd guess the site owners don't know they've been compromised.
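To make the weakness mentioned above concrete: a timestamp inside an HTML comment changes the MD5 of the page on every request, so an exact-hash blocklist never matches twice. The sketch below shows that, plus one possible mitigation (hashing after stripping comments and collapsing whitespace). The mitigation is an assumption for illustration, not something the service does, and the function names and sample page are hypothetical. It only defeats this one trick; any change to the visible markup still produces a new digest.

```python
import hashlib
import re

# Matches HTML comments, including ones that span several lines.
_COMMENT_RE = re.compile(rb"<!--.*?-->", re.DOTALL)
# Matches any run of whitespace.
_WS_RE = re.compile(rb"\s+")


def exact_digest(body: bytes) -> str:
    """MD5 of the body exactly as served."""
    return hashlib.md5(body).hexdigest()


def normalized_digest(body: bytes) -> str:
    """MD5 of the body with HTML comments removed and whitespace collapsed."""
    stripped = _COMMENT_RE.sub(b"", body)
    collapsed = _WS_RE.sub(b" ", stripped).strip()
    return hashlib.md5(collapsed).hexdigest()


# Hypothetical sample page and two served variants that differ only
# in the timestamp comment, as <!-- <?php echo time(); ?> --> would produce.
page = b"<html><body>Save the file and run!</body></html>"
variant_a = page + b"<!-- 1358955420 -->"
variant_b = page + b"<!-- 1358955421 -->"

# The exact digests differ, while the normalized digests agree.
print(exact_digest(variant_a) == exact_digest(variant_b))
print(normalized_digest(variant_a) == normalized_digest(variant_b))
```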