
Report ALL reasonably-detectable spamvertized URLs


Jeff G.


IMHO, anything in spam that could be construed as something to click on or to paste into a browser's address bar, whether by a strictly RFC-compliant email client, by a Microsoft email client, or by a human, is a spamvertized URL or email address. Every spamvertized URL should be reported to the administrator(s) of the system(s) hosting it, with the obvious exception of IBs (Innocent Bystanders) that have been marked as such by the appropriate administrators. It is high time for the Parser to stand up for the right of SpamCop's paying customers to report ALL of the reasonably-detectable spamvertized URLs they are bombarded with each day, rather than for the right of spammers to dodge reporting by mangling their spam in non-RFC-compliant ways that Microsoft (and perhaps other) email clients make clickable anyway, and that some humans insist on pasting into their browsers. I understand that TPTB may get flak from administrators whose URLs are reported as a result, but why would such administrators complain rather than simply marking their URLs as IBs?

Yes, you may have read this before in http://forum.spamcop.net/forums/index.php?...indpost&p=30977.
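To make "reasonably-detectable" concrete, here is a minimal sketch in Python of the kind of leniency I mean. It is purely illustrative and nothing like the actual Parser; the patterns, function names, and the hxxp/www examples are my own assumptions about what a lenient client or a determined human would still treat as a link.

    # A minimal sketch (not the actual Parser) of what "reasonably-detectable"
    # could mean: catch both strictly formed URLs and a few common obfuscations
    # that lenient mail clients, or a human with a clipboard, will still follow.
    # The patterns and names below are illustrative assumptions only.
    import re

    # Strict pass: scheme + host, as an RFC-compliant client would render it.
    STRICT_URL = re.compile(r"\bhttps?://[^\s<>\"']+", re.IGNORECASE)

    # Lenient pass: bare "www." hosts and "hxxp"-style mangling that some
    # clients (or users pasting into the address bar) turn back into a link.
    LENIENT_URL = re.compile(r"\b(?:hxxps?://|www\.)[^\s<>\"']+", re.IGNORECASE)

    def extract_spamvertized_urls(body):
        """Return candidate spamvertized URLs found in a spam body."""
        found = set(STRICT_URL.findall(body))
        for match in LENIENT_URL.findall(body):
            # Normalize the mangled forms so they can be resolved and reported.
            cleaned = match.replace("hxxp", "http")
            if not cleaned.lower().startswith("http"):
                cleaned = "http://" + cleaned
            found.add(cleaned)
        return found

    if __name__ == "__main__":
        sample = "Visit hxxp://example.invalid/buy now, or www.example.invalid/deal"
        print(extract_spamvertized_urls(sample))

The point is only that anything the lenient pass catches is something a victim could plausibly click or paste, and therefore something a paying customer should be able to report.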


So you think SC should write a parser that has as many security flaws and errors in it as a Microsoft parser?

What about all the links to Google.com, NYTimes.com, and eBay? What is the definition of spam? (Unsolicited email.) If the IP or URL administrators are not going to do anything about the spammer, what is a report from SC? Spam? Noise? More wasted bandwidth? A waste of SC CPU? IMHO, reports that won't be acted on are all of the above.


So you think SC should write a parser that has as many security flaws and errors in it as a Microsoft parser?


No, I think it should take security flaws in Microsoft and other email clients into account, and let its paying customers report any spamvertized URL that attempts to exploit those security flaws.
What about all the links to Google.com,


Google can publish articles criticizing spammers and their activities; for searches, it can rank those articles highest, and for redirects, it can substitute those articles for the redirected links.
NYTimes.com


The New York Times can do the same thing.
and eBay? 


So can eBay. If the spamvertized link is for an actual eBay auction, IIRC their terms allow them to cancel the auction.
What is the definition of spam? (Unsolicited email.) If the IP or URL administrators are not going to do anything about the spammer, what is a report from SC? Spam? Noise? More wasted bandwidth? A waste of SC CPU? IMHO, reports that won't be acted on are all of the above.


Administrators with role accounts or listed in WHOIS records have the responsibility to deal with such email, as their roles and listings are automatic solicitations for Reports per the relevant Internet Standards and RFCs.

Google can... the redirected links. The New York Times can do the same thing. So can eBay. If the spamvertized link is for an actual eBay auction, IIRC their terms allow them to cancel the auction.


So let me get this straight: I can effectively remove an article from Google or the Times by referencing it in 1000 - 2000 emails? Or do you have some magic way of telling whether my reference to an article about xyz comes from a spam or from the footnote of a doctoral thesis? As for eBay, you just gave me a way to cause havoc - almost as good as a denial-of-service attack.

Administrators with role accounts or listed in WHOIS records have the responsibility to deal with such email, as their roles and listings are automatic solicitations for Reports per the relevant Internet Standards and RFCs.


You are correct. But of course, if we lived in a world where everyone did what they should, we wouldn't have spam, would we?


Wow, quite the devil's advocate, aren't we? :)

I think we all agree that a parser that can accurately pick out spammy URLs would be of great value. The "accurately pick out" part is obviously the crux of the matter. Any ISP getting SC reports (for sources or URLs) gets a copy of the full spam text (right?). That would allow them to judge whether someone is trying to play the system. The fact that no SpamCop-endorsed blocking occurs based on URLs means that nothing bad can happen without the direct action of the ISP/responsible party (i.e., no automatic false listings, etc.). Also, because every report here is tied to a real person, there is some accountability with the reports.

And just because other people may not do their job in correcting problems doesn't mean we shouldn't do our job of reporting the problems.

edit: as per Jeff G's suggestion, as that was the intent


It is a balancing act, isn't it? We want to cast the net wider to catch more spammers and spamvertizers, but on the other hand we need to make sure that most of the reports are valid in order to retain "our" credibility.

As you said, "the 'accurately pick out' part is obviously the crux of the matter." I'm just not sure that the time and effort spent to build a parser that mimics the vagaries of all the loosely defined mail programs is time well spent. Is it worth running down the spamvertized site, or should the effort be spent on the header, finding the source of the spam?

The spamvertized site and its ISP probably don't give a toot about SC. On the other hand, the source of the spam, most likely a zombie or other unwitting source, and its ISP have a higher probability of taking action.

IMHO, time is better spent improving the productivity of the header parsing than combing the body of the spam for the addresses of those who most likely don't care.


I'm just not sure that the time and effort spent to build a parser that mimics the vagaries of all the loosely defined mail programs is time well spent.


I'll flop to your side now. I agree that the weight should be on reporting the source. It doesn't bother me much when SC doesn't pick up a URL. If there were a choice to be made, improve the header parsing or the URL parsing, I obviously would pick the header. Makes me wonder at this point, how much coding is being done on the header? How many header issues are there to resolve right now? If there is time available for 'other than header work', where would you like to see it go? URL parsing is an acceptable choice, IMHO.

If it is a time/resource issue, maybe the URL-parsing aspect of the code should be open-sourced. It already does a fairly decent job of catching a variety of RFC-compliant URLs. Perhaps TPTB could request code / algorithm snippets for other URL issues. All they'd have to do is provide the appropriate hooks; they wouldn't even necessarily have to give up any existing code to public scrutiny. Since it's a secondary aspect of SpamCop, it shouldn't be quite so crucial that spammers could try to work around the code. There would be more people trying to plug any holes at the same time.
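For what it's worth, the "hooks" could be as simple as a small registry that contributed snippets plug into, so the main parser can run them over the spam body without exposing its own code. A rough sketch under my own assumptions (none of these names come from SpamCop's code):

    # A rough sketch of the "provide the appropriate hooks" idea: the closed
    # parser keeps its own extraction logic private and simply runs contributed
    # extractor functions over the spam body, merging whatever extra URLs they
    # find. All names here are hypothetical.
    import re

    # Contributed snippets register themselves here; each takes the raw spam
    # body and returns the candidate URLs it was written to recognize.
    _extractor_hooks = []

    def register_extractor(func):
        """Decorator that adds a contributed URL extractor to the hook list."""
        _extractor_hooks.append(func)
        return func

    def run_hooks(body):
        """Collect URLs from every registered extractor; the closed parser
        would merge these with its own, unpublished results."""
        urls = set()
        for hook in _extractor_hooks:
            urls.update(hook(body))
        return urls

    # Example contributed snippet handling one specific obfuscation trick.
    @register_extractor
    def dotted_out_hosts(body):
        # Handle "example dot com"-style obfuscation (illustrative only).
        pattern = re.compile(r"\b([a-z0-9-]+) dot (com|net|org)\b", re.IGNORECASE)
        return ["http://%s.%s" % (m.group(1), m.group(2)) for m in pattern.finditer(body)]

A single run_hooks(spam_body) call inside the existing, unpublished parse would be all that has to be exposed.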


<snip>  Makes me wonder at this point, how much coding is being done on the header?  How many header issues are there to resolve right now?


My guess is quite a bit. Every time I see an "xx.xx.xx.xx ... not found" or some other dead end in the effort to send a report, I realize they're not done. Not necessarily in parsing the header, but in digging through all the databases available to get a good address to send the report to. The time I spent trying to send manual reports before I started using SC convinced me it is not a linear process; it requires art and a few aha! moments every once in a while. That is tough to code and needs tweaks all the time.
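Just to show how shallow the easy part is, here is a toy sketch of the first step in that digging (my own guess, assuming a system "whois" binary is installed; it is nothing like the real reporting engine), and it already breaks down the moment WHOIS returns nothing useful or points straight back at the spammer:

    # A toy first step in "digging through the databases": ask WHOIS about an
    # IP and scan the reply for an abuse-looking contact. Everything after this
    # (stale records, role accounts that bounce, ranges delegated back to the
    # spammer himself) is where the art and the aha! moments come in.
    import re
    import subprocess

    def naive_abuse_contacts(ip):
        """Run the system 'whois' command and pull out candidate report addresses."""
        output = subprocess.run(
            ["whois", ip], capture_output=True, text=True, timeout=30
        ).stdout
        emails = re.findall(r"[\w.+-]+@[\w.-]+", output)
        # Prefer abuse@ mailboxes; fall back to whatever contacts WHOIS lists.
        abuse = [e for e in emails if e.lower().startswith("abuse@")]
        return abuse or emails

    if __name__ == "__main__":
        print(naive_abuse_contacts("192.0.2.1"))  # TEST-NET address, illustration only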

Open source: because the parsing of reported spam is used as part of a product, that decision would need to be based in large part on advice from the legal and "corporate" offices, as opposed to all the good reasons for open source. I wouldn't hold my breath. Would love to get some insight, but...


Archived

This topic is now archived and is closed to further replies.
