Everything posted by jeffc

  1. Hopefully this idea has already come up, but if so I could not find it with a quick scan of the existing topics. Even if it has been mentioned before, I'd like to state my strong support for creating a mail blocking technology that blocks based on URLs contained in the messages. Note that this is not the same as "URL blocking", which traditionally means preventing Web browser access to certain sites, and it's also not the same as most current realtime blocklist (RBL) approaches, which block access from certain mail servers, usually based on their IP address. Blocking mail based on the URLs it contains would require a mail agent that can see, parse, and deobfuscate the content of the message body, which is something many mailers such as sendmail are not designed to do today, but which others such as Postfix appear to support. Like many other RBLs, SpamCop's RBL blocks messages from certain servers once someone has reported a spam coming from them. This is useful in that it prevents much spam from the same mail server from reaching beyond the first few people, but spammers have already evolved strategies around this by using distributed trojan horse viruses, in essence stealing Internet services from many unsuspecting computers throughout the world in order to send spam in a broadly distributed way, which is difficult to stop since it's decentralized. That's in addition to simply exploiting existing open relay mail servers for as long as they remain open. (Certainly hundreds of thousands of spams can typically be sent through open relays before they are closed.) However, what most of the spams have in common is that they attempt to drive web traffic to spam sites, for example selling drugs or software. From reporting spams that get through the many RBLs our mail servers already use, it seems to me that many or even most of those spam sites are hosted at ISPs in China.
The spams come from all over the world, but web hosting providers in China seem especially likely as destinations of the URLs in spams. What I and presumably others propose is to build a blacklist of those sites and block messages that reference those URLs. At the same time a whitelist of the many common legitimate sites would need to be created to prevent spammers from getting legitimate sites blacklisted. A probably very successful first pass would be to blacklist the sites or IP blocks in China (or at other spam-friendly ISPs) and whitelist the rest. Further refinement could be made from there, but this would probably stop 90% of the spam that currently makes it through existing RBLs. I believe this may be a useful and productive solution to spam and would like to encourage its development. I understand there is discussion in the SpamAssassin community about working on things like this. SpamCop builds a great database of spam-referenced URLs now. That database could be used in a URL blacklist. Is anyone in the SpamCop community working on this idea?
  2. jeffc

    Forwarded spam duplicating

    Seems fixed as of this writing.
  3. jeffc

    Forwarded spam duplicating

    FWIW I'm seeing the same behavior myself. I report a spam successfully, then it says I have unreported spam. If I click the Report Now link, it says it was already reported. This is with reporting spams serially, i.e. one at a time, so there are no other old ones, just the single new one being reported and queued twice for some reason. Good to know other people are seeing the same thing, at least, and that the developers know about it already....
  4. You certainly are working in a good space, grabbing spam URLs from SpamCop and integrating them with SpamAssassin scoring. That's a general direction I was interested in also, though I agree with you that your first pass could use more integration. I don't think wildcard A records would be too much of a problem, since they presumably would only be used for spam domains. In the case of wildly varying host names in spams made possible by wildcards in the DNS, we could just use the non-wildcarded part of the domain name. In other words, ignore the wildcarded part and just block the whole spam domain (up to but not including the wildcard). That could be automatically detected by a relatively simple URL comparison, matching from the TLD down towards the host name. A tree-shaped data structure could make a reasonable representation. Thinking about it, perhaps the most logical approach would be to ask the SpamCop folks if they'd be willing to give access to their existing spam URI database for this purpose, in other words to do something useful with it. I don't think spam hosting reports to China are doing a whole lot of good at this point.
  5. Yes, SpamAssassin uses Bayesian rules. I like the idea of tarpits, but as others have rightly pointed out, their use has not slowed down spammers very much.... The amount of resources they can steal is too vast, it seems. Which is why blocking based on contained URLs would probably be the most effective way of ending spam, even if it has some (solvable) technical hurdles.
  6. I wouldn't pick on engineers; I r one of them too. I was more suggesting engineers could "make it so". I'm told Postfix can see and act on message bodies. procmail, a perl or shell script, or SpamAssassin could also do this. I'm told the SA developer community has already been discussing it. Regarding existing RBLs of China and Korea spam mail servers, those no longer seem effective since many people are blocking mail from China and Korea already. That seems to be why the spammers have gone to using relays or zombied personal computers not in those countries. Remember we're talking about blocking URLs in spams, and not the spam servers. Spammers have already effectively figured out how to get around spam mail server RBLs IMO. If they want to advertise a spam URL, they can be defeated by blocking mails containing the URL or related domain(s). The way to prevent spammers from adding legitimate sites to their spams in order to get mails referencing those sites blocked is to have a whitelist of legitimate domains. Any domain or URL in the whitelist does not count against blocking, nor would those domains or URLs get added to the blacklist. You're probably right about legitimate messages mentioning spam sites, but that would be a second order effect.
  7. I agree that looking at the content of messages would require more processing power than MTAs currently use when they just look at headers. On the other hand, if spam that mentioned specific URLs were blocked, then spam would become far less effective and there would be less of it eventually. In other words the processing power question solves itself if the approach works in the first place. I suppose one could argue it wouldn't work in the first place because mail would slow to a crawl. Kind of a chicken and egg problem. However I still believe being able to deny delivery of messages with spam URLs could largely solve most spam as we know it today. It's not so much about saving bandwidth or saving CPU cycles as it is about stopping spam. If the spammers can't reach most people then spam will cease to be a useful, if unethical, marketing tool. If the tool is less useful, fewer unethical people will use it and spam would decrease. Regarding the parsing of URLs, SpamCop among others seems to have fine algorithms for it. That said, only the third or second level domain name may be enough to extract. The full URL may not be necessary since most of the spammers seem to use custom domains. Blocking the domains is quick and easy and does not even require resolution of the URL. Just block the custom spam domain. No legitimate domain owner would permit spam sites under their main domain so that's probably not an issue. For efficiency, pre-whitelist all the big legitimate domains with well-enforced AUPs. Regarding databases and other engineering issues, that's what engineers are for.
  8. jeffc

    no links found

    Hmm, not sure why it didn't find the URL in your reported spam. FWIW another reason SpamCop doesn't "find" links is that their DNS has been removed. Here's one I reported where the web site thankfully no longer resolves. (I.e. the domain registrar took out the DNS for the spamvertised site, i.e. they deliberately broke the spamvertised site. Good for them!!!) Woohoo! Jeff C.
  9. OK well hopefully some other antispam effort will work on it if SpamCop won't. Based on the latest spams, it may be a good way to stop them. I mainly wanted to try to drum up some support for the idea. Good to hear other people already think it's a good idea. Name resolution (i.e. blocking by the resolved IP address of the URL) may not even be necessary; just block based on the URL or second level domain. It would be quicker and not require any DNS lookup resources. On the other hand, an initial lookup of the IP address could be useful in some circumstances such as establishing a whitelist/blacklist score for a particular URL. JeffG, if your proposal is still available in one of these forums, can you provide a URL? I'd like to see what you were thinking.
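
For anyone who wants to experiment, here is a rough sketch of the domain-matching idea from these posts: pull host names out of the message body, compare them label by label from the TLD down towards the host name using a tree, let whitelisted domains pass, and block on a blacklisted domain without any DNS lookup. Everything named here (`DomainTree`, `should_block`, the `.example` domains) is made up for illustration and is not SpamCop's, SpamAssassin's, or Postfix's actual mechanism. It also assumes plain `http://` or `https://` URLs and does none of the deobfuscation a real filter would need.

```python
import re

# Assumption: only plain http(s) URLs are handled; no deobfuscation.
HOST_RE = re.compile(r"https?://([A-Za-z0-9.-]+)", re.IGNORECASE)

# Sentinel marking "a listed domain ends at this node". It is an object,
# so it can never collide with a real DNS label (labels are strings).
END = object()


def labels_from_tld(host):
    """Split 'a.b.example.com' into ['com', 'example', 'b', 'a']."""
    parts = host.lower().strip(".").split(".")
    return [p for p in reversed(parts) if p]


class DomainTree:
    """Tree of domain labels, keyed from the TLD down towards the host."""

    def __init__(self):
        self.root = {}

    def add(self, domain):
        node = self.root
        for label in labels_from_tld(domain):
            node = node.setdefault(label, {})
        node[END] = True

    def covers(self, host):
        """True if host equals a listed domain or is any name beneath it.

        Wildcarded host names (x1.spam.example, x2.spam.example, ...)
        all match a single listed entry 'spam.example'.
        """
        node = self.root
        for label in labels_from_tld(host):
            node = node.get(label)
            if node is None:
                return False
            if END in node:
                return True
        return False


def should_block(body, whitelist, blacklist):
    """Block if any non-whitelisted host in the body is blacklisted."""
    for host in HOST_RE.findall(body):
        if whitelist.covers(host):
            continue  # whitelisted domains never count against a message
        if blacklist.covers(host):
            return True
    return False


if __name__ == "__main__":
    # Hypothetical lists; a real deployment would load reported domains.
    blacklist = DomainTree()
    blacklist.add("cheap-pills.example")
    whitelist = DomainTree()
    whitelist.add("example.org")

    message = "Buy now at http://x9f3.cheap-pills.example/order"
    print(should_block(message, whitelist, blacklist))
```

Checking the whitelist first is what keeps a spammer from getting a legitimate site blocked just by mentioning it, and because matching is on the name alone, the check costs a few dictionary lookups per URL.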