I'll try to give a little snapshot here of how Project Honey Pot works, as far as I know it. I am by no means an expert and am not officially associated with the project. They have a really good FAQ which might make for some nice reading some cold winter night.
The important thing to remember is that we often overestimate the intelligence of the spammer and, more importantly, his harvesting tools. There is no doubt that a crafty programmer could perform delicate surgery on all the addresses that a spammer collects. But the truth is, they don't care and don't spend that much time on it. Most are little more than script kiddies running 5-10 year old software. If 1 out of 10 of the addresses they get is valid, they are happy.
Another important thing to remember: as the spammer sends E-mail to the address supplied by the honeypot page, he has no reason to believe that anything is amiss. His spam is happily accepted by the recipient. Here's the deal.
1. I install a dynamic page on my webserver (either Perl or PHP) that is, like all dynamic pages, generated by my server every time it is called by a GET request from a remote client. The link to this page (say, from my front page) is hidden so that the average user can't see it -- it's invisible to everything except robots and spiders. The name of this hidden page is different for every server in the project. I can name it anything I want.
2. The script produces the HTML for your browser or the spammer's spambot. Tucked away in the generated page is an E-mail address, complete with the mailto: link. However, this is not visible to someone with a browser. You could only see the E-mail address if you did a "view/source" of the page or if you were a spambot, which only looks at the raw HTML.
3. Here's the important part. That E-mail address is actually created by the Project Honey Pot servers a split second before my page is sent to the requesting host. So the project knows: a) when the page was requested. b) who requested it (IP address). c) via which server (mine) it was handed out. It logs these along with the E-mail address that was generated.
4. Just as important: The E-mail address generated is completely valid, with a valid domain that is (secretly) owned by the project. Any mail sent to this address will be delivered and (secretly) dissected by the project's servers.
5. In the future, if spam is sent to the address that was handed out, we not only know who the spammer is, but exactly when, where and who collected that address for the spammer. If nobody ever sends E-mail to the address, no harm, no foul.
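To make the steps above a bit more concrete, here is a rough Python sketch of the idea. I don't know what the project's real scripts or protocol look like, so everything here is an assumption for illustration: the ISSUED dictionary, the issue_address and render_honeypot_page functions, the honeypot-domain.example domain and the 203.0.113.7 address are all made up. The real script calls out to the project's servers over the network; here that call is faked with a local function.

```python
import html
import secrets
from datetime import datetime, timezone

# Hypothetical stand-in for the project's central log (the real storage is not public).
ISSUED = {}


def issue_address(requester_ip, host_site, domain="honeypot-domain.example"):
    """Simulate the project minting a unique address and logging who asked for it."""
    address = f"{secrets.token_hex(4)}@{domain}"
    ISSUED[address] = {
        "requested_at": datetime.now(timezone.utc),  # a) when the page was requested
        "requester_ip": requester_ip,                # b) who requested it
        "host_site": host_site,                      # c) which server handed it out
    }
    return address


def render_honeypot_page(address):
    """Build HTML whose mailto link sits in the source but is not shown by a browser."""
    safe = html.escape(address)
    return (
        "<html><body>\n"
        "<p>Nothing to see here.</p>\n"
        f'<a href="mailto:{safe}" style="display:none">{safe}</a>\n'
        "</body></html>"
    )


# One page request: a fresh address is minted, logged, and embedded in the page.
address = issue_address("203.0.113.7", "mysite.com")
page = render_honeypot_page(address)
```

The point of the sketch is only that the address is created per request and logged before the page goes out. Hiding the link with display:none is just one way to keep it out of sight; I am not claiming it is what the project's script does.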
A specific example:
0. I get a personalized script from the project and install it on my server at http://mysite.com/pizza.php
1. Sally Spammer turns her spambot loose on the world from her home DSL (4.3.10.10).
2. The spambot hits my home page and finds the hidden link to http://mysite.com/pizza.php
3. The spambot goes to /pizza.php and issues a GET for that page.
4. My server starts to generate the HTML code for the requested page. (spambot waiting)
5. Part of the internal page generation includes a call to project honey pot and requests an e-mail address. (spambot still waiting)
6. Project Honey Pot logs my server, the requesting IP (4.3.10.10), the date/time and makes up an arbitrary but valid E-mail address: john[at]jankyho.com
7. My server receives john[at]jankyho.com back from the project and sticks it into the HTML code for my page. (spambot waiting)
8. I serve the page up to the spambot at 4.3.10.10. I don't keep track of anything, I'm done.
9. Sally's spambot slurps up the page and greedily finds the html code for the email address of john[at]jankyho.com and stuffs it away.
10. Sally sells her list of E-mail addresses to a spammer in New York.
11. Three months from now the New York Spammer wants to sell Viagra. He digs into his list of suckers and comes up with 1000 addresses, including john[at]jankyho.com.
12. New York turns his spam machine on (220.20.20.1) and sends out spam. When his server goes to send something to the jankyho.com domain, it does a DNS lookup for the MX record of jankyho.com. It is given the mail server address for one of the Project Honey Pot mail servers (of course this is not obvious to him or his spam-server).
13. The Project's mail servers receive a spam sent to john[at]jankyho.com coming from 220.20.20.1. They take it in, say thanks, and let the spammer go on his way.
14. The cycle of scum is complete. We now know that Sally Spammer is the root of all evil. We can positively identify her by her IP address (4.3.10.10) and know exactly where and how she got the E-mail address. The john[at]jankyho.com address was never handed out to anyone but her (and never will be handed out again).
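The bookkeeping in the example above boils down to a lookup table keyed by the E-mail address. Here is a small Python sketch of that idea using the addresses from the example. The Ledger and Issue classes, their method names and the two dates are my own inventions for illustration; I have no idea what the project's real records look like.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Issue:
    address: str
    harvester_ip: str
    host_page: str
    issued_at: datetime


class Ledger:
    """Hypothetical stand-in for the project's records (the real schema is not public)."""

    def __init__(self):
        self._issued = {}

    def issue(self, address, harvester_ip, host_page, when):
        # Each address is handed out exactly once, so it maps to one harvester.
        if address in self._issued:
            raise ValueError(f"{address} was already handed out")
        self._issued[address] = Issue(address, harvester_ip, host_page, when)

    def spam_received(self, recipient, sender_ip, when):
        # Look up who originally collected the address the spam was sent to.
        issue = self._issued.get(recipient)
        if issue is None:
            return None  # not one of the trap addresses
        return {
            "harvester_ip": issue.harvester_ip,
            "host_page": issue.host_page,
            "sender_ip": sender_ip,
            "delay": when - issue.issued_at,
        }


ledger = Ledger()
# Step 6: the address is minted for Sally's spambot (the date is arbitrary).
ledger.issue("john@jankyho.com", "4.3.10.10",
             "http://mysite.com/pizza.php", datetime(2005, 1, 10))
# Step 13: three months later, spam arrives from the New York spammer.
report = ledger.spam_received("john@jankyho.com", "220.20.20.1",
                              datetime(2005, 4, 10))
print(report["harvester_ip"], report["sender_ip"], report["delay"].days)
```

Because an address is never issued twice, a single piece of spam is enough to tie the sender (220.20.20.1) back to the harvester (4.3.10.10) and the page where the address was picked up.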
Some things to note.
Sally and New York never know this is going on.
We make no attempt to actually stop the spam as it is being sent.
The only thing required to be secret is the list of domains owned by the Project (the ones that actually receive the spam.)
An average surfer would never even see these pages let alone get to the hidden E-mail address.
Even if a casual user or search engine DID get the E-mail address, nothing would ever happen unless they actually sent E-mail to that address.
There are several ways a spammer could defeat this approach. None of them is likely to show up in the near future.