jimgiles Posted October 13, 2009 Hi All, I'm a reporter with New Scientist magazine. I'm putting together a piece on the history of spam and am searching for data on historic spam levels. The most comprehensive dataset that I'm aware of is maintained by MAAWG, but it only goes back to 2005 (http://www.maawg.org/about/EMR). Does anyone know of older datasets? I've seen a few that are based on individual email accounts, but I need something a little more rigorous than that. Any tips would be much appreciated! Thanks, Jim
rconner Posted October 13, 2009 I occasionally search for the same thing myself, but haven't had much luck. You might find something at the Spam Links website (http://spamlinks.net/). You might also Google "spam corpus", which refers to the collections of spam used to "train" Bayesian filters etc. -- rick
Farelf Posted October 14, 2009 The question arises from time to time, but unfortunately no-one really knows what they're looking at, in terms of collection methodologies and uncertainties about ISP filtering of various kinds. I assume you saw the topic http://forum.spamcop.net/forums/index.php?showtopic=10332 with its various links? Despite its title and initial direction, it goes on to talk about general spam stats. There are other discussions/topics in these forums too. I'm not sure they will help, but they may be worth the effort to find if you're not getting much joy elsewhere.
michaelanglo Posted October 17, 2009 jimgiles wrote: "I'm a reporter with New Scientist magazine. I'm putting together a piece on the history of spam and am searching for data on historic spam levels. The most comprehensive dataset that I'm aware of is maintained by MAAWG, but it only goes back to 2005 (http://www.maawg.org/about/EMR)." One bit of lateral thinking that (a) may help and (b) may interest your readers anyway would be to look for historic statistics on total email volumes. I assume that if the current assertion that 80% of email is spam is indeed true, the spam share was lower in the past, so the total volumes give us an upper bound on the spam level. HTH
StevenUnderwood Posted October 18, 2009 jimgiles wrote: "I'm a reporter with New Scientist magazine. I'm putting together a piece on the history of spam and am searching for data on historic spam levels. The most comprehensive dataset that I'm aware of is maintained by MAAWG, but it only goes back to 2005 (http://www.maawg.org/about/EMR). Does anyone know of older datasets? I've seen a few that are based on individual email accounts, but I need something a little more rigorous than that. Any tips would be much appreciated!" You may want to contact a few of the spam blocking services. I have used Postini at my last 2 positions, and while they do not have historical information available publicly, they may have the numbers if you ask.
dra007 Posted October 18, 2009 Even services like McAfee and Norton antivirus keep track of some spam data, since some botnets and phishers are directly linked to the spam flow. Good luck in your research, and do come back with what you find. I used to be an avid reader of your magazine in my younger years.
jimgiles Posted October 26, 2009 Author Thanks all for the suggestions. I'll check out the links. The spam filtering companies are telling me that they only have data going back to 2004/05, so I'm still on the lookout for older stats. I'll post a link to the story when it appears. Cheers Jim
kmolloy Posted October 26, 2009 jimgiles wrote: "Thanks all for the suggestions. I'll check out the links. The spam filtering companies are telling me that they only have data going back to 04/05, so I'm still on the look out for older stats. I'll post a link to the story when it appears." Would news.admin.net-abuse.sightings be of any help? That goes back to the late '90s.
This topic is now archived and is closed to further replies.