Journalist searching for historic spam data


jimgiles

Posted

Hi All,

I'm a reporter with New Scientist magazine. I'm putting together a piece on the history of spam and am searching for data on historic spam levels. The most comprehensive dataset that I'm aware of is maintained by MAAWG, but it only goes back to 2005 (http://www.maawg.org/about/EMR). Does anyone know of older datasets? I've seen a few that are based on individual email accounts, but I need something a little more rigorous than that.

Any tips would be much appreciated!

Thanks,

Jim

Posted

I occasionally search for the same thing myself, but haven't had much luck. You might find something at the Spam Links website (http://spamlinks.net/). You might also Google "spam corpus", which refers to the collections of spam used to "train" Bayesian filters etc.

-- rick

Posted

The question arises from time to time but unfortunately no-one really knows what they're looking at, in terms of collection methodologies and uncertainties about ISP filtering of various kinds. I assume you saw the topic http://forum.spamcop.net/forums/index.php?showtopic=10332 with its various links? Despite its title and initial direction it goes on to talk about general spam stats. And there are other discussions/topics in these forums - I'm not sure they will help, but they may be worth the effort to find if you're not getting much joy elsewhere.

Posted
I'm a reporter with New Scientist magazine. I'm putting together a piece on the history of spam and am searching for data on historic spam levels. The most comprehensive dataset that I'm aware of is maintained by MAAWG, but it only goes back to 2005 (http://www.maawg.org/about/EMR).

One bit of lateral thinking that

(a) may help

(b) interest your readers anyway

would be to look for historic statistics on total email volumes

I assume that if the current assertion that 80% of email is spam is indeed true, the share was lower in the past,

so total email volume gives us an upper bound on the spam level

HTH

Posted
I'm a reporter with New Scientist magazine. I'm putting together a piece on the history of spam and am searching for data on historic spam levels. The most comprehensive dataset that I'm aware of is maintained by MAAWG, but it only goes back to 2005 (http://www.maawg.org/about/EMR). Does anyone know of older datasets? I've seen a few that are based on individual email accounts, but I need something a little more rigorous than that.

Any tips would be much appreciated!

You may want to contact a few of the spam blocking services. I have used Postini at my last 2 positions, and while they do not have historical information available publicly, they may have the numbers if you ask.

Posted

Even services like McAfee and Norton antivirus keep track of some spam data since some botnets and phishers are directly linked to the spam flow...

Good luck in your research, and do come back with what you find. I used to be an avid reader of your magazine in my younger years.

Posted

Thanks all for the suggestions. I'll check out the links. The spam filtering companies are telling me that they only have data going back to 04/05, so I'm still on the lookout for older stats. I'll post a link to the story when it appears.

Cheers

Jim

Posted
Thanks all for the suggestions. I'll check out the links. The spam filtering companies are telling me that they only have data going back to 04/05, so I'm still on the lookout for older stats. I'll post a link to the story when it appears.

Would news.admin.net-abuse.sightings be of any help? That goes back to the late '90s.

Archived

This topic is now archived and is closed to further replies.
