> Journalist searching for historic spam data
jimgiles
post Oct 13 2009, 05:28 AM
Post #1


Newbie
*

Group: Members
Posts: 2
Joined: 13-October 09
Member No.: 9635



Hi All,

I'm a reporter with New Scientist magazine. I'm putting together a piece on the history of spam and am searching for data on historic spam levels. The most comprehensive dataset that I'm aware of is maintained by MAAWG, but it only goes back to 2005 (http://www.maawg.org/about/EMR). Does anyone know of older datasets? I've seen a few that are based on individual email accounts, but I need something a little more rigorous than that.

Any tips would be much appreciated!

Thanks,

Jim
rconner
post Oct 13 2009, 11:32 AM
Post #2


Advanced Member

Group: Memberp
Posts: 872
Joined: 23-January 07
From: Maryland, USA
Member No.: 7388



I occasionally search for the same thing myself, but haven't had much luck. You might find something at the Spam Links website (http://spamlinks.net/). You might also Google "spam corpus", which refers to the collections of spam used to "train" Bayesian filters and the like.

-- rick



--------------------
Richard C. Conner, P.E.
http://www.rickconner.net/spamweb/
Farelf
post Oct 14 2009, 02:39 AM
Post #3


T-shirt wearing out

Group: Membersph
Posts: 3871
Joined: 23-February 04
From: Western Australia
Member No.: 491



The question arises from time to time, but unfortunately no one really knows what they're looking at, given the differing collection methodologies and the uncertainty about ISP filtering of various kinds. I assume you saw the topic http://forum.spamcop.net/forums/index.php?showtopic=10332 with its various links? Despite its title and initial direction, it goes on to talk about general spam stats. There are other discussions in these forums too. I'm not sure they will help, but they may be worth the effort to find if you're not getting much joy elsewhere.


--------------------
Plus ça change, plus c’est la même chose
michaelanglo
post Oct 17 2009, 05:36 PM
Post #4


Advanced Member
***

Group: Membera
Posts: 157
Joined: 29-January 04
From: michaelanglo in Surrey, England
Member No.: 117



QUOTE(jimgiles @ Oct 13 2009, 11:28 AM) *
I'm a reporter with New Scientist magazine. I'm putting together a piece on the history of spam and am searching for data on historic spam levels. The most comprehensive dataset that I'm aware of is maintained by MAAWG, but it only goes back to 2005 (http://www.maawg.org/about/EMR).

One bit of lateral thinking that

(a) may help

(b) may interest your readers anyway

would be to look for historic statistics on total email volumes.

If the current assertion that 80% of email is spam is indeed true, and I assume the proportion was lower in the past, then 80% of the total email volume in any earlier year gives an upper bound on the spam level for that year.

HTH
StevenUnderwood
post Oct 17 2009, 09:21 PM
Post #5


What Life?

Group: Membersph
Posts: 5141
Joined: 20-January 04
From: Whitinsville, MA USA
Member No.: 12



QUOTE(jimgiles @ Oct 13 2009, 06:28 AM) *
I'm a reporter with New Scientist magazine. I'm putting together a piece on the history of spam and am searching for data on historic spam levels. The most comprehensive dataset that I'm aware of is maintained by MAAWG, but it only goes back to 2005 (http://www.maawg.org/about/EMR). Does anyone know of older datasets? I've seen a few that are based on individual email accounts, but I need something a little more rigorous than that.

Any tips would be much appreciated!

You may want to contact a few of the spam blocking services. I have used Postini at my last two positions, and while they do not make historical information publicly available, they may have the numbers if you ask.


--------------------
Steven P. Underwood, DNRC
Whitinsville, MA
underwood+forum[at]spamcop.net

-No trees were killed in the sending of this message. However, a large number of electrons were terribly inconvenienced.-
dra007
post Oct 18 2009, 12:09 PM
Post #6


Been There

Group: Memberp
Posts: 1413
Joined: 18-March 04
Member No.: 777



Even services like McAfee and Norton antivirus keep track of some spam data, since some botnets and phishers are directly linked to the spam flow.

Good luck in your research, and do come back with what you find. I used to be an avid reader of your magazine in my younger years.

This post has been edited by dra007: Oct 18 2009, 12:11 PM
jimgiles
post Oct 26 2009, 10:55 AM
Post #7


Newbie
*

Group: Members
Posts: 2
Joined: 13-October 09
Member No.: 9635



Thanks all for the suggestions. I'll check out the links. The spam filtering companies are telling me that they only have data going back to 2004/05, so I'm still on the lookout for older stats. I'll post a link to the story when it appears.

Cheers

Jim
kmolloy
post Oct 26 2009, 11:16 AM
Post #8


Newbie

Group: SpamCop Staff
Posts: 6
Joined: 13-February 07
Member No.: 7476



QUOTE(jimgiles @ Oct 26 2009, 08:55 AM) *
Thanks all for the suggestions. I'll check out the links. The spam filtering companies are telling me that they only have data going back to 04/05, so I'm still on the look out for older stats. I'll post a link to the story when it appears.

Would news.admin.net-abuse.sightings be of any help? That goes back to the late '90s.