Please do NOT post any requests for help in this forum. Please post all questions in the appropriate Help Forum. This forum is reserved for the development of the SpamCop FAQ (here) and is open to all who wish to contribute to building a better FAQ. Suggestions for improvements are welcome as well as pointing out areas that are unclear or you are unable to understand as we can use those comments to improve the current FAQ (here).
| Jeff G. |
Jul 20 2005, 12:31 AM
Post
#1
|
|
T-shirt wearing out Group: Membersph Posts: 3730 Joined: 2-July 04 From: Northeast New Jersey (New York Metro Area), USA ... Please read my sig. :) Member No.: 2041 |
SenderBase's "Magnitude" appears to be on a logarithmic scale using Base Ten, such that Estimated Daily Email Volume equals 1.34 x 10^Magnitude, as follows:
Magnitude 0 = 1.34 Estimated Daily Email Volume
Magnitude 1 = 13.4 Estimated Daily Email Volume
Magnitude 2 = 134 Estimated Daily Email Volume
Magnitude 3 = 1.34 Thousand Estimated Daily Email Volume
Magnitude 4 = 13.4 Thousand Estimated Daily Email Volume
Magnitude 5 = 134 Thousand Estimated Daily Email Volume
Magnitude 6 = 1.34 Million Estimated Daily Email Volume
Magnitude 7 = 13.4 Million Estimated Daily Email Volume
Magnitude 8 = 134 Million Estimated Daily Email Volume
Magnitude 9 = 1.34 Billion (Thousand Million in UK English) Estimated Daily Email Volume

The interval between displayed Magnitudes (an increase in Magnitude of 0.1) corresponds to a factor of the tenth root of ten, or approximately 1.2589254117941672104239541063958 (as calculated by a Pentium).

Edit 6-5-09 - note: email volume has increased greatly over the last few years, with today's volume estimated closer to 200 Billion - please read the following posts
This post has been edited by dbiel: Jun 5 2009, 01:12 PM
-------------------- Best Regards, Jeff G. (full signature)
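For anyone who wants to play with the numbers, the relationship in this post can be sketched in a few lines of Python. The constant 1.34 is only the value inferred here (SenderBase does not publish it), and the function name is made up for illustration.

```python
# Sketch of the relationship described in this post:
#   estimated daily volume = K * 10 ** magnitude
# K = 1.34 is the value inferred in the post, not a published SenderBase constant.
K = 1.34


def estimated_daily_volume(magnitude: float, k: float = K) -> float:
    """Return the estimated daily email volume for a given magnitude."""
    return k * 10 ** magnitude


# Volume ratio between two displayed magnitudes 0.1 apart (tenth root of ten).
STEP_RATIO = 10 ** 0.1

if __name__ == "__main__":
    for m in range(10):
        print(f"Magnitude {m} = {estimated_daily_volume(m):,.2f}")
    print(f"Factor for +0.1 magnitude: {STEP_RATIO:.6f}")
```

Passing a different `k` lets the same function be reused with the later estimates in this thread.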
|
| Wazoo |
Aug 5 2005, 05:38 PM
Post
#2
|
|
What Life? Group: Forum Admin Posts: 12536 Joined: 22-January 04 From: Iowa Member No.: 18 |
Similar to the Richter scale used to measure earthquakes, SenderBase's magnitude is a measure of message volume calculated using a log scale with a base of 10. The maximum theoretical value of the scale is set to 10, which equates to 100% of the world's email message volume (approximately 10 billion messages/day). Using our log scale, a one point increase in magnitude equates to a 10x increase in actual volume. For example, a domain with a magnitude of 5 would have estimated volume of 100,000/day while a sender with a magnitude of 6 would have an estimated daily volume of 1,000,000/day. The following table illustrates the percentage of Internet email associated with each magnitude:
CODE
10.0  100%
9.0   10%
8.0   1%
7.0   0.1%
6.0   0.01%
5.0   0.001%
4.0   0.0001%
3.0   0.00001%
2.0   0.0000001%
1.0   0.00000001%
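The percentages in the table follow from the stated rule that magnitude 10 is 100% and each whole point is a factor of ten. A minimal sketch, assuming exactly that rule (the function name is hypothetical):

```python
def percent_of_world_email(magnitude: float, max_magnitude: float = 10.0) -> float:
    """Share of total email volume (in percent) implied by a magnitude,
    assuming max_magnitude corresponds to 100% and each point is 10x."""
    return 100.0 * 10 ** (magnitude - max_magnitude)


if __name__ == "__main__":
    for m in range(10, 0, -1):
        print(f"{m:4.1f}  {percent_of_world_email(m):.7f}%")
```

By this rule magnitudes 2.0 and 1.0 work out to 0.000001% and 0.0000001%, so the last two rows of the quoted table may carry one zero too many.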
|
| Farelf |
Aug 8 2005, 12:50 PM
Post
#3
|
|
T-shirt wearing out Group: Membersph Posts: 3871 Joined: 23-February 04 From: Western Australia Member No.: 491 |
The SenderBase volume statistics are potentially a useful tool to remind doubting domain admins of the objective evidence of unusual activity on their servers. The notation of the exponents of email volume (as "magnitude") is an elegant and consistent way to express differences over a huge range which has a somewhat variable base over time (the total "emails sent" estimate), but most people will find it a little difficult to get a "feel" for what the magnitude numbers and their differences mean. The percentage change figures help but the actual change(s) in email message numbers represented will be even more readily appreciated.
To get that, the total volume of email (magnitude 10) needs to be known - k*10^10=total. The general SenderBase stats on http://www.senderbase.org/ provide sufficient information with both magnitude numbers and the matching "Estimated Daily Volume (m)"illions for the top 20 domains and the top 20 IP addresses.

The simplest estimate is found from the top domain - if (for example) magnitude 7.9 relates to 118.9 million messages, then k*10^7.9=118,900,000 or k=118,900,000/10^7.9 which is k=118,900,000/79,432,823 or k=1.496862 ...

A better estimate is found by calculating and averaging for all 40 results displayed - which in an actual case (early August 8) came to k=1.507289 ...

Another estimate using all 40 pairs of data can be found from the slope coefficient of the linear regression (correlation) between these variables. Using Excel's LINEST spreadsheet function with the intercept forced to zero gave a result of k= 1.535364 ... for this (same) data. This, though more complicated, is the estimate with the least associated error of estimate.

For instance, the derivation of a 1890% difference between an IP address with a current daily average magnitude of 4.9 versus its average magnitude of 3.6 is a change from k*10^3.6 to k*10^4.9 messages per day. The following are for each of the estimates above:
1. Top Domain - increase from 5,959 to 118,900 (msg/d)
2. Average 40 - increase from 6,001 to 119,728
3. Regression - increase from 6,112 to 121,958

It is seen that in this instance the first and simplest estimate is close enough, the elaboration of "better" estimates of k adds little to the picture. (The simplest estimate is actually well within the probable error range of the "best" estimate.) The precision of the magnitude figures (1 decimal place) is insufficient to exactly replicate the percentage difference given in SenderBase - in other words, rounding errors are appreciable (though minor).
Note the estimated total daily volume in this example is variously 15.0 (U.S.) billion, 15.1 billion and 15.4 billion - k*10^10. This number is the basis of the day's magnitude calculations and is, as said, somewhat variable during the day and between days. -------------------- Plus ça change, plus c’est la même chose
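The three ways of estimating k described in this post can be sketched roughly as follows. This is an illustration only: the function names are made up, the zero-intercept regression is written out by hand as sum(x*y)/sum(x*x) with x = 10^magnitude and y = volume (which is what a least-squares fit forced through the origin gives), and the only data used is the single pair quoted in the post, since the full 40-pair snapshot is not reproduced in the thread.

```python
from typing import List, Tuple

# (magnitude, estimated daily volume in messages)
Pair = Tuple[float, float]


def k_from_pair(magnitude: float, volume: float) -> float:
    """Method 1: solve k * 10**magnitude = volume for a single pair."""
    return volume / 10 ** magnitude


def k_from_average(pairs: List[Pair]) -> float:
    """Method 2: average the single-pair estimates over all pairs."""
    ks = [k_from_pair(m, v) for m, v in pairs]
    return sum(ks) / len(ks)


def k_from_regression(pairs: List[Pair]) -> float:
    """Method 3: least-squares slope of volume against 10**magnitude,
    with the intercept forced to zero."""
    xs = [10 ** m for m, _ in pairs]
    ys = [v for _, v in pairs]
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def volume(magnitude: float, k: float) -> float:
    """Convert a magnitude back to messages per day for a given k."""
    return k * 10 ** magnitude


if __name__ == "__main__":
    # The one pair quoted in the post: magnitude 7.9 <-> 118.9 million messages.
    k = k_from_pair(7.9, 118_900_000)
    print(f"k = {k:.6f}")
    print(f"magnitude 3.6 -> {volume(3.6, k):,.0f} msg/d")
    print(f"magnitude 4.9 -> {volume(4.9, k):,.0f} msg/d")
    print(f"implied total (magnitude 10) = {volume(10, k):,.0f} msg/d")
```

With a full list of (magnitude, volume) pairs from the SenderBase top-20 lists, `k_from_average` and `k_from_regression` could be called in the same way.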
|
| Jeff G. |
Aug 8 2005, 02:27 PM
Post
#4
|
|
T-shirt wearing out Group: Membersph Posts: 3730 Joined: 2-July 04 From: Northeast New Jersey (New York Metro Area), USA ... Please read my sig. :) Member No.: 2041 |
QUOTE(Farelf @ Aug 8 2005, 01:50 PM) Note the estimated total daily volume in this example is variously 15.0 (U.S.) billion, 15.1 billion and 15.4 billion - k*10^10. This number is the basis of the day's magnitude calculations and is, as said, somewhat variable during the day and between days.

However, it is rather inconsistent with "the world's email message volume (approximately 10 billion messages/day)" as stated on both http://www.senderbase.org/?page=help and http://www.senderbase.org/search?page=help_magnitude.
-------------------- Best Regards, Jeff G. (full signature)
|
| dbiel |
Aug 8 2005, 02:52 PM
Post
#5
|
|
Been There Group: Membersph Posts: 2453 Joined: 20-February 04 From: San Gabriel Valley CA USA (Los Angeles) Member No.: 447 |
QUOTE(Jeff G. @ Aug 8 2005, 12:27 PM) However, it is rather inconsistent with "the world's email message volume (approximately 10 billion messages/day)" as stated on both http://www.senderbase.org/?page=help and http://www.senderbase.org/search?page=help_magnitude.

You need to remember that the 10 billion number comes from a static FAQ which is out of date and has not been updated (probably not since it was first created), and even then it was most likely a very rough average.

A note to Farelf: thank you for an extremely well written explanation, probably a bit over many of our heads, but it definitely helps to clarify the issues and puts them into perspective, i.e. the exact numbers are unimportant; it is the trend in change of volume that is important, which is provided in a simplified format. An increase of 100,000 messages a day may be a very important indicator of a spam problem for an IP whose average is only 1,000, but totally meaningless for a high volume IP.
-------------------- This forum is a user support forum. The Moderators and Forum Admin are volunteers (not paid) and have no special direct relationship with SpamCop.net.
If you have been unable to receive the assistance you need here please see How To Contact SpamCop Staff Thank you for your participation in our peer to peer, user based forums. |
| Farelf |
Aug 8 2005, 08:27 PM
Post
#6
|
|
T-shirt wearing out Group: Membersph Posts: 3871 Joined: 23-February 04 From: Western Australia Member No.: 491 |
Thanks for comments guys. Yes it is a bit involved - maybe an abstract for the "final" FAQ, if it is worth including at all.
Certainly the total volume is dynamic (note %change figures are independent of actual total volume, which is assumed to be the same throughout the period considered in this approach). I have dealt with the daily volumes only - weekly could be calculated using a similar approach but the supposed constancy of total volume might be an issue.

Variation in total volume by the 3 approximation methods as above for a couple of later snapshots to illustrate dynamism:

17:30 GMT 08-Aug-2005
1. 14.6 (b m/d)
2. 15.0
3. 14.7

00:30 GMT 09-Aug-2005
1. 13.9
2. 14.8
3. 14.4

This is about a 4% difference in the most extreme case.
This post has been edited by Farelf: Aug 8 2005, 08:41 PM
-------------------- Plus ça change, plus c’est la même chose
|
| Farelf |
Aug 9 2005, 12:54 AM
Post
#7
|
|
T-shirt wearing out Group: Membersph Posts: 3871 Joined: 23-February 04 From: Western Australia Member No.: 491 |
QUOTE(Jeff G. @ Aug 9 2005, 03:27 AM) However, it is rather inconsistent with "the world's email message volume (approximately 10 billion messages/day)" as stated on both http://www.senderbase.org/?page=help and http://www.senderbase.org/search?page=help_magnitude.

To further clarify - the 10 billion m/d is impossible for the current magnitudes quoted together with matching message counts on the SenderBase entry page. It was, I am sure, simply a convenient number for explanatory purposes (or maybe - Lord help us because the difference would be mostly spam - it was a good approximation a few months ago).

I did say the rounding error on the magnitude figures is "appreciable though minor". I was forgetting these are exponentiated. 7.9 can actually be anything between 7.85 and 7.94999 .... Consequently the maximum error from this source, on the matching number of messages, is (very nearly) 10^0.1-1 or 25.89% (which applies to all magnitude numbers) - this is a bit more than minor! As a result, the previous figures quoted could all (just barely) be attributed to a "real" total message count of 14.55 billion.

Over 1½ days the range of values for magnitude 7.9, based on that total count and allowing for maximum rounding error on both the magnitude number and the matching message count is 11.78%, well within the scope. However, at the bottom end of the scale, magnitude 6.1 (the most consistent minimum in the period), the reported values vary by a maximum 30.3% with a median of 23.5%.

After just 1½ days, it is looking very unlikely that *any* static number is used and certainly not 10 billion. It was looking to me like the count is dynamic (like the real world) and further analysis is not persuading me to the contrary view. The actual volatility may be a little less than is indicated by the available "deconstruction" methods (because of the rounding errors) but I remain of the view that the treatment is useful.

[Update] Incidental - won't bother with a new post.
Further analysis appears to confirm the SenderBase volumes are indeed dynamic, even in the short run.

CODE
SENDERBASE - DECONSTRUCTION TO TOTAL EMAIL VOLUME
Columns 1 to 6: PAIRED DIFFERENCES (AS STANDARD ERRORS)
CASE  DATE         TIME       ESTIMATE (LR)     PROB. ERROR    1            2             3             4             5             6
1     07-Aug-2005  Late GMT   15,501,324,250  ± 88,361,842     0            -2.290258953  -10.25276948  -18.45891555  -8.775687532  -12.79216103
2     08-Aug-2005  Early GMT  15,353,642,311  ± 95,602,102     2.477919946  0             -8.272567858  -15.91817697  -7.022663427  -10.25261346
3     08-Aug-2005  20:30 GMT  14,736,680,458  ± 110,571,352    12.82977538  9.567875519   0             -5.303887968  0.300838747   0.356699937
4     09-Aug-2005  00:30 GMT  14,428,388,820  ± 86,177,135     18.00252708  14.34887779   4.13374585    0             3.960342935   5.658101628
5     09-Aug-2005  07:30 GMT  14,762,024,348  ± 124,900,490    12.40453631  9.1748412     -0.339824976  -5.73990742   0             -0.079115125
6     10-Aug-2005  03:10 GMT  14,757,423,578  ± 86,217,550     12.48173154  9.246190178   -0.278135287  -5.660755191  0.054612374   0

SenderBase data (a sample of a population) is used to estimate SenderBase total email volume (the population) by the correlation method mentioned previously. Probable errors give the range where 50% of the estimates are expected to fall. Probable error is a fraction (0.67449 ...) of the Standard error. The paired differences are like Case 2, column "1": (Case 1 estimate - Case 2 estimate)/(Case 1 Standard Error). This is a shortcut, not totally rigorous, but it should be useful/close enough since really fine discrimination is not required.

Where the difference in standard errors is less than -3 or more than +3 it is unlikely that the "real" (population) volumes represented by these estimates could be the same - the odds are about 1 in 370 at that point and rise rapidly thereafter. Accordingly, it seems the actual volumes behind the SenderBase data are changing (fluctuating) rapidly, if not continuously.

There are a number of unknowns (particularly how well the SenderBase total volume mirrors what is actually happening in the world) but, as supposed, the (very accessible) SenderBase figures most probably can be converted to give a useful indication of email numbers from specific IP addresses - and the short-term changes to that traffic. It might even be useful to try to relate total volume estimates to peaks and troughs in SpamCop reporting.
This post has been edited by Farelf: Aug 10 2005, 02:16 AM
-------------------- Plus ça change, plus c’est la même chose
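For anyone wanting to re-run the paired-difference comparison, here is a rough Python sketch using the six estimates from the table in this post. One assumption needs flagging loudly: the posted figures seem to be reproduced when the divisor is 0.67449 times the listed ± value for the column's case. That is inferred from the numbers themselves, not something stated in the post, and the function name is made up.

```python
# Fraction relating probable error to standard error, as quoted in the post.
ERROR_FACTOR = 0.67449

# (label, regression estimate of total volume, listed +/- figure) per case.
CASES = [
    ("07-Aug-2005 Late GMT", 15_501_324_250, 88_361_842),
    ("08-Aug-2005 Early GMT", 15_353_642_311, 95_602_102),
    ("08-Aug-2005 20:30 GMT", 14_736_680_458, 110_571_352),
    ("09-Aug-2005 00:30 GMT", 14_428_388_820, 86_177_135),
    ("09-Aug-2005 07:30 GMT", 14_762_024_348, 124_900_490),
    ("10-Aug-2005 03:10 GMT", 14_757_423_578, 86_217_550),
]


def paired_difference(row: int, col: int, factor: float = ERROR_FACTOR) -> float:
    """Table entry for case `row`, column `col` (both 1-based).

    ASSUMPTION: divisor = factor * listed error of the column's case,
    chosen because it reproduces the posted figures.
    """
    _, est_row, _ = CASES[row - 1]
    _, est_col, err_col = CASES[col - 1]
    return (est_col - est_row) / (factor * err_col)


if __name__ == "__main__":
    for r in range(1, len(CASES) + 1):
        cells = []
        for c in range(1, len(CASES) + 1):
            d = paired_difference(r, c)
            flag = "*" if abs(d) > 3 else " "  # beyond the +/-3 threshold
            cells.append(f"{d:9.3f}{flag}")
        print(f"Case {r}: " + " ".join(cells))
```

Entries marked with `*` are the ones beyond the ±3 threshold discussed in the post.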
|
| pgreenway |
Mar 22 2007, 08:31 PM
Post
#8
|
|
Newbie ![]() Group: Members Posts: 1 Joined: 22-March 07 Member No.: 7599 |
One problem though ... Senderbase is flawed. According to them, we have a magnitude of 3.5 which would mean we're sending between 7.37 Thousand Estimated Daily Email Volume -- yet the stats of our Exchange server from PerfMon say we've sent about 10,000 in the last 3 weeks. Our ISP's web traffic data supports this as we're not chewing up any additional bandwidth.
So, that would suggest that Senderbase is flawed. Especially, when I've written to them multiple times and they've refused to reply. That says even more about this company / product. |
| Farelf |
Mar 22 2007, 08:53 PM
Post
#9
|
|
T-shirt wearing out Group: Membersph Posts: 3871 Joined: 23-February 04 From: Western Australia Member No.: 491 |
QUOTE(pgreenway) ...So, that would suggest that Senderbase is flawed. Especially, when I've written to them multiple times and they've refused to reply. That says even more about this company / product.

Thanks for the data - your observations are of interest and, I think, similar to/supportive of the suspicions of a number of members. As to what the failure ('refusal' makes a judgement) to reply means I don't think we can say (beyond, obviously, they have no-one in the role of flackcatcher). Even Wazoo (our forum Admin) has commented on occasion that he has had no reply to inquiry when clearly it would have been in their interests to take the initiative in response and especially when raised in "these precincts".
-------------------- Plus ça change, plus c’est la même chose
|
| dbiel |
Mar 22 2007, 09:24 PM
Post
#10
|
|
Been There Group: Membersph Posts: 2453 Joined: 20-February 04 From: San Gabriel Valley CA USA (Los Angeles) Member No.: 447 |
Note: this post is still under construction.
Edit: basically gave up on the project and decided to make the post visible simply to get rid of the hidden post due to the current situation - dbiel 9-20-07

The following does not directly address pgreenway's complaint, which is a separate issue, but is being posted to show that trying to calculate any specific volume for a single magnitude is impossible. Its primary value is to indicate changes in relative volume. Please note that what follows is simply additional proof of what was previously presented by Farelf, but relying solely on SenderBase's own calculations of volume and using a simple comparison between multiple days.

QUOTE(pgreenway) One problem though ... Senderbase is flawed. According to them, we have a magnitude of 3.5 which would mean we're sending between 7.37 Thousand Estimated Daily Email Volume

There is one major flaw in your logic, which is that magnitude 3.5 does not equal any constant quantity. It is a representation of the percent of estimated current daily email traffic. What would be helpful is if they would post the daily total traffic that was being used to calculate the magnitude number.

Additionally, the rounding factor becomes quite large when magnitudes are listed only in tenths. If you are into math, take the magnitudes and total volume of several of the largest, middle and smaller mail servers and calculate what the total traffic volume is. Do it for several different days and you will be very surprised by the wide spread in the answers you get for the total traffic.

The following is a sample from today's listing, showing daily magnitude and daily volume. You will notice that the magnitude is a constant 7.8 but the volume ranges from a high of 102.3 million to a low of 86.3 million, which is over an 18.5% spread.
Note: I include the next higher and next lower magnitude to use as a limiter to allow for future comparisons.

CODE
Magnitude - Volume - date: March 22, 2006
---7.9------ 123.2 telecomitalia.it Netsiel S.p.A. network unknown
---7.8-------102.3 charter.com CHARTER COMMUNICATIONS NSP
---7.8-------100.5 proxad.net Proxad / Free SAS NSP
---7.8--------98.2 ttnet.net.tr Turk Telekom unknown
---7.8--------97.0 hinet.net CHTD, Chunghwa Telecom Co., Ltd. NSP
---7.8--------95.8 163data.com.cn CHINANET-ZJ Hangzhou node network unknown
---7.8--------87.2 telesp.net.br TELECOMUNICACOES DE SAO PAULO S.A. - TELESP ISP
---7.8--------86.3 bezeqint.net ADSL-CUSTOMER-CONNECTION NSP
---7.7--------78.6 veloxzone.com.br Telemar Norte Leste S.A. ISP

An additional range indicating more than 20.4% difference in volume but the same magnitude

CODE
Magnitude - Volume - date: March 22, 2006
Note: future entries will exclude entries that do not help to determine the point at which a specific magnitude number changes.
---7.1--------17.2 pppool.de freenet Cityline GmbH unknown
---7.0--------16.5 touchtelindia.net Infrastructer unknown
---7.0--------15.6 cox.net COX COMMUNICATIONS unknown
---7.0--------15.3 layeredtech.com Cable & Wireless unknown
---7.0--------15.0 sify.net Satyam Infoway Pvt.Ltd. unknown
---7.0--------14.7 earthlink.net Earthlink Network ISP
---7.0--------14.2 siol.net SiOL d.o.o NSP
---7.0--------14.0 dialog.net.pl Dialog Internet Services Customer DSL unknown
---7.0--------14.0 seed.net.tw Digital United Inc. NSP
---7.0--------13.8 covad.net Covad Communications NSP
---7.0--------13.7 etb.net.co ETB - Colombia unknown
---7.0--------13.7 swbell.net Pac Bell Internet Services NSP
---6.9--------13.0 ukrtel.net Ukrtelecom IP access network in Kremenchug unknown

CODE
Magnitude - Volume - date: March 25, 2006
---7.1--------16.8 alltel.net Central Telephone Company unknown
---7.0--------16.8 siteprotect.com Hostway Corporation unknown
---7.0--------13.5 swbell.net Pac Bell Internet Services NSP
---6.9--------13.0 netcabo.pt TVCABO-Portugal Cable Modem Network NSP

-------------------- This forum is a user support forum.
The Moderators and Forum Admin are volunteers (not paid) and have no special direct relationship with SpamCop.net.
If you have been unable to receive the assistance you need here please see How To Contact SpamCop Staff Thank you for your participation in our peer to peer, user based forums. |
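The spreads quoted in this post can be checked against the largest spread that one-decimal rounding alone could produce. A small sketch, assuming magnitudes are simply rounded to the nearest tenth (the function names are hypothetical):

```python
from typing import Tuple


def rounding_band(displayed: float, step: float = 0.1) -> Tuple[float, float]:
    """Range of true magnitudes that would display as `displayed`,
    assuming plain rounding to the nearest `step`."""
    half = step / 2
    return displayed - half, displayed + half


def max_rounding_spread(step: float = 0.1) -> float:
    """Largest relative volume spread possible within one displayed magnitude."""
    return 10 ** step - 1


def observed_spread(high: float, low: float) -> float:
    """Relative spread between the highest and lowest listed volumes."""
    return high / low - 1


if __name__ == "__main__":
    print(f"Band for 7.8: {rounding_band(7.8)}")
    print(f"Maximum from rounding: {max_rounding_spread():.2%}")
    # Volumes (millions) listed at magnitude 7.8 and 7.0 on March 22.
    print(f"Magnitude 7.8: {observed_spread(102.3, 86.3):.1%}")
    print(f"Magnitude 7.0: {observed_spread(16.5, 13.7):.1%}")
```

Both observed spreads stay under the theoretical ceiling of about 25.9%, so rounding alone could account for them.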
| Telarin |
Mar 23 2007, 07:49 AM
Post
#11
|
|
Advanced Member Group: Memberp Posts: 803 Joined: 30-November 05 Member No.: 4882 |
QUOTE(pgreenway) One problem though ... Senderbase is flawed. According to them, we have a magnitude of 3.5 which would mean we're sending between 7.37 Thousand Estimated Daily Email Volume -- yet the stats of our Exchange server from PerfMon say we've sent about 10,000 in the last 3 weeks. Our ISP's web traffic data supports this as we're not chewing up any additional bandwidth. So, that would suggest that Senderbase is flawed. Especially, when I've written to them multiple times and they've refused to reply. That says even more about this company / product.

Senderbase statistics are estimates (as there is no way for them to monitor your email sending habits directly) and as such are not an exact number. They should, in most cases, be accurate within one order of magnitude, which in this case they appear to be. As far as contacting them, I have never had any luck getting them to fix errors in the ownership data they display on their site either. They do not appear to respond to or act on emails.
-------------------- Will Russell, MCP
IT Specialist Galveston Insurance Associates |
| Farelf |
Jun 4 2009, 11:15 AM
Post
#12
|
|
T-shirt wearing out Group: Membersph Posts: 3871 Joined: 23-February 04 From: Western Australia Member No.: 491 |
Noting SenderBase
But the table should be useful withal. I know I have been guilty of being out by half an order of magnitude or so in conversions I have stated recently. No more. Full second decimal place magnitude values can be determined for the whole range by reference to the expanded 5.1 - 5.9 range (2.1 = 2,518, etc.)

[edit] Oh yes - being an exponential scale, technically speaking there is no volume 0 - as magnitude tends to zero, number tends to 20. Magnitude -1.3 would about equal 1 message, I suppose, but SB doesn't use negatives and 'part messages' below 1 are (anyway) nonsensical. To all practical intents, I'm sure zero magnitude is taken as/meant to signify zero messages. Just a technicality.
-------------------- Plus ça change, plus c’est la même chose
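The figures mentioned in this post (2.1 = 2,518, and roughly 1 message at magnitude -1.3) are consistent with a multiplier of 20, i.e. messages = 20 * 10^magnitude. A minimal sketch under that assumption, with made-up function names, for converting in either direction:

```python
import math

# ASSUMPTION: multiplier of 20, taken from "as magnitude tends to zero,
# number tends to 20" in this post. It is not a published SenderBase value.
K_2009 = 20.0


def messages_per_day(magnitude: float, k: float = K_2009) -> float:
    """Estimated messages per day for a magnitude."""
    return k * 10 ** magnitude


def magnitude_for(messages: float, k: float = K_2009) -> float:
    """Magnitude that would correspond to a given daily message count."""
    return math.log10(messages / k)


if __name__ == "__main__":
    for tenth in range(51, 60):  # the 5.1 - 5.9 range mentioned in the post
        m = tenth / 10
        print(f"{m:.1f} = {messages_per_day(m):,.0f}")
    print(f"2.1 = {messages_per_day(2.1):,.0f}")
    print(f"1 message/day is magnitude {magnitude_for(1):.2f}")
```

Shifting the magnitude by a whole number just moves the decimal point, which is why the 5.1 - 5.9 range is enough to cover the whole scale.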
|