Everything posted by PeterJ

  1. It appears that Eric Kolve is not the only one interested in utilizing SpamCop data on URLs. Bumped into this website today: http://spamcheck.freeapp.net/ and here is a recent brief thread from the SpamAssassin general mailing list regarding SA's beta "URIDNSBL" plug-in: http://thread.gmane.org/gmane.mail.spam.sp...n.general/45246 If using SpamCop reported URLs in an RBL manner continues to grow I wonder if SpamCop may be forced to act... Either get involved and create an official RBL or prevent programs/people from automatically harvesting the data. Maybe I am exaggerating the potential, but it would be a shame for SpamCop to start getting "blamed" for blocking email based on URLs because they are the source of the data.
  2. Yes. I did this with five attachments recently and they were received. (Julian mentioned to me at the time that this was acceptable to do)
  3. PeterJ

    9 pinned items!

    JeffG or anyone else who has control of these forums: Please take all of the 9 pinned items in SpamCop Help and move them somewhere else and have a single pinned item in SpamCop Help that links a user to them. Look at the MozillaZine Forums for ideas on this if they are needed. An example would be the FireFox General forum that has a single "sticky" item that points to FAQs. Currently the SpamCop Help entry page is the most cluttered. Better to act now to organize this stuff.
  4. My mail setup is fairly simple. I use my spamcop mail address as almost my sole account. I do have the mail account that my ISP provides me forwarding to my spamcop mail acct, but this is purely not to miss any legitimate mail from my provider. I rarely receive mail to this latter account, spam or otherwise, as it has never been used for anything. I gave Julian's beta system a try and from what I read my spamcop account should have been automatically taken into account. Therefore the only account I entered when prompted was the mail account provided to me by my ISP. After receiving instructions from the "SpamCop robot" and following them, I received the following reply/error: As one might have surmised I am using SBC Yahoo DSL for internet access. SBC Yahoo DSL users exist with quite a variety of e-mail domains, mine happens to be "ameritech.net". If it is relevant to Julian's configuration the list of all e-mail domains possible (as far as I know) for SBC Yahoo DSL users is: I have no email address with a yahoo domain, only the one with the ameritech domain which I do not use, but forward to my spamcop account anyways. The message I received back from the bot above asked me to go back and configure remaining mailhosts. If I need to configure a yahoo host let me know, but I have no corresponding yahoo email address to accompany it. If I did something wrong please advise. Thanks, PeterJ EDIT March 18 below: I was a bit tired when I posted...today I actually clicked on the link provided in the received error message and this took me to the "SpamCop - MailHost problem resolver" site and subsequently to the "SpamCop - Request mailhost waiver" page, where the final response was: So, it looks like I am on track, just need to wait. Like WB8TYW mentioned, a manual override process I gather.
  5. PeterJ


    Doh! It has been a long time since I ever submitted by email. Many thanks for pointing it out, my auth code has been reset.
  6. PeterJ


    For those who have not tried the mailhost beta system, I thought I would share a couple of variations of the new parsing I have seen (after "signing up"), if anyone is interested:
  7. PeterJ

    New SpamAssassin rules

    Well my trial of using only SpamCop's SA to filter my mail is over. I am getting too many false negatives due to low scores. I will probably turn all of the BLs back on now to help filter my account. Posted three recent spams received in my inbox to the spam NG, under SpamAssassin score of 1.7, 1.6, and 1.4 respectively. I just wanted to try this to get a feeling for how effective the SpamCop SA implementation is. I do think our blocking options are somewhat unique in that we have SA and the BLs separate. It would be cool if the checking of BLs were integrated into SA and each given an appropriate score (1, 2, etc.), but apparently that is not going to happen because JT just turned off the "RCVD_IN_BL_SPAMCOP_NET." I vote for the reversal of this if it is ever reconsidered in the future. A note to those who use a threshold of 2 or 1 with SA, I want to point out that this is not usually how SA is used. The lower we set our threshold for SA the less benefit we gain from its ability to provide a combined total score of spaminess and the more whitelisting would have to be used. Maybe "2" works for some, but "1" does not make any sense to me. I do not care what people set their scores on, I am merely suggesting that recommending to people that they set their SA threshold on 1 (and perhaps 2) is not really the solution to the problem. If users feel they have to set it as low as 1 to block spam, then something needs to be adjusted in the SpamCop SA implementation. Tweaking SA can go on forever, so let me be clear that I am not faulting JT with what he has set up here. I am pleased with the addition of the new rules, but from my testing I have found that the current SA implementation is not optimal (at least for the mail I receive.)
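To make the threshold point concrete, here is a minimal Python sketch (not SpamCop's or SpamAssassin's actual code) that counts how many of the three posted spams would be held at different thresholds. The scores 1.7, 1.6 and 1.4 come from the post; the function name `held_count`, the list of thresholds tried, and the assumption that mail is held when its total score is at or above the threshold are all illustrative.

```python
# Illustrative only: the three false-negative scores mentioned in the post.
posted_scores = [1.7, 1.6, 1.4]

def held_count(scores, threshold):
    """Count messages whose total score meets or exceeds the threshold (assumed rule)."""
    return sum(1 for s in scores if s >= threshold)

# Thresholds to compare (hypothetical choices).
for threshold in (1, 2, 3, 5):
    held = held_count(posted_scores, threshold)
    print(f"threshold {threshold}: {held} of {len(posted_scores)} held")
```

Under these assumptions only a threshold of 1 would have held all three, which is the concern: catching them that way leaves almost no room for wanted mail that trips a single minor test.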
  8. PeterJ

    New SpamAssassin rules

    Feedback on antidrug rule: This rule has made the most difference for the spam I receive. At least 5 messages I have received since you implemented antidrug would have passed through if it were not for tripping antidrug and gaining additional points. (Note that I am only using SA, no blacklists are checked for my mail account) I received my first false negative since the new SA rules were implemented, with a score of 1.2, entire spam posted in spam NG under "SpamAssassin score of 1.2" The false negative in question had misspelled versions of "viagra" and "cialis", which apparently the antidrug rule could not detect.
  9. PeterJ

    New SpamAssassin rules

    Roy - I went back to the Newsgroups and read through the threads that you initiated there. I saw some of the headers you posted that showed the SA test "RCVD_IN_BL_SPAMCOP_NET" contributing 12 points towards the total SA score. I feel this is a major problem and a single test should never be worth 12 points. The examples you posted were from earlier this month I think, maybe only a week ago. I think the solution is to assign a value of 1 or 2 to the "RCVD_IN_BL_SPAMCOP_NET" test. Having it set for 12 assumes that the spamcop bl is infallible. Even if "your" server is on the spamcop bl then your wanted mail would still come through with a score under your threshold (at least the majority of it) under this scenario. On a slightly different note, I am hoping that JT can let us know if the SA tests and scores that are implemented are indeed equal on all blades, because if they are not this makes it harder for us to pick an effective SA threshold.
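A rough sketch of the arithmetic behind this, in Python. It is not how SpamAssassin is implemented; the function `total_score`, the hypothetical rule `SOME_MINOR_RULE` with its 0.5 points, and the threshold of 5 are assumptions for illustration. Only the rule name "RCVD_IN_BL_SPAMCOP_NET" and the 12 vs. 2 point values come from the post.

```python
def total_score(hits, scores):
    """Sum the points of every rule a message tripped."""
    return sum(scores[rule] for rule in hits)

# A wanted message from a server that happens to be listed (hypothetical hits).
hits = ["RCVD_IN_BL_SPAMCOP_NET", "SOME_MINOR_RULE"]

# Two score tables: the 12 points seen in the headers vs. the suggested 2.
current = {"RCVD_IN_BL_SPAMCOP_NET": 12.0, "SOME_MINOR_RULE": 0.5}
proposed = {"RCVD_IN_BL_SPAMCOP_NET": 2.0, "SOME_MINOR_RULE": 0.5}

threshold = 5.0  # assumed held level, not taken from the post

for label, scores in (("12 points", current), ("2 points", proposed)):
    score = total_score(hits, scores)
    verdict = "held" if score >= threshold else "delivered"
    print(f"BL test worth {label}: total {score}, {verdict}")
```

With the test worth 12, the message is held no matter what else it trips; with it worth 2, the other tests decide the outcome.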
  10. PeterJ

    New SpamAssassin rules

    lawless wrote: I am not suggesting that this course of action is correct right now, but why couldn't JT simply change the score assigned to an email from the "RCVD_IN_BL_SPAMCOP_NET" test from 2.25 to 1? Curious, you did not answer what kind of SA scores you were seeing on the FP emails you received, nor what threshold you are using for SA either. Care to post some info so a clearer picture can be formed? I am curious.
  11. PeterJ

    New SpamAssassin rules

    I just received spam that was barely caught by SA with my current settings. My threshold of 3 held this barely. Entire spam in "spam NG" under "SpamAssassin score of 3.6". Headers here: At this very moment I have not reported the spam, but interestingly the spamcop bl for shows that the IP is listed. I know the SA score on this mail is nothing out of the ordinary, I am just left wondering why the mail did not trip the SA "RCVD_IN_BL_SPAMCOP_NET" test. Does anyone have evidence of the "RCVD_IN_BL_SPAMCOP_NET" test being tripped on an email received via blade6?
  12. PeterJ

    New SpamAssassin rules

    lawless and all- I guess I have mixed feelings on this. If a SpamCop mail user chooses to ONLY turn on SA then theoretically one should get the most benefit of the spamcop bl. The supposed glory of SA is that one spam characteristic does not by itself qualify the message as spam. So I figure if SA is configured optimally, then a message from a friend of mine who happens to be on the spamcop bl should make it through. The key is adjusting either the score that "RCVD_IN_BL_SPAMCOP_NET" contributes and/or adjusting the end user's "SA held level." From what I can see the "RCVD_IN_BL_SPAMCOP_NET" test contributes 2.25 by default, perhaps this is too much of a contribution for a single test? Also, since JT implemented SA I think it is fair to say that many users have probably dropped their individual "SA held levels" lower over time. Now that JT has added additional tests, users may want to consider raising their "SA held levels" due to higher overall scores. I currently have my "SA held level" set on 3 and if the "RCVD_IN_BL_SPAMCOP_NET" does contribute 2.25 then I likely have a problem. The question is: Is it more appropriate for me to raise my "SA held level" or for JT to lower the amount of points that "RCVD_IN_BL_SPAMCOP_NET" contributes? I can definitely see a problem for users who want to use SA, but do not want to use the spamcop bl period, but really it sounds like the issue is FPs due to the "RCVD_IN_BL_SPAMCOP_NET" test and this can likely be solved by making adjustments vs turning it off. lawless, can you share some of your individual setup with us? Scores of FPs and what # you are holding SA tagged mail on would be interesting.
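The trade-off in that question can be put in numbers. A small Python sketch, assuming (as the post does) that the test contributes 2.25 points and that mail is held once the total reaches the held level. The alternative values tried here (a held level of 5, a test score of 1) and the helper name `headroom` are hypothetical.

```python
def headroom(held_level, bl_points):
    """Points a listed sender's mail can still pick up from other tests before it is held."""
    return held_level - bl_points

# Compare the posted setup (held level 3, test worth 2.25) with hypothetical alternatives.
for held_level in (3, 5):
    for bl_points in (2.25, 1.0):
        spare = headroom(held_level, bl_points)
        print(f"held level {held_level}, BL test {bl_points}: {spare:.2f} points of headroom")
```

Either adjustment buys room: raising the held level or lowering the test's score both increase how much a wanted message from a listed sender can absorb from other tests.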
  13. PeterJ

    New SpamAssassin rules

    Disregard my question about the rule "Antidrug", I see it is implemented, I just received a hit for Vi[at]gra. (For those who do not know, the tests below that indicate this are "LOCAL_DRUGS_MALEDYSFUNCTION" and "LOCAL_DRUGS_MALEDYSFUNCTION_OBFU".) For completeness, here is the Antidrug ruleset So far I have received no false positives or false negatives, but have received minimal spam since you added the new rules. I am really pleased that we have the "Antidrug" rule in effect. Great job JT!
  14. PeterJ

    New SpamAssassin rules

    Excellent. I noticed chickenpox earlier today: and blackhair just now: I will try and provide feedback. Weird timing as I just decided this morning to turn off all the BLs available on my spamcop mail account and only use SpamAssassin. I am trying it set on 3 for now. Others will likely be able to provide feedback more quickly because I only receive about 20 spam per day and I know others can top that. I will post spam to the NG and reference them from this thread if I see anything coming through with low scores. Have you thought about using the rule known as "Antidrug"? I know you have concerns about doctors who may use spamcop mail and how this would affect them, but we could always find out from other admins of SA how serious a problem this really is. Can you comment on whether or not all blades are using the same SA rules or not? And, what blacklists are being used within SA to add to the cumulative scores, is it just the spamcop bl? and if so, is it implemented on all blades? Just curious. Thanks a bunch
  15. I think I see what is going on after I checked out what the SA test "RCVD_IN_BL_SPAMCOP_NET" is defined as: "Received via a relay in bl.spamcop.net" So, differing IPs could account for that.
  16. Perhaps, but I did notice that both mails were supposedly held because of the X-SpamCop-Disposition line of "Blocked bl.spamcop.net", so I guess the logical question is: Does the SA test "RCVD_IN_BL_SPAMCOP_NET" equal the SpamCop BL? I assumed it did... Otherwise maybe it is an example of SpamCop's parsing outperforming SA's with regards to source?
  17. Two spam emails posted in full in .spam newsgroup under "SpamAssassin scoring" and here are the SpamAssassin scores from identical messages (for the most part) that were received through blade4 and blade6. This time blade4 comes through with a "14.7" and blade6 assigns a "6", neither are particularly low of course (and they were both held by the SpamCop BL first anyways.) This time blade4 with the higher cumulative score. I do notice that blade6 is not tripping the "RCVD_IN_BL_SPAMCOP_NET" test, while blade4 is...
  18. I have noticed that any false negatives received recently by me have come through blade4. Not as scientific as DavidT's information posted, but just something I noticed as generally the case. I also noticed that bayesian filtering seems to have been turned off on blade4 (this is a separate issue from blade4's other scoring, as the bayes tests were hindering more than helping anyways.)
  19. I concur that bayes does not appear to be running on blade1. I am seeing similar headers. A sidenote: I got a false positive through blade6 because SA assigned it 5 points from two tests it tripped. Since blade6 is not running bayes the message did not trip any "BAYES_*" tests. It would have been interesting to see if the message would have made it through to my inbox on blade4 due to a negative score tacked on from a low bayes test that would have dropped the final score well below 5... I still say our bayes db is hosed somehow...
  20. Do your error messages look like this? I too have occasionally received these error messages when performing quick reporting from the web mail interface, although not recently. The one I listed here is from Jan 27. I emailed JT regarding this when it occurred more frequently (to me at least) and he indicated they were aware of the issue. I really cannot remember the last time it happened now. When it was occurring there was no pattern that I could discern with my reporting or the content of the messages. Also, I always received one error message per spam message reported.
  21. Actually this one has a negative score (no doubt incorrectly.) There has been some discussion of this recently, just check the forum.
  22. PeterJ

    Bayesian filtering

    Thanks for trying out SA's bayes JT. Here are some thoughts: 1) I have to seriously wonder if our bayes db is accurate. It seems to me that a spam message should never trigger bayes_00, bayes_10, etc. I noticed one SA admin in the SA mailing list state that this would indicate improper training of the db. See a brief conversation of this here. If I am interpreting my messages correctly, then it seems one very damaging piece of training on our bayes db results from the "SpamCop Quick reporting data" messages that people receive from quick reporting in the web interface. Here is a section of the headers from one of the quick reporting messages I recently received: It is negative 104! I assume that this message was processed as ham and therefore is skewing our bayes db with improper weighting of tokens taken at least from the many spam "From:" and "Subject:" lines it contains and also possibly from domain names and IPs that the message contains. JT and fellow users, can you confirm that every "SpamCop Quick reporting data" email that we receive is processed as ham automatically, or did I just receive a fluke? 2) I also propose that our training is skewed towards spammy messages and not hammy ones. From JT's explanation above it appears that the extreme cases (on both sides) are automatically trained, however that is the only time that "ham" messages are getting trained. spam messages get trained quite a bit more because every time we submit spam these messages are getting trained as "spammy." Our current system is doing automatic training on clear cut cases for ham and spam, but additionally, users can only train on spam. No legitimate email (false positive or otherwise) can be trained by users as ham under the current system. I do not think this will reflect well on our bayes db and its effectiveness. 3) Maybe this is not a big deal, but further adding to a skewing of training on spam and not training on ham involves blade6. 
Right now blade6 is not performing bayes. This means that any message passing through blade6 does not get automatically classified as spam or ham, however users are reporting messages as spam via reporting that will then classify messages as spam. The end result (if I am thinking straight) would be that NO legitimate mail that passes through blade6 gets trained as ham, while most of the spam that passes through blade6 does get trained as spam. I seriously think our bayes db is weak and we should consider starting over with a new db if "case 1" (above) is true and then the "SpamCop Quick reporting data" messages could be excluded from training. Bayesian based filtering is definitely one of many tools that work well and combined with other methods can reach fantastic filtering rates. To be clear about spammers' attempts at bayesian poisoning by use of "word salad" in spam, they are simply not as effective as JT is painting them to be. There are many people who are routinely getting 99% filtering success using bayesian based methods today. The best success in using bayesian methods results from individual bayes databases, so I am not claiming that SpamCop's current implementation of bayesian filtering is going to get to 99%. Nor (for the record), am I asking that JT/SpamCop implement individual bayes databases either. I do think that it is important to ensure that the current implementation of bayesian filtering is optimal before deciding it has no merit. I think I remember JT mentioning that SpamCop is filtering at 80% somewhere along the line. Let's give bayesian filtering an honest try and see if it can affect this percentage significantly. JT, please consider speaking with some other SA admins regarding our current setup, perhaps Chris Santerre (not sure if he runs bayes or not.) At a minimum I think other admins would have very useful input with regards to training "ham" and their experiences with large user bases. Thanks.
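To illustrate the skew argued in points 2 and 3, here is a toy Python tally. The sample messages and the auto-train cutoffs are made up, and the rule that only extreme scores are trained automatically is a reading of JT's explanation, not a confirmed description of the SpamCop setup. `tally_training`, `AUTO_HAM_BELOW` and `AUTO_SPAM_ABOVE` are hypothetical names.

```python
from collections import Counter

# Hypothetical cutoffs: only clear-cut scores are trained automatically.
AUTO_HAM_BELOW = 0.0
AUTO_SPAM_ABOVE = 10.0

def tally_training(messages):
    """Count training events per class.

    Each message is (score, is_spam, reported_by_user, blade_runs_bayes).
    """
    counts = Counter()
    for score, is_spam, reported, runs_bayes in messages:
        if is_spam and reported:
            counts["spam"] += 1   # a user report trains the message as spam
        elif runs_bayes and score < AUTO_HAM_BELOW:
            counts["ham"] += 1    # automatic training, clear-cut ham
        elif runs_bayes and score > AUTO_SPAM_ABOVE:
            counts["spam"] += 1   # automatic training, clear-cut spam
        # There is deliberately no branch where a user trains a message as ham.
    return counts

# Illustrative sample, not real data.
messages = [
    (-1.0, False, False, True),   # clear ham on a blade running bayes: auto-trained
    (2.0, False, False, True),    # ordinary ham: never trained
    (1.5, False, False, False),   # ham via a blade without bayes: never trained
    (14.7, True, True, True),     # spam, reported by the user
    (6.0, True, True, False),     # spam via a blade without bayes, still reported
    (3.6, True, True, True),      # borderline spam, reported by the user
]

print(tally_training(messages))
```

In this toy sample three hams and three spams arrive, yet only one ham is ever trained while all three spams are, which is the imbalance the post is worried about.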
  23. PeterJ

    Bayesian Filtering?

    It is one of several bayesian tests that SA can use to add to (or in this case, subtract from) the cumulative score of a particular email. See my response to the thread you started here for more info: Negative SpamAssassin talk...
  24. PeterJ

    Spamassassin negative score?

    I missed that "bayes_00" test the first time I noticed this post. The tests that SpamAssassin uses are listed here: SA Tests If you are interested, just do a quick search of the above page for "bayes_00" and then it will be apparent as to why the score was negative. Perhaps JT could elaborate on this with regards to our implementation of SA. -PeterJ P.S. Also of relevance, below quote is taken from SA's documentation on sa-learn