
kre

Members
  • Content Count: 8
  • Joined
  • Last visited

Community Reputation: 0 Neutral

About kre
  • Rank: Newbie
  1. Whoa! So we *did* understand each other, we just have entirely opposite opinions on this matter. Well, yes. But there are situations where notification does not help, or is even counterproductive. I cannot afford to lose one email from a legitimate customer, no matter how incompetent her or his ISP. I've yet to see a false positive in my held mail. Sure, there are several false negatives in the inbox, but that is unavoidable with the preferences I have. I, for one, strongly disagree here. Regular people who just want to communicate don't want to have to know and deal with the details. Making innocent people's lives difficult will just build dislike towards the blacklist and the ISP, but it will not stop the spam. Agreed, on this I'm totally with you. And on a certain, limited level this is already happening. Unfortunately, though, ISPs are not chosen for their competence in handling email; software preferences, billing models, availability, and prices are usually more important. What I don't get is how you would match viruses with blacklists.
  2. As with all things AI, the Bayesian perceptrons suffer from overlearning too. Usually, the faster an engine learns, the higher the risk of overlearning; in mundane terms, the AI goes bonkers if you feed it to overflowing. The description of CRM114 (1), for instance, says to learn only the incorrectly categorized email. At the start that will be most of the mail, with fewer corrections needed as training progresses. But sure, that's harder to do on the ISP side than by integrating it into a mail reader. Another thing that is probably not a good idea is feeding it the input from the heuristics of SA; at best, that creates an engine that perceives near-identically to the rest of the SA rules. Let's see, I'll try to sketch a way I think it can be done (a rough code sketch of the decision logic is at the end of this post list). Be aware that I cannot try it this way myself, because I don't run an ISP that does webmail. It depends on being able to feed emails moved between folders on the webmail backend into the learning pipeline.
     1. Have it set up as usual within SA, adding the BAYES_XX attributes as usual but *not* assigning any scores in either direction.
     2. Learn as spam only messages moved from the inbox to held mail.
     3. Learn as ham any message moved off the inbox to somewhere other than held mail.
     4. Apply 2. and 3. only when the bayes score would have said otherwise: learn as spam only when the bayes score is low, and as ham only when it is high.
     This will probably give a better distribution of spam vs. ham learned, because spam already caught by the other SA rules is never learned. And why should it be? The other methods work fine on those. It will also minimize the load caused by the learning process, and that load should drop over time, because if all goes well the bayes attribute will get more accurate. The amount of email that gets through to the learning process is also a good indicator of when to start assigning scores to the bayes attribute. I think it would be good to exclude the SA header lines when learning, and any lines added by the webmail software. The goal is basically to keep the input as authentic as you can make it; that will help with the results. I don't believe it's necessary to filter the spam report copies out of the ham learning, at least my personal bayes had no problems with these. Still, it might be worthwhile, because there's no real point in learning them, and learning costs both CPU and space in the hash maps.
     (1) CRM114 http://crm114.sourceforge.net/ -- look for "TOE strategy - that is, Train Only Errors" in the FAQ
  3. Since we've been talking about spam and blocking, I meant false positives from spam detection. That is, regular email that you want to read getting blocked, or someone who really never sent spam getting blocked. I don't want these to happen, and most other people I know don't either. [edit] I found something to illustrate what I mean, over in the email forums: http://forum.spamcop.net/forums/index.php?...st=0entry3067 In that context, it was called "collateral damage".
  4. Well, that shows that it's not exactly the delta; at -6 that would be 15. Hmm. I'm still assuming the server is running on PST, though there does seem to be a pattern to it. If it's EST, that would just mean I'm a lot slower than you. When the feature was introduced, I wanted to use it to verify and improve my alertness, which is what I believe it's for. So understanding how it works would help with the interpretation.
  5. Just curious: the reporting time seems to level out at the time-zone delta, assuming that CA is -8 and I'm at +1 from GMT (a quick check of that arithmetic is at the end of this post list). I've never been able to get it below 9h. But maybe I'm just slow.
  6. Jason, I am certain that doing this would devalue the SCBL so far as to make it useless. The primary value of the SCBL is that it's fast and has a clear definition of what goes in and out. Adding that charge thingie would introduce variable and unforeseeable delays to delisting. During the time between when delisting *would* occur and when the charge is settled, that particular entry in the BL is essentially incorrect. You see, it would drastically reduce the value of the BL. False positives are the worst in all things concerning spam. I am sure that we would not risk customer communication just because a BL we're using has some dispute with an intermediary ISP. So if a charge were added, we would most likely stop using SC altogether. Pretty much the only people who would like your idea are the spammers: it would reduce the importance of SC because people would stop using it, and it would turn the attention from the spammers to the ISPs and financial issues. Go ahead, I like the idea too.
  7. Right. I'm pretty certain we'd drop SCBL real soon if the list has a chance to be polluted with entries in dispute because of a charge.
  8. Unfortunately, the mailhost setup "merged" two of my mailhosts which were separate. Now the entry shows the email address from one and the mailhost name of the other, and the IP addresses and domains assigned seem to be the union of the MX hosts of both. As far as I understand the workings, this is not as intended, because the two mailhosts are not at the same distance from the inbox. And they can no longer be maintained separately, because they share the same entry. I have not run any tests against the parser yet; because I did not need the entries yet (they are relatively quiet), I left them in the config for debugging. In the figure below, (1) and (2) are inbox accounts. (5) and (4) were merged, with the mailhost name of (4) and the email address of (5). (2) needed a waiver because of one in-house hop, which I already got (thanks a lot!) and which works fine.
     --> (3) ----------> |
     --> (5) --> (4) --> | (1)
     ------------------> | (2)
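
Appendix to post 2: a minimal Python sketch of the learning decision described there. It is only a sketch under assumptions: the webmail backend would need some hook like on_message_moved() below (that hook, the folder names, and the 0.5 cut-off are my own illustration, not anything SpamAssassin or a webmail package actually provides), and the BAYES_NN test name has to still show up in the X-Spam-Status header even while it carries no meaningful score (in practice that may mean giving the BAYES rules a tiny non-zero score, since SpamAssassin skips rules scored 0). The sa-learn --spam/--ham calls are the real SpamAssassin interface.

    import os
    import re
    import subprocess
    import tempfile

    HELD_FOLDER = "Held Mail"   # assumed name of the webmail spam folder
    INBOX = "INBOX"

    def bayes_probability(raw_message):
        """Pull SpamAssassin's BAYES_NN test name out of the message headers and
        map it to a rough probability: BAYES_05 -> 0.05, BAYES_999 -> 0.999."""
        head = raw_message.partition("\n\n")[0]
        match = re.search(r"\bBAYES_(\d+)\b", head)
        if not match:
            return None
        digits = match.group(1)
        return int(digits) / (10.0 ** len(digits))

    def strip_filter_headers(raw_message):
        """Drop X-Spam-* header lines (including folded continuations) so the
        learner sees the message roughly as it arrived, i.e. keep it authentic."""
        head, sep, body = raw_message.partition("\n\n")
        kept, skipping = [], False
        for line in head.splitlines():
            if line[:1] in (" ", "\t"):      # continuation of the previous header
                if not skipping:
                    kept.append(line)
                continue
            skipping = line.lower().startswith("x-spam-")
            if not skipping:
                kept.append(line)
        return "\n".join(kept) + sep + body

    def learn(raw_message, as_spam):
        """Feed one message to SpamAssassin's Bayes database via sa-learn."""
        flag = "--spam" if as_spam else "--ham"
        with tempfile.NamedTemporaryFile("w", suffix=".eml", delete=False) as tmp:
            tmp.write(strip_filter_headers(raw_message))
            path = tmp.name
        try:
            subprocess.run(["sa-learn", flag, path], check=True)
        finally:
            os.unlink(path)

    def on_message_moved(raw_message, src_folder, dst_folder):
        """Hypothetical webmail hook: train only when the user's move contradicts
        the current Bayes verdict (train-on-error, like CRM114's TOE strategy)."""
        prob = bayes_probability(raw_message)
        if prob is None or src_folder != INBOX:
            return
        if dst_folder == HELD_FOLDER and prob < 0.5:
            learn(raw_message, as_spam=True)    # user says spam, Bayes said ham
        elif dst_folder != HELD_FOLDER and prob > 0.5:
            learn(raw_message, as_spam=False)   # user says ham, Bayes said spam

The 0.5 cut-off just stands in for "the bayes score would have said otherwise"; a real setup would pick its own thresholds, and the volume of mail reaching learn() is the indicator mentioned in the post for when to start assigning scores to the BAYES_XX rules.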
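
And for the time-zone question in posts 4 and 5, a quick check of the arithmetic under the assumptions stated there (server on PST at UTC-8, reporter at UTC+1); none of this is verified against how SpamCop actually computes the reporting time:

    # Minimum reporting lag if it really were just the time-zone delta.
    server_utc_offset = -8      # assumed: SpamCop server on PST (UTC-8)
    reporter_utc_offset = +1    # the poster's stated offset from GMT
    print(abs(reporter_utc_offset - server_utc_offset), "hours")   # prints: 9 hours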