Farelf Posted May 2, 2004

I don't know if this is new or if I am only now noticing it, but fairly often the body parser seems to need a "second go" to pick up URL links. When I check how SpamCop has handled a submission before reporting, I sometimes see the message "links not found". That is cause for some suspicion, because most of the stuff I get (maybe 80% or more) actually has links. Viewing the message through SpamCop's page ("View entire message") shows that URLs are in fact present in about half of these cases, usually unexceptional, simple text or HTML links in the appropriate form for the Content-Type declaration. When I go back to the submission, the links have magically been found and resolved. This might be happening for up to 5% of my total processing.

I know that tracking the links is variable over time for various reasons, but this is different: it is the actual parsing of the message body to find the links in the first instance. Is this a known problem, and how many "spamvertized" URLs might be slipping through the reporting in this way? Alternatively, it could just be the way SpamCop interacts with my particular (very old) installation, but the parser seems to handle the more complicated cases with ease (base64-encoded bodies, all sorts of less-than-straightforward HTML, etc.).

The latest example where this occurred (and ended up resolved) was Processed spam report
Wazoo Posted May 2, 2004

Part of what I think you're talking about gets into the refreshing of some of SpamCop's cached data. If the data is cached, results come back pretty quickly, so the parsing tool works great. However, if the cached data isn't there, then SpamCop is at the mercy of outside resources to obtain it, and what can happen is that the parsing tool times out while waiting for a response and kicks out "nothing to report" to the parser, thus your "links not found" response.

But do a refresh or parse again, and more than likely the results from the last outside-source query have made it back, so the internal cache now has an answer for the look-up ... and bang, the parsing tool has results it can work with.
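The timeout-then-cache behaviour described in this reply can be illustrated with a small Python sketch. This is not SpamCop's code, and nothing here reflects its real internals: `CachingResolver`, `slow_lookup`, `report_links`, the 2-second timeout, and the example host are all hypothetical names and values chosen for illustration. The model is also simplified: the lookup reports how long it "took" instead of really running in the background, which is enough to show why a second parse can succeed where the first one failed.

```python
import re

# Hypothetical pattern: captures the host part of plain http/https links.
URL_PATTERN = re.compile(r"https?://([^\s/<>]+)")


class CachingResolver:
    """Toy model: gives up on slow lookups but still caches the late answer."""

    def __init__(self, lookup, timeout):
        self.lookup = lookup    # assumed callable: host -> (answer, seconds_taken)
        self.timeout = timeout  # how long the parser is willing to wait
        self.cache = {}         # answers kept from earlier lookups

    def resolve(self, host):
        # Fast path: a cached answer comes back immediately.
        if host in self.cache:
            return self.cache[host]
        answer, seconds_taken = self.lookup(host)
        # The answer is cached whether or not it arrived in time ...
        self.cache[host] = answer
        # ... but the caller only sees it if it beat the timeout.
        if seconds_taken > self.timeout:
            return None
        return answer


def slow_lookup(host):
    # Stand-in for an outside query that takes 5 seconds to answer.
    return ("192.0.2.1", 5.0)


def report_links(body, resolver):
    """Return {host: answer} for every link that resolved in time."""
    resolved = {}
    for host in URL_PATTERN.findall(body):
        answer = resolver.resolve(host)
        if answer is not None:
            resolved[host] = answer
    return resolved  # an empty dict corresponds to "links not found"


resolver = CachingResolver(slow_lookup, timeout=2.0)
body = "Buy now at http://spamvertized.example/offer"
print(report_links(body, resolver))  # first parse: {}
print(report_links(body, resolver))  # second parse: {'spamvertized.example': '192.0.2.1'}
```

In this model the first parse reports nothing because the lookup outlasts the timeout, while the second parse finds the answer already sitting in the cache, which matches the "second go" pattern reported at the top of the thread.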
Farelf Posted May 2, 2004

Thanks once again, Wazoo. Guess I shall just have to stay on my toes (not easy for one of "generous" proportions). A proportion (growing, maybe?) of those that really have no links are those rotten "tracking" ones, you know, with one's name and sometimes address embedded 20+ times where the standard munging doesn't get them. I shall take comfort in knowing we have the little devils interested enough to take the trouble to use "our" system ;-]