Finding links in message body -- no links found

nei1_j · May 24, 2013

Amigos!!!

I get some of my spam parsed, and send a few spam reports. It keeps me off the streets.

Sometimes, the (plain text) body of the spam includes a URL. But the parser only "Finds the links in the message bodies" about half the time.

Whether it's part of an <a href="http://blahblahURL/x"> term, or whether it's not surrounded by html code, that doesn't seem to matter. So, I assume that whether the parser will recognize the link or not has something to do with the URL itself.

Has anyone else noticed that? Is there any rhyme or reason?

And the erroneous message from the parser is, "Finding links in message body / no links found." I mean, I'm only a carbon-based life-form, but even I can find the links in the message body. Why can't Cisco?

I understand the organization whose links are found in the message body is probably a victim, and if they asked SpamCop to stop annoying them with reports (for spam that they have no control over), then I would abide. But if their Abuse department is accepting the reports, and if the parser would parse the "links in the message bodies" for me more regularly, then that would be groovy.

My occasional issues usually pertain to the Parser. Well, who you gonna call.

Happy Friday (I hope),

-neil-

turetzsr · May 24, 2013

Hi, nei1_j

...Please review SpamCop FAQ (links to which appear near the top left of each SpamCop Forum page) item labeled "SpamCop reporting of spamvertized sites - some philosophy."

...Some of the "hammers" best used for this "nail" are Knujon and Complainterator, references to which appear in various articles throughout the SpamCop Forums.

nei1_j · May 25, 2013

Hi turetzsr,

Your reply is so cryptic and obtuse, so replete with jargon, so obviously uninterested in real communication, that there's not a lot of motivation to decipher or follow it.

To top it off, you gave me a search term ("SpamCop reporting of spamvertized sites - some philosophy") that provides zero results in the Search function. Instead of providing me exercise, why not a working URL, instead? Or, gosh forbid, if you know the answer to the question and you notice there's a keyboard in front of you...

So, to humor a little, I used the Search function for the unusual term, "philosophy," to see if that would bring me to your article. The first page of returns shows zero topics with the word "philosophy" in its title, so that didn't work either. To humor a little more, I searched again, this time just for the single word, "spamvertized." Ah-ha! Maybe did you mean to direct me to the article, "SpamCop reporting of spamvertized URLs?"

So, you provided [only] a title to an article but not its URL, and you misnamed it. That's why the Search didn't work. Then, you spontaneously added a term into the alleged title that led to more wasted time. I think you're not so serious about helping people with questions, or else you'd answer a question plainly, or at least lead people in the right direction instead of giving them silly puzzles. I recommend replacing your forum.spamcop time with watching more Star Trek. They have dozens of TV series and movies, and I bet you need to catch up.

Anyway, back to "SpamCop reporting of spamvertized URLs." I have problems with the thread -- it could be my dyslexia... 1) It seems to start off in the middle of a conversation. 2) I don't know what the jargon "spamvertized" in the thread's title means, so I'm not even given a chance to know what the subject is that they're talking about -- and remember, Turetzsr neglected to mention what's the subject of the article he's [allegedly] directing me to, so I share responsibility for not having a clue. 3) The discussion is very informal and meandering, it's certainly not a concise answer to the question.

This is the answer you intended to give me?

And I should mention, if it were just one bad answer directed to me, I'd just walk away. But this is an example of the chronic way that "help" is provided here, and it's a problem for a lot of people. So it deserves some light.

Another search now, for "Finding links in message body."

How's this thread: Pinned: FAQ Entry: The Link Analysis Process

> Link analysis is performed by the SpamCop Parser, part of the SpamCop Parsing and Reporting Service. Finding links in message body is the first step of the process. It sometimes fails to find links that are really there - refreshing usually helps.

Not a concise instruction, but I think it means to refresh the parse-results. I would have always been afraid to corrupt the process by refreshing the parse-results, but I'll give it a try, next time. (Experimenting, I see that refreshing the parse-results is fast, painless, and doesn't corrupt anything.)

I'm looking at this advice about refreshing the parse results. They're reporting that the "refreshing" strategy only helps sometimes, and a lot of Ctrl-R commands could adversely load up the SpamCop servers. There's some trouble with the domain managers (RIPE, etc.) getting them to standardize on something as simple as requiring "abuse" addresses for ISPs, which would make it difficult for SpamCop to provide such an address if it doesn't exist. And I think the thread discusses additional problems contributing to the links-in-the-body-not-being-resolved problem that I didn't read yet.

The thread started in 2005. If this problem is 8 years old, and there has been a lot of talk about it, then might it finally be time to compose a clear, concise FAQ? It could be the first of its kind on the forum: clear, concise, short, directed, educational, useful. A possible, concise answer might be, "This problem was submitted to SpamCop in 2005. We will take this opportunity to re-submit it and see if there are any updates." That would be a two sentence answer, confirming to the user that (s)he is not crazy, that there is a problem, and that maybe it will be resolved although the gears of business grind slowly. So easy to roll into a clear, concise, and up-to-date FAQ, too.

And no link to a bloated, incomprehensible, 3-page thread required, although discussion is still available elsewhere on the forum. And if the situation changes, perhaps motivated by the discussion, then the FAQ can be updated. Or if the problem is rectified, then the FAQ can be eliminated, too.

When was the last time an FAQ was eliminated from the forum? Or edited for usability?

And a resolution to the wider problem begins to coalesce in my mind. My hypothetical answer could become the 1st addition to "SpamCop FAQ -- The Next Generation," the collection of clear, concise, organized FAQs, waiting for the day when they are complete enough that the bloated, jargon-laden, awkward, and mostly-indigestible Original FAQs can be retired and moved onto a 3.5" disk, where they'll stop giving people heartburn.

The tree of FAQs can't be navigated except with the Search function, and I guess the forum has been using the Search function as a crutch to avoid generating a 'usable' FAQ page. It points to the extent of the bloat, the lack of conciseness, and the lack of FAQ-order.

"FAQ," "Pinned," whatever you want to call them. "Concise" includes listing them on one page. Not scattered willy-nilly around the forum. If someone has a problem or a question, they go to the "FAQ [The Next Generation]" page. The polar opposite would be to go on a wild goose chase around the website.

This would be a good time to highlight a bad habit that has been going on in this forum for too long: If a member has the attitude that he's unwilling to plainly answer a question because it was "already answered previously in the Forum and why should I keep repeating myself," then let him follow his original instincts and simply "don't answer." Leave room for someone else who's helpful enough to make an effort to provide a real answer.

Given the state of documentation on the forum, it's an exercise in sadism when an experienced forum member provides the answer, "Read the FAQ," or "Read this thread." Doesn't this boilerplate answer deny how unsuitable some of these posts are for serving as FAQs? Must users who come to the forum for help be forced to continue dealing with the disingenuousness of crusty old forum members? Isn't the service too important to let that continue?

I would almost suggest, as an alternative to directly answering a fair question, providing instead a URL for an FAQ that directly answers the question. But given the disastrous shape of the existing FAQs, I can't suggest it. If a "Frequent Question" is going to be tersely answered only by providing a URL to an FAQ, then that FAQ should be one that was composed specifically to answer that Frequent Question, and it should include the traits of being explanatory, useful, with a dose of "Executive Summary," painlessly providing the answer instead of burying the user under tons of incomprehension.

I think y'all should begin with Step # 1: Stop blaming forum users (if there are any remaining) for not being able to deal with your "FAQs." The problem lies in the FAQs, not in the users.

And how's this for Step #2, if I might suggest: There are no stupid questions, and it's time to start giving them real answers. If you think the question is stupid and too beneath you to answer seriously, then leave it alone and let someone else deal with it. An ill-considered, careless, and offhand answer is destructive -- worse than no answer at all.

Thanks,

-neil-

Farelf · May 25, 2013

Aagh, Neil, mercy! Unless you see a "SpamCop Staff" tag against a respondent's name, we're just volunteers, ordinary members of questionable sanity doing what we can, time after time. Some of us are dead even.

Have a look perhaps at the concurrent http://forum.spamcop.net/forums/index.php?showtopic=13310 which looks at the question of Base64-encoded message bodies but walks through some of the issues in a fairly sedate manner. Yes, yes, points to the "philosophy" topic which you hate ... but over-all tries to cover some of the complexities without getting too bogged down.

Yell for more help, point by point - there are issues within issues and wrinkles on wrinkles (don't shoot the messenger) but most things can be resolved or at least explained. In time ...

alvarnell · May 25, 2013

Here is the referenced FAQ http://forum.spamcop.net/forums/index.php?showtopic=4085

dbiel · May 25, 2013

nei1_j, I regret to read of your unhappy experience. The FAQ in the forum can be difficult to navigate, which is why the SpamCop Wiki was created. The link is SpamCop Wiki Home Page

You will find links at the top of that page, as well as every other page, to a Category Listing and a Page Index; there is also a glossary.

The terms are linked back and forth throughout the wiki.

Here is the link to a definition of Spamvertized URL

I believe you will find the Wiki much easier to navigate than the forum FAQ

petzl · May 25, 2013

> Amigos!!!
>
> I get some of my spam parsed, and send a few spam reports. It keeps me off the streets.
>
> Sometimes, the (plain text) body of the spam includes a URL. But the parser only "Finds the links in the message bodies" about half the time.
>
> -neil-

My "findings" are that sometimes it takes too long for SpamCop (located in the USA, in Colorado) to connect to a site (the URL is often a redirection on a zombie personal computer), so it won't. Other times the site blocks SpamCop's browser identity: browsers (Explorer, Firefox, SpamCop) identify themselves when connecting, and the receiving server can block them on that basis.

If you have the time or care to, you can help your report by getting a program like IPNetInfo and adding what it finds to your report.
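To illustrate the identity-based blocking described above, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for the demonstration: the `ExampleParser` identity string, the `PickyHandler` server, and the `probe` and `demo` helpers are hypothetical, and nothing here reflects how SpamCop or any real spam site is actually implemented. It runs a throwaway server on the local machine that refuses one User-Agent, then asks for the same page under two identities, with a timeout so a slow host cannot stall the check.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

BLOCKED_AGENT = "ExampleParser"  # hypothetical identity, not a real product string


class PickyHandler(BaseHTTPRequestHandler):
    """Local stand-in for a site that refuses one client identity."""

    def do_GET(self):
        agent = self.headers.get("User-Agent", "")
        status = 403 if BLOCKED_AGENT in agent else 200
        self.send_response(status)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def probe(url, user_agent, timeout=5.0):
    """Return the HTTP status seen under this identity, or None on timeout/failure."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    # Ignore any proxy settings so the demo talks directly to the local server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except OSError:  # URLError and socket timeouts are OSError subclasses
        return None


def demo():
    """Fetch the same local page under two identities and return both statuses."""
    server = HTTPServer(("127.0.0.1", 0), PickyHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/"
    try:
        return probe(url, "ExampleParser/1.0"), probe(url, "Mozilla/5.0")
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` returns one status per identity for the same page, so any difference comes from the identity alone. A `None` from `probe` would correspond to the other case above, where the connection takes too long or fails.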

nei1_j · May 25, 2013

Hi Farelf,

Good to see you, in an alphanumeric sort of way.

When I described my links-in-bodies not being found by the parser, I missed a relevant part of my spams' description.

I "don't display images" in any of my email accounts by default, so I don't ever include miles of Base64 in my submissions.

So, any links-in-bodies that I submit come from linked text, or if the URL was displayed in the body. Either as part of an html term, or just the URL without html complications. Either way, it's pretty plain text.
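As a rough illustration of those two cases (a URL inside an html term, and a bare URL in the text), here is a small sketch using Python's standard `html.parser`. The names `LinkCollector`, `BARE_URL`, and `links_in_body` are made up for this example, and it is not a description of how the SpamCop parser works.

```python
import re
from html.parser import HTMLParser

# Bare URLs in text: a scheme followed by anything up to whitespace, quotes, or angle brackets.
BARE_URL = re.compile(r"https?://[^\s<>\"']+")


class LinkCollector(HTMLParser):
    """Collect links from <a href="..."> attributes and from plain text."""

    def __init__(self):
        super().__init__()
        self.links = []

    def _add(self, url):
        if url not in self.links:
            self.links.append(url)

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self._add(value)

    def handle_data(self, data):
        for url in BARE_URL.findall(data):
            # Trailing sentence punctuation is almost never part of the link.
            self._add(url.rstrip(".,;:!?)"))


def links_in_body(body):
    """Return the links found in a message body, in order of appearance."""
    collector = LinkCollector()
    collector.feed(body)
    collector.close()
    return collector.links
```

Note that this sketch only recognizes links that carry an `http://` or `https://` scheme. A body containing nothing but a bare domain name would yield no links here, which is a design choice to avoid false positives, not a claim about what any real parser does.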

So, what I submit for "bodies" is usually a very small bit of data. At least, a little plain text because SpamCop won't accept a submission unless there's something in the body. Even better, the URL of the phish or whatever, also in plain text.

I don't know if those long miles of Base64 have URLs encoded in them, which might be a challenge for the parser to decode. But the URLs I submit are in plain text, frequently the only thing I'm submitting for a "body," and even a caveman could recognize they're URLs.

And half the time, the parser doesn't report them.

At the moment, I'm leaning towards the problem being a glitch at SpamCop. I need some experience to see if hitting my Refresh button will deliver those missing URLs to the parser output.

Hope springs eternal.

What started as an inquiry is losing its sheen; the important thing is to notify the ISP from whence comes the spam. Not so important is the URL that is unfortunate enough to be written into the [body of the] spam. And therefore, not so important if the parser has a challenge seeing it, for whatever reason / lack of reason.

CU,

-neil-

turetzsr · May 25, 2013

Hi, nei1_j,

<snip>
Your reply is so cryptic and obtuse, so replete with jargon, so obviously uninterested in real communication, that there's not a lot of motivation to decipher or follow it.

...Well, we'll have to agree to disagree. First, my reply was intentionally terse, as I have a day job that keeps me busy but I wanted to at least get you pointed in what I hoped would be a helpful direction right away without spending too much time. Second, I see no "jargon" in any of the content of my reply that I composed other than "FAQ" and that doesn't seem to be a concept with which you had a problem. "Spamvertized" seems to be the problem word but that isn't my term, that is a quote right from the text of the link to which I was pointing you.

To top it off, you gave me a search term ("SpamCop reporting of spamvertized sites - some philosophy") that provides zero results in the Search function.

...Sorry, I don't see how you got "search term" from "SpamCop FAQ." I was hoping you'd find one of the links to the FAQ, navigate to it, and peruse the page for the term I supplied.

Instead of providing me exercise, why not a working URL, instead? Or, gosh forbid, if you know the answer to the question and you notice there's a keyboard in front of you...

...Yes, I could have spent a lot more time than I did and repeated the answer that someone else already took a lot of time to compose, if I'd had the time (or the inclination, or the rudeness to treat someone as a moron who is incapable of doing a bit of her/his own research having been given a bit of further information; I didn't know you well enough to make that judgment, and now that you've posted this reply, I know I'd have been completely wrong to have judged you so). Your next sentence shows you did exactly what I'd hoped.

<snip>
So, you provided [only] a title to an article but not its URL, and you misnamed it.

...Um, no, I didn't provide a title to an article, it was the text of a link to the article on a FAQ page. If I wrote something that made you think that it was the title of an article, I do apologize, but I don't see it (do let me know what it was, if you have the inclination to do so).

<snip>
Anyway, back to "SpamCop reporting of spamvertized URLs." I have problems with the thread <snip>

...Understandable, which is why we have a way to add additional questions to Forum topics, as you did, here. And if you have any suggestions for improving the way the message is presented, we'd be happy to consider them for inclusion in the FAQ.

<snip>
I don't know what the jargon "spamvertized" in the thread's title means,

...That's why we have a link called "SpamCop.net Glossary" near the top of the page in the drop-down list labeled "FAQs & Words," although I admit it is a bit hidden (but I do not know how to fix, or even whether I have the power to do so -- I suspect I do not) but since you know how to use the search you could have used that, instead.

<snip>
3) The discussion is very informal and meandering, it's certainly not a concise answer to the question.

...That is true and is a common condition of the help given here in the Forum by us volunteers.

This is the answer you intended to give me?

...No, it wasn't intended to be the precise answer, it was intended to be one stop along the journey towards gaining knowledge on the subject about which you were asking (and perhaps learning other useful things along that journey).

<snip>
But this is an example of the chronic way that "help" is provided here, and it's a problem for a lot of people. So it deserves some light.

...And we hope you will be kind enough to help out by providing "better" "help." We get lots of general complaints but no concrete participation to improve the FAQ. You have what I think are a lot of valuable ideas ... now if we can get some volunteers with the time to execute them, we'd be on our way. But the latter is what we've been missing all these years since the FAQ was first written!

...Incidentally, please be aware that the FAQs with URLs like "http://www.spamcop.net/fom-serve/cache/122.html" are not readily changeable and such changes can only be done by SpamCop staff, not us Forum volunteers.

<snip>
This would be a good time to highlight a bad habit that has been going on in this forum for too long: If a member has the attitude that he's unwilling to plainly answer a question because it was "already answered previously in the Forum and why should I keep repeating myself," then let him follow his original instincts and simply "don't answer."

...Sorry, I do not agree that is a good approach. If everyone followed that advice, there would be precious little help of any kind provided here; many visitors are able to find a sufficient answer to their questions once given something like my "cryptic and obtuse, so replete with jargon, so obviously uninterested in real communication" reply to you.

Leave room for someone else who's helpful enough to make an effort to provide a real answer.

...The initial terse reply does not preclude someone else coming along later with a "real answer." In the meantime, the terse reply often sends the OP in the right direction and gets her or him to content that gives them enough knowledge.

<snip>
I think y'all should begin with Step # 1: Stop blaming forum users (if there are any remaining) for not being able to deal with your "FAQs." The problem lies in the FAQs, not in the users.

...No one here, certainly not I, blames anyone for not being able to deal with the FAQs. I do have a problem with people who aren't willing to put in a little of their own effort to find answers, especially when given some kind of hint, such as the text of a link in the "SpamCop FAQ" (and, no, I am not accusing you of not being willing to put in some effort -- to the contrary, this reply of yours demonstrates a remarkable amount of effort and the time you put into offering suggestions is manifest, thank you!).

<snip>
And how's this for Step #2, if I might suggest: There are no stupid questions, and it's time to start giving them real answers. If you think the question is stupid and too beneath you to answer seriously, then leave it alone and let someone else deal with it. An ill-considered, careless, and offhand answer is destructive -- worse than no answer at all.

...One person's "ill-considered, careless, and offhand answer" is another's gold mine. I don't believe someone's inability to find everything they need from my one little pointer is her/his fault but I do expect that she/he do at least a bit of research and come back with additional questions or give up and go away quietly. Preferably (much preferably) the former! As you did.

nei1_j · May 25, 2013

Hi petzl.

Thanks for the IPNetInfo. It sounds like a more powerful parser than the one in SpamCop?

Best luck,

-neil-

--------------------------------------------------------

Hi dbiel.

Like neil, but different.

SpamCop Wiki! That's news to me. Are you sure you don't want to rename it "SpamCop FAQ, The Next Generation?"

I'll be reading you.

But now, it's after midnight. Tomorrow.

Best luck,

-neil-

Farelf · May 25, 2013

Hi Neil,

... I don't know if those long miles of Base64 have URLs encoded in them, which might be a challenge for the parser to decode. But the URLs I submit are in plain text, frequently the only thing I'm submitting for a "body," and even a caveman could recognize they're URLs.
And half the time, the parser doesn't report them.

At the moment, I'm leaning towards the problem being a glitch at SpamCop. I need some experience to see if hitting my Refresh button will deliver those missing URLs to the parser output.

Hope springs eternal.

What started as an inquiry is losing its sheen; the important thing is to notify the ISP from whence comes the spam. Not so important is the URL that is unfortunate enough to be written into the [body of the] spam. And therefore, not so important if the parser has a challenge seeing it, for whatever reason / lack of reason. ...

We learn from each other, always learning (and trying not to forget as quickly). I had my own little grumble (universally ignored) in http://forum.spamcop.net/forums/index.php?showtopic=13285 about an even "worse" case - but there are possible reasons, as mentioned there. And petzl, in this topic, points to another. Wrinkles on wrinkles.

Whenever the parser spits the dummy, if I can, I use one of those other tools he mentions to resolve and add my own report(s) for SC to send, additional to the one SC devnulls or for which it fails to find the address (takes a bit of thought to avoid spamming thereby some hapless drudge whose responsibilities are in no way related to abuse handling or maybe sending a report to a spammerbase). The important thing is to work within your comfort zone and to avoid burn-out - we need all the reporters we can get (glad to see the stats are trending upwards again). Don't be too impatient, I'm reasonably certain there will always be lots more spam to tackle, despite what Mr. Bill Gates said all those years ago. Yes, there's hope for us all - I mean he got so many things utterly wrong and still owns half the observable universe despite giving it away as fast as he can. Just a little exaggeration there, for emphasis.

But yes, there are sharper tools to tackle spamvertized links, as Steve T mentioned. Add those if you want, when you want. The main game is to alert networks about spam sources right from the early stages of breakouts (both the inadvertent sources and, via the SC block-list should it get that far, the receiving networks) - and to provide loads of data to those spam source administrators should they have a wish to fix their problem. Or to deny them wriggle room should they not. Spamvertized website reporting via SC is a bonus - when it works, if the hosts are whitehat, if the spammer hasn't seeded in innocent bystanders, etc. or, as the old network joke has it, when our moon is in the fifth house with Venus ascending.

karlisma · June 7, 2013

Just some experience of submitting spam to SpamCop and waiting to see whether SpamCop will parse the message body for links or not:

SpamCop's parser does not pick up links encoded and written in koi-8, although I found a forum topic marked [resolved] for that. My human memory tells me that a link with koi-8 characters was parsed correctly 2 times in 100 messages.

When it does not parse, the answers are (Parsing text part, no links found) if there's no http://www. at the beginning,

 http://аптека-онлаин.рф

or (www is not a routeable IP address Cannot resolve http://www/) for links that have www at beginning

 http://www.мода-на-футболки.рф

You would say there's a philosophy of not finding spamvertised sites' abuse addresses and reporting them, but... it ain't so. This particular spammer was using the bit.ly system to hide the spamvertised site from the parser. I took the time to report the link manually [at]bit.ly, and spam with those particular links stopped right then.

Now he is using links like

 http://www.мода-на-футболки.рф

which are not picked up 99% of time.
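The two answers reported above look like what a link finder would produce if it only accepted ASCII letters in host names. That is a guess, not a diagnosis of SpamCop's code. The sketch below shows the effect with two hypothetical patterns, `ASCII_URL` and `UNICODE_URL`, and a hypothetical helper `to_ascii_host` that converts an internationalized host name to its ASCII form with Python's built-in `idna` codec before any DNS lookup.

```python
import re
from urllib.parse import urlsplit

# Hypothetical ASCII-only pattern: host labels may contain only A-Z, a-z, 0-9 and "-".
ASCII_URL = re.compile(r"https?://[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*")

# Unicode-aware variant: in Python 3, \w also matches Cyrillic letters.
UNICODE_URL = re.compile(r"https?://[\w-]+(?:\.[\w-]+)*")


def find_links(pattern, body):
    """Return every substring of body matched by the given URL pattern."""
    return pattern.findall(body)


def to_ascii_host(url):
    """Convert the host part of a URL to its ASCII (xn--) form for lookups."""
    host = urlsplit(url).hostname
    return host.encode("idna").decode("ascii")
```

With the ASCII-only pattern, a link whose host starts with a Cyrillic letter is not matched at all, and a link that starts with `www.` is cut off right after `www`. The Unicode-aware pattern keeps the whole host, which `to_ascii_host` can then turn into a name that ordinary DNS tools accept.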

Farelf · June 7, 2013

Thanks karlisma - helpful observations.

...
But now, it's after midnight. Tomorrow. ...

Tomorrow came ... but neil didn't.

nei1_j · June 7, 2013

Tomorrow came ... but neil didn't.

Hey, thanks for looking me up. Wow, either you guys were up antispamming pretty early, or pretty late. Good; we need people to report the 3 AM spams "freshly."

KOI8-R links

If I were a computer programmer and were required to provide a Russian version, I'd probably switch to flower arrangement.

My spamcop account must have been given a higher priority in the parser after I bitched-&-moaned, because since then, I have not had a single link-in-body that the parser did not recognize as a URL.

That's not to say that the links were all valid URLs. Most were not (according to the parser, "Can't be resolved"). But at least the parser recognized they were links and tried 'em all.

Sometimes I wonder why a spammer would provide a non-working link. I'll have to email one of them and ask why; just kidding.

Well, I guess the other possibility is that the links are "good," but the parser is a slacker at resolving them. I'm reading Farelf's thread about the parser proclaiming that "links cannot be resolved": http://forum.spamcop.net/forums/index.php?showtopic=13285

So, we're looking at two potential problems with the parser. 1) The parser may have trouble recognizing a link-in-body, even if it's plain-as-day, which I was complaining about. 2) The parser may successfully recognize a link-in-body, but then pronounce "It cannot be resolved," even though it can.
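The second problem can be checked separately from the first, without visiting the site at all. A minimal sketch, assuming a hypothetical helper named `link_resolves`: it asks the local resolver whether the link's host name has an address. It says nothing about why any particular parser reports "cannot be resolved", and the answer depends on the DNS servers your own machine uses.

```python
import socket
from urllib.parse import urlsplit


def link_resolves(url):
    """Return True if the host in url resolves to at least one address."""
    host = urlsplit(url).hostname
    if not host:
        return False
    try:
        # getaddrinfo only performs the name lookup; it does not contact the site.
        socket.getaddrinfo(host, None)
    except (socket.gaierror, UnicodeError):
        return False
    return True
```

If `link_resolves` says yes for a link that a report lists as unresolvable, that points at the second problem and not the first. Note that `getaddrinfo` has no timeout of its own, so a slow resolver can make this call wait.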

If we wanted to confirm if a link-in-body is good, the obvious thing to do would be try to browse to the spam-URL. If you can browse there, but the almighty Parser says the link "Cannot Be Resolved," then we have a problem, Houston. I think it would be best to use a honeypot computer to browse to spam-links, because you could find yourself in a dark, dirty, lonely, infectious corner of the Internet, and you don't want to risk, for instance, your primary computer.

But if you browse there and get out alive, please tell us all what you saw there, if your anti-virus software went off like a pin-ball machine, and if you purchased any Viagra.

As for the koi-8 problem, I haven't had any Russian URLs lately, but Karlisma has been getting his share, lately, and he sees that the parser is having severe problems processing KOI8-R links, even though the latest word is that it's supposed to be capable. So Karlisma, if you think it's a significant problem, I might recommend starting a new thread focusing on your observation, and send a PM (Private [or Provocative] Message) to "SpamCopAdmin", alerting him of the new thread.

It's been years, but I recall SpamCopAdmin is a decent chap. We'll see what he can do.

Best luck,

-neil-

karlisma · June 10, 2013

> Hey, thanks for looking me up. Wow, either you guys were up antispamming pretty early, or pretty late. Good; we need people to report the 3 AM spams "freshly."
>
> If I were a computer programmer and were required to provide a Russian version, I'd probably switch to flower arrangement.
>
> My spamcop account must have been given a higher priority in the parser after I bitched-&-moaned, because since then, I have not had a single link-in-body that the parser did not recognize as a URL.
>
> That's not to say that the links were all valid URLs. Most were not (according to the parser, "Can't be resolved"). But at least the parser recognized they were links and tried 'em all.
>
> Sometimes I wonder why a spammer would provide a non-working link. I'll have to email one of them and ask why; just kidding.
>
> Well, I guess the other possibility is that the links are "good," but the parser is a slacker at resolving them. I'm reading Farelf's thread about the parser proclaiming that "links cannot be resolved": http://forum.spamcop.net/forums/index.php?showtopic=13285
>
> So, we're looking at two potential problems with the parser. 1) The parser may have trouble recognizing a link-in-body, even if it's plain-as-day, which I was complaining about. 2) The parser may successfully recognize a link-in-body, but then pronounce "It cannot be resolved," even though it can.
>
> If we wanted to confirm if a link-in-body is good, the obvious thing to do would be try to browse to the spam-URL. If you can browse there, but the almighty Parser says the link "Cannot Be Resolved," then we have a problem, Houston. I think it would be best to use a honeypot computer to browse to spam-links, because you could find yourself in a dark, dirty, lonely, infectious corner of the Internet, and you don't want to risk, for instance, your primary computer.
>
> But if you browse there and get out alive, please tell us all what you saw there, if your anti-virus software went off like a pin-ball machine, and if you purchased any Viagra.
>
> As for the koi-8 problem, I haven't had any Russian URLs lately, but Karlisma has been getting his share, lately, and he sees that the parser is having severe problems processing KOI8-R links, even though the latest word is that it's supposed to be capable. So Karlisma, if you think it's a significant problem, I might recommend starting a new thread focusing on your observation, and send a PM (Private [or Provocative] Message) to "SpamCopAdmin", alerting him of the new thread.
>
> It's been years, but I recall SpamCopAdmin is a decent chap. We'll see what he can do.
>
> Best luck,
>
> -neil-

Yes, I think it is a problem. The same spammer who is bugging me once in a while uses bit.ly links if he has a small-budget client who is not into buying a koi-8 domain name.

Manual reporting to bit.ly stops those messages immediately. So I think, "it would be great" if spamcop picked up all sites and reported them. The thing is, those are not unreal/fake/unresolvable addresses. I've tried them all.

Today's stats - 35 unpicked sites with both answers ("no links found" if no www follows http://; and "cannot resolve http://www." if http:// is followed by those magic letters of WorldWideWeb)

turetzsr · June 10, 2013

<snip>
So I think, "it would be great" if spamcop picked up all sites and reported them.

<snip>

...Many SpamCop users would. However, history suggests that it just isn't going to happen. You might wish to consider using a tool like Knujon or Complainterator for reporting Spamvertized links. What I've done is to add the Knujon address, nonregistered[at]coldrain.net, to my SpamCop reporting account's Preferences | "Report Handling Options" | "Public standard report recipients" so I can easily add Knujon to the report recipients to which SpamCop sends my complaints. You could also add the abuse address for bit.ly to use when appropriate.

Farelf · June 10, 2013

Agree with Steve T except the parser's apparent refusal to deal with "www." text URL/URIs is apparently something recently broken and should be investigated (though perhaps not quite so clean-cut, maybe affecting only lookups made on one or two RIRs, maybe some other specificity). Such bugs are, too many times, indicative of other, more subtle defects of unsuspected (thus unassessed) potential for mischief or mayhem.

Much conjecture (and much more unstated) in http://forum.spamcop.net/forums/index.php?showtopic=13285 however karlisma's observation seems to go some way towards explaining it neatly - though (hopefully) setting off alarm bells deep within the bowels of the body corporate, neatness not meaning devoid of significance.

nei1_j · June 11, 2013

... parser's apparent refusal to deal with "www."

I can match that for absurdness. I got a plain-text link-in-body that was ".net." Even after reloading the parser a few times, I couldn't get the parser to notice it was a URL.

I have trouble following directions, such as Farelf suggesting IPNetInfo as a DNS lookup tool. Instead, I've had http://network-tools.com/ bookmarked for a few decades. I select "Express," then fill in the URL, and click Go. Usually, it provides an "Abuse" email address (towards the bottom of the page). Otherwise, there'll be another email address, which I'm happy to use in those cases.

I've been using network-tools.com a lot, lately, and they haven't kicked me off, yet. It is, however, an extra step -- that takes extra time -- in the reporting of spam that wouldn't be necessary if the parser were working properly.

---

I might have missed something... What was the general consensus about why Cisco refuses to properly maintain SpamCop, like a good netizen in their position would be expected to do? (E.g., can't handle URLs with "www" or .net...) Did they purchase it to kill it?

nei1_j · June 11, 2013

Ugh. Link-in-the-body, it said "verizon.com." The parser couldn't see it.

Pitiful.

turetzsr · June 11, 2013

<snip>
I might have missed something... What was the general consensus about why Cisco refuses to properly maintain SpamCop, like a good netizen in their position would be expected to do? (E.g., can't handle URLs with "www" or .net...) Did they purchase it to kill it?

<snip>

...I think the consensus here is (as I was hoping the FAQ entry to which I referred you earlier would lead you to conclude but I admit that I haven't read it carefully myself and therefore am not surprised that it may not make clear) that decoding spamvertized links is simply not a priority for SpamCop (and hasn't been, even before it was purchased by Cisco); they leave more comprehensive, reliable treatment of those to products developed expressly for that purpose, such as Knujon and Complainterator, to which I referred in an earlier reply to Karlisma.

nei1_j · June 14, 2013

...I think the consensus here is (as I was hoping the FAQ entry to which I referred you earlier would lead you to conclude but I admit that I haven't read it carefully myself and therefore am not surprised that it may not make clear) that decoding spamvertized links is simply not a priority for SpamCop (and hasn't been, even before it was purchased by Cisco); they leave more comprehensive, reliable treatment of those to products developed expressly for that purpose, such as Knujon and Complainterator, to which I referred in an earlier reply to Karlisma.

karlisma · September 11, 2013

Ah, your answer is not encouraging at all. It is sooooo bad.

While I think that reporting here does good to all others except for me (because I already got that spam in my box).

But to the question: it has always been said that the philosophy of SpamCop is to report just the sender, and I have spent my time reading this philosophy. Nonetheless, the philosophy still keeps the link parser alive, so why not make it better? The question is not whether the parser can pick up those links; it obviously can. It just cannot adjust to "web 2.0", which is already 5-10 years old, the web that allows domain names in any other alphabet there is in the world.

Let's say, as I see it, the parser has a hard time translating addresses like http://www.адресс.рф into a form that is understandable to all parsers. (It translates into

http://xn--80aid7bga.xn--p1ai

or into

http://xn--80aid7bga.рф

because .рф is an accepted domain under this protocol: http://www.icann.org/en/resources/idn.)
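The translation described above can be reproduced in a few lines of Python, as a small sketch rather than a statement about how SpamCop's parser works or should work. It uses Python's built-in "idna" codec on the Cyrillic host name адресс.рф, which is what xn--80aid7bga.xn--p1ai decodes to; the variable names are only for illustration.

```python
# The Cyrillic host name from the post above, without the scheme.
hostname = "адресс.рф"

# Python's built-in "idna" codec (IDNA 2003) encodes each dot-separated label
# and adds the "xn--" prefix to labels that contain non-ASCII characters.
ascii_hostname = hostname.encode("idna").decode("ascii")
print(ascii_hostname)

# Decoding the ASCII form recovers the original spelling.
round_trip = ascii_hostname.encode("ascii").decode("idna")
print(round_trip)
```

Once the host name is in this ASCII form, any ordinary DNS lookup can handle it, so the non-Latin alphabet by itself should not stop a link from being resolved.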

And - why not make the damn thing here better, cooler and more usable, living according to times that are changing.... i.e. being progressive?

turetzsr · September 11, 2013

<snip>
While I think that reporting here does good to all others except for me (because I already got that spam in my box).

...And potentially helps you, in future, too, if it helps stop spam from recurring from the same source and/ or if you (or your ESP) use the SpamCop blacklist to filter your own e-mail.

<snip>
Nonetheless, the philosophy still keeps the link parser alive, so why not make it better?

<snip>

And - why not make the damn thing here better, cooler and more usable, living according to times that are changing.... i.e. being progressive?

...Because SpamCop staff have a choice between spending their time on their core mission, identifying spam sources and keeping the tool running well, vs improving the identification of spamvertized sites, something that other tools, like Knujon and Complainterator, do better than SpamCop can ever hope to do.

karlisma · September 11, 2013

...And potentially helps you, in future, too, if it helps stop spam from recurring from the same source and/ or if you (or your ESP) use the SpamCop blacklist to filter your own e-mail.

...Because SpamCop staff have a choice between spending their time on their core mission, identifying spam sources and keeping the tool running well, vs improving the identification of spamvertized sites, something that other tools, like Knujon and Complainterator, do better than SpamCop can ever hope to do.

It doubles the work to be done, like for me, as a volunteer.

Besides: Get things right and push those two out of the market.

turetzsr · September 11, 2013

It doubles the work to be done, like for me, as a volunteer.
<snip>

...Not if you add Knujon and/ or Complainterator e-mail addresses to your list of SpamCop reporting "Public standard report recipients" (which can be found in Preferences | Report Handling Options). Knujon, for example, is nonregistered[at]coldrain.net.
