eric

SC Forum is being Googled?


:(

I hate it when I'm the last to notice something... Google has been spidering and archiving this forum, and that's a very different policy than the [unfortunately deprecated] newsgroups officially have/had. No doubt other search engine spiders have been feeding as well.

Is there an official reason why there is not a robots.txt prohibition for search engines on the forum pages?

If not, then posters may want to be aware that their words not only will be preserved for posterity, but that they will be searchable and "Googleable".

Another reminder that you might not want to post here, or in a newsgroup, anything that you would not want printed out and posted on a wall somewhere.

[Timely holiday tie-in:] Or sent to your mother!

This is a civilized society. Be civil at all times...

Share this post


Link to post
Share on other sites

You may want to visit the (your) www.spamcop.net page .. hit help .. try the search link .... http://www.spamcop.net/help.shtml#search ... pick your section ... then take a look at the top of the resulting Google output/results page .... (Coincidentally, I had just challenged JT about the archiving of the 'spamcop' newsgroup last night . only to get slapped around a bit <g> ... I was looking for 'spamcop' and therefore not recognizing the 'spamcop-list' results .. ooops!) ... It was way back when Courtney (Ironport staffer) was doing some updates on the www.spamcop.net FAQ that both I and DavidT were hitting her from both sides with 'suggestions' . one of which was the addition of the web-Forum to the above mentioned search page ... and I know that I've mentioned a few times that Google is a bit more useful than the search utility provided within this application due to things like the <4 character word issue ...

The reminders and edits done here to posted e-mail addresses have always been to reduce the spammer scraping and exposure of some data in the search engine results. A number of folks have been asked if they'd care to change their registered name 'here' .. somehow managing to use an e-mail address .. but only one user has ever asked for this action. (since I was given that capability I should say)

Newsgroup activity is still on-going .. as a matter of fact, I just stole two posts over there and added them to the Forum FAQ here ... one new entry, one entry updated with additional content .... the spamcop.mail group is about dead, spamcop.help has a little traffic (a current thread there with an old 'friend' once again bitching about my 'testiness' in a response to one of his queries) .. but most traffic has shifted to the 'spamcop' newsgroup, still active with input from lots of folks that refuse to 'lower their standards' and come over 'here' ... (and a handful of folks that participate in both worlds)

Share this post


Link to post
Share on other sites

Yes, scraping of email addresses and other information is a problem, but Google and other forthright search engines do respect the robots.txt file. Even though spammers don't, it seems to me (ISTM) that it would be worthwhile for the SC forum to discourage search engine glomming of the forum since there is a reasonable internal search capability.

Or perhaps prohibit Googlebot via the robots.txt, and feed a suitable input to Google via their subscriber feed, perhaps masking email addresses and other personal identifying information.
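To make the masking idea concrete, here is a minimal Python sketch. It is only an illustration, not anything the forum software is known to do: the regular expression, the mask_emails name, and the "[address removed]" placeholder are all assumptions, and the pattern is deliberately simple rather than a full address grammar. It also catches the "[at]" spelling, since that form is just as easy to scrape.

```python
import re

# Hypothetical pattern: a local part, then "@" or the "[at]" spelling,
# then a dotted domain. Deliberately simple, not a full RFC 5322 grammar.
EMAIL_RE = re.compile(
    r"[A-Za-z0-9._%+-]+(?:@|\[at\])[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+",
    re.IGNORECASE,
)


def mask_emails(text: str, replacement: str = "[address removed]") -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_RE.sub(replacement, text)


if __name__ == "__main__":
    sample = "write to neko[at]spamcop.net or user@example.com today"
    print(mask_emails(sample))
```

Replacing the whole address, rather than just swapping the at-sign, leaves nothing for a scraper to reassemble.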

The scrapers who find anything with an at-sign in it (like neko[at]spamcop.net which is an eminently scrapable, never validly used, even spam-trappy address) will send spam to those addresses. Reputable spiders will never get to those strings because they obeyed the robots.txt prohibition which kept them from fetching that file in the first place.
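For anyone who wants to see how a well-behaved spider would read such a prohibition, here is a rough Python sketch using the standard library's urllib.robotparser. The robots.txt content and the /forums/ path are assumptions made up for illustration, not the forum's actual file. A scraper that never consults robots.txt is of course unaffected by any of this.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Googlebot out of /forums/ and leave
# every other crawler unrestricted. The path is an assumption.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /forums/

User-agent: *
Disallow:
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a compliant crawler with this user agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://forum.spamcop.net/forums/index.php"
    print("Googlebot:", allowed("Googlebot", page))
    print("SomeOtherBot:", allowed("SomeOtherBot", page))
```

A compliant crawler runs a check like this before requesting a page, which is why it never sees the address strings in the first place.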

I'm very bummed that the newsgroups are deprecated to the extent they are. I hate being forced to be online to read any single posting, rather than fetching a newsfeed and reading all articles at my leisure. But I've been a Usenet denizen since '82, as admin of UUCP host 'wombat', so what could I possibly know...

Share this post


Link to post
Share on other sites
I'm very bummed that the newsgroups are deprecated to the extent they are.  I hate being forced to be online to read any single posting, rather than fetching a newsfeed and reading all articles at my leisure.

27517[/snapback]

I hadn't realised that the newsgroups are deprecated. Some folk over there dislike the web forums, some dislike the newsgroups and others float back and forth.

I guess they each have a life of their own so some stuff gets dealt with in one place and some the other. Certainly the web-based forums are where many of the first time seekers for assistance come but I suspect that's because the first place to go when seeking help is the web and reject messages take the person with the message to the web as well.

But I'm not aware of any deprecation going on <_<

Andrew

Deprecate: 1. To express disapproval of; deplore. 2. To belittle; depreciate.

deprecation: a prayer to avert or remove some evil or disaster

:D

Share this post


Link to post
Share on other sites
I hadn't realised that the newsgroups are deprecated.

27525[/snapback]

Actually, both spamcop.help and spamcop.mail are going away entirely.

JT

481[/snapback]

EDIT: Please note that the quote above is actually quite old, from JT's January 29, 2004 Post in the 8th Topic here in this Forum's infancy, http://forum.spamcop.net/forums/index.php?...=findpost&p=481. Edited by Jeff G.

Share this post


Link to post
Share on other sites

But the "going away" message you posted only mentions ".help" and ".mail" but not the general "spamcop" group. You could say that those two sub-groups have been "deprecated," couldn't you?

DT

Share this post


Link to post
Share on other sites

Yes, you could.

Share this post


Link to post
Share on other sites
Deprecate:  1.  To express disapproval of; deplore. 2. To belittle; depreciate.

deprecation: a prayer to avert or remove some evil or disaster

Those are not from "technical" dictionaries...if you google "deprecated" and scroll past the first few definitions collected by "answers.com," you'll find these:

deprecate (from the Computer Desktop Encyclopedia):

To make invalid or obsolete by removing or flagging the item. When commands or statements in a language are planned for deletion in future releases of the compiler or rendering engine, they are said to be deprecated.

deprecation (from the Wikipedia):

In computer software standards and documentation, deprecation is the gradual phasing-out of a software or programming language feature.

DT

Share this post


Link to post
Share on other sites
But I'm not aware of any deprecation going on  <_<

27525[/snapback]

Well, since several key SC personages have stated that they intend to frequent the forums, and not the newsgroups, it seems clear that "the newsgroups are deprecated to the extent they are."

I'm sorry you were not aware of it.

BTW, your definitions are deficient.

deprecated

Said of a program or feature that is considered obsolescent and in the process of being phased out, usually in favour of a specified replacement. Deprecated features can, unfortunately, linger on for many years. This term appears with distressing frequency in standards documents when the committees writing the documents realise that large amounts of extant (and presumably happily working) code depend on the feature(s) that have passed out of favour.

Share this post


Link to post
Share on other sites
BTW, your definitions are deficient.

27580[/snapback]

That's the trouble with the English language. Just as soon as you think you know something, someone else just goes ahead and defines the word differently and it takes on that new meaning. :blink:

Andrew

Edited by agsteele

Share this post


Link to post
Share on other sites
Well, since several key SC personages have stated that they intend to frequent the forums, and not the newsgroups, it seems clear that "the newsgroups are deprecated to the extent they are."

??? I can't recall that at all. Of course, I'm also not sure who the "key SC personages" might be <g> The 'paid staff' folks haven't made such a statement, the owners of either side of the house or IronPort staff rarely post in either place ...

Share this post


Link to post
Share on other sites

JT, owner of the SpamCop email service, and the servers that hold the web forum and newsgroups, decided that help would only happen on the web forum since it is primarily newcomers who need help. He seems to prefer to set up something that runs itself and doesn't particularly like it when he has to get involved. Though I thought that he used to actively answer email questions. I don't use the email service so I don't know what happens there.

Julian, creator of the parser and past owner of SpamCop website and parser, started the Mailhosts discussion here because JT told him that was where discussions should take place and since Julian has not frequented the ngs for a long time, he accepted that. IIRC, Julian and all the deputies have, at least, made allusions to preferring the newsgroup as a means of communication. Julian rarely posts anywhere. The deputies post, depending on their workload, in both places and AFAICT don't skimp one place or another. Ironport has only been involved once for a short time and made no preference known.

Newcomers who prefer newsgroups still go to the spamcop ng. Almost all the others come to the web forum initially because of the way the introductory web page is set up. People who are interested in helping frequent both groups (except for a couple who won't come to the web forum on principle). People who are interested in discussing ideas generally stay in the forum or ng - whichever they prefer. In general, the more technically fluent discussions are in the ngs, the non-technically fluent in the web forum. The web forum is not as 'colorful' as the ngs since most newcomers are not used to the frank way that ng posters reply (part of the reason that the web forum was suggested).

That's my impression of the way things are.

Miss Betsy

Share this post


Link to post
Share on other sites
Well, since several key SC personages have stated that they intend to frequent the forums, and not the newsgroups, it seems clear that "the newsgroups are deprecated to the extent they are."

I'm sorry you were not aware of it.

<snip>

27580[/snapback]

??? I can't recall that at all. Of course, I'm also not sure who the "key SC personages" might be <g> The 'paid staff' folks haven't made such a statement, the owners of either side of the house or IronPort staff rarely post in either place ...

27582[/snapback]

...See JT's post in thread "Avatars." It's old but I've seen nothing "official" to contradict it.

Share this post


Link to post
Share on other sites

I think it's good that Google is indexing the forum, and it should be expected. Unless you specifically deny honest search engines access, your posts, web pages, blogs, etc. will all be indexed. So it should be no surprise.

Also, people are more likely to find SpamCop.net because the forums are indexed and it also sends them directly to a relevant discussion. This is a good thing.

If you really didn't want the public to see, then there are security measures that prevent people from seeing it (i.e. making forums available to logged in users only, password protecting forums, etc.). By posting to a public forum, like this one, you are making your thoughts public and they could be quoted in the New York Times tomorrow if they happened to want to quote you. Google spider or no Google spider.

Dishonest spiders ignore robots.txt anyway, so additional security is required to prevent the public from seeing your thoughts. Besides, it would be people who use or misuse what you said, not the spider itself.

I think blocking Google's robot won't stop anyone from printing out a copy and sending it to your mom, so to speak, and anyone posting on a public forum should realize what they say could be seen by anyone in the world (which is actually the power of a forum in the first place). The reason why Forums were originally called Bulletin Board Systems (BBSs) is because you, in essence, were posting something on an electronic wall for everyone to see.

I don't remember what the agreement said when I signed up for this forum, but I know on forums I run, the standard agreement specifically states that anything you post is considered public information. This forum software may have similar wording to that effect, not sure.

Share this post


Link to post
Share on other sites

I actually appreciate having it googled. Makes it much easier to search through the 27000+ posts. I personally can't stand the search-results interface in the forum.

Share this post


Link to post
Share on other sites
??? I can't recall that at all.  Of course, I'm also not sure who the "key SC personages" might be <g>  The 'paid staff' folks haven't made such a statement, the owners of either side of the house or IronPort staff rarely post in either place ...

27582[/snapback]

...See JT's post in thread "Avatars."  It's old but I've seen nothing "official" to contradict it.

JT = one guy (albeit having certain special powers <g>)

since [several key SC personages] - [JT] = several key SC personages

The "several key SC personages" is still undefined <g>

Share this post


Link to post
Share on other sites

I've repeated JT's predictions of doom for those newsgroups, am I one of those "several key SC personages"? :)

Share this post


Link to post
Share on other sites

Only if you can show me where you said

since several key SC personages have stated that they intend to frequent the forums, and not the newsgroups,

Last I recall, you did say you monitored the spamcop.mail group (and after seeing how little traffic there was I added it and thus can state that I've seen you posting there <g>)

Share this post


Link to post
Share on other sites
