
Difficulty parsing geocities.com


mshalperin


http://www.spamcop.net/sc?id=z846331382z4e...0d260be8a38ce2z

Every time I get a spam with geocities.com as a spamvertized website, I have to reload the page many times before the parsing "takes". This is usually at least 10 times, and I've gone as far as 27 (in an obsessive moment) before it processed (Yahoo.com). OTOH, the single-line parser gets it on the first try each time I've tried. I'd rather it appeared as a parsed spamvertized site than a user notification. Why would the primary parser have such a difficult time with this frequently seen site?


This has come up so many times that I'm a bit at a loss as to which previous Topic to merge this one into. Have you read any of the previous queries on this "parsing issue"? Have you checked the FAQ entries here dealing with these "parsing issues"?

The one-line lookup and the full-spam parser don't use the same code and don't have the same constraints applied (the most obvious being the time-out associated with doing a lookup).
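To illustrate how a time-out alone can make the same lookup succeed in one tool and fail in another, here is a minimal Python sketch. It is not SpamCop's code: `slow_lookup`, `lookup_with_timeout`, the simulated delay and the time-out values are all assumptions made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def slow_lookup(host, delay):
    """Hypothetical stand-in for a DNS/whois lookup taking `delay` seconds."""
    time.sleep(delay)
    return f"{host} resolved"


def lookup_with_timeout(host, delay, timeout):
    """Run the lookup, but give up once `timeout` seconds have passed."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(slow_lookup, host, delay)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        # The caller sees "not parsed" and has to try again (reload).
        return None
    finally:
        # Do not block on the still-running lookup.
        pool.shutdown(wait=False)


# Same simulated lookup, two different time budgets (values are made up).
generous = lookup_with_timeout("geocities.com", delay=0.05, timeout=1.0)
strict = lookup_with_timeout("geocities.com", delay=0.3, timeout=0.05)
print("generous budget:", generous)
print("strict budget:", strict)
```

Under these assumptions, a host whose lookups are slow or variable would fail far more often under the stricter budget, which would look like "reload until it takes" from the user's side.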

The specific answer has been, for quite a long time: the programmers are looking at it.

Per SECTION 5 - Login & Navigation, this Forum section defaults to showing only the last 30 days of traffic. Change that drop-down selector to see the older Topics.



I have searched this general issue and found several entries, and I recall other discussions as well. I am aware of the differences between the full-spam and single-line parsing; if you read my post carefully, that is self-evident. However, the issue I'm raising is the unique difficulty with a specific site, geocities.com, which is a frequent offender. There are many instances where the full-spam parser has to be reloaded to process a spamvertized site; usually this takes (only) 1 - 3 times. This site consistently takes nearly an order of magnitude more reloads to get through. None of the prior discussions I could find mentioned this site as presenting an exceptionally difficult time for the full-spam parser.

Specific answer has been for quite a long time - programmers are looking at it ....

With this specific site??? And yes, it's been quite a long time...



I've seen this same issue with geocities links in particular, and have had to reload many more than 27 times (just out of pure bloody-mindedness) to get it to work. It seems random, but that's just my perception.

Unless someone thinks that Yahoo is specifically targeting SpamCop DNS or whois or whatever queries (which I haven't heard mention of), then this seems more like some kind of bug in SpamCop's production parsing engine than the other issues that have been discussed in the FAQs.

I'm glad that the programmers are looking at it...this seems like an easily reproducible issue and thus should be interesting for a programmer to isolate and hopefully eliminate.



It certainly is random... It isn't something I've ever seen - personally. (Although clearly it does affect others).

But as Steve T notes further up the thread... The SCBL and reporting process is not focussed on reporting spamvertised URLs but on identifying mail servers being used to disseminate spam.

Andrew



For me it has not been random: geocities.com consistently chokes the parser a great deal more than any other domain (except those it can't resolve at all). I realize that spamvertized site reporting is a lower priority than the source IP addresses, but if SpamCop is bothering to do it at all, why not make it as efficient as possible? In this case, geocities resolves to a Yahoo admin address, to which SpamCop frequently reports automatically as an "intermediary".


  • 3 weeks later...
Well, it doesn't seem to be a server load issue... I'm trying to report a spam containing http://uk.geocities.com/iorgo15348ingmar84674/ here at 5:50am Eastern, and I'm at about 30 reloads with no luck yet.

39434[/snapback]

If you put uk.geocities.com in the single line parser you now get:

Parsing input: uk.geocities.com

Host uk.geocities.com (checking ip) = 66.218.77.68

host 66.218.77.68 = intl1.geo.vip.scd.yahoo.com (cached)

No recent reports, no history available

Routing details for 66.218.77.68

[refresh/show] Cached whois for 66.218.77.68 : network-abuse[at]cc.yahoo-inc.com

Using abuse net on network-abuse[at]cc.yahoo-inc.com

abuse net cc.yahoo-inc.com = abuse[at]yahoo.com

Using best contacts abuse[at]yahoo.com

abuse[at]yahoo.com redirects to network-abuse[at]cc.yahoo-inc

network-abuse[at]cc.yahoo-inc bounces (9 sent : 8 bounces)

Cannot find master for:uk.geocities.com

No valid email addresses found, sorry

Apparently, Yahoo no longer accepts spam reports!!
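The line that seems to explain the "No valid email addresses found" result is the bounce count: 8 of the 9 reports sent to that address bounced. As a rough sketch, the counts can be pulled out of such a line in Python. The line format is inferred only from the single output quoted above, the name `bounce_rate` is hypothetical, and the thread does not say what threshold SpamCop uses to drop an address, so none is assumed here.

```python
import re

# Pattern inferred from one line of parser output in this thread;
# other output formats may well exist.
BOUNCE_RE = re.compile(
    r"^(?P<addr>\S+) bounces \((?P<sent>\d+) sent : (?P<bounced>\d+) bounces\)$"
)


def bounce_rate(line):
    """Return (address, fraction bounced), or None if the line does not match."""
    match = BOUNCE_RE.match(line.strip())
    if match is None:
        return None
    sent = int(match["sent"])
    bounced = int(match["bounced"])
    rate = bounced / sent if sent else 0.0
    return match["addr"], rate


line = "network-abuse[at]cc.yahoo-inc bounces (9 sent : 8 bounces)"
result = bounce_rate(line)
if result is not None:
    addr, rate = result
    print(f"{addr}: {rate:.0%} of reports bounced")
```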


it was fine for a time

now - new tricks, unparseable

http://www.spamcop.net/sc?id=z862595742zef...858af118aab191z

39584[/snapback]

I refreshed this a "bunch" of times and got:

Resolving link obfuscation

  http://ca.geocities.com/ichabod41576onida33153/

  Host ca.geocities.com (checking ip) = 66.218.77.68

  host 66.218.77.68 = intl1.geo.vip.scd.yahoo.com (cached)

  http://ca.geocities.com/titos44145barbey47112/

Tracking link: http://ca.geocities.com/ichabod41576onida33153/

No recent reports, no history available

Resolves to 66.218.77.68

Routing details for 66.218.77.68

[refresh/show] Cached whois for 66.218.77.68 : network-abuse[at]cc.yahoo-inc.com

Using abuse net on network-abuse[at]cc.yahoo-inc.com

abuse net cc.yahoo-inc.com = abuse[at]yahoo.com

Using best contacts abuse[at]yahoo.com

abuse[at]yahoo.com redirects to network-abuse[at]cc.yahoo-inc.com


  • 1 month later...

I've noticed over the past week that the parser isn't catching the Geocities spammed links. I've refreshed as many as 30 times, yet it still doesn't pick it up.

Here's one: http://www.spamcop.net/sc?id=z893658988ze0...2efe1f6463b7a2z

As you can see:

Parsing input: http://ca.geocities.com/jonis99213helena98939
No recent reports, no history available
Routing details for 66.218.77.68
[refresh/show] Cached whois for 66.218.77.68 : network-abuse[at]cc.yahoo-inc.com
Using abuse net on network-abuse[at]cc.yahoo-inc.com
abuse net cc.yahoo-inc.com = abuse[at]yahoo.com
Using best contacts abuse[at]yahoo.com
abuse[at]yahoo.com redirects to network-abuse[at]cc.yahoo-inc.com

Reporting addresses:
network-abuse[at]cc.yahoo-inc.com 

I understand that reporting to Yahoo is an exercise in futility, but I'd still like to report to them and not have to do so manually every time.
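For anyone doing the manual step in the meantime, collecting the Geocities links from a spam body can be sketched in a few lines of Python. This is only an illustration and has nothing to do with how the SpamCop parser finds links; the function name `geocities_links` and the regular expression are assumptions, and the sample body simply reuses two URLs quoted earlier in this topic.

```python
import re

# Matches http(s) links on geocities.com or any subdomain (ca., uk., ...).
# Trailing punctuation directly after a link would be captured too.
GEOCITIES_RE = re.compile(
    r"https?://(?:[\w-]+\.)*geocities\.com/[^\s\"'<>]*",
    re.IGNORECASE,
)


def geocities_links(body):
    """Return the distinct Geocities URLs in `body`, in order of appearance."""
    links = []
    for url in GEOCITIES_RE.findall(body):
        if url not in links:
            links.append(url)
    return links


sample_body = """Click here:
http://ca.geocities.com/ichabod41576onida33153/
or here: http://ca.geocities.com/titos44145barbey47112/
"""

for link in geocities_links(sample_body):
    print(link)
```

Each link found this way could then be pasted into the single-line parser, which, per the reports above, tends to resolve it on the first try.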


  • 2 weeks later...

Anyone know why the issue persists?  I came across the parser missing Geocities sites yesterday and today.

41486[/snapback]

It is a "stable" version now, as far as I can remember. Some bugs still arise, and it looks like it is getting worse and worse: it did not work yesterday, and it does not work today. It is in more or less working condition about once per week. Does it depend on the weather, or something?

I know because, of the 30-something spams a day I report, 40-50% have geocities.com links. (And I think they are targeted that way, to go untraced by SpamCop: the tracking marks they hide in headers are becoming more sophisticated than before, while the links to geocities are getting simpler. They even put two of them in, and the second one is never parsed, never ever.)

The other 40% are RocketStock and, of course, the other University Diplomas and OEMsoft.

Of course, then there is that local philosophy....

For those who can not read joke, do not read.

