
FAQ Entry: The Link Analysis Process


Jeff G.

I tried a dozen things and am coming to the conclusion that the parser is only interested in plain text and skips anything it is told is HTML. If it is NOT told about it, it handles HTML without a problem. Well, apart from wasting time looking at the standard www.w3.org link in the opening DOCTYPE declaration.

Here is a parse that succeeds, with the headers and body modified to remove the content declaration and the boundary definition:

http://www.spamcop.net/sc?id=z5551238400z1...5c2f2e0934ae65z

(<!-- SpamCop::Web::Look $Revision: #17 $ produced by prod-sc-www1 -->)

It seems almost exactly as if the parser has been modified to skip declared HTML sections of the body. This may have been a "temporary" measure introduced several (or many) attempts ago in the lead-in to the system updates, to keep things simple and reduce the variables for troubleshooting. I say "seems" because only the engineers could say for sure, I suppose.
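For anyone wanting to repeat the test, here is a rough Python sketch of the kind of edit described above: dropping the Content-Type declaration from the headers and removing the boundary lines from the body, so that nothing in the message announces the HTML. The exact edits made to the submitted message are not shown in this post, so the function name strip_mime_declarations and the details of what gets removed are assumptions for illustration only, not a description of what the SpamCop parser does.

```python
import re


def strip_mime_declarations(raw: str) -> str:
    """Hypothetical helper: remove Content-Type headers and boundary lines.

    Assumes LF line endings and a single level of MIME nesting.
    """
    headers, sep, body = raw.partition("\n\n")

    # Unfold continuation lines so a folded Content-Type header is one line.
    unfolded = re.sub(r"\n[ \t]+", " ", headers)

    boundary = None
    kept_headers = []
    for line in unfolded.split("\n"):
        if line.lower().startswith("content-type:"):
            match = re.search(r'boundary="?([^";]+)"?', line, re.IGNORECASE)
            if match:
                boundary = match.group(1)
            continue  # drop the content declaration itself
        kept_headers.append(line)

    kept_body = []
    for line in body.split("\n"):
        stripped = line.strip()
        # Drop the opening and closing boundary delimiter lines.
        if boundary and stripped in ("--" + boundary, "--" + boundary + "--"):
            continue
        # Drop per-part declarations too (crude: matches any such body line).
        if stripped.lower().startswith("content-type:"):
            continue
        kept_body.append(line)

    return "\n".join(kept_headers) + sep + "\n".join(kept_body)


if __name__ == "__main__":
    sample = (
        "From: a@example.com\n"
        "Subject: test\n"
        "Content-Type: multipart/alternative;\n"
        ' boundary="XYZ"\n'
        "\n"
        "--XYZ\n"
        "Content-Type: text/html\n"
        "\n"
        '<html><a href="http://example.com/">x</a></html>\n'
        "--XYZ--\n"
    )
    print(strip_mime_declarations(sample))
```

The output keeps the HTML markup and its link intact but carries no declaration that the body is HTML, which is the condition under which the parse above succeeded. Other per-part headers such as Content-Transfer-Encoding are left alone in this sketch.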
