
Incorrect parsing of Chinese URLs


caltenba


I've been getting regular (about one a day) Chinese spam on one of my e-mail addresses. The messages always contain links that show up in clear text when I hover over them, but SpamCop's parser turns them into gibberish and so always fails to resolve them. It seems the parser does not handle the character encoding correctly.

What the browser sees (shown in the lower left in Chrome) when hovering over the link in the e-mail:

"shorts.url.co.il/b2656b"

What the spamcop parser sees instead:

...

Finding links in message body

Parsing HTML part

Resolving link obfuscation

http://shï½ï½’t.urls.cï½ï¼Žï½‰ï½Œ/b2656b

No recent reports, no history available

shï½ï½’t.urls.cï½ï¼Žï½‰ï½Œ is not a routeable IP address

...
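
The gibberish above is consistent with a hostname written in fullwidth characters (for example U+FF4F for "o" and U+FF0E for "."), whose UTF-8 bytes were then decoded as a single-byte encoding such as Windows-1252. That is only a guess at what happens inside the SpamCop parser, not a description of its code. The Python sketch below reproduces the garbled string under that assumption and shows how NFKC normalization folds the fullwidth form back to plain ASCII, roughly as a browser does when it resolves the link. `FULLWIDTH_HOST`, `misdecode_as_cp1252` and `normalize_host` are names made up for this illustration.

```python
import unicodedata

# Assumed reconstruction of the hostname as the spam may have written it,
# using fullwidth forms: U+FF4F (o), U+FF52 (r), U+FF0E (.), U+FF49 (i), U+FF4C (l).
FULLWIDTH_HOST = "sh\uff4f\uff52t.urls.c\uff4f\uff0e\uff49\uff4c"


def misdecode_as_cp1252(text: str) -> str:
    """Encode as UTF-8, then decode the bytes as Windows-1252.

    Byte 0x8F has no Windows-1252 mapping, so it is dropped (errors="ignore").
    """
    return text.encode("utf-8").decode("cp1252", errors="ignore")


def normalize_host(host: str) -> str:
    """Fold compatibility characters (fullwidth letters and dots) to ASCII."""
    return unicodedata.normalize("NFKC", host).lower()


if __name__ == "__main__":
    print(misdecode_as_cp1252(FULLWIDTH_HOST))  # the garbled form
    print(normalize_host(FULLWIDTH_HOST))       # the readable form
```

Under these assumptions the normalized result reads short.urls.co.il, which differs slightly from the hostname quoted from the browser above, so one of the two was probably retyped by hand.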

What is the best way to submit real examples so this can be tested and corrected?

Thanks.


Hi, caltenba; welcome to the SC Forum!

My recommendation is not to worry too much about SC's parsing of such spamvertized links; this is not a particularly important part of what SpamCop does. You could report them instead via Knujon (that's what I do, because Knujon has an e-mail address that is available even to us non-subscribers) or Complainterator, which you can read more about by searching the SC Forum.

If you are really committed to asking for a review of the SC parser, I would recommend that you send this information (or a pointer to this Forum thread) to the SC Deputies at deputies[at]admin.spamcop.net. They may or may not reply, and may or may not do anything with the information you send them, so please keep your expectations low. :)


Archived

This topic is now archived and is closed to further replies.
