QUOTE(Miss Betsy @ Apr 10 2009, 03:57 PM)

You might be interested in reading this topic which discusses, to some extent, why spamcop does not fix these bugs.
blog spot spam
Basically, spamcop is the wrong tool for spamvertised websites.
I disagree. SpamCop clearly is used a lot for reporting spamvertised websites and does the task pretty well - I've even received such reports myself when a site we host is used as an authority for a news story (effectively a "joe job" rather than the spammer's intended landing page). Also there is a URIBL based on reports of spamvertised websites reported to SC: SpamAssassin has a rule URIBL_SC_SURBL that hits about 20% of incoming spam with a nice low rate of false positives.
I think this inconsistency in parsing is just an omission. A low-priority one, admittedly, but fixing it would enable SC to behave more consistently from the perspective of both user and abuse desk.
QUOTE(Wazoo @ Apr 11 2009, 05:18 AM)

Tracking URL still needed for discussion on the actual parsing output. I'm not interested in trying to recreate your 'posted evidence' into a parsable form, as I still have no idea what the 'actual/real' spam looked like .. and totally not meaning to skip over the fact that no one really wants to 'read' your spam here .... most have enough of their own to handle in one way or another.

Er... what? A big advantage of reporting via SpamCop is that it anonymises the report. If I don't want the ISP's abuse desk to see the recipient address, I'm not going to post it (or the tracking URL, which might still include the receiving server name, although as you can see the report was months old) to a public webpage. I don't want "help" in any case - I wanted to report a bug. And, besides, I don't think I can find the original Tracking URL any more.

It was you who I noticed said it lacked context, so I supplied the context. (Cut and pasted into SC it will be parsable, so long as you indent the second line of headers appropriately. It's text/plain and there are no other MIME parts, if that is really relevant.) I cannot understand at all why you say you don't have any idea what it looked like. There it is in post number 3 with only the recipient address and receiving server omitted.
BTW as I do abuse work (even at the weekend), I don't just have my own spam to handle, but that of thousands of users.
QUOTE(rconner @ Apr 11 2009, 07:16 AM)

The perp is giving the IP address as a "dotted hex quad" rather than the more conventional decimal form.
Here is a place where you can break down this URL into a more usual form.
Yes, the dotted hex quad is what SpamCop doesn't parse correctly. The deobfuscator link is indeed a useful one for users, but I know my sixteen-times table anyway.
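For anyone who would rather not do the sixteen-times table in their head, here is a rough sketch of the conversion in Python. The function name dehex_quad is my own invention, and the assumption that a leading "0x" means hex and a bare leading "0" means octal follows the usual C-style convention rather than anything SpamCop documents. The address in the example is a reserved documentation address, not the one from the spam.

```python
def dehex_quad(host: str) -> str:
    """Rewrite a dotted quad with hex/octal/decimal parts as dotted decimal.

    Hypothetical helper for illustration only: assumes exactly four parts,
    "0x" prefix for hex and a leading "0" for octal (C-style convention).
    """
    parts = host.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated parts, got {len(parts)}")

    octets = []
    for part in parts:
        lowered = part.lower()
        if lowered.startswith("0x"):
            value = int(lowered, 16)   # hex part, e.g. 0xC0
        elif len(part) > 1 and part.startswith("0"):
            value = int(part, 8)       # octal part, e.g. 0300
        else:
            value = int(part, 10)      # ordinary decimal part
        if not 0 <= value <= 255:
            raise ValueError(f"part out of range for an octet: {part!r}")
        octets.append(str(value))
    return ".".join(octets)


if __name__ == "__main__":
    print(dehex_quad("0xC0.0x00.0x02.0x01"))
```

This is only for reading the address so you know whom to report to manually; it is not meant for rewriting the spam before submitting it to SpamCop.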

QUOTE
You mention converting this to a decimal quad by hand. I would be very careful about doing this if I were you, because it runs right smack into the strict rule that we are not supposed to alter the spam we submit to get SpamCop to find things that it would not otherwise find. People get kicked off SpamCop for failing to follow this rule. On the other hand, there's no problem reporting this outside SpamCop if you wish to.
To clarify, I wasn't suggesting altering the headers submitted to SpamCop, and did indeed (I think I recall) report it outside SpamCop manually.
QUOTE
It seems to me that browsers used to support hex quads (and octal quads and other forms as well), but I cannot get my browsers (on the Mac) to recognize this link. Neither whois nor nslookup nor curl command lines will support it either. It's as though everyone has decided to stop permitting these things. Maybe they still work in IE, I'm not sure. Certainly the only plausible use for such a trick is deception, and perhaps that is why browsers now "boycott" it. If they do, then this guy isn't going to get much business.
Firefox 2 and Opera 9 both recognise it. So does wget (which I use in place of curl). IE6 does not, and IE users are usually the more naive users. I agree that the spammer is therefore artificially restricting the number of people who they might defraud. Actually I can't find any references in RFCs to hex quads, but I'd imagine that with moves towards IPv6, it's a newer trend rather than one being phased out. Certainly RFC 1630 (1994, formalising URIs) doesn't seem to be conscious of hex quads - I'll test it on Amaya.
QUOTE
Wazoo is correct that you need to post this sort of thing under a tracking URL (see here for how to do this). In your post, not only do we see the whole spam (and help the spammer promote it via search-engine crawling), you've also left the links "live" so that anyone can click on them (I munged the link in the quote above to stop the board software from "linkifying" them).
On edit, I note that the "fixed" version of this link will not load (timeout), so there may be no justification for reporting it.
I'm seeing a 404 page in Chinese (Big5), not getting a timeout. Maybe your upstream provider is firewalling it? I don't think there's any risk in posting the link, since the page was presumably taken down months ago and phishing only works in context (if it weren't a 404 it would now be more likely to come up in searches for "spam" than for "IxRxS"). I was reporting the bug in the hope that it's fixed in future - the spam was only an example.
QUOTE(StevenUnderwood @ Apr 11 2009, 04:26 PM)

There is development going on, but usually not in the area of finding links in the body of the spam messages. It has been this way for quite a while and, IMO, is unlikely to change at any time in the near future.
Thanks for the info. It would be nice if there were some sourceforge-style bugtracker to keep hold of minor issues like this, but I can understand that SC is not open source and there's no public access to this.