Problems parsing links containing "< >" characters

mshalperin · December 30, 2005

http://www.spamcop.net/sc?id=z849162328z72...25446b8823044ez

This spam contains a spamvertized link: http://wattage<yugoslav>narcotic.p63721.net

The full spam parser saw it as http://wattage, which it obviously could not resolve. The actual domain, p63721.net, was easily resolved by the single line parser, and I entered it manually as postmaster<at>wildblue.net. It appears that "<yugoslav>" confused the parser. I recall reading something about this issue in the past but I couldn't find it with a brief search. If this has been reported and discussed before, my apologies.

Wazoo · December 30, 2005

> This spam contains a spamvertized link: http://wattage<yugoslav>narcotic.p63721.net
> The full spam parser saw it as http://wattage, which it obviously could not resolve. The actual domain, p63721.net, was easily resolved by the single line parser, and I entered it manually as postmaster<at>wildblue.net. It appears that "<yugoslav>" confused the parser. I recall reading something about this issue in the past but I couldn't find it with a brief search. If this has been reported and discussed before, my apologies.

Please note the headers of your spam sample.

Content-type: TEXT/PLAIN

This should have made the link unusable in any 'decent' e-mail application. The parser also followed that declaration and therefore handled the link as "text", which left a non-functional string of characters rather than an actual URL.

Jeff G. · December 30, 2005

As a test, I replaced "TEXT/PLAIN" with "TEXT/HTML", producing the same results in http://www.spamcop.net/sc?id=z849472746ze3...d7ad770e2f0df6z.

StevenUnderwood · December 30, 2005

> As a test, I replaced "TEXT/PLAIN" with "TEXT/HTML", producing the same results in http://www.spamcop.net/sc?id=z849472746ze3...d7ad770e2f0df6z.

But are the < and > characters legal in a host name per the relevant RFCs?

Jeff G. · December 30, 2005

> But are the < and > characters legal in a host name per the relevant RFCs?

No, but that doesn't stop the "single address (one line only)" Parser and it doesn't stop IE6, so why should it stop the "entire spam (headers, blank line, body)" Parser?
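For reference, the hostname rule in question can be sketched as a short check. This is only an illustration, assuming the usual letters-digits-hyphen rule from RFC 952 and RFC 1123; `is_valid_hostname` is a made-up helper name and has nothing to do with how either SpamCop parser is actually written.

```python
import re

# One DNS label: letters, digits, and hyphens, 1-63 characters,
# not starting or ending with a hyphen (RFC 952 / RFC 1123 rule).
LABEL_RE = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def is_valid_hostname(host: str) -> bool:
    """Return True if every dot-separated label follows the rule above."""
    if not host or len(host) > 253:
        return False
    return all(LABEL_RE.match(label) for label in host.split("."))


print(is_valid_hostname("narcotic.p63721.net"))                   # True
print(is_valid_hostname("wattage<yugoslav>narcotic.p63721.net"))  # False
```

So a strictly RFC-minded scanner is entitled to reject the spammer's string, even though more forgiving software may still end up at the real site.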

StevenUnderwood · December 30, 2005

> No, but that doesn't stop the "single address (one line only)" Parser and it doesn't stop IE6, so why should it stop the "entire spam (headers, blank line, body)" Parser?

We already know the "single address (one line only)" Parser is a lot more lenient, basically not searching for strings but assuming the entire entry is a string and working with that. I'm not going to even comment on how IE handles something.
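The difference between the two approaches can be sketched roughly as follows. This is a guess at the general idea, not SpamCop's code: `strict_extract` scans body text and stops at the first character it does not accept in a URL, while `lenient_domain` assumes it was handed exactly one URL and simply keeps the last two labels of the host. Both function names and the sample body are invented for the illustration, and the "last two labels" shortcut would be wrong for domains such as example.co.uk.

```python
import re

BODY = "Visit http://wattage<yugoslav>narcotic.p63721.net today"

# Strict scan: "<" and ">" are not in the accepted set, so the match
# ends right before "<yugoslav>".
STRICT_URL_RE = re.compile(r"http://[A-Za-z0-9.\-/%?=&_~:]+")


def strict_extract(text: str) -> str:
    """Return the first URL found by scanning free text, or ''."""
    match = STRICT_URL_RE.search(text)
    return match.group(0) if match else ""


def lenient_domain(entry: str) -> str:
    """Treat the whole entry as one URL and keep the last two host labels."""
    without_scheme = entry.split("://", 1)[-1]
    host = without_scheme.split("/", 1)[0]
    return ".".join(host.split(".")[-2:])


print(strict_extract(BODY))  # http://wattage
print(lenient_domain("http://wattage<yugoslav>narcotic.p63721.net"))  # p63721.net
```

The strict result matches what the full parser reported, and the lenient result matches the domain the single line parser resolved.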

Basically, the same as always, it comes down to the parser finding the appropriate text in the message. Each exception adds to the complexity of the search. I'm sure the search parameters to locate RFC-compliant links are bad enough without adding an exception for every trick that some spammer could come up with. If this trick becomes common, I am sure that the exception will be made. And once again, it seems to me that links are (and should be) a secondary, maybe tertiary, focus of the parsing function, behind alerting ISPs to spam email and adding to the BL.

Also, in your HTML parser test, you did not create the link as html <A HREF=http...>. Does that make any difference to the parser? I would expect it to be easier for the HTML parser to take the whole <A HREF=http...> string and grab the link part from there.

EDIT: On second thought, that would probably BREAK the link because the search would look for the closing > and find it in the middle of the link, unless you recognized nested <> characters, again getting very complex in the search.
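The concern about the closing > can be shown with a small sketch. It assumes the href value is quoted; with an unquoted value the first ">" really is ambiguous. The regular expression and the `HrefCollector` class are made up for the illustration, and Python's standard `html.parser` stands in for "a parser that understands quoted attributes", not for whatever SpamCop uses.

```python
import re
from html.parser import HTMLParser

TAG = '<a href="http://buckley<successor>fluff.o94182.com">click</a>'

# Naive approach: everything up to the first ">" is taken as the tag.
NAIVE_HREF_RE = re.compile(r"<a\s+href=([^>]*)>", re.IGNORECASE)


class HrefCollector(HTMLParser):
    """Collect href values from <a> start tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


naive = NAIVE_HREF_RE.search(TAG).group(1)
collector = HrefCollector()
collector.feed(TAG)
collector.close()

print(naive)            # "http://buckley<successor
print(collector.hrefs)  # ['http://buckley<successor>fluff.o94182.com']
```

The naive search is cut off in the middle of the link, while a quote-aware tokenizer keeps the whole attribute value, which still has to be cleaned up before it can be resolved.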

mshalperin · December 31, 2005

> I'm not going to even comment on how IE handles something.
> Each exception adds to the complexity of the search. I'm sure the search parameters to locate RFC-compliant links are bad enough without adding an exception for every trick that some spammer could come up with. If this trick becomes common, I am sure that the exception will be made. And once again, it seems to me that links are (and should be) a secondary, maybe tertiary, focus of the parsing function, behind alerting ISPs to spam email and adding to the BL.

http://www.spamcop.net/sc?id=z849599669z11...fdcb8d66b82b9cz

I got 2 more of these today, all with the same actual URL, but with different "middle names". The last one used: http://buckley<successor>fluff.o94182.com which was resolved by the single line parser to 202.65.99.20. This would seem like a new evolving trick except that none of them would link directly from MS Outlook. I didn't try the first 2, but this one will link if pasted into IE6 or Firefox 1.0.6.

Wazoo · December 31, 2005

Traffic from the spamcop.help newsgroup:

From: RW <nobody[at]spamcop.net>
Newsgroups: spamcop.help
Subject: Re: spamcop parser error
Date: Sat, 31 Dec 2005 00:47:20 -0600
Message-ID: <dp59hk$cq$1[at]news.spamcop.net>

DougW wrote:
> Seems spammy is using a new url hiding tactic.
>
> Urls are in the form as below.
>
> http://usurious<kobayashi>coconut.p63721.net
> http://prank<augusta>inferno.o94182.com
>
> Spamcop chokes on the <
>

We've received several reports of this today. I've filed a bug report
so hopefully this will get dealt with quickly

Richard

btech · January 4, 2006

I'm using SCMail and haven't had any problems with the "<" symbols. All my reports parse properly. I may have to refresh a few times for the link to catch, but I'm certainly not seeing stuff like:

"Resolving link obfuscation
http://buckley
host buckley (getting name) no name
buckley is not a hostname
Tracking link: http://buckley
No recent reports, no history available
buckley is not a hostname
Cannot resolve http://buckley"

I would think this to be a mail client issue? Like when I try to forward mail from my Hotmail account to my SC reporting email address, instead of copying and pasting manually... I get several parsing errors.

mshalperin · January 4, 2006

> I'm using SCMail and haven't had any problems with the "<" symbols. All my reports parse properly. I may have to refresh a few times for the link to catch,
> I would think this to be a mail client issue? Like when I try to forward mail from my Hotmail account to my SC reporting email address, instead of copying and pasting manually... I get several parsing errors.

I'm not sure I follow your point - the "example" you posted didn't even contain the characters "< >". This can't be a mail client issue. How could any mail client or copy/paste process add those characters to a spamvertised link within the body of the email? In my case they were submitted directly from my Spamcop email account (VER) and arrived at the parser intact. These characters were deliberately added by the spammer(s) to break the parsing of their site.

btech · January 4, 2006

http://www.spamcop.net/sc?id=z851639084z95...1d75ea84cf878dz

What I was trying to point out was that SCMail doesn't have a problem parsing the symbols, whereas you were not able to. Have any of you even tried to visit those links? The "<" signs don't matter and the links are pointing to sites.

This is the link that was parsed: http://lxxfdjdmhr1sldmo<7hxi3>ynrnut...llerykam.com/av

Wazoo · January 4, 2006

> http://www.spamcop.net/sc?id=z851639084z95...1d75ea84cf878dz
> What I was trying to point out was that SCMail doesn't have a problem parsing the symbols, whereas you were not able to. Have any of you even tried to visit those links? The "<" signs don't matter and the links are pointing to sites.

There is a major difference in the spam "header description" and construct between your sample and the original query in this Topic. The original was "plain text" whereas yours is HTML 'enabled':

Content-Type: multipart/related;
    boundary="------------36233587"

This is a multi-part message in MIME format.

--------------36233587
Content-Type: text/html
Content-Transfer-Encoding: 7bit

<a href="http://lXxfdjdMhR1sLDMO<7HxI3>YNrnuT6ceKHBI%2Egallerykam%2Ecom/av"><img alt=""
src="cid:2.01.29230262024504.63209367[at]cnnsi.com" border="0"></a><br>
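As a rough illustration of why the declared type matters, the sketch below reads the Content-Type of each body part with Python's standard `email` package. The message text, the `XYZ` boundary, and the `leaf_content_types` helper are all invented for the example; how SpamCop itself reads these headers is not known from this Topic.

```python
from email import message_from_string

# A cut-down, made-up message shaped like the sample above.
RAW = (
    "Content-Type: multipart/related;\n"
    ' boundary="XYZ"\n'
    "\n"
    "This is a multi-part message in MIME format.\n"
    "--XYZ\n"
    "Content-Type: TEXT/HTML\n"
    "Content-Transfer-Encoding: 7bit\n"
    "\n"
    '<a href="http://example.invalid/">x</a>\n'
    "--XYZ--\n"
)


def leaf_content_types(raw: str) -> list:
    """Return the lower-cased content type of every non-container part."""
    msg = message_from_string(raw)
    return [p.get_content_type() for p in msg.walk() if not p.is_multipart()]


print(leaf_content_types(RAW))  # ['text/html']
print(leaf_content_types("Content-type: TEXT/PLAIN\n\nhttp://example.invalid/\n"))  # ['text/plain']
```

A link finder could then choose HTML-aware handling only for parts reported as text/html and treat everything else as plain text, which is the distinction being drawn between the two samples.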

btech · January 4, 2006

I'm not following.. what does that mean?

If you copy and paste the link, even the one taken from the plain text, it still points to a live website: http://lXxfdjdMhR1sLDMO<7HxI3>YNrnuT...erykam%2Ecom/av

Merlyn · January 4, 2006

> I'm not following.. what does that mean?
> If you copy and paste the link, even the one taken from the plain text, it still points to a live website: http://lXxfdjdMhR1sLDMO<7HxI3>YNrnuT...erykam%2Ecom/av

That link takes you to

http://lxxfdjdmhr1sldmo<7hxi3>ynrnut...filiateid=11147

Jeff G. · January 4, 2006

> If you copy and paste the link, even the one taken from the plain text, it still points to a live website: http://lXxfdjdMhR1sLDMO<7HxI3>YNrnuT...erykam%2Ecom/av

Yes, but whether or not your email client or web browser will actually take you there is an entirely different question.

Merlyn · January 4, 2006

> Yes, but whether or not your email client or web browser will actually take you there is an entirely different question.

You're right! I should have said that :-)

StevenUnderwood · January 5, 2006

> You're right! I should have said that :-)

Also, keep in mind that this was reported about a week ago and SpamCop representatives have stated it was being looked into. Parsing may have been fixed since Jeff G tried it with html headers (though not in an <a href...> format).

Wazoo · January 5, 2006

> Also, keep in mind that this was reported about a week ago and SpamCop representatives have stated it was being looked into. Parsing may have been fixed since Jeff G tried it with html headers (though not in an <a href...> format).

No change seen in mshalperin's Topic starter Tracking URL, though again noting that the Content-Type was plain text there, so nothing should have 'displayed' or 'used' that as a functional link.

moulty · January 7, 2006

I get up to fifty delivery attempts of these per day per address, most or all pretty clearly sent by Alex Polyakov. "Mortgage" (phish) emails, counterfeit drug sites (for hire), and stock pumps - his stock in trade.

Polyakov:

http://www.spamhaus.org/rokso/evidence.las...okso_id=ROK5931

http://hillscapital.com/antispam

Very recent examples of SC's ineffective URL parsing:

reportid=1612975575
reportid=1612975263
reportid=1612974842

Wazoo · January 7, 2006

> Very recent examples of SC's ineffective URL parsing:
> reportid=1612975575
> reportid=1612975263
> reportid=1612974842

Report-ID listings don't do a thing for anyone here but you. There is a FAQ entry here dealing with converting a Report-ID back to a Tracking URL, which would then be something for "us" to look at.
