
Problems parsing links containing "< >" characters


mshalperin


Posted

http://www.spamcop.net/sc?id=z849162328z72...25446b8823044ez

This spam contains a spamvertised link: http://wattage<yugoslav>narcotic.p63721.net

The full spam parser saw it as http://wattage, which it obviously could not resolve. The actual domain, p63721.net, was easily resolved by the single line parser, and I entered it manually as postmaster<at>wildblue.net. It appears that "<yugoslav>" confused the parser. I recall reading something about this issue in the past but I couldn't find it with a brief search. If this has been reported and discussed before, my apologies.
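To illustrate the difference described above, here is a minimal Python sketch. It is not SpamCop's code; the function names are hypothetical, and dropping the "<...>" segment is only an assumption about what a lenient parser might do. The strict version stops at the first character that cannot appear in a hostname, while the lenient one removes the bracketed segment before extracting the host.

```python
import re
from urllib.parse import urlsplit

# Hypothetical helpers for illustration only; not SpamCop's parser.
ANGLE_SEGMENT = re.compile(r"<[^<>]*>")


def strict_host(url: str) -> str:
    """Read hostname characters and stop at the first illegal one."""
    match = re.match(r"https?://([A-Za-z0-9.-]+)", url)
    return match.group(1) if match else ""


def lenient_host(url: str) -> str:
    """Drop any <...> segments (an assumption), then take the hostname."""
    cleaned = ANGLE_SEGMENT.sub("", url)
    return urlsplit(cleaned).hostname or ""


def naive_domain(host: str) -> str:
    """Keep the last two labels; naive, ignores suffixes such as co.uk."""
    return ".".join(host.split(".")[-2:])


url = "http://wattage<yugoslav>narcotic.p63721.net"
print(strict_host(url))                 # what the full parser reported
print(naive_domain(lenient_host(url)))  # the domain that actually resolves
```

The strict result matches the "http://wattage" seen in the full parser output, while the lenient path ends up at p63721.net.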

Posted

Please note the headers of your spam sample.

Content-type: TEXT/PLAIN

This should have made the link unusable in any 'decent' e-mail application. The parser also followed the declared content type and therefore handled that link as "text", which left a non-functional string of characters rather than an actual URL.
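As a rough illustration of checking the declared type before deciding how to treat links, here is a sketch using Python's standard email package. The sample message and the function name body_is_html are made up for the example, and this is not how SpamCop is known to do it.

```python
from email import message_from_string

# Made-up sample with the same header as the spam discussed here.
RAW_PLAIN = (
    "Content-type: TEXT/PLAIN\n"
    "\n"
    "http://wattage<yugoslav>narcotic.p63721.net\n"
)


def body_is_html(raw: str) -> bool:
    """Return True if any part of the message declares text/html."""
    msg = message_from_string(raw)
    # get_content_type() returns the type in lowercase, so TEXT/PLAIN
    # and text/plain compare the same way.
    return any(part.get_content_type() == "text/html" for part in msg.walk())


print(body_is_html(RAW_PLAIN))
```

A parser that follows the header would treat the body above as plain text and never apply HTML link handling to it.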

Posted
But are the < and > characters legal within a host name per the relevant RFCs?

38706[/snapback]

No, but that doesn't stop the "single address (one line only)" Parser and it doesn't stop IE6, so why should it stop the "entire spam (headers, blank line, body)" Parser?
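For reference, the hostname rule being discussed (labels made of letters, digits and hyphens, as in RFC 952 and RFC 1123) can be sketched in Python as below. The function name is hypothetical and the check is simplified; it only shows why "<" and ">" fail validation.

```python
import re

# One label: 1-63 letters, digits or hyphens, no leading or trailing hyphen.
LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_hostname(host: str) -> bool:
    """Simplified RFC 952 / RFC 1123 style hostname check."""
    if host.endswith("."):
        host = host[:-1]  # allow a single trailing dot
    if not host or len(host) > 253:
        return False
    return all(LABEL.fullmatch(label) for label in host.split("."))


print(is_valid_hostname("wattage<yugoslav>narcotic.p63721.net"))
print(is_valid_hostname("p63721.net"))
```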
Posted

We already know the "single address (one line only)" Parser is a lot more lenient, basically not searching for strings but assuming the entire entry is a string and working with that. I'm not going to even comment on how IE handles something.

Basically, the same as always, it comes down to the parser finding the appropriate text in the message. Each exception adds to the complexity of the search. I'm sure the search parameters to locate RFC compliant links are complex enough without adding an exception for every trick that some spammer could come up with. If this trick becomes common, I am sure that the exception will be made. And once again, it seems to me that links are (and should be) a secondary (maybe tertiary) focus of the parsing function, behind alerting ISPs of spam email and adding to the BL.

Also, in your HTML parser test, you did not create the link as html <A HREF=http...>. Does that make any difference to the parser? I would expect it easier for the HTML parser to take the whole <A HREF=http...> string and grab the link part from there.

EDIT: On second thought, that would probably BREAK the link, because the search would look for the closing > and find it in the middle of the link, unless you recognized nested <> characters, which again makes the search very complex.
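The point in the EDIT can be shown with two regular expressions. This is only a sketch with made-up pattern and function names, not the parser's real logic: a pattern that reads up to the first ">" cuts the link short, while one that reads a quoted attribute value up to the closing quote keeps it whole. An unquoted HREF would still end at the first ">".

```python
import re

# Naive: take everything after href= up to the first '>'.
NAIVE_HREF = re.compile(r"<a\s+href=([^>]*)>", re.IGNORECASE)
# Quote-aware: take everything between the double quotes.
QUOTED_HREF = re.compile(r'<a\s+href="([^"]*)"', re.IGNORECASE)

HTML = '<a href="http://buckley<successor>fluff.o94182.com">click</a>'


def first_group(pattern: re.Pattern, text: str) -> str:
    """Return the first captured group, or an empty string if no match."""
    match = pattern.search(text)
    return match.group(1) if match else ""


print(first_group(NAIVE_HREF, HTML))   # truncated inside the link
print(first_group(QUOTED_HREF, HTML))  # full attribute value
```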

Posted

http://www.spamcop.net/sc?id=z849599669z11...fdcb8d66b82b9cz

I got 2 more of these today, all with the same actual URL, but with different "middle names". The last one used: http://buckley<successor>fluff.o94182.com which was resolved by the single line parser to 202.65.99.20. This would seem like a new evolving trick except that none of them would link directly from MS Outlook. I didn't try the first 2, but this one will link if pasted into IE6 or Firefox 1.0.6.

Posted

Traffic from the spamcop.help newsgroup:

From: RW <nobody[at]spamcop.net>
Newsgroups: spamcop.help
Subject: Re: spamcop parser error
Date: Sat, 31 Dec 2005 00:47:20 -0600
Message-ID: <dp59hk$cq$1[at]news.spamcop.net>

DougW wrote:
> Seems spammy is using a new url hiding tactic.
>
> Urls are in the form as below.
>
> http://usurious<kobayashi>coconut.p63721.net
> http://prank<augusta>inferno.o94182.com
>
> Spamcop chokes on the <
>

We've received several reports of this today. I've filed a bug report so hopefully this will get dealt with quickly

Richard

Posted

I'm using SCMail and haven't had any problems with the "<" symbols. All my reports parse properly. I may have to refresh a few times for the link to catch, but I'm certainly not seeing stuff like:

"Resolving link obfuscation

http://buckley

host buckley (getting name) no name

buckley is not a hostname

Tracking link: http://buckley

No recent reports, no history available

buckley is not a hostname

Cannot resolve http://buckley"

I would think this to be a mail client issue? Like when I try to forward mail from my Hotmail account to my SC reporting email address, instead of copying and pasting manually... I get several parsing errors.

Posted

I'm not sure I follow your point - the "example" you posted didn't even contain the characters "< >". This can't be a mail client issue. How could any mail client or copy/paste process add those characters to a spamvertised link within the body of the email? In my case they were submitted directly from my Spamcop email account (VER) and arrived at the parser intact. These characters were deliberately added by the spammer(s) to break the parsing of their site.

Posted
http://www.spamcop.net/sc?id=z851639084z95...1d75ea84cf878dz 

What I was trying to point out was that SCMail doesn't have a problem parsing the symbols, whereas you were not able to.  Have any of you even tried to visit those links?  The "<" signs don't matter and the links are pointing to sites.

38838[/snapback]

Major difference in the spam "header description" and construct between your sample and the original query in this Topic. The original was "plain text" whereas yours is HTML 'enabled'

Content-Type: multipart/related;
    boundary="------------36233587"

This is a multi-part message in MIME format.

--------------36233587
Content-Type: text/html
Content-Transfer-Encoding: 7bit

<a href="http://lXxfdjdMhR1sLDMO<7HxI3>YNrnuT6ceKHBI%2Egallerykam%2Ecom/av"><img alt=""
src="cid:2.01.29230262024504.63209367[at]cnnsi.com" border="0"></a><br>
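The href in this sample also hides its dots as %2E. As a hedged sketch (hypothetical function name, and dropping the "<...>" segment is again just an assumption about what a tolerant parser could do), percent-decoding before host extraction would look roughly like this in Python:

```python
import re
from urllib.parse import unquote, urlsplit

ANGLE_SEGMENT = re.compile(r"<[^<>]*>")


def deobfuscated_host(href: str) -> str:
    """Percent-decode the URL, drop <...> segments, return the hostname."""
    decoded = unquote(href)                    # %2E becomes '.'
    cleaned = ANGLE_SEGMENT.sub("", decoded)   # assumption: discard the segment
    # urlsplit lowercases the hostname it returns.
    return urlsplit(cleaned).hostname or ""


HREF = "http://lXxfdjdMhR1sLDMO<7HxI3>YNrnuT6ceKHBI%2Egallerykam%2Ecom/av"
print(deobfuscated_host(HREF))
```

The resulting host ends in gallerykam.com, which is the part a report would need to resolve.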

Posted
Yes, but whether or not your email client or web browser will actually take you there is an entirely different question.

38854[/snapback]

You're right! I should have said that :-)

Posted

Also, keep in mind that this was reported about a week ago and SpamCop representatives have stated it was being looked into. Parsing may have been fixed, as Jeff G tried it with HTML headers (though not in an <a href...> format).

Posted

No change seen in mshalperin's Topic starter Tracking URL, though again noting the Content-Type was plain text there, so nothing should have 'displayed' or 'used' that as a functional link.

Posted
Very recent examples of ineffective SC's url parsing

reportid=1612975575

reportid=1612975263

reportid=1612974842

38994[/snapback]

Report-ID listings don't do a thing for anyone here but you. There is a FAQ entry here dealing with converting a Report-ID back to a Tracking URL, which would then be something for "us" to look at.

Archived

This topic is now archived and is closed to further replies.
