
unicode sequence knocks out spamcop processing


Wizel603


It seems some characters, when in the header of a spam, knock out SpamCop processing. Notice the "<U+FEFF>" at the beginning of the second Received header below, which I gather is a Unicode byte order mark.

When submitted to spamcop, processing stops with "error:No IP found".

From dcm[at]dcmp.qc.ca Mon Feb 10 16:44:11 2014
Return-path: <dcm[at]dcmp.qc.ca>
Envelope-to: x[at]x.x
Delivery-date: Mon, 10 Feb 2014 16:44:11 -0500
Received: from [200.69.242.1] (helo=customer-static-242-1.iplannetworks.net)
		by x.x.x with esmtp (Exim 4.80)
		(envelope-from <dcm[at]dcmp.qc.ca>)
		id 1WCyeB-0005dn-26
		for x[at]x.x; Mon, 10 Feb 2014 16:44:11 -0500
<U+FEFF>Received: from 10.1.0.95 ([10.1.0.95])
Message-ID: <B1F645ABD2F84FF8AFBAB6C12AB5A296[at]dediasjgf>
From: "Adrian Garrison" <dcm[at]dcmp.qc.ca>
To: "Applicant" <x[at]x.x>
Subject: Vacancy assignment for you
Date: Mon, 10 Feb 2014 21:44:03 +0200 (EEST)
MIME-Version: 1.0
Content-Type: text/plain;
Content-Length:		626


...No expert I, but it appears to me that someone or some process placed the offending string of characters into the internet header, in violation of accepted standards. The SpamCop parser is reportedly very, very picky about compliance with standards so that it can be confident that it is truly receiving data that accurately represents the spam. As I understand it, the general response for these types of things is to go back to the party responsible for the bad header and ask her/him/them to fix her/his/their e-mail internet header handling. But you may wish to wait for more expert advice here.


Oh I understand that spamcop's parser is very strict for good reason. Though I'm not sure what response I would get if I were to ask the spammer to fix their spambot so that it didn't insert stray unicode when handing off the mail to my mailserver.


>- <U+FEFF>Received: from 10.1.0.95 ([10.1.0.95])

That line is irrelevant because it features an "internal only" IP address. SpamCop will ignore it anyway.
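For anyone wondering what "internal only" means here: 10.x.x.x is private address space that is never routed on the public internet. The sketch below is only an illustration using Python's standard `ipaddress` module, not a description of how SpamCop itself decides; the helper name `is_internal` is made up for this example.

```python
import ipaddress


def is_internal(ip_text: str) -> bool:
    """Return True if the address lies in private (non-routable) space."""
    return ipaddress.ip_address(ip_text).is_private


# The two addresses from the Received headers quoted in this thread.
for ip in ("10.1.0.95", "200.69.242.1"):
    print(ip, "internal" if is_internal(ip) else "public")
```

Under this check only 200.69.242.1 would be a candidate source address for a report.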

>- Received: from [200.69.242.1] (helo=customer-static-242-1.iplannetworks.net) by x.x.x

The parse should be finding 200.69.242.1, but the server getting the email from outside is using an invalid name (x.x.x)

- Don D'Minion - SpamCop Admin -

- Service[at]Admin.SpamCop.net -


>- Received: from [200.69.242.1] (helo=customer-static-242-1.iplannetworks.net) by x.x.x

The parse should be finding 200.69.242.1, but the server getting the email from outside is using an invalid name (x.x.x)

Here's the unmunged header. I should have mentioned that I had removed my mailserver hostname and email address from the paste. The submitted report I sent to spamcop was completely unmunged. Should I try another submission of the same spam email and see if it was a one-time hiccup?

From dcm[at]dcmp.qc.ca Mon Feb 10 16:44:11 2014
Return-path: <dcm[at]dcmp.qc.ca>
Envelope-to: robh[at]rut.org
Delivery-date: Mon, 10 Feb 2014 16:44:11 -0500
Received: from [200.69.242.1] (helo=customer-static-242-1.iplannetworks.net)
		by linear.rut.org with esmtp (Exim 4.80)
		(envelope-from <dcm[at]dcmp.qc.ca>)
		id 1WCyeB-0005dn-26
		for robh[at]rut.org; Mon, 10 Feb 2014 16:44:11 -0500
<U+FEFF>Received: from 10.1.0.95 ([10.1.0.95])
Message-ID: <B1F645ABD2F84FF8AFBAB6C12AB5A296[at]dediasjgf>
From: "Adrian Garrison" <dcm[at]dcmp.qc.ca>
To: "Applicant" <robh[at]rut.org>
Subject: Vacancy assignment for you
Date: Mon, 10 Feb 2014 21:44:03 +0200 (EEST)
MIME-Version: 1.0
Content-Type: text/plain;
Content-Length:		626


I had exactly the same thing happen to me on the same day, with some bad characters in the same place as reported here, and because I'm a SC email customer and I used the "quick report" option from within webmail, the result was a spam report going to my own hosting provider--I'm not a happy camper. I'll send you a tracking URL, Don--the damage is already done, in that now my own server's IP has a "History" of spam reports!

DT


<snip>

I used the "quick report" option ..., the result was a spam report going to my own hosting provider--I'm not a happy camper.

<snip>

...Here's another opportunity for me to strongly recommend that everyone who uses or is considering using Quick Reporting:

please read carefully the SpamCop FAQ article labeled "What is Quick Reporting?" and note especially the paragraph labeled "WARNING!"


...Here's another opportunity for me to strongly recommend that everyone who uses or is considering using Quick Reporting to:

please read carefully the SpamCop FAQ article labeled "What is Quick Reporting?" and note especially the paragraph labeled "WARNING!"

Duh! I hope you know that I'm fully aware of the potential danger, but I've kept my "mailhosts config" up to date, and that should prevent EXACTLY WHAT HAPPENED. So no need to send ME to the FAQ, although others are welcome to read and reread. I've been here forever and am #8 of the top-ten posters (several of whom are dead, RIP), so I'm very familiar with the procedures.

DT


Hi, DT,

...Sorry, I should have explicitly mentioned that my recommendation was not directed to you (I thought we knew each other well enough that my saying that explicitly wasn't necessary :) <g>)! I recognized immediately upon reading your post that your problem wasn't caused by failure to heed my warning.


Still seeing spams arriving with the unicode byte-order-mark "<U+FEFF>".

From the page http://en.wikipedia.org/wiki/Zero-width_non-breaking_space :

Character U+FEFF is intended for use as a Byte Order Mark at the start of a file. However, if encountered elsewhere it should, according to Unicode, be treated as a "zero-width non-breaking space".

So it would seem to me that the SpamCop parser is not following correct Unicode protocol, assuming it is even Unicode-aware. Regardless, it seems to bail and not even retain the IP from the earlier header if it encounters a line starting with those bytes. My first guess is that the parser isn't seeing the line break as valid because the newline isn't followed by a 7-bit ASCII printable or whitespace character, rendering the entire first Received header unparsable.
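To make that guess concrete, here is a minimal Python sketch of a hypothetical strict line classifier. I don't know how the SpamCop parser is actually written, so the rule, `FIELD_START`, `is_field_start` and `is_continuation` are all assumptions for illustration. It only shows how a line beginning with U+FEFF can fail to look like either a new header field or a folded continuation of the previous one.

```python
import re

BOM = "\ufeff"  # U+FEFF, the byte order mark / zero-width no-break space

# Hypothetical strict rule: a header field name is printable ASCII
# (excluding ':'), starts in column 0, and is followed by a colon.
FIELD_START = re.compile(r"^[!-9;-~]+:")


def is_field_start(line: str) -> bool:
    """True if the line begins a new header field under the strict rule."""
    return FIELD_START.match(line) is not None


def is_continuation(line: str) -> bool:
    """True if the line is a folded continuation (starts with space or tab)."""
    return line[:1] in (" ", "\t")


clean = "Received: from 10.1.0.95 ([10.1.0.95])"
tainted = BOM + clean

print(BOM.encode("utf-8").hex(" "))  # the UTF-8 bytes of U+FEFF
print("clean:  ", is_field_start(clean), is_continuation(clean))
print("tainted:", is_field_start(tainted), is_continuation(tainted))
```

Under these assumed rules the tainted line is neither a field start nor a continuation, which is the kind of situation where a picky parser might give up on the whole header block.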

Pasting non-printable Unicode into SpamCop's web form is not an option, so again I've included a sample spam which is unmunged except for "<U+FEFF>", which represents the byte sequence 0xEF 0xBB 0xBF (the UTF-8 encoding of U+FEFF).

From lowe[at]vcave.com Tue Feb 18 16:37:23 2014
Return-path: <lowe[at]vcave.com>
Envelope-to: robh[at]rut.org
Delivery-date: Tue, 18 Feb 2014 16:37:23 -0500
Received: from [85.105.34.61] (helo=85.105.34.61.static.ttnet.com.tr)
		by linear.rut.org with esmtp (Exim 4.80)
		(envelope-from <lowe[at]vcave.com>)
		id 1WFsLz-0000wq-Ik
		for robh[at]rut.org; Tue, 18 Feb 2014 16:37:23 -0500
<U+FEFF>Received: from 10.0.1.58 ([10.0.1.58])
Message-ID: <0B62ADACD76146A1B05C610778E1D478[at]terras-sensnd>
From: "Victor Blare" <lowe[at]vcave.com>
To: "Christian Ralphs" <robh[at]rut.org>
Subject: Vacancy - apply online
Date: Tue, 18-Feb-2014 21:37:30 GMT
MIME-Version: 1.0
Content-Type: text/plain;
		format=flowed;
		charset="iso-8859-2";
		reply-type=original
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2900.5931
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6109
Content-Length:		566

Dear Sir!
You are welcome our hiring process. If you are taking a career break, are on a maternity leave, recently retired or simply looking for some part-time job, this position is for you. Occupation: Flexible schedule 2 to 8 hours per day. We can guarantee a minimum 20 hrs/week occupation. Salary: Starting salary is $2000 per month plus commission, paid every month. Business hours: part time.
No startup fees or deposits to start working for us. To get an application form please register www.danna.in.ua/
Faithfully,
Consolite Technology Company
Victor Blare


<snip>

From the page http://en.wikipedia.org/wiki/Zero-width_non-breaking_space :

<snip>

So it would seem to me that the spamcop parser is not following correct unicode protocol, assuming it is even unicode aware.

<snip>

...Hm, I don't know for a fact that Wikipedia is authoritative or that anyone with any knowledge has ever claimed that the SpamCop parser follows "unicode protocol" (not that Wizel603 is claiming either). Can anyone cite such references?

...Again, I'm sure you know this, DT, but more for the benefit of others: Don doesn't always follow up here but unless he's forgotten I'm sure he would contact you when he had anything useful to tell you. You could check back with him via e-mail just to be sure.


Actually, Don followed up in a related thread in the Mailhost Configuration sub-forum, as follows:

The problem is caused by the odd characters in that "Received" line. I don't know why.

If you remove them, the spam will process properly.
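For those who submit by hand, the removal could in principle be scripted before pasting. The following is a rough Python sketch, not anything provided by SpamCop. It assumes the message has already been decoded to text with "\n" line endings, and `strip_header_boms` is a made-up helper name. It strips U+FEFF only from the start of header lines and leaves the body untouched.

```python
BOM = "\ufeff"  # U+FEFF, shown as "<U+FEFF>" in the pasted headers


def strip_header_boms(raw: str) -> str:
    """Remove leading U+FEFF from each header line; keep the body as-is."""
    # Headers end at the first blank line; everything after it is the body.
    head, sep, body = raw.partition("\n\n")
    cleaned = "\n".join(line.lstrip(BOM) for line in head.split("\n"))
    return cleaned + sep + body


# Shortened version of the sample posted in this thread.
sample = (
    "Received: from [85.105.34.61] (helo=85.105.34.61.static.ttnet.com.tr)\n"
    "\ufeffReceived: from 10.0.1.58 ([10.0.1.58])\n"
    "Subject: Vacancy - apply online\n"
    "\n"
    "Dear Sir!\n"
)

print(strip_header_boms(sample))
```

Note that this alters the headers as received, so it is only a stopgap for manual submissions.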

Unfortunately, that doesn't really solve the problem for those of us who pay for CESMail SpamCop email accounts and prefer to use the "Report as spam" links in the webmail system. That's a perk that we're paying for, and it's now dangerously broken.

DT


CESmail support would like examples of the spam messages with "goofy characters" causing SpamCop to parse them incorrectly.

They asked that we create a separate webmail folder, put the offending spam messages in it, then email support with the SpamCop account name (I suppose that means the email address) and the folder name and explain what the problem is, and they will look at it.

Please send an email directly to SpamCop Support <support-cases at spamcop.net> and use "Re: (Case 55538) [Problem Report] SpamCop parser affecting Quick Reporting" as the subject line.

Thanks!


Cisco/SpamCop sent me a reply saying that they are aware of the problem and working on it. They would like to figure out how the characters are getting into the headers in the first place and prevent that, but their engineers are trying to get the parser to handle them anyway.

CESmail (SpamCop webmail) indicated that they will be looking at any examples we send to be sure the characters are not coming from the CES servers.


Archived

This topic is now archived and is closed to further replies.
