[Resolved] Request: Dealing with BASE64 encoded HTML emails


WildGeek

While reviewing the report id 5955859004 I've noticed this error.

Finding links in message body
Parsing text part
error: couldn't parse head
Message body parser requires full, accurate copy of message
More information on this error..
no links found

I found the "no links found" part interesting, as I was able to see a fully rendered HTML message in my email client (Apple Mail v6.3 build 1503). How could this happen?

Then I went to check the message itself (cmd+option+u). The headers checked out, but the message body wasn't what I expected: plain old, and usually badly written, HTML. Instead it was a blob of text encoded in BASE64. And I thought to myself: these spammers are some clever bastards.

So here comes my feature request. When the program isn't able to find any links in the message body, instead of displaying an error message to the user, try to BASE64 decode the body and rerun the HTML parsing.
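To make the request concrete, here is a minimal sketch of that fallback in Python. It is not how the SpamCop parser works (its internals are not described in this thread); the names `LinkCollector`, `find_links` and `find_links_with_base64_fallback` are made up for illustration, and the body is assumed to be already separated from the headers.

```python
import base64
import binascii
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href/src attribute values that look like web links."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value and value.startswith(("http://", "https://")):
                self.links.append(value)


def find_links(html_text):
    """Return all http(s) links found in href/src attributes."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links


def find_links_with_base64_fallback(body):
    """Try the body as-is; if no links turn up, retry after Base64-decoding it."""
    links = find_links(body)
    if links:
        return links
    try:
        # validate=True rejects anything that is not pure Base64, so the
        # line breaks inside the encoded blob are stripped first.
        decoded = base64.b64decode(re.sub(r"\s+", "", body), validate=True)
    except (binascii.Error, ValueError):
        return []  # not Base64 either: genuinely no links
    return find_links(decoded.decode("utf-8", errors="replace"))
```

A MIME-aware parser would normally decode the part by reading its Content-Transfer-Encoding header, so this only covers the "try again before giving up" case described above. The UTF-8 decoding is also an assumption; a real message may declare a different charset.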

If you need more report ids just let me know. I get a lot of spam from the same source that developed this technique.


...If you need more report ids just let me know. I get a lot of spam from the same source that developed this technique.
Sadly, and overriding all else, parser development has been quite slow. That aside, WildGeek, a ReportID can only be accessed by you (for your own reports) and by SpamCop staff. What is needed for wider discussion (if desired) is the Tracking URL from those reports.

The parser does handle Base64 encoding, but evidently not in every case. Mail formats and misconfigured content types that most browsers and mail agents tolerate can defeat its strict handling, and spammers have stumbled across this many times over the years. Many such "tricks" are discussed, particularly in the SpamCop Reporting Help section, but in truth:

Nevertheless, it is interesting and informative to see what the little devils are up to right now and if you would like some further discussion, by all means let us see some of your reports. As implied, I don't think there is much to worry about in "tipping off" some spamming mastermind (surely an oxymoron but we shouldn't be too dismissive, I suppose) - but that is your call.


OK, thanks! Taking just that top one ...

No complications there with muddled MIME boundaries (often the cause of parser "misreading", I think). The encoding renders perfectly well into HTML with loads of links to http://tr.parceiros.exexgv.com.br/ which the parser should find. The key, for that one at least, is in the message the parser throws up when it starts parsing the body (possibly processed concurrently with the headers, but it appears in the output as if sequential):

Finding links in message body
Parsing text part
error: couldn't parse head
Message body parser requires full, accurate copy of message
More information on this error..
no links found

If you follow that link you will find discussion of "incorrectly wrapped long header lines", which is what has happened here. The blurb says this happens post-receipt (that is, in your system/handling), which may be so, but dodgy mass mailers and (maybe) other processes and relays on the sender side certainly seem to be implicated sometimes.

Compare what happens when I copy your spam and "correct" the garbled headers (which we're not allowed to do when reporting for real, an integrity thing - though if the garbling actually happened post-receipt, who would know? You would need to be sure though; fiddling with the evidence is a huge no-no):

http://www.spamcop.net/sc?id=z5508537460zc...d216f00a0144a0z

Can you see the difference in those long header lines? In mine, the second and subsequent lines are indented (they begin with whitespace), which tells a compliant mail agent they are continuations of the line above. Most mail agents are not so strict these days - they go through extraordinary contortions (always "improving") to deliver the "full internet experience", which the parser cannot hope to catch and match. The parser chokes if it doesn't find those continuations: it can't be sure whether it is reading a new header line or has stumbled into the body (the concurrent processing factor). Bottom line - "my" (cancelled) report has found the links just fine.
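For anyone who wants to check their own copies before submitting, the continuation rule can be sketched as a small Python check. This is only a rough heuristic under my own assumptions, not the SpamCop parser's logic, and `find_unfolded_lines` is a hypothetical name. It flags header lines that neither start a new field ("Name:") nor begin with whitespace. It only detects the problem; as said above, altering headers in a real report is not allowed.

```python
import re

# A header field starts with a name made of printable ASCII characters
# (anything from "!" to "~" except the colon), followed by a colon.
FIELD_START = re.compile(r"^[!-9;-~]+:")


def find_unfolded_lines(raw_message):
    """Return (line_number, text) pairs for header lines that are neither
    a new field nor an indented continuation of the previous one."""
    suspects = []
    for number, line in enumerate(raw_message.splitlines(), start=1):
        if line == "":
            break  # the first blank line ends the header block
        if line[0] in " \t":
            continue  # properly folded continuation line
        if FIELD_START.match(line):
            continue  # start of a new header field
        suspects.append((number, line))
    return suspects
```

Any line this returns is one where a strict parser can no longer tell header from body. A wrapped fragment that happens to begin with something like "word:" would slip through, so treat an empty result as a hint rather than proof.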

Based on that slender sample, I would say this topic needs to be marked "Resolved" and moved to the SpamCop Reporting Help part of the forum where it may best help other members looking for answers. If you are agreeable, I (or someone else) will do so.

Incidentally, if Brazilian spam is a problem, member petzl seems to be achieving some results by adding mail-abuse[at]cert.br as a "User_Notification" reporting address for those - note http://forum.spamcop.net/forums/index.php?...ost&p=84579 (user notification not available in free reporting accounts though similar should be available for all in the reporting account "Preferences" tab, "Report Handling Options" link, "Public standard report recipients" item - I'm a little vague since I don't use that).

Oh, there are liabilities in opening HTML spam by the way - that example is loaded with image links which can allow the remote owner/operator to gather a surprising amount of information about you. Deal with spam using the plain text source wherever possible. You don't want "them" profiling you, if that's what they might take a notion to do. And/or it might uptick some "affiliate" revenue stream - you don't want to do that either, let 'em starve!


Oh, there are liabilities in opening HTML spam by the way - that example is loaded with image links which can allow the remote owner/operator to gather a surprising amount of information about you. Deal with spam using the plain text source wherever possible. You don't want "them" profiling you, if that's what they might take a notion to do. And/or it might uptick some "affiliate" revenue stream - you don't want to do that either, let 'em starve!

I don't open HTML messages marked as spam. Apple Mail is quite smart and does the right thing here. It shows you the content that has been embedded in the message body itself without loading any images. However, if the user clicks on a link or on the load images button, the benefits of this are gone.

Anyway, thanks for this insightful input. I guess I have to tweak my reporting script to correctly reformat headers from this source.

In that regard, I too consider this issue resolved.

Have a nice weekend!


Archived

This topic is now archived and is closed to further replies.
