Chris_Spam_Reporter

Unicode domain names are breaking the parser

Hi,

I’d like to raise an issue with the parser. I’m getting a lot of spam with unicode domain names. The Apple mail app is able to decode them, but I can't get the raw data into Spamcop in a way that it can understand.

Here is the raw data I'm getting for a domain name:

e2 92 b8 e2 93 87 e2 93 84 e2 93 88 e2 93 88 e2 93 81 e2 92 ba e2 93 89 e2 92 be e2 93 83 e2 93 88 e2 93 8a e2 93 87 e2 92 b6 e2 93 83 e2 92 b8 e2 92 ba 2e e2 93 83 e2 92 ba e2 93 89

These are almost all 3 byte unicode characters. You can see the mappings here:

http://unicode-search.net/unicode-namesear...Fterm%3Dcircled

I'll try and paste in the chars below:

ⒸⓇⓄⓈⓈⓁⒺⓉⒾⓃⓈⓊⓇⒶⓃⒸⒺ.ⓃⒺⓉ

You will be unsurprised to hear that Apple mail understands this as:

crossletinsurance.net
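
For anyone who wants to check the mapping without a mail client, here is a minimal Python sketch. It assumes the hex dump above is plain UTF-8, and that Unicode NFKC compatibility normalization followed by lowercasing is a fair stand-in for whatever Apple Mail does internally (that part is a guess, not confirmed). The names RAW_HEX and decode_domain are made up for illustration.

```python
import unicodedata

# Hex dump copied from the post above (UTF-8 bytes of the circled-letter domain).
RAW_HEX = (
    "e2 92 b8 e2 93 87 e2 93 84 e2 93 88 e2 93 88 e2 93 81 e2 92 ba "
    "e2 93 89 e2 92 be e2 93 83 e2 93 88 e2 93 8a e2 93 87 e2 92 b6 "
    "e2 93 83 e2 92 b8 e2 92 ba 2e e2 93 83 e2 92 ba e2 93 89"
)


def decode_domain(hex_dump: str) -> str:
    """Decode a spaced hex dump as UTF-8, then fold compatibility characters."""
    text = bytes.fromhex(hex_dump).decode("utf-8")
    # NFKC maps each circled letter to its plain Latin counterpart.
    return unicodedata.normalize("NFKC", text).lower()


if __name__ == "__main__":
    print(decode_domain(RAW_HEX))
```

The result is an ordinary ASCII hostname that can be pasted into the parser on its own.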

Further, pasting text out of the Apple Mail ‘View Raw Source’ doesn’t give quite the same data, so that adds another layer of difficulty.

I can provide more samples if needed.

Chris


How did the parser handle the domain name?

A TRACKING URL would be nice (necessary) so that others can see an example of the problem you are reporting and provide assistance.


...Note, though, that this likely falls within the parser limitations discussed in the SpamCop FAQ article "SpamCop reporting of spamvertized sites - some philosophy," so I would not recommend spending much time on it.


This is a reporting issue more than routing/address. And an interesting one, though as Steve T says, not in the mainstream of the SC mission. Moving the topic herewith - hopefully other reporters might have some constructive comments, but most of us would need that Tracking URL to contribute much. Seems to me it is probably in the same category as "picture spam" and Base64 encoding - previous topics and FAQ entries on those may be relevant.


..."<A href="http://DiªmºnDªssºrtºr.cOm/ICv3HrI.jlc?127537175=1318427&949523=110&19=1&6nl9vv=1" >?" It is my guess that to know that the "href" part of the <A> tag is Unicode and not plain text is probably beyond the capabilities of the SpamCop parser or any likely enhanced version of the parser.
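
As an illustration of what such a check could look like (a sketch only, not how the SpamCop parser works), the following Python pulls href hosts out of an HTML body with the standard-library HTMLParser and folds compatibility characters such as ª and º with NFKC. HrefCollector and normalized_hosts are hypothetical names.

```python
import unicodedata
from html.parser import HTMLParser
from urllib.parse import urlsplit


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def normalized_hosts(html_body: str) -> list:
    """Return the host of each link, lowercased and NFKC-folded."""
    collector = HrefCollector()
    collector.feed(html_body)
    hosts = []
    for href in collector.hrefs:
        host = urlsplit(href).hostname or ""
        hosts.append(unicodedata.normalize("NFKC", host).lower())
    return hosts


if __name__ == "__main__":
    sample = '<A href="http://DiªmºnDªssºrtºr.cOm/ICv3HrI.jlc" >?</A>'
    print(normalized_hosts(sample))
```

The folded host is plain ASCII, so it could be handed to an ordinary resolver afterwards.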


No character set specified in the headers, which may or may not affect things (Content-Type: text/html; charset=UTF-8 expected for unicode). I suppose there might be some configuration setting which might force Apple Mail "View Raw Source" into the same character set display as the mail client or vice-versa (seems Google finds quite a bit of commentary on changing the Apple Mail default character set) but that wouldn't solve the problem by itself.
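
A rough sketch of the fallback idea in Python, using the standard email package: if the message declares no charset, assume UTF-8 and decode anyway. The UTF-8 default and the function name decode_body are assumptions for illustration; a real message may be multipart or use some other encoding, which this does not handle.

```python
from email import message_from_bytes


def decode_body(raw: bytes, fallback: str = "utf-8") -> str:
    """Decode a single-part message body, guessing UTF-8 if no charset is declared."""
    msg = message_from_bytes(raw)
    charset = msg.get_content_charset() or fallback
    payload = msg.get_payload(decode=True) or b""
    try:
        return payload.decode(charset, errors="replace")
    except LookupError:
        # Unknown charset label in the header: fall back to the guess.
        return payload.decode(fallback, errors="replace")


if __name__ == "__main__":
    raw = ("Content-Type: text/html\r\n\r\n"
           '<a href="http://нкла.рф/">x</a>').encode("utf-8")
    print(decode_body(raw))
```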

Since you can decode you could try pasting the decoded URL or a partial into the parser submission box (by itself) as a separate step to obtain the report routing, and if it seems worthwhile adding that as a "User Notification" report (if you have a paid account).

Parsing input: http://crossletinsurance.net/

No recent reports, no history available

Host crossletinsurance.net (checking ip) = 209.190.38.218

Routing details for 209.190.38.218

[refresh/show] Cached whois for 209.190.38.218 : abuse[at]ee.net

Using abuse net on abuse[at]ee.net

abuse net ee.net = abuse[at]twtelecom.net, abuse[at]ee.net

Using best contacts abuse[at]twtelecom.net abuse[at]ee.net

Reports disabled for abuse[at]ee.net

Using abuse#ee.net[at]devnull.spamcop.net for statistical tracking.

So, a report could be sent to abuse[at]twtelecom.net, with appropriate notes.

Or, the spam body could be modified to include the decoded address (some things are permissible, if annotated, but generally we are NOT permitted to "help" the parser find something it cannot do by itself, not in the headers anyway but this is the body). Anyway, a modification like this works (non-mailhosted example):

http://www.spamcop.net/sc?id=z5901122797zc...15ae3132de2034z

You would need to ask the SC Admin if that, specifically, is permitted (I'm not sure):

Don D'Minion, SpamCop Administrator

service [at] admin.spamcop.net

Which begs the question, is this targeting? Time was when 90% or more of internet users would not see that spam URL resolved/linked but I suppose that proportion is changing with the take-up of iPads, iPhones, etc.

<snip>

adding that as a "User Notification" report (if you have a paid account).

<snip>

...Or to Preferences | "Report Handling Options" | "Public standard report recipients" if you do not have a paid account.


i am experiencing a similar problem with parsing нкла.рф - http://www.spamcop.net/mcgi?action=gettrac...rtid=6164555900 - which appears in the parser as ýúûð.рф

i have found that нкла.рф -- or xn--80atdg.xn--p1ai -- is hosted at 91.200.12.17 but the spamcop parser won't even attempt it.
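
A small Python sketch of the conversion, using the built-in idna codec (IDNA 2003 rules). to_ascii and resolve are illustrative names; resolve needs a live DNS lookup, and the address it returns may well differ from the one above by now.

```python
import socket


def to_ascii(domain: str) -> str:
    """Convert a Unicode domain to its ASCII (Punycode, xn--) form."""
    return domain.encode("idna").decode("ascii")


def resolve(domain: str) -> str:
    """Resolve a Unicode domain to an IPv4 address (requires network access)."""
    return socket.gethostbyname(to_ascii(domain))


if __name__ == "__main__":
    print(to_ascii("нкла.рф"))
```

The ASCII form is what DNS actually sees, so it is the safer thing to paste into tools that choke on Cyrillic.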

is there some work around for this problem, or should i continue to LART these manually?


another unicode domain that breaks the parser:

йттът.аоск.рф

parser response:

Parsing input: йттът.аоск.рф

Decimal ampersand decode: 9BBJB.0>A:.[at]D

No recent reports, no history available

1092://;/ is not a routeable IP address

Cannot resolve 1092://;/
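
The garbage in the "Decimal ampersand decode" line looks consistent with the domain arriving as decimal HTML entities (&#1081; and so on) and each code point then being cut down to its low 8 bits. That is only a guess at the parser's behaviour, but this Python sketch reproduces the same string (the forum displays "@" as "[at]"), and 1092 happens to be the decimal code point of ф. low_byte_decode is a hypothetical model, not SpamCop code.

```python
import html

DOMAIN = "йттът.аоск.рф"


def as_decimal_entities(text: str) -> str:
    """Write every non-ASCII character as a decimal HTML entity."""
    return "".join(f"&#{ord(ch)};" if ord(ch) > 127 else ch for ch in text)


def low_byte_decode(text: str) -> str:
    """Hypothetical failure model: keep only the low 8 bits of each code point."""
    return "".join(chr(ord(ch) & 0xFF) for ch in text)


if __name__ == "__main__":
    entities = as_decimal_entities(DOMAIN)
    print(entities)                   # what the spam body may have contained
    print(html.unescape(entities))    # a full decode recovers the Cyrillic name
    print(low_byte_decode(DOMAIN))    # the truncated version
    print(ord("ф"))
```

If the guess is right, the fix on the parser side would be to keep the full code point and convert the name to its xn-- form before resolving.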

if i paste йттът.аоск.рф into the address bar of my browser, it immediately goes to "Dr. MaxMan - Max Penis Enlarger Pills!" but i can't report it, because 1092://;/ is not a routable address... 8/

what do i do now?


Can you get an IP?

If so, how?

Once you have it, you can add it to your report:

http://йттът.аоск.рф/en/

"Resolves to IP"


Just noting DomainDossier (http://centralops.net/co/DomainDossier.aspx) is one which will resolve these (with "network whois record" checked):

йттът.аоск.рф = 81.181.8.179 (Netbyte Telecom)

...

route: 81.181.8.0/24

descr: Netbyte Telecom

origin: AS39900

descr: +------------------------------------

descr: | Abuse reports: abuse[at]netbyte.ro |

...

- providing the information to add a user-defined reporting address or manual LART.


Thanks, none of my sources would resolve it.


I was going to make my own post about this but fortunately I see I'm not the only one to have noticed this.

My address issue is similar to salamander's:

hxxp://okra.моуе.рф/?r=Click+here+to+proceed

Result:

Finding links in message body

Parsing HTML part

Resolving link obfuscation

http:/ /okra.üþуõ.рф

Tracking link: http:/ /okra./üþуõ.рф

No recent reports, no history available

okra. is not a hostname

okra. is not a routeable IP address

Cannot resolve http:/ /okra./üþуõ.рф

Edited by SteveT to break links to avoid accidental undesired navigation.


WIN 8.1 64

Using FireFox ver 31 (latest)

Can't connect either

Dangerous to click links

Have had this hijack my browsers!

https://dl.dropboxusercontent.com/u/50667687/MAL04.jpg

Called "ransom ware" splash page I traced back to botnet in Ukraine

Hard to remove

Your browsers become hijacked and that is all you see

Edited by SteveT to break links in quoted post to avoid accidental undesired navigation.

Edited by turetzsr


I use the wrong OS to worry too much about the links, but I did click one in iOS 6 or mb it was OSX 10.7 (i think) and it was just a blank page. I'm sure from what petzl is saying that it was probably trying to do something evil tho. ;)

Edited by emanmb


Yes, that splash page hijacked my computer screen. I managed to get a screen image of it (it took two images, as it was way bigger than my screen), just from checking spam links.

I have now put in this FF Add-On

https://adblockplus.org/en/firefox

seems to work?


We all know this already but possibly worth stating for the record.

There's no safe way to follow a click trail. I use Sandboxie when I "have to" try my hand - which runs browser and any files created in a virtual environment and can (sort of) "evaporate" the whole lot at the end of the session. Even that isn't safe. Consider you are potentially up against third generation hackers, even if they comply with the "loner" stereotype they build off each other's work. That's a huge resource built by people who love their "work". Incredibly efficient use of resources. Depending on (free) publicly-available utilities - therefore able to be deconstructed and analysed at leisure - is always going to be a risk. Adblock/Adblockplus apparently does a great job - but caution would qualify that - ALMOST all the time.

None of this happens in isolation. There are other vectors dedicated to compromising home PCs, and these can (at least) create vulnerabilities for other opportunistic exploits. For instance, I have seen a report that tinyurl was (briefly, apparently) suborned at the initial interface level with a realistic dialog pop-up trying to lure users to install an update to an "Outdated Java Plug-in". Which wasn't (but how many would fall for it?). Maybe Adblockplus would stop it, maybe not; plenty of AVs and other real-time protections would probably prevent download. Probably - no-one's defences are perfect. Apple OS device users aren't bullet-proof either, though they currently have less to face. All those iPads out there (and cloud file hosting) have to be a tempting target.

So, we always consider that we are taking a sizeable risk whenever the urge to explore an untrusted link strikes us, don't we?

☠(U+2620)

Edited by Farelf


here's another one:

sdo.рёъънбхважврб.рф

which the parser renders:

-----

Parsing input: sdo.рёъънбхважврб.рф

Decimal ampersand decode: sdo.[at]QJJ=1E2062[at]1.[at]D

No recent reports, no history available

1092://;/ is not a routeable IP address

Cannot resolve 1092://;/

-----

in reality, its IP address is 91.200.12.17, and it has the following whois information according to DomainDossier:

-----
% Information related to '91.200.12.0 - 91.200.15.255'

% Abuse contact for '91.200.12.0 - 91.200.15.255' is 'noc[at]lugalink.net'

inetnum:        91.200.12.0 - 91.200.15.255
netname:        GLUBINA-NET
descr:          PP SKS-Lugan
org:            ORG-PS152-RIPE
remarks:
remarks:        **********************************Attention***************************************
remarks:        The pool is used other company !
remarks:        In case of questions related to spam, HACKING, SECURITY
remarks:        Please contact directly abus[at]ealchevsk.org
remarks:        tel: +38 (044) 228-14-42; +38 (050) 472-06-34; +38 (067) 921-89-42
remarks:        ***********************************************************************************
remarks:
country:        UA
admin-c:        NASA-RIPE
tech-c:         DVC31-RIPE
status:         ASSIGNED PI
mnt-by:         RIPE-NCC-END-MNT
mnt-lower:      RIPE-NCC-END-MNT
mnt-by:         GLUBINA-MNT
mnt-routes:     GLUBINA-MNT
mnt-domains:    GLUBINA-MNT
changed:        hostmaster[at]ripe.net 20070921
changed:        nsa[at]alchevsk.net 20130726
changed:        hostmaster[at]ripe.net 20100319
changed:        hostmaster[at]ripe.net 20130729
changed:        hostmaster[at]ripe.net 20131001
source:         RIPE

organisation:   ORG-PS152-RIPE
org-name:       PP SKS-LUGAN
org-type:       LIR
address:        PP SKS-LUGAN
address:        Lenina 42/6
address:        94207
address:        Alchevsk
address:        UKRAINE
phone:          +380506492511
fax-no:         +380644250006
abuse-c:        AR17440-RIPE
admin-c:        TAU-RIPE
mnt-ref:        LUGAN-MNT
mnt-ref:        RIPE-NCC-HM-MNT
mnt-by:         RIPE-NCC-HM-MNT
source:         RIPE
e-mail:         lir[at]lugalink.net
changed:        bitbucket[at]ripe.net 20140305

person:         Dmitrij Chaban
address:        Ukraine
phone:          +38 044 2281442
e-mail:         jecky[at]vhoster.net
nic-hdl:        DVC31-RIPE
mnt-by:         VHOSTER-MNT
changed:        jecky[at]vhoster.net 20130723
source:         RIPE

person:         Novohatsky Sergey
address:        Ukraine
mnt-by:         NASA-MNT
phone:          +380 6442 50006
e-mail:         nsa[at]alchevsk.net
notify:         nsa[at]alchevsk.net
nic-hdl:        NASA-RIPE
changed:        nsa[at]alchevsk.net 20131009
source:         RIPE

% Information related to '91.200.12.0/22AS35804'

route:          91.200.12.0/22
descr:          PP "SKS-Lugan"
origin:         AS35804
mnt-by:         GLUBINA-MNT
changed:        patrocl[at]gmail.com 20071126
changed:        tugik[at]alchevsk.net 20130924
source:         RIPE

% This query was served by the RIPE Database Query Service version 1.75 (DB-1)
-----

abuse[at]alchevsk.net is denied when i add it to spamcop's LART. i have not tried a manual LART yet.


[AKA sdo.xn--80abbcbx9bxag9b0ca3g.xn--p1ai (DomainDossier)]
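
Going the other way, the ASCII (Punycode) form can be turned back into Cyrillic with Python's built-in idna codec, which is handy for checking that two spellings really are the same host. A minimal sketch; to_unicode and same_host are illustrative names.

```python
def to_unicode(ascii_name: str) -> str:
    """Decode an xn-- (Punycode) domain back to its Unicode form."""
    return ascii_name.encode("ascii").decode("idna")


def same_host(unicode_name: str, ascii_name: str) -> bool:
    """True if the Unicode and xn-- spellings refer to the same domain."""
    return unicode_name.encode("idna") == ascii_name.lower().encode("ascii")


if __name__ == "__main__":
    ace = "sdo.xn--80abbcbx9bxag9b0ca3g.xn--p1ai"
    print(to_unicode(ace))
    print(same_host("sdo.рёъънбхважврб.рф", ace))
```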

Using refresh:

abuse[at]alchevsk.org bounces (6 sent : 6 bounces)

Email Checker Connection Process

Resolving host name "vhoster.net"...

Connecting to host address "91.200.14.1"...

[Whois Lookup - IP Lookup]

Connected.

Email Verifier Process

S 220 billing.vhoster.net ESMTP Exim 4.80 Wed, 20 Aug 2014 04:53:10 +0300

C HELO ipaddresslocation.org

S 250 billing.vhoster.net Hello dedicated195.tchmachines.com [208.76.87.18]

C MAIL FROM: <info[at]ipaddresslocation.org>

S 250 OK

C RCPT TO: <abuse[at]alchevsk.org>

S 250 Accepted

Verified address, not sending email.

The Result of Email Lookup Process

This host states that the address is valid.

Disconnected.

abuse[at]alchevsk.org is

a valid deliverable e-mail box address.

They don't like SC

C:\Documents and Settings\Admin>nslookup -type=txt sdo.xn--80abbcbx9bxag9b0ca3g.xn--p1ai 8.8.8.8

Server: google-public-dns-a.google.com

Address: 8.8.8.8

xn--80abbcbx9bxag9b0ca3g.xn--p1ai

primary name server = ns1.xn--80abbcbx9bxag9b0ca3g.xn--p1ai

responsible mail addr = admin.xn--80abbcbx9bxag9b0ca3g.xn--p1ai

serial = 2009000000

refresh = 600 (10 mins)

retry = 900 (15 mins)

expire = 1209600 (14 days)

default TTL = 43200 (12 hours)

Nothing useable ...


http://www.ipaddresslocation.org/email_lookup/check-email.php can't verify aaarrrh[at]pirate.computer and i know it's a valid address, 'cause it's mine... <_<

...

Unsure of your point there. Well, that'll larn ya for using a new gTLD. No doubt ipaddresslocation.org should be keeping a closer eye on those roll-outs or bypassing whatever lame parsing they currently use. Name servers have no problem with .computer so neither should their service. Even (!) SpamCop has no problems with it. Except abuse[at]racksrv.com bounces (99 sent : 99 bounces)!! and now diverts to devnull.

Fact remains abuse[at]alchevsk.org is apparently a valid, deliverable address - but note lack of validation is inconclusive (not a factor with that one), validation does not ensure delivery, delivery does not ensure a message is read and reading doesn't ensure (appropriate) action. Hurdle 1/4 successfully negotiated. I note "abuse[at]racksrv.com is a valid deliverable e-mail box address." too. But SC, for one, can't reach it either. Just pointing that out because deep down inside I'm a practicing curmudgeon.

No closer to knowing whether a manual lart will get through to tugik[at]alchevsk.net I'm afraid - but it certainly is NOT ruled out yet. That was the point. But I suspect they already know very well what they are doing and don't intend to quit easily.

Apropos of putting on pressure, the SURBL (http://www.surbl.org/surbl-analysis) has them listed (via sdo.xn--80abbcbx9bxag9b0ca3g.xn--p1ai - it doesn't "do" Cyrillic either - but not 91.200.12.17, oddly, but it seems it is now 95.31.192.232, the good old A record rotation maybe, not listed under that IP either). The URIBL (https://admin.uribl.com/) doesn't have them listed under either IP or alias (and also doesn't handle Cyrillic). They (sdo.рёъънбхважврб.рф) SEEM (from our POV) to be going out of their way to avoid RBL listing so that is probably the only way to apply effective pressure - do what they don't want. SC only indirectly supports (one) real-time URI listing, the SURBL.

Yes, it would help if SC resolved these, with the chance then of the SURBL picking up feed accordingly - but of course, "that's not the principal mission". Registering with each and "manually" submitting to those URI listings remains an option - but given the obfuscation and the inability anyway of so many systems to accept the Cyrillic domain names that might seem like the drop of water wearing away at the rock. Well, if we all did it that might help.

