Hot on the heels of other ICANN Internationalized Domain Name (IDN) Top-Level Domain (TLD) launch errors [1], we now have another example of ICANN’s failure to comprehend the differences between IDN and ASCII names, this time to the detriment of potential IDN registrants and the new IDN generic TLD (gTLD) Registries. This gaffe really makes you wonder whether the SSAC and Multilingualism departments at ICANN have ever met.
I’m sure you are by now aware of the late-to-the-party security concerns regarding “name collisions” that resulted in the “NGPC Resolution for Addressing the Consequences of Name Collisions”.
This resulted mainly from router manufacturers and corporations deciding to use “made up” TLDs for their internal and sometimes public-facing systems. Bad idea. Things like “.home” for routers. GoDaddy even used host names ending in “corp.gd” for internal mail servers [2].
Anyway, somewhere along the way it was decided that all the illegal hostnames appearing in the DNS root server NXDOMAIN stats must be caused by people setting up silly hostnames in routers and other configuration errors, thus every NXDOMAIN entry must be a security problem.
So now we get to the stage where the first 4 new gTLDs get launched, and ICANN publishes the “block list” that is going to fix all these security issues. Let’s concentrate on the 2 Cyrillic new gTLDs, .сайт (.site in English) and .онлайн (.online in English); those are quite nice TLDs if you ask me. Let’s take a look at the blocklists published: xn--80asehdb & xn--80aswg
Despite the fact that the Registry for both of those TLDs (CORE) never intends to offer ANY ASCII registrations in these 2 TLDs, the block lists have a huge list of ASCII SLDs they have to specifically block. If we ignore all those ASCII strings needlessly listed, we’re left with the Cyrillic strings.
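The filtering step described here, dropping the ASCII entries and keeping only the IDN ones, can be sketched in a few lines of Python. This is only an illustration: the function name `split_block_list` and the two ASCII entries in the sample are made up for the example, the two `xn--` labels are taken from the .online block list, and the decoding uses Python’s built-in `punycode` codec rather than any tool ICANN publishes.

```python
def split_block_list(labels):
    """Separate plain ASCII labels from IDN labels (those with the "xn--" prefix).

    Returns (ascii_labels, idn_labels), where idn_labels holds
    (punycode_label, decoded_unicode_label) pairs.
    """
    ascii_labels, idn_labels = [], []
    for label in labels:
        label = label.strip().lower()
        if not label:
            continue
        if label.startswith("xn--"):
            # Strip the "xn--" prefix and decode the rest with the punycode codec.
            decoded = label[4:].encode("ascii").decode("punycode")
            idn_labels.append((label, decoded))
        else:
            ascii_labels.append(label)
    return ascii_labels, idn_labels


# "mail" and "www" are hypothetical ASCII entries; the xn-- labels come from the list.
sample = ["mail", "www", "xn--c1ajz8b", "xn--80aimyg"]
ascii_labels, idn_labels = split_block_list(sample)

for ace, text in idn_labels:
    print(ace, "->", text)
```

Run against a real block list file, the same split would show how many entries are ASCII strings the Registry never intends to offer and how many are actual Cyrillic words.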
According to ICANN, all these Cyrillic strings are “collisions” and therefore a security concern. Let’s take a look at a sample of them, and translate them into English for those of you that don’t understand Cyrillic characters…
Cyrillic | (punycode) | English translation |
????? | (xn--80akiqd) | anime.online |
??????? | (xn--80aaomshs) | Armenia.online |
?? | (xn--90ae) | bg = Bulgaria.online |
???? | (xn--90agyt) | BDSM.online |
??????? | (xn--90aalqlrt) | baseball.online |
?????? | (xn--90aaubo5j) | bible.online |
???? | (xn--90asln) | boxing.online |
??????? | (xn--80aba1bco2b) | bulbank.online (a large bank in Bulgaria) |
? | (xn--b1a) | v.online |
????? | (xn--b1aedk6a) | video.online |
?????-????? | (xn----ctbgen7afbbhm) | porno-video.online |
???????? | (xn--b1agfdb3cm8e) | visitors.online |
????? | (xn--b1aed0b0h) | video.online |
?????-??????? | (xn----ctbgeqanixdim4z) | joke-videos.online |
??? | (xn--c1aej) | gays.online |
??? | (xn--c1aem) | gay.online |
??? | (xn--c1aqi) | goal.online |
?????? | (xn--c1abdnx4g) | money.online |
?????????? | (xn--80adi0angbeo3k) | europaplus.online (Moscow FM radio station, 106.2 MHz) |
???????????? | (xn--80akegavfdk4ajc) | celebrity.online |
???? | (xn--80afo7a) | game.online |
???? | (xn--c1ajz8b) | games.online |
??????? | (xn--e1afpbok0e) | ???????.online (Russian TV series) |
???? | (xn--c1akx0g) | games.online (in Ukrainian) |
??????????? | (xn--80atbdbsooh2gqb) | calculator.online |
???? | (xn--h1adke) | movie.online |
????????? | (xn--80ajihqgqnc) | movietheater.online |
?????????? | (xn--80ajihqgqnc5g) | movietheaters.online |
???? | (xn--j1agd8g) | movie.online (in Ukrainian) |
????? | (xn--c1ajbfp) | books.online |
?????? | (xn--j1abetp1d) | luxfm.online (FM radio station) |
????? | (xn--e1agkva) | Messi.online (football player) |
?????? | (xn--80anjg9azc) | music.online |
?????? | (xn--80apahn7a) | Nikita.online |
??????? | (xn--b1amnebsh) | news.online |
???? | (xn--m1abcf) | porn.online |
????? | (xn--m1abbbg) | porno.online |
????? | (xn--80aimyg) | radio.online |
?? | (xn--p1ag) | ru.online |
???????? | (xn--80abap1arsf) | savings bank.online |
???? | (xn--e1aktc) | sex.online |
????????? | (xn--b1aebcnk6arc) | sex video.online |
????? | (xn--e1aon0cp) | family.online |
???????? | (xn--e1angicic4f) | watch.online |
??? | (xn--90a5ad) | St Petersburg.online |
?? | (xn--b1a5a) | tv.online |
????????? | (xn--b1afajepzqk) | television.online |
??????? | (xn--h1aaagb4bs) | physics.online |
????? | (xn--h1agd3a5b) | film.online |
?? | (xn--l1ap) | fm.online |
??????? | (xn--n1aaaemk5a) | photoshop.online |
??? | (xn--w0a6a5b) | food.online |
?????? | (xn--90auioef) | football.online |
?????? | (xn--80ajnozp) | hentai.online |
??? | (xn--u1aaa) | xxx.online |
?????? | (xn--d1acpjx3f) | Yandex (the most popular search engine in Russia) |
Sorry for the long list, but it really demonstrates the magnitude of ICANN’s error.
Do any of the above look like made-up hostnames people are likely to have loaded into router configurations? Are there even any router configuration interfaces that let you type Cyrillic into a hostname field and automatically convert it into the punycode required to correctly use it in DNS?
Another question: how many of you, when sitting on a search engine page in your browser, have accidentally grabbed the mouse, clicked on the URL bar instead of the search bar, typed in a search term (let’s say “online game”, for instance), and then been a little embarrassed when the browser went off to the DNS and produced an error because there isn’t a domain “online%20game”? I know I have, and I’m willing to bet you’ve done it too. No doubt they’re all in the NXDOMAIN stats.
Now, if you take a look at that big list of ICANN-designated “collisions” above, doesn’t it look like it could be a list of the top Russian search phrases that include the word “online”? Myself, I think that’s exactly what it is. Maybe there is a browser somewhere out there, used regularly by people in Russia (or another country that uses Cyrillic), that, when confronted by a search phrase typed into the URL bar, converts (DNS-illegal) spaces into (DNS-legal) dots instead of URL-encoded %20s and heads off to the DNS for a resolution attempt?
Here’s a quick test that suggests this is indeed true: go to http://www.yandex.ru, paste “онлайн” into the search bar, then press the space bar. Look at the list of suggestions that appear. All of the two-word ones that appear in the drop-down list are in the .онлайн blocklist. Whoops!
The other potential reason for such “collisions” is that new Internet users in these countries, not knowing the history of the Internet and the old DNS ASCII-only restrictions, quite reasonably expect that when they type a Cyrillic domain name into the URL bar, it’s going to work. I can imagine they get quite confused when it doesn’t, and isn’t that the whole reasoning behind IDN TLDs in the first place? ICANN finally launches these IDN gTLDs and just beforehand hobbles them by placing the most potentially popular domain names in a “collision” list.
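To make the hypothesis concrete, here is a minimal Python sketch of what such a browser would effectively be doing. It is a guess at the behaviour, not documented code from any browser: the function name `phrase_to_hostname` is invented for the example, and the conversion uses Python’s built-in `punycode` codec.

```python
def phrase_to_hostname(phrase):
    """Turn a search phrase into the DNS name that would be queried if
    spaces were replaced by dots (the hypothesised browser behaviour)."""
    ace_labels = []
    for word in phrase.lower().split():
        if word.isascii():
            # ASCII words can be used as DNS labels unchanged.
            ace_labels.append(word)
        else:
            # Non-ASCII words must be sent as punycode with the "xn--" prefix.
            ace_labels.append("xn--" + word.encode("punycode").decode("ascii"))
    return ".".join(ace_labels)


# "игры онлайн" is the Russian search phrase for "games online".
hostname = phrase_to_hostname("игры онлайн")
print(hostname)  # xn--c1ajz8b.xn--80asehdb
```

The first label of the result is the `xn--c1ajz8b` entry from the table above and the second is the punycode form of the TLD itself, so a query like this, hitting the root servers before the TLD existed, would be logged as an NXDOMAIN for exactly that name.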
Can you imagine the confusion that is going to ensue when CORE is finally allowed to open registrations to the general public and the most popular choices are artificially blocked? “I’m sorry the enormously popular phrase you have chosen to register has been deemed to be a security threat by ICANN and so cannot be registered”.
The Internet users keen to try out the new IDN gTLDs in their web browsers are most likely going to try the domains blocked by this list (what domains would you try first if you heard .online was alive?). It might take them 20 tries before they finally strike an obvious hostname that isn’t on the block list. Opera, a very popular browser in Russia, is going to report back “Network problem – Check that the address is spelled correctly, or try searching for the site.” (well, in Russian of course), along with a Google search box, sending the excited IDN new gTLD experimenter off to a search engine in .com/.ru, thus negating the whole damn reason for launching these IDN gTLDs in the first place. A lot of them might give up before then and decide these newfangled IDN gTLDs don’t actually work.
Fadi Chehadé, the CEO of ICANN, is a supporter of the concept of IDNs (yay!) and is fluent in Arabic. Perhaps he should take a long hard look at the blocklist for .شبكة (“network” in Arabic, another of the first four new gTLDs to launch) [3], have a think about the Arabic words in that blocklist [4], maybe consult with some language and foreign SEO experts, and then make some urgent adjustments to the blocklist methodology in time for these first 4 new gTLDs to launch without restrictions?
[1] (a) Intellectual Property Constituency blocking of Verisign transliterated .com special launch requirements, based on verifiably erroneous and weak data, to the detriment of hundreds of thousands of existing IDN.com/org Registrants. See:
http://forum.icann.org/lists/comments-rpm-requirements-06aug13/msg00037.html
http://forum.icann.org/lists/comments-rpm-requirements-06aug13/msg00065.html
http://forum.icann.org/lists/comments-rpm-requirements-06aug13/msg00061.html and
http://forum.icann.org/lists/comments-rpm-requirements-06aug13/msg00067.html
(b) ICANN requires that the ONLY domain allowed to resolve when the TLD is first launched is the ASCII string “nic”; IDN new gTLDs are not even allowed to offer the equivalent string in the script the entire TLD will operate under. Just another ICANN cultural error. This no-activation window was chosen to match the 120-day period that the CA/Browser Forum gives its certificate authority members to revoke clashing certificates. Have they even issued any certificates in xn-- format?
[2] “Received: from unknown (HELO gdmailer05.dc1.corp.gd) (208.109.14.190) by m1plcorpmail001.prod.mesa1.secureserver.net”
[3] http://www.icann.org/sites/default/files/tlds/xn--ngbc5azd/xn--ngbc5azd-apd-list-17oct13-en.csv
[4] Hint: It includes world.network, islamic.network, arabic.network, AlJazeera.network and health.network
ICANN should be ashamed of themselves. They make all the right noises about being international, yet they epitomize their image of being that California-based ivory-tower company.
The point made in footnote #1 is fascinating. As a non-English speaker I feel for those hundreds of thousands of registrants who are about to be screwed over. And what’s worse is that ICANN’s virtually non-existent “outreach program” has meant that anyone outside the U.S. has no idea what is about to happen to them.
If you want a perfect example of ICANN’s contempt for non-English registrants, take a look at this comment left by a native speaker in the recent comment period: http://forum.icann.org/lists/comments-rpm-requirements-06aug13/msg00043.html
I remember seeing this same encoding problem on the same ICANN comment platform in 2009.
What kind of organization says they represent the international community, but is too inept to allow non-English comments to be made?
ICANN’s only interest is to serve the Intellectual Property Constituency (IPC), who seem to govern everything they do. Registrants come 2nd, and non-English registrants somewhere after that.
Having browsed the block list for .guru, I reach the conclusion this problem isn’t restricted to just IDN new gTLDs.
Some very nice keyword .guru domains will be prevented from being registered, affecting the .guru registry’s launch success, and making people wonder if .guru is not functioning correctly…
accounts.guru
apple.guru
apps.guru
art.guru
asus.guru
bedroom.guru (Yep! That’s me!)
biz.guru
blog.guru
business.guru
candy.guru
casino.guru
chess.guru
cocktail.guru
cricket.guru
date.guru
email.guru
exim.guru
firewall.guru
forex.guru
guitar.guru
hardware.guru
history.guru
idea.guru
image.guru
internet.guru
investment.guru
loan.guru
love.guru (me, again, cough hack maybe not)
market.guru
marketing.guru
microsoft.guru
mobile.guru
money.guru
online.guru
opera.guru
pc.guru
php.guru
podcast.guru
porn.guru
property.guru
router.guru
sex.guru
smart.guru
technology.guru
travel.guru
weather.guru
web.guru
webcam.guru
xxx.guru
Not a complete list, but you get the idea…