In Pursuit of IDN Perfection?

IDNs (Internationalized Domain Names) have been the subject of a great deal of discussion. IDNs are a way to allow non-ASCII scripts to be used in the domain names that appear in URLs. There are a number of difficulties with IDNs. One is that some letters and punctuation marks in other scripts look similar to ordinary ASCII characters. This allows people to spoof other URLs and use them to fool users and, for instance, steal their banking information. The other criticism is whether people really need them. The argument (which until recently I agreed with) is that everyone in the world reads ASCII, so can’t people at least type the URLs in ASCII?
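To make the look-alike problem concrete, here is a minimal sketch in Python. The label "paypal" and the helper `scripts_in_label` are illustrative assumptions, not anything from the ICANN discussion, and taking the first word of a character's Unicode name is only a rough stand-in for a proper script check.

```python
import unicodedata


def scripts_in_label(label):
    """Return a rough set of scripts used by the letters in a label.

    Assumption: the first word of the Unicode character name
    (e.g. "CYRILLIC SMALL LETTER A") is a good-enough script hint.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch).split()[0])
    return scripts


latin = "paypal"
# Same visual shape, but both 'a' letters are Cyrillic U+0430.
spoof = "p\u0430yp\u0430l"

print(latin == spoof)           # the strings differ even if they look alike
print(scripts_in_label(latin))  # only LATIN
print(scripts_in_label(spoof))  # LATIN mixed with CYRILLIC
```

A label that mixes scripts in this way is one signal a browser or registry could use to flag a possible spoof, although it would not catch a label written entirely in one look-alike script.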

Fellow board member Hualin Qian said that the Chinese were using IDNs via a browser plugin and that since most Chinese read only Chinese web pages, it seemed to be doing quite well. I would have to concur. I think one thing that we forget is that the type of people who come to ICANN meetings and argue about this stuff tend to speak multiple languages, care about what is going on in other languages, and are trying to get everything perfect. We are not the norm. I remember when we set up Infoseek Japan, we decided to index only Japanese pages. I argued that we should index English pages, but I was overruled by the people who said most Japanese don’t read English web pages.

Many of the problems of IDNs come from trying to do multiple languages at the same time or languages one can’t read. The biggest difficulty is implementing them in gTLDs like .com or .org. I think that if we focus on helping the country-level TLDs (ccTLDs) get going with IDNs in their own native languages, we would be solving the problem for 80% or so of the people. My concern is holding up the ability for these people to use IDNs because we can’t find the perfect solution for the edge cases.

This is philosophically opposed to my “Global Voices” position, which focuses on building bridges between cultures and languages, but I believe that the benefit for the digital divide of getting something running soon is worth it. Also, once we have a lot of people using IDNs in different regions, I’m sure we can use this experience to come up with more creative ways to solve the more difficult IDN problems.

Again, this is my personal opinion and not any sort of consensus of staff or the board of ICANN. I am mainly pointing this out because until this meeting, my position (privately) was “why the hell do we need IDNs?” On the other hand, I think we are moving forward and the discussions during this meeting in MdP were very helpful.

By Joi Ito

Comments

Richard Henderson  –  Apr 11, 2005 10:51 PM

I should have thought that people at a local level (eg: ccTLD level) are best-equipped to form really informed judgments about what their own communities need and want; and if the process is developed there first of all, that would create a testbed which could then (if felt necessary) be extended to the rest of the world (eg: gTLD level).

I’m not saying that IDNs aren’t needed at gTLD level. I’m just saying that perhaps this shouldn’t delay implementation at country level.

Better to learn from mistakes (and successes!) in a restricted area of the net, than to create ‘first efforts’ at a global level and live to regret what was done.

Nominet please take note: we expect accommodation of accents for the Gaelic! It was here in UK long before Anglo-Saxon!

Richard H

Suresh Ramasubramanian  –  Apr 12, 2005 2:06 PM

The Chinese can use an arbitrary standard and an arbitrary browser plugin so that local people can read Chinese-only pages. Other countries can implement IDN the same ad-hoc way.

But there is the question of interoperability between different IDN schemes, and the fact that a non-trivial number of speakers of Chinese, Japanese and Korean are going to be reading websites in each other’s languages.

And that is enough reason not to let ad hoc systems come into place that work for 80% of the population and fail horribly for the remaining 20% .. chances are that the edge cases are going to prove far more consequential than people thought.  Lots of quickfix solutions in various fields (SPF for email, just to quote an example) tend to turn a Nelson’s eye to edge cases .. and that is bad - far worse than people think - in the long term.

And there have been the usual wars between different and fundamentally non-interoperable variants of different technologies (K56 and that other modem protocol, before V90 came into existence, several proprietary, closed and non-standard messaging protocols before SMTP ..).

I would have thought that people over the ages would have become extremely wary of ad-hoc fixes and technologies that don’t have global consensus, and which fail non-gracefully in edge situations.  But no :(

I’m not even going near the spectre of IDN in various other languages, some of which have right-to-left scripts, just to add that little extra touch of complexity, and several others that don’t yet have a standardized and usable character set. That is going to be a challenge for the future of IDN.

Joi Ito  –  Apr 12, 2005 3:06 PM

Suresh, what Richard said. ;-) When I said 80%, I was suggesting that we let the ccTLDs go with a limited launch to cover the needs of 80% of the people and use that experience to figure out how to solve the problem for gTLDs and the edge cases. I am also not happy about ad hoc solutions, and the Chinese plug-in should be a wake-up call about the local-level need for IDNs. It’s also not something we can stop. John Klensin mentioned several times that if phishing continued to be a problem and we didn’t provide a solution, they are threatening to convert everything to Punycode, which would definitely be sub-optimal as well.
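For readers who have not seen Punycode, a small sketch of what "converting everything" would look like to a user, using Python's built-in `punycode` and `idna` codecs (the latter implements the older IDNA 2003 rules). The name "bücher.example" is only an illustration, not a domain from this discussion.

```python
# A single label, encoded with the raw Punycode algorithm.
label = "b\u00fccher"  # "bücher"
print(label.encode("punycode"))  # b'bcher-kva'

# A full host name: the idna codec adds the "xn--" prefix per label
# and leaves ASCII-only labels such as "example" untouched.
host = "b\u00fccher.example"
ace = host.encode("idna")
print(ace)  # b'xn--bcher-kva.example'

# Decoding recovers the Unicode form that a browser could display.
print(ace.decode("idna") == host)  # True
```

If browsers showed only the `xn--` form, spoofed names would be easy to spot, but legitimate IDNs would become unreadable to the people they are meant to serve.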

Suresh Ramasubramanian  –  Apr 12, 2005 3:34 PM

Thank you for clarifying this .. it does need to be made very clear

—srs (who still thinks the same as you privately did till a while back - why the hell do we need IDN)? :)

Joi Ito  –  Apr 12, 2005 3:57 PM

Sorry, one more clarification. The “they” John Klensin was referring to was the browser developers.

Ram Mohan  –  Apr 15, 2005 7:38 PM

Joi,
I am still wary and uncomfortable that in China, the “com” and “net” suffixes have been converted to Chinese characters, and have been sold to registrants, likely not the same registrants who own these names in ASCII.

Given that .COM is recognized everywhere, the precedent set in China to allow “com”, “net” etc to be registered independently in Chinese sets the stage for ad-hoc and ultimately incoherent versions of the Web based on what plugin/browser/script/geography you use. It promotes more phishing, not less.

If the solution you suggest is restricted to only the ccTLD suffix, the problem is more localized.  However, this does not seem to be the approach taken in China, thereby promoting balkanization.

-ram

Richard Henderson  –  Apr 15, 2005 10:48 PM

Ram,

I take your point of course, but it could be argued that what you call “balkanisation” is simply a national registry deciding what is best suited to their own local and national needs… in short, it could be argued that the ccTLDs should be left as far as possible to run their own affairs, and that neither ICANN nor the USA’s Department of Commerce should have any say in the shape and methodology that an independent nation state (or the registry representing it) selects for its own nationally-designated namespace.

Indeed, the future may present us not only with a “balkanisation” of suffixes, but even a “balkanisation” of roots.

Obviously I can see some merit in what you are saying: the anxiety that ccTLD registries acting locally or unilaterally may add to confusion and facilitate abuses.

But this could be seen as the anxiety of those “in control”... the reality is that there really is a place called the Balkans somewhere beyond the USA, and the people who live there have every right to “balkanise” (nationalise) their own namespace.

Or to apply it to China, what possible reason does an English-dominated ICANN have for telling China that it should not use its own language or characters in its own country? (China’s arguments, of course, are weakened by its appalling record on censorship and human rights, but that is secondary to the point.)

If efforts are made to “centralise” a uniform development of the ccTLDs, then those efforts have to be based on consensus and the agreement of the local (national) registry.

It is one thing for ICANN to enforce its Agreements with gTLDs (indeed, Ram, I wish that ICANN had been more proactive with Afilias when it was breaking its own Agreement) but a Californian quango accountable to the DoC of a single country cannot possibly claim the right to force other nations or their registries to adhere to anything.

ICANN simply lacks any mandate or credibility to do such things.

In 1979 I attended the signing of a “Declaration of Human Rights and National Identity” by the Russian dissident Vladimir Bukovsky and representatives of many East European Nations then under the yoke of the Soviet dictatorship. This declaration has since been prophetically fulfilled and history has run its course.

In Internet terms, the principles of human rights are inextricably linked to the right of nation states to determine their own identities. In short, the principle of ccTLD independence is something that should *not* be compromised. Even if a ccTLD registry makes mistakes or takes wrong turnings.

That, at least, is my view. With regard to IDNs, such wrong turnings may be viewed by everyone else, and lessons may be learned, but the consequences will at least be confined to local levels… and at least progress can be made at the local level, based on local needs, local aspirations, and the local market.

I find it very attractive that the ICANN universe of gTLDs co-exists and lives in balance with the multiplicity of ccTLDs.

I’m immensely proud of our own British ccTLD, and although there are things that Nominet could learn from the gTLDs, I think there’s a whole lot that the gTLDs could learn from Nominet particularly in areas of responsiveness and accessibility.

I expect the British ccTLD to develop in the way that British people find most useful in their own local environment.

If we are unhappy about the China registry using its own national characters, I suggest that the problem may lie with us, and that we should go and register .coms .nets and .orgs instead. (Or .infos if we are desperate enough!)

If you analyse the purchase of .com.cn domain names you will see that a whole load of them were bought by American or European domain name speculators. And yet, in all honesty, the .cn registry should exist primarily for China’s own people.

And so we come back to the artificial shortage of gTLDs. Flood the market with 100 new gTLDs over ten years and the need to speculate in foreign registries would more or less come to an end!

With regard to IDNs, and with regard to the determination of policy more generally, it is up to the ccTLDs (along with encouragement to discuss and consult, which I take it is what you are advocating Ram).

Yrs,

Richard Henderson

Dr Govind  –  Apr 16, 2005 11:05 AM

I attended the recent ICANN meeting at MdP, Argentina, where IDN was discussed extensively in the open forum. Our country, India, is rich in diversity of languages (more than 18 officially) and scripts. This makes India unique in the world as a country with multiple languages and multiple scripts, unlike other countries, which have either one language and one script or one language and two scripts. We have a population of more than a billion, of which more than 95% are non-English-speaking. We want to implement IDN in the local scripts. In this context, the article by Joi and the subsequent comments from Richard, Suresh and Ram are interesting. It looks like IDN is at a crossroads. gTLDs have made IDN work in a limited way and are in the process of extending it to all languages of the world. Some of the ccTLDs are pursuing the development of their languages with their own root server for faster deployment. The question is which path to follow so that the Internet is available in local-language scripts and at the same time is not fractured in terms of resolution, with a uniquely identified root for each domain.

In this context, are there issues still to be resolved at the standards level for IDN implementation, or are standards still being formulated before IDN can be fully implemented for all the languages of the world? Further, are there fundamental script-related issues in certain languages such as CJK (from the point of view of structure and variants) which are prompting them to take a different approach to IDN implementation?

Suresh Ramasubramanian  –  Apr 16, 2005 12:00 PM

Thanks, Dr. Govind. In any case, the issues in the CJK area are at a level where interoperability is quite possible, given sufficient discussion.

At least in India, we need a lot more attention to Indian language characters in Unicode first.

University of Hyderabad, Anna University, CDAC and others are working on this but perhaps would benefit from increased government support.  I do applaud the recent attention that is being given to collecting and releasing free / open source as well as proprietary software in the Tamil language - <http://www.financialexpress.com/fe_full_story.php?content_id=88079>

We need further such coordinated effort, to build a good installed base of local language capable software that all follows a common and well defined standard, before we can start looking at an implementation of IDN for Indian languages.

Ram Mohan  –  Apr 17, 2005 4:18 AM

Richard - my issue is not with ICANN or ccTLDs on regulation - my concern is much more mundane and user-oriented - when someone registers “citibank.com” in Chinese script, and puts up a phishing site, the normal Internet user is going to get a heck of a shock. The proposed system seems to allow this to happen.

On Dr. Govind’s question regarding the CJK model, the problems with a purely localised model are: (a) the need to have a special client installed in every browser in the usage area, (b) the potential need to play tricks with DNS to make some of these non-standard names resolve, and (c) only some users get to use the domains some of the time.

Suresh, perhaps an even larger part of what’s needed for India is local-language content, not only software.

Suresh Ramasubramanian  –  Apr 17, 2005 4:56 AM

> Suresh, perhaps an even larger part of
> what’s needed for India is local-language
> content, not only software.

Chicken and egg problem there.

All three are part of a complex equation ..

1. .in domains + a better managed NIXI (you’ve heard me on this in the past) = easy availability of fast, cheap hosting locally.

2. popularization of local language software (editors, etc.). Browsers that recognize Unicode can render Tamil etc. just fine .. see http://ta.wikipedia.org (Tamil) and http://hi.wikipedia.org (Hindi).  In fact see if you can read this URL - http://hi.wikipedia.org/wiki/मुख्य_पृष्ठ on your browser; the last part after the / is “mukhya prusht” / main page in Devanagari script. Works just fine on Opera / Firefox (not tried it on IE, don’t have it on my laptop just now, but it should work on IE 6.x).

3. These two will drive local language content. Which will then drive a better spread of these two ...
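As an aside on the Hindi Wikipedia link in point 2: the non-ASCII text there sits in the path of the URL, not in the host name, so it travels as percent-encoded UTF-8 and does not depend on IDN at all. A minimal sketch using Python's standard `urllib.parse` (the variable names are illustrative only):

```python
from urllib.parse import quote, unquote

page = "मुख्य_पृष्ठ"  # "mukhya prusht", the main page title

# quote() writes each UTF-8 byte of a non-ASCII character as %XX;
# the underscore is left as-is because it is an unreserved character.
encoded = quote(page)
url = "http://hi.wikipedia.org/wiki/" + encoded

print(url)
print(encoded.isascii())         # True: only ASCII goes over the wire
print(unquote(encoded) == page)  # True: the browser can show the original
```

The host part, hi.wikipedia.org, is plain ASCII here, which is why this URL already works in Unicode-aware browsers without any IDN support.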

So, I’d suggest we get moving on both of these, and the content availability issue will sort itself out.  Quite a lot of local language newspapers and portals are online in India that would be glad to use Unicode instead of relying on proprietary fonts / scanned graphics images etc.
