In Internet Draft draft-lee-dnsop-scalingroot-00.txt, my coauthors and I described a method of distributing the task of providing DNS Root Name Service both globally and universally. In this article I will explain the sense of the proposal in a voice meant to be understood by a policy-making audience, who may in many cases be less technically adept than the IETF DNSOP Working Group for whom the scalingroot-00 draft was crafted. I will also apologize for a controversial observation concerning the addition of new root name servers: while I am in truth proposing to add millions of root name server operators, I am in no way proposing to add only seven. It’s my hope that the actual proposal (millions of new root name server operators), while having far greater policy implications than the non-proposal (adding only seven root name server operators), will be uncontroversial when understood in its full context.
Routing
It is necessary for the appreciation of this proposal to briefly communicate the hierarchical nature of Internet routing. We often hear Internet-related discussions that differentiate between the edge and the core, and that simple taxonomy is well understood even by a non-technical audience, since the Internet is usually drawn as a “cloud” which has edges on its outside and a core in its middle. However, the actual hierarchy has more than two levels (more than just edge and core), and this deeper taxonomy can also be understood by a broadly non-technical audience.
A router, in Internet terms, is a device which can look at an Internet Protocol “packet”, focus on its destination address, choose a “next-hop” based on that destination address, and forward the packet to that next-hop. A router has a table of destinations, each with a next-hop, which guides this activity. This routing table can be very small, with details only about nearby destinations, or it can be a complete table of all destinations, depending on that router’s role. In general, routers near the edge of the Internet have smaller routing tables, while routers near the core of the Internet have complete (and very large) routing tables.
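The selection rule at the heart of this activity is simple: when more than one table entry matches a destination, the most specific entry (the longest prefix) wins. Here is a minimal sketch of that logic in Python, using only the standard ipaddress module; the table entries are documentation prefixes invented for the example, not anyone’s real configuration.

```python
# A toy routing table: each destination prefix maps to a next-hop.
import ipaddress

ROUTING_TABLE = {
    ipaddress.ip_network("0.0.0.0/0"): "coreward router",    # default route
    ipaddress.ip_network("198.51.100.0/24"): "local LAN",    # nearby destinations
    ipaddress.ip_network("198.51.100.7/32"): "this device",  # our own address
}

def next_hop(destination: str) -> str:
    """Pick the most specific (longest-prefix) matching route."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in ROUTING_TABLE if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTING_TABLE[best]

print(next_hop("198.51.100.7"))   # "this device": the /32 host route wins
print(next_hop("198.51.100.20"))  # "local LAN": the /24 beats the default
print(next_hop("192.0.2.1"))      # "coreward router": only the default matches
```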
To avoid any possible confusion, let me distinguish a router from a switch, which is a lower-level device that looks at the destination address of an Ethernet frame (generally speaking), whereas a router is a higher-level device that looks at the destination of an Internet packet. We live in a complicated world where most routers also have switching features and many switches also have routing features. These hybrid router+switch (or switch+router) devices complicate our networks as well as our understanding of our networks, but the fundamental actions of routing and switching nevertheless remain distinct. For the purpose of this article we only need to know about routing, but be aware that the device performing that action may be known locally as a switch or an IP switch.
Importantly, every Internet-connected edge device (sometimes called a host or server) contains within it an IP router with a routing table, and performs the action of routing in addition to its primary roles of running applications and converting network data (in computer-readable form) to and from presentation format (in human-readable form). To round out the explanation and cement the concept of all hosts and servers containing a routing table and performing the function of IP routing, consider IP network 127. This /8 was foolishly wasted in the early design of IPv4 by assigning it to every IP endpoint, always referring to the local IP endpoint. I say “wasted” because we chewed up about 16 million addresses when we only needed one address. It’s like we used an elephant gun to kill a flea. However, it has always been true and will always be true that address 127.0.0.1 refers to the endpoint itself. One way to insult strangers on the Internet is to create a domain name pointing at 127.0.0.1, basically telling them, if you want to reach servers at that domain name, “go talk to yourself.”
An Internet edge device usually has a routing table consisting of three elements. First there’s 127.0.0.0/8, intended to capture packets destined for IP network 127 and deliver them to local services running within the device itself. Next there will be an interface route that holds the edge device’s assigned address, usually the address assigned by DHCP but on many servers and some hosts this address is still manually configured. This route likewise ensures that packets destined for the edge device’s currently-assigned address are delivered to local services running within the device itself, but it also informs the routing logic that there is a Local Area Network (LAN) containing other edge devices having similar addresses, which are directly reachable by Ethernet (probably via a switch). Finally there is a default route, sometimes shown as 0.0.0.0/0 or even 0/0, which points to a next-hop that is closer to the Internet core than the device itself. It is this default route which concerns us now.
An Internet edge device can directly reach itself (either by the universal 127.0.0.1 address, or by its currently assigned interface address), and it can directly reach other edge devices on the same LAN. Anything further away requires more knowledge than the average Internet edge device ever has, or ever needs, since such knowledge is only needed by devices having more than one Internet connection. Therefore we can characterize an edge device’s routing logic as: (1) is this packet for me? (2) if not, is it for someone else on my LAN? (3) if not, punt it toward the core (“coreward”). And when I say coreward I do not mean to imply that every edge device is one hop away from the core! Rather, coreward means that the next-hop is some router with more than one connection and a more diverse routing table. The next hop upstream (“coreward”) from a LAN is usually an “edge router”, which, for small-office/home-office networks, is a DSL or cable modem, or a fiber-to-the-home modem, or a set-top box. For academic, enterprise, and ISP networks, the next hop coreward from an edge device will usually be at the boundary of a floor or office suite, or an office building, or a campus, or a datacenter, whereas the next-hop coreward from an “edge router” is usually that ISP’s boundary for a city or neighborhood.
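That three-step logic is compact enough to write down. The following sketch (in Python, with invented documentation addresses) is only an illustration of the decision procedure, not any real host’s implementation.

```python
# An edge device's three-step forwarding decision, as described above.
# The addresses are invented documentation prefixes, not a real host.
import ipaddress

LOOPBACK   = ipaddress.ip_network("127.0.0.0/8")      # always "me"
MY_ADDRESS = ipaddress.ip_address("203.0.113.25")     # e.g. assigned by DHCP
MY_LAN     = ipaddress.ip_network("203.0.113.0/24")   # the interface route

def forward(destination: str) -> str:
    addr = ipaddress.ip_address(destination)
    if addr in LOOPBACK or addr == MY_ADDRESS:
        return "deliver to local services"        # (1) is this packet for me?
    if addr in MY_LAN:
        return "send directly on the LAN"         # (2) someone else on my LAN?
    return "punt coreward via the default route"  # (3) everything else

print(forward("127.0.0.1"))     # deliver to local services
print(forward("203.0.113.80"))  # send directly on the LAN
print(forward("198.41.0.4"))    # punt coreward via the default route
```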
Hierarchy and Anycast
Exactly how many layers of router a packet passes through before reaching the core of the Internet (where routers have no 0.0.0.0/0 route, because their routing tables are complete and cover the entire IP address space) depends on design decisions made by routing architects. The specifics do not concern us, except to note that every router is owned and operated by some enterprise or ISP, and these enterprises and ISPs have to interconnect with each other so that each can provide complete reachability for all IP destination addresses of any packet it might receive from its edges. This interconnection is done either through private interconnections or through Internet Exchange Points (IXPs) such as Equinix. Some network operators are very open about whom they will interconnect with, figuring that more connections means more routing choices, which is always better; for other network operators, this kind of interconnection is a business decision that weighs the possibility of getting other networks to pay for their interconnection. For the purposes of this article, the Internet core can remain a mystery.
DNS requests are IP packets, and like all IP packets they have a destination address. DNS requests concerning the content of the DNS Root Zone have a destination address of one of the DNS Root Name Servers, of which at the time of this writing there are thirteen. Actually there are a lot more than thirteen servers, but there are thirteen root server names, most of which have two addresses, one for IPv4 and one for IPv6. When a Recursive DNS (RDNS) server is asked a question with a TLD it does not recognize, that question is forwarded to the root name servers. This is how IANA’s DNS name space is managed: all new TLDs are published through the root name servers, which all fetch their root zone content from IANA (or from VeriSign, who publishes IANA’s data). As you can see, much of IANA’s authority depends on cooperation between IANA, the root name servers who use IANA’s data, and the rest of the world who uses the root name servers. None of this is legislated anywhere; it’s an organic system based on trust and inertia, and it has worked better than any other technology in history, largely because it is built upon cooperation and trust, and not on law.
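The forwarding step can be observed directly. The sketch below assumes the third-party dnspython package; it sends a question the root cannot answer itself to a.root-servers.net (198.41.0.4), and receives back a referral pointing downward to the com servers.

```python
# Ask a root name server about a name it has only delegated, not hosted.
# Requires the third-party dnspython package (pip install dnspython).
import dns.message
import dns.query

# EDNS with a larger payload avoids truncation of the referral.
query = dns.message.make_query("www.example.com.", "A", use_edns=0, payload=1400)
response = dns.query.udp(query, "198.41.0.4", timeout=5)  # a.root-servers.net

# The root does not know www.example.com; instead its authority section
# carries NS records for com., referring the asker down the hierarchy.
for rrset in response.authority:
    print(rrset)
```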
So, as a DNS request packet travels from the edge toward the core, it passes through a hierarchy of routers, beginning with the sender’s own host, then its next-hop router, and so on. For the purposes of this article, the layers in this hierarchy are: the sending host itself; its LAN; the edge router; the boundary of a department, campus, or datacenter; the ISP; and finally the interconnected core.
If any of those routers has an entry in its routing table corresponding to a root name server, then it will use that route in preference to its default route. That’s how I can say that there are a lot more than thirteen root name servers even though there are only thirteen root server names. Most of the root name server operators have installed many copies of their server infrastructure around the world, and we use the same address on all of them. The happy result is that an IP packet containing a DNS request for a TLD unknown to some RDNS will most often be handled by the topologically closest clone of that server. So if a root name server operator contracts with an ISP or IXP to host a root server clone (sometimes called an anycast node), it just means that traffic aimed at that destination address will be sucked out of the coreward flow, and answered, as close to the request source as possible. There are hundreds of root name servers, each operated by one of the root name server operators, each using one of the well-known root name server addresses, and each advertised into the routing table at the ISP or IXP layer of the routing hierarchy.
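No special machinery selects the closest clone; ordinary shortest-path routing does it automatically when several nodes advertise the same address. A toy model of that effect, with invented distances:

```python
# Three anycast nodes advertise the same root server address; each
# vantage point simply routes to whichever advertisement is closest.
# All distances here are made up for the illustration.
ADVERTISED_DISTANCE = {
    "node-tokyo":   {"tokyo": 1, "london": 12, "chicago": 9},
    "node-london":  {"tokyo": 12, "london": 1, "chicago": 7},
    "node-chicago": {"tokyo": 9, "london": 7, "chicago": 1},
}

def chosen_clone(vantage: str) -> str:
    """The routing system's choice: the topologically closest node."""
    return min(ADVERTISED_DISTANCE, key=lambda n: ADVERTISED_DISTANCE[n][vantage])

for city in ("tokyo", "london", "chicago"):
    print(f"queries from {city} land on {chosen_clone(city)}")
```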
Root DNS anycast was done first by the M-root team at WIDE in Japan, then copied at larger scale by the F-root team at ISC in California, and these days, copied by almost all the other root name server operators. The goal isn’t performance, although there is a performance improvement from doing it this way. Rather, the original goal was resilience. Root name servers are a popular target for Distributed Denial of Service (DDoS) attacks. Having hundreds of clones of the root name servers around the world makes them slightly harder to attack. That’s the only reason root name server operators operate anycast clouds. The amazingly high cost of deploying, monitoring, repairing, upgrading, and otherwise operating hundreds of root name servers cannot be justified by any benefit less important than keeping the Internet running even during DDoS attacks.
Notably, others also operate DNS root anycast nodes. On various ISP and academic networks around the world, there are servers answering to the well-known root name server destination addresses, all controlled by routing tables which are private to that ISP or that academic network. There’s no law relating to this behaviour; on the Internet the relevant rule of behaviour is “my network, my rules.” Like most Internet technology, the global routing system is almost completely insecure: anybody can advertise any route, and if this advertisement is localized, there’s no way for the rest of the world to even know it’s happening. And just as notably, the servers operated privately for this purpose don’t have to limit themselves to IANA data. This is one of the reasons that so many of us have spent so many years trying to make Secure DNS (DNSSEC) work: we want it to be possible for cooperating end system operators to know when they’re getting IANA DNS data vs. when they’re getting something else from someone else. Like root name service anycast, DNSSEC’s cost and complexity would be an outrage, except that the benefit is keeping the Internet running even when someone is injecting DNS poison into it.
Unowned Anycast
Ten or twelve years ago we noticed that the IP addresses used for Network Address Translation (NAT) were the topic of vast numbers of DNS queries, and that since the domain names associated with these addresses had not been delegated to anybody, these queries were all reaching the DNS root name servers. Now, these addresses are like Network 127.0.0.0/8 in that they are used everywhere, and they have no universal meaning. So, there is no correct answer the DNS root name servers can give to questions concerning these addresses; queries concerning the host names on Network 10.0.0.0/8, Network 192.168.0.0/16, and Network 172.16.0.0/12 should be answered locally, where the name of a host using one of these addresses might actually be known, and where that knowledge might actually be useful. Therefore the decision was made to delegate the domain names associated with these address blocks to some name servers which did not truly exist in any specific sense, but which could be created anywhere, by anybody. We called this unowned anycast because, unlike the DNS root name server addresses, which each belong to some root name server operator, the name servers to whom the NAT address space had been delegated belonged to nobody in particular.
It is therefore possible to answer queries for the NAT address space (Nets 10, 192.168, and 172.16) at any layer of the routing hierarchy. You can serve this DNS content from your own DNS server itself, so that when it tries to reach the appropriate name servers it ends up talking to itself. Or you can serve this DNS content at a name server visible to your whole department, or your whole campus, or your whole ISP, or your whole country. Of course, the further up the hierarchy you have to go to get an answer, the less useful that answer will be to you, because other customers of your ISP, or other Internet users in your country, are likely to use the NAT address ranges differently than you do. Nevertheless such name servers are still valuable wherever they are, because they keep this “junk traffic” from reaching the DNS root name servers, and they avoid wasting expensive long-haul Internet capacity on traffic which will likely benefit nobody. And of course, there are name servers of last resort, which are visible in the global Internet routing table, and which are reachable from anywhere on the Internet. By convention, these name servers are advertised using Autonomous System Number 112, thus giving us a name for the project itself: AS112.
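For a concrete look at this junk traffic, consider the reverse-DNS name for a private address. A couple of lines of Python (standard library only) show the name in question; a query for it can only be answered usefully near its source.

```python
# The reverse-DNS name of an RFC 1918 address has no global meaning,
# so a query for it should be absorbed locally or by an AS112 node.
import ipaddress

addr = ipaddress.ip_address("10.1.2.3")
print(addr.is_private)       # True: this address is reused everywhere
print(addr.reverse_pointer)  # "3.2.1.10.in-addr.arpa"
```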
Moments after the apparent success of the AS112 Project, many voices were heard saying that we should publish the DNS Root Zone this same way. Sadly, the rationale given by many of these voices was that it was time to break the IANA’s control over the DNS name space, and that everyone on the Internet should be operating their own version of the DNS Root Zone, so that questions about which TLDs existed and who operated each TLD could be answered by “the invisible hand of the market” rather than by fiat. I spoke against these proposals, because the universality of the DNS name space was a special strength of the Internet, and the IANA’s stewardship had always been and would always be necessary to avoid what I called “DNS Balkanization”. My view on that subject has not changed. However, my view on the possibility of using unowned anycast for publication of IANA’s root zone data has been revised, and the reason for that revision is: Secure DNS (DNSSEC).
Secure DNS (DNSSEC)
During the years from 1996 to 2012, the Internet technical and policy communities wrestled with ways and means of securing the DNS. In its raw, non-secure form, it is nearly trivial to poison or modify DNS data in transit in a way that convinces some DNS client that some false answer is true. This can lead to small consequences like fetching a Web advertisement from the wrong server, or to large consequences like sending one’s banking credentials to the wrong web server, and it’s a gating item to truly using the Internet for all global commerce rather than just for retail transactions. You may say that 18 years is a long time for a technical team to work on a single task, considering that NASA put humans on Earth’s moon in less time. You’d be right, and many lessons were learned about how to modify the DNS specification and how to work together effectively while doing so. But the point of this story is: we did it. It is now possible for an RDNS server to know when somebody upstream is trying to fool it. This works by giving each RDNS server a copy of IANA’s public key, which can be used to verify signatures made with the corresponding private key; IANA holds that private key and signs the root zone with it, so that all participating RDNS servers can verify the signatures.
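The trust relationship can be sketched in miniature. Real DNSSEC has its own record types (DNSKEY, RRSIG, NSEC) and wire formats, so the following Python sketch, which assumes the third-party cryptography package, shows only the underlying idea: the signer holds a private key, every validator holds the matching public key, and tampering breaks the signature.

```python
# Sign-then-verify in miniature; not DNSSEC's actual record formats.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

private_key = Ed25519PrivateKey.generate()  # held only by the zone signer
public_key = private_key.public_key()       # distributed to every validator

zone_data = b"com. 172800 IN NS a.gtld-servers.net."
signature = private_key.sign(zone_data)

public_key.verify(signature, zone_data)     # silence means authentic
try:
    public_key.verify(signature, b"com. 172800 IN NS evil.example.")
except InvalidSignature:
    print("modified data detected and ignored")
```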
There’s far more to be said about Secure DNS, such as: that’s all well and good for the root zone and its TLDs, but what about all the other, deeper DNS names? Those questions are answered elsewhere. For the purpose of this article, it’s enough to know that unowned anycast for the root zone no longer creates a global poisoning risk in which any given company or university or ISP or nation or multinational region could create its own TLDs or change the meaning and ownership of existing TLDs. Secure DNS means that any attempt to modify DNS root zone content can be detected and ignored. And on that basis, it is time to explore the implications of having millions, rather than merely hundreds, of DNS root name servers on the Internet.
Root Name Service Everywhere
To make unowned anycast work for the root zone, it will be necessary for IANA to set aside some IP address space (two small IPv4 address blocks, and two small IPv6 address blocks), and to create some name server names (X.ROOT.IANA.NET and Y.ROOT.IANA.NET, for example), each having one address in an IPv4 prefix and one address in an IPv6 prefix. Then, IANA would craft a second copy of the root zone, having different apex NS RRs, and obviously having different signatures, but having no other differences from the traditional root zone, which has thirteen apex NS RRs. Two things are vital for complete understanding of this proposal: (1) we’re not adding any new root name server identities; there will be no new pins in the map. And (2), we’re not allowing anyone other than IANA to add or modify TLD definitions; IANA remains at the center of Internet trust and stewardship.
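As a sketch of what the second edition would contain, here is the difference rendered as Python data. The server names come from the example above; the addresses are placeholders drawn from documentation prefixes, since no actual allocation exists.

```python
# The two editions of the root zone differ only at the apex NS RRset.
TRADITIONAL_APEX_NS = [f"{letter}.root-servers.net." for letter in "abcdefghijklm"]

UNOWNED_APEX_NS = {  # placeholder addresses, not an IANA assignment
    "x.root.iana.net.": {"A": "192.0.2.1",    "AAAA": "2001:db8::1"},
    "y.root.iana.net.": {"A": "198.51.100.1", "AAAA": "2001:db8:1::1"},
}

print(len(TRADITIONAL_APEX_NS))  # 13 names in the traditional edition
print(sorted(UNOWNED_APEX_NS))   # 2 unowned-anycast names in the new one
# Below the apex, every TLD delegation and every signature would be the
# same IANA-managed data in both editions.
```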
An RDNS operator who wished to serve itself a copy of this new edition of the IANA root zone would simply start a second name server instance and add a few entries to the server’s IP routing table. This would make that RDNS server completely independent of the rest of the Internet, at least for the purpose of discovering the existence, reachability, and meaning of TLD names.
An exit-gateway operator for a LAN, or a department, or a campus, or an ISP, could operate a root name server and advertise its existence to others on that LAN, in that department, on that campus, or within that ISP. As in the local server case, this would eliminate external dependency on root name service for that IP routing domain.
Importantly, any nation or multinational region could fund and operate an unlimited number of root name servers within its borders, and advertise the reachability of these root name servers to every ISP, either directly or via Internet Exchange Points (IXPs). If done well, this configuration would make that nation (or region) totally independent of any root name server outside its borders.
As with AS112, there will be many servers answering to these root name server addresses which are globally visible and reachable, so that any network that does not have a local root service clone can still get the same great service that the thirteen traditional root name servers have always offered. In fact, any of the existing root server operators are free to add this new copy of the root zone to their operating plants, to continue and expand upon their long tradition of public service and their support of a single coherent IANA-managed DNS name space.
IANA would have one other task, which is to accept subscriptions to a “notify list”, whereby operators of these unowned anycast root name servers can ask to be told in real time of each change to the IANA root zone. Alternatively, they can each use the traditional method, where they poll for changes several times per day. The root zone is currently modified only once per day or less, so this kind of periodic polling would not unduly delay propagation of root zone changes.
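The polling alternative amounts to watching the root zone’s SOA serial number. A sketch, again assuming the third-party dnspython package; the server address here is a.root-servers.net, though a subscriber would more likely poll IANA’s publication servers.

```python
# Poll the root zone's SOA serial; a change means it is time to fetch
# a fresh copy of the zone. Requires the dnspython package.
import dns.message
import dns.query

def root_serial(server: str = "198.41.0.4") -> int:
    query = dns.message.make_query(".", "SOA")
    response = dns.query.udp(query, server, timeout=5)
    return response.answer[0][0].serial  # serial field of the SOA record

last_seen = 0  # would be remembered between polls
serial = root_serial()
if serial != last_seen:
    print(f"root zone changed (serial {serial}); re-transfer it now")
    last_seen = serial
```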
The intended result of implementing this proposal is to make root name service ubiquitous throughout the Internet, while still relying on the IANA as the global steward of a single coherent Internet name space. The architecture being proposed will scale to a future Internet of literally any size. Under this proposal, either a DNSSEC-aware RDNS or a DNSSEC-aware end host would find an authentic clone of the IANA root name service either on the same host, or on the same LAN, or in the same campus or ISP, or in the same country, and only rarely have to reach for a “root name server” using the global Internet.
An Apologia
In the first version of the Internet Draft on this topic, I wrote of the possibility of adding several new traditional root name server identities to the IANA system, based on the fact that minimum IPv6 packet sizes are larger than the IPv4 limits which originally capped the count at thirteen, and so in a future IPv6 Internet it would be theoretically possible to have more than thirteen designated root name servers. This text was confusing and misleading, and has led to a lot of unnecessary panic and controversy. Let me clarify my remarks here, even though the text in question will be removed from the next edition of the draft-lee-dnsop-scalingroot-??.txt Internet Draft.
While IPv6-only Internet access, including DNS root zone access, is a valid technology goal, it is a separate topic from this proposal. This proposal seeks to enhance the current dual-stack IPv6+IPv4 DNS root name service to include unowned anycast as a distribution and publication option. Furthermore, adding more root name servers in the traditional manner would both fail to solve the problem we know about (namely, the need for millions, not hundreds, of servers) and cause problems we don’t want to live through (like picking new root name server operators, or handling an ever-longer list of server addresses when retrying root name service queries).
I am therefore starkly opposed to adding more traditional root name servers, because the problem with the current root name server system is not that there are twelve server operators, but rather that there are not millions of server operators. I am working toward an Internet with millions of root name servers and name server operators, not merely an insignificant change from thirteen to twenty. I strongly support research into IPv6-only DNS root name service, as contemplated by draft-song-sunset4-ipv6only-dns-00, which would also be amenable to a secure unowned anycast approach. For example, IANA could generate a copy of its DNS root zone having apex NS records that are only served by IPv6, and IANA could sign that copy of the zone with the current root zone signing key, to ensure that IPv6-only networks and servers had no hidden IPv4 dependencies.
Conclusion
We can, using only existing protocols and existing software implementations, expand the number of root name server operators from twelve to millions, and expand the number of anycast root name servers from hundreds to millions. This effort would require moderate administrative work by IANA, to (1) create an otherwise-identical copy of the DNS root zone, having different apex NS records, but signed with the same root zone key, and (2) operate publication servers capable of serving millions of “stealth secondary” root name servers, and finally (3) operate a subscription service whereby these servers can ask for and receive NOTIFY messages concerning root zone changes.
This proposal is intended to be noncontroversial. The Internet loves local autonomy, and the centralization of the root zone service has been one of the Internet’s great weaknesses to date. While we must always have a single DNS name space having a single DNS root zone maintained by IANA and signed by IANA’s trusted Secure DNS key, it is no longer necessary that we give each root name server a unique address.
There’s a weakness to this proposal which seems to have been overlooked.
It’s possible to tell via DNSSEC when someone is serving you fraudulent information. But it is not likewise possible to tell when someone is withholding valid information. This proposal therefore permits intentional censorship of legitimate hosts, for example, by commercial interests for commercial purposes.
Example #1: I am ISP X who also sells a service X’ which competes with another company’s service Y’; I therefore pretend company Y’ does not exist for customers of my ISP service.
Example #2: I am a repressive government trying to pretend that certain inconvenient political speech does not exist; I therefore “fail” to find the sites hosting this speech when attempts are made to look them up from within my area of control.
Both of these seem to argue against doing this sort of distributed control until such time as it is possible to establish a trust relationship such that one can be sure there is no omitted information.
terry, i agree in principle that this kind of censorship would become possible, even though i know it’s already occurring in that some network operators intercept queries to the thirteen existing root name servers—so this won’t be a new problem.
nevertheless, in detail, you’re wrong. dnssec expresses secure denial of existence, and if an RDNS forwards a FOO.BAR query to the root servers because it doesn’t know where BAR is, then a censoring root server who receives this query would have to answer securely to deny the existence of BAR, including a signed proof that no names between (BAR) - 1 and (BAR) + 1 exist. this will not be possible for interlopers who lack IANA’s root signing key.
the effect will be the same—lookups in .BAR will fail. however, the RDNS will know that the failure is because of the bogusness of the response it heard, and not because the IANA root zone has signed proof of the nonexistence of .BAR.
the difference doesn’t matter operationally—lookups still fail. but procedurally, and for policy purposes, these two lookup failures are very different. and i can live with that.
vixie
Paul—nice job taking a dense topic and making it clear to this non-engineer. You make a logical case for what could be a dramatic change in how queries are resolved. All behind the scenes of course—nothing the end user surfer will see.
Question—is it early to be declaring complete victory for DNSSEC? Aren’t there steps remaining to close the chain of trust, particularly in the area of registrar policy?
christopher, there are many laggard registrars who don’t support ipv6 glue and/or dnssec. i always tell people, switch to someone who cares, like dyn or gandi. so, dnssec is within reach, now that com net org root and many cctld’s are signed.
however, none of that matters for the purposes of this unowned anycast proposal. the keys for tld’s come to iana via the iana/tld data path, which does not involve any registrar. so, all rdns servers should be running with dnssec validation enabled, because the root zone is signed, and the absolute minimum benefit you’ll receive is, validating root zone content.
thanks for your kind words.—vixie
Ah, right. Just the root TLD data. Thanks Paul. So, what are the chances of this happening?