Thoughts on the Open Internet - Part 4: Locality and Interdependence

	By Geoff Huston Author & Chief Scientist at APNIC
	October 08, 2015

The Internet was not originally designed as a single network that serviced much of the world’s digital communications requirements. Its design was sufficiently flexible that it could be used in many contexts, from small network domains that were not connected to any other domain through to large, diverse systems with many tens of thousands of individual network elements. If that is indeed the case, then why is it that when networks wish to isolate themselves from the Internet, or when a natural calamity effectively isolates a network, the isolated network is often non-functional? Where is the interdependence in the Internet that binds each network component into the whole, and what efforts are being made to reduce this level of interdependence?

Locality and GeoLocation of Names and Addresses

Names and addresses on the Internet are not intrinsically tied to any particular physical location, nor any country or region.

While a class of domain names is associated with individual countries, namely the country code Top Level Domains (ccTLDs), it is a matter of policy for the administrators of these TLDs whether the registration of subdomains within these ccTLDs is limited in any way to entities that reside in the associated country, or whether the services named within a ccTLD name space are restricted in some fashion to refer to services located in that country. Domain names can be seen as symbolic names that refer to attachment interfaces to a network, through the mapping of a domain name to an IP address. It is the location of the device using that IP address that is the actual geolocation of the address and, ultimately, of the DNS name.

An IP address does not contain any internal structure that identifies a country or particular locale, nor does it intrinsically identify a network operator. There is no single, authoritatively maintained database that maps IP addresses to the geographic location where each address is being used.

The Regional Internet Registries, which administer the address allocation process, maintain registries that record allocated addresses and the country where the address holder is located. Of course that is a slightly different definition of location than the country where the device that uses an address is located, but in most cases these are the same.

Private interests have constructed more detailed databases that attempt to provide location of addresses to finer levels, and similar studies have been undertaken by academics with public inputs.

These approaches cannot map the entire address space of course. Shared addresses used in private contexts (such as the widely used 192.168.0.0/24) have no particular location. Addresses used in satellite-based access systems similarly have no fixed locale on the earth’s surface. Addresses used by operators of mobile data services generally can be mapped to countries, but resolution to finer levels may depend on the address management practices used by the mobile operator. If the access network operator uses address sharing, such as a Carrier Grade NAT (CGN), then the physical location of the address may not be clearly established, and it may only be possible to resolve the location to a country or region, depending on the internal structure of the network and its operational practices in address sharing.
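
As a rough illustration of why some addresses cannot be located at all, the following Python sketch sorts an IPv4 address into three classes before any geolocation lookup is attempted. The function name and the class labels are illustrative assumptions. The blocks listed are the RFC 1918 private ranges and the RFC 6598 shared address space that is commonly used on the inside of a CGN.

```python
import ipaddress

# RFC 1918 private ranges: reused in countless unrelated networks.
PRIVATE_BLOCKS = [
    ipaddress.ip_network(prefix)
    for prefix in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]
# RFC 6598 shared address space, commonly used between a CGN and its customers.
CGN_SHARED_BLOCK = ipaddress.ip_network("100.64.0.0/10")


def locatability(address):
    """Classify an IPv4 address (illustrative labels, not a standard)."""
    ip = ipaddress.ip_address(address)
    if any(ip in block for block in PRIVATE_BLOCKS):
        return "private"      # no particular location
    if ip in CGN_SHARED_BLOCK:
        return "cgn-shared"   # location depends on the operator's internals
    return "public"           # a geolocation database may have an answer


if __name__ == "__main__":
    for sample in ("192.168.0.10", "100.64.1.1", "192.0.2.1"):
        print(sample, locatability(sample))
```

An address classed as "public" here is only a candidate for lookup; whether a database can place it accurately is a separate matter.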

With a sufficiently coarse level of resolution, such as that of location to countries, a number of databases exist, both public and private, that map addresses to countries with a reasonable level of accuracy. Finer levels of resolution are available in certain countries and for certain address ranges.
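
A country-level lookup of this kind can be sketched as a longest-prefix match over a table of prefixes. The table below is entirely hypothetical: the prefixes are documentation ranges and the country codes are invented for the example, so this shows the shape of the lookup rather than any real database.

```python
import ipaddress

# Hypothetical registry extract: (prefix, country code).
# Documentation prefixes (RFC 5737) with invented country codes.
REGISTRY = [
    ("192.0.2.0/24", "AU"),
    ("198.51.100.0/24", "NL"),
    ("198.51.100.128/25", "DE"),
]


def country_for(address, registry=REGISTRY):
    """Return the country code of the most specific covering prefix, or None."""
    ip = ipaddress.ip_address(address)
    best = None  # (network, country) of the longest match seen so far
    for prefix, country in registry:
        network = ipaddress.ip_network(prefix)
        if ip in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, country)
    return best[1] if best else None


if __name__ == "__main__":
    for sample in ("198.51.100.5", "198.51.100.200", "203.0.113.9"):
        print(sample, country_for(sample))
```

The more specific /25 entry overrides the covering /24, which is how a finer-grained record for one address range can coexist with a coarser one. An address with no covering prefix simply has no answer.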

This locality information, which associates IP addresses with locations, is used to inform much of the operational work when implementing policies about locality of online services and infrastructure.

Locality in Routing and Traffic

Locality in traffic flows can be a critical issue for users. The shorter the network path between communicating end points, the lower the time taken to send a packet and receive a response (the so-called “round trip time”). The lower the round trip time, the faster a flow controlled protocol can sense the equilibrium flow rate for a traffic stream, and the greater the Transmission Control Protocol (TCP) carriage capacity, all other things being equal. If shorter paths across the network produce better outcomes for users in terms of perceived performance of network transactions, and such shorter paths allow the transport protocols to make more efficient utilisation of the network by the traffic flows triggered by these transactions, then how are these localised network paths realised?
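
To make the effect of round trip time concrete, consider a sender that keeps at most one window of data in flight: it cannot deliver more than the window size divided by the round trip time. The sketch below assumes a fixed 65,535 byte window (the largest TCP window without window scaling) and no packet loss. The function name is illustrative, and real TCP behaviour is more involved than this simple bound.

```python
def window_limited_throughput_bps(window_bytes, rtt_ms):
    """Upper bound on throughput, in bits per second, for one window per RTT."""
    if rtt_ms <= 0:
        raise ValueError("round trip time must be positive")
    return window_bytes * 8 * 1000 / rtt_ms


if __name__ == "__main__":
    window = 65535  # bytes, assumed fixed for the illustration
    for rtt_ms in (10, 200):
        rate = window_limited_throughput_bps(window, rtt_ms)
        print(f"RTT {rtt_ms} ms: at most {rate / 1e6:.2f} Mbit/s")
```

Under these assumptions a 10 ms local path allows roughly 52 Mbit/s per connection, while a 200 ms intercontinental path allows roughly 2.6 Mbit/s with the same window.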

The behaviour of the routing system is the first place where there is a natural bias towards finding the shortest possible paths in the network. Most automatic routing protocols take an arbitrary interconnection graph between switching elements and select candidate paths that represent the shortest path through the network. (If every switch-to-switch link is assigned a “cost”, then these routing protocols reach a converged state by computing the shortest possible path through the network, where the path cost is simply the sum of the individual link costs.) Such link based routing protocols are used within individual network domains. Between such routing domains the Internet uses the Border Gateway Protocol (BGP) as its interdomain routing protocol. Rather than looking at network paths at a level of detail that considers each path as a sequence of individual point-to-point links, BGP looks at the network as a set of interconnections between component networks, and each path through the aggregate network is defined as a sequence of networks. BGP is a shortest path selection routing protocol, and it selects candidate paths to use that transit the fewest possible networks.
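
The link cost computation described above can be sketched with Dijkstra's algorithm, which is one common way to find a lowest cost path. The four node topology and its costs are invented for the example, and the code is not a model of any particular routing protocol.

```python
import heapq


def shortest_path(links, source, destination):
    """links maps node -> {neighbour: cost}. Returns (cost, path) or (None, [])."""
    queue = [(0, source, [source])]  # (path cost so far, node, path taken)
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == destination:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbour, link_cost in links.get(node, {}).items():
            if neighbour not in settled:
                heapq.heappush(queue, (cost + link_cost, neighbour, path + [neighbour]))
    return None, []


# Invented topology: the direct A-C link is more expensive than going via B.
LINKS = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"C": 1},
}

if __name__ == "__main__":
    print(shortest_path(LINKS, "A", "D"))
```

The path cost is the sum of the link costs along the path, so the route from A to D goes through B and C at a total cost of 3 rather than over the direct but costlier A-C link.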

Of course a routing protocol cannot create new inter-domain connections, and if the shortest possible path between two networks that are not locally interconnected is a path that traverses an arbitrary number of external networks, then BGP will select such a path.

One way to assist the routing protocols to select localised paths is to enrich the local network interconnection at the physical layer. One of the more efficient ways to achieve this is through use of a local traffic Internet Exchange (or “IX”). An exchange can be seen as a switching mesh: an individual network that connects with a single connection to an exchange can create virtual point-to-point connections to all other parties who are also present at the same exchange. In this scenario all parties connected to an exchange point can directly exchange traffic with all other parties at the exchange without the traffic traversing third party networks. Many exchanges also permit selective forms of interconnection, where each pair of networks represented at the exchange can determine whether they interconnect, and the commercial terms of such an interconnection, independently of any other connections that they may have set up at the exchange. Exchanges are commonly operated at a city level, or a national level. A smaller number of exchanges operate with a large number of local and international providers, essentially acting as regional connection hubs. Examples of these are at London (LINX), Amsterdam (AMS-IX) and Frankfurt (DE-CIX).
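
The effect of a local exchange on path selection can be sketched by treating each network as a node and finding the path that crosses the fewest networks. This is a deliberate simplification: real BGP applies local policy before it compares path lengths. The topology is invented, and the AS numbers are taken from the range reserved for documentation.

```python
from collections import deque


def fewest_networks_path(adjacency, origin, target):
    """Breadth-first search for the path that crosses the fewest networks."""
    queue = deque([[origin]])
    visited = {origin}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for neighbour in adjacency.get(path[-1], ()):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Two local access networks (AS64500, AS64501) that each buy transit
# from a different external network (AS64510, AS64511).
WITHOUT_EXCHANGE = {
    "AS64500": ["AS64510"],
    "AS64510": ["AS64500", "AS64511"],
    "AS64511": ["AS64510", "AS64501"],
    "AS64501": ["AS64511"],
}

# The same topology after both local networks peer at a local exchange.
WITH_EXCHANGE = {name: list(peers) for name, peers in WITHOUT_EXCHANGE.items()}
WITH_EXCHANGE["AS64500"].append("AS64501")
WITH_EXCHANGE["AS64501"].append("AS64500")

if __name__ == "__main__":
    print(fewest_networks_path(WITHOUT_EXCHANGE, "AS64500", "AS64501"))
    print(fewest_networks_path(WITH_EXCHANGE, "AS64500", "AS64501"))
```

Without the exchange, traffic between the two local networks crosses both external transit networks. With the peering in place the direct path is the shortest, and the traffic stays local.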

Exchanges operate at a number of levels. They allow access providers to interconnect customer traffic without the use of a transit provider acting as a middleman. They allow access providers to “see” a range of competing transit providers, bringing competitive pressures to bear upon the transit role. Given this rich collection of connectivity, exchanges are also powerful attractors for content distributors. A content distribution network located at an exchange can directly access a far larger number of access networks and their end user populations, and can often do so at a lower cost than remote access that is arbitrated by a transit network’s services.

Exchanges can assist in keeping network traffic between local end points local by facilitating rich interconnectivity between local access providers and transit and content providers.

Localisation may also be an outcome of regulatory actions, where specific regulatory measures prevent the “exporting” of data beyond national boundaries.

There are also tensions between localisation and various commercial pressures. Conventionally, a structure of dense localisation of interconnection would be expected to reduce costs for all parties, but at times local circuits may incur higher costs for a network operator, or the use of local sender-keep-all commercial interconnection arrangements may expose one service provider’s network assets to its competitors without due financial compensation. In such cases, the outcome is local fracturing of the network, and external connections and services consequently become an essential component of ensuring local inter-connectivity.

It may also be the case that true locality of traffic flows is not the same as the IP-centric view of the locality of a traffic flow. While at the IP level the start and end points, and even the intermediate IP switching points, may be mapped to geographical locations that sit within the bounds of some locale, such an IP-centric view of the network may not necessarily reveal intermediate encapsulations, such as Multi-Protocol Label Switching (MPLS) or other forms of IP tunnelling, including IPv6-in-IPv4 transition tunnelling mechanisms. When the location of the basic paths of the carriage system and the carriage level switching equipment is revealed, it is always possible that the physical path of the traffic flow does not match the logical view provided at the IP level.

Locality and the DNS

While there is a desire to restrain traffic flows so as to ensure that Internet traffic between users in a particular scope or locale remains within the realms of a certain locale, country or region, there is a similar consideration relating to DNS queries.

The DNS is an intrinsic part of almost every form of user transaction on the Internet, and the choice of resolver used to perform DNS resolution is one aspect of locality and interdependence. Users may use DNS resolution tools that make use of non-local DNS resolvers. The interaction of this practice with DNS-based approaches to content filtering is a consideration here, as DNS-blocking measures intended to support local content filters are circumvented by the use of non-local DNS resolvers. The use of such non-local DNS resolvers also creates an external dependence that would be broken if the local network were to lose its route to the non-local resolver.

There is also the consideration of the flows of meta-information. DNS queries can be analysed to provide near real time information about the online behaviour of users. The use of non-local DNS resolvers potentially results in this information being passed across national borders into regimes that may operate with different frameworks for the personal data of individuals who are not citizens of that country.

The issue here is that the DNS was not designed with privacy in mind at the outset, and the DNS is notorious for leaking information to third parties.

The problem lies in the conventional practice of passing the full domain name of the query to parent zones when the desired response is the name servers of the child zone. It is slightly less efficient, but far more secure to minimise the query string when resolving a name. This is the topic of current study by the DNS Operations Working Group of the IETF.
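
The difference between the two practices can be sketched by listing which name each zone's servers are asked about during resolution. This sketch assumes, for simplicity, that there is a zone cut at every label, which is not true of all names. The function name is illustrative, and the approach is known in the IETF work as QNAME minimisation.

```python
def resolution_queries(name, minimise):
    """Return (zone asked, name revealed) pairs for resolving `name`.

    Assumes a zone cut at every label, which is a simplification.
    """
    labels = name.rstrip(".").split(".")
    full_name = ".".join(labels) + "."
    queries = []
    for depth in range(len(labels)):
        # The zone being asked holds the last `depth` labels (root when 0).
        zone = ".".join(labels[len(labels) - depth:]) + "." if depth else "."
        # Minimised: reveal only one more label than the zone already serves.
        one_more = ".".join(labels[len(labels) - depth - 1:]) + "."
        queries.append((zone, one_more if minimise else full_name))
    return queries


if __name__ == "__main__":
    for minimise in (False, True):
        print("minimised" if minimise else "conventional")
        for zone, asked in resolution_queries("www.example.com", minimise):
            print(f"  servers for {zone!r} are asked about {asked!r}")
```

In the conventional case the root and TLD servers both see the full name. In the minimised case each level learns only the next label it needs to provide a referral.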

Sample-based measurements show that the extent of this use of non-local DNS resolvers appears to range from 1-2% of users in some countries to in excess of 60% of users in other countries.

Another aspect of the DNS concerns interdependence. The DNS name space is a rooted hierarchy, and the DNS resolution protocol makes the assumption when resolving a name that critical parts of the DNS are available to answer queries. In particular this concerns the availability of the DNS root zone servers (but also extends to the authoritative servers for other popularly queried TLDs). The widespread use of anycast servers for the DNS root zone has improved the performance of the DNS in terms of the time taken to resolve a DNS name, but these resolvers need to regularly refresh the content of their local cache with the content of the primary server. The implication here is that isolation of local DNS resolvers from the authoritative servers that serve the root zone of the DNS will eventually mean that the local servers will cease to answer queries during an extended hiatus of connectivity.

“Anycast” is a deliberate approach to deploying multiple servers using the same IP address at diverse locations. As long as each anycast instance responds in precisely the same manner to queries, anycast is an efficient method to improve the performance and robustness of a service. With anycast, a user’s query is directed to the “closest” instance of an anycast server constellation. What is “close” in this context is the outcome of the routing protocol’s selection of the shortest path. If a server in the anycast constellation fails, then as long as the individual anycast route is withdrawn, the local traffic that would normally be directed to this instance is directed elsewhere. Anycast also assists in DDoS mitigation. If the attack originates from a single source, or a small set of related sources, then the attack will be directed to a single instance of the anycast server constellation and all other servers will operate without interruption. A broad scale distributed source attack will be spread across the anycast constellation, so that any individual server will experience only a small fraction of the total DDoS volume.
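
The anycast behaviour described above can be sketched by modelling each source as a table of path costs to every instance and letting "closest" mean lowest cost. The site names, costs and traffic volumes are invented for the example, and real instance selection is made by the routing system rather than by the source.

```python
def nearest_instance(path_costs, withdrawn=frozenset()):
    """Pick the reachable instance with the lowest path cost, or None."""
    reachable = {site: cost for site, cost in path_costs.items() if site not in withdrawn}
    if not reachable:
        return None
    return min(reachable, key=reachable.get)


def load_per_instance(sources):
    """sources: iterable of (path_costs, volume). Sum volume at each nearest site."""
    load = {}
    for path_costs, volume in sources:
        site = nearest_instance(path_costs)
        if site is not None:
            load[site] = load.get(site, 0) + volume
    return load


if __name__ == "__main__":
    costs = {"SYD": 2, "LAX": 4, "AMS": 5}  # invented path costs from one source
    print(nearest_instance(costs))                     # normal operation
    print(nearest_instance(costs, withdrawn={"SYD"}))  # SYD route withdrawn

    # One large source lands on a single instance ...
    print(load_per_instance([(costs, 300)]))
    # ... while the same volume from three dispersed sources is spread out.
    dispersed = [
        ({"SYD": 1, "LAX": 6, "AMS": 9}, 100),
        ({"SYD": 7, "LAX": 1, "AMS": 5}, 100),
        ({"SYD": 9, "LAX": 5, "AMS": 1}, 100),
    ]
    print(load_per_instance(dispersed))
```

Withdrawing the route to one instance moves its traffic to the next closest site, and a distributed attack places only a share of the total volume on each instance.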

Locality and Content

Locality in the network extends beyond access and transit networks and the mechanisms for controlling traffic flows to the question of what content is accessible by which users.

While the intended operation of the Internet was to blur geography and create a network where location was not a visible part of the user experience per se, the questions relating to the locale of users and content remain. Can content be customized to locales? What information is available to generate this locale tagging?

The externalities of the commercial arrangements relating to content distribution may impose a limit on access to certain content to users located in certain locales. This can take the form of IP address geo-location, where the locale of the user is guessed by consulting a geographic location database with the remote side IP address making the connection to the content.

Alternatively, this can be performed by the DNS name resolution, where the IP address of the resolver querying for the DNS name of the content is used to look up a geographic location database. The response provided by the authoritative name servers to the query is based on the assumed location of the user, as the assumption here is that the user and their DNS resolver are closely located. With the increasing popularity of common open DNS resolver services, such as Google’s Public DNS and OpenDNS, such assumptions no longer always hold. The resolver that puts a query to an authoritative name server may not be located close to the original user who made the query.

Such approaches to limit content to defined geographic locales generate a response from some users who want to bypass such geolocation blocking of content. In an open environment such as the Internet there are providers of services that are intended to address precisely this problem and virtually relocate the user’s device to a network location within the desired geographic region. Such approaches typically involve the use of secure VPN technology, where all the user’s traffic is carried inside an encrypted IP tunnel to emerge into the Internet within the desired locality. A side effect of the increased adoption of such advanced tunnelling solutions to access local content is an increased level of use of encryption for user traffic.

It is evident that there is no clear-cut absolute solution for a content provider to unambiguously determine precisely where an end user is physically located, and efforts by a content provider to enforce geographically based differential outcomes for content distribution create a secondary market in responding to these measures. The increasing sophistication of the measures and their responses in effect fuels demand for increased levels of user anonymity. This, in turn, frustrates many of the conventional measures used by law enforcement agencies, as it fuels the creation of an online environment where individual actions by users are effectively anonymous and cannot be readily mapped back to a physical identity or location.

By Geoff Huston, Author & Chief Scientist at APNIC — (The above views do not necessarily represent the views of the Asia Pacific Network Information Centre.)