I recently read an interesting post on LinkedIn Engineering’s blog entitled “TCP over IP Anycast – Pipe dream or Reality?” The authors describe a project to optimize the performance of www.linkedin.com. The web site is served from multiple web server instances located in LinkedIn’s POPs all over the world. Previously LinkedIn used DNS geomapping exclusively to route its users to the best web server instance, but the post describes how they tried using BGP routing instead. In a controlled experiment, they published a single anycast address corresponding to every instance of their web site worldwide and let BGP routing direct users to the best one. Performance wasn’t universally improved; in some cases it actually got worse. They ended up using a combination of DNS and BGP routing to direct users to the best place. Rather than use a single anycast IP address for every instance worldwide, they grouped their POPs by geography and assigned an anycast address to all the instances within a region. When a user visits www.linkedin.com, DNS geomapping techniques direct her to the appropriate anycast IP for her region, and then BGP routing chooses the path to the best web server instance in that region. LinkedIn’s before-and-after performance measurements show the experiment was a success.
This post raises an obvious question: are techniques using DNS to “steer” traffic (such as IP geolocation) sufficient, or do you need to consider using anycast as LinkedIn did? The short answer is that DNS steering works well and is only getting better.
LinkedIn’s situation is special: they run the 14th busiest web site in the world according to Alexa. They have the resources and engineering talent to build their own worldwide network of POPs to serve all that traffic. Few companies can do the same, nor do they need to. The benefits of anycasted web content only start to matter in a situation such as LinkedIn’s, with a large number of content-serving instances. Most companies have content distributed to far fewer sites and will be much better served by using DNS steering techniques.
Let’s address some of the specific issues that LinkedIn cites regarding using DNS to direct users. First, they point out the lack of visibility into the actual user’s IP address when making steering decisions. Recall that end user devices send DNS queries to a recursive nameserver, usually run by their ISP or on a corporate network. The recursive server queries the authoritative server (in this example, the nameserver for linkedin.com) on the user’s behalf, and thus any DNS steering decisions in the authoritative server are made based on the recursive server’s address, not the end user’s address. Usually the recursive server is close to the user, but not always, especially in the case of large public DNS providers, such as Google Public DNS or OpenDNS.
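To make the visibility problem concrete, here is a minimal sketch (in Python, with made-up addresses and helper names, not any production nameserver code) of geo-steering logic in an authoritative server: the only address it can key on is the source IP the query arrived from, which is the recursive resolver’s, not the user’s.

```python
# Hypothetical geo-steering logic in an authoritative nameserver.
# POP_ADDRESSES and region_for_ip() are illustrative stand-ins.

POP_ADDRESSES = {
    "us-east": "192.0.2.10",
    "eu-west": "198.51.100.10",
    "ap-south": "203.0.113.10",
}

def region_for_ip(ip: str) -> str:
    """Placeholder for an IP-to-region lookup (e.g. a geolocation database)."""
    return "us-east"

def answer_for_query(query_source_ip: str) -> str:
    # query_source_ip is the recursive resolver's address, possibly a
    # public resolver node nowhere near the actual user.
    region = region_for_ip(query_source_ip)
    return POP_ADDRESSES[region]

print(answer_for_query("8.8.8.8"))  # steers on the resolver, not the user
```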
The good news is that the DNS engineering community has known about the issue for a long time and there’s a solution: EDNS Client Subnet, or ECS. This DNS protocol extension allows the recursive server to pass the user’s subnet address to the authoritative server, finally giving the authoritative server visibility to the actual end user address. (The subnet is sent rather than the specific IP address for privacy reasons.) ECS is winding its way through the IETF standards process and has already seen wide deployment: major DNS providers such as Google and OpenDNS—the ones whose users are most geographically distributed—already support it. So the issue of the authoritative server not knowing the end user’s address to make accurate steering decisions is going away quickly.
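For illustration, here is roughly what an ECS-carrying query looks like using the dnspython library (the client subnet and resolver address below are just examples; in normal operation it is the recursive resolver, not the end user, that adds this option when talking to the authoritative server).

```python
# A sketch of attaching an EDNS Client Subnet option with dnspython
# (pip install dnspython). Only the subnet is sent, not the full address.
import dns.edns
import dns.message
import dns.query

ecs = dns.edns.ECSOption("203.0.113.0", srclen=24)  # the user's /24

query = dns.message.make_query(
    "www.linkedin.com", "A", use_edns=0, options=[ecs]
)

# 8.8.8.8 is Google Public DNS, one of the resolvers that supports ECS.
response = dns.query.udp(query, "8.8.8.8", timeout=5)
print(response.answer)
```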
The second DNS-related issue LinkedIn mentioned was accuracy of IP geolocation databases. It doesn’t do much good if the authoritative server has visibility to the end user’s actual address but the geographic mapping for that address is incorrect. There are several commercially available IP geolocation databases and they all have their faults, which is why Dyn has built our own to power our products. We start with commercial data but then augment and refine with various patent-pending proprietary techniques.
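As a rough sketch of what such a lookup involves (our own database isn’t something I can show here, so this uses MaxMind’s freely downloadable GeoLite2 City database via the geoip2 Python package as a stand-in; the .mmdb path and client address are assumptions):

```python
# Illustrative IP-to-location lookup with the geoip2 package and a
# GeoLite2 City database file downloaded separately from MaxMind.
import geoip2.database

client_ip = "203.0.113.45"  # e.g. an address from the ECS-supplied subnet

with geoip2.database.Reader("GeoLite2-City.mmdb") as reader:
    result = reader.city(client_ip)  # raises AddressNotFoundError if unknown
    print(result.country.iso_code)
    print(result.location.latitude, result.location.longitude)
```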
Ultimately using DNS to steer users to content gives you the most flexibility. When you rely on anycast and BGP routing, your options for control are limited. You’re relying on the routing policy of other people’s networks and some things are just outside your control. For example, commercial arrangements and disagreements between ISPs can cause traffic to take suboptimal paths. But with DNS, you can use any criteria you want to route traffic. IP geolocation is popular because it performs well, but there are other options. For example, Dyn offers a real user monitoring (RUM) service that measures CDN and web site performance from inside the web browser of actual users (hence the name).
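As a simplified illustration (not our actual RUM implementation), steering on real user measurements can boil down to answering each network’s queries with whichever endpoint that network’s own users have measured as fastest; the sample data and hostnames here are invented.

```python
# Hypothetical RUM-style steering: pick the endpoint with the lowest
# median latency as measured from browsers in a given user network.
from statistics import median

# latency samples in milliseconds, reported per candidate endpoint
RUM_SAMPLES = {
    "pop-us-east.example.net": [42.0, 45.0, 39.0, 51.0],
    "pop-eu-west.example.net": [118.0, 121.0, 115.0],
    "cdn-provider.example.net": [35.0, 60.0, 33.0, 38.0],
}

def best_endpoint(samples: dict) -> str:
    return min(samples, key=lambda host: median(samples[host]))

print(best_endpoint(RUM_SAMPLES))  # the answer handed back for that network
```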
So while adding anycast to your web site can offer good performance, ultimately DNS steering gives you the most flexibility.
matt, i think linkedin’s analysis is good as far as it goes, and that yours is also, but that neither of you come close to what ibm was doing with websphere’s CDN mapping starting more than ten years ago.
anycast tcp has an obvious risk in that a bgp route change can break the session. some companies (3tcp for example) work around this by flood-filling tcp state to all members of the anycast cluster—world wide!—in case a segment comes out of the clear blue sky due to a bgp route change. but ibm didn’t do this; they found a simpler hack.
if you accept the initial session by global (not regionalized) anycast, and from history about the client’s ip prefix, send a 302 redirect toward a more-specific URL “hostname” which is not anycast in any way, you get the best of both worlds.
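a rough sketch of that flow, with made-up hostnames and an invented prefix table (python; not anything ibm or linkedin actually runs):

```python
# hypothetical front end on the global anycast address: look up the
# client's prefix and 302 them to a regional, non-anycast hostname.
from http.server import BaseHTTPRequestHandler, HTTPServer
import ipaddress

# invented history of which regional hostname serves each prefix best
PREFIX_TO_HOST = {
    ipaddress.ip_network("203.0.113.0/24"): "us-east.www.example.com",
    ipaddress.ip_network("198.51.100.0/24"): "eu-west.www.example.com",
}

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        client = ipaddress.ip_address(self.client_address[0])
        target = next(
            (host for net, host in PREFIX_TO_HOST.items() if client in net),
            "us-east.www.example.com",  # fallback if the prefix is unknown
        )
        self.send_response(302)
        self.send_header("Location", "https://%s%s" % (target, self.path))
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8080), RedirectHandler).serve_forever()
```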
i know that linkedin hasn’t thought of this. but it seems like something Dyn would try, if for no other reason than to rule it out?
what you’re calling “dns steering” here has additional complexity costs, as you know. if you know where you want to redirect somebody, then doing it in their initial tcp session to your web service cluster is a little later than doing it in the initial dns transaction, but both approaches benefit from (and require the benefits of) session-level binding.
paul
We described another production anycast-based CDN routing scheme here... https://www.usenix.org/conference/nsdi15/technical-sessions/presentation/flavel

The proposal you have, Paul, on doing a 302 redirect is a perfectly acceptable thing to do in cases where a) the redirection cost (i.e. latency) is negligible compared to delivering the content and b) you don't mind exposing the redirected DNS name in the browser (which can then be copied and pasted and emailed to friends around the world). Often a metric considered when building a latency-sensitive service is the time-to-first-byte, and a 302 redirect makes this significantly longer. However, if you see a lot of TCP resets on long-lived connections (and as far as I know there is no evidence this is a widespread problem), this is an ideal solution, as the TTFB doesn't matter as much - and you could do it just for the large objects such as video.

A big benefit in building an anycast-based system is the simplicity of the global system - no need to share global state about load or "health" - each PoP can act independently. Building a hybrid anycast and unicast model is always possible to get the best of both worlds (and the one you suggest, Paul, is likely the simplest I can think of) - but if you build one that sometimes returns an anycast IP and sometimes returns a unicast IP at the DNS layer, you get the corner cases of both solutions (meaning you are the one getting up in the middle of the night to troubleshoot it, as no one else can understand it :)).

I also agree that unless you are a relatively big player, a basic DNS proximity service like Dyn, Azure Traffic Manager or Route53 is perfectly acceptable for global traffic management (especially if you run in a public cloud where anycast isn't available - although Google's HTTP Load Balancer sounds like it might be using anycast (https://cloud.google.com/compute/docs/load-balancing/http/)).

That's my 2c anyway :)