Nicholas Thompson at Wired Blog sums up yesterday’s Wall Street Journal piece on Google. To summarize his summary,
• Google’s edge caching isn’t new or evil
• Lessig didn’t shift gears on NN
• Microsoft and Yahoo have been off the NN bandwagon since 2006
• The Obama team still supports NN
• Amazon’s Kindle support is consistent with its NN support
Yet . . . yet . . . I'm still looking for a good technical description of edge caching in general and of Google's OpenEdge in particular. I've been chasing the links in Rick Whitt's Google Policy Blog posting, but so far I don't have anything solid.
So I’ve been assuming that the edge cache that Whitt is describing is at the edge of the Internet, and that it is connected to “the cloud” just like any server or customer. Whitt as much as says that this is what Google is doing.
But what if there’s more going on?
What if Google were putting a server in, say, a head end or a central office in such a way that it faced the local customers connected to that head end or central office? In that case, Google would be in a very privileged position. It would be communicating with the cableco's or telco's customers NOT via the Internet, but via a single wire, the Last Mile.
The ability to put a caching server this close to the customer is powerful, especially with a fiber or VDSL or DOCSIS 3 distribution network. There are no bottlenecks between cache and customer. The content arrives NOT from the Internet but straight from the provider.
The advantage to the telco or cableco is that the incremental cost of Internet traffic to its end-user customers would be lower. A popular video would not need to travel over the Internet every time one of the local customers attached to the central office or head end requested it. Instead, it would be sent over the Internet to the cache once, then distributed many times to the customers connected through that central office or head end.
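As a purely illustrative sketch of that mechanic (the class, function, and URL below are hypothetical, not anything Google has described), a head-end cache would cross the Internet once per object and answer every subsequent local request over the Last Mile:

```python
# Toy illustration of head-end caching: fetch once over the Internet,
# then serve every later request from the Last Mile side.
# Hypothetical sketch of the idea only, not Google's actual OpenEdge design.

def fetch_from_origin(url: str) -> bytes:
    """Stand-in for an HTTP fetch across the public Internet."""
    return b"<video bytes for %s>" % url.encode()

class HeadEndCache:
    def __init__(self):
        self._store = {}          # url -> cached content
        self.origin_fetches = 0   # how often we actually crossed the Internet

    def get(self, url: str) -> bytes:
        if url not in self._store:                  # miss: one trip upstream
            self._store[url] = fetch_from_origin(url)
            self.origin_fetches += 1
        return self._store[url]                     # hit: served from the head end

cache = HeadEndCache()
for _ in range(1000):                               # 1,000 local subscribers want the same video
    cache.get("http://example.com/popular-video")
print(cache.origin_fetches)                         # -> 1: the video crossed the Internet once
```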
The disadvantage would be that the telco or cableco that owned the central office or head end would need to share a putatively proprietary advantage. It might be risking losing that proprietary advantage altogether.
This would be especially dangerous for a cableco.
Does a cableco or telco have a duty to let a co-locator into its head end or central office? Yes for telcos under the Telecommunications Act of 1996, but the whole notion of unbundled network elements has been so trashed that I don't know where it stands. And the situation is even murkier for cablecos. Of course any of this could change, depending on how issues like structural separation (between infrastructure and applications) play out.
Would such an arrangement be a Net Neutrality issue? Hmmm. Yes, if the pipes connecting the cache to the Internet had special properties; yes, if the telco/cableco dictated what kinds of apps could be cached; and yes, if the telco/cableco wouldn't let other players co-locate caching servers in central offices and head ends. But my feeling is that once the content is cached and served out to the customer at the other end of the Last Mile, it's video, not the Internet.
I am not sure what I think here. It’s a plausible scenario, but hypothetical. I invite the readers of this post to think this through with me . . .
Wow, that sounds exactly like what Akamai and other such organizations have been doing for years already. They simply say to an ISP, "Can we hang a box here? We give you money / it saves you transit and peering costs," et voilà, the box is in a rack as close to the customers as possible: good for the customers and good for the ISP.
I wonder why it would be so evil if any other company did exactly the same thing…
Also, please remember that for most distribution networks bandwidth is the issue, not latency. Putting a box at the DSLAM or cable head end rather than at the ISP's datacenter only saves a bit of latency; the fat pipe from there to the ISP's datacenter really couldn't care less.
Yes, I've often wondered through all this NN talk why so many other technologies prevalent on the Internet don't generate the same headlines as the Comcast/BitTorrent issue. Caching is one of them: it proxies for the actual content that resides on the source web server, posted by whoever is offering the content, and the content delivered by the cache could indeed be stale if the source had recently been updated. So a user may be getting something less than what they wanted, at least in theory. Is that alone a NN violation? It probably just depends on your perspective on NN as an ideal state versus the technical techniques available (and long since deployed) that make for a better Internet experience even if they fall into a NN grey area.
(There are other technologies as well, e.g., NAT, WAN optimization, load balancing, etc. that fall into this argument but let’s stick with the subject, caching.)
But as Jeroen points out, caching (and the other aforementioned technologies) is good for the customer, good for the ISP and (my opinion from here) typically not deployed with any malice or preconceived ulterior motives. The ISP just wants to make service better and cheaper. Yes, they are looking out for themselves to a degree, to reduce costs and enhance profits, but the user benefits as well, typically. It's not as sinister as so many (most???) NN advocates suggest. The realities of technological limitations combined with financial constraints and the need for a service provider to actually turn a profit are more in play than the conspiracy theories. There will always be territoriality, with incumbents trying to protect their legacy territory or services or move into new ones, but I think the sinister motives that are constantly pointed out by NN advocates are often exaggerations that lack real evidence.
Specifically on caching, there are typically two main reasons it is deployed:
1. Reduce the time it takes to deliver content to the end user. Rather than downloading the content from the source web server each time a user requests the same data, take advantage of the fact that it was downloaded once already and re-deliver it to the next guy from a closer source, the cache. Keep in mind that it's not just latency in the sense of the pure round-trip time from client to server and back on a packet-by-packet basis: as latency goes up, the total throughput that TCP can deliver goes down, because TCP backs off more and more. It's more noticeable in file transfers than in short HTTP transactions, but even HTTP transactions degrade as latency increases, and caching can improve the experience (see the first sketch after this list).
2. Reduce bandwidth costs. I'll agree that on the private network, backbone transport costs have come down enough that the savings isn't that great when deciding whether the cache should go closer to the customer edge or upstream near your gateways to your transit ISPs. But there are still per-Mbps (or per-Gbps) transit costs (on top of the physical circuit costs) that you have to contend with. It's real money, part of your OPEX that ultimately affects your retail pricing, and these costs vary by ISP and region of the world. Offloading content via caching, which typically yields about a 25-30% traffic reduction (when the law of large numbers applies), saves you that same percentage in circuit / transit OPEX, as the second sketch below illustrates. One would like to hope that at least some of that cost savings is passed on to the user, at least in the form of controlling price increases.
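To put rough, made-up numbers on reason 1: a single TCP connection's throughput is capped at roughly window size divided by round-trip time, so caching content closer raises that ceiling. A minimal sketch, where the window size and the three RTTs are illustrative assumptions rather than measurements:

```python
# Illustrative only: latency-limited TCP throughput = window / RTT.
window_bytes = 64 * 1024            # a common default receive window (no window scaling)

for rtt_ms in (100, 20, 5):         # distant origin, regional cache, head-end cache (assumed RTTs)
    ceiling_mbps = (window_bytes * 8) / (rtt_ms / 1000) / 1e6
    print(f"RTT {rtt_ms:3d} ms -> at most {ceiling_mbps:6.1f} Mbps per connection")
# RTT 100 ms -> ~5.2 Mbps; 20 ms -> ~26.2 Mbps; 5 ms -> ~104.9 Mbps
```

Window scaling raises the absolute numbers, but the inverse dependence on RTT stays.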
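And rough numbers for reason 2, using the 25-30% offload range cited above. The traffic volume and transit price below are invented placeholders, not quotes:

```python
# Back-of-the-envelope transit savings from caching; all inputs are invented.
peak_traffic_gbps  = 40           # hypothetical ISP peak toward its transit providers
transit_price_mbps = 10.0         # hypothetical $ per Mbps per month

for offload in (0.25, 0.30):      # the 25-30% cache-offload range cited above
    saved_mbps = peak_traffic_gbps * 1000 * offload
    print(f"{offload:.0%} offload -> ${saved_mbps * transit_price_mbps:,.0f}/month in transit")
# 25% -> $100,000/month, 30% -> $120,000/month at these made-up prices
```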
It generally doesn't make that much sense to put a cache engine in every CO or head end. There are too many of them; it becomes a big CAPEX hit to purchase and deploy, and then a lot of hardware to operate. Even if you don't put the caches at your ISP gateways but further downstream toward the customer, you are probably going to put them at an aggregation point where multiple DSLAMs or cable head ends are homed, and not actually in the CO/head end. Of course, Google has $$$ resources unlike most others, and the hardware and OPEX cost to run it may dwarf the value they obtain from the service.
There are also some differences between proactive content distribution and reactive transparent caching, and things like HTTP pre-fetch come into play as well. Furthermore, many cache engines, like BlueCoat's, are multi-functional boxes that combine caching with web filtering, virus scanning, QoS and WAN optimization among other features, all things that in theory might cause NN eyebrows to go up. Many of those features are intended for enterprise environments and not necessarily for service provider networks, but plenty are deployed there, and I continue to wonder why the throttling / traffic shaping done with Sandvine and Cisco products, as used by Comcast and others, draws so much scrutiny from NN proponents while other techniques such as caching, which are somewhat similar from a high-level NN perspective, do not.
A major advantage someone like Google gains by deploying caching as far into a service provider's network as possible is simply a view of all traffic, even traffic not directed at Google's search engine, in order to further index the Internet and gain an understanding of specific customers' content preferences, for targeted advertising or anything else within their goals. It would be like having a sniffer or NetFlow feed at each POP. A lot could be gained from that. But there are a lot of ways to deploy such a thing, so it would come down to what content was being cached and what traffic was being redirected to the cache – just HTTP, or video, or other content – and whether some traffic was treated differently than the rest.
Network neutrality isn’t the only issue known to man in this field, so let’s try to draw some reasonable boundaries around it. For example, the behaviour of a network service provider might be anticompetitive without being a violation of network neutrality. Not all misbehaviour is a network neutrality issue, and not all neutrality violations are misbehaviour.
Network neutrality is essentially a line of demarcation. If a network service provider is offering Internet connectivity, then we expect it to route IP packets with “best effort” accordingly. The line of demarcation is the IP header: everything beyond that point is the exclusive business of the endpoints, not the network providers, unless otherwise negotiated. Ideally, there is never any cause for the network service provider to engage in deep packet inspection (beyond this line of demarcation) at all, but we live in a less than ideal world.
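To make that line of demarcation concrete, here is a toy sketch (standard library only, with a hand-built packet) in which the forwarding decision reads nothing beyond the IPv4 header and the payload stays opaque:

```python
# Toy sketch: a "neutral" forwarding decision reads only the IPv4 header.
# Everything after the header is the endpoints' business.
import ipaddress
import struct

def forwarding_decision(packet: bytes) -> str:
    version_ihl, = struct.unpack_from("!B", packet, 0)
    header_len = (version_ihl & 0x0F) * 4            # where the IP header ends
    src, dst = struct.unpack_from("!4s4s", packet, 12)
    payload = packet[header_len:]                    # deliberately never inspected
    return (f"route {len(payload)}-byte payload from "
            f"{ipaddress.IPv4Address(src)} to {ipaddress.IPv4Address(dst)}")

# A minimal hand-built 20-byte IPv4 header followed by an opaque payload.
header = struct.pack("!BBHHHBBH4s4s",
                     0x45, 0, 40, 0, 0, 64, 6, 0,
                     ipaddress.IPv4Address("192.0.2.1").packed,
                     ipaddress.IPv4Address("198.51.100.7").packed)
print(forwarding_decision(header + b"opaque TCP segment"))
```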
So-called transparent web proxying and caching is one of these unhappy compromises. It is not a technically ideal solution, but it does arguably improve the performance of the HTTP protocol when it works. Furthermore, the technology has been refined to a state where the benefits arguably outweigh the disadvantages. It’s still less than ideal, but neither the client nor server has much cause for complaint, particularly since the caching can usually be bypassed by shifting the server to a non-standard port. Transparent proxying of this sort is arguably making up for a deficiency in the design of HTTP. Hopefully we’ll figure out a better protocol in the long run which performs well without this nasty hack.
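A small sketch of why the non-standard-port escape hatch works: a transparent cache typically diverts only traffic addressed to TCP port 80, so an origin server listening elsewhere is simply never intercepted. The port set below is an assumption for illustration; real deployments differ.

```python
# Illustrative interception rule of a transparent web cache: divert only
# well-known HTTP traffic; anything else passes through untouched.
INTERCEPTED_PORTS = {80}        # assumed policy for this sketch

def path_for(dst_port: int) -> str:
    return "divert to cache" if dst_port in INTERCEPTED_PORTS else "pass through"

print(path_for(80))    # divert to cache
print(path_for(8080))  # pass through -- the "shift the server to a non-standard port" escape
print(path_for(443))   # pass through -- encrypted traffic isn't usefully cacheable here anyway
```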
Other forgivable violations of network neutrality include certain anti-abuse measures, such as blocking outgoing TCP port 25. Ideally this is not necessary, but given the actual pattern of compromised computers, spam, and so on, it's arguably better for everyone that domestic Internet service be somewhat filtered, although the details are open to debate. (Some NN ideologists will argue that this filtering is never acceptable as a matter of principle, just as some defend spamming as a legitimate exercise of free speech.) ISPs should, of course, be entirely up-front about the fact that these blockages are in place. Abuse control can be a valid reason for violating network neutrality when no better alternatives exist. Where violation of neutrality is the best option, I believe it reveals an underlying design weakness in the Internet protocol stack.
The Comcast interference with various P2P protocols was unreasonable because bandwidth management could be effected at the IP level, without regard to the application. There have been some less than convincing arguments put forth that application-agnostic management was not in fact possible, due to limitations of DOCSIS or whatever else. In retrospect these arguments are even less convincing because Comcast has backed off its interference with these protocols, but in the unlikely event that such a problem can not be solved without crossing the network neutrality line of demarcation, the non-neutral interference should be fully disclosed in the terms of service.
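For what application-agnostic management at the IP level might look like, here is a hedged sketch of a per-subscriber token bucket keyed only on source address; the rate and burst figures are invented, and this is not a claim about any vendor's implementation:

```python
# Sketch of application-agnostic bandwidth management: one token bucket per
# subscriber IP. It never looks past the IP header, so BitTorrent, HTTP and
# video are all treated alike. Rates and burst sizes are invented.
import time

class TokenBucket:
    def __init__(self, rate_bps: float, burst_bytes: float):
        self.rate = rate_bps / 8                 # refill rate in bytes/second
        self.capacity = burst_bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def allow(self, packet_len: int) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= packet_len:
            self.tokens -= packet_len
            return True
        return False                             # drop or queue, regardless of application

buckets = {}                                     # subscriber source IP -> bucket
def admit(src_ip: str, packet_len: int) -> bool:
    bucket = buckets.setdefault(src_ip, TokenBucket(rate_bps=10_000_000, burst_bytes=100_000))
    return bucket.allow(packet_len)

print(admit("203.0.113.5", 1500))                # True until the subscriber exceeds its budget
```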
Again, bear in mind that network neutrality is not the only issue known to man. It may be acceptable for a network provider to openly announce that it is crossing the line of demarcation in some circumstances, whereas the same action may be unacceptable for reasons of antitrust in different circumstances.
In the matter of edge caching in the style of Google or Akamai, it seems that they are providing endpoints, not network service, and so the matter of network neutrality is irrelevant. I’m not familiar with the technical details of their respective approaches, but my first guess at how it’s done is that Google/Akamai/Whoever co-locate servers with the ISP, and create a local route with that ISP such that some of their network address allocation is locally mapped into the ISP’s network. Customers of that ISP are then routed (at the IP level) to the co-located servers rather than the ‘general’ servers. IP wasn’t designed with this behaviour in mind, but it would work in controlled conditions.
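If that guess is in the right ballpark, the mechanics would be ordinary longest-prefix-match routing: the ISP carries a more specific internal route for a slice of the content provider's address space, pointing at the co-located rack. A toy sketch with invented prefixes and next hops, not a description of how Google or Akamai actually do it:

```python
# Toy longest-prefix-match table: a more specific internal route steers the
# ISP's customers to the co-located cache; everything else follows the
# default transit route. All prefixes and next hops are invented.
import ipaddress

routes = [
    (ipaddress.ip_network("0.0.0.0/0"),        "transit uplink"),         # default: out to the Internet
    (ipaddress.ip_network("198.51.100.0/24"),  "transit uplink"),         # provider space reached over transit
    (ipaddress.ip_network("198.51.100.0/26"),  "co-located cache rack"),  # more specific route inside the ISP
]

def next_hop(dst: str) -> str:
    addr = ipaddress.ip_address(dst)
    best = max((net for net, _ in routes if addr in net), key=lambda n: n.prefixlen)
    return dict(routes)[best]

print(next_hop("198.51.100.10"))   # co-located cache rack (covered by the /26)
print(next_hop("198.51.100.200"))  # transit uplink (only the /24 and default match)
print(next_hop("203.0.113.9"))     # transit uplink (default route)
```

Customers inside the ISP fall under the more specific route and land on the local servers; the rest of the world still reaches the provider's general servers over transit.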
Assuming that my guess is at least in the right ballpark, there’s no network neutrality issue here at all. There may be questions of antitrust, or collusion, and there may be technical questions like, “can we find a more general solution to the problem that’s being addressed here?” Net neutrality, however, is not the issue.