In a recent article, I shared the idea that there is a new category of network architecture, the Network of Probabilities. This differs from both classical circuits (Network of Promises) and best-effort packet data (Network of Possibilities). I personally believe it’s the next revolution in telecoms. What’s new is that it provides a trading space for allocating contention between flows, and does this with some novel applied mathematics.
A bit like the progression from 2G to 3G to 4G wireless data encoding, better mathematics can squeeze a lot more out of fixed networks. Indeed, we could say there is an equivalent generational progression in multiplexed fixed networks, from TDM to ATM to IP. In this post, I’d like to lead you a little further along my own journey of enlightenment to the fourth generation of fixed networking, called Contention Management. It’s feeling lonely out here right now.
The hard thing to do is to let go of your intuitive beliefs about ‘bandwidth’ in networks. Packet networks do not have ‘bandwidth’, just as the sun is not made of ‘shine’. We mustn’t mistake metaphors for reality. Indeed, to get more out of networks, we must move beyond bandwidth-based thinking, which is only an approximation to how networks really behave, and adopt a more robust model that captures both quantity and quality effects.
When better bandwidth is bad
Here are three examples of what happens in real networks when you apply naïve bandwidth thinking to packet networks like the Internet.
Example 1: Your network is working fine, has lots of bandwidth available, but the users keep reporting short outages and poor bandwidth. What’s going on?
Example 2: Imagine you have a standard 20 Mbit/sec DSL line from a central exchange to your home. One day, your telco comes along and ‘upgrades’ you. Now you have a 1 Gbit/sec fibre to a street cabinet, and then say 50 Mbit/sec copper onwards to your home. The fibre is fast, and your copper loop is shorter, so bandwidth goes up. But customers are complaining, and you notice that your online gaming has worse performance than before. What’s going on?
Example 3: To speed up application performance to your holiday cottage, you bond together two links, say 3G and a satellite link. Bandwidth goes up. When you test what happens to the applications, you find terrible performance problems. What’s going on?
Buffers badly batter bandwidth
Let’s see what really happens in networks.
The first is a well-known phenomenon called bufferbloat. When networks saturate, it disrupts the control loops that TCP uses to say ‘faster!’ and ‘slower!’ to the end points of the flows. This can lead to all the queues filling up, multiple packets getting lost in a row, and sudden collapses in transmission speed that users experience as transient outages. The network recovers, but only slowly. As fast memory has become cheaper, the queues in routers have become longer, driven by the mistaken belief that it is always better to delay a packet than to drop it. This just makes the collapse bigger, and recovery slower. And more bandwidth makes the collapses more sudden.
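To make the mechanism concrete, here is a deliberately crude sketch. The link speed, overload factor and buffer depth are my own illustrative assumptions, not figures from any real network. It models a single bottleneck draining at a fixed rate while senders offer 20% more traffic than the link can carry: the deep buffer absorbs the overload as a steadily growing standing delay, and no loss signal reaches the senders until the queue is already over a second long.

```python
# Toy single-bottleneck model (illustrative assumptions only): senders offer
# 20% more traffic than the link can drain, and the buffer is 'bloated'.
SERVICE_PPS = 20_000_000 / (1500 * 8)   # ~1667 packets/s through a 20 Mbit/s link
OFFERED_PPS = 1.2 * SERVICE_PPS         # persistent 20% overload
BUFFER_PKTS = 2000                      # deep queue: cheap memory, long delays

queue, dropped = 0.0, 0.0
for t in range(1, 11):                  # ten one-second steps
    queue += OFFERED_PPS - SERVICE_PPS  # net queue growth while overloaded
    if queue > BUFFER_PKTS:             # only now does any loss occur...
        dropped += queue - BUFFER_PKTS  # ...and it arrives as a burst
        queue = BUFFER_PKTS
    delay_ms = 1000 * queue / SERVICE_PPS
    print(f"t={t:2d}s  queue={queue:6.0f} pkts  delay={delay_ms:5.0f} ms  dropped={dropped:5.0f}")
```

When the buffer finally overflows, the drops arrive in a continuous burst rather than as gentle early feedback, so every TCP flow backs off at once, which is the sudden collapse and slow recovery described above.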
When the telco upgraded from a single copper loop to fibre plus copper, it inserted an extra queue. This added new delay effects that undid all the benefits of additional bandwidth for delay-sensitive applications. Furthermore, it allowed ‘greedy’ applications to stuff that queue with pulsed traffic, which raised loss and delay for better-behaved applications. Hence customer experience got worse, despite more ‘bandwidth’.
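A back-of-envelope figure (my own, using assumed numbers rather than anything from the original example) shows how little it takes for that new cabinet queue to hurt a delay-sensitive flow: a burst parked in the queue delays every packet arriving behind it by however long the 50 Mbit/s copper leg needs to drain the burst.

```python
# Assumed figures: 50 Mbit/s cabinet-to-home leg, 1500-byte packets.
COPPER_BPS = 50_000_000
PKT_BITS = 1500 * 8

def delay_behind_burst(burst_pkts: int) -> float:
    """Extra queuing delay (ms) for a packet arriving just behind a burst."""
    return 1000 * burst_pkts * PKT_BITS / COPPER_BPS

# A game packet arriving behind a 300-packet video burst waits an extra ~72 ms,
# on top of whatever the exchange and fibre segments already add.
print(f"{delay_behind_burst(300):.0f} ms")
```

The headline speed went up, but the latency the game actually experiences is now hostage to whatever the greediest flow chooses to park in that queue.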
If you take two network links and bond them, you can run into trouble in multiple ways. For a start, you have done nothing to improve the delay characteristics of the new ‘synthetic’ combined link. If you fire packets randomly down one or the other link, you get order-reversal, which TCP treats as loss, and it slows down. If you send packets from the same flow down the same link, they still self-contend, but the application may face unexpectedly different characteristics for each flow. So the different loss and delay for the audio and video of a Skype call may seriously confuse the encoding algorithm. Furthermore, any outage or transient saturation effect, even momentary, may cause odd oscillations in the traffic that create poor user experience.
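Here is a small sketch of the per-packet bonding problem. The 60 ms and 600 ms one-way delays are assumptions chosen to resemble a 3G link and a satellite link, not measurements. Packets of one flow are sprayed randomly across the two links, and the script counts how often consecutive arrivals are out of sequence; it is this reordering that TCP’s duplicate-acknowledgement logic misreads as loss.

```python
import random

random.seed(1)
DELAYS = {"3g": 0.060, "sat": 0.600}        # assumed one-way delays, seconds

arrivals = []
for seq in range(1000):
    link = random.choice(list(DELAYS))      # naive per-packet load balancing
    send_time = seq * 0.010                 # one packet every 10 ms
    arrivals.append((send_time + DELAYS[link], seq))

arrivals.sort()                             # the order packets actually arrive in
received = [seq for _, seq in arrivals]
inversions = sum(1 for a, b in zip(received, received[1:]) if b < a)
print(f"{inversions} of {len(received) - 1} consecutive arrivals are out of order")
```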
As you can see, the explanations require looking at the properties of the queues over short time periods; none of the problems was the result of a lack of bandwidth. These aren’t isolated edge cases; the problems are endemic.
Networks all have failure modes.
The questions are: how big are they, how do we manage them, and at what cost?
The three laws of networking
Rather like the laws of thermodynamics, there are three fundamental laws of networks.
These are not opinions, but are provable matters of mathematical fact.
The trouble with telecom
The telecoms industry is in trouble, because it is fighting all three fundamental laws. Needless to say, in a fight between management and mathematics, the latter always wins.
This is not the network you are looking for
It’s worse than you think.
Telcos all over the world are splurging capex on unnecessary network upgrades to paper over what are often quality issues. So the first thing they do is build out fast, fat pipes, and sell them.
And sell them. And sell them. They are then over-selling the capacity, in the mistaken belief that they sell bandwidth. But you run out of quality a long time before you run out of bandwidth. Applications collapse and customers complain, yet the network doesn’t appear to be ‘full’. That was never in the business case. Are you still sure this is a safe utility stock?
And when you do try to explicitly package and sell quality, to mitigate the collapse effects, you get an effect called ‘quality inversion’. It’s cheaper for customers to buy a fatter, faster pipe with lower packet service times than to buy the ‘quality-assured’ one. That’s a by-product of seeing quality through a bandwidth lens, and mispricing it as a result.
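The inversion is easy to reproduce with textbook queueing arithmetic. In an M/M/1 model the mean time a packet spends in the system is 1/(μ − λ), so at the same utilisation a faster pipe automatically delivers lower delay. This is my own illustration with assumed link speeds, and it deliberately ignores any scheduling priority the assured product might carry; the point is simply that raw speed also buys delay, which is what makes quality priced through a bandwidth lens so easy to undercut.

```python
# M/M/1 back-of-envelope (assumed link speeds, 1500-byte packets):
# mean time in system T = 1 / (mu - lambda), rates in packets per second.
PKT_BITS = 1500 * 8

def mean_delay_ms(link_bps: float, utilisation: float) -> float:
    mu = link_bps / PKT_BITS            # service rate
    lam = utilisation * mu              # arrival rate at the given load
    return 1000 / (mu - lam)            # mean queueing + service delay, ms

for bps, label in [(20e6, "20 Mbit/s 'quality-assured' product"),
                   (100e6, "100 Mbit/s plain fat pipe")]:
    print(f"{label:36s} {mean_delay_ms(bps, 0.5):.2f} ms mean delay at 50% load")
```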
Bye-bye to bandwidth
The bandwidth approach has no means for modelling or managing the failure modes of multiplexed networks. Indeed, it takes infinite bandwidth at infinite cost to have no failure modes. The contention model lets you manage the failures, at a finite cost. Sounds like a good alternative, no?
In a world where capex is constrained, and demand is not, we’re going to see an inevitable shift towards getting more out of what we have. The financial and network maths tells us we must manage the true fundamental resources of the network, not fantasy ones.
At the end of the day, there’s no contention. Bandwidth is bust.
This article was originally published as a Future of Communications newsletter. Martin Geddes and Dean Bubley are also conducting public workshops on Future of Voice and Telco-OTT Services in London on 26th & 27th April, and on the US East Coast on 14th-15th May. Get the free newsletter, and learn more about the workshops at www.futureofcomms.com.