NANOG 97 Explores the Networking Challenges Behind the AI Boom

	By Geoff Huston Author & Chief Scientist at APNIC
	June 29, 2026

NANOG 97 was held in Bellevue, Washington, at the start of June 2026. These days, you could be excused for suspecting that the world has gone AI-mad, and if you were at the NANOG meeting, your suspicions would’ve only been confirmed! The topics discussed were the design of the data centres used to generate the large language models that underpin today’s AI tools and the application of AI tools in network operations.

AI Eats The World

The tenor of the meeting was set by Cogent’s Dave Schaeffer, who provided a set of statistics and associated observations that I have to admit didn’t manage to engage my uncritical enthusiasm and instead confirmed some of my own alarm bells about the extent of this irrational insanity that pervades this space.

Expenditure in 2026 is centered on a whole new round of data centers built specifically for the demanding requirements of intense deployments of GPUs. In historical terms, the sums involved are massive. In the US alone, expenditure will approach USD 750 billion for this year (2026), almost double last year’s USD 450 billion. Current projections of the longer-term multi-year build cost within the US are now heading to USD 7 trillion. In economic terms, that’s 2% of the US GDP for the next four years. And that’s not counting expenditure occurring in other economies. There’s no doubt we’re in the middle of a wild boom. It’s an economic inevitability that booms are followed by crashes, and the emerging question is how soon the crash will come and how catastrophic it will be when this bubble bursts.

Oddly enough, it’s not as if all of the major investors are there because they’re driven by a sense of exuberance and optimism over the future prospects of market dominance and wealth through their AI platform. Many of these investors are today’s digital behemoths, and for them, it appears that the driving motivation is that engagement with AI is a forced activity: If they don’t sign up and invest in their own AI platform now, then they will be unable to survive when their competitors steal their existing markets with these new AI platforms. For example, what good is search when an AI tool can provide me with the answers? If it’s fear and greed that drive markets, then today it appears that fear dominates many corporate decisions to play in this space!

AI training, inference and agentic workloads are placing a new generation of stresses on today’s data center network architectures. Large clusters of GPU engines, coupled with the decoupling of processing from memory using RDMA (remote direct memory access), generate sustained high-volume loss-intolerant traffic that loads the data center’s transmission and switching capability. These demands challenge the conventional data center design parameters around bandwidth density, electrical reach, and optical capacity. So far, we’ve relied on the constant increase in chip capabilities to resolve these scaling challenges, but as these environments demand even further scaling, limitations are emerging that cannot be addressed through incremental capacity upgrades to existing designs.

Bear in mind that the true genius of TCP (the Transmission Control Protocol) was its tolerance of packet loss (and jitter). RoCEv2 (RDMA over Converged Ethernet, version 2) is a network protocol that allows computers to transfer data directly between memory on different servers, bypassing the operating system kernel and CPU, using standard Ethernet. This enables low-latency memory access and is vital for large-scale AI clusters and high-performance computing. However, RoCEv2 is completely intolerant of packet loss, packet reordering and jitter. This implies that TCP is not a suitable transport, and RoCEv2 is layered over UDP. Normally, it would be left to the application to detect and repair packet loss and packet reordering, but RoCEv2 contains no such support. The outcome is that the network simply cannot lose a single packet! This is a rather novel challenge for large-scale high-speed data center design.
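To get a feel for how demanding "not a single packet" is, here is a minimal sketch of the arithmetic. It assumes packet losses are independent with a fixed per-packet probability, and it assumes a 4,096-byte payload per packet. Both are illustrative assumptions of mine, not figures from the presentation, and the function names are hypothetical.

```python
import math


def prob_no_loss(per_packet_loss: float, packets: int) -> float:
    """Probability that all `packets` arrive, assuming independent losses."""
    # log1p keeps precision when the per-packet loss rate is very small
    return math.exp(packets * math.log1p(-per_packet_loss))


def packets_for_transfer(total_bytes: int, payload_bytes: int = 4096) -> int:
    """Number of packets needed to carry total_bytes (payload size is assumed)."""
    return math.ceil(total_bytes / payload_bytes)


if __name__ == "__main__":
    n = packets_for_transfer(10 * 10**9)  # a 10 GB transfer, chosen for illustration
    for loss in (1e-9, 1e-7, 1e-5):
        p = prob_no_loss(loss, n)
        print(f"loss rate {loss:.0e}: P(no loss across {n} packets) = {p:.4f}")
```

Under these assumptions, even a per-packet loss rate of one in ten million leaves a bit more than a one in five chance that a single 10 GB transfer sees at least one lost packet, which is why the burden falls on the fabric rather than the transport.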

We are building out on the fine margins of what we can deliver in terms of silicon processors, storage systems, photonics and switching, power delivery and cooling. For example, in the power market, these data centers are expected to consume more electricity in the US than all energy-intensive manufacturing combined by the end of the 2020s. On current projections, they will account for up to 12% of all US electricity consumption within a few years, and it’s not as if some other sector is switching off its power by a comparable amount over this period. This is a new demand on the country’s power generation system, and represents a new source of pressure on existing electricity generation capabilities and power grid infrastructure.

It’s not just the packet handling layer that is feeling the stress of large-scale AI data center deployments. AI infrastructure impacts are also being applied to the underlying optical layer. Photonics is now becoming a binding architectural constraint in AI data center networks rather than a transparent transport layer. Current architectures are encountering scaling limits, including faceplate density, thermal and power budgets for pluggable optics, and the operational complexity of rapidly expanding fiber plants.

We are now evaluating several optical architectural approaches, including co-packaged optics, external laser source models, and optical circuit switching as a complement to packet-switched fabrics. For each, the questions are the same: which problems does the approach aim to address, what new constraints does it introduce, and where might it provide needed capability within operational environments?

Data center optical systems are expected to grow at a compound annual rate of 45% over the next few years. Gallium Arsenide semiconductor systems using multi-mode fiber have been a mainstay for some years, and that technology appears to top out at around 200G per lane, and it mandates short interconnects. Higher capacity systems are achievable with Indium Phosphide semiconductors, which are used with single-mode optics for optical spans of 500 meters and above. This is typically what has been deployed en masse by the large hyperscalers. These systems support 200G per lane, and there is a general expectation that 400G per lane is going to be deployed in the near future. However, 400G per lane is encountering some slowdown in deployment, and nobody really knows what’s going to happen next after 400G. Higher capacities tend to run into power density limits and thermal dissipation. Pluggable optics no longer scale as easily, and while we can foresee total per-fiber capabilities of 3.2T as being achievable, the path to yet higher capacity is not clear, in terms of the scaling capabilities of optics, silicon, power, cooling, physical packaging and unit costs per bit.
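Two pieces of arithmetic sit behind these figures, and a short sketch may make them concrete: what a 45% compound annual growth rate implies over a few years, and how many lanes an aggregate of 3.2T implies at 200G or 400G per lane. The function names are hypothetical, and the lane count assumes the aggregate is simply the sum of equal-rate lanes.

```python
def compound_growth(rate: float, years: int) -> float:
    """Multiplier after `years` of growth at `rate` per year (0.45 means 45%)."""
    return (1.0 + rate) ** years


def lanes_needed(total_gbps: int, lane_gbps: int) -> int:
    """Equal-rate lanes needed to reach total_gbps (ceiling division)."""
    return -(-total_gbps // lane_gbps)


if __name__ == "__main__":
    for years in range(1, 6):
        print(f"after {years} year(s): x{compound_growth(0.45, years):.2f}")
    for lane in (200, 400):
        print(f"3.2T at {lane}G per lane: {lanes_needed(3200, lane)} lanes")
```

At 45% a year the market a little more than doubles every two years, and 3.2T needs 16 lanes at 200G but only 8 at 400G, which is part of why the per-lane rate matters so much for packaging and power.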

It’s apparent that the construction of these data centers doesn’t resemble a traditional speculative boom with the prospect of short-term windfall returns for investors who are underwriting the costs. The best-case scenario for investors is that data centers earn steady utility-like returns in the long term. The worst-case scenario is that the capital cost is underwritten by a financial bubble that bursts midway through the construction process. But maybe this time really is different from the conventional boom and bust business cycles. The difference is that it appears that a major investor is actually you and me, through a public sector-based investment in the form of taxation concessions given to the enterprises that are building this AI infrastructure.

It’s a fascinating journey we are on with AI. There is a well-founded belief that the quality of the AI systems is heavily reliant on the scale of the processing used to assemble the models, and so far, the rule of thumb is that larger scale produces a better outcome. There is no clear idea of what scale is “enough” or even if there is a point of diminishing returns where double the scale of processing capability only generates marginal improvements in the service quality. Without a clear idea of where any logical endpoint might exist, we continue to push hard at every aspect of the components in these facilities to gain further scale. It’s a wild ride!

GeoLocation

The task of geolocation, or assigning a location of a remote device based on the IP address it uses to access a service, has become a popular topic in recent NANOG meetings.

It’s an area that continues to raise many more questions than we can answer. It appears to start with a rather innocuous question: “Where are you?”

But who is the “you” in this question? Is it you who is the human using a device, or is it the device itself? In human terms, the question of location becomes one of adapting the service experience using an appropriate language, an appropriate character set and presentation layout that matches the conventional use and cultural norms in that location. There is also the societal dimension. What laws apply to my ability to access digital services? Which country’s legal code applies to my transactions? If transaction taxes are applicable, which taxation code is applicable?

“Where” is a similarly vexed question. We could use latitude/longitude coordinates to pinpoint a location to a point on the surface of the earth. Or perhaps a more useful taxonomy is to use a geopolitical location system, using country, state, city and so on.

Is my location where I am located physically, or is it somewhere else? If I use a VPN service or a remote proxy agent, then is my location the location of the remote portal? There is also a temporal dimension, as I could be carrying my mobile device in a train, a boat or an airplane. Mobility also raises the question of stability and duration of location. What is the useful lifetime of IP location data?

What’s the granularity of location? What are the tensions between personal privacy and high-precision location data? We have already seen the deprecation of post codes in some geofeed systems, where the post code is at such a level of precision that it can be used as personally identifying information. When we consider the precision of location, it’s useful to ask if this information is to be used as a delivery address, as in “Where do I deliver the pizza?” Or should we blur the precision and simply refer to a country, or a state within a country? Some service providers use location as a means of making a good selection of a “nearby” data center. In this case, the metrics of “proximity” and “distance” are not based on a physical distance but are based on metrics that use an underlying network topology and the state of the network’s routing system.

The common intention of geolocation is a mapping service that takes an IP address as input and generates a location as output. The question is where and how you can find the data that acts as a seed for this IP-to-location map.
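As a minimal sketch of what such a mapping looks like, the following takes a small self-published feed and answers a lookup with the most specific matching prefix. The CSV layout follows the geofeed column order of RFC 8805 (prefix, country, region, city, postal code). The sample entries use documentation prefixes and invented locations, and the function names are hypothetical. The postal code column is deliberately ignored, in line with the privacy concern noted above.

```python
import csv
import io
import ipaddress

# Illustrative feed only: documentation prefixes with made-up locations.
SAMPLE_FEED = """\
# prefix,country,region,city,postal
192.0.2.0/24,US,US-WA,Bellevue,
198.51.100.0/24,AU,AU-QLD,Brisbane,
2001:db8::/32,NL,NL-NH,Amsterdam,
2001:db8:1::/48,DE,DE-BE,Berlin,
"""


def load_feed(text):
    """Parse geofeed-style CSV into a list of (network, (country, region, city))."""
    entries = []
    for row in csv.reader(io.StringIO(text)):
        if not row or row[0].lstrip().startswith("#"):
            continue  # skip blank lines and comments
        network = ipaddress.ip_network(row[0].strip())
        fields = [f.strip() for f in row[1:4]]
        fields += [""] * (3 - len(fields))  # pad rows with missing columns
        entries.append((network, tuple(fields)))
    return entries


def locate(address, entries):
    """Return the location of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for network, location in entries:
        if addr.version == network.version and addr in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, location)
    return best[1] if best else None


if __name__ == "__main__":
    feed = load_feed(SAMPLE_FEED)
    for ip in ("192.0.2.55", "2001:db8:1::1", "203.0.113.1"):
        print(ip, "->", locate(ip, feed))
```

The linear scan is fine for a sketch; a production lookup over millions of prefixes would more likely use a radix tree. Note that nothing here says anything about whether the feed is true, which is the harder problem.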

Some network providers maintain this map for the IP addresses that they use. For example, Starlink and Apple’s Private Relay service both publish location maps, but this is the exception rather than the rule in the ISP space. In some cases, it’s left to individual users to publish their own location, but how can a consumer of this data determine the truth (or otherwise) of these self-assertions of location? Is there any form of external validation that could be used to test the veracity of this geolocation data? Is my location a piece of public data? Or is it an instance of personally identifying information that would allow my identity to be inferred? What level of location granularity turns a general locale into my specific location?

I also can’t help but ask myself, why are there multiple geo-location providers? And why do they differ in detail? Surely location is impartial information and not open to variable interpretation?

In many ways, these panel sessions on geolocation are curiously illuminating. What seems to be a very simple question has oddly complicated answers. There are multiple layers of nuance and complexity lurking behind what at the outset is a very simple question.

Measuring IPv6

How much IPv6 has been deployed? How much IPv6 is being used on today’s public Internet? There are a couple of web pages that report on IPv6 use. The IPv6 measurement site operated by Google reports on the IPv6 capability of a sample of users who access the Google home page. There is also the data published by APNIC Labs. Both of these reports look at the capability of users to use IPv6, but not actual use. A number of popular Internet exchanges report on traffic volumes, such as the AMS-IX report on IPv6 volumes at their exchange.

These efforts to generate a “big picture” report can be complemented by small-scale studies of individual cases, as is reported in the presentation on a “non-Binary View of IPv6 Adoption”.

The results of an analysis of user traffic at a small number of sites show that a number of common applications do not have support for IPv6, including Zoom, TikTok, GitHub, and Twitch. Depending on the level of use of these applications, different users will show different levels of IPv6 use.

On the server side, of the Tranco top 110k sites, 58% are IPv4-only, 30% require IPv4 to load, and 12% of these sites are fully IPv6 enabled. Of course, what this does not show is the number of users of these services, the frequency of use and the amount of traffic generated by these services.
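The three-way split can be sketched as a simple classification. This is my reading of the categories, not the presenters’ method: I assume “IPv4-only” means the main hostname has no AAAA record, and “requires IPv4 to load” means the main hostname has one but at least one dependent resource does not. The names and labels below are hypothetical.

```python
from collections import Counter
from typing import Iterable


def classify_site(main_has_aaaa: bool, dependency_has_aaaa: Iterable[bool]) -> str:
    """Classify a site by IPv6 reachability of its main host and dependencies."""
    if not main_has_aaaa:
        return "ipv4-only"
    if all(dependency_has_aaaa):  # all() of an empty list is True
        return "ipv6-full"
    return "needs-ipv4"


def share_by_class(sites):
    """Percentage of sites in each class; sites is a list of (bool, [bool, ...])."""
    counts = Counter(classify_site(main, deps) for main, deps in sites)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


if __name__ == "__main__":
    # Invented sample, purely to exercise the functions
    sample = [(False, []), (True, [True, True]), (True, [True, False]), (False, [True])]
    print(share_by_class(sample))
```

A count like this weights every site equally, which is exactly the limitation noted above: it says nothing about how many users, or how much traffic, sits behind each site.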

In the Cloud service environment, some platforms have extensive IPv6 support, such as Cloudflare, and others, notably Amazon, are mainly IPv4. Again, however, this data needs to be placed into a context of relative use and relative traffic volumes.

The conclusion for me is that while there are clear signs of progress with the transition to IPv6, the pace is slow. If anyone is impatient to move to an IPv6-only environment, then they will necessarily exist within a highly fragmented space. If we wish to preserve the coherence of the Internet, then some further patience is needed while we traverse this transition path.

The Cost of SSH

These days, SSH is the default transport for many applications. When you need encryption on the wire (and only the most stupidly cavalier would ignore channel encryption in a world of WiFi), and maybe you’d like some assurance that you are communicating with the service that you intended to communicate with, then SSH can help.

But it’s slow.

We learned last October at NANOG 95 that many of the performance issues with SSH arise from poor internal configuration and buffer dimensioning, and HPN-SSH is an SSH variant that shows what is achievable in a performance-optimized SSH implementation.

However, there is another dimension to SSH performance, and that is the overhead to start an SSH session. There’s the TCP handshake and the SSH initial version exchange, which accounts for 2 round-trip time intervals (RTT). This is followed by algorithm negotiation and key exchange, which takes a further 1 to 3 RTT intervals. At this point, the application layer needs to make a service request, and the parties need to perform authentication, with a further 2 to 4 RTTs. For a terminal session, there is then the channel open exchange, shell initiation and terminal display characteristics. The total delay is some 10 to 15 RTT intervals. Can we improve on this situation?

A useful observation is that TLS 1.3 makes significant improvements to this connection establishment time, and HTTP using TLS 1.3 can complete a connection within just 3 RTT intervals. This is exploited in a proposal to use an edge proxy, where the network service uses HTTPS over TLS 1.3, and the edge proxy talks terminal emulation (PTY) over SSH to the device.

If you are using SSH to manage a couple of devices with network automation, the SSH connection overheads can be annoying, but they are not major. If, on the other hand, you are automating a system with 10 or 20 thousand devices, then the additional RTT delays are a significant factor, and the HTTPS to SSH proxy approach starts to look very attractive!
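To put some rough numbers on this, here is a back-of-the-envelope sketch using the RTT counts quoted above. The 50 ms round-trip time, the fleet size of 20,000 and the idea of working through the fleet in fixed-size batches are illustrative assumptions, and the function names are hypothetical. The model only counts session setup as seen by the automation system; it ignores whatever the proxy does on its side.

```python
import math

# RTT counts as quoted in the text: (minimum, maximum) per phase
SSH_PHASES_RTT = {
    "TCP handshake and version exchange": (2, 2),
    "algorithm negotiation and key exchange": (1, 3),
    "service request and authentication": (2, 4),
}
SSH_TOTAL_RTT = (10, 15)  # quoted total for a full terminal session
HTTPS_TLS13_RTT = 3       # quoted figure for HTTP over TLS 1.3


def itemized_phase_range():
    """Sum the minimum and maximum RTTs of the individually itemized phases."""
    low = sum(lo for lo, _ in SSH_PHASES_RTT.values())
    high = sum(hi for _, hi in SSH_PHASES_RTT.values())
    return low, high


def setup_seconds(rtts: int, rtt_ms: float) -> float:
    """Session setup delay in seconds for a given number of round trips."""
    return rtts * rtt_ms / 1000.0


def fleet_setup_seconds(devices: int, rtts: int, rtt_ms: float, concurrency: int = 1) -> float:
    """Total setup time when devices are handled in batches of `concurrency`."""
    batches = math.ceil(devices / concurrency)
    return batches * setup_seconds(rtts, rtt_ms)


if __name__ == "__main__":
    print("itemized SSH phases:", itemized_phase_range(), "RTTs")
    for label, rtts in (("SSH, best case", SSH_TOTAL_RTT[0]),
                        ("SSH, worst case", SSH_TOTAL_RTT[1]),
                        ("HTTPS over TLS 1.3", HTTPS_TLS13_RTT)):
        total = fleet_setup_seconds(20_000, rtts, rtt_ms=50.0)
        print(f"{label}: {total:,.0f} s of setup across 20,000 devices, one at a time")
```

With these assumed values, visiting 20,000 devices one at a time costs 10,000 to 15,000 seconds of pure SSH session setup, against 3,000 seconds at 3 RTTs. Running sessions in parallel shrinks all of these figures by the same factor, but the ratio between them stays the same.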

NANOG

This is a small sample of the busy three-day program at NANOG 97. Of the other presentations at this meeting, I noted that the keynote, “The SRE of AI: Engineering Network Reliability for the Tokenized Era,” argued that AI workloads make network reliability a first-order performance constraint. The presentation noted the need to rethink service level objectives, reduce instability that can throttle processing clusters, improve monitoring, and use data-driven automation to manage traffic and availability across geographically distributed systems. Which all seems like a pretty challenging wish list! The increased scale of many compute-intensive environments has prompted some examination of the limits of traditional monitoring and the need for richer, real-time telemetry for routing and network state. As someone who has been peeking into BGP for some decades, I appreciated the presentation that examined how BMP can provide full-fidelity routing telemetry across thousands of BGP sessions, contrasting BMP with approaches such as SNMP, gNMI, and full-table dumps, which can struggle to deliver per-peer visibility at large scale.

The full content is available at https://nanog.org/events/nanog-97/agenda/.

The next NANOG meeting will be held in Miami on October 19-21, 2026.

By Geoff Huston, Author & Chief Scientist at APNIC — (The above views do not necessarily represent the views of the Asia Pacific Network Information Centre.)