This week’s issue of EE Times carries a story, “Pflops here; now what?”, about IBM’s new 1 petaFLOPS supercomputer, the Roadrunner, and how its designers are scrambling to run benchmarks in advance of the annual International Supercomputing Conference (ISC) being held June 17-20. It’s an article (dare I say, a puff piece?) about IBM, but it does mention competing supercomputers by Japanese vendors. However, it makes no mention of distributed computing projects like SETI@Home or, more importantly, of the Google computing cluster.
The BOINC projects (which include SETI@Home) are averaging 1.1 petaFLOPS on a sustained basis, day in and day out. Of course these are specific algorithms tailored to match the extremely distributed nature of a system where individual computer users volunteer their spare cycles for a good cause. So maybe this approach doesn’t count with the real supercomputer folks.
But what about Google? There are several approaches to supercomputing, including vector processors (one instruction applies to many data elements), multiprocessors (typically one OS controlling multiple processing cores) and clusters (multiple processors and multiple OS instances connected with a high-speed network). Over time, all the big machines have migrated to the third approach. Indeed, IBM’s new Roadrunner is made up of 3,240 separate compute modules, each of which is a multiprocessor. Well, that’s exactly what Google has been doing since its inception. While they don’t tout their technology, they did publish a 2003 paper, The Google Cluster Architecture, describing how the early system worked (fewer than 15,000 servers in the early days).
Today, Google has perhaps 20 to 100 petaFLOPS of processing power in their distributed computing system. In mid-2006, the New York Times estimated Google had 450,000 interconnected servers in their various server farms. Their capital budget continues to expand, they continue to hire (including for very supercomputer-specific jobs) and they are building a global fiber optic network to better connect their distributed server farms, so it’s reasonable to assume Google has well over 500,000 servers on-line today. None of these machines is more than 3 years old, with an average age nearer 15 months, based on the economics described in the 2003 paper. A new server for late 2007 and early 2008 has dual quad-core Xeon processors at 2.5 GHz or 3 GHz. Intel claims the quad-core Xeon provides 77-81 gigaFLOPS, and today’s servers have two such processors, i.e. roughly 160 GFLOPS. Let’s discount that for Intel hype and for the fact that the average Google server is whatever commercial machines of 1/2007 could do, say 100 GFLOPS. And let’s assume they haven’t added new buildings and new servers and have only 500,000 machines in their cluster. That’s still 50 petaFLOPS.
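For what it’s worth, here is that back-of-the-envelope arithmetic as a small Python sketch; the server count and per-server figure are just the assumptions stated above, not published Google numbers.

```python
# Back-of-envelope estimate using the assumptions above; both inputs are
# guesses, not published Google figures.
servers = 500000            # assumed number of machines on-line
gflops_per_server = 100.0   # assumed sustained GFLOPS for an average ~1/2007 box

total_gflops = servers * gflops_per_server
total_pflops = total_gflops / 1.0e6   # 1 petaFLOPS = 10^6 gigaFLOPS

print("%.0f petaFLOPS" % total_pflops)   # prints: 50 petaFLOPS
```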
Note that Google also has an A-team of researchers who occasionally publish fascinating glimpses into what’s going on.
I’ve never attended an International Supercomputing Conference—it’s a little out of my field—but I’d be interested to know if there is any public recognition, at the ISC, of what’s going on within the Googleplex. I don’t see any speakers from Google or any mention of Google on the ISC website. Have the supercomputer folks been bypassed and they don’t even know it?
I think it is just the Press looking for big numbers and focusing only on processing speed.
The big supercomputers are often built for specific tasks, such as climate modeling and similar numerical modeling tasks. For these tasks you often need to exchange lots of data between many processing nodes on a routine basis (one system I worked with needed to do global sums at each time step of the simulation for conservation calculations, as well as routine exchange of information between nodes modeling adjacent bits of fluid). So what matters is the combination of processing performance, interconnect speed, memory, storage and I/O.
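Just to illustrate the communication pattern I mean, here is a minimal sketch using mpi4py and a toy “field”; it is not the original code, only the shape of a per-time-step global sum that every node must wait for.

```python
# Minimal sketch of a per-time-step global sum (conservation check), using
# mpi4py; the "field" and time-stepping are stand-ins, not real physics.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each node owns a slab of the simulation domain.
local_field = np.random.rand(1000)

for step in range(10):
    # ... advance local_field by one time step here ...

    # Global sum across all nodes; every node blocks here, which is why
    # interconnect latency matters so much on these machines.
    local_total = local_field.sum()
    global_total = comm.allreduce(local_total, op=MPI.SUM)

    if rank == 0:
        print("step %d: global total = %f" % (step, global_total))
```

Run it with something like `mpirun -n 4 python globalsum.py` to see every rank synchronize on the sum at each step.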
Google have a very different mix of the above compared to a traditional supercomputer’s problem domain.
What was fascinating when I was trained on long-forgotten Cray machines was how beautifully tuned the hardware was for solving particular types of elliptic equations numerically.
It wasn’t the fastest machine, processing-wise, at the time, but it was clearly engineered with great care to handle these specific problems: disks could feed the solid-state storage just fast enough to get data in and results out, the solid-state storage fed memory, which fed processors, which did the maths in a conveyor-belt fashion. Every stage had just enough throughput to keep the next, faster part of the system busy for the particular task at hand.
So I don’t doubt Google have one of the most fascinating computing systems on the planet, but it is likely designed and built to solve very different problems. For a start, given that bits of the Google system are in various data centers around the world, I’m guessing their interconnect latency is many orders of magnitude larger than the peak latency between nodes of a traditional supercomputer. But it probably doesn’t matter if your search query takes 50 ms to get to the right place.
It may be like comparing a train and an aircraft carrier: sure, aircraft carriers generate a million horsepower, but they are pretty useless for getting commuters into work in the morning. Collectively, the trains that do the London commute in the morning have more horsepower than any aircraft carrier, but you wouldn’t want to fight a war with them.
Train engineers and aircraft carrier engineers share knowledge, but I wouldn’t be surprised if the aircraft carrier engineers had conferences where no train engineers turned up, and vice-versa.
Simon, thanks! You are of course entirely correct that supercomputer designs are typically tuned to particular algorithms. It’s also true that, once a specific supercomputer is available, inventive folks find ways to tweak their algorithms to best utilize the particular system’s ratio of processing, memory and inter-processor communication bandwidth.
As to the specifics of the Google supercomputer, its interconnectivity is rather like that of many Beowulf clusters, i.e. Gigabit Ethernet between servers that each have substantial local disk storage. Typically Google has thousands or tens of thousands of machines in a given data center, so speed-of-light delays between data centers around the world are probably not an issue for most computations.
The Top 500 supercomputer list (http://www.top500.org/list/2007/11) requires that all the machines in a cluster be located within one data center. If you accept that criterion (it seems a dated idea to me), Google might not come out at the top of the list, but they would occupy 20+ separate positions on it.
In any event, there are some very interesting papers published by Googlers listed here: http://research.google.com/pubs/papers.html#category4
In particular, their MapReduce algorithm should be a joy to any LISP programmer. http://labs.google.com/papers/mapreduce-osdi04.pdf
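As a taste of why a LISP programmer might smile, here is a toy word-count written in the MapReduce style in Python; it only sketches the map/shuffle/reduce phases and is nothing like Google’s actual distributed implementation.

```python
# Toy sketch of the MapReduce programming model (word count): run map() over
# input records, group intermediate pairs by key, then run reduce() per group.
from itertools import groupby
from operator import itemgetter

def map_fn(document):
    # Emit (word, 1) for every word in the document.
    for word in document.split():
        yield (word.lower(), 1)

def reduce_fn(word, counts):
    # Sum the partial counts for one word.
    return (word, sum(counts))

def mapreduce(documents):
    # Map phase (embarrassingly parallel in the real system).
    intermediate = [pair for doc in documents for pair in map_fn(doc)]
    # Shuffle phase: group intermediate pairs by key.
    intermediate.sort(key=itemgetter(0))
    # Reduce phase: one call per distinct key.
    return [reduce_fn(key, (count for _, count in group))
            for key, group in groupby(intermediate, key=itemgetter(0))]

print(mapreduce(["the quick brown fox", "the lazy dog", "the fox"]))
# -> [('brown', 1), ('dog', 1), ('fox', 2), ('lazy', 1), ('quick', 1), ('the', 3)]
```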