Online Extra: Fired Up for the Supercomputer Derby

A DARPA contest to spur supercomputers to even more unthinkable speeds is down to three heavyweight contenders

The speed of today's supercomputers boggles the mind. "Teraflops" is easy to explain: trillions (tera) of floating-point operations per second (flops). But nothing that happens trillions of times every second is within the realm of human sensibility.

Now try to imagine petaflops -- quadrillions of calculations a second. That's what Washington is shooting for in the next generation of supercomputers. The Pentagon's Defense Advanced Research Projects Agency (DARPA) is spearheading a competition to develop such incredible speedsters in collaboration with other federal outfits, including the National Security Agency, NASA, and the Energy Dept.
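To put those prefixes on a human scale, here is a rough back-of-the-envelope sketch. The one-calculation-per-second human pace and the function name `years_by_hand` are illustrative assumptions, not figures from DARPA or any vendor.

```python
# Assumption for illustration: a person does one calculation per second, nonstop.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # 31,557,600 seconds

# Standard prefixes: tera = 10^12, peta = 10^15 operations per second.
RATES = {"teraflops": 10**12, "petaflops": 10**15}


def years_by_hand(ops_per_second, human_ops_per_second=1.0):
    """Years a person would need to match one second of machine work."""
    return ops_per_second / human_ops_per_second / SECONDS_PER_YEAR


for name, rate in RATES.items():
    years = years_by_hand(rate)
    print(f"One second at one {name}: about {years:,.0f} years by hand")
```

Under that assumption, one second of work at a single teraflops comes to roughly 31,700 years of hand calculation, and one second at a petaflops to roughly 31.7 million years.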

DARPA launched the contest in early 2002, shortly before Japan switched on the world's fastest computer, called Earth Simulator. That summer, the agency picked five proposals for further refinement. They came from Cray (CRAY), Hewlett-Packard (HPQ), IBM (IBM), Silicon Graphics (SGI), and Sun Microsystems (SUNW). Last July, three interim winners were selected -- Sun, IBM, and Cray -- and split a prize of $150 million. The next milestone will come in 2006, when the trio's digital prototypes will face off to determine which design will get a juicy award to become a real-world prototype in 2009.


  All three contenders are working on innovations in both hardware and software. For example, DARPA liked Sun's concept, dubbed Hero, for the clever way it proposes to link chips together: sans wires. Instead of connecting chips with copper wires on printed-circuit boards, Hero will harness capacitive coupling. This is akin to the way some battery-powered toothbrushes and cordless shavers are recharged by wirelessly transmitting electricity to the product as it sits in a charging cradle.

By butting chips close to each other, with just a very thin gap in between, Sun hopes that signals can be made to jump from chip to chip, through the air. If it works, the trick could dramatically boost the flow of data -- by as much as 100 times. The technique could be used to improve performance not only of supers but also of Sun's servers and workstations.

The time it takes signals to flit between chips has become a major bottleneck, and Sun believes it will be very difficult to engineer a petaflops computer using circuit boards. After all, the distances that signals travel over circuit-board pathways haven't changed significantly for decades -- whereas the length of wires between transistors inside chips has shrunk regularly and repeatedly. This reduction in distance is the main key to Moore's Law, the edict that calls for microprocessor power to double every 18 months.


  Cray's petaflops candidate, designated Cascade, is a vector-scalar hybrid. That means it'll sport two types of brain chips: "heavyweight" processors for vector jobs and "lightweight" versions for scalar code.

Vector chips were invented in the early 1970s by Seymour Cray (1925-96), the legendary father of supercomputing. They're specifically tailored to process vectors, or strings of related numbers -- like the buildup of heat on a plane's wing, second after second, as its speed approaches the sound barrier. Japan's Earth Simulator has 5,120 vector chips.

Scalar code is software that can be chopped up into small chunks for simultaneous processing by hundreds or thousands of microprocessors. As a result, a computer's speed increases, or scales, in proportion to the number of chips. (A vector string can't be processed simultaneously, because each result affects the next number.)
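The distinction drawn above, between work that splits into independent chunks and work where each result feeds the next, can be sketched in a few lines. This is only an illustration of that contrast, not a model of real vector or scalar hardware. The function names, the squaring job, and the `decay` factor are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split(values, n_chunks):
    """Cut a list into at most n_chunks contiguous pieces."""
    size = max(1, -(-len(values) // n_chunks))  # ceiling division, at least 1
    return [values[i:i + size] for i in range(0, len(values), size)]


def square_chunk(chunk):
    """Each element is handled on its own; no chunk needs another's result."""
    return [x * x for x in chunk]


def independent_work(values, n_workers=4):
    """Hand the chunks to separate workers, then stitch the results together.

    Threads are used here only to show the structure of the split; Python
    threads will not actually speed up this arithmetic.
    """
    chunks = split(values, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = pool.map(square_chunk, chunks)  # map preserves chunk order
    return [y for part in parts for y in part]


def chained_work(readings, decay=0.5):
    """Each step needs the previous step's result, so the loop runs in order."""
    state = 0.0
    history = []
    for reading in readings:
        state = decay * state + reading  # depends on the prior state
        history.append(state)
    return history


print(independent_work([1, 2, 3, 4, 5]))
print(chained_work([1, 1, 1]))
```

In `independent_work`, adding workers shortens the job in principle, because no chunk waits on another. In `chained_work`, extra workers would not help, since step three cannot start until step two has finished.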

Cascade's lightweight chips will feature a processor-in-memory (PIM) layout. As the name suggests, the processor is surrounded by memory circuits. Again, this shaves the distance that signals travel. PIM chips were introduced in the late 1990s to meet demand for speedier graphics chips in video-game consoles, especially Sony's (SNE) PlayStation. Now, PIM chips are establishing a beachhead in supercomputing. They're in Blue Gene/L, the 360-teraflops cluster that IBM is building at Lawrence Livermore National Laboratory.


  For its entry in DARPA's petaflops sweepstakes, though, Big Blue is focusing on a radically different concept: reconfigurable chips. Conceived by two computer scientists at the University of Texas in Austin, Stephen W. Keckler and Douglas Burger, the IBM processors will be switch-hitters. They'll swat vector and scalar code with equal ease. That's because the chips can be instantly rewired, inside a computer, to create logic circuits customized for each successive segment of a program.

"On the fly, the chip could flip from vector to scalar and back to vector, whichever would be best for the code that's about to run," says Burger. And since the underlying silicon circuitry doesn't change, the chips could be mass-produced. "With this technology," adds Keckler, "you could build vector systems with the same economics as clusters." That would be a sea change. Today, vector machines cost about five times more per teraflops than scalar clusters.

Raw speed alone won't determine which design wins the final prize. Today, supercomputer bragging rights are pegged to theoretical peak speed and the results of benchmark tests used to rank the world's giant computers for the Top500 Supercomputer Sites list. But it's common knowledge that neither measure accurately predicts a machine's performance when it runs real-world software. To give prospective buyers a better handle on the pros and cons of different systems, DARPA is also sponsoring development of new benchmarking and efficiency measures.


  Ultimately, what's most important to supercomputer users is "time to solution." That's the message delivered at numerous industry conferences by DARPA's Robert B. Graybill, the program manager in charge of the petaflops effort. Writing the software to explore a new theory or engineering design often accounts for the bulk of time between an idea and its resolution. So innovative programming tools that can speed up software development will carry a lot of weight.

Corporate researchers and academic scientists can't wait to see how this turns out.


By Otis Port in New York
