A huge chunk of the electricity grid fails. The Internet clogs up, and PCs crash. The space shuttle falls to Earth. Complex high-tech systems everywhere appear to be failing, and our society feels increasingly threatened. What is going on? Have we built a high-tech society that is doomed to crash and burn again and again? Can we fix it?
Behind these calamities lies a common flaw: The systems are too complex to manage. Each was created with an enormous number of moving parts that threw off an incredible amount of data, all of which had to be observed, analyzed, and managed. But when things went wrong, people had to react very quickly, perhaps too quickly. They had to communicate with many others, perhaps too many. And they had to balance conflicting demands in their decision-making -- efficiency vs. safety, profit vs. cost, science vs. politics -- perhaps too many demands at once. Just as the first VCRs had so many features that they overwhelmed consumers, our high-tech systems are being designed with far more complexity than we can handle.
We may be making it worse by centralizing and standardizing systems. In an effort to improve efficiency and cut prices, we are moving toward a single national electricity grid. We already have one standard computer operating system. And while there are many benefits inherent in this kind of integration, it may also be undermining the systems' reliability and security. It is strange for a nation that has thrived on diversity and decentralization to build its economy on the opposite principles. Ironically, our model for the 21st-century information society appears to be the 19th-century industrial society. We are building big centralized systems stuffed with bells and whistles and are inadvertently making America an easier target for economic and political terrorists who can bring down whole swaths of society with one blow.
We are also starving these complex systems of the resources needed to manage them safely. It is important to build in redundancy and backup for when things go wrong. Yet political decisions and market forces prevent a sufficient cushion from being created. The crash of the shuttle is perhaps the best example. Political pressures in the '90s cut NASA's budgets to the bone, even as the agency was shouldering new responsibilities for building an orbiting space station. Under pressure, NASA managers ignored foam that had broken off on seven previous flights before a piece finally destroyed the Columbia. The electric grid failed in part because too little had been invested in it: the decision to keep the grid regulated while the more lucrative power-generation business was deregulated left the grid starved for capital. And computers crashed because, until very recently, Microsoft Corp. put few resources into making its software secure and reliable. No countervailing market forces compelled it to do so.
In his book Inviting Disaster: Lessons from the Edge of Technology, James R. Chiles reminds us that all complex systems, by their very nature, are destined to fail at some point. The key is being able to manage the failures early so that they do not grow. If the failure goes unnoticed or is ignored, if it swamps those in charge or links to a wider network and spreads quickly, then it is likely to become a major event, perhaps even a catastrophe.
There is a better way. Design systems that give people adequate time to manage failure. Make them diverse and flexible enough that parts of a system continue to operate when something goes down. Invest enough resources in backup systems to keep critical functions running when emergencies occur. In effect, provide enough flex in the system to allow human beings the time to manage properly.
Monocultures in nature die because they are too fragile. That's the lesson we should take away from recent events. We don't have a technology problem per se. We need to use markets and the political process to design systems that human beings can actually manage and defend.
By Bruce Nussbaum