In The Datasphere, No Word Goes Unheard
Since September 11 more than 3,000 al Qaeda operatives have been nabbed, and some 100 terrorist attacks have been blocked worldwide, according to the FBI. Details on how all this was pulled off are hush-hush. But no doubt two keys were electronic snooping -- using the secret Echelon network -- and computer data mining. Now, these technologies are getting tune-ups -- but nagging privacy concerns won't be put to rest easily.
Echelon is the global eavesdropping system run by the National Security Agency (NSA) and its counterparts in Australia, Britain, Canada, and New Zealand. For decades, Echelon's electronic ears have been scooping up all communications relayed by satellite, microwave towers, and even some fiber-optic and copper cables. Each day's intercepts -- phone calls, e-mails, and Web uploads and downloads -- would fill the Library of Congress 10 times.
The NSA's supercomputers strain to sift through this flood of data to spot clues of terrorism. Intercepts that look suspicious go to human translators and analysts, and the rest is dumped. But the humans aren't as efficient as Echelon. Two Arabic messages collected on Sept. 10, 2001, hinting at a major event the next day, weren't translated until Sept. 12. Now, the intelligence agencies vow to do better, and the FBI says it has already shrunk translation delays to under 12 hours.
Long term, the goal is near real-time analysis. That would set the stage for data-mining systems that could look through multiple databases and spot oblique correlations that together warn of plots in the hatching. The Terrorism Information Awareness (TIA) project was supposed to do that, but Congress killed it in 2003 because of privacy concerns. In addition to inspecting multiple commercial and government databases, TIA was designed to spin out its own terrorist scenarios -- an attack on New York Harbor, say -- and then determine effective means to uncover and blunt the plots spawned by computers. It might have considered searching customer lists of diving schools and outfits that rent scuba gear, then looking for similar names on visa application or airline passenger lists.
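To make the name-matching idea concrete, here is a minimal sketch of how similar names on two lists might be flagged. It is not how TIA worked; it simply uses the similarity ratio from Python's standard difflib module, and the names, the lists, and the 0.8 threshold are all made-up assumptions for illustration.

```python
from difflib import SequenceMatcher


def similar_names(list_a, list_b, threshold=0.8):
    """Return (name_a, name_b, score) for pairs whose similarity meets the threshold.

    The threshold is an illustrative assumption, not a recommended setting.
    """
    pairs = []
    for a in list_a:
        for b in list_b:
            # ratio() is 1.0 for identical strings and falls toward 0.0
            # as the two strings share fewer characters in order.
            score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
            if score >= threshold:
                pairs.append((a, b, round(score, 2)))
    return pairs


# Hypothetical records: a scuba-rental customer list and a passenger list.
scuba_rentals = ["Jon Smyth", "Maria Lopez"]
passenger_list = ["John Smith", "Wei Chen"]

hits = similar_names(scuba_rentals, passenger_list)
for name_a, name_b, score in hits:
    print(f"{name_a!r} resembles {name_b!r} (score {score})")
```

Comparing every name against every other name gets slow on large lists, and a loose threshold will flag many innocent look-alikes, which is one reason a match like this would be a starting point for human review rather than a conclusion.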
TIA is dead, but the concept lives on. Most companies involved in database management, big and small, now offer tools to quiz the database of a willing partner. And to forestall another privacy panic, various methods have emerged to keep personal and company-confidential information under wraps during such database sharing. Most of these are explored in a massive 2003 report from a blue-ribbon commission convened by the Markle Foundation think tank. Members of the group included Netscape Communications Corp. (TWX) founder James L. Barksdale and Craig J. Mundie, a Microsoft Corp. (MSFT) chief technologist. The Markle study recommends ways to ensure that personal data won't normally be revealed, even to intelligence and law-enforcement types with proper clearances.
One tool is "anonymization." Using what's called hashing in cryptography, names and Social Security numbers can be converted into a meaningless jumble of letters and digits. Data-mining software would still be able to search and correlate separate databases -- spotting suspicious financial transactions in bank databases, for example. But personal details would remain cloaked until an agent marshals enough corroborating evidence to justify a warrant to unmask them.
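A rough sketch of the idea, under stated assumptions: each database owner replaces identifiers with a keyed hash (HMAC-SHA256 from Python's standard library) before sharing, and the mining software compares only the resulting tokens. The shared key, the normalization rule, and the sample names are hypothetical; the Markle report is not the source of this particular scheme.

```python
import hashlib
import hmac

# Hypothetical secret agreed on by the two database owners (assumption).
SHARED_KEY = b"example-key-not-for-real-use"


def anonymize(identifier: str, key: bytes = SHARED_KEY) -> str:
    """Return a keyed SHA-256 digest (64 hex characters) of an identifier."""
    # Lowercase and collapse whitespace so trivially different spellings
    # of the same identifier produce the same token.
    normalized = " ".join(identifier.lower().split())
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def shared_tokens(db_a, db_b):
    """Return the set of tokens present in both anonymized databases."""
    return {anonymize(x) for x in db_a} & {anonymize(x) for x in db_b}


# Made-up records standing in for a bank database and a passenger list.
bank_customers = ["Alice Example", "Bob Sample", "Carol Placeholder"]
flight_passengers = ["bob  sample", "Dave Nobody"]

matches = shared_tokens(bank_customers, flight_passengers)
print(f"{len(matches)} record(s) appear in both databases")
```

The software learns only that some token appears in both places, not whose it is. A hash cannot be run backward, so putting a name to a matched token means going back to the database owner, who can look up which original record produced it. The key matters: names and Social Security numbers are guessable, so an unkeyed hash could be undone simply by hashing every plausible value and comparing.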
Technology will never eliminate terrorism, but techniques such as advanced data mining are some of the more powerful tools available right now for preventing future attacks.
By Otis Port in New York