Data Storage: From Digits To Dust
Up to 20% of the information carefully collected on Jet Propulsion Laboratory computers during NASA's 1976 Viking mission to Mars has been lost. Some POW and MIA records and casualty counts from the Vietnam War, stored on Defense Dept. computers, can no longer be read. And at Pennsylvania State University, all but 14 of some 3,000 computer files containing student records and school history are no longer accessible because of missing or outmoded software.
What's going on? The world is in a headlong rush to go digital. From Tokyo to Tampa, schools, libraries, factories, and churches are forking over great sums to computerize everything from Johnny's latest math scores to Aunt Hattie's dental records. Computers are supposed to help us manage this information explosion by storing oceans of data that, at some later date, can be recalled at the click of a mouse.
Trouble is, all these bits of information are piling up so fast that hardly anybody is thinking about saving them. By 2000, Forrester Research Inc. estimates, one of every three Americans will be online. What's more, half to three-quarters of the data produced each day will be "born digital"--that is, it will never have existed on paper. Says Eric Almasey, a digital media expert at Mercer Management Consulting: "We're not just doubling amounts of electronic data every six months, we're quadrupling it."
The Information Age is creating a digital dilemma. For years, computer scientists told us that digital 1s and 0s could last forever. But now, we're discovering that the media we're using to carry our precious information into the future are far from eternal--so fragile, in fact, that some might not last through the decade. More is at risk than government and corporate records. The danger extends to cultural legacies: new music, early drafts of literature, and academic works originate in digital form--without hard copies.
HOUSTON CALLING. To be sure, not all our information is in jeopardy. There are some solutions, even new software to back up data on special paper disks. But there's no quick fix. The data lost from the Viking Mars mission, for example, was trapped on decaying digital magnetic tape, forcing NASA to call mission specialists out of retirement to help the agency reconstruct key data. "Digital information lasts forever, or five years--whichever comes first," says Jeff Rothenberg, senior computer scientist at RAND Corp.
Forget forever. Under less-than-optimal storage conditions, digital tapes and disks, including CD-ROMs and optical drives, might deteriorate about as fast as newsprint--in 5 to 10 years. Tests by the National Media Lab, a St. Paul (Minn.)-based government and industry consortium, show that tapes might preserve data for a decade, depending on storage conditions. Disks--whether CD-ROMs used for games or the type used by some companies to store pension plans--may become unreadable in five years.
For consumers, the biggest worry is CD-ROMs. Unlike paper records, CD-ROMs often don't show decay until it's too late. Experts are just beginning to realize that stray magnetic fields, oxidation, humidity, and material decay can quickly erase the information stored on them. Says Robert Stein, founder of New York-based Voyager Co., which makes commercial CD-ROM books and games: "CDs have a tendency to degrade much faster than anybody, at least in the companies that make them, is willing to predict." Stein doesn't expect the CD-ROMs Voyager sells to last more than 5 or 10 years, and neither, he says, should customers.
There's another problem: the unrelenting pace of technology. Chances are good that the software needed to get at much of today's data might not be readily available in 10 years. Anyone who has tried wrestling information from a 5 1/4-inch floppy disk knows that. Just ask scientists conducting rain forest research. Satellite photos of the Amazon Basin taken in the 1970s--data critical to establishing deforestation trends--are trapped on indecipherable magnetic tapes no longer on the market.
But even keeping a step ahead of data decay and software obsolescence is no guarantee of escaping the problem. Companies spending heavily on sophisticated new computers and software to beat the technology reaper say they're beginning to run into a whole new problem. All too often, when they transfer information from one aging medium or computer system to a newer one, not all bits make the migration. Sometimes, just a footnote or spreadsheet is lost. Other times, whole categories of data evaporate. Says Rothenberg: "It's like playing the child's game of Telephone. It doesn't take many translations from one media to another before you have lost significant aspects of the original data."
The Food & Drug Administration reports that some pharmaceutical companies are discovering errors as they copy drug-testing data that back up claims of long-term product safety and effectiveness. In several recent cases involving data transfers from Unix computers to systems running Microsoft's Windows NT operating system, blood-pressure numbers were randomly off by up to eight digits from those in original records, FDA and company data specialists report.
Sophisticated software can catch most of the errors, but "not all the time," says Rone Lewis, vice-president of business development at Surety Technologies, a data recovery and migration firm. Some companies fear the problem could expose them to lawsuits. "In our litigation-prone age, it's harder to defend yourself if you're losing parts of your records when you migrate them," says Henry Perritt, dean of Chicago-Kent College of Law.
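One common way such checking software can work is by comparing digital fingerprints of each file before and after a move. What follows is a minimal sketch of that idea, not a description of any product named here. The function names and directory layout are hypothetical, and it assumes a straight byte-for-byte copy. A migration that also converts file formats changes the bytes on purpose, so it would need a comparison of the decoded values instead.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_migration_mismatches(source_dir: Path, target_dir: Path) -> list[str]:
    """List files that are missing or altered in target_dir after a copy.

    Hypothetical helper: it only checks files present in source_dir and
    only detects byte-level differences.
    """
    problems = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        target_file = target_dir / relative
        if not target_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif file_digest(source_file) != file_digest(target_file):
            problems.append(f"changed: {relative.as_posix()}")
    return problems
```

Calling find_migration_mismatches(Path("old_system"), Path("new_system")) on two hypothetical directories returns an empty list when every file arrived intact. Even then, a check like this proves only that the bits match, not that tomorrow's software will be able to read them.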
What to do? Some government agencies have a solution--of sorts. The National Archives requires technical documentation about how the records being submitted were created. And federal regulators, including the Securities & Exchange Commission, won't take digital filings from companies they oversee unless they are sent in plain-vanilla computer formats. "Otherwise, you start getting file formats that nobody is going to be able to read in 20 years," says Bill Combs, the SEC's computer expert.
Some technology managers are urging companies to make preservation more of a priority when buying new computer systems. Ellen Knapp, chief knowledge officer with Coopers & Lybrand, says companies need to give info-tech managers more input so that incompatible systems don't compound the migration problem. "Some companies have shorter visions when purchasing new technology," she says, "and end up having more compatibility problems migrating data as a result."
Ray Paddock, a director for Storage Technology Corp., says the problem is so bad for some of his clients that they're creating new databases just to decipher the data they have on tape and disks. Others, he says, are simply keeping the old version of the software used to create documents.
NO STANDARDS. Meanwhile, the government is looking into establishing durability standards for digital media. A task force--including representatives of Eastman Kodak, IBM, and archivists at leading museums and universities--has agreed on a digital longevity test ultimately aimed at increasing the life span of CD-ROMs and other types of digital media. The only problem: So far, no manufacturer has tested its products using the aging test created by the task force. And the group is still working on a standard for magnetic tape.
Others are at work on new technologies to solve the problem. NORSAM Technologies in Los Alamos, N.M., for example, is promoting its HD-Rosetta project, which permanently stores historical documents--but only if they are converted from digital back to analog recording formats.
But at least one remedy being offered by researchers sounds a lot more like the distant past than the future: Cobblestone Software Inc. in Lexington, Mass., is promoting PaperDisk, which uses paper to print out complex patterns of dots and dashes representing digitized files. Cobblestone President Tom Antognini claims it should last for centuries--or about as long as old-fashioned, high-quality paper.