DNA Method May Enable Storage of All World’s Data

U.K. scientists developed a way of storing data from a million compact discs in a gram of DNA, a method that could potentially house all the world’s digital information.

Computer files totaling 739 kilobytes of hard-disk storage were encoded and made into DNA that fit inside a test tube. The artificial DNA was sequenced using an Illumina Inc. machine and later reconstructed into the original files with 100 percent accuracy, according to Nick Goldman and Ewan Birney at the European Molecular Biology Laboratory-European Bioinformatics Institute in Hinxton, England. The research was published today in the journal Nature.

While a similar experiment at Harvard University has shown that data can be encoded and synthesized in DNA, the new method also introduces error correction during the process, which makes it more reliable, the scientists said. This technique could be scaled to store the 3 zettabytes, or 3,000 billion billion bytes, of data estimated to exist on Earth, and the only limitation to wide implementation is the high cost of synthesizing DNA, they said.
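The article does not spell out how the files were turned into DNA or how the error correction works. As a rough illustration only, and not the researchers' published scheme, the Python sketch below shows one simple way bytes could be mapped onto the four DNA bases and how redundant copies can absorb a read error. Every name in it (encode, decode, majority_vote), the base-3 mapping, the starting base "A", and the three-copy vote are assumptions made for this example.

```python
from collections import Counter

BASES = "ACGT"
START = "A"  # hypothetical starting context, assumed for this sketch


def bytes_to_trits(data: bytes) -> list[int]:
    """Write each byte as six base-3 digits (3**6 = 729 covers 0..255)."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(6):
            digits.append(byte % 3)
            byte //= 3
        trits.extend(reversed(digits))
    return trits


def trits_to_bytes(trits: list[int]) -> bytes:
    """Inverse of bytes_to_trits: fold each group of six digits into a byte."""
    out = bytearray()
    for i in range(0, len(trits), 6):
        value = 0
        for t in trits[i:i + 6]:
            value = value * 3 + t
        out.append(value)
    return bytes(out)


def encode(data: bytes) -> str:
    """Pick each base from the three that differ from the previous base,
    so the strand never repeats a letter twice in a row."""
    prev, strand = START, []
    for t in bytes_to_trits(data):
        choices = [b for b in BASES if b != prev]
        prev = choices[t]
        strand.append(prev)
    return "".join(strand)


def decode(strand: str) -> bytes:
    """Recover the base-3 digits from the strand, then the bytes."""
    prev, trits = START, []
    for base in strand:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    return trits_to_bytes(trits)


def majority_vote(reads: list[str]) -> str:
    """Toy error correction: take the most common base at each position
    across several equal-length reads of the same strand."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*reads))


message = b"sonnet"
strand = encode(message)

# Simulate one sequencing error in one of three reads.
wrong = "C" if strand[3] != "C" else "G"
damaged = strand[:3] + wrong + strand[4:]
recovered = majority_vote([strand, damaged, strand])
```

In this toy version a single bad read is outvoted by the two good ones, so decode(recovered) returns the original bytes. Real schemes have to cope with more than substitutions, such as missing or extra bases, which this sketch ignores.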

“It scales remarkably well; the coding scheme would work to a zettabyte level,” Birney told reporters on a conference call. “As the price comes down, it could work on a big scale for large corporations, governments, and in the future for individuals,” Goldman said.

Shakespeare’s Sonnets

The two scientists enlisted the help of Santa Clara, California-based Agilent Technologies Inc., which volunteered to synthesize DNA encoding an MP3 file of Martin Luther King’s “I Have a Dream” speech and a text file of all of Shakespeare’s sonnets, among other items.

The result was a collection of DNA the size of a piece of dust, which was mailed back to the institute. The scientists were then able to sequence the DNA and decode the files without errors, they said.

“Error correction is a ubiquitous technology” used in laptops and mobile phones, Goldman said. “In almost every circumstance, information we would like to store and transmit gets a little bit corrupted along the way, and the point of error-correcting code is to be able to not be too upset by that.”

The cost of storing information in DNA may drop within 10 years to one-hundredth of the current $12,400 per megabyte, according to the authors. At that price, it may be possible to realize Goldman’s cloud-computing business idea, where individuals would upload files from their hard drives to the cloud and the company would send them back as DNA to keep safe for a hundred or even a thousand years, he said.
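To put those figures in perspective, the short Python calculation below works out the projected price per megabyte and what the 739 kilobytes used in the experiment would cost at each price. It is back-of-the-envelope arithmetic, not a figure from the authors: it assumes a megabyte of 1,000 kilobytes and that the quoted price scales linearly with file size, and the variable names are made up for this example.

```python
# Figures quoted in the article.
CURRENT_COST_PER_MB = 12_400   # U.S. dollars per megabyte today
PROJECTED_DIVISOR = 100        # "one-hundredth" of today's cost in 10 years
EXPERIMENT_KB = 739            # size of the files encoded in the study

# Assumption: 1 megabyte = 1,000 kilobytes (decimal units).
KB_PER_MB = 1_000

projected_cost_per_mb = CURRENT_COST_PER_MB / PROJECTED_DIVISOR
experiment_mb = EXPERIMENT_KB / KB_PER_MB

# Assumption: cost grows linearly with the amount of data stored.
cost_today = experiment_mb * CURRENT_COST_PER_MB
cost_projected = experiment_mb * projected_cost_per_mb

print(f"Projected price: ${projected_cost_per_mb:,.0f} per megabyte")
print(f"739 KB today: about ${cost_today:,.0f}")
print(f"739 KB at the projected price: about ${cost_projected:,.0f}")
```

Under those assumptions the projected price is $124 per megabyte, and a file set the size of the one in the experiment would cost roughly $9,200 to store today and about $92 at the projected price.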

“This is something that’s just 10 years away,” Goldman said.