How to Turn Old Data into Business Gold
Robert Plant
Posted on Harvard Business Review: February 1, 2011 10:32 AM
As I gaze around my office at the stacks of odd-sized discs and tapes containing data in forgotten formats from companies such as Quadronix and General Automation, I keep wondering: When am I going to have my NASA moment?
Everyone thinks of NASA as the quintessential forward-looking organization, but to me a NASA moment is a sudden insight in which a present-day problem is solved with "useless" resources from the past.
I call it a NASA moment because at the beginning of January I was impressed to read that a team of scientists at the space agency struck gold in a stash of data from lunar seismometers that had been beaming signals to Earth from 1969 through 1977. Somehow no one had erased the data, and when NASA needed seismic moon readings, there they were. The scientists were able to apply new analytical techniques to the data, yielding valuable insights such as that the moon has an iron-rich core similar to Earth's. Needless to say, delving into the archives was a lot less expensive, time-consuming, and risky than sending new seismic probes to the moon would have been.
Help Your Company—and Yourself
There's a lesson in this for CIOs who are trying to make their budgets stretch ever further in these austere times. Development of small, low-cost, low-risk projects involving archived datasets can sometimes have a disproportionately high impact on the business—and on the CIO's reputation.
A company that has used this strategy to great effect is Houston-based Geotrace, which develops advanced seismic-processing techniques to help oil and gas producers get more from their fields. Often that means working with decades-old data created when fields were originally explored. Geotrace's reexamination of data from a 15-year-old oil field in the Gulf of Mexico, for instance, led a client to drill a new well that increased production 35%.
Finding the latent value in old data requires some effort. You'll need to perform a forensic analysis: Why was the data collected? What format is it in? Creating a semantic map using metadata of this type will enable analysts within the IT organization and the business units to identify possible uses. For example, a retailer that has kept and labeled its old POS data could now subject it to today's sophisticated analytical techniques, gaining a valuable understanding of long-term consumer trends.
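For readers who want to picture what such a semantic map might look like in practice, here is a minimal sketch in Python. It assumes a simple in-memory catalog; the `DatasetRecord` class, its field names, the `find_by_tag` helper, and the two sample entries are all hypothetical illustrations, not a description of any particular company's system.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    """Hypothetical catalog entry that answers the forensic questions."""
    name: str
    purpose: str       # why the data was collected
    file_format: str   # what format it is in
    first_year: int    # first year the data covers
    last_year: int     # last year the data covers
    tags: tuple = ()   # labels that analysts can search on


def find_by_tag(catalog, tag):
    """Return the names of all datasets labeled with the given tag."""
    return [record.name for record in catalog if tag in record.tags]


# Illustrative entries only; names and values are invented for this sketch.
catalog = [
    DatasetRecord("pos_archive", "store checkout transactions",
                  "fixed-width text", 1995, 2005, ("sales", "consumer")),
    DatasetRecord("field_survey", "original exploration survey",
                  "tape image", 1996, 1996, ("seismic",)),
]

print(find_by_tag(catalog, "consumer"))
```

Even a catalog this small lets an analyst in a business unit ask "what do we hold about consumers?" without knowing where the tapes are stored.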
Now Is a Good Time to Start
Unfortunately, many IT organizations haven't done a good job of preserving and labeling their old data. But it's never too late to start. Here's how to give yourself—or your successor, or your successor's successor—an opportunity to make the best use of today's data:
1. Develop an "information lifecycle management" governance policy that addresses such issues as: What are the specific criteria for retaining or discarding data? What is the rationale for the criteria? (Remember not to be too rigid: Sometimes the uses for archived data don't emerge for years.)
2. Establish policies for converting retained data into formats that can be read 10 or 20 years from now.
3. Make your data future-friendly by creating technical documents that explain how it is currently being used and describe any gaps or limitations that have been identified by users. Supplement these documents with brief video recordings by in-house analysts; the videos will function like the knowledge banks that consultants leave behind after they wrap up their consulting engagements.
4. Determine how accessible and secure you want the data to be (beyond just conformance with regulatory policies such as Sarbanes-Oxley and HIPAA). The levels of accessibility and security will affect the data's total cost of ownership. With storage costs constantly dropping, these assessments will need to be revised frequently.
5. Establish retrieval policies in collaboration with the business units and C-level executives, who will apply tactical and strategic perspectives. For example, in the case of an audit, data will need to be retrieved in a timely manner; data archived in standard formats in a digital archive stored off site will facilitate this process.
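As a rough illustration of the first step, retention criteria and their rationales can be recorded together so that a later reviewer sees why data was kept. The sketch below is one possible shape under stated assumptions: the `RetentionRule` class, the two sample rules, and the dictionary keys `regulated` and `documented` are hypothetical. Data that matches no rule is flagged for human review rather than discarded, in keeping with the advice not to be too rigid.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RetentionRule:
    """Hypothetical retention criterion paired with its rationale."""
    name: str
    rationale: str
    applies: Callable[[dict], bool]


# Illustrative rules only; a real policy would be set with the business units.
RULES = [
    RetentionRule("regulatory_hold",
                  "may be needed quickly in an audit",
                  lambda d: d.get("regulated", False)),
    RetentionRule("documented",
                  "documentation lets future analysts interpret it",
                  lambda d: d.get("documented", False)),
]


def review(dataset):
    """Return a decision and the names of the rules that justify it."""
    reasons = [rule.name for rule in RULES if rule.applies(dataset)]
    # No matching rule means a person should decide, not that data is deleted.
    decision = "retain" if reasons else "flag_for_review"
    return decision, reasons


print(review({"regulated": True}))
print(review({}))
```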
My own retrieval policy for the 31 years' worth of data stashed in my office is pretty basic: I'm the only person who can access it because I'm the only person who can understand my storage algorithm and decipher the labels on the boxes. But I have an excellent sense of where everything is, and I truly believe that at least some of it will prove valuable someday, once the right question arises. Anyone need to know the relationship between the locations of patent awardees and venture capital firms from 1972 to 1987, by any chance? Let me know, and I'll gladly engage in a little alchemy, turning my "useless" data into gold.