A Rich Vein for 'Reality Mining'

Researchers and companies are finding novel uses for information extracted from cell-phone data
Peter Arkle

In the aftermath of the September 11 attacks, U.S. officials quickly turned their attention to other potential targets, including California's Golden Gate Bridge. What would happen if terrorists took down the bridge between San Francisco and Marin County? How much of the region would be affected and for how long?

For insights, the Homeland Security Dept. turned to a Microsoft spin-off called Inrix. The startup analyzes data from satellite navigation gear that's widely installed on trucks and some cars to produce real-time traffic information, which it sells commercially. Parsing years of stored traffic data using proprietary software, Inrix was able to model not only the immediate impact of a Golden Gate Bridge catastrophe, but also how drivers in the region would work around it. In the model, the Bay Area pulls off an amazingly quick recovery. Within a few days, drivers understand what is happening and adapt to the new reality, says Inrix Chief Executive Bryan Mistele.

The technique Inrix used is called reality mining. It's a twist on data mining that allows researchers to extract information from the usage patterns of mobile phones and other wireless devices. Because these devices are almost always switched on and are constantly in contact with cellular base stations, they produce a persistent digital record of where the users are going, how long they stay, and whom they come into contact with. Particularly when phones are equipped with global positioning system chips, they can generate precise location maps in phone company databases. Such trails are far more accurate than human beings' subjective accounts of their comings and goings.
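To make the idea of a "persistent digital record" concrete, here is a minimal sketch in Python of how a log of base-station contacts could be collapsed into a list of places visited and time spent at each. It is an illustration only, not Inrix's or any carrier's method: the `Ping` record, its fields, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Ping:
    """Hypothetical record: one contact between a phone and a base station."""
    user: str    # anonymized subscriber id (assumed field)
    tower: str   # base-station label (assumed field)
    minute: int  # minutes since midnight (assumed field)


def stays(pings):
    """Collapse each user's time-ordered pings into (user, tower, arrive, leave)."""
    result = []
    ordered = sorted(pings, key=lambda p: (p.user, p.minute))
    for user, user_pings in groupby(ordered, key=lambda p: p.user):
        # Consecutive pings at the same tower count as one stay.
        for tower, run in groupby(user_pings, key=lambda p: p.tower):
            run = list(run)
            result.append((user, tower, run[0].minute, run[-1].minute))
    return result


# Invented sample data for illustration.
sample_pings = [
    Ping("a", "home", 480),
    Ping("a", "home", 500),
    Ping("a", "office", 540),
    Ping("a", "office", 1020),
    Ping("b", "cafe", 600),
]

for user, tower, arrive, leave in stays(sample_pings):
    print(f"{user}: {tower} for {leave - arrive} minutes")
```

Real records would be far messier (phones hop between neighboring towers, and timestamps are irregular), so a production system would need smoothing that this sketch omits.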

The reality miners excel at dreaming up exotic applications. In addition to helping cities prepare for possible terrorist attacks, they have devised ways to ease traffic congestion; helped city planners find the best locations for schools, hospitals, and convention centers; and enabled all types of businesses—not least, phone companies—to improve customer service. In the future, reality mining may also allow health officials to track and contain outbreaks of infectious diseases. "There is so much societal good that can come from this," says Alex "Sandy" Pentland, a Massachusetts Institute of Technology professor and reality mining pioneer. "Suddenly we have the ability to know what is happening with the mass of humanity."


Signals between phones and base stations can be detected by commercial sensing devices. But the detailed records of who is calling whom belong entirely to the phone companies. Right now, they make little use of that data, in part because they fear alienating subscribers worried about privacy infringement. But cellular operators have begun signing deals with business partners who are eager to market products based on specific phone users' location and calling habits. If reality mining catches on, phone companies' calling records will become precious assets. And these will only grow in value as customers use their phones to browse the Web, purchase products, and update their Facebook pages, and as marketers apply reality mining's toolkit to these activities.

In academia, reality miners are interested in applying the technology to areas such as disease management. Suppose health officials in a city suspect passengers arriving at an airport have been exposed to avian flu. In the not-too-distant future, they might be able to enlist cellular operators and use reality mining to monitor clusters of individuals thought to be at risk. Phone records could reveal that an unusual number of passengers on the flight are staying home from work or are in the hospital. With further digging, officials could uncover a record of contacts with taxi drivers, waiters, even random people in a supermarket. In such a crisis, the technology could save lives. "It's one of the application areas that [works] well both on the individual and on large groups," says Alex Kass, a researcher at consulting firm Accenture (ACN).

Even in an imagined crisis, however, such scenarios would raise red flags among privacy advocates. Guilherme Roschke, a staff attorney at the Electronic Privacy Information Center, a nonprofit in Washington, D.C., worries whenever people are monitored without their consent. "There is a lot of new information being collected, and it brings significant new capabilities," he says. "Whenever it's put to a new use, it must be disclosed."

Academic researchers acknowledge the risks and have begun setting rules for how data should be collected and used. "Our first assumption is that people own their own data," says MIT's Pentland. But companies may find it difficult to comply with the new rules. Nathan Eagle, one of Pentland's MIT colleagues, has access to a database that holds an entire month's worth of calling data for a whole European country—he won't say which one. The data set contains information on 250 million cell phones and land lines and some 12 billion phone calls. For research purposes, the data has been scrubbed of all information that might be used to identify individuals.

In Eagle's lab, which is funded partly by Nokia (NOK), he and his colleagues use the data to test the power of their algorithms. They've observed phenomena that may interest the phone companies that own the unscrubbed data. For example, each neighborhood has heavy users who influence other people, say, by proselytizing for new phone applications. Eagle says phone companies can identify these "influencers," and they'll bend over backwards to make sure these subscribers don't jump to rival carriers. "If someone who makes a lot of calls walks away, there's a higher potential that they'll take more people along with them," Eagle says.
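The article does not describe how Eagle or the carriers actually pick out "influencers." As a rough illustration only, the Python sketch below ranks callers in an anonymized call log by how many distinct people they call, using total call volume as a tiebreaker. The function name, the record layout, and the sample data are assumptions made for this example.

```python
from collections import defaultdict


def rank_callers(calls, top=3):
    """Rank callers by distinct contacts, then by call volume (a crude proxy).

    `calls` is an iterable of (caller, callee) pairs with anonymized ids.
    """
    contacts = defaultdict(set)  # caller -> set of distinct callees
    volume = defaultdict(int)    # caller -> number of outgoing calls
    for caller, callee in calls:
        contacts[caller].add(callee)
        volume[caller] += 1
    ranked = sorted(volume, key=lambda c: (-len(contacts[c]), -volume[c], c))
    return ranked[:top]


# Invented sample data for illustration.
sample_calls = [
    ("u1", "u2"), ("u1", "u3"), ("u1", "u4"), ("u1", "u2"),
    ("u2", "u1"),
    ("u3", "u1"), ("u3", "u2"),
]

print(rank_callers(sample_calls, top=2))
```

Counting contacts is only one possible measure. It says nothing about whether a heavy caller actually sways other people's choices, which is the behavior Eagle describes.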
