With the firehose of information enabled by Facebook, Twitter, location-based services, and other forms of social media, the era of Big Data is upon us. However, outside of the consumer world, the stakes are much higher: While advertisers and consumers are focused on monetizing sites that have hundreds of millions of users for a few pennies each, the ubiquity of connectivity and the growth of sensors have opened up a larger storehouse of information that will not only help businesses profit, but will also boost safety and enable environmental benefits.
For example, a Boeing (BA) jet generates 10 terabytes of information per engine every 30 minutes of flight, according to Stephen Brobst, the CTO of Teradata (TDC). So for a single six-hour, cross-country flight from New York to Los Angeles on a twin-engine Boeing 737 (the plane used by many carriers on this route), the total amount of data generated would be a massive 240 terabytes. There are about 28,537 commercial flights in the sky in the U.S. on any given day. Counting only commercial flights, the sensor data climbs into the petabyte scale in a single day. Multiply that by weeks, months, and years, and the scale of sensor data gets massive.
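The 240-terabyte figure follows directly from the numbers Brobst cites. As a rough sketch of the arithmetic (the function and parameter names here are illustrative, not from Teradata or Boeing, and the calculation assumes the data rate stays constant for the whole flight):

```python
def sensor_data_tb(engines, flight_hours, tb_per_engine_per_half_hour=10):
    """Estimate engine sensor data for one flight, in terabytes.

    Assumes a constant rate per engine for every 30 minutes of flight,
    using the 10 TB figure quoted in the article as the default.
    """
    half_hour_intervals = flight_hours * 2
    return engines * half_hour_intervals * tb_per_engine_per_half_hour


# Twin-engine 737 on a six-hour New York to Los Angeles flight:
# 2 engines x 12 half-hour intervals x 10 TB each.
flight_tb = sensor_data_tb(engines=2, flight_hours=6)
print(flight_tb)  # 240
```

Daily totals across all flights would depend on each flight's length and engine count, which the article does not break down, so the sketch stops at a single flight.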
Brobst, whose company sells data warehousing appliances and analytics software, points out that the Internet of Things will dwarf social media sites in its ability to generate data. The stakes and potential for monetization are huge in a world where roads have sensors and can communicate with the vehicles passing over them to determine traffic patterns, find more sustainable ways to route cars, and perhaps even generate data to be sold to insurance companies or other businesses seeking to tap transportation information.
Brobst says that within the next five years, sensor data will hit the crossover point with unstructured data generated by social media. From there, sensor data will dominate, outpacing social media by a factor of 10 to 20. However, using this data will be difficult for the time being, as there are no standards to make the data readable by anyone who does not possess the right software or algorithm. There's also a question of who owns the data.
For example, if a roadway has sensors embedded in it, does the federal or state government own them? Plus, what software does the government need to talk to those sensors, and since highway projects tend to be bid out by the states, will one state be using the same sensors or software as another? Once someone adds private industry to the mix, such as trying to assemble traffic data from cars, or trying to optimize routes for fuel efficiency, the questions become: Should a car manufacturer place that information in the car? Should the consumer opt in via a cell phone? Or should a consumer buy an insurance policy at a discount in exchange for getting a black box that will deliver a stream of data back to the insurer?
Other than ownership and interoperability questions, there's also the question of how long companies should store the data and who has access to it. With so many disparate sources of data, and no real vision right now for a way to get all the data formatted in a manner that could be used by any number of interested parties, larger providers like Microsoft (MSFT) are seeking to create marketplaces where data can be bought and sold among interested parties.
However, as Brobst says, the amount of data is only going to continue to rise, so figuring out how to manage it, what to keep, and how to mine it for useful information will become increasingly important. Effectively utilizing this data—from energy to fuel consumption to weather data—could also provide valuable tools for environmental sustainability. Big Data is a big opportunity, but it's also leading to big questions.