How Large Companies Tackle Big Data with Hadoop

Information-Processing Software

Most companies are facing an onslaught of information about customers from social networks, the Internet, and mobile devices. More than 80 percent of this data doesn't fit neatly into conventional relational databases, and forcing it to fit is simply too expensive. So companies such as Disney, GE, and Wal-Mart are using software called Hadoop to process large amounts of information, such as text, Facebook and Twitter updates, and clicks on websites, to gain a better understanding of their businesses, products, and customers.

Hadoop is open-source software that got its start with Web companies such as Google (GOOG) and Yahoo. In fact, the technology is widely used at Yahoo, and the company's software developers have contributed most of the code to the project. Now, a new group of startups such as Cloudera is taking that software and adding services that large companies like Nokia need to run Hadoop on their networks. Read on to see which companies are using Hadoop.

Photographer: David Paul Morris/Bloomberg
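For readers curious about what "processing clicks on websites" looks like in practice, Hadoop is built around the MapReduce programming model, in which a map step emits key-value pairs and a reduce step aggregates them by key. The following is a minimal sketch of that idea in plain Python, not Hadoop's own API; the tab-separated log format and the function names are assumptions made for illustration.

```python
from collections import defaultdict


def map_click(line):
    """Map step: emit (page, 1) for one click record.

    Assumes a hypothetical log format of 'user<TAB>page'.
    """
    user, page = line.rstrip("\n").split("\t")
    yield page, 1


def shuffle(pairs):
    """Group values by key, as the framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the counts emitted for one page."""
    return key, sum(values)


def count_clicks(lines):
    """Run map, shuffle, and reduce in a single process over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_click(line))
    return dict(reduce_counts(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    sample_log = ["u1\t/home", "u2\t/home", "u1\t/cart"]
    print(count_clicks(sample_log))
```

On a real cluster the map and reduce steps would run in parallel across many machines, each working on a slice of the log; here everything runs in one process to keep the sketch self-contained.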

AOL

The company wanted to improve how it targets ads to people by mining large amounts of information about users, according to a presentation the company gave at Hadoop World 2010. With Hadoop, AOL (AOL) can serve up ads for services near where a user is located or ads targeted toward a user's interests.

Bank of America

With Hadoop, Bank of America (BAC) has been able to analyze billions of records to gain a better understanding of the impact of new and existing financial products. The bank can now examine things like credit and operational risk of products across different lines of business including home loans, insurance, and online banking, according to a 2010 Hadoop Summit presentation by Abhishek Mehta, co-founder of Tresata, a company that does Hadoop-powered data analysis for the financial-services industry.

Disney

Hadoop became a cost-effective way for Walt Disney (DIS) to analyze and correlate information from its different businesses including theme-park attendance, reservations at resort hotels, and Disney Channel cable TV viewers, according to a PricewaterhouseCoopers report.

General Electric

The marketing and communications teams at GE (GE) can assess how the public perceives the company through sentiment analysis, according to a presentation at Hadoop World 2010. The company uses Hadoop to mine text such as updates on Facebook and Twitter along with news reports and other information on the Internet to understand—with 80 percent accuracy—how consumers feel about GE and its various divisions.

LinkedIn

Using Hadoop, LinkedIn (LNKD) culls information it collects on the site to find people you may know. It scores as many as 120 billion relationships per day using a statistical model to determine the probability that you may know another LinkedIn member. That's how the site serves up uncannily accurate suggestions for people with whom you may want to connect.
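To make the idea of scoring a relationship with a statistical model concrete, here is a small sketch that uses logistic regression to turn a few signals into a probability. It is not LinkedIn's actual model: the feature names, weights, and bias below are hypothetical values chosen only for illustration.

```python
import math

# Hypothetical weights for illustration; the real features and values are not public here.
WEIGHTS = {"shared_connections": 0.4, "same_company": 1.5, "same_school": 0.8}
BIAS = -3.0


def know_probability(features):
    """Return a probability in (0, 1) that two members know each other.

    Uses a logistic function over a weighted sum of the given features.
    """
    score = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))


def rank_candidates(candidates):
    """Sort candidate member names from most to least likely to be known."""
    return sorted(
        candidates,
        key=lambda name: know_probability(candidates[name]),
        reverse=True,
    )


if __name__ == "__main__":
    candidates = {
        "colleague": {"shared_connections": 5, "same_company": 1, "same_school": 0},
        "stranger": {"shared_connections": 0, "same_company": 0, "same_school": 0},
    }
    for name in rank_candidates(candidates):
        print(name, round(know_probability(candidates[name]), 3))
```

Each pair of members can be scored independently of every other pair, which is why this kind of job spreads well across a Hadoop cluster.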

IBM

IBM's (IBM) Watson computer relied on Hadoop when it competed against Brad Rutter and Ken Jennings on the game show Jeopardy. "There's an engine room in Watson and that engine room runs Hadoop," said Anant Jhingran, vice-president and chief technology officer of information management at IBM, during the Hadoop Summit 2011. "It plunges through those 200 million documents, it parses them, it analyzes them so that Watson appears extremely smart."

Nokia

In the past year, the handset maker began to analyze large quantities of information such as text and Web reports to gain a better understanding of customers. With the help of Hadoop, the company will be able to manage as much as 20 petabytes of information over the next year, up from several hundred terabytes over the last 12 months, according to Amy O'Connor, senior director of analytics at Nokia (NOK).

Orbitz

Hadoop let the travel website significantly reduce the cost per terabyte of storing data because it runs on commodity hardware. With the ability to store and analyze more data, Orbitz (OWW) has used Hadoop to try to improve hotel rankings and to measure and track how long it takes users to download Web pages.

Yahoo!

From the software's inception until June 2011, Yahoo (YHOO) contributed about 84 percent of the nearly 500,000 lines of code that still exist in the main body of Apache Hadoop, according to a blog post from Owen O'Malley, who worked at Yahoo and was one of the original contributors to Hadoop. In June 2011, Yahoo and Benchmark Capital announced the formation of an independent company called Hortonworks, which focuses on the software's development. O'Malley is a co-founder.

@WalmartLabs

Hadoop is part of Wal-Mart's (WMT) strategy to analyze large amounts of data to better compete against online retailers including Amazon.com (AMZN). For example, Wal-Mart bids on keywords on Google and Bing to drive Internet traffic to Walmart.com for specific products. Hadoop lets the company collect information about millions of keywords and come up with optimal bids for each word, according to Anand Rajaraman, senior vice-president of global e-commerce and head of @WalmartLabs.

Photographer: David Paul Morris/Bloomberg