
SAP Unveils Software for Spark, Open-Source Big Data Sieve

  • Product lets users mix application information, sensor data
  • Open-source Apache Spark cuts analysis times for data sets

SAP SE has become the latest big technology company to throw its weight behind open-source data-sifting software called Spark as it tackles information streaming from industries such as retail, telecommunications and transport.

The German company is releasing software this month called Hana Vora, which lets customers combine business data stored in SAP’s Hana database with information from industrial sensors, phone networks and other sources stored in Spark. SAP joins companies including International Business Machines Corp., Microsoft Corp. and Oracle Corp. in supporting Spark, which can quickly process data using groups of low-cost servers.

“This is lubricant -- it makes it easier to bring that data into the business context,” said SAP Chief Technology Officer Quentin Clark. The Hamburg Port Authority, for example, ensures incoming goods can be loaded onto trucks around one of Europe’s busiest ports as efficiently as possible by processing different types of data in tandem. The port combines data from sensors on the trucks, which flows into open-source tools, with information about the movement of goods that’s stored in SAP’s business applications.

SAP, the largest maker of business applications for financial and human-resources management, is seeking to expand beyond its traditional strongholds. Embracing new data-processing technology like Spark is important for SAP as it tries to build share for its Hana product, which has just 6.3 percent of the $34 billion global database market. That compares with 46.2 percent for Oracle and 19.4 percent for Microsoft, according to market researcher IDC.

Shares of SAP advanced 0.1 percent to 58.79 euros at 9:49 a.m. in Frankfurt. The stock is little changed this year.

Spark Projects

Originally developed at the University of California at Berkeley, Spark is seen as a successor to the widely used Hadoop tools for analyzing so-called big data, since it’s able to load huge data sets into fast computer memory.

IBM is committing more than 3,500 researchers and developers to Spark and building the software into its analysis and e-commerce products. Microsoft offers Spark as part of its Azure HDInsight cloud-computing service for online computation and storage. Oracle’s Big Data Appliance computer also runs Spark.

Silicon Valley startups are building new businesses on the technology too. Databricks Inc., which is cooperating with IBM, in June delivered a set of online tools for developers to manage Spark projects. Origami Logic Inc. applies big-data processing to help track marketing campaigns.

Combining large data sets from industrial sensors or video tracking of retail shoppers with information from business systems is becoming more important as companies capitalize on an area called the industrial Internet. SAP, Siemens AG and General Electric Co. have been investing in the area, and consultancy Accenture estimates that industrial Internet companies attracted $1.5 billion in venture-capital funding last year, primarily from large corporations.
