If 2014 was the year that Apache Hadoop sparked the big data revolution, 2015 may be the year that Apache Spark supplants it, thanks to Spark's superior capabilities for richer and more timely analysis.
“There is a strong industry consensus that Spark is the way to go,” said Curt Monash, head of the IT analyst firm Monash Research.
“Next year, you will see a lot of [Hadoop] use cases that transcend Hadoop,” said Ali Ghodsi, CEO and co-founder of Databricks. The company, formed by several of Spark's creators, offers a hosted Spark service as well as technical support for software distributors selling Spark packages.
Spark is an engine for analyzing data stored across a cluster of computers. Like Hadoop, Spark can be used to examine data sets that are too large to fit into a traditional data warehouse or a relational database. Also like Hadoop, Spark can work on unstructured data, such as event logs, that hasn’t been formatted into database tables.
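To make the event-log case concrete, here is a minimal sketch of how such a job might look in Spark's Python API. It is an illustration, not something taken from the people quoted here. The log format, the function names `parse_level` and `count_levels_with_spark`, and the severity labels are all assumptions, and the `SparkSession` entry point comes from Spark releases later than the ones discussed in this article.

```python
import re
from typing import Optional

# Assumed (hypothetical) log format: "2015-01-05 12:00:01 ERROR disk full"
LEVEL_PATTERN = re.compile(r"\b(DEBUG|INFO|WARN|ERROR)\b")


def parse_level(line: str) -> Optional[str]:
    """Return the first severity level found in a raw log line, or None."""
    match = LEVEL_PATTERN.search(line)
    return match.group(1) if match else None


def count_levels_with_spark(path: str) -> dict:
    """Count log lines per severity level across a cluster (requires pyspark)."""
    # Imported lazily so parse_level can be used without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("log-levels").getOrCreate()
    try:
        # Each raw text line becomes one record; no table schema is needed.
        lines = spark.sparkContext.textFile(path)
        counts = (
            lines.map(parse_level)
            .filter(lambda level: level is not None)
            .map(lambda level: (level, 1))
            .reduceByKey(lambda a, b: a + b)
            .collect()
        )
        return dict(counts)
    finally:
        spark.stop()
```

Calling `count_levels_with_spark` on a path to raw log files would return a dictionary mapping each severity level to its line count. The parsing step is ordinary Python applied to each line, which is what lets Spark work with data that was never loaded into database tables.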