Spark 1.0 is the project's biggest release yet, with a long list of new features for enterprise customers.
Congratulations to the Apache Spark community for today’s release of Spark 1.0, which includes contributions from more than 100 people (including Cloudera’s own Diana Carroll, Mark Grover, Ted Malaska, Sean Owen, Sandy Ryza, and Marcelo Vanzin). We think this release is an important milestone in the continuing rapid uptake of Spark by enterprises — which is supported by Cloudera via Cloudera Enterprise 5 — as a modern, general-purpose processing engine for Apache Hadoop.
Spark 1.0 contains, among other things:
- History Server, for improved monitoring capabilities
- Improvements to MLlib (sparse vector support)
- Improvements to Apache Avro integration
- Support for Java 8 and lambda expressions
- Simplified job submission to YARN clusters
- Spark Streaming integration with Kerberos
- Authentication of all Spark communications
- Introduction of Spark SQL (alpha)
- Unified application configuration and submission through spark-submit
- PySpark on YARN support
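To give a feel for what sparse vector support means in practice, here is a minimal pure-Python sketch. It does not use the MLlib API: the `SparseVector` class and its `to_dense` and `dot` methods are hypothetical names chosen for illustration. The idea is to store only the nonzero entries as index/value pairs, so that memory use and dot-product cost scale with the number of nonzeros rather than with the full dimension.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class SparseVector:
    """Hypothetical illustration type; NOT the MLlib class."""
    size: int        # full dimension of the vector
    indices: tuple   # positions of the nonzero entries, strictly increasing
    values: tuple    # the nonzero entries, aligned with `indices`

    def __post_init__(self):
        # Validate the index/value pairs once, at construction time.
        if len(self.indices) != len(self.values):
            raise ValueError("indices and values must have the same length")
        if any(i < 0 or i >= self.size for i in self.indices):
            raise ValueError("index out of range")
        if list(self.indices) != sorted(set(self.indices)):
            raise ValueError("indices must be strictly increasing")

    def to_dense(self) -> List[float]:
        # Expand to a full-length list, filling the gaps with zeros.
        dense = [0.0] * self.size
        for i, v in zip(self.indices, self.values):
            dense[i] = v
        return dense

    def dot(self, dense: Sequence[float]) -> float:
        # Only the stored nonzeros contribute, so the cost is O(nonzeros).
        if len(dense) != self.size:
            raise ValueError("dimension mismatch")
        return sum(v * dense[i] for i, v in zip(self.indices, self.values))
```

For example, `SparseVector(6, (1, 4), (3.0, 2.0))` represents a six-element vector with only two stored entries. Feature vectors in text and recommendation workloads are often mostly zeros, which is where this kind of representation pays off.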
Spark 1.0 will be packaged inside Cloudera’s CDH 5.1 release and available as a Cloudera Manager 5.1 parcel, both of which are forthcoming.
Fire it up!
Justin Kestelyn is Cloudera’s developer outreach director.
Spark Summit 2014 is coming (June 30 – July 2)! Register here to get 20% off the regular conference price.