Apache Spark 1.0 is Released

Spark 1.0 is the project’s biggest release yet, with a list of new features for enterprise customers.

Congratulations to the Apache Spark community for today’s release of Spark 1.0, which includes contributions from more than 100 people (including Cloudera’s own Diana Carroll, Mark Grover, Ted Malaska, Sean Owen, Sandy Ryza, and Marcelo Vanzin). We think this release is an important milestone in the continuing rapid uptake of Spark by enterprises — which is supported by Cloudera via Cloudera Enterprise 5 — as a modern, general-purpose processing engine for Apache Hadoop.

Spark 1.0 contains, among other things:

  • History Server, for improved monitoring capabilities
  • Improvements to MLlib (sparse vector support)
  • Improvements to Apache Avro integration
  • Support for Java 8 and lambda expressions
  • Simplified job submission to a YARN cluster
  • Spark Streaming integration with Kerberos
  • Authentication of all Spark communications
  • Introduction of Spark SQL (alpha)
  • Unified application configuration and submission through spark-submit
  • PySpark on YARN support

(You’ll find more details about these features in the Release Notes. You can also read more from Databricks, here.)
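
To make the sparse vector item in the list above a little more concrete, here is a minimal pure-Python sketch of the general idea: store only the nonzero entries of a vector and compute on those. This is not MLlib's API; the class name `SparseVec` and its methods are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SparseVec:
    """Hypothetical sparse vector (not an MLlib class): stores nonzero entries only."""
    size: int
    entries: Dict[int, float]  # index -> nonzero value

    @staticmethod
    def from_dense(values: List[float]) -> "SparseVec":
        # Keep only the positions whose value is nonzero.
        nonzero = {i: v for i, v in enumerate(values) if v != 0.0}
        return SparseVec(len(values), nonzero)

    def dot(self, other: "SparseVec") -> float:
        if self.size != other.size:
            raise ValueError("vectors must have the same size")
        # Walk the smaller set of stored entries and look each index up in the larger.
        small, large = sorted((self.entries, other.entries), key=len)
        return sum(v * large[i] for i, v in small.items() if i in large)

    def to_dense(self) -> List[float]:
        # Fill every position that is not stored with 0.0.
        return [self.entries.get(i, 0.0) for i in range(self.size)]


a = SparseVec.from_dense([0.0, 3.0, 0.0, 0.0, 2.0])
b = SparseVec.from_dense([1.0, 4.0, 0.0, 0.0, 0.0])
print(a.entries)  # only two of the five positions are stored
print(a.dot(b))   # 12.0
```

For mostly-zero feature vectors, which are common in text and recommendation workloads, this kind of representation saves memory and skips work on the zero entries.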

Spark 1.0 will be packaged inside Cloudera’s CDH 5.1 release and available as a Cloudera Manager 5.1 parcel, both of which are coming soon.

Fire it up!

Justin Kestelyn is Cloudera’s developer outreach director.


Spark Summit 2014 is coming (June 30 – July 2)! Register here to get 20% off the regular conference price.

12 Responses
  • What is the date for CDH 5.1 & Spark 1.0? / June 12, 2014 / 2:27 PM

    When Spark 1.0 is released with CDH 5.1, will it include Spark SQL?

    What is the expected date for CDH 5.1?

    • Justin Kestelyn (@kestelyn) / June 13, 2014 / 9:39 AM

      As explained above, CDH 5.1 will contain Spark 1.0, including Spark SQL. However, the latter is currently considered “alpha” and thus will not be supported in that release.

      CDH 5.1 will be available very soon (mid-Summer).

  • Sourabh Chaki / June 17, 2014 / 2:53 AM

    Can we integrate Spark 1.0 with CDH 5.0? If yes, what are the steps for that? I believe we need to explode the CDH 5.0 parcel and replace Spark 0.9 with 1.0. Will this approach work? Please confirm.

  • Manoj / July 10, 2014 / 12:20 PM

    Will the Spark 1.0 released with CDH 5.1 include Spark Streaming?

  • Manoj / July 10, 2014 / 12:22 PM

    We are currently at CDH 4.6 and would like to test Spark SQL and Spark Streaming. What are our options?

  • Calin-Andrei Burloiu / August 11, 2014 / 6:17 AM

    We upgraded to CDH 5.1.0 and we now have Spark 1.0.0. Unfortunately, I can’t see Spark History Server when I try to add it in the Spark service’s Instances tab. Additionally, I can no longer see running jobs in the Master Web UI. Do you happen to know what the problem is?

    • Justin Kestelyn (@kestelyn) / August 11, 2014 / 8:58 AM

      For more rapid response, I recommend you post this issue to the “Spark” area at cloudera.com/community.

  • tonsat / September 18, 2014 / 4:44 AM

    We are not able to find Spark SQL in CDH 5.1.2. Is it included? If not, is there a way we can install it?

    • Justin Kestelyn (@kestelyn) / September 18, 2014 / 9:13 AM

      Spark SQL is in 5.1.2 (along with other Spark modules). It’s an alpha, however, and thus not supported.
