New functionality includes support for spot instances, automatic job submission, and integrated setup for HA and Kerberized clusters.
Cloudera Director is the manifestation of Cloudera’s commitment to provide a simple and reliable way to deploy, scale, and manage Apache Hadoop clusters in the cloud of your choice. Cloudera Director lets you deploy production-ready clusters for big data applications and successfully run workloads in the cloud. With Cloudera Director 2.0,
Spark Dataflow from Cloudera Labs is now part of Google’s new Dataflow SDK, which will be proposed to the Apache Incubator.
Spark Dataflow is an experimental implementation of Google’s Dataflow programming model that runs on Apache Spark. The initial implementation was written by Josh Wills, and entered Cloudera Labs exactly a year ago. Since then, we’ve seen a number of contributions to the project, culminating in the recent addition of an implementation of streaming (running on Spark Streaming) by Amit Sela from PayPal.
Which topics attracted the most community interest in 2015? Find out below.
It’s our annual custom to bring you a list of this blog’s most popular posts of the year. (See the 2013 and 2014 versions.) Why? Because this list reflects interests across the ecosystem; it’s one of the best passive surveys we have, actually.
As usual, when drawing conclusions, be sure to account for data skew.
With this new beta release, column-level privileges set via Apache Sentry (incubating) are now enforced on Spark/MapReduce jobs.
Cloudera is excited to announce the availability of the second beta release for RecordService. This release is based on CDH 5.5 and provides some new features, including:
- Support for Sentry column-level security. Previously, column-level access control required the use of views; now,
The super-active Apache Spark community is exerting a strong gravitational pull within the Apache Hadoop ecosystem. I recently had the opportunity to ask Cloudera’s Apache Spark committers (Sean Owen, Imran Rashid [PMC], Sandy Ryza, and Marcelo Vanzin) for their perspectives on how the Spark community has worked and is working together, and on the work to be done via the One Platform initiative to make the Spark stack enterprise-ready.