Category Archives: General

The Top 10 Cloudera Engineering Blog Posts of 2015

Categories: Community General Hadoop

Which topics attracted the most community interest in 2015? Find out below.

It’s our annual custom to bring you a list of this blog’s most popular posts of the year. (See the 2013 and 2014 versions.) Why? Because this list reflects interests across the ecosystem; it’s one of the best passive surveys we have, actually.

As usual, when drawing conclusions, be sure to account for data skew.


Announcing RecordService Beta 2: Brings Column-level Security to Apache Spark and MapReduce

Categories: General Platform Security & Cybersecurity Sentry Spark

With this new beta release, column-level privileges set via Apache Sentry (incubating) are now enforced on Spark/MapReduce jobs.

Cloudera is excited to announce the availability of the second beta release for RecordService. This release is based on CDH 5.5 and provides some new features, including:

  • Support for Sentry column-level security. Previously, column-level access control required the use of views; now,


Meet Cloudera’s Apache Spark Committers

Categories: Community General Meet the Engineer Spark

The super-active Apache Spark community is exerting a strong gravitational pull within the Apache Hadoop ecosystem. I recently had the opportunity to ask Cloudera’s Apache Spark committers (Sean Owen, Imran Rashid [PMC], Sandy Ryza, and Marcelo Vanzin) for their perspectives on how the Spark community has worked and is working together, and on the work to be done via the One Platform initiative to make the Spark stack enterprise-ready.

Recently,


How-to: Install Apache Zeppelin on CDH

Categories: General Guest How-to Spark

Our thanks to Karthik Vadla and Abhi Basu, Big Data Solutions engineers at Intel, for permission to re-publish the following (which was originally available here).

Data science is not a new discipline. However, with the growth of big data and the adoption of big data technologies, the demand for better-quality data has grown exponentially. Today, data science is applied to every facet of life: product validation through fault prediction,
