Tag Archives: security

How-to: Prepare Unstructured Data in Impala for Analysis

Categories: How-to Impala

Learn how to build an Impala table around data that comes from non-Impala, or even non-SQL, sources.

As data pipelines start to include more aspects such as NoSQL or loosely specified schemas, you might encounter situations where you have data files (particularly in Apache Parquet format) for which you do not know the precise table definition. This tutorial shows how you can build an Impala table around data that comes from non-Impala or even non-SQL sources,

Meet Cloudera’s Apache Spark Committers

Categories: Community General Meet the Engineer Spark

The super-active Apache Spark community is exerting a strong gravitational pull within the Apache Hadoop ecosystem. I recently had the opportunity to ask Cloudera’s Apache Spark committers (Sean Owen, Imran Rashid [PMC], Sandy Ryza, and Marcelo Vanzin) for their perspectives on how the Spark community has worked and is working together, and on the work to be done via the One Platform initiative to make the Spark stack enterprise-ready.

Recently,

Community Meetups at Strata + Hadoop World NYC 2015

Categories: Community Events

Strata + Hadoop World 2015 NYC is more than a daytime conference; it’s also a nighttime meetup experience. (Plus, there are a bunch of book signings.)

It won’t be long before we’re all in NYC for Strata + Hadoop World (Sept. 29-Oct. 1; if you haven’t registered yet, a 20% discount is still available). So, consider the following for your evening agenda:

How-to: Run Apache Mesos on CDH

Categories: CDH Cloudera Manager Guest Ops and DevOps

Big Industries, Cloudera systems integration and reseller partner for Belgium and Luxembourg, has developed an integration of Apache Mesos and CDH that can be deployed and managed through Cloudera Manager. In this post, Big Industries’ Rob Gibbon explains the benefits of deploying Mesos on your cluster and walks you through the process of setting it up.

[Editor’s Note: Mesos integration is not currently supported by Cloudera, thus the setup described below is not recommended for production use.]

Apache Mesos is a distributed,

What’s New in Cloudera Director 1.5?

Categories: Cloud

Cloudera Director 1.5 is now available; this post describes what’s inside, including a new open source plugin interface.

Cloudera Director is the manifestation of Cloudera’s commitment to providing a simple and reliable way to deploy, scale, and manage Apache Hadoop in the cloud of your choice. With Cloudera Director 1.5, we continue the story of enabling production-ready clusters and big data applications by focusing on the following themes.
