Cloudera Engineering Blog · Cloud Posts

Cloudera Enterprise 5.2 is Released

Cloudera Enterprise 5.2 contains new functionality for security, cloud deployments, and real-time architectures, and support for the latest open source component releases and partner technologies.

We’re pleased to announce the release of Cloudera Enterprise 5.2 (comprising CDH 5.2, Cloudera Manager 5.2, Cloudera Director 1.0, and Cloudera Navigator 2.1).

Using Impala, Amazon EMR, and Tableau to Analyze and Visualize Data

Our thanks to AWS Solutions Architect Rahul Bhartia for allowing us to republish his post below.

Apache Hadoop provides a great ecosystem of tools for extracting value from data in various formats and sizes. Originally focused on large-batch processing with tools like MapReduce, Apache Pig, and Apache Hive, Hadoop now provides many tools for running interactive queries on your data, such as Impala, Drill, and Presto. This post shows you how to use Amazon Elastic MapReduce (Amazon EMR) to analyze a data set available on Amazon Simple Storage Service (Amazon S3) and then use Tableau with Impala to visualize the data.

The Definitive "Getting Started" Tutorial for Apache Hadoop + Your Own Demo Cluster

Using this new tutorial alongside Cloudera Live is now the fastest, easiest, and most hands-on way to get started with Hadoop.

At Cloudera, developer enablement is one of our most important objectives. One only has to look at examples from history (Java or SQL, for example) to know that knowledge fuels the ecosystem. That objective is what drives initiatives such as our community forums, the Cloudera QuickStart VM, and this blog itself.

Secrets of Cloudera Support: Using OpenStack to Shorten Time-to-Resolution

Automating the creation of short-lived clusters for testing purposes frees our support engineers to spend more time on customer issues.

The first step for any support engineer is often to replicate the customer’s environment in order to identify the problem or issue. Given the complexity of Cloudera customer environments, reproducing a specific issue is often quite difficult, as a customer’s problem might only surface in an environment with specific versions of Cloudera Enterprise (CDH + Cloudera Manager), configuration settings, certain number of nodes, or the structure of the dataset itself. Even with Cloudera Manager’s awesome setup wizards, setting up Apache Hadoop can be quite time consuming, as the software was never designed with ephemeral clusters in mind.

Meet the Engineer: Andrei Savu

In this installment of “Meet the Engineer”, our subject is Andrei Savu!

What do you do at Cloudera?

Cloudera Live: The Instant Apache Hadoop Experience

Get started with Apache Hadoop and use-case examples online in just seconds.

Today, we announced Cloudera Live, a new online service for developers and analysts (currently in public beta) that makes it easy to learn, explore, and try out CDH, Cloudera’s open source software distribution containing Apache Hadoop and related projects. No downloads, no installations, no waiting — just point-and-play!

Best Practices for Deploying Cloudera Enterprise on Amazon Web Services

This FAQ contains answers to the most frequently asked questions about the architecture and configuration choices involved.

In December 2013, Cloudera and Amazon Web Services (AWS) announced a partnership to support Cloudera Enterprise on AWS infrastructure. Along with this announcement, we released a Deployment Reference Architecture Whitepaper. In this post, you’ll get answers to the most frequently asked questions about the architecture and the configuration choices that have been highlighted in that whitepaper.

How-to: Use Impala on Amazon EMR

Developers, rejoice: Impala is now available on EMR for testing and evaluation.

Very recently, Amazon Web Services announced support for running Cloudera Impala queries on its Elastic MapReduce (EMR) service. This is very good news for EMR users — as well as for users of other platforms interested in kicking Impala’s tires in a friction-free way. It’s also yet another sign that Impala is rapidly being adopted across the ecosystem as the gold standard for interactive SQL and BI queries on Apache Hadoop.

Meet the Project Founder: Tom White

Tom

In this new installment of our “Meet the Project Founder” series, meet Tom White, founder of Apache Whirr, PMC Member for multiple other projects (Apache Hadoop, Apache Avro, Apache Bigtop, Apache Sqoop), and author of O’Reilly Media’s best-selling book, Hadoop: The Definitive Guide.

How-to: Create a CDH Cluster on Amazon EC2 via Cloudera Manager

Editor’s Note (added Feb. 28, 2014): The instructions below are deprecated for Cloudera Manager releases beyond 4.5. Please refer to this doc for instructions pertaining to releases 4.6 and later.

Cloudera Manager includes a new express installation wizard for Amazon Web Services (AWS) EC2. Its goal is to enable Cloudera Manager users to provision CDH clusters and Cloudera Impala (the open source distributed query engine for Apache Hadoop) on EC2 as easily as possible (for testing and development purposes only, not supported for production workloads) - and thus is currently the fastest way to provision a Cloudera Manager-managed cluster in EC2.

Older Posts