Category Archives: CDH

Multi-node Clusters with Cloudera QuickStart for Docker

Categories: CDH, QuickStart VM

Getting hands-on with a multi-node cluster for self-learning or testing is now even easier.

Last December, we introduced the Cloudera QuickStart Docker image to make it easier than ever to explore Cloudera’s distributed data processing platform, including tools such as Apache Impala (incubating), Apache Spark, and Apache Solr. While the single-node getting-started image was well received, we noted a large number of requests from the community for a multi-node CDH deployment via Docker.

Cloudera Enterprise 5.8 is Now Available

Categories: CDH, Cloudera Manager, Hadoop

Cloudera Enterprise 5.8 is now generally available (comprising CDH 5.8, Cloudera Manager 5.8, and Cloudera Navigator 2.7).

Cloudera is excited to announce the general availability of Cloudera Enterprise 5.8! Main highlights of this release include Impala read/write support on Amazon S3, a redesigned SQL query editor GUI, the expansion of role-based access control functionality to Cloudera Search, and the GA of Cloudera Navigator Optimizer to facilitate and optimize workload migrations.

What’s New in Cloudera Director 2.1?

Categories: CDH, Cloud, Cloudera Manager, Hadoop

This new release contains, among other things, support for usage-based billing, deployments to Microsoft Azure, and deployments across providers or regions.

Cloudera Director is a manifestation of Cloudera’s commitment to provide a simple and reliable way to deploy, scale, and manage Apache Hadoop in the cloud of your choice. Cloudera Director enables you to deploy production-ready clusters for big data applications and successfully run workloads in the cloud. With Cloudera Director 2.1,

How-to: Detect and Report Web-Traffic Anomalies in Near Real-Time

Categories: CDH, Flume, Impala, Spark, Use Case

This framework, based on Apache Flume, Apache Spark Streaming, and Apache Impala (incubating), can detect and report on abnormal bad HTTP requests within seconds.

Website performance and availability are mission-critical for companies of all types and sizes, not just those with a revenue stream directly tied to the web. Web pages can become unavailable for many reasons, including overburdened backing data stores or content-management systems or a delay in load times of third-party content such as advertisements.

New in CDH 5.7: Improved Performance, Security, and SQL Experience in Hue

Categories: CDH, Hue

CDH 5.7 includes more than 1,500 changes to Hue, the Web UI that makes Apache Hadoop easier to use.

In this new release, the emphasis on performance and security carries over from 5.5. The overall improvement in the SQL user experience is also considerable.

In this post, we’ll cover some highlights.

New Hive Metastore Interface

This app is now on a single page, 
