Category Archives: Hadoop

New in CDH 5.1: HDFS Read Caching

Categories: CDH Hadoop HDFS Impala Performance

Applications using HDFS, such as Impala, will be able to read data up to 59x faster thanks to this new feature.

Server memory capacity and bandwidth have increased dramatically over the last few years. Beefier servers make in-memory computation quite attractive, since a lot of interesting data sets can fit into cluster memory, and memory is orders of magnitude faster than disk.

For the latest release of CDH 5.1,

Read More

Progress Report: Cloudera Community Forums After One Year

Categories: Community Hadoop

Cloudera Community forums are proving their value as an important contributor to a rich user experience.

It’s been almost exactly one year since the debut of the Cloudera Community forums. In addition to doing the birthday shout-out, I thought it would be interesting to bring you up to date about adoption and usage patterns.

Launched in response to candid feedback from our customers, these forums have seen steadily growing use,

Read More

New in Cloudera Manager 5.1: Direct Active Directory Integration for Kerberos Authentication

Categories: Hadoop How-to Security

With this new release, setting up a separate MIT KDC for cluster authentication services is no longer necessary.

Kerberos (initially developed by MIT in the 1980s) has been adopted by every major component of the Apache Hadoop ecosystem. Consequently, Kerberos has become an integral part of the security infrastructure for the enterprise data hub (EDH).

Until recently, the preferred architecture was to configure your Hadoop cluster to connect directly to an MIT key distribution center (KDC) for authentication services.

Read More

Cloudera Enterprise 5.1 is Now Available

Categories: CDH Cloudera Manager Hadoop

Cloudera Enterprise’s newest release contains important new security and performance features, and offers support for the latest innovations in the open source platform.

We’re pleased to announce the release of Cloudera Enterprise 5.1 (comprising CDH 5.1, Cloudera Manager 5.1, and Cloudera Navigator 2.0).

Cloudera Enterprise 5, released April 2014, was a milestone for users in terms of security, performance, and support for the latest community-driven innovations, and this update includes significant new investments in those areas,

Read More

Jay Kreps, Apache Kafka Architect, Visits Cloudera

Categories: Cloudera Life Hadoop Kafka

It was good to see Jay Kreps (@jaykreps), the LinkedIn engineer who is the tech lead for that company’s online data infrastructure, visit Cloudera Engineering yesterday to spread the good word about Apache Kafka.

Kafka, of course, was originally developed inside LinkedIn and entered the Apache Incubator in 2011. Today, it is being widely adopted as a pub/sub framework that works at massive scale (and which is commonly used to write to Apache Hadoop clusters,

Read More