Category Archives: CDH

Learn for Free How to Deploy Cloudera Enterprise on Microsoft Azure

Categories: Altus CDH Cloud Ops and DevOps Training

At Cloudera, we spend our time helping customers benefit from data. We help them with different types of data—structured, semi-structured, or raw unstructured. We also help them implement solutions for storing, tracing, securing, processing, enriching, analyzing, and visualizing it.

Over the past several years, we’ve observed that customers are increasingly working with their data in the cloud, and Microsoft’s Azure cloud service is a popular deployment option. Cloudera University is pleased to announce a free course,

Read more

Proactive Data Pipeline Alerting with Pulse

Categories: CDH Events Guest Search

In mid-2017, we were working with one of the world’s largest healthcare companies to put a new data application into production. The customer had grown through acquisition, and to maintain compliance with the FDA, they needed to aggregate data in real time from dozens of different divisions of the company. The consumers of this application, of course, did not care how we built the data pipeline. However, they cared greatly that if it broke,

Read more

Protecting Hadoop Clusters From Malware Attacks

Categories: Altus CDH Platform Security & Cybersecurity

Two new strains of malware, XBash and DemonBot, are targeting Apache Hadoop servers for Bitcoin mining and DDoS purposes. This malware scans the internet so aggressively for Hadoop clusters that an infection can occur within minutes of an insecure cluster being placed on the open internet. This blog post describes the mechanism this malware uses and offers specific actions to protect your Hadoop-based clusters.

A History of Hadoop Malware

Roughly two years ago there was a spate of attacks against the open source database solution MongoDB,

Read more

New in Cloudera Enterprise 6: Apache Hive 2.1

Categories: CDH Hive

We recently released Cloudera Enterprise 6.0 featuring significant improvements across a number of core components. In this blog post, we’re going to focus on Apache Hive 2.1.

Hive’s Approach to Rebase: Stability and Quality Most Important

Prior to the release of Cloudera Enterprise 6.0, Cloudera’s supported platform included Apache Hive 1.1 augmented with numerous features, enhancements and fixes from the later Apache Hive releases—all of which were included only after rigorous quality criteria were met.

Read more

Third-Party Libraries in C6

Categories: CDH General Platform Security & Cybersecurity

Cloudera has put a significant amount of work into upgrading the third-party libraries used in our just-released C6 version. A major release gave us the opportunity to upgrade many of the libraries we use. These upgrades allow us to avoid security vulnerabilities, use modern versions of libraries, and standardize library versions across CDH.

Modern software development relies on reusing other people’s code. Code reused in this fashion is called a “third-party library.”

Read more