Category Archives: CDH

Proactive Data Pipeline Alerting with Pulse

Categories: CDH Events Guest Search

In mid-2017, we were working with one of the world’s largest healthcare companies to put a new data application into production. The customer had grown through acquisition, and to maintain compliance with the FDA, they needed to aggregate data in real time from dozens of different divisions of the company. The consumers of this application, of course, did not care how we built the data pipeline. However, they cared greatly that if it broke,

Read more

Protecting Hadoop Clusters From Malware Attacks

Categories: Altus CDH Platform Security & Cybersecurity

Two new strains of malware, XBash and DemonBot, are targeting Apache Hadoop servers for Bitcoin mining and DDoS purposes. This malware is scanning the internet so vigorously for Hadoop clusters that an infection can occur within minutes of an insecure cluster being placed on the open internet. This blog post describes the mechanism this malware uses and offers specific actions to protect your Hadoop-based clusters.

A History of Hadoop Malware

Roughly two years ago there was a spate of attacks against the open source database solution MongoDB,

Read more

New in Cloudera Enterprise 6: Apache Hive 2.1

Categories: CDH Hive

We recently released Cloudera Enterprise 6.0 featuring significant improvements across a number of core components. In this blog post, we’re going to focus on Apache Hive 2.1.

Hive’s Approach to Rebase: Stability and Quality Most Important

Prior to the release of Cloudera Enterprise 6.0, Cloudera’s supported platform included Apache Hive 1.1 augmented with numerous features, enhancements, and fixes from later Apache Hive releases, all of which were included only after rigorous quality criteria were met.

Read more

Third-Party Libraries in C6

Categories: CDH General Platform Security & Cybersecurity

Cloudera has put a significant amount of work into upgrading the third-party libraries used in our just-released C6 version. This major release gave us the opportunity to upgrade many of the libraries we use. These upgrades allow us to avoid security vulnerabilities, use modern versions of libraries, and standardize library versions across CDH.

Modern software development relies on reusing other people’s code. Code reused in this fashion is called a “third-party library.”

Read more

Custom Hostname for Cloud Instances

Categories: Altus CDH Cloud Cloudera Director How-to Ops and DevOps Tools

Cloudera Altus Director provides the simplest way to deploy and manage Cloudera Enterprise in the cloud. It enables customers to unlock the benefits of enterprise-grade Hadoop while leveraging the flexibility, scalability, and affordability of the cloud. It integrates seamlessly with Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure, and provides support to build custom plugins for other public or private cloud environments.

Motivation

While automating the provisioning of a cluster on the cloud using Altus Director,

Read more