In mid-2017, we were working with one of the world’s largest healthcare companies to put a new data application into production. The customer had grown through acquisition, and to maintain compliance with the FDA, they needed to aggregate data in real time from dozens of different divisions of the company. The consumers of this application, of course, did not care how we built the data pipeline. However, they cared greatly that if it broke,
Two new strains of malware–XBash and DemonBot–are targeting Apache Hadoop servers for Bitcoin mining and DDoS purposes. This malware scans the internet so aggressively for Hadoop clusters that an infection can occur within minutes of an insecure cluster being placed on the open internet. This blog post describes the mechanism this malware uses and offers specific actions to protect your Hadoop-based clusters.
A History of Hadoop Malware
Roughly two years ago there was a spate of attacks against the open source database solution MongoDB,
We recently released Cloudera Enterprise 6.0 featuring significant improvements across a number of core components. In this blog post, we’re going to focus on Apache Hive 2.1.
Hive’s Approach to Rebase: Stability and Quality Most Important
Prior to the release of Cloudera Enterprise 6.0, Cloudera’s supported platform included Apache Hive 1.1 augmented with numerous features, enhancements and fixes from the later Apache Hive releases—all of which were included only after rigorous quality criteria were met.
Cloudera has put a significant amount of work into upgrading the third-party libraries used in our just-released C6 version. This major upgrade of our software has given us the opportunity to upgrade many of the libraries we use. These upgrades allow us to avoid security vulnerabilities, use modern versions of libraries, and standardize library versions across CDH.
Modern software development relies on reusing other people’s code. Code reused in this fashion is called a “third-party library.”
Cloudera Altus Director provides the simplest way to deploy and manage Cloudera Enterprise in the cloud. It enables customers to unlock the benefits of enterprise-grade Hadoop while leveraging the flexibility, scalability, and affordability of the cloud. It integrates seamlessly with Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure, and provides support to build custom plugins for other public or private cloud environments.
While automating the provisioning of a cluster on the cloud using Altus Director,