Tag Archives: backup

Cloudera Enterprise 5.12 is Now Available

Categories: Altus CDH Cloud Cloudera Manager Cloudera Navigator Data Science Hue Impala Kafka Kudu

Cloudera is pleased to announce that Cloudera Enterprise 5.12 is now generally available (GA). The release includes enhancements for running in cloud environments (with broader ADLS support and improved AWS Spot Instance support), usability and productivity improvements for both data science and analytic workloads, as well as performance gains and self-service performance management across a range of workloads.

As usual, there are also a number of quality enhancements, bug fixes, and other improvements across the stack.

Read more

DistCp Performance Improvements in Apache Hadoop

Categories: CDH Hadoop HDFS Performance Tools

Recent improvements to Apache Hadoop’s native backup utility, which are now shipping in CDH, make that process much faster.

DistCp is a popular tool in Apache Hadoop for periodically backing up data across and within clusters. (Each run of DistCp in the backup process is referred to as a backup cycle.) Its popularity has grown in popularity despite relatively slow performance.

In this post, we’ll provide a quick introduction to DistCp.

Read more