Recent improvements to Apache Hadoop’s native backup utility, which are now shipping in CDH, make that process much faster.
DistCp is a popular tool in Apache Hadoop for periodically backing up data across and within clusters. (Each run of DistCp in the backup process is referred to as a backup cycle.) Its popularity has grown despite relatively slow performance.
In this post, we’ll provide a quick introduction to DistCp.
Via a combination of beta functionality in CDH 5.5 and new Cloudera Labs packages, you now have access to Apache HTrace for doing performance tracing of your HDFS-based applications.
HTrace is a new Apache incubator project that provides a bird’s-eye view of the performance of a distributed system. While log files can provide a peek into important events on a specific node, and metrics can answer questions about aggregate performance,
Now there’s an even quicker “QuickStart” option for getting hands-on with the Apache Hadoop ecosystem and Cloudera’s platform: a new Docker image.
You might already be familiar with Cloudera’s popular QuickStart VM, a virtual image containing our distributed data processing platform. Originally intended as a demo environment, the QuickStart VM quickly evolved into a useful general-purpose environment for developers, customers,
Cloudera has announced support for Spark SQL/DataFrame API and MLlib. This post explains their benefits for app developers, data analysts, data engineers, and data scientists.
In July 2015, Cloudera reaffirmed the position it has held since 2013: that Apache Spark is on course to replace MapReduce as the default general-purpose data processing engine for Apache Hadoop. Thanks to initiatives like the One Platform Initiative,
Cloudera Enterprise 5.5 (comprising CDH 5.5, Cloudera Manager 5.5, and Cloudera Navigator 2.4) has been released.
Cloudera is excited to bring you news of Cloudera Enterprise 5.5. Our persistent emphasis on quality is especially pronounced in this release, with more than 500 issues identified and triaged during its development.
A highlight of this release is the inclusion of Cloudera Navigator Optimizer (available in limited beta for select Cloudera Enterprise customers;