Tag Archives: cloud

Cloudera Director and Spot Instances: Resilience and Repair

Categories: CDH, Cloud, Testing

Cloudera Director enables self-service provisioning and management of CDH and Cloudera Enterprise Data Hub in the cloud. Running Cloudera Enterprise on top of public cloud infrastructure allows you to pay only for the resources you need to meet your data processing demands.

Amazon Web Services (AWS) provides the ability to bid on spare Amazon EC2 computing capacity at a discount through Amazon EC2 Spot instances. With Cloudera Director, you can configure clusters to use Spot instances to improve workload execution time and save costs.


Using Amazon S3 with Cloudera BDR

Categories: CDH, Cloud, Cloudera Manager, HDFS, Hive

More of you are moving to public cloud services for backup and disaster recovery purposes, and Cloudera has been enhancing the capabilities of Cloudera Manager and CDH to help you do that. Specifically, Cloudera Backup and Disaster Recovery (BDR) now supports backup to and restore from Amazon S3 for Cloudera Enterprise customers.

BDR lets you replicate Apache HDFS data between your on-premises cluster and Amazon S3 with full fidelity (all file and directory metadata is replicated along with the data).


Data Engineering with Cloudera Altus

Categories: Altus, Cloud, Hive, Spark

As modern businesses deal with an ever-increasing volume of data and an expanding set of data sources, the data engineering process that enables analysis, visualization, and reporting becomes only more important.

Running data engineering workloads in the public cloud opens up operational models that differ from on-premises deployments. The key factors are the presence of a distinct storage layer within the cloud environment and the ability to provision compute resources on demand (e.g., Amazon S3 and Amazon EC2, respectively).


What’s New in Cloudera Director 2.2?

Categories: CDH, Cloud, Cloudera Manager, Hadoop

This new release adds support for Amazon EBS volumes and the ability to diagnose cluster bootstrap errors quickly.

Cloudera Director provides a simple, reliable, enterprise-grade way to deploy, scale, and manage Apache Hadoop in the cloud of your choice. Cloudera Director enables you to deploy production-ready clusters for big data applications and successfully run workloads in the cloud.

Cloudera Director makes it easier for customers to:

  • Deploy clusters in line with patterns native to cloud infrastructure
  • Define the desired cluster specification in one place, all the way down to the operating system, through a single interface
  • Repeatedly and programmatically instantiate these cluster definitions
  • Adapt to the dynamic nature of cloud infrastructure

Cloudera Director 2.2 provides additional mechanisms to get that initial cluster definition right and the ability to diagnose errors and iterate quickly.


YCSB, the Open Standard for NoSQL Benchmarking, Joins Cloudera Labs

Categories: Cloudera Labs, HBase, Performance

YCSB, the open standard for comparative performance evaluation of data stores, is now available to CDH users for their Apache HBase deployments via new packages from Cloudera Labs.

Many factors go into deciding which data store should be used for production applications, including basic features, data model, and the performance characteristics for a given type of workload. It’s critical to have the ability to compare multiple data stores intelligently and objectively so that you can make sound architectural decisions.
