Tag Archives: Amazon S3

New in Cloudera Enterprise 5.12: Hue 4 Interface and Query Assistant

Categories: CDH Cloudera Manager Cloudera Navigator Hadoop Hue

When it comes to self-service business intelligence and exploratory analytics, Cloudera has continued to push limits and innovate to help our customers expedite this journey and get the most value from their data. Over the past year, we have made a number of significant advancements in Hue to provide a more powerful user experience for SQL developers and make them more productive in their everyday self-service BI tasks and workflows.

With the recent release of Cloudera 5.12,

Read more

Introducing S3Guard: S3 Consistency for Apache Hadoop

Categories: Altus CDH Cloud Hadoop

Synopsis

This article introduces a new Apache Hadoop feature called S3Guard. S3Guard addresses one of the major challenges of running Hadoop on Amazon’s Simple Storage Service (S3): eventual consistency. We outline the problem of S3’s eventual consistency, describe how it affects Hadoop workloads, and explain how S3Guard works.

Problem

Although Apache Hadoop supports using Amazon Simple Storage Service (S3) as a Hadoop filesystem, S3 behaves differently from HDFS. One of the key differences is in the level of consistency provided by the underlying filesystem.

Read more

Using Amazon S3 with Cloudera BDR

Categories: CDH Cloud Cloudera Manager HDFS Hive

More of you are moving to public cloud services for backup and disaster recovery purposes, and Cloudera has been enhancing the capabilities of Cloudera Manager and CDH to help you do that. Specifically, Cloudera Backup and Disaster Recovery (BDR) now supports backup to and restore from Amazon S3 for Cloudera Enterprise customers.

BDR lets you replicate Apache HDFS data between your on-premises cluster and Amazon S3, in either direction, with full fidelity (all file and directory metadata is replicated along with the data).

Read more

Cloudera Enterprise 5.11 is Now Available

Categories: CDH Cloud Cloudera Manager Cloudera Navigator Hadoop

Cloudera is pleased to announce that Cloudera Enterprise 5.11 is now generally available (GA). The highlights of this release include lineage support for Apache Spark, Apache Kudu security integration, embedded data discovery for self-service BI, and new cloud capabilities for Microsoft ADLS and Amazon S3.

As usual, there are also a number of quality enhancements, bug fixes, and other improvements across the stack. Here is a partial list of what’s included (see the Release Notes for a full list):

  • Core Platform and Cloud
    • Amazon S3 Consistency: S3Guard ensures that operations on Amazon S3 are immediately visible to other clients,

Read more

How To Set Up a Shared Amazon RDS as Your Hive Metastore

Categories: Cloud Hadoop Hive How-to Impala Spark Use Case

Before CDH 5.10, every CDH cluster had to have its own Apache Hive Metastore (HMS) backend database. This model works well when each cluster stores its data locally along with the metadata. In the cloud, however, many CDH clusters run directly on a shared object store (like Amazon S3), making it possible for the data to live across multiple clusters and beyond any cluster’s lifespan. In this scenario, each cluster needs to regenerate and coordinate metadata for the underlying shared data individually.

Read more