Category Archives: Cloud

How-to: Deploy a Secure Enterprise Data Hub on AWS

Categories: CDH, Cloud, How-to, Ops and DevOps

Learn how to use Cloudera Director, Microsoft Active Directory, and Centrify Express to deploy a secure EDH cluster for workloads in the public cloud. 

There are several best practices for deploying a secure Apache Hadoop-powered enterprise data hub (EDH) cluster on Amazon Web Services (AWS), including use of Centrify Express for Linux-to-Active Directory host integration and Microsoft Active Directory as the core integration point for identity, authentication, authorization, and public key infrastructure (PKI).


How-to: Install Cloudera Enterprise on Microsoft Azure (Part 2)

Categories: Cloud, Guest

Recently, GoDataDriven installed a Cloudera cluster on Microsoft Azure. This two-part blog post, written by Alexander Bij and Tünde Alkemade and republished with permission, covers the use case, the implemented design, and the installation.

In the first post, we discussed the use case, the design, and some basics of Microsoft Azure. We also showed several options for installing Cloudera on Azure and the best practices we observed when installing a distributed system on Azure.


How-to: Integrate Cloudera Director with a Data Pipeline in the Cloud

Categories: Cloud, Ops and DevOps

Learn how to use Cloudera Director to automate cluster operations (and more) in the cloud.

Cloudera Director was designed from the beginning to be primarily an API that integrates with your existing data pipelines and workflows. It handles tasks such as creating, terminating, and resizing the Apache Hadoop (CDH) clusters that run your data processing jobs or SQL queries.

Among many other new features,

Read More

How-to: Install Cloudera Enterprise on Microsoft Azure (Part 1)

Categories: Cloud, Guest, How-to

Recently, GoDataDriven installed a Cloudera Enterprise (CDH + Cloudera Manager) cluster on Microsoft Azure. This two-part series, written by Alexander Bij and Tünde Alkemade and republished with permission, covers the use case, design, and installation.

Processing large amounts of unstructured data requires serious computing power as well as maintenance effort. As load on computing power typically fluctuates due to time and seasonal influences and/or processes running at certain times,

Read More

What’s New in Cloudera Director 2.0?

Categories: Cloud, General, Ops and DevOps

New functionality includes support for spot instances, automatic job submission, and integrated setup for HA and Kerberized clusters.

Cloudera Director is the manifestation of Cloudera’s commitment to providing a simple and reliable way to deploy, scale, and manage Apache Hadoop clusters in the cloud of your choice. Cloudera Director lets you deploy production-ready clusters for big data applications and successfully run workloads in the cloud. With Cloudera Director 2.0,

Read More