Learn how to use Cloudera Director, Microsoft Active Directory (AD DS, AD CS, AD DNS), SAMBA, and SSSD to deploy a secure EDH cluster for workloads in the public cloud.
Authenticating users is the first line of security we recommend for Apache Hadoop. As with most, if not all, RDBMSs, a user is given a username and a password to validate their identity, and this is required to access any data managed by those systems.
Since the launch of sparklyr, working with Apache Spark in Apache Hadoop has become much easier for R users. sparklyr contains a dplyr interface into Spark and allows users to leverage crucial machine learning algorithms from Spark MLlib and H2O Sparkling Water. This greatly lowers the barrier to entry for R users adopting Spark as a tool for big data and should go a long way toward enabling R workloads to migrate to Hadoop.
Updated 11/22/16 – Important: All features below are working on CDH 5.9.0 and CM 5.9.0 and above.
This tool makes Oozie migrations off Apache Derby (or any other supported database) easy, in addition to streamlining upgrades.
The Apache Oozie server is a stateless web application by design, with all information about running and completed workflows, coordinator jobs, and bundle jobs stored in a relational database.
Get started with scalable graph analysis via simple examples that utilize GraphFrames and Spark SQL on HDFS.
Graphs—also known as “networks”—are ubiquitous across web applications. As a refresher, a graph consists of nodes and edges. A node can be any object, such as a person or an airport, and an edge is a relation between two nodes, such as a friendship or an airline connection between two cities.
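To make the node-and-edge definition concrete, here is a minimal sketch in plain Python. It does not use GraphFrames or Spark SQL; the airport codes and the `build_adjacency` helper are hypothetical, chosen only to illustrate the idea of airports as nodes and airline connections as edges.

```python
from collections import defaultdict

# Hypothetical sample data: nodes are airports, edges are airline connections.
nodes = {"SFO", "JFK", "ORD"}
edges = [("SFO", "JFK"), ("JFK", "ORD")]


def build_adjacency(edge_list):
    """Map each node to the set of nodes it shares an edge with (undirected)."""
    adjacency = defaultdict(set)
    for a, b in edge_list:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


adjacency = build_adjacency(edges)

# Airports directly connected to JFK.
print(sorted(adjacency["JFK"]))  # ['ORD', 'SFO']
```

An adjacency map like this is enough for small examples. At the scale the post targets, the same nodes and edges would instead live in distributed tables.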
Learn how to use Cloudera Director, Microsoft Active Directory, and Centrify Express to deploy a secure EDH cluster for workloads in the public cloud.
In Part 1 of this series, you learned about configuring Microsoft Active Directory and Centrify Express for optimal security across your Cloudera-powered EDH, whether for on-premises or public-cloud deployments. In this concluding installment, you’ll learn the cloud-specific pieces of this process, including some AWS fundamentals and in-depth details about cluster provisioning using Cloudera Director.