Apache Hadoop’s security was designed and implemented around 2009 and has been stabilizing ever since. However, because this area is poorly documented, security problems are hard to understand and debug when they arise. Delegation tokens were designed as an authentication method and are widely used in the Hadoop ecosystem. This blog post introduces the concept of Hadoop Delegation Tokens in the context of Hadoop Distributed File System (HDFS) and Hadoop Key Management Server (KMS),
As customers use Apache Hadoop clusters in ways other than through HUE and the Hadoop Command Line Interface (CLI), and integrate them closely with the applications they develop, we are often asked how to access a secure Hadoop cluster from within custom applications. Many customers have their application access the cluster with a fixed service account. However, other customers would like to access the cluster as the end users who have authenticated to the application.
You may have heard of the recent (and ongoing) hacks targeting open source database solutions like MongoDB and Apache Hadoop. From what we know, an unknown number of hackers scanned for internet-accessible installations that had been set up using the default, non-secure configuration. Finding the exposure, these hackers then accessed the systems and in some cases deleted the files or held them for ransom.
These attacks were not technologically sophisticated,
Learn how to use Cloudera Director, Microsoft Active Directory, and Centrify Express to deploy a secure EDH cluster for workloads in the public cloud.
In Part 1 of this series, you learned about configuring Microsoft Active Directory and Centrify Express for optimal security across your Cloudera-powered EDH, whether for on-premises or public-cloud deployments. In this concluding installment, you’ll learn the cloud-specific pieces of this process, including some AWS fundamentals and in-depth details about cluster provisioning using Cloudera Director.
Cloudera’s distribution (now on release 2.0) of Kafka is based on Apache Kafka 0.9 and includes various new features (especially for security and usability), enhancements, and bug fixes.
Kafka is rapidly gaining momentum in enterprise Apache Hadoop deployments and has become the de facto messaging bus in most Big Data technology stacks. During this period of rapid adoption (and since Cloudera began shipping Kafka in February 2015),