A quick conversation with most Chief Information Security Officers (CISOs) reveals that they understand the need to modernize their security architecture, and that the right answer is to adopt a machine learning and analytics platform as a fundamental and durable part of their data strategy. However, many CISOs fear that deploying an initial use case will be daunting. Cloudera has partnered with Arcadia Data and StreamSets to make it easier than ever for CISOs to take the first step and deploy basic use cases that leverage data sources common to many environments.
With an ever-increasing number of IoT use cases on the CDH platform, security for such workloads is of paramount importance. This blog post describes how one can consume data from Kafka in Spark, two critical components for IoT use cases, in a secure manner.
The Cloudera Distribution of Apache Kafka 2.0.0 (based on Apache Kafka 0.9.0) introduced a new Kafka consumer API that allowed consumers to read data from a secure Kafka cluster.
You may have heard of the recent (and ongoing) hacks targeting open source database solutions like MongoDB and Apache Hadoop. From what we know, an unknown number of hackers scanned for internet-accessible installations that had been set up with the default, non-secure configuration. Having found exposed systems, these hackers accessed them and in some cases deleted the files or held them for ransom.
These attacks were not technologically sophisticated,
In Part 1 of this blog series, we covered the prerequisites for deploying a CDH cluster on the Microsoft Azure cloud platform. In Part 2, we will cover the resources required on the Azure platform and deploy a cluster with Cloudera Director.
Cloudera Director Use Case
Cloudera Director simplifies cluster creation and shortens the time to an operational cluster in the cloud. It's a great tool for running proofs of concept (POCs) in your organization.
Learn how to use Cloudera Director, Microsoft Active Directory (AD DS, AD CS, AD DNS), Samba, and SSSD to deploy a secure enterprise data hub (EDH) cluster for workloads in the public cloud.
Authenticating users in Apache Hadoop is the first line of security we recommend. As with most, if not all, RDBMSs, each user is given a username and a password to validate their identity. Valid credentials are required to access any data managed by those systems.