As a BI analyst, have you ever encountered a dashboard that wouldn’t refresh because other teams were using it? As a data scientist, have you ever had to wait 6 months before you could access the latest version of Spark? As an application architect, have you ever been asked to wait 12 weeks before you could get hardware to onboard a new application? Until now, perhaps your only alternative was to find someone who knew how to bypass central IT and set up an independent ‘point solution’. But in doing so, you may be creating a security and compliance risk for the company. What if there were a way to address these delays without bypassing IT and creating risk? And what if there were a way to do so without leaving the data center? Cloudera is happy to announce our first data analytics platform aimed at addressing these challenges: CDP Private Cloud.
CDP Private Cloud represents the biggest platform innovation in the data center since the merger of Cloudera and Hortonworks. At its core, CDP Private Cloud is a next-generation, cloud-native architecture for on-premises deployments. It separates and revamps compute and storage, integrates them with a suite of security and governance tools, and packages all the pieces together with a simpler, more intuitive management console. At Cloudera, we took these steps with one goal at the top of the list: to bring the agility demanded by data users without giving up the security and governance demanded by central IT and InfoSec.
In the first part of this multi-part blog post, we’ll take a deep dive into the various components that make up CDP Private Cloud. In part 2, coming next week, we’ll walk you through the specifics of how CDP Private Cloud tackles these agility problems without giving up security and governance.
With that, let’s take a closer look at each of the five major components that make up CDP Private Cloud.
The Major Components of CDP Private Cloud
- Analytic Experiences: CDP Private Cloud introduces new analytic experiences targeting the main pillars of every data analytics platform. We will start by offering two experiences this summer: Cloudera Machine Learning (CML) and Cloudera Data Warehouse (CDW). In the next few months, we will introduce Cloudera Data Engineering (CDE) and Cloudera DataFlow (CDF). These experiences have been developed from the ground up as containerized services orchestrated by Kubernetes. With this new architecture, CDP can provision and scale these analytic experiences in minutes and allocate just enough resources to meet current demand.
The Analytic Experiences also feature new end-user interfaces to simplify the end-to-end workflows of their respective data analytics domains. So rather than just exposing a JDBC endpoint, CDW offers a rich SQL editor. Instead of providing only a Spark shell, CDE allows users to create and visualize complex job pipelines, and monitor the performance of ETL jobs over time.
- Management Console: a new management console gives administrators an easy way to provision, grow, shrink, decommission, and configure isolated instances of each analytic experience. It also provides a global view of resource consumption across the Kubernetes cluster, resource quotas per user or per team (coming soon), and user management capabilities including integrations with Active Directory. In an upcoming version of CDP Private Cloud, the management console will also provide direct access to three operational tools: Cloudera Replication Manager for backup and disaster recovery, Cloudera Workload Manager for profiling, debugging, and optimizing user workloads, and Cloudera Data Catalog for finding, curating, and auditing data.
- Object Storage: with the release of CDP Private Cloud, Cloudera is also introducing a new object store, powered by Apache Ozone. Ozone is deployed separately from the analytic experiences, allowing for independent scaling, upgrades, management, and maintenance. More importantly, Ozone can scale to billions of objects without slowing down, an architectural advantage of object stores that has fueled their rise in popularity for cloud-native platforms. Ozone will reduce the operational costs of managing the ‘small files problem’ that plagues HDFS, without giving up the performance, security, and functional capabilities that made HDFS popular in the first place.
- Security and Governance: CDP Private Cloud comes with a complete suite of security and governance capabilities that we call the Shared Data Experience, or SDX. These include role-based and attribute-based access controls with Apache Ranger, data lineage and data discovery with Apache Atlas, end-to-end TLS wire encryption, and encryption-at-rest with Ranger’s key management service. These services regulate what end-users can do through the analytic experiences, but operate independently of these experiences. This means the security and governance tools can be independently configured, managed, and upgraded, and these changes are automatically reflected in the analytic experiences.
- Traditional Workloads: while most new workloads are best suited for the new decoupled, modular architecture, we also know that customers have mission-critical applications developed using previously popular engines like MapReduce or Tez. Cloudera will continue to support these workloads running on bare metal, co-located with storage. Most importantly, these workloads will have access to the same global storage, security, and governance tools as the container-native experiences.
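To put the ‘small files problem’ mentioned under Object Storage in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the widely quoted rule of thumb that every file, block, and directory costs the HDFS NameNode roughly 150 bytes of heap; that figure is an approximation and does not come from this post, and the function names are made up for illustration. The sketch says nothing about how Ozone itself stores metadata.

```python
# Rule-of-thumb estimate only, not a measurement.
# Assumption: each file or block object costs ~150 bytes of NameNode heap.
BYTES_PER_OBJECT = 150


def files_needed(total_bytes: int, file_size_bytes: int) -> int:
    """Number of files required to hold total_bytes (rounded up)."""
    return -(-total_bytes // file_size_bytes)


def namenode_heap_bytes(num_files: int, blocks_per_file: int = 1) -> int:
    """Estimated NameNode heap: one object per file plus one per block."""
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT


TIB = 2**40
MIB = 2**20

# The same 1 TiB of data, stored as 1 MiB files versus 128 MiB files
# (each file is assumed to fit in a single block).
small = namenode_heap_bytes(files_needed(TIB, 1 * MIB))
large = namenode_heap_bytes(files_needed(TIB, 128 * MIB))

print(f"1 MiB files:   {small / MIB:.1f} MiB of heap")
print(f"128 MiB files: {large / MIB:.1f} MiB of heap")
```

Under these assumptions, the same terabyte of data costs about 300 MiB of metadata heap as 1 MiB files and about 2.3 MiB as 128 MiB files, which is why the number of objects, rather than the volume of data, tends to be the limiting factor.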
These five components come together to make up CDP Private Cloud. But it’s how they come together that makes this platform particularly powerful. CDP Private Cloud runs the analytic experiences on physically separate machines from the storage and metadata processes. We use this independence to optimize the infrastructure layer on both the compute side (through containerization) and the storage side (through adopting an object store). We then take advantage of Kubernetes to simplify provisioning, scaling, multi-tenancy, and upgrades of the analytic experiences. And finally, we connect and test all components together to deliver a complete solution.
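This post does not describe how CDP decides when to add or remove containers. As a generic illustration of what “just enough resources to meet current demand” can look like on Kubernetes, the sketch below follows the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler: scale the replica count by the ratio of the observed metric to its target, and do nothing when the ratio is within a tolerance. The function name and the `min_replicas` floor are assumptions for illustration, not part of any CDP API.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     tolerance: float = 0.1,
                     min_replicas: int = 1) -> int:
    """Replica count suggested by the HPA-style ratio rule.

    desired = ceil(current_replicas * current_metric / target_metric),
    skipped when the ratio is within `tolerance` of 1.0.
    `min_replicas` is an illustrative floor (assumption).
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no change
    return max(min_replicas, math.ceil(current_replicas * ratio))


# 4 replicas running at 90% CPU against a 60% target: scale out.
print(desired_replicas(4, 90.0, 60.0))
# 4 replicas running at 30% CPU against a 60% target: scale in.
print(desired_replicas(4, 30.0, 60.0))
```

With these inputs the rule suggests 6 replicas in the first case and 2 in the second. A real deployment would also apply upper bounds and stabilization windows, which are omitted here.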
In the second part of this blog post, we’ll discuss how this modular architecture in CDP Private Cloud improves agility in the data center.
Interested in having a discussion about CDP Private Cloud? Sign up for this webinar to go into the details.
CDP Private Cloud can be a powerful tool for agile businesses. As a manufacturer of clusters and servers for HDP and CDH, we are excited for the opportunity to work with Cloudera on deploying turnkey platforms for CDP Private Cloud. I totally agree that waiting 12 weeks is unacceptable. That is why PSSC Labs (www.pssclabs.com) has developed a methodology to deliver a turnkey cluster in less than 6 weeks!