Conquering hybrid and multi-cloud with big data fabric

Editor’s Note, August 2020: CDP Data Center is now called CDP Private Cloud Base. You can learn more about it here.

Though they may disagree on the exact percentages and splits now and in the future, analysts very much agree that hybrid and multi-cloud adoption is rising across enterprises. It’s no surprise: compared to traditional on-premises systems and applications, cloud, public or private, has tremendous benefits around ease-of-use, flexibility, agility and scalability as well as the potential for cost savings. It also fits tremendously well with ever-exploding data volumes and the shift towards real-time data and streaming analytics. So why are organisations not all in (or on) cloud yet?

The crux of the matter is distributed data management. With several infrastructures in play, each with its own frameworks and approaches, governing, securing and integrating fragmented data sets and analytics becomes a challenge that slows time-to-insight and innovation. With data privacy regulations expanding, getting data management right without impacting the business is becoming a key differentiator. While 69% of executives are clear that their organisations need a comprehensive data strategy to ensure success, only 35% are convinced the strategy they do have is sufficient. A lack of analytical experience and silos, of both the data and the organisational kind, are the main reasons for not being able to maximise data for strategic gain.

Forrester Research identified this challenge three years ago and coined the term ‘big data fabric’ for the architecture that helps organisations process, transform, secure and orchestrate data from disparate sources to deliver a trusted, real-time view of enterprise data. With a reference architecture built on six distinct layers (see figure 1), the updated 2.0 version embraces emerging technologies like streaming as well as hybrid and multi-cloud deployments. The big data fabric helps organisations to

“[Orchestrate] disparate data sources intelligently and securely in a self-service manner, leveraging data platforms […] to deliver a unified, trusted, and comprehensive view of customer and business data across the enterprise.”

The big data fabric’s core capabilities are critical for any deployment of data and analytics. The fabric breaks down data and organizational silos and creates trusted, integrated, real-time insights for business innovation and growth. Big data fabric is not about supporting a single use case; it enables a true enterprise data strategy, with applications ranging from customer intelligence for targeted offers to streaming IoT analytics for preventative maintenance. Since data security and governance are core to the architecture, it also facilitates compliance and audit reporting of sensitive data.

The Cloudera Data Platform (CDP), the world’s first enterprise data cloud, perfectly embodies the big data fabric reference architecture in a single platform with key capabilities covering each of the layers (figure 2).

Figure 2: Cloudera Data Platform as big data fabric

Three CDP characteristics in particular help organisations manage their distributed data and analytics:

1) CDP deploys to any infrastructure. Public cloud support for CDP Data Hub, as well as the Data Warehouse and Machine Learning experiences, was announced for AWS at Strata Data New York in 2019. Since then, Microsoft Azure has been added, with CDP Data Center for on-premises deployments coming later this year; more form factors (private cloud) and public cloud support (Google) will be added in 2020. With that, organisations are not tied to any particular infrastructure provider and can run their data and analytical workloads flexibly on any of the supported infrastructures, without redevelopment, and from a single control plane.

2) Control through a single control plane. No need to switch between the various operational interfaces for different deployments or get to grips with their subtle quirks and differences. CDP’s control plane provides a one stop shop for consistent and efficient administration of all infrastructure, data, and analytic workloads across hybrid and multi-cloud environments.

3) SDX (Shared Data Experience) provides consistent data security, governance and control. Policies are defined once and applied consistently across the platform, regardless of deployment. Consistency is guaranteed between persistent and ephemeral workloads, whether they run on the same form factor or across different ones. SDX also provides the migration and replication capabilities so that, should data need to move, the associated security and governance policies stay attached to it. This, in and of itself, enables advanced platform capabilities such as intelligent migration between infrastructures and bursting workloads from on-premises to the cloud.
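The "define once, apply everywhere" idea in point 3 can be illustrated with a minimal Python sketch. It is a toy model and not Cloudera's implementation or API: `Policy`, `Dataset`, `Deployment`, `SharedPolicyStore` and `migrate` are all hypothetical names introduced here for illustration. The assumption is simply that policies are keyed on a tag carried by the data, so a dataset that moves between deployments is still governed by the same rule.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Policy:
    """Hypothetical access policy: which roles may read data carrying a tag."""
    tag: str
    allowed_roles: frozenset


@dataclass
class Dataset:
    """Hypothetical dataset; the tag is the hook that policies attach to."""
    name: str
    tag: str


@dataclass
class Deployment:
    """Hypothetical deployment, for example on-premises or a public cloud."""
    name: str
    datasets: dict = field(default_factory=dict)


class SharedPolicyStore:
    """One place where policies are defined; every deployment consults it."""

    def __init__(self):
        self._policies = {}

    def define(self, policy):
        self._policies[policy.tag] = policy

    def can_read(self, role, dataset):
        policy = self._policies.get(dataset.tag)
        # Deny by default when no policy covers the dataset's tag.
        return policy is not None and role in policy.allowed_roles


def migrate(dataset_name, source, target):
    """Move a dataset; its tag, and therefore its policy, moves with it."""
    target.datasets[dataset_name] = source.datasets.pop(dataset_name)


# Define the policy once.
store = SharedPolicyStore()
store.define(Policy(tag="pii", allowed_roles=frozenset({"compliance"})))

on_prem = Deployment("on-premises")
cloud = Deployment("public-cloud")
on_prem.datasets["customers"] = Dataset("customers", tag="pii")

# The same check gives the same answer before and after the move.
before = store.can_read("analyst", on_prem.datasets["customers"])
migrate("customers", on_prem, cloud)
after = store.can_read("analyst", cloud.datasets["customers"])
allowed = store.can_read("compliance", cloud.datasets["customers"])
```

Because the check depends only on the dataset's tag and the shared store, no policy has to be redefined on the target deployment in this sketch; a real platform would additionally have to replicate the metadata and enforce the policy in each engine.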

With CDP, and thanks to SDX, organizations can give business users the self-service access to data and analytics they need while retaining the centralized control over data security and governance that IT demands, across any and all infrastructures.

As a guest speaker in a recent Cloudera webinar, Forrester’s principal analyst Noel Yuhanna discussed the role of a big data fabric for enterprise data management across hybrid and multi-cloud deployments. If you missed it, you can watch a replay of it here. Find out more about the Cloudera Data Platform and stay abreast of its releases here.

Wim Stoop
Director Product Marketing