Announcing the GA of Cloudera DataFlow for the Public Cloud

Are you ready to turbo-charge your data flows on the cloud for maximum speed and efficiency? We are excited to announce the general availability of Cloudera DataFlow for the Public Cloud (CDF-PC) – a brand new experience on the Cloudera Data Platform (CDP) that addresses some of the key operational and monitoring challenges of standard Apache NiFi clusters that are overloaded with high-performance flows. Turn your standard NiFi flows running on-premises or on CDP Data Hub into cloud-native flows running on Kubernetes clusters on AWS, then deploy, manage and monitor them there.

Key operational and monitoring challenges of a NiFi administrator

  • Resource contention – When multiple high-performance flows are loaded onto a single cluster, they may at times compete with each other for shared resources such as CPU, memory and I/O. This can degrade the performance of every flow in that cluster.
  • Right-sizing the cluster – In order to address the crowded cluster situation described above, administrators often err on the side of caution and oversize the cluster. This can lead to a few idle nodes during non-peak periods, which can add to your infrastructure costs unnecessarily.
  • Manually scaling the cluster on-demand – It is not sustainable for an administrator to keep watching resource utilization, adding nodes when they are needed and removing them when they are not. There is a strong need for automation here.
  • Centralized monitoring – Once data flows move to the cloud, they may be running in multiple environments. It can be very hard for an administrator to look across the landscape of deployments and understand the health of all flows. Troubleshooting bottlenecks or identifying chokepoints can also be challenging without a centralized dashboard.
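
To make the manual scaling challenge concrete, here is a minimal sketch of the kind of threshold rule an administrator would otherwise have to apply by hand. It is purely illustrative: the function name, the thresholds and the node limits are hypothetical and do not describe how CDF for the Public Cloud or NiFi actually make scaling decisions.

```python
def desired_node_count(current_nodes, cpu_utilization,
                       min_nodes=1, max_nodes=10,
                       scale_up_at=0.75, scale_down_at=0.30):
    """Return the node count after one scaling decision.

    All parameters are hypothetical, chosen only for illustration:
    cpu_utilization is the average CPU use across the cluster (0.0 to 1.0).
    """
    if cpu_utilization > scale_up_at:
        target = current_nodes + 1   # cluster is busy: add a node
    elif cpu_utilization < scale_down_at:
        target = current_nodes - 1   # cluster is idle: remove a node
    else:
        target = current_nodes       # within the comfortable band: no change
    # Never go below the minimum or above the maximum cluster size.
    return max(min_nodes, min(max_nodes, target))


if __name__ == "__main__":
    # Walk through a made-up series of utilization readings.
    nodes = 3
    for reading in (0.82, 0.91, 0.55, 0.20, 0.15):
        nodes = desired_node_count(nodes, reading)
        print(f"utilization={reading:.2f} -> nodes={nodes}")
```

Running a rule like this continuously for every cluster is exactly the kind of work that calls for automation rather than an administrator watching dashboards.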

NEW Cloudera DataFlow for the Public Cloud

We recognized these challenges from our own customer base and decided to make the Flow Ops experience a lot simpler. The new offering gives NiFi and cloud administrators and developers key capabilities that make flow deployment, management and monitoring extremely simple.

  • Environments – You can enable DataFlow for any AWS environment you have registered with CDP. The enablement process creates the Kubernetes infrastructure required by CDF and each environment maps to one Kubernetes cluster.
  • Flow Catalog – A brand new flow catalog is now available to import standard NiFi flow definitions from your existing clusters into the new CDF for the Public Cloud. The catalog will be the starting point for any flow deployment. The catalog also supports versioning of flows.
  • Deployment Wizard – Simplify the process of deploying flows into the new cloud-native environment with the brand new deployment wizard. Using the simple steps in the wizard, you can supply configuration parameters, auto-scaling settings and KPI definitions for your flow deployment.
  • Dashboard – Monitor all your deployed flows across all your registered cloud environments in one single view with the new centralized dashboard. For each flow deployment, you can open the deployment details pane which shows you KPIs you have defined, system metrics as well as system events and alerts.
  • ReadyFlows – These are pre-built NiFi flows, ready to use for some of the most common use cases, such as moving data from Kafka to an S3 bucket. Just pick a ReadyFlow, provide some configuration parameters and then deploy.

Key Benefits

  • Boost your operational efficiencies by deploying flows in a streamlined manner and by defining key metrics to measure their performance.
  • Optimize your infrastructure setup by allowing CDF for Public Cloud to auto-scale your flows based on needs. This ensures that you are not over-sizing your infrastructure unnecessarily.
  • Prevent resource contention on crowded clusters by isolating your flows to their own individual cloud-native clusters.
  • Monitor all your flow deployments across multiple cloud clusters from a single dashboard. This enables seamless troubleshooting from an administrator’s perspective. Built-in alerting capabilities also make administration a lot easier.
  • Boost productivity by getting a head start on common flow use cases with pre-built flows from a gallery of ReadyFlows.

After participating in the technical preview program for Cloudera DataFlow for the Public Cloud, our customers have been raving about it. We are sure you will rave too. More information and assets, including datasheets and a product tour, are also available. For a deep dive into the product and its features, check out the technical deep dive blog post written by Michael Kohs, Sr. Product Manager, Cloudera.

Also, if you are interested in seeing the product live, we are doing a live launch webinar on August 18th, 2021. Register for it today. 
