Announcing GA of DataFlow Functions

Today, we’re excited to announce that DataFlow Functions (DFF), a feature within Cloudera DataFlow for the Public Cloud, is now generally available for AWS, Microsoft Azure, and Google Cloud Platform. DFF provides an efficient, cost-optimized, scalable way to run NiFi flows in a completely serverless fashion. This is the first complete no-code, no-ops development experience for functions, allowing users to save time and resources.

Fig1: First no-code UI in the industry to quickly develop and deploy functions to cloud providers’ serverless compute services.

First no-code UI for serverless functions

Previously, developers had to write code and rely on code samples to get started with functions. Now, they can use DataFlow’s no-code UI to be more productive – they can quickly design new NiFi flows and then run them as functions in AWS Lambda, Azure Functions, and Google Cloud Functions.

Fig2: DataFlow Functions runtime environments are available in AWS Lambda, Azure Functions, and Google Cloud Functions.

Optimize cost and eliminate infrastructure management

Since the data flows are running in serverless environments in the public clouds, infrastructure management is a thing of the past. The flow is only executed when an event triggers the function, offering a very efficient way of deploying event-driven use cases without requiring developers to expend valuable resources on operational responsibilities. For instance, a file landing in an object store (S3, ADLS, or GCS) triggers the execution of a data flow, which then processes the file and sends the result somewhere else.

Fig3: A sample use case where a file that lands in an object store triggers a function that processes that file and sends results to a destination.
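
To make the event-driven model concrete, here is a minimal Python sketch of the kind of notification an object store sends when a file lands, assuming the AWS S3 event notification format. It is illustrative only: DataFlow Functions handles this wiring without hand-written code, and the function name `extract_landed_files` and the sample bucket and key are hypothetical.

```python
from urllib.parse import unquote_plus


def extract_landed_files(event):
    """Return (bucket, key) pairs for each record in an S3-style notification.

    Assumes the S3 event notification layout, where each entry in "Records"
    carries the bucket name and a URL-encoded object key.
    """
    files = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        key = s3.get("object", {}).get("key")
        if bucket is None or key is None:
            continue  # skip records that are not object events
        # Object keys arrive URL-encoded (a space is sent as "+").
        files.append((bucket, unquote_plus(key)))
    return files


# Hypothetical sample event: one file landing in a landing bucket.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-landing-bucket"},
                "object": {"key": "incoming/daily+report.csv"},
            },
        }
    ]
}

if __name__ == "__main__":
    for bucket, key in extract_landed_files(sample_event):
        print(f"Would process {key} from {bucket}")
```

Each landed file produces one such event, and the function runs only for the duration of that event, which is why nothing needs to stay running between files.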

DataFlow Functions provides an efficient, cost-optimized, scalable way to run NiFi flows in a completely serverless fashion for event-driven use cases.

The right runtime for your use cases

There are now two ways to run your Apache NiFi data flows in the Cloudera DataFlow service, DataFlow Deployments and DataFlow Functions:

  • The Deployments runtime is optimized for high-throughput, low-latency streaming use cases
  • The Functions runtime is best suited for event-driven, short-lived use cases

Fig4: Runtime options in the public cloud: DataFlow Deployments and DataFlow Functions

Below is a more detailed breakdown of the two NiFi runtime options in the public cloud: 

Runtime options in the Public Cloud
Feature: Cloud Runtime
  • DataFlow Deployments: NiFi clusters using Kubernetes/containers
  • DataFlow Functions: NiFi flows running on cloud providers’ serverless compute services (AWS Lambda, Azure Functions, and Google Cloud Functions)

Feature: Use Case
  • DataFlow Deployments: Use cases that need low latency for high-throughput workloads requiring always-running NiFi flows
  • DataFlow Functions: Event-driven, micro-bursty use cases with no sub-second latency requirement where NiFi flows do not need to run continuously

Feature: Benefits
  • DataFlow Deployments: Auto-scaling Kubernetes clusters for long-running workflows with centralized monitoring
  • DataFlow Functions: Efficient, cost-optimized, scalable way to run NiFi flows serverless, allowing developers to focus on business logic

Summary

DataFlow Functions provides a new, efficient way to run your event-driven Apache NiFi data flows.

With DataFlow Functions you can deploy your flow applications in minutes by leveraging the serverless architecture of all major public cloud providers (AWS, Azure, and Google Cloud Platform), and you do not have to worry about the operational overhead of managing and maintaining NiFi flow runtime environments.

To learn more about how to set up and run DataFlow Functions in AWS Lambda, Azure Functions, and Google Cloud Functions, check out our technical blog, or take a product tour for a lightweight step-by-step experience.

Robert Hryniewicz
Director of Product Marketing