Cloudera has released a lot around Apache NiFi recently! We just released Cloudera Flow Management (CFM) 2.1.1, which provides Apache NiFi on top of Cloudera Data Platform (CDP) 7.1.6. This major release includes Apache NiFi 1.13.2 along with additional improvements, bug fixes, and components. Cloudera also released CDP 7.2.9 on all three major cloud platforms, which brings Flow Management on DataHub with Apache NiFi 1.13.2 and more. Let’s have a look at the main highlights of these releases.
NiFi’s monitoring
These new releases provide many new features to extend the range of capabilities you have to monitor your NiFi instances and flows. Some of these new features are:
- Prometheus endpoint: NiFi now exposes an endpoint that lets Prometheus collect monitoring data about NiFi instances and running flows, so you can build dashboards tailored to your own needs.
- QueryNiFiReportingTask: this new reporting task allows you to run SQL queries against the internal monitoring data stored by NiFi (metrics, status, bulletins, provenance, etc.) and define a sink for where the data should go (Kafka, Site to Site, Prometheus, a database, etc.). This offers a really nice integration with Flink’s Streaming SQL Builder to run SQL queries on top of data streams to monitor NiFi and generate alerts.
- Status History: NiFi now provides Nodes Status History information with many metrics about how the NiFi nodes are performing. Also, all the Status History data can be persisted across restarts of NiFi, which greatly improves the monitoring capabilities of your deployments.
- Prometheus Reporting Task: you can expose specific pieces of your flow to Prometheus and build fine-grained dashboards that track the metrics of the most critical components of your flows.
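To make the Prometheus items above a little more concrete, here is a minimal sketch of reading metrics in the Prometheus text exposition format and flagging busy queues. It works on a hard-coded sample rather than a live NiFi node. The metric names, labels, and values in the sample are illustrative assumptions, so check what your own instance exposes before relying on them.

```python
import re

# Hypothetical scrape output. Metric names, labels, and values are invented
# for illustration and are not taken from a real NiFi instance.
SAMPLE_SCRAPE = """\
# HELP nifi_amount_flowfiles_queued Number of FlowFiles queued
# TYPE nifi_amount_flowfiles_queued gauge
nifi_amount_flowfiles_queued{component_name="ingest",component_type="Connection"} 120.0
nifi_amount_flowfiles_queued{component_name="publish",component_type="Connection"} 8500.0
nifi_jvm_heap_usage{instance="nifi-node-1"} 0.42
"""

# A sample line looks like: metric_name{label="value",...} 123.0
# Lines carrying an optional trailing timestamp are skipped by this sketch.
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything this simple parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def queues_over(samples, threshold, metric="nifi_amount_flowfiles_queued"):
    """Names of components whose queued FlowFile count exceeds threshold."""
    return sorted(
        labels.get("component_name", "?")
        for name, labels, value in samples
        if name == metric and value > threshold
    )


if __name__ == "__main__":
    print(queues_over(parse_metrics(SAMPLE_SCRAPE), threshold=1000))
```

In practice you would let Prometheus scrape the endpoint and express the threshold as an alerting rule. The sketch only shows what the scraped data looks like and how little is needed to act on it.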
If you missed it, Cloudera gave a webinar about NiFi’s monitoring capability, and you can watch the replay on demand.
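To give a feel for the kind of query QueryNiFiReportingTask makes possible, here is a small sketch that runs a similar SQL statement against a stand-in table in SQLite instead of NiFi. The table name CONNECTION_STATUS and the columns queuedCount and backPressureObjectThreshold reflect our understanding of what the reporting task exposes. Treat them as assumptions and check the reporting task's documentation for the exact schema and SQL dialect.

```python
import sqlite3

# Find connections whose queue is at 80% or more of the object threshold.
# Table and column names are assumptions modeled on NiFi's status data.
QUERY = """
SELECT name, queuedCount, backPressureObjectThreshold
FROM CONNECTION_STATUS
WHERE queuedCount >= 0.8 * backPressureObjectThreshold
ORDER BY queuedCount DESC
"""


def build_demo_db():
    """Create an in-memory table with a few made-up connection rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE CONNECTION_STATUS ("
        "name TEXT, queuedCount INTEGER, backPressureObjectThreshold INTEGER)"
    )
    conn.executemany(
        "INSERT INTO CONNECTION_STATUS VALUES (?, ?, ?)",
        [
            ("ingest -> parse", 120, 10000),
            ("parse -> publish", 8500, 10000),
            ("publish -> archive", 10000, 10000),
        ],
    )
    return conn


def connections_near_back_pressure(conn):
    """Run the query and return the matching rows as tuples."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in connections_near_back_pressure(build_demo_db()):
        print(row)
```

Inside NiFi, the query would be configured on the reporting task and the matching rows would go to the sink you chose, for example a Kafka topic that feeds your alerting.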
NiFi’s user experience
- Run Once: when building flows during the development phase, you can now right-click on a processor and click “Run Once” instead of starting the processor and stopping it right away.
- Empty all queues: you can now empty all queues of a process group in one click. This recursively removes all the flow files queued in the connections of that process group, assuming you have the permissions to do so.
- Import flow definition: by dragging and dropping a process group on the canvas, you can now easily import a flow definition that you exported from another environment. To export it in the first place, right-click on the process group and select “Download flow definition”. That’s also how you can import your flows in the DataFlow service and run NiFi on Kubernetes!
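The same actions can usually be scripted through NiFi's REST API. As a hedged sketch, here is how a “Run Once” request could be put together in Python. The endpoint path, the RUN_ONCE state value, and the bearer-token header are our understanding of the API rather than something stated in this post, and the host and processor ID are placeholders. The request is only built here, not sent.

```python
import json
import urllib.request


def build_run_once_request(base_url, processor_id, revision_version, token=None):
    """Build (but do not send) a PUT request asking a processor to run once.

    Assumptions to verify against the REST API docs for your NiFi version:
    the /nifi-api/processors/{id}/run-status path and the "RUN_ONCE" state.
    """
    url = f"{base_url.rstrip('/')}/nifi-api/processors/{processor_id}/run-status"
    body = {
        # NiFi uses optimistic locking: the version must match the
        # processor's current revision, which you would fetch first.
        "revision": {"version": revision_version},
        "state": "RUN_ONCE",
    }
    headers = {"Content-Type": "application/json"}
    if token:
        # Bearer-token authentication is an assumption for this sketch.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="PUT",
    )


if __name__ == "__main__":
    # Placeholder host and processor ID, for illustration only.
    req = build_run_once_request("https://nifi.example.com:8443", "abc-123", 4)
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Sending the request is then a matter of passing it to urllib.request.urlopen with the TLS and authentication settings of your environment.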
NiFi’s components
We’ve added no fewer than 40 new components (processors, controller services, and reporting tasks) to NiFi since the previous CFM 2.0.4 release. Cloudera is committed to providing you with the best options to move data from any system to any other system. Among these 40 new components, you may be interested in the processors that interact with Microsoft Azure Data Lake Storage, or in the new ScriptedTransformRecord processor, which lets you define business-specific transformations when processing your data in NiFi. We’re also adding a distributed cache based on Hazelcast, so you get a truly distributed cache embedded into NiFi.
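As a rough mental model of what a per-record scripted transformation looks like, here is the idea in plain Python, outside NiFi. A function receives one record and returns either a transformed record or nothing to drop it. The field names and the business rule are invented for illustration. The script bindings and supported languages of the real processor should be checked in its documentation.

```python
def transform(record):
    """Apply a made-up business rule to one record (a plain dict here).

    Returns a new dict, or None to signal that the record should be dropped.
    """
    if record.get("amount") is None:
        return None  # drop records that carry no amount
    out = dict(record)  # copy so the input record is left untouched
    out["amount_eur"] = round(record["amount"] * record.get("fx_rate", 1.0), 2)
    out["customer"] = record["customer"].strip().upper()
    return out


def transform_all(records):
    """Run transform over every record and keep only the non-dropped ones."""
    results = []
    for record in records:
        transformed = transform(record)
        if transformed is not None:
            results.append(transformed)
    return results


if __name__ == "__main__":
    demo = [
        {"customer": " acme ", "amount": 10.0, "fx_rate": 0.5},
        {"customer": "globex", "amount": None},
    ]
    print(transform_all(demo))
```

In NiFi, the record reader and writer take care of parsing and serializing the data, so the script only has to express the transformation itself.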
You can have a look at the release notes for more details. We hope you’ll enjoy these new releases, and you can get NiFi clusters up and running in Amazon Web Services, Microsoft Azure, and Google Cloud in no time! Give it a try!