Special co-author credits: Adam Andras Toth, Software Engineer Intern
With enterprises’ data analytics and processing needs growing more complex by the day, Cloudera strives to keep up by offering constantly evolving, cutting-edge solutions to all your data-related problems. Cloudera Stream Processing aims to take real-time data analytics to the next level. We’re excited to highlight job monitoring with notifications, a new feature for SQL Stream Builder (SSB).
What problem are we solving with job notifications?
The sudden failure of a complex data pipeline can have devastating consequences, especially if it goes unnoticed. A real-time financial fraud detector, or a complex architecture that collects and aggregates data so that customers can make data-driven decisions: these are systems with little to no room for error or extended downtime. This is why we built job notifications into SSB, to help you keep your complex real-time data pipelines reliable.
How job notifications will make your life easier
Job notifications let you detect failed jobs without checking the UI, which can save a lot of time. This is especially useful when you have numerous jobs running and keeping track of their states would be hard without notifications.
Architecture
First, we would like to introduce the architecture of job notifications. Let us use a figure to demonstrate how job notifications fit into SSB, then we will discuss each type separately.
Overview
In SSB you can manage multiple projects. Projects represent the software development life cycle (SDLC) in SQL Stream Builder (SSB): you can create an environment for developing SQL jobs, verifying the results, and then pushing them to production. The resources for job creation are shared among the jobs of a project, and there are also resources that can be shared between projects. The basic concept of a project can be expanded for collaboration by sharing the project with team members in Streaming SQL Console, or by using source control to synchronize the project with a Git repository.
Job notifications also belong to projects. That means you can define multiple notifications in one project, and those notifications can only be assigned to the jobs of that project. In the figure below, you can see the architecture of a project from the perspective of job notifications. As of now there are two types of notifications: email and webhook. Notifications can also be organized into groups. The benefit is that if you want to assign the same set of notifications to multiple jobs, you don’t have to do it one by one in every job. You can just create a notification group and assign that to the jobs. One notification can be included in multiple groups, and a group can even contain another group.
In the figure below, the same job notifications are marked with the same color. As you can see, the project has three jobs. The first job has only individual notifications assigned, so if that job fails, these four notifications will fire. The second job has a webhook notification and a notification group that contains another webhook and an email notification, so if this job fails, these three notifications will go off. The third job has a webhook notification, a group that contains an email notification, and another notification group that contains two notifications, so if this job fails, these four notifications will fire.
Notifications
As I mentioned before, there are two types of notifications, and you can organize them into groups. I will first introduce placeholders, which you can use in the messages of your notifications.
Placeholders
The email message or webhook request that is sent when a notification is triggered can be completely customized. SSB also lets you use placeholders to include all the necessary information in the notification. With customizable messages and placeholders, you can potentially parse the incoming notifications automatically and create automatic responses to them, so that critical pipelines can be restarted without human intervention.
The placeholders currently available for usage are:
- jobName
- jobStatus
- jobStatusDescription
- ssbJobId
- flinkJobId
- clusterId
- lastException
You can use a placeholder in the following format: “Houston we have a problem, your job with name ${jobName} has failed.”
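To make the substitution concrete, here is a minimal sketch of how a `${...}` placeholder template could be filled in. It only mimics the idea with Python's `string.Template`; it is not how SSB implements it, and the job values and the `render` helper are made up for illustration.

```python
from string import Template

# Hypothetical job details; the keys follow the placeholder names listed above.
job = {
    "jobName": "fraud_detector",
    "jobStatus": "FAILED",
    "lastException": "java.lang.RuntimeException: boom",
}


def render(template: str, values: dict) -> str:
    """Fill in ${name} placeholders; unknown placeholders are left untouched."""
    return Template(template).safe_substitute(values)


message = render("Job ${jobName} is ${jobStatus}: ${lastException}", job)
print(message)
```

Using `safe_substitute` here is a design choice for the sketch: a typo in a placeholder name leaves the raw `${...}` text visible in the message instead of raising an error.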
Email notifications
Email notifications (as the name suggests) send an email to the given address when a job fails. For this to work, some Cloudera Manager (CM) properties need to be configured:
- Mail server host for job notifications: The host of the SMTP server for job failure notifications
- Mail server username for job notifications: The username to access the SMTP server for job failure notifications
- Mail server password for job notifications: The password to access the SMTP server for job failure notifications
- SMTP authentication for job notifications: Enable SMTP authentication for job notifications (default value: True)
- StartTLS for job notifications: Use the StartTLS command to establish a secure connection to the SMTP server for job notifications (default value: True)
- Job notifications sender mail address: Sender mail address for job notifications
- Mail server port for job notifications: The port of the SMTP server for job failure notifications (default value: 587)
If these properties are set up properly and you add an email notification to your job, you should get an email when the job fails.
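To show what the properties above mean in practice, the following sketch sends a failure email over SMTP with StartTLS and authentication, using Python's standard library. It is only an illustration of the protocol steps, not SSB's own code; the host, credentials, addresses, and function names are all placeholder assumptions.

```python
import smtplib
from email.message import EmailMessage

# Hypothetical values standing in for the CM properties listed above.
SMTP_HOST = "smtp.example.com"
SMTP_PORT = 587  # matches the default port mentioned above
SMTP_USER = "ssb-notifier"
SMTP_PASSWORD = "change-me"
SENDER = "ssb@example.com"


def build_message(recipient: str, job_name: str, last_exception: str) -> EmailMessage:
    """Create the notification email for a failed job."""
    msg = EmailMessage()
    msg["From"] = SENDER
    msg["To"] = recipient
    msg["Subject"] = f"SSB job {job_name} failed"
    msg.set_content(f"Job {job_name} failed.\nLast exception: {last_exception}")
    return msg


def send(msg: EmailMessage, use_starttls: bool = True, use_auth: bool = True) -> None:
    """Deliver the message; the two flags mirror the StartTLS and SMTP auth settings."""
    with smtplib.SMTP(SMTP_HOST, SMTP_PORT, timeout=10) as server:
        if use_starttls:
            server.starttls()  # upgrade the connection to TLS before logging in
        if use_auth:
            server.login(SMTP_USER, SMTP_PASSWORD)
        server.send_message(msg)


msg = build_message("oncall@example.com", "fraud_detector", "java.lang.RuntimeException: boom")
# send(msg)  # only works against a real, reachable SMTP server
```

If emails do not arrive, a small script like this can help check whether the SMTP host, port, and credentials work outside of SSB.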
Webhook notifications
With webhook notifications you can make webhook requests upon a job failure. If you use the placeholders correctly, then you can use the defined webhook endpoints of external applications to handle the failures in a more efficient way. (For example, you can set up a webhook notification with Slack to send you a message directly if a job fails.)
In the case of webhook notifications you can set one property in CM:
- Job notifications webhook sender parallelism: Number of threads used by the job notification task to call user-specified webhooks when notifying about a failed or missing job (default value: 10)
DISCLAIMER: The payload template of a webhook notification must be valid JSON! Also make sure to put placeholders within quotes!
E.g.:
- "name": ${jobName} is invalid
- "name":"${jobName}" is valid
- "name":"whatever i want here ${jobName}" is also valid
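The quoting rule above can be checked with a few lines of Python. This sketch fills in a sample value and then tries to parse the result as JSON. Whether SSB validates the template before or after substitution is not stated here, so treat the approach, the helper name, and the sample value as assumptions.

```python
import json
from string import Template


def is_valid_payload(template: str, values: dict) -> bool:
    """Return True if the template is valid JSON once placeholders are filled in."""
    rendered = Template(template).safe_substitute(values)
    try:
        json.loads(rendered)
        return True
    except json.JSONDecodeError:
        return False


sample = {"jobName": "fraud_detector"}  # hypothetical job name

print(is_valid_payload('{"name": ${jobName}}', sample))    # False: value is unquoted
print(is_valid_payload('{"name": "${jobName}"}', sample))  # True
print(is_valid_payload('{"name": "whatever i want here ${jobName}"}', sample))  # True
```

Note that this sketch does not escape the substituted values, so a value that itself contains a double quote would still break the rendered JSON.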
Notification groups
As I mentioned above, you can organize your notifications into groups. This way you don’t need to add all the notifications to the jobs one by one. A cool thing about groups is that they can also contain other notification groups.
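Because groups can contain other groups, finding out which notifications a job ends up with is a small recursive lookup. The sketch below shows one way to think about it. The data layout, the names, and the choice to fire each notification only once are assumptions for illustration, not a description of SSB internals.

```python
def resolve(assigned, groups, _seen=None):
    """Flatten notifications and (possibly nested) groups into a set of notification names."""
    seen = set() if _seen is None else _seen
    result = set()
    for item in assigned:
        if item in groups:
            if item in seen:
                continue  # group already expanded; also guards against cycles
            seen.add(item)
            result |= resolve(groups[item], groups, seen)
        else:
            result.add(item)  # a plain email or webhook notification
    return result


# Hypothetical project setup: "critical" contains the "oncall" group.
groups = {
    "oncall": ["email_oncall", "webhook_slack"],
    "critical": ["oncall", "webhook_pager"],
}
job_assignments = ["webhook_audit", "critical"]

print(sorted(resolve(job_assignments, groups)))
# ['email_oncall', 'webhook_audit', 'webhook_pager', 'webhook_slack']
```

Collecting the result in a set means a notification that is reachable through several groups appears only once in this sketch.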
How to use job notifications
SSB’s job notifications feature is a cool way to keep track of your failing jobs and thus minimize their downtime. You just need to make sure the “enable job notifications” functionality in CM is checked. The job-monitoring task periodically queries the state of your jobs, and triggers the assigned notifications if a failed job is found. The check interval can be configured in CM with the job notifications monitoring interval property (default value: 60s).
In this section I will show you some video examples of how to use job notifications.
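The periodic check described above can be pictured as a simple polling loop. The following is only a conceptual sketch: the function names, the "FAILED" state string, and the decision to notify once per failed job are assumptions, and the real job-monitoring task may behave differently.

```python
import time


def check_once(jobs, notified, notify):
    """Call notify(job_id) for every failed job that has not been reported yet."""
    for job_id, state in jobs.items():
        if state == "FAILED" and job_id not in notified:
            notify(job_id)
            notified.add(job_id)


def monitor(get_jobs, notify, interval_s=60, rounds=None):
    """Poll job states every interval_s seconds; rounds=None means run forever."""
    notified = set()
    done = 0
    while rounds is None or done < rounds:
        check_once(get_jobs(), notified, notify)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_s)


# Usage with hypothetical job states and a list that records fired notifications.
fired = []
monitor(lambda: {"job-1": "RUNNING", "job-2": "FAILED"}, fired.append, interval_s=0, rounds=2)
print(fired)  # ['job-2']
```

With a loop like this, a failure can go undetected for up to one full interval, which is the trade-off behind the monitoring interval setting.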
Create and use an Email notification:
Create and use a Webhook notification:
Create and use a Notification Group:
Try it out yourself!
Anybody can try out SSB using the Stream Processing Community Edition (CSP-CE). CE makes developing stream processors easy, as it can be done right from your desktop or any other development node. Analysts, data scientists, and developers can now evaluate new features, develop SQL-based stream processors locally using SQL Stream Builder powered by Flink, and develop Kafka Consumers/Producers and Kafka Connect Connectors, all locally before moving to production in CDP.