Data and analytics have become second nature to most businesses, but merely having access to vast volumes of data from connected devices will no longer suffice. Leading enterprises realize that the speed of data presents a new frontier for competitive differentiation. Organizations must reduce time-to-insight to gain a competitive advantage by responding decisively to competitors, fine-tuning operations, and serving fickle customers.
According to Gartner, by 2022, more than half of major new business systems will incorporate “continuous intelligence” that uses real-time context data to improve decision-making.
A modern streaming architecture supported by an enterprise data cloud meets the needs of today’s real-time data, which is driven by new connected data sources and their sophisticated integration across business processes. This entails next-generation stream processing and analytics to ingest, process, and deliver real-time data. In an earlier blog post, I introduced Cloudera SQL Stream Builder and described how it augments the powerful stream processing capabilities of the Cloudera DataFlow (CDF) platform by accelerating time to market and democratizing access to real-time data using continuous SQL.
SQL Stream Builder Transforms Events to Insights Immediately
Data has a short shelf life and its dynamic nature requires it to be processed as soon as it is received. Information becomes less valuable as time passes and streaming operates at a velocity that makes a critical difference between finding out what happened and stopping it.
Cloudera SQL Stream Builder queries data streams continuously, so it instantly identifies value in newly created data and recalculates aggregate queries when there are changes. This produces granular, accurate analytics in real time, which cannot be achieved with typical batch processing methods.
This feature is essential for gaining insights while events are occurring so that intervention is possible. In many cases, earlier detection enables a predictive rather than reactive approach, nipping issues in the bud before they develop further.
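To make the idea of a continuously recalculated aggregate concrete, here is a minimal sketch in plain Python. It is not SQL Stream Builder code and does not use any Cloudera API; the function name and the sample sensor readings are hypothetical. It simply shows an average being updated as each event arrives, rather than being recomputed later in a batch.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def continuous_average(
    events: Iterable[Tuple[str, float]]
) -> Iterator[Tuple[str, float]]:
    """Yield the updated running average for a key each time an event arrives."""
    counts: Dict[str, int] = defaultdict(int)
    totals: Dict[str, float] = defaultdict(float)
    for key, value in events:
        counts[key] += 1
        totals[key] += value
        # The aggregate is refreshed immediately, one event at a time.
        yield key, totals[key] / counts[key]


# Hypothetical readings, in arrival order: (sensor id, measured value).
readings = [("sensor-a", 10.0), ("sensor-b", 4.0), ("sensor-a", 20.0)]
results = list(continuous_average(readings))
for key, average in results:
    print(key, average)
```

A batch job would report one average per sensor after all the data had landed; the continuous version emits a fresh result with every event, which is what allows intervention while events are still unfolding.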
Severstal is one of Russia’s largest producers of iron ore and coking coal and is a prime high-quality supplier of flats, longs, and steel pipes for the construction, automotive, machinery, and oil and gas industries.
Aiming to optimize steel production while maintaining product quality, the firm implemented a solution that detects defects on the steel surface. It includes ten cameras on the production line that send over five million photos of steel surfaces each day to a computer vision (CV) model. The model processes each photo, predicts whether it contains defects, and sends the result to a web application.
Millions of messages from IoT devices and sensor data from the production machinery are collected each minute, and this vast amount of data makes it possible to use advanced analytic techniques for operations optimization. Since implementation, performance has increased by more than 6.5%, which provides more than 100 thousand tonnes of additional metal processing per year.
Employ scalable solutions to process vast amounts of data
Businesses commonly cite data complexity, people skills or resource availability, cost, and security as obstacles to implementing real-time analytics.
The majority of legacy data platforms lack scalability and flexibility, often resulting in expensive IT overheads. Costs will continue to swell as the business moves toward becoming more and more data-driven.
To help make sense of large volumes of data, streaming SQL can transform, filter, aggregate, and enrich it. Bringing these functions together allows organizations to extract the maximum value from the data streaming into their systems by drawing connections that are not immediately apparent.
Vodafone Automotive offers telematic services and electronic products for the automotive sector, particularly stolen vehicle tracking and emergency assistance, usage-based insurance, and fleet management. It created the User-Based Insurance project (UBI) to store, stream, and analyze ever-increasing amounts of vehicle data in a flexible and resilient way, and to scale its architecture to manage both real-time processing of data and storage for subsequent processing. Without a capable data platform, the quantity of data saturates processing capacity, especially during peak hours.
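The four operations named above can be sketched as a small chain of Python generators. This is an illustrative stand-in only, not streaming SQL or a Cloudera API: the event fields, the `DEVICE_SITES` lookup table, and the 80 °C threshold are all assumptions made up for the example.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator

# Hypothetical reference data used to enrich events with a site name.
DEVICE_SITES: Dict[str, str] = {"d1": "plant-north", "d2": "plant-south"}


def transform(events: Iterable[dict]) -> Iterator[dict]:
    """Transform: add a Celsius reading derived from the Fahrenheit one."""
    for event in events:
        yield {**event, "temp_c": (event["temp_f"] - 32) * 5 / 9}


def keep_hot(events: Iterable[dict], threshold_c: float = 80.0) -> Iterator[dict]:
    """Filter: keep only events above the temperature threshold."""
    return (event for event in events if event["temp_c"] > threshold_c)


def enrich(events: Iterable[dict], sites: Dict[str, str]) -> Iterator[dict]:
    """Enrich: attach the site for each device, or 'unknown' if not listed."""
    for event in events:
        yield {**event, "site": sites.get(event["device"], "unknown")}


def count_by_site(events: Iterable[dict]) -> Counter:
    """Aggregate: count the remaining events per site."""
    return Counter(event["site"] for event in events)


raw = [
    {"device": "d1", "temp_f": 212.0},
    {"device": "d2", "temp_f": 68.0},
    {"device": "d1", "temp_f": 194.0},
    {"device": "d3", "temp_f": 203.0},
]
hot_counts = count_by_site(enrich(keep_hot(transform(raw)), DEVICE_SITES))
print(dict(hot_counts))
```

Each stage consumes events lazily from the previous one, so the steps compose in the way that clauses of a single streaming SQL statement would. The enrichment step is where connections that are not apparent in the raw events, such as which site a device belongs to, become visible.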
A real-time, streaming platform allows Vodafone Automotive to process information with latencies of a few seconds regardless of quantity and frequency, gathering information on trips, speed data, and geographical information acquired through GPS. The platform accounts for time-sensitive information in the event of a crash or when a theft attempt alarm is recorded. Moreover, the ability to acquire and process data in real-time unlocks new possibilities through the use of machine learning.
Liberate access to self-service analytics for any user persona
A whole host of newfound possibilities for using enterprise data, from streamlining manufacturing operations and reducing cyber threats to predicting customer buying patterns, has embedded data usage across business functions. However, because real-time queries are typically written by those with specialized skills such as Scala or Java, there can be a mismatch between available expertise and growing workloads. A report by AOPG in 2019 revealed that 59% of firms in ASEAN feel that the lack of skilled personnel and resources hinders their ability to implement real-time analytics.
SQL Stream Builder enables any user persona to construct streaming applications with SQL, without the need to write Java or Scala code, as SQL is one of the most popular skills among enterprise users from data analysts to data scientists. Assets and resources are optimized while data is democratized. In addition, users can join multiple data streams and perform aggregations, which encourages agility and innovation as teams explore data to find novel ways to inform critical business decisions.
While expanding data access using Cloudera DataFlow, companies can ensure unified security and governance across the entire platform with Shared Data Experience (SDX).
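To illustrate what joining multiple data streams involves, the sketch below pairs events from two streams that share a key and arrive within a few seconds of each other. It is a conceptual illustration in plain Python, not SQL Stream Builder syntax; the `interval_join` function, the event layout, and the vehicle data are hypothetical, and both inputs are assumed to be ordered by timestamp.

```python
import heapq
from collections import defaultdict
from typing import Iterable, Iterator, Tuple

Event = Tuple[float, str, str]  # (timestamp in seconds, key, payload)


def interval_join(
    left: Iterable[Event], right: Iterable[Event], window_s: float
) -> Iterator[Tuple[str, str, str]]:
    """Pair left and right events with the same key that occur within window_s."""
    tagged_left = ((ts, 0, key, payload) for ts, key, payload in left)
    tagged_right = ((ts, 1, key, payload) for ts, key, payload in right)
    # One buffer of recent events per side, grouped by key.
    seen = (defaultdict(list), defaultdict(list))
    # Merge the two time-ordered streams into a single time-ordered sequence.
    for ts, side, key, payload in heapq.merge(tagged_left, tagged_right):
        candidates = seen[1 - side][key]
        # Discard buffered events that are now too old to match.
        candidates[:] = [(t, p) for t, p in candidates if ts - t <= window_s]
        for _, other_payload in candidates:
            if side == 0:
                yield key, payload, other_payload
            else:
                yield key, other_payload, payload
        seen[side][key].append((ts, payload))


# Hypothetical streams: speed readings and alarms, keyed by vehicle.
speeds = [(1.0, "car-1", "speed=62"), (2.0, "car-2", "speed=40"), (30.0, "car-1", "speed=0")]
alarms = [(3.0, "car-1", "alarm=crash"), (100.0, "car-2", "alarm=theft")]
matches = list(interval_join(speeds, alarms, window_s=5.0))
print(matches)
```

Writing and maintaining this kind of buffering and expiry logic by hand is the work that a declarative SQL join spares the analyst, which is the point of making streams queryable with SQL.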
Cloudera DataFlow, powered by Apache NiFi, Apache Flink and Apache Kafka, is available on the CDP platform, making edge-to-cloud data management possible as part of a complete and connected data life cycle.
If you want to learn more about SQL Stream Builder, download our Tech Brief or the datasheet.
For a live demo of this product, attend our webinar on 2nd June.