Resilience in Action: How Cloudera’s Platform, and Data in Motion Solutions, Stayed Strong Amid the CrowdStrike Outage

Late last week, the tech world witnessed a significant disruption caused by a faulty update from CrowdStrike, a cybersecurity software company that focuses on protecting endpoints, cloud workloads, identity, and data. This update led to global IT outages, severely affecting various sectors such as banking, airlines, and healthcare. Many organizations found their systems rendered inoperative, highlighting the critical importance of system resilience and reliability. 

However, amid this disruption, one Cloudera customer reported that although many of their systems were impacted, Cloudera’s data-in-motion stack demonstrated remarkable resilience and experienced no downtime. Here, we’ll briefly discuss the incident and how Cloudera protected its customers’ most critical analytic workloads from potential downtime.

The Incident: A Brief Overview

The CrowdStrike incident, which stemmed from a problematic update to their Falcon platform, caused widespread compatibility issues with Microsoft systems. This resulted in numerous systems experiencing the infamous Windows “blue screen of death” among other operational failures. While this incident did not involve a cyberattack, the technical glitch led to significant disruptions to global operations.

Cloudera’s Resilience – Data in Motion and the Entire Cloudera Data Platform

The Cloudera customer reported that although many of their systems went down, Cloudera services running on Linux instances in Amazon Web Services (AWS) remained up and functional. These services included their data-in-motion stack. It’s important to note, however, that Cloudera’s entire platform and all of its hybrid cloud data services are equally resilient, largely because of Cloudera’s focus on high availability and disaster tolerance and its long history of running mission-critical workloads for large enterprise customers.

Cloudera offers the only open true hybrid platform for data, analytics and AI, and with that comes unique opportunities for supporting high availability and disaster tolerance. With portable data services that can run on any cloud and on premises, you can configure multiple sites that span different clouds and on-premises resources, reducing your dependency on any single platform, vendor, or service. For more information on how Cloudera is designed for resilience, read the Cloudera blog on Disaster Recovery, and follow the Cloudera Reference Architecture for Disaster Recovery for guidance and best practices to further your own resilience and availability goals with Cloudera.
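
To make the redundancy argument concrete, here is a minimal Python sketch of the standard availability arithmetic for sites that can each serve the workload on their own. It is an illustration only: it assumes site failures are independent (shared dependencies such as a common endpoint agent would weaken that), the function name combined_availability is ours, and the availability figures are hypothetical rather than Cloudera measurements.

```python
from math import prod


def combined_availability(site_availabilities):
    """Probability that at least one site is up.

    Assumes every site can carry the workload alone and that
    sites fail independently of one another (an idealization).
    """
    if not site_availabilities:
        raise ValueError("at least one site is required")
    for a in site_availabilities:
        if not 0.0 <= a <= 1.0:
            raise ValueError(f"availability must be in [0, 1], got {a}")
    # The service is down only when every site is down at the same time.
    all_down = prod(1.0 - a for a in site_availabilities)
    return 1.0 - all_down


if __name__ == "__main__":
    # Hypothetical figures: one cloud region and one on-premises site.
    single = combined_availability([0.99])
    hybrid = combined_availability([0.99, 0.99])
    print(f"single site: {single:.4%}")
    print(f"two independent sites: {hybrid:.4%}")
```

Under these assumptions, adding a second independent site moves the hypothetical figure from two nines to four nines. Real deployments rarely achieve full independence, which is why reducing shared dependencies across sites matters as much as adding sites.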

Data in motion is a set of technologies, including Apache NiFi, Apache Flink, and Apache Kafka, that enable customers to capture, process, and distribute any data anywhere, enabling real-time analytics, AI, and machine learning. These technologies are key components for many mission-critical workloads and applications – from network monitoring and service assurance in telecommunications to fraud detection and prevention in financial services. Real-time workloads, when they are mission critical, carry the additional weight of timeliness, and, as such, a potential outage could have a significantly greater business impact compared to less time-critical workloads.

Fortunately for this and many other Cloudera customers, data in motion has been designed to Cloudera’s most exacting standards for high availability and disaster tolerance, including support for hybrid cloud. As a result, even if some components had depended on a CrowdStrike-affected system or service, that dependency would not have been a single point of failure for the platform. The continuity of service that this customer experienced underscores the reliability and resilience of Cloudera, even in the face of significant external disruptions, as well as Cloudera’s potential for reducing the business impact of cloud provider outages.

Architect for Resilience, Especially for Real-Time Applications

The CrowdStrike incident is not the first major service disruption that businesses have experienced, and it very likely will not be the last. The cloud provides many benefits from a cost, flexibility, and scalability perspective, especially for analytic workloads. However, it also comes with some operational risk. Many workloads and applications that rely on the real-time capturing, processing, and analysis of data have zero tolerance for downtime.

Cloudera’s platform, and the data-in-motion stack, are built with resilience in mind. Cloudera’s unique approach to hybrid cloud and investment in proven architectures for high availability and disaster tolerance can mitigate the challenges many companies have experienced in the past few days, protecting their mission-critical workloads and ensuring business continuity.

Learn more about Cloudera and data in motion here.

Jeremiah Morrow
Product Marketing Manager