Using Cloudera Machine Learning to Build a Predictive Maintenance Model for Jet Engines

Posted in Technical | October 14, 2020 | 5 min read

Introduction

Running a large commercial airline requires managing many critical components at once, including fuel futures contracts, aircraft maintenance and customer expectations. Airlines in the U.S. alone average about 45,000 daily flights, carrying roughly 2.9 million passengers a day (source: FAA). Airlines typically operate on very thin margins, and any schedule delay immediately angers or frustrates customers. Flying is not inherently dangerous, but the consequences of a failure can be catastrophic. As a result, airlines have developed a sophisticated business model built on streamlined supply chains, predictive maintenance, and unwavering customer satisfaction.

To maximize safety for all passengers and crew members while still delivering profits, airlines have invested heavily in predictive analytics to gain insight into the most cost-effective way to maintain real-time engine performance. Additionally, airlines ensure the availability and reliability of their fleets by leveraging maintenance, repair and overhaul (MRO) organizations, such as Lufthansa Technik.

Lufthansa Technik is an MRO that worked with Cloudera to build a predictive maintenance platform that services a fleet of 5,000 aircraft throughout its global network of 800 MRO facilities. Lufthansa Technik extended the standard practice of placing sensors on aircraft engines for predictive maintenance so that it also automates fulfillment solutions. By applying deep airline operations expertise, data science, and engine analytics to a predictive maintenance schedule, Lufthansa Technik can now ensure critical parts are on the ground (OTG) when needed, instead of the entire aircraft being OTG and not producing revenue.

The objective of this blog is to show how to use Cloudera Machine Learning (CML), running on Cloudera Data Platform (CDP), to build a predictive maintenance model based on advanced machine learning concepts.

The Process

Many companies build machine learning models using libraries, whether they are building perception layers for autonomous vehicles, enabling autonomous vehicle operation, or modeling a complex jet engine. Kaggle, a site that hosts test and training data sets for building machine learning models, provides simulation data sets from NASA that measure engine component degradation for turbofan jet engines. The models in this blog are built on CML and take as input various engine parameters, such as typical sensor values for engine temperature, fuel consumption, vibration, or fuel-to-oxygen mixture (see Fig 1). Note that in this blog the term “failure” does not imply catastrophic failure, but rather that one of the engine's components (pumps, valves, etc.) is not operating to specification. Aircraft are designed to operate at 99.999% reliability.

Fig 1: Turbofan jet engine

Step 1: Using the training data to create a model/classifier

First, four test and training data sets for varying conditions and failure modes were organized in preparation for CML (see box 1 in Fig 2).

Each set of training data shows the engine parameters per flight, with each engine “flown” until an engine component signals failure. This is done both at sea level and across all flight conditions. This data is used to train a model that can predict how many flights a given engine has left until failure.

For each training set, there is a corresponding test data set that provides data on 100 jet engines at various stages of life, with actual values against which to test the predictive model for accuracy.
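
As a rough illustration of how run-to-failure data like this can be turned into a training target, the sketch below labels each recorded flight with the number of flights remaining before failure. It is not the code used in this blog: the function name `label_remaining_flights`, the `(engine_id, flight_number)` record layout, and the sample records are all hypothetical, and it assumes every engine in the training data was flown until a component signaled failure.

```python
from collections import defaultdict


def label_remaining_flights(records):
    """Attach the number of flights remaining until failure to each record.

    Assumes each engine was run until a component signaled failure, so the
    last recorded flight for an engine has zero flights remaining.
    `records` is a list of (engine_id, flight_number) tuples (hypothetical layout).
    """
    last_flight = defaultdict(int)
    for engine_id, flight in records:
        last_flight[engine_id] = max(last_flight[engine_id], flight)

    return [
        (engine_id, flight, last_flight[engine_id] - flight)
        for engine_id, flight in records
    ]


# Made-up records for two hypothetical engines, not real NASA data.
records = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2)]
labeled = label_remaining_flights(records)

for engine_id, flight, remaining in labeled:
    print(f"engine {engine_id}, flight {flight}: {remaining} flights remaining")
```

In a real pipeline the sensor readings for each flight would travel alongside these labels, so that a model can learn to map sensor values to flights remaining.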

Fig 2: Diagram showing how CML is used to build ML training models

Step 2: Iterate on the model to validate and improve effectiveness

CML was used to create a model that estimated the remaining useful life (RUL) of a given engine using the provided test and training data sets. A threshold of one week, the time allowance to place parts on the ground, was planned for a scenario that alerts an airline before a potential engine component failure. At four flights daily, one week is 28 flights; the model uses a threshold of 40 flights, which covers that window with some margin. The airline would therefore like to know with confidence whether an engine is going to fail within 40 flights. The model was tested for each engine, and the results were classified as true or false for potential failure within 40 flights (see Table 1).

Table 1: Classification results based on a one-week window of 40 flights.
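
The true/false classification described above can be sketched as follows. This is a minimal illustration, not the blog's actual model code: the helper names and the predicted and actual RUL values are made up, and "within 40 flights" is assumed to mean an RUL of 40 flights or fewer.

```python
FLIGHT_THRESHOLD = 40  # alert window in flights, as used in this post


def will_fail_soon(remaining_flights, threshold=FLIGHT_THRESHOLD):
    """Return True if the engine is expected to fail within the threshold."""
    return remaining_flights <= threshold


def tally_outcomes(predicted_rul, actual_rul, threshold=FLIGHT_THRESHOLD):
    """Count true/false positives and negatives for the failure alert."""
    counts = {
        "true_positive": 0,
        "false_positive": 0,
        "true_negative": 0,
        "false_negative": 0,
    }
    for predicted, actual in zip(predicted_rul, actual_rul):
        flagged = will_fail_soon(predicted, threshold)
        failing = will_fail_soon(actual, threshold)
        if flagged and failing:
            counts["true_positive"] += 1
        elif flagged:
            counts["false_positive"] += 1
        elif failing:
            counts["false_negative"] += 1
        else:
            counts["true_negative"] += 1
    return counts


# Made-up predictions and ground truth for five hypothetical engines.
predicted_rul = [12, 55, 38, 90, 41]
actual_rul = [10, 35, 60, 120, 44]

outcome_counts = tally_outcomes(predicted_rul, actual_rul)
print(outcome_counts)
```

A false negative here is an engine the model did not flag even though it actually fails within the window, which is the outcome an airline most wants to avoid.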

Step 3: Apply an added cost value to the results

With no preventative maintenance, an engine that runs out of life or fails can compromise safety and cost millions of dollars more to replace. If an engine is maintained or overhauled before it runs out of life, the cost of the overhaul is significantly lower. However, if the engine is overhauled too early, potential engine life that could still have been used is wasted. The estimated cost in this model for each of these overhaul outcomes is shown below (see Fig 3).

Fig 3: Cost-benefit confusion matrix
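
To make the cost-benefit idea concrete, the sketch below applies a cost to each classification outcome and compares the total against a purely reactive policy. Every number in it is a placeholder chosen for illustration: the dollar values are not the ones shown in Fig 3, and the outcome counts are not the results from Table 1.

```python
# Hypothetical cost per outcome in dollars. These are NOT the values in Fig 3.
COST_PER_OUTCOME = {
    "true_positive": 300_000,     # overhauled shortly before failure
    "false_positive": 400_000,    # overhauled too early, usable life wasted
    "true_negative": 0,           # healthy engine, correctly left in service
    "false_negative": 2_000_000,  # missed failure, engine replaced
}


def total_cost(counts, costs=COST_PER_OUTCOME):
    """Sum the cost of every outcome in a confusion-matrix tally."""
    return sum(counts[outcome] * costs[outcome] for outcome in counts)


# Made-up outcome counts for 100 hypothetical engines.
counts = {
    "true_positive": 20,
    "false_positive": 5,
    "true_negative": 70,
    "false_negative": 5,
}

model_cost = total_cost(counts)

# Reactive baseline: no preventive maintenance, so every engine that actually
# fails within the window (true positives plus false negatives) is replaced.
failing_engines = counts["true_positive"] + counts["false_negative"]
reactive_cost = failing_engines * COST_PER_OUTCOME["false_negative"]

print(f"Cost with predictive model: ${model_cost:,}")
print(f"Cost with reactive policy:  ${reactive_cost:,}")
print(f"Estimated savings:          ${reactive_cost - model_cost:,}")
```

The relative size of the false-positive and false-negative costs determines how aggressive the alert threshold should be, so these values would need to come from the airline's own maintenance data.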

Conclusion

Using Cloudera Machine Learning to analyze NASA jet engine simulation data hosted on Kaggle, our predictive maintenance model predicted with very high accuracy when an engine was likely to fail or require an overhaul. Combining the cost-benefit analysis with this predictive model against the test data sets suggested significant savings across all applied scenarios. Airline decisions are always made with safety considered first and profit second. Predictive maintenance is preferred because it is the safest choice and delivers drastically lower maintenance costs than reactive (engine replacement after failure) or proactive (replacing components before engine replacement) approaches.

Next Steps

To see all this in action, follow the links below to a few different sources showcasing the process that was created.

Video – If you’d like to see and hear how this was built, see the video at the link.
Tutorials – If you’d like to do this at your own pace, see a detailed walkthrough with screenshots and line-by-line instructions on how to set this up and execute it.
Meetup – If you want to talk directly with experts from Cloudera, please join a virtual meetup to see a live-stream presentation. There will be time for direct Q&A at the end.
CDP Users Page – To learn about other CDP resources built for users, including additional videos, tutorials, blogs and events, click on the link.

Tui Leauanae


Nicolas Pelaez
