Building an application to predict customer churn

Too often, companies find out only after the fact that customers have stopped using their product or service, with too little notice to do anything about it. The term customer churn describes the loss of existing customers: people or organizations that were using a company’s products or services and have decided not to use them anymore, in favor of a competitor. Customer churn is a key business metric for most companies. It matters because most companies have found that retaining an existing customer is more profitable than acquiring a new one. So companies try to keep their existing customers while also acquiring net new ones. While the impact differs by industry, most companies have found that reducing customer churn by even a few percentage points has a positive impact on their bottom line.

One challenge with customer churn is that companies don’t know who is at risk of leaving. Even with enough notice, companies often aren’t sure what to offer these “at-risk” customers to retain them. The process of coming up with offers is itself something of a “black box,” which makes it difficult to understand why a given option was proposed. As a result, chasing after “at-risk” or even “lost” customers becomes an exercise in futility and frustration.

Companies are turning to data to address customer churn.  There are a number of different issues they need to tackle:

  • Collecting the data – First, they need to pull in data from various systems, such as their CRM, call center, and ERP. Each of these systems provides one piece of the information needed for a holistic view of their customers. By combining all the data, companies can better identify the customers most at risk of attrition.
  • Access to streaming data – The next issue is that the data needs to be as close to real-time as possible. If the data is a few days or weeks old, the customer base may already have changed in that timeframe. The more recent the data, the more likely it is that the prediction of which customers are “at-risk” of attrition is accurate.
  • Identifying a relevant offer – Having the data and training a machine learning (ML) model on it to identify which customers are at risk is only part of the solution. The other part is determining the next step to keep that customer. Is there a special offer, training, or promotion? Telling a customer service representative that a specific customer is at risk of churning, without giving them offers that address the customer’s issues, won’t make a difference to the churn rate.
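As a rough illustration of the first point above, the sketch below joins per-customer records from two hypothetical sources (a CRM export and a call center log) into a single view keyed by customer ID. The field names and sample values are invented for illustration. A real pipeline would read from the actual systems and, per the second point, would run continuously on streaming data rather than on static lists.

```python
from collections import defaultdict

# Hypothetical sample records; field names are assumptions, not a real schema.
crm_records = [
    {"customer_id": "c1", "plan": "premium", "tenure_months": 26},
    {"customer_id": "c2", "plan": "basic", "tenure_months": 3},
]
call_center_records = [
    {"customer_id": "c2", "topic": "billing complaint"},
    {"customer_id": "c2", "topic": "cancel request"},
    {"customer_id": "c1", "topic": "upgrade question"},
]


def build_customer_view(crm, calls):
    """Merge CRM attributes with a per-customer count of support calls."""
    call_counts = defaultdict(int)
    for call in calls:
        call_counts[call["customer_id"]] += 1

    view = {}
    for record in crm:
        cid = record["customer_id"]
        # Customers with no calls on record get a count of 0.
        view[cid] = {**record, "support_calls": call_counts.get(cid, 0)}
    return view


customer_view = build_customer_view(crm_records, call_center_records)
for cid, row in sorted(customer_view.items()):
    print(cid, row)
```

A combined view like this is the kind of input a churn model could be trained on, since signals such as support call volume and tenure sit side by side for each customer.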

The solution requires multiple analytic engines to work together (at minimum ingesting, transforming, querying, and predicting from data). These engines must work in near real-time on streaming data. In addition, there are a host of regulations on data privacy and acceptable use that must be followed. Access to these massive data sets can’t be given to just anyone. Companies need to be able to manage, secure, and govern the data and its subsequent use.

Lastly, companies need to be able to understand the ML model. To ensure they are making good business decisions, they need to be able to explain why a prediction was made and which key factors led to it. This is referred to as interpretability: understanding the factors, and the relative importance of each, that influenced a model to make a specific recommendation.
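To make interpretability concrete, here is a minimal sketch of a linear (logistic-style) churn score in which each feature’s contribution to a prediction can be read off directly. The feature names, weights, and bias are hypothetical and hand-picked for illustration. A real model would learn its weights from historical churn data, and more complex models generally need dedicated explanation techniques instead of this simple breakdown.

```python
import math

# Hypothetical, hand-picked weights for illustration only; a real model
# would learn these from historical churn data.
WEIGHTS = {"support_calls": 0.8, "tenure_months": -0.05, "late_payments": 0.6}
BIAS = -1.0


def churn_probability(features):
    """Logistic score: sigmoid of the bias plus the weighted feature sum."""
    score = BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-score))


def explain(features):
    """Return (feature, contribution) pairs, largest absolute impact first."""
    contributions = {name: WEIGHTS[name] * features[name] for name in WEIGHTS}
    return sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)


customer = {"support_calls": 4, "tenure_months": 6, "late_payments": 1}
probability = churn_probability(customer)
explanation = explain(customer)

print(f"churn probability: {probability:.2f}")
for name, contribution in explanation:
    print(f"{name}: {contribution:+.2f}")
```

In this toy example, positive contributions push the customer toward churn and negative ones toward retention, so a representative could see which factor is driving the score and choose an offer that addresses it.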

Join us at our upcoming webinar on May 19, where we will discuss how Cloudera Data Platform addresses these challenges and will show you how to build a customer churn insights application with interpretability.

Sushant Rao

Cloud Product Marketing
