Churn prediction, the task of detecting customers who are likely to cancel their subscription, is a growing field of study within marketing, and actively managing customer churn is more critical than ever.
In today’s economy, the interaction with the customer has changed drastically: for each service or product, customers have a substantial number of options to choose from and are becoming more and more demanding. The interaction with the customer does not end with a sale. The after-sales process is equally important: how can we retain these demanding, valuable customers? In other words, how can we prevent customer churn?
One thing is certain: the changed landscape implies we need to approach customers differently. We need to respond to customers better, faster and proactively. We need a different strategy, in which each individual customer has a personalized targeting plan.
Profitability depends heavily on whether existing customers can be retained. Indeed, research has demonstrated that it costs five times more to attract a new customer than to keep an existing one.
Most companies keep track of their customers’ behavior: they know who their customers are, which products they like and how they interacted during the customer journey (and lifecycle). But without extracting valuable insights from it, most companies are just sitting on a pile of data.
How Can We Predict Customer Churn?
Machine learning techniques are very useful for predicting customer churn. To detect the patterns that make up certain ‘churn profiles’, customer profile information for both churned and non-churned customers is needed. Once the ‘churn profiles’ are determined, new or current customers can be scored against them, and likely churners can be identified and approached with targeted strategies.
- What kind of data is needed?
Historical data is used to determine which group of customers is likely to leave. Useful data can include any information about the customer profile, account and service information, any interaction the customer has had with your product or service, and whether or not the customer has left the company. The more customer-specific information you have available, the better. Most companies already keep track of their customers’ profiles and behavior, so it is important to leverage this valuable data and translate it into useful insights.
Tip: in the web app below you can check the ‘Input Data’ tab to see what this historical data might look like.
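To make the data description concrete, here is a minimal sketch of what such historical data could look like in code: one record per customer, with a label telling whether the customer churned. The field names, the rows and the "two_year" contract type are our own assumptions for illustration. They loosely follow the attributes used in the demo, not the actual schema of the telecom dataset.

```python
from dataclasses import dataclass

# Hypothetical record layout; field names are assumptions, not the demo's schema.
@dataclass
class CustomerRecord:
    gender: str
    internet_service: str
    phone_line: bool
    contract: str
    tenure_months: int
    churned: bool  # the label we want to predict

# Made-up rows, for illustration only.
history = [
    CustomerRecord("male", "fiber_optic", True, "month_to_month", 12, True),
    CustomerRecord("female", "dsl", True, "one_year", 40, False),
    CustomerRecord("female", "fiber_optic", False, "month_to_month", 3, True),
    CustomerRecord("male", "dsl", True, "two_year", 60, False),
]

CONTRACTS = ["month_to_month", "one_year", "two_year"]

def to_features(rec):
    """Turn one record into a numeric feature vector a model can consume."""
    # One-hot encode the contract type: exactly one of the three slots is 1.0.
    contract_one_hot = [1.0 if rec.contract == c else 0.0 for c in CONTRACTS]
    return [
        1.0 if rec.gender == "male" else 0.0,
        1.0 if rec.internet_service == "fiber_optic" else 0.0,
        1.0 if rec.phone_line else 0.0,
        rec.tenure_months / 12.0,  # tenure expressed in years
        *contract_one_hot,
    ]

X = [to_features(r) for r in history]    # model inputs
y = [1 if r.churned else 0 for r in history]  # 1 = churned, 0 = stayed
```

Each row of X is the numeric profile of one customer and the matching entry of y is the known outcome, which is the form most classification models expect.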
- What type of models can be used?
Once high-quality data has been gathered and formatted, machine learning models are trained on the existing data and then used to predict the churn probability of new and existing customers. The purpose of these mathematical models is to find patterns in churned customer profiles. Ultimately, the problem boils down to binary classification: is a customer going to churn or not?
The same logic and models can be used for a large set of business problems, such as whether a certain customer will buy a product or not, estimating the Click Through Rate (CTR), identifying a fraudulent case, etc.
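As an illustration of binary classification, the sketch below fits a small logistic regression from scratch with gradient descent. It is not the implementation behind the demo: the two features (a month-to-month flag and tenure in years), the toy data and the learning-rate settings are all assumptions chosen to keep the example short.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1 (numerically stable)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def churn_probability(w, b, x):
    """Predicted probability that a customer with features x churns."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights w and bias b by batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = churn_probability(w, b, xi) - yi  # prediction error
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Toy training data (made up): [is_month_to_month, tenure_years] -> churned?
X = [[1, 0.2], [1, 0.5], [1, 1.0], [0, 1.0], [0, 3.0], [0, 4.0], [1, 0.1], [0, 2.0]]
y = [1, 1, 1, 0, 0, 0, 1, 0]

w, b = fit_logistic(X, y)
p_monthly = churn_probability(w, b, [1, 1.0])  # month-to-month, 1 year tenure
p_yearly = churn_probability(w, b, [0, 1.0])   # longer contract, 1 year tenure
```

On this toy data the month-to-month customer receives a higher churn probability than the otherwise identical customer on a longer contract. The same functions could be pointed at a purchase, click or fraud label instead of a churn label.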
DEMO: Predicting Churn Rate at a Telecom Company
To demonstrate how a machine learning model can predict churn, we used open-source data from a telecom company to estimate which customers are likely to churn.
A web app enables you to enter some basic information about a fictitious customer. Three different models then each estimate the churn probability.
The demo shows how mathematical models can help predict churn for a single customer, but the same models can easily be applied to many customers at once.
Suppose we create a fictitious customer: a male customer with fiber-optic internet service, including a phone line, on a month-to-month contract and with 12 months of tenure. The data can be entered in the left side panel of the web app. The right side of the app then displays the churn prediction: this fictitious customer is a likely churner, and all three models estimate a high churn probability.
Let’s suppose we can convince the fictitious customer to take a one-year contract instead of a month-to-month contract. The churn probability decreases significantly, and the customer is now identified as a non-churner.
Three models were used to estimate churn probability:
- logistic regression
- binomial tree
- random forest
Each model has its advantages and disadvantages, and more fine-tuning is needed to develop a more accurate model. But for demonstration purposes, these simple models already show the power of machine learning in classification problems.
To compare the performance of different models in a classification problem, we can use the area under the receiver operating characteristic curve (AUC-ROC, or simply AUC), one of the most important evaluation metrics for this type of problem. In general, an AUC of 0.5 suggests no discrimination (i.e. no ability to tell churners from non-churners), 0.7 to 0.8 is considered acceptable, 0.8 to 0.9 is considered excellent, and more than 0.9 is considered outstanding (Mandrekar, 2010). The logistic regression model has the highest AUC score, with the random forest a close second. Both models are therefore considered excellent. The binomial tree’s score of 0.794 is still acceptable for predicting churn.
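One way to read the AUC is as the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner. The sketch below computes it directly from that definition. The labels and scores are made up for illustration and are not results from the demo models.

```python
def roc_auc(labels, scores):
    """AUC as the share of (churner, non-churner) pairs ranked correctly.

    A pair counts 1 if the churner has the higher score, 0.5 on a tie, 0 otherwise.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]  # scores of churners
    neg = [s for s, l in zip(scores, labels) if l == 0]  # scores of non-churners
    if not pos or not neg:
        raise ValueError("AUC needs at least one churner and one non-churner")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up example: 1 = churned, 0 = stayed, with predicted churn probabilities.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.35, 0.4, 0.3, 0.2, 0.1, 0.7]

auc = roc_auc(labels, scores)  # 15 of the 16 pairs are ranked correctly: 0.9375
```

The pairwise loop is quadratic in the number of customers, which is fine for a small illustration. For large datasets a rank-based computation gives the same value more efficiently.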
As can be seen in the web app demo, the outcomes of the models can differ significantly from one another. Note that the web app displays a percentage likelihood of churn, not the binary outcome we ultimately need: churn or no churn. To get that outcome, we set a fixed probability threshold above which we consider a customer likely to churn. In general, we use 50% as the threshold: if the model gives a probability of more than 50%, the customer is considered likely to churn; if the probability is below 50%, we consider the customer a non-churner. However, we can change this threshold based on the requirements: if we need to avoid as many false negatives as possible (customers identified as non-churners who will actually churn), we can lower the threshold.
Let’s take the fictitious customer from above again. When we changed his contract type to a one-year contract, the customer was no longer classified as a likely churner. However, when looking deeper into the model results, we can see that for two models the churn probability is still fairly high: 41% and 42% for logistic regression and random forest respectively. We can lower the 50% threshold when we want to avoid falsely identifying likely churners as non-churners.
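The effect of the threshold on this customer can be sketched in a few lines. The two probabilities are the ones quoted above. The lowered threshold of 40% is a hypothetical value picked for illustration, and treating a probability of exactly the threshold as non-churn is our own assumption.

```python
def classify(prob, threshold=0.5):
    """Turn a churn probability into a binary outcome."""
    return "churner" if prob > threshold else "non-churner"

# Probabilities for the one-year-contract customer, as quoted in the text.
probs = {"Logistic Regression": 0.41, "Random Forest": 0.42}

# Default 50% threshold: both models call the customer a non-churner.
default = {model: classify(p) for model, p in probs.items()}

# Hypothetical lowered threshold of 40%: both models now flag the customer.
lowered = {model: classify(p, threshold=0.40) for model, p in probs.items()}
```

With the default threshold both entries are "non-churner", while with the lowered threshold both become "churner", so this customer would be included in a retention campaign.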
Tip: changing the threshold affects the accuracy of the models. Check the bottom right panel of the web app to see the performance impact.
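The trade-off behind this tip can be sketched by counting correct and incorrect predictions at several thresholds. The labels, probabilities and thresholds below are made up for illustration and do not come from the demo.

```python
def confusion_counts(labels, probs, threshold):
    """Return (true positives, false positives, true negatives, false negatives)."""
    tp = fp = tn = fn = 0
    for actual, p in zip(labels, probs):
        predicted = 1 if p > threshold else 0
        if predicted == 1 and actual == 1:
            tp += 1  # churner correctly flagged
        elif predicted == 1 and actual == 0:
            fp += 1  # loyal customer flagged by mistake
        elif predicted == 0 and actual == 0:
            tn += 1  # loyal customer correctly left alone
        else:
            fn += 1  # churner we missed
    return tp, fp, tn, fn

def accuracy(tp, fp, tn, fn):
    """Share of all customers that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)

# Made-up outcomes (1 = churned) and predicted churn probabilities.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
probs = [0.90, 0.65, 0.45, 0.42, 0.55, 0.41, 0.30, 0.20, 0.10, 0.05]

for threshold in (0.5, 0.4, 0.25):
    tp, fp, tn, fn = confusion_counts(labels, probs, threshold)
    print(f"threshold={threshold}: missed churners={fn}, "
          f"false alarms={fp}, accuracy={accuracy(tp, fp, tn, fn):.2f}")
```

In this toy data, lowering the threshold from 0.5 to 0.4 removes both missed churners at the cost of one extra false alarm. Lowering it further to 0.25 only adds false alarms, so accuracy drops again.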
So, when this lockdown is over, will I keep my Netflix subscription? It is probably a question that companies like Netflix already know the answer to, and maybe I have already been approached as a likely churner. As companies build up post-COVID data, it will be interesting to see whether existing models, or models based on historical pre-COVID data, are still valid. Perhaps the new post-COVID data will even bring new churn patterns to light.
Most companies are sitting on piles of (customer) data, growing bigger by the day. Using the latest technological developments (AI, machine learning, or even 10-year-old data mining techniques) can help you make strategic decisions and target the customers you cannot afford to lose!
Contact us if you would like to know what we can do with your data.
- Mandrekar, J.N. 2010. Receiver Operating Characteristic Curve in Diagnostic Test Assessment. Journal of Thoracic Oncology, Volume 5, Issue 9, pp. 1315-1316.
- Kumar, V. and Reinartz, W. 2018. Customer Relationship Management: Concept, Strategy and Tools. Springer, Berlin. 422 pp.