Are your newly acquired customers (during this COVID-19 crisis) here to stay?

From one day to the next, because of the lockdown, I needed to entertain my 2-year-old 24/7. And I literally mean 24/7: his idea of sleeping includes waking up at 3 o’clock in the morning and indicating that he needs you to set up his colouring table and crayons. Any distraction, any entertaining solution, even if only for 5 minutes, is something I would now invest in - just to keep him busy - so that my husband and I can continue to work. So we bought puzzles, games, books and yes, also a television subscription: Netflix. The coronavirus has led to many new subscribers in the streaming entertainment industry. But once this lockdown is over, and those new subscribers no longer need the services on offer, how will streaming companies retain these valuable customers, who are otherwise so difficult to attract? How can they proactively set up campaigns that will keep them on board?
Cynthia Hadinoto

Churn prediction, or detecting customers who are likely to cancel their subscription, is an ever-growing field of study within marketing, and actively managing customer churn is now more critical than ever.

In today’s economy, the interaction with the customer has changed drastically: for each service or product the customer has a substantial number of options to choose from and is becoming more and more demanding. We all agree that the interaction with the customer does not end with a sale. The after-sales process is equally important: how can we retain these demanding, valuable customers, or in other words, how can we prevent customer churn?

One thing is certain: the changed landscape implies we need to approach customers differently. We need to respond to customers better, faster and proactively. We need to develop a different strategy, in which each individual customer has a personalized targeting plan.

Profitability highly depends on whether existing customers can be retained. Indeed, research has demonstrated that it costs 5 times more to attract new customers than to keep existing ones.

Most companies keep track of their customers’ behavior: they know who they are, what products they like and how they interacted during the customer journey (and lifecycle). But without extracting valuable insights from it, most companies just sit on a pile of data.

 

How Can We Predict Customer Churn?

Machine learning techniques are very useful when predicting customer churn. To detect the patterns that make up ‘churn profiles’, profile information of both churned and non-churned customers is needed. Once the ‘churn profiles’ are determined, new or current customers can be put to the test, and likely churners can be identified and approached with targeted strategies.

  • What kind of data is needed?

Historical data is used to determine which group of customers is likely to leave. Useful data can contain any information about the customer profile, account and services information, any interaction the customer has had with your product or service, and whether the customer has left the company or not. The more customer-specific information you have available, the better. Most companies already keep track of their customers’ profiles and behavior, so it is important to leverage this valuable data and translate it into useful insights.

Tip: in the web app below you can check the ‘Input Data’ tab menu to see what this historical data might look like.
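As a rough illustration of such historical data, here is a minimal Python sketch of one customer record and its translation into numeric model inputs. The field names, category values and the two example rows are assumptions made up for this sketch; they are not the actual columns of the telecom dataset used in the demo.

```python
from dataclasses import dataclass


@dataclass
class CustomerRecord:
    """One row of historical data. All field names are hypothetical."""
    gender: str            # profile information
    internet_service: str  # services information, e.g. "Fiber optic" or "DSL"
    phone_service: bool    # services information
    contract: str          # account information, e.g. "Month-to-month"
    tenure_months: int     # how long the customer has been with the company
    churned: bool          # the label: did the customer leave?


def to_features(record: CustomerRecord) -> dict:
    """Translate a record into numeric features; text fields become 0/1 flags."""
    return {
        "is_male": int(record.gender == "Male"),
        "has_fiber": int(record.internet_service == "Fiber optic"),
        "has_phone": int(record.phone_service),
        "month_to_month": int(record.contract == "Month-to-month"),
        "tenure_months": record.tenure_months,
    }


# Two invented rows, only to show the shape of the data.
history = [
    CustomerRecord("Male", "Fiber optic", True, "Month-to-month", 12, True),
    CustomerRecord("Female", "DSL", True, "One year", 40, False),
]

for record in history:
    outcome = "churned" if record.churned else "stayed"
    print(to_features(record), "->", outcome)
```

Keeping the label (`churned`) next to the profile fields is what allows a model to learn from customers who left as well as from those who stayed.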

  • What type of models can be used?

Once good-quality data has been gathered and formatted, machine learning techniques are applied to the existing data, and the resulting model is used to predict the churn probability for new and existing customers. The purpose of these mathematical models is to find patterns in churned customer profiles. Ultimately, this boils down to a binary classification problem: is a customer going to churn or not?

The same logic and models can be used for a large set of business problems, such as predicting whether a certain customer will buy a product, estimating the click-through rate (CTR), identifying fraudulent cases, etc.
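To make the idea of a binary classification model more concrete, the sketch below fits a very small logistic regression with plain gradient descent, using only the Python standard library. The eight training rows are invented toy data, and the function names, learning rate and number of epochs are assumptions chosen for illustration; this is not the model behind the demo.

```python
import math


def sigmoid(z: float) -> float:
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Churn probability for one customer under a logistic regression model."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train(rows, labels, learning_rate=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in zip(rows, labels):
            error = predict_proba(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Toy history, invented for illustration.
# Features: [month_to_month contract (0/1), tenure in years]
rows = [[1, 0.2], [1, 0.5], [1, 1.0], [1, 3.0],
        [0, 0.5], [0, 1.0], [0, 2.0], [0, 4.0]]
labels = [1, 1, 1, 0, 0, 0, 0, 0]  # 1 = churned, 0 = stayed

weights, bias = train(rows, labels)

# Score two hypothetical customers that differ only in contract type.
p_month_to_month = predict_proba(weights, bias, [1, 1.0])
p_one_year = predict_proba(weights, bias, [0, 1.0])
print(f"month-to-month: {p_month_to_month:.2f}, one-year contract: {p_one_year:.2f}")
```

The model outputs a probability rather than a yes/no answer, and scoring many customers at once is simply a matter of calling `predict_proba` for each row.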

 

DEMO: Predicting Churn Rate at a Telecom Company

To demonstrate how a machine learning model can predict churn, we used open-source data from a telecom company to estimate which customers are likely to churn.

A web app enables you to enter some basic information about a fictitious customer. The churn probability is then calculated by three different models.

The demo shows how mathematical models can help predict churn for a single customer, but the same models can easily be applied to many customers at once.

Suppose we create a fictitious customer: a male customer with a Fiber Optic Internet Service, including Phone Line, on a month-to-month contract and with 12 months of tenure. The data can be entered in the left side panel of the web app. The churn prediction result is then displayed on the right side of the app: the model has predicted that this fictitious customer is a likely churner. All three models estimated a high churn probability.

 

Let’s suppose we can convince the fictitious customer to take a one-year contract instead of a month-to-month contract. The churn probability decreases significantly, and the customer is now identified as a non-churner.

 

Model

Three models were used to estimate churn probability:

  1. logistic regression,
  2. binomial tree (a decision tree with a binary churn / no-churn outcome),
  3. random forest.

Each model has its advantages and disadvantages, and more fine-tuning is needed to develop a more accurate model. But for demonstration purposes, these simple models already show the power of machine learning in classification problems.

To compare the performance of different models in a classification problem, we can use the AUC-ROC score (the area under the receiver operating characteristic curve), one of the most important evaluation metrics for this type of problem. In general, an AUC of 0.5 suggests no discrimination (i.e. no ability to tell churners from non-churners), 0.7 to 0.8 is considered acceptable, 0.8 to 0.9 is considered excellent, and more than 0.9 is considered outstanding. The logistic regression model has the highest AUC score, with the random forest a close second. Both models are considered excellent. The binomial tree score of 0.794 is still acceptable for predicting churn.
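For readers who want to see what the AUC measures, here is a minimal sketch that computes it directly from its pairwise interpretation: the share of (churner, non-churner) pairs in which the churner received the higher score. The function names and the six example scores are assumptions for illustration, and the handling of values that fall exactly on 0.8 or 0.9 is a choice made for this sketch.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (churner, non-churner) pairs ranked correctly.

    Ties between a churner and a non-churner count as half a correct pair.
    """
    churners = [s for y, s in zip(labels, scores) if y == 1]
    stayers = [s for y, s in zip(labels, scores) if y == 0]
    if not churners or not stayers:
        raise ValueError("AUC needs at least one churner and one non-churner")
    correct = 0.0
    for c in churners:
        for s in stayers:
            if c > s:
                correct += 1.0
            elif c == s:
                correct += 0.5
    return correct / (len(churners) * len(stayers))


def interpret_auc(auc):
    """Rule-of-thumb bands from the text; the label below 0.7 is our own wording."""
    if auc > 0.9:
        return "outstanding"
    if auc >= 0.8:
        return "excellent"
    if auc >= 0.7:
        return "acceptable"
    return "below acceptable"


# Invented test-set labels (1 = churned) and model scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f} ({interpret_auc(auc)})")
print("0.794 is", interpret_auc(0.794))
```

A model that scores every customer identically gets an AUC of 0.5 under this definition, which matches the "no discrimination" reading above.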

As can be seen in the web app demo, the outcomes of the models can differ significantly from one another. Note also that the web app displays a percentage likelihood of churn and not the binary outcome we ultimately need: churn or no churn. To get there, we set a fixed probability threshold above which we consider a customer likely to churn. In general, we use 50% as the threshold: if the model gives a probability of more than 50%, the customer is likely to churn; if the probability is below 50%, we consider the customer a non-churner. However, we can change this threshold based on the requirements: if we need to avoid as many false negatives as possible (a customer identified as a non-churner who will actually churn), we can lower the threshold, at the cost of flagging more customers who would have stayed.
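The effect of the threshold can be sketched in a few lines of Python. The eight probabilities and outcomes below are invented for illustration, and the helper names are assumptions; the point is only to show how false negatives and false positives move in opposite directions when the threshold changes.

```python
def classify(probabilities, threshold=0.5):
    """Turn churn probabilities into churn / no-churn predictions."""
    return [p > threshold for p in probabilities]


def confusion_counts(actual, predicted):
    """Count true/false positives and negatives ('positive' means churn)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if a and p:
            counts["tp"] += 1
        elif a and not p:
            counts["fn"] += 1  # churner we failed to flag
        elif not a and p:
            counts["fp"] += 1  # loyal customer flagged as churner
        else:
            counts["tn"] += 1
    return counts


def accuracy(counts):
    """Share of customers that were classified correctly."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())


# Invented outcomes (True = churned) and model probabilities.
actual = [True, True, True, False, False, False, False, True]
probabilities = [0.81, 0.62, 0.45, 0.48, 0.30, 0.15, 0.55, 0.42]

for threshold in (0.5, 0.4):
    counts = confusion_counts(actual, classify(probabilities, threshold))
    print(f"threshold {threshold}: {counts}, accuracy {accuracy(counts):.2f}")
```

On this toy data, lowering the threshold from 0.5 to 0.4 removes both false negatives but adds one false positive, which is exactly the trade-off to weigh against the cost of a retention campaign.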

Let’s take the fictitious customer from above again. When we changed his contract type to a one-year contract, the customer was no longer considered likely to churn. However, when looking deeper into the model results, we can see that for two models the churn probability is still fairly high: 41% and 42% for Logistic Regression and Random Forest respectively. If we want to avoid falsely identifying likely churners as non-churners, we can lower the 50% threshold.
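Applied to the two probabilities quoted above, the decision looks like this. The 40% alternative threshold is an assumption picked only to illustrate the point, and the `label` helper is hypothetical.

```python
# Churn probabilities quoted in the text for the one-year contract scenario.
probabilities = {"Logistic Regression": 0.41, "Random Forest": 0.42}


def label(probability, threshold):
    """Binary decision: above the threshold means likely churner."""
    return "likely churner" if probability > threshold else "non-churner"


# 0.50 is the default threshold; 0.40 is an illustrative lower threshold.
for threshold in (0.50, 0.40):
    for model, probability in probabilities.items():
        print(f"threshold {threshold:.2f} | {model}: {label(probability, threshold)}")
```

With the default threshold both models call this customer a non-churner, while the lower threshold flags him for a retention action.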

Tip: the accuracy of the models will be impacted by changing the threshold. Check the bottom right panel to see the performance impact.

 

Conclusion

So, when this lockdown is over, will I keep my Netflix subscription? It is probably a question these companies already know the answer to, and maybe I have already been approached as a likely churner. As companies build up post-COVID data, it will be interesting to see whether existing models, or models based on historical pre-COVID data, are still valid. Perhaps the new post-COVID data will even bring new churn patterns to light.

Most companies are sitting on piles of (customer) data, growing bigger by the day. Using the latest technological evolutions (AI, machine learning or even 10-year-old data mining techniques) can help you make strategic decisions and target customers you cannot afford to lose!

Contact us if you would like to know what we can do with your data.

 

Sources:

  • Mandrekar, J.N. 2010. Receiver Operating Characteristic Curve in Diagnostic Test Assessment, Journal of Thoracic Oncology, Volume 5, Issue 9, September 2010, Pages 1315-1316
  • Kumar, V. and Reinartz, W. 2018, Customer Relationship Management: Concept, Strategy and Tools. Springer, Berlin. 422 pp.