Datrics USE CASE

How to Predict Customer Churn Without Writing Code

Customer churn, or customer attrition, is a well-known problem that reduces MRR and increases the spending needed to grow the client base. Companies often focus on acquiring new clients instead of retaining existing ones because it is hard to identify the specific patterns that lead to losing a customer. However, acquisition almost always costs more than retention.

People leave for very different reasons: they may prefer a competitor, struggle to reach the right outcome, find the price-value ratio wrong, or simply have an expired credit card and forget to renew the subscription. However, while the exact reason might not be apparent, the behavioral patterns of customers who are about to leave can be very similar.

Let's review a particular example of such a case in the telecom industry. Here we have a dataset from a telco provider with 4250 samples (users).

Some of the fields it contains are:

  • The US state
  • The area code
  • The number of months the customer has been with the current telco provider
  • Whether the customer has an international plan or a voice mail plan
  • The total minutes/number/charge of day/evening/night/international/customer service calls
  • and, of course, whether they churned
We also introduce two more features based on what we already know: the total cost of all services consumed by the user and the share of day/evening/night/international calls in the total amount.
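As a rough illustration of these two derived features, the sketch below computes the total charge and the share of each call type for a single customer record. The field names (`total_day_charge`, `total_day_calls`, and so on) are assumptions made for this example, since the exact column names are not listed here, and "share" is interpreted as a share of the call count.

```python
# Hypothetical field names; the real dataset's column names may differ.
PERIODS = ("day", "eve", "night", "intl")


def add_derived_features(customer: dict) -> dict:
    """Return a copy of the record with total charge and per-period call shares."""
    enriched = dict(customer)
    # Total cost of all services consumed by the user.
    enriched["total_charge"] = sum(customer[f"total_{p}_charge"] for p in PERIODS)
    # Share of each call type in the total number of calls.
    total_calls = sum(customer[f"total_{p}_calls"] for p in PERIODS)
    for p in PERIODS:
        share = customer[f"total_{p}_calls"] / total_calls if total_calls else 0.0
        enriched[f"{p}_calls_share"] = share
    return enriched


# Made-up record for illustration only.
example = {
    "total_day_charge": 30.0, "total_eve_charge": 15.0,
    "total_night_charge": 10.0, "total_intl_charge": 5.0,
    "total_day_calls": 100, "total_eve_calls": 60,
    "total_night_calls": 30, "total_intl_calls": 10,
}
print(add_derived_features(example))
```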

Let's walk through the whole data analytics pipeline step by step and see how it works.

First, we do encoding. Encoding converts categorical features to numerical ones for further modeling (e.g., does the user have an international plan? yes or no; has the user churned? yes or no).
We skip the State variable for now because it has too many distinct values.
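Outside the no-code interface, the same yes/no encoding could look roughly like the sketch below. The column names and the `encode_yes_no` helper are hypothetical and only illustrate the idea; the State column is left untouched, as in the text.

```python
def encode_yes_no(rows, columns):
    """Map 'yes'/'no' strings to 1/0 in the given columns, leaving other fields as-is."""
    mapping = {"yes": 1, "no": 0}
    encoded = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not modified
        for col in columns:
            new_row[col] = mapping[row[col].strip().lower()]
        encoded.append(new_row)
    return encoded


# Made-up rows with assumed column names.
rows = [
    {"state": "OH", "international_plan": "no", "voice_mail_plan": "yes", "churn": "no"},
    {"state": "NJ", "international_plan": "yes", "voice_mail_plan": "no", "churn": "yes"},
]
encoded_rows = encode_yes_no(rows, ["international_plan", "voice_mail_plan", "churn"])
print(encoded_rows)
```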

Next, we split the data into training and testing datasets (80/20) to evaluate the ML model afterward.
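For readers curious what an 80/20 split amounts to, here is a minimal sketch using only the Python standard library. The `train_test_split` helper and the fixed seed are assumptions for illustration, not a description of how the Datrics brick is implemented.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


customers = list(range(4250))  # stand-in for the 4250 customer records
train, test = train_test_split(customers)
print(len(train), len(test))
```

A plain random split like this does not guarantee the same churn share in both parts; a stratified split would be one way to enforce that.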
Now let's start the modeling itself. We immediately face some challenges.

The first one is that we have too many features (columns), and their impact on churn varies. We want to keep only the most impactful ones to get a more accurate model. To do that, we use the Variable Selection brick, set the target variable to Churn, and set a threshold of 0.03. Higher values indicate that a feature has strong predictive power (higher correlation with Churn), while features scoring below the threshold are excluded.
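The exact score used by the Variable Selection brick is not specified here. As one possible interpretation, the sketch below scores each feature by its absolute Pearson correlation with the target and keeps those at or above the 0.03 threshold. The helper names and the toy data are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, target, threshold=0.03):
    """Keep features whose absolute correlation with the target reaches the threshold."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return {name: score for name, score in scores.items() if score >= threshold}


# Toy data: 1 = churned, 0 = stayed.
churn = [0, 0, 0, 1, 0, 1, 0, 0]
columns = {
    "service_calls": [1, 0, 2, 5, 1, 4, 0, 2],  # higher for churners
    "constant_flag": [1, 1, 1, 1, 1, 1, 1, 1],  # carries no information
}
print(select_features(columns, churn))
```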
Once the less important features are filtered out, the next challenge is imbalanced data. In our case, churn cases make up 14% of all cases. Put simply, if we leave the data as is, the ML model sees non-churn cases far more often (86%) and may become biased toward predicting them. With a 50/50 or 60/40 split, the model is much less biased.
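To see why this imbalance matters, consider a "model" that always predicts no churn. The sketch below, built on synthetic labels with the same 14% churn share, shows that such a model reaches 86% accuracy while catching no churners at all.

```python
def accuracy_and_recall(actual, predicted, positive=1):
    """Overall accuracy and recall for the positive class."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    true_pos = sum(a == positive and p == positive for a, p in zip(actual, predicted))
    positives = sum(a == positive for a in actual)
    recall = true_pos / positives if positives else 0.0
    return correct / len(actual), recall


# 100 synthetic customers with a 14% churn share.
actual = [1] * 14 + [0] * 86
always_no = [0] * len(actual)  # a "model" that never predicts churn

accuracy, churn_recall = accuracy_and_recall(actual, always_no)
print(f"accuracy={accuracy:.2f}, churn recall={churn_recall:.2f}")
```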

In Datrics, there is an automatic function for balancing: the Balanced Sampling brick. In simple terms, it draws a fixed number of samples from each class by undersampling the majority class and oversampling the minority one (randomly duplicating samples from class 1 (churn=yes) and taking a random sample of class 0). Here we take 2000 samples for each class.
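The idea behind balanced sampling can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the brick's actual implementation: the `balanced_sample` helper is hypothetical, and oversampling is done here by sampling with replacement, which duplicates minority rows at random.

```python
import random


def balanced_sample(rows, label_key, n_per_class, seed=42):
    """Draw n_per_class rows from every class.

    Classes with enough rows are sampled without replacement (undersampling);
    smaller classes are sampled with replacement (oversampling by duplication).
    """
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    sample = []
    for class_rows in by_class.values():
        if len(class_rows) >= n_per_class:
            sample.extend(rng.sample(class_rows, n_per_class))
        else:
            sample.extend(rng.choices(class_rows, k=n_per_class))
    rng.shuffle(sample)
    return sample


# Toy training set of 3400 rows, 14% of them churners (476 rows).
train = [{"id": i, "churn": 1 if i < 476 else 0} for i in range(3400)]
balanced = balanced_sample(train, "churn", n_per_class=2000)
print(sum(r["churn"] for r in balanced), len(balanced))
```

Balancing should be applied to the training data only, so that the test set keeps the real-world class ratio.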
Now it's time for modeling. First, we feed the training data to the AutoML Predictive Model. AutoML functionality in Datrics enables non-data scientists to make predictions using a simplified machine learning toolset.

So, we have trained the classification ML model. What is worth noting in the training results?
The most important features that have the highest impact on churn are:

  • The total amount paid for services by customers
  • The number of calls to the customer support
  • The usage of the international plan: the flag for whether a person has an international plan, the number of international calls, and their duration
We also have excellent classification metrics: a high F1 score, which indicates a good balance between the model's precision and recall (crucial for imbalanced data), a high ROC AUC, and high precision and recall for both classes, as the training report shows.
In general, we are rarely wrong about class 0 (churn=no). However, we see some errors where the predicted churn is 0 (no) while the actual value is 1 (yes); this happens in 12% of cases.
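For reference, precision, recall, and F1 for the churn class can be computed from actual and predicted labels as in the sketch below. The labels are made up for illustration and are not the results reported here; a predicted 0 with an actual 1 is counted as a false negative, which is the error type just described.

```python
def classification_metrics(actual, predicted, positive=1):
    """Precision, recall and F1 for the positive class from paired labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)  # missed churners
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration only.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(actual, predicted)
print(metrics)
```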
Overall, this is a very good start. For now, we can accept this performance and evaluate the model on the test dataset.
The accuracy metrics remain very high on the test data (not seen by the model before). The prediction was accurate for 100 customers out of 120 (83%).
This kind of modeling, together with appropriate actions, may significantly decrease the number of churned customers.

If you are interested in churn analytics and customer segmentation, or want to explore the other predictive and time-series analytical capabilities of the Datrics platform, please give us a shout and let's discuss.

The demo is accessible here.