Datrics USE CASE

Customer Segmentation and Ideal Customer Profile Detection

Imagine a luxury goods store at the airport and the crowds of people going back and forth. Some of them walk in; some don't. Most people are casually browsing, killing time while waiting for a flight; some are potential buyers. A consultant's job is to welcome everyone who comes in, describe the goods, and walk them through the promos. However, a good consultant can often distinguish a real buyer from a person who is just browsing. An excellent consultant can also tell what kind of product the person might be here for, or even sell a more expensive product than the buyer initially had in mind. This is a real-life example of customer segmentation and profiling.


As things went online, answering the question of what the ideal customer looks like became much more complicated. In older approaches, experts evaluated the population and defined the demographic and social parameters of the target audience. Modern digital marketing, by contrast, consumes many more data points and finds more sophisticated features that characterize people. Years ago, the characteristics of an ideal client might have been age, gender, education and employment status, income level, and other basic information. Nowadays, we can connect alternative data sources such as social networks and geolocation, use the power of cloud analytical products, and track the engagement history within our services. Netflix is a good example of tailoring content for specific audiences. For instance, they change the thumbnails for featured series based on local geographical preferences or recommend content based on similar types of actors, e.g., if you like Tom Hardy, would you like to watch some films with Logan Marshall-Green?
This article will show you how to perform customer segmentation and analyze the ideal customer profile without writing code, using only the Datrics platform.

Here's how the data analytics pipeline looks.
Let me explain it step by step. We have two eCommerce datasets because the information happened to be stored in two different systems, the CRM and the promo distribution system, so we should merge them first. The total number of records (customers) is 15,850.

The promo dataset contains the following fields: customer ID, age, gender, education, number of days since the customer registered, number of promo offers the customer received, the share of promo offers the customer viewed (% of received), and the share of promo offers the customer fully reviewed (% of viewed).

The CRM dataset contains the following fields: customer ID, total number of transactions, the share of transactions that included promo goods, total spending amount, average cheque, number of days since the last order, the maximum number of days between two orders, the average number of orders per week, and average spending per week.
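In Datrics the merge is done with a pipeline step rather than code, but for readers who want to see the idea spelled out, here is a minimal sketch of joining the two datasets on the customer ID. The field names and sample rows are hypothetical, and the inner join (keeping only customers present in both systems) is an assumption, not something the pipeline description confirms.

```python
# Hypothetical sample rows; field names are illustrative, not the real schema.
promo_rows = [
    {"customer_id": 1, "age": 34, "offers_received": 10},
    {"customer_id": 2, "age": 22, "offers_received": 4},
    {"customer_id": 3, "age": 51, "offers_received": 7},
]
crm_rows = [
    {"customer_id": 1, "transactions": 12, "total_spent": 840.0},
    {"customer_id": 3, "transactions": 5, "total_spent": 410.0},
    {"customer_id": 4, "transactions": 2, "total_spent": 95.0},
]


def merge_on_customer_id(promo, crm, key="customer_id"):
    """Inner-join two lists of dicts on a shared key (assumed join type)."""
    # Index the CRM rows by customer ID for constant-time lookups.
    crm_by_id = {row[key]: row for row in crm}
    merged = []
    for row in promo:
        match = crm_by_id.get(row[key])
        if match is not None:
            # Combine the fields of both records into one customer row.
            merged.append({**row, **match})
    return merged


customers = merge_on_customer_id(promo_rows, crm_rows)
```

With this sample data, only the customers that appear in both systems end up in `customers`; whether unmatched customers should be kept instead is a decision to make before clustering.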

Let's allow Datrics to do some AutoML magic - let it cluster the customers' data for you.

AutoML functionality in Datrics enables non-data scientists to make predictions using a simplified machine learning toolset.

Here we detect the clusters of our clients automatically. We exclude personal data such as gender and education so that the clustering reflects only the users' behavior and purchase history, and we connect the clusters back to the personal characteristics later, during profiling.
So, we got some insights here. We see three fairly distinguishable clusters of clients (we got lucky; it's not always that smooth). Based on the Silhouette statistics and the visual representation of the clusters, the identified groups are well separated: customers within one group look similar and differ from the customers in the other clusters.
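To make the Silhouette statistic less of a black box, here is a minimal pure-Python sketch of how such a score can be computed for any set of cluster labels. It is not Datrics' implementation; it assumes Euclidean distance, and the sample points (two made-up behavioral features) and labels are hypothetical.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient (Euclidean distance).

    Requires at least two distinct cluster labels.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Convention: a point that is alone in its cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical behavioral features, e.g. (orders per week, average cheque).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0), (5.0, 20.0), (5.0, 21.0)]
good_labels = [0, 0, 1, 1, 2, 2]   # matches the three visible groups
mixed_labels = [0, 1, 2, 0, 1, 2]  # deliberately scrambled assignment

good = silhouette_score(points, good_labels)
mixed = silhouette_score(points, mixed_labels)
```

A score close to 1 means customers sit much closer to their own cluster than to the nearest other one, which is what "well separated" means here; the scrambled labels score noticeably lower on the same points.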
Now let's dive deeper into the data and figure out what makes those clusters different. We normalized the features to make the charts a little bit more readable. Let's start with some demographics.
Cluster 2 consists mostly of young, new clients with an average estimated income. Cluster 0 predominantly consists of long-term clients on the older side, with varied income levels that are nonetheless higher than in the other clusters. Finally, Cluster 1 is less clear-cut and consists of a mix of different people.

Let's take a look at the transactions activity.
Even though Cluster 1 is less distinct in demographics, it consists of the highest-paying individuals, with more frequent transactions and higher cheques. The "younger" Cluster 2 has a slightly below-average cheque and buys significantly less often, so its weekly spending is relatively low. The "older" Cluster 0 is somewhere in between.

What about the promos' effectiveness?
The younger Cluster 2 opens promos more often but rarely reads them through. The mixed Cluster 1 opens promos almost as often as the young cluster but reads them through more often. The older Cluster 0 is on the lower side for both opening and reading the offers.
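The comparisons above boil down to averaging normalized features within each cluster. Here is a minimal sketch of that idea. The article does not say which normalization was used, so min-max scaling is an assumption, and the rows and the "avg_cheque" field name are hypothetical.

```python
from collections import defaultdict


def min_max_normalize(values):
    """Rescale a list of numbers to the [0, 1] range (assumed method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no contrast; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def cluster_means(rows, feature, cluster_key="cluster"):
    """Average one normalized feature within each cluster."""
    normalized = min_max_normalize([row[feature] for row in rows])
    grouped = defaultdict(list)
    for row, value in zip(rows, normalized):
        grouped[row[cluster_key]].append(value)
    return {cluster: sum(vals) / len(vals) for cluster, vals in grouped.items()}


# Hypothetical rows: a cluster label plus one behavioral feature.
rows = [
    {"cluster": 0, "avg_cheque": 60.0},
    {"cluster": 0, "avg_cheque": 80.0},
    {"cluster": 1, "avg_cheque": 100.0},
    {"cluster": 1, "avg_cheque": 90.0},
    {"cluster": 2, "avg_cheque": 50.0},
    {"cluster": 2, "avg_cheque": 50.0},
]

means = cluster_means(rows, "avg_cheque")
top_cluster = max(means, key=means.get)
```

Because every feature ends up on the same 0 to 1 scale, cheques, order frequency, and promo open rates can share one chart without the largest-valued feature dwarfing the rest.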

To sum up:

  • Cluster 1 - the highest spending and better conversion from promo offers
  • "Older" Cluster 0 - average cheques are on the higher side, but purchases are not too frequent; not interested in promos
  • "Younger" Cluster 2 - lower spending, not too involved in communication

Therefore, Cluster 1 might contain our ideal customers. It makes sense to analyze the features of this customer group to understand it better and to target promos at narrower groups. At the same time, younger customers might require a different approach to promo communications.

Diving deeper into Cluster 1, we find that the median age is 35. The cluster has more males than females, and most of its members hold a B.Sc. or M.Sc. degree.
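This kind of profile is simply a summary of the personal fields for the members of one cluster. A minimal sketch, assuming each customer row already carries its cluster label, could look like this. The rows below are hypothetical and are not taken from the real dataset.

```python
from collections import Counter
from statistics import median


def profile_cluster(rows, cluster_id):
    """Summarize demographics for one cluster.

    Raises statistics.StatisticsError if the cluster has no members.
    """
    members = [r for r in rows if r["cluster"] == cluster_id]
    return {
        "size": len(members),
        "median_age": median(r["age"] for r in members),
        "gender": Counter(r["gender"] for r in members),
        "education": Counter(r["education"] for r in members),
    }


# Hypothetical customers with the cluster label joined back in.
rows = [
    {"cluster": 1, "age": 33, "gender": "M", "education": "B.Sc."},
    {"cluster": 1, "age": 35, "gender": "M", "education": "M.Sc."},
    {"cluster": 1, "age": 40, "gender": "F", "education": "B.Sc."},
    {"cluster": 2, "age": 21, "gender": "F", "education": "High school"},
]

profile = profile_cluster(rows, 1)
```

The personal fields that were left out of the clustering come back in at this stage, so the profile describes who the high-value customers are without having influenced how the groups were formed.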
If you are interested in this kind of Ideal Customer Profile analytics and customer segmentation, or in exploring the other predictive and time-series analytical capabilities of the Datrics platform, please give us a shout, and let's talk.
Read more about our other updates and features and use-cases!

The demo is accessible here.