Banking is one of those traditional industries that has gone through a steady transformation over the decades. Yet many banks with a sizeable customer base, hoping to gain a competitive edge, have not tapped into the vast amounts of data they hold, especially to solve one of their most widely acknowledged problems: customer churn.
While retaining existing customers and thereby increasing their lifetime value is something everyone acknowledges as being important, there is little the banks can do about customer churn when they don’t see it coming in the first place. This is where predicting churn at the right time becomes important, especially when clear customer feedback is absent. Early and accurate churn prediction empowers CRM and customer experience teams to be creative and proactive in their engagement with the customer. In fact, by simply reaching out to the customer early enough, 11% of the churn can be avoided.
But how do you look for signs of churn? Collecting comprehensive feedback about the customer’s experience can be a challenging task. For one, surveys are expensive and infrequent. Moreover, not all customers receive them or care to respond. So where else can you pick up signs of potential customer disengagement? The answer lies in extracting early warning signs from data that already exists. Advanced machine learning (ML) and data science (DS) techniques can learn from the past customer behavior and external triggers that led to churn, and use this learning to predict the future occurrence of a churn-like event.
From business problem to predictive insights – it is all in the translation
Defining Churn: The key to extracting meaningful predictive insights is to define the building blocks of the problem statement as accurately as possible. In the case of customer churn, it starts with defining what counts as a “churn event”.
In general, churn is expressed as a degree of customer inactivity or disengagement, observed over a given time. This manifests within the data in various forms such as the recency of account actions or change in the account balance. For example, in the case of HNW (High Net-Worth) customers, it is useful to define churn based on the rate of decline of assets over a specified period. There could be an instance where a customer may be highly active in terms of account operations but has effectively pulled out more than 50% of her assets in the last six months.
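As a minimal sketch of the asset-based definition above, the snippet below flags a churn event when more than half of a customer’s assets have left over the observation window. The function names, the 50% default threshold, and the sample balances are illustrative assumptions, not a prescribed rule.

```python
from typing import Sequence


def asset_decline_ratio(month_end_assets: Sequence[float]) -> float:
    """Fraction of assets lost between the first and last snapshot in the window."""
    start, end = month_end_assets[0], month_end_assets[-1]
    if start <= 0:
        return 0.0  # nothing to lose, so treat as no decline
    return max(0.0, (start - end) / start)


def is_churn_event(month_end_assets: Sequence[float], threshold: float = 0.5) -> bool:
    """Flag churn when more than `threshold` of assets left during the window."""
    return asset_decline_ratio(month_end_assets) > threshold


# Hypothetical HNW customer: opening balance plus six month-end snapshots.
assets = [1_000_000, 900_000, 800_000, 650_000, 500_000, 450_000, 400_000]
flagged = is_churn_event(assets)  # a 60% decline exceeds the 50% threshold
```

Note that this definition ignores how active the account is, which is exactly the point: a busy account can still be draining.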
Another aspect of the business problem is how early you want the model to predict. A prediction that is too far out may be less accurate. On the other hand, a short prediction horizon may fare better in accuracy, but it could be too late to intervene once the customer has already made up her mind.
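One way to make the horizon choice concrete is in how training labels are built. The sketch below assumes months are represented as simple integer indices and that `churn_month` is already known from history; both are simplifications for illustration.

```python
from typing import Optional


def churn_label(snapshot_month: int, churn_month: Optional[int],
                horizon_months: int) -> Optional[int]:
    """Return 1 if the customer churns within the horizon after the snapshot,
    0 if not, and None if the customer had already churned at snapshot time."""
    if churn_month is None:
        return 0  # never churned in the observed history
    if churn_month <= snapshot_month:
        return None  # already gone, so not a valid training row
    return 1 if churn_month - snapshot_month <= horizon_months else 0


# Hypothetical customer who churns in month 14, observed at month 10.
short_horizon = churn_label(snapshot_month=10, churn_month=14, horizon_months=3)
long_horizon = churn_label(snapshot_month=10, churn_month=14, horizon_months=6)
```

The same customer is a non-churner under a three-month horizon and a churner under a six-month one, so the horizon directly changes what the model is asked to learn.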
Finally, it is crucial to determine whether churn is to be defined at a product level (customer likely to disengage with a particular product, like discontinuing a credit card) or at the relationship level (customer likely to disengage with the bank itself). When the data is analyzed at a relationship level, you get a better understanding of the customer’s point of view. For example, excessive withdrawal from one’s savings account could be a down payment for a house or funding for college tuition. Such insights into customer life events are very powerful, not only for preventing churn but also for cross-selling complementary products that can further strengthen the relationship.
Building a Customer 360 view: One of the first milestones in using machine learning and advanced analytics to predict a churn event is to capture and represent all key aspects of a customer’s relationship with the bank. Building this Customer 360 data mart in a scalable, phased manner is the foundation for not just churn prediction, but also for other use cases such as cross-sell/upsell recommendation, customer lifetime value calculation, etc. In addition to customer demographics and profile indicators, other major subject areas in the data mart can include transactions on current and savings accounts, investment activities, product ownership (line of credit, insurance, and cards), channel-wise activity, campaign response, and customer service interactions. All these help in bringing various aspects of customer touchpoints and activities into a single snapshot.
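To illustrate the idea of a single snapshot, here is a small sketch that collapses three subject areas into one row per customer. The field names and the in-memory dictionaries are hypothetical; a real data mart would be built with the bank’s own warehouse tooling.

```python
def build_customer_360(profiles, transactions, service_calls):
    """Collapse several subject areas into one row per customer."""
    snapshot = {
        cid: {"segment": p["segment"], "txn_count": 0,
              "net_flow": 0.0, "service_calls": 0}
        for cid, p in profiles.items()
    }
    for txn in transactions:
        row = snapshot.get(txn["customer_id"])
        if row is not None:  # ignore transactions for unknown customers
            row["txn_count"] += 1
            row["net_flow"] += txn["amount"]  # deposits positive, withdrawals negative
    for call in service_calls:
        row = snapshot.get(call["customer_id"])
        if row is not None:
            row["service_calls"] += 1
    return snapshot


# Hypothetical sample data for two customers.
profiles = {"C1": {"segment": "HNW"}, "C2": {"segment": "Retail"}}
transactions = [
    {"customer_id": "C1", "amount": -5000.0},
    {"customer_id": "C1", "amount": 1200.0},
    {"customer_id": "C2", "amount": 300.0},
]
service_calls = [{"customer_id": "C1", "topic": "fees"}]
customer_360 = build_customer_360(profiles, transactions, service_calls)
```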
Feature Engineering: Feature engineering is the process of using domain knowledge of the data to create features that make machine learning algorithms work. It plays a pivotal role in defining and creating data elements that capture customer behavior. For example, two accounts with the same monthly closing balance can be hard to tell apart. However, feature engineering can add a time dimension to the same data so that algorithms can distinguish whether the monthly closing balance is a deviation from what is usually expected from the customer. Indicators ranging from the basic, like net balance outflow in the last few months, to the more nuanced, like the rate of change of the average gap between bill payments, can prove effective in providing early warning signals of impending churn. It is imperative to actively involve the business stakeholders at this stage to identify potentially useful indicators based on their collective experience. Features can represent “symptoms” of churn, like increasing withdrawals or dormant accounts, or they can represent the “drivers” of churn itself, such as a difficult online experience, poor customer service, or heavy fees on ATM withdrawals. Mining textual feedback (for example, through sentiment analysis of customer service conversation notes, complaints about a poorly performing relationship manager, and interactions at various online and offline touchpoints) creates new features that are very effective in capturing intent to churn. Similarly, external data such as competitor offers and localized macroeconomic indicators, when combined with internal data, provides powerful insights into customer behavior.
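The closing-balance example can be sketched in a few lines. Here two hypothetical customers end the month with the same balance, and a simple deviation score against each customer’s own history tells them apart. The feature names and sample figures are assumptions made for illustration.

```python
from statistics import mean, pstdev


def balance_deviation(history, latest):
    """How many standard deviations `latest` sits from the customer's own norm."""
    sigma = pstdev(history)
    if sigma == 0:
        return 0.0  # flat history: no scale to measure a deviation against
    return (latest - mean(history)) / sigma


def net_outflow(balances, months=3):
    """Drop in balance over the last `months` months (positive means money left)."""
    window = balances[-(months + 1):]
    return window[0] - window[-1]


# Both customers close this month at 5,000, but their histories differ.
steady_history = [5000, 5100, 4900, 5000, 5050, 4950]
wealthy_history = [12000, 11800, 12100, 11900, 12050, 11950]

steady_score = balance_deviation(steady_history, 5000)    # in line with the norm
wealthy_score = balance_deviation(wealthy_history, 5000)  # far below the norm
recent_outflow = net_outflow(wealthy_history + [5000], months=3)
```

A strongly negative score for the second customer is the kind of early warning signal the raw closing balance alone would hide.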
Train, Test, and Validate!
Churn prediction is a typical classification problem. From tried and tested logistic regression to more complex tree-based techniques like XGBoost, the key is to identify the technique that offers the right balance of interpretability and performance. Complex algorithms like random forest and XGBoost capture non-linear patterns in data, and some implementations, XGBoost among them, handle null values (which can be quite common!) quite comfortably; logistic regression, on the other hand, offers a more transparent and intuitive explanation of the impact of each variable on the predicted outcome. It is always advisable to validate the model on at least two to three time periods outside its training dataset and check for consistency in predictions across seasons.
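The out-of-time check described above might look something like the sketch below. It assumes the model has already been trained and has scored customers in three holdout quarters; the scores, labels, 0.5 threshold, and 0.1 tolerance are all made-up values for illustration.

```python
from collections import defaultdict


def recall_by_period(rows, threshold=0.5):
    """rows: iterable of (period, predicted_score, actual_label).
    Returns the share of actual churners caught in each period."""
    caught = defaultdict(int)
    churners = defaultdict(int)
    for period, score, label in rows:
        if label == 1:
            churners[period] += 1
            if score >= threshold:
                caught[period] += 1
    return {p: caught[p] / churners[p] for p in churners}


def is_consistent(metric_by_period, tolerance=0.1):
    """True when the metric varies by no more than `tolerance` across periods."""
    values = list(metric_by_period.values())
    return max(values) - min(values) <= tolerance


# Hypothetical holdout scores for three quarters outside the training data.
holdout = [
    ("2023-Q1", 0.91, 1), ("2023-Q1", 0.74, 1), ("2023-Q1", 0.66, 1),
    ("2023-Q1", 0.30, 1), ("2023-Q1", 0.20, 0),
    ("2023-Q2", 0.88, 1), ("2023-Q2", 0.71, 1), ("2023-Q2", 0.52, 1),
    ("2023-Q2", 0.41, 1), ("2023-Q2", 0.65, 0),
    ("2023-Q3", 0.80, 1), ("2023-Q3", 0.55, 1), ("2023-Q3", 0.35, 1),
    ("2023-Q3", 0.10, 1), ("2023-Q3", 0.15, 0),
]
recalls = recall_by_period(holdout)
stable = is_consistent(recalls)
```

In this made-up data the model catches three of four churners in the first two quarters but only two of four in the third, so the consistency check fails and the seasonal drop deserves a closer look.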
Apart from providing the business teams with a regularly updated churn score for each customer, it is highly advisable to list the top predictors (cause) and their relative influence (effect) for each customer. This helps in driving the right conversations at the time of intervention.
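For a logistic regression model, one simple way to list per-customer predictors is to rank each feature by its contribution to the score (coefficient times standardized feature value). The coefficients, intercept, and feature names below are hypothetical, and tree-based models would need a different attribution method.

```python
import math

# Hypothetical coefficients from a logistic regression on standardized features.
COEFFICIENTS = {
    "net_outflow_3m": 1.2,
    "days_since_last_login": 0.8,
    "complaints_6m": 0.5,
    "products_held": -0.6,
}
INTERCEPT = -2.0


def churn_score(features):
    """Logistic churn probability for one customer."""
    z = INTERCEPT + sum(COEFFICIENTS[name] * value for name, value in features.items())
    return 1 / (1 + math.exp(-z))


def top_drivers(features, k=3):
    """Features ranked by the absolute size of their contribution to the score."""
    contributions = {name: COEFFICIENTS[name] * value for name, value in features.items()}
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:k]


# One customer's standardized feature values (illustrative only).
customer = {
    "net_outflow_3m": 2.0,
    "days_since_last_login": 0.5,
    "complaints_6m": 1.0,
    "products_held": -1.0,
}
score = churn_score(customer)
drivers = top_drivers(customer, k=2)
```

Here the largest driver is recent balance outflow, followed by holding fewer products than average, which gives the relationship manager a concrete starting point for the conversation.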
Driving Business Adoption
Driving business adoption is one of the most challenging activities and requires an orchestrated effort from all the stakeholders involved. It starts with effectively demonstrating the model’s predictive power for a recent historical period and running several simulations to measure the effectiveness of the model and the associated strategy. Claims such as “the model captures 70% of potential churners at least six months prior to the event” are sure to generate interest. Also, an articulation of business value, such as “the model can save an additional $20 million of balance outflow every month”, demonstrates tangible impact.
The next milestone is designing an effective campaign to put the model to test. Typical questions to be expected are – How many of your “at-risk” customers do I contact? How do I further prioritize my contact strategy? What treatment or interventions do I offer? How do I measure the impact of the model post-campaign? All of these questions can be answered by understanding the CRM team’s current efforts to mitigate churn, the resources they have to run the campaign, and the marketing and business teams’ definition of a ‘priority’ customer.
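As one possible answer to the prioritization questions above, the sketch below ranks at-risk customers by expected balance at risk (churn score times balance) and keeps only as many as the campaign team can handle. The ranking rule, the 0.5 cut-off, and the sample customers are assumptions; the real definition of a “priority” customer should come from the business teams.

```python
def build_contact_list(customers, capacity, min_score=0.5):
    """Return the ids of the highest-priority at-risk customers, up to `capacity`."""
    at_risk = [c for c in customers if c["score"] >= min_score]
    # Expected balance at risk: likelihood of churn times what would leave.
    at_risk.sort(key=lambda c: c["score"] * c["balance"], reverse=True)
    return [c["id"] for c in at_risk[:capacity]]


# Hypothetical scored customers.
scored = [
    {"id": "A", "score": 0.9, "balance": 10_000},
    {"id": "B", "score": 0.6, "balance": 200_000},
    {"id": "C", "score": 0.4, "balance": 900_000},  # below the at-risk cut-off
    {"id": "D", "score": 0.8, "balance": 50_000},
]
contact_list = build_contact_list(scored, capacity=2)
```

With capacity for two contacts, customers B and D are selected ahead of A, whose churn score is higher but whose balance at risk is much smaller.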
Test vs. Control is one of the standard approaches: within each category of likely churners, you select two similar groups of customers. The test group receives the model-based interventions and the control group does not. The next step is to observe the churn rates in both groups in the post-campaign period and use the churn rate in the control group as a baseline to highlight the incremental effect in the test group (in this case, an x% reduction in churn rate).
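The incremental effect can be computed as a relative reduction against the control baseline. The group sizes and churn counts below are invented purely to show the arithmetic.

```python
def churn_rate(churned: int, total: int) -> float:
    """Share of a group that churned in the post-campaign period."""
    if total <= 0:
        raise ValueError("group must contain at least one customer")
    return churned / total


def relative_reduction(test_rate: float, control_rate: float) -> float:
    """Share of baseline (control) churn that was avoided in the test group."""
    if control_rate == 0:
        raise ValueError("control group has no churn to compare against")
    return (control_rate - test_rate) / control_rate


# Hypothetical campaign: 1,000 customers in each group.
control_rate = churn_rate(churned=120, total=1000)
test_rate = churn_rate(churned=90, total=1000)
reduction = relative_reduction(test_rate, control_rate)
print(f"Churn reduced by {reduction:.1%} relative to control")
```

Before reporting such a figure, it is worth checking that the groups are large enough for the difference to be statistically meaningful.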
To summarize, all banks acknowledge customer churn as a significant business problem, but they often lack a systematic and proactive way to address it. Even after harnessing the power of massive data and building robust churn prediction models, the challenge lies in creating an ecosystem of enablers at every stage. This encompasses setting up the data pipeline to generate, store, and report churn likelihood scores, and then collaborating with the business, marketing, and CRM teams and with the relationship managers and branch colleagues concerned, who will benefit from the insights during their conversations and can offer the right retention strategies.
For more inputs on intervention strategies and on making fundamental changes that positively impact customer experience, await our next blog on ‘Customer Sentiment Analysis – An entity-level approach to understanding underlying reasons for dissatisfaction’.