For centuries statisticians have studied and predicted life expectancy and survival rates. The first example dates back to 1662, when English statistician John Graunt developed the Life Table, which estimated the percentage of people who would live to each successive age, along with their life expectancy.
Elula, in conjunction with the University of New South Wales (UNSW), has been undertaking R&D activities into how we can combine traditional survival analysis with state-of-the-art deep learning time series techniques.
What is survival analysis?
Traditionally, survival analysis has been used to model the expected time until the death of a biological organism (hence the name survival). There are many factors such as disease or treatments that may impact the likelihood of the organism surviving past a certain time, and the goal of the modelling is to predict the probability of survival given these factors.
These same concepts have also been applied in a variety of other fields to model the time to an event. Commercial examples include predicting mechanical failure so that maintenance can be carried out before it is too late; determining life expectancy to price life insurance premiums; and predicting when a customer may churn.
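As a concrete illustration of "time to an event", here is a minimal sketch of the Kaplan-Meier estimator, a standard non-parametric way to estimate a survival curve. It is not taken from Elula's products: the function name and the churn data are hypothetical, and the sketch assumes each customer has a duration plus a flag saying whether churn was actually observed (customers who are still active are "censored").

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: time until the event or until observation stopped.
    observed:  True if the event happened, False if the record is censored.
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # everyone leaving the risk set at each time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk subjects that survive time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical data: months until churn; False means still a customer.
durations = [3, 5, 5, 8, 12, 12]
observed = [True, True, False, True, False, False]
curve = kaplan_meier(durations, observed)
for month, probability in curve:
    print(f"month {month}: estimated survival {probability:.3f}")
```

Note that censored customers still count towards the at-risk total until the time they drop out of observation, which is what distinguishes survival analysis from simply averaging the observed churn times.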
Overcoming the limitations of traditional survival analysis
Survival modelling is a well-established statistical method. Traditional methods, however, often struggle to handle the complexity found in today’s large datasets. They make simplifying assumptions that can limit the accuracy of the resulting models. The Cox Proportional-Hazards Model, a common methodology, assumes that each variable has a linear effect on the log of the hazard (the instantaneous risk of the event) and that this effect is constant over time, i.e. the influence of a variable does not change over time and its relationship with the log-hazard is linear. Yet in the real world, relationships may be non-linear and time-dependent. For example, a change in a home loan interest rate today may influence a customer’s likelihood to churn to a greater or lesser degree than the same change did a year ago.
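To make the proportional-hazards assumption concrete, the sketch below evaluates the Cox hazard, h(t | x) = h0(t) * exp(beta . x), for two customers at several points in time. The coefficients, covariates and baseline hazard values are all hypothetical and chosen only for illustration; in practice the coefficients would be estimated from data.

```python
import math


def cox_hazard(baseline_hazard, coefficients, covariates):
    """Cox model hazard: baseline hazard scaled by exp(linear predictor)."""
    linear_predictor = sum(b * x for b, x in zip(coefficients, covariates))
    return baseline_hazard * math.exp(linear_predictor)


# Hypothetical coefficients for two covariates,
# e.g. [rate_increase_flag, years_as_customer].
coefficients = [0.8, -0.3]
customer_a = [1.0, 2.0]  # experienced a rate increase
customer_b = [0.0, 2.0]  # did not

# Hypothetical baseline hazard h0(t) at months 1, 6 and 12.
baseline_by_month = {1: 0.010, 6: 0.025, 12: 0.040}

ratios = [
    cox_hazard(h0, coefficients, customer_a)
    / cox_hazard(h0, coefficients, customer_b)
    for h0 in baseline_by_month.values()
]
print(ratios)  # the same hazard ratio at every time point
```

The baseline hazard changes from month to month, but the ratio between the two customers stays fixed at exp(0.8). That constant ratio is exactly the restriction a more flexible model would relax.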
To mitigate these problems, Elula is actively working in the rapidly expanding field of deep learning. This field has the advantage of being able to handle real-world complexities, and offers improvements to traditional survival analysis techniques.
Deep learning is a sub-category of machine learning and artificial intelligence that uses hierarchical (multi-layer) artificial neural networks, loosely inspired by the structure of the human brain. Deep learning can capture increasingly complex patterns and relationships in the data and, given enough data, often achieves greater predictive accuracy than standard statistical models.
Research into the fusion of deep learning and survival analysis, known as “deep survival analysis”, is emerging and has so far shown great promise in the medical field, despite the relatively small datasets available there. Given that deep learning methods tend to perform better with very large datasets, one would expect this technique to also apply well to business use cases, which typically involve significantly larger datasets.
Application in customer retention
Modelling the likelihood of customer churn is a problem naturally suited to survival modelling; however, it can be very challenging due to the complexity of modelling human behaviour. One particular challenge is to capture how an individual customer’s behaviour changes over time. We need to account for these changes by incorporating a time dimension into many features and characteristics of an individual customer. Having a modelling technique that uses that historical information may provide a significant advantage here.
Elula and UNSW have been researching, trialling, and applying these state-of-the-art deep learning time series techniques in the real world to help our clients deliver better customer service and reduce churn. In our customer retention product (Sticky), we found that by incorporating Long Short-Term Memory (LSTM) neural network architectures into a survival model, we can get an uplift in results compared to existing models. These models have the advantage of learning the time series information with the LSTM and complex relationships through the neural network architecture, while still harnessing the power of traditional survival models.
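One common way to connect a sequence model to survival analysis is a discrete-time formulation, in which the network outputs a hazard for each period (the probability of churning in that period, given the customer has stayed so far). The sketch below is an assumption about how such outputs could be used, not a description of the Sticky model; the function name and hazard values are hypothetical, and the hazards are hard-coded where a trained network would supply them.

```python
def survival_curve(hazards):
    """Convert per-period hazards into cumulative survival probabilities.

    hazards[t] is P(event in period t | survived up to period t).
    Returns S(t) = product over periods up to t of (1 - hazard).
    """
    curve = []
    survival = 1.0
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each hazard must be a probability in [0, 1]")
        survival *= 1.0 - h
        curve.append(survival)
    return curve


# Hypothetical per-month hazards, standing in for model outputs.
monthly_hazards = [0.02, 0.05, 0.10]
curve = survival_curve(monthly_hazards)
for month, probability in enumerate(curve, start=1):
    print(f"month {month}: P(still a customer) = {probability:.4f}")
```

Because each hazard can differ from period to period, this formulation does not force a variable's influence to stay constant over time.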
We are also starting to experiment with different loss functions for the models. The loss function quantifies how far the model’s predictions are from the observed data, and in that sense acts as a measure of “goodness of fit” (a lower loss means a better fit). Choosing an appropriate loss function can improve model performance; the best choice depends mostly on the distribution of the values that we want to predict. The default loss functions built into deep learning packages are general-purpose and work well in most cases. For survival analysis, however, we can choose a loss based on a different distribution, which may improve performance further.
At Elula we continue to invest in R&D to ensure that our products perform to the highest standard and remain future-proof and, importantly, that our customers get the very best outcomes.
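As an example of a distribution-specific loss, the sketch below computes the negative log-likelihood of time-to-event data under a Weibull distribution, handling censored records correctly. The Weibull choice, the function name and the data are assumptions made for illustration, not necessarily the loss used in our models. In a deep learning setting the scale and shape would typically be predicted per customer by the network; here they are fixed numbers to keep the example small.

```python
import math


def weibull_nll(durations, observed, scale, shape):
    """Mean negative log-likelihood under a Weibull(scale, shape) model.

    Observed events contribute log f(t) = log h(t) + log S(t).
    Censored records contribute only log S(t), the log-probability of
    surviving at least until time t.
    """
    total = 0.0
    for t, event in zip(durations, observed):
        cumulative_hazard = (t / scale) ** shape  # equals -log S(t)
        if event:
            log_hazard = (
                math.log(shape / scale)
                + (shape - 1.0) * math.log(t / scale)
            )
            total += log_hazard - cumulative_hazard
        else:
            total += -cumulative_hazard
    return -total / len(durations)


# Hypothetical data: months until churn; False means still a customer.
durations = [3.0, 5.0, 5.0, 8.0, 12.0, 12.0]
observed = [True, True, False, True, False, False]
loss = weibull_nll(durations, observed, scale=10.0, shape=1.5)
print(f"loss: {loss:.4f}")
```

A general-purpose loss such as mean squared error on the observed durations would treat a still-active customer's tenure as if it were a churn time, whereas a likelihood-based loss uses that record only as evidence that the customer survived at least that long.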