How Common Explainable AI Algorithms Work

“Nothing in life is to be feared, it is only to be understood. Now is the time to understand more, so that we may fear less.” Marie Curie

Explainable AI (XAI) is becoming increasingly important as we seek to understand why machine learning models make a particular decision or prediction. XAI helps build trust in solutions and provides context and reasons for predictions. A number of methods for producing explanations have gained popularity over the last year, with LIME, SHAP and Anchor emerging as leading approaches. In this blog we provide an overview of how these three explainer algorithms work, and in the coming blogs we will explore some strengths and weaknesses of these approaches.

How do they work?

To help us explain how LIME, SHAP and Anchor work we will use a well-known Kaggle example – trying to predict who will survive the Titanic disaster using inputs such as age, gender, cabin location, etc.

SHAP and LIME are very similar in terms of their output: both produce a feature importance (also called a feature attribution) for an individual prediction. This means that LIME and SHAP tell you which of the input features were most important to that prediction.

LIME

To understand how LIME works, consider the prediction “Rose will survive the Titanic”, made by a model that predicts reasonably accurately who will survive and who will not. To understand why Rose was predicted to survive, LIME generates thousands of imaginary people who are all a bit different from Rose and checks whether the model predicts them to survive or not. It then fits an interpretable linear model to these predictions, giving more weight to the imaginary people who are most similar to Rose. This simple model approximates the behaviour of the full model, but only in Rose’s neighbourhood (i.e. we use a simpler model on a smaller dataset to differentiate Rose from people similar to her). Once we have a good local approximation, we can read off a feature importance measure: the most important feature is the one whose coefficient in the linear model has the largest magnitude, and the sign of the coefficient tells us whether that feature pushed the prediction towards or away from survival.
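The steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the LIME library itself: `black_box` is a made-up stand-in for the trained Titanic model, Rose's feature values (female, 17 years old, upper deck) are hypothetical, and the perturbation scheme and proximity kernel are simple choices made for this example.

```python
import math
import random

def black_box(is_female, age, upper_deck):
    """Hypothetical stand-in for the trained model: returns P(survive)."""
    score = 2.0 * is_female - 0.04 * age + 1.0 * upper_deck - 0.5
    return 1.0 / (1.0 + math.exp(-score))

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def lime_sketch(rose, n_samples=2000, seed=0):
    """Fit a proximity-weighted linear model around `rose` = (is_female, age, upper_deck)."""
    rng = random.Random(seed)
    names = ["intercept", "is_female", "age_decades", "upper_deck"]
    k = len(names)
    xtwx = [[0.0] * k for _ in range(k)]
    xtwy = [0.0] * k
    for _ in range(n_samples):
        # An imaginary person: Rose with some features changed at random.
        is_female = rose[0] if rng.random() < 0.7 else 1 - rose[0]
        age = rose[1] + rng.gauss(0, 10)
        upper = rose[2] if rng.random() < 0.7 else 1 - rose[2]
        y = black_box(is_female, age, upper)
        # Proximity weight: people more like Rose count for more.
        dist2 = ((is_female - rose[0]) ** 2
                 + ((age - rose[1]) / 10.0) ** 2
                 + (upper - rose[2]) ** 2)
        w = math.exp(-dist2 / 2.0)
        # Age is expressed in decades so the coefficients are on comparable scales.
        row = [1.0, is_female, age / 10.0, upper]
        # Accumulate the weighted normal equations (X^T W X) beta = X^T W y.
        for i in range(k):
            xtwy[i] += w * row[i] * y
            for j in range(k):
                xtwx[i][j] += w * row[i] * row[j]
    return dict(zip(names, solve(xtwx, xtwy)))

coefs = lime_sketch(rose=(1, 17, 1))
ranking = sorted((n for n in coefs if n != "intercept"),
                 key=lambda n: abs(coefs[n]), reverse=True)
print(coefs)
print(ranking)
```

Note that the ranking depends on how the features are scaled (age is measured in decades here), which is one reason the raw coefficient values should be read with care.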

SHAP

SHAP’s algorithm is quite different: using results from game theory, SHAP pretends that each feature is a member of a team and the prediction is the prize the team wins. SHAP then calculates Shapley values, which measure how much each team member contributed to the win, and tells us which team members (features) contributed most. This is done by “removing” one member from the team (one feature from the model) and seeing how the team performs without them. But SHAP doesn’t just want to see how useful the team member is in one team; it wants to see how useful they would be in any team. So it simulates every possible team that can be formed from the members we have, and measures how much each member contributes on average by removing that member from every team they belong to.

In the Titanic case, SHAP will create a game where gender, age, and cabin location are players. These participants are trying to win by working together to predict whether Rose will survive or not.
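To make the game concrete, here is a small sketch that computes exact Shapley values for the three players by enumerating every possible team. The payoff numbers in `PAYOFF` are invented for illustration (they are not taken from any real Titanic model); each one stands for the predicted survival probability for Rose when only those features are "present".

```python
from itertools import combinations
from math import factorial

PLAYERS = ("gender", "age", "cabin")

# Hypothetical payoffs: predicted survival probability for each possible team.
PAYOFF = {
    frozenset(): 0.40,
    frozenset({"gender"}): 0.70,
    frozenset({"age"}): 0.45,
    frozenset({"cabin"}): 0.50,
    frozenset({"gender", "age"}): 0.75,
    frozenset({"gender", "cabin"}): 0.85,
    frozenset({"age", "cabin"}): 0.55,
    frozenset({"gender", "age", "cabin"}): 0.90,
}

def shapley_values(players, payoff):
    """Average each player's marginal contribution over every team they could join."""
    n = len(players)
    values = {}
    for player in players:
        others = [p for p in players if p != player]
        total = 0.0
        for size in range(n):
            for team in combinations(others, size):
                team = frozenset(team)
                # Shapley weight for a team of this size.
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                # How much the payoff changes when the player joins the team.
                total += weight * (payoff[team | {player}] - payoff[team])
        values[player] = total
    return values

phi = shapley_values(PLAYERS, PAYOFF)
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

With these made-up payoffs, gender receives the largest share, and the three values add up to the gap between the full team's payoff (0.90) and the empty team's payoff (0.40). Enumerating every team is only feasible for a handful of features, since the number of teams doubles with each feature added.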

However, you can’t really remove a feature from a model, so instead SHAP replaces the value of that feature with another value drawn from the dataset, and does that many times to get the average result of randomising that feature. Since this is computationally intensive, SHAP can summarise the dataset with k-means clustering and substitute in the values from the cluster centres to approximate the Shapley values.
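The "removal by replacement" idea can be sketched as follows. This is an illustration under assumptions rather than the SHAP library's implementation: `model`, the four-row `BACKGROUND` dataset and Rose's feature values are all hypothetical, and the sketch averages over every background row instead of drawing random ones.

```python
def model(p):
    """Hypothetical stand-in for the trained model: returns a survival score in [0, 1]."""
    score = 0.2 + 0.4 * p["female"] + 0.2 * p["upper_deck"] - 0.002 * p["age"]
    return max(0.0, min(1.0, score))

# A tiny made-up background dataset standing in for the other passengers.
BACKGROUND = [
    {"female": 0, "age": 40, "upper_deck": 0},
    {"female": 0, "age": 22, "upper_deck": 1},
    {"female": 1, "age": 35, "upper_deck": 0},
    {"female": 1, "age": 8, "upper_deck": 1},
]
ROSE = {"female": 1, "age": 17, "upper_deck": 1}

def team_value(present, person=ROSE, background=BACKGROUND):
    """Average prediction when only the features in `present` keep the person's values.

    Every other feature is "removed" by substituting the value from each
    background row in turn, and the resulting predictions are averaged.
    """
    total = 0.0
    for other in background:
        mixed = {name: (person[name] if name in present else other[name])
                 for name in person}
        total += model(mixed)
    return total / len(background)

full = team_value({"female", "age", "upper_deck"})   # nothing removed
without_gender = team_value({"age", "upper_deck"})   # gender "removed"
empty = team_value(set())                            # everything removed

print(full, without_gender, empty)
```

Here `full` is simply the model's prediction for Rose, `empty` is the average prediction over the background data, and the drop from `full` to `without_gender` is the contribution of gender to this particular team.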

This method has all kinds of nice properties. For example, the Shapley values are uniquely defined, so SHAP is more stable than LIME, whose explanations are not unique. SHAP importances are also additive: the importances of all the team members sum to the success of the team, that is, to the difference between Rose’s prediction and the average prediction. Compare this to LIME, where the importance numbers help you rank features but do not really mean anything in themselves. So SHAP is normally the preferred feature importance algorithm.
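For readers who prefer notation, the Shapley value and the additivity property can be written as below. The symbols are introduced here for illustration and do not appear elsewhere in this post: N is the set of features, v(S) is the model's average prediction when only the features in S keep Rose's values, and φ_i is the importance of feature i.

```latex
\phi_i \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,\bigl(|N|-|S|-1\bigr)!}{|N|!}\,
  \bigl[\, v(S \cup \{i\}) - v(S) \,\bigr],
\qquad
\sum_{i \in N} \phi_i \;=\; v(N) - v(\emptyset)
```

The second equation is the additivity property: v(N) is the prediction for Rose, v(∅) is the average prediction, and the importances account for exactly the gap between the two.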

Anchor

A third algorithm, and one that Elula prefers, is Anchor. Anchor was developed by the creators of LIME but is less well known.

Rather than creating a feature importance like SHAP and LIME, Anchor creates a region in the feature space. That is, it comes up with an easily interpretable rule which, whenever it holds, leads to the same prediction with very high probability. So instead of saying that Rose survived the Titanic because of her gender, age, and location, Anchor helps fine-tune the explanation and would say that Rose survived because she was a woman aged between 15 and 25 who had a cabin on the upper deck. Anchor gives us a set of conditions describing a person, rather than a feature importance measure. If Anchor does its job right, then anyone who satisfies this set of rules will have a very high chance of survival.

To find this set of rules, Anchor searches across the possible rectangular regions in the feature space. We say “rectangular” because the rules take the form “age is between 15 and 25, cabin location is between decks A and B”. Each of these rules marks out a rectangle in the feature space. Anchor’s search is quite computationally intensive, especially as the number of features increases, so it can become impractical for wide-scale use with complex models on large feature sets. But good explanations are often worth the computational cost.
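The sketch below shows how a candidate rule might be scored, under simplifying assumptions. It does not perform Anchor's search; it only evaluates three nested rules on a synthetic population. `predicts_survival` is a made-up stand-in for the trained model, the population is randomly generated, and "precision" (how often people matching the rule get the same prediction as Rose) and "coverage" (how many people the rule applies to) are the two quantities used to judge each rule.

```python
import random

def predicts_survival(p):
    """Hypothetical stand-in for the trained model: True means 'predicted to survive'."""
    return 1.0 * p["female"] + 1.0 * p["upper_deck"] - 0.04 * p["age"] - 0.5 > 0

# A synthetic population of imaginary passengers (fixed seed for repeatability).
rng = random.Random(0)
PEOPLE = [
    {"female": rng.randint(0, 1), "age": rng.randint(1, 70), "upper_deck": rng.randint(0, 1)}
    for _ in range(5000)
]

def is_female(p):
    return p["female"] == 1

def on_upper_deck(p):
    return p["upper_deck"] == 1

def aged_15_to_25(p):
    return 15 <= p["age"] <= 25

def matching(rule, people):
    """People who satisfy every condition in the rule (i.e. fall inside the rectangle)."""
    return [p for p in people if all(condition(p) for condition in rule)]

def precision(rule, people, target=True):
    """Share of matching people for whom the model gives the target prediction."""
    inside = matching(rule, people)
    if not inside:
        return 0.0
    return sum(predicts_survival(p) == target for p in inside) / len(inside)

def coverage(rule, people):
    """Share of the whole population that the rule applies to."""
    return len(matching(rule, people)) / len(people)

CANDIDATES = {
    "female": [is_female],
    "female, upper deck": [is_female, on_upper_deck],
    "female, upper deck, aged 15-25": [is_female, on_upper_deck, aged_15_to_25],
}

for name, rule in CANDIDATES.items():
    print(f"{name}: precision={precision(rule, PEOPLE):.2f}, "
          f"coverage={coverage(rule, PEOPLE):.2f}")
```

In this toy setup only the three-condition rule pins down the prediction, at the cost of applying to far fewer people. That trade-off between precision and coverage is what makes the search expensive: each extra feature multiplies the number of candidate rectangles to score.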

Anchor solves some problems inherent in LIME, which we will detail in a future blog post.

In summary

We have shared a high-level description of how some common explainer algorithms work. As always, it’s the application of these through extensive testing that drives the real outcomes. In the coming weeks, we will explore some strengths and weaknesses of these approaches, explain where there is room for improvement and start to share a little more about Elula’s proprietary explainers.


Majella
