Machine Learning Explainability (XAI): Part One

by Henry Jia

As more machine learning models go into production, they play an increasingly crucial role in our technology infrastructure. However, few of these models are rigorously examined to understand their inner mechanics. That understanding is necessary for users to trust that a prediction is actionable and that the model behaves the way a human expert would expect.

One way to gain some understanding of "black box" models is a technique called LIME (Local Interpretable Model-agnostic Explanations). It was first proposed in a 2016 paper by Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. It is a method that "explains models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem."

This is a powerful technique for a few reasons:

  • It is model-agnostic: it can explain the predictions of SVMs, random forests, and deep neural networks alike.
  • In the problem space close to the prediction, there is very little trade-off between faithfulness and interpretability.
  • Its explainers are user-friendly and drive confidence in taking actions based on model predictions.

Now let's take a look at how LIME works:

When a model reaches a decision, it draws on its inputs. The interaction between all the input values and their associated weights, passed through mathematical functions, eventually produces that decision. While every input might be considered, only a subset of them dominates the decision-making process. LIME aims to identify these predictive features and, more importantly, to communicate them in a human-friendly way. In the workflow above, after LIME identifies the dominant factors in a flu diagnosis, it shows the user which symptoms the model treats as the most important predictors. This is valuable information that helps the user decide what action to take.
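To make the "only a subset dominates" idea concrete, here is a minimal sketch using a toy linear model. Everything in it is hypothetical: the feature names, the weights, and the patient record are made up for illustration, and this is not LIME itself. It only shows how ranking per-feature contributions surfaces the handful of inputs that drive a single prediction.

```python
import math

# Hypothetical weights for a toy linear "flu" model (illustrative values only).
WEIGHTS = {"sneeze": 1.8, "headache": 1.2, "fatigue": 1.5, "age": 0.01, "weight_kg": 0.002}
BIAS = -2.0


def predict(patient):
    """Return the toy model's flu probability for one patient record."""
    score = BIAS + sum(WEIGHTS[name] * value for name, value in patient.items())
    return 1.0 / (1.0 + math.exp(-score))


def top_contributions(patient, k=3):
    """Rank features by the absolute size of weight * value for this patient."""
    contributions = {name: WEIGHTS[name] * value for name, value in patient.items()}
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:k]


# Hypothetical patient: sneezing and headache present, no fatigue.
patient = {"sneeze": 1, "headache": 1, "fatigue": 0, "age": 30, "weight_kg": 70}

print(f"flu probability: {predict(patient):.2f}")
for name, contribution in top_contributions(patient, k=2):
    print(f"{name}: {contribution:+.2f}")
```

For a real black-box model the weights are not available like this, which is exactly the gap LIME tries to fill by estimating them locally.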

So how does LIME create such transparency and simplicity out of complex models?

Local Decision Approximation

Imagine a high-dimensional problem space reduced to a flat 2-D surface, where the blue and yellow regions represent the two possible outcomes of the model. For a particular decision, LIME looks at the specific region where that decision took place and fits a line that best approximates the decision boundary there. The function that generates this line is then used to find the features most informative to the user.
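The local approximation can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the lime library and not the paper's full algorithm (which works on an interpretable representation and fits a sparse linear model). The black_box function, the explain_locally helper, and the sampling and kernel parameters are all hypothetical. The steps are: sample points near the instance, query the model, weight each sample by its proximity to the instance, and fit a weighted linear model.

```python
import numpy as np


def black_box(X):
    """Hypothetical opaque model: P(class 1) with a circular decision boundary."""
    radius_sq = X[:, 0] ** 2 + X[:, 1] ** 2
    return 1.0 / (1.0 + np.exp(-4.0 * (radius_sq - 1.0)))


def explain_locally(predict_fn, x, n_samples=2000, scale=0.3, kernel_width=0.5, seed=0):
    """Fit a proximity-weighted linear model to predict_fn around the point x."""
    rng = np.random.default_rng(seed)

    # 1. Perturb the instance to sample its neighborhood.
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.shape[0]))

    # 2. Ask the black box for its predictions on the perturbed points.
    y = predict_fn(Z)

    # 3. Weight each sample with an exponential kernel: closer points count more.
    distances = np.linalg.norm(Z - x, axis=1)
    sample_weights = np.exp(-(distances ** 2) / kernel_width ** 2)

    # 4. Solve the weighted least-squares problem (features centered on x).
    A = np.hstack([Z - x, np.ones((n_samples, 1))])
    sqrt_w = np.sqrt(sample_weights)
    coef, *_ = np.linalg.lstsq(A * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return coef[:-1], coef[-1]  # per-feature weights, intercept


# Explain a point sitting right on the circular boundary.
x = np.array([1.0, 0.0])
weights, intercept = explain_locally(black_box, x)
print("local feature weights:", weights)
print("local intercept:", intercept)
```

At this point the boundary runs roughly vertically, so the fitted line should assign most of the weight to the first feature and very little to the second. The globally nonlinear model is summarized, locally, by a linear one whose coefficients can be read as feature importances.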

In Part 2, we will go through a coding example that analyzes a Random Forest NLP model to see how it classifies consumer complaints from the Consumer Financial Protection Bureau (CFPB).