“‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier” was submitted to arXiv on February 16, 2016 by Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin of the University of Washington, and presented at KDD 2016. It introduced LIME, short for Local Interpretable Model-agnostic Explanations, one of the first widely adopted tools for explaining the decisions of black-box machine learning models.
LIME’s core idea is that even when a model is globally complex - a deep network or a large random forest - its behavior near any single input can be approximated by something simple. To explain one prediction, LIME perturbs the input many times (removing words, hiding image regions), records how the model’s output changes, and fits an interpretable model, such as a sparse linear regression, to those perturbed samples, weighting each sample by how close it is to the original input. The coefficients of that local model show which features pushed the prediction one way or the other. Because it only needs the model’s inputs and outputs, LIME is “model-agnostic” - it works on any classifier without seeing its internals.
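The perturb, query, and fit loop described above can be sketched in a few dozen lines of plain Python. This is a minimal illustration under stated assumptions, not the authors’ implementation or the API of the lime package: toy_black_box is a hypothetical stand-in for a real classifier, words are dropped by independent coin flips, distance is simply the fraction of words removed, and a small ridge regression replaces the sparse feature-selection step used in the paper. The names explain, solve, kernel_width, and ridge are chosen here for illustration only.

```python
import math
import random


def toy_black_box(tokens):
    """Hypothetical opaque classifier: returns P(positive) for a list of words."""
    score = 2.0 * ("great" in tokens) - 2.0 * ("boring" in tokens)
    return 1.0 / (1.0 + math.exp(-score))


def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def explain(tokens, predict, num_samples=500, kernel_width=0.75, ridge=1e-3, seed=0):
    """Return (word, weight) pairs from a local weighted linear fit."""
    rng = random.Random(seed)
    d = len(tokens)
    rows, targets, weights = [], [], []
    for i in range(num_samples):
        # The first sample is the unperturbed input; the rest drop words at random.
        mask = [1] * d if i == 0 else [rng.randint(0, 1) for _ in range(d)]
        kept = [t for t, keep in zip(tokens, mask) if keep]
        distance = 1.0 - sum(mask) / d  # assumed distance: fraction of words removed
        rows.append([1.0] + [float(v) for v in mask])  # leading 1.0 is the intercept
        targets.append(predict(kept))  # only the black box's output is used
        weights.append(math.exp(-(distance ** 2) / kernel_width ** 2))

    # Weighted ridge regression via the normal equations (X'WX + ridge*I) c = X'Wy.
    p = d + 1
    a = [[sum(w * r[i] * r[j] for w, r in zip(weights, rows))
          + (ridge if i == j else 0.0) for j in range(p)] for i in range(p)]
    b = [sum(w * r[i] * y for w, r, y in zip(weights, rows, targets))
         for i in range(p)]
    coef = solve(a, b)
    # Drop the intercept and list the most influential words first.
    return sorted(zip(tokens, coef[1:]), key=lambda tw: -abs(tw[1]))


for word, weight in explain(["a", "great", "but", "boring", "film"], toy_black_box):
    print(f"{word:>7} {weight:+.3f}")
```

In this toy setup the words “great” and “boring” should receive clearly positive and negative weights while the filler words stay near zero, which is the kind of per-word attribution the paragraph above describes. Note that explain never inspects toy_black_box; it only calls it, which is what “model-agnostic” means in practice.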
The paper’s most cited demonstration was a classifier that distinguished huskies from wolves but, when explained by LIME, turned out to be detecting snow in the background rather than the animal. The authors had deliberately trained it on images in which the wolves appeared against snow, to test whether explanations would help people spot an untrustworthy model. That example became a standard illustration of how a model can be right for the wrong reasons, and why per-prediction explanations matter for trust.
LIME, alongside SHAP a year later, helped launch the practical field now called explainable AI. Its limitations - explanations can be unstable across runs and sensitive to how perturbations are sampled - drove much of the follow-up work.
Why business readers should care: when a model denies a loan or flags a transaction, regulators and customers increasingly expect a reason. LIME was an early, general way to produce one without rebuilding the model.