“A Unified Approach to Interpreting Model Predictions” was submitted to arXiv on May 22, 2017 by Scott Lundberg and Su-In Lee of the University of Washington, and published at NIPS 2017. It introduced SHAP, short for SHapley Additive exPlanations, which has become one of the most widely used methods for explaining individual model predictions.
SHAP borrows from cooperative game theory. The Shapley value, introduced by Lloyd Shapley in 1953, is the unique way to split a payout among players that satisfies a small set of fairness axioms; it credits each player with its average marginal contribution across all possible coalitions. SHAP treats a model’s prediction as the payout and each input feature as a player, assigning every feature a value that measures how much it pushed the prediction above or below the average prediction, which serves as the baseline. These values add up exactly to the difference between the model’s output and that baseline, a property the paper calls “local accuracy.”
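To make the coalition idea concrete, here is a minimal sketch that computes exact Shapley values by brute force for a three-feature toy model. The model, the input, and the all-zeros baseline are hypothetical, chosen only for illustration. Using a single reference input to stand in for an "absent" feature is a simplifying assumption; it is not how the paper or the SHAP library defines missing features.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating every coalition (exponential cost)."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Shapley weight for coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of feature i when it joins coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi

# Hypothetical toy model with one interaction term (illustration only).
def model(x):
    return 2.0 * x[0] + 3.0 * x[1] + x[0] * x[2]

x = [1.0, 2.0, 3.0]          # the instance being explained
baseline = [0.0, 0.0, 0.0]   # assumed reference input for "feature absent"

def value_fn(coalition):
    # Features in the coalition keep the instance's values; the rest use the baseline.
    mixed = [x[j] if j in coalition else baseline[j] for j in range(len(x))]
    return model(mixed)

phi = shapley_values(value_fn, len(x))
gap = model(x) - model(baseline)
print(phi, sum(phi), gap)
```

In this toy case the attributions come out to 3.5, 6.0, and 1.5. The interaction term `x[0] * x[2]` is split evenly between the two features involved, and the three values sum to 11.0, the gap between the prediction and the baseline output.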
The paper’s central theoretical contribution was to show that several earlier explanation methods, including LIME and DeepLIFT, belong to a single class of “additive feature attribution” methods, and that Shapley values are the only attributions in that class satisfying three desirable properties: local accuracy, missingness, and consistency. Computing exact Shapley values is expensive because the number of feature coalitions grows exponentially with the number of features, so the paper proposed faster approximations, and the authors’ later work added TreeSHAP, which computes exact values for tree-based models in polynomial time.
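One generic way to sidestep the exponential cost is to average marginal contributions over randomly sampled feature orderings. The sketch below shows that Monte Carlo permutation approach; it is not the paper's Kernel SHAP and not TreeSHAP. The toy model, the input, the all-zeros baseline, and the function name `sampled_shapley` are hypothetical and used only for illustration.

```python
import random

def sampled_shapley(value_fn, n_features, n_permutations=2000, seed=0):
    """Estimate Shapley values from random feature orderings (Monte Carlo)."""
    rng = random.Random(seed)
    totals = [0.0] * n_features
    order = list(range(n_features))
    for _ in range(n_permutations):
        rng.shuffle(order)
        coalition = set()
        previous = value_fn(coalition)
        for i in order:
            # Add feature i and record how much the output moved.
            coalition.add(i)
            current = value_fn(coalition)
            totals[i] += current - previous
            previous = current
    return [t / n_permutations for t in totals]

# Hypothetical toy model with one interaction term (illustration only).
def model(x):
    return 2.0 * x[0] + 3.0 * x[1] + x[0] * x[2]

x = [1.0, 2.0, 3.0]          # the instance being explained
baseline = [0.0, 0.0, 0.0]   # assumed reference input for "feature absent"

def value_fn(coalition):
    # Features in the coalition keep the instance's values; the rest use the baseline.
    mixed = [x[j] if j in coalition else baseline[j] for j in range(len(x))]
    return model(mixed)

estimate = sampled_shapley(value_fn, len(x))
print(estimate, sum(estimate), model(x) - model(baseline))
```

Each sampled ordering costs one model evaluation per feature plus one for the empty coalition, so the total work is set by the number of orderings rather than by the number of coalitions. The contributions within one ordering telescope, so the estimates still sum to the gap between the prediction and the baseline output, up to floating-point error. Only the split between features is approximate.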
Because TreeSHAP is exact and fast for gradient-boosted trees - the workhorse of tabular machine learning - SHAP became standard in industries like finance, insurance, and healthcare where per-decision explanations are often required.
Why business readers should care: SHAP gives a mathematically principled, consistent answer to “which factors drove this specific prediction,” which is what audit, compliance, and customer-facing explanation usually demand.