Inherent Trade-Offs in the Fair Determination of Risk Scores

“Inherent Trade-Offs in the Fair Determination of Risk Scores,” by Jon Kleinberg, Sendhil Mullainathan, and Manish Raghavan, was posted in 2016 in the middle of the public argument over the COMPAS recidivism tool. It put that argument on a precise mathematical footing by proving an impossibility result.

The paper formalizes three intuitive fairness conditions for a risk score. Calibration requires that, within each group, the people assigned a given risk score actually experience the outcome at the rate the score states, so that a score means the same thing regardless of group. Balance for the positive class requires that people who do experience the outcome receive the same average score regardless of group, and balance for the negative class requires the same for people who do not. Each sounds like a reasonable thing to demand of a fair system.
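To make the three conditions concrete, here is a minimal Python sketch that computes them on a small made-up dataset. The function names (`fairness_report`, `make_group`), the two score values, and the group sizes are all hypothetical choices for illustration; none of them come from the paper.

```python
from collections import defaultdict

def fairness_report(records):
    """Summarize calibration and class balance per group.

    records: iterable of (group, score, outcome), where outcome is 0 or 1.
    Returns two dicts:
      observed_rate[(group, score)] -> fraction with outcome 1 at that score
      mean_score[(group, outcome)]  -> average score within that outcome class
    """
    outcomes_at_score = defaultdict(list)
    scores_in_class = defaultdict(list)
    for group, score, outcome in records:
        outcomes_at_score[(group, score)].append(outcome)
        scores_in_class[(group, outcome)].append(score)
    observed_rate = {k: sum(v) / len(v) for k, v in outcomes_at_score.items()}
    mean_score = {k: sum(v) / len(v) for k, v in scores_in_class.items()}
    return observed_rate, mean_score

def make_group(name, cells):
    """Expand (score, n_people, n_positive) cells into individual records."""
    out = []
    for score, n, pos in cells:
        out += [(name, score, 1)] * pos + [(name, score, 0)] * (n - pos)
    return out

# Hypothetical toy data: two score values, two groups with different base rates.
records = (
    make_group("A", [(0.2, 10, 2), (0.8, 10, 8)])    # base rate 10/20
    + make_group("B", [(0.2, 30, 6), (0.8, 10, 8)])  # base rate 14/40
)

observed_rate, mean_score = fairness_report(records)
for key in sorted(observed_rate):
    print("calibration", key, round(observed_rate[key], 3))
for key in sorted(mean_score):
    print("mean score ", key, round(mean_score[key], 3))
```

In this toy data the score is calibrated in both groups: a score of 0.2 goes with a 20% outcome rate and 0.8 with an 80% rate in A and in B alike. The class averages nonetheless differ. People who experience the outcome average 0.68 in group A and about 0.543 in group B, and people who do not average 0.32 in A and about 0.246 in B, so neither balance condition holds.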

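The central theorem shows that, except in two narrow special cases, no scoring method can satisfy all three at the same time. The special cases are perfect prediction, where the score knows each person's outcome with certainty, and equal base rates, where the outcome is equally common in every group. When base rates differ and prediction is imperfect, the three conditions are mathematically incompatible, and the paper shows the conflict persists in approximate form: the conditions can hold approximately only when the data is close to one of the special cases. This helps explain how both ProPublica and Northpointe, the maker of COMPAS, could be correct. ProPublica pointed to unequal error rates across racial groups, Northpointe pointed to scores that meant the same thing in each group, and with different base rates those two notions of fairness cannot both hold.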
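The core of the argument for the exact case can be sketched in a few lines. The notation is chosen here for illustration and simplifies the paper's proof: group t has N_t people, of whom k_t experience the outcome, so its base rate is p_t = k_t / N_t (assumed strictly between 0 and 1); s_i is person i's score; x is the average score of the negative class and y the average score of the positive class, which balance requires to be the same in both groups.

```latex
\begin{align*}
\sum_{i \in t} s_i &= k_t && \text{(calibration within group } t\text{)} \\
(N_t - k_t)\,x + k_t\,y &= k_t && \text{(split the sum by outcome class, using balance)} \\
(1 - p_t)\,x + p_t\,y &= p_t && \text{(divide by } N_t\text{)} \\
(p_1 - p_2)\,(y - x - 1) &= 0 && \text{(subtract the } t = 2 \text{ equation from } t = 1\text{)}
\end{align*}
```

The last line leaves only two possibilities. Either p_1 = p_2, meaning the base rates are equal, or y = x + 1, which for scores between 0 and 1 forces x = 0 and y = 1, meaning every person is scored exactly according to their eventual outcome. These are the special cases; outside them, the three conditions cannot hold together.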

Why a business reader should care: this result means fairness is not a single box to check but a set of goals that conflict whenever the outcome is more common in one group than another and prediction is imperfect, which is the usual situation. Any organization deploying a risk score on people must decide, in advance and explicitly, which fairness property it will prioritize, because in that situation the math guarantees it cannot have them all.
