Fairness impossibility (the impossibility theorem)

The fairness impossibility theorem is the common name for a family of results showing that the leading mathematical definitions of fairness for predictive systems cannot all be satisfied at once. The sharpest version, from Kleinberg, Mullainathan, and Raghavan (2016), shows that a risk score cannot be both calibrated within each group and balanced in its error rates across groups, unless the groups have equal base rates of the predicted outcome or the score predicts perfectly. A related result by Alexandra Chouldechova, developed in the context of the COMPAS debate, shows that a tool with equal predictive value across groups must have unequal false positive or false negative rates when base rates differ.

The practical meaning is that the public dispute over the COMPAS recidivism tool had no clean winner. Northpointe could correctly claim COMPAS was calibrated, meaning a given score implied roughly the same reoffense rate regardless of race. ProPublica could correctly claim that its errors fell unequally across races: Black defendants who did not reoffend were labeled high risk more often than white defendants who did not reoffend. Both claims were true; the math says that when base rates differ, you cannot fix one without breaking the other.
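The tradeoff can be checked with a few lines of arithmetic. The sketch below uses the identity from Chouldechova's analysis, FPR = p/(1-p) × (1-PPV)/PPV × (1-FNR), where p is a group's base rate, PPV is the share of high-risk labels that turn out correct, and FPR and FNR are the false positive and false negative rates. The function names and every number are hypothetical, chosen only for illustration; they are not COMPAS figures.

```python
def implied_fpr(base_rate: float, ppv: float, fnr: float) -> float:
    """False positive rate forced by a group's base rate, PPV, and FNR.

    Rearranges PPV = TP / (TP + FP) with TP = p * (1 - FNR)
    and FP = (1 - p) * FPR, where p is the base rate.
    """
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * (1 - fnr)


def ppv_from_rates(base_rate: float, fpr: float, fnr: float) -> float:
    """Recompute PPV from the error rates, as a consistency check."""
    true_pos = base_rate * (1 - fnr)
    false_pos = (1 - base_rate) * fpr
    return true_pos / (true_pos + false_pos)


# Hypothetical setup: the tool has the same PPV and the same FNR in both groups.
ppv, fnr = 0.6, 0.35

# Hypothetical base rates: group A reoffends at 50%, group B at 30%.
fpr_a = implied_fpr(0.5, ppv, fnr)
fpr_b = implied_fpr(0.3, ppv, fnr)

print(f"Group A false positive rate: {fpr_a:.3f}")
print(f"Group B false positive rate: {fpr_b:.3f}")
```

With these made-up inputs, group A's false positive rate comes out at about 0.43 and group B's at about 0.19. Holding predictive value and the false negative rate equal forces the false positive rates apart, and the gap closes only if the base rates match.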

This reframes fairness from a property a system either has or lacks into a set of tradeoffs that must be chosen among. There is no universally fair scoring system, only systems that are fair according to a particular definition, at the cost of others.

Why a business reader should care: it is tempting to ask whether a model is fair, full stop, but the question is not well-posed. The responsible move is to pick the fairness criterion that matches the stakes of the decision, document that choice, and accept that competing notions will be violated by design.
