Deep Learning with Differential Privacy (DP-SGD)

This 2016 paper by Martín Abadi, Andy Chu, Ian Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, and Li Zhang, based on work done at Google, showed how to train deep neural networks while giving a formal differential privacy guarantee about the resulting model. The technique it introduced, differentially private stochastic gradient descent, or DP-SGD, became the standard method for private deep learning.

DP-SGD adds two operations to the ordinary training loop. At each step, the gradient of every example is computed separately and clipped to a fixed maximum norm, so that no single training example can have an outsized influence; Gaussian noise scaled to that clipping bound is then added to the aggregated gradient before the model is updated. Because each example’s contribution is bounded and masked by noise, what the final model reveals about any one training record is provably limited. The paper’s other major contribution was the moments accountant, a tighter way of tracking how much of the privacy budget is spent across the many steps of training. It let the authors reach useful accuracy at a “modest privacy budget” where naive accounting would have been far too pessimistic.
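The clip-then-noise update described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `dp_sgd_step` and `logistic_per_example_grads`, the logistic-regression model, and all parameter values are assumptions chosen for the example. It shows only the per-step mechanics; an actual privacy guarantee also depends on how batches are sampled and on a privacy accountant, neither of which is shown here.

```python
import numpy as np

def logistic_per_example_grads(params, X, y):
    """Per-example log-loss gradients for logistic regression (illustrative model)."""
    preds = 1.0 / (1.0 + np.exp(-X @ params))
    # One row per example: (sigmoid(x . w) - y) * x
    return (preds - y)[:, None] * X

def dp_sgd_step(params, per_example_grads, clip_norm, noise_multiplier, lr, rng):
    """One update: clip each example's gradient, sum, add Gaussian noise, average, step."""
    # L2 norm of each example's gradient.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    # Shrink gradients whose norm exceeds clip_norm; leave smaller ones unchanged.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale
    # Noise standard deviation is proportional to the clipping bound.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / per_example_grads.shape[0]
    return params - lr * noisy_mean

# Example usage on synthetic data (all values are arbitrary).
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 5))
y = (X[:, 0] > 0).astype(float)
w = np.zeros(5)
for _ in range(10):
    grads = logistic_per_example_grads(w, X, y)
    w = dp_sgd_step(w, grads, clip_norm=1.0, noise_multiplier=1.1, lr=0.1, rng=rng)
```

Clipping is applied to each example before summation, so the influence of any single record on the summed gradient is at most `clip_norm`; the noise scale is tied to that same bound, which is what allows the privacy cost of each step to be quantified.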

The results showed that meaningful privacy and usable accuracy could coexist on standard benchmarks, at a cost in performance and compute, but not a prohibitive one. This mattered because deep models are known to memorize training data; without protection, an attacker can sometimes extract specific records a model was trained on.

For businesses, DP-SGD is the practical answer to a recurring question: can we train a model on customer or patient data and still demonstrate, to a regulator or a customer, that what the model can reveal about any individual is strictly bounded? It is the bridge between the abstract guarantee of differential privacy and the concrete act of training a neural network.
