Practical Secure Aggregation for Privacy-Preserving Machine Learning

This 2017 paper from Google was presented at the ACM Conference on Computer and Communications Security and authored by Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, H. Brendan McMahan, Sarvar Patel, Daniel Ramage, Aaron Segal, and Karn Seth. It closed an important gap in federated learning. Federated averaging keeps raw data on devices, but the model updates each device sends back can themselves leak information about that device’s data. Secure aggregation hides each individual update from the server, which learns only the combined result.

The protocol uses cryptography so that the server can compute only the sum of all participating devices’ updates, never any individual update. Each device masks its contribution with random values constructed to cancel out when all the contributions are added together. The masks are derived from secrets that pairs of devices agree on through a key-exchange scheme, so any single device’s masked update looks like random noise to the server, yet the true total emerges once everything is combined. A central practical challenge the paper solved is robustness to dropouts: on mobile networks devices routinely disappear mid-round, and the protocol still produces a correct sum when a substantial fraction of participants drop out, provided enough devices remain to meet the protocol’s threshold. It stays secure against a server that is curious or actively malicious. The authors reported modest overhead, on the order of a “1.73x communication expansion” in one configuration (2^10 users and 2^20-dimensional vectors of 16-bit values, compared with sending the data in the clear), low enough for real deployment.
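
The cancellation idea can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the paper's full protocol: the function names are hypothetical, the shared seeds are handed out directly instead of being derived through a key exchange, Python's non-cryptographic `random` module stands in for a secure pseudorandom generator, and the dropout-recovery machinery is omitted entirely. For each pair of devices, the one with the smaller id adds a shared mask and the other subtracts it, so the masks vanish in the sum.

```python
import random

MODULUS = 2**16  # assumption: updates are 16-bit integers, summed mod 2^16


def pairwise_mask(seed, length):
    """Expand a shared seed into a pseudorandom mask vector.

    random.Random is NOT cryptographically secure; it is a stand-in here.
    """
    rng = random.Random(seed)
    return [rng.randrange(MODULUS) for _ in range(length)]


def make_shared_seeds(device_ids, rng):
    """Give every pair of devices one common seed.

    In a real protocol this seed would come from a key exchange between
    the two devices; here it is simply generated in one place.
    """
    seeds = {d: {} for d in device_ids}
    for i, a in enumerate(device_ids):
        for b in device_ids[i + 1:]:
            s = rng.getrandbits(64)
            seeds[a][b] = s
            seeds[b][a] = s
    return seeds


def mask_update(device_id, update, shared_seeds):
    """Add the pair mask for peers with a larger id, subtract it otherwise."""
    masked = list(update)
    for peer_id, seed in shared_seeds[device_id].items():
        mask = pairwise_mask(seed, len(update))
        sign = 1 if device_id < peer_id else -1
        masked = [(m + sign * p) % MODULUS for m, p in zip(masked, mask)]
    return masked


def aggregate(masked_updates):
    """Server side: sum the masked vectors coordinate by coordinate."""
    length = len(masked_updates[0])
    return [sum(u[i] for u in masked_updates) % MODULUS for i in range(length)]


# Three devices, each holding a small integer update vector.
updates = {0: [3, 10, 7], 1: [5, 1, 2], 2: [4, 4, 4]}
seeds = make_shared_seeds(sorted(updates), random.Random(0))
masked = [mask_update(d, u, seeds) for d, u in updates.items()]
total = aggregate(masked)
print(total)  # [12, 15, 13]
```

The server in this sketch only ever handles the entries of `masked`, yet `total` equals the coordinate-wise sum of the raw updates. If a device dropped out after the others had applied its pair masks, those masks would no longer cancel, which is the dropout problem the paper's protocol is designed to handle.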

For a business reader, secure aggregation is what makes federated learning trustworthy rather than merely decentralized. It means a coordinating party, whether a tech company or a consortium of hospitals, can build a shared model from many participants’ updates while being cryptographically unable to inspect any single participant’s contribution. It is now commonly combined with differential privacy: secure aggregation hides individual updates from the coordinator, while differential privacy limits what the aggregate itself can reveal about any one participant.
