Score-Based Generative Modeling through Stochastic Differential Equations

“Score-Based Generative Modeling through Stochastic Differential Equations,” posted to arXiv on November 26, 2020 by Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole, provided the unifying mathematical theory behind diffusion models. It cast image generation as a continuous-time process described by a stochastic differential equation, or SDE.

The framework works in two directions. A forward SDE gradually adds noise to data until it becomes pure random noise, and a corresponding reverse SDE removes the noise step by step to recover a sample. Crucially, the reverse process depends only on the score, the gradient of the noised data distribution, which a neural network is trained to estimate. The authors showed that this single formulation contained both the score-matching methods and the denoising diffusion models that had been developed separately, and that it enabled new tools, including a predictor-corrector sampler and an equivalent deterministic process that allows exact likelihood computation.

This paper gave the burgeoning field of diffusion models a common language and a flexible toolbox, and it directly influenced the samplers, guidance methods, and accelerations that made systems like Stable Diffusion practical. For a general reader, it is the theoretical backbone explaining why adding and then carefully removing noise turns out to be such a powerful way to generate images, audio, and other data.

Sources

Last verified June 7, 2026