Wasserstein GAN (WGAN)

“Wasserstein GAN,” posted to arXiv on January 26, 2017 by Martin Arjovsky, Soumith Chintala, and Léon Bottou, tackled one of the most frustrating problems with generative adversarial networks: training instability. Standard GANs were prone to mode collapse, where the generator produces only a few kinds of output, and their loss values told the practitioner almost nothing about whether training was progressing.

The paper traced these problems to the divergence that the original GAN objective implicitly minimizes (the Jensen-Shannon divergence, when the discriminator is optimal) and proposed replacing it with the Wasserstein, or earth-mover, distance. Intuitively, this distance measures how much work it would take to reshape one probability distribution into another. It keeps changing smoothly even when the real and generated distributions barely overlap, which is exactly the situation early in training and the case in which the Jensen-Shannon divergence saturates. The reformulation produced gradients that remained useful, reduced mode collapse, and yielded a loss curve that actually correlated with sample quality, making it possible to debug training.
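The contrast described above can be made concrete with a toy calculation. The sketch below is an illustration only, not code from the paper: it places a "real" and a "generated" point mass on a small 1-D grid and compares the earth-mover distance with the Jensen-Shannon divergence, the divergence associated with the original GAN objective. The helper names (js_divergence, wasserstein_1d, point_mass) and the grid are assumptions made for this example.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two discrete distributions."""
    def kl(a, b):
        # Terms with a_i == 0 contribute nothing to the KL sum.
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def wasserstein_1d(p, q, positions):
    """Earth-mover (Wasserstein-1) distance between histograms on sorted 1-D positions.

    In one dimension this equals the area between the two cumulative
    distribution functions.
    """
    total, cdf_gap = 0.0, 0.0
    for i in range(len(positions) - 1):
        cdf_gap += p[i] - q[i]
        total += abs(cdf_gap) * (positions[i + 1] - positions[i])
    return total


def point_mass(index, size):
    """Histogram with all probability on a single bin (hypothetical helper)."""
    return [1.0 if i == index else 0.0 for i in range(size)]


positions = list(range(11))  # illustrative grid: 0, 1, ..., 10
real = point_mass(0, len(positions))

# Move the generated point mass closer to the real one and watch both measures.
for shift in (8, 4, 1):
    fake = point_mass(shift, len(positions))
    print(shift, round(js_divergence(real, fake), 4),
          wasserstein_1d(real, fake, positions))
```

As long as the two point masses do not coincide, the Jensen-Shannon value stays at log 2 (about 0.6931) no matter how close they are, so it gives no signal about which direction is an improvement. The earth-mover distance falls from 8 to 4 to 1 as the generated mass approaches the real one.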

WGAN was a turning point for the reliability of adversarial generative models. Follow-up work refined it further by replacing the original weight clipping, which WGAN used to keep the critic approximately Lipschitz, with a gradient penalty. Many subsequent high-fidelity GANs adopted Wasserstein-style objectives or the lessons behind them. For a general reader, WGAN is a good example of how a more principled mathematical foundation can turn a temperamental research technique into something dependable enough to scale.
