Population Based Training of Neural Networks

“Population Based Training of Neural Networks” was submitted in November 2017 by Max Jaderberg and colleagues at DeepMind. It proposed tuning hyperparameters during training itself, interleaving the two rather than treating tuning as a separate search around complete training runs.

Population Based Training (PBT) trains a population of models in parallel, each with its own hyperparameters. Periodically the method evaluates the population: poorly performing members copy the weights and hyperparameters of better members (exploit), then perturb the copied hyperparameters (explore) and continue training from where the better members left off. Because hyperparameters can change over the course of a single run, PBT effectively discovers a schedule, for example a learning rate that rises and then falls, rather than a single fixed value. The authors demonstrated gains on deep reinforcement learning, machine translation, and generative adversarial network training, improving both convergence speed and final quality within a fixed compute budget.
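
The exploit and explore loop described above can be sketched on a toy problem. This is a minimal illustration, not the authors' implementation: the single number `theta` stands in for model weights, the loss is simply `theta` squared, and the only hyperparameter is a learning rate. The names `Member`, `train_step`, and `run_pbt`, the choice to replace the bottom 20% with copies of the top 20%, and the 0.8 or 1.2 perturbation factors are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Member:
    theta: float  # toy stand-in for model weights
    lr: float     # the hyperparameter being tuned


def loss(theta: float) -> float:
    # Toy objective with its minimum at theta = 0.
    return theta * theta


def train_step(m: Member) -> None:
    # One gradient-descent step; the gradient of theta^2 is 2 * theta.
    m.theta -= m.lr * 2.0 * m.theta


def run_pbt(pop_size=10, rounds=20, steps_per_round=5, frac=0.2, seed=0):
    rng = random.Random(seed)
    # Every member starts from the same weights with a random learning rate.
    population = [
        Member(theta=5.0, lr=10 ** rng.uniform(-4, -1)) for _ in range(pop_size)
    ]
    n_cut = max(1, int(pop_size * frac))

    for _ in range(rounds):
        # Train each member for a while with its current hyperparameters.
        for m in population:
            for _ in range(steps_per_round):
                train_step(m)

        # Evaluate and rank the population, best (lowest loss) first.
        ranked = sorted(population, key=lambda m: loss(m.theta))
        top, bottom = ranked[:n_cut], ranked[-n_cut:]

        for m in bottom:
            donor = rng.choice(top)
            m.theta = donor.theta                      # exploit: copy weights
            m.lr = donor.lr * rng.choice([0.8, 1.2])   # explore: perturb copy

    return sorted(population, key=lambda m: loss(m.theta))


if __name__ == "__main__":
    final = run_pbt()
    print("best loss:", loss(final[0].theta), "best lr:", final[0].lr)
```

Because a replaced member inherits both the weights and a perturbed copy of the learning rate, the learning rate along any surviving lineage can drift from round to round, which is the sense in which the method finds a schedule rather than a fixed value.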

PBT became a popular technique for problems where the best hyperparameters change during training and is implemented in tuning libraries such as Ray Tune.

For a business reader, PBT shows that the best settings for a system are not always static, and that letting good configurations propagate while exploring variations can outperform a one-time search.

Last verified June 7, 2026