“Neural Architecture Search with Reinforcement Learning” was submitted by Barret Zoph and Quoc V. Le of Google Brain in November 2016. It is the paper that popularized the term neural architecture search (NAS) and showed that the design of a neural network, normally a slow, expert-driven craft, could itself be learned.
The method trains a recurrent neural network, called the controller, to emit a description of a candidate architecture as a sequence of tokens (for example, the number of filters and the kernel size of each layer). Each candidate, or child network, is trained on the real task, and its validation accuracy serves as the reward for updating the controller with a policy-gradient reinforcement learning method (REINFORCE). Over many iterations the controller learns to assign higher probability to architectures that score well. On CIFAR-10 image classification the discovered network rivaled the best human-designed models of the time, and on Penn Treebank language modeling the discovered recurrent cell outperformed the widely used LSTM cell.
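The loop described above can be illustrated with a deliberately tiny sketch. It is not the paper's implementation. The controller here is one independent softmax per decision rather than a recurrent network, the two-decision search space is hypothetical, and `toy_reward` is an invented stand-in for the costly step of training a child network and measuring its validation accuracy. What it keeps is the core idea: sample an architecture, score it, and nudge the policy toward choices that beat a moving-average baseline.

```python
import math
import random

# Hypothetical toy search space (not from the paper): one choice per decision.
SEARCH_SPACE = {
    "num_filters": [16, 32, 64],
    "kernel_size": [1, 3, 5, 7],
}

def softmax(logits):
    """Turn raw preference scores into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_architecture(logits, rng):
    """Sample one option index per decision from the current policy."""
    arch = {}
    for name, options in SEARCH_SPACE.items():
        probs = softmax(logits[name])
        arch[name] = rng.choices(range(len(options)), weights=probs)[0]
    return arch

def toy_reward(arch):
    """Invented stand-in for 'train the child network, return validation accuracy'.
    By construction it peaks at 64 filters with kernel size 3."""
    filters = SEARCH_SPACE["num_filters"][arch["num_filters"]]
    kernel = SEARCH_SPACE["kernel_size"][arch["kernel_size"]]
    return 0.5 + 0.3 * (filters / 64) - 0.05 * abs(kernel - 3)

def run_search(steps=2000, lr=0.1, seed=0):
    """REINFORCE with an exponential-moving-average baseline."""
    rng = random.Random(seed)
    logits = {name: [0.0] * len(opts) for name, opts in SEARCH_SPACE.items()}
    baseline = None
    for _ in range(steps):
        arch = sample_architecture(logits, rng)
        reward = toy_reward(arch)
        if baseline is None:
            baseline = reward  # start the baseline at the first observed reward
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        for name, chosen in arch.items():
            probs = softmax(logits[name])
            for i, p in enumerate(probs):
                # Gradient of log-probability of the chosen option w.r.t. logit i.
                grad = (1.0 if i == chosen else 0.0) - p
                logits[name][i] += lr * advantage * grad
    return logits

def best_architecture(logits):
    """Report the option with the highest logit for each decision."""
    return {
        name: SEARCH_SPACE[name][max(range(len(vals)), key=vals.__getitem__)]
        for name, vals in logits.items()
    }

if __name__ == "__main__":
    print(best_architecture(run_search()))
```

In the real method every reward evaluation means training a full network, which is why the search was so costly; in this sketch the reward is a cheap formula, so thousands of iterations run in well under a second.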
The approach was famously expensive: the paper reports running the CIFAR-10 search on 800 GPUs concurrently, and searches took weeks. That cost sparked a wave of follow-up work on making NAS cheaper, including weight-sharing methods and differentiable search. It also helped seed Google’s later AutoML products.
For a business reader, this paper marks the moment the field began automating one of its most specialized human tasks, foreshadowing tools that let non-experts build custom models.