“NAS-Bench-101: Towards Reproducible Neural Architecture Search” was submitted to arXiv in February 2019 by Chris Ying, Aaron Klein, Esteban Real, Eric Christiansen, Kevin Murphy, and Frank Hutter. It addressed a serious credibility problem in the field: because neural architecture search methods typically trained their candidates from scratch on different hardware with different tricks, results were enormously expensive to reproduce and almost impossible to compare fairly.
The authors defined a constrained but rich search space of convolutional cells, roughly 423,000 unique architectures, and then exhaustively trained and evaluated every one of them on CIFAR-10. Because each architecture was trained several times, the resulting queryable table records metrics for more than five million trained models. With NAS-Bench-101, a researcher can look up the accuracy of any architecture in the space in milliseconds instead of spending GPU-hours training it, which means search algorithms can be benchmarked cheaply and on equal footing.
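To make the idea of a tabular benchmark concrete, here is a minimal sketch in Python. It is not the NAS-Bench-101 API or data: the table holds made-up accuracies, and the names `build_toy_table` and `random_search` are hypothetical, chosen only for illustration. It shows why a precomputed table makes evaluating a search method cheap: each "training run" becomes a dictionary lookup.

```python
import random


def build_toy_table(num_archs=1000, seed=0):
    """Return a made-up lookup table: architecture id -> validation accuracy.

    Assumption: a real benchmark stores results of actual training runs;
    these values are random placeholders.
    """
    rng = random.Random(seed)
    return {arch_id: rng.uniform(0.80, 0.95) for arch_id in range(num_archs)}


def random_search(table, budget, seed=0):
    """Sample `budget` distinct architectures and return the best one found."""
    rng = random.Random(seed)
    candidates = rng.sample(sorted(table), budget)
    # Each evaluation is a table lookup, not a training run.
    best = max(candidates, key=lambda arch_id: table[arch_id])
    return best, table[best]


if __name__ == "__main__":
    table = build_toy_table()
    best_arch, best_acc = random_search(table, budget=50)
    print(f"best architecture id: {best_arch}, accuracy: {best_acc:.4f}")
```

Any other search strategy could be swapped in for `random_search` and run against the same table with the same budget, which is what puts competing methods on equal footing.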
The dataset launched a family of tabular and surrogate NAS benchmarks (such as NAS-Bench-201) that became standard tools for evaluating new search methods.
For a business reader, this is an example of how a field matures: by building shared, reproducible benchmarks so that claims of progress can be checked rather than taken on faith.