AlphaDev finds faster sorting algorithms

On June 7, 2023, Google DeepMind published “Faster sorting algorithms discovered using deep reinforcement learning” in Nature, describing AlphaDev. Rather than playing a board game, AlphaDev treated the search for a faster program as a single-player game: it built short sorting routines instruction by instruction in low-level assembly, with a reward for producing correct, fast code. The agent itself was a descendant of AlphaZero, the reinforcement-learning system that had earlier mastered chess and Go.

Working at the assembly level let AlphaDev find shortcuts a human writing in a high-level language would not see. For sorting very short sequences its routines were up to 70 percent faster than the existing library code, and about 1.7 percent faster for sequences longer than 250,000 elements. The discovered algorithms were then contributed to the LLVM libc++ standard sorting library - the first change to that part of the library in over a decade, and the first such addition designed through reinforcement learning.

The result mattered because sorting is one of the most-run operations in computing. DeepMind estimated the new routines are executed trillions of times every day inside software around the world. AlphaDev showed that the game-playing recipe of self-play and search could be pointed at practical engineering problems, finding genuinely novel and useful algorithms rather than just winning games.