David Silver is a British AI researcher who led the reinforcement-learning team at DeepMind from 2013, and is a professor at University College London. He earned his Ph.D. on reinforcement learning at the University of Alberta in 2009. His own page describes his goal as building AI that learns for itself to solve problems humans cannot, the thread running through all of his major systems.
Silver was the lead researcher behind DeepMind’s landmark game-playing programs. He led AlphaGo, the first program to defeat a top professional at the full game of Go, by combining deep learning, reinforcement learning, tree search, and large-scale computing. He then led AlphaGo Zero and AlphaZero, which threw out human game records entirely and learned Go, and then chess and shogi, purely from self-play, surpassing every prior program. MuZero went further still, mastering games without even being told the rules in advance. His team also built AlphaStar, which reached professional level at StarCraft II. He received the 2019 ACM Prize in Computing for these advances.
This cluster of systems is the single largest covered area in the library, yet Silver was, until now, never named. For the reader, he is the connective figure behind those entries: AlphaGo’s famous “Move 37” against Lee Sedol, the self-play breakthroughs of AlphaZero, and the rule-free planning of MuZero are all his team’s work. They are the clearest demonstration that an agent learning from its own experience, rather than from human examples, can reach and then exceed the limits of human skill.