In July 2019, Noam Brown and Tuomas Sandholm, working with Facebook AI, unveiled Pluribus, an AI that beat elite human professionals at six-player no-limit Texas Hold’em. The result was published in Science under the title “Superhuman AI for multiplayer poker” (Brown and Sandholm, Vol. 365, No. 6456, pp. 885-890, 2019). Carnegie Mellon announced it on July 11, 2019.
This was a harder problem than the two-player poker that Libratus had conquered in 2017. In a two-player zero-sum game, a Nash equilibrium strategy cannot be beaten in the long run, but multiplayer games offer no such guarantee: with five opponents who may collude or play unpredictably, the clean game-theoretic guarantees disappear. As Brown explained, “Playing a six-player game rather than head-to-head requires fundamental changes in how the AI develops its playing strategy.” Pluribus learned its blueprint strategy by playing against copies of itself and then refined that strategy in real time during each hand.
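To make the idea of learning by self-play concrete, here is a minimal sketch of regret matching on rock-paper-scissors. It is a toy, not Pluribus's algorithm: the paper describes computing the blueprint with a form of Monte Carlo counterfactual regret minimization, of which regret matching is only the basic building block. The game, the function names, and the `initial_bias` parameter are all assumptions made for illustration. Because rock-paper-scissors is a two-player zero-sum game, the average strategy here is expected to approach the equilibrium (one third each), which is exactly the kind of guarantee that does not carry over to six-player poker.

```python
# Toy illustration only: regret-matching self-play on rock-paper-scissors.
# This is NOT Pluribus's training code; all names here are hypothetical.

ACTIONS = ("rock", "paper", "scissors")

# PAYOFF[a][b] is the payoff to a player choosing action a against action b.
PAYOFF = [
    [0, -1, 1],   # rock
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def regret_matching(regrets):
    """Turn cumulative regrets into a strategy proportional to positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        # No positive regret yet: fall back to a uniform strategy.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def self_play(iterations=50_000, initial_bias=1.0):
    """Run two regret-matching learners against each other.

    initial_bias is an arbitrary (assumed) head start on player 0's regret
    for "rock"; without some asymmetry both players would stay uniform forever.
    Returns each player's average strategy over all iterations.
    """
    n = len(ACTIONS)
    regrets = [[initial_bias, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * n, [0.0] * n]

    for _ in range(iterations):
        # Both players pick their strategies before either one updates.
        strategies = [regret_matching(r) for r in regrets]
        for player in (0, 1):
            mine = strategies[player]
            theirs = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's mix.
            action_values = [
                sum(PAYOFF[a][b] * theirs[b] for b in range(n))
                for a in range(n)
            ]
            expected = sum(mine[a] * action_values[a] for a in range(n))
            for a in range(n):
                # Regret: how much better action a would have done.
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += mine[a]

    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, avg in enumerate(self_play()):
        print(player, [round(p, 3) for p in avg])
```

The strategy played on any single iteration can swing widely; it is the average over all iterations that settles near the equilibrium, which is why the sketch accumulates `strategy_sums` rather than returning the final strategy.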
Pluribus performed significantly better than human pros over 10,000 hands, both when one copy faced five professionals and when five copies faced a single professional. Sandholm called it “a recognized milestone in artificial intelligence and in game theory that has been open for decades.” Notably, it was also cheap to build by the standards of the era: the paper reports that the blueprint was computed in eight days on a 64-core server, a small fraction of the compute used by earlier game-playing systems. This showed that superhuman multiplayer play did not require massive resources.