DeepStack reaches expert-level poker with deep learning

DeepStack, built by a team at the University of Alberta, Charles University, and Czech Technical University, was one of two systems that, in results reported in 2017, first beat professional players at heads-up no-limit Texas hold’em, the same imperfect-information game that Carnegie Mellon’s Libratus tackled around the same time. The work was published in Science under the title “DeepStack: Expert-Level Artificial Intelligence in Heads-Up No-Limit Poker” (Moravčík et al.), after appearing on arXiv in January 2017.

In a study of more than 44,000 hands against 33 professional players, eleven of whom completed the full 3,000 hands requested of each participant, DeepStack beat the humans as a group with statistical significance. Its core algorithm, which the authors called “continual re-solving,” combined three things: recursive reasoning to handle hidden information, decomposition to focus computation on the decision at hand, and a form of fast intuition learned from self-play with deep neural networks. To measure performance reliably despite poker’s high variance, the team used a variance-reduction technique called AIVAT.
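
To give a feel for why variance reduction matters when scoring a poker match, here is a minimal Python sketch. It is not AIVAT, which uses value estimates and correction terms for chance events and for the known strategy of the agent. It only illustrates the simpler underlying idea: subtracting a luck term with known zero expectation from each result. The toy model (winnings = skill edge + luck + noise), the function names, and all the numbers are assumptions made up for illustration.

```python
import random
import statistics


def simulate_hand(rng, skill_edge=0.05):
    """Return (winnings, luck) for one toy 'hand'.

    Hypothetical model: 'luck' stands in for how favourable the deal was
    and has expectation zero; 'skill_edge' is the true per-hand win rate
    we are trying to measure.
    """
    luck = rng.gauss(0.0, 1.0)    # large, zero-mean chance component
    noise = rng.gauss(0.0, 0.1)   # small leftover randomness
    return skill_edge + luck + noise, luck


def estimate(n_hands, seed=0):
    """Collect raw and luck-corrected per-hand results."""
    rng = random.Random(seed)
    raw, corrected = [], []
    for _ in range(n_hands):
        winnings, luck = simulate_hand(rng)
        raw.append(winnings)
        # Subtracting a zero-mean term leaves the expected value unchanged
        # but removes most of the spread.
        corrected.append(winnings - luck)
    return raw, corrected


raw, corrected = estimate(10_000)
print("raw:       mean", statistics.mean(raw), "stdev", statistics.stdev(raw))
print("corrected: mean", statistics.mean(corrected), "stdev", statistics.stdev(corrected))
```

Both estimators target the same skill edge, but the corrected one has a much smaller spread, so far fewer hands are needed to separate a real edge from luck.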

The approach differed from systems that precompute a giant blueprint strategy for the whole game. DeepStack instead reasoned locally during play, looking only a limited distance ahead and using its trained value networks to estimate the worth of situations it had not explicitly solved in advance. Together with Libratus, it marked the point at which AI began beating professional players at a major game in which players cannot see each other’s cards and bluffing matters.
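
The idea of searching only a few steps ahead and letting a learned estimate stand in for everything deeper can be sketched in Python as follows. This is a deliberately simplified, perfect-information toy and not DeepStack’s algorithm: it ignores hidden cards entirely, whereas DeepStack’s value networks work with ranges over possible hands. The tree, the function names, and the averaging heuristic that stands in for a trained network are all hypothetical.

```python
def average_leaf_heuristic(node):
    """Hypothetical stand-in for a trained value network.

    Estimates a subtree's worth as the average of its children's
    estimates, without reasoning about who moves where.
    """
    if not isinstance(node, list):
        return float(node)
    values = [average_leaf_heuristic(child) for child in node]
    return sum(values) / len(values)


def depth_limited_value(node, depth, value_fn, maximizing=True):
    """Search `depth` levels below `node`, then fall back on `value_fn`.

    Internal nodes are lists of children; leaves are numeric payoffs
    for the maximizing player.
    """
    if not isinstance(node, list):
        return float(node)          # true payoff reached
    if depth == 0:
        return value_fn(node)       # stop searching, trust the estimate
    child_values = [
        depth_limited_value(child, depth - 1, value_fn, not maximizing)
        for child in node
    ]
    return max(child_values) if maximizing else min(child_values)


# A small hand-made tree: the root has two moves, each leading to replies.
toy_tree = [[3, [5, 1]], [[2, 8], 4]]

full_search = depth_limited_value(toy_tree, 3, average_leaf_heuristic)
shallow_search = depth_limited_value(toy_tree, 1, average_leaf_heuristic)
print("full search:", full_search)
print("one-step lookahead with estimate:", shallow_search)
```

The shallow search does far less work but its answer is only as good as the estimator at the cut-off, which is why the quality of the learned value function matters so much in this style of play.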