Parables on the Power of Planning in AI: From Poker to Diplomacy

This is Noam Brown’s talk “Parables on the Power of Planning in AI: From Poker to Diplomacy,” delivered as a distinguished lecture at the University of Washington’s Paul G. Allen School and posted on its official YouTube channel in May 2024. Brown led the research behind Libratus and Pluribus, which beat top human professionals at poker, and Cicero, which played the negotiation game Diplomacy at a human level.

His through-line is that search and planning (letting a system deliberate before it acts) have repeatedly been the missing ingredient behind superhuman game-playing AI. He walks from Deep Blue through AlphaGo to his own poker and Diplomacy systems, drawing out the lesson that spending more computation at decision time can be worth far more than a larger model alone. The talk landed just before OpenAI’s reasoning models made test-time compute a mainstream topic, and Brown went on to work on that effort.

For a general or technical reader, this is a clear firsthand explanation of why “thinking longer” became a central idea in AI. It connects directly to the o1 reasoning milestone and to the broader shift toward models that reason step by step rather than answer instantly.
