“Tree of Thoughts: Deliberate Problem Solving with Large Language Models,” submitted to arXiv on May 17, 2023 by Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, and Karthik Narasimhan, generalized chain-of-thought from a single line of reasoning into a search over many.
Chain-of-thought prompting produces one left-to-right sequence of reasoning steps. Tree of Thoughts (ToT) instead treats intermediate “thoughts” (coherent chunks of text that represent partial progress) as nodes in a tree. At each step the model generates several candidate next thoughts and evaluates them itself. A classic search strategy such as breadth-first or depth-first search then explores the promising branches, looks ahead, and backtracks from dead ends. This turns problem solving into deliberate decision making rather than a single uninterrupted generation.
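The generate, evaluate, and search loop can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `expand`, `propose`, `evaluate`, `reachable`, and `tree_of_thoughts_bfs` are hypothetical, and the two model calls are replaced by plain functions so the example runs without a language model. Here `propose` enumerates every arithmetic step on a Game of 24 instance, and `evaluate` is a brute-force lookahead standing in for the model's own judgment of a partial solution.

```python
from itertools import combinations
from typing import Callable, Iterator, List, Tuple

Numbers = Tuple[float, ...]
State = Tuple[Numbers, Tuple[str, ...]]  # (numbers still available, steps taken so far)


def expand(nums: Numbers) -> Iterator[Tuple[Numbers, str]]:
    """Yield every result of combining two of the numbers with one operation."""
    for i, j in combinations(range(len(nums)), 2):
        a, b = nums[i], nums[j]
        rest = tuple(n for k, n in enumerate(nums) if k not in (i, j))
        results = [(a + b, f"{a:g} + {b:g}"), (a - b, f"{a:g} - {b:g}"),
                   (b - a, f"{b:g} - {a:g}"), (a * b, f"{a:g} * {b:g}")]
        if abs(b) > 1e-9:
            results.append((a / b, f"{a:g} / {b:g}"))
        if abs(a) > 1e-9:
            results.append((b / a, f"{b:g} / {a:g}"))
        for value, text in results:
            yield rest + (value,), f"{text} = {value:g}"


def propose(state: State) -> List[State]:
    """Stand-in for the model call that suggests candidate next thoughts."""
    nums, steps = state
    return [(child, steps + (text,)) for child, text in expand(nums)]


def reachable(nums: Numbers, target: float = 24.0) -> bool:
    """Brute-force check: can the remaining numbers still reach the target?"""
    if len(nums) == 1:
        return abs(nums[0] - target) < 1e-6
    return any(reachable(child, target) for child, _ in expand(nums))


def evaluate(state: State) -> float:
    """Stand-in for the model call that scores a partial solution."""
    return 1.0 if reachable(state[0]) else 0.0


def tree_of_thoughts_bfs(root: State,
                         propose: Callable[[State], List[State]],
                         evaluate: Callable[[State], float],
                         depth: int, breadth: int) -> List[State]:
    """Breadth-first search that keeps only the `breadth` best states per level."""
    frontier = [root]
    for _ in range(depth):
        candidates = [child for state in frontier for child in propose(state)]
        candidates.sort(key=evaluate, reverse=True)  # most promising first
        frontier = candidates[:breadth]              # prune everything else
    return frontier


# Four numbers need three combining steps; breadth=5 is an arbitrary choice here.
best = tree_of_thoughts_bfs(((4.0, 9.0, 10.0, 13.0), ()), propose, evaluate,
                            depth=3, breadth=5)
print(best[0][1])  # the three steps of one solution that reaches 24
```

In a real system, `propose` and `evaluate` would each be prompts sent to the language model, and the evaluator would be a noisy judgment rather than an exact check. Pruning to a fixed breadth is what keeps the number of explored branches bounded.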
The payoff is largest on tasks that need planning. On the Game of 24, a number puzzle, GPT-4 with standard chain-of-thought solved only 4 percent of instances, while the same model with Tree of Thoughts solved 74 percent. ToT also improved performance on creative writing and mini crosswords. The cost is that it issues many more model calls and requires a way to score partial solutions.
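To get a feel for the extra cost, the number of model calls in a breadth-first run can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope model, not a measurement from the paper: the function name `count_bfs_calls` and all four parameter values are assumptions chosen for illustration. It assumes one proposal call per kept state and a fixed number of evaluation calls per candidate.

```python
def count_bfs_calls(depth: int, breadth: int,
                    proposals_per_state: int, evals_per_candidate: int) -> int:
    """Estimate model calls for a breadth-limited tree search.

    Assumes (hypothetically) one proposal call per kept state, and
    `evals_per_candidate` scoring calls for every candidate it returns.
    """
    frontier = 1  # the search starts from a single root state
    total = 0
    for _ in range(depth):
        candidates = frontier * proposals_per_state
        total += frontier                          # proposal calls
        total += candidates * evals_per_candidate  # evaluation calls
        frontier = min(breadth, candidates)        # states kept for the next level
    return total


# Illustrative settings only; a single chain-of-thought answer is one call.
calls = count_bfs_calls(depth=3, breadth=5, proposals_per_state=8, evals_per_candidate=3)
print(calls)  # 275 under these assumed settings
```

Under these assumed settings the search costs a few hundred calls per problem against one for a single chain-of-thought answer. Most of that comes from evaluation, which is why the approach depends on a usable way to score partial solutions.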
Why business readers should care: Tree of Thoughts shows that wrapping a language model in a search loop can convert near-total failure into majority success on planning problems. It offers a template for deciding when a problem is worth many model calls to solve well.