Reinforcement Learning: An Introduction (Sutton and Barto)

“Reinforcement Learning: An Introduction” by Richard Sutton and Andrew Barto is the standard textbook of the field. The first edition was published by MIT Press in 1998, and a substantially expanded second edition followed in 2018. The book lays out the subject from first principles: the agent-environment loop, Markov decision processes, dynamic programming, Monte Carlo methods, temporal-difference learning, function approximation, and policy-gradient methods.

What sets the book apart, beyond its clarity, is that the authors have always made the full text freely available online from Sutton’s own website, along with code, exercises, and teaching slides. That open availability helped it become the entry point for a generation of students and practitioners and made reinforcement learning one of the more approachable corners of machine learning.

The textbook is closely tied to the authors’ own research. Many of the methods it explains, including temporal-difference learning and actor-critic algorithms, were introduced by Sutton and Barto themselves. When the two received the 2024 Turing Award, the book was cited as part of the body of work that built the conceptual and algorithmic foundations of the field.