DreamerV2: Mastering Atari with Discrete World Models

DreamerV2 was introduced in “Mastering Atari with Discrete World Models,” posted to arXiv on October 5, 2020 by Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, and Jimmy Ba. It extended the original Dreamer agent and became the first model-based reinforcement learning system to reach human-level performance across the Atari benchmark while learning behaviors entirely inside a learned world model.

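The key change from the original Dreamer was the move to discrete representations. Instead of describing the latent state of the world with continuous Gaussian variables, DreamerV2 represents each latent state as a vector of categorical variables (32 variables with 32 classes each in the paper). The authors suggest that discrete latents may be a better fit for the sharp, abrupt transitions common in video games, although the paper offers this as a hypothesis rather than a settled explanation. Trained from pixels alone on a single GPU, it surpassed the final performance of the top single-GPU model-free agents, Rainbow and IQN, on the 55-task Atari benchmark, and it also handled continuous control problems such as humanoid locomotion.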
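To make the idea of a discrete latent state concrete, here is a minimal sketch of the sampling step only. It assumes the 32 categorical variables with 32 classes each reported in the paper. Random logits stand in for the output of a learned encoder, and the function names are hypothetical rather than taken from the authors' code. It shows the forward pass alone. The paper trains through the sample using straight-through gradients, which plain Python cannot illustrate.

```python
import math
import random

NUM_VARIABLES = 32  # categorical variables per latent state (value reported in the paper)
NUM_CLASSES = 32    # classes per variable (value reported in the paper)


def softmax(logits):
    """Convert a list of logits into probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_discrete_latent(logits, rng):
    """Sample one class per categorical variable; return a flat one-hot vector.

    `logits` is a list of rows, one row of class logits per variable.
    Hypothetical helper for illustration only.
    """
    latent = []
    for row in logits:
        probs = softmax(row)
        # Draw a single class index according to the categorical distribution.
        index = rng.choices(range(len(row)), weights=probs, k=1)[0]
        one_hot = [0.0] * len(row)
        one_hot[index] = 1.0
        latent.extend(one_hot)
    return latent


rng = random.Random(0)
# Assumption: random logits replace the encoder network's output.
logits = [
    [rng.gauss(0.0, 1.0) for _ in range(NUM_CLASSES)]
    for _ in range(NUM_VARIABLES)
]
latent = sample_discrete_latent(logits, rng)
print(len(latent), int(sum(latent)))
```

The resulting vector has 32 × 32 = 1024 entries, exactly 32 of which are one. Each group of 32 entries commits to a single class, so the latent cannot blend between alternatives the way a continuous vector can.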

DreamerV2 was significant because model-based methods had long lagged behind model-free ones on the Atari benchmark, and its results closed that gap. For a general reader, it is evidence that an agent which learns its own model of how a game works can play just as well as one that learns only by direct trial and error.

Last verified June 7, 2026