Dreamer: Learning Behaviors by Latent Imagination

Dreamer was introduced in “Dream to Control: Learning Behaviors by Latent Imagination,” posted to arXiv on December 3, 2019 by Danijar Hafner, Timothy Lillicrap, Jimmy Ba, and Mohammad Norouzi. It is a model-based reinforcement learning agent that solves long-horizon tasks from raw images by learning a compact internal model of the world and then learning its behavior inside that model.

The agent first learns a world model that compresses observations into a small latent state and predicts how that state evolves. It then improves its behavior almost entirely in imagination: it rolls out imagined trajectories in the latent space and propagates analytic gradients of learned state values back through those trajectories to refine the policy. Because imagined rollouts are far cheaper than acting in the real environment, Dreamer needs comparatively little real experience: on 20 visual control tasks, the paper reports that it exceeds prior approaches in data efficiency, computation time, and final performance.
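The idea of backpropagating value gradients through an imagined rollout can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration and is not taken from the paper: the latent state is a single number, the dynamics, reward, and value function are hand-picked formulas rather than learned networks, the policy is a single gain `k`, and the derivative is tracked by hand with the chain rule where a real implementation would rely on automatic differentiation. The names `imagine_return` and `improve_policy` are hypothetical.

```python
def imagine_return(k, z0, horizon=5, a=1.0, b=0.5, gamma=0.99, c=1.0):
    """Roll out a toy scalar latent model under the policy u = k * z.

    Returns (objective, d_objective/d_k), where the objective is the
    discounted sum of imagined rewards plus a bootstrapped value at the
    final imagined state. All constants are illustrative assumptions.
    """
    z, dz = z0, 0.0              # latent state and its derivative w.r.t. k
    total, dtotal = 0.0, 0.0     # objective and its derivative w.r.t. k
    discount = 1.0
    for _ in range(horizon):
        u = k * z                # toy policy (stand-in for an actor network)
        du = z + k * dz          # product rule
        z, dz = a * z + b * u, a * dz + b * du   # toy latent dynamics
        total += discount * (-(z * z))           # toy reward r(z) = -z^2
        dtotal += discount * (-2.0 * z * dz)
        discount *= gamma
    # Bootstrap beyond the horizon with a toy value function v(z) = -c * z^2.
    total += discount * (-c * z * z)
    dtotal += discount * (-2.0 * c * z * dz)
    return total, dtotal


def improve_policy(k=0.0, z0=1.0, steps=200, lr=0.05):
    """Gradient ascent on the imagined objective; no real environment steps."""
    for _ in range(steps):
        _, grad = imagine_return(k, z0)
        k += lr * grad
    return k
```

With these toy constants the next state is `(a + b * k) * z`, so `improve_policy()` pushes `k` toward -2, the gain that sends the latent state to zero in a single step. The point of the sketch is only the mechanism: the policy parameter is updated from gradients that flow through every imagined transition and through the value estimate at the end of the rollout.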

Dreamer launched an influential line of work: its successors, DreamerV2 and DreamerV3, made model-based RL competitive with the best model-free agents. For a general reader, the appeal is intuitive: like a person rehearsing a plan in their head, the agent practices in an internal simulation before acting in the world.
