Voyager: An Open-Ended Embodied Agent with Large Language Models

“Voyager: An Open-Ended Embodied Agent with Large Language Models,” posted to arXiv on May 25, 2023 by Guanzhi Wang, Yuqi Xie, Yunfan Jiang, Ajay Mandlekar, Chaowei Xiao, Yuke Zhu, Linxi Fan, and Anima Anandkumar (a team including NVIDIA and Caltech researchers), described what its authors called the first LLM-powered embodied lifelong-learning agent in Minecraft. Voyager plays the game continuously, deciding its own goals, acquiring skills, and making discoveries without human intervention.

It works through three pieces, all running on GPT-4 via plain API calls with no fine-tuning. An automatic curriculum proposes increasingly difficult tasks suited to the agent’s current state. A growing skill library stores each capability the agent develops as a piece of executable code, so that mastered skills can be retrieved and composed rather than relearned. And an iterative prompting loop feeds environment feedback, execution errors, and self-verification back into the model to refine its programs. Storing skills as code is the key idea: the agent literally writes its own ever-expanding toolbox.
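The skill library and the iterative prompting loop can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names SkillLibrary and refine_until_verified, the bag-of-words similarity used for retrieval, and the generate, execute, and verify callables (placeholders for the model call, the game environment, and the self-verification step) are all assumptions made for illustration. Skills are held as opaque strings of program text, and the automatic curriculum is left out.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


def _bag_of_words(text: str) -> Counter:
    """Toy text representation; an assumed stand-in for a real retrieval method."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


@dataclass
class SkillLibrary:
    """Hypothetical store mapping a skill description to its program text."""

    skills: Dict[str, str] = field(default_factory=dict)

    def add(self, description: str, code: str) -> None:
        self.skills[description] = code

    def retrieve(self, task: str, k: int = 2) -> List[str]:
        """Return the code of the k skills whose descriptions best match the task."""
        query = _bag_of_words(task)
        ranked = sorted(
            self.skills,
            key=lambda d: _cosine(query, _bag_of_words(d)),
            reverse=True,
        )
        return [self.skills[d] for d in ranked[:k]]


def refine_until_verified(
    task: str,
    library: SkillLibrary,
    generate: Callable[[str, List[str], str], str],
    execute: Callable[[str], Tuple[bool, str]],
    verify: Callable[[str, str], bool],
    max_rounds: int = 4,
) -> Optional[str]:
    """Regenerate a program, feeding back the last run's output, until it verifies."""
    feedback = ""
    for _ in range(max_rounds):
        context = library.retrieve(task)          # reuse mastered skills
        code = generate(task, context, feedback)  # model call in a real system
        ran_ok, feedback = execute(code)          # environment feedback or error text
        if ran_ok and verify(task, feedback):     # stand-in for self-verification
            library.add(task, code)               # the verified program becomes a skill
            return code
    return None


def demo() -> Tuple[SkillLibrary, Optional[str]]:
    """Run the loop with hand-written fakes in place of a model and a game."""
    library = SkillLibrary()
    library.add("craft a wooden pickaxe", "def craft_wooden_pickaxe(bot): ...")
    library.add("harvest wheat from a farm", "def harvest_wheat(bot): ...")

    def fake_generate(task: str, context: List[str], feedback: str) -> str:
        # The first attempt forgets the tool; the retry reacts to the error text.
        if "pickaxe" in feedback:
            return "def mine_stone(bot):\n    craft_wooden_pickaxe(bot)\n    bot.dig('stone')"
        return "def mine_stone(bot):\n    bot.dig('stone')"

    def fake_execute(code: str) -> Tuple[bool, str]:
        if "craft_wooden_pickaxe" not in code:
            return False, "error: cannot dig stone without a pickaxe"
        return True, "inventory: cobblestone x1"

    def fake_verify(task: str, feedback: str) -> bool:
        return "cobblestone" in feedback

    result = refine_until_verified(
        "mine stone with a pickaxe", library, fake_generate, fake_execute, fake_verify
    )
    return library, result
```

In demo(), the first generated program fails, its error message is passed into the second attempt, and the corrected program is stored under the task description once it passes the check. Because the stored program calls an earlier skill by name, it also shows how a library of code lets new skills be composed from old ones.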

The reported gains were large. Compared with prior state-of-the-art methods, Voyager collected 3.3 times more unique items, traveled 2.3 times farther, and unlocked key tech-tree milestones up to 15.3 times faster. It could also carry its learned skill library into a fresh Minecraft world and use it to solve new tasks from scratch. The paper became a touchstone for open-ended, self-improving agents and for the pattern of treating generated code as the unit of an agent’s memory and skill.
