On November 4, 2024, Microsoft Research’s AI Frontiers lab announced Magentic-One, a generalist multi-agent system for completing complex, open-ended tasks that span the web, the file system, and code execution. Rather than relying on one monolithic agent, it coordinates a small team of specialists under a lead agent.
The lead is an Orchestrator that decomposes a task and assigns work, guided by two explicit records. A Task Ledger holds the known facts, assumptions, and the current plan; a Progress Ledger tracks how the work is advancing and which agent is doing what. The Orchestrator directs four specialized agents: a WebSurfer that drives a Chromium browser, a FileSurfer that reads local files and navigates directories, a Coder that writes and analyzes code, and a ComputerTerminal that runs programs and installs libraries. If the Progress Ledger shows that the work has stalled, the Orchestrator can revise the plan in the Task Ledger and reassign work. Microsoft reported results that were competitive with the prior state of the art on the GAIA, AssistantBench, and WebArena benchmarks and well above those of GPT-4 used on its own. The system was built on top of the AutoGen framework, and its modular design lets agents be added or removed without reworking the rest of the system.
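The two-ledger loop described above can be illustrated with a small, self-contained Python sketch. It is not Magentic-One's code and does not use the AutoGen API. The class fields, the `run` function, the `max_stalls` and `max_replans` thresholds, and the toy agents are all hypothetical names chosen for illustration; only the ledger and agent names come from the description above. The sketch assumes a plan is a fixed list of (agent, instruction) steps, whereas a real orchestrator would choose the next agent dynamically with a language model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical agent signature: instruction -> (made_progress, result text).
Agent = Callable[[str], Tuple[bool, str]]


@dataclass
class TaskLedger:
    """Known facts, assumptions, and the current plan."""
    facts: List[str] = field(default_factory=list)
    assumptions: List[str] = field(default_factory=list)
    plan: List[Tuple[str, str]] = field(default_factory=list)  # (agent, instruction)


@dataclass
class ProgressLedger:
    """Which agent did what, and how long the work has been stalled."""
    log: List[Tuple[str, str, bool]] = field(default_factory=list)
    stall_count: int = 0


def run(goal: str,
        plan_fn: Callable[[str, TaskLedger], List[Tuple[str, str]]],
        agents: Dict[str, Agent],
        max_stalls: int = 2,      # assumed threshold, not from the source
        max_replans: int = 3):    # assumed threshold, not from the source
    task = TaskLedger()
    history: List[ProgressLedger] = []
    for _ in range(max_replans + 1):
        # Outer loop: (re)build the plan from what the Task Ledger knows.
        task.plan = plan_fn(goal, task)
        progress = ProgressLedger()
        step = 0
        # Inner loop: hand each step to its agent until done or stalled.
        while step < len(task.plan) and progress.stall_count < max_stalls:
            name, instruction = task.plan[step]
            ok, result = agents[name](instruction)
            progress.log.append((name, instruction, ok))
            if ok:
                task.facts.append(result)
                progress.stall_count = 0
                step += 1
            else:
                progress.stall_count += 1
        history.append(progress)
        if step == len(task.plan):
            return True, task, history
        # Stalled: record what was learned so the next plan can differ.
        task.assumptions.append(f"stalled on: {task.plan[step][1]}")
    return False, task, history


# Toy stand-ins for two of the specialists.
def web_surfer(instruction: str) -> Tuple[bool, str]:
    if "site-a" in instruction:
        return False, "site-a unreachable"
    return True, f"fetched: {instruction}"


def coder(instruction: str) -> Tuple[bool, str]:
    return True, f"code written: {instruction}"


def plan_fn(goal: str, ledger: TaskLedger) -> List[Tuple[str, str]]:
    # Switch sources once the ledger records a stalled attempt.
    source = "site-b" if ledger.assumptions else "site-a"
    return [("WebSurfer", f"open {source}"), ("Coder", f"summarize {goal}")]


done, ledger, history = run(
    "quarterly report", plan_fn, {"WebSurfer": web_surfer, "Coder": coder}
)
print(done, len(history), ledger.facts)
```

In this toy run the first plan stalls because the WebSurfer stand-in cannot reach its source, the stall is written into the Task Ledger, and the second plan succeeds with a different source. Keeping the plan and the progress log in separate records is what lets the outer loop replan without losing facts gathered earlier.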
Magentic-One was a notable demonstration that the orchestrator-plus-specialists structure, paired with explicit ledgers for planning and progress, could handle the messy multi-step tasks that single agents struggle with. The ledger mechanism in particular became a reference point for how to keep a multi-agent system on track over a long horizon.