“ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models” was posted to arXiv on May 23, 2023 by Binfeng Xu, Zhiyuan Peng, Bowen Lei, Subhabrata Mukherjee, Yuchen Liu, and Dongkuan Xu. It addresses a cost problem in the dominant ReAct-style agent loop, where the model pauses after each thought to call a tool, reads the result, and feeds the whole growing transcript back into the next prompt. Because every step re-sends everything that came before it, token usage climbs with each additional tool call.
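The cost pattern can be made concrete with a toy calculation. The sketch below is not taken from the paper: the function name and the token sizes (`base_prompt`, `thought_tokens`, `observation_tokens`) are made-up illustrative values, and it assumes each step re-sends the entire transcript as its prompt.

```python
def react_prompt_tokens(steps, base_prompt=200, thought_tokens=50,
                        observation_tokens=300):
    """Total prompt tokens sent across a ReAct-style loop.

    All sizes are hypothetical; only the growth pattern matters.
    """
    total = 0
    transcript = base_prompt
    for _ in range(steps):
        # The whole transcript so far is sent as the prompt for this step.
        total += transcript
        # The new thought and the tool observation are appended to it.
        transcript += thought_tokens + observation_tokens
    return total


for n in (1, 2, 4, 8):
    print(n, react_prompt_tokens(n))
```

Under these assumptions the total grows roughly with the square of the number of steps, since each new prompt contains all earlier observations.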
ReWOO separates planning from execution. A planner module reasons through the entire problem and writes out a full plan of tool calls in advance, using placeholders where it expects to slot in the tool outputs. A worker module then executes the tool calls, and a solver combines the plan with the gathered evidence to produce the answer. Because the bulky tool observations are no longer threaded back through every reasoning step, the prompt stays small. The authors reported about five times better token efficiency on the HotpotQA multi-step reasoning benchmark along with a 4 percent accuracy gain, and showed that the planner’s reasoning could be distilled from a 175B-parameter GPT-3.5 into a 7B-parameter LLaMA model.
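The plan, execute, and solve flow can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the plan text is hard-coded where a planner LLM call would go, `FAKE_INDEX` and `search` stand in for a real tool, the `#E1 = Tool[input]` line format is only loosely modeled on the paper's notation, and the solver step stops at building the prompt instead of calling a model.

```python
import re

# Hypothetical planner output, written in one pass before any tool runs.
# "#E<n>" placeholders mark where earlier tool results will be slotted in.
PLAN_TEXT = """\
Plan: Find who wrote the novel.
#E1 = Search[author of Dune]
Plan: Look up that person's birth year.
#E2 = Search[birth year of #E1]
"""

# Stand-in for a real search tool (assumption for illustration only).
FAKE_INDEX = {
    "author of Dune": "Frank Herbert",
    "birth year of Frank Herbert": "1920",
}


def search(query):
    return FAKE_INDEX.get(query, "unknown")


TOOLS = {"Search": search}
STEP_RE = re.compile(r"^(#E\d+)\s*=\s*(\w+)\[(.*)\]$")


def parse_plan(text):
    """Extract (variable, tool, argument) triples from the plan text."""
    steps = []
    for line in text.splitlines():
        match = STEP_RE.match(line.strip())
        if match:
            steps.append(match.groups())
    return steps


def run_worker(steps):
    """Execute tool calls in order, filling placeholders with evidence."""
    evidence = {}
    for var, tool, arg in steps:
        filled = re.sub(r"#E\d+", lambda m: evidence[m.group(0)], arg)
        evidence[var] = TOOLS[tool](filled)
    return evidence


def build_solver_prompt(question, plan_text, evidence):
    """Combine the plan and evidence into the single prompt a solver sees."""
    lines = [f"Question: {question}", plan_text.strip()]
    lines += [f"{var}: {value}" for var, value in evidence.items()]
    return "\n".join(lines)


steps = parse_plan(PLAN_TEXT)
evidence = run_worker(steps)
solver_prompt = build_solver_prompt(
    "When was the author of Dune born?", PLAN_TEXT, evidence
)
print(solver_prompt)
```

In this sketch a language model would be called only twice, once to plan and once to solve, and the tool observations appear in a prompt only at the final step.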
ReWOO is a standard reference for the plan-and-execute family of agent designs, which trade the flexibility of step-by-step interleaving for lower cost and latency. The tradeoff is that a plan fixed in advance adapts less gracefully when a tool returns something unexpected, so the pattern fits tasks whose structure can be anticipated.