Published in May 2024 and presented at NeurIPS 2024, SWE-agent introduced the Agent-Computer Interface (ACI) — a purpose-built set of tools designed to mediate between language models and software engineering environments. Rather than giving models generic filesystem and shell access, the ACI provided structured commands optimised for code navigation, editing, and debugging.
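To make the contrast with raw shell access concrete, here is a minimal sketch of what an ACI-style tool could look like. It is not SWE-agent's actual implementation: the class `WindowedFile` and its methods `view`, `goto`, `scroll_down`, and `edit` are hypothetical names chosen for illustration, and the file is held in memory to keep the example self-contained. It shows two ideas: the model sees a small numbered window of a file rather than the whole file, and an edit that would leave the file syntactically broken is rejected with a short explanation.

```python
import ast
from dataclasses import dataclass


@dataclass
class WindowedFile:
    """Hypothetical ACI-style view over one Python source file held in memory."""

    lines: list[str]
    window: int = 5  # number of lines shown per observation
    top: int = 0     # zero-based index of the first visible line

    def view(self) -> str:
        """Return the current window with 1-based line numbers and a header."""
        end = min(self.top + self.window, len(self.lines))
        header = f"[lines {self.top + 1}-{end} of {len(self.lines)}]"
        body = [f"{i + 1}: {self.lines[i]}" for i in range(self.top, end)]
        return "\n".join([header, *body])

    def goto(self, line_no: int) -> str:
        """Move the window so it starts at the given 1-based line (clamped)."""
        self.top = max(0, min(line_no - 1, len(self.lines) - 1))
        return self.view()

    def scroll_down(self) -> str:
        """Advance the window by one full window length."""
        return self.goto(self.top + 1 + self.window)

    def edit(self, start: int, end: int, replacement: str) -> str:
        """Replace 1-based inclusive lines start..end.

        The edit is applied only if the resulting file still parses as
        Python; otherwise the file is left untouched and the caller gets
        a short message describing the syntax error.
        """
        candidate = (
            self.lines[: start - 1] + replacement.split("\n") + self.lines[end:]
        )
        try:
            ast.parse("\n".join(candidate))
        except SyntaxError as err:
            return f"Edit rejected: {err.msg} (line {err.lineno}). File unchanged."
        self.lines = candidate
        return self.goto(start)
```

In this sketch every command returns a compact text observation, so the model always receives feedback about the state of the file after acting. Calling `edit(2, 2, "    return (a + b")` on a small file, for example, would return a rejection message and leave the file as it was, whereas a plain shell redirect would silently write the broken code.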
The Princeton NLP and Stanford team benchmarked SWE-agent on SWE-bench, a dataset of real GitHub issues drawn from popular open-source Python repositories. Running on GPT-4 Turbo, the agent resolved 12.5% of the benchmark's issues, which was state of the art at the time of publication. The result demonstrated that structured tool design, and not just raw model capability, was critical to autonomous software engineering success.
SWE-agent’s ACI concept became highly influential. The insight that the interface between the model and the computer environment matters as much as the model itself shaped subsequent agent designs, among them mini-SWE-agent (which achieved >74% on SWE-bench Verified using only bash), OpenHands, and commercial agents such as Devin. The project also helped establish SWE-bench as the standard evaluation for autonomous software engineering.