“MemGPT: Towards LLMs as Operating Systems” was posted to arXiv on October 12, 2023 by Charles Packer, Sarah Wooders, Kevin Lin, Vivian Fang, Shishir G. Patil, Ion Stoica, and Joseph E. Gonzalez, from UC Berkeley. It tackles a basic limitation of language model agents: a fixed context window means they forget anything that scrolls out of view, which breaks long documents and long-running conversations.
The paper’s idea is virtual context management, borrowed directly from how operating systems handle memory. Just as an OS pages data between fast main memory and slower disk to give programs the illusion of more memory than physically exists, MemGPT moves information between the model’s limited context (its “main memory”) and external storage, using function calls and interrupts so the agent can decide what to keep in view and what to swap out. The authors demonstrated the approach on document analysis over texts that exceed the context window and on multi-session chat where the agent maintains persistent memory and an evolving persona across conversations.
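The paging analogy can be made concrete with a small sketch. This is an illustration of the general idea only, not MemGPT's actual code or API: the class `VirtualContext`, its methods `append`, `archival_search`, and `page_in`, the word-count tokenizer, and the oldest-first eviction rule are all assumptions chosen for brevity.

```python
from collections import deque
from dataclasses import dataclass, field


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer (assumption): one token per word."""
    return len(text.split())


@dataclass
class VirtualContext:
    """Hypothetical two-tier memory: a bounded main context plus an archive."""
    budget: int                                   # max tokens allowed in main context
    main: deque = field(default_factory=deque)    # in-context messages ("main memory")
    archive: list = field(default_factory=list)   # external storage ("disk")

    def used(self) -> int:
        """Tokens currently occupying the main context."""
        return sum(count_tokens(m) for m in self.main)

    def append(self, message: str) -> None:
        """Add a message; page the oldest messages out while over budget."""
        self.main.append(message)
        while self.used() > self.budget and len(self.main) > 1:
            self.archive.append(self.main.popleft())

    def archival_search(self, query: str) -> list:
        """A function the agent could call: archived messages containing the query."""
        q = query.lower()
        return [m for m in self.archive if q in m.lower()]

    def page_in(self, query: str) -> int:
        """Copy matching archived messages back into main context; return the count."""
        hits = self.archival_search(query)
        for m in hits:
            self.append(f"[recalled] {m}")
        return len(hits)


if __name__ == "__main__":
    ctx = VirtualContext(budget=10)
    ctx.append("my name is Ada")
    ctx.append("I like green tea")
    ctx.append("what is the weather today")   # pushes the oldest message to the archive
    ctx.page_in("Ada")                        # the agent decides to recall it
    print(list(ctx.main))
    print(ctx.archive)
```

In this toy version eviction is automatic and recall is a plain substring match. The point it shares with the paper's description is the division of labor: the context window holds only what fits, everything else stays retrievable in external storage, and the agent brings information back into view through explicit function calls.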
MemGPT became the foundation of the open-source agent-memory project later commercialized as Letta, and it is one of the most-cited treatments of agent memory: the question of how an agent remembers facts, preferences, and past actions across sessions. That question is central to building assistants that feel continuous rather than amnesiac.