Agent Memory

Agent memory refers to the techniques that let an AI agent hold onto information beyond the few thousand tokens it can keep in its context window at any moment. A language model on its own is stateless between calls and forgetful within a long session: once early messages scroll out of the window, they are gone. Memory systems give an agent a way to write important facts to durable storage and read them back later, so it can remember a user, a project, or its own past decisions.

A foundational example is MemGPT, whose 2023 paper proposed treating the model like an operating system managing limited RAM. It organizes information into tiers, a small in-context working memory plus larger external storage, and has the model itself decide what to keep close at hand and what to page out, paging relevant facts back in when needed. Other approaches lean on retrieval, embedding past interactions in a vector database and fetching the most relevant ones for the current task, and on structured summaries that compress long histories into compact notes. The Generative Agents work used a memory stream with retrieval, recency, and importance scoring to make simulated characters behave consistently over days.

Agent memory matters because durable recall is what separates a one-shot assistant from a colleague that improves over time. Without it, every conversation starts from zero; with it, an agent can build on prior work and personalize to a user. For a business reader, memory is the capability that makes long-running, accountable AI assistants practical rather than amnesiac.

Sources

Related