In November 2022, days before ChatGPT’s launch, Jerry Liu pushed the first commit to GPT Index, the project that would soon be renamed LlamaIndex. It began as a simple tree index over text and grew into one of the most widely used open-source frameworks for building applications that let language models work over private data.
LlamaIndex addresses a basic limitation: a model knows only what is in its training data and what fits in its context window, so out of the box it cannot answer questions about a company’s own documents. The framework supplies data connectors that ingest sources such as Notion, Slack, and Google Drive; ways to structure and index that content; and a retrieval and query interface that fetches the most relevant pieces and feeds them to the model, a pattern known as retrieval-augmented generation. The project was renamed from GPT Index to LlamaIndex in early 2023 to signal that it was not tied to OpenAI’s models. By its first anniversary in November 2023 it reported hundreds of open-source contributors, thousands of dependent projects, and roughly 900,000 monthly downloads, and the team had incorporated and raised funding to build the commercial LlamaCloud.
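The ingest, index, and retrieve steps behind retrieval-augmented generation can be illustrated with a minimal standard-library sketch. This is not LlamaIndex's API: the names `ToyIndex`, `chunk`, `cosine`, and `build_prompt` are hypothetical, word-count cosine similarity stands in for the learned embeddings a real system would likely use, and the final call to a language model is left out.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk(text: str, max_words: int = 40) -> list[str]:
    """Split a document into pieces of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)  # Counter yields 0 for missing tokens
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyIndex:
    """Hypothetical in-memory index: chunks documents and stores word counts."""

    def __init__(self, documents: list[str]) -> None:
        self.chunks = [c for doc in documents for c in chunk(doc)]
        self.vectors = [Counter(tokenize(c)) for c in self.chunks]

    def retrieve(self, query: str, top_k: int = 2) -> list[str]:
        """Return the top_k chunks most similar to the query."""
        q = Counter(tokenize(query))
        ranked = sorted(
            zip(self.chunks, self.vectors),
            key=lambda pair: cosine(q, pair[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:top_k]]


def build_prompt(query: str, index: ToyIndex, top_k: int = 2) -> str:
    """Assemble the text that would be sent to a model: retrieved context plus question."""
    context = "\n".join(f"- {c}" for c in index.retrieve(query, top_k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    docs = [
        "The vacation policy grants 25 days of paid leave per year.",
        "The office wifi password is rotated every month.",
    ]
    print(build_prompt("How many days of paid leave do I get?", ToyIndex(docs), top_k=1))
```

The point of the sketch is the division of labor: the model never sees the whole document collection, only the few chunks the retriever ranks highest for the question at hand.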
LlamaIndex, alongside LangChain, helped define the toolkit layer of the LLM application stack, the connective code between raw models and the documents, databases, and tools that make them useful in a specific business.