On February 15, 2024, Google introduced Gemini 1.5 Pro and, with it, a dramatic jump in context length. The model launched with a standard 128,000-token window and a 1 million-token window in limited preview, and Google reported having successfully tested up to 10 million tokens in research. For comparison, most other leading models at the time topped out at roughly 100,000 to 200,000 tokens.
A long context window changes what a single prompt can hold. Google demonstrated Gemini 1.5 Pro analyzing tens of thousands of lines of code at once, summarizing documents thousands of pages long, answering specific questions about an entire 45-minute film, and learning to translate the low-resource language Kalamang from a single uploaded grammar manual.
Gemini 1.5 helped make very long context a headline capability of frontier models rather than a niche research result. The ability to drop an entire codebase, book, or hours of media into one prompt and reason over all of it reshaped expectations for how much information a model could attend to at once.