“Billion-scale similarity search with GPUs” was submitted to arXiv on February 28, 2017 by Jeff Johnson, Matthijs Douze, and Hervé Jégou, then at Facebook AI Research. It is the paper behind FAISS (Facebook AI Similarity Search), the open-source library that became a standard tool for nearest-neighbor search over large collections of embedding vectors.
The core problem is speed at scale: given a query vector, find the closest stored vectors among millions or billions of candidates fast enough to feel instant. Exact brute-force comparison against every stored vector is too slow at that scale, so the field uses approximate methods, often with product quantization, which stores compressed codes in place of the full vectors. The paper’s contribution is a GPU implementation of these methods, including a redesigned k-selection step (picking the k smallest distances from a long list) that the authors report runs “at up to 55% of theoretical peak performance.” They report nearest-neighbor search roughly 8.5 times faster than prior GPU work, building a k-nearest-neighbor graph over 95 million images in about 35 minutes, and building a graph connecting one billion vectors in under 12 hours on four Maxwell Titan X GPUs.
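To make the two stages concrete, here is a minimal sketch of the exact brute-force baseline: a distance computation against every stored vector, followed by k-selection. It uses only the Python standard library and is not the paper's GPU method or the FAISS API; the function names `squared_l2` and `brute_force_knn` and the toy data sizes are assumptions chosen for illustration.

```python
import heapq
import random


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_knn(query, database, k):
    """Exact k-nearest-neighbor search by scanning every stored vector.

    Returns a list of (distance, index) pairs, closest first.
    """
    # Stage 1, distance computation: one comparison per stored vector.
    distances = ((squared_l2(query, vec), i) for i, vec in enumerate(database))
    # Stage 2, k-selection: keep the k smallest distances without a full sort.
    return heapq.nsmallest(k, distances)


# Toy data (hypothetical sizes): 1,000 random 8-dimensional vectors.
random.seed(0)
dim, n = 8, 1000
database = [[random.random() for _ in range(dim)] for _ in range(n)]

# Query with a copy of a stored vector, so its own index should rank first.
query = list(database[42])
neighbors = brute_force_knn(query, database, k=5)
print(neighbors[0])
```

The scan costs one distance computation per stored vector per query, which is what becomes prohibitive at a billion vectors. Approximate indexes reduce how many vectors are compared and how many bytes each comparison reads, while the k-selection stage remains and is the step the paper redesigns for GPUs.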
FAISS is open-source software, currently distributed under the MIT license, and is widely used as the engine inside vector search systems and retrieval-augmented generation pipelines. By GitHub’s own count the repository has tens of thousands of stars. Much of the modern vector-database industry either builds on FAISS directly or reimplements the index types it popularized, which is why this systems paper, not a model-architecture paper, sits underneath a large slice of production AI retrieval.