Okapi BM25: Approximations to the 2-Poisson Model

By the early 1990s, researchers had a strong theoretical model of relevance based on the statistics of how often words appear in documents, but its exact form was awkward to compute. In a 1994 paper at the ACM SIGIR conference (pages 232-241), “Some Simple Effective Approximations to the 2-Poisson Model for Probabilistic Weighted Retrieval,” Stephen Robertson and Steve Walker derived simple, effective approximations to that “2-Poisson” model, leading to the ranking function now widely known as BM25 (from the “Best Match” series of weighting functions developed in the Okapi retrieval system).

BM25 scores a document for a query by rewarding documents that contain the query terms and refining that match with three intuitions in one compact formula. It weights each term by its rarity across the collection (the IDF idea from Sparck Jones), lets term frequency help but with diminishing returns so a word repeated many times does not dominate, and normalizes for document length so long documents are not unfairly favored just for containing more words. The result was both well motivated by probability theory and easy to tune and deploy.
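The three intuitions can be made concrete with a minimal Python sketch. It follows the commonly cited form of BM25 rather than the exact notation of the 1994 paper. The function name bm25_scores, the parameter defaults k1=1.2 and b=0.75, and the smoothed, non-negative IDF variant are assumptions chosen for illustration; real systems differ in these details.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    Illustrative sketch only: k1, b, and the IDF smoothing are assumed
    defaults, not values taken from the original paper.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        # Length normalization: > 1 for longer-than-average documents.
        length_norm = 1 - b + b * len(d) / avg_len
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue  # a term absent from the document contributes nothing
            # Rarity weight (assumed smoothed IDF that never goes negative).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Saturating term frequency: approaches (k1 + 1) as tf grows.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


if __name__ == "__main__":
    corpus = [
        "okapi bm25 ranking".split(),
        "bm25 bm25 bm25 bm25".split(),
        "unrelated text here".split(),
    ]
    print(bm25_scores(["bm25"], corpus))
```

In this sketch, k1 controls how quickly repeated occurrences stop adding to the score, and b controls how strongly document length is penalized (b=0 turns length normalization off, b=1 applies it fully).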

BM25 went on to become the standard keyword-ranking baseline in information retrieval, the default scoring function in widely used search engines and libraries, and the benchmark against which newer methods, including modern neural retrievers, are routinely measured. Its durability is remarkable: a formula from 1994 is still hard to beat for many search tasks.

For a business reader, BM25 is the workhorse you have used countless times without knowing its name: it is the math behind a large share of the keyword searches that return useful results on the first try.

