reCAPTCHA transcribed old books at over 99 percent word accuracy

In the 2008 Science paper “reCAPTCHA: Human-Based Character Recognition via Web Security Measures,” Luis von Ahn and colleagues reported that reCAPTCHA transcribed words that optical character recognition (OCR) software could not read, and did so with a word accuracy “exceeding 99 percent,” matching the guarantee offered by professional human transcribers. The accuracy came from two checks: each unknown word was shown alongside a control word whose answer was already known, and a transcription was accepted only when multiple independent CAPTCHA solvers agreed on it. With CAPTCHAs being solved on the order of 200 million times a day, the system digitized hundreds of millions of words of newspaper and book archives, including back issues of The New York Times, as a free byproduct of spam-blocking.
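
The control-word-plus-agreement idea described above can be illustrated with a minimal Python sketch. It is not the system's actual implementation: the names `UnknownWord`, `submit`, and `resolve` are hypothetical, and the `min_agreement` threshold of 3 is an assumed value chosen for illustration, not a figure taken from the paper.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class UnknownWord:
    """A scanned word OCR could not read, plus the guesses collected so far."""
    image_id: str
    votes: Counter = field(default_factory=Counter)


def submit(control_answer: str, typed_control: str,
           typed_unknown: str, word: UnknownWord) -> bool:
    """Check one solver's response (hypothetical helper).

    The solver passes only if the control word is typed correctly.
    The guess for the unknown word is recorded only in that case,
    since a solver who got the control word right is probably human
    and probably typing carefully.
    """
    if typed_control.strip().lower() != control_answer.strip().lower():
        return False
    word.votes[typed_unknown.strip().lower()] += 1
    return True


def resolve(word: UnknownWord, min_agreement: int = 3):
    """Return the accepted transcription, or None if agreement is too low.

    min_agreement is an assumed threshold for this sketch.
    """
    if not word.votes:
        return None
    guess, count = word.votes.most_common(1)[0]
    return guess if count >= min_agreement else None
```

In this sketch a guess from a solver who mistypes the control word is discarded, and an unknown word stays unresolved until enough independent solvers have typed the same thing. For example, three passing solvers who all type "upon" would make `resolve` return `"upon"`, while a single passing guess would leave it returning `None`.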

Last verified June 7, 2026