Karen Sparck Jones (1935-2007) was a British computer scientist at the University of Cambridge who shaped the foundations of information retrieval and natural language processing. Her best-known contribution, described in her 1972 paper in the Journal of Documentation, is inverse document frequency (IDF): the insight that a word’s value as a search term falls the more documents it appears in, so rare words should count for more. According to the Cambridge Computer Laboratory’s obituary, IDF “has been adopted as standard in modern systems,” and it remains a core ingredient of search engines through the term-frequency-inverse-document-frequency (TF-IDF) weighting scheme.
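The weighting idea can be illustrated with a minimal Python sketch. It assumes the common textbook form idf(t) = log(N / df(t)), where N is the number of documents and df(t) is the number of documents containing term t; this is not necessarily the exact formula of the 1972 paper, and production search engines typically use smoothed variants. The function names and the toy corpus are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def inverse_document_frequency(documents):
    """Return {term: idf} for a list of tokenized documents.

    Assumes the plain logarithmic form log(N / df); real systems
    often add smoothing, which is omitted here.
    """
    n_docs = len(documents)
    # Document frequency: how many documents contain each term at least once.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tf_idf(document, idf):
    """Weight each term in one document by raw count times its IDF."""
    counts = Counter(document)
    return {term: count * idf.get(term, 0.0) for term, count in counts.items()}


# Hypothetical three-document corpus, tokenized on whitespace.
corpus = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the bird sang".split(),
]

idf = inverse_document_frequency(corpus)
weights = tf_idf(corpus[0], idf)
```

In this toy corpus, "the" appears in every document and so receives a weight of zero, while "mat", which appears in only one document, is weighted more heavily than "cat", which appears in two.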
She worked in NLP and retrieval from the late 1950s onward, notably in collaboration with Stephen Robertson on probabilistic models of relevance. Her honours included Fellowship of the British Academy, the BCS Lovelace Medal, the ACM SIGIR Salton Award, the Association for Computational Linguistics Lifetime Achievement Award, and the ACM-AAAI Allen Newell Award.
Sparck Jones was also a forceful advocate for women in the field and is remembered for her slogan, “Computing is too important to be left to men.” She helped found the women@cl network at Cambridge and worked to bring more girls into computing. Her IDF idea is one of the few pieces of 1970s research that a modern user still relies on daily, whenever a search engine ranks results.