Sparky British computer scientist Karen Spärck Jones FBA (1935–2007) is the woman behind the concepts of inverse document frequency and index-term weighting, the principles underpinning modern search engines and, through them, nearly 3.5 billion Google searches every day!
In 2019, The New York Times called her “a pioneer of computer science for work combining statistics and linguistics and an advocate for women in the field.”
Karen Ida Boalth Spärck Jones was born in Huddersfield, U.K., to a Norwegian mother who went on to work for the Norwegian government in exile in London during World War II. Following grammar school, a history degree at Cambridge’s Girton College, and a stint as a schoolteacher, she moved into computer science. She was initially intrigued by her husband’s work at the Cambridge Language Research Unit, where she later landed her first research position before joining the Cambridge University Computer Laboratory in 1974. Starting in 1999, she held the post of professor of computers and information.
“I think it’s very important to get more women into computing. My slogan is: Computing is too important to be left to men.”
Connecting With Computers
At a time when most scientists were trying to make people use code to talk to computers, Spärck Jones was determined to teach computers to understand us. Fascinated by natural language processing (NLP) and information retrieval, she wanted to program a computer to understand words like “field” that can have many meanings. So, she set about the arduous task of programming a massive thesaurus — the beginning of NLP as we know it today.
By the 1970s, she had come up with the revolutionary concepts of inverse document frequency (IDF) and index-term weighting. By combining statistics with linguistics, she established formulas that embodied principles for how computers could interpret relationships between words. Today, these principles are at the heart of every modern search engine.
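To make the idea concrete, here is a minimal sketch of inverse document frequency in its common textbook form, idf(t) = log(N / n_t), where N is the number of documents and n_t is the number of documents containing term t. This is an illustration, not necessarily the exact formula from her paper; the function name and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def inverse_document_frequency(documents):
    """Return {term: log(N / n_t)} for a list of tokenized documents.

    Textbook form of IDF, used here as an assumption for illustration.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        # Count each term at most once per document.
        doc_freq.update(set(tokens))
    return {term: math.log(n_docs / count) for term, count in doc_freq.items()}


# Hypothetical toy corpus: three tiny "documents", split on whitespace.
corpus = [
    "the field of wheat".split(),
    "the field of physics".split(),
    "the wheat harvest".split(),
]
idf = inverse_document_frequency(corpus)
```

In this toy corpus, "the" appears in every document and so receives a weight of zero, while "physics" appears in only one and receives the highest weight. That is the intuition behind index-term weighting: rare terms say more about what a document is about than common ones.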
She continued to contribute to computer science until her death in 2007, both by joining the Alvey Program to support research in knowledge engineering in the U.K. and by advocating for more women in computing and technology.
Karen Spärck Jones publishes “Synonymy and Semantic Classification,” now considered a foundational paper in the field of natural language processing.
Karen Spärck Jones’ seminal 1972 paper in the Journal of Documentation lays the groundwork for the modern search engine.
The Karen Spärck Jones Award
In recognition of Spärck Jones’ achievements in information retrieval (IR) and natural language processing (NLP), the Karen Spärck Jones Award has been given annually since 2008 for outstanding research in one or both fields.