Welcome to History of Data Science. Discover the stories of heroes who transformed our daily lives!

BROUGHT TO YOU BY Dataiku

Karen Spärck Jones: The Search Engine Enabler

August 25, 2021
Sparky British computer scientist Karen Spärck Jones FBA (1935–2007) is the woman behind the concept of inverse document frequency and index-term weighting — the principles underpinning modern search engines, including nearly 3.5 billion Google searches every day!
In 2019, The New York Times called her “a pioneer of computer science for work combining statistics and linguistics and an advocate for women in the field.”

Karen Ida Boalth Spärck Jones was born in Huddersfield, U.K., to a Norwegian mother who fled to England during World War II. After grammar school, a history degree at Cambridge’s Girton College, and a stint as a schoolteacher, she moved into computer science. She was initially intrigued by her husband’s work at the Cambridge Language Research Unit, where she later landed her first research position before joining the Cambridge University Computer Laboratory in 1974. From 1999, she held the post of Professor of Computers and Information.

“I think it’s very important to get more women into computing. My slogan is: Computing is too important to be left to men.”

Connecting With Computers

At a time when most scientists were trying to make people use code to talk to computers, Spärck Jones was determined to teach computers to understand us. Fascinated by natural language processing (NLP) and information retrieval, she wanted to program a computer to understand words like “field” that can have many meanings. So, she set about the arduous task of programming a massive thesaurus — the beginning of NLP as we know it today.

By the 1970s, she had come up with the revolutionary concepts of inverse document frequency (IDF) and index-term weighting. By combining statistics with linguistics, she established formulas that embodied principles for how computers could interpret relationships between words. Today, these principles are at the heart of every modern search engine.
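
To make the idea concrete, here is a minimal Python sketch of inverse document frequency. It assumes the common textbook form idf(t) = log(N / n_t), where N is the number of documents and n_t is the number of documents containing term t. Her original formulation and the variants used by real search engines differ in detail. The function names and the three-document toy collection are illustrative assumptions, not taken from her work.

```python
import math
from collections import Counter


def inverse_document_frequency(documents):
    """Return {term: idf} for a list of tokenised documents.

    Assumes the textbook form idf(t) = log(N / n_t); other variants exist.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        # Count each term once per document, however often it repeats.
        doc_freq.update(set(doc))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tf_idf_weights(doc, idf):
    """Weight each term in one document by its raw count times its IDF."""
    counts = Counter(doc)
    return {term: count * idf.get(term, 0.0) for term, count in counts.items()}


# Hypothetical toy collection, split on whitespace for simplicity.
docs = [
    "the field of wheat".split(),
    "the field of computer science".split(),
    "the search engine".split(),
]

idf = inverse_document_frequency(docs)
weights = tf_idf_weights(docs[0], idf)
```

In this toy collection, "the" appears in every document and so gets a weight of zero, while "wheat" appears in only one and gets the highest weight. That is the intuition behind index-term weighting: rare terms say more about what a document is about than common ones.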

She continued to contribute to computer science until her death in 2007, whether by joining the Alvey Program to support research in knowledge engineering in the U.K. or advocating for more women in computing and technology.