Set up by data scientist Fei-Fei Li in 2006, the ImageNet database now contains more than 14 million annotated images. It has played a key role in advancing computer vision across applications like object recognition, image classification, and object localization.
It all began in 1985. George A. Miller and his team at Princeton University started working on WordNet, a lexical database for the English language. As a cross between a dictionary and a thesaurus, it enabled applications in Natural Language Processing (NLP).
Fast forward 21 years and data scientist Fei-Fei Li came up with the idea of ImageNet at the University of Illinois Urbana-Champaign. At the time, most AI researchers thought algorithms were more important than the data itself. However, Li was convinced that vast amounts of real-world data would make algorithms more accurate. By then, WordNet was mature, and version 3.0 had recently been released. When Li met WordNet researcher Christiane Fellbaum from Princeton University, she decided to use the wordbase and hierarchy of WordNet for her ambitious image database. Her aim? To support research into visual object recognition software.
“The paradigm shift of the ImageNet thinking is that while a lot of people are paying attention to models, let’s pay attention to data. Data will redefine how we think about models.” – Fei-Fei Li
Collective Contributions With Spectacular Results
In July 2008, ImageNet had zero images. By December, it had categorized three million images across 6,000+ synsets, WordNet's term for sets of synonymous words that share one meaning. In April 2010, there were more than 11 million images in 15,000+ synsets. Such results would have been inconceivable for a handful of researchers. They were made possible through crowdsourcing on Amazon’s Mechanical Turk platform.
In 2010, the first-ever ImageNet Large Scale Visual Recognition Challenge (ILSVRC) was organized. Software programs competed to correctly classify and detect objects and scenes.
Since being launched, ImageNet has given researchers a common set of images to benchmark their models and algorithms. In turn, this has driven research in machine learning and deep neural networks, making it easier to classify images and complete other tasks associated with computer vision. The data is available for free to researchers for non-commercial use.
In 2009, ImageNet is presented for the first time as a poster at the Conference on Computer Vision and Pattern Recognition (CVPR) in Florida.
In 2012, the deep convolutional neural network architecture AlexNet beats the field in the ImageNet Challenge by a whopping 10.8 percentage points, arguably kickstarting the current boom in computer vision.
95% Accuracy in Computer Vision
In 2017, 29 of the 38 teams competing in the ImageNet Challenge achieve greater than 95% accuracy. Image recognition has been taken to unprecedented levels.