How We're Teaching Computers to Understand Pictures

This is Fei-Fei Li’s TED talk from 2015, hosted on the official TED channel. In it she explains, for a broad audience, how researchers are teaching computers to understand the content of images, and why that problem is so much harder than it first appears.

Li describes the central insight behind ImageNet: just as a child learns to see by being exposed to enormous numbers of examples, a computer needs vast amounts of labeled data to learn visual concepts. She recounts building a dataset of roughly fifteen million labeled photos and using it to train algorithms that can begin to name what they see, from cats and chairs to richer scenes.

As the researcher who led the creation of ImageNet, Li is well placed to tell this story. The talk predates much of the generative AI wave but captures the moment that made it possible, because the dataset and competition she describes were the proving ground where deep learning first demonstrated its power in image recognition at scale. For a general reader, it is a short, clear origin story for modern computer vision.

Sources

Last verified June 6, 2026