This is Fei-Fei Li’s TED2024 talk, recorded in April 2024 and posted on the official TED YouTube channel. Li led the creation of ImageNet, the dataset that helped launch the deep-learning era of computer vision, and the talk is her own statement of where she thinks the field goes next.
She begins with the evolutionary story of sight, describing how the appearance of vision in early organisms drove an explosion of life and learning, and argues that an analogous moment is now arriving for machines. Her central idea is spatial intelligence: the ability not just to recognize what is in an image but to understand 3D space well enough to predict and act within it. She shows progress on turning images into 3D structure and connects it to robots and assistants that operate in the real world, sketching applications in areas such as health care.
This is a short, accessible primary source that complements her 2015 TED talk on teaching computers to see. As firsthand testimony from one of the founders of modern computer vision, it is a clear window into the research direction now driving her work.