In December 2016 JAMA published “Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs” by Varun Gulshan, Lily Peng, and colleagues at Google. It was one of the earliest large-scale demonstrations that a deep neural network could read a medical image with expert-level accuracy on a screening task affecting millions of people.
Diabetic retinopathy is a leading cause of preventable blindness, and screening relies on trained graders examining photographs of the back of the eye. The team trained a convolutional neural network on a development set of 128,175 retinal images, each graded 3 to 7 times by a panel of 54 US-licensed ophthalmologists and ophthalmology senior residents. The target was referable diabetic retinopathy, defined as moderate or worse diabetic retinopathy, referable diabetic macular edema, or both.
On two independent validation sets, EyePACS-1 (9,963 images) and Messidor-2 (1,748 images), the algorithm reached 90.3 percent and 87.0 percent sensitivity, respectively, at an operating point chosen for high specificity (98.1 and 98.5 percent). At a second operating point chosen for high sensitivity, it reached 97.5 and 96.1 percent sensitivity at 93.4 and 93.9 percent specificity. In other words, the network could be tuned to catch nearly all referable cases or to minimize false alarms, depending on the screening goal.
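The trade-off between operating points can be sketched in a few lines of Python. This is a minimal illustration, not the paper's code: the function names, the toy scores and labels, and the two specificity targets are all assumptions made up for the example. It assumes the classifier outputs a score per image, and that any score at or above a threshold counts as "referable." In practice the thresholds would be chosen on tuning data kept separate from the validation sets.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when scores >= threshold are flagged."""
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if label == 1:  # truly referable
            if flagged:
                tp += 1
            else:
                fn += 1
        else:  # truly non-referable
            if flagged:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def threshold_for_specificity(
    scores: Sequence[float], labels: Sequence[int], target_specificity: float
) -> float:
    """Lowest threshold whose specificity meets the target.

    Specificity never decreases as the threshold rises, so the first
    candidate that meets the target also keeps sensitivity as high as
    possible under that constraint.
    """
    # Infinity flags nothing, so it always reaches specificity 1.0.
    candidates = sorted(set(scores)) + [float("inf")]
    for threshold in candidates:
        _, spec = sensitivity_specificity(scores, labels, threshold)
        if spec >= target_specificity:
            return threshold
    return float("inf")


# Hypothetical toy data: model scores and reference labels (1 = referable).
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.40, 0.55, 0.70, 0.80, 0.90, 0.95]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

# Two illustrative operating points, defined by assumed specificity targets.
for name, target in [("high sensitivity", 0.80), ("high specificity", 1.00)]:
    t = threshold_for_specificity(scores, labels, target)
    sens, spec = sensitivity_specificity(scores, labels, t)
    print(f"{name}: threshold={t:.2f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

On this toy data the looser target gives a threshold of 0.40, which catches all six referable cases at the cost of one false alarm among six negatives. The stricter target gives a threshold of 0.70, which produces no false alarms but catches only four of the six referable cases.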
For a general reader, this paper helped establish the template for clinical AI: assemble a large expertly labeled dataset, train an image classifier, and validate it against independent data and human graders. The same Google team’s eye-screening work later moved toward real-world deployment, and the study is frequently cited as a starting point for autonomous diabetic-retinopathy screening systems.