In February 2020 Cell published “A Deep Learning Approach to Antibiotic Discovery” by Jonathan Stokes, Regina Barzilay, James Collins, and colleagues at MIT and the Broad Institute. It became a landmark example of machine learning finding a genuinely new drug candidate rather than just re-ranking known ones.
The team trained a deep neural network to predict whether a molecule would inhibit the growth of E. coli, using a training set of 2,335 molecules. They then applied the model to screen large chemical libraries and flagged a compound, originally investigated for diabetes, that the model scored highly but that looked structurally unlike conventional antibiotics. They named it halicin.
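The screening step described above can be pictured as a two-part filter: keep molecules the model scores highly, then set aside those that closely resemble antibiotics already in use. The following is a minimal sketch of that idea, not the authors' code. The function names, the thresholds, the toy scores, and the use of small integer sets in place of real molecular fingerprints are all assumptions made for illustration.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def screen(candidates, known_antibiotics, min_score=0.5, max_similarity=0.4):
    """Keep candidates with a high predicted score and low similarity to known drugs.

    candidates: list of (name, predicted_score, feature_set) tuples.
    known_antibiotics: list of feature sets for existing antibiotics.
    Returns (name, score, nearest_similarity) tuples, best score first.
    Both thresholds are arbitrary illustrative values.
    """
    hits = []
    for name, score, features in candidates:
        # Similarity to the closest known antibiotic (0.0 if none are given).
        nearest = max((tanimoto(features, k) for k in known_antibiotics), default=0.0)
        if score >= min_score and nearest <= max_similarity:
            hits.append((name, score, nearest))
    return sorted(hits, key=lambda h: h[1], reverse=True)


# Toy data: integers stand in for fingerprint bits, scores for model output.
known = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 5, 6})]
library = [
    ("compound_a", 0.91, frozenset({1, 2, 3, 7})),    # high score, resembles a known drug
    ("compound_b", 0.85, frozenset({8, 9, 10, 11})),  # high score, structurally unfamiliar
    ("compound_c", 0.12, frozenset({12, 13, 14})),    # unfamiliar, but low score
]

for name, score, nearest in screen(library, known):
    print(f"{name}: score={score:.2f}, nearest similarity={nearest:.2f}")
```

In this toy library only compound_b survives both filters, which is the kind of candidate the paragraph describes: rated promising by the model yet unlike the drugs already on the shelf. A real pipeline would compute fingerprints with a cheminformatics toolkit and take its scores from a trained model.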
Halicin proved to have broad bactericidal activity, killing a wide range of pathogens including drug-resistant strains such as carbapenem-resistant Enterobacteriaceae and Mycobacterium tuberculosis. In mouse experiments it cleared infections of Clostridioides difficile and pan-resistant Acinetobacter baumannii. The authors also presented evidence that it works by a mechanism distinct from those of existing drugs: it disrupts bacteria’s ability to maintain an electrochemical gradient across their membranes.
For a general reader, this paper matters because antibiotic resistance is a slow-moving public health crisis and few new antibiotic classes have reached the clinic in decades. Demonstrating that a model could surface a structurally novel candidate from existing molecule libraries suggested AI might help refill a stalled drug pipeline.