ESM (Evolutionary Scale Modeling) is a family of protein language models from Meta AI’s FAIR lab. The models are transformers trained on hundreds of millions of protein sequences with the same masked-language-modeling objective used for text: some residues are hidden, and the model learns to predict them from the surrounding sequence. In doing so it learns the statistical patterns of amino-acid sequences without being told anything about 3D structure.
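The masking step of that training objective can be illustrated with a minimal sketch. This is not ESM's actual preprocessing code: the function name `mask_sequence`, the 15% masking rate, the `<mask>` token string, and the example sequence are all assumptions chosen for illustration, and the sketch uses only the Python standard library.

```python
import random

MASK = "<mask>"  # placeholder token; the string itself is an assumption


def mask_sequence(seq, mask_rate=0.15, seed=0):
    """Hide a random subset of residues.

    Returns the masked token list and a dict mapping each masked
    position to the residue that was hidden there (the training target).
    """
    rng = random.Random(seed)
    tokens = list(seq)
    # Mask at least one residue so short sequences still yield a target.
    n_mask = max(1, round(len(tokens) * mask_rate))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    targets = {i: tokens[i] for i in positions}
    for i in positions:
        tokens[i] = MASK
    return tokens, targets


# Arbitrary example sequence in one-letter amino-acid codes.
sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
tokens, targets = mask_sequence(sequence)
print("".join(t if t != MASK else "_" for t in tokens))
print(targets)
```

A model trained this way sees only the masked tokens and is scored on how well it recovers the residues in `targets`; no structural labels enter the loop.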
The central result appeared in Science in 2023 as “Evolutionary-scale prediction of atomic-level protein structure with a language model” by Lin and colleagues. It introduced ESM-2, scaled up to 15 billion parameters, and ESMFold, a structure predictor built on top of it. Unlike AlphaFold2, which relies on multiple-sequence alignments of related proteins, ESMFold predicts structure directly from a single sequence, which makes it about an order of magnitude faster at some cost in accuracy.
That speed let Meta apply ESMFold at metagenomic scale. The team released the ESM Metagenomic Atlas, an open database of 617 million predicted protein structures drawn from microbial and environmental sequences that are largely absent from existing structure databases, computed in roughly two weeks on a cluster of GPUs.
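To give a sense of the throughput those figures imply, here is a back-of-the-envelope calculation. It treats "roughly two weeks" as exactly 14 days, which is an assumption, so the result is only an order-of-magnitude estimate for the whole cluster, not a measured rate.

```python
structures = 617_000_000  # predicted structures in the Atlas
days = 14                 # assumption: "roughly two weeks" taken as 14 days

seconds = days * 24 * 60 * 60
per_second = structures / seconds
per_day = structures / days

print(f"{per_second:.0f} structures per second across the cluster")
print(f"{per_day / 1e6:.1f} million structures per day")
```

Under that assumption the cluster as a whole averaged on the order of 500 structures per second.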
ESM showed that a language model trained purely on sequence can absorb enough about protein biology to predict folded structure, an independent line of attack on the same problem AlphaFold solved with alignments.