In March 2023 Science published “Evolutionary-scale prediction of atomic-level protein structure with a language model” by Zeming Lin, Alexander Rives, and colleagues at Meta AI. It introduced ESMFold, a structure predictor built on top of ESM-2, a transformer language model trained on protein sequences.
The key finding is that information about a protein’s three-dimensional structure emerges inside a language model simply from learning to predict masked amino acids across millions of sequences, and that this emergent knowledge sharpens as the model scales. The authors trained a family of ESM-2 models ranging up to 15 billion parameters and built ESMFold on top of an ESM-2 model (the publicly released ESMFold uses the 3-billion-parameter variant rather than the largest one). Crucially, ESMFold predicts structure directly from a single sequence, skipping the slow step of building a multiple-sequence alignment that AlphaFold2 and RoseTTAFold depend on.
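As a rough illustration of the masked-prediction training task described above, the sketch below hides a few residues in a sequence and records what the model would be asked to recover. It is a toy data-preparation step only, not the authors' training code: the function name `mask_sequence`, the example sequence, and the 15% masking rate (a common convention for masked language models) are assumptions made for this example.

```python
import random

MASK = "<mask>"  # placeholder token used in this sketch


def mask_sequence(seq, mask_fraction=0.15, seed=0):
    """Hide a random subset of residues.

    Returns the masked token list and a dict mapping each hidden
    position to the residue the model would have to predict.
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    n_mask = max(1, round(len(seq) * mask_fraction))
    positions = sorted(rng.sample(range(len(seq)), n_mask))

    tokens = list(seq)
    targets = {}
    for i in positions:
        targets[i] = tokens[i]  # remember the true residue
        tokens[i] = MASK        # hide it from the model
    return tokens, targets


if __name__ == "__main__":
    # Arbitrary illustrative sequence (an assumption, not from the paper).
    tokens, targets = mask_sequence("MKTAYIAKQRQISFVKSHFSRQ")
    print(" ".join(tokens))
    print(targets)
```

A model trained this way sees only the masked tokens and is scored on how well it recovers the hidden residues. Guessing them well requires picking up on which positions vary together across sequences, which is where the structural signal comes from.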
Skipping the alignment step made ESMFold roughly an order of magnitude faster than alignment-based methods, with accuracy that was somewhat lower but still useful. The team exploited that speed to fold more than 600 million metagenomic proteins, releasing the results as the ESM Metagenomic Atlas, a structural view into the vast “dark matter” of microbial proteins that had never been characterized.
For a general reader, this paper showed that the same language-model recipe powering chatbots transfers to biology, and that trading a little accuracy for large speed gains can open up problems, such as folding hundreds of millions of unknown proteins, that slower methods could not practically reach.