Autoformalization with Large Language Models

“Autoformalization with Large Language Models” (arXiv, May 2022, by Yuhuai Wu, Albert Jiang, Wenda Li, Markus Rabe, Charles Staats, Mateja Jamnik, and Christian Szegedy) studied whether language models can perform autoformalization: automatically translating mathematics written in ordinary natural language into a precise formal language that a proof assistant can check.

The authors found that large language models could perfectly translate 25.3 percent of mathematical competition problems into formal specifications in Isabelle/HOL, a striking result given how exacting formal syntax is. They went further and used the machine-generated formal statements to train a neural theorem prover, which improved its success rate on the MiniF2F benchmark from 29.6 percent to 35.2 percent, showing that autoformalized data is useful, not just a curiosity.

Autoformalization is a key bottleneck for AI in mathematics and verification: the world’s mathematical knowledge is written informally, but machines can only check formal statements. This paper was an early demonstration that LLMs could begin to bridge that gap automatically. For businesses in safety-critical software and hardware, autoformalization points toward a future where specifications and proofs can be generated from plain-language requirements.

Sources

Last verified June 7, 2026