On 25 July 2024 Google DeepMind announced that two systems, AlphaProof and AlphaGeometry 2, had together solved four of the six problems at that year’s International Mathematical Olympiad, scoring 28 out of a possible 42 points. That total reached the standard of a silver medal and fell just one point short of the gold-medal threshold of 29, the first time an AI system performed at that level on the world’s most prestigious mathematics competition for pre-university students.
AlphaProof is a reinforcement-learning system that proves mathematical statements in the formal language Lean, combining a pretrained language model with the AlphaZero algorithm that powered DeepMind’s game-playing breakthroughs. It solved two algebra problems and one number-theory problem, including the competition’s hardest problem, which only a handful of human contestants cracked. AlphaGeometry 2, an improved version of the earlier geometry system now built on Gemini, solved the geometry problem in 19 seconds. The two combinatorics problems went unsolved. The solutions were graded by the mathematicians Timothy Gowers, a Fields medalist, and Joseph Myers.
The milestone matters because olympiad problems demand genuine creative reasoning rather than calculation, and because the proofs were produced in a formal language that a computer can check, addressing the persistent worry that AI mathematical reasoning is unreliable. It marked a major step toward AI systems that can do rigorous, verifiable mathematics.