AI-Descartes: combining data and theory for derivable scientific discovery

“Combining data and theory for derivable scientific discovery with AI-Descartes,” by Cristina Cornelio and colleagues, was published in Nature Communications in April 2023. It tackles a weakness of pure data-fitting approaches to discovering scientific laws: many different formulas can fit the same noisy data, and not all of them are physically meaningful.

AI-Descartes pairs symbolic regression, which searches for the mathematical form of an equation rather than just tuning parameters, with a logical reasoning engine that checks candidate equations against background scientific theory. When several formulas explain the data equally well, the system prefers the one that is actually derivable from, or consistent with, accepted principles. The distinctive ingredient is this ability to reason, not just curve-fit.
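The two-stage idea can be illustrated with a toy sketch. This is not the authors' implementation: the candidate list, the function names, and the tolerance are hypothetical, and a simple dimensional-consistency test stands in for the far stronger logical reasoning engine the paper describes. The data are rounded textbook values for five planets. Because every planet orbits the same Sun, the data cannot tell whether the period depends on the Sun's mass, so two candidates fit equally well and only the theory check separates them.

```python
import math
from fractions import Fraction

# Rounded textbook values: semi-major axis (AU), orbital period (years).
ORBITS = [
    (0.387, 0.241),   # Mercury
    (0.723, 0.615),   # Venus
    (1.000, 1.000),   # Earth
    (1.524, 1.881),   # Mars
    (5.203, 11.862),  # Jupiter
]

# Hypothetical candidates of the form T = c * a**p * (G*M)**q, stored as (p, q),
# with c assumed to be a dimensionless constant.
CANDIDATES = [
    (Fraction(1), Fraction(0)),
    (Fraction(3, 2), Fraction(0)),
    (Fraction(3, 2), Fraction(-1, 2)),
    (Fraction(2), Fraction(-1, 2)),
]


def data_error(p):
    """RMS log-residual after fitting the constant factor.

    G*M is identical for every planet in this dataset, so it is absorbed
    into the fitted constant and the data constrain only the exponent p.
    """
    resid = [math.log(t) - float(p) * math.log(a) for a, t in ORBITS]
    mean = sum(resid) / len(resid)  # best-fit log of the constant factor
    return math.sqrt(sum((r - mean) ** 2 for r in resid) / len(resid))


def theory_consistent(p, q):
    """Dimensional check standing in for a real derivability proof.

    a is a length and G*M has units length^3 / time^2, so a**p * (G*M)**q
    is a time only if p + 3q == 0 (length) and -2q == 1 (time).
    """
    return p + 3 * q == 0 and -2 * q == 1


def select(candidates, tolerance=0.01):
    """Return (candidates that fit the data, those that also pass the theory check)."""
    fits = [(p, q) for p, q in candidates if data_error(p) < tolerance]
    return fits, [(p, q) for p, q in fits if theory_consistent(p, q)]


fits_data, survivors = select(CANDIDATES)
print("fit the data:", [(str(p), str(q)) for p, q in fits_data])
print("also pass the theory check:", [(str(p), str(q)) for p, q in survivors])
```

In this sketch the data step keeps both candidates with exponent 3/2 on the orbital distance, and the theory step keeps only the one that also scales as the inverse square root of G*M. A dimensional test is much weaker than checking derivability from axioms, but it shows the same division of labor: data narrow the field, and background knowledge breaks the tie.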

The authors demonstrated the system rediscovering laws including Kepler’s third law of planetary motion, Einstein’s relativistic time-dilation formula, and Langmuir’s theory of gas adsorption, in some cases from small or imperfect datasets where data alone would be ambiguous.
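For reference, the three laws are commonly written as follows. These are the standard textbook forms, given here as an aid to the reader; the symbols are the usual conventions and the paper may parameterize the same laws differently.

```latex
% Kepler's third law: orbital period T, semi-major axis a,
% masses m_1 and m_2 of the two bodies, gravitational constant G.
T^2 = \frac{4\pi^2 a^3}{G\,(m_1 + m_2)}

% Relativistic time dilation: \Delta t is the interval on a clock at rest,
% \Delta t' the interval seen by an observer moving at speed v relative to it,
% c the speed of light.
\Delta t' = \frac{\Delta t}{\sqrt{1 - v^2/c^2}}

% Langmuir adsorption isotherm: amount adsorbed q at gas pressure p,
% saturation capacity q_{\max}, equilibrium constant K.
q = \frac{q_{\max} K p}{1 + K p}
```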

For a general reader, AI-Descartes embodies a long-standing dream of AI for science: machines that help formulate theories, not merely fit numbers. By insisting that a discovered equation be both data-consistent and theory-derivable, it gestures at a more rigorous, explainable kind of automated discovery than black-box prediction can offer.

Last verified June 7, 2026