Research

My main professional goal is to contribute to research aimed at mastering mathematics with AI systems. I am particularly enthusiastic about using AI systems for formal mathematics, for reasons explained below. My research revolves around three exciting and interconnected areas: language modelling, tree search algorithms and reinforcement learning.

Language modelling

I am very interested in language modelling for formal mathematics, as it elegantly combines the creative and generalisation capabilities of language models with the automated rigorous verification offered by proof assistants.

During my internship at OpenAI with Stanislas Polu, I explored the potential of GPT-3.5 to solve, generate, and auto-formalise mathematical problems using the Lean proof assistant. At EleutherAI, I collaborated on replicating Google’s Minerva project by building OpenWebMath, an open-source mathematical dataset, and Llemma, a series of powerful open-source language models trained on OpenWebMath, arXiv, and code. These models successfully surpassed Minerva’s performance on an equi-parameter basis, while covering a wider distribution of tasks, including tool use and formal mathematics. Currently, at the University of Cambridge and Numina, I apply language models to automate conjecturing and advance theorem proving capabilities.

Tree search algorithms

Inspired by AlphaGo’s success in the game of Go, I am very motivated by the idea of adapting tree search algorithms to theorem proving. Theorem proving can be seen as a single-player game with rare winning states, where the objective is to find any solution at all. This specific setting requires a different approach to tree search, and encourages us to adapt tree search algorithms for the special requirements of theorem proving.

During my internship at the University of Cambridge, I studied Monte Carlo tree search for theorem proving with the Isabelle proof assistant. At Numina, I explore various tree search algorithms, including MCTS and BFS, conducting large-scale experiments to push this approach further.

Reinforcement learning

I see tremendous promise in reinforcement learning as applied to mathematics. Proof assistants provide a perfect environment for mathematics, providing immediate feedback and clear rewards, making them ideal platforms for applying RL algorithms such as Expert iteration and GRPO. The great success of reinforcement learning in games like Go deeply motivates me to translate these achievements into mathematics.

At OpenAI, we used Expert iteration with synthetic data to tackle increasingly complex problems. Now, in my PhD research at the University of Cambridge, I apply reinforcement learning for iterative refinement in conjecturing and theorem proving.

Marco Dos Santos

Research

Language modelling

Tree search algorithms

Reinforcement learning