GPT-f is a language model trained to generate proofs of mathematical theorems. It uses the GPT-3 transformer architecture and is designed to assist mathematicians in proving theorems. The model targets the Metamath formal theorem-proving language, whose syntax looks like this:
$( Declare the constant symbols we will use $)
$c 0 + = -> ( ) term wff |- $.
$( Declare the metavariables we will use $)
$v t r s P Q $.
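For context, declarations like these are typically followed by floating hypotheses (`$f`), axioms (`$a`), and provable assertions (`$p`). The lines below are a hedged sketch in the style of Metamath's standard demo database; the labels (tt, wp, a2) follow that demo, not the GPT-f paper:

$( Attach types to the metavariables $)
tt $f term t $.
wp $f wff P $.
$( An axiom: t plus zero equals t $)
a2 $a |- ( t + 0 ) = t $.

A `$p` statement would then state a theorem and carry its proof, which the Metamath verifier checks step by step; generating the proof part of such statements is exactly the task GPT-f is trained on.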
Based on the results of the experiments, GPT-f found new, shorter proofs that were accepted into the main Metamath library. This is the first time proofs generated by a deep learning model have been accepted by the mathematical community.
Automated theorem proving is a promising area of research for the following reasons:
- Proving theorems requires the ability to reason and draw logical conclusions;
- Candidate proofs can be searched for quickly and checked for correctness automatically;
- Training data for proof-search models can be generated automatically.
The researchers used the formal Metamath framework to verify the generated proofs. The main Metamath library contains roughly 38,000 proofs based on the Zermelo-Fraenkel axiom system. Two pretrained GPT-3 models were used:
- one pretrained on filtered CommonCrawl;
- one pretrained on GitHub, arXiv, and Math StackExchange data.
The models were then fine-tuned on synthetic datasets.
GPT-f sets a new state of the art for the Metamath environment: the best model proved 56.22% of the theorems in a held-out test set, whereas the previous state-of-the-art model, MetaGen-IL, proved only 21.16%.
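At proof time, GPT-f expands the most promising open goals first, guided by the language model's probabilities. The sketch below illustrates that best-first search idea only; `propose_steps` is a hypothetical stand-in (a toy lookup table) for the transformer, and the goal strings are illustrative, not taken from the paper:

```python
import heapq
import itertools

def propose_steps(goal):
    """Hypothetical stand-in for the language model: given a goal,
    return candidate next steps with their log-probabilities.
    In GPT-f these candidates are sampled from the transformer."""
    toy_model = {
        "|- ( t + 0 ) = t": [("apply a2", -0.1)],
        "apply a2": [("QED", -0.05)],
        "|- t = t": [("rewrite via a2", -0.5), ("dead end", -2.0)],
        "rewrite via a2": [("QED", -0.2)],
    }
    return toy_model.get(goal, [])

def best_first_search(root_goal, budget=100):
    """Expand goals in order of cumulative model log-probability,
    returning the sequence of steps that closes the goal, or None."""
    counter = itertools.count()  # tie-breaker for equal priorities
    frontier = [(0.0, next(counter), root_goal, [])]
    expansions = 0
    while frontier and expansions < budget:
        neg_logprob, _, goal, proof = heapq.heappop(frontier)
        expansions += 1
        if goal == "QED":
            return proof
        for step, logprob in propose_steps(goal):
            heapq.heappush(
                frontier,
                (neg_logprob - logprob, next(counter), step, proof + [step]),
            )
    return None  # budget exhausted or no candidates left

print(best_first_search("|- t = t"))
# → ['rewrite via a2', 'QED']
```

The priority queue means high-probability branches (here, "rewrite via a2" at log-prob -0.5) are explored before unlikely ones ("dead end" at -2.0), so the expansion budget is spent where the model is most confident.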