MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics

08/31/2021
by   Kunhao Zheng, et al.

We present miniF2F, a dataset of formal Olympiad-level mathematics problem statements intended to provide a unified cross-system benchmark for neural theorem proving. The miniF2F benchmark currently targets Metamath, Lean, and Isabelle and consists of 488 problem statements drawn from the AIME, AMC, and the International Mathematical Olympiad (IMO), as well as material from high-school and undergraduate mathematics courses. We report baseline results using GPT-f, a neural theorem prover based on GPT-3, and provide an analysis of its performance. We intend for miniF2F to be a community-driven effort and hope that our benchmark will help spur advances in neural theorem proving.
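
To give a sense of what a formal problem statement looks like, here is a minimal sketch in Lean. It is a hypothetical toy example, not an entry from miniF2F: the theorem name `toy_linear_equation` is invented for illustration, and the Lean 4 syntax is an assumption, since the abstract does not say which Lean version the benchmark targets. The statement is the part a benchmark would supply; the proof after `:= by` is what a prover would be expected to find.

```lean
-- Hypothetical illustration only; not taken from the miniF2F dataset.
-- Statement: if x is a natural number with x + 3 = 5, then x = 2.
theorem toy_linear_equation (x : Nat) (h : x + 3 = 5) : x = 2 := by
  -- `omega` is a decision procedure for linear arithmetic over Nat and Int.
  omega
```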

research
02/24/2023

ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics

We introduce ProofNet, a benchmark for autoformalization and formal prov...
research
02/03/2022

Formal Mathematics Statement Curriculum Learning

We explore the use of expert iteration in the context of language modeli...
research
03/06/2019

A data analysis of women's trails among ICM speakers

The International Congress of Mathematicians (ICM), inaugurated in 1897,...
research
02/11/2014

Learning-assisted Theorem Proving with Millions of Lemmas

Large formal mathematical libraries consist of millions of atomic infere...
research
12/05/2019

Exploration of Neural Machine Translation in Autoformalization of Mathematics in Mizar

In this paper we share several experiments trying to automatically trans...
research
01/01/2021

Formalizing Hall's Marriage Theorem in Lean

We formalize Hall's Marriage Theorem in the Lean theorem prover for incl...
research
05/05/2019

Interaction with Formal Mathematical Documents in Isabelle/PIDE

Isabelle/PIDE has emerged over more than 10 years as the standard Prover...