
Experimental results from applying GPT-4 to an unpublished formal language

by Gregor vom Scheidt, et al.

Can large language models complete mathematical tasks that are traditionally performed either manually or with the aid of theorem provers? To answer this question, a state-of-the-art system, GPT-4, was provided with a concise natural language specification of a previously unpublished formal system and asked to complete a number of tasks, ranging from stating function and type definitions to proving simple theorems and verifying user-supplied proofs. The system completed all tasks successfully, showed extensive domain knowledge, invented helpful new syntax and semantics, and exhibited generalization and inference abilities. So the answer seems to be: yes.
