Deep Reinforcement Learning in HOL4

10/25/2019
by Thibault Gauthier, et al.

The paper describes an implementation of deep reinforcement learning through self-supervised learning within the proof assistant HOL4. A close interaction between the machine learning modules and the HOL4 library is achieved by the choice of tree neural networks (TNNs) as machine learning models and the internal use of HOL4 terms to represent tree structures of TNNs. Recursive improvement is possible when a given task is expressed as a search problem. In this case, a Monte Carlo Tree Search (MCTS) algorithm guided by a TNN can be used to explore the search space and produce better examples for training the next TNN. As an illustration, tasks over propositional and arithmetical terms, representative of fundamental theorem proving techniques, are specified and learned: truth estimation, end-to-end computation, term rewriting and term synthesis.
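
To make the idea of a tree neural network concrete, here is a minimal Python sketch. It is not the HOL4 implementation described in the abstract. It assumes a toy arithmetic signature (`0`, `S`, `+`, `*`), terms encoded as nested tuples, an embedding size of 4, and random untrained weights; the names `TreeNN`, `ARITY`, `DIM`, `make_layer` and `apply_layer` are hypothetical. Each operator owns one small dense layer, the embedding of a term is computed bottom-up from the embeddings of its subterms, and a head layer maps the root embedding to a score in (0, 1), as one might do for a truth-estimation task.

```python
import math
import random

DIM = 4  # embedding size (arbitrary choice for this sketch)

# Arity of each operator in a toy arithmetic signature (assumption).
ARITY = {"0": 0, "S": 1, "+": 2, "*": 2}


def make_layer(n_in, n_out, rng):
    """Random dense layer: n_out rows of n_in weights plus a trailing bias."""
    return [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_out)]


def apply_layer(layer, inputs, act):
    """Apply a dense layer (weights, then bias) followed by activation `act`."""
    return [act(row[-1] + sum(w * x for w, x in zip(row, inputs)))
            for row in layer]


class TreeNN:
    """One dense layer per operator, composed along the term tree."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # An operator of arity k reads k concatenated child embeddings.
        self.ops = {op: make_layer(DIM * k, DIM, rng) for op, k in ARITY.items()}
        self.head = make_layer(DIM, 1, rng)

    def embed(self, term):
        """Embed a term such as ("+", ("S", ("0",)), ("0",)) bottom-up."""
        op, *args = term
        children = [x for arg in args for x in self.embed(arg)]
        return apply_layer(self.ops[op], children, math.tanh)

    def predict(self, term):
        """Map the root embedding to a score in (0, 1)."""
        def sigmoid(z):
            return 1.0 / (1.0 + math.exp(-z))
        return apply_layer(self.head, self.embed(term), sigmoid)[0]


if __name__ == "__main__":
    net = TreeNN(seed=0)
    term = ("+", ("S", ("0",)), ("0",))  # S(0) + 0
    print(net.predict(term))
```

In a reinforcement learning loop of the kind the abstract outlines, a network like this would be retrained on examples produced by the search it guides; the sketch covers only the forward pass.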

11/02/2018

Automated Theorem Proving in Intuitionistic Propositional Logic by Deep Reinforcement Learning

The problem-solving in automated theorem proving (ATP) can be interprete...
06/12/2020

StarCraft II Build Order Optimization using Deep Reinforcement Learning and Monte-Carlo Tree Search

The real-time strategy game of StarCraft II has been posed as a challeng...
09/03/2020

Tree Neural Networks in HOL4

We present an implementation of tree neural networks within the proof as...
02/22/2020

Towards model discovery with reinforcement learning

We propose to learn (i) models expressed in analytical form, (ii) which...
06/20/2017

Towards Proof Synthesis Guided by Neural Machine Translation for Intuitionistic Propositional Logic

Inspired by the recent evolution of deep neural networks (DNNs) in machi...
10/13/2019

Neural Program Synthesis By Self-Learning

Neural inductive program synthesis is a task generating instructions tha...
05/19/2018

Reinforcement Learning of Theorem Proving

We introduce a theorem proving algorithm that uses practically no domain...