Improving exploration in policy gradient search: Application to symbolic optimization

07/19/2021
by   Mikel Landajuela Larma, et al.

Many machine learning strategies designed to automate mathematical tasks leverage neural networks to search large combinatorial spaces of mathematical symbols. In contrast to traditional evolutionary approaches, using a neural network at the core of the search allows learning higher-level symbolic patterns, providing an informed direction to guide the search. When no labeled data is available, such networks can still be trained using reinforcement learning. However, we demonstrate that this approach can suffer from an early commitment phenomenon and from initialization bias, both of which limit exploration. We present two exploration methods to tackle these issues, building upon ideas of entropy regularization and distribution initialization. We show that these techniques can improve the performance, increase sample efficiency, and lower the complexity of solutions for the task of symbolic regression.
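To make the entropy regularization idea concrete, here is a minimal sketch of a REINFORCE-style objective with a standard entropy bonus, written for a policy that emits one token per step from a fixed symbol vocabulary. It is not the specific regularizer proposed in the paper, which the abstract does not spell out. The function names, array shapes, and the `entropy_weight` parameter are illustrative assumptions.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def entropy_regularized_objective(logits, actions, rewards, baseline, entropy_weight):
    """Policy gradient surrogate objective plus an entropy bonus (to be maximized).

    Assumed (hypothetical) shapes:
      logits:  (batch, steps, vocab) unnormalized token scores from the policy
      actions: (batch, steps) integer indices of the sampled tokens
      rewards: (batch,) scalar reward of each sampled expression
      baseline: scalar used to reduce variance
    """
    probs = softmax(logits)
    log_probs = np.log(probs + 1e-12)  # small constant avoids log(0)

    # Log-likelihood of each sampled sequence: sum of chosen-token log-probs.
    chosen = np.take_along_axis(log_probs, actions[..., None], axis=-1).squeeze(-1)
    seq_log_prob = chosen.sum(axis=1)

    # Per-step entropy of the token distribution, summed over the sequence.
    seq_entropy = -(probs * log_probs).sum(axis=-1).sum(axis=1)

    policy_term = ((rewards - baseline) * seq_log_prob).mean()
    return policy_term + entropy_weight * seq_entropy.mean()
```

With a positive `entropy_weight`, a policy that collapses onto a few tokens early scores lower on the bonus term than one that keeps its token distributions spread out, which is the general mechanism by which entropy regularization encourages exploration.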

research
05/31/2023

Representation-Driven Reinforcement Learning

We present a representation-driven framework for reinforcement learning....
research
04/13/2021

Distilling Wikipedia mathematical knowledge into neural network models

Machine learning applications to symbolic mathematics are becoming incre...
research
03/06/2020

AutoML-Zero: Evolving Machine Learning Algorithms From Scratch

Machine learning research has advanced in multiple aspects, including mo...
research
09/26/2020

Neurosymbolic Reinforcement Learning with Formally Verified Exploration

We present Revel, a partially neural reinforcement learning (RL) framewo...
research
04/28/2019

Learning walk and trot from the same objective using different types of exploration

In quadruped gait learning, policy search methods that scale high dimens...
research
09/13/2018

Sequential Coordination of Deep Models for Learning Visual Arithmetic

Achieving machine intelligence requires a smooth integration of percepti...
