Towards Mixed Optimization for Reinforcement Learning with Program Synthesis

07/01/2018
by   Surya Bhupatiraju, et al.
2

Deep reinforcement learning has led to several recent breakthroughs, though the learned policies are often based on black-box neural networks. This makes them difficult to interpret and to impose desired specification constraints during learning. We present an iterative framework, MORL, for improving the learned policies using program synthesis. Concretely, we propose to use synthesis techniques to obtain a symbolic representation of the learned policy, which can then be debugged manually or automatically using program repair. After the repair step, we use behavior cloning to obtain the policy corresponding to the repaired program, which is then further improved using gradient descent. This process continues until the learned policy satisfies desired constraints. We instantiate MORL for the simple CartPole problem and show that the programmatic representation allows for high-level modifications that in turn lead to improved learning of the policies.

READ FULL TEXT
research
07/11/2019

Imitation-Projected Programmatic Reinforcement Learning

We study the problem of programmatic reinforcement learning, in which po...
research
09/07/2023

Learning of Generalizable and Interpretable Knowledge in Grid-Based Reinforcement Learning Environments

Understanding the interactions of agents trained with deep reinforcement...
research
08/10/2020

Robot Action Selection Learning via Layered Dimension Informed Program Synthesis

Action selection policies (ASPs), used to compose low-level robot skills...
research
02/07/2020

Provably efficient reconstruction of policy networks

Recent research has shown that learning poli-cies parametrized by large ...
research
10/13/2019

Neural Program Synthesis By Self-Learning

Neural inductive program synthesis is a task generating instructions tha...
research
01/19/2019

Learning retrosynthetic planning through self-play

The problem of retrosynthetic planning can be framed as one player game,...
research
02/08/2016

Graying the black box: Understanding DQNs

In recent years there is a growing interest in using deep representation...

Please sign up or login with your details

Forgot password? Click here to reset