Programmatic Policy Extraction by Iterative Local Search

01/18/2022
by Rasmus Larsen, et al.

Reinforcement learning policies are often represented by neural networks, but programmatic policies are preferred in some cases because they are more interpretable, more amenable to formal verification, or generalize better. While efficient algorithms for learning neural policies exist, learning programmatic policies remains challenging. Combining imitation-projection and dataset aggregation with a local search heuristic, we present a simple and direct approach to extracting a programmatic policy from a pretrained neural policy. After examining our local search heuristic on a programming-by-example problem, we demonstrate our programmatic policy extraction method on a pendulum swing-up problem. Whether trained using a hand-crafted expert policy or a learned neural policy, our method discovers simple and interpretable policies that perform almost as well as the original.
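To make the combination of dataset aggregation and local search more concrete, the following is a minimal sketch and not the authors' implementation. Everything in it is an assumption made for illustration: the `expert` function stands in for a pretrained neural policy, the program space is a single clipped linear expression with coefficients on a 0.25 grid, the dynamics are a toy one-dimensional system, and the search is plain hill climbing. The abstract does not specify the paper's actual program language or search heuristic.

```python
import random

def expert(x):
    # Hypothetical stand-in for a pretrained neural policy: a saturated linear controller.
    return max(-1.0, min(1.0, -2.0 * x))

def run_program(params, x):
    # Assumed program space: clip(a * x + b, -1, 1) with parameters (a, b).
    a, b = params
    return max(-1.0, min(1.0, a * x + b))

def imitation_loss(params, dataset):
    # Mean squared difference between the program's actions and the expert's labels.
    return sum((run_program(params, x) - y) ** 2 for x, y in dataset) / len(dataset)

def neighbors(params, step=0.25):
    # Local moves: change one coefficient by one grid step.
    a, b = params
    return [(a + step, b), (a - step, b), (a, b + step), (a, b - step)]

def local_search(params, dataset, max_iters=200):
    # Hill climbing: move to the best neighbor until no neighbor lowers the loss.
    best, best_loss = params, imitation_loss(params, dataset)
    for _ in range(max_iters):
        cand = min(neighbors(best), key=lambda p: imitation_loss(p, dataset))
        cand_loss = imitation_loss(cand, dataset)
        if cand_loss >= best_loss:
            break
        best, best_loss = cand, cand_loss
    return best

def rollout(policy, x0, steps=20):
    # Toy dynamics (assumed): x_{t+1} = x_t + 0.1 * action. Returns the visited states.
    states, x = [], x0
    for _ in range(steps):
        states.append(x)
        x = x + 0.1 * policy(x)
    return states

def extract(iterations=5, rollouts_per_iter=5, seed=0):
    # Dataset aggregation loop: visit states with the current program,
    # label them with the expert, then refit the program by local search.
    rng = random.Random(seed)
    params, dataset = (0.0, 0.0), []
    for _ in range(iterations):
        for _ in range(rollouts_per_iter):
            x0 = rng.uniform(-1.0, 1.0)
            states = rollout(lambda x: run_program(params, x), x0)
            dataset.extend((x, expert(x)) for x in states)
        params = local_search(params, dataset)
    return params, imitation_loss(params, dataset)

if __name__ == "__main__":
    print(extract())
```

The states are collected by running the current program rather than the expert, so the aggregated dataset covers the states the extracted policy actually reaches. Each refit starts from the previous parameters, which keeps every search local.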

research
04/06/2018

Programmatically Interpretable Reinforcement Learning

We study the problem of generating interpretable and verifiable policies...
research
07/11/2019

Imitation-Projected Programmatic Reinforcement Learning

We study the problem of programmatic reinforcement learning, in which po...
research
09/16/2021

Interpretable Local Tree Surrogate Policies

High-dimensional policies, such as those represented by neural networks,...
research
06/06/2013

Policy Search: Any Local Optimum Enjoys a Global Performance Guarantee

Local Policy Search is a popular reinforcement learning approach for han...
research
05/12/2017

A Formal Characterization of the Local Search Topology of the Gap Heuristic

The pancake puzzle is a classic optimization problem that has become a s...
research
04/02/2015

End-to-End Training of Deep Visuomotor Policies

Policy search methods can allow robots to learn control policies for a w...
research
05/14/2019

Homotopic Convex Transformation: A New Method to Smooth the Landscape of the Traveling Salesman Problem

This paper proposes a novel landscape smoothing method for the symmetric...
