Provably efficient reconstruction of policy networks

02/07/2020
by   Bogdan Mazoure, et al.

Recent research has shown that learning policies parametrized by large neural networks can achieve significant success on challenging reinforcement learning problems. However, when memory is limited, it is not always possible to store such models exactly for inference, and compressing the policy into a compact representation might be necessary. We propose a general framework for policy representation, which reduces this problem to finding a low-dimensional embedding of a given density function in a separable inner product space. Our framework allows us to derive strong theoretical guarantees, controlling the error of the reconstructed policies. Such guarantees are typically lacking in black-box models, but are very desirable in risk-sensitive tasks. Our experimental results suggest that the reconstructed policies can use less than 10 no decrease in rewards.
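As a rough illustration of what embedding a density in a separable inner product space can look like, the sketch below projects a policy density onto the first few functions of an orthonormal basis and reconstructs it from those coefficients. Everything specific here is an assumption made for illustration and is not taken from the paper: the one-dimensional action space [0, 1], the cosine basis, the Gaussian-mixture density, and the helper names `cosine_basis`, `embed`, `reconstruct`, and `l2_error`.

```python
import numpy as np

def cosine_basis(k, x):
    """Orthonormal cosine basis on [0, 1]: phi_0 = 1, phi_k = sqrt(2) * cos(k * pi * x)."""
    return np.ones_like(x) if k == 0 else np.sqrt(2.0) * np.cos(k * np.pi * x)

def embed(density, n_coeffs, grid):
    """Coefficients <density, phi_k> for k < n_coeffs, using a midpoint-rule inner product."""
    dx = grid[1] - grid[0]
    return np.array([np.sum(density * cosine_basis(k, grid)) * dx
                     for k in range(n_coeffs)])

def reconstruct(coeffs, grid):
    """Evaluate the truncated expansion sum_k c_k * phi_k on the grid."""
    return sum(c * cosine_basis(k, grid) for k, c in enumerate(coeffs))

# Hypothetical policy density over a 1-D action in [0, 1] (assumption):
# a two-component Gaussian mixture, renormalized to integrate to 1 on the grid.
n_points = 2000
grid = (np.arange(n_points) + 0.5) / n_points  # midpoints of equal-width cells
dx = grid[1] - grid[0]
density = (0.6 * np.exp(-0.5 * ((grid - 0.30) / 0.08) ** 2)
           + 0.4 * np.exp(-0.5 * ((grid - 0.75) / 0.05) ** 2))
density /= density.sum() * dx

def l2_error(n_coeffs):
    """L2 distance between the density and its n_coeffs-term reconstruction."""
    approx = reconstruct(embed(density, n_coeffs, grid), grid)
    return np.sqrt(np.sum((density - approx) ** 2) * dx)

# Reconstruction error for embeddings of increasing dimension.
errors = {m: l2_error(m) for m in (4, 16, 64)}
```

In this toy setting the stored representation is just the coefficient vector, so its length plays the role of the memory budget, and the reconstruction error shrinks as more coefficients are kept. A truncated expansion is not guaranteed to be non-negative, so using it as a policy would require an additional correction step that is not shown here.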

research
02/26/2019

Planning in Hierarchical Reinforcement Learning: Guarantees for Using Local Policies

We consider a setting of hierarchical reinforcement learning, in which ...
research
07/01/2018

Towards Mixed Optimization for Reinforcement Learning with Program Synthesis

Deep reinforcement learning has led to several recent breakthroughs, tho...
research
08/12/2021

A functional mirror ascent view of policy gradient methods with function approximation

We use functional mirror ascent to propose a general framework (referred...
research
02/23/2021

State Augmented Constrained Reinforcement Learning: Overcoming the Limitations of Learning with Rewards

Constrained reinforcement learning involves multiple rewards that must i...
research
06/23/2020

Expert-Supervised Reinforcement Learning for Offline Policy Learning and Evaluation

Offline Reinforcement Learning (RL) is a promising approach for learning...
research
02/09/2018

Learning Robust Options

Robust reinforcement learning aims to produce policies that have strong ...
research
02/05/2016

Active Information Acquisition

We propose a general framework for sequential and dynamic acquisition of...
