Uncertainty-Aware Policy Optimization: A Robust, Adaptive Trust Region Approach

12/19/2020
by James Queeney, et al.

For reinforcement learning techniques to be useful in real-world decision-making processes, they must be able to produce robust performance from limited data. Deep policy optimization methods have achieved impressive results on complex tasks, but their real-world adoption remains limited because they often require large amounts of data to succeed. When only small sample sizes are available, these methods can become unstable because they rely on high-dimensional sample-based estimates. In this work, we develop techniques to control the uncertainty introduced by these estimates. We leverage these techniques to propose a deep policy optimization approach designed to produce stable performance even when data is scarce. The resulting algorithm, Uncertainty-Aware Trust Region Policy Optimization, generates robust policy updates that adapt to the level of uncertainty present throughout the learning process.

research
07/06/2017

Trust-PCL: An Off-Policy Trust Region Method for Continuous Control

Trust region methods, such as TRPO, are often used to stabilize policy o...
research
03/11/2023

Uncertainty-Aware Off-Policy Learning

Off-policy learning, referring to the procedure of policy optimization w...
research
02/28/2018

Model-Ensemble Trust-Region Policy Optimization

Model-free reinforcement learning (RL) methods are succeeding in a growi...
research
12/13/2022

PPO-UE: Proximal Policy Optimization via Uncertainty-Aware Exploration

Proximal Policy Optimization (PPO) is a highly popular policy-based deep...
research
05/20/2020

Mirror Descent Policy Optimization

We propose deep Reinforcement Learning (RL) algorithms inspired by mirro...
research
07/04/2014

Robust Optimization using Machine Learning for Uncertainty Sets

Our goal is to build robust optimization problems for making decisions b...
research
05/13/2021

Policy Optimization in Bayesian Network Hybrid Models of Biomanufacturing Processes

Biopharmaceutical manufacturing is a rapidly growing industry with impac...
