TTR-Based Rewards for Reinforcement Learning with Implicit Model Priors

03/23/2019
by Xubo Lyu, et al.

Model-free reinforcement learning (RL) provides an attractive approach for learning control policies directly in high-dimensional state spaces. However, many goal-oriented tasks involving sparse rewards remain difficult to solve with state-of-the-art model-free RL algorithms, even in simulation. One of the key difficulties is that deep RL, due to its relatively poor sample complexity, often requires a prohibitive number of trials to obtain a learning signal. We propose a novel, non-sparse reward function for robotic RL tasks by leveraging physical priors in the form of a time-to-reach (TTR) function computed from an approximate system dynamics model. TTR functions come from the optimal control field and measure the minimal time required to move from any state to the goal. Because TTR functions are intractable to compute for complex systems, we compute them in a lower-dimensional state space and then apply a simple transformation to convert them into a TTR-based reward function for the MDP in RL tasks. Our TTR-based reward function provides highly informative rewards that account for system dynamics.
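The idea of a dense, TTR-derived reward can be illustrated with a minimal sketch. The abstract does not say how the TTR function is solved or which transformation is applied, so everything below is an assumption made for illustration: the low-dimensional model is a 2D point moving at a fixed maximum speed on a grid, the TTR is approximated with Dijkstra's algorithm rather than an optimal control solver, the first two entries of the full state are taken to be the planar position, and the reward is simply the negated TTR. The names `compute_ttr_grid`, `ttr_reward`, and `unreachable_penalty` are hypothetical.

```python
import heapq

import numpy as np


def compute_ttr_grid(free, goal, dx=1.0, v_max=1.0):
    """Approximate time-to-reach on a 4-connected grid with Dijkstra.

    Assumed low-dimensional model: a point that moves at speed v_max,
    so crossing one cell of width dx takes dx / v_max seconds.
    `free` is a boolean array (True = traversable); `goal` is (row, col).
    """
    ttr = np.full(free.shape, np.inf)
    ttr[goal] = 0.0
    heap = [(0.0, goal)]
    step_time = dx / v_max
    while heap:
        t, (i, j) = heapq.heappop(heap)
        if t > ttr[i, j]:
            continue  # stale queue entry
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            inside = 0 <= ni < free.shape[0] and 0 <= nj < free.shape[1]
            if inside and free[ni, nj]:
                nt = t + step_time
                if nt < ttr[ni, nj]:
                    ttr[ni, nj] = nt
                    heapq.heappush(heap, (nt, (ni, nj)))
    return ttr


def ttr_reward(full_state, ttr, dx=1.0, unreachable_penalty=-1000.0):
    """Map a full RL state to a dense reward via the low-dimensional TTR.

    Assumption: full_state[0:2] is the planar position; all remaining
    entries (velocities, joint angles, ...) are ignored by the projection.
    """
    i = int(np.clip(round(full_state[0] / dx), 0, ttr.shape[0] - 1))
    j = int(np.clip(round(full_state[1] / dx), 0, ttr.shape[1] - 1))
    t = ttr[i, j]
    # Assumed transformation: reward is the negated time-to-reach.
    return float(-t) if np.isfinite(t) else unreachable_penalty


# Usage: a 5x5 map with one blocked cell and the goal in a corner.
free = np.ones((5, 5), dtype=bool)
free[2, 2] = False
ttr = compute_ttr_grid(free, goal=(0, 0))
reward = ttr_reward(np.array([4.0, 4.0, 0.3, -0.1]), ttr)
```

Under these assumptions every reachable state receives a reward that grows as the agent gets closer (in time) to the goal, in contrast to a sparse signal that is nonzero only at the goal itself.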


research
06/16/2018

BaRC: Backward Reachability Curriculum for Robotic Reinforcement Learning

Model-free Reinforcement Learning (RL) offers an attractive approach to ...
research
05/26/2023

Reinforcement Learning with Simple Sequence Priors

Everything else being equal, simpler models should be preferred over mor...
research
06/17/2023

The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions

Reinforcement learning (RL) algorithms have proven transformative in a r...
research
11/04/2020

MBVI: Model-Based Value Initialization for Reinforcement Learning

Model-free reinforcement learning (RL) is capable of learning control po...
research
09/04/2019

Learning sparse representations in reinforcement learning

Reinforcement learning (RL) algorithms allow artificial agents to improv...
research
05/31/2023

ROSARL: Reward-Only Safe Reinforcement Learning

An important problem in reinforcement learning is designing agents that ...
research
07/18/2018

Backplay: "Man muss immer umkehren"

A long-standing problem in model free reinforcement learning (RL) is tha...
