Learning Relative Return Policies With Upside-Down Reinforcement Learning

02/23/2022
by Dylan R. Ashley, et al.

Lately, there has been a resurgence of interest in using supervised learning to solve reinforcement learning problems. Recent work in this area has largely focused on learning command-conditioned policies. We investigate the potential of one such method – upside-down reinforcement learning – to work with commands that specify a desired relationship between some scalar value and the observed return. We show that upside-down reinforcement learning can learn to carry out such commands online in a tabular bandit setting and in CartPole with non-linear function approximation. By doing so, we demonstrate the power of this family of methods and open the way for their practical use under more complicated command structures.
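
To make the idea of a relative return command concrete, the following is a minimal sketch of a tabular, command-conditioned learner in a two-armed bandit. It is not the authors' implementation. The arm rewards, the command vocabulary ("greater" and "less"), the fixed threshold of 0.5, the uniform exploration, and all function names are assumptions made for illustration. The sketch only shows the supervised-learning core: each observed return is relabelled in hindsight with every command it satisfies, and the policy for a command picks the arm most often seen to satisfy it.

```python
import random
from collections import defaultdict

# Hypothetical deterministic bandit: arm 0 always pays 0.0, arm 1 always pays 1.0.
ARM_REWARDS = [0.0, 1.0]
# Assumed command vocabulary: the return should be greater/less than a scalar.
RELATIONS = ("greater", "less")


def satisfied(relation, threshold, ret):
    """Return True if the observed return fulfils the command."""
    return ret > threshold if relation == "greater" else ret < threshold


def train(episodes=2000, threshold=0.5, seed=0):
    """Collect experience and count which arms fulfilled which commands."""
    rng = random.Random(seed)
    # counts[(relation, threshold)][arm] starts at 1 (add-one smoothing).
    counts = defaultdict(lambda: [1] * len(ARM_REWARDS))
    for _ in range(episodes):
        # Uniform exploration (a simplification chosen for this sketch).
        arm = rng.randrange(len(ARM_REWARDS))
        ret = ARM_REWARDS[arm]
        # Hindsight relabelling: every command this return satisfies
        # becomes a supervised (command -> arm) training example.
        for relation in RELATIONS:
            if satisfied(relation, threshold, ret):
                counts[(relation, threshold)][arm] += 1
    return counts


def act(counts, relation, threshold=0.5):
    """Greedy command-conditioned policy: the arm with the highest count."""
    arm_counts = counts[(relation, threshold)]
    return max(range(len(arm_counts)), key=arm_counts.__getitem__)


if __name__ == "__main__":
    learned = train()
    print("greater than 0.5 ->", act(learned, "greater"))
    print("less than 0.5    ->", act(learned, "less"))
```

Under these assumptions the same table answers opposite commands with different arms, which is the behaviour the abstract describes for a command-conditioned policy. A function approximator would replace the count table in a setting such as CartPole.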

research
11/03/2016

Quantile Reinforcement Learning

In reinforcement learning, the standard criterion to evaluate policies i...
research
06/02/2022

When does return-conditioned supervised learning work for offline reinforcement learning?

Several recent works have proposed a class of algorithms for the offline...
research
10/21/2022

Implicit Offline Reinforcement Learning via Supervised Learning

Offline Reinforcement Learning (RL) via Supervised Learning is a simple ...
research
09/19/2018

Interpretable Reinforcement Learning with Ensemble Methods

We propose to use boosted regression trees as a way to compute human-int...
research
03/11/2021

Multi-Task Federated Reinforcement Learning with Adversaries

Reinforcement learning algorithms, just like any other Machine learning ...
research
05/27/2021

Pattern Transfer Learning for Reinforcement Learning in Order Dispatching

Order dispatch is one of the central problems to ride-sharing platforms....
research
07/04/2022

Goal-Conditioned Generators of Deep Policies

Goal-conditioned Reinforcement Learning (RL) aims at learning optimal po...
