Uncertainty-driven Trajectory Truncation for Model-based Offline Reinforcement Learning

04/10/2023
by Junjie Zhang, et al.

Equipped with a trained environmental dynamics model, model-based offline reinforcement learning (RL) algorithms can often learn good policies from fixed-size datasets, even from datasets of poor quality. However, there is no guarantee that the samples generated by the trained dynamics model are reliable (e.g., some synthetic samples may lie outside the support region of the static dataset). To address this issue, we propose Trajectory Truncation with Uncertainty (TATU), which adaptively truncates a synthetic trajectory if the accumulated uncertainty along the trajectory is too large. We derive a theoretical performance bound for TATU to justify its benefits. To show its advantages empirically, we first combine TATU with two classical model-based offline RL algorithms, MOPO and COMBO. Furthermore, we integrate TATU with several off-the-shelf model-free offline RL algorithms, e.g., BCQ. Experimental results on the D4RL benchmark show that TATU significantly improves their performance, often by a large margin.
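The truncation idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how uncertainty is measured or whether the transition that crosses the threshold is kept, so the names `truncated_rollout`, `model_step`, `toy_policy`, and `toy_model_step`, the scalar state, and the choice to discard the offending transition are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

# (state, action, reward, next_state); scalar states keep the toy example small.
Transition = Tuple[float, float, float, float]


def truncated_rollout(
    start_state: float,
    policy: Callable[[float], float],
    model_step: Callable[[float, float], Tuple[float, float, float]],
    horizon: int,
    threshold: float,
) -> List[Transition]:
    """Roll out a learned model, stopping once accumulated uncertainty exceeds `threshold`.

    `model_step` is a hypothetical interface returning
    (next_state, reward, uncertainty) for a given (state, action).
    """
    trajectory: List[Transition] = []
    accumulated = 0.0
    state = start_state
    for _ in range(horizon):
        action = policy(state)
        next_state, reward, uncertainty = model_step(state, action)
        accumulated += uncertainty
        if accumulated > threshold:
            # Assumption: the transition that crosses the threshold is dropped,
            # together with everything that would have followed it.
            break
        trajectory.append((state, action, reward, next_state))
        state = next_state
    return trajectory


def toy_policy(state: float) -> float:
    """Always move one unit to the right."""
    return 1.0


def toy_model_step(state: float, action: float) -> Tuple[float, float, float]:
    """Toy dynamics whose uncertainty grows as the state drifts away from 0."""
    next_state = state + action
    reward = 1.0
    uncertainty = 0.25 * (abs(state) + 1.0)
    return next_state, reward, uncertainty


if __name__ == "__main__":
    rollout = truncated_rollout(0.0, toy_policy, toy_model_step, horizon=5, threshold=1.0)
    print(len(rollout), "transitions kept")
```

In this toy setup the per-step uncertainty grows with distance from the start state, so a small threshold cuts the rollout short while a large one lets it run to the full horizon. The surviving transitions would then be added to the synthetic buffer used by the downstream offline RL algorithm.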

research
06/16/2022

Double Check Your State Before Trusting It: Confidence-Aware Bidirectional Offline Model-Based Imagination

The learned policy of model-free offline reinforcement learning (RL) met...
research
09/16/2023

DOMAIN: MilDly COnservative Model-BAsed OfflINe Reinforcement Learning

Model-based reinforcement learning (RL), which learns environment model ...
research
04/15/2019

Curious iLQR: Resolving Uncertainty in Model-based RL

Curiosity as a means to explore during reinforcement learning problems h...
research
02/22/2021

GELATO: Geometrically Enriched Latent Model for Offline Reinforcement Learning

Offline reinforcement learning approaches can generally be divided to pr...
research
02/10/2021

Personalization for Web-based Services using Offline Reinforcement Learning

Large-scale Web-based services present opportunities for improving UI po...
research
06/01/2023

IQL-TD-MPC: Implicit Q-Learning for Hierarchical Model Predictive Control

Model-based reinforcement learning (RL) has shown great promise due to i...
research
03/16/2021

Few Shot System Identification for Reinforcement Learning

Learning by interaction is the key to skill acquisition for most living ...
