Model-Based Offline Planning with Trajectory Pruning

05/16/2021
by   Xianyuan Zhan, et al.

Offline reinforcement learning (RL) enables learning policies from pre-collected datasets without environment interaction, which provides a promising direction for making RL usable in real-world systems. Although recent offline RL studies have made considerable progress, existing methods still face many practical challenges in real-world system control tasks, such as computational restrictions during agent training and the requirement of extra control flexibility. Model-based planning frameworks provide an attractive solution for such tasks. However, most model-based planning algorithms are not designed for offline settings. Simply combining the ingredients of offline RL with existing methods either yields over-restrictive planning or leads to inferior performance. We propose a new lightweight model-based offline planning framework, namely MOPP, which tackles the dilemma between the restrictions of offline learning and high-performance planning. MOPP encourages more aggressive trajectory rollouts guided by the behavior policy learned from data, and prunes problematic trajectories to avoid potential out-of-distribution samples. Experimental results show that MOPP provides competitive performance compared with existing model-based offline planning and RL approaches, and allows easy adaptation to varying objectives and extra constraints.
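To make the two ideas in the abstract concrete (behavior-guided rollouts with extra exploration noise, followed by pruning of suspect trajectories), here is a minimal sketch. It is not the authors' implementation: the abstract does not specify the pruning criterion or how the final action is chosen, so the ensemble-disagreement test, the exponential return weighting, the scalar state and action, and every name and default below (`plan_action`, `noise_scale`, `uncertainty_threshold`, `temperature`) are assumptions made for illustration.

```python
import math
import random
import statistics


def plan_action(state, behavior_policy, ensemble, reward_fn, rng,
                n_traj=64, horizon=5, noise_scale=0.5,
                uncertainty_threshold=0.1, temperature=1.0):
    """Pick one scalar action from behavior-guided rollouts with pruning.

    All arguments are hypothetical stand-ins: `behavior_policy(s)` returns an
    action, each `f(s, a)` in `ensemble` predicts the next state, and
    `reward_fn(s, a)` returns a scalar reward.
    """
    candidates = []  # one (first_action, return, max_uncertainty) per rollout
    for _ in range(n_traj):
        s, total, worst, first = state, 0.0, 0.0, None
        for t in range(horizon):
            # Follow the learned behavior policy, widened with Gaussian noise
            # so rollouts explore beyond the data's average action.
            a = behavior_policy(s) + noise_scale * rng.gauss(0.0, 1.0)
            if t == 0:
                first = a
            preds = [f(s, a) for f in ensemble]
            # Assumed out-of-distribution proxy: ensemble disagreement.
            worst = max(worst, statistics.pstdev(preds))
            total += reward_fn(s, a)
            s = statistics.fmean(preds)
        candidates.append((first, total, worst))

    # Prune rollouts whose worst-step disagreement exceeds the threshold.
    kept = [c for c in candidates if c[2] <= uncertainty_threshold]
    if not kept:
        # Fallback: keep the single least-uncertain rollout.
        kept = [min(candidates, key=lambda c: c[2])]

    # Return-weighted average of the first actions of surviving rollouts.
    best = max(c[1] for c in kept)
    weights = [math.exp((c[1] - best) / temperature) for c in kept]
    return sum(w * c[0] for w, c in zip(weights, kept)) / sum(weights)
```

Called in a receding-horizon loop, only the returned first action would be executed before replanning. Because `reward_fn` is passed in at planning time, swapping it changes the objective without retraining the dynamics models or the behavior policy, which is one way to read the abstract's claim about adapting to varying objectives.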


research
08/12/2020

Model-Based Offline Planning

Offline learning is a key part of making reinforcement learning (RL) use...
research
11/22/2021

UMBRELLA: Uncertainty-Aware Model-Based Offline Reinforcement Learning Leveraging Planning

Offline reinforcement learning (RL) provides a framework for learning de...
research
06/24/2023

Fighting Uncertainty with Gradients: Offline Reinforcement Learning via Diffusion Score Matching

Offline optimization paradigms such as offline Reinforcement Learning (R...
research
01/07/2022

Offline Reinforcement Learning for Road Traffic Control

Traffic signal control is an important problem in urban mobility with a ...
research
11/15/2021

Learning to Execute: Efficient Learning of Universal Plan-Conditioned Policies in Robotics

Applications of Reinforcement Learning (RL) in robotics are often limite...
research
03/28/2023

Planning with Sequence Models through Iterative Energy Minimization

Recent works have shown that sequence modeling can be effectively used t...
research
08/11/2023

Learning Control Policies for Variable Objectives from Offline Data

Offline reinforcement learning provides a viable approach to obtain adva...
