Efficient Diffusion Policies for Offline Reinforcement Learning

05/31/2023
by Bingyi Kang, et al.

Offline reinforcement learning (RL) aims to learn optimal policies from offline datasets, where the parameterization of policies is crucial but often overlooked. Recently, Diffusion-QL significantly boosted the performance of offline RL by representing a policy with a diffusion model, whose success relies on a parameterized Markov chain with hundreds of steps for sampling. However, Diffusion-QL suffers from two critical limitations: 1) it is computationally inefficient to run forward and backward passes through the whole Markov chain during training; 2) it is incompatible with maximum likelihood-based RL algorithms (e.g., policy gradient methods), because the likelihood of diffusion models is intractable. Therefore, we propose efficient diffusion policy (EDP) to overcome these two challenges. EDP approximately constructs actions from corrupted ones during training, which avoids running the sampling chain. We conduct extensive experiments on the D4RL benchmark. The results show that EDP can reduce the diffusion policy training time from 5 days to 5 hours on gym-locomotion tasks. Moreover, we show that EDP is compatible with various offline RL algorithms (TD3, CRR, and IQL) and achieves new state-of-the-art results on D4RL by large margins over previous methods. Our code is available at https://github.com/sail-sg/edp.
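The idea of constructing an action from a corrupted one can be illustrated with the standard denoising-diffusion identity: if a noisy action is a_k = sqrt(ᾱ_k)·a_0 + sqrt(1 − ᾱ_k)·ε, then a_0 can be estimated in a single step from a_k and a noise prediction. The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function names, the linear noise schedule, and the batch shapes are all hypothetical, and the true noise stands in for the output of a trained noise-prediction network.

```python
import numpy as np


def make_alpha_bars(num_steps=100, beta_min=1e-4, beta_max=2e-2):
    """Cumulative products of (1 - beta) for an assumed linear noise schedule."""
    betas = np.linspace(beta_min, beta_max, num_steps)
    return np.cumprod(1.0 - betas)


def corrupt(action, alpha_bar_k, noise):
    """Forward diffusion: mix a clean action with Gaussian noise at step k."""
    return np.sqrt(alpha_bar_k) * action + np.sqrt(1.0 - alpha_bar_k) * noise


def approximate_action(noisy_action, alpha_bar_k, predicted_noise):
    """One-step estimate of the clean action, without running the sampling chain."""
    scaled_noise = np.sqrt(1.0 - alpha_bar_k) * predicted_noise
    return (noisy_action - scaled_noise) / np.sqrt(alpha_bar_k)


rng = np.random.default_rng(0)
alpha_bars = make_alpha_bars()

# Hypothetical batch of 4 dataset actions with 6 dimensions each.
action = rng.uniform(-1.0, 1.0, size=(4, 6))
noise = rng.standard_normal(action.shape)

k = 50  # an arbitrary diffusion step
noisy = corrupt(action, alpha_bars[k], noise)

# Assumption: a perfect noise predictor. A trained network would return an
# estimate of `noise` given (noisy, k, state), so the result would be approximate.
recovered = approximate_action(noisy, alpha_bars[k], noise)
```

With a perfect noise prediction the estimate matches the clean action exactly; with a learned predictor it is only approximate, which is the trade-off made to avoid a multi-step sampling chain during training.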


research
08/12/2022

Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning

Offline reinforcement learning (RL), which aims to learn an optimal poli...
research
07/10/2023

Diffusion Policies for Out-of-Distribution Generalization in Offline Reinforcement Learning

Offline Reinforcement Learning (RL) methods leverage previous experience...
research
02/23/2022

Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning

Offline Reinforcement Learning (RL) aims to learn policies from previous...
research
06/01/2023

Improving and Benchmarking Offline Reinforcement Learning Algorithms

Recently, Offline Reinforcement Learning (RL) has achieved remarkable pr...
research
04/25/2023

Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning

Guided sampling is a vital approach for applying diffusion models in rea...
research
06/08/2023

Instructed Diffuser with Temporal Condition Guidance for Offline Reinforcement Learning

Recent works have shown the potential of diffusion models in computer vi...
research
06/24/2023

Fighting Uncertainty with Gradients: Offline Reinforcement Learning via Diffusion Score Matching

Offline optimization paradigms such as offline Reinforcement Learning (R...
