IQL-TD-MPC: Implicit Q-Learning for Hierarchical Model Predictive Control

06/01/2023
by Rohan Chitnis, et al.

Model-based reinforcement learning (RL) has shown great promise due to its sample efficiency, but it still struggles with long-horizon, sparse-reward tasks, especially in offline settings where the agent learns from a fixed dataset. We hypothesize that model-based RL agents struggle in these environments due to a lack of long-term planning capabilities, and that planning in a temporally abstract model of the environment can alleviate this issue. In this paper, we make two key contributions: (1) we introduce an offline model-based RL algorithm, IQL-TD-MPC, that extends the state-of-the-art Temporal Difference Learning for Model Predictive Control (TD-MPC) with Implicit Q-Learning (IQL); and (2) we propose to use IQL-TD-MPC as a Manager in a hierarchical setting with any off-the-shelf offline RL algorithm as a Worker. More specifically, we pre-train a temporally abstract IQL-TD-MPC Manager to predict "intent embeddings", which roughly correspond to subgoals, via planning. We empirically show that augmenting state representations with intent embeddings generated by an IQL-TD-MPC Manager significantly improves the performance of off-the-shelf offline RL agents on some of the most challenging D4RL benchmark tasks. For instance, the offline RL algorithms AWAC, TD3-BC, DT, and CQL all get zero or near-zero normalized evaluation scores on the medium and large antmaze tasks, whereas our modification yields an average score over 40.
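As a rough illustration of two ingredients named in the abstract, the sketch below shows (a) the asymmetric expectile loss that Implicit Q-Learning uses to fit its value function, and (b) one plausible way to augment Worker states with Manager-produced intent embeddings. The abstract does not say how the augmentation is performed, so plain concatenation is an assumption here, and the function names, array shapes, and the value `tau=0.7` are hypothetical choices made only for this example.

```python
import numpy as np


def expectile_loss(diff, tau=0.7):
    """Asymmetric squared loss |tau - 1(diff < 0)| * diff**2.

    In IQL, `diff` is typically Q(s, a) - V(s); tau > 0.5 penalizes
    positive differences more heavily than negative ones.
    The default tau is an illustrative assumption.
    """
    diff = np.asarray(diff, dtype=float)
    weight = np.where(diff < 0, 1.0 - tau, tau)
    return float(np.mean(weight * diff ** 2))


def augment_states(states, intent_embeddings):
    """Concatenate each state with its intent embedding.

    Assumption: augmentation is a simple feature-wise concatenation;
    the abstract does not specify the mechanism.
    """
    states = np.asarray(states, dtype=float)
    intent_embeddings = np.asarray(intent_embeddings, dtype=float)
    if states.shape[0] != intent_embeddings.shape[0]:
        raise ValueError("states and intent embeddings must share a batch size")
    return np.concatenate([states, intent_embeddings], axis=-1)


# Hypothetical shapes: a batch of 4 states of dimension 5, and
# 3-dimensional intent embeddings produced by a pre-trained Manager.
states = np.zeros((4, 5))
intents = np.ones((4, 3))
augmented = augment_states(states, intents)
```

In this sketch the Worker would simply be trained on `augmented` in place of `states`, which is what allows any off-the-shelf offline RL algorithm to be used without modification.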

