Hierarchical Decision Transformer

09/21/2022
by André Correia, et al.

Sequence models in reinforcement learning require task knowledge to estimate the task policy. This paper presents a hierarchical algorithm for learning a sequence model from demonstrations. The high-level mechanism guides the low-level controller through the task by selecting sub-goals for it to reach. This sequence of sub-goals replaces the returns-to-go used by previous methods, improving overall performance, especially in tasks with longer episodes and scarcer rewards. We validate our method on multiple tasks from the OpenAI Gym, D4RL, and RoboMimic benchmarks. Our method outperforms the baselines in eight out of ten tasks of varied horizons and reward frequencies without prior task knowledge, showing the advantages of a hierarchical approach for learning from demonstrations with a sequence model.
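
To make the contrast between the two conditioning signals concrete, here is a minimal Python sketch. It is not the authors' implementation. `returns_to_go` computes the undiscounted sum of future rewards, the signal the abstract says earlier methods condition on. `fixed_interval_subgoals` is a hypothetical stand-in for the high-level mechanism: it labels each step with the state reached a fixed number of steps later, whereas the paper's mechanism selects sub-goals rather than using a fixed interval. All function names, the string-valued states, and the `interval` parameter are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple, TypeVar

S = TypeVar("S")  # state type (assumed; any object works here)
A = TypeVar("A")  # action type
C = TypeVar("C")  # conditioning type: a return-to-go or a sub-goal


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Undiscounted sum of the current and all future rewards at each step."""
    rtg: List[float] = []
    total = 0.0
    for r in reversed(rewards):
        total += r
        rtg.append(total)
    rtg.reverse()
    return rtg


def fixed_interval_subgoals(states: Sequence[S], interval: int) -> List[S]:
    """Hypothetical high-level stand-in (an assumption, not the paper's method).

    Labels step t with the state reached `interval` steps later,
    clipped to the final state of the demonstration.
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    last = len(states) - 1
    return [states[min(t + interval, last)] for t in range(len(states))]


def build_tokens(
    conditioning: Sequence[C], states: Sequence[S], actions: Sequence[A]
) -> List[Tuple[C, S, A]]:
    """Interleave (conditioning, state, action) triples for a sequence model."""
    return list(zip(conditioning, states, actions))


if __name__ == "__main__":
    # A sparse-reward demonstration: the only reward arrives at the last step.
    states = ["s0", "s1", "s2", "s3"]
    actions = ["a0", "a1", "a2", "a3"]
    rewards = [0.0, 0.0, 0.0, 1.0]

    print(build_tokens(returns_to_go(rewards), states, actions))
    print(build_tokens(fixed_interval_subgoals(states, 2), states, actions))
```

In this sparse-reward example the returns-to-go are identical at every step, so they say little about progress through the episode, while the sub-goal labels change as the demonstration advances. This is one way to read the abstract's claim about tasks with longer episodes and scarcer rewards, under the assumptions stated above.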

research
10/10/2019

Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards

Hierarchical Reinforcement Learning (HRL) is a promising approach to sol...
research
09/21/2023

Safe Hierarchical Reinforcement Learning for CubeSat Task Scheduling Based on Energy Consumption

This paper presents a Hierarchical Reinforcement Learning methodology ta...
research
11/22/2018

Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning

In hierarchical reinforcement learning a major challenge is determining ...
research
09/29/2020

Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution

Reinforcement Learning algorithms require a large number of samples to s...
research
01/24/2020

Active Task-Inference-Guided Deep Inverse Reinforcement Learning

In inverse reinforcement learning (IRL), given a Markov decision process...
research
06/08/2022

Deep Hierarchical Planning from Pixels

Intelligent agents need to select long sequences of actions to solve com...
research
02/19/2018

Learning High-level Representations from Demonstrations

Hierarchical learning (HL) is key to solving complex sequential decision...
