Learning Representations for Control with Hierarchical Forward Models

06/22/2022
by   Trevor McInroe, et al.

Learning control from pixels is difficult for reinforcement learning (RL) agents because representation learning and policy learning are intertwined. Previous approaches remedy this issue with auxiliary representation learning tasks, but they either do not consider the temporal aspect of the problem or only consider single-step transitions. Instead, we propose Hierarchical k-Step Latent (HKSL), an auxiliary task that learns representations via a hierarchy of forward models that operate at varying magnitudes of step skipping while also learning to communicate between levels in the hierarchy. We evaluate HKSL in a suite of 30 robotic control tasks and find that HKSL either reaches higher episodic returns or converges to maximum performance more quickly than several current baselines. Also, we find that levels in HKSL's hierarchy can learn to specialize in long- or short-term consequences of agent actions, thereby providing the downstream control policy with more informative representations. Finally, we determine that communication channels between hierarchy levels organize information based on both sides of the communication process, which improves sample efficiency.
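To make the idea of forward models "that operate at varying magnitudes of step skipping" concrete, the sketch below lists which pairs of time steps each level of a hierarchy would predict across. It is only an illustration under stated assumptions: the function name `level_transitions`, the horizon of 6, and the skip sizes 1 and 3 are hypothetical choices, not values taken from the paper, and the sketch omits the learned encoders, forward models, and communication between levels.

```python
def level_transitions(horizon, skips):
    """Return, per hierarchy level, the (start, end) time-step pairs that
    the level's forward model would predict across.

    horizon: number of environment steps in the trajectory (assumed).
    skips:   steps skipped per prediction at each level (assumed values).
    """
    pairs = {}
    for level, skip in enumerate(skips):
        # A level with skip size k jumps from step t directly to step t + k.
        pairs[level] = [(t, t + skip) for t in range(0, horizon - skip + 1, skip)]
    return pairs


# Hypothetical two-level hierarchy: level 0 moves one step at a time,
# level 1 moves three steps at a time.
schedule = level_transitions(horizon=6, skips=[1, 3])
for level, transitions in schedule.items():
    print(level, transitions)
```

In this toy schedule the lower level covers every single-step transition, while the higher level covers the same trajectory in fewer, coarser jumps, which is one way to picture levels attending to short- versus long-term consequences of actions.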

research
02/22/2021

Return-Based Contrastive Representation Learning for Reinforcement Learning

Recently, various auxiliary tasks have been proposed to accelerate repre...
research
02/22/2021

Reinforcement Learning with Prototypical Representations

Learning effective representations in image-based environments is crucia...
research
11/16/2020

Distilling a Hierarchical Policy for Planning and Control via Representation and Reinforcement Learning

We present a hierarchical planning and control framework that enables an...
research
09/23/2019

Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning?

Hierarchical reinforcement learning has demonstrated significant success...
research
12/29/2022

Long-horizon video prediction using a dynamic latent hierarchy

The task of video prediction and generation is known to be notoriously d...
research
03/19/2021

Learning Task Decomposition with Ordered Memory Policy Network

Many complex real-world tasks are composed of several levels of sub-task...
research
12/07/2021

JueWu-MC: Playing Minecraft with Sample-efficient Hierarchical Reinforcement Learning

Learning rational behaviors in open-world games like Minecraft remains t...
