Modulated Policy Hierarchies

11/30/2018
by Alexander Pashevich, et al.

Solving tasks with sparse rewards is a major challenge in reinforcement learning. While hierarchical controllers are an intuitive approach to this problem, current methods often require manual reward shaping, alternating training phases, or manually defined subtasks. We introduce modulated policy hierarchies (MPH), which can learn end-to-end to solve tasks from sparse rewards. To achieve this, we study different modulation signals and exploration strategies for hierarchical controllers. Specifically, we find that communicating via bit-vectors is more efficient than selecting one out of multiple skills, as it enables mixing between them. To facilitate exploration, MPH uses its different time scales for temporally extended intrinsic motivation at each level of the hierarchy. We evaluate MPH on the robotics tasks of pushing and sparse block stacking, where it outperforms recent baselines.
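
To make the contrast between bit-vector modulation and one-of-N skill selection concrete, the following is a minimal sketch and not the authors' implementation. The function names and the averaging rule in `mix_skills` are hypothetical, chosen only for illustration; the abstract does not specify how the lower level consumes the signal. The sketch shows that n bits can address 2^n combinations of skills, while a one-hot choice can address only n.

```python
import itertools


def one_hot_signals(n):
    """All signals that select exactly one of n skills."""
    return [tuple(1 if i == k else 0 for i in range(n)) for k in range(n)]


def bit_vector_signals(n):
    """All n-bit signals; each bit switches one skill on or off."""
    return list(itertools.product((0, 1), repeat=n))


def mix_skills(signal, skill_actions):
    """Average the actions of the active skills.

    Assumption: plain averaging is a stand-in mixing rule used only
    for illustration; it is not taken from the paper.
    """
    active = [a for bit, a in zip(signal, skill_actions) if bit]
    if not active:
        # No skill is active: return a zero action of the right size.
        return [0.0] * len(skill_actions[0])
    dim = len(active[0])
    return [sum(a[d] for a in active) / len(active) for d in range(dim)]


if __name__ == "__main__":
    print(len(one_hot_signals(4)), len(bit_vector_signals(4)))
    # Activate the first two of three toy skills and blend their actions.
    print(mix_skills((1, 1, 0), [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]))
```

Under these assumptions, a one-hot signal is a special case of a bit-vector, so the bit-vector channel can reproduce any single-skill choice and can also blend several skills.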


research
05/18/2017

Feature Control as Intrinsic Motivation for Hierarchical Reinforcement Learning

The problem of sparse rewards is one of the hardest challenges in contem...
research
08/12/2021

HAC Explore: Accelerating Exploration with Hierarchical Reinforcement Learning

Sparse rewards and long time horizons remain challenging for reinforceme...
research
06/23/2020

ELSIM: End-to-end learning of reusable skills through intrinsic motivation

Taking inspiration from developmental learning, we present a novel reinf...
research
03/18/2019

Scheduled Intrinsic Drive: A Hierarchical Take on Intrinsically Motivated Exploration

Exploration in sparse reward reinforcement learning remains a difficult ...
research
09/23/2019

Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning?

Hierarchical reinforcement learning has demonstrated significant success...
research
11/28/2022

CIM: Constrained Intrinsic Motivation for Sparse-Reward Continuous Control

Intrinsic motivation is a promising exploration technique for solving re...
research
10/15/2020

An Empowerment-based Solution to Robotic Manipulation Tasks with Sparse Rewards

In order to provide adaptive and user-friendly solutions to robotic mani...
