
Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning

10/02/2020 · by Luisa Zintgraf, et al.

Meta-learning is a powerful tool for learning policies that can adapt efficiently when deployed in new tasks. If, however, the meta-training tasks have sparse rewards, the need for exploration during meta-training is exacerbated, since the agent has to explore and learn across many tasks. We show that current meta-learning methods can fail catastrophically in such environments. To address this problem, we propose HyperX, a novel method for meta-learning in sparse reward tasks. Using novel reward bonuses for meta-training, we incentivise the agent to explore in approximate hyper-state space, i.e., the joint state and approximate belief space, where the beliefs are over tasks. We show empirically that these bonuses allow an agent to successfully learn to solve sparse reward tasks where existing meta-learning methods fail.
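The abstract does not spell out how the reward bonus is computed. As a rough illustration of the idea only, the sketch below computes a novelty bonus over hyper-states (a state vector concatenated with a belief vector) in the style of random network distillation: a frozen random "target" network is compared against a "predictor" fit on previously visited hyper-states, so the prediction error is low in familiar regions and high in unfamiliar ones. The dimensions, the tanh target, and the linear least-squares predictor are all illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, BELIEF_DIM, FEAT_DIM = 4, 8, 16

# Frozen random "target" network (here a tanh of a random linear map),
# evaluated on hyper-states, i.e. concatenated (state, belief) vectors.
W_target = rng.normal(size=(STATE_DIM + BELIEF_DIM, FEAT_DIM))

def target(hyper_states):
    return np.tanh(hyper_states @ W_target)

def novelty_bonus(hyper_states, W_pred):
    """Prediction error against the frozen target: small for hyper-states
    similar to those the predictor was fit on, large elsewhere."""
    err = target(hyper_states) - hyper_states @ W_pred
    return np.linalg.norm(err, axis=-1)

# "Train" the predictor (a linear least-squares fit, standing in for a
# learned network) on previously visited hyper-states.
visited = rng.normal(size=(256, STATE_DIM + BELIEF_DIM))
W_pred, *_ = np.linalg.lstsq(visited, target(visited), rcond=None)

seen = novelty_bonus(visited[:5], W_pred)        # low bonus: familiar region
novel = 10 + rng.normal(size=(5, STATE_DIM + BELIEF_DIM))
unseen = novelty_bonus(novel, W_pred)            # high bonus: unfamiliar region
```

Adding such a bonus to the sparse task reward during meta-training pays the agent for visiting (state, belief) pairs it has not encountered before, which is the intuition behind exploring in approximate hyper-state space rather than in state space alone.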


Related research:

- 02/11/2020: Hyper-Meta Reinforcement Learning with Sparse Reward
- 08/06/2020: Explore then Execute: Adapting without Rewards via Factorized Meta-Reinforcement Learning
- 07/15/2021: MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning
- 01/27/2019: Reward Shaping via Meta-Learning
- 03/11/2020: Meta-learning curiosity algorithms
- 04/05/2019: Synthesized Policies for Transfer and Adaptation across Tasks and Environments
- 03/11/2021: Population-Based Evolution Optimizes a Meta-Learning Objective