Learn to Effectively Explore in Context-Based Meta-RL

06/15/2020
by Jin Zhang, et al.

Meta reinforcement learning (meta-RL) provides a principled approach for fast adaptation to novel tasks by extracting prior knowledge from previous tasks. Under such settings, it is crucial for the agent to explore efficiently during adaptation to collect useful experiences. However, existing methods adapt poorly because their exploration mechanisms are inefficient, especially in sparse-reward problems. In this paper, we present a novel off-policy context-based meta-RL approach that efficiently learns a separate exploration policy to support fast adaptation, as well as a context-aware exploitation policy to maximize extrinsic return. The explorer is motivated by an information-theoretic intrinsic reward that encourages the agent to collect experiences that provide rich information about the task. Experimental results on both the MuJoCo and Meta-World benchmarks show that our method significantly outperforms baselines by exploring efficiently.
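
To make the idea of an information-theoretic intrinsic reward concrete, here is a minimal sketch. It is not the formulation used in the paper, which the abstract does not specify. It assumes a small discrete set of candidate tasks, a known likelihood of each observed transition under each task, and a reward defined as the reduction in entropy of the agent's belief over tasks. The names `entropy`, `bayes_update`, and `information_gain_reward` are hypothetical and chosen for illustration only.

```python
import math


def entropy(belief):
    """Shannon entropy (in nats) of a discrete belief over tasks."""
    return -sum(p * math.log(p) for p in belief if p > 0.0)


def bayes_update(belief, likelihoods):
    """Posterior over tasks after one observed transition.

    likelihoods[i] is the probability of the observed transition under
    task i. Assumption: these are known; a real agent would have to
    learn or approximate them.
    """
    unnormalized = [p * l for p, l in zip(belief, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        # Observation is impossible under every task: keep the prior.
        return list(belief)
    return [u / total for u in unnormalized]


def information_gain_reward(belief, likelihoods):
    """Hypothetical intrinsic reward: entropy drop of the task belief.

    Returns the reward together with the updated belief.
    """
    posterior = bayes_update(belief, likelihoods)
    return entropy(belief) - entropy(posterior), posterior


# Uniform prior over four candidate tasks.
prior = [0.25, 0.25, 0.25, 0.25]

# A transition that is equally likely under every task reveals nothing.
r_uninformative, _ = information_gain_reward(prior, [0.5, 0.5, 0.5, 0.5])

# A transition that is only possible under task 2 identifies the task.
r_informative, post = information_gain_reward(prior, [0.0, 0.0, 1.0, 0.0])
```

In this toy setting the uninformative transition earns no intrinsic reward, while the transition that pins down the task earns the full prior entropy. An explorer trained on such a signal would be pushed toward experiences that disambiguate the task, which is the behavior the abstract describes. Note that the entropy drop for a single observation can be negative in general; only its expectation is guaranteed to be non-negative.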

research
04/01/2022

Learning Context-aware Task Reasoning for Efficient Meta Reinforcement Learning

Despite recent success of deep network-based Reinforcement Learning (RL)...
research
02/20/2018

Meta-Reinforcement Learning of Structured Exploration Strategies

Exploration is a fundamental challenge in reinforcement learning (RL). M...
research
03/03/2020

Learning Context-aware Task Reasoning for Efficient Meta-reinforcement Learning

Despite recent success of deep network-based Reinforcement Learning (RL)...
research
01/01/2020

Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies

We propose and address a novel few-shot RL problem, where a task is char...
research
07/05/2023

First-Explore, then Exploit: Meta-Learning Intelligent Exploration

Standard reinforcement learning (RL) agents never intelligently explore ...
research
11/11/2019

MAME: Model-Agnostic Meta-Exploration

Meta-Reinforcement learning approaches aim to develop learning procedure...
research
01/18/2023

Human-Timescale Adaptation in an Open-Ended Task Space

Foundation models have shown impressive adaptation and scalability in su...
