Exploration via Sample-Efficient Subgoal Design

10/21/2019
by   Yijia Wang, et al.

The problem of exploration in unknown environments continues to pose a challenge for reinforcement learning algorithms, as interactions with the environment are usually expensive or limited. The technique of setting subgoals with an intrinsic shaped reward allows for the use of supplemental feedback to aid an agent in environments with sparse and delayed rewards. In fact, it can be an effective tool for directing the exploration behavior of the agent toward useful parts of the state space. In this paper, we consider problems where an agent faces an unknown task in the future and is given prior opportunities to "practice" on related tasks where the interactions are still expensive. We propose a one-step Bayes-optimal algorithm for selecting subgoal designs, along with the number of episodes and the episode length, to efficiently maximize the expected performance of an agent. We demonstrate its excellent performance on a variety of tasks and also prove an asymptotic optimality guarantee.
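To make the idea of subgoals with an intrinsic shaped reward concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it only illustrates one simple way a subgoal design could add supplemental feedback to a sparse extrinsic reward. The class name `SubgoalShapedReward`, the one-time bonus rule, and the grid-coordinate state type are all assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple

# Hypothetical state type: (row, col) positions in a grid world.
State = Tuple[int, int]


class SubgoalShapedReward:
    """Adds a one-time intrinsic bonus when a subgoal is first reached in an episode.

    Illustrative assumption only; the paper's actual shaping scheme may differ.
    """

    def __init__(self, subgoals: Iterable[State], bonus: float = 1.0) -> None:
        self.subgoals: Set[State] = set(subgoals)
        self.bonus = bonus
        self._reached: Set[State] = set()

    def reset(self) -> None:
        """Forget which subgoals were reached; call at the start of each episode."""
        self._reached.clear()

    def __call__(self, next_state: State, extrinsic_reward: float) -> float:
        """Return the extrinsic reward plus any intrinsic subgoal bonus."""
        intrinsic = 0.0
        if next_state in self.subgoals and next_state not in self._reached:
            self._reached.add(next_state)
            intrinsic = self.bonus
        return extrinsic_reward + intrinsic


# Usage: a sparse-reward trajectory that passes through the subgoal (2, 2).
shaper = SubgoalShapedReward(subgoals=[(2, 2)], bonus=0.5)
trajectory = [(0, 0), (1, 1), (2, 2), (2, 2), (3, 3)]
rewards = [shaper(state, 0.0) for state in trajectory]
```

In this sketch the bonus is paid only on the first visit to a subgoal within an episode, so an agent cannot collect it repeatedly by staying in place. Choosing which subgoals to place, and how many episodes of what length to spend evaluating them, is the design problem the abstract describes; that selection step is not shown here.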

research
11/14/2022

Redeeming Intrinsic Rewards via Constrained Optimization

State-of-the-art reinforcement learning (RL) algorithms typically use ra...
research
08/30/2023

Cyclophobic Reinforcement Learning

In environments with sparse rewards, finding a good inductive bias for e...
research
03/02/2022

Follow your Nose: Using General Value Functions for Directed Exploration in Reinforcement Learning

Exploration versus exploitation dilemma is a significant problem in rein...
research
07/27/2020

Fast active learning for pure exploration in reinforcement learning

Realistic environments often provide agents with very limited feedback. ...
research
12/26/2020

Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards

A major challenge in reinforcement learning is the design of exploration...
research
10/25/2021

Multitask Adaptation by Retrospective Exploration with Learned World Models

Model-based reinforcement learning (MBRL) allows solving complex tasks i...
research
09/17/2021

Knowledge is reward: Learning optimal exploration by predictive reward cashing

There is a strong link between the general concept of intelligence and t...
