Generative Exploration and Exploitation

04/21/2019
by Jiechuan Jiang, et al.

Sparse reward is one of the biggest challenges in reinforcement learning (RL). In this paper, we propose a novel method called Generative Exploration and Exploitation (GENE) to overcome sparse reward. GENE dynamically changes the agent's start state to a generated novel state to encourage the agent to explore the environment, or to a generated rewarding state to help the agent exploit the received reward signal. GENE relies on no prior knowledge about the environment and can be combined with any RL algorithm, whether on-policy or off-policy, single-agent or multi-agent. Empirically, we demonstrate that GENE significantly outperforms existing methods in four challenging tasks with only binary rewards indicating whether or not the task is completed: Maze, Goal Ant, Pushing, and Cooperative Navigation. The ablation studies verify that GENE adaptively trades off exploration and exploitation as learning progresses by automatically adjusting the proportion of generated novel states to rewarding states, which is key to GENE solving these challenging tasks effectively and efficiently.
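The idea of resetting an agent to either a novel or a rewarding start state can be illustrated with a small sketch. This is not the paper's method: the abstract does not say how states are generated or how the proportion is adjusted, so the sketch assumes a simple visit-count proxy for novelty and a fixed `exploit_prob`. The helper `pick_start_state` and all of its parameters are hypothetical names introduced only for illustration.

```python
import random
from collections import Counter


def pick_start_state(visit_counts, rewarding_states, default_state, rng,
                     exploit_prob=0.5):
    """Return a start state for the next episode (hypothetical helper).

    Assumptions, not taken from the paper: novelty is approximated by
    visit counts, and the explore/exploit split is a fixed probability.
    """
    # Exploit: restart from a state that previously led to reward.
    if rewarding_states and rng.random() < exploit_prob:
        return rng.choice(sorted(rewarding_states))
    # Explore: restart from the least-visited state seen so far.
    if visit_counts:
        return min(sorted(visit_counts), key=lambda s: visit_counts[s])
    # Nothing recorded yet: fall back to the environment's usual start.
    return default_state


# Example usage with grid coordinates as states.
rng = random.Random(0)
visit_counts = Counter({(0, 0): 5, (0, 1): 1, (1, 1): 3})
start = pick_start_state(visit_counts, set(), (0, 0), rng)
```

With no rewarding states recorded, the helper always returns the least-visited state; once rewards have been seen, a fraction of episodes restart from rewarding states instead. GENE, per the abstract, adjusts that proportion automatically rather than fixing it.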

research
01/04/2021

Variationally and Intrinsically motivated reinforcement learning for decentralized traffic signal control

One of the biggest challenges in multi-agent reinforcement learning is c...
research
07/13/2021

Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks

We propose the k-Shortest-Path (k-SP) constraint: a novel constraint on ...
research
05/30/2022

SEREN: Knowing When to Explore and When to Exploit

Efficient reinforcement learning (RL) involves a trade-off between "expl...
research
11/16/2020

ACDER: Augmented Curiosity-Driven Experience Replay

Exploration in environments with sparse feedback remains a challenging r...
research
08/16/2022

Solving the Diffusion of Responsibility Problem in Multiagent Reinforcement Learning with a Policy Resonance Approach

SOTA multiagent reinforcement algorithms distinguish themselves in many ...
research
07/04/2018

Region Growing Curriculum Generation for Reinforcement Learning

Learning a policy capable of moving an agent between any two states in t...
research
08/12/2020

REMAX: Relational Representation for Multi-Agent Exploration

Training a multi-agent reinforcement learning (MARL) model is generally ...
