A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning

02/03/2019
by   Francisco M. Garcia, et al.
12

In this paper we consider the problem of how a reinforcement learning agent that is tasked with solving a sequence of reinforcement learning problems (a sequence of Markov decision processes) can use knowledge acquired early in its lifetime to improve its ability to solve new problems. We argue that previous experience with similar problems can provide an agent with information about how it should explore when facing a new but related problem. We show that the search for an optimal exploration strategy can be formulated as a reinforcement learning problem itself and demonstrate that such strategy can leverage patterns found in the structure of related problems. We conclude with experiments that show the benefits of optimizing an exploration strategy using our proposed approach.

READ FULL TEXT

page 6

page 8

page 9

page 15

page 16

research
03/03/2018

Some Considerations on Learning to Explore via Meta-Reinforcement Learning

We consider the problem of exploration in meta reinforcement learning. T...
research
10/20/2021

More Efficient Exploration with Symbolic Priors on Action Sequence Equivalences

Incorporating prior knowledge in reinforcement learning algorithms is ma...
research
11/11/2021

Agent Spaces

Exploration is one of the most important tasks in Reinforcement Learning...
research
11/24/2017

Identifying Reusable Macros for Efficient Exploration via Policy Compression

Reinforcement Learning agents often need to solve not a single task, but...
research
11/15/2022

General Intelligence Requires Rethinking Exploration

We are at the cusp of a transition from "learning from data" to "learnin...
research
11/29/2016

Exploration for Multi-task Reinforcement Learning with Deep Generative Models

Exploration in multi-task reinforcement learning is critical in training...
research
04/05/2023

Constrained Exploration in Reinforcement Learning with Optimality Preservation

We consider a class of reinforcement-learning systems in which the agent...

Please sign up or login with your details

Forgot password? Click here to reset