The Online Coupon-Collector Problem and Its Application to Lifelong Reinforcement Learning

by Emma Brunskill, et al.
Carnegie Mellon University

Transferring knowledge across a sequence of related tasks is an important challenge in reinforcement learning (RL). Despite much encouraging empirical evidence, there has been little theoretical analysis. In this paper, we study a class of lifelong RL problems: the agent solves a sequence of tasks modeled as finite Markov decision processes (MDPs), each drawn from a finite set of MDPs that share the same state and action sets but differ in their transition and reward functions. Motivated by the need for cross-task exploration in lifelong learning, we formulate a novel online coupon-collector problem and give an optimal algorithm. This allows us to develop a new lifelong RL algorithm whose overall sample complexity over a sequence of tasks is much smaller than that of single-task learning, even if the sequence of tasks is generated by an adversary. Benefits of the algorithm are demonstrated in simulated problems, including a recently introduced human-robot interaction problem.
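
For readers unfamiliar with the name, the following is a minimal sketch of the classical coupon-collector problem only, not the online variant or the algorithm introduced in the paper. It assumes coupons are drawn independently and uniformly at random, an assumption the paper does not make (its task sequence may be adversarial). Under that assumption, the expected number of draws needed to see all n coupon types is n times the n-th harmonic number; in the lifelong RL reading, a coupon type loosely corresponds to one MDP in the finite set. The function names `expected_draws`, `simulate_draws`, and `average_draws` are hypothetical and chosen for illustration.

```python
import random


def expected_draws(n: int) -> float:
    """Expected draws to see all n coupon types under uniform i.i.d.
    sampling (an illustrative assumption): n * H_n."""
    return n * sum(1.0 / k for k in range(1, n + 1))


def simulate_draws(n: int, rng: random.Random) -> int:
    """Draw uniformly at random until every one of the n types is seen."""
    seen = set()
    draws = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws


def average_draws(n: int, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the expected number of draws."""
    rng = random.Random(seed)
    return sum(simulate_draws(n, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    n = 10
    print("closed form:", expected_draws(n))
    print("simulated  :", average_draws(n, trials=20000))
```

The simulated average should land close to the closed-form value, which grows roughly as n log n; this is the classical baseline that gives the problem its name.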




