Continual Learning In Environments With Polynomial Mixing Times

by Matthew Riemer, et al.

The mixing time of the Markov chain induced by a policy limits performance in real-world continual learning scenarios. Yet, the effect of mixing times on learning in continual reinforcement learning (RL) remains underexplored. In this paper, we characterize problems that are of long-term interest to the development of continual RL, which we call scalable MDPs, through the lens of mixing times. In particular, we establish that scalable MDPs have mixing times that scale polynomially with the size of the problem. We go on to demonstrate that polynomial mixing times present significant difficulties for existing approaches and propose a family of model-based algorithms that speed up learning by directly optimizing for the average reward through a novel bootstrapping procedure. Finally, we perform empirical regret analysis of our proposed approaches, demonstrating clear improvements over baselines and also how scalable MDPs can be used for analysis of RL algorithms as mixing times scale.
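To make the central quantity concrete, here is a minimal sketch of how the mixing time of a policy-induced Markov chain could be measured numerically. It is not the paper's algorithm or its definition of a scalable MDP. The lazy ring walk, the helper names (`stationary_distribution`, `mixing_time`, `lazy_ring`) and the threshold `eps=0.25` are assumptions chosen for illustration. The ring is used only because its mixing time is known to grow polynomially (roughly quadratically) with the number of states.

```python
import numpy as np


def stationary_distribution(P):
    """Stationary distribution of an ergodic row-stochastic matrix P."""
    eigvals, eigvecs = np.linalg.eig(P.T)
    idx = np.argmin(np.abs(eigvals - 1.0))  # eigenvalue closest to 1
    pi = np.real(eigvecs[:, idx])
    return pi / pi.sum()  # normalize (also fixes the sign)


def mixing_time(P, eps=0.25, max_steps=100_000):
    """Smallest t with max over start states of TV(P^t[s, :], pi) <= eps.

    eps=0.25 is an assumed threshold for this sketch.
    """
    pi = stationary_distribution(P)
    Pt = np.eye(P.shape[0])
    for t in range(1, max_steps + 1):
        Pt = Pt @ P  # t-step transition matrix
        worst_tv = 0.5 * np.abs(Pt - pi).sum(axis=1).max()
        if worst_tv <= eps:
            return t
    raise RuntimeError("chain did not mix within max_steps")


def lazy_ring(n):
    """Hypothetical example chain: stay w.p. 1/2, else step left or right."""
    P = np.zeros((n, n))
    for s in range(n):
        P[s, s] += 0.5
        P[s, (s - 1) % n] += 0.25
        P[s, (s + 1) % n] += 0.25
    return P


if __name__ == "__main__":
    for n in (4, 8, 16, 32):
        print(n, mixing_time(lazy_ring(n)))
```

Running the script prints the measured mixing time for rings of increasing size, which gives a rough picture of what "mixing times that scale with the size of the problem" means in practice.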




