Continual Learning In Environments With Polynomial Mixing Times

12/13/2021
by Matthew Riemer, et al.

The mixing time of the Markov chain induced by a policy limits performance in real-world continual learning scenarios. Yet, the effect of mixing times on learning in continual reinforcement learning (RL) remains underexplored. In this paper, we characterize problems that are of long-term interest to the development of continual RL, which we call scalable MDPs, through the lens of mixing times. In particular, we establish that scalable MDPs have mixing times that scale polynomially with the size of the problem. We go on to demonstrate that polynomial mixing times present significant difficulties for existing approaches and propose a family of model-based algorithms that speed up learning by directly optimizing for the average reward through a novel bootstrapping procedure. Finally, we perform empirical regret analysis of our proposed approaches, demonstrating clear improvements over baselines and also how scalable MDPs can be used for analysis of RL algorithms as mixing times scale.
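To make the notion of a mixing time concrete, the sketch below measures it on a toy chain. It is not the paper's scalable MDP construction or its algorithm: the lazy random walk on a ring, the helper names (`ring_walk`, `step`, `tv_distance`, `mixing_time`), and the threshold `eps=0.25` are all assumptions chosen for illustration. The chain stands in for the Markov chain that a fixed policy would induce, and the mixing time is taken as the first step at which the worst-case total variation distance to the stationary distribution drops to `eps` or below.

```python
def ring_walk(n):
    """Transition matrix of a lazy random walk on a ring of n states.

    Hypothetical stand-in for the chain induced by a fixed policy:
    stay with probability 1/2, move to either neighbour with probability 1/4.
    """
    P = [[0.0] * n for _ in range(n)]
    for s in range(n):
        P[s][s] += 0.5
        P[s][(s - 1) % n] += 0.25
        P[s][(s + 1) % n] += 0.25
    return P


def step(dist, P):
    """Advance a state distribution by one transition: dist @ P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def tv_distance(p, q):
    """Total variation distance between two distributions on the same states."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def mixing_time(P, stationary, eps=0.25, max_steps=100_000):
    """Smallest t such that every start state is within eps (in total
    variation) of the stationary distribution after t steps."""
    n = len(P)
    # One point-mass distribution per possible start state.
    dists = [[1.0 if j == s else 0.0 for j in range(n)] for s in range(n)]
    for t in range(1, max_steps + 1):
        dists = [step(d, P) for d in dists]
        if max(tv_distance(d, stationary) for d in dists) <= eps:
            return t
    raise RuntimeError("chain did not mix within max_steps")


if __name__ == "__main__":
    for n in (4, 8, 16):
        # The ring walk is doubly stochastic, so its stationary
        # distribution is uniform.
        uniform = [1.0 / n] * n
        print(n, mixing_time(ring_walk(n), uniform))
```

Running the script prints the measured mixing time for each ring size, which gives a feel for how mixing slows as the state space grows. Evaluating every start state exactly is only feasible for small chains; it is used here because it keeps the definition visible in the code.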


research
02/28/2022

Avalanche RL: a Continual Reinforcement Learning Library

Continual Reinforcement Learning (CRL) is a challenging setting where an...
research
04/09/2023

CLVOS23: A Long Video Object Segmentation Dataset for Continual Learning

Continual learning in real-world scenarios is a major challenge. A gener...
research
10/13/2021

Block Contextual MDPs for Continual Learning

In reinforcement learning (RL), when defining a Markov Decision Process ...
research
08/06/2018

Regret Bounds for Reinforcement Learning via Markov Chain Concentration

We give a simple optimistic algorithm for which it is easy to derive reg...
research
11/12/2020

Steady State Analysis of Episodic Reinforcement Learning

This paper proves that the episodic learning environment of every finite...
research
06/12/2020

Deep Reinforcement and InfoMax Learning

Our work is based on the hypothesis that a model-free agent whose repres...
research
03/13/2023

Loss of Plasticity in Continual Deep Reinforcement Learning

The ability to learn continually is essential in a complex and changing ...
