The Surprising Effectiveness of Latent World Models for Continual Reinforcement Learning

11/29/2022
by   Samuel Kessler, et al.

We study the use of model-based reinforcement learning methods, in particular world models, for continual reinforcement learning. In continual reinforcement learning, an agent must solve a sequence of tasks while retaining performance on, and preventing forgetting of, past tasks. World models offer a task-agnostic solution: they do not require knowledge of task changes. World models are a straightforward baseline for continual reinforcement learning for three main reasons. First, forgetting in the world model is prevented by persisting the experience replay buffer across tasks, so that experience from previous tasks is replayed when learning the world model. Second, they are sample-efficient. Third, they offer a task-agnostic exploration strategy through the uncertainty in the trajectories generated by the world model. We show that world models are a simple and effective continual reinforcement learning baseline. We study their effectiveness on the Minigrid and Minihack continual reinforcement learning benchmarks and show that they outperform state-of-the-art task-agnostic continual reinforcement learning methods.
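The first mechanism above, preventing forgetting by keeping the replay buffer alive across task boundaries, can be illustrated with a minimal sketch. The code below is not the paper's implementation; it uses a toy tabular world model and hypothetical class names (`ReplayBuffer`, `WorldModel`, `train_continual`) purely to show that a model trained on mixed replay still predicts the dynamics of an earlier task after training on a later one:

```python
import random

class ReplayBuffer:
    """Persistent buffer: deliberately NOT reset between tasks, so old
    transitions keep being replayed into the world model."""
    def __init__(self, capacity=10_000):
        self.capacity = capacity
        self.data = []

    def add(self, transition):
        if len(self.data) >= self.capacity:
            self.data.pop(0)  # drop oldest when full
        self.data.append(transition)

    def sample(self, batch_size):
        return random.sample(self.data, min(batch_size, len(self.data)))

class WorldModel:
    """Toy tabular world model: counts observed next states per (state, action)."""
    def __init__(self):
        self.counts = {}

    def update(self, s, a, s_next):
        outcomes = self.counts.setdefault((s, a), {})
        outcomes[s_next] = outcomes.get(s_next, 0) + 1

    def predict(self, s, a):
        # Most frequently observed next state, or None if (s, a) is unseen.
        outcomes = self.counts.get((s, a))
        if not outcomes:
            return None
        return max(outcomes, key=outcomes.get)

def train_continual(tasks, buffer, model, batch_size=32, updates_per_task=100):
    """Train on tasks sequentially; each update samples from the shared
    buffer, mixing transitions from past and current tasks."""
    for task in tasks:
        for transition in task:  # collect experience from the current task
            buffer.add(transition)
        for _ in range(updates_per_task):
            for s, a, s_next in buffer.sample(batch_size):
                model.update(s, a, s_next)

# Two toy "tasks" with disjoint state spaces.
task_a = [("s0", "right", "s1"), ("s1", "right", "s2")]
task_b = [("t0", "up", "t1"), ("t1", "up", "t2")]

buffer, model = ReplayBuffer(), WorldModel()
train_continual([task_a, task_b], buffer, model)

# After training on task B, the model still predicts task-A dynamics.
print(model.predict("s0", "right"))  # -> s1
print(model.predict("t0", "up"))     # -> t1
```

Resetting the buffer between tasks would instead make the model's statistics for task A stale or absent, which is the forgetting failure mode the persistent buffer avoids.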

research
05/28/2022

Task-Agnostic Continual Reinforcement Learning: In Praise of a Simple Baseline

We study task-agnostic continual reinforcement learning (TACRL) in which...
research
06/21/2019

Continual Reinforcement Learning with Diversity Exploration and Adversarial Self-Correction

Deep reinforcement learning has made significant progress in the field o...
research
09/25/2020

Continual Model-Based Reinforcement Learning with Hypernetworks

Effective planning in model-based reinforcement learning (MBRL) and mode...
research
07/11/2019

DisCoRL: Continual Reinforcement Learning via Policy Distillation

In multi-task reinforcement learning there are two main challenges: at t...
research
06/19/2020

Task-Agnostic Online Reinforcement Learning with an Infinite Mixture of Gaussian Processes

Continuously learning to solve unseen tasks with limited experience has ...
research
10/21/2020

PARENTing via Model-Agnostic Reinforcement Learning to Correct Pathological Behaviors in Data-to-Text Generation

In language generation models conditioned by structured data, the classi...
research
04/27/2020

FORECASTER: A Continual Lifelong Learning Approach to Improve Hardware Efficiency

Computer applications are continuously evolving. However, significant kn...
