Loss of Plasticity in Continual Deep Reinforcement Learning

03/13/2023
by Zaheer Abbas, et al.

The ability to learn continually is essential in a complex and changing world. In this paper, we characterize the behavior of canonical value-based deep reinforcement learning (RL) approaches under varying degrees of non-stationarity. In particular, we demonstrate that deep RL agents lose their ability to learn good policies when they cycle through a sequence of Atari 2600 games. This phenomenon is alluded to in prior work under various guises, e.g., loss of plasticity, implicit under-parameterization, primacy bias, and capacity loss. We investigate this phenomenon closely at scale and analyze how the weights, gradients, and activations change over time in several experiments that vary along multiple dimensions (e.g., similarity between games, number of games, number of frames per game), with some experiments spanning 50 days and 2 billion environment interactions. Our analysis shows that the activation footprint of the network becomes sparser, which contributes to diminishing gradients. We investigate a remarkably simple mitigation strategy, the Concatenated ReLU (CReLU) activation function, and demonstrate its effectiveness in facilitating continual learning in a changing environment.
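
To make the mitigation concrete, the following is a minimal sketch of a CReLU activation next to a plain ReLU, assuming the standard definition CReLU(x) = [ReLU(x), ReLU(-x)]. It is not taken from the paper's code. The helper `active_fraction` and the sample pre-activation values are hypothetical, introduced only to illustrate what a sparser activation footprint looks like; the paper's own sparsity measurements may be defined differently.

```python
from typing import List, Sequence


def relu(x: Sequence[float]) -> List[float]:
    """Standard ReLU: negative pre-activations are zeroed out."""
    return [max(v, 0.0) for v in x]


def crelu(x: Sequence[float]) -> List[float]:
    """Concatenated ReLU: [ReLU(x), ReLU(-x)], which doubles the output width."""
    return [max(v, 0.0) for v in x] + [max(-v, 0.0) for v in x]


def active_fraction(activations: Sequence[float]) -> float:
    """Hypothetical helper: fraction of units with a nonzero output."""
    return sum(1 for a in activations if a != 0.0) / len(activations)


# Made-up pre-activations in which most units are negative,
# mimicking a layer whose ReLU outputs have become sparse.
pre_activations = [-2.0, -0.5, 1.5, -3.0]

print(relu(pre_activations))                    # [0.0, 0.0, 1.5, 0.0]
print(crelu(pre_activations))                   # [0.0, 0.0, 1.5, 0.0, 2.0, 0.5, 0.0, 3.0]
print(active_fraction(relu(pre_activations)))   # 0.25
print(active_fraction(crelu(pre_activations)))  # 0.5
```

With ReLU, a unit whose pre-activation is negative outputs zero and passes no gradient. With CReLU, every nonzero pre-activation produces a nonzero output in exactly one of the two halves, so some gradient signal always flows back through the unit. The cost is that the following layer receives twice as many inputs.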

research
02/28/2022

Avalanche RL: a Continual Reinforcement Learning Library

Continual Reinforcement Learning (CRL) is a challenging setting where an...
research
10/27/2020

Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning

We identify an implicit under-parameterization phenomenon in value-based...
research
11/07/2017

Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games?

Deep reinforcement learning has achieved many recent successes, but our ...
research
03/02/2023

Understanding plasticity in neural networks

Plasticity, the ability of a neural network to quickly change its predic...
research
03/01/2019

Model-Based Reinforcement Learning for Atari

Model-free reinforcement learning (RL) can be used to learn effective po...
research
04/20/2022

Understanding and Preventing Capacity Loss in Reinforcement Learning

The reinforcement learning (RL) problem is rife with sources of non-stat...
research
12/13/2021

Continual Learning In Environments With Polynomial Mixing Times

The mixing time of the Markov chain induced by a policy limits performan...
