Lifelong Incremental Reinforcement Learning with Online Bayesian Inference

by Zhi Wang, et al.

A central capability of a long-lived reinforcement learning (RL) agent is to incrementally adapt its behavior as its environment changes, and to incrementally build upon previous experiences to facilitate future learning in real-world scenarios. In this paper, we propose LifeLong Incremental Reinforcement Learning (LLIRL), a new incremental algorithm for efficient lifelong adaptation to dynamic environments. We develop and maintain a library that contains an infinite mixture of parameterized environment models, which is equivalent to clustering environment parameters in a latent space. The prior distribution over the mixture is formulated as a Chinese restaurant process (CRP), which incrementally instantiates new environment models without any external information to signal environmental changes in advance. During lifelong learning, we employ the expectation maximization (EM) algorithm with online Bayesian inference to update the mixture in a fully incremental manner. In EM, the E-step involves estimating the posterior expectation of environment-to-cluster assignments, while the M-step updates the environment parameters for future learning. This method allows for all environment models to be adapted as necessary, with new models instantiated for environmental changes and old models retrieved when previously seen environments are encountered again. Experiments demonstrate that LLIRL outperforms relevant existing methods, and enables effective incremental adaptation to various dynamic environments for lifelong learning.
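The CRP prior and the E-step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`crp_prior`, `e_step`, `update_counts`), the concentration parameter `alpha`, and the use of soft counts are assumptions made for illustration, and the per-cluster log-likelihoods are assumed to be supplied by the caller (in LLIRL they would come from the parameterized environment models). The M-step, which updates the environment parameters, is omitted.

```python
import math

def crp_prior(counts, alpha):
    """CRP prior over K existing clusters plus one potential new cluster.

    counts: (soft) number of past environments assigned to each cluster.
    alpha:  concentration parameter (assumed > 0); larger values favor new clusters.
    """
    total = sum(counts) + alpha
    return [n / total for n in counts] + [alpha / total]

def e_step(counts, alpha, log_likelihoods):
    """Posterior over cluster assignments: prior times likelihood, normalized.

    log_likelihoods: one entry per existing cluster, plus a final entry for
    a freshly initialized candidate model (hypothetical inputs here).
    Assumes all counts are strictly positive.
    """
    prior = crp_prior(counts, alpha)
    log_post = [math.log(p) + ll for p, ll in zip(prior, log_likelihoods)]
    m = max(log_post)  # subtract the max for numerical stability
    weights = [math.exp(lp - m) for lp in log_post]
    z = sum(weights)
    return [w / z for w in weights]

def update_counts(counts, posterior):
    """Add soft assignments to the counts; keep the new cluster only if it is the MAP."""
    new_counts = [n + p for n, p in zip(counts, posterior[:-1])]
    map_index = max(range(len(posterior)), key=posterior.__getitem__)
    if map_index == len(counts):
        # The candidate model explains the data best: instantiate it.
        new_counts.append(posterior[-1])
    return new_counts

if __name__ == "__main__":
    counts = [3.0, 1.0]
    # Hypothetical log-likelihoods: the existing models fit the new data poorly.
    posterior = e_step(counts, alpha=1.0, log_likelihoods=[-10.0, -10.0, 0.0])
    print(posterior)
    print(update_counts(counts, posterior))
```

In this sketch, a new cluster is created only when the candidate model has the highest posterior probability; otherwise its probability mass is discarded and the existing counts absorb the soft assignments. That choice is one simple way to avoid unbounded library growth and is an assumption of this example.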




