A Dirichlet Process Mixture of Robust Task Models for Scalable Lifelong Reinforcement Learning

05/22/2022
by Zhi Wang, et al.

While reinforcement learning (RL) algorithms achieve state-of-the-art performance on various challenging tasks, they can easily suffer from catastrophic forgetting or interference when faced with lifelong streaming information. In this paper, we propose a scalable lifelong RL method that dynamically expands the network capacity to accommodate new knowledge while preventing past memories from being perturbed. We use a Dirichlet process mixture to model the non-stationary task distribution; it captures task relatedness by estimating the likelihood of task-to-cluster assignments and clusters the task models in a latent space. We formulate the prior distribution of the mixture as a Chinese restaurant process (CRP) that instantiates new mixture components as needed. The update and expansion of the mixture are governed by the Bayesian non-parametric framework with an expectation maximization (EM) procedure, which dynamically adapts the model complexity without explicit task boundaries or heuristics. Moreover, we use the domain randomization technique to train robust prior parameters for the initialization of each task model in the mixture, so that the resulting model can better generalize and adapt to unseen tasks. With extensive experiments conducted on robot navigation and locomotion domains, we show that our method successfully facilitates scalable lifelong RL and outperforms relevant existing methods.
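As a rough illustration of how a CRP prior can open new mixture components as needed, the sketch below combines the standard CRP prior with per-cluster log-likelihoods to get a posterior over task-to-cluster assignments, in the spirit of an E-step. It is not the paper's implementation: the function names, the concentration parameter `alpha`, and the assumption that the last log-likelihood entry belongs to a freshly initialized cluster are all hypothetical choices made for this example.

```python
import numpy as np


def crp_prior(counts, alpha):
    """Standard CRP prior: an existing cluster is weighted by its size,
    a new cluster by the concentration parameter alpha (assumed name)."""
    weights = np.append(np.asarray(counts, dtype=float), alpha)
    return weights / weights.sum()


def assignment_posterior(log_likelihoods, counts, alpha):
    """Posterior over cluster assignments for the current task.

    log_likelihoods holds one entry per existing cluster plus a final
    entry for a candidate new cluster (an assumption of this sketch).
    counts must be positive so that the log of the prior is finite.
    """
    log_post = np.log(crp_prior(counts, alpha)) + np.asarray(log_likelihoods, dtype=float)
    log_post -= log_post.max()  # subtract the max for numerical stability
    post = np.exp(log_post)
    return post / post.sum()


# Two existing clusters with 3 and 1 assigned tasks; the data fit the
# candidate new cluster far better, so it receives most of the mass.
posterior = assignment_posterior([-10.0, -10.0, -1.0], counts=[3, 1], alpha=1.0)
new_cluster_needed = int(np.argmax(posterior)) == 2
```

When the likelihoods do not distinguish between clusters, the posterior falls back to the prior, which favors larger existing clusters and so discourages unnecessary expansion.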

research
07/28/2020

Lifelong Incremental Reinforcement Learning with Online Bayesian Inference

A central capability of a long-lived reinforcement learning (RL) agent i...
research
09/01/2022

Dynamics-Adaptive Continual Reinforcement Learning via Progressive Contextualization

A key challenge of continual reinforcement learning (CRL) in dynamic env...
research
04/29/2023

Meta-Reinforcement Learning Based on Self-Supervised Task Representation Learning

Meta-reinforcement learning enables artificial agents to learn from rela...
research
10/13/2022

Dirichlet process mixture models for non-stationary data streams

In recent years, we have seen a handful of work on inference algorithms ...
research
02/04/2023

Locally Constrained Policy Optimization for Online Reinforcement Learning in Non-Stationary Input-Driven Environments

We study online Reinforcement Learning (RL) in non-stationary input-driv...
research
08/25/2021

Lifelong Infinite Mixture Model Based on Knowledge-Driven Dirichlet Process

Recent research efforts in lifelong learning propose to grow a mixture o...
research
03/27/2013

ABC Reinforcement Learning

This paper introduces a simple, general framework for likelihood-free Ba...
