Deep Reinforcement Learning amidst Lifelong Non-Stationarity

06/18/2020
by   Annie Xie, et al.
0

As humans, our goals and our environment are persistently changing throughout our lifetime based on our experiences, actions, and internal and external drives. In contrast, typical reinforcement learning problem set-ups consider decision processes that are stationary across episodes. Can we develop reinforcement learning algorithms that can cope with the persistent change in the former, more realistic problem settings? While on-policy algorithms such as policy gradients in principle can be extended to non-stationary settings, the same cannot be said for more efficient off-policy algorithms that replay past experiences when learning. In this work, we formalize this problem setting, and draw upon ideas from the online learning and probabilistic inference literature to derive an off-policy RL algorithm that can reason about and tackle such lifelong non-stationarity. Our method leverages latent variable models to learn a representation of the environment from current and past experiences, and performs off-policy RL with this representation. We further introduce several simulation environments that exhibit lifelong non-stationarity, and empirically find that our approach substantially outperforms approaches that do not reason about environment shift.

READ FULL TEXT
research
06/10/2020

The Impact of Non-stationarity on Generalisation in Deep Reinforcement Learning

Non-stationarity arises in Reinforcement Learning (RL) even in stationar...
research
02/04/2023

Locally Constrained Policy Optimization for Online Reinforcement Learning in Non-Stationary Input-Driven Environments

We study online Reinforcement Learning (RL) in non-stationary input-driv...
research
10/31/2020

A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning

A fundamental challenge in multiagent reinforcement learning is to learn...
research
04/14/2023

Bandit-Based Policy Invariant Explicit Shaping for Incorporating External Advice in Reinforcement Learning

A key challenge for a reinforcement learning (RL) agent is to incorporat...
research
06/03/2020

Causality and Batch Reinforcement Learning: Complementary Approaches To Planning In Unknown Domains

Reinforcement learning algorithms have had tremendous successes in onlin...
research
11/27/2018

Prioritizing Starting States for Reinforcement Learning

Online, off-policy reinforcement learning algorithms are able to use an ...
research
09/25/2018

Hierarchical Deep Multiagent Reinforcement Learning

Despite deep reinforcement learning has recently achieved great successe...

Please sign up or login with your details

Forgot password? Click here to reset