Decomposed Mutual Information Optimization for Generalized Context in Meta-Reinforcement Learning

10/09/2022
by   Yao Mu, et al.

Adapting to changes in transition dynamics is essential in robotic applications. By learning a conditional policy with a compact context, context-aware meta-reinforcement learning provides a flexible way to adjust behavior as the dynamics change. In real-world applications, however, the agent may encounter complex dynamics changes: multiple confounders can influence the transition dynamics, making it challenging to infer an accurate context for decision-making. This paper addresses this challenge with Decomposed Mutual INformation Optimization (DOMINO) for context learning, which explicitly learns a disentangled context that maximizes the mutual information between the context and historical trajectories while minimizing the state transition prediction error. Our theoretical analysis shows that, by learning a disentangled context, DOMINO can overcome the underestimation of mutual information caused by multiple confounders and reduce the number of samples that must be collected across environments. Extensive experiments show that the context learned by DOMINO benefits both model-based and model-free reinforcement learning algorithms for dynamics generalization, improving sample efficiency and performance in unseen environments.
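
To make the shape of such an objective concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say which mutual information estimator is used, so the InfoNCE lower bound, the cosine-similarity critic, and the names `info_nce`, `decomposed_objective`, and `weight` are all assumptions made for illustration. The sketch sums one mutual information estimate per context component and subtracts it from a state prediction error. Because a single InfoNCE estimate over N pairs can never exceed log N, splitting the estimate across components is one plausible reading of how a decomposed context could ease underestimation.

```python
import numpy as np

def info_nce(context, traj):
    """InfoNCE lower bound on I(context; traj) for N paired rows (assumed estimator)."""
    # Hypothetical critic: cosine similarity between every context and trajectory embedding.
    c = context / np.linalg.norm(context, axis=1, keepdims=True)
    t = traj / np.linalg.norm(traj, axis=1, keepdims=True)
    scores = c @ t.T  # shape (N, N); diagonal entries are the matched (positive) pairs
    # Numerically stable log-softmax over each row.
    m = scores.max(axis=1, keepdims=True)
    log_softmax = scores - m - np.log(np.exp(scores - m).sum(axis=1, keepdims=True))
    n = scores.shape[0]
    # Mean log-probability of the positive pair plus log N; never larger than log N.
    return float(np.mean(np.diag(log_softmax)) + np.log(n))

def decomposed_objective(contexts, traj, pred_next, true_next, weight=1.0):
    """Loss to minimize: prediction error minus the summed per-component MI bounds."""
    mi_sum = sum(info_nce(c, traj) for c in contexts)
    pred_error = float(np.mean((pred_next - true_next) ** 2))
    return pred_error - weight * mi_sum
```

In this sketch each element of `contexts` stands for one component of the disentangled context, embedded in the same dimension as the trajectory embedding `traj`. The full method may include further terms that the abstract does not spell out.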

research
10/11/2018

Empowerment-driven Exploration using Mutual Information Estimation

Exploration is a difficult challenge in reinforcement learning and is of...
research
03/02/2022

Integrating Contrastive Learning with Dynamic Models for Reinforcement Learning from Images

Recent methods for reinforcement learning from images use auxiliary task...
research
10/26/2020

Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning

Model-based reinforcement learning (RL) has shown great potential in var...
research
01/16/2020

MIME: Mutual Information Minimisation Exploration

We show that reinforcement learning agents that learn by surprise (surpr...
research
02/16/2021

Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models

Reinforcement learning is a promising paradigm for solving sequential de...
research
04/27/2018

Efficiently Learning Nonstationary Gaussian Processes for Real World Impact

Most real world phenomena such as sunlight distribution under a forest c...
research
04/27/2018

Efficiently Learning Nonstationary Gaussian Processes

Most real world phenomena such as sunlight distribution under a forest c...
