Self-Supervised Policy Adaptation during Deployment

07/08/2020
by   Nicklas Hansen, et al.

In most real-world scenarios, a policy trained by reinforcement learning in one environment must be deployed in another, potentially quite different, environment. However, generalization across environments is known to be hard. A natural solution is to keep training after deployment in the new environment, but this is not possible if the new environment offers no reward signal. Our work explores the use of self-supervision to allow the policy to continue training after deployment without using any rewards. While previous methods explicitly anticipate changes in the new environment, we assume no prior knowledge of those changes yet still obtain significant improvements. Empirical evaluations are performed on diverse environments from the DeepMind Control suite and ViZDoom. Our method improves generalization in 25 out of 30 environments across various tasks, and outperforms domain randomization on a majority of environments.
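To make the idea of reward-free adaptation concrete, here is a minimal sketch, not the authors' implementation. The abstract does not name the self-supervised task, so this example assumes inverse dynamics prediction (recovering the action from two consecutive observations) as one plausible choice. The linear encoder and heads, the toy dynamics `env_B`, and all names and dimensions are hypothetical, and the weights are random rather than trained, so the code only illustrates the update mechanics: the policy head stays frozen while the shared encoder takes gradient steps on the self-supervised loss.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, LATENT_DIM, ACT_DIM = 8, 4, 2

# Hypothetical linear modules; a real agent would use trained deep networks.
W_enc = rng.normal(scale=0.1, size=(LATENT_DIM, OBS_DIM))      # shared encoder (adapted)
W_pi = rng.normal(scale=0.1, size=(ACT_DIM, LATENT_DIM))       # policy head (frozen)
W_inv = rng.normal(scale=0.1, size=(ACT_DIM, 2 * LATENT_DIM))  # inverse-dynamics head (frozen)


def act(W_enc, obs):
    """Policy action computed from the current encoder features."""
    return W_pi @ (W_enc @ obs)


def inverse_dynamics_loss(W_enc, obs, next_obs, action):
    """Squared error of predicting the action from consecutive features."""
    z = np.concatenate([W_enc @ obs, W_enc @ next_obs])
    err = W_inv @ z - action
    return 0.5 * float(err @ err), err


def adapt_step(W_enc, obs, next_obs, action, lr=0.05):
    """One gradient step on the encoder only; no reward is involved."""
    loss, err = inverse_dynamics_loss(W_enc, obs, next_obs, action)
    A, B = W_inv[:, :LATENT_DIM], W_inv[:, LATENT_DIM:]
    # Analytic gradient of the loss with respect to the encoder weights.
    grad = np.outer(A.T @ err, obs) + np.outer(B.T @ err, next_obs)
    return W_enc - lr * grad, loss


# Toy deployment loop with made-up dynamics (an assumption for illustration).
env_B = rng.normal(scale=0.1, size=(OBS_DIM, ACT_DIM))
losses = []
for _ in range(20):
    obs = rng.normal(size=OBS_DIM)
    action = act(W_enc, obs)
    next_obs = obs + env_B @ action
    W_enc, loss = adapt_step(W_enc, obs, next_obs, action)
    losses.append(loss)
```

The design choice worth noting is that only the encoder is updated: the policy head is reused unchanged, on the assumption that better-adapted features are enough for it to keep working in the new environment.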

research
04/06/2022

PAnDR: Fast Adaptation to New Environments from Offline Experiences via Decoupling Policy and Environment Representations

Deep Reinforcement Learning (DRL) has been a promising solution to many ...
research
10/10/2022

Learning Real-world Autonomous Navigation by Self-Supervised Environment Synthesis

Machine learning approaches have recently enabled autonomous navigation ...
research
11/14/2020

A Geometric Perspective on Self-Supervised Policy Adaptation

One of the most challenging aspects of real-world reinforcement learning...
research
07/25/2022

Optimizing Empty Container Repositioning and Fleet Deployment via Configurable Semi-POMDPs

With the continuous growth of the global economy and markets, resource i...
research
04/06/2019

Reinforcement Learning with Attention that Works: A Self-Supervised Approach

Attention models have had a significant positive impact on deep learning...
research
09/17/2021

Dropout's Dream Land: Generalization from Learned Simulators to Reality

A World Model is a generative model used to simulate an environment. Wor...
research
08/08/2021

Towards real-world navigation with deep differentiable planners

We train embodied neural networks to plan and navigate unseen complex 3D...
