Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers

06/24/2020
by Benjamin Eysenbach, et al.

We propose a simple, practical, and intuitive approach for domain adaptation in reinforcement learning. Our approach stems from the idea that the agent's experience in the source domain should look similar to its experience in the target domain. Building on a probabilistic view of RL, we formally show that this goal can be achieved by modifying the reward function to compensate for the difference in dynamics. This modified reward function is simple to estimate by learning auxiliary classifiers that distinguish source-domain transitions from target-domain transitions. Intuitively, the modified reward function penalizes the agent for visiting states and taking actions in the source domain that are not possible in the target domain. Said another way, the agent is penalized for transitions that would indicate that it is interacting with the source domain rather than the target domain. Our approach is applicable to domains with continuous states and actions and does not require learning an explicit model of the dynamics. On discrete and continuous control tasks, we illustrate the mechanics of our approach and demonstrate its scalability to high-dimensional tasks.
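
The abstract does not spell out the exact form of the modified reward, so the following is only a sketch under stated assumptions. It relies on a fact that follows from Bayes' rule: the log-ratio of target to source dynamics, log p_target(s'|s,a) - log p_source(s'|s,a), equals the log-odds of a domain classifier conditioned on (s, a, s') minus the log-odds of a classifier conditioned on (s, a). The function name `reward_correction`, the toy distributions, and the choice to treat this log-ratio as the term added to the reward are illustrative assumptions, not the authors' code.

```python
import numpy as np


def reward_correction(p_target_sas, p_target_sa):
    """Hypothetical helper: classifier-based estimate of
    log p_target(s'|s,a) - log p_source(s'|s,a).

    p_target_sas: classifier probability that (s, a, s') came from the target domain.
    p_target_sa:  classifier probability that (s, a) came from the target domain.
    """
    p_target_sas = np.asarray(p_target_sas, dtype=float)
    p_target_sa = np.asarray(p_target_sa, dtype=float)
    # Log-odds of the (s, a, s') classifier minus log-odds of the (s, a) classifier.
    log_odds_sas = np.log(p_target_sas) - np.log1p(-p_target_sas)
    log_odds_sa = np.log(p_target_sa) - np.log1p(-p_target_sa)
    return log_odds_sas - log_odds_sa


# Toy check for one fixed (s, a) with two possible next states (assumed numbers).
p_src_next = np.array([0.5, 0.5])  # source dynamics p_source(s'|s,a)
p_tgt_next = np.array([0.9, 0.1])  # target dynamics p_target(s'|s,a)
p_target_sa = 0.3                  # assumed P(target | s, a); it cancels out

# Bayes-optimal (s, a, s') classifier, computed from the known toy dynamics.
joint_target = p_target_sa * p_tgt_next
joint_source = (1.0 - p_target_sa) * p_src_next
p_target_sas = joint_target / (joint_target + joint_source)

correction = reward_correction(p_target_sas, p_target_sa)
true_log_ratio = np.log(p_tgt_next) - np.log(p_src_next)
print(correction, true_log_ratio)
```

In this toy case the classifier-based estimate matches the true log-ratio of the dynamics, and the next state that is unlikely under the target dynamics receives a negative correction, which is consistent with the penalty described above. In practice the two classifiers would be learned from sampled transitions rather than computed from known dynamics.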

