Cross-Domain Policy Adaptation via Value-Guided Data Filtering

05/28/2023
by   Kang Xu, et al.
0

Generalizing policies across different domains with dynamics mismatch poses a significant challenge in reinforcement learning. For example, a robot learns the policy in a simulator, but when it is deployed in the real world, the dynamics of the environment may be different. Given the source and target domain with dynamics mismatch, we consider the online dynamics adaptation problem, in which case the agent can access sufficient source domain data while online interactions with the target domain are limited. Existing research has attempted to solve the problem from the dynamics discrepancy perspective. In this work, we reveal the limitations of these methods and explore the problem from the value difference perspective via a novel insight on the value consistency across domains. Specifically, we present the Value-Guided Data Filtering (VGDF) algorithm, which selectively shares transitions from the source domain based on the proximity of paired value targets across the two domains. Empirical results on various environments with kinematic and morphology shifts demonstrate that our method achieves superior performance compared to prior approaches.

READ FULL TEXT
research
08/19/2020

Unsupervised Cross-domain Image Classification by Distance Metric Guided Feature Alignment

Learning deep neural networks that are generalizable across different do...
research
09/28/2022

Focused Adaptation of Dynamics Models for Deformable Object Manipulation

In order to efficiently learn a dynamics model for a task in a new envir...
research
06/24/2020

Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers

We propose a simple, practical, and intuitive approach for domain adapta...
research
07/04/2023

AdAM: Few-Shot Image Generation via Adaptation-Aware Kernel Modulation

Few-shot image generation (FSIG) aims to learn to generate new and diver...
research
10/05/2016

EPOpt: Learning Robust Neural Network Policies Using Model Ensembles

Sample complexity and safety are major challenges when learning policies...
research
10/25/2021

Unsupervised Domain Adaptation with Dynamics-Aware Rewards in Reinforcement Learning

Unsupervised reinforcement learning aims to acquire skills without prior...
research
12/05/2021

Toward Real-World Pathological Voice Detection

Voice disorders significantly undermine people's ability to speak in the...

Please sign up or login with your details

Forgot password? Click here to reset