Curriculum Reinforcement Learning using Optimal Transport via Gradual Domain Adaptation

10/18/2022
by Peide Huang, et al.

Curriculum Reinforcement Learning (CRL) aims to create a sequence of tasks that starts from easy ones and gradually progresses toward difficult ones. In this work, we frame CRL as an interpolation between a source (auxiliary) task distribution and a target task distribution. Although existing studies have shown the great potential of this idea, it remains unclear how to formally quantify and generate the movement between task distributions. Inspired by gradual domain adaptation in semi-supervised learning, we create a natural curriculum by breaking down the potentially large task distributional shift in CRL into smaller shifts. We propose GRADIENT, which formulates CRL as an optimal transport problem with a tailored distance metric between tasks. Specifically, we generate a sequence of task distributions as a geodesic interpolation (i.e., Wasserstein barycenter) between the source and target distributions. Unlike many existing methods, our algorithm considers a task-dependent contextual distance metric and can handle nonparametric distributions in both continuous and discrete context settings. In addition, we show theoretically that GRADIENT enables smooth transfer between subsequent stages in the curriculum under certain conditions. We conduct extensive experiments on locomotion and manipulation tasks and show that GRADIENT outperforms baselines in both learning efficiency and asymptotic performance.
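
The geodesic interpolation between a source and a target task distribution can be illustrated with a minimal sketch. This is not the GRADIENT implementation: it assumes a one-dimensional context, equally sized empirical samples, and plain Euclidean distance in place of the task-dependent contextual metric described in the abstract. Under those assumptions the optimal transport plan simply matches sorted samples, so each intermediate stage is a convex combination of the sorted source and target samples. The function names `wasserstein_geodesic_1d` and `make_curriculum` are hypothetical.

```python
import numpy as np


def wasserstein_geodesic_1d(source, target, alpha):
    """Point at fraction `alpha` along the Wasserstein-2 geodesic (1-D case).

    Assumption: both inputs are equally sized 1-D samples and the ground
    cost is Euclidean, so the optimal coupling pairs sorted samples.
    """
    source = np.sort(np.asarray(source, dtype=float))
    target = np.sort(np.asarray(target, dtype=float))
    if source.shape != target.shape:
        raise ValueError("source and target must have the same number of samples")
    # Displacement interpolation: move each source sample toward its match.
    return (1.0 - alpha) * source + alpha * target


def make_curriculum(source, target, num_stages):
    """Evenly spaced stages from the source (alpha=0) to the target (alpha=1)."""
    alphas = np.linspace(0.0, 1.0, num_stages)
    return [wasserstein_geodesic_1d(source, target, a) for a in alphas]


# Illustrative data only: easy contexts near 0, hard contexts near 5.
rng = np.random.default_rng(0)
source_contexts = rng.normal(0.0, 0.5, size=200)
target_contexts = rng.normal(5.0, 1.0, size=200)

stages = make_curriculum(source_contexts, target_contexts, num_stages=5)
for i, stage in enumerate(stages):
    print(f"stage {i}: mean context = {stage.mean():.3f}")
```

Because the stages are evenly spaced along the geodesic, the Wasserstein distance between any two consecutive stages is the same, which is the sense in which one large distributional shift is broken into smaller ones. For higher-dimensional or discrete contexts, or for a non-Euclidean task metric, the sorted matching would need to be replaced by a general optimal transport solver.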
