Understanding Gradual Domain Adaptation: Improved Analysis, Optimal Path and Beyond

04/18/2022
by   Haoxiang Wang, et al.

The vast majority of existing algorithms for unsupervised domain adaptation (UDA) focus on adapting from a labeled source domain to an unlabeled target domain directly, in a one-off fashion. Gradual domain adaptation (GDA), on the other hand, assumes a path of (T-1) unlabeled intermediate domains bridging the source and target, and aims to achieve better generalization on the target domain by leveraging the intermediate ones. Under certain assumptions, Kumar et al. (2020) proposed a simple algorithm, Gradual Self-Training, along with a generalization bound of order e^O(T)·(ε_0 + O(√(log(T)/n))) on the target-domain error, where ε_0 is the source-domain error and n is the data size of each domain. Due to the exponential factor, this upper bound becomes vacuous when T is only moderately large. In this work, we analyze gradual self-training under more general and relaxed assumptions, and prove a significantly improved generalization bound of O(ε_0 + TΔ + T/√n + 1/√(nT)), where Δ is the average distributional distance between consecutive domains. Compared with the existing bound, which carries an exponential dependency on T as a multiplicative factor, our bound depends on T only linearly and additively. Perhaps more interestingly, our result implies the existence of an optimal choice of T that minimizes the generalization error, and it also naturally suggests an optimal way to construct the path of intermediate domains so as to minimize the cumulative path length TΔ between the source and target. To corroborate the implications of our theory, we examine gradual self-training on multiple semi-synthetic and real datasets, and the results confirm our findings. We believe our insights provide a path forward for the design of future GDA algorithms.
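For concreteness, below is a minimal sketch of the gradual self-training procedure analyzed in the paper: fit a classifier on the labeled source domain, then walk along the ordered path of unlabeled intermediate domains, pseudo-labeling each one with the current model and refitting on those pseudo-labels. The scikit-learn estimator, the toy shifting-Gaussian data, and the omission of the confidence filtering and regularization used in the original algorithm of Kumar et al. (2020) are simplifying assumptions of this sketch, not details from the paper.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression

def gradual_self_training(base_model, X_src, y_src, intermediate_domains):
    """Gradual self-training (simplified sketch).

    X_src, y_src: labeled source data.
    intermediate_domains: list of unlabeled feature arrays, ordered along
        the path from source towards target (T domains in total).
    """
    model = clone(base_model).fit(X_src, y_src)        # step 0: supervised fit on the source
    for X_t in intermediate_domains:                    # steps 1..T: one domain at a time
        pseudo_labels = model.predict(X_t)              # pseudo-label the next domain
        model = clone(base_model).fit(X_t, pseudo_labels)  # refit on pseudo-labeled data
    return model

# Toy usage (hypothetical data): two Gaussian blobs drifting along a path of T = 8 domains.
rng = np.random.default_rng(0)

def make_domain(shift, n=500):
    X = np.vstack([rng.normal([-2.0 + shift, 0.0], 1.0, (n, 2)),
                   rng.normal([2.0 + shift, 0.0], 1.0, (n, 2))])
    y = np.array([0] * n + [1] * n)
    return X, y

X_src, y_src = make_domain(0.0)                                   # labeled source
path = [make_domain(s)[0] for s in np.linspace(0.5, 4.0, 8)]      # unlabeled intermediate domains
X_tgt, y_tgt = make_domain(4.0)                                   # target, labels used for evaluation only

model = gradual_self_training(LogisticRegression(), X_src, y_src, path)
print("target accuracy:", model.score(X_tgt, y_tgt))
```

In this toy setup, the number of intermediate domains plays the role of T in the bound: too few steps make consecutive domains far apart (large Δ), while too many steps accumulate pseudo-labeling error, matching the trade-off the abstract describes.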


Related research:

- Sequential Unsupervised Domain Adaptation through Prototypical Distributions (07/01/2020): We develop an algorithm for unsupervised domain adaptation (UDA) of a cl...
- Algorithms and Theory for Supervised Gradual Domain Adaptation (04/25/2022): The phenomenon of data distribution evolving over time has been observed...
- A No-Free-Lunch Theorem for MultiTask Learning (06/29/2020): Multitask learning and related areas such as multi-source domain adaptat...
- Understanding Self-Training for Gradual Domain Adaptation (02/26/2020): Machine learning systems must adapt to data distributions that evolve ov...
- Information-Theoretic Analysis of Unsupervised Domain Adaptation (10/03/2022): This paper uses information-theoretic tools to analyze the generalizatio...
- Gradual Domain Adaptation without Indexed Intermediate Domains (07/11/2022): The effectiveness of unsupervised domain adaptation degrades when there ...
- Unsupervised domain adaptation with non-stochastic missing data (09/16/2021): We consider unsupervised domain adaptation (UDA) for classification prob...
