Super-model ecosystem: A domain-adaptation perspective

08/30/2022
by Fengxiang He, et al.

This paper attempts to establish a theoretical foundation for the emerging super-model paradigm via domain adaptation, in which one first trains a very large-scale model, i.e., a super model (called a foundation model in some other papers), on a large amount of data and then adapts it to various specific domains. The super-model paradigm helps reduce computational cost, data cost, and carbon emissions, which is critical to the AI industry, especially to its very large number of small and medium-sized enterprises. We model the super-model paradigm as a two-stage diffusion process: (1) in the pre-training stage, the model parameter diffuses from a random initialization and converges to a steady distribution; and (2) in the fine-tuning stage, the model parameter is transported to another steady distribution. Both training stages can be mathematically modeled by the Ornstein-Uhlenbeck process, which converges to a Maxwell-Boltzmann distribution in each stage; each of the two distributions characterizes the corresponding converged model. An 𝒪(1/√N) generalization bound is then established via the PAC-Bayesian framework. The theory finds that the generalization error of the fine-tuning stage is dominant in domain adaptation. In addition, our theory suggests that generalization is determined by a new measure that characterizes the domain discrepancy between the source domain and the target domain, based on the covariance matrices and the shift of the converged local minimum.
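The two-stage picture can be illustrated with a minimal one-dimensional sketch. It is not the paper's model, which is multi-dimensional and involves covariance matrices; it only assumes a scalar Ornstein-Uhlenbeck-type process dθ = -κ(θ - μ)dt + σ dW, discretized with the Euler-Maruyama scheme. The names `simulate_ou`, `kappa`, `sigma`, `mu_source`, and `mu_target`, and all numeric values, are hypothetical choices made for illustration. In this scalar case the steady distribution is Gaussian with mean μ and variance σ²/(2κ), so the pre-training stage should settle around `mu_source` and the fine-tuning stage should transport the parameters to `mu_target`.

```python
import numpy as np


def simulate_ou(theta0, mu, kappa, sigma, dt, n_steps, rng):
    """Euler-Maruyama simulation of d(theta) = -kappa*(theta - mu)*dt + sigma*dW.

    theta0 holds one starting value per independent chain; the function
    returns the value of every chain after n_steps steps.
    """
    theta = np.array(theta0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(theta.shape)
        theta = theta - kappa * (theta - mu) * dt + sigma * np.sqrt(dt) * noise
    return theta


rng = np.random.default_rng(0)

# Hypothetical toy settings (assumptions, not values from the paper).
n_chains = 20000
kappa, sigma = 1.0, 0.5
dt, n_steps = 0.01, 1000
mu_source, mu_target = 0.0, 2.0  # stand-ins for the two local minima

# Stage 1 (pre-training): start from a broad random initialization.
theta_init = 3.0 * rng.standard_normal(n_chains)
theta_pre = simulate_ou(theta_init, mu_source, kappa, sigma, dt, n_steps, rng)

# Stage 2 (fine-tuning): start from the pre-trained parameters.
theta_fine = simulate_ou(theta_pre, mu_target, kappa, sigma, dt, n_steps, rng)

# Variance of the steady Gaussian for the continuous-time scalar process.
stationary_var = sigma**2 / (2.0 * kappa)

print("pre-training  mean/var:", theta_pre.mean(), theta_pre.var())
print("fine-tuning   mean/var:", theta_fine.mean(), theta_fine.var())
print("theoretical variance  :", stationary_var)
```

With the same `kappa` and `sigma` in both stages, the two steady distributions differ only by the shift from `mu_source` to `mu_target`. Giving the stages different noise levels would also change the variance, loosely mirroring the role the abstract assigns to the covariance matrices and the shift of the converged local minimum.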


research
03/03/2021

Gradual Fine-Tuning for Low-Resource Domain Adaptation

Fine-tuning is known to improve NLP models by adapting an initial model ...
research
06/15/2015

A New PAC-Bayesian Perspective on Domain Adaptation

We study the issue of PAC-Bayesian domain adaptation: We want to learn, ...
research
10/19/2022

Variational Model Perturbation for Source-Free Domain Adaptation

We aim for source-free domain adaptation, where the task is to deploy a ...
research
01/13/2015

An Improvement to the Domain Adaptation Bound in a PAC-Bayesian context

This paper provides a theoretical analysis of domain adaptation based on...
research
06/23/2020

Domain Adaptation for Semantic Parsing

Recently, semantic parsing has attracted much attention in the community...
research
02/07/2021

Domain Adversarial Neural Networks for Domain Generalization: When It Works and How to Improve

Theoretically, domain adaptation is a well-researched problem. Further, ...
