Information-Theoretic Bounds on Transfer Generalization Gap Based on Jensen-Shannon Divergence

10/13/2020
by Sharu Theresa Jose, et al.

In transfer learning, training and testing data sets are drawn from different data distributions. The transfer generalization gap is the difference between the population loss on the target data distribution and the training loss. The training data set generally includes data drawn from both the source and target distributions. This work presents novel information-theoretic upper bounds on the average transfer generalization gap that capture (i) the domain shift between the target data distribution P'_Z and the source distribution P_Z through a two-parameter family of generalized (α_1,α_2)-Jensen-Shannon (JS) divergences; and (ii) the sensitivity of the transfer learner output W to each individual data sample Z_i via the mutual information I(W;Z_i). For α_1 ∈ (0,1), the (α_1,α_2)-JS divergence remains bounded even when the support of P_Z is not included in that of P'_Z. This contrasts with the Kullback-Leibler (KL) divergence D_KL(P_Z||P'_Z)-based bounds of Wu et al. [1], which are vacuous in this case. Moreover, the obtained bounds hold for unbounded loss functions with bounded cumulant generating functions, unlike the ϕ-divergence-based bound of Wu et al. [1]. We also derive new upper bounds on the average transfer excess risk in terms of the (α_1,α_2)-JS divergence for empirical weighted risk minimization (EWRM), which minimizes a weighted average of the training losses over the source and target data sets. Finally, we provide a numerical example illustrating the merits of the introduced bounds.
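To make the support-mismatch point concrete, below is a minimal numerical sketch (not code from the paper) of a mixture-weighted Jensen-Shannon divergence, assuming the common form α D_KL(P||M) + (1-α) D_KL(Q||M) with mixture M = αP + (1-α)Q; the role of the paper's second parameter α_2 is not modeled, and the distributions p_source and p_target are hypothetical. For α ∈ (0,1) the mixture M dominates both distributions, so the divergence stays finite even when D_KL(P_Z||P'_Z) is infinite because P_Z places mass outside the support of P'_Z.

```python
# Illustrative sketch (not code from the paper): a mixture-weighted
# Jensen-Shannon divergence between discrete distributions, assuming the form
#   JS_alpha(P || Q) = alpha * KL(P || M) + (1 - alpha) * KL(Q || M),
#   with M = alpha * P + (1 - alpha) * Q.
# The second weight alpha_2 of the paper's (alpha_1, alpha_2)-JS divergence
# is not modeled; the distributions below are hypothetical.
import numpy as np


def kl(p, q):
    """D_KL(p || q) for discrete distributions, with the convention 0 log 0 = 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def generalized_js(p, q, alpha=0.5):
    """Mixture-weighted JS divergence; finite for alpha in (0, 1)."""
    m = alpha * p + (1 - alpha) * q
    return alpha * kl(p, m) + (1 - alpha) * kl(q, m)


# Source P_Z puts mass on the last symbol, which the target P'_Z does not
# support, so D_KL(P_Z || P'_Z) diverges to infinity ...
p_source = np.array([0.5, 0.3, 0.2])
p_target = np.array([0.6, 0.4, 0.0])
with np.errstate(divide="ignore"):
    print(kl(p_source, p_target))                     # inf

# ... while the mixture-weighted JS divergence remains finite.
print(generalized_js(p_source, p_target, alpha=0.3))  # ~0.08
```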

Related research:
- Transfer Meta-Learning: Information-Theoretic Bounds and Information Meta-Risk Minimization (11/04/2020)
- Multiple Source Adaptation and the Renyi Divergence (05/09/2012)
- An Information-Theoretic Analysis for Transfer Learning: Error Bounds and Applications (07/12/2022)
- Beyond ℋ-Divergence: Domain Adaptation Theory With Jensen-Shannon Divergence (07/30/2020)
- Transfer Learning in Quantum Parametric Classifiers: An Information-Theoretic Generalization Analysis (01/17/2022)
- Information-theoretic analysis for transfer learning (05/18/2020)
- Two Facets of SDE Under an Information-Theoretic Lens: Generalization of SGD via Training Trajectories and via Terminal States (11/19/2022)
