Robust Fine-Tuning of Deep Neural Networks with Hessian-based Generalization Guarantees

06/06/2022
by Haotian Ju, et al.

We consider transfer learning approaches that fine-tune a pretrained deep neural network on a target task. We investigate the generalization properties of fine-tuning to understand the problem of overfitting, which often happens in practice. Previous works have shown that constraining the distance from the initialization of fine-tuning improves generalization. Using a PAC-Bayesian analysis, we observe that besides the distance from initialization, Hessians affect generalization through the noise stability of deep neural networks against noise injections. Motivated by this observation, we develop Hessian distance-based generalization bounds for a wide range of fine-tuning methods. Next, we investigate the robustness of fine-tuning with noisy labels. We design an algorithm that incorporates consistent losses and distance-based regularization for fine-tuning, and we prove a generalization error bound for our algorithm under class-conditional independent noise in the training dataset labels. We perform a detailed empirical study of our algorithm in various noisy environments and with different architectures. For example, on six image classification tasks whose training labels are generated with programmatic labeling, we show a 3.26% accuracy gain over prior methods, while the Hessian distance measure of the fine-tuned network trained with our algorithm decreases by six times compared with existing approaches.
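The abstract describes two ingredients: fine-tuning with distance-based regularization (plus a noise-robust, consistent loss) and a Hessian-related noise-stability measure. The sketch below is a minimal, hypothetical PyTorch illustration of these ideas, not the authors' released code: fine_tune adds an L2 penalty on the distance to the pretrained weights (plain cross-entropy stands in for the paper's consistent loss), and noise_stability probes the loss change under Gaussian weight perturbations as a rough proxy for the Hessian-based measure. All function names and hyperparameters are assumptions.

```python
# Hypothetical sketch of distance-regularized fine-tuning and a
# noise-injection stability probe; illustrative only.
import copy
import torch
import torch.nn.functional as F


def fine_tune(model, loader, epochs=3, lr=1e-3, reg_strength=0.1):
    """Fine-tune `model`, penalizing the L2 distance to its initialization."""
    init_params = [p.detach().clone() for p in model.parameters()]
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            # Cross-entropy here; a noise-consistent loss could be swapped in.
            loss = F.cross_entropy(model(x), y)
            dist = sum(((p - p0) ** 2).sum()
                       for p, p0 in zip(model.parameters(), init_params))
            (loss + reg_strength * dist).backward()
            opt.step()
    return model


@torch.no_grad()
def noise_stability(model, loader, sigma=0.01, n_trials=5):
    """Average loss increase under isotropic Gaussian weight perturbations,
    used here as a rough proxy for Hessian-based noise stability."""
    def avg_loss(m):
        total, count = 0.0, 0
        for x, y in loader:
            total += F.cross_entropy(m(x), y, reduction="sum").item()
            count += y.numel()
        return total / count

    base = avg_loss(model)
    gaps = []
    for _ in range(n_trials):
        noisy = copy.deepcopy(model)
        for p in noisy.parameters():
            p.add_(sigma * torch.randn_like(p))
        gaps.append(avg_loss(noisy) - base)
    return sum(gaps) / len(gaps)
```

A smaller stability gap after fine-tuning would be consistent with the flatter, lower-Hessian solutions that the abstract associates with better generalization; the paper's actual algorithm and measure may differ in detail.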


research · 11/08/2021
Improved Regularization and Robustness for Fine-tuning in Neural Networks
A widely used algorithm for transfer learning is fine-tuning, where a pr...

research · 11/01/2020
An Information-Geometric Distance on the Space of Tasks
This paper computes a distance between tasks modeled as joint distributi...

research · 02/19/2020
Distance-Based Regularisation of Deep Networks for Fine-Tuning
We investigate approaches to regularisation during fine-tuning of deep n...

research · 09/19/2018
Identifying Generalization Properties in Neural Networks
While it has not yet been proven, empirical evidence suggests that model...

research · 02/09/2023
Generalization in Graph Neural Networks: Improved PAC-Bayesian Bounds on Graph Diffusion
Graph neural networks are widely used tools for graph prediction tasks. ...

research · 05/25/2019
Efficient Neural Task Adaptation by Maximum Entropy Initialization
Transferring knowledge from one neural network to another has been shown...

research · 01/29/2022
Transfer Learning for Estimation of Pendubot Angular Position Using Deep Neural Networks
In this paper, a machine learning based approach is introduced to estima...
