Towards Making Deep Transfer Learning Never Hurt

11/18/2019
by   Ruosi Wan, et al.

Transfer learning has frequently been used to improve deep neural network training by incorporating the weights of a pre-trained network both as the starting point of optimization and as a regularizer. While deep transfer learning usually boosts performance with better accuracy and faster convergence, transferring weights from an inappropriate network can hurt the training procedure and may even lead to lower accuracy. In this paper, we view deep transfer learning as minimizing a linear combination of the empirical loss and a regularizer based on the pre-trained weights, where the regularizer can prevent the training procedure from lowering the empirical loss when the two terms have conflicting descent directions (e.g., gradients). Following this view, we propose a novel strategy for making regularization-based Deep Transfer learning Never Hurt (DTNH): at each iteration of the training procedure, it computes the derivatives of the two terms separately, then re-estimates a new descent direction that does not hurt the empirical loss minimization while preserving the regularization effect of the pre-trained weights. Extensive experiments have been conducted with common transfer learning regularizers, such as L2-SP and knowledge distillation, on a wide range of deep transfer learning benchmarks, including Caltech, MIT Indoor 67, CIFAR-10, and ImageNet. The empirical results show that the proposed descent-direction estimation strategy DTNH consistently improves the performance of deep transfer learning based on all of the above regularizers, even when transferring pre-trained weights from inappropriate networks. All in all, DTNH improves on state-of-the-art regularizers in all cases, with at least 0.1 higher accuracy in all experiments.
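The abstract describes computing the gradients of the empirical loss and the regularizer separately, then re-estimating a descent direction that does not conflict with empirical-loss minimization. A minimal sketch of one such conflict-aware rule, assuming a simple gradient projection (the function name `dtnh_direction`, the `reg_weight` parameter, and the projection rule itself are illustrative assumptions, not the paper's exact procedure):

```python
import numpy as np

def dtnh_direction(grad_loss, grad_reg, reg_weight=0.01):
    """Hypothetical sketch of a DTNH-style descent direction.

    If the regularizer's gradient conflicts with the empirical-loss
    gradient (negative inner product), project away the conflicting
    component so the combined step does not increase the empirical
    loss to first order.
    """
    g_l = np.asarray(grad_loss, dtype=float)
    g_r = np.asarray(grad_reg, dtype=float)
    dot = g_l @ g_r
    if dot < 0:
        # Remove the component of g_r that opposes g_l.
        g_r = g_r - (dot / (g_l @ g_l)) * g_l
    # Combine the loss gradient with the (possibly projected)
    # regularizer gradient, as in the linear-combination objective.
    return g_l + reg_weight * g_r

# With conflicting gradients, the projected regularizer term is
# orthogonal to g_l, so the final direction still descends the loss.
d = dtnh_direction([1.0, 0.0], [-1.0, 1.0], reg_weight=0.5)
```

By construction, the returned direction always has a non-negative inner product with the empirical-loss gradient, which is the "never hurt" property the abstract emphasizes.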

