Towards Understanding the Transferability of Deep Representations

09/26/2019
by   Hong Liu, et al.

Deep neural networks trained on a wide range of datasets demonstrate impressive transferability. Deep features appear general in that they are applicable to many datasets and tasks. This property is widely exploited in real-world applications: a neural network pretrained on a large dataset, such as ImageNet, can significantly boost generalization and accelerate training when fine-tuned on a smaller target dataset. Despite its pervasiveness, little effort has been devoted to uncovering the reasons for transferability in deep feature representations. This paper seeks to understand transferability from the perspectives of improved generalization, optimization, and the feasibility of transfer. We demonstrate that 1) transferred models tend to find flatter minima, since their weight matrices stay close to the original flat region of the pretrained parameters when transferred to a similar target dataset; 2) transferred representations make the loss landscape more favorable, with improved Lipschitzness, which substantially accelerates and stabilizes training. This improvement is largely attributable to the fact that the principal component of the gradient is suppressed at the pretrained parameters, stabilizing the magnitude of the gradient in back-propagation; 3) the feasibility of transfer is related to the similarity of both inputs and labels. A surprising discovery is that feasibility is also affected by the training stage: transferability first increases during training and then declines. We further provide a theoretical analysis to verify our observations.
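The notion of a "flatter minimum" in finding 1) can be made concrete with a simple perturbation-based sharpness proxy: average how much the loss rises under small random parameter perturbations of fixed norm. The sketch below is purely illustrative (it is not the paper's exact measure, and `flatness_gap`, `flat_loss`, and `sharp_loss` are hypothetical names); it contrasts a flat and a sharp toy minimum.

```python
import numpy as np

def flatness_gap(loss_fn, w, n_samples=200, radius=0.1, seed=0):
    """Sharpness proxy at parameters w: mean increase in loss under
    random perturbations of norm `radius`. Smaller value = flatter."""
    rng = np.random.default_rng(seed)
    base = loss_fn(w)
    gaps = []
    for _ in range(n_samples):
        d = rng.standard_normal(w.shape)
        d *= radius / np.linalg.norm(d)   # project onto sphere of given radius
        gaps.append(loss_fn(w + d) - base)
    return float(np.mean(gaps))

# Two toy minima at w = 0: a flat quadratic bowl and a sharp one.
flat_loss  = lambda w: 0.5 * np.sum(w ** 2)           # curvature 1
sharp_loss = lambda w: 0.5 * 100.0 * np.sum(w ** 2)   # curvature 100

w0 = np.zeros(10)
print(flatness_gap(flat_loss, w0) < flatness_gap(sharp_loss, w0))  # True
```

Under this proxy, a model fine-tuned from pretrained weights that stays near a flat region would exhibit a smaller gap than one landing in a sharp basin, matching the paper's claim.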


