On the Connection between Pre-training Data Diversity and Fine-tuning Robustness

07/24/2023
by Vivek Ramanujan et al.

Pre-training has been widely adopted in deep learning to improve model performance, especially when the training data for a target task is limited. In our work, we seek to understand the implications of this training strategy on the generalization properties of downstream models. More specifically, we ask the following question: how do properties of the pre-training distribution affect the robustness of a fine-tuned model? The properties we explore include the label space, label semantics, image diversity, data domains, and data quantity of the pre-training distribution. We find that the primary factor influencing downstream effective robustness (Taori et al., 2020) is data quantity, while other factors have limited significance. For example, reducing the number of ImageNet pre-training classes by 4x while increasing the number of images per class by 4x (that is, keeping total data quantity fixed) does not impact the robustness of fine-tuned models. We demonstrate our findings on pre-training distributions drawn from various natural and synthetic data sources, primarily using the iWildCam-WILDS distribution shift as a test for downstream robustness.
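Effective robustness (Taori et al., 2020) measures how far a model's out-of-distribution accuracy sits above the trend predicted from its in-distribution accuracy by a baseline fit, typically a linear fit in logit-transformed accuracies over models trained without pre-training. Below is a minimal sketch of that computation; the helper names and the example numbers are illustrative, not taken from the paper.

```python
import numpy as np

def logit(p):
    """Logit transform; accuracy trends are fit in this space (Taori et al., 2020)."""
    p = np.clip(p, 1e-6, 1 - 1e-6)
    return np.log(p / (1 - p))

def fit_baseline(id_acc, ood_acc):
    """Fit a linear trend in logit space from reference models
    (e.g. models trained from scratch on the target task).
    Returns (slope, intercept)."""
    slope, intercept = np.polyfit(logit(np.asarray(id_acc)),
                                  logit(np.asarray(ood_acc)), 1)
    return slope, intercept

def effective_robustness(id_acc, ood_acc, slope, intercept):
    """OOD accuracy above what the baseline trend predicts at the same ID accuracy."""
    predicted_ood = 1.0 / (1.0 + np.exp(-(slope * logit(id_acc) + intercept)))
    return ood_acc - predicted_ood

# Illustrative numbers only: a baseline from three hypothetical from-scratch models,
# then the score for one hypothetical fine-tuned model.
slope, intercept = fit_baseline([0.55, 0.62, 0.70], [0.30, 0.36, 0.45])
print(effective_robustness(0.68, 0.44, slope, intercept))
```

The class-count experiment described above (4x fewer classes, 4x more images per class, total quantity fixed) amounts to subsampling the pre-training set along two axes at once. One way such a subset could be constructed is sketched below; the function `fixed_quantity_subset` and its arguments are assumptions for illustration, not the paper's implementation.

```python
import random
from collections import defaultdict

def fixed_quantity_subset(samples, num_classes, images_per_class, seed=0):
    """Sample fewer classes but more images per class, keeping
    num_classes * images_per_class (total pre-training quantity) fixed.

    samples: iterable of (image_path, class_label) pairs from the source dataset.
    Assumes every kept class has at least `images_per_class` images.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for path, label in samples:
        by_class[label].append(path)
    kept_labels = rng.sample(sorted(by_class), num_classes)  # choose the label subset
    return [(path, label)
            for label in kept_labels
            for path in rng.sample(by_class[label], images_per_class)]
```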


