An Empirical Investigation of Pre-trained Model Selection for Out-of-Distribution Generalization and Calibration

07/17/2023
by Hiroki Naganuma, et al.

In out-of-distribution (OOD) generalization tasks, fine-tuning has emerged as a key strategy. While most prior work has focused on optimizing the learning algorithm, our research highlights the influence of pre-trained model selection on OOD performance and inference uncertainty during fine-tuning. Within the model-size constraints of a single GPU, we examined how the choice of pre-training dataset and the number of model parameters affect performance metrics such as accuracy and expected calibration error. Our findings show that pre-trained model selection has a marked effect on performance, larger than that of the learning algorithm itself. Larger models outperformed smaller ones, though the balance between memorization and true generalization merits further investigation. Ultimately, our results emphasize the importance of pre-trained model selection for enhancing out-of-distribution generalization.
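Expected calibration error (ECE), one of the metrics named above, is the weighted average over confidence bins of the gap between empirical accuracy and mean predicted confidence. Below is a minimal Python sketch of the standard binned estimator; the bin count (15 equal-width bins) and the function name are illustrative assumptions, not the paper's exact evaluation code.

import numpy as np

def expected_calibration_error(confidences, predictions, labels, n_bins=15):
    # Binned ECE: weighted average over confidence bins of
    # |empirical accuracy - mean predicted confidence|.
    confidences = np.asarray(confidences, dtype=float)
    correct = (np.asarray(predictions) == np.asarray(labels)).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            weight = in_bin.mean()             # fraction of samples in this bin
            acc = correct[in_bin].mean()       # accuracy among those samples
            conf = confidences[in_bin].mean()  # their mean predicted confidence
            ece += weight * abs(acc - conf)
    return ece

For intuition, a model whose predictions made at 90% confidence are correct only 70% of the time contributes a 0.2 gap to the score, weighted by how often such predictions occur.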

Related research

10/21/2021 - Ensemble of Averages: Improving Model Selection and Boosting Performance in Domain Generalization
  In Domain Generalization (DG) settings, models trained on a given set of...

10/27/2022 - Do Pre-trained Models Benefit Equally in Continual Learning?
  Existing work on continual learning (CL) is primarily devoted to develop...

10/20/2020 - Model-specific Data Subsampling with Influence Functions
  Model selection requires repeatedly evaluating models on a given dataset...

09/05/2023 - A study on the impact of pre-trained model on Just-In-Time defect prediction
  Previous researchers conducting Just-In-Time (JIT) defect prediction tas...

12/08/2021 - The Effect of Model Size on Worst-Group Generalization
  Overparameterization is shown to result in poor test accuracy on rare su...

10/21/2019 - Detecting Extrapolation with Local Ensembles
  We present local ensembles, a method for detecting extrapolation at test...

04/07/2021 - Interpreting A Pre-trained Model Is A Key For Model Architecture Optimization: A Case Study On Wav2Vec 2.0
  A deep Transformer model with good evaluation score does not mean each s...
