Model soups to increase inference without increasing compute time

01/24/2023
by Charles Dansereau, et al.

In this paper, we compare the performance of Model Soups on three different models (ResNet, ViT and EfficientNet) using three Soup Recipes (Greedy Soup Sorted, Greedy Soup Random and Uniform Soup) from arXiv:2203.05482, and reproduce the results of the authors. We then introduce a new Soup Recipe called Pruned Soup. The soups outperformed the best individual model for the pre-trained vision transformer, but performed much worse than the best individual model for the ResNet and the EfficientNet. Our Pruned Soup performed better than the uniform and greedy soups presented in the original paper. We also discuss the limitations of weight-averaging that were found during the experiments. The code for our model soup library and the experiments with different models can be found here: https://github.com/milo-sobral/ModelSoup

