Lottery Pools: Winning More by Interpolating Tickets without Increasing Training or Inference Cost

08/23/2022
by Lu Yin, et al.

Lottery tickets (LTs) are able to discover accurate and sparse subnetworks that can be trained in isolation to match the performance of dense networks. Ensembling, in parallel, is one of the oldest time-proven tricks in machine learning to improve performance by combining the outputs of multiple independent models. However, the benefits of ensembling are diluted in the context of LTs, since ensembling does not directly lead to stronger sparse subnetworks but merely leverages their predictions for a better decision. In this work, we first observe that directly averaging the weights of adjacent learned subnetworks significantly boosts the performance of LTs. Encouraged by this observation, we further propose an alternative way to perform an 'ensemble' over the subnetworks identified by iterative magnitude pruning, via a simple interpolation strategy. We call our method Lottery Pools. In contrast to the naive ensemble, which brings no performance gains to each single subnetwork, Lottery Pools yields much stronger sparse subnetworks than the original LTs without requiring any extra training or inference cost. Across various modern architectures on CIFAR-10/100 and ImageNet, we show that our method achieves significant performance gains in both in-distribution and out-of-distribution scenarios. Impressively, evaluated with VGG-16 and ResNet-18, the produced sparse subnetworks outperform the original LTs by up to 1.88% and 2.36%, respectively, and even surpass their dense-model counterparts by up to 2.22%.
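
To make the interpolation idea concrete, below is a minimal, hypothetical PyTorch-style sketch of averaging IMP subnetworks under a fixed sparsity mask. The function names, the small alpha grid, the greedy merging order, and the mask re-application step are illustrative assumptions, not the authors' released implementation.

```python
import copy
import torch


def interpolate_subnetworks(anchor_state, candidate_state, alpha):
    """Linear interpolation of two checkpoints: (1 - alpha) * anchor + alpha * candidate.

    Both checkpoints are assumed to share the same architecture (identical keys
    and shapes), as is the case for subnetworks found by successive IMP rounds.
    """
    merged = copy.deepcopy(anchor_state)
    for name, anchor_w in anchor_state.items():
        if torch.is_floating_point(anchor_w):
            merged[name] = (1.0 - alpha) * anchor_w + alpha * candidate_state[name]
    return merged


def apply_mask(state, mask):
    """Re-apply the anchor ticket's binary mask so the merged weights stay sparse."""
    masked = dict(state)
    for name, m in mask.items():
        masked[name] = masked[name] * m
    return masked


def lottery_pool(anchor_state, pool_states, mask, evaluate, alphas=(0.25, 0.5, 0.75, 1.0)):
    """Greedily merge a pool of IMP subnetworks into the anchor ticket.

    `evaluate` is a user-supplied callable mapping a state_dict to validation
    accuracy; the alpha grid and the greedy order are illustrative choices.
    """
    best_state = apply_mask(anchor_state, mask)
    best_acc = evaluate(best_state)
    for candidate in pool_states:
        for alpha in alphas:
            trial = apply_mask(interpolate_subnetworks(best_state, candidate, alpha), mask)
            acc = evaluate(trial)
            if acc > best_acc:
                best_state, best_acc = trial, acc
    return best_state, best_acc
```

Because the anchor ticket's mask is re-applied after every merge in this sketch, the sparsity pattern (and hence the inference cost) is unchanged; only the surviving weight values are averaged, and no additional training is performed.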

Related research:

- FreeTickets: Accurate, Robust and Efficient Deep Ensemble by Training with Dynamic Sparsity (06/28/2021)
- The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training (02/05/2022)
- PopulAtion Parameter Averaging (PAPA) (04/06/2023)
- How Well Do Sparse Imagenet Models Transfer? (11/26/2021)
- SWAMP: Sparse Weight Averaging with Multiple Particles for Iterative Magnitude Pruning (05/24/2023)
- Boost Neural Networks by Checkpoints (10/03/2021)
- Training independent subnetworks for robust prediction (10/13/2020)
