Deep Ensembles on a Fixed Memory Budget: One Wide Network or Several Thinner Ones?

05/14/2020
by Nadezhda Chirkova, et al.

One of the generally accepted views of modern deep learning is that increasing the number of parameters usually leads to better quality. The two easiest ways to increase the number of parameters are to increase the size of the network, e.g. its width, or to train a deep ensemble; both approaches improve performance in practice. In this work, we consider a fixed memory budget setting and investigate what is more effective: to train a single wide network, or to perform a memory split, i.e., to train an ensemble of several thinner networks with the same total number of parameters. We find that, for large enough budgets, the number of networks in the ensemble corresponding to the optimal memory split is usually larger than one. Interestingly, this effect holds for the commonly used sizes of standard architectures. For example, a single WideResNet-28-10 achieves significantly worse test accuracy on CIFAR-100 (80.6%) than an ensemble of sixteen thinner WideResNets with the same total number of parameters. We call the described effect the Memory Split Advantage and show that it holds for a variety of datasets and model architectures.
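
As an illustration of how a memory split can be sized, the sketch below (not code from the paper) estimates the width factor for each of k thinner WideResNet-style members so that the ensemble matches the parameter budget of a single wide network. It assumes parameter count grows roughly quadratically with the width factor; the function names and the ~36.5M-parameter baseline for WideResNet-28-10 are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: sizing a "memory split" under a fixed parameter budget.
# Assumption (hypothetical, not from the paper): for WideResNet-style nets,
# parameter count scales roughly quadratically with the width factor, so
# k equally sized members fit the budget when each is ~1/sqrt(k) as wide.

import math


def param_count(width_factor: float, base_params: float = 36.5e6,
                reference_width: float = 10.0) -> float:
    """Approximate parameter count, assuming quadratic scaling in width.

    Defaults roughly mimic WideResNet-28-10 (~36.5M parameters at width 10).
    """
    return base_params * (width_factor / reference_width) ** 2


def split_width(k: int, reference_width: float = 10.0) -> float:
    """Width factor for each of k ensemble members under the same total budget."""
    return reference_width / math.sqrt(k)


if __name__ == "__main__":
    budget = param_count(10.0)  # budget of a single WideResNet-28-10
    for k in (1, 2, 4, 8, 16):
        w = split_width(k)
        total = k * param_count(w)
        print(f"k={k:2d}: width factor ~ {w:4.2f}, "
              f"total params ~ {total / 1e6:.1f}M (budget {budget / 1e6:.1f}M)")
```

Under this quadratic-scaling assumption, a split into sixteen members gives each network a width factor of 2.5, so the whole ensemble still occupies the memory of one WideResNet-28-10.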
