
Economical ensembles with hypernetworks

by João Sacramento et al.

Averaging the predictions of many independently trained neural networks is a simple and effective way of improving generalization in deep learning. However, this strategy rapidly becomes costly, as the number of trainable parameters grows linearly with the size of the ensemble. Here, we propose a new method to learn economical ensembles, where the number of trainable parameters and iterations over the data is comparable to that of a single model. Our neural networks are parameterized by hypernetworks, which learn to embed weights in low-dimensional spaces. In a late training phase, we generate an ensemble by randomly initializing an additional number of weight embeddings in the vicinity of each other. We then exploit the inherent randomness in stochastic gradient descent to induce ensemble diversity. Experiments with wide residual networks on the CIFAR and Fashion-MNIST datasets show that our algorithm yields models that are more accurate and less overconfident on unseen data, while learning as efficiently as a single network.
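The idea in the abstract can be sketched in a few lines: a hypernetwork maps a low-dimensional embedding to the full weights of the main network, and the late-phase ensemble is obtained by spawning several embeddings near the trained one and averaging the members' predictions. The sketch below is a minimal NumPy illustration under toy dimensions; the linear hypernetwork, the tiny linear main network, and the perturbation scale are all hypothetical stand-ins (the paper trains wide residual networks and relies on SGD noise, not just perturbation, to diversify the members).

```python
import numpy as np

rng = np.random.default_rng(0)

# --- toy dimensions (hypothetical; the paper uses wide ResNets) ---
EMB_DIM = 8         # low-dimensional weight embedding
N_IN, N_OUT = 4, 3  # tiny linear "main network": weights of shape (N_IN, N_OUT)
N_WEIGHTS = N_IN * N_OUT

# A linear hypernetwork: maps an embedding z to the main network's weights.
H = rng.normal(scale=0.1, size=(N_WEIGHTS, EMB_DIM))

def generate_weights(z):
    """Hypernetwork forward pass: embedding -> flattened main-network weights."""
    return (H @ z).reshape(N_IN, N_OUT)

def softmax(logits):
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def predict(z, x):
    """Prediction of the ensemble member parameterized by embedding z."""
    return softmax(x @ generate_weights(z))

# Suppose z_star is the embedding learned during the main training phase.
z_star = rng.normal(size=EMB_DIM)

# Late training phase: initialize K additional embeddings in the vicinity
# of z_star (here a Gaussian perturbation; in the paper, SGD noise then
# drives the members apart to induce diversity).
K = 5
ensemble = [z_star + 0.05 * rng.normal(size=EMB_DIM) for _ in range(K)]

# Ensemble prediction: average the members' softmax outputs.
x = rng.normal(size=(2, N_IN))  # a batch of 2 inputs
probs = np.mean([predict(z, x) for z in ensemble], axis=0)
print(probs.shape)  # (2, 3)
```

Note that only the embeddings are duplicated, not the hypernetwork `H`, which is why the parameter count stays close to that of a single model: each extra member costs `EMB_DIM` parameters rather than `N_WEIGHTS`.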

