BatchEnsemble: an Alternative Approach to Efficient Ensemble and Lifelong Learning

02/17/2020
by   Yeming Wen, et al.
Ensembles, in which multiple neural networks are trained individually and their predictions are averaged, have been widely successful at improving both the accuracy and the predictive uncertainty of single neural networks. However, an ensemble's cost for both training and testing increases linearly with the number of networks, which quickly becomes untenable. In this paper, we propose BatchEnsemble, an ensemble method whose computational and memory costs are significantly lower than those of typical ensembles. BatchEnsemble achieves this by defining each weight matrix as the Hadamard product of a weight shared among all ensemble members and a rank-one matrix per member. Unlike typical ensembles, BatchEnsemble is not only parallelizable across devices, where one device trains one member, but also parallelizable within a device, where multiple ensemble members are updated simultaneously for a given mini-batch. Across CIFAR-10, CIFAR-100, WMT14 EN-DE/EN-FR translation, and out-of-distribution tasks, BatchEnsemble yields accuracy and uncertainty estimates competitive with typical ensembles; at an ensemble size of 4, the speedup at test time is 3X and the memory reduction is 3X. We also apply BatchEnsemble to lifelong learning, where on Split-CIFAR-100 it yields performance comparable to progressive neural networks at much lower computational and memory cost. We further show that BatchEnsemble easily scales up to lifelong learning on Split-ImageNet, which involves 100 sequential learning tasks.
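The rank-one construction described above can be sketched in a few lines of NumPy. The key identity is that for a shared weight W and per-member vectors s_i (input scale) and r_i (output scale), a forward pass through the member-specific weight W ∘ (s_i r_iᵀ) never needs to be materialized: x (W ∘ s_i r_iᵀ) = ((x ∘ s_i) W) ∘ r_i. The names and shapes below are illustrative assumptions, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, n_members, batch = 8, 4, 4, 16

# One shared weight matrix for all ensemble members.
W = rng.normal(size=(d_in, d_out))
# Per-member rank-one factors: S[i] scales inputs, R[i] scales outputs.
S = rng.normal(size=(n_members, d_in))
R = rng.normal(size=(n_members, d_out))

def naive_member_forward(x, i):
    """Materialize member i's full weight W_i = W * (s_i r_i^T)."""
    W_i = W * np.outer(S[i], R[i])
    return x @ W_i

def batchensemble_forward(x, i):
    """Same result without materializing W_i: ((x * s_i) @ W) * r_i."""
    return ((x * S[i]) @ W) * R[i]

x = rng.normal(size=(batch, d_in))
for i in range(n_members):
    assert np.allclose(naive_member_forward(x, i),
                       batchensemble_forward(x, i))

# Within-device parallelism: split one mini-batch across the members,
# so every member is updated by a single shared matrix multiply.
xs = x.reshape(n_members, batch // n_members, d_in)
ys = ((xs * S[:, None, :]) @ W) * R[:, None, :]
assert ys.shape == (n_members, batch // n_members, d_out)
```

The memory saving in the abstract follows directly: each extra member adds only the vectors s_i and r_i (d_in + d_out parameters) rather than a full d_in × d_out weight matrix.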
