Effective training-time stacking for ensembling of deep neural networks

06/27/2022
by Polina Proscura, et al.

Ensembling is a popular and effective method for improving machine learning (ML) models, proving its value not only in classical ML but also in deep learning. Ensembles improve the quality and trustworthiness of ML solutions and enable uncertainty estimation. However, they come at a price: training an ensemble of deep learning models consumes a huge amount of computational resources. Snapshot ensembling instead collects ensemble members along a single training path; since training runs only once, the computational cost is similar to that of training one model. However, the quality of models along the training path varies: typically, later models are better if no overfitting occurs, so the snapshots are of unequal utility. Our method improves snapshot ensembling by selecting and weighting ensemble members along the training path. It relies on training-time likelihoods rather than the validation-set errors used by standard stacking methods. Experiments on the Fashion MNIST, CIFAR-10, and CIFAR-100 datasets demonstrate the superior quality of the proposed weighted ensembles compared to vanilla ensembling of deep learning models.
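The abstract does not spell out the exact selection and weighting rule, so the following is only a minimal NumPy sketch of the general idea, under an assumed rule in which each snapshot's weight is proportional to exp(-L_i / T), where L_i is that snapshot's training loss (negative log-likelihood) and T is a temperature. The names `weight_snapshots` and `weighted_ensemble_predict` are illustrative, not the authors' API.

```python
import numpy as np

def weight_snapshots(train_losses, temperature=1.0):
    """Turn per-snapshot training losses (negative log-likelihoods)
    into ensemble weights via a softmax over training-time likelihoods.

    Assumed rule (not from the paper): w_i is proportional to
    exp(-loss_i / T). Snapshots with higher training likelihood get
    larger weights; T controls how sharply the ensemble concentrates
    on the best snapshots.
    """
    scores = -np.asarray(train_losses, dtype=float) / temperature
    scores -= scores.max()              # subtract max for numerical stability
    w = np.exp(scores)
    return w / w.sum()

def weighted_ensemble_predict(snapshot_probs, weights):
    """Combine class-probability predictions from each snapshot as a
    weighted average.

    snapshot_probs: array of shape (n_snapshots, n_samples, n_classes)
    weights:        array of shape (n_snapshots,), summing to 1
    """
    snapshot_probs = np.asarray(snapshot_probs)
    # Contract the snapshot axis: result has shape (n_samples, n_classes).
    return np.tensordot(weights, snapshot_probs, axes=1)

# Toy usage: three snapshots saved along one training run; later
# snapshots have lower training loss and therefore larger weight.
losses = [1.2, 0.7, 0.4]                # training NLL at each snapshot
w = weight_snapshots(losses, temperature=0.5)
probs = np.random.dirichlet(np.ones(10), size=(3, 5))  # (snapshots, samples, classes)
ensemble_probs = weighted_ensemble_predict(probs, w)
print(w)                                # weights sum to 1, favor later snapshots
print(ensemble_probs.sum(axis=-1))      # each row is still a valid distribution
```

In this sketch, letting the temperature go to zero reduces the method to picking the single highest-likelihood snapshot, while a large temperature recovers the uniform averaging of vanilla snapshot ensembling.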

