Diversity Matters When Learning From Ensembles

10/27/2021
by Giung Nam, et al.

Deep ensembles excel in large-scale image classification tasks in terms of both prediction accuracy and calibration. Despite being simple to train, the computational and memory costs of deep ensembles limit their practicality. While some recent works propose to distill an ensemble into a single model to reduce these costs, there remains a performance gap between the ensemble and the distilled model. We propose a simple approach for reducing this gap, i.e., bringing the distilled model's performance close to that of the full ensemble. Our key assumption is that a distilled model should absorb as much of the function diversity inside the ensemble as possible. We first show empirically that the typical distillation procedure does not transfer such diversity effectively, especially for complex models that achieve near-zero training error. To fix this, we propose a perturbation strategy for distillation that reveals diversity by seeking inputs on which the ensemble members' outputs disagree. We show empirically that a model distilled with such perturbed samples indeed exhibits enhanced diversity, leading to improved performance.
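The abstract does not give implementation details, but the idea it describes can be sketched: perturb training inputs toward regions where the ensemble members disagree, then distill the averaged ensemble prediction on those perturbed inputs. Below is a minimal PyTorch sketch, assuming disagreement is measured as the sum of pairwise KL divergences between member predictions and the perturbation is found by a few signed-gradient ascent steps; the function names (disagreement, perturb_for_disagreement, distill_step) and all hyperparameters are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def disagreement(logits_list):
    """Sum of pairwise KL divergences between ensemble members'
    predictive distributions -- one possible diversity measure (assumption)."""
    probs = [F.softmax(l, dim=-1) for l in logits_list]
    log_probs = [F.log_softmax(l, dim=-1) for l in logits_list]
    total = 0.0
    for i in range(len(logits_list)):
        for j in range(len(logits_list)):
            if i != j:
                # KL(p_i || p_j): how much member j's prediction deviates from member i's
                total = total + F.kl_div(log_probs[j], probs[i], reduction="batchmean")
    return total

def perturb_for_disagreement(x, members, eps=0.03, steps=3, step_size=0.01):
    """Gradient-ascent perturbation of inputs toward regions where the
    ensemble members disagree (hypothetical variant of the paper's strategy)."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        logits = [m(x + delta) for m in members]
        loss = disagreement(logits)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += step_size * grad.sign()   # ascend on the disagreement objective
            delta.clamp_(-eps, eps)            # keep the perturbation small
    return (x + delta).detach()

def distill_step(student, members, x, optimizer, T=4.0):
    """One distillation step on perturbed inputs: the student matches the
    averaged, temperature-scaled ensemble prediction."""
    x_adv = perturb_for_disagreement(x, members)
    with torch.no_grad():
        teacher_probs = torch.stack(
            [F.softmax(m(x_adv) / T, dim=-1) for m in members]).mean(0)
    student_log_probs = F.log_softmax(student(x_adv) / T, dim=-1)
    # Standard distillation loss; the T*T factor keeps gradient scale comparable across temperatures
    loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * T * T
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```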
