Selective Ensembles for Consistent Predictions

by Emily Black, et al.

Recent work has shown that models trained to the same objective, and which achieve similar measures of accuracy on consistent test data, may nonetheless behave very differently on individual predictions. This inconsistency is undesirable in high-stakes contexts, such as medical diagnosis and finance. We show that this inconsistent behavior extends beyond predictions to feature attributions, which may likewise have negative implications for the intelligibility of a model and for one's ability to find recourse for subjects. We then introduce selective ensembles to mitigate such inconsistencies by applying hypothesis testing to the predictions of a set of models trained using randomly selected starting conditions; importantly, selective ensembles can abstain in cases where a consistent outcome cannot be achieved up to a specified confidence level. We prove that prediction disagreement between selective ensembles is bounded, and empirically demonstrate that selective ensembles achieve consistent predictions and feature attributions while maintaining low abstention rates. On several benchmark datasets, selective ensembles reach zero inconsistently predicted points, with abstention rates as low as 1.5%.
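The abstract describes the mechanism only at a high level, so the following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes each model in the set casts one vote for a class label, and that the hypothesis test is a two-sided exact binomial test comparing the counts of the two most-voted classes against a fair coin. The names `selective_ensemble_predict`, `binom_two_sided_pvalue`, `ABSTAIN`, and the default `alpha` are all hypothetical.

```python
from collections import Counter
from math import comb

ABSTAIN = None  # hypothetical sentinel meaning "no consistent prediction"


def binom_two_sided_pvalue(k, n):
    """Two-sided exact binomial p-value for k successes in n trials, null p = 0.5."""
    k = max(k, n - k)  # the null distribution is symmetric, so fold to the upper tail
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2.0 * upper_tail)


def selective_ensemble_predict(votes, alpha=0.05):
    """Return the majority label if it wins significantly, otherwise abstain.

    votes: list of class labels, one per model in the ensemble (assumed input).
    alpha: significance level for the test (assumed parameter).
    """
    top_two = Counter(votes).most_common(2)
    top_label, n_top = top_two[0]
    n_runner_up = top_two[1][1] if len(top_two) > 1 else 0
    p_value = binom_two_sided_pvalue(n_top, n_top + n_runner_up)
    return top_label if p_value <= alpha else ABSTAIN


if __name__ == "__main__":
    print(selective_ensemble_predict([1] * 9 + [0] * 1))  # clear majority
    print(selective_ensemble_predict([1] * 6 + [0] * 4))  # close vote
```

Under these assumptions, a 9-to-1 vote among ten models returns the majority label, while a 6-to-4 vote abstains, because the margin is not significant at `alpha = 0.05`. Lowering `alpha` trades a higher abstention rate for stronger agreement between independently trained ensembles.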




