Selective Ensembles for Consistent Predictions

11/16/2021
by Emily Black, et al.

Recent work has shown that models trained to the same objective, and which achieve similar measures of accuracy on consistent test data, may nonetheless behave very differently on individual predictions. This inconsistency is undesirable in high-stakes contexts, such as medical diagnosis and finance. We show that this inconsistent behavior extends beyond predictions to feature attributions, which may likewise have negative implications for the intelligibility of a model and for one's ability to find recourse for subjects. We then introduce selective ensembles to mitigate such inconsistencies by applying hypothesis testing to the predictions of a set of models trained using randomly selected starting conditions; importantly, selective ensembles can abstain in cases where a consistent outcome cannot be achieved up to a specified confidence level. We prove that prediction disagreement between selective ensembles is bounded, and empirically demonstrate that selective ensembles achieve consistent predictions and feature attributions while maintaining low abstention rates. On several benchmark datasets, selective ensembles reach zero inconsistently predicted points, with abstention rates as low as 1.5
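The abstain-or-predict idea described above can be illustrated with a minimal sketch. The abstract does not say which hypothesis test is used, so the code below assumes a two-sided exact binomial test comparing the vote counts of the two most popular classes; the names `selective_predict`, `binom_two_sided_p`, and `ABSTAIN`, and the default `alpha=0.05`, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import comb

ABSTAIN = None  # hypothetical sentinel meaning "the ensemble declines to predict"


def binom_two_sided_p(k, n):
    """Two-sided exact binomial p-value for k successes in n fair-coin trials."""
    if n == 0:
        return 1.0
    # Probability of a split at least as lopsided as (k, n - k), one tail.
    tail = sum(comb(n, i) for i in range(max(k, n - k), n + 1)) / 2 ** n
    # Double for the two-sided test; cap at 1 for near-even splits.
    return min(1.0, 2 * tail)


def selective_predict(votes, alpha=0.05):
    """Return the majority label if it is statistically clear, else ABSTAIN.

    `votes` holds one predicted label per ensemble member for a single input.
    Assumption: the top class is tested against the runner-up only.
    """
    if not votes:
        raise ValueError("votes must contain at least one prediction")
    counts = Counter(votes).most_common(2)
    top_label, n_top = counts[0]
    n_runner = counts[1][1] if len(counts) > 1 else 0
    p_value = binom_two_sided_p(n_top, n_top + n_runner)
    return top_label if p_value <= alpha else ABSTAIN


if __name__ == "__main__":
    print(selective_predict([1] * 9 + [0]))      # clear majority
    print(selective_predict([1] * 6 + [0] * 4))  # split too close to call
```

Under these assumptions, a 9-to-1 vote among ten models yields a prediction, while a 6-to-4 vote leads to abstention. Lowering `alpha` makes the ensemble stricter, which would be expected to raise the abstention rate.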

06/22/2021

Towards Consistent Predictive Confidence through Fitted Ensembles

Deep neural networks are behind many of the recent successes in machine ...
12/04/2018

LSCP: Locally Selective Combination in Parallel Outlier Ensembles

In unsupervised outlier ensembles, the absence of ground truth makes the...
02/22/2018

Diversity regularization in deep ensembles

Calibrating the confidence of supervised learning models is important fo...
11/22/2020

Towards Class-Specific Unit

Class selectivity is an attribute of a unit in deep neural networks, whi...
10/27/2020

Selective Classification Can Magnify Disparities Across Groups

Selective classification, in which models are allowed to abstain on unce...
09/18/2018

Testing Selective Influence Directly Using Trackball Movement Tasks

Systems factorial technology (SFT; Townsend & Nozawa, 1995) is regarded ...
10/19/2020

Anti-Distillation: Improving reproducibility of deep networks

Deep networks have been revolutionary in improving performance of machin...