Overinterpretation reveals image classification model pathologies

03/19/2020
by Brandon Carter, et al.

Image classifiers are typically scored on their test set accuracy, but high accuracy can mask a subtle type of model failure. We find that high-scoring convolutional neural networks (CNNs) exhibit troubling pathologies that allow them to display high accuracy even in the absence of semantically salient features. When a model provides a high-confidence decision without salient supporting input features, we say that the classifier has overinterpreted its input, finding too much class evidence in patterns that appear nonsensical to humans. Here, we demonstrate that state-of-the-art neural networks for CIFAR-10 and ImageNet suffer from overinterpretation, and find that CIFAR-10-trained models make confident predictions even when 95% of an input image has been masked and humans are unable to discern salient features in the remaining pixel subset. Although these patterns portend potential model fragility in real-world deployment, they are in fact valid statistical patterns of the image classification benchmark that alone suffice to attain high test accuracy. We find that ensembling strategies can help mitigate model overinterpretation, and that classifiers which rely on more semantically meaningful features can improve accuracy over both the test set and out-of-distribution images from a different source than the training data.
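
To make the masking and ensembling ideas concrete, here is a minimal sketch in plain Python. It is not the authors' method: the abstract does not say how the remaining pixel subset is chosen, so the random 5% subset, the zero fill value, the toy linear classifiers, and all function names (`mask_pixels`, `linear_classifier`, `ensemble_predict`) are assumptions made for illustration only. The sketch shows how one might measure a model's confidence on a heavily masked image and compare it with the averaged prediction of a small ensemble.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mask_pixels(image, keep_indices, fill=0.0):
    """Copy a flat image, setting every pixel outside keep_indices to fill."""
    keep = set(keep_indices)
    return [p if i in keep else fill for i, p in enumerate(image)]


def linear_classifier(weights):
    """Toy stand-in for a CNN: one weight row per class, softmax over dot products."""
    def predict(image):
        logits = [sum(w * p for w, p in zip(row, image)) for row in weights]
        return softmax(logits)
    return predict


def ensemble_predict(models, image):
    """Average the class probabilities of several models (simple ensembling)."""
    probs = [model(image) for model in models]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


rng = random.Random(0)
n_pixels, n_classes = 400, 3
image = [rng.random() for _ in range(n_pixels)]

# Assumption: keep a random 5% of pixels, i.e. mask 95% of the image.
keep = rng.sample(range(n_pixels), k=n_pixels // 20)
masked = mask_pixels(image, keep)

# Five randomly initialised toy models (hypothetical, untrained).
models = [
    linear_classifier(
        [[rng.gauss(0, 1) for _ in range(n_pixels)] for _ in range(n_classes)]
    )
    for _ in range(5)
]

single_probs = models[0](masked)
ensemble_probs = ensemble_predict(models, masked)
print(f"single-model confidence on masked image: {max(single_probs):.3f}")
print(f"ensemble confidence on masked image:     {max(ensemble_probs):.3f}")
```

With a real trained classifier in place of the toy models, a high top-class probability on `masked` would be the kind of behaviour the abstract calls overinterpretation. Averaging probabilities is only one possible ensembling strategy, chosen here because it is the simplest to state.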


