Harnessing Adversarial Distances to Discover High-Confidence Errors

06/29/2020
by Walter Bennette, et al.

Given a deep neural network image classification model that we treat as a black box, and an unlabeled evaluation dataset, we develop an efficient strategy for evaluating the classifier. Randomly sampling and labeling instances from an unlabeled evaluation dataset allows traditional performance measures like accuracy, precision, and recall to be estimated. However, random sampling may miss rare errors for which the model is highly confident in its prediction, but wrong. These high-confidence errors can represent costly mistakes, and therefore should be explicitly searched for. Past work has developed search techniques to find classification errors above a specified confidence threshold, but it ignores the fact that some errors should be expected at any confidence level below 100%. In this work, we investigate the problem of finding errors at rates greater than expected given model confidence. Additionally, we propose a novel, query-efficient search technique that is guided by adversarial perturbations to find these mistakes in black box models. Through rigorous empirical experimentation, we demonstrate that our Adversarial Distance search discovers high-confidence errors at a rate greater than expected given model confidence.
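To make "errors at rates greater than expected given model confidence" concrete, here is a minimal sketch. It assumes each confidence score is read as a calibrated probability of being correct, so a prediction with confidence c contributes 1 - c expected errors. The function names `expected_errors` and `excess_errors` are hypothetical and chosen for illustration; this is not the paper's evaluation code or its search procedure.

```python
from typing import Sequence


def expected_errors(confidences: Sequence[float]) -> float:
    """Expected error count if each confidence were a calibrated
    probability of being correct (an assumption of this sketch)."""
    return sum(1.0 - c for c in confidences)


def excess_errors(confidences: Sequence[float],
                  is_correct: Sequence[bool]) -> float:
    """Observed errors minus the errors expected from confidence alone.

    A positive value means the labeled instances contain more errors
    than the model's own confidence scores would predict.
    """
    if len(confidences) != len(is_correct):
        raise ValueError("confidences and is_correct must have equal length")
    observed = sum(1 for ok in is_correct if not ok)
    return observed - expected_errors(confidences)


# Hypothetical example: four labeled instances found by some search strategy.
confidences = [0.99, 0.95, 0.90, 0.98]
is_correct = [True, False, True, False]

print(f"expected errors: {expected_errors(confidences):.2f}")
print(f"excess errors:   {excess_errors(confidences, is_correct):.2f}")
```

Under this reading, a search strategy is interesting when the instances it selects for labeling show a positive excess, rather than merely containing some errors above a fixed confidence threshold.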

research
02/25/2021

Generalized Adversarial Distances to Efficiently Discover Classifier Errors

Given a black-box classification model and an unlabeled evaluation datas...
research
05/11/2020

Spanning Attack: Reinforce Black-box Attacks with Unlabeled Data

Adversarial black-box attacks aim to craft adversarial perturbations by ...
research
10/12/2018

Facility Locations Utility for Uncovering Classifier Overconfidence

Assessing the predictive accuracy of black box classifiers is challengin...
research
05/20/2018

Targeted Adversarial Examples for Black Box Audio Systems

The application of deep recurrent networks to audio transcription has le...
research
02/27/2023

Online Black-Box Confidence Estimation of Deep Neural Networks

Autonomous driving (AD) and advanced driver assistance systems (ADAS) in...
research
11/09/2018

Universal Decision-Based Black-Box Perturbations: Breaking Security-Through-Obscurity Defenses

We study the problem of finding a universal (image-agnostic) perturbatio...
research
05/28/2018

Confidence Prediction for Lexicon-Free OCR

Having a reliable accuracy score is crucial for real world applications ...
