Identification of Systematic Errors of Image Classifiers on Rare Subgroups

03/09/2023
by   Jan-Hendrik Metzen, et al.

Despite excellent average-case performance of many image classifiers, their performance can substantially deteriorate on semantically coherent subgroups of the data that were under-represented in the training data. These systematic errors can impact both fairness for demographic minority groups and robustness and safety under domain shift. A major challenge is to identify such subgroups with subpar performance when the subgroups are not annotated and their occurrence is very rare. We leverage recent advances in text-to-image models and search in the space of textual descriptions of subgroups ("prompts") for subgroups where the target model has low performance on the prompt-conditioned synthesized data. To tackle the exponentially growing number of subgroups, we employ combinatorial testing. We denote this procedure as PromptAttack, as it can be interpreted as an adversarial attack in a prompt space. We study subgroup coverage and identifiability with PromptAttack in a controlled setting and find that it identifies systematic errors with high accuracy. Thereupon, we apply PromptAttack to ImageNet classifiers and identify novel systematic errors on rare subgroups.
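
The abstract only sketches the procedure, so the following minimal Python sketch illustrates the core loop such a search could take: describe each candidate subgroup by a combination of attribute values, render it into a prompt, synthesize images with a text-to-image model, and rank subgroups by the target classifier's accuracy, while a pairwise sampler keeps the number of tested combinations far below the full cross-product. The attribute grid, prompt template, and the helpers synthesize_images and accuracy_on are placeholders and assumptions, not the paper's implementation, and the sampler is a cheap stand-in for a proper covering-array construction.

```python
import itertools
import random

# Hypothetical attribute grid and prompt template describing subgroups; the
# actual attributes, templates, and models used in the paper are not given in
# the abstract, so everything below is illustrative.
ATTRIBUTES = {
    "object":  ["fire truck", "ambulance", "school bus"],
    "color":   ["red", "white", "yellow", "black"],
    "context": ["on a snowy road", "in a desert", "at night", "in heavy rain"],
}
PROMPT_TEMPLATE = "a photo of a {color} {object} {context}"


def synthesize_images(prompt, n=16):
    """Placeholder for a text-to-image model that returns n synthetic images
    conditioned on the prompt."""
    raise NotImplementedError


def accuracy_on(images, expected_label):
    """Placeholder: run the target classifier on the synthesized images and
    return its accuracy with respect to the expected label."""
    raise NotImplementedError


def pairwise_candidates(attributes, max_tries=10_000, seed=0):
    """Simple stand-in for combinatorial (pairwise) testing: sample attribute
    combinations until every pair of attribute values co-occurs at least once,
    instead of enumerating the exponentially large full cross-product."""
    rng = random.Random(seed)
    keys = sorted(attributes)
    uncovered = {
        ((k1, v1), (k2, v2))
        for k1, k2 in itertools.combinations(keys, 2)
        for v1 in attributes[k1]
        for v2 in attributes[k2]
    }
    candidates = []
    for _ in range(max_tries):
        if not uncovered:
            break
        combo = {k: rng.choice(vals) for k, vals in attributes.items()}
        pairs = {
            ((k1, combo[k1]), (k2, combo[k2]))
            for k1, k2 in itertools.combinations(keys, 2)
        }
        if pairs & uncovered:  # keep only combinations that cover a new pair
            uncovered -= pairs
            candidates.append(combo)
    return candidates


def prompt_attack(attributes, top_k=5):
    """Score every candidate subgroup prompt with the target classifier and
    return the subgroups with the lowest accuracy (candidate systematic errors)."""
    scored = []
    for combo in pairwise_candidates(attributes):
        prompt = PROMPT_TEMPLATE.format(**combo)
        images = synthesize_images(prompt)
        scored.append((accuracy_on(images, expected_label=combo["object"]), prompt))
    return sorted(scored)[:top_k]  # lowest-accuracy subgroups first
```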


Related research:

- The Spotlight: A General Method for Discovering Systematic Errors in Deep Learning Models (07/01/2021)
- Detection and Mitigation of Rare Subclasses in Neural Network Classifiers (11/28/2019)
- Adversarial Removal of Demographic Attributes from Text Data (08/20/2018)
- Automatic Fairness Testing of Neural Classifiers through Adversarial Sampling (07/17/2021)
- Distilling Adversarial Prompts from Safety Benchmarks: Report for the Adversarial Nibbler Challenge (09/20/2023)
- DANCin SEQ2SEQ: Fooling Text Classifiers with Adversarial Text Example Generation (12/14/2017)
- A systematic framework for natural perturbations from videos (06/05/2019)
