Masking Strategies for Background Bias Removal in Computer Vision Models

08/23/2023
by Ananthu Aniraj, et al.

Models for fine-grained image classification, where differences between classes can be extremely subtle and the number of samples per class tends to be low, are particularly prone to picking up background-related biases and require robust methods for handling examples with out-of-distribution (OOD) backgrounds. To better understand this problem, we investigate the impact of background-induced bias on fine-grained image classification, evaluating standard backbone models such as Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). We explore two masking strategies to mitigate background-induced bias: early masking, which removes background information at the input image level, and late masking, which selectively masks high-level spatial features corresponding to the background. Extensive experiments assess the behavior of CNN and ViT models under both masking strategies, with a focus on their generalization to OOD backgrounds. The results show that both strategies improve OOD performance over the baseline models, with early masking consistently performing best. Notably, a ViT variant that classifies from global-average-pooled (GAP) patch tokens, combined with early masking, achieves the highest OOD robustness.
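
The difference between the two masking strategies can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names, the `toy_backbone` stand-in (plain patch averaging instead of a CNN or ViT), the max-pool downsampling of the mask, and the masked global average pooling used for late masking are all assumptions made for illustration. A binary foreground mask (1 = object, 0 = background) is assumed to be available.

```python
import numpy as np


def early_mask(image, mask):
    """Early masking: zero out background pixels before the backbone.

    image: (H, W, C) float array; mask: (H, W) array, 1 = foreground.
    """
    return image * mask[..., None]


def toy_backbone(image, patch):
    """Hypothetical stand-in for a CNN/ViT: average each patch x patch block.

    Returns a spatial feature map of shape (H // patch, W // patch, C).
    """
    H, W, C = image.shape
    h, w = H // patch, W // patch
    return image.reshape(h, patch, w, patch, C).mean(axis=(1, 3))


def downsample_mask(mask, out_h, out_w):
    """Resize the pixel mask to the feature grid by block-wise max pooling.

    A feature cell counts as foreground if any pixel in it is foreground
    (an assumed convention; other resizing rules are possible).
    """
    H, W = mask.shape
    blocks = mask.reshape(out_h, H // out_h, out_w, W // out_w)
    return blocks.max(axis=(1, 3))


def late_mask_gap(features, feat_mask):
    """Late masking: average only the foreground cells of the feature map.

    features: (h, w, D); feat_mask: (h, w). Returns a (D,) vector.
    """
    m = feat_mask[..., None].astype(features.dtype)
    count = max(float(m.sum()), 1.0)  # avoid division by zero on empty masks
    return (features * m).sum(axis=(0, 1)) / count


if __name__ == "__main__":
    # 4x4 single-channel image: object (value 1) top-left, background (value 9).
    image = np.full((4, 4, 1), 9.0)
    image[:2, :2, 0] = 1.0
    mask = np.zeros((4, 4))
    mask[:2, :2] = 1.0

    # Early masking: the backbone never sees background pixels.
    early_feats = toy_backbone(early_mask(image, mask), patch=2)
    print("early:", early_feats.mean(axis=(0, 1)))

    # Late masking: the backbone sees the full image; background
    # feature cells are dropped just before pooling.
    feats = toy_backbone(image, patch=2)
    feat_mask = downsample_mask(mask, *feats.shape[:2])
    print("late:", late_mask_gap(feats, feat_mask))
```

In a real network the two strategies differ in a way this toy backbone cannot show: with late masking, foreground features may already have absorbed background information through large receptive fields or self-attention, whereas early masking removes that information before any feature is computed.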


