Perceptual Score: What Data Modalities Does Your Model Perceive?

10/27/2021
by Itai Gat, et al.

Machine learning advances in the last decade have relied heavily on large-scale datasets that continue to grow in size. Increasingly, those datasets also contain different data modalities. However, large multi-modal datasets are hard to annotate, and annotations may contain biases that we are often unaware of. Deep-net-based classifiers, in turn, are prone to exploit those biases and to find shortcuts. To study and quantify this concern, we introduce the perceptual score, a metric that assesses the degree to which a model relies on different subsets of the input features, i.e., modalities. Using the perceptual score, we find a surprisingly consistent trend across four popular datasets: recent, more accurate state-of-the-art multi-modal models for visual question-answering or visual dialog tend to perceive the visual data less than their predecessors. This trend is concerning because answers are thus increasingly inferred from textual cues alone. The perceptual score also helps analyze model biases, since it can be decomposed into contributions from data subsets. We hope to spur a discussion on the perceptiveness of multi-modal models and to encourage the community working on multi-modal classifiers to start quantifying perceptiveness via the proposed perceptual score.
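
The abstract does not spell out how the score is computed, so the following is only a rough sketch of one way to probe how much a model relies on a modality. It assumes a permutation test: shuffle one modality across examples, so that it no longer matches its label, and measure the resulting drop in accuracy. The names `permutation_reliance` and `text_only_model` and the toy data are hypothetical, and this is not the authors' definition; the full paper gives the actual perceptual score, including any normalization.

```python
import numpy as np


def accuracy(predict, visual, text, labels):
    """Fraction of examples where the model's prediction equals the label."""
    return float(np.mean(predict(visual, text) == labels))


def permutation_reliance(predict, visual, text, labels,
                         modality="visual", n_permutations=20, seed=0):
    """Accuracy drop when one modality is shuffled across examples.

    Assumption: a permutation test is used as a stand-in for the paper's
    metric. A drop near zero suggests the model barely uses that modality.
    """
    rng = np.random.default_rng(seed)
    base = accuracy(predict, visual, text, labels)
    permuted = []
    for _ in range(n_permutations):
        idx = rng.permutation(len(labels))
        if modality == "visual":
            permuted.append(accuracy(predict, visual[idx], text, labels))
        else:
            permuted.append(accuracy(predict, visual, text[idx], labels))
    return base - float(np.mean(permuted))


# Toy data (hypothetical): labels depend on the text features only.
data_rng = np.random.default_rng(0)
visual = data_rng.normal(size=(200, 4))
text = data_rng.normal(size=(200, 4))
labels = (text[:, 0] > 0).astype(int)


def text_only_model(visual, text):
    """A shortcut model that ignores the visual input entirely."""
    return (text[:, 0] > 0).astype(int)


visual_drop = permutation_reliance(text_only_model, visual, text, labels, "visual")
text_drop = permutation_reliance(text_only_model, visual, text, labels, "text")
print(f"visual drop: {visual_drop:.3f}, text drop: {text_drop:.3f}")
```

Because `text_only_model` never reads its visual input, shuffling the visual features leaves its accuracy unchanged and the visual drop is exactly zero, while shuffling the text features lowers accuracy substantially. A model that is accurate yet shows a near-zero visual drop is the kind of case the abstract flags as concerning.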
