BinaryVQA: A Versatile Test Set to Evaluate the Out-of-Distribution Generalization of VQA Models

01/28/2023
by Ali Borji, et al.

We introduce a new test set for visual question answering (VQA) called BinaryVQA to push the limits of VQA models. Our dataset includes 7,800 questions across 1,024 images and covers a wide variety of objects, topics, and concepts. For easy model evaluation, we only consider binary questions. Questions and answers are formulated and verified carefully and manually. Around 63% of the questions have positive answers. The median number of questions per image and the median question length are 7 and 5, respectively. The state-of-the-art OFA model achieves 75% accuracy on the BinaryVQA dataset, which is significantly lower than its performance on the VQA v2 test-dev dataset (94.7%). We also analyze model behavior along several dimensions, including: a) performance over different categories such as text, counting, and gaze direction, b) model interpretability, c) the effect of question length on accuracy, d) bias of models towards positive answers and the introduction of a new score called ShuffleAcc, and e) sensitivity to spelling and grammar errors. Our investigation demonstrates the difficulty of our dataset and shows that it can challenge VQA models for the next few years. Data and code are publicly available at: DATA and CODE.
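To make the point about bias towards positive answers concrete, the following is a minimal sketch of how accuracy on binary questions could be compared against an always-"yes" baseline. The function names and the toy data are hypothetical and invented for illustration; they are not taken from the BinaryVQA data or code, and this sketch does not implement ShuffleAcc, whose definition is not given here.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Fraction of questions whose predicted answer matches the ground truth."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("need at least one question")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


def positive_rate(labels: Sequence[str]) -> float:
    """Fraction of labels equal to 'yes'."""
    if not labels:
        raise ValueError("need at least one label")
    return sum(label == "yes" for label in labels) / len(labels)


# Toy data, invented for illustration only (not BinaryVQA annotations).
answers = ["yes", "yes", "no", "yes", "no", "yes", "no", "yes"]
predictions = ["yes", "yes", "yes", "yes", "no", "yes", "yes", "no"]

model_acc = accuracy(predictions, answers)
# Baseline that ignores the image and question and always answers "yes".
always_yes_acc = accuracy(["yes"] * len(answers), answers)

print(f"model accuracy:        {model_acc:.3f}")
print(f"always-yes accuracy:   {always_yes_acc:.3f}")
print(f"ground-truth yes rate: {positive_rate(answers):.3f}")
print(f"predicted yes rate:    {positive_rate(predictions):.3f}")
```

In this toy case the model answers "yes" more often than the ground truth does, and its accuracy is no better than the always-"yes" baseline. On a test set whose answers are skewed towards "yes", raw accuracy alone can hide this kind of bias, so it helps to report the baseline alongside it.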

