Regularizing Attention Networks for Anomaly Detection in Visual Question Answering

09/21/2020
by Doyup Lee, et al.

For stability and reliability in real-world applications, the robustness of DNNs has been evaluated in unimodal tasks. However, few studies consider the abnormal situations that a visual question answering (VQA) model might encounter at test time after deployment in the real world. In this study, we evaluate the robustness of state-of-the-art VQA models to five different anomalies, including worst-case scenarios, the most frequent scenarios, and the current limitation of VQA models. Unlike the results in unimodal tasks, the maximum confidence of answers in VQA models cannot detect anomalous inputs, and post-training of the outputs, such as outlier exposure, is ineffective for VQA models. We therefore propose an attention-based method, which uses the confidence of reasoning between input images and questions and shows much more promising results than previous methods from unimodal tasks. In addition, we show that maximum entropy regularization of attention networks can significantly improve the attention-based anomaly detection of VQA models. Thanks to their simplicity, attention-based anomaly detection and the regularization are model-agnostic and can be used with various cross-modal attention mechanisms in state-of-the-art VQA models. The results imply that cross-modal attention in VQA is important for improving not only VQA accuracy but also robustness to various anomalies.
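The abstract does not give the exact scoring function or regularizer, so the following is only a minimal sketch of the three ideas it names, under stated assumptions. It assumes the "maximum confidence of answers" is the maximum softmax probability over answer logits, that an attention-based score can be taken as the largest attention weight over image regions, and that maximum entropy regularization subtracts a weighted entropy term from the task loss. All function names and the `weight` parameter are hypothetical and do not come from the paper.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def max_answer_confidence(answer_logits):
    """Baseline score: maximum softmax probability over candidate answers."""
    return max(softmax(answer_logits))

def attention_confidence(attention_logits):
    """Assumed attention-based score: largest attention weight over regions."""
    return max(softmax(attention_logits))

def attention_entropy(attention_logits):
    """Shannon entropy (in nats) of the attention distribution."""
    probs = softmax(attention_logits)
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def regularized_loss(task_loss, attention_logits, weight=0.1):
    """Assumed regularizer: subtract weighted entropy so that minimizing
    the loss rewards higher-entropy (less peaked) attention."""
    return task_loss - weight * attention_entropy(attention_logits)

if __name__ == "__main__":
    peaked = [5.0, 0.0, 0.0, 0.0]   # attention focused on one region
    flat = [0.0, 0.0, 0.0, 0.0]     # attention spread uniformly
    print(attention_confidence(peaked), attention_confidence(flat))
    print(regularized_loss(1.0, peaked), regularized_loss(1.0, flat))
```

Under these assumptions, an input would be flagged as anomalous when its attention score falls below a threshold chosen on held-out data; how the paper actually thresholds or aggregates attention across layers is not stated in the abstract.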

research
10/09/2018

Knowing Where to Look? Analysis on Attention of Visual Question Answering System

Attention mechanisms have been widely used in Visual Question Answering ...
research
10/17/2020

Answer-checking in Context: A Multi-modal Fully Attention Network for Visual Question Answering

Visual Question Answering (VQA) is challenging due to the complex cross-...
research
09/19/2017

Exploring Human-like Attention Supervision in Visual Question Answering

Attention mechanisms have been widely applied in the Visual Question Ans...
research
02/03/2021

Answer Questions with Right Image Regions: A Visual Attention Regularization Approach

Visual attention in Visual Question Answering (VQA) targets at locating ...
research
11/16/2017

A Novel Framework for Robustness Analysis of Visual QA Models

Deep neural networks have been playing an essential role in many compute...
research
07/18/2023

Towards a performance analysis on pre-trained Visual Question Answering models for autonomous driving

This short paper presents a preliminary analysis of three popular Visual...
research
10/12/2019

Neural Memory Plasticity for Anomaly Detection

In the domain of machine learning, Neural Memory Networks (NMNs) have re...
