Transformer Module Networks for Systematic Generalization in Visual Question Answering

01/27/2022
by   Moyuru Yamada, et al.

Transformer-based models achieve strong performance on Visual Question Answering (VQA). However, their performance degrades when they are evaluated on systematic generalization, i.e., handling novel combinations of known concepts. Neural Module Networks (NMNs) are a promising approach to systematic generalization that consists of composing modules, i.e., neural networks that each tackle a sub-task. Inspired by Transformers and NMNs, we propose the Transformer Module Network (TMN), a novel Transformer-based model for VQA that dynamically composes modules into a question-specific Transformer network. TMNs achieve state-of-the-art systematic generalization performance on three VQA datasets, namely CLEVR-CoGenT, CLOSURE and GQA-SGL, in some cases improving more than 30
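
The idea of composing modules into a question-specific network can be illustrated with a toy sketch. This is not the TMN architecture itself: the module names (`filter_color`, `relate`, `count`), the `compose` helper, and the scaling functions standing in for learned Transformer blocks are all hypothetical, chosen only to show how one shared module library can be chained differently for each question.

```python
from typing import Callable, Dict, List

# A "module" here is any function mapping a list of feature vectors to a list
# of feature vectors (a hypothetical stand-in for a small Transformer block
# specialized for one sub-task).
Features = List[List[float]]
Module = Callable[[Features], Features]


def make_scaling_module(scale: float) -> Module:
    """Build a toy module that rescales every feature (placeholder for learned weights)."""
    def module(features: Features) -> Features:
        return [[scale * x for x in row] for row in features]
    return module


# Hypothetical module library: one module per sub-task name.
MODULE_LIBRARY: Dict[str, Module] = {
    "filter_color": make_scaling_module(0.5),
    "relate": make_scaling_module(2.0),
    "count": make_scaling_module(3.0),
}


def compose(program: List[str], library: Dict[str, Module]) -> Module:
    """Chain the modules named in `program` into one question-specific network."""
    stages = [library[name] for name in program]  # raises KeyError for an unknown sub-task

    def network(features: Features) -> Features:
        for stage in stages:
            features = stage(features)
        return features

    return network


# Two questions reuse the same modules in different combinations.
net_a = compose(["filter_color", "count"], MODULE_LIBRARY)
net_b = compose(["relate", "filter_color", "count"], MODULE_LIBRARY)
```

Because both networks draw on the same library, a combination of sub-tasks never seen together during training can still be assembled from modules that were each trained elsewhere, which is the intuition behind using modularity for systematic generalization.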

research
06/15/2021

How Modular Should Neural Module Networks Be for Systematic Generalization?

Neural Module Networks (NMNs) aim at Visual Question Answering (VQA) via...
research
09/06/2021

Improved RAMEN: Towards Domain Generalization for Visual Question Answering

Currently nearing human-level performance, Visual Question Answering (VQ...
research
03/24/2022

Towards Efficient and Elastic Visual Question Answering with Doubly Slimmable Transformer

Transformer-based approaches have shown great success in visual question...
research
05/03/2021

Iterated learning for emergent systematicity in VQA

Although neural module networks have an architectural bias towards compo...
research
05/27/2019

Structure Learning for Neural Module Networks

Neural Module Networks, originally proposed for the task of visual quest...
research
09/15/2023

D3: Data Diversity Design for Systematic Generalization in Visual Question Answering

Systematic generalization is a crucial aspect of intelligence, which ref...
research
12/12/2019

CLOSURE: Assessing Systematic Generalization of CLEVR Models

The CLEVR dataset of natural-looking questions about 3D-rendered scenes ...