Iterated learning for emergent systematicity in VQA

by Ankit Vani, et al.

Although neural module networks have an architectural bias towards compositionality, they require gold-standard layouts to generalize systematically in practice. When layouts and modules are instead learned jointly, compositionality does not arise automatically, and an explicit pressure is necessary for the emergence of layouts exhibiting the right structure. We propose to address this problem using iterated learning, a cognitive science theory of the emergence of compositional languages in nature that, in machine learning, has primarily been applied to simple referential games. Considering the layouts of module networks as samples from an emergent language, we use iterated learning to encourage the development of structure within this language. We show that the resulting layouts support systematic generalization in neural agents solving the more complex task of visual question answering. Our regularized iterated learning method can outperform baselines without iterated learning on SHAPES-SyGeT (SHAPES Systematic Generalization Test), a new split of the SHAPES dataset we introduce to evaluate systematic generalization, and on CLOSURE, an extension of CLEVR also designed to test systematic generalization. We demonstrate superior performance in recovering ground-truth compositional program structure with limited supervision on both SHAPES-SyGeT and CLEVR.
