COVR: A test-bed for Visually Grounded Compositional Generalization with real images

09/22/2021
by Ben Bogin, et al.

While interest in models that generalize at test time to new compositions has risen in recent years, benchmarks in the visually-grounded domain have thus far been restricted to synthetic images. In this work, we propose COVR, a new test-bed for visually-grounded compositional generalization with real images. To create COVR, we use real images annotated with scene graphs, and propose an almost fully automatic procedure for generating question-answer pairs along with a set of context images. COVR focuses on questions that require complex reasoning, including higher-order operations such as quantification and aggregation. Due to the automatic generation process, COVR facilitates the creation of compositional splits, where models at test time need to generalize to new concepts and compositions in a zero- or few-shot setting. We construct compositional splits using COVR and demonstrate a myriad of cases where state-of-the-art pre-trained language-and-vision models struggle to compositionally generalize.
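To make the notion of a compositional split concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the COVR generation or splitting procedure: the `concepts` tags, the example questions, and the `compositional_split` helper are all hypothetical. The idea shown is that every example combining a chosen set of concepts is held out for testing, while each concept can still appear on its own during training.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical example format: a question plus the reasoning concepts it uses.
Example = Dict[str, object]


def compositional_split(
    examples: List[Example], held_out: FrozenSet[str]
) -> Tuple[List[Example], List[Example]]:
    """Send every example that uses ALL held-out concepts to the test set.

    Examples that use only some (or none) of the held-out concepts stay in
    the training set, so each concept is seen in training, but never the
    held-out combination.
    """
    train: List[Example] = []
    test: List[Example] = []
    for ex in examples:
        concepts = set(ex["concepts"])
        if held_out <= concepts:  # the full held-out combination is present
            test.append(ex)
        else:
            train.append(ex)
    return train, test


# Toy data, invented for illustration only.
examples: List[Example] = [
    {"question": "How many images contain a brown dog?",
     "concepts": ["count", "has_attribute"]},
    {"question": "Do all dogs hold a ball?",
     "concepts": ["quantifier_all", "relation"]},
    {"question": "Are all dogs brown?",
     "concepts": ["quantifier_all", "has_attribute"]},
    {"question": "How many images contain a dog holding a ball?",
     "concepts": ["count", "relation"]},
]

train, test = compositional_split(
    examples, frozenset({"quantifier_all", "has_attribute"})
)
```

In this toy setup, a model sees `quantifier_all` and `has_attribute` separately during training and is evaluated zero-shot on their combination. A few-shot variant could move a small number of held-out examples back into the training set.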


