ReaSCAN: Compositional Reasoning in Language Grounding

09/18/2021
by Zhengxuan Wu, et al.

The ability to compositionally map language to referents, relations, and actions is an essential component of language understanding. The recent gSCAN dataset (Ruis et al. 2020, NeurIPS) is an inspiring attempt to assess the capacity of models to learn this kind of grounding in scenarios involving navigational instructions. However, we show that gSCAN's highly constrained design means that it does not require compositional interpretation and that many details of its instructions and scenarios are not required for task success. To address these limitations, we propose ReaSCAN, a benchmark dataset that builds off gSCAN but requires compositional language interpretation and reasoning about entities and relations. We assess two models on ReaSCAN: a multi-modal baseline and a state-of-the-art graph convolutional neural model. These experiments show that ReaSCAN is substantially harder than gSCAN for both neural architectures. This suggests that ReaSCAN can serve as a valuable benchmark for advancing our understanding of models' compositional generalization and reasoning capabilities.

research
09/29/2020

Think before you act: A simple baseline for compositional generalization

Contrarily to humans who have the ability to recombine familiar expressi...
research
08/18/2023

Towards Grounded Visual Spatial Reasoning in Multi-Modal Vision Language Models

With the advances in large scale vision-and-language models (VLMs) it is...
research
10/01/2022

Differentiable Parsing and Visual Grounding of Verbal Instructions for Object Placement

Grounding spatial relations in natural language for object placing could...
research
08/10/2022

CLEVR-Math: A Dataset for Compositional Language, Visual and Mathematical Reasoning

We introduce CLEVR-Math, a multi-modal math word problems dataset consis...
research
01/22/2023

Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding

Temporal grounding is the task of locating a specific segment from an un...
research
05/02/2020

Robust and Interpretable Grounding of Spatial References with Relation Networks

Handling spatial references in natural language is a key challenge in ta...
research
10/18/2022

Systematicity in GPT-3's Interpretation of Novel English Noun Compounds

Levin et al. (2019) show experimentally that the interpretations of nove...
