CompGuessWhat?!: A Multi-task Evaluation Framework for Grounded Language Learning

06/03/2020
by   Alessandro Suglia, et al.

Approaches to Grounded Language Learning typically focus on a single task-based final performance measure that may not depend on desirable properties of the learned hidden representations, such as their ability to predict salient attributes or to generalise to unseen situations. To remedy this, we present GROLLA, an evaluation framework for Grounded Language Learning with Attributes with three sub-tasks: 1) goal-oriented evaluation; 2) object attribute prediction evaluation; and 3) zero-shot evaluation. We also propose a new dataset, CompGuessWhat?!, as an instance of this framework for evaluating the quality of learned neural representations, in particular concerning attribute grounding. To this end, we extend the original GuessWhat?! dataset by including a semantic layer on top of the perceptual one. Specifically, we enrich the VisualGenome scene graphs associated with the GuessWhat?! images with abstract and situated attributes. By using diagnostic classifiers, we show that current models learn representations that are not expressive enough to encode object attributes (average F1 of 44.27). In addition, they learn neither strategies nor representations that are robust enough to perform well when novel scenes or objects are involved in gameplay (zero-shot best accuracy 50.06%).
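To make the attribute prediction sub-task concrete, the following is a minimal sketch of a diagnostic classifier: a simple probe trained on frozen representations to predict object attributes, then scored with F1. It is an illustration under stated assumptions, not the authors' implementation. The function names (`train_probes`, `predict`, `average_f1`), the toy two-dimensional features, the choice of one logistic-regression probe per attribute, and the macro-averaging of F1 over attributes are all hypothetical; the abstract does not specify them.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_probes(features, labels, epochs=200, lr=0.5):
    """Fit one logistic-regression probe per attribute with plain SGD.

    `features` stands in for frozen model representations (assumption);
    `labels` holds one 0/1 entry per attribute for each example.
    """
    n_feats, n_attrs = len(features[0]), len(labels[0])
    # Each probe stores n_feats weights followed by a bias term.
    probes = [[0.0] * (n_feats + 1) for _ in range(n_attrs)]
    for _ in range(epochs):
        for x, y in zip(features, labels):
            for a, w in enumerate(probes):
                z = w[-1] + sum(wi * xi for wi, xi in zip(w, x))
                grad = sigmoid(z) - y[a]  # gradient of the log loss w.r.t. z
                for i in range(n_feats):
                    w[i] -= lr * grad * x[i]
                w[-1] -= lr * grad
    return probes


def predict(probes, x):
    # An attribute is predicted present when its probe's logit is positive.
    return [int(w[-1] + sum(wi * xi for wi, xi in zip(w, x)) > 0) for w in probes]


def f1_score(gold, pred):
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f1(gold_rows, pred_rows):
    # Macro-average: F1 per attribute, then the mean (an assumed convention).
    n_attrs = len(gold_rows[0])
    per_attr = [
        f1_score([r[a] for r in gold_rows], [r[a] for r in pred_rows])
        for a in range(n_attrs)
    ]
    return sum(per_attr) / n_attrs


# Toy stand-in for frozen representations: attribute k is "on" when x[k] > 0.
values = [-1.0, -0.5, 0.5, 1.0]
features = [[x0, x1] for x0 in values for x1 in values]
labels = [[int(x0 > 0), int(x1 > 0)] for x0, x1 in features]

probes = train_probes(features, labels)
predictions = [predict(probes, x) for x in features]
score = average_f1(labels, predictions)
print(f"average F1 over attributes: {score:.2f}")
```

The probe is kept deliberately simple: if a linear classifier cannot recover an attribute from the frozen representation, that suggests the attribute is not readily encoded there, which is the kind of evidence a diagnostic classifier is meant to provide.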


