Describe me an Aucklet: Generating Grounded Perceptual Category Descriptions

03/07/2023
by Bill Noble, et al.

Human language users can generate descriptions of perceptual concepts beyond instance-level representations and also use such descriptions to learn provisional class-level representations. However, the ability of computational models to learn and operate with class representations is under-investigated in the language-and-vision field. In this paper, we train separate neural networks to generate and interpret class-level descriptions. We then use the zero-shot classification performance of the interpretation model as a measure of communicative success and class-level conceptual grounding. We investigate the performance of prototype- and exemplar-based neural representations for grounded category description. Finally, we show that communicative success reveals performance issues in the generation model that are not captured by traditional intrinsic NLG evaluation metrics, and argue that these issues can be traced to a failure to properly ground language in vision at the class level. We observe that the interpretation model performs better with descriptions that are low in diversity at the class level, possibly indicating a strong reliance on frequently occurring features.
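
The difference between prototype- and exemplar-based class representations can be illustrated with a small sketch. This is not the paper's neural architecture, which the abstract does not specify. It is a toy example under stated assumptions: the class labels, the 2-D feature vectors, the Euclidean distance, and the exponential similarity function are all hypothetical choices made for illustration. A prototype model summarizes each class by the mean of its instances, whereas an exemplar model compares a query against every stored instance.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_prototype(query, exemplars_by_class):
    """Pick the class whose mean vector (prototype) is closest to the query."""
    prototypes = {
        label: mean_vector(vectors)
        for label, vectors in exemplars_by_class.items()
    }
    return min(prototypes, key=lambda label: euclidean(query, prototypes[label]))


def classify_exemplar(query, exemplars_by_class):
    """Pick the class with the highest mean similarity to its stored exemplars.

    Similarity is exp(-distance), an assumption made for this sketch.
    """
    def score(label):
        vectors = exemplars_by_class[label]
        return sum(math.exp(-euclidean(query, v)) for v in vectors) / len(vectors)

    return max(exemplars_by_class, key=score)


# Hypothetical toy data: class_a has two widely separated instances,
# class_b has a single instance near the midpoint between them.
toy_classes = {
    "class_a": [[0.0, 0.0], [4.0, 0.0]],
    "class_b": [[2.0, 1.5]],
}

query = [2.0, 0.6]
print(classify_prototype(query, toy_classes))
print(classify_exemplar(query, toy_classes))
```

With this toy data the two models disagree on the query: the mean of class_a falls close to the query even though no actual class_a instance does, so the prototype model prefers class_a while the exemplar model prefers class_b.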
