What is not where: the challenge of integrating spatial representations into deep learning architectures

07/21/2018
by John D. Kelleher, et al.

This paper examines to what degree current deep learning architectures for image caption generation capture spatial language. On the basis of an evaluation of examples of generated captions from the literature, we argue that these systems capture what objects are in the image data but not where these objects are located: the captions they generate are the output of a language model conditioned on the output of an object detector that cannot capture fine-grained location information. Although language models provide useful knowledge for image captions, we argue that deep learning image captioning architectures should also model geometric relations between objects.


