Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions

03/10/2018
by Albert Gatt, et al.

The past few years have witnessed renewed interest in NLP tasks at the interface between vision and language. One intensively studied problem is that of automatically generating text from images. In this paper, we extend this problem to the more specific domain of face description. Unlike scene descriptions, face descriptions are more fine-grained and rely on attributes extracted from the image, rather than on objects and relations. Given that no data exists for this task, we present an ongoing crowdsourcing study to collect a corpus of descriptions of face images taken "in the wild". To gain a better understanding of the variation we find in face descriptions and the possible issues that this may raise, we also conducted an annotation study on a subset of the corpus. Primarily, we found that descriptions refer to a mixture of attributes, not only physical but also emotional and inferential, which is bound to create further challenges for current image-to-text methods.


