Image Captioning using Facial Expression and Attention

08/08/2019
by Omid Mohamad Nezami, et al.

Benefiting from advances in machine vision and natural language processing techniques, current image captioning systems are able to generate detailed visual descriptions. For the most part, these descriptions represent an objective characterisation of the image, although some models do incorporate subjective aspects related to the observer's view of the image, such as sentiment. Current models, however, usually do not consider the emotional content of images during the caption generation process. This paper addresses this issue by proposing novel image captioning models which use facial expression features to generate image captions. The models generate image captions using long short-term memory (LSTM) networks that apply facial features, in addition to other visual features, at different time steps. We compare a comprehensive collection of image captioning models with and without facial features using all standard evaluation metrics. The evaluation is carried out on an image caption dataset extracted from the standard Flickr 30K dataset, consisting of around 11K images containing faces. On this dataset, the metrics indicate that applying facial features with an attention mechanism achieves the best performance, showing more expressive and more correlated image captions. An analysis of the generated captions finds that, perhaps unexpectedly, the improvement in caption quality appears to come not from the addition of adjectives linked to emotional aspects of the images, but from more variety in the actions described in the captions.
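
The abstract does not spell out the architecture, so the following is only a minimal sketch of the general idea it describes: an LSTM decoder step that receives a facial expression feature vector alongside attention-weighted visual features. It is not the authors' model. The additive attention form, the choice to concatenate the facial features into the input at every step, and all names and sizes (`attend`, `lstm_step`, `decode_step`, `face_feat`, `params`, the dimensions) are assumptions made for illustration.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attend(regions, h, W_r, W_h, v):
    """Additive soft attention: weight each image region by its relevance to state h."""
    scores = np.tanh(regions @ W_r + h @ W_h) @ v   # shape: (num_regions,)
    alpha = softmax(scores)                         # attention weights, sum to 1
    return alpha @ regions, alpha                   # context vector, weights

def lstm_step(x, h, c, W, b):
    """One standard LSTM step; W maps [x; h] to four stacked gate pre-activations."""
    z = np.concatenate([x, h]) @ W + b
    i, f, o, g = np.split(z, 4)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new

def decode_step(word_emb, face_feat, regions, h, c, params):
    """Assumed design: facial features are concatenated with the previous word
    embedding and the attended visual context before entering the LSTM."""
    context, alpha = attend(regions, h, params["W_r"], params["W_h"], params["v"])
    x = np.concatenate([word_emb, context, face_feat])
    h, c = lstm_step(x, h, c, params["W"], params["b"])
    return h, c, alpha

# Illustrative sizes only: regions, region dim, attention dim, hidden, embedding, face dim.
R, D, A, H, E, F = 5, 8, 6, 4, 3, 7
rng = np.random.default_rng(0)
params = {
    "W_r": rng.normal(scale=0.1, size=(D, A)),
    "W_h": rng.normal(scale=0.1, size=(H, A)),
    "v": rng.normal(scale=0.1, size=A),
    "W": rng.normal(scale=0.1, size=(E + D + F + H, 4 * H)),
    "b": np.zeros(4 * H),
}
regions = rng.normal(size=(R, D))    # stand-in for CNN region features
face_feat = rng.normal(size=F)       # stand-in for facial expression features
word_emb = rng.normal(size=E)        # stand-in for the previous word's embedding
h, c = np.zeros(H), np.zeros(H)
h, c, alpha = decode_step(word_emb, face_feat, regions, h, c, params)
```

In a full captioning model the new hidden state `h` would feed a vocabulary projection to choose the next word, and the step would repeat until an end token is produced. Supplying the facial features only at the first step, rather than at every step, is another way to read "at different time steps" and would change only how `x` is assembled.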

research
07/06/2018

Face-Cap: Image Captioning using Facial Expression Analysis

Image captioning is the process of generating a natural language descrip...
research
01/30/2018

Image Captioning at Will: A Versatile Scheme for Effectively Injecting Sentiments into Image Descriptions

Automatic image captioning has recently approached human-level performan...
research
02/11/2022

Deep soccer captioning with transformer: dataset, semantics-related losses, and multi-level evaluation

This work aims at generating captions for soccer videos using deep learn...
research
04/15/2022

Guiding Attention using Partial-Order Relationships for Image Captioning

The use of attention models for automated image captioning has enabled m...
research
05/10/2021

Matching Visual Features to Hierarchical Semantic Topics for Image Paragraph Captioning

Observing a set of images and their corresponding paragraph-captions, a ...
research
10/06/2015

SentiCap: Generating Image Descriptions with Sentiments

The recent progress on image recognition and language modeling is making...
research
07/10/2020

Image Captioning with Compositional Neural Module Networks

In image captioning where fluency is an important factor in evaluation, ...
