Image Captioning at Will: A Versatile Scheme for Effectively Injecting Sentiments into Image Descriptions

01/30/2018
by   Quanzeng You, et al.

Automatic image captioning has recently approached human-level performance thanks to advances in computer vision and natural language understanding. However, most current models can only generate plain factual descriptions of the content of a given image. For human beings, by contrast, caption writing is flexible and diverse: additional language dimensions, such as emotion, humor, and language style, are often incorporated to produce diverse, emotional, or appealing captions. In particular, we are interested in generating sentiment-conveying image descriptions, a problem that has received little attention. The main challenge is how to effectively inject sentiments into the generated captions without altering the semantic matching between the visual content and the generated descriptions. In this work, we propose two models that employ different schemes for injecting sentiments into image captions. Compared with the few existing approaches, the proposed models are much simpler and yet more effective. The experimental results show that our models outperform the state-of-the-art models in generating sentimental (i.e., sentiment-bearing) image captions. In addition, the models can easily be steered at test time: assigning a different sentiment to a test image yields captions with the corresponding sentiment.

research
08/08/2019

Image Captioning using Facial Expression and Attention

Benefiting from advances in machine vision and natural language processi...
research
10/06/2015

SentiCap: Generating Image Descriptions with Sentiments

The recent progress on image recognition and language modeling is making...
research
07/06/2018

Face-Cap: Image Captioning using Facial Expression Analysis

Image captioning is the process of generating a natural language descrip...
research
06/23/2023

Learning Descriptive Image Captioning via Semipermeable Maximum Likelihood Estimation

Image captioning aims to describe visual content in natural language. As...
research
05/18/2018

SemStyle: Learning to Generate Stylised Image Captions using Unaligned Text

Linguistic style is an essential part of written communication, with the...
research
11/19/2018

Intention Oriented Image Captions with Guiding Objects

Although existing image caption models can produce promising results usi...
research
07/31/2023

Visual Captioning at Will: Describing Images and Videos Guided by a Few Stylized Sentences

Stylized visual captioning aims to generate image or video descriptions ...
