Towards Generating Stylized Image Captions via Adversarial Training

08/08/2019
by Omid Mohamad Nezami, et al.

While most image captioning aims to generate objective descriptions of images, the last few years have seen work on generating visually grounded image captions that have a specific style (e.g., incorporating positive or negative sentiment). However, because the stylistic component is typically the last part of training, current models usually pay more attention to the style at the expense of accurate content description. In addition, the stylistic aspects of the generated captions lack variability. To address these issues, we propose an image captioning model called ATTEND-GAN, which has two core components: first, an attention-based caption generator that strongly correlates different parts of an image with different parts of a caption; and second, an adversarial training mechanism that helps the caption generator add diverse stylistic components to the generated captions. Because of these components, ATTEND-GAN can generate captions that correlate well with image content and that show more human-like variability of stylistic patterns. Our system outperforms the state-of-the-art as well as a collection of our baseline models. A linguistic analysis of the generated captions demonstrates that captions generated using ATTEND-GAN have a wider range of stylistic adjectives and adjective-noun pairs.
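To make the attention component concrete, here is a minimal sketch of soft attention over image regions, written in plain Python. It is an illustration only and not the ATTEND-GAN implementation: the dot-product scoring, the function names `softmax` and `soft_attention`, and the toy feature vectors are all assumptions, since the abstract does not specify how the attention scores are computed.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(region_features, hidden_state):
    """Hypothetical dot-product attention over image regions.

    region_features: one feature vector per image region.
    hidden_state: the caption generator's current state (same length
        as each region vector; an assumption made for this sketch).
    Returns the attention-weighted context vector and the weights.
    """
    # Score each region by its similarity to the generator state.
    scores = [
        sum(f * h for f, h in zip(region, hidden_state))
        for region in region_features
    ]
    weights = softmax(scores)
    # The context vector is the weighted average of the region features.
    dim = len(region_features[0])
    context = [
        sum(w * region[d] for w, region in zip(weights, region_features))
        for d in range(dim)
    ]
    return context, weights


# Toy usage: three regions with two-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = [2.0, 0.0]
context, weights = soft_attention(regions, state)
print(weights)
print(context)
```

At each decoding step the weights indicate which image regions the generator attends to when producing the next word; the first and third regions receive equal, larger weights here because they align with the toy state vector.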

research
11/24/2018

Senti-Attend: Image Captioning using Sentiment and Attention

There has been much recent work on image captioning models that describe...
research
03/30/2017

Speaking the Same Language: Matching Machine to Human Captions by Adversarial Training

While strong progress has been made in image captioning over the last ye...
research
05/18/2018

SemStyle: Learning to Generate Stylised Image Captions using Unaligned Text

Linguistic style is an essential part of written communication, with the...
research
07/10/2018

"Factual" or "Emotional": Stylized Image Captioning with Adaptive Learning and Attention

Generating stylized captions for an image is an emerging topic in image ...
research
10/13/2021

Diverse Audio Captioning via Adversarial Training

Audio captioning aims at generating natural language descriptions for au...
research
10/06/2015

SentiCap: Generating Image Descriptions with Sentiments

The recent progress on image recognition and language modeling is making...
research
04/26/2017

Punny Captions: Witty Wordplay in Image Descriptions

Wit is a quintessential form of rich inter-human interaction, and is oft...
