Improving Image Captioning with Conditional Generative Adversarial Nets

05/18/2018
by Chen Chen, et al.

In this paper, we propose a novel image captioning framework based on conditional generative adversarial nets, as an extension of the traditional reinforcement learning (RL) based encoder-decoder architecture. To deal with the inconsistency between objective language metrics and subjective human judgements, we design "discriminator" networks that automatically and progressively determine whether a generated caption is human described or machine generated. Two kinds of discriminator architecture (CNN-based and RNN-based) are introduced, since each has its own advantages. The proposed algorithm is generic, so it can enhance any existing encoder-decoder based image captioning model, and we show that the conventional RL training method is just a special case of our framework. Empirically, we show consistent improvements over all language evaluation metrics for different state-of-the-art image captioning models.
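
One way to picture the claim that conventional RL training is a special case is a reward that blends a discriminator's "human-written" probability with a language-metric score. The sketch below is illustrative only: the abstract does not state the reward formula, so the linear blend, the function names `mixed_reward` and `advantages`, and the weight `lam` are all assumptions rather than the authors' implementation.

```python
def mixed_reward(disc_prob, metric_score, lam):
    """Blend a discriminator probability with a language-metric score.

    Assumed form (not given in the abstract): lam * disc_prob + (1 - lam) * metric_score.
    With lam = 0 the discriminator is ignored and only the metric remains,
    which corresponds to a conventional metric-only RL reward.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * disc_prob + (1.0 - lam) * metric_score


def advantages(rewards, baseline):
    """REINFORCE-style advantages: each sampled caption's reward minus a baseline."""
    return [r - baseline for r in rewards]


if __name__ == "__main__":
    # Hypothetical numbers for two sampled captions: (discriminator prob, metric score).
    samples = [(0.75, 0.25), (0.25, 0.75)]
    for lam in (0.0, 0.5, 1.0):
        rewards = [mixed_reward(p, s, lam) for p, s in samples]
        print(lam, rewards, advantages(rewards, baseline=0.5))
```

Under this assumed form, the weight `lam` controls how much the generator is pushed toward captions the discriminator judges human-like versus captions that score well on the metric.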

research
05/25/2016

Review Networks for Caption Generation

We propose a novel extension of the encoder-decoder framework, called a ...
research
09/13/2018

Improving Reinforcement Learning Based Image Captioning with Natural Language Prior

Recently, Reinforcement Learning (RL) approaches have demonstrated advan...
research
11/13/2018

Image Captioning Based on a Hierarchical Attention Mechanism and Policy Gradient Optimization

Automatically generating the descriptions of an image, i.e., image capti...
research
10/31/2019

Can adversarial training learn image captioning ?

Recently, generative adversarial networks (GAN) have gathered a lot of i...
research
09/11/2017

Stack-Captioning: Coarse-to-Fine Learning for Image Captioning

The existing image captioning approaches typically train a one-stage sen...
research
10/31/2019

Hidden State Guidance: Improving Image Captioning using An Image Conditioned Autoencoder

Most RNN-based image captioning models receive supervision on the output...
