A Novel Actor Dual-Critic Model for Remote Sensing Image Captioning

10/05/2020
by   Ruchika Chavhan, et al.
0

We deal with the problem of generating textual captions from optical remote sensing (RS) images using the notion of deep reinforcement learning. Due to the high inter-class similarity in reference sentences describing remote sensing data, jointly encoding the sentences and images encourages prediction of captions that are semantically more precise than the ground truth in many cases. To this end, we introduce an Actor Dual-Critic training strategy where a second critic model is deployed in the form of an encoder-decoder RNN to encode the latent information corresponding to the original and generated captions. While all actor-critic methods use an actor to predict sentences for an image and a critic to provide rewards, our proposed encoder-decoder RNN guarantees high-level comprehension of images by sentence-to-image translation. We observe that the proposed model generates sentences on the test data highly similar to the ground truth and is successful in generating even better captions in many critical cases. Extensive experiments on the benchmark Remote Sensing Image Captioning Dataset (RSICD) and the UCM-captions dataset confirm the superiority of the proposed approach in comparison to the previous state-of-the-art where we obtain a gain of sharp increments in both the ROUGE-L and CIDEr measures.

READ FULL TEXT

page 5

page 6

page 7

research
06/15/2020

SD-RSIC: Summarization Driven Deep Remote Sensing Image Captioning

Deep neural networks (DNNs) have been recently found popular for image c...
research
04/12/2017

Deep Reinforcement Learning-based Image Captioning with Embedding Reward

Image captioning is a challenging problem owing to the complexity in und...
research
06/29/2017

Actor-Critic Sequence Training for Image Captioning

Generating natural language descriptions of images is an important capab...
research
12/12/2019

Meaning guided video captioning

Current video captioning approaches often suffer from problems of missin...
research
08/26/2020

Attr2Style: A Transfer Learning Approach for Inferring Fashion Styles via Apparel Attributes

Popular fashion e-commerce platforms mostly provide details about low-le...
research
11/17/2018

Improving Automatic Source Code Summarization via Deep Reinforcement Learning

Code summarization provides a high level natural language description of...
research
02/25/2019

Making History Matter: Gold-Critic Sequence Training for Visual Dialog

We study the multi-round response generation in visual dialog systems, w...

Please sign up or login with your details

Forgot password? Click here to reset