Reflective Decoding Network for Image Captioning

08/30/2019
by   Lei Ke, et al.
1

State-of-the-art image captioning methods mostly focus on improving visual features, less attention has been paid to utilizing the inherent properties of language to boost captioning performance. In this paper, we show that vocabulary coherence between words and syntactic paradigm of sentences are also important to generate high-quality image caption. Following the conventional encoder-decoder framework, we propose the Reflective Decoding Network (RDN) for image captioning, which enhances both the long-sequence dependency and position perception of words in a caption decoder. Our model learns to collaboratively attend on both visual and textual features and meanwhile perceive each word's relative position in the sentence to maximize the information delivered in the generated caption. We evaluate the effectiveness of our RDN on the COCO image captioning datasets and achieve superior performance over the previous methods. Further experiments reveal that our approach is particularly advantageous for hard cases with complex scenes to describe by captions.

READ FULL TEXT

page 3

page 8

research
11/03/2020

Attention Beam: An Image Captioning Approach

The aim of image captioning is to generate textual description of a give...
research
05/28/2021

New Image Captioning Encoder via Semantic Visual Feature Matching for Heavy Rain Images

Image captioning generates text that describes scenes from input images....
research
12/12/2016

Text-guided Attention Model for Image Captioning

Visual attention plays an important role to understand images and demons...
research
09/07/2019

Look and Modify: Modification Networks for Image Captioning

Attention-based neural encoder-decoder frameworks have been widely used ...
research
02/23/2021

Enhanced Modality Transition for Image Captioning

Image captioning model is a cross-modality knowledge discovery task, whi...
research
01/04/2022

Interactive Attention AI to translate low light photos to captions for night scene understanding in women safety

There is amazing progress in Deep Learning based models for Image captio...
research
09/08/2021

RefineCap: Concept-Aware Refinement for Image Captioning

Automatically translating images to texts involves image scene understan...

Please sign up or login with your details

Forgot password? Click here to reset