Show, Recall, and Tell: Image Captioning with Recall Mechanism

01/15/2020
by Li Wang, et al.

Generating natural and accurate descriptions in image captioning has always been a challenge. In this paper, we propose a novel recall mechanism to imitate the way humans conduct captioning. There are three parts in our recall mechanism: the recall unit, the semantic guide (SG), and the recalled-word slot (RWS). The recall unit is a text-retrieval module designed to retrieve recalled words for images. SG and RWS are designed to make the best use of recalled words. The SG branch generates a recalled context, which guides the process of generating the caption. The RWS branch is responsible for copying recalled words to the caption. Inspired by the pointing mechanism in text summarization, we adopt a soft switch to balance the generated-word probabilities between SG and RWS. In the CIDEr optimization step, we also introduce an individual recalled-word reward (WR) to boost training. Our proposed methods (SG+RWS+WR) achieve BLEU-4 / CIDEr / SPICE scores of 36.6 / 116.9 / 21.3 with cross-entropy loss and 38.7 / 129.1 / 22.4 with CIDEr optimization on the MSCOCO Karpathy test split, which surpass the results of other state-of-the-art methods.
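
The soft switch mentioned in the abstract can be illustrated with a minimal sketch in the style of pointer-generator mixing from text summarization. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the function name `mix_word_probabilities`, the dictionary representation of the SG distribution, the per-word copy weights for the RWS branch, and the scalar `switch` in [0, 1]. In a real model the switch and both distributions would be predicted by the network at each decoding step.

```python
from typing import Dict, List


def mix_word_probabilities(
    p_generate: Dict[str, float],
    recalled_words: List[str],
    copy_weights: List[float],
    switch: float,
) -> Dict[str, float]:
    """Blend a generation distribution with a copy distribution.

    p_generate     -- hypothetical SG-branch distribution over the vocabulary
    recalled_words -- words returned by a (hypothetical) recall unit
    copy_weights   -- hypothetical RWS-branch weights, one per recalled word
    switch         -- soft switch in [0, 1]; 1.0 means "generate only"
    """
    if not 0.0 <= switch <= 1.0:
        raise ValueError("switch must lie in [0, 1]")
    if len(recalled_words) != len(copy_weights):
        raise ValueError("need exactly one copy weight per recalled word")

    # Scale the generation branch by the switch value.
    mixed = {word: switch * prob for word, prob in p_generate.items()}

    # Add the copy branch, scaled by the complementary weight. A recalled
    # word that is missing from the vocabulary still receives probability mass.
    for word, weight in zip(recalled_words, copy_weights):
        mixed[word] = mixed.get(word, 0.0) + (1.0 - switch) * weight
    return mixed


# Illustrative inputs only; these numbers are not taken from the paper.
example = mix_word_probabilities(
    p_generate={"a": 0.5, "dog": 0.3, "runs": 0.2},
    recalled_words=["dog", "frisbee"],
    copy_weights=[0.75, 0.25],
    switch=0.8,
)
```

If both input distributions sum to one, the mixture also sums to one, because it is a convex combination of the two. The sketch also shows why a copy branch is useful: a recalled word such as "frisbee" can appear in the output even when the generation branch assigns it no probability.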
