What is the Role of Recurrent Neural Networks (RNNs) in an Image Caption Generator?

08/07/2017
by Marc Tanti, et al.

In neural image captioning systems, a recurrent neural network (RNN) is typically viewed as the primary 'generation' component. This view suggests that the image features should be 'injected' into the RNN. This is in fact the dominant view in the literature. Alternatively, the RNN can instead be viewed as only encoding the previously generated words. This view suggests that the RNN should only be used to encode linguistic features and that only the final representation should be 'merged' with the image features at a later stage. This paper compares these two architectures. We find that, in general, late merging outperforms injection, suggesting that RNNs are better viewed as encoders, rather than generators.
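The difference between the two architectures can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the toy layer sizes, the simple tanh RNN, the random untrained weights, and the use of concatenation to combine vectors are all assumptions made for illustration, as are the function names `inject_next_word` and `merge_next_word`. The sketch only shows where the image vector enters each model.

```python
import math
import random

# Toy sizes, chosen arbitrarily for illustration (assumption).
EMBED, IMG, HIDDEN, VOCAB = 4, 3, 5, 6

def make_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows (untrained)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def softmax(scores):
    """Turn raw scores into a probability distribution."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rnn_encode(inputs, w_in, w_rec):
    """Simple tanh RNN; returns the final hidden state."""
    hidden = [0.0] * len(w_rec)
    for x in inputs:
        from_input = matvec(w_in, x)
        from_state = matvec(w_rec, hidden)
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
    return hidden

def inject_next_word(image, word_vecs, params):
    """Inject: the image enters the RNN, here concatenated to every word vector."""
    inputs = [word + image for word in word_vecs]
    hidden = rnn_encode(inputs, params["w_in"], params["w_rec"])
    return softmax(matvec(params["w_out"], hidden))

def merge_next_word(image, word_vecs, params):
    """Merge: the RNN sees words only; the image joins after encoding."""
    hidden = rnn_encode(word_vecs, params["w_in"], params["w_rec"])
    return softmax(matvec(params["w_out"], hidden + image))

rng = random.Random(0)
image = [rng.random() for _ in range(IMG)]                          # stand-in image features
prefix = [[rng.random() for _ in range(EMBED)] for _ in range(3)]   # stand-in word embeddings

inject_params = {
    "w_in": make_matrix(HIDDEN, EMBED + IMG, rng),   # RNN input includes the image
    "w_rec": make_matrix(HIDDEN, HIDDEN, rng),
    "w_out": make_matrix(VOCAB, HIDDEN, rng),
}
merge_params = {
    "w_in": make_matrix(HIDDEN, EMBED, rng),         # RNN input is words only
    "w_rec": make_matrix(HIDDEN, HIDDEN, rng),
    "w_out": make_matrix(VOCAB, HIDDEN + IMG, rng),  # image joins at the output layer
}

p_inject = inject_next_word(image, prefix, inject_params)
p_merge = merge_next_word(image, prefix, merge_params)
```

In the merge variant the RNN's hidden state depends only on the caption prefix, so the RNN acts purely as a linguistic encoder. In the inject variant the hidden state mixes visual and linguistic information at every step.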


research
03/27/2017

Where to put the Image in an Image Caption Generator

When a neural language model is used for caption generation, the image i...
research
12/14/2017

Learning Compact Recurrent Neural Networks with Block-Term Tensor Decomposition

Recurrent Neural Networks (RNNs) are powerful sequence modeling tools. H...
research
12/12/2018

Recurrent Neural Networks for Fuzz Testing Web Browsers

Generation-based fuzzing is a software testing approach which is able to...
research
10/10/2017

Network of Recurrent Neural Networks

We describe a class of systems theory based neural networks called "Netw...
research
04/18/2017

Diagonal RNNs in Symbolic Music Modeling

In this paper, we propose a new Recurrent Neural Network (RNN) architect...
research
04/05/2019

Measuring scheduling efficiency of RNNs for NLP applications

Recurrent neural networks (RNNs) have shown state of the art results for...
research
07/08/2016

Log-Linear RNNs: Towards Recurrent Neural Networks with Flexible Prior Knowledge

We introduce LL-RNNs (Log-Linear RNNs), an extension of Recurrent Neural...
