A Deep Neural Framework for Image Caption Generation Using GRU-Based Attention Mechanism

03/03/2022
by Rashid Khan, et al.

Image captioning is a fast-growing research field at the intersection of computer vision and natural language processing that involves generating textual descriptions of images. This study aims to develop a system that uses a pre-trained convolutional neural network (CNN) to extract features from an image, integrates the features with an attention mechanism, and generates captions using a recurrent neural network (RNN). We employ multiple pre-trained convolutional neural networks to encode an image into a feature vector of visual attributes. A gated recurrent unit (GRU) language model is then chosen as the decoder to construct the descriptive sentence. To improve performance, we combine the Bahdanau attention model with the GRU so that learning can focus on a specific portion of the image. On the MSCOCO dataset, the proposed approach achieves performance competitive with state-of-the-art approaches.
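The decoding step described in the abstract, Bahdanau (additive) attention over CNN features followed by a GRU update, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no layer sizes, weight names, or training details, so every name here (`bahdanau_attention`, `gru_step`, `W1`, `W2`, `v`, the `params` dictionary) and every dimension is an assumption chosen for illustration. Random numbers stand in for real CNN features and learned weights, and plain Python lists are used for readability where a practical system would use a tensor library.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bahdanau_attention(features, hidden, W1, W2, v):
    """Additive attention: score_i = v . tanh(W1 f_i + W2 h)."""
    proj_h = matvec(W2, hidden)
    scores = []
    for f in features:
        proj_f = matvec(W1, f)
        scores.append(sum(vk * math.tanh(a + b)
                          for vk, a, b in zip(v, proj_f, proj_h)))
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the region features.
    dim = len(features[0])
    context = [sum(w * f[d] for w, f in zip(weights, features))
               for d in range(dim)]
    return context, weights

def gru_step(x, h, p):
    """One GRU update, using the convention h' = (1 - z) * h + z * candidate."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], rh))]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Hypothetical sizes; the abstract does not specify any of these.
rng = random.Random(0)
feat_dim, hid_dim, attn_dim, emb_dim, regions = 4, 3, 5, 2, 6

features = rand_matrix(regions, feat_dim, rng)  # stand-in for CNN region features
hidden = [0.0] * hid_dim                        # initial decoder state
W1 = rand_matrix(attn_dim, feat_dim, rng)
W2 = rand_matrix(attn_dim, hid_dim, rng)
v = [rng.uniform(-0.5, 0.5) for _ in range(attn_dim)]

in_dim = emb_dim + feat_dim  # GRU input = word embedding concatenated with context
params = {name: rand_matrix(hid_dim, in_dim if name.startswith("W") else hid_dim, rng)
          for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}

word_embedding = [0.1, -0.2]  # stand-in for the previous word's embedding
context, weights = bahdanau_attention(features, hidden, W1, W2, v)
hidden = gru_step(word_embedding + context, hidden, params)
```

At each decoding step the attention weights form a probability distribution over image regions, so the context vector emphasizes the regions most relevant to the next word. Feeding the concatenation of the word embedding and the context vector into the GRU is one common design choice, assumed here rather than taken from the paper.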

research
03/08/2021

Analysis of Convolutional Decoder for Image Caption Generation

Recently Convolutional Neural Networks have been proposed for Sequence M...
research
11/25/2019

Event Recognition with Automatic Album Detection based on Sequential Processing, Neural Attention and Image Captioning

In this paper a new formulation of event recognition task is examined: i...
research
11/09/2015

Visual Language Modeling on CNN Image Representations

Measuring the naturalness of images is important to generate realistic i...
research
03/30/2018

Guide Me: Interacting with Deep Networks

Interaction and collaboration between humans and intelligent machines ha...
research
04/21/2019

3G structure for image caption generation

It is a big challenge of computer vision to make machine automatically d...
research
07/09/2019

BASN – Learning Steganography with Binary Attention Mechanism

Secret information sharing through image carrier has aroused much resear...
research
09/22/2022

Learning Visual Explanations for DCNN-Based Image Classifiers Using an Attention Mechanism

In this paper two new learning-based eXplainable AI (XAI) methods for de...
