IMAGETOTEXT: IMAGE CAPTION GENERATION USING HYBRID RECURRENT NEURAL NETWORK

02/25/2021
by   md-asifuzzaman-jishan, et al.
0

Generating a natural language description from images is an important problem at the section of computer vision, natural language processing, artificial intelligence and image processing. Observing many recent works in deep learning sector, we introduced a hybrid RNN model which is generating text from the given input images. We presented the learning model that generates natural language of images. The model utilized the connections between natural language and visual data by produced text line based contents from a given image. Our Hybrid Recurrent Neural Network model is based on the combination of Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM) and Bi-directional Recurrent Neural Network (BRNN) models. We used three benchmark datasets: Flickr8K, Flickr30K and MS COCO for training our model and observed the accuracy improvement comparing with the state of the art work. A new Bangla dataset is also created which we named as BNLIT (Bangla Natural Language Image to Text) is made to generate Bangla caption from given input image. This dataset contains 8,700 images and all the images are in Bangladesh perspective images. Our hybrid model learns from a new set of data and annotations that reflect the Bangladeshi geographical context.

READ FULL TEXT

page 15

page 30

page 33

research
02/25/2021

Natural language description of images using hybrid recurrent neural network

We presented a learning model that generated natural language descriptio...
research
09/28/2021

The VVAD-LRS3 Dataset for Visual Voice Activity Detection

Robots are becoming everyday devices, increasing their interaction with ...
research
12/15/2014

Translating Videos to Natural Language Using Deep Recurrent Neural Networks

Solving the visual symbol grounding problem has long been a goal of arti...
research
03/06/2021

TextMage: The Automated Bangla CaptionGenerator Based On Deep Learning

Neural Networks and Deep Learning have seen an upsurge of research in th...
research
10/15/2020

TextMage: The Automated Bangla Caption Generator Based On Deep Learning

Neural Networks and Deep Learning have seen an upsurge of research in th...
research
07/27/2021

Remember What You have drawn: Semantic Image Manipulation with Memory

Image manipulation with natural language, which aims to manipulate image...
research
02/25/2021

Hybrid deep neural network for Bangla automated image descriptor

Automated image to text generation is a computationally challenging comp...

Please sign up or login with your details

Forgot password? Click here to reset