Bangla language textual image description by hybrid neural network model

02/25/2021
by   md-asifuzzaman-jishan, et al.
0

Automatic image captioning task in different language is a challenging task which has not been well investigated yet due to the lack of dataset and effective models. It also requires good understanding of scene and contextual embedding for robust semantic interpretation of images for natural language image descriptor. To generate image descriptor in Bangla, we created a new Bangla dataset of images paired with target language label, named as Bangla natural language image to text (BNLIT) dataset. To deal with the image understanding, we propose a hybrid encoder-decoder model based on encoder-decoder architecture and the model is evaluated on our newly created dataset. This proposed approach achieves significance performance improvement on task of semantic retrieval of images. Our hybrid model uses the convolutional neural network as an encoder whereas the bidirectional long short term memory is used for the sentence representation that decreases the computational complexities without trading off the exactness of the descriptor. The model yielded benchmark accuracy in recovering Bangla natural language and we also conducted a thorough numerical analysis of the model performance on the BNLIT dataset.

READ FULL TEXT

page 2

page 4

page 8

page 10

research
02/25/2021

Hybrid deep neural network for Bangla automated image descriptor

Automated image to text generation is a computationally challenging comp...
research
02/25/2021

Natural language description of images using hybrid recurrent neural network

We presented a learning model that generated natural language descriptio...
research
09/03/2022

vieCap4H-VLSP 2021: Vietnamese Image Captioning for Healthcare Domain using Swin Transformer and Attention-based LSTM

This study presents our approach on the automatic Vietnamese image capti...
research
02/21/2020

Image to Language Understanding: Captioning approach

Extracting context from visual representations is of utmost importance i...
research
02/22/2021

Image Captioning using Deep Stacked LSTMs, Contextual Word Embeddings and Data Augmentation

Image Captioning, or the automatic generation of descriptions for images...
research
10/14/2019

Tell-the-difference: Fine-grained Visual Descriptor via a Discriminating Referee

In this paper, we investigate a novel problem of telling the difference ...
research
12/06/2018

Auto-Encoding Scene Graphs for Image Captioning

We propose Scene Graph Auto-Encoder (SGAE) that incorporates the languag...

Please sign up or login with your details

Forgot password? Click here to reset