TextMage: The Automated Bangla CaptionGenerator Based On Deep Learning

03/06/2021
by   md-asifuzzaman-jishan, et al.
0

Neural Networks and Deep Learning have seen an upsurge of research in the past decade due to the improved results. Generates text from the given image is a crucial task that requires the combination of both sectors which are computer vision and natural language processing in order to understand an image and represent it using a natural language. However existing works have all been done on a particular lingual domain and on the same set of data. This leads to the systems being developed to perform poorly on images that belong to specific locales' geographical context. TextMage is a system that is capable of understanding visual scenes that belong to the Bangladeshi geographical context and use its knowledge to represent what it understands in Bengali. Hence, we have trained a model on our previously developed and published dataset named BanglaLekhaImageCaptions. This dataset contains 9,154 images along with two annotations for each image. In order to access performance, the proposed model has been implemented and evaluated.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
10/15/2020

TextMage: The Automated Bangla Caption Generator Based On Deep Learning

Neural Networks and Deep Learning have seen an upsurge of research in th...
research
02/25/2021

IMAGETOTEXT: IMAGE CAPTION GENERATION USING HYBRID RECURRENT NEURAL NETWORK

Generating a natural language description from images is an important pr...
research
11/06/2017

What Really is Deep Learning Doing?

Deep learning has achieved a great success in many areas, from computer ...
research
05/09/2016

LightNet: A Versatile, Standalone Matlab-based Environment for Deep Learning

LightNet is a lightweight, versatile and purely Matlab-based deep learni...
research
10/29/2018

Visual Re-ranking with Natural Language Understanding for Text Spotting

Many scene text recognition approaches are based on purely visual inform...
research
07/18/2022

The Multiple Subnetwork Hypothesis: Enabling Multidomain Learning by Isolating Task-Specific Subnetworks in Feedforward Neural Networks

Neural networks have seen an explosion of usage and research in the past...
research
11/09/2015

A Century of Portraits: A Visual Historical Record of American High School Yearbooks

Many details about our world are not captured in written records because...

Please sign up or login with your details

Forgot password? Click here to reset