Visual Re-ranking with Natural Language Understanding for Text Spotting

10/29/2018
by Ahmed Sabir, et al.

Many scene text recognition approaches rely purely on visual information and ignore the semantic relation between scene and text. In this paper, we tackle this problem from a natural language processing perspective to bridge the gap between language and vision. We propose a post-processing approach that improves scene text recognition accuracy by using the occurrence probabilities of words (a unigram language model) and the semantic correlation between scene and text. To this end, we rely on an off-the-shelf deep neural network, already trained on a large amount of data, which provides a series of text hypotheses per input image. These hypotheses are then re-ranked using word frequencies and semantic relatedness with objects or scenes in the image. As a result of this combination, the performance of the original network is boosted at almost no additional cost. We validate our approach on the ICDAR'17 dataset.
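
The re-ranking idea described above can be illustrated with a minimal sketch. The abstract does not say how the three signals are combined, so the weighted sum of log scores used here is an assumption, as are the function names (`unigram_prob`, `cosine`, `rerank`), the add-one smoothing, the toy word counts, and the toy 3-dimensional embeddings. None of these come from the paper.

```python
import math

def unigram_prob(word, counts):
    # Add-one smoothing (an assumption) so unseen words keep a nonzero probability.
    total = sum(counts.values())
    return (counts.get(word, 0) + 1) / (total + len(counts))

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rerank(hypotheses, context_labels, counts, embeddings, w_lm=1.0, w_sem=1.0):
    # hypotheses: list of (word, recognizer probability) pairs.
    # context_labels: object or scene labels detected in the same image.
    scored = []
    for word, visual_prob in hypotheses:
        p_lm = unigram_prob(word, counts)
        # Relatedness: best similarity to any context label, floored to keep log finite.
        sims = [
            cosine(embeddings[word], embeddings[label])
            for label in context_labels
            if word in embeddings and label in embeddings
        ]
        relatedness = max(max(sims, default=0.0), 1e-6)
        # Hypothetical log-linear combination of the three signals.
        score = (math.log(visual_prob)
                 + w_lm * math.log(p_lm)
                 + w_sem * math.log(relatedness))
        scored.append((word, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy data, invented for illustration only.
hypotheses = [("bear", 0.40), ("beer", 0.35), ("been", 0.25)]
context_labels = ["bottle", "bar"]
counts = {"bear": 50, "beer": 60, "been": 120, "bottle": 40, "bar": 45}
embeddings = {
    "beer": [0.9, 0.1, 0.1],
    "bottle": [0.8, 0.2, 0.1],
    "bar": [0.85, 0.1, 0.2],
    "bear": [0.1, 0.9, 0.1],
    "been": [0.1, 0.1, 0.9],
}

ranked = rerank(hypotheses, context_labels, counts, embeddings)
for word, score in ranked:
    print(f"{word}: {score:.3f}")
```

In this toy setup the recognizer's top hypothesis is "bear", but "beer" moves to first place once its relatedness to the detected labels "bottle" and "bar" is taken into account. In practice the counts would come from a large corpus and the vectors from pretrained word embeddings; the weights `w_lm` and `w_sem` are hypothetical tuning parameters.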

research
10/23/2018

Visual Semantic Re-ranker for Text Spotting

Many current state-of-the-art methods for text recognition are based on ...
research
04/21/2020

Textual Visual Semantic Dataset for Text Spotting

Text Spotting in the wild consists of detecting and recognizing text app...
research
10/24/2018

Image-based Natural Language Understanding Using 2D Convolutional Neural Networks

We propose a new approach to natural language understanding in which we ...
research
06/05/2017

Visual attention models for scene text recognition

In this paper we propose an approach to lexicon-free recognition of text...
research
08/04/2019

Deep Neural Network for Semantic-based Text Recognition in Images

State-of-the-art text spotting systems typically aim to detect isolated ...
research
02/26/2015

A hypothesize-and-verify framework for Text Recognition using Deep Recurrent Neural Networks

Deep LSTM is an ideal candidate for text recognition. However text recog...
research
03/06/2021

TextMage: The Automated Bangla CaptionGenerator Based On Deep Learning

Neural Networks and Deep Learning have seen an upsurge of research in th...
