A hypothesize-and-verify framework for Text Recognition using Deep Recurrent Neural Networks

02/26/2015
by   Anupama Ray, et al.
0

Deep LSTM is an ideal candidate for text recognition. However text recognition involves some initial image processing steps like segmentation of lines and words which can induce error to the recognition system. Without segmentation, learning very long range context is difficult and becomes computationally intractable. Therefore, alternative soft decisions are needed at the pre-processing level. This paper proposes a hybrid text recognizer using a deep recurrent neural network with multiple layers of abstraction and long range context along with a language model to verify the performance of the deep neural network. In this paper we construct a multi-hypotheses tree architecture with candidate segments of line sequences from different segmentation algorithms at its different branches. The deep neural network is trained on perfectly segmented data and tests each of the candidate segments, generating unicode sequences. In the verification step, these unicode sequences are validated using a sub-string match with the language model and best first search is used to find the best possible combination of alternative hypothesis from the tree structure. Thus the verification framework using language models eliminates wrong segmentation outputs and filters recognition errors.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
09/20/2015

Telugu OCR Framework using Deep Learning

In this paper, we address the task of Optical Character Recognition(OCR)...
research
11/15/2018

Multi-cell LSTM Based Neural Language Model

Language models, being at the heart of many NLP problems, are always of ...
research
08/04/2013

Generating Sequences With Recurrent Neural Networks

This paper shows how Long Short-term Memory recurrent neural networks ca...
research
10/29/2018

Visual Re-ranking with Natural Language Understanding for Text Spotting

Many scene text recognition approaches are based on purely visual inform...
research
11/09/2019

On Architectures for Including Visual Information in Neural Language Models for Image Description

A neural language model can be conditioned into generating descriptions ...
research
04/15/2021

Neural Sequence Segmentation as Determining the Leftmost Segments

Prior methods to text segmentation are mostly at token level. Despite th...
research
02/24/2017

Sequence Modeling via Segmentations

Segmental structure is a common pattern in many types of sequences such ...

Please sign up or login with your details

Forgot password? Click here to reset