Visual attention models for scene text recognition

06/05/2017
by Suman K. Ghosh, et al.

In this paper we propose an approach to lexicon-free recognition of text in scene images. Our approach relies on an LSTM-based soft visual attention model learned from convolutional features. A set of feature vectors is derived from an intermediate convolutional layer, with each vector corresponding to a different area of the image. This encodes spatial information into the image representation and allows the framework to learn how to selectively focus on different parts of the image. At every time step the recognizer emits one character using a weighted combination of the convolutional feature vectors, with the weights given by the learned attention model. Training can be done end-to-end using only word-level annotations. In addition, we show that modifying the beam search algorithm by integrating an explicit language model leads to significantly better recognition results. We validate our approach on the standard SVT and ICDAR'03 scene text datasets, showing state-of-the-art performance in unconstrained text recognition.
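The weighted combination of convolutional feature vectors described in the abstract can be illustrated with a minimal NumPy sketch of a single soft-attention step. This is only an illustration under stated assumptions: the abstract does not give the scoring function, so an additive (tanh) score is assumed here, and all names (`soft_attention_step`, `W_f`, `W_h`, `v`) and dimensions are hypothetical rather than taken from the paper.

```python
import numpy as np

def soft_attention_step(features, hidden, W_f, W_h, v):
    """One soft-attention step over a set of region feature vectors.

    features: (L, D) array, one D-dim vector per image region
    hidden:   (H,) array, the recurrent (e.g. LSTM) state at this step
    W_f, W_h, v: projection parameters of an ASSUMED additive scoring
                 function; the paper's exact form is not given in the abstract
    """
    # Score each region against the current recurrent state.
    scores = np.tanh(features @ W_f + hidden @ W_h) @ v   # shape (L,)
    # Softmax over regions (shifted by the max for numerical stability).
    scores = scores - scores.max()
    weights = np.exp(scores)
    weights = weights / weights.sum()
    # Context vector: attention-weighted combination of the region features.
    context = weights @ features                           # shape (D,)
    return context, weights

# Toy dimensions, chosen only for illustration.
rng = np.random.default_rng(0)
L, D, H, A = 6, 4, 5, 3
features = rng.normal(size=(L, D))
hidden = rng.normal(size=H)
W_f = rng.normal(size=(D, A))
W_h = rng.normal(size=(H, A))
v = rng.normal(size=A)

context, weights = soft_attention_step(features, hidden, W_f, W_h, v)
```

In a full recognizer, `context` would be fed to the recurrent unit that emits the next character, and the step would be repeated with the updated state; that part is omitted here.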


