Enhancing Energy Minimization Framework for Scene Text Recognition with Top-Down Cues

01/13/2016
by   Anand Mishra, et al.
0

Recognizing scene text is a challenging problem, even more so than the recognition of scanned documents. This problem has gained significant attention from the computer vision community in recent years, and several methods based on energy minimization frameworks and deep learning approaches have been proposed. In this work, we focus on the energy minimization framework and propose a model that exploits both bottom-up and top-down cues for recognizing cropped words extracted from street images. The bottom-up cues are derived from individual character detections from an image. We build a conditional random field model on these detections to jointly model the strength of the detections and the interactions between them. These interactions are top-down cues obtained from a lexicon-based prior, i.e., language statistics. The optimal word represented by the text image is obtained by minimizing the energy function corresponding to the random field model. We evaluate our proposed algorithm extensively on a number of cropped scene text benchmark datasets, namely Street View Text, ICDAR 2003, 2011 and 2013 datasets, and IIIT 5K-word, and show better performance than comparable methods. We perform a rigorous analysis of all the steps in our approach and analyze the results. We also show that state-of-the-art convolutional neural network features can be integrated in our framework to further improve the recognition performance.

READ FULL TEXT

page 2

page 3

research
12/20/2013

Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks

Recognizing arbitrary multi-character text in unconstrained natural phot...
research
11/26/2017

Coplanar Repeats by Energy Minimization

This paper proposes an automated method to detect, group and rectify arb...
research
09/25/2013

Characterness: An Indicator of Text in the Wild

Text in an image provides vital information for interpreting its content...
research
06/05/2017

Visual attention models for scene text recognition

In this paper we propose an approach to lexicon-free recognition of text...
research
07/21/2015

An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

Image-based sequence recognition has been a long-standing research topic...
research
12/29/2015

Robust Scene Text Recognition Using Sparse Coding based Features

In this paper, we propose an effective scene text recognition method usi...
research
12/03/2019

Scene recognition based on DNN and game theory with its applications in human-robot interaction

Scene recognition model based on the DNN and game theory with its applic...

Please sign up or login with your details

Forgot password? Click here to reset