A Multitask Network for Localization and Recognition of Text in Images

06/21/2019
by   Mohammad Reza Sarshogh, et al.
0

We present an end-to-end trainable multi-task network that addresses the problem of lexicon-free text extraction from complex documents. This network simultaneously solves the problems of text localization and text recognition and text segments are identified with no post-processing, cropping, or word grouping. A convolutional backbone and Feature Pyramid Network are combined to provide a shared representation that benefits each of three model heads: text localization, classification, and text recognition. To improve recognition accuracy, we describe a dynamic pooling mechanism that retains high-resolution information across all RoIs. For text recognition, we propose a convolutional mechanism with attention which out-performs more common recurrent architectures. Our model is evaluated against benchmark datasets and comparable methods and achieves high performance in challenging regimes of non-traditional OCR.

READ FULL TEXT
research
06/14/2019

Towards End-to-End Text Spotting in Natural Scenes

Text spotting in natural scene images is of great importance for many im...
research
04/16/2021

TeLCoS: OnDevice Text Localization with Clustering of Script

Recent research in the field of text localization in a resource constrai...
research
09/12/2016

Detecting Text in Natural Image with Connectionist Text Proposal Network

We propose a novel Connectionist Text Proposal Network (CTPN) that accur...
research
03/20/2018

Text Detection and Recognition in images: A survey

Text Detection and recognition is a one of the important aspect of image...
research
05/15/2017

WordFence: Text Detection in Natural Images with Border Awareness

In recent years, text recognition has achieved remarkable success in rec...
research
10/07/2013

End-to-End Text Recognition with Hybrid HMM Maxout Models

The problem of detecting and recognizing text in natural scenes has prov...
research
08/05/2022

GLASS: Global to Local Attention for Scene-Text Spotting

In recent years, the dominant paradigm for text spotting is to combine t...

Please sign up or login with your details

Forgot password? Click here to reset