Towards End-to-End Text Spotting in Natural Scenes

06/14/2019
by Hui Li, et al.

Text spotting in natural scene images is of great importance for many image understanding tasks. It comprises two sub-tasks: text detection and text recognition. In this work, we propose a unified network that simultaneously localizes and recognizes text in a single forward pass, avoiding intermediate steps such as image cropping, feature re-calculation, word separation, and character grouping. In contrast to existing approaches, which treat text detection and recognition as two distinct tasks and tackle them one after the other, the proposed framework solves both tasks concurrently. The whole framework can be trained end-to-end and is able to handle text of arbitrary shapes. The convolutional features are calculated only once and are shared by the detection and recognition modules. Through multi-task training, the learned features become more discriminative, which improves overall performance. By employing a 2D attention model in word recognition, the irregularity of text can be robustly addressed. The attention model provides the spatial location of each character, which not only helps local feature extraction in word recognition but also indicates an orientation angle that is used to refine text localization. The proposed method achieves state-of-the-art performance on several standard text spotting benchmarks, including both regular and irregular ones.
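
The idea that per-character attention yields both a character location and an orientation cue can be illustrated with a small sketch. This is not the authors' implementation: the function names (`softmax_2d`, `attention_center`, `orientation_angle`) are hypothetical, the location is assumed to be the attention-weighted mean position on the feature map, and the angle is assumed to be taken along the line from the first to the last character center.

```python
import math


def softmax_2d(scores):
    """Normalize a 2D grid of attention scores so the weights sum to 1."""
    peak = max(max(row) for row in scores)  # subtract the max for numerical stability
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]


def attention_center(weights):
    """Return the attention-weighted mean (x, y) position on the feature map.

    Assumption: a character's location is approximated by the centroid
    of its 2D attention map (x = column index, y = row index).
    """
    x = sum(w * col for row in weights for col, w in enumerate(row))
    y = sum(w * r for r, row in enumerate(weights) for w in row)
    return x, y


def orientation_angle(centers):
    """Return the angle, in degrees, of the line from the first to the last center.

    Assumption: the word orientation is estimated from the two end characters.
    In image coordinates y grows downward, so a positive angle means the
    text runs toward the bottom of the image.
    """
    (x0, y0), (x1, y1) = centers[0], centers[-1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0))


# Toy example: two decoding steps, each with a sharply peaked attention map.
step_scores = [
    [[9.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],  # peak at top-left
    [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 9.0]],  # peak at bottom-right
]
centers = [attention_center(softmax_2d(s)) for s in step_scores]
angle = orientation_angle(centers)  # close to 45 degrees for this diagonal layout
```

In a real system the attention maps would come from the recognition decoder, one map per predicted character, and the estimated angle would be used to adjust the detected text region.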


research
07/13/2017

Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks

In this work, we jointly address the problem of text detection and recog...
research
12/24/2018

TextNet: Irregular Text Reading from Images with an End-to-End Trainable Network

Reading text from images remains challenging due to multi-orientation, p...
research
06/05/2017

Visual attention models for scene text recognition

In this paper we propose an approach to lexicon-free recognition of text...
research
06/21/2019

A Multitask Network for Localization and Recognition of Text in Images

We present an end-to-end trainable multi-task network that addresses the...
research
08/05/2022

GLASS: Global to Local Attention for Scene-Text Spotting

In recent years, the dominant paradigm for text spotting is to combine t...
research
07/19/2020

Character Region Attention For Text Spotting

A scene text spotter is composed of text detection and recognition modul...
research
08/25/2023

DISGO: Automatic End-to-End Evaluation for Scene Text OCR

This paper discusses the challenges of optical character recognition (OC...
