SPTS v2: Single-Point Scene Text Spotting

01/04/2023
by   Yuliang Liu, et al.

End-to-end scene text spotting has made significant progress due to the intrinsic synergy between text detection and recognition. Previous methods commonly treat manual annotations such as horizontal rectangles, rotated rectangles, quadrangles, and polygons as a prerequisite, and these are much more expensive to produce than single-point annotations. For the first time, we demonstrate that scene text spotting models can be trained with extremely low-cost single-point annotations using the proposed framework, termed SPTS v2. SPTS v2 retains the advantage of the auto-regressive Transformer through an Instance Assignment Decoder (IAD), which sequentially predicts the center points of all text instances within a single prediction sequence, while a Parallel Recognition Decoder (PRD) performs text recognition in parallel. The two decoders share the same parameters and are interactively connected by a simple but effective information transmission process that passes gradients and information between them. Comprehensive experiments on various existing benchmark datasets demonstrate that SPTS v2 can outperform previous state-of-the-art single-point text spotters with fewer parameters while achieving 14x faster inference. Most importantly, within the scope of SPTS v2, extensive experiments further reveal an important phenomenon: a single point serves as the optimal setting for scene text spotting compared to non-point, rectangular bounding box, and polygonal bounding box representations. Such an attempt provides a significant opportunity for scene text spotting applications beyond the realms of existing paradigms. Code is available at https://github.com/shannanyinxiang/SPTS.
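
As a rough illustration of what "the center points of all text instances inside the same predicting sequence" could look like as a training target, the sketch below reduces polygon annotations to single points and flattens them into one sequence of discrete coordinate tokens. It is not taken from the SPTS v2 code. The vertex-mean center, the bin count `n_bins`, the `[x1, y1, x2, y2, ...]` layout, and both function names are assumptions made for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def center_point(polygon: List[Point]) -> Point:
    """Mean of the polygon vertices (an assumed stand-in for a single-point label)."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def build_point_sequence(
    polygons: List[List[Point]],
    width: int,
    height: int,
    n_bins: int = 1000,  # assumed quantization resolution, not from the abstract
) -> List[int]:
    """Flatten every instance center into one token sequence [x1, y1, x2, y2, ...]."""
    sequence: List[int] = []
    for polygon in polygons:
        cx, cy = center_point(polygon)
        # Map each coordinate to an integer bin in [0, n_bins - 1].
        sequence.append(min(n_bins - 1, int(cx / width * n_bins)))
        sequence.append(min(n_bins - 1, int(cy / height * n_bins)))
    return sequence


if __name__ == "__main__":
    # Two hypothetical text instances in a 200 x 100 image.
    boxes = [
        [(0, 0), (100, 0), (100, 50), (0, 50)],
        [(100, 50), (200, 50), (200, 100), (100, 100)],
    ]
    print(build_point_sequence(boxes, width=200, height=100))
```

Each instance contributes only two tokens here regardless of its shape, whereas a polygon target grows with the number of vertices. This is one way to read the annotation-cost argument in the abstract.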

research
12/15/2021

SPTS: Single-Point Text Spotting

Almost all scene text spotting (detection and recognition) methods rely ...
research
04/05/2022

Text Spotting Transformers

In this paper, we present TExt Spotting TRansformers (TESTR), a generic ...
research
06/06/2023

Looking and Listening: Audio Guided Text Recognition

Text recognition in the wild is a long-standing problem in computer visi...
research
12/20/2019

Exploring the Capacity of Sequential-free Box Discretization Network for Omnidirectional Scene Text Detection

Omnidirectional scene text detection has received increasing research at...
research
07/17/2023

Box-DETR: Understanding and Boxing Conditional Spatial Queries

Conditional spatial queries are recently introduced into DEtection TRans...
research
08/20/2023

ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer

In recent years, end-to-end scene text spotting approaches are evolving ...
research
07/15/2022

Decoupling Recognition from Detection: Single Shot Self-Reliant Scene Text Spotter

Typical text spotters follow the two-stage spotting strategy: detect the...
