DEER: Detection-agnostic End-to-End Recognizer for Scene Text Spotting

03/10/2022
by Seonghyeon Kim, et al.

Recent end-to-end scene text spotters have achieved great improvement in recognizing arbitrary-shaped text instances. Common approaches for text spotting use region-of-interest pooling or segmentation masks to restrict features to single text instances. However, this makes it hard for the recognizer to decode correct sequences when the detection is not accurate, i.e., when one or more characters are cropped out. Considering that it is hard to accurately decide word boundaries with only the detector, we propose a novel Detection-agnostic End-to-End Recognizer (DEER) framework. The proposed method reduces the tight dependency between the detection and recognition modules by bridging them with a single reference point for each text instance, instead of using detected regions. This allows the decoder to recognize the text indicated by the reference point, using features from the whole image. Since only a single point is required to recognize the text, the proposed method enables text spotting without an arbitrarily shaped detector or bounding polygon annotations. Experimental results show that the proposed method achieves competitive results on regular and arbitrarily shaped text spotting benchmarks. Further analysis shows that DEER is robust to detection errors. The code and dataset will be publicly available.
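
The contrast drawn in the abstract can be made concrete with a toy sketch. This is not the DEER architecture: the `Char` type, both functions, and the gap-based grouping rule are hypothetical stand-ins chosen for illustration, with the grouping rule taking the place of a learned decoder. It only shows why a recognizer restricted to a detected region loses characters when the region is too tight, while a recognizer given a single reference point and access to the whole image does not.

```python
from dataclasses import dataclass


@dataclass
class Char:
    x: float  # horizontal centre of the character in the toy "image"
    ch: str   # the character itself


def recognize_from_crop(chars, x_min, x_max):
    """Crop-style recognizer: sees only characters inside the detected box."""
    inside = [c for c in sorted(chars, key=lambda c: c.x) if x_min <= c.x <= x_max]
    return "".join(c.ch for c in inside)


def recognize_from_point(chars, ref_x, max_gap=1.5):
    """Point-style recognizer (toy): start at the character nearest to the
    reference point, then extend left and right across the whole image while
    neighbouring characters are at most max_gap apart."""
    ordered = sorted(chars, key=lambda c: c.x)
    start = min(range(len(ordered)), key=lambda i: abs(ordered[i].x - ref_x))
    lo = hi = start
    while lo > 0 and ordered[lo].x - ordered[lo - 1].x <= max_gap:
        lo -= 1
    while hi < len(ordered) - 1 and ordered[hi + 1].x - ordered[hi].x <= max_gap:
        hi += 1
    return "".join(c.ch for c in ordered[lo:hi + 1])


# Two words on one line: "STOP" at x = 0..3 and "AHEAD" at x = 6..10.
scene = [Char(float(i), ch) for i, ch in enumerate("STOP")]
scene += [Char(6.0 + i, ch) for i, ch in enumerate("AHEAD")]

# A slightly too-tight box crops out the first character.
print(recognize_from_crop(scene, 0.5, 3.2))   # TOP
# A single reference point inside the word recovers it in full.
print(recognize_from_point(scene, 1.4))       # STOP
```

In this sketch the reference point only has to fall somewhere on the word, so an imprecise localization still yields the full sequence. The abstract attributes the same kind of robustness to DEER, where a learned decoder uses the reference point to select the text from whole-image features.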

research
12/08/2020

MANGO: A Mask Attention Guided One-Stage Scene Text Spotter

Recently end-to-end scene text spotting has become a popular research to...
research
02/17/2020

Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting

Many approaches have recently been proposed to detect irregular scene te...
research
07/15/2022

Decoupling Recognition from Detection: Single Shot Self-Reliant Scene Text Spotter

Typical text spotters follow the two-stage spotting strategy: detect the...
research
04/07/2023

Towards Unified Scene Text Spotting based on Sequence Generation

Sequence generation models have recently made significant progress in un...
research
07/30/2019

Towards Pure End-to-End Learning for Recognizing Multiple Text Sequences from an Image

Here we address a challenging problem: recognizing multiple text sequenc...
research
06/06/2023

TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision

End-to-end text spotting is a vital computer vision task that aims to in...
research
08/14/2023

Towards Robust Real-Time Scene Text Detection: From Semantic to Instance Representation Learning

Due to the flexible representation of arbitrary-shaped scene text and si...
