Arbitrary Shape Text Detection using Transformers

02/22/2022
by Zobeir Raisi, et al.

Recent text detection frameworks require several handcrafted components, such as anchor generation, non-maximum suppression (NMS), or multiple processing stages (e.g. label generation), to detect arbitrarily shaped text in images. In contrast, we propose an end-to-end trainable architecture based on the DEtection TRansformer (DETR) that outperforms previous state-of-the-art methods in arbitrary-shaped text detection. At its core, our proposed method leverages a bounding box loss function that accurately measures changes in the scale and aspect ratio of arbitrarily shaped detected text regions. This is made possible by a hybrid shape representation built from Bezier curves, which are further split into piece-wise polygons. The proposed loss function combines a generalized-split-intersection-over-union loss defined over the piece-wise polygons with a Smooth-ln regression over the Bezier curves' control points, which acts as a regularizer. We evaluate our proposed model on the Total-Text and CTW-1500 datasets for curved text, and on the MSRA-TD500 and ICDAR15 datasets for multi-oriented text, and show that it outperforms previous state-of-the-art methods in arbitrary-shape text detection tasks.
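
To make the hybrid shape representation more concrete, the following is a minimal sketch of how a text region bounded by two Bezier curves could be split into piece-wise polygons. It is an illustration only and not the authors' implementation: the function names (`bezier_point`, `sample_curve`, `split_into_quads`, `polygon_area`), the choice of quadrilateral pieces, and the assumption that the top and bottom curves run in the same direction are all hypothetical. The loss terms themselves are not reproduced here, since the abstract does not give their exact form.

```python
from math import comb


def bezier_point(ctrl, t):
    """Evaluate a Bezier curve at parameter t in [0, 1] using Bernstein polynomials.

    ctrl is a list of (x, y) control points; four points give a cubic curve.
    """
    n = len(ctrl) - 1
    weights = [comb(n, i) * (1 - t) ** (n - i) * t ** i for i in range(n + 1)]
    x = sum(w * px for w, (px, _) in zip(weights, ctrl))
    y = sum(w * py for w, (_, py) in zip(weights, ctrl))
    return (x, y)


def sample_curve(ctrl, num_points):
    """Sample num_points evenly spaced parameter values along the curve."""
    return [bezier_point(ctrl, k / (num_points - 1)) for k in range(num_points)]


def split_into_quads(top_ctrl, bottom_ctrl, num_pieces):
    """Split the region between two curves into num_pieces quadrilaterals.

    Assumption (not from the paper): both curves are parameterized in the
    same direction, e.g. left to right.
    """
    top = sample_curve(top_ctrl, num_pieces + 1)
    bottom = sample_curve(bottom_ctrl, num_pieces + 1)
    return [
        [top[k], top[k + 1], bottom[k + 1], bottom[k]]
        for k in range(num_pieces)
    ]


def polygon_area(poly):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0
```

Calling `split_into_quads(top_ctrl, bottom_ctrl, num_pieces)` with two lists of four control points returns a list of four-vertex polygons. An overlap-based loss could then be evaluated piece by piece, while a regression term would act directly on the control points.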

research
04/05/2022

Text Spotting Transformers

In this paper, we present TExt Spotting TRansformers (TESTR), a generic ...
research
11/21/2019

All You Need Is Boundary: Toward Arbitrary-Shaped Text Spotting

Recently, end-to-end text spotting that aims to detect and recognize tex...
research
08/24/2019

Towards Unconstrained End-to-End Text Spotting

We propose an end-to-end trainable network that can simultaneously detec...
research
02/24/2020

ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network

Scene text detection and recognition has received increasing research at...
research
08/29/2023

PBFormer: Capturing Complex Scene Text Shape with Polynomial Band Transformer

We present PBFormer, an efficient yet powerful scene text detector that ...
research
06/27/2023

Efficient and Accurate Scene Text Detection with Low-Rank Approximation Network

Recently, regression-based methods, which predict parameter curves for l...
research
03/24/2017

Deep Direct Regression for Multi-Oriented Scene Text Detection

In this paper, we first provide a new perspective to divide existing hig...
