SRFormer: Empowering Regression-Based Text Detection Transformer with Segmentation

08/21/2023
by   Qingwen Bu, et al.

Existing text detection techniques fall into two broad groups: segmentation-based and regression-based methods. Segmentation models are more robust to font variations but require intricate post-processing, which incurs high computational overhead. Regression-based methods predict instances directly but are limited in robustness and data efficiency because they rely on high-level representations. We propose SRFormer, a unified DETR-based model that combines segmentation and regression to exploit both the inherent robustness of segmentation representations and the straightforward post-processing of instance-level regression. Our empirical analysis indicates that favorable segmentation predictions can be obtained at the initial decoder layers. We therefore restrict the segmentation branches to the first few decoder layers and apply progressive regression refinement in the subsequent layers, which yields performance gains while minimizing the additional computational load from mask prediction. Furthermore, we propose a Mask-informed Query Enhancement module, which treats the segmentation result as a natural soft-ROI to pool and extract robust pixel representations; these are then used to enhance and diversify instance queries. Extensive experiments across multiple benchmarks demonstrate our method's robustness, superior training and data efficiency, and state-of-the-art performance.
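
The idea of using a segmentation result as a soft-ROI can be illustrated with a minimal sketch. The abstract does not give the module's exact formulation, so everything here is an assumption made for illustration: the function names `soft_roi_pool` and `enhance_query`, the mask-weighted average used for pooling, and the additive fusion with the query are all hypothetical and may differ from what SRFormer actually does.

```python
from typing import List

Vector = List[float]


def soft_roi_pool(features: List[List[Vector]],
                  mask: List[List[float]],
                  eps: float = 1e-6) -> Vector:
    """Mask-weighted average of pixel features.

    `features` is an H x W grid of C-dimensional vectors and `mask` is an
    H x W grid of probabilities in [0, 1]. The mask acts as a soft ROI:
    pixels with higher probability contribute more to the pooled vector.
    (Assumed formulation, not taken from the paper.)
    """
    channels = len(features[0][0])
    pooled = [0.0] * channels
    total = 0.0
    for feat_row, mask_row in zip(features, mask):
        for feat, weight in zip(feat_row, mask_row):
            total += weight
            for c in range(channels):
                pooled[c] += weight * feat[c]
    # eps guards against division by zero when the mask is empty.
    return [value / (total + eps) for value in pooled]


def enhance_query(query: Vector, pooled: Vector) -> Vector:
    """Fuse the pooled pixel representation into an instance query.

    Element-wise addition is a placeholder for whatever fusion the
    real module uses.
    """
    return [q + p for q, p in zip(query, pooled)]


# Toy 2 x 2 feature map with 2 channels; the mask selects the diagonal.
features = [
    [[1.0, 0.0], [9.0, 9.0]],
    [[9.0, 9.0], [3.0, 2.0]],
]
mask = [
    [1.0, 0.0],
    [0.0, 1.0],
]
query = [0.5, 0.5]

pooled = soft_roi_pool(features, mask)
enhanced = enhance_query(query, pooled)
```

In this toy case the pooled vector is approximately the mean of the two selected pixel features, so background pixels with zero mask probability do not influence the enhanced query.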


