Mask is All You Need: Rethinking Mask R-CNN for Dense and Arbitrary-Shaped Scene Text Detection

09/08/2021
by   Xugong Qin, et al.

Owing to its success in object detection and instance segmentation, Mask R-CNN has attracted great attention and is widely adopted as a strong baseline for arbitrary-shaped scene text detection and spotting. However, two issues remain to be settled. The first is the dense text case, which is easily neglected but important in practice. A single proposal may contain multiple instances, which makes it difficult for the mask head to distinguish between them and degrades performance. In this work, we argue that this degradation results from learning confusion in the mask head. We propose replacing the "deconv-conv" decoder in the mask head with an MLP decoder, which alleviates the issue and significantly improves robustness. We also propose instance-aware mask learning, in which the mask head learns to predict the shape of the whole instance rather than classify each pixel as text or non-text. With instance-aware mask learning, the mask branch learns separated and compact masks. The second issue is that, because of large variations in scale and aspect ratio, the RPN needs complicated anchor settings, which are hard to maintain and to transfer across datasets. To settle this issue, we propose an adaptive label assignment in which all instances, especially those with extreme aspect ratios, are guaranteed to be associated with enough anchors. Equipped with these components, the proposed method, named MAYOR, achieves state-of-the-art performance on five benchmarks: DAST1500, MSRA-TD500, ICDAR2015, CTW1500, and Total-Text.
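The abstract does not spell out the adaptive label assignment rule, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's method. It assumes axis-aligned boxes, a conventional IoU threshold for positives, and a hypothetical top-up step that gives every ground-truth instance at least `min_pos` overlapping anchors. The names `assign_labels`, `pos_thresh`, and `min_pos` are illustrative and do not come from the paper.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), axis-aligned


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_labels(
    anchors: List[Box],
    gt_boxes: List[Box],
    pos_thresh: float = 0.7,  # assumed threshold, not taken from the paper
    min_pos: int = 2,         # hypothetical minimum positives per instance
) -> List[int]:
    """Return, per anchor, the index of its matched instance or -1."""
    labels = [-1] * len(anchors)
    best = [0.0] * len(anchors)

    # Step 1: conventional rule, positive if IoU reaches the threshold.
    for g, gt in enumerate(gt_boxes):
        for a, anchor in enumerate(anchors):
            v = iou(anchor, gt)
            if v >= pos_thresh and v > best[a]:
                labels[a], best[a] = g, v

    # Step 2 (assumption): top up instances that have too few positives
    # with their highest-IoU unassigned anchors that overlap them at all.
    for g, gt in enumerate(gt_boxes):
        have = sum(1 for label in labels if label == g)
        if have >= min_pos:
            continue
        free = [a for a in range(len(anchors)) if labels[a] == -1]
        free.sort(key=lambda a: iou(anchors[a], gt), reverse=True)
        for a in free[: min_pos - have]:
            if iou(anchors[a], gt) > 0:
                labels[a] = g
    return labels


# A long, thin text line overlaps each square anchor only weakly,
# so a fixed threshold alone leaves it without any positive anchor.
anchors = [(0, 0, 10, 10), (10, 0, 20, 10), (20, 0, 30, 10), (100, 100, 110, 110)]
gt_boxes = [(0, 4, 30, 6)]
threshold_only = assign_labels(anchors, gt_boxes, min_pos=0)
with_top_up = assign_labels(anchors, gt_boxes, min_pos=2)
```

In this toy case the thin instance gets no positives from the threshold alone, while the top-up step assigns it two anchors without changing the anchor set. That is the kind of behavior the abstract attributes to adaptive assignment, although the actual rule used by MAYOR may differ.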

research
07/18/2020

Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting

Recent end-to-end trainable methods for scene text spotting, integrating...
research
08/14/2023

Towards Robust Real-Time Scene Text Detection: From Semantic to Instance Representation Learning

Due to the flexible representation of arbitrary-shaped scene text and si...
research
03/28/2019

Pyramid Mask Text Detector

Scene text detection, an essential step of scene text recognition system...
research
12/08/2020

MANGO: A Mask Attention Guided One-Stage Scene Text Spotter

Recently end-to-end scene text spotting has become a popular research to...
research
04/01/2021

Arbitrary-Shaped Text Detection with Adaptive Text Region Representation

Text detection/localization, as an important task in computer vision, ha...
research
06/27/2022

TextDCT: Arbitrary-Shaped Text Detection via Discrete Cosine Transform Mask

Arbitrary-shaped scene text detection is a challenging task due to the v...
research
11/30/2020

BOTD: Bold Outline Text Detector

Recently, text detection for arbitrary shape has attracted more and more...
