Geometry Normalization Networks for Accurate Scene Text Detection

09/02/2019
by   Youjiang Xu, et al.
0

Large geometry (e.g., orientation) variances are the key challenges in the scene text detection. In this work, we first conduct experiments to investigate the capacity of networks for learning geometry variances on detecting scene texts, and find that networks can handle only limited text geometry variances. Then, we put forward a novel Geometry Normalization Module (GNM) with multiple branches, each of which is composed of one Scale Normalization Unit and one Orientation Normalization Unit, to normalize each text instance to one desired canonical geometry range through at least one branch. The GNM is general and readily plugged into existing convolutional neural network based text detectors to construct end-to-end Geometry Normalization Networks (GNNets). Moreover, we propose a geometry-aware training scheme to effectively train the GNNets by sampling and augmenting text instances from a uniform geometry variance distribution. Finally, experiments on popular benchmarks of ICDAR 2015 and ICDAR 2017 MLT validate that our method outperforms all the state-of-the-art approaches remarkably by obtaining one-forward test F-scores of 88.52 and 74.54 respectively.

READ FULL TEXT

page 2

page 8

research
11/19/2020

Scene text removal via cascaded text stroke detection and erasing

Recent learning-based approaches show promising performance improvement ...
research
07/04/2018

TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes

Driven by deep neural networks and large scale datasets, scene text dete...
research
08/19/2019

UprightNet: Geometry-Aware Camera Orientation Estimation from Single Images

We introduce UprightNet, a learning-based approach for estimating 2DoF c...
research
07/27/2023

NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View 3D Object Detection

We present NeRF-Det, a novel method for indoor 3D detection with posed R...
research
08/20/2019

An End-to-end Video Text Detector with Online Tracking

Video text detection is considered as one of the most difficult tasks in...
research
12/04/2017

Why my photos look sideways or upside down? Detecting Canonical Orientation of Images using Convolutional Neural Networks

Image orientation detection requires high-level scene understanding. Hum...
research
05/03/2021

Collision Replay: What Does Bumping Into Things Tell You About Scene Geometry?

What does bumping into things in a scene tell you about scene geometry? ...

Please sign up or login with your details

Forgot password? Click here to reset