ShopSign: a Diverse Scene Text Dataset of Chinese Shop Signs in Street Views

03/25/2019
by Chongsheng Zhang, et al.

In this paper, we introduce ShopSign, a newly developed natural scene text dataset of Chinese shop signs in street views. Although a few scene text datasets are already publicly available (e.g. ICDAR2015, COCO-Text), few of their images contain Chinese text. Hence, we collect and annotate the ShopSign dataset to advance research in Chinese scene text detection and recognition. The new dataset has three distinctive characteristics: (1) Large scale: it contains 25,362 Chinese shop sign images with a total of 196,010 text lines. (2) Diversity: the images were captured in different scenes, from downtown to developing regions, using more than 50 different mobile phones. (3) Difficulty: the dataset is very sparse and imbalanced, and it includes five categories of hard images (mirror, wooden, deformed, exposed and obscure). To illustrate the challenges in ShopSign, we run baseline experiments using state-of-the-art scene text detection methods (including CTPN, TextBoxes++ and EAST), and perform cross-dataset validation to compare their performance on related datasets such as CTW, RCTW and the ICPR 2018 MTWI challenge dataset. Sample images and detailed descriptions of the ShopSign dataset are publicly available at: https://github.com/chongshengzhang/shopsign.

