What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels

03/07/2021
by Jeonghun Baek, et al.

Scene text recognition (STR) has a common practice: all state-of-the-art STR models are trained on large synthetic data. In contrast to this practice, training STR models on only a small amount of real labeled data (STR with fewer labels) is important when we have to train STR models without synthetic data: for handwritten or artistic texts that are difficult to generate synthetically, and for languages other than English, for which synthetic data is not always available. However, it has been implicit common knowledge that training STR models on real data is nearly impossible because real data is insufficient. We consider that this common knowledge has obstructed the study of STR with fewer labels. In this work, we aim to reactivate STR with fewer labels by disproving this common knowledge. We consolidate recently accumulated public real data and show that we can train STR models satisfactorily with only real labeled data. Subsequently, we find simple data augmentation that fully exploits real data. Furthermore, we improve the models by collecting unlabeled data and introducing semi- and self-supervised methods. As a result, we obtain a model that is competitive with state-of-the-art methods. To the best of our knowledge, this is the first study that 1) shows sufficient performance using only real labels and 2) introduces semi- and self-supervised methods into STR with fewer labels. Our code and data are available at https://github.com/ku21fan/STR-Fewer-Labels


