Text Recognition in Real Scenarios with a Few Labeled Samples

06/22/2020
by   Jinghuang Lin, et al.

Scene text recognition (STR) remains an active research topic in computer vision because of its many applications. Existing work mainly focuses on learning a general model from a huge number of synthetic text images to recognize unconstrained scene text, and has achieved substantial progress. However, these methods are not well suited to many real-world scenarios where 1) high recognition accuracy is required, while 2) labeled samples are scarce. To tackle this challenging problem, this paper proposes a few-shot adversarial sequence domain adaptation (FASDA) approach that builds sequence adaptation between the synthetic source domain (with many synthetic labeled samples) and a specific target domain (with only some, or just a few, real labeled samples). This is done by simultaneously learning each character's feature representation with an attention mechanism and establishing the corresponding character-level latent subspace with adversarial learning. Our approach maximizes the character-level confusion between the source domain and the target domain, and thus achieves sequence-level adaptation with even a small number of labeled samples in the target domain. Extensive experiments on various datasets show that our method significantly outperforms the fine-tuning scheme and obtains performance comparable to state-of-the-art STR methods.
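
The two ingredients named in the abstract, attention-based character features and adversarial confusion between domains, can be illustrated with a minimal sketch. This is not the paper's implementation: the dot-product attention, the binary linear domain discriminator, and every function name below (`attend`, `discriminator_prob`, `discriminator_loss`, `confusion_loss`) are assumptions chosen for illustration, and the actual FASDA losses may differ.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(frames, query):
    """Pool a sequence of frame features into one character-level feature.

    frames: list of equal-length feature vectors (one per image column).
    query:  hypothetical decoder state for the character being read.
    Uses plain dot-product attention (an assumption, not the paper's scorer).
    """
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    return [sum(w * frame[d] for w, frame in zip(weights, frames))
            for d in range(dim)]

def discriminator_prob(feature, w, b):
    """Hypothetical linear discriminator: P(feature comes from the source domain)."""
    z = sum(wi * fi for wi, fi in zip(w, feature)) + b
    return 1.0 / (1.0 + math.exp(-z))

def _clamp(p, eps=1e-12):
    """Keep probabilities away from 0 and 1 so log() stays finite."""
    return min(max(p, eps), 1.0 - eps)

def discriminator_loss(p_source, is_source):
    """Binary cross-entropy: the discriminator tries to tell the domains apart."""
    p = _clamp(p_source)
    return -math.log(p) if is_source else -math.log(1.0 - p)

def confusion_loss(p_source):
    """Cross-entropy against a uniform domain label.

    The feature extractor minimizes this, which pushes the discriminator's
    output toward 0.5, i.e. toward maximal confusion between the domains.
    """
    p = _clamp(p_source)
    return -0.5 * (math.log(p) + math.log(1.0 - p))

if __name__ == "__main__":
    frames = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    char_feature = attend(frames, query=[2.0, 0.0])
    p = discriminator_prob(char_feature, w=[1.0, -1.0], b=0.0)
    print("character feature:", char_feature)
    print("discriminator loss (source sample):", discriminator_loss(p, True))
    print("confusion loss for the encoder:", confusion_loss(p))
```

In this sketch the confusion loss is smallest when the discriminator outputs 0.5 for a character feature, which is one simple way to express "maximizing character-level confusion"; applying it per attended character rather than per image is what makes the adaptation character-level.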

research
01/12/2020

Multi-source Domain Adaptation for Visual Sentiment Classification

Existing domain adaptation methods on visual sentiment classification ty...
research
12/07/2022

Cyclically Disentangled Feature Translation for Face Anti-spoofing

Current domain adaptation methods for face anti-spoofing leverage labele...
research
04/02/2020

Alleviating Semantic-level Shift: A Semi-supervised Domain Adaptation Method for Semantic Segmentation

Learning segmentation from synthetic data and adapting to real data can ...
research
02/24/2022

SMILE: Sequence-to-Sequence Domain Adaption with Minimizing Latent Entropy for Text Image Recognition

Training recognition models with synthetic images have achieved remarkab...
research
05/11/2018

Exploiting Images for Video Recognition with Hierarchical Generative Adversarial Networks

Existing deep learning methods of video recognition usually require a la...
research
06/27/2021

Few-Shot Domain Expansion for Face Anti-Spoofing

Face anti-spoofing (FAS) is an indispensable and widely used module in f...
research
04/17/2019

TextCaps : Handwritten Character Recognition with Very Small Datasets

Many localized languages struggle to reap the benefits of recent advance...
