
Unsupervised Writer Adaptation for Synthetic-to-Real Handwritten Word Recognition

09/18/2019
by Lei Kang, et al.
omni:us
Universitat Autònoma de Barcelona

Handwritten Text Recognition (HTR) is still a challenging problem because it must deal with two important difficulties: the variability among writing styles and the scarcity of labelled data. To alleviate these problems, synthetic data generation and data augmentation are typically used to train HTR systems. However, training with such data produces encouraging but still inaccurate transcriptions of real words. In this paper, we propose an unsupervised writer adaptation approach that automatically adjusts a generic handwritten word recognizer, trained entirely on synthetic fonts, to a new incoming writer. We have experimentally validated our proposal on five different datasets covering several challenges: (i) the document source: modern and historic samples, which may involve paper degradation; (ii) different handwriting styles: single- and multiple-writer collections; and (iii) language, which involves different character combinations. Across these challenging collections, we show that our system maintains its performance; it thus provides a practical and generic approach for dealing with new document collections without requiring any expensive and tedious manual annotation step.

