Generating Synthetic Data for Text Recognition

08/15/2016
by Praveen Krishnan, et al.

Generating synthetic images is an art that emulates the natural process of image generation as closely as possible. In this work, we exploit such a framework for data generation in the handwritten domain. We render synthetic data using open source fonts and incorporate data augmentation schemes. As part of this work, we release a corpus of 9M synthetic handwritten word images, which could be useful for training deep network architectures and advancing performance in handwritten word spotting and recognition tasks.
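
The abstract does not spell out the rendering pipeline, so the following is only a minimal sketch of how font-based rendering with augmentation might be planned, not the authors' method. It samples one render specification per synthetic word image by pairing a random vocabulary word with a random font and randomly drawn augmentation parameters. The `RenderSpec` fields, the `sample_specs` helper, the font file names, and every parameter range (shear, rotation, padding) are assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderSpec:
    """One synthetic word image to render (all fields are hypothetical)."""
    word: str
    font: str            # name or path of an open source font file
    shear: float         # horizontal shear factor, imitating slant
    rotation_deg: float  # small in-plane rotation in degrees
    padding_px: int      # blank margin around the word in pixels


def sample_specs(vocabulary, fonts, n, seed=0):
    """Draw n render specs with randomly sampled augmentation parameters.

    The parameter ranges below are illustrative assumptions, not values
    taken from the paper.
    """
    if not vocabulary or not fonts:
        raise ValueError("vocabulary and fonts must be non-empty")
    rng = random.Random(seed)  # seeded so the sampled plan is reproducible
    specs = []
    for _ in range(n):
        specs.append(RenderSpec(
            word=rng.choice(vocabulary),
            font=rng.choice(fonts),
            shear=rng.uniform(-0.5, 0.5),
            rotation_deg=rng.uniform(-5.0, 5.0),
            padding_px=rng.randint(0, 8),
        ))
    return specs


# Placeholder vocabulary and font names, used only to exercise the sketch.
vocabulary = ["word", "image", "corpus"]
fonts = ["FontA.ttf", "FontB.ttf"]
specs = sample_specs(vocabulary, fonts, n=5)
for spec in specs:
    print(spec)
```

Each specification could then be handed to an image library of choice to draw the word in the given font and apply the sampled transforms. Keeping the sampling separate from the drawing step makes the plan reproducible from a single seed.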

research
04/17/2018

Synthetic data generation for Indic handwritten text recognition

This paper presents a novel approach to generate synthetic dataset for h...
research
02/17/2018

HWNet v2: An Efficient Word Image Representation for Handwritten Documents

We present a framework for learning efficient holistic representation fo...
research
10/27/2020

Improving Text Relationship Modeling with Artificial Data

Data augmentation uses artificially-created examples to support supervis...
research
03/13/2023

Handwritten Word Recognition using Deep Learning Approach: A Novel Way of Generating Handwritten Words

A handwritten word recognition system comes with issues such as lack of ...
research
09/18/2019

Unsupervised Writer Adaptation for Synthetic-to-Real Handwritten Word Recognition

Handwritten Text Recognition (HTR) is still a challenging problem becaus...
research
05/11/2021

One-shot Compositional Data Generation for Low Resource Handwritten Text Recognition

Low resource Handwritten Text Recognition (HTR) is a hard problem due to...
research
03/06/2023

EEG Synthetic Data Generation Using Probabilistic Diffusion Models

Electroencephalography (EEG) plays a significant role in the Brain Compu...
