CENSUS-HWR: a large training dataset for offline handwriting recognition

05/25/2023
by   Chetan Joshi, et al.

Progress in automated handwriting recognition has been hampered by the lack of large training datasets. Nearly all research relies on a handful of small datasets, on which models tend to overfit. We present CENSUS-HWR, a new dataset of full English handwritten words in 1,812,014 grayscale images. The collection contains a total of 1,865,134 handwritten texts drawn from a vocabulary of 10,711 English words. It is intended to serve as a benchmark for deep learning handwriting recognition models. The dataset was extracted from the 1930 and 1940 US censuses, each taken by approximately 70,000 enumerators. The dataset and the trained model, with its weights, are freely available for download at https://censustree.org/data.html.


research
08/18/2020

Image Pre-processing on NumtaDB for Bengali Handwritten Digit Recognition

NumtaDB is by far the largest data-set collection for handwritten digits...
research
10/23/2018

PreCo: A Large-scale Dataset in Preschool Vocabulary for Coreference Resolution

We introduce PreCo, a large-scale English dataset for coreference resolu...
research
03/13/2023

Handwritten Word Recognition using Deep Learning Approach: A Novel Way of Generating Handwritten Words

A handwritten word recognition system comes with issues such as lack of ...
research
06/06/2018

NumtaDB - Assembled Bengali Handwritten Digits

To benchmark Bengali digit recognition algorithms, a large publicly avai...
research
08/22/2018

A syllable based model for handwriting recognition

In this paper, we introduce a new modeling approach of texts for handwri...
research
07/05/2023

A Dataset of Inertial Measurement Units for Handwritten English Alphabets

This paper presents an end-to-end methodology for collecting datasets to...
research
11/22/2021

Many Heads but One Brain: an Overview of Fusion Brain Challenge on AI Journey 2021

Supporting the current trend in the AI community, we propose the AI Jour...
