Image Pre-processing on NumtaDB for Bengali Handwritten Digit Recognition

by Ovi Paul, et al.

NumtaDB is by far the largest dataset of handwritten Bengali digits. It is a diverse dataset containing more than 85,000 images, but this diversity also makes it very difficult to work with. The goal of this paper is to establish a benchmark of pre-processed images that gives good accuracy with any machine learning model. The motivation is that, unlike English digits with MNIST, no pre-processed data is available for Bengali digit recognition.


