Image Pre-processing on NumtaDB for Bengali Handwritten Digit Recognition

08/18/2020
by Ovi Paul, et al.

NumtaDB is by far the largest dataset of handwritten Bengali digits. It is a diverse collection of more than 85,000 images, but this diversity also makes it very difficult to work with. The goal of this paper is to establish a benchmark of pre-processed images that gives good accuracy with any machine learning model. The motivation is that, unlike MNIST for English digits, no pre-processed data is available for Bengali digit recognition.

04/08/2020

MNIST-MIX: A Multi-language Handwritten Digit Recognition Dataset

In this letter, we contribute a multi-language handwritten digit recogni...
05/25/2023

CENSUS-HWR: a large training dataset for offline handwriting recognition

Progress in Automated Handwriting Recognition has been hampered by the l...
11/09/2020

An improved helmet detection method for YOLOv3 on an unbalanced dataset

The YOLOv3 target detection algorithm is widely used in industry due to ...
06/09/2020

Tamil Vowel Recognition With Augmented MNIST-like Data Set

We report generation of a MNIST [4] compatible data set [1] for Tamil vo...
06/28/2021

Object Detection Based Handwriting Localization

We present an object detection based approach to localize handwritten re...
04/28/2019

TMIXT: A process flow for Transcribing MIXed handwritten and machine-printed Text

Handling large corpuses of documents is of significant importance in man...
05/05/2022

OCR Synthetic Benchmark Dataset for Indic Languages

We present the largest publicly available synthetic OCR benchmark datase...