A BLSTM Network for Printed Bengali OCR System with High Accuracy

08/23/2019
by Debabrata Paul, et al.

This paper presents an OCR system for printed Bengali and English text, built on a BLSTM-CTC architecture with a single hidden layer of 128 units. We used neither peephole connections nor dropout in the BLSTM, which helped us obtain better accuracy. The network was trained on 47,720 text lines, which also include English words. When tested on 20 different Bengali fonts, it achieved a character-level accuracy of 99.32% [...] 96.65% [...] sometimes recognizes a Bengali character as the same character in a non-Bengali script, especially Assamese, which is indistinguishable from Bengali except for a few characters. For example, the Bengali character for 'RA' is sometimes recognized as its Assamese counterpart, mainly in conjunct consonant forms. Our OCR is free from such errors. The system is available online at https://banglaocr.nltr.org
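
To make the "CTC" part of the BLSTM-CTC architecture concrete, the following is a minimal sketch of greedy (best-path) CTC decoding: pick the most likely label in each frame, merge consecutive repeats, then drop blanks. The abstract does not say which decoding strategy the authors use, so this is an illustration of the general technique and not their implementation. The function name `ctc_greedy_decode`, the blank index 0, and the toy three-symbol alphabet are all assumptions made for the example.

```python
from itertools import groupby

BLANK = 0  # assumption: index 0 is reserved for the CTC blank label


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Best-path CTC decoding over per-frame label scores.

    frame_scores: list of per-frame score lists, one score per label.
    alphabet: list mapping a label index to its output character.
    """
    # 1. Take the highest-scoring label index in every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Merge runs of identical consecutive labels into one label.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Remove blanks and map the remaining labels to characters.
    return "".join(alphabet[label] for label in collapsed if label != blank)


# Hypothetical toy alphabet; a real system would list Bengali and
# English characters here.
ALPHABET = ["", "a", "b"]

# Hypothetical network output for five frames (one row per frame).
frames = [
    [0.10, 0.80, 0.10],  # 'a'
    [0.10, 0.70, 0.20],  # 'a' again, merged with the previous frame
    [0.90, 0.05, 0.05],  # blank, separates the two 'a' characters
    [0.20, 0.70, 0.10],  # 'a'
    [0.10, 0.20, 0.70],  # 'b'
]

decoded = ctc_greedy_decode(frames, ALPHABET)
print(decoded)
```

The blank frame is what allows a doubled character to survive the merge step: without it, the two runs of 'a' would collapse into one.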
