Large Scale Font Independent Urdu Text Recognition System

05/14/2020
by   Atique ur Rehman, et al.
0

OCR algorithms have received a significant improvement in performance recently, mainly due to the increase in the capabilities of artificial intelligence algorithms. However, this advancement is not evenly distributed over all languages. Urdu is among the languages which did not receive much attention, especially in the font independent perspective. There exists no automated system that can reliably recognize printed Urdu text in images and videos across different fonts. To help bridge this gap, we have developed Qaida, a large scale data set with 256 fonts, and a complete Urdu lexicon. We have also developed a Convolutional Neural Network (CNN) based classification model which can recognize Urdu ligatures with 84.2 demonstrate that our recognition network can not only recognize the text in the fonts it is trained on but can also reliably recognize text in unseen (new) fonts. To this end, this paper makes following contributions: (i) we introduce a large scale, multiple fonts based data set for printed Urdu text recognition;(ii) we have designed, trained and evaluated a CNN based model for Urdu text recognition; (iii) we experiment with incremental learning methods to produce state-of-the-art results for Urdu text recognition. All the experiment choices were thoroughly validated via detailed empirical analysis. We believe that this study can serve as the basis for further improvement in the performance of font independent Urdu OCR systems.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
04/07/2016

A CNN Based Scene Chinese Text Recognition Algorithm With Synthetic Data Engine

Scene text recognition plays an important role in many computer vision a...
research
06/26/2017

VoxCeleb: a large-scale speaker identification dataset

Most existing datasets for speaker identification contain samples obtain...
research
03/26/2019

Attention Based Glaucoma Detection: A Large-scale Database and CNN Model

Recently, the attention mechanism has been successfully applied in convo...
research
02/10/2023

Text recognition on images using pre-trained CNN

A text on an image often stores important information and directly carri...
research
10/16/2019

Frequency and temporal convolutional attention for text-independent speaker recognition

Majority of the recent approaches for text-independent speaker recogniti...
research
07/17/2017

Incremental Boosting Convolutional Neural Network for Facial Action Unit Recognition

Recognizing facial action units (AUs) from spontaneous facial expression...
research
01/14/2021

FabricNet: A Fiber Recognition Architecture Using Ensemble ConvNets

Fabric is a planar material composed of textile fibers. Textile fibers a...

Please sign up or login with your details

Forgot password? Click here to reset