Leveraging machine learning for less developed languages: Progress on Urdu text detection

09/28/2022
by Hazrat Ali, et al.

Text detection in natural scene images has applications in autonomous driving and in navigation assistance for elderly and blind people. However, research on Urdu text detection is usually hindered by a lack of data resources. We have developed a dataset of scene images containing Urdu text. We present the use of machine learning methods to detect Urdu text in scene images. We extract text regions using a channel-enhanced Maximally Stable Extremal Regions (MSER) method. First, we classify text and noise based on their geometric properties. Next, we use a support vector machine (SVM) for early discarding of non-text regions. To further remove non-text regions, we extract histogram of oriented gradients (HoG) features and train a second SVM classifier. This improves the overall performance of text region detection within the scene images. To support research on Urdu text, we aim to make the data freely available for research use. We also aim to highlight the challenges and the research gap for Urdu text detection.
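The geometric filtering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which properties or thresholds are used, so everything here is an assumption for illustration: the `Region` class, the choice of aspect ratio and extent (foreground pixels divided by bounding-box area) as properties, and all threshold values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical candidate region: bounding-box size and foreground pixel count."""
    width: int
    height: int
    pixel_count: int


def aspect_ratio(region: Region) -> float:
    """Width divided by height of the bounding box."""
    return region.width / region.height


def extent(region: Region) -> float:
    """Fraction of the bounding box covered by the region's pixels."""
    return region.pixel_count / (region.width * region.height)


def looks_like_text(
    region: Region,
    min_pixels: int = 30,       # assumed: very small blobs are treated as noise
    max_aspect: float = 5.0,    # assumed: very elongated boxes are treated as noise
    min_extent: float = 0.2,    # assumed: very sparse regions are treated as noise
    max_extent: float = 0.9,    # assumed: nearly solid boxes are treated as noise
) -> bool:
    """Return True if the region passes all illustrative geometric checks."""
    if region.width <= 0 or region.height <= 0:
        return False
    if region.pixel_count < min_pixels:
        return False
    ratio = aspect_ratio(region)
    if ratio > max_aspect or ratio < 1.0 / max_aspect:
        return False
    return min_extent <= extent(region) <= max_extent


# Made-up candidates, standing in for regions an MSER detector might return.
candidates = [
    Region(width=40, height=30, pixel_count=500),    # moderate shape and fill
    Region(width=200, height=4, pixel_count=700),    # very elongated
    Region(width=5, height=5, pixel_count=10),       # too few pixels
    Region(width=50, height=50, pixel_count=2450),   # almost a solid block
]

kept = [region for region in candidates if looks_like_text(region)]
```

In a pipeline like the one described, regions surviving such a cheap filter would then be passed to the SVM stages. In practice the thresholds would need tuning on labelled Urdu scene images, since connected Urdu ligatures can produce wider regions than isolated Latin characters.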


