Zone-based Keyword Spotting in Bangla and Devanagari Documents

by Ayan Kumar Bhunia, et al.

In this paper we present a word spotting system for text lines in offline Indic scripts such as Bangla (Bengali) and Devanagari. Recently, it was shown that a zone-wise recognition method improves word recognition performance over conventional full-word recognition systems in Indic scripts. Inspired by this idea, we adopt the zone segmentation approach and use middle-zone information to improve traditional word spotting performance. To avoid the problems of heuristic zone segmentation, we propose an HMM-based approach to segment the upper and lower zone components from text line images. Candidate keywords are searched for in a line without segmenting characters or words. We also propose a novel feature that combines foreground and background information of text line images for keyword spotting with character filler models. Using both foreground and background information yields a significant improvement in performance over using either one alone. The Pyramid Histogram of Oriented Gradient (PHOG) feature is used in our word spotting framework. Experiments show that the proposed zone-segmentation-based system outperforms traditional word spotting approaches.
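To make the PHOG feature mentioned in the abstract more concrete, the following is a minimal sketch of a PHOG-style descriptor for a single image window. It is not the authors' implementation: the function name `phog`, the number of pyramid levels, the number of orientation bins, and the use of unsigned orientations are all assumptions chosen for illustration, and the foreground/background combination described in the abstract is not reproduced here.

```python
import numpy as np

def phog(window, levels=2, bins=8):
    """PHOG-style descriptor of a 2-D grayscale array (illustrative sketch).

    `levels` and `bins` are assumed defaults, not values from the paper.
    """
    img = np.asarray(window, dtype=float)
    gy, gx = np.gradient(img)                  # derivatives along rows, then columns
    mag = np.hypot(gx, gy)                     # gradient magnitude per pixel
    ang = np.mod(np.arctan2(gy, gx), np.pi)    # unsigned orientation in [0, pi)
    bin_idx = np.minimum((ang / np.pi * bins).astype(int), bins - 1)

    h, w = img.shape
    feats = []
    for level in range(levels + 1):
        cells = 2 ** level                     # cells x cells grid at this level
        rows = np.linspace(0, h, cells + 1).astype(int)
        cols = np.linspace(0, w, cells + 1).astype(int)
        for r in range(cells):
            for c in range(cells):
                b = bin_idx[rows[r]:rows[r + 1], cols[c]:cols[c + 1]].ravel()
                m = mag[rows[r]:rows[r + 1], cols[c]:cols[c + 1]].ravel()
                # Magnitude-weighted orientation histogram of this cell.
                feats.append(np.bincount(b, weights=m, minlength=bins))

    vec = np.concatenate(feats)
    total = vec.sum()
    return vec / total if total > 0 else vec   # L1-normalise unless the window is flat

# Example: a window containing a single vertical edge.
window = np.zeros((16, 32))
window[:, 16:] = 1.0
descriptor = phog(window)
print(descriptor.shape)
```

With the assumed defaults, the descriptor concatenates 1 + 4 + 16 cell histograms of 8 bins each. In a sliding-window setting, one such vector per window position would form the observation sequence passed to an HMM.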




HMM-based Indic Handwritten Word Recognition using Zone Segmentation

This paper presents a novel approach towards Indic handwritten word reco...

Devnagari document segmentation using histogram approach

Document segmentation is one of the critical phases in machine recogniti...

Bangla Text Recognition from Video Sequence: A New Focus

Extraction and recognition of Bangla text from video frame images is cha...

A Multi-oriented Chinese Keyword Spotter Guided by Text Line Detection

Chinese keyword spotting is a challenging task as there is no visual bla...

Line and Word Matching in Old Documents

This paper is concerned with the problem of establishing an index based ...

Word Searching in Scene Image and Video Frame in Multi-Script Scenario using Dynamic Shape Coding

Retrieval of text information from natural scene images and video frames...

Date-Field Retrieval in Scene Image and Video Frames using Text Enhancement and Shape Coding

Text recognition in scene image and video frames is difficult because of...