Query by String word spotting based on character bi-gram indexing

05/28/2015
by Suman K. Ghosh, et al.

In this paper we propose a segmentation-free query by string word spotting method. Both the documents and the query strings are encoded using a recently proposed word representation that projects images and strings into a common attribute space based on a pyramidal histogram of characters (PHOC). These attribute models are learned using linear SVMs over the Fisher Vector representation of the images, along with the PHOC labels of the corresponding strings. In order to search through the whole page, document regions are indexed per character bi-gram using a similar attribute representation. On top of that, we propose an integral image representation of the document, using a simplified version of the attribute model, for efficient computation. Finally, we introduce a re-ranking step in order to boost retrieval performance. We show state-of-the-art results for segmentation-free query by string word spotting on standard single-writer and multi-writer datasets.
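To make the PHOC encoding of a query string concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a lowercase-letters-plus-digits alphabet, pyramid levels 1 to 3, and the usual rule that a character belongs to a region when at least half of the character's span falls inside it. The function name `phoc` and its parameters are hypothetical, and the bi-gram part of the representation is omitted.

```python
import string

# Assumed alphabet: lowercase letters and digits (not specified in the abstract).
ALPHABET = string.ascii_lowercase + string.digits


def phoc(word, levels=(1, 2, 3)):
    """Return a binary pyramidal histogram of characters for `word`.

    For each pyramid level L the word is split into L equal regions, and
    each region gets one bit per alphabet symbol. A character is assigned
    to a region when at least 50% of its span overlaps that region.
    """
    word = word.lower()
    n = len(word)
    vec = []
    for level in levels:
        for region in range(level):
            bits = [0] * len(ALPHABET)
            # Work in integer units of 1 / (n * level) to avoid float rounding.
            r_start, r_end = region * n, (region + 1) * n
            for k, ch in enumerate(word):
                if ch not in ALPHABET:
                    continue  # symbols outside the assumed alphabet are ignored
                c_start, c_end = k * level, (k + 1) * level
                overlap = min(c_end, r_end) - max(c_start, r_start)
                # Character span has length `level`; require overlap >= half of it.
                if 2 * overlap >= level:
                    bits[ALPHABET.index(ch)] = 1
            vec.extend(bits)
    return vec
```

With this sketch, a query string and a candidate image region can be compared in the same space once the image side predicts one score per PHOC dimension, for example with a dot product or cosine similarity between the two vectors.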


research
12/20/2017

Attribute CNNs for Word Spotting in Handwritten Documents

Word spotting has become a field of strong research interest in document...
research
08/07/2023

Keyword Spotting Simplified: A Segmentation-Free Approach using Character Counting and CTC re-scoring

Recent advances in segmentation-free keyword spotting treat this problem...
research
06/09/2021

Learning to Rank Words: Optimizing Ranking Metrics for Word Spotting

In this paper, we explore and evaluate the use of ranking-based objectiv...
research
07/16/2018

Combining a Context Aware Neural Network with a Denoising Autoencoder for Measuring String Similarities

Measuring similarities between strings is central for many established a...
research
06/25/2018

Handling Massive N-Gram Datasets Efficiently

This paper deals with the two fundamental problems concerning the handli...
research
07/23/2019

Optimal Transport-based Alignment of Learned Character Representations for String Similarity

String similarity models are vital for record linkage, entity resolution...
research
11/04/2020

Neural text normalization leveraging similarities of strings and sounds

We propose neural models that can normalize text by considering the simi...
