Keyword Spotting Simplified: A Segmentation-Free Approach using Character Counting and CTC re-scoring

08/07/2023
by George Retsinas, et al.

Recent advances in segmentation-free keyword spotting treat the problem as an object detection task, borrowing from state-of-the-art detection systems to jointly learn a word bounding-box proposal mechanism and a corresponding representation. In contrast to such methods, which typically rely on large and complex DNN models, we propose a novel segmentation-free system that efficiently scans a document image to find rectangular areas that contain the query information. The underlying model is simple and compact: it predicts character occurrences over rectangular areas through an implicitly learned scale map and is trained on word-level annotated images. Document scanning is then performed on these character counts in a cost-effective manner via integral images and binary search. Finally, the retrieval similarity obtained by character counting is refined by a pyramidal representation and a CTC-based re-scoring algorithm, fully utilizing the trained CNN model. Experimental validation on two widely used datasets shows that our method achieves state-of-the-art results, outperforming more complex alternatives despite the simplicity of the underlying model.
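
The scanning step mentioned above (character counting via integral images and binary search) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single non-negative per-pixel count map for one character, whereas the paper's model predicts such counts with a CNN, and the function names `integral_image`, `box_count` and `smallest_right_edge` are hypothetical. The sketch shows only why an integral image makes a rectangle's count an O(1) lookup, and why non-negative counts allow a box edge to be found by binary search.

```python
from typing import List

Grid = List[List[float]]


def integral_image(density: Grid) -> Grid:
    """Build a (H+1) x (W+1) summed-area table for a non-negative density map."""
    h, w = len(density), len(density[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += density[y][x]
            # Sum of everything above and to the left, inclusive of (y, x).
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_count(sat: Grid, top: int, left: int, bottom: int, right: int) -> float:
    """Count inside rows [top, bottom) and columns [left, right), in O(1)."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


def smallest_right_edge(sat: Grid, top: int, left: int, bottom: int,
                        target: float) -> int:
    """Smallest right edge whose box holds at least `target` counts, else -1.

    Assumption: the density is non-negative, so the count grows
    monotonically with the right edge and binary search is valid.
    """
    lo, hi = left, len(sat[0]) - 1
    if box_count(sat, top, left, bottom, hi) < target:
        return -1
    while lo < hi:
        mid = (lo + hi) // 2
        if box_count(sat, top, left, bottom, mid) >= target:
            hi = mid
        else:
            lo = mid + 1
    return lo


# Toy count map for one character: three occurrences spread over two rows.
toy_density = [
    [0.0, 1.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
]
toy_sat = integral_image(toy_density)
```

In a full system, one such table would be kept per character, and the query string's expected character counts would play the role of `target`. How the paper combines the per-character searches is not stated in the abstract and is left out here.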

