Single Shot Scene Text Retrieval

08/27/2018
by Lluis Gómez, et al.

Text found in scene images provides high-level semantic information about the image and its context, and it can be leveraged for better scene understanding. In this paper we address the problem of scene text retrieval: given a text query, the system must return all images containing the queried text. The novelty of the proposed model lies in its use of a single-shot CNN architecture that simultaneously predicts bounding boxes and a compact text representation of the words in them. In this way, the text-based image retrieval task can be cast as a simple nearest neighbor search of the query text representation over the outputs of the CNN for the entire image database. Our experiments demonstrate that the proposed architecture outperforms the previous state of the art while offering a significant increase in processing speed.
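The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes the CNN has already been run over the database and has produced one compact vector per detected word in each image. The abstract does not specify the text representation, so the `embed_text` function below is a hypothetical stand-in (a normalised character-count histogram), and the names `rank_images` and `database` are likewise illustrative rather than taken from the paper.

```python
import numpy as np

# Assumed alphabet for the illustrative text representation.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"


def embed_text(word: str) -> np.ndarray:
    """Hypothetical stand-in for the compact text representation:
    an L2-normalised histogram of character counts."""
    vec = np.zeros(len(ALPHABET), dtype=np.float32)
    for ch in word.lower():
        idx = ALPHABET.find(ch)
        if idx >= 0:
            vec[idx] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def rank_images(query: str, database: dict) -> list:
    """Rank images by the best match between the query vector and any
    word vector predicted for that image (cosine similarity)."""
    q = embed_text(query)
    scores = []
    for image_id, word_vectors in database.items():
        if len(word_vectors) == 0:
            # No words detected in this image.
            scores.append((image_id, 0.0))
            continue
        sims = np.asarray(word_vectors) @ q  # one similarity per detected word
        scores.append((image_id, float(sims.max())))
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Toy database: in practice these rows would be the CNN outputs per image.
database = {
    "img_a": np.stack([embed_text("coffee"), embed_text("shop")]),
    "img_b": np.stack([embed_text("exit")]),
    "img_c": np.stack([embed_text("hotel"), embed_text("parking")]),
}
ranking = rank_images("hotel", database)
```

Here an image is scored by its single best-matching word, which is one plausible way to turn per-word vectors into a per-image ranking; for a large database the exhaustive loop would typically be replaced by an approximate nearest neighbor index.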


research
01/14/2020

Fine-grained Image Classification and Retrieval by Combining Visual and Locally Pooled Textual Features

Text contained in an image carries high-level semantics that can be expl...
research
10/17/2022

Bridging the Gap between Local Semantic Concepts and Bag of Visual Words for Natural Scene Image Retrieval

This paper addresses the problem of semantic-based image retrieval of na...
research
11/13/2015

Natural Language Object Retrieval

In this paper, we address the task of natural language object retrieval,...
research
04/04/2021

Scene Text Retrieval via Joint Text Detection and Similarity Learning

Scene text retrieval aims to localize and search all text instances from...
research
11/28/2016

Generating Holistic 3D Scene Abstractions for Text-based Image Retrieval

Spatial relationships between objects provide important information for ...
research
03/14/2018

Approximate Query Matching for Image Retrieval

Traditional image recognition involves identifying the key object in a p...
research
02/16/2018

Scenarios: A New Representation for Complex Scene Understanding

The ability for computational agents to reason about the high-level cont...
