Semantic Image Retrieval via Active Grounding of Visual Situations

10/31/2017
by Max H. Quinn, et al.

We describe a novel architecture for semantic image retrieval---in particular, retrieval of instances of visual situations. Visual situations are concepts such as "a boxing match," "walking the dog," "a crowd waiting for a bus," or "a game of ping-pong," whose instantiations in images are linked more by their common spatial and semantic structure than by low-level visual similarity. Given a query situation description, our architecture---called Situate---learns models capturing the visual features of expected objects as well as the expected spatial configuration of relationships among objects. Given a new image, Situate uses these models in an attempt to ground (i.e., to create a bounding box locating) each expected component of the situation in the image via an active search procedure. Situate uses the resulting grounding to compute a score indicating the degree to which the new image is judged to contain an instance of the situation. Such scores can be used to rank images in a collection as part of a retrieval system. In the preliminary study described here, we demonstrate the promise of this system by comparing Situate's performance with that of two baseline methods, as well as with a related semantic image-retrieval system based on "scene graphs."
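
As a rough illustration of the final step (turning per-image groundings into a ranked list), the following is a minimal sketch and not Situate's actual scoring method, which the abstract does not specify. The `Grounding` structure, the geometric-mean score, and the function names are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from math import prod
from typing import Dict, List, Tuple


@dataclass
class Grounding:
    """One located situation component (hypothetical structure)."""
    label: str                       # e.g. "dog", "walker", "leash"
    box: Tuple[int, int, int, int]   # (x, y, width, height) in pixels
    confidence: float                # assumed to lie in [0, 1]


def situation_score(groundings: List[Grounding], expected_labels: List[str]) -> float:
    """Assumed score: geometric mean of per-component confidences.

    A component that was not grounded contributes 0, which drives the
    whole score to 0. This is a stand-in, not the paper's formula.
    """
    by_label = {g.label: g.confidence for g in groundings}
    confidences = [by_label.get(label, 0.0) for label in expected_labels]
    return prod(confidences) ** (1.0 / len(confidences))


def rank_images(
    groundings_by_image: Dict[str, List[Grounding]],
    expected_labels: List[str],
) -> List[Tuple[str, float]]:
    """Return (image name, score) pairs sorted from best to worst match."""
    scored = [
        (name, situation_score(gs, expected_labels))
        for name, gs in groundings_by_image.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_images` with a dictionary that maps image names to their groundings, plus the list of components expected for the query situation, places images with every component confidently grounded ahead of images where a component is missing.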


