Interferences in match kernels

11/24/2016
by Naila Murray, et al.

We consider the design of an image representation that embeds and aggregates a set of local descriptors into a single vector. Popular representations of this kind include the bag-of-visual-words, the Fisher vector and the VLAD. When two such image representations are compared with the dot-product, the image-to-image similarity can be interpreted as a match kernel. In match kernels, one has to deal with interference, i.e. with the fact that even if two descriptors are unrelated, their matching score may contribute to the overall similarity. We formalise this problem and propose two related solutions, both aimed at equalising the individual contributions of the local descriptors in the final representation. These methods modify the aggregation stage by including a set of per-descriptor weights. They differ by the objective function that is optimised to compute those weights. The first is a "democratisation" strategy that aims at equalising the relative importance of each descriptor in the set comparison metric. The second one involves equalising the match of a single descriptor to the aggregated vector. These concurrent methods give a substantial performance boost over the state of the art in image search with short or mid-size vectors, as demonstrated by our experiments on standard public image retrieval benchmarks.
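The ideas in the abstract can be made concrete with a small sketch. It shows that the dot product of two sum-aggregated representations equals the sum of all pairwise descriptor matches (the match kernel), and that per-descriptor weights can equalise each descriptor's contribution to the self-similarity. Everything specific here is an assumption made for illustration: the toy 2-D embeddings, the helper names (`aggregate`, `contributions`, `democratic_weights`), the target constant 1, and the damped Sinkhorn-style fixed-point update. It should not be read as the authors' actual solver.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def aggregate(embeddings, weights=None):
    """Weighted sum of embedded descriptors (weights default to 1)."""
    if weights is None:
        weights = [1.0] * len(embeddings)
    dim = len(embeddings[0])
    return [sum(w * e[d] for w, e in zip(weights, embeddings))
            for d in range(dim)]


def contributions(embeddings, weights):
    """Contribution of descriptor i to the set self-similarity:
    w_i * sum_j w_j * <phi_i, phi_j> = w_i * <phi_i, aggregate>."""
    xi = aggregate(embeddings, weights)
    return [w * dot(e, xi) for w, e in zip(weights, embeddings)]


def democratic_weights(embeddings, n_iter=100):
    """Hypothetical solver: damped fixed-point iteration that drives every
    contribution towards 1. Assumes all pairwise dot products are positive,
    which holds for the toy data below but not in general."""
    lam = [1.0] * len(embeddings)
    for _ in range(n_iter):
        xi = aggregate(embeddings, lam)
        lam = [(l / dot(e, xi)) ** 0.5 for l, e in zip(lam, embeddings)]
    return lam


# Toy embedded descriptors: three near-duplicates (a "burst") and one
# distinct descriptor. The values are invented for illustration only.
phi_x = [[1.0, 0.1], [0.95, 0.2], [0.9, 0.15], [0.1, 1.0]]
phi_y = [[0.2, 0.9], [1.0, 0.0]]

# Match kernel: dot product of aggregates == sum of all pairwise matches.
k_agg = dot(aggregate(phi_x), aggregate(phi_y))
k_pairs = sum(dot(a, b) for a in phi_x for b in phi_y)

before = contributions(phi_x, [1.0] * len(phi_x))  # unequal contributions
lam = democratic_weights(phi_x)
after = contributions(phi_x, lam)                  # roughly equal

print("kernel:", k_agg, k_pairs)
print("before:", before)
print("weights:", lam)
print("after:", after)
```

In this toy set the three near-duplicate descriptors dominate the unweighted self-similarity, which is the kind of interference the abstract describes. After reweighting, the isolated descriptor receives a larger weight than the near-duplicates. The second strategy in the abstract, equalising the match of each single descriptor to the aggregated vector, uses a different objective and is not sketched here.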


