When VLAD met Hilbert

07/30/2015
by Mehrtash Harandi, et al.

Vectors of Locally Aggregated Descriptors (VLAD) have emerged as powerful image/video representations that compete with or even outperform state-of-the-art approaches on many challenging visual recognition tasks. In this paper, we address two fundamental limitations of VLAD: its requirement that local descriptors have vector form, and its restriction to linear classifiers due to its high dimensionality. To this end, we introduce a kernelized version of VLAD. This not only lets us inherently exploit more sophisticated classification schemes, but also enables us to efficiently aggregate non-vector descriptors (e.g., tensors) in the VLAD framework. Furthermore, we propose three approximate formulations that accelerate the coding process while retaining the properties of kernel VLAD. Our experiments demonstrate the effectiveness of our approach at handling manifold-valued data, such as covariance descriptors, on several classification tasks. Our results also demonstrate the benefits of our nonlinear VLAD descriptors over linear ones in Euclidean space on several standard benchmark datasets.
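
For readers unfamiliar with the baseline, the following is a minimal sketch of standard (linear) VLAD coding, not the kernelized method proposed in the paper. It assumes a codebook of centers is already available (e.g., from k-means) and applies a plain L2 normalization; the function names `vlad_encode` and `kernel_sq_dist` are hypothetical and chosen for illustration. The second helper only shows the well-known kernel-trick identity for squared distances in a feature space, which is the kind of quantity a kernelized assignment step could rely on; how the paper actually builds its kernel VLAD and its approximations is not reproduced here.

```python
import numpy as np


def vlad_encode(descriptors, centers):
    """Standard (linear) VLAD sketch: per-center sum of residuals.

    descriptors: array of shape (n, d), local descriptors of one image.
    centers:     array of shape (k, d), a precomputed codebook (assumed given).
    Returns an L2-normalized vector of length k * d.
    """
    # Squared Euclidean distance from every descriptor to every center: (n, k).
    diffs = descriptors[:, None, :] - centers[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)

    # Hard-assign each descriptor to its nearest center.
    nearest = sq_dists.argmin(axis=1)

    k, d = centers.shape
    code = np.zeros((k, d))
    for j in range(k):
        members = descriptors[nearest == j]
        if len(members) > 0:
            # Accumulate residuals (descriptor minus its center).
            code[j] = (members - centers[j]).sum(axis=0)

    code = code.ravel()
    norm = np.linalg.norm(code)
    return code / norm if norm > 0 else code


def kernel_sq_dist(x, c, kernel):
    """Squared feature-space distance via the kernel trick.

    ||phi(x) - phi(c)||^2 = k(x, x) - 2 k(x, c) + k(c, c),
    for any positive-definite kernel function `kernel` (an assumption here).
    """
    return kernel(x, x) - 2.0 * kernel(x, c) + kernel(c, c)
```

With a linear kernel such as `lambda a, b: float(np.dot(a, b))`, `kernel_sq_dist` reduces to the ordinary squared Euclidean distance used inside `vlad_encode`. The explicit residual vectors in `vlad_encode` are what tie standard VLAD to vector-form descriptors, which is the limitation the abstract describes.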


