Discrete Multi-modal Hashing with Canonical Views for Robust Mobile Landmark Search

07/13/2017
by Lei Zhu, et al.

Mobile landmark search (MLS) has recently received increasing attention for its great practical value. However, it remains unsolved due to two important challenges: the high bandwidth consumption of query transmission, and the huge visual variations among query images sent from mobile devices. In this paper, we propose a novel hashing scheme, named canonical view based discrete multi-modal hashing (CV-DMH), to handle these problems via a three-stage learning procedure. First, a submodular function is designed to measure the visual representativeness and redundancy of a view set. With it, canonical views, which capture the key visual appearances of a landmark with limited redundancy, are efficiently discovered with an iterative mining strategy. Second, multi-modal sparse coding is applied to transform visual features from multiple modalities into an intermediate representation, which can robustly and adaptively characterize the visual contents of varied landmark images with certain canonical views. Finally, compact binary codes are learned on the intermediate representation within a tailored discrete binary embedding model, which preserves the visual relations of images measured with canonical views and removes the involved noise. In this part, we develop a new augmented Lagrangian multiplier (ALM) based optimization method to directly solve for the discrete binary codes. It not only deals explicitly with the discrete constraint, but also considers the bit-uncorrelated constraint and the balance constraint together. Experiments on real-world landmark datasets demonstrate the superior performance of CV-DMH over several state-of-the-art methods.
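The first stage, iterative mining of canonical views under a submodular objective, can be illustrated with a minimal sketch. The abstract does not give the exact submodular function, so the code below swaps in a standard facility-location objective as a stand-in: a view set scores highly when every image is similar to at least one selected view, and a near-duplicate view adds little marginal gain, which limits redundancy. The names `cosine`, `coverage`, and `select_canonical_views` are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def coverage(sim: List[List[float]], selected: List[int]) -> float:
    """Facility-location score (assumed stand-in objective):
    for each image, the similarity to its best-matching selected view, summed."""
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in sim)


def select_canonical_views(features: List[Sequence[float]], k: int) -> List[int]:
    """Greedily pick up to k view indices, adding the view with the largest
    marginal gain in coverage at each iteration."""
    n = len(features)
    # Clamp similarities at zero so the objective stays monotone submodular.
    sim = [[max(0.0, cosine(features[i], features[j])) for j in range(n)]
           for i in range(n)]
    selected: List[int] = []
    for _ in range(min(k, n)):
        base = coverage(sim, selected)
        best, best_gain = None, 0.0
        for cand in range(n):
            if cand in selected:
                continue
            gain = coverage(sim, selected + [cand]) - base
            if gain > best_gain:
                best, best_gain = cand, gain
        if best is None:  # no candidate improves the score
            break
        selected.append(best)
    return selected


if __name__ == "__main__":
    # Toy data: two visually distinct groups of a landmark's images.
    feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(select_canonical_views(feats, 2))
```

On the toy data, the two selected views come from different groups, because a second view from an already covered group yields almost no gain. Greedy selection is a common choice for monotone submodular objectives; whether CV-DMH uses this exact strategy is not stated in the abstract.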

research
09/25/2020

Adaptive Online Multi-modal Hashing via Hadamard Matrix

Hashing plays an important role in information retrieval, due to its low...
research
04/25/2019

Exploring Auxiliary Context: Discrete Semantic Transfer Hashing for Scalable Image Retrieval

Unsupervised hashing can desirably support scalable content-based image ...
research
08/17/2017

Deep Binary Reconstruction for Cross-modal Hashing

With the increasing demand of massive multimodal data storage and organi...
research
07/15/2019

Multi-modal Sentiment Analysis using Deep Canonical Correlation Analysis

This paper learns multi-modal embeddings from text, audio, and video vie...
research
08/22/2023

CLIP Multi-modal Hashing: A new baseline CLIPMH

The multi-modal hashing method is widely used in multimedia retrieval. I...
research
06/29/2016

De-Hashing: Server-Side Context-Aware Feature Reconstruction for Mobile Visual Search

Due to the prevalence of mobile devices, mobile search becomes a more co...
research
02/26/2015

Coding local and global binary visual features extracted from video sequences

Binary local features represent an effective alternative to real-valued ...
