End-to-end Learning of Deep Visual Representations for Image Retrieval

10/25/2016
by Albert Gordo, et al.

While deep learning has become a key ingredient in the top performing methods for many computer vision tasks, it has failed so far to bring similar improvements to instance-level image retrieval. In this article, we argue that reasons for the underwhelming results of deep methods on image retrieval are threefold: i) noisy training data, ii) inappropriate deep architecture, and iii) suboptimal training procedure. We address all three issues. First, we leverage a large-scale but noisy landmark dataset and develop an automatic cleaning method that produces a suitable training set for deep retrieval. Second, we build on the recent R-MAC descriptor, show that it can be interpreted as a deep and differentiable architecture, and present improvements to enhance it. Last, we train this network with a siamese architecture that combines three streams with a triplet loss. At the end of the training process, the proposed architecture produces a global image representation in a single forward pass that is well suited for image retrieval. Extensive experiments show that our approach significantly outperforms previous retrieval approaches, including state-of-the-art methods based on costly local descriptor indexing and spatial verification. On Oxford 5k, Paris 6k and Holidays, we respectively report 94.7, 96.6, and 94.8 mean average precision. Our representations can also be heavily compressed using product quantization with little loss in accuracy. For additional material, please see www.xrce.xerox.com/Deep-Image-Retrieval.
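The three-stream training described above can be illustrated with a minimal sketch of a triplet ranking loss on a query, a relevant image, and a non-relevant image. The abstract does not give the exact formulation, so everything here is an assumption for illustration: the function names, the L2 normalization of descriptors, the 0.5 scaling factor, and the margin value of 0.1. Plain Python lists stand in for the global descriptors that the network would produce.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return list(v) if norm == 0 else [x / norm for x in v]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(query, positive, negative, margin=0.1):
    """Hinge-style triplet loss (assumed form, not taken from the abstract).

    The loss is zero once the negative is farther from the query than the
    positive by at least `margin` (in squared distance); otherwise it grows
    linearly with the violation.
    """
    q, p, n = (l2_normalize(v) for v in (query, positive, negative))
    violation = margin + squared_distance(q, p) - squared_distance(q, n)
    return 0.5 * max(0.0, violation)


# Hypothetical 2-D descriptors, purely for illustration.
well_ranked = triplet_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
badly_ranked = triplet_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

In this sketch a triplet that is already ranked correctly contributes no loss, so only triplets that violate the margin would drive weight updates across the three weight-sharing streams.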


