MILDNet: A Lightweight Single Scaled Deep Ranking Architecture

03/03/2019
by   Anirudha Vishvakarma, et al.

Multi-scale deep CNN architectures [1, 2, 3] successfully capture both fine- and coarse-level image descriptors for visual similarity tasks, but they come with expensive memory overhead and latency. In this paper, we propose a competing novel CNN architecture, called MILDNet, whose merit is being vastly more compact (about 3 times). Inspired by the fact that successive CNN layers represent the image with increasing levels of abstraction, we compress our deep ranking model to a single CNN by coupling activations from multiple intermediate layers with those of the last layer. Trained on the well-known Street2shop dataset [4], our approach performs as well as the current state-of-the-art models with only one third of the parameters, model size, and training time, and with a significant reduction in inference time. The significance of intermediate layers for image retrieval has also been shown on the popular Holidays, Oxford, and Paris datasets [5], so even though our experiments are done in the e-commerce domain, the approach is applicable to other domains as well. We further conduct an ablation study to validate our hypothesis by checking the impact of adding each intermediate layer. We also present two more useful variants of MILDNet: a mobile model (12 times smaller) for on-edge devices, and a compactly featured model (512-d feature embeddings) for systems with less RAM and to reduce the ranking cost. Further, we present an intuitive way to automatically create a tailored in-house triplet training dataset, which is very hard to create manually. This solution can also be deployed as an all-inclusive visual similarity solution. Finally, we present our entire production-level architecture, which currently powers visual similarity at Fynd.
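The core idea of the abstract, coupling activations from several intermediate layers with the last layer and training on triplets, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The layer shapes, the use of global average pooling, the L2 normalization, and the hinge-style triplet loss with margin 1.0 are all assumptions made for illustration; the abstract does not specify any of them, and the function names are hypothetical.

```python
import numpy as np


def global_average_pool(feature_map):
    """Collapse an (H, W, C) activation map into a C-dimensional vector."""
    return feature_map.mean(axis=(0, 1))


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit L2 norm (eps guards against division by zero)."""
    return v / max(np.linalg.norm(v), eps)


def multi_layer_embedding(feature_maps):
    """Build one descriptor from several layers of a single CNN.

    Each layer's activations are pooled and normalized, then all pooled
    vectors are concatenated and normalized again. Assumed design, not
    necessarily the one used in the paper.
    """
    pooled = [l2_normalize(global_average_pool(f)) for f in feature_maps]
    return l2_normalize(np.concatenate(pooled))


def triplet_hinge_loss(query, positive, negative, margin=1.0):
    """Hinge loss on squared distances: zero once the negative is at least
    `margin` farther from the query than the positive."""
    d_pos = np.sum((query - positive) ** 2)
    d_neg = np.sum((query - negative) ** 2)
    return max(0.0, margin + d_pos - d_neg)


# Hypothetical activations from two intermediate layers and the last layer.
rng = np.random.default_rng(0)
layers = [
    rng.random((56, 56, 64)),
    rng.random((28, 28, 128)),
    rng.random((7, 7, 512)),
]
embedding = multi_layer_embedding(layers)
print(embedding.shape)
```

With these assumed shapes the descriptor has 64 + 128 + 512 = 704 dimensions. Adding or dropping an entry in `layers` mimics the kind of ablation the abstract mentions, where the impact of each intermediate layer is checked separately.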


research
06/15/2019

REMAP: Multi-layer entropy-guided pooling of dense CNN features for image retrieval

This paper addresses the problem of very large-scale image retrieval, fo...
research
04/04/2022

Correlation Verification for Image Retrieval

Geometric verification is considered a de facto solution for the re-rank...
research
11/27/2016

Voronoi-based compact image descriptors: Efficient Region-of-Interest retrieval with VLAD and deep-learning-based descriptors

We investigate the problem of image retrieval based on visual queries wh...
research
07/12/2019

ACTNET: end-to-end learning of feature activations and multi-stream aggregation for effective instance image retrieval

We propose a novel CNN architecture called ACTNET for robust instance im...
research
07/12/2019

ACTNET: end-to-end learning of feature activations and aggregation for effective instance image retrieval

We propose a novel CNN architecture called ACTNET for robust instance im...
research
03/13/2015

Hybrid multi-layer Deep CNN/Aggregator feature for image classification

Deep Convolutional Neural Networks (DCNN) have established a remarkable ...
research
02/14/2022

Tightly Coupled Learning Strategy for Weakly Supervised Hierarchical Place Recognition

Visual place recognition (VPR) is a key issue for robotics and autonomou...
