Fast Interactive Image Retrieval using large-scale unlabeled data

by   Akshay Mehra, et al.

An interactive image retrieval system learns which images in the database belong to a user's query concept, by analyzing the example images and feedback provided by the user. The challenge is to retrieve the relevant images with minimal user interaction. In this work, we propose to solve this problem by posing it as a binary classification task of classifying all images in the database as being relevant or irrelevant to the user's query concept. Our method combines active learning with graph-based semi-supervised learning (GSSL) to tackle this problem. Active learning reduces the number of user interactions by querying the labels of the most informative points and GSSL allows to use abundant unlabeled data along with the limited labeled data provided by the user. To efficiently find the most informative point, we use an uncertainty sampling based method that queries the label of the point nearest to the decision boundary of the classifier. We estimate this decision boundary using our heuristic of adaptive threshold. To utilize huge volumes of unlabeled data we use an efficient approximation based method that reduces the complexity of GSSL from O(n^3) to O(n), making GSSL scalable. We make the classifier robust to the diversity and noisy labels associated with images in large databases by incorporating information from multiple modalities such as visual information extracted from deep learning based models and semantic information extracted from the WordNet. High F1 scores within few relevance feedback rounds in our experiments with concepts defined on AnimalWithAttributes and Imagenet (1.2 million images) datasets indicate the effectiveness and scalability of our approach.


Exploiting Diversity of Unlabeled Data for Label-Efficient Semi-Supervised Active Learning

The availability of large labeled datasets is the key component for the ...

Information-Theoretic Active Learning for Content-Based Image Retrieval

We propose Information-Theoretic Active Learning (ITAL), a novel batch-m...

Combining MixMatch and Active Learning for Better Accuracy with Fewer Labels

We propose using active learning based techniques to further improve the...

Criteria Sliders: Learning Continuous Database Criteria via Interactive Ranking

Large databases are often organized by hand-labeled metadata, or criteri...

FDive: Learning Relevance Models using Pattern-based Similarity Measures

The detection of interesting patterns in large high-dimensional datasets...

Object-Level Targeted Selection via Deep Template Matching

Retrieving images with objects that are semantically similar to objects ...

Query-free Clothing Retrieval via Implicit Relevance Feedback

Image-based clothing retrieval is receiving increasing interest with the...