Unicom: Universal and Compact Representation Learning for Image Retrieval

04/12/2023
by Xiang An, et al.

Modern image retrieval methods typically rely on fine-tuning pre-trained encoders to extract image-level descriptors. However, the most widely used models are pre-trained on ImageNet-1K, which covers only a limited set of classes. The pre-trained feature representation is therefore not universal enough to generalize well to diverse open-world classes. In this paper, we first cluster the large-scale LAION400M dataset into one million pseudo classes based on the joint textual and visual features extracted by the CLIP model. Because label granularity is ambiguous, the automatically clustered dataset inevitably contains heavy inter-class conflict. To alleviate this conflict, we randomly select a subset of the inter-class prototypes when constructing the margin-based softmax loss. To further enhance the low-dimensional feature representation, we randomly select a subset of the feature dimensions when calculating the similarities between embeddings and class-wise prototypes. These two random partial selections act on the class dimension and the feature dimension of the prototype matrix, respectively, making the classification conflict-robust and the feature embedding compact. Our method significantly outperforms state-of-the-art unsupervised and supervised image retrieval approaches on multiple benchmarks. The code and pre-trained models are released to facilitate future research at https://github.com/deepglint/unicom.
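
The dual random selection described above can be illustrated with a minimal NumPy sketch. This is not the released implementation. The function name `partial_margin_softmax_loss`, the parameters `class_ratio`, `dim_ratio`, `margin`, and `scale`, and their default values are all hypothetical. The abstract says only "margin-based softmax", so the additive cosine margin used here is an assumption, as is drawing a single feature subset for the whole batch.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit L2 norm along `axis`."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def partial_margin_softmax_loss(embeddings, labels, prototypes, rng,
                                class_ratio=0.1, dim_ratio=0.5,
                                margin=0.3, scale=64.0):
    """Margin-based softmax over random subsets of classes and dimensions.

    embeddings: (n, d) image features
    labels:     (n,) integer pseudo-class ids
    prototypes: (num_classes, d) class-wise prototype matrix
    All ratios and the margin type are illustrative assumptions.
    """
    n, d = embeddings.shape
    num_classes = prototypes.shape[0]

    # Random partial selection along the feature dimension
    # (assumption: one subset shared by the whole batch).
    num_dims = max(1, int(round(d * dim_ratio)))
    dims = rng.choice(d, size=num_dims, replace=False)

    # Random partial selection along the class dimension: keep every class
    # that appears in the batch, then sample a subset of the other classes.
    positives = np.unique(labels)  # sorted unique class ids in the batch
    pool = np.setdiff1d(np.arange(num_classes), positives)
    target = max(len(positives), int(round(num_classes * class_ratio)))
    num_neg = min(target - len(positives), len(pool))
    negatives = rng.choice(pool, size=num_neg, replace=False)
    kept = np.concatenate([positives, negatives])

    # Cosine similarity restricted to the selected rows and columns.
    emb = l2_normalize(embeddings[:, dims])
    protos = l2_normalize(prototypes[kept][:, dims])
    logits = emb @ protos.T  # shape (n, len(kept))

    # Column of each sample's own class inside the reduced matrix; valid
    # because `positives` is sorted and occupies the first columns.
    local = np.searchsorted(positives, labels)
    rows = np.arange(n)

    # Additive cosine margin on the target logit (assumed margin type).
    logits[rows, local] -= margin
    logits *= scale

    # Numerically stable log-softmax followed by cross-entropy.
    logits -= logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[rows, local].mean()


rng = np.random.default_rng(0)
features = rng.normal(size=(8, 16))       # toy batch of 8 embeddings
class_protos = rng.normal(size=(100, 16))  # toy prototype matrix
pseudo_labels = rng.integers(0, 100, size=8)
loss = partial_margin_softmax_loss(features, pseudo_labels, class_protos, rng)
print(float(loss))
```

Sampling only some of the negative prototypes means a conflicting pseudo class (one that overlaps semantically with the target) is pushed away in only a fraction of iterations, while the feature subset forces every part of the embedding to carry discriminative information.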

research
08/22/2023

Composed Image Retrieval using Contrastive Learning and Task-oriented CLIP-based Features

Given a query composed of a reference image and a relative caption, the ...
research
07/20/2022

Feature Representation Learning for Unsupervised Cross-domain Image Retrieval

Current supervised cross-domain image retrieval methods can achieve exce...
research
08/08/2023

Coarse-to-Fine: Learning Compact Discriminative Representation for Single-Stage Image Retrieval

Image retrieval targets to find images from a database that are visually...
research
06/08/2023

Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models

The advent of large pre-trained models has brought about a paradigm shif...
research
06/11/2021

AugNet: End-to-End Unsupervised Visual Representation Learning with Image Augmentation

Most of the achievements in artificial intelligence so far were accompli...
research
04/06/2022

OSCARS: An Outlier-Sensitive Content-Based Radiography Retrieval System

Improving the retrieval relevance on noisy datasets is an emerging need ...
research
01/10/2022

Why-So-Deep: Towards Boosting Previously Trained Models for Visual Place Recognition

Deep learning-based image retrieval techniques for the loop closure dete...
