Self-Supervised Modality-Aware Multiple Granularity Pre-Training for RGB-Infrared Person Re-Identification

12/12/2021
by Lin Wan, et al.

While RGB-Infrared cross-modality person re-identification (RGB-IR ReID) has enabled great progress in 24-hour intelligent surveillance, state-of-the-art methods still rely heavily on fine-tuning ImageNet pre-trained networks. Because of its single-modality nature, such large-scale pre-training may yield RGB-biased representations that hinder cross-modality image retrieval. This paper presents a self-supervised pre-training alternative, named Modality-Aware Multiple Granularity Learning (MMGL), which trains models from scratch directly on multi-modality ReID datasets, yet achieves competitive results without external data or sophisticated tuning tricks. Specifically, MMGL globally maps shuffled RGB-IR images into a shared latent permutation space and further improves local discriminability by maximizing agreement between cycle-consistent RGB-IR image patches. Experiments demonstrate that MMGL learns better representations (+6.47% Rank-1), trains faster (converging within a few hours), and is more data-efficient (<5% of the data) than ImageNet pre-training. The results also suggest that it generalizes well to various existing models and losses and has promising transferability across datasets. The code will be released.
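Since the abstract describes the method only at a high level, the snippet below is a minimal sketch of what the two granularities could look like in code: a global pretext task that classifies shuffled RGB and IR images in a shared permutation label space, and a local contrastive term that pulls corresponding RGB-IR embeddings together. Everything here (backbone, grid size, heads, hyper-parameters) is an assumption for illustration, not the paper's implementation; in particular, full permutation prediction is reduced to a single-patch label and the cycle-consistent patch matching is simplified to a known image-level correspondence.

```python
# A minimal, hypothetical PyTorch sketch of the two pretext tasks hinted at in
# the abstract; it is NOT the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


def shuffle_patches(images, grid=3):
    """Split each image into a grid x grid patch grid and shuffle the patches.

    images: (B, C, H, W) with H and W divisible by `grid`.
    Returns the reassembled shuffled images and the permutation per sample.
    """
    b, c, h, w = images.shape
    ph, pw = h // grid, w // grid
    # (B, C, gh, gw, ph, pw) -> (B, gh*gw, C, ph, pw)
    patches = images.unfold(2, ph, ph).unfold(3, pw, pw)
    patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(b, grid * grid, c, ph, pw)
    perm = torch.stack([torch.randperm(grid * grid) for _ in range(b)])
    shuffled = patches[torch.arange(b).unsqueeze(1), perm]
    # Stitch the shuffled patches back into full-size images.
    shuffled = shuffled.reshape(b, grid, grid, c, ph, pw)
    shuffled = shuffled.permute(0, 3, 1, 4, 2, 5).reshape(b, c, h, w)
    return shuffled, perm


class MMGLSketch(nn.Module):
    """Shared encoder + permutation head + projection head (illustrative only)."""

    def __init__(self, num_patches=9, dim=128):
        super().__init__()
        # Tiny CNN standing in for the real backbone (e.g. a ResNet).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Global branch: both modalities share this head, so shuffled RGB and IR
        # images are classified in the same permutation label space (simplified
        # here to predicting which patch landed in the top-left slot).
        self.perm_head = nn.Linear(64, num_patches)
        # Local branch: projection used by the cross-modality agreement loss.
        self.proj = nn.Linear(64, dim)

    def forward(self, x):
        feat = self.backbone(x)
        return self.perm_head(feat), F.normalize(self.proj(feat), dim=-1)


def agreement_loss(z_rgb, z_ir, temperature=0.1):
    """InfoNCE between matching RGB/IR embeddings (positives on the diagonal).

    The paper matches cycle-consistent RGB-IR patches; this toy version assumes
    a known one-to-one correspondence at the image level.
    """
    logits = z_rgb @ z_ir.t() / temperature
    targets = torch.arange(z_rgb.size(0))
    return F.cross_entropy(logits, targets)


if __name__ == "__main__":
    rgb = torch.randn(4, 3, 96, 96)  # toy RGB batch
    ir = torch.randn(4, 3, 96, 96)   # toy IR batch (single channel replicated)
    model = MMGLSketch()

    rgb_shuf, rgb_perm = shuffle_patches(rgb)
    ir_shuf, ir_perm = shuffle_patches(ir)

    rgb_logits, z_rgb = model(rgb_shuf)
    ir_logits, z_ir = model(ir_shuf)

    perm_loss = (F.cross_entropy(rgb_logits, rgb_perm[:, 0])
                 + F.cross_entropy(ir_logits, ir_perm[:, 0]))
    local_loss = agreement_loss(z_rgb, z_ir)
    print(float(perm_loss), float(local_loss))
```

In the actual method the agreement would be computed between cycle-consistent patch pairs rather than pooled image features; the sketch only shows how a global permutation objective and a local contrastive objective can share one encoder across modalities.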



Related research

02/13/2023
CoMAE: Single Model Hybrid Pre-training on Small-Scale RGB-D Datasets
Current RGB-D scene recognition approaches often train two standalone ba...

12/01/2021
Unleashing the Potential of Unsupervised Pre-Training with Intra-Identity Regularization for Person Re-Identification
Existing person re-identification (ReID) methods typically directly load...

02/10/2020
Cross-Modality Paired-Images Generation for RGB-Infrared Person Re-Identification
RGB-Infrared (IR) person re-identification is very challenging due to th...

02/11/2023
Rethinking Vision Transformer and Masked Autoencoder in Multimodal Face Anti-Spoofing
Recently, vision transformer (ViT) based multimodal learning methods hav...

03/08/2022
Part-Aware Self-Supervised Pre-Training for Person Re-Identification
In person re-identification (ReID), very recent researches have validate...

04/11/2022
XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font Generation
Generating a new font library is a very labor-intensive and time-consumi...
