
Learning Modal-Invariant and Temporal-Memory for Video-based Visible-Infrared Person Re-Identification

by Xinyu Lin, et al.
Harbin Institute of Technology
The Chinese University of Hong Kong, Shenzhen
NetEase, Inc

Thanks to cross-modal retrieval techniques, visible-infrared (RGB-IR) person re-identification (Re-ID) is achieved by projecting the two modalities into a common space, enabling person Re-ID in 24-hour surveillance systems. However, with respect to probe-to-gallery matching, almost all existing RGB-IR cross-modal person Re-ID methods focus on image-to-image matching, while video-to-video matching, which contains much richer spatial and temporal information, remains under-explored. In this paper, we primarily study video-based cross-modal person Re-ID. To this end, we construct a video-based RGB-IR dataset containing 927 valid identities with 463,259 frames and 21,863 tracklets captured by 12 RGB/IR cameras. On this dataset, we show that performance improves as the number of frames in a tracklet increases, demonstrating the significance of video-to-video matching in RGB-IR person Re-ID. Additionally, we propose a novel method that not only projects the two modalities into a modal-invariant subspace, but also extracts temporal memory for motion invariance. Thanks to these two strategies, much better results are achieved on our video-based cross-modal person Re-ID task. The code and dataset are released at:



