Rethinking Temporal Fusion for Video-based Person Re-identification on Semantic and Time Aspect

11/28/2019
by Xinyang Jiang, et al.

Recently, research interest in person re-identification (ReID) has gradually turned to video-based methods, which acquire a person representation by aggregating the frame features of an entire video. However, existing video-based ReID methods do not consider the semantic differences among the outputs of different network stages, which potentially compromises the information richness of the person features. Furthermore, traditional methods ignore important relationships among frames, which causes information redundancy in fusion along the time axis. To address these issues, we propose a novel general temporal fusion framework that aggregates frame features on both the semantic aspect and the time aspect. For the semantic aspect, a multi-stage fusion network is explored to fuse richer frame features at multiple semantic levels, which can effectively reduce the information loss caused by traditional single-stage fusion. For the time aspect, the existing intra-frame attention method is improved by adding a novel inter-frame attention module, which effectively reduces the information redundancy in temporal fusion by taking the relationships among frames into consideration. The experimental results show that our approach can effectively improve video-based re-identification accuracy, achieving state-of-the-art performance.
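
The abstract gives no formulas, so the following is only an illustrative sketch of the two ideas it names, not the authors' implementation. It assumes that each frame already has a scalar intra-frame score (hypothetical here; in practice it would come from a learned attention branch), that inter-frame redundancy can be approximated by mean cosine similarity to the other frames, and that multi-stage fusion simply concatenates the per-stage fused vectors. The function names `fuse_frames` and `multi_stage_fuse` are made up for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def fuse_frames(frames, intra_scores):
    """Fuse per-frame feature vectors into one vector.

    frames: list of feature vectors, one per frame.
    intra_scores: one scalar per frame (assumed given; stands in for
        an intra-frame attention score).
    The inter-frame term is an assumption of this sketch: a frame that
    is very similar to the other frames is treated as redundant and
    has its score reduced before the softmax.
    """
    n = len(frames)
    redundancy = []
    for i in range(n):
        sims = [cosine(frames[i], frames[j]) for j in range(n) if j != i]
        redundancy.append(sum(sims) / len(sims) if sims else 0.0)
    weights = softmax([s - r for s, r in zip(intra_scores, redundancy)])
    dim = len(frames[0])
    fused = [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
    return fused, weights


def multi_stage_fuse(stage_frames, stage_scores):
    """Fuse each network stage over time, then concatenate the results.

    stage_frames: list over stages; each entry is a list of frame vectors.
    stage_scores: list over stages; each entry is a list of frame scores.
    """
    video_feature = []
    for frames, scores in zip(stage_frames, stage_scores):
        fused, _ = fuse_frames(frames, scores)
        video_feature.extend(fused)
    return video_feature
```

With three frames where the first two are identical and the intra-frame scores are equal, this sketch gives the third, distinct frame the largest weight, which is the redundancy-reduction behavior the abstract describes in words.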
