Revisiting Temporal Modeling for Video-based Person ReID

05/05/2018
by Jiyang Gao, et al.

Video-based person reID is an important task that has received much attention in recent years because of growing demand from surveillance and camera networks. A typical video-based person reID system consists of three parts: an image-level feature extractor (e.g., a CNN), a temporal modeling method that aggregates temporal features, and a loss function. Although many temporal modeling methods have been proposed, it is still hard to find an apples-to-apples comparison among them, because the choice of base network architecture and loss function also has a large impact on final performance. We therefore comprehensively study and compare four temporal modeling methods (temporal pooling, temporal attention, RNN, and 3D convnets) for video-based person reID. We also propose a new attention generation network that uses temporal convolution to extract temporal information across frames. The evaluation is done on the MARS dataset, and our methods outperform state-of-the-art methods by a large margin. Our source code is released at https://github.com/jiyanggao/Video-Person-ReID.
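
To make two of the compared aggregation strategies concrete, here is a minimal pure-Python sketch of temporal pooling and temporal attention over per-frame feature vectors. It is an illustration under stated assumptions, not the authors' implementation: the function names, the scalar per-frame scores, and the fixed smoothing kernel are all hypothetical stand-ins. In a trained model the scores and the temporal convolution weights would be learned, and the features would come from the image-level CNN.

```python
import math

def temporal_avg_pool(frame_feats):
    """Temporal pooling: average per-frame feature vectors into one clip-level vector."""
    n = len(frame_feats)
    dim = len(frame_feats[0])
    return [sum(f[d] for f in frame_feats) / n for d in range(dim)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def temporal_conv_scores(frame_scores, kernel):
    """1D convolution over the time axis with zero padding (odd-length kernel).

    Assumption: each frame already has one scalar score; the kernel mixes
    scores of neighbouring frames so each weight depends on temporal context.
    """
    half = len(kernel) // 2
    out = []
    for t in range(len(frame_scores)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t + k - half
            if 0 <= idx < len(frame_scores):
                acc += w * frame_scores[idx]
        out.append(acc)
    return out

def temporal_attention_pool(frame_feats, frame_scores, kernel=(0.25, 0.5, 0.25)):
    """Temporal attention: softmax-normalised weights, then a weighted sum of frames.

    The kernel values are hypothetical placeholders for learned weights.
    """
    weights = softmax(temporal_conv_scores(frame_scores, kernel))
    dim = len(frame_feats[0])
    return [sum(w * f[d] for w, f in zip(weights, frame_feats)) for d in range(dim)]

# Toy clip: 3 frames, 2-dimensional features (made-up numbers).
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled = temporal_avg_pool(feats)
attended = temporal_attention_pool(feats, [0.0, 0.0, 0.0])
```

With equal scores for every frame, the attention weights are uniform and the attended vector matches the average-pooled one. Unequal scores shift the clip-level feature toward the higher-scoring frames.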

research
01/21/2020

A Comprehensive Study on Temporal Modeling for Online Action Detection

Online action detection (OAD) is a practical yet challenging task, which...
research
03/24/2020

KFNet: Learning Temporal Camera Relocalization using Kalman Filtering

Temporal camera relocalization estimates the pose with respect to each v...
research
08/19/2021

Video Relation Detection via Tracklet based Visual Transformer

Video Visual Relation Detection (VidVRD), has received significant atten...
research
07/27/2022

One-Trimap Video Matting

Recent studies made great progress in video matting by extending the suc...
research
11/23/2021

Modeling Temporal Concept Receptive Field Dynamically for Untrimmed Video Analysis

Event analysis in untrimmed videos has attracted increasing attention du...
research
01/29/2021

Spatiotemporal Dilated Convolution with Uncertain Matching for Video-based Crowd Estimation

In this paper, we propose a novel SpatioTemporal convolutional Dense Net...
research
07/21/2019

Attention Filtering for Multi-person Spatiotemporal Action Detection on Deep Two-Stream CNN Architectures

Action detection and recognition tasks have been the target of much focu...
