Online Multi-Object Tracking Using CNN-based Single Object Tracker with Spatial-Temporal Attention Mechanism

08/09/2017
by Qi Chu, et al.

In this paper, we propose a CNN-based framework for online multi-object tracking (MOT). The framework exploits the strengths of single object trackers in adapting appearance models and searching for the target in the next frame. Naively applying a single object tracker to MOT, however, is computationally inefficient and produces drifted results under occlusion. Our framework achieves computational efficiency by sharing features and using ROI-Pooling to obtain individual features for each target. Online-learned, target-specific CNN layers adapt the appearance model for each target. Within the framework, we introduce a spatial-temporal attention mechanism (STAM) to handle the drift caused by occlusion and interaction among targets. The visibility map of the target is learned and used to infer the spatial attention map, which is then applied to weight the features. In addition, the occlusion status can be estimated from the visibility map; it controls the online updating process via a weighted loss on training samples with different occlusion statuses in different frames, which can be regarded as a temporal attention mechanism. The proposed algorithm achieves 34.3 [...] on the challenging MOT15 and MOT16 benchmark datasets, respectively.
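
The two attention steps described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: in the paper the attention map is inferred from the visibility map by learned layers, whereas here plain normalization stands in for that step, and the occlusion status is assumed to be the mean visibility. All function names (`spatial_attention`, `apply_attention`, `occlusion_status`, `temporal_weighted_loss`) are hypothetical.

```python
from typing import List

Map2D = List[List[float]]


def spatial_attention(visibility: Map2D) -> Map2D:
    """Turn a visibility map into an attention map that sums to 1.

    Assumption: plain normalization stands in for the learned layers
    that infer attention from visibility in the paper.
    """
    total = sum(sum(row) for row in visibility)
    if total == 0.0:
        # Fully occluded target: fall back to uniform attention.
        n = len(visibility) * len(visibility[0])
        return [[1.0 / n for _ in row] for row in visibility]
    return [[v / total for v in row] for row in visibility]


def apply_attention(features: List[Map2D], attention: Map2D) -> List[Map2D]:
    """Weight every channel of a C x H x W feature map by the attention map."""
    return [
        [[f * a for f, a in zip(f_row, a_row)]
         for f_row, a_row in zip(channel, attention)]
        for channel in features
    ]


def occlusion_status(visibility: Map2D) -> float:
    """Mean visibility in [0, 1]; 1.0 means fully visible (assumed definition)."""
    cells = [v for row in visibility for v in row]
    return sum(cells) / len(cells)


def temporal_weighted_loss(losses: List[float], statuses: List[float]) -> float:
    """Average per-sample losses, weighting each by its frame's occlusion status."""
    total_weight = sum(statuses)
    if total_weight == 0.0:
        return 0.0
    return sum(l * s for l, s in zip(losses, statuses)) / total_weight


# Toy example: a 2 x 2 target region whose bottom half is occluded.
visibility = [[1.0, 1.0], [0.0, 0.0]]
features = [[[2.0, 4.0], [6.0, 8.0]]]  # one channel, 2 x 2
attention = spatial_attention(visibility)
weighted = apply_attention(features, attention)
status = occlusion_status(visibility)
# Two training samples: one from a fully visible frame, one from this frame.
loss = temporal_weighted_loss([1.0, 3.0], [1.0, status])
```

In this toy setup the occluded cells contribute nothing to the weighted features, and the sample from the half-occluded frame counts half as much in the update loss as the sample from the fully visible frame.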

research
∙ 02/02/2019

Online Multi-Object Tracking with Dual Matching Attention Networks

In this paper, we propose an online Multi-Object Tracking (MOT) approach...
research
∙ 05/30/2022

Time3D: End-to-End Joint Monocular 3D Object Detection and Tracking for Autonomous Driving

While separately leveraging monocular 3D object detection and 2D multi-o...
research
∙ 05/31/2022

Joint Spatial-Temporal and Appearance Modeling with Transformer for Multiple Object Tracking

The recent trend in multiple object tracking (MOT) is heading towards le...
research
∙ 03/23/2018

Learning Spatial-Temporal Regularized Correlation Filters for Visual Tracking

Discriminative Correlation Filters (DCF) are efficient in visual trackin...
research
∙ 07/05/2018

Spatiotemporal KSVD Dictionary Learning for Online Multi-target Tracking

In this paper, we present a new spatial discriminative KSVD dictionary a...
research
∙ 10/02/2020

Leveraging Tacit Information Embedded in CNN Layers for Visual Tracking

Different layers in CNNs provide not only different levels of abstractio...
research
∙ 01/09/2019

Fast CNN-Based Object Tracking Using Localization Layers and Deep Features Interpolation

Object trackers based on Convolution Neural Network (CNN) have achieved ...
