Multiple Object Tracking with Correlation Learning

by Qiang Wang, et al.

Recent works have shown that convolutional networks substantially improve the performance of multiple object tracking by learning detection and appearance features simultaneously. However, because convolutional networks perceive only local neighborhoods, they cannot efficiently capture long-range dependencies in either the spatial or the temporal domain. To incorporate the spatial layout, we propose a local correlation module that models the topological relationship between targets and their surrounding environment, which enhances the discriminative power of our model in crowded scenes. Specifically, we establish dense correspondences between each spatial location and its context, and explicitly constrain the correlation volumes through self-supervised learning. To exploit the temporal context, existing approaches generally use two or more adjacent frames to construct an enhanced feature representation, but dynamic motion scenes are inherently difficult to depict with CNNs. Instead, we propose a learnable correlation operator that establishes frame-to-frame matches over convolutional feature maps at different layers to align and propagate temporal context. Extensive experiments on the MOT datasets demonstrate the effectiveness of correlation learning, and our approach obtains a state-of-the-art MOTA of 76.5
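To make the idea of a correlation volume concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a plain, non-learned dot-product similarity, zero padding at the borders, and a square search window. The function name `local_correlation` and the `radius` parameter are hypothetical. Passing the same feature map twice gives a spatial (within-frame) correlation, while passing feature maps from two adjacent frames gives a frame-to-frame correlation.

```python
import numpy as np


def local_correlation(query, reference, radius=1):
    """Dot-product correlation between each location and its local window.

    query, reference: feature maps of shape (C, H, W).
    Returns an array of shape ((2 * radius + 1) ** 2, H, W). Channel
    dy * k + dx holds the similarity between query[:, y, x] and
    reference[:, y + dy - radius, x + dx - radius], with zeros used
    outside the image (an assumption of this sketch).
    """
    c, h, w = query.shape
    k = 2 * radius + 1
    # Zero-pad only the two spatial axes of the reference map.
    padded = np.pad(reference, ((0, 0), (radius, radius), (radius, radius)))
    volume = np.empty((k * k, h, w), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            shifted = padded[:, dy:dy + h, dx:dx + w]
            # Channel-wise dot product, normalized by the channel count.
            volume[dy * k + dx] = (query * shifted).sum(axis=0) / c
    return volume


rng = np.random.default_rng(0)
feat_t = rng.standard_normal((8, 4, 5))    # features of frame t
feat_t1 = np.roll(feat_t, shift=1, axis=2)  # frame t+1: content moved 1 px right

spatial = local_correlation(feat_t, feat_t, radius=1)    # shape (9, 4, 5)
temporal = local_correlation(feat_t, feat_t1, radius=1)  # shape (9, 4, 5)
```

In the temporal case, the channel for offset (0, +1) contains each location's self-similarity wherever the shifted content stays inside the image, which is the kind of frame-to-frame match a correlation operator can use to align features. A learnable operator, as described in the abstract, would replace the fixed dot product with trained parameters.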




