Detect to Track and Track to Detect

10/11/2017
by   Christoph Feichtenhofer, et al.
0

Recent approaches for high accuracy detection and tracking of object categories in video consist of complex multistage solutions that become more cumbersome each year. In this paper we propose a ConvNet architecture that jointly performs detection and tracking, solving the task in a simple and effective way. Our contributions are threefold: (i) we set up a ConvNet architecture for simultaneous detection and tracking, using a multi-task objective for frame-based object detection and across-frame track regression; (ii) we introduce correlation features that represent object co-occurrences across time to aid the ConvNet during tracking; and (iii) we link the frame level detections based on our across-frame tracklets to produce high accuracy detections at the video level. Our ConvNet architecture for spatiotemporal object detection is evaluated on the large-scale ImageNet VID dataset where it achieves state-of-the-art results. Our approach provides better single model performance than the winning method of the last ImageNet challenge while being conceptually much simpler. Finally, we show that by increasing the temporal stride we can dramatically increase the tracker speed.

READ FULL TEXT

page 1

page 5

page 6

page 9

research
11/13/2018

Detect or Track: Towards Cost-Effective Video Object Detection/Tracking

State-of-the-art object detectors and trackers are developing fast. Trac...
research
02/26/2016

Seq-NMS for Video Object Detection

Video object detection is challenging because objects that are easily de...
research
10/06/2021

3D-FCT: Simultaneous 3D Object Detection and Tracking Using Feature Correlation

3D object detection using LiDAR data remains a key task for applications...
research
08/26/2021

Disentangling What and Where for 3D Object-Centric Representations Through Active Inference

Although modern object detection and classification models achieve high ...
research
04/02/2020

Tracking Objects as Points

Tracking has traditionally been the art of following interest points thr...
research
03/31/2023

Video text tracking for dense and small text based on pp-yoloe-r and sort algorithm

Although end-to-end video text spotting methods based on Transformer can...
research
11/27/2018

Integrated Object Detection and Tracking with Tracklet-Conditioned Detection

Accurate detection and tracking of objects is vital for effective video ...

Please sign up or login with your details

Forgot password? Click here to reset