CoTracker: It is Better to Track Together

07/14/2023
by Nikita Karaev et al.

Methods for video motion prediction either jointly estimate the instantaneous motion of all points in a given video frame using optical flow, or independently track the motion of individual points throughout the video. The latter holds even for powerful deep-learning methods that can track points through occlusions. Tracking points individually ignores the strong correlations that can exist between points, for instance because they belong to the same physical object, which can harm performance. In this paper we therefore propose CoTracker, an architecture that jointly tracks multiple points throughout an entire video. This architecture combines several ideas from the optical-flow and tracking literature in a new, flexible, and powerful design. It is based on a transformer network that models the correlation of different points in time via specialised attention layers. The transformer iteratively updates an estimate of several trajectories. It can be applied in a sliding-window manner to very long videos, for which we engineer an unrolled training loop. It can track from one to several points jointly and supports adding new points to track at any time. The result is a flexible and powerful tracking algorithm that outperforms state-of-the-art methods on almost all benchmarks.
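To make the joint-tracking idea above concrete, here is a minimal, illustrative sketch in PyTorch. It is not the authors' implementation: the class name, feature dimensions, number of refinement iterations, and the stand-in appearance features are all assumptions made for clarity. It shows the three ingredients the abstract describes: one token per (point, frame), attention factorized across time and across points, and iterative refinement of the trajectory estimates.

```python
# Illustrative sketch only -- NOT the CoTracker implementation.
import torch
import torch.nn as nn

class JointTrackerSketch(nn.Module):
    def __init__(self, dim=128, heads=4, iters=4):
        super().__init__()
        self.iters = iters
        self.time_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.point_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.to_token = nn.Linear(2 + dim, dim)  # (x, y) estimate + appearance feature
        self.to_delta = nn.Linear(dim, 2)        # predicted coordinate update

    def forward(self, coords, feats):
        # coords: (B, N, T, 2) current trajectory estimates for N points over T frames
        # feats:  (B, N, T, dim) per-point appearance features (e.g. sampled from a CNN)
        B, N, T, _ = coords.shape
        for _ in range(self.iters):
            tokens = self.to_token(torch.cat([coords, feats], dim=-1))  # (B, N, T, dim)
            # Time attention: each point attends across all frames of its own track.
            t = tokens.reshape(B * N, T, -1)
            t = self.time_attn(t, t, t)[0]
            # Point attention: each frame attends across all tracked points,
            # letting correlated points (e.g. on the same object) share evidence.
            p = t.reshape(B, N, T, -1).permute(0, 2, 1, 3).reshape(B * T, N, -1)
            p = self.point_attn(p, p, p)[0]
            tokens = p.reshape(B, T, N, -1).permute(0, 2, 1, 3)
            # Iteratively refine the trajectory estimates.
            coords = coords + self.to_delta(tokens)
        return coords

# Usage: refine 8 tracks over a 16-frame window.
model = JointTrackerSketch()
coords = torch.zeros(1, 8, 16, 2)   # initial guesses (e.g. the query positions)
feats = torch.randn(1, 8, 16, 128)  # stand-in appearance features
refined = model(coords, feats)      # (1, 8, 16, 2)
```

For long videos, the same module could be run over overlapping temporal windows, with each window initialized from the previous one's output; the paper's unrolled training loop trains through this sliding-window process end to end.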

