Context-TAP: Tracking Any Point Demands Spatial Context Features

06/03/2023
by   Weikang Bian, et al.

We tackle the problem of Tracking Any Point (TAP) in videos, which aims at estimating persistent long-term trajectories of query points. Previous methods estimate these trajectories independently so that longer image sequences can be incorporated, and therefore ignore the potential benefits of spatial context features. We argue that independent video point tracking also demands spatial context features. To this end, we propose Context-TAP, a novel framework that effectively improves point trajectory accuracy by aggregating spatial context features in videos. Context-TAP contains two main modules: 1) a SOurce Feature Enhancement (SOFE) module, and 2) a TArget Feature Aggregation (TAFA) module. Context-TAP improves PIPs across the board, reducing the Average Trajectory Error of Occluded Points (ATE-Occ) by 11.4% on CroHD and increasing the Average Percentage of Correct Keypoint (A-PCK) by 11.8% on TAP-Vid-Kinetics. Demos are available at https://wkbian.github.io/Projects/Context-TAP/.
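To make the TAP problem setup concrete, the sketch below implements a toy tracker with the same interface as methods like Context-TAP: given a video and query points in the first frame, it returns a per-frame trajectory for each point. The patch-matching logic here is a deliberately simple hypothetical baseline for illustration only; it is not the Context-TAP architecture, which instead refines trajectories with learned correlation and spatial context features.

```python
import numpy as np

def track_points(video, queries, r=2, search=3):
    """Toy TAP baseline: follow each query point frame to frame by
    nearest-neighbour patch matching. Illustrates the task interface
    (video + query points -> trajectories), not Context-TAP's method.

    video:   (T, H, W) float array of grayscale frames
    queries: (N, 2) array of (x, y) positions in frame 0
    returns: (T, N, 2) array of per-frame (x, y) positions
    """
    T, H, W = video.shape
    trajs = np.zeros((T, len(queries), 2))
    trajs[0] = queries
    for n, (x0, y0) in enumerate(queries):
        x, y = int(x0), int(y0)
        # Template patch around the query point in the first frame.
        patch = video[0, max(y - r, 0):y + r + 1, max(x - r, 0):x + r + 1]
        for t in range(1, T):
            best_cost, bx, by = np.inf, x, y
            # Exhaustive search in a small window around the previous position.
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    ny, nx = y + dy, x + dx
                    cand = video[t, max(ny - r, 0):ny + r + 1,
                                 max(nx - r, 0):nx + r + 1]
                    if cand.shape != patch.shape:
                        continue  # window clipped at the image border
                    cost = np.abs(cand - patch).sum()
                    if cost < best_cost:
                        best_cost, bx, by = cost, nx, ny
            x, y = bx, by
            trajs[t, n] = (x, y)
    return trajs
```

Because each point is matched independently against a single first-frame template, this baseline fails under occlusion and appearance change, which is exactly the regime where aggregating spatial context from neighbouring trajectories, as Context-TAP proposes, pays off.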

Related Research

12/26/2022 · MonoNeRF: Learning a Generalizable Dynamic Radiance Field from Monocular Videos
In this paper, we target at the problem of learning a generalizable dyna...

07/02/2020 · Understanding Road Layout from Videos as a Whole
In this paper, we address the problem of inferring the layout of complex...

04/08/2022 · Particle Videos Revisited: Tracking Through Occlusions Using Point Trajectories
Tracking pixels in videos is typically studied as an optical flow estima...

01/22/2019 · Super-Trajectories: A Compact Yet Rich Video Representation
We propose a new video representation in terms of an over-segmentation o...

08/31/2023 · Decoupled Local Aggregation for Point Cloud Learning
The unstructured nature of point clouds demands that local aggregation b...

06/14/2023 · TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement
We present a novel model for Tracking Any Point (TAP) that effectively t...

11/30/2020 · DUT: Learning Video Stabilization by Simply Watching Unstable Videos
We propose a Deep Unsupervised Trajectory-based stabilization framework ...
