EgoTracks: A Long-term Egocentric Visual Object Tracking Dataset

01/09/2023
by Hao Tang, et al.

Visual object tracking is a key component of many egocentric vision problems. However, the full spectrum of challenges that egocentric tracking poses for an embodied AI is underrepresented in many existing datasets, which tend to focus on relatively short, third-person videos. Egocentric video differs from the video commonly found in past datasets in several ways: frequent large camera motions and hand interactions with objects often lead to occlusions or to objects exiting the frame, and object appearance can change rapidly due to widely different points of view, scales, or object states. Embodied tracking is also naturally long-term, and being able to consistently (re-)associate objects with their appearances and disappearances over as long as a lifetime is critical. Previous datasets under-emphasize this re-detection problem, and their "framed" nature has led to the adoption of various spatiotemporal priors that we find do not necessarily generalize to egocentric video. We thus introduce EgoTracks, a new dataset for long-term egocentric visual object tracking. Sourced from the Ego4D dataset, EgoTracks presents a significant challenge to recent state-of-the-art single-object tracking models, which we find score poorly on traditional tracking metrics on our new dataset compared to popular benchmarks. We further show improvements that can be made to a STARK tracker to significantly increase its performance on egocentric data, resulting in a baseline model we call EgoSTARK. We publicly release our annotations and benchmark in the hope that the dataset leads to further advances in tracking.
