ZJU ReLER Submission for EPIC-KITCHEN Challenge 2023: TREK-150 Single Object Tracking

07/05/2023
by Yuanyou Xu, et al.

The Associating Objects with Transformers (AOT) framework has exhibited exceptional performance in a wide range of complex scenarios for video object tracking and segmentation. In this study, we convert the bounding boxes in reference frames to masks with the help of the Segment Anything Model (SAM) and Alpha-Refine, and then propagate the masks to the current frame, transforming the task from video object tracking (VOT) to video object segmentation (VOS). Furthermore, we introduce MSDeAOT, a variant of the AOT series that incorporates transformers at multiple feature scales. MSDeAOT efficiently propagates object masks from previous frames to the current frame using two feature scales of 16 and 8. As a testament to the effectiveness of our design, we achieved 1st place in the EPIC-KITCHENS TREK-150 Object Tracking Challenge.
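The box-to-mask and mask-to-box conversions that bracket such a pipeline can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: `box_to_rect_mask` is a hypothetical, crude stand-in that simply fills the box (the abstract uses SAM and Alpha-Refine for this step), `mask_to_box` is an assumed way to read a tracking box back from a propagated mask, and the (x, y, w, h) box convention is also an assumption.

```python
import numpy as np


def box_to_rect_mask(box, height, width):
    """Fill an (x, y, w, h) box as a binary mask.

    Hypothetical placeholder for a learned box-to-mask model; it only
    marks every pixel inside the box, clipped to the image bounds.
    """
    x, y, w, h = box
    x0, y0 = max(0, x), max(0, y)
    # Clamp the far edges so a box lying outside the image gives an empty slice.
    x1 = min(width, max(x0, x + w))
    y1 = min(height, max(y0, y + h))
    mask = np.zeros((height, width), dtype=bool)
    mask[y0:y1, x0:x1] = True
    return mask


def mask_to_box(mask):
    """Return the tight (x, y, w, h) box around the True pixels, or None if empty."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None  # object not visible in this frame
    x0, x1 = int(xs.min()), int(xs.max())
    y0, y1 = int(ys.min()), int(ys.max())
    return (x0, y0, x1 - x0 + 1, y1 - y0 + 1)
```

In a VOS-based tracker of this kind, the mask predicted for each frame would be passed through something like `mask_to_box` to produce the box that a tracking benchmark expects; the empty-mask case has to be handled explicitly because the object may leave the view.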

