LVOS: A Benchmark for Long-term Video Object Segmentation

11/18/2022
by Lingyi Hong, et al.

Existing video object segmentation (VOS) benchmarks focus on short-term videos that last only about 3-5 seconds and in which objects are visible most of the time. These videos are poorly representative of practical applications, and the absence of long-term datasets restricts further investigation of VOS in realistic scenarios. In this paper, we therefore present a new benchmark dataset and evaluation methodology named LVOS, which consists of 220 videos with a total duration of 421 minutes. To the best of our knowledge, LVOS is the first densely annotated long-term VOS dataset. The videos in LVOS last 1.59 minutes on average, which is 20 times longer than videos in existing VOS datasets. Each video includes various attributes, especially challenges arising in the wild, such as long-term reappearance and cross-temporal similar objects. Moreover, we provide additional language descriptions to encourage the exploration of integrating linguistic and visual features for video object segmentation. Based on LVOS, we assess existing video object segmentation algorithms and propose a Diverse Dynamic Memory network (DDMemory) that consists of three complementary memory banks to exploit temporal information adequately. The experimental results demonstrate the strengths and weaknesses of prior methods, pointing to promising directions for further study. Our objective is to provide the community with a large and varied benchmark to boost the advancement of long-term VOS. Data and code are available at <https://lingyihongfd.github.io/lvos.github.io/>.


