Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation

06/09/2021
by Ho Kei Cheng, et al.

This paper presents a simple yet effective approach to modeling space-time correspondences in the context of video object segmentation. Unlike most existing approaches, we establish correspondences directly between frames without re-encoding the mask features for every object, leading to a highly efficient and robust framework. With the correspondences, every node in the current query frame is inferred by aggregating features from the past in an associative fashion. We cast the aggregation process as a voting problem and find that the existing inner-product affinity leads to poor use of memory, with a small (fixed) subset of memory nodes dominating the votes regardless of the query. In light of this phenomenon, we propose using the negative squared Euclidean distance instead to compute the affinities. We validate that every memory node now has a chance to contribute, and show experimentally that such diversified voting is beneficial to both memory efficiency and inference accuracy. The synergy of correspondence networks and diversified voting works exceedingly well, achieving new state-of-the-art results on both the DAVIS and YouTubeVOS datasets while running significantly faster, at 20+ FPS for multiple objects, without bells and whistles.
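The difference between the two affinities described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `affinity` and `readout`, the array shapes, the softmax normalization over memory nodes, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def affinity(mem_keys, qry_keys, kind="l2"):
    """Affinity between N memory nodes and M query nodes.

    mem_keys: (N, C) array, qry_keys: (M, C) array (assumed layout).
    Returns an (N, M) array; larger means a stronger vote.
    """
    dot = mem_keys @ qry_keys.T                           # (N, M) inner products
    if kind == "dot":
        return dot
    # Negative squared Euclidean distance: -(|m|^2 - 2 m.q + |q|^2)
    mem_sq = (mem_keys ** 2).sum(axis=1, keepdims=True)   # (N, 1)
    qry_sq = (qry_keys ** 2).sum(axis=1)[None, :]         # (1, M)
    return -(mem_sq - 2.0 * dot + qry_sq)

def readout(mem_keys, mem_values, qry_keys, kind="l2"):
    """Aggregate memory values for each query node by weighted voting.

    Assumption: votes are normalized with a softmax over memory nodes.
    """
    s = affinity(mem_keys, qry_keys, kind)
    s = s - s.max(axis=0, keepdims=True)                  # numerical stability
    w = np.exp(s)
    w /= w.sum(axis=0, keepdims=True)                     # each column sums to 1
    return w.T @ mem_values, w                            # (M, Cv), (N, M)

# Toy data (hypothetical): memory node 2 has a much larger norm than the others.
mem_keys = np.array([[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
mem_values = np.array([[10.0], [20.0], [30.0]])
qry_keys = np.array([[1.0, 0.0], [0.0, 1.0]])

print(affinity(mem_keys, qry_keys, "dot").argmax(axis=0))  # [2 2]
print(affinity(mem_keys, qry_keys, "l2").argmax(axis=0))   # [0 1]
```

In this toy case the large-norm memory node receives the top vote from both queries under the inner product, whereas under the negative squared Euclidean distance each query votes most strongly for its nearest memory node.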

research
04/01/2019

Video Object Segmentation using Space-Time Memory Networks

We propose a novel solution for semi-supervised video object segmentatio...
research
01/30/2020

Fast Video Object Segmentation using the Global Context Module

We developed a real-time, high-quality video object segmentation algorit...
research
06/16/2023

Learning Space-Time Semantic Correspondences

We propose a new task of space-time semantic correspondence prediction i...
research
04/13/2023

Boosting Video Object Segmentation via Space-time Correspondence Learning

Current top-leading solutions for video object segmentation (VOS) typica...
research
09/14/2021

Space Time Recurrent Memory Network

We propose a novel visual memory network architecture for the learning a...
research
10/01/2013

Object Detection Using Keygraphs

We propose a new framework for object detection based on a generalizatio...
research
02/09/2021

SwiftNet: Real-time Video Object Segmentation

In this work we present SwiftNet for real-time semi-supervised video obj...
