BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation

08/01/2022
by Ye Yu, et al.

Video Object Segmentation (VOS) is fundamental to video understanding. Transformer-based methods show significant performance improvements on semi-supervised VOS. However, existing work struggles to segment visually similar objects in close proximity to each other. In this paper, we propose a novel Bilateral Attention Transformer in Motion-Appearance Neighboring space (BATMAN) for semi-supervised VOS. It captures object motion in the video via a novel optical flow calibration module that fuses the segmentation mask with optical flow estimation to improve within-object optical flow smoothness and reduce noise at object boundaries. This calibrated optical flow is then employed in our novel bilateral attention, which computes the correspondence between the query and reference frames in the neighboring bilateral space, considering both motion and appearance. Extensive experiments validate the effectiveness of the BATMAN architecture, which outperforms all existing state-of-the-art methods on all four popular VOS benchmarks: Youtube-VOS 2019 (85.0), Youtube-VOS 2018 (85.3), … (92.5).
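The idea of attending only within a neighborhood defined by both motion and appearance can be illustrated with a small NumPy sketch. This is not the paper's formulation: the function name `bilateral_attention`, the `radius` parameter, the `(dy, dx)` flow layout, and the choice to center a square window on the flow-displaced position are all assumptions made for illustration. Motion decides where in the reference frame each query pixel looks, and appearance (a scaled dot product) decides how the pixels inside that window are weighted.

```python
import numpy as np

def bilateral_attention(query, ref_key, ref_value, flow, radius=1):
    """Illustrative local attention guided by optical flow (hypothetical helper).

    query:     (H, W, C) query-frame features
    ref_key:   (H, W, C) reference-frame key features
    ref_value: (H, W, D) reference-frame value features
    flow:      (H, W, 2) assumed (dy, dx) displacement from query to reference
    radius:    half-width of the square attention window (assumption)
    """
    H, W, C = query.shape
    D = ref_value.shape[-1]
    out = np.zeros((H, W, D))
    for y in range(H):
        for x in range(W):
            # Motion cue: center the window on the flow-displaced position,
            # clamped to the frame.
            cy = min(max(int(round(y + flow[y, x, 0])), 0), H - 1)
            cx = min(max(int(round(x + flow[y, x, 1])), 0), W - 1)
            y0, y1 = max(cy - radius, 0), min(cy + radius + 1, H)
            x0, x1 = max(cx - radius, 0), min(cx + radius + 1, W)
            keys = ref_key[y0:y1, x0:x1].reshape(-1, C)
            vals = ref_value[y0:y1, x0:x1].reshape(-1, D)
            # Appearance cue: softmax over scaled dot-product similarity,
            # computed only inside the window.
            logits = keys @ query[y, x] / np.sqrt(C)
            weights = np.exp(logits - logits.max())
            weights /= weights.sum()
            out[y, x] = weights @ vals
    return out
```

With `radius=0` each query pixel simply copies the reference value at its flow-displaced location; larger radii let appearance similarity choose among nearby candidates, which is the intuition behind restricting attention to a motion-appearance neighborhood rather than the whole frame.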


research
11/20/2021

FAMINet: Learning Real-time Semi-supervised Video Object Segmentation with Steepest Optimized Optical Flow

Semi-supervised video object segmentation (VOS) aims to segment a few mo...
research
01/25/2023

Flow-guided Semi-supervised Video Object Segmentation

We propose an optical flow-guided approach for semi-supervised video obj...
research
04/06/2022

Implicit Motion-Compensated Network for Unsupervised Video Object Segmentation

Unsupervised video object segmentation (UVOS) aims at automatically sepa...
research
11/29/2021

MUNet: Motion Uncertainty-aware Semi-supervised Video Object Segmentation

The task of semi-supervised video object segmentation (VOS) has been gre...
research
08/11/2021

Multi-Source Fusion and Automatic Predictor Selection for Zero-Shot Video Object Segmentation

Location and appearance are the key cues for video object segmentation. ...
research
09/04/2018

Unsupervised Video Object Segmentation using Motion Saliency-Guided Spatio-Temporal Propagation

Unsupervised video segmentation plays an important role in a wide variet...
research
07/05/2022

Segmenting Moving Objects via an Object-Centric Layered Representation

The objective of this paper is a model that is able to discover, track a...