Full-Duplex Strategy for Video Object Segmentation

08/06/2021
by Ge-Peng Ji, et al.

Appearance and motion are two important sources of information in video object segmentation (VOS). Previous methods mainly rely on simplex solutions, which lowers the upper bound of feature collaboration within and across these two cues. In this paper, we study a novel framework, termed FSNet (Full-duplex Strategy Network), which uses a relational cross-attention module (RCAM) to achieve bidirectional message propagation across embedding subspaces. Furthermore, a bidirectional purification module (BPM) is introduced to update inconsistent features between the spatial-temporal embeddings, effectively improving model robustness. By considering the mutual restraint within the full-duplex strategy, our FSNet performs cross-modal feature passing (i.e., transmission and receiving) simultaneously before the fusion and decoding stage, making it robust to various challenging scenarios (e.g., motion blur, occlusion) in VOS. Extensive experiments on five popular benchmarks (i.e., DAVIS_16, FBMS, MCL, SegTrack-V2, and DAVSOD_19) show that our FSNet outperforms other state-of-the-art methods on both the VOS and video salient object detection tasks.
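To make the simplex versus full-duplex distinction concrete, here is a minimal sketch of a simultaneous two-way exchange between an appearance feature and a motion feature. It is not the paper's RCAM or BPM: the channel-gating scheme and every name in it (`channel_descriptor`, `gate`, `full_duplex_exchange`) are assumptions chosen for illustration, and features are plain nested lists (channels of flattened values) rather than tensors.

```python
import math


def sigmoid(x):
    """Logistic function used to squash a channel descriptor into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_descriptor(feat):
    """Global average pooling: one scalar per channel.

    `feat` is a list of channels, each a flat list of spatial values
    (a hypothetical stand-in for a C x H x W feature map).
    """
    return [sum(channel) / len(channel) for channel in feat]


def gate(feat, weights):
    """Scale every channel of `feat` by the matching weight."""
    return [[w * v for v in channel] for channel, w in zip(feat, weights)]


def full_duplex_exchange(appearance, motion):
    """Illustrative two-way exchange (assumed scheme, not the authors' module).

    Both gates are computed from the ORIGINAL inputs, so each branch
    transmits and receives in the same step. A simplex design would apply
    only one of the two gating lines below.
    """
    appearance_gate = [sigmoid(d) for d in channel_descriptor(appearance)]
    motion_gate = [sigmoid(d) for d in channel_descriptor(motion)]

    appearance_out = gate(appearance, motion_gate)  # motion -> appearance
    motion_out = gate(motion, appearance_gate)      # appearance -> motion
    return appearance_out, motion_out
```

Calling `full_duplex_exchange(appearance, motion)` with two features of equal channel count returns both updated features at once; neither output depends on the other branch's already-updated result, which is the property the abstract describes as performing transmission and receiving simultaneously.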


research
06/08/2022

Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation

Referring video object segmentation aims to predict foreground labels fo...
research
07/18/2022

Hierarchical Feature Alignment Network for Unsupervised Video Object Segmentation

Optical flow is an easily conceived and precious cue for advancing unsup...
research
04/08/2023

Co-attention Propagation Network for Zero-Shot Video Object Segmentation

Zero-shot video object segmentation (ZS-VOS) aims to segment foreground ...
research
06/02/2021

Rethinking Cross-modal Interaction from a Top-down Perspective for Referring Video Object Segmentation

Referring video object segmentation (RVOS) aims to segment video objects...
research
03/09/2022

A Unified Transformer Framework for Group-based Segmentation: Co-Segmentation, Co-Saliency Detection and Video Salient Object Detection

Humans tend to mine objects by learning from a group of images or severa...
research
08/08/2022

Semi-Supervised Cross-Modal Salient Object Detection with U-Structure Networks

Salient Object Detection (SOD) is a popular and important topic aimed at...
research
05/17/2023

Object Segmentation by Mining Cross-Modal Semantics

Multi-sensor clues have shown promise for object segmentation, but inher...
