One-Shot Weakly Supervised Video Object Segmentation

12/18/2019
by   Mennatullah Siam, et al.

Conventional few-shot object segmentation methods learn object segmentation from a few labelled support images with strongly labelled segmentation masks. Recent work has shown that weaker levels of supervision, such as scribbles and bounding boxes, can perform on par with them. However, limited attention has been given to the problem of few-shot object segmentation with image-level supervision. We propose a novel multi-modal interaction module for few-shot object segmentation that utilizes a co-attention mechanism using both visual and word embeddings. It enables our model to achieve a 5.1% improvement over previously proposed image-level few-shot object segmentation. Our method comes relatively close to state-of-the-art methods that use strong supervision, while ours uses the least possible supervision. We further propose a novel setup for few-shot weakly supervised video object segmentation (VOS) that relies on image-level labels for the first frame. The proposed setup uses weak annotation, unlike the semi-supervised VOS setting, which utilizes strongly labelled segmentation masks. The setup evaluates the effectiveness of generalizing to novel classes in the VOS setting. It splits the VOS data into multiple folds with different categories per fold. It provides a potential setup to evaluate how few-shot object segmentation methods can benefit from additional object poses or object interactions that are not available in static frames, as in the PASCAL-5i benchmark.
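
The fold-based evaluation described above can be illustrated with a minimal sketch. It assumes a PASCAL-5i-style protocol in which the category list is partitioned into disjoint folds, one fold is held out as novel test classes, and the remaining folds supply the training classes. The category names, the number of folds, the contiguous partitioning, and the helper names `split_categories_into_folds` and `train_test_categories` are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def split_categories_into_folds(categories: Sequence[str], num_folds: int) -> List[List[str]]:
    """Partition categories into disjoint, contiguous folds of near-equal size.

    Assumption: contiguous partitioning in list order, as in PASCAL-5i-style
    splits. The paper's actual VOS split may differ.
    """
    if num_folds <= 0:
        raise ValueError("num_folds must be positive")
    base, extra = divmod(len(categories), num_folds)
    folds: List[List[str]] = []
    start = 0
    for i in range(num_folds):
        # The first `extra` folds take one additional category each.
        size = base + (1 if i < extra else 0)
        folds.append(list(categories[start:start + size]))
        start += size
    return folds


def train_test_categories(folds: List[List[str]], test_fold: int) -> Dict[str, List[str]]:
    """Hold out one fold as novel (test) classes; train on all the others."""
    test = list(folds[test_fold])
    train = [c for i, fold in enumerate(folds) if i != test_fold for c in fold]
    return {"train": train, "test": test}


# Hypothetical placeholder category names, 20 classes in 4 folds.
categories = [f"class_{i}" for i in range(20)]
folds = split_categories_into_folds(categories, num_folds=4)
split = train_test_categories(folds, test_fold=0)
```

Because the held-out fold shares no categories with the training folds, a model evaluated on `split["test"]` is measured on classes it never saw during training, which is the generalization property the setup is meant to test.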

