DeepAI AI Chat
Log In Sign Up

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision

by   Shiyi Lan, et al.

We introduce DiscoBox, a novel framework that jointly learns instance segmentation and semantic correspondence using bounding box supervision. Specifically, we propose a self-ensembling framework where instance segmentation and semantic correspondence are jointly guided by a structured teacher in addition to the bounding box supervision. The teacher is a structured energy model incorporating a pairwise potential and a cross-image potential to model the pairwise pixel relationships both within and across the boxes. Minimizing the teacher energy simultaneously yields refined object masks and dense correspondences between intra-class objects, which are taken as pseudo-labels to supervise the task network and provide positive/negative correspondence pairs for dense constrastive learning. We show a symbiotic relationship where the two tasks mutually benefit from each other. Our best model achieves 37.9 supervised methods and is competitive to supervised methods. We also obtain state of the art weakly supervised results on PASCAL VOC12 and PF-PASCAL with real-time inference.


page 1

page 4

page 8

page 13


Semantic Instance Segmentation of 3D Scenes Through Weak Bounding Box Supervision

Current 3D segmentation methods heavily rely on large-scale point-cloud ...

BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation

Weakly supervised segmentation methods using bounding box annotations fo...

Joint Learning of Feature Extraction and Cost Aggregation for Semantic Correspondence

Establishing dense correspondences across semantically similar images is...

Affinity Attention Graph Neural Network for Weakly Supervised Semantic Segmentation

Weakly supervised semantic segmentation is receiving great attention due...

Where are the Masks: Instance Segmentation with Image-level Supervision

A major obstacle in instance segmentation is that existing methods often...

DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model

Observing the close relationship among panoptic, semantic and instance s...

Weakly-Supervised Amodal Instance Segmentation with Compositional Priors

Amodal segmentation in biological vision refers to the perception of the...