DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision

by Shiyi Lan, et al.

We introduce DiscoBox, a novel framework that jointly learns instance segmentation and semantic correspondence using bounding box supervision. Specifically, we propose a self-ensembling framework where instance segmentation and semantic correspondence are jointly guided by a structured teacher in addition to the bounding box supervision. The teacher is a structured energy model incorporating a pairwise potential and a cross-image potential to model the pairwise pixel relationships both within and across the boxes. Minimizing the teacher energy simultaneously yields refined object masks and dense correspondences between intra-class objects, which are taken as pseudo-labels to supervise the task network and provide positive/negative correspondence pairs for dense contrastive learning. We show a symbiotic relationship where the two tasks mutually benefit from each other. Our best model achieves 37.9% AP on COCO instance segmentation, surpasses prior weakly supervised methods, and is competitive with supervised methods. We also obtain state-of-the-art weakly supervised results on PASCAL VOC12 and PF-PASCAL with real-time inference.
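
To make the idea of learning from positive/negative correspondence pairs concrete, the following is a minimal sketch of an InfoNCE-style dense contrastive loss in NumPy. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the form of the loss, and the function name `dense_contrastive_loss`, the `matches` array (a stand-in for teacher-provided correspondences), and the `temperature` value are all hypothetical.

```python
import numpy as np


def dense_contrastive_loss(feats_a, feats_b, matches, temperature=0.1):
    """InfoNCE-style loss over pixel features (illustrative sketch only).

    feats_a: (N, D) pixel features sampled from one object.
    feats_b: (M, D) pixel features sampled from another object of the same class.
    matches: (N,) integer index into feats_b giving the positive for each row
             of feats_a; assumed here to come from some correspondence source.
    Every other row of feats_b acts as a negative for that pixel.
    """
    # L2-normalize so the dot product is a cosine similarity.
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)

    logits = a @ b.T / temperature                 # shape (N, M)
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability

    # Row-wise log-softmax, then pick the log-probability of each positive.
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(matches)), matches].mean()


# Toy usage: features of A are noisy copies of selected rows of B.
rng = np.random.default_rng(0)
feats_b = rng.normal(size=(5, 8))
correct = np.array([2, 0, 4])
feats_a = feats_b[correct] + 0.01 * rng.normal(size=(3, 8))
wrong = np.array([1, 3, 0])

loss_correct = dense_contrastive_loss(feats_a, feats_b, correct)
loss_wrong = dense_contrastive_loss(feats_a, feats_b, wrong)
```

In this toy setup the loss is lower when `matches` points at the true counterparts than when it points at unrelated pixels, which is the signal a contrastive objective of this kind uses to pull corresponding features together and push non-corresponding ones apart.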






