STC: A Simple to Complex Framework for Weakly-supervised Semantic Segmentation

by   Yunchao Wei, et al.

Recently, significant improvement has been made on semantic object segmentation due to the development of deep convolutional neural networks (DCNNs). Training such a DCNN usually relies on a large number of images with pixel-level segmentation masks, and annotating these images is very costly in terms of both finance and human effort. In this paper, we propose a simple to complex (STC) framework in which only image-level annotations are utilized to learn DCNNs for semantic segmentation. Specifically, we first train an initial segmentation network called Initial-DCNN with the saliency maps of simple images (i.e., those with a single category of major object(s) and clean background). These saliency maps can be automatically obtained by existing bottom-up salient object detection techniques, where no supervision information is needed. Then, a better network called Enhanced-DCNN is learned with supervision from the predicted segmentation masks of simple images based on the Initial-DCNN as well as the image-level annotations. Finally, more pixel-level segmentation masks of complex images (two or more categories of objects with cluttered background), which are inferred by using Enhanced-DCNN and image-level annotations, are utilized as the supervision information to learn the Powerful-DCNN for semantic segmentation. Our method utilizes 40K simple images from and 10K complex images from PASCAL VOC for step-wisely boosting the segmentation network. Extensive experimental results on PASCAL VOC 2012 segmentation benchmark well demonstrate the superiority of the proposed STC framework compared with other state-of-the-arts.


page 3

page 5

page 8


Reliability Does Matter: An End-to-End Weakly Supervised Semantic Segmentation Approach

Weakly supervised semantic segmentation is a challenging task as it only...

Learning Semantic Segmentation with Diverse Supervision

Models based on deep convolutional neural networks (CNN) have significan...

Coarse-to-fine Semantic Segmentation from Image-level Labels

Deep neural network-based semantic segmentation generally requires large...

Free Lunch for Co-Saliency Detection: Context Adjustment

We unveil a long-standing problem in the prevailing co-saliency detectio...

Weakly-Supervised Semantic Segmentation by Iteratively Mining Common Object Features

Weakly-supervised semantic segmentation under image tags supervision is ...

Bottom-Up Top-Down Cues for Weakly-Supervised Semantic Segmentation

We consider the task of learning a classifier for semantic segmentation ...

Convolutional Feature Masking for Joint Object and Stuff Segmentation

The topic of semantic segmentation has witnessed considerable progress d...

Please sign up or login with your details

Forgot password? Click here to reset