Frame-to-Frame Aggregation of Active Regions in Web Videos for Weakly Supervised Semantic Segmentation

08/13/2019
by Jungbeom Lee, et al.

When a deep neural network is trained on data with only image-level labels, the regions activated in each image tend to cover only a small part of the target object. We propose a method that uses videos automatically harvested from the web to identify a larger region of the target object by exploiting temporal information, which is not present in a static image. The temporal variations in a video allow different regions of the target object to be activated. We obtain an activated region in each frame of a video, and then aggregate the regions from successive frames into a single image, using a warping technique based on optical flow. The resulting localization maps cover more of the target object, and can then be used as proxy ground truth to train a segmentation network. This simple approach outperforms existing methods under the same level of supervision, and even approaches relying on extra annotations. Based on VGG-16 and ResNet-101 backbones, our method achieves mIoU of 65.0 and 67.4, respectively, on PASCAL VOC 2012 test images, which represents a new state of the art.
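
The aggregation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the per-frame activation maps and dense optical flow fields are already available as NumPy arrays, uses nearest-neighbour backward warping, and merges maps with an element-wise maximum. The abstract does not specify the merging rule, so the maximum is an assumption, and the function names `warp_to_reference` and `aggregate_activations` are hypothetical.

```python
import numpy as np


def warp_to_reference(src_map, flow):
    """Warp an activation map from another frame into the reference frame.

    Assumed convention: flow[y, x] = (dx, dy) is the displacement from
    reference pixel (x, y) to its corresponding position in the source frame.
    Sampling is nearest-neighbour; out-of-range coordinates are clipped.
    """
    h, w = src_map.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return src_map[src_y, src_x]


def aggregate_activations(ref_map, other_maps, flows):
    """Merge warped maps into the reference map (element-wise max, assumed)."""
    aggregated = ref_map.copy()
    for other_map, flow in zip(other_maps, flows):
        warped = warp_to_reference(other_map, flow)
        aggregated = np.maximum(aggregated, warped)
    return aggregated


# Toy example: the reference frame activates one part of the object, a later
# frame activates a different part, and the scene has shifted right by 1 px.
ref = np.zeros((5, 5))
ref[2, 1] = 1.0
later = np.zeros((5, 5))
later[2, 3] = 1.0
flow = np.zeros((5, 5, 2))
flow[..., 0] = 1.0  # dx = +1 everywhere, dy = 0

agg = aggregate_activations(ref, [later], [flow])
```

In this toy case the later frame's activation lands on the pixel next to the reference activation, so the aggregated map covers two pixels of the object instead of one. A real pipeline would typically use bilinear sampling and an estimated flow field instead of a constant one.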


