Harvesting Information from Captions for Weakly Supervised Semantic Segmentation

05/16/2019
by   Johann Sawatzky, et al.
0

Since acquiring pixel-wise annotations for training convolutional neural networks for semantic image segmentation is time-consuming, weakly supervised approaches that only require class tags have been proposed. In this work, we propose another form of supervision, namely image captions as they can be found on the Internet. These captions have two advantages. They do not require additional curation as it is the case for the clean class tags used by current weakly supervised approaches and they provide textual context for the classes present in an image. To leverage such textual context, we deploy a multi-modal network that learns a joint embedding of the visual representation of the image and the textual representation of the caption. The network estimates text activation maps (TAMs) for class names as well as compound concepts, i.e. combinations of nouns and their attributes. The TAMs of compound concepts describing classes of interest substantially improve the quality of the estimated class activation maps which are then used to train a network for semantic segmentation. We evaluate our method on the COCO dataset where it achieves state of the art results for weakly supervised image segmentation.

READ FULL TEXT

page 4

page 6

research
11/16/2021

Weakly-supervised fire segmentation by visualizing intermediate CNN layers

Fire localization in images and videos is an important step for an auton...
research
05/25/2016

Describing Human Aesthetic Perception by Deeply-learned Attributes from Flickr

Many aesthetic models in computer vision suffer from two shortcomings: 1...
research
04/06/2021

Weakly supervised segmentation with cross-modality equivariant constraints

Weakly supervised learning has emerged as an appealing alternative to al...
research
08/12/2021

Towers of Babel: Combining Images, Language, and 3D Geometry for Learning Multimodal Vision

The abundance and richness of Internet photos of landmarks and cities ha...
research
10/17/2022

Weakly Supervised Face Naming with Symmetry-Enhanced Contrastive Loss

We revisit the weakly supervised cross-modal face-name alignment task; t...
research
03/14/2023

Exploring Weakly Supervised Semantic Segmentation Ensembles for Medical Imaging Systems

Reliable classification and detection of certain medical conditions, in ...
research
03/19/2016

Seed, Expand and Constrain: Three Principles for Weakly-Supervised Image Segmentation

We introduce a new loss function for the weakly-supervised training of s...

Please sign up or login with your details

Forgot password? Click here to reset