Amortized Inference and Learning in Latent Conditional Random Fields for Weakly-Supervised Semantic Image Segmentation

05/03/2017
by Gaurav Pandey, et al.

Conditional random fields (CRFs) are commonly employed as a post-processing tool for image segmentation tasks. The unary potentials of the CRF are often learnt independently by a classifier, thereby decoupling inference in the CRF from the training of the classifier. Such a scheme works well when pixel-level labels are available for all the images. In the absence of pixel-level labels, however, the classifier faces the uphill task of selectively assigning the image-level labels to the pixels of the image. Prior work has often relied on localization cues, such as saliency maps, objectness priors, and bounding boxes, to address this challenging problem. In contrast, we model the labels of the pixels as latent variables of a CRF. The pixels and the image-level labels are the observed variables of the latent CRF. We amortize the cost of inference in the latent CRF over the entire dataset by training an inference network to approximate the posterior distribution of the latent variables given the observed variables. The inference network can be trained in an end-to-end fashion and requires no localization cues for training. Moreover, unlike other approaches for weakly-supervised segmentation, the proposed model does not require further post-processing. On the task of weakly-supervised semantic image segmentation on the challenging VOC 2012 dataset, the proposed model achieves performance comparable with other approaches that employ saliency masks.
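
To make the idea of amortized inference concrete, the following is a minimal sketch and not the authors' formulation. It assumes a fully factorized per-pixel posterior produced by a hypothetical inference network, a Potts-like smoothness term over 4-connected neighbours, and an image-level term in which a class is "present" if at least one pixel takes it. The function names, the `pairwise_weight` parameter, and the exact form of each term are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def amortized_objective(logits, image_labels, pairwise_weight=1.0):
    """Toy loss for a factorized posterior q(z | x) over pixel labels.

    logits:       (H, W, C) scores from a hypothetical inference network.
    image_labels: (C,) binary vector of image-level labels.
    All three terms below are assumptions chosen for illustration.
    """
    eps = 1e-8
    q = softmax(logits)  # per-pixel distribution over C classes

    # Entropy of the factorized posterior.
    entropy = -(q * np.log(q + eps)).sum()

    # Expected label agreement between horizontal and vertical neighbours.
    agree_h = (q[:, :-1] * q[:, 1:]).sum()
    agree_v = (q[:-1, :] * q[1:, :]).sum()
    smoothness = pairwise_weight * (agree_h + agree_v)

    # Image-level term: under q, class c is absent with probability
    # prod_i (1 - q_ic), and present otherwise.
    log_absent = np.log1p(-np.clip(q, 0.0, 1.0 - eps)).sum(axis=(0, 1))
    p_present = 1.0 - np.exp(log_absent)
    label_ll = (image_labels * np.log(p_present + eps)
                + (1 - image_labels) * log_absent).sum()

    # Negate so that lower is better, as for a training loss.
    return -(label_ll + smoothness + entropy)

# Usage: a 4x4 image with 3 classes, where classes 0 and 1 are present.
labels = np.array([1, 1, 0])
good = np.zeros((4, 4, 3))
good[:, :2, 0] = 5.0   # left half predicted as class 0
good[:, 2:, 1] = 5.0   # right half predicted as class 1
bad = np.zeros((4, 4, 3))
bad[:, :, 2] = 5.0     # every pixel predicted as the absent class 2
print(amortized_objective(good, labels) < amortized_objective(bad, labels))
```

In a real system the logits would come from a trainable network and the loss would be minimized by gradient descent over the whole dataset, which is what spreads the cost of inference across images rather than solving a separate optimization per image.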
