DiverseNet: When One Right Answer is not Enough

08/24/2020
by Michael Firman, et al.

Many structured prediction tasks in machine vision have a collection of acceptable answers, instead of one definitive ground truth answer. Segmentation of images, for example, is subject to human labeling bias. Similarly, there are multiple possible pixel values that could plausibly complete occluded image regions. State-of-the-art supervised learning methods are typically optimized to make a single test-time prediction for each query, and so fail to find other modes in the output space. Existing methods that allow for sampling often sacrifice speed or accuracy. We introduce a simple method for training a neural network that enables diverse structured predictions to be made for each test-time query. For a single input, we learn to predict a range of possible answers. We compare favorably to methods that seek diversity through an ensemble of networks. Such stochastic multiple choice learning faces mode collapse, where one or more ensemble members fail to receive any training signal. Our best performing solution can be deployed for various tasks, and involves only small modifications to the existing single-mode architecture, loss function, and training regime. We demonstrate that our method yields quantitative improvements across three challenging tasks: 2D image completion, 3D volume estimation, and flow prediction.
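The mode collapse described for the ensemble baseline can be illustrated with a toy example. The sketch below is not the method introduced in this paper. It assumes a winner-takes-all assignment, in which only the ensemble member with the lowest loss on each training example receives a training signal, as is commonly done in multiple choice learning. The function names, the squared-error loss, and the toy data are all hypothetical and chosen only for illustration.

```python
def squared_error(pred, target):
    """Sum of squared differences between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def winner_takes_all(member_preds, target):
    """Return (index, loss) of the ensemble member closest to the target.

    Under the assumed winner-takes-all rule, only this member would
    receive a gradient for the example.
    """
    losses = [squared_error(pred, target) for pred in member_preds]
    winner = min(range(len(losses)), key=losses.__getitem__)
    return winner, losses[winner]


def count_wins(batch_preds, batch_targets, num_members):
    """Count how many examples in the batch each member wins."""
    wins = [0] * num_members
    for member_preds, target in zip(batch_preds, batch_targets):
        winner, _ = winner_takes_all(member_preds, target)
        wins[winner] += 1
    return wins


# Toy data (hypothetical): three members that each emit a fixed output.
member_outputs = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
batch_targets = [[0.0, 0.0], [1.0, 1.0], [0.1, -0.1]]
batch_preds = [member_outputs for _ in batch_targets]

wins = count_wins(batch_preds, batch_targets, num_members=3)
# Members that never win would get no training signal from this batch.
starved = [k for k, w in enumerate(wins) if w == 0]
print("wins per member:", wins)
print("members without training signal:", starved)
```

In this toy batch the third member is never the closest prediction, so it wins no examples and would not be updated. This is the failure mode the abstract attributes to ensemble-based approaches.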


