Icebreaker: Element-wise Active Information Acquisition with Bayesian Deep Latent Gaussian Model

08/13/2019
by   Wenbo Gong, et al.
3

In this paper we introduce the ice-start problem, i.e., the challenge of deploying machine learning models when only little or no training data is initially available, and acquiring each feature element of data is associated with costs. This setting is representative for the real-world machine learning applications. For instance, in the health-care domain, when training an AI system for predicting patient metrics from lab tests, obtaining every single measurement comes with a high cost. Active learning, where only the label is associated with a cost does not apply to such problem, because performing all possible lab tests to acquire a new training datum would be costly, as well as unnecessary due to redundancy. We propose Icebreaker, a principled framework to approach the ice-start problem. Icebreaker uses a full Bayesian Deep Latent Gaussian Model (BELGAM) with a novel inference method. Our proposed method combines recent advances in amortized inference and stochastic gradient MCMC to enable fast and accurate posterior inference. By utilizing BELGAM's ability to fully quantify model uncertainty, we also propose two information acquisition functions for imputation and active prediction problems. We demonstrate that BELGAM performs significantly better than the previous VAE (Variational autoencoder) based models, when the data set size is small, using both machine learning benchmarks and real-world recommender systems and health-care applications. Moreover, based on BELGAM, Icebreaker further improves the performance and demonstrate the ability to use minimum amount of the training data to obtain the highest test time performance.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
09/28/2018

EDDI: Efficient Dynamic Discovery of High-Value Information with Partial VAE

Making decisions requires information relevant to the task at hand. Many...
research
02/05/2021

Reducing the Amortization Gap in Variational Autoencoders: A Bayesian Random Function Approach

Variational autoencoder (VAE) is a very successful generative model whos...
research
12/03/2018

Crowd Sourcing based Active Learning Approach for Parking Sign Recognition

Deep learning models have been used extensively to solve real-world prob...
research
05/03/2023

Efficient Online Decision Tree Learning with Active Feature Acquisition

Constructing decision trees online is a classical machine learning probl...
research
11/06/2020

Trust Issues: Uncertainty Estimation Does Not Enable Reliable OOD Detection On Medical Tabular Data

When deploying machine learning models in high-stakes real-world environ...
research
06/08/2021

A critical look at the current train/test split in machine learning

The randomized or cross-validated split of training and testing sets has...

Please sign up or login with your details

Forgot password? Click here to reset