Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions

07/23/2019
by Matthew Faw, et al.

We consider a covariate shift problem where one has access to several marginally different training datasets for the same learning problem and a small validation set which possibly differs from all the individual training distributions. This covariate shift is caused, in part, by unobserved features in the datasets. The objective, then, is to find the best mixture distribution over the training datasets (with only observed features) such that training a learning algorithm using this mixture has the best validation performance. Our proposed algorithm, Mix&Match, combines stochastic gradient descent (SGD) with optimistic tree search and model re-use (evolving partially trained models with samples from different mixture distributions) over the space of mixtures, for this task. We prove simple regret guarantees for our algorithm with respect to recovering the optimal mixture, given a total budget of SGD evaluations. Finally, we validate our algorithm on two real-world datasets.
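To make the notion of training on a mixture over datasets concrete, the following is a minimal sketch of drawing one SGD minibatch from a mixture distribution. It is not the Mix&Match algorithm itself (there is no tree search or model re-use here); the function name `sample_mixture_batch`, the weight vector `alpha`, and the list-of-lists dataset layout are assumptions made for illustration only.

```python
import math
import random


def sample_mixture_batch(datasets, alpha, batch_size, rng):
    """Draw a minibatch in which each example comes from dataset k
    with probability alpha[k] (hypothetical helper, for illustration)."""
    if len(alpha) != len(datasets):
        raise ValueError("alpha needs one weight per training dataset")
    if any(a < 0 for a in alpha) or not math.isclose(sum(alpha), 1.0, abs_tol=1e-9):
        raise ValueError("alpha must be a probability vector")

    # Pick the source dataset of every example according to the mixture weights.
    sources = rng.choices(range(len(datasets)), weights=alpha, k=batch_size)
    # Then pick one example uniformly at random from each chosen dataset.
    batch = [rng.choice(datasets[k]) for k in sources]
    return batch, sources


if __name__ == "__main__":
    rng = random.Random(0)
    # Two toy "training datasets"; real ones would hold feature/label pairs.
    toy = [["a0", "a1", "a2"], ["b0", "b1"]]
    batch, sources = sample_mixture_batch(toy, [0.7, 0.3], batch_size=8, rng=rng)
    print(sources, batch)
```

Under this sketch, a search over mixtures would amount to trying different `alpha` vectors, running some SGD steps on batches drawn this way, and scoring the resulting model on the validation set.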


