The Search Problem in Mixture Models

10/04/2016
by Avik Ray, et al.

We consider the task of learning the parameters of a single component of a mixture model when we are given side information about that component; we call this the "search problem" in mixture models. We would like to solve it with lower computational and sample complexity than the original problem of learning the parameters of all components. Our main contributions are a simple but general model for the notion of side information, and a corresponding simple matrix-based algorithm that solves the search problem in this general setting. We then specialize this model and algorithm to four common scenarios: Gaussian mixture models, LDA topic models, subspace clustering, and mixed linear regression. For each of these we show that if (and only if) the side information is informative, we obtain better sample complexity than existing standard mixture model algorithms (e.g., tensor methods). We also illustrate several natural ways to obtain such side information for specific problem instances. Our experiments on real datasets (NY Times, Yelp, BSDS500) further demonstrate the practicality of our algorithms, showing significant improvements in runtime and accuracy.
