Tight bounds for learning a mixture of two Gaussians

04/19/2014
by Moritz Hardt, et al.

We consider the problem of identifying the parameters of an unknown mixture of two arbitrary d-dimensional Gaussians from a sequence of independent random samples. Our main results are upper and lower bounds giving a computationally efficient moment-based estimator with an optimal convergence rate, thus resolving a problem introduced by Pearson (1894). Denoting by σ^2 the variance of the unknown mixture, we prove that Θ(σ^12) samples are necessary and sufficient to estimate each parameter up to constant additive error when d = 1. Our upper bound extends to arbitrary dimension d > 1 up to a (provably necessary) logarithmic loss in d, using a novel, yet simple, dimensionality reduction technique. We further identify several interesting special cases where the sample complexity is notably smaller than our optimal worst-case bound. For instance, if the means of the two components are separated by Ω(σ), the sample complexity reduces to O(σ^2), and this is again optimal. Our results also apply to learning each component of the mixture up to small error in total variation distance, where our algorithm gives strong improvements in sample complexity over previous work. We also extend our lower bound to mixtures of k Gaussians, showing that Ω(σ^{6k-2}) samples are necessary to estimate each parameter up to constant additive error.
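To make the moment-based idea concrete, here is a minimal sketch (not the authors' algorithm) that fits a one-dimensional mixture of two Gaussians by matching its first six empirical raw moments, in the spirit of Pearson's method of moments. The function names, the sigmoid/log reparameterization, and the Nelder-Mead least-squares fit are assumptions made purely for illustration; the sketch requires numpy and scipy.

```python
# Illustrative sketch: method-of-moments fit for a 1-D mixture of two Gaussians.
# This is NOT the estimator from the paper; it only shows the moment-matching idea.
import numpy as np
from scipy.optimize import minimize


def gaussian_raw_moments(mu, sigma, k_max):
    """Raw moments E[X^k], k = 0..k_max, of N(mu, sigma^2), using the
    standard recursion m_k = mu * m_{k-1} + (k-1) * sigma^2 * m_{k-2}."""
    m = [1.0, mu]
    for k in range(2, k_max + 1):
        m.append(mu * m[k - 1] + (k - 1) * sigma**2 * m[k - 2])
    return np.array(m[: k_max + 1])


def mixture_raw_moments(w, mu1, s1, mu2, s2, k_max=6):
    """Raw moments of the mixture w * N(mu1, s1^2) + (1 - w) * N(mu2, s2^2)."""
    return (w * gaussian_raw_moments(mu1, s1, k_max)
            + (1 - w) * gaussian_raw_moments(mu2, s2, k_max))


def fit_by_moments(samples, k_max=6):
    """Match the first k_max empirical raw moments by least squares."""
    emp = np.array([np.mean(samples**k) for k in range(k_max + 1)])

    def loss(theta):
        w_raw, mu1, log_s1, mu2, log_s2 = theta
        w = 1.0 / (1.0 + np.exp(-w_raw))  # keep the mixing weight in (0, 1)
        model = mixture_raw_moments(w, mu1, np.exp(log_s1), mu2, np.exp(log_s2), k_max)
        return np.sum((model - emp) ** 2)

    # Crude initialization from the sample quartiles.
    theta0 = np.array([0.0, np.percentile(samples, 25), 0.0,
                       np.percentile(samples, 75), 0.0])
    res = minimize(loss, theta0, method="Nelder-Mead",
                   options={"maxiter": 20000, "xatol": 1e-10, "fatol": 1e-12})
    w_raw, mu1, log_s1, mu2, log_s2 = res.x
    return 1.0 / (1.0 + np.exp(-w_raw)), mu1, np.exp(log_s1), mu2, np.exp(log_s2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 200_000
    z = rng.random(n) < 0.4
    x = np.where(z, rng.normal(-1.0, 1.0, n), rng.normal(2.0, 0.5, n))
    print(fit_by_moments(x))  # (weight, mu1, sigma1, mu2, sigma2)
```

Intuitively, the sixth moment is the highest one needed to pin down a two-Gaussian mixture in one dimension, and estimating it to fixed accuracy at noise level σ takes on the order of σ^12 samples, which is consistent with the Θ(σ^12) rate stated in the abstract.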
