Self-regularizing Property of Nonparametric Maximum Likelihood Estimator in Mixture Models

08/19/2020
by   Yury Polyanskiy, et al.

Introduced by Kiefer and Wolfowitz <cit.>, the nonparametric maximum likelihood estimator (NPMLE) is a widely used methodology for learning mixture models and empirical Bayes estimation. Sidestepping the non-convexity in mixture likelihood, the NPMLE estimates the mixing distribution by maximizing the total likelihood over the space of probability measures, which can be viewed as an extreme form of overparameterization. In this paper we discover a surprising property of the NPMLE solution. Consider, for example, a Gaussian mixture model on the real line with a subgaussian mixing distribution. Leveraging complex-analytic techniques, we show that with high probability the NPMLE based on a sample of size n has O(log n) atoms (mass points), significantly improving the deterministic upper bound of n due to Lindsay <cit.>. Notably, any such Gaussian mixture is statistically indistinguishable from a finite one with O(log n) components (and this is tight for certain mixtures). Thus, absent any explicit form of model selection, the NPMLE automatically chooses the right model complexity, a property we term self-regularization. Extensions to other exponential families are given. As a statistical application, we show that this structural property can be harnessed to bootstrap the existing Hellinger risk bound for the (parametric) MLE of finite Gaussian mixtures to the NPMLE for general Gaussian mixtures, recovering a result of Zhang <cit.>.
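Since the NPMLE maximizes the mixture likelihood over all mixing distributions, and this objective is concave in the mixing measure, it can be approximated in practice by restricting the atoms to a fine grid and optimizing the weights. The following is a minimal illustrative sketch in Python (not the paper's own computation), assuming a standard grid-based EM fixed-point update; the function name fit_npmle_grid and all parameter choices are hypothetical. Counting the grid points that retain non-negligible mass gives a rough proxy for the number of atoms in the solution, which the paper shows is O(log n) with high probability.

import numpy as np

def fit_npmle_grid(x, grid, sigma=1.0, n_iter=500):
    # Likelihood matrix: L[i, j] = Normal density of x[i] with mean grid[j] and variance sigma^2.
    L = np.exp(-0.5 * ((x[:, None] - grid[None, :]) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    w = np.full(grid.size, 1.0 / grid.size)          # uniform starting weights on the grid
    for _ in range(n_iter):
        mix = L @ w                                  # fitted mixture density at each observation
        w = w * (L / mix[:, None]).mean(axis=0)      # multiplicative (EM) fixed-point update; weights stay on the simplex
    return w

rng = np.random.default_rng(0)
n = 2000
theta = rng.choice(np.array([-2.0, 0.0, 2.0]), size=n)   # true mixing distribution: three atoms
x = theta + rng.standard_normal(n)                        # unit-variance Gaussian noise

grid = np.linspace(x.min(), x.max(), 400)
w = fit_npmle_grid(x, grid)
print("grid points with weight > 1e-3:", int(np.sum(w > 1e-3)), "out of", grid.size)

In this toy example the fitted weights concentrate on a small number of grid points near the true atoms, far fewer than the n observations, illustrating the self-regularization phenomenon described above.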


