
Refined Convergence Rates for Maximum Likelihood Estimation under Finite Mixture Models

by Tudor Manole, et al.

We revisit convergence rates for maximum likelihood estimation (MLE) under finite mixture models. The Wasserstein distance has become a standard loss function for the analysis of parameter estimation in these models, due in part to its ability to circumvent label switching and to accurately characterize the behaviour of fitted mixture components with vanishing weights. However, the Wasserstein metric is only able to capture the worst-case convergence rate among the remaining fitted mixture components. We demonstrate that when the log-likelihood function is penalized to discourage vanishing mixing weights, stronger loss functions can be derived to resolve this shortcoming of the Wasserstein distance. These new loss functions accurately capture the heterogeneity in convergence rates of fitted mixture components, and we use them to sharpen existing pointwise and uniform convergence rates in various classes of mixture models. In particular, these results imply that a subset of the components of the penalized MLE typically converge significantly faster than could have been anticipated from past work. We further show that some of these conclusions extend to the traditional MLE. Our theoretical findings are supported by a simulation study to illustrate these improved convergence rates.
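To make two of the properties mentioned above concrete, here is a minimal sketch, not taken from the paper, that computes the first-order Wasserstein distance between two discrete mixing measures on the real line. The function name `wasserstein_1d` and the toy atoms and weights are hypothetical, and the one-dimensional, first-order setting is a simplifying assumption. The sketch illustrates that the distance is unchanged when component labels are permuted, and that a spurious fitted component with a small weight contributes only in proportion to that weight.

```python
def wasserstein_1d(atoms_a, weights_a, atoms_b, weights_b):
    """First-order Wasserstein distance between two discrete measures on the
    real line, computed as the integral of |F_a - F_b| over the line.
    Illustrative helper only (hypothetical name, not from the paper)."""
    # Signed mass at each atom: +weight for measure A, -weight for measure B.
    points = sorted(
        [(x, w) for x, w in zip(atoms_a, weights_a)]
        + [(x, -w) for x, w in zip(atoms_b, weights_b)]
    )
    total, cdf_gap = 0.0, 0.0
    for (x, w), (x_next, _) in zip(points, points[1:]):
        cdf_gap += w                          # F_a - F_b on [x, x_next)
        total += abs(cdf_gap) * (x_next - x)  # area of that slab
    return total


# Label switching: same mixing measure, components listed in a different order.
d_switch = wasserstein_1d([0.0, 1.0], [0.3, 0.7], [1.0, 0.0], [0.7, 0.3])

# Vanishing weight: the true measure has one atom at 0; the fitted measure has
# an extra atom at 5 carrying a small weight eps (toy values, assumed).
eps = 0.01
d_vanish = wasserstein_1d([0.0], [1.0], [0.0, 5.0], [1.0 - eps, eps])

print(d_switch, d_vanish)
```

In this toy setting the first distance is zero up to floating-point error, and the second is approximately `eps * 5`, so it shrinks as the spurious weight vanishes even though the extra atom stays far from the truth.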




Uniform Convergence Rates for Maximum Likelihood Estimation under Two-Component Gaussian Mixture Models

We derive uniform convergence rates for the maximum likelihood estimator...

Convergence of latent mixing measures in finite and infinite mixture models

This paper studies convergence behavior of latent mixing measures that a...

Convergence Rates of Latent Topic Models Under Relaxed Identifiability Conditions

In this paper we study the frequentist convergence rate for the Latent D...

Convergence Rates of Gradient Descent and MM Algorithms for Generalized Bradley-Terry Models

We show tight convergence rate bounds for gradient descent and MM algori...

Context-aware learning for finite mixture models

This work introduces algorithms able to exploit contextual information i...

On Excess Mass Behavior in Gaussian Mixture Models with Orlicz-Wasserstein Distances

Dirichlet Process mixture models (DPMM) in combination with Gaussian ker...