Generalized Identifiability Bounds for Mixture Models with Grouped Samples

07/22/2022
by Robert A. Vandermeulen, et al.

Recent work has shown that finite mixture models with m components are identifiable, with no assumptions on the mixture components, provided one has access to groups of 2m-1 samples that are known to come from the same mixture component. In this work we generalize that result and show that, if every subset of k mixture components of a mixture model is linearly independent, then that mixture model is identifiable with only (2m-1)/(k-1) samples per group. We further show that this value cannot be improved. We prove an analogous result, along with a corresponding lower bound, for a stronger form of identifiability known as "determinedness". The independence assumption holds almost surely if the mixture components are chosen randomly from a k-dimensional space. We describe some implications of our results for multinomial mixture models and topic modeling.
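
To make the stated group size concrete, the following minimal sketch evaluates the abstract's expression (2m-1)/(k-1) for a few values of m and k. The function name `samples_per_group`, the rounding up to a whole number of samples, and the restriction 2 <= k <= m are assumptions made for illustration; the precise statement is in the full text.

```python
def samples_per_group(m: int, k: int) -> int:
    """Evaluate the abstract's group size (2m-1)/(k-1), rounded up.

    m: number of mixture components.
    k: every subset of k components is assumed linearly independent.
    Rounding up and the range check on k are assumptions of this sketch,
    not statements taken from the paper.
    """
    if m < 2 or not (2 <= k <= m):
        raise ValueError("this sketch expects m >= 2 and 2 <= k <= m")
    numerator = 2 * m - 1
    denominator = k - 1
    # Integer ceiling division, avoiding floating-point rounding.
    return -(-numerator // denominator)


if __name__ == "__main__":
    # k = 2 reproduces the earlier 2m-1 group size quoted in the abstract.
    for m, k in [(3, 2), (5, 3), (10, 10)]:
        print(f"m={m}, k={k}: {samples_per_group(m, k)} samples per group")
```

With k = 2 the expression reduces to the 2m-1 group size of the earlier result, and larger k lowers the required group size.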

research
06/30/2016

An Operator Theoretic Approach to Nonparametric Mixture Models

When estimating finite mixture models, it is common to make assumptions ...
research
07/14/2018

On the Identifiability of Finite Mixtures of Finite Product Measures

The problem of identifiability of finite mixtures of finite product meas...
research
07/01/2021

Robust Estimation in Finite Mixture Models

We observe a n-sample, the distribution of which is assumed to belong, o...
research
07/13/2023

How to perform modeling with independent and preferential data jointly?

Continuous space species distribution models (SDMs) have a long-standing...
research
08/02/2012

Multidimensional Membership Mixture Models

We present the multidimensional membership mixture (M3) models where eve...
research
10/04/2016

The Search Problem in Mixture Models

We consider the task of learning the parameters of a single component o...
research
11/26/2021

Comparison of annual maximum series and flood-type-differentiated mixture models of partial duration series

The use of the annual maximum series for flood frequency analyses limits...