
Generalized Identifiability Bounds for Mixture Models with Grouped Samples

by Robert A. Vandermeulen, et al.
Berlin Institute of Technology (Technische Universität Berlin)

Recent work has shown that finite mixture models with m components are identifiable, with no assumptions on the mixture components, provided one has access to groups of 2m-1 samples that are known to come from the same mixture component. In this work we generalize that result and show that, if every subset of k mixture components of a mixture model is linearly independent, then that mixture model is identifiable with only (2m-1)/(k-1) samples per group. We further show that this value cannot be improved. We prove an analogous result for a stronger form of identifiability known as "determinedness", along with a corresponding lower bound. The independence assumption holds almost surely if the mixture components are chosen randomly from a k-dimensional space. We describe some implications of our results for multinomial mixture models and topic modeling.
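
To make the two group sizes concrete, the following is a minimal sketch that simply evaluates the expressions quoted in the abstract. The function names are hypothetical, and rounding the ratio up to a whole number of samples is an assumption made here for illustration; the abstract states only the ratio (2m-1)/(k-1).

```python
from fractions import Fraction
from math import ceil


def baseline_group_size(m: int) -> int:
    """Group size 2m-1 from the earlier, assumption-free result."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return 2 * m - 1


def generalized_group_size(m: int, k: int) -> Fraction:
    """Exact ratio (2m-1)/(k-1) quoted in the abstract.

    k is the size of the component subsets assumed to be linearly
    independent; k must be at least 2 so the denominator is positive.
    """
    if m < 1:
        raise ValueError("m must be a positive integer")
    if k < 2:
        raise ValueError("k must be at least 2")
    return Fraction(2 * m - 1, k - 1)


def samples_needed(m: int, k: int) -> int:
    """Assumption (not from the abstract): round the ratio up,
    since a group contains a whole number of samples."""
    return ceil(generalized_group_size(m, k))


if __name__ == "__main__":
    for m, k in [(5, 2), (5, 4), (4, 3)]:
        print(m, k, generalized_group_size(m, k), samples_needed(m, k))
```

With k = 2 the ratio reduces to 2m-1, the group size from the earlier result, so in this sketch the saving only appears for k of 3 or more.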



