On the Influence of Enforcing Model Identifiability on Learning Dynamics of Gaussian Mixture Models

06/17/2022
by   Pascal Mattia Esser, et al.

A common way to learn and analyze statistical models is to consider operations in the model parameter space. But what happens if we optimize in the parameter space and there is no one-to-one mapping between the parameter space and the underlying statistical model space? Such cases frequently occur for hierarchical models, including statistical mixtures and stochastic neural networks; these models are said to be singular. Singular models give rise to several important and well-studied problems in machine learning, such as slowed convergence of learning trajectories due to attractor behavior. In this work, we propose a relative reparameterization technique of the parameter space, which yields a general method for extracting regular submodels from singular models. Our method enforces model identifiability during training, and we study the learning dynamics of gradient descent and expectation maximization for Gaussian Mixture Models (GMMs) under relative reparameterization, showing faster experimental convergence and an improved manifold shape of the dynamics around the singularity. Extending the analysis beyond GMMs, we furthermore analyze the Fisher information matrix under relative reparameterization and its influence on the generalization error, and show how the method can be applied to more complex models such as deep neural networks.
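For context on the learning dynamics the abstract refers to: the singularity in a GMM arises, for example, when two components collapse onto each other, so that distinct parameter settings describe the same distribution. The sketch below is a plain expectation-maximization loop for a two-component 1-D GMM in pure Python; it illustrates the baseline EM dynamics the paper studies, not the paper's relative reparameterization (the function name `em_gmm_1d` and all implementation details are our own illustrative choices).

```python
import math


def gauss(x, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var at x."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(data, iters=50):
    """Standard EM for a two-component 1-D GMM (illustrative sketch only)."""
    # Initialize the means at the data extremes and use a broad shared variance.
    mu1, mu2 = min(data), max(data)
    var1 = var2 = (max(data) - min(data)) ** 2 / 4 + 1e-6
    pi = 0.5  # mixing weight of component 1
    for _ in range(iters):
        # E-step: responsibility of component 1 for each point.
        r = []
        for x in data:
            p1 = pi * gauss(x, mu1, var1)
            p2 = (1 - pi) * gauss(x, mu2, var2)
            r.append(p1 / (p1 + p2))
        # M-step: re-estimate means, variances, and mixing weight.
        n1 = sum(r)
        n2 = len(data) - n1
        mu1 = sum(ri * x for ri, x in zip(r, data)) / n1
        mu2 = sum((1 - ri) * x for ri, x in zip(r, data)) / n2
        var1 = sum(ri * (x - mu1) ** 2 for ri, x in zip(r, data)) / n1 + 1e-6
        var2 = sum((1 - ri) * (x - mu2) ** 2 for ri, x in zip(r, data)) / n2 + 1e-6
        pi = n1 / len(data)
    return pi, (mu1, var1), (mu2, var2)
```

When the two components are well separated, this iteration converges quickly; near the singular region (overlapping components), convergence slows, which is the attractor behavior the paper's reparameterization is designed to mitigate.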

research
12/07/2021

Towards Modeling and Resolving Singular Parameter Spaces using Stratifolds

When analyzing parametric statistical models, a useful approach consists...
research
05/27/2019

Lightlike Neuromanifolds, Occam's Razor and Deep Learning

Why do deep neural networks generalize with a very high dimensional para...
research
11/08/2022

A generalized AIC for models with singularities and boundaries

The Akaike information criterion (AIC) is a common tool for model select...
research
10/07/2019

Gaussian Mixture Clustering Using Relative Tests of Fit

We consider clustering based on significance tests for Gaussian Mixture ...
research
08/23/2023

Quantifying degeneracy in singular models via the learning coefficient

Deep neural networks (DNN) are singular statistical models which exhibit...
research
07/13/2023

Cramer Type Distances for Learning Gaussian Mixture Models by Gradient Descent

The learning of Gaussian Mixture Models (also referred to simply as GMMs...
research
01/04/2023

Learning Gaussian Mixtures Using the Wasserstein-Fisher-Rao Gradient Flow

Gaussian mixture models form a flexible and expressive parametric family...
