An Introduction to the Practical and Theoretical Aspects of Mixture-of-Experts Modeling

by Hien D. Nguyen, et al.

Mixture-of-experts (MoE) models are a powerful paradigm for modeling data arising from complex data generating processes (DGPs). In this article, we demonstrate how different MoE models can be constructed to approximate the underlying DGPs of arbitrary types of data. Due to the probabilistic nature of MoE models, we propose the maximum quasi-likelihood (MQL) estimator as a method for estimating MoE model parameters from data, and we provide conditions under which MQL estimators are consistent and asymptotically normal. The blockwise minorization-maximization (blockwise-MM) algorithm framework is proposed as an all-purpose method for constructing algorithms for obtaining MQL estimators. An example derivation of a blockwise-MM algorithm is provided. We then present a method for constructing information criteria for estimating the number of components in MoE models and provide justification for the classic Bayesian information criterion (BIC). We explain how MoE models can be used to conduct classification, clustering, and regression, and we illustrate these applications via a pair of worked examples.
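
As a rough illustration of the objects the abstract refers to, the sketch below evaluates the conditional density of one common MoE form, with softmax gates and Gaussian linear experts, and computes the BIC from a fitted log-likelihood. This is a minimal sketch under stated assumptions, not the construction used in the article: the choice of softmax gating, Gaussian experts, a scalar input, and all function and parameter names (`softmax_gates`, `moe_density`, `bic`, `gate_w`, `gate_b`, and so on) are hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax_gates(x, gate_w, gate_b):
    """Gating probabilities for a scalar input x, one per expert (assumed softmax form)."""
    scores = gate_w * x + gate_b
    scores = scores - scores.max()  # subtract the max for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()


def moe_density(y, x, gate_w, gate_b, slopes, intercepts, sigmas):
    """Conditional density f(y | x) = sum_k gate_k(x) * Normal(y; slope_k * x + intercept_k, sigma_k^2)."""
    gates = softmax_gates(x, gate_w, gate_b)
    means = slopes * x + intercepts
    expert_pdfs = np.exp(-0.5 * ((y - means) / sigmas) ** 2) / (sigmas * np.sqrt(2.0 * np.pi))
    return float(np.sum(gates * expert_pdfs))


def bic(log_likelihood, n_params, n_obs):
    """Classic BIC: -2 * log-likelihood + (number of parameters) * log(sample size)."""
    return -2.0 * log_likelihood + n_params * np.log(n_obs)


# Hypothetical two-expert model, used only to exercise the functions.
gate_w = np.array([1.0, -1.0])
gate_b = np.array([0.0, 0.0])
slopes = np.array([2.0, -0.5])
intercepts = np.array([0.0, 1.0])
sigmas = np.array([0.5, 1.0])

density_at_point = moe_density(0.3, 0.1, gate_w, gate_b, slopes, intercepts, sigmas)
```

Under this convention, candidate models with different numbers of experts would each be fitted and the one with the smallest BIC retained; the fitting step itself is not shown here.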


An Introduction to MM Algorithms for Machine Learning and Statistical

MM (majorization-minimization) algorithms are an increasingly popular t...

A Novel Algorithm for Clustering of Data on the Unit Sphere via Mixture Models

A new maximum approximate likelihood (ML) estimation algorithm for the m...

Non-Normal Mixtures of Experts

Mixture of Experts (MoE) is a popular framework for modeling heterogenei...

Non-asymptotic model selection in block-diagonal mixture of polynomial experts models

Model selection, via penalized likelihood type criteria, is a standard t...

QM/MM Methods for Crystalline Defects. Part 3: Machine-Learned Interatomic Potentials

We develop and analyze a framework for consistent QM/MM (quantum/classic...

Prediction Errors for Penalized Regressions based on Generalized Approximate Message Passing

We discuss the prediction accuracy of assumed statistical models in term...

Consistent Estimators for Nonlinear Vessel Models

In this work, the issue of obtaining consistent parameter estimators for...