A Two-round Variant of EM for Gaussian Mixtures

01/16/2013
by   Sanjoy Dasgupta, et al.

Given a set of possible models (e.g., Bayesian network structures) and a data sample, in the unsupervised model selection problem the task is to choose the most accurate model with respect to the domain joint probability distribution. In contrast, in supervised model selection it is known a priori that the chosen model will be used in the future for prediction tasks involving more "focused" predictive distributions. Although focused predictive distributions can be produced from the joint probability distribution by marginalization, in practice the best model in the unsupervised sense does not necessarily perform well in supervised domains. In particular, the standard marginal likelihood score is a criterion for the unsupervised task, and although it is also frequently used for supervised model selection, it does not perform well in such tasks. In this paper we study the performance of the marginal likelihood score empirically in supervised Bayesian network selection tasks by using a large number of publicly available classification data sets, and compare the results to those obtained by alternative model selection criteria, including empirical cross-validation methods, an approximation of a supervised marginal likelihood measure, and a supervised version of Dawid's prequential (predictive sequential) principle. The results demonstrate that the marginal likelihood score does not perform well for supervised model selection, while the best results are obtained by using Dawid's prequential approach.
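To make the idea of a supervised prequential score concrete, the following is a minimal sketch, not the paper's implementation. It assumes a toy model with one discrete feature and a naive-Bayes-style classifier with Laplace smoothing; the function name `prequential_log_score` and the parameters `use_feature` and `alpha` are hypothetical. Each case is predicted from the cases seen before it, and the conditional log-probabilities of the true classes are summed.

```python
import math
from collections import defaultdict


def prequential_log_score(xs, ys, labels, use_feature=True, alpha=1.0):
    """Sum over i of log P(y_i | x_i, cases 1..i-1).

    Assumption: a single discrete feature whose value set is known in
    advance, and a smoothed naive-Bayes-style predictor. This is an
    illustrative stand-in for a Bayesian network, not the paper's model.
    """
    class_counts = defaultdict(float)  # N(y) over the cases seen so far
    joint_counts = defaultdict(float)  # N(x, y) over the cases seen so far
    values = sorted(set(xs))           # assumed known domain of the feature
    total = 0.0
    for x, y in zip(xs, ys):
        seen = sum(class_counts.values())
        scores = {}
        for c in labels:
            # Smoothed class prior from earlier cases only
            p = (class_counts[c] + alpha) / (seen + alpha * len(labels))
            if use_feature:
                # Smoothed P(x | c) from earlier cases only
                p *= (joint_counts[(x, c)] + alpha) / (
                    class_counts[c] + alpha * len(values)
                )
            scores[c] = p
        # Focused (conditional) predictive probability of the true class
        total += math.log(scores[y] / sum(scores.values()))
        # Only now reveal the case to the model
        class_counts[y] += 1
        joint_counts[(x, y)] += 1
    return total


# Toy data in which the feature determines the class exactly
xs = [0, 1, 0, 1, 0, 1, 0, 1]
ys = [0, 1, 0, 1, 0, 1, 0, 1]
with_feature = prequential_log_score(xs, ys, labels=[0, 1], use_feature=True)
without_feature = prequential_log_score(xs, ys, labels=[0, 1], use_feature=False)
```

On this toy data the structure that uses the feature receives the higher score, so a selection rule that picks the maximum would prefer it. The score measures only how well the class is predicted given the feature, which is what distinguishes it from a criterion based on the joint distribution.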

research
01/23/2013

On Supervised Selection of Bayesian Networks

Given a set of possible models (e.g., Bayesian network structures) and a...
research
01/10/2013

Classifier Learning with Supervised Marginal Likelihood

It has been argued that in supervised classification tasks, in practice ...
research
04/20/2018

Empirical-likelihood-based criteria for model selection on marginal analysis of longitudinal data with dropout missingness

Longitudinal data are common in clinical trials and observational studie...
research
02/06/2013

Models and Selection Criteria for Regression and Classification

When performing regression or classification, we are interested in the c...
research
08/21/2017

Network Model Selection for Task-Focused Attributed Network Inference

Networks are models representing relationships between entities. Often t...
research
10/13/2020

Bayesian model selection for unsupervised image deconvolution with structured Gaussian priors

This paper considers the objective comparison of stochastic models to so...
research
06/15/2018

Robust Bayesian Model Selection for Variable Clustering with the Gaussian Graphical Model

Variable clustering is important for explanatory analysis. However, only...
