A Conway-Maxwell-Multinomial Distribution for Flexible Modeling of Clustered Categorical Data

11/05/2019
by Darcy Steeg Morris, et al.

Categorical data are often observed as counts resulting from a fixed number of trials in which each trial consists of making one selection from a prespecified set of categories. The multinomial distribution serves as a standard model for such clustered data but assumes that trials are independent and identically distributed. Extensions such as the Dirichlet-multinomial and random-clumped multinomial distributions can express positive association, where trials are more likely to result in a common category due to membership in a common cluster. This work considers a Conway-Maxwell-multinomial (CMM) distribution for modeling clustered categorical data exhibiting positively or negatively associated trials. The CMM distribution features a dispersion parameter that allows it to adapt to a range of association levels, and it includes several recognizable distributions as special cases. We explore properties of CMM, illustrate its flexible characteristics, identify a method to efficiently compute maximum likelihood (ML) estimates, present simulations of small-sample properties under ML estimation, and demonstrate the model via several data analysis examples.
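The role of the dispersion parameter can be illustrated with a small numerical sketch. The abstract does not state the probability mass function, so the code assumes the form usually given for the CMM: the probability of a count vector y is proportional to the multinomial coefficient raised to a power nu, times the product of p_j^{y_j}. The names `compositions`, `cmm_pmf`, and `nu` are illustrative, and the normalizing constant is obtained by brute-force enumeration, which is only practical for small numbers of trials and categories. This is not the efficient ML method the abstract refers to.

```python
from math import exp, lgamma, log


def compositions(m, k):
    """Yield all length-k tuples of non-negative integers summing to m."""
    if k == 1:
        yield (m,)
        return
    for first in range(m + 1):
        for rest in compositions(m - first, k - 1):
            yield (first,) + rest


def log_multinomial_coef(y):
    """Log of m! / (y_1! ... y_k!) with m = sum(y)."""
    return lgamma(sum(y) + 1) - sum(lgamma(v + 1) for v in y)


def cmm_pmf(m, p, nu):
    """Assumed CMM pmf: P(y) proportional to coef(y)**nu * prod(p_j**y_j).

    m  : number of trials
    p  : category probabilities, all strictly positive
    nu : dispersion parameter (nu = 1 gives the ordinary multinomial)
    Returns a dict mapping each count vector to its probability.
    """
    log_w = {
        y: nu * log_multinomial_coef(y)
        + sum(v * log(pj) for v, pj in zip(y, p))
        for y in compositions(m, len(p))
    }
    shift = max(log_w.values())  # subtract the max for numerical stability
    norm = sum(exp(lw - shift) for lw in log_w.values())
    return {y: exp(lw - shift) / norm for y, lw in log_w.items()}


def first_count_variance(m, p, nu):
    """Variance of the first category count under the assumed CMM pmf."""
    pmf = cmm_pmf(m, p, nu)
    mean = sum(y[0] * pr for y, pr in pmf.items())
    return sum((y[0] - mean) ** 2 * pr for y, pr in pmf.items())


if __name__ == "__main__":
    for nu in (0.0, 0.5, 1.0, 2.0):
        print(nu, first_count_variance(10, (0.5, 0.5), nu))
```

Under this assumed form, nu = 1 recovers the multinomial, values below 1 spread mass toward the extreme count vectors (consistent with positively associated trials), and values above 1 concentrate mass near the balanced count vectors (consistent with negatively associated trials).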
