Fast Maximum Likelihood Estimation and Supervised Classification for the Beta-Liouville Multinomial

06/12/2020
by Steven Michael Lakin, et al.

The multinomial and related distributions have long been used to model categorical, count-based data in fields ranging from bioinformatics to natural language processing. The standard multinomial and the Dirichlet multinomial are the most commonly used variants because of their computational efficiency and straightforward parameter estimation. However, these distributions make strict assumptions about the mean, variance, and covariance of the categorical features being modeled. When the data do not meet these assumptions, the result may be poor parameter estimates and reduced accuracy in downstream applications such as classification. Here, we explore efficient parameter estimation and supervised classification methods using an alternative distribution, the Beta-Liouville multinomial, which relaxes some of the multinomial assumptions. We show that the Beta-Liouville multinomial is comparable in efficiency to the Dirichlet multinomial for Newton-Raphson maximum likelihood estimation, and that its performance on simulated data matches or exceeds that of the multinomial and Dirichlet multinomial distributions. Finally, we demonstrate that the Beta-Liouville multinomial outperforms the multinomial and Dirichlet multinomial on two out of four gold standard datasets, supporting its use in modeling data with low to medium class overlap in a supervised classification context.
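As a rough illustration of the two baseline distributions named in the abstract, the following minimal sketch evaluates the log-probability of a single count vector under a standard multinomial and under a Dirichlet multinomial, using the textbook forms of both mass functions. The function names are hypothetical and are not taken from the paper. The Beta-Liouville multinomial likelihood is not given in the abstract, so it is deliberately left out.

```python
import math


def multinomial_logpmf(counts, probs):
    """Log-probability of a count vector under a standard multinomial.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    n = sum(counts)
    # log( n! / prod_k x_k! ), computed with log-gamma for numerical stability
    log_coeff = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    # Zero counts are skipped, so a zero probability paired with a zero count
    # contributes nothing instead of raising on log(0).
    return log_coeff + sum(
        x * math.log(p) for x, p in zip(counts, probs) if x > 0
    )


def dirichlet_multinomial_logpmf(counts, alphas):
    """Log-probability of a count vector under a Dirichlet multinomial.

    Hypothetical helper for illustration; alphas must all be positive.
    """
    n = sum(counts)
    alpha_sum = sum(alphas)
    log_coeff = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    # log( Gamma(A) / Gamma(n + A) ), where A is the sum of the alphas
    log_ratio = math.lgamma(alpha_sum) - math.lgamma(n + alpha_sum)
    # sum_k log( Gamma(x_k + alpha_k) / Gamma(alpha_k) )
    log_terms = sum(
        math.lgamma(x + a) - math.lgamma(a) for x, a in zip(counts, alphas)
    )
    return log_coeff + log_ratio + log_terms
```

With two categories, three draws, and both concentration parameters set to 1, the Dirichlet multinomial assigns equal probability to every split of the counts, while a multinomial with equal category probabilities favors the balanced splits. This is one simple way to see the extra variance the Dirichlet multinomial allows.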


