The Poisson Multinomial Distribution and Its Applications in Voting Theory, Ecological Inference, and Machine Learning

01/11/2022
by Zhengzhi Lin, et al.

The Poisson multinomial distribution (PMD) describes the distribution of the sum of n independent but non-identically distributed random vectors, where each random vector has length m with 0/1-valued elements, and exactly one of its elements takes the value 1, each with a certain probability. These probabilities differ across the m elements and the n random vectors, and form an n × m matrix in which each row sums to 1. We call this n × m matrix the success probability matrix (SPM). Each SPM uniquely defines a PMD. The PMD is useful in many areas, such as voting theory, ecological inference, and machine learning. The distribution functions of the PMD, however, are usually difficult to compute. In this paper, we develop efficient methods to compute the probability mass function (pmf) of the PMD using the multivariate Fourier transform, normal approximation, and simulation. We study the accuracy and efficiency of these methods and give recommendations on which method to use under various scenarios. We also illustrate the use of the PMD via three applications: voting probability calculation, aggregated data inference, and uncertainty quantification in classification. We build an R package that implements the proposed methods, and illustrate the package with examples.
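
To make the definition concrete, the following is a minimal Python sketch that computes the exact pmf of a small PMD from its SPM. It uses plain sequential convolution over the n rows, which is not the Fourier-transform method developed in the paper and is only practical for small n and m. The function name `pmd_pmf` and the example matrix are hypothetical, chosen for illustration; they are not part of the authors' R package.

```python
from collections import defaultdict


def pmd_pmf(spm):
    """Exact pmf of a Poisson multinomial distribution (illustrative sketch).

    spm: success probability matrix as a list of n rows, each of length m,
         with every row summing to 1.
    Returns a dict mapping each count vector (x_1, ..., x_m) to its probability.

    Assumption: brute-force convolution, not the paper's Fourier-based method.
    """
    m = len(spm[0])
    # Before adding any random vector, the sum is the zero vector with probability 1.
    pmf = {tuple([0] * m): 1.0}
    for row in spm:
        if len(row) != m or abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each SPM row must have length m and sum to 1")
        updated = defaultdict(float)
        for counts, prob in pmf.items():
            # The current random vector puts its single 1 in category j with probability p.
            for j, p in enumerate(row):
                if p == 0.0:
                    continue
                new_counts = list(counts)
                new_counts[j] += 1
                updated[tuple(new_counts)] += prob * p
        pmf = dict(updated)
    return pmf


# Hypothetical SPM with n = 3 random vectors and m = 3 categories.
spm = [
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
]
pmf = pmd_pmf(spm)
for counts in sorted(pmf):
    print(counts, round(pmf[counts], 4))
```

Every count vector in the result sums to n, and when all rows of the SPM are identical the PMD reduces to the ordinary multinomial distribution. The number of support points grows quickly with n and m, which is why faster methods such as those described in the abstract are needed for larger problems.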

research
12/04/2017

Approximating the Sum of Independent Non-Identical Binomial Random Variables

The distribution of sum of independent non-identical binomial random var...
research
02/11/2021

The vote Package: Single Transferable Vote and Other Electoral Systems in R

We describe the vote package in R, which implements the plurality (or fi...
research
11/26/2018

Non-deterministic inference using random set models: theory, approximation, and sampling method

A random set is a generalisation of a random variable, i.e. a set-valued...
research
10/08/2019

The density ratio of generalized binomial versus Poisson distributions

Let b(x) be the probability that a sum of independent Bernoulli random v...
research
11/14/2018

Saddlepoint adjusted inversion of characteristic functions

For certain types of statistical models, the characteristic function (Fo...
research
06/17/2021

Large-Scale Multiple Testing for Matrix-Valued Data under Double Dependency

High-dimensional inference based on matrix-valued data has drawn increas...
