Zero-Truncated Poisson Regression for Sparse Multiway Count Data Corrupted by False Zeros

01/25/2022
by Oscar López, et al.

We propose a novel statistical inference methodology for multiway count data corrupted by false zeros that are indistinguishable from true zero counts. Our approach zero-truncates the Poisson distribution so that all zero values are discarded. This simple truncation removes the need to distinguish between true and false zero counts and reduces the amount of data to be processed. Inference is accomplished via tensor completion that imposes low-rank tensor structure on the Poisson parameter space. Our main result shows that an N-way rank-R parametric tensor $\mathcal{M} \in (0,\infty)^{I \times \cdots \times I}$ generating Poisson observations can be accurately estimated by zero-truncated Poisson regression from approximately $I R^2 \log_2^2(I)$ non-zero counts under the nonnegative canonical polyadic decomposition. Our result also quantifies the error made by zero-truncating the Poisson distribution when the parameter is uniformly bounded from below. Therefore, under a low-rank multiparameter model, we propose an implementable approach guaranteed to achieve accurate regression in under-determined scenarios with substantial corruption by false zeros. Several numerical experiments are presented to explore the theoretical results.
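To make the zero-truncation idea concrete, here is a minimal scalar sketch in Python. It is not the tensor-completion method of the paper: it estimates a single Poisson rate from non-zero counts only, using the standard zero-truncated Poisson likelihood P(X = k | X > 0) = λ^k e^{-λ} / (k! (1 - e^{-λ})). The function names (`ztp_log_pmf`, `ztp_mle`, `sample_poisson`) and the 50% false-zero rate in the demo are illustrative assumptions, not taken from the paper.

```python
import math
import random


def ztp_log_pmf(k, lam):
    """Log-probability of a count k >= 1 under a zero-truncated Poisson(lam)."""
    if k < 1:
        raise ValueError("zero-truncated Poisson is supported on k >= 1")
    # log(lam^k e^-lam / k!) minus log(1 - e^-lam), the probability of a non-zero draw
    return k * math.log(lam) - lam - math.lgamma(k + 1) - math.log1p(-math.exp(-lam))


def ztp_mle(nonzero_counts, tol=1e-10):
    """Maximum-likelihood rate from strictly positive counts.

    The score equation reduces to lam / (1 - exp(-lam)) = sample mean,
    which is solved here by bisection (the left side is increasing in lam).
    """
    mean = sum(nonzero_counts) / len(nonzero_counts)
    if mean <= 1.0:
        return 0.0  # boundary case: every observed count equals 1
    lo, hi = 1e-12, mean  # lam / (1 - e^-lam) > lam, so the root lies below the mean
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid / -math.expm1(-mid) < mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


if __name__ == "__main__":
    rng = random.Random(0)
    true_lam = 2.0
    # Hypothetical corruption model: each count is replaced by a false zero w.p. 0.5.
    data = [0 if rng.random() < 0.5 else sample_poisson(true_lam, rng)
            for _ in range(20000)]
    nonzero = [x for x in data if x > 0]
    print("naive mean over all entries:", sum(data) / len(data))
    print("zero-truncated MLE:", ztp_mle(nonzero))
```

Because the estimator only ever sees the non-zero entries, it does not need to know which zeros are false, whereas the naive mean over all entries is pulled toward zero by the corruption.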

