Generally-Altered, -Inflated, -Truncated and -Deflated Regression, With Application to Heaped and Seeped Data

08/27/2022
by   Thomas W. Yee, et al.

Models such as the zero-inflated and zero-altered Poisson and the zero-truncated binomial are well established in modern regression analysis. We propose a super model that jointly and maximally unifies alteration, inflation, truncation and deflation for counts, given a 1- or 2-parameter parent (base) distribution. Seven disjoint sets of special value types are accommodated because all but truncation have parametric and nonparametric variants. Some highlights include: (i) the mixture distribution is exceedingly flexible, e.g., up to seven modes; (ii) under-, equi- and over-dispersion can be handled using a negative binomial (NB) parent, with underdispersion handled by a novel Generally-Truncated-Expansion method; (iii) overdispersion can be studied holistically in terms of the four operators; (iv) an important application: heaped and seeped data from retrospective self-reported surveys are readily handled, e.g., spikes and dips located virtually anywhere; (v) while generally-altered regression explains why observations are there, generally-inflated regression accounts for why they are there in excess, and generally-deflated regression explains why observations are not there; (vi) the VGAM R package implements the methodology based on Fisher scoring and the multinomial logit model (Poisson, NB, zeta and logarithmic parents are implemented). The GAITD-NB has the potential to become the Swiss army knife of count distributions.
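
To make the three well-established special cases named above concrete, the following minimal sketch writes out the probability mass functions of the zero-truncated, zero-inflated and zero-altered Poisson in their standard textbook forms. It is an illustration only, not the VGAM implementation and not the general GAITD model: the function names and the parameter names `lam`, `phi` and `omega` are assumptions introduced here, and only the single special value 0 is treated.

```python
import math


def poisson_pmf(y, lam):
    """Parent (base) Poisson probability P(Y = y)."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


def zero_truncated_pmf(y, lam):
    """Zero-truncated Poisson: 0 cannot occur, the rest is renormalised."""
    if y == 0:
        return 0.0
    return poisson_pmf(y, lam) / (1.0 - poisson_pmf(0, lam))


def zero_inflated_pmf(y, lam, phi):
    """Zero-inflated Poisson: mixture of a point mass at 0 (weight phi)
    and the parent (weight 1 - phi). A slightly negative phi gives zero
    deflation, provided every probability stays nonnegative."""
    base = (1.0 - phi) * poisson_pmf(y, lam)
    return phi + base if y == 0 else base


def zero_altered_pmf(y, lam, omega):
    """Zero-altered (hurdle) Poisson: P(Y = 0) = omega is modelled on its
    own; positive counts follow the zero-truncated parent, scaled by
    1 - omega."""
    if y == 0:
        return omega
    return (1.0 - omega) * zero_truncated_pmf(y, lam)


if __name__ == "__main__":
    lam = 2.0
    for y in range(5):
        print(
            y,
            round(poisson_pmf(y, lam), 4),
            round(zero_truncated_pmf(y, lam), 4),
            round(zero_inflated_pmf(y, lam, phi=0.3), 4),
            round(zero_altered_pmf(y, lam, omega=0.3), 4),
        )
```

The contrast between the last two functions mirrors highlight (v): under alteration the probability of the special value is free to be above or below the parent's, whereas under inflation the special value receives the parent's probability plus an excess.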
