Modeling Sums of Exchangeable Binary Variables

07/22/2020 ∙ by Ryan Elmore, et al. ∙ University of Denver

We introduce a new model for sums of exchangeable binary random variables. The proposed distribution approximates the exact distributional form and relies on the theory of completely monotone functions and the Laplace transform of a gamma distribution function. Using Monte Carlo methods, we show that this new model compares favorably to the beta-binomial model with respect to estimating the success probability of the Bernoulli trials and the correlation between any two variables in the exchangeable set. We apply the new methodology to two classic data sets and summarize the results.
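To make the two quantities named in the abstract concrete, here is a minimal Python sketch of the beta-binomial baseline only; it does not implement the proposed gamma-based approximation, whose form is not given in this text. It assumes the standard inclusion-exclusion representation of the distribution of a sum S of n exchangeable binary variables, P(S = s) = C(n, s) Σ_k (−1)^k C(n − s, k) λ_{s+k}, where λ_k = P(X_1 = … = X_k = 1). With a Beta(a, b) mixing distribution, λ_k = E[P^k] and the result should coincide with the beta-binomial pmf. All function names and parameter values are illustrative.

```python
from math import comb, exp, lgamma


def log_beta(x, y):
    """Log of the beta function B(x, y)."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)


def beta_binomial_pmf(s, n, a, b):
    """Closed-form beta-binomial pmf: C(n, s) * B(s + a, n - s + b) / B(a, b)."""
    return comb(n, s) * exp(log_beta(s + a, n - s + b) - log_beta(a, b))


def beta_joint_success(k, a, b):
    """lambda_k = P(X_1 = ... = X_k = 1) = E[P^k] when P ~ Beta(a, b)."""
    out = 1.0
    for i in range(k):
        out *= (a + i) / (a + b + i)
    return out


def exchangeable_sum_pmf(s, n, lam):
    """P(S = s) from joint success probabilities lam[0..n], with lam[0] = 1."""
    total = sum((-1) ** k * comb(n - s, k) * lam[s + k] for k in range(n - s + 1))
    return comb(n, s) * total


def pairwise_correlation(lam):
    """Correlation between any two variables: (lambda_2 - p^2) / (p * (1 - p))."""
    p = lam[1]
    return (lam[2] - p * p) / (p * (1 - p))


# Illustrative values (assumptions, not taken from the paper).
n, a, b = 8, 2.0, 3.0
lam = [beta_joint_success(k, a, b) for k in range(n + 1)]
exact = [exchangeable_sum_pmf(s, n, lam) for s in range(n + 1)]
bb = [beta_binomial_pmf(s, n, a, b) for s in range(n + 1)]
rho = pairwise_correlation(lam)

print("success probability:", lam[1])
print("pairwise correlation:", rho)
print("max pmf difference:", max(abs(x - y) for x, y in zip(exact, bb)))
```

In this baseline the success probability is a / (a + b) and the pairwise correlation is 1 / (a + b + 1); these are the two parameters whose estimation the abstract compares across models.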




1 Introduction

2 Model Development

3 Monte Carlo Simulation

4 Examples

5 Discussion

