Generalized Data Thinning Using Sufficient Statistics

03/22/2023
by Ameer Dharamshi, et al.

Our goal is to develop a general strategy to decompose a random variable $X$ into multiple independent random variables, without sacrificing any information about unknown parameters. A recent paper showed that for some well-known natural exponential families, $X$ can be "thinned" into independent random variables $X^{(1)}, \ldots, X^{(K)}$, such that $X = \sum_{k=1}^{K} X^{(k)}$. In this paper, we generalize their procedure by relaxing this summation requirement and simply asking that some known function of the independent random variables exactly reconstruct $X$. This generalization of the procedure serves two purposes. First, it greatly expands the families of distributions for which thinning can be performed. Second, it unifies sample splitting and data thinning, which on the surface seem to be very different, as applications of the same principle. This shared principle is sufficiency. We use this insight to perform generalized thinning operations for a diverse set of families.
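
To make the summation-based thinning recalled in the abstract concrete, here is a minimal sketch for the Poisson case only. It relies on the standard fact that if X ~ Poisson(λ) and, given X, the counts are allocated to K folds by an equal-probability multinomial draw, then the folds are independent Poisson(λ/K) variables that sum back to X. The function name `thin_poisson`, the parameters `K` and `lam`, and the choice of equal fold probabilities are assumptions made for illustration. This is not the generalized procedure the paper develops.

```python
import numpy as np


def thin_poisson(x, K, rng):
    """Split each Poisson count in x into K folds that sum back to it.

    Illustrative helper (hypothetical name): conditional on the count,
    the folds are drawn from a multinomial with equal probabilities 1/K.
    """
    x = np.atleast_1d(np.asarray(x, dtype=int))
    probs = np.full(K, 1.0 / K)
    # One multinomial draw per observation; row i holds the K folds of x[i].
    return np.array([rng.multinomial(int(xi), probs) for xi in x])


rng = np.random.default_rng(0)
lam, K = 12.0, 3                      # assumed values for the demo
x = rng.poisson(lam, size=20_000)     # original data X
folds = thin_poisson(x, K, rng)       # shape (20000, K)

# The known function (here, the sum) reconstructs X exactly.
print(np.array_equal(folds.sum(axis=1), x))
# Each fold should have mean close to lam / K.
print(folds.mean(axis=0))
# Folds should be approximately uncorrelated.
print(np.corrcoef(folds[:, 0], folds[:, 1])[0, 1])
```

In the generalized setting the abstract describes, the sum would be replaced by some other known reconstruction function; the sketch covers only the additive special case.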

