Sampling with replacement vs Poisson sampling: a comparative study in optimal subsampling

05/17/2022
by Jing Wang, et al.

With massive data, subsampling is a commonly used technique to improve computational efficiency, and nonuniform subsampling probabilities are an effective way to improve estimation efficiency. For computational convenience, subsampling is often implemented either with replacement or through Poisson subsampling. However, no rigorous investigation has compared the two procedures in terms of estimation efficiency and computational convenience. This paper performs such a comparative study. In the context of maximizing a general target function, we first derive asymptotic distributions for the estimators obtained from the two sampling procedures. The results show that Poisson subsampling may have a higher estimation efficiency. Based on these asymptotic distributions, we derive optimal subsampling probabilities that minimize the variance functions of the subsampling estimators. These probabilities further reveal the similarities and differences between subsampling with replacement and Poisson subsampling. The theoretical characterizations and comparisons of the two procedures provide guidance for selecting the more appropriate subsampling approach in practice. Furthermore, practically implementable algorithms are proposed based on the optimal structural results and are evaluated through both theoretical and empirical analyses.
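The two procedures compared in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes the simplest possible target, an inverse-probability-weighted estimate of a population mean, and the function names, the expected subsample size `r`, and the probabilities proportional to `|y| + 1` are all hypothetical choices made for illustration. Sampling with replacement draws exactly `r` indices and may repeat an index. Poisson sampling makes one independent inclusion decision per data point, so the subsample size is random and no index repeats.

```python
import random

def sample_with_replacement(probs, r, rng):
    """Draw r indices i.i.d. according to probs (assumed to sum to 1)."""
    return rng.choices(range(len(probs)), weights=probs, k=r)

def poisson_sample(probs, r, rng):
    """Include index i independently with probability min(1, r * probs[i]).

    Returns the selected indices and the inclusion probabilities.
    """
    incl = [min(1.0, r * p) for p in probs]
    chosen = [i for i, p in enumerate(incl) if rng.random() < p]
    return chosen, incl

def weighted_mean_replacement(y, probs, idx):
    """Inverse-probability-weighted mean from a with-replacement subsample."""
    return sum(y[i] / probs[i] for i in idx) / (len(y) * len(idx))

def weighted_mean_poisson(y, incl, idx):
    """Inverse-probability-weighted mean from a Poisson subsample."""
    return sum(y[i] / incl[i] for i in idx) / len(y)

# Illustrative data and nonuniform probabilities (assumptions, not from the paper).
rng = random.Random(0)
N, r = 10_000, 500
y = [rng.gauss(0.0, 1.0) for _ in range(N)]
weights = [abs(v) + 1.0 for v in y]
total = sum(weights)
probs = [w / total for w in weights]

idx_rep = sample_with_replacement(probs, r, rng)
idx_poi, incl = poisson_sample(probs, r, rng)

est_rep = weighted_mean_replacement(y, probs, idx_rep)
est_poi = weighted_mean_poisson(y, incl, idx_poi)
```

In this sketch the with-replacement subsample always has size `r`, while the Poisson subsample has expected size at most `r`. Poisson sampling also needs only one pass over the data and no access to all probabilities at once, which is one reason it is convenient for data that do not fit in memory.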


research
05/21/2020

Optimal Distributed Subsampling for Maximum Quasi-Likelihood Estimators with Massive Data

Nonuniform subsampling methods are effective to reduce computational bur...
research
02/08/2018

More Efficient Estimation for Logistic Regression with Optimal Subsample

Facing large amounts of data, subsampling is a practical technique to ex...
research
06/17/2018

Poisson Source Localization on the Plane. Cusp Case

This work is devoted to the problem of estimation of the localization of...
research
06/07/2018

Parameter estimation for fractional Poisson processes

The paper proposes a formal estimation procedure for parameters of the f...
research
09/07/2015

Poisson Subsampling Algorithms for Large Sample Linear Regression in Massive Data

Large sample size brings the computation bottleneck for modern data anal...
research
05/25/2022

Linear Algorithms for Nonparametric Multiclass Probability Estimation

Multiclass probability estimation is the problem of estimating condition...
research
10/07/2022

A Roadmap to Asymptotic Properties with Applications to COVID-19 Data

Asymptotic properties of statistical estimators play a significant role ...
