On the asymptotic properties of a bagging estimator with a massive dataset
Bagging is a useful method for large-scale statistical analysis, especially when computing resources are limited. We study the asymptotic properties of bagging estimators for M-estimation problems with massive datasets. We prove that the resulting estimator is consistent and asymptotically normal under appropriate conditions. The results show that the bagging estimator can achieve the optimal statistical efficiency, provided that the bagging subsample size and the number of subsamples are sufficiently large. Moreover, we derive a variance estimator for valid asymptotic inference. All theoretical findings are further verified by extensive simulation studies. Finally, we apply the bagging method to the US Airline Dataset to demonstrate its practical usefulness.
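As a rough illustration of the idea, the sketch below averages estimates computed on small random subsamples of a large dataset. It makes several assumptions that are not taken from the paper: ordinary least squares stands in for a generic M-estimator, subsamples are drawn uniformly with replacement, and the function name `bagging_estimator`, its parameters, and the simulated data are hypothetical. It does not implement the paper's variance estimator.

```python
import numpy as np

def bagging_estimator(X, y, subsample_size, num_subsamples, rng):
    """Average least-squares fits over random subsamples (illustrative only)."""
    N, p = X.shape
    estimates = np.empty((num_subsamples, p))
    for b in range(num_subsamples):
        # Assumption: each subsample is drawn uniformly with replacement.
        idx = rng.integers(0, N, size=subsample_size)
        # Least squares is used here as a simple example of an M-estimator.
        beta_b, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        estimates[b] = beta_b
    # The bagging estimate is the average of the subsample estimates.
    return estimates.mean(axis=0)

# Simulated data (hypothetical), standing in for a massive dataset.
rng = np.random.default_rng(0)
N, p = 100_000, 3
beta_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(N, p))
y = X @ beta_true + rng.normal(size=N)

beta_bag = bagging_estimator(X, y, subsample_size=2_000, num_subsamples=50, rng=rng)
beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)  # full-sample fit for comparison
print("bagging estimate:    ", beta_bag)
print("full-sample estimate:", beta_full)
```

Each fit touches only `subsample_size` rows at a time, which is what makes the approach attractive when the full dataset does not fit comfortably in memory. Increasing `subsample_size` and `num_subsamples` should bring the bagging estimate closer to the full-sample one.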