Hyperparameter Selection for Subsampling Bootstraps

06/02/2020
by Yingying Ma, et al.

As massive data analysis becomes increasingly prevalent, subsampling methods such as the BLB (Bag of Little Bootstraps) serve as powerful tools for assessing the quality of estimators for massive data. However, the performance of subsampling methods is highly influenced by the selection of tuning parameters (e.g., the subset size and the number of resamples per subset). In this article we develop a hyperparameter selection methodology that can be used to select tuning parameters for subsampling methods. Specifically, through a careful theoretical analysis, we find an analytically simple and elegant relationship between the asymptotic efficiency of various subsampling estimators and their hyperparameters. This leads to an optimal choice of the hyperparameters. More specifically, an arbitrarily specified hyperparameter set can be improved to a new set that incurs no extra CPU time cost but yields a much more statistically efficient estimator. Both simulation studies and real data analysis demonstrate the superior advantage of our method.
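The abstract refers to the BLB and its tuning parameters only at a high level. The sketch below is a minimal, hypothetical illustration (not the authors' method) of where those hyperparameters enter a BLB-style procedure: b is the subset size, s the number of subsets, and r the number of resamples per subset. The estimator, data, and numeric values are placeholders chosen for the example.

```python
import numpy as np

def blb_standard_error(data, estimator, b, s, r, seed=None):
    """BLB-style sketch: estimate the standard error of `estimator` on `data`
    using s subsets of size b and r weighted resamples per subset.
    b, s, and r are the tuning parameters discussed in the abstract."""
    rng = np.random.default_rng(seed)
    n = len(data)
    subset_ses = []
    for _ in range(s):
        # Draw a subset of size b without replacement.
        subset = data[rng.choice(n, size=b, replace=False)]
        estimates = []
        for _ in range(r):
            # Resample n points from the subset via multinomial weights, so each
            # resampled estimator mimics a bootstrap replicate on the full data.
            weights = rng.multinomial(n, np.full(b, 1.0 / b))
            estimates.append(estimator(subset, weights))
        # Per-subset quality assessment (here: standard deviation of replicates).
        subset_ses.append(np.std(estimates, ddof=1))
    # Average the per-subset assessments across subsets.
    return float(np.mean(subset_ses))

# Usage example: standard error of a weighted sample mean (placeholder estimator).
def weighted_mean(x, w):
    return np.average(x, weights=w)

data = np.random.default_rng(0).normal(size=10_000)
se = blb_standard_error(data, weighted_mean, b=int(10_000 ** 0.7), s=10, r=50, seed=1)
print(se)
```

In this toy setup, changing b, s, and r trades statistical accuracy against CPU time, which is exactly the trade-off the proposed hyperparameter selection methodology addresses.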

