Modern Subsampling Methods for Large-Scale Least Squares Regression

05/04/2021
by   Tao Li, et al.

Subsampling methods aim to select a subsample as a surrogate for the observed sample. Subsampling is a powerful technique for large-scale data analysis, and various subsampling methods have been developed for more effective coefficient estimation and model prediction. This review presents some cutting-edge subsampling methods for large-scale least squares estimation. Two major families of subsampling methods are introduced: the randomized subsampling approach and the optimal subsampling approach. The former aims to develop a more effective data-dependent sampling probability, while the latter aims to select a deterministic subsample in accordance with certain optimality criteria. Real data examples are provided to compare these methods empirically with respect to both estimation accuracy and computing time.
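
To make the randomized family concrete, the following is a minimal sketch of least squares on a subsample drawn with a data-dependent probability. It assumes leverage-score sampling as the probability scheme, which is one common choice and is not necessarily the scheme the review emphasizes. The function name `leverage_subsample_ols` and its parameters are hypothetical, introduced only for illustration.

```python
import numpy as np


def leverage_subsample_ols(X, y, r, rng=None):
    """Illustrative only: weighted least squares on a leverage-score subsample.

    X : (n, p) design matrix, y : (n,) response, r : subsample size.
    The name and signature are assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]

    # Leverage scores are the squared row norms of Q from a thin QR of X.
    Q, _ = np.linalg.qr(X)
    leverage = np.sum(Q ** 2, axis=1)

    # Data-dependent sampling probabilities proportional to leverage.
    probs = leverage / leverage.sum()

    # Draw r row indices with replacement according to probs.
    idx = rng.choice(n, size=r, replace=True, p=probs)

    # Rescale each sampled row by 1 / sqrt(r * p_i) so that the subsampled
    # least squares objective is unbiased for the full-data objective.
    w = 1.0 / np.sqrt(r * probs[idx])
    beta, *_ = np.linalg.lstsq(X[idx] * w[:, None], y[idx] * w, rcond=None)
    return beta
```

Computing exact leverage scores through a QR factorization costs about as much as solving the full problem, so this sketch shows only the sampling and reweighting logic; it is not meant as a fast implementation.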


research
10/17/2022

Statistical learning methods for neuroimaging data analysis with applications

The aim of this paper is to provide a comprehensive review of statistica...
research
09/17/2015

Optimal Subsampling Approaches for Large Sample Linear Regression

A significant hurdle for analyzing large sample data is the lack of effe...
research
06/04/2021

Distributed nonparametric regression imputation for missing response problems with large-scale data

Nonparametric regression imputation is commonly used in missing data ana...
research
08/30/2023

Surrogate-based Autotuning for Randomized Sketching Algorithms in Regression Problems

Algorithms from Randomized Numerical Linear Algebra (RandNLA) are known ...
research
04/30/2022

A nonparametric regression approach to asymptotically optimal estimation of normal means

Simultaneous estimation of multiple parameters has received a great deal...
research
07/12/2022

Accelerating Certifiable Estimation with Preconditioned Eigensolvers

Convex (specifically semidefinite) relaxation provides a powerful approa...
research
10/06/2015

Large-scale subspace clustering using sketching and validation

The nowadays massive amounts of generated and communicated data present ...
