Optimal Subsampling for Large Sample Ridge Regression

04/10/2022
by Yunlu Chen, et al.

Subsampling is a popular approach to alleviating the computational burden of analyzing massive datasets. Recent efforts have been devoted to various statistical models without explicit regularization. In this paper, we develop an efficient subsampling procedure for large-sample linear ridge regression. In contrast to the ordinary least squares estimator, the introduction of the ridge penalty leads to a subtle trade-off between bias and variance. We first investigate the asymptotic properties of the subsampling estimator and then propose minimizing the asymptotic mean squared error as the optimality criterion. The resulting subsampling probability involves both the ridge leverage score and the L2 norm of the predictor. To further reduce the computational cost of calculating the ridge leverage scores, we propose an algorithm that approximates them efficiently. We show on synthetic and real datasets that the algorithm is both statistically accurate and computationally efficient compared with existing subsampling-based methods.
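
To make the ingredients named above concrete, the following is a minimal sketch of subsampled ridge regression driven by ridge leverage scores, using the standard definition h_i = x_i^T (X^T X + λI)^{-1} x_i. It is an illustration under stated assumptions, not the procedure proposed in the paper: the sampling probabilities here are simply the normalized ridge leverage scores, whereas the paper's optimal probabilities also involve the L2 norm of the predictor and are not reproduced. The function names, the inverse-probability weighting, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np

def ridge_leverage_scores(X, lam):
    """Ridge leverage scores h_i = x_i^T (X^T X + lam * I)^{-1} x_i."""
    d = X.shape[1]
    G = X.T @ X + lam * np.eye(d)
    # Solve G Z = X^T rather than forming the inverse explicitly.
    Z = np.linalg.solve(G, X.T)            # shape (d, n)
    return np.einsum("ij,ji->i", X, Z)     # diagonal of X G^{-1} X^T

def ridge_full(X, y, lam):
    """Full-data ridge estimator (X^T X + lam * I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def ridge_subsample(X, y, lam, r, probs, rng):
    """Weighted ridge estimator on r rows drawn with replacement.

    Inverse-probability weights 1 / (r * p_i) make the subsampled
    Gram matrix an unbiased estimate of X^T X (an assumption of this
    sketch, not necessarily the paper's weighting).
    """
    n, d = X.shape
    idx = rng.choice(n, size=r, replace=True, p=probs)
    w = 1.0 / (r * probs[idx])
    Xs, ys = X[idx], y[idx]
    Xw = Xs * w[:, None]
    return np.linalg.solve(Xw.T @ Xs + lam * np.eye(d), Xw.T @ ys)

# Synthetic data, purely illustrative.
rng = np.random.default_rng(0)
n, d, lam, r = 5000, 5, 10.0, 1000
X = rng.normal(size=(n, d))
beta_true = rng.normal(size=d)
y = X @ beta_true + rng.normal(size=n)

h = ridge_leverage_scores(X, lam)
probs = h / h.sum()                        # stand-in sampling distribution
beta_full = ridge_full(X, y, lam)
beta_sub = ridge_subsample(X, y, lam, r, probs, rng)
rel_err = np.linalg.norm(beta_sub - beta_full) / np.linalg.norm(beta_full)
```

Computing the exact scores costs as much as the full ridge fit itself, which is why an approximation of the scores matters in practice; this sketch uses the exact scores only for clarity.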
