Statistical Inference for Algorithmic Leveraging

06/05/2016
by   Katelyn Gao, et al.

The age of big data has produced data sets that are computationally expensive to analyze. To deal with such large-scale data sets, the method of algorithmic leveraging proposes that we sample the data according to a special distribution, rescale the sampled data, and then perform the analysis on the smaller sample. Ma, Mahoney, and Yu (2015) provide a framework for determining the statistical properties of algorithmic leveraging when estimating the regression coefficients in a linear model with a fixed number of predictors. In this paper, we discuss how to perform statistical inference on regression coefficients estimated using algorithmic leveraging. In particular, we show how to construct confidence intervals for each estimated coefficient and present an efficient algorithm for doing so when the error variance is known. Through simulations, we confirm that our procedure controls the type I error of significance tests for the regression coefficients and show that those tests have good power.
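The sample, rescale, and analyze steps described above can be sketched for least squares as follows. This is a minimal illustration, not the paper's procedure: it assumes the sampling distribution is proportional to the statistical leverage scores of the design matrix and that rows are drawn with replacement. The function names `leverage_scores` and `leveraged_ols` and the simulated data are hypothetical, and the confidence interval construction itself is not reproduced here.

```python
import numpy as np

def leverage_scores(X):
    # Thin QR factorization X = QR; the leverage score of row i is the
    # squared Euclidean norm of row i of Q (the diagonal of the hat matrix).
    Q, _ = np.linalg.qr(X)
    return np.sum(Q**2, axis=1)

def leveraged_ols(X, y, r, rng):
    # Assumed sampling scheme: draw r rows with replacement, with
    # probabilities proportional to the leverage scores.
    n, _ = X.shape
    h = leverage_scores(X)
    pi = h / h.sum()
    idx = rng.choice(n, size=r, replace=True, p=pi)
    # Rescale each sampled row by 1 / sqrt(r * pi_i) so the subsampled
    # least squares objective is unbiased for the full-data objective.
    w = 1.0 / np.sqrt(r * pi[idx])
    Xs = X[idx] * w[:, None]
    ys = y[idx] * w
    # Ordinary least squares on the rescaled subsample.
    beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    return beta

# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n, p = 20000, 5
X = rng.standard_normal((n, p))
beta_true = np.arange(1, p + 1, dtype=float)
y = X @ beta_true + rng.standard_normal(n)

beta_hat = leveraged_ols(X, y, r=2000, rng=rng)
```

Here the regression is fit on 2,000 rescaled rows instead of all 20,000. For a full-rank design matrix the leverage scores sum to the number of predictors, so dividing by their sum yields a valid sampling distribution.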
