Canonical thresholding for non-sparse high-dimensional linear regression

07/24/2020
by Igor Silin, et al.

We consider a high-dimensional linear regression problem. Unlike many papers on the topic, we do not require sparsity of the regression coefficients; instead, our main structural assumption is a decay of the eigenvalues of the covariance matrix of the data. We propose a new family of estimators, called the canonical thresholding estimators, which select the largest regression coefficients in the canonical form. The estimators admit an explicit form and can be linked to LASSO and Principal Component Regression (PCR). A theoretical analysis is provided for both fixed design and random design settings. The resulting bounds on the mean squared error and the prediction error of a specific estimator from the family allow us to state clear sufficient conditions on the decay of eigenvalues that ensure convergence. In addition, we promote the use of relative errors, which are strongly linked to the out-of-sample R^2. The study of these relative errors leads to a new concept of joint effective dimension, which incorporates the covariance of the data and the regression coefficients simultaneously and describes the complexity of a linear regression problem. Numerical simulations confirm that the proposed estimators perform well compared with previously developed methods.
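
To make the idea of thresholding in the canonical form concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not necessarily the estimator defined in the paper. It assumes the canonical form is obtained by rotating the data onto the eigenvectors of the sample covariance and rescaling each direction to unit variance. It also assumes a plain hard-thresholding rule. The function name `canonical_thresholding` and the threshold parameter `tau` are hypothetical.

```python
import numpy as np


def canonical_thresholding(X, y, tau):
    """Hard-threshold regression coefficients in a whitened eigenbasis.

    Illustrative sketch only: the whitening step and the hard-threshold
    rule are assumptions, not taken from the paper.
    Assumes the columns of X are already centered.
    """
    n, _ = X.shape

    # Eigendecomposition of the sample covariance matrix.
    cov = X.T @ X / n
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Drop directions with numerically zero variance.
    keep = eigvals > 1e-10 * eigvals.max()
    lam, U = eigvals[keep], eigvecs[:, keep]

    # Canonical coefficients: least squares on Z = X U lam^{-1/2},
    # whose sample covariance is the identity, so theta = Z^T y / n.
    theta = (U.T @ (X.T @ y)) / (n * np.sqrt(lam))

    # Keep only the coefficients whose magnitude reaches the threshold.
    theta_thr = np.where(np.abs(theta) >= tau, theta, 0.0)

    # Map the thresholded coefficients back to the original coordinates.
    beta = U @ (theta_thr / np.sqrt(lam))
    return beta, theta_thr
```

In this sketch, `tau = 0` keeps every component and reduces to minimum-norm least squares, while larger values of `tau` retain fewer components. The contrast with PCR is the selection rule: PCR keeps the directions with the largest eigenvalues, whereas this sketch keeps the directions with the largest canonical coefficients, regardless of their eigenvalue.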

06/15/2022

Noise Covariance Estimation in Multi-Task High-dimensional Linear Models

This paper studies the multi-task high-dimensional linear regression mod...
04/12/2022

High-dimensional nonconvex lasso-type M-estimators

This paper proposes a theory for ℓ_1-norm penalized high-dimensional M-e...
12/06/2017

Estimating the error variance in a high-dimensional linear model

The lasso has been studied extensively as a tool for estimating the coef...
11/27/2020

Two-sample testing of high-dimensional linear regression coefficients via complementary sketching

We introduce a new method for two-sample testing of high-dimensional lin...
02/09/2016

Online Active Linear Regression via Thresholding

We consider the problem of online active learning to collect data for re...
04/16/2021

Generalized Matrix Decomposition Regression: Estimation and Inference for Two-way Structured Data

This paper studies high-dimensional regression with two-way structured d...
11/25/2018

Sparse PCA from Sparse Linear Regression

Sparse Principal Component Analysis (SPCA) and Sparse Linear Regression ...