Linear regression and its inference on noisy network-linked data

07/01/2020
by Can M. Le, et al.

Linear regression on a set of observations linked by a network is an essential tool for modeling the relationship between a response and covariates when additional network data are available. Despite its wide range of applications in areas such as the social sciences and health-related research, the problem has not been well studied in statistics. Previous methods either lack inference tools or rely on restrictive assumptions on social effects, and they usually assume that networks are observed without error, which is unrealistic in many problems. In this paper, we propose a linear regression model with nonparametric network effects. Our model does not assume that the relational data or network structure is exactly observed; thus, the method can be provably robust to a certain level of perturbation of the network structure. We establish a set of asymptotic inference results under a general requirement on the network perturbation and then study the robustness of our method in the specific setting where the perturbation comes from random network models. We discover a phase-transition phenomenon of inference validity with respect to network density when no prior knowledge about the network model is available, and we also show the significant improvement achieved by knowing the network model. A by-product of our analysis is a rate-optimal concentration bound for subspace projection that may be of independent interest. We conduct extensive simulation studies to verify our theoretical observations and demonstrate the advantage of our method over a few benchmarks in terms of accuracy and computational efficiency under different data-generating models. The method is then applied to adolescent network data to study gender and racial differences in social activities.


research
01/17/2019

Model-Free Tests for Series Correlation in Multivariate Linear Regression

Testing for series correlation among error terms is a basic problem in l...
research
10/23/2017

Linear regression model with a randomly censored predictor: Estimation procedures

We consider linear regression model estimation where the covariate of in...
research
10/24/2022

A Note on Cohen's d From a Partitioned Linear Regression Model

In this note we introduce a generalized formula for Cohen's d under the ...
research
10/29/2017

Distributional Consistency of Lasso by Perturbation Bootstrap

Least Absolute Shrinkage and Selection Operator or the Lasso, introduced...
research
12/22/2022

Estimating network-mediated causal effects via spectral embeddings

The last several years have seen a renewed and concerted effort to incor...
research
08/19/2020

Structure Learning in Inverse Ising Problems Using ℓ_2-Regularized Linear Estimator

Inferring interaction parameters from observed data is a ubiquitous requ...
research
07/30/2019

Network Dependence and Confounding by Network Structure Lead to Invalid Inference

Researchers across the health and social sciences generally assume that ...
