Inject Machine Learning into Significance Test for Misspecified Linear Models

06/04/2020
by   Jiaye Teng, et al.

Because it is highly interpretable, linear regression is widely used in social science, where significance tests report the significance level of a model or its coefficients as part of traditional statistical inference. However, linear regression relies on the assumption that the ground truth function is linear, which does not necessarily hold in practice. As a result, even in simple non-linear cases, linear regression may fail to report the correct significance level. In this paper, we present a simple and effective assumption-free method for linear approximation in both linear and non-linear scenarios. First, we apply a machine learning method to fit the ground truth function on the training set and calculate its linear approximation. Afterward, we obtain the estimator by adding adjustments based on the validation set. We prove concentration inequalities and asymptotic properties for our estimator, which lead to the corresponding significance test. Experimental results show that our estimator significantly outperforms linear regression for non-linear ground truth functions, indicating that it might be a better tool for significance testing.
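
The two-step procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact form of the estimator, so everything concrete here is an assumption: the polynomial least-squares fit stands in for "a machine learning method", the adjustment is taken to be a linear fit to the validation residuals, and the names `fit_poly_model`, `add_intercept`, and `linear_approx_with_adjustment` are hypothetical.

```python
import numpy as np


def fit_poly_model(X, y, degree=3):
    """Stand-in for the machine learning step (assumption, not the paper's model):
    least squares on per-feature polynomial terms up to `degree`."""
    def features(Z):
        cols = [np.ones(len(Z))]
        for d in range(1, degree + 1):
            cols.append(Z ** d)
        return np.column_stack(cols)

    w, *_ = np.linalg.lstsq(features(X), y, rcond=None)
    return lambda Z: features(Z) @ w


def add_intercept(X):
    """Prepend a column of ones so the linear approximation has an intercept."""
    return np.column_stack([np.ones(len(X)), X])


def linear_approx_with_adjustment(X_tr, y_tr, X_val, y_val):
    # Step 1: fit the ground truth function on the training set.
    f_hat = fit_poly_model(X_tr, y_tr)

    # Step 2: linear approximation of the fitted function, computed on the
    # training inputs by regressing f_hat(X) on X.
    beta_ml, *_ = np.linalg.lstsq(add_intercept(X_tr), f_hat(X_tr), rcond=None)

    # Step 3 (assumed form of the adjustment): regress the validation
    # residuals y - f_hat(X) on X and add the result as a correction.
    resid = y_val - f_hat(X_val)
    delta, *_ = np.linalg.lstsq(add_intercept(X_val), resid, rcond=None)
    return beta_ml + delta


# Synthetic non-linear ground truth: y = x1^2 + 0.5 * x2 + noise.
rng = np.random.default_rng(0)
n = 20000
X = rng.normal(size=(n, 2))
y = X[:, 0] ** 2 + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=n)

half = n // 2
beta = linear_approx_with_adjustment(X[:half], y[:half], X[half:], y[half:])
print(beta)  # [intercept, coefficient on x1, coefficient on x2]
```

For standard normal inputs, the best linear approximation of this synthetic function has intercept 1, a coefficient of 0 on the first feature (since x1 and x1^2 are uncorrelated), and 0.5 on the second, so the printed vector should land near those values. The sketch covers only the point estimate; the concentration inequalities and the significance test built on them are not reproduced here.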


research
11/04/2021

Analysis of Least square estimator for simple Linear Regression with a uniform distribution error

We study the least square estimator, in the framework of simple linear r...
research
06/19/2018

Estimation from Non-Linear Observations via Convex Programming with Application to Bilinear Regression

We propose a computationally efficient estimator, formulated as a convex...
research
03/07/2022

Fast rates for noisy interpolation require rethinking the effects of inductive bias

Good generalization performance on high-dimensional data crucially hinge...
research
05/07/2020

A Locally Adaptive Interpretable Regression

Machine learning models with both good predictability and high interpret...
research
08/17/2018

Concentration Based Inference in High Dimensional Generalized Regression Models (I: Statistical Guarantees)

We develop simple and non-asymptotically justified methods for hypothesi...
research
08/19/2020

Structure Learning in Inverse Ising Problems Using ℓ_2-Regularized Linear Estimator

Inferring interaction parameters from observed data is a ubiquitous requ...
research
07/14/2023

Adaptive Linear Estimating Equations

Sequential data collection has emerged as a widely adopted technique for...
