Towards Practical Robustness Auditing for Linear Regression

07/30/2023
by Daniel Freund, et al.

We investigate practical algorithms to find or disprove the existence of small subsets of a dataset which, when removed, reverse the sign of a coefficient in an ordinary least squares regression involving that dataset. We empirically study the performance of well-established algorithmic techniques for this task – mixed integer quadratically constrained optimization for general linear regression problems and exact greedy methods for special cases. We show that these methods largely outperform the state of the art and provide a useful robustness check for regression problems in a few dimensions. However, significant computational bottlenecks remain, especially for the important task of disproving the existence of such small sets of influential samples for regression problems of dimension 3 or greater. We make some headway on this challenge via a spectral algorithm using ideas drawn from recent innovations in algorithmic robust statistics. We summarize the limitations of known techniques in several challenge datasets to encourage further algorithmic innovation.
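To make the auditing task concrete, here is a minimal sketch of a naive greedy heuristic in Python with NumPy. It is not the paper's mixed integer formulation, its exact greedy methods, or its spectral algorithm; the function names `ols_coef` and `greedy_sign_flip` and the toy data are assumptions made for illustration only. At each step the heuristic removes the single sample whose deletion pushes the chosen coefficient furthest toward the opposite sign, then refits. If it succeeds, the removed set is an upper bound on the size of the smallest sign-flipping subset; if it fails, that proves nothing about whether such a subset exists.

```python
import numpy as np


def ols_coef(X, y):
    """Return the ordinary least squares coefficients for y ~ X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def greedy_sign_flip(X, y, j, max_remove):
    """Greedily drop samples until coefficient j changes sign.

    Returns the list of removed row indices, or None if no sign flip
    was found within max_remove removals. Hypothetical helper, not the
    method studied in the paper.
    """
    keep = np.ones(len(y), dtype=bool)
    sign0 = np.sign(ols_coef(X, y)[j])  # sign on the full dataset
    removed = []
    for _ in range(max_remove):
        best_i, best_val = None, np.inf
        for i in np.flatnonzero(keep):
            keep[i] = False  # tentatively drop sample i
            val = sign0 * ols_coef(X[keep], y[keep])[j]
            keep[i] = True   # restore it
            if val < best_val:
                best_i, best_val = i, val
        keep[best_i] = False  # commit the most influential removal
        removed.append(int(best_i))
        if best_val < 0:      # coefficient now has the opposite sign
            return removed
    return None


# Toy data (assumed): five points on the line y = -x plus one outlier.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 10.0])
y = np.array([0.0, -1.0, -2.0, -3.0, -4.0, 30.0])
X = np.column_stack([np.ones_like(x), x])  # intercept column plus slope column

print(ols_coef(X, y)[1] > 0)                   # True: the outlier makes the slope positive
print(greedy_sign_flip(X, y, j=1, max_remove=3))  # [5]: dropping the outlier flips the slope
```

Each greedy step refits the regression once per remaining sample, so this sketch is only practical for small datasets. Certifying that no small subset flips the sign requires the exact or bounding techniques the abstract refers to.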


