RIGID: Robust Linear Regression with Missing Data

05/26/2022
by   Alireza Aghasi, et al.

We present a robust framework for linear regression with missing entries in the features. By assuming an elliptical data distribution, and specifically a multivariate normal model, we formulate a conditional distribution for the missing entries and build a robust framework that minimizes the worst-case error caused by uncertainty about the missing data. We show that the proposed formulation, which naturally accounts for the dependency between variables, ultimately reduces to a convex program, for which a customized and scalable solver can be delivered. In addition to a detailed analysis leading to such a solver, we analyze the asymptotic behavior of the proposed framework and discuss how to estimate the required input parameters. We complement our analysis with experiments on synthetic, semi-synthetic, and real data, and show that the proposed formulation improves prediction accuracy and robustness and outperforms competing techniques.
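As a rough illustration of the conditioning step the abstract mentions, the sketch below computes the standard conditional mean and covariance of the missing entries of a multivariate normal vector given its observed entries. It is not the RIGID formulation or its solver; the function name `conditional_gaussian` and the toy values of `mu` and `sigma` are assumptions chosen only for this example.

```python
import numpy as np


def conditional_gaussian(mu, sigma, x, missing):
    """Mean and covariance of x[missing] given x[~missing] under N(mu, sigma).

    Hypothetical helper for illustration; uses the textbook Gaussian
    conditioning formulas, not the robust program from the paper.
    """
    m = np.asarray(missing, dtype=bool)
    o = ~m
    s_oo = sigma[np.ix_(o, o)]  # covariance of the observed block
    s_mo = sigma[np.ix_(m, o)]  # cross-covariance, missing vs. observed
    s_mm = sigma[np.ix_(m, m)]  # covariance of the missing block
    # gain = s_mo @ inv(s_oo), computed with a linear solve instead of an inverse
    gain = np.linalg.solve(s_oo, s_mo.T).T
    cond_mean = mu[m] + gain @ (x[o] - mu[o])
    cond_cov = s_mm - gain @ s_mo.T
    return cond_mean, cond_cov


# Toy example (assumed values): two correlated features, the second one missing.
mu = np.array([0.0, 0.0])
sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
x = np.array([2.0, np.nan])
missing = np.isnan(x)

cond_mean, cond_cov = conditional_gaussian(mu, sigma, x, missing)
print(cond_mean, cond_cov)
```

The conditional covariance is what describes the remaining uncertainty about the missing entries; a robust method such as the one described would presumably guard against the worst case within that uncertainty rather than simply plugging in the conditional mean.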

research
03/28/2015

Sparse Linear Regression With Missing Data

This paper proposes a fast and accurate method for sparse regression in ...
research
07/18/2022

A self-censoring model for multivariate nonignorable nonmonotone missing data

We introduce a self-censoring model for multivariate nonignorable nonmon...
research
01/09/2017

Coupled Compound Poisson Factorization

We present a general framework, the coupled compound Poisson factorizati...
research
03/27/2023

A joint Bayesian framework for missing data and measurement error using integrated nested Laplace approximations

Measurement error (ME) and missing values in covariates are often unavoi...
research
02/24/2022

The Impossibility of Testing for Dependence Using Kendall's τ Under Missing Data of Unknown Form

This paper discusses the statistical inference problem associated with t...
research
09/01/2021

RIFLE: Robust Inference from Low Order Marginals

The ubiquity of missing values in real-world datasets poses a challenge ...
research
12/14/2021

Navigating the corporate disclosure gap: Modelling of Missing Not at Random Carbon Data

Corporate carbon emissions data is disclosed by approximately 65 and mid...
