HMLasso: Lasso for High Dimensional and Highly Missing Data

11/01/2018
by Masaaki Takada, et al.

Sparse regression methods such as Lasso have been highly successful on high-dimensional data for several decades. However, few of these methods are applicable to missing data, which often occur in high-dimensional settings. CoCoLasso was recently proposed to handle high-dimensional missing data, but its performance still degrades when the data are highly missing. In this paper, we propose a novel Lasso-type regression technique for highly missing data, called "HMLasso". We use the mean-imputed covariance matrix, which is generally notorious for its estimation bias under missing data. We nevertheless incorporate it effectively into Lasso by exploiting a useful connection with the pairwise covariance matrix. The resulting optimization problem can be seen as a modification of CoCoLasso weighted by the missing ratios, and it is quite effective for highly missing data. To the best of our knowledge, this is the first method that can efficiently deal with both high-dimensional and highly missing data. We show that the proposed method is beneficial with regard to non-asymptotic properties of the covariance matrix. Numerical experiments show that the proposed method is highly advantageous in terms of estimation error and generalization error.
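
The connection between the mean-imputed and pairwise covariance matrices mentioned in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation; the function name `covariance_with_missing` and the simulated data are hypothetical, and the sketch assumes each column is centered by its observed mean so that mean imputation amounts to filling missing entries with zero. Under that assumption, the mean-imputed covariance equals the pairwise covariance scaled elementwise by the ratio of jointly observed samples.

```python
import numpy as np

def covariance_with_missing(X):
    """Return (S_imp, S_pair, R) for a data matrix X with NaN as missing.

    Hypothetical helper for illustration only. Columns are centered by
    their observed means, so mean imputation is the same as zero filling.
    """
    mask = ~np.isnan(X)                      # True where a value is observed
    n = X.shape[0]
    col_means = np.nanmean(X, axis=0)        # means over observed entries
    Z = np.where(mask, X - col_means, 0.0)   # centered, zero-imputed data
    M = mask.astype(float)
    n_jk = M.T @ M                           # jointly observed counts per pair
    cross = Z.T @ Z                          # sums over jointly observed rows
    S_imp = cross / n                        # mean-imputed covariance
    S_pair = np.divide(cross, n_jk,          # pairwise covariance
                       out=np.zeros_like(cross), where=n_jk > 0)
    R = n_jk / n                             # joint observation ratios
    return S_imp, S_pair, R

# Simulated example (assumption: values missing completely at random, rate 0.5)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[rng.random(X.shape) < 0.5] = np.nan

S_imp, S_pair, R = covariance_with_missing(X)
# Elementwise identity: S_imp = R * S_pair
identity_holds = np.allclose(S_imp, R * S_pair)
```

In this sketch, entries of `R` shrink as more values go missing, which is why the mean-imputed matrix is biased toward zero while the pairwise matrix is not. The observation ratios in `R` are the kind of quantity the abstract refers to when it describes weighting by the missing ratios.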

research
09/12/2012

Likelihood Estimation with Incomplete Array Variate Observations

Missing data is an important challenge when dealing with high dimensiona...
research
02/26/2018

Missing Data in Sparse Transition Matrix Estimation for Sub-Gaussian Vector Autoregressive Processes

High-dimensional time series data exist in numerous areas such as financ...
research
10/01/2019

Covariance Matrix Estimation with Non Uniform and Data Dependent Missing Observations

In this paper we study covariance estimation with missing data. We consi...
research
07/01/2016

Missing Data Estimation in High-Dimensional Datasets: A Swarm Intelligence-Deep Neural Network Approach

In this paper, we examine the problem of missing data in high-dimensiona...
research
03/28/2015

Sparse Linear Regression With Missing Data

This paper proposes a fast and accurate method for sparse regression in ...
research
01/12/2018

A Simple and Efficient Estimation Method for Models with Nonignorable Missing Data

This paper proposes a simple and efficient estimation procedure for the ...
research
09/14/2019

Adaptive Bayesian SLOPE – High-dimensional Model Selection with Missing Values

The selection of variables with high-dimensional and missing data is a m...
