1 Introduction
Consider a linear regression model where observations $(y_i, \mathbf{x}_i)$, $i = 1, \ldots, n$, are modelled through
$$y_i = \mathbf{x}_i^\top \boldsymbol{\beta} + \varepsilon_i, \quad (1)$$
where $y_i$ is the response, $\mathbf{x}_i = (x_{i1}, \ldots, x_{ip})^\top$ is a vector of predictors, $\boldsymbol{\beta}$ is a vector of regression parameters and $\varepsilon_i$ is an independent random error with variance $\sigma^2$. If an intercept is included in model (1), then the first component of $\mathbf{x}_i$ equals 1 for all $i$. For simplicity of notation, if not otherwise mentioned, we write regression models without an intercept term.

It is common for real and raw data to contain outliers. Usually, the term outlier refers to rowwise outliers, as visualized in the left panel of Figure 1; that is, the elements of an entire row of the design matrix together render the $i$th row outlying.
Traditional robust regression estimators such as M-estimators (Huber, 2004) handle rowwise outliers by mitigating the influence of outlying rows through a 'robust' loss function in the objective function, as in
$$\hat{\boldsymbol{\beta}} = \arg\min_{\boldsymbol{\beta}} \sum_{i=1}^{n} \rho\!\left(\frac{y_i - \mathbf{x}_i^\top \boldsymbol{\beta}}{\hat{\sigma}}\right),$$
where $\rho$ is a robust loss function and $\hat{\sigma}$ is an initial robust estimate of the residual scale. Popular robust loss functions include Huber's loss and Tukey's biweight loss. These estimators are less sensitive to rowwise outliers because the robust loss function downweights outlying rows as identified by their large residuals. Although these estimators are robust, they can cause overfitting when there are many predictors, and they have poor robustness and efficiency properties when the $p$ versus $n$ ratio is high (see, e.g., Maronna and Yohai, 2015; Smucler and Yohai, 2015).

Recently, Chang et al. (2018) combined robust M-estimation with the adaptive Lasso penalty:
$$\hat{\boldsymbol{\beta}} = \arg\min_{\boldsymbol{\beta}} \sum_{i=1}^{n} \rho\!\left(\frac{y_i - \mathbf{x}_i^\top \boldsymbol{\beta}}{\hat{\sigma}}\right) + \lambda \sum_{j=1}^{p} w_j |\beta_j|,$$
where $\lambda$ is a tuning parameter and the $w_j$ are adaptive weights obtained from initial estimates $\tilde{\beta}_j$, $j = 1, \ldots, p$. Chang et al. (2018) use Tukey's biweight loss because its derivative is redescending, which is a necessary property for the loss function to be robust against outliers in the design matrix (e.g., Rousseeuw and Yohai, 1984).
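As an illustration, the following is a minimal iteratively reweighted least squares (IRLS) sketch of a rowwise M-estimator with Tukey's biweight weights. This is an illustrative implementation, not the estimator of Chang et al. (2018): the residual scale is re-estimated by the normalized MAD at each step, and the tuning constant 4.685 is the usual choice giving 95% Gaussian efficiency.

```python
import numpy as np

def tukey_weight(r, c=4.685):
    # Redescending weight psi(r)/r for Tukey's biweight:
    # smooth downweighting inside |r| < c, exactly zero beyond c.
    w = (1 - (r / c) ** 2) ** 2
    return np.where(np.abs(r) < c, w, 0.0)

def m_estimator(X, y, n_iter=50):
    # Start from the least squares fit, then iterate weighted least squares.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - X @ beta
        # Normalized MAD as the robust residual scale estimate.
        s = 1.4826 * np.median(np.abs(r - np.median(r))) + 1e-12
        w = tukey_weight(r / s)
        W = X * w[:, None]
        # Solve (X' diag(w) X) beta = X' diag(w) y, with a tiny ridge
        # term guarding against a singular weighted design.
        beta = np.linalg.solve(X.T @ W + 1e-10 * np.eye(X.shape[1]),
                               W.T @ y)
    return beta
```

On data with a single gross rowwise outlier, the outlying row receives weight zero and the fit is essentially the least squares fit on the clean rows.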
However, the rowwise outlier paradigm (visualised in the left panel of Figure 1) is often too narrow a framework. For some datasets, most cells in a row are clean while just a few cells are contaminated. Alqallaf et al. (2009) note the propagation of cellwise outliers: under the independent contamination model, for a contamination rate $e$ of cells, the expected proportion of contaminated observation rows is $1 - (1 - e)^p$, which rapidly approaches 1 as the dimension $p$ increases, and this can be even more rapid in special cases, as shown in the right panel of Figure 1. Specifically, Figure 1 shows two scenarios with the same proportion of contaminated cells in the design matrix, i.e. 20% of entries are contaminated overall. However, the scenario in the left panel has only a few observation rows contaminated, whereas every single row in the scenario in the right panel is contaminated. Thus, rowwise methods may fail with cellwise outliers, because even a small overall proportion of contaminated cells may produce a large proportion of rows having at least one outlying cell. The proportion of contaminated rows can then very quickly exceed the breakdown point, i.e. the proportion of rowwise contamination that can be accommodated by a traditional robust estimation method, which is at most 50%, rendering even the most robust traditional methods ineffective.
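The propagation effect is easy to verify numerically; a small sketch of the expected fraction of contaminated rows under the independent contamination model:

```python
# Expected fraction of rows containing at least one outlying cell when
# each cell is independently contaminated with probability eps
# (independent contamination model of Alqallaf et al., 2009).
def contaminated_row_fraction(eps: float, p: int) -> float:
    return 1.0 - (1.0 - eps) ** p

# With only 2% of cells contaminated, most rows are already affected
# for moderate p: p=10 gives 0.183, p=50 gives 0.636, p=100 gives 0.867.
for p in (10, 50, 100):
    print(p, round(contaminated_row_fraction(0.02, p), 3))
```

So even a 2% cellwise contamination rate exceeds the 50% rowwise breakdown point once $p$ is in the tens.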
A natural choice to handle cellwise outliers is to perform outlier detection first. For example, Rousseeuw and Bossche (2018) predict the value of each cell and then flag a cell as outlying if its observed value and its predicted value differ by too much. Debruyne et al. (2019) regard all outliers as rowwise outliers and then detect which cells contribute most to the outlying rows. Raymaekers and Rousseeuw (2019) detect outlying cells in each row with a 'cellflagger' technique, which combines a Lasso penalty with a stepwise application of constructed cutoff values. After an initial outlier detection, there are two natural possibilities for the next step: (1) impute non-outlying values for those cells flagged as outlying, as done in Rousseeuw and Bossche (2018) and Debruyne et al. (2019); or (2) regard detected outliers as missing and then obtain robust estimates directly, as in Raymaekers and Rousseeuw (2019).

Very little work has been done in the context of robust regression under cellwise contamination. Ollerer et al. (2016) propose the shooting S algorithm to perform robust regression: first, a robust imputation of the design matrix is obtained; then the shooting S algorithm is used to obtain robust estimates of the regression coefficients. Bottmer et al. (2021) combine the shooting S algorithm with the Lasso to obtain sparse regression results. However, in our empirical studies we found this method can be improved upon, particularly in high dimensional cases and for high contamination rates where the magnitude of the outliers is moderate, which is a very challenging setting when dealing with cellwise outliers.
Leung et al. (2016) propose a three-step regression method. First, cellwise outliers are detected marginally; second, a robust estimator of multivariate location and scatter is applied to obtain robust covariance estimates; finally, the regression coefficients are computed from the estimates obtained in the second step. However, this method only copes with marginal cellwise outliers and does not give sparse regression results.
Filzmoser et al. (2020) propose a cellwise robust M-estimator that replaces the detected outliers with imputed values. This method works well in non-sparse settings but not for sparse data generating models.
To obtain robust variable selection results under cellwise contamination, especially in high dimensional cases, we propose a procedure that performs variable selection based on a robust estimate of the covariance matrix of the response and predictors. First, we use a pairwise robust estimator to obtain an initial empirical covariance matrix. This robust covariance matrix is then used to achieve robust variable selection. Variable selection based on covariances is introduced in Section 2 and our proposed robust method using this device is presented in Section 3. Our method is robust to cellwise outliers in low and high dimensional settings. Simulations, presented in Section 4, show good performance in comparison with recently proposed approaches. Real data applications in Section 5 demonstrate the utility of our proposed method. We conclude the paper with some remarks in Section 6.
2 Variable selection based on covariances
The quadratic loss function can be expressed as (see, e.g., Loh and Wainwright, 2012)
$$\ell(\boldsymbol{\beta}) = \tfrac{1}{2}\, \boldsymbol{\beta}^\top \hat{\boldsymbol{\Sigma}}_{xx} \boldsymbol{\beta} - \hat{\boldsymbol{\sigma}}_{xy}^\top \boldsymbol{\beta}, \quad (2)$$
where $\hat{\boldsymbol{\Sigma}}_{xx}$ and $\hat{\boldsymbol{\sigma}}_{xy}$ are the components of the empirical covariance matrix of the predictors and the response,
$$\hat{\boldsymbol{\Sigma}} = \begin{pmatrix} \hat{\sigma}_{yy} & \hat{\boldsymbol{\sigma}}_{xy}^\top \\ \hat{\boldsymbol{\sigma}}_{xy} & \hat{\boldsymbol{\Sigma}}_{xx} \end{pmatrix}. \quad (3)$$
The solution of (2) can be expressed as $\hat{\boldsymbol{\beta}} = \hat{\boldsymbol{\Sigma}}_{xx}^{-1} \hat{\boldsymbol{\sigma}}_{xy}$. Under cellwise contamination, Leung et al. (2016) highlight that one way to obtain robust regression results is to use a robust counterpart of the empirical covariance matrix.
The quadratic loss function in (2) can be expressed more elegantly. Given that $\hat{\boldsymbol{\Sigma}}$ is positive semidefinite, we can define $\mathbf{Q} = \hat{\boldsymbol{\Sigma}}^{1/2}$, where $\mathbf{Q}$ is the symmetric square root of $\hat{\boldsymbol{\Sigma}}$; the first column of $\mathbf{Q}$ is denoted $\mathbf{q}_y$ and $\mathbf{Q}_x$ is the matrix consisting of the remaining columns of $\mathbf{Q}$. Then we can rewrite (2), up to an additive constant, as
$$\ell(\boldsymbol{\beta}) = \tfrac{1}{2}\, \|\mathbf{q}_y - \mathbf{Q}_x \boldsymbol{\beta}\|_2^2, \quad (4)$$
which is a classic quadratic loss minimization problem.
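Both forms of the solution can be checked numerically. The following NumPy sketch (variable names are ours) computes the coefficients once from the covariance blocks and once from the square-root reformulation, and the two agree:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 500, 4
X = rng.normal(size=(n, p))
y = X @ np.array([2.0, 0.0, -1.0, 0.5]) + rng.normal(size=n)

# Empirical covariance of (y, x1, ..., xp); the response comes first.
S = np.cov(np.column_stack([y, X]), rowvar=False)
S_xx, s_xy = S[1:, 1:], S[1:, 0]

# Closed-form minimizer of the quadratic loss (2): beta = S_xx^{-1} s_xy.
beta_cov = np.linalg.solve(S_xx, s_xy)

# Square-root reformulation (4): Q = S^{1/2} via eigendecomposition,
# q_y its first column, Q_x the remaining columns.
evals, evecs = np.linalg.eigh(S)
Q = evecs @ np.diag(np.sqrt(np.clip(evals, 0, None))) @ evecs.T
q_y, Q_x = Q[:, 0], Q[:, 1:]
beta_sqrt = np.linalg.lstsq(Q_x, q_y, rcond=None)[0]

assert np.allclose(beta_cov, beta_sqrt)
```

The agreement follows because $\mathbf{Q}_x^\top \mathbf{Q}_x = \hat{\boldsymbol{\Sigma}}_{xx}$ and $\mathbf{Q}_x^\top \mathbf{q}_y = \hat{\boldsymbol{\sigma}}_{xy}$, so both problems share the same normal equations.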
To achieve robust variable selection, the adaptive Lasso penalty (Zou, 2006), chosen for its selection consistency, can be incorporated, resulting in the regularized objective function
$$\tfrac{1}{2}\, \|\mathbf{q}_y - \mathbf{Q}_x \boldsymbol{\beta}\|_2^2 + \lambda \sum_{j=1}^{p} \frac{|\beta_j|}{|\tilde{\beta}_j|}, \quad (5)$$
where $\lambda$ is the tuning parameter and $\tilde{\boldsymbol{\beta}}$ is an initial non-regularized estimate of $\boldsymbol{\beta}$. In the minimization of (5), we choose the value of the tuning parameter $\lambda$ that minimizes
$$\mathrm{BIC}(\lambda) = n \log\left(\hat{\sigma}_{\lambda}^2\right) + \log(n)\, \mathrm{df}(\lambda), \quad (6)$$
where $\hat{\sigma}_{\lambda}^2$ is the estimated residual variance and $\mathrm{df}(\lambda)$ denotes the degrees of freedom of the regularized regression model. While there are many criteria that could be applied to determine the tuning parameter $\lambda$, we have selected the BIC (Schwarz, 1978) because of its ease of implementation and good performance.
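A sketch of how the criterion is used to pick the tuning parameter; the number of nonzero coefficients serves as the usual surrogate for the degrees of freedom of the Lasso, and the path values below are purely illustrative numbers, not output of our method:

```python
import numpy as np

def bic(n, rss, df):
    # BIC for a regression fit: n * log(estimated residual variance)
    # plus a log(n) penalty per effective parameter.
    return n * np.log(rss / n) + df * np.log(n)

# Hypothetical path: residual sums of squares and active-set sizes for a
# grid of tuning parameters (illustrative numbers only).
n = 100
rss_path = [120.0, 95.0, 94.0, 93.9]
df_path = [2, 4, 9, 15]

# Pick the grid point minimizing the BIC: small gains in fit (index 2, 3)
# do not justify the extra parameters, so index 1 wins here.
best = min(range(len(rss_path)),
           key=lambda k: bic(n, rss_path[k], df_path[k]))
```

The penalty term $\log(n)\,\mathrm{df}(\lambda)$ is what keeps the selected model sparse: decreasing $\lambda$ always reduces the residual variance, but each additional active coefficient costs $\log n$.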
3 Robust estimation of the covariance matrix
A natural choice for obtaining a robust estimated covariance matrix under cellwise contamination is to obtain robust estimates of the marginal scales, which we denote $\hat{s}_j$, and of the pairwise correlations $\hat{r}_{jk}$. Robust scale estimators such as the median absolute deviation, the $Q_n$ estimator (Rousseeuw and Croux, 1993) and the $P_n$ estimator (Tarr et al., 2012) can be used to obtain robust and highly efficient scale estimates. Robust pairwise estimators can be used to obtain robust correlations, as detailed next.
3.1 Correlation estimator
Given a pair of observation vectors for variables $j$ and $k$, i.e. $\mathbf{x}_j = (x_{1j}, \ldots, x_{nj})^\top$ and $\mathbf{x}_k = (x_{1k}, \ldots, x_{nk})^\top$, a natural estimator of the strength of linear association is the Pearson correlation coefficient, stressing of course that this is not a robust estimator of linear dependence. Under contaminated settings, a simple pairwise estimator is based on the identity proposed by Gnanadesikan and Kettenring (1972),
$$\hat{r}_{jk} = \frac{S(\mathbf{u} + \mathbf{v})^2 - S(\mathbf{u} - \mathbf{v})^2}{S(\mathbf{u} + \mathbf{v})^2 + S(\mathbf{u} - \mathbf{v})^2}, \quad (7)$$
where $S(\cdot)$ is a robust estimator of scale and $\mathbf{u} = \mathbf{x}_j / S(\mathbf{x}_j)$, $\mathbf{v} = \mathbf{x}_k / S(\mathbf{x}_k)$ are the standardized variables. This estimator is also used in Raymaekers and Rousseeuw (2021) to obtain fast robust correlations. With this approach, any robust scale estimator yields a correlation estimator. In our simulations and applications, the $Q_n$ estimator is used to compute robust scales since it is very robust and reasonably efficient (Rousseeuw and Croux, 1993). After obtaining the estimated correlation coefficients, we assemble them into an empirical correlation matrix as in Tarr et al. (2016).
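A sketch of the estimator in (7); for portability the robust scale $S$ here is the normalized median absolute deviation rather than the $Q_n$ estimator used in the paper:

```python
import numpy as np

def robust_scale(x):
    # Normalized MAD; a simple stand-in for the Q_n scale estimator.
    x = np.asarray(x)
    return 1.4826 * np.median(np.abs(x - np.median(x)))

def gk_correlation(x, y):
    # Gnanadesikan-Kettenring identity on standardized variables:
    # r = (S(u+v)^2 - S(u-v)^2) / (S(u+v)^2 + S(u-v)^2),
    # which is bounded in [-1, 1] by construction.
    u, v = x / robust_scale(x), y / robust_scale(y)
    a, b = robust_scale(u + v) ** 2, robust_scale(u - v) ** 2
    return (a - b) / (a + b)
```

Because medians drive every scale computation, a moderate fraction of contaminated cells in either variable barely moves the estimate, in contrast to the Pearson coefficient.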
As an alternative, other nonparametric correlation estimators such as the quadrant correlation, the Spearman correlation or the Gaussian rank correlation could be used to obtain robust correlation matrices. In this paper, we consider the Gaussian rank correlation (Boudt et al., 2012), defined as the sample correlation estimated from the normal scores of the data. For $x_{ij}$, the $i$th observation of the $j$th variable, we compute the normal score
$$z_{ij} = \Phi^{-1}\!\left(\frac{R(x_{ij})}{n + 1}\right), \quad (8)$$
where $R(x_{ij})$ denotes the rank of $x_{ij}$ among all elements of the $j$th variable and $\Phi$ is the standard normal cumulative distribution function. We then obtain the Gaussian rank correlation matrix by computing the correlation matrix of the normal scores $z_{ij}$. Compared with other nonparametric correlation estimators, the Gaussian rank correlation is robust and consistent under the multivariate normal distribution (Boudt et al., 2012).

3.2 Positive definite adjustment
A limitation of pairwise methods is that the estimated correlation matrices are not necessarily positive definite, and this is particularly prone to happen in high dimensional situations. Many methods have been devised to solve this problem. A positive definite (PD) calibration method is used by Huang et al. (2017),
$$\hat{\mathbf{R}}_\epsilon = \arg\min_{\mathbf{R} \,\succeq\, \epsilon \mathbf{I}} \|\mathbf{R} - \hat{\mathbf{R}}\|_F, \quad (9)$$
where $\mathbf{I}$ denotes the identity matrix, $\|\cdot\|_F$ denotes the Frobenius norm and $\epsilon \geq 0$ is the eigenvalue threshold we set. When $\epsilon = 0$, the PD calibration method reduces to the classic nearest positive definite matrix method (Higham, 2002). Given the spectral decomposition $\hat{\mathbf{R}} = \sum_{i} \lambda_i \mathbf{v}_i \mathbf{v}_i^\top$, we have
$$\hat{\mathbf{R}}_\epsilon = \sum_{i} \max(\lambda_i, \epsilon)\, \mathbf{v}_i \mathbf{v}_i^\top. \quad (10)$$
In terms of the Frobenius norm, $\hat{\mathbf{R}}_\epsilon$ has a smaller difference from $\hat{\mathbf{R}}$ than other calibration methods, such as the linear shrinkage estimator (Ledoit and Wolf, 2004).
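The eigenvalue form (10) makes the calibration a few lines of code (a sketch; `eps` plays the role of the eigenvalue threshold $\epsilon$):

```python
import numpy as np

def pd_calibrate(R, eps=1e-4):
    # Frobenius-norm projection of a symmetric matrix onto the set
    # {R : lambda_min(R) >= eps}: floor the eigenvalues at eps and
    # reassemble with the original eigenvectors.
    evals, evecs = np.linalg.eigh(R)
    return evecs @ np.diag(np.maximum(evals, eps)) @ evecs.T

# A pairwise-assembled "correlation" matrix need not be positive
# definite: the entries below are pairwise-valid but jointly impossible.
R = np.array([[ 1.0, 0.9, -0.9],
              [ 0.9, 1.0,  0.9],
              [-0.9, 0.9,  1.0]])
R_pd = pd_calibrate(R)
```

After calibration the smallest eigenvalue equals the threshold, so the matrix can safely be inverted or used as a quadratic form in (2).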
In order to obtain sparse selection results, $\epsilon$ should not be too small. In Huang et al. (2017), the optimal $\epsilon$ is chosen by minimizing
(11)
where $\tau$ is a parameter that controls the range of $\epsilon$. Huang et al. (2017) suggest setting $\tau$ to be the smallest nonnegative eigenvalue. However, as long as $\tau$ is positive, it does not contribute to the minimization in (11). Therefore, for simplicity, we set $\tau$ to a fixed positive value in the following. The criterion in (11) penalises small values of $\epsilon$ rather than solely minimizing the Frobenius norm. Such penalty terms are widely used in statistical feature selection, for example when using the AIC (Akaike, 1973) and the BIC (Schwarz, 1978).

The estimated correlation matrices from the Gaussian rank correlation and the Pearson correlation will also be ill conditioned when $p > n$. In this case, the PD calibration method can be used to adjust these correlation matrices as well.
After obtaining the estimates of the scale parameters and the correlation matrix, we obtain the robust empirical covariance matrix $\hat{\boldsymbol{\Sigma}} = \mathbf{D} \hat{\mathbf{R}} \mathbf{D}$, where $\hat{\mathbf{R}}$ is the robust correlation matrix and $\mathbf{D}$ is a diagonal matrix consisting of the robust estimated scales. In our algorithm we can also use $\hat{\mathbf{R}}$ directly; the scaling of the predictors can then be dealt with initially.

This robust estimate of the empirical covariance matrix is then used to perform variable selection, as introduced in Section 2. For the different robust estimators, we obtain both the Adaptive Lasso based on Robust Pairwise correlations (ALRP) and the Adaptive Lasso based on Gaussian Rank correlations (ALGR). Note that the Lasso based on the Pearson correlation is simply equivalent to the Lasso itself. We continue by presenting empirical work on these and other methods in Section 4 and their application to real data in Section 5.
4 Simulation Studies
To measure the utility of our proposed methods, we compare the performance of the Lasso (Tibshirani, 1996), the sparse LTS (sLTS) (Alfons et al., 2013), the sparse shooting S-estimator (sShootingS) (Bottmer et al., 2021), and two new approaches: the adaptive Lasso estimator based on robust pairwise correlations (ALRP) and the adaptive Lasso estimator based on Gaussian rank correlations (ALGR), in regular and high dimensional settings.
Our empirical work will show that no matter the setting, either regular with $n > p$ or high dimensional with $p > n$, ALRP shows superior performance compared to the Lasso and the other robust methods, particularly in the challenging setting where contamination rates are high but the magnitude of the outliers is moderate. ALGR is overall a little inferior to ALRP but still performs strongly compared to the other methods.
4.1 Regular settings: $n > p$
Recall the linear regression model (1). In our simulations, we fix the sample size $n$, the number of predictors $p$ and the sparse coefficient vector $\boldsymbol{\beta}$; the predictors and the random errors are sampled from fixed distributions, and a correlation structure is imposed among the predictor variables.
Contamination proportions are set to $e = 2\%$, $5\%$ and $10\%$ for each predictor separately. The outlying cells of all variables are randomly generated with a location parameter that we vary over a grid to simulate outliers of different magnitudes. For appropriate values of this parameter, those cells are outlying cells but not necessarily marginal outliers.
For the Lasso, ALGR and ALRP, we use the BIC as defined in (6) to choose the regularization tuning parameter $\lambda$. For sLTS and sShootingS, their default selection criteria are used. To assess the performance of the considered methods, we use the true positive rate (TPR), the false positive rate (FPR) and the balanced F-score ($F_1$ score), defined by
$$F_1 = \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN}}, \quad (12)$$
where TP denotes the number of true positives, FP the number of false positives and FN the number of false negatives. We run simulations for each scenario and use the average TPR, FPR and $F_1$ score to measure the performance of each method.
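The three performance measures reduce to a few lines (a small sketch):

```python
def f_score(tp, fp, fn):
    # Balanced F-score (F1): harmonic mean of precision and recall,
    # equivalently 2*TP / (2*TP + FP + FN) as in (12).
    return 2 * tp / (2 * tp + fp + fn)

def tpr(tp, fn):
    # True positive rate: fraction of truly active variables selected.
    return tp / (tp + fn)

def fpr(fp, tn):
    # False positive rate: fraction of inactive variables selected.
    return fp / (fp + tn)
```

For example, selecting 8 of 10 active variables while wrongly including 2 of 8 inactive ones gives TPR 0.8, FPR 0.25 and $F_1$ 0.8.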
Figure 2 reports the empirical findings. All methods perform well when the cellwise contamination rate is $e = 2\%$, as can be seen from the plots in the left panel of Figure 2. This contamination rate is sufficiently low that all methods have good TPRs. The $F_1$ score and FPR of the Lasso are worse than for the other methods, i.e. its $F_1$ values are consistently smaller and its FPRs consistently larger. The $F_1$ score of ALGR is also slightly worse compared to the other robust methods considered here.
Considering the contamination rate $e = 5\%$ (middle column of Figure 2), the $F_1$ scores, TPRs and FPRs of ALRP and sShootingS are generally better than those of the other methods, indicating that ALRP and sShootingS both maintain their good performance even as $e$ increases. ALGR performs slightly worse than ALRP and sShootingS in terms of $F_1$ score and FPR. The Lasso has a lower $F_1$ score and TPR; in particular, the Lasso deteriorates further as the magnitude of the outliers increases, with its TPR dropping to low values. We also observe that sLTS deteriorates as the outlier magnitude increases.
When $e = 10\%$ (right column of Figure 2), ALRP performs best in terms of all performance measures, and we observe that it maintains a stable $F_1$ score, TPR and FPR over the considered range of outlier magnitudes. The performance of ALGR is a little worse than that of ALRP, but we also observe stable results overall. While sShootingS performs well for small or large outlier magnitudes, we note an inferior performance between those extremes, particularly at moderate magnitudes. We conclude that under this scenario, the performance of sShootingS depends on the magnitude of the outliers. Compared to the previous settings, where $e$ is either 2% or 5%, when $e$ is 10% both the Lasso and sLTS show a more pronounced poor performance.
4.2 High dimensional settings: $p > n$
In this section we show the empirical findings in high-dimensional cases. We continue with the simulation settings from the previous section but now increase the number of predictors $p$ so that $p > n$, while keeping $n$ fixed. As mentioned in Section 3, under high dimensional settings, the correlation matrices obtained from the Pearson correlation and the Gaussian rank correlation are ill-conditioned. In this case, the PD calibration method (Huang et al., 2017) is used to obtain adjusted covariance matrices for the Lasso and ALGR as well.
Figure 3 shows the simulation results under high dimensional settings. Considering the contamination rate $e = 2\%$ (left column of Figure 3), we observe that the three cellwise robust methods considered (ALRP, ALGR and sShootingS) perform well as they maintain good $F_1$ scores. The Lasso and sLTS perform worse, with $F_1$ scores that deteriorate as the outlier magnitude increases.

When $e = 5\%$ (middle column of Figure 3), ALRP, ALGR and sShootingS still perform well in terms of the $F_1$ score. Meanwhile, the TPRs, FPRs and $F_1$ scores of the Lasso and sLTS deteriorate further, especially as the outlier magnitude increases; for large magnitudes the TPRs of the Lasso and sLTS fall to low values.

When $e = 10\%$ (right column of Figure 3), ALRP still shows good and stable performance over the considered range of outlier magnitudes. ALGR performs slightly worse than ALRP as it has a lower $F_1$ score. The Lasso and sLTS have worse TPRs and $F_1$ scores as the outlier magnitude increases. As observed in the low dimensional simulations, the $F_1$ score of sShootingS is good for small or large outlier magnitudes but becomes inferior at moderate magnitudes.
5 Real data applications
We show the usage of the proposed methods on two real datasets: the Boston Housing data and the Breast Cancer data. They are representative of applications in regular and high dimensional settings, respectively. Our aim is to explore how the different methods vary in real data applications.
5.1 Boston Housing data
We illustrate the effect of cellwise outlier propagation on the previously considered estimators using the Boston Housing dataset. These data are available at the UCI machine learning repository (https://archive.ics.uci.edu/ml/index.php) and were collected from census samples on different variables. The original objective of the study was to analyze the association between the median housing value (medv) in Boston and the residents' willingness to pay for clean air, as well as the association between medv and the other variables in the dataset.
We only consider the quantitative variables that have been extensively studied. A sparse regression model is fitted with medv as the response, and we compare the performance of five methods: the Lasso (Tibshirani, 1996), sLTS (Alfons et al., 2013) and sShootingS (Bottmer et al., 2021), as well as the newly proposed methods ALRP and ALGR.
To measure the performance of variable selection when some known redundant variables are added, we generate ten additional random variables as redundant predictors using the same settings as in the previous simulation section. Therefore, we now have 19 predictors to choose from.
To test the performance under cellwise contamination, we standardize the 19 predictors with robust estimators of location (here, the median) and scale (here, the $Q_n$ estimator). Then, for each predictor, 10% of the cells are replaced by randomly generated cellwise outliers. As a comparison, we also run the simulations without any contamination, to investigate how stable the various methods are when known outliers are introduced into the data. We repeat this process of adding ten redundant variables followed by generating 10% outliers in the 19 explanatory variables 1,000 times, and then compute the selection rate of each variable (shown in Table 1).
method      e    selection rates for the nine original predictors         FPR
ALRP        0    1.00  1.00  1.00  0.00  1.00  1.00  0.00  0.00  0.80     0.08
ALRP        0.1  1.00  1.00  1.00  0.00  1.00  1.00  0.00  0.00  0.05     0.08
ALGR        0    1.00  1.00  1.00  1.00  1.00  1.00  1.00  0.00  0.00     0.04
ALGR        0.1  1.00  1.00  1.00  0.05  1.00  1.00  0.53  0.63  0.18     0.07
sShootingS  0    1.00  1.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00     0.00
sShootingS  0.1  1.00  1.00  0.01  0.00  0.00  0.00  0.00  0.00  0.00     0.00
sLTS        0    1.00  1.00  1.00  0.00  1.00  1.00  0.00  1.00  1.00     0.02
sLTS        0.1  1.00  0.97  0.71  0.29  0.86  0.84  0.31  0.39  1.00     0.24
lasso       0    1.00  1.00  1.00  0.76  1.00  1.00  0.54  0.03  1.00     0.11
lasso       0.1  1.00  1.00  1.00  0.43  1.00  0.93  0.80  0.81  1.00     0.17
For all five methods reported in Table 1, we observe some inconsistent selection results across the different settings. ALRP has the most consistent selection results over the two scenarios considered (without cellwise outliers and with 10% cellwise outliers), although it still shows some inconsistency for one predictor (its selection rate is 0.80 when there is no cellwise contamination, whereas it is only 0.05 under 10% contamination). ALGR also shows some inconsistency for three of the predictors, although it has a good FPR. Interestingly, for this dataset, sShootingS selects essentially nothing in either contamination setting, showing extreme sparsity. On the other hand, sLTS shows inconsistent selection results for many predictors and has an inferior FPR under 10% cellwise contamination. The selection results of the Lasso also vary considerably and it has a worse FPR.
5.2 Breast Cancer Data
To illustrate a real data application in a high dimensional setting, we use the Breast Cancer dataset (van 't Veer et al., 2002) as an example. This dataset contains measurements on 24,481 genes for 117 breast tumor patients; for the 78 breast cancer patients with a recorded disease-free survival time, expression measures and a p-value are given for each gene. We construct a regression model to select important genes that are significantly related to disease-free survival. To this end, we regard the disease-free survival time as the response and use the gene expressions as predictors.
It is assumed that only a few genes are associated with the disease-free survival time. For very high dimensional datasets, one usually screens the variables first and then runs variable selection on the screened predictors, as in Bai et al. (2021). Thus, we first compute the robust marginal correlations of the gene expressions with the disease-free survival time (using the pairwise estimator of Gnanadesikan and Kettenring (1972) with the $Q_n$ scale estimator (Rousseeuw and Croux, 1993)) and then select the 200 genes with the largest robust pairwise correlation coefficients as predictors.
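The screening step can be sketched as follows, with a rank-based (Spearman) correlation standing in for the $Q_n$-based pairwise estimator used in the paper:

```python
import numpy as np

def _ranks(a):
    # Ranks 0..n-1 of a 1-d array (no tie handling; fine for
    # continuous expression data).
    order = np.argsort(a)
    r = np.empty_like(order)
    r[order] = np.arange(len(a))
    return r

def screen_top_k(X, y, k=200):
    # Score each predictor by the absolute Spearman correlation with the
    # response (Pearson correlation of the ranks), then keep the k
    # predictors with the strongest marginal association.
    ry = _ranks(y)
    scores = np.array([abs(np.corrcoef(_ranks(X[:, j]), ry)[0, 1])
                       for j in range(X.shape[1])])
    return np.argsort(scores)[::-1][:k]
```

The surviving $k$ columns are then passed to the covariance-based selection procedure of Sections 2 and 3.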
The outlier detection results for these predictors based on DDC (Rousseeuw and Bossche, 2018) are shown in Figure 4. Most cells are yellow, indicating that they are not detected as outliers. Cells are flagged as outlying (red and blue cells) if their observed and predicted values differ by too much: a red cell means the observed value is significantly larger than the predicted value and a blue cell means the observed value is significantly smaller than the predicted value.
In this screened Breast Cancer dataset, only a few cells are contaminated; the average contamination rate over all 200 predictors is small. For most genes the contamination rate is low, although some genes are more heavily contaminated, with gene Contig55181_RC having the highest contamination rate. Looking at the outlier pattern in each row, that is for each patient, the contamination rates of most patients are low, while one patient has far more detected outliers than the others and could be regarded as a rowwise outlier. Overall, this is clearly a high-dimensional cellwise outlier situation, with contaminated cells spread across the observations and predictors of the breast cancer data.
Given the high cellwise contamination, traditional variable selection methods such as the Lasso, or rowwise robust variable selection methods, may lead to misleading selection results. With this in mind, cellwise robust methods may provide better solutions. A sparse regression model is fitted via the Lasso, sLTS, sShootingS, ALRP and ALGR, separately. All selected genes are sorted based on their robust pairwise correlations with the survival time. The selection results are shown in Table 2.
Method  Size  Selected genes  

ALRP  20  AB040971  AJ236885  AK000004  AL162078  Contig37028_RC 
Contig40406_RC  Contig57357_RC  Contig62306  NM_001616  NM_001830  
NM_002986  NM_003125  NM_003860  NM_004052  NM_004618  
NM_005055  NM_006571  NM_012316  NM_013437  NM_020120  
ALGR  10  AL137302  AL162078  Contig48328_RC  Contig54893_RC  Contig54956_RC 
Contig55181_RC  M94046  NM_001616  NM_005881  NM_006301  
sShootingS  72  AB002369  AB033032  AB033035  AF052185  AF257175 
AK000365  AK001044  AL080059  AL110193  AL137302  
AL137362  AL137527  AL162078  Contig1408  Contig14797_RC  
Contig19451_RC  Contig27464_RC  Contig28947_RC  Contig29141_RC  Contig29995_RC  
Contig30146_RC  Contig31424_RC  Contig31449_RC  Contig37063_RC  Contig38966_RC  
Contig40406_RC  Contig40772_RC  Contig41521_RC  Contig42740_RC  Contig47940_RC  
Contig48328_RC  Contig48472_RC  Contig54893_RC  Contig54956_RC  Contig55181_RC  
Contig56229  Contig57521_RC  Contig726_RC  Contig8862_RC  NM_000915  
NM_000950  NM_001616  NM_001673  NM_001826  NM_002300  
NM_002442  NM_002824  NM_003110  NM_003158  NM_003245  
NM_004052  NM_004911  NM_005055  NM_005089  NM_005498  
NM_005785  NM_005881  NM_006297  NM_006301  NM_006804  
NM_007278  NM_013262  NM_014675  NM_014699  NM_014706  
NM_014894  NM_014908  NM_015270  NM_016175  NM_016434  
NM_019848  NM_020120  
sLTS  27  AB033035  AJ224741  Contig16447_RC  Contig31424_RC  Contig37063_RC 
Contig45347_RC  Contig45917_RC  M55643  M94046  NM_000950  
NM_001826  NM_002461  NM_002811  NM_002986  NM_003158  
NM_003641  NM_003862  NM_004723  NM_005055  NM_005452  
NM_005657  NM_006571  NM_007278  NM_012429  NM_013262  
NM_014908  one_barcode  
Lasso  11  AL137302  AL162078  Contig37063_RC  Contig42740_RC  Contig48328_RC 
M94046  NM_001673  NM_005881  NM_006301  NM_016390  
one_barcode 
From Table 2, we observe that the gene selection results vary greatly between the different methods. In stark contrast to the previous real dataset, here sShootingS selects the most genes while the other methods select far fewer. Given the insights from the Boston Housing data (Section 5.1) and our empirical work in Section 4, we conclude that the selection results from ALRP are preferred here. Of course, the results here are from a single run of the methods only; to explore further, and to investigate stability in particular, one could rerun the analysis with resampling.
6 Conclusion
We have developed a robust variable selection method that deals with cellwise outliers in the design matrix of regression models. Traditional robust methods can produce misleading estimates when there are cellwise outliers, particularly because the proportion of contaminated rows may exceed the breakdown point. Our method tackles this problem by recasting a robust regression parameter estimation problem as a robust covariance matrix estimation problem. We have shown that our methods work well in both low dimensional and high dimensional cases. Simulations illustrated that ALRP and ALGR perform better than the other methods considered, and real data applications demonstrated the utility of the proposed approach.
Acknowledgements
Su’s work was supported by Chinese Scholarship Council (201906360181). Muller and Tarr were supported by the Australian Research Council (DP210100521). Tarr was supported in part by the AIRinnoHK program of the Innovation and Technology Commission of Hong Kong.
SUPPLEMENTARY MATERIAL
Codes: R functions for ALRP and ALGR are available on the GitHub page of the first author (https://github.com/PengSU517/robcovsel).
References
Akaike (1973). Information theory and an extension of the maximum likelihood principle. In B. N. Petrov and B. F. Csaki (Eds.), 2nd International Symposium on Information Theory.
Alfons et al. (2013). Sparse least trimmed squares regression for analyzing high-dimensional large data sets. The Annals of Applied Statistics 7(1), 226-248.
Alqallaf et al. (2009). Propagation of outliers in multivariate data. The Annals of Statistics 37(1), 311-331.
Bai et al. (2021). Variable selection for ultra-high dimensional quantile regression with missing data and measurement error. Statistical Methods in Medical Research 30(1), 129-150.
Bottmer et al. (2021). Sparse regression for large data sets with outliers. European Journal of Operational Research, in press. DOI: 10.1016/j.ejor.2021.05.049.
Boudt et al. (2012). The Gaussian rank correlation estimator: robustness properties. Statistics and Computing 22(2), 471-483.
Chang et al. (2018). Robust Lasso regression using Tukey's biweight criterion. Technometrics 60(1), 36-47.
Debruyne et al. (2019). Outlyingness: which variables contribute most? Statistics and Computing 29(4), 707-723.
Filzmoser et al. (2020). Cellwise robust M regression. Computational Statistics & Data Analysis 147, 106944.
Gnanadesikan and Kettenring (1972). Robust estimates, residuals, and outlier detection with multiresponse data. Biometrics 28, 81-124.
Higham (2002). Computing the nearest correlation matrix: a problem from finance. IMA Journal of Numerical Analysis 22(3), 329-343.
Huang et al. (2017). A calibration method for non-positive definite covariance matrix in multivariate data analysis. Journal of Multivariate Analysis 157, 45-52.
Huber (2004). Robust Statistics. John Wiley & Sons, New York.
Ledoit and Wolf (2004). A well-conditioned estimator for large-dimensional covariance matrices. Journal of Multivariate Analysis 88(2), 365-411.
Leung et al. (2016). Robust regression estimation and inference in the presence of cellwise and casewise contamination. Computational Statistics & Data Analysis 99, 1-11.
Loh and Wainwright (2012). High-dimensional regression with noisy and missing data: provable guarantees with nonconvexity. The Annals of Statistics 40(3), 1637-1664.
Maronna and Yohai (2015). High finite-sample efficiency and robustness based on distance-constrained maximum likelihood. Computational Statistics & Data Analysis 83, 262-274.
Ollerer et al. (2016). The shooting S-estimator for robust regression. Computational Statistics 31(3), 829-844.
Raymaekers and Rousseeuw (2019). Flagging and handling cellwise outliers by robust estimation of a covariance matrix. arXiv:1912.12446.
Raymaekers and Rousseeuw (2021). Fast robust correlation for high-dimensional data. Technometrics 63(2), 184-198.
Rousseeuw and Yohai (1984). Robust regression by means of S-estimators. In J. Franke, W. Härdle, and D. Martin (Eds.), Robust and Nonlinear Time Series Analysis, New York, 256-272.
Rousseeuw and Bossche (2018). Detecting deviating data cells. Technometrics 60(2), 135-145.
Rousseeuw and Croux (1993). Alternatives to the median absolute deviation. Journal of the American Statistical Association 88(424), 1273-1283.
Schwarz (1978). Estimating the dimension of a model. The Annals of Statistics 6(2), 461-464.
Smucler and Yohai (2015). Highly robust and highly finite sample efficient estimators for the linear model. In K. Nordhausen and S. Taskinen (Eds.), Modern Nonparametric, Robust and Multivariate Methods: Festschrift in Honour of Hannu Oja, 91-108.
Tarr et al. (2012). A robust scale estimator based on pairwise means. Journal of Nonparametric Statistics 24(1), 187-199.
Tarr et al. (2016). Robust estimation of precision matrices under cellwise contamination. Computational Statistics & Data Analysis 93, 404-420.
Tibshirani (1996). Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society: Series B 58(1), 267-288.
van 't Veer et al. (2002). Gene expression profiling predicts clinical outcome of breast cancer. Nature 415(6871), 530-536.
Zou (2006). The adaptive Lasso and its oracle properties. Journal of the American Statistical Association 101(476), 1418-1429.