Improved Covariance Matrix Estimator using Shrinkage Transformation and Random Matrix Theory

12/08/2019
by   Samruddhi Deshmukh, et al.

One of the major challenges in multivariate analysis is the estimation of the population covariance matrix from the sample covariance matrix (SCM). Most recent covariance matrix estimators use either shrinkage transformations or asymptotic results from Random Matrix Theory (RMT). Shrinkage techniques pull extreme correlation values towards certain target values, whereas tools from RMT remove noisy eigenvalues of SCM. The two techniques take different approaches to a similar goal: removing noisy correlations and adding structure to SCM in order to balance the bias-variance trade-off. In this paper, we first critically evaluate the pros and cons of these two techniques and then propose an improved estimator which exploits the advantages of both by taking an optimally weighted convex combination of covariance matrices estimated by an improved shrinkage transformation and an RMT based filter. It is a generalized estimator which can adapt to changing sampling noise conditions in various datasets by performing hyperparameter optimization. We show the effectiveness of this estimator on the problem of designing a financial portfolio with minimum risk. We have chosen this problem because the complex properties of stock market data provide extreme conditions to test the robustness of a covariance estimator. Using data from four of the world's largest stock exchanges, we show that our proposed estimator outperforms existing estimators in minimizing the out-of-sample risk of the portfolio and hence predicts population statistics more precisely. Since covariance analysis is a crucial statistical tool, this estimator can be used in a wide range of machine learning, signal processing and high dimensional pattern recognition applications.


1 Introduction

Estimating the population covariance matrix is a crucial problem in multivariate statistics [1, 2], with applications across many core disciplines ranging from engineering [3, 4] and physics [5, 6] to finance [7, 8, 9]. It is an active area of research in statistical signal processing, computer vision, wireless communication, machine learning, pattern recognition and finance [10, 4, 11, 12, 13, 14, 8, 9]. However, the extent of innovation needed in estimating true correlations largely depends on the properties of the data and on the trade-off between accuracy and computational cost. A simple estimator like the sample covariance matrix (SCM) can be useful if the data has desirable properties such as multivariate normality, independence across samples and a large sample size. However, this is not the case with most real-world datasets, which is why better estimators are needed.

Financial data is particularly challenging for traditional estimators like SCM because it is typically neither multivariate normal nor available in large samples. This is illustrated by the fact that the theoretically optimal, Nobel Prize winning minimum risk portfolio theory of H. Markowitz [7] could not be used effectively in practice for almost 50 years, because it relies on an accurate estimate of the covariance matrix [15, 16, 17]. His theory is also referred to as Modern Portfolio Theory (MPT), as it radically changed investment perspectives after the 1950s. The key idea is to minimize risk by avoiding investment in highly correlated stocks, thus creating a diversified portfolio. The traditional, asymptotically unbiased SCM estimator has proved to be highly ineffective here due to the heavy-tailed nature of stock market data and the limited number of available samples [16, 9].

The amount of sampling noise present in SCM depends on certain properties of the data. To understand this, let $M$ be the number of features (e.g. stocks in a market) and $N$ be the number of samples (e.g. daily returns of each stock). The data matrix can be represented as $\mathbf{X} \in \mathbb{R}^{M \times N}$. After removing the mean, the SCM ($\mathbf{S}$) is defined as $\mathbf{S} = \frac{1}{N}\mathbf{X}\mathbf{X}^T$. The following factors decide the extent of deviation of this SCM from the population covariance matrix ($\boldsymbol{\Sigma}$):

  1. Dimensionality constant ($c = M/N$, the number of features $M$ divided by the number of samples $N$): SCM ($\mathbf{S}$) is an asymptotically unbiased estimator of the population covariance matrix ($\boldsymbol{\Sigma}$), i.e. $\mathbf{S} \to \boldsymbol{\Sigma}$ as $N \to \infty$ and $c \to 0$. Furthermore, the estimation error is low if $c \ll 1$ and high if $c$ is of order 1 or larger. Usually, $c$ is not close to 0 in stock market data since $M$ is comparable to $N$ [9]. This is because the data must include recent values of daily returns to predict future values correctly based on recent trends, and therefore $N$ cannot be very large. Hence, due to the limited samples per feature, SCM can give highly noisy correlations (high sampling noise).

  2. Normality assumption: SCM is a maximum likelihood estimate of $\boldsymbol{\Sigma}$, which is effective if the data follows multivariate normality and has a finite second moment. But the distribution of stock returns is mostly non-Gaussian and is best modeled by heavy-tailed distributions [18]. This increases the estimation error.

  3. Independence across samples: Another crucial assumption made while deriving the maximum likelihood estimator of $\boldsymbol{\Sigma}$ is that the samples of each feature are independent and identically distributed (i.i.d.). This is not true for stock data, as it can have temporal correlations.

  4. Bias-variance trade-off (structure in covariance matrix): Deviation from the aforementioned assumptions might increase the sampling noise and cause over-fitting, resulting in a highly non-structured SCM. This results in poor estimates for out-of-sample correlation coefficients.

Hence, the scarcity of samples, the deviation from multivariate normality, the deviation from the i.i.d. assumption and the lack of structure all make SCM a poor estimator in many practical cases, particularly in financial applications [19, 8, 9, 11, 14, 10].
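The effect of limited samples per feature can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes i.i.d. standard normal data (so the population covariance is the identity and every population eigenvalue equals 1), and the helper names `sample_covariance` and `eigenvalue_range` are hypothetical. It only shows how far the SCM eigenvalues spread around 1 when the ratio of features to samples is large.

```python
import numpy as np

def sample_covariance(X):
    """SCM of an M x N data matrix (rows are features, columns are samples)."""
    Xc = X - X.mean(axis=1, keepdims=True)   # remove the mean of every feature
    return Xc @ Xc.T / X.shape[1]

def eigenvalue_range(M, N, seed=0):
    """Smallest and largest SCM eigenvalue for i.i.d. standard normal data."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((M, N))          # population covariance is the identity
    eigvals = np.linalg.eigvalsh(sample_covariance(X))
    return eigvals[0], eigvals[-1]

for N in (200, 20000):                       # M / N = 0.5 and 0.005 for M = 100
    lo, hi = eigenvalue_range(100, N)
    print(f"M/N = {100 / N:.3f}: sample eigenvalues lie in [{lo:.2f}, {hi:.2f}]")
```

With many samples per feature the eigenvalues stay close to 1, while with only two samples per feature they spread widely even though no true correlation exists.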

1.1 Contribution

In this paper, we propose an improved covariance matrix estimator that takes an optimally weighted convex combination of covariance matrices estimated by a shrinkage transformation and an RMT based filter. This involves formulating and solving a convex optimization problem with linear constraints that addresses the bias-variance trade-off in estimating population correlations from a limited number of data samples with complex properties such as heavy-tailed distributions. When applied to data from major stock markets, this improved estimator outperformed the existing estimators in minimizing the portfolio’s out-of-sample risk (test error). A lower risk implies a better estimation of true correlations among stocks. We have chosen the minimum risk portfolio design problem for a comparative study because the complex properties of market data provide extreme conditions to test the robustness of a covariance estimator. Since covariance analysis is a crucial step in statistics, the proposed estimator can be useful in a wide range of machine learning, pattern recognition, computer vision and signal processing applications [11, 13, 12, 14, 10, 4, 3, 5].

The paper is organized as follows: Section 2 provides an overview of existing covariance matrix estimators, highlighting their advantages and disadvantages. It also explains the reformulation of Markowitz’s portfolio optimization problem (MPT) to make it more suitable for real world investment requirements. Section 3 describes the proposed estimator in detail followed by empirical results in Section 4.

2 Existing techniques and Problem Formulation

In the last two decades, real-world data-driven problems like Markowitz’s portfolio optimization have motivated researchers to develop improved covariance matrix estimators, which are mainly of two types: shrinkage estimators [17, 19, 20] and estimators based on RMT [21, 22, 23]. These estimators are also extremely useful in fields involving multivariate signal processing and machine learning [10, 4, 11, 14, 8, 9]. A comprehensive review of these estimators can be found in [9].

2.1 Formulation of Portfolio Optimization Problem

The conventional problem of finding the minimum risk portfolio is a convex problem with linear constraints [7]. We have included an additional return constraint for our empirical study because even a risk-averse investor would expect a minimal positive return. A portfolio optimization problem which minimizes the risk of investment while satisfying a certain return constraint can be formulated as follows:

$$\min_{\mathbf{w}} \ \mathrm{Var}(\mathbf{w}^T\mathbf{X}) \quad \text{subject to} \quad \mathbf{1}^T\mathbf{w} = 1, \quad \mathbf{w} \geq \mathbf{0}, \quad \mathbf{w}^T\boldsymbol{\mu} \geq r_{\min} \tag{1}$$

where $\mathbf{X} \in \mathbb{R}^{M \times N}$ is the stock return matrix for $M$ stocks, each with $N$ daily returns. The portfolio vector $\mathbf{w}$ ($\mathbf{w} \in \mathbb{R}^{M}$) is the optimization variable and 'Var' in Equation (1) represents variance. The vector $\boldsymbol{\mu}$ represents the predicted daily returns of the $M$ stocks; it can be estimated using recurrent neural networks [24] or simply by dividing the available data into training and test sets. $r_{\min}$ is the minimum daily expected return, assuming that the portfolio is updated daily. The first constraint in Equation (1) implies that the sum of all portfolio weights is one. The second constraint forces portfolio weights to be non-negative, since we are not considering a short-selling scenario [17]. The third constraint specifies the minimum expected return.
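As a rough illustration of the problem above, the sketch below minimizes the portfolio variance over fully invested, long-only portfolios for a given covariance matrix. It is a simplification under stated assumptions, not the solver used in the paper: the minimum return constraint is omitted, the solver is plain projected gradient descent (chosen here only because it needs nothing beyond NumPy), and the function names are hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                       # entries in decreasing order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]   # last index that stays positive
    theta = css[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def min_variance_portfolio(C, n_iter=5000):
    """Minimize w^T C w subject to sum(w) = 1 and w >= 0 (no return constraint)."""
    M = C.shape[0]
    w = np.full(M, 1.0 / M)                           # start from equal weights
    step = 1.0 / (2.0 * np.linalg.eigvalsh(C)[-1])    # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        w = project_to_simplex(w - step * 2.0 * (C @ w))
    return w

# Three uncorrelated assets: the weights come out inversely proportional to the variances.
C = np.diag([0.04, 0.01, 0.09])
w = min_variance_portfolio(C)
print(w, w @ C @ w)
```

Any covariance estimate can be passed as `C`, which is how different estimators would be compared on the same portfolio problem.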

The objective function in Equation (1) can be approximated as:

$$\mathrm{Var}(\mathbf{w}^T\mathbf{X}) \approx \mathbf{w}^T\mathbf{S}\,\mathbf{w} \tag{2}$$

where $\mathbf{S}$ is the SCM. Thus the optimization problem in Equation (1) tries to find the vector $\mathbf{w}$ in the $M$-dimensional feature space along which the variance of the projected data is minimum. Equation (2) represents the same problem in terms of $\mathbf{S}$. Without the constraints of Equation (1), and with $\mathbf{w}$ restricted to unit norm, minimizing this objective is equivalent to an eigen decomposition of $\mathbf{S}$, where the lowest eigenvalue is the minimum value of the function and the eigenvector corresponding to this lowest eigenvalue is the minimum risk portfolio vector $\mathbf{w}$. An important point to note here is that transforming this problem from the original feature space to the eigenspace retains the convex nature of the optimization problem.
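The link between the variance of the portfolio return and the quadratic form in the sample covariance matrix can be checked numerically. The snippet below is a small sketch on synthetic returns (the data, the weights and the variable names are made up for illustration). On the sample itself the two quantities coincide when the SCM uses the 1/N normalization; the approximation arises only with respect to the unknown population variance.

```python
import numpy as np

rng = np.random.default_rng(2)
M, N = 4, 250
X = rng.standard_normal((M, N)) * 0.01       # synthetic daily returns, one row per stock
w = np.array([0.1, 0.2, 0.3, 0.4])           # hypothetical portfolio weights

Xc = X - X.mean(axis=1, keepdims=True)       # remove the mean return of every stock
S = Xc @ Xc.T / N                            # sample covariance matrix

var_direct = np.var(w @ X)                   # variance of the portfolio return series
var_quadratic = w @ S @ w                    # quadratic form in the SCM
print(var_direct, var_quadratic)
```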

The estimation of covariance matrix is a key step in solving Equation (1). A better estimator for the covariance matrix implies a better prediction of future correlations among stocks, thus giving a portfolio which minimizes the risk of investment while satisfying the minimum return expectation of the investor. Therefore, the risk of the portfolio obtained provides a good metric to evaluate the performance of our proposed estimator and compare it to existing estimators.

2.2 Shrinkage Transformations

Shrinkage estimators solve the overfitting problem by imposing structure on SCM. The estimated covariance values in SCM that are extremely high due to sampling noise (and outliers) tend to contain a lot of positive error and need to be pulled downwards to compensate for that. Similarly, extremely low covariance values need to be pulled upwards. This is done by shrinking SCM towards a highly structured matrix called a shrinkage target. The convex combination of SCM ($\mathbf{S}$) with the shrinkage target ($\mathbf{T}$) gives the shrinkage estimator ($\mathbf{C}_{\text{shrink}}$), as shown in Equation (3), where $\rho \in [0, 1]$ is called the shrinkage intensity. Its value depends on the properties of the data. For example, if the data is normally distributed, the sample size is large ($N \gg M$) and samples across individual stocks are independent, $\rho$ will be almost 0 since $\mathbf{S}$ is asymptotically unbiased under these conditions.

$$\mathbf{C}_{\text{shrink}} = \rho\,\mathbf{T} + (1 - \rho)\,\mathbf{S} \tag{3}$$

Over the last two decades, researchers have proposed various shrinkage estimators [25, 17, 19, 20]. Haff [25] was among the first to propose using an identity matrix (scaled by a constant) as the shrinkage target, assuming that all stocks have the same variance and that there are no correlations among stocks. Thus the target is $c\,\mathbf{I}$, where c is a constant. Even this simple shrinkage estimator gave a lower out-of-sample risk than SCM.

Ledoit and Wolf [17] proposed another shrinkage target based on the famous Sharpe Single Index model [26]. This provided a significant improvement in the performance of the shrinkage estimator. Instead of considering correlations among stocks, the Single Index model considers the correlation of stocks with the market index, thus making it analogous to taking the projection of all stock return samples on the first principal component of the covariance matrix.

Another famous paper by Ledoit and Wolf [19] proposed a shrinkage target that has sample variances as the diagonal elements and the average value of all sample covariances as the off-diagonal elements. It is called the Sample Variance and Mean Covariance target. Previous studies [19, 9] as well as our empirical results in Section 4 show that this estimator is the best among all linear shrinkage estimators. Hence we have used this estimator as one of the combining components in our proposed framework in Section 3. Note that in the rest of the paper, we use the symbol $\mathbf{T}$ for the shrinkage target to represent the Sample Variance and Mean Covariance target.
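The Sample Variance and Mean Covariance target and the resulting linear shrinkage can be sketched as follows. This is a minimal illustration of the description above, not the reference implementation of [19]: the function names are hypothetical, and the shrinkage intensity `rho` is simply supplied by the caller rather than estimated from the data.

```python
import numpy as np

def mean_covariance_target(S):
    """Target with the sample variances on the diagonal and the average
    sample covariance in every off-diagonal entry."""
    M = S.shape[0]
    off_diag = ~np.eye(M, dtype=bool)           # mask of off-diagonal positions
    T = np.full((M, M), S[off_diag].mean())
    np.fill_diagonal(T, np.diag(S))
    return T

def linear_shrinkage(S, T, rho):
    """Convex combination rho * T + (1 - rho) * S, with 0 <= rho <= 1."""
    return rho * T + (1.0 - rho) * S

S = np.array([[4.0, 1.0, 3.0],
              [1.0, 9.0, 2.0],
              [3.0, 2.0, 16.0]])
T = mean_covariance_target(S)                   # every off-diagonal entry becomes 2.0
C_shrink = linear_shrinkage(S, T, rho=0.5)
print(T)
print(C_shrink)
```

Setting `rho=0` returns the SCM unchanged and `rho=1` returns the target, so the intensity moves the estimate along the line segment between the two.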

A major drawback of shrinkage estimators, however, is that they impose a uniform structure on all covariance values. Shrinkage transformations specifically focus on reducing the overestimation of correlation values among significantly correlated features, but do not address the possibility that truly uncorrelated features might also appear to be correlated due to sampling noise. This means that the process of correcting extreme correlation coefficients might introduce error in the bulk correlation coefficients. We explain this in detail in the next subsection in terms of the eigenvalues of SCM. Also, the choice of a shrinkage target is highly sensitive to properties of the data such as non-normality and skewness.

2.3 Random Matrix Theory Approach: Analysis in Eigenspace

There are many advantages of working with a matrix in its eigenspace, especially in the case of a covariance matrix, which is symmetric and positive definite (PD). The eigen decomposition of the covariance matrix yields real and positive eigenvalues and orthogonal eigenvectors. This type of decomposition is a key step in several widely used multivariate statistical tools like PCA. Equation (4) shows the eigen decomposition of SCM ($\mathbf{S}$), where $\boldsymbol{\Lambda}$ is a diagonal matrix of eigenvalues, $\mathbf{V}$ is a matrix of the corresponding eigenvectors and $\mathbf{V}^T\mathbf{V} = \mathbf{I}$. On substituting this eigen decomposition of SCM into the objective function of MPT (Equation (1)), we get the lowest eigenvalue of SCM as the optimal value of the conventional unconstrained MPT problem (shown in Equation (5), with the portfolio vector $\mathbf{w}$ restricted to unit norm). The eigenvector corresponding to this eigenvalue is the desired minimum risk portfolio vector.

$$\mathbf{S} = \mathbf{V}\boldsymbol{\Lambda}\mathbf{V}^T \tag{4}$$
$$\min_{\|\mathbf{w}\|_2 = 1} \mathbf{w}^T\mathbf{S}\,\mathbf{w} = \min_{\|\mathbf{w}\|_2 = 1} \mathbf{w}^T\mathbf{V}\boldsymbol{\Lambda}\mathbf{V}^T\mathbf{w} = \lambda_{\min} \tag{5}$$

It is intuitive that the eigenvector corresponding to the lowest eigenvalue of SCM is the optimal solution to the conventional MPT optimization problem, because a lower eigenvalue implies a lower variance of the data along the corresponding eigenvector, thus implying lower correlations in multivariate data along that direction. But the problem still remains: since SCM is not a good estimator of the population covariance matrix, its eigen decomposition will give noisy eigenvalues. Instead of directly cleaning the covariance values using shrinkage techniques, tools from RMT can be used to clean these eigenvalues.
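The statement that the eigenvector of the lowest eigenvalue minimizes the variance can be verified on synthetic data. The sketch below assumes unit-norm directions (not the sum-to-one, long-only portfolios of Equation (1)), so the resulting vector may contain negative entries; the data and variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 50))                 # 5 features, 50 samples
X -= X.mean(axis=1, keepdims=True)
S = X @ X.T / X.shape[1]                         # sample covariance matrix

eigvals, V = np.linalg.eigh(S)                   # ascending eigenvalues, orthonormal columns
w_min = V[:, 0]                                  # eigenvector of the lowest eigenvalue
risk_min = w_min @ S @ w_min                     # equals eigvals[0] up to rounding

# Variance along 1000 random unit-norm directions: none falls below eigvals[0].
trials = rng.standard_normal((1000, 5))
trials /= np.linalg.norm(trials, axis=1, keepdims=True)
risks = np.einsum("ij,jk,ik->i", trials, S, trials)
print(risk_min, risks.min())
```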

It is important to note that there are two main classes of eigenvalues of the correlation matrix (normalized SCM) based on their relation to the correlation coefficients (entries of SCM). These are 1) extreme eigenvalues and, 2) bulk eigenvalues. Extreme eigenvalues represent significantly correlated features which are reflected in the components of the corresponding eigenvectors. The sampling noise might cause overestimation of these extreme eigenvalues (or extreme correlation coefficients). The shrinkage transformation is effective in this case as it shrinks these coefficients. The bulk eigenvalues (lying near the average of eigenvalue distribution) represent less correlated features or even uncorrelated features (zero correlation). The sampling noise might cause overestimation of these low correlations and hence misrepresent these features as correlated. Since the shrinkage intensity in shrinkage transformations largely depends on extreme eigenvalues, it cannot effectively reduce error in bulk eigenvalues.

Unlike shrinkage estimators which uniformly add bias to SCM and shrink all eigenvalues uniformly, RMT based methods exploit the asymptotic properties of matrices in the eigenspace and add selective bias to the unstructured SCM. The central theme in RMT based techniques is to precisely estimate the population eigenvalue distribution and asymptotic limits for a given matrix whose entries are random variables with a certain distribution. Once the population eigenvalue distribution is derived, it can be compared to the sample eigenvalue distribution to separate eigenvalues representing correlated and uncorrelated features. These tools can specifically reduce error in the estimation of low correlation coefficients or identify noisy correlations which should have been zero as per the population statistics. Also, they do not rely on assumptions like multivariate normality which is important in case of heavy tailed features like stock market data.

One such technique is cleaning the noisy eigenvalues of SCM using the Marchenko-Pastur (MP) law [27, 21]. The MP law provides lower and upper bounds on eigenvalues such that all eigenvalues inside the bounds are associated with sampling noise. The MP law is stated as follows: let $\mathbf{X}$ be an $M \times N$ matrix whose entries are independent and identically distributed (i.i.d.) real random variables with zero mean and finite variance ($\sigma^2$) (other, stricter results need the first four moments to be finite). Let $\lambda_1, \ldots, \lambda_M$ be the sample eigenvalues of SCM ($\mathbf{S} = \frac{1}{N}\mathbf{X}\mathbf{X}^T$). Since the entries of the original matrix are random, these eigenvalues can also be viewed as random variables. Now consider a probability measure $F_M$ on the sample eigenvalues ($\lambda_i$) of any SCM over the semi-infinite Borel set $(-\infty, x]$, which can be represented as a count function (analogous to the cumulative distribution function) as shown in Equation (6). The derivative of Equation (6) gives the sample eigenvalue probability density as shown in Equation (7).

$$F_M(x) = \frac{1}{M}\,\#\{\, i : \lambda_i \le x \,\} \tag{6}$$
$$p_M(x) = \frac{dF_M(x)}{dx} = \frac{1}{M}\sum_{i=1}^{M}\delta(x - \lambda_i) \tag{7}$$

This density converges to the Marchenko-Pastur distribution $p_{MP}(x)$ as the dimensions of the matrix $\mathbf{X}$ become very large ($M \to \infty$ and $N \to \infty$ with the ratio $c = M/N$ fixed). The convergence is better if $c$ is close to 0. The Marchenko-Pastur distribution ($p_{MP}$), for $0 < c \le 1$, is given as:

$$p_{MP}(x) = \frac{\sqrt{(\lambda_+ - x)(x - \lambda_-)}}{2\pi c\,\sigma^2 x}, \quad \lambda_- \le x \le \lambda_+, \qquad \lambda_{\pm} = \sigma^2\left(1 \pm \sqrt{c}\right)^2 \tag{8}$$

If the population correlation matrix (the covariance matrix scaled by the standard deviations) is an identity matrix, with all its eigenvalues equal to 1, the MP law states that the eigenvalues of the associated SCM will be scattered around 1 and that this scattering is bounded by the MP law bounds $[\lambda_-, \lambda_+]$, the lower and upper edges of the MP distribution. This is also called the NULL covariance model, as it represents i.i.d. data and the absence of any correlation. If significant correlations are present, i.e. a few eigenvalues of the population correlation matrix are significantly greater than 1, it is called a SPIKE covariance model [28]. As stock market data can have significant correlations, it generates a SPIKE covariance model instead of a NULL model.
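For concreteness, the MP bounds and density can be evaluated numerically. The sketch below uses the standard closed form of the MP law for a ratio of features to samples `c` between 0 and 1 and entry variance `sigma2`; the function names are hypothetical and the code is an illustration, not part of the paper.

```python
import numpy as np

def mp_bounds(c, sigma2=1.0):
    """Lower and upper edge of the Marchenko-Pastur distribution for 0 < c <= 1."""
    lo = sigma2 * (1.0 - np.sqrt(c)) ** 2
    hi = sigma2 * (1.0 + np.sqrt(c)) ** 2
    return lo, hi

def mp_density(x, c, sigma2=1.0):
    """Marchenko-Pastur density at the points x (zero outside the bounds)."""
    lo, hi = mp_bounds(c, sigma2)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    dens = np.zeros_like(x)
    inside = (x > lo) & (x < hi)
    xi = x[inside]
    dens[inside] = np.sqrt((hi - xi) * (xi - lo)) / (2.0 * np.pi * c * sigma2 * xi)
    return dens

lo, hi = mp_bounds(0.5)
grid = np.linspace(lo, hi, 200001)
mass = np.sum(mp_density(grid, 0.5)) * (grid[1] - grid[0])   # numerical integral, close to 1
print(lo, hi, mass)
```

Under the NULL model, sample eigenvalues of a correlation matrix are expected to fall between `lo` and `hi`; eigenvalues well above `hi` are candidates for genuine correlation structure.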

Since the eigenvalues lying inside the MP law bounds represent sampling noise among originally uncorrelated features, they can be replaced with a constant while keeping the eigenvalues outside these bounds intact. The eigenvectors of SCM can then be recombined with these new eigenvalues to obtain a cleaner covariance matrix ($\mathbf{C}_{MP}$). This technique is called Eigenvalue Clipping [9] and, unlike shrinkage techniques, it selectively adds bias to noisy correlations. Another recent development in using RMT to clean SCM is the Rotationally Invariant Estimator (RIE). However, it does not give a significant improvement over Eigenvalue Clipping [9] and needs much heavier numerical computations.
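A minimal sketch of Eigenvalue Clipping, following the description above, is given below. Two choices are assumptions made for illustration: the clipping is applied to the correlation matrix (so that the entry variance in the MP bounds is 1), and the replacement constant is the mean of the clipped eigenvalues, which keeps the trace unchanged. The function names and the synthetic one-factor data are hypothetical.

```python
import numpy as np

def clip_correlation(R, c):
    """Replace eigenvalues of the correlation matrix R that lie inside the
    MP bounds by their mean; eigenvalues outside the bounds are kept."""
    eigvals, V = np.linalg.eigh(R)
    lam_minus = (1.0 - np.sqrt(c)) ** 2
    lam_plus = (1.0 + np.sqrt(c)) ** 2
    inside = (eigvals >= lam_minus) & (eigvals <= lam_plus)
    cleaned = eigvals.copy()
    if inside.any():
        cleaned[inside] = eigvals[inside].mean()
    return (V * cleaned) @ V.T                   # V diag(cleaned) V^T

def eigenvalue_clipping(S, c):
    """Clean a covariance matrix S through its correlation matrix."""
    std = np.sqrt(np.diag(S))
    scale = np.outer(std, std)
    return clip_correlation(S / scale, c) * scale

rng = np.random.default_rng(3)
M, N = 40, 160                                   # ratio of features to samples is 0.25
market = rng.standard_normal(N)                  # one common factor
X = market + rng.standard_normal((M, N))         # every stock loads on the factor
X -= X.mean(axis=1, keepdims=True)
S = X @ X.T / N
C_mp = eigenvalue_clipping(S, c=M / N)
```

In this example the large eigenvalue created by the common factor lies far above the upper bound and is left untouched, while the bulk eigenvalues inside the bounds are flattened.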

There are some disadvantages of the RMT based methods. For example, Eigenvalue Clipping completely overlooks the fact that extreme eigenvalues lying outside the MP law bounds can also be overestimated and can increase the sampling noise [9], especially for heavy-tailed data, which can give large errors in the estimation of extreme correlation coefficients. Also, these results are derived under asymptotic assumptions and thus can be misleading when the available sample size is small. RMT based methods also need the data to be i.i.d. and to have finite variance. This might not be true for heavy-tailed financial data with high temporal correlations.

Thus, both shrinkage techniques and RMT based techniques have their pros and cons depending on the properties of data. Shrinkage techniques are better for reducing noise in the estimation of extreme correlations (extreme eigenvalues) whereas RMT based tools are better in reducing noise in the estimation of comparatively lower correlation coefficients (bulk eigenvalues). The ideal candidate for improving the performance of covariance estimators should be able to reduce noise in estimation of both extreme and bulk eigenvalues of the covariance matrix. Therefore, if eigenvalue clipping is combined with a shrinkage technique in an optimal way, the MP law bound can help us clean noisy bulk eigenvalues and the shrinkage target can pull extreme eigenvalues (outside MP law bound) towards the target. This is the basis of our proposed estimator which is tested to give improved results on real world stock market data.

3 Proposed Estimator

As mentioned in previous sections, both shrinkage and RMT techniques have some pros and cons and their performance depends on many factors such as the distribution of data, number of samples per stock, independence of samples among individual stocks, etc. In this section, we propose an improved covariance estimator which exploits the advantages of both shrinkage and Eigenvalue Clipping approaches by taking an optimally weighted convex combination of the high variance SCM ($\mathbf{S}$), a highly structured shrinkage target ($\mathbf{T}$) and a matrix obtained by applying Eigenvalue Clipping ($\mathbf{C}_{MP}$). The formulation of the proposed estimator (represented as $\hat{\mathbf{C}}$) is shown in Equation (9).

$$\hat{\mathbf{C}} = \alpha_1\mathbf{S} + \alpha_2\mathbf{T} + \alpha_3\mathbf{C}_{MP} \tag{9}$$

Here, the shrinkage target is the Sample Variance and Mean Covariance target (explained in Section 2.2). The optimization problem for finding the optimal weights is given as:

$$\min_{\alpha_1, \alpha_2, \alpha_3} \ \|\hat{\mathbf{C}} - \boldsymbol{\Sigma}\|_F^2 \quad \text{subject to} \quad \alpha_1 + \alpha_2 + \alpha_3 = 1, \quad \alpha_1, \alpha_2, \alpha_3 \geq 0 \tag{10}$$

where $\|\cdot\|_F$ represents the Frobenius norm and $\boldsymbol{\Sigma}$ is the population covariance matrix. Since $\boldsymbol{\Sigma}$ is not known in practical cases, the weights can be estimated by using the proposed estimator in place of SCM in the portfolio problem (Equation (1)) and iterating over weight values from 0 to 1 with sufficiently high resolution to achieve the minimum risk. The problem with three variables in Equation (10) can be reformulated with two variables, as shown in Equation (11). This reduces the computational cost while preserving the convex nature of the problem. Solving Equation (11) gives the final estimator shown in Equation (12), in which the effective weights of SCM, the shrinkage target and the Eigenvalue Clipping matrix are expressed through these two variables.

(11)
subject to
(12)

This approach is simple but very effective in removing the shortcomings of shrinkage estimators and Eigenvalue Clipping filters. We know that a shrinkage transformation adds bias to all sample covariance values uniformly. On the other hand, Eigenvalue Clipping selectively removes and replaces noisy eigenvalues inside the MP law bounds but ignores the noise in extreme eigenvalues. This means that Eigenvalue Clipping efficiently removes noisy correlations between features which are originally uncorrelated, but is ineffective in removing noisy correlations between features which are originally highly correlated. A shrinkage estimator, on the other hand, does the opposite. So when we take the weighted convex combination of both shrinkage and Eigenvalue Clipping estimators, we not only remove noisy eigenvalues inside the MP law bounds, but also shrink extreme eigenvalues lying outside the bounds. Hence, noisy correlations among both correlated and uncorrelated features can now be handled. Furthermore, this provides a generalized estimator that can adapt to different datasets by changing the values of its weights.
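The combination step can be sketched in code. The two-variable parameterization used below (an inner shrinkage weight `alpha` and an outer mixing weight `beta`, both between 0 and 1) is an assumption chosen for illustration; it is one convenient way to cover every convex combination of the three matrices and is not necessarily the exact form of Equations (11) and (12). The `loss` argument is any user-supplied criterion, for example the risk of the resulting portfolio on training data.

```python
import numpy as np

def combined_estimator(S, T, C_mp, alpha, beta):
    """beta * [alpha * T + (1 - alpha) * S] + (1 - beta) * C_mp.
    The weights of S, T and C_mp are beta*(1-alpha), beta*alpha and 1-beta."""
    return beta * (alpha * T + (1.0 - alpha) * S) + (1.0 - beta) * C_mp

def grid_search(S, T, C_mp, loss, resolution=101):
    """Return (alpha, beta, loss value) minimizing loss(C) on a regular grid."""
    grid = np.linspace(0.0, 1.0, resolution)
    best = (0.0, 0.0, np.inf)
    for alpha in grid:
        for beta in grid:
            value = loss(combined_estimator(S, T, C_mp, alpha, beta))
            if value < best[2]:
                best = (alpha, beta, value)
    return best

# Synthetic check: the "true" matrix is a known convex combination of three matrices.
rng = np.random.default_rng(4)
S, T, C_mp = [a @ a.T for a in rng.standard_normal((3, 3, 3))]
Sigma = 0.2 * S + 0.3 * T + 0.5 * C_mp
frobenius = lambda C: np.linalg.norm(C - Sigma, "fro") ** 2
alpha, beta, value = grid_search(S, T, C_mp, frobenius, resolution=11)
print(alpha, beta, value)
```

In practice the population matrix is unknown, so the Frobenius criterion of this toy check would be replaced by a data-driven loss such as portfolio risk.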

3.1 Geometric Interpretation and Analysis

Figure 1: Geometric interpretation of the proposed estimator in Hilbert space.

The geometric interpretation of this estimator in Hilbert space is shown in Figure 1 using a tetrahedron. It can be seen that the three vertices of the triangular base represent the matrices $\mathbf{S}$ (SCM), $\mathbf{T}$ (the shrinkage target) and $\mathbf{C}_{MP}$ (the Eigenvalue Clipping estimate). The fourth vertex of the tetrahedron represents the population covariance matrix $\boldsymbol{\Sigma}$. The figure shows that the optimization problem in Equation (11) yields an estimator ($\hat{\mathbf{C}}$) which is the orthogonal projection of $\boldsymbol{\Sigma}$ on the closed triangular region with $\mathbf{S}$, $\mathbf{T}$ and $\mathbf{C}_{MP}$ as its three vertices (shaded portion in the figure). In other words, the proposed estimator is a convex combination of these three matrices and, depending on the values of the weights, the estimator ($\hat{\mathbf{C}}$) can lie anywhere within or on the edges of the triangle.

It can be seen that if $\hat{\mathbf{C}}$ lies on the edge joining $\mathbf{S}$ and $\mathbf{T}$, the best results will be obtained by using the linear shrinkage technique. If $\hat{\mathbf{C}}$ lies on the vertex representing $\mathbf{C}_{MP}$, then Eigenvalue Clipping will give the best results. However, when $\hat{\mathbf{C}}$ lies inside the triangle, neither shrinkage nor Eigenvalue Clipping alone can give the best possible estimator. In this case, the best result is given by their convex combination. All the limiting cases of the estimator, depending on the values of the weights, are summarized in Table 1.


Weights Resultant Estimator
Table 1: Limiting cases of proposed estimator
Estimator                         NSE (India)             NIKKEI (Japan)          S&P (USA)               BSE (India)
                                  30 days 60 days 90 days 30 days 60 days 90 days 30 days 60 days 90 days 30 days 60 days 90 days
Identity matrix [25]              19.26   19.04   18.99   18.89   19.25   18.30   17.20   16.99   16.82   14.07   14.76   14.34
Shrinkage estimator [19]          12.55   12.07   12.67   14.53   14.68   14.94   11.24   11.24   11.32   11.07   10.58   13.55
Sample covariance matrix (SCM)    12.88   12.51   13.26   14.62   14.81   15.19   11.32   11.28   11.39   14.26   14.18   14.51
Eigenvalue Clipping               12.68   12.29   12.86   14.56   14.68   14.99   11.29   11.19   11.24   13.29   14.15   13.62
Rotationally Invariant Estimator  12.72   12.34   12.45   14.57   14.72   15.01   11.31   11.26   11.38   14.10   14.18   14.13
Proposed estimator                12.12   11.83   12.20   14.34   14.22   14.62   10.99   11.11   11.12   11.07   10.21   12.98
Table 2: Annualized out-of-sample risk (in % standard deviation) of the minimum variance portfolio, with the constraint of achieving at least 10% return, for the NSE, NIKKEI, S&P and BSE datasets using different estimators.

4 Data and Empirical Results

We have compared the following five estimators with our proposed estimator ($\hat{\mathbf{C}}$):  1) Identity Matrix ($\mathbf{I}$) proposed by [25]. It assumes that there is no correlation among stocks;  2) Shrinkage Estimator ($\mathbf{C}_{\text{shrink}}$) proposed in [19], shown to be the most efficient linear shrinkage estimator [19, 9];  3) Sample Covariance Matrix ($\mathbf{S}$);  4) Eigenvalue Clipping based estimator ($\mathbf{C}_{MP}$);  5) Rotationally Invariant Estimator ($\mathbf{C}_{RIE}$) as implemented by Bun et al. [23].

To compare these estimators, we have considered stocks from four major stock exchanges: NSE, NIKKEI, BSE and S&P. We solved the problem formulated in Equation (1) for each dataset to minimize the investment risk while satisfying the constraint of a minimum 10% return. We have selected the 100 most liquid stocks from each of the exchanges, with 750 days (Jan, 2014 to Jan, 2016, around 2 years) of daily returns data for each stock. The daily returns for the first 200 days are used to design the initial minimum risk portfolio using the six estimators shown in Table 2. So the size of the data matrix is $100 \times 200$, i.e. the dimensionality constant c is 0.5. We then shift this 200 day training window forward and update the portfolio at frequencies of 30, 60 and 90 days and record the variance of daily returns in each case. We have used only 200 days of daily returns for training because in finance, using recent data is preferable in order to capture the effect of recent trends.
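As a worked example, the standard Marchenko-Pastur bounds for this setting can be computed directly. The calculation assumes unit-variance entries (i.e. it applies to the correlation matrix) and uses the usual edge formula of the MP law; the symbols $\lambda_-$ and $\lambda_+$ denote the lower and upper MP bounds.

```latex
c = \frac{M}{N} = \frac{100}{200} = 0.5, \qquad
\lambda_{\pm} = \left(1 \pm \sqrt{c}\right)^2 = \left(1 \pm 0.7071\right)^2
\;\Longrightarrow\;
\lambda_- \approx 0.086, \qquad \lambda_+ \approx 2.914
```

Under the NULL model, the eigenvalues of a 100-stock sample correlation matrix estimated from 200 days would therefore be expected to fall roughly between 0.086 and 2.914, even though every population eigenvalue equals 1.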

Table 2 compares the six estimators for all four stock market datasets. The comparison is based on the variance (volatility) of daily returns for portfolios obtained by solving Equation (1). The volatility is calculated on test data and is called the out-of-sample risk. This variance is converted to a percentage standard deviation and is annualized by multiplying it by the square root of the number of trading days in a year. The results are shown for portfolio update frequencies of 30, 60 and 90 days. It can be seen that our proposed estimator gives the lowest out-of-sample risk for all four datasets and for all the portfolio update frequencies. A decrease in volatility even at the first decimal place is considered fairly significant in the field of portfolio optimization research [17, 9, 19]. In most cases, the identity matrix is the worst estimator. This is expected, as it assumes zero correlation among stocks. SCM is the second-worst. Also, when the portfolio update frequency is low, i.e. 90 days, the performance of the other estimators deteriorates significantly whereas our proposed estimator still gives comparatively better results. This implies that our estimator predicts future correlations more precisely.
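The risk metric described above can be sketched as a short function. The figure of 252 trading days per year is an assumption (a common convention, not stated in the text), and the function name and toy data are hypothetical.

```python
import numpy as np

def annualized_risk_percent(w, test_returns, periods_per_year=252):
    """Annualized standard deviation (in %) of the daily portfolio returns.
    test_returns has shape (M, N_test): one row per stock, one column per day."""
    daily = w @ test_returns                         # portfolio return on each test day
    return 100.0 * np.sqrt(periods_per_year) * daily.std(ddof=1)

w = np.array([0.5, 0.5])                             # hypothetical two-stock portfolio
test_returns = np.array([[0.01, -0.01, 0.01, -0.01],
                         [0.01, -0.01, 0.01, -0.01]])
print(annualized_risk_percent(w, test_returns))
```

Evaluating this function on returns that were not used to estimate the covariance matrix gives the out-of-sample risk used to compare estimators.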

5 Conclusion

In this paper, we proposed an improved covariance matrix estimator which exploits the advantages of both shrinkage and RMT based estimators to effectively reduce sampling noise. The central idea behind this estimator is to take an optimally weighted convex combination of the high variance SCM, a highly structured shrinkage target and SCM cleaned using MP law. The primary advantage of this method is that it first uses MP law to clip noisy eigenvalues lying inside the MP law bounds, thus adding selective bias, and then uses shrinkage techniques to shrink extreme covariance values, thus reducing error in them. Hence noisy correlations among both correlated and uncorrelated features can be handled. Also, this provides a generalized estimator that can adapt to different datasets by tuning its parameters using training data.

We used stock returns data from four of the world’s largest stock exchanges to show that our proposed estimator outperforms the existing estimators considered here in minimizing the out-of-sample risk of the portfolio. This suggests that it can efficiently predict true correlations among stocks and, by extension, among any set of multivariate features. Hence it can be useful in fields dealing with covariance matrices, including machine learning, pattern recognition and signal processing.

References

  • [1] Jinho Baik and Jack W Silverstein, “Eigenvalues of large sample covariance matrices of spiked population models,” Journal of multivariate analysis, vol. 97, no. 6, pp. 1382–1408, 2006.
  • [2] TW Anderson, “An introduction to multivariate statistical analysis (wiley series in probability and statistics),” July 11, 2003.
  • [3] Haohai Sun, Edwin Mabande, Konrad Kowalczyk, and Walter Kellermann, “Joint doa and tdoa estimation for 3d localization of reflective surfaces using eigenbeam mvdr and spherical microphone arrays,” in 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2011, pp. 113–116.
  • [4] Yujie Gu, Yimin D Zhang, and Nathan A Goodman, “Optimized compressive sensing-based direction-of-arrival estimation in massive mimo,” in 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2017, pp. 3181–3185.
  • [5] Stefano Olivares and Matteo GA Paris, “Bayesian estimation in homodyne interferometry,” Journal of Physics B: Atomic, Molecular and Optical Physics, vol. 42, no. 5, pp. 055506, 2009.
  • [6] Rainer Schodel, T Ott, R Genzel, R Hofmann, M Lehnert, A Eckart, N Mouawad, T Alexander, MJ Reid, R Lenzen, et al., “A star in a 15.2-year orbit around the supermassive black hole at the centre of the milky way,” Nature, vol. 419, no. 6908, pp. 694–697, 2002.
  • [7] Harry M Markowitz, “Foundations of portfolio theory,” The journal of finance, vol. 46, no. 2, pp. 469–477, 1991.
  • [8] Yiyong Feng and Daniel P Palomar, “Portfolio optimization with asset selection and risk parity control,” in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2016, pp. 6585–6589.
  • [9] J Bun, J Bouchaud, and M Potters, “Cleaning correlation matrices,” Risk magazine, 2016.
  • [10] Jing Liu, Yacong Ding, and Bhaskar Rao, “Sparse bayesian learning for robust pca,” in ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2019, pp. 4883–4887.
  • [11] Shuangli Liao, Jin Li, Yang Liu, Quanxue Gao, and Xinbo Gao, “Robust formulation for pca: Avoiding mean calculation with l2,p-norm maximization,” in Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
  • [12] Aleix M Martínez and Avinash C Kak, “Pca versus lda,” IEEE transactions on pattern analysis and machine intelligence, vol. 23, no. 2, pp. 228–233, 2001.
  • [13] Fernando De la Torre and Michael J Black, “Robust principal component analysis for computer vision,” in Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001. IEEE, 2001, vol. 1, pp. 362–369.
  • [14] Qian Zhao, Deyu Meng, Zongben Xu, Wangmeng Zuo, and Lei Zhang, “Robust principal component analysis with complex noise,” in International conference on machine learning, 2014, pp. 55–63.
  • [15] Frank J Fabozzi, Francis Gupta, and Harry M Markowitz, “The legacy of modern portfolio theory,” Journal of Investing, vol. 11, no. 3, pp. 7–22, 2002.
  • [16] Jean-Philippe Bouchaud and Marc Potters, Theory of financial risk and derivative pricing: from statistical physics to risk management, Cambridge university press, 2003.
  • [17] Olivier Ledoit and Michael Wolf, “Improved estimation of the covariance matrix of stock returns with an application to portfolio selection,” Journal of empirical finance, vol. 10, no. 5, pp. 603–621, 2003.
  • [18] Svetlozar Todorov Rachev, Handbook of heavy tailed distributions in finance: Handbooks in finance, vol. 1, Elsevier, 2003.
  • [19] Olivier Ledoit and Michael Wolf, “Honey, i shrunk the sample covariance matrix,” The Journal of Portfolio Management, vol. 30, no. 4, pp. 110–119, 2004.
  • [20] Olivier Ledoit and Michael Wolf, “Nonlinear shrinkage of the covariance matrix for portfolio selection: Markowitz meets goldilocks,” The Review of Financial Studies, vol. 30, no. 12, pp. 4349–4388, 2017.
  • [21] Jean-Philippe Bouchaud and Marc Potters, “Financial applications of random matrix theory: a short review,” arXiv preprint arXiv:0910.1205, 2009.
  • [22] Joël Bun, Romain Allez, Jean-Philippe Bouchaud, Marc Potters, et al., “Rotational invariant estimator for general noisy matrices,” IEEE Trans. Information Theory, vol. 62, no. 12, pp. 7475–7490, 2016.
  • [23] Joël Bun, Jean-Philippe Bouchaud, and Marc Potters, “Cleaning large correlation matrices: tools from random matrix theory,” Physics Reports, vol. 666, pp. 1–109, 2017.
  • [24] Fei Qian and Xianfu Chen, “Stock prediction based on lstm under different stability,” in 2019 IEEE 4th International Conference on Cloud Computing and Big Data Analysis (ICCCBDA). IEEE, 2019, pp. 483–486.
  • [25] LR Haff, “Empirical bayes estimation of the multivariate normal covariance matrix,” The Annals of Statistics, pp. 586–597, 1980.
  • [26] William F Sharpe, “Capital asset prices: A theory of market equilibrium under conditions of risk,” The journal of finance, vol. 19, no. 3, pp. 425–442, 1964.
  • [27] Vladimir A Marčenko and Leonid Andreevich Pastur, “Distribution of eigenvalues for some sets of random matrices,” Mathematics of the USSR-Sbornik, vol. 1, no. 4, pp. 457, 1967.
  • [28] Iain M Johnstone et al., “On the distribution of the largest eigenvalue in principal components analysis,” The Annals of statistics, vol. 29, no. 2, pp. 295–327, 2001.