Portfolio optimization is a widely studied problem with direct relevance to portfolio management applications. Optimal portfolio selection methods based on the global minimum variance approach or the Markowitz mean-variance model (Markowitz, 1952) are frequently used by portfolio managers, including robo-advisors (Sabharwal and Anjum, 2018). The optimal portfolio allocation weights and the associated risk are functions of the mean and the precision matrix (inverse covariance matrix) of the excess asset returns, making it imperative to estimate these parameters accurately from the available data.
In most modern portfolio optimization problems, the number of assets $p$ is large, often exceeding the number of observations $n$. There can be multiple reasons for this. For example, even a 10-year history of monthly returns yields only $n = 120$ observations, which a moderately large portfolio can easily exceed in dimension. The dynamics of interdependence among the various assets may also change over a long period, and for new assets a long history of returns is unavailable. All of these lead to a high-dimensional scenario, where classical statistical inference techniques fail to provide stable or robust solutions. In particular, the sample covariance matrix is singular when $p > n$, making estimation of the covariance or precision matrix non-trivial.
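The singularity issue is easy to verify numerically. The following minimal Python sketch (an illustration, not part of the paper's R code; the sizes are arbitrary) shows that the sample covariance matrix computed from $n$ observations on $p > n$ assets has rank at most $n - 1$ and hence cannot be inverted:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 120, 200                      # fewer observations than assets
X = rng.normal(size=(n, p))          # n excess-return vectors of dimension p
S = np.cov(X, rowvar=False)          # p x p sample covariance matrix

# The sample covariance has rank at most n - 1 < p, so it is singular
print(np.linalg.matrix_rank(S), p)
```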
Sparsity plays an important role in tackling the high-dimensional problem. This essentially refers to exploring the low-dimensional structure in a high dimensional space to make valid inference; see, for example, Friedman et al. (2008). Meinshausen and Bühlmann (2006) developed a regression-based estimator using lasso for estimating the entries of the precision matrix via regression coefficients. Callot et al. (2021) proposed a similar nodewise regression approach for application in the portfolio optimization problem. However, their approach leads to an estimator that is not necessarily symmetric.
In this paper, we propose to use a regression-based joint shrinkage method developed by Peng et al. (2009) to estimate the precision matrix of a portfolio of a large number of assets. The method is more efficient than competing approaches in terms of computational complexity and the number of parameters to be estimated.
The paper is organized as follows. In the next section, we discuss the optimal solution for two different portfolio choices. We describe our proposed method in Section 3. Numerical results are presented in Sections 4 and 5, followed by a discussion. Details of simulations, additional real-data results, and visualizations are presented as supplementary materials.
2 Optimum portfolios
Let us define $r_t = (r_{1t}, \ldots, r_{pt})'$ as the random vector of excess asset returns at time $t$, with mean $\mu$ and a $p$-dimensional positive definite covariance matrix $\Sigma$. The corresponding precision matrix is $\Theta = \Sigma^{-1}$, with the $(i,j)$th element denoted by $\theta_{ij}$. A portfolio comprising the $p$ assets is given by an asset allocation vector of weights $w = (w_1, \ldots, w_p)'$ such that $w'\mathbf{1}_p = 1$, where $\mathbf{1}_p$ denotes the $p$-vector of ones. The global minimum variance portfolio is defined as the solution to
$$\min_{w} \; w'\Sigma w \quad \text{subject to} \quad w'\mathbf{1}_p = 1, \qquad (1)$$
where $\Sigma$ is the covariance matrix of the excess returns of the assets in the portfolio. The solution to the above optimization problem is
$$w_{\mathrm{GMV}} = \frac{\Sigma^{-1}\mathbf{1}_p}{\mathbf{1}_p'\Sigma^{-1}\mathbf{1}_p},$$
with the minimum expected risk of the portfolio return given by
$$w_{\mathrm{GMV}}'\,\Sigma\, w_{\mathrm{GMV}} = \frac{1}{\mathbf{1}_p'\Sigma^{-1}\mathbf{1}_p}.$$
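The closed-form global minimum variance solution can be sketched in a few lines of numpy (an illustration with a toy covariance matrix, not the paper's code):

```python
import numpy as np

def gmv_weights(Sigma):
    """Global minimum variance weights: Sigma^{-1} 1 / (1' Sigma^{-1} 1)."""
    ones = np.ones(Sigma.shape[0])
    x = np.linalg.solve(Sigma, ones)   # Sigma^{-1} 1 without explicit inversion
    return x / (ones @ x)

def gmv_risk(Sigma):
    """Minimum portfolio variance: 1 / (1' Sigma^{-1} 1)."""
    ones = np.ones(Sigma.shape[0])
    return 1.0 / (ones @ np.linalg.solve(Sigma, ones))

# Toy 3-asset covariance matrix (illustrative values)
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
w = gmv_weights(Sigma)
print(w, w.sum())   # the weights sum to 1 by construction
```

Note the use of `np.linalg.solve` rather than forming $\Sigma^{-1}$ explicitly, which is both faster and numerically more stable.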
The Markowitz mean-variance framework takes into account a fixed target expected return, say $\rho$, and solves the optimization problem
$$\min_{w} \; w'\Sigma w \quad \text{subject to} \quad w'\mu = \rho, \;\; w'\mathbf{1}_p = 1. \qquad (2)$$
The optimal weight for this problem is given by
$$w_{\mathrm{MK}} = \frac{(C - \rho B)\,\Sigma^{-1}\mathbf{1}_p + (\rho A - B)\,\Sigma^{-1}\mu}{AC - B^2},$$
where $A = \mathbf{1}_p'\Sigma^{-1}\mathbf{1}_p$, $B = \mathbf{1}_p'\Sigma^{-1}\mu$, and $C = \mu'\Sigma^{-1}\mu$.
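A sketch of the standard two-constraint mean-variance solution, with the scalars $A = \mathbf{1}'\Sigma^{-1}\mathbf{1}$, $B = \mathbf{1}'\Sigma^{-1}\mu$, and $C = \mu'\Sigma^{-1}\mu$ computed inside the function (the toy inputs below are illustrative, not the paper's data):

```python
import numpy as np

def markowitz_weights(Sigma, mu, rho):
    """Minimum-variance weights subject to w'1 = 1 and w'mu = rho
    (the standard two-constraint Lagrangian solution)."""
    ones = np.ones(Sigma.shape[0])
    Ti = np.linalg.solve(Sigma, ones)   # Sigma^{-1} 1
    Tm = np.linalg.solve(Sigma, mu)     # Sigma^{-1} mu
    A, B, C = ones @ Ti, ones @ Tm, mu @ Tm
    return ((C - rho * B) * Ti + (rho * A - B) * Tm) / (A * C - B**2)

# Toy example: diagonal covariance, illustrative means, 8% target return
Sigma = np.diag([0.04, 0.09, 0.16])
mu = np.array([0.05, 0.07, 0.10])
w = markowitz_weights(Sigma, mu, rho=0.08)
print(w.sum(), w @ mu)   # both constraints hold: sum 1, target return rho
```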
Our proposed methodology for portfolio optimization in the high-dimensional scenario depends on the intrinsic relationship between the elements of the precision matrix and the regression coefficients obtained by regressing each individual excess asset return on the remaining ones. We elaborate on this relationship in detail below.
3.1 Regression approach to precision matrix estimation
Consider the $p$ separate linear regression models
$$r_{it} = \sum_{j \neq i} \beta_{ij}\, r_{jt} + \varepsilon_{it}, \qquad i = 1, \ldots, p. \qquad (3)$$
We can show that, for every $i$, a necessary and sufficient condition for the error term $\varepsilon_{it}$ and the set of predictors $\{r_{jt} : j \neq i\}$ to be uncorrelated is $\beta_{ij} = -\theta_{ij}/\theta_{ii}$. In addition to this, we also have $\mathrm{Var}(\varepsilon_{it}) = 1/\theta_{ii}$ for every $i$. Also, for $i \neq j$, the correlation between $\varepsilon_{it}$ and $\varepsilon_{jt}$ is defined as the partial correlation between $r_{it}$ and $r_{jt}$,
$$\rho^{ij} = \mathrm{Corr}(\varepsilon_{it}, \varepsilon_{jt}) = -\frac{\theta_{ij}}{\sqrt{\theta_{ii}\,\theta_{jj}}}.$$
We can express the precision matrix as $\Theta = D^{1/2} R\, D^{1/2}$, where $D = \mathrm{diag}(\theta_{11}, \ldots, \theta_{pp})$ and $R$ is such that $R_{ii} = 1$ and $R_{ij} = -\rho^{ij}$ for $i \neq j$. This gives $\theta_{ij} = -\rho^{ij}\sqrt{\theta_{ii}\,\theta_{jj}}$.
From equation (3), we get an alternate representation of the regression coefficient as $\beta_{ij} = \rho^{ij}\sqrt{\theta_{jj}/\theta_{ii}}$, so that the partial correlation can be written as $\rho^{ij} = \mathrm{sign}(\beta_{ij})\sqrt{\beta_{ij}\,\beta_{ji}}$. Also, note that the sparsity structures of $\Theta$ and the corresponding partial correlation matrix are identical. Thus, estimation of the elements of the precision matrix or of the partial correlation matrix reduces to fitting $p$ parallel regression models to the individual excess asset returns, with the remaining asset returns as predictors.
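The identities linking the precision matrix, the regression coefficients, and the partial correlations can be verified numerically. Below is a small self-contained check with a randomly generated positive definite matrix (the construction and sizes are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 4, 50000                       # small p; large n for a clean empirical check
M = rng.normal(size=(p, p))
Sigma = M @ M.T + p * np.eye(p)       # random positive definite covariance
Theta = np.linalg.inv(Sigma)          # the true precision matrix

# Population regression of variable i on the rest: beta_ij = -theta_ij / theta_ii
i, j = 0, 1
beta = -Theta[i, :] / Theta[i, i]
beta[i] = 0.0

# Partial correlation rho^{ij} = -theta_ij / sqrt(theta_ii theta_jj),
# consistent with beta_ij = rho^{ij} sqrt(theta_jj / theta_ii)
rho_ij = -Theta[i, j] / np.sqrt(Theta[i, i] * Theta[j, j])
assert np.isclose(beta[j], rho_ij * np.sqrt(Theta[j, j] / Theta[i, i]))

# Residual variance of the i-th regression equals 1 / theta_ii (empirically)
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
resid = X[:, i] - X @ beta
print(resid.var(), 1 / Theta[i, i])   # close for large n
```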
3.2 Joint regression shrinkage
We propose to work with the joint regression shrinkage approach named space (Sparse PArtial Correlation Estimation; Peng et al., 2009) that considers a joint loss function given by
$$L(\rho, \theta) = \frac{1}{2} \sum_{i=1}^{p} w_i \sum_{t=1}^{n} \left( r_{it} - \sum_{j \neq i} \rho^{ij} \sqrt{\frac{\theta_{jj}}{\theta_{ii}}}\; r_{jt} \right)^2,$$
where $w_1, \ldots, w_p$ are non-negative weights. The space method minimizes the joint penalized loss function
$$L(\rho, \theta) + \lambda \sum_{1 \le i < j \le p} |\rho^{ij}|, \qquad (4)$$
where $\lambda > 0$ is the tuning parameter. The proposed approach for partial correlation estimation has several advantages over other nodewise regression-based approaches. First, since we estimate the partial correlations directly by minimizing the penalized joint loss function in (4), the sign consistency of the regression coefficients $\beta_{ij}$ and $\beta_{ji}$ is preserved automatically, and the estimated precision matrix is symmetric. This is in contrast to the nodewise regression methods of Meinshausen and Bühlmann (2006) and Callot et al. (2021), where sign consistency is not guaranteed and the estimated precision matrix is not necessarily symmetric. Additionally, owing to the direct estimation of the partial correlations and the residual variances, the number of unknown parameters to be estimated is $p(p-1)/2 + p$, in comparison to $p(p-1) + p$ for nodewise regression approaches, making the computations for the proposed method faster and more efficient. Finally, the weights $w_1, \ldots, w_p$ can be chosen to adjust for heteroscedastic noise in the regressions or to account for the overall dependence structure among the asset returns.
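The shape of the penalized joint objective can be sketched as follows. The actual estimation is done with the space package in R; the Python function below only evaluates the objective for given parameter values, and all argument names are ours:

```python
import numpy as np

def space_loss(R, rho, theta_diag, lam, w=None):
    """Evaluate the space-style penalized joint loss (illustrative sketch):
    0.5 * sum_i w_i sum_t (r_it - sum_{j!=i} beta_ij r_jt)^2
        + lam * sum_{i<j} |rho_ij|,
    with beta_ij = rho_ij * sqrt(theta_jj / theta_ii)."""
    n, p = R.shape
    if w is None:
        w = np.ones(p)                          # unweighted version
    # beta[i, j] = rho[i, j] * sqrt(theta_jj / theta_ii), zero on the diagonal
    beta = rho * np.sqrt(np.outer(1.0 / theta_diag, theta_diag))
    np.fill_diagonal(beta, 0.0)
    resid = R - R @ beta.T                      # residuals of the p regressions
    loss = 0.5 * np.sum(w * np.sum(resid**2, axis=0))
    penalty = lam * np.sum(np.abs(rho[np.triu_indices(p, k=1)]))
    return loss + penalty

# Sanity check: with all partial correlations zero, the loss reduces to
# half the total sum of squares of the returns (no penalty contribution)
rng = np.random.default_rng(0)
R = rng.normal(size=(50, 3))
print(space_loss(R, np.zeros((3, 3)), np.ones(3), lam=1.0))
```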
We assess the performance of our proposed method under different simulated settings and compare the results with three competing methods, namely the nodewise regression method (Callot et al., 2021), the factor-model-based POET estimator (Fan et al., 2013), and the shrinkage-based Ledoit-Wolf estimator (Ledoit and Wolf, 2004). We consider both the unweighted and weighted versions (named space-unweighted and space-weighted) of the proposed procedure, as outlined in Peng et al. (2009). Following the simulation setup of Callot et al. (2021), we consider two data generating processes (DGPs), given by a Toeplitz structure for the covariance and a sparse factor structure for the returns. The details of the DGPs are provided in the Supplementary section.
4.1 Performance evaluation metrics
We consider three different error metrics for evaluating the performance of the methods, namely, the weight estimation error, variance estimation error, and the risk estimation error, defined below.
We denote the optimal portfolio weight vector by $w$, where $w$ can be either $w_{\mathrm{GMV}}$ or $w_{\mathrm{MK}}$, corresponding to the global minimum variance portfolio or the Markowitz portfolio as given in equations (1) and (2), respectively. The estimated counterpart of $w$, obtained via the plug-in estimate of $\Theta$, is denoted by $\hat{w}$. The portfolio weight estimation error is defined as the $\ell_1$-norm of the difference between $\hat{w}$ and $w$, that is, $\|\hat{w} - w\|_1 = \sum_{j=1}^{p} |\hat{w}_j - w_j|$.
The variance estimation error is likewise defined as the absolute relative difference between the estimated optimal variance and the true optimal variance, that is,
$$\frac{\left|\hat{w}'\hat{\Sigma}\hat{w} - w'\Sigma w\right|}{w'\Sigma w}.$$
Finally, the risk estimation error is defined as the corresponding relative difference between the actual risk of the estimated portfolio and the true optimal risk,
$$\frac{\left|\hat{w}'\Sigma\hat{w} - w'\Sigma w\right|}{w'\Sigma w}.$$
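The three metrics can be sketched directly from their definitions (a toy illustration with our reading of the normalizations, not the paper's evaluation code):

```python
import numpy as np

def weight_error(w_hat, w):
    """l1 distance between estimated and true optimal weights."""
    return np.abs(w_hat - w).sum()

def variance_error(w_hat, Sigma_hat, w, Sigma):
    """|estimated optimal variance - true optimal variance| / true variance."""
    return abs(w_hat @ Sigma_hat @ w_hat - w @ Sigma @ w) / (w @ Sigma @ w)

def risk_error(w_hat, w, Sigma):
    """Actual risk of the plug-in portfolio relative to the true optimum."""
    return abs(w_hat @ Sigma @ w_hat - w @ Sigma @ w) / (w @ Sigma @ w)

# Toy check: a perfect estimate incurs zero error on all three metrics
Sigma = np.array([[0.04, 0.01], [0.01, 0.09]])
w = np.array([0.6, 0.4])
print(weight_error(w, w), variance_error(w, Sigma, w, Sigma), risk_error(w, w, Sigma))
```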
4.2 Simulation setting
We consider varying sample sizes $n$ with correspondingly chosen dimensions $p$ for all the different DGPs. For each $(n, p)$ combination, we run 100 replications and report the performance metrics for the proposed joint shrinkage methods along with their competitors. We use the space package in R to implement the proposed method, and the R code provided in the supplementary section of Callot et al. (2021) for the competing methods.
The results under the different DGPs and varying $(n, p)$ combinations are shown in Tables 1a, 1b, 2a, and 2b for the Toeplitz covariance structure, and in Tables 3a, 3b, 4a, and 4b for the sparse factor structure for returns. The proposed joint shrinkage methods, both the unweighted and weighted versions, demonstrate excellent performance on all three metrics under the Toeplitz covariance structure. Their variance and weight estimation errors are better than those of both the nodewise and POET estimators, while their risk estimation errors are the lowest among all the competing methods. Under the sparse factor structure for the returns, the joint shrinkage methods and the Ledoit-Wolf method perform better than the other two methods. Overall, all the errors decrease with increasing sample size $n$, irrespective of the increasing dimension $p$. These results validate the utility of the proposed joint shrinkage approach for the portfolio optimization task, especially when the number of assets is large.
5 Real datasets
We now demonstrate the performance of our proposed method and the competing approaches on real datasets of excess asset returns. We first describe the metrics on the basis of which we will compare the performances of the different methods.
5.1 Performance evaluation metrics
We adopt a ‘rolling-sample’ approach (DeMiguel et al., 2009) to evaluate the performances of the portfolio optimization methods. For $p$-dimensional excess returns data spanning $T$ time points (daily or monthly), we split the data into two portions, namely the training and the testing data, with respective lengths $m$ and $T - m$. The portfolios are rebalanced at the start of each rolling sample. The details of the rolling-sample approach are outlined in the Supplementary section.
We compute four common metrics for performance evaluation – out-of-sample mean return and variance, out-of-sample Sharpe ratio, and the portfolio turnover. We also consider portfolios including and excluding associated transaction costs. The out-of-sample Sharpe ratio is given by $\widehat{SR} = \hat{\mu}_{os}/\hat{\sigma}_{os}$, where $\hat{\mu}_{os}$ and $\hat{\sigma}_{os}^2$ are respectively the out-of-sample mean portfolio return and variance. In the absence of transaction costs, these quantities are defined as:
$$\hat{\mu}_{os} = \frac{1}{T - m} \sum_{t=m}^{T-1} \hat{w}_t' r_{t+1}, \qquad \hat{\sigma}_{os}^2 = \frac{1}{T - m - 1} \sum_{t=m}^{T-1} \left( \hat{w}_t' r_{t+1} - \hat{\mu}_{os} \right)^2,$$
where $\hat{w}_t$ is the estimated portfolio weight at time $t$. In the presence of a proportional transaction cost $c$, we need to modify the above-defined measures suitably. Denoting by $\hat{w}_{t^-}$ the estimated portfolio weight just before rebalancing at time point $t$, the net return at $t+1$ accounting for the transaction cost is given by
$$r_{t+1}^{net} = \left(1 + \hat{w}_t' r_{t+1}\right)\left(1 - c \sum_{j=1}^{p} \left|\hat{w}_{j,t} - \hat{w}_{j,t^-}\right|\right) - 1.$$
The modified out-of-sample mean and variance of the portfolio under transaction cost are then given by
$$\hat{\mu}_{os}^{net} = \frac{1}{T - m} \sum_{t=m}^{T-1} r_{t+1}^{net}, \qquad \left(\hat{\sigma}_{os}^{net}\right)^2 = \frac{1}{T - m - 1} \sum_{t=m}^{T-1} \left( r_{t+1}^{net} - \hat{\mu}_{os}^{net} \right)^2.$$
The Sharpe ratio is modified suitably as $\widehat{SR}^{net} = \hat{\mu}_{os}^{net}/\hat{\sigma}_{os}^{net}$. The portfolio turnover is defined as
$$\text{Turnover} = \frac{1}{T - m} \sum_{t=m}^{T-1} \sum_{j=1}^{p} \left|\hat{w}_{j,t+1} - \hat{w}_{j,(t+1)^-}\right|,$$
where $\hat{w}_{j,(t+1)^-}$ denotes the weight in asset $j$ just before rebalancing at time point $t+1$.
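These out-of-sample measures can be sketched as follows. This is a simplified Python illustration with our own variable names: the weight drift within each period is computed explicitly, and the transaction-cost adjustment is applied additively as a first-order approximation of the multiplicative net-return formula above:

```python
import numpy as np

def oos_metrics(returns, weights, c=0.0):
    """Out-of-sample mean, variance, Sharpe ratio, and turnover for a sequence
    of rebalanced portfolios. returns[t] holds the excess returns realized over
    period t; weights[t] is the portfolio held during period t; c is a
    proportional transaction cost per unit traded (illustrative sketch)."""
    T, p = returns.shape
    gross = np.einsum("tj,tj->t", weights, returns)      # per-period returns
    # Weight drift within a period: holdings just before the next rebalancing
    drifted = weights * (1 + returns) / (1 + gross)[:, None]
    # Amount traded at each rebalancing (initial portfolio setup ignored)
    trades = np.abs(weights[1:] - drifted[:-1]).sum(axis=1)
    turnover = trades.mean()
    net = gross.copy()
    net[1:] -= c * trades                                # linearized cost adjustment
    mu, var = net.mean(), net.var(ddof=1)
    return mu, var, mu / np.sqrt(var), turnover

# Usage with simulated returns and a fixed equal-weight policy
rng = np.random.default_rng(0)
rets = rng.normal(0.001, 0.02, size=(12, 4))
W = np.full((12, 4), 0.25)
mu, var, sr, to = oos_metrics(rets, W, c=0.0005)
print(mu, var, sr, to)
```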
We have used daily and monthly returns of components of the S&P 500 index; the two sampling frequencies lead to different $(n, p)$ setups. The data are split into training and testing sets, and thereafter we use the rolling-sample approach described above. A brief description of the data, including the training and testing periods, is provided in Table 5. For the Markowitz portfolio, we use monthly and daily target returns equivalent to a fixed compounded annual return, and take the three-month treasury bill rate as the risk-free return rate. The proportional transaction cost is set to a fixed number of basis points per transaction, as in DeMiguel et al. (2009).
The performances of the different methods for the first set of training periods are presented in Table 6. For the monthly returns data, the Ledoit-Wolf estimator yields the highest returns for the global minimum variance portfolios, but at the cost of substantial risk, resulting in poor Sharpe ratios. The performances of the proposed joint shrinkage methods are generally the best, or at least comparable with the nodewise regression approach. For the Markowitz portfolio, our proposed methods perform best, with the highest returns, lowest risks, and highest Sharpe ratios. Additionally, in the presence of transaction costs, the joint shrinkage methods give the lowest turnover, thereby avoiding the higher fees associated with frequent rebalancing. For the daily returns data, they perform best on all metrics when transaction costs are present. These results validate the utility of joint shrinkage in the portfolio optimization problem. We present additional real-data results corresponding to the second set of training samples in the Supplementary section.
In this paper, we have proposed using a regression-based joint shrinkage method to estimate the partial correlations and residual variances for a large number of assets. The estimated parameters are then used to solve a portfolio optimization problem, corresponding to two optimal portfolio choices. Through extensive numerical studies, we demonstrate the excellent performances of our proposed approach as compared to competing methods. Our proposed approach also enjoys the ease of implementation and the availability of fast computational methods, even for a large number of financial assets.
S.B. is supported by the IIM Indore Young Faculty Research Chair Award grant.
- Callot et al. (2021) Callot, L., Caner, M., Önder, A. Ö., and Ulaşan, E. (2021). A nodewise regression approach to estimating large portfolios. Journal of Business & Economic Statistics, 39(2):520–531.
- DeMiguel et al. (2009) DeMiguel, V., Garlappi, L., and Uppal, R. (2009). Optimal versus naive diversification: How inefficient is the 1/n portfolio strategy? The Review of Financial Studies, 22(5):1915–1953.
- Fan et al. (2013) Fan, J., Liao, Y., and Mincheva, M. (2013). Large covariance estimation by thresholding principal orthogonal complements. Journal of the Royal Statistical Society: Series B, 75(4).
- Friedman et al. (2008) Friedman, J., Hastie, T., and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3):432–441.
- Ledoit and Wolf (2004) Ledoit, O. and Wolf, M. (2004). A well-conditioned estimator for large-dimensional covariance matrices. Journal of Multivariate Analysis, 88(2):365–411.
- Markowitz (1952) Markowitz, H. (1952). Portfolio selection. The Journal of Finance, 7(1):77–91.
- Meinshausen and Bühlmann (2006) Meinshausen, N. and Bühlmann, P. (2006). High-dimensional graphs and variable selection with the Lasso. The Annals of Statistics, 34(3):1436 – 1462.
- Peng et al. (2009) Peng, J., Wang, P., Zhou, N., and Zhu, J. (2009). Partial correlation estimation by joint sparse regression models. Journal of the American Statistical Association, 104(486):735–746. PMID: 19881892.
- Sabharwal and Anjum (2018) Sabharwal, C. L. and Anjum, B. (2018). Robo-revolution in the financial sector. In 2018 International Conference on Computational Science and Computational Intelligence (CSCI), pages 1289–1292. IEEE.
s.1 Data generating process details
We have considered two data generating processes (DGPs) in our simulation studies, following the setup in Callot et al. (2021). We describe the DGPs in detail, along with the choice of underlying parameters.
Toeplitz structure for covariance: For each time point $t$, the vector of excess asset returns $r_t$ is simulated from a $p$-dimensional Normal distribution with mean vector $\mu$ and a Toeplitz covariance matrix $\Sigma$ with entries $\sigma_{ij} = a^{|i-j|}$ for all $i, j$, for a fixed decay parameter $a \in (0, 1)$. We consider two different choices for the mean vector $\mu$: (i) a zero mean vector, and (ii) a non-zero mean vector. We evaluate the global minimum variance portfolio for the first choice of $\mu$ and the Markowitz portfolio for the latter. Considering a fixed compounded annual return over 252 trading days, the target return $\rho$ is set at the equivalent daily rate.
Sparse factor structure for returns: We consider a weakly sparse three-factor structure for the excess returns, given by
$$r_t = \mu + B f_t + e_t,$$
where $f_t$ is a three-dimensional factor vector, $B$ is a $p \times 3$ matrix of factor loadings, and $e_t$ is the error term. The columns of $B$ and the factors $f_t$ are generated independently from specified distributions, as in Callot et al. (2021). As in the previous DGP, we consider two different choices for the mean vector $\mu$: (i) a zero mean vector, and (ii) a non-zero mean vector. We evaluate the performances of the different methods under the global minimum variance portfolio for the first choice of $\mu$ and the Markowitz portfolio for the latter, with the same choice of the target return as in the previous scenario.
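Both DGPs can be sketched in a few lines of numpy. All parameter values below (sizes, Toeplitz decay, sparsity level, generating distributions) are illustrative placeholders, not the paper's actual choices:

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 120, 50          # illustrative sizes
a = 0.5                 # illustrative Toeplitz decay parameter

# DGP 1: Toeplitz covariance, sigma_ij = a^{|i-j|}
idx = np.arange(p)
Sigma = a ** np.abs(idx[:, None] - idx[None, :])
R_toeplitz = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

# DGP 2: three-factor structure r_t = B f_t + e_t with weakly sparse loadings
# (zero mean here; illustrative Gaussian factors, loadings, and sparsity mask)
k = 3
B = rng.normal(size=(p, k)) * (rng.random((p, k)) < 0.3)
f = rng.normal(size=(n, k))
e = rng.normal(size=(n, p))
R_factor = f @ B.T + e

print(R_toeplitz.shape, R_factor.shape)
```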
s.2 Rolling-sample approach for evaluating performance in real datasets
As mentioned in Section 5, we adopt the ‘rolling-sample’ approach for evaluating the performances of the different portfolio optimization methods. We briefly describe the steps below:
First, the optimal portfolio weights $\hat{w}_m$ are computed using the training data for the period $1, \ldots, m$, with the corresponding estimated portfolio return $\hat{w}_m' r_{m+1}$ at the time point $m+1$.
Next, the training window is rolled forward by one time point, taking the data for the period $2, \ldots, m+1$ as the new training data, to obtain the optimal weights $\hat{w}_{m+1}$ along with the corresponding estimated return $\hat{w}_{m+1}' r_{m+2}$ at time point $m+2$.
We iterate the above step until the end of the sample, finally computing the optimal weights $\hat{w}_{T-1}$ taking the period $T-m, \ldots, T-1$ as the training data and estimating the return $\hat{w}_{T-1}' r_T$ at time point $T$. After each step above, we rebalance the portfolio at the start of the next time point.
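The steps above can be sketched as a simple loop (illustrative Python; `estimate_weights` stands in for any of the portfolio rules, and a naive equal-weight rule is used as a placeholder estimator):

```python
import numpy as np

def rolling_sample(R, m, estimate_weights):
    """Rolling-sample evaluation: at each step, fit weights on the most recent
    m observations and record the realized portfolio return one step ahead."""
    T, p = R.shape
    oos_returns, oos_weights = [], []
    for t in range(m, T):
        w = estimate_weights(R[t - m:t])    # train on the window ending at t-1
        oos_returns.append(w @ R[t])        # realized return at time t
        oos_weights.append(w)               # portfolio rebalanced each step
    return np.array(oos_returns), np.array(oos_weights)

# Usage with simulated returns and an equal-weight placeholder rule
R = np.random.default_rng(1).normal(size=(30, 5))
ret, W = rolling_sample(R, m=20,
                        estimate_weights=lambda X: np.ones(X.shape[1]) / X.shape[1])
print(ret.shape, W.shape)   # 10 out-of-sample returns and weight vectors
```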
s.3 Additional real data results
For the second set of monthly returns training samples, we reach similar conclusions regarding the performances of the different methods as in the first set. For the daily returns data, all the returns come out negative in the second training sample. The Ledoit-Wolf method gives the lowest return, albeit with the lowest risk, while the joint shrinkage methods result in the highest Sharpe ratios and lowest turnovers in almost all of the different scenarios.
s.4 Visualizations
In this section, we present visualizations of the estimated precision matrices under the unweighted version of our proposed approach and the nodewise regression method (Callot et al., 2021) via chord diagrams. The chord diagrams are presented in Figures S.1, S.2, S.3, and S.4. These visualizations reveal the dependence structure of the underlying assets, and in all the cases we find that the space approach recovers the true precision matrix more accurately as compared to the nodewise regression approach.