I Introduction
Multiperiod portfolio choice is a central problem in finance. It is described by an investor who faces the problem of determining how to sequentially allocate his capital to maximise some performance measure over multiple periods. Online portfolio selection algorithms tackle the problem of maximising cumulative wealth by adaptively identifying and exploiting patterns in historical data [1]. The key feature of these algorithms is that they are online: patterns and portfolio decisions update upon the arrival of new data, thereby adapting to changing market conditions [1].
Online portfolio selection algorithms can be classified according to their update scheme. Traditional algorithms forecast asset returns and use these forecasts to update the current portfolio. Li et al. (2014) classify traditional algorithms into the following categories [1]:
Follow-the-Winner (FTW) algorithms assume that recent stock performance will persist and so transfer capital from the worst-performing stocks to the best-performing stocks.

Follow-the-Loser (FTL) algorithms assume that recent stock performance will revert to a long-run mean and so transfer capital from the best-performing stocks to the worst-performing stocks.

Pattern-Matching based (PM) algorithms assume that market conditions repeat themselves and so allocate capital based on what was optimal for similar historical periods.
The Pattern-Matching based approach has the least restrictive assumption about market behaviour. This affords greater flexibility in algorithm design and allows these algorithms to exploit a wider range of market conditions, thereby outperforming the other approaches [2, 3, 4, 5]. In particular, the CORN-K (CORrelation-driven Nonparametric learning) algorithm appears to demonstrate the best results. Recently, the CORN-K algorithm has been extended to incorporate risk in its portfolio selection [4, 5].
However, the CORN-K algorithm (and its extensions) often outputs a cautious portfolio, which restricts its returns. In short, this occurs when the algorithm is unable to detect a subset of historical data that is similar to the recent data and therefore allocates wealth equally across assets.
To address this limitation, we propose the Aggressive Multi-Temporal Allocation (AMA-K) algorithm, which combines the Pattern-Matching and Follow-the-Winner principles.
II Online Portfolio Selection
An investor wants to allocate his initial capital $S_0$ into a portfolio of $m$ securities for each of the $n$ trading days to maximise his terminal wealth $S_n$. The investor’s portfolios are represented by $\mathbf{b}_t = (b_{t,1}, \dots, b_{t,m})$, where $b_{t,i}$ is the proportion of the capital invested in security $i$ at time $t$. Furthermore, portfolio positions are constrained to be non-negative and all capital is invested at each period: $b_{t,i} \geq 0$ and $\sum_{i=1}^{m} b_{t,i} = 1$.
Define the price relative for security $i$ at day $t$ as $x_{t,i} = \exp(p_{t,i} - p_{t-1,i})$, where $p_{t,i}$ denotes the log-price. Hence, denote the price relative vector as $\mathbf{x}_t = (x_{t,1}, \dots, x_{t,m})$. A sequence of price relative vectors is used to define a market window $X_t^w = (\mathbf{x}_{t-w+1}, \dots, \mathbf{x}_t)$, where $w$ is the given window size.
An online portfolio selection algorithm is a function $A$ that takes the historical price data at time $t$ and outputs a portfolio:
(1) $\mathbf{b}_{t+1} = A(\mathbf{x}_1, \dots, \mathbf{x}_t)$
The portfolio $\mathbf{b}_{t+1}$ is constructed at the start of period $t+1$, using all information up until then. The terminal wealth at the end of period $n$ is given by:
(2) $S_n = S_0 \prod_{t=1}^{n} \mathbf{b}_t^\top \mathbf{x}_t$
For tractability, we make the following assumptions: each asset is arbitrarily divisible, desired quantities can be traded at the most recent closing price, and market prices are not affected by the investor’s actions. In addition, we ignore trading costs and do not allow for borrowing or short-selling.
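The wealth recursion described above can be sketched in a few lines. This is a minimal illustration under the stated assumptions (no trading costs, long-only, fully invested); the function name `terminal_wealth` and the numbers are hypothetical and not taken from the paper.

```python
def terminal_wealth(portfolios, price_relatives, initial_wealth=1.0):
    """Compound wealth over all periods: each day multiplies wealth by b_t . x_t."""
    wealth = initial_wealth
    for b, x in zip(portfolios, price_relatives):
        # Long-only and fully invested: no borrowing or short-selling.
        assert all(weight >= 0 for weight in b)
        assert abs(sum(b) - 1.0) < 1e-9
        wealth *= sum(weight * rel for weight, rel in zip(b, x))
    return wealth


# Two assets over two days (illustrative numbers only).
portfolios = [[0.5, 0.5], [1.0, 0.0]]
price_relatives = [[1.10, 0.90], [1.20, 1.00]]
final_wealth = terminal_wealth(portfolios, price_relatives)
```

On the first day the equal-weight portfolio breaks even, and on the second day all capital sits in the first asset, so wealth grows by that asset's price relative.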
III Correlation-driven Nonparametric Learning
CORN-based strategies use experts that construct portfolios using previous market windows. Each expert has a portfolio and a cumulative wealth at time $t$. A set of experts is considered, each defined by a window size ($w$) and a Pearson product-moment correlation coefficient threshold ($\rho$) [3]. In top-$K$ based strategies, the $K$ experts with the most cumulative wealth at time $t$ have their portfolios combined. Each selected expert is responsible for a share of the portfolio’s allocation for a given day. This combined portfolio is the agent’s portfolio at time $t$.
Each expert compares the most recent market window at time $t$ with all historical market windows of the same size. Each expert searches for their optimal portfolio using the set of historical days whose preceding market window has a correlation with the most recent window that is equal to or greater than their respective $\rho$. Days that match this required $\rho$ are called correlation similar days and are represented by $C_t$ [3]. Experts update their wealth at the end of a day using this portfolio and the day’s returns.
An expert’s portfolio is determined by the $\mathbf{b}$ that maximises Equation 3 at time $t$:
(3) $\mathbf{b}_t = \arg\max_{\mathbf{b}} \prod_{i \in C_t} \mathbf{b}^\top \mathbf{x}_i$
At times, the correlation similar set of days may be small or empty. In this case the expert returns a uniform portfolio [3, 4, 5]. A uniform portfolio distributes wealth equally amongst all assets, which generally yields lower returns. DRICORN-K is a variation of CORN-K that classifies the market and adjusts accordingly. Classification is done through the use of $\beta$ (market Beta) in searching for the optimal portfolio [5]. Utilising $\beta$ allows for more aggressive or defensive portfolios based on current market conditions [5]. At times, $\beta$ does not impact the portfolio construction, in which case DRICORN-K returns a similar portfolio to CORN-K.
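The search for correlation similar days, and the uniform fallback when none are found, can be sketched as follows. This is a simplified illustration rather than the reference CORN implementation: the helper names are hypothetical, windows are flattened into one list before the Pearson coefficient is computed, and a window with zero variance is treated as having zero correlation.

```python
from statistics import fmean


def pearson(u, v):
    """Pearson product-moment correlation of two equal-length sequences."""
    mu_u, mu_v = fmean(u), fmean(v)
    cov = sum((a - mu_u) * (c - mu_v) for a, c in zip(u, v))
    var_u = sum((a - mu_u) ** 2 for a in u)
    var_v = sum((c - mu_v) ** 2 for c in v)
    if var_u == 0 or var_v == 0:
        return 0.0  # assumption: a constant window matches nothing
    return cov / (var_u * var_v) ** 0.5


def flatten(window):
    return [value for day in window for value in day]


def correlation_similar_days(history, w, rho):
    """Days i whose preceding w-day window correlates >= rho with the latest window."""
    t = len(history)
    if t <= w:
        return []
    latest = flatten(history[t - w:t])
    return [i for i in range(w, t)
            if pearson(flatten(history[i - w:i]), latest) >= rho]


def uniform_portfolio(m):
    """Fallback used when the similar set is empty."""
    return [1.0 / m] * m


# Alternating two-asset market (illustrative numbers only).
history = [[1.0, 1.2], [1.1, 0.9], [1.0, 1.2], [1.1, 0.9], [1.0, 1.2]]
similar = correlation_similar_days(history, w=2, rho=0.9)
```

In this toy history only the window ending just before day 3 matches the latest window, so the expert would optimise over day 3 alone; with too little history the set is empty and the uniform portfolio is returned instead.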
IV Method
Clustering has previously been employed in online portfolio selection [6, 7, 8, 9, 10]. Similar approaches are given by Khedmati et al. (2020), where portfolios are optimised using clustering techniques, market windows, Pattern-Matching and similar day samples [8, 6]. Nanda et al. (2010) found that K-means clustering provided the best result for online portfolio selection based on cluster compactness using the Bombay Stock Exchange [10]. Our work extends these previous successes by directly integrating clustering into the CORN-K framework. Further, we introduce a more effective low-dimensional representation for market windows that improves the clustering results.
A limitation of CORN-K’s use of correlation similar days is that such days are rare and usually only common amongst experts with smaller window sizes and lower values for $\rho$. Hence, experts that have a suitable quantity of data to produce inference in the market dynamics are unable to do so. To overcome this limitation, we use online K-Means clustering (K-online) with Manhattan distance as an alternative to discover sets of cluster similar days. These cluster similar days do not require a correlation coefficient threshold and are considered similar if they belong to the same centroid. Manhattan distance is selected for computational efficiency; alternate metrics could be considered.
Our variation to CORN-K lies in dealing with empty sets of correlation similar days. If we encounter a day $t$ where an expert’s correlation similar set is empty for its market window size $w$, we make use of the cluster assigned to the current day’s market vector, as created in Algorithm 1. We let the correlation similar set be all days assigned to the same cluster and maximise using Equation 3.
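The fallback to cluster similar days can be illustrated with a nearest-centroid lookup under Manhattan distance. This sketch assumes the centroids have already been fitted and that `market_vectors[i]` describes the window ending on day `i`, with the last entry being the current day; these conventions and the function names are assumptions made for illustration.

```python
def manhattan(u, v):
    """L1 (Manhattan) distance between two vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def nearest_centroid(vector, centroids):
    """Index of the centroid closest to the vector in L1 distance."""
    return min(range(len(centroids)),
               key=lambda k: manhattan(vector, centroids[k]))


def cluster_similar_days(market_vectors, centroids):
    """Earlier days whose market vector shares a cluster with the current day."""
    labels = [nearest_centroid(v, centroids) for v in market_vectors]
    today = labels[-1]
    return [day for day, label in enumerate(labels[:-1]) if label == today]


# Hypothetical two-dimensional market vectors and two fitted centroids.
market_vectors = [[1.0, 0.1], [3.0, 0.9], [1.1, 0.2], [2.9, 1.0], [1.05, 0.15]]
centroids = [[1.0, 0.1], [3.0, 1.0]]
similar = cluster_similar_days(market_vectors, centroids)
```

No correlation threshold appears anywhere: membership of the same cluster is the only similarity criterion, which is why this set is rarely empty.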
IV-A Aggressive Multi-Temporal Allocation
We have chosen to maintain the method of choosing the best experts in our algorithm. Furthermore, we use the concept of market windows ($X_t^w$ for a market window of size $w$ at day $t$); these are matrices that represent $w$ consecutive market days’ price relative vectors across all shares.
IV-A1 Agent Memory
At the start, all the agent’s experts have days of market history. As the algorithm proceeds, new days are added to the agent’s memory until it has days of market history. At days, the agent forgets all but the most recent days of market history. The choice of ensures that the agent considers only recent price movements and the length of keeps the agent’s portfolio allocations “stable”. A small value results in a myopic agent that exploits volatility. A high value results in an agent that looks for “blue chips” that represent long-term growth trends.
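Each market window is represented by a market vector with four components: the sum of each stock’s mean in the market window (Equation 4), the market window’s mean, the sum of each stock’s variance (Equation 5) and the variance of the market window.
(4) $\sum_{i=1}^{m} \bar{x}_i, \quad \bar{x}_i = \frac{1}{w} \sum_{j=t-w+1}^{t} x_{j,i}$
where $i$ represents the asset.
(5) $\sum_{i=1}^{m} \frac{1}{w} \sum_{j=t-w+1}^{t} (x_{j,i} - \bar{x}_i)^2$
where $\bar{x}_i$ is the average price relative for the asset $i$ in the given market window.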
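The four-component market vector can be computed directly from a window of price relative vectors. The sketch below assumes population variance (division by the window length); the text does not state which variance estimator is used, so treat that choice, and the name `market_vector`, as assumptions.

```python
from statistics import fmean, pvariance


def market_vector(window):
    """Summarise a window (list of daily price relative vectors) as 4 numbers."""
    per_asset = list(zip(*window))  # transpose: one series per asset
    all_values = [value for day in window for value in day]
    return [
        sum(fmean(series) for series in per_asset),      # sum of per-stock means
        fmean(all_values),                               # mean of the whole window
        sum(pvariance(series) for series in per_asset),  # sum of per-stock variances
        pvariance(all_values),                           # variance of the whole window
    ]


# Two days, two assets (illustrative numbers only).
window = [[1.0, 1.2], [1.2, 0.8]]
vec = market_vector(window)
```

Whatever the window size or number of assets, the representation stays four-dimensional, which keeps the distance computations in the clustering step cheap.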
We initialise the algorithm with , windows of size where market vectors and is the maximum window size. ’s value is from the original CORN paper [1]. We initialise the number of cluster centroids as ( represents the number of days). The initial centroid amount was determined using the validation data.
Subsequently, we shift the market window forward by one day and assign the new market vectors for their respective windows to their nearest clusters for that agent’s experts. Note that market vectors are assigned to experts with the same window size. We redo the clustering every days to keep the allocation of market vectors uniform and relevant. The additional centroids allow for the new unseen market vectors to be represented. Even though this readjustment interval was determined using the validation set, the strategy was found to be insensitive to intervals in the range to .
Since the K-online algorithm does not converge, we terminate the clustering when reassignments affect only a threshold percentage of market vectors. This threshold is a manually determined parameter for the general case when the assignment of market vectors to clusters proceeded normally. If the threshold is not obtained within ten reinitialisation attempts, we use the last readjustment. This maximum number of attempts allows the clustering process to proceed, although it may yield undesirable cluster assignments.
Every days we reset the K-online clustering using the most recent days of data and centroids. We reset at this interval because it balances factoring in new information against acting erratically by continuously switching between asset allocations.
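The stopping rule described above can be illustrated with a sequential (MacQueen-style) online K-means pass that repeats until few vectors change cluster. This is a sketch under assumptions, not the authors' K-online procedure: the update rule, the `tolerance` value and the use of repeated passes in place of reinitialisation attempts are all illustrative choices.

```python
def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


def online_kmeans(vectors, initial_centroids, tolerance, max_passes=10):
    """Sequential K-means; stops once the reassigned fraction is <= tolerance."""
    centroids = [list(c) for c in initial_centroids]
    counts = [0] * len(centroids)
    labels = [None] * len(vectors)
    for _ in range(max_passes):
        changed = 0
        for idx, v in enumerate(vectors):
            k = min(range(len(centroids)),
                    key=lambda j: manhattan(v, centroids[j]))
            if labels[idx] != k:
                changed += 1
                labels[idx] = k
            # Move the winning centroid towards the point with a shrinking step.
            counts[k] += 1
            step = 1.0 / counts[k]
            centroids[k] = [c + step * (x - c) for c, x in zip(centroids[k], v)]
        if changed / len(vectors) <= tolerance:
            break  # few enough reassignments: accept the current clustering
    return labels, centroids


# Two well-separated groups (illustrative numbers only).
vectors = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
labels, centroids = online_kmeans(vectors, [[0.0, 0.0], [5.0, 5.0]], tolerance=0.05)
```

If the reassigned fraction never falls below the tolerance, the loop simply returns the labels from the final pass, mirroring the idea of falling back to the last readjustment.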
IV-A2 AMA-K
Given that the various (10-day, 120-day and 190-day) agents perform well for their respective time horizons, as shown by Table II, we create an algorithm that combines the agents. This combination creates an agent with three time horizons that we have defined as: short (10-day), medium (120-day) and long (190-day). Each time horizon has its own set of experts associated with it. The clustering algorithm is repeated for each sub-agent using their $d$ value. Here a sub-agent is an agent over a specific $d$-day time horizon. We take the portfolio for each time horizon and normalise it such that it represents the proportion of the sub-agent’s wealth allocated to each asset at day $t$. These portfolios are merged. The resulting portfolio is then divided by the number of time horizons under consideration (here we divide by three). This resultant portfolio may have a diverse range of assets, which should reduce risk whilst maintaining a high expected return. This idea of efficient diversification is well-founded by Markowitz (1952) in his famous paper Portfolio Selection [11].
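The merging step can be sketched as normalising each sub-agent's portfolio and averaging the results. The sketch assumes that "merged" means an element-wise sum before dividing by the number of horizons; the function name and the example allocations are hypothetical.

```python
def merge_portfolios(sub_portfolios):
    """Normalise each sub-agent portfolio to sum to one, then average them."""
    m = len(sub_portfolios[0])
    merged = [0.0] * m
    for portfolio in sub_portfolios:
        total = sum(portfolio)
        for i, weight in enumerate(portfolio):
            merged[i] += weight / total  # proportion of that sub-agent's wealth
    return [weight / len(sub_portfolios) for weight in merged]


# Unnormalised allocations from three horizons (illustrative numbers only).
short_term = [2.0, 0.0, 0.0]
medium_term = [0.0, 1.0, 1.0]
long_term = [0.0, 0.0, 4.0]
merged = merge_portfolios([short_term, medium_term, long_term])
```

Because every normalised portfolio sums to one, the average also sums to one, so the merged result is itself a valid fully invested portfolio.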
IV-B Metrics
In comparing our algorithm to similar approaches, we use the following metrics that aim to measure performance in a generalised manner.
Maximum Drawdown (MDD) [12]
(6) $\mathrm{MDD}(n) = \max_{t \in [0, n]} \mathrm{DD}(t)$
where $\mathrm{DD}(t) = \max_{i \in [0, t]} S_i - S_t$.
MDD is a risk evaluation metric which represents the maximum decline from a historical peak of the total wealth ($S_t$) achieved at the time $t$. The smaller the MDD value, the more risk tolerant the trading strategy.
Annualised Percentage Yield (APY) [13]
(7) $\mathrm{APY} = \sqrt[y]{S_n} - 1$
Here $S_n$ is the total return after $n$ trading periods, and $y$ is the number of years corresponding to $n$. APY measures the rate of return that was achieved and it takes into account the effect of compounding. Typically a greater APY is desired.
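The drawdown and yield metrics can be sketched as below. The drawdown here is measured in absolute wealth units (peak minus current wealth), which is an assumption; dividing by the peak would give the relative variant. Function names and the wealth path are illustrative.

```python
def max_drawdown(wealth):
    """Largest decline from any running peak of the cumulative wealth series."""
    peak = wealth[0]
    worst = 0.0
    for value in wealth:
        peak = max(peak, value)           # highest wealth seen so far
        worst = max(worst, peak - value)  # decline from that peak
    return worst


def annualised_percentage_yield(total_return, years):
    """Compounded yearly rate implied by a total return over a number of years."""
    return total_return ** (1.0 / years) - 1.0


# Hypothetical cumulative wealth path starting from 1.0.
wealth = [1.0, 1.2, 0.9, 1.5, 1.3]
mdd = max_drawdown(wealth)
apy = annualised_percentage_yield(wealth[-1] / wealth[0], years=2)
```

In this path the deepest decline runs from the peak of 1.2 down to 0.9, even though the later peak of 1.5 is higher, because the drop after that later peak is smaller.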
Annualised Sharpe Ratio (ASR) [14]
(8) $\mathrm{ASR} = \frac{\mathrm{APY} - R_f}{\sigma_p}$
Here $\mathrm{ASR}$ represents the annualised Sharpe Ratio after $n$ periods and $\mathrm{APY}$ is the Annualised Percentage Yield (Equation 7). $R_f$ is the risk-free rate of return and $\sigma_p$ is the annualised standard deviation of daily returns. We use the same assumptions as in DRICORN-K to calculate Equation 8 [5]: $R_f$ is set to 4% and $\sigma_p$ is annualised assuming an average of 252 trading days in a given year. The Sharpe Ratio captures the “return per unit of risk”. A higher ASR value is preferred.
IV-C Training and Validation
The data sets used are given in Table I. Assets had their prices adjusted for dividends and stock splits. The Validation and Test columns give the lengths of the validation and testing data sets respectively. All sets are measured in years, where 252 days is the average number of trading days in a year. The sets consisted of 6000, 5040 and 2520 days for training, validation and testing respectively.
In demonstrating memory’s effects, we trained agents using different values of $d$ in intervals of ten between 10 and 230 days. Table II is a subset of results for the best-performing values of $d$ in various periods.
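Table I: Data sets

| Exchange | Assets | Validation (years) | Test (years) |
|---|---|---|---|
| BIST | 46 | 2 | 1 |
| BOV | 28 | 4 | 2 |
| EUR | 46 | 4 | 2 |
| JSE | 38 | 4 | 2 |
| NAS | 41 | 4 | 2 |
| SP5 | 47 | 2 | 1 |

Table II: Results for the best-performing values of d

| d | MDD | APY | ASR |
|---|---|---|---|
| 10 | 0.277 | 0.665 | 0.028 |
| 120 | 0.300 | 0.806 | 0.034 |
| 190 | 0.252 | 0.723 | 0.030 |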
It was observed anecdotally that in markets that experienced high volatility with the best-performing asset constantly changing, the 10-day agent performed best. In markets that had a consistent best stock over a long period, the 190-day agent performed better. The 120-day agent represents a “middle of the way” agent that yielded an overall better strategy, as shown by the performance across the presented metrics.
IV-D Testing
Each testing data set consists of one year of data, with 300 days prior for CORN-based strategies to train with. This extra data allows CORN-based strategies to have a fairer comparison against AMA-K. The CORN-based algorithms were tuned with their optimal hyperparameters as set out in their respective papers [3, 5, 4]. In the case of our implementation, we have segmented days for each algorithm, where is the size of the largest sub-agent’s memory (here this would be 190 days). We compare our approaches to some common baselines such as UBAH, CRP and Best Stock [1]. We also compare our method to EG (Exponential Gradient) as a showcase of a Follow-the-Winner strategy [15]. In EG we have set to be .
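For reference, the multiplicative update used by the EG baseline [15] can be sketched as follows. The learning rate value below is purely illustrative and is not the setting used in the experiments above; the function name is likewise hypothetical.

```python
from math import exp


def eg_update(portfolio, price_relative, eta):
    """One Exponential Gradient step: reweight assets by exp(eta * x_i / (b . x))."""
    portfolio_return = sum(b * x for b, x in zip(portfolio, price_relative))
    unnormalised = [b * exp(eta * x / portfolio_return)
                    for b, x in zip(portfolio, price_relative)]
    total = sum(unnormalised)
    return [value / total for value in unnormalised]  # renormalise onto the simplex


# Start from equal weights; the first asset outperforms on this day.
b_next = eg_update([0.5, 0.5], [1.2, 0.8], eta=0.05)
```

The update shifts weight towards the asset that just performed better, which is why EG serves as a representative Follow-the-Winner strategy.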
V Results and Discussion
In Tables III, IV and V, the number next to a stock exchange indicates which data set it came from. The mean ($\mu$) is the average of the metric across all of the markets and $\sigma$ is the standard deviation.
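Table III: MDD

| Algorithm | BIST | BOV1 | EUR1 | JSE1 | NAS1 | SP5 | BOV2 | EUR2 | JSE2 | NAS2 | $\mu \pm \sigma$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| UBAH | 0.082 | 0.159 | 0.242 | 0.102 | 0.093 | 0.034 | 0.201 | 0.070 | 0.075 | 0.203 | 0.126 ± 0.070 |
| CRP | 0.081 | 0.150 | 0.248 | 0.097 | 0.090 | 0.035 | 0.199 | 0.068 | 0.072 | 0.204 | 0.124 ± 0.071 |
| EG | 0.081 | 0.149 | 0.230 | 0.097 | 0.090 | 0.034 | 0.197 | 0.070 | 0.072 | 0.203 | 0.122 ± 0.067 |
| CORN-K | 0.189 | 0.125 | 0.231 | 0.100 | 0.090 | 0.035 | 0.201 | 0.070 | 0.055 | 0.204 | 0.130 ± 0.071 |
| RACORN-K | 0.081 | 0.164 | 0.227 | 0.110 | 0.090 | 0.035 | 0.208 | 0.066 | 0.081 | 0.204 | 0.127 ± 0.068 |
| DRICORN-K | 0.189 | 0.125 | 0.231 | 0.100 | 0.090 | 0.035 | 0.201 | 0.070 | 0.055 | 0.204 | 0.130 ± 0.071 |
| AMA-K | 0.218 | 0.087 | 0.153 | 0.117 | 0.180 | 0.130 | 0.087 | 0.086 | 0.117 | 0.180 | 0.136 ± 0.046 |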
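
Table IV: APY

| Algorithm | BIST | BOV1 | EUR1 | JSE1 | NAS1 | SP5 | BOV2 | EUR2 | JSE2 | NAS2 | $\mu \pm \sigma$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| UBAH | 0.693 | 0.188 | 0.381 | 0.186 | 0.247 | 0.214 | 0.227 | 0.186 | 0.220 | 0.106 | 0.265 ± 0.166 |
| CRP | 0.654 | 0.206 | 0.388 | 0.168 | 0.265 | 0.205 | 0.254 | 0.183 | 0.222 | 0.116 | 0.266 ± 0.154 |
| EG | 0.640 | 0.203 | 0.421 | 0.167 | 0.280 | 0.204 | 0.253 | 0.185 | 0.220 | 0.131 | 0.270 ± 0.152 |
| CORN-K | 1.450 | 0.240 | 0.396 | 0.144 | 0.301 | 0.216 | 0.331 | 0.184 | 0.289 | 0.128 | 0.368 ± 0.390 |
| RACORN-K | 0.484 | 0.124 | 0.302 | 0.107 | 0.288 | 0.219 | 0.169 | 0.143 | 0.215 | 0.117 | 0.217 ± 0.116 |
| DRICORN-K | 1.450 | 0.240 | 0.396 | 0.144 | 0.301 | 0.216 | 0.331 | 0.184 | 0.289 | 0.128 | 0.368 ± 0.390 |
| AMA-K | 2.052 | 0.539 | 1.237 | 0.616 | 0.418 | 0.775 | 0.558 | 0.390 | 0.639 | 0.413 | 0.764 ± 0.516 |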
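
Table V: ASR

| Algorithm | BIST | BOV1 | EUR1 | JSE1 | NAS1 | SP5 | BOV2 | EUR2 | JSE2 | NAS2 | $\mu \pm \sigma$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| UBAH | 0.041 | 0.009 | 0.021 | 0.009 | 0.013 | 0.011 | 0.012 | 0.009 | 0.011 | 0.004 | 0.014 ± 0.010 |
| CRP | 0.039 | 0.010 | 0.022 | 0.008 | 0.014 | 0.010 | 0.013 | 0.009 | 0.011 | 0.005 | 0.014 ± 0.010 |
| EG | 0.038 | 0.010 | 0.024 | 0.008 | 0.015 | 0.010 | 0.013 | 0.009 | 0.011 | 0.006 | 0.014 ± 0.010 |
| CORN-K | 0.089 | 0.013 | 0.022 | 0.007 | 0.016 | 0.011 | 0.018 | 0.009 | 0.016 | 0.006 | 0.021 ± 0.024 |
| RACORN-K | 0.028 | 0.005 | 0.016 | 0.004 | 0.016 | 0.011 | 0.008 | 0.006 | 0.011 | 0.005 | 0.011 ± 0.007 |
| DRICORN-K | 0.089 | 0.013 | 0.022 | 0.007 | 0.016 | 0.011 | 0.018 | 0.009 | 0.016 | 0.006 | 0.021 ± 0.024 |
| AMA-K | 0.126 | 0.031 | 0.075 | 0.036 | 0.024 | 0.046 | 0.033 | 0.022 | 0.038 | 0.023 | 0.045 ± 0.032 |
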
V-A Individual Metric Performance
We examine the performances of the various strategies based on the metrics given by Equations 6, 7 and 8.
V-A1 MDD
As seen in Table III, the best-performing strategies are EG, which has the lowest mean MDD, and AMA-K, which has the lowest standard deviation. EG outperforms AMA-K by a considerable margin on average. Despite its lower standard deviation, AMA-K had the greatest mean MDD value; hence, it represents the riskiest strategy. This is expected given that AMA-K is extremely aggressive in terms of portfolio allocation. Furthermore, we note that AMA-K performed the worst in the two markets that have periods of flatness. AMA-K received an MDD value that is twice and thrice greater than the other strategies (in general) for NAS1 and JSE2 respectively.
V-A2 APY
In Table IV we see that AMA-K had the best performance, achieving in the worst case (on NAS1) a 38.87% better annualised percentage yield for the period compared to the second-best strategy. AMA-K had the highest mean APY; however, it should be noted that its standard deviation is fairly large. The second-best strategies are CORN-K and DRICORN-K. These strategies had nearly half the mean APY of AMA-K. AMA-K’s better performance relative to CORN-K and DRICORN-K results from AMA-K searching for portfolios on days when these CORN-based strategies’ experts would have returned uniform portfolios. AMA-K’s strategy has turned enough of these days into profitable trading days, as reflected by its mean APY.
V-A3 ASR
Based on Table V, we can see that the best-performing algorithm is AMA-K. The performance benefits of AMA-K can be seen in the example of JSE1, where AMA-K performs 4 times better than the second-best algorithm. Therefore, with a risk-free rate of 4%, AMA-K’s returns per unit of risk showcase the risk-to-reward trade-off at play.
V-B General Performance
Looking at the cumulative return of the algorithms on various markets, we see that the approaches have different patterns in general. For example, for NAS1 in Figure 4, the general market trend is upwards, with AMA-K performing the best. The second-best algorithms are CORN-K and DRICORN-K, which performed identically. For the first 100 days AMA-K’s MDD risks are prevalent and AMA-K does the worst. AMA-K fluctuates heavily in this market, going from a bottom at day 68, where it lost 7.8% of its total wealth in four days, to being the best-performing algorithm at the end.
In Figure 3 we see the market stays relatively flat before decreasing slightly for the first 100 days. Comparing our 10-day, 120-day and 190-day sub-agents to AMA-K, we see that AMA-K cannot beat the best-performing agent. AMA-K’s other agents have also done well for the period, and for the initial 55 days we see that the sub-agents perform well. The 10-day agent is clearly more volatile and loses more of its cumulative wealth than the other agents. Towards the end of the period, the 120-day agent has outperformed all other strategies. AMA-K has benefited from this sub-agent, but the benefit is dampened by the other sub-agents.
When CORN-K’s and DRICORN-K’s experts pick up enough similarity, their performance is close to ours. This can be seen in Figure 3. Despite our approach making considerable gains early on, the CORN-K and DRICORN-K approaches identified an asset our approach was unable to. This resulted in these algorithms far outpacing our method. The asset that CORN-K and DRICORN-K identified was most likely in a period our agents had forgotten.
In general our approach is competitive and leverages CORN-based strategies to produce further gains by searching for optimal portfolios on days that CORN-based strategies would not. Although the risk presented by our algorithm, as shown in Figure 4, can be significant, it is an example of a risk-to-reward trade-off.
VI Conclusion and Further Development
Our approach can generate high returns using our memory-based method. The individual agents can be volatile, and merging them typically results in a more stable strategy (as shown in Tables III, IV and V) at the cost of reducing the returns of the best agents. Possible areas for further development are: changing how we search for the optimal portfolio to include regret or penalise volatility, for lower-risk strategies; doing a fine-grained grid search for the optimal amount of memory for the agents in different periods; and testing AMA-K using a variety of combinations of $d$-day agents with different time horizons. Lastly, further testing using other well-known or niche financial metrics should be conducted to further understand the performance of the method in comparison to other modern approaches.
References
 [1] B. Li and S. C. H. Hoi, “Online portfolio selection: A survey,” ACM Computing Surveys, vol. 46, 2014.
 [2] L. Gyorfi, G. Lugosi, and F. Udina, “Nonparametric kernel-based sequential investment strategies,” Mathematical Finance, vol. 16, pp. 337–357, 2006.
 [3] B. Li, S. C. H. Hoi, and V. Gopalkrishnan, “CORN: Correlation-driven nonparametric learning approach for portfolio selection,” ACM Transactions on Intelligent Systems and Technology, vol. 2, pp. 21–29, 2011.
 [4] Y. Wang, D. Wang, and T. Zheng, “RACORN-K: Risk-aversion pattern matching-based portfolio selection,” Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 1816–1820, 2018.
 [5] S. Sooklal, T. van Zyl, and A. Paskaramoorthy, “DRICORN-K: A dynamic risk correlation-driven nonparametric algorithm for online portfolio selection,” Proceedings of the First Southern African Conference for AI Research, pp. 183–196, 2020.
 [6] S.-H. Liao, H.-H. Ho, and H.-W. Lin, “Mining stock category association and cluster on Taiwan stock market,” Expert Systems with Applications, vol. 35, no. 1, pp. 19–29, 2008.
 [7] S. K. Kumari, P. Kumar, J. Priya, S. Surya, and A. K. Bhurjee, “Mean-value at risk portfolio selection problem using clustering technique: A case study,” AIP Conference Proceedings, vol. 2112, no. 1, p. 020178, 2019.
 [8] M. Khedmati and P. Azin, “An online portfolio selection algorithm using clustering approaches and considering transaction costs,” Expert Systems with Applications, vol. 159, p. 113546, 2020.
 [9] P. Zuccolotto and G. De Luca, “Dynamic tail dependence clustering of financial time series,” Statistical Papers, vol. 58, 09 2017.
 [10] S. Nanda, B. Mahanty, and M. Tiwari, “Clustering Indian stock market data for portfolio management,” Expert Systems with Applications, vol. 37, no. 12, pp. 8793–8798, 2010.
 [11] H. Markowitz, “Portfolio selection,” The Journal of Finance, vol. 7, pp. 77–91, 1952.
 [12] M. Magdon-Ismail and A. Atiya, “Maximum drawdown,” Risk Magazine, vol. 10, pp. 99–102, 2004.
 [13] K. Elton, J. Gruber, and J. Brown, Modern Portfolio Theory and Investment Analysis. J. Wiley & Sons, 2003.
 [14] W. Sharpe, “The Sharpe ratio,” The Journal of Portfolio Management, vol. 21, no. 1, pp. 49–58, 1994.
 [15] D. P. Helmbold, R. E. Schapire, Y. Singer, and M. K. Warmuth, “Online portfolio selection using multiplicative updates,” Mathematical Finance, vol. 8, pp. 325–347, 1998.