Financial time series forecasting is a crucial topic in finance, and it remains an active research problem in which computer science algorithms are highly relevant. “It was estimated that, in 2012, approximately 85% of trades within the US Stock markets were performed by algorithms” Glantz et al. (2013). To date, there have been numerous works linking Machine Learning to financial decisions, and especially to trading strategies. Dixon et al. (2016) show that “Back Propagation and Gradient Descent have been the preferred method for training finance structures due to the ease of implementation and their tendency to converge” Fehrer et al. (2015); Teiseira et al. (2010).
Kodia et al. (2010) focus on social interactions using a multi-agent-based simulation. Batres et al. (2015)
Deep Learning also performs well in finance, and Long Short-Term Memory (LSTM) is very promising for forecasting because of its ability to retain information over time. “Capturing spatio-temporal dependencies, based on regularities in the observations, is therefore viewed as a fundamental goal for Deep Learning systems.” Arel et al. (2010)
We aim to use a Long Short-Term Memory ensemble method with two input sequences, a sequence of daily features and a second sequence of annual features, in order to predict the next-day closing price and support better trading decisions. LSTM is the most convenient choice for the following reasons:
Recent studies focus on ensemble methods, where the weakness of one method is balanced out by the strength of another to produce high-quality results Chang et al. (2009). Ensemble learning can be either parallel or serial. A parallel ensemble combines the outputs of different learners according to schemes such as Majority Voting, Weighted Majority Voting, Min, or Max. Serial learning arranges different base learners in sequence and selects the result of one learner as the final output. Our study relies on heterogeneous features with two temporal frequencies; it is therefore more appropriate to use two LSTM learners to form one ensemble method.
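The parallel combination schemes named above can be illustrated with a minimal sketch. It is not the method used in this paper, which is serial; the function names and the toy "up"/"down" labels are assumptions made for illustration only. The Min and Max rules are applied here to per-class scores, which is one common reading of those schemes.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, Hashable, List, Sequence


def majority_vote(labels: Sequence[Hashable]) -> Hashable:
    """Return the label predicted by the largest number of learners."""
    return Counter(labels).most_common(1)[0][0]


def weighted_majority_vote(labels: Sequence[Hashable],
                           weights: Sequence[float]) -> Hashable:
    """Return the label whose supporting learners carry the largest total weight."""
    totals: Dict[Hashable, float] = defaultdict(float)
    for label, weight in zip(labels, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


def score_rule(scores: List[Dict[Hashable, float]],
               rule: Callable[..., float]) -> Hashable:
    """Min or Max rule: reduce each class score across learners, then pick the best class."""
    classes = scores[0].keys()
    combined = {c: rule(s[c] for s in scores) for c in classes}
    return max(combined, key=combined.get)


# Toy example: three learners vote on the direction of the next move.
votes = ["up", "down", "down"]
unweighted = majority_vote(votes)                          # two of three say "down"
weighted = weighted_majority_vote(votes, [0.7, 0.2, 0.2])  # the trusted learner wins

# Per-class scores from two learners, combined with the Min and Max rules.
class_scores = [{"up": 0.8, "down": 0.2}, {"up": 0.4, "down": 0.6}]
by_min = score_rule(class_scores, min)
by_max = score_rule(class_scores, max)
```

In this toy case the unweighted vote returns "down" while the weighted vote returns "up", which shows how the weights let a more reliable learner override a simple majority.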
The main contributions of this study are the following:
(1) a new ensemble method where two time frequency inputs can be combined to predict the next day closing stock price; (2) evaluation of the method by comparing it to other Machine Learning techniques and financial strategies.
The remainder of the paper is divided into four sections. In Section II we present related work from the literature. Section III details the research methodology, describes the data pre-processing and develops the experimental steps. Section IV analyses the results. Finally, in Section V, we draw some conclusions and outline future work. The objective of this project is to study the effectiveness of an LSTM ensemble method for stock market prediction based on Open, High, Low, Close and Volume indicators, on financial metrics used by traders such as MACD and RSI, and on other financial ratios. We evaluate its performance in terms of RMSE and other measures through experiments on real New York Stock Exchange (NYSE) data, and we compare our results to other machine learning techniques.
II Literature Review
Time series prediction has been addressed for years by statistical and Machine Learning techniques.
II-A Pure single learner for Stock market prediction
II-A1 Autoregressive integrated moving average (ARIMA)
Researchers have exploited statistical techniques such as ARIMA to forecast financial time series. However, ARIMA requires strict assumptions such as stationarity, whereas financial time series are complex, noisy and non-stationary Bao et al. (2017).
II-A2 Artificial Neural Network
Kuremoto et al. (2014) note that Artificial Neural Networks have been used for time series forecasting since the 1980s, and more than 5000 publications have been released on ANN for forecasting Crone et al. (2007).
However, ANN presents issues such as overfitting and sensitivity to initial values.
In order to tackle these issues, Deep Neural Networks could be considered as a solution.
II-A3 Deep Neural Network
DNN is indeed more robust to overfitting and can model complex non-linear relationships between dependent and independent variables Arel et al. (2010). However, Schmidhuber et al. (2015) explains some of the drawbacks of Deep Neural Networks:
- Slow convergence time;
- Vanishing or exploding gradients.
II-A4 Deep Learning
II-B Ensemble and hybrid learning models for Stock market prediction
Zhang et al. (2003) apply a hybrid model based on ARIMA and Neural Networks to raise prediction precision. Werbos et al. (1974) shows that the ARIMA-ANN model outperforms statistical methods.
Ding et al. (2015)
Our model combines two feature categories in order to convey the different aspects of an equity to the network.
The first aspect considers the historical daily stock price, which reflects its trend. OHLC (Open, High, Low and Close stock prices), volume and the trading indicators MACD, RSI and Signal show the evolution of a share, driven mainly by supply and demand.
The second aspect consists of financial metrics that are part of financial analysis. The objective is to take advantage of both technical analysis and fundamental financial analysis. We adopt an Ensemble LSTM approach, as illustrated in Fig. 2.
III-A Data exploration
The first step in this research is data exploration.
The database used in this research contains New York Stock Exchange S&P500 data, publicly available at https://www.kaggle.com/dgawlik/nyse, for the period between 04/01/2010 and 30/12/2016. It is split into two datasets. The first contains prices (Open, Low, High, Close, Volume) and daily indicators (MACD, Signal, RSI).
Moving Average Convergence Divergence (MACD) is a trend-following momentum indicator; when MACD crosses its signal line, the crossing can serve as a buy or sell signal. The Relative Strength Index (RSI) indicates the internal strength of an equity and reflects the strength of its recent increases relative to its recent decreases.
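For readers unfamiliar with these indicators, the sketch below shows one conventional way to compute them from a series of closing prices. The paper does not state its parameters, so the 12/26/9-day exponential moving averages for MACD and the 14-day Wilder-smoothed RSI are assumptions based on common practice, and all function names are illustrative.

```python
from typing import List, Tuple


def ema(values: List[float], span: int) -> List[float]:
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd(close: List[float], fast: int = 12, slow: int = 26,
         signal: int = 9) -> Tuple[List[float], List[float]]:
    """Return the MACD line (fast EMA minus slow EMA) and its signal line."""
    macd_line = [f - s for f, s in zip(ema(close, fast), ema(close, slow))]
    return macd_line, ema(macd_line, signal)


def crossover_signals(macd_line: List[float], signal_line: List[float]) -> List[int]:
    """+1 when MACD crosses above its signal line, -1 when it crosses below, else 0."""
    out = [0]
    for i in range(1, len(macd_line)):
        prev = macd_line[i - 1] - signal_line[i - 1]
        curr = macd_line[i] - signal_line[i]
        if prev <= 0 < curr:
            out.append(1)
        elif prev >= 0 > curr:
            out.append(-1)
        else:
            out.append(0)
    return out


def _rsi_value(avg_gain: float, avg_loss: float) -> float:
    if avg_loss == 0:
        return 100.0  # no losses in the window: maximum strength
    return 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)


def rsi(close: List[float], period: int = 14) -> List[float]:
    """RSI with Wilder's smoothing; the first value corresponds to close[period]."""
    changes = [b - a for a, b in zip(close, close[1:])]
    if len(changes) < period:
        return []
    gains = [max(c, 0.0) for c in changes]
    losses = [max(-c, 0.0) for c in changes]
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    out = [_rsi_value(avg_gain, avg_loss)]
    for g, l in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + g) / period
        avg_loss = (avg_loss * (period - 1) + l) / period
        out.append(_rsi_value(avg_gain, avg_loss))
    return out


# Toy series: a steadily rising price gives a positive MACD and an RSI of 100.
rising = [float(p) for p in range(1, 41)]
macd_line, signal_line = macd(rising)
rsi_values = rsi(rising)
```

An RSI near 100 indicates that gains dominate the look-back window, while a value near 0 indicates that losses dominate.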
The second dataset contains 75 annual financial ratios that provide information about the company itself. The following ratios are some of them:
- EPS (Earnings Per Share)
- ROE (Return On Equity)
- PER (Price-to-Earnings Ratio)
- PBR (Price-to-Book Ratio)
The collected data covers 417 companies. The training set consists of consecutive observations from 04/01/2010 to 07/08/2015, while the test set consists of observations from 08/08/2015 to 31/12/2016, which corresponds to sequences of 1408 time steps for the training set and 352 for the test set. Before the data is used as input to our Ensemble LSTM method, missing values are handled using mean and logistic regression functions in Matlab, and z-score normalization is then applied (see equation 1).
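Equation 1 is not reproduced in this text. The standard z-score rescales each value x to (x - mean) / standard deviation, and the sketch below applies it after a chronological split. Two points are assumptions rather than statements from the paper: the statistics are estimated on the training portion only and reused for the test portion, and the population standard deviation is used. The function names and the toy prices are illustrative.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def chronological_split(series: Sequence[float],
                        n_train: int) -> Tuple[List[float], List[float]]:
    """Split a time-ordered series without shuffling: first n_train points train, rest test."""
    return list(series[:n_train]), list(series[n_train:])


def fit_zscore(train: Sequence[float]) -> Tuple[float, float]:
    """Estimate the mean and (population) standard deviation on the training data."""
    mu = mean(train)
    sigma = pstdev(train)
    if sigma == 0:
        raise ValueError("constant feature: z-score is undefined")
    return mu, sigma


def apply_zscore(values: Sequence[float], mu: float, sigma: float) -> List[float]:
    """Rescale values with previously fitted statistics."""
    return [(v - mu) / sigma for v in values]


# Toy closing prices; in the paper the split is 1408 training and 352 test time steps.
prices = [10.0, 11.0, 12.0, 13.0, 14.0, 20.0]
train, test = chronological_split(prices, n_train=5)
mu, sigma = fit_zscore(train)
train_z = apply_zscore(train, mu, sigma)
test_z = apply_zscore(test, mu, sigma)
```

Fitting the statistics on the training portion alone keeps information from the test period out of the inputs seen during training.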
III-B Hierarchical LSTM technique
The chosen hierarchical LSTM technique is based on the assumption that, in an ensemble method, the weakness of one method is balanced out by the strength of another. The second step in our model is to construct the first LSTM learner, which can also be considered the first weak learner (see Fig. 2).
As a reminder, Long Short-Term Memory (LSTM) is a special type of Recurrent Neural Network (RNN) capable of learning both long-term and short-term dependencies. Relying on gates, this Deep Learning algorithm has the ability to memorize. It is therefore well suited to time series prediction (see Fig. 1) and addresses the vanishing gradient problem.
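To make the role of the gates concrete, the sketch below implements one forward step of a standard LSTM cell (input, forget and output gates plus a candidate state, without peephole connections) in plain Python. It illustrates the general mechanism only; it is not the Matlab implementation used in the experiments, and the parameter layout is an assumption chosen for readability.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]
GateParams = Tuple[Matrix, Matrix, Vector]  # (input weights W, recurrent weights U, bias b)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def affine(W: Matrix, x: Vector, U: Matrix, h: Vector, b: Vector) -> Vector:
    """Compute W x + U h + b, one output per hidden unit."""
    return [
        sum(w * xj for w, xj in zip(w_row, x))
        + sum(u * hj for u, hj in zip(u_row, h))
        + bias
        for w_row, u_row, bias in zip(W, U, b)
    ]


def lstm_step(x: Vector, h_prev: Vector, c_prev: Vector,
              params: Dict[str, GateParams]) -> Tuple[Vector, Vector]:
    """One time step; params maps 'i', 'f', 'o', 'g' to that gate's (W, U, b)."""
    pre = {k: affine(W, x, U, h_prev, b) for k, (W, U, b) in params.items()}
    i = [sigmoid(v) for v in pre["i"]]    # input gate: how much new content to write
    f = [sigmoid(v) for v in pre["f"]]    # forget gate: how much old memory to keep
    o = [sigmoid(v) for v in pre["o"]]    # output gate: how much memory to expose
    g = [math.tanh(v) for v in pre["g"]]  # candidate content
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def zeros(rows: int, cols: int) -> Matrix:
    return [[0.0] * cols for _ in range(rows)]


# Two hidden units, three inputs. A large forget bias and a very negative input
# bias make the cell copy its memory forward almost unchanged.
params = {k: (zeros(2, 3), zeros(2, 2), [0.0, 0.0]) for k in "ifog"}
params["f"] = (zeros(2, 3), zeros(2, 2), [20.0, 20.0])
params["i"] = (zeros(2, 3), zeros(2, 2), [-20.0, -20.0])
h_next, c_next = lstm_step([1.0, 2.0, 3.0], [0.0, 0.0], [1.0, -2.0], params)
```

Because the cell state is updated additively through the forget gate, an open forget gate carries the memory across many steps, which is the property that helps gradients survive over long sequences.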
The first learner is an LSTM that predicts from annual features (financial ratios) (see Fig. 3) over the sequence from 2012 to 2016. Its input consists of 75 features over a sequence of five time steps (2012, 2013, 2014, 2015, 2016). Its output is a prediction of the closing price, which is then used as one of the inputs of the second LSTM learner.
The serial ensemble then combines the first learner’s output (its prediction) with the daily features to form the input of the second learner, as follows:
- Highest price for the day
- Lowest price for the day
- The first learner’s prediction
Training is then executed sequentially, in two stages.
After training, the model can forecast the Closing price one step ahead (see Fig. 2).
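The hand-off between the two stages can be sketched as a data-assembly step. The paper does not specify how an annual prediction is aligned with daily rows, so this sketch assumes that each day inherits the first learner's prediction for its calendar year. The dictionary keys, the function name and the toy values are hypothetical, and no LSTM is trained here.

```python
from typing import Dict, List, Tuple

# Daily features named in the paper; the dictionary keys themselves are hypothetical.
DAILY_FEATURES = ("open", "high", "low", "close", "volume", "macd", "signal", "rsi")


def build_stage_two_dataset(
    daily_rows: List[Dict[str, float]],
    stage_one_prediction: Dict[int, float],
) -> Tuple[List[List[float]], List[float]]:
    """Pair each day's 9-feature input with the next day's closing price."""
    inputs: List[List[float]] = []
    targets: List[float] = []
    for today, tomorrow in zip(daily_rows, daily_rows[1:]):
        features = [today[name] for name in DAILY_FEATURES]
        # Assumption: a day inherits the stage-one prediction of its calendar year.
        features.append(stage_one_prediction[int(today["year"])])
        inputs.append(features)
        targets.append(tomorrow["close"])  # one-step-ahead target
    return inputs, targets


def toy_row(year: int, close: float) -> Dict[str, float]:
    """Build an illustrative daily row around a given closing price."""
    return {"year": year, "open": close, "high": close + 1.0, "low": close - 1.0,
            "close": close, "volume": 1000.0, "macd": 0.0, "signal": 0.0, "rsi": 50.0}


rows = [toy_row(2015, 10.0), toy_row(2015, 11.0), toy_row(2016, 12.0)]
X, y = build_stage_two_dataset(rows, {2015: 10.5, 2016: 12.5})
```

Each input vector has nine entries, the eight daily features plus the first learner's prediction, which matches the feature count reported for the second LSTM.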
IV Results and evaluation
Experiments were carried out to predict the next-day Closing price following the approach described above.
As explained, the Ensemble LSTM approach uses one main dataset split into two datasets with different time frequencies, one for annual features and the other for daily features.
The empirical experiment is applied to 417 New York Stock Exchange companies, using financial ratios for the first learner. The predicted value is then included in the input of the second LSTM learner. This article aims to analyze the hierarchical LSTM technique in terms of Root Mean Square Error (RMSE) and forecasting.
The Ensemble LSTM used here is a regression algorithm that predicts the next-day Closing price. To fix the architecture of our method, several experiments were run to choose the number of hidden units and the learning rate that give the best results. The chosen topology is 20 hidden nodes for the first LSTM, with an initial learning rate of 0.005; its output is then included in the input of the second LSTM, which has a total of 9 features and 200 hidden nodes.
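As a rough indication of model size, the number of trainable weights in each LSTM layer can be derived from the reported topology. The sketch assumes a standard LSTM layer with four gates, one bias vector per gate and no peephole connections, and it excludes any output regression layer; some frameworks keep two bias vectors per gate and would report slightly more. The class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LSTMConfig:
    input_size: int   # number of features per time step
    hidden_size: int  # number of hidden units


def lstm_param_count(cfg: LSTMConfig) -> int:
    """4 gates, each with input weights (h x d), recurrent weights (h x h) and a bias (h)."""
    d, h = cfg.input_size, cfg.hidden_size
    return 4 * (h * d + h * h + h)


FIRST_LEARNER = LSTMConfig(input_size=75, hidden_size=20)   # annual financial ratios
SECOND_LEARNER = LSTMConfig(input_size=9, hidden_size=200)  # daily features + prediction

first_count = lstm_param_count(FIRST_LEARNER)    # 4 * (20*75 + 20*20 + 20)
second_count = lstm_param_count(SECOND_LEARNER)  # 4 * (200*9 + 200*200 + 200)
```

Under these assumptions the first layer has 7,680 weights and the second 168,000, so most of the capacity sits in the learner that processes the daily sequence.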
Figure 5 presents the Ensemble LSTM model performance in terms of Mean Squared Error and forecasting. Our model performs well: its RMSE is equal to 0.0119, which is lower than the 0.0124 obtained by the standalone LSTM on the first database alone.
In order to determine whether the ensemble model brings an improvement, we applied the LSTM algorithm to the first database (with daily features) on a sequence to predict the last Closing price, and obtained an RMSE equal to 0.0124. We then applied the standalone LSTM to the annual database and obtained an RMSE equal to 0.08. Finally, to check that LSTM is more appropriate for time series, we compared it to a Neural Network. We used a Neural Network for time series in Matlab, with Levenberg-Marquardt as the training algorithm and 10 hidden nodes. We obtained an RMSE equal to 0.067, which is better than the LSTM applied to the second database (6). Table I compares our approach to the standalone LSTM applied to each dataset separately, and then to the Neural Network.
| LSTM 1 | LSTM 2 | Ensemble LSTM | NN |
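For reference, the RMSE used in this comparison is the square root of the mean squared difference between predicted and observed values. The sketch below computes it, together with the relative reduction between two reported scores; the function names and the three-point toy series are illustrative, while the two RMSE values are those quoted in the text.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root Mean Square Error between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(baseline: float, candidate: float) -> float:
    """Fraction by which the candidate error is lower than the baseline error."""
    return (baseline - candidate) / baseline


toy_error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])  # sqrt(4 / 3)

# RMSE values quoted in the text: daily-feature LSTM alone vs. Ensemble LSTM.
gain = relative_improvement(baseline=0.0124, candidate=0.0119)
```

With the reported values, the ensemble lowers the RMSE of the daily-feature LSTM by about 4%.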
LSTM’s ability to store historical data over a long period and retrieve valuable information when required makes it a good technique for time series forecasting.
V Conclusion and future work
In this study, we established an ensemble method based on the LSTM algorithm to predict the next-day Closing price. This model takes into account two frequencies with two types of variables, daily and annual. The results support our hypothesis that the combination of these two databases gives a better performance. This ensemble method can be a useful tool for traders and stakeholders to determine their trading strategy. For future work, we intend to benchmark this hierarchical technique against traditional finance methods. Furthermore, since Deep Learning and LSTM are very promising, we expect to improve this model and achieve online prediction for forecasting.
References
- Malhotra et al. (2015) Malhotra, P., Vig, L., Shroff, G., Agarwal, P. (2015) ‘Long short term memory networks for anomaly detection in time series’, Presses universitaires de Louvain.
- Selvin et al. (2017) Selvin, S., Vinayakumar, R., Gopalakrishnan, E. A., Menon, V. K., Soman, K. P. (2017) ‘Stock price prediction using LSTM, RNN and CNN-sliding window model’, International Conference on Advances in Computing, Communications and Informatics (ICACCI), 1643-1647.
- Schmidhuber et al. (1997) Hochreiter, S., Schmidhuber, J. (1997) ‘Long short-term memory’, Neural computation, 9(8), 1735-1780.
- Dixon et al. (2016) Dixon, M., Klabjan, D., Bang, J. H. (2016) ‘Classification-based financial markets prediction using deep neural networks’, Algorithmic Finance, 1-11.
- Glantz et al. (2013) Glantz, M., Kissell, R. (2013) ‘Multi-asset risk modeling: techniques for a global economy in an electronic and algorithmic trading era’, Academic Press.
- Fehrer et al. (2015) Fehrer, R., Feuerriegel, S. (2015) ‘Improving Decision Analytics with Deep Learning: The Case of Financial Disclosures’, arXiv preprint, arXiv:1508.01993.
- Teiseira et al. (2010) Vieira, P. C., Teixeira, A. A. (2010) ‘Are finance, management, and marketing autonomous fields of scientific research? An analysis based on journal citations’, Scientometrics, 85(3), 627-646.
- Kodia et al. (2010) Kodia, Z., Said, L. B., Ghedira, K. (2010) ‘A study of stock market trading behavior and social interactions through a multi agent based simulation’, KES International Symposium on Agent and Multi-Agent Systems: Technologies and Applications, (pp. 302-311). Springer, Berlin, Heidelberg.
- Arel et al. (2010) Arel, I., Rose, D. C., Karnowski, T. P. (2010) ‘Deep machine learning-a new frontier in artificial intelligence research’, IEEE computational intelligence magazine, 5(4), 13-18.
- Batres et al. (2015) Batres-Estrada, B. (2015) ‘Deep learning for multivariate financial time series’.
- Bao et al. (2017) Deng, Y., Bao, F., Kong, Y., Ren, Z., Dai, Q. (2017) ‘Deep direct reinforcement learning for financial signal representation and trading’, IEEE transactions on neural networks and learning systems, 28(3), 653-664.
- Werbos et al. (1988) Werbos, P. J. (1988) ‘Generalization of back-propagation with application to a recurrent gas market model’, Neural Networks, Vol. 1, pp. 339–356.
- Sutskever et al. (2014) Sutskever, I., Vinyals, O., Le, Q. V. (2014) ‘Sequence to sequence learning with neural networks’, Advances in neural information processing systems, (pp. 3104-3112).
- Bengio et al. (2009)
- Nelson et al. (2017) Nelson, D. M., Pereira, A. C., de Oliveira, R. A. (2017) ‘Stock market’s price movement prediction with LSTM neural networks’, 2017 International Joint Conference on Neural Networks (IJCNN), (pp. 1419-1426).
- Kuremoto et al. (2014) Kuremoto, T., Kimura, S., Kobayashi, K., Obayashi, M. (2014) ‘Time series forecasting using a deep belief network with restricted Boltzmann machines’, Neurocomputing, 137, 47-56.
- Chang et al. (2009) Chang, P. C., Liu, C. H., Fan, C. Y., Lin, J. L., Lai, C. M. (2009) ‘An ensemble of neural networks for stock trading decision making’, International Conference on Intelligent Computing, (pp. 1-10).
- Crone et al. (2007) Crone, S., Nikolopoulos, K. (2007) ‘Results of the NN3 neural network forecasting competition’, The 27th International Symposium on Forecasting, (pp. 129).
- Schmidhuber et al. (2015) Schmidhuber, J. (2015) ‘Deep learning in neural networks: An overview’, Neural networks, 61, 85-117.
- Hinton et al. (2006) Hinton, G. E., Osindero, S., Teh, Y. W. (2006) ‘A fast learning algorithm for deep belief nets’, Neural computation, 18(7), 1527-1554.
- Zhang et al. (2003) Zhang, G. P. (2003) ‘Time series forecasting using a hybrid ARIMA and neural network model’, Neurocomputing, 50, 159-175.
- Werbos et al. (1974) Werbos, P. (1974) ‘Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences’, Ph.D. dissertation, Harvard University.
- Ding et al. (2015) Ding, X., Zhang, Y., Liu, T., Duan, J. (2015) ‘Deep learning for event-driven stock prediction’, Twenty-Fourth International Joint Conference on Artificial Intelligence.
- Bao et al. (2017) Bao, W., Yue, J., Rao, Y. (2017) ‘A deep learning framework for financial time series using stacked autoencoders and long-short term memory’, PloS one, 12(7), e0180944.
- Sharif et al. (2017) Sharif, Abu-Ghazaleh, Saifan (2017) ‘Investigating Algorithmic Stock Market Trading using Ensemble Machine Learning Methods’.