A new approach for trading based on Long Short Term Memory technique

by   Zineb Lanbouri, et al.

Stock market prediction has always been crucial for stakeholders, traders and investors. We developed an ensemble Long Short Term Memory (LSTM) model that combines two time frequencies (annual and daily parameters) in order to predict the next-day closing price (one step ahead). Based on a four-step approach, this methodology is a serial combination of two LSTM algorithms. The empirical experiment is applied to 417 New York Stock Exchange companies. Based on Open, High, Low, Close metrics and other financial ratios, the results suggest that stock market prediction can be improved.




I Introduction

Financial time series forecasting is a crucial topic in finance, and it remains an active research area in which computer science algorithms are highly relevant. "It was estimated that, in 2012, approximately 85% of trades within the US Stock markets were performed by algorithms" Glantz et al. (2013). To date, numerous works have linked Machine Learning to financial decisions, and especially to trading strategies.

Dixon et al. (2016) show that "Back Propagation and Gradient Descent have been the preferred method for training finance structures due to the ease of implementation and their tendency to converge" Fehrer et al. (2015); Teiseira et al. (2010).
Kodia et al. (2010) focus on social interactions using a multi-agent based simulation. Batres et al. (2015) compare Logistic Regression, MLP and Naive Bayes on financial time series.

Deep Learning performs well in finance too, and Long Short Term Memory (LSTM) is very promising for forecasting because of its ability to memorize data. "Capturing spatio-temporal dependencies, based on regularities in the observations, is therefore viewed as a fundamental goal for Deep Learning systems." Arel et al. (2010)
We aim to use a Long Short Term Memory ensemble method with two input sequences, a sequence of daily features and a sequence of annual features, in order to predict the next-day closing price and support better trading decisions. LSTM is well suited to this task for the following reasons:

  • LSTM is considered an improvement of the Recurrent Neural Network that addresses vanishing and exploding gradients (see Fig. 1) Bao et al. (2017); Werbos et al. (1988); Schmidhuber et al. (1997);

  • LSTM is suitable for sequences Sutskever et al. (2014);

  • LSTM can store and retrieve information using its gates Bengio et al. (2009);

  • Information in an LSTM does not flow in a single direction (unlike feed-forward Neural Networks);

  • LSTM technique can distinguish between recent and early examples Nelson et al. (2017).

Recent studies focus on Ensemble methods, where the weakness of one method is balanced out by the strength of another to produce high-quality results Chang et al. (2009). Ensemble learning can be either parallel or serial. A parallel ensemble combines different learners according to schemes such as Majority Voting, Weighted Majority Voting, Min, Max, etc. Serial learning arranges different base learners in sequence and selects the result of one learner as the final output. Our study relies on heterogeneous features with two temporal frequencies; it is therefore more appropriate to use two LSTM learners to form one ensemble method.
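The difference between the two schemes can be illustrated with a minimal Python sketch. The votes and forecasts below are made-up numbers and the function names are hypothetical; the snippet only shows how parallel combination rules (majority voting, weighted voting, min, max) differ from a serial chain in which one learner's output becomes an input of the next.

```python
import numpy as np

def majority_vote(labels):
    """Parallel scheme: return the class predicted by most learners."""
    values, counts = np.unique(labels, return_counts=True)
    return values[np.argmax(counts)]

def weighted_vote(labels, weights):
    """Parallel scheme: each learner's vote counts with its weight."""
    labels = np.asarray(labels)
    weights = np.asarray(weights, dtype=float)
    classes = np.unique(labels)
    scores = [weights[labels == c].sum() for c in classes]
    return classes[int(np.argmax(scores))]

def serial_chain(x, first_learner, second_learner):
    """Serial scheme: the first output is appended to the second learner's input."""
    first_out = first_learner(x)
    return second_learner(np.append(x, first_out))

# Illustrative data only: three learners voting on up (1) / down (0).
votes = [1, 0, 1]
print(majority_vote(votes))
print(weighted_vote(votes, [0.2, 0.7, 0.1]))

# Min / max rules applied to three real-valued forecasts.
forecasts = np.array([101.2, 99.8, 100.5])
print(forecasts.min(), forecasts.max())

# Serial chain with two toy learners (plain means stand in for trained models).
print(serial_chain(np.array([1.0, 2.0, 3.0]), np.mean, np.mean))
```

With these toy votes the two parallel rules disagree, because the single dissenting learner carries most of the weight.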
The main contributions of this study are the following:
(1) a new ensemble method in which inputs at two time frequencies are combined to predict the next-day closing stock price; (2) an evaluation of the method by comparing it to other Machine Learning techniques and financial strategies.
The remainder of the paper is divided into four sections. Section II presents related work from the literature. Section III details the research methodology, describes the data pre-processing and develops the experiment steps. Section IV analyses the results. Finally, Section V draws some conclusions and outlines future work. The objective of this project is to study the effectiveness of an LSTM ensemble method for stock market prediction based on Open, High, Low, Close and Volume indicators, on financial metrics used by traders such as MACD and RSI, and on other financial ratios. We evaluate its performance in terms of RMSE and other measures through experiments on real New York Stock Exchange data, and compare our results to other machine learning techniques.

Fig. 1: LSTM as an improvement of RNN

II Literature Review

Time series prediction is a field that has been addressed for years with statistical and Machine Learning techniques.

II-A Pure single learner for stock market prediction

II-A1 Autoregressive integrated moving average (ARIMA)

Researchers have used statistical techniques such as ARIMA to forecast financial time series. ARIMA requires strict assumptions such as stationarity, whereas financial series are complex, noisy and non-stationary Bao et al. (2017).

II-A2 Artificial Neural Network

Kuremoto et al. (2014) show that Artificial Neural Networks have been used for time series forecasting since the 1980s; more than 5000 publications have been released on ANN for forecasting Crone et al. (2007). However, ANN presents issues such as overfitting and the impact of initial values.
To tackle these issues, a Deep Neural Network can be considered as a solution.

II-A3 Deep Neural Network

DNN is more robust to overfitting and can model complex non-linear relationships between dependent and independent variables Arel et al. (2010). However, Schmidhuber et al. (2015) explains some of the drawbacks of Deep Neural Networks:

  • Slow convergence time;

  • Vanishing or exploding gradient;

  • Expensive computation.

II-A4 Deep Learning

Hinton et al. (2006) used a Deep Belief Network and obtained a 1.25% test error. Deep Learning has the following advantages:

  • Ability to learn complexity;

  • Aptitude to learn with little human input, at low, intermediate and high levels of abstraction;

  • Strong unsupervised learning.

II-B Ensemble and hybrid learning models for stock market prediction

Zhang et al. (2003) apply a hybrid model based on ARIMA and Neural Network to raise prediction precision, and Werbos et al. (1974) shows that the ARIMA-ANN model outperforms statistical methods. Ding et al. (2015) combine a Neural Tensor Network and a Convolutional Neural Network to predict short-term and long-term influences. Bao et al. (2017) use Stacked Auto-encoders and Long Short Term Memory.

III Methodology

Our model combines two feature categories in order to convey different aspects of an equity to the network.
The first aspect is the historical daily stock price, which reflects its trend. OHLC (Open, High, Low and Close stock prices), volume and the trading indicators MACD, RSI and Signal show how a share evolves, driven mainly by supply and demand.
The second aspect consists of financial metrics that are part of financial analysis. The objective is to take advantage of both technical analysis and fundamental financial analysis. We adopt an Ensemble LSTM approach, as illustrated in the figure below (Fig. 2):

Fig. 2: Ensemble LSTM approach

III-A Data exploration

The first step in this research is data exploration. The database used contains New York Stock Exchange S&P500 data, publicly available at https://www.kaggle.com/dgawlik/nyse, for the period between 04/01/2010 and 30/12/2016. It is split into two datasets. The first contains prices (Open, Low, High, Close, Volume) and daily indicators (MACD, Signal, RSI).
Moving Average Convergence Divergence (MACD) is a trend-following momentum indicator; when MACD crosses its signal line, the crossing can serve as a buy or sell signal. The Relative Strength Index (RSI) indicates the internal strength of an equity by comparing the magnitude of recent increases to that of recent decreases.
The second dataset contains 75 annual financial ratios that provide information about the company itself. Some of these ratios are:

  • EPS (Earnings Per Share)

  • ROE (Return On Equity)

  • Payout ratio

  • Dividend yield

  • PER (Price Earnings Ratio)

  • PBR (Price to Book ratio)

The collected data covers 417 companies. The training set consists of consecutive observations from 04/01/2010 to 07/08/2015, while the test set consists of observations from 08/08/2015 to 31/12/2016, which corresponds to sequences of 1408 time steps for the training set and 352 for the test set. Before the data is used as input for our Ensemble LSTM method, missing values are handled using Mean and Logistic regression functions in Matlab, then z-score normalization is applied (see equation 1):

$$z = \frac{x - \mu}{\sigma} \qquad (1)$$

where $\mu$ and $\sigma$ are the mean and the standard deviation of the feature $x$.

MACD, the Signal and RSI are defined by formulas (2) and (3):
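In their conventional form, MACD is the difference between a fast and a slow exponential moving average (EMA) of the closing price, the Signal is an EMA of MACD, and RSI equals 100 − 100 / (1 + RS), where RS is the ratio of average gains to average losses over a look-back window. The sketch below computes them with pandas, together with the z-score normalization. The periods (12, 26, 9 and 14 days), the simple-average variant of RSI and the function names are customary defaults assumed here; the text does not state which settings were used.

```python
import numpy as np
import pandas as pd

def zscore(x: pd.Series) -> pd.Series:
    """Z-score normalization: subtract the mean, divide by the standard deviation."""
    return (x - x.mean()) / x.std(ddof=0)

def macd(close: pd.Series, fast: int = 12, slow: int = 26, signal: int = 9):
    """Return (MACD line, Signal line); the periods are assumed defaults."""
    ema_fast = close.ewm(span=fast, adjust=False).mean()
    ema_slow = close.ewm(span=slow, adjust=False).mean()
    macd_line = ema_fast - ema_slow
    signal_line = macd_line.ewm(span=signal, adjust=False).mean()
    return macd_line, signal_line

def rsi(close: pd.Series, period: int = 14) -> pd.Series:
    """RSI = 100 - 100 / (1 + RS), with RS = average gain / average loss."""
    delta = close.diff()
    avg_gain = delta.clip(lower=0).rolling(period).mean()
    avg_loss = (-delta).clip(lower=0).rolling(period).mean()
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)

# Synthetic closing prices (illustrative only, not NYSE data).
rng = np.random.default_rng(0)
close = pd.Series(100 + np.cumsum(rng.standard_normal(60)))
macd_line, signal_line = macd(close)
print(macd_line.iloc[-1], signal_line.iloc[-1], rsi(close).iloc[-1])
print(zscore(close).head())
```

The first `period` values of `rsi` are NaN because the rolling window is not yet full.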


III-B Hierarchical LSTM technique

The chosen hierarchical LSTM technique is based on the assumption that, in an ensemble method, the weakness of one method is balanced out by the strength of another. The second step in our model is to construct the first LSTM learner, which can also be considered the first weak learner (see Fig. 2).

As a reminder, Long Short Term Memory (LSTM) is a special type of Recurrent Neural Network (RNN) capable of learning both long-term and short-term dependencies. Relying on gates, this Deep Learning algorithm has the ability to memorize. It is therefore well suited to time series prediction and solves the vanishing gradient problem (see Fig. 1).
This first learner is an LSTM that models the prediction from annual features (financial ratios) (see Fig. 3) for the sequence between 2012 and 2016. Its input consists of 75 features over a sequence of 5 time steps (2012, 2013, 2014, 2015, 2016). Its output is a prediction of the closing price, which is then used as one of the inputs of the second LSTM learner.

Fig. 3: LSTM number 1
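To make the gate mechanism and the shapes of this first learner concrete, here is a minimal NumPy sketch of a forward pass through a single LSTM layer followed by a linear regression output. It is not the Matlab implementation used in the experiments: the weights are random placeholders (nothing is trained), the function names are hypothetical, and the 20 hidden units are taken from the topology reported in the results section.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_last_hidden(x_seq, W, U, b):
    """Run one LSTM layer over x_seq of shape (T, n_in); return the last hidden state (H,)."""
    H = U.shape[1]
    h = np.zeros(H)  # hidden state
    c = np.zeros(H)  # cell state (the "memory")
    for x_t in x_seq:
        z = W @ x_t + U @ h + b          # stacked pre-activations, shape (4H,)
        i = sigmoid(z[0:H])              # input gate: how much new content to write
        f = sigmoid(z[H:2 * H])          # forget gate: how much old memory to keep
        o = sigmoid(z[2 * H:3 * H])      # output gate: how much memory to expose
        g = np.tanh(z[3 * H:4 * H])      # candidate cell content
        c = f * c + i * g
        h = o * np.tanh(c)
    return h

rng = np.random.default_rng(0)
T, n_in, H = 5, 75, 20                   # 5 annual steps, 75 ratios, 20 hidden units
x_seq = rng.standard_normal((T, n_in))   # placeholder for normalized annual ratios
W = rng.standard_normal((4 * H, n_in)) * 0.1
U = rng.standard_normal((4 * H, H)) * 0.1
b = np.zeros(4 * H)
w_out = rng.standard_normal(H) * 0.1     # linear regression head

h_last = lstm_last_hidden(x_seq, W, U, b)
annual_prediction = float(w_out @ h_last)
print(h_last.shape, annual_prediction)
```

The additive cell update `c = f * c + i * g` is what lets gradients pass through many time steps without vanishing as quickly as in a plain RNN.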

The serial ensemble then combines the first learner’s output (prediction) with the daily features to form the input of the second learner, as follows:

  • Open

  • Highest price for the day

  • Lowest price for the day

  • Close

  • Volume

  • MACD

  • Signal

  • RSI

  • the first learner’s prediction

The training is then executed over two stages (sequentially).
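The assembly of the second learner's input can be sketched as follows. How the annual prediction is aligned with the daily rows is not specified in the text; the sketch assumes that each trading day receives the prediction of its calendar year. The function name, the toy values and the column order are hypothetical.

```python
import numpy as np

def build_second_input(daily, day_years, annual_pred):
    """Append the first learner's prediction as a 9th feature column.

    daily       : array (n_days, 8) with Open, High, Low, Close, Volume, MACD, Signal, RSI
    day_years   : array (n_days,) giving the calendar year of each row
    annual_pred : dict mapping year -> prediction of the first LSTM learner
    """
    extra = np.array([annual_pred[int(y)] for y in day_years], dtype=float)
    return np.column_stack([daily, extra])

# Toy example: four trading days spread over two years.
daily = np.arange(32, dtype=float).reshape(4, 8)
day_years = np.array([2015, 2015, 2016, 2016])
annual_pred = {2015: 0.25, 2016: -0.10}   # made-up normalized predictions

X2 = build_second_input(daily, day_years, annual_pred)
print(X2.shape)  # (4, 9)
```

The second learner is trained only after the first one has produced these predictions, which is what makes the ensemble serial.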

After training, the model is capable of forecasting the closing price one step ahead (see Fig. 2).

IV Results and evaluation

Experiments were carried out to predict the next-day closing price following the approach described above. As explained, the Ensemble LSTM approach works on one main dataset split into two datasets with different time frequencies: one for annual features and the other for daily features.
The empirical experiment is applied to 417 New York Stock Exchange companies, using financial ratios for the first learner. The predicted value is then included in the input of the second LSTM learner. This article analyzes the hierarchical LSTM technique in terms of Root Mean Square Error (RMSE) and forecasting.
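The one-step-ahead setup and the chronological train/test split can be sketched as below. This is an illustration under assumptions: the features are random placeholders, the position of the Close column is hypothetical, and the series length is chosen so that the split reproduces the 1408 and 352 sequence lengths given in the data exploration section.

```python
import numpy as np

def one_step_ahead(features, close):
    """Pair the features of day t with the closing price of day t + 1."""
    return features[:-1], close[1:]

def chronological_split(X, y, n_train):
    """Keep time order: the first n_train pairs train the model, the rest test it."""
    return (X[:n_train], y[:n_train]), (X[n_train:], y[n_train:])

rng = np.random.default_rng(0)
n_days, n_features = 1761, 9             # illustrative sizes
features = rng.standard_normal((n_days, n_features))
close = features[:, 3]                   # hypothetical: column 3 holds Close

X, y = one_step_ahead(features, close)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y, 1408)
print(train_X.shape, test_X.shape)
```

Shuffling before the split would leak future prices into training, so the order of the rows is preserved.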

IV-A Results

The Ensemble LSTM used here is a regression algorithm that predicts the next-day closing price. To fix the architecture of our method, several experiments were run to choose the number of hidden units and the learning rate that give the best results. The chosen topology is 20 hidden nodes for the first LSTM with an initial learning rate of 0.005; its output is then included in the input of the second LSTM, which has a total of 9 features and 200 hidden nodes.
Figure 5 presents the Ensemble LSTM model performance in terms of Root Mean Square Error and forecasting. Our model reaches an RMSE of 0.0119, lower than the 0.0124 obtained by the standalone LSTM on the first database alone (Fig. 4).
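For reference, RMSE is the square root of the mean squared difference between predicted and observed values. A minimal NumPy version, with a hypothetical function name and toy numbers, is:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Square Error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Toy values: two exact predictions and one that is off by 2.
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Because the errors are squared before averaging, a single large miss weighs more than several small ones.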

Fig. 4: Forecast vs real values when LSTM is applied to DB number 1 alone

Fig. 5: Forecast vs real values when Ensemble LSTM approach is used

IV-B Analysis

To determine whether the ensemble model brings an improvement, we applied the LSTM algorithm to the first database (daily features) on a sequence to predict the last closing price, and obtained an RMSE of 0.0124. We then applied a standalone LSTM to the annual database and obtained an RMSE of 0.08. Finally, to check that LSTM is more appropriate for time series, we compared it to a Neural Network. We used a Neural Network for time series in Matlab, with Levenberg-Marquardt as the training algorithm and 10 hidden nodes. We obtained an RMSE of 0.067, which is better than LSTM applied to the second database (Fig. 6). Table I compares our approach to the standalone LSTM applied to each dataset separately, and to the Neural Network.

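Fig. 6: Neural Network response

| LSTM 1 (daily data) | LSTM 2 (annual data) | Ensemble LSTM | NN   |
|---------------------|----------------------|---------------|------|
| 0.0124              | 0.08                 | 0.0119        | 0.07 |

TABLE I: RMSE for the LSTM standalone, LSTM ensemble and Neural Network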
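The size of the gaps in Table I is easier to judge as a relative reduction in RMSE. The short sketch below computes it from the four reported values; the helper name is hypothetical and no new results are introduced.

```python
# RMSE values copied from Table I.
rmse_table = {"LSTM 1": 0.0124, "LSTM 2": 0.08, "Ensemble LSTM": 0.0119, "NN": 0.07}

def relative_reduction(baseline, candidate):
    """Fraction by which `candidate` lowers the error of `baseline`."""
    return (baseline - candidate) / baseline

ensemble = rmse_table["Ensemble LSTM"]
for name, value in rmse_table.items():
    if name != "Ensemble LSTM":
        reduction = relative_reduction(value, ensemble)
        print(f"{name}: ensemble RMSE is {100 * reduction:.1f}% lower")
```

Against the standalone LSTM on daily data the reduction is about 4%, so most of the predictive signal comes from the daily features and the annual ratios add a modest refinement.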

LSTM’s ability to store historical data for a long period and retrieve valuable data when required makes it a good technique for time series forecasting.

V Conclusion and future work

In this study, we established an ensemble method based on the LSTM algorithm to predict the next-day closing price. The model takes into account two frequencies with two types of variables, daily and annual. The results support our hypothesis that combining these two databases gives better performance. This ensemble method can be a useful tool for traders and stakeholders to determine their trading strategy. For future work, we intend to benchmark this hierarchical technique against traditional finance methods. Furthermore, since Deep Learning and LSTM are very promising, we expect to improve this model and achieve online prediction for forecasting.


  • Malhotra et al. (2015) Malhotra, Pankaj and Vig, Lovekesh and Shroff, Gautam and Agarwal, Puneet. (2015) ’Long short term memory networks for anomaly detection in time series’, Presses universitaires de Louvain.
  • Selvin et al. (2017) Selvin, Sreelekshmy and Vinayakumar, R and Gopalakrishnan, EA and Menon, Vijay Krishna and Soman, KP. (2017). ’Stock price prediction using LSTM, RNN and CNN-sliding window model’ International Conference on Advances in Computing, Communications and Informatics (ICACCI), 1643-1647.
  • Schmidhuber et al. (1997) Hochreiter, S., Schmidhuber, Jürgen (1997) ’ Long short-term memory’, Neural computation, 9(8), 1735-1780.
  • Dixon et al. (2016) Dixon, M., Klabjan, D., Bang, J. H. (2016) ’Classification-based financial markets prediction using deep neural networks’, Algorithmic Finance, 1-11.
  • Glantz et al. (2013) Glantz, Morton and Kissell, Robert. (2016) ’Multi-asset risk modeling: techniques for a global economy in an electronic and algorithmic trading era’, Academic Press.
  • Fehrer et al. (2015) Fehrer, R., Feuerriegel, S. (2015) ’Improving Decision Analytics with Deep Learning: The Case of Financial Disclosures’, arXiv preprint, arXiv:1508.01993.
  • Teiseira et al. (2010) Vieira, P. C., Teixeira, A. A. (2010) ’Are finance, management, and marketing autonomous fields of scientific research? An analysis based on journal citations’, Scientometrics, 85(3), 627-646.
  • Kodia et al. (2010) Kodia, Z., Said, L. B., Ghedira, K. (2010) ’ A study of stock market trading behavior and social interactions through a multi agent based simulation’, KES International Symposium on Agent and Multi-Agent Systems: Technologies and Applications, (pp. 302-311). Springer, Berlin, Heidelberg.
  • Arel et al. (2010) Arel, I., Rose, D. C., Karnowski, T. P. (2010) ’Deep machine learning-a new frontier in artificial intelligence research’, IEEE computational intelligence magazine, 5(4), 13-18.
  • Batres et al. (2015) Batres-Estrada, B. (2015) ’ Deep learning for multivariate financial time series’.
  • Bao et al. (2017) Deng, Y., Bao, F., Kong, Y., Ren, Z., Dai, Q. (2017) ’Deep direct reinforcement learning for financial signal representation and trading’, IEEE transactions on neural networks and learning systems, 28(3), 653-664.
  • Werbos et al. (1988) Werbos P. J. (1988) ’ Generalization of back-propagation with application to a recurrent gas market model’, Neural Networks, Vol. 1, pp. 339–356.
  • Sutskever et al. (2014) Sutskever, I., Vinyals, O., Le, Q. V. (2014) ’ Sequence to sequence learning with neural networks’, Advances in neural information processing systems, (pp. 3104-3112).
  • Bengio et al. (2009)
  • Nelson et al. (2017) Nelson, D. M., Pereira, A. C., de Oliveira, R. A. (2017) ’ Stock market’s price movement prediction with LSTM neural networks’, Neural Networks (IJCNN), 2017 International Joint Conference, (pp. 1419-1426).
  • Kuremoto et al. (2014) Kuremoto, T., Kimura, S., Kobayashi, K., Obayashi, M. (2014) ’Time series forecasting using a deep belief network with restricted Boltzmann machines’, Neurocomputing, 137, 47-56.
  • Chang et al. (2009) Chang, P. C., Liu, C. H., Fan, C. Y., Lin, J. L., Lai, C. M. (2009) ’ An ensemble of neural networks for stock trading decision making’, International Conference on Intelligent Computing, (pp. 1-10).
  • Crone et al. (2007) Crone, S., Nikolopoulos, K. (2007) ’ Results of the NN3 neural network forecasting competition’, The 27th International Symposium on Forecasting, (pp. 129)
  • Schmidhuber et al. (2015) Schmidhuber, J. (2015) ’ Deep learning in neural networks: An overview’, Neural networks, 61, 85-117.
  • Hinton et al. (2006) Hinton, G. E., Osindero, S., Teh, Y. W. (2006) ’A fast learning algorithm for deep belief nets’, Neural computation, 18(7), 1527-1554.
  • Zhang et al. (2003) Zhang, G. P. (2003) ’Time series forecasting using a hybrid ARIMA and neural network model’, Neurocomputing, 50, 159-175.
  • Werbos et al. (1974) Werbos, P. (1974) ’Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences’, Ph. D. dissertation, Harvard University.
  • Ding et al. (2015) Ding, X., Zhang, Y., Liu, T., Duan, J.. (2015) ’Deep learning for event-driven stock prediction’, Twenty-Fourth International Joint Conference on Artificial Intelligence.
  • Bao et al. (2017) Bao, W., Yue, J., Rao, Y. (2017) ’A deep learning framework for financial time series using stacked autoencoders and long-short term memory’, PloS one, 12(7), e0180944.
  • Sharif et al. (2017) Sharif, Abu-Ghazaleh, Saifan (2017) ’Investigating Algorithmic Stock Market Trading using Ensemble Machine Learning Methods’.