I Introduction
The market structure [1] is a core issue in economics, since it helps economists understand the dynamic patterns in a market. Generally, a market can be observed as a group of time series, e.g. the stock market can be treated as a group of varying price and volume time series. Most recent works model the market structure as a correlation network with statistical methods [2] or with methods based on similar pattern matching [3]. However, since many existing works model the relations between financial assets with predefined patterns, without considering the relationship between the stock-level and the market-level financial time series (e.g. the impact of a stock index on the market index trend), it is still challenging to fully address the following aspects:
Prior pattern dependency: Many statistical methods model the market structure as a time series correlation network based on prior patterns designed by experts (e.g. the Pearson correlation coefficient [2]), and they ignore all the other patterns that are potentially useful. Therefore, their results may deviate from the practical rules.

Stop word bias [4]: The truly informative patterns may be overwhelmed by numerous frequent results caused by regular activities, similar to the stop word bias in text mining. Therefore, it is challenging for these methods to distinguish the regular patterns from the truly interesting ones.

Evolving patterns: The market time series system evolves with new data, and the covarying patterns between the time series change accordingly. Consequently, analyzing the time series with outmoded patterns may produce unexpected results.
To derive a financial market structure that reveals the investment activities between pairwise stocks, which is hard to obtain directly, we propose a Deep Coinvestment Network Learning method, namely DeepCNL, to learn a stock market structure that reflects coinvestment activities (the deep coinvestment network) via automatically learned nonlinear coinvestment patterns (the deep coinvestment patterns). Our study starts from a financial mechanism relating the buy-sell imbalance to asset prices. As introduced in the literature [5], there is a positive relationship between the buy-sell pressure and the asset price. Therefore, if the transaction prices and volumes of two stocks have similar rise and fall patterns, there is a high probability that these two stocks are invested together by the same group of people. We consider this the coinvestment activity. Since coinvestment activities cannot be observed directly, DeepCNL captures evidence for them through convolution operators on the stock time series at a Convolutional Neural Network (CNN) layer, where the convolution kernels represent the deep coinvestment patterns. Because the convolution kernels are learned automatically during training, the obtained deep coinvestment patterns do not depend on any prior patterns and can evolve with new data. With the evidence of the coinvestment activities, we represent the stock market as the deep coinvestment network, which addresses the coinvestment relationships among all the stocks in a market. DeepCNL is supervised by the rise-fall trends of the index of the whole market and learns the deep coinvestment network from the inner parameters of a Recurrent Neural Network (RNN) layer that captures the temporal impacts of any pairwise stocks on the market index trends. This step also alleviates the stop word bias issue by supervising the learned results (both the deep coinvestment patterns and the network) with the market index trends. We verify the effectiveness of the deep coinvestment network in addressing the investment activities on real-world data. Several financial tasks are introduced to interpret the compatibility of the learned deep coinvestment networks with widely acknowledged financial principles.
In summary, the main contributions of this work include:

We propose a Deep Coinvestment Network Learning method, namely DeepCNL, to automatically learn a coinvestment network that reflects the intensities of the coinvestment activities for any pair of stocks.

DeepCNL connects the learning of the coinvestment network to the rise-fall trends of the market index. The modeling of the deep coinvestment patterns and the resulting deep coinvestment network does not rely on any predefined patterns.

We build the deep coinvestment network from the learned inner parameters of DeepCNL, which reveal the impacts of the stock indices on the market index trends. To the best of our knowledge, this paper is the first to study the relation between coinvestment activities and the inner parameters of a deep learning model.

We observe that DeepCNL learns a deep coinvestment network that is consistent with known financial principles about investment activities.
II Preliminary
Assets usually have several attributes, e.g. the observation of a stock contains both its prices and its transaction volumes at a given time. Therefore, this work treats all the original time series for stocks as vector time series.
Definition 1
(Vector Time Series) A vector time series $s$ is denoted as a sequence of real-valued vectors $s = \langle \vec{x}_1, \vec{x}_2, \ldots, \vec{x}_T \rangle$, i.e. an $n \times T$ matrix, where $\vec{x}_t \in \mathbb{R}^n$ is a vector which represents the values of the $n$ features at the $t$-th time stamp, and $T$ is the length of $s$.
We simplify the term “vector time series” as “time series”, and denote $x$ as a scalar and $\vec{x}$ as a vector for brevity in the following part. We follow the method from Lee et al. [6] to concatenate two aligned time series $s_a$ and $s_b$ with length $T$ into a matrix $A$:
(1) $A = \begin{bmatrix} s_a \\ s_b \end{bmatrix}$
where $A$ can also be written as $\langle \vec{a}_1, \vec{a}_2, \ldots, \vec{a}_T \rangle$; each $\vec{a}_t = (\vec{x}^{a}_t, \vec{x}^{b}_t)$ ($1 \le t \le T$) is a dyadic tuple which represents the corresponding values of the time series $s_a$ and $s_b$ at the $t$-th time stamp. We call the matrix $A$ the observation matrix for $s_a$ and $s_b$.
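Equation (1) amounts to stacking the two aligned series row-wise; a minimal NumPy sketch (the function and variable names are ours, not from the paper's code):

```python
import numpy as np

def observation_matrix(s_a, s_b):
    """Stack two aligned (n x T) series into a (2n x T) observation matrix
    whose t-th column is the dyadic tuple of both series' values at time t."""
    s_a, s_b = np.atleast_2d(s_a), np.atleast_2d(s_b)
    assert s_a.shape == s_b.shape, "the two series must be aligned"
    return np.vstack([s_a, s_b])

# Two univariate (n = 1) series of length T = 4
A = observation_matrix([1., 2., 1., 2.], [2., 1., 2., 1.])  # shape (2, 4)
```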
Definition 2
(Deep Coinvestment Pattern) Given two time series $s_a$ and $s_b$, let $A$ be their observation matrix; then the deep coinvestment pattern between $s_a$ and $s_b$ can be denoted as a tensor $C \in \mathbb{R}^{K \times 2n \times w}$, where each $C_k$ ($1 \le k \le K$) is a $2n \times w$ matrix which represents the $k$-th covarying latent pattern of $s_a$ and $s_b$ in the window of size $w$. $K$ is a parameter that decides the number of different covarying latent patterns.
The covarying latent pattern is a framework to capture the covarying rules between two time series, and it is a generalized form of many existing correlation measurement methods. In our deep learning model, the covarying latent patterns are automatically learned during training. The larger $K$ is, the more covarying latent patterns will be collected by our model. According to the “buy-sell pressure” principle in the introduction, if two stocks are invested by the same group of investors, their feature time series (e.g. prices, volumes, etc.) will vary analogously (with similar rise or fall trends). Therefore, the deep coinvestment patterns can be used as filters to capture the clues that two stocks are frequently invested together. We propose the deep coinvestment evidence to record the filtered clues.
Definition 3
(Deep Coinvestment Evidence) Given the observation matrix $A$ and a deep coinvestment pattern $C$ with the window of size $w$ for $s_a$ and $s_b$, the deep coinvestment evidence $E$ is a $K \times (T - w + 1)$ matrix resulting from the convolution between $A$ and all the covarying patterns $C_k$ ($1 \le k \le K$). Denote the $k$-th row of $E$ as the vector $E_k = \langle e_{k,1}, \ldots, e_{k,T-w+1} \rangle$, where each $e_{k,t}$ is a scalar ($1 \le k \le K$, $1 \le t \le T - w + 1$) which represents the coinvestment degree between $s_a$ and $s_b$ at the $t$-th time stamp under the $k$-th covarying latent pattern. $E_k$ can be computed as:
(2) $E_k = A * C_k + b_k$
where $*$ is the convolution operator and $b_k$ is the bias.
The deep coinvestment evidence records the coinvestment activities which follow the deep coinvestment patterns. For example, given time series $s_a = \langle 1, 2, 1, 2 \rangle$ and $s_b = \langle 2, 1, 2, 1 \rangle$, the observation matrix is:
(3) $A = \begin{bmatrix} 1 & 2 & 1 & 2 \\ 2 & 1 & 2 & 1 \end{bmatrix}$
Let the deep coinvestment pattern be a tensor $C$ with $K = 2$ and $w = 2$, where the covarying patterns are the matrices $C_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ and $C_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$, and suppose the bias is zero. Then the deep coinvestment evidence is:
(4) $E = \begin{bmatrix} 2 & 4 & 2 \\ 4 & 2 & 4 \end{bmatrix}$
In this example, $E$ collects the evidence for both the covarying latent patterns $C_1$ and $C_2$ in $A$.
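The evidence computation in Definition 3 can be sketched as a valid cross-correlation (the usual CNN “convolution”); the concrete numbers below are illustrative choices of ours:

```python
import numpy as np

def coinvestment_evidence(A, C, b=None):
    """Slide each (2n x w) covarying pattern C[k] over the observation
    matrix A and return the evidence E of shape (K, T - w + 1)."""
    K, _, w = C.shape
    T = A.shape[1]
    b = np.zeros(K) if b is None else b
    E = np.zeros((K, T - w + 1))
    for k in range(K):
        for t in range(T - w + 1):
            E[k, t] = np.sum(A[:, t:t + w] * C[k]) + b[k]
    return E

A = np.array([[1., 2., 1., 2.],
              [2., 1., 2., 1.]])
C = np.array([[[1., 0.], [0., 1.]],   # pattern C_1
              [[0., 1.], [1., 0.]]])  # pattern C_2
E = coinvestment_evidence(A, C)
# E == [[2., 4., 2.], [4., 2., 4.]]
```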
We define the deep coinvestment network to describe the relationships for all dyadic combinations of the stocks being invested together in a market.
Definition 4
(Deep Coinvestment Network) The deep coinvestment network is a weighted undirected graph $G = (V, \mathcal{E}, W)$, where the node set $V$ represents all the stocks in the market, and the edge set $\mathcal{E}$ represents the deep coinvestment relationships between any two stocks. The weight vector $W$ represents the coinvestment intensities between pairs of stocks, where a large weight $w_{ab}$ indicates the trend that the investors spend more money on the stocks $a$ and $b$ together.
Generally, the transactions between the stocks and the investors can be modeled as a bipartite graph. In addition, if the data were available, an oracle coinvestment network could be computed by a one-mode projection of the transactions between the stocks and the investors. However, since the transaction records are hard to collect for many reasons (e.g. legal, privacy, or technical issues), we construct the deep coinvestment network indirectly from the deep coinvestment evidence and the market index data.
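The one-mode projection mentioned above can be sketched as follows; the bipartite matrix is hypothetical toy data, since real transaction records are usually unavailable:

```python
import numpy as np

# B[i, j] = 1 if investor i trades stock j (hypothetical records)
B = np.array([[1., 1., 0.],
              [0., 1., 1.],
              [1., 1., 0.]])

# One-mode projection onto the stock side:
# W[a, b] counts the investors holding both stock a and stock b.
W = B.T @ B
np.fill_diagonal(W, 0.0)  # drop self-loops
# e.g. stocks 0 and 1 are co-held by two investors, so W[0, 1] == 2
```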
The rise or fall of the market index is caused by the aggregated investment activities from the stocks in that market. Consequently, given the deep coinvestment evidence for all the dyadic stock combinations, the deep coinvestment network learning problem can be formalized as follows.
Definition 5
(Deep Coinvestment Network Learning) Suppose $S$ is the set of all the stocks in a market, all the stock time series in $S$ are aligned with length $T$, and the window size for the deep coinvestment pattern is $w$. Let $D$ be an $M \times K \times (T - w + 1)$ tensor, where $M = \binom{|S|}{2}$ and $D^{(i,j)}$ is the deep coinvestment evidence for the stock time series $s_i$ and $s_j$ in $S$. $D_t$ is an $M \times K$ matrix, and it is the deep coinvestment evidence for all the dyadic combinations of the stocks at the $t$-th time stamp. Let $y = \langle y_1, y_2, \ldots \rangle$ be a binary time series which represents the rise or fall trends of the market index. Our goal is to estimate the optimal weight vector $W$ for the deep coinvestment network which maximizes the probability of producing $y$ given the deep coinvestment evidence. It can be formalized as the following:
(5) $W^{*} = \arg\max_{W} \prod_{t} P(y_t \mid \vec{h}_t)$
where $y_t$ is the value of $y$ at the $t$-th time stamp. $\vec{h}_t$ is a vector and it can be computed by the following equation:
(6) $\vec{h}_t = D_t^{\top} W + \vec{b}$
where $\vec{b}$ is a bias vector.
With the aforementioned notations, the coinvestment evidence $D^{(i,j)}$ records the coinvestment activities of the investors, and $D^{(i,j)}_t$ gives the coinvestment intensities on the stocks $i$ and $j$ at the $t$-th time stamp, while the weight $w_{ij}$ is directly related to the money spent on the stocks $i$ and $j$ by the investors given the fixed coinvestment evidence. Furthermore, as the market index (the aggregate index price movement) is directly correlated to the total fund flow (or invested money) [7], it is correlated to the sum of the coinvestment intensities (or weights) over all the stocks. Therefore, only the correct coinvestment intensities for all the corresponding pairs of stocks lead to the real market index once the corresponding coinvestment evidence has converged after the optimization process. Thus, supervised by the market index, we can learn an effective coinvestment intensity for each pair of stocks.
Since solving this problem would require trying different weight values on all the dyadic combinations of the time series, it is an NP-hard combinatorial optimization problem. We propose a deep learning solution and leverage GPUs to speed up the computation.
III Our Framework
We propose the Deep Coinvestment Network Learning (DeepCNL) framework. As shown in Figure 1, the DeepCNL model combines a convolution layer, a general RNN layer, and a rise-fall trends supervision layer. (1) The convolution layer enumerates all the dyadic combinations of the time series, concatenates every pair of time series $s_i$ and $s_j$ ($i \neq j$) into a set of observation matrices, and computes all the deep coinvestment evidence for the observation matrices through the convolution operation; (2) the general RNN layer feeds all the deep coinvestment evidence into a multi-channel RNN, with each deep coinvestment evidence corresponding to a specific input channel, and outputs a single predicted real-valued sequence; (3) in the rise-fall trends supervision layer, DeepCNL measures the loss between the predicted sequence from the RNN layer and the target rise-fall sequence, and applies back propagation to optimize the parameters; (4) DeepCNL builds the deep coinvestment network from the learned inner weights of the RNN after the training with the supervision of the rise-fall trends. The obtained deep coinvestment network can be applied to various financial tasks.
III-A Convolution Layer
The convolution layer [8] is a critical part of DeepCNL for tracing the deep coinvestment patterns between two time series. Since our model aims to capture a full spectrum of deep coinvestment patterns and computes result sequences which should align to the market index time series with the rise-fall pattern, it uses the convolution layer directly without any downsampling methods [9].
With the notations in Definition 2, the computation process of our convolution layer is designed in Algorithm 1, where the function “enumerate()” generates all the dyadic combinations from $S$, and “concatenate()” appends the newly computed evidence to the end of the first dimension of the output tensor at each iteration. One should note that all the convolutions share the same pattern tensor $C$ across the different sliding time windows. Therefore, this method captures the global coinvestment structure for all the stocks with the same deep coinvestment pattern. Since Algorithm 1 needs to enumerate the complete dyadic combinations, its complexity is $O(|S|^2)$.
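Since the algorithm itself is not reproduced here, the enumeration-plus-convolution step can be sketched as follows (a simplification of Algorithm 1 under our earlier toy notation, not the paper's implementation):

```python
import numpy as np
from itertools import combinations

def convolution_layer(stocks, C, b=None):
    """Enumerate all dyadic combinations of the stock series, build each
    observation matrix, and compute its evidence with the SAME shared
    covarying pattern tensor C for every pair."""
    K, _, w = C.shape
    evidence = {}
    for (i, s_i), (j, s_j) in combinations(enumerate(stocks), 2):
        A = np.vstack([np.atleast_2d(s_i), np.atleast_2d(s_j)])
        T = A.shape[1]
        E = np.zeros((K, T - w + 1))
        for k in range(K):
            for t in range(T - w + 1):
                E[k, t] = np.sum(A[:, t:t + w] * C[k]) + (0. if b is None else b[k])
        evidence[(i, j)] = E
    return evidence

stocks = [np.array([1., 2., 1., 2.]),
          np.array([2., 1., 2., 1.]),
          np.array([1., 1., 1., 1.])]
C = np.array([[[1., 0.], [0., 1.]]])  # K = 1 shared pattern, w = 2
ev = convolution_layer(stocks, C)     # 3 stocks -> 3 dyadic combinations
```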
III-B General RNN Layer
The deep coinvestment network learning is an “n to 1” sequence learning task, which means learning a single-feature sequence given a multi-feature sequence (such as the vector time series). This conforms to the scenario of predicting the rise or fall trends of the market index based on the former observations of the stock indices in a market. As illustrated in Figure 1, we apply a general multi-feature recurrent neural network framework for the sequence learning task. The RNN framework has the flexibility to adopt the basic RNN [10], the long short-term memory recurrent neural network (LSTM) [11], the gated recurrent neural network (GRU-RNN) [12], or any other RNN, according to the trade-off between performance and accuracy. Our RNN layer consists of the RNN framework with $M$ input channels, and it computes a score that indicates the rise-fall trend with the process in Algorithm 2, where the function concatenate() is the same function as in Algorithm 1, and the function RNN() is a general RNN framework which computes the output value from the new input and its last hidden state. Since the deep coinvestment evidence tensor contains the evidence for $K$ covarying latent patterns, the RNN output is a $K$-dimensional vector. Therefore, we use a “$K$ to 1” linear layer to convert the RNN output to a scalar output $o_t$, and $o_t$ is the score which represents the rise-fall trend considering all the deep coinvestment evidence in the former window. The RNN in Algorithm 2 needs to process the deep coinvestment evidence for all the time stamps, and thus its time complexity is $O(T)$.
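A minimal sketch of this “n to 1” layer with a basic RNN cell (the framework equally admits LSTM or GRU cells; shapes and names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_score(D, Wx, Wh, Wout):
    """Feed the flattened evidence D[t] (all pairs x K patterns) into a
    basic RNN cell at each time stamp, then map the final hidden state
    to a single rise-fall score with a linear read-out."""
    h = np.zeros(Wh.shape[0])
    for t in range(D.shape[0]):
        h = np.tanh(Wx @ D[t].ravel() + Wh @ h)  # recurrent update
    return float(Wout @ h)                       # "to 1" linear layer

M, K, T, H = 3, 2, 5, 4            # pairs, patterns, time stamps, hidden units
D = rng.normal(size=(T, M, K))     # toy evidence tensor for all pairs
Wx = rng.normal(size=(H, M * K))   # input-to-hidden weights
Wh = rng.normal(size=(H, H))       # hidden-to-hidden weights
Wout = rng.normal(size=H)          # read-out weights
score = rnn_score(D, Wx, Wh, Wout)
```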
III-C Training with Rise-fall Trends
We connect the deep coinvestment network learning with the rise-fall trends of the market index via the softmax [13] method. We preprocess the target market index into a binary time series $y$, where $y_t = 0$ or $1$; $1$ and $0$ represent the rise and the fall of the market index from the last value respectively. The classification score of $y_t$ can be computed by:
(7) $\hat{y}_t = \mathrm{softmax}(W_c\, o_t + \vec{b}_c)$
where $t$ is the time stamp, and the elements $\hat{y}^{1}_t$ and $\hat{y}^{0}_t$ of $\hat{y}_t$ represent the degrees of the rise and fall trends respectively. Our loss function is defined as follows:
(8) $L = -\sum_{t} \left[ y_t \log \hat{y}^{1}_t + (1 - y_t) \log \hat{y}^{0}_t \right] + \lambda \|\Theta\|_F^2$
where $\|\Theta\|_F$ is the Frobenius norm over all the parameters $\Theta$. It is convenient to compute the derivatives of this loss, and thus we apply back propagation with the Adam [14] optimizer to train DeepCNL.
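The objective can be sketched as follows (a softmax cross-entropy with a Frobenius regularizer; the regularization strength `lam` is a hypothetical setting of ours):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # shift for numerical stability
    return e / e.sum()

def deepcnl_loss(scores, y, params, lam=1e-3):
    """Cross-entropy between predicted rise/fall degrees and the binary
    trend series y, plus a squared-Frobenius penalty over all parameters."""
    ce = 0.0
    for z_t, y_t in zip(scores, y):
        p = softmax(z_t)        # p[1]: rise degree, p[0]: fall degree
        ce -= np.log(p[y_t])    # y_t = 1 for a rise, 0 for a fall
    reg = lam * sum(float(np.sum(np.square(th))) for th in params)
    return ce + reg

# A confident, correct prediction yields a near-zero loss
loss = deepcnl_loss([np.array([0., 10.])], [1], params=[])
```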
From Equations (5) and (6), one can note that $y$ can also be computed through a map of $(D, W, w)$, where $D_t$ is the deep coinvestment evidence at the $t$-th time stamp ($1 \le t \le T - w + 1$), $W$ is the weight vector of the deep coinvestment network, and $w$ is the size of the time window. Since $\hat{y}$ can also be obtained by Algorithm 2 of DeepCNL, we can infer that our loss function actually helps optimize the weights $W$ during the training of DeepCNL. The experiments show that this inference is correct, and the weights estimated from the learned inner parameters of DeepCNL are positively related to the real coinvestment relations of the stocks.
III-D Network Learning
We use LSTM [15] to estimate the weights $w_{ab}$ ($a, b \in S$) for the deep coinvestment network in this work. This process can also use any other variant of RNN according to the application. The output of the LSTM is computed as:
(9) $h_t = o_t \odot \tanh(c_t)$
where $o_t$ is the output gate and $c_t$ is the cell value, and they can be calculated as:
(10) $o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)$
(11) $c_t = f_t \odot c_{t-1} + i_t \odot g_t$
where the function $\sigma$ is the sigmoid function, $W_o$ is the weights from the input to the output gate, $U_o$ is the weights between the hidden states, and $b_o$ is the corresponding bias. The $f_t$ and $i_t$ are the forget gate and the input gate respectively, and $g_t$ is the hidden representation for the inputs. They can be obtained by the following equations:
(12) $f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)$
(13) $i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)$
(14) $g_t = \tanh(W_g x_t + U_g h_{t-1} + b_g)$
where $W_f$, $W_i$, and $W_g$ are the weights from the input to the forget gate, the input gate, and the input hidden representation respectively; $U_f$, $U_i$, and $U_g$ are the weights between the hidden states; and $b_f$, $b_i$, and $b_g$ are the corresponding biases. We design the LSTM version of the weight $w_{ab}$ as:
(15) $w_{ab} = \sum_{h=1}^{H} \left( W_i^{(h,ab)} + W_g^{(h,ab)} + W_o^{(h,ab)} \right)$
where $W_i^{(h,ab)}$, $W_g^{(h,ab)}$, and $W_o^{(h,ab)}$ are the weights of the input gate, the input hidden representation, and the output gate for the input channel of $D^{(a,b)}$ at the $h$-th hidden state of the LSTM.
Discussion about the weight estimation. As one can observe from Equation (15), the weights for the deep coinvestment network are computed by adding up all the weights attached to the new input (e.g. input to hidden, input gate, etc.), since these weights collect the positive impacts of the deep coinvestment evidence on the market index. Thus the estimated weights for the deep coinvestment network indicate the total positive impacts of the deep coinvestment evidence on the market index, and the final estimated weight $w_{ab}$ reveals the impact of $D^{(a,b)}$ (for the stocks $a$ and $b$) on the market index sequence. Therefore, after the training, the estimated weights keep a relatively deterministic partial order that reflects their corresponding impacts. We verify this intuitive inference in the experiments.
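An “igo”-style extraction of Equation (15) can be sketched as follows; the weight matrices here are random stand-ins for learned LSTM parameters, and the mapping of evidence entries onto input columns is our assumption:

```python
import numpy as np

rng = np.random.default_rng(1)
M, K, H = 3, 2, 8   # stock pairs, patterns per pair, hidden units

# Hypothetical learned input-to-hidden LSTM weight matrices; one column per
# input dimension (the M x K evidence entries, flattened pair-by-pair).
W_i = rng.normal(size=(H, M * K))   # input gate
W_g = rng.normal(size=(H, M * K))   # input hidden representation
W_o = rng.normal(size=(H, M * K))   # output gate

def edge_weights(W_i, W_g, W_o, M, K):
    """For every pair of stocks, sum the input-gate, input-representation,
    and output-gate weights over all hidden states and over that pair's K
    evidence dimensions."""
    per_input = (W_i + W_g + W_o).sum(axis=0)   # sum over hidden states
    return per_input.reshape(M, K).sum(axis=1)  # sum over the K patterns

w = edge_weights(W_i, W_g, W_o, M, K)   # one weight per dyadic combination
```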
Network generation. As described in the former sections, we obtain a weight for every dyadic combination of the stocks. If the final network keeps too many edges, it will be a uselessly dense graph. This is a common issue in graph learning tasks [16]. We therefore design a graph generator framework. Applying Occam's razor, we first sort all the combinations by their weights in descending order, and then we add only a small proportion of edges, which we call the rare ratio $r$, into the final graph. This is also similar to the Benjamini-Hochberg procedure [17], which aims to alleviate the multiple comparisons problem for a set of statistical tests, when treating the weight estimation for each edge as an independent statistical test.
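A minimal sketch of this generator (the weights and the rare ratio value are illustrative):

```python
import math

def generate_network(weights, rare_ratio=0.05):
    """Keep only the top rare_ratio fraction of dyadic combinations,
    sorted by estimated weight in descending order."""
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    keep = max(1, math.floor(rare_ratio * len(ranked)))
    return dict(ranked[:keep])

weights = {("AAPL", "BAC"): 0.9, ("AAPL", "F"): 0.4,
           ("BAC", "F"): 0.2, ("BSX", "DAL"): 0.1}
edges = generate_network(weights, rare_ratio=0.5)  # keeps the top 2 edges
```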
IV Experiments and Discussion
IV-A Datasets
We compare our method with other existing methods on the S&P 500 dataset. Its details are shown in Figure 2.
Property            S&P 500
Total instants      851,264
Company number      470
Combination number  110,215
Feature number      5
Time stamps         1,200
Start date          2010-01-04
End date            2016-12-30
This dataset was obtained from a competition targeting stock price prediction^{1}^{1}1https://www.kaggle.com/dgawlik/nyse. The S&P 500 dataset consists of 470 well-known companies, since the components of the index vary over the years. The 5 features are the “open, high, low, close” prices and the volumes of the stocks. We use the time series of the ticker “SPY” as the target, for this ETF precisely tracks the market index of the S&P 500. The combination number in Figure 2 is the number of dyadic combinations of the stocks in the market. One can note that, although the original data is not big, the combination number is over 100,000. This means that our model needs to apply the learning process to over 100,000 dyadic time series combinations spanning 6 years. To capture both long-term and short-term dynamics, we choose one year as the interval and apply the learning process of the DeepCNL model to the data within each year.
IV-B Experiment Settings and Benchmarks
Our experiments consist of five parts. First, we verify the effectiveness of DeepCNL on the task about the investment density. Second, we compare DeepCNL with other existing methods on tasks related to financial influence. Third, we show the capability of DeepCNL to capture the annual evolving patterns from real data. Furthermore, we analyze the training process and the correctness of different DeepCNL implementations. Last but not least, we analyze the scalability of all the mentioned methods. Since our results show that DeepCNL implemented with LSTM performs the best of all the RNNs, we use the LSTM version of DeepCNL for all the comparison experiments.
In our experiments, we set the number of hidden states to 256 and the number of hidden layers to 2 for all the different RNN implementations of DeepCNL, and we set the number of covarying latent patterns $K$ to 16. The methods used for comparison include:

Pearson Correlation Coefficient (PCC). The PCC is the most common method to analyze the correlation relationships between time series [2].

Dynamic Time Warping (DTW) [18]. DTW is a dynamic similarity measure for time series. It can compare time series adaptively and does not need manual feature selection or alignment.

Visibility graph and WL kernel-based method (VWL). We propose VWL to compare time series through graph kernel methods. The visibility graph [19] can transform a single time series into a graph. Based on this idea, we transform all the time series into graphs, and we apply a graph kernel, the Weisfeiler-Lehman kernel [20], to compare the similarities between the stocks.
We preprocess the original data with a min-max normalization method. The edge weights of the found networks for each method (we abbreviate DeepCNL as DNL in all the experiments) are computed as follows: DNL uses the deep coinvestment network weights; DTW computes the DTW distance between two time series and then converts the distance into a similarity as the edge weight; PCC obtains the weights by computing the Pearson correlation coefficient between two time series (with the p-value); VWL generates the visibility graph for each time series and computes the similarities between the graphs through the WL kernel as the edge weights. Our prototype system^{2}^{2}2https://github.com/hkharryking/deepcnl is implemented with PyTorch.
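The preprocessing and the PCC baseline's edge weighting can be sketched as follows (the p-value filtering step mentioned above is omitted here):

```python
import numpy as np

def minmax(x):
    """Min-max normalization applied to each raw series."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

def pcc_weight(x, y):
    """PCC baseline edge weight: Pearson correlation of two normalized series."""
    return float(np.corrcoef(minmax(x), minmax(y))[0, 1])

m = minmax([3., 1., 2.])                           # scaled into [0, 1]
w = pcc_weight([1., 2., 3., 4.], [2., 4., 6., 8.]) # perfectly correlated
```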
IV-C Analysis on Investment Intensity
Since the edges of the deep coinvestment network reveal the relationship that two stocks are invested together by the same group of people, the investment intensity for a group of stocks can be measured by the edge density of the corresponding subgraph of the deep coinvestment network. To verify whether the learned networks capture the practical investment activities, we compare the edge densities of the subgraphs from the networks learned by DeepCNL and PCC. Each subgraph is related to an ETF which contains a subset of all the component stocks of the S&P 500. The ETFs we used include: the Guggenheim Russell Top 50 Mega ETF (ticker: “XLG”), containing the select top 50 big companies in the S&P 500; the S&P 100 index (ticker: “OEX”), containing the select top 100 big companies in the S&P 500; and the iShares Russell Top 200 index (ticker: “IWL”), containing the select top 200 big companies in the S&P 500. The XLG component set is a subset of the OEX component set, and the OEX component set is a subset of the IWL component set. That is, if people invest in OEX, they also invest in the component stocks of XLG, and if people invest in IWL, they also invest in the component stocks of XLG and OEX. Since the coinvestment describes the relation where two stocks are invested by the same group of people, the bigger the edge density of one subgraph, the more investment activities are captured in it. We compare the average edge densities of the subgraphs from the annual deep coinvestment networks obtained by 5 independent trials on each year's data, and we set the same rare ratio $r$ for DeepCNL and PCC. The results are shown in Figure 4, where the postfixes “i”, “g”, “o”, and “f” refer to the weights of the input gate, the input hidden representation, the output gate, and the forget gate of the LSTM respectively. For example, DNL-igo refers to learning the weights of the deep coinvestment network with Equation (15).
Figure 4 also illustrates the results of other implementations of DeepCNL that add up the weights of the corresponding gates. It can be clearly observed that, in the results of DNL-igo, the edge densities of the subgraphs of XLG are the biggest, and the edge densities of the subgraphs of IWL are the smallest. Since the investment activities on an ETF accumulate to its subset ETFs, the edge density of the subgraph of an ETF will be smaller than the edge density of any subgraph of its subset ETFs. The edge density result of DNL-igo consistently coincides with the order “XLG > OEX > IWL”, which justifies the effectiveness of using the inner parameters of the RNN to represent the coinvestment relations in Section III-D. Therefore, DNL-igo learns the deep coinvestment network which reveals the real market structure, and we use it in all the remaining experiments (we omit the suffix “igo”).
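The edge density measure used in this comparison can be sketched as follows (the edge set is a toy example of ours):

```python
def edge_density(edges, nodes):
    """Edge density of the subgraph induced by `nodes` (e.g. an ETF's
    components) in a learned network given as a set of undirected edges."""
    nodes = set(nodes)
    m = sum(1 for a, b in edges if a in nodes and b in nodes)
    n = len(nodes)
    return 2.0 * m / (n * (n - 1)) if n > 1 else 0.0

edges = {("A", "B"), ("A", "C"), ("C", "D")}
d = edge_density(edges, {"A", "B", "C"})  # 2 induced edges out of 3 possible
```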
IV-D Consistency with Practical Financial Principles
We provide two subtasks to further verify whether the learned networks capture the practical financial principles: (1) we analyze the financial influence of the top 10 degree nodes from the learned networks, and (2) we check whether the learned networks capture the investment activities which lead to the stock performance reported in real news.
Since the S&P 500 index is capitalization-weighted, its component stocks with higher market capitalizations have a bigger impact on the index value than those with small market capitalizations^{3}^{3}3https://en.wikipedia.org/wiki/S&P_500_Index#Weighting. Therefore, we use the average market capitalization as the metric to measure the financial influence of the found stocks. In the social network field, the high-degree method is a non-stochastic metric for identifying the influence of a selected node set [21]. That is, given a fixed node set in a social network, its influence is measured by the sum of the degrees of its nodes. Consequently, our first comparison is made among the average market capitalizations of the top 10 degree nodes of the respective networks. Since DTW and VWL are not able to scale up to data of this scale, we compare the methods by first sorting all the stock symbols in alphabetical order and then comparing the four methods on the first 50, 100, 150, and 200 stock time series. We also compare PCC and DNL on all the 470 stocks. We learn the networks on the annually split data during the 6 years and compute the top degree nodes from the learned networks. We set the rare ratio $r$ separately for the experiments with 200 or fewer stocks and for the experiments with 470 stocks. The results are reported in the table below.
Stock num.   DNL          PCC          DTW         VWL
50           134.0±39.3   90.1±49.4    35.8±5.3    57.1±42.1
100          132.3±38.6   70.3±39.1    42.9±11.8   50.3±40.1
150          169.0±61.3   62.1±32.7    35.4±7.9    N/A
200          176.1±56.9   63.0±34.4    38.4±6.9    N/A
470          223.1±67.7   67.0±35.8    N/A         N/A
We can observe from the table above that DNL beats all the other methods on the average market capitalization, while PCC performs the best among the baselines. Furthermore, the influence of the top stocks from PCC decreases with the number of stocks, while the influence of the top stocks from DNL increases. This shows that DNL captures the deep coinvestment patterns correctly, and it alleviates the stop word bias by capturing more global rules than the others. To further compare DNL with the best baseline PCC, we collected the news about the “annual best 10 performing S&P 500 stocks”^{4}^{4}4e.g. http://www.nasdaq.com/article/5bestperformingsp500stocksof2014analystblogcm425786 during the years 2010 to 2016. These news reports ranked the stocks according to their returns. The performance of the stocks has a positive relationship with the real-world investment activities. That is, the more frequently a stock is invested, the better its performance will be. Therefore, we use the ranked lists of the top-performing stocks as another benchmark to test how well the obtained networks conform with the investment activities of people. We list the annual best 10 performing stocks in the following table,
Rank  2010  2011  2012  2013  2014  2015  2016 

1  NFLX  COG  HW  NFLX  LUV  NFLX  NVDA 
2  FFIV  EP  DDD  MU  EA  AMZN  OKE 
3  CMI  ISRG  REGN  BBY  EW  ATVI  FCX 
4  AIG  MA  LL  DAL  AGN  NVDA  CSC 
5  ZION  BIIB  PHM  CELG  MNK  CVC  AMAT 
6  HBAN  HUM  MHO  BSX  AVGO  HRL  PWR 
7  AKAM  CMG  AHS  GILD  GMCR  VRSN  NEM 
8  PCLN  PRGO  VAC  YHOO  DAL  RAI  SE 
9  WFMI  OKS  S  HPQ  RCL  SBUX  BBY 
10  Q  ROST  EXH  LNC  MNST  FSLR  CMI 
and we compare the coverage of the best 10 performing stocks by the obtained networks of DNL and PCC in Figure 3. As Figure 3 shows, DNL covers more top-performing stocks than PCC, and thus DNL captures more clues about the investment activities than PCC.
IV-E Discovering the Evolving Coinvestment Patterns
In this experiment, we show the capability of DeepCNL (DNL) to capture the evolving deep coinvestment patterns. We learn the annual deep coinvestment networks by DNL from the year 2010 to 2013 and extract the biggest connected components from the corresponding networks. The result of DNL is shown in Figure 5 (c-f), and as a comparison we also show the networks (for the years 2010 and 2011) learned by PCC with the same parameters in Figure 5 (a) and (b).
We can observe that the connected components obtained by DNL keep a relatively stable structure although the coinvestment patterns evolve annually, which indicates that our method captures the effective evolving coinvestment patterns. Furthermore, we find that the results of DNL in Figure 5 cover 60% of the top stocks mentioned by the news reports found in the related study [22]. This shows that our result follows the financial rule about the positive correlation between the transaction volume of a stock and the number of times it is mentioned in the news media [23]. As a comparison, the results of PCC hardly capture any useful stable structure related to the mentioned studies.
We also observe that the distances between the high degree nodes (stocks) in DNL's results in Figure 5 match the discoveries of several financial studies [24] [25] [26]. To further analyze the coinvestment rules captured by DNL, we compute the yearly average distance between the common high degree nodes (stocks) in the obtained connected components of DNL and list the results in the table below. In particular, we find that the average distance between “AAPL” (Apple) and “BAC” (Bank of America) is the shortest, with the smallest standard deviation. This result coincides with the most invested stocks in the mutual funds [24], the most co-occurring stocks in the financial reviews [25], and the ranking result of the stock correlation study [26].

       BAC   AAPL      BSX       DAL       RF        F
BAC    -     1.5±0.5   1.8±0.8   2.0±1.0   2.5±1.1   2.0±0.7
AAPL         -         2.5±1.5   2.5±1.2   3.8±0.8   2.5±0.9
BSX                    -         3.3±0.7   3.8±1.2   3.3±0.7
DAL                              -         3.0±0.7   2.3±1.3
RF                                         -         2.8±1.8
F                                                    -
V Conclusion
This work learns the coinvestment relations among the stocks in a market with the deep coinvestment network learning model. Its main contribution is to learn coinvestment relationships which address the impact on the market index trends and to link the practical financial rules to the inner parameters of the proposed deep learning model (DeepCNL). The experimental results on the real-world stock data show that DeepCNL learns an effective deep coinvestment network that performs consistently with known financial principles on various tasks. This verifies that our model captures the dynamic relationship between the stock-level indices and the market-level index.
VI Acknowledgements
This work is supported by the National Natural Science Foundation of China (Grant Nos. 61503422 and 61602535), the Open Project Program of the National Laboratory of Pattern Recognition (NLPR), the Program for Innovation Research in Central University of Finance and Economics, and the Beijing Social Science Foundation (Grant No. 15JGC150).
References
 [1] N. Cetorelli and M. Gambera, “Banking market structure, financial dependence and growth: International evidence from industry data,” The Journal of Finance, vol. 56, no. 2, pp. 617–648, 2001.
 [2] N. Johnson and A. Banerjee, "Structured hedging for resource allocations with leverage," in Proceedings of the 21st ACM SIGKDD, Sydney, NSW, Australia, August 10–13, 2015, 2015, pp. 477–486.
 [3] D. F. Silva, G. E. A. P. A. Batista, and E. J. Keogh, "Prefix and suffix invariant dynamic time warping," in IEEE 16th International Conference on Data Mining, ICDM 2016, December 12–15, 2016, Barcelona, Spain, 2016, pp. 1209–1214.
 [4] H. A. Dau and E. J. Keogh, "Matrix profile V: A generic technique to incorporate domain knowledge into motif discovery," in Proceedings of the 23rd ACM SIGKDD, Halifax, NS, Canada, August 13–17, 2017, 2017, pp. 125–134.
 [5] B. et al., "Order imbalances and stock price movements on October 19 and 20, 1987," The Journal of Finance, vol. 44, no. 4, pp. 827–848, 1989.
 [6] J. B. Lee, X. Kong, Y. Bao, and C. M. Moore, "Identifying deep contrasting networks from time series data: Application to brain network analysis," in Proceedings of the 2017 SIAM International Conference on Data Mining, Houston, Texas, USA, April 27–29, 2017, 2017, pp. 543–551.
 [7] V. A. Warther, “Aggregate mutual fund flows and security returns,” Journal of Financial Economics, vol. 39, no. 2, pp. 209–235, 1995.
 [8] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 10, pp. 436–444, May 2015.

 [9] A. Giusti, D. C. Ciresan, J. Masci, L. M. Gambardella, and J. Schmidhuber, "Fast image scanning with deep max-pooling convolutional neural networks," in IEEE ICIP 2013, Melbourne, Australia, September 15–18, 2013, 2013, pp. 4034–4038.
 [10] M. Lukoševičius and H. Jaeger, "Reservoir computing approaches to recurrent neural network training," Computer Science Review, vol. 3, no. 3, pp. 127–149, 2009.
 [11] F. Gers, "Learning to forget: Continual prediction with LSTM," IET Conference Proceedings, pp. 850–855(5), January 1999.
 [12] J. Chung, Ç. Gülçehre, K. Cho, and Y. Bengio, “Empirical evaluation of gated recurrent neural networks on sequence modeling,” CoRR, vol. abs/1412.3555, 2014.
 [13] J. S. Bridle, Probabilistic Interpretation of Feedforward Classification Network Outputs, with Relationships to Statistical Pattern Recognition. Berlin, Heidelberg: Springer Berlin Heidelberg, 1990, pp. 227–236.
 [14] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” CoRR, vol. abs/1412.6980, 2014.
 [15] I. Sutskever, O. Vinyals, and Q. V. Le, “Sequence to sequence learning with neural networks,” in Advances in Neural Information Processing Systems 27, Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, Eds. Curran Associates, Inc., 2014, pp. 3104–3112.

 [16] T. Jebara, J. Wang, and S. Chang, "Graph construction and b-matching for semi-supervised learning," in Proceedings of the 26th Annual International Conference on Machine Learning, ICML 2009, Montreal, Quebec, Canada, June 14–18, 2009, 2009, pp. 441–448.
 [17] D. Thissen, L. Steinberg, and D. Kuang, "Quick and easy implementation of the Benjamini-Hochberg procedure for controlling the false positive rate in multiple comparisons," Journal of Educational and Behavioral Statistics, vol. 27, no. 1, pp. 77–83, 2002.
 [18] D. J. Berndt and J. Clifford, "Using dynamic time warping to find patterns in time series," in Knowledge Discovery in Databases: Papers from the 1994 AAAI Workshop, Seattle, Washington, July 1994. Technical Report WS-94-03, 1994, pp. 359–370.
 [19] L. Lacasa et al., "From time series to complex networks: The visibility graph," Proceedings of the National Academy of Sciences, vol. 105, no. 13, pp. 4972–4975, 2008.
 [20] N. Shervashidze and K. M. Borgwardt, "Fast subtree kernels on graphs," in Proceedings of NIPS, held 7–10 December 2009, Vancouver, British Columbia, Canada, 2009, pp. 1660–1668.
 [21] D. Kempe, J. Kleinberg, and E. Tardos, “Maximizing the spread of influence through a social network,” in Proceedings of the Ninth ACM SIGKDD. New York, NY, USA: ACM, 2003, pp. 137–146.
 [22] G. Ranco, I. Bordino, G. Bormetti, G. Caldarelli, F. Lillo, and M. Treccani, "Coupling news sentiment with web browsing data improves prediction of intra-day price dynamics," PLOS ONE, vol. 11, no. 1, pp. 1–14, January 2016.
 [23] M. Alanyali, H. S. Moat, and T. Preis, "Quantifying the relationship between financial news and the stock market," Scientific Reports, vol. 3, Article no. 3578, December 2013.
 [24] R. Solis, "Visualizing stock-mutual fund relationships through social network analysis," Global Journal of Finance and Banking Issues, vol. 3, no. 3, 2009.
 [25] G. Kramida, “Analysis of stock symbol cooccurrences in financial articles,” https://wiki.cs.umd.edu/cmsc734_f13/images/9/9f/Analysis_of_Stock_Symbol_Cooccurences_in_Financial_Articles.pdf, accessed May 26, 2018.

 [26] C. Shekhar and M. Trede, "Portfolio optimization using multivariate t-copulas with conditionally skewed margins," Review of Economics & Finance, vol. 9, pp. 29–41, August 2017.