## 1 Introduction

It has been recognised in the financial literature that jumps in asset returns occur clustered in time and affect several stock markets within a few hours or days, see for example Aït-Sahalia et al. (2015)

. We work with a large panel of stocks from several European markets and we aim to identify and to predict joint tail risk expressed as probabilities of jumps in their daily returns. Our modelling assumptions are based on the well-known paradigms of stochastic volatility (SV) models

(Taylor, 1982) combined with Poisson-driven jumps (Andersen et al., 2002). The resulting Bayesian hierarchical models require careful prior specifications and sophisticated, modern Markov chain Monte Carlo (MCMC) inference implementation strategies. Our purpose is to provide a general, robust modelling and algorithmic framework and test it in an out-of-sample forecasting scenario with real data.

Our first point of departure is the univariate SV model with jumps. We provide a new efficient MCMC algorithm combined with careful prior specification that improves upon existing MCMC strategies. Next, we extend this approach by utilising the panel structure of stock returns so that unobserved jump intensities are assumed to propagate over time through a dynamic factor model. The resulting Bayesian hierarchical model can be viewed as a generalisation of the Bates (1996), univariate SV model with jumps since it can be obtained as a special case by just assuming that jump intensities are independent across time and stocks. Furthermore, our computational techniques are relevant to other scientific areas such as, for example, neuroscience, whereas simultaneous recordings of neural spikes are often modelled through latent factor models (Buesing et al., 2014).

To illustrate our motivation, consider the daily returns from the stocks of the STOXX Europe Index over a period -. The top panel of Figure 1

has been created by empirically identifying a jump when the return exceeds three standard deviations, where the estimators of mean and variance of each series were robustly estimated as the median and the robust scale estimator of

(Rousseeuw and Croux, 1993) respectively. The middle panel depicts a summary statistic of an SV model with jumps estimated separately for each stock return series: each dot denotes a stock return in which that particular day the probability of a jump in the series has been estimated to be greater than . Compared with the top panel, it provides a more sparse jump process indicating a successful separation of jumps from persistent moves modelled by the volatility process, a topic discussed in detail by, for example, Aït-Sahalia (2004). But the notable feature that inspired this work is that clustering and inter-dependence of jump intensities is evident in the middle panel and, interestingly, days in which jumps have been simultaneously identified in a large number of stocks coincide with days of events that affected the financial markets worldwide as pointed out at the x-axis of the plots. Thus, one may attempt joint, Markovian modelling of jump intensities across time which might reveal some predictability of jumps. Indeed, the results of our proposed joint modelling formulation provides strong evidence for better out-of-sample predictability with estimated jump probabilities higher than depicted in the lower panel of Figure 1.We propose a modelling approach for the jump processes of SV models in which the time- and cross-sectional dependence of the jump intensities are driven by a latent, common across stocks, dynamic factor model. We carefully specify informative prior distributions to the parameters of the factor model so that the implied priors for the jump intensities in each stock are in concordance with the well-known (Eraker et al., 2003) a priori expectation for one jump every few months. Following Bhattacharya and Dunson (2011), we do not impose the usual (Aguilar and West, 2000) identifiability constraints on the factor loadings parameters. Our model is an alternative to the jump processes of SV models with jumps proposed by Aït-Sahalia et al. (2015). In their approach, the SV model is combined with a multivariate Hawkes process (Hawkes, 1971)

which models the propagation of jumps over time and across assets by introducing a feedback from the jumps to their intensities and back. Their proposed model is estimated with a generalised version of the method of moments.

Our Bayesian inference implementation strategy is based on an MCMC algorithm that alternates sampling of the latent volatility process and its parameters with sampling of the jump process and its parameters. Both these steps require dawings of the high-dimensional paths of the latent volatilities and factors from their full conditional distributions. By noting that the target distributions are proportional to the product of intractable, non-linear likelihood functions with Gaussian priors we utilize the sampler proposed by

Titsias and Papaspiliopoulos (2018) to simultaneously draw the whole path of each latent process at each MCMC iteration. This is an important feature of our methods for the following reasons. First updating each state of a volatility process separately results in Markov chain samplers with slow mixing (Shephard and Kim, 1994). Second, although Chib et al. (2002) and Nakajima and Omori (2009)use a mixture of Gaussian distributions approximation to also simultaneously update the whole volatility path, their methods rely on an importance sampling step which is quite problematic, see

Johannes et al. (2009) for a detailed discussion and Section 3.4 of this paper for a simulation that illustrates this issue.To compare the predictive performance of two different models by estimating predictive Bayes factors (Geweke and Amisano, 2010) we use sequential Monte Carlo methods. Since this turns out to be problematic in the case of jointly modelled jump intensities, we utilize the annealed importance sampling method developed by Neal (2001) to construct a sequential importance sampling algorithm in order to obtain the desired estimates.

The structure of the remaining of the paper is as follows. Section 2 presents our proposed modelling framework and Section 3 describes the methods that we develop to conduct Bayesian inference for the proposed model. In Section 4 we present the computational techniques that we use to assess the predictive performance of the proposed model. Section 5 presents results and insights from the application of our methods on the real dataset and Section 6 concludes with a small discussion.

## 2 Latent jump modelling

### 2.1 Notation

We denote the univariate Gaussian distribution with mean and variance by , and its density evaluated at by ; denotes the density of the

-variate normal distribution with mean

and covariance matrix evaluated at ; and denote the gamma, with mean, and inverse gamma, defined as the reciprocal of gamma, distributions respectively;

the uniform distribution on interval

; denotes the gamma function. Index is used for stocks and index for times, e.g., is the th stock return at day ; when we introduce factor models indexdenotes factor. Upper case letters denote vectors and matrices, and lower case letters denote scalars. For a collection of variables indexed by time or stock and denoted by a lower case character, the corresponding upper case character denotes their vector or matrix representation, e.g.

denotes the matrix whose elements are ; , and are then generic rows and columns of ; all vectors are understood as column vectors. A subscript, of the form , where is a positive integer, indicates the collection of vectors with subscripts ; e.g. is the collection of vectors . Finally, the transpose of a vector or matrix is denoted by a prime; e.g. is the transpose of .### 2.2 Stochastic volatility with jumps

We model stock returns as

(1) |

(2) |

with and and are independent. The number of jumps of the th stock at time

follow a Poisson distribution,

(3) |

where denotes the corresponding jump intensity, is a time-increment associated to each return and is the th jump size. We explicitly take into account that stock returns are computed over varying time increments , such as in-between weekends and holidays, hence is the daily jump intensity.

We model the parameters and of each log-volatility process and the parameters and of the jump sizes as independent across stocks. For , and we follow the related literature (Kim et al. (1998), Omori et al. (2007), Kastner and Frühwirth-Schnatter (2014)) and take , and .

For and we choose proper priors. In fact, improper priors will lead to improper posterior. This is due to the fact that according to the model there is positive probability that there are no jumps for the whole period of time, i.e., the event for all

, has positive prior probability, and in such an event,

and are unidentifiable in the likelihood. Instead, we take , where , and . In this specification, , hence we match the variance of jump sizes with a quantity that relates to the sample variance of returns. A same reasoning is followed for setting the prior variance of and we have found that the choice of the multiplier, here taken 5, is not crucial in the analysis.#### 2.2.1 Modelling jumps independently across stocks and time

By modelling as independent random vectors, we obtain independent univariate SV with jumps models. Additionally, each vector can be taken as one of independent elements, , in which case one obtains the models described, among others, by Bates (1996), Chib et al. (2002) and Eraker et al. (2003). It follows that has, marginally with respect to

, a negative binomial distribution with density

(4) |

where . The mean of the distribution is and the variance is Taking ensures that the density is monotonically decreasing. This is a feature we are interested in: jumps are meant to capture infrequently large price movements, and a priori we expect that with highest probability there is no jump and it is more likely to have less than more jumps, if any. We can choose such that the probability of no jump at a given time increment , is at least , a user-specified threshold. Taking for example and results in with . These choices are consistent with the prior expectation, in financial applications, for one jump every few months; see for example Eraker et al. (2003) for a more detailed discussion.

### 2.3 Joint modelling of jumps

To capture the dependence of the jumps over time and across stocks we model by using a dynamic factor model which is specified as follows. We first transform to via

(5) |

which implies that . Then, we model the time-dependence of jumps via latent factors modelled as independent autoregressive processes. The cross-sectional dependence of jumps is captured by a matrix of factor loadings and a vector . The joint model specification is

(6) | ||||

(7) | ||||

(8) |

where is a diagonal matrix with elements and is a diagonal matrix with elements .

It is known that parameters of latent factor models are only identifiable up to certain transformations, such as orthogonal rotations and sign changes (Aguilar and West, 2000). However, latent factor models are also used as low-rank predictive models in which case the out-of-sample prediction is of prior importance; see Bhattacharya and Dunson (2011). This is the perspective we adopt here, where we try to borrow strength from the information in large panels of stocks in order to predict with higher accuracy individual jumps by using a parsimonious factor model. Therefore, our prior specification imposes no identifiability constraints on the parameters of the factor model and are taken as , , and .

The specification of the hyperparameters of the priors assigned on the parameters

and require extra care because they affect, through (5), the imposed prior on . The fact that our modelling formulation of Section 2.2.1 was based on informative priors provides a way to elicit informative prior specification of the hyperparameters and . We first set which is theupper quantile of the

density. Then, we choose the hyperparameters , and to be such that the mean, the variance and the mode for the resulting prior on each in (5) are comparable with the corresponding quantities of the distribution. Based on calculations presented in the supplementary material we set , and . In Table 1 we display the quantities of interest in the two prior distributions. To evaluate the differences of the two priors we also examine the impact of different choices for their hyperparameters in the forecasts of future observations. See in the supplementary material for the related results.Prior for | Mean | Variance | Mode |
---|---|---|---|

0 | |||

Factor model |

## 3 MCMC sampling from the posterior of interest

Let ,. Our interest lies on the joint posterior of parameters and latent states . To draw samples from the posterior of interest we design an MCMC algorithm which proceeds as follows. At each MCMC iteration we perform stock-specific updates of , , , , and then we update the latent factors and their parameters .

An important characteristic of the designed MCMC algorithm is that before updating the log-volatilities and the number of jumps we integrate out the jump sizes . This results to an efficient simultaneous update of the log-volatilities vector and a direct sampling of the full conditional of the number of jumps . Algorithm 1 summarizes the steps of the MCMC sampler. The algorithm presents how the full conditional densities of vectors of parameters are sampled by exploiting the conditional independence structure of the hierarchical model defined by the equations (1)-(3) and (5)-(8). In Sections 3.1, 3.2 and 3.3 we will describe in detail the more elaborate steps and of Algorithm 1 whereas the details of the other sampling steps are given in the supplementary material.

We also emphasize that by switching off certain steps of Algorithm 1 we obtain a novel MCMC sampler for existing models. More precisely, Bayesian inference for univariate SV with jumps models can be conducted if the step in the th line is skipped and the step in the th line replaced with the step of drawing the independent jump intensities of the model. Switching off the steps in lines and results in an MCMC algorithm for univariate SV without jumps models. By removing the steps in lines of Algorithm 1 we obtain a sampler for a Cox model (Cox, 1955) with intensities driven by latent factors.

### 3.1 Sampling the number of jumps and jump sizes

To sample each we first integrate out the jump sizes and then we construct a rejection sampling algorithm to draw from . This algorithm is based on the following result.

###### Proposition 1.

For each and the discrete distribution with probability mass function is log-concave for .

###### Proof.

See the supplementary material. ∎

For the remaining of this subsection let denote . Due to Proposition 1, the target distribution is a unimodal distribution. Let be an integer at the right of the mode of the distribution. We define the probability mass function (pmf)

with . The pmf is proportional to for and proportional to the geometric pmf with parameter, , for

. By noting that if a random variable

follows the exponential distribution with parameter,

, thenfollows the geometric distribution with parameter

, we observe that , for , corresponds to an exponential density that goes through the points and . Log-concavity of the distribution implies that , for , since any straight line that goes through the points and bounds from above the logarithm of . See Figure 2 for an illustration. A rejection algorithm to sample from the distribution with pmf proceeds as summarized by Algorithm 2.#### 3.1.1 Jump sizes

Drawing the jump sizes given the number of jumps is based on the following proposition. Let and be the identity matrix and an -dimensional vector with ones respectively.

###### Proposition 2.

###### Proof.

See the supplementary material. ∎

### 3.2 Sampling latent volatilities and their parameters

In the th line of Algorithm 1 we sample the whole vector jointly with the parameters and . We make this choice because the prior covariance matrix of depends on the values of these parameters and this dependence results in high correlation between and (Titsias and Papaspiliopoulos, 2018). To sample the latent log-volatility paths and the parameters we first integrate out the jump sizes from the target density and we obtain

(9) |

where denotes the prior of the latent autoregressive log-volatility process defined in (2) and and denote the densities of the priors for the parameters and as described in Section 2.2.

To draw samples from (9) we utilize the auxiliary gradient-based sampler proposed by Titsias and Papaspiliopoulos (2018). The sampler is a Metropolis-Hastings step that combines auxiliary variables, the Gaussian prior of the latent path and a Taylor expansion of the intractable likelihood of the observations. After the update of , and we condition on their new values in order to draw from its full conditional, Gaussian, distribution. See the supplementary material for details.

To further improve the sampling of the parameters we follow Kastner and Frühwirth-Schnatter (2014) and we use the, so-called, ancillarity-sufficiency interweaving strategy proposed by Yu and Meng (2011). After drawing , and as described above we transform the vector to a new vector via . Then, we draw again from the conditional densities of and which are now conditioned on rather than on . These last updates are performed through univariate Metropolis-Hastings steps. Before moving to the next step of the MCMC algorithm we also update deterministically the path by setting ; see in the supplementary material for illustrations of the MCMC performance achieved by using the interweaving strategy.

### 3.3 Sampling latent jump intensities

Following the modelling approach presented in Section 2.2.1 it is easy to see that the full conditional density of is an independent .

In the case of modelling the jump intensities by using the dynamic factor model defined in (5)-(8) we need to draw samples from the distribution with density

(10) |

where , denotes the prior density of and is the covariance matrix, which depends on the values of the parameters , of the autoregressive process defined in (7) and (8). We utilize again the auxiliary gradient-based sampler of Titsias and Papaspiliopoulos (2018) to obtain the desired samples. We also employ the ancillarity-sufficiency interweaving strategy developed by Yu and Meng (2011) to reduce the autocorrelation of the samples of the parameters in . After the end of the auxiliary gradient-based sampler, we set , where is a diagonal matrix with elements and . Then we use a random walk Metropolis-Hastings step that targets the density . See the supplementary material for more details and illustration for the improvements of MCMC mixing.

### 3.4 Approximation by using a mixture of Gaussian distributions

For a given stock, sampling the whole path of the latent log-volatilities at each MCMC iteration could be performed by utilizing the methodology developed by Chib et al. (2002) and Nakajima and Omori (2009). Their methods are based on the idea of Kim et al. (1998) to approximate a SV without jumps model by using a mixture of Gaussian distributions.

The advantage of their approach is that the update of can be conducted by drawing samples directly from its full conditional distribution which is a -variate Gaussian with tridiagonal inverse covariance matrix. The jumps sizes and their parameters should be updated after integrating out the mixture component indicators from the likelihood of the approximated model, otherwise convergence of the corresponding MCMC algorithm will be slow (Nakajima and Omori, 2009). Integrating out the mixture component indicators results in a partially collapsed Metropolis-Hastings within Gibbs sampler which has to be implemented with care since permuting the order of the updates can change the stationary distribution of the chain (Van Dyk and Jiao, 2015). If, for example, the update of the mixture indicators is conducted in the step which is intermediate between the update of the latent log-volatilities and the update of their parameters, then the Markov chain will not converge to the desired stationary distribution.

More importantly, by using the methods developed in Chib et al. (2002) and Nakajima and Omori (2009), samples from the posterior of the parameters of the exact model are obtained with an importance sampling step. In this step, weights that are equal to the ratio between the likelihood of the exact and the approximated model, are assigned to the MCMC samples obtained from the approximated model. Although in the case of SV models without jumps the described importance sampling step works without problems, it is known (Johannes et al., 2009) that in the case of models with jumps such weights may have large variance since a few or only one of them could be much larger than the others. Thus, the output of the importance sampling step will not be suitable for any Monte Carlo calculation. For an illustration of the problem we conducted a simulation experiment in which we calculated the required importance sampling weights for SV models with and without jumps. In particular, we simulated time series with log-returns from each model by utilizing equations (1)-(2) and omitting the jump component in (1) to simulate from the model without jumps. In the case of the SV model with jumps we assumed independent jump intensities over time following the modelling approach presented in Section 2.2.1. We chose values for the parameters of the log-volatility process which are common in the financial literature (Kim et al., 1998); and and we simulated the sizes of possible jumps from the Gaussian distribution with and (Eraker et al., 2003). Then, we drew (thinned) posterior samples of the parameters and latent states of the approximated models by using the corresponding MCMC algorithms.

Figure 3 displays the log-normalised importance sampling weights plus the logarithm of their number. The variance of the normalized importance sampling weights should become smaller as the importance sampling approximation improves. This implies that the histograms in Figure 3 should be centered at zero in the case of an accurate enough approximation; see the corresponding plots in Kim et al. (1998) and Omori et al. (2007). The left panel of Figure 3 indicates that there is little effect of the mixture approximation in the case of the model without jumps, but the right panel points out that this is not true in the case of the SV with jumps model. The bad performance of the approximation strategy in models with jumps is due to the fact that the mixture model has been proposed by Kim et al. (1998) and improved by Omori et al. (2007) for SV models without jumps. By adding a jump component in the model the mixture approximation has to take into account the uncertainty in the estimation of the jump process and this is not the case in the approximation used by Chib et al. (2002) and Nakajima and Omori (2009).

## 4 Out-of-sample forecasting

To compare the predictive performance of different SV models we develop a computational framework for out-of-sample forecasting. Based on in-sample and out-of-sample observations we estimate the predictive Bayes factors (Geweke and Amisano, 2010) by utilizing sequential Monte Carlo methods (Doucet et al., 2001). For the remaining of this section we assume that the parameters , , are fixed to their in-sample estimated posterior means and we omit them from our notation.

### 4.1 Prediction with independent SV models

Let and be the model that consists of independent, univariate SV models with and without jumps respectively. Let also and denote in-sample and out-of-sample observations. We compare the predictive performance of and by considering predictive Bayes factors defined as

(11) |

For both and we have that

so to estimate the Bayes factors in (11) we need to estimate the quantities . Based on the formula

(12) |

we use sequential Monte Carlo methods, also known as particle filters, to estimate the desired quantities. It is known, see for example in Chopin et al. (2013), that from the output of a particle filter algorithm that targets the posterior in (12

) we also obtain an unbiased estimator for its normalizing constant. See in the supplementary material for the pseudocode of the particle filter algorithm that we employ to obtain unbiased estimators of the normalizing constants in (

12).### 4.2 Prediction by using joint modelled jump intensities

Let be our proposed SV with jumps model, defined by equations (1)-(3) and (5)-(8). To compare the predictive performance of with the performance of competing models by utilizing predictive Bayes factors, we need to estimate the marginal likelihood for each . This could be achieved with a particle filter algorithm, constructed similarly to the case of independent SV models, as described in section 4.1. However, in the case of , particle filter algorithms deliver poor estimations of the marginal likelihoods since the importance weights, calculated at each of their steps, have large variance and the estimation of is based on a few samples with large weights.

We develop an alternative sequential importance sampling algorithm to obtain the desired estimates which is based on the annealed importance sampling (AIS) proposed by Neal (2001). AIS utilizes the method of simulated annealing (Kirkpatrick et al., 1983) to construct an importance sampling algorithm for drawing weighted samples from an unnormalized target distribution. The proposal distribution of the algorithm is based on a sequence of auxiliary distributions and Markov transition kernels that leave them invariant. As in any importance sampling algorithm, the sample mean of the importance weights is an estimation of the ratio between the normalizing constants of the target and proposal distributions. Moreover, the output of AIS can be used for the estimation of expectations with respect to any of the auxiliary distributions used for the construction of the algorithm. Here we exploit these features of AIS to estimate the marginal likelihoods. In the rest of the section we provide a detailed description of how we apply AIS, is suppressed throughout except when it is necessary.

Let be the required sequence of auxiliary distributions. By noting that each must be proportional to a computable function and satisfy wherever we set and

for . Samples from , , are obtained by drawing samples from using a few iterations of Algorithm 1 and the Markov transition densities and . Note that is easily sampled by utilizing samples already drawn from . To compute the importance weights of the obtained samples we first note that are proportional to the computable functions

and . Therefore, the importance weight for every sampled point is

and , , where is the number of samples obtained by AIS and

(13) |

We also note that the infinite sum in (13) can be truncated without affecting the results of the algorithm (Johannes et al., 2009). See in the supplementary material for the pseudocode that we used to construct the described AIS algorithm.

The sample means of the weights calculated at each step of AIS provide estimators of the marginal likelihoods . However, the resulting weights have large variance and the corresponding estimates will be inaccurate. Nevertheless, and in contrast with a particle filter that targets , we can use the samples to estimate the quantities for each stock separately. We propose the following sampling procedure. First, note that

This implies that by using only the th element of the product in (13) we obtain an estimator of the th marginal of as follows. We set , where

and . Then, is an accurate estimator of since it is based on importance sampling weights with reduced variance and can be used for the estimation of Bayes factors of the form

(14) |

where the denominator can be estimated with the methods presented in Section 4.1. Finally, by assuming that the joint predictive density of all returns is estimated by the product of their marginal predictive densities, as in the independent across stocks SV with jumps models, we estimate the Bayes factors

(15) |

## 5 Real data application

Here we present results from the application of the developed methodology on real data. In the supplementary material we present results from simulation studies used to test our methodology. Further specifics of the data analysis, such as parameter estimates for the log-volatility processes, can be found in Alexopoulos (2017).

### 5.1 Data and MCMC details

We applied the developed methods on time series consisted of daily log-returns from stocks of the STOXX Europe Index downloaded from Bloomberg between to . We removed stocks by requiring each stock to have at least traded days and no more than 10 consecutive days with unchanged price. The final dataset consisted of stocks and traded days. We used daily log-returns observed in the first days as in-sample observations to estimate all models described through our article and the last observations to test their predictive ability. In our dynamic factor model we used factors. We ran all the MCMC algorithms for iterations using a burn-in period of sampled points and we collected (thinned) posterior samples.

### 5.2 Results

#### 5.2.1 Bayesian inference for parameters and latent states

The middle and bottom panels of Figure 1 present a summary visual outcome of models and . Figure 4 depicts the posterior means of the factor paths along with the posterior distributions of their persistent parameters and . The size of and indicates strong evidence of autocorrelation in the factor processes. Figure 5

demonstrates the ability of our Bayesian methods to estimate several quantities of interest. It depicts the posterior probabilities of the number of jumps for each weekday. This is clearly related with the parameter

in (3) which takes into account the time distance of two successive observations.#### 5.2.2 Predictive performance

Figure 6 compares the predictive performance of independent, univariate SV models with and without jumps. The Figure is constructed by considering the logarithms of the Bayes factors defined in (11) for the out-of-sample observations of our dataset. Thus, positive log-Bayes factor indicates that the predictions obtained with univariate SV models with jumps are more accurate than those of SV models without jumps. According to Kass and Raftery (1995) a value of the log-Bayes factor greater than provides strong evidence in favour of the model with jumps. The increasing nature of the log-Bayes factor reassures that as more data are collected and as more jumps are identified in stock returns, the evidence that the model with jumps outperforms the model without jumps increases. In Figure 7 we compare the predictive performance of our proposed SV with jumps model in which the jump intensities are modelled by using a dynamic factor model with SV models in which the jump intensities are independent over time and across stocks. The Figure compare the two models by presenting the logarithms of the Bayes factors in (15). Clearly, there is strong evidence that our proposed model outperforms the independent modelling of stock returns with a SV with jumps formulation.

## 6 Discussion

We have developed a general modelling framework together with carefully designed MCMC algorithms to perform Bayesian inference for SV models with Poisson-driven jumps. We have shown that for the data we applied our models there is evidence that, with respect to predictive Bayes factors, (i) univariate SV models with jumps outperform univariate SV models without jumps and (ii) models that jointly model jump intensities with a dynamic factor model outperform SV models with jumps that are applied independently across stocks. We feel that (ii) is an interesting result that adds considerable insight to the growing literature of modelling financial returns with jumps, adding to the observation by Aït-Sahalia et al. (2015) that there is indeed predictability in jump intensities.

There are various issues that have not been addressed in this paper. The interesting problem of choosing the number of factors has not been dealt with and left for future research. Other modelling aspects include the relaxing of independence of in (8) or the assumption that is lower diagonal as in the models proposed in Dellaportas and Pourahmadi (2012).

A series of interesting financial questions can be addressed with our models by exploiting the fact that panel of stock returns carry additional, possibly important, information. For example, in our dataset one can explore the effects of country and sector effects or could investigate whether the joint tail risk dependence does or does not change before, during and after the financial crisis in Europe. We feel that our proposed models together with our algorithmic guidelines will serve as useful tools for such future research pursuits in financial literature.

## Acknowledgements

The authors gratefully acknowledge the European Union (European Social Fund - ESF) and Greek national funds through the Operational Program ”Education and Lifelong Learning” of the National Strategic Reference Framework (NSRF) - Research Funding Program: ARISTEIA-LIKEJUMPS-436, and the Alan Turing Institute for EPSRC grant EP/N510129/1.

Supplementary material

## Choosing hyperparameters for the factor model

We have assumed that , and , and . Our aim is to specify the hyperparameters , and to be such that the resulting prior on has mode, mean and variance close to the corresponding values of the distribution which was used as prior in the case of modelling the jump intensities independently over time and across stocks.

From and we have that

(16) |

where we have set, see Section , . To specify and we note that equations imply

Thus, by assuming that we have

(17) |

From (16) we have that the density of is

(18) |

Therefore, the location of its mode satisfies the equation

(19) |

which can be solved, for given values of , and , with numerical methods. Here we use the R-package rootSolve (Soetaert and Herman, 2009) to find all the roots of the equation in (19).

Since our aim is to match , and the mode of the density in (18) with the corresponding quantities implied from the distribution we first note that if , right truncated at , then for we have that

(20) |

and that the mode of the distribution is located at zero. By noting that in our applications we have used latent factors and taking into account (17) we set and . Then, we have that which is close to the variance that we obtain for in the univariate case; see equation (20). To specify except from matching the means in (17) and (20) we are also trying to set the mode of the density in (18) close to zero. Table 2 displays the mean, the variance and the mode of (18) for three different values of . Note that by choosing we have that the mean and the variance of the density in (18) are quite close to the corresponding moments of the distribution while by taking smaller values for the mode in (18) becomes closer to zero.

Mean | Variance | Mode | |
---|---|---|---|

## Proof of Proposition 1

###### Proof.

A discrete distribution with pmf is called log-concave when the inequality is satisfied for all