
Copula estimation for nonsynchronous financial data

by   Arnab Chakrabarti, et al.

Copulas are a powerful tool for modelling multivariate data, and their merits have made copula modelling one of the most widely used approaches to financial data. We discuss the problem of modelling intraday financial data through copulas. The problem arises because intraday financial data are nonsynchronous, whereas estimating a copula requires synchronous observations. We show that this mismatch may lead to serious underestimation of the copula parameter. We propose a modification that yields a consistent estimator for elliptical copulas and significantly reduces the bias for general copulas.


1. Introduction

Correlation dynamics models have become an important aspect of theory and practice in finance. Correlation trading, a trading activity that exploits changes in the dependence structure of financial assets, and correlation risk, which captures the exposure to losses due to changes in correlation, have attracted the attention of many practitioners, see Krishnan et al. (2009). Needless to say, poor models of the dependence structure may lead to an unexpected collapse of the market for the security of interest. The dependence structure is also of paramount interest in a range of practical scenarios. For example, basket options are widely used although accurate pricing of a basket option is challenging. The primary reason is that they are cheaper to use for portfolio insurance, and the cost saving relies on the dependence structure between the assets, see Salmon et al. (2006). In the actuarial world, as shown in Embrechts et al. (2002), some Monte Carlo-based approaches to the joint modelling of risks, such as Dynamic Financial Analysis, depend heavily on the dependence structure. Frey and McNeil (2002) and Breymann et al. (2003) showed that the choice of model and correlation has a significant impact on the tail of the loss distribution and on measures of extreme risk. In this regard, modelling the dependence structure through a copula has proved its merit over the traditional estimate based on the simple product-moment correlation. Over the past few decades, copula models have become widely popular in the analysis of financial data. One of the many advantages of a copula is the flexibility it offers to model complex relationships between variables in a rather simple manner. A copula allows us to model the marginal distributions as necessary and takes care of the dependence structure separately. It is also one of the most important tools for modelling the probability of an extremely large negative (positive) return on one asset given that the other asset yielded an extremely large negative (positive) return, commonly known as tail dependence, see Xu (2008). Recently, a mixed-copula VaR model was proposed to measure portfolio risk accurately, and a novel investment strategy was developed by Yin et al. (2018). Copulas can also play a key role in estimating dynamic daily dependence, as shown in Grossmass and Poon (2015). Fengler and Okhrin (2016), on the other hand, used a realized copula calculated daily to obtain a time series of the copula parameter that helps to capture time-varying dependence. A test for structural breaks through copulas was developed by Rémillard (2010).

In this paper, we take a closer look at the application of copulas to capture dependence in bivariate intraday financial data. High-frequency intraday data is essential for calculating the (daily) integrated covolatility. The evaluation of intraday market risk is important for traders involved in frequent trading. Univariate market risk models for intraday data have been investigated by Giot (2005). When building a multivariate model, one of the problems encountered with intraday financial data is its nonsynchronous nature. The exact transaction times of the two stocks are likely to be independent of each other and are hardly ever observed synchronously. The effect of this asynchronicity, if not handled properly, can be quite serious. One such effect, reported by Epps (1979), is called the Epps effect. It refers to the empirical evidence that the estimated correlation between two stocks decreases as the sampling frequency increases. This is primarily a result of the asynchronicity of price observations and the existing lead-lag relation between asset prices, see Renò (2003). Moreover, the realized estimator based on nonsynchronous data can be biased and unreliable, as shown in Hayashi et al. (2005).

To avoid such problems, one needs to synchronize the data artificially. A common practice is to set a predetermined sampling frequency and a synchronous grid, see Wang and Zou (2014). The price at the time point just previous to a grid point is taken to be the price corresponding to that grid point. Synchronized data formed by this method is called previous tick data. In this paper, we show that synchronizing the data in this way can lead to serious underestimation of the copula parameter. We propose an alternative method that estimates the copula parameter consistently. The rest of the paper is organized as follows. In Section 2 we deal with elliptical copula parameter estimation for nonsynchronous data and prove the main theorem. The extension to more general copulas is given in Section 3. Sections 4 and 5 present the results of the simulation study and the real data analysis. We present the conclusions in Section 6. A brief introduction to copulas is provided in Appendix A.
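The previous-tick rule described above can be sketched as follows. This is a minimal illustration rather than code from the paper: the function name `previous_tick` and the convention that a transaction falling exactly on a grid point counts as already observed are assumptions made here.

```python
import numpy as np

def previous_tick(times, prices, grid):
    """Previous-tick synchronization (illustrative sketch).

    times  : sorted transaction times of one stock
    prices : prices observed at those times
    grid   : synchronous grid points
    Returns, for each grid point, the last price observed at or before it
    (NaN if no transaction has occurred yet).
    """
    times = np.asarray(times, dtype=float)
    prices = np.asarray(prices, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Index of the last transaction with time <= grid point (-1 if none).
    idx = np.searchsorted(times, grid, side="right") - 1
    out = np.full(grid.shape, np.nan)
    seen = idx >= 0
    out[seen] = prices[idx[seen]]
    return out

# Hypothetical toy data: four transactions, five grid points.
synced = previous_tick([0.4, 1.7, 2.1, 4.9], [100.0, 101.0, 99.0, 102.0],
                       [1.0, 2.0, 3.0, 4.0, 5.0])
```

Note that the grid points 3.0 and 4.0 both receive the price recorded at time 2.1, so the synchronized series no longer remembers when each price was actually observed.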

2. Copula and asynchronicity

Suppose there are two stocks whose prices are observed over time, and consider the corresponding log-returns. If the prices are assumed to follow the Black-Scholes model, then the log-returns have a Gaussian distribution and independent increments. Since the stylized facts of financial data suggest that the Gaussian distribution is not an appropriate modelling choice, a wider class of distributions needs to be considered. That is where the true importance of the copula lies.
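For reference, the Gaussian and independent-increment properties under the Black-Scholes model can be written out for a single stock. The symbols used here (price S_t, drift \mu, volatility \sigma, Brownian motion W_t) are notation assumed for this illustration and need not match the paper's own.

```latex
% Assumed notation: S_t price, \mu drift, \sigma volatility, W_t standard Brownian motion.
dS_t = \mu S_t\,dt + \sigma S_t\,dW_t
\quad\Longrightarrow\quad
\log\frac{S_t}{S_s} \sim \mathcal{N}\!\left(\Bigl(\mu - \tfrac{\sigma^2}{2}\Bigr)(t-s),\ \sigma^2 (t-s)\right),
\qquad s < t .
```

Log-returns over disjoint time intervals are independent, and their distribution depends on the interval only through its length t - s.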

In Section 4, the results of the simulation study are reported, showing the effect of asynchronicity on copula parameter estimation for the Black-Scholes model. The simulation results display serious underestimation. In this section the cause of this problem is explained and a remedy is suggested. In Section 2.1 we present the algorithm that pairs the data to make it suitable for computing the copula parameter estimator. The estimator is presented and its consistency is proved in Section 2.2. Finally, an estimator of the copula function is obtained in Section 2.3 and its consistency is proved.

2.1. Pairing Method

The prices of the stocks are observed at random times, when transactions take place. It is assumed that the observation times of the two stocks are independent point processes. Therefore, if the data on each stock are recorded together with their times of occurrence, the observation times of the first stock are independent of those of the second.

Before fitting a copula model, the data has to be paired so that the observations can be treated as synchronously observed. We call this the ‘pairing method’ rather than a ‘synchronizing method’ because we deliberately refrain from assigning an observation to a particular time point. Instead, we take into account the actual times of the observations. In contrast, conventional synchronizing methods such as generalized sampling times, which include previous tick sampling and refresh time sampling as special cases, aim to determine sampling times or grid times with some desirable properties. After the grid is determined, an observation pair is assigned to each time point in the grid. In this way no memory of the original time points is conserved.

The pairing method, to be followed throughout in this paper, is described through the following algorithm:

Algorithm ():

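1. Start from the first observation of each stock and set the pair counter to one.

2. If the current observation of the first stock is earlier than the current observation of the second stock, find the last observation of the first stock that is not later than the current observation of the second stock. The current pair consists of that observation of the first stock and the current observation of the second stock. Move the index of the first stock to that observation.

3. If the current observation of the first stock is later than the current observation of the second stock, find the last observation of the second stock that is not later than the current observation of the first stock. The current pair consists of the current observation of the first stock and that observation of the second stock. Move the index of the second stock to that observation.

4. Increase the pair counter and the index of each stock by one.
Repeat steps 2, 3 and 4.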

So now we have paired data. Instead of writing we shall henceforth write . Suppose total number of such pairs is .

Note that this pairing algorithm generates the same pairs as the refresh time sampling of Guo et al. (2017). The main difference is that the above algorithm keeps track of the transaction times of each pair, whereas the refresh time sampling algorithm focuses on creating a synchronous grid.
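The pairing can be sketched in a few lines. This is an illustrative reading of the algorithm based on its stated equivalence with refresh time sampling, not the authors' code; the function name `pair_observations` and the choice to return indices (so that both prices and actual transaction times can be looked up) are assumptions made here.

```python
import numpy as np

def pair_observations(t1, t2):
    """Refresh-time style pairing that keeps the actual transaction times.

    t1, t2 : sorted transaction times of the two stocks.
    Returns index arrays (i_idx, j_idx); pair k consists of observation
    i_idx[k] of stock 1 and observation j_idx[k] of stock 2.
    """
    t1 = np.asarray(t1, dtype=float)
    t2 = np.asarray(t2, dtype=float)
    i_idx, j_idx = [], []
    i = j = 0  # first not-yet-used observation of each stock
    while i < len(t1) and j < len(t2):
        # First time at which both stocks have a fresh observation.
        refresh = max(t1[i], t2[j])
        # Last observation of each stock at or before that time.
        i = int(np.searchsorted(t1, refresh, side="right")) - 1
        j = int(np.searchsorted(t2, refresh, side="right")) - 1
        i_idx.append(i)
        j_idx.append(j)
        i += 1
        j += 1
    return np.array(i_idx), np.array(j_idx)

# Hypothetical transaction times for two stocks.
t1 = np.array([1.0, 2.0, 5.0, 9.0])
t2 = np.array([3.0, 4.0, 6.0, 7.0, 8.0])
i_idx, j_idx = pair_observations(t1, t2)
paired_times = list(zip(t1[i_idx].tolist(), t2[j_idx].tolist()))
```

In this toy example the paired transaction times are (2, 3), (5, 4) and (9, 8): each pair keeps the two distinct observation times instead of collapsing them onto a common grid point.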

2.2. Estimation of Copula Parameter

We will show that in order to obtain a consistent estimator for the copula parameter, we need the paired observations, as well as the positioning of To see this, first suppose for some and . Here is the combined (ordered) time points at which a transaction (in any of the stocks) is noted. Then one of these four configurations is true:


See Figure 1 and Figure 2 for examples of first two configurations.

We define a random variable

, denoting the overlapping time interval of th interarrivals corresponding to and

For Figure 1, and for Figure 2, .

By we will denote the rectangle in formed by the interval i.e. .
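The overlap of two interarrival intervals is simple to compute once the end points are known. The helper below is a sketch with assumed names (`overlap_length` and its arguments do not come from the paper); it returns the length of the intersection whichever way the two intervals are arranged.

```python
def overlap_length(start1, end1, start2, end2):
    """Length of the intersection of the intervals (start1, end1] and (start2, end2]."""
    return max(0.0, min(end1, end2) - max(start1, start2))

# Hypothetical intervals: partially overlapping, nested, and disjoint.
partial = overlap_length(0.0, 3.0, 1.0, 4.0)
nested = overlap_length(0.0, 5.0, 1.0, 2.0)
disjoint = overlap_length(0.0, 1.0, 2.0, 3.0)
```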

Now we state our assumptions and the main theorem.

We make the following assumptions:

1. The associated copula is an elliptical copula.

2. The log return process has independent and stationary increments.

3. The observation times (arrival processes) of the two stocks are independent point processes and as .

4. Estimation is based on paired data obtained by algorithm .

5. with where and is a positive number.

Theorem 1.

Under the assumptions , defined below is a consistent estimator of the true copula parameter

where and being the sample mean and sample correlation coefficient based on the pairs.

Suppose and are two consecutive pairs of log-prices with their corresponding transaction times , . We denote as a bivariate rv with mean

, variances

and correlation . is independent of the arrival processes. Here we discuss two cases as examples. Figure 1 and Figure 2 illustrate two situations where two consecutive pairs of log-prices are and with their corresponding transaction times- , . First consider Figure 1. Define with . The conditional expectation:


and hence, Similarly, . Thus for the case of Figure 1,

Figure 1. Two consecutive pairs of log-returns are and with their corresponding transaction times ,
Figure 2. Two consecutive pairs of log-returns are and with their corresponding transaction times ,

For Figure 2, using similar calculations, we get

Now we formally write down the proof.


The conditional expectation:

This is a consequence of the assumption (and illustrated in the examples).



Therefore the estimate , defined as

where is a consistent estimator of because , and almost surely converge to , and respectively. This completes the proof. ∎
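The mechanism behind the underestimation, and the kind of correction the overlap suggests, can be checked numerically. The sketch below is not the paper's estimator, whose exact formula is not reproduced here. It assumes zero drift, unit volatilities, a Gaussian model and Poisson transaction times, and it rescales the sample correlation of the paired returns by the ratio of the mean overlap to the geometric mean of the mean interval lengths. Under independent stationary increments, the covariance of two paired returns given the transaction times is proportional to the length of the overlap, which motivates this ratio. All function and variable names are hypothetical.

```python
import numpy as np

def pair_indices(t1, t2):
    """Refresh-time style pairing; returns indices of the paired observations."""
    i_idx, j_idx = [], []
    i = j = 0
    while i < len(t1) and j < len(t2):
        refresh = max(t1[i], t2[j])
        i = int(np.searchsorted(t1, refresh, side="right")) - 1
        j = int(np.searchsorted(t2, refresh, side="right")) - 1
        i_idx.append(i)
        j_idx.append(j)
        i += 1
        j += 1
    return np.array(i_idx), np.array(j_idx)

def raw_and_corrected(x1, x2, s1, s2):
    """x1, x2: paired log-prices; s1, s2: their actual transaction times."""
    r1, r2 = np.diff(x1), np.diff(x2)            # paired log-returns
    len1, len2 = np.diff(s1), np.diff(s2)        # interarrival lengths
    overlap = np.clip(np.minimum(s1[1:], s2[1:]) - np.maximum(s1[:-1], s2[:-1]),
                      0.0, None)
    raw = np.corrcoef(r1, r2)[0, 1]
    factor = overlap.mean() / np.sqrt(len1.mean() * len2.mean())
    return raw, raw / factor

rng = np.random.default_rng(0)
rho, n = 0.8, 20000
# Independent Poisson transaction times for the two stocks (assumed model).
t1 = np.cumsum(rng.exponential(1.0, n))
t2 = np.cumsum(rng.exponential(1.0, n))
# Correlated Brownian log-prices simulated exactly on the merged time grid.
grid = np.union1d(t1, t2)
dt = np.diff(grid, prepend=0.0)
z = rng.standard_normal((2, grid.size))
b1 = np.cumsum(np.sqrt(dt) * z[0])
b2 = np.cumsum(np.sqrt(dt) * (rho * z[0] + np.sqrt(1.0 - rho**2) * z[1]))
p1 = b1[np.searchsorted(grid, t1)]               # stock 1 seen only at its own times
p2 = b2[np.searchsorted(grid, t2)]               # stock 2 seen only at its own times

i_idx, j_idx = pair_indices(t1, t2)
raw, corrected = raw_and_corrected(p1[i_idx], p2[j_idx], t1[i_idx], t2[j_idx])
print(f"true rho={rho:.2f}  raw={raw:.3f}  corrected={corrected:.3f}")
```

In this setting the raw sample correlation of the paired returns should come out well below the true value, while the rescaled version should land close to it. With non-zero drift, the sample means would also have to enter the correction.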

2.3. Convergence of copula function

The following theorem says that the copula based on Algorithm and parameter defined above, is uniformly convergent.

Theorem 2.

Under , the copula , with and being the empirical distribution functions of and on the margins of the paired data, and defined as in Theorem 1, converges uniformly to the true copula.


Note that . So is a consistent estimator for the copula . But ’s are unobserved, where ’s are actually observed. Let us use the notation and for the empirical distribution function of based on the observations and respectively. Therefore to claim that the estimated copula based on the paired data (observed) is consistent, we have to show that a.s.

Suppose . Note that by Assumption 5, . Note that,

Now from the above

The second inequality is due to Chebyshev’s inequality and the last equality is due to . The third inequality is a consequence of asynchronicity as, for each , there are at most two ’s (the preceding and the next) for which and are dependent.

Hence, by the Borel-Cantelli lemma, . Again we know . So, .

Now we have to show that uniform convergence will hold in this case. That is we want to show . For any given we have a finite partition of the real line such that . This can be achieved by taking and . Then, . Because if then by right continuity there exists a such that , hence contradicting the definition of . So between and , jumps at least . This can happen at most finite number of times, so . By our definition of , we have . Hence .
If then

Now using properties of copula we can clearly see that

As both and are uniformly convergent to and respectively, the result follows. ∎

Remark: The assumption can also be replaced by

with where and is a positive number.

3. Extension for more general copula

In this section, we extend our result to a more general class of copulas. As the argument in Section 2 is based entirely on the correlation coefficient, it cannot be extended directly to a larger class of copulas. One major drawback of elliptical copulas is their limited flexibility. For example, if the marginal distribution is to be modelled by a non-elliptical distribution, then the copula parameter cannot be estimated directly from the data. The method we now prescribe provides a solution to that problem.

Although for a general copula there is no direct relation between Pearson’s correlation coefficient and the copula parameter, in Appendix A, we saw that there is a relation between Kendall’s tau and the copula parameter. So we can study how Kendall’s tau is affected by asynchronicity. Theorem 3, in this section, will help us estimate Kendall’s tau for intraday financial data.
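As an illustration of how Kendall's tau maps to a copula parameter, the sketch below inverts three standard textbook relations: tau = (2/pi) arcsin(rho) for elliptical copulas, tau = theta/(theta + 2) for the Clayton copula, and tau = 1 - 1/theta for the Gumbel copula. The choice of families and the function names are assumptions made here and may differ from the examples given in the paper's appendix.

```python
import math

def elliptical_rho_from_tau(tau):
    """Invert tau = (2 / pi) * arcsin(rho) for an elliptical copula."""
    return math.sin(math.pi * tau / 2.0)

def clayton_theta_from_tau(tau):
    """Invert tau = theta / (theta + 2) for the Clayton copula (0 < tau < 1)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("Clayton inversion needs 0 < tau < 1")
    return 2.0 * tau / (1.0 - tau)

def gumbel_theta_from_tau(tau):
    """Invert tau = 1 - 1 / theta for the Gumbel copula (0 <= tau < 1)."""
    if not 0.0 <= tau < 1.0:
        raise ValueError("Gumbel inversion needs 0 <= tau < 1")
    return 1.0 / (1.0 - tau)

# The same value of Kendall's tau corresponds to different parameters per family.
tau = 0.5
params = (elliptical_rho_from_tau(tau),
          clayton_theta_from_tau(tau),
          gumbel_theta_from_tau(tau))
```

Because each map is monotone in tau, any bias in the estimate of Kendall's tau caused by asynchronicity carries over directly to the copula parameter.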

Figure 3.

As illustrated in Figure 3, suppose we have two non-overlapping interarrivals and for the first stock, with arrival times denoted by the triangles and two non-overlapping interarrivals and for the second stock, with arrival times denoted by the circles. The log returns corresponding to those interarrivals of the first stock are given by and . Similarly the log returns corresponding to the intervals of the second stock are denoted by (due to independent increment property) and . In the following section, we will focus on the two specific configurations (described in section 2).