# Non-Parametric Estimation of Spot Covariance Matrix with High-Frequency Data

Estimating spot covariance is an important issue to study, especially with the increasing availability of high-frequency financial data. We study the estimation of spot covariance using a kernel method for high-frequency data. In particular, we consider first the kernel weighted version of realized covariance estimator for the price process governed by a continuous multivariate semimartingale. Next, we extend it to the threshold kernel estimator of the spot covariances when the underlying price process is a discontinuous multivariate semimartingale with finite activity jumps. We derive the asymptotic distribution of the estimators for both fixed and shrinking bandwidth. The estimator in a setting with jumps has the same rate of convergence as the estimator for diffusion processes without jumps. A simulation study examines the finite sample properties of the estimators. In addition, we study an application of the estimator in the context of covariance forecasting. We discover that the forecasting model with our estimator outperforms a benchmark model in the literature.


## 1 Introduction

Spot covariance has important applications in studying the intraday patterns of the covariance process, co-jump tests (Bibinger and Winkelmann (2015)) and estimating parametric multivariate stochastic volatility models (Kanaya and Kristensen (2016)). Moreover, understanding covariance dynamics is crucial for effective portfolio choice, derivative pricing, and risk management. The availability of high-frequency intraday data of asset returns has given rise to several approaches for estimating integrated (co)variances and spot variances. While the literature proposes a few measures of integrated covariance, see e.g. Barndorff-Nielsen and Shephard (2004a), Hayashi and Yoshida (2011), the literature on empirical approaches and statistical theory for estimating spot covariances with high-frequency data is sparse.

In this paper, we consider the nonparametric filtering of spot covariance with high-frequency financial data. Our study is at the intersection of two strands of literature. The first strand is on estimating integrated covariance matrices over a fixed period. This topic has been studied extensively in high-frequency econometrics. For example, the highly celebrated paper by Barndorff-Nielsen and Shephard (2004a) makes important contributions to the use of realized covariance to estimate the integrated covariance matrix in a setup without market microstructure noise. The quasi-maximum likelihood estimator by Aït-Sahalia et al. (2010), the multivariate pre-averaging estimator by Christensen et al. (2013) and the two-scale estimator by Zhang (2011) are robust to microstructure noise. However, none of the above-mentioned realized covariance estimators accounts for jumps in the underlying price process.

The second strand focuses on spot volatility estimation, for which several approaches have been proposed. Foster et al. (1996) were the first to introduce spot volatility estimators, in the form of rolling and sampling filters. Later, kernel-type estimators were introduced in Fan and Wang (2008) and Kristensen (2010). These estimators of spot variance neglect microstructure noise and jumps. Examples of spot variance estimators that account for microstructure noise include Zu and Boswijk (2014), Bos et al. (2012) and Mykland and Zhang (2008). Yu et al. (2014) extend the kernel spot volatility estimator of Kristensen (2010) to the case when the underlying price process has jumps.

The estimation of the spot covariance matrix is, however, the area that has been studied the least. For a multi-dimensional continuous semimartingale log-asset price process, Bibinger et al. (2017) propose an estimator for spot covariance which is constructed from a local average of block-wise parametric spectral covariance estimates. Aiming to fill this gap in the literature, we study spot covariance estimation for both continuous and discontinuous semimartingales.

Our contribution is as follows. First, for a setup without jumps, we study the asymptotic properties of the kernel covariance estimator, which was mentioned in Kristensen (2010) as an extension to the multivariate case and was left for future research. Second, we propose the threshold kernel covariance estimator for the case when the underlying price process is a discontinuous semimartingale with finite activity jumps. We derive the asymptotic distribution of this estimator for a fixed bandwidth. The estimator is an extension to the multivariate case of the threshold kernel volatility estimator proposed by Yu et al. (2014). Third, we conduct numerical studies to examine the finite sample properties of both estimators. Finally, we study an application of the kernel estimator in the context of covariance forecasting.

In a setup without jumps, the estimator is a kernel-weighted version of the standard integrated covariance estimator, which depends on a kernel function and the choice of bandwidth. It can be regarded as a kernel regression in the time domain. The bandwidth choice allows us to focus on the covariance behavior at specific points in time, and to give different weights to the covariance matrix over the window used. As the bandwidth shrinks to zero, the spot covariance can be extracted. We establish asymptotic normality of the estimator for both fixed and shrinking bandwidth. The proofs are component-wise and rely on the techniques of Barndorff-Nielsen and Shephard (2004a) and Kristensen (2010). We first derive the mean and covariance of the estimator. We then derive the asymptotic distribution by employing a central limit theorem for triangular arrays and the Cramér-Wold device. We also prove asymptotic normality of the threshold kernel estimator with fixed bandwidth. In the proof of this theorem we combine our results from the first theorem with techniques from Yu et al. (2014) and employ the Cramér-Wold device. In the simulation study we examine the finite sample properties of both estimators using the integrated mean squared error and the integrated bias performance measurements.

Both estimators converge at the rate $\sqrt{\delta^{-1}}$ for a fixed bandwidth (see Theorems 1 and 3). The local method of moments estimator of the spot covariances of Bibinger et al. (2017) attains a slower optimal rate of convergence. However, it should be noted that this is because Bibinger et al. (2017) consider the setting with market microstructure noise, whereas we target the complementary case with jumps. The kernel and threshold kernel covariance estimators are fairly easy to implement.

In terms of applications of this kernel covariance estimator, considerable effort has been put into covariance forecasting, see e.g. Alexander (2008), Andersen et al. (2013). Multivariate GARCH models are a standard tool used in modelling and forecasting covariances. However, more recent studies propose models based on high-frequency data and option-implied data. In a comprehensive empirical study by Symitsi et al. (2018), several approaches to covariance forecasting are compared based on statistical and economic criteria. The authors conclude that models based on high-frequency data offer a clear advantage in terms of statistical accuracy. In particular, a Vector Heterogeneous Autoregressive (VHAR) model achieves the best performance amongst the competing models. The VHAR model is a linear combination of past daily, weekly and monthly realized covariance estimators of Barndorff-Nielsen and Shephard (2004a).

Motivated by this, we use the VHAR model to forecast covariance, but instead of the realized covariance estimator we use the newly proposed kernel covariance estimator. We further show that with the VHAR model the kernel covariance estimator outperforms the benchmark realized covariance estimator in all three measures of accuracy: the Euclidean loss function, the Frobenius distance and the multivariate quasi-likelihood loss function.

The paper is structured as follows. In Section 2.1 we review the theoretical setup of the problem and the kernel covariance estimator, which was proposed in Kristensen (2010) and left for future research. In Section 2.2 we study the asymptotic properties of the estimator for a fixed and for a small (tending to zero) bandwidth. In Section 3 we introduce the setup with jumps, propose the estimator for the jump case and derive its asymptotic distribution. In Section 4 we conduct Monte Carlo simulations and investigate the finite sample properties of both estimators. In Section 5 we present an application of the estimator in the context of covariance forecasting. Finally, in Section 6 we summarise our findings.

## 2 Kernel Covariance Estimation

### 2.1 Theoretical Setup and the Kernel Covariance Estimator

In this section we start by considering a multidimensional continuous semimartingale, describe the theoretical setup and review the kernel covariance estimator in Kristensen (2010). Our aim is to accurately estimate the spot covariance matrix of a fixed $d$-dimensional log-price process $X$. We assume that $X$ follows a continuous semimartingale

$$X(t) = X(0) + \int_0^t \mu(s)\,ds + \int_0^t \theta(s)\,dW(s), \qquad t \in [0,T], \tag{1}$$

defined on a filtered probability space, with an initial condition $X(0)$, the drift vector $\mu$, the $d$-dimensional standard Brownian motion $W$ and the instantaneous volatility matrix $\theta$, which has elements that are all càdlàg. The latter yields the $d \times d$-dimensional spot covariance matrix $\Sigma(t) = \theta(t)\theta^\top(t)$, which is our object of interest. The integrated covariance matrix is $\int_0^T \Sigma(s)\,ds$. We consider the finite and fixed time horizon $[0,T]$ with high-frequency discrete observations of the realization of the $k$-th asset, with $k = 1,\dots,d$. For an arbitrary partition $0 = t_0 < t_1 < \dots < t_n = T$ of the interval $[0,T]$ we require that $\max_i (t_i - t_{i-1})$ approaches zero under the asymptotic limit. For simplicity, we consider the case of equally spaced and synchronous observation times. We denote $\delta = T/n$, so that $t_i = i\delta$ for $i = 0,\dots,n$.

A kernel $K$ is a non-negative integrable function satisfying the following condition: $\int_{\mathbb{R}} K(z)\,dz = 1$. The kernel weighted measure of the integrated covariance, which is an extension of the measure of the integrated variance introduced in Kristensen (2010), is of the following form

$$KCV(\tau) = \int_0^T K_h(s-\tau)\,\Sigma(s)\,ds, \tag{2}$$

where the function $K_h$ is given by $K_h(z) = K(z/h)/h$, $K$ satisfies $\int_{\mathbb{R}} K(z)\,dz = 1$, and $h$ is the fixed bandwidth. $KCV(\tau)$ delivers a kernel weighted average of the quadratic covariation.

An estimator of the integrated covariance in equation (2) is the kernel smoothed sample average of the increments, which was mentioned in Kristensen (2010) as an extension of the univariate case and was left for future research:

$$\widehat{KCV}(\tau) = \sum_{i=1}^{n} K_h(t_{i-1}-\tau)\,\Delta X(t_{i-1})\,\Delta X^\top(t_{i-1}), \tag{3}$$

where $\Delta X(t_{i-1}) = X(t_i) - X(t_{i-1})$ is the $d$-dimensional vector ($d$ is fixed) of the increments of the process $X$ over the time interval $[t_{i-1}, t_i]$. As demonstrated above, for a fixed $h$, $KCV(\tau)$ gives a weighted measure of the integrated covariance. However, as $h \to 0$, the instantaneous covariance can be recovered at any point $\tau$ of continuity of $\Sigma$:

$$\Sigma(\tau) = \lim_{h \to 0} KCV(\tau). \tag{4}$$

To emphasize that we are working with an estimator of the instantaneous covariance at time $\tau$, we shall denote:

$$\hat\Sigma(\tau) = \sum_{i=1}^{n} K_h(t_{i-1}-\tau)\,\Delta X(t_{i-1})\,\Delta X(t_{i-1})^\top. \tag{5}$$

Note that $\hat\Sigma(\tau)$ can be regarded as a Nadaraya-Watson estimator. An overview of this type of kernel estimator can be found in Silverman (1986). In the univariate case, i.e. when $d = 1$, we recover the spot variance estimator from Kristensen (2010).
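To make the definition concrete, the following is a minimal sketch of the estimator in equation (5), assuming a Gaussian kernel, equally spaced synchronous observations and $K_h(z) = K(z/h)/h$. The function names (`gaussian_kernel`, `kernel_spot_cov`) and the toy data (a Brownian motion with constant covariance) are illustrative assumptions, not part of the paper.

```python
import numpy as np

def gaussian_kernel(z):
    """Standard Gaussian density; integrates to one over the real line."""
    return np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)

def kernel_spot_cov(X, times, tau, h, kernel=gaussian_kernel):
    """Kernel-weighted sum of outer products of increments, as in equation (5).

    X     : (n + 1, d) array of log-prices observed at `times`
    times : (n + 1,) array of observation times t_0 < ... < t_n
    tau   : time at which the spot covariance is estimated
    h     : bandwidth
    """
    dX = np.diff(X, axis=0)                        # row i-1 holds X(t_i) - X(t_{i-1})
    weights = kernel((times[:-1] - tau) / h) / h   # K_h(t_{i-1} - tau)
    return (weights[:, None] * dX).T @ dX          # sum_i K_h(.) dX dX^T

# Toy data (assumption): Brownian motion with a constant covariance matrix,
# so the spot covariance equals Sigma_true at every time point.
rng = np.random.default_rng(0)
n, T = 20_000, 1.0
times = np.linspace(0.0, T, n + 1)
Sigma_true = np.array([[0.04, 0.01], [0.01, 0.09]])
chol = np.linalg.cholesky(Sigma_true)
dW = rng.standard_normal((n, 2)) * np.sqrt(T / n)
X = np.vstack([np.zeros(2), np.cumsum(dW @ chol.T, axis=0)])

Sigma_hat = kernel_spot_cov(X, times, tau=0.5, h=0.1)
print(Sigma_hat)
```

The bandwidth `h=0.1` is picked by hand here; the simulation study in Section 4 selects it by cross-validation instead.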

### 2.2 Asymptotic Properties of the Kernel Covariance Estimator

In this section we state the necessary assumptions and present two of the three main results of the paper. Theorem 1 derives the asymptotic distribution of the kernel covariance estimator for a fixed bandwidth. Theorem 2 proves asymptotic normality of the kernel covariance estimator for a bandwidth tending to zero. Throughout our work we shall consider the following set of assumptions:

###### Assumption 1.

The processes $\mu$ and $\theta$ are jointly independent of $W$.

This assumption holds for widely used stochastic volatility models, such as Heston (1993) and Hull and White (1987). Assumption 1 greatly facilitates the proof by allowing us to make all arguments conditional on $\mu$ and $\theta$. Under Assumption 1, the volatility process being independent of $W$, the model falls into the case without leverage effects. However, this assumption does not appear to be strictly necessary, as demonstrated in Kanaya and Kristensen (2016).

###### Assumption 2.

For any sequences , with and every , as

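$$\delta \sum_{i=1}^{n} \left| \mu_k^2(s_i) - \mu_k^2(t_i) \right| = o(1), \qquad \delta \sum_{i=1}^{n} \left| \Omega(s_i) - \Omega(t_i) \right| = o(1), \tag{6}$$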

where

Assumption 2 imposes a restriction on the local behavior of the mean and covariance processes. It allows for deterministic patterns, jumps, and nonstationarity, and is automatically satisfied when the mean and volatility processes have continuous trajectories. In particular, standard diffusion models such as Heston (1993) and Hull and White (1987) satisfy this assumption.

###### Assumption 3.

For every and the quantities

$$\delta^{-1} \int_{t_{i-1}}^{t_i} \Sigma_{kk}(s)\,ds \tag{7}$$

are bounded away from 0 and infinity uniformly in .

Equation (7) in Assumption 3 essentially means that, on any bounded interval, $\Sigma_{kk}$ itself is bounded away from infinity. This is the case, for example, for the Cox-Ingersoll-Ross (CIR) and Ornstein-Uhlenbeck (OU) processes in Cox et al. (1985) and Uhlenbeck and Ornstein (1930) respectively. The above-mentioned assumptions are sufficient to derive the asymptotic distribution of $\widehat{KCV}(\tau)$; however, in order to obtain the asymptotics of $\hat\Sigma(\tau)$ when $h \to 0$, a general smoothness condition needs to be imposed on the covariance process.

###### Assumption 4.

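The space $C^{m,\gamma}[0,T]$ for some $m \ge 0$ and $0 < \gamma < 1$ consists of functions $f$ that are $m$ times differentiable with the $m$-th derivative $f^{(m)}$ satisfying

$$\left| f^{(m)}(t+\delta) - f^{(m)}(t) \right| \le L(t,|\delta|)\,|\delta|^{\gamma} + o(|\delta|^{\gamma}), \qquad \delta \to 0 \quad (\text{a.s.}), \tag{8}$$

where $L(t,\cdot)$ is the Lipschitz coefficient, a slowly varying function at zero, and $t \mapsto L(t,0)$ is continuous. The mapping $t \mapsto \Sigma_{kl}(t)$ for $k,l = 1,\dots,d$ lies in $C^{m,\gamma}[0,T]$ for some $m$ and $\gamma$.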

As stated in Yu et al. (2014), this condition is satisfied by commonly used diffusion processes. Assumption 4 holds with $m = 0$ and $\gamma < 1/2$ when the model is driven by a Brownian motion (see e.g. Revuz and Yor (1998, ch. 5)).

We also impose requirements on the kernel function:

###### Assumption 5.

The kernel

1. satisfies and is continuously differentiable, i.e. , such that

   $$\bar K_z := \sup_{0 \le u \le T} \left| K^{(z)}(u) \right| < \infty, \qquad z = 0, 1.$$

2. satisfies the condition that there exist some constants and such that , and for some , for , .

3. satisfies , and , for some .

The assumptions above are satisfied by most standard kernels for . When , is called a higher-order kernel. If as well, the higher-order kernels can be used to reduce the bias in the estimation of more than twice differentiable functions. Although, as mentioned in Kristensen (2010), since is a usual case, Cline and Hart (1991) demonstrated that higher-order kernels can potentially reduce bias even when the object of interest is non-smooth and has jumps.

Now we are ready to derive the asymptotics of the kernel covariance estimator for a fixed bandwidth.

###### Theorem 1.

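If Assumptions 1-5 hold, we have that for fixed $h$ and any $\tau$

$$\sqrt{\delta^{-1}} \left\{ \widehat{KCV}(\tau) - \int_0^T K_h(s-\tau)\,\Sigma(s)\,ds \right\} \xrightarrow{L} N\!\left( 0, \int_0^T K_h^2(s-\tau)\,\Omega(s)\,ds \right), \tag{9}$$

where $\Omega(t)$ is a $d \times d \times d \times d$ array with elements

$$\Omega(t) = \left\{ \Sigma_{kk'}(t)\Sigma_{ll'}(t) + \Sigma_{kl'}(t)\Sigma_{lk'}(t) \right\}_{k,k',l,l'=1,\dots,d}. \tag{10}$$

###### Proof.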

We give the proof in several steps. First we derive the means, variances and covariances of the variates

$$\begin{aligned} \widehat{KCV}_{kl}(\tau) &= \sum_{i=1}^{n} K_h(t_{i-1}-\tau)\,\Delta X_k(t_{i-1})\,\Delta X_l(t_{i-1}) \\ &= \sum_{i=1}^{n} K_h(t_{i-1}-\tau)\,\big(X_k(t_i) - X_k(t_{i-1})\big)\big(X_l(t_i) - X_l(t_{i-1})\big), \end{aligned}$$

with $k,l = 1,\dots,d$. Second, Theorem 1 is proved for the case where the mean processes are identically zero, by employing the Cramér-Wold device. Finally, this restriction is lifted and, using Lemma 5 in Appendix D, the negligibility of a non-zero drift term is shown. The proof is component-wise and based on the results and techniques employed by Barndorff-Nielsen and Shephard (2004a) and Kristensen (2010). See Appendix A for the details of the proof. ∎

This theorem is an intermediate step in the derivation of the asymptotic distribution of the estimator for a shrinking bandwidth: Theorem 1 is needed for the proof of the asymptotic normality of the spot kernel covariance estimator in (5).

###### Theorem 2.

If Assumptions 1-5 hold with , then as and for any we have

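$$\sqrt{\delta^{-1}h}\, \left\{ \hat\Sigma(t) - \Sigma(t) \right\} \xrightarrow{L} N\!\left( 0, \Omega(t) \int_{\mathbb{R}} K^2(z)\,dz \right) \tag{11}$$

where $\Omega(t)$ is a $d \times d \times d \times d$ array with elements

$$\Omega(t) = \left\{ \Sigma_{kk'}(t)\Sigma_{ll'}(t) + \Sigma_{kl'}(t)\Sigma_{lk'}(t) \right\}_{k,k',l,l'=1,\dots,d}. \tag{12}$$

###### Proof.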

See Appendix B. ∎

Bibinger et al. (2017) propose a spot covariance estimator constructed from local averages of block-wise parametric spectral covariance estimates. This is an extension of the local method of moments (LMM) in Bibinger and Reiss (2014). Since Bibinger et al. (2017) consider a setting with market microstructure noise, the optimal rate of convergence attained by their estimator is slower than the convergence rate of the kernel covariance estimator. The kernel estimator in equation (5) is fairly easy to implement.

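It is helpful to focus on the bivariate case in order to gain further understanding. We will look at the results for the assets $k$ and $l$, whose log-prices will be written as $X_k$ and $X_l$ respectively. Then the high-frequency returns at time $t_i$ are

$$\Delta X_k(t_i) = X_k(t_i) - X_k(t_{i-1}) \quad \text{and} \quad \Delta X_l(t_i) = X_l(t_i) - X_l(t_{i-1}) \quad \text{for } i = 1,\dots,n.$$

In order to avoid the symmetric replication in the covariation matrix we employ a half-vectorization, or alternatively, a vech transformation. The half-vectorization of a symmetric matrix is obtained by vectorizing only the lower triangular part of the matrix (see Kollo and Rosen (2005), Lütkepohl (1996)). In this case Theorem 1 tells us that the joint asymptotic distribution of the identifying elements of the realized covariation of the two assets $k$ and $l$ becomes

$$\sqrt{\delta^{-1}} \begin{pmatrix} \sum_{i=1}^{n} K_h(t_{i-1}-\tau)\,\Delta X_k^2(t_i) - \int_0^T K_h(s-\tau)\,\Sigma_{kk}(s)\,ds \\ \sum_{i=1}^{n} K_h(t_{i-1}-\tau)\,\Delta X_k(t_i)\,\Delta X_l(t_i) - \int_0^T K_h(s-\tau)\,\Sigma_{kl}(s)\,ds \\ \sum_{i=1}^{n} K_h(t_{i-1}-\tau)\,\Delta X_l^2(t_i) - \int_0^T K_h(s-\tau)\,\Sigma_{ll}(s)\,ds \end{pmatrix} \xrightarrow{L} N\!\left( 0, \int_0^T K_h^2(s-\tau) \begin{pmatrix} 2\Sigma_{kk}^2(s) & 2\Sigma_{kk}(s)\Sigma_{kl}(s) & 2\Sigma_{kl}^2(s) \\ 2\Sigma_{kk}(s)\Sigma_{kl}(s) & \Sigma_{kk}(s)\Sigma_{ll}(s) + \Sigma_{kl}^2(s) & 2\Sigma_{ll}(s)\Sigma_{kl}(s) \\ 2\Sigma_{kl}^2(s) & 2\Sigma_{ll}(s)\Sigma_{kl}(s) & 2\Sigma_{ll}^2(s) \end{pmatrix} ds \right).$$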
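The entries of the limiting covariance in the bivariate case follow from specializing the array in equation (10) to the three distinct elements of the covariance matrix. The sketch below builds that matrix numerically for a given spot covariance; the helper name `omega_vech` and the numerical values of `Sigma` are illustrative assumptions.

```python
import numpy as np

def omega_vech(Sigma):
    """Array (10) restricted to the lower-triangular (vech) index pairs.

    The entry for the pairs (k, l) and (k', l') is
    Sigma[k, k'] * Sigma[l, l'] + Sigma[k, l'] * Sigma[l, k'].
    """
    d = Sigma.shape[0]
    # vech ordering: stack the columns of the lower triangle, as (row, column)
    pairs = [(k, l) for l in range(d) for k in range(l, d)]
    m = len(pairs)
    out = np.empty((m, m))
    for a, (k, l) in enumerate(pairs):
        for b, (kp, lp) in enumerate(pairs):
            out[a, b] = Sigma[k, kp] * Sigma[l, lp] + Sigma[k, lp] * Sigma[l, kp]
    return out

# Hypothetical spot covariance of two assets at a single time point
Sigma = np.array([[0.04, 0.01], [0.01, 0.09]])
Omega_biv = omega_vech(Sigma)
print(Omega_biv)
```

For two assets the ordering of the rows and columns is (variance of the first asset, covariance, variance of the second asset), so the diagonal holds twice the squared variances and the product-plus-squared-covariance term in the middle.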

## 3 Jump Case: Threshold Kernel Covariance Estimation

In this section we assume that the price process is governed by a discontinuous semimartingale with finite activity jumps. We propose a threshold kernel spot covariance estimator, which is an extension of the threshold kernel spot volatility estimator in Yu et al. (2014) to the multivariate case. Theorem 3 derives the asymptotic distribution of the threshold kernel covariance estimator for a fixed bandwidth.

Consider a filtered probability space. Let the $d$-dimensional (with fixed $d$) log-price $X$ be defined on this space and satisfy the following stochastic differential equation:

$$dX(t) = \mu(t)\,dt + \theta(t)\,dW(t) + dJ(t), \qquad t \in [0,T], \tag{13}$$

where $\mu$ is the drift vector, $\theta$ is the instantaneous volatility matrix, $W$ is the $d$-dimensional standard Brownian motion and $J$ is a compound Poisson process with finite activity of jumps. Its jump times are generated by a homogeneous Poisson process with constant intensity, and its jump sizes form a sequence of i.i.d. random variables with values in $\mathbb{R}^d$, each denoting the jump size at the corresponding jump location. We assume the jump sizes are i.i.d. and independent of the Poisson process. Denote the $d \times d$-dimensional spot covariance matrix by $\Sigma(t) = \theta(t)\theta^\top(t)$.

Suppose that on a finite and fixed time horizon $[0,T]$, we have high-frequency discrete observations of the realization of the $k$-th asset, with $k = 1,\dots,d$. Here, $0 = t_0 < t_1 < \dots < t_n = T$ is an arbitrary partition of the interval $[0,T]$. Although the observations are not necessarily equidistant, we require that $\max_i (t_i - t_{i-1})$ approaches zero under the asymptotic limit. We consider the case of equally spaced and synchronous observation times, though this assumption can easily be lifted. Denote $\delta = T/n$, so that $t_i = i\delta$ for $i = 0,\dots,n$.

The quantity of interest is the spot covariance matrix $\Sigma(\tau)$. The threshold kernel covariance estimator, denoted by $\widehat{TCV}(\tau)$, is defined as

$$\widehat{TCV}(\tau) = \sum_{i=1}^{n} K_h(t_{i-1}-\tau)\,\Delta X(t_{i-1})\,\Delta X^\top(t_{i-1})\,\mathbb{1}_{\{\|\Delta X(t_{i-1})\| \le d\,r(\delta)\}}, \tag{14}$$

where $\mathbb{1}$ is the indicator function and $\Delta X(t_{i-1})$ is the $d$-dimensional vector of increments of the process $X$ over the time interval $[t_{i-1}, t_i]$. The function $K_h$ is given by $K_h(z) = K(z/h)/h$, where $h$ is the bandwidth and the kernel function $K$ satisfies $\int_{\mathbb{R}} K(z)\,dz = 1$. The threshold function $r(\delta)$ is a deterministic function of the step length $\delta$. As the bandwidth $h \to 0$ we recover the spot covariance. The threshold function has to vanish more slowly than the modulus of continuity of the Brownian motion in order to have convergence in probability. Thus we have the following additional assumption.

###### Assumption 6.

is a deterministic function of the step length such that and .
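A minimal sketch of the threshold estimator in equation (14) is given below, assuming a Gaussian kernel and an equally spaced grid. The threshold choice $r(\delta) = \delta^{0.4}$, the function names and the toy data (a constant-covariance Brownian motion with one injected jump) are illustrative assumptions; the paper only requires a threshold function satisfying Assumption 6.

```python
import numpy as np

def gaussian_kernel(z):
    """Standard Gaussian density; integrates to one over the real line."""
    return np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)

def threshold_kernel_cov(X, times, tau, h, r, kernel=gaussian_kernel):
    """Equation (14): kernel-weighted outer products of increments, keeping only
    increments whose Euclidean norm is at most d * r(delta)."""
    n, d = X.shape[0] - 1, X.shape[1]
    delta = (times[-1] - times[0]) / n            # equally spaced grid assumed
    dX = np.diff(X, axis=0)
    keep = np.linalg.norm(dX, axis=1) <= d * r(delta)
    weights = kernel((times[:-1] - tau) / h) / h  # K_h(t_{i-1} - tau)
    return ((weights * keep)[:, None] * dX).T @ dX

# Toy data (assumption): constant-covariance Brownian motion plus one large jump
rng = np.random.default_rng(2)
n, T = 20_000, 1.0
times = np.linspace(0.0, T, n + 1)
Sigma_true = np.array([[0.04, 0.01], [0.01, 0.09]])
chol = np.linalg.cholesky(Sigma_true)
dW = rng.standard_normal((n, 2)) * np.sqrt(T / n)
X = np.vstack([np.zeros(2), np.cumsum(dW @ chol.T, axis=0)])
X[n // 2 + 1:] += np.array([0.5, -0.5])           # jump located at tau = 0.5

r = lambda delta: delta**0.4                      # hypothetical threshold function
tcv = threshold_kernel_cov(X, times, tau=0.5, h=0.1, r=r)
naive = threshold_kernel_cov(X, times, tau=0.5, h=0.1, r=lambda delta: np.inf)
print(tcv)
print(naive)
```

Passing an infinite threshold switches the truncation off, which makes it easy to see how a single jump near the evaluation point inflates the untruncated estimate.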

We now can derive the asymptotics of the threshold kernel covariance estimator.

###### Theorem 3.

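If Assumptions 1-6 hold, we have that for fixed $h$ and any $\tau$

$$\sqrt{\delta^{-1}} \left\{ \widehat{TCV}(\tau) - \int_0^T K_h(s-\tau)\,\Sigma(s)\,ds \right\} \xrightarrow{L} N\!\left( 0, \int_0^T K_h^2(s-\tau)\,\Omega(s)\,ds \right), \tag{15}$$

where $\Omega(t)$ is a $d \times d \times d \times d$ array with elements

$$\Omega(t) = \left\{ \Sigma_{kk'}(t)\Sigma_{ll'}(t) + \Sigma_{kl'}(t)\Sigma_{lk'}(t) \right\}_{k,k',l,l'=1,\dots,d}. \tag{16}$$

###### Proof.

See Appendix C. ∎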

The threshold kernel covariance estimator in equation (14) is an extension of the threshold kernel estimator of the time-dependent spot volatility in Yu et al. (2014) to the multivariate case. In Theorem 3 we derive the asymptotic distribution of the estimator for a fixed bandwidth of the kernel. A similar result was obtained for the univariate case in Yu et al. (2014).

## 4 Simulation Study

In this section we examine the performance of the kernel and threshold kernel covariance estimators. In particular, we investigate the finite-sample performance of the estimators relative to the time distance between observations. Throughout we work with a bivariate stochastic volatility model. First, we examine the kernel covariance estimator in a setup without jumps and assume that the asset prices, $Y$, follow the Heston model:

$$dY(t) = \mu Y(t)\,dt + \theta(t)Y(t)\,dW(t), \qquad \Sigma(t) = \theta(t)\theta^\top(t), \tag{17}$$

where

$$\Sigma(t) = \begin{pmatrix} \Sigma_{11}(t) & \Sigma_{12}(t) \\ \Sigma_{12}(t) & \Sigma_{22}(t) \end{pmatrix} = \begin{pmatrix} \sigma_1^2(t) & \sigma_{1,2}(t) \\ \sigma_{1,2}(t) & \sigma_2^2(t) \end{pmatrix} \tag{18}$$

with the covariance , the drift vector and a standard two dimensional Brownian motion such that . The variance processes, for , follow the CIR model Cox et al. (1985):

$$d\sigma_i^2(t) = \kappa_i\big(\theta_i - \sigma_i^2(t)\big)\,dt + \eta_i\,\sigma_i(t)\,dZ_i(t). \tag{19}$$

We set the correlation between each asset and its volatility process to zero in order for Assumption 1 to hold. The remaining data generating parameters are chosen to match the estimated parameter values in Barndorff-Nielsen and Shephard (2002). In our simulation we set (48 hours). We consider frequencies corresponding to sampling every 5 seconds, 20 seconds, 1 minute, 5 minutes and 10 minutes. In order to simulate the data using model (17) we employ the Euler discretization scheme from Kloeden and Platen (1999, ch.14). We simulate one trajectory of each for and keep them fixed. Then we run 500 Monte Carlo repetitions for prices of two assets . In each repetition we compute for based on sampling frequencies.
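The simulation design can be sketched with an Euler scheme as follows. Several choices here are assumptions made for illustration only: the scheme is applied to the log-prices (the Itô transform of model (17)), negative variances are handled by full truncation, the covariance is taken to be $\rho\,\sigma_1(t)\sigma_2(t)$ with a constant correlation $\rho$, and all parameter values are hypothetical rather than the ones used in the paper.

```python
import numpy as np

def simulate_bivariate_heston(n, T, mu, kappa, theta_bar, eta, rho, v0, seed=0):
    """Euler scheme for two log-prices with CIR variances as in equation (19).

    Returns the log-price path X of shape (n + 1, 2) and the spot covariance
    path Sigma of shape (n + 1, 2, 2). The Brownian motions driving the
    variances are independent of those driving the prices (no leverage).
    """
    rng = np.random.default_rng(seed)
    delta = T / n
    v = np.empty((n + 1, 2))
    v[0] = v0
    X = np.zeros((n + 1, 2))
    Sigma = np.empty((n + 1, 2, 2))
    for i in range(n + 1):
        vp = np.maximum(v[i], 0.0)              # full truncation keeps sqrt real
        s = np.sqrt(vp)
        cov = rho * s[0] * s[1]                 # assumed form of the covariance
        Sigma[i] = [[vp[0], cov], [cov, vp[1]]]
        if i == n:
            break
        dZ = rng.standard_normal(2) * np.sqrt(delta)   # drives the variances
        dW = rng.standard_normal(2) * np.sqrt(delta)   # drives the prices
        v[i + 1] = v[i] + kappa * (theta_bar - vp) * delta + eta * s * dZ
        # theta is a lower-triangular factor with theta @ theta.T == Sigma[i]
        theta = np.array([[s[0], 0.0],
                          [rho * s[1], np.sqrt(1.0 - rho**2) * s[1]]])
        X[i + 1] = X[i] + (mu - 0.5 * vp) * delta + theta @ dW
    return X, Sigma

# Hypothetical parameter values, chosen only so that the example runs
n, T = 2_000, 1.0
X, Sigma = simulate_bivariate_heston(
    n, T,
    mu=np.array([0.03, 0.03]),
    kappa=np.array([5.0, 4.0]),
    theta_bar=np.array([0.04, 0.09]),
    eta=np.array([0.3, 0.4]),
    rho=0.5,
    v0=np.array([0.04, 0.09]),
)
print(X[-1], Sigma[-1])
```

Keeping the variance paths fixed across Monte Carlo repetitions, as described above, would amount to drawing `dZ` once and redrawing only `dW`; the sketch redraws both for brevity.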

Three different estimators of instantaneous covariance: Gaussian kernel estimator, one-sided kernel estimator and beta kernel estimator are implemented. For all three estimators cross-validation was used to select the bandwidth (see Kristensen (2010)). We used the following integrated squared error (ISE) as the goodness-of-fit criterion:

$$ISE(h) = \int_{t_l}^{t_u} \left( \Sigma_{kl}(s) - \hat\Sigma_{kl}(s) \right)^2 ds, \qquad \text{for } 0 \le t_l < t_u \le T, \tag{20}$$

where $\Sigma_{kl}(s)$ and $\hat\Sigma_{kl}(s)$ for $k,l = 1,2$ are the true and the estimated spot covariances. Two performance measurements are used to evaluate the finite-sample properties of the estimators: the integrated mean squared error (IMSE) and the integrated squared bias (ISB)

$$IMSE = \int_{t_l}^{t_u} E\!\left[ \left( \Sigma_{kl}(s) - \hat\Sigma_{kl}(s) \right)^2 \right] ds, \qquad ISB = \int_{t_l}^{t_u} \left( E\!\left[ \Sigma_{kl}(s) - \hat\Sigma_{kl}(s) \right] \right)^2 ds. \tag{21}$$

The results for the performance of the estimator of the covariance are reported in Table 1. Figure 1 displays a QQ plot of the observed standardized error terms of the kernel covariance estimator using minute-by-minute data.
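In a Monte Carlo study the expectations in the two performance measurements are replaced by averages over repetitions. The sketch below shows one way to do this, assuming the true path and the estimates are evaluated on a common time grid and that the integrals are approximated by the trapezoidal rule; the function names and the toy inputs are illustrative.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule for values y sampled on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def imse_and_isb(true_path, estimates, grid):
    """Monte Carlo versions of the integrated mean squared error and the
    integrated squared bias.

    true_path : (m,) true spot covariance on the grid
    estimates : (R, m) estimated spot covariance, one row per repetition
    grid      : (m,) time points between t_l and t_u
    """
    err = true_path - estimates              # broadcast over repetitions
    mse = np.mean(err**2, axis=0)            # pointwise mean squared error
    sq_bias = np.mean(err, axis=0) ** 2      # pointwise squared bias
    return trapezoid(mse, grid), trapezoid(sq_bias, grid)

# Toy inputs (assumption): two repetitions with errors of +0.1 and -0.1
grid = np.linspace(0.2, 0.8, 61)
true_path = np.full(grid.size, 0.01)
estimates = np.vstack([true_path + 0.1, true_path - 0.1])
imse, isb = imse_and_isb(true_path, estimates, grid)
print(imse, isb)
```

In this toy case the errors cancel on average, so the squared bias is zero while the mean squared error is not, which is the distinction the two measurements are meant to capture.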

Next, we examine the finite sample performance of the threshold covariance estimator. Though several models combining jumps and stochastic volatility have appeared in the literature, we use the model from Bates (1996), one of the most popular examples of the class, in which an independent jump component is added to the Heston stochastic volatility model:

$$dX(t) = \mu\,dt + \theta(t)\,dW(t) + dJ(t), \qquad \Sigma(t) = \theta(t)\theta^\top(t), \tag{22}$$

with

$$\Sigma(t) = \begin{pmatrix} \Sigma_{11}(t) & \Sigma_{12}(t) \\ \Sigma_{12}(t) & \Sigma_{22}(t) \end{pmatrix} = \begin{pmatrix} \sigma_1^2(t) & \sigma_{1,2}(t) \\ \sigma_{1,2}(t) & \sigma_2^2(t) \end{pmatrix}, \tag{23}$$

where is log of asset prices, , is the drift vector, is a two dimensional compound Poisson jump process and is a standard two dimensional Brownian motion such that . The variance processes, for , follow the CIR model:

$$d\sigma_i^2(t) = \kappa_i\big(\theta_i - \sigma_i^2(t)\big)\,dt + \eta_i\,\sigma_i(t)\,dZ_i(t). \tag{24}$$

As in simulations for Heston model without jumps we set (48 hours) and consider sampling frequencies 5 seconds, 30 seconds, 1 minute. We employ Euler discretization scheme from Kloeden and Platen (1999, ch.14) for the simulation. We simulate one trajectory of each and for and keep them fixed. Then we run 500 repetitions of . For each simulated path of the bivariate log asset price we compute based on sampling frequencies.

We use the two performance measurements in equation (21), IMSE and ISB, for three different estimators: the Gaussian, beta and one-sided kernel estimators. The results for the performance of the estimator are reported in Table 2. Figure 2 displays a QQ plot of the observed standardized error terms of the threshold kernel covariance estimator using minute-by-minute data.

## 5 Applications: Covariance Forecasting

Forecasting covariance has important economic value in the context of asset pricing and portfolio allocation. The multivariate GARCH model is a standard tool for modelling and forecasting covariances. However, more recent approaches advocate the use of high-frequency data.

Symitsi et al. (2018) undertake a comprehensive empirical comparison of two generic families of covariance forecasting models: multivariate GARCH models that employ daily data and models that use high-frequency and options data. The authors conclude that models based on high-frequency data offer both a clear advantage in terms of statistical accuracy and yield more theoretically consistent predictions leading to superior out-of-sample portfolio performance. In particular, a Vector Heterogeneous Autoregressive (VHAR) model achieves the best performance out of the models under consideration. Motivated by this, we use the VHAR model to forecast the integrated covariance; however, when implementing it for a finite sample, we use the kernel covariance estimator (3) in Section 2 instead of the realized covariance estimator of Barndorff-Nielsen and Shephard (2004a).

The Heterogeneous Autoregressive (HAR) model, see Corsi (2009), was proposed as a simple way to approximate the long-memory behaviour of volatility. The Vector HAR, implemented in Chiriac and Voev (2011), is a multivariate extension of HAR. In the VHAR the realized covariance is expressed as a linear combination of past daily, weekly and monthly realized covariances:

$$RC_{t+1} = \alpha + \beta_d\,RC_t + \beta_w\,RC_{t-5:t} + \beta_m\,RC_{t-22:t} + \epsilon_{t+1}, \tag{25}$$

where $RC_t$ is obtained from the Cholesky decomposition of the realized covariance matrix: it is the half-vectorization of the Cholesky factor of the matrix of realized covariances. In order to allow direct comparison among quantities defined over various time horizons, the multiperiod factors are normalized sums of the daily realized factors, i.e.

$$RC_{t-k:t} = \frac{1}{k} \sum_{i=0}^{k-1} RC_{t-i} \tag{26}$$

is the average of the past $k$ days' values of $RC_t$, $\alpha$ is a constant term and $\beta_d$, $\beta_w$, $\beta_m$ are, respectively, the parameters of the daily, weekly and monthly components of the model. The covariance forecasts, $H_t$, are obtained by the reverse transformations of the $RC_t$'s. Modelling the Cholesky factors rather than the covariances directly avoids the need for parameter restrictions that ensure positive definiteness.
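The transformation and regression steps above can be sketched as follows. This is only an illustration under stated assumptions: each element of the half-vectorized Cholesky factor is fitted by its own least squares regression (a pooled fit with common coefficients is an equally plausible reading of equation (25)), and the daily covariance matrices are random toy inputs rather than realized or kernel covariance estimates. All function names are hypothetical.

```python
import numpy as np

def chol_vech(S):
    """Lower-triangular entries of the Cholesky factor P of S = P P^T."""
    P = np.linalg.cholesky(S)
    return P[np.tril_indices(S.shape[0])]

def inv_chol_vech(x, d):
    """Reverse transformation: rebuild P from its entries and return P P^T."""
    P = np.zeros((d, d))
    P[np.tril_indices(d)] = x
    return P @ P.T

def har_design(Y, weekly=5, monthly=22):
    """Regressors [1, daily, weekly mean, monthly mean] and next-day targets."""
    rows, targets = [], []
    for t in range(monthly - 1, Y.shape[0] - 1):
        day = Y[t]
        week = Y[t - weekly + 1 : t + 1].mean(axis=0)    # equation (26), k = 5
        month = Y[t - monthly + 1 : t + 1].mean(axis=0)  # equation (26), k = 22
        rows.append(np.stack([np.ones_like(day), day, week, month]))
        targets.append(Y[t + 1])
    return np.array(rows), np.array(targets)             # (N, 4, p) and (N, p)

def fit_har(Y):
    """Element-by-element least squares fit (an assumption of this sketch)."""
    Xd, y = har_design(Y)
    p = Y.shape[1]
    coefs = np.empty((p, 4))
    for j in range(p):
        coefs[j] = np.linalg.lstsq(Xd[:, :, j], y[:, j], rcond=None)[0]
    return coefs

def forecast_next(Y, coefs, weekly=5, monthly=22):
    """One-step-ahead forecast of the half-vectorized Cholesky factor."""
    x = np.stack([np.ones(Y.shape[1]), Y[-1],
                  Y[-weekly:].mean(axis=0), Y[-monthly:].mean(axis=0)])
    return np.sum(coefs.T * x, axis=0)

# Toy daily covariance matrices (assumption): random positive definite matrices
rng = np.random.default_rng(1)
d, days = 2, 80
covs = []
for _ in range(days):
    A = rng.standard_normal((d, d))
    covs.append(A @ A.T + 0.1 * np.eye(d))
Y = np.array([chol_vech(S) for S in covs])
coefs = fit_har(Y)
H_next = inv_chol_vech(forecast_next(Y, coefs), d)
print(H_next)
```

Because the forecast is mapped back through $P P^\top$, the resulting covariance forecast is positive semi-definite by construction, which is the motivation for working with Cholesky factors.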

We simulate the log-prices of two assets and their volatilities using model (17) in Section 4. Since we use simulated data, the true integrated covariance matrix is available, and we forecast it using two measures of integrated covariance: the realized covariance estimator of Barndorff-Nielsen and Shephard (2004a), which is standard in the literature, and the newly proposed kernel filtering of the covariance in equation (3). Thus we have two models for forecasting integrated covariance. The first is the VHAR model, in which we use the realized covariance as a measure of integrated covariance:

$$IC_{t+1} = \alpha + \beta_d\,RC_t + \beta_w\,RC_{t-5:t} + \beta_m\,RC_{t-22:t} + \epsilon_{t+1}, \tag{27}$$

where $IC_{t+1}$ is the half-vectorized Cholesky decomposition of the integrated covariance matrix.

In light of this it is natural to define the VHAR-KCV model, in which we borrow the VHAR model above to predict the integrated covariance matrix but use the kernel covariance estimator:

$$IC_{t+1} = \alpha + \beta_d\,\widehat{KCV}_t + \beta_w\,\widehat{KCV}_{t-5:t} + \beta_m\,\widehat{KCV}_{t-22:t} + \epsilon_{t+1}, \tag{28}$$

where $\widehat{KCV}_t$ is the half-vectorized Cholesky decomposition of the kernel covariance estimator in (3). We benchmark the VHAR-KCV against the VHAR.

In line with Symitsi et al. (2018) we evaluate the forecasting ability of the VHAR-KCV model (28) based on three multivariate loss functions and compare its performance to that of the benchmark VHAR model (27). We use the Euclidean loss function, $L_E$, which equally weights the elements of the forecast error matrix; the Frobenius distance, $L_F$, which is the extension of the mean squared error to the multivariate space; and the multivariate quasi-likelihood loss function, $L_Q$, which is scale invariant:

$$L_E = \mathrm{vech}(\Sigma_t - H_t)^\top\, \mathrm{vech}(\Sigma_t - H_t), \tag{29}$$

$$L_F = \mathrm{Tr}\!\left[ (\Sigma_t - H_t)^\top (\Sigma_t - H_t) \right], \tag{30}$$

$$L_Q = \log|H_t| + \mathrm{Tr}\!\left( H_t^{-1}\Sigma_t \right). \tag{31}$$

Here $\mathrm{Tr}$ denotes the trace of a square matrix, $\Sigma_t$ denotes the integrated covariance matrix at time $t$ and $H_t$ is the time-$t$ matrix of conditional covariance forecasts.
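The three loss functions in equations (29)-(31) translate directly into code. The sketch below is illustrative; the function names and the example matrices are assumptions, and the half-vectorization collects the lower triangle row by row, which does not affect the sum of squares in equation (29).

```python
import numpy as np

def vech(A):
    """Lower-triangular entries of A (row-by-row ordering)."""
    return A[np.tril_indices(A.shape[0])]

def euclidean_loss(Sigma, H):
    """Equation (29): squared distance between the distinct elements."""
    e = vech(Sigma - H)
    return float(e @ e)

def frobenius_loss(Sigma, H):
    """Equation (30): squared Frobenius norm of the forecast error matrix."""
    D = Sigma - H
    return float(np.trace(D.T @ D))

def qlike_loss(Sigma, H):
    """Equation (31): log-determinant of H plus trace of H^{-1} Sigma."""
    _, logdet = np.linalg.slogdet(H)             # H assumed positive definite
    return float(logdet + np.trace(np.linalg.solve(H, Sigma)))

# Hypothetical "true" covariance and forecast
Sigma_t = np.array([[2.0, 1.0], [1.0, 2.0]])
H_t = np.eye(2)
l_e = euclidean_loss(Sigma_t, H_t)
l_f = frobenius_loss(Sigma_t, H_t)
l_q = qlike_loss(Sigma_t, H_t)
print(l_e, l_f, l_q)
```

The Euclidean loss counts each off-diagonal error once, whereas the Frobenius distance counts it twice, which is why the two values differ for the same forecast error.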

Results are reported in Table 3. For all forecasting horizons, the VHAR-KCV model outperforms the VHAR model, which was shown to be the best model for forecasting the covariance matrix in the large study by Symitsi et al. (2018).

## 6 Concluding Remarks

Inspired by the kernel filtering of spot volatility, in this paper we develop estimators of spot covariances for two types of underlying price process: continuous and discontinuous semimartingales. We show the asymptotic normality of the estimators. An important result is that both estimators attain the convergence rate $\sqrt{\delta^{-1}}$ for a fixed bandwidth (Theorems 1 and 3). The spot covariance matrix estimator for continuous martingales in a setup with microstructure noise proposed by Bibinger et al. (2017) converges, in turn, at a slower rate. We conduct Monte Carlo experiments in financially realistic scenarios to study the finite sample properties of our estimators. In addition, we investigate one possible application of the estimator, the forecasting of the covariance matrix. We conclude that our estimator performs better in the context of forecasting than the benchmark realized covariance estimator of Barndorff-Nielsen and Shephard (2004a). One possible extension of the estimators is to account for market microstructure noise.

## References

• Aït-Sahalia et al. (2010) Aït-Sahalia Y, Fan J, Xiu D. 2010. High-frequency estimates with noisy and asynchronous financial data. Journal of the American Statistical Association 105: 1504-1516.
• Alexander et al. (2018) Alexander C. 2008. Market risk analysis (vol. 2): practical financial econometrics. Chichester: John Wiley & Sons.
• Andersen et al. (2013) Andersen TG, Bollerslev T, Christoffersen PF, Diebold FX. 2013. Financial risk measurement for financial risk management. Handbook of the Economics of Finance 53: 1127-1220.
• Barndorff-Nielsen et al. (2004a) Barndorff-Nielsen OE, Shephard N. 2004a. Econometric analysis of realised covariation: High frequency based covariance, regression and correlation in financial economics. Econometrica 72: 885-925.
• Barndorff-Nielsen et al. (2002) Barndorff-Nielsen OE, Shephard N. 2002. Econometric analysis of realized volatility and its use in estimating stochastic volatility models. Journal of the Royal Statistical Society 64: 253-280.
• Bates et al. (1996) Bates D. 1996. Jumps and stochastic volatility: the exchange rate processes implicit in Deutschemark options. The Review of Financial Studies 9: 69-107.
• Bibinger et al. (2017) Bibinger M, Hautsch N, Malec P, Reiss M. 2017. Estimating the spot covariation of asset prices — statistical theory and empirical evidence. Journal of Business and Economic Statistics. 1504-1516.
• Bibinger et al. (2014) Bibinger M, Reiss M. 2014. Spectral estimation of covolatility from noisy observations using local weights. Scandinavian Journal of Statistics 6: 23-50.
• Bibinger et al. (2015) Bibinger M, Winkelmann L. 2015. Econometrics of co-jumps in high frequency data with noise. Journal of Econometrics 184: 361-378.
• Bos et al. (2012) Bos CS, Janus P, Koopman SJ. 2012. Spot variance path estimation and its application to high-frequency jump testing. Journal of Financial Econometrics 10: 354-389.
• Chiriac et al. (2011) Chiriac R, Voev V. 2011. Modelling and forecasting multivariate realized volatility. Journal of Applied Econometrics 26: 922-947.
• Christensen et al. (2013) Christensen K, Podolskij M, Vetter M. 2013. On covariation estimation for multivariate continuous Itô semimartingales with noise in non-synchronous observation schemes. Journal of Multivariate Analysis 120: 59-84.
• Cline et al. (1991) Cline DBH, Hart JD. 1991. Kernel estimation of densities with discontinuities or discontinuous derivatives. Statistics 22: 69-84.
• Corsi et al. (2009) Corsi F. 2009. A simple approximate long-memory model of realized volatility. Journal of Financial Econometrics 7: 174-196.
• Cox et al. (1985) Cox J, Ingersoll J, Ross S. 1985. A theory of the term structure of interest rates. Econometrica 53: 385-407.
• Fan et al. (2008) Fan J, Wang Y. 2008. Spot volatility estimation for high-frequency data. Statistics and Its Interface 1: 279-288.
• Foster et al. (1996) Foster DP, Nelson DB. 1996. Continuous record asymptotics for rolling sample variance estimators. Econometrica 64: 139-174.
• Hayashi et al. (2011) Hayashi T, Yoshida N. 2011. Nonsynchronous covariation process and limit theorems. Stochastic processes and their applications 121: 2416-2454.
• Heston et al. (1993) Heston SL. 1993. A closed-form solution for options with stochastic volatility with applications to bond and currency options. Review of Financial Studies 6: 327-343.
• Hull et al. (1987) Hull J, White A. 1987. The pricing of options on assets with stochastic volatility. Journal of Finance 42: 281-300.
• Kanaya et al. (2016) Kanaya S, Kristensen D. 2016. Estimation of stochastic volatility models by nonparametric filtering. Econometric Theory 32: 861-916.
• Karatzas et al. (1999) Karatzas I, Shreve SE. 1999. Brownian motion and stochastic calculus. New York: Springer.
• Kloeden et al. (1999, ch.14) Kloeden P, Platen E. 1999. Numerical solutions of stochastic differential equations. Berlin: Springer-Verlag.
• Kollo et al. (2005) Kollo T, Rosen D. 2005. Advanced multivariate statistics with matrices. Dordrecht: Springer.
• Kristensen et al. (2010) Kristensen D. 2010. Nonparametric filtering of the realized spot volatility: A kernel-based approach. Econometric Theory 26: 60-93.
• Lütkepohl et al. (1996) Lütkepohl H. 1996. Handbook of matrices. Chichester: John Wiley & Sons Ltd.
• Mykland et al. (2008) Mykland PA, Zhang L. 2008. Inference for volatility-type objects and implications for hedging. Statistics and Its Interface 1: 255-278.
• Revuz et al. (1998, ch.5) Revuz D, Yor M. 1998. Continuous martingales and Brownian motion. Berlin: Springer-Verlag.
• Silverman et al. (1986) Silverman BW. 1986. Density estimation for statistics and data analysis. New York: Chapman and Hall.
• Symitsi et al. (2018) Symitsi E, Symeonidis L, Kourtis A, Markellos R. 2018. Covariance forecasting in equity markets. Journal of Banking and Finance 96: 153-168.
• Uhlenbeck et al. (1930) Uhlenbeck GE, Ornstein LS. 1930. On the theory of Brownian motion. Physical Review 36: 823-841.
• Yu et al. (2014) Yu C, Fang Y, Li Z, Zhao X. 2014. Non-parametric estimation of high-frequency spot volatility for Brownian semimartingale with jumps. Journal of Time Series Analysis 35: 572-591.
• Zhang et al. (2011) Zhang L. 2011. Estimating covariation: Epps effect and microstructure noise. Journal of Econometrics 160: 33-47.
• Zu et al. (2014) Zu Y, Boswijk HP. 2014. Estimating spot volatility with high-frequency financial data. Journal of Econometrics 181: 117-135.

## Appendix A Proof of Theorem 1

### A.1 Notation

In a similar way to Barndorff-Nielsen and Shephard (2004a), for the purpose of simplifying the proof we will use the index (or equivalently, tensor) notation instead of vector or matrix notation. We rewrite the stochastic processes in equation (1) in index notation as

$$X_k(t)=\int_0^t \mu_k(s)\,ds+\int_0^t \theta^a_k(s)\,dW_a(s), \tag{32}$$

with initial condition . Here

$$\Theta(t)=\{\theta^a_{(k)}(t)\}_{k,a=1,2,\cdots,d}.$$

In the index notation the Einstein summation convention is used, which means if an index variable appears twice in a single expression then it implies summation over that index. Thus (32) is understood to mean

$$X_k(t)=\int_0^t \mu_{(k)}(s)\,ds+\sum_{a=1}^{d}\int_0^t \theta^a_{(k)}(s)\,dW_a(s). \tag{33}$$

We apply summation convention to indices , but not to indices , unless otherwise specified. Furthermore, we write

$$\theta^{ab}_{kl}=\theta^a_k\,\theta^b_l, \tag{34}$$

with similar notation for other index combinations. In (34) no superscripts or subscripts are repeated and so no summation operator is generated. Combining the Einstein summation convention and the notational rule for , the th element of the spot covolatility matrix of model (1) is

$$\Sigma_{kl}(t)=\theta^{aa}_{kl}=\sum_{a=1}^{d}\theta^a_k(t)\,\theta^a_l(t). \tag{35}$$
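The index rules in (34) and (35) map directly onto array contractions. The following is a minimal NumPy sketch, not part of the original proof. It assumes a storage layout `theta[k, a]` for the loading $\theta^a_k$ at a fixed time, and the random matrix is purely illustrative. It checks that summing over the repeated index $a$ reproduces the matrix product $\Theta\Theta^\top$.

```python
import numpy as np

d = 3
rng = np.random.default_rng(0)

# Illustrative loading matrix at a fixed time t; assumed layout theta[k, a].
theta = rng.normal(size=(d, d))

# Equation (34): theta^{ab}_{kl} = theta^a_k * theta^b_l.
# No index is repeated, so this is an outer product with no summation.
theta4 = np.einsum("ka,lb->abkl", theta, theta)

# Equation (35): the repeated index a is summed over.
sigma_einsum = np.einsum("ka,la->kl", theta, theta)

# The same contraction obtained from the four-index object of (34).
sigma_from_theta4 = np.einsum("aakl->kl", theta4)

# Matrix-notation counterpart: Sigma = Theta Theta^T.
sigma_matrix = theta @ theta.T
```

All three routes to $\Sigma$ agree up to floating-point error, which is a convenient sanity check when translating later index expressions from the proof into code.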

### A.2 Mean and variances

The proof of Theorem 1 consists of several steps. The first step is to derive the means and covariances of the variates

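\begin{align}
\widehat{KCV}_{kl}(\tau) &= \sum_{i=1}^{n} K_h(t_{i-1}-\tau)\,\Delta X_k(t_{i-1})\,\Delta X_l(t_{i-1}) \tag{36}\\
&= \sum_{i=1}^{n} K_h(t_{i-1}-\tau)\,\bigl(X_k(t_i)-X_k(t_{i-1})\bigr)\bigl(X_l(t_i)-X_l(t_{i-1})\bigr), \tag{37}
\end{align}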

with . Next, Theorem 1 is proved for the case where the mean processes are identically . Finally, the latter restriction is lifted. The proof is component-wise and based on the results and techniques employed by Barndorff-Nielsen and Shephard (2004a) and Kristensen (2010).
We start by computing the expectation of in equation (37).

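\begin{align}
E\bigl[\widehat{KCV}_{kl}(\tau)\bigr] &= E\Bigl[\sum_{i=1}^{n} K_h(t_{i-1}-\tau)\,\bigl(X_k(t_i)-X_k(t_{i-1})\bigr)\bigl(X_l(t_i)-X_l(t_{i-1})\bigr)\Bigr] \tag{38}\\
&= \sum_{i=1}^{n} K_h(t_{i-1}-\tau)\,E\bigl[\bigl(X_k(t_i)-X_k(t_{i-1})\bigr)\bigl(X_l(t_i)-X_l(t_{i-1})\bigr)\bigr] \notag\\
&= \sum_{i=1}^{n} K_h(t_{i-1}-\tau)\int_{t_{i-1}}^{t_i}\theta^{aa}_{kl}(s)\,ds, \notag
\end{align}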

where the final equality follows from the result of Barndorff-Nielsen and Shephard (2004a):

$$E\bigl[\Delta X_k(t_{i-1})\,\Delta X_l(t_{i-1})\bigr]=\int_{t_{i-1}}^{t_i}\theta^{aa}_{kl}(s)\,ds. \tag{39}$$
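The estimator in (36)-(37) and the expectation in (38)-(39) can be illustrated numerically. The sketch below is only an illustration under stated assumptions. It uses a Gaussian kernel, an equidistant grid on $[0,1]$, zero drift and a constant loading matrix, so that the integral in (39) reduces to $\Sigma_{kl}(t_i-t_{i-1})$. The function names `gaussian_kernel`, `kernel_spot_cov` and `simulate_path`, and all parameter values, are hypothetical and do not come from the paper.

```python
import numpy as np


def gaussian_kernel(x):
    """Standard normal density, used here as an assumed kernel K."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)


def kernel_spot_cov(X, t, tau, h):
    """Kernel-weighted realized covariance at time tau, as in (36)-(37).

    X has shape (n + 1, d) and holds the observed prices; t has shape (n + 1,).
    """
    dX = np.diff(X, axis=0)                       # increments X(t_i) - X(t_{i-1})
    w = gaussian_kernel((t[:-1] - tau) / h) / h   # K_h(t_{i-1} - tau)
    return np.einsum("i,ik,il->kl", w, dX, dX)


rng = np.random.default_rng(0)
n, d, T = 2000, 2, 1.0
t = np.linspace(0.0, T, n + 1)

# Hypothetical constant loadings theta[k, a]; the spot covariance is Theta Theta^T.
theta = np.array([[1.0, 0.0], [0.5, 0.8]])
sigma = theta @ theta.T


def simulate_path():
    """Driftless path with dX_k = sum_a theta[k, a] dW_a on the grid t."""
    dW = rng.normal(scale=np.sqrt(T / n), size=(n, d))
    dX = dW @ theta.T
    return np.vstack([np.zeros(d), np.cumsum(dX, axis=0)])


# Monte Carlo average of the estimator at an interior point tau = 0.5.
estimates = np.array(
    [kernel_spot_cov(simulate_path(), t, tau=0.5, h=0.1) for _ in range(200)]
)
mean_estimate = estimates.mean(axis=0)
```

With constant loadings the right-hand side of (38) equals $\Sigma_{kl}$ times the Riemann sum of the kernel weights, which is close to one at an interior point, so `mean_estimate` should lie close to `sigma`. Near the boundaries of the sample the weights no longer sum to approximately one and this simple check would not hold.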

Next, we apply Lemma 5 and have

 n∑i=1Kh(ti−1−τ)∫ti