# Volatility of volatility estimation: central limit theorems for the Fourier transform estimator and empirical study of the daily time series stylized facts

We study the asymptotic normality of two estimators of the integrated volatility of volatility based on the Fourier methodology, which does not require the pre-estimation of the spot volatility. We show that the bias-corrected estimator reaches the optimal rate n^1/4, while the estimator without bias-correction has a slower convergence rate and a smaller asymptotic variance. Additionally, we provide simulation results that support the theoretical asymptotic distribution of the rate-efficient estimator and show the accuracy of the Fourier estimator in comparison with a rate-optimal estimator based on the pre-estimation of the spot volatility. Finally, we reconstruct the daily volatility of volatility of the S P500 and EUROSTOXX50 indices over long samples via the rate-optimal Fourier estimator and provide novel insight into the existence of stylized facts about its dynamics.

## Authors

• 2 publications
• 1 publication
• 2 publications
• 1 publication
10/10/2018

### Inference for Volatility Functionals of Itô Semimartingales Observed with Noise

This paper presents the nonparametric inference for nonlinear volatility...
11/06/2019

### The Fourier Transform Method for Volatility Functional Inference by Asynchronous Observations

We study the volatility functional inference by Fourier transforms. This...
03/19/2018

### Exploring the predictability of range-based volatility estimators using RNNs

We investigate the predictability of several range-based stock volatilit...
04/04/2020

### Kernel Estimation of Spot Volatility with Microstructure Noise Using Pre-Averaging

We first revisit the problem of kernel estimation of spot volatility in ...
07/13/2020

### Feasible Inference for Stochastic Volatility in Brownian Semistationary Processes

This article studies the finite sample behaviour of a number of estimato...
03/19/2019

### Optimal Bias Correction of the Log-periodogram Estimator of the Fractional Parameter: A Jackknife Approach

We use the jackknife to bias correct the log-periodogram regression (LPR...
03/18/2021

### Inference and Computation for Sparsely Sampled Random Surfaces

Non-parametric inference for functional data over two-dimensional domain...
##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1 Introduction

In the last decades, different stochastic volatility models have been proposed to describe the evolution of asset prices, motivated by empirical studies on the patterns of volatilities in financial time series. Further, the availability of high-frequency data has given impulse to devise statistical techniques aimed at the efficient estimation of model parameters in the stochastic volatility framework, e.g., the leverage and the volatility of volatility processes. The estimation of these model parameters is rather complicated, the main difficulties being due to the fact that some factors are unobservable. In particular, the estimation of the volatility of volatility is a challenging task, because a pre-estimation of the spot volatility is required as a first step, due to the latency of the volatility process.

Unlike the integrated volatility, the non-parametric estimation of the integrated volatility of volatility is a relatively recent topic. [Barndorff-Nielsen and Veraart, 2009] propose a new class of stochastic volatility of volatility models, with an extra source of randomness, and show that the volatility of volatility can be estimated non-parametrically by means of the quadratic variation of the preliminarily estimated squared volatility process, which they name pre-estimated spot variance based realized variance. [Vetter, 2015] proposes an estimator of the integrated volatility of volatility which is also based on increments of the pre-estimated spot volatility process and attains the optimal convergence rate in the absence of noise. The common feature of these estimators is that they first reconstruct the unobservable volatility path via some consistent estimator thereof and then compute the volatility of volatility using the estimated paths as a proxy of the corresponding unknown paths. The issue of estimating the volatility of volatility in the presence of jumps is studied in [Cuchiero and Teichmann, 2015]: firstly, the authors combine jump robust estimators of the integrated realized variance and the Fourier-Fejer inversion formula to get an estimator of the instantaneous volatility path; secondly, they use again jump robust estimators of the integrated volatility, in which they plug the estimated path of the volatility process to obtain an estimator of the volatility of volatility. In the same spirit of [Barndorff-Nielsen and Veraart, 2009, Vetter, 2015], also [Li et al., 2021] propose an estimator of the integrated volatility of volatility by means of a pre-estimation of the spot volatility, but, in order to extend the study to the case when the observed price process contains jumps and microstructure noise, the authors adopt a threshold pre-averaging estimator of the volatility, following [Jing et al., 2014].

In this paper, we focus on the estimation of the integrated volatility of volatility via the Fourier estimation method by [Malliavin and Mancino, 2002], which does not require the pre-estimation of the spot volatility. An early application of the Fourier methodology to identify the parameters (volatility of volatility and leverage) of stochastic volatility models has been proposed by [Barucci and Mancino, 2010], where the authors prove a consistency result for the estimator of both the integrated leverage and volatility of volatility in the absence of noise. In the presence of microstructure noise, [Sanfelici et al., 2015] study the finite sample properties of the Fourier estimator of the volatility of volatility introduced in [Barucci and Mancino, 2010] and show its asymptotic unbiasedness. However, the convergence rate of the estimator is not established, not even in the absence of microstructure noise contamination.

In the present paper we fill this gap. Specifically, after proving that the Fourier estimator of the volatility of volatility by [Sanfelici et al., 2015] has a sub-optimal rate of convergence, we define its bias-corrected version and prove that it reaches the optimal convergence rate . We also show that the non-corrected estimator with slower convergence rate displays a smaller asymptotic error variance.

The asymptotic results are supported by a simulation exercise, where we also compare the finite-sample performance of the rate-efficient Fourier estimator with that of the rate-efficient realized estimator based on the pre-estimation of the spot volatility by [Ait-Sahalia and Jacod, 2014]. The comparison shows that the Fourier estimator works satisfactorily well on the daily and longer horizons, while the realized estimator requires a yearly horizon to achieve sufficient accuracy. This feature may be related to the fact that, differently from the other volatility of volatility estimators, which rely on the pre-estimation of the instantaneous volatility path via a numerical differentiation, the Fourier approach relies only on the reconstruction of integrated quantities, i.e., the Fourier coefficients of the volatility. As it was early observed in [Malliavin and Mancino, 2002], this is a peculiarity of the Fourier estimator that renders the proposed method easily implementable and computationally stable.

Finally, we present an empirical exercise where the Fourier estimator is applied to obtain the daily time series of the volatility of volatility of the S&P500 and EUROSTOXX50 indices over, resp., the periods May 1, 2007 - August 6, 2021 and June 29, 2005 - May 28, 2021. As a result, we obtain some novel insight into the empirical regularities that characterize the daily dynamics of the volatility of volatility, which - to the best of our knowledge - up to now had been scarcely explored in the literature. Specifically, we find that the daily volatility of volatility of both the indices spikes in correspondence of periods of financial turmoil (e.g., during the financial crisis of 2008 and the outbreak of the COVID pandemic in 2020). Additionally, we also find that it is usually positively (resp., negatively) correlated with the volatility (resp., the asset return), but appears to be less persistent than the volatility. Finally, we observe that its empirical distribution is satisfactorily approximated by a log-normal distribution in years characterized by higher financial stability, as it is for the volatility.

The paper is organized as follows. Section 2 contains the assumptions and definitions. Section 3 states the central limit theorems, which are supported by the simulation study in Section 4. Finally, Section 5 contains the empirical results and Section 6 concludes. The proofs are given in Appendix A, while Appendix B contains some auxiliary lemmas on the Fejer and Dirichlet kernels.

## 2 The Volatility of Volatility estimator: definition and assumptions

This section presents the general non-parametric stochastic volatility model which will be considered throughout the paper and defines two estimators of the integrated volatility of volatility based on the Fourier estimation method introduced in [Malliavin and Mancino, 2002, Malliavin and Mancino, 2009]. The class considered includes most of the continuous stochastic volatility models commonly used in high-frequency finance and is assumed (to cite one among many others) in [Ait-Sahalia and Jacod, 2014], chp.8.3.

We make the following assumptions.

(A.I) The log-price process and the variance process are continuous Itô semimartingales on satisfying the stochastic differential equations

 {dp(t)=σ(t)dWt+b(t)dtdv(t)=γ(t)dZt+β(t)dt

where , while and

are Brownian motions on a filtered probability space

satisfying the usual conditions, possibly correlated555It is not restrictive to assume a constant correlation ..

(A.II) The processes , , and are continuous adapted stochastic processes defined on the same probability space , such that for any

 E[∫T0σp(t)dt]<∞  , E[∫T0bp(t)dt]<∞  , E[∫T0γp(t)dt]<∞  , E[∫T0ap(t)dt]<∞.

The processes are specified in such a way that and are a.s. positive666The process represents the spot volatility of volatility..

(A.III) The process is a continuous Itô semimartingale, whose drift and diffusion processes are continuous adapted stochastic processes defined on the same probability space .

The assumptions (A.I)-(A.II)-(A.III) are standard in the non-parametric setting and are considered, e.g., in [Barndorff-Nielsen and Veraart, 2009, Ait-Sahalia and Jacod, 2014, Cuchiero and Teichmann, 2015, Vetter, 2015, Li et al., 2021].

By changing of the origin of time and scaling the unit of time, one can always modify the time window to . Suppose that the asset log-price is observed at discrete, irregularly-spaced points in time on the grid . For simplicity, we omit the second index . Denote and suppose that as .

Consider the following interpolation formula

 pn(t):= n−1∑i=0p(ti)I[ti,ti+1[(t).

For any integer , , set777The symbol is the imaginary unit .

 ck(dpn):=12πn−1∑i=0e−ikti(p(ti+1)−p(ti)), (1)

and, for any , define the convolution formula

 ck(vn,N):=2π2N+1∑|s|≤Ncs(dpn)ck−s(dpn). (2)

In [Malliavin and Mancino, 2009] it is proved that (2) is a consistent estimator of the -th Fourier coefficient of the volatility process888Hereafter, we will follow the relevant econometric literature by using the term volatility as a synonym of variance, thus referring to as the volatility process. Similarly for the volatility of volatility. and in [Barucci and Mancino, 2010, Sanfelici et al., 2015] it is shown that it is possible to derive an estimator of the integrated volatility of volatility by exploiting only the knowledge of the Fourier coefficients in (2), without the need of the preliminary estimation of the instantaneous volatility. This fact characterizes the Fourier method for estimating the volatility of volatility. In fact, as far as we know, all other existing methods rely on the pre-estimation of the spot volatility, see [Ait-Sahalia and Jacod, 2014, Cuchiero and Teichmann, 2015, Vetter, 2015, Li et al., 2021]. In general, these methods entail the pre-estimation of the spot volatility, in the presence or absence of noise contamination, as a first step; then, as a second step, a quadratic variation approach (e.g., the realised volatility formula) is applied to the pre-estimated spot volatility trajectory.

The estimator of the integrated volatility of volatility, defined in [Sanfelici et al., 2015], is given by times

 c0(γ2n,N,M):=2πM+1∑|k|≤M(1−|k|M+1)k2ck(vn,N)c−k(vn,N), (3)

as is the estimator of . [Sanfelici et al., 2015] show that the estimator (3) is consistent under the assumptions (A.I)-(A.II)- (A.III) and the conditions and and asymptotically unbiased in the presence of microstructure noise. However, the rate of convergence and the asymptotic normality are not established. We show in Theorem 3.1 that the convergence rate of the estimator (3) is not optimal. However, the estimator has a very good finite-sample performance, as shown in Section 4.

In order to obtain an estimator with the optimal rate of convergence in the absence of microstructure noise, a bias correction is needed and thus we consider the estimator

 ˆγ2n,N,M:=2πM+1∑|k|≤M(1−|k|M+1)k2ck(vn,N)c−k(vn,N)−Kˆσ4n,N,M (4)

where the constant is determined in (44) and is the Fourier quarticity estimator

 ˆσ4n,N,M:=2π ∑|k|≤Mck(vn,N)c−k(vn,N). (5)

The asymptotic normality of the estimator (5) is studied in [Livieri et al., 2019], while its properties in the presence of microstructure noise are studied in [Mancino and Sanfelici, 2012].

Note that the estimator (4) differs from (3) for the presence of the bias correction . This bias correction, while ensuring a faster rate of convergence, destroys the positivity of the estimator, see also [Barndorff-Nielsen et al., 2011]. The estimator (3) is instead positive.

## 3 The Central Limit Theorems

In this Section we study the asymptotic normality of the Fourier estimators of the integrated volatility of volatility defined by (3) and (4) and prove that the estimator (4) reaches the optimal rate of convergence , at the cost of a de-biasing term, while the estimator (3) has a smaller asymptotic variance at the cost of a slower convergence rate.

###### Theorem 3.1

Suppose that assumptions (A.I)-(A.II)-(A.III) hold. Let and , where both and are positive constants999We stress the point that and are two positive constants, i.e., they do not depend on . This notation is in line with the one used in [Mancino and Sanfelici, 2012].. Then, as , the following stable convergence in law holds:

 ρ(n)−1/4(ˆγ2n,N,M−12π∫2π0γ2(t)dt)
 ↓
 N(0,12π∫2π0K(cM)γ4(t)+K(cM,cN)σ8(t)+˜K(cM,cN),γ2(t)σ4(t)dt),

where , , and , with , being the integer part of .

Note that if or, equivalently, (i.e., the cutting frequency used for the estimation of the volatility coefficient given the log-prices is equal to the Nyquist frequency), then and the asymptotic variance in Theorem 3.1 becomes

 12π∫2π0431cMγ4(t)+16105c3Mσ8(t)+1615cMγ2(t)σ4(t)dt. (6)
###### Remark 3.2

The realised volatility of volatility estimator (Th. 8.11 [Ait-Sahalia and Jacod, 2014]) is obtained as the quadratic variation of the estimated spot volatility, with a de-biasing term depending on the quarticity. The underlying model is a continuous semimartingale for the price, the volatility and the volatility of volatility. The convergence rate of the estimator is and the asymptotic variance is

 ∫T015170βγ4(t)+48β3σ8(t)+12βσ4(t)γ2(t)dt. (7)

Letting , the correspondence between the asymptotic variances (6) and (7) is easily seen, with the second and third terms smaller in the case of the Fourier estimator101010Notice that the estimator in [Ait-Sahalia and Jacod, 2014] corresponds to (4) multiplied by ..

A similar approach as [Ait-Sahalia and Jacod, 2014] is considered in [Li et al., 2021], but it is extended to obtain a consistent estimator in the presence of noisy data. To this aim, the authors first build an estimator of the spot volatility by means of a pre-averaging method to get rid of the noise contamination, then they compute the realized variance from the spot volatility estimates to obtain an estimator of the integrated volatility of volatility. Finally, they also need to correct for the bias of the obtained estimator. The rate of convergence is in the presence of noise and without noise.

It is possible to obtain an estimator of the volatility of volatility without a bias-correction term and a smaller asymptotic variance, but the rate of convergence is slower, precisely , with . The estimator is simply given by (3) multiplied by and the following result holds:

###### Theorem 3.3

Suppose that Assumptions (A.I)-(A.II)-(A.III) hold. Let and , where and both and are positive constants. Then, as , the following stable convergence in law holds:

 ρ(n)−ι/2(c0(γ2n,N,M)−12π∫2π0γ2(t)dt)→N(0,12π∫2π0431cMγ4(t)dt).
###### Remark 3.4

The result in Theorem 3.3 is in line with [Cuchiero and Teichmann, 2015] Th. 3.13, where an estimator without bias correction is considered. The proposed estimator relies on a smooth function of the plug-in spot volatility, estimated with the Fourier method.

## 4 Simulation Study

In this section we present a simulation study of the finite-sample performance of the rate-efficient estimator (4). The objective of the study is to provide support to the asymptotic result in Theorem 3.1 and offer a comparison of the accuracy of the Fourier estimator with that of the rate-efficient realized estimator by [Ait-Sahalia and Jacod, 2014].

### 4.1 Simulation design

We simulated discrete observations from two parametric models which satisfy Assumptions (A.I)-(A.II)-(A.III). The first model that we simulated is the Heston model (see

[Heston, 1993]):

 ⎧⎨⎩dp(t)=(μ−12v(t))dt+σ(t)dWtdv(t)=θ(α−v(t))dt+√γ2v(t)dZt, (8)

where , , , and denotes the correlation between the Brownian motions and . Under the Heston model, the volatility of volatility is given by .

The second model that we simulated is the stochastic volatility of volatility model that appears in [Barndorff-Nielsen and Veraart, 2013] and [Sanfelici et al., 2015]. The model is as follows:

 ⎧⎪ ⎪⎨⎪ ⎪⎩dp(t)=(μ−12v(t))dt+σ(t)dWtdv(t)=θ(α−v(t))dt+γ(t)dZtdγ2(t)=χ(η−γ2(t))dt+ξγ(t)dYt, (9)

where is a Brownian motion, independent of and , and , .

The parameter vectors used for the simulations of the models (

8) and (9) are, resp.:

• ;

• .

Note that the selection of a negative reproduces the presence of leverage effects. For each model we simulated trajectories of length , i.e., one trading day, which were used to produce the results in subsection 4.2. Additionally, for each model we also simulated weekly (), monthly () and yearly () trajectories, which were used to obtain the results in subsection 4.4. All trajectories were simulated on the equally-spaced grid with mesh equal to second. For the simulations we have assumed that one trading day is 6.5-hour long.

### 4.2 Finite-sample performance

We assessed the finite-sample performance of the estimator (4) for increasing values of the sample size , in order to provide numerical support to Theorem 3.1. Specifically, given , we chose a fixed estimation horizon , corresponding to one day (as customary for econometric analysis), and let vary. The resulting values of considered range between second and minutes.

For what concerns the frequency , which is needed for the convolution formula (2), we set and selected . This selection yields the value of equal to the Nyquist frequency and allows obtaining the smallest variance of the asymptotic error, see (6) in section 3.

As for the frequency , we set and optimized the value of the constant based on the (unfeasible) numerical minimization of the mean squared error (MSE). In this regard, we found an MSE-optimal value of equal to (resp., ) for the model (8) (resp., (9)). A feasible selection procedure for is proposed in the subsection 4.3.

Table 1 illustrates the finite-sample performance of the estimator (4) under the two different data-generating processes considered. Specifically, the Table illustrates the MSE and the bias for the different values of the sampling frequency . As expected, the bias and MSE improve as is increased for both the data-generating processes considered, thereby providing numerical support to Theorem 3.1. In particular, note that the performance of the estimator is still satisfactory for equal to 5 minutes, the sampling frequency typically used in the absence of noise with empirical data. Additionally, it is worth mentioning that the estimator never produced negative volatility of volatility estimates in this simulation study.

### 4.3 Feasible selection of the frequency M

Inspired by [Li et al., 2021], Section 2.4, in this subsection we propose a feasible, data-driven procedure for the selection of the frequency . The procedure is described as follows111111

Note that the procedure is heuristic, as it uses the sample standard error of estimates as a rough proxy of the unobservable estimation error.

:

obtain volatility of volatility estimates on periods (e.g., daily intervals) with the initial selection

, the smallest possible value, and record their standard deviation, denoted by

;
increase the initial value of by one unit, i.e., select equal to , and record the absolute relative change of the standard deviation ; iterate the procedure with , , until the statistic falls below a given threshold, denoted by , and take the last as the final choice of .

We evaluated the accuracy of this procedure with simulated prices, setting equal to minutes, the noise-free sampling frequency used in the empirical study of Section 5. As a result, we found that the choice yields final values of equal to the MSE-optimal ones found in subsection 4.2 for both models (8) and (9)121212The unfeasible MSE-optimal values of found in subsection 4.2 are 0.05 and 0.07 for, resp., the model (8) and (9). Given that , the unfeasible optimal values of are then equal to, resp., and when minutes. . Unreported additional simulations show that this result is fairly robust to smaller values of , namely, equal to and , and to values of in the interval .

### 4.4 Comparative performance study

This subsection contains a comparative study of the finite-sample performance of the rate-efficient Fourier estimator (4) and the rate-efficient realized estimator by [Ait-Sahalia and Jacod, 2014] (see Remark 3.2). For the sake of practical relevance, the study was performed with fixed and equal to minutes, the highest noise-free sampling frequency usually available with empirical data (see [Andersen et al., 2001]). Given this sampling choice, the performance of the estimators was then evaluated for increasing values of the estimation horizon . Specifically, we considered equal to one day , one week , one month and one year .

We recall the definition of the rate-efficient realized volatility of volatility estimator by [Ait-Sahalia and Jacod, 2014]. Let denote a sequence of integers such that , . The estimator reads131313For a given , it holds , .

 ^γ2n=32κ(n)n−2κ(n)+1∑i=1((ˆσ2n(ti+k(n))−ˆσ2n(ti))2−4κ(n)(ˆσ2n(ti))2), (10)

where

 ˆσ2n(ti)=1κ(n)ρ(n)κ(n)−1∑m=0(p(ti+m)−p(ti+m−1))2

is the local estimator employed to pre-estimate the spot variance. The estimator (10) is also studied in [Vetter, 2015], where the author replaces with

 ˆσ4n(ti)=13κ(n)ρ(n)2κ(n)−1∑m=0(p(ti+m)−p(ti+m−1))4. (11)

The rate-efficient realized estimator considered in [Vetter, 2015] thus reads

 ^¯γ2n=32κ(n)n−2κ(n)+1∑i=1((ˆσ2n(ti+k(n))−ˆσ2n(ti))2−4κ(n)ˆσ4n(ti)). (12)

To implement the estimator , we selected and optimized based on the minimization of the MSE, as done in subsection 4.2. Analogously, to apply estimators (10) and (12) we used the MSE-optimal value of , after setting . Specifically, we found that the MSE-optimal values of for equal to, resp., are for the model (8) and for the model (9). As for , for both models and almost all values of considered we found that its MSE-optimal value is such that is equal to the maximum value allowed, so that the summation in (10) features just one squared increment of the estimated volatility. The only exception is the case of the model (9) with year: in that case, the MSE-optimal value of was found to be equal to . Also, we note that simulations suggest that the MSE-optimal value is the same for both versions of the realized estimator for all cases investigated. Table 2 reports the bias and MSE of the three estimators for increasing values of the estimation horizon .

Based on Table 2, we note that the performance of the realized estimators (10) and (12) is not satisfactory, both in terms of bias and MSE, for smaller or equal than one month and becomes good enough only for equal to one year141414Note that year is the choice made in the numerical analysis proposed in [Vetter, 2015], where the author studies the finite-sample properties of the estimator (12), and also in the numerical and empirical exercises presented in [Li et al., 2021], where the performance of the noise- and jump-robust version of the realized estimator (10) is investigated.. See also [Sanfelici et al., 2015] and [Toscano and Recchioni, 2021] for similar considerations on the finite-sample performance of realized volatility of volatility estimators. Moreover, we observe that the use of the quarticity estimator in the de-biasing term in (12) does not appear to improve the finite-sample performance. Instead, the Fourier estimator performs rather well for all the estimation horizons considered, maintaining satisfactory accuracy also for the smallest one, that is, equal to one day, as already pointed out in subsection 4.2.

## 5 Empirical Analysis

To the best of our knowledge, the empirical properties of the volatility of volatility of financial assets have been scarcely explored in the literature. The aim of the empirical study presented in this Section is thus to provide insight into the existence of stylized facts pertaining to the daily dynamics of the volatility of volatility.

In fact, the numerical evidence presented in Section 4 suggests that the Fourier methodology allows reconstructing the integrated volatility of volatility with satisfactory accuracy on daily intervals by means of the rate-efficient estimator (4). Accordingly, in this section we use the estimator (4) to obtain the daily volatility of volatility series for two market indices: the S&P500 and the EUROSTOXX50. The periods considered for the analysis are, resp., May 1, 2007 - August 6, 2021 and June 29, 2005 - May 28, 2021.

### 5.1 Data description, estimation and sample statistics

For the empirical analysis we used the series of -minute trade prices, recorded during trading hours. Specifically, for the S&P500 index, we used the prices recorded between a.m. and p.m., while for the EUROSTOXX50 index we employed the prices recorded between a.m. and p.m. Days with early closure were discarded.

The estimation of the daily integrated volatility of volatility was performed via the rate-efficient Fourier estimator (4), without considering overnight returns. Before performing the estimation, we run the test by [Ait-Sahalia and Xiu 2014] on 5-minute series and found that the assumption of absence of noise could not be rejected at the significance level for both indices. Moreover, following [Wang and Mykland, 2014], days with jumps were removed, based on the results of the test by [Lee and Mykland, 2008], which was applied at the significance level. Overall, the number of days for which we estimated the volatility of volatility is and for, resp., the S&P500 and EUROSTOXX50.

For the estimation, given the choice of the sampling frequency minutes, we selected equal to the Nyquist frequency (see subsection 4.2) and based on the feasible procedure illustrated in subsection 4.3151515The resulting selected is equal to and for, resp., the S&P500 and the EUROSTOXX50.. Figures 1 and 2 display the reconstructed trajectories of the daily volatility of volatility of the two indices, while Table 3 compares sample statistics of volatility and volatility of volatility estimates. Daily volatility estimates were obtained via the Fourier estimator by [Malliavin and Mancino, 2002], applied to 5-minute returns161616All the analyses appearing in this Section and involving the computation of the volatility were also performed using volatility estimates obtained via the 5-minute realized variance and the final outcome was pretty much the same.. We note that all volatility of volatility estimates obtained are strictly positive.

Figures 1 and 2 show that the volatility of volatility spikes during financial crises, while remaining rather low and stable during tranquil periods. Indeed, both daily series reach their three highest peaks in correspondence of, resp., the global financial crisis starting at the end of 2008, the instabilities of the Euro area in the second part of 2011 and the outbreak of the COVID pandemic in the first months of 2020.

Moreover, based on Table 3, we make the following remarks. First, the volatility of volatility is on average smaller than the volatility only in the case of the European stock index. Secondly, the volatility of volatility appears to be more volatile than the volatility itself, as it displays larger sample standard deviations and maxima for both the estimated series. Finally, the volatility of volatility appears to be much more skewed and leptokurtic than the volatility for both the indices.

An analysis of the empirical regularities displayed by the reconstructed daily series of the volatility of volatility of the S&P500 and the EUROSTOXX50 is illustrated in the next subsection.

### 5.2 Insight into volatility-of-volatility stylized facts

The literature on the stylized facts related to the behavior of the volatility of financial assets is very rich (see, for instance, [Andersen et al., 2001], [Patton and Engle, 2001] and [Corsi, 2009], among many others). These include, e.g., clustering, long memory, mean-reversion, log-normality and leverage effects. Nowadays, the volatility can be regarded, in some sense, as a traded asset itself. In fact, it is possible to “trade” the volatility of many financial asset classes via quoted and O.T.C. volatility derivatives (e.g., variance swaps, VIX futures and VIX options). Therefore, it may be of interest to evaluate which typical features of the daily volatility actually apply to the daily volatility of the volatility itself.

We have already observed in the previous subsection that the volatility of volatility shows clusters, being larger in correspondence of crises and smaller and less volatile during periods of economic stability. However, based on the observation of the sample auto-correlation function (see Figures 3 and 4), it appears to be less persistent than the volatility for both the indices considered. As for the mean-reversion property, the Augmented Dickey-Fuller test rejects the hypothesis of a unit root for both volatility of volatility series analysed, at the 0.01% significance level.

We also examine the year-by-year correlation of the daily volatility of volatility with, resp., the daily volatility and the daily log-return, computed as the difference between the closing and opening log-price. The dynamics of such correlations are summarized in Tables 4 and 5, where the values of return-variance correlations, a rough proxy of the leverage effect, are also displayed for comparison171717Note that the correlations appearing in Tables 4 and 5 are typically pushed towards zero by the presence of a finite-sample bias, see [Ait-Sahalia et al., 2013]. For this reason, their true values are likely to be larger, in absolute value. However, obtaining unbiased and efficient estimates of these correlations goes beyond the scope of the exploratory analysis proposed.. For both the indices, we observe that the yearly correlation between the log-return and the volatility of volatility tends to be negative and to follow the return-variance correlation closely, although being most often smaller in absolute value. This result may suggest the existence of a “second-order” leverage effect: what we observe is in fact that when the asset price decreases, not only the volatility increases, due the asset becoming riskier, but also the volatility of volatility - which can be seen as a proxy of the uncertainty about the amount of risk perceived by market operators, that is, the “volatility of risk” - becomes larger. The yearly correlation between the volatility of volatility and the volatility is instead positive and close to on average for both the indices. This is consistent with the presence of volatility-of-volatility peaks in periods of higher market volatility.

Finally, we test for the Gaussianity of the logarithmic volatility and volatility of volatility estimates, using the Jarque-Bera and Anderson-Darling tests at the 5% significance level. The years in which both tests reject the null hypothesis of Gaussianity are 2008, 2011, 2016 and 2020, that is, the years in the sample that were the most characterized by market turmoil (in order: the global financial crisis, the Euro-area instability phase, Brexit and the outbreak of the COVID pandemic). This happens for both the quantities tested and both the indices analyzed, thus suggesting that the log-normal approximation for the distribution of the volatility and the volatility of volatility is more satisfactory in periods of market stability.

## 6 Conclusion

This paper fills a gap in the literature on financial econometrics by deriving the convergence rate of the Fourier estimator of the volatility of volatility. In this regard, we showed that the bias-corrected version of the estimator reaches the optimal rate , while the estimator without bias-correction achieves a sub-optimal rate, but has a smaller asymptotic variance.

Further, we presented a numerical study that shows that the rate-optimal Fourier estimator of the volatility of volatility performs well in finite samples, even at the relatively small daily estimation horizon, where the competing rate-efficient realized estimator shows a poor performance.

Finally, we applied the Fourier estimator to multi-year samples of S&P500 and EUROSTOXX50 observations and gained some new knowledge about the empirical regularities that characterize the daily dynamics of the volatility of volatility, a topic which so far had been scarcely explored in the literature.

## 7 Appendix A: Proofs

The proof of Theorems 3.1 and 3.3 are carried in the next subsections. Some preliminary remarks are useful.

###### Remark 7.1

As every continuous process is locally bounded, thus all processes appearing are. Moreover, standard localization procedures (see, e.g., [Ait-Sahalia and Jacod, 2014]) allow to assume that any locally bounded process is actually bounded and almost sure positive processes can be considered as bounded away from zero.

###### Remark 7.2

In [Malliavin and Mancino, 2009] Lemma 2.2, it is proved that the drift component of the semi-martingale model gives no contribution to the convolution formula (2). Therefore, we will refer to the drift-less model in Assumption (A.I). Moreover, as observed in [Malliavin and Mancino, 2002], we can assume that and . In fact, if (similarly for the process ), we introduce

 ˜p(t)=p(t)−t2π[p(2π)−p(0)].

Then, while satisfies the required assumption, at the same time the volatility and co-volatilities estimations are not affected by a modification of the drift as above. From the point of view of the modeling, we may consider

 d˜p(t)=√v(t)dW(t),
 d˜v(t)=γ(t)dZ(t),

being In fact, for any , it holds , while the -th Fourier coefficient, , is not contributing to the definition of the Fourier volatility of volatility estimator. The situation would change in the spot volatility estimation. However, this is not an issue of the present study as the spot volatility estimation is not required.

### 7.1 Preliminary Decomposition

Given the discrete time observations , we use the notation in continuous time by letting , for the sake of simplicity. From the Itô formula we have the following decomposition of the term (2)

 ck(vn,N):=Ak,n+Bk,n,N+Ck,n,N, (13)

where

 Ak,n:=12π∫2π0e−ikφn(s)v(s)ds, (14)
 Bk,n,N:=12π∫2π0e−ikφn(s)∫s0DN(φn(s)−φn(u))σ(u)dWuσ(s)dWs, (15)
 Ck,n,N:=12π∫2π0∫s0e−ikφn(u)DN(φn(s)−φn(u))σ(u)dWuσ(s)dWs. (16)

It follows that is equal to the following sum:

 2πM+1∑|k|≤M(1−|k|M+1)k2Ak,nA−k,n (17)
 +2πM+1∑|k|≤M(1−|k|M+1)k22(Ak,nB−k,n,N+Ak,nC−k,n,N) (18)
 +2πM+1∑|k|≤M(1−|k|M+1)k2(Bk,n,NB−k,n,N+2Bk,n,NC−k,n,N+Ck,n,NC−k,n,N). (19)

In the sequel we denote:

 Yn,N(t,s):=∫t0DN(φn(s)−φn(u))σ(u)dWu, (20)

and, for brevity, we will also use the following notation for the Dirichlet and the Fejér kernels (see also the Appendix B, Section 8):

 DN,n(s−u):=DN(φn(s)−φn(u)),    FM,n(s−u):=FM(φn(s)−φn(u)), (21)

and similarly for the derivatives of the Fejér kernel181818We stress the point that the notation refers to , which is a function of two time variables, , as stated in (21). It is not to identify to . Notation (21) is maintained in order to highlight the role of the convolution product, which is the main characteristic of the Fourier estimation method in [Malliavin and Mancino, 2009]. and .

In order to identify the different contribution of all terms, we start with equation (17), which can be written as

 2πM+1∑|k|≤M(1−|k|M+1)k2ck(v)c−k(v) (22)
 +2πM+1∑|k|≤M(1−|k|M+1)k2(Ak,nA−k,n−ck(v)c−k(v)). (23)

We consider now the term (18). An application of the Itô formula shows that it is equal to:

 2(AB(i)M,n,N+AB(ii)M,n,N+AC(i)M,n,N+AC(ii)M,n,N),

where

 AB(i)M,n,N:=12π∫2π0∫s01M+1∑|k|≤M(1−|k|M+1)k2eik(φn(s)−φn(u))v(u)duYn,N(s,s)σ(s)dWs
 =−∫2π012π∫s01M+1F′′M,n(s−u)v(u)duYn,N(s,s)σ(s)dWs, (24)
 AB(ii)M,n,N:=∫2π012π∫s01M+1∑|k|≤M(1−|k|M+1)k2e−ik(φn(s)−φn(u))Yn,N(u,s)σ(u)dWuv(s)ds
 =−∫2π012π∫s01M+1F′′M,n(s−u)Yn,N(u,s)σ(u)dWuv(s)ds, (25)

and, letting

 Yk,n,N(t,s):=∫t0e−ikφn(u)DN(φn(s)−φn(u))σ(u)dWu,

then:

 AC(i)M,n,N:=1M+1∑|k|≤M(1−|k|M+1)k212π∫2π0∫s0e−ikφn(u)v(u)duY−k,n,N(s,s)σ(s)dWs
 =−∫2π012π∫s0∫u01M+1F′′M,n(u−u′)DN,n(s−u′)σ(u′)dWu′v(u)duσ(s)dWs (26)
 −∫2