1 Introduction
A Foreign Exchange (FX) option is a contract that gives the buyer the right, but not the obligation, to buy or sell a particular currency at a prespecified rate (exercise price) at a prespecified future date. Implied volatility (IV) is where the market’s expectation of future realised volatility is backed out of traded option prices. Models including Black & Scholes (1973), assume that IV is constant across all option contracts for the same underlying currency. However, constant IV is not what is observed empirically, and IV varies according to both option delta and timetoexpiry. This concept results in nonflat IV surfaces. Further, the shape of these IV surfaces change from day to day, with Deuskar et al. (2008); Konstantinidi et al. (2008) highlighting how important it is to model daily changes in IV surfaces. In order to model and predict these daily IV surface changes we propose a dynamic functional timeseries approach, considering daily IV surfaces as the timeseries of observations.
Theoretical general equilibrium models have been used by David & Veronesi (2000), Guidolin & Timmermann (2003), Garcia et al. (2003) and Bernales & Guidolin (2015), among others, to explain why we observe nonflat IV surfaces. This strand of literature links agent learning and uncertainty in options markets to macroeconomic and dividend fundamentals to explain IV surface shape. Latent factors are then said to drive changes in the shape of the IV surface over time. Bernales & Guidolin (2014, 2015) also conclude that IV predictability is driven by the learning behaviour of agents in options markets. They highlight that as options constitute forwardlooking market information, agents’ expectations of future economic scenarios can be determined by interpreting timetoexpiry as investor forecast horizons. Goncalves & Guidolin (2006) find predictable patterns in how the S&P 500 IV surface changes over time, however they conclude that their results do not constitute firstorder evidence against the efficient market hypothesis. Bernales & Guidolin (2014) conclude that if the efficient market hypothesis is imposed, the discovery of such predictable patterns could be associated with microstructural imperfections and timevarying risk premia. Following Bernales & Guidolin (2014), we hypothesise that isolated pockets of option market inefficiency exist, and that these inefficiencies could lead to the presence of predictable patterns for segments of the IV surface. We seek to identify these patterns using functional timeseries techniques.
In our analysis, we consider IV smiles of a single maturity as a univariate functional timeseries. Nevertheless, IV also varies considerably across time to maturity, as highlighted by Borovkova & Permana (2009), among others. Therefore, when forecasting the IV surface (essentially multiple IV smiles of differing maturities for the same underlying), it could be advantageous to incorporate information from other maturity series simultaneously. Our multivariate and multilevel functional timeseries methods follow this principle, allowing us to extend the previously proposed univariate functional timeseries method. Both multivariate and multilevel functional timeseries methods capture correlation among IV smiles of differing maturities. We illustrate their use in producing forecasts of daily FX IV surfaces comprised of a number of maturities. To the best of our knowledge, no study to date evaluates and compares forecast errors associated with univariate, multivariate, and multilevel functional timeseries methods. We attempt to fill this gap using both static and dynamic functional principal component approaches.
Functional timeseries analysis has become increasingly popular in both implied and realised volatility modelling and forecasting. For instance, Benko et al. (2009) test stochastic similarities and differences between IV smiles of one and threemonth maturity equity options by conducting a functional principal component analysis. Further, Müller et al. (2011) utilise a functional model to characterise high frequency S&P 500 volatility trajectories. They find that when combined with forecasting techniques, their model accurately predicts future volatility levels. Kearney et al. (2015) use the smoothness in the IV smile to obtain a measure of IV steepness in oil options using functional data analysis. In a closely related study to ours, Kearney et al. (2018) model and forecast foreign exchange IV smile separately using static univariate functional timeseries techniques, outperforming benchmark methods from Goncalves & Guidolin (2006) and Chalamandaris & Tsekrekos (2010, 2011).
Functional principal component analysis is the functional counterpart of traditional discrete principal component analysis. Functional principal component analysis reduces the dimensionality of the data to a small number of latent components, in order to effectively summarise the main features of the data. While discrete principal component analysis searches for an optimal set of orthonogal eigenseries that describe the highest level of variance in a discrete data set, functional principal component analysis seeks to identify eigenfunctions that best describe a set of continuous/functional observations. Advances of functional principal component analysis date back to the early 1940s when
Karhunen (1946) and Loève (1946) independently developed a theory on the optimal series expansion of a continuous stochastic process. Later, Rao (1958) and Tucker (1958) provided applications of the KarhunenLoève expansion to functional data, by applying multivariate principal component analysis to observed function values. For additional methodological details of functional principal component analysis, we direct the interested reader to survey articles by Hall (2011); Shang (2014) and Wang et al. (2016).Our paper extends Kearney et al. (2018)
through firstly utilising dynamic functional timeseries specifications, to see if their ability to dynamically update their predictions can help improve forecast accuracy. This dynamic approach captures the inherent timeseries relationship present between daily observations of IV surfaces. When there is moderate to high temporal dependence, such as between daily observations of IV surfaces, the extracted principal components obtained from the (static) functional principal component analysis are not consistent. This inconsistency leads to inefficient estimators. To overcome this issue, we consider a dynamic approach that extracts principal components based on the longrun covariance between daily observations of IV surfaces, instead of using variance alone. Longrun covariance and spectral density estimation enjoy a vast literature in the case of finitedimensional timeseries, beginning with the seminal work of
Bartlett (1950) and Parzen (1957), and still the most commonly used techniques resort to smoothing the periodogram at frequency zero by employing a smoothing weight function and bandwidth parameter. Datadriven bandwidth selection methods have received a great deal of attention (see, e.g., Andrews (1991) for the univariate timeseries context and Rice & Shang (2017) for the functional timeseries context). Similar to a univariate timeseries framework, the longrun covariance includes the variance function as a component, but it also measures temporal autocovariance at different positive and negative lags. The estimation of the longrun covariance is the sum of the empirical autocovariance functions and is often truncated at some finite lag in practice. As shown in Shang (2019) and MartínezHernández et al. (2020), a dynamic functional principal component analysis is a more suitable decomposition than a static functional principal component. Shang (2019) demonstrate this for mortality forecasting, while MartínezHernández et al. (2020) show this for yield curve forecasting. However, we highlight its use for forecasting FX IV surfaces.Our second contribution is the use of the multivariate functional principal component regression (see, e.g., Shang & Yang, 2017) and multilevel functional principal component regression models (see, e.g., Shang et al., 2016) to model and forecast FX IV surfaces. These multivariate and multilevel approaches overcome the issue that twodimensional univariate approaches have when modelling IV surfaces, where they proceed by essentially slicing the surface and modelling the smiles of different maturities separately. There are, however, intuitive and theoretical links between smiles of different maturities for the same IV surface, so we adopt the multivariate and multilevel extensions to explicitly incorporate information derived from all smiles simultaneously. Both multilevel and multivariate regression models can jointly model and forecast multiple functional timeseries that are potentially correlated with each other. By taking into account this correlation between IV smiles of differing maturities, the point and interval forecast accuracies of the IV surfaces are improved.
The remainder of the paper is organised as follows. In Section 2, we introduce the FX options data set. In Section 3, we provide a background to the univariate, multivariate, and multilevel functional timeseries methods, and present extensions based on the dynamic functional principal component analysis. In Section 4, we outline our forecast evaluation procedure and present several forecast error measures. In Section 5, we present comparisons of insample model fitting and outofsample forecast accuracy among the functional timeseries approaches and traditional IV surface forecasting benchmarks. In our comparisons, we consider the benchmark methods of Chalamandaris & Tsekrekos (2011) and Goncalves & Guidolin (2006)
, as well as the random walk (RW) model and an autoregressive model of order 1 (AR(1)). Using the model confidence set procedure of
Hansen et al. (2011), we determine a set of superior forecasting models based on the outofsample accuracy of predicting daily IV surfaces. Further, we present a stylised trading strategy based on the proposed dynamic functional timeseries approach. As a robustness check, we also consider an SPX S&P 500 equity options dataset to further showcase the use of the functional timeseries methods. We outline our conclusions in Section 6, along with some thoughts on how the methods presented here can be extended.2 Data description
We mirror Kearney et al. (2018) in utilising aggregated market participant contributed IV quotes, thus circumventing the need to adopt a specific options pricing model or impose noarbitrage assumptions or calibration methods to back out IV from option prices. The use of contributed IV quotes is a common convention in FX options markets, as unlike equity or commodity options, the contracts are most actively traded overthecounter between market participants, as opposed to being listed on an exchange. We treat these retrieved IV quotes as economic variables of interest in their own right and model how they change over time. Our data set comprises atthemoney, risk reversal, and butterfly composition IV quotes for Euro/United States Dollar (EURUSD), Euro/British Pound (EURGBP), and Euro/Japanese Yen (EURJPY) currency options obtained from Bloomberg. These four currencies account for almost 77% of total global foreign exchange market turnover according to the Bank of International Settlements’s (2016) report. The same four currencies are also considered by Chalamandaris & Tsekrekos (2011).
IV quotes associated with delta values of 5, 10, 15, 25, 35, 50, 65, 75, 85, 90 and 95 are available from Bloomberg. However, in line with Chalamandaris & Tsekrekos (2011) and Kearney et al. (2018), we restrict our forecast prediction to the segments of the surface with the greatest trading intensity; namely, delta values of 10, 25, 50, 75 and 90. Therefore, our models and forecasts focus on these specific delta values. In addition, for each FX rate, we observe three maturities; one month (1M), six months (6M) and two years (2Y). Our data set runs from January 1, 2008 to December 30, 2016.
Figure 1 presents rainbow plots of the EURUSD IV smiles for the three maturities. The rainbow plot is a plot of all the IV data in our data set, with a rainbow colour palette used to indicate when the observations occur (see also Hyndman & Shang, 2010). Days in the distant past are shown in red, with the most recent days presented in violet. We observe that as the maturity period increases from 1M to 2Y, the spread of IV decreases.
Figure 2 presents plots of implied volatilities (IVs) averaged across delta values. They show that the 1Maveraged IVs reached a peak on October 27, 2008 and a trough on August 1, 2014. Both the 6M and 2Yaveraged IVs reached their peaks on December 17, 2008 and troughs on July 3, 2014.
3 Methodology
3.1 Univariate functional timeseries method
3.1.1 Functional principal component analysis
Let
be an arbitrary functional timeseries defined on a common probability space
. It is assumed that the observations are elements of the Hilbert space equipped with the inner product , where represents a continuum and denotes a function support range (i.e., range of delta values), and is the real line. Each function is a squareintegrable function satisfying and associated norms. The notation is used to indicate for some . When , has the mean curve ; when , a nonnegative definite covariance function is given by(1) 
for all . The covariance function in (1) allows the covariance operator of , denoted by , to be defined as
Via Mercer’s lemma, there exists an orthonormal sequence of continuous functions in and a nonincreasing sequence of positive numbers, such that
By the separability of Hilbert spaces, the KarhunenLoève expansion of a stochastic process can be expressed as
(2) 
where principal component scores are given by the projection of in the direction of the eigenfunction , i.e., ; denotes the model truncation error function with a mean of zero and a finite variance; and is the number of retained components. Equation (2) facilitates dimension reduction as the first terms often provide a good approximation to the infinite sums, and thus the information inherited in can be adequately summarised by the
dimensional vector
. These scores constitute an uncorrelated sequence of random variables with zero mean and variance
, and they can be interpreted as the weights of the contribution of the functional principal components to .The estimation accuracy of the functional timeseries method crucially relies on the optimal selection of
. There are at least five conventional approaches; namely, eigenvalue ratio tests
(Ahn & Horenstein, 2013), crossvalidation (Ramsay & Silverman, 2005), Akaike’s information criterion (Akaike, 1974), the bootstrap method (Hall & Vial, 2006) and explained variance (Chiou, 2012). Here, the value of is determined as the minimum that reaches a certain level of the cumulative percentage of variance (CPV) explained by the leading components such thatwhere , is to exclude possible zero eigenvalues, and represents the binary indicator function. As a sensitivity test, we also consider the number of retained components , since there are 5 discrete delta values.
3.1.2 Longrun covariance and its estimation
While Section 3.1.1 presents a method based on the functional principal component analysis extracted from the variance function, these static functional principal components are not consistent in the presence of moderate to strong temporal dependence. This issue motivates the development of dynamic functional principal component analysis, such as in Hörmann et al. (2015) and Rice & Shang (2017), where functional principal components are extracted from the longrun covariance function.
We implement a stationarity test of Horvàth et al. (2014), and from the values reported in Table 3 in Section 5.2, we conclude that all series are stationary. For a stationary functional timeseries , the longrun covariance function is defined as
and is a welldefined element of for a compact support interval
, under mild dependence and moment conditions. By assuming
is a continuous and squareintegrable covariance function, the function induces the kernel operator . Through right integration, defines a HilbertSchmidt integral operator on given bywhose eigenvalues and eigenfunctions are related to the dynamic functional principal components defined in Hörmann et al. (2015). Hörmann et al. (2015) provide asymptotically optimal finitedimensional representations of the sample mean of a functional timeseries.
In practice, we need to estimate from a finite sample . Given its definition as a biinfinite sum, a natural estimator of is
(3) 
where is the bandwidth parameter, and
is an estimator of . is a symmetric weight function with bounded support of order , and the following properties:



if for some

is continuous on

exists and satisfies
The kernel sandwich estimator in (3) is discussed in Horvàth et al. (2013), Panaretos & Tavakoli (2013), Rice & Shang (2017), and MartínezHernández et al. (2020), among others. As with the kernel sandwich estimator, a crucial consideration here is the estimation of the bandwidth parameter . One approach is to use a datadriven approach, such as the plugin algorithm proposed in Rice & Shang (2017). Rice & Shang (2017) prove that the estimator in (3) is a consistent estimator of the true and unknown longrun covariance. They also find that the estimated dynamic functional principal components and associated scores extracted from the estimated longrun covariance are consistent too.
3.1.3 Dynamic functional principal component analysis
Using the estimated longrun covariance function , we apply functional principal component decomposition to extract the dynamic functional principal components and their associated scores. The sample mean and sample covariance are given by
where are the sample eigenvalues of , and are the corresponding orthogonal sample eigenfunctions. Via the KarhunenLoève expansion, the realisations of the stochastic process can be written as
where is the ^{th} estimated principal component score for the ^{th} observation.
Dynamic functional principal component analysis has three properties:

It minimizes the mean integrated squared error of the reconstruction error function over the whole functional data set.

It provides an effective way of extracting a large amount of longrun covariance.

Its dynamic principal component scores are uncorrelated.
3.1.4 Univariate functional timeseries forecasting method
We obtain an IV smile of a specific maturity by interpolating the moneyness level (in terms of delta) of the option contract. We extract estimated functional principal component functions
, using the empirical covariance function. Conditioning on the past functions and the estimated functional principal components , the stepahead point forecast of can be expressed aswhere denotes timeseries forecasts of the principal component scores. The forecasts of these scores can be obtained via a univariate timeseries forecasting method, such as the autoregressive integrated moving average (ARIMA) model (see Ord & Fildes, 2013, for a review). We use the automatic algorithm of Hyndman & Khandakar (2008) to choose the optimal orders of , , and . In the supplement, we also present the results using exponential smoothing, which is a generalisation of the ARIMA model (see, e.g., Kearney & Shang, 2020). The forecast accuracy results are qualitatively similar between the two univariate timeseries methods.
3.2 Multivariate functional timeseries method
Section 3.1 introduces a method for modelling and forecasting one functional timeseries or multiple functional timeseries individually. However, explicitly modelling the correlation between multiple timeseries may improve forecast accuracy. Therefore, we introduce a multivariate functional timeseries method to jointly model and forecast multiple functional timeseries that could be correlated with each other.
Let be the IV smile for 1Mmaturity options, the IV smile for 6Mmaturity options, and the IV smile for 2Ymaturity options. As our multiple functional timeseries have the same function support, we consider data where each observation consists of functions, i.e., , where and in our data set.
The multivariate functional timeseries are stacked in a vector with . We assume that . For , the theoretical crosscovariance function can be defined with elements
where and denote the mean functions for the and series, respectively. By assuming that is a continuous and squareintegrable covariance function, the function induces the kernel operator given by
Via Mercer’s lemma, there exists an orthonormal sequence of continuous functions in and a nonincreasing sequence of positive numbers, such that
By the separability of Hilbert space, the KarhunenLoève expansion of a stochastic process can be expressed as
where denotes the number of retained functional principal components. Its matrix formulation is
where and denote stacked historical functions. Moreover, is the vector of the basis expansion coefficients, , and
Conditioning on the past functions and the estimated functional principal components , the stepahead point forecast of can be expressed as
where denotes the timeseries forecasts of the principal component scores corresponding to the three maturities. In our implementation, each series is standardised to have equal scale before being combined into a long vector (see, e.g., Chiou et al., 2014).
3.3 Multilevel functional timeseries method
The multilevel functional data model bears a strong resemblance to a twoway functional analysis of variance model studied by Morris et al. (2003) and CuestaAlbertos & FebreroBande (2010). It is a special case of the general “functional mixed model” proposed in Morris & Carroll (2006) and applied in Shang (2016) and Shang et al. (2016). As the IV surface comprises distinct IV smiles for multiple maturities of the same currency pair, the basic idea is to decompose the related smiles into an average IV smile across the entire surface , a maturityspecific deviation from the averaged IV smile , a common trend across maturities , and a maturityspecific residual trend
. The common trend and maturityspecific residual trend are modelled by projecting them onto the eigenvectors of covariance functions of the aggregate and maturityspecific centred stochastic processes, respectively. The curve at observation
can be written as(4)  
(5) 
where corresponds to the 1month, 6month and 2year maturities, respectively. To ensure model and parameter identifiability, the observations of the two stochastic processes and are uncorrelated since we implement twostage functional principal component decomposition.
Since the centred stochastic processes and are unknown in practice, the population eigenvalues and eigenfunctions can only be approximated at best through a set of realisations and . From the covariance of , we can extract a set of functional principal components and the associated scores, along with a set of residual functions. From the covariance function of the residual functions, we can then extract a second set of functional principal components and their associated scores. While the first functional principal component decomposition captures the common trend shared across various maturities of the same currency, the second functional principal component decomposition captures the maturityspecific residual trend.
The sample versions of the aggregate mean function, maturityspecific mean function deviation, a common trend, and maturityspecific residual trend for a set of functional data can be estimated by
(6)  
(7) 
(8)  
(9) 
where represents a timeseries of functions for the currency exchange averaged across different maturities; represents the simple average of the averaged currency exchange, whereas represents the simple average of the currency exchange at the maturity; represents the sample principal component scores of ; and denotes the corresponding orthogonal sample eigenfunctions in a squareintegrable function space.
Substituting equations (6) to (9) into equations (4)–(5), we obtain
where denotes measurement error with a finite variance .
To select the optimal number of components, we use a CPV criterion to determine and . The optimal numbers of and are determined by
where denotes a binary indicator function. Following Chiou (2012), we chose .
From the estimated eigenvalues, we can estimate the proportion of variability explained by aggregated data (also known as the variance explained by the withincluster variability, see for example, Di et al., 2009). A possible measure of withincluster variability is given by
(10) 
where denotes the population variance, which can be approximated by the estimated and retained eigenvalues. When the common trend can explain the primary mode of total variability, the value of the withincluster variability is close to 1.
Conditioning on observed data and basis functions and , the stepahead forecasts can be obtained by
where and are the forecasted principal component scores, obtained from a univariate timeseries forecasting method, such as an ARIMA model.
While Sections 3.2 and 3.3 present static multivariate and multilevel functional timeseries methods, dynamic counterparts can be produced using dynamic functional principal components and their associated scores. These quantities are constructed based on an estimated longrun covariance function in (3).
4 Forecast evaluation
4.1 Expandingwindow approach
An expandingwindow analysis of timeseries models is commonly used to assess model and parameter stability over time. It determines the constancy of a model’s parameters by computing parameter estimates, and their forecasts over an expanding window of a fixed size through the sample (see Zivot & Wang, 2006, Chapter 9 for details). Using the first 1,827 observations from January 1, 2008 to December 31, 2014 for each currency pair, we produce onestepahead point forecasts. Through an expanding window approach, we reestimate the parameters in all models using the first 1,828 observations from January 1, 2008 to January 1, 2015. Onestepahead forecasts are then produced for each model. We iterate this process by increasing the sample size by one trading day until we reach the end of the data period. This process produces 522 onestepahead forecasts from January 1, 2015 to December 30, 2016. We compare these forecasts with the holdout samples to determine outofsample point forecast accuracy. This evaluation is in line with Chalamandaris & Tsekrekos (2010), who adopt a recursive onedayahead strategy scheme. Also, we consider 518 fivestepahead forecasts and 513 tenstepahead forecasts to assess forecast accuracy at comparably longer horizons. We choose to expand the training set and reestimate all the models daily to incorporate all available information into our prediction. This approach more accurately simulates the action likely to be taken by a market practitioner who aims to predict the following day’s movement.
To control for sensitivity to specific outofsample periods, various window lengths are tested: In the case of onestepahead prediction, we consider 522 days (outofsample: January 1, 2015 to December 30, 2016), 261 days (outofsample: January 1, 2016 to December 30, 2016), 131 days (outofsample: July 1, 2016 to December 30, 2016). The outofsample forecasts between the end of the insample period and December 30, 2016 are obtained using an expandingwindow scheme.
4.2 Forecast error measures
A number of criteria exist for evaluating point forecast accuracy (see Armstrong & Collopy, 1992; Hyndman & Koehler, 2006). They can be grouped into four categories: (1) scaledependent metrics, (2) percentageerror metrics, (3) relativeerror metrics, which average the ratios of the errors from a designated method to the errors of a naïve method, and (4) scalefree error metrics, which express each error as a ratio to the average error from a baseline method. Here, we use the following three accuracy measures:

Mean absolute forecast error (MAFE); a measure of the absolute difference between the forecast, , and the corresponding observation, . It measures the average error magnitude in the forecasts, regardless of error direction and serves to aggregate the errors into a single measure of predictive power. The MAFE can be expressed as
where denotes the number of discretised points in a curve and denotes the number of observations in the forecasting period; denotes the holdout sample for the delta and curve in the forecasting period; and denotes the point forecast of the holdout sample.

Mean squared forecast error (MSFE); a measure of the squared difference between the predicted and realised values, which again serves to aggregate the errors into a single measure of predictive power. The MSFE can be expressed as

Mean mixed error (MME); an asymmetric loss function. MME(U) penalises underpredictions more heavily, while MME(O) penalises overpredictions more heavily. This feature is crucial for investors in options markets, as an under (over)prediction of IV is more likely to be of greater concern to a seller (buyer) than a buyer (seller). This measure has previously been used in studies that evaluate volatility forecasting techniques, such as
Brailsford & Faff (1996), Fuertes et al. (2009), and Kearney et al. (2018). With a slight abuse of notation , the MME(U) and MME(O) are given asMME(U) MME(O) where denotes the number of underpredictions and denotes the number of overpredictions. represents the indices of the overpredictions, and represents the indices of the underpredictions.
5 Data analysis results
This section presents the results of modelling IV using the three functional timeseries models for the entire data sample from January 1, 2008 to December 30, 2016. Using the entire EURUSD dataset, we present the first two estimated static and dynamic principal components and their associated scores in Figure 3. While the static principal components are extracted based on sample variance, the dynamic principal components are extracted based on sample longrun covariance. When a timeseries has moderate to strong temporal dependence, the extracted static and dynamic principal components are different, as shown in Figure 3. Due to the variation in the extracted principal components, the associated scores for both approaches also exhibit visible differences.
In Section 5.1, the models are fitted insample to ascertain goodnessoffit when seeking to capture the empirical dynamics of the IV surfaces for each currency pair during the entire sample period. In Section 5.2, we evaluate and compare outofsample forecast accuracy among the functional timeseries methods. In Section 5.3, we examine statistical significance based on point forecast accuracy.
5.1 Insample model fitting
Three functional timeseries models are fitted to the underlying IV data, for each day over the full sample of January 1, 2008 to December 30, 2016. There are various ways of assessing the fit of a linear model, and we initially consider a criterion introduced by Ramsay & Silverman (2005, Section 16.3). The approach borrowed from conventional linear models is to consider the squared correlation function:
In Figure 4, we present over the five delta values, for the three maturities and three currencies.
Total is useful if one requires a single numerical measure of fit. It can be expressed as
The total for the three maturities for each currency pair are listed in Table 1. Kearney et al. (2018) find that the univariate functional timeseries model provides the best goodnessoffit in comparison to the benchmark methods of Goncalves & Guidolin (2006) and Chalamandaris & Tsekrekos (2010, 2011). Here, we improve on Kearney et al. (2018) by showing that the goodnessoffit of the univariate functional timeseries model they propose is outperformed by our multivariate and multilevel functional timeseries models, as measured by total . We also observe that the optimal values are often associated with the dynamic methods.
Maturity  
Currency  Method  1M  6M  2Y 
EURUSD  FTS  0.9387  0.9246  0.9174 
MFTS  0.9773  0.9842  0.9797  
MLFTS  0.9784  0.9715  0.9668  
DFTS  0.9976 
0.9939 
0.9877  
DMFTS  0.9849  0.9934  0.9914  
DMLFTS 
0.9988 
0.9885 
0.9977 

EURGBP  FTS  0.9360  0.9207  0.9031 
MFTS  0.9811  0.9842  0.9723  
MLFTS  0.9752  0.9689  0.9265  
DFTS  0.9975  0.9929  0.9764  
DMFTS  0.9896  0.9942 
0.9957 

DMLFTS 
0.9979 
0.9972 
0.9796  
EURJPY  FTS  0.8553  0.8331  0.7869 
MFTS  0.9825  0.9721  0.9277  
MLFTS  0.9488  0.8863  0.9041  
DFTS  0.9032  0.8850  0.8337  
DMFTS  0.9872 
0.9929 
0.9848 

DMLFTS 
0.9892 
0.9594  0.9811 
A key measure of goodnessoffit for the multilevel functional timeseries method is the withincluster variability defined in (10). When the common factor explains the primary mode of total variability, the value of withincluster variability is close to 1. We calculate this withincluster variability for each currency pair and maturity period in Table 2.
Currency  Method  1M  6M  2Y 

EURUSD  MLFTS  0.7507  0.8021 
0.8360 
DMLFTS 
0.8384 
0.8604 
0.8251  
EURGBP  MLFTS  0.7423  0.8087  0.7134 
DMLFTS 
0.7672 
0.8197 
0.7340 

EURJPY  MLFTS 
0.6048 
0.6918  0.7117 
DMLFTS  0.6047 
0.7258 
0.7854 
5.2 Outofsample forecast evaluation
It can be seen from the previous section that, the multivariate functional timeseries and multilevel functional timeseries models provide better insample fit when seeking to capture shape of the IV surface. We now turn our attention to outofsample forecasting.
5.2.1 Stationarity test of functional timeseries
Since the computation of longrun covariance is only meaningful if the functional timeseries is stationary, we implement the stationarity test of Horvàth et al. (2014) and its corresponding R function T_stationary in the ftsa package (Hyndman & Shang, 2020). Based on the values reported in Table 3, we conclude that all the functional timeseries we consider are stationary.
Currency  1M  6M  2Y 

EURUSD  0.533  0.424  0.415 
EURGBP  0.771  0.747  0.487 
EURJPY  0.455  0.646  0.363 
With the null hypothesis being that of stationarity, this table presents the probabilities of rejecting the null hypothesis (
values) for each currency and maturity using the stationarity test of Horvàth et al. (2014).5.2.2 EurUsd
A summary of the outofsample forecast measures calculated under a recursive parameter estimation scheme and 522day outofsample window length for EURUSD are presented in Table 4. We consider the Chalamandaris & Tsekrekos’s (2011) method, the Goncalves & Guidolin’s (2006) method, RW and AR(1) as benchmakr methods. We use a generalised least squares (GLS) procedure to produce these Goncalves & Guidolin (2006) and Chalamandaris & Tsekrekos (2011) benchmarks, in line with Bernales & Guidolin (2014). GLS is used to control for the option measurement errors identified by Hentschel (2003). We direct the interested reader to Appendix B of Goncalves & Guidolin (2006) for the technical detail of applying the GLS method suggested by Hentschel (2003) in the context of deterministic IV surface models.
CPV  

Horizon  Maturity  Method  MME(U)  MME(O)  MME(U)  MME(O)  
1 day  1M  FTS  0.5936  0.5709  0.6653  0.6356  0.5610  0.5155  0.6414  0.6062 
MFTS  0.6557  0.7748  0.7016  0.6878  0.5237  0.4776  0.6090  0.5749  
MLFTS  0.5074  0.4678  0.5910  0.5624  0.4429  0.3701  0.5314  0.5128  
DFTS  0.4346  0.3586  0.5187  0.5094  0.3975  0.3177  0.4938  0.4669  
DMFTS  0.6426  0.7541  0.6830  0.6831  0.4587  0.3916  0.5503  0.5204  
DMLFTS  0.5493  0.5024  0.6100  0.6194  0.4023  0.3252  0.4978  0.4717  
CT11  0.6040  0.6486  0.6479  0.6582  
GG06  0.6675  0.7272  0.7097  0.7063  
RW  0.4071  0.3311  0.4983  0.4793  
AR(1)  0.4129  0.3328  0.5120  0.4802  
6M  FTS  0.5026  0.3950  0.6258  0.5355  0.4580  0.3323  0.5941  0.4919  
MFTS  0.3708  0.2821  0.4685  0.4496  0.3791  0.2959  0.4699  0.4616  
MLFTS  0.3674  0.2772  0.4834  0.4253  0.2573  0.1585  0.3805  0.3277  
DFTS  0.3649  0.2462  0.4472  0.4719  0.2308  0.1557  0.3332  0.3142  
DMFTS  0.3748  0.2858  0.4439  0.4808  0.4013  0.3205  0.4610  0.5095  
DMLFTS  0.3961  0.2836  0.4638  0.5079  0.2288  0.1490  0.3320  0.3130  
CT11  0.7322  0.8314  0.7871  0.7370  
GG06  0.7557  0.9552  0.7966  0.7549  
RW  0.2358  0.1775  0.3353  0.3175  
AR(1)  0.2395  0.1704  0.3361  0.3187  
2Y  FTS  0.4629  0.3101  0.6122  0.4885  0.4162  0.2401  0.5815  0.4430  
MFTS  0.3360  0.1977  0.4880  0.3738  0.3374  0.1879  0.4970  0.3728  
MLFTS  0.3070  0.1772  0.4399  0.3631  0.2280  0.0953  0.3713  0.2892  
DFTS  0.3259  0.1527  0.4213  0.4423  0.1617  0.0582  0.2643  0.2556  
DMFTS  0.2713  0.1384  0.4015  0.3383  0.2600  0.1206  0.4082  0.3149  
DMLFTS  0.1960  0.0746  0.2949  0.3001  0.1576  0.0561  0.2604  0.2504  
CT11  0.2548  0.1203  0.3585  0.3529  
GG06  0.2435  0.1067  0.3521  0.3403  
RW  0.1593  0.0629  0.2599  0.2479  
AR(1)  0.1626  0.0643  0.2596  0.2466  
Average  FTS  0.5197  0.4253  0.6344  0.5532  0.4784  0.3626  0.6057  0.5137  
MFTS  0.4542  0.4182  0.5527  0.5037  0.4134  0.3205  0.5253  0.4698  
MLFTS  0.3939  0.3074  0.5048  0.4503  0.3094  0.2080  0.4277  0.3766  
DFTS  0.3751  0.2525  0.4624  0.4745  0.2633  0.1772  0.3638  0.3456  
DMFTS  0.4295  0.3928  0.5095  0.5007  0.3733  0.2776  0.4732  0.4483  
DMLFTS  0.3805  0.2869  0.4562  0.4758 
0.2629 
0.1768 
0.3634 
0.3450 

CT11  0.5303  0.5334  0.5978  0.5827  
GG06  0.5556  0.5964  0.6195  0.6005  
RW  0.2674  0.1905  0.3645  0.3482  
AR(1)  0.2717  0.1892  0.3692  0.3452  
5 day  1M  FTS  0.9598  1.5348  0.9398  0.9200  0.9481  1.5261  0.9327  0.9075 
MFTS  1.0091  1.8200  0.9640  0.9579  0.9365  1.5892  0.9172  0.8980  
MLFTS  0.9298  1.5650  0.9284  0.8769  0.9116  1.5657  0.9140  0.8607  
DFTS  0.9172  1.5958  0.9095  0.8703  0.9002  1.5563  0.8981  0.8554  
DMFTS  1.0162  1.8861  0.9672  0.9597  0.9270  1.6338  0.9075  0.8862  
DMLFTS  0.9652  1.6986  0.9411  0.9136  0.9092  1.5883  0.9047  0.8629  
CT11  0.9953  1.7950  0.9612  0.9355  
GG06  1.0148  1.8012  0.9831  0.9459  
RW  0.9100  1.5753  0.9080  0.8625  
AR(1)  0.9203  1.5898  0.9206  0.8879  
6M  FTS  0.6420  0.6596  0.7153  0.6685  0.6124  0.6189  0.6935  0.6397  
MFTS  0.5758  0.6414  0.6477  0.6105  0.5753  0.6440  0.6455  0.6109  
MLFTS  0.5640  0.6032  0.6400  0.6006  0.5123  0.5325  0.5942  0.5606  
DFTS  0.5732  0.6186  0.6290  0.6291  0.5042  0.5289  0.5791  0.5584  
DMFTS  0.5853  0.6757  0.6476  0.6248  0.5983  0.7091  0.6570  0.6339  
DMLFTS  0.5933  0.6528  0.6362  0.6540  0.5000  0.5248  0.5761  0.5533  
CT11  0.8295  1.2002  0.8687  0.7917  
GG06  0.8726  1.3253  0.8884  0.8330  
RW  0.5127  0.5441  0.5874  0.5657  
AR(1)  0.5140  0.5362  0.5943  0.5638  
2Y  FTS  0.5389  0.4449  0.6454  0.5729  0.5051  0.3882  0.6244  0.5399  
MFTS  0.4615  0.3782  0.5626  0.5124  0.4583  0.3668  0.5610  0.5095  
MLFTS  0.4409  0.3503  0.5348  0.5045  0.3995  0.2974  0.4982  0.4710  
DFTS  0.4656  0.3595  0.5396  0.5515  0.3733  0.2662  0.4662  0.4575  
DMFTS  0.4259  0.3418  0.5196  0.4929  0.4194  0.3233  0.5154  0.4873  
DMLFTS  0.3904  0.2840  0.4775  0.4760  0.3742  0.2669  0.4658  0.4595  
CT11  0.4203  0.3196  0.5062  0.5005  
GG06  0.4220  0.3206  0.5088  0.5020  
RW  0.3765  0.2713  0.4692  0.4593  
AR(1)  0.3754  0.2647  0.4656  0.4543  
Average  FTS  0.7148  0.8862  0.7668  0.7215  0.6885  0.8444  0.7502  0.6957  
MFTS  0.6821  0.9465  0.7248  0.6936  0.6567  0.8667  0.7079  0.6728  
MLFTS  0.6449  0.8395  0.7011  0.6607  0.6078  0.7985  0.6688  0.6308  
DFTS  0.6520  0.8580  0.6926  0.6837 
0.5926 
0.7838 
0.6478 
0.6238 

DMFTS  0.6758  0.9679  0.7115  0.6925  0.6482  0.8887  0.6933  0.6691  
DMLFTS  0.6496  0.8785  0.6849  0.6812  0.5945  0.7933  0.6489  0.6252  
CT11  0.7484  1.1049  0.7787  0.7426  
GG06  0.7698  1.1490  0.7934  0.7603  
RW  0.5997  0.7969  0.6549  0.6292  
AR(1)  0.6032  0.7969  0.6602  0.6353  
10 day  1M  FTS  1.2442  2.4842  1.1378  1.1328  1.2452  2.5329  1.1433  1.1267 
MFTS  1.2935  2.8260  1.1776  1.1548  1.2534  2.6819  1.1543  1.1210  
MLFTS  1.2405  2.6064  1.1631  1.0944  1.2527  2.7404  1.1746  1.0978  
DFTS  1.2862  2.8628  1.1875  1.1350  1.2747  2.8248  1.1792  1.1262  
DMFTS  1.3352  3.0192  1.2095  1.1798  1.2834  2.8693  1.1732  1.1418  
DMLFTS  1.2865  2.8253  1.1816  1.1383  1.2774  2.8527  1.1801  1.1283  
CT11  1.3039  2.9100  1.1966  1.1457  
GG06  1.2983  2.8572  1.1936  1.1419  
RW  1.2786  2.8572  1.1815  1.1264  
AR(1)  1.2628  2.7316  1.1809  1.1291  
6M  FTS  0.7615  0.9441  0.7948  0.7717  0.7454  0.9348  0.7850  0.7538  
MFTS  0.7324  0.9967  0.7768  0.7300  0.7347  1.0048  0.7812  0.7280  
MLFTS  0.7138  0.9389  0.7529  0.7239  0.6908  0.9378  0.7376  0.6979  
DFTS  0.7386  1.0367  0.7624  0.7498  0.6849  0.9482  0.7300  0.6916  
DMFTS  0.7525  1.0859  0.7918  0.7407  0.7635  1.1112  0.8037  0.7469  
DMLFTS  0.7625  1.0729  0.7856  0.7664  0.6866  0.9424  0.7287  0.6965  
CT11  0.9473  1.6022  0.9614  0.8680  
GG06  1.0002  1.7614  0.9880  0.9216  
RW  0.6843  0.9544  0.7323  0.6942  
AR(1)  0.6863  0.9337  0.7347  0.6947  
2Y  FTS  0.6001  0.5765  0.6752  0.6368  0.5793  0.5363  0.6631  0.6173  
MFTS  0.5508  0.5400  0.6177  0.6021  0.5522  0.5417  0.6157  0.6062  
MLFTS  0.5340  0.5083  0.5973  0.5971  0.5187  0.4907  0.5906  0.5795  
DFTS  0.5820  0.5690  0.6359  0.6443  0.5085  0.4772  0.5824  0.5700  
DMFTS  0.5399  0.5360  0.6076  0.5943  0.5374  0.5222  0.6016  0.5976  
DMLFTS  0.5166  0.4836  0.5842  0.5821  0.5081  0.4709  0.5792  0.5743  
CT11  0.5416  0.5213  0.6071  0.6029  
GG06  0.5443  0.5316  0.6038  0.6091  
RW  0.5058  0.4717  0.5829  0.5691  
AR(1)  0.5120  0.5328  0.5812  0.5704  
Average  FTS  0.8702  1.3441  0.8707  0.8480  0.8566 
1.3347 
0.8638  0.8326  
MFTS  0.8589  1.4542  0.8574  0.8290  0.8468  1.4095  0.8504  0.8184  
MLFTS  0.8294  1.3512  0.8377  0.8051  0.8207  1.3896  0.8343 
0.7917 

DFTS  0.8689  1.4893  0.8619  0.8430  0.8227  1.4167  0.8305  0.7959  
DMFTS  0.8759  1.5470  0.8696  0.8383  0.8614  1.5009  0.8595  0.8288  
DMLFTS  0.8552  1.4606  0.8505  0.8289  0.8240  1.4220 
0.8293 
0.7997  
CT11  0.9309  1.6778  0.9217  0.8722  
GG06  0.9476  1.7167  0.9285  0.8909  
RW  0.8229  1.4278  0.8322  0.7966  
AR(1) 
0.8204 
1.3394  0.8323  0.7981 
In Table 5, we present a comparison of outofsample accuracy for forecasting EURUSD IV for the five delta values considered. The errors are aggregated over three maturities of one month, six months and two years. In the supplement, we also present the results using exponential smoothing, which is a generalisation of the ARIMA model (see, e.g., Kearney & Shang, 2020).
Horizon  Delta  Delta  

Method  10  25  50  75  90  10  25  50  75  90  
1 day  
CPV  FTS  0.7114  0.4847  0.3888  0.4507  0.5628  0.7330  0.3712  0.2442  0.2928  0.4853 
DFTS  0.4811  0.3305  0.4047  0.2882  0.3711  0.3981  0.2273  0.2491  0.1484  0.2396  
MFTS  0.6692  0.4438  0.2969  0.3430  0.5178  0.8482  0.3783  0.1634  0.2171  0.4841  
DMFTS  0.6273  0.4518  0.2869  0.3087  0.4729  0.8019  0.3716  0.1610  0.1997  0.4296  
MLFTS  0.5441  0.3520  0.2735  0.3400  0.4599  0.5283  0.2509  0.1544  0.2095  0.3939  
DMLFTS  0.4819  0.3568  0.4003  0.2896  0.3738  0.4393  0.2621  0.2857  0.1691  0.2781  
CT11  0.8555  0.5321  0.2413  0.4563  0.5664  1.1277  0.5310  0.1411  0.3431  0.5244  
GG06  0.8299  0.5783  0.3702  0.4428  0.5565  1.2154  0.5403  0.2565  0.3739  0.5959  
RW  0.4388  0.3692  0.3255  0.3133  0.3404  0.4113  0.2817  0.2264  0.2070  0.2762  
AR(1)  0.4440  0.3738  0.3296  0.3178  0.3432  0.4117  0.2827  0.2277  0.2085  0.2651  
FTS  0.4916  0.5449  0.4583  0.4903  0.4069  0.4099  0.4552  0.3176  0.3459  0.2846  
DFTS  0.3392  0.2759  0.2364  0.2252  0.2400  0.2977  0.1835  0.1303  0.1127  0.1618  
MFTS  0.6084  0.4289  0.3088  0.3101  0.4107  0.6274  0.3191  0.1709  0.1721  0.3129  
DMFTS  0.5603  0.4209  0.3045  0.2640  0.3171  0.5566  0.3144  0.1741  0.1357  0.2069  
MLFTS  0.3777  0.3094  0.2601  0.2687  0.3310  0.3273  0.2048  0.1406  0.1404  0.2267  
DMLFTS  0.3393  0.2734  0.2328  0.2233  0.2457 
Comments
There are no comments yet.