
Implicit Copulas: An Overview

Implicit copulas are the most common copula choice for modeling dependence in high dimensions. This broad class of copulas is introduced and surveyed, including elliptical copulas, skew t copulas, factor copulas, time series copulas and regression copulas. The common auxiliary representation of implicit copulas is outlined, and how this makes them both scalable and tractable for statistical modeling. Issues such as parameter identification, extended likelihoods for discrete or mixed data, parsimony in high dimensions, and simulation from the copula model are considered. Bayesian approaches to estimate the copula parameters, and predict from an implicit copula model, are outlined. Particular attention is given to implicit copula processes constructed from time series and regression models, which is at the forefront of current research. Two econometric applications – one from macroeconomic time series and the other from financial asset pricing – illustrate the advantages of implicit copula models.


1 Introduction

Copulas are widely used to specify multivariate distributions for the statistical modeling of data. Fields where copula models have had a significant impact include (but are not limited to) actuarial science (Frees and Valdez, 1998), finance (Cherubini et al., 2004; McNeil et al., 2005; Patton, 2006), hydrology (Favre et al., 2004; Genest et al., 2007), climatology (Schoelzel and Friederichs, 2008), transportation (Bhat and Eluru, 2009; Smith and Kauermann, 2011) and marketing (Danaher and Smith, 2011; Park and Gupta, 2012). Copula models are popular because they simplify the specification of a distribution, allowing the marginals to be modeled arbitrarily, and then combined using a copula function. In practice, a major challenge is the selection and estimation of a copula function that captures the dependence structure well and is tractable. One choice is “implicit copulas”, which are copulas constructed from existing multivariate distributions by the inversion of Sklar’s theorem as in Nelsen (2006, p.51). This is a large and flexible family of copulas, which share an auxiliary representation that makes estimation tractable in high dimensions. Thus, they are suitable for modeling the large datasets that arise in many modern applications. The objective of this paper is to introduce and survey implicit copulas and their use in statistical modeling in an accessible manner.

Implicit copulas have a long history with key developments spread across multiple fields, including actuarial studies, econometrics, operations research, probability and statistics. Yet while there are many excellent existing monographs and surveys on copulas and copula models (see Genest and MacKay (1986); Joe (1997); McNeil et al. (2005); Nelsen (2006); Genest and Nešlehová (2007); Jaworski et al. (2010); Patton (2012); Nikoloulopoulos (2013a); Joe (2014) and Durante and Sempi (2015) for prominent examples), there does not appear to be a dedicated survey or overview on this important class of copulas. This paper aims to fill this gap and provides an overview that stresses common features of the implicit copula family, likelihood-based estimation, and the usefulness of implicit copulas in statistical modeling. Particular focus is given to recent developments on implicit copula processes for regression and time series data, along with Bayesian inference that extends the earlier overview by Smith (2013) to these copula processes.

Two econometric applications illustrate the use of implicit copula models with non-Gaussian data. The first is a time-varying heteroscedastic time series model for U.S. inflation between 1954:Q1 and 2020:Q2. The implicit copula is a copula process constructed from a nonlinear state space model as in Smith and Maneesoonthorn (2018). It is a “time series copula” that captures serial dependence. The second application is a five factor asset pricing regression model (Fama and French, 2015) with an asymmetric Laplace marginal distribution for monthly equity returns. The implicit copula here is a “regression copula” process with respect to the covariates as in Klein and Smith (2019). The copula model forms a distributional regression (Klein et al., 2015; Kneib et al., 2021), where the five factors affect the entire distribution of equity returns, not just its first or other moments. In both applications the implicit copulas are of dimension equal to the number of observations, so that they are high-dimensional. Nevertheless, their auxiliary representation allows for likelihood-based estimation of the copula parameters. In both examples the marginal distributions of the response variables exhibit strong asymmetries.

The overview is organized as follows. Section 2 introduces general copula models, and then implicit copulas specifically. Their interpretation as transformations and specifications for variables that are continuous, discrete or mixed are also discussed. Section 3 covers elliptical and skew-elliptical copulas, including the Gaussian, t, skew t and factor copulas. Implicit copulas that capture serial dependence in time series data are covered in Section 4. Section 5 extends these to implicit copulas that capture both serial and cross-sectional dependence in multivariate time series. Section 6 covers regression copula processes, with the implicit copula constructed from a regularized linear regression given in detail. It is shown that when this copula is combined with flexible marginals, it defines a promising new distributional regression model. Last, Section 7 discusses the advantages of using implicit copula models for modeling data, and future directions.

2 Implicit copulas

2.1 Copula models in general

All copula models are based on the theorem of Sklar (1959) (i.e. “Sklar’s theorem”), which states that for every random vector $Y=(Y_1,\dots,Y_m)^\top$ with distribution function $F_Y$ and marginals $F_{Y_1},\dots,F_{Y_m}$, there exists a “copula function” $C$, such that

$$F_Y(y)=C\left(F_{Y_1}(y_1),\dots,F_{Y_m}(y_m)\right), \tag{1}$$

where $y=(y_1,\dots,y_m)^\top$. The copula function $C$ is a well-defined distribution function for a random vector $U=(U_1,\dots,U_m)^\top$ on the unit cube $[0,1]^m$ with uniform marginal distributions. To construct a copula model, select $F_{Y_1},\dots,F_{Y_m}$ (i.e. the “marginal models”) and a copula function $C$, to define $F_Y$ via (1).

2.1.1 Continuous case

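If all the elements of $Y$ are continuous, then differentiating through (1) gives the density

$$f_Y(y)=\frac{\partial^m}{\partial y_1\cdots\partial y_m}F_Y(y)=c\left(F_{Y_1}(y_1),\dots,F_{Y_m}(y_m)\right)\prod_{j=1}^m f_{Y_j}(y_j), \tag{2}$$

where $f_{Y_j}$ is the marginal density of $Y_j$, and $c(u)=\frac{\partial^m}{\partial u_1\cdots\partial u_m}C(u)$ is widely called the “copula density”, with $u=(u_1,\dots,u_m)^\top$ and $u_j=F_{Y_j}(y_j)$. (Throughout this paper the notation $C(u)$ and $C(u_1,\dots,u_m)$ are used interchangeably, as are $c(u)$ and $c(u_1,\dots,u_m)$.) The decomposition at (2) is used to specify the likelihood of a continuous response vector in a statistical model.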

2.1.2 Discrete case

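If all the elements of $Y$ are discrete-valued (e.g. as with ordinal or binary data) the probability mass function is obtained by differencing over the elements of $u$ as follows. Let $b_j=F_{Y_j}(y_j)$ and $a_j=F_{Y_j}(y_j^-)$ be the left-hand limit of $F_{Y_j}$ at $y_j$ (which is $F_{Y_j}(y_j-1)$ for ordinal $Y_j$). Then the mass function is

$$f_Y(y)=\Pr(Y_1=y_1,\dots,Y_m=y_m)=\Delta_{a_1}^{b_1}\Delta_{a_2}^{b_2}\cdots\Delta_{a_m}^{b_m}C(v), \tag{3}$$

where $v=(v_1,\dots,v_m)^\top$ is a differencing vector, and the notation

$$\Delta_{a_j}^{b_j}C(u_1,\dots,u_{j-1},v_j,u_{j+1},\dots,u_m)=C(u_1,\dots,u_{j-1},b_j,u_{j+1},\dots,u_m)-C(u_1,\dots,u_{j-1},a_j,u_{j+1},\dots,u_m).$$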
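The differencing at (3) can be sketched in code as an alternating-sign sum of the copula function over the $2^m$ vertices of the hyper-rectangle $[a_1,b_1]\times\cdots\times[a_m,b_m]$. This is a minimal illustration only: the function names `pmf_by_differencing` and `independence_copula` are hypothetical, and the independence copula is an assumption chosen because the answer can be checked by hand.

```python
from itertools import product
from math import prod


def independence_copula(u):
    """C(u) = u_1 * ... * u_m; a stand-in copula function used for illustration."""
    return prod(u)


def pmf_by_differencing(copula, a, b):
    """Evaluate the mass function at (3) by differencing a copula function.

    `a` and `b` hold the lower and upper limits a_j and b_j. The loop visits
    all 2^m vertices, which is why the cost grows exponentially in m.
    """
    m = len(a)
    total = 0.0
    for pick_lower in product((0, 1), repeat=m):
        vertex = [a[j] if pick_lower[j] else b[j] for j in range(m)]
        sign = (-1) ** sum(pick_lower)  # one minus sign per lower limit used
        total += sign * copula(vertex)
    return total
```

With the independence copula the result reduces to the product of the interval widths $b_j-a_j$, which gives a simple check of the sign convention.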

Evaluating the mass function at (3) is an $O(2^m)$ computation, so that its direct evaluation is impractical for high values of $m$ when undertaking likelihood-based estimation (Nikoloulopoulos, 2013a). One solution suggested by Smith and Khaled (2012) is to consider the joint distribution of $(Y,U)$. To do so, note that when $Y_j$ is discrete, $F_{Y_j}$ is a many-to-one function and $Y_j|U_j$ has a degenerate distribution with density $f(y_j|u_j)=1(a_j\le u_j<b_j)$, where the indicator function $1(A)=1$ if $A$ is true, and zero otherwise. (An alternative notation is to use the Dirac delta function located at $F_{Y_j}^{-1}(u_j)$, where $F_{Y_j}^{-1}$ is the quantile function of $Y_j$.) Then the mixed density of $(Y,U)$ is

$$f_{Y,U}(y,u)=f(y|u)\,c(u)=\prod_{j=1}^m\left\{1(a_j\le u_j<b_j)\right\}c(u). \tag{4}$$

Marginalizing over $u$ gives the probability mass function at (3) (i.e. $\int f_{Y,U}(y,u)\,du=f_Y(y)$); see Proposition 1 in Smith and Khaled (2012).

Equation (4) can be used to define an “extended likelihood” for estimation using computational methods for latent variables, where the observations on $U$ are the latents. This has two advantages. First, the $O(2^m)$ computation at (3) is avoided, allowing estimation for higher values of $m$. Second, only the copula density $c$ is required and not the copula function $C$, which is an advantage for some copulas where only $c$ can be computed, as is the case with most vine copulas (Joe, 1996; Aas et al., 2009). Bayesian data augmentation can be used based on (4), and evaluated using Markov chain Monte Carlo (MCMC) as in Smith and Khaled (2012) or variational Bayes methods as in Loaiza-Maya and Smith (2019). The latter is particularly attractive, because it allows for the estimation of discrete-margined copulas of very high dimensions, as demonstrated in the examples presented by these authors.

2.1.3 Mixed cases

If some elements of $Y$ are continuous and others discrete, then $f_Y$ is often called a “mixed density”. In this case, an extended likelihood can be constructed from the distribution of $Y$ joint with the elements of $U$ that correspond only to the discrete variables; see Smith and Khaled (2012, Sec.6). Similarly, if some individual elements have distributions that are mixtures of continuous and discrete distributions (such as a zero-inflated continuous distribution) then an extended likelihood can also be constructed for this case; see Gunawan et al. (2020) for how to do so.

2.2 The basic idea of an implicit copula

McNeil et al. (2005, p.190) use the term “implicit copula” for the copula that is implicit in the multivariate distribution of a continuous random vector $Z=(Z_1,\dots,Z_m)^\top$. It is obtained by inverting Sklar’s theorem, which Nelsen (2006, p.51) calls the “inversion method”, so that copulas derived in this fashion are also called “inversion copulas” (e.g. Smith and Maneesoonthorn (2018)). If $Z$ has distribution function $F_Z$ with marginals $F_{Z_1},\dots,F_{Z_m}$, then its implicit copula function is

$$C_Z(u)=F_Z\left(F_{Z_1}^{-1}(u_1),\dots,F_{Z_m}^{-1}(u_m)\right). \tag{5}$$

Differentiating with respect to $u$ gives the implicit copula density

$$c_Z(u)=\frac{\partial^m}{\partial u_1\cdots\partial u_m}C_Z(u)=\frac{f_Z(z)}{\prod_{j=1}^m f_{Z_j}(z_j)}, \tag{6}$$

where $z=(z_1,\dots,z_m)^\top$ is a function of $u$ with elements $z_j=F_{Z_j}^{-1}(u_j)$ for $j=1,\dots,m$. The implicit copula function and density above can be employed in (1), (2) and (3). Thus, an implicit copula model uses Sklar’s theorem twice: once to form the joint distribution $F_Y$ with arbitrary marginals, and a second time to construct the implicit copula from the joint distribution $F_Z$.

Because implicit copulas are an immediate consequence of Sklar’s theorem, they have a long history. Early uses for modelling data include Rüschendorf (1976) and Deheuvels (1979), who both construct a non-parametric implicit copula from the empirical distribution function (although neither called it a copula). Rüschendorf (2009) gives an overview of the early developments of implicit copulas, pointing out that many transformation-based multivariate models—which themselves have a long history—are also copula models based on implicit copulas (although in the early literature this was often unrecognized and the term “copula” not used).

Note that only a continuous distribution is used to construct an implicit copula here. This is because the implicit copula of a discrete distribution is not unique (Genest and Nešlehová, 2007).

2.3 Implicit copulas as transformations

One way to look at all copula models is that they are a transformation from $Y$ to $U$. The key observation is that it is usually easier to capture multivariate dependence using $C$ on the vector space $[0,1]^m$, rather than directly on the domain of the original vector $Y$. Implicit copulas go one step further, with a second transformation from $U$ to $Z$, and then capture the dependence structure using the distribution $F_Z$. Table 1 provides a summary of these transformations, along with the marginal and joint distribution and density/mass functions of $Y$, $U$ and $Z$. Throughout this paper, the vector $U$ is referred to as the “copula vector” and $Z$ as the “auxiliary vector” (the latter is also called a “pseudo vector” in Smith and Klein (2021)). Simulation from an implicit copula model is straightforward if $F_Z$ is tractable using Algorithm 1, which produces a draw of $Y$.

Notice that the transformation $U_j=F_{Z_j}(Z_j)$ removes all features of the marginal distribution of $Z_j$. This becomes an important observation for establishing parameter identification when constructing implicit copulas, as discussed in Sections 4, 5 and 6.
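The chain of transformations described above suggests a simple simulation scheme: draw $z$ from $F_Z$, set $u_j=F_{Z_j}(z_j)$, then set $y_j=F_{Y_j}^{-1}(u_j)$. Algorithm 1 is not reproduced in this section, so the following is only a sketch of that idea under stated assumptions: a bivariate Gaussian auxiliary vector with correlation `rho`, exponential marginals with rates `rates`, and the hypothetical function name `simulate_implicit_copula_model`.

```python
import math
import random
from statistics import NormalDist


def simulate_implicit_copula_model(n, rho, rates, seed=0):
    """Draw n triples (z, u, y) from a bivariate implicit copula model.

    Assumptions for illustration: F_Z is bivariate normal with unit variances
    and correlation rho; each marginal F_{Y_j} is exponential with rate rates[j].
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    draws = []
    for _ in range(n):
        # Draw the auxiliary vector z from F_Z
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z = (e1, rho * e1 + math.sqrt(1.0 - rho ** 2) * e2)
        # Copula vector: u_j = F_{Z_j}(z_j), clamped away from 0 and 1
        u = tuple(min(max(std_normal.cdf(zj), 1e-12), 1.0 - 1e-12) for zj in z)
        # Observed vector: y_j = F_{Y_j}^{-1}(u_j), the exponential quantile
        y = tuple(-math.log1p(-uj) / rate for uj, rate in zip(u, rates))
        draws.append((z, u, y))
    return draws
```

Swapping the exponential quantile for any other quantile function changes the marginals of $Y$ while leaving the dependence structure, which lives entirely in $u$, unchanged.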

2.4 An alternative extended likelihood

For the case where the elements of $Y$ are discrete-valued, for an implicit copula model there exists an alternative extended likelihood based on the joint density of $(Y,Z)$, rather than that of $(Y,U)$ given previously at (4). This alternative joint density is

$$f_{Y,Z}(y,z)=f(y|z)\,f_Z(z)=\prod_{j=1}^m\left\{1\left(F_{Z_j}^{-1}(a_j)\le z_j<F_{Z_j}^{-1}(b_j)\right)\right\}f_Z(z), \tag{7}$$

with $a_j,b_j$ as defined above in Section 2.1.2. Marginalizing over $z$ produces the probability mass function at (3); i.e. $\int f_{Y,Z}(y,z)\,dz=f_Y(y)$. An advantage is that it is often simpler to use computational methods to estimate an implicit copula using (7) rather than (4). Moreover, an extended likelihood is also easily defined for vectors with combinations of discrete, continuous or even mixed valued elements, by simplifying (7) to only include elements of $Z$ that correspond to the non-continuous valued variables.

Bayesian data augmentation is a suitable method for estimation using this extended likelihood. Here, values for $Z$ are generated in an MCMC sampling scheme to evaluate an “augmented posterior” proportional to the extended likelihood multiplied by a parameter prior. This has been used to estimate the elliptical and skew elliptical copulas discussed in Section 3 below. For example, Pitt et al. (2006) do so for a Gaussian copula, while Danaher and Smith (2011) do so for the t copula, and Smith et al. (2012) for the skew t copula. Hoff (2007) considered the extended likelihood above using empirical marginals and rank data, while Danaher and Smith (2011) and Dobra et al. (2011) provide early applications to higher dimensional Gaussian copulas. Last, the multivariate probit model is a Gaussian copula model, and the popular approach of Chib and Greenberg (1998) is a special case of these data augmentation algorithms.

3 Elliptical and Skew Elliptical Copulas

In practice, parametric copulas with parameter vector $\theta$ are almost always used in statistical modelling, with McNeil et al. (2005), Nelsen (2006) and Joe (2014) giving overviews of choices. Among these, the implicit copulas of elliptical distributions, and more recently skew elliptical distributions, are common choices for capturing dependence in many applications. An attractive feature is that because elliptical and skew elliptical distributions are closed under marginalization, so are their implicit copulas.

3.1 Elliptical copulas

3.1.1 Gaussian copula

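The simplest and most popular elliptical copula is the “Gaussian copula”, which is constructed from $Z\sim N(0,\Omega)$ with $\Omega$ an $(m\times m)$ correlation matrix. If $\Phi_m(\cdot\,;0,\Omega)$ denotes an $N(0,\Omega)$ distribution function, and $\Phi$ a $N(0,1)$ distribution function, then from (5) the Gaussian copula function is

$$C_{Ga}(u;\Omega)=\Phi_m\left(\Phi^{-1}(u_1),\dots,\Phi^{-1}(u_m);0,\Omega\right).$$

If $\phi_m(\cdot\,;0,\Omega)$ is a $N(0,\Omega)$ density, and $\phi$ is a standard normal density, then plugging the Gaussian densities into (6) gives the Gaussian copula density

$$c_{Ga}(u;\Omega)=\frac{\phi_m(z;0,\Omega)}{\prod_{j=1}^m\phi(z_j)}=|\Omega|^{-1/2}\exp\left\{-\tfrac{1}{2}z^\top\left(\Omega^{-1}-I_m\right)z\right\},$$

with $z=\left(\Phi^{-1}(u_1),\dots,\Phi^{-1}(u_m)\right)^\top$.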
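As a check on the two forms of the Gaussian copula density, the bivariate case ($m=2$, with $\Omega$ having off-diagonal element $\rho$) can be written out directly. The sketch below is illustrative only; the function names are hypothetical, and it uses the standard-library `statistics.NormalDist` for $\Phi^{-1}$ and $\phi$.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal: provides inv_cdf (Phi^{-1}) and pdf (phi)


def gaussian_copula_density_2d(u1, u2, rho):
    """Closed form |Omega|^{-1/2} exp{-z'(Omega^{-1} - I)z / 2} for m = 2."""
    z1, z2 = _STD.inv_cdf(u1), _STD.inv_cdf(u2)
    det = 1.0 - rho ** 2  # determinant of Omega
    quad = (rho ** 2 * (z1 ** 2 + z2 ** 2) - 2.0 * rho * z1 * z2) / det
    return det ** -0.5 * math.exp(-0.5 * quad)


def gaussian_copula_density_ratio_2d(u1, u2, rho):
    """Same density computed as the ratio at (6): joint over product of marginals."""
    z1, z2 = _STD.inv_cdf(u1), _STD.inv_cdf(u2)
    det = 1.0 - rho ** 2
    joint = math.exp(-(z1 ** 2 - 2.0 * rho * z1 * z2 + z2 ** 2) / (2.0 * det))
    joint /= 2.0 * math.pi * math.sqrt(det)
    return joint / (_STD.pdf(z1) * _STD.pdf(z2))
```

The two functions agree up to floating-point error, and both return 1 when $\rho=0$, the density of the independence copula.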

There are a number of immediate observations on the Gaussian copula. First, the auxiliary vector $Z$ has a distribution with a zero mean and unit marginal variances. This is because information about the first two marginal moments of $Z$ is lost in the transformation to $U$ and is unidentified in the copula density. Second, adopting any constant mean value (other than zero) and marginal variances (other than unit values) for $Z$ produces the same Gaussian copula $C_{Ga}$. Third, closure under marginalization means that if $U$ has distribution function $C_{Ga}(u;\Omega)$, then any subset of elements of $U$ also has a Gaussian copula distribution function, whose parameter is the correlation matrix made up of the corresponding rows and columns of $\Omega$.

A fourth observation is that any parametric correlation structure for $Z$ is inherited by the Gaussian copula. It is this property that has led to the widespread adoption of Gaussian copula models for modeling time series (Cario and Nelson, 1996), longitudinal (Lambert and Vandenhende, 2002), cross-sectional (Murray et al., 2013) and spatial (Bai et al., 2014; Hughes, 2015) data. The Gaussian copula has a long history, particularly when formed implicitly via transformation (e.g. Li and Hammond (1975)). Early and influential mentions include Joe (1993), Clemen and Reilly (1999) and Wang (1999), while Li (2000) popularized its use in finance. A comprehensive overview of the Gaussian copula and its properties is given by Song (2000).

3.1.2 Other elliptical copulas

Fang et al. (2002) and Embrechts et al. (2002) use an elliptical distribution for $Z$, and study the resulting class of “elliptical copulas”. When combined with choices for the marginals of $Y$ in a copula model, Fang et al. (2002) call the distribution “meta-elliptical”, and an overview of their dependence properties is given by Abdous et al. (2005). After the Gaussian copula, the most popular elliptical copula is the t copula, where a multivariate t distribution with degrees of freedom $\nu$ is adopted for $Z$. Embrechts et al. (2002) and Venter (2003) study this copula, and the main advantage is that it can capture higher dependence in extreme values, which is important for financial and actuarial variables. A lesser known property is that values of $\nu$ close to zero allow for positive dependence between squared elements of $Z$. This is useful for capturing the serial dependence in heteroscedastic time series, such as equity returns in finance; see Loaiza-Maya et al. (2018) and Bladt and McNeil (2021).

3.2 Skew elliptical copulas

3.2.1 Overview

Elliptical copulas exhibit radial symmetry, where the distributions of $U$ and $1-U$ are the same. Yet there are applications where this is unrealistic, including for the dependence between equity returns (Longin and Solnik, 2001; Ang and Chen, 2002) and regional electricity spot prices (Smith et al., 2012). The implicit copulas of skew elliptical distributions (Genton, 2004) allow for asymmetric pairwise dependence, with the most common being those constructed from the differing skew t distributions. Demarta and McNeil (2005) were the first to construct an implicit copula from a skew t distribution (i.e. a “skew t copula”), for which they used a special case of the generalized hyperbolic distribution, and Chan and Kroese (2010) do so for an adjustment of the skew normal distribution of Azzalini and Dalla Valle (1996). The most popular variants of the skew t distribution are those of Azzalini and Capitanio (2003) and Sahu et al. (2003), which share a similar conditionally Gaussian representation. Smith et al. (2012) show how to construct implicit copulas from these latter two skew t distributions, and estimate them using MCMC. Yoshiba (2018) considers maximum likelihood estimation for the skew t copula constructed from the distribution of Azzalini and Capitanio (2003), and Oh and Patton (2020) consider a dynamic extension of the skew t copula of Demarta and McNeil (2005) for high dimensions.

3.2.2 Skew t copula

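Write $t_d(\mu,\Sigma,\nu)$ for a $d$-dimensional t distribution with location $\mu$, scale matrix $\Sigma$ and degrees of freedom $\nu$, with density $f_t(\cdot\,;\mu,\Sigma,\nu)$. Let $X$ and $Q$ be $(m\times 1)$ vectors with joint distribution

$$\begin{pmatrix}X\\Q\end{pmatrix}\sim t_{2m}\left(\begin{pmatrix}0\\0\end{pmatrix},\ \Omega=\begin{pmatrix}\Gamma+D^2 & D\\ D & I\end{pmatrix},\ \nu\right). \tag{8}$$

Here, $D$ is a diagonal matrix and $\Gamma$ is positive definite. Then the skew t distribution of Sahu et al. (2003) (with location parameter equal to zero) is given by $Z=(X|Q>0)$, which has density

$$f_{St}(z;\Gamma,D,\nu)=\frac{2^m}{|\Gamma+D^2|^{1/2}}\,f_t\left((\Gamma+D^2)^{-1/2}z;0,I_m,\nu\right)\Pr(V>0;z), \tag{9}$$

where $V$ is a t-distributed random vector whose parameters depend on $z$ (see Sahu et al. (2003) for its exact form). This manner of constructing a skew t distribution is called “hidden conditioning” because $Q$ is latent. The skew t distribution of Azzalini and Capitanio (2003) is constructed in a similar way, but where $Q$ is a scalar.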

The parameter $D$ controls the level of asymmetry in the distribution of $Z$, but in the implicit copula it controls the level of asymmetric dependence. This is a key observation as to why skew t copulas have strong potential for applied modeling. To construct this copula, first fix the leading diagonal elements of $\Gamma$ to ones (i.e. restrict $\Gamma$ to be a correlation matrix), and note that the marginal of each $Z_j$ is also a skew t distribution, with a univariate density of the form at (9). Then, the copula function and density are given by (5) and (6), respectively. These require computation of the distribution function $F_{Z_j}$ and its inverse (i.e. the quantile function), which can either be undertaken numerically using standard methods, or using the interpolation approach outlined in Appendix A for large datasets. Simulation from a skew t copula model is straightforward using (8) and the representation of a t distribution as Gaussian conditional on a Gamma variate. To do so, at Step 1 of Algorithm 1 generate a draw $z$ by drawing sequentially as follows:

• Step 1(a) Generate $w\sim\mathrm{Gamma}(\nu/2,\nu/2)$,

• Step 1(b) Generate $q\sim N(0,I/w)$ constrained to $q>0$,

• Step 1(c) Generate $z\sim N(Dq,\Gamma/w)$.
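The three sub-steps above can be sketched for the bivariate case. The sketch assumes the Gamma variate has shape and rate both equal to $\nu/2$ (the usual scale-mixture representation of a t distribution), that the constrained Gaussian is a vector of independent half-normals, and that $\Gamma$ is a $2\times 2$ correlation matrix with off-diagonal element `gamma12`. The function name `draw_skew_t_bivariate` is hypothetical, and only the draw of $z$ is shown, not the remaining steps of Algorithm 1.

```python
import math
import random


def draw_skew_t_bivariate(n, gamma12, delta, nu, seed=0):
    """Draw n vectors z from a bivariate skew t distribution by hidden conditioning.

    gamma12: off-diagonal element of the correlation matrix Gamma (assumed 2 x 2)
    delta:   the two diagonal elements of D
    nu:      degrees of freedom
    """
    rng = random.Random(seed)
    root = math.sqrt(1.0 - gamma12 ** 2)  # second diagonal entry of chol(Gamma)
    draws = []
    for _ in range(n):
        # Step 1(a): w ~ Gamma(shape nu/2, rate nu/2); gammavariate takes a scale
        w = rng.gammavariate(nu / 2.0, 2.0 / nu)
        s = 1.0 / math.sqrt(w)
        # Step 1(b): q ~ N(0, I/w) constrained to q > 0, i.e. half-normals
        q = [abs(rng.gauss(0.0, s)) for _ in range(2)]
        # Step 1(c): z ~ N(D q, Gamma / w), using the Cholesky factor of Gamma
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z1 = delta[0] * q[0] + s * e1
        z2 = delta[1] * q[1] + s * (gamma12 * e1 + root * e2)
        draws.append((z1, z2))
    return draws
```

Positive elements of `delta` push the corresponding margin to the right and negative elements push it to the left, which is a quick way to see the role of $D$ in simulated output.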

A computational bottleneck for the evaluation of the skew t copula density is the evaluation of the multivariate integral $\Pr(V>0;z)$ at (9).

However, this can be avoided in likelihood-based estimation by considering the tractable conditionally Gaussian representation motivated by (8). Let $W\sim\mathrm{Gamma}(\nu/2,\nu/2)$, then consider the joint distribution of $(X,Q,W)$ conditional on $Q>0$, with density

$$f(x,q,w|q>0)\propto f(x|q,w)\,f(q|w)\,1(q>0)\,f(w), \tag{10}$$

where $X|q,w\sim N(Dq,\Gamma/w)$ and $Q|w\sim N(0,I/w)$. Marginalizing out $(q,w)$ gives the skew t density at (9) in $x$. Smith et al. (2012) use this feature to design Bayesian data augmentation algorithms for the skew t copula that generate $(q,w)$ as latent variables in Markov chain Monte Carlo (MCMC) sampling schemes for both continuous-valued and discrete-valued $Y$.

The density of the Azzalini and Capitanio (2003) skew t distribution does not feature the multivariate probability term $\Pr(V>0;z)$, so that it is easier to evaluate its implicit copula density, as in Yoshiba (2018). But when computing the Bayesian posterior using data augmentation it makes little difference, because the copula density is never evaluated directly.

3.3 Factor copulas

To capture dependence in high dimensions, “factor copulas” are increasingly popular, and there are two main types in the literature. The first links a small number of independent factors by a pair-copula construction to produce a higher dimensional copula, as proposed by Krupskii and Joe (2013). Flexibility is obtained by using different bivariate copulas for the pair-copulas and a different number of factors, with applications and extensions found in Nikoloulopoulos and Joe (2015); Mazo et al. (2016); Schamberger et al. (2017); Tan et al. (2019) and Krupskii and Joe (2020). In general, this type of factor copula is not an implicit copula. The second type of factor copula is the implicit copula of a traditional elliptical or skew-elliptical factor model. This type of copula emerged in the finance literature for low-dimensional applications (Laurent and Gregory, 2005), but is increasingly used to model dynamic dependence in high dimensions; see Creal and Tsay (2015); Oh and Patton (2017, 2018) and Oh and Patton (2020). Estimation issues grow with the dimension and complexity of the copula, and this remains an active field of research.

3.3.1 Gaussian static factor copula

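One of the simplest factor copulas is a Gaussian static factor copula, which Laurent and Gregory (2005) suggest for a single factor, and Murray et al. (2013) consider for a larger number of factors. The multiple factor copula can be defined as follows. Let $\tilde{Z}\sim N(0,\Lambda\Lambda^\top+D)$, where $\Lambda$ is an $(m\times p)$ matrix of factor loadings, $D=\mathrm{diag}(d_1,\dots,d_m)$ is a diagonal matrix of idiosyncratic variations, and typically $p\ll m$. The implicit copula of $\tilde{Z}$ is a Gaussian copula, as outlined in Section 3.1.1. To derive the parameter matrix $\Omega$, set the diagonal matrix

$$S=\mathrm{diag}(\Lambda\Lambda^\top+D)=\mathrm{diag}\left(\sum_{k=1}^p\lambda_{1,k}^2+d_1,\dots,\sum_{k=1}^p\lambda_{m,k}^2+d_m\right),$$

then $Z=S^{-1/2}\tilde{Z}\sim N(0,\Omega)$, so that $\Omega=S^{-1/2}(\Lambda\Lambda^\top+D)S^{-1/2}$.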

Murray et al. (2013) identify the loadings and idiosyncratic variations by setting $D=I_m$, the upper triangular elements of $\Lambda$ to zero and the leading diagonal elements to positive values $\lambda_{k,k}>0$. The copula parameters are then $\mathrm{vech}(\Lambda)$, where $\mathrm{vech}$ is the half-vectorization operator applied to the lower triangle of the rectangular matrix $\Lambda$. In the non-copula factor model literature, there are alternative ways to identify $\Lambda$ and $D$ (Kaufmann and Schumacher, 2017; Frühwirth-Schnatter and Lopes, 2018), and similar restrictions may be adapted for the correlation matrix $\Omega$ as well. In a Bayesian framework, priors also have to be adopted for $\Lambda$ and $D$, and these can be used to provide further regularization as in Murray et al. (2013) and elsewhere.

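Simulation from this factor copula model is fast using the latent variable representation of the factor structure, given by a $p$-dimensional vector of factors $f\sim N(0,I_p)$ and idiosyncratic errors $\epsilon\sim N(0,D)$. To do so, at Step 1 of Algorithm 1 generate a draw $z$ by drawing sequentially as follows:

• Step 1(a) Generate $f\sim N(0,I_p)$ and $\epsilon\sim N(0,D)$,

• Step 1(b) Set $\tilde{z}=\Lambda f+\epsilon$,

• Step 1(c) Set $z=S^{-1/2}\tilde{z}$.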
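A minimal sketch of these steps, together with the implied correlation matrix $\Omega=S^{-1/2}(\Lambda\Lambda^\top+D)S^{-1/2}$, is given below. It assumes the standard Gaussian factor structure (standard normal factors and independent Gaussian idiosyncratic errors); the function names and the representation of $\Lambda$ as a nested list are illustrative choices rather than anything prescribed by the text.

```python
import math
import random
from statistics import NormalDist


def implied_correlation(loadings, d):
    """Omega = S^{-1/2} (Lambda Lambda' + D) S^{-1/2} for an m x p loading matrix."""
    m = len(loadings)
    cov = [[sum(a * b for a, b in zip(loadings[i], loadings[j]))
            + (d[i] if i == j else 0.0) for j in range(m)] for i in range(m)]
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(m)]
            for i in range(m)]


def factor_copula_draw(loadings, d, rng):
    """One draw (z, u) from the Gaussian static factor copula."""
    p = len(loadings[0])
    # Step 1(a): factors f ~ N(0, I_p) and idiosyncratic errors eps ~ N(0, D)
    f = [rng.gauss(0.0, 1.0) for _ in range(p)]
    eps = [rng.gauss(0.0, math.sqrt(dj)) for dj in d]
    # Step 1(b): unnormalized vector Lambda f + eps
    raw = [sum(l * fk for l, fk in zip(row, f)) + e for row, e in zip(loadings, eps)]
    # Step 1(c): rescale by S^{-1/2} so each element has unit variance
    s = [sum(l * l for l in row) + dj for row, dj in zip(loadings, d)]
    z = [r / math.sqrt(sj) for r, sj in zip(raw, s)]
    # Copula vector: u_j = Phi(z_j)
    u = [NormalDist().cdf(zj) for zj in z]
    return z, u
```

Each draw costs $O(mp)$ operations rather than the $O(m^2)$ needed with a dense Cholesky factor of $\Omega$, which is the source of the speed noted above.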

4 Time series

Copulas have been used extensively to capture the cross-sectional dependence in multivariate time series; see Patton (2012) for a review. However, they can also be used to capture the serial dependence in a univariate series. The resulting time series models are extremely flexible, and there are many potential applications to continuous, discrete or mixed data.

4.1 Time series copula models

If $Y=(Y_1,\dots,Y_T)^\top$ is a time series vector, then the copula at (1) with $m=T$ captures the serial dependence in the series and is called a “time series copula”. While there has been less work on time series copulas than those used to capture cross-sectional dependence, they are increasingly being used for both time series data (where there is a single observation on the vector $Y$) and longitudinal data (where there are multiple observations on the vector $Y$). Early contributions include Darsow et al. (1992), Joe (1997, Ch.8), Frees and Wang (2005), Chen and Fan (2006), Ibragimov (2009) and Beare (2010) for Markov processes, Wilson and Ghahramani (2010) for the implicit copulas of Gaussian processes popular in machine learning, and Smith et al. (2010) for vine copulas that exploit the time ordering of the elements of $Y$.

4.1.1 Decomposition

For a continuous-valued stochastic process $\{Y_t\}$, denote the copula model for the joint density of $t$ time series variables as

$$f_{Y_{1:t}}(y_1,\dots,y_t)=c_{1:t}(u_1,\dots,u_t)\prod_{s=1}^t f_{Y_s}(y_s),$$

where $c_{1:t}$ is a $t$-dimensional copula density that defines a copula process for stochastic process $\{U_t\}$, with $u_s=F_{Y_s}(y_s)$. Then the conditional distribution $Y_{t+1}|Y_1,\dots,Y_t$ has density

$$\begin{aligned}f_{Y_{t+1}|1:t}(y_{t+1}|y_1,\dots,y_t)&=\frac{f_{Y_{1:t+1}}(y_1,\dots,y_{t+1})}{f_{Y_{1:t}}(y_1,\dots,y_t)}=\frac{c_{1:t+1}(u_1,\dots,u_{t+1})}{c_{1:t}(u_1,\dots,u_t)}f_{Y_{t+1}}(y_{t+1})\\&=f_{U_{t+1}|1:t}(u_{t+1}|u_1,\dots,u_t)\,f_{Y_{t+1}}(y_{t+1}).\end{aligned} \tag{11}$$

Here, $f_{U_{t+1}|1:t}$ is the density of $U_{t+1}|U_1,\dots,U_t$, which is not uniform on $[0,1]$ (whereas the marginal distribution of $U_{t+1}$ is uniform on $[0,1]$). This conditional density can be used to form predictions from the copula model. It can also be used in likelihood-based estimation because $c_{1:T}(u)=\prod_{t=2}^T f_{U_t|1:t-1}(u_t|u_1,\dots,u_{t-1})$, with $u=(u_1,\dots,u_T)^\top$, which can be computed efficiently for many choices of copula. In drawable vine copulas (D-vines) $c_{1:T}$ is further decomposed into a product of bivariate copulas called “pair-copulas” (Aas et al., 2009), allowing for a flexible representation of the serial dependence structure, as discussed by Smith et al. (2010), Beare and Seo (2015), Smith (2015), Loaiza-Maya et al. (2018), Bladt and McNeil (2021) and others.

4.1.2 Selection of marginal distributions

For longitudinal data with a sufficient number of observations on $Y$, it is possible to estimate the marginal distribution functions at (5) separately as in Smith et al. (2010). But for time series data it is necessary to impose some structure on these marginal densities. For example, Frees and Wang (2005, 2006) employ generalized linear regression models with time-based covariates in an actuarial setting. In the absence of common covariates, the marginals may be assumed time-invariant, so that $F_{Y_t}=G$ for all $t$ as in Chen and Fan (2006) and Smith (2015). Flexible marginals, such as a skew t distribution, or non-parametric estimators such as smoothed empirical distribution functions or kernel density estimators, can be used.

4.1.3 Discrete time series data

Time series copulas can also be used for discrete-valued data; see Joe (1997, Ch.8) for an early exploration of such models. Smith and Khaled (2012, Sec.5) do so for longitudinal data using the extended likelihood at (4) and the copula decomposition above, so that for $y=(y_1,\dots,y_T)^\top$ and $u=(u_1,\dots,u_T)^\top$,

$$f_{Y,U}(y,u)=\prod_{t=1}^T\left\{1(a_t\le u_t<b_t)\right\}c_{1:T}(u)=1(a_1\le u_1<b_1)\prod_{t=2}^T\left\{1(a_t\le u_t<b_t)\,f_{U_t|1:t-1}(u_t|u_1,\dots,u_{t-1})\right\}, \tag{12}$$

with $U_1$ marginally uniform on $[0,1]$. These authors employ a D-vine copula, and show how estimation using this extended likelihood can be undertaken by Bayesian data augmentation, where the values of $U$ are generated in an MCMC sampling scheme. Alternatively, Loaiza-Maya and Smith (2019) show how to estimate the copula parameters using variational Bayes methods (Blei et al., 2017). These calibrate tractable approximations to the augmented posterior obtained from the extended likelihood above. They call this approach “variational Bayes data augmentation” (VBDA) and show it is faster than MCMC and can be employed for much larger $T$ for many choices of copula.

4.2 Implicit time series copulas

4.2.1 Decomposition

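In early work, Lambert and Vandenhende (2002) and Frees and Wang (2005, 2006) suggested adopting the implicit copula of an auxiliary stochastic process $\{Z_t\}$. In this case, the copula density has the form at (6), so that for $z_s=F_{Z_s}^{-1}(u_s)$,

$$c_{1:t}(u_1,\dots,u_t)=\frac{f_{Z_{1:t}}(z_1,\dots,z_t)}{\prod_{s=1}^t f_{Z_s}(z_s)}.$$

The conditional density at (11) is therefore

$$\begin{aligned}f_{Y_{t+1}|1:t}(y_{t+1}|y_1,\dots,y_t)&=f_{U_{t+1}|1:t}(u_{t+1}|u_1,\dots,u_t)\,f_{Y_{t+1}}(y_{t+1})\\&=\frac{f_{Z_{1:t+1}}(z_1,\dots,z_{t+1})}{f_{Z_{1:t}}(z_1,\dots,z_t)\,f_{Z_{t+1}}(z_{t+1})}\,f_{Y_{t+1}}(y_{t+1})\\&=f_{Z_{t+1}|1:t}(z_{t+1}|z_1,\dots,z_t)\,\frac{f_{Y_{t+1}}(y_{t+1})}{f_{Z_{t+1}}(z_{t+1})}.\end{aligned} \tag{13}$$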

4.2.2 Stationarity

A major advantage of an implicit time series copula is that for many processes $\{Z_t\}$, the densities $f_{Z_{t+1}|1:t}$ and $f_{Z_{t+1}}$ are straightforward to compute and simulate from, simplifying parameter estimation and evaluation of predictive distributions. It is straightforward to show (e.g. see Chen and Fan (2006); Smith (2015)) that if $\{Z_t\}$ is a (strongly) stationary stochastic process, then $F_{Z_t}$ is time invariant and $\{U_t\}$ is also stationary because $U_t=F_{Z_t}(Z_t)$ is a monotonic transformation. In addition, if the marginal distribution $F_{Y_t}$ is also time invariant, the process $\{Y_t\}$ is also stationary.

4.2.3 Discrete time series data

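For an implicit copula, the extended likelihood at (7) based on $Z=(Z_1,\dots,Z_T)^\top$ with realization $z=(z_1,\dots,z_T)^\top$ can be used instead of that at (12), which is

$$f_{Y,Z}(y,z)=\prod_{t=2}^T\left\{1\left(F_{Z_t}^{-1}(a_t)\le z_t<F_{Z_t}^{-1}(b_t)\right)f_{Z_t|1:t-1}(z_t|z_1,\dots,z_{t-1})\right\}\,1\left(F_{Z_1}^{-1}(a_1)\le z_1<F_{Z_1}^{-1}(b_1)\right)f_{Z_1}(z_1).$$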

4.2.4 Example: Gaussian autoregression copula

The simplest implicit time series copulas are those based on stationary Gaussian time series models. Cario and Nelson (1996) and Joe (1997, p.259) suggest using a zero mean stationary autoregression of lag length $p$, so that

$$Z_s=\sum_{k=1}^p\rho_k Z_{s-k}+e_s,\quad\text{for } s=1,2,\dots,$$

with $e_s\sim N(0,\sigma^2)$ an independent disturbance, and parameters $\rho=(\rho_1,\dots,\rho_p)^\top$. Then $Z=(Z_1,\dots,Z_T)^\top$ is Gaussian with zero mean and the usual full rank autocovariance matrix, whose inverse is a band matrix that is a function of $(\rho,\sigma^2)$ only.

Therefore, the implicit copula of $Z$ is the Gaussian copula with the autocorrelation matrix $\Omega$. The parameter $\sigma^2$ does not feature in $\Omega$ (i.e. it is unidentified in the copula), so that it is sufficient to fix it to an arbitrary value such as $\sigma^2=1$, as is done here. Thus, $\Omega$ is only a function of $\rho$, so that $\rho$ are the copula parameters. The marginal distribution is $Z_t\sim N(0,\gamma_0)$, with variance $\gamma_0$ computed from $\rho$. Denoting the density of a standard normal as $\phi$, and that of a $N(0,\gamma_0)$ as $\phi_1(\cdot\,;0,\gamma_0)$, the conditional density is

$$f_{U_{t+1}|1:t}(u_{t+1}|u_1,\dots,u_t)=\frac{f_{Z_{t+1}|1:t}(z_{t+1}|z_1,\dots,z_t)}{f_{Z_{t+1}}(z_{t+1})}=\frac{\phi\left(z_{t+1}-\sum_{k=1}^p\rho_k z_{t-k+1}\right)}{\phi_1(z_{t+1};0,\gamma_0)},$$

with $u_t=\Phi_1(z_t;0,\gamma_0)$ a $N(0,\gamma_0)$ distribution function evaluated at $z_t$. (The dependence of this conditional density on $\rho$ is tacit here.) Thus, the likelihood of a continuous-valued series, or the extended likelihood of a discrete-valued series, can be expressed in terms of the copula parameters $\rho$ and the marginals $F_{Y_t}$. A variety of estimation methods, including standard maximum likelihood, can then be used to estimate the time series copula parameters.
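For the lag-one case the conditional density above is easy to code, and the copula log-likelihood is the sum of its logarithm over time. The sketch below assumes $p=1$ and unit disturbance variance, so that the marginal variance is $\gamma_0=1/(1-\rho^2)$; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def ar1_copula_conditional_density(u_next, u_prev, rho):
    """f_{U_{t+1}|U_t}(u_next | u_prev) for the Gaussian AR(1) copula.

    Assumes unit disturbance variance, so gamma_0 = 1 / (1 - rho^2).
    """
    gamma0 = 1.0 / (1.0 - rho ** 2)
    marginal = NormalDist(0.0, math.sqrt(gamma0))  # N(0, gamma_0) margin of Z_t
    z_next = marginal.inv_cdf(u_next)
    z_prev = marginal.inv_cdf(u_prev)
    # Ratio of the conditional density of Z_{t+1} to its marginal density
    return NormalDist().pdf(z_next - rho * z_prev) / marginal.pdf(z_next)


def ar1_copula_loglik(u, rho):
    """Log copula density of a series u_1, ..., u_T as a sum of conditionals."""
    return sum(math.log(ar1_copula_conditional_density(u[t], u[t - 1], rho))
               for t in range(1, len(u)))
```

As a sanity check, this conditional density coincides with the bivariate Gaussian copula density with correlation $\rho$ evaluated at the pair of adjacent values, and the log-likelihood is zero when $\rho=0$.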

This copula model extends the stationary autoregression from a marginally Gaussian process to one with any other marginal distribution. This is why Cario and Nelson (1996) originally labeled it an “autoregression-to-anything” transformation, although these authors did not recognize it as a Gaussian copula. Interestingly, even though the auxiliary stochastic process $\{Z_t\}$ is conditionally homoscedastic (i.e. the conditional variance $\mathrm{Var}(Z_{t+1}|Z_1,\dots,Z_t)$ is constant), the process $\{Y_t\}$ need not be so (i.e. it can be heteroscedastic). To see this, notice that even when $F_{Y_t}=G$ is time invariant with density $g$, the conditional density of $Y_{t+1}$ is

$$f_{Y_{t+1}|1:t}(y_{t+1}|y_1,\dots,y_t)=\phi\left(z_{t+1}-\sum_{k=1}^p\rho_k z_{t-k+1}\right)\frac{g(y_{t+1})}{\phi_1(z_{t+1};0,\gamma_0)}.$$

The second moment of this density is not necessarily a constant with respect to time, as demonstrated in Smith and Vahey (2016).

The usual measures of serial dependence for an autoregression (e.g. autocorrelation or partial autocorrelation matrices) can be computed for $\{Z_t\}$. Spearman correlations, which are unaffected by the choice of continuous margin(s) $F_{Y_t}$, provide equivalent metrics for $\{Y_t\}$. For example, the Spearman autocorrelation at lag $h$ is

$$\rho_h^S=\frac{6}{\pi}\arcsin\left(\frac{\gamma_h}{2\gamma_0}\right),$$

where $\gamma_h$ is the autocovariance at lag $h$ for the auxiliary stochastic process, and is a function of $\rho$. Other popular measures of concordance can also be computed easily for different values of $h$.
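As a small worked illustration of the Spearman formula, consider the lag-one autoregression, for which the autocorrelation of the auxiliary process at lag $h$ is $\gamma_h/\gamma_0=\rho^h$. The function name below is hypothetical and the restriction to $p=1$ is an assumption made to keep the autocovariances in closed form.

```python
import math


def spearman_autocorrelation_ar1(rho, h):
    """Spearman autocorrelation at lag h for a Gaussian AR(1) copula.

    Uses gamma_h / gamma_0 = rho ** h, which holds for a stationary AR(1).
    """
    r = rho ** h
    return (6.0 / math.pi) * math.asin(r / 2.0)
```

The Spearman autocorrelation is slightly smaller in magnitude than the Pearson autocorrelation of the auxiliary process for $0<|\rho^h|<1$, and the two agree at $0$ and $\pm 1$.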

Last, while the Gaussian autoregression copula—or indeed other Gaussian time series copulas, such as those based on Gaussian processes (Wilson and Ghahramani, 2010)—produces a flexible family of time series models, the form of serial dependence is still limited. For example, serial dependence is both symmetric and has zero tail dependence, which are properties of the Gaussian copula. This motivates the construction of more flexible time series copulas, as now discussed.

4.3 Implicit state space copula

A wide array of time series and other statistical models can be written in state space form; see Durbin and Koopman (2012) for an overview of this extensive class. Smith and Maneesoonthorn (2018) show how to construct and estimate the implicit time series copulas of such models, as summarized below.

4.3.1 The copula

A nonlinear state space model for $\{Z_t\}$ is given by the observation and transition equations

$$Z_t|X_t=x_t\sim H_t(z_t|x_t;\theta), \tag{14}$$

$$X_t|X_{t-1}=x_{t-1}\sim K_t(x_t|x_{t-1};\theta). \tag{15}$$

Here, $H_t$ is the distribution function of $Z_t$, conditional on a state vector $X_t$. The states follow a Markov process, with conditional distribution function $K_t$. Typically, tractable parametric distributions are adopted for $H_t$ and $K_t$, with the parameters denoted collectively as $\theta$.

A key requirement in evaluating (5) and (6) is the computation of the marginal distribution and density functions of $Z_t$. Marginalizing over $X_t$ gives these as

$$F_{Z_t}(z_t|\theta)=\int H_t(z_t|x_t;\theta)\,f_{X_t}(x_t|\theta)\,dx_t$$