# Estimated Wold Representation and Spectral Density-Driven Bootstrap for Time Series

The second-order dependence structure of a purely nondeterministic stationary process is described by the coefficients of the famous Wold representation. These coefficients can be obtained by factorizing the spectral density of the process. This relation, together with some spectral density estimator, is used to obtain consistent estimators of these coefficients. A spectral density-driven bootstrap for time series is then developed which uses the entire sequence of estimated MA coefficients together with appropriately generated pseudo innovations to obtain a bootstrap pseudo time series. It is shown that if the underlying process is linear and if the pseudo innovations are generated by means of an i.i.d. wild bootstrap which mimics, to the necessary extent, the moment structure of the true innovations, this bootstrap proposal asymptotically works for a wide range of statistics. The relations of the proposed bootstrap procedure to some other bootstrap procedures, including the autoregressive-sieve bootstrap, are discussed. It is shown that the latter is a special case of the spectral density-driven bootstrap if a parametric autoregressive spectral density estimator is used. Simulations investigate the performance of the new bootstrap procedure in finite sample situations. Furthermore, a real-life data example is presented.

## 1 Introduction

The spectral density (SD), if it exists, plays an important role as a quantity which completely describes the so-called second-order properties of stationary time series. A broad literature exists on SD estimators, among them parametric (e.g. autoregressive) estimators, nonparametric (e.g. lag window or smoothed periodogram) estimators or semiparametric estimators as a mixture of both. Time series analysts typically are rather skilled in estimating spectral densities and they know, depending on the required application, the pros and cons of the various estimators. This paper intends to bring together several bootstrap procedures under the umbrella of SD estimation.

Recall that for a purely nondeterministic and stationary stochastic process $X=\{X_t, t\in\mathbb{Z}\}$ with SD $f$, Szegö’s factorization expresses $f$ by means of a power series. The coefficients of this factorization, appropriately normalized, coincide with the coefficients of the well-known Wold representation of $X$. Recursive formulas, which make use of the Fourier coefficients of $\log f$ (the so-called cepstral coefficients) to calculate the coefficients of the Wold representation of the process, have been developed; cf. Pourahmadi (1983). Moreover, if $f$ is strictly positive then $X$ also obeys an autoregressive (AR) representation, and similar recursive formulas to compute the coefficients of this representation have also been derived; see again Pourahmadi (1983). Using these recursions we suggest a procedure to estimate the coefficients of both the moving average (MA) and the AR representations based on an estimator $\hat f_n$ of the SD $f$. In particular, we show that under certain conditions on $f$ and on the estimator $\hat f_n$ used, the sequence of coefficients of the Wold and of the AR representation of the process can be consistently estimated. Furthermore, under additional smoothness conditions the pointwise consistency of the estimators can be extended to uniform consistency for the entire sequence of coefficients. It should be noted that the factorization of the spectral density has also been considered in the literature for implementing and investigating the so-called Wiener-Kolmogorov predictor in linear prediction (cf. Jones (1964), Bhansali (1974, 1977) and Pourahmadi (1983)).

The availability of estimates of the MA coefficients of the Wold representation enables the development of a general spectral density-driven bootstrap (SDDB) procedure for time series. In particular, a pseudo time series can be generated by using the estimated sequence of MA coefficients and an appropriately chosen sequence of pseudo innovations. The resulting bootstrap procedure is then fully determined by the particularly chosen SD estimator and the stochastic properties of the generated pseudo innovations. The estimated Wold representation used should mainly be regarded as a means to an end to generate a pseudo time series which exactly has the chosen SD estimator as its spectrum.

For instance, choosing a parametric AR SD estimator, the coefficients of the estimated Wold representation coincide with the coefficients of the inverted estimated AR polynomial and therefore the AR model can just as well be used to generate the bootstrap data. In other words, using a parametric AR SD estimator will lead to the well-known AR-sieve bootstrap for time series (cf. Kreiss (1992), Bühlmann (1997) and Kreiss et al. (2011)). However, a parametric AR SD estimator often is not the first choice. Let us consider a nonparametric competitor, for instance, a lag window estimator of $f$ with a finite truncation lag. As we will see, this will lead us essentially to an MA process of finite order which can be used to generate the pseudo time series. Therefore, the SDDB proposed in this paper is a general notion of bootstrap for time series which allows for a variety of possibilities to generate the pseudo time series. These possibilities are determined by the particular SD estimator used to obtain the estimates of the coefficients of the Wold representation. Notice that although the SDDB generates bootstrap pseudo time series in the time domain, the second-order dependence structure of the underlying process is entirely mimicked in the frequency domain by means of the selected SD estimator. Thus, various well-known and flexible methods for SD estimation can be used in our bootstrap method. As a consequence, we formulate the assumptions needed for our theoretical developments in terms of the SD and its estimator only. This allows us to restrict the class of admissible SD estimators as little as possible.

Fed by independent and identically distributed (i.i.d.) pseudo innovations the proposed SDDB generates pseudo time series stemming from a linear process. For such a choice of pseudo innovations we compare our bootstrap proposal to some other linear bootstrap procedures, like the AR-sieve bootstrap (cf. Kreiss (1992) and Bühlmann (1997)) and the linear process bootstrap, cf. McMurry and Politis (2010). As already indicated, it is shown that the AR-sieve bootstrap is a special case of the SDDB which is obtained if a parametric AR SD estimator is used. Furthermore, we show that the linear process bootstrap essentially generates pseudo observations by factorizing banded autocovariance matrices. This technique is related to the factorization of spectral densities which is used in this paper. However, in finite samples the two approaches differ from each other.

It is worth mentioning that pseudo innovations generated in a different way than i.i.d. could also be used in the proposed SDDB procedure. For instance, pseudo innovations generated by means of a block bootstrap applied to appropriately defined residuals may be used. Although such a proposal would most likely extend the range of validity of the SDDB to nonlinear time series, we do not consider such an approach in this paper, i.e., we restrict ourselves to the linear process set-up. We show that if the pseudo innovations are generated by means of an i.i.d. wild bootstrap that appropriately mimics the first, the second, and the fourth moment structure of the true innovations, then the proposed SDDB is asymptotically valid for a wide range of statistics commonly used in time series analysis. Besides the sample mean, statistics described by the so-called class of generalized autocovariances are also considered; see Section 3 for details. We demonstrate by means of simulations that our asymptotic findings are matched by good finite sample behavior of the proposed bootstrap procedure. Furthermore, the performance of the new bootstrap method is compared with that of the asymptotic normal approximations and of some other bootstrap competitors, like the linear process, the AR-sieve and the tapered block bootstrap. R code to generate pseudo time series with the SDDB is available at www.tu-bs.de/Medien-DB/stochastik/code-snippet_sddb.txt.

The paper is organized as follows: Section 2 briefly describes the Wold and the AR representation of a stationary time series and discusses the method used to estimate the entire sequence of coefficients in both representations. Local and global consistency properties of the estimators are established. Section 3 introduces the SDDB procedure for time series and establishes, for linear processes and for relevant classes of statistics, its asymptotic validity. A comparison with the AR-sieve and with the linear process bootstrap is also given in this section. Section 4 presents some numerical simulations investigating the finite sample behavior of the proposed bootstrap method and compares its performance with that of other bootstrap methods and of asymptotic normal approximations. A real-life data example demonstrates the applicability of the suggested bootstrap procedure. Finally, auxiliary results as well as proofs of the main results are deferred to Section 6.

## 2 Estimated Wold Representation

### 2.1 Moving Average and Autoregressive Representation

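Stationary processes are commonly classified using the concept of linear prediction; see for example (Brockwell and Davis, 1991, Section 5.7) or (Pourahmadi, 2001, Section 5.5). To elaborate, let $X=\{X_t, t\in\mathbb{Z}\}$ be a stationary stochastic process and define by $\mathcal{M}_n(X)=\overline{\operatorname{span}}\{X_t, t\le n\}$ and $\mathcal{M}_{-\infty}(X)=\bigcap_{n\in\mathbb{Z}}\mathcal{M}_n(X)$ the closed linear subspaces of the Hilbert space $L_2(\Omega,\mathcal{F},P)$. Note that an overlined set denotes its closure. Let $P_{\mathcal{M}_n(X)}X_{n+1}$ be the projection of $X_{n+1}$ onto $\mathcal{M}_n(X)$ and define by $\sigma^2=E|X_{n+1}-P_{\mathcal{M}_n(X)}X_{n+1}|^2$ the mean square error of the best (in the mean square sense) one-step linear predictor. The process $X$ is called deterministic if and only if $X_t\in\mathcal{M}_{-\infty}(X)$ or equivalently if and only if $\sigma^2=0$. It is called nondeterministic if $X_t\notin\mathcal{M}_{-\infty}(X)$ and consequently $\sigma^2>0$. Furthermore, it is called purely nondeterministic if it is nondeterministic and $\mathcal{M}_{-\infty}(X)=\{0\}$.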

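If the process $X$ possesses a SD $f$, which is the case if $\sum_{h\in\mathbb{Z}}|\gamma(h)|<\infty$, with $\gamma(h)=\operatorname{Cov}(X_0,X_h)$, then it holds true that $X$ is nondeterministic if and only if

$$\int_{(-\pi,\pi]}\log f(\lambda)\,d\lambda>-\infty. \tag{2.1}$$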

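Wold’s decomposition, see (Pourahmadi, 2001, Theorem 5.11), guarantees that any nondeterministic process can be divided into a deterministic and a purely nondeterministic part. Furthermore, the purely nondeterministic part of the process has a unique one-sided moving average (MA) representation given by

$$X_t=\sum_{k=0}^{\infty}c_k\varepsilon_{t-k},\quad t\in\mathbb{Z}, \tag{2.2}$$

where $c_0=1$, $\sum_{k=0}^{\infty}c_k^2<\infty$ and $\{\varepsilon_t, t\in\mathbb{Z}\}$ is a white noise process defined by the one-step linear prediction errors $\varepsilon_t=X_t-P_{\mathcal{M}_{t-1}(X)}X_t$, $t\in\mathbb{Z}$, called the innovation process. Here, white noise refers to an uncorrelated time series. Notice that even if $X$ is a linear process driven by i.i.d. innovations, the white noise process appearing in the corresponding one-sided MA representation (2.2) might not be i.i.d. To give an example, consider the linear, first order MA process $X_t=e_t+\theta e_{t-1}$, where $\{e_t\}$ is an i.i.d. process with variance $\sigma_e^2$ and $|\theta|>1$. The Wold representation of this process is given by $X_t=\varepsilon_t+\theta^{-1}\varepsilon_{t-1}$, where $\{\varepsilon_t\}$ is white noise with variance $\theta^2\sigma_e^2$. The innovations $\varepsilon_t$ are, in general, not independent.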
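The variance of the Wold innovations in this example can be checked by a short worked equation. The notation is an assumption made here for illustration: the process is written as $X_t=e_t+\theta e_{t-1}$ with $|\theta|>1$ and $\operatorname{Var}(e_t)=\sigma_e^2$. Both MA(1) representations must share the same spectral density:

$$f(\lambda)=\frac{\sigma_e^2}{2\pi}\big|1+\theta e^{-i\lambda}\big|^2=\frac{\sigma_e^2}{2\pi}\big(1+2\theta\cos\lambda+\theta^2\big)=\frac{\theta^2\sigma_e^2}{2\pi}\big(1+2\theta^{-1}\cos\lambda+\theta^{-2}\big)=\frac{\theta^2\sigma_e^2}{2\pi}\big|1+\theta^{-1}e^{-i\lambda}\big|^2 .$$

The polynomial $1+\theta^{-1}z$ has its only root at $z=-\theta$, which lies outside the unit disc, so the right-hand side is the factorization that corresponds to the Wold representation, with innovation variance $\theta^2\sigma_e^2$.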

Another interesting one-sided representation of the process is the so-called AR representation, which appears if the SD is bounded away from zero, i.e., if $\inf_{\lambda\in(-\pi,\pi]}f(\lambda)>0$. In this case, instead of using the full history of the innovation process $\{\varepsilon_t\}$, the full history of the process $X$ itself is used to express the value $X_t$ at any time point $t$. $X_t$ can then be written as

$$X_t=\sum_{k=1}^{\infty}b_kX_{t-k}+\varepsilon_t,\quad t\in\mathbb{Z}, \tag{2.3}$$

where $\sum_{k=1}^{\infty}b_k^2<\infty$ and $\{\varepsilon_t\}$ is the same white noise innovation process as in (2.2); see (Pourahmadi, 2001, Section 6.2.1). Expression (2.3) is called the AR representation of the process $X$ and should not be confused with that of a linear, infinite order AR process driven by i.i.d. innovations. To demonstrate this, consider again the previous example of the linear, noninvertible MA process $X_t=e_t+\theta e_{t-1}$ with $|\theta|>1$. This process has the AR representation $X_t=-\sum_{k=1}^{\infty}(-\theta)^{-k}X_{t-k}+\varepsilon_t$, where $\{\varepsilon_t\}$ is the uncorrelated but not independent white noise process appearing in the Wold representation of $X$.

To derive recursive formulas for the coefficients $\{c_k\}$ in the MA representation (2.2) and $\{b_k\}$ in the AR representation (2.3), we start with some basic factorization properties of the SD $f$. Notice first that $f$ can be expressed as the squared modulus of a power series evaluated on the unit circle and that such a factorization exists if and only if condition (2.1) is fulfilled; see Szegö (1921). The above factorization of the SD is not unique. However, if we restrict ourselves to power series which have no roots inside the unit disk and appropriately normalize the coefficients, a unique representation occurs. The coefficients of this unique power series coincide with the coefficients of the Wold representation (2.2) if the normalization $c_0=1$ is used. We denote this unique and normalized power series by $C(z)=\sum_{k=0}^{\infty}c_kz^k$. Notice that (2.1) ensures that the Fourier coefficients of $\log f$ are well defined. Furthermore, since $C(z)$ has no zeros inside the open unit disc, $\log C(z)$ is analytic inside the same region and we have for $|z|<1$ that

$$\sigma(2\pi)^{-1/2}\sum_{k=0}^{\infty}c_kz^k=\exp\Big(a_0/2+\sum_{k=1}^{\infty}a_kz^k\Big), \tag{2.4}$$

where $a_k$ is the $k$-th Fourier coefficient of $\log f$,

$$a_k=\int_{-\pi}^{\pi}\log f(\lambda)\exp(-ik\lambda)\,d\lambda/(2\pi). \tag{2.5}$$

Differentiation of equation (2.4) together with comparison of coefficients leads to a recursive formula to calculate the coefficients of this power series by using the Fourier coefficients of $\log f$; see Pourahmadi (1983, 1984). In particular, setting $c_0=1$, the following recursive formula can be used to obtain the coefficients $\{c_k, k\ge 1\}$:

$$c_{k+1}=\sum_{j=0}^{k}\Big(1-\frac{j}{k+1}\Big)a_{k+1-j}\,c_j,\quad k=0,1,2,\ldots \tag{2.6}$$
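The recursion (2.6) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation; the function name `wold_coefficients` and the AR(1) test case are assumptions introduced here. For an AR(1) process with parameter $\phi$, the cepstral coefficients are $a_k=\phi^k/k$ for $k\ge 1$, and the recursion should return $c_k=\phi^k$.

```python
def wold_coefficients(a, n_coef):
    """Return [c_0, ..., c_{n_coef}] from cepstral coefficients a[0], a[1], ...

    Implements c_{k+1} = sum_{j=0}^{k} (1 - j/(k+1)) * a_{k+1-j} * c_j with c_0 = 1.
    Cepstral coefficients with index >= len(a) are treated as zero.
    Note that a[0] is not used by the recursion; it only determines sigma^2.
    """
    def a_at(k):
        return a[k] if k < len(a) else 0.0

    c = [1.0]
    for k in range(n_coef):
        c.append(sum((1.0 - j / (k + 1)) * a_at(k + 1 - j) * c[j]
                     for j in range(k + 1)))
    return c


# Hypothetical check: AR(1) with parameter phi has a_k = phi**k / k for k >= 1.
phi = 0.5
a = [0.0] + [phi ** k / k for k in range(1, 11)]
c = wold_coefficients(a, 10)
```

In this check, `c[k]` agrees with `phi ** k` up to floating point error, which is the MA($\infty$) expansion of the AR(1) model.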

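Furthermore, $\sigma^2=2\pi\exp(a_0)$. If the process $X$ possesses also the AR representation (2.3), then the coefficients $\{b_k\}$ of this representation can be calculated using the relation $1-\sum_{k=1}^{\infty}b_kz^k=\big(\sum_{k=0}^{\infty}c_kz^k\big)^{-1}$. Setting $b_0=-1$, the corresponding recursive formula to obtain the $b_k$’s is given by

$$b_{k+1}=-\sum_{j=0}^{k}\Big(1-\frac{j}{k+1}\Big)a_{k+1-j}\,b_j,\quad k=0,1,2,\ldots \tag{2.7}$$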
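A matching sketch for the recursion (2.7), again only as an illustration with hypothetical names. The check uses an invertible MA(1) process $X_t=\varepsilon_t+\theta\varepsilon_{t-1}$ with $|\theta|<1$, for which $a_k=-(-\theta)^k/k$ and the AR representation has coefficients $b_k=-(-\theta)^k$.

```python
def ar_coefficients(a, n_coef):
    """Return [b_1, ..., b_{n_coef}] from cepstral coefficients a[0], a[1], ...

    Implements b_{k+1} = -sum_{j=0}^{k} (1 - j/(k+1)) * a_{k+1-j} * b_j
    with the starting value b_0 = -1. Missing cepstral coefficients count as zero.
    """
    def a_at(k):
        return a[k] if k < len(a) else 0.0

    b = [-1.0]
    for k in range(n_coef):
        b.append(-sum((1.0 - j / (k + 1)) * a_at(k + 1 - j) * b[j]
                      for j in range(k + 1)))
    return b[1:]  # drop the auxiliary starting value b_0


# Hypothetical check: invertible MA(1) with parameter theta.
theta = 0.4
a = [0.0] + [-((-theta) ** k) / k for k in range(1, 13)]
b = ar_coefficients(a, 12)
```

Here `b[k - 1]` agrees with `-(-theta) ** k`, so the AR coefficients decay geometrically but never vanish, in contrast to the single nonzero MA coefficient.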

A proof of (2.4) can be found in the Supplementary Material, Lemma A.2. As we see from the proof of (2.4), this approach cannot be transferred directly to the multivariate case. Matrix multiplication is not commutative and therefore the exponential laws do not apply for matrices. However, these properties are essential for the proof of (2.4). Moreover, there are examples where (2.4) is not valid in the multivariate case. Consequently, the recursive formulae (2.6) and (2.7) cannot be directly applied to multivariate time series.

### 2.2 Estimating the Coefficients of the Wold Representation

Our next goal is to estimate the coefficients $\{c_k\}$ of the Wold representation (2.2). The basic idea is to use an estimator $\hat f_n$ of the SD $f$ to get estimates of the Fourier coefficients of $\log f$ and to plug these estimates into the recursive formula (2.6). Notice that estimates of the coefficients of the AR representation (2.3) can be obtained by using formula (2.7) and the estimates of the $a_k$’s.

Let $\hat a_{k,n}=\int_{-\pi}^{\pi}\log\hat f_n(\lambda)\exp(-ik\lambda)\,d\lambda/(2\pi)$ be the estimator of the $k$-th Fourier coefficient of $\log f$ and denote by $\hat c_{k,n}$, $k\in\mathbb{N}$, the estimators of the coefficients of the Wold representation obtained using formula (2.6) with $a_k$ replaced by $\hat a_{k,n}$. Let $\hat b_{k,n}$, $k\in\mathbb{N}$, be the corresponding estimators of the coefficients of the AR representation obtained using formula (2.7). Using the above recursive formulas, it is theoretically possible to compute the infinite sequence of coefficients corresponding to $\hat f_n$. However, in practice the computation of the Fourier coefficients is usually approximated by a sum over a finite number of frequencies. This limits the number of MA coefficients that can be computed and introduces an approximation error. This error depends on the smoothness of $\hat f_n$ and is usually negligible; see Supplementary Material A.1 for details. To give an example, consider Model II used in the simulation study; see Section 4. This model possesses the slowest decaying autocovariance of all three models considered. Nevertheless, even for this model, computing Wold’s coefficients from a finite grid of Fourier frequencies gives a negligible overall squared error.
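The finite-sum approximation of the Fourier coefficients mentioned above can be sketched as follows. The function name, the grid size `n_grid`, and the AR(1) spectral density used as input are assumptions for illustration; any positive, symmetric spectral density (estimate) could be plugged in instead.

```python
import math


def cepstral_coefficients(f, n_coef, n_grid=1024):
    """Approximate a_0, ..., a_{n_coef} from (2.5) by a Riemann sum.

    f must be a positive function of the frequency that is symmetric around
    zero, so that the Fourier coefficients of log f are real cosine sums.
    """
    lam = [2.0 * math.pi * j / n_grid for j in range(n_grid)]
    log_f = [math.log(f(l)) for l in lam]
    return [sum(v * math.cos(k * l) for v, l in zip(log_f, lam)) / n_grid
            for k in range(n_coef + 1)]


# Hypothetical input: spectral density of an AR(1) process.
phi, sigma2 = 0.5, 2.0


def ar1_sd(lam):
    return sigma2 / (2.0 * math.pi * (1.0 - 2.0 * phi * math.cos(lam) + phi ** 2))


a_hat = cepstral_coefficients(ar1_sd, n_coef=10)
sigma2_from_a0 = 2.0 * math.pi * math.exp(a_hat[0])  # sigma^2 = 2*pi*exp(a_0)
```

For this smooth spectral density the grid approximation reproduces $a_k=\phi^k/k$ and the innovation variance essentially to machine precision; for rougher spectral densities a finer grid may be needed.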

It is clear that the properties of the estimators $\hat c_{k,n}$ and $\hat b_{k,n}$ depend heavily on the properties of the estimator $\hat f_n$. To obtain consistency, the following condition suffices, which essentially requires that $\hat f_n$ is a uniformly consistent estimator of $f$. For lag window estimators such a uniform consistency has been established by Jentsch and Subba Rao (2015, Lemma A.2), and for AR SD estimators by Bühlmann (1995, Theorem 3.2).

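Assumption 1 The estimator $\hat f_n$ satisfies $\int_{(-\pi,\pi]}\log\hat f_n(\lambda)\,d\lambda>-\infty$. Furthermore,

$$\sup_{\lambda\in[0,\pi]}\big|\hat f_n(\lambda)-f(\lambda)\big|\overset{P}{\to}0,\quad\text{as } n\to\infty. \tag{2.8}$$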

Then, the following result can be established.

###### Theorem 2.1

Suppose that $f$ satisfies (2.1) and that Assumption 1 holds true. Then, as $n\to\infty$, a) , and for every fixed $k$, b) , and c)  .

By the above theorem, for an $m$-dependent process, we have . Imposing more conditions on $f$ and its estimator $\hat f_n$, the consistency properties of the estimators $\hat c_{k,n}$ and $\hat b_{k,n}$ can be refined and inequalities similar to the well-known Baxter inequality for the AR coefficients, Baxter (1962), can be established. Such inequalities are useful since they control the overall estimation error that occurs when the estimated SD instead of the true SD is used to obtain the estimates of interest.

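Assumption 2 The estimator $\hat f_n$ fulfills the following conditions.

(i) There exist constants $0<C_1<C_2<\infty$ such that $C_1\le\hat f_n(\lambda)\le C_2$ for all $\lambda\in[0,\pi]$ and all $n\in\mathbb{N}$.

(ii) The first derivative of $\hat f_n(\lambda)$ with respect to $\lambda$ exists, is continuous and integrable. Furthermore,

$$\sup_{\lambda\in[-\pi,\pi]}\Big|\frac{d}{d\lambda}\hat f_n(\lambda)-\frac{d}{d\lambda}f(\lambda)\Big|\overset{P}{\to}0,\quad\text{as } n\to\infty. \tag{2.9}$$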

Condition (ii) can be verified for lag window estimators by using similar arguments as in the proof of Lemma A.2 in Jentsch and Subba Rao (2015) under the same cumulant conditions and a slightly faster decay of the autocovariance function. For the AR SD estimators the same condition can be verified by using arguments similar to those used in the proof of Theorem 3.2 in Bühlmann (1995). Notice that boundedness of the SD is ensured by an absolutely summable autocovariance function, which is a common assumption for bootstrap procedures for time series. Furthermore, the assumption regarding the existence of derivatives of the SD can be transferred to assumptions on the summability of the autocovariance function. However, since the bootstrap approach proposed in this paper is SD-driven, we prefer to formulate the conditions needed as assumptions for the SD of the underlying process. The following theorem summarizes the properties of the estimators $\hat a_{k,n}$ and $\hat c_{k,n}$.

###### Theorem 2.2

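Let the spectral density $f$ be strictly positive and bounded with continuous and integrable first derivative. Then, as $n\to\infty$,

1. If $\hat f_n$ satisfies Assumption 1 and Assumption 2(i), then

$$\sum_{k=-\infty}^{\infty}|\hat a_{k,n}-a_k|^2=\int_0^{2\pi}\big|\log f(\lambda)-\log\hat f_n(\lambda)\big|^2\,d\lambda/(2\pi)\overset{P}{\to}0 \tag{2.10}$$

and

$$\sum_{k=0}^{\infty}|\hat c_{k,n}-c_k|^2\overset{P}{\to}0. \tag{2.11}$$

2. If $\hat f_n$ satisfies Assumption 1 and Assumption 2, then

$$\sum_{k=-\infty}^{\infty}k^2|\hat a_{k,n}-a_k|^2\overset{P}{\to}0\quad\text{and}\quad\sum_{k=1}^{\infty}|\hat a_{k,n}-a_k|\overset{P}{\to}0. \tag{2.12}$$

Furthermore,

$$\sum_{k=0}^{\infty}k^2|\hat c_{k,n}-c_k|^2\overset{P}{\to}0\quad\text{and}\quad\sum_{k=0}^{\infty}|\hat c_{k,n}-c_k|\overset{P}{\to}0. \tag{2.13}$$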

Relation (2.4) plays a key role in the proofs of assertions (2.11) and (2.13). Notice that since , similar relations for can be derived. Furthermore, the results of Theorem 2.2 can be straightforwardly extended to the sequence of estimation errors , .

There are some alternative approaches to estimate the coefficients $\{c_k\}$ and $\{b_k\}$ which have been proposed in the literature. In particular, for estimating the coefficients $c_k$, one option is the innovation algorithm, which works by fitting MA($q$) models where the order $q$ increases to infinity as the sample size increases to infinity; see (Brockwell and Davis, 1991, Proposition 5.2.2). For estimating the coefficients $b_k$, commonly an AR($p$) model is fitted to the time series at hand by means of Yule-Walker estimators, where the order $p$ is also allowed to increase to infinity with the sample size; see (Brockwell and Davis, 1991, Section 8.1). Under certain conditions, both approaches are consistent; see (Pourahmadi, 2001, Theorem 7.14). However, the basic idea behind these approaches differs from ours and so do the estimators obtained via SD factorization. In the above mentioned approaches, the estimated autocovariance matrix is used to fit a finite MA or a finite AR model. Consistency of the corresponding estimators is then obtained by allowing the order of the fitted model to increase to infinity at an appropriate rate as the sample size increases to infinity. These approaches face, therefore, two sources of errors. The first is the estimation error which is caused by the fact that estimated autocovariances are used instead of the true ones. The second is the approximation error which is due to the fact that a finite order model is used to approximate the underlying infinite order structure. Although the estimation error cannot be avoided, the approximation error caused by our estimation procedure is different. This error depends on the quality of the SD estimator $\hat f_n$ used to approximate the true SD $f$, where $\hat f_n$ is selected from a wide range of possible estimators and not only from those obtained by using finite order AR or MA parametric models. The innovation algorithm is similar to the factorization of autocovariance matrices which is used in the linear process bootstrap. In the Supplementary Material, see Section A.2, a simple example is discussed to point out the differences between factorizing autocovariance matrices and spectral densities.

### 2.3 Spectral Density Estimators

Since our estimation procedure relies on a SD estimator $\hat f_n$, we briefly discuss the variety of such estimators and their impact on the estimators $\hat c_{k,n}$ or $\hat b_{k,n}$ obtained.

As already mentioned, spectral densities can be estimated using a parametric approach, that is, by fitting a parametric model to the time series at hand and using the SD of the fitted model as an estimator of the SD of the process. Since AR models are easy to fit, they are commonly used for such a purpose; see Akaike (1969), Shibata (1981). In this context, parameter estimators like Yule-Walker estimators are popular because they ensure invertibility of the corresponding estimated AR polynomial; see (Brockwell and Davis, 1991, Section 8.1). Now, if an AR SD estimator is used in the spectral factorization procedure, then the estimated coefficients $\hat c_{k,n}$ obtained are identical to those appearing in the power series expansion of the inverted estimated AR polynomial. Furthermore, the corresponding sequence of estimated coefficients $\hat b_{k,n}$ is finite and the $\hat b_{k,n}$’s coincide with the estimated AR parameters.

Using nonparametric methods like lag window or kernel smoothed periodogram estimators is another popular approach to estimate the SD; cf. (Brockwell and Davis, 1991, Section 10.4). Lag window estimators truncate the estimated autocovariances at a given lag controlled by a truncation parameter. Such estimators of the SD can be interpreted as obtained by (implicitly) fitting a finite order MA model to the time series at hand; see also (Brockwell and Davis, 1991, Prop. 3.2.1). The sequence of estimated coefficients of the Wold representation obtained by using such a SD estimator is finite with $\hat c_{k,n}=0$ for values of $k$ larger than the truncation parameter. Due to the asymptotic equivalence between lag window and smoothed periodogram estimators, similar remarks can be made also for SD estimators obtained by smoothing the periodogram. Furthermore, as mentioned in Section 2.2, lag window estimators as well as AR estimators satisfy Assumptions 1 and 2.
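A lag window estimator of the kind described above can be sketched as follows. The Bartlett (triangular) window, the function names, and the toy input series are choices made here for illustration only; the paper does not prescribe a particular window. With the biased sample autocovariances (divisor $n$) and Bartlett weights, the estimate is non-negative by construction; the strict positivity needed to take $\log\hat f_n$ is assumed rather than enforced in this sketch.

```python
import math


def sample_autocovariances(x, max_lag):
    """Biased sample autocovariances gamma_hat(0), ..., gamma_hat(max_lag)."""
    n = len(x)
    mean = sum(x) / n
    return [sum((x[t] - mean) * (x[t + h] - mean) for t in range(n - h)) / n
            for h in range(max_lag + 1)]


def lag_window_sd(x, trunc_lag):
    """Return a function lam -> f_hat(lam) using Bartlett lag window weights."""
    gamma = sample_autocovariances(x, trunc_lag)

    def f_hat(lam):
        total = gamma[0]
        for h in range(1, trunc_lag + 1):
            weight = 1.0 - h / (trunc_lag + 1)  # Bartlett weight at lag h
            total += 2.0 * weight * gamma[h] * math.cos(h * lam)
        return total / (2.0 * math.pi)

    return f_hat


# Hypothetical toy series (deterministic, just to exercise the code).
x = [math.sin(0.7 * t) + 0.3 * math.cos(2.1 * t) for t in range(200)]
f_hat = lag_window_sd(x, trunc_lag=10)
grid = [2.0 * math.pi * j / 64 for j in range(64)]
values = [f_hat(l) for l in grid]
```

Because only autocovariances up to lag 10 enter, `f_hat` is the spectral density of an (implicitly fitted) MA model of order at most 10, which is why the estimated Wold coefficients vanish beyond the truncation lag.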

A different nonparametric approach to estimate the SD is to truncate the Fourier series of $\log f$, which presumes an exponential model for the SD; see Bloomfield (1973). Such a model is given by $f(\lambda)=\frac{\sigma^2}{2\pi}\exp\big\{2\sum_{k=1}^{r}\theta_k\cos(k\lambda)\big\}$. Unlike truncating the autocovariance function, non-negativity of the SD is ensured for all possible values of the parameters $\theta_1,\ldots,\theta_r$. As Bloomfield (1973) pointed out, the autocovariance function of such an exponential model cannot, in general, be described by a finite AR or a finite MA model. Thus, using such an estimator of the SD in the factorization algorithm leads to an infinite sequence of estimators $\hat c_{k,n}$ or $\hat b_{k,n}$, respectively. Notice that the Fourier coefficients of $\log f$ are also known as the cepstral coefficients or vocariances and they have been widely used in the signal processing literature to estimate the SD; see Stoica and Sandgren (2006).
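The positivity property of the exponential model is easy to see in code. The sketch below is written in the cepstral notation of (2.5), where a truncated Fourier series of $\log f$ gives $\log f(\lambda)=a_0+2\sum_{k=1}^{r}a_k\cos(k\lambda)$; the function name and the parameter values are hypothetical.

```python
import math


def exponential_model_sd(a):
    """Return lam -> exp(a[0] + 2 * sum_{k>=1} a[k] * cos(k * lam)).

    a[0], ..., a[r] are (truncated) cepstral coefficients. The result is
    strictly positive for every real parameter vector, because it is the
    exponential of a real number.
    """
    def f(lam):
        return math.exp(a[0] + 2.0 * sum(a[k] * math.cos(k * lam)
                                         for k in range(1, len(a))))
    return f


# Hypothetical parameters, deliberately large to stress positivity.
f_exp = exponential_model_sd([-1.0, 3.0, -5.0])
grid = [2.0 * math.pi * j / 128 for j in range(128)]
exp_values = [f_exp(l) for l in grid]

# With only a_0 set, the model reduces to a white noise spectrum exp(a_0).
f_white = exponential_model_sd([math.log(1.0 / (2.0 * math.pi))])
```

By contrast, truncating or reweighting sample autocovariances without care can produce negative values of the estimated SD, in which case $\log\hat f_n$ and hence the factorization would not be defined.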

An interesting combination of nonparametric and parametric approaches for SD estimation is offered by the so-called pre-whitening approach; see Blackman and Tukey (1958). The idea is to use a parametric model to filter the time series and then apply a nonparametric estimator to the time series of residuals. Using an AR model for pre-whitening (filtering) and a lag window estimator for estimating the SD of the residuals can be interpreted as (implicitly) fitting an ARMA model to the time series at hand. The idea is that the parametric AR model fit is able to represent the peaks of the SD quite well while the lag window estimator applied to the residuals can capture features of the SD that are not covered by the parametric fit. Notice that for the pre-whitening approach consistency of the lag window estimator is obtained even in the case where the parametric fit does not improve the estimation. Using such a SD estimator for the factorization algorithm, the coefficients $\hat c_{k,n}$ and $\hat b_{k,n}$ obtained will be those of the infinite order MA representation and infinite order AR representation of the (implicitly) fitted ARMA model, respectively. However, to reduce numerical errors, the use of the ARMA representation is recommended: the MA coefficients are obtained by the factorization of the pre-whitened SD and the AR coefficients are those of the fitted AR model.

## 3 Spectral Density-Driven Bootstrap

### 3.1 The Spectral Density-Driven Bootstrap Procedure

In the previous section we have dealt with the coefficients of the MA and of the AR representation of the process. For the coefficients in both representations, consistent estimators have been developed. Consequently, both representations can be used in principle to develop a bootstrap procedure to generate pseudo time series $X_1^*,\ldots,X_n^*$. We focus in this work on the MA representation, since it exists for every SD. Clearly, such a bootstrap procedure will be determined by the SD estimator $\hat f_n$ used to obtain the coefficients $\hat c_{k,n}$ and by the generated series of pseudo innovations (cf. Step 3 below). Thus, the tuning parameters of this bootstrap procedure coincide with those used for the SD estimation. Consequently, one can follow data-driven methods proposed in the literature to choose these parameters. Now, given an estimator $\hat f_n$ of the SD $f$, the SDDB algorithm consists of the following steps.

1. Compute the Fourier coefficients of $\log\hat f_n$ given by $\hat a_{k,n}=\int_{-\pi}^{\pi}\log\hat f_n(\lambda)\exp(-ik\lambda)\,d\lambda/(2\pi)$ for $k=0,1,2,\ldots$.

2. Let $\hat\sigma_n^2=2\pi\exp(\hat a_{0,n})$ and compute the coefficients $\hat c_{k,n}$ using the formula $\hat c_{k+1,n}=\sum_{j=0}^{k}\big(1-\frac{j}{k+1}\big)\hat a_{k+1-j,n}\,\hat c_{j,n}$, $k=0,1,2,\ldots$, and the starting value $\hat c_{0,n}=1$.

3. Generate i.i.d. pseudo innovations $\{\varepsilon_t^*, t\in\mathbb{Z}\}$ with mean zero and variance $\hat\sigma_n^2$.

4. The pseudo time series $X_1^*,\ldots,X_n^*$ is then obtained as $X_t^*=\bar X_n+\sum_{k=0}^{\infty}\hat c_{k,n}\varepsilon_{t-k}^*$, $t=1,\ldots,n$, where $\bar X_n$ is the sample mean.
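The four steps can be put together in a compact sketch. This is not the authors' R code; it is a minimal Python illustration under several assumptions made here: the Fourier coefficients are approximated on a finite frequency grid, the infinite MA sum is truncated after `n_coef` terms, and the pseudo innovations are Gaussian for simplicity (which matches only the first two moments of the true innovations). All names are hypothetical.

```python
import math
import random


def sddb_pseudo_series(f_hat, x_mean, n, n_coef=200, n_grid=1024, rng=None):
    """Generate one SDDB pseudo series of length n from an SD estimate f_hat."""
    rng = rng or random.Random()
    # Step 1: Fourier coefficients of log f_hat via a Riemann sum.
    lam = [2.0 * math.pi * j / n_grid for j in range(n_grid)]
    log_f = [math.log(f_hat(l)) for l in lam]
    a = [sum(v * math.cos(k * l) for v, l in zip(log_f, lam)) / n_grid
         for k in range(n_coef + 1)]
    # Step 2: innovation variance and Wold coefficients via the recursion (2.6).
    sigma2 = 2.0 * math.pi * math.exp(a[0])
    c = [1.0]
    for k in range(n_coef):
        c.append(sum((1.0 - j / (k + 1)) * a[k + 1 - j] * c[j]
                     for j in range(k + 1)))
    # Step 3: i.i.d. pseudo innovations (Gaussian here, as a simplification).
    sd = math.sqrt(sigma2)
    eps = [rng.gauss(0.0, sd) for _ in range(n + n_coef)]
    # Step 4: truncated MA filter plus the sample mean.
    x_star = [x_mean + sum(c[k] * eps[t + n_coef - k] for k in range(n_coef + 1))
              for t in range(n)]
    return x_star, c, sigma2


# Hypothetical SD "estimate": the SD of an AR(1) process with phi = 0.5.
phi, true_sigma2 = 0.5, 2.0


def ar1_sd(lam):
    return true_sigma2 / (2.0 * math.pi * (1.0 - 2.0 * phi * math.cos(lam) + phi ** 2))


x_star, c_hat, sigma2_hat = sddb_pseudo_series(ar1_sd, x_mean=0.0, n=50,
                                               rng=random.Random(1))
```

With this input the computed coefficients reproduce $c_k=\phi^k$, so the pseudo series behaves like an AR(1) path, in line with the remark that an AR SD estimator turns the SDDB into the AR-sieve bootstrap. In an application, `f_hat` would be an SD estimate computed from the data and `x_mean` the sample mean.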

It should be stressed that the above bootstrap algorithm with i.i.d. pseudo innovations represents a general procedure to generate a pseudo time series stemming from a linear process. Regarding the particular generation of the i.i.d. innovations in Step 3, different possibilities can be considered depending on the stochastic properties of the time series at hand which should be mimicked by the pseudo time series $X_1^*,\ldots,X_n^*$. In particular, suppose that $X_1,\ldots,X_n$ stems from a linear process and that a statistic $T_n$ is considered, the distribution of which should be approximated by the bootstrap. We then propose to generate the i.i.d. innovations $\varepsilon_t^*$ in a way which asymptotically matches the first, the second and the fourth moment structure of the true innovations $\varepsilon_t$. Matching also the fourth moment structure of $\varepsilon_t$ turns out to be important for some statistics $T_n$; we refer to Section 3.3 for examples.

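One possibility to achieve this requirement is to generate the $\varepsilon_t^*$’s as i.i.d. random variables with the following discrete distribution: $P(\varepsilon_t^*=\hat\sigma_n\sqrt{\tilde\kappa_4})=P(\varepsilon_t^*=-\hat\sigma_n\sqrt{\tilde\kappa_4})=1/(2\tilde\kappa_4)$ and $P(\varepsilon_t^*=0)=1-1/\tilde\kappa_4$. Here $\tilde\kappa_4=\hat\kappa_4/\hat\sigma_n^4$ and $\hat\kappa_4$ denotes a consistent estimator of the fourth moment $E\varepsilon_t^4$ of the innovations $\varepsilon_t$. Consistent, nonparametric estimators of $E\varepsilon_t^4$ have been proposed in Kreiss and Paparoditis (2012) and Fragkeskou and Paparoditis (2015).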
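One way to realize such a moment-matching discrete distribution is a three-point law on $\{-v,0,v\}$. The sketch below is an assumption-laden illustration: with `ratio` denoting the estimated fourth moment divided by the squared variance, it puts mass `1 / (2 * ratio)` on each of $\pm v$ with $v=\sqrt{\text{variance}\cdot\text{ratio}}$ and the remaining mass on zero. The function name and the numerical values are hypothetical.

```python
import math
import random


def three_point_innovations(n, sigma2, kappa4, rng=None):
    """Draw n i.i.d. values with mean 0, variance sigma2 and fourth moment kappa4."""
    rng = rng or random.Random()
    ratio = kappa4 / sigma2 ** 2
    if ratio < 1.0:
        # A fourth moment can never be smaller than the squared variance.
        raise ValueError("kappa4 must be at least sigma2 ** 2")
    value = math.sqrt(sigma2 * ratio)  # support point v
    p = 1.0 / (2.0 * ratio)            # probability of +v and of -v
    draws = []
    for _ in range(n):
        u = rng.random()
        draws.append(value if u < p else (-value if u < 2.0 * p else 0.0))
    return draws


# Hypothetical moment estimates.
sigma2_hat, kappa4_hat = 2.0, 12.0
eps_star = three_point_innovations(1000, sigma2_hat, kappa4_hat, random.Random(7))

# Exact moments of the three-point law, for comparison with the targets.
v = math.sqrt(sigma2_hat * (kappa4_hat / sigma2_hat ** 2))
p = 1.0 / (2.0 * (kappa4_hat / sigma2_hat ** 2))
second_moment = 2.0 * p * v ** 2
fourth_moment = 2.0 * p * v ** 4
```

The law is symmetric, so all odd moments vanish; `second_moment` and `fourth_moment` equal the two target values exactly, which is all that the moment-matching requirement asks for.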

In the above bootstrap algorithm, the pseudo time series is generated using the estimated coefficients of the moving average representation. Modifying the algorithm appropriately, the pseudo time series can be also generated using the estimated AR representation of the process. For this, we set and calculate the coefficients , using the recursive formula starting with and for . Using these estimates of the coefficients of the AR representation, the pseudo time series is then obtained as .

We stress here the fact that the SDDB should not be considered as an MA-sieve bootstrap procedure, where the order of the MA model is allowed to increase to infinity as the sample size increases to infinity. The SDDB procedure is rather governed by the SD estimator used, which appropriately describes the entire autocovariance structure of the underlying process. The MA representation in this bootstrap procedure is solely used as a device to generate a time series with a second-order structure characterized by the SD estimator used. Notice however, that some SD estimators can implicitly lead to an MA-sieve type bootstrap.

### 3.2 Comparison with other Linear Bootstrap Procedures

The idea of the AR-sieve bootstrap is to fit a -th order AR model to the time series at hand and to use the estimated model structure together with i.i.d. pseudo innovations generated according to the empirical distribution function of the centered residuals. In order to fully cover the second-order dependence structure of the underlying process , the order of the fitted AR-model is allowed to increase to infinity (at an appropriate rate) as the sample size increases to infinity; see Kreiss (1992), Paparoditis and Streitberg (1991), and Bühlmann (1997). The range of validity of this bootstrap procedure has been investigated in Kreiss et al. (2011). As already mentioned, the AR-sieve bootstrap is a special case of the SDDB described in Section 3.1 when is chosen to be a parametric AR SD estimator and the innovations are generated through i.i.d. resampling from the centered residuals of the AR fit. Using the estimated AR-parametric SD, the factorization algorithm leads to a sequence of estimated MA coefficients that correspond to the MA representation obtained by inverting the estimated AR polynomial. However, and as already mentioned, the SDDB is a much more general procedure since it is not restricted to describing the dependence structure of the time series at hand by means of a finite order parametric AR model. Notice that both bootstrap approaches work under similar conditions, see Assumptions and . However, if a lag window SD estimator is used, there are situations where the SDDB is valid, whereas validity of the AR-sieve is not clear; see Section 3.3 for details.

The linear process bootstrap, established by McMurry and Politis (2010), is also related to the SDDB. It uses the factorization of banded autocovariance matrices instead of the SD itself to generate the pseudo observations. A factorization of autocovariance matrices is similar to the innovation algorithm, see Brockwell and Davis (1991, Proposition 5.2.2). As pointed out at the end of Section 2.2, this leads in finite sample situations to different results. Furthermore, the linear process bootstrap aims to generate a data vector with a given covariance structure, while the SDDB generates a stationary time series. A more detailed discussion can be found in the Supplementary Material.

### 3.3 Bootstrap Validity

In this section we prove validity of the proposed SDDB procedure for the sample mean and under quite general dependence assumptions on the underlying process which go far beyond linearity. Furthermore, we show that if the underlying process is linear, the same bootstrap procedure driven by i.i.d. pseudo innovations is valid for the class of so-called generalized autocovariance statistics. We first focus on this general class of statistics.

###### Definition 3.1

Let be a sequence of real numbers such that , where . Let further be a differentiable function. Then, the generalized autocovariance statistic is defined as

$$\hat{T}_n = g(\hat{T}_{n,1}, \ldots, \hat{T}_{n,P}), \quad \text{where for } p \in \{1, \ldots, P\}, \tag{3.1}$$

and

The above class of statistics contains, among others, sample autocovariances, sample autocorrelations and lag window SD estimators, cf. the Supplementary Material for details.

Assumption 3:   is a linear process with i.i.d. innovations , where , , and . We write for short . The coefficients in the MA representation fulfill the summability condition .

As the following theorem shows, the proposed SDDB procedure is valid for approximating the distribution of statistics belonging to the class of generalized autocovariances. Here and in the sequel, for two random variables , and , denotes Mallows' distance, i.e., , where and denote the cumulative distribution functions of and , respectively.
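For two samples of equal size, this distance between the corresponding empirical distributions can be computed from the order statistics. The sketch below assumes the usual definition of Mallows' distance of order two as the L2 distance between the two quantile functions; the function name `mallows_d2` is hypothetical.

```python
import numpy as np

def mallows_d2(x, y):
    """Mallows (Wasserstein-2) distance between two empirical distributions.

    Assumes equal sample sizes, so that both empirical quantile functions are
    step functions on the same grid and the integral reduces to a mean over
    order statistics.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("samples must have the same length")
    return float(np.sqrt(np.mean((x - y) ** 2)))

# A pure location shift by 1 gives distance 1.
d = mallows_d2([0.0, 1.0, 2.0], [2.0, 3.0, 1.0])
print(d)
```

In a simulation, such a quantity could be used to compare bootstrap replicates of a statistic with Monte Carlo replicates of the same statistic.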

###### Theorem 3.1

Let where for , and is a sequence of real numbers as in Definition 3.1. Furthermore, is a pseudo time series generated using the SDDB procedure with a pseudo innovation process satisfying with , a consistent estimator of which also fulfills for some constant which does not depend on . Finally, assume that the estimated Wold coefficients fulfill and . Then under Assumption 3 and as ,

.

The assumptions and are of rather technical nature and can be satisfied by using appropriate estimators of and . If the SD estimator fulfills then the requirement of the above theorem is satisfied. Notice that sufficiently smooth kernels guarantee the required differentiability of . Furthermore, by using an appropriate truncation, boundedness of and can also be guaranteed.

In Section 2 we gave conditions under which holds, see Theorems 2.1 and 2.2. Moreover, there are settings in which it is not clear whether the AR-sieve bootstrap is valid, while the SDDB in connection with a lag window SD estimator can lead to a valid approximation. For instance, the SDDB remains valid for statistics as in when the time series is generated by a finite MA process with unit roots, for instance by the process , or even by nonlinear continuous transformations of -dependent stationary processes.

The following theorem establishes validity of the SDDB for the case of the sample mean, which is not covered by the class of generalized autocovariance statistics given in (3.1). Notice that for this case it suffices that the pseudo innovations asymptotically mimic correctly only the first and the second moment of the true innovations . Furthermore, no linearity assumptions on the underlying process are needed. What is needed is that converges to a normal distribution with variance , which, however, is fulfilled for a huge class of stationary processes. For instance, appropriate mixing or weak dependence conditions are sufficient for this statistic to satisfy the required asymptotic normality of . Furthermore, regarding the SDDB, the SD and its estimator need to fulfill less restrictive conditions. In particular, for a lag window SD estimator , the assumptions and , see Jentsch and Subba Rao (2015, Lemma A.2), suffice to ensure uniform consistency of .

###### Theorem 3.2

Assume that is a purely nondeterministic stationary process with mean , SD , and autocovariance with and assume that , as . Denote by a uniformly consistent and bounded estimator of fulfilling Assumptions and where does not depend on . Assume that is generated using the SDDB procedure with an i.i.d. innovation process , where , , and . Then, as , , in probability.

The assumption is satisfied if a strictly positive, differentiable, and bounded SD estimator is used.

Notice that validity of block bootstrap approaches is often established for so-called generalized mean statistics, see Künsch (1989, Example 2.2). For a time series , this class of statistics is given by , where and and is fixed. Let . The validity of the SDDB for this class can be derived by applying the results of Theorem 3.2. The stated cumulant and autocovariance conditions have to be fulfilled by the process .

###### Corollary 3.1

Let fulfill the assumptions of Theorem 3.2 and denote the mean by . Furthermore, assume that is differentiable at and is generated using the SDDB procedure with an i.i.d. innovation process , where , , and . Then, as , , in probability.

An improved finite sample performance of bootstrap approximations is often achieved by applying the bootstrap to studentized statistics, see for instance Lahiri (2003, Chapter 6), Götze and Künsch (1996), and Romano and Wolf (2006). A studentized form is obtained by normalizing the statistic of interest with a consistent estimator of the asymptotic standard deviation. Since in Theorem 3.2 the asymptotic variance is given by and this quantity can be consistently estimated, we get as a studentized statistic where is a consistent estimator of . A bootstrap approximation of this studentized statistic is then given by , where is the same SD estimator as obtained using the pseudo observations .

###### Corollary 3.2

Let and be consistent estimators of which are bounded from below by . Under the assumptions of Theorem 3.2 and if the SD estimator used for the SDDB is two times differentiable with a second derivative of bounded variation independent of , then, as , , in probability.
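To make the studentization of the sample mean concrete, the following sketch computes such a statistic from data. It rests on assumptions that are not taken from the text: the SD is normalized so that the asymptotic variance of the scaled sample mean equals 2π times the SD at frequency zero, and a Bartlett lag window with truncation point `M` is used as the SD estimator. All names are hypothetical.

```python
import numpy as np

def sample_autocov(x, h):
    """Sample autocovariance at lag h >= 0 (divisor n, mean-corrected)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    return float(np.dot(xc[: n - h], xc[h:]) / n)

def lag_window_sd_at_zero(x, M):
    """Bartlett lag window SD estimate at frequency zero.

    Assumed convention: f(0) = (1 / (2*pi)) * sum_h gamma(h).
    Weights are 1 - |h|/M for |h| < M, so M = 1 uses lag 0 only.
    """
    s = sample_autocov(x, 0)
    for h in range(1, M):
        s += 2.0 * (1.0 - h / M) * sample_autocov(x, h)
    return s / (2.0 * np.pi)

def studentized_mean(x, mu, M):
    """sqrt(n) * (sample mean - mu) / sqrt(2*pi*f_hat(0))."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    f0 = lag_window_sd_at_zero(x, M)
    return float(np.sqrt(n) * (x.mean() - mu) / np.sqrt(2.0 * np.pi * f0))

# Toy data; with M = 1 the statistic reduces to the classical i.i.d. form.
t_stat = studentized_mean([1.0, 2.0, 3.0, 4.0], mu=0.0, M=1)
print(t_stat)
```

For the bootstrap analogue, the same function would be applied to each pseudo time series, with `mu` replaced by the mean of the bootstrap process.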

The asymptotic variance of the generalized autocovariance statistic depends on the SD and it may also depend on the fourth moment of the underlying innovations of the linear process. This fourth moment can be estimated consistently, by say ; see Fragkeskou and Paparoditis (2015). Since the pseudo time series is driven by i.i.d. innovations, the fourth moment of can be estimated using the same estimator as for . Consequently, an asymptotically valid approximation of the SDDB for studentized generalized autocovariance statistics can be established. This is done in the following corollary, where, and in order to simplify notation, only the case is considered. In this case the statistic of interest is given by and the asymptotic variance by .

###### Corollary 3.3

Let and let be consistent SD estimators which are bounded from below by . Furthermore, let be consistent estimators of . Under the assumptions of Theorem 3.1 and if independent from then, as , , in probability.

The assumption ensures that converges to a non-degenerate distribution. It is fulfilled if or if is a non-constant function. The estimators and are estimators of based on and and and , respectively.

## 4 Numerical Examples

### 4.1 Simulations

In this section we investigate by means of simulations the finite sample behavior of the SDDB and compare its performance with that of two other linear bootstrap methods, the AR-sieve bootstrap and the linear process bootstrap. We also compare all three linear bootstrap methods with the tapered block bootstrap, cf. Paparoditis and Politis (2001), and the moving block bootstrap, cf. Künsch (1989). Two statistics are considered, the sample mean and the sample autocorrelation . The time series used have been generated from the following three models:

• Model I:   ,

• Model II:

• Model III:

In all cases the innovation process consists of i.i.d. random variables having a Student $t$-distribution with 3 degrees of freedom and variance normalized to . Model I is tailor-made for the AR-sieve bootstrap. The SD in Model II has a strong peak around frequency , which is difficult to estimate. Furthermore, this model possesses a slowly decaying autocovariance function which oscillates with two frequencies; one for the odd lags and one for the even lags. Model III is an MA process with a unit root; the SD is zero at frequency zero. Consequently, the sample mean converges to a degenerate distribution, making a studentization inappropriate. In order to investigate the finite sample performance of the different bootstrap methods, empirical coverage probabilities of two-sided confidence intervals obtained for the levels and are presented. The empirical coverage probabilities are based on realizations of each process and bootstrap repetitions. We present the results for the case , while results for the case are given in the Supplementary Material.
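The two generic ingredients of such a simulation, heavy-tailed innovations with a prescribed variance and the empirical coverage of a set of confidence intervals, can be sketched as follows. The target variance `target_var`, the seed, and the function names are hypothetical placeholders and are not the settings of the study.

```python
import numpy as np

def student_t_innovations(n, rng, df=3, target_var=1.0):
    """i.i.d. Student t innovations rescaled to a given variance.

    Uses Var(t_df) = df / (df - 2), which requires df > 2.
    """
    scale = np.sqrt(target_var * (df - 2) / df)
    return scale * rng.standard_t(df, size=n)

def empirical_coverage(lower, upper, truth):
    """Fraction of intervals [lower_i, upper_i] that contain the true value."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return float(np.mean((lower <= truth) & (truth <= upper)))

rng = np.random.default_rng(0)
eps = student_t_innovations(200, rng)

# Four toy intervals, two of which contain the true value 0.5.
cov = empirical_coverage([0.0, 1.0, -1.0, 2.0], [1.0, 2.0, 1.0, 3.0], truth=0.5)
print(len(eps), cov)
```

In a full study, one interval would be computed per simulated realization, and the resulting coverage would be compared with the nominal level.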

For the AR-sieve bootstrap, denoted by ARS, Akaike's information criterion (AIC) is used to select the AR order , cf. Akaike (1969). The SDDB is applied using an AR-pre-whitening, nonparametric estimator of the SD, where the order of the AR part has been selected by the AIC and a smoothed periodogram is used with a Gaussian kernel and a bandwidth selected by cross-validation; see Beltrão and Bloomfield (1987). For this bootstrap procedure, i.i.d. Gaussian innovations are used. The linear process bootstrap, denoted by LPB, has been implemented as in McMurry and Politis (2010), and the tapered block bootstrap, denoted by TBB, has been applied with a block length choice and a tapering window as in Paparoditis and Politis (2001). Due to the strong dependence of some of the models considered, this rule for choosing the block length leads to infeasible results, especially for small sample sizes. For instance, even for this rule delivers for Model II block lengths of around . For this reason, we also consider the moving block bootstrap with nonrandom block length given by . This procedure is denoted by BB.

As mentioned in Section 3.3, a better finite sample performance may be obtained by using bootstrap approximations of studentized statistics. Thus, we consider for the sample mean the statistic , where is the same SD estimator as the one used for SDDB. The sample autocorrelation is studentized as well, where the variance is estimated by Bartlett’s formula, Brockwell and Davis (1991, Theorem 7.2.1), based on the autocorrelation function corresponding to the estimated SD . Finally, a standard normal distribution is considered as a further competitor for the studentized statistics and is denoted in the following by ND. For non-studentized statistics a normal distribution is used with the variance estimated by using the SDDB procedure. Studentization brings clear improvements for all models and all statistics considered. Hence, the focus is on the studentized case and the non-studentized tables can be found in the Supplementary Material.
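The variance used to studentize a sample autocorrelation can be sketched from Bartlett's formula in the form of Brockwell and Davis (1991, Theorem 7.2.1). The sketch assumes that the autocorrelation function is supplied up to some lag and vanishes beyond it; in practice it would be derived from the estimated SD. The lag is kept as a parameter and the function name is hypothetical.

```python
import numpy as np

def bartlett_variance(rho, lag):
    """Asymptotic variance of sqrt(n) * (rho_hat(lag) - rho(lag)).

    Bartlett's formula:
        w = sum_{k>=1} [rho(k+lag) + rho(k-lag) - 2*rho(lag)*rho(k)]^2.
    rho[0] must equal 1; autocorrelations beyond len(rho) - 1 are treated as zero.
    """
    rho = np.asarray(rho, dtype=float)
    K = len(rho) - 1

    def r(k):
        k = abs(k)
        return rho[k] if k <= K else 0.0

    total = 0.0
    # Every term with k > K + lag is zero, so the sum is finite.
    for k in range(1, K + lag + 1):
        total += (r(k + lag) + r(k - lag) - 2.0 * r(lag) * r(k)) ** 2
    return total

w_white = bartlett_variance([1.0], lag=1)      # white noise
w_ma1 = bartlett_variance([1.0, 0.4], lag=1)   # MA(1)-type autocorrelation
print(w_white, w_ma1)
```

For white noise the value is 1, and for an autocorrelation function that vanishes after lag 1 it reduces to the familiar expression 1 - 3ρ(1)² + 4ρ(1)⁴.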

The coverage probabilities for the studentized sample mean are displayed in Table 1. As seen from Table 1, none of the competitors outperforms the SDDB procedure. In fact, in many cases the SDDB performs best. Finally, for Model III it seems that only the SDDB procedure gives reasonable estimates. Notice that the SD of Model III is not bounded away from zero, that is, it is not clear whether the LPB or the ARS are valid in this case. The coverage probabilities for the studentized sample autocorrelation are displayed in Table 2. For this statistic, the most accurate coverage probabilities overall are those obtained by using the ARS and the SDDB procedures.

Notice that block bootstrap methods have their strength in their general applicability, i.e., they are applicable not only to linear processes, like those considered in the simulation study, and to a broad class of statistics. Consequently, it is not surprising that these methods do not perform best for the linear processes considered.

Summarizing our numerical findings, it seems that the SDDB performs very well in all model situations and for both statistics considered. In combination with a flexible SD estimator, like for instance the pre-whitening based estimator used in the simulations, the SDDB seems to be a valuable tool for bootstrapping time series.