1 Introduction
Functional data analysis has attracted attention of researchers in the last few decades and many methods for structural break detection for functional data have been developed. Here, we propose a novel method of detecting and testing structural breaks in functional covariance. The motivation of this paper comes from a neuroscience experiment conducted by coauthor Frostig (see Wann, 2017) to investigate changes in the rat brain following an induced ischemic stroke. Local field potentials (LFP) are recorded from 32 implanted tetrodes on the rat brain cortex during the prestroke and poststroke phase (each phase consisted of five minutes of recording) which are segmented into onesecond epochs. Thus we have multivariate (32dimensional) functional curves for each epoch and a total of 300 epochs for the prestroke phase and also 300 for the poststroke phase. We expect structural change in each LFP tetrode following a stroke. Note, however, that one should not assume the structural change to take place immediately upon the clamping of the artery (simulating ischemic stroke) because there may be some time delay or lag till changes in the brain signals are manifested. Here, all trajectories have constant mean and thus fluctuate around
for both prestroke and poststroke and thus our goal here is to develop a method for detecting a changepoint method in the functional covariance. To the best of our knowledge, this is the first paper on changepoint detection for functional covariance.Existing work on functional structural break analysis mainly focus on mean function. As an example, consider a series of random functions recorded sequentially, the task is to detect the changepoint in the sequence, denoted
. Berkes et al. (2009) proposed to test the null hypothesis by checking the structural break of functional principal components, and Aue et al. (2009a) quantified the large sample behavior of the changepoint estimator. Aston et al. (2012a) extended their results to dependent functional data.
Torgovitski (2015)proposed changealigned principal components for such changepoint problems. Aue et al. (2014) proposed a method to check the changepoint of coefficient operators in potentially nonhomogeneous functional autoregressive model. Aue et al. (2018) proposed a fully functional detecting procedure without any dimension reduction. These methods, however, all focus only on the mean function. This limitation serves as a motivation to develop a changepoint detection method for functional covariance.
Meanwhile, there are existing methods related to changepoint detection of functional covariance. Aue et al. (2020) dealt with analyzing structural break of spectrum and trace of covariance operator. Aue et al. (2009b) studied the structural break detection problem for the covariance matrix of multivariate time series. They proposed to stack the lower triangular elements of covariance matrix and detect the structure break of the concatenated vectors. One limitation of this method is that it cannot be extended to functional data, since lower triangular elements are not defined for functional covariance. Aston et al. (2012b) and Gromenko et al. (2017) studied the structural break problem for twoway functions, specifically, spatialtemporal data and fMRI data, but they assume separability of the functional covariance of twoway functions. This is essentially rank1 approximation of the functional covariance, which could be overly restrictive for practical data analysis. Comparing with their work, we study the structural break for the complete functional covariance structure and no separable assumption is made, making it suitable for a broad range of cases.
There are other changepoint methods that can be applied to structural break detection in brain signals. Fryzlewicz & Rao (2014) proposed the “BASTA” method for detecting multiple change points in the structure of an autoregressive conditional heteroscedastic model. Kirch et al. (2015) used VAR model to detect changepoints in multivariate time series and applied the method to EEG sequences. Cho & Fryzlewicz (2015) proposed a sparsified binary segmentation method for the secondorder structure of a multivariate time series. Schröder & Ombao (2017) proposed a FreSpeD method to detect the changepoint in spectrum and coherence sequences of multivariate time series. Sundararajan & Pourahmadi (2018) proposed a nonparametric method to detect multiple changepoints in multivariate time series based on difference in the spectral density matrices. Other works on this topic include Jones et al. (1969), Hively et al. (2000) and Saaid et al. (2011). As discussed, our method can be used to detect the structural break in brain signals by checking the change point in the functional covariance. Comparing with other works, our procedure focuses on the “big picture” of brain signals. Specifically, we aim to find the changepoints in the sequences of functional epochs/trials instead of univariate brain signal recordings. Thus the procedure is more robust to chance variability and random errors, because it is nearly impossible that all observations in an epoch are contaminated by chance variability when there is no pronounced structural breaks, and the chance variability can be attenuated by functional smoothing techniques. Another advantage of our functional procedure lies in the intracurve information. As the new functional procedure checks the structural break of the entire functional covariance, thus intracurve information is incorporated, which can potentially reveal the structural break. This is discussed in more details in simulation studies. This paper provides a new perspective for changepoint problem of brain signal data.
The major contribution of this article is developing a fully functional procedure to detect the changepoint in functional covariance. We consider a general situation where functions can be weakly correlated (see Hörmann et al. (2010)), and the proposed procedure also naturally works for independent functions. Dimension reduction techniques, such as functional principal component analysis, are very popular in functional data analysis. However, they will automatically lead to a loss of information. This loss of information may not be crucial for functional reconstruction but could be critical for changepoint detection. Indeed, when the leading principal components are orthogonal to the discrepancy function (here, the difference of functional covariance), it is always better to employ fully functional procedure without dimension reduction regardless of the sample size, because the leading principal components cannot explain the discrepancy at all. Here, we also establish the asymptotic behavior of the estimated changepoint.
The rest of the article is organized as follows. In Section 2, we present some preliminaries of functional data, especially the spectral decomposition of twoway functions. In Section 3, we develop the changepoint model for functional covariance, along with the estimating, detection and testing procedure. We also derive the asymptotic properties of the proposed changepoint estimator. In Section 4, we display some simulation results. Section 5 shows the real data analysis on local field potentials, and Section 6 makes the conclusion. Proof of the theorems in the article can be found in the supplement material.
2 Spectral Decomposition of Twoway Functional Data
We consider a series of continuous random functions over , and each function is defined over a compact support . We denote the space composed of square integrable functions defined over as . The mean function of is assumed to be homogeneous across and is defined as follows
which we assume, without loss of generality, to be . For the moment, we assume the covariance operator , which is of major interest in this paper, is also homogeneous across , and is defined as
where is the covariance kernel, and the inner product is defined as
Notationally, define as the data analogue of the functional covariance, and denote to be the space of square integrable continuous functions defined over . For any , , we define the inner product of the twoway functions as
and norm as
where for any .
Similarly, we can define the mean function of as
and covariance operator and autocovariance operator as
with covariance and autocovariance kernel
Our procedure involves longrun functional covariance of , defined as the summation of all lagged functional covariances presented below
and it is evident that is a positive definite kernel in , and thus admits the following representation by Theorem 1.1 in Ferreira & Menegatto (2009),
(21) 
where the twoway eigenfunctions
form a series of orthonormal basis of, and the eigenvalues
account for the variability of the principal components . Equation (21) presents a generalized version of Mercer’s theorem.3 Main results
3.1 Detection and testing procedure
In the case of single changepoint, we assume the following changepoint model for the functional covariance
(31) 
and for simplicity, we assume there is no structural break in mean function. The interest here is to test if the functional covariance remains constant across functions , specifically, we want to test the hypothesis
Under the , there exists at least one and such that the equality is violated. Then we need to find the changepoint and divide the whole sequence into two continuous parts at the changepoint, and then we can assume a piecewise stationary model for , where the functional covariance is homogeneous in each part.
Under the null hypothesis, we can assume the following model for ,
We assume satisfy the following conditions.
Assumption 1 (A1).
are symmetric and positive definite kernels, and for some .
Assumption 2 (A2).
There is a measurable function , where is a measurable space, and innovations taking values in , so that under , and under ,
where are defined similarly with and is the space of square integrable positive definite symmetric kernels defined in .
Assumption 3 (A3).
There exists an dependent sequence , so that under ,
and under
where is an independent copy of , and for some .
Assumption (A2, A3) can be referred to approximable. The idea of this dependence measure was introduced in Wu (2005), and was extended to functional data in Hörmann & Kokoszka (2015). This assumption quantify the weak dependence of and covers most stationary functional time series model. The weak dependence quantified by approximable typically will not influence the consistency property of estimators. As a special case, if a sequence composes of independent functions, it automatically satisfies Assumption (A2, A3).
The current proposed method is developed for single changepoint problem, however, we can adapt the procedure to multiple changepoint problem, for example, by binary segmentation. However, we do not pursue this problem in the current paper. We now describe our proposed twostage procedure. In the first stage, we apply the detection procedure to find the most pronounced changepoint candidate, and then we apply the testing procedure to test the significance of the candidate. To proceed, we first introduce the following estimators of the functional covariance and for the parts and to be, respectively,
Under the null hypothesis, the difference should be relatively close to zero for all and . To reduce the impact of the edges (which have high level of uncertainty due to fewer number of observations), we incorporate a weight function to attenuate the endpoint effect. Thus, our proposed weighted difference sequence is
and large value of can be expected for some if structural break is present. The detection step is based on the following cumulative statistics (CUSUM)
where and is the location of in the rescaled interval
. Thus, our proposed test statistic is
To determine the changepoint candidate, we plot against and find the argument that maximises . To ensure uniqueness, we define the changepoint candidate as
The next step is to apply a formal test to classify the candidate changepoint as a changepoint or otherwise. The following theorems provides the asymptotic properties of the test statistics under
and .Theorem 1.
Under Assumptions (A1)–(A3) and ,
where are standard Brownian bridge defined on .
Theorem 2.
Under Assumption (A1)–(A3) and ,
Remark 1.
Theorem 1 validates the asymptotic distribution of the test statistics under
, and provides a threshold, for a given probability of Type I error, for rejecting
.Remark 2.
Remark 3.
Theorem 2 tells us the power of the test is asymptotically 1.
3.2 Estimation of ’s
One key step of the testing procedure is estimating the unknown eigenvalues of the longrun functional covariance . As the longrun covariance consists of infinite many lagged functional autocovariance, we consider the kernel estimator of , defined as
where is a symmetric weight function such that, , , , if , is continuous on , and is the bandwidth parameter that needs to be selected (see e.g. Rice & Shang (2017)). As a special case, when the functions are independent, , and . Under , we can estimate the functional covariance and autocovariance with the entire sequence as follows,
Ramsay (2004) described two computational methods for estimating eigenfunctions of twoway functional covariance. The first method is discretizing functional covariances. However, this method cannot be extended to fourway longrun functional covariance. We propose to represent by a series of common basis. Given being the common basis of , are then the common basis of . If a function is a symmetric function, we have for any pair of , therefore we can construct the following basis for twoway symmetric functions,
Suppose each demeaned function has the following basis approximation,
(31) 
where is selected such that the above dimensional approximation is close to the original functions, such that the integrated mean square error , such that, for some tolerance error of approximation , is the smallest dimension that satisfies that the integrated mean squared error of approximation is less than . Define , where , and . We represent the estimator of longrun functional covariance in the following matrix form
Now suppose that an eigenfunction has the following basis expansion
where , this yields
where
where is a matrix with elements . This equation holds for arbitrary and , thus we have
where . We can solve the eigenequation to obtain . We now summarize the implementation procedure in Algorithm 1,
3.3 Asymptotic properties of the estimated changepoint
We now develop the asymptotic properties of the estimated changepoint. Denote , where is fixed and unknown, we will discuss for both scaled () and unscaled changepoint () estimator under . First we shall show it is consistent in the scaled case, say, . For , we define the following function
where and are defined in Equation (31). Then we have the following theorem.
Theorem 3.
Under the change point model and Assumption (A2), if , for any , we have
Then we can get the consistency of as displayed in the following corollary.
Corollary 1.
Under the assumption in Theorem 3, if , we have .
This corollary can be easily obtained from Theorem 3. Equivalently, the estimated scaled changepoint is the maximizer of . It is evident that the unique maximizer of is , thus .
To discuss the asymptotic distribution of the estimated unscaled changepoint, we define
The difference between the estimated unscaled changepoint and the true unscaled changepoint asymptotically converges to the maximizer of the function in distribution, which is illustrated in Theorem 4.
Theorem 4.
Under Assumption (A1)–(A3), if , we have
Remark 4.
The asymptotic distribution of is influenced by two factors: 1). The discrepancy of covariance structure between the two parts, and 2). The variability of and the alignment between and . As a special case, if are orthogonal to , then the unscaled estimated changepoint is always consistent with the true one, in other words, .
4 Simulations
We simulated two groups of functions with the same sample size and different functional covariances to study the finite sample behaviors of the changepoint estimator. The two groups of functions were concatenated as a functional sequence with structural break in the midpoint. For simplicity, we simulated independent functions with mean zero. We selected the 2nd to the 9th Fourier basis over the unit interval , denoted by , to represent the functions. In other words, we simulated functions in frequency band (14 Hertz). The independent curves were then generated by
where
are independent normal random variable with standard deviation
and for group 1 and group 2 respectively. Denoting to be a dimensional row vector with all elements to be , we consider the following three different forms of standard deviation,Setting 1: , ;
Setting 2: , ;
Setting 3: , .
In the first two settings, the discrepancy between the functional covariances comes from the difference of spectral distribution, and group 1 contains functions that contain mostly low frequency oscillations while functions in group 2 are dominated by high frequencies. In Setting 3, the two groups have the same spectrum but different phase distribution. are random error functions satisfying
where are independent normal random variables with mean zero and standard deviation . We took into account the influence of random error on the detection performance by setting different values to , say, . A large value of indicates low signalnoise ratio. For each setting, we simulated or curves for each group and applied our method to find the changepoint. The simulation runs were repeated 500 times for different values of
, and we obtained the crossvalidation confidence interval of
. We used the R package “” to obtain numerical 95% quantile of the null distribution at different
. Figure 1 displays the power curves for different settings and sample sizes. Table 1 displays the 90% cross validation confidence interval, and Figure 2 displays the Boxplots offor different setting and variance parameter
. We can see that the method works equally well for these three settings with respect to the estimation variability.Setting  

300  (0.497,0.503)  (0.460,0.533)  (0.347,0.617)  (0.286,0.710)  
600  (0.500,0.502)  (0.482,0.513)  (0.428,0.570)  (0.330,0.628)  
300  (0.500,0.503)  (0.450,0.537)  (0.333,0.627)  (0.266,0.677)  
600  (0.500,0.502)  (0.485,0.518)  (0.408,0.570)  (0.380,0.643)  
300  (0.500,0.507)  (0.453,0.527)  (0.343,0.633)  (0.296,0.703)  
600  (0.500,0.503)  (0.487,0.515)  (0.422,0.563)  (0.345.0.639) 
We would like to stress one interesting point in Setting 3. Two important elements in frequency domain analysis are spectra and phase. When we want to test the structural break in brain signal recordings (i.e. EEG, LPF), we can check the spectrum function (see e.g. Schröder and Ombao (2017)), however, under Setting 3, the spectrum function is the same for the entire sequence, and consequently the spectrumbased detection method does not work, but as our functional procedure incorporates intracurve information, the structural break in phase can also be detected. This is one of the major advantages of our functional procedure.
5 Application on local field potentials
The new method was applied to local field potential (LPF) trajectories of rat brain activity, collected from the experiment of Frostig and Wann (Wann, 2017). Microtectrodes were inserted in 32 locations on the rat cortex. From these microtectrodes, LFPs were recorded at the rate of 1000 observations per second, and the observations collected in one second is considered as an epoch. The experiment last for 10min and a total of 600 epochs were recorded. Midway in this period (at epoch 300), stroke was mechanically induced on the rat by clamping the medial cerebral artery. As noted, following the stroke, there might be some delay before changes in brain electrophysiological signal taking place. Here we only considered the frequency band (14 Hertz), and smooth the trajectory of each epoch with the first 9 Fourier basis specified as follows
where . If other frequency bands are of interest, we can represent the trajectories with the Fourier basis in the corresponding frequency band.
To solve the multiple changepoint problem, we applied a binary segmentation procedure. We first considered the entire sequence, and found the most pronounced changepoint. Then we divided the sequence into two parts, and applied our method to detect the changepoint in each subsequence. We repeated the procedure until no more significant changepoints were found.
The changepoints were detected at epochs 371, 380, 423, 479, and 576. These changepoints were tested significant mainly due to the unusual epochs where the signals displayed extremely high variability. However, the changepoint candidate near the midpoint was not significant. The reason is that the variability of poststroke trajectories is not stationary, and the nonstationarity influences the significance of spectral structure discrepancy between the functional covariances of pre and poststroke trajectories. To remove this effect, we rescaled the trajectories of each epoch to norm one, and applied our procedure to the rescaled trajectories, displayed in Figure 3. The plot of is displayed in Figure 4.
After rescaling the trajectories of each epoch, the first changepoint candidate is near midpoint (at which is about 14 seconds after stroke), and was test significant at significance level . The changepoint candidate near the midpoint is the only one which was tested significant for rescaled functions. This indicates that the stroke applied on rat cortex caused significant structural break in the spectral structure of functional covariance, and also tells us if we are more interested in the spectral structure than the amplitude, we can work with the functional covariance of rescaled functions.
6 Conclusions
To conclude, we developed a fully functional procedure to identify the changepoint in functional covariance. Our proposed method does not require a preapplication of any dimension reduction technique. The procedure is well designed for both independent and correlated functions. The method can be useful when structural breaks are present in the second moment structure (see Jiao et al. (2020)). The scaled estimated changepoint is consistent and the two factors influencing the consistency of the estimated unscaled changepoint are 1) discrepancy of functional covariance, 2) variability level of random error and the alignment between and . Even though this paper focuses on single changepoint problem, the procedure can be easily extended to multiple changepoint problem by employing approximate detection procedure, such as window sliding, binary segmentation and bottom up segmentation (see e.g. Truong et al. (2020)).
An important application of our method is structural break detection in brain signals. Comparing with other existing methods on this topic, this functional approach has two main advantages. First, it is robust to chance variability since we propose to check the functional covariance of complete epoch trajectories. Additionally, the proposed method incorporates intracurve information, which is potentially informative of structural break in brain signals. Considering the curse of dimensionality, the methodology requires sample size to be large enough if we want to detect the structural breaks of the brain signals over a wide frequency band, and dimension reduction techniques will be considered in the future.
References

Aston (2012a)
Aston, John AD & Kirch, Claudia (2012a).
Detecting and estimating changes in dependent functional data.
Journal of Multivariate Analysis
109, 204–220.  Aston (2012b) Aston, John AD & Kirch, Claudia (2012b). Evaluating stationarity via changepoint alternatives with applications to fMRI data. The Annals of Applied Statistics 6, 1906–1948.
 Aue (2020) Aue, Alexander, Rice, Gregory & Sönmez, Ozan (2020). Structural break analysis for spectrum and trace of covariance operators. Environmetrics 31, e2617.
 Aue (2018) Aue, Alexander, Rice, Gregory & Sönmez, Ozan (2018). Detecting and dating structural breaks in functional data without dimension reduction. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 80, 509–529.
 Aue (2009a) Aue, Alexander, Gabrys, Robertas, Horváth, Lajos & Kokoszka, Piotr (2009). Estimation of a changepoint in the mean function of functional data. Journal of Multivariate Analysis 1043–1073.
 Aue (2014) Aue, Alexander, Hörmann, Siegfried, Horváth, Lajos & Huková, Marie (2009). Dependent functional linear models with applications to monitoring structural change. Statistica Sinica 100, 2254–2269.
 Aue (2009b) Aue, Alexander, Hörmann, Siegfried, Horváth, Lajos & Reimherr, Matthew (2009). Break detection in the covariance structure of multivariate time series models. The Annals of Statistics 37, 4046–4089.
 Berkes (2009) Berkes, István, Gabrys, Robertas, Horváth, Lajos & Kokoszka, Piotr (2009). Detecting changes in the mean of functional observations. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 71, 927–946.
 Berkes (2013) Berkes, István, Horváth, Lajos & Rice, Gregory (2009). Weak invariance principles for sums of dependent random functions. Stochastic Processes and their Applications 123, 385–403.
 Cho (2015) Cho, Haeran, & Fryzlewicz, Piotr. (2015) Multiplechangepoint detection for high dimensional time series via sparsified binary segmentation. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 77, 475507.
 Ferriera (2009) Ferreira, JC & Menegatto, VA (2009). Eigenvalues of integral operators defined by smooth positive definite kernels. Integral Equations and Operator Theory 64, 61–81.
 Fryzlewicz (2014) Fryzlewicz, Piotr & Rao, S. Subba (2014). Multiplechangepoint detection for autoregressive conditional heteroscedastic processes. ournal of the Royal Statistical Society: Series B (Statistical Methodology) 76, 903–924.
 Gromenko (2017) Gromenko, Oleksandr, Kokoszka, Piotr & Reimherr, Matthew (2017). Detection of change in the spatiotemporal mean function. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 79, 29–50.
 hively (2017) Hively, LM, Protopopescu, VA & Gailey, PC (2000). Timely detection of dynamical change in scalp EEG signals. Chaos: An Interdisciplinary Journal of Nonlinear Science 10, 864–875.
 Hormann (2010) Hörmann, Siegfried & Kokoszka, Piotr (2010). Weakly dependent functional data. The Annals of Statistics 38, 1845–1884.
 Jiao (2020) Jiao, Shuhao, Ron D. Frostig & Ombao, Hernando (2020). Classification of Functional Data by Detecting the Discrepancy of Second Moment Structure of Scaled functions. arXiv preprint arXiv:2004.00855
 jones (1969) Jones, Richard H, Crowell, David H & Kapuniai, Linda E (2015). A method for detecting change in a time series applied to newborn EEG. Electroencephalography and clinical neurophysiology 27, 436–440.
 kirch (2010) Kirch, Claudia, Muhsal, Birte & Ombao, Hernando (2015). Detection of changes in multivariate time series with application to EEG data. Journal of the American Statistical Association 110, 1197–1216.
 ramsay (2004) Ramsay, J. O. (2004). Functional data analysis. Encyclopedia of Statistical Sciences 4
 Rice (2017) Rice, Gregory & Shang, Hanlin (2017). A Plugin Bandwidth Selection Procedure for LongRun Covariance Estimation with Stationary Functional Time Series. Journal of time series analysis 38, 591–609.
 Ombao (2017) Schröder, Anna Louise & Ombao, Hernando (2019). FreSpeD: Frequencyspecific changepoint detection in epileptic seizure multichannel EEG data. Journal of the American Statistical Association 114, 115–128.

Saaid (2011)
Saaid, MF Mohamed, Abas, WAB Wan, Aroff, H, Mokhtar, N, Ramli, R & Ibrahim, Z (2011).
changepoint detection of EEG signals based on particle swarm optimization.
5th Kuala Lumpur International Conference on Biomedical Engineering 2011, 484–487, Springer.  Shorack (2009) Shorack, Galen R & Wellner, Jon A (2009). Empirical processes with applications to statistics. SIAM.
 raanju (2018) Sundararajan, Raanju R. & Pourahmadi, Mohsen (2018). Nonparametric change point detection in multivariate piecewise stationary time series. Journal of Nonparametric Statistics 30, 926–956.
 Torgovitski (2015) Torgovitski, Leonid (2015). Detecting changes in Hilbert space data based on” repeated” and changealigned principal components. arXiv preprint arXiv:1509.07409
 Truong (2020) Truong, Charles, Oudre, Laurent & Vayatis, Nicolas (2017). Selective review of offline changepoint detection methods. Signal Processing 167, 107299.

Wann (2009)
Wann, Ellen Genevieve (2017).
Largescale spatiotemporal neuronal activity dynamics predict cortical viability in a rodent model of ischemic stroke
. Ph.D. dissertation, UC Irvine.  Wu (2005) Wu, Weibiao (2005). Nonlinear system theory: Another look at dependence. Proceedings of the National Academy of Sciences 102, 14150–14154.
Appendix A Proof of Theorem 1 & 2
Under Assumption (A1–A3), we can approximate the weakly dependent process with dependent process and study the asymptotic distribution of that approximated dependent process. Then we need to show the asymptotic property as goes to infinity.
It is noted that does not change if the is replaced with . First we show that the result in Theorem 1 can be established for dependent random functions. To show the asymptotic result for dependent process, we first introduce the following lemma.
Lemma 1.
Let
Then, under Assumption (A1)–(A3), there exists a Gaussian process, , such that
and
The proof of Lemma 1 can follow the framework of Berkes et al. (2013). We will not discuss in details but will give a sketch of the proof.
First we need to show that the Lemma holds for a dependent process , which is constructed as in (A3). We can similarly define the longrun covariance of as follows
which is symmetric and positive definite kernel and thus admits the spectral decomposition
The series forms an sequence of orthonormal basis for , and thus for fixed and any ,
where . Next we aim to obtain the asymptotic property of
and it is equivalent to study the dimensional random process
(11) 
then by Lemma 2 in the following, we can show quantity 11 converge in distribution to
where are Wiener processes.
Lemma 2.
If (A1)–(A3), then the dimensional random process (11), as , converges to
proof of Lemma 2.
First we specify the covariance of the asymptotic distribution of (11). For ,
By central limit theorem for
dependent random variables, for each , we haveThus the lemma follows. ∎
Consequently, since is uniformly square integrable by (A1), we have
Then we need to show that the reminder is trivial asymptotically. Evidently,