# A new class of change point test statistics of Rényi type

A new class of change point test statistics is proposed that utilizes a weighting and trimming scheme for the cumulative sum (CUSUM) process inspired by Rényi (1953). A thorough asymptotic analysis and simulations both demonstrate that this new class of statistics possesses superior power compared to traditional change point statistics based on the CUSUM process when the change point is near the beginning or end of the sample. Generalizations of these "Rényi" statistics are also developed to test for changes in the parameters in linear and non-linear regression models, and in generalized method of moments estimation. In these contexts we applied the proposed statistics, as well as several others, to test for changes in the coefficients of Fama-French factor models. We observed that the Rényi statistic was the most effective in terms of retrospectively detecting change points that occur near the endpoints of the sample.

## 1. Introduction

We consider in this paper the development of a new class of change point test statistics that are useful in addressing the problem of retrospectively detecting change points in parameters of interest that might occur near the beginning or end of a sequence of observations. Such end-of-sample change points are usually difficult to detect with traditional procedures, like those based on measuring the fluctuations of the cumulative sum (CUSUM) process, since they rely on the existence of a proportional number of observations before or after the change point in order to maintain power. For an up-to-date survey on change point analysis in the context of time series data, we refer the reader to Aue and Horváth (2013).

Several authors have thus considered the problem of improving the power of change point tests in the case when the change might lie near the end points of the sample. The resulting methods typically resort to weighting the maximally selected CUSUM process or Wald F-type statistics, and this is sometimes combined with trimming the domain on which such processes are maximized. In the context of a single change in the mean of scalar time series, Andrews (1993) considers the maximum of the standardized CUSUM process on a trimmed domain, which corresponds to the maximally selected likelihood ratio test with independent and identically distributed normal observations. The standardized CUSUM process can also be maximized over all possible change points without trimming, and in this case a Darling–Erdős result is needed in order to establish the null asymptotics of the statistic. The details and history of these results are chronicled in Csörgő and Horváth (1993), and Csörgő and Horváth (1997). Darling–Erdős type statistics are considered in Ling (2007) and Hidalgo and Seo (2013) in order to test for changes in the model parameters in general time series models with weakly dependent errors. In related work, Andrews (2003) develops end-of-sample instability tests assuming that the location of the potential change point in the parameters is known a priori, and that the number of observations before or after the change point is presumed to be fixed in the asymptotics. Some of these results and practical considerations for end-of-sample change point detection with known and unknown potential change points are surveyed in Busetti (2015).

We propose here a new class of change point test statistics that utilize more heavily weighted and trimmed CUSUM processes than have been considered previously. The inspiration for this weighting/trimming scheme comes from some results of Rényi related to the uniform empirical process, which were further studied by Csörgő; see Rényi (1953) and Csörgő (1965). For this reason, we refer to the proposed statistics as “Rényi” statistics. We establish the null asymptotics of these Rényi statistics below, which utilize some recent results on the rate of convergence in the Gaussian approximation of partial sum processes for general weakly dependent time series. We further show that these statistics possess superior power when compared to traditional tests based on the CUSUM process when the change point is near the end points of the sample. These results may also be generalized to test for changes in model parameters in linear and non-linear regression using the least squares principle and generalized method of moments estimation. In a Monte Carlo simulation study, we observed that the proposed statistic outperformed traditional CUSUM based statistics in this setting. We also demonstrate the proposed method in an application to test for change points in the parameters of the Fama-French five factor model. In this case we observed that the proposed statistic is more efficient in retrospectively detecting evident change points when they are near the end points of the sample.

The paper is organized as follows. In Section 2, the Rényi statistic is introduced and we develop its asymptotic properties. This statistic is extended to test for changes in linear and nonlinear regression in Section 3. An asymptotic comparison between the proposed test and standard CUSUM based test statistics is given in Section 4. In Section 5 we present the results of a Monte Carlo simulation study in which we compare the proposed statistic to several competitors for end-of-sample changes, and we then present a data application in Section 6. Proofs of all results are given in an online supplement to this article, Horváth et al. (2018+). The code and data used to reproduce the simulation study and data analysis are available at .

## 2. Rényi change point statistics, and their null asymptotics

At first, we restrict our attention to the simple location model for scalar random variables

$$X_t = \mu_t + e_t, \qquad 1 \le t \le T, \tag{2.1}$$

where , and further consider the hypothesis test of versus the at most one change in the mean alternative, with some where is unknown. Multivariate generalizations of these results are presented in Section B of the supplement to this article, Horváth et al. (2018+), but here we focus on the univariate case in order to simplify the presentation. Typical test statistics for this hypothesis testing problem are based on functionals of the CUSUM process. For example,

$$A_T = \frac{1}{T^{1/2}} \max_{1 \le t \le T} \left| \sum_{s=1}^{t} X_s - \frac{t}{T} \sum_{s=1}^{T} X_s \right|,$$

which denotes the maximally selected CUSUM process, can be applied. In order to increase the power of when the change point might be close to 1 or , weighted CUSUM statistics may be used. Let , and define the weighted version of :

 AT(τ)=1T1/2max1≤t

(); we refer to Csörgő and Horváth (1993) and Csörgő and Horváth (1997) for a comprehensive study of such statistics. The maximally selected standardized CUSUM has received special attention in the literature due to its connection with maximally selected likelihood ratio tests assuming Gaussian observations. The statistic appears in Andrews (1993), where it is shown that even when the errors in model (2.1) are independent and identically distributed,

in probability under both

and .

Andrews (1993) introduced the statistic

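$$\bar{A}_T(1/2, t_T) = T^{-1/2} \max_{t_T \le t \le T - t_T} \left( \frac{t}{T} \left( \frac{T - t}{T} \right) \right)^{-1/2} \left| \sum_{s=1}^{t} X_s - \frac{t}{T} \sum_{s=1}^{T} X_s \right|,$$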

to overcome this problem, which depends on the choice of a trimming parameter , usually taken to satisfy,

$$t_T = \lfloor T \bar{\theta} \rfloor \quad \text{with some } 0 < \bar{\theta} < 1/2. \tag{2.2}$$

Under (2.2), the convergence in distribution of follows immediately from the weak convergence of when holds. Although diverges in probability, it does converge in distribution to an extreme value law after a suitable normalization, using the results of Darling and Erdős (1956); see Csörgő and Horváth (1993). We note that if , as , then obeys a Darling–Erdős law.

A common feature of all of the above statistics is that they may be written as the maximal weighted difference between the sample means of the first and the last observations. To illustrate, simple calculation shows that

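$$\frac{1}{T^{1/2}} \left( \sum_{s=1}^{t} X_s - \frac{t}{T} \sum_{s=1}^{T} X_s \right) = \frac{t(T - t)}{T^{3/2}} \left( \frac{1}{t} \sum_{s=1}^{t} X_s - \frac{1}{T - t} \sum_{s=t+1}^{T} X_s \right),$$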

and so the CUSUM used to define is the difference between the sample means of the first and the last observations that is down-weighted at the end points by a factor of . In order that change points at the beginning or end of the sample might be more apparent, one might instead consider directly the maximal difference between the sample means before and after each potential change point . The statistic based on maximizing this difference over all will evidently not have a limiting distribution without down-weighting or trimming the difference near the endpoints.
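The identity relating the CUSUM to the difference of sample means can be checked numerically. The following is a minimal sketch in Python (the paper's own computations used R); the function names and the simulated data are illustrative assumptions, not part of the original article.

```python
import random

def cusum_term(x, t):
    """Left-hand side: T^{-1/2} * (S_t - (t/T) * S_T)."""
    T = len(x)
    return (sum(x[:t]) - t / T * sum(x)) / T ** 0.5

def weighted_mean_difference(x, t):
    """Right-hand side: t(T-t)/T^{3/2} * (mean of first t - mean of last T-t)."""
    T = len(x)
    mean_before = sum(x[:t]) / t
    mean_after = sum(x[t:]) / (T - t)
    return t * (T - t) / T ** 1.5 * (mean_before - mean_after)

# Illustrative data only: 50 iid standard normal draws.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(50)]

# Largest discrepancy between the two sides over 1 <= t < T.
max_gap = max(
    abs(cusum_term(x, t) - weighted_mean_difference(x, t))
    for t in range(1, len(x))
)
print(max_gap)  # differs from zero only by floating-point rounding
```

The factor `t * (T - t) / T ** 1.5` is small near both ends of the sample, which is the down-weighting described above.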

We consider here the properties of

$$D_T = D_T(t_T) = \max_{t_T \le t \le T - t_T} \left| \frac{1}{t} \sum_{s=1}^{t} X_s - \frac{1}{T - t} \sum_{s=t+1}^{T} X_s \right|,$$

which we refer to as a Rényi statistic. The limit distribution of will evidently depend on the choice of the trimming parameter . For example, if (2.2) holds, then would converge in distribution to the maximum of a Gaussian process on a trimmed domain, and yield a similar test as in Andrews (1993). To increase the power of when a change in the mean might be near the endpoints of the sample, we assume instead that

###### Assumption 2.1.

In order to establish the limit distribution of under Assumption 2.1, we require a rate in the weak convergence of the partial sum process of the ’s, which we quantify with the following assumption.

###### Assumption 2.2.

There are independent Wiener processes and such that

$$\max_{1 \le x \le T/2} x^{-\kappa} \left| \sum_{s=1}^{\lfloor x \rfloor} e_s - \sigma W_{T,1}(x) \right| = O_P(1) \tag{2.3}$$

and

$$\max_{T/2 \le x \le T - 1} (T - x)^{-\kappa} \left| \sum_{s=\lfloor x \rfloor + 1}^{T} e_s - \sigma W_{T,2}(T - x) \right| = O_P(1) \tag{2.4}$$

with some and .

The constant

can be interpreted as the long run variance of the sequence

. Assumption 2.2 has been established for several broad classes of stationary sequences. In the case when is a stationary and strongly mixing sequence, Assumption 2.2 was shown to hold in Kuelbs and Philipp (1980) (see also Bradley (2007) and Davis et al. (1995)). For stationary Bernoulli shift sequences under mild weak dependence conditions, which cover most popular stationary time series models, including the ARMA and GARCH sequences, this result was shown in Ling (2007) and Aue et al. (2014), and in this case Berkes et al. (2014) provide optimal rates in (2.3). Hall and Heyde (1980) obtain (2.3) under various conditions for martingale difference sequences.

The limit distribution of will be expressed in terms of the random variable

$$\xi = \sup_{0 \le u \le 1} |W(u)|, \tag{2.5}$$

where denotes a Wiener process.

###### Theorem 2.1.

If and Assumptions 2.1 and 2.2 hold, then, as , we have

$$\frac{t_T^{1/2} D_T}{\sigma} \stackrel{D}{\to} \max(\xi_1, \xi_2),$$

where are independent and each have the same distribution as defined in (2.5).

The proof of Theorem 2.1 is provided in the online supplement to this article, Horváth et al. (2018+). This result suggests an asymptotic sized test of : given a consistent estimate of , , examples of which we develop below, we reject if exceeds the quantile of . Below, we compare this test based on to other typical change point statistics, but before doing so we develop Rényi type statistics in more general change point problems formulated for regression models. We note that one may also obtain similar results with defined with asymmetric trimming, i.e. to define
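Critical values for this test can be computed directly. The sketch below relies on two facts that are not spelled out in the text above: the classical series $P(\sup_{0 \le u \le 1} |W(u)| \le x) = \frac{4}{\pi} \sum_{k \ge 0} \frac{(-1)^k}{2k+1} \exp\left(-\frac{\pi^2 (2k+1)^2}{8x^2}\right)$, and the observation that the maximum of two independent copies has the squared distribution function. The function names, the truncation at 200 terms, and the bisection bracket $[0, 10]$ are illustrative assumptions.

```python
import math

def sup_abs_wiener_cdf(x, terms=200):
    """P(sup_{0<=u<=1} |W(u)| <= x), via the classical alternating series."""
    if x <= 0.0:
        return 0.0
    total = 0.0
    for k in range(terms):
        odd = 2 * k + 1
        total += (-1) ** k / odd * math.exp(-(math.pi * odd) ** 2 / (8.0 * x * x))
    return min(1.0, max(0.0, 4.0 / math.pi * total))

def renyi_limit_cdf(x):
    """CDF of max(xi_1, xi_2) for independent copies: the square of the CDF of xi."""
    return sup_abs_wiener_cdf(x) ** 2

def renyi_critical_value(alpha, lo=0.0, hi=10.0, iters=80):
    """(1 - alpha) quantile of max(xi_1, xi_2), located by bisection on [lo, hi]."""
    target = 1.0 - alpha
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if renyi_limit_cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

critical_5pct = renyi_critical_value(0.05)
```

With a consistent estimate of the long run variance in hand, the null hypothesis would be rejected at the 5% level when the normalized statistic exceeds `critical_5pct`.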

$$D_T^* = \max_{t_T \le t \le T - s_T} \left| \frac{1}{t} \sum_{s=1}^{t} X_s - \frac{1}{T - t} \sum_{s=t+1}^{T} X_s \right|,$$

with also satisfying 2.2. We develop this case in the supplement to this paper, Horváth et al. (2018+).
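Both the symmetric and the asymmetric versions of the statistic can be evaluated in linear time with prefix sums. The following is a minimal sketch; the function name `renyi_statistic` and its arguments `t_lower` and `t_upper` (standing in for the lower and upper trimming parameters) are hypothetical and are not taken from the authors' code.

```python
from itertools import accumulate

def renyi_statistic(x, t_lower, t_upper=None):
    """Return (D, t_hat): the maximal absolute difference between the mean of
    the first t observations and the mean of the last T - t observations over
    t_lower <= t <= T - t_upper, together with the first maximizing t."""
    T = len(x)
    if t_upper is None:
        t_upper = t_lower  # symmetric trimming
    if t_lower < 1 or t_upper < 1 or t_lower > T - t_upper:
        raise ValueError("trimming must satisfy 1 <= t_lower <= T - t_upper")
    prefix = list(accumulate(x))  # prefix[t - 1] = X_1 + ... + X_t
    total = prefix[-1]
    best, best_t = -1.0, t_lower
    for t in range(t_lower, T - t_upper + 1):
        diff = abs(prefix[t - 1] / t - (total - prefix[t - 1]) / (T - t))
        if diff > best:
            best, best_t = diff, t
    return best, best_t

# Illustrative data: a unit mean shift after the fifth of 100 observations.
d, t_hat = renyi_statistic([0.0] * 5 + [1.0] * 95, t_lower=3)
print(d, t_hat)
```

To carry out the test, the returned value would be multiplied by the square root of the trimming parameter and divided by an estimate of the long run standard deviation.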

## 3. Extension to change point tests in regression models

We consider in this section the details of applying Rényi statistics in two important cases: simple linear regression and nonlinear regression using both the least squares principle and generalized method of moments.

### 3.1. Linear regression

Consider the simple linear regression

$$X_t = \mathbf{x}_t^\top \boldsymbol{\beta}_t + e_t, \qquad 1 \le t \le T,$$

where and

. We wish to test the null hypothesis

$$H_0^{(1)}: \boldsymbol{\beta}_1 = \boldsymbol{\beta}_2 = \ldots = \boldsymbol{\beta}_T$$

against the alternative

 H(1)A:there is1
###### Assumption 3.1.

(i) The sequences and are independent.
(ii) The sequence is stationary, ergodic, and with some and , where denotes the th coordinate of .
(iii) The sequence satisfies and with some and .

We use the least squares estimator , where

$$\mathbf{Z}_T = [\mathbf{x}_1, \ldots, \mathbf{x}_T]^\top \in \mathbb{R}^{T \times d},$$

and . It follows from the ergodic theorem that

$$\frac{1}{T} \mathbf{Z}_T^\top \mathbf{Z}_T \to \mathbf{C}_z \quad \text{a.s.}$$

###### Assumption 3.2.

is a nonsingular matrix.

The residuals are defined by

$$\hat{e}_t = X_t - \mathbf{x}_t^\top \hat{\boldsymbol{\beta}}_T, \qquad 1 \le t \le T. \tag{3.1}$$

###### Theorem 3.1.

If , Assumptions 2.1, 2.2, 3.1 and 3.2 hold, then, as , we have

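$$\frac{t_T^{1/2}}{\sigma} \max_{t_T \le t \le T - t_T} \left| \frac{1}{t} \sum_{s=1}^{t} \hat{e}_s - \frac{1}{T - t} \sum_{s=t+1}^{T} \hat{e}_s \right| \stackrel{D}{\to} \max(\xi_1, \xi_2),$$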

where are independent and they have the same distribution as of (2.5).

The statistic in Theorem 3.1 depends on the unknown long run variance which can be estimated from the residuals , as discussed in Section 5 below.
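As an illustration of the residual-based statistic, the sketch below treats the special case of an intercept plus a single scalar regressor, for which the least squares fit has a closed form. This restriction, the function names, and the requirement that the long run standard deviation `sigma` be supplied by the caller are all simplifying assumptions made here for brevity.

```python
import math
from itertools import accumulate

def ols_residuals(y, z):
    """Residuals from least squares regression of y on an intercept and scalar z."""
    n = len(y)
    z_bar, y_bar = sum(z) / n, sum(y) / n
    szz = sum((zi - z_bar) ** 2 for zi in z)
    slope = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y)) / szz
    intercept = y_bar - slope * z_bar
    return [yi - intercept - slope * zi for yi, zi in zip(y, z)]

def renyi_from_residuals(resid, t_T, sigma):
    """t_T^{1/2} / sigma times the maximal absolute difference between the
    mean of the first t and the mean of the last T - t residuals,
    taken over t_T <= t <= T - t_T."""
    T = len(resid)
    prefix = list(accumulate(resid))
    total = prefix[-1]
    d = max(
        abs(prefix[t - 1] / t - (total - prefix[t - 1]) / (T - t))
        for t in range(t_T, T - t_T + 1)
    )
    return math.sqrt(t_T) * d / sigma

# Illustrative data: the intercept shifts by 3 during the first 10 observations.
z = [float(i % 7) for i in range(200)]
y = [1.0 + 2.0 * zi + (3.0 if i < 10 else 0.0) for i, zi in enumerate(z)]
stat = renyi_from_residuals(ols_residuals(y, z), t_T=5, sigma=1.0)
```

For a general design matrix the residuals would come from a full least squares solve, and `sigma` would be replaced by one of the estimators of Section 5.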

### 3.2. Nonlinear regression

We next consider the nonlinear regression model where the ’s are

–dimensional parameter vectors. Under the null hypothesis

and the alternative is formulated as there is such that The unknown common parameter vector under is denoted by . Using the least squares principle, the estimator for is the location of the minimizer of

$$L_T(\boldsymbol{\theta}) = \sum_{t=1}^{T} \left( X_t - h(\mathbf{x}_t, \boldsymbol{\theta}) \right)^2,$$

where the minimum is taken over the parameter space . We make the following mild assumptions:

###### Assumption 3.3.

The parameter space is a compact subset of , and is an interior point of .

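###### Assumption 3.4.

$$E \left\| \frac{\partial}{\partial \boldsymbol{\theta}} h(\mathbf{x}_0, \boldsymbol{\theta}_0) \right\|^2 < \infty, \quad \text{and} \quad E\left[ h(\mathbf{x}_0, \boldsymbol{\theta}_0) - h(\mathbf{x}_0, \boldsymbol{\theta}) \right]^2 > 0 \quad \text{if } \boldsymbol{\theta} \ne \boldsymbol{\theta}_0.$$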

The conditions formulated in Assumption 3.4 ensure that under the differences between the functionals based on the unobservable errors and the residuals

$$\tilde{e}_t = X_t - h(\mathbf{x}_t, \hat{\boldsymbol{\theta}}) \tag{3.2}$$

are asymptotically negligible in the sense that they do not affect the limit in Theorem 2.1 when is replaced by .

###### Theorem 3.2.

If , Assumptions 2.1, 2.2, 3.1, 3.3 and 3.4 hold, then, as , we have

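$$\frac{t_T^{1/2}}{\sigma} \max_{t_T \le t \le T - t_T} \left| \frac{1}{t} \sum_{s=1}^{t} \tilde{e}_s - \frac{1}{T - t} \sum_{s=t+1}^{T} \tilde{e}_s \right| \stackrel{D}{\to} \max(\xi_1, \xi_2),$$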

where are independent and they have the same distribution as of (2.5).

### 3.3. Generalized method of moments estimation

Our results may also be considered in the context of generalized method of moments (GMM) estimation. The basic notation that we use here is inspired by Chapter 21 of Zivot and Wang (2006). The GMM estimator is generally the solution of a moment equation satisfying

$$\sum_{t=1}^{T} g(\mathbf{x}_t, \hat{\theta}_T) = 0,$$

where contains both model and instrumental variables. Let , and we assume that the parameter , where is a compact subset of . One could more generally consider , , which we address in the supplement to this paper, Horváth et al. (2018+). Model stability in this case can be described as and a single change point at time is characterized by for some . Under , denotes the true value of the parameter. We require that Assumption 3.1 holds for the instrumental variables, which yields that is a stationary sequence. The following assumptions are standard in GMM estimation; see, for example, the conditions of Theorem 3.1 of Hansen (1982).

###### Assumption 3.5.

if and only if .

###### Assumption 3.6.

, , with , and is different from zero in a neighborhood of .

###### Assumption 3.7.

There are independent Wiener processes and such that

$$\max_{1 \le x \le T/2} x^{-\kappa} \left| \sum_{s=1}^{\lfloor x \rfloor} m_s(\theta_0) - \sigma W_{T,1}(x) \right| = O_P(1) \tag{3.3}$$

and

$$\max_{T/2 \le x \le T - 1} (T - x)^{-\kappa} \left| \sum_{s=\lfloor x \rfloor + 1}^{T} m_s(\theta_0) - \sigma W_{T,2}(T - x) \right| = O_P(1) \tag{3.4}$$

with some and .

Ling (2007) establishes general conditions under which Assumption 3.7 holds.

###### Theorem 3.3.

Suppose that , Assumptions 2.1 and 3.5–3.7 hold, and that satisfies Assumption 3.1(ii). Then as , we have

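$$\frac{t_T^{1/2}}{\sigma} \max_{t_T \le t \le T - t_T} \left| \frac{1}{t} \sum_{s=1}^{t} m_s(\hat{\theta}_T) - \frac{1}{T - t} \sum_{s=t+1}^{T} m_s(\hat{\theta}_T) \right| \stackrel{D}{\to} \max\{\xi_1, \xi_2\},$$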

where is defined in Assumption 3.7, and and are independent with the same distribution as of (2.5).

## 4. A comparison of the Rényi and maximally selected CUSUM statistics for end–of–sample changes

The primary advantage of the Rényi statistic over traditional test statistics is in its ability to detect change points that occur near the end points of the sample. We now study this advantage in detail by comparing the asymptotic properties of under the alternative of an end-of-sample change point to those of the maximally selected CUSUM statistic . A similar analysis could be conducted for the other CUSUM based statistics described above. If and Assumption 2.2 hold, then it is well known that

$$\frac{1}{\sigma} A_T \stackrel{D}{\to} \sup_{0 \le t \le 1} |B(t)|, \tag{4.1}$$

where denotes a standard Brownian bridge (see Aue and Horváth (2013)). Let denote the size of change. In what follows, we allow both and to depend on . The necessary and sufficient conditions for the asymptotic consistency of both and are as follows:

###### Theorem 4.1.

(i) If Assumption 2.2 is satisfied, then we have that

$$A_T \stackrel{P}{\to} \infty \tag{4.2}$$

if and only if Condition 4.1 holds.
(ii) If Assumptions 2.1 and 2.2 are satisfied, then we have that

$$t_T^{1/2} D_T \stackrel{P}{\to} \infty \tag{4.3}$$

if and only if Condition 4.2 holds.

###### Remark 4.1.

Consider the case when is constant. Writing , an early change point can be characterized by the assumption that (a late change, , can be investigated similarly). In this case, Condition 4.1 reduces to . This clearly does not hold if . However, choosing satisfying , Condition 4.2 will be satisfied, and such a change is asymptotically detected by .

###### Remark 4.2.

According to calculations arising in the proof of Theorem 4.1, one has that

$$A_T \approx T^{-1/2} t^* |\Delta| \quad \text{and} \quad t_T^{1/2} D_T \approx t_T^{1/2} |\Delta|, \quad \text{if } t_T \le t^* \le T - t_T.$$

Therefore, the power of asymptotically exceeds that of if .
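The two approximations in Remark 4.2 can be compared numerically. Read directly from the display, the Rényi approximation is the larger of the two whenever $T^{-1/2} t^* < t_T^{1/2}$, that is, whenever $t^* < (T t_T)^{1/2}$. The short sketch below evaluates both sides; the sample size, change location, change size, and trimming value are illustrative assumptions and are not taken from the paper.

```python
import math

def cusum_drift(T, t_star, delta):
    """Approximate size of A_T under a change of size delta at time t_star."""
    return t_star * abs(delta) / math.sqrt(T)

def renyi_drift(t_T, delta):
    """Approximate size of t_T^{1/2} D_T when t_T <= t_star <= T - t_T."""
    return math.sqrt(t_T) * abs(delta)

# Illustrative values only (not from the paper).
T, t_star, delta = 1000, 20, 1.0
t_T = 6  # hypothetical trimming parameter with t_T <= t_star

print(cusum_drift(T, t_star, delta))  # 20 / sqrt(1000)
print(renyi_drift(t_T, delta))        # sqrt(6)
print(t_star < math.sqrt(T * t_T))    # the comparison derived above
```

In this configuration the CUSUM approximation stays below one while the Rényi approximation exceeds two, which is the pattern the remark describes for early changes.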

Next we provide more detailed information on the local power functions of and in the case of an early change.

###### Theorem 4.2.

We assume that as , i.e. the change is early.
(i) If is a stationary sequence satisfying Assumption 2.2,

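$$E \left| \sum_{s=u}^{v} e_s \right|^{\bar{\nu}} \le C (v - u + 1)^{\bar{\nu}/2} \quad \text{with some } \bar{\nu} > 2 \text{ and } C \text{ for all } 1 \le u \le v \le T, \tag{4.4}$$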

and then we have that

$$\left( \frac{T}{t^*} \right)^{1/2} \left( A_T - T^{-3/2} t^* (T - t^*) |\Delta| \right) \stackrel{D}{\to} \sigma N,$$

where denotes a standard normal random variable.
(ii) If Assumptions 2.1 and 2.2, , and hold, then we have that

$$t_T^{1/2} \left( D_T - |\Delta| \right) \stackrel{D}{\to} \sigma \sup_{0 \le u \le 1} W(u),$$

where is a standard Wiener process.

Remarks 4.1, 4.2, and Theorem 4.2 suggest that should be chosen to be fairly small in order to detect early changes in contrast to the standard CUSUM based procedures. In the simulations and applications below we take , and also consider several other potential choices of .

### 4.1. Consistency of Rényi statistics in regression

We briefly discuss the consistency of our procedure based on the Rényi statistic constructed from the residuals and . The consistency of tests developed in the context of nonlinear regression can be discussed along similar lines as those based on Theorem 3.1, and so we focus just on this latter case.

On account of Assumption 3.1(iii), the ergodic theorem implies that

$$\lim_{t \to \infty} \frac{1}{t} \sum_{s=1}^{t} \mathbf{x}_s = \bar{\mathbf{x}}_0 \quad \text{a.s.}$$

with some . Let and denote the regressors before and after the change point , respectively. We allow that both and might depend on and is included under the alternative. We need some minor conditions on the size and the location of the change:

(i) .  (ii)

###### Theorem 4.3.

If , Assumptions 2.1, 2.2, and 3.1-4.1 hold, then, as , we have

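$$t_T^{1/2} \max_{t_T \le t \le T - t_T} \left| \frac{1}{t} \sum_{s=1}^{t} \hat{e}_s - \frac{1}{T - t} \sum_{s=t+1}^{T} \hat{e}_s \right| \stackrel{P}{\to} \infty.$$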

## 5. Implementation details and a simulation study

Below we provide specifications for practically implementing the proposed tests, and study these in a simulation study. The code and data used to reproduce the simulation study and data analysis below are available at .

### 5.1. Estimating the long run variance

Normalizing so that it has a pivotal limit distribution requires the estimation of . Therefore in practice we use the statistics

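$$\hat{G}_T = t_T^{1/2} \max_{t_T \le t \le T - t_T} \frac{1}{\hat{\sigma}_{T,t}} \left| \frac{1}{t} \sum_{s=1}^{t} X_s - \frac{1}{T - t} \sum_{s=t+1}^{T} X_s \right|,$$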

where are estimators for assuming the observations might contain a mean change in order to preserve monotonic power; see Perron and Vogelsang (1992) and Vogelsang (1999). So long as the sequence of estimators satisfies

$$\max_{t_T \le t \le T - t_T} |\hat{\sigma}_{T,t} - \sigma| = o_P(1) \quad \text{under } H_0 \tag{5.1}$$

and

$$|\hat{\sigma}_{T,t^*} - \sigma| = o_P(1) \quad \text{under } H_A, \tag{5.2}$$

it follows from Theorem 2.1 that under and the assumptions of Theorem 2.1,

$$\hat{G}_T \stackrel{D}{\to} \max(\xi_1, \xi_2),$$

where and are independent and distributed as of (2.5). Therefore, we aim to construct estimators that satisfy (5.1) and (5.2). Also, under the conditions of Theorem 4.1(ii) we have under the alternative that

$$\hat{G}_T \stackrel{P}{\to} \infty.$$

In case of observations that are presumed to be uncorrelated, we use the estimators

$$\hat{\sigma}^2_{T,t} = \frac{1}{T} \left( \sum_{s=1}^{t} (X_s - \bar{X}_t)^2 + \sum_{s=t+1}^{T} (X_s - \tilde{X}_{T-t})^2 \right), \tag{5.3}$$

where

$$\bar{X}_t = \frac{1}{t} \sum_{s=1}^{t} X_s \quad \text{and} \quad \tilde{X}_{T-t} = \frac{1}{T - t} \sum_{s=t+1}^{T} X_s.$$

###### Theorem 5.1.

We assume that is a stationary and ergodic sequence with and . If Assumption 2.2 holds, then the estimator defined by (5.3) satisfies (5.1) and (5.2).
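The estimator (5.3) and the resulting statistic $\hat{G}_T$ can be computed for every candidate $t$ from running sums of the observations and of their squares. The following is a minimal sketch for the uncorrelated case; the function names are hypothetical, and skipping split points with a zero variance estimate is a choice made here rather than something the paper specifies.

```python
import math
from itertools import accumulate

def split_variance(x):
    """Return v with v[t - 1] equal to the estimator (5.3) at split t, t = 1..T-1."""
    T = len(x)
    s1 = list(accumulate(x))                    # running sums of X_s
    s2 = list(accumulate(xi * xi for xi in x))  # running sums of X_s^2
    out = []
    for t in range(1, T):
        a = s1[t - 1]        # sum of the first t observations
        b = s1[-1] - a       # sum of the last T - t observations
        ss_before = s2[t - 1] - a * a / t
        ss_after = (s2[-1] - s2[t - 1]) - b * b / (T - t)
        out.append(max(0.0, ss_before + ss_after) / T)
    return out

def g_hat(x, t_T):
    """Studentized Renyi statistic: t_T^{1/2} times the maximum over
    t_T <= t <= T - t_T of |mean difference| / sigma_hat_{T,t}."""
    T = len(x)
    var = split_variance(x)
    s1 = list(accumulate(x))
    best = 0.0
    for t in range(t_T, T - t_T + 1):
        if var[t - 1] <= 0.0:
            continue  # degenerate split: skipped in this sketch
        diff = abs(s1[t - 1] / t - (s1[-1] - s1[t - 1]) / (T - t))
        best = max(best, diff / math.sqrt(var[t - 1]))
    return math.sqrt(t_T) * best
```

Because each segment is centered at its own mean, a mean shift located exactly at the split point leaves the variance estimate unchanged, which is the property that condition (5.2) formalizes.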

When the model errors are presumed to be serially correlated, we use a kernel-smoothed estimate of the spectral density at frequency zero to estimate ; see, for example, Priestley (1981). The standard assumptions on the kernel and the window (smoothing parameter) will be used here:

###### Assumption 5.1.

and is Lipschitz continuous on the real line.

###### Assumption 5.2.

and where is given in (4.4).

For any and we define

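$$X_{s,t} = \begin{cases} X_s - \bar{X}_t, & \text{if } 1 \le s \le t, \\ X_s - \tilde{X}_{T-t}, & \text{if } t < s \le T. \end{cases}$$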

For every fixed we define the long run variance estimator

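$$\hat{\sigma}^2_{T,t} = \hat{\gamma}_{0,t} + 2 \sum_{\ell=1}^{T-1} K\left( \frac{\ell}{h} \right) \hat{\gamma}_{\ell,t} \quad \text{with} \quad \hat{\gamma}_{\ell,t} = \frac{1}{T - \ell} \sum_{s=1}^{T-\ell} X_{s,t} X_{s+\ell,t}. \tag{5.4}$$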

Our estimator accounts for a possible change in the mean at an unknown time. The classical estimator for is similarly defined, one needs to replace and with in definitions of the empirical covariance of lag . If satisfies a Bernoulli shift assumption as considered in Aue et al. (2014), then defined by (5.4) satisfies (5.1) and (5.2).
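A direct implementation of (5.4) might look as follows. This is a sketch only: it uses the Bartlett kernel $K(u) = \max(0, 1 - |u|)$ and takes the bandwidth `h` as a user-supplied argument, whereas the simulations reported below select the bandwidth by the method of Andrews (1991), which is not reproduced here. The function names are hypothetical.

```python
def bartlett(u):
    """Bartlett kernel: 1 - |u| on [-1, 1] and zero elsewhere."""
    return max(0.0, 1.0 - abs(u))

def split_demeaned(x, t):
    """X_{s,t}: center the first t observations at their own mean and the
    remaining T - t observations at theirs."""
    before, after = x[:t], x[t:]
    m1 = sum(before) / len(before)
    m2 = sum(after) / len(after)
    return [v - m1 for v in before] + [v - m2 for v in after]

def long_run_variance(x, t, h, kernel=bartlett):
    """Kernel estimator (5.4) at split point t (1 <= t <= T - 1) with bandwidth h."""
    T = len(x)
    y = split_demeaned(x, t)

    def gamma(lag):
        # Empirical autocovariance at this lag, normalized by T - lag as in (5.4).
        return sum(y[s] * y[s + lag] for s in range(T - lag)) / (T - lag)

    total = gamma(0)
    for lag in range(1, T):
        w = kernel(lag / h)
        if w == 0.0:
            continue  # lags with zero kernel weight contribute nothing
        total += 2.0 * w * gamma(lag)
    return total
```

With the Bartlett kernel only lags below `h` receive positive weight, so the cost per split point is of order `T * h`; replacing the split centering with the overall sample mean gives the classical estimator mentioned above.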

### 5.2. Finite sample properties of Theorem 2.1

We now present the results of a simulation study that aimed to investigate how the limit result in Theorem 2.1 manifests under in finite samples for several different data generating processes (DGP’s). All simulations were performed using the R statistical computing language. We considered two different DGP’s under : 1) are iid normal random variables, 2) are iid taking values and with probability . For each of these DGP’s, we generated 100,000 simulated samples of lengths and , and calculated . We note that we did not estimate in this first study, but rather used the known value for each DGP. We considered three choices of : , , and . We only report the results for and , taking performed similarly. Density estimates based on the standard normal kernel with bandwidths computed from the data were compared to the density of the limit in Theorem 2.1. The distribution of can be found using Monte Carlo simulation, or using the formulas in Csörgő and Révész (1981, page 43). We employ the latter method below.

Figure 5.1 shows that the most relevant factor determining the speed of convergence in Theorem 2.1 for each DGP was the choice of . Large choices ) performed the best. But, even in the case when , the convergence is quite fast, and importantly the difference between the sample and theoretical quantiles at the 0.9, 0.95, and 0.99 levels was small for each value of and DGP considered.
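A small-scale version of this experiment can be sketched as follows, for the iid standard normal case with the long run variance fixed at its known value of one. The study above was carried out in R with 100,000 replications; the Python code, the function names, the default seed, and the particular sample size and trimming value in the usage lines are illustrative assumptions. Critical values come from the series representation of the distribution of $\sup_{0 \le u \le 1} |W(u)|$.

```python
import math
import random
from itertools import accumulate

def sup_abs_wiener_cdf(x, terms=200):
    """P(sup_{0<=u<=1} |W(u)| <= x) via the classical alternating series."""
    if x <= 0.0:
        return 0.0
    s = sum((-1) ** k / (2 * k + 1)
            * math.exp(-((2 * k + 1) * math.pi) ** 2 / (8.0 * x * x))
            for k in range(terms))
    return 4.0 / math.pi * s

def critical_value(alpha):
    """(1 - alpha) quantile of max(xi_1, xi_2), by bisection on [0, 10]."""
    lo, hi = 0.0, 10.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if sup_abs_wiener_cdf(mid) ** 2 < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return hi

def normalized_renyi(x, t_T, sigma):
    """t_T^{1/2} D_T / sigma with symmetric trimming t_T."""
    T = len(x)
    prefix = list(accumulate(x))
    total = prefix[-1]
    d = max(abs(prefix[t - 1] / t - (total - prefix[t - 1]) / (T - t))
            for t in range(t_T, T - t_T + 1))
    return math.sqrt(t_T) * d / sigma

def empirical_size(T, t_T, reps, alpha=0.05, seed=1):
    """Fraction of iid N(0, 1) samples for which the test rejects."""
    rng = random.Random(seed)
    crit = critical_value(alpha)
    rejections = sum(
        normalized_renyi([rng.gauss(0.0, 1.0) for _ in range(T)], t_T, 1.0) > crit
        for _ in range(reps)
    )
    return rejections / reps

size = empirical_size(T=200, t_T=14, reps=300)
print(size)
```

Increasing `reps` and varying `t_T` would give a rough picture of how quickly the empirical rejection rate approaches the nominal level for different trimming choices.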

### 5.3. Comparison of change point tests for end–of–sample changes

We further compared the Rényi statistic in the case of changes occurring early in the sample to the standard CUSUM statistic , the Darling-Erdős statistic

$$E_T = \left( 2 \log \log \left( T / (\log T)^{3/2} \right) \right)^{1/2} A_T(1/2) - M_T,$$

where and to the Lagrange multiplier statistic of equation (8) of Hidalgo and Seo (2013). The limiting distributions of and are reviewed in Aue and Horváth (2013). For the sake of brevity and in order to ease comparisons, we did not include the statistics of Andrews (1993) and Ling (2007), due to their relative similarity in performance to the Darling-Erdős statistic.

In contrast to the previous subsection, in these simulations we replace the theoretical with the corresponding long-run variance estimators defined in (5.4) with the Bartlett kernel and bandwidth defined in Andrews (1991). For each statistic, we generated sequences according to (2.1) under where the error sequence satisfied either 1) the ’s are iid standard normal, 2) follows the GARCH(1,1) model and where are iid standard normal random variables, 3) the ’s follow an AR(1) model with parameter , , or 4) follow an ARMA(2,2) model, . We consider the ARMA(2,2) process in order to evaluate the performance of the endogenous bandwidth selection of Andrews (1991) under model misidentification. We allowed to range between and at increments of , and considered values of and , which each model an early change point. We considered sample sizes of and . For each simulated sequence parameterized by and , we calculated the statistics , and suitably normalized by the variance estimators detailed in Section 5.1. The greatest contrast is seen for , and we present the results for only this case in order to conserve space.

For each value of , we calculated the percentage of test statistics that exceeded the critical value of their limiting distributions. The results are reported in Figures 5.2 and 5.3. With regards to the Rényi statistic we report results when the trimming parameter is . Results for other choices of trimming parameter for were similar to the logarithmic case, with trimming parameters closer to the location of the change point (for a fixed ) tending to perform better. We summarize these results as follows:
(i) In the presence of moderate serial correlation and with a small sample size, the statistics utilizing normalization by the long run variance estimator tended to be oversized, as seen in Figure 5.3. This did improve with increasing sample size.
(ii) We observed that the test based on held its size well, even when using the kernel based long run variance estimator. This is especially true relative to the other CUSUM based tests, which typically had inflated size.
(iii) When the change was very early, i.e. when , the statistic exhibited higher power than either the standard CUSUM, Darling-Erdős, or Hidalgo and Seo (2013) statistics.
(iv) The standard CUSUM based statistic displayed low empirical power for such end–of–sample changes. Since the proportion of the observations before the change was decreasing to zero, the power of the CUSUM and the Darling–Erdős tests decreased or did not change as the sample size increased. The power of the Rényi statistic was still increasing with the sample size, confirming the conclusion of Remark 4.2.
(v) Among the competing statistics, the Rényi statistic generally showed the best performance for end of sample change detection, followed closely by the statistic of Hidalgo and Seo (2013).

## 6. Data Application: Application to change point testing in Fama-French factor models

In this section we illustrate the Rényi-type test’s ability to detect end-of-sample structural changes when compared to the CUSUM and Darling-Erdős statistics, with an application to detecting structural change in the Fama-French five factor model coefficients fitted for a portfolio of banking sector stocks. In particular, we compare each statistic’s ability to detect structural changes during the 2008 financial crisis.

The Fama-French five-factor model (Fama and French (2015)) is a linear regression model intended to describe the expected return of a security or portfolio of financial assets. The model takes the following form:

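$$R_t - R_{Ft} = \alpha + \beta_{M} (R_{Mt} - R_{Ft}) + \beta_{SMB}\, SMB_t + \beta_{HML}\, HML_t + \beta_{RMW}\, RMW_t + \beta_{CMA}\, CMA_t + \epsilon_t. \tag{6.1}$$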

In (6.1), is the return of the security or portfolio at time ; is the risk-free rate of return; is the market return; is the return on a diversified portfolio of small stocks minus the return on a diversified portfolio of big stocks; is the return of a portfolio of stocks with a high book-to-market (B/M) ratio minus the return of a portfolio of stocks with a low B/M ratio; is the return of a portfolio of stocks with robust profitability minus the return of a portfolio of stocks with weak profitability; and finally is the return of a portfolio of stocks with conservative investment minus the return of a portfolio of stocks with aggressive investment.

Kenneth French published data for estimating these coefficients on his website, along with portfolios he constructed. Among the portfolios are 49 industry portfolios constructed by assigning each stock on the NASDAQ, NYSE, and AMEX exchanges to an industry portfolio based on the company’s four-digit SIC code. One of those portfolios represents the banking industry. Both value-weighted and equal-weighted returns are provided, and here we use the value-weighted returns. More details about the data can be found on Kenneth French’s website (French (2017)).

We estimated the coefficients of (6.1) for the banking portfolio using OLS on an expanding window of data with the starting point fixed at January 4th, 2005, and the end point varying between the trading days from August 1st to October 31st, 2008. The sample sizes then ranged from 901 days in the initial sample to 965 days for the final sample. Notably this period includes September 15th of that year on which day Lehman Brothers filed for bankruptcy. For each of these datasets we estimated the coefficients of (6.1) and performed the CUSUM, Darling-Erdős, and Rényi-type tests (with trimming parameter for the latter) on the residuals. We did this both with and without using the HAC variance estimation described in (5.4) with the Bartlett kernel and bandwidth selected using the method of Andrews (1991). We report results for the former case, since the difference was negligible. We used R and C++ for computations, including functionality provided by the Rcpp and cointReg packages; see Eddelbuettel (2013) and Aschersleben and Wagner (2016).

A visual inspection of the residuals (not shown) suggests they are heteroskedastic. We performed Ljung-Box and Box-Pierce tests on the squared residuals to test for this type of dependence for one lag, with both tests rejecting the null hypothesis of independence ().

Figure 6.1 shows the resulting p-value from each test indexed by the last date in the sample. We also compared these to the end-of-sample change point test of Andrews (2003), which assumes that the potential change point is specified by the user. As the potential change point location for the method of Andrews (2003) we use the date that Lehman Brothers filed for bankruptcy, and as such the p-values for this test are only available for samples ending after this date. From this figure we see that the Rényi-type test was able to retrospectively detect a structural change in the coefficients of (6.1) with only half a month of data beyond September 15, 2008, which was roughly half a month before either the CUSUM or Darling-Erdős tests detected changes. Interestingly, the method of Andrews (2003) was not able to detect a change in this period, which we believe can be attributed to the fact that its statistic is not as strongly weighted as, for example, the Rényi-type statistic.

## References

• [1] Andrews, D.W.K.: Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation. Econometrica 59(1991), 817–858.
• [2] Andrews, D.W.K.: Tests for parameter instability and structural change with unknown change point. Econometrica 61(1993), 821–856.
• [3] Andrews, D.W.K.: End-of-sample instability tests. Econometrica 71(2003), 1661–1694.
• [4] Aschersleben, P., and Wagner, M.: Parameter Estimation and Inference in a Cointegrating Regression, R package version 0.2.0, (2016), url: https://CRAN.R-project.org/package=cointReg
• [5] Aue, A., Hörmann, S., Horváth, L. and Hušková, M.: Dependent functional linear models with applications to monitoring structural change. Statistica Sinica 24(2014), 1043–1073.
• [6] Aue, A. and Horváth, L.: Structural breaks in time series. Journal of Time Series Analysis 34(2013), 1–16.
• [7] Billingsley, P.: Convergence of Probability Measures. Wiley, New York, 1968.
• [8] Berkes, I., Liu, W. and Wu, W.B.: Komlós–Major–Tusnády approximation under dependence. Annals of Probability 42(2014), 794–817.
• [9] Bradley, R. C.: Introduction to Strong Mixing Conditions. Kendrick Press, Heber City, UT, 2007.
• [10] Bureau of Labor Statistics, United States Department of Labor.: Major Sector Productivity and Costs. Series IDs: PRS84006092, PRS84006093, PRS84006152, PRS84006153. url: http://www.bls.gov/
• [11] Busetti, F.: On detecting end-of-sample instabilities, in: S.J. Koopman and N. Shephard eds, Unobserved Components and Time Series Econometrics, Oxford University Press, 2015.
• [12] Csörgő, M.: Some Rényi type limit theorems for empirical distribution functions. Annals of Mathematical Statistics 36(1965), 322–326.
• [13] Csörgő, M. and Horváth, L.: Rényi–type empirical processes. Journal of Multivariate Analysis 41(1992), 338–358.
• [14] Csörgő, M. and Horváth, L.: Weighted Approximations in Probability and Statistics. Wiley, 1993.
• [15] Csörgő, M. and Horváth, L.: Limit Theorems in Change–point Analysis. Wiley, 1997.
• [16] Csörgő, M. and Révész, P.: Strong Approximations in Probability and Statistics. Academic Press, New York, 1981.
• [17] Darling, D.A. and Erdős, P.: A limit theorem for the maximum of normalized sums of independent random variables. Duke Mathematical Journal 23(1956), 143–155.
• [18] Davis, R.A., Huang, D. and Yao, Y.-C.: Testing for a change in the parameter values and order of an autoregressive model. Annals of Statistics 23(1995), 282–304.
• [19] Eddelbuettel, D.: Seamless R and C++ Integration with Rcpp, Springer, New York, 2013.
• [20] Fama, E., and French, K.: A five-factor asset pricing model, Journal of Financial Economics, 116, (2015), 1–22.
• [21] French, K.: Data library, accessed December 28, 2017,
• [22] Górecki, T., Horváth, L. and Kokoszka, P.: Change point detection in heteroscedastic time series. Econometrics and Statistics, in press, (2018).
• [23] Hall, P. and Heyde, C.C.: Martingale Limit Theory and Its Application. Academic Press, 1980.
• [24] Hansen, L.P.: Large sample properties of generalized method of moments estimators, Econometrica, 50, (1982), 1029–1054.
• [25] Hidalgo, J. and Seo, M.H.: Testing for structural stability in the whole sample, Journal of Econometrics, 175, (2013), 84–93.
• [26] Horváth, L., Miller, C. and Rice, G.: Supplementary material for “A new class of change point test statistics of Rényi type”, (2018+)
• [27] Kuelbs, J. and Philipp, W.: Almost sure invariance principles for partial sums of mixing B–valued random variables. Annals of Probability 8(1980) 1003–1036.
• [28] Ling, S.: Testing for change points in time series models and limiting theorems for NED sequences. Annals of Statistics 35(2007), 1213–1237.
• [29] Liu, W. and Wu, W.B.: Asymptotics of spectral density estimates. Econometric Theory 26(2010), 1218–1245.
• [30] Móricz, F., Serfling, R.J. and Stout, W.F.: Moment and probability bounds with quasi–superadditive structure for the maximum partial sum. Annals of Probability 10(1982), 1032–1040.
• [31] Perron, P. and Vogelsang, T.J.: Testing for a Unit Root in a Time Series with a Shift in Mean: Corrections and Extensions, Journal of Business and Economic Statistics, 10 (1992), 467–470.
• [32] Priestley, M.B.: Spectral Analysis and Time Series I. Academic Press, 1981.
• [33] Reich, R. B.: Saving Capitalism for the Many, Not the Few. Alfred A. Knopf, New York, 2015.
• [34] Rényi, A.: On the theory of order statistics. Acta Mathematica Academiae Scientiarum Hungaricae 4(1953), 191–231.
• [35] R Core Team, R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria. (2015), url: https://www.R-project.org/.
• [36] Vogelsang, T.J.: Sources of nonmonotonic power when testing for a shift in mean of a dynamic time series. Journal of Econometrics 88 (1999), 283–299.
• [37] Zivot, E. and Wang, J.: Modelling Financial Time Series with S-PLUS, Springer, New York, 2006.

## Supplementary Material

This supplement contains the proofs of the results in the main paper, as well as multivariate generalizations of Theorem 2.1. Generalizations to Rényi type statistics defined with asymmetric trimming are also developed. We provide the details of the consistency of the variance estimators defined in Section 5 of the main paper.

## Appendix A Proofs of Main Results

### A.1. Proof of Theorem 2.1

First we note that under

$$\max_{t_T \le t \le T - t_T} \left| \frac{1}{t} \sum_{s=1}^{t} X_s - \frac{1}{T-t} \sum_{s=t+1}^{T} X_s \right| = \max(V_{T,1}, V_{T,2}), \tag{A.1}$$

where

 VT,1=maxt