Factor Analysis for High-Dimensional Time Series with Change Point

We consider change-point latent factor models for high-dimensional time series, where a structural break may exist in the underlying factor structure. In particular, we propose consistent estimators for factor loading spaces before and after the change point, and the problem of estimating the change-point location is also considered. Compared with existing results on change-point factor analysis of high-dimensional time series, a distinguished feature of the current paper is that our results allow strong cross-sectional dependence in the noise process. To accommodate the unknown degree of cross-sectional dependence strength, we propose to use self-normalization to pivotalize the change-point test statistic. Numerical experiments including a Monte Carlo simulation study and a real data application are presented to illustrate the proposed methods.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 2

page 3

page 4

03/17/2019

Change Point Detection in the Mean of High-Dimensional Time Series Data under Dependence

High-dimensional time series are characterized by a large number of meas...
12/10/2021

Segmenting Time Series via Self-Normalization

We propose a novel and unified framework for change-point estimation in ...
07/11/2019

Change point detection for graphical models in presence of missing values

We propose estimation methods for change points in high-dimensional cova...
05/08/2022

Optimal Change-point Testing for High-dimensional Linear Models with Temporal Dependence

This paper studies change-point testing for high-dimensional linear mode...
10/28/2021

Location-Adaptive Change-Point Testing for Time Series

We propose a location-adaptive self-normalization (SN) based test for ch...
11/14/2019

Estimation of dynamic networks for high-dimensional nonstationary time series

This paper is concerned with the estimation of time-varying networks for...
05/18/2018

Assessing Health Care Interventions via an Interrupted Time Series Model: Study Power and Design Considerations

The delivery and assessment of quality health care is complex with many ...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

High-dimensional time series has been emerging as a common and important data type in applications from a number of disciplines, including climate science, economics, finance, medical science, and telecommunication engineering among others. Although numerous statistical methods and their associated theory have been developed for the modeling and inference of time series data, existing results mostly focused on the univariate or finite-dimensional multivariate case. The problem of extending existing results developed under low-dimensional settings to handle high-dimensional time series, however, is typically nontrivial and requires significant innovations. For example, when the dimension is larger than the length of the observed time series, the commonly used autoregressive moving-average (ARMA) model in its conventional form may face a serious identification problem as commented by Lam et al. (2011)

. To handle the phenomenon of high dimensionality, one typically resorts to certain sparsity-type conditions for the purpose of dimension reduction. For example, when considering vector autoregressive (VAR) models in the high-dimensional setting, one typically need to assume that the coefficient matrices are sparse in a suitable sense in order to obtain their meaningful estimators; see for example

Basu and Michailidis (2015), Davis et al. (2016) and references therein for research in this direction.

Unlike the aforementioned sparse VAR approach that aims at extending existing parametric time series models to their sparse high-dimensional counterparts, a more commonly used approach in the literature for modeling high-dimensional time series is through the use of factor models; see for example Chamberlain and Rothschild (1983) and Stock and Watson (1998), Bai and Ng (2002), and Forni et al. (2004) among others. In the aforementioned works, it was assumed that most of the variations in the observed high-dimensional time series can be explained by a set of common factors, and as a result such factor models cannot be used to capture strong cross-sectional dependence. In addition, the common component suffers from the identifiability issue when the dimension is finite; see for example Bai and Ng (2002). To alleviate these problems, Lam et al. (2011) proposed an alternative type of factor models which has become more and more popular in the last decade. In the model of Lam et al. (2011), the common factors are viewed as the force that drives all the dynamics and are used to explain the serial dependence in the data. Under this setting, the noise process can accommodate the strong cross-sectional dependence and is white. Lam et al. (2011) proposed an approach based on autocovariance matrices of the observed process at nonzero lags for loading space estimation. This method is also applicable to non-stationary processes, processes with uncorrelated or endogenous regressors, and matrix-valued process; see for example Liu and Chen (2016), Chang et al. (2017), Wang et al. (2019), and Liu and Chen (2019). Compared with the approach of Bai and Ng (2002) and Forni et al. (2004), the factor model by Lam et al. (2011) not only ensures that the common component is identifiable but also captures all serial dependence of data which enables us to build forecasting models after dimension reduction. However, existing results along the line of Lam et al. (2011) were generally developed under the assumption that the underlying factor structure remains the same over the whole time period, while recent empirical applications reveal that the factor structure tends to exhibit structural breaks at a certain point, either due to a sudden market change or in response to some unpredictable reasons. This motivates us to consider factor models with possible change points.

The problem of change-point analysis for factor models has been an active area of research. For this, Breitung and Eickmeier (2011) considered testing the existence of change in loadings when the noise process is either independent or autoregressive with a finite order. Chen et al. (2014) came up with a Lagrange multiplier test and a Wald test for detecting the break by regressing one of the factors estimated by PCA on the remaining estimated factors. Han and Inoue (2015)

testes the break point through the second moments of the estimated factors.

Barigozzi et al. (2018) constructed a cumulative sum (CUSUM) test using wavelet coefficients for detecting multiple structural breaks when the noise sequence is Gaussian. Besides testing-based methods, Chen (2015) and Baltagi et al. (2017) considered using methods based on least squares to estimate the change-point location. Ma and Su (2018) proposed an adaptive fused group lasso approach to estimate the multiple change-point location. However, existing results in this direction were typically developed for models with the so-called idiosyncratic noise, where no strong cross-sectional dependence is allowed in the noise. In the current paper, we follow settings by Lam et al. (2011) where the common factor drives all the dynamics and explain serial dependence of the observed process, and consider the situation when the strength of cross-sectional dependence of the noise sequence is unknown and potentially strong. As discussed above, one advantage of this setting is that we can fully extract the dependence of data through common factors for future forecasting model building if needed. The aforementioned paper considered the problem of estimating the factor loading by assuming that there is no change point, and we shall here focus on the change-point case.

The remaining of the paper is organized as follows. Section 2 introduces the change-point factor model and considers its associated estimation problem, including estimating factor loading spaces before and after the change point and the location of the change point. The asymptotic properties of the proposed estimators are also investigated. Section 3.1 proposes a self-normalized approach to testing the existence of change point, which has a pivotalized asymptotic distribution regardless of whether the noise process has weak or strong cross-sectional dependence. Details on its practical implementation are discussed in Section 3.2. Numerical experiments, including a Monte Carlo simulation study and a real data application, are presented in Sections 4 and 5 respectively to illustrate the proposed methods. Section 6 concludes the paper. Technical proofs are deferred to the Appendix.

2 Change-Point Factor Model and Its Estimation

In this section, we will introduce the our change-point factor model and study its associated estimation problem, including the loading space before and after the change point and the change point location.

2.1 Change-Point Factor Model

Suppose we observe a -dimensional time series , , according to a factor model, then one can write

(1)

where is a latent factor process whose dimension is typically much smaller than , is the associated loading matrix, and

denotes the white noise process. The latent factor model (

1) has been widely used in the literature for dimension reduction of high-dimensional time series; see for example Bai and Ng (2002), Lam et al. (2011), Lam and Yao (2012), Chang et al. (2015) and references therein. It also relates to the generalized dynamic factor model of Forni et al. (2005) in which the latent factor process

is assumed to follow a low-dimensional autoregressive model. We shall here consider the general setting where

is not necessarily an autoregressive process. In model (1), the factor loading structure remains the same over the whole sampling period, while in many applications a structural break may occur due to various reasons. For this, we consider the change-point factor model

(2)

where , , represents the underlying latent factor before and after the change point whose location is denoted by , and and are the associated loading matrices. Recently, there have been efforts in studying the change-point factor model by incorporating certain beliefs on the change-point mechanism into the analysis. For example, Liu and Chen (2016)

modeled the change-point mechanism by a finite-state hidden Markov chain, and thus structural breaks occur when there is a regime switching in the hidden state variable. On the other hand,

Liu and Chen (2019) considered using a threshold variable to model the change-point mechanism, where the threshold variable is assumed to be -mixing and observable up to a small number of unknown parameters. Instead of introducing a Markov chain process or an additional threshold variable, we in the current paper focus on the change-point factor model (2) which uses time to naturally divide the observed process into homogenous pieces before and after the change point.

In practice, one does not observe nor but only , and thus the loading matrices in (2) are not fully identifiable. To be more specific, for , one can replace in (2) by for any non-singular matrix . However, the space spanned by the columns of the loading matrix , denoted by , is always uniquely defined; see for example the discussions in Lam et al. (2011), Chang et al. (2015), Liu and Chen (2016) and Liu and Chen (2019). As a result, when estimating the factor model (2), we shall focus on the loading space instead of the loading matrix itself.

We shall here introduce some notations. For a matrix , we use and to denote its Frobenius and L-2 norms respectively. In addition, we use for the trace, for the

-th largest singular value, and

for the square root of minimum nonzero eigenvalue of

. Also, we write if and , and we use and to denote the largest previous and smallest following integers of .

2.2 Estimating the Loading Spaces

Before introducing our estimators for loading spaces in the change-point setting, we first need a measure to quantify the distance between two linear spaces which can then be used to assess the statistical performance our estimators. For this, let and be full rank matrices in and respectively with . Denote the matrix whose columns form an orthonormal basis of for , then the distance between column spaces of and can be measured by

(3)

The distance measure (3) was first introduced in Liu and Chen (2019), and is a quantity between 0 and 1. In particular, it equals to 0 if or , and equals to 1 if and are orthogonal. For the special case when , the two spaces and have the same dimension, and the distance measure (3) reduces to

(4)

which was used in Chang et al. (2015) and Liu and Chen (2016). Since the number of factors is usually unknown in practice and may be estimated in a nonperfect way, we shall in the current paper use the generalized version in (3) to measure the distance between two linear spaces.

For factor models in high-dimensional cases, it is common to assume that the number of factors is fixed but the squared L-2 norm of the loading matrix grows with the dimension (Bai and Ng, 2002; Doz et al., 2011). The growth rate is called the strength of the factors in Lam et al. (2011), Lam and Yao (2012), Chang et al. (2015), and Liu and Chen (2019+). For , assume that

for some , then factors in regime are said to be strong if and weak if . The strength of factors measures the relative growth rate of the amount of information carried by the observed process about the common factors as increases, with respect to the growth rate of the amount of noise process in regime . It can be seen from our theoretical results below that the factor strength plays an important role in the estimation efficiency.

Let , and we shall now introduce the estimation procedure for loading spaces when a tentative break date is given. For this, we separate the data into two subsets before and after , namely and , and propose a change-point generalization of the estimator of Lam et al. (2011). To be more specific, define the generalized second cross moment matrices of and with lag in each regime as

where is the indicator function for regime satisfying if , if , and zero otherwise. With the white noise assumption, and satisfy

and

For a pre-determined positive integer , let

(5)

then we can see that

(6)

holds for if and for if . Therefore, is a symmetric non-negative definite matrix sandwiched by and when is in regime . If there exists at least one nonzero such that and is full rank, then has rank

. Its eigenspace corresponding to the nonzero eigenvalues is

. Hence, can be estimated through an eigen-decomposition of the sample version of . In particular, let

be the unit eigenvector of

corresponding to the -th largest nonzero eigenvalue for . Define

(7)

where is the number of factors. For simplicity, in the rest of this paper we use and to denote and for .

To estimate the loading spaces in the change-point setting, we shall use the sample version of the above quantities and define

for . Let be the eigenvalues of and be the set of corresponding orthonormal eigenvectors with . Define

(8)

then can be estimated by . When is in regime , similar to Theorem 1 in Lam et al. (2011), we can show that provides a consistent estimator for the loading space under mild conditions.

Proposition 1.

Assume Conditions 1–8 in Appendix A.1. If , then as , with true and , we have

when , and

when , where .

We remark that when , the estimator converges to at the rate of

, and thus the curse of dimensionality does not exist. When the factors in regime

are weak, however, the convergence rate is slower and the noise process distorts the information on the latent factor; see for example Lam et al. (2011).

We shall here also provide a discussion about the situation when does not fall in regimes . Without loss of generality, we illustrate by using the case when . In this case, is sandwiched by , and is a reasonable estimate for . However, misclassification does occur for the estimation of regime 1. In particular, since data points with from regime 2 are included in the calculation of , it is no longer sandwiched by . However, when is sufficiently close , one can show that the misclassification effect becomes negligible on the estimation, and the estimated space using the sample version of will continue to be consistent; see the results in Section 2.3.

2.3 Estimating the Change Point Location

In our estimation approach, we assume that the change point does not happen in the boundary area, namely there exists such that . Let be a hypothesized change point location, then we can use it to split the data into two subsets, namely the one with and the other with , and we define

(9)

where is a matrix for which forms a orthonormal matrix with and . In this case, represents the orthogonal complement space of for . Note that although is not uniquely defined and subject to any orthogonal transformation, is invariant under such transformations. If we project the cross moment matrices onto , then by (6) we can see that measures the squared norm of the projections. If is correctly specified, then the data in the two regimes identified by do belong to the correct regimes. Since , , by the definition of in (5), we have

If , the data are not correctly separated and at least one of the subsets contains data from both regimes. The following proposition shows that under mild conditions for .

Proposition 2.

Under Conditions 1–8, if .

Since , we can use data corresponding to and to get consistent estimates for and . Specifically, for , we estimate by

where is the unit eigenvector of corresponding to the -th largest eigenvalue. Then the sample version of the objective function is given by

(10)

We propose to estimate the change point location by

(11)

whose asymptotic property is provided in Theorem 1.

Theorem 1.

Assume Conditions 1–8 in Appendix A.1. If , then for any , with true and we have

as .

By Theorem 1, the proposed estimator in (11) for the change point location is consistent under mild conditions. It also reveals that the estimation performance can depend critically on the strength of factors in both regimes. In particular, if the factors are strong in both regimes (), then the estimation is immune to the curse of dimensionality. On the other hand, if factors are weak in one regime, then the resulting estimator can become less efficient as

increases. When factors have different levels of strengths before and after the break, the probability that the

falls in the weaker regime is larger but the estimation precision in the stronger regime is better. As a result, the overall rate of convergence of depends on the strength of the weaker regime.

By plugging into the estimation procedure described in Section 2.2, we obtain the final estimators for . To be more specific, let and

where is the unit eigenvector of corresponding to its -th largest eigenvalue. Theorem 2 provides the asymptotic property of the estimated loading spaces when the estimated break date is used.

Theorem 2.

Assume Conditions 1–8 in Appendix A.1. If , then as , with true and , we have

for .

By Theorem 2, the convergence rate of the associated loading space estimators is the same as that in Proposition 1 when the true change point location is known. Compared with the results in Liu and Chen (2019) which used a threshold variable to split the data, the ‘helping effect’ disappears and asymptotically there is no interaction between regimes. In practice, the number of factors is typically unknown but can be estimated through a similar eigenvalue ratio estimator as that used in Lam et al. (2011), namely

(12)

where is the -th largest eigenvalue of , .

3 Determining the Existence of Change Point

3.1 A Self-Normalized Change-Point Test

Although the change-point factor model (2) is able to capture potential structural breaks in the loading space, it can be unnecessarily complicated when there is actually no change point. We shall here consider the problem of determining the existence of change points in the high-dimensional factor model (2). The problem has been studied by Chen et al. (2014)

using linear regression of estimated factors, and by

Ma and Su (2018) using fused Lasso penalization on block estimators; see also reference therein. However, the aforementioned results both require the noise vector to have a sparse cross-sectional dependence structure; see for example Assumption 4 in Chen et al. (2014) and Assumption A3 in Ma and Su (2018). In addition, the fused Lasso approach of Ma and Su (2018) does not provide a -value which can be an informative measure for quantifying the amount of statistical evidence in favor of the change-point model. To handle noise processes with possibly non-sparse cross-sectional dependence structures, the main challenge is to deal with the unknown scale parameter that associates with the strength of cross-sectional dependence. For this, we propose to adopt the idea of self-normalization (Lobato, 2001; Shao, 2010), which is capable of adaptively handling unknown scale parameters caused by different dependence strengths; see for example Shao (2011), Bai et al. (2016) and Taqqu and Zhang (2019).

Assume that the noise process has a constant covariance structure , then for any vector , by (2) and Condition 3 in Appendix A.1 we have

and

Therefore, one can determine the existence of change points by testing for structural breaks in the variance of the transformed sequence

, namely

(13)

versus the alternative that there exists a time point such that

. Compared with traditional change-point testing problems, the key difference here is that the scale of the random variable sequence

can grow to infinity at an unknown rate depending on the strength of cross-sectional dependence. This is particularly due to the fact that the dimension of the covariance matrix is , where can grow to infinity with the sample size . Under the idiosyncratic error assumption of Chen et al. (2014) and Ma and Su (2018), the covariance matrix has a sparse structure by which one can show that . However, for situations with non-sparse cross-sectional dependence structures, can grow to infinity even when is bounded, and the rate can depend on the strength of the underlying cross-sectional dependence, which is typically unknown in practice. As a result, this can pose a challenge when one seeks an appropriate normalization for the test statistic. For this, we consider the following self-normalization approach.

Let

(14)

we consider the test statistic

(15)

where

is the self-normalizer that pivotalizes the asymptotic distribution of the test statistic (15) to make it free of any nuisance scale parameter.

Theorem 3.

Assume Condition 9 in Appendix A.1. Then under the null hypothesis of no change point, we have as

,

where

3.2 A Data-Driven Choice

Although the hypothesis test in (13) is able to detect the change point with any , we shall here consider a data-driven choice that aims to optimize the power performance. Intuitively, should be chosen to be far away from the linear space , which can be achieved by

As a result, we propose to choose by

When is the unit right singular vector of that corresponds to the largest singular value, will be in the column space of and the distance of and will then be maximized. Let be the unit right singular vector of that corresponds to the largest singular value, the associated test statistic with the data-driven choice can then be obtained by replacing with in (14) and (15). When the degrees of factor strength are different in the two regimes, the data from the one where factors are stronger will provide more information. We can further incorporate this into the data-driven choice of . To be more specific, if , then factors after the potential change point are likely to be stronger, in which case we can choose that maximizes the distance from by letting , where is the right singular vector of corresponding to the largest singular value.

4 Simulation

In this section, we present a Monte Carlo simulation study to examine the finite sample performance of the proposed inference procedure. For this, we generate entries of the loading matrix

as independent sample from the uniform distribution on

. As a result, the factor strength of is characterized by . The associated noise process is generated as a Gaussian process whose covariance matrix has 1 on the diagonal and for all off-diagonal entries. For simplicity, we use throughout our numerical examples. For each setting, we generate 1000 realizations, and the results are summarized in the following subsections.

4.1 The Case with No Change Point

We first consider the case with no change point and examine the empirical size performance of the self-normalized change-point test proposed in Section 3. For this, we consider both the setting with strong factors () and the setting with weak factors (). Three common factors drive the time series, and the factor process is set to be three independent AR(1) processes with noises process and AR coefficients 0.9, -0.7, and 0.8. for the noise process. and . The results are summarized in Table 1, from which we can see that the empirical sizes of the proposed test are reasonably close to their nominal levels when factors are strong. When , our tests are slightly oversized, and this is because we only utilize information obtained before time and after time to estimate in (14).

0 0.25 0 0.25 0 0.25

n=400
p=20 0.112 0.138 0.064 0.072 0.014 0.012
n=400 p=40 0.127 0.149 0.069 0.083 0.013 0.020
n=400 p=100 0.123 0.112 0.067 0.059 0.018 0.016
n=1000 p=20 0.116 0.122 0.058 0.057 0.009 0.011
n=1000 p=40 0.103 0.130 0.058 0.072 0.012 0.012
n=1000 p=100 0.115 0.137 0.068 0.071 0.014 0.019
Table 1: Empirical sizes of tests with different levels of factor strength in Section 4.1

4.2 The Change-Point Case

We now consider the change-point case and examine if the self-normalized change-point test proposed in Section 3 can successfully detect the existence of the change point. For this, we follow the data generating mechanism described in Section 4.1 for generating the factor and noise processes, and set . Let the change point location , and we consider four different scenarios on the factor strength, namely SS () in which strong factors are used before and after the change point, SW ( and ) in which strong factors are used before the change point and weak factors after, WS ( and ) in which weak factors are used before the change point and strong factors after, and WW () in which weak factors are used before and after the change point. The results are summarized in Table LABEL:power_ex2, from which we can see that the proposed test performs reasonably well as it successfully identifies the change point with high probabilities. In addition, the performance seems to improve in general when the sample increases.


Setting
SS SW WS WW SS SW WS WW SS SW WS WW
n=400 p=20 0.975 0.973 0.966 0.969 0.941 0.929 0.932 0.943 0.820 0.787 0.815 0.810
n=400 p=40 0.979 0.970 0.955 0.969 0.944 0.936 0.915 0.934 0.820 0.804 0.783 0.799
n=400 p=100 0.960 0.968 0.975 0.977 0.931 0.924 0.949 0.940 0.805 0.806 0.811 0.821
n=1000 p=20 0.996 0.996 0.998 0.996 0.990 0.989 0.992 0.992 0.956 0.949 0.958 0.959
n=1000 p=40 0.998 0.996 0.994 0.997 0.989 0.987 0.985 0.987 0.935 0.956 0.946 0.950
n=1000 p=100 0.999 0.990 0.997 0.999 0.996 0.995 0.992 0.992 0.960 0.962 0.960 0.962
Table 2: Empirical powers of tests under different settings in Section 4.2

After the change point existence has been identified by the test, we shall apply the methods proposed in Section 2 to estimate the change point location and the loading spaces in two regimes. Figure 1 provides the histograms of our change point location estimator under different settings, from which we can observe the followings. First, when the factor strength is weak in at least one regime, before or after the change point, the estimation efficiency in that weak regime can suffer from the increase in dimension. In contrast, the estimation efficiency in the strong regime does not seem to be affected by the curse of dimensionality. This is in line with the results in Theorem 1; see also the discussions thereafter. Second, it can be seen from the top panels in Figure 1 that, when the factor strengths before and after the change point are different, namely settings SW and WS, the estimation bias, though asymptotically negligible, is more likely to be toward the regime with weaker factors. In particular, when the factor strength after the change point is weaker as in the SW setting, then it is more likely to overestimate . On the other hand, if the factor strength before the change point is weaker as in the WS setting, then it is more likely to underestimate . We also provide in Table 3 a summary of the estimation error . The estimation error for the loading spaces are summarized in Table 4, from which we can see that the estimation procedure proposed in Section 2.2 performs reasonably well under all the considered settings.

Figure 1: Histograms of estimated threshold value under different settings when and and are known. The dashed line shows the true threshold value , black bars show the frequencies of underestimation, and grey bars show the frequencies of overestimation.
SS 0.035 0.039 0.040 0.015 0.018 0.018
SW 0.051 0.066 0.083 0.023 0.029 0.039
WS 0.054 0.060 0.081 0.023 0.027 0.038
WW 0.053 0.060 0.071 0.024 0.028 0.034
Table 3: Average estimation error in Section 4.2
Setting SS 0.059 0.055 0.053 0.033 0.032 0.031
0.059 0.057 0.055 0.033 0.032 0.031
Setting SW 0.058 0.056 0.053 0.033 0.032 0.031
0.095 0.099 0.113 0.049 0.053 0.058
Setting WS 0.093 0.094 0.110 0.048 0.052 0.058
0.060 0.056 0.053 0.034 0.032 0.032
Setting WW 0.089 0.095 0.103 0.049 0.052 0.057
0.093 0.094 0.105 0.050 0.052 0.057
Table 4: Average estimation error in Section 4.2

4.3 A Comparison with Existing Results

In this subsection, we will compare the performance of our change point detection procedure with Chen et al. (2014) and Han and Inoue (2015) who both focus on single change point detection in factor models but do not allow strong cross-sectional dependence in noise process. Set and . Data are simulated with processes described in Section 4.1 and in Section 4.2 with to compare the sizes and powers of different tests, respectively. From Table 6 and Table 6, we can see that our test controls the sizes better than Wald test proposed by Chen et al. (2014) and is more powerful than LM and Wald tests by Han and Inoue (2015) when the strong cross-sectional exists in noise process.

Our method Wald-cdg LM-hi Wald-hi
0 0.25 0 0.25 0 0.25 0 0.25
0.086 0.100 0.238 0.203 0.016 0.017 0.118 0.130
0.110 0.148 0.267 0.203 0.004 0.010 0.161 0.136
0.119 0.204 0.235 0.210 0.006 0.008 0.161 0.141
0.074 0.071 0.072 0.092 0.028 0.023 0.045 0.042
0.086 0.073 0.072 0.092 0.029 0.031 0.061 0.045
0.099 0.077 0.084 0.093 0.023 0.026 0.050 0.050
Table 6: Power comparison among different tests with for Example 1 in Section 4.3
Our method Wald-cdg LM-hi Wald-hi
0,0 0,0.25 0.25,0 0.25,0.25 0,0 0,0.25 0.25,0 0.25,0.25 0,0 0,0.25 0.25,0 0.25,0.25 0,0 0,0.25 0.25,0 0.25,0.25
0.721 0.692 0.671 0.676 0.960 0.948 0.950 0.928 0.036 0.057 0.024 0.028 0.109 0.066 0.164 0.138
0.722 0.718 0.700 0.662 0.999 0.998 0.999 0.997 0.007 0.006 0.008 0.003 0.212 0.168 0.226 0.187
0.715 0.700 0.710 0.681 0.859 0.809 0.809 0.821 0.000 0.000 0.000 0.000 0.250 0.210 0.260 0.264
0.930 0.929 0.921 0.925 1.000 0.996 0.992 0.995 1.000 1.000 1.000 0.995 1.000 1.000 0.999 0.996
0.951 0.919 0.924 0.913 1.000 1.000 1.000 1.000 0.995 0.360 0.382 0.319 0.396 0.400 0.393 0.447
0.939 0.933 0.912 0.928 1.000 1.000 1.000 1.000 0.323 0.264 0.318 0.310 0.396 0.349 0.392 0.411

Note: ’cdg’ denotes the test proposed by Chen et al. (2014), ’hi’ denotes the test proposed by Han and Inoue (2015), and ’LM’ denotes Lagrange Multiplier test. Wald test in Chen et al. (2014) is more powerful than their LM test. Han and Inoue (2015) developed various test statistics which have similar performance. In Table 5 and Table 6, we compare our test with Wald test by Chen et al. (2014), LM and Wald tests by Han and Inoue (2015) with supremum test statistics obtained by Bartlett kernel.

Table 5: Size comparison among different tests with for Example 1 in Section 4.3

5 Real data analysis

We applied our method to the Stock-Watson data (Stock and Watson, 1998, 2005), containing 132 U.S. monthly economic indicators from March 1960 to December 2003, with and . The data include real output and income, employment, real retail, manufacturing and trade sales, consumption, interest rates, price index and other economic indicators. Stock and Watson (2005) provided more detailed information about this data set and transformations needed for stationarity before analysis.

Set , , , and level of significance . We applied the wild binary segmentation method (Fryzlewicz, 2014) to check if there are multiple change points. The result shows that during this period there is only one change point.

Left and right panels in Figure 5 demonstrate the ratio of eigenvalues of and , respectively, and it reaches its minimum values at 1 and 2, which implies that and . Using methods described in Section 2.3, we have .

It means that the dynamics of economic indicators experienced once a permanent structural change around November 1988 possibly due to saving and loan crisis (Curry and Shibut, 2000). Before the change, there was one common factor, while after the change, there was two common factors driving 132 economic indicators.

Extending the method described in Lam et al. (2011), we can estimate the residuals of factor models with

We compare our method with Wald test by Chen et al. (2014), LM and Wald tests by Han and Inoue (2015) through white noise test (Chang et al., 2017) and residual sum of squares (RSS). All the three tests reject the null and agree that there is a structural break in data. With the estimated break dates, we fix the number of factors before and after break at 1 and 2 for comparison and calculate the residual series. Table 7 reports the -values associated with white noise test for the residual series and RSS. Our model is validated by the white noise test with -value 0.07. It confirms that after extracting the common factors, there is no serial dependence left. The residuals series from Wald test by Chen et al. (2014) and Wald test by Han and Inoue (2015) fail the white noise test. Although -value for LM test by Han and Inoue (2015) is greater than 0.05, Han and Inoue (2015) yields a larger RSS. In sum, our method can successfully capture all the serial dynamics of the data, and lower the residual sum of squares comparing to others.

Method -value for white noise test RSS
Ours 0.075 308238
Wald-cdg 0.000 160476
LM-hi 0.070 1482874
Wald-hi 0.030 1156676
Table 7: Comparison among different models for real data analysis

6 Conclusion

Factor models have been frequently used in the study of high-dimensional time series. Their associated change-point analyses were mostly studied under the framework of Bai and Ng (2002) and Forni et al. (2004). However, such type of factor models suffer from the identifiability issue in the finite dimensional case and cannot be used to explain the phenomenon of strong cross-sectional dependence in noise process. To alleviate the aforementioned problems, Lam et al. (2011) proposed an alternative framework for factor analysis of high-dimensional time series. In the current paper, we develop statistical methods and their associated theory for change-point analysis of the recently proposed factor model of Lam et al. (2011). In the presence of a change point, we propose consistent statistical estimators for the change point location and factor loading spaces before and after the change point, and provide their explicit convergence rates. In particular, our results reveal that the resulting estimators can have very different asymptotic behaviors in response to the phenomenon of high dimensionality depending on the factor strength. To be more specific, for processes with strong factor strength, the convergence rate will not be affected by the dimension, while the curse of dimensionality can be observed for processes with weak factor strength; see the discussion in Section 2 and our numerical results in Section 4. We also propose a self-normalized test for determining the change-point existence, which, due to the self-normalization, can adaptively handle the case with strong and weak cross-sectional dependence. It can be seen from our numerical results in Section 4 that the proposed change-point test performs reasonably well in terms of both the size and power.

Appendix A.1. Regularity Conditions

Define

and three intervals

For any and and are from the same one of the three intervals, , or , let

The following regularity conditions are needed for theoretical properties.

Condition 1. Let be the -field generated by The latent process is -mixing with mixing coefficients satisfying

for and some , where .

Condition 2. For any , , and , , where is the -th element of , is a constant, and is given in Condition 1.

Condition 3. is a white noise process. and are uncorrelated given . Each element of remains bounded by a positive constant as increases to infinity.

Condition 4. For , there exists a constant such that , as goes to infinity.

Condition 5. . For any , there exists an integer such that and