Estimation of high-dimensional factor models and its application in power data analysis

05/06/2019 ∙ by Xin Shi, et al.

Factor models are widely used to reduce dimensionality and extract relevant information from high-dimensional data. The spectrum of covariance matrices from power data exhibits two components: 1) a bulk, which arises from random noise or fluctuations, and 2) spikes, which represent factors caused by anomaly events. In this paper, we propose a new approach to the estimation of high-dimensional factor models. It minimizes the distance between the empirical spectral density (ESD) of the covariance matrices of the residuals of power data, obtained by subtracting principal components, and the limiting spectral density (LSD) of a multiplicative covariance structure model. Free probability techniques from random matrix theory (RMT) are used to calculate the spectral density of the multiplicative covariance model, which resolves the computational difficulties efficiently. The proposed approach connects the estimation of the number of factors to the LSD of the covariance matrices of the residuals, and thus provides estimators of both the number of factors and the correlation structure of the residuals. Because power data contain substantial measurement noise and the residuals have a complex correlation structure, the approach approximates the ESD of the covariance matrices of the residuals through a multiplicative covariance model, which avoids crude assumptions or simplifications about the structure of the data. Theoretical studies show that the proposed approach is robust to noise and sensitive to the presence of weak factors. Synthetic data from the IEEE 118-bus power system are used to validate the effectiveness of the approach. Furthermore, an application to real-world online monitoring data from a power grid shows that the estimators can be used to indicate the system states.


1 Introduction

Factor models are important tools for reducing the dimension of observed data and extracting the relevant information. They are used for modeling a large number of variables through a small number of unobserved variables to be estimated in many applications. With the emergence of big data in many fields, especially the increasing data dimensionality, extensive studies on the estimation of high-dimensional factor models have been conducted.

Bai and Ng bai2002determining propose using information criteria to estimate the number of factors. Their method is developed in a high-dimensional framework (), in marked contrast to earlier methods lewbel1991rank ; connor1993test ; cragg1997inferring ; forni1998let , which assume that the data dimension is fixed or small. A critical assumption in that work is that the factors’ cumulative effect on grows proportionally to . Stock and Watson stock2002forecasting suggest using principal components to estimate factors in high-dimensional datasets. Kapetanios kapetanios2004new ; kapetanios2010testing first proposes exploiting the structure of the residual terms in approximate factor models. Building on Kapetanios’s work, Onatski onatski2010determining relaxes the restrictions on the covariance structure of the residual terms and develops a new consistent estimator of the number of factors. Harding harding2013estimating imposes restrictions on the spatial-temporal correlation patterns of the residual terms, and proposes an estimation method for the number of factors that relates the moments of the empirical spectral density (ESD) of the covariance matrices of the observed data to the parameters of the spatial-temporal correlations. Yeo and Papanicolaou yeo2016random present a new approach that estimates the number of factors by connecting the factor model estimation problem to the limiting spectral density (LSD) of the covariance matrices of the residuals. It rests on two strict assumptions: first, that the spatial correlation of the real residuals can be completely eliminated by removing the estimated number of factors; second, that the residuals follow an AR(1) process.

Building on Yeo and Papanicolaou’s work, in this paper we relax the restrictions on the structure of the residuals and propose approximating the LSD of the covariance matrices of the residuals through a multiplicative covariance structure model with a controllable parameter. This makes the proposed approach more flexible and practical for analyzing real-world data whose residuals have a complex correlation structure. Taking power flow data as an example, the classical physical model in matrix form is as follows,

(1)

where denote the variations of regarding variables and is the inverse of the Jacobian matrix. is the observed data (e.g., voltage amplitude and phase angle), are considered as the signals (e.g., active and reactive power), and represents small random fluctuations or measuring errors. Since a lot of measurement noise is contained in the residual term and the spatial-temporal correlations among its entries are complex, it is unreasonable to model the residuals from power data based on crude assumptions and simplifications as in Yeo and Papanicolaou’s work.

Inspired by the idea of decomposing the observed data into systemic components (factors) and idiosyncratic components (residuals), we consider an approximate factor model for variables and observations as follows,

(2)

where is an observed data matrix, is an ( is the number of factors) factor loading matrix, is an matrix of factors, and is an residual matrix.

One simple way to estimate is using the principal components and assuming as pure noise. However, our approach mainly focuses on and we estimate the number of factors and the ESD of covariance matrix of simultaneously. The main advantages of the proposed approach can be summarized as follows:

(i) It relaxes restrictions on the structure of the residuals . A pure-noise assumption, or an assumption of temporal correlation only, is crude and unreasonable for the residuals in practice. Instead of imposing a strict structure on , the proposed approach approximates the ESD of the covariance matrix of through a multiplicative covariance structure model with a controllable parameter, which makes the approach more flexible and practical.

(ii) The proposed approach uses free probability techniques in RMT to derive the LSD of the built multiplicative covariance model, which greatly simplifies the calculation process and ensures the efficiency of the approach.

(iii) It relates the estimation of the number of factors to the ESD of covariance matrix of , which allows controlling both the number of factors and the spectral shape of the residuals.

(iv) The theoretical studies on the synthetic data generated from Monte Carlo experiments show the proposed approach is robust to noise and sensitive to the weak factors, and the built multiplicative covariance structure can fit the ESD of covariance matrices of the auto-cross(weak)-correlation structure residuals better than the AR(1) model in Yeo and Papanicolaou’s approach.

(v) Using power data generated from the IEEE 118-bus test system, the estimators in the proposed approach are shown to be sensitive to the number and scale of anomaly events occurring in the power system.

(vi) With the real-world online monitoring data from a power grid, the estimators in the proposed approach are found to be successful in indicating the system states.

The rest of this paper is organized as follows. In Section 2, we apply the Marchenko-Pastur law for the residuals from both synthetic data and real-world power data. In Section 3, we present our approach for the estimation of high-dimensional factor models. In Section 4, by using the synthetic data generated from Monte Carlo experiment, we evaluate the performance of our approach and compare it with that developed by Yeo and Papanicolaou in terms of detecting weak factors and convergence rate. Section 5 shows the applications of our approach to power data analysis. In Section 6, conclusions are presented.

2 Motivating Example

Marchenko-Pastur law (M-P law): Let be an random matrix whose entries are independent identically distributed (i.i.d.) variables with the mean and the variance . The corresponding covariance matrix is defined as . As but , according to the M-P law marvcenko1967distribution , the ESD of converges to the limit with probability density function (PDF)

(3)

where , .
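Equation (3) did not survive in this copy. As a rough illustration, the sketch below evaluates the textbook form of the M-P density, ρ(λ) = sqrt((λ₊ − λ)(λ − λ₋)) / (2π c σ² λ) on [λ₋, λ₊] with λ± = σ²(1 ± √c)². The symbols `c` (dimension-to-sample ratio, assumed to lie in (0, 1]) and `sigma2` (entry variance) are assumed names and may differ from the paper's own notation.

```python
import math


def mp_support(c, sigma2=1.0):
    """Return the edges (lam_minus, lam_plus) of the M-P bulk."""
    root_c = math.sqrt(c)
    return sigma2 * (1.0 - root_c) ** 2, sigma2 * (1.0 + root_c) ** 2


def mp_pdf(lam, c, sigma2=1.0):
    """Textbook Marchenko-Pastur density for ratio 0 < c <= 1.

    c and sigma2 are assumed names (ratio and entry variance).
    """
    lam_minus, lam_plus = mp_support(c, sigma2)
    if lam <= lam_minus or lam >= lam_plus:
        return 0.0  # outside the bulk the density vanishes
    numerator = math.sqrt((lam_plus - lam) * (lam - lam_minus))
    return numerator / (2.0 * math.pi * c * sigma2 * lam)
```

A histogram of covariance eigenvalues can then be compared against `mp_pdf` evaluated at the bin centers, which is the kind of comparison described for Figs. 1 and 2.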

In this section, we first apply the M-P law for the residuals from the synthetic data generated by the following model,

(4)

where , , and are independent. The true number of factors is set to be 4. As is shown in Fig. 1, with the factors removed continuously, the ESD of covariance matrices of the residuals converges to the M-P law.

Figure 1: The ESD of covariance matrices of the residuals from the synthetic data with . With factors removed continuously, the ESD can be fit well by the M-P law.

In contrast, we apply the M-P law for the residuals from the real-world online monitoring data in a power grid. Let matrix be the sampling data with , and is the residual matrix obtained by subtracting principal components from . We convert into the standard form through

(5)

where , , and . As shown in Fig. 2, no matter how many factors are removed, the ESD of the covariance matrices of the residuals from the real-world data does not fit the M-P law. Therefore, a new model is needed to fit the ESD of the real residuals when estimating factor models.

Figure 2: The ESD of covariance matrices of the residuals from the real-world online monitoring data. No matter how many factors are removed, the ESD does not fit to the M-P law.
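The standardization in (5) is not legible in this copy. The sketch below assumes it amounts to a row-wise z-score (each variable shifted to zero mean and scaled to unit variance) and then computes the eigenvalues of the resulting sample covariance matrix. The function names and the use of NumPy are illustrative choices, not the authors' implementation.

```python
import numpy as np


def standardize_rows(R):
    """Row-wise z-score: each variable gets zero mean and unit variance.

    Assumed reading of the 'standard form' in (5).
    """
    R = np.asarray(R, dtype=float)
    mu = R.mean(axis=1, keepdims=True)
    sd = R.std(axis=1, keepdims=True)
    sd = np.where(sd == 0.0, 1.0, sd)  # leave constant rows unscaled
    return (R - mu) / sd


def covariance_eigenvalues(R_std):
    """Eigenvalues (ascending) of the sample covariance (1/T) R R^T."""
    T = R_std.shape[1]
    S = R_std @ R_std.T / T
    return np.linalg.eigvalsh(S)
```

Under this assumption every diagonal entry of the covariance matrix equals 1, so the eigenvalues sum to the number of variables, which gives a quick sanity check before any comparison with the M-P law.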

3 Methodology

In this section, we present our approach for the estimation of high-dimensional factor models. In Section 3.1, we provide preliminaries that are used in the proposed approach. In Section 3.2, we introduce a new factor model estimation approach, which connects the estimation of the number of factors to the ESD of the covariance matrices of the residuals. Because the residuals from power data contain substantial measurement noise and have a complex correlation structure, an approximation scheme is proposed for deriving the LSD of the covariance matrices of the residuals. The specific steps of the proposed approach are given in Section 3.3, where free probability techniques are used to derive the spectral distribution of the multiplicative covariance structure model.

3.1 Preliminaries

Definition 1

For a random matrix , the empirical spectral density of is defined as,

(6)

where for denote the eigenvalues of , and is the Dirac delta function centered at .

Definition 2

The limiting spectral density of is defined as the limit of (6) as .

Definition 3

The Stieltjes Transform (Green’s Function) of is defined as,

(7)

and can be reconstructed through

(8)
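For a finite sample, the Green's function and its inversion can be evaluated directly from the eigenvalues. The sketch below assumes the sign convention G(z) = (1/N) Σ 1/(z − λᵢ), under which the density is recovered as −Im G(x + iε)/π for a small ε > 0. The function names and the default value of `eps` are illustrative.

```python
import math


def green_function(z, eigenvalues):
    """Empirical Green's function G(z) = (1/N) * sum_i 1 / (z - lambda_i)."""
    return sum(1.0 / (z - lam) for lam in eigenvalues) / len(eigenvalues)


def density_from_green(x, eigenvalues, eps=1e-2):
    """Smoothed spectral density: -Im G(x + i*eps) / pi.

    Each eigenvalue contributes a Lorentzian of width eps, so eps acts
    as a smoothing bandwidth rather than a true limit.
    """
    g = green_function(complex(x, eps), eigenvalues)
    return -g.imag / math.pi


def spectral_moment(k, eigenvalues):
    """Empirical k-th spectral moment: (1/N) * sum_i lambda_i**k."""
    return sum(lam ** k for lam in eigenvalues) / len(eigenvalues)
```

With a single eigenvalue at 1 and `eps=0.1`, the recovered density at x = 1 is 1/(0.1π), the peak of the corresponding Lorentzian.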

Definition 4

The th moment of is defined as,

(9)

Definition 5

The moment generating function as a power series at zero is defined as,

(10)

and its relation to the Green’s function is

(11)
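The displayed formulas (6) to (11) are missing from this copy. For orientation, one common set of conventions for these objects is collected below. The symbols (ρ for the density, N for the matrix size, G, m_k and M) are assumptions and may not match the paper's notation. In particular, some authors expand the moment generating function in powers of z rather than 1/z, which changes the form of the last relation.

```latex
\begin{align}
\rho(\lambda) &= \frac{1}{N}\sum_{i=1}^{N}\delta(\lambda-\lambda_i), \\
G(z) &= \int \frac{\rho(\lambda)}{z-\lambda}\,d\lambda, \qquad
\rho(\lambda) = -\frac{1}{\pi}\lim_{\epsilon\to 0^{+}}
  \operatorname{Im} G(\lambda+i\epsilon), \\
m_k &= \int \lambda^{k}\rho(\lambda)\,d\lambda, \\
M(z) &= \sum_{k\ge 1}\frac{m_k}{z^{k}} = zG(z)-1 .
\end{align}
```

The last identity follows by expanding $z/(z-\lambda)=\sum_{k\ge 0}(\lambda/z)^k$ under the integral and using $m_0=1$.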

Definition 6

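Let $(\mathcal{A}, \varphi)$ be a unital algebra with a unital linear functional. Suppose $\mathcal{A}_1, \dots, \mathcal{A}_s$ are unital subalgebras. Then $\mathcal{A}_1, \dots, \mathcal{A}_s$ are freely independent (or just free) mingo2017free with respect to $\varphi$ if $\varphi(a_1 \cdots a_r) = 0$ whenever $r \ge 2$ and $a_1, \dots, a_r \in \mathcal{A}$ are such that

  • $\varphi(a_i) = 0$ for $i = 1, \dots, r$

  • $a_i \in \mathcal{A}_{j_i}$ with $1 \le j_i \le s$ for $i = 1, \dots, r$

  • $j_1 \ne j_2,\ j_2 \ne j_3,\ \dots,\ j_{r-1} \ne j_r$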

Definition 7

Given the functional inverse of the moment generating function , the S-transform speicher1994multiplicative ; voiculescu1992free is defined as,

(12)

Theorem 1

Let and be two freely independent random matrices. Then the S-transform of the product is simply the product of their S-transforms,

(13)
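As a worked illustration of the multiplication rule, consider two free matrices that each follow the M-P law with unit variance and ratios $c_1$ and $c_2$. The derivation assumes the convention $M(z)=zG(z)-1$, with $N(z)$ the functional inverse of $M$ and $S(z)=\frac{1+z}{z\,N(z)}$. These symbols and this example are assumptions for illustration and need not coincide with the model in (17) or the polynomial in (22).

```latex
\begin{align}
zM &= (1+M)(1+cM)
  \;\Longrightarrow\;
  N_{\mathrm{MP}}(z) = \frac{(1+z)(1+cz)}{z}, \qquad
  S_{\mathrm{MP}}(z) = \frac{1}{1+cz}, \\
S_{AB}(z) &= S_A(z)\,S_B(z) = \frac{1}{(1+c_1 z)(1+c_2 z)}, \qquad
  N_{AB}(z) = \frac{(1+z)(1+c_1 z)(1+c_2 z)}{z}, \\
zM_{AB} &= (1+M_{AB})(1+c_1 M_{AB})(1+c_2 M_{AB}) .
\end{align}
```

The last line is a cubic polynomial equation in $M_{AB}(z)$. Setting $c_2=0$ recovers the quadratic equation of the M-P law, which is a convenient consistency check.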

3.2 Factor model estimation

The proposed estimation approach matches the LSD calculated from the modeled multiplicative covariance matrix to the ESD of the covariance matrix of the real residuals, which are obtained by subtracting principal components. The estimators are obtained by minimizing the distance between the two spectra.

The first step is to obtain the ESD of covariance matrix of the real residuals. For high-dimensional data, the principal components are able to approximately mimic all true factors stock2002forecasting . Here, we use principal components to represent factors and the real residuals are obtained by subtracting factors from the observed data, which is defined as

(14)

where is the number of factors, is an matrix given by the eigenvectors corresponding to the largest eigenvalues of , and is an matrix which is estimated by . The covariance matrix of the real residuals can be calculated as,

(15)

where the subscript indicates it is constructed from the real residuals. Thus we can obtain the ESD of , which is denoted as .
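The residual extraction in (14) and (15) can be sketched as follows. The sketch assumes that the loadings are the top-k eigenvectors of (1/T) X Xᵀ and that the factors are estimated by projecting X onto them. The exact normalization in the paper is not legible in this copy, so the function names and scaling are illustrative.

```python
import numpy as np


def pca_residual(X, k):
    """Remove the k leading principal components from an N x T matrix X.

    Loadings L: top-k eigenvectors of (1/T) X X^T (assumed normalization).
    Factors  F: projection L^T X.
    """
    X = np.asarray(X, dtype=float)
    if k == 0:
        return X.copy()
    T = X.shape[1]
    _, vecs = np.linalg.eigh(X @ X.T / T)  # eigenvalues in ascending order
    L = vecs[:, -k:]                       # N x k loading matrix
    F = L.T @ X                            # k x T factor matrix
    return X - L @ F


def residual_covariance(U):
    """Sample covariance (1/T) U U^T of the residual matrix U."""
    return U @ U.T / U.shape[1]
```

Because the residual is the projection of X onto the remaining eigenvectors, the largest eigenvalue of the residual covariance equals the (k+1)-th largest eigenvalue of the original covariance.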

The next step is to model the covariance matrix of the real residuals. Here, we factorize into cross-covariances and auto-covariances, namely,

(16)

where the coefficients and are respectively collected into an cross-covariance matrix and a auto-covariance matrix , both of which are symmetric and positive-definite. The cross-covariance matrix models the weak spatial (cross-) correlation of the residuals, since the main spatial correlations can be effectively eliminated by removing factors (principal components). The auto-covariance matrix models the temporal (auto-) correlation of the residuals. One simple way to obtain the LSD of is to take as an identity matrix and model as the AR(1) covariance matrix, under the crude assumptions that the spatial correlations of the residuals can be completely removed with the factors and that the residuals follow an AR(1) process. For power data, however, the residuals contain substantial measurement noise (usually considered random) and their spatial-temporal correlations are uncertain. Here, instead of modeling and directly, we approximate the LSD of through a multiplicative covariance structure with a controllable parameter , namely,

(17)

where the subscript denotes that it is constructed from the modeled multiplicative covariance matrix, , is an random Gaussian matrix, and , which ensures that the spectral distribution of converges to a non-random limit as . The LSD of can be derived using free probability theory (FPT), as described in Section 3.3, and is denoted as .

The last step is to search for the optimal parameter set by minimizing the distance between and , which is denoted as,

(18)

where is a spectral distance measure. In yeo2016random , several distance metrics are tested, and the Jensen-Shannon divergence is found to be the most sensitive to the presence of spikes (i.e., the deviating eigenvalues in the spectrum) while correctly reflecting the distribution of the bulk (i.e., the grouped eigenvalues in the spectrum). Here, we choose the Jensen-Shannon divergence as the spectral distance measure. It is a symmetrized version of the Kullback-Leibler divergence and is defined as,

(19)

where . It can be seen that becomes smaller as approaches , and vice versa. Therefore, we can match to by minimizing , through which the optimal parameter set is obtained.
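A discretized version of the Jensen-Shannon divergence can be sketched as below. It assumes that both spectral densities have been evaluated on the same set of bins and uses the standard definition JS(P, Q) = ½ KL(P‖M) + ½ KL(Q‖M) with M = ½(P + Q). The paper's (19) is presumably the continuous analogue, and the function names are illustrative.

```python
import math


def kl_divergence(p, q):
    """Discrete Kullback-Leibler divergence; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two histograms on common bins.

    Inputs are non-negative weights; they are normalized to sum to 1.
    """
    sp, sq = sum(p), sum(q)
    p = [x / sp for x in p]
    q = [x / sq for x in q]
    m = [0.5 * (a + b) for a, b in zip(p, q)]  # mixture distribution
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

With natural logarithms the divergence lies between 0 (identical histograms) and ln 2 (histograms with disjoint support), so values from different windows are directly comparable.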

3.3 Free probability theory for the calculation of

As is discussed in Section 3.2, is easily obtained by removing principal components from the real data, but the implementation of calculating from the Stieltjes transform for the multiplicative covariance structure is difficult. Here, free probability theory is used to derive the LSD of . The prescription is shown as follows:

  • Obtain the LSDs of , denoted as . Consider the case that involved in (17) are zero-mean with variance and , we can obtain by using the M-P law, namely,

    (20)

    where , , and .

  • Calculate the Stieltjes transform for according to (7), denoted as .

  • From , deduce the corresponding moment generating function according to (11).

  • From , deduce the corresponding S-transform according to (12).

  • Since and are two freely independent random matrices, according to Theorem 1, the S-transform for is calculated as,

    (21)

  • Combining (11), (12) and (21), the polynomial equation for is obtained as (see Appendix A for derivation details),

    (22)
  • Obtain the LSD from through (8).
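The last two steps (solve a polynomial equation, then invert the Green's function) can be sketched numerically. Since (22) is not legible in this copy, the sketch uses a stand-in: the product of two free M-P-distributed matrices with ratios `c1` and `c2`, for which M(z) satisfies the cubic zM = (1 + M)(1 + c1 M)(1 + c2 M), with M(z) = zG(z) − 1. The model, the parameter names, and the root-selection heuristic are all assumptions for illustration.

```python
import numpy as np


def product_density(x, c1, c2, eps=1e-6):
    """Spectral density at x for a product of two free M-P matrices.

    Solves c1*c2*M^3 + (c1+c2+c1*c2)*M^2 + (1+c1+c2-z)*M + 1 = 0
    at z = x + i*eps, maps each root to G = (1 + M) / z, and keeps the
    root with the most negative imaginary part (a heuristic for the
    physical branch, which must satisfy Im G < 0 in the upper half-plane).
    """
    z = complex(x, eps)
    coeffs = [c1 * c2, c1 + c2 + c1 * c2, 1.0 + c1 + c2 - z, 1.0]
    roots = np.roots(coeffs)  # leading zero coefficients are dropped
    greens = (1.0 + roots) / z
    g = greens[np.argmin(greens.imag)]
    return max(0.0, float(-g.imag / np.pi))
```

Setting `c2 = 0` reduces the cubic to the quadratic of the M-P law, so `product_density(x, c, 0.0)` should agree with the closed-form M-P density, which is a useful check on the branch choice.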

In order to approximate as closely as possible, we allow a controllable parameter in the multiplicative covariance model: the ratio rate regarding . Fig. 3 illustrates the spectrum distribution of with different . For small , the spectral density resembles the M-P law. As increases, the shape of the spectrum becomes ‘thinner’ and more heavily tailed, which resembles the inverse of the process of continuously removing factors from the real-world online monitoring data in Section 2. By controlling and simultaneously, our approach is more flexible and accurate in estimating high-dimensional factor models.

Figure 3: The spectrum distribution of with different . With the increase of , the spectrum shape becomes ‘thinner’ and more heavily tailed.

Combining Section 3.2, the proposed factor model estimation approach is summarized as in Algorithm 1.

Algorithm 1: Procedure of factor model estimation
  Input: The observed data matrix .
  Output: The estimated number of factors , and the ratio rate .
  • For the number of removed factors

      • Obtain the real residual through (14).

      • Normalize into the standard form through (5).

      • Calculate the covariance matrix of the standardized residual through (15), i.e., .

      • For the ratio rate

          • Calculate according to the prescriptions in Section 3.3.

          • Calculate the spectral distance through (19) and save the result in each iteration.

      • End for

  • End for

  • Obtain the optimal parameter set through (18).
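The double loop of Algorithm 1 can be sketched as a plain grid search. In the sketch below, `model_density(x, b)` is a hypothetical callback standing in for the LSD of the multiplicative model, and `toy_density` (an M-P density with ratio `b`) is only a placeholder to make the example run. The histogram binning and the row-wise standardization are likewise assumptions.

```python
import numpy as np


def js_divergence(p, q):
    """Jensen-Shannon divergence between two normalized histograms."""
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0
        return float(np.sum(a[mask] * np.log(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def toy_density(x, b):
    """Placeholder model: M-P density with ratio b (not the paper's model)."""
    lo, hi = (1 - np.sqrt(b)) ** 2, (1 + np.sqrt(b)) ** 2
    if x <= lo or x >= hi:
        return 0.0
    return float(np.sqrt((hi - x) * (x - lo)) / (2 * np.pi * b * x))


def estimate_factor_model(X, k_max, b_grid, model_density, bins=50):
    """Grid search over removed factors k and model parameter b.

    Returns (k_hat, b_hat, distance) minimizing the JS divergence.
    """
    X = np.asarray(X, dtype=float)
    N, T = X.shape
    _, vecs = np.linalg.eigh(X @ X.T / T)      # ascending eigenvalues
    best = (None, None, np.inf)
    for k in range(k_max + 1):
        L = vecs[:, N - k:]                    # top-k eigenvectors (empty if k = 0)
        U = X - L @ (L.T @ X)                  # residual after removing k factors
        U = (U - U.mean(axis=1, keepdims=True)) / U.std(axis=1, keepdims=True)
        eig = np.linalg.eigvalsh(U @ U.T / T)
        hist, edges = np.histogram(eig, bins=bins)
        centers = 0.5 * (edges[:-1] + edges[1:])
        p = hist / hist.sum()
        for b in b_grid:
            q = np.array([model_density(x, b) for x in centers])
            if q.sum() <= 0:
                continue                       # model has no mass on these bins
            d = js_divergence(p, q / q.sum())
            if d < best[2]:
                best = (k, b, d)
    return best
```

A call such as `estimate_factor_model(X, 10, np.linspace(0.1, 0.9, 9), toy_density)` returns the pair with the smallest divergence. Replacing `toy_density` with a density derived from the multiplicative model is what would tie the sketch to the approach described here.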

4 Theoretical Studies

In this section, we first evaluate the performance of the proposed approach using synthetic data generated from Monte Carlo experiments, in which different correlation structures are set for the synthetic residuals. We then compare the performance of our approach with that of Yeo and Papanicolaou in terms of detecting weak factors and convergence rate.

4.1 Data Generation

The synthetic data are generated from the model used in Yeo and Papanicolaou’s work yeo2016random . This model is also used in many other studies, such as Bai and Ng bai2002determining , Onatski onatski2010determining , and Ahn and Horenstein ahn2013eigenvalue . The model is written as,

(23)

where

(24)

and

(25)

with . The explanations for this model are as follows:

  • , which makes the residual level controlled only by .

  • , where represents the signal-to-noise ratio and it is defined as .

  • controls the degree of auto-correlations in the residuals.

  • controls the magnitudes of cross-correlations in the residuals.

  • controls the affecting ranges of the cross-correlations in the residuals. Considering the local cross-correlations can be broader with the increase of data dimensions, is usually set to be proportional to .
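Equations (23) to (25) are not legible in this copy. The sketch below assumes the form commonly used in the cited simulation studies: X = LF + √θ·U, with U = sqrt((1 − ρ²)/(1 + 2Jβ²))·e and e_it = ρ·e_{i,t−1} + v_it + β·Σ v_ht summed over the neighbours h within distance J of i. The parameter names (`theta`, `rho`, `beta`, `J`) and the burn-in length are assumptions.

```python
import numpy as np


def generate_factor_data(N, T, r, theta, rho, beta, J, seed=0, burn_in=100):
    """Generate X = L F + sqrt(theta) * U with correlated residuals U.

    rho  : AR(1) coefficient (auto-correlation in time)
    beta : weight of neighbouring innovations (cross-correlation)
    J    : number of neighbours on each side that contribute
    """
    rng = np.random.default_rng(seed)
    L = rng.standard_normal((N, r))            # factor loadings
    F = rng.standard_normal((r, T))            # factors
    v = rng.standard_normal((N, T + burn_in))  # i.i.d. innovations
    mixed = v.copy()
    for h in range(1, J + 1):                  # add neighbours at distance h
        mixed[h:, :] += beta * v[:-h, :]
        mixed[:-h, :] += beta * v[h:, :]
    e = np.zeros_like(mixed)
    e[:, 0] = mixed[:, 0]
    for t in range(1, T + burn_in):            # AR(1) recursion in time
        e[:, t] = rho * e[:, t - 1] + mixed[:, t]
    e = e[:, burn_in:]                         # discard the burn-in period
    U = np.sqrt((1 - rho ** 2) / (1 + 2 * J * beta ** 2)) * e
    X = L @ F + np.sqrt(theta) * U
    return X, U
```

The scaling factor keeps the variance of U close to 1 for interior rows, so that the residual level is governed by `theta` alone, in line with the first bullet above.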

Our simulation experiments are designed to reflect several characteristics of power system data. First, since the signal-to-noise ratio of power data is usually low, is set to small values in the experiments. Next, considering that the main cross-correlations in the residuals can be eliminated by removing factors, is set to be much smaller than , and the effects of different combinations of them are tested. Lastly, different sample sizes are set to test the performance of the proposed approach, and is set to be . Parameter configurations in the Monte Carlo experiment are shown in Table 1.

Sample sizes {50,100,200,300,500}
Number of factors {2,3,4}
1/SNR {1/10000,1/1000,1/100,1/10,1}
Correlations in residuals {(0,0,0),(0.5,0,0),(0,0.05,/10),(0.5,0.05,/10)}
Table 1: Parameter configurations in the Monte Carlo experiment.

4.2 Performance of Our Approach

The performance of our approach is tested by using the generated data in Section 4.1. There are four different residual correlation structures, i.e., no correlation (), auto-correlation-only (), cross(weak)-correlation-only (), auto-cross(weak)-correlation (). The true number of factors is set to be . Average values of the estimated and over simulations are shown in Table 2.

It can be observed that the average estimator is almost equal to the true number of factors for a broad range of and for the cases . For the case , the number of estimated factors is about , because several weak factors caused by the weak cross-correlation of the residuals are present. This indicates that our approach is able to identify weak factors. It can also be observed that the estimators become more accurate as the sample size increases. Meanwhile, varied correlation structures of the residuals are tested in the experiments, and corresponding examples of the fitting results of our approach for the synthetic residuals are shown in Fig. 4. controls the auto-correlation magnitude for the residuals and measures the cross-correlation within the range of in the residuals. As seen from column in Table 2, the estimator is affected by both the auto- and cross-correlations of the residuals, while the estimator is affected by the cross-correlation of the residuals.

3.000 0.5851 3.000 0.7405 10.010 0.6395 2.948 0.7564
3.000 0.5910 2.998 0.7435 10.000 0.6366 3.000 0.7534
3.000 0.6019 3.010 0.7366 10.045 0.6494 3.061 0.7682
3.006 0.5930 3.007 0.7415 10.047 0.6831 2.924 0.7484
3.011 0.5999 3.033 0.7435 10.045 0.6702 3.199 0.7257
3.000 0.5772 3.099 0.7524 10.030 0.6399 3.017 0.7445
3.000 0.5801 3.031 0.7524 10.005 0.6380 3.274 0.7484
3.000 0.5811 2.900 0.7583 10.031 0.6330 3.101 0.7544
3.000 0.5801 3.000 0.7564 10.010 0.6399 3.382 0.7494
3.002 0.5891 3.045 0.7425 10.023 0.6380 3.300 0.7405
3.000 0.6366 3.000 0.7187 10.003 0.6633 3.000 0.7405
3.000 0.6247 2.998 0.7088 10.000 0.6534 2.996 0.7474
3.000 0.6286 3.002 0.7316 10.000 0.6435 3.132 0.7465
3.000 0.6207 2.999 0.7227 10.003 0.6593 2.946 0.7395
3.000 0.6336 3.000 0.7118 10.005 0.6583 3.161 0.7286
3.000 0.5841 3.000 0.7653 10.000 0.6310 3.000 0.7702
3.000 0.5712 3.000 0.7613 10.005 0.6310 3.099 0.7663
3.000 0.5782 2.998 0.7603 10.000 0.6390 3.099 0.7732
3.000 0.5792 3.010 0.7712 10.000 0.6320 3.000 0.7603
3.000 0.5722 3.004 0.7672 10.001 0.6300 3.099 0.7752
Table 2: Average and over 1000 simulations.
Figure 4: (a)-(d) Examples of the fitting results of our approach for the residuals. The sample size and the signal-noise-ratio .

4.3 Comparison with other approaches

In Yeo and Papanicolaou’s work yeo2016random , the estimators from their approach are compared in detail with the BIC3 estimator of Bai and Ng bai2002determining , the ED estimator of Onatski onatski2010determining , and the ER estimator of Ahn and Horenstein ahn2013eigenvalue . The comparison shows that Yeo and Papanicolaou’s approach converges the fastest when the noise level is high and identifies weak factors more reliably than the other methods. In this section, we therefore compare the performance of our free probability (FP) approach with that of Yeo and Papanicolaou’s free random variable (FRV) method.

Figure 5: (a)-(d) Plot of calculated through FRV and FP approaches. Each plot is generated with different noise level: . The residual correlation structure: . Other parameter settings: , and .

Fig. 5 shows the Jensen-Shannon (JS) divergences of and regarding the sample size and the signal-to-noise ratio , calculated through the FRV and FP approaches respectively. In the simulations, the true number of factors was set to be , and . To reflect the characteristics of the real residuals from power data, an auto-cross(weak)-correlation structure is set for the synthetic residuals, i.e., , and . As seen from the figure, the optimal JS divergences calculated through the FP approach are smaller than those from FRV, which indicates that our multiplicative covariance model fits the residuals better than the model based on FRV. Moreover, our estimation approach has a faster convergence rate than FRV, especially for small sample sizes. When the sample size is large, both the FRV and FP approaches converge very well, regardless of the noise levels.

5 Empirical Studies

In this section, we illustrate the proposed approach by using the real-world online monitoring data collected from a power grid and the power flow data generated from IEEE 118-bus test system. We first check how well our built model can fit the residuals from the real data. Then, implications of and are explored by using the power flow data, in which we track the evolutions of and by moving a window on the data at continuous sampling times.
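Tracking an estimator over a moving window can be sketched generically as follows. The helper names and the `estimator` callback are hypothetical. In this setting the callback would return the estimated number of factors and model parameter for the windowed data matrix.

```python
def sliding_windows(n_samples, width, step=1):
    """Yield (start, end) column-index pairs of consecutive windows."""
    for end in range(width, n_samples + 1, step):
        yield end - width, end


def track(X, width, estimator, step=1):
    """Apply estimator to each window of X (a list of rows, one per variable).

    Returns a list of (end_index, estimate) pairs, so each estimate is
    attributed to the last sampling time contained in its window.
    """
    n_samples = len(X[0])
    results = []
    for start, end in sliding_windows(n_samples, width, step):
        window = [row[start:end] for row in X]
        results.append((end, estimator(window)))
    return results
```

Because each estimate is attributed to the end of its window, an event entering the window affects the tracked curve for as many steps as the window is wide, which matches the lag behaviour discussed in Sections 5.2 and 5.3.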

5.1 Fit of our model to real data

The real-world online monitoring data are three-phase voltages collected from monitoring devices installed on the low voltage side of distribution transformers within one feeder. The data are sampled every minutes, and the sampling period runs from 2017/3/1 00:00:00 to 2017/3/31 23:45:00. Thus, a data set is formulated. Instead of taking the entire matrix for analysis, we move a window on the data set at continuous sampling times. Fig. 6 shows several sample fitting results of our multiplicative covariance model to the real residuals. It can be observed that our multiplicative covariance model fits the residuals well, while the M-P law does not. Moreover, the estimated and differ for data sampled at different moments, which explains why the estimators in our approach can be used to indicate the system states.

Figure 6: Fitting results of our multiplicative covariance model to the real residuals. The real data are taken from different sampling times during 2017/3/1 00:00:00 to 2017/3/31 23:45:00. The model with estimated and fits the residuals well. The M-P law is plotted for comparison.

5.2 Implication of

The power flow data generated from the IEEE 118-bus test system zimmerman2011matpower are used to explore the implication of . The IEEE 118-bus test system represents a portion of the U.S. Midwest Electric Power System, and it was edited into IEEE Common Data Format and PECO PSAP Format by Richard Christie from the University of Washington richard1993 . In the early 2000s, researchers from the Illinois Institute of Technology (IIT) worked with the system and added some line characteristics IIT pena2018extended . The one-line diagram of the IEEE 118-bus test system is shown in Fig. 7. It consists of buses, branches, load sides and generators with a total installed capacity of 7220MW.

Figure 7: One-line diagram of the IEEE 118-bus test system, by the Illinois Institute of Technology, version 2003.

In the data generation process, a sudden change of the active load at one bus is considered an anomaly event, and a small amount of white and autoregressive (AR(1)) noise is introduced to represent random fluctuations and measuring errors. Anomaly events can change the data’s cross-correlations. From Section 4.2, we know that is mainly affected by the cross-correlation of the data. Here, in order to explore the relation between the number of anomaly events and , different numbers of anomaly events are set, as shown in Table 3. The generated data contain voltage measurement variables with sampling times, as shown in Fig. 8. Thus, a data set is formulated. In the experiment, we move a window at continuous sampling times on the data set, which enables us to track the temporal evolution of .

Bus Sampling Time Active Load(MW)
20
30
60
Others Unchanged
  • is the signal-to-noise ratio, which is set to be .

  • represents random white gaussian noise.

  • represents the autoregressive noise, and the correlation coefficient is set to be .

Table 3: Assumed signals for active load of bus 20, 30 and 60.

Figure 8: The data generated from the IEEE 118-bus test system. Different numbers of anomaly events are set.

Figure 9: curve.

The time-series of generated with continuously moving windows is shown in Fig. 9. The relations between the number of anomaly events and the parameter are stated as follows:

I. From to , the estimated remains almost constant at . The fitting result of our built model to the residuals during this period of time (such as ) is shown in Fig. 10(a). In the experiment, no strong factors are observed during this period of time. The most likely explanation is that the proposed approach is sensitive to the weak factors caused by random fluctuations or measuring errors and can identify them effectively.

II. From to , two strong factors are observed in the experiment and the average estimated is between and , during which one anomaly event is contained in the moving window. The fitting result of our built model to the residuals during this period of time (such as ) is shown in Fig. 10(b). From to , three strong factors are observed and the average number of estimated factors is between and , during which two anomaly events are contained in the moving window. The fitting result of our built model to the residuals during this period of time (such as ) is shown in Fig. 10(c). From , four strong factors are observed and the average estimated is about , during which three anomaly events are contained in the moving window. The fitting result of our built model to the residuals during this period of time (such as ) is shown in Fig. 10(d). It can be concluded that is driven by the number of anomaly events.

III. From to , the value of decreases by every other sampling times, because the width of the moving window is and the number of anomaly events contained in the moving window decreases by every sampling times. It validates the conclusion that is driven by the number of anomaly events.

IV. From , no strong factors are observed and remains nearly , which validates that the proposed approach is sensitive to the weak factors caused by random fluctuations or measuring errors.

Figure 10: (a)-(d) The fitting results of our model to the residuals. The data are generated from the IEEE 118-bus test system and different numbers of anomaly events are set. Our model with estimated and fits the residuals well. The M-P law is plotted for comparison.

5.3 Implication of

From Section 4.2, we know that is affected both by the cross- and auto-correlation of the data in our approach. The number of anomaly events can cause the variation of the data’s cross-correlations. In this section, we first explore how the number of anomaly events affects by using the generated data in Fig. 8. In the experiment, a window is moved on the data set at continuous sampling times and the generated curve is shown in Fig. 11(a). The relations between the number of anomaly events and are stated as follows:

I. From to , no anomaly events occur and remains almost constant.

II. From to , increases by every other sampling times, because the number of anomaly events contained in the moving window increases by every sampling times. From to , decreases by every other sampling times, because the number of anomaly events contained in the moving window decreases by every sampling times. This shows that is positively affected by the number of anomaly events contained in the moving window, because the cross-correlations of the residuals vary with the number of anomaly events. It supports our assumption that the cross-correlation of the residuals cannot be completely eliminated by removing factors, i.e., the weak cross-correlation structure assumption for the residuals.

III. From , no anomaly events are contained in the moving window and returns to a constant and remains afterwards.

Figure 11: curve. (a) The value of with different numbers of anomaly events. (b) The value of with different scales of anomaly events.

Meanwhile, the scale of anomaly events affects the data’s auto-correlations. Here, we explore how the scale of anomaly events affects . Assumed events with different scales are set for bus 20, as shown in Table 4. The generated data contains voltage measurements with sampling times. A window is slid over the data set at consecutive sampling times, and the generated curve is shown in Fig. 11(b). The relations between the scale of anomaly events and are stated as follows:

Bus | Sampling Time | Active Power (MW)
20 | |
Others | | Unchanged
  • The parameters are set the same as in Table 3.

Table 4: Assumed events with different scales set for bus 20.

I. From to , the estimated remains almost constant, which indicates that no anomaly events occur and the system operates in a normal state.

II. From to , the curves are almost inverted U-shaped, because the anomaly events in Table 4 are set and the delay lags of anomaly events to are equal to the moving window’s width. Note that the estimated corresponding to the anomaly event of the active power (AP) from to has the largest value and that of the AP from to has the smallest value, which indicates that increases with the scale of anomaly events, because the scale of anomaly events is positively related to the auto-correlation of the residuals from the power data.

III. From , the estimated returns to a constant value and stays there afterwards, which indicates that the system has returned to a normal state.
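
To make the link between auto-correlation and the residual spectrum concrete, the sketch below compares the eigenvalues of a sample covariance matrix built from i.i.d. noise with those built from auto-correlated noise. The AR(1) process, its coefficient `b`, and the matrix sizes are illustrative assumptions and are not taken from the paper's model; the only point is that auto-correlation pushes the spectrum beyond the M-P upper edge, which is why a fit to the ESD can pick it up.

```python
import numpy as np

def ar1_noise(n, t, b, rng):
    """n independent AR(1) series of length t with unit stationary variance."""
    eps = rng.standard_normal((n, t))
    x = np.empty((n, t))
    x[:, 0] = eps[:, 0]
    scale = np.sqrt(1.0 - b ** 2)
    for k in range(1, t):
        x[:, k] = b * x[:, k - 1] + scale * eps[:, k]
    return x

def covariance_eigenvalues(x):
    """Eigenvalues (ascending) of the sample covariance (1/T) X X^T."""
    return np.linalg.eigvalsh(x @ x.T / x.shape[1])

rng = np.random.default_rng(1)
n, t = 100, 400
mp_upper_edge = (1.0 + np.sqrt(n / t)) ** 2  # M-P upper edge for unit variance

eig_iid = covariance_eigenvalues(ar1_noise(n, t, 0.0, rng))  # b = 0: no auto-correlation
eig_ar = covariance_eigenvalues(ar1_noise(n, t, 0.6, rng))   # b = 0.6: auto-correlated
print(eig_iid.max(), eig_ar.max(), mp_upper_edge)
```

With `b = 0` the largest eigenvalue stays close to the M-P edge, while with `b = 0.6` it lies clearly above it.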

6 Conclusions

The spectrum from real-world power data is complex and cannot be trivially dissected by the M-P law. In this paper, we propose a new approach to estimating factor models that connects the estimation of the number of factors to fitting the ESD of the covariance matrices of the residuals. Because power data contain considerable measurement noise and the correlation structure of their residuals is uncertain, our approach approximates the ESD of the covariance matrices of the residuals with a multiplicative covariance structure model, which avoids making crude assumptions or simplifications about the complex correlation structure of the data. The free probability techniques in random matrix theory keep the computation of the spectral density of this model efficient.

Theoretical studies show that the proposed approach is robust to noise and sensitive to weak factors. The multiplicative covariance structure model fits the ESD of the covariance matrices of the residuals better, and converges faster, than the model developed by Yeo and Papanicolaou. Empirical studies show that the estimators in our approach effectively characterize the number and scale of anomaly events in a power system, so they can be used to indicate the system states.

Appendix A

Let be an random matrix whose entries are independent and identically distributed (i.i.d.) variables with mean and variance . The covariance matrix of is calculated as

(26)

As but , according to the M-P law, the spectral density of is obtained as

(27)

where , , and .
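
As a numerical illustration of the M-P law invoked here, the sketch below evaluates the standard M-P density and compares its support with the eigenvalues of a simulated covariance matrix. The notation is an assumption made for the example: `c = N/T` is the ratio of matrix dimensions (taken smaller than 1), `sigma2` is the entry variance, the edges are `sigma2 * (1 ± sqrt(c))**2`, and the function name `mp_density` is hypothetical.

```python
import numpy as np

def mp_density(lam, c, sigma2=1.0):
    """Standard M-P density for ratio c = N/T < 1 and entry variance sigma2.
    Returns (density values, lower edge, upper edge)."""
    lam = np.asarray(lam, dtype=float)
    lower = sigma2 * (1.0 - np.sqrt(c)) ** 2
    upper = sigma2 * (1.0 + np.sqrt(c)) ** 2
    rho = np.zeros_like(lam)
    inside = (lam > lower) & (lam < upper)
    rho[inside] = (np.sqrt((upper - lam[inside]) * (lam[inside] - lower))
                   / (2.0 * np.pi * c * sigma2 * lam[inside]))
    return rho, lower, upper

# Simulated N x T matrix with i.i.d. standard normal entries (assumed sizes).
rng = np.random.default_rng(2)
n, t = 200, 800
x = rng.standard_normal((n, t))
eigs = np.linalg.eigvalsh(x @ x.T / t)

grid = np.linspace(0.0, 3.0, 30001)
rho, lower, upper = mp_density(grid, c=n / t)
# Total mass of the density, using the trapezoidal rule written out by hand.
mass = np.sum(0.5 * (rho[1:] + rho[:-1]) * np.diff(grid))
```

The density integrates to one over its support, and the simulated eigenvalues fall within the two edges up to finite-size fluctuations.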

According to (7), the Green’s function of is obtained as , which can be integrated using (10) to obtain the moment generating function . Solving (12) gives the S-transforms

(28)

Then the S-transform of is calculated as

(29)

According to (12), the inverse function of the moment generating function is calculated as,

(30)

and the moment generating function fulfills the equation

(31)

Substituting (11) into (31), we obtain

(32)

which can be simplified as

(33)
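
The Green's function route can be cross-checked numerically in the simplest single-matrix case. Assuming the convention G(z) = (1/N) tr (z - S)^{-1} for a covariance matrix S with ratio `c = N/T` and entry variance `sigma2`, the Green's function of the M-P law satisfies the quadratic equation c·sigma2·z·G² - (z - sigma2·(1 - c))·G + 1 = 0. This is a standard identity stated independently of the numbered equations in this appendix, and the function names below are hypothetical. The sketch evaluates the empirical Green's function of a simulated matrix at one complex point and measures how closely it satisfies the equation.

```python
import numpy as np

def empirical_green(eigs, z):
    """G(z) = (1/N) * sum_i 1 / (z - lambda_i), for z off the real axis."""
    return np.mean(1.0 / (z - eigs))

def mp_quadratic_residual(g, z, c, sigma2=1.0):
    """Left-hand side of c*sigma2*z*G^2 - (z - sigma2*(1 - c))*G + 1 = 0."""
    return c * sigma2 * z * g ** 2 - (z - sigma2 * (1.0 - c)) * g + 1.0

# Simulated N x T matrix with i.i.d. standard normal entries (assumed sizes).
rng = np.random.default_rng(3)
n, t = 300, 1200
x = rng.standard_normal((n, t))
eigs = np.linalg.eigvalsh(x @ x.T / t)

z = 3.0 + 0.5j  # evaluation point away from the real axis
g = empirical_green(eigs, z)
residual = abs(mp_quadratic_residual(g, z, c=n / t))
```

The residual is small for matrices of this size and shrinks as the dimensions grow with the ratio held fixed.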

References

  • (1) J. Bai, S. Ng, Determining the number of factors in approximate factor models, Econometrica 70 (1) (2002) 191–221.
  • (2) A. Lewbel, The rank of demand systems: theory and nonparametric estimation, Econometrica: Journal of the Econometric Society (1991) 711–730.
  • (3) G. Connor, R. A. Korajczyk, A test for the number of factors in an approximate factor model, The Journal of Finance 48 (4) (1993) 1263–1291.
  • (4) J. G. Cragg, S. G. Donald, Inferring the rank of a matrix, Journal of Econometrics 76 (1-2) (1997) 223–250.
  • (5) M. Forni, L. Reichlin, Let’s get real: a factor analytical approach to disaggregated business cycle dynamics, The Review of Economic Studies 65 (3) (1998) 453–473.
  • (6) J. H. Stock, M. W. Watson, Forecasting using principal components from a large number of predictors, Journal of the American Statistical Association 97 (460) (2002) 1167–1179.
  • (7) G. Kapetanios, A new method for determining the number of factors in factor models with large datasets, Tech. rep., Working Paper, Department of Economics, Queen Mary, University of London (2004).
  • (8) G. Kapetanios, A testing procedure for determining the number of factors in approximate factor models with large datasets, Journal of Business & Economic Statistics 28 (3) (2010) 397–409.
  • (9) A. Onatski, Determining the number of factors from empirical distribution of eigenvalues, The Review of Economics and Statistics 92 (4) (2010) 1004–1016.
  • (10) M. Harding, Estimating the number of factors in large dimensional factor models, Journal of Econometrics.
  • (11) J. Yeo, G. Papanicolaou, Random matrix approach to estimation of high-dimensional factor models, arXiv preprint arXiv:1611.05571.
  • (12) V. A. Marčenko, L. A. Pastur, Distribution of eigenvalues for some sets of random matrices, Mathematics of the USSR-Sbornik 1 (4) (1967) 457.
  • (13) J. A. Mingo, R. Speicher, Free probability and random matrices, Vol. 4, Springer, 2017.
  • (14) R. Speicher, Multiplicative functions on the lattice of non-crossing partitions and free convolution, Mathematische Annalen 298 (1) (1994) 611–628.
  • (15) D. V. Voiculescu, K. J. Dykema, A. Nica, Free random variables, no. 1, American Mathematical Soc., 1992.
  • (16) S. C. Ahn, A. R. Horenstein, Eigenvalue ratio test for the number of factors, Econometrica 81 (3) (2013) 1203–1227.
  • (17) R. D. Zimmerman, C. E. Murillo-Sánchez, R. J. Thomas, et al., Matpower: Steady-state operations, planning, and analysis tools for power systems research and education, IEEE Transactions on Power Systems 26 (1) (2011) 12–19.
  • (18) R. Christie, Power systems test case archive, Aug. 1993, [Accessed Feb. 4, 2015]. [Online]. Available: http://www.ee.washington.edu/research/pstca/pf118/pg_tca118bus.htm.
  • (19) IIT, Index of data, Illinois Institute of Technology, Chicago, IL, USA, [Online]. Available: http://motor.ece.iit.edu/data/.
  • (20) I. Pena, C. B. Martinez-Anido, B.-M. Hodge, An extended IEEE 118-bus test system with high renewable penetration, IEEE Transactions on Power Systems 33 (1) (2018) 281–289.