Process monitoring is increasingly important for guaranteeing process safety [11, 1, 26, 22, 33]. Approaches for stationary processes have been intensively investigated, and considerable achievements have been obtained [32, 16, 29, 28, 25]. However, process data are generally nonstationary owing to varying load, changes in raw materials, equipment aging, product grade transitions, etc. [19, 27]. This phenomenon is ubiquitous in industrial systems such as power stations, oil exploration and chemical processes. Investigating monitoring techniques for nonstationary processes under various potential operating conditions is therefore both urgent and challenging.
Recently, several methods have been developed for nonstationary process monitoring. Canonical variate analysis extracts dynamic latent information through state space formulations; it aims to reduce the order of the dynamics and is generally applied to linear systems. Dynamic latent variable models (DLVMs) extract dynamic and static latent components simultaneously, extracting the most predictable information first. The switching autoregressive DLVM was proposed for multimode processes in a probabilistic framework, and it requires that the model cover all operating modes. The aforementioned methods fail to distinguish real faults from normal deviations under varying operating conditions, and thus deliver high false alarm rates. To settle this issue, slow feature analysis (SFA) was proposed to distinguish real faults from operating point deviations by separating dynamic information from steady-state information. However, SFA requires that the system operate in a particular steady condition, which makes it unsuitable for frequently varying operating conditions.
Cointegration analysis (CA) is an effective method for dealing with nonstationary data [5, 10] and is able to distinguish real faults from normal dynamic changes under various operating conditions. It is based on the general consensus that a long-term equilibrium relationship, i.e., a cointegration relationship, exists in physical and chemical processes because the nonstationary variables are correlated with each other and governed by specific laws. When the cointegration relationship is broken, the system enters a new mode if the dynamic equilibrium relationship returns to normal. The cointegration testing method was adopted primarily for nonstationary process monitoring in . Zhao et al. intensively investigated CA and proposed several extensions, including a dynamic distributed strategy for large-scale processes and the combination of CA with SFA to establish a full-condition monitoring model.
However, these CA-based methods assume that the cointegration relationship remains the same [37, 36], which is unrealistic in practical systems. Take the coal pulverizing system of a power plant as an example. The composition and characteristics of a given coal may change slowly because they are influenced by the environment, and it is difficult to mix the coal evenly. Thus, the cointegration relationship changes accordingly. Hansen et al. proposed a recursive form of CA [7], which could update the cointegration relationship to adapt to the new condition. Another form of recursive CA was proposed to adapt to a slowly changing cointegration relationship, with the model updated based on a block of data. However, the monitoring results are affected by the length of the data block, and determining the optimal value is intractable. Moreover, only the dynamic information that reflected the control performance was extracted and the remaining information was neglected, which makes the method insensitive to faults that are orthogonal to the cointegration space.
It is also common in practical industrial systems for the cointegration relationship to change sharply and frequently. For instance, the type of coal burned in power plants changes frequently owing to environmental requirements and economic benefits, and the composition and calorific value of different coals vary greatly. Assume that the process variables can be sorted into three blocks: one block of variables shares a similar trend, one block represents the critical manipulated variables, and the remaining variables form the third block. Then the cointegration relationship and some manipulated variables may transition from one steady state to another. It has been noted that a new CA model must then be established from scratch, so as to adjust quickly to the new cointegration relationship based on the newly collected data. However, recursive CA fails to provide good performance in this case because it requires abundant data.
To address the issues mentioned above, this paper investigates general nonstationary process monitoring, where the cointegration relationship and the manipulated variables change frequently from one steady state to another. First, in order to distinguish real faults from normal dynamic deviations, a novel version of recursive cointegration analysis (RCA) is proposed to track the long-term equilibrium relationship, where the CA model is updated once a new sample arrives. Based on RCA, we introduce recursive principal component analysis (RPCA) to deal with the remaining information, thus establishing a comprehensive monitoring framework. For convenience, RCA with RPCA is denoted as RCA-RPCA.
When RCA recognizes that the cointegration relationship is broken and the dynamic equilibrium relationship quickly returns to normal, the system has entered a new operating mode, and the RCA and RPCA models need to be retrained from scratch on the new data. Here, we employ elastic weight consolidation (EWC) to settle the ‘catastrophic forgetting’ issue of RPCA, where the significant information that is influential in previous modes is preserved to avoid dramatic performance degradation for similar operating modes. For convenience, the proposed RCA-RPCA with EWC is referred to as RCA-RPCA-EWC. In addition, test statistics are established based on prior knowledge and CA theory, which makes them more sensitive to normal changes than those of recursive CA.
The rest of this paper is organized as follows. Section II introduces the problem and reviews the basic theory of CA. Section III presents the detailed procedure of the proposed RCA and summarizes the monitoring algorithm based on RCA-RPCA. The proposed RCA-RPCA is then extended to multimode processes in Section IV, where EWC is employed to overcome the ‘catastrophic forgetting’ issue of RPCA when a new mode appears. Section V summarizes the general procedure for nonstationary process monitoring, analyzes the computational complexity and compares the method with state-of-the-art approaches. The effectiveness is illustrated on a practical industrial system in Section VI. Concluding remarks are presented in Section VII.
II Problem formulation and preliminaries
II-A Problem statement
Practical applications involve many variables with different trends, and how these variables are handled strongly affects the monitoring performance. Take the coal pulverizing system of a power plant as an example.
The variables are affected by load and types of coal. Some of the variables are shown in Fig. 1. The variables are decomposed into three blocks based on prior knowledge and the augmented Dickey-Fuller (ADF) test, where the final grouping relies on prior knowledge and the ADF test serves as an auxiliary check to enhance universality. Variables in Fig. 1 are nonstationary and share a common trend, and are normally influenced by the varying load. For variables in Fig. 1, the uppermost variable is regulated by controllers, and this manipulated variable is expected to move from one steady state to another if the type of coal changes. The other three variables change slowly or irregularly. An appropriate method needs to be investigated to deal with data of different characteristics, thus delivering an optimal monitoring performance. Note that this variable grouping makes the proposed method sensitive to changes in the manipulated variables and to mode switching.
This paper studies the general case of sequential nonstationary process monitoring, where the stationary variables and the cointegration relationship change from one steady state to another. RCA processes the data with a common trend to extract long-term equilibrium information, and RPCA is adopted to deal with the other variables to extract short-term dynamics, thus constructing a comprehensive monitoring framework for nonstationary processes. When the cointegration relationship or the stationary variables change, the system enters a new mode and EWC is adopted to preserve the significant information of previous modes, thus delivering excellent performance for successive modes based on a single model.
II-B Conventional CA algorithm
Given the nonstationary time series with . The reference mean
and reference standard deviation are calculated as
where is the th variable at th sampling instant, is the vector of all ones with appropriate dimension. Thus, the original data are normalized as
The vector error-correction (VEC) model is described as:
where , is the order of VEC model and determined by AIC.
is the Gaussian white noise with. , where and are of full rank . The columns in are cointegration vectors. The objective of CA is to determine to make the equilibrium errors as stationary as possible.
The maximum likelihood estimation of cointegration vectors in
is acquired by eigenvalue decomposition (EVD)
where , () is the prediction error and calculated by
where is the difference matrix, the vector is the temporal difference between two neighboring data points. originates from the observation matrix . is the augmented matrix which contains lagged observations. The specific structures are described as
The coefficients and
are obtained by ordinary least squares (OLS). Actually, (6) can be reformulated as
where , , the generalized eigenvalues are listed in the descending order.
contains the generalized principal eigenvectors corresponding to the largest eigenvalues and is determined by the trace test . The cointegration matrix and dynamic cointegration matrix are acquired from , namely, . More information about CA can be found in [5, 10].
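For orientation, the textbook Johansen form of the procedure summarized above can be written as follows. The symbols used here ($\Pi$, $\alpha$, $\beta$, $\Gamma_i$, $R_{0t}$, $R_{1t}$, $S_{ij}$, $N$) follow a common econometrics convention and are assumptions of this sketch; they are not necessarily the notation of this paper.

```latex
% VEC model of order p for an m-dimensional series x_t
\Delta x_t = \Pi x_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \, \Delta x_{t-i} + \varepsilon_t,
\qquad \Pi = \alpha \beta^{\top}, \quad \alpha, \beta \in \mathbb{R}^{m \times r}

% R_{0t}, R_{1t}: OLS residuals of \Delta x_t and x_{t-1} regressed on the lagged differences
S_{ij} = \frac{1}{N} \sum_{t=1}^{N} R_{it} R_{jt}^{\top}, \qquad i, j \in \{0, 1\}

% Cointegration vectors: generalized eigenvectors belonging to the r largest eigenvalues
S_{10} S_{00}^{-1} S_{01} \, \beta_k = \lambda_k \, S_{11} \, \beta_k,
\qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_m
```

In this convention, the equilibrium errors $\beta^{\top} x_t$ are the series that should be stationary while the cointegration relationship holds.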
III The proposed RCA-RPCA for process monitoring
In this section, we propose RCA to adapt to a new cointegration relationship once a new sample arrives. The RCA problem is formulated as a recursive generalized EVD problem and solved by standard EVD. In addition, four test statistics are constructed according to prior knowledge and RCA-RPCA theory.
According to prior knowledge and the ADF test, this paper divides the variables into three blocks. One block contains the nonstationary variables with a common trend, which are processed by RCA and labeled as . One block contains the stationary variables that are sensitive to operating conditions and are denoted as . Generally, contains the critical manipulated variables and is especially significant for industrial systems. The remaining block includes the variables independent of working conditions, which are expected to be stationary or to change with the external environment, and is labeled as . Note that the variables need not be divided into exactly three blocks for every industrial system; the partition depends on the system characteristics and change regularities. The monitoring framework proposed in this paper applies equally to such cases.
III-A Recursive cointegration analysis
We establish the initial CA model as in Section II-B. If the cointegration relationship changes slowly, the collected data are preprocessed with the fixed mean and standard deviation, as described in (1-2). The procedure of RCA is presented below.
The prediction errors are
According to recursive OLS, and are determined by:
where , , . Let , the recursion of is
is the identity matrix with appropriate dimension. Thus, and are calculated recursively. Similarly,
where , .
where , , .
where , . Obviously, , . Thus, . Similarly, . and are also calculated recursively, as described in Appendix -B.
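The recursive OLS step used above follows the familiar recursive least-squares pattern, in which a rank-1 (Sherman-Morrison) update replaces repeated matrix inversion. The following is a minimal sketch under stated assumptions: a scalar output, a symmetric initial matrix, and the hypothetical names `rls_update`, `theta`, `P`, `z` and `y`, none of which come from the paper's own recursions.

```python
def rls_update(theta, P, z, y):
    """One recursive least-squares step for the scalar model y ~ z . theta.

    theta: current coefficient estimates (list of floats)
    P:     current symmetric inverse-Gram matrix estimate (list of lists)
    z:     new regressor vector (list of floats)
    y:     new scalar observation
    """
    n = len(z)
    # P @ z; because P is symmetric this is also (z^T P)^T.
    Pz = [sum(P[i][j] * z[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(z[i] * Pz[i] for i in range(n))
    # Sherman-Morrison rank-1 update: no matrix inversion is needed.
    P_new = [[P[i][j] - Pz[i] * Pz[j] / denom for j in range(n)] for i in range(n)]
    gain = [Pz[i] / denom for i in range(n)]
    # Correct the estimate with the one-step prediction error.
    err = y - sum(z[i] * theta[i] for i in range(n))
    theta_new = [theta[i] + gain[i] * err for i in range(n)]
    return theta_new, P_new


# Usage: noise-free data generated with coefficients [2, -1].
theta, P = [0.0, 0.0], [[1e6, 0.0], [0.0, 1e6]]  # large P: weak prior (assumed)
for z, y in [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0), ([2.0, 1.0], 3.0)]:
    theta, P = rls_update(theta, P, z, y)
```

Each step costs a fixed number of operations in the regressor dimension, so the per-update work does not depend on how many samples have already been processed.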
The proposed RCA is reformulated as the following generalized EVD problem:
where is the diagonal matrix and elements are generalized eigenvalues with descending order.
III-B Numerically efficient solution for recursive CA
In this paper, we convert the generalized EVD problem into a standard symmetric EVD problem. As is symmetric and positive definite, let , (22) can be reformulated as
where is positive definite, . Computing directly at each update may be ill-conditioned, so it is essential to derive the recursion of and avoid repeatedly inverting a matrix. The detailed derivation is presented in Appendix -C. To further reduce the computational burden, the recursion of is obtained based on the rank of . The procedure for (22) is summarized in Algorithm 1.
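As a reminder of the standard reduction behind this kind of step, a generalized symmetric-definite eigenproblem can be turned into an ordinary symmetric one as sketched below. The symbols $A$, $B$, $w$, $v$ and $M$ are generic placeholders assumed for this illustration and are not the paper's notation.

```latex
% Generalized problem with A symmetric and B symmetric positive definite
A w = \lambda B w

% Substitute v = B^{1/2} w and left-multiply by B^{-1/2}
\underbrace{B^{-1/2} A B^{-1/2}}_{M} \, v = \lambda v, \qquad w = B^{-1/2} v

% M is symmetric, so its eigenvalues are real and its eigenvectors can be chosen orthonormal
M = M^{\top}, \qquad v_i^{\top} v_j = \delta_{ij} \;\Longleftrightarrow\; w_i^{\top} B w_j = \delta_{ij}
```

The eigenvalues are unchanged by the substitution, so only the eigenvectors need to be mapped back through $B^{-1/2}$.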
III-C Monitoring statistics
In this section, we construct the monitoring statistics to judge the operating conditions. The proposed RCA is utilized to extract the long-term equilibrium information and the short-term dynamic features are handled by RPCA. The key steps of RPCA have been elaborated in Appendix -A.
At instant, a new sample is collected and preprocessed as . Let . The cointegration matrix and dynamic cointegration matrix are generated from generalized eigenvectors in Algorithm 1.
is designed to judge whether the long-term static equilibrium relationship is still preserved.
is designed to monitor the long-term dynamic equilibrium relationship.
where the prediction error is the last sample of .
IV Multimode process monitoring with EWC
In this section, we extend the nonstationary monitoring technique to multimode processes. Here, we define a mode as an operating regime in which the stationary variables and the long-term static equilibrium fluctuate within an acceptable range, which can be measured by and statistics. Note that the data are still nonstationary within one mode.
When the system moves from one steady operating condition to another, the data distribution may change accordingly. Meanwhile, the cointegration relationship and the stationary variables may also vary dramatically. It has been shown that a recursive CA strategy based on all collected data is unreasonable in this case and may lead to high false alarm rates. RPCA also fails to track the rapid changes accurately. It is therefore essential to rebuild the proposed RCA-RPCA monitoring model from scratch. However, similar to most machine learning approaches [23, 31, 14, 34], RPCA suffers from the ‘catastrophic forgetting’ issue, and most information about the previous modes is overwritten when a new model is built. To settle this issue, EWC is employed at the initial training phase of RPCA, where significant information from influential variables in previous modes is reinforced to avoid drastic changes. Thus, the proposed RCA-RPCA-EWC method can deliver outstanding monitoring performance when similar or existing operating modes reappear.
Here, we introduce the procedure of RPCA with EWC (RPCA-EWC), which is similar to PCA with EWC in . Let denote the projection matrix for the previous operating mode. When a new mode is detected by RCA, the initial collected short-term dynamic data are denoted as . The off-line training model of RPCA is built with EWC, thus the objective is designed as
where the hyperparameter measures the importance of previous modes. The matrix is positive semidefinite, which is influenced by and determined by [34, 9]. The constraint is with , is the number of principal components and determined by the cumulative percent variance (CPV) approach. is the loss function of RPCA for the current mode. is the loss function which measures the deviation of key parameters between two successive operating modes.
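To make the structure of such an objective concrete, the sketch below evaluates a generic EWC-style objective: a current-mode loss plus a weighted quadratic penalty on the deviation of the projection matrix from the previous mode. The quadratic form tr((P - P_prev)^T Omega (P - P_prev)) and the names `ewc_penalty`, `ewc_objective`, `P`, `P_prev`, `Omega` and `lam` are assumptions made for illustration; the paper's objective (28) may differ in detail.

```python
def ewc_penalty(P, P_prev, Omega):
    """Quadratic penalty tr((P - P_prev)^T Omega (P - P_prev)).

    P, P_prev: m x k projection matrices (lists of lists)
    Omega:     m x m positive semidefinite importance matrix
    """
    m, k = len(P), len(P[0])
    D = [[P[i][j] - P_prev[i][j] for j in range(k)] for i in range(m)]
    # OD = Omega @ D
    OD = [[sum(Omega[i][l] * D[l][j] for l in range(m)) for j in range(k)]
          for i in range(m)]
    # tr(D^T Omega D) equals the elementwise sum of D * (Omega @ D).
    return sum(D[i][j] * OD[i][j] for i in range(m) for j in range(k))


def ewc_objective(current_loss, P, P_prev, Omega, lam):
    """Current-mode loss plus lam times the deviation penalty."""
    return current_loss + lam * ewc_penalty(P, P_prev, Omega)
```

A larger `lam` keeps the new projection closer to the previous one along the directions that `Omega` marks as important, while `lam = 0` recovers training on the new mode alone.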
The objective function (28) is a difference of convex functions (DC) programming problem [20, 24]. DC programming proceeds by linearizing the subtracted convex function and then solving the resulting convex problem. The specific derivation has been described in  and some key steps are listed in Appendix -D. The solution is summarized in Algorithm 2. Note that the matrix measures the importance of parameters and should be updated before a new mode appears. The calculation method can be found in [9, 17, 34].
In summary, when a new mode is identified by RCA, the RCA model is rebuilt from scratch following the procedure in Section III. RPCA-EWC is adopted at the initial training phase and the parameters are then updated by (32-35), thus avoiding abrupt degradation of monitoring performance when similar modes are revisited.
V Monitoring algorithm
State-of-the-art approaches explore the nonstationary processes for a single mode [15, 13, 37, 36], where the stationary variables and the long-term static equilibrium fluctuate within a certain range. When the operating mode changes, the data distribution may vary accordingly and the original cointegration relationship is broken. This section introduces the general monitoring framework for multimode processes, which is also appropriate for a single mode.
Similar to Section III, the normal data are divided into three blocks. At the training phase, the long-term equilibrium information is extracted by CA and PCA is utilized to monitor the remaining short-term dynamic information. Four test statistics are calculated by (24-27), where and are employed to identify the operating status and and SPE are utilized to monitor the short-term dynamics. The corresponding thresholds are calculated by kernel density estimation (KDE). The off-line training procedure is summarized in Algorithm 3.
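A KDE-based threshold of the kind mentioned above can be sketched as follows: fit a Gaussian kernel density to the statistic values from normal training data and take the point where the estimated cumulative distribution reaches the chosen confidence level. The Gaussian kernel, Silverman's rule-of-thumb bandwidth, the 0.99 default and the names `kde_cdf` and `kde_threshold` are assumptions of this sketch rather than choices stated in the paper.

```python
import math
import statistics


def kde_cdf(t, samples, h):
    """CDF of a Gaussian kernel density estimate with bandwidth h, evaluated at t."""
    total = sum(0.5 * (1.0 + math.erf((t - x) / (h * math.sqrt(2.0)))) for x in samples)
    return total / len(samples)


def kde_threshold(samples, confidence=0.99):
    """Threshold t at which the estimated CDF reaches `confidence`.

    Requires at least two samples with nonzero spread.
    """
    n = len(samples)
    # Silverman's rule of thumb for the bandwidth (an assumption of this sketch).
    h = 1.06 * statistics.stdev(samples) * n ** (-0.2)
    lo, hi = min(samples) - 10.0 * h, max(samples) + 10.0 * h
    for _ in range(100):  # bisection on the monotone CDF
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, samples, h) < confidence:
            lo = mid
        else:
            hi = mid
    return hi
```

Because the threshold comes from the estimated distribution rather than a Gaussian assumption, the same routine can be reapplied whenever the thresholds are refreshed with new normal data.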
In practical industrial applications, when a new sample arrives, the operating status is judged and the monitoring model is updated if the status is normal, as described in Algorithm 4. The thresholds are updated by KDE. Note that an occasional anomaly is regarded as noise or disturbance; a fault is declared only if the anomaly persists for a short period of time.
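The distinction between an occasional anomaly and a persistent one can be implemented with a simple run-length check. This is a minimal sketch; the function name `confirmed_alarms` and the window length `min_run` are hypothetical, and the paper does not state the duration it uses.

```python
def confirmed_alarms(exceeds, min_run=3):
    """Flag sample k only when the last `min_run` samples all exceeded the threshold.

    exceeds: iterable of booleans, True when a statistic is over its threshold
    """
    flags, run = [], 0
    for e in exceeds:
        run = run + 1 if e else 0  # length of the current run of exceedances
        flags.append(run >= min_run)
    return flags
```

With `min_run=3`, the sequence `[True, False, True, True, True, True]` is flagged only from the fifth sample onward, so the isolated first exceedance is treated as a disturbance.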
The monitoring rule is summarized below:
1) If all test statistics are within their thresholds, the process is regarded as operating normally in the same operating mode, and the proposed RCA-RPCA is still employed to update the parameters;
2) If , and return to normal after is over its threshold, the system has entered a new operating mode and RCA-RPCA-EWC is adopted to monitor the system;
3) If and are within their thresholds, while or is over its threshold, a fault may have occurred and it is essential to check the operation of the system;
4) If all test statistics exceed their thresholds, the process is out of control; a real fault is detected and the alarm is triggered.
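The rule above can be sketched as a small decision function. This is an illustrative reading only: it assumes that the two long-term statistics identify the operating status and that the other two statistics monitor the short-term dynamics, and it reduces the "returns to normal after an exceedance" condition to a single flag. All names (`classify_status`, `static_over`, `dynamic_over`, `t2_over`, `spe_over`, `recovered_after_break`) are hypothetical.

```python
def classify_status(static_over, dynamic_over, t2_over, spe_over,
                    recovered_after_break=False):
    """Map threshold exceedances of the four statistics to an operating status.

    static_over, dynamic_over: long-term statistics over their thresholds
    t2_over, spe_over:         short-term statistics over their thresholds
    recovered_after_break:     long-term statistics just returned to normal
                               after an exceedance (tracked by the caller)
    """
    flags = (static_over, dynamic_over, t2_over, spe_over)
    if all(flags):
        return "fault"            # rule 4: everything exceeds
    long_term_ok = not static_over and not dynamic_over
    if long_term_ok and recovered_after_break:
        return "new_mode"         # rule 2: equilibrium re-established
    if not any(flags):
        return "normal"           # rule 1: keep updating the model
    if long_term_ok:
        return "possible_fault"   # rule 3: only short-term statistics exceed
    return "undetermined"         # combination not covered by the four rules
```

Combinations that the four rules do not cover are reported as `"undetermined"` rather than guessed, which keeps the sketch faithful to what the rule actually states.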
V-A Computational complexity analysis
For the online monitoring phase, the computational complexity comprises the computation of RCA and RPCA at each step, and of RPCA-EWC when the operating mode changes. The RCA and RPCA algorithms account for most of the computational cost and are therefore the ones considered in this paper.
For RCA, the computation focuses on Algorithm 1. The complexity of and is , as illustrated in Appendix -B. The complexity of is in Appendix -C. Then, the calculation of needs flops. The symmetric QR algorithm requires at most flops theoretically because is a block skew diagonal matrix. The calculation of in Algorithm 1 requires flops. In summary, the computational complexity of RCA is per update. For RPCA, the complexity of and is . Obviously, , , is the dimension of collected data. That is, the computation per update will not grow as the number of samples increases.
V-B Comparison and Discussion
We compare the recursive CA  with the proposed RCA-RPCA-EWC method below:
Model tracking accuracy. The recursive CA model is updated based on a block of data, and determining the block length that delivers the optimal performance is intractable. In contrast, the proposed RCA model is updated once a new normal sample arrives. When the cointegration relationship changes sharply and frequently, recursive CA may fail to track the normal change, while the proposed approach can establish an approximate model from just a few samples and correct it gradually.
Sensitivity of mode switching identification. The operating status is judged by and . In recursive CA, the two statistics are constructed only from nonstationary data that reflect the control performance. The proposed statistics consider prior knowledge and data simultaneously, which makes them more sensitive to mode switching.
Memory properties. The EWC technique is adopted to overcome the ‘catastrophic forgetting’ issue of RPCA, and significant information about previous operating conditions is reinforced to avoid dramatic changes in influential parameters. The proposed RCA-RPCA can be updated accurately when previous or similar operating modes reappear, thus delivering optimal monitoring performance.
Algorithm complexity. For recursive CA, the computational burden at each update step depends strongly on the number of samples collected so far. Although the model stops updating once a stopping criterion is met, which reduces complexity and false alarms, the criterion is hard to satisfy. For the proposed method, the computational cost is per update, irrespective of the number of samples.
Here we further discuss the variable decomposition. In this paper, the variables are divided into three blocks based on prior knowledge and data, as mentioned in Section III. Specifically, we first use knowledge of the industrial system to partition the variables. Then we adopt the ADF test and correlation analysis to verify and strengthen the rationality of the grouping. Thus, changes in stationary variables are not masked by the normal variations of nonstationary variables. Note that the variables need not be decomposed into three blocks in every industrial system; the number of blocks depends on the characteristics of the system and the selected variables. The monitoring framework in Section III still applies in that case. This variable grouping is sensitive to critical manipulated variables and mode switching, which helps enhance monitoring performance.
VI Case study
This section uses a practical industrial system to illustrate the effectiveness of the proposed method. In addition, we compare it with state-of-the-art methods to highlight its advantages.
Table I
| Case number | Data source | Training samples | Testing samples | Fault time | Fault cause |
| Case 1 | Aomei-Aomeng-Aomei | 2000 | 8800 | 6734 | The opening of the regulating baffle of the primary air is abnormally large |
| Case 2 | Aomeng-Youhun | 2000 | 15280 | 8731 | Abnormality from cold primary air electric regulating baffle card |
| Case 3 | Fudong-Aomeng-Fudong | 2000 | 13120 | 9010 | The cooling fan motor trip |

Table II
| Case number | Indexes | Recursive CA | RCA-RPCA | RCA-RPCA-EWC |
VI-A Description of the pulverizing system
The 1000-MW ultra-supercritical thermal power plant is increasingly popular owing to economic benefits and environmental requirements. In this paper, we investigate one important unit of the boiler, namely, the coal pulverizing system in Zhoushan Power Plant, Zhejiang Province, China. It contains a coal feeder, coal mill, rotary separator, raw coal hopper and stone coal scuttle, as depicted in Fig. 2. The coal pulverizing system grinds the raw coal into pulverized coal with the desired coal fineness and optimal temperature. The operating conditions change with the type of coal and the varying unit load. For different types of coal, the cointegration relationship may change and the controlled variables may work at different stable points.
We choose 26 key variables, and some typical variables are depicted in Fig. 1 to illustrate the data characteristics. Variables in Fig. 1 are relevant to the unit load and are nonstationary according to the ADF test and prior knowledge. Variables in Fig. 1 are weakly correlated with load. For instance, the air powder mixture temperature is required to be stationary and may be different for different types of raw coal. When the coal changes, the temperature moves from one stable value to another. The bearing temperatures are expected to remain at a stable level. The temperature of cold air is closely related to the external environment.
We select three typical cases to illustrate the effectiveness of the proposed method, namely, abnormality from outlet temperature (Cases 1 and 2) and rotary separator (Case 3). According to the historical records, these two types of faults occur frequently and affect operational safety. The sampling interval is 20 seconds. The data information is listed in Table I. For each case, the process data come from two types of coal and the original cointegration relationship may be broken when the type of coal changes.
VI-B Simulation analysis
In this paper, we compare recursive CA  with the proposed RCA to illustrate the benefits of real-time updating. Then, the proposed RCA-RPCA is compared with RCA-RPCA-EWC to illustrate the advantages of EWC. Note that RCA-RPCA and RCA-RPCA-EWC share the same RCA algorithm proposed in Section III.
Three indicators are used to evaluate the performance, namely, fault detection rates (FDRs), false alarm rates (FARs) and detection delay (DD). The calculation method can be found in . DD refers to the number of samples by which the detection lags the recorded fault time, which is valuable for practical industrial systems. The monitoring results of the three cases are shown in Figs. 3-5, respectively. Note that the pink vertical line represents the actual fault time instant.
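For reference, the three indicators can be computed from a boolean alarm sequence as sketched below. The definitions used here are the commonly adopted ones (fraction of faulty samples alarmed, fraction of normal samples alarmed, and samples elapsed before the first alarm after the fault) and are an assumption of this sketch, since the paper defers the exact formulas to a cited reference. The function name `evaluate_alarms` is hypothetical.

```python
def evaluate_alarms(alarms, fault_start):
    """Return (FDR, FAR, DD) for a boolean alarm sequence.

    alarms:      list of booleans, one per test sample (True = alarm raised)
    fault_start: index of the first faulty sample (the recorded fault time)
    """
    normal, faulty = alarms[:fault_start], alarms[fault_start:]
    # FDR: share of faulty samples that raised an alarm.
    fdr = sum(faulty) / len(faulty) if faulty else float("nan")
    # FAR: share of normal samples that raised an alarm.
    far = sum(normal) / len(normal) if normal else float("nan")
    # DD: samples between the fault time and the first alarm; None if never detected.
    dd = faulty.index(True) if True in faulty else None
    return fdr, far, dd
```

With a 20-second sampling interval, a detection delay of three samples corresponds to one minute.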
The monitoring charts of Case 1 are presented in Fig. 3. Recursive CA  fails to detect the fault accurately and the FDR is . Besides, and cannot distinguish novelty from normal dynamic changes in Fig. 3. For the proposed RCA, can detect the fault precisely and promptly, and the detection delay is about 1 minute. In the time period, where and change significantly, the type of coal changes and the current cointegration relationship may be broken. Thus, the CA model needs to be retrained from scratch based on the newly collected data. During this time period, the process is monitored by the current CA model, and new data are collected to build the initial CA model that is appropriate for the new material. Thus, and quickly become stable again. Compared with RPCA, RPCA-EWC provides better performance in Fig. 3 and the FDR of is . However, the FDR of RPCA is , lower than RPCA-EWC. According to and , it is observed that the type of coal varies and a new model is built before the fault occurs, which indicates that RPCA is not sufficiently trained and cannot actually track the system change.
For Case 2, the monitoring results are depicted in Fig. 4. For the proposed RCA method, can detect the fault accurately and the FDR is . In Figs. 4 and 4, and change sharply twice. According to the coal records and original data analysis, the first sudden change of the two statistics originates from the switch of coal type, while the second abrupt change is attributed to manual adjustment of critical parameters. Compared with RPCA, the SPE of RPCA-EWC is able to detect the fault precisely and the FDR is . The short-term dynamics of the two types of coal have a certain degree of similarity, so the significant information of the previous coal is preserved and helps monitor the other coal. The FARs of SPE are relatively high because RPCA is not able to track the rapid system change at the initial stage. The system is judged as normal because SPE returns to normal quickly. Excluding the false alarms caused by this situation, the FARs of SPE are and in Figs. 4-4, respectively. However, recursive CA  misidentifies the normal parameter variations as anomalies in Fig. 4 and the FAR is up to . Recursive CA is insensitive to faults that are orthogonal to the cointegration space, and dynamic information alone is not enough to monitor the process effectively.
For Case 3, the monitoring results are exhibited in Fig. 5. The recursive CA  detects the fault inexactly and the FDR of is . The FDRs of and are 0, and thus it is meaningless to mention the detection delay. The proposed RCA can detect the fault accurately and the FDRs are more than in Figs. 5-5. For RPCA-EWC in Fig. 5, the FDR of is , which indicates that the significant information of the previous coal is preserved by EWC and helps deliver excellent monitoring performance. However, the FDR of RPCA is less than in Fig. 5.
The evaluation indexes of the three cases are summarized in Table II. Compared with recursive CA , the proposed RCA is more sensitive to normal changes caused by human intervention and raw material changes. This occurs owing to several factors: a) the variables are selected and divided based on prior knowledge and the ADF test, which is more universal and accurate than the ADF test alone; b) critical stationary variables, which are sensitive to raw material changes, are utilized to establish the statistic; c) in , the model is updated based on a block of data and the monitoring performance is affected by the block length, while the proposed RCA model is updated in real time and is more compatible with the current operating system. In addition, compared with RCA-RPCA, RCA-RPCA-EWC preserves significant information of previous influential parameters and avoids dramatic performance degradation when similar operating modes are revisited.
VII Conclusion
In this paper, RCA-RPCA-EWC was developed to monitor general nonstationary processes, where the proposed RCA is updated in real time and is able to distinguish real faults from normal system deviations. To avoid potential ill-conditioning in matrix inversion, several calculation techniques are adopted and the RCA problem is solved with low computational burden. As RCA is insensitive to faults that are orthogonal to the cointegration space, the remaining information of RCA together with other short-term dynamic information is monitored by RPCA to establish a comprehensive monitoring framework. When the system enters a new operating mode, EWC is employed to reinforce the significant information of previous operating modes and avoid abrupt performance degradation for future similar operating modes. Besides, the test statistics are constructed based on RCA and prior knowledge, which makes them sensitive to mode switching. The effectiveness and advantages of the proposed method over recursive CA  and RCA-RPCA are illustrated on a practical industrial pulverizing system.
In the future, we will investigate quality-related nonstationary process monitoring. Besides, graceful forgetting will be considered, as forgetting older modes is essential to make space for learning newer modes.
-A RPCA for process monitoring
In this paper, RPCA is implemented based on rank-1 modification with first-order perturbation (FOP). Detailed information can be found in .
At instant, the sample is collected. Then, the mean and standard deviation are updated as:
where , is the forgetting factor. The sample is normalized as
Based on rank-1 modification and FOP, the eigenvectors and eigenvalues are updated as:
Define the rank-1 matrix , the diagonal matrix and are calculated by:
where is the th element of , and are the th and th corresponding elements of .
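The recursive mean and standard deviation update with a forgetting factor, described at the start of this appendix, can be sketched as below. The exponentially weighted form used here is one common choice in the RPCA literature and is an assumption of this sketch; the paper's exact recursion may differ. The names `update_mean_std`, `normalize` and `mu` are hypothetical.

```python
import math


def update_mean_std(mean, std, x, mu=0.99):
    """Exponentially weighted update of per-variable mean and standard deviation.

    mean, std: current estimates (lists of floats, one entry per variable)
    x:         new sample (list of floats)
    mu:        forgetting factor in (0, 1); values near 1 forget slowly
    """
    new_mean = [mu * m + (1.0 - mu) * xi for m, xi in zip(mean, x)]
    # Old variance re-centered on the new mean, blended with the new squared deviation.
    new_var = [
        mu * (s * s + (nm - m) ** 2) + (1.0 - mu) * (xi - nm) ** 2
        for m, s, xi, nm in zip(mean, std, x, new_mean)
    ]
    return new_mean, [math.sqrt(v) for v in new_var]


def normalize(x, mean, std):
    """Scale a sample to zero mean and unit variance with the current estimates."""
    return [(xi - m) / s for xi, m, s in zip(x, mean, std)]
```

Only the current estimates are stored, so the cost per sample is linear in the number of variables and independent of the number of samples already seen.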
-B Recursive computation of and
We illustrate the computation of and . Taking the component of as an example, we show that the computation of is , which is independent of the number of existing samples.