State estimation is a critical component of modern energy management systems (EMSs) for multiple monitoring purposes, including analysis, security, control, situational awareness, stability assessment, power market design, and optimization of electricity dispatchment [1, 2]. In the DC model, the states are the bus voltage angles, while the grid topology includes the arrangement of loads or generators, transmission lines, transformers, and the statuses of system devices. It should be noted that this definition generalizes the computer science graph theory definition, which refers to the connectivity of the graph, since here the topology also includes the weights. In currently applied systems, it is assumed that the EMS has precise knowledge of the grid topology , which is used for obtaining accurate state estimation. However, knowledge of grid topology may not be available and it may change over time due to failure, opening and closing of switches on power lines, and the presence of new loads and generators. For example, large-scale penetration of distributed generation results in regular topology changes due to ad-hoc connection of many plug-and-play components. Even worse, a distribution system operator usually lacks topology information, as many of the distributed energy resources do not belong to the utility [3, 4]. The topology data may also be incorrect due to malicious attacks [5, 6, 7]. Thus, methods for state estimation that are not based on a known topology are crucial for obtaining a reliable system model and high power quality. An additional use for topology identification is event detection, such as identifying faults, line outages, and system imbalances [8, 9]. Moreover, it can be used to secure the system from potential cyberattacks on the topology information and to identify the potential vulnerabilities of a power system.
Several approaches to topology identification have been proposed in the literature. Detecting topological changes has been studied in [10, 11] and the conditions for the detectability of topology errors are studied in . Recently, a few papers have addressed blind estimation of the grid topology by observing multiple power injection supervisory control and data acquisition (SCADA) measurements [13, 14], voltage and power data obtained by phasor measurement units (PMUs) [15, 16], voltage measurements and their associated correlations , and electricity price based market data . In , an unobservable attack is designed based on incomplete knowledge of the system matrix, which is learned via a blind identification approach. The methods proposed in [13, 14, 18, 19]
can reveal part of the grid topology, such as the grid connectivity and the eigenvectors of the topology matrix, but they cannot reconstruct the full topology matrix with exact scaling and true eigenvalues. Thus, incorporating blind source separation (BSS) techniques with the specific characteristics of a graph seems promising.
BSS methods aim at restoring a set of unknown source signals from a set of observed linear mixtures of these source signals (see, e.g. [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]), without prior knowledge of the sources and the mixing system. The problem of BSS has been extensively investigated in the literature in the recent two decades. Prior works on maximum likelihood (ML) separation in BSS deal with general stationary sources [21, 30]
, autoregressive (AR) sources, and AR Gaussian mixture model distributed sources[28, 29]
. The ML BSS for nonstationary structures with varying variance-profiles was considered in
. However, classical BSS solutions are ambiguous in the sense that the order, signs and scales of the original signals cannot be retrieved. These ambiguities cannot be tolerated in the considered power system problem. In addition, usually the distributions of the states are assumed to be Gaussian due to the central limit theorem, while most BSS methods cannot handle Gaussian sources. Therefore, new methods for BSS are required for the semiblind scenario of a Laplacian mixing matrix with Gaussian sources, without permutation and scaling problems.
In addition to state estimation in power systems, the recent field of graph signal processing (GSP)  has many applications, [33, 34, 35]. A major challenge in GSP is learning the graph structure from data under Laplacian matrix constraints (see, e.g. [36, 37, 38]) and blind deconvolution of signals on graphs , aim to jointly identify the filter coefficients and the input signal. In future work the recent approach has the potential to be extended to general GSP applications.
In this paper, we consider the problem of state estimation and topology identification in power systems based on active power measurements. First, we show that this problem is equivalent to the problem of BSS with a weighted Laplacian mixing matrix, where the weights are determined by the branch susceptances. Then, we derive the ML blind estimation of states and topology (ML-BEST) method for Gaussian-distributed states, that incorporates the constraints of a Laplacian mixing matrix and is shown to be a second-order statistics (SOS) method. Since the topology recovery stage of the ML-BEST estimator is shown to be a NP-hard optimization problem, we suggest two practical low-complexity methods to implement this stage: (1) Two-Stage topology recovery, which is based on solving the relaxed convex optimization and then finding the closest Laplacian matrix, and (2) Augmented Lagrangian topology recovery. Preliminary results can be found in . We also derive a closed-form expression for the Cramr-Rao bound (CRB). Finally, simulations demonstrate that the proposed ML-BEST methods are applicable for different network topologies, and asymptotically achieve the CRB.
The remainder of the paper is organized as follows. In Section II we introduce the system model and the graph BSS (GBSS) problem for state and topology estimation in power systems. The ML-BEST solution is defined and two different practical methods for its topology recovery stage are suggested in Sections III and IV, respectively. Section V offers some remarks, including a parameter identifiability analysis, complexity discussion, and possible extensions of the proposed model and methods. A closed-form expression for the CRB of the topology matrix is derived in Section VI. The proposed methods are evaluated via simulations in Section VII. Conclusions appear in Section VIII.
Ii Problem formulation
In this section, we formulate the problem of estimating the state and topology/admittance matrix in power systems under the linear DC power model. We show that this problem is equivalent to BSS with a Laplacian mixing matrix.
In the rest of this paper vectors are denoted by boldface lowercase letters and matrices by boldface uppercase letters. Theidentity matrix is denoted by , and denotes the constant -length one vector. The vectors and are zero vectors (with appropriate dimension), except for the th element of , which is . Additionally, denotes Kronecker’s delta, which equals if and otherwise. The notations , , and denote the determinant operator, the trace operator, and the Kronecker product, respectively. For a full-rank matrix , is the Moore-Penrose pseudo-inverse. The th element of the vector , the th element of the matrix , and the submatrix of are denoted by , , and , respectively. If is a positive semidefinite matrix we denote it by and its square root, , satisfies , where denotes the inverse of this square root. For any matrix , and denote its Frobenius and -(pseudo)norm (counting its non-zero entries), respectively, is the nonnegative part of , and is a vector obtained by stacking its columns. Similarly, for any symmetric matrix , is a vector obtained by stacking the columns of the lower triangular part of , including the diagonal, into a single column. Finally, we denote the cone of real symmetric matrices of size by .
Ii-B Graph representation of power systems
A power system can be represented as an undirected connected weighted graph, , where the set of vertices, , is the set of buses (that represent interconnections, generators or loads) and the edge set, , is the set of connected transmission lines between the buses. An arbitrary orientation is assigned to each edge , , , , that are ordered in a lexicographical order, which connects the vertices and . The cardinality of the edge set, , represents all possible connections in the graph. According to the -model of transmission lines , each line is characterized by the line admittance , .
The incidence matrix of a graph is , where the element of is given by
and . In addition, let be a diagonal matrix where if , . For connections that do not exist we use . Then, we can define the graph Laplacian matrix, , as
The matrix is a real, symmetric, and positive semidefinite matrix111It should be noted that is a positive semidefinite matrix, assuming we only have positive susceptances ., which satisfies the null space property, , and with nonpositive off-diagonal elements.
Ii-C DC model and problem formulation
We consider the DC power flow model , which is based on the following assumptions on the network:
Branches are considered lossless, which results in , where is the susceptance of the branch.
The bus voltage magnitudes, , , are approximated by 1 per unit (p.u.).
Voltage angle differences across branches are small, such that , where , , are the bus voltage angles.
Under Assumptions A.1-A.3, the active power injected at bus satisfies
Now, let be the vector of active power injected and the vector of voltage phase angles at time , . Then, based on the model from (II-C), the noisy linearized DC model of the network can be written as
where the topology matrix , defined in (2), is a deterministic unknown Laplacian matrix, which is considered static for a short-period of time and under normal operating conditions. The noise is a stationary Gaussian sequence with zero mean and a covariance matrix , i.e. , and it is assumed that the additive noises are independent of the state vectors. The vectors ,
, are assumed to be unknown random states with a joint probability density function (pdf)and marginal pdfs of , , . By subtracting the mean from the data, we can assume, without loss of generality, that has zero mean. The resulting centralized measurements are given by , where is the sample mean. For the rest of this paper, will denote the mean-centered active power data.
Now, in order to reformulate the model with a full-rank mixing matrix, we use the relation
where and , . In addition, it can be shown (see, e.g. pp. 134-144 ) that the modified noise sequence satisfies , .
We assume here that all sources are time-independent Gaussian distributed, i.e. , . Thus, , , where . Under the assumption that is known, the observations vectors are also independent Gaussian-distributed vectors, i.e. and , , where
and, assuming nonsingular matrices,
Full rank - Under the assumption of a connected graph, is a nonsingular matrix of rank and, thus, can be identified in general. In power system terminology, we assume that there are no unobservable islands in the grid.
Positive semidefinite - Since is a symmetric, positive semidefinte matrix, is also a symmetric, positive semidefinte matrix.
Nonpositive off-diagonal elements - , , .
Diagonally dominant - Since is a Laplacian matrix, is a diagonally dominant matrix, i.e. , .
Sparsity (optional) - It is shown in previous works that the power system is sparse , i.e. the zero pseudonorm of the off-diagonal entries of , , is much smaller than .
In this section, we develop the basic ML-BEST approach that jointly reconstructs the matrix and the states , , for the model from Section II. This problem can be interpreted as a BSS problem with a Laplacian mixing matrix, or graph BSS (GBSS). First, in Subsection III-A the minimum mean-squared-error (MMSE) estimator of the random states, , , is developed. Then, in Subsection III-C, we develop the ML estimator of the noise variance, , and formulate the optimization problem describing the ML estimator of the mixing system.
Iii-a MMSE state estimation
. We refer to the estimator in (11) as the oracle MMSE state estimator, i.e. an ideal estimator which has perfect knowledge of the noise variance and the system topology.
The practical state estimator for the considered GBSS problem is obtained by plugging in the ML estimators of the noise variance and the reduced-Laplacian matrix, and , respectively, that are developed in the following in Subsections III-B and III-C, into (11), which results in
For high signal-to-noise ratio (SNR) values, i.e. when , the matrix is a singular matrix and, thus, the covariance matrix of the data from (9) is also a singular matrix. In this case, instead of using pseudo inverse as in (11) and (12
), the unknown parameters can also be treated by removing the linearly dependent random variable (see, e.g. Chapters 3 and 10 in). In power system state estimation this is usually done by setting one bus as a reference bus and setting its angle to zero (see, e.g. ), and then only estimating . Here we prefer to use instead the state estimation method in (11) and (12) for estimation of .
Iii-B ML estimation of the noise variance
Iii-C System identification: ML estimation of the mixing matrix
By using the invariance property of the ML estimator  and the relation in (5), the ML estimator of the full Laplacian matrix can be obtained from the ML estimator of the reduced-Laplacian matrix, , as follows:
In the following, the ML estimation of the reduced topology matrix, , is formulated and is shown to be NP-hard. Practical methods to approximate the ML estimator of , , are developed in the next section. Under the model from (8) and the Gaussian-distributed sources assumptions, the normalized log likelihood of , , after removing constant terms and substituting the ML estimator of the noise variance from (13), satisfies
is the modified sample covariance matrix and the last equality is obtained by substituting (14). That is, the log-likelihood from (16) depends on the data only through the sample covariance matrix, , which is the sufficient statistic for estimating .
Since the reduced-Laplacian matrix satisfies Properties P.1-P.4, we are interested in minimizing over the domain of symmetric matrices and under the associated constraints as follows:
The Gaussian log-likelihood function, , is a concave function of the inverse covariance matrix, . However, even without the sparsity constraint, the constraints in (18) cannot be rewritten as convex constraints on . Therefore, the resulting optimization is not a convex optimization and, in addition, a direct Karush-Kuhn-Tucker (KKT) conditions  solution of this constrained minimization is intractable. Two low-complexity implementation methods are described in the next section.
Imposing directly the sparsity constraint in P.5 usually results in complex combinatorial searches, and following advances in compressive sensing [52, 53], the sparsity constraint can be approximated by restricting the off-diagonal -norm. We perform simulations that suggest that simple elementwise thresholding of the estimated Laplacian matrix is competitive with methods. Thus, at the end of the ML-BEST approach, we thresholded the off-diagonal elements of the estimator of the topology matrix, , from (15), with a threshold, , such that the th element of the final estimation is given by
, . The threshold should be tuned until the desired level of sparsity is achieved, while keeping connectivity. The diagonal elements of are known to be positive for the Laplacian matrix, which thus, has partially known support. Thus, set to be smaller than the magnitude of the smallest estimated element of the diagonal:
where . The value of can be set to the inverse of the number of buses, , or of the average nodal degree .
(Optional) Remove the sample mean, , from the observations , .
Obtain the sample covariance matrix, , by (14).
Perform eigendecomposition operation for the sample covariance matrix to find its eigenvalues .
Estimate the noise variance by the smallest eigenvalue, .
Estimate the reduced-Laplacian matrix and obtain approximation to , for example, by the two-phase/augmented ML-BEST from Section IV.
Reconstruct the full topology matrix according to (15):
Evaluate the sources according to (12):
Iv Practical implementations of the ML-BEST
In this section, two low-complexity estimation methods of the reduced topology are derived: 1) Two-Stage topology recovery in Subsection IV-A; and 2) Augmented Lagrangian topology recovery in Subsection IV-B.
Iv-a Two-phase topology recovery
It is well known that the relaxed optimization problem from (21) is a convex optimization w.r.t. and the optimal solution is the sample covariance matrix inverse, , under the assumption of nonsingular matrices (see, e.g. p. 466 in , ). Then, by using the invariance property of the ML estimator , the one-to-one mapping in (10), and the symmetry , one obtains that the unique minimum of (21) w.r.t. , which is the ML estimator of a symmetric positive definite mixing matrix, , satisfies
which implies that
In the second phase, we find the closest graph Laplacian matrix to the matrix in the sense of Frobenius norm. Thus, we solve the following optimization problem:
The problem in (24) is a convex optimization problem and can be efficiently computed by standard semidefinite program solvers, such as CVX . This two-phase topology recovery algorithm is summarized in Algorithm 2. The ML-BEST approach with two-phase topology recovery is implemented by Algorithm 1, where Step 5 is implemented by Algorithm 2.
Iv-B Augmented Lagrangian topology recovery
In this subsection we develop a constrained independent component analysis (cICA) method to solve (18). This approach is based on sequentially estimating the demixing matrix, , under constraints, where the inequality constraints (Constraints and from (18)) are transformed into equality constraints in the augmented Lagrangian [58, 59]. Constraint 1) implies the symmetry of , i.e. the equality constraint . Thus, in this case the objective function for the cICA, which is based on Equation (3) in  is given by
where , , and are the nonnegative vector, positive semidefinite matrix, and symmetric matrix, respectively, of Lagrange multipliers, and is the penalty parameter. The minimization of (IV-B) w.r.t. results in the following natural gradient descent learning rule  for :
where is the iteration index,
Finally, the Lagrange multipliers, , , and , according to the gradient ascent method are updated as follows:
The augmented Lagrangian topology recovery is summarized in Algorithm 3. The ML-BEST approach with augmented Lagrangian topology recovery is implemented by Algorithm 1, where Step 5 is implemented by Algorithm 3.
V-a Identifiability conditions
In this subsection, we discuss the GBSS identifiability conditions, under which the topology matrix and the state vectors can be recovered  for the model from Section II with zero-mean measurements. It is well known that Gaussian sources with i.i.d. time-structures cannot be separated [20, 21, 23]. Nevertheless, the following theorem states that when the mixing matrix is a symmetric matrix, consistent separation can rely exclusively on the SOS of the source covariance, even for Gaussian sources.
is a symmetric positive definite matrix
The covariance of the states, , is known and is a positive definite matrix
Then, the Laplacian mixing matrix, , can be uniquely identified, without scaling and permutation ambiguities, from the sample covariance matrix of the observations, , defined in (14).
First we will show that is identifiable. Then, can be uniquely recovered by using the relationship in (5). Similar to the derivation of (10), it can be shown that for any state distribution and independent noise with known noise covariance, , the covariance of the observations, , , satisfies
where the last equality is obtained by substituting the symmetry property, . It is known that for any positive definite matrix there exists a unique positive definite square root, , such that (see, e.g. p. 448 in ). Thus, under the assumption that and are positive definite matrices, the solution of (33) is unique and is given by
Now, if we use the estimators and in (34) instead of the true unknown values of , , then the existence of a positive definite solution is not guaranteed. Under the Theorem’s assumption that is a positive definite matrix, the uniqueness holds for the solution in (23).
A necessary condition for the existence of the inverse of , as required in Theorem 1, is that the sample covariance matrix has a full rank, i.e. . To ensure numerical stability, we require stricter conditions than the condition