I Introduction
The Optimal Power Flow (OPF) problem is fundamental to power systems operation [fioretto2020predicting]. In general, OPF finds the optimal generation set points that minimize operation costs given a set of loads while at the same time satisfying physical and security constraints. The physicality of solutions is ensured by enforcing the power flow equations – a set of nonlinear equality constraints.
The increasing penetration of distributed energy resources (DER) is posing new challenges for power system operation. Traditionally, generation schedules were updated at 5-minute intervals [fioretto2020predicting]. Due to the variability and uncertainty of renewable generation, system operators are required to update generation schedules more frequently. At the same time, system operators need to manage a growing number of smaller resources, resulting in a significantly larger problem than the conventional OPF problem [guerrero2020towards]. Thus, the integration of DERs requires solving larger OPF problems at shorter time intervals. This motivates research on using function approximators, such as neural networks (NN), to learn solutions to the OPF problem, due to their fast inference and their potential to scale well to larger power networks [owerko2020optimal].
In order to deal with the nonlinearity of the OPF problem, a common practice in power system operation is to linearize the grid dynamics, leading to the DCOPF approximation. The DCOPF problem is generally formulated as either a linear program (LP) or a quadratic program (QP), depending on the cost function. Such a linearization can work reasonably well for small networks, but the resulting approximation error scales poorly with network size [ferc2012history]. Furthermore, the assumptions required for linearizing the power flow equations are not valid for heavily-loaded or distribution networks [chatzivasileiadis2018lecture], and the author of [baker2019solutions] shows that even under mild assumptions DC solutions are almost never AC feasible. Ideally, the full nonlinear AC power flow equations should be used. However, using the nonlinear power flow equations comes with another challenge. Because the nonlinear ACOPF equality constraints have a fractal nature riddled with spurious roots, which act as attractors [thorp1997load], it is difficult to disambiguate physical from spurious non-physical solutions. As a result, convergence to the true solution cannot be guaranteed, rendering existing solvers brittle.

In this paper, we propose a learning-based approach to the ACOPF problem that alleviates the aforementioned issues by differentiating through a robust power flow solver, as shown in Figure 1. Our contributions can be summarized as follows. Firstly, we show that a power flow solver based on a holomorphic embedding of the power flow equations [trias2012holomorphic] is differentiable in its arguments. The primary benefit of using the Holomorphic Embedded Load Flow Method (HELM) is that it is much more likely to find the correct solution if it exists. Secondly, we can use this differentiable HELM solver as a computational layer in a NN. This allows us to learn a neural policy that solves the OPF problem while respecting the power flow constraints. Furthermore, we show how this method can be extended to enforce convex security constraints, e.g., voltage magnitudes, as well as nonconvex integer constraints, e.g., unit commitment. To validate our approach, we empirically show that it is faster and more robust (i.e., results in physically-viable solutions more often) on a 200-bus system in comparison to traditional solvers based on mixed-integer interior point methods (MATPOWER [zimmerman2011matpower]).
Specifically, our proposed method achieves a 12x increase in speed and a 40% increase in robustness compared to MATPOWER.
II Related Work
II-A Challenges of Solving ACOPF
As described earlier, ACOPF is traditionally posed as a constrained optimization problem, i.e., a cost function is minimized under network constraints, oftentimes relying on some form of projected gradient descent [carpentier1962contribution].
Note that this problem formulation faces numerous computational difficulties. First, the nonlinear equality constraint (2) poses a challenge. In reality, (2) is a necessary but not sufficient condition for the system to be in a physical state. There are assignments of nodal voltages that fulfill the power flow equations but that, however, do not constitute a physical state [tamura1983relationship, thorp1997load]. Optimization algorithms based on some form of projected gradient descent might be attracted to such a non-physical solution, rendering them not robust. Advanced techniques like the Homotopy [okumura1991solution] or Continuation [milano2009continuous] methods alleviate but do not fully remedy this problem, and can incur substantial computational costs. Furthermore, nonconvex constraints create additional computational issues. Algorithms such as branch-and-bound that are typically employed to deal with integer constraints require solving multiple, and in the worst case exponentially many, relaxed linear programs [lawler1966branch].
II-B Learning OPF
Because larger OPF problems need to be solved more frequently, there is growing interest in learning to solve OPF problems with NNs. These approaches formulate OPF as the problem of learning a policy, which maps demand configurations to generation set points. This policy can be learned from historical data, i.e., imitation learning, and/or by interacting with the system, i.e., reinforcement learning.
The most common paradigm of existing solutions is to learn a mapping from demand assignments to optimal generation set points based on a data set provided by an external solver, such as MATPOWER [zimmerman2011matpower]. This is analogous to behaviour cloning, with the external OPF solver constituting the expert policy [chen2020learning]. Following [gupta2020deep], we refer to such a paradigm as OPF-then-learn. The authors of [guha2019machine, owerko2020optimal] pose ACOPF as an end-to-end regression task. In [owerko2020optimal], graph neural networks (GNN) were adopted, which make predictions based on local information from neighboring nodes. Such an approach allows for a decomposition of the grid-level problem based on power system connectivity and, in principle, scales well to large networks. However, a significant limitation of such an approach is that the solutions may not adhere to the physical and security constraints inherent to the physical system. Concretely, 49% and 30% of the solutions were infeasible when applied to the IEEE 30-bus and 118-bus test systems, respectively [guha2019machine].
This has motivated research on learning-based methods that respect constraints. The authors of [chen2020learning, zhao2020deepopf+, singh2020learning] studied the problem in the context of DCOPF. Since DCOPF can be posed as either an LP or a QP, the problem has amenable mathematical properties and permits relatively simple solution strategies. More relevant to our work, [zamzam2019learning, fioretto2020predicting] studied the problem in the context of ACOPF. To ensure feasibility, in [zamzam2019learning] generation set points produced by a NN were passed into a power flow solver. This speeds up the computation since the power flow equations are easier to solve than ACOPF. In [fioretto2020predicting], the authors applied primal-dual updates to the Lagrangian dual function of the underlying constrained optimization problem, where a NN predicts the generation set points and the power system states. Notably, the approach proposed by [fioretto2020predicting] produces more accurate and cost-effective solutions than those of the DCOPF approximation, which is commonly used in industry.
However, there are key limitations to the OPF-then-learn paradigm. Firstly, a large labeled data set needs to be generated for training. Given the convergence issues of solving ACOPF, some load configurations may not have a solution from commercial solvers. Secondly, the learned solution may not generalize to unseen scenarios, and retraining the system on incoming demand assignments requires creating a new training set. Because of this, the authors of [gupta2020deep] proposed OPF-and-learn as an alternative paradigm where generation cost is optimized directly by computing gradients through the OPF problem. While OPF-then-learn is analogous to imitation learning, OPF-and-learn can be seen as a one-step Markov Decision Process (MDP). However, the OPF-and-learn approach proposed in [gupta2020deep] relies on the DCOPF linearization, which limits the accuracy of solutions and is not applicable to heavily-loaded or distribution networks. Furthermore, when assuming the DCOPF linearization, creating a training set is trivial because optimal solutions can be obtained reliably and fast. Thus, by making the DCOPF assumption, the circular dependence on training data is broken, but a significantly easier problem is solved, which limits the potential advantages of the OPF-and-learn paradigm.

To the best of our knowledge, this paper introduces the first OPF-and-learn approach that respects the full nonlinear power flow constraints as well as nonconvex unit-commitment constraints and nonlinear physical and security constraints.
III Proposed Learning Framework
Let $c(\mathbf{V})$ be the cost associated with an assignment of nodal voltages $\mathbf{V} \in \mathbb{C}^{N}$, with $N$ being the number of buses of the system. In the following, without loss of generality, we define the cost in terms of $\mathbf{V}$ for presentational simplicity. However, numerous constraints need to be enforced, some of which are nonlinear and nonconvex, i.e., the load flow problem is traditionally posed as:

minimize w.r.t. $\mathbf{V}, \mathbf{S}_g$:  $c(\mathbf{V})$  (1)

subject to:  $\mathbf{S}_g - \mathbf{S}_d = \mathrm{diag}(\mathbf{V})\,\overline{\mathbf{Y}\mathbf{V}}$  (2)

$\underline{\mathbf{S}}_g \le \mathbf{S}_g \le \overline{\mathbf{S}}_g$  (3)

$\underline{V} \le |\mathbf{V}| \le \overline{V}$  (4)

with $\mathbf{S}_d$ and $\mathbf{S}_g$ being a demand and generation assignment, respectively, and $\mathbf{Y}$ being the bus admittance matrix.

For different demand assignments $\mathbf{S}_d$, this optimization problem would be solved over and over again.

Our proposed learning-based formulation following the OPF-and-learn paradigm is as follows. First, we note that the nodal voltages are a function of $\mathbf{S}_g$ and $\mathbf{S}_d$, i.e., $\mathbf{V} = \mathbf{V}(\mathbf{S}_g, \mathbf{S}_d)$, with $\mathbf{V}$ solving the power flow equations (2). Second, we introduce a function $\pi$ that is tasked with producing optimal generator configurations as a function of the system state, in this case the demand $\mathbf{S}_d$, in an expected risk minimization setting. Thus, $\mathbf{S}_g = \pi_\theta(\mathbf{S}_d)$, with $\theta$ parameterizing the function $\pi$. Third, we assume knowledge of a data set $\mathcal{D}$ containing historic demand assignments, i.e., a collection of possible states $\mathbf{S}_d$. Note that unlike OPF-then-learn approaches [cite,cite,cite], we do not assume knowledge of the corresponding optimal generator assignments. The goal is to estimate the parameters of the function $\pi$, in this case $\theta$, that produce optimal generation assignments as a function of the demand. As an optimization objective and learning signal for the function $\pi$, we propose the following:

minimize w.r.t. $\theta$:  $\mathbb{E}_{\mathbf{S}_d \sim \mathcal{D}}\left[ c\big(\mathbf{V}(\pi_\theta(\mathbf{S}_d), \mathbf{S}_d)\big) \right]$  (5)

subject to:  $\underline{\mathbf{S}}_g \le \pi_\theta(\mathbf{S}_d) \le \overline{\mathbf{S}}_g$  (6)

$\underline{V} \le |\mathbf{V}(\pi_\theta(\mathbf{S}_d), \mathbf{S}_d)| \le \overline{V}$  (7)
When compared to traditional optimization-based approaches, this problem formulation has a number of advantages:

The nonlinear power flow constraint (2) vanishes. This increases robustness and avoids convergence to nonphysical solutions given a differentiable and robust power flow solver.

As we will show later, optimization under nonconvex constraints can be amortized, i.e., the time spent training the system is incurred once; after training, inference is extremely fast and solely requires a forward pass through a NN, even when respecting binary unit-commitment constraints.

Because training entails learning an optimal mapping between Euclidean spaces that represent demand and generation assignments, respectively, the proposed approach exploits covariances between load flow problems and allows for generalization to unseen problems.
However, optimizing w.r.t. (5) poses challenges: if gradient descent is used for optimization, gradients need to be defined. Applying the chain rule yields

$\nabla_\theta c = \nabla_{\mathbf{V}} c \cdot \nabla_{\mathbf{S}_g} \mathbf{V} \cdot \nabla_\theta \pi.$

Thus, the loss is differentiable if the cost function $c$, the voltage function $\mathbf{V}$ and the actor function $\pi$ are differentiable. $c$ and $\pi$ can usually be assumed to be differentiable; however, the fact that computing the gradient through a power flow solver w.r.t. the generation assignment $\mathbf{S}_g$, i.e., $\nabla_{\mathbf{S}_g} \mathbf{V}$, is possible might not be obvious. In section IV, we will show that computing the gradient through a robust power flow solver that represents the full nonlinear AC power flow equations, namely the Holomorphic Embedded Load Flow Method, is indeed possible. Doing this allows us to obtain the learning signal $\nabla_{\mathbf{S}_g} \mathbf{V}$.
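The key requirement, namely that gradients flow through the power flow solve itself, can be illustrated with a toy example: any solver built from ordinary differentiable operations exposes the implicit derivative of its solution to automatic differentiation. The sketch below (PyTorch, purely illustrative names, a scalar root-finding problem standing in for the much richer HELM solve) unrolls Newton iterations for $v^2 - p = 0$ and backpropagates through them:

```python
import torch

def newton_sqrt(p, iters=40):
    # Newton iterations for f(v) = v**2 - p = 0. Every step is an
    # ordinary differentiable operation, so autograd backpropagates
    # through the whole solve, yielding dv/dp without an explicit formula.
    v = torch.ones_like(p)
    for _ in range(iters):
        v = v - (v * v - p) / (2 * v)
    return v

p = torch.tensor(4.0, requires_grad=True)
v = newton_sqrt(p)
v.backward()
# Analytically, dv/dp = 1/(2*sqrt(p)) = 0.25 at p = 4.
```

At the fixed point, the Newton update map has zero sensitivity to $v$, so the unrolled gradient converges to the implicit derivative; the same mechanism underlies differentiating through the HELM steps discussed in section IV.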
Furthermore, the constraints, i.e., (6) and (7), need to be enforced. In section V, we will show that an auxiliary function can be used to enforce the Karush-Kuhn-Tucker conditions, ultimately allowing us to enforce arbitrary constraints, similar but not identical to the approaches introduced in [fioretto2020predicting, gupta2020deep].
In section VI, we show how the proposed approach can be extended to handle nonconvex constraints, e.g., binary unit commitment, by optimizing a variational lower bound whilst introducing little computational overhead during inference.
The approach is validated empirically on a 200-bus system. The experimental setup is described in section VIII, and in section IX results are presented and pathways for future work are laid out.
IV Holomorphic Embedded Load Flow Method
HELM was first proposed by Trias [trias2012holomorphic, trias2015fundamentals] and was later extended in, e.g., [subramanian2013pv, wallace2016alternative]. HELM addresses the convergence issues of power flow solvers based on Householder's root-finding methods, such as the Newton-Raphson algorithm. Because the power flow equations have multiple roots but only a single physically realizable solution, these types of solvers are at risk of converging to a spurious, unstable or low-voltage solution [deng2013convergence]. Whether or not the 'correct' root is found usually depends on the initial condition [thorp1997load]. Because HELM does not require an initial guess or initial condition, it overcomes the ambiguity problem that Householder-type solvers face by performing analytical continuation from a known physically realizable solution. Analytical continuation also allows HELM to find the root that is on the same branch as the previously known physically realizable solution. This solution is unique because analytical continuation is unique when the function at hand is holomorphic. HELM treats the complex nodal voltages at each bus as holomorphic functions of a complex scalar $s$. These functions are evaluated at a point for which obtaining a physically realizable solution is trivial (usually $s=0$) and are then continued to the solution at a desired point where the original power flow equations are recovered (usually $s=1$).
Let $\mathbf{V}(s)$ be a function of the complex scalar $s$. Equation (8) then describes such a holomorphic embedding: obtaining a solution at $s=0$ is trivial because no power is flowing, and the original power flow equations (2) are recovered at $s=1$. See [wallace2016alternative] for a proof that $\mathbf{V}(s)$ is indeed holomorphic.
(8) 
In order to obtain the power series coefficients required for analytical continuation, $\mathbf{V}(s)$ and its reciprocal $\mathbf{W}(s) = 1/\mathbf{V}(s)$ are approximated by power series expansions, i.e.:

$\mathbf{V}(s) = \sum_{n=0}^{\infty} \mathbf{c}[n]\, s^{n}$  (9)

$\mathbf{W}(s) = \sum_{n=0}^{\infty} \mathbf{d}[n]\, s^{n}$  (10)
Similar to traditional power flow solvers such as Newton-Raphson, in order to avoid overspecification of the problem, a slack bus is introduced: let the reduced admittance matrix be obtained by removing the row and column of the slack bus, and let the slack row of the admittance matrix, sans self-admittance, be kept separately. We assume that the voltage at the slack generator is fixed to a known value.
Thus, solving the linear system in (12) yields a solution at $s=0$. Note that this solution is physically realizable because no power is flowing. Higher-order power series coefficients can be obtained by equating coefficients of the same order and by making use of the relation $\mathbf{V}(s)\mathbf{W}(s) = 1$, which yields:
(13)  
(14)  
(15) 
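As a concrete illustration of this recursion, the following minimal sketch handles a toy network with one slack bus (held at 1.0 p.u.) and PQ buses only, assuming the canonical embedding $\mathbf{Y}_{\mathrm{red}} \mathbf{V}(s) = -\mathbf{y}_{\mathrm{sl}} + s\, \overline{\mathbf{S}}\, \overline{\mathbf{W}}(\bar{s})$; all names are our own, and PV buses, shunts and a general slack voltage are deliberately omitted:

```python
import numpy as np

def helm_pq_coefficients(Y_red, y_sl, S, n_terms=30):
    # Voltage power-series coefficients c[n] and reciprocal-series
    # coefficients d[n] for a toy all-PQ network with slack voltage 1.0.
    n_bus = len(S)
    c = np.zeros((n_terms, n_bus), dtype=complex)
    d = np.zeros((n_terms, n_bus), dtype=complex)
    Y_inv = np.linalg.inv(Y_red)        # in practice, reuse an LU factorization
    c[0] = Y_inv @ (-y_sl)              # s = 0: no power flows anywhere
    d[0] = 1.0 / c[0]
    for k in range(1, n_terms):
        # Equating coefficients of s^k in the embedded nodal equations:
        c[k] = Y_inv @ (np.conj(S) * np.conj(d[k - 1]))
        # V(s) W(s) = 1 gives the reciprocal-series (Cauchy product) recursion:
        d[k] = -np.sum(c[1:k + 1] * d[k - 1::-1], axis=0) / c[0]
    return c, d
```

Each order requires only one linear solve with the same (pre-factorized) matrix plus a convolution for the reciprocal series, which is what makes the method cheap per coefficient.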
After obtaining the power series coefficients, analytic continuation is performed to obtain a solution at $s=1$. However, since the radius of convergence is usually smaller than 1, analytical continuation is performed using Padé approximants instead of directly evaluating (9). Padé approximation is a rational approximation of a power series known to have the widest radius of convergence [stahl1989convergence]. Analytical continuation by Padé approximation is performed as follows:

(16)
Approximants of order $[L/M]$, i.e., the numerator and denominator polynomial coefficients, can be obtained from the power series coefficients by solving a linear system of equations. Because we perform analytical continuation to $s=1$, plugging the obtained coefficients into (16) yields the sought voltage phasors.

IV-A Differentiating through HELM
In the following, we view HELM as a function that maps complex nodal power to complex nodal voltages. We will show that this function is not holomorphic but is differentiable, ultimately allowing us to compute gradients w.r.t. the parameters of the actor function $\pi$. By making use of the chain rule, the strategy is to decompose HELM into a succession of functions and show that each function is differentiable. Specifically, we decompose HELM into its algorithmic steps: computing power series coefficients, computing Padé approximants, and computing voltage phasors given the Padé approximants. We then show that each of these steps is differentiable. Note that the coefficient computation is recursive, and writing its gradient out would be tedious; however, its gradients can be computed efficiently using the backpropagation algorithm, and the implementation is trivial in any deep learning framework with automatic differentiation capability, e.g., PyTorch or TensorFlow.

As stated earlier, HELM first computes the power series coefficients and then performs Padé approximation. The coefficients $\mathbf{c}[n]$ and $\mathbf{d}[n]$ are obtained in an alternating fashion. Note that computing $\mathbf{c}[n]$ requires knowledge of the previous coefficients $\mathbf{c}[n-1]$ and $\mathbf{d}[n-1]$, whereas $\mathbf{d}[n]$ is a function of all previous $\mathbf{c}$ and $\mathbf{d}$ coefficients:
Because of the complex conjugation in (17), the mapping from nodal powers to coefficients is not holomorphic. However, it follows by induction that (17) and (18) are differentiable, as is easy to see from (12) and (13).
After obtaining the power series coefficients, the Padé approximants are calculated. Note that this also only involves solving a linear system of equations, which is differentiable. The final evaluation step involves only a summation and a fraction; although it is not holomorphic, it is differentiable in its argument and, composed with the preceding steps, differentiable in the nodal powers.
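The Padé step can be sketched as follows: a diagonal $[M/M]$ approximant is fitted with one linear solve (denominator) and one truncated convolution (numerator), then evaluated at $s=1$. This is the textbook construction with names of our choosing, written in plain NumPy for illustration; because it consists of a linear solve and polynomial evaluations, automatic differentiation frameworks can backpropagate through the same operations.

```python
import numpy as np

def pade_eval(a, s=1.0):
    # Evaluate the diagonal [M/M] Pade approximant of a power series with
    # coefficients a[0..2M] at the point s (s = 1 recovers the original
    # power flow equations in HELM). Production HELM codes use more
    # careful numerics than this sketch.
    M = (len(a) - 1) // 2
    a = np.asarray(a, dtype=complex)[:2 * M + 1]
    # Denominator coefficients b[1..M] from a Toeplitz linear system,
    # with the usual normalization b[0] = 1.
    C = np.array([[a[M + k - m] for m in range(1, M + 1)] for k in range(1, M + 1)])
    b = np.concatenate(([1.0 + 0j], np.linalg.solve(C, -a[M + 1:2 * M + 1])))
    # Numerator coefficients by truncated convolution p = a * b.
    p = np.array([sum(b[m] * a[k - m] for m in range(k + 1)) for k in range(M + 1)])
    powers = s ** np.arange(M + 1)
    return (p @ powers) / (b @ powers)
```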
V Enforcing Constraints
V-A A priori constraints
As stated earlier, we treat the generation assignment as the output of a parameterized function $\pi$. Because of the reasoning laid out earlier, we require $\pi$ to be differentiable, and because of recent successes of NNs in nonlinear optimization, we choose $\pi$ to be a NN with a penultimate sigmoidal layer [zamzam2019learning]. We incorporate the generation limits into the output layer of the NN and thereby enforce generation limits by construction: every generator is associated with two neurons in the penultimate sigmoidal layer, one for active and one for reactive power, whose outputs are scaled and shifted by the respective active and reactive generation limits. Because the sigmoid output is bounded, non-slack generation limits cannot be violated. However, other constraints, such as voltage magnitude or thermal line limits, are not enforced by construction. That is why, in the next section, we show how to enforce what we call a posteriori constraints, i.e., constraints whose violation is only known after evaluating the power flow.
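The output-layer construction can be sketched as follows (illustrative function and parameter names; the paper's exact layer may differ):

```python
import math

def scale_to_limits(z_p, z_q, p_min, p_max, q_min, q_max):
    # Map a generator's two penultimate-layer activations to set points
    # inside its box limits via a sigmoid, so limits hold by construction.
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    p = p_min + sig(z_p) * (p_max - p_min)   # active power set point
    q = q_min + sig(z_q) * (q_max - q_min)   # reactive power set point
    return p, q
```

Since the sigmoid maps any real activation into (0, 1), the resulting set points can never leave the generator's limit box, regardless of how the network weights evolve during training.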
V-B A posteriori constraints
We adapt ideas from mathematical optimization to enforce arbitrary constraints on $\pi$. In mathematical optimization, the Karush-Kuhn-Tucker (KKT) conditions are necessary conditions for a solution to be optimal [gordon2012karush]. Given the optimization problem (1) expressed in terms of inequality constraints $g_i \le 0$, the KKT conditions state that a solution is locally optimal under some regularity conditions when there exist multipliers $\lambda_i$ such that:
$\lambda_i \ge 0$ (Dual feasibility)

$\lambda_i\, g_i = 0$ (Complementary slackness)

$g_i \le 0$ (Primal feasibility)

$\nabla c + \textstyle\sum_i \lambda_i \nabla g_i = 0$ (Stationarity)
Note that, without loss of generality (because any equality constraint can be expressed as two inequality constraints) and for notational convenience, we restrict the optimization problem to only have inequality constraints.
However, as stated earlier, we are not interested in the solution of a single constrained optimization problem but instead in solutions to all instances of a class of optimization problems. In this case, the demand assignment specifies the instance of the optimization problem, whereas the network topology, i.e., the admittance matrix $\mathbf{Y}$, specifies the class. First, we note that the KKT multipliers depend on the instance of the optimization problem; thus, instead of introducing scalars $\lambda_i$, we introduce a vector-valued function $\lambda_\phi$ of the demand. In order to enforce dual feasibility by construction, we choose $\lambda_\phi$ to be a NN with softplus outputs, parameterized by $\phi$, and let $\lambda_{\phi,i}$ denote its $i$-th output. We will now introduce a learning criterion and show that local optima of this criterion fulfill the KKT conditions for instances of the class contained in the training set. As a learning criterion we propose:
(19)  
(20) 
We will now show that, after convergence, for all instances in the training set, the solution is locally optimal under some regularity conditions, i.e., it fulfills the KKT conditions, and furthermore, that the KKT multipliers for which the KKT conditions hold are:

$\lambda_i = \lambda_{\phi,i}\, \mathbb{1}[g_i > 0]$  (21)

Dual feasibility: $\lambda_i$ is dual feasible by construction: it is either zero or greater than zero because $\lambda_{\phi,i}$ is the output of a softplus NN.

Complementary slackness: Follows directly from (21).

Stationarity: Follows directly from the assumption that (20) has converged. Note that substituting $\lambda_{\phi,i}\, \mathbb{1}[g_i > 0]$ for $\lambda_{\phi,i}$ does not have an influence: if a constraint is satisfied, the corresponding term vanishes (complementary slackness), and when it is violated, the substituted multiplier coincides with the network output.
Note that the detour of substituting $\lambda_{\phi,i}\, \mathbb{1}[g_i > 0]$ for $\lambda_{\phi,i}$ improves performance substantially. Without the substitution, as more constraints become satisfied during training, the NN drives the outputs before the softplus nonlinearity towards large negative values in order to make the corresponding multipliers equal to 0. The output units are then 'dead' and, because the gradient of the output nonlinearity is close to 0, will always stay 0.
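Since (19)-(21) are referenced abstractly above, the following hedged sketch shows the shape of such a criterion: the actor's loss adds multiplier-weighted constraint terms, with the complementary-slackness substitution zeroing out multipliers of satisfied constraints. `actor_loss` and its arguments are hypothetical names, not the paper's code.

```python
import numpy as np

def actor_loss(cost, g, lam):
    # Penalized objective in the spirit of (19): cost plus
    # multiplier-weighted constraint functions, where the substitution
    # lam * 1[g > 0] (cf. (21)) removes satisfied constraints so their
    # multipliers cannot pull the solution off the constraint surface.
    lam_eff = lam * (g > 0)
    return cost + float(np.sum(lam_eff * g))
```

The multiplier network would be updated with the opposite sign on the same penalty term, mimicking dual ascent.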
V-C Enforcing Physicality
So far, we have shown how to enforce 'a priori' constraints, i.e., constraints whose violation is known before inferring nodal voltages, by construction, as well as 'a posteriori' constraints, i.e., constraints whose violation is known after inferring nodal voltages, by introducing a learning objective that, after convergence, will enforce the KKT conditions. However, we have not yet shown how to keep the function $\pi$ in the physical regime, i.e., how to prevent $\pi$ from producing a generation assignment for some demand such that no voltage assignment fulfills the power flow equations (2). An extreme example of a non-physical tuple, for any non-zero demand assignment, is the all-zero generation assignment.
First, we note that HELM will always produce complex nodal voltages, even for non-physical tuples. However, for non-physical tuples the power flow equations (2) will not hold, i.e., there is a mismatch between the RHS and LHS of (2). We quantify this mismatch by a measure $\Delta$, defined as the magnitude by which (2) is violated.
The goal now is to enforce $\Delta \le \epsilon$, with $\epsilon$ being some parameter which specifies when a power flow solution is deemed physical. Note that because $\Delta$ is a function of the generation assignment, in principle an additional inequality constraint could be introduced and enforced as an a posteriori constraint as described earlier. However, in our experience this approach struggles, i.e., the learning objective usually does not converge. Figure 2 gives an intuition as to why this is the case: it shows $\Delta$ on a 200-bus system as a function of a factor $\alpha$ that scales the generation of a physical tuple. Note that when $\alpha$ is either small or big, $\Delta$ is close to flat and therefore its gradient is close to 0. After randomly initializing the parameters of the function $\pi$, its guesses about optimal generation assignments will naturally be bad, which corresponds to scaling the optimal generation assignment with a small or big $\alpha$. However, the function cannot improve its guesses by gradient descent because the gradient will be close to 0.
In order to overcome this problem, we propose to optimize a proxy of the actual mismatch function. Note that an indicator of whether or not a solution is physical is whether or not the power series coefficients have converged to 0. Let $\bar{c}[n]$ be the mean magnitude of the $n$-th power series coefficients of all voltages. Figure 3 shows a scatter plot of the highest-order $\bar{c}[n]$ against the mismatch $\Delta$. Empirically, one can see that a small coefficient magnitude is a sufficient but not a necessary condition for a small mismatch. Thus, in order to enforce physicality, the coefficient magnitude can be minimized as a proxy for the mismatch. One might think that optimizing this proxy is unnecessarily restrictive, i.e., it excludes solutions where the power series coefficients did not converge to 0 but the corresponding voltages nevertheless fulfill the power flow equations. But, as we will show later, imposing voltage magnitude constraints naturally enforces physicality, and additionally minimizing the proxy is only required after the actor function is first initialized, in order to 'nudge' it into the physical regime.
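The proxy itself is a one-liner: given the coefficient array produced by a HELM implementation, the mean magnitude of the highest-order term serves as a differentiable convergence indicator (a sketch with our own names):

```python
import numpy as np

def series_proxy(c):
    # Mean magnitude of the highest-order voltage power-series coefficient
    # (array of shape [n_terms, n_bus]); near zero when the series has
    # converged, i.e. when the generation assignment admits a physical solution.
    return float(np.mean(np.abs(c[-1])))
```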
VI Binary Constraints
Binary constraints naturally occur in optimal power flow when incorporating the possibility of completely shutting down generators. Introducing this constraint, also known as unit commitment, makes generation limit constraints nonconvex, i.e., zero output becomes a possible generation assignment even though points between zero and the minimum generation limit are not valid. Typically, optimal power flow solvers employ mixed integer programming techniques such as branch-and-bound or branch-and-cut [lawler1966branch] algorithms to tackle this problem. However, these algorithms can incur substantial computational cost, i.e., every branch requires solving an LP-relaxed optimal power flow problem, and there are exponentially many branches in a worst-case scenario.
However, using the problem formulation introduced here, because the unit commitment constraint is a priori and can be enforced by construction, we can reduce the nonconvex constraint into the problem of inferring the mode of a probability distribution over binary configurations. As we will show later, because inference in this distribution is intractable, we optimize a variational bound: we introduce a variational distribution for which posterior inference is tractable and choose the variational parameters in such a way that it best approximates the true posterior. Specifically, we build on recent advances in Bayesian inference, namely Variational Inference [wainwright2008graphical], and train a variational distribution parameterized by a NN. See [blei2017variational, zhang2017advances] for recent reviews of Variational Inference.

We begin by showing that computing the optimal binary configuration is equivalent to computing the mode of a distribution. Let $\mathbf{b}$ be the vector describing which generators are turned on or off, let (22) define an exponential distribution over $\mathbf{b}$, let (23) be its posterior (a Boltzmann distribution) and let the loss be as defined in (19):

(22)

(23)
It is easy to see that computing the mode of (23) is equivalent to choosing the binary configuration that results in the smallest loss. However, naïve evaluation of the mode is usually intractable because of the intractable denominator: naïvely computing the mode of (23) is equivalent to brute-force search, i.e., enumerating all possible latent configurations and picking the one with the smallest error. In the following, we show how ideas from Variational Inference [wainwright2008graphical] can reduce the computational burden of inference.
We introduce a variational distribution $q$ whose posterior is tractable. Specifically, we choose $q$ to be a multivariate Bernoulli distribution and ensure tractability with ideas introduced in [lange2018factornet]. Note that $q$ is parameterized with a NN, therefore ensuring that inference at test time is fast. As a learning signal for the parameters of the auxiliary posterior distribution $q$, we choose the Evidence Lower Bound (ELBO) defined by:

(24)

(25)
Note that optimizing (24) does not require knowledge of the intractable posterior but nevertheless allows for minimizing a divergence measure between the true and auxiliary posteriors. Thus, after training, in order to obtain an approximation of the mode of the true posterior, because the two distributions will be maximally similar, posterior inference is performed on $q$ instead. However, the price for this 'trick' is increased variance: it can be shown that the stochastic gradient estimator of (24) is an unbiased but higher-variance estimator of the KL divergence [mnih2014neural]. In order to combat variance, a decades-old variance reduction technique is employed, namely sampling without replacement. Sampling without replacement from $q$ is not trivial; however, there is a considerable body of pre-existing work that we make use of. The sampling scheme introduced in [shah2018without] is employed with slight modifications: instead of using the Pareto sampler as the underlying sampling mechanism, a slightly slower but more accurate elimination sampler introduced in [deville1998unequal] is used.

In order to obtain an approximation of the mode of the true posterior, because $q$ allows for drawing samples efficiently, many samples are drawn from $q$. Then, out of the many binary configurations sampled from $q$, the one which results in the smallest generation cost is chosen as an approximation of the mode. Note that the optimal configuration of generators is dependent on the binary configuration; thus, the sampled configuration should additionally be fed into the actor function $\pi$. Figure 4 shows a graphical depiction of the proposed data pipeline.
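The sample-and-select step can be sketched as follows; this simplified version samples i.i.d. rather than without replacement, and all names are illustrative:

```python
import numpy as np

def best_commitment(probs, loss_fn, n_samples=50, seed=0):
    # Draw candidate on/off vectors from a factorized Bernoulli
    # variational distribution and keep the one with the smallest loss,
    # approximating the mode of the intractable true posterior.
    rng = np.random.default_rng(seed)
    samples = (rng.random((n_samples, len(probs))) < np.asarray(probs)).astype(float)
    losses = np.array([loss_fn(b) for b in samples])
    return samples[int(np.argmin(losses))]
```

Because only the argmin over samples is needed at test time, inference cost is a batch of cheap forward passes rather than a combinatorial search.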
VII The LOPF Algorithm
In this section, we summarize the resulting algorithm, which we call Learning Optimal Power Flow, or LOPF for short. The algorithm iterates over batches of the data set, making updates to the three constituent NNs: the actor network, the multiplier network and the variational network. It is described in pseudocode in Algorithm 1. For notational convenience, we define an auxiliary function:
Figure 5 shows a graphical depiction of the input/output relationships of the individual networks. Note that the actor network not only produces active and reactive generation assignments for non-slack generators but also the voltage at the slack bus. Furthermore, the magnitudes by which constraints are violated are fed into the network that produces a proxy of the Lagrange multipliers; additionally feeding these violation magnitudes into that network eases and speeds up learning considerably.
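The high-level structure of one training iteration, as described in the preceding sections, can be paraphrased as follows (a hedged sketch with illustrative symbols, not a verbatim copy of Algorithm 1):

```text
for each mini-batch of demand assignments from the data set:
    sample candidate on/off configurations from the variational network q
    actor network pi -> generation set points and slack voltage
                        (a priori generation limits hold by construction)
    differentiable HELM solve -> nodal voltages
    evaluate a posteriori constraint violations and the series proxy
    descend: actor loss = cost + multiplier-weighted violations + proxy
    ascend:  multiplier network on the same penalty term (KKT conditions)
    descend: variational network on the ELBO-based objective (24)
```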
VIII Experiments
Since this work introduces a learning-based approach to the problem of ACOPF, the performance of the algorithm is evaluated empirically, similarly to how the performance of reinforcement learning agents is evaluated. Specifically, given a held-out test set of load flow problems that the system was not presented with during training, the generation cost and whether or not the system was able to find a feasible solution are recorded.
The requirements for feasibility, excluding those that are met by construction, are the following:

Log-mismatch between the RHS and LHS of the power flow equations (2) must be smaller than a fixed threshold.

Slack active and reactive generation are within limits.

Non-slack voltage magnitude constraints are met.
The experiments were conducted on the 200-bus Illinois test case [zimmerman2011matpower]. However, since the IEEE test cases only contain a single demand assignment, the demand base case was superimposed with temporal patterns extracted from the RE Europe data set [re_europe], which contains historical demand for three years at an hourly interval. The temporal patterns were imposed such that the mean demand of every node is equal to the demand in the test case and such that the ratio between mean and standard deviation as seen in the RE Europe data set is preserved. The data set was separated into a training set (20,280 data points) and a test set (6,000 data points) when conducting experiments.
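The described rescaling amounts to a single multiplicative factor per node: multiplying a temporal pattern by one constant sets its mean to the base-case demand while leaving the mean-to-standard-deviation ratio unchanged (function name illustrative):

```python
import numpy as np

def impose_pattern(base_demand, pattern):
    # One multiplicative factor: the mean becomes the base-case demand and
    # the coefficient of variation of the temporal pattern is preserved.
    pattern = np.asarray(pattern, dtype=float)
    return pattern * (base_demand / pattern.mean())
```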
The NNs used in these experiments are standard fully connected three-layer networks with intermediate activations; all intermediate layers have the same number of hidden units.
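Such a network can be sketched in a few lines of numpy. The layer widths, the tanh activation, and the initialization scheme below are illustrative assumptions; the paper does not pin them down here.

```python
import numpy as np

def init_params(sizes, rng):
    """Initialize a fully connected network with the given layer sizes
    (scaled Gaussian weights, zero biases; choice of init is illustrative)."""
    return [(rng.standard_normal((m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(x, params, act=np.tanh):
    """Forward pass of a three-layer fully connected network with
    nonlinear intermediate activations and a linear output layer."""
    (W1, b1), (W2, b2), (W3, b3) = params
    h1 = act(x @ W1 + b1)
    h2 = act(h1 @ W2 + b2)
    return h2 @ W3 + b3
```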
                             LOPF       MIPS
Feasible [%]                 99.86      60.85
Mean Cost [USD]              33325.98   25817.55
Mean time per Instance [s]   1.2        14.4
IX Results
As stated earlier, learning-based approaches to control problems are usually not guaranteed to be optimal but can offer advantages in terms of computational time and robustness. The performance of LOPF reinforces these expectations. Figure 7 (left) shows the percentage of load flow solutions produced by the system that violate any requirement for feasibility as a function of learning steps. One learning step encompasses 32 load flow problems, and for each of these 32 load flow problems, 50 candidate binary configurations are drawn from the auxiliary distribution. One can see that the system quickly learns to produce feasible solutions: initially, the system produces feasible solutions for none of the load flow problems, but after just 30 steps, close to all solutions proposed by the system are feasible.
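The candidate-sampling step described above can be sketched as follows. This is a hedged illustration only: the paper does not specify here how the 50 sampled binary configurations are scored, so the keep-the-cheapest-feasible selection, the callables, and all names are assumptions.

```python
import numpy as np

def best_feasible_candidate(sample_config, evaluate, n_candidates=50, rng=None):
    """Draw binary generator configurations from an auxiliary
    distribution and keep the cheapest feasible one.

    sample_config(rng) -> binary configuration (placeholder for the
                          learned auxiliary distribution)
    evaluate(cfg)      -> (feasible: bool, cost: float), placeholder for
                          the solver/NN pipeline
    """
    if rng is None:
        rng = np.random.default_rng()
    best_cost, best_cfg = np.inf, None
    for _ in range(n_candidates):
        cfg = sample_config(rng)
        feasible, cost = evaluate(cfg)
        if feasible and cost < best_cost:
            best_cost, best_cfg = cost, cfg
    return best_cfg, best_cost
```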
Figure 7 (right) shows the mismatch between the RHS and LHS of the power flow equations (2). Note that the proposed approach initially produces generation assignments for which the HELM solver is unable to produce voltage phasors that fulfill the power flow equations; however, by minimizing the power series coefficients as described in Section V-C, the system is quickly nudged into a regime where the proposed solutions fulfill the power flow equations. After approximately 75 learning steps, for a short period of time, the system again produces generation assignments that do not fulfill the power flow equations. This can most likely be explained by the fact that the system also tries to minimize cost: by searching for cheaper generation assignments, the system left the regime in which solutions can be found by HELM, but was then steered back into this regime.
Table I showcases the performance of our proposed algorithm in comparison to the MIPS solver proposed in [zimmerman2011matpower]. In order to deal with the nonconvex generation limit constraints, the MIPS solver was run with a unit-decommitment heuristic (runuopf). When obtaining the results for the MIPS solver, all initializations were unchanged and only the demand was varied in the way described earlier. Slightly varying the demand reveals the weakness of traditional solvers: convergence to a feasible solution cannot be guaranteed. In our experiments, the MIPS solver produced solutions which comply with all constraints and fulfill the power flow equations in only about 61% of all problem instances (failure in 2349 out of 6000 instances). Our proposed solution produces feasible solutions for 99.86% of the problem instances (failure in 8 out of 6000 cases; when LOPF fails, it slightly violates voltage magnitude constraints). On top of that, our proposed learning-based approach is considerably faster than optimization-based approaches: because solutions are obtained by feeding a demand assignment through the NN, and the forward pass through NNs is usually fast, obtaining the generation assignment proposed by the system is fast. Note that the time per instance we report for the MIPS solver is the mean over all load flow problems. However, when the solver fails, it usually fails quickly; if only the time per successful instance were reported, the mean time per instance of the MIPS solver would be close to 30s.
However, Table I and Figure 6 also reveal the main weakness of our proposed learning-based approach. Even though solutions can be obtained robustly and quickly, the proposed learning system does not find solutions that are optimal in terms of generation cost. On average, the solutions that the approach produces are approximately 29% more expensive than the solutions found by the MIPS solver. Note that the average cost is reported only for those load flow problems for which both approaches yielded feasible solutions.
X Conclusion and future work
The main contribution of this paper is the introduction of a learning-based framework for the problem of ACOPF that respects the full nonlinear ACOPF equations. Specifically, we introduce a learning-based approach in which a function is tasked with producing feasible and minimal-cost generation assignments as a function of the demand. A learning signal for this function is obtained by differentiating through the operators of a load flow solver. Furthermore, we show how convex security constraints and nonconvex generation limit constraints can be enforced. The resulting system produces feasible solutions quickly; however, these solutions are not necessarily optimal in terms of generation cost.
An obvious future research path is to close the optimality gap. At this moment, because of the complexity of the resulting system, it is hard to understand why the solutions are not optimal. Note, however, that the proposed algorithm cannot be optimal by design, because the slack generator cannot be decommitted and there is a natural interpolation between load flow solutions. Nevertheless, in the opinion of the authors, performance gains in terms of optimality should be possible.
Another potentially interesting research question is whether the trained auxiliary distribution, which learns the cost surface as a function of the binary generator configuration, allows for conditional sampling. Imagine a scenario where generators have failed. In such a scenario, it is paramount to quickly reconfigure the network into a feasible state. If it is possible to sample from the auxiliary distribution conditioned on the failed generators being off, then the proposed learning-based approach could potentially find application in emergency and security-sensitive situations. Note that this seemingly easy problem is not trivial because of the FactorNet [lange2018factornet] structure of the auxiliary distribution.