Nomenclature

Variable: Definition

$\mathcal{N}$: Set of buses, $\mathcal{N} = \{1, \dots, N\}$.

$\mathcal{N}_{PV}$: Set of PV buses.

$\mathcal{N}_{PQ}$: Set of PQ buses.

$\mathcal{E}$: Set of branches.

$P_{Gi}$: Active power generation on bus $i$.

$P_{Gi}^{\min}$: Minimum active power generation on bus $i$.

$P_{Gi}^{\max}$: Maximum active power generation on bus $i$.

$P_{Di}$: Active power load on bus $i$.

$Q_{Gi}$: Reactive power generation on bus $i$.

$Q_{Gi}^{\min}$: Minimum reactive power generation on bus $i$.

$Q_{Gi}^{\max}$: Maximum reactive power generation on bus $i$.

$Q_{Di}$: Reactive power load on bus $i$.

$V_i$: Voltage magnitude on bus $i$.

$V_i^{\min}$: Minimum voltage magnitude on bus $i$.

$V_i^{\max}$: Maximum voltage magnitude on bus $i$.

$\theta_i$: Voltage phase angle on bus $i$.

$\theta_i^{\min}$: Minimum voltage phase angle on bus $i$.

$\theta_i^{\max}$: Maximum voltage phase angle on bus $i$.

$G$: Bus conductance matrix.

$B$: Bus susceptance matrix.

$S_l^{\max}$: Transmission limit on branch $l$.

We use $|\cdot|$ to denote the size of a set. Note that for buses without generators, the corresponding generator output as well as the minimum/maximum bounds of the generator output are 0. Without loss of generality, let the first bus be the slack bus.
I. Introduction
Due to its superb scalability and capability, the Deep Neural Network (DNN) model has become increasingly advantageous for solving large-scale optimization problems, e.g., network congestion control [jay19a]. Motivated by this, we leverage DNNs to address the essential alternating current optimal power flow (ACOPF) problem [carpentier1962contribution] in power system operation.
The objective of the ACOPF problem is to minimize the cost of power generation subject to the power-flow balances and the operational constraints on generation, voltages, and branch flow [johnson1989electric]. The ACOPF problem is NP-hard [bienstock2019strong] and thus challenging to solve. It has received considerable attention, and various works have focused on it since the 1960s; for a comprehensive review, please see, e.g., [frank2012optimal1, frank2012optimal2] and the references therein. The traditional approach is to tackle the problem using heuristics and approximations. Recently, researchers have developed two categories of new approaches. The first is to solve convex relaxations of the original ACOPF problem. This approach provides a lower bound for the original OPF problem, and can recover the optimal solution to the original ACOPF problem under some conditions
[low2014convex1, low2014convex2]. The second is to leverage learning techniques to facilitate the solving process, either by determining the active/inactive constraints or by approximating the final solution, so as to eliminate unnecessary iterations.

The feasibility issue is critical in solving ACOPF. When the system conditions change rapidly, or the system reaches its security margins, even a small violation may cause dramatic damage to the whole system. For example, violating the power-flow balance can destabilize the system, and violating the flow limit on a single branch can cause cascading failures. However, current learning-based works cannot ensure that the power-flow balances and the operation limits are satisfied simultaneously. For this reason, we develop a feasibility-optimized DNN approach, named DeepOPF, to solve the ACOPF problem in this paper. To guarantee the power-flow balances, we first train a DNN model to predict a set of independent operating variables and reconstruct the remaining dependent variables by solving the AC power-flow equations. Moreover, we employ a penalty approach in training the DNN to ensure that the reconstructed solutions (generation, voltages, and branch flow) satisfy the corresponding operation limits. However, the lack of an explicit-form expression for the penalty gradients makes conventional first-order gradient-based training algorithms inapplicable. To address this issue, we design a zero-order optimization-based training algorithm to guide the training process. Compared with existing learning-based approaches, DeepOPF not only guarantees the power-flow balances but also ensures that the obtained solutions satisfy the operation limits on generation, voltages, and branch flow.
The contributions of this work can be summarized as follows. After a brief review of the ACOPF problem in Sec. III, we describe the framework of DeepOPF in Sec. IV. To preserve the feasibility of the generated solution, DeepOPF is designed based on a two-step predict-and-reconstruct framework and integrates a penalty approach into the training process. Also, we leverage a zero-order optimization technique to tackle the lack of explicit-form first-order penalty gradients. We carried out simulations using Pypower [tpcwTrey1] and summarize the results in Sec. V. Simulation results show the usefulness of the penalty approach, and that DeepOPF speeds up the computing time by one order of magnitude with minor optimality loss as compared to conventional approaches.
II. Related Work
We focus on learning-based methods for solving the OPF problem. Existing works can be divided into two categories.
The first category is the hybrid approach, in which a conventional solver applies learning techniques to accelerate the solving process [venzke2020neural, chen2020learning, chen2020hot, biagioni2019learning, baker2019learning, jamei2019meta, deka2019learning, karagiannopoulos2019data, baker2019joint, halilbavsic2018data, ng2018statistical, misra2018learning, vaccaro2016knowledge, canyasse2017supervised, gutierrez2010neural]. For example, learning methods are used to predict the initial iteration point [baker2019learning, jamei2019meta] or to determine the active constraint set to reduce the problem size [deka2019learning, ng2018statistical]. However, these methods still resort to an iterative process and incur high computational complexity for large-scale power systems. Note that DeepOPF and the approaches that remove inactive constraints or determine active constraints apply orthogonal ideas to reduce the computing time for solving OPF problems, and it is possible to combine the two to achieve better speedup. Specifically, the approaches that remove inactive constraints or determine active constraints achieve speedup by reducing the size of the ACOPF problem, whereas DeepOPF achieves speedup by employing a DNN-based ACOPF solver. It is conceivable to first reduce the size of an ACOPF problem by removing the inactive constraints and then apply DeepOPF to solve the size-reduced problem.
The second category is the end-to-end learning-based approach, in which the learning method directly generates the solution to the OPF problem. Some works [deepopf1, deepopf2, fioretto2020lagrangian, owerko2020optimal, guha2019machine, zamzam2019learning, fioretto2019predicting, dobbe2019towards, sanseverino2016multi] use supervised learning to train a model (e.g., a DNN) to generate the final solution from the given load input. Others [zhou2020deriving, yan2020RealTimeOP] use reinforcement learning to train an agent that obtains the final solution by mimicking the iteration process of a conventional solver from a given initial point and load. The supervised-learning-based approaches are usually faster than the reinforcement-learning-based ones, as the latter's agent obtains the final solution via iteration. The proposed
DeepOPF approach belongs to the supervised-learning-based approaches. However, for the approaches in this category, the principal challenge is ensuring that the generated solutions satisfy the power-flow balances and the operation limits simultaneously. Existing end-to-end supervised-learning-based works cannot ensure the feasibility of the generated solutions. For example, [guha2019machine, zamzam2019learning, fioretto2019predicting] directly predict all variables, which requires a larger-scale DNN model (possibly hard to train) and cannot ensure the power-flow balances. Some of them also develop methods to predict a subset of variables and reconstruct the remaining ones by solving the AC power-flow equations. However, they do not consider the operation limits on generation, voltages, or branch flows during the training process, due to the difficulty of calculating the penalty gradients, which makes such training infeasible. Compared with these existing methods, the proposed DeepOPF tackles the above issues. On the one hand, DeepOPF leverages the predict-and-reconstruct framework to keep the power-flow balances satisfied. On the other hand, we employ a penalty approach when training the DNN and use a zero-order optimization technique in the training algorithm to ensure that the reconstructed solutions satisfy the corresponding operation limits. Thus, DeepOPF can ensure that the generated solution satisfies the power-flow balances and the operation limits on generation, voltages, and branch flow, without the need for a large-scale DNN model.

III. Mathematical Formulation for ACOPF
We focus on the standard formulation of the ACOPF problem with the bus injection model [cain2012history]. The objective is to minimize the total generation cost subject to the power-balance equations, the generation operation limits, the voltage operation limits, and the branch-flow limits. We first introduce the bus admittance matrix. Let $z_{ij}$ denote the complex impedance of the transmission line between bus $i$ and bus $j$, with corresponding admittance $y_{ij} = 1/z_{ij}$; note that $y_{ij} = y_{ji}$. Let $y_i$ denote the shunt admittance of bus $i$ to ground. We can define the bus admittance matrix $Y = G + \mathrm{j}B$ as:

$Y_{ij} = \begin{cases} y_i + \sum_{k \neq i} y_{ik}, & i = j, \\ -y_{ij}, & i \neq j, \ (i,j) \in \mathcal{E}, \\ 0, & \text{otherwise.} \end{cases}$
Then, the bus injection model can be expressed as follows. For each bus $i \in \mathcal{N}$, we have:

$P_{Gi} - P_{Di} = \Re\Big\{ \tilde{V}_i \sum_{j \in \mathcal{N}} Y_{ij}^{*} \tilde{V}_j^{*} \Big\}$,  (1)

$Q_{Gi} - Q_{Di} = \Im\Big\{ \tilde{V}_i \sum_{j \in \mathcal{N}} Y_{ij}^{*} \tilde{V}_j^{*} \Big\}$,  (2)

where $\tilde{V}_i = V_i e^{\mathrm{j}\theta_i}$ is the complex bus voltage, and $\Re\{\cdot\}$ and $\Im\{\cdot\}$ denote the real and imaginary parts of a complex number. (1) and (2) are the active and reactive power-flow balance equations at each bus. Furthermore, we introduce the line admittance matrices $Y_f$ and $Y_t$ to represent the branch flow. Note that for a given branch $l = (i, j)$, the branch power flowing out of the source node $i$ differs from that arriving at the end node $j$, with the difference given by the losses along the branch. Both matrices are of size $|\mathcal{E}| \times N$, and we can define their entries as follows. Suppose row $l$ corresponds to branch $(i, j)$ with series admittance $y_{ij}$ and total charging susceptance $b_{ij}^{c}$ (transformer tap ratios are omitted here for simplicity); then

$(Y_f)_{lk} = \begin{cases} y_{ij} + \mathrm{j} b_{ij}^{c}/2, & k = i, \\ -y_{ij}, & k = j, \\ 0, & \text{otherwise,} \end{cases}$

and

$(Y_t)_{lk} = \begin{cases} -y_{ij}, & k = i, \\ y_{ij} + \mathrm{j} b_{ij}^{c}/2, & k = j, \\ 0, & \text{otherwise.} \end{cases}$

For details, please refer to [chatzivasileiadis2018lecture]. With the line admittance matrices, for each branch $l = (i, j) \in \mathcal{E}$, the branch flow can be expressed as:
$P_{ij} = \Re\Big\{ \tilde{V}_i \big( \textstyle\sum_{k \in \mathcal{N}} (Y_f)_{lk} \tilde{V}_k \big)^{*} \Big\}$,  (3)

$Q_{ij} = \Im\Big\{ \tilde{V}_i \big( \textstyle\sum_{k \in \mathcal{N}} (Y_f)_{lk} \tilde{V}_k \big)^{*} \Big\}$,  (4)

$P_{ji} = \Re\Big\{ \tilde{V}_j \big( \textstyle\sum_{k \in \mathcal{N}} (Y_t)_{lk} \tilde{V}_k \big)^{*} \Big\}$,  (5)

$Q_{ji} = \Im\Big\{ \tilde{V}_j \big( \textstyle\sum_{k \in \mathcal{N}} (Y_t)_{lk} \tilde{V}_k \big)^{*} \Big\}$.  (6)
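The balance and branch-flow expressions above can be evaluated numerically. The following NumPy sketch (function and variable names are our own, not from the paper) computes the power-flow mismatches of (1)-(2) and the branch flows of (3)-(6):

```python
import numpy as np

def power_flow_mismatch(Y, V, theta, Pg, Qg, Pd, Qd):
    """Active/reactive power-flow mismatches, as in (1)-(2), for all buses.

    Y: (N, N) complex bus admittance matrix; V, theta: voltage magnitudes
    and phase angles; Pg/Qg/Pd/Qd: generation and load vectors.
    """
    Vc = V * np.exp(1j * theta)        # complex bus voltages
    S = Vc * np.conj(Y @ Vc)           # complex power injections
    return (Pg - Pd) - S.real, (Qg - Qd) - S.imag

def branch_flows(Yf, Yt, V, theta, f_bus, t_bus):
    """Complex power at the 'from' and 'to' ends of each branch, as in
    (3)-(6), given the line admittance matrices Yf and Yt.
    """
    Vc = V * np.exp(1j * theta)
    Sf = Vc[f_bus] * np.conj(Yf @ Vc)  # flow out of the source node
    St = Vc[t_bus] * np.conj(Yt @ Vc)  # flow into the end node
    return Sf, St
```

With flat voltages (all magnitudes 1, all angles 0) and zero injections, all mismatches and branch flows are zero, which is a quick sanity check for the matrices.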
Then we can formulate the ACOPF problem as:

$\min \ \sum_{i \in \mathcal{N}} C_i(P_{Gi})$

s.t.  power-flow balance equations (1) and (2),

$P_{Gi}^{\min} \le P_{Gi} \le P_{Gi}^{\max}, \quad Q_{Gi}^{\min} \le Q_{Gi} \le Q_{Gi}^{\max}, \quad i \in \mathcal{N}$,

$V_i^{\min} \le V_i \le V_i^{\max}, \quad \theta_i^{\min} \le \theta_i \le \theta_i^{\max}, \quad i \in \mathcal{N}$,

$P_{ij}^2 + Q_{ij}^2 \le (S_l^{\max})^2, \quad P_{ji}^2 + Q_{ji}^2 \le (S_l^{\max})^2, \quad l = (i, j) \in \mathcal{E}$,

where $C_i(\cdot)$ is the generation cost function of bus $i$. As shown above, the objective function is the total cost of active power generation (or the total losses). In addition to the power-flow balance equations (1) and (2), the ACOPF problem considers the active and reactive generation limits, the voltage magnitude and phase angle limits on each bus, and the branch-flow limits. The ACOPF problem is NP-hard in general, due to the non-convex quadratic equality constraints (1) and (2) [bienstock2019strong]. Note that the formulation of the ACOPF problem in this paper is the standard formulation, which ignores other constraints such as security constraints, stability constraints, and chance constraints. We leave the incorporation of these constraints for future study.
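As an illustration of the inequality constraints in the formulation, the sketch below checks a candidate operating point against the generation, voltage, angle, and branch-flow limits. The dictionary keys and the function itself are our own illustrative choices, not part of the paper:

```python
import numpy as np

def check_acopf_feasibility(sol, lim, tol=1e-6):
    """Check a candidate operating point against the ACOPF inequality
    constraints (generation, voltage, angle, and branch-flow limits).
    `sol` and `lim` are dicts of numpy arrays; keys are illustrative."""
    ok = True
    # box constraints on generation, voltage magnitude, and phase angle
    for key in ("Pg", "Qg", "V", "theta"):
        lo, hi = lim[key + "_min"], lim[key + "_max"]
        ok &= bool(np.all(sol[key] >= lo - tol) and np.all(sol[key] <= hi + tol))
    # apparent-power branch limit: P^2 + Q^2 <= (S_max)^2 at both ends
    for p, q in (("Pf", "Qf"), ("Pt", "Qt")):
        ok &= bool(np.all(sol[p] ** 2 + sol[q] ** 2 <= lim["S_max"] ** 2 + tol))
    return ok
```

A check of this kind is what the evaluation in Sec. V relies on when reporting the feasibility rate of generated solutions.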
IV. A Feasibility-Optimized Deep Neural Network Approach for ACOPF
IV-A. Overview of DeepOPF
Fig. 1 presents the framework of DeepOPF. Overall, DeepOPF consists of two stages: the training stage and the inference stage. In the training stage, we apply a random sampling method to generate the load data and obtain the corresponding optimal solutions from a conventional solver (e.g., Pypower [tpcwTrey1]) as the ground truth. To ensure the feasibility of the generated solution during training, we apply the general predict-and-reconstruct (PR) approach proposed in our previous work [deepopf1, deepopf2]: we construct and train a DNN to predict a set of independent variables and reconstruct the remaining dependent variables by solving the power-flow equations. Moreover, through the penalty-approach-based training scheme, DeepOPF can ensure that the operating limits on generation, voltage magnitude, and branch flow are satisfied.
In the inference stage, we directly apply the DeepOPF approach to obtain the independent variables and then recover the dependent variables of the ACOPF problem for the given load input.
IV-B. Load Sampling and Preprocessing
To train the DNN model, we first need to sample the training data. The load data is sampled uniformly at random within a given range around the default values, which helps avoid overfitting. The sampled data is then fed into a traditional ACOPF solver to generate the optimal solutions. As the magnitudes of the dimensions of the input and output may differ, each dimension of the training data is normalized with the mean and standard deviation of the corresponding dimension before training.
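A minimal sketch of this sampling-and-normalization step might look as follows; the `delta` range, function names, and the fixed random seed are illustrative assumptions rather than the paper's exact settings:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility (illustrative)

def sample_loads(Pd_default, Qd_default, delta, num_samples):
    """Sample loads uniformly at random within +/- delta (a fraction)
    of the default values, independently for each load bus."""
    lo, hi = 1.0 - delta, 1.0 + delta
    scale_p = rng.uniform(lo, hi, size=(num_samples, Pd_default.size))
    scale_q = rng.uniform(lo, hi, size=(num_samples, Qd_default.size))
    return scale_p * Pd_default, scale_q * Qd_default

def normalize(data, eps=1e-8):
    """Per-dimension standardization with the training mean/std."""
    mean, std = data.mean(axis=0), data.std(axis=0)
    return (data - mean) / (std + eps), mean, std
```

The returned `mean` and `std` must be kept and reused to normalize the load inputs at inference time.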
IV-C. Predict-and-Reconstruct Framework
Recall that DeepOPF leverages the two-step PR framework to guarantee the AC power-flow balances. The first stage obtains predictions of a set of independent variables, which are then used in the second stage for solving the AC power-flow equations. We summarize the sets of independent and dependent variables at each type of bus [frank2016introduction] in Table I. The DNN is applied to predict the voltage magnitude on the slack bus. For the PV buses, we apply the DNN model to predict the active power generation and the voltage magnitude, $\{P_{Gi}, V_i\}$. Note that the voltage angle on the slack bus and the active and reactive power loads on the PQ buses are given and fixed; therefore, there is no need to predict them. Thus, once the set of independent variables is obtained in the first stage, the remaining dependent variables can be reconstructed directly by solving the AC power-flow equations in the second stage. The PR framework takes the dependency between the independent variables and the dependent variables into account, which reduces the mapping dimension and eases the training.
TABLE I: Independent and dependent variables at each type of bus

Type of bus: Slack | PV | PQ
Known (independent): $V_i$, $\theta_i$ | $P_{Gi}$, $V_i$ | $P_{Di}$, $Q_{Di}$
Unknown (dependent): $P_{Gi}$, $Q_{Gi}$ | $Q_{Gi}$, $\theta_i$ | $V_i$, $\theta_i$
IV-D. DNN Model
In DeepOPF, we apply a DNN model to approximate the mapping between the load and the optimal solution, which is established based on the multi-layer feed-forward neural network structure as follows:

$h_0 = x, \quad h_k = \sigma(W_k h_{k-1} + b_k), \ k = 1, \dots, K, \quad \hat{z} = \sigma_o(W_o h_K + b_o)$,

where $x$ denotes the input vector of the network, consisting of the active and reactive loads on the PQ buses; $h_k$ is the output vector of the $k$-th hidden layer; and $\hat{z}$ is the output vector, i.e., the generated scaling-factor vector for the predicted variables. The weight matrices $W_k$, bias vectors $b_k$, and activation functions $\sigma$ and $\sigma_o$ are subject to the DNN design. We adopt the Rectified Linear Unit (ReLU) as the activation function of the hidden layers.
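One possible PyTorch sketch of such a feed-forward network is the following. The sigmoid output layer and the 128/64/32 hidden sizes match the settings reported later in the paper, while the class and helper names are our own:

```python
import torch
import torch.nn as nn

class DeepOPFNet(nn.Module):
    """Feed-forward DNN: ReLU hidden layers and a sigmoid output layer,
    so every prediction is a scaling factor in (0, 1)."""

    def __init__(self, in_dim, out_dim, hidden=(128, 64, 32)):
        super().__init__()
        layers, prev = [], in_dim
        for h in hidden:
            layers += [nn.Linear(prev, h), nn.ReLU()]
            prev = h
        layers += [nn.Linear(prev, out_dim), nn.Sigmoid()]
        self.net = nn.Sequential(*layers)

    def forward(self, load):
        return self.net(load)

def scale_back(z, x_min, x_max):
    """Invert the linear scaling of (13): x = x_min + z * (x_max - x_min)."""
    return x_min + z * (x_max - x_min)
```

Because the sigmoid output lies strictly in (0, 1), the scaled-back variables automatically respect their box limits, which is the point of the scaling reformulation discussed next.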
According to Table I, the variables determined by the DNN in the prediction stage are the active power generation and the voltage magnitude of the PV buses, as well as the voltage magnitude on the slack bus. As these variables have inequality constraints, we can reformulate the corresponding constraints through linear scaling. For example, suppose the predicted variable $x$ needs to satisfy the inequality $x^{\min} \le x \le x^{\max}$. Then, we have the following reformulation:

$x = x^{\min} + z \left( x^{\max} - x^{\min} \right), \quad z \in [0, 1]$,  (13)

where $z$ is the scaling factor. Thus, we can obtain the value of $x$ by predicting the scaling factor $z$ and applying the inverse transformation. As the range of the scaling factor is from 0 to 1, the sigmoid function [goodfellow2016deepma] is applied as the activation function of the output layer to restrict the outputs of the network to $(0, 1)$.

IV-E. Penalty-Approach-Based Training Scheme
After constructing the DNN model, we need to design the corresponding loss function and training algorithm to guide the training.
IV-E.1) Loss Function
For each item in the training data set, the loss function consists of two parts. The first part is the difference between the predicted solution and the optimal solution obtained from the solver. Recall that the DNN is applied to predict the voltage magnitude on the slack bus and the active power generation and voltage magnitude on the PV buses. Thus, the output dimension of the DNN model is $2|\mathcal{N}_{PV}| + 1$. The prediction error is the mean square error between each element of the generated scaling factors $\hat{z}$ and the actual scaling factors $z$ in the optimal solution:

$\mathcal{L}_{\text{pred}} = \frac{1}{2|\mathcal{N}_{PV}| + 1} \sum_{k=1}^{2|\mathcal{N}_{PV}| + 1} \left( \hat{z}_k - z_k \right)^2$.  (14)
Recall that with the PR framework, DeepOPF guarantees that the power-flow balance is satisfied. However, the reconstructed solution (e.g., the reactive power generation on the PV buses, the voltage magnitude on the PQ buses, and the branch flow) may still violate the operation limits due to the inevitable prediction error of the DNN model. To address this issue, in addition to the above error-related loss term, we include a penalty term in the loss function. The penalty term captures the feasibility of the reconstructed variables. Given a reconstructed variable $x$ with operation limits $[x^{\min}, x^{\max}]$, the corresponding penalty function is as follows:

$p(x) = \max\{ x^{\min} - x, \ 0 \} + \max\{ x - x^{\max}, \ 0 \}$.  (15)
According to (15), the penalty term returns a nonzero value if the reconstructed variable is infeasible and zero otherwise. Moreover, the more the reconstructed variable violates the constraint, the larger the corresponding penalty term. Recall that we need to ensure that the reconstructed solutions, e.g., the reactive power generations on the PV buses, the voltage magnitudes on the PQ buses, and the branch flows, satisfy the corresponding operation limits. Thus, the penalty term in the loss function is computed as the average value of the penalty function over the reconstructed variables as follows:
$\mathcal{L}_{\text{pen}} = \frac{1}{M} \Big( \sum_{l \in \mathcal{E}} p(\hat{S}_l) + \sum_{i \in \mathcal{N}} p(\hat{\theta}_i) + \sum_{i \in \mathcal{N}_{PQ}} p(\hat{V}_i) + \sum_{i \in \mathcal{N}_{PV}} p(\hat{Q}_{Gi}) \Big)$,  (16)

where $\hat{S}_l$ is the reconstructed branch flow, $\hat{\theta}_i$ are the reconstructed voltage phase angles on all buses, $\hat{V}_i$ and $\hat{Q}_{Gi}$ are the reconstructed voltage magnitudes on the PQ buses and the reconstructed reactive power generations on the PV buses, and $M$ is the total number of penalized variables. The total loss can be expressed as the weighted summation of the two parts:
$\mathcal{L} = w_1 \mathcal{L}_{\text{pred}} + w_2 \mathcal{L}_{\text{pen}}$,  (17)

where $w_1$ and $w_2$ are positive weighting factors that balance the influence of each term in the training phase. These hyperparameters are determined by educated guesses and empirical tuning, which is common practice for DNN approaches in various engineering domains.
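The penalty and the weighted loss above can be sketched in PyTorch as follows; the linear penalty form, the helper names, and the default weights are our assumptions:

```python
import torch

def penalty(x, x_min, x_max):
    """Penalty in the spirit of (15): zero inside [x_min, x_max], growing
    linearly with the amount of violation outside the limits."""
    return torch.clamp(x_min - x, min=0.0) + torch.clamp(x - x_max, min=0.0)

def total_loss(z_pred, z_true, reconstructed, w1=1.0, w2=1.0):
    """Weighted sum of the prediction MSE and the mean constraint penalty.

    `reconstructed` is a list of (value, lower, upper) triples for the
    reconstructed quantities (reactive generation on PV buses, voltage on
    PQ buses, branch flow); w1, w2 are illustrative weighting factors.
    """
    pred = torch.mean((z_pred - z_true) ** 2)
    pens = [penalty(v, lo, hi).mean() for v, lo, hi in reconstructed]
    pen = torch.stack(pens).mean()
    return w1 * pred + w2 * pen
```

Note that gradients cannot flow through `reconstructed` in the real pipeline, since those values come out of an AC power-flow solve; that is exactly the difficulty the zero-order scheme below addresses.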
IV-E.2) Zero-Order Optimization for the Penalty Approach
The training process minimizes the average loss over the given training data by tuning the DNN model parameters as follows:

$\min_{W, b} \ \frac{1}{N_t} \sum_{k=1}^{N_t} \mathcal{L}^{(k)}$,  (18)
where $N_t$ is the amount of training data and $\mathcal{L}^{(k)}$ is the loss of the $k$-th training item. Note that the widely used training methods, e.g., stochastic gradient descent [goodfellow2016deepma] and Adam [adam], are first-order gradient-based algorithms, which require an explicit form of the gradient. However, the explicit-form expression of the penalty gradient is difficult to obtain, as it involves the complicated solving process of the AC power-flow equations. This issue makes first-order gradient-based training algorithms inapplicable. To tackle this critical issue, we design a training algorithm based on two-point zero-order optimization [nesterov2017random, ghadimi2013stochastic], in which we use an estimated gradient in the training stage. Recall that the DNN is applied to predict the voltage magnitude on the slack bus and the active power generation and voltage magnitude on the PV buses; thus, the output vector of the DNN model is $\hat{z} \in \mathbb{R}^{d}$ with $d = 2|\mathcal{N}_{PV}| + 1$. By introducing a random unit vector $u \in \mathbb{R}^{d}$, the estimated gradient of the penalty term w.r.t. the DNN's output can be computed as:

$\hat{\nabla}_{\hat{z}} \mathcal{L}_{\text{pen}} = \frac{d}{2\mu} \left( \mathcal{L}_{\text{pen}}(\hat{z} + \mu u) - \mathcal{L}_{\text{pen}}(\hat{z} - \mu u) \right) u$,  (19)

where $\mu$ is a small positive constant. With the zero-order optimization method, we can obtain the (estimated) penalty gradient and update the DNN's parameters via widely used first-order gradient-based training algorithms, e.g., Adam [adam], in the training stage.
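The two-point estimate of (19) can be sketched as follows. Here `f` stands for the penalty loss evaluated through the (non-differentiable) power-flow reconstruction, and all names are illustrative:

```python
import numpy as np

def zero_order_grad(f, y, mu=1e-4, rng=np.random.default_rng(0)):
    """Two-point zero-order gradient estimate of f at y, as in (19):
    g_hat = d/(2*mu) * (f(y + mu*u) - f(y - mu*u)) * u,
    with u a random unit vector and d = len(y). Only function values of f
    are needed, so f may internally solve the AC power-flow equations.
    The shared default `rng` is deliberate, so repeated calls draw fresh
    directions."""
    d = y.size
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)     # random unit direction
    return d * (f(y + mu * u) - f(y - mu * u)) / (2.0 * mu) * u
```

A single estimate is noisy, but its expectation over the random direction matches the true gradient, so averaging over minibatches (or epochs) lets a first-order optimizer such as Adam make progress.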
IV-F. Post-Processing
Similar to other approaches to the ACOPF problem, e.g., solving convex relaxations of the original problem, the proposed DeepOPF may obtain infeasible solutions. In view of this, a post-processing procedure is needed to recover a feasible solution from an infeasible one. Existing methods for recovering feasible solutions either project the generated solutions onto the feasible region or use them as initial points of an iterative solver. In this work, we focus on the effectiveness of the penalty approach in ensuring the feasibility of the generated solutions, so we show the performance without post-processing. Designing an efficient post-processing procedure is a potential direction, and we leave it for future work.
V. Numerical Experiments







TABLE II: Performance of DeepOPF on the IEEE 30-bus test case

Metric | DeepOPF | Ref.
Feasibility rate (%) | 98 | 10 (without the penalty term)
Cost ($/hr) | 567.5 | 562.6
Optimality loss (%) | 0.8 | -
Running time (ms) | 49 | 1219
V-A. Experiment Setup
V-A.1) Simulation environment
The simulation environment is CentOS 7.6 with a quad-core Intel i7-3770 CPU @ 3.40 GHz and 16 GB RAM.
V-A.2) Test case
The IEEE 30-bus mesh network provided by Matpower [zimmerman2010matpower] is used for testing. Table III shows the related parameters of the test case.
V-A.3) Training data
In the training stage, the load data is sampled uniformly at random within a given range around the default load on each load bus. We apply the ACOPF solution provided by Pypower [tpcwTrey1] as the ground truth. The amounts of training data and test data are 10,000 and 2,000, respectively.
V-A.4) Implementation of the DNN model
We implement the DNN model on the PyTorch platform and apply the Adam method [adam] to train the neural network. The number of training epochs is 800, and the batch size is 32. We set the weighting factors $w_1$ and $w_2$ of the loss function in (17) based on empirical experience. Table III shows the related parameters, e.g., the number of hidden layers and the number of neurons in each layer.








TABLE III: Parameters of the test case and the DNN model

Buses | Generators | Loads | Branches | Hidden layers | Neurons per hidden layer
30 | 5 | 20 | 41 | 3 | 128/64/32
V-A.5) Evaluation metrics
We evaluate the performance of DeepOPF using the following metrics, averaged over 2,000 test instances:

Feasibility rate: The percentage of test instances for which DeepOPF obtains a feasible solution.

Cost: The power generation cost and the corresponding optimality loss with respect to the reference solution.

Running time: The computation time of DeepOPF.

Speedup: The average over test instances of the ratio of Pypower's running time to DeepOPF's. Note that this average of ratios differs from the ratio of the average running times of Pypower and DeepOPF.
Note that we only evaluate performance for the test instances with feasible generated solutions in this paper.
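The distinction drawn in the speedup metric above is worth making concrete; in the small sketch below (with made-up timings), the two definitions give very different numbers:

```python
import numpy as np

def speedup_stats(t_ref, t_dnn):
    """Average of per-instance ratios vs. ratio of average times.
    The two definitions generally differ; the paper reports the former."""
    ratios = np.asarray(t_ref) / np.asarray(t_dnn)
    return ratios.mean(), np.mean(t_ref) / np.mean(t_dnn)

# Made-up timings for two instances: (10 ms vs 1 ms) and (100 ms vs 50 ms).
mean_ratio, ratio_of_means = speedup_stats([10.0, 100.0], [1.0, 50.0])
# mean_ratio = (10 + 2) / 2 = 6.0, while ratio_of_means = 110 / 51 ≈ 2.16
```

The average of ratios weights every instance equally, whereas the ratio of averages is dominated by the slowest instances.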
V-B. Performance Evaluation
We show the simulation results of the proposed approach for the test case in Table II. We can observe from Table II that, without any post-processing, the percentage of feasible generated solutions increases from 10% to 98% when the penalty term is applied. The improvement indicates that the DeepOPF approach rarely generates infeasible solutions, which demonstrates the usefulness of the penalty approach. For the remaining 2% of test instances for which DeepOPF generates infeasible solutions, the infeasibility is due to the violation of the voltage-magnitude upper limit on one PQ bus. Note that due to the non-convexity of the power-flow balance equations, there may exist a complicated implicit relationship between the predicted variables and the reconstructed variables. The approach without the penalty term cannot ensure the feasibility of the reconstructed variables even if the prediction error of the independent variables is small.
Also, the optimality loss (0.8%) is minor, as shown in Table II. This means each dimension of the generated solution has high accuracy when compared to that of the optimal solution. Apart from that, we can see that compared with the traditional ACOPF solver, our DeepOPF approach speeds up the computing time by a factor of about 25. To summarize, the proposed DeepOPF approach can accelerate the solving process of the ACOPF problem as compared to traditional iteration-based solvers, with only minor optimality loss compared to the optimal solution.
Note that with the zero-order optimization technique, for each training instance, the training algorithm needs to solve the AC power-flow equations twice in each training epoch. Also, the training algorithm needs more training epochs to converge due to the estimation error of the gradient. Thus, the training stage of DeepOPF takes a long time to finish. For instance, it takes roughly 10 minutes to finish one training epoch (thus about 133 hours for 800 training epochs) under the setting mentioned above. As the results on the IEEE 30-bus case have demonstrated the effectiveness and potential of the proposed DeepOPF approach, we leave the evaluation of DeepOPF on large-scale power networks for the future.
VI. Conclusion
In this paper, we develop a feasibility-optimized DNN approach, DeepOPF, for solving the ACOPF problem. To ensure that the power-flow balance constraints are satisfied, DeepOPF first predicts a set of independent variables and then reconstructs the remaining variables by solving the AC power-flow equations. Meanwhile, we design a lightweight penalty term to ensure the feasibility of the obtained solutions with respect to the inequality constraints. We further apply a zero-order optimization-based training algorithm to tackle the challenge of deriving the explicit-form penalty gradient. Simulation results show the effectiveness of the penalty approach, and that DeepOPF speeds up the computing time by one order of magnitude as compared to modern solvers with only minor optimality loss. We will conduct more numerical experiments to demonstrate the performance on larger-scale power systems in the future.
Acknowledgement
We thank Andreas Venzke for the discussions related to the study presented in the paper.