PhML-DyR: A Physics-Informed ML framework for Dynamic Reconfiguration in Power Systems

06/11/2022
by   Rabab Haider, et al.
MIT

A transformation of the US electricity sector is underway with aggressive targets to achieve 100% carbon pollution-free electricity by 2035. To achieve this objective while maintaining a safe and reliable power grid, new operating paradigms are needed that enable computationally fast and accurate decision making in a dynamic and uncertain environment. We propose a novel physics-informed machine learning framework for the decision of dynamic grid reconfiguration (PhML-DyR), a key task in power systems. Dynamic reconfiguration (DyR) is a process by which switch-states are dynamically set so as to lead to an optimal grid topology that minimizes line losses. To address the underlying computational complexities of NP-hardness due to the mixed nature of the decision variables, we propose the use of physics-informed ML (PhML), which integrates both operating constraints and topological and connectivity constraints into a neural network framework. Our PhML approach learns to simultaneously optimize grid topology and generator dispatch to meet loads, increase efficiency, and remain within safe operating limits. We demonstrate the effectiveness of PhML-DyR on a canonical grid, showing a reduction in electricity loss by 23% and improved voltage profiles. We also show a reduction in constraint violations by an order of magnitude, as well as in training time, using PhML-DyR.

1 Introduction

Power systems are an engineering marvel. They are quintessential illustrations of a well-functioning large-scale system that safely, reliably, and affordably delivers an essential service, electricity. In recent years, the essential properties of this large-scale system have expanded to include sustainability, precipitated by urgent and intense concerns about climate change National Academies of Sciences, Engineering, and Medicine [2021]. The result has been a transformation of the grid edge, with increasing penetration of distributed energy resources (DER), which include, among others, roof-top solar PV, community solar, batteries of various sizes, and electric vehicles.

The opportunity to provide new, green sources of power necessitates new decision-making paradigms. Current practices of decision making in power systems consist of static optimization tools at slower time-scales for applications in planning and markets, and dynamic tools at faster time-scales for applications in prediction, estimation, and control. The proliferation of DERs introduces dynamic signatures across the power system landscape, pushing for dynamic decision making. Dynamic decisions require new tools for computationally faster and accurate decisions in the presence of operating constraints. It is in this regard that machine learning (ML) is being investigated.

An important decision in power systems involves grid reconfiguration, a highly dynamic phenomenon. Grid reconfiguration involves the selection of switch-states (open/closed) in the grid to ensure that loads are met with available generation resources, while satisfying voltage and line flow operating constraints. At the transmission (bulk) level, reconfiguration is a typical response to contingencies, such as the failure of a generator or line. At the distribution level, reconfiguration can be used to reduce electrical losses involved in carrying power from bulk resources and DERs to individual loads (commercial businesses, residential units). The latter, increasing efficiency in the distribution grid with high DER penetrations, is the focus of this work. Static reconfiguration (StatR) determines a fixed set of switch-states that optimize losses over a long term. But the introduction of DERs necessitates dynamic reconfiguration, where local DERs supply loads in closer proximity to them, thus reducing losses and improving voltage profiles across the grid. Dynamic reconfiguration (DyR) identifies the switch-states (i.e. grid topology) which minimize losses for the given load and generation conditions. This is not possible with either traditional power systems approaches, which rely on heuristics and operator experience, or traditional optimization approaches, which can be computationally intractable.

This paper demonstrates a novel use of physics-informed ML (PhML), suitably modified to incorporate the physical constraints present in the reconfiguration problem, both equality and inequality, and to optimize a relevant cost function. More importantly, this optimization involves solving a mixed integer problem (MIP), which combines continuous and binary variables. Exactly solving MIPs, even those with only linear constraints, is NP-hard, and known exact algorithms require exponential time in the worst case. Thus using traditional optimization methods for DyR is intractable, especially when fast decisions and a high degree of accuracy are required to remain within operating constraints. Further, traditional solvers cannot take advantage of the structure present in repeatedly solving an optimization problem, and warm-start techniques may struggle when parameters vary rapidly, such as with solar generation forecasts. Instead, ML can be used to learn the underlying problem structure and, when modified in a physics-informed approach, can provide good and fast solutions which satisfy operating constraints. We incorporate in PhML-DyR a novel physics-informed rounding heuristic and overcome the challenges of the mixed decision variables. Using PhML-DyR, we carry out a study of a 33-node grid with 7 switches and 25% DER penetration in the form of community solar. The proposed PhML-DyR framework leads to a 23% reduction in line losses as compared to no reconfiguration, and a 2% reduction in voltage violations throughout the year as compared to StatR. Our physics-informed rounding significantly improves the training and performance of the ML framework, with an order of magnitude reduction in inequality constraint violation as well as in training time.

Our contributions can be summarized as follows: (1) PhML-DyR: An ML framework for dynamic reconfiguration, which carries out mixed integer optimization employing a novel physics-informed rounding heuristic. PhML-DyR is shown to reduce inequality constraint violations by an order of magnitude, eliminate line flow violations, and improve training performance; and (2) Case study on a canonical grid: PhML-DyR results in upwards of 23% reduction in line losses as compared to no reconfiguration, and 2% fewer voltage violations throughout the year as compared to StatR.

2 Related Work

Traditional Methods for Reconfiguration

The reconfiguration problem has been extensively studied in the context of transmission grids, where normally closed switches (NCS) and normally open switches (NOS) are used to ensure demand is met without violating line thermal constraints, and during contingencies. Classical literature employs reconfiguration for loss reduction Baran and Wu [1989], using single loop optimization Fan et al. [1996] and heuristics for approximating losses without extensive power flow calculations Civanlar et al. [1988]. With increased computing abilities and successful commercial MIP software, literature in the 2000s focused on modeling the reconfiguration problem as a convex problem Taylor and Hover [2012], with a particular focus on the radiality constraint of the resulting grid topology Lei et al. [2020], Wang et al. [2020], Ahmadi and Martí [2015], Lavorato et al. [2012], to be solved by classical optimization methods. Despite the advances in MIP solvers, reconfiguration for realistic power grids remained computationally intractable Silva et al. [2021]; instead, the literature has proposed various heuristic methods (including genetic, particle swarm, fuzzy multi-objective, and harmony search algorithms), and other iterative heuristics involving constraint elimination and repeatedly solving smaller sub-optimization problems Crozier et al. [2022]. However, none of these techniques provide guarantees of optimality, and they continue to be computationally prohibitive in dynamic applications.

ML for Power Systems

There is a growing body of literature employing ML techniques for various tasks in power systems, including optimal power flow (OPF) Pan et al. [2019], Zhao et al. [2020], Donti et al. [2021], Zhou et al. [2020], Dobbe et al. [2020], Fioretto et al. [2020], Zamzam and Baker [2020], Chen et al. [2022], probabilistic power flow Yang et al. [2020], security constrained unit commitment Gutierrez-Martinez et al. [2011], Pan et al. [2021], Xavier et al. [2020], fault isolation Li et al. [2019], and reconfiguration Junlakarn and Ilić [2014], Yin et al. [2020], Yang and Oren [2019], Subramanian et al. [2021]. These works all leverage the significant reduction in computational runtimes of ML-based methods as compared to classical methods. ML enables practical implementation of online decision making involving complex mathematics (nonconvex equations versus linear), frequent decision making (dynamic decisions versus static), and higher accuracy (as compared to heuristic methods). Physical constraints of voltage, line flow, and generator limits can be incorporated through loss functions with Lagrangian duals Fioretto et al. [2020], projection algorithms to ensure feasibility Pan et al. [2019], Zhao et al. [2020], and gradient-based algorithms to learn feasibility Donti et al. [2021]. Prior works show that ML-based approaches can often improve upon existing model-based and optimization-based approaches. The order of magnitude improvement with fast ML-based OPF compared to classical methods using state-of-the-art solvers is of note Pan et al. [2019]. We further argue that physics-informed ML and heuristics have the potential to significantly improve the reliable, affordable, and green operations of power systems, even beyond what systems currently experience. For example, leveraging domain knowledge of decoupled power flow can improve the computational performance of neural network weight updates Yang et al. [2020]. Note that in the papers mentioned above, the underlying decision variables have been continuous variables. Power systems problems frequently involve decisions in the mixed-integer space, with switch-states in reconfiguration being key levers for managing power flow, especially at the grid edge. In this context of reconfiguration, current practice has been to use decision trees and other heuristic classification methods. However, other ML methods such as neural networks have been shown to perform better Yang and Oren [2019]. In this paper, we not only use ML but carefully integrate it with physics-informed tools, leading to a novel framework with significant improvement in performance, termed PhML-DyR.

ML for Optimization

The need for fast and repeated solutions to optimization problems has pushed the community towards ML-based approaches over recent years. While some focus has been on learning new algorithms for optimization Li and Malik [2016] or directly optimizing within a neural framework Donti et al. [2021], the vast majority of effort has been in improving the performance of existing optimization solvers and reducing computational time. This includes hyperparameter optimization Maclaurin et al. [2015]; identifying active constraint sets for a general optimization problem Misra et al. [2021] and for OPF Deka and Misra [2019]; learning warm start techniques Baker [2019], Xavier et al. [2020]; developing an understanding of strategies used by optimization solvers Bertsimas and Stellato [2021]; and for the class of MIPs, the design of primal heuristics Wang et al. [2022], Shen et al. [2021], including neural diving Nair et al. [2020] and neural branching Gasse et al. [2019], Gupta et al. [2020]. Recent surveys on advances in ML approaches for MIP include a review of variable branch selection, cutting plane methods, and heuristics such as feasibility pump algorithms Zhang et al. [2022]. Another survey focuses on recent advances in employing graph neural networks for combinatorial optimization (of which MIPs are a member), either directly as solvers or by enhancing exact solvers Cappart et al. [2021]. These surveys and the references within present a detailed report of the state of the art. Within the space of ML for MIPs, few works look at applying techniques to directly optimize (rather than improving existing techniques). In PhML-DyR, we learn how to optimize the reconfiguration MIP directly using an unsupervised neural network, and employ a physics-informed rounding heuristic to significantly improve prediction performance.

3 ML-based Reconfiguration

We propose PhML-DyR, a physics-informed ML-based dynamic reconfiguration framework. It is composed of a neural network which is suitably modified to incorporate grid physics, operating constraints, and salient topological and connectivity constraints. Figure 1 shows the proposed PhML-DyR framework. The unsupervised neural network is composed of 5 key components:

  • Neural network: a simple neural network with a sigmoidal output layer. The neural network predicts the independent variables from the input data. The output of the neural network is divided into two sets: predictions for the switch-states, and the other independent variables. Note that the switch-state set consists only of binary variables, while the other set is mixed integer.

  • Topology selection layer: the physics-informed rounding heuristic recovers integer solutions to determine switch-states (open/closed). This layer embeds a necessary constraint to maintain topology radiality and node connectivity, as required by grid operators, and permits the neural network to simultaneously learn optimal grid topology and DER dispatch. The input to this layer is the subset of independent variables predicted by the neural network pertaining to switch-states; the output is the set of rounded variables, which now satisfy the integer constraint.

  • Inequality constraint layers: the predictions from the neural network are scaled onto box constraints, thereby ensuring that inequality constraints pertaining to the independent variables are always satisfied. This layer acts in parallel to the topology selection layer. The output of this layer is the set of scaled variables.

  • Equality constraint layers: leveraging techniques for variable space reduction from optimization literature, the equality constraints describing the power flow across the grid are used to calculate the dependent variables from the independent variables.

  • Loss function and backpropagation: the neural network is trained to learn to optimize by selecting a loss function which is composed of the objective function to be minimized during optimization (e.g., line losses in the grid) and any regularization to bias the solution against violating physical constraints. This is done by way of a soft-loss penalty.

In what follows, we detail the variable decomposition of the independent and dependent variables, the physics-informed rounding heuristic for topology selection, the equality constraints describing power flow, and the loss function for network training.

Figure 1: PhML-DyR: A physics-informed ML-based dynamic grid reconfiguration framework

3.1 Variable decomposition

The underlying problem can be cast in the form of a standard constrained optimization problem

(1)

In (1), the decision variables are decomposed into three groups: the training data, which is the input to the neural network; the reduced variable space predicted by the neural network (independent variables); and the remaining dependent variables, which are calculated using the equality constraints. In general, this decomposition is non-unique. It critically depends on the structure of the given problem, which determines the relationship between the sets of variables, and requires domain knowledge to exploit the underlying problem structure to produce good solutions. As will be seen below, a further decomposition of the independent and dependent variables into continuous and binary variables, denoted by superscripts C and B, respectively, is introduced to accommodate the goal of reconfiguration. The overall decomposition in the context of reconfiguration is summarized below:

(2)

In (2), a general distribution grid is denoted as a graph with a set of N nodes and a set of M edges, of which a subset are lines with switches; one node is the point of common coupling (PCC) of the distribution grid to the bulk transmission grid. Real and reactive power loads are defined at every node, as is real and reactive power generation, where generation at the PCC indicates import from the bulk transmission grid. Directed real and reactive power flows are defined for every distribution line, and the squared magnitude of voltage is defined for every node. Binary variables indicate the direction of power flow through each line, and a binary variable indicates the status of each switch, where 1 is closed and 0 is open. The set minus notation denotes all the switches in the grid except the last one, and the indexing notation denotes the last switch. Together, these variables make up the sets of independent and dependent variables.

We also classify the independent variables into the set of independent switch variables and the set of other variables. Such a classification enables us to develop a physics-informed rounding, to be applied to the switch variables. This is detailed next.

3.2 Physics-Informed Rounding

A key challenge in the DyR problem, with simultaneous topology selection and DER dispatch, is its mixed integer nature. Traditional optimization literature deals with mixed integer programming using an array of heuristic methods for developing good upper and lower bounds, pruning solution branches, and selecting variables to round to the nearest integer solutions. Taking inspiration from the class of rounding heuristics well-established in the MIP literature (Goemans and Williamson [1995], Marchand and Wolsey [2001], Hifi and Sadeghsa [2022], and others), we propose a physics-informed rounding heuristic. In what follows, we describe key equations governing grid topology, and then describe our rounding heuristic.

Distribution grids in the US are operated with a radial structure, and during normal operation, all nodes must be connected. Various mathematical formulations of these radiality and connectivity constraints include constraints on the determinant of the branch-to-node incidence matrix, spanning tree constraints, and other graph theoretic approaches Lei et al. [2020], Wang et al. [2020], Ahmadi and Martí [2015], Lavorato et al. [2012]. However, many of these suffer from high computational requirements and additional complexity, and do not leverage the fact that grid connectivity can be ensured by power flow constraints under normal operation. We integrate these radiality and connectivity constraints in the following manner. We first define the number of switches that must be closed in terms of the number of nodes, the number of lines, and the number of switches.

(3)
(4)

Constraint (3) restricts the number of closed switches in the grid so that it is radial with N - 1 total branches, where a fixed number of switches must be closed and the remaining switches must be open. Constraint (4) enforces connectivity by requiring power to flow into or out of a node along at least one line. It should be noted that typical reconfiguration problem statements also include an arborescence constraint Taylor and Hover [2012], either explicitly or implicitly in the formulation of the radiality constraint. However, the increasing penetration of DERs voids this assumption, and multiple generating sources (roots of the tree) must be permitted. We have relaxed this arborescence constraint in (4).
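As a quick check of the counting argument behind constraint (3), the sketch below computes how many switches must be closed for a radial, connected grid. The function and argument names are illustrative assumptions rather than the paper's notation. The calculation relies only on two facts: a radial topology over N nodes has N - 1 branches, and lines without switches are assumed to be always in service. The example numbers are those of the 33-node case study in Section 4 (33 nodes, 37 lines, 7 switchable lines).

```python
def required_closed_switches(n_nodes: int, n_lines: int, n_switches: int) -> int:
    """Number of switches that must be closed for a radial, connected grid.

    Assumes every line without a switch is permanently in service, so the
    closed switches must supply the remaining branches of the tree.
    """
    fixed_lines = n_lines - n_switches   # lines that cannot be opened
    tree_branches = n_nodes - 1          # branches in a radial (tree) topology
    n_closed = tree_branches - fixed_lines
    if not 0 <= n_closed <= n_switches:
        raise ValueError("no radial topology exists for these counts")
    return n_closed


# 33-node case study: 33 nodes, 37 lines, 7 switchable lines.
n_closed = required_closed_switches(33, 37, 7)
n_open = 7 - n_closed
print(n_closed, n_open)  # prints "2 5": two switches closed, five open
```

These counts agree with the topologies reported in Section 4.1, each of which keeps exactly two of the seven switchable lines closed.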

With the above constraints, we now use this explicit knowledge of the required number of closed switches to inform our topology selection. In particular, we use it to round the predicted switch-states to integer values. We describe the algorithm next, and formally present it in Algorithm 1. The neural network predicts the status of all but one switch. Of these predicted switches, at least one fewer than the required number must be closed; the one remaining closed switch can be either among the predicted switches or the last switch. We therefore close that many of the predicted switches by setting the corresponding variables to 1, and open the predicted switches that must be open in either case by setting the corresponding variables to 0. For the remaining two switches, the neural network training guides the predictions to integer solutions. This physics-informed rounding heuristic then selects a feasible (that is, radial and connected) grid topology upon which the power flow equations relating the independent and dependent variables are satisfied. This enables simultaneous optimization of grid topology and DER dispatch.

Data: Switch-state predictions from the neural network
Result: Binary variables for switch-state prediction and topology selection
Initialization: Sort the predicted switch values in ascending order; assign the sorted indices
Assign switches to be closed
Assign switches to be open
Algorithm 1 Physics-Informed Rounding for Topology Selection
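The sketch below is one possible reading of Algorithm 1, written in plain Python rather than as a differentiable network layer. The function name, argument names, and example values are assumptions made for illustration. It follows the description above: the predicted values are sorted, the largest are set to closed, the smallest are set to open, exactly one predicted switch is left unrounded, and the last switch is recovered from the required number of closed switches.

```python
from typing import List, Tuple


def physics_informed_round(pred: List[float], n_closed: int) -> Tuple[List[float], float]:
    """Round all but one predicted switch value and derive the last switch.

    pred     : sigmoid outputs in [0, 1] for every switch except the last
    n_closed : total number of switches that must be closed
    Returns (rounded predictions, value of the last switch).
    """
    n_pred = len(pred)
    n_force_closed = n_closed - 1      # predicted switches closed in every case
    n_force_open = n_pred - n_closed   # predicted switches open in every case
    if n_force_closed < 0 or n_force_open < 0:
        raise ValueError("inconsistent switch counts")

    order = sorted(range(n_pred), key=lambda i: pred[i])  # ascending by value
    rounded = list(pred)
    for i in order[:n_force_open]:
        rounded[i] = 0.0               # smallest predictions: open
    for i in order[n_pred - n_force_closed:]:
        rounded[i] = 1.0               # largest predictions: closed
    # One predicted switch stays unrounded; the last switch follows from the
    # closed-switch count, so these two values always sum to one.
    last = n_closed - sum(rounded)
    return rounded, last


# Six predicted switches plus one derived switch, two of which must be closed.
rounded, last = physics_informed_round([0.9, 0.1, 0.2, 0.6, 0.05, 0.3], n_closed=2)
print(rounded)  # [1.0, 0.0, 0.0, 0.6, 0.0, 0.0]
```

In this example the switch predicted at 0.6 is left for training to push toward 0 or 1, and the derived last switch takes the complementary value, so the closed-switch count is met regardless of how the unrounded value evolves.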

3.3 Equality constraints

We describe the distinction between the independent and dependent variables, which stems from the equality constraints of power flow. We use the LinDistFlow model of the distribution grid Baran and Wu [1989], given by:

(5)
(6)
(7)
(8)
(9)

Constraints (5)-(7) describe power flow using the LinDistFlow model. Constraint (7) is Ohm’s law across all lines which do not have switches, and constraints (5)-(6) are lossless power balance at every node. These equality constraints are used to calculate the continuous dependent variables from the independent variables, with one group of dependent variables obtained from each of constraints (5), (6), and (7). A corresponding set of equations describes Ohm’s law across switched lines, and is represented as big-M relaxations of the conditional constraints; these are inequality constraints and do not impact the variable decomposition. The full MILP model is in Appendix A for reference. The next set of constraints, (8)-(9) and (3), describes topological constraints and is composed of integer variables; it is used to calculate the binary dependent variables from the binary independent variables. In particular, the corresponding dependent variables are calculated from constraints (8)-(9), and the status of the last switch is calculated from (3).
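For reference, the LinDistFlow model of Baran and Wu [1989] is commonly written as below. This is a sketch in assumed notation, and it may differ from the exact form or sign convention of (5)-(7): $P_{ij}, Q_{ij}$ are the directed line flows, $P_j^G, Q_j^G$ and $P_j^L, Q_j^L$ are generation and load at node $j$, $v_j$ is the squared voltage magnitude, $\mathcal{E}$ is the edge set, and $R_{ij}, X_{ij}$ are the line resistance and reactance (the last two symbols are introduced here for illustration).

```latex
\begin{align}
  \sum_{i:(i,j)\in\mathcal{E}} P_{ij} - \sum_{k:(j,k)\in\mathcal{E}} P_{jk} &= P_j^L - P_j^G, \\
  \sum_{i:(i,j)\in\mathcal{E}} Q_{ij} - \sum_{k:(j,k)\in\mathcal{E}} Q_{jk} &= Q_j^L - Q_j^G, \\
  v_i - v_j &= 2\left(R_{ij} P_{ij} + X_{ij} Q_{ij}\right).
\end{align}
```

The first two lines are lossless real and reactive power balance at node $j$ (flow in minus flow out equals net demand), and the third is the linearized voltage drop along a line, which is linear in the squared voltage magnitudes.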

It should be noted that the equality constraints are linear, and therefore the dependent variables can be determined trivially with zero error. Further, in the backpropagation step, the Jacobians describing the derivatives can be explicitly written out, and the implicit function theorem can be used to backpropagate through the dependent variables. For problems which involve more complex (potentially nonlinear) equality constraints, the same variable space reduction techniques can be used, and iterative procedures such as Newton’s method can be leveraged to solve for the dependent variables.

We also make a note on the inequality constraints which describe voltage, line flow, and generator limits. The particular variable decomposition in (2), and the selection of which variables belong in the independent and dependent sets, are critical to satisfying inequality constraints in PhML-DyR. Because voltages are selected as independent variables, they are scaled onto the box constraints describing operating limits; for a grid operator, this means voltage limits across the grid will always be satisfied, a critically important aspect of power systems operation. This is inherent in our proposed structure, in contrast to other methods which rely on projections, clipping, or penalties to enforce voltage constraints.
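The scaling onto box constraints described above can be sketched as a simple affine map of a sigmoid output onto an interval. This is a minimal illustration under stated assumptions: the function name is hypothetical, and the 0.95 pu and 1.05 pu limits are example values (the simulations in Section 4 use a relaxed lower limit).

```python
def scale_to_box(s: float, lower: float, upper: float) -> float:
    """Map a sigmoid output s in [0, 1] onto the interval [lower, upper].

    Written as a convex combination so that s = 0 and s = 1 return the
    bounds themselves.
    """
    return (1.0 - s) * lower + s * upper


# Example limits on squared voltage magnitude for 0.95 pu and 1.05 pu (assumed).
v_min, v_max = 0.95 ** 2, 1.05 ** 2
for s in (0.0, 0.25, 0.5, 1.0):
    print(s, scale_to_box(s, v_min, v_max))
```

Because the network output is confined to [0, 1] by the sigmoid, the scaled voltage cannot leave the interval by construction, which is the reason no projection or penalty is needed for these variables.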

3.4 Loss function

The loss function for the dynamic reconfiguration problem is chosen to minimize an approximation of the power loss over the lines. The soft loss penalizes the inequality constraint violation through a weighting variable. Note that our power flow model is composed entirely of linear constraints, so the completion step is trivial and results in no equality constraint violations. For nonlinear power flow models, a penalty for equality constraints can also be included. The resulting loss function, the sum of the approximate line losses and the weighted soft-loss penalty, is convex.
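A loss of this shape can be sketched as follows. The specific forms are assumptions made for illustration and are not taken from the paper: line losses are approximated as resistance times the sum of squared real and reactive flows (a common approximation when voltages are near 1 pu), and the soft loss is a squared hinge on the inequality residuals, where a positive residual indicates a violation. All names and numbers are hypothetical.

```python
from typing import Sequence


def soft_penalty_loss(
    resistance: Sequence[float],
    p_flow: Sequence[float],
    q_flow: Sequence[float],
    ineq_residual: Sequence[float],
    penalty_weight: float,
) -> float:
    """Approximate line losses plus a weighted penalty on constraint violation."""
    # Assumed loss approximation: R * (P^2 + Q^2) summed over all lines.
    line_loss = sum(r * (p * p + q * q) for r, p, q in zip(resistance, p_flow, q_flow))
    # Squared hinge: only positive residuals (violations) contribute.
    violation = sum(max(0.0, g) ** 2 for g in ineq_residual)
    return line_loss + penalty_weight * violation


loss = soft_penalty_loss(
    resistance=[0.1, 0.2],
    p_flow=[1.0, 0.5],
    q_flow=[0.5, 0.0],
    ineq_residual=[-0.1, 0.2],   # first constraint satisfied, second violated
    penalty_weight=10.0,
)
print(loss)  # approximately 0.575 = 0.175 (losses) + 10 * 0.04 (penalty)
```

Both terms are convex in the flows and residuals, so their weighted sum is convex as well; the penalty weight trades off loss reduction against constraint satisfaction.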

4 Experiments

Dataset: We test the proposed PhML-DyR on a 33-node test system commonly used in the reconfiguration literature Baran and Wu [1989]. The grid consists of 33 nodes ($N = 33$) and 37 lines ($M = 37$), of which 5 are tie lines (NOS) and the remaining 32 are typically assumed to be switches (NCS). The grid is very lossy, with losses up to 8% of total load, and a voltage profile that violates voltage limits. (Allowable deviation from nominal voltage under normal conditions varies globally: ANSI C84.1 in North America permits ±5% deviation (0.95pu minimum), while IEC and European EN 50160 permit ±10% deviation (0.9pu minimum).) These characteristics make the 33-node grid an excellent test case for DyR, with the objective function to reduce line losses. Grid data includes the location of loads and their nominal power demand (P and Q) for a single period. To develop a diverse set of training data, the loads are perturbed, with the maximum load perturbation restricted to 70% deviation from the nominal value. The power factor of the loads, which describes the relationship between the real and reactive power as $\mathrm{pf} = P / \sqrt{P^2 + Q^2}$, is kept constant at the pf in the nominal data. To restrict the problem to a simpler test case, only a subset of the lines are considered switchable: the 5 tie lines (numbered 33 to 37) and 2 NCS lines (line numbers 10 and 26). We add a range of community solar facilities (each <5MW), up to a penetration of 25.3% of nameplate capacity to baseline load. This is a modest DER penetration compared to that which we would expect in the future grid. We use renewable generation profiles to couple generation across the grid with weather conditions, with hourly generation data from the NREL SAM tool National Renewable Energy Laboratory [2020]. We divide the grid into sections, based on the location of switches, and vary the location of community solar farms amongst these sections. We denote the distribution of these DERs (DD) as follows: (i) DD-U: uniform distribution of solar throughout the grid; (ii-iv) DD-I, DD-II, DD-III: all facilities are in Sections I, II, or III of the grid respectively; (v) DD-II+III: all facilities are in Sections II and III of the grid. Additional information on grid data is presented in Appendix B.

Simulation Parameters:

We use PyTorch on a 2.3GHz Dual-Core Intel Core i5, with 8GB RAM. The neural network has two hidden layers, each with 25 neurons. The sizes of the input and output layers are determined by the dimensions of the input data and of the independent variables, respectively. Each layer applies a linear transformation with bias, batch normalization, ReLU activation, and 50% dropout. We use He initialization for the weights. Backpropagation uses an adaptive learning rate (Adam). The data points are split into a training set and equally sized testing and validation sets. We use mini-batching.

Inequality constraint violation: Violation of inequality constraints are measured as: (i) with a tolerance threshold; and (ii) the mean violation of an inequality constraint across data points.

Sigmoidal activation for integer variables (SigInt): We compare our physics-informed rounding with a differentiable relaxation of the step function, using a steep sigmoid activation function with free parameters at the output layer (Cao and Zavala [2020]). This sigmoid activation is used for the binary variables, while the continuous variables still pass through the traditional sigmoid function. One of the free parameters governs the sharpness of the sigmoidal function, and hence how well the sigmoid approximates the step function. Larger values of this parameter better discriminate between binary values, but they flatten the gradient away from the transition and thus make learning more challenging.
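The trade-off described above can be illustrated with a generic steep sigmoid. The functional form below, a logistic function with a sharpness and a midpoint parameter, is an assumption made for illustration and is not necessarily the parameterization of Cao and Zavala [2020]; the parameter values are likewise hypothetical.

```python
import math


def steep_sigmoid(x: float, sharpness: float, midpoint: float = 0.0) -> float:
    """Logistic function with adjustable slope, evaluated in a stable way."""
    t = sharpness * (x - midpoint)
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def steep_sigmoid_grad(x: float, sharpness: float, midpoint: float = 0.0) -> float:
    """Derivative of steep_sigmoid with respect to x."""
    s = steep_sigmoid(x, sharpness, midpoint)
    return sharpness * s * (1.0 - s)


for k in (1.0, 10.0, 50.0):
    # Value just above the midpoint, and gradient far from the midpoint.
    print(k, steep_sigmoid(0.1, k), steep_sigmoid_grad(1.0, k))
```

With a sharpness of 1 the output at 0.1 is close to 0.5, while a sharpness of 50 pushes it above 0.99, which is a much better approximation of a step. The cost is that the gradient one unit away from the midpoint collapses toward zero, so a network initialized away from the transition receives almost no learning signal.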

Gradient-based correction (Corr): We implement the gradient-based correction procedure proposed in Donti et al. [2021] to accommodate inequality constraints after the variable space completion step. The correction layer uses gradient descent with momentum, and minimizes the inequality constraint violation subject to the equality constraints. The learning rate was chosen through simulation to retain stability of the gradient descent algorithm.
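The idea of correcting along the equality constraints can be illustrated on a toy problem. The sketch below is not the implementation of Donti et al. [2021]: it uses one independent variable, one dependent variable tied to it by a single linear equality, and one upper bound on the dependent variable. The function name, learning rate, momentum, and numbers are all hypothetical.

```python
from typing import Tuple


def correct(z: float, total: float, phi_max: float,
            lr: float = 0.1, momentum: float = 0.5, steps: int = 50) -> Tuple[float, float]:
    """Reduce violation of phi <= phi_max while keeping z + phi = total.

    The dependent variable phi is always recomputed from the equality, so each
    step moves z along the set of points that satisfy the equality exactly.
    """
    velocity = 0.0
    for _ in range(steps):
        phi = total - z                      # completion via the equality constraint
        excess = max(0.0, phi - phi_max)     # violation of phi <= phi_max
        grad = -2.0 * excess                 # d(excess**2)/dz, using d(phi)/dz = -1
        velocity = momentum * velocity - lr * grad
        z += velocity
    return z, total - z


z_corr, phi_corr = correct(z=0.2, total=1.0, phi_max=0.6)
print(z_corr, phi_corr)
```

Starting from a dependent value of 0.8, which exceeds the bound of 0.6, the iterations move the independent variable until the bound is met, while the equality holds at every step by construction. A step size that is too large would overshoot and oscillate, which is why the learning rate has to be tuned for stability.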

4.1 Performance of PhML-DyR

Table 1 shows the significant loss reduction enabled by reconfiguration of the 33-node distribution grid as DER locations are varied, upwards of 23%. StatR closes tie line 35 and opens NCS 10, and keeps this topology fixed. DyR selects primarily between two states: (a) closing tie line 35 and opening NCS 10, and (b) closing tie lines 35 and 36 and opening NCS 10 and 26. The loss reduction from reconfiguration is higher for a grid without DERs than for a grid with DERs; this can be attributed to the greater losses incurred without local generators, which are located closer to loads and thus incur lower losses when supplying those loads. The second results column compares DyR to StatR, showing savings of up to 31 MW for a single distribution feeder. While this is a modest 2.5% improvement of DyR upon StatR, it should be noted that this was obtained with a small test case (33 nodes). As the dimension increases, with increasing penetration of DERs and switches, and with disparate patterns and topologies, it is expected that this difference will be significantly larger.

Figure 2 plots the voltage distribution across the grid for an entire year. The ideal distribution is the shape of a short ice cream cone: wide on top and narrow on the bottom, with the tip above the line indicating the ANSI minimum voltage limit of 0.95pu. On each plot is printed the percentage of time the voltages are within the ANSI limits, where higher numbers are better. We make the following key observations: (i) without reconfiguration, the grid performs very poorly, violating ANSI limits 50-60% of the time, and voltages drop to 0.88pu (outside of ANSI limits); (ii) reconfiguration (Stat or Dy) significantly improves the voltages across the grid, with minimum voltage improving to 0.9pu, and ANSI limits satisfied 77-83% of the time (IEC limits are always satisfied); (iii) DyR reduces the number of voltage violations throughout the year by 2%, as compared to StatR, which is a significant improvement as undervoltage can result in brownouts and even lead to blackouts. We anticipate that larger test cases will similarly show greater improvement in the voltage profile with DyR, as with the line losses. We note that since the grid chosen for this test case is very lossy, in our simulations we enforce a lower voltage limit of 0.87pu to ensure feasibility of loading conditions (instead of 0.95pu), which our PhML-DyR framework always satisfies. It is interesting to note that the simple 33-node case study does not imply a preferred DD over others; larger grids are expected to suggest an optimal DD more naturally.

DER distribution   StatR vs. no reconfig                    PhML-DyR vs. StatR
                   % Loss reduction, MW saved per year *    MW saved per year *
No DERs            23%, 370 MW                              0 MW
DD-U               20%, 300 MW                              23 MW
DD-I               20%, 320 MW                              31 MW
DD-II              19%, 270 MW                              20 MW
DD-III             21%, 310 MW                              27 MW
DD-II+III          20%, 280 MW                              28 MW
*The MW (power) saved is equal to the MWh (energy), as the simulation is run for every hour of the year and loss reduction summed for every test case
Table 1: Loss reduction using StatR and DyR

Figure 2: Voltage distribution over a year (8760 hours). Solid line is ANSI lower voltage limit. Panels, left to right: no reconfiguration, StatR, PhML-DyR.
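The compliance percentage printed on each panel can be illustrated with a short calculation. The following is a minimal sketch, assuming that compliance is counted per voltage sample and that the upper limit is 1.05pu (only the 0.95pu lower limit is stated in the text); the function name and the sample values are hypothetical.

```python
def fraction_within_limits(voltages_pu, v_min=0.95, v_max=1.05):
    """Return the fraction of samples with v_min <= v <= v_max.

    v_min = 0.95 pu is the ANSI lower limit quoted in the text.
    v_max = 1.05 pu is an assumption made for this sketch.
    """
    if not voltages_pu:
        raise ValueError("need at least one voltage sample")
    compliant = sum(1 for v in voltages_pu if v_min <= v <= v_max)
    return compliant / len(voltages_pu)


# Hypothetical voltage samples in per unit (not data from the paper).
samples = [0.97, 0.94, 0.96, 0.90, 0.99]
share = fraction_within_limits(samples)
print(f"{100 * share:.0f}% of samples within limits")
```

Passing `v_min=0.87` instead would mirror the relaxed lower limit enforced in the simulations.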

4.2 Performance of Physics-Informed Rounding

Figure 3 compares the performance of PhML to SigInt-ML for DyR, where PhML employs physics-informed rounding. Comparing the top and bottom rows of Fig. 3, we see an order of magnitude reduction in inequality violation when using PhML, with PhML outperforming SigInt-ML within the first 50 epochs. The bottom row of Fig. 3 also shows that Corr provides little reduction in this violation. The second panel of each row plots the error averaged over all test data and all epochs, once again showing order of magnitude reductions in error when using PhML. A similar result is shown in the third panel of each row, which plots the error for a single test point after the neural network has been trained. With the use of physics-informed rounding, inequality violations of line flow limits are eliminated, and errors in generator limits are significantly reduced. Finally, the fourth panel of each row shows the test loss over the epochs. From these plots, it is clear that PhML shows significantly better training performance and reduced training time compared to SigInt-ML. The interpretation of these results is as follows: at each epoch and for every training point, the physics-informed rounding selects a feasible grid topology, after which the network can optimize the line losses and inequality violation. Notably, without the rounding heuristic, the network struggles to learn both integer and continuous variables, preventing any meaningful progress in satisfying inequality constraints and reducing line losses. We also note that the performance of the PhML-DyR framework is invariant to DD, with the training and test plots looking very similar across all DD scenarios (thus omitted). We also note the difference in computing time: online optimization to solve the reconfiguration MILP using Gurobi with the Yalmip optimizer environment in MATLAB takes 126.1ms (averaged over 500 data points), while PhML-DyR takes 0.422ms (averaged over 876 data points). This roughly 300-fold speed-up of PhML-DyR over traditional optimization can be attributed to the ability of ML to learn the underlying task of grid reconfiguration, and to the reduction of the overhead required by optimization solvers in setting up the MILP and executing iterative heuristics to solve for the integer solutions.
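To illustrate why rounding to a feasible topology helps, the sketch below contrasts a naive 0.5 threshold with a count-preserving rounding. It is not the physics-informed rounding proposed in this paper; it is a generic top-k rule, assuming only that the number of closed switches is fixed by a radiality constraint. The function name, the scores, and the convention that 1 means closed are hypothetical.

```python
def round_switches(scores, n_closed):
    """Close the n_closed switches with the highest scores, open the rest.

    scores   : relaxed switch outputs in [0, 1], e.g. sigmoid activations
    n_closed : number of switches that must be closed (assumed known)
    Returns a list of 0/1 statuses, with 1 taken to mean closed.
    """
    if not 0 <= n_closed <= len(scores):
        raise ValueError("n_closed must lie between 0 and len(scores)")
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    closed = set(ranked[:n_closed])
    return [1 if i in closed else 0 for i in range(len(scores))]


# Hypothetical relaxed outputs for four switches, two of which must be closed.
scores = [0.9, 0.2, 0.55, 0.6]
naive = [1 if s >= 0.5 else 0 for s in scores]  # closes three switches
topk = round_switches(scores, n_closed=2)       # closes exactly two
print(naive, topk)
```

A fixed count alone does not guarantee a radial, connected grid, so a rule of this kind would still need a topology check before the continuous variables are optimized.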

Figure 3: Results comparing performance of: (top) PhML-DyR, which leverages the proposed physics-informed rounding, and (bottom) SigInt-DyR with and without Corr. Panels in each row, left to right: error metric over epochs (bottom row: with and without Corr); error metric across all epochs and data points; error metric for a test point after training; loss function.

5 Conclusion

In this paper we propose a framework, PhML-DyR, for dynamic reconfiguration of the distribution grid. We propose a novel physics-informed rounding to tackle the mixed-integer nature of the reconfiguration problem. PhML-DyR provides an order of magnitude improvement in inequality error and training time compared to other approaches for integer accommodation which do not leverage physics. Our test case on a 33-node grid shows up to 23% reduction in line losses and significant improvements in voltage profile. While our test case is small, leading to modest improvements of DyR over StatR in both loss reduction and voltage violations, we anticipate that larger test cases will show significant differences. Future work will further enhance the performance of PhML-DyR by employing representative training datasets for reconfiguration, and will include case studies on larger grids and with uncertainties in load and generation forecasts.

This material is based upon work partially supported by the MathWorks Mechanical Engineering Fellowship, and U.S. Department of Energy under Award Number DE-IA0000025.

References

  • [1] (2010) In 50 Years of Integer Programming 1958-2008: From the Early Years to the State-of-the-Art, M. Jünger, T. M. Liebling, D. Naddef, G. L. Nemhauser, W. R. Pulleyblank, G. Reinelt, G. Rinaldi, and L. A. Wolsey (Eds.), pp. 619–645. ISBN 978-3-540-68279-0.
  • H. Ahmadi and J. R. Martí (2015) Mathematical representation of radiality constraint in distribution system reconfiguration problem. International Journal of Electrical Power & Energy Systems 64, pp. 293–299. ISSN 0142-0615.
  • K. Baker (2019) Learning warm-start points for ac optimal power flow. In 2019 IEEE 29th International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–6.
  • M.E. Baran and F.F. Wu (1989) Network reconfiguration in distribution systems for loss reduction and load balancing. IEEE Transactions on Power Delivery 4 (2), pp. 1401–1407.
  • D. Bertsimas and B. Stellato (2021) The voice of optimization. Mach. Learn. 110 (2), pp. 249–277. ISSN 0885-6125.
  • Y. Cao and V. M. Zavala (2020) A sigmoidal approximation for chance-constrained nonlinear programs. arXiv:2004.02402 [math.OC].
  • Q. Cappart, D. Chételat, E. B. Khalil, A. Lodi, C. Morris, and P. Veličković (2021) Combinatorial optimization and reasoning with graph neural networks. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, Z. Zhou (Ed.), pp. 4348–4355. Survey Track.
  • Y. Chen, S. Lakshminarayana, C. Maple, and H. V. Poor (2022) A meta-learning approach to the optimal power flow problem under topology reconfigurations. IEEE Open Access Journal of Power and Energy 9, pp. 109–120.
  • S. Civanlar, J.J. Grainger, H. Yin, and S.S.H. Lee (1988) Distribution feeder reconfiguration for loss reduction. IEEE Transactions on Power Delivery 3 (3), pp. 1217–1223.
  • C. Crozier, K. Baker, and B. Toomey (2022) Feasible region-based heuristics for optimal transmission switching. Sustainable Energy, Grids and Networks 30, pp. 100628. ISSN 2352-4677.
  • D. Deka and S. Misra (2019) Learning for dc-opf: classifying active sets using neural nets. In 2019 IEEE Milan PowerTech, pp. 1–6.
  • R. Dobbe, O. Sondermeijer, D. Fridovich-Keil, D. Arnold, D. Callaway, and C. Tomlin (2020) Toward distributed energy services: decentralizing optimal power flow with machine learning. IEEE Transactions on Smart Grid 11 (2), pp. 1296–1306.
  • P. L. Donti, D. Rolnick, and J. Z. Kolter (2021) DC3: a learning method for optimization with hard constraints. In International Conference on Learning Representations.
  • J. Fan, L. Zhang, and J.D. McDonald (1996) Distribution network reconfiguration: single loop optimization. IEEE Transactions on Power Systems 11 (3), pp. 1643–1647.
  • F. Fioretto, T. W.K. Mak, and P. van Hentenryck (2020) Predicting ac optimal power flows: combining deep learning and lagrangian dual methods. In AAAI 2020 - 34th AAAI Conference on Artificial Intelligence, pp. 630–637.
  • M. Gasse, D. Chételat, N. Ferroni, L. Charlin, and A. Lodi (2019) Exact combinatorial optimization with graph convolutional neural networks. In Proceedings of the 33rd International Conference on Neural Information Processing Systems.
  • M. X. Goemans and D. P. Williamson (1995) Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J. ACM 42 (6), pp. 1115–1145. ISSN 0004-5411.
  • P. Gupta, M. Gasse, E. B. Khalil, M. P. Kumar, A. Lodi, and Y. Bengio (2020) Hybrid models for learning to branch. In Proceedings of the 34th International Conference on Neural Information Processing Systems, NIPS’20. ISBN 9781713829546.
  • V. J. Gutierrez-Martinez, C. A. Cañizares, C. R. Fuerte-Esquivel, A. Pizano-Martinez, and X. Gu (2011) Neural-network security-boundary constrained optimal power flow. IEEE Transactions on Power Systems 26 (1), pp. 63–72.
  • M. Hifi and S. Sadeghsa (2022) A rounding strategy-based algorithm for the k-clustering minimum biclique completion problem. Journal of the Operational Research Society.
  • S. Junlakarn and M. Ilić (2014) Toward implementation of the reconfiguration for providing differentiated reliability options in distribution systems. In 2014 IEEE PES General Meeting | Conference & Exposition, pp. 1–5.
  • M. Lavorato, J. F. Franco, M. J. Rider, and R. Romero (2012) Imposing radiality constraints in distribution system optimization problems. IEEE Transactions on Power Systems 27 (1), pp. 172–180.
  • S. Lei, C. Chen, Y. Song, and Y. Hou (2020) Radiality constraints for resilient reconfiguration of distribution systems: formulation and application to microgrid formation. IEEE Transactions on Smart Grid 11 (5), pp. 3944–3956.
  • K. Li and J. Malik (2016) Learning to optimize. arXiv.
  • W. Li, D. Deka, M. Chertkov, and M. Wang (2019) Real-time faulted line localization and pmu placement in power systems through convolutional neural networks. IEEE Transactions on Power Systems 34 (6), pp. 4640–4651.
  • D. Maclaurin, D. Duvenaud, and R. Adams (2015) Gradient-based hyperparameter optimization through reversible learning. In Proceedings of the 32nd International Conference on Machine Learning, F. Bach and D. Blei (Eds.), Proceedings of Machine Learning Research, Vol. 37, Lille, France, pp. 2113–2122.
  • H. Marchand and L. A. Wolsey (2001) Aggregation and mixed integer rounding to solve mips. Operations Research 49 (3).
  • S. Misra, L. Roald, and Y. Ng (2021) Learning for constrained optimization: identifying optimal active constraint sets. INFORMS Journal on Computing 34 (1).
  • V. Nair, S. Bartunov, F. Gimeno, I. von Glehn, P. Lichocki, I. Lobov, B. O’Donoghue, N. Sonnerat, C. Tjandraatmadja, P. Wang, R. Addanki, T. Hapuarachchi, T. Keck, J. Keeling, P. Kohli, I. Ktena, Y. Li, O. Vinyals, and Y. Zwols (2020) Solving mixed integer programs using neural networks. arXiv.
  • National Academies of Sciences, Engineering, and Medicine (2021) The future of electric power in the United States. Technical report, The National Academies Press.
  • National Renewable Energy Laboratory (2020) System advisory model. https://sam.nrel.gov/
  • X. Pan, T. Zhao, M. Chen, and S. Zhang (2021) DeepOPF: a deep neural network approach for security-constrained dc optimal power flow. IEEE Transactions on Power Systems 36 (3), pp. 1725–1735.
  • X. Pan, T. Zhao, and M. Chen (2019) DeepOPF: deep neural network for dc optimal power flow. In 2019 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), pp. 1–6.
  • Y. Shen, Y. Sun, A. Eberhard, and X. Li (2021) Learning primal heuristics for mixed integer programs. In 2021 International Joint Conference on Neural Networks (IJCNN), pp. 1–8.
  • F. F. C. Silva, P. M. S. Carvalho, and L. A. F. M. Ferreira (2021) Improving pv resilience by dynamic reconfiguration in distribution grids: problem complexity and computation requirements. Energies 14 (4), pp. 830.
  • M. Subramanian, J. Viebahn, S. H. Tindemans, B. Donnot, and A. Marot (2021) Exploring grid topology reconfiguration using a simple deep reinforcement learning approach. In 2021 IEEE Madrid PowerTech, pp. 1–6.
  • J. A. Taylor and F. S. Hover (2012) Convex models of distribution system reconfiguration. IEEE Transactions on Power Systems 27 (3), pp. 1407–1413.
  • A. Wang, L. Yang, S. Lai, X. Luo, X. Zhou, H. Huang, S. Shao, Y. Zhu, D. Zhang, and T. Quan (2022) Efficient primal heuristics for mixed-integer linear programs. arXiv.
  • Y. Wang, Y. Xu, J. Li, J. He, and X. Wang (2020) On the radiality constraints for distribution system restoration and reconfiguration problems. IEEE Transactions on Power Systems 35 (4), pp. 3294–3296.
  • Á. S. Xavier, F. Qiu, and S. Ahmed (2020) Learning to solve large-scale security-constrained unit commitment problems. INFORMS Journal on Computing 33 (2), pp. 739–756.
  • Y. Yang, Z. Yang, J. Yu, B. Zhang, Y. Zhang, and H. Yu (2020) Fast calculation of probabilistic power flow: a model-based deep learning approach. IEEE Transactions on Smart Grid 11 (3), pp. 2235–2244.
  • Z. Yang and S. Oren (2019) Line selection and algorithm selection for transmission switching by machine learning methods. In 2019 IEEE Milan PowerTech, pp. 1–6.
  • Z. Yin, X. Ji, Y. Zhang, Q. Liu, and X. Bai (2020) Data-driven approach for real-time distribution network reconfiguration. IET Generation, Transmission & Distribution 4 (13).
  • A. S. Zamzam and K. Baker (2020) Learning optimal solutions for extremely fast ac optimal power flow. In 2020 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), pp. 1–6.
  • J. Zhang, C. Liu, J. Yan, X. Li, H. Zhen, and M. Yuan (2022) A survey for solving mixed integer programming via machine learning. arXiv.
  • T. Zhao, X. Pan, M. Chen, A. Venzke, and S. H. Low (2020) DeepOPF+: a deep neural network approach for dc optimal power flow for ensuring feasibility. In 2020 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), pp. 1–6.
  • Y. Zhou, B. Zhang, C. Xu, T. Lan, R. Diao, D. Shi, Z. Wang, and W. Lee (2020) A data-driven method for fast ac optimal power flow solutions via deep reinforcement learning. Journal of Modern Power Systems and Clean Energy 8 (6), pp. 1128–1139.

Appendix A Grid reconfiguration problem

The grid reconfiguration problem using the LinDistFlow model of the distribution grid Baran and Wu [1989] is formulated as

minimize the objective function (10a)
s.t. constraints (10b)-(10u)
and constraints (10v)-(10x) (no export allowed)

In (10), we denote a general distribution grid as a graph with a set of N nodes and a set of M edges, a subset of which are lines with switches. One node is the point of common coupling (PCC) of the distribution grid to the bulk transmission grid. The problem is solved for a decision vector and uses the following quantities: the real and reactive power loads at every node; the real and reactive power generation at every node, where generation at the PCC indicates import from the bulk transmission grid; the directed real and reactive power flows through every distribution line; the squared magnitude of voltage at every node; binary variables indicating the direction of power flow through each line; and a binary variable for each switch indicating whether it is closed or open. The set minus notation denotes all the switches in the grid except the last one, and the indexing notation denotes the last switch.

Constraints (10b)-(10f) describe power flow using the LinDistFlow model. Constraints (10b)-(10d) describe Ohm’s law across all lines. The big-M relaxation is used to describe the conditional constraints for each line. Constraints (10e)-(10f) describe lossless power balance at every node.
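For orientation, the standard LinDistFlow relations of Baran and Wu [1989] and a common big-M treatment of a switched line can be written as below. This is a sketch in assumed notation, not a reproduction of (10b)-(10f): the symbols P_{ij}, Q_{ij}, v_i, R_{ij}, X_{ij}, the switch status z_{ij}, the load and generation superscripts L and G, and the constant M are chosen here for illustration, and the direction binaries are omitted.

```latex
% Lossless power balance at node j, fed from node i with children k
\begin{align}
  P_{ij} &= P_j^{L} - P_j^{G} + \sum_{k:(j,k)} P_{jk}, \\
  Q_{ij} &= Q_j^{L} - Q_j^{G} + \sum_{k:(j,k)} Q_{jk}.
\end{align}
% Voltage drop (Ohm's law) in squared voltage magnitudes, line always in service
\begin{equation}
  v_j = v_i - 2\left(R_{ij} P_{ij} + X_{ij} Q_{ij}\right).
\end{equation}
% Big-M form for a line with switch status z_{ij} \in \{0,1\}
\begin{align}
  -M\,(1 - z_{ij}) \le v_i - v_j - 2\left(R_{ij} P_{ij} + X_{ij} Q_{ij}\right) &\le M\,(1 - z_{ij}), \\
  -M\,z_{ij} \le P_{ij} \le M\,z_{ij}, \qquad -M\,z_{ij} \le Q_{ij} &\le M\,z_{ij}.
\end{align}
```

With z_{ij} = 1 the first pair of inequalities collapses to the voltage drop equation; with z_{ij} = 0 the voltage relation is inactive for sufficiently large M and the line flows are forced to zero.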

Constraints (10g)-(10o) describe the topology selection through switch-status, grid radiality and connectivity constraints, enforce binary constraints on the variables, and restrict power flow through a line based on switch-status. In particular, constraint (10j) restricts the number of closed switches so that the grid is radial: a fixed number of switches must be closed and the remainder must be open. Constraint (10k) enforces connectivity by requiring power to flow into or out of a node along at least one line. Note that the connectivity constraint must be relaxed when a fault has occurred and islanding of sections of the grid is permitted, and for system restoration. Typical reconfiguration problem statements also include an arborescence constraint Taylor and Hover [2012], either explicitly or implicitly in the formulation of the radiality constraint. However, the increasing penetration of DERs voids this assumption, and multiple generating sources (roots of the tree) must be permitted. We have relaxed this arborescence constraint in (10k). Other mathematical formulations of radiality and connectivity constraints include constraints on the determinant of the branch-to-node incidence matrix, spanning tree constraints, and other graph theoretic approaches Lei et al. [2020], Wang et al. [2020], Ahmadi and Martí [2015], Lavorato et al. [2012]. However, many of these suffer from high computational requirements and additional complexity, and do not leverage the fact that grid connectivity can be ensured by power flow constraints under normal operation. Our formulation accounts for this.
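Radiality and connectivity of a candidate topology can also be checked directly on the graph. The sketch below is a plain spanning-tree test using union-find, not the power-flow-based constraints (10j)-(10k) of this formulation; it assumes the 33-node branch list of Table 2 with branches 1-32 normally closed and 33-37 normally open, and the function name is hypothetical.

```python
# Branch number -> (upstream node, downstream node), copied from Table 2.
BRANCHES = {
    1: (1, 2), 2: (2, 3), 3: (3, 4), 4: (4, 5), 5: (5, 6), 6: (6, 7),
    7: (7, 8), 8: (8, 9), 9: (9, 10), 10: (10, 11), 11: (11, 12),
    12: (12, 13), 13: (13, 14), 14: (14, 15), 15: (15, 16), 16: (16, 17),
    17: (17, 18), 18: (2, 19), 19: (19, 20), 20: (20, 21), 21: (21, 22),
    22: (3, 23), 23: (23, 24), 24: (24, 25), 25: (6, 26), 26: (26, 27),
    27: (27, 28), 28: (28, 29), 29: (29, 30), 30: (30, 31), 31: (31, 32),
    32: (32, 33), 33: (8, 21), 34: (9, 15), 35: (12, 22), 36: (18, 33),
    37: (25, 29),
}


def is_radial(closed, n_nodes=33):
    """True if the closed branches form a spanning tree over nodes 1..n_nodes."""
    if len(closed) != n_nodes - 1:
        return False  # a tree on n nodes has exactly n - 1 edges
    parent = list(range(n_nodes + 1))

    def find(a):
        # Find the representative of a's component, with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for branch in closed:
        u, v = BRANCHES[branch]
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # both ends already connected: this branch closes a loop
        parent[root_u] = root_v
    return True  # n - 1 loop-free edges on n nodes are necessarily connected


BASE = set(range(1, 33))                 # all normally closed switches closed
STATR = (BASE - {10}) | {35}             # close tie line 35, open NCS 10
DYR_B = (BASE - {10, 26}) | {35, 36}     # close 35 and 36, open NCS 10 and 26

print(is_radial(BASE), is_radial(STATR), is_radial(DYR_B), is_radial(BASE | {35}))
```

Both topologies reported in Section 4 pass this test, while closing tie line 35 without opening a switch in the resulting loop fails it.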

Constraints (10p)-(10q) set the loads at each node, which are assumed to be inflexible. Constraints (10r)-(10s) describe generator operating limits, and (10t)-(10u) describe grid voltage limits, where the voltage at the PCC is assumed to be fixed at 1pu, as is common practice in power systems. Finally, constraints (10v)-(10x) describe “no export” limits on the PCC, where net generation in excess of net load in the distribution grid cannot be injected into the transmission grid. For regions where distribution grids are permitted to export power to the bulk grid, these constraints can be removed.

The MILP detailed above is solved to minimize an objective function. For a modern distribution grid with high DER penetration, grid operators pursue various objectives. These include minimizing electrical line losses (maximizing grid efficiency), minimizing costs for power generation, minimizing congestion, improving voltage profiles across the distribution feeder, reducing peak power demand, ensuring reliability of service (e.g., higher capacity margins for feeders and supply transformers), and balancing load. Depending on the types of switches in the grid, operators may also minimize the cost incurred by actuating switches. In general, these objectives can be formulated using a convex function, thus retaining the uniqueness and global optimality of the MILP solution. Further extensions to the reconfiguration problem include distinguishing between hard and soft constraints, considering the optimal switch change order to go from topology A to topology B, and considering grid outage conditions and subsequent generator restart and load recovery. Note that soft constraints can include lines which can exceed thermal limits for short periods of time during a reconfiguration activity.

Appendix B Dataset: 33-node grid

The 33-node grid presented in Baran and Wu [1989] is a canonical grid used in the reconfiguration literature. The grid consists of 33 nodes (N = 33) and 37 lines (M = 37), of which 5 are tie lines (normally open switches, NOS) and the remaining 32 are typically assumed to be normally closed switches (NCS). The grid is very lossy, with losses up to 8% of total load, and its voltage profile violates voltage limits. These characteristics make the 33-node grid an excellent test case for dynamic reconfiguration. Available grid data includes the grid topology and line parameters, as presented in Table 2, and the location of loads and their nominal power demand (P and Q) for a single period, as presented in Table 3. The network is shown in Fig. 4.

Branch No. Upstream Node Downstream Node R [ohm] X [ohm]
1 1 2 0.0922 0.0470
2 2 3 0.4930 0.2511
3 3 4 0.3660 0.1864
4 4 5 0.3811 0.1941
5 5 6 0.8190 0.707
6 6 7 0.1872 0.6188
7 7 8 0.7114 0.2351
8 8 9 1.030 0.7400
9 9 10 1.0440 0.7400
10 10 11 0.1966 0.0650
11 11 12 0.3744 0.1238
12 12 13 1.4680 1.1550
13 13 14 0.5416 0.7129
14 14 15 0.5910 0.5260
15 15 16 0.7463 0.5450
16 16 17 1.2890 1.7210
17 17 18 0.7320 0.5740
18 2 19 0.1640 0.1565
19 19 20 1.5042 1.3554
20 20 21 0.4095 0.4784
21 21 22 0.7089 0.9373
22 3 23 0.4512 0.3083
23 23 24 0.8980 0.7091
24 24 25 0.8960 0.7011
25 6 26 0.2030 0.1034
26 26 27 0.2842 0.1447
27 27 28 1.0590 0.9337
28 28 29 0.8042 0.7006
29 29 30 0.5075 0.2585
30 30 31 0.9744 0.9630
31 31 32 0.3105 0.3619
32 32 33 0.3410 0.5302
33 8 21 2.00 2.00
34 9 15 2.00 2.00
35 12 22 2.00 2.00
36 18 33 0.500 0.500
37 25 29 0.500 0.500
Table 2: 33-node grid topology data and line parameters
Node P [kW] Q [kVAR] Node P [kW] Q [kVAR] Node P [kW] Q [kVAR]
2 100 60 13 60 35 24 420 200
3 90 40 14 120 80 25 420 200
4 120 80 15 60 10 26 60 25
5 60 30 16 60 20 27 60 25
6 60 20 17 60 20 28 60 20
7 200 100 18 90 40 29 120 70
8 200 100 19 90 40 30 200 600
9 60 20 20 90 40 31 150 70
10 60 20 21 90 40 32 210 100
11 45 30 22 90 40 33 60 40
12 60 35 23 90 50
Table 3: 33-node grid load data

B.1 Community Solar Dataset

We add a range of community solar facilities (each <5MW), up to a penetration of 25.3% of nameplate capacity relative to baseline load. We divide the grid into sections, based on the location of switches, and vary the location of community solar farms amongst these sections. We denote the distribution of these DERs (DD) as follows: (i) DD-U: uniform distribution of solar throughout the grid; (ii-iv) DD-I, DD-II, DD-III: all facilities are in Sections I, II, or III of the grid respectively; (v) DD-II+III: all facilities are in Sections II and III of the grid. The DD and location of each community solar facility are shown in Fig. 4. Different DDs are used to study their effect on grid reconfiguration, line losses, and voltage profiles.

Figure 4: 33-node distribution grid from Baran and Wu [1989]. The switches are highlighted in green, sections of the grid are highlighted and labelled in orange, and locations of community solar DERs are marked with yellow squares. Panels: 33-node distribution grid, DD-U, DD-I, DD-II, DD-III, DD-II+III.
DD-U        DD-I        DD-II       DD-III      DD-II+III
Node kW     Node kW     Node kW     Node kW     Node kW
4    60     2    185    12   300    29   300    12   170
7    100    4    160    14   160    30   160    14   100
12   120    7    200    15   200    31   200    15   180
14   80     19   185    16   280    32   280    29   160
19   110    23   210    -    -      -    -      31   190
23   80     -    -      -    -      -    -      32   140
21   110    -    -      -    -      -    -      -    -
26   70     -    -      -    -      -    -      -    -
29   60     -    -      -    -      -    -      -    -
32   150    -    -      -    -      -    -      -    -
Table 4: Locations (node numbers) and capacity of community solar facilities under each DD. The generating capacity is in kW.
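The stated 25.3% penetration can be cross-checked against Tables 3 and 4. The sketch below is a minimal check, assuming that penetration is defined as total installed solar capacity divided by total nominal real power load, and that the flattened rows of Table 4 are read as (node, kW) pairs assigned to the DD columns as shown in the dictionary; both are interpretations rather than statements from the paper.

```python
# Nominal real power load P [kW] at nodes 2..33, copied from Table 3.
LOAD_KW = [
    100, 90, 120, 60, 60, 200, 200, 60, 60, 45, 60,   # nodes 2-12
    60, 120, 60, 60, 60, 90, 90, 90, 90, 90, 90,      # nodes 13-23
    420, 420, 60, 60, 60, 120, 200, 150, 210, 60,     # nodes 24-33
]

# Community solar capacity [kW] by node for each DD, as read from Table 4.
DD_CAPACITY_KW = {
    "DD-U": {4: 60, 7: 100, 12: 120, 14: 80, 19: 110,
             23: 80, 21: 110, 26: 70, 29: 60, 32: 150},
    "DD-I": {2: 185, 4: 160, 7: 200, 19: 185, 23: 210},
    "DD-II": {12: 300, 14: 160, 15: 200, 16: 280},
    "DD-III": {29: 300, 30: 160, 31: 200, 32: 280},
    "DD-II+III": {12: 170, 14: 100, 15: 180, 29: 160, 31: 190, 32: 140},
}

total_load_kw = sum(LOAD_KW)
for name, facilities in DD_CAPACITY_KW.items():
    capacity_kw = sum(facilities.values())
    penetration = 100.0 * capacity_kw / total_load_kw
    print(f"{name}: {capacity_kw} kW installed, {penetration:.1f}% of {total_load_kw} kW load")
```

Under this reading every DD installs the same total capacity, so the scenarios differ only in where the solar is located.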

The solar generation data is taken from NREL’s System Advisory Model (SAM) tool. The data is of a 185kW distributed commercial solar PV facility, located in Phoenix, AZ, using the SunPower SPR-E19-310-COM module, and SMA America (STP 60-US-10, 400V) inverter. The DC to AC ratio is set to the default of 1.2. The desired array size is set to 220kWdc, giving a total AC capacity of 179.580kWac. All other parameters are left unchanged in the SAM setup. Figure 5 shows a sample 24-hour generation profile of the solar facility.

Figure 5: Sample solar PV generation profile for community PV facility, queried for May 14, 2019.

Appendix C Gradient-based correction

The correction layer looks to minimize the inequality constraint violation:

We can use gradient descent with momentum to iteratively solve this problem, with each iteration being a corrective step:

Initialize:

where the final step is equivalently . This relationship can be calculated using the implicit function theorem and the constraints. Namely:

These Jacobians are also used in the backpropagation step. Various tools and methods exist for efficiently calculating the Jacobians.
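A correction loop of this kind can be sketched as follows. This is a simplified illustration rather than the correction layer used in this paper: it assumes linear inequality constraints A y <= b, measures violation as the sum of squared positive parts, and omits the equality constraints and the implicit-function-theorem Jacobians. The function names, step size, momentum coefficient, and example data are all hypothetical.

```python
def violation(y, A, b):
    """Sum of squared positive parts of A y - b (zero when A y <= b holds)."""
    total = 0.0
    for row, bk in zip(A, b):
        residual = sum(a * yi for a, yi in zip(row, y)) - bk
        total += max(0.0, residual) ** 2
    return total


def correct(y, A, b, lr=0.1, beta=0.5, steps=200):
    """Reduce the inequality violation by gradient descent with momentum."""
    y = list(y)
    delta = [0.0] * len(y)  # momentum buffer, one entry per variable
    for _ in range(steps):
        grad = [0.0] * len(y)
        for row, bk in zip(A, b):
            residual = max(0.0, sum(a * yi for a, yi in zip(row, y)) - bk)
            for j, a in enumerate(row):
                grad[j] += 2.0 * residual * a  # gradient of the squared hinge
        delta = [beta * d + lr * g for d, g in zip(delta, grad)]
        y = [yi - d for yi, d in zip(y, delta)]  # one corrective step
    return y


# Hypothetical example: box limits 0 <= y1 <= 1 and 0 <= y2 <= 1.
A = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b = [1.0, 0.0, 1.0, 0.0]
y0 = [1.5, -0.3]
y_corrected = correct(y0, A, b)
print(violation(y0, A, b), violation(y_corrected, A, b))
```

In a trained network the same steps would be unrolled as differentiable operations so that gradients can flow through the correction during backpropagation.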