1 Background and motivations
We address the problem of accelerating the computation of current flows in power transmission grids by using artificial neural networks to emulate slower physical simulators, following earlier pioneering work [6, 5, 3, 2]. Key to our approach is the ability to simulate the effect of planned, coordinated actions on the grid topology (as opposed to accidental changes suffered by the grid). Our neural network models may then be used as part of an overall computer-assisted decision process in which human operators (dispatchers) ensure that the power grid is operated securely at all times, namely that the currents flowing in all lines stay below certain thresholds (line thermal limits). Figure 1 illustrates the problem setting on a toy example. If a line exceeds its thermal limit, it may be damaged, melt, break or cause a fire, so circuit breakers usually put it out of service before this happens. The grid must then be reconfigured quickly to re-balance current flows and prevent more lines from exceeding their thermal limits, which might result in a cascading failure (black-out). The space of possible grid topologies grows exponentially with the number of substations. For example, the French high-voltage transmission grid includes substations, with more than a dozen possible configurations per substation and thus possible grid topologies. Even if only a small number of those are achievable, the search space is still enormous. In practice, Transmission System Operators (TSOs) restrict dispatchers to a very limited set of candidate operations. However, operating the grid is becoming increasingly complex because of the advent of less predictable renewable energies, the globalization of energy markets, growth in consumption and concurrent limitations on new line construction. It is therefore becoming urgent to optimize grid operation more tightly, considering a broader range of topological changes applied more frequently, without compromising security.
2 Proposed methodology
Our objective is to approximate a function that maps input data (e.g. power production and consumption) to output data (e.g. power flows), parameterized by a discrete “grid topology vector” taking values in an action space (all possible power-grid topologies, e.g. line interconnections). For any fixed topology, training data pairs are drawn i.i.d. according to an unknown probability distribution. In our application setting, the input is drawn randomly, but the output is a deterministic function of the input and the topology, implementing Kirchhoff’s circuit laws and calculated by a physical simulator that we wish to approximate.
We call simple generalization the capability of a neural net to approximate the output for test inputs not belonging to the training set, when topology values are drawn i.i.d. from a distribution that remains the same in training and test data (this includes the case of a fixed topology). Conversely, if topology values are drawn according to a source domain distribution in training data and from a different target domain distribution in test data, we talk about super-generalization. This setting is a particular case of transfer learning [7].
One particularity of our application domain in terms of transfer learning is that we have one primary “reference” source domain (corresponding in the power grid to a reference grid topology), around which small variations are made. This is a generic scenario in industry for systems that operate around nominal conditions, so we anticipate that our method could be extended to other similar situations. In our application setting, we can easily get a lot of training data in the reference topology (corresponding to the typical way in which the grid is operated). We have comparatively little training data from other, secondary source domains, corresponding to unary changes in grid topology (a single 1 at one position of the topology vector). Finally, we have extremely scarce data, or no data at all, for training from domains corresponding to double changes or higher-order changes (considered target domains). This motivates our architectural design.
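The domain structure described above can be illustrated with a minimal sketch. It assumes, as the "single 1" wording suggests, that the topology vector is binary and that the reference topology is the all-zero vector; the function names and the toy number of unary actions are hypothetical and not taken from the paper.

```python
from itertools import combinations

import numpy as np


def reference_topology(n_actions):
    """Primary source domain: no deviation from the reference (assumed all zeros)."""
    return np.zeros(n_actions, dtype=int)


def unary_topologies(n_actions):
    """Secondary source domains: exactly one unary change (one-hot rows)."""
    return np.eye(n_actions, dtype=int)


def double_topologies(n_actions):
    """Target domains: every combination of two simultaneous unary changes."""
    rows = []
    for i, j in combinations(range(n_actions), 2):
        tau = np.zeros(n_actions, dtype=int)
        tau[[i, j]] = 1
        rows.append(tau)
    return np.array(rows)


n_actions = 4  # hypothetical toy value, not a figure from the paper
tau_ref = reference_topology(n_actions)
tau_unary = unary_topologies(n_actions)
tau_double = double_topologies(n_actions)
print(tau_ref, tau_unary.shape, tau_double.shape)
```

The number of double changes grows quadratically with the number of unary actions, which is one way to see why training data for them is scarce.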
Our proposed Latent Encoding of Atypical Perturbations network, or LEAP net (Figure 2), is composed of three parts: an Encoder, learning an embedding of the input data; a Decoder, learning how to perform the required task within this latent representation; and a Latent module, placed between the Encoder and the Decoder, where the topology vector intervenes. The overall architecture is given by:
where the two encoders and the two decoders are all differentiable functions (typically implemented as artificial neural networks). The operation ⊙ denotes component-wise multiplication and ∘ function composition. If the system is in the reference topology , predictions are made according to . A typical way in which we train LEAP nets is to use a lot of training data in the reference topology (primary source domain) and very few examples for each of the unary changes (secondary source domains); we then expect the network to generalize to target domains corresponding to double or higher-order changes.
While our architecture draws inspiration from both Dropout [8] and Residual Neural Networks [4] in its mathematical formulation, the underlying concept is quite different. Here we first embed the input in a latent space by applying the Encoder. Then, based on the topology vector and the location of the embedded input within the latent space, we compute the corresponding leap. Then we decode the signal by applying the Decoder. These latent leaps contain information about how much the system actually deviates from the reference state, and in which direction. Hence, our architecture only needs to learn to modulate the system response around its nominal value.
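The embed, leap, decode sequence can be sketched in a few lines of NumPy. This is a hedged illustration, not the authors' implementation: the symbol names E, e, d, D and the assumed form y_hat = D(E(x) + d(e(E(x)) ⊙ tau)) are one reading of the description (two encoders, two decoders, component-wise multiplication by the topology vector, a residual-style addition in latent space). All layer sizes are hypothetical, the weights are random and untrained, and d and D are taken linear without bias so that the leap vanishes in the all-zero reference topology.

```python
import numpy as np

rng = np.random.default_rng(0)


def dense(in_dim, out_dim):
    """Random weights and zero bias for one layer (untrained, illustration only)."""
    return rng.normal(scale=0.1, size=(in_dim, out_dim)), np.zeros(out_dim)


def identity(a):
    return a


def layer(params, h, activation=np.tanh):
    w, b = params
    return activation(h @ w + b)


class LeapNet:
    """Assumed form: y_hat = D(E(x) + d(e(E(x)) * tau)); names are hypothetical."""

    def __init__(self, x_dim, y_dim, latent_dim, tau_dim):
        self.E = dense(x_dim, latent_dim)    # main encoder
        self.e = dense(latent_dim, tau_dim)  # inner encoder of the latent module
        self.d = dense(tau_dim, latent_dim)  # inner decoder (linear, zero bias here)
        self.D = dense(latent_dim, y_dim)    # main decoder (linear, zero bias here)

    def leap(self, h, tau):
        z = layer(self.e, h)                 # project the embedding onto action units
        return layer(self.d, z * tau, activation=identity)  # mask by tau, map back

    def __call__(self, x, tau):
        h = layer(self.E, x)                 # embed the input
        return layer(self.D, h + self.leap(h, tau), activation=identity)


net = LeapNet(x_dim=5, y_dim=3, latent_dim=8, tau_dim=4)
x = rng.normal(size=(2, 5))
t1 = np.array([1.0, 0.0, 0.0, 0.0])
t2 = np.array([0.0, 0.0, 1.0, 0.0])
y_ref = net(x, np.zeros(4))      # reference topology: no leap in this sketch
y_double = net(x, t1 + t2)       # a double change never seen in isolation
print(y_ref.shape, np.abs(y_double - y_ref).max())
```

Because d and D are linear in this sketch, the effect of the double change equals the sum of the effects of the two unary changes; with non-linear submodules that additivity is no longer guaranteed by construction.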
3 Predicting flows in power grids
We present results for our target application on simulated and real data. Synthetic data allow us to perform controlled, systematic experiments and to compare neural network approaches with the DC approximation, a standard baseline in power systems. Real data allow us to check whether our method scales computationally while providing prediction accuracies that are acceptable for our application domain.
3.1 Case 118 synthetic data benchmark
We conducted controlled experiments on a standard medium-size benchmark from “Matpower” [9], a library commonly used to test power system algorithms [1]: case118, a simplified version of the Californian power grid (dim = injections and dim = 186 power lines). Topology changes consist in reconfiguring line connections in one or more substations (see Figure 1). Such changes are more complex than the simple line disconnections considered in earlier work [2, 3]. There are possible unary actions (corresponding to single node splitting or merging, compared to the reference topology). To build the Source domain training and test sets, we sampled randomly . In the reference topology (), we sampled input vectors . But for each , we sampled only input vectors . We used Hades2 (freeware available at http://www.rte.itesla-pst.org/) to compute the flows in all cases. This resulted in a training set of rows (each row being one triplet of input, topology vector and output). We created an independent test set of the same size in a similar manner.
We proceeded differently for the Target dataset. We sampled (Target domains: ) among the possible double actions , and . Then, for each of these , we sampled inputs (with the same distribution as the one used for the training and regular test sets). We used the same physical simulator to compute the outputs from the inputs and the topology vectors. The super-generalization set then contains rows, corresponding to different triplets .
We compare the proposed LEAP net with two benchmarks: the DC approximation, a standard baseline in power systems, which is a linearization of the non-linear AC (Alternating Current) power flow equations, and the baseline neural network architecture (Figure 2), in which the topology vector is simply an input. The mean-square error was optimized using the TensorFlow Adam optimizer. To make the comparison least favorable to the LEAP net, all hyper-parameters (learning rates, number of units) were optimized by cross-validation for the baseline network.
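For readers unfamiliar with the DC approximation, the following is a minimal textbook sketch on a toy 3-bus network. It is not the baseline implementation used in the experiments: the network, the reactances, the injections and the function name `dc_power_flow` are all hypothetical. Under the DC assumptions (flat voltage magnitudes, small angle differences, lossless lines), bus angles solve a linear system and each line flow is the angle difference divided by the line reactance.

```python
import numpy as np


def dc_power_flow(n_bus, lines, injections, slack=0):
    """Textbook DC power flow.

    lines: list of (from_bus, to_bus, reactance) tuples.
    injections: net power injected at each bus (must sum to zero).
    Returns the flow on each line, oriented from from_bus to to_bus.
    """
    B = np.zeros((n_bus, n_bus))
    for i, j, x in lines:
        b = 1.0 / x                      # line susceptance
        B[i, i] += b
        B[j, j] += b
        B[i, j] -= b
        B[j, i] -= b
    keep = [k for k in range(n_bus) if k != slack]
    theta = np.zeros(n_bus)              # slack bus angle fixed to zero
    p = np.asarray(injections, dtype=float)
    theta[keep] = np.linalg.solve(B[np.ix_(keep, keep)], p[keep])
    return np.array([(theta[i] - theta[j]) / x for i, j, x in lines])


# Hypothetical triangle network: generation at bus 0, load at bus 2.
lines = [(0, 1, 0.1), (1, 2, 0.1), (0, 2, 0.1)]
injections = [1.0, 0.0, -1.0]
flows_full = dc_power_flow(3, lines, injections)

# Simplest possible topology change: line 0-2 taken out of service.
flows_cut = dc_power_flow(3, lines[:2], injections)
print(flows_full, flows_cut)
```

With equal reactances the direct line carries two thirds of the power and the two-line path one third; once the direct line is removed, the remaining path carries everything. The node splitting and merging actions studied in the paper change the bus structure itself and are not covered by this toy example.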
The results show that the baseline neural network architecture (green curve) is not viable: not only does it perform worse than the DC approximation, but its variance is also quite high. While its regular generalization improves with the number of training epochs, its super-generalization performance gets worse.
3.2 Real French ultra-high voltage power grid data
We now present results on a part of the French ultra-high voltage power grid: the “Toulouse” area, with consumption nodes, production nodes, lines and substations often split in a variable number of nodes.
The inputs representing injections (production and consumption) are of dim and the outputs (flows) of dim . In this study, the inputs and outputs come from real historical data from the company RTE (even in real records, flows are estimated, not measured). One important difference when using played-back data, compared to simulation, is that we cannot intervene (this is strictly observational data). To place ourselves in a realistic transfer learning setting, we used data from 2012 to May 2017 for training and data from June and July 2017 for testing. This favored changes in distribution. Another key difference in real data is the action space: actual grid topologies (specifying line interconnections) are not precisely recorded. Only information on line outages is available to us as surrogate information on topology. This makes the neural net's task harder: it must learn the effects of latent topological changes. This unfortunate loss of information on exact grid topology interventions makes it impossible for us to compare our method to the DC approximation, since computing this approximation requires a full description of the topology. The results of Fig. 8 lead to the same conclusions as in the previous section: the LEAP model generalizes not only to data drawn from a distribution similar to the one it was trained on (Fig. 8(a)) but also to unseen grid states (Fig. 8(b)), and does so better than the reference architecture, which is a critical property for our application.
4 Discussion and conclusion
The LEAP net architecture has been evaluated on a number of real and artificial test cases. Training was performed on data triplets (input, topology vector, output) for which the topology vector belongs to source domains. The LEAP net generalizes not only by approximating the output well for new inputs when the topology belongs to a source domain, but also when it belongs to a target domain (super-generalization). In our experiments, we achieved a speed-up of times using the LEAP net, compared to running the physical simulator, on the synthetic dataset (power grid of nodes). With data stored in computer memory, our experiments on the Toulouse area attain a speed-up of times compared to running the physical simulator. These computational evaluations were carried out using a single high-end Nvidia Titan X Graphics Processing Unit (GPU). Further work includes scaling up our method computationally to the entire French extra-high voltage power grid. We also need to improve prediction accuracy before our system can be deployed to production. However, the fact that the regular generalization performance is already within an acceptable accuracy range is promising. We anticipate several developments. From the theoretical point of view, we could seek mathematical guarantees of super-generalization in the form of performance bounds. It can easily be proved that a LEAP net architecture with linear submodules and exhibits super-generalization with respect to linear superposition of perturbations. However, we have demonstrated experimentally that super-generalization extends to combinations of non-linear perturbations. We are hopeful that more powerful theoretical results can be derived. From the practical point of view, the LEAP net architecture could be used in other application domains that lend themselves to transfer learning.
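The linear case mentioned above can be made concrete with a short derivation. It is a sketch under stated assumptions, not the authors' proof: the architecture is taken to have the form shown in the first line, with hypothetical symbol names E, e (encoders) and d, D (decoders), and the two linear submodules are assumed to be d and D, both without bias.

```latex
\begin{align*}
\hat{y}(x;\tau) &= D\big(E(x) + d(e(E(x)) \odot \tau)\big)
                 = D\big(E(x)\big) + D\Big(d\big(e(E(x)) \odot \tau\big)\Big)
                 && \text{($D$ linear)} \\
\hat{y}(x;\tau_1+\tau_2) - \hat{y}(x;0)
                &= D\Big(d\big(e(E(x)) \odot \tau_1\big)\Big)
                 + D\Big(d\big(e(E(x)) \odot \tau_2\big)\Big)
                 && \text{($\odot$ and $d$ linear in $\tau$)} \\
                &= \big[\hat{y}(x;\tau_1) - \hat{y}(x;0)\big]
                 + \big[\hat{y}(x;\tau_2) - \hat{y}(x;0)\big].
\end{align*}
```

Under these assumptions, the predicted effect of a double change is the sum of the predicted effects of the two unary changes, so a network fitted on the reference topology and on unary changes extrapolates to their superposition by construction.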
- [1] O. Alsac and B. Stott. Optimal load flow with steady-state security. IEEE Transactions on Power Apparatus and Systems, PAS-93(3):745–751, 1974.
- [2] B. Donnot, I. Guyon, M. Schoenauer, A. Marot, and P. Panciatici. Anticipating contingencies in power grids using fast neural net screening. In IEEE WCCI 2018, Rio de Janeiro, Brazil, July 2018.
- [3] B. Donnot, I. Guyon, M. Schoenauer, A. Marot, and P. Panciatici. Fast power system security analysis with guided dropout. In ESANN, Apr. 2018.
- [4] K. He, X. Zhang, S. Ren, and J. Sun. Identity mappings in deep residual networks. In ECCV, pages 630–645. Springer, 2016.
- [5] T. Hossen, S. J. Plathottam, R. K. Angamuthu, P. Ranganathan, and H. Salehfar. Short-term load forecasting using deep neural networks (DNN). In 2017 North American Power Symposium (NAPS), pages 1–6, Sept. 2017.
- [6] T. Nguyen. Neural network load-flow. IEE Proceedings - Generation, Transmission and Distribution, 142:51–58(7), January 1995.
- [7] S. J. Pan and Q. Yang. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10):1345–1359, October 2010.
- [8] N. Srivastava, G. E. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: a simple way to prevent neural networks from overfitting. JMLR, 15(1):1929–1958, 2014.
- [9] R. D. Zimmerman et al. Matpower. IEEE Trans. on Power Systems, pages 12–19, 2011.