A Statistical Learning Approach to Reactive Power Control in Distribution Systems

10/25/2019 ∙ by Qiuling Yang, et al.

Pronounced variability due to the growth of renewable energy sources, flexible loads, and distributed generation is challenging residential distribution systems. This context motivates fast, efficient, and robust reactive power control. Real-time optimal reactive power control is possible in theory by solving a non-convex optimization problem based on the exact model of distribution flow. However, the lack of high-precision instrumentation and reliable communications, and the heavy computational burden of non-convex optimization solvers, render computing and implementing the optimal control challenging in practice. Taking a statistical learning viewpoint, the input-output relationship between each grid state and the corresponding optimal reactive power control is parameterized in the present work by a deep neural network, whose unknown weights are learned offline by minimizing the power loss over a number of historical and simulated training pairs. In the inference phase, one just feeds the real-time state vector into the learned neural network to obtain the 'optimal' reactive power control with only several matrix-vector multiplications. The merits of this novel statistical learning approach are computational efficiency as well as robustness to random input perturbations. Numerical tests on a 47-bus distribution network using real data corroborate these practical merits.




I Introduction

Reliability and operational efficiency of modern distribution systems are currently being challenged by high penetration of unpredictable renewable energy resources, large-scale deployment of electric vehicles, and ‘human-in-the-loop’ demand response programs. As a consequence, reverse power flow as well as voltage magnitude fluctuations are prevalent in today's residential grids [4]. For instance, solar power generation may drop by of the photovoltaic (PV) nameplate rating within one minute due, for example, to intermittent cloud coverage [3], which will result in a sizable voltage sag if no action is taken. The role of networked control in power systems is to maintain desired operations, while preventing contingency events involving voltage and/or frequency instabilities from developing into large-scale cascades and blackouts. To protect electrical devices, bus voltage magnitudes in distribution grids are typically regulated to be within a certain range, e.g., around their nominal values. A common practice to achieve this is through reactive power compensation.

Traditional approaches have relied on utility-owned devices including load-tap-changing transformers, voltage regulators, and capacitor banks to control reactive power injection into the grid. Although these devices perform well in certain cases, their slow response times, discrete control actions, and limited lifespans make them ill-suited for fast reactive power control [6]. Recent advances in smart inverters offer new opportunities by circumventing these limitations. Despite these advantages, computing the optimal setpoints for smart inverters can be cast as an instance of the optimal power flow task, which entails solving a non-convex optimization problem [6, 11]. Furthermore, to deal with renewable energy uncertainties as well as unreliable communication links (which cause delays and even communication failures), stochastic, online, decentralized, and localized smart inverter control schemes have been developed [11, 16, 13, 23, 21]. Nonetheless, centralized solvers suffer from high computational complexity, while decentralized and localized schemes converge slowly.

To bypass these hurdles, recent proposals have engaged machine learning approaches for fast networked control and monitoring [14, 17, 20, 19, 18]. A support vector machine-based method was devised in [10] to approximate a near-optimal inverter control rule. In [17], the authors developed a voltage regulation scheme using deep reinforcement learning. Deep (recurrent) neural networks were used for power system state estimation and forecasting in [20]. By exploiting the power grid topology, a physics-aware neural network was proposed for state estimation [19]. Related schemes leveraging deep neural networks that ‘learn-to-optimize’ also appeared in resource allocation [12] and outage detection [22]. Unfortunately, training existing supervised learning models for reactive power control requires large-scale labeled training data, which are difficult to obtain in real-world physical systems. Reinforcement learning approaches, on the other hand, require prior knowledge for designing the so-called reward functions and often converge slowly.

Different from existing efforts, in this work an unsupervised statistical learning approach is developed for computationally intensive and time-sensitive reactive power control. Specifically, a deep neural network is used to parameterize the functional relationship between the grid state vector and the optimal reactive power compensation. The computational complexity of solving non-convex optimization problems is shifted to offline training of a deep neural network. In the training phase, grid state vectors obtained from historical data or through simulations are fed to the deep neural network, whose weight parameters are updated iteratively via the policy gradient method. In the online inference phase, or real-time implementation, one just needs to pass the observed state vector through the trained deep neural network to obtain a near-optimal reactive power control at the output. Our model-free approach requires no system knowledge and is computationally inexpensive. It also bypasses the need for data labels, and tackles the optimal reactive control problem through policy gradients.

Regarding the remainder of this paper, Section II introduces our system model. Section III outlines the reactive power control problem formulation, followed by the proposed statistical learning solver in Section IV. Numerical tests using a real-world feeder are presented in Section V, with concluding remarks drawn in Section VI.

Notation. Lower- (upper-) case boldface letters denote column vectors (matrices), with the exception of power flow vectors , and normal letters represent scalars. Calligraphic symbols are reserved for sets, and represents the distribution over space .

II System Model

Consider a radial power distribution network modeled by a tree graph , where denotes the set of buses, and the set of edges. The tree is rooted at the substation bus indexed by , and all branch buses are collected in . For each bus , let denote its squared voltage magnitude, and denote its complex power injection, where and with superscript () specifying generation (consumption).

Thanks to the radial distribution grid topology, every non-root bus has a unique parent bus, denoted by ; and they are joined through the -th distribution line , whose impedance is given by . Let represent the complex power flow from buses to seen at the ‘front’ end, and represent the magnitude square of the current over line . For future reference, collect all nodal and line quantities into column vectors , , , , , , , and . See Fig. 1 for a depiction.

Fig. 1: Bus is connected to its unique parent via line .

The radial grid can be described by the so-termed branch flow model [2], which enforces the following equations for all


where the set collects all children buses for bus .

Traditionally, for a smart inverter located at bus with nominal power capacity , and a solar panel equipped at this bus with a nameplate active power capacity , it should hold that . In addition, the reactive power generated by the inverter is constrained by , where is the smart inverter output. However, to handle the special scenario in which no reactive power can be provided when the maximum inverter output is reached (i.e., ), inverters with oversized nameplate capacity (i.e., ) are used in practice. For instance, the reactive power compensation provided by inverter can be , if one chooses and limits to instead of , regardless of the instantaneous PV output [11]. Under this policy, the reactive injection region is the time-invariant convex set


where , and denotes the number of inverters in the grid. Moreover, the voltage magnitude at every bus should be maintained within a prespecified range, i.e., . In practice this range is chosen to be of its nominal value. For future use, rewrite voltage regulation constraints at all buses in a compact way as


In distribution grids, it holds that and when bus only has a capacitor; while , , when bus is a pure load bus; and a distributed generation bus not only consumes power denoted by , , but also generates active power , and provides negative or positive reactive power . Moreover, active power consumption and solar generation can be predicted through the hourly and real-time market (see e.g., [11]), or by running load demand (solar generation) prediction algorithms [20].
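The effect of inverter oversizing described above can be illustrated with a small calculation. The sketch below assumes the standard apparent-power limit p² + q² ≤ s² for an inverter; the function name, the per-unit values, and the 10% oversizing factor are hypothetical choices made for illustration only.

```python
import math

def reactive_limit(apparent_capacity, active_output):
    """Largest |q| such that active_output^2 + q^2 <= apparent_capacity^2."""
    if active_output > apparent_capacity:
        raise ValueError("active output exceeds inverter capacity")
    return math.sqrt(apparent_capacity**2 - active_output**2)

p_max = 1.0            # PV nameplate active capacity in per unit (hypothetical)
s_exact = p_max        # inverter sized exactly to the panel
s_over = 1.1 * p_max   # hypothetical 10% oversizing

# Reactive headroom when the panel produces its full nameplate output.
q_full_sun_exact = reactive_limit(s_exact, p_max)
q_full_sun_over = reactive_limit(s_over, p_max)
```

With an exactly sized inverter no reactive power is available at full PV output, whereas the oversized one always retains some headroom. Limiting the reactive setpoint to that worst-case headroom gives a bound that does not depend on the instantaneous PV output.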

III Problem Formulation

In the envisioned distribution network operation scenario, active power is controlled at a coarse timescale. Depending on the variability of active power and cyber resources (sensing, communication, and computation delays), reactive power compensation occurs over time intervals indexed by , which could either be real-time market periods, e.g., minutes, or even shorter, e.g., seconds. Let denote the active and reactive power injections at all non-root buses during control period . The total power loss across all distribution lines can be expressed as . Given load consumptions and generation at the beginning of each interval , the goal of reactive power control is to find feasible reactive power injections for smart inverters such that the power loss across all distribution lines is minimized while maintaining all bus voltage magnitudes within a prescribed range. Formally, the reactive power control problem is formulated as follows


where admits the following form


Clearly, constraints (5b)–(5d) and (5f) are linear with respect to system variables . Nevertheless, constraints in (5e) are quadratic equalities, depicting a non-convex feasible set and rendering the optimization problem non-convex and NP-hard in general [7].
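The non-convexity of a quadratic equality can be checked on a tiny example. The sketch below assumes the usual branch-flow form of such an equality, P² + Q² = v·ℓ; the symbol names and the numerical points are illustrative assumptions and are not taken from the feeder studied here. Two points satisfy the equality, yet their midpoint does not, so the set they belong to cannot be convex.

```python
def on_surface(P, Q, v, ell, tol=1e-12):
    """True if the point satisfies the quadratic equality P^2 + Q^2 = v * ell."""
    return abs(P**2 + Q**2 - v * ell) <= tol

# Two hypothetical points (P, Q, v, ell), both satisfying the equality.
point_a = (1.0, 0.0, 1.0, 1.0)
point_b = (-1.0, 0.0, 1.0, 1.0)

# A convex set would have to contain the midpoint of any two of its points.
midpoint = tuple(0.5 * (a + b) for a, b in zip(point_a, point_b))

a_feasible = on_surface(*point_a)
b_feasible = on_surface(*point_b)
mid_feasible = on_surface(*midpoint)
```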

To address this issue, these equalities in (5e) have been recently relaxed to convex inequalities described by the hyperbolic constraints [7]


Substituting (6) into (5) yields

s.to (7b)

where (7c) can also be equivalently expressed as a second-order cone


Constraints (7b) and (7c) now represent a convex feasible set, and the problem in (7) can be solved by standard convex programming methods. Interestingly, it has been shown that under certain conditions, equalities are attained in (8) at the optimal solution of (7); see details in e.g., [gan2012exactnessconvex]. In this case, the optimal solution of the original problem (5) is recovered too.
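The equivalence between a hyperbolic constraint and its second-order cone form can be verified numerically. The sketch below assumes the standard branch-flow notation, with the hyperbolic constraint written as P² + Q² ≤ v·ℓ for v, ℓ ≥ 0 and the cone form as ‖[2P, 2Q, ℓ − v]‖₂ ≤ ℓ + v; these symbols are assumptions, since the exact expressions of (6) and (8) are not reproduced here. Random points that lie within a tiny margin of the boundary are skipped to avoid floating-point ties.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random test points; v and ell are nonnegative as squared magnitudes.
P, Q = rng.uniform(-2.0, 2.0, size=(2, 10_000))
v, ell = rng.uniform(0.0, 2.0, size=(2, 10_000))

# Hyperbolic form: P^2 + Q^2 <= v * ell.
hyperbolic = P**2 + Q**2 <= v * ell

# Second-order cone form: ||[2P, 2Q, ell - v]||_2 <= ell + v.
cone = np.sqrt((2 * P) ** 2 + (2 * Q) ** 2 + (ell - v) ** 2) <= ell + v

# Ignore points numerically on the boundary, where rounding may differ.
clear = np.abs(P**2 + Q**2 - v * ell) > 1e-9
forms_agree = bool(np.all(hyperbolic[clear] == cone[clear]))
```

Squaring both sides of the cone inequality and cancelling the common terms gives 4(P² + Q²) ≤ 4vℓ, which is why the two forms describe the same set.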

Fig. 2: Statistical learning architecture.

It is worth pointing out that problem (4) formally characterizes the optimal reactive power control policies for a diverse set of networked control problems, including e.g., voltage regulation, Volt/VAR control, and optimal power flow [gan2012exactnessconvex], by choosing suitable objective functions. If active and reactive power injections were both known precisely in advance and remained constant within period , the optimal reactive power compensation would be found by solving (4). However, such conditions are hardly met in contemporary distribution systems, due partly to i) time-varying active and reactive injections; and, ii) noise-contaminated observations caused by direct measurements, delayed estimates, or inaccurate forecasts. To bypass these challenges, minimizing the averaged power loss over the power injections provides an alternative to the static reactive power control formulation in (4), given by


For notational convenience, let us define the state vector , which is assumed to be a stationary random process, and rewrite the loss function as . Substituting this into the original problem (9) yields


Rather than the unreliable and possibly obsolete instantaneous found through (4), problem (10) is expected to yield smoother power control decisions. However, evaluating the expectation in (10) is nearly impossible in practice, even if the probability density function of were known. A further challenge comes from the computational burden of dealing with the non-convex constraint (5e). To approximate in a computationally efficient manner, a statistical learning approach is developed next.

IV Statistical Learning

The rapid growth in renewable generation is displacing traditional forms of energy generation while increasing the need for controllable and flexible resources to balance fluctuations in load and generation. In this section, we introduce a novel parameterization form of the reactive power control problem, as well as a learning solver based on a deep neural network.

IV-A Parameterization

Instead of solving (10) exactly, consider a parametrization for the reactive power compensation as follows


where is some function given by e.g., a deep neural network, and collects all unknown parameters. Building on this, finding the optimal reactive power control in (10) boils down to finding the optimal parameter vector , such that the expected loss is minimized; that is,


To find , a natural approach is to apply gradient descent type algorithms. To this aim, one needs to obtain the gradient of the objective function in (12) with respect to , i.e., . In practice however, there is no analytic form of as a function of or . In (5), for instance, the loss function depends only implicitly on . Instead, we can observe the function value for any grid operating point [cf. (5)], which can be used to estimate the gradient. This motivates development of a model-free approach [5]. Specifically, for a given set of iterates and reactive power realizations , the corresponding loss function values can be observed from the system. Using and , the parameter vector can be updated through the policy gradient method [15], which constructs a gradient estimate with only function observations.

A control policy here is a mapping from state vectors to reactive power control decisions (a.k.a. actions) . Consider first the stochastic control policy , specifying a conditional distribution of all possible decisions given the current state . Denoting the probability of taking action at state as , the gradient of with respect to can be written as


where denotes the probability of state , and is drawn from the distribution . Here, the computation of is translated to evaluating the expectation of function multiplied by the gradient of the policy distribution . This is indeed useful when we have an analytic form for . In such a case, we may further replace the expectation on the right-hand side of (13) with a sample mean. Specifically, by using previous function observations, we obtain the following gradient estimate


where is the injected reactive power into the distribution grid, drawn from the distribution , and is the corresponding observed loss function value obtained by solving (5).
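A gradient estimate built only from observed function values can be sketched on a one-dimensional toy problem. The code below uses the standard score-function (log-derivative) form of the policy gradient with a Gaussian policy of mean `theta` and fixed standard deviation `sigma`; the quadratic `toy_loss`, the parameter values, and the sample size are all hypothetical stand-ins, not the power loss or the settings of this work.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_loss(q):
    """Stand-in for an observed loss value; NOT the grid power loss."""
    return (q - 1.0) ** 2

def policy_gradient_estimate(theta, sigma, num_samples):
    # Draw actions from the Gaussian policy N(theta, sigma^2).
    q = rng.normal(theta, sigma, size=num_samples)
    # Score: d/dtheta of log N(q; theta, sigma^2) = (q - theta) / sigma^2.
    score = (q - theta) / sigma**2
    # Sample mean of (observed loss) x (score); only loss values are used.
    return np.mean(toy_loss(q) * score)

theta, sigma = 0.0, 0.5
estimate = policy_gradient_estimate(theta, sigma, num_samples=200_000)

# For this toy loss, E[loss] = (theta - 1)^2 + sigma^2, so the exact
# gradient with respect to theta is available for comparison.
exact = 2.0 * (theta - 1.0)
```

The estimator never differentiates `toy_loss`, which is the property that makes the approach usable when the loss is only observable from the system.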

Previously, it was assumed that the policy is stochastic. In the deterministic case, the distribution is a delta function, i.e., . To evaluate in (14), one may approximate the delta function with a known density function centered around . To capture the power constraint , a truncated Gaussian distribution with a fixed support on the domain is considered in the next subsection.

IV-B Model-free learning

To find the policy , we restrict ourselves to the increasingly popular set of parameterizations, known as deep neural networks [8]. Indeed, deep neural networks have recently demonstrated remarkable performance in numerous fields, including computer vision, speech recognition, and robotics. A deep neural network can effectively tackle the ‘curse of dimensionality’ by extracting low-dimensional representations of high-dimensional data.

Consider a feed-forward deep neural network connected to a truncated Gaussian probability density function block; see Fig. 2 for an illustration. It takes as input the state vector , followed by fully connected hidden layers with ReLU activation functions. The output of the deep neural network is a set of mean and standard deviation pairs , each corresponding to a truncated Gaussian distribution. By feeding the outputs of the deep neural network into the probability density function block, the reactive power compensation vector is sampled from . Stacking all the weights of the deep neural network into the vector , we have a function approximation to estimate the reactive power compensation .
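The architecture described above can be sketched in a few lines of NumPy: a ReLU network produces one mean and one standard deviation per inverter, and an action is then drawn from a Gaussian truncated to the admissible box. The layer sizes, the limits in `q_max`, the tanh and sigmoid squashing of the outputs, and the rejection sampler are all illustrative assumptions and not the configuration used in the reported tests.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def init_layer(n_in, n_out):
    # Small random weights and zero biases (hypothetical initialization).
    return rng.normal(0.0, 0.1, size=(n_out, n_in)), np.zeros(n_out)

def policy_forward(state, hidden_layers, head, q_max):
    """Map a state vector to per-inverter (mean, std) pairs."""
    h = state
    for W, b in hidden_layers:
        h = relu(W @ h + b)
    W_out, b_out = head
    raw_mean, raw_std = np.split(W_out @ h + b_out, 2)
    mean = q_max * np.tanh(raw_mean)          # mean kept inside [-q_max, q_max]
    std = q_max / (1.0 + np.exp(-raw_std))    # std kept in (0, q_max)
    return mean, std

def sample_truncated_gaussian(mean, std, low, high):
    """Rejection sampling: redraw the entries that fall outside [low, high]."""
    q = rng.normal(mean, std)
    outside = (q < low) | (q > high)
    while outside.any():
        q[outside] = rng.normal(mean[outside], std[outside])
        outside = (q < low) | (q > high)
    return q

state_dim, num_inverters = 8, 3               # hypothetical dimensions
q_max = np.array([0.4, 0.4, 0.2])             # hypothetical reactive limits
sizes = [state_dim, 16, 16]                   # two hidden layers, for brevity
hidden_layers = [init_layer(a, b) for a, b in zip(sizes[:-1], sizes[1:])]
head = init_layer(sizes[-1], 2 * num_inverters)

state = rng.normal(size=state_dim)
mean, std = policy_forward(state, hidden_layers, head, q_max)
q = sample_truncated_gaussian(mean, std, -q_max, q_max)
```

Squashing the mean and standard deviation is one simple way to keep the rejection step cheap, because a substantial share of the Gaussian mass then always lies inside the box.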

Using the gradient estimate in (14), the weights can be successively updated as follows


where is a preselected learning rate. The update in (15) is model-free, since it does not require explicit knowledge of the actual form of the function or the distribution of . Unlike a traditional supervised approach, which requires a set of labeled training data [20], the method developed here is unsupervised; hence, it circumvents the need for labeled data and directly solves (10).

Fig. 3: Two-phase reactive power control procedure.

Training phase:

1: Initialize: .
2: for   do
3:     Observe historical measurement .
4:     Feed into the deep neural network.
5:     Obtain the deep neural network output mean and variance .
6:     Feed and into .
7:     Draw a sample from the distribution .
8:     Obtain an estimate for via (7).
9:     Calculate via (14).
10:     Update according to (15).
11: end for

Inference phase:

1: for  do
2:     Feed real-time measurement into the trained deep neural network.
3:     Obtain the deep neural network output mean and variance .
4:     Feed and into .
5:     Draw a sample from the distribution .
6: end for

Algorithm 1: A statistical learning approach to reactive power control

The proposed reactive power control procedure is tabulated in Alg. 1. It is implemented in two phases, namely offline training and online inference, as shown in Fig. 3. Specifically, in the training phase, a historical or simulated datum is fed into the deep neural network. For a given input datum , our network outputs a reactive power compensation . Subsequently, the distribution network returns a loss for this state-action pair (which can also be found by solving (5)). Finally, a gradient estimate can be obtained using the policy gradient method in (14), based on which the neural network weight parameters are updated following (15). The trained deep neural network is then used in the inference phase. Taking the real-time state vector as input, the trained deep neural network outputs the reactive power compensation to be implemented in the grid. Note that the proposed statistical learning approach is desirable for real-time reactive power control, as it shifts the computational burden of tackling non-convex optimization to offline training of a neural network.
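The two phases can be mimicked end to end on a toy problem. The sketch below is a heavily simplified stand-in for Alg. 1: the policy is a single weight `w` rather than a deep neural network, the Gaussian is not truncated, and `observed_loss` is a hypothetical quadratic that replaces the loss returned by the grid. The learning rate, batch size, and iteration count are likewise arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.2           # fixed exploration standard deviation (assumption)
learning_rate = 0.1   # hypothetical value
batch_size = 256      # hypothetical value

def observed_loss(state, q):
    """Toy stand-in for the grid loss; minimized at q = 0.5 * state."""
    return (q - 0.5 * state) ** 2

# Training phase: the policy mean is w * state, with a single weight w.
w = 0.0
for _ in range(2000):
    state = rng.uniform(-1.0, 1.0, size=batch_size)  # historical/simulated states
    q = rng.normal(w * state, sigma)                 # sample actions from the policy
    loss = observed_loss(state, q)                   # only loss values are observed
    score = (q - w * state) * state / sigma**2       # d/dw log N(q; w*state, sigma^2)
    grad_estimate = np.mean(loss * score)            # sample-mean gradient estimate
    w -= learning_rate * grad_estimate               # gradient step on the weight

# Inference phase: this sketch applies the policy mean to a new state
# instead of drawing a sample.
new_state = 0.8
q_applied = w * new_state
```

After training, `w` settles close to 0.5, the minimizer of the toy loss, so inference reduces to a single multiplication per state. This is the same division of labor the procedure relies on: the expensive part happens offline.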

V Numerical Tests

In this section, the performance of our proposed statistical learning scheme is evaluated on a real-world 47-bus feeder with high penetration of renewables [6]; see Fig. 4. This feeder is integrated with smart inverters located on buses , , , , and , with capacities , , , , and kW, respectively. A power factor of was assumed for all loads.

Fig. 4: Schematic diagram of the 47-bus distribution feeder.

The training and test data were obtained by splitting the consumption and solar generation from the Smart project collected on August 24, 2011 [3]. The CVX toolbox [9] was used to solve the SOCP problem in (7) to evaluate . The deep neural network used here consists of three fully connected hidden layers, with , and neurons per layer, respectively. To carry out the simulations, we used TensorFlow [1] on an NVIDIA Titan X GPU with 12 GB RAM. The weight parameters of the deep neural network were updated using the back-propagation algorithm with the Adam optimizer. The learning rate was fixed to , and the batch size was throughout epochs of tests.

To assess the performance of the proposed approach, the following baseline was considered. Assuming perfect observations of active and reactive power injections at the beginning of slot , the optimal reactive power control can be found by solving the following problem

s.to (16b)

where is treated as an optimization variable. It should be noted that tackling this problem in real time is computationally expensive, while the proposed approach finds after performing only several matrix-vector multiplications. The red curve in Fig. 5 shows the observed loss for the proposed approach, while the blue one depicts the loss for the deterministic optimal one obtained via (16) during the training phase. The light-colored curves correspond to the actual observed losses, while the dark ones are their running averages. Clearly, our model-free approach learns to make optimal decisions . In the inference phase, the loss of the proposed approach versus the baseline is presented in Fig. 6. This plot demonstrates that the proposed model-free approach finds near-optimal reactive power control decisions. The running time of the proposed approach is one order of magnitude less than that of the optimization-based approach.

Fig. 5: The training loss of the statistical learning approach compared with the baseline (optimal).
Fig. 6: The inference loss values of the statistical learning approach and the baseline (optimal).

VI Conclusions

In this work, a statistical learning framework for reactive power control in distribution grids was developed. Uncertainties and delays in acquiring the grid state motivate this learning framework. The non-convexity of the underlying optimization and the lack of model knowledge make reactive power control in modern grids challenging, if not impossible, to solve directly. The theory of statistical learning, empowered by the non-linear function approximation property of deep neural networks, provided a fresh viewpoint for solving this problem. In particular, this work modeled the reactive power control policy via a deep neural network. The weights of the deep neural network were updated in an unsupervised and model-free fashion, circumventing the need for labeled data as well as an explicit model of the system. Our proposed method is computationally inexpensive at run time, since the computational complexity is shifted to the training phase. Preliminary numerical results on the real-world 47-bus distribution network using real load data corroborate the merits of our developed approach. This work opens up several interesting directions for future research. Robust methods for reactive power control in the presence of corrupted or adversarial observations are worth investigating. Exploiting the topology of the power grid to design physics-informed architectures is also pertinent.