I Introduction
With the increasing penetration of renewable energy, power system frequency stability is of significant concern for reliable system operations. The frequencyconstrained unit commitment (FCUC) is proposed to address this challenge [Restrepo2005]. The swing equation is used to evaluate the system frequency in [Lee2013]. The mechanical output constraints are further taken into account in [Chavez2014][Wen2016]. Both strategies process a simple constraint set, which is easy to solve. However, important turbinegovernor dynamics are not considered. In [Teng2015], the dynamics of mechanical power is approximated using an integrator to derive the security criteria of frequency nadir. Ref. [Prakash2018] employed the formulation in [Teng2015] for the interval scheduling problem. In [Badesa2019][Chu2020], multiple frequency services are considered, where the supported power is approximated by integrator dynamics. Considering the fact that the centerofinertia (COI) frequency is governed by the classic system frequency response (SFR) model that admits an analytical solution, Refs. [Anderson1990], [Ahmadi2014] derived the analytical expression of the SFR under a step input and proposed an efficient piecewise linearization that converts the analytical solution into a set of constraints. This strategy has been fully extended into microgrid scheduling [Gholami2018], hierarchical frequency control [Muzhikyan2018], optimal power flow [Nguyen2019], and unit commitment with renewable energy resources [Zhang2020b], respectively.
Despite these excellent efforts of modeling SFR in mathematical programmingbased (MPbased) scheduling, these methods rely on the loworder model approximation that cannot be able to capture the entire SFR characteristics. In the classic SFR [Anderson1990][Ahmadi2014], individual generator speeds deviate from the COI frequency depending on the electric distances of generators. Therefore, only considering COI frequency can ignore insecure responses of certain generators. These methods are also unable to incorporate highorder models since there are no analytical solutions for the step response of higherorder differential equations. In addition, nonlinearities in SFR such as deadband and saturation cannot be taken into considerations by the existing approaches, which, however, have significant impacts on SFR [Kou2016]. On the other hand, directly incorporating frequency dynamics will result in highly complex MP [Avramidis2020][Zhao2020c].
To overcome the aforementioned limitations, we propose a deep neural network (DNN)based trajectory constraint encoding framework to tackle the FCUC. The framework will first train a DNNbased frequency nadir predictor using either real operation data or highfidelity simulation data. In this case, the frequency nadir predictor can reflect a variety of model types and corresponding nonlinearities. Then, the DNN will be reformulated into a set of mixedinteger linear constraints and further incorporated into the unit commitment program. It is worth mentioning that this reformulation is exact if the rectified linear unit (ReLU) is employed as the activation function
[Anderson2019].The main contributions of this paper are concluded as follows: (1) We propose a generic deep learningbased framework for the FCUC. This framework is generic in the sense that it considers the full operation space and can handle all types of differentialalgebraic models as well as all possible postcontingency scenarios; (2) We identify the necessary features from the operation variables for training the predictors and derive the mixedinteger linear constraints to encode the feature vectors; (3) We propose a
regionofinterest active sampling method to select samples whose frequency nadirs are closer to the UFLS threshold without labeling to generate more informative dataset. We verify that such a method can substantially improve the training efficiency and FCUC accuracy. It is worth mentioning that our previous work [Zhang2020a] only samples over a narrow range of the operating conditions. Though it works for a microgrid, such a sampling strategy is not sufficient for the unit commitment (UC) problems under high renewable penetration, where the operating points can vary significantly from time to time. In this paper, the proposed framework can handle widerange operation conditions by improving sampling and training efficiency using the regionofinterest active sampling. In addition, the possible unstable trajectories have been taken into account and addressed using an extra stability predictor. Such possible instability has not been considered in the previous FCUC problem.The remainder of the paper is organized as follows. Section II describes the problem setup. Section III
describes the DNNbased predictors for frequency trajectory and stability with feature selections. Section
IV describes the data generation using the regionofinterest active sampling strategy. Section V express the FCUC formulation, followed by the case study in Section VI and conclusion in Section VII.Ii Problem Statement
The frequencyconstrained UC (FCUC) searches the optimal dispatch command such that the resulting system states ensure a secure frequency response for any disturbance from a predefined set occurring within the scheduling period . The FCUC reads as follows
(1a)  
(1b)  
(1c)  
(1d)  
(1e) 
where and denotes the equality and inequality constraints, respectively; and denotes the system states and dispatch variables, respectively. The load forecast is denoted by . The disturbance set is defined to be the generator loss. Almost, it is guaranteed that the largest generator outage will result in the worst frequency nadir. Thus, we consider the worstcase contingency in the FCUC problem to be the loss of a generator that has the largest power output, which changes with respect to the dispatch signal.
The term is the frequency nadir predictor once the worsecase contingency occurs under the current operating condition. Since we are considering the full power injection space, the network can be subjected to either voltage or rotor angle instability after the worstcase disturbance. The prediction performance can be significantly compromised if these data is incorporated into the training dataset. If not incorporated, however, the frequency nadir predictor may output incorrect values under these unencountered operating condition. The solution is to use a stability predictor that outputs a twodimensional realvalued vector to filter out the unstable cases as shown in (1d
). The first entry denotes the predicted probability of a case being unstable, and the second entry denotes the predicted probability of a case being stable. The subscript
denotes the th entry of the output vector.Iii Deep LearningBased Trajectory Constraint Approximation
Iiia DNNBased Frequency Nadir Predictor
Let this neural network frequency nadir predictor for be expressed as
(2) 
where denotes the feature vector, and and denote the neural network parameters to be trained. To fully represent all possible operating conditions, a sufficient sample set will be created^{1}^{1}1The subscripts and are essentially the same as they both denote different system operating scenarios. Following the convention, we use
in the machine learning problem and
in the UC problem.. Let denote the frequency nadir of sample . Now consider a fully connected neural network with hidden layers. Each layer uses a rectified linear unit (ReLU) activation function denoted as , and the output layer uses a linear activation function. The predicted nadir can be expressed as follows(3a)  
(3b)  
(3c)  
(3d) 
where matrix and vector for represent the set of weight and bias across all hidden layers, and and represent the set of weight and bias of the output layer. We would like to minimize the total mean squared error between the predicted output and the labeled outputs of all samples using the following
(4) 
There are two essential prerequisites: (1) making correct choice of input features among , , and ; (2) generating representative samples under different system operating conditions.
IiiA1 Feature Selection
The frequency response from a synchronous generator (SG) is less dependent on its output unless there is not enough headroom, which will be avoided in the UC problem. In addition, we assume the droop control of the turbine governor for all SGs will not change. Therefore, it is sufficient to select the status of each SG as the input features. Both the magnitude and location of the disturbance will have impacts on the frequency response. Based on our discussion regarding the disturbance in Section II, the disturbance magnitude can be expressed as
(5) 
We use the bus index to represent the location information as
(6) 
We encode both information into one vector as
(7) 
Then, the feature vector of sample reads as
(8) 
IiiB DNNBased Stability Predictor
Similar to the frequency nadir predictor, we construct a neural network to approximate the probabilistic stability descriptor . To do this, we denote the classes of unstable and stable as for
, respectively. The label data after onehot encoding reads
, where indicates that sample belongs to class. We then apply probabilistic smoothing approximations to the discrete label values. It is well known that when the targets are onehot encoded and an appropriate loss function is used, an neural network directly estimates the posterior probability of class membership
conditioned on the input variables , denoted by . Then, the stability predictor can be expressed as follows(9) 
where reads as follows
(10a)  
(10b)  
(10c)  
(10d) 
where represent the indices for all hidden layers, and is the output layer. The network parameters can be calculated using the maximum likelihood estimation. Therefore, we minimize the negative logarithm of the likelihood function, known as the crossentropy loss, as follows
(11) 
IiiB1 Feature Selection
The disturbance information is also essential for the stability predictor. The same encoding method in (5)(7) is used. Considering that only online SGs will respond to disturbances, their commitment status is also employed. The voltage stability is highly related to the loading condition of the system. Since we relax the generator Var power limits, the active power will have a dominant impact. The transient stability margin depends on the powerangle characteristics, which are determined by the active power injection. Thus, the active power injection of all SGs will be encoded into the feature vector. And the overall feature reads
(12) 
Iv Data Generation and RegionofInterest Active Sampling
Consider a power network with SGs, WTGs, and loads. Let , , and denote the set of SGs, WTGs and loads, respectively. The full dimension injection samples are
(13) 
where and are samples of active and reactive power injections for SG ; are samples of active power injections for WTG ; and are samples of load active and reactive power injections for load . Each sample in is denoted as ^{2}^{2}2The subscript denotes the th row of a matrix., and the corresponding nadir (label) is denoted as .
Unlike many works that sample around a narrow operation range, we consider the widerange space of all power injections in (13) to ensure reliability under vast ranges of operating conditions. In this case, not every sample admits a meaningful frequency nadir. In fact, many sampled power injections are not stable under the worstcase disturbance. Furthermore, what we are concerned about the most is the frequency nadir close to the underfrequnecy load shedding (UFLS) threshold. Under limited computation resources, it is reasonable to focus on improving the accuracy of frequency nadir prediction near the critical threshold. To train such a predictor, the distribution of the training dataset should contain denser samples where the frequency nadirs are closer to the UFLS threshold, as shown in Fig. 2.
To generate such a training set, traditional methods will first perform uniform sampling over all power injections, label all samples to retrieve the nadir information, and then select labeled samples based on desired distributions. However, the labeling procedure needs timedomain simulation with fullorder differentialalgebraic equations. Labeling adequate samples to achieve desired distributions requires significant computation resources and time, which is known as the labeling bottleneck. In other words, how to sample the power injections whose labels are within the regionofinterest without labeling efforts is essential.
The proposed active sampling method can proceed as follows. We first define the sample as secure if the corresponding frequency nadir is above the UFLS threshold and insecure otherwise, as illustrated in Fig. 2. Second, we train a frequency security discriminator using the active learning method. The discriminator will output the predicted probability of a sample being secure. Then, we employ the active learning method to train this discriminator. Compared with the traditional passive learning approaches, the active learning method will actively select unlabeled samples, query their labels, and then perform the training on these data instances [settles2009active]
. During this procedure, samples whose label values are closer to the UFLS are easier to be selected. The rationale lies in the fact that it is more difficult for the discriminator to classify samples that are closer to the decision boundary, which is the UFLS threshold in our case. Therefore, we need to select samples whose posterior probability provided by the discriminator is the closest to 0.5
[settles2009active]. In other words, the selected sample is the least confident to the discriminator.Let the frequency security discriminator be denoted as , the UFLS threshold be denoted as , the security label of a sample be denoted as , where if and otherwise. The details of the regionofinterest active sampling algorithm is illustrated in Algorithm 1. The overall strategy is introduced as follows

Define the frequency security discriminator ; Among all the samples , label a small portion denoted as for frequency security discriminator training

Initialize the set denoted as to store acquisition samples for frequency nadir predictor training

Train the discriminator using current labeled dataset

Perform the active sampling using the discriminator; The active sampling function reads as follows
(14) where is the predicted posterior probability of class membership (0 for insecure and 1 for secure) conditioned on the sample inputs.

Retrieve both nadir and security labels; Add the nadir label with the acquired sample to ; Add the security label with the acquired sample to ; Remove the sample from the unlabeled pool.

Repeat (4) and (5) to reach the batch number; Repeat (3)(5) for the entire process
It is worth mentioning that the ultimate goal of Algorithm 1 is to generate the dataset for training the frequency nadir predictor . By calculating the entropy of the security discriminator outputs and selecting the extreme value, the corresponding samples will be closer to the decision boundary, that is, the UFLS threshold.
V Frequency Constrained Unit Commitment
In this section, we will introduce the FCUC formulation. Consider an bus power network. Let , and map from the bus index to the device index that is connected to bus . Consider the scheduling period from 1 to . Each SG should be dispatched with the permissible limits
(15a)  
(15b)  
(15c)  
(15d) 
The dispatch changes should respect the rampup and rampdown limits
(16a)  
(16b)  
(16c) 
The commitment commands should meet the minimumup and down time constraints
(17a)  
(17b)  
(17c)  
(17d)  
(17e) 
And for , the minimumup and down time constraints read as follow
(18a)  
(18b)  
(18c)  
(18d) 
The linearized AC power flow equations and the voltage constraints are expressed as follows
(19a)  
(19b)  
(19c) 
where
(20a)  
(20b) 
To encode the DNN into the MILP, we will first build the feature vectors in terms of the decision variables in the UC problem. Then, we will express the trained neural networks as MILP models. As discussed before, we replace the sample index with the scheduling period index. The feature vectors for the frequency nadir and stability predictors defined in terms of the decision variables are expressed in (8) and (12), respectively. Unfortunately, contains the operator, and and cannot be directly used in the encoding formulation. Thus, we introduce the supplementary variables to indicate if generator outputs the largest active power in scheduling period . The reformulations are expressed as follows
(21a)  
(21b) 
where is a big number. Eq. (21a) enforces to be zero if the output of any generators is greater than during period . Eq. (21b) ensures that there should be only one largest generator during . The combination of both constraints will ensure generator has the largest output if . To further encode the magnitude information, we define another supplementary variables and let if , and otherwise. The corresponding constraints are expressed as follows
(22a)  
(22b)  
(22c) 
Finally, the feature vector for frequency nadir prediction can be rewritten as follows
(23) 
And the feature vector for stability prediction can be rewritten as follows
(24) 
Then, we can express the trained frequency nadir predictor in (3) and stability predictor in (10) as MILP models to encode the predictions of decision variables and indirectly impose frequency security and stability constraints on decision variables. Since most parts of the model are similar, we use the superscript to denote the types of the neural networks, where . The MILP formulations read as follows
(25a)  
(25b)  
(25c)  
(25d)  
(25e)  
(25f)  
(25g)  
(25h)  
(25i) 
As shown, except for the final layers in (25h) and (25i), the rest parts of the neural networks admit the same reformulations for both and . A binary vector for represents the activation status of ReLU at th hidden layer, and represents the status of the th neuron at the th layer. Let be an interval that is large enough to contain all possible values of , where and . When is less than or equal to zero, constraints (25c) and (25f) will force to be zero. In this case, constraints (25e) and (25f) imply that , so we have . When is greater than zero, constraints (25c) and (25f) will force to be 1. In this case, constraints (25c) and (25d) imply that , so we have . Obviously, this formulation contains no approximation of the original model. In addition, this is the tightest possible formulation with respect to its LP relaxation if no future information about is revealed [Anderson2019].
Finally, we impose the frequency security constraint and stability constraint on the predictions
(26a)  
(26b) 
where is the supplementary variable to relax the predicted stability condition. Since only the steadystate information is used for dynamic stability prediction, the accuracy is relatively low. Therefore, strictly enforcing the predicted stability condition requires significant searches for feasible solutions and is not computationally efficient. Essentially, the stability predictor tries to filter out as many samples as possible that are not encountered by the frequency nadir predictor. The relaxed formulation can sufficiently minimize this risk.
The twosegment cost function, consisting of the fixed and marginal costs, is employed. The scheduling objective is to minimize the total operational cost, expressed as follows
(27a)  
(27b)  
(27c) 
where denotes the incremental outputs of SG , and denote the fixed and marginal costs, respectively. Terms (27a) and (27b) represent the fuel and startup costs of SGs, respectively. Terms (27c) represents the stability risk with a weighting factor . It is worth noting that reformulation of the startup cost has already been carried out in (27b), where
is the slack binary variable. In addition,
and are subject to the following constraints(28) 
Vi Case Study
We use the IEEE 39bus system to demonstrate the framework. The PSS/E software has been widely used in the power industry for dynamic security assessment in daily operations, and thus is chosen for labeling the samples. We use fullscale models for the dynamic simulation during the labeling process: GENTPJU1 for the synchronous machine; IEEEX1 for the excitation system; IEESGO for the turbinegovernor; PSS2A for the power system stabilizer. We assume there are four WTGs installed at Buses 2, 10, 20, 25, respectively. Each WTG is an equivalent representation of 600 1.5MW realistic Type3 WTGs. Standard WTG and corresponding control modules are employed: WT3G2 for the doublyfed induction machine, WT3E1 for the electric control, WT3T1 for the mechanical model, WT3P1 for the pitch control. Details of these models can be found in [siemens2009pss]. The forecast aggregated load and wind power are plotted in Fig. 3. The aggregated load is distributed to each bus based on the ratio of demand in the power flow data.
Via RegionofInterest Active Sampling and DNN Training
We first generate 5000 power injection samples as described in (13
) with negligible computation efforts. For generators, the active power injections are uniformly sampled over the lower and upper bounds. We assume SGs have adequate reactive power capacity, and WTGs are controlled with unity power factor. Therefore, we do not need to sample the reactive power of all generators. For loads, both active and reactive power are sampled based on the Gaussian distribution. The means are equal to the demand values in the power flow data, and deviations are equal to 30% of their nominal values.
With all these unlabeled samples, we employ the activelyselectthenlabel strategy described in Section IV to generate the training dataset. For comparison, another dataset is generated using the randomlyselectthenlabel procedure. In each iteration, 100 samples will be selected for PSS/E to label. We divide the nadir values of frequency deviation into four intervals: , , , . The regions of interest are and since they are closer to the UFLS threshold. The heatmap of the selected samples with both active and random methods are shown in Fig. 4. The number in each box represents how many samples whose label values are within the interval are selected in this iteration. As shown, the active method selects more samples within the region of interest. In addition, if we use random sampling, more than 50% of the samples are unstable and not available for frequency nadir predictor training. This number will reduce to around 25% if the active sampling strategy is employed, and thus greatly improve the labeling efficiency.
We use activelyselectthenlabel and randomlyselectthenlabel strategies to train two frequency nadir predictor, respectively. For the validation, we first generate another 5000 randomly selected samples. Then, we label the validation dataset and select those with label values greater than 1.2. The following metrics are used to demonstrate the prediction accuracy: (1) maximum error (MAXE), (2) median absolute error (MEDE), (3) mean absolute error (MEAE), and (4) score, which is defined as follows
(29) 
score gives some information about the goodness of model fitting. In regression, the coefficient of determination is a statistical measure of how well the regression predictions approximate the real data points. An of 1 indicates that the regression predictions perfectly fit the data. The validation results of the predictors trained using actively and randomly selected samples are shown in Table I. This result verifies the data generation strategy in Section IV, that is, a training dataset with the distribution shown in Fig. 2 and lower figure in Fig. 4 will improve the accuracy.
Case  MAXE  MEDE  MEAE  

Active  0.3539  0.0225  0.0281  0.9352 
Random  0.4511  0.0298  0.0376  0.8782 
ViB FCUC Results
The voltages are allowed to variate between 0.8 and 1.2 p.u. The unit commitment results of ordinary formulations (Eqs. (15)(19)) and (27) are shown in Fig. 5. While, the results of FCUC formulations (Eqs. (15)(27)) are shown in Fig. 6, where the frequency security limit is set to be 59 Hz. A stability predictor with 78% accuracy is trained and incorporated. As we can see, FCUC formulation enforces more SGs to be committed such that adequate responsive active and reactive power is online to ensure both stability and dynamic frequency security. The FCUC also makes the outputs of different SGs to be more equalized to reduce the risk of stability and security under the worstcase contingency. Particularly, the system under ordinary dispatch strategy is vulnerable to the generator outage event during nights. These are periods when the wind power is abundant, and demand is deficient, resulting in less committed SGs. For example, in the scheduling periods 23 and 24, ordinary UC only schedules four SGs shown in Fig. 5. In contrast, FCUC formulation commits seven SGs illustrated in Fig. 6.
We assume the worstcase contingency takes place in the period 24, and the generator outputting the largest power is tripped. The system under the ordinary UC schedules does not have a power flow solution after the disturbance. While, the system frequency responses following the FCUC dispatch are shown in Fig. 7, which satisfy the frequency trajectory constraint. The voltage dynamics and active power outputs of generators are plotted in Fig. 8 (a) and (b), respectively. Fig. 8 (a) indicates that the FCUC dispatch also satisfy the voltage limits. Fig. 8 (b) illustrates the generator tripping and responses from other generators except for the WTGs.
The systematic verification using PSS/E is shown in Table II. The first column is the predicted deviated frequency nadir by the FCUC. The second column is the corresponding PSS/E verification. The third column is the PSS/E simulation of the ordinary UC dispatch. The term ”NA” denotes instability cases. As we can see, ordinary UC dispatch is severely insecure as most dispatched conditions cannot withstand the worstcase contingency. Meanwhile, the FCUC significantly improve the stability and security with satisfactory prediction accuracy.
Pred.  FCUC  UC  Pred.  FCUC  UC  

0.75  0.61  NA  0.99  0.99  NA  
0.90  0.84  NA  1.00  0.98  NA  
1.00  0.72  NA  1.00  1.04  NA  
1.00  0.93  NA  1.00  0.99  NA  
0.94  0.91  1.66  0.99  0.99  NA  
0.98  NA  NA  1.00  0.98  NA  
0.98  NA  NA  1.00  0.98  NA  
0.99  0.96  NA  0.98  0.99  NA  
1.00  0.99  NA  1.00  1.00  NA  
0.89  0.88  NA  0.89  0.87  NA  
1.00  1.02  NA  1.00  0.75  NA  
0.98  1.01  NA  1.00  0.76  NA 
Vii Conclusions
The UC formulation without frequency response constraints will lead to insecure and unstable dynamic response under contingencies when the renewable penetration becomes high. The existing FCUC methods are either incapable to consider realistic models and corresponding stability or extremely difficult to solve. The paper presents a generic deep learningbased framework for FCUC, where two neural network models are constructed to represent the frequency nadir and stability characteristics given a specific operating condition and a worstcase contingency. The proposed regionofinterest active sampling is effective in selecting power injection samples whose frequency nadirs are closer to the UFLS threshold. Such a dataset later proves to be beneficial for training accuracy improvement. The FCUC formulation can effectively schedule frequency secure generator commitments, which is verified using the PSS/E simulation.
Comments
There are no comments yet.