1 Introduction
Renewable energy sources are by now an essential contributor to the electrical power grid [fraunhofer2018windenergie, Lowery2013]. Integrating these power plants introduces a large share of volatile generation. To maintain a stable power grid, grid operators need realistic information about the effects of energy production and consumption in order to assess grid stability [Lowery2013, Jens2018]. Operational scenario planning [Sovan2008, Lowery2013] is therefore essential to evaluate the integration of renewables.
Traditionally, generative approaches such as stochastic programming, copula methods, or Monte Carlo approaches are used to simulate the stochastic and intermittent nature of renewable power generation [Hart2011, Becker2018, Kaut2003Evaluation]. Often, these techniques model either the temporal or the spatial relationship of renewable energy sources, but not both. More recently, research in scenario planning has shown that generative adversarial networks (GANs) can model the temporal and the spatial relationship of, e.g., wind and photovoltaic (PV) farms simultaneously [Chen2018Bay, Chen2018ModelFree, Chen2018Unsup].
Utilizing GANs for scenario planning is especially interesting because, once the GAN is trained, it can simulate a large number of realistic power samples. Further, the training, see Figure 1, captures the spatial relation (between different farm locations) as well as the temporal relation (between the simulated hours) by using historical data. However, because this application is still new in the field, there are few studies on how well GANs can model the underlying power distribution, especially with limited training data. This analysis is essential because, e.g., extreme situations with low or high power generation are required to evaluate grid stability.
Besides evaluating the generated power distribution over all farms, it is essential for wind power scenarios to assess terrain-specific distributions, which are caused by location-specific weather conditions. Often, the average power generation and the density associated with individual power values differ between terrains [Pinson2006, jens2019]. To evaluate these effects on the grid in various terrains, an analysis of whether GANs can generate those distributions is essential.
Therefore, the main contributions can be summarized as follows (implementation details of the evaluation, the experiments, and the training are available at https://git.ies.unikassel.de/scenario_gan/scenario_gan_wind_pv):

We provide a comparative study of two different loss functions (binary cross-entropy loss and Wasserstein distance) on two solar and two wind datasets (with limited historical data for training) to evaluate the underlying power distributions through the Kullback-Leibler divergence (KLD).

Results show that the Wasserstein distance is superior to the binary cross-entropy and a Gaussian copula (GC) baseline, even when faced with limited data compared to previous studies.

A study of location-specific influences and weather conditions (that affect the power distribution) shows that GANs learn those specifics even when only four offshore parks are present in the dataset.
The remainder of this article is structured as follows. In Section 2, we give an overview of related work. Section 3 describes two types of loss functions and details the evaluation measures. We continue by describing the experiments and results in Section 4. Section 5 summarizes the article and provides an outlook on future work.
2 Related Work
The evaluation of grid stability through operational scenario planning is an essential research topic for integrating volatile renewable energy resources. By creating realistic realizations of stochastic processes in the field of renewable energy, we can analyze their potential impact in real-world scenarios [Antonio2010]. Often, scenarios in the field of renewable energy are generated with techniques from stochastic programming, copula methods, or Monte Carlo approaches. In the following, we give a brief overview of these techniques, followed by a summary of work that utilizes GANs for simulating scenarios.
As early as 2003, [Kaut2003Evaluation] developed evaluation methods and algorithms for using stochastic programming in scenario simulations. Further, the authors present a stochastic programming method for portfolio management. Various probabilistic prediction methods are also used to simulate scenarios. These have the advantage that they already model the distribution of power. Besides, they allow modeling temporal relationships as given by the prediction models. This modeling enables the authors in [Pinson2009] to provide a method for converting probabilistic predictions into multivariate Gaussian random variables. In particular, it focuses on the simulation of scenarios that model the interdependent temporal effects of prediction errors. It is also possible to create scenarios based on probabilistic predictions [Iversen2016]. For the evaluation of such scenarios, [Pinson2012] defines evaluation criteria, such as the energy score, and gives recommendations. However, the implementation of the score is tedious and error-prone.
The authors in [Becker2018] use a copula approach to model temporal effects on forecasts with distinct forecast horizons, which allows distinguishing between the uncertainty in wind power forecasts and temporal dependencies. The presented method outperforms all other approaches in their experiments. Recent work in [TaoWang2016] proposes a method that models the spatial dependence between renewable energy resources. To this end, the implemented prototype uses Latin hypercube sampling and copula methods and is tested on actual wind power measurements and power forecasts.
A comprehensive study on real-world data in [Rachunok2018] shows the trade-off between computational complexity and the quality of simulated scenarios when using Monte Carlo techniques. Another Monte Carlo approach [Hart2011] is used to study a planning tool that takes various renewable resources from different locations into account. Further, the authors consider temporal effects in simulations for load scenarios.
Most of the previous literature considers either temporal or spatial effects. Recently, however, GANs have been used to simulate wind and solar scenarios that take both spatial and temporal relations into account [Chen2018ModelFree]. The authors also show how to create scenarios with wind ramp events by utilizing conditional GANs. [Chen2018Unsup] shows that GANs are capable of simulating scenarios conditioned on a previous forecast. In [Chen2018Bay], Bayesian GANs create realistic scenarios for wind and PV simultaneously. In a sense, the approach in [Chen2018Bay] is similar to ours, as we show the capability of GANs to simulate parks of different terrains together.
The literature review shows that most of the work focuses on either the temporal or the spatial evaluation. Further, a comparison between the historical data and the generated data distribution is not provided using known measures such as the KLD. Besides, none of the articles presents an analysis of whether it is possible to create terrain-specific power distributions when simultaneously simulating power distributions of numerous wind farms. Further, previous studies have a large amount of data, e.g., [Chen2018ModelFree] uses measurements, compared to our datasets with a maximum of historical power measurements as detailed in Section 4.1.
3 Methodology
After giving a short introduction to the applied GANs, we detail methods to compare the simulated power distribution with the distribution of the historical data.
3.1 Generative Adversarial Networks
GANs consist of two different neural networks [goodfellow2014generative]: the discriminator and the generator. As shown in Figure 1, the generator takes random values and produces fake samples to imitate the distribution of a real dataset. This imitation enables us to make use of spatial and temporal relations already present in historical data. The discriminator, on the other hand, takes real and fake samples as its input and tries to distinguish between them. During training, the quality of the generated data, as well as the classification accuracy of the discriminator, should increase. The improvement depends on the loss functions used. After training, the generator produces examples from the distribution of the original data. The discriminator, in turn, can detect novelties and outliers in the data [Zenati2018]. Often, GANs employ the Wasserstein distance [arjovsky2017wasserstein] or the binary cross-entropy (BCE) [radford2015unsupervised] as the loss function. Later on, we refer to the network trained with the BCE loss as deep convolutional GAN (DCGAN) and to the network trained with the Wasserstein distance as deep convolutional Wasserstein GAN (DCWGAN).
The BCE is defined as follows:
where stands for the label if the data is real or generated, and
for the probability that the discriminator assigns (given by the sigmoid function at the final layer). A zero label means that the data is classified as generated, while a one corresponds to the real data.
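For reference, the BCE over a batch of samples can be written in its standard form as below. The symbols are notational assumptions made for illustration: $y_i$ denotes the label of sample $i$, $p_i$ the probability assigned by the discriminator, and $N$ the number of samples in the batch.

```latex
\mathcal{L}_{\mathrm{BCE}} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right]
```

With $y_i = 1$ for real data and $y_i = 0$ for generated data, only one of the two terms is active per sample.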
The Wasserstein distance [arjovsky2017wasserstein] is a measure used to compare two distributions. It is also referred to as the earth mover's distance and indicates the effort required to transform one probability distribution into another. It is defined as follows
(1) 
where and are the distributions in the range between and . Since there is not only one possible solution to convert one distribution into another, the solution chosen for this loss is the one with the least effort, which corresponds to the infimum (inf) in Equation 1.
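In the notation commonly used in [arjovsky2017wasserstein], the earth mover's distance can be written as below. The symbols are assumptions for illustration: $P_r$ and $P_g$ denote the real and the generated distribution, and $\Pi(P_r, P_g)$ the set of all joint distributions whose marginals are $P_r$ and $P_g$.

```latex
W(P_r, P_g) = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x, y) \sim \gamma} \left[ \lVert x - y \rVert \right]
```

Each $\gamma$ can be read as one transport plan that moves probability mass from $P_r$ to $P_g$; the infimum selects the plan with the least expected effort.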
3.2 Kernel Density Estimation
Kernel density estimation (KDE) is a statistical method to estimate the distribution of a given dataset. In the KDE algorithm, superimposing several Gaussian distributions, one centered on each observation, yields an estimate of the probability density function (PDF) of the dataset. Applying KDE to the historically measured and the generated power data allows comparing them with each other, e.g., by employing the Kullback-Leibler divergence.
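The superposition of Gaussians can be illustrated with a minimal, dependency-free sketch. The power values and the bandwidth of 0.05 are hypothetical; in practice a library implementation such as the one in scikit-learn would typically be used.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the PDF of `samples` with Gaussian kernels."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def pdf(x):
        # Superimpose one Gaussian per observation, centered on that observation.
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return pdf


# Hypothetical normalized power values in [0, 1].
power = [0.05, 0.10, 0.12, 0.40, 0.45, 0.90]
pdf = gaussian_kde(power, bandwidth=0.05)

# Evaluate the density on a regular grid; a valid PDF integrates to roughly one.
step = 0.001
grid = [i * step for i in range(-500, 1501)]
area = sum(pdf(x) for x in grid) * step
```

A smaller bandwidth follows the individual observations more closely, while a larger one produces a smoother estimate.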
3.3 Kullback-Leibler Divergence
The KLD is a non-symmetric statistical measure of the difference between two distributions. Later on, we use the KLD to quantify the similarity between the generated and historical data through a KDE. It is defined as
(2) 
with , as the distributions and , as their PDFs. Due to the non-symmetric behavior, both and are calculated and added together. One interpretation of the KLD is as information gain achieved by replacing distribution with .
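In standard notation, the KLD for continuous distributions and the symmetric sum described above can be written as below. The symbols are assumptions for illustration: $P$ and $Q$ denote the two distributions, $p$ and $q$ their PDFs.

```latex
D_{\mathrm{KL}}(P \,\|\, Q) = \int p(x) \log \frac{p(x)}{q(x)} \, dx,
\qquad
D_{\mathrm{sym}}(P, Q) = D_{\mathrm{KL}}(P \,\|\, Q) + D_{\mathrm{KL}}(Q \,\|\, P)
```

The sum is zero only if both densities agree almost everywhere and grows as the generated distribution departs from the historical one.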
4 Experimental Setup and Evaluation
This section presents the experimental setup and the evaluation results. First, we detail the different datasets and explain the preprocessing of the data. Further, we describe the architectural setup of the evaluated DCGAN and DCWGAN. Afterward, we evaluate these GANs against a GC baseline. In particular, we evaluate the generated samples regarding their temporal and spatial correlation, their generated distribution, and the creation of high- and low-stress power profiles. In the final study on the GermanWindFarm2017 dataset, we assess how different terrains and their location-specific wind conditions (that affect the power distribution) are modeled by the DCWGAN when trained simultaneously.
4.1 Data
The EuropeWindFarm2015 and GermanSolarFarm2015 datasets can be obtained via our website (https://www.ies.unikassel.de). We further use the GermanWindFarm2017 and GermanSolarFarm2017 datasets, which are not publicly available. However, the GermanWindFarm2017 dataset in particular allows us to get additional insights into the power distribution with respect to terrain-specific conditions. Together, these datasets are quite diverse and cover a broad spectrum of power distributions from wind as well as solar problems.
Compared to previous studies on GANs for renewable power generation, see, e.g., [Chen2018Unsup, Chen2018ModelFree] with a total of measurements and a five-minute resolution, we only have a limited amount of data. The largest of our datasets has power measurements, as detailed in Table 1. The solar datasets have a three-hourly resolution totaling in measurements per day. Wind datasets have an hourly resolution with power measurements per day.
To discover relations within the data, we aim at making spatial and temporal relationships available in each training sample. Therefore, we reshape the data to obtain a shaped matrix for each day (sample), where refers to the number of parks and refers to the time steps within the horizon. This matrix is obtained by first creating a list of samples for each farm with its respective time steps (horizon) and afterward combining all individual time steps of all farms. Finally, the reshaping allows the utilized convolutional layers, see Section 4.2, to make use of their receptive field and discover relations within the data, either temporal or spatial. Accordingly, the number of samples in Table 1 refers to the number of matrices with shape .
Name  #Parks  Resolution  Horizon  #Samples  #Measurements 

EuropeWindFarm2015  h  time steps  540  
GermanSolarFarm2015  h  time steps  760  
GermanWindFarm2017  h  time steps  426  
GermanSolarFarm2017  h  time steps  483 
After normalizing and reshaping the data, we randomly select of the data for training and the remaining historical data for testing.
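The reshaping into one matrix per day and the subsequent random split can be sketched as follows. This is a minimal illustration on toy data, not the published preprocessing: the function names, the 24 hourly steps per day, and the training fraction of 0.75 are assumptions chosen for the example.

```python
import random


def to_daily_samples(series_per_park, steps_per_day):
    """Turn per-park power series into one (parks x time steps) matrix per day."""
    n_days = min(len(series) for series in series_per_park) // steps_per_day
    samples = []
    for day in range(n_days):
        start = day * steps_per_day
        # One row per park, one column per time step of this day.
        samples.append([series[start:start + steps_per_day] for series in series_per_park])
    return samples


def train_test_split(samples, train_fraction, seed=0):
    """Randomly split the daily samples; `train_fraction` is an assumed parameter."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical toy data: 3 parks, 4 days of hourly values already normalized to [0, 1].
rng = random.Random(42)
parks = [[rng.random() for _ in range(4 * 24)] for _ in range(3)]

samples = to_daily_samples(parks, steps_per_day=24)
train, test = train_test_split(samples, train_fraction=0.75)
```

Keeping all parks of one day in the same matrix is what exposes both the spatial (rows) and the temporal (columns) structure to the convolutional layers.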
4.2 GAN Training
To discover relations within the data, the applied GANs are designed to make use of the receptive field of convolutional networks. Therefore, the generator utilizes convolutional layers to create samples of the form subsequently. Depending on the dataset, the generator's parameters (kernel size, stride, and padding) are selected to fulfill this requirement, as detailed in Table 2. Varying stride and padding allows us to almost consistently apply a kernel size of while achieving a receptive field sufficient to cover the complete matrix.
Dataset name  Kernel size  Stride  Padding 

EuropeWindFarm2015  
GermanSolarFarm2015  
GermanWindFarm2017  
GermanSolarFarm2017 
The discriminator's parameters are the reverse of the generator's. This (reverse) parameter setup makes the best use of the joint training and the receptive field of the convolutional layers: the discriminator is capable of detecting missing relations in the generated data, while the generator is capable of creating them.
In the following, we evaluate two GANs trained with the Wasserstein distance (DCWGAN) and the BCE loss (DCGAN). We apply batch normalization inside the discriminator and the generator. As the activation function, we use leaky rectified linear units (ReLU). The GANs are trained for epochs with a learning rate of and a batch size of .
4.3 Study on EuropeWindFarm and GermanSolarFarm Dataset
In this section, we highlight the results of the comparative study of the DCGAN, DCWGAN, and a GC [Nelsen2006] as the baseline.
4.3.1 Evaluation through KLD:
To evaluate the generated distributions, we apply a KDE to the test dataset and the samples created by the models. The KDE uses a Gaussian kernel, a Euclidean distance [scikitlearn], and a bandwidth of to reflect of the normalized power. The PDFs from the KDE algorithm are used to examine the similarity between the distributions of real and generated data using the KLD. Table 3 summarizes this comparison of generated samples and historical power data.
Location  KLD GC  KLD DCGAN  KLD DCWGAN 

EuropeWindFarm2015  
GermanSolarFarm2015  
GermanWindFarm2017  
GermanSolarFarm2017 
Results for all datasets show that the DCWGAN is superior to the GC baseline. The DCGAN has worse results than the GC for all datasets except GermanSolarFarm2015. The DCWGAN creates samples with smaller, or at least similarly low, KLDs compared to the DCGAN and the GC, showing its excellent performance. Figure 6 provides a representative example of those distributions for the GermanWindFarm2017 dataset.
Interestingly, despite the limited amount of training data compared to other studies, the KLD results suggest that the generated samples reflect the real-world distribution. These positive results are potentially due to the combined training and the parameters selected to make the best use of the receptive field.
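The comparison of two estimated PDFs through the symmetric KLD can be sketched as follows. The Gaussian densities are hypothetical stand-ins for the KDE outputs of historical and generated data, and the small constant `eps` is an assumption added only to avoid division by zero.

```python
import math


def kld(p, q, dx, eps=1e-12):
    """Discretized D_KL(P || Q) for densities p and q sampled on a grid with spacing dx."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q)) * dx


def symmetric_kld(p, q, dx):
    """Sum of both directions, since the KLD itself is not symmetric."""
    return kld(p, q, dx) + kld(q, p, dx)


def normal_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical stand-ins for the densities obtained from the KDE.
dx = 0.001
grid = [i * dx for i in range(-1000, 2001)]
historical = [normal_pdf(x, 0.40, 0.10) for x in grid]
close_model = [normal_pdf(x, 0.42, 0.10) for x in grid]
far_model = [normal_pdf(x, 0.60, 0.10) for x in grid]

score_close = symmetric_kld(historical, close_model, dx)
score_far = symmetric_kld(historical, far_model, dx)
```

For two Gaussians with equal standard deviation, the symmetric KLD equals the squared mean difference divided by the variance, so the model whose mean is closer to the historical one receives the smaller score.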
4.3.2 Evaluation of Temporal and Spatial Relation:
Besides creating data with a similar distribution, it is essential to examine the spatial and temporal relationships at the same time. As the KLD results suggest the superior performance of the DCWGAN over the DCGAN and the GC baseline, we limit the following discussion to the DCWGAN. Nonetheless, note that the results of the DCGAN and the GC are reasonable but outperformed by the DCWGAN. To restrict the discussion to relevant and non-repetitive results, we limit the following analysis to two representative examples that provide details due to their increased data availability. For example, the wind datasets provide more details about the temporal relation, as these include more time steps than the solar datasets.
Figure 2 shows typical results using Pearson's correlation matrix to calculate temporal relations for the generated hours [Chen2018Unsup]. In these results for the EuropeWindFarm2015 dataset, we observe that, in the real-world data as well as in the generated samples from the DCWGAN, the power values have a higher Pearson coefficient for hours related to each other.
Figure 3 shows exemplary results using Pearson's correlation matrix to measure spatial relations between different farms [Chen2018Bay, Chen2018ModelFree]. In these results for the GermanSolarFarm2017 dataset, we observe that similar spatial relations, measured by the Pearson coefficient, are present in the historical data as well as in the generated samples. In most cases, a high correlation is present, and the DCWGAN captures almost all of those spatial relationships compared to the heatmap from historical data. In some examples, the DCWGAN creates samples with a more substantial spatial relation to each other than is present in the historical data.
Interestingly, for some farms, there is a rather modest spatial relation in the historical as well as the generated data. This relation is unlikely in the case of small regions (such as Germany) and can be caused by maintenance problems, shadowing effects, or other problems in the data. As the presented results are representative of all datasets, the above results show that the DCWGAN is capable of reconstructing the historical power distribution, the spatial relation, and the temporal relation for all datasets, even for a varying number of farms, resolutions, and historical power measurements for training.
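The temporal and spatial correlation matrices discussed above can be computed along the following lines. This is a sketch on toy data: correlating time steps per park across days and correlating parks over concatenated days are assumptions about how the matrices are assembled, and all function names are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_matrix(columns):
    """Pairwise Pearson coefficients between all given sequences."""
    return [[pearson(a, b) for b in columns] for a in columns]


def temporal_correlation(samples, park):
    """Correlate the time steps of one park across all daily samples."""
    steps = len(samples[0][park])
    columns = [[day[park][t] for day in samples] for t in range(steps)]
    return correlation_matrix(columns)


def spatial_correlation(samples):
    """Correlate parks by concatenating their values over all daily samples."""
    n_parks = len(samples[0])
    columns = [[v for day in samples for v in day[p]] for p in range(n_parks)]
    return correlation_matrix(columns)


# Hypothetical toy samples: 3 days, each a (2 parks x 3 time steps) matrix.
samples = [
    [[0.1, 0.2, 0.4], [0.2, 0.3, 0.5]],
    [[0.5, 0.6, 0.3], [0.6, 0.7, 0.4]],
    [[0.9, 0.7, 0.8], [1.0, 0.8, 0.9]],
]
temporal = temporal_correlation(samples, park=0)
spatial = spatial_correlation(samples)
```

Computing the same matrices once for historical and once for generated samples gives the two heatmaps that are compared visually.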
4.3.3 Evaluation of Generated Power Profile:
To assess grid stability, the creation of different stress situations is essential. A typical stress scenario involves large power generation over a long period, as this causes the maximum thermal load on the grid elements and is therefore relevant for selecting the correct technical characteristics of those elements.
The following analysis gives insights into the amount of stress by calculating the integral of the power over the generated time horizon. E.g., for the wind datasets, for each wind farm power values are generated corresponding to hours. As the data is normalized by the maximum power, the maximum value of the integral is for a single farm of a sample. The maximum value for the solar datasets is about because no power is generated at night.
Figures 4 and 5 provide examples of this analysis for the EuropeWindFarm2015 and the GermanSolarFarm2015 datasets, summarized by histograms. Both results show that the generated stress level is similar to that of the historical data. However, due to the small amount of historical data with high-stress situations, the generated samples contain mostly values below a value of for wind and about for solar. The latter result suggests that there is only a small number of days with intense solar radiation throughout the whole day. The former relates to the fact that wind farms are typically ramped down when working at a maximum level over a long period.
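The stress measure described in this subsection, the integral of normalized power over the horizon summarized in a histogram, can be sketched as follows. The rectangle rule, the bin layout, and the toy sample with four time steps are assumptions made for illustration.

```python
def stress_per_farm(sample, dt=1.0):
    """Integrate normalized power over the horizon for every farm of one sample."""
    # Rectangle rule: each value is assumed to hold for one time step of length dt.
    return [sum(row) * dt for row in sample]


def histogram(values, n_bins, upper):
    """Count values in equally wide bins covering [0, upper]."""
    counts = [0] * n_bins
    for v in values:
        # Values equal to `upper` fall into the last bin.
        idx = min(int(v / upper * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical sample: 2 farms, 4 time steps, power normalized to [0, 1].
sample = [
    [1.0, 1.0, 1.0, 1.0],    # farm at maximum power for the whole horizon
    [0.0, 0.25, 0.5, 0.25],  # farm with a short production peak
]
stress = stress_per_farm(sample)
counts = histogram(stress, n_bins=4, upper=4.0)
```

Because the power is normalized, the integral of a farm running at maximum output equals the number of time steps, which gives the upper end of the histogram.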
4.4 Study on Location-Specific Distribution Generation
The following study reveals how different terrains and their location-specific wind conditions (that affect the power distribution) are modeled by the GANs and the GC when trained simultaneously on different terrains. The evaluation is similar to the previous section but omits the analysis of spatial and temporal relationships, as the results are identical to the previous study.
In Table 4, we compare the distributions of the generated samples for all models. We calculate the KLDs between the distributions of all real samples and all generated samples. By grouping farms by their terrain, we estimate the location-specific KLD. The results show that both GANs create a distribution similar to the historical data (also compare Figure 6). For all three terrains, the DCWGAN achieves the smallest KLD (0.037 for flatland, 0.018 for forest, and 0.046 for offshore farms). Again, the GC achieves smaller KLD values than the DCGAN but larger values than the DCWGAN.
Location  KLD GC  KLD DCGAN  KLD DCWGAN  #Farms 

Flatland  0.143  0.194  0.037  32 
Forest  0.085  0.266  0.018  10 
Offshore  0.148  0.304  0.046  4 
In Figure 6, it can be seen that for each terrain and GAN the PDFs are similar to the historical data. For simplicity, we omit the presentation of the GC as we are interested in evaluating GANs for renewable power generation.
The following analysis refers to values from historical data. Both GANs create similar values. However, the results of the DCWGAN are closer to the test dataset. For all terrains, a higher density occurs in the low yield range. The density decreases further for increased yields but rises again slightly in the range of maximum power. Wind farms on flatland have a mean at . Wind farms near forests have an increased mean at in the historical data, resulting in a higher total yield. For offshore wind farms, the average power generation is , with a higher share in the range of maximum power. Similar differences between the terrains are present in the variance and skewness. Flatland has the largest skewness value, forest the second largest, and offshore the smallest one. The order is the opposite for the magnitude of the variance values.
Results of the study confirm that the GANs are capable of modeling the terrain-specific power distributions caused by site-specific wind conditions, even for a limited amount of data as for the offshore and the forest terrains. In particular, the GANs create similar PDFs specific to those terrains.
4.5 Discussion
Interestingly, even when training GANs on, e.g., only solar farms and a widely varying amount of capacities, the results of the KLD show that the generated power distribution is similar to historical data. The representative histograms in Figure 6 confirm those similarities.
Also, in the study of terrain-specific distributions, both GANs learn the individualities of each terrain. Statistical values such as mean, variance, and skewness are closer to the historical values in samples from the DCWGAN. Notably, in the generated data for the offshore terrain, the large density in the range of maximum power is also captured by both GANs. This effect might be due to the simultaneous learning (similar to a multi-task approach) of the different farms, which allows capturing small individualities of each farm and its site-specific conditions.
A critical remark must be made concerning seasonal effects, which are challenging to consider due to the limited amount of data. Another problem is related to the number of high-stress situations. Due to their limited occurrence in the historical data, the chance that the GANs create them is also low. However, they can be created through repeated sampling and rejecting those samples below a certain threshold of the integrated power value.
Overall, the DCWGAN is superior in modeling the spatial and temporal relations as well as power distributions even when faced with limited data compared to previous studies.
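The repeated sampling with rejection mentioned in the discussion can be sketched as follows. The uniform-noise `toy_generator` is a hypothetical stand-in for a trained generator, and the threshold, the sample shape, and summing the integral over all farms of a sample are assumptions made for illustration.

```python
import random


def sample_high_stress(generate, threshold, n_wanted, max_tries=100_000):
    """Keep only generated samples whose integrated power reaches `threshold`."""
    accepted = []
    for _ in range(max_tries):
        if len(accepted) == n_wanted:
            break
        sample = generate()
        # Integrated power over all farms and time steps of the sample.
        stress = sum(sum(row) for row in sample)
        if stress >= threshold:
            accepted.append(sample)
    return accepted


# Hypothetical stand-in for a trained generator: 2 farms x 6 time steps of uniform noise.
rng = random.Random(0)


def toy_generator():
    return [[rng.random() for _ in range(6)] for _ in range(2)]


high_stress = sample_high_stress(toy_generator, threshold=8.0, n_wanted=5)
```

The rarer the high-stress situations are in the training data, the more draws are rejected, so the cost of this approach grows with the chosen threshold.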
5 Conclusion and Future Work
In this article, we compared the binary cross-entropy (BCE) loss and the Wasserstein distance for the training of GANs on four different datasets. Results show the superior quality of the Wasserstein distance over the BCE loss and a GC baseline in generating the power distribution when taking spatial and temporal relationships and the KLD into account. The publicly available source code, the datasets, and the results provide a basis for comparison when utilizing GANs in the scope of renewable scenario generation. Ultimately, we confirmed that GANs are capable of modeling different power distributions, including external influences such as terrains, even when faced with a limited amount of data compared to previous studies.
A future goal is to utilize GANs to impute missing values or create samples for unknown farms by creating a GAN conditioned on weather events and previous power values. The latter case also allows using the samples in the field of transfer learning.
Acknowledgment: This work was supported within the project Prophesy (0324104A) funded by BMWi (Deutsches Bundesministerium für Wirtschaft und Energie / German Federal Ministry for Economic Affairs and Energy).
Additionally, special thanks to Maarten Bieshaar for excellent discussions about Gaussian copulas.