1 Introduction
Climate change is widely recognized as one of the major challenges humanity will have to face over the next decades. The development of renewable energy systems therefore plays a crucial role in many strategic frameworks for sustainable development Rogelj et al. (2015); Amato et al. (2020a). These include not only the Sustainable Development Goals (SDGs) defined by the United Nations, but also the renewable energy targets defined by different jurisdictions, such as the USA Barbose et al. (2016), Europe Oberthür (2010); Santopietro and Scorza (2021), India Bhushan and Gopalakrishnan (2021), and Switzerland Prognos and others (2012).
The importance of decarbonizing energy systems is easily understandable McCollum et al. (2018). However, the transformation of energy systems poses technical and logistical challenges, which may imply major threats to many societal and environmental aspects Kiesecker et al. (2019). Power plants, including wind turbines, often require large amounts of land, hence generating conflicts with other priority targets of sustainable development, such as the limitation of land take Saganeiti et al. (2020), the increase in local agricultural productivity Martellozzo et al. (2018), and the protection of biodiversity Yenneti et al. (2016). The presence of such conflicts may be underestimated or overshadowed by the urgency of acting on energy networks to reduce their carbon dioxide emissions Spillias et al. (2020). For these reasons, proper planning of the expansion of renewable energy technologies is required to optimize the future location of power plants, based on precise estimates of power generation and taking into account the conflicts between the installation of power plants and the protection of nature and the environment.
Among renewable resources, wind energy is a promising resource potentially contributing to the energy transition in many parts of the world. In contrast to solar energy, it is available at any time of the day; however, it is highly variable and complex to model. Thus, the quantification of the spatial and temporal variation of wind power and the related uncertainty may provide valuable information for energy planners and policymakers. To date, most estimates of wind power generation are based on average annual wind speed models, which can only be used as an indicator of the power generation potential in a geographic area. Indeed, it has been shown that the use of average annual wind speed models underestimates wind power generation Nelson and Starcher (2018). Therefore, a methodology to precisely estimate wind speed at hourly time steps is needed. Wind speed measurements are generally collected by sparsely located meteorological stations, and hence do not provide the uniform spatial coverage required to estimate the power generation potential over large geographical regions at high spatial resolution. Several methods have been developed to obtain wind speed values at locations where no measurements are available Landberg et al. (2003). These can be broadly classified into physical (or deterministic) and statistical approaches.
Physical models, such as non-hydrostatic weather prediction or Reynolds-averaged Navier-Stokes models, are mostly based on the study of wind through the equations of fluid dynamics. While this family of models can ensure good estimates, it is generally limited in its ability to exploit large amounts of data and carries a heavy computational burden. These limitations are particularly inconvenient when working with data collected over long time periods and over relatively large geographical areas.
Statistical approaches model wind speed using its statistical relationship with a set of geo-environmental and topographical predictors. They include a wide range of models, from classical geostatistics to Machine Learning (ML). The latter have become extremely popular over the last decades, as they can deal with the non-linearity of wind speed and take advantage of big data. Another potential advantage of statistical methods is that they may enable the estimation of the uncertainty of wind speed prediction, which is very important for exploring the potential location of new wind farms. Indeed, the uncertainty of wind assessment is a major factor influencing the investment risk related to the installation of wind power plants Veronesi et al. (2015). This uncertainty has sometimes been estimated by imposing a distributional shape on wind speed measurements a priori Veronesi et al. (2016); Laib et al. (2018). However, it could be more convenient to estimate uncertainty without making any assumption on the distributional properties of wind data. Moreover, when the aim is to estimate wind power, the propagation of such uncertainty through the transformation of wind speed into power must also be considered. ML has been successfully applied to model wind speed at several spatial scales in different parts of the globe. Nonetheless, most applications focused on frequencies lower than the hourly one, dealing with the modelling of daily or monthly means Veronesi et al. (2017); Douak et al. (2013). Moreover, the approaches discussed in the literature always consider the spatial and temporal dimensions of the wind speed patterns separately, hence producing models that account only for the spatial or the temporal correlation in data, respectively Cellura et al. (2008); Xiao et al. (2018); Liu et al. (2020). Still, spatiotemporal correlation plays a primary role in wind speed modelling. To the best of our knowledge, no ML-based methodology has been proposed to solve spatiotemporal interpolation problems of wind speed while also accounting for prediction uncertainty and its propagation to the wind power estimation.
To address this gap, we propose a methodology to reconstruct a spatiotemporal field on a regular grid from spatially irregularly distributed wind speed time series. Our approach extends the framework previously proposed in Amato et al. (2020b) to include uncertainty estimation. More specifically, the wind speed data are decomposed into temporally referenced basis functions and their corresponding spatially distributed coefficients. The latter are spatially modelled using the Extreme Learning Machine (ELM) algorithm, an ML model for which a methodology to estimate model variance was introduced in Guignard et al. (2021). This enables the estimation of both model and prediction uncertainty without any assumption on the distribution of the data. An approach to propagate such uncertainty through the transformation of wind speed into power is also introduced.
The methodology is applied to the study of wind power potential in Switzerland, where the complex orography makes wind modelling an extremely challenging task. Previous studies have attempted to model wind speed in this country, although focusing on monthly frequencies or without investigating prediction uncertainties and their propagation to the power generation potential Robert et al. (2013); Assouline et al. (2019). Both these aspects are considered here, and data at higher frequency are used. The application is structured into two parts. First, ten years of wind speed data monitored at an hourly frequency at up to 208 monitoring stations are considered. The data are then interpolated using the proposed spatiotemporal modelling technique, allowing the estimation of wind speed and its uncertainty at unsampled locations. Second, the modelled spatiotemporal wind speed field is used to estimate the wind power potential, taking into account technical characteristics of horizontal-axis wind turbines of 100 meters hub height as well as national regulatory planning limitations for the installation of wind power plants. The limitations include restrictions for noise abatement and landscape, natural, ecological and cultural heritage protection plans as provided in the Swiss national wind atlas für Raumentwicklung ARE (2020). The resulting wind power potential is the first dataset of this type for Switzerland, representing a valuable tool for planners to support the design of future energy systems with increased wind power generation.
Considering the spatial and temporal variability of wind also makes it possible to assess its complementarity with other forms of renewables such as photovoltaics, which play a key role in Switzerland’s Energy Strategy BFE (2020).
2 Methodology
This section presents the proposed methodology to model wind speed and wind power generation potential based on wind speed data from an irregularly spaced monitoring network. We first show that a basis function representation can be used to account for the spatiotemporal dependencies in wind speed data by decomposing them into fixed temporal bases and stochastic spatial coefficients. The latter are modelled using an ensemble of Extreme Learning Machines (ELM). ELM is a single-layer feedforward neural network which has the advantage of permitting uncertainty estimates. The results of the regression model are used to recompose the full spatiotemporal signal, returning an interpolation of the wind speed field including the model and prediction uncertainty. Finally, we show how wind speed estimates and their corresponding uncertainties can be used to estimate the wind power generation.
2.1 Spatiotemporal modelling of irregularly spaced data
This subsection describes the methodology to decompose spatiotemporal data via basis functions, to spatially model the resulting linear coefficients, and to estimate the model uncertainty and the prediction uncertainty.
2.1.1 Basis function decomposition of spatiotemporal data
Spatiotemporal wind speed observations collected by irregularly spaced monitoring stations can be decomposed into a linear combination of purely temporal bases through Principal Component Analysis (PCA), also known as Empirical Orthogonal Function (EOF) analysis in the fields of meteorology and climatology Cressie and Wikle (2011). The linear coefficients of the combination, which will be modelled, are purely spatial.
Assume that we have spatiotemporal measurements $\{Z(s_i, t_j)\}$ at $S$ locations $\{s_i\}$ and $T$ times $\{t_j\}$, with $i = 1, \dots, S$ and $j = 1, \dots, T$. Let us define the empirical temporal mean at time $t_j$ by
$$\hat{\mu}_t(t_j) = \frac{1}{S} \sum_{i=1}^{S} Z(s_i, t_j), \tag{1}$$
and the temporally centered data by
$$\tilde{Z}(s_i, t_j) = Z(s_i, t_j) - \hat{\mu}_t(t_j). \tag{2}$$
Then, the temporally centered data can be written as
$$\tilde{Z}(s_i, t_j) = \sum_{k=1}^{K} a_k(s_i) \, \phi_k(t_j), \tag{3}$$
where the $\phi_k(t_j)$ form a discrete orthonormal temporal basis and the $a_k(s_i)$ are the spatial coefficients with respect to the $k$-th EOF at locations $s_i$, such that
$$\sum_{j=1}^{T} \phi_k(t_j) \, \phi_l(t_j) = \delta_{kl}. \tag{4}$$
The spatiotemporal measurements are then supposed to follow
$$Z(s, t_j) = \hat{\mu}_t(t_j) + \sum_{k=1}^{K} a_k(s) \, \phi_k(t_j) + \eta(s, t_j), \tag{5}$$
where $\eta(s, t_j)$ is an error term with zero mean, which includes any stochastic part which is not described by the model and may contain spatiotemporal dependencies.
The basis is obtained by a spectral decomposition of the empirical temporal covariance matrix, from which $K$ temporal EOFs with spatial coefficients are obtained. Several practical considerations can be found in Wikle et al. (2019).
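The decomposition described above can be sketched numerically as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the function name `eof_decompose` and the array layout (stations in rows, times in columns) are hypothetical, and the basis is obtained through a singular value decomposition of the centred data, which is equivalent to a spectral decomposition of the empirical temporal covariance matrix.

```python
import numpy as np

def eof_decompose(Z):
    """Decompose an (S stations x T times) array into temporal EOFs.

    Returns the empirical temporal mean (T,), the spatial coefficients
    A (S, K) and the orthonormal temporal basis Phi (K, T).
    """
    mu_t = Z.mean(axis=0)                  # empirical temporal mean at each time
    Z_tilde = Z - mu_t                     # temporally centred data
    # Rows of Vt are eigenvectors of the empirical temporal covariance
    # matrix Z_tilde.T @ Z_tilde (up to a constant factor).
    _, sing, Vt = np.linalg.svd(Z_tilde, full_matrices=False)
    K = int(np.sum(sing > 1e-10 * sing.max()))   # drop numerically null components
    Phi = Vt[:K]                           # temporal basis, one EOF per row
    A = Z_tilde @ Phi.T                    # spatial coefficients of each station
    return mu_t, A, Phi

# Synthetic example: 20 stations, 200 time steps (values are arbitrary).
rng = np.random.default_rng(0)
Z = rng.gamma(shape=2.0, scale=1.5, size=(20, 200))
mu_t, A, Phi = eof_decompose(Z)
Z_rebuilt = mu_t + A @ Phi                 # mean plus linear combination of EOFs
```

With all non-null components retained, the recomposed array matches the input, so the decomposition itself loses no information; the modelling error only enters once the coefficients are interpolated in space.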
2.1.2 Extreme Learning Machine
ELM is a fast and efficient single-layer feedforward neural network Huang et al. (2006). The input weights and biases are randomly chosen, and the output weights are optimized through least squares. ELM can address spatial interpolation tasks and deal with high-dimensional environmental data Leuenberger and Kanevski (2015).
Denoting the transpose operator as $^\top$, suppose that $d$ input variables $x = (x_1, \dots, x_d)^\top$ are related to an output variable $y$ through the relationship
$$y = f(x) + \varepsilon(x), \tag{6}$$
where $f$ is a function and $\varepsilon(x)$ is a centred random noise with finite variance, both depending on the input.
Let $\{(x_i, y_i)\}_{i=1}^{n}$ be a training set. Given $N$, the number of neurons of the hidden layer, the input weights $w_j$ and biases $b_j$ are randomly initialized for $j = 1, \dots, N$. In this paper, all input weights and biases are drawn independently from a uniform distribution. The hidden layer matrix, denoted as $H$, is defined elementwise by $H_{ij} = g(x_i^\top w_j + b_j)$, $i = 1, \dots, n$ and $j = 1, \dots, N$, where $g$ is an infinitely differentiable activation function. Here, the logistic function is chosen as an activation function.
The output weights $\beta$ are then optimized using least squares. A regularized version of ELM is used here Deng et al. (2009), with the benefits of stabilizing the variability of the output weights and reducing overfitting and the effect of outliers. This corresponds to minimizing the cost function
$$J(\beta) = \lVert H\beta - \mathbf{y} \rVert^2 + \alpha \lVert \beta \rVert^2, \tag{7}$$
for some fixed $\alpha > 0$, where $\lVert \cdot \rVert$ denotes the Euclidean norm and $\mathbf{y} = (y_1, \dots, y_n)^\top$. The real number $\alpha$ is sometimes called the Tikhonov factor and controls the amount of regularization. Noting $I$ the identity matrix and $H^{\alpha} = (H^\top H + \alpha I)^{-1} H^\top$, the solution of this minimization problem is given by $\hat{\beta} = H^{\alpha} \mathbf{y}$. This model is a ridge regression Piegorsch (2015) performed on the random feature space Lendasse et al. (2013). Then, given a new input point $x$, the prediction is given by $\hat{f}(x) = h^\top \hat{\beta}$, where
$$h = \left( g(x^\top w_1 + b_1), \dots, g(x^\top w_N + b_N) \right)^\top. \tag{8}$$
To enable variance estimation of the ELM modelling, the algorithm is retrained $M$ times and the resulting predictions are averaged Guignard et al. (2021), resulting in a particular case of ELM ensembles Lendasse et al. (2013); Liu and Wang (2010). Denoting the $m$-th prediction as $\hat{f}_m(x)$ for $m = 1, \dots, M$, the final prediction is then
$$\hat{f}(x) = \frac{1}{M} \sum_{m=1}^{M} \hat{f}_m(x) = \frac{1}{M} \sum_{m=1}^{M} h_m^\top H_m^{\alpha} \mathbf{y}, \tag{9}$$
where $h_m$ and $H_m^{\alpha}$ (the hidden-layer features of the new input and the regularised pseudo-inverse of the hidden layer matrix) are the analogous quantities defined previously for the $m$-th model. Considering the input variables as deterministic, the use of several ELMs makes it possible to develop distribution-free estimates of variance in homoskedastic (constant noise variance) and heteroskedastic (non-constant noise variance) settings. Several estimates are proposed in Guignard et al. (2021). In this paper, the heteroskedastic estimate will be used within the spatiotemporal model variance estimation in subsection 2.2. Additionally, the bias-reduced homoskedastic model variance estimate and its related noise variance estimate will be used in the spatiotemporal prediction variance estimation procedure. Those variance estimates are also provided for regularised ELM and are computed using the UncELMe Python package (see Guignard et al. (2021) for more details on their derivation and implementation).
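To make the ELM ensemble concrete, the following is a minimal NumPy sketch of a regularised ELM retrained several times and averaged. It is not the UncELMe package and does not reproduce its variance estimates; the class `RidgeELM`, the helper `elm_ensemble_predict`, the sampling interval [-1, 1] for the random weights, and all hyperparameter values are assumptions chosen for illustration.

```python
import numpy as np

def logistic(u):
    return 1.0 / (1.0 + np.exp(-u))

class RidgeELM:
    """Single-hidden-layer ELM with Tikhonov-regularised output weights."""

    def __init__(self, n_neurons, alpha, rng):
        self.n_neurons, self.alpha, self.rng = n_neurons, alpha, rng

    def _hidden(self, X):
        return logistic(X @ self.W + self.b)

    def fit(self, X, y):
        d = X.shape[1]
        # Random input weights and biases (the interval [-1, 1] is an assumption).
        self.W = self.rng.uniform(-1.0, 1.0, size=(d, self.n_neurons))
        self.b = self.rng.uniform(-1.0, 1.0, size=self.n_neurons)
        H = self._hidden(X)
        # Ridge solution: beta = (H^T H + alpha I)^{-1} H^T y
        gram = H.T @ H + self.alpha * np.eye(self.n_neurons)
        self.beta = np.linalg.solve(gram, H.T @ y)
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

def elm_ensemble_predict(X_train, y_train, X_new, n_models=20,
                         n_neurons=30, alpha=1e-3, seed=0):
    """Retrain the ELM n_models times and average the predictions."""
    rng = np.random.default_rng(seed)
    preds = np.stack([
        RidgeELM(n_neurons, alpha, rng).fit(X_train, y_train).predict(X_new)
        for _ in range(n_models)
    ])
    return preds.mean(axis=0), preds

# Synthetic regression problem (arbitrary target function).
rng = np.random.default_rng(42)
X = rng.uniform(-1, 1, size=(300, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.standard_normal(300)
mean_pred, member_preds = elm_ensemble_predict(X[:200], y[:200], X[200:])
```

The individual member predictions are returned alongside the average because ensemble-based variance estimates are built from them; the specific estimators are those derived in Guignard et al. (2021) and are not re-implemented here.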
2.1.3 Spatiotemporal modelling via spatial interpolation of the coefficients
As mentioned above, the data are assumed to follow equation (5) based on Amato et al. (2020b). The coefficients $a_k(s)$ depend only on space, potentially through additional spatial features $x(s)$. In the case of wind speed estimation, these features may include terrain characteristics such as altitude, slope or aspect. Using the single output strategy proposed in Amato et al. (2020b), the coefficient maps can be modelled with any ML algorithm, including ELM. For the $k$-th map, this implicitly supposes the existence of a function $f_k$ such that
$$a_k(s) = f_k(x(s)) + \varepsilon_k(x(s)), \tag{10}$$
where $\varepsilon_k$ is assumed to be a stochastic noise with zero mean and finite variance. The estimated function is denoted as $\hat{f}_k$ and is used as a spatially interpolated coefficient map. The spatiotemporal prediction at a new point $s$ is then given by
$$\hat{Z}(s, t_j) = \hat{\mu}_t(t_j) + \sum_{k=1}^{K} \hat{f}_k(x(s)) \, \phi_k(t_j), \tag{11}$$
where $\hat{\mu}_t(t_j)$ is the empirical temporal mean and the $\phi_k$ are the temporal basis functions of subsection 2.1.1.
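The recomposition step can be illustrated with a few lines of NumPy. The function name `recompose` and the array shapes are hypothetical; the sketch simply adds the empirical temporal mean to the linear combination of temporal basis functions weighted by the spatially interpolated coefficients.

```python
import numpy as np

def recompose(mu_t, coef_pred, Phi):
    """Rebuild time series at new locations.

    mu_t:      (T,)   empirical temporal mean
    coef_pred: (P, K) interpolated coefficients at P new points
    Phi:       (K, T) temporal basis functions
    Returns a (P, T) array of predicted values.
    """
    return mu_t[None, :] + coef_pred @ Phi

# Toy example with K = 2 components and T = 3 time steps.
mu_t = np.array([1.0, 2.0, 3.0])
Phi = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0]])
coef_pred = np.array([[0.5, -1.0]])        # one new location
Z_hat = recompose(mu_t, coef_pred, Phi)
```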
2.2 Uncertainty quantification
Using equations (5) and (11), the prediction error is given by
$$Z(s, t_j) - \hat{Z}(s, t_j) = \sum_{k=1}^{K} \left( f_k(x(s)) - \hat{f}_k(x(s)) \right) \phi_k(t_j) + \sum_{k=1}^{K} \varepsilon_k(x(s)) \, \phi_k(t_j) + \eta(s, t_j). \tag{12}$$
The first term on the right hand side is the modelling error between the linear combination of true regression functions $f_k$ and the spatiotemporal combination of spatial estimates $\hat{f}_k$. The variance of the modelling error, denoted as $\sigma^2_{\hat{Z}}(s, t_j)$ and referred to as spatiotemporal model variance, quantifies the model accuracy. The spatiotemporal model variance will be used to construct model standard-error bands.
The prediction error will also be considered to evaluate the accuracy of the estimate with respect to the observed output. As the prediction error distribution is unknown and no assumptions are made on the noise distribution, a reliable prediction interval estimation is not obvious. We prefer here to quantify the spatiotemporal prediction variance, given by the variance of the prediction error,
$$\sigma^2_{Z}(s, t_j) = \mathrm{Var}\left[ Z(s, t_j) - \hat{Z}(s, t_j) \right]. \tag{13}$$
2.2.1 Spatiotemporal model variance estimation
Let us denote the vector of training outputs of the $k$-th map as $\mathbf{a}_k$, where the $i$-th vector component is given by $a_k(s_i)$. In a similar manner, $\boldsymbol{\varepsilon}_k$ denotes the vector given by the noise at the training points. Assuming that $\mathrm{Cov}[\boldsymbol{\varepsilon}_k, \boldsymbol{\varepsilon}_l] = 0$ for $k \neq l$ ensures that no additional variability comes from the spatial model interactions. Indeed, knowing the training input, note that for a single ELM and for all $k \neq l$,
(14) 
where the law of total covariance is used in the second equality. This result may be generalised to the ELM ensemble as
$$\mathrm{Cov}\left[ \hat{f}_k(x), \hat{f}_l(x) \right] = 0, \quad k \neq l. \tag{15}$$
While it seems reasonable to suppose $\mathrm{Cov}[\boldsymbol{\varepsilon}_k, \boldsymbol{\varepsilon}_l] = 0$, this should be validated, e.g. by looking at the empirical cross-covariance function or the cross-variogram of the training residuals.
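The spatiotemporal model variance is now straightforward to compute. Using equation (15), one obtains
$$\sigma^2_{\hat{Z}}(s, t_j) = \mathrm{Var}\left[ \sum_{k=1}^{K} \hat{f}_k(x(s)) \, \phi_k(t_j) \right] = \sum_{k=1}^{K} \phi_k^2(t_j) \, \mathrm{Var}\left[ \hat{f}_k(x(s)) \right]. \tag{16}$$
The spatiotemporal model variance is hence obtained directly as a sum of the spatial component model variances weighted by the corresponding squared basis functions. Therefore, $\sigma^2_{\hat{Z}}(s, t_j)$ can be estimated by using the variance estimate of each ELM ensemble model,
$$\hat{\sigma}^2_{\hat{Z}}(s, t_j) = \sum_{k=1}^{K} \phi_k^2(t_j) \, \hat{\sigma}^2_k(x(s)), \tag{17}$$
where $\hat{\sigma}^2_k(x(s))$ is the heteroskedastic variance estimate of the modelled regression function of the $k$-th spatial coefficient map, at the input point $x(s)$. The choice of this estimate is motivated by its convenient trade-off between computational efficiency and estimation effectiveness, see Guignard et al. (2021).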
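The weighted sum of component variances can be sketched as below. The function name `model_variance`, the array shapes and the toy numbers are assumptions; the per-component variances would in practice come from the ELM ensemble variance estimates, which are not computed here.

```python
import numpy as np

def model_variance(coef_var, Phi):
    """Spatiotemporal model variance from per-component variances.

    coef_var: (P, K) model variance of each coefficient map at P points
    Phi:      (K, T) temporal basis functions
    Returns a (P, T) array: sum over k of Phi[k, t]**2 * coef_var[p, k].
    """
    return coef_var @ Phi ** 2

# Toy example: one location, two components, two time steps.
coef_var = np.array([[0.04, 0.09]])
Phi = np.array([[0.6, 0.8],
                [0.8, -0.6]])
var_model = model_variance(coef_var, Phi)
std_err = np.sqrt(var_model)               # half-width of a one-standard-error band
```

Because the basis functions are squared, their sign plays no role: only the magnitude of each EOF at a given time determines how much of the corresponding spatial uncertainty is passed on.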
2.2.2 Spatiotemporal prediction variance estimation
Variance functions are sometimes obtained by modelling them as a function of the input features using the squared residuals Ruppert et al. (2003), as the expectation of the squared residuals approximately corresponds to the prediction variance Carroll and Ruppert (1988). Using the squared residuals to perform a regression hence yields a plausible estimate of prediction variance Hall and Carroll (1989).
The training squared residuals are given by
$$r^2(s_i, t_j) = \left( Z(s_i, t_j) - \hat{Z}(s_i, t_j) \right)^2, \tag{18}$$
here also denoted as $r^2$ for short. The latter is used to train a new model. This new model may result in negative estimates of the prediction variance $\sigma^2_Z$. Hence, positiveness of the modelled variance function is here ensured through exponentiation, following Ruppert et al. (2003); Heskes (1997). The logarithm of the squared training residuals of the first model is then used as a new training set to model the random variable $g = \log r^2$ with mean $\mu_g$ and variance $\sigma_g^2$. This second spatiotemporal model follows the same pipeline as the first model, including the EOF data decomposition and the ELM modelling of each of the resulting components with the high-dimensional input space composed of the spatially referenced features. Its predicted value is noted $\hat{g}$.
A second order Taylor expansion around $\mu_g$ is needed to retrieve the expected squared residuals back from their log-transform, following the equation
$$r^2 = e^{g} \approx e^{\mu_g} \left( 1 + (g - \mu_g) + \frac{(g - \mu_g)^2}{2} \right). \tag{19}$$
Expansion of a random variable function in the neighborhood of the random variable mean is known as the delta method in statistics Oehlert (1992); Ver Hoef (2012). Taking the expectation on both sides yields
$$\mathbb{E}\left[ r^2 \right] \approx e^{\mu_g} \left( 1 + \frac{\sigma_g^2}{2} \right). \tag{20}$$
This motivates the following estimation of the spatiotemporal prediction variance,
$$\hat{\sigma}^2_Z(s, t_j) = e^{\hat{g}(s, t_j)} \left( 1 + \frac{\hat{\sigma}_g^2(s, t_j)}{2} \right), \tag{21}$$
with the first factor built from the prediction of the second spatiotemporal model and the second from its prediction variance estimate
(22) 
where the model variance estimate (respectively the noise estimate) is the bias-reduced homoskedastic estimate (respectively its related noise variance estimate) of each modelled spatial coefficient map of the second spatiotemporal model. Although the noise of each component is not necessarily homoskedastic, this provides a good estimate of the variance of the log squared residuals and is better than limiting the estimation of the prediction variance to a first order Taylor expansion.
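The effect of the second order correction can be checked numerically. The sketch below is an illustration under an extra assumption that the paper does not make: the log squared residual is taken to be Gaussian, only so that the exact expectation is known in closed form for comparison. The function name `prediction_variance_from_log` and the numerical values are hypothetical.

```python
import numpy as np

def prediction_variance_from_log(g_hat, var_g_hat):
    """Second order delta-method back-transform of a log-variance model.

    g_hat:     predicted log squared residual
    var_g_hat: estimated variance of that prediction
    """
    return np.exp(g_hat) * (1.0 + 0.5 * var_g_hat)

# Comparison against a case with a known answer: if g were Gaussian with
# mean mu and variance s2 (an assumption made only for this check),
# then E[exp(g)] = exp(mu + s2 / 2) exactly.
mu, s2 = np.log(2.0), 0.1
exact = np.exp(mu + 0.5 * s2)
second_order = prediction_variance_from_log(mu, s2)
first_order = np.exp(mu)                   # what a first order expansion gives
```

In this toy case the second order estimate is within a fraction of a percent of the exact value, whereas the first order one underestimates it by several percent, which illustrates why the correction term is worth keeping.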
2.3 Wind power estimation
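Let us denote the expectation and variance of the wind speed $V_{h_0}$ at a given location and time as $\mu_{h_0}$ and $\sigma^2_{h_0}$. The wind speed has been measured at a height $h_0$. Assume that the wind speed $V_h$ at wind turbine height $h$ can be estimated by the so-called log-law,
$$V_h = V_{h_0} \, \frac{\ln(h / z_0)}{\ln(h_0 / z_0)}, \tag{23}$$
where $z_0$ is the terrain roughness depending on the location Whiteman (2000). The expectation and variance of $V_h$ are then given by
$$\mu_h = \mu_{h_0} \, \frac{\ln(h / z_0)}{\ln(h_0 / z_0)}, \qquad \sigma^2_h = \sigma^2_{h_0} \left( \frac{\ln(h / z_0)}{\ln(h_0 / z_0)} \right)^2. \tag{24}$$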
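Since the log-law is a multiplication by a location-dependent constant, the mean scales with that constant and the variance with its square. A minimal sketch, in which the function name `log_law_moments` and the roughness value of 0.1 m are assumptions chosen for illustration:

```python
import math

def log_law_moments(mean_h0, var_h0, h0, h, z0):
    """Scale the mean and variance of wind speed from height h0 to height h.

    z0 is the terrain roughness length, in the same unit as the heights.
    """
    c = math.log(h / z0) / math.log(h0 / z0)   # log-law scaling factor
    return mean_h0 * c, var_h0 * c ** 2

# From a 10 m measurement height to a 100 m hub height, with a
# hypothetical roughness length of 0.1 m: the factor is ln(1000)/ln(100) = 1.5.
mean_hub, var_hub = log_law_moments(mean_h0=4.0, var_h0=1.0, h0=10.0, h=100.0, z0=0.1)
```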
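The wind speed at the wind turbine height is then converted to power. Logistic functions have proven to be highly precise in fitting power curves, on both simulated and manufacturers' data Bokde et al. (2018); Villanueva and Feijóo (2016). Assume that the power curve of the turbine is a three-parameter logistic function of the wind speed $v$,
$$P(v) = \frac{\phi}{1 + e^{-\omega (v - v_0)}}, \tag{25}$$
see Figure 7.9 for an example. The first and second derivatives of the power curve are Minai and Williams (1993)
$$P'(v) = \omega \, P(v) \left( 1 - \frac{P(v)}{\phi} \right), \qquad P''(v) = \omega^2 \, P(v) \left( 1 - \frac{P(v)}{\phi} \right) \left( 1 - \frac{2 P(v)}{\phi} \right). \tag{26}$$
Due to the non-linearity of the power curve, the expectation and variance of $P(V_h)$ are again approximated using the delta method Oehlert (1992); Ver Hoef (2012). The second order Taylor expansion of the power around $\mu_h$, the expectation of the hub-height wind speed $V_h$ with variance $\sigma^2_h$, is
$$P(V_h) \approx P(\mu_h) + P'(\mu_h) (V_h - \mu_h) + \frac{1}{2} P''(\mu_h) (V_h - \mu_h)^2. \tag{27}$$
Taking the expectation on both sides,
$$\mathbb{E}\left[ P(V_h) \right] \approx P(\mu_h) + \frac{1}{2} P''(\mu_h) \, \sigma^2_h. \tag{28}$$
The variance of $P(V_h)$ is obtained by computing the variance of its first order Taylor expansion, as higher moments are not available, such that
$$\mathrm{Var}\left[ P(V_h) \right] \approx \left( P'(\mu_h) \right)^2 \sigma^2_h. \tag{29}$$
Given the parameters of the log-law and of the power curve, the expected value and variance of the wind turbine power at each location and each time are estimated by substituting the expectation and variance of the measured-height wind speed by the predicted wind speed and its estimated variance in equations (24), and plugging the results into equations (28) and (29).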
Equation (29) implies that the variance is completely transformed by the logistic function, see also Figure 1. Thus, when wind speed is high with a sufficiently small variance, the estimate remains confidently in the plateau region of the logistic function, characterised by the maximum wind power. Consequently, the power variance is small, in accordance with equation (29), indicating high confidence in reaching the maximum energy production. Similarly, when wind speed is low with a relatively low variance, the power is close to zero with high certainty. By contrast, when the wind speed is in the transition phase of the logistic function, the power is liable to fluctuate between its minimum and maximum values even with a very small wind speed variance. This leads to a high variance of the power estimate, driven by the large derivative of the power curve in equation (29).
3 Case study and data
This section introduces the case study for wind power estimation in Switzerland. First, we discuss the structures and properties of the wind data used in the remainder of the paper. Specifically, both the wind speed data and the spatially referenced features used as input for the ML modelling will be presented. Then, the ELM model training based on the methodology proposed in Section 2.1.2 as well as the application of the wind power model for wind turbines of 100 meters hub height is explained. Finally, we quantify the available area for installing wind turbines, which is required to obtain a national-scale estimate of the wind power potential for Switzerland.
3.1 Study area and data availability
Wind speed measurements have been obtained from the IDAWEB web portal of the Swiss Federal Office of Meteorology and Climatology (MeteoSwiss) MeteoSuisse. The data are collected from 450 monitoring stations measuring wind speed at 10 meters above ground level with a 10-minute frequency, from the start of January 2008 (00:00) to the end of December 2017 (23:50). The number of available monitoring stations changes significantly over the sampling period, with notable growth in 2013 and 2017. Therefore, data have been temporally divided into the following three sets, each having a homogeneous number of stations as indicated in Table 1:

from January 2008 00:00 am to December 2012 11:50 pm, which will be referred to as MSWind 0812,

from January 2013 00:00 am to December 2016 11:50 pm, which will be referred to as MSWind 1316,

from January 2017 00:00 am to December 2017 11:50 pm, which will be referred to as MSWind 17.
For each dataset, the stations with more than 10% of missing or negative values have been removed, together with those having more than 10% of zero values. The remaining zero values have been set to missing values. Moreover, outliers and locally suspicious behaviours, suggesting for example equipment failure, have been detected and replaced by missing values. The frequency of the data has then been reduced to 1 hour by averaging.
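The station filtering and hourly aggregation can be sketched with pandas as follows. The function name `clean_and_aggregate`, the column layout (one column per station, 10-minute `DatetimeIndex`) and the synthetic data are assumptions; outlier detection is not covered, and treating the remaining negative values as missing is an additional assumption not stated in the text.

```python
import numpy as np
import pandas as pd

def clean_and_aggregate(df, max_bad=0.10, max_zero=0.10):
    """Filter stations and average 10-minute wind speeds to hourly values."""
    bad = (df.isna() | (df < 0)).mean()    # share of missing or negative values
    zero = (df == 0).mean()                # share of exact zeros
    keep = (bad <= max_bad) & (zero <= max_zero)
    out = df.loc[:, keep].copy()
    # Remaining zeros become missing; remaining negatives too (assumption).
    out = out.mask(out <= 0)
    return out.resample("60min").mean()    # hourly means, missing values skipped

# One synthetic day of 10-minute data for three hypothetical stations.
idx = pd.date_range("2017-01-01", periods=144, freq="10min")
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.uniform(0.5, 10, size=(144, 3)), index=idx, columns=["A", "B", "C"])
df.iloc[::2, 1] = np.nan                   # station B: 50% missing, removed
df.iloc[::5, 2] = 0.0                      # station C: about 20% zeros, removed
df.iloc[3, 0] = 0.0                        # a single zero at station A, set to missing
hourly = clean_and_aggregate(df)
```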
Each of the three datasets has been divided into a training set (including 80% of the monitoring stations) and a test set (20%). Table 1 summarizes the main characteristics of the three cleaned datasets. Finally, all the remaining missing values of the training sets have been replaced by the local average of the data from the eight closest stations in space and the two contiguous time frames, yielding a mean over 24 spatiotemporal neighbours Jun and Stein (2007); Porcu et al. (2016). Figure 2 indicates the location of the monitoring stations in Switzerland, together with a division of the national territory into homogeneous geomorphological regions.
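A possible reading of this imputation rule is sketched below: for each missing value, the eight nearest stations are taken at the previous, current and next time step, which gives 8 × 3 = 24 neighbours. The function name `impute_local_mean`, the use of Euclidean distance on planar coordinates and the handling of the first and last time steps are assumptions.

```python
import numpy as np

def impute_local_mean(Z, coords, n_neighbours=8):
    """Fill missing values with a local spatiotemporal mean.

    Z:      (S, T) array with NaN for missing values
    coords: (S, 2) planar station coordinates
    """
    S, T = Z.shape
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)         # a station is not its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :n_neighbours]
    filled = Z.copy()
    for i, j in zip(*np.where(np.isnan(Z))):
        # Neighbouring stations at times j-1, j, j+1 (truncated at the edges).
        window = Z[nearest[i], max(j - 1, 0):min(j + 2, T)]
        if np.any(~np.isnan(window)):
            filled[i, j] = np.nanmean(window)
    return filled

# Toy check: a constant field with one gap is filled with the same constant.
rng = np.random.default_rng(3)
coords = rng.uniform(0, 100, size=(10, 2))
Z = np.full((10, 6), 5.0)
Z[0, 2] = np.nan
Z_filled = impute_local_mean(Z, coords)
```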
MSWind 0812  MSWind 1316  MSWind 17  
General characteristics  
Total number of stations  106  127  208 
Number of training stations  84  101  166 
Number of test stations  22  26  42 
Time series length  43’848  35’064  8’760 
On the training datasets  
Missing values  2.1%  1.1%  1.2% 
Minimal distance between stations  0.1  3.3  0.4 
Maximal distance between stations  323.1  332.7  332.7 
Characteristic network scale  22.2  20.2  15.8 
A full exploratory data analysis was performed on the three wind speed datasets and is available in Appendix A.1. The spatial plots in Appendix A.1 highlight the presence of structures related to the channelling effect and/or the climatic barrier formed by the alpine chain crossing the country. Time series plots and autocorrelation functions (ACF) have been used to identify the variety of temporal patterns in the data, including yearly and daily cycles with different intensities depending on the station. Finally, kernel density estimates (KDE) show how, while some stations seem to exhibit a Weibull distribution typical for wind speed measurements Jung and Schindler (2019), many other stations are more atypical, sometimes even exhibiting bimodality. This highlights the importance of adopting a modelling approach which makes no distributional assumption on the data.
Wind speed has been proven to be extremely dependent on local orographic characteristics Guignard et al. (2019), which can be assessed by applying convolutional filters to extract primary or secondary topographic features from a Digital Elevation Model (DEM) Laib and Kanevski (2019). In this study, we adopted the 13-dimensional input space proposed in Robert et al. (2013) to model wind speed using ML. In addition to the coordinates of the geographical space (latitude, longitude and elevation), this input space includes three categories of spatial features:

Differences of Gaussians (DoG): obtained by subtracting two smoothed surfaces attained through the application of Gaussian filters with different bandwidth to the DEM. Three different scales have been considered;

Directional derivatives: obtained by evaluating the directional derivatives on DEMs smoothed with kernels having different bandwidths. Such filters are used to remove the spurious data of the DEMs, enhancing features in the data. Two scales have been considered for both North-South (N-S) and East-West (E-W) directions;

Terrain slopes: obtained as the norm of terrain gradient based on three smoothed DEMs.
Further details on the input features are provided in Appendix A.2.
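The three feature families listed above can be illustrated on a synthetic DEM with SciPy's Gaussian filter. This is a sketch under assumptions: the function name `terrain_features`, the bandwidths (given in pixels), the choice of rows as the North-South axis, and the test surface are all hypothetical, and the actual scales used in the study are those described in Appendix A.2.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def terrain_features(dem, cell_size, sigma_small, sigma_large):
    """Difference of Gaussians, directional derivatives and slope of a DEM.

    dem:       2-D elevation array (rows assumed N-S, columns E-W)
    cell_size: grid spacing in metres
    sigma_*:   Gaussian bandwidths in pixels
    """
    smooth_s = gaussian_filter(dem, sigma=sigma_small)
    smooth_l = gaussian_filter(dem, sigma=sigma_large)
    dog = smooth_s - smooth_l                      # difference of Gaussians
    d_ns, d_ew = np.gradient(smooth_s, cell_size)  # derivatives along rows and columns
    slope = np.hypot(d_ns, d_ew)                   # norm of the terrain gradient
    return dog, d_ns, d_ew, slope

# Synthetic DEM: a plane rising 0.1 m per metre towards the east.
cell = 250.0
cols = np.arange(100) * cell
dem = np.tile(0.1 * cols, (100, 1))
dog, d_ns, d_ew, slope = terrain_features(dem, cell, sigma_small=2, sigma_large=4)
```

On a plane, smoothing leaves the interior unchanged, so the difference of Gaussians vanishes and the slope equals the plane's gradient away from the borders; real terrain produces non-zero responses at the scale of the chosen bandwidths.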
3.2 Model training and application
3.2.1 Wind speed
The modelling framework described in Section 2.1 has been applied to the MSWind 0812, MSWind 1316 and MSWind 17 datasets. For both the first and the second spatiotemporal model, the coefficients of each EOF component have been spatially modelled with a regularised ELM ensemble of members, with the 13-dimensional space presented in Section 3.1 as input features. Table 2 shows the number of neurons of each ELM ensemble: while it is fixed within each ensemble, it changes across the datasets so as to be slightly smaller than the number of training stations. This approach provides high flexibility to the model. During model training, each member of each ELM ensemble is regularised by selecting a proper Tikhonov factor via Generalised Cross-Validation (GCV) Golub et al. (1979); Piegorsch (2015). Appendix B.1 provides further details concerning model regularisation. These include the use of the selected Tikhonov factor values as an indicator of the presence (or absence) of spatial structure in the modelled spatial coefficient maps, hence increasing the explainability of the ML model.
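As an illustration of how a Tikhonov factor can be selected by generalised cross-validation, the sketch below evaluates the standard GCV score for ridge regression on a grid of candidate values, using a singular value decomposition of the hidden layer matrix. The function name `gcv_select`, the candidate grid and the synthetic matrix standing in for the hidden layer are assumptions; the procedure actually used is described in Appendix B.1.

```python
import numpy as np

def gcv_select(H, y, alphas):
    """Pick the ridge parameter minimising the GCV score.

    GCV(alpha) = n * ||(I - A) y||^2 / trace(I - A)^2,
    with A = H (H^T H + alpha I)^{-1} H^T the hat matrix.
    """
    n = H.shape[0]
    U, s, _ = np.linalg.svd(H, full_matrices=False)
    Uty = U.T @ y
    scores = []
    for alpha in alphas:
        shrink = s ** 2 / (s ** 2 + alpha)     # eigenvalues of the hat matrix
        resid = y - U @ (shrink * Uty)
        scores.append(n * (resid @ resid) / (n - shrink.sum()) ** 2)
    scores = np.asarray(scores)
    return alphas[int(np.argmin(scores))], scores

# Synthetic stand-in for a hidden layer matrix with fewer neurons than samples.
rng = np.random.default_rng(0)
H = rng.standard_normal((80, 60))
y = H @ rng.standard_normal(60) + 0.5 * rng.standard_normal(80)
alphas = np.logspace(-6, 3, 10)
best_alpha, scores = gcv_select(H, y, alphas)
```

Working in the singular basis means the decomposition is computed once and each candidate value only costs a few vector operations, which matters when every member of every ensemble is regularised separately.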
Test error metrics for the models are reported in Table 2, together with those of the time series of the empirical temporal means, computed from the training data and used as a prediction for the test stations. The latter serve as a baseline prediction benchmark. A comprehensive residual analysis, provided in Appendix B.2, has been performed to verify the consistency of the obtained predictions and their uncertainty, highlighting the capability of the model to capture spatial, temporal and spatiotemporal dependencies in the data despite the complexity due to its hourly frequency and the relatively low number of training points.
Once trained, the models have been used to predict the spatiotemporal wind speed field and its model and prediction variances on a regular grid with a resolution of 250 meters, yielding three modelled spatiotemporal wind fields for Switzerland, one for each training dataset.
Dataset  Number of neurons  RMSE (ST ELM model)  MAE (ST ELM model)  RMSE (emp. temp. mean)  MAE (emp. temp. mean)  
MSWind 0812  80  2.161  1.429  2.760  1.730 
MSWind 1316  100  1.744  1.282  2.060  1.490 
MSWind 17  150  1.929  1.328  2.251  1.513 
3.2.2 Wind power
To estimate the potential wind power generation in Switzerland, the approximated conversion and uncertainty propagation described in Section 2.3 are applied to the three modelled wind speed datasets. The power estimation is based on the characteristic parameters of an Enercon E101 wind turbine at 100 meters hub height Windturbinemodels.com (2021). The hub height is the distance from the turbine platform to the rotor of an installed wind turbine, indicating how high the turbine stands above the ground without considering the length of the turbine blades. Hence, the predicted wind speed data and its estimated variance are transformed from the measurement height of 10 meters to the hub height of 100 meters as described in equation (23), using a roughness derived from the Corine Land Cover (CLC), issued by the Swiss Federal Office of Topography (SwissTopo), following the methodology proposed in Grassi et al. (2015). Specifically, the CLC map of 2012 was used to estimate roughness for the MSWind 0812 data, while the CLC map of 2018 was used for the two remaining datasets (further details are reported in Appendix A.3). In addition, all wind speeds greater than the cut-out wind speed of the selected turbine, as provided in the manufacturer’s datasheet Windturbinemodels.com (2021), have been discarded after the transformation. The manufacturer’s wind turbine power curve Windturbinemodels.com (2021) has been fitted with the R package WindCurves Bokde et al. (2018), yielding the parameters of the logistic function (25). Then, the transformed wind speed and its variance are passed into equations (28) and (29). This yields an estimate of the expected electricity generation potential, accompanied by its variance, over the whole of Switzerland for the ten years from 2008 to 2017.
3.3 Available area for wind turbine installation
To convert the potential electricity generation per wind turbine into a national-scale potential estimate for wind power in Switzerland, the available area for wind turbine installation and the potential number of turbines must be defined.
The available area for wind power installations is divided into four restriction zones, shown in Table 3, which indicate whether wind installation is (1) prohibited, (2) restricted, (3) inhibited by the presence of forests, or (4) not subject to any identified restriction (other). These restriction zones are based on the framework for wind energy planning in Switzerland developed by the Swiss Federal Office of Spatial Development (ARE) für Raumentwicklung ARE (2020). It distinguishes, on a scale of m, between buffered building zones, protected areas, areas to be excluded in principle (considered prohibited), areas with a potential balancing of interests in case of national interest (considered restricted), areas subject to inter-authority coordination and other areas (no restriction assumed).
Forests are considered here as a separate category, as wind turbine installation may be possible (unless other restrictions apply), but these zones are considered to be more vulnerable than other areas. Furthermore, all areas at altitudes above 3,000 meters are considered prohibited (based on a digital terrain model Swisstopo (2017)), as they mark high-alpine terrain that is typically difficult to access. Areas above 2,500 meters of altitude are categorised as restricted, as currently no installations are found at higher altitudes and wind turbines in these regions may be difficult to install and maintain.
In addition to the technical aspects considered here, the planning and installation of wind power plants is highly dependent on social, political and environmental concerns. We hence exclude only the prohibited zones for wind power installation. All other zones (restricted, forests, other) are used for the analysis in Section 4, whereby the different zones may be subject to different social, political or environmental considerations. In the non-prohibited zones, wind turbines are virtually installed along the main direction of wind speed in Switzerland (SWW, 60° clockwise from north Koller and Humar (2016)) using geospatial tools. To minimise the potential impact of one virtual turbine’s generation on the next, turbines are spaced here by 16 turbine diameters (1.6 km) streamwise and 10 turbine diameters (1 km) spanwise. This is double the spacing that maximises the power output of a wind farm as assessed in Stevens et al. (2016), and agrees with the recommendations in Meyers and Meneveau (2012). The national-scale electricity potential is finally obtained from the annual electricity generation of each virtual turbine across the different restriction zones.
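The virtual installation can be pictured as a regular lattice rotated to follow the main wind direction. The following Python sketch is one possible way to generate such a lattice inside a rectangular bounding box; it is not the geospatial workflow used in this study. The function name `turbine_grid`, the bounding-box interface and the use of projected coordinates in meters are assumptions made for illustration. In practice, the resulting points would additionally be filtered against the prohibited zones.

```python
import numpy as np

def turbine_grid(xmin, ymin, xmax, ymax, bearing_deg=60.0,
                 d_stream=1600.0, d_span=1000.0):
    """Return an (N, 2) array of easting/northing points (m) on a lattice
    whose streamwise axis follows the given bearing (clockwise from north)."""
    theta = np.deg2rad(bearing_deg)
    # Unit vectors in (east, north) coordinates.
    e_stream = np.array([np.sin(theta), np.cos(theta)])
    e_span = np.array([np.cos(theta), -np.sin(theta)])  # perpendicular axis
    centre = np.array([(xmin + xmax) / 2.0, (ymin + ymax) / 2.0])
    # Cover the whole box, whatever the rotation, using its half diagonal.
    half_diag = 0.5 * np.hypot(xmax - xmin, ymax - ymin)
    n_i = np.ceil(half_diag / d_stream)
    n_j = np.ceil(half_diag / d_span)
    ii, jj = np.meshgrid(np.arange(-n_i, n_i + 1),
                         np.arange(-n_j, n_j + 1), indexing="ij")
    pts = (centre
           + ii.reshape(-1, 1) * d_stream * e_stream
           + jj.reshape(-1, 1) * d_span * e_span)
    inside = ((pts[:, 0] >= xmin) & (pts[:, 0] <= xmax)
              & (pts[:, 1] >= ymin) & (pts[:, 1] <= ymax))
    return pts[inside]

# Example: a 100 km x 100 km box; each turbine occupies about 1.6 km^2.
points = turbine_grid(0.0, 0.0, 100_000.0, 100_000.0)
```

With this spacing each virtual turbine occupies roughly 1.6 km², so the number of lattice points is close to the box area divided by 1.6 km².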
| Zone | Wind Atlas | TLM Regio | Altitude |
|---|---|---|---|
| Prohibited | Building zones | Glaciers | >3,000 m |
| | Protected areas | Lakes | |
| | Areas excluded in principle | | |
| Restricted | Potential national interest | Protected areas | >2,500 m |
| Forests | Inter-authority coordination, other | Forest | |
| Other | Inter-authority coordination, other | | |
4 Results
4.1 Wind speed modelling
Following the framework presented in Section 2, hourly predictions of wind speed were performed over the entire Swiss territory, covering the ten years from 2008 to 2017. The top of Figure 3 shows an example of predictions corresponding to January 2017 on a test station belonging to the MSWind 17 dataset. The model reproduces the main features of the measured wind speed time series, including most of the changes of magnitude and behaviour. However, the predicted time series appears smoother than the real data; this may be a consequence of the self-discarding of EOF components with a spatially unstructured coefficient map. Similar results are obtained for the MSWind 0812 and the MSWind 1316 datasets.
The estimation of the pointwise model and prediction standard-error bands, based respectively on and , is also reported. The model standard-error band is quite narrow, suggesting a low variability of the mean prediction, despite the low number of training stations. By contrast, the prediction standard-error band is larger, as expected from the noisy nature of wind speed data. The true wind speed time series is well encompassed by the prediction standard-error bands. For the same test station, an accuracy plot is shown in the central row of Figure 3. Moreover, for the fixed time marked in the time series plot, a predicted map of wind speed is displayed. At the same fixed time, maps of the model and prediction standard errors are shown. Higher model and prediction variabilities are observed in the Alps, crossing the study region from the southwest to northwest. Qualitatively, the spatial scale of the pattern seen on the prediction standard-error map appears comparable to the one observed on the prediction map, while the spatial scale of the pattern seen on the model standard-error map is coarser. This may be related to the multi-scale features used in the 13-dimensional input space.
4.2 Wind power estimation
The wind speed predictions at 10 meters above ground have been used to estimate the wind speed at 100 meters; the latter estimates have then been transformed into wind power estimates following the methodology of Section 3.2.2. Figure 4 illustrates some samples of the results at the same location and period previously shown for wind speed modelling. It displays a partial power time series at a test station, a prediction map at a fixed time and its corresponding uncertainty quantification map. For comparison, the power obtained by passing the true wind speed measurement through the three-parameter logistic function is added to the time series plots.
Generally speaking, the main behavioural variations of the true time series are captured, and the true series is contained within the error bands. Interestingly, when production reaches its maximum potential defined by physical turbine characteristics, the error band sometimes shrinks. This was expected, due to the logistic transformation and its consequences on the variance behaviour stated previously. The maps provide further insight. An important part of the Jura region, in the northwestern corner of the country, shows a very low uncertainty, while the power prediction is at its maximum. This behaviour is of particular practical interest, as it shows a high confidence of the model in these wind power estimates. Some similar spots are also identifiable in the western Plateau.
The aggregation to annual total wind power generation, shown in Figure 5 as the average value for the 10 years from 2008 to 2017, suggests that the potential is highest in the mountains, in both the Alps and the Jura, and may exceed 10 GWh in extreme cases. In the Plateau, the potential is lower (around 3 to 4 GWh), whereby zones with higher roughness length, such as urban areas, have a higher potential. Across the 10 years modelled in this work, the wind speed and wind power vary by up to 15 to 20% with respect to the 10-year mean (see Figure 6). These variations may be explained by expected interannual variations of the meteorological conditions. Furthermore, differences in the number of weather stations used for the modelling may lead to variations in the estimated average wind speed. In particular, the large increase in training stations from 2016 to 2017 (101 to 166 stations) is expected to lead to a better representation of local weather patterns. Comparing the wind speed (left axis in Figure 6) to the wind power (right axis) shows the impact of applying the logistic wind power curve, which increases the interannual variation of wind power.
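The aggregation step and the comparison with the multi-year mean can be sketched in a few lines of Python. This is a hedged illustration rather than the processing code of this study: the function names are hypothetical, hourly power is assumed to be expressed in kW, and hours whose wind speed was discarded are assumed to be stored as NaN and to contribute no generation.

```python
import numpy as np

def annual_energy_gwh(hourly_power_kw):
    """Sum hourly mean power (kW) over a year; 1 kW during 1 h equals 1 kWh.
    NaN hours (discarded wind speeds) are counted as zero generation."""
    return np.nansum(hourly_power_kw) / 1e6  # kWh -> GWh

def relative_deviation(annual_values):
    """Deviation of each year from the multi-year mean, as a fraction."""
    annual_values = np.asarray(annual_values, dtype=float)
    mean = annual_values.mean()
    return (annual_values - mean) / mean

# Example: a turbine producing a constant 500 kW for 8760 hours.
energy = annual_energy_gwh(np.full(8760, 500.0))  # 4.38 GWh
```

Applying `relative_deviation` to the ten annual values of a location gives the kind of year-to-year variation around the 10-year mean that is discussed above.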
4.3 National-scale wind power potential in Switzerland
The application of the national-scale assessment of the available area for wind turbine installation (Section 3.3) shows that less than half of the surface of Switzerland may be considered for wind installations, as 52% of the area is in the prohibited zone. No particular restrictions have been identified for half of the remaining area, while the other half is either restricted or covered by forests (see Table 4). Assuming that each wind turbine occupies 1.6 km² (1.6 km streamwise by 1 km spanwise), around 12,000 turbines could be installed if all available area (restricted + forests + other) were exploited. As Figure 7 shows, much of the prohibited area is located in the Swiss Plateau (see Figure 2 for reference), due to the high building density in this part of the country. The eastern Alps and the Jura mountains, on the other hand, show a lot of available area.
| Zone | Area (km²) | Virtual turbines | Wind potential (TWh) |
|---|---|---|---|
| Prohibited | 21,672 (52%) | | 0 |
| Restricted | 4,351 (11%) | 2,734 | 13.6 |
| Forests | 5,315 (13%) | 3,311 | 15.8 |
| Other | 9,953 (24%) | 5,985 | 23.7 |
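The figures in Table 4 can be cross-checked with a short calculation. The Python sketch below uses only the values reported in the table and the turbine footprint of 1.6 km by 1 km from Section 3.3; the variable names are illustrative.

```python
# (area in km^2, virtual turbines, wind potential in TWh) per zone, from Table 4
zones = {
    "Restricted": (4351, 2734, 13.6),
    "Forests": (5315, 3311, 15.8),
    "Other": (9953, 5985, 23.7),
}

total_area_km2 = sum(a for a, _, _ in zones.values())   # 19619 km^2
total_turbines = sum(n for _, n, _ in zones.values())   # 12030 turbines
total_twh = sum(e for _, _, e in zones.values())        # about 53.1 TWh

# Upper bound on the turbine count if every km^2 could be tiled perfectly.
footprint_km2 = 1.6 * 1.0
upper_bound_turbines = total_area_km2 / footprint_km2

# Share of the national potential and average yield per turbine in each zone.
share = {z: e / total_twh for z, (_, _, e) in zones.items()}
gwh_per_turbine = {z: 1000.0 * e / n for z, (_, n, e) in zones.items()}
```

The virtual turbine count (12,030) lies slightly below the area-based upper bound, as expected for a lattice clipped to irregular zones. The shares come out at roughly 45% (other), 30% (forests) and 26% (restricted), and the average yield per turbine is lowest in the other zone.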
The average potential of these restriction zones for each part of Switzerland (Figure 8) shows that the mountain areas (upper and lower Alps, Jura) have the highest average wind power potential. The lowest potential is found in the Plateau and in mountain valleys, confirming the observations from Figure 5. Across the different restriction zones, the other zones have the lowest potential per turbine in all parts of Switzerland, followed by the restricted areas and forests. This may be explained by the fact that other zones are located at lower altitudes and in flatter terrain, yielding lower potentials. The high estimated potential of forests may be related to the higher roughness length in these zones. In the upper Alps, restricted areas have the highest average potential, likely due to their locations at higher altitudes with higher wind speeds.
Summing the potential across all virtual wind turbines (Figure 8) shows that the Alps account for around 70% (36 TWh) of the national total potential of around 53 TWh. Half of this potential is located in the other zones and may hence be exploitable without specific geographic restrictions. In the lower Alps, forests make up another large part of the potential (9 TWh). Since the forest line marks the approximate separation between lower and upper Alps, forests contribute only a small part of the potential in the upper Alps. The Plateau follows the Alps with around 10 TWh, of which around 4 TWh are in the other zone, while the Jura may allow for the exploitation of almost 6 TWh of wind energy. Wetlands, which are strictly protected at the federal level and at the same time constitute only a small area outside of lakes and rivers, are neglected here. Across all parts of Switzerland, the other zone makes up 45% of the potential, followed by forests (30%) and restricted zones (25%).
4.4 Validation and comparison to existing studies
The estimated wind power generation is validated against measured electricity production from three wind power plants in Switzerland (see Figure 7), two of which are located in the Jura, and one in the Rhone Valley (Valais). These are the only three of Switzerland’s 40 wind power facilities with turbine heights around 100 meters (90 to 110 meters considered) and with measured electricity generation before 2018. Table 5 provides an overview of the technical features of these power plants. As the installation “Jura 2” also contained several wind turbines of lower hub heights which were decommissioned between 2013 and 2016, only the data for 2017 can be used for the validation of this installation.
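| Installation | | | | Model | | | | |
|---|---|---|---|---|---|---|---|---|
| Valais | COL | 1 | 2005 | Enercon E70 | 100 | 71 | | 2,000 |
| | MTG | 1 | 2008 | Enercon E82 | 99 | 82 | 12 | 2,000 |
| | CHA | 1 | 2012 | Enercon E101 | 99 | 101 | 13 | 3,000 |
| Jura 1 | PEU | 3 | 2010 | Enercon E82 | 108 | 80 | 12 | 2,300 |
| Jura 2 | MTC | 12 | 2010 to 2013 | Vestas V90 | 95 | 90 | 12 | 2,000 |
| | MTC | 4 | 2016 | Vestas V112 | 94 | 112 | 12 | 3,300 |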
As Figure 9 shows, the estimated annual production per turbine lies within 15% of the measured values for the two installations in the Jura (see also Table 6). For the turbines installed in the Rhone Valley, an underestimation of up to 69% is observed, particularly for the years after 2013. Part of this underestimation may be due to uncertainties in the roughness length, since these turbines are located at the boundaries of industrial areas. Furthermore, jet-like flows through the Rhone Valley, peaking at 200 meters above ground in the warm summer months Schmid et al. (2020), may lead to an increased wind speed at the modelled height of 100 m, which is not accounted for by the applied log-law. In the “Rhone knee”, the corner of the Rhone Valley, these effects are particularly pronounced, which creates major difficulties for modelling the wind speeds in this particular region Koller and Humar (2016). Finally, the rated power of the modelled wind turbine (3,050 kW) differs from the rated power of the installed turbines (see Table 5). As the rated power of the installations lies on average below that of the assumed wind turbine, the estimated power is expected to be above that of the measured data. However, this effect may be offset by different wind power curves that increase the generation at lower wind speeds. Due to the small size of the validation sample, these results cannot be considered representative.
Installation  2008  2009  2010  2011  2012  2013  2014  2015  2016  2017  mean 

Valais  45  42  37  50  39  69  68  68  66  67  56 
Jura 1  15  12  10  11  10  13  1  2  
Jura 2  8  8 
In addition to the validation against measurement data, we compare the results to another existing estimation of annual wind speeds at 100 meters height for Switzerland, published as part of the wind atlas of the Swiss Federal Office of Energy (SFOE) Koller and Humar (2016). As Figure 10 shows, the approach presented here yields higher wind speeds than those estimated by the SFOE, particularly in alpine terrain. In the Jura and in the Plateau the difference between both estimates is small, whereby the estimated wind speeds in the Plateau are slightly lower in this study than those estimated by the SFOE.
The high complexity of the wind speed patterns in mountain terrain may be regarded as the primary reason for these differences, whereby the presented ELM approach leads to higher estimates than the model based on computational fluid dynamics (CFD) used by the SFOE. In addition to the computational methods, one of the main differences between these two estimations lies in the temporal resolution of the results. While the SFOE estimate is based on average wind conditions Koller and Humar (2016), this work yields results in hourly resolution. These hourly data may be used, for example, in studies of hybrid energy systems with high shares of wind power.
5 Discussion
5.1 Methodological contributions
Despite the growing popularity of ML applications for wind speed and wind power modelling, several methodological issues remain unsolved. This includes the lack of a modelling approach to deal with (i) the spatiotemporal nature of the phenomenon, especially when wind speed is measured on irregularly spaced monitoring networks, (ii) the quantification of the prediction uncertainty, and (iii) the difficulty of managing high sampling frequencies such as the hourly one.
Here we propose an adaptation of the spatiotemporal framework originally proposed in Amato et al. (2020b), adopting ELM ensembles to individually predict each spatial coefficient map resulting from the EOF decomposition of the spatiotemporal data. The variance estimates developed in Guignard et al. (2021) were used to extend the uncertainty quantification to the spatiotemporal framework. The prediction variance was estimated through a second model based on squared residuals after their log-transformation. The ELM-based variance estimate of this second model was further used to back-transform the results. These developments were applied to hourly wind speed data for Switzerland. As shown in detail in Appendix B, the use of the regularised version of the ELM provides the opportunity to extract insightful information about the spatiotemporal model to understand its behaviour, but also to improve the explainability of the models in terms of data interpretability. In this specific case, those insights were also confirmed by the residual analysis.
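The structure of this framework can be illustrated with a compact Python sketch. It is a simplified illustration under stated assumptions, not the implementation used in this work: the data matrix is assumed to have stations in rows and hours in columns, the data are centred by the mean over stations at each time, the ELM uses a tanh hidden layer with a ridge (Tikhonov) penalty, and the ensemble is a plain average of members with different random weights. All names and hyperparameter values are hypothetical, and the variance estimation is omitted.

```python
import numpy as np

def eof_decompose(Z, n_components):
    """Z has shape (n_stations, n_times). Returns the temporal mean, the
    temporal basis functions and the spatial coefficients of each station."""
    mu_t = Z.mean(axis=0)                       # mean over stations per time
    U, s, Vt = np.linalg.svd(Z - mu_t, full_matrices=False)
    phi = Vt[:n_components]                     # (K, n_times) temporal basis
    coeffs = U[:, :n_components] * s[:n_components]  # (n_stations, K)
    return mu_t, phi, coeffs

class RidgeELM:
    """Single-hidden-layer ELM: random fixed hidden weights, ridge output layer."""
    def __init__(self, n_hidden=50, alpha=1e-2, seed=0):
        self.n_hidden, self.alpha = n_hidden, alpha
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        A = H.T @ H + self.alpha * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ y)  # ridge solution
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

def fit_predict(Z, X_train, X_new, n_components=3, n_models=10):
    """Model each spatial coefficient map with an ELM ensemble, then rebuild
    the spatiotemporal field at the new locations."""
    mu_t, phi, coeffs = eof_decompose(Z, n_components)
    pred = np.zeros((X_new.shape[0], n_components))
    for k in range(n_components):
        members = [RidgeELM(seed=m).fit(X_train, coeffs[:, k]).predict(X_new)
                   for m in range(n_models)]
        pred[:, k] = np.mean(members, axis=0)
    return mu_t + pred @ phi                    # (n_new, n_times)
```

Because only the spatial coefficients are modelled, one ensemble per retained component is sufficient to produce a full hourly series at every new location, which keeps the cost manageable for long hourly records.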
The potential wind power generation was then estimated based on the modelled wind speed to assess renewable energy potential in Switzerland. As expected, the high variance propagated in the transition phase of the logistic function can lead to very uncertain predictions. An alternative way to estimate wind power may be to spatiotemporally model directly the transformed power data. However, a significant advantage of modelling the wind speed as a first step is that the obtained results do not depend on the choice of a specific turbine height and logistic parameters describing technical specificities of the turbine through the power curve. Hence, the power estimation can easily be updated to adapt to different choices of these parameters, generating multiple turbine scenarios to support decisions related to the turbine selection.
5.2 Practical contributions
The work presented here may contribute to the development of wind power in Switzerland in several ways. First, the hourly wind profiles, estimated for 10 years at a resolution of 250 m for the entire country, provide an exhaustive database for the modelling of potential future wind turbines in the Swiss electricity grid. The hourly temporal resolution makes it possible to assess the complementarity of wind power with other renewable resources such as solar photovoltaics, and to quantify the potential impact of an increased share of wind power on the stability of the electricity grid.
Second, the analysis of the annual wind power generation potential (Section 4.3) may be set into context with the goals of the “Swiss Energy Perspectives”, aiming at a wind power generation of 4.3 TWh by 2050 BFE (2020). This target corresponds to an increase of the current production by a factor of 30 Bundesamt für Energie (2018). With an average annual wind power potential of 4.4 GWh per turbine, this target may be achieved through the installation of around 1,000 wind turbines. This target is rather low compared to other European countries WindEurope Business Intelligence (2021), potentially due to the large part of the country being covered by mountains, as well as strong societal and political concerns. The target of 4.3 TWh hence lies well within the potentials identified in Section 4.3, and may be achieved by realising less than 20% of the potential in the other zone.
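The orders of magnitude in this comparison can be verified with a short calculation. The Python sketch below takes the average per-turbine potential as about 4.4 GWh per year (the national total of about 53 TWh divided by about 12,000 virtual turbines, Section 4.3) and uses the potential of the other zone from Table 4; the variable names are illustrative.

```python
target_twh = 4.3          # wind power target for 2050
gwh_per_turbine = 4.4     # average annual potential per virtual turbine
other_zone_twh = 23.7     # potential of the zone without identified restrictions
increase_factor = 30      # target relative to current production

turbines_needed = target_twh * 1000.0 / gwh_per_turbine   # just under 1,000
share_of_other_zone = target_twh / other_zone_twh         # below 20%
implied_current_twh = target_twh / increase_factor        # roughly 0.14 TWh
```

Under these figures the target corresponds to a little under 1,000 average turbines and to less than one fifth of the potential identified in the other zone.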
Third, overlaying the information on wind power generation potential, the variance of this potential and the available area for turbine installation may serve to identify suitable areas for future wind farms in Switzerland. The variance plays a key role in this process, as potential wind farms in areas with low variance may allow for a higher planning reliability.
The work presented in this study is an assessment of the potential wind speeds and wind power generation. It hence does not represent an installation recommendation for wind turbines in a specific location, nor does it replace any local measurements in future wind projects. Instead, it is intended for use in studies of future electricity grids, by the scientific community or by energy planners, and to provide further insights for policy makers in the development of national renewable energy targets, while accounting for the need to protect natural systems, which are often endangered by power plant expansions.
5.3 Limitations and further work
The estimation of the potential generation of wind turbines of 100 meters hub height is limited by the data availability of wind speeds at 10 meters only. This requires the use of physical and empirical formulas to estimate wind power generation, namely the log-law and the wind power curve. Propagating the variances through these formulas increases the variance of the estimated potential. The log-law further requires the estimation of roughness length, which is approximated from land use data, leading to further uncertainties. Additionally, wind phenomena occurring at the target height of 100 m, such as thermally induced winds in mountain valleys, are not taken into account through the extrapolation via the log-law (see Section 4.4), and can only be considered if wind measurements are available at 100 meters height.
Future work may aim at a further validation and calibration of the proposed model by collecting and integrating hourly monitored data of wind speed and wind power generation at heights above 10 m, which are currently unavailable for Switzerland. The estimated generation, variance and available area may further be combined to develop a suitability indicator for wind power, accounting for these three factors. Finally, the proposed model may be expanded, at national scale or for particular areas of interest, to account for different hub heights and wind turbines. This is the main advantage of using the physical and empirical formulas mentioned above. Such a tool may be used to choose suitable turbine models to maximise the wind power output at a specific location.
6 Conclusions
In this paper we propose an estimation of hourly wind energy potential at the Swiss national scale. The application was developed using a newly introduced framework enabling spatiotemporal prediction of data measured on irregularly spaced monitoring networks. Particular attention was paid to uncertainty quantification and its propagation throughout the entire modelling procedure. Specifically, ten years of wind speed measurements, collected at an hourly frequency on three sets of up to 208 monitoring stations, were used. The data were interpolated using advanced spatiotemporal techniques, in order to estimate wind speed at unsampled locations. Then, the resulting wind field was used to estimate hourly wind power potential on a national scale on a regular grid with a spatial resolution of 250 meters.
The results showed that the wind power potential is highest in the mountain areas of the Alps and the Jura, of which the wind speeds in the Jura mountains have an overall lower variance. The conversion of wind speed to wind power through the power curve leads to high uncertainties whenever the wind speed is in the transition region of the logistically approximated power curve. Across Switzerland, we estimate an annual average power generation for turbines at 100 meters hub height of 4.4 GWh per turbine, with interannual variations of up to 15 to 20%. A validation has shown that the estimated potential deviates by less than 15% from the measured annual electricity yield in the Jura, while there are some limitations for the estimation of wind power in the Rhone Valley.
The virtual installation of wind turbines across all available area with a spacing of 1.6 km by 1 km yields a potential of around 12,000 turbines on around half of the Swiss terrain. About 1,000 of these turbines would be sufficient to fulfil the target of the Swiss Energy Perspectives of 4.3 TWh by 2050, which may be realised by installing wind turbines exclusively in areas without identified restrictions.
The high spatiotemporal resolution of the results, as hourly values for 10 years for pixels of 250 m, allows for the study of future electricity systems with an increased share of wind power. A combination of the wind power potential, its uncertainty and the available area for turbine installation further enables the assessment of the suitability of different areas for future wind projects. With these applications, the current work aims to support the development of wind power as part of a fully renewable future energy system in Switzerland.
Author Contributions
F.A. and F.G. conceived the main conceptual ideas. F.G. preprocessed the data and developed the theoretical formalism. F.A. designed the experiments and performed the calculations. A.W. postprocessed, analyzed and validated the computational results. F.A., F.G. and A.W. wrote the original draft and discussed the results. N.M., J.L.S. and M.K. carried out the supervision and funding acquisition. All authors reviewed the manuscript and gave final approval for its publication.
Conflict of interest
The authors declare that they have no conflict of interest.
Acknowledgements.
The research presented in this paper was supported by the National Research Program 75 “Big Data” (PNR75, Project No. 167285 “HyEnergy”) of the Swiss National Science Foundation (SNSF).
References
 Spatiotemporal evolution of global surface temperature distributions. In Proceedings of the 10th International Conference on Climate Informatics, pp. 37–43. Cited by: §1.
 A novel framework for spatiotemporal prediction of environmental data using deep learning. Scientific Reports 10 (1), pp. 1–11. Cited by: §1, §2.1.3, §5.1.
 Machine learning and geographic information systems for large-scale wind energy potential estimation in rural areas. In Journal of Physics: Conference Series, Vol. 1343, pp. 012036. Cited by: §1.
 A retrospective analysis of benefits and impacts of us renewable portfolio standards. Energy Policy 96, pp. 645–660. Cited by: §1.
 Energieperspektiven 2050+. Zusammenfassung der wichtigsten Ergebnisse.. Technical report Bundesamt für Energie BFE, Bern, Switzerland. External Links: Link Cited by: §1, §5.2.
 Environmental laws and climate action: a case for enacting a framework climate legislation in india. In International Forum for Environment, Sustainability and Technology (iFOREST), Cited by: §1.
 Wind turbine power curves based on the Weibull cumulative distribution function. Applied Sciences 8 (10), pp. 1757. Cited by: §2.3, §3.2.2.
 Estimating aerodynamic roughness (zo) in mixed grassland prairie with airborne lidar. Canadian Journal of Remote Sensing 37 (4), pp. 422–428. Cited by: A.3  Roughness estimation.
 Transformation and weighting in regression. Vol. 30, CRC Press. Cited by: §2.2.2.
 Wind speed spatial estimation for energy planning in sicily: a neural kriging application. Renewable Energy 33 (6), pp. 1251–1266. External Links: ISSN 09601481, Document, Link Cited by: §1.
 Geostatistics: modeling spatial uncertainty. Vol. 497, John Wiley & Sons. Cited by: B.2  Residuals Analysis.
 Statistics for spatiotemporal data. Wiley. Cited by: §2.1.1.
 Regularized extreme learning machine. In 2009 IEEE symposium on computational intelligence and data mining, pp. 389–395. Cited by: §2.1.2.
 Kernel ridge regression with active learning for wind speed prediction. Applied Energy 103, pp. 328–340. External Links: ISSN 03062619, Document, Link Cited by: §1.
 Schweizerische Elektrizitätsstatistik 2018. Technical report Bundesamt für Energie BFE (DE / FR). Cited by: §5.2.
 Konzept Windenergie. Basis zur Berücksichtigung der Bundesinteressen bei der Planung von Windenergieanlagen. Technical report Bern, Switzerland (de). Cited by: §1, §3.3, Table 3.
 Generalized crossvalidation as a method for choosing a good ridge parameter. Technometrics 21 (2), pp. 215–223. Cited by: §3.2.1.
 Satellite remote sensed data to improve the accuracy of statistical models for wind resource assessment. In European Wind Energy Association Annual Conference and Exhibition 2015, Cited by: §3.2.2, A.3  Roughness estimation.
 Uncertainty quantification in extreme learning machine: analytical developments, variance estimates and confidence intervals. Neurocomputing 456, pp. 436–449. External Links: Document Cited by: §1, §2.1.2, §2.2.1, §5.1, B.2  Residuals Analysis.
 Investigating the time dynamics of wind speed in complex terrains by using the fisher–shannon method. Physica A: Statistical Mechanics and its Applications 523, pp. 611–621. Cited by: §3.1.
 Variance function estimation in regression: the effect of estimating the mean. Journal of the Royal Statistical Society: Series B (Methodological) 51 (1), pp. 3–14. Cited by: §2.2.2.
 The elements of statistical learning: data mining, inference, and prediction. Springer series in statistics, Springer. External Links: ISBN 9780387848846, LCCN 2008941148 Cited by: B.1  ELM Tikhonov factor analysis.
 Dokumentation Geodatenmodell Windenergieanlagen. Technical report Technical Report 1.0 rev, Bundesamt für Energie BFE, Bern, Switzerland. External Links: Link Cited by: Table 5.
 Practical confidence and prediction intervals. In Advances in Neural Information Processing Systems 9, M. C. Mozer, M. I. Jordan, and T. Petsche (Eds.), pp. 176–182. External Links: Link Cited by: §2.2.2.
 Extreme learning machine: theory and applications. Neurocomputing 70 (13), pp. 489–501. Cited by: §2.1.2.
 An introduction to statistical learning. Vol. 112, Springer. Cited by: B.1  ELM Tikhonov factor analysis.
 An approach to producing space–time covariance functions on spheres. Technometrics 49 (4), pp. 468–479. Cited by: §3.1.
 Wind speed distribution selection–a review of recent development and progress. Renewable and Sustainable Energy Reviews 114, pp. 109290. Cited by: §3.1.
 Analysis and modelling of spatial environmental data. Vol. 6501, EPFL press. Cited by: Table 1, B.2  Residuals Analysis.
 Hitting the target but missing the mark: unintended environmental consequences of the paris climate agreement. Frontiers in Environmental Science 7, pp. 151. Cited by: §1.
 Windatlas Schweiz. Schlussbericht Meteotest (de). Cited by: §3.3, Figure 10, §4.4, §4.4, §4.4.
 Multifractal analysis of the time series of daily means of wind speed in complex regions. Chaos, Solitons & Fractals 109, pp. 118–127. Cited by: §1.
 A new algorithm for redundancy minimisation in geoenvironmental data. Computers and Geosciences 133, pp. 104328. Cited by: §3.1.
 Wind resource estimation—an overview. Wind Energy 6 (3), pp. 261–271. External Links: Document, Link, https://onlinelibrary.wiley.com/doi/pdf/10.1002/we.94 Cited by: §1.
 Extreme learning machine: a robust modeling technique? yes!. In International WorkConference on Artificial Neural Networks, pp. 17–35. Cited by: §2.1.2, §2.1.2.
 Extreme learning machines for spatial environmental data. Computers and Geosciences 85, pp. 64–73. Cited by: §2.1.2.
 Ensemble based extreme learning machine. IEEE Signal Processing Letters 17 (8), pp. 754–757. Cited by: §2.1.2.
 A combined forecasting model for time series: application to shortterm wind speed forecasting. Applied Energy 259, pp. 114137. External Links: ISSN 03062619, Document, Link Cited by: §1.
 Modelling the impact of urban growth on agriculture and natural land in italy to 2030. Applied Geography 91, pp. 156–167. Cited by: §1.
 Energy investment needs for fulfilling the paris agreement and achieving the sustainable development goals. Nature Energy 3 (7), pp. 589–599. Cited by: §1.
 [41] (Website) External Links: Link Cited by: §3.1.
 Optimal turbine spacing in fully developed wind farm boundary layers. Wind Energy 15 (2), pp. 305–317 (en). External Links: ISSN 10991824, Link, Document Cited by: §3.3.
 On the derivatives of the sigmoid. Neural Networks 6 (6), pp. 845–853. Cited by: §2.3.
 Wind energy: renewable energy and the environment. CRC press. Cited by: §1.
 The new climate policies of the european union: internal legislation and climate diplomacy. Asp/Vubpress/Upa. Cited by: §1.
 A note on the delta method. The American Statistician 46 (1), pp. 27–29. Cited by: §2.2.2, §2.3.
 Statistical data analytics: foundations for data mining, informatics, and knowledge discovery. John Wiley and Sons. Cited by: §2.1.2, §3.2.1.
 Spatiotemporal covariance and crosscovariance functions of the great circle distance on a sphere. Journal of the American Statistical Association 111 (514), pp. 888–898. Cited by: §3.1.
 Die energieperspektiven für die schweiz bis 2050. Energienachfrage und Elektrizitätsangebot in der Schweiz 2000 2050. Cited by: §1.
 Spatial prediction of monthly wind speeds in complex terrain with adaptive general regression neural networks. International Journal of Climatology 33 (7), pp. 1793–1804. External Links: Document, Link, https://rmets.onlinelibrary.wiley.com/doi/pdf/10.1002/joc.3550 Cited by: §1, §3.1, Table 8.
 Energy system transformations for limiting endofcentury warming to below 1.5 c. Nature Climate Change 5 (6), pp. 519–527. Cited by: §1.
 Semiparametric regression. Cambridge university press. Cited by: §2.2.2, §2.2.2.
 Territorial fragmentation and renewable energy source plants: which relationship?. Sustainability 12 (5), pp. 1828. Cited by: §1.
 The italian experience of the covenant of mayors: a territorial evaluation. Sustainability 13 (3), pp. 1289. Cited by: §1.
 Diurnal Valley Winds in a Deep Alpine Valley: Observations. Atmosphere 11 (1), pp. 54 (en). Note: Number: 1 Publisher: Multidisciplinary Digital Publishing Institute External Links: Link, Document Cited by: §4.4.
 Renewable energy targets may undermine their sustainability. Nature Climate Change 10 (11), pp. 974–976. Cited by: §1.
 Effects of turbine spacing on the power output of extended windfarms. Wind Energy 19 (2), pp. 359–370 (en). External Links: ISSN 10991824, Link, Document Cited by: §3.3.
 swissALTI3D  The high precision digital elevation model of Switzerland. External Links: Link Cited by: §3.3, Table 3.
 swissTLMRegio. The smallscale landscape model of Switzerland. Technical report Bundesamt für Landestopografie swisstopo. Cited by: Table 3.
 Who invented the delta method?. The American Statistician 66 (2), pp. 124–127. Cited by: §2.2.2, §2.3.
 Statistical learning approach for wind resource assessment. Renewable and Sustainable Energy Reviews 56, pp. 836–850. External Links: ISSN 13640321, Document, Link Cited by: §1.
 Statistical learning approach for wind speed distribution mapping: the uk as a case study. Springer International Publishing. Cited by: §1.
 Assessing accuracy and geographical transferability of machine learning algorithms for wind speed modelling. In The Annual International Conference on Geographic Information Science, pp. 297–310. Cited by: §1.
 Reformulation of parameters of the logistic function applied to power curves of wind turbines. Electric Power Systems Research 137, pp. 51–58. Cited by: §2.3.
 Mountain meteorology: fundamentals and applications. Oxford University Press. Cited by: §2.3.
 Spatio-temporal statistics with R. Chapman and Hall/CRC The R Series, CRC Press, Taylor and Francis Group. Cited by: §2.1.1.
 Enercon E-101. Note: https://en.windturbinemodels.com/turbines/130enercone101#datasheet [Online; accessed 30 March 2021] Cited by: §3.2.2.
 Wind energy in Europe - 2020 Statistics and the outlook for 2021-2025. Technical report WindEurope. Cited by: §5.2.
 An improved combination approach based on AdaBoost algorithm for wind speed time series forecasting. Energy Conversion and Management 160, pp. 273–288. External Links: ISSN 0196-8904, Document, Link Cited by: §1.
 Spatial justice and the land politics of renewables: dispossessing vulnerable communities through solar energy megaprojects. Geoforum 76, pp. 90–99. External Links: ISSN 0016-7185, Document, Link Cited by: §1.
Appendix A
A.1  Exploratory Data Analysis
This section shows the main outputs of the exploratory data analysis performed on the wind speed data. Spatial plots and time series are used to highlight the presence of spatiotemporal structures and dependencies in the data. Temporal correlations are further explored for a set of stations located at different altitudes via autocorrelation functions (ACF). Distributional properties are explored using kernel density estimation (KDE).
Name  Location  Altitude [m] 

AND  Andeer  987 
COM  Acquarossa Comprovasco  575 
DAV  Davos  1594 
DOL  La Dôle  1669 
ENG  Engelberg  1035 
EVI  Evionnaz  482 
FAH  Fahy  596 
INT  Interlaken  577 
LUZ  Luzern  454 
NABBER  Bern  535 
NABDUE  Dübendorf  432 
SCM  Schmerikon  408 
STG  St. Gallen  775 
TSG  Arosa (Tschuggen)  2040 
ZER  Zermatt  1638 
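The two diagnostics named at the start of this section can be illustrated with a minimal Python sketch. It computes a sample ACF and a Gaussian KDE on a synthetic hourly series; the synthetic data, the function names and the bandwidth value are assumptions made for the example and are not taken from the analysis itself.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation of a 1-D series for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

def gaussian_kde(x, grid, bandwidth):
    """Gaussian kernel density estimate of the samples x, evaluated on grid."""
    x = np.asarray(x, dtype=float)
    z = (grid[:, None] - x[None, :]) / bandwidth
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(x) * bandwidth * np.sqrt(2 * np.pi))

# Synthetic hourly wind speed with a daily cycle (illustrative data only)
rng = np.random.default_rng(0)
hours = np.arange(24 * 60)
speed = np.abs(3.0 + 1.5 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 1, hours.size))

rho = acf(speed, max_lag=48)                        # a daily cycle shows up as a peak near lag 24
grid = np.linspace(0.0, 12.0, 241)
density = gaussian_kde(speed, grid, bandwidth=0.3)  # bandwidth chosen by hand here
```

In practice the ACF would be computed station by station, so that stations at different altitudes can be compared on the same lag axis.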
A.2  Input Features
This section provides details on the features included in the input space and the corresponding bandwidth parameters used to extract them from the digital elevation model. A correlation matrix showing relationships among pairs of features is also reported.
ID  Feature  Bandwidth 

3  Difference of Gaussian (small scale)  
4  Difference of Gaussian (medium scale)  
5  Difference of Gaussian (large scale)  
6  Slope (small scale)  
7  Slope (medium scale)  
8  Slope (large scale)  
9  Directional derivative NS (small scale)  
10  Directional derivative NS (large scale)  
11  Directional derivative EW (small scale)  
12  Directional derivative EW (large scale) 
A.3  Roughness estimation
Wind speed transformation in the Earth's boundary layer under neutrally stable conditions and dynamic equilibrium has been modelled using the generalized log-law presented in equation (23), which assumes that the mean velocity profile is a function of height, surface roughness, friction velocity, zero-plane displacement and von Kármán's constant, usually taken equal to 0.4 Grassi et al. (2015). In this paper, roughness has been estimated from the CLC by associating a roughness value with each land cover class, as reported in Table 9. Such values have been widely used and validated, see e.g. Brown and Hugenholtz (2011).
Land Cover Class  Roughness  Land Cover Class  Roughness

Continuous urban fabric  1.2  Land principally occupied by agriculture  0.3 
Discontinuous urban fabric  0.5  Broad-leaved forest  0.75 
Industrial or commercial units  0.5  Coniferous forest  0.75 
Roads and rail networks and associated land  0.075  Mixed forest  0.75 
Port areas  0.5  Natural grasslands  0.03 
Airports  0.005  Moors and heathland  0.03 
Mineral extraction sites  0.005  Transitional woodland-shrub  0.6 
Construction sites  0.5  Beaches, dunes, sands  0.0003 
Green urban areas  0.6  Bare rocks  0.005 
Sport and leisure facilities  0.5  Sparsely vegetated areas  0.005 
Non-irrigated arable land  0.05  Glaciers and perpetual snow  0.001 
Vineyards  0.1  Inland marshes  0.05 
Fruit trees and berry plantations  0.1  Water courses  0.00002 
Pastures  0.03  Water bodies  0.00002 
Complex cultivation patterns  0.3 
As a result, two roughness maps have been used, one corresponding to the land covers mapped in the 2012 CLC and one to those mapped in the 2018 CLC. The roughness map corresponding to the 2018 CLC is shown in Figure 16.
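To make the role of the roughness values concrete, the sketch below applies the standard neutral log-law ratio, u(z2)/u(z1) = ln((z2 - d)/z0) / ln((z1 - d)/z0), in which friction velocity and von Kármán's constant cancel. It is assumed, not verified here, that this matches the form of equation (23). The function name, the reference and target heights, and the reading of the tabulated roughness values as lengths in metres are all assumptions made for illustration.

```python
import math

# Three roughness values copied from the land cover table above
# (assumed here to be roughness lengths in metres).
ROUGHNESS = {
    "Pastures": 0.03,
    "Coniferous forest": 0.75,
    "Water bodies": 0.00002,
}

def log_law_speed(u_ref, z_ref, z_target, z0, d=0.0):
    """Extrapolate a mean wind speed from height z_ref to z_target.

    u_ref: speed measured at z_ref; z0: surface roughness;
    d: zero-plane displacement. Valid for neutral conditions only.
    """
    if min(z_ref, z_target) - d <= z0:
        raise ValueError("both heights must exceed d + z0")
    return u_ref * math.log((z_target - d) / z0) / math.log((z_ref - d) / z0)

# Hypothetical case: 5 m/s measured at 10 m, extrapolated to 100 m
for name, z0 in ROUGHNESS.items():
    u_hub = log_law_speed(u_ref=5.0, z_ref=10.0, z_target=100.0, z0=z0)
    print(f"{name}: {u_hub:.2f} m/s at 100 m")
```

The rougher the surface, the stronger the vertical shear, so the same measured speed extrapolates to a higher value over forest than over water.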
Appendix B
B.1  ELM Tikhonov factor analysis
Regularised ELM can be interpreted as a ridge regression James et al. (2013); Hastie et al. (2009) in a random feature space. This suggests that the Tikhonov factors can be used to increase the explainability of the spatiotemporal models. Tikhonov factors for the first 25 EOF components of each model are presented in Figure 17 as matrices, with a linear colour scale following the order of magnitude of the factor. In the figure, a factor in yellow indicates the selection of a very large regularisation parameter, which shrinks the output weights of the ELM towards zero, hence suggesting the absence of structure in the corresponding modelled spatial coefficient map. It is worth noting that the corresponding model variance will also tend to zero. In contrast, dark tones correspond to regularisation parameters closer to zero, hence to a behaviour closer to that of the classical LS procedure. This suggests that the regularised ELM has identified a spatial structure in the corresponding EOF coefficients.
The matrix of the first model in Figure 17 tends to be less sparse with more recent data. This may be related to the increasing number of training stations over the different datasets. Reading the matrices vertically, it is easy to verify that the first components provide the major contribution in terms of data variability and of spatially structured information. Reading the matrices horizontally reveals the variability within ensembles for some components of the first component group.
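The ridge-regression view can be sketched in a few lines of Python. This is a generic single-hidden-layer ELM with a tanh activation, written only to show how the Tikhonov factor controls the shrinkage of the output weights; the activation, the layer size, the data and the function names are assumptions and may differ from the models used in the paper.

```python
import numpy as np

def fit_elm(X, y, n_hidden, alpha, rng):
    """Fit an ELM whose output weights solve a ridge (Tikhonov) problem."""
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random biases, never trained
    H = np.tanh(X @ W + b)                       # random feature map
    # Ridge solution: beta = (H^T H + alpha I)^{-1} H^T y
    beta = np.linalg.solve(H.T @ H + alpha * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def predict_elm(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

# Synthetic 2-D regression problem (illustrative data only)
rng_data = np.random.default_rng(0)
X = rng_data.uniform(-1, 1, size=(200, 2))
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1] + 0.1 * rng_data.normal(size=200)

norms = []
for alpha in (1e-3, 1.0, 1e8):
    # Same seed each time, so only the Tikhonov factor changes
    model = fit_elm(X, y, n_hidden=50, alpha=alpha, rng=np.random.default_rng(1))
    norms.append(np.linalg.norm(model[2]))
    print(f"alpha={alpha:g}  ||beta||={norms[-1]:.3g}")
```

With the largest factor the output weights, and therefore the predictions, collapse towards zero, which is the situation read as an absence of spatial structure; with a small factor the fit approaches ordinary least squares.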
Other patterns can be identified for the single datasets. The MSWind 08-12 data in Figure 17(a) mainly takes advantage of the first five components, which from the EOF decomposition are known to cumulate 65% of data variability. Except for some isolated members in components 7 and 13, no other information is used by the model. Figure 17(b) shows that for the MSWind 13-16 data the first three components and the sixth one are chosen by the model for all members. Again, they represent about 65% of the variability. For the MSWind 17 data in Figure 17(c), components 1 to 6, 9, 11 and 17 are fully contributing to the model. They correspond to 73% of the variability. Components 8, 13, 19, 20 and 24 are only partly contributing. All other components are automatically discarded by the regularisation mechanism. For the three datasets, the regularisation of the very first component of the original data model (which contains seasonal cycles that depend only weakly on space) varies across ensemble members, hence denoting some hesitation of the model. For the second model, based on log-squared residuals, the behaviour is more contrasted.
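The cumulated variability quoted above can be obtained from the singular values of the data matrix. The following is a minimal sketch of an EOF decomposition by SVD on synthetic data; the centring convention (removing the temporal mean of each station), the array layout and the function name are assumptions, and the numbers it produces are unrelated to the MSWind datasets.

```python
import numpy as np

def eof_decomposition(Z):
    """EOF decomposition of Z with shape (n_times, n_stations)."""
    Z = np.asarray(Z, dtype=float)
    Zc = Z - Z.mean(axis=0)                    # assumption: remove each station's temporal mean
    U, s, Vt = np.linalg.svd(Zc, full_matrices=False)
    temporal_basis = U                         # (n_times, k), orthonormal columns
    spatial_coeffs = (s[:, None] * Vt).T       # (n_stations, k), one column per component
    variance_fraction = s**2 / np.sum(s**2)    # share of variability per component
    return temporal_basis, spatial_coeffs, variance_fraction, Zc

# Synthetic field: one daily pattern with station-dependent amplitude, plus noise
rng = np.random.default_rng(0)
t = np.arange(24 * 30)
n_stations = 40
daily = np.sin(2 * np.pi * t / 24)[:, None] * rng.normal(size=n_stations)[None, :]
Z = 3.0 + 2.0 * daily + 0.5 * rng.normal(size=(t.size, n_stations))

basis, coeffs, frac, Zc = eof_decomposition(Z)
cumulated = np.cumsum(frac)
# Smallest number of leading components reaching 65% of the variability
n_keep = int(np.searchsorted(cumulated, 0.65) + 1)
```

Each column of `coeffs` plays the role of a spatial coefficient map known only at the stations; in the setting described above, it is these columns that are modelled in space, one component at a time.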
B.2  Residuals Analysis
Dataset  Statistic  Raw (train)  Mod. (train)  Res. (train)  Raw (test)  Mod. (test)  Res. (test)

MSWind 08-12  Min.  0.10  -0.97  -9.08  0.10  -0.25  -11.83
MSWind 08-12  Mean  2.33  2.33  0.00  2.81  2.71  0.10
MSWind 08-12  Max.  38.08  20.22  22.13  42.85  18.30  34.54
MSWind 13-16  Min.  0.10  -1.26  -11.62  0.10  -0.71  -10.11
MSWind 13-16  Mean  2.56  2.56  0.00  2.42  2.78  -0.37
MSWind 13-16  Max.  40.43  19.59  29.22  25.13  16.88  16.89
MSWind 17  Min.  0.10  -2.88  -8.66  0.10  -2.69  -17.18
MSWind 17  Mean  2.30  2.30  0.00  2.56  2.50  0.06
MSWind 17  Max.  34.15  26.32  24.44  34.67  20.35  25.04
A careful analysis of the residuals is carried out. Figure 18 displays histograms of the raw data, modelled data and residuals for the training and test sets, for the three periods of study. Note that the plots are zoomed in for visualisation purposes; the actual ranges are reported in Table 10, together with the empirical means. The models predict negative values for the three datasets. Although a negative wind speed has no physical meaning, the histograms show that this happens quite rarely. These predicted values have been set to zero for the purpose of power estimation. The training residual means are all zero and the test ones are close to zero. However, each residual distribution has its mode below zero and is slightly skewed.
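The summary statistics and the zero-clipping step can be sketched as follows. The arrays are hypothetical values invented for the example, the residual is assumed to be defined as raw minus modelled, and the variable names are illustrative.

```python
import numpy as np

def summarise(values):
    """Minimum, mean and maximum of an array, as reported per dataset."""
    values = np.asarray(values, dtype=float)
    return {"min": values.min(), "mean": values.mean(), "max": values.max()}

raw = np.array([0.1, 1.2, 2.5, 0.4, 6.3])        # hypothetical measured speeds
modelled = np.array([-0.3, 1.0, 2.9, 0.6, 5.1])  # hypothetical predictions, one negative
residuals = raw - modelled                        # assumed sign convention

for name, values in (("raw", raw), ("modelled", modelled), ("residuals", residuals)):
    print(name, summarise(values))

# Negative predictions have no physical meaning: set them to zero
# before feeding the speeds to a power curve.
speed_for_power = np.clip(modelled, 0.0, None)
```

Clipping only touches the negative predictions, so the summary statistics of the unclipped model output remain the ones to report when assessing the model itself.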
Spatiotemporal variography analysis is performed on the training data for some chosen months. Figure 19 visualises the semivariograms for the raw data, the model predictions and their residuals. The model reproduces well the spatiotemporal dependencies detected by variography for the selected months. The sill is lower for the modelled data, hence suggesting a substantial variability loss, possibly due to the noisy nature of the raw data. The semivariograms of the residuals are close to flat, with a residual temporal structure. Along the spatial axis, almost a pure nugget effect is observed, although a residual structure could subsist for January 2017. Globally, the patterns observed in the raw data semivariograms are reproduced in the corresponding semivariograms of the modelled data, up to the sill shift. This is even more striking when looking at longer temporal lags, e.g. for January 2017, where a high similarity is observed between the semivariogram shapes, see Figure 20.
Month  Year  Raw data  Modelled data  Residuals

June  2008  2.85  1.01  1.51 
October  2013  6.66  2.34  3.46 
April  2015  6.62  2.56  3.46 
January  2017  6.61  3.40  2.71 
Spatial variography Kanevski and Maignan (2004) is also performed on some spatial coefficients of the EOF components before and after ELM modelling. Figure 21 displays such an analysis for the MSWind 17 dataset for the first three components, which contain most of the variability. The (spatial) omnidirectional semivariogram is computed on the spatial coefficients obtained directly from the EOF decomposition of the original spatiotemporal data (solid line). It shows the presence of spatial structures in the first three components, although this is less pronounced for the very first component, which is consistent with what was observed and discussed for Figure 17. Figure 21 also shows the semivariograms of the residuals obtained from the spatial modelling with ELM ensembles (dashed line), which are close to pure nugget effects. This has two consequences. First, the component-by-component spatial modelling seems to correctly extract the spatial structures. Second, this suggests that the heteroskedastic variance estimate of the ELM ensembles satisfies its independence assumption (actually, it is sufficient to assume a vanishing covariance for this estimation, see Guignard et al. (2021)) and hence is appropriate. Finally, the cross-variograms between the residuals of component pairs fluctuate near zero (dash-dotted line). Compared with the corresponding semivariograms, this indicates that the correlation between component residual pairs is very weak Chiles and Delfiner (2009) and thus satisfies the additional assumption for spatiotemporal model variance estimation, which is necessary to ensure that no variability comes from spatial model interactions. The other components, which contain far less variability, can be exempted from such an analysis without too much risk.
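The two diagnostics used in this paragraph can be written as one estimator. The sketch below computes an omnidirectional empirical cross-variogram, which reduces to the ordinary semivariogram when both arguments are the same variable. The station coordinates and residuals are synthetic independent noise, chosen so that the expected behaviour (a flat semivariogram near the variance and a cross-variogram near zero) is known in advance; the function name and the lag bins are assumptions.

```python
import numpy as np

def cross_variogram(coords, a, b, bin_edges):
    """Omnidirectional empirical cross-variogram of a and b.

    gamma(h) = mean over pairs at distance ~h of 0.5 * (a_i - a_j) * (b_i - b_j).
    Passing the same array for a and b gives the usual semivariogram.
    """
    coords = np.asarray(coords, dtype=float)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    i, j = np.triu_indices(len(coords), k=1)          # every pair counted once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    half_prod = 0.5 * (a[i] - a[j]) * (b[i] - b[j])
    which = np.digitize(dist, bin_edges) - 1          # lag-bin index of each pair
    gamma = np.full(len(bin_edges) - 1, np.nan)
    for k in range(len(gamma)):
        mask = which == k
        if mask.any():
            gamma[k] = half_prod[mask].mean()
    return gamma

# Synthetic stations and two independent residual fields (illustrative only)
rng = np.random.default_rng(0)
coords = rng.uniform(0, 1, size=(400, 2))
res_1 = rng.normal(size=400)
res_2 = rng.normal(size=400)
edges = np.linspace(0.0, 0.5, 6)

gamma_11 = cross_variogram(coords, res_1, res_1, edges)  # flat: pure nugget effect
gamma_12 = cross_variogram(coords, res_1, res_2, edges)  # fluctuates near zero
```

A residual semivariogram that rises with distance instead of staying flat would indicate spatial structure left unmodelled, and a cross-variogram far from zero would indicate interaction between component residuals.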