1 Introduction
1.1 Context
In the face of climate change, worldwide efforts are being undertaken to reduce carbon emissions (IPCC, 2018). A common roadmap to sustainability is to decarbonise electricity supply and electrify other sectors such as transport and heating (Staffell, 2017). In many countries, an essential part of this strategy is an increased use of variable renewable energy (VRE) generation such as solar and wind (Jacobson et al., 2015; National Grid, 2017). Hence, while electricity demand has always exhibited some weather-dependence (see e.g. Thornton et al., 2017), the same increasingly holds for supply.
To aid in questions of energy strategy, decision-makers frequently employ energy system models (ESMs), computer programs that simulate the energy system of a given geographical region (Ventosa et al., 2005). Power system models (PSMs) form a subset concerned primarily with the electricity sector. A common use of PSMs is to calculate the optimal generation mix by minimising the sum of installation and generation costs while meeting demand (Stoft, 2002; Bazmi & Zahedi, 2011; Ringkjøb et al., 2018). Models considering renewables require coherent weather timeseries such as wind speeds or solar irradiances as inputs.
1.2 The computational cost of weather & climate variability
Recent studies indicate that robust power system planning under natural climate variability requires long samples of demand and weather data (spanning multiple decades). In particular, characteristics of power systems, such as the optimal installed capacities of different generation technologies, may be highly dependent on which year of data is used. Hence, power systems designed using single-year simulations may be suboptimal in the long run (Bloomfield et al., 2016; Pfenninger, 2017; Zeyringer et al., 2018; Wohland et al., 2018). The degree of this dependence is expected to increase as more VRE generation is employed (Staffell & Pfenninger, 2018; Collins et al., 2018).
The need for long samples poses a computational challenge, since accurately modelling systems with significant shares of VRE generation also requires high temporal resolution. Previous studies indicate that models with low resolution fail to capture VRE output fluctuations and underestimate required flexible and dispatchable generation capacity (De Jonghe et al., 2011; Poncelet et al., 2016; Collins et al., 2017; Kotzur et al., 2018). For many realistic PSMs, it is computationally infeasible to solve the optimisation problem using both a long sample of data and a high temporal resolution (Pfenninger, 2017).
1.3 Established data reduction approaches
Various approaches to reduce computational cost without lowering temporal resolution exist. One strategy is the soft-linking of a long-term planning model with a more detailed simulation operating on a shorter timescale (Ringkjøb et al., 2018; Zeyringer et al., 2018; Collins et al., 2017). Another is to run a model with a smaller number of representative periods (e.g. days or weeks) obtained by clustering the full dataset. Numerous studies explore the efficacy of such approaches in reproducing model outputs at reduced computational expense (de Sisternes & Webster, 2013; Pfenninger, 2017; Nahmmacher et al., 2016; Kotzur et al., 2018; Härtel et al., 2017; Poncelet et al., 2017). They arrive at a number of common conclusions. Firstly, the reduction approach must not remove extremes (e.g. by “averaging away” demand peaks), since including them ensures the model determines a power system design able to meet demand in such scenarios. For this reason, heuristic adjustments such as including the maximum demand day are sometimes employed. Furthermore, clustering typically works poorly when applied to multiple decades of timeseries data, since small changes in approach may lead to large spreads in model outputs. For example, Pfenninger (2017) clusters 25 years of demand & weather data and finds that optimal wind capacity is anywhere between 0.8 and 2.8 times peak demand, depending on the choice of clustering algorithm and heuristic adjustment, giving the user virtually no indication of a good investment strategy. In addition, clustering does not generalise easily to models taking a large number of input timeseries. Consider a model with hourly resolution for 10 demand regions, 5 wind farms and 5 concentrated solar power plants. Clustering the days requires clustering vectors of length 480 (= 24 × (10 + 5 + 5)).
1.4 This paper’s contribution
This paper introduces a novel subsampling approach, called importance subsampling, that can be applied to multiple decades of demand & weather timeseries data. PSM outputs evaluated using subsamples reliably estimate those found using the full timeseries at greatly reduced computational cost. The methodology is introduced in full generality and can be either directly applied or straightforwardly generalised to a wide class of optimisation-based PSMs.
A test case is performed using a model of the United Kingdom power system created with the open-source energy modelling framework Calliope (see Pfenninger & Pickering, 2018) and 36 years of UK-wide demand and wind data. Model outputs using importance subsampling reproduce those using all 36 years of data more reliably than alternative approaches such as using individual years or clustering days.
This paper is structured as follows. Section 2 provides motivation, outlining the risks in designing power systems using short samples of demand & weather data. Section 3 describes the importance subsampling approach in detail and full generality. Section 4 evaluates its use in a test case, comparing results to those found using individual years or by clustering timeseries into a number of representative days. Section 5 discusses the results’ implications and recommends potential extensions. In the appendix (Section 6), the reader finds full descriptions of the PSM and datasets employed.
2 Motivation: climate variability in power system planning
This section provides a concrete illustration of the problem of inter-year climate-based variability in power system planning. A PSM of the UK is considered along with 36 years of hourly demand and wind levels (over the period 1980 to 2015). The demand timeseries is detrended so that the only differences between years are climate-driven (e.g. demand being higher when it is cold; see Section 6.2.1 for details).
The box-and-whisker plot in Figure 1(a) shows the distribution of optimal generation capacities of 4 technology types across the 36 individual years of historic data (i.e. for 1980, 1981, …, 2014, 2015). The white horizontal line inside the box indicates the median, and the wider dashed horizontal line shows the optimal capacity when all 36 years of data are considered at once: the best estimate for the “truly” optimal design. Two phenomena stand out. The first is the considerable inter-year spread, particularly for wind. Using 2010 data indicates almost no wind at all should be built (1.0GW), whereas the optimal configuration for 1986 has more wind capacity than any other technology (33.7GW). The second is the bias between the median of individual-year model runs and the 36-year optimal capacities (seen by comparing the white line on the box plots with the wider dashed black line). For example, 92% of individual-year model runs underestimate optimal peaking capacity, with a median underestimation of around 20%. The opposite is true for wind, for which the median individual-year wind capacity overestimates the 36-year optimum by around 15%.
Furthermore, a power system designed using one year’s optimal capacities may lead to supply capacity shortages in another. Figure 1(b) shows the distribution of the number of hours with supply capacity shortages (insufficient capacity to meet demand) as a function of the year used to determine the optimal capacities (in other words, the system is “designed” on the basis of the data of the year indicated on one axis and asked to meet the conditions of the year on the other). The risk of picking the “wrong year” on which to base system design is high: a system designed using 2004’s capacities fails to meet 93 hours of demand in 2010, and an average of 30 hours annually across all 36 years. Across the whole dataset, a power system designed using a random choice of year has a 50% chance of at least one hour of supply capacity shortage in another, and a 35% chance of shortage for at least 3 hours. This highlights the risks of informing power system strategy using single years and illustrates the need to consider longer samples.
3 Methodology
This section introduces the precise importance subsampling methodology in full generality. Like the clustering strategies discussed in Section 1.3, it is based on timeseries compression through subsampling. The specific application to a power system planning problem is presented in Section 4.
3.1 Intuition
The importance subsampling methodology works by subsampling timesteps from the full dataset. Timesteps are always selected in full: if one timeseries value at timestep $t$ is sampled, then so are all other values at that timestep. This ensures correlations between timeseries (e.g. spatial wind speed correlations or demand-wind correlations) are automatically accounted for.
Determining which timesteps to sample can be complicated. In Section 2, a system designed using 2004 data leads to supply capacity shortages in other years because all of 2004’s high-demand timesteps have high wind levels. As a result, the 2004 optimal design underestimates the amount of required backup (dispatchable) generation capacity. This highlights the fact that a cost-optimal power system design will never have excess capacity; it will just be able to meet demand in the “worst” timestep (the one requiring the largest supply). Indeed, this is the reason that heuristic adjustments such as including the maximum demand day are sometimes employed in the established subsampling approaches outlined in Section 1.3. The disadvantage of heuristic methods is that, when VRE generation is involved, they may fail to identify those timesteps truly required to ensure generation capacity adequacy. The peak demand in 2004 is not lower than in other years, so including the maximum demand day fails to mitigate the underestimation of required generation capacity. In this case, it may be necessary to instead sample a day with a slightly lower demand but virtually no wind. Determining the correct trade-off between high demand and low wind (e.g. is a timestep with 50GW demand and a 0.1 wind capacity factor “worse” than one with 55GW demand and a 0.2 wind capacity factor?) is difficult a priori and will differ between PSMs.
The importance subsampling approach identifies the essential timesteps in a systematic way by assigning each one a measure of the difficulty in meeting demand, referred to as its importance. This has two advantages. Firstly, it one-dimensionalises timeseries inputs: a timestep with many values (e.g. demand levels and renewable outputs across multiple regions) is assigned a single importance. This allows timesteps to be ranked objectively irrespective of the PSM’s structure. Secondly, the ranking means a selection of timesteps with the highest importance can be forcibly included in the modelling sample to ensure generation capacity adequacy in the resultant power system.
A timestep’s importance may depend on the power system design. In single-region models, the net demand (residual demand after all renewable generation is used) is a good candidate, since it equals the required dispatchable generation. The net demand, however, depends on the installed renewable capacity. For example, a timestep with a demand of 50GW and wind capacity factor 0.1 has a higher net demand than one with a demand of 55GW and wind capacity factor 0.3 exactly when the installed wind capacity exceeds 25GW. Since the installed wind capacity is itself a model output, a Catch-22 situation occurs: determining which timesteps to use in a simulation requires an estimate of each timestep’s importance, which itself requires the model outputs.
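The crossover in the example above can be checked directly; a minimal sketch using the numbers from the text:

```python
def net_demand(demand_gw, capacity_factor, wind_capacity_gw):
    """Residual demand left for dispatchable generation after wind."""
    return demand_gw - wind_capacity_gw * capacity_factor

# Timestep A: 50 GW demand, wind capacity factor 0.1
# Timestep B: 55 GW demand, wind capacity factor 0.3
# With 10 GW installed wind: A -> 49.0, B -> 52.0 (B has the higher net demand)
# With 40 GW installed wind: A -> 46.0, B -> 43.0 (A has the higher net demand)
# At exactly 25 GW the two coincide at 47.5 GW.
low = (net_demand(50.0, 0.1, 10.0), net_demand(55.0, 0.3, 10.0))
high = (net_demand(50.0, 0.1, 40.0), net_demand(55.0, 0.3, 40.0))
```

Which of the two timesteps is "worse" thus flips as the installed wind capacity, itself a model output, crosses 25GW.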
A two-stage approach is therefore proposed. A stage 1 optimisation run, using a random sample of timesteps, gives a rough indication of optimal power system design. This is used to estimate each timestep’s importance. A stage 2 sample is subsequently created by including a number of the timesteps with the highest importance and a random selection of those remaining. This is used in a second model run to estimate the model outputs found using the full dataset. The approach depends on the choice of importance function, which should be a proxy for the “difficulty” in meeting demand (in terms of requiring a large amount of generation capacity). In the case study (Section 4), a timestep’s variable cost is used, a choice applicable in a large class of PSMs. However, others are possible, and the choice may be tailored through consideration of the PSM or from expert knowledge.
3.2 Precise methodology
In overview (notation is defined below):
Full timeseries data: $\{x_t : t \in \mathcal{T}\}$
Random subsample: $\mathcal{S}_1$
Stage 1 estimate of optimal design: $\theta_1$
Importance of each timestep: $c_t$
Importance subsample (see Figure 2): $\mathcal{S}_2$
Stage 2 estimate of optimal design: $\hat{\theta}$
The following terminology is employed throughout this section:
$\mathcal{T} = \{1, 2, \ldots, T\}$: full set of timesteps to be sampled from.
$x_t$: timeseries input data in timestep $t$. For example, in a model with $k$ demand levels and $l$ wind capacity factors, $x_t \in \mathbb{R}^{k+l}$.
$w_t$: weight assigned to timestep $t$ in a PSM. For example, when clustering timeseries into representative periods (see Section 1.3), periods are typically weighted by their relative cluster sizes.
$\theta$: optimal power system design, e.g. a vector with each generation technology’s installed capacity.
One furthermore needs two functions:
A power system model with an optimiser that returns, given some demand & weather timeseries data and timestep weights indexed by timesteps $t \in \mathcal{S}$, the optimal power system design:
(1) $\theta = \mathrm{PSM}(\{(x_t, w_t) : t \in \mathcal{S}\})$
An importance function that assigns to each timestep a measure of the difficulty in meeting demand. This may depend on the power system design:
(2) $c_t = \mathrm{imp}(x_t, \theta)$
Importance subsampling works as follows. Suppose one wants to determine, given some (equally weighted) timeseries data $\{x_t : t \in \mathcal{T}\}$, the optimal power system design $\theta_{\mathcal{T}} = \mathrm{PSM}(\{(x_t, 1) : t \in \mathcal{T}\})$. Evaluating this function may be too computationally expensive, so $\theta_{\mathcal{T}}$ is estimated as follows:

1. Randomly sample $n$ timesteps from the $T$ to create a stage 1 subsample $\mathcal{S}_1$ with equal weights $w_t = T/n$.

2. Determine the optimal design for the stage 1 subsample:
(3) $\theta_1 = \mathrm{PSM}(\{(x_t, T/n) : t \in \mathcal{S}_1\})$

3. Calculate the importance of each timestep in the full dataset using the stage 1 design:
(4) $c_t = \mathrm{imp}(x_t, \theta_1) \quad \forall t \in \mathcal{T}$

4. Create a stage 2 sample of size $n$ using the $n_h$ timesteps with the highest importance, $\mathcal{S}_{\mathrm{high}}$, and a random selection of $n - n_h$ of those remaining, $\mathcal{S}_{\mathrm{rand}}$:
(5) $\mathcal{S}_2 = \mathcal{S}_{\mathrm{high}} \cup \mathcal{S}_{\mathrm{rand}}$
(6) with associated weights $w_t$ chosen to preserve each bin’s share of the full dataset, as given in equations (8) and (9) below. Figure 2 provides an illustration of this step.

5. Determine the optimal design for the stage 2 subsample:
(7) $\hat{\theta} = \mathrm{PSM}(\{(x_t, w_t) : t \in \mathcal{S}_2\})$
$\hat{\theta}$ is the estimate of $\theta_{\mathcal{T}}$.

Each evaluation of $\hat{\theta}$ involves 2 evaluations of optimal design (stage 1 and stage 2) using $n$ timesteps each. Since the computational cost of the other steps (sampling and evaluating importance) is typically negligible, the cost of evaluating $\hat{\theta}$ is twice that of a single PSM evaluation with sample size $n$.

The weights in step 4 account for the relative proportions of the two bins. For example, suppose an importance subsample of size $n = 120$ timesteps is created using the $n_h = 60$ timesteps with the highest importance and a random selection of 60 from those remaining. Half of this subsample consists of timesteps with a high importance even though they represent a much smaller proportion of the full dataset. The weights are chosen to cancel out this oversampling and prevent overengineering for extreme scenarios. The $n_h$ timesteps with the highest importance represent a proportion $n_h/T$ of the dataset and the other $n - n_h$ represent a proportion $(T - n_h)/T$. The weights account for this:
(8) $w_t = 1 \quad \text{for } t \in \mathcal{S}_{\mathrm{high}}$
(9) $w_t = \dfrac{T - n_h}{n - n_h} \quad \text{for } t \in \mathcal{S}_{\mathrm{rand}}$
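The stage 2 sample construction and its bin weights can be sketched as follows; a minimal sketch in which the importance values are random stand-ins for the output of the importance function:

```python
import numpy as np

def importance_subsample(importance, n, n_h, rng):
    """Stage 2 sample: the n_h highest-importance timesteps plus a random
    draw of n - n_h from the rest. Bin weights are chosen so that each bin's
    total weight equals the number of timesteps it represents in the full
    dataset, cancelling out the oversampling of extreme timesteps."""
    T = len(importance)
    order = np.argsort(importance)                 # ascending importance
    high, rest = order[-n_h:], order[:-n_h]
    rand = rng.choice(rest, size=n - n_h, replace=False)
    sample = np.concatenate([high, rand])
    weights = np.concatenate([
        np.ones(n_h),                              # each high timestep represents itself
        np.full(n - n_h, (T - n_h) / (n - n_h)),   # remaining bin shares T - n_h
    ])
    return sample, weights

# Example: a 120-timestep subsample (60 + 60) from a 1000-timestep dataset.
rng = np.random.default_rng(0)
sample, weights = importance_subsample(rng.random(1000), n=120, n_h=60, rng=rng)
```

By construction the weights sum to the full dataset length $T$, so the subsample stands in for the whole dataset in the stage 2 optimisation.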
4 Test case: UK power system model
4.1 Overview
In this section, the performance of the importance subsampling approach is compared with that of other subsampling strategies when applied to a PSM based on the United Kingdom (UK) and created in the open-source energy modelling framework Calliope (see Pfenninger & Pickering, 2018). This model is designed as an idealised test case and should not be viewed as a realistic representation of the UK power system or used to inform policy or strategy. It is employed only because it uses a linear programming formulation similar to many PSMs popular in the energy community (see e.g. Trutnevyte, 2016; Bazmi & Zahedi, 2011; Hall & Buckley, 2016), and results on this model can reasonably be expected to generalise to those models also. Hourly UK-wide demand levels and wind capacity factors are estimated over the 36-year period from 1980 until 2015 and serve as the model’s timeseries inputs. Long-term anthropogenic trends such as economic growth and efficiency improvements are removed so that different years of demand data can be fairly compared. Model outputs are the optimal installed capacities of 4 possible generation technologies. The first 3 (baseload, mid-merit and peaking) are generic dispatchable technologies that differ only in their installation and generation costs. The 4th technology, wind, has zero generation cost but output capped by time-varying wind levels. The PSM and input timeseries are discussed fully in the appendix (Section 6).
A “perfect model” framework is assumed: the optimal capacities across the 36 years are taken to be those that minimise cost under the “true” distribution of demand and wind. The 36-year capacities (for baseload, mid-merit, peaking and wind technologies) hence serve as targets:
(10) $\theta_{36\mathrm{yr}} = \mathrm{PSM}(\{((d_t, cf_t), 1) : t \in \mathcal{T}\})$
where $d_t$ is the demand level and $cf_t$ is the wind capacity factor in timestep $t$, and $\mathcal{T}$ is the full 36-year set of hourly timesteps. Each timestep is assigned equal weight. A subsample $\mathcal{S}$ generates estimators of the optimal capacities as defined by
(11) $\hat{\theta}(\mathcal{S}) = \mathrm{PSM}(\{((d_t, cf_t), w_t) : t \in \mathcal{S}\})$
Subsampling strategies are evaluated on a number of criteria. The estimators defined by equation (11) should have both a low variation across samples generated by the same process and a low bias (understood as median error) when compared to the targets in equation (10). Furthermore, suppose a power system is designed using the estimated optimal capacities in equation (11). For this (hypothetical) system, two statistics are calculated:

hours of unmet demand: number of hours in the full sample in which the power system has insufficient generation capacity to meet demand. For example, the optimal power system design for 1980 data may be unable to meet demand for some hours in the period 19812015, leading to hypothetical supply shortages.

extra system cost: additional cost (sum of installation and generation costs) of meeting the 36-year demand using a suboptimal power system. A PSM user might be aware of the risks in designing systems using short samples and want to compensate. The cheapest way to do this a priori (i.e. without further optimisation) is to use extra peaking capacity to meet any unmet demand. Define the extra system cost of sample $\mathcal{S}$ by
(12) $\Delta C(\mathcal{S}) = C(\mathcal{S}) - C_{36\mathrm{yr}}$
where $C(\mathcal{S})$ is the cost of a system designed using sample $\mathcal{S}$ (with extra peaking capacity and generation to ensure no unmet demand) and $C_{36\mathrm{yr}}$ is the cost of the 36-year optimal system. Since the 36-year system is by definition cost-optimal, equation (12) is nonnegative.
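Both evaluation statistics start from the hourly shortfall of a fixed design against the full demand & wind series. A minimal sketch with illustrative numbers (the capacities and the 5-hour series below are assumptions, not model data):

```python
import numpy as np

def supply_shortfall(demand, wind_cf, caps):
    """Hourly shortfall (GW) of a fixed design against a demand & wind series.
    caps: installed capacities in GW for baseload, mid-merit, peaking, wind."""
    net = demand - caps["wind"] * wind_cf          # residual demand after wind
    dispatchable = caps["baseload"] + caps["midmerit"] + caps["peaking"]
    return np.maximum(net - dispatchable, 0.0)

# Toy 5-hour series (illustrative numbers only):
demand = np.array([40.0, 55.0, 50.0, 60.0, 45.0])
wind_cf = np.array([0.3, 0.1, 0.0, 0.2, 0.5])
caps = {"baseload": 20.0, "midmerit": 15.0, "peaking": 10.0, "wind": 20.0}

short = supply_shortfall(demand, wind_cf, caps)
unmet_hours = int(np.count_nonzero(short))   # hours of unmet demand
extra_peaking = float(short.max())           # GW of extra peaking to cover them
```

The extra system cost then follows by pricing `extra_peaking` (and the associated generation) at the peaking technology's costs and comparing against the 36-year optimal system's cost.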
Computational cost increases with timeseries length, so samples should be made as short as possible. The computational expense in solving the optimisation problem in this model scales linearly in the number of timesteps. In some PSMs (e.g. mixedinteger linear programs), the computational effort increases faster than linearly, in which case a reduction in simulation length causes a proportionally larger decrease in computational cost.
Four subsampling strategies are investigated: using individual years, random sampling of timesteps, $k$-medoids clustering into representative days, and importance subsampling. In $k$-medoids clustering, samples are generated by clustering the 48-dimensional vectors of each individual day’s (normalised) hourly demand and wind levels into $k$ clusters. Each cluster’s representative day is the day in the full timeseries whose vector is closest to the cluster mean. The model is run on the $k$ representative days, each weighted by the number of days its cluster contains.
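The clustering scheme just described can be sketched as follows; a simplified sketch (plain k-means iterations to locate cluster means, then the nearest real day as representative), not the exact implementation used in the study:

```python
import numpy as np

def representative_days(demand, wind_cf, k, iters=50, seed=0):
    """Cluster days into k groups; return (representative day indices, weights).
    Each day is a 48-vector of normalised hourly demand and wind levels; the
    representative is the real day closest to its cluster mean, weighted by
    the number of days in the cluster."""
    rng = np.random.default_rng(seed)
    days = np.hstack([np.reshape(demand, (-1, 24)), np.reshape(wind_cf, (-1, 24))])
    days = (days - days.mean(axis=0)) / (days.std(axis=0) + 1e-12)  # normalise
    centres = days[rng.choice(len(days), size=k, replace=False)]
    for _ in range(iters):                        # plain k-means iterations
        dist = ((days[:, None, :] - centres) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = days[labels == j]
            if len(members):
                centres[j] = members.mean(axis=0)
    reps, weights = [], []
    for j in range(k):                            # pick the "medoid" day per cluster
        members = np.flatnonzero(labels == j)
        if members.size:
            nearest = members[((days[members] - centres[j]) ** 2).sum(-1).argmin()]
            reps.append(int(nearest))
            weights.append(int(members.size))
    return reps, weights

# Example on 30 synthetic days of random data (stand-ins for real series):
rng = np.random.default_rng(1)
demand_series, wind_series = rng.random(24 * 30), rng.random(24 * 30)
reps, weights = representative_days(demand_series, wind_series, k=3)
```

The weights returned here play the role of the timestep weights $w_t$: each representative day stands for all the days in its cluster.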
4.2 Importance subsampling: setup
Recall from Section 3 that importance subsampling requires a PSM that can determine optimal system design. The test case PSM is an example of this. Here, the design output $\theta$ is the vector of optimal installed capacities of the four technologies. It is determined by minimising the sum of installation and generation costs while meeting demand. Input data is hourly UK-wide demand and wind capacity factors:
(13) $x_t = (d_t, cf_t)$
(14) $t \in \mathcal{T} = \{1, 2, \ldots, 315360\}$
where $d_t$ and $cf_t$ are the demand level and wind capacity factor in timestep $t$ respectively. The second requirement is an importance function. A timestep’s variable cost is proposed for this purpose:
(15) $\mathrm{imp}(x_t, \theta) = V_t$
(16) $V_t = \sum_{i \in \{b, m, p, w\}} v_i \, g_{i,t}$
where $v_i$ is technology $i$’s cost per unit electricity generated (in £/GWh), $g_{i,t}$ is the amount of electricity generated by technology $i$ in timestep $t$ (in GWh), and $b$, $m$, $p$ and $w$ denote baseload, mid-merit, peaking and wind respectively. The costs per unit electricity of the different technologies are shown in Table 2 in the appendix. Wind is assumed to generate electricity at zero marginal cost: $v_w = 0$.
A timestep’s variable cost as a function of its demand is shown in Figure 3. Demand is met by merit-order stacking of technologies in ascending order of generation cost, with all available wind power used first, followed by baseload, mid-merit and peaking. The first $\mathrm{cap}_w \cdot cf_t$ GW (the installed wind capacity times the timestep’s wind capacity factor) of demand are met by wind power with no contribution to the variable cost. The next $\mathrm{cap}_b$ GW are met by baseload, followed by $\mathrm{cap}_m$ GW of mid-merit and $\mathrm{cap}_p$ GW of peaking (any unmet demand is also assumed to be met by peaking). The variable cost increases more quickly as technologies with progressively higher generation cost are used. It depends implicitly on the timestep’s demand level and wind capacity factor, as well as on the installed capacities of the different generation technologies, through the generation terms in equation (16), as summarised in Figure 3. It is a convenient proxy for the “difficulty” in meeting demand: timesteps with a high variable cost are exactly those requiring the most dispatchable generation capacity.
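The merit-order variable cost just described can be sketched as follows; the capacity and cost figures below are illustrative assumptions, not the values of Table 2:

```python
def variable_cost(demand, wind_cf, caps, gen_costs):
    """Variable cost of one timestep under merit-order dispatch: wind (zero
    cost) first, then baseload, mid-merit and peaking in ascending generation
    cost; any unmet demand is also priced as peaking generation.
    caps: installed capacities (GW); gen_costs: generation costs (£/GWh)."""
    residual = max(demand - caps["wind"] * wind_cf, 0.0)   # net demand after wind
    cost = 0.0
    for tech in ("baseload", "midmerit", "peaking"):
        used = min(residual, caps[tech])
        cost += gen_costs[tech] * used
        residual -= used
    cost += gen_costs["peaking"] * residual                # unmet demand as peaking
    return cost

# Illustrative assumptions (GW and £/GWh):
caps = {"baseload": 20.0, "midmerit": 15.0, "peaking": 10.0, "wind": 20.0}
gen_costs = {"baseload": 5_000.0, "midmerit": 35_000.0, "peaking": 100_000.0}
# e.g. 50 GW demand at wind cf 0.1 -> 48 GW net demand, stacked as
# 20 GW baseload + 15 GW mid-merit + 10 GW peaking + 3 GW unmet (priced as peaking)
```

With hourly timesteps, each GW dispatched for one hour corresponds to one GWh, so the stacked capacities convert directly into the generation terms $g_{i,t}$ of equation (16).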
The optimal capacities across samples of varying total size $n$ are determined. Each sample includes the $n_h$ timesteps with the highest estimated variable cost and a random sample of $n - n_h$ of those remaining. For additional clarity, the steps introduced in Section 3, applied to the test case model, are explicitly presented below.

1. Randomly sample $n$ timesteps from the 36-year dataset to create an equally weighted stage 1 subsample $\mathcal{S}_1$.

2. Determine the optimal capacities:
(17) $\theta_1 = \mathrm{PSM}(\{((d_t, cf_t), 315360/n) : t \in \mathcal{S}_1\})$

3. Estimate the variable cost of each timestep in the full dataset using equation (16), evaluated with the stage 1 capacities. The variable cost is the specific choice of importance function:
(18) $c_t = \mathrm{imp}(x_t, \theta_1)$
(19) $c_t = V_t \quad \forall t \in \mathcal{T}$

4. From the full dataset, construct a stage 2 subsample $\mathcal{S}_2$ by including the $n_h$ timesteps with the highest variable cost and a random sample of size $n - n_h$ from those remaining. Weight the timesteps in each of the 2 bins to reflect their relative proportions throughout the full dataset:
(20) $w_t = 1$ for the $n_h$ highest-cost timesteps and $w_t = \dfrac{315360 - n_h}{n - n_h}$ for the rest,
where 315360 = 8760 × 36 is the total number of hourly timesteps in the 36-year dataset.

5. Determine the optimal capacities for the stage 2 subsample:
(21) $\hat{\theta} = \mathrm{PSM}(\{((d_t, cf_t), w_t) : t \in \mathcal{S}_2\})$
This is the estimate for the optimal system design.
4.3 Results
Figure 4 shows the distribution of optimal capacities across samples generated via different subsampling schemes. The 36-year optima (the targets) are shown as dashed black lines. Plots compare capacities obtained at equal computational cost. Since determining a single set of capacities using importance subsampling requires two optimisation runs, and computational expense scales linearly in the number of timesteps, the associated plots use half the sample length. For example, in Figure 4(c), the box-and-whisker plots for individual years, random sampling and $k$-medoids clustering correspond to single simulations using 8760 hourly timesteps, while the plot for importance subsampling uses two runs of half this length each. Individual years have only one possible sample length of 8760 hourly timesteps. Since $k$-medoids clustering subsamples representative days, $k$ is the sample size divided by 24 (e.g. 480 timesteps correspond to $k = 20$ representative days).
Figure 4(a) shows the distribution of optimal capacities across simulations with computational expense equivalent to a PSM run of length 480 timesteps. While the variabilities across capacities determined via random sampling and importance subsampling are roughly equal, the former induces a larger bias (understood as the difference between the median, indicated by the white line inside the box, and the 36-year optimum, indicated by the dashed black line). $k$-medoids clustering into 20 representative days leads to low variability but large biases, with underestimation of optimal baseload and peaking capacity by 5 and 8GW respectively and overestimation of optimal wind capacity by more than 10GW.
Figure 4(b) shows the same plots for computational expense equivalent to 1920 timesteps (just under 3 months of hours). Importance subsampling performs better than random sampling and $k$-medoids clustering, which again underestimate optimal peaking capacity and overestimate optimal wind capacity.
For simulations corresponding to full years of computational cost (8760 timesteps, Figure 4(c)), importance subsampling performs markedly better than the other schemes. A 95% prediction interval for optimal wind capacity across individual-year simulations ranges from 2.5GW to 33.0GW, giving the user virtually no indication of optimal design. Results are slightly better under random sampling and $k$-medoids clustering. However, for all 3 schemes, the majority of samples underestimate optimal peaking capacity and overestimate optimal wind capacity. For importance subsampling, bias is low (the median of optima lies almost exactly on the target) and variation is greatly reduced.
Figure 5 shows the distribution of the number of hours of annual unmet demand. Power systems designed using individual years, random sampling and $k$-medoids clustering have a high probability of leading to supply capacity shortages. For example, one designed using $k$-medoids clustering into 20 days (480 timesteps) has a 75% chance of unmet demand in at least 40 hours annually. Similarly, a power system designed using a random choice of individual year has a 35% chance of failing to meet demand in more than 3 hours. In contrast, importance subsampling leads to virtually no unmet demand, even for short sample lengths. Figure 5 also shows the distribution of extra system cost. Extra costs for importance subsampling are similar to those for random sampling at 480 timesteps’ worth of computational expense, but levels are lower than for the other samplers at longer simulation lengths. At computational cost equivalent to a 1920-timestep simulation, 95% of simulations lead to cost overruns of no more than 0.2%, and for 8760 timesteps any extra costs are negligible.
The percentage errors in optimal capacities for individual technologies are much larger than those in the total system cost. For example, the 2010 optimal wind capacity is 1.0GW, a 94% error when compared to the 36-year optimum of 16.4GW. However, the resultant power system (with extra peaking to ensure no unmet demand) costs just 2.3% more. Across all years, the mean absolute percentage error in installed wind capacity is 34%, but the resultant systems have a mean cost just 0.3% higher. The reason for this is twofold. Firstly, calculating the extra system cost assumes additional peaking (to meet unmet demand) is installed at the same “standard” price; there is no additional penalty for the unanticipated need for this extra capacity. A stronger penalisation for unmet demand (such as a loss-of-load cost, typically much higher than the costs used here) would greatly increase the additional expense. Secondly, there is a flat optimum with respect to installed wind capacity, shown in Figure 6. Each curve is constructed by imposing a fixed wind capacity and optimising the capacities of the remaining technologies in the usual way. The curve’s value hence shows the minimum system cost as a function of the prescribed wind capacity. There is one curve for each individual year along with an annualised version using all 36 years at once. In most years, choosing a wind capacity anywhere between 0 and 40GW has a small effect on cost. This explains the large inter-year variability in optimal wind capacity: changes to the annual distribution of demand and wind change the shape of the cost curves slightly and may perturb the location of the minimum significantly with a comparatively modest effect on system cost.
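The construction behind Figure 6 can be sketched with a small linear program: fix the wind capacity, form the net demand, and optimise the remaining capacities and dispatch. This is a toy LP under illustrative cost assumptions, not the paper's Calliope model:

```python
import numpy as np
from scipy.optimize import linprog

def optimal_dispatchable_caps(net_demand, install, varcost):
    """Toy single-region capacity LP: choose dispatchable capacities and hourly
    generation to meet a (net) demand series at minimum installation plus
    generation cost. Variables: K capacities, then K*T generation levels."""
    net_demand = np.asarray(net_demand, dtype=float)
    T, K = len(net_demand), len(install)
    n = K + K * T
    c = np.concatenate([install, np.repeat(varcost, T)])
    A, b = [], []
    for t in range(T):                       # total generation covers net demand
        row = np.zeros(n)
        for i in range(K):
            row[K + i * T + t] = -1.0
        A.append(row)
        b.append(-max(net_demand[t], 0.0))
    for i in range(K):                       # generation limited by capacity
        for t in range(T):
            row = np.zeros(n)
            row[K + i * T + t] = 1.0
            row[i] = -1.0
            A.append(row)
            b.append(0.0)
    res = linprog(c, A_ub=np.array(A), b_ub=np.array(b),
                  bounds=[(0.0, None)] * n, method="highs")
    return res.x[:K], res.fun

# Minimum system cost as a function of a prescribed wind capacity: fix the
# wind capacity, form the net demand, optimise the rest, add wind's install cost.
demand = np.array([30.0, 45.0, 50.0])        # toy 3-hour series (illustrative)
cf = np.array([0.6, 0.2, 0.1])
wind_install = 2.0                           # illustrative wind installation cost
curve = [wind_install * cw
         + optimal_dispatchable_caps(demand - cw * cf, [6.0, 1.0], [1.0, 4.0])[1]
         for cw in (0.0, 10.0, 20.0)]
```

Sweeping the prescribed wind capacity over a fine grid and plotting `curve` reproduces the shape of the cost curves discussed above: a shallow valley whose minimum can move substantially between years while total cost barely changes.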
5 Discussion
Previous studies indicate that, to reliably estimate optimal system design under natural climate variability, power system models (PSMs) with a high proportion of variable renewable energy require both a high temporal resolution and long samples of demand & weather data. These combined requirements typically exceed computational limitations, and timeseries reduction approaches such as subsampling are required. The accuracy of PSM outputs using subsampled data is highly dependent on whether a number of “extreme” timesteps are included in the modelling samples. However, identifying them a priori is difficult, since which timesteps are “extreme” frequently itself depends on what the model is trying to determine (a chicken-and-egg situation).
This paper introduces a novel subsampling approach, called importance subsampling, that systematically identifies such timesteps and includes them in modelling samples. The main idea is to use a first-stage simulation to obtain a rough indication of optimal system design. This is used, in conjunction with an importance function, to determine which timesteps must be sampled. A second-stage model run, including these timesteps, provides estimates of the model outputs found using all available data.
A test case is performed on a model of the United Kingdom and 36 years of historic demand & weather data. Optimal system design for individual years is found to be highly dependent on the choice of year, as well as systematically underestimating optimal dispatchable capacity and overestimating optimal renewable capacity. Furthermore, the resultant power systems lead to supply capacity shortages when applied to different weather years, and the long-term optimal power system design cannot be reliably determined by taking averages across multiple individual-year simulations.
Estimates of optimal system design using random samples or data clustered into representative days have lower variability and slightly smaller errors. Interestingly, randomly sampling 8760 hourly timesteps estimates optimal capacities more accurately than using individual years. Hence, in models where timesteps may be scrambled (e.g. ones ignoring ramping or storage constraints), random sampling may be preferred over selecting long contiguous periods. However, errors in estimates of optimal system design, and associated generation capacity shortages or cost overruns, are present across both random sampling and clustering.
In contrast, importance subsampling consistently estimates, at reduced computational cost, the model outputs found using all available data. Even for short sample lengths, the bias between the median of capacities found using importance subsamples and the long-run optima is very small. This means a user can estimate optimal capacities by the median across multiple PSM runs with short importance subsamples, something that would be impossible with the other subsampling approaches due to their biases.
The importance subsampling approach is introduced in full generality and can be applied to a wide class of optimisation-based PSMs by modifying the choice of importance function. In the test case, this is a timestep’s variable cost, and this choice generalises naturally to any model with such a notion. For example, in more detailed models, the variable cost can be summed across regions or calculated considering more sophisticated techno-economic factors. A judicious choice of importance function may also be inspired by expert knowledge. For example, if the user has a good idea of which demand & weather scenarios are essential to ensure generation capacity adequacy, the importance function may be chosen to identify such events.
A drawback of the proposed approach is that it scrambles the order of timesteps and hence can only be directly applied to models with little intertemporal dependence, such as those without significant storage or ramping constraints. Using the same approach to sample longer time periods such as days or weeks allows the modelling of such phenomena within periods, which may be sufficient in many settings. This may require additional refinements such as combining importance subsampling with clustering to reduce output variability.
The extra cost from designing suboptimal systems (and adjusting to ensure no unmet demand, see Section 4.3) is typically much smaller than the errors in estimates of optimal generation capacities. One cause, the flatness of total system cost around the optimum (particularly with respect to installed wind capacity), highlights an important distinction. If the only goal is the cheapest power system, a single-year simulation may suffice, provided some extra capacity is installed to ensure demand can always be met. However, if one seeks to use minimisation of total system cost as a means of determining optimal system design, the impact of climate uncertainty is significant and approaches incorporating many years of climate information are required. In addition, one should be careful when invoking arguments of the type “wind should not be installed since the model indicates optimal design includes almost no wind”. A more robust argument requires investigating the degree of suboptimality of power systems with perturbed design.
There are a number of possible extensions to this investigation. One is to relax the “perfect model” assumption that the 36-year simulation determines the “true” optimal capacities. In reality, the 36 years are themselves a sample from the “true” climate, and the associated model outputs can be uncertain or biased. Using techniques from extreme value statistics, or a climate model to generate a larger sample of hypothetical demand & weather data, offers a more robust approach. Another extension is to conduct the same investigation using a different PSM or geographical region. On the one hand, Woollings (2010) notes that the United Kingdom’s climate exhibits a large degree of inter-year variability, so sampling from multiple decades may not be necessary for other locations. On the other, the PSM employed in this investigation aggregates demand and wind levels into a single value for the whole country, “averaging out” spatial variations (and hence also some of the inter-year variability). For this reason, the degree of inter-year variability, and both the need for and the efficacy of importance subsampling for other PSMs, should be explored. Finally, one could investigate its use in sampling longer representative periods (e.g. days or weeks). This may require additional adjustments such as combining the method with clustering approaches to further reduce variance in model outputs.
In line with the open energy modelling movement (see Hilpert et al.; Pfenninger et al., 2017), the data and all files required to construct the model (in the framework Calliope) are publicly available at https://github.com/ahilbers/2019_importance_subsampling.
6 Appendix
6.1 Power system model
Table 1: Sets, parameters and decision variables of the PSM.

Sets:
- $i$: technologies (baseload, midmerit, peaking, wind)
- $t$: timesteps (hourly)

Parameters:
- $C_i^{\mathrm{inst}}$: annualised installation cost, technology $i$ (£m/GWyr)
- $C_i^{\mathrm{gen}}$: generation cost, technology $i$ (£m/GWh)

Timeseries data, timestep $t$:
- $d_t$: demand (GWh)
- $w_t$: wind capacity factor
- $a_t$: timestep weight

Decision variables:
- $\mathrm{cap}_i$: installed capacity, technology $i$ (GW)
- $\mathrm{gen}_{i,t}$: electricity generated, technology $i$, timestep $t$ (GWh)
The PSM is a simple representation of the UK power system, viewing it as a single node with inelastic UK-wide hourly demand levels and wind capacity factors. A linear optimiser determines the optimal capacities across 4 possible technologies by minimising the sum of installation and generation costs. The first 3, generically called baseload, midmerit and peaking, differ only in their fixed (investment, £/GW) and variable (generation, £/GWh) costs. The fourth option, wind, has zero generation cost but a supply level capped by the installed capacity times the wind capacity factor. Its installation cost refers to rated capacity: the maximum power output that a wind farm can produce at wind speeds just below the cut-out point, above which turbines are shut down to avoid damage. The model takes two timeseries inputs: hourly UK-wide demand levels and wind capacity factors.
6.1.1 Assumed costs
Technology  |  Installation cost (£m/GWyr)  |  Generation cost (£m/GWh)
Baseload    |                               |
Midmerit    |  100                          |  0.035
Peaking     |                               |
Wind        |  100                          |  0
Table 2 shows each technology’s assumed costs. They are chosen to be roughly indicative of the possible generation technologies available in the UK but should not be interpreted as realistic estimates of future build or generation costs. Each technology’s installation cost is annualised by dividing by the expected plant lifetime.
Midmerit prices are based on the CCGT H class plant from Department for Business, Energy and Industrial Strategy (2016), which has a build cost of £100/kW per year of plant lifetime, corresponding to £100m/GWyr. The generation cost is determined assuming a wholesale gas price of 60 pence per therm. Using the conversion 1 therm ≈ 30kWh and assuming a generation efficiency of 55% gives a generation cost of roughly £35/MWh, i.e. £0.035m/GWh. The costs for baseload and peaking are chosen either side of these values.
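The arithmetic behind the midmerit generation cost can be checked directly; the figures below are exactly those quoted in the text (60p/therm, 30 kWh/therm, 55% efficiency):

```python
# Rough check of the CCGT generation cost arithmetic
gas_price = 60.0          # pence per therm (wholesale)
kwh_per_therm = 30.0      # approximate conversion used in the text
efficiency = 0.55         # assumed generation efficiency

pence_per_kwh = gas_price / (kwh_per_therm * efficiency)  # fuel cost per kWh out
gbp_per_mwh = pence_per_kwh * 10.0                        # p/kWh -> £/MWh
gbp_m_per_gwh = gbp_per_mwh / 1000.0                      # £/MWh -> £m/GWh
print(round(gbp_per_mwh, 1), round(gbp_m_per_gwh, 4))
```

This gives about £36/MWh (£0.036m/GWh), consistent with the rounded £35/MWh used for the midmerit technology.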
Wind is assumed to have no generation cost. The construction cost is taken from the medium value for an onshore UK 5MW wind farm in Department for Business, Energy and Industrial Strategy (2016): £1200/kW, corresponding to £50/kWyr over a 24-year lifetime. Adding fixed O&M costs of £23/kWyr gives £73/kWyr, i.e. £73m/GWyr. This is revised upwards to £100m/GWyr to reflect infrastructure costs and the fact that offshore wind is more expensive than onshore.
6.1.2 Mathematical setup
The optimiser chooses capacities by minimising the sum of installation and generation costs while meeting an inelastic demand timeseries, phrased as a linear optimisation problem with decision variables $\mathrm{cap}_i$ and $\mathrm{gen}_{i,t}$. Timesteps are 1 hour in length.

$$\min_{\mathrm{cap},\,\mathrm{gen}} \; \sum_i C_i^{\mathrm{inst}} \, \mathrm{cap}_i \;+\; 8760 \sum_i \sum_t a_t \, C_i^{\mathrm{gen}} \, \mathrm{gen}_{i,t} \tag{22}$$

subject to

$$\mathrm{gen}_{i,t} \le \mathrm{cap}_i \qquad i \in \{\text{baseload}, \text{midmerit}, \text{peaking}\}, \; \forall t \tag{23}$$

$$\mathrm{gen}_{\text{wind},t} \le \mathrm{cap}_{\text{wind}} \, w_t \qquad \forall t \tag{24}$$

$$\sum_i \mathrm{gen}_{i,t} = d_t \qquad \forall t \tag{25}$$

$$\mathrm{cap}_i \ge 0, \quad \mathrm{gen}_{i,t} \ge 0 \qquad \forall i, t \tag{26}$$
For definitions of letters and symbols, see Table 1. The constant factor 8760 (= the number of hours in a year) is included since weights are chosen to sum to 1.
Constraint (23) ensures that conventional (baseload, midmerit and peaking) generation levels never exceed their capacity. Constraint (24) ensures that wind generation never exceeds installed capacity times the wind capacity factor. Constraint (25) ensures that supply meets demand in each timestep. Given installed capacities, generation levels are determined by merit-order stacking of technologies in ascending order of variable cost.
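A toy instance of this linear program can be solved directly with an off-the-shelf LP solver. The costs and the four-timestep demand and wind series below are illustrative values, not the paper's data, and scipy's `linprog` stands in for the Calliope/solver stack used in the study:

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative (hypothetical) costs: install (£m/GWyr), generation (£m/GWh)
techs = ["baseload", "midmerit", "peaking", "wind"]
c_inst = np.array([300.0, 100.0, 50.0, 100.0])
c_gen = np.array([0.005, 0.035, 0.100, 0.0])
demand = np.array([40.0, 55.0, 70.0, 50.0])   # GWh per hourly timestep
wind_cf = np.array([0.8, 0.2, 0.1, 0.5])      # wind capacity factors
T, N = len(demand), len(techs)
weight = np.full(T, 1.0 / T)                  # timestep weights sum to 1

# Decision vector: [cap_0..cap_{N-1}, gen_{0,0}..gen_{N-1,T-1}]
n_var = N + N * T
cost = np.zeros(n_var)
cost[:N] = c_inst                             # annualised installation costs
for i in range(N):
    for t in range(T):
        cost[N + i * T + t] = 8760 * weight[t] * c_gen[i]

# gen_{i,t} <= cap_i (conventional); gen_{wind,t} <= w_t * cap_wind
A_ub, b_ub = [], []
for i in range(N):
    for t in range(T):
        row = np.zeros(n_var)
        row[N + i * T + t] = 1.0
        row[i] = -wind_cf[t] if techs[i] == "wind" else -1.0
        A_ub.append(row)
        b_ub.append(0.0)

# Total generation meets demand in every timestep
A_eq, b_eq = [], []
for t in range(T):
    row = np.zeros(n_var)
    for i in range(N):
        row[N + i * T + t] = 1.0
    A_eq.append(row)
    b_eq.append(demand[t])

res = linprog(cost, A_ub=np.array(A_ub), b_ub=b_ub,
              A_eq=np.array(A_eq), b_eq=b_eq, bounds=(0, None))
print(res.status, dict(zip(techs, res.x[:N].round(2))))
```

The variable ordering (capacities first, then generation flattened by technology and timestep) is an arbitrary design choice; any consistent indexing works.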
The PSM is built and solved in the opensource modelling framework Calliope (see Pfenninger & Pickering, 2018).
6.2 Data
This investigation uses two timeseries: hourly UK-wide electricity demand levels and wind capacity factors over the period 1980–2015. They are modified versions of those used by Bloomfield et al. (2016).
6.2.1 Demand model
The demand timeseries is based on a regression between weather and demand data collected from two different sources. UKwide daily mean temperature is obtained from the MERRA reanalysis (Rienecker et al., 2017) for the period 19802015. Metered UKwide demand data is obtained from National Grid, the UK transmission operator, over 20062015 (see National Grid, 2018). For this period, in which there is overlapping meteorological and demand data, a regression model is run for daily demand:
$$D_d = a\,(d - d_0) + s_1 \sin\!\left(\frac{2\pi d}{365}\right) + s_2 \cos\!\left(\frac{2\pi d}{365}\right) + g(\mathrm{Te}(d)) + \sum_{j \in \{\text{Mon}, \ldots, \text{Sun}\}} b_j \, \delta_j(d) + b_{\mathrm{BH}} \, \delta_{\mathrm{BH}}(d) + \varepsilon_d \tag{27}$$

where

- $D_d$ is the total daily demand on day $d$.
- $a\,(d - d_0)$ accounts for anthropogenic demand trends (e.g. economic growth or efficiency improvements), with $d_0$ a fixed reference day.
- $s_1 \sin(2\pi d / 365)$ and $s_2 \cos(2\pi d / 365)$ account for year-long demand cycles.
- $g(\mathrm{Te}(d))$ is the temperature-dependent component of demand, where $\mathrm{Te}(d)$ is the effective temperature: a slightly time-offset daily mean temperature accounting for the fact that demand is influenced by temperature at a time lag. It is defined recursively via $\mathrm{Te}(d) = \tfrac{1}{2} T(d) + \tfrac{1}{2} \mathrm{Te}(d-1)$, where $T(d)$ is the UK-wide daily average temperature on day $d$, and initialised via $\mathrm{Te}(1) = T(1)$.
- $\delta_{\mathrm{Mon}}(d), \ldots, \delta_{\mathrm{Sun}}(d)$ are indicator variables that take value 1 only on Mondays through Sundays respectively, but not on bank holidays.
- $\delta_{\mathrm{BH}}(d)$ is an indicator variable that takes value 1 only on bank holidays.
- $\varepsilon_d$ is an error term.
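The effective-temperature recursion can be sketched in a few lines; the equal smoothing weights of one half are the commonly used convention and an assumption here:

```python
def effective_temperature(temps, w=0.5):
    """Exponentially smoothed daily temperature series.

    Te(1) = T(1); Te(d) = w * T(d) + (1 - w) * Te(d-1).
    The weight w=0.5 is an assumed (conventional) value.
    """
    te = [temps[0]]                       # initialise with the first day
    for t in temps[1:]:
        te.append(w * t + (1.0 - w) * te[-1])
    return te
```

Because each day mixes the current temperature with the previous smoothed value, a sudden cold snap feeds into demand gradually rather than instantaneously.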
For values of the parameters, see Bloomfield et al. (2016). The distribution of the error term is symmetrical with a sharp mode at 0 and a standard deviation of 34.2 GWh. There is no trend in average daily error across years or months.
After determining the coefficients, daily demand is extrapolated to 1980–2015 by truncating the error term and using the daily temperatures to extend the timeseries back. Four diurnal curves, one for each of the 3-month periods DJF, MAM, JJA and SON, are used to upsample the timeseries to hourly resolution, with each day given a weighting of two diurnal curves. For example, 1 December is given a 50/50 weighting of the diurnal curves for SON and DJF.
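The upsampling step can be sketched as follows; the function name and 50/50 blend are illustrative of the scheme described above:

```python
import numpy as np

def upsample_day(daily_total, curve_a, curve_b, weight_a=0.5):
    """Split a daily demand total into 24 hourly values using a blend
    of two seasonal diurnal curves (e.g. SON and DJF for 1 December)."""
    profile = weight_a * np.asarray(curve_a) + (1 - weight_a) * np.asarray(curve_b)
    profile = profile / profile.sum()     # normalise so hours sum to the daily total
    return daily_total * profile
```

Normalising the blended profile guarantees that hourly values always sum back to the regression model's daily total.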
Anthropogenic demand trends are removed by setting the linear trend coefficient to 0 and replacing the day index in the remaining trend-dependent terms by a fixed value representing the middle of the period 2006–2015. This ensures that the differences between the demand distributions in different years are weather-driven.
6.2.2 Wind model
The wind model is identical to that used in the HIGH wind scenario in (Bloomfield et al., 2016), which is in turn based on two other studies. First, from the MERRA reanalysis (Rienecker et al., 2017), hourly windspeeds at 2, 10 and 50m elevation are obtained at gridpoints throughout the UK. These are fed through a logarithmic profile to estimate windspeeds at 80m, the average height of UK wind turbines.
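The logarithmic-profile step can be sketched as a least-squares fit of windspeed against log-height, using the three reanalysis heights to extrapolate to 80m. The windspeed values below are illustrative, not reanalysis data:

```python
import numpy as np

def extrapolate_log_profile(heights, speeds, z_target=80.0):
    """Fit u(z) = a * ln(z) + b by least squares through windspeeds
    at several heights, then evaluate at the target hub height."""
    a, b = np.polyfit(np.log(heights), speeds, 1)
    return a * np.log(z_target) + b

# Illustrative speeds (m/s) at the MERRA output heights of 2, 10 and 50 m
u80 = extrapolate_log_profile(np.array([2.0, 10.0, 50.0]),
                              np.array([3.1, 4.8, 6.4]))
```

Fitting through all three heights, rather than assuming a fixed surface roughness, lets the profile adapt to local conditions at each gridpoint.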
Windspeeds at MERRA gridpoints are linearly interpolated to the locations of all wind farms in operation or in the construction pipeline as of April 2014
(see Drew et al., 2015). At each location, wind speeds are transformed to capacity factors using an adjusted power curve for a Siemens 2.3MW turbine (see Cannon et al., 2015, for details). An average over the wind farm configuration, weighted by each farm’s capacity, determines the UK-wide capacity factor. In the PSM, capacity is added in proportion to the wind farms’ relative sizes; for example, doubling wind capacity implicitly means doubling the capacity of each wind farm.

6.3 Supplementary material
The demand timeseries used in the test case (Section 4), as introduced in Section 6.2.1, is based on a regression. In particular, in obtaining 36 years of demand data, the error term in equation (27) is truncated. This has the effect of reducing variability. For this reason, the performance of importance subsampling is also examined on a real demand timeseries, where only a long-term demand trend is removed and the error term is not truncated. Figures 4 and 5 are recreated using the altered demand timeseries. Results are broadly identical to those found using regression data and are available as supplementary material at github.com/ahilbers/2019_importance_subsampling.
Acknowledgements
The first author’s work was funded through the EPSRC CDT in Mathematics for Planet Earth. The authors thank Hannah Bloomfield for providing demand and wind timeseries and 3 anonymous referees for their constructive comments.
References
 Bazmi & Zahedi (2011) Bazmi, A., & Zahedi, G. (2011). Sustainable energy systems: Role of optimization modeling techniques in power generation and supply — a review. Renewable and Sustainable Energy Reviews, 15, 3480–3500. doi:10.1016/j.rser.2011.05.003.
 Bloomfield et al. (2016) Bloomfield, H., Brayshaw, D., Shaffrey, L., Coker, P., & Thornton, H. (2016). Quantifying the increasing sensitivity of power systems to climate variability. Environmental Research Letters, 11, 124025. doi:10.1088/1748-9326/11/12/124025.
 Cannon et al. (2015) Cannon, D. J., Brayshaw, D. J., Methven, J., Coker, P. J., & Lenaghan, D. (2015). Using reanalysis data to quantify extreme wind power generation statistics: A 33 year case study in Great Britain. Renewable Energy, 75, 767 – 778. doi:10.1016/j.renene.2014.10.024.
 Collins et al. (2017) Collins, S., Deane, J. P., Poncelet, K., Panos, E., Pietzcker, R. C., Delarue, E., & Ó Gallachóir, B. (2017). Integrating short term variations of the power system into integrated energy system models: A methodological review. Renewable and Sustainable Energy Reviews, 76, 839 – 856. doi:10.1016/j.rser.2017.03.090.
 Collins et al. (2018) Collins, S., Deane, P., Ó Gallachóir, B., Pfenninger, S., & Staffell, I. (2018). Impacts of interannual wind and solar variations on the European power system. Joule, 2, 2076 – 2090. doi:10.1016/j.joule.2018.06.020.
 De Jonghe et al. (2011) De Jonghe, C., Delarue, E., Belmans, R., & D’haeseleer, W. (2011). Determining optimal electricity technology mix with high level of wind power penetration. Applied Energy, 88, 2231 – 2238. doi:10.1016/j.apenergy.2010.12.046.
 de Sisternes & Webster (2013) de Sisternes, F. J., & Webster, M. D. (2013). Optimal selection of sample weeks for approximating the net load in generation planning problems. ESD Working Papers, (pp. 1–12).
 Department for Business, Energy and Industrial Strategy (2016) Department for Business, Energy and Industrial Strategy (2016). Electricity Generation costs. Technical report. Department for Business, Energy and Industrial Strategy.
 Drew et al. (2015) Drew, D. R., Cannon, D. J., Brayshaw, D. J., Barlow, J. F., & Coker, P. J. (2015). The impact of future offshore wind farms on wind power generation in Great Britain. Resources, 4, 155–171. doi:10.3390/resources4010155.
 Hall & Buckley (2016) Hall, L., & Buckley, A. (2016). A review of energy systems models in the UK: Prevalent usage and categorisation. Applied Energy, 169, 607–628. doi:10.1016/j.apenergy.2016.02.044.
 Härtel et al. (2017) Härtel, P., Kristiansen, M., & Korpås, M. (2017). Assessing the impact of sampling and clustering techniques on offshore grid expansion planning. Energy Procedia, 137, 152–161. doi:10.1016/j.egypro.2017.10.342.
 Hilpert et al. () Hilpert, S., Kaldemeyer, C., Krein, U., Günther, S., Wingenbach, C., & Pleßmann, G. (). The open energy modelling framework (OEMOF) — a novel approach in energy systems modelling. Preprint. doi:10.20944/preprints201706.0093.v1.
 IPCC (2018) IPCC (2018). Global warming of 1.5°C: Summary for Policymakers. Policy report. World Meteorological Organization, Geneva, Switzerland.
 Jacobson et al. (2015) Jacobson, M. Z., Delucchi, M. A., Cameron, M. A., & Frew, B. A. (2015). Lowcost solution to the grid reliability problem with 100% penetration of intermittent wind, water, and solar for all purposes. Proceedings of the National Academy of Sciences of the United States of America, 112, 15060—15065. doi:10.1073/pnas.1510028112.
 Kotzur et al. (2018) Kotzur, L., Markewitz, P., Robinius, M., & Stolten, D. (2018). Impact of different time series aggregation methods on optimal energy system design. Renewable Energy, 117, 474–487. doi:10.1016/j.renene.2017.10.017.
 Nahmmacher et al. (2016) Nahmmacher, P., Schmid, E., Hirth, L., & Knopf, B. (2016). Carpe diem: a novel approach to select representative days for longterm power system modeling. Energy, 112, 430–442. doi:10.1016/j.energy.2016.06.081.
 National Grid (2017) National Grid (2017). Future Energy Scenarios 2017. Technical report. National Grid.
 National Grid (2018) National Grid (2018). Data Explorer. Dataset. National Grid. https://www.nationalgrid.com/uk/electricity/market-operations-and-data/data-explorer.
 Pfenninger (2017) Pfenninger, S. (2017). Dealing with multiple decades of hourly wind and PV time series in energy models: a comparison of methods to reduce time resolution and the planning implications of interannual variability. Applied Energy, 197, 1–13. doi:10.1016/j.apenergy.2017.03.051.
 Pfenninger et al. (2017) Pfenninger, S., DeCarolis, J., Hirth, L., Quoilin, S., & Staffell, I. (2017). The importance of open data and software: Is energy research lagging behind? Energy Policy, 101, 211–215. doi:10.1016/j.enpol.2016.11.046.
 Pfenninger & Pickering (2018) Pfenninger, S., & Pickering, B. (2018). Calliope: a multiscale energy systems modelling framework. Journal of Open Source Software, 3(29), 825. doi:10.21105/joss.00825.
 Poncelet et al. (2016) Poncelet, K., Delarue, E., Six, D., Duerinck, J., & D’haeseleer, W. (2016). Impact of the level of temporal and operational detail in energysystem planning models. Applied Energy, 162, 631 – 643. doi:10.1016/j.apenergy.2015.10.100.
 Poncelet et al. (2017) Poncelet, K., Höschle, H., Delarue, E., Virag, A., & D’haeseleer, W. (2017). Selecting representative days for capturing the implications of integrating intermittent renewables in generation expansion planning problems. IEEE Transactions on Power Systems, 32, 1936–1948. doi:10.1109/TPWRS.2016.2596803.
 Rienecker et al. (2017) Rienecker, M. et al. (2017). MERRA — NASA’s modern-era retrospective analysis for research and applications. Journal of Climate, 24 (14), 3624–3648. doi:10.1175/JCLI-D-11-00015.1.
 Ringkjøb et al. (2018) Ringkjøb, H.K., Haugan, P. M., & Solbrekke, I. M. (2018). A review of modelling tools for energy and electricity systems with large shares of variable renewables. Renewable and Sustainable Energy Reviews, 96, 440 – 459. doi:10.1016/j.rser.2018.08.002.
 Staffell (2017) Staffell, I. (2017). Measuring the progress and impacts of decarbonising British electricity. Energy Policy, 102, 463–475. doi:10.1016/j.enpol.2016.12.037.
 Staffell & Pfenninger (2018) Staffell, I., & Pfenninger, S. (2018). The increasing impact of weather on electricity supply and demand. Energy, 145, 65–78. doi:10.1016/j.energy.2017.12.051.
 Stoft (2002) Stoft, S. (2002). Power System Economics. WileyIEEE Press.
 Thornton et al. (2017) Thornton, H., Scaife, A., Hoskins, B., & Brayshaw, D. (2017). The relationship between wind power, electricity demand and winter weather patterns in Great Britain. Environmental Research Letters, 12, 064017. doi:10.1088/1748-9326/aa69c6.
 Trutnevyte (2016) Trutnevyte, E. (2016). Does cost optimization approximate the realworld energy transition? Energy, 106, 182–193. doi:10.1016/j.energy.2016.03.038.
 Ventosa et al. (2005) Ventosa, M., Baíllo, Á., Ramos, A., & Rivier, M. (2005). Electricity market modeling trends. Energy Policy, 33, 897 – 913. doi:10.1016/j.enpol.2003.10.013.
 Wohland et al. (2018) Wohland, J., Reyers, M., Märker, C., & Witthaut, D. (2018). Natural wind variability triggered drop in German redispatch volume and costs from 2015 to 2016. PLoS ONE, 13, 1. doi:10.1371/journal.pone.0190707.
 Woollings (2010) Woollings, T. (2010). Dynamical influences on European climate: an uncertain future. Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 368, 3733 – 3756. doi:10.1098/rsta.2010.0040.
 Zeyringer et al. (2018) Zeyringer, M., Price, J., Fais, B., Li, P.-H., & Sharp, E. (2018). Designing low-carbon power systems for Great Britain in 2050 that are robust to the spatiotemporal and interannual variability of weather. Nature Energy, 3, 395–403. doi:10.1038/s41560-018-0128-x.