Electrical peak demand forecasting- A review

The power system is undergoing rapid evolution with the roll-out of advanced metering infrastructure and local energy applications (e.g. electric vehicles), as well as the increasing penetration of intermittent renewable energy at both transmission and distribution levels. These changes make peak load demand more random and less predictable, and therefore pose a threat to power grid security. Since storing large quantities of electricity to satisfy load demand is neither economical nor environmentally friendly, effective peak demand management strategies and reliable peak load forecast methods become essential for optimizing power system operations. To this end, this paper provides a timely and comprehensive overview of peak load demand forecast methods in the literature. To the best of our knowledge, this is the first comprehensive review on this topic. We first give a precise and unified problem definition of peak load demand forecast. Second, 139 papers on peak load forecast methods are systematically reviewed, with methods classified into different stages based on the timeline. Third, a comparative analysis of peak load forecast methods is presented, and different optimization methods to improve forecast performance are discussed. The paper ends with a comprehensive summary of the reviewed papers and a discussion of potential future research directions.




1 Introduction

With the roll-out of advanced metering infrastructure (AMI) [AR-5], the power system is undergoing rapid evolution. The Office of Gas and Electricity Markets (Ofgem) has announced that the UK plans to install more than 50 million smart meters by 2020 [AR-6]. On the one hand, the installation of smart meters enables real-time information exchange between power suppliers and end-users, and therefore increases the efficiency of the electric power supply and encourages smart energy applications such as demand response and demand side management [Areview-1]. On the other hand, the high temporal resolution energy consumption data, coupled with intermittent energy resources such as wind energy, give electricity demand unprecedented diversity and complexity.

Different electricity generation units have been adopted in power plants to meet specific electrical demand/load types. Among all the units, peak load units have the lowest efficiency and the highest cost. It is estimated that a 5%-15% reduction in peak load would bring substantial benefits in saving resources and decreasing real-time electricity tariffs [reason-6], which calls for effective peak load management strategies.

To realize this, it is necessary to accurately predict the magnitude and occurring time of peak load/demand, which not only gives power plants sufficient start-up time to avoid grid congestion but is also fundamental to ensuring the economic benefits, security, and stability of the power grid. The increasing penetration of large-scale intermittent energy such as wind and solar, as well as energy storage power stations, has given rise to new characteristics of peak loads and makes peak load/demand forecast more challenging.

In such a context, it is evident that accurate peak load demand forecast becomes an essential element of power grid operations [reason-10]. Although optimizing smart grid operation based on standard load forecast has been a long-held principle [reason-4], the new digital and smart grid era calls for more attention to building flexible peak load forecast frameworks that adapt to the rapid development of the power system.

1.1 Motivation and contributions of this review

Unlike continuous and stable power generation, peak load power plants only run for a short time each year, which is neither economical nor environmentally friendly. Peak load management strategies were therefore proposed to reduce peak load generation costs based on incentive and penalty mechanisms and programs, such as interruptible load control, demand-side bidding, and emergency demand response [saebi2010demand] [cappers2010demand]. Moreover, it is important to know the future peak load demand reasonably well in order to plan and trigger relevant peak demand management strategies and mechanisms. Therefore, accurate and reliable peak load forecast is crucial for materializing any peak demand management strategy.

In general, a mature peak load forecast can help the system operator to manage the peak load demand effectively in advance, and thus to achieve demand response that helps reduce greenhouse gas emissions and decrease reliance on non-renewable fuel. Moreover, the ultimate goal of peak demand management and forecast is to balance electricity supply and demand to maximize the benefits of the system. Therefore, to further highlight the motivation of the peak demand forecast, TABLE 1 lists key stakeholders (grid operators, electricity retailers, electricity end-users, government) and the impact of peak load demand forecast on them [koh2015evaluating] [sardi2017multiple] [mao2009short].

Electricity market stakeholders Importance of peak load demand forecast
Grid operators
Improve the utilization rate of power generation equipment
Reduce the cost of power generation and investment in power facilities
Alleviate the supply pressure of the grid during peak hours
Electricity retailers
Make reasonable tariff schemes so as to maximize profits
Offer energy-efficiency rebates to encourage customers to reduce peak load demand
Commercial and industrial
Improve the economic benefits and save production resources
Alleviate environmental pollution by distributing emissions concentrated in the peak hours
Save electricity bills and improve their living standard
Enable a reliable power supply system
Ensure economic growth and social welfare
Table 1: The importance of peak load demand forecast for electricity stakeholders

Based on the above analysis, the objective of this review is to provide a clear and comprehensive overview of the peak load demand forecast methods in the literature. To the best of our knowledge, this is the first comprehensive review on the topic of peak load forecast. In particular, the key contributions of this review are described as follows.

  • We give precise definitions of key factors relevant to the peak load forecast framework, which could serve as a useful standardization and guidance for future research in this area.

  • We conduct a thorough review of peak load demand forecast methods and explore hybrid forecast models from a historical and systematic point of view.

  • We provide a comparative analysis based on existing studies and discuss potential improving methods for peak load demand forecast. A comprehensive summary regarding the application scopes of the reviewed methods is also presented, which could provide useful insights for future research directions.

1.2 Literature retrieval strategy

Before a detailed overview of peak load demand forecast methods, a necessary initial step is to follow the standard criteria and protocols to select highly related and high-quality sources and publications.

The literature retrieval databases selected are ScienceDirect (SD) and the Institute of Electrical and Electronics Engineers (IEEE). SD is a well-known academic database provided by Elsevier, in which more than a billion articles are downloaded every year, making it the most downloaded academic search platform among academic databases [SD]. IEEE publishes a wide range of peer-reviewed journals, and the criteria it defines have recognized authority in the field of electrical power analysis [7-IEEE].

The following key phrases are used during the literature retrieval process (search range by year: 1872-2020):

  • Peak load forecasting/estimation/prediction

  • Peak load demand forecast/estimation/prediction

  • Maximum load forecasting/estimation/prediction

The keywords in each key phrase utilize the Boolean operator ’AND’; each key phrase is connected with the Boolean operator ’OR’.
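The combination rule above can be sketched in a few lines of Python. This is only an illustration of the AND/OR logic: the helper name `build_query` and the generic query syntax are assumptions, and the actual search forms of ScienceDirect and IEEE may require a different syntax.

```python
# Hypothetical helper illustrating the Boolean combination of key phrases.
# Keywords inside a phrase are joined with AND; phrases are joined with OR.

SUBJECT_VARIANTS = {
    "peak load": ["forecasting", "estimation", "prediction"],
    "peak load demand": ["forecast", "estimation", "prediction"],
    "maximum load": ["forecasting", "estimation", "prediction"],
}


def expand_key_phrases(variants):
    """Expand each subject with its task words into full key phrases."""
    return [f"{subject} {task}" for subject, tasks in variants.items() for task in tasks]


def build_query(phrases):
    """Join keywords with AND inside a phrase and phrases with OR."""
    groups = ["(" + " AND ".join(phrase.split()) + ")" for phrase in phrases]
    return " OR ".join(groups)


KEY_PHRASES = expand_key_phrases(SUBJECT_VARIANTS)
QUERY = build_query(KEY_PHRASES)
print(len(KEY_PHRASES))  # 9
print(QUERY)
```

The nine expanded phrases correspond to the three bullet items, each with its three task-word variants.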

After excluding low-relevance articles without key phrases in the title and abstract, a total of 139 highly related and high-quality papers form the basis of this review through a preliminary analysis. The obtained studies consist of 67 journal papers and 72 conference papers. The subsequent discussions of peak load demand forecast are all based on the literature obtained in this section.

1.3 Systematic overview of literature based on time line

To understand the historical development trend of peak load demand forecast, it is important to follow the timeline to conduct a systematic review. Figure 1 shows the number of total publications and journal publications published every year from 1956 to 2020.

Figure 1: The number of papers published every year from 1956.

Based on the exponential trend of the obtained publications, the exploration of peak load forecast can be roughly categorized into three stages following the timeline: the initial stage, the developing stage, and the developed stage. The initial stage was from 1956 to 1990, during which the research on peak load demand forecast was in its infancy with a small number of publications. The developing stage was from 1991 to 2003, during which the number of publications began to increase gradually, with three or four articles published every year. The developed stage is from 2004 to 2020, with a large number of publications on peak load forecast.

The large number of publications over the past decade reveals an increasing interest in peak demand forecast. This could be explained by the fact that electricity consumption increases with economic development. As a result, peak load forecast becomes increasingly important for safe and reliable energy system operation. Moreover, considering the increasing integration of modern and clean energy technologies such as electric vehicles (EVs) and wind energy, peak demand forecast will become more challenging, and research interest in the topic will continue to grow in the future.

It should be noted that the number of journal publications on peak load forecast each year over the last decade is usually within the range of 1 to 4, which may indicate that research efforts are still insufficient but, on the other hand, points to more research opportunities. This paper provides a timely review on the important topic of peak load forecast with a clear definition of the research problem, a comprehensive review of existing methods, and a comparative and forward-looking analysis of future research directions.

1.4 Structure of the review

The remainder of this paper is organized as follows: Section 2 provides comprehensive summaries and precise definitions for the peak load forecast problem, including the forecast period, influential variables, general outputs, and evaluation metrics. Section 3 describes peak load demand forecast methods following the timeline by dividing them into the manual/human expert stage, the classic peak load demand forecast stage, and the advanced peak load demand forecast stage. Section 4 first gives a comparative analysis and explores possible methods for improving the peak load demand forecast framework. Then, a comprehensive summary of existing studies on peak load forecast is presented. In Section 5, a conclusion is given and possible future research directions are discussed.

2 Peak load demand forecast problem definition

A general peak load demand forecast framework is shown in Figure 2. Intuitively, the general framework for peak load demand forecast is similar to that of standard load forecast. However, peak load demand forecast has its particularities when it comes to specific sub-processes, such as input variables and output results. To the best of our knowledge, many terms that have been defined for standard load forecast have not been well defined for peak load demand forecast, which leads to different understandings of the same terms in different studies. To this end, for the first time, this paper provides a unification of relevant terms to accurately define peak load demand forecast methods and to provide generalized guidance for future research on this topic.

Figure 2: A general peak load demand forecast framework.

The following subsections will first summarize the commonly used time horizon for short-term, medium-term, and long-term peak load demand forecast according to the reviewed literature. Secondly, influential variables used in peak load demand forecast models will be discussed. Thirdly, the outputs of the forecasting model will be summarised. Finally, special evaluation indicators for peak load demand forecast results are presented.

2.1 Peak load demand forecast time period

Although the forecast horizon of standard load forecast has been well known, there is no such summary for peak load demand forecast.

Therefore, by analysing the reviewed literature, we classify the time horizon of peak load demand forecast into the following categories:

  • Short-term peak load demand forecast (STPLF), to forecast peak load from several hours to days ahead (days ≤ 7) [46-2005Probabilistic, 56newd-2006Developed, 70-2008Density, 81-6121762].

  • Medium-term peak load demand forecast (MTPLF), to forecast peak load from a week to months ahead (months ≤ 12) [46news-2005Short, 46new1, 81-6121762].

  • Long-term peak load demand forecast (LTPLF), to forecast peak load more than a year ahead [46news-2005Short, 46new1, 70-2008Density, 81-6121762].

It is worth noting that based on the reviewed papers: 1) daily peak load demand forecast is mainly studied among STPLF; 2) weekly and monthly peak load demand forecast is mainly studied among MTPLF; and 3) annual peak load demand forecast is mainly studied for LTPLF.

2.2 Influential variables of peak load demand forecast

2.2.1 Endogenous variables

The endogenous variables used in peak load demand forecast differ from those used in standard load forecast. For example, assume that the training data are hourly load consumption for one year. A standard load demand forecast model will use the hourly load data, i.e. 8760 data points, as the endogenous variables. However, a peak load demand forecast model will use the daily peak load value (sometimes also with the daily peak time), i.e. 365 data points (365 data pairs if with the daily peak time), as the endogenous variables.
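A minimal sketch of this reduction is given below, assuming a non-leap year (24 × 365 = 8760 hourly values) and a purely synthetic load series. The function names and the load shape are hypothetical and serve only to show how hourly data collapse into one (peak value, peak hour) pair per day.

```python
import math
import random

HOURS_PER_DAY = 24
DAYS = 365  # non-leap year assumed


def synthetic_hourly_load(days=DAYS, seed=0):
    """Hypothetical hourly load (MW): a daily sinusoid plus Gaussian noise."""
    rng = random.Random(seed)
    load = []
    for h in range(days * HOURS_PER_DAY):
        hour = h % HOURS_PER_DAY
        base = 100 + 30 * math.sin(2 * math.pi * (hour - 12) / HOURS_PER_DAY)
        load.append(base + rng.gauss(0, 5))
    return load


def daily_peaks(hourly):
    """Return one (peak_value, peak_hour) pair per complete day."""
    peaks = []
    for start in range(0, len(hourly) - HOURS_PER_DAY + 1, HOURS_PER_DAY):
        day = hourly[start:start + HOURS_PER_DAY]
        peak_hour = max(range(HOURS_PER_DAY), key=day.__getitem__)
        peaks.append((day[peak_hour], peak_hour))
    return peaks


hourly = synthetic_hourly_load()
peaks = daily_peaks(hourly)
print(len(hourly), len(peaks))  # 8760 365
```

The standard load model would train on `hourly`, while the peak load model would train on `peaks`, using only the first element of each pair when the peak time is not needed.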

The endogenous variables used by peak load demand forecast models are generally peak load data from similar days, which often share the internal structure of the peak load in the forecast period, making it easier for the algorithm to capture the characteristics of the prediction target. [126-2019Deep] proposed algorithms to identify recent days that are similar to the days before the forecast; the peak loads close to the predicted date are then derived as historical training data by analogy, following a rule of thumb, to improve the prediction accuracy.

Furthermore, since the input data only need the peak load in a specific period, the input variable dimension of peak load demand forecast is much lower than that of standard load forecast. The advantage of this is reflected in the high computational efficiency of peak load demand forecast model. [30-Amjady2001Short] compared both the number of input features and the computation time of hourly load forecast and daily peak load forecast based on the same historical data. The statistical analysis showed that only six input features were needed for the daily peak load demand forecast, while 171 input features were necessary for the hourly load forecast.

2.2.2 Exogenous variables

TABLE 2 summarizes the exogenous variables that are frequently used in peak load demand forecast models.

Weather variables | Maximum dry-bulb temperature, average dry-bulb temperature, minimum dry-bulb temperature, average relative humidity, average pressure, average amount of cloud, rainfall volume, duration of bright sunshine, daily global solar radiation, average wind speed
Calendar variables | Time of the day, day of the week, week of the month, month of the year, season, year number, holidays, special events
Economic variables | Gross National Product (GNP), Gross Domestic Product (GDP), population growth rate, consumer growth rate, tariff structure, electricity price
Other variables | Customer type (commercial, residential, industrial, etc.)
Table 2: Popular exogenous variables for peak load demand forecast models

The selection of exogenous variables for peak load demand forecast models is similar to, yet different from, that of standard load forecast. According to the table, the input variables of peak load forecast are similar to those of load prediction at the macro level, namely temperature, humidity, etc. Moreover, the selection of input variables for peak load is closely related to the forecasting period, which is also similar to standard load forecast. The commonly used variables of STPLF are weather and calendar factors. For MTPLF and LTPLF, in addition to the weather and calendar variables, it is necessary to capture socio-economic development and population growth trends. Besides, the acquisition method and accuracy of long-term weather data are also thorny problems that must be considered carefully for MTPLF and LTPLF.

On the other hand, the difference between peak load demand forecast and standard load forecast is that, since the prediction target is a series of extreme values under most conditions, the variables that are most closely related to a peak load demand forecast model are the extreme variables that indicate the degree of change in the weather, such as the maximum and minimum temperature. In addition, some weather variables are internally related and can affect each other. For example, [56-4075966] pointed out that high relative humidity in months with apparent seasonality (summer/winter) would lead to an increase in demand for refrigeration or heating, thus affecting the forecast accuracy of peak load demand. Therefore, in their study, relative humidity was quantified as a temperature change to correct the inaccurate input variables, which significantly improved the forecasting accuracy of peak load demand.

Calendar variables have a significant influence in areas with rare special events and regular holidays. In [36-2002Artificial], the influence of lunar calendar festivals in Egypt on peak load was considered, and the influence of Ramadan was quantified as a weight factor and input into the expert system. The prediction results showed that models considering special festivals performed better than others. Moreover, electricity consumption of the commercial and industrial sectors on weekends and holidays differs considerably from that on working days, and most of the time the peak load may not even occur in these sectors on non-working days. Therefore, some of the reviewed works modeled historical data separately based on these calendar factors to improve the forecast performance. [26news-1997Short] trained models separately for each hour of a day, with weekends and weekdays also treated separately, which resulted in 48 independent models to predict the morning peak and afternoon peak in a day. This time-division modeling method distinguishes between working days and non-working days, which significantly overcomes the defect of the traditional model in predicting peak load on weekends.
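The time-division idea can be illustrated with a small sketch: samples are partitioned by (hour of day, weekend flag), giving 24 × 2 = 48 partitions, and one model is fitted per partition. The per-partition "model" below is just the historical mean and the data are synthetic; both are placeholders chosen for brevity and are not the models used in [26news-1997Short].

```python
from collections import defaultdict
from datetime import datetime, timedelta


def model_key(ts):
    """Partition key: (hour of day, True if Saturday or Sunday)."""
    return ts.hour, ts.weekday() >= 5


def fit_group_means(samples):
    """Fit one placeholder model (the historical mean) per partition.

    samples: iterable of (timestamp, load) pairs.
    """
    buckets = defaultdict(list)
    for ts, load in samples:
        buckets[model_key(ts)].append(load)
    return {key: sum(values) / len(values) for key, values in buckets.items()}


# Two weeks of hypothetical hourly data starting on a Monday.
start = datetime(2020, 1, 6)
samples = [(start + timedelta(hours=h), 100.0 + (h % 24)) for h in range(14 * 24)]

models = fit_group_means(samples)
print(len(models))  # 48
```

A forecast for a given timestamp would then be looked up with `models[model_key(timestamp)]`, so weekend hours never share parameters with weekday hours.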

2.3 Outputs of peak load demand forecast

The main difference between peak load demand forecast and standard load forecast is that the output is usually one value or a pair of values (e.g. a peak load value with its occurring time/date), whereas the output of standard load forecast is generally a set of load values (time series). Existing studies did not provide a unified definition for the output of the peak load demand forecast. By reviewing the relevant literature, the outputs of peak load forecast are summarized as follows:

  • Forecast the total peak consumption on a given peak day [108-7893595].

  • Forecast load usage pattern during a given peak period [62-2008Electricity].

  • Forecast peak time [128-8791587].

  • Forecast peak value [73aa-GOIA2010700].

  • Forecast peak value and its occurring time simultaneously [85aab-2012Building].

  • Forecast peak value and forecast its occurring time separately [20].

  • Forecast peak (or together with valley) value as an additional input to produce load profile [26newc-1997Cascaded].

2.4 Evaluation indicators for peak load demand forecast

The evaluation indicators of peak load demand forecast models are partly the same as those of standard load forecast models, such as mean absolute error (MAE), mean square error (MSE), root mean square error (RMSE), mean absolute percentage error (MAPE), etc. Since these indicators are well known in standard load forecast, this section will not describe them in detail. Instead, to highlight the particularity of peak load demand forecast accuracy metrics compared with standard load forecast, the evaluation indicators that are specific to peak load forecast in the existing studies are given below.

  • Evaluation indicators for peak value

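    Assuming that $\hat{y}_i$ is the predicted peak value, $y_i$ is the actual peak value, and $n$ is the number of the training samples, the peak absolute percentage error (PAPE) [85aab-2012Building] is defined as:

    $$\mathrm{PAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{\hat{y}_i - y_i}{y_i}\right| \times 100\%$$

    The range of PAPE is $[0, +\infty)$. A PAPE of 0% represents a perfectly trained model, while a PAPE greater than 100% indicates an unacceptable model.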

  • Evaluation indicators for peak occurring time

    Assuming that $\hat{t}_i$ is the predicted peak occurring time, $t_i$ is the actual peak occurring time, and $n$ is the number of the training samples, let $\delta$ represent the tolerance residual for the peak occurring time and let $h_i$ be a flag representing whether the predicted occurring time hits the tolerance interval $[t_i - \delta, t_i + \delta]$. The hit rate (HR) [85aab-2012Building] is then defined as:

    $$h_i = \begin{cases} 1, & |\hat{t}_i - t_i| \le \delta \\ 0, & \text{otherwise} \end{cases} \qquad \mathrm{HR} = \frac{1}{n}\sum_{i=1}^{n} h_i \times 100\%$$

    Peak time forecasting is usually measured by HR [85aab-2012Building][20], which specifies a period before and after the occurrence of the peak load as the forecast error tolerance range. The prediction is considered correct as long as the predicted time falls within the tolerance interval.
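The two indicators can be sketched as follows. The sketch assumes that PAPE is the mean absolute percentage error taken over peak values and that HR is the percentage of predicted peak times falling within a symmetric tolerance around the actual peak time; the function names, the symmetric tolerance, and the sample numbers are illustrative assumptions rather than definitions taken from the cited studies.

```python
def pape(predicted_peaks, actual_peaks):
    """Peak absolute percentage error in percent (0 means a perfect fit)."""
    if len(predicted_peaks) != len(actual_peaks) or not actual_peaks:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - a) / abs(a) for p, a in zip(predicted_peaks, actual_peaks))
    return 100.0 * total / len(actual_peaks)


def hit_rate(predicted_times, actual_times, tolerance):
    """Percentage of predicted peak times within +/- tolerance of the actual time."""
    if len(predicted_times) != len(actual_times) or not actual_times:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for p, a in zip(predicted_times, actual_times) if abs(p - a) <= tolerance)
    return 100.0 * hits / len(actual_times)


# Hypothetical daily peaks (MW) and peak hours for four days.
pred_peaks = [110.0, 90.0, 100.0, 120.0]
true_peaks = [100.0, 100.0, 100.0, 100.0]
pred_hours = [18, 19, 17, 20]
true_hours = [18, 18, 18, 18]

print(pape(pred_peaks, true_peaks))
print(hit_rate(pred_hours, true_hours, tolerance=1))  # 75.0
```

With a one-hour tolerance, three of the four predicted peak hours count as hits; widening the tolerance to two hours would make all four count.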

3 Peak load demand forecast methods

Traditional forecasting methods can be roughly divided into qualitative and quantitative analysis [7-30]. Qualitative analysis refers to the use of expert opinions to develop theoretical insights for prediction, such as curve fitting and extrapolation techniques, on the premise that historical data are not available or technical experiments are not feasible [7-30]. Quantitative analysis, on the other hand, presumes historical data are available and the future development of data still follows the changing trend of historical data within a reasonable range. In quantitative analysis, mathematical or statistical methods are usually used.

Following the literature obtained in Section 1.2, the number of publications for each method is summarized in Figure 3 while Figure 4 shows the timeline of each method being first used for peak load forecast. Finally, the peak load demand forecast methods are categorized according to the development timeline and types in Figure 5.

Figure 3: The number of publications for each type of methods.
Figure 4: Time line of the first time each method was used for peak load demand forecast.
Figure 5: Three stages of peak load demand forecast methods.

According to Figures 3, 4 and 5, the development of peak load demand forecast methods can be broadly classified into three stages: the manual/human expert stage starting in the late 1950s, the classic stage starting in the early 1970s, and the advanced stage starting in the early 1990s. There are few studies in the manual/human expert stage. In the classic stage, regression is the most popular method for peak load demand forecast, followed by time series decomposition, stochastic time series models, and exponential smoothing. In the advanced stage, artificial neural network (ANN) based methods are the most favored choice.

In the following we will provide a detailed review of methods in each stage.

3.1 Manual/Human expert peak load demand forecast stage

A manual peak load demand forecast method that transforms the forecasted weather into daylight illumination parameters was proposed in [1-1956A]. The obtained results were then combined with the peak load demand table to calculate the peak load increment, which was then applied to load curves to forecast daily peak load demand one day ahead. However, before 1971 most studies estimated peak load demand based on simple calculations of variable relations or on human experts' opinions. Because of the special characteristics of peak load demand, most results obtained by over-relying on such calculation and human experience were unsatisfactory.

3.2 Classic peak load demand forecast stage

3.2.1 Regression

Regression analysis determines the parameters of a functional expression from historical data, and thus obtains the causal relationship between the explanatory variables $X = (x_1, \dots, x_p)$ and the response variables $Y$ (peak loads) to produce the peak load demand forecast [8-Turner2012Regression]. Mathematically, a general regression model can be represented by Eq (8):

$$Y = f(X, \beta) + \varepsilon \qquad (8)$$

where $\beta$ represents the model coefficients to be learned from data and $\varepsilon$ denotes the residual left unexplained by the model.

The above model is said to be univariate regression when $Y$ describes a univariate random variable; otherwise, it is multivariate regression. Besides, if there is only one explanatory variable (i.e., $p = 1$), the model is simple regression; otherwise, it is multiple regression. A parametric regression model is based on a known form of $f$, in which the parameters need to be estimated. When there is a linear relationship between the parameters $\beta$ and the explanatory variables $X$, the model is said to be linear; otherwise, it is non-linear [8-Turner2012Regression].
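As a concrete instance of the simplest case (univariate, simple, linear), the sketch below fits a daily peak load against a single explanatory variable by ordinary least squares, assuming the linear form peak = β0 + β1 · temperature. The data and function names are made up for illustration and do not come from any reviewed study.

```python
def fit_simple_regression(x, y):
    """Ordinary least squares for y = beta0 + beta1 * x (one explanatory variable)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta1 = sxy / sxx
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1


def predict(beta0, beta1, x_new):
    """Point forecast of the peak load for a new value of the explanatory variable."""
    return beta0 + beta1 * x_new


# Hypothetical data: daily maximum temperature (deg C) and daily peak load (MW),
# generated without noise from peak = 50 + 2 * temperature.
max_temp = [25.0, 28.0, 30.0, 33.0, 35.0]
peak_load = [100.0, 106.0, 110.0, 116.0, 120.0]

beta0, beta1 = fit_simple_regression(max_temp, peak_load)
print(beta0, beta1)
print(predict(beta0, beta1, 32.0))
```

Because the synthetic data are noise-free, the fitted coefficients recover the generating values up to floating-point rounding; with real peak load data the residual term would be non-zero, and further explanatory variables would turn this into a multiple regression.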

Most of the obtained papers used univariate regression and selected multiple explanatory variables to get precise forecast results. [9a-1990A] proposed a multiple regression-based approach that took calendar effects into account to forecast short-term system load. The approach produced forecasts using four models: an initial peak forecast regression model, an initial hourly forecast regression model, an adjusted peak model, and an adjusted hourly model. All four models were based on regression, and the forecasted initial peak load and the maximum hourly load were combined with past errors in the adjusted peak model to produce the new peak forecast. Then the new peak forecast was used as a constraint in the adjusted hourly model to produce the final hourly forecasts. The proposed model was more flexible in handling the effects of special days and avoided iteratively accumulating residuals in multiple-day forecasts, since past forecast errors were considered as one of the influential variables. A regression-based model was proposed in [94-6934706], which considered econometric effects such as GDP, consumer price index, and population as extra explanatory variables to perform LTPLF for Zimbabwe. [28newr-1999reg] included a new variable, average wind chill, for the winter season, and considered the effects of holidays by using transformation and reflection techniques to produce better daily peak forecasts. [19-2002Regression] adopted a multiple regression model to linearise the load trend for Tokyo Electric Power Company. The model is simple and has a promising peak load demand forecast performance with a MAPE of 1.43%. However, the performance of the model on a dataset with more fluctuating load patterns was not satisfactory.

A few papers use probabilistic forecasts in the regression framework for peak load demand forecast. Instead of providing a single point estimate of the peak load demand, a probabilistic forecast predicts distribution intervals, which quantify the uncertainty inherent in the demand. [70-2008Density] presented a new methodology to forecast annual and weekly peak load demand. This method adopted semi-parametric additive models to estimate the relationships between predictors and the peak load demand. As in [94-6934706], weather, calendar, economic, and population variables were considered, and the results showed a remarkable improvement in forecast accuracy. [90-2014Long] modeled monthly peak load demand as a function with the monthly peak time and the monthly peak load as two key variables. In this paper, a Gaussian process was used to forecast peak load demand. It also proposed a method to optimize the hyper-parameters of the kernel function, which was vital to improving forecast accuracy. [95-2014Non] utilized Alternating Conditional Expectation (ACE) to model hourly peak load during a month. Unlike most multiple linear regression models, the seasonal and trend components in this model did not require a priori decomposition, and the non-parametric transformed functions could be obtained through ACE. In the model, weather variables such as temperature and humidity were analyzed and used as inputs to perform a probability density peak load demand forecast.
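To make the difference between a point forecast and an interval forecast concrete, the sketch below wraps a point forecast in an interval built from the empirical quantiles of past forecast residuals. This is a deliberately simple stand-in and not the semi-parametric, Gaussian process, or ACE approach of the cited papers; the function name, the residual sample, and the 80% coverage level are assumptions.

```python
import statistics


def empirical_interval(point_forecast, residuals, coverage=0.8):
    """Prediction interval from empirical quantiles of past residuals.

    residuals: past values of (actual peak - forecast peak).
    coverage: nominal central coverage, e.g. 0.8 for the 10th-90th percentiles.
    """
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be strictly between 0 and 1")
    # 99 cut points: cuts[k - 1] is the k-th percentile of the residuals.
    cuts = statistics.quantiles(residuals, n=100, method="inclusive")
    lower_pct = min(99, max(1, round((1 - coverage) / 2 * 100)))
    upper_pct = min(99, max(1, round((1 + coverage) / 2 * 100)))
    return point_forecast + cuts[lower_pct - 1], point_forecast + cuts[upper_pct - 1]


# Hypothetical residuals (MW) spread evenly from -50 to 50.
past_residuals = list(range(-50, 51))
lower, upper = empirical_interval(100.0, past_residuals, coverage=0.8)
print(lower, upper)
```

The interval widens as past errors become more dispersed, which is the kind of uncertainty information a single point estimate cannot convey.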

Regression models are also often combined with other techniques to improve forecast accuracy. [27newdaa-Haida1998Peak] extended the model in [19-2002Regression] by using trend cancellation and estimation techniques to minimize the effects of transitional seasons. [9s-Barakat1990Short] performed monthly peak load demand forecast for the central region of Saudi Arabia, where three time-series methods (Census-II multiplicative decomposition, seasonal auto-regressive integrated moving average (SARIMA), and Winters' seasonal smoothing) were combined with the regression model. [73aa-GOIA2010700] considered the intra-daily seasonality effect and proposed a new method to forecast daily peak load based on functional data analysis. They first introduced functional clustering to obtain groups that contain similar load usage patterns. Then each group was assigned a specialized functional regression model. Finally, functional linear discriminant analysis was applied to assign new curves to the classified groups to perform peak load demand forecast. The proposed method demonstrated promising performance.

Some representative regression methods for peak load forecast are summarized in TABLE 3. The advantage of regression analysis is that the model is usually simple and easy to understand, with fewer parameters and higher forecast efficiency. However, regression analysis usually makes assumptions about the historical data and does not consider the correlation between different time trends, which could limit its applications in some cases. In the following sections, we will review time series models for peak demand forecast, which consider the potential correlation between historical data at different time points.

Reference | Model detail | Input variable(s) | Forecast horizon | Geographic scope | Forecast output(s) | Performance
 | A composite | Daily peak load, weather variables (base ambient temperature, mean value of maximum and minimum ambient temperatures), calendar variables (holiday, day of the week) (9 years) | Monthly peak demand forecast | Consolidated Electric Company of Central Region of Kingdom of Saudi Arabia | Load values (monthly peak load value) | 
 | Modeled temperature, holiday and other special effects using binary variables to build a time-independent peak load demand forecast model | Daily peak load, weather variables (maximum temperature), calendar variables (holiday, day of week) (6 years) | Daily peak demand forecast | Pacific Gas and Electric Company | Load values (next day’s peak load value, hourly loads) | 
 | Proposed a transformation function with translation and reflection methods to deal with the seasonal load change, annual load growth and the latest daily load change | Peak load of weekdays (except holidays), weather variables (daily maximum temperature, minimum temperature and humidity) (4 years) | Daily peak demand forecast | Tokyo Electric Power Company | Load values (next day’s peak load value) | 
[26news-1997Short] | Multiple linear regression | Fall and winter historical load, weather variables (daily maximum temperature, seven-day moving average of past-midnight temperature), calendar variables (day of week, month of the year, year) | Daily peak demand forecast | Puget Sound Power and Light Company | Load values (next day’s hourly load value, next day’s peak load) | 
 | Combined fuzzy system with MLR; weekdays and weekend days were assigned different models | Daily 15-minute peak load, daily energy consumption (4 months) | Daily peak demand forecast | Region (substations) | Load values (daily peak load) | Absolute error (work days; weekend days)
 | Built model based on functional data analysis. Functional clustering, functional linear regression and functional linear discriminant analysis were used | Daily load curve (hourly) for heating demand (four discrete periods covering four years) | Daily peak demand forecast |  | Load values (daily peak load) | 
Table 3: Regression analysis for peak load demand forecast

3.2.2 Time series decomposition

There are different time series decomposition methods, such as Fourier series analysis, wavelet methods and empirical mode decomposition (EMD).

A general time-series decomposition model usually adopts the addition or multiplication model to split the original time series into four sub-parts: secular trend (T), seasonal variation (S), cyclical variation (C), and irregular variation (I). In the context of peak load demand forecast:

  • The secular trend refers to the continuous change of peak load demand in a long period.

  • The seasonal variation refers to the regular seasonal change of peak load demand.

  • The cyclical variation is the periodic change in peak load demand over years.

  • The irregular variation refers to the unexpected change of the peak load demand caused by many random factors.

When predicting the future peak load, each component is first calculated separately, and then the forecasted value of each sub-part is passed to the addition model or the multiplication model to obtain the final prediction.

Denoting the peak load demand at time $t$ by $Y_t$, the addition model of time series decomposition is defined as:

$$Y_t = T_t + S_t + C_t + I_t$$

where the four components in the addition model are independent of each other, all of which are expressed in absolute quantities and of the same order of magnitude.

On the other hand, the multiplication model of time series decomposition is:

$$Y_t = T_t \times S_t \times C_t \times I_t$$

Different from the addition model, the four components of the multiplication model are dependent on each other. In general, the secular trend in the multiplication model is expressed in absolute quantity, while other components are expressed in relative quantity.

When $S_t$, $C_t$ and $I_t$ do not change over time, the addition model is usually selected. Otherwise, the multiplication model could be a better choice. It should, however, be noted that the addition and multiplication models are convertible into each other, with the log function being one effective conversion method [7-30].
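As a minimal illustration of the log conversion between the two forms, the following sketch builds a synthetic monthly peak load from hypothetical components. All names and values are assumptions made for illustration (they are not data from any reviewed paper), and the cyclical component is omitted for brevity.

```python
import math

# Hypothetical components for 24 months (illustrative values only).
months = range(24)
trend = [100.0 + 2.0 * t for t in months]                               # T: absolute quantity
season = [1.0 + 0.2 * math.sin(2 * math.pi * t / 12) for t in months]   # S: relative factor
irregular = [1.0 + 0.01 * ((-1) ** t) for t in months]                  # I: relative factor

# Multiplication model: Y = T * S * I (cyclical term C omitted for brevity).
y_mult = [T * S * I for T, S, I in zip(trend, season, irregular)]

# Taking logarithms turns the multiplication model into an addition model:
# log Y = log T + log S + log I.
log_y = [math.log(y) for y in y_mult]
log_sum = [math.log(T) + math.log(S) + math.log(I)
           for T, S, I in zip(trend, season, irregular)]

max_gap = max(abs(a - b) for a, b in zip(log_y, log_sum))
print(f"largest difference between log Y and the sum of logs: {max_gap:.2e}")
```

In this sketch the seasonal and irregular terms are relative factors around 1, which matches the remark that only the secular trend of the multiplication model is expressed in absolute quantity.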

In [2-Gupta1971A], the addition model was utilized to produce a monthly probabilistic peak load demand forecast. The study also utilized Fourier transformation to reduce the non-stationary time series to a stationary series. Moreover, Monte Carlo simulation was adopted to simplify the computation process.

[5-Fong2011The] designed a comparative experiment to compare the time-series model using Fourier series with the auto-regressive model. Weekly peak load demand forecasts for one year ahead were produced, which showed that although over-forecasts exist, the model using Fourier expansion could track the dynamic behavior of peak load demand and produce better results than the auto-regressive model.

[68-2009The] utilized a multiplicative model to forecast the monthly peak load on a regional power grid. [8-Turner2012Regression] utilized the decomposition method to develop a multi regression-decomposition model. The proposed method aimed to forecast monthly peak loads for one year ahead, and the result was promising with a MAPE of 7.88%. The advantage of this method is that it related the historical load trend to diverse influencing factors and could simulate additional cyclic effects. [22-Choi1996A] used real-world load data from Korea Electric Power Corporation to perform one-day-ahead daily peak load demand forecast. In this paper, Fourier transformation was adopted to identify the chaotic characteristics of the time series, and the optimal embedding dimension and delay time were determined and used as inputs to train an artificial neural network (ANN) model. The MAPE for daily peak load demand forecast of the proposed model was close to 1.4%.

As aforementioned, wavelet transformation is also a traditional time series decomposition method. It can transform the information from the time domain to the frequency domain, and thus capture the low-frequency and high-frequency components of the peak load signal. In [88aaa-2013A], wavelet decomposition was combined with other advanced methods to build a hybrid model. Daily peak load demand forecast for Iran National Grid was conducted based on the proposed model.

Empirical mode decomposition, as a more recent time series decomposition method, is also used in peak demand forecast. In [66-2009Long], empirical mode decomposition was proposed to capture long-run seasonality, short-run effects, and trend effects for the daily peak load. The load decomposition results given by EMD carry physical meaning related to the time series characteristics, and thus can improve the forecast accuracy.

Note that time series decomposition uses deterministic functions to extract information. Therefore, it often ignores the stochastic factors of the original time series, resulting in insufficient information extraction. This can be compensated for by stochastic time series models, as we will discuss in the next subsection.

3.2.3 Stochastic time series models

Stochastic time series models can be generally divided into: the auto-regression model AR($p$), where $p$ denotes the order of auto-regression; the moving average model MA($q$), where $q$ is the order of moving average; the auto-regression moving average model ARMA($p$, $q$); the auto-regression integrated moving average model ARIMA($p$, $d$, $q$), where $d$ denotes the order of integration; and the seasonal auto-regression integrated moving average model SARIMA($p$, $d$, $q$)($P$, $D$, $Q$)$_s$, where $P$, $D$, $Q$ are the seasonal parts of the model corresponding to $p$, $d$, $q$.

A SARIMA model may be written as [80-2011Discrete]:

$$\phi_p(B)\,\Phi_P(B^s)\,\nabla^d\,\nabla_s^D\, y_t = \theta_q(B)\,\Theta_Q(B^s)\,\varepsilon_t$$

where $y_t$ is the peak load demand observed at time $t$. $\nabla^d = (1-B)^d$ and $\nabla_s^D = (1-B^s)^D$ denote the non-seasonal and seasonal difference operators respectively ($s$ is the seasonal length), which transform $y_t$ into a stationary time series. $B$ is the backshift operator, which is used to represent the backshift of time. When $B$ is applied to $y_t$, it shifts $y_t$ back by one unit of time ($B y_t = y_{t-1}$). For monthly data, $B^{12} y_t$ represents the data of the same month of the previous year. $\phi_p(B)$ and $\Phi_P(B^s)$ are the non-seasonal and seasonal auto-regression operators, respectively. $\theta_q(B)$ and $\Theta_Q(B^s)$ are the non-seasonal and seasonal moving average operators, respectively. $p$, $P$, $q$, $Q$ denote the maximum backshift orders of the non-seasonal and seasonal auto-regression operators and of the non-seasonal and seasonal moving average operators, respectively. $\varepsilon_t$ is the white noise at time $t$. The above model can be represented by SARIMA($p$, $d$, $q$)($P$, $D$, $Q$)$_s$, where $s$ represents the seasonal length of the time series, and $s = 12$ represents a monthly time series. When $P = D = Q = 0$, the SARIMA model degenerates into an ARIMA model, and when $p = d = q = P = D = Q = 0$, the SARIMA model degenerates into the white noise process.

It is worth pointing out that AR, MA, and ARMA models are suitable for weakly (wide-sense) stationary time series. ARIMA is used for non-stationary time series, and SARIMA can deal with non-stationary and cyclical time series.
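The role of differencing in making a series stationary can be sketched with the backshift idea alone. The example below is a minimal illustration, assuming a hypothetical monthly peak load made of a linear trend plus a repeating 12-month pattern; the helper name `difference` and all values are assumptions for illustration, and no model fitting is performed.

```python
def difference(series, lag=1):
    """Apply the operator (1 - B^lag): subtract the value observed `lag` steps earlier."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Hypothetical monthly peak load: linear trend plus a repeating 12-month pattern.
pattern = [30, 25, 10, 0, -5, 15, 40, 45, 20, 0, 5, 25]
load = [500 + 3 * t + pattern[t % 12] for t in range(48)]

seasonal_diff = difference(load, lag=12)       # seasonal difference removes the yearly pattern
stationary = difference(seasonal_diff, lag=1)  # non-seasonal difference removes the remaining trend

print(seasonal_diff[:3])  # [36, 36, 36]
print(stationary[:3])     # [0, 0, 0]
```

For this noise-free series, one seasonal and one non-seasonal difference leave a constant zero series, which is the situation in which an ARMA-type model can be applied to the differenced data.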

[5-El1986Weekly] developed two models to forecast weekly peak load one year ahead, where an MA model was utilized for the seasonal-cyclic component. In [12-1992New], ARMA was used for monthly peak load demand forecast by considering the seasonal patterns and load fluctuations. Some hybrid methods combining ARMA with other methods, such as regression models [63newm-5398613] and ANN [133-2020The], have also been applied to peak demand forecast.

Reference | Model detail | Input variable | Forecast horizon | Geographic scope | Forecast output | Performance
 | Modified ARIMA (incorporate ARIMA with human experience) | Hourly loads (in the range of peak times), weather variables (current temperature), calendar variables (day of year, which was distinguished based on average temperature), human operator’s estimation of current peak load | Daily peak demand forecast | (Iran’s power | Load values (next day’s peak load value, hourly | Hourly loads: 1.45%- 1.99%; Daily peak loads:
 | ARMA(6,5), use to extrapolated maximum temperature for forecasted periods | Monthly peak load demand, weather variables (maximum temperature), calendar variables (time, months) | Monthly peak demand forecast | Saudi Consolidated Electric Company | Load values (monthly peak load demands for next year) | Most months were observed to fall between the upper and lower limits of the forecasts
 | generalized autoregressive conditional heteroskedastic errors (SARIMA-GARCH), | Daily peak load demand (the maximum hourly demand in a 24-hour period), weather variables (maximum temperature), calendar variables (day of week, holiday, month of a year) | Daily peak demand forecast | Industrial, commercial and domestic sectors of South Africa | Load values (monthly peak demands for next year) | 
 | SARIMA models with different length of historical data | Daily peak load demand | Daily peak demand forecast | (New South Wales, | Load values (daily peak load demands) (one to seven days ahead) | 
Table 4: Stochastic time series models for peak load demand forecast

When practical conditions such as economic and cultural factors are considered, they often lead to non-stationary time series problems. For such cases, ARIMA and SARIMA are widely used. [30-Amjady2001Short] utilized ARIMA models to forecast hourly loads and daily peak load demand, incorporating human experience within the model. The proposed approach adopted ARIMA to produce a raw output as the initial input of the modified model, and also took temperature into account to distinguish hot and cold days for a regression-based analysis. The results of the proposed models were compared with ANN, standard ARIMA, and human operators, and the modified ARIMA models were more accurate than the other models. [57aat-article] used load data from Dubai to build models based on ARIMA and dynamic regression to forecast monthly peak load, for which a score of 0.997 was reported. [57aam-2007Monthly] developed a model based on SARIMA to forecast monthly peak load for Sulaimany Governorate in northern Iraq. They identified an adequate SARIMA model, and the forecast results gave power planners better opportunities to determine the maximum generating capacity for peak load demand. The paper also pointed out that ARIMA was suitable for short-term forecasting since it prioritized the more recent part of the time series. [83aaf-2012Forecasting] utilized SARIMA to forecast monthly peak load demand for India. The SARIMA model outperformed the official load forecasts provided by the Central Electricity Authority (CEA) for both static and dynamic horizons in all five regional grids in India. [8-78] compared SARIMA with Holt-Winters multiplicative exponential smoothing and showed that the SARIMA model produced better forecast results.

Although ARIMA and SARIMA models have shown promising forecast results, such as in [30-Amjady2001Short][57aam-2007Monthly][73-5697708][99-7117006][102-7399017], they usually do not take into account trend fluctuations in the data. Variations of ARIMA and SARIMA have therefore been proposed in some studies to overcome this limitation and to further improve the forecast accuracy. [56aad-4202249] combined generalized autoregressive conditional heteroskedastic errors (GARCH) with ARIMA to define the maximum peak load demand level by considering the unexpected randomness of the load series. [82aap-2011Prediction] presented a regression-SARIMA model with generalized autoregressive conditional heteroskedastic errors (Reg-SARIMA-GARCH), which could accommodate the volatility of the daily peak load demand and the multiple seasonality of the mid- and long-term peak demand. The proposed model was used to conduct daily peak forecast for South Africa, and the comparative experiment showed that the proposed model produced better prediction accuracy than the piecewise linear regression model, the SARIMA model, and the SARIMA-GARCH model.

Some representative stochastic time series methods for peak load forecast are summarized in TABLE 4.

As aforementioned, the historical load data for peak load demand forecast are characterized by their high randomness. Therefore, the stochastic time series model is widely used in peak load demand forecast as an effective method to deal with random sequences. However, there are some limitations in the stochastic time series methods. For instance, the MA model gives the same weight to all the time series data, which does not necessarily reflect the actual situation. As such, a more flexible assignment of weights to the data needs to be considered. In the next subsection, we discuss exponential smoothing, which can overcome this limitation.

3.2.4 Exponential smoothing

Exponential smoothing is a time series analysis method developed based on MA. Exponential smoothing predicts the future peak load according to the weighted average of the historical time series. The recent data are given a larger weight whereas the previous data are given a smaller weight. This is based on the principle that the influence of a certain variable on subsequent behavior is gradually attenuating [7-64-incollection].

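A general exponential smoothing model for peak load demand forecast can be written as [7-30]:

$$\hat{y}_{t+1} = \alpha\, y_t + (1 - \alpha)\, \hat{y}_t$$

where $\hat{y}_t$ is the forecasted peak load demand at time $t$, $y_t$ is the observed peak load demand at time $t$, and $\alpha$ ($0 < \alpha < 1$) is the smoothing parameter that controls how quickly the weights decrease (exponentially).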
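The recursion behind simple exponential smoothing can be sketched in a few lines. This is a minimal illustration: the function name, the choice of initialising the first forecast with the first observation, and the sample values are assumptions, not details taken from the reviewed papers.

```python
def exponential_smoothing(observed, alpha):
    """Return one-step-ahead forecasts; forecasts[t] is the forecast for observed[t]."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    forecasts = [observed[0]]  # assumption: initialise with the first observation
    for y in observed:
        # New forecast = weighted average of the latest observation and the previous forecast.
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts  # the last element is the forecast for the next, unseen period

# Hypothetical daily peak loads (MW).
peaks = [100.0, 110.0, 105.0, 120.0]
forecasts = exponential_smoothing(peaks, alpha=0.5)
print(forecasts)  # [100.0, 100.0, 105.0, 105.0, 112.5]
```

Unrolling the recursion shows that an observation made k steps in the past receives a weight proportional to alpha times (1 - alpha) to the power k, which is why recent data dominate the forecast.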

Exponential smoothing can be divided into several different forms. [26aa-27] provides a comprehensive review of exponential smoothing methods, in which 17 basic methods and some extensions based on these methods are described in detail. In general, single exponential smoothing is applied to sequences without trends or seasonality, and double exponential smoothing is applied to time series that only have trends. The triple exponential smoothing (also known as Holt-Winters) targets sequences with both trends and seasonality. When modeling seasonal data, a Holt-Winters model consists of three smoothing equations, for the level, trend, and seasonality components, each with its own smoothing parameter [17du12-Yaffee2000Introduction]. When the seasonal variations are roughly constant and independent of the level of the time series, the additive Holt-Winters model can be used. However, if the seasonal variations change proportionally with the level of the time series, the multiplication model can be chosen to predict the seasonal data.
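The three smoothing equations of the additive Holt-Winters model can be sketched as follows. This is a minimal illustration under stated assumptions: the initialisation rule (level from the first season, trend from the change between the first two seasons) is one common heuristic chosen here for brevity, and the function name, parameter values, and data are hypothetical.

```python
def holt_winters_additive(y, period, alpha, beta, gamma, horizon):
    """Additive Holt-Winters forecast for the next `horizon` periods."""
    if len(y) < 2 * period:
        raise ValueError("need at least two full seasons of data")
    # Assumed initialisation: level = mean of the first season, trend = average
    # per-step change between the first two seasons, seasonal = first-season deviations.
    level = sum(y[:period]) / period
    trend = (sum(y[period:2 * period]) - sum(y[:period])) / period ** 2
    seasonal = [y[i] - level for i in range(period)]
    for t in range(period, len(y)):
        s = seasonal[t % period]
        prev_level = level
        level = alpha * (y[t] - s) + (1 - alpha) * (prev_level + trend)  # level equation
        trend = beta * (level - prev_level) + (1 - beta) * trend         # trend equation
        seasonal[t % period] = gamma * (y[t] - level) + (1 - gamma) * s  # seasonal equation
    n = len(y)
    return [level + h * trend + seasonal[(n + h - 1) % period]
            for h in range(1, horizon + 1)]

# Hypothetical series: constant level of 100 plus a repeating 4-period pattern.
pattern = [20.0, -10.0, -30.0, 20.0]
history = [100.0 + pattern[t % 4] for t in range(12)]
forecast = holt_winters_additive(history, period=4, alpha=0.3, beta=0.1,
                                 gamma=0.2, horizon=4)
print([round(v, 6) for v in forecast])
```

For this trend-free series the forecast simply repeats the seasonal pattern around the level of 100; with real peak load data the three smoothing parameters would have to be tuned.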

The triple exponential smoothing is the most commonly used method in the reviewed papers. [9a-1990A] used exponential smoothing to correct the forecast values that were consistently too high or too low, and the adjusted model showed good capability to track the fast-changing load demand and produce hourly forecasts with higher accuracy. [26aa-Masood1997EDSSF] proposed a decision support system based on a variety of time series techniques. The near-optimal monthly peak forecast models were built by exponential smoothing, Box-Jenkins vector, and dynamic regression to perform short-term peak load demand forecast. Moreover, a comprehensive assessment of the models was provided by using several evaluation indicators such as MAPE, Akaike information criterion (AIC), Bayesian information criterion (BIC), and MSE. The results of the proposed system showed that different models performed differently for different regions of the country.
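Two of the evaluation indicators mentioned above, MAPE and MSE, follow directly from their standard definitions. The sketch below uses hypothetical actual and forecasted peak loads; the function names and values are assumptions for illustration.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

def mse(actual, forecast):
    """Mean squared error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

# Hypothetical peak loads (MW) and their forecasts.
actual = [100.0, 200.0]
forecast = [110.0, 190.0]
print(mape(actual, forecast))  # 7.5 (up to floating-point rounding)
print(mse(actual, forecast))   # 100.0
```

MAPE is scale-free, which makes it convenient for comparing studies on grids of different sizes, whereas MSE penalises large errors more heavily.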

When trends and seasonal variations dominate the time series, the Holt-Winters exponential smoothing usually outperforms the ARIMA. This was confirmed by [38aa-J2017Short], which used the Holt-Winters Smoothing to forecast peak load demand for England and Wales. The model they built described the intra-daily and intra-weekly seasonal cycles, and the comparative results showed that this approach achieved a better accuracy than the ARIMA model.

Exponential smoothing is often used in combination with other methods to build composite models. [9a-1990A] applied exponential smoothing to a regression model and compared it with the regression model with ARIMA, and the experiment revealed that most of the initial forecasts were corrected after applying the exponential smoothing. However, the smoothing coefficient of exponential smoothing needs to be selected manually, and if the time series fluctuates wildly (e.g., the peak load), it will produce unsatisfactory prediction results [billah2006exponential].

3.2.5 Kalman filter and grey prediction

The Kalman filter (KF) is a recursive algorithm built on a linear state-space description of a system, which estimates the system state optimally from the input and output observations of the system. Since the observed data include noise, the optimal estimation can also be regarded as a filtering process. KF comprises two processes: prediction and correction. During prediction, the filter forecasts the current state using the state estimate from the previous timestep. During correction, observations of the current state are used to correct the predicted value obtained in the prediction phase, yielding an improved estimate. Although KF is best known as a recursive state estimator for linear systems, some KF variants can also handle non-linear systems [Kalman1].
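The prediction and correction phases can be sketched with a scalar Kalman filter. This is a minimal illustration, assuming a random-walk state model (the load level persists from one step to the next) and hypothetical noise variances and measurements; it does not reproduce the formulation of any reviewed paper.

```python
def kalman_1d(measurements, process_var, measurement_var, x0, p0):
    """Scalar Kalman filter; returns the corrected state estimate after each measurement."""
    x, p = x0, p0  # state estimate and its variance
    estimates = []
    for z in measurements:
        # Prediction: under the assumed random-walk model the state carries over
        # unchanged, while its uncertainty grows by the process noise variance.
        p = p + process_var
        # Correction: blend the prediction with the new observation.
        gain = p / (p + measurement_var)  # Kalman gain, between 0 and 1
        x = x + gain * (z - x)
        p = (1 - gain) * p
        estimates.append(x)
    return estimates

# Hypothetical noisy daily peak load readings (MW).
readings = [498.0, 503.0, 501.0, 499.0, 502.0]
estimates = kalman_1d(readings, process_var=1.0, measurement_var=4.0,
                      x0=480.0, p0=100.0)
print([round(e, 2) for e in estimates])
```

A large initial variance `p0` makes the filter trust the first reading almost completely; afterwards the gain settles and the estimate becomes a smoothed version of the readings.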


One reviewed study presented a hybrid learning scheme consisting of unsupervised and supervised learning phases to forecast daily and weekly peak/average load profiles. The KF-based learning algorithm was used to find the optimum parameters and functions in the supervised learning phase. [85aab-2012Building] selected KF as one of the benchmarks when developing an integrated hybrid model for STPLF.

The grey system lies between a white box model and a black box model: it focuses on learning the internal structure, parameters, and general characteristics of a system, and tries to exploit the known information as much as possible. A grey time series prediction model is constructed from the observed historical time series that reflects the characteristics of the peak load to be predicted.

[61-2008The] used grey correlation theory in sensitivity analysis to select relevant meteorological variables for daily peak load demand forecast. [62-2008Electricity] developed a variable weight combination forecasting model by combining the grey model and the ARIMA model, which was used to forecast load consumption in the peak load month for MTPLF. The hybrid model proved reliable in handling the non-smooth characteristics of monthly load data and achieved satisfactory forecast accuracy. [20] proposed a hybrid grey model to forecast the yearly peak load and its date of occurrence simultaneously for LTPLF. The model needed only a small amount of historical annual peak load data to produce the forecast results, and it was claimed that the model was highly adaptive to dynamic changes of the yearly peak load.
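To make the idea of grey prediction from a short history concrete, the sketch below implements GM(1,1), a widely used basic grey model. The text above does not name a specific grey model, so the choice of GM(1,1) is an assumption, as are the function name and the synthetic data; the hybrid models in the cited studies are more elaborate.

```python
import math

def gm11_forecast(x0, steps):
    """Forecast `steps` future values of a short positive series with GM(1,1)."""
    n = len(x0)
    # Accumulated generating operation: x1[k] = x0[0] + ... + x0[k].
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)
    # Background values: means of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    # Least-squares fit of x0[k] = -a * z[k] + b (ordinary simple regression on u = -z).
    u, y, m = [-v for v in z], x0[1:], n - 1
    su, sy = sum(u), sum(y)
    suu = sum(v * v for v in u)
    suy = sum(p * q for p, q in zip(u, y))
    a = (m * suy - su * sy) / (m * suu - su * su)  # development coefficient (assumed non-zero)
    b = (sy - a * su) / m                          # grey input

    def x1_hat(k):
        # Time response of the whitening equation dx1/dt + a * x1 = b.
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    # Restore forecasts of the original series by differencing the accumulated series.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]

# Hypothetical annual peak loads growing by 5% per year (five observations only).
history = [100.0 * 1.05 ** k for k in range(5)]
forecast = gm11_forecast(history, steps=2)
print([round(v, 2) for v in forecast])
```

With only five observations the model extrapolates the growth pattern closely for this synthetic series; real annual peaks are noisier, which is one motivation for the hybrid variants reviewed above.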

3.3 Advanced peak load demand forecast stage

With the emergence of artificial intelligence (AI) and big data, traditional AI-based techniques such as fuzzy logic (FL), expert system (ES) and genetic algorithm (GA) and modern AI and machine learning based methods such as artificial neural network (ANN) and deep learning, support vector machines (SVMs), and ensemble models have been adopted for peak load demand forecast.

3.3.1 Traditional AI-based methods

As a bridging stage between the classic and advanced methods, there are a few studies in the reviewed literature using traditional AI-based methods for peak load forecast.

Fuzzy logic imitates the uncertainty concept of judging and reasoning of the human brain. It applies fuzzy rules to the reasoning of the system with the uncertain model to deal with the fuzzy information that is difficult to handle by conventional methods. [27newdaa-Haida1998Peak] adopted separate fuzzy models to predict the peak and valley load, and the simulation results showed a good prediction accuracy. [28-1999The] proposed a fuzzy regression approach to peak load estimation. The effectiveness of the proposed method was demonstrated by forecasting daily load consumption and daily peak load demand at the distribution level.

Expert systems (ES) have also been used in peak load forecast. An ES is a knowledge-based computer system that absorbs the domain knowledge and experience of experts and makes intelligent decisions by reasoning over such knowledge and experience. A complete ES consists of the knowledge base, the reasoning machine, the knowledge acquisition part, and the interpretation interface. [36-2002Artificial] implemented a knowledge-based ES to forecast yearly peak load for both a typical fast-developing system and a regular developing system, and the knowledge base of this system was composed of both static and dynamic variables. The results showed that the knowledge-based ES yielded the best performance among all considered models (time series model, traditional ES model, econometric model).

Fuzzy logic has been combined with ES to produce better prediction results. [13aa-Hsu1992Fuzzy] built an ES based on fuzzy set theory to forecast hourly load in Taiwan by improving the estimation accuracy of the peak and trough loads. The proposed ES could handle uncertain weather variables and heuristic rules, and it could update peak and trough loads iteratively to produce a more accurate forecast. [30newa-30Kiartzis2000A] proposed a fuzzy ES to forecast the morning and afternoon valleys and the noon and evening peaks based on weather information and historical load data from the Greek power system. The results showed that the fuzzy ES could forecast daily peak and valley loads reasonably well compared with neural networks.

Genetic algorithm (GA) has also been adopted in building models for peak load forecast. GA is based on natural selection and population genetics, and makes the population evolve towards the optimal region of the search space through selection, crossover, mutation, evaluation, and other operations. GA can also be used to optimize the parameters of forecast models, such as the initial connection weights and the node threshold values of neural networks. In [85aab-2012Building], 13 years of regional data from France were utilized for training a real-valued genetic algorithm (RGA)-based neural network with support vector machine (NN-SVM) model. Daily load profile forecasts and monthly peak load demand forecasts were generated, and the comparative experiments showed that the proposed model was suitable for forecasting long-term peak load. [87aaa-El2013Electric] implemented a comparative analysis for yearly peak load demand forecast based on the unified Egyptian network data. In the experiment, models based on GA, least-square, and least absolute value filtering were trained separately. The results showed that the model developed based on GA gave the best performance with the lowest forecast error of 0.70%.
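The selection, crossover, mutation, and evaluation operations can be sketched with a small real-valued GA that tunes a single model parameter. This is a minimal illustration under stated assumptions: the objective function stands in for a forecast error, and the function name, operator choices (truncation selection, arithmetic crossover, Gaussian mutation), and settings are hypothetical rather than taken from the cited studies.

```python
import random

def genetic_minimise(fitness, bounds, pop_size=30, generations=60,
                     mutation_rate=0.2, seed=0):
    """Minimise `fitness` over one real-valued parameter within `bounds`."""
    rng = random.Random(seed)
    lo, hi = bounds
    population = [rng.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)             # evaluation: rank by fitness
        parents = population[: pop_size // 2]    # selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            w = rng.random()
            child = w * p1 + (1 - w) * p2        # crossover: arithmetic blend of two parents
            if rng.random() < mutation_rate:     # mutation: occasional random perturbation
                child += rng.gauss(0.0, 0.1 * (hi - lo))
            children.append(min(max(child, lo), hi))  # keep the child inside the bounds
        population = parents + children
    return min(population, key=fitness)

# Hypothetical objective: a forecast error that is smallest at parameter value 3.2.
def forecast_error(x):
    return (x - 3.2) ** 2

best = genetic_minimise(forecast_error, bounds=(0.0, 10.0))
print(round(best, 3))
```

Because the fitter half of each generation is carried over unchanged, the best solution found so far is never lost; in practice the fitness function would be the validation error of the forecast model being tuned.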

3.3.2 Artificial neural network and deep learning-based methods

ANN-based methods have been applied to peak load demand forecast since at least 1991 [11e-Saeed1991Electric], and they have attracted much attention. Many advanced methods based on ANN, such as deep learning methods, have since been proposed with good performance.

Artificial neural network (ANN) is inspired by the anatomy of the human brain, and it consists of artificial neurons arranged in multiple layers for information communication. An example structure of ANN is shown in Figure 6.


Figure 6: An example structure of ANN.

A typical ANN consists of the input layer, the hidden layer, and the output layer. Except for the input layer, each neuron in ANN is connected to neurons of the preceding layer (which act as its input neurons), with each connection corresponding to a weight. The sum of the products of all inputs and their corresponding connection weights is passed to an activation function to calculate each neuron’s final value, as shown in Figure 7.


Figure 7: The calculation process of a neuron in ANN.
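The calculation performed by a single neuron can be sketched as follows, using the Sigmoid function as the activation. The bias term, the function names, and the numerical values are assumptions added for illustration.

```python
import math

def sigmoid(x):
    """Sigmoid activation: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum of the inputs (plus an assumed optional bias), then activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(weighted_sum)

# Hypothetical normalised inputs (e.g. scaled temperature and previous-day peak).
print(neuron_output([0.5, -1.0], [2.0, 1.0]))  # weighted sum is 0, so the output is 0.5
```

A full layer repeats this calculation for every neuron, each with its own set of weights.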

The activation function needs to be selected according to data characteristics, and the Sigmoid function is one of the most commonly used activation functions of ANN models. One of the well-known ANNs is the backpropagation (BP) neural network, a multi-layer neural network trained with error backward propagation. BP is widely used for its satisfying performance on prediction tasks. It, however, suffers from high computational cost and low computational efficiency, and therefore the radial basis function network (RBFN) was introduced to deal with this. The input variables of RBFN pass directly to the hidden layer without additional weights, and RBFN has been shown to be less time-consuming than the traditional multi-layer neural network.
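The forward pass of an RBFN can be sketched as follows. This is a minimal illustration assuming Gaussian basis functions, which is a common choice but is not stated in the text; the function name, centres, widths, and output weights are hypothetical, and training is not shown.

```python
import math

def rbf_forward(x, centres, widths, out_weights):
    """Forward pass of a radial basis function network with Gaussian units (assumed)."""
    hidden = []
    for c, s in zip(centres, widths):
        # The input reaches each hidden unit directly (no input weights); the unit
        # responds according to the distance between the input and its centre.
        dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        hidden.append(math.exp(-dist_sq / (2.0 * s ** 2)))
    # Only the hidden-to-output connections carry weights.
    return sum(w * h for w, h in zip(out_weights, hidden))

# Hypothetical network with two hidden units in a two-dimensional input space.
centres = [[0.0, 0.0], [10.0, 10.0]]
output = rbf_forward([0.0, 0.0], centres, widths=[1.0, 1.0], out_weights=[2.0, 5.0])
print(round(output, 6))  # 2.0: the input coincides with the first centre
```

Since only the output weights enter linearly, they can be fitted by linear least squares once the centres and widths are fixed, which is one reason training can be faster than backpropagation through a multi-layer network.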


[11e-Saeed1991Electric] collected hourly temperature and load data from Seattle to build a model based on ANN, and the trained ANN was then used to forecast the daily peak load, the hourly load, and the total daily load. The mean error of the peak load forecast model ranged between 1.55% and 2.60%. This ANN model allowed a more flexible relationship between weather variables and load patterns. However, the model produced higher errors when people had specific start-up activities, which indicates that the use of additional calendar variables should yield better results. [37aa-2003Regional] utilized ANN to perform annual regional peak load demand forecast for Taiwan. The proposed model had three input neurons corresponding to economic, demographic, and weather variables, two hidden neurons, and one output neuron representing the regional peak load to be estimated. The effectiveness of the proposed model was demonstrated by comparing its forecast accuracy with that of a regression-based model. [63newp-Saini2008Peak] forecasted daily peak load up to seven days ahead based on a feed-forward neural network trained with the steepest descent, Bayesian regularization, resilient, and adaptive backpropagation learning methods. [108aaa-2016Artificial] presented a new ANN-based model that employed a Bayesian regularized neural network with the Levenberg-Marquardt (LM) backpropagation algorithm to forecast daily peak load demand for a commercial building complex. [74aa-article] forecasted annual peak load demand five years ahead for Iran based on RBF. The paper selected variables related to the yearly incremental growth rate and pointed out that long-term forecasting should pay more attention to economic factors than to weather conditions.

There are a few studies combining multiple ANNs to improve the forecast accuracy, in which the peak load is usually generated as a by-product to further enhance the forecast performance. [27] developed a model based on cascaded ANNs (CANNs) to forecast the load profile one day ahead, where the daily peak, valley, and total load consumption were first estimated by an ANN, and these forecasted values were then used as additional input data for the next day’s load profile forecast. The results revealed that the cascaded structure of ANN could produce more satisfactory forecasting results. [89aai-Hern2013Improved] proposed a multi-stage ANN to forecast load demand in two stages. First, the daily peak and valley values were generated by an ANN. Second, based on the peak and valley load, the whole electricity demand curve was produced. This method outperformed a standard ANN by significantly reducing the MAPE. Although multiple ANNs can provide promising results for load forecasting, [89aai-Hern2013Improved] revealed that the multi-stage ANN suffers from higher computational complexity than a single ANN.

In general, ANN has clear advantages in its adaptive learning and function approximation capabilities. Since ANN can deal well with the high randomness and uncertainty of the time series, it is recommended for STPLF. However, ANN models suffer from long training times and are prone to getting trapped in local optima. Therefore, researchers have mostly focused on optimizing the neural network structure, for example by combining it with fuzzy logic [63newm-5398613][25-Mandal1997Fuzzy][99aaa], to further improve the model training efficiency and forecasting accuracy.

Many modern and advanced methods have emerged based on ANN, such as the self-organizing map (SOM), recurrent neural network (RNN), and convolutional neural network (CNN). In particular, long short-term memory (LSTM) is a variant of RNN.


SOM is often used as an auxiliary method, owing to its unsupervised learning characteristics, together with other forecast methods to produce the final results. In [10d-1991Design, 10-1991Design], SOM was adopted to cluster days with similar load consumption patterns. Then, based on a feed-forward multilayer neural network, the daily peak load and valley load were estimated to compute the desired hourly load. [63newc-AMINNASERI20081302] adopted SOM to cluster load profiles and principal components analysis (PCA) to reduce the dimensions of the data. Then, a separate feed-forward neural network was trained for each cluster. The comparative analysis demonstrated the superiority of the proposed method.

RNN and CNN are commonly used deep learning methods, which have more complex network structures, such as more hidden layers and recurrent structures. Deep learning models can better capture the dynamic characteristics of peak load to provide a more accurate and stable prediction and have more robust learning and generalization ability than the standard ANN, especially in the big data era. [126-2019Deep] proposed a method to combine RNN with dynamic time warping (DTW) for short-term peak load demand forecast. The DTW was introduced to identify load curves with similar trends, and a bespoke gated RNN was trained to forecast daily peak load demand one month ahead based on the half-hourly load data. The proposed method achieved a satisfactory MAPE of 1.01%. In addition, comparative analysis suggested that the DTW distance had the ability to adapt to the dynamic change of non-stationary daily peak load series. In [134-8994442], the LSTM layer was adopted to forecast weekly peak load in Korea. In this study, input variables including weekly peak load, weekly temperature, and weekly GDP of the previous year were used. The LSTM layer in this paper was proved to be able to capture more useful characteristics of the load data, and results showed good forecast accuracy with the lowest forecast error of 2.16%.
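The dynamic time warping distance used to identify load curves with similar trends can be sketched with the classic dynamic-programming recursion. This is a minimal illustration of the textbook formulation, not the specific variant of [126-2019Deep]; the function name and the toy load curves are assumptions.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical load curves: the second has the same shape, but its rise starts one step later.
curve = [1, 2, 3, 2, 1]
delayed = [1, 1, 2, 3, 2, 1]
print(dtw_distance(curve, delayed))  # 0.0: the shapes match after warping
print(dtw_distance([0, 0], [1, 1]))  # 2.0
```

Unlike a point-by-point distance, DTW allows samples to be stretched in time, so two days whose peaks occur at slightly different hours can still be recognised as similar.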

About 60 studies have employed ANN and deep learning based methods to perform peak load demand forecasting, revealing their dominant popularity in the peak load forecast. Some representative references related to these methods are listed in TABLE 5.

Columns: Reference; Model detail; Input variable; Forecast horizon; Geographic scope; Forecast output; Performance

Model detail: Peak load: 3 input neurons, 5 hidden neurons and the output peak load at a given day; total load: 3 input neurons, 5 hidden neurons and the output total load at a given day; hourly load: 6 input neurons, 10 hidden neurons, the output load at a given hour.
Input variable: Predicted temperatures; peak load value, total load of a day (a sum), hourly load (1-24 hour ahead), weather variables (average, peak and lowest temperature at predicted day)
Forecast horizon: Daily peak demand forecast
Forecast output: Load values (peak load value, total load of a day (a sum), hourly load (1-24 hour
Performance: Peak load; Total load; Hourly load

Model detail: The neural network has 46 input nodes, 60 hidden nodes, and one output layer (peak/valley load)
Input variable: Historical peak/valley loads, weather variables (high/low temperature)
Forecast horizon: Daily peak demand forecast
Forecast output: Load value (next day’s load value)

Model detail: The lower ANNs: 16 input neurons, 8 hidden neurons, 3 output neurons; the upper ANNs: 107 neurons, 2 hidden layers each containing 35 neurons, 48 outputs indicating 48 half-hourly loads.
Input variable: Historical loads, weather variables (maximum and minimum maximum humidity), calendar variables (time of the day, day of the week, special event and holidays)
Forecast horizon: Half-hourly load for the next day
Forecast output: Load values (peak load value, valley load value, daily load, half-hourly load profile.
Performance: MPE for peak load

Model detail: Feedforward neural network based on: back-propagation algorithm, back-propagation algorithm + principal component analysis (PCA)
Input variable: Peak load of previous day, weather variables (temperature of the day (maximum, minimum), gross minimum temperature, rainfall, evaporation per day, sunshine hours of the previous day, wind speed, the dry bulb temperature, the wet bulb temperature, the vapour pressure, the relative humidity (%) and the soil temperature), calendar variables (seasons, year number, day of the year, day of the week,
Forecast horizon: Daily peak demand forecast
Geographic scope: (a 220 kV substation of Haryana Vidyut Prasaran Nigam Ltd.
Forecast output: Load values (peak load value of the current day/ one to seven days

Model detail: Feedforward neural network based on the conjugate gradient (CG) backpropagation algorithm (Fletcher–Reeves conjugate gradient backpropagation method (FRCGBP), Polak–Ribiere conjugate gradient backpropagation method (PRCGBP), Powell–Beale conjugate gradient backpropagation method (PBCGBP) and scaled conjugate gradient backpropagation method (SCGBP)) + PCA
Input variable: Peak load of previous day, weather variables (temperature of the day (maximum, minimum), gross minimum temperature, rainfall, evaporation per day, sunshine hours of the previous day, wind speed, the dry bulb temperature, the wet bulb temperature, the vapour pressure, the relative humidity (%) and the soil temperature), calendar variables (seasons, year number, day of the year, day of the week,
Forecast horizon: Daily peak demand forecast
Geographic scope: (a 220 kV substation of Haryana Vidyut Prasaran Nigam Ltd.
Forecast output: Load values (peak load value of the current day/ one to seven days

Model detail: Self-organizing map (SOM) + Feedforward neural network
Input variable: Peak load, weather variables (temperature, wind speed, cloud cover, relative humidity), calendar variables (day of the week, week of the month, month of year, year number,
Forecast horizon: Daily peak demand forecast
Geographic scope: (Tehran Regional Electric Utility
Forecast output: Load values (peak load value one day ahead)

Model detail: Dynamic time warping + Bespoke gated Recurrent Neural Network (RNN)
Input variable: Historical load every 30 minutes, weather variables (daily average temperature), calendar variables (day of the week, holidays); forecast output: single load value (daily peak load)
Forecast horizon: Daily peak demand forecast
Geographic scope: (competition data from the European Network on Intelligent Technologies (EUNITE))
Forecast output: Load values (daily peak load)

Table 5: ANN/ANN-based methods for peak load demand forecast

3.3.3 Support Vector Machines

As a popular machine learning method, support vector machines (SVMs) aim to minimize the actual risk by following the structural risk minimization principle, which helps them achieve satisfactory forecasting performance. The variant of SVMs for regression problems is known as support vector regression (SVR) [85-21], which is efficient for large-scale regression problems [85-5].

Consider a training dataset $\{(x_i, y_i)\}_{i=1}^{N}$, where $x_i \in \mathbb{R}^{n}$ denotes the $i$-th observation ($n$-dimensional input vector), $y_i$ is the output corresponding to $x_i$, and $N$ denotes the size of the training set. For non-linear SVMs, the basic idea is to introduce a kernel as below:

$$f(x) = w^{T} \phi(x) + b$$

where $\phi(x)$ is the hypothetical higher dimensional feature space. Coefficients $w$ and $b$ need to be estimated based on the structural risk minimization principle.
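As an illustration of how a kernel replaces the explicit feature space, the sketch below evaluates a fitted SVR in its dual form, f(x) = sum_i beta_i K(x_i, x) + b, together with the epsilon-insensitive loss commonly used for SVR. The Gaussian (RBF) kernel, the parameter names `gamma` and `epsilon`, and the function names are assumptions for illustration; the reviewed papers may use other kernels and settings.

```python
from math import exp


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return exp(-gamma * sq_dist)


def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """SVR loss: errors smaller than epsilon are not penalised."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def svr_predict(x, support_vectors, dual_coefs, bias, gamma=0.5):
    """Evaluate f(x) = sum_i beta_i * K(x_i, x) + b for given coefficients.

    The coefficients are assumed to come from an already trained model;
    this sketch does not solve the underlying optimisation problem.
    """
    return bias + sum(beta * rbf_kernel(sv, x, gamma)
                      for sv, beta in zip(support_vectors, dual_coefs))
```

The kernel value depends only on the inputs, so the higher dimensional feature space never has to be constructed explicitly.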

[64-2009Forecasting] introduced local prediction based on SVM for electric daily peak load forecast. The local prediction can find the approximation function in the reconstructed embedded space. The inputs were partitioned into sub-domains, each of which was assigned its own SVM model, and thus local prediction could make better forecasts than a single global model. [104-2016Peak] developed a novel online-SVM model based on the standard SVM. The proposed model was used to forecast daily peak load for the residential building in Surrey, and results showed that the model could be a more intelligent tool for smart grid systems. [115-8319611] adopted SVM to build a model for monthly peak load prediction. Firstly, feature selection was implemented based on correlation analysis. Then the training set was reconstructed by the topology network and random walk with restart (RWR) algorithm. Moreover, a feed-forward correlation was utilized to minimize the effect of unknown errors. Finally, the preprocessed training data was fed into SVM to train a model with higher accuracy.

Similar to ANN, SVMs are more suitable for STPLF and can cope with nonlinear and high dimensional data [104-2016Peak][115-8319611]. The disadvantage of SVMs is also similar to that of ANN: training takes a long time on large data sets. In addition, the hyperparameters of SVMs need to be selected manually, which is itself a complex step that needs to be optimized.

3.3.4 Ensemble Learning

Ensemble learning trains multiple learners and aggregates each learner’s predicted results to obtain the final output through combining strategies, which generally involve averaging, voting, and stacking [Ensemble-Polikar:2009]. According to the dependencies between learners, one possible classification of the popular ensemble learning methods is as follows:

  • Learners have to be generated in sequence to satisfy the strong dependency between them (boosting).

  • Learners are allowed to be simultaneously generated since there is no strong dependence between them (bagging and random forest).

Ensemble learning has been widely used in peak load demand forecast in recent years. The ensemble models used in the reviewed studies are mainly boosting, bagging, and random forest.

Boosting adjusts the sample distribution according to the performance of the initial learner so that wrongly predicted samples get more attention than others, and then it trains the next learner based on the adjusted sample. The process is iterated until a specified number of learners is generated, or the aggregated learning criterion reaches the stop threshold [Ensemble-Polikar:2009]. Commonly used boosting algorithms in the reviewed papers are adaptive boosting (AdaBoost), boosting tree, gradient boosting (GB) and extreme gradient boosting (XGBoost).
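The sequential dependence between learners can be sketched with a small gradient boosting example, in which each new learner is fitted to the residuals left by the learners before it. This is a simplified stand-in, not the algorithm of any reviewed paper: it uses squared error, a single input feature, one-split regression stumps, and hypothetical names (`fit_stump`, `boost`, `predict`). AdaBoost differs in that it reweights samples instead of fitting residuals.

```python
def fit_stump(xs, residuals):
    """Best single-threshold regression stump under squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Fit stumps one after another, each on the current residuals."""
    base = sum(ys) / len(ys)              # initial learner: the mean
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        preds = [p + learning_rate * (left_value if x <= threshold else right_value)
                 for x, p in zip(xs, preds)]
    return base, stumps, learning_rate


def predict(model, x):
    """Sum the shrunken contributions of all stumps for a new input x."""
    base, stumps, learning_rate = model
    return base + sum(learning_rate * (lv if x <= t else rv)
                      for t, lv, rv in stumps)
```

Each round can only be trained after the previous one has produced its residuals, which is why boosting learners cannot be generated in parallel.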

[boosting2018-AHMAD20181008] adopted three machine learning models (ANN with nonlinear autoregressive exogenous multivariable inputs, multivariate linear regression, and AdaBoost) to predict the load profile one month, one season, and one year ahead at the district level. Datasets of different sizes were used to train the models for the different prediction intervals. This paper also adopted feature extraction to select essential variables, and the results showed that AdaBoost significantly outperformed the other models for all prediction intervals. Moreover, for seasonal forecasting, the error range of AdaBoost was relatively narrow, which indicated that the model trained with AdaBoost was more capable of capturing the dynamic change of load curves.

[boosting2019-ZHANG2019116358] conducted short-term load forecasting for southern California. In this study, different models were adopted (multivariate linear regression, random forest, and GB), and the installed solar capacity was identified as an important feature for the forecast. The comparative experiment results revealed two insights: (1) the importance of the installed solar capacity suggested that new and clean energy resources are important components of the system that researchers need to pay more attention to; (2) the different forecasting accuracy in different periods indicated that being able to capture the fluctuation of load curves is important for the forecast. [boosting2020-LU2020117756] combined complete ensemble empirical mode decomposition with XGBoost to predict daily load consumption, daily peak load, and daily water delivery. Compared to traditional XGBoost, the hybrid model showed a lower MAPE of 5.99% for the daily peak load demand forecast.

Bagging is based on bootstrap sampling. It repeatedly samples a given dataset with replacement and trains learners simultaneously on the resulting sampling sets. When bagging is applied to a regression task, a simple mean or median can be adopted to obtain the final output [bagging2018-53]. [bagging2018-DEOLIVEIRA2018776], for the first time, utilized bagging to forecast monthly load demand for countries at different development stages. The paper combined bagging with exponential smoothing and SARIMA and then used the simple mean and median to aggregate the results from the single learners. A new variation of bagging, Remainder Sieve Bootstrap (RSB), was also proposed to enhance the forecasting results, and the results showed that the proposed method yielded the best MAPE for both developed and developing countries.
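The bootstrap-and-aggregate idea can be sketched as follows. The base learner here is an ordinary least-squares line on a single input, chosen only to keep the example short; it is an assumption and not the exponential smoothing or SARIMA learners of [bagging2018-DEOLIVEIRA2018776]. The function names and the `seed` parameter are likewise hypothetical.

```python
import random
from statistics import mean, median


def fit_linear(xs, ys):
    """Least-squares line y = a + b * x fitted to one bootstrap sample."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:                       # degenerate resample: all x identical
        return y_bar, 0.0
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - slope * x_bar, slope


def bagging_forecast(xs, ys, x_new, n_learners=50, combine=mean, seed=0):
    """Train one learner per bootstrap resample and aggregate the forecasts."""
    rng = random.Random(seed)
    n = len(xs)
    forecasts = []
    for _ in range(n_learners):
        idx = [rng.randrange(n) for _ in range(n)]    # sampling with replacement
        intercept, slope = fit_linear([xs[i] for i in idx], [ys[i] for i in idx])
        forecasts.append(intercept + slope * x_new)
    return combine(forecasts)
```

Passing `combine=median` instead of the default mean gives the median aggregation mentioned above, which is less sensitive to a few poorly fitted learners.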

Random forest (RF) can be seen as an extension of bagging, which further introduces random selection in constructing individual decision tree learners based on bagging.

RF first uses bootstrap sampling to generate its training sets, and then a decision tree is constructed for each training set. Features are randomly selected and an optimization criterion is used to guide the split of nodes in constructing each decision tree learner. The prediction strategies of RF are voting for the classification task and averaging for the regression task [Ensemble-Polikar:2009].

As the number of learners increases, RF generally converges to a smaller generalization error than bagging. Moreover, the training efficiency of RF is often superior to bagging, benefiting from the randomness in constructing single learners. In [97newd-FAN20141], an ensemble method combining eight popular forecasting algorithms (multiple linear regression (MLR), ARIMA, SVM, RF, multi-layer perceptron, boosting tree, and multivariate adaptive regression splines) was proposed for peak load forecast. Each model in the study was assigned a weight by GA. The results showed that SVM and RF had the largest weights, which indicates that these two algorithms contributed the most to enhancing peak load demand forecast accuracy. [RF2018-WANG201811] adopted RF to predict hourly load usage patterns for two educational buildings in North Central Florida, and the feature importance distribution was also produced as a by-product. The proposed model was compared with the regression tree and SVM, and the results showed that RF performed best among all the trained models. Moreover, the feature importance distribution showed that the influential features changed between education periods, which indicated that the load usage behavior of educational buildings is highly related to the semester.
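A weighted combination of several models' forecasts can be sketched as below. To keep the example short, the weights are found by a plain random search over non-negative weights that sum to one; this replaces the GA used in [97newd-FAN20141] and is named here as a substitute, not as that paper's procedure. The function names, `n_trials`, and `seed` are assumptions.

```python
import random


def combine(forecasts, weights):
    """Weighted sum of the member models' forecasts at each time step."""
    return [sum(w * f[t] for w, f in zip(weights, forecasts))
            for t in range(len(forecasts[0]))]


def mape(actual, predicted):
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def search_weights(forecasts, actual, n_trials=2000, seed=0):
    """Random search (a stand-in for GA) for weights minimising validation MAPE."""
    rng = random.Random(seed)
    k = len(forecasts)
    best_w = [1.0 / k] * k                       # start from the simple average
    best_err = mape(actual, combine(forecasts, best_w))
    for _ in range(n_trials):
        raw = [rng.random() for _ in range(k)]
        total = sum(raw)
        w = [r / total for r in raw]             # normalise to sum to one
        err = mape(actual, combine(forecasts, w))
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err
```

In practice the weights would be selected on a validation period and then applied unchanged to later forecasts, so that the reported accuracy is not inflated by tuning on the test data.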

3.3.5 Hybrid techniques

Many novel hybrid models with satisfactory forecast performance have been proposed in the reviewed papers, and some of them have already been discussed in the previous sections. TABLE 6 summarises the papers utilizing hybrid models, grouped by which development stages of forecast methods are combined.

Combination of the stages: Hybrid models with references

Manual/human expert stage + Classic stage:
  • Human knowledge + ARIMA [30-Amjady2001Short]

Classic stage + Classic stage:
  • Decomposition + Regression [9s-Barakat1990Short]
  • ARIMA + Regression [63newm-5398613]

Classic stage + Advanced stage:
  • Regression + FL [28-1999The][29]
  • Regression + GA [34][50-2005Comparison]
  • Regression + PCA [92-6867500]
  • Exponential Smoothing + FL [108newd-Laouafi2016Daily]

Advanced stage + Advanced stage:
  • FL + ANN [25-Mandal1997Fuzzy][27newdaa-Haida1998Peak][63newm-5398613][99aaa][108-7893595]
  • FL + ES [21-1995Short][30newa-30Kiartzis2000A]
  • FL + SOM [56-4075966]
  • FL + GA [72-0The]
  • ANNs [89aai-Hern2013Improved]
  • ANN + SOM [48-1556396]
  • ANN + SOM + PCA [63newc-AMINNASERI20081302]
  • ANN + GA [81-6121762][88aaa-2013A][98-7011462]
  • GA + RBFN + SVM [85aab-2012Building]

Table 6: Summary of papers utilizing hybrid model

Manual + Classic stage. A few papers proposed hybrid models that combine the manual stage with the classic stage. Among the obtained papers, [30-Amjady2001Short] was the earliest work that combined classic forecasting methods with human experience. In this paper, human experts’ opinions were selected as one of the initial input variables for the daily peak load demand forecast. The proposed modified ARIMA was compared with standard ARIMA, and the results revealed that the former performed better, with the lowest MAPE of 1.01% for predicting the daily peak load of cold Sunday to cold Wednesday.

Classic + Advanced stage. Some papers combined methods from the classic stage with methods from the advanced stage. Among these, [28-1999The] combined fuzzy logic with a regression model. Fuzzy set theory is good at representing the uncertainty of the data, which allows additional customer information to be used as inputs to the forecast model and could lead to more accurate forecasts. [92-6867500] used the combination of PCA and MLR to forecast weekly peak load at the distribution level. Firstly, correlation analysis was utilized to select the important features, and PCA was adopted to reduce the redundancy of the input dimensions. Finally, the output from PCA was fed into MLR to perform mid-term peak load prediction. This hybrid model was simpler than many advanced AI-based methods, yet could also achieve satisfactory forecast accuracy.

Advanced + Advanced stage. TABLE 6 shows that most of the proposed hybrid models are combinations of advanced stage methods. Among these, [88aaa-2013A] proposed a hybrid method to forecast daily peak load for Iran. The model was built using the combination of wavelet decomposition, NN, and GA. Historical load data and weather variables from three different cities were used to train the model. The proposed model was also compared with other advanced models, and the results showed that it outperformed most of them. [30newa-30Kiartzis2000A] proposed a hybrid model combining fuzzy logic with the expert system. In this study, fuzzy logic was used to extract the uncertain and incomplete information from the real-world data, which was then passed as input to the expert system, so that the hybrid model could make more accurate predictions based on the acquired knowledge. [63newm-5398613] and [99aaa] both combined fuzzy logic with a neural network. The advantage of this hybrid model is that the neural network has strong self-learning ability and can make good use of the representation provided by fuzzy logic to produce forecasts with higher accuracy. Moreover, the fuzzy neural network is effective when handling peak loads with strong fluctuations, and it is better at capturing the calendar effect than other advanced models. In [85aab-2012Building], a real-valued genetic algorithm (RGA) based neural network-SVM model was proposed. In the model, the neural network was responsible for producing the growth index for the forecast target, SVM was adopted to output the deviation value, and the RGA was adopted to select optimal parameters for the neural network and SVM. The experiment demonstrated that the proposed hybrid model performed well on both short and mid-term load demand forecast.

4 Discussion and summary

This section will first give a comparative analysis of the peak demand forecast methods. Then, improving methods for peak load forecast models are discussed. Finally, a comprehensive summary and discussion of the papers reviewed will be presented.

4.1 Comparative studies of different models

Each forecasting method has its advantages and disadvantages. It is therefore necessary to compare the performance of various forecasting methods to understand their advantages and limitations. To this end, we summarize the existing comparative studies in the literature. Some representative comparative studies are listed in TABLE 7, including composite/combined models and comparison analyses. A composite model could refer to inter-methods composite models (i.e. hybrid models) and result-weighted composite models. Based on different development stages of forecast methods (classic stage and advanced stage), existing comparative studies could be categorized into intra-comparison (e.g. methods within the classic stage) and inter-comparison (e.g. methods from both classic and advanced stage).

When compared with human expert opinions, the methods in the classic stage showed good performance, as presented in [26aa-Masood1997EDSSF]. In the classic stage, regression, as the most popular method in the reviewed papers, is often selected as the benchmark for building hybrid models [9a-1990A] [37aa-2003Regional] [50-2005Comparison] [82aap-2011Prediction]. For instance, [9a-1990A] combined exponential smoothing with regression to forecast the daily peak load demand, and compared the results with the combination of ARIMA and regression. Results showed that the former model could alleviate the bias caused by the latter. [82aap-2011Prediction] also used regression as a benchmark to compare its performance with the hybrid model (Reg-SARIMA-GARCH).

In comparing the methods in the classic stage with methods in the advanced stage, [37aa-2003Regional] utilized ANN and regression to forecast annual peak load demand for Taiwan, and the results showed that ANN could achieve a better performance. In another study, [50-2005Comparison] combined GA with symbolic regression to build an STPLF framework, and the results showed that the hybrid model could achieve comparable performance to an ANN model.

Columns: Reference; Model/Experiment type; Detail; Forecast contents; Performance

Reference: [26aa-Masood1997EDSSF]
Model/Experiment type: Composite model
Detail: Built a decision support system (DSS) to compare ARIMA, exponential smoothing and human expert suggestions
Forecast contents: Monthly peak load demand for UAE
Performance: Different models suited for different areas (Abu-Dhabi: Dubai: Exponential smoothing, Sharjah: ARIMA(1,0,2)(0,1,3)); DSS outperformed human experts

Reference: [37aa-2003Regional]
Model/Experiment type: Comparative analysis
Detail: ANN vs Regression
Forecast contents: Annual peak load demand for 4 regions of Taiwan (Region of China)
Performance: MAPE of ANN vs Regression: Northern: 1.06% vs 2.45%; Central: 1.73% vs 8.52%; Southern: 2.48% vs 8.29%; Eastern: 3.62% vs 4.10%

Model/Experiment type: Composite model + comparative analysis
Detail: Applied exponential smoothing to regression model. Regression + ARIMA vs Regression + exponential smoothing
Forecast contents: Hourly load, daily peak demand for PG&E
Performance: Regression + exponential smoothing could almost eliminate the negative bias caused by regression + ARIMA

Model/Experiment type: Composite model + comparative analysis
Detail: GP + symbolic regression vs MLP (ANN)
Forecast contents: Daily peak load demand for a distribution power system in Romania
Performance: Maximum APE of GP + symbolic regression vs MLP: 10% vs 11%; MPE of GP + symbolic regression vs MLP: 0.2% vs 0.1%

Model/Experiment type: Composite model + comparative analysis
Detail: Piece-wise linear regression vs generalized autoregressive conditional heteroskedastic errors
Forecast contents: Daily peak load demand for South Africa
Performance: MAPE for regression vs SARIMA 2.77% vs 1.47% vs 1.43% vs 1.42%

Model/Experiment type: Composite model + comparative analysis
Detail: Fuzzy logic (FL) + expert system (ES) vs NN with similar structure
Forecast contents: Morning and afternoon valley, noon and evening peak for different seasons
Performance: Yearly ME for FL + ES vs NN: 2.45% vs 2.56%

Model/Experiment type: Composite model + comparative analysis
Detail: ANFIS (Fuzzy logic + NN) vs ARMA
Forecast contents: Daily peak load demand forecast for a utility company in Malaysia
Performance: Average MAPE for ANFIS vs ARMA: including weekends: 7.26% vs 2.45%; excluding weekends: 1.95% vs 4.67%

Model/Experiment type: Composite model + comparative analysis
Detail: RGA-SVM (real-valued genetic algorithm-SVM) vs KF vs RBF vs RGA based NN-SVM (NN thread to output yearly growth index and SVM thread to output moment-specific forecast)
Forecast contents: Daily peak load demand with occurring time
Performance: MAPE for RGA-SVM vs KF vs RBF vs RGA based NN-SVM: one week: 14.26% vs 6.63% vs 6.39% vs 3.20%; occurring time: hit rate of RGA based NN-SVM (one-hour time deviation tolerance): 91.7%

Reference: [128-8791587]
Model/Experiment type: Comparative analysis
Detail: Regard peak hour forecast as a classification problem; Naive Bayes vs SVM vs Random Forest (RF) vs AdaBoost vs CNN vs LSTM vs Stacked AutoEncoder (SAE)
Forecast contents: Daily peak hour one day ahead for Ontario
Performance: Accuracy for Naive Bayes vs SVM vs RF vs AdaBoost vs CNN vs LSTM vs SAE: Winter: 0.83 vs 0.97 vs 0.97 vs 0.82 vs 0.97 vs 0.98 vs 0.97; Summer: 0.66 vs 0.95 vs 0.94 vs 0.63 vs 0.94 vs 0.95 vs 0.94

Model/Experiment type: Composite model + comparative analysis
Detail: CEEMDAN (complete ensemble empirical mode decomposition)-XGBoost vs CEEMDAN-RF vs (least squares SVM)
Forecast contents: Daily energy consumption + daily peak power + daily water delivery
Performance: PSO-SVM vs LSSVM vs XGBoost: daily energy consumption: 4.85% vs 6.26% vs 7.67% vs 7.92 vs 7.87% vs 8.06%; daily peak power: 5.99% vs 6.40% vs 9.25% vs 8.15% vs 8.89% vs 9.13%; daily water delivery: 5.09% vs 6.31% vs 8.30% vs 8.08% vs 8.37% vs 8.32%

Table 7: Papers with composite model/comparative analysis

Because the distribution and diversity of the data and the problem differ between studies, hybrid methods do not always achieve satisfactory performance and can sometimes be counterproductive. [63newm-5398613] compared ARMA with a hybrid model (FL + ANN) for daily peak load demand forecast. The obtained results showed that ARMA performed better when the training samples included weekends, whereas the proposed hybrid model gained better forecast accuracy when weekends were excluded. [85aab-2012Building] combined real-valued GA with SVM and ANN, and the hybrid model was then used for producing the daily peak load and its occurring time. The experiment compared the proposed model with other models (real-valued GA-SVM, KF, RBFN). Under the same experimental conditions, the real-valued GA-SVM model surprisingly produced the worst results.
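Since the occurring time of the peak is a forecast target in its own right, a small sketch of how a daily peak and its time step can be extracted from a load series, and how a hit rate with a time tolerance can be computed, may be helpful. The function names, the 24-steps-per-day default, and the tie-breaking rule (first occurrence) are assumptions for illustration, not the evaluation code of [85aab-2012Building].

```python
def daily_peaks(load, steps_per_day=24):
    """Return (peak value, peak step within the day) for each complete day."""
    peaks = []
    for start in range(0, len(load) - steps_per_day + 1, steps_per_day):
        day = load[start:start + steps_per_day]
        peak_value = max(day)
        # index() returns the first occurrence if the peak value is tied
        peaks.append((peak_value, day.index(peak_value)))
    return peaks


def hit_rate(actual_steps, predicted_steps, tolerance=1):
    """Share of days whose predicted peak time is within `tolerance` steps."""
    hits = sum(abs(a - p) <= tolerance
               for a, p in zip(actual_steps, predicted_steps))
    return hits / len(actual_steps)
```

With hourly data, `tolerance=1` corresponds to a one-hour time deviation tolerance; an incomplete trailing day is ignored by `daily_peaks`.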

There are also a few studies that conducted comparative analysis based on methods in the advanced stage only. For example, [128-8791587] formulated peak load forecast as a classification problem and compared several methods including LSTM, SVM, RF, CNN, and AdaBoost. The results showed that among all the methods, LSTM had the best performance, followed by SVM, RF, and CNN, whereas AdaBoost produced unsatisfactory forecast results. [boosting2020-LU2020117756] proposed a hybrid model (complete ensemble empirical mode decomposition (CEEMDAN) combined with XGBoost), which was then compared with other models such as CEEMDAN-RF and RBFN. The results revealed that the proposed model generated the best performance, whereas RBFN had the largest MAPE among these three models.

4.2 Improving methods for peak load demand forecast models

As mentioned above, combining different methods into a hybrid model is one option for improving forecast accuracy. Other measures can further improve the forecast performance, such as optimizing the model inputs (data normalization, feature selection and transformation) and improving the models/algorithms (e.g. by integrating clustering methods).

4.2.1 Data

Differences in magnitude between the input variables are likely to bias the predictions of the training algorithm. Many training algorithms, such as SVR, require input variables of a similar order of magnitude. Besides, in real scenarios, load data often need to be normalized due to privacy requests [73aa-GOIA2010700]. Therefore, data normalization is a necessary preprocessing step for training the model. Among the reviewed papers, the commonly used data normalization methods are zero mean normalization (Z-score normalization) [63newc-AMINNASERI20081302][104-2016Peak] and Min-Max normalization [89-2014Linguistic][108newl-Julio2016Linear].
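The two normalization schemes can be sketched as follows. The function names and the handling of constant inputs (returning zeros or the lower bound) are assumptions for illustration; the population standard deviation is used for the Z-score.

```python
from statistics import mean, pstdev


def z_score(values):
    """Zero mean normalization: (v - mean) / standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:                       # constant series: nothing to scale
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def min_max(values, low=0.0, high=1.0):
    """Min-Max normalization onto the interval [low, high]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:                   # constant series: map to lower bound
        return [low for _ in values]
    return [low + (v - v_min) / (v_max - v_min) * (high - low) for v in values]
```

When the model is evaluated on held-out data, the mean, standard deviation, minimum and maximum should be computed on the training set only and reused for the test set, so that no information from the test period leaks into training.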

The training data size is another important factor that could affect the accuracy of the model. If the training size is too small, the information learned by the model will be insufficient, and the performance will be poor as a consequence. On the other hand, too much training data will lead to low computational efficiency. Therefore, a good trade-off between the training size and the computation time is worth investigating. [108aaa-2016Artificial] considered training data of four different lengths (one week to four weeks) in forecasting sub-hourly load usage and daily peak load. The results showed that the larger the training size, the higher the training accuracy of the neural network model. In [84aaf-2012Finding], different training sizes were used to predict the peak load two days to one week ahead. Specifically, the training data were the hourly load from New South Wales in the past three months, six months, nine months, and one year, respectively. The results showed that the model trained with six months of historical data was the best at predicting the peak load in the coming days among all the models.

4.2.2 Feature transformation/Feature selection

As mentioned above, the input variables are often numerous when training a forecasting model, especially in the big data era. However, many variables may be unrelated to the target/response variable, and variables may also be interdependent, which can easily lead to long training time and decreased forecast performance. Feature transformation and feature selection are usually adopted to address the problem [FST-escolano2009feature]. Feature transformation aims to obtain transformed features by creating a new feature space, and the commonly used methods include PCA, independent component analysis (ICA), and linear discriminant analysis (LDA). Feature selection [fs-DASH1997131] selects a subset from the original feature space, and commonly used methods include filtering, wrapper and embedding.

Most of the reviewed studies utilizing feature selection on peak load forecast adopted filtering and wrapper [108aaa-2016Artificial] while those using feature transformation utilized PCA. For instance, PCA was compared with correlation analysis in [97newd-FAN20141], in which the original pattern matrix of the training data is 281095. Through correlation analysis, variables with correlation factors greater than 0.95 were selected. By applying PCA, the dimension of the input matrix was reduced to 11100. After combining the user-defined neural network to train the model for daily peak load demand forecast, the forecasting accuracy showed that the trained model using PCA was superior to correlation analysis both in computational time and training accuracy.
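A filter-type feature selection based on correlation analysis can be sketched as below. The sketch ranks each candidate input by the absolute Pearson correlation with the target and keeps those above a threshold. Whether the reviewed studies measured correlation against the target or between inputs is not stated here, so the use of the target, the function names, and the default threshold are all assumptions.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def filter_features(features, target, threshold=0.5):
    """Keep the names of features whose |correlation| with target >= threshold.

    `features` maps a feature name to its column of values.
    """
    return [name for name, column in features.items()
            if abs(pearson(column, target)) >= threshold]
```

For example, with `features = {"temperature": [20, 25, 30, 35], "noise": [1, -1, -1, 1]}` and a peak load target of `[50, 60, 70, 80]`, only the temperature column passes the filter, because the noise column is uncorrelated with the target.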

4.2.3 Clustering methods

With the installation of smart meters, high resolution distributed energy consumption data (e.g. at building level) have become available, which provides opportunities for studying how forecast models behave across different buildings. For instance, [dai2020energy] compared the performance of different forecast models on different buildings and concluded that buildings should be clustered by their historical load usage patterns, instead of their predefined building use types, to produce more meaningful insights (e.g. to improve forecast accuracy). Clustering methods divide the data into different clusters according to certain standards, such as a distance criterion. After clustering, the data within the same cluster are highly similar, while the data belonging to different clusters differ greatly [3-2008A]. The accuracy of peak load forecast can be improved by training different models for different clusters and then obtaining the aggregated final forecasts.

Clustering methods can be divided into partition based clustering (e.g. k-means), hierarchical clustering, density-based clustering (e.g. density-based spatial clustering of applications with noise (DBSCAN)), and model-based clustering (e.g. Gaussian mixture models) [4-2013The]. Some studies employing clustering in the peak load forecast are summarized in TABLE 8, where the commonly used clustering algorithms in peak load forecast are k-means, hierarchical clustering, SOM, and fuzzy clustering (FC).

Clustering methods References
K-means and its extensions [97newd-FAN20141],[53-2007A],[108newl-Julio2016Linear]
Hierarchical clustering [55-2006Peak],[58-2007Load]
Fuzzy clustering [108newd-Laouafi2016Daily],[89-2014Linguistic],[41-2004Peak],[49-2005An]
SOM [56newd-2006Developed],[10d-1991Design],[63newc-AMINNASERI20081302]
Table 8: Clustering methods for improving the performance of peak load forecast

k-means is a classical algorithm of partition based clustering, which has high efficiency when handling large-scale data. Some variants based on k-means, such as the entropy weighted k-means [97newd-FAN20141], have also been used in peak load forecast. Hierarchical clustering can be classified into aggregation hierarchical clustering and splitting hierarchical clustering [Cluster]. For instance, [55-2006Peak] adopted hierarchical clustering to optimize the input daily data of a feed-forward neural network (FNN) to predict load usage during a peak period, and the results demonstrated that the FNN could converge more quickly and produce more accurate results. SOM is a commonly used clustering method owing to its unsupervised feature. [63newc-AMINNASERI20081302] firstly utilized SOM to cluster peak loads, then each cluster was trained separately by FNN to get a specified model. Results showed that the proposed hybrid method is effective for daily peak load forecast.
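The partition based approach can be sketched with a plain k-means (Lloyd's algorithm) applied to daily load profiles. This is the standard algorithm and not the entropy weighted variant of [97newd-FAN20141]; the random initialisation from the data, the iteration limit, and the names `k_means` and `seed` are assumptions.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two profiles of equal length."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(profiles, k, n_iter=50, seed=0):
    """Cluster load profiles; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(profiles, k)]
    labels = [0] * len(profiles)
    for _ in range(n_iter):
        # Assignment step: each profile joins its nearest center.
        labels = [min(range(k), key=lambda c: squared_distance(p, centers[c]))
                  for p in profiles]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])   # keep the center of an empty cluster
        if new_centers == centers:               # converged
            break
        centers = new_centers
    return labels, centers
```

After clustering, a separate forecast model can be trained on the profiles of each cluster, which is the pattern followed by the SOM and FNN combination described above.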

The above methods belong to hard clustering, since each data point can only be assigned to a single cluster. In contrast, fuzzy clustering such as Fuzzy C-means is a soft clustering method where each observation can belong to multiple clusters with corresponding membership coefficients [bezdek2013pattern]. For instance, [41-2004Peak] adopted fuzzy clustering to cluster peak load patterns according to working/non-working days. [49-2005An] combined fuzzy clustering with FNN to forecast load curves during the peak load period.
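The membership coefficients of soft clustering can be illustrated with the standard Fuzzy C-means membership rule, u_i = 1 / sum_k (d_i / d_k)^(2/(m-1)), where d_i is the distance from the observation to center i and m > 1 is the fuzzifier. The sketch below computes memberships for one observation given fixed centers; it does not iterate the center updates, and the function name and the default m = 2 are assumptions.

```python
def fuzzy_memberships(point, centers, m=2.0):
    """Membership of one observation in each cluster (Fuzzy C-means rule)."""
    dists = [sum((a - b) ** 2 for a, b in zip(point, c)) ** 0.5 for c in centers]
    # An observation lying exactly on a center belongs fully to that cluster.
    if 0.0 in dists:
        zero_count = dists.count(0.0)
        return [1.0 / zero_count if d == 0.0 else 0.0 for d in dists]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]
```

The memberships always sum to one, so an observation halfway between a working-day center and a non-working-day center receives a coefficient of 0.5 for each instead of being forced into one cluster.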

4.3 Summary of the reviewed studies

TABLE 9 gives a comprehensive summary of the reviewed papers including their forecasting periods, forecast outputs, input variables, improving methods and geographical scope.

It is worth mentioning that the classification of peak load forecast methods into three stages (i.e. manual, classic and advanced) generally aligns with the evolution of power systems. In the manual and classic stages, the traditional energy-intensive industry dominated the electricity market, with relatively stable peak demand patterns, and peak load forecasting based on statistical methods was commonly used. With the development of smart grids and the changing energy landscape on both the demand side (e.g. demand side management and electric vehicles) and the supply side (e.g. intermittent renewable energy supply at both transmission and distribution level), peak demand patterns have become more random and less predictable. As such, more advanced methods that can better take advantage of big data and capture complex patterns, such as deep learning and hybrid machine learning methods, are preferable choices.

In addition, unlike in traditional load forecasting, the occurrence time and the magnitude of the peak demand are equally important in peak load forecasting. Because peak loads are often driven by rare events, forecasting the peak occurrence time may be more closely related to extreme value theory or quantile regression. Moreover, considering the uncertainty of peak load, another effective approach is to treat the peak load as anomalous data and to quantify the probability of its occurrence and of its magnitude.
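As a small illustration of the two targets and of the quantile view mentioned above, the sketch below extracts the peak value and its occurrence index from a load profile and evaluates a quantile forecast with the pinball (quantile) loss. The function names and the numbers are hypothetical, and the sketch covers only the evaluation side, not the fitting of a quantile regression model.

```python
def peak_value_and_time(profile):
    """Return (peak magnitude, index of the interval in which it occurs)."""
    peak_index = max(range(len(profile)), key=lambda i: profile[i])
    return profile[peak_index], peak_index


def pinball_loss(actual, predicted, q):
    """Pinball loss of a q-quantile forecast for a single observation."""
    error = actual - predicted
    # Under-forecasts are weighted by q, over-forecasts by (1 - q).
    return q * error if error >= 0 else (q - 1.0) * error


def mean_pinball_loss(actuals, predictions, q):
    """Average pinball loss over a set of forecasts."""
    losses = [pinball_loss(a, p, q) for a, p in zip(actuals, predictions)]
    return sum(losses) / len(losses)


# Hypothetical daily profile with four readings.
print(peak_value_and_time([30, 50, 90, 60]))  # → (90, 2)

# For a high quantile such as q = 0.9, under-forecasting the peak by
# 10 units is penalised about nine times as heavily as over-forecasting
# it by the same amount.
under = pinball_loss(actual=100.0, predicted=90.0, q=0.9)
over = pinball_loss(actual=100.0, predicted=110.0, q=0.9)
print(round(under, 6), round(over, 6))  # → 9.0 1.0
```

The asymmetry is what makes high quantiles a natural target when under-forecasting a peak is costlier than over-forecasting it.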


Improving methods Input variables Geographic scope Output References
Clustering FS/FT H W C E/O Region Country City
Regression STPLF
[6],[9s-Barakat1990Short],[9a-1990A],[19-2002Regression], [26news-1997Short], [27newdaa-Haida1998Peak], [28-1999The],
[95-2014Non],[96-7022373],[97newd-FAN20141], [108newl-Julio2016Linear],[130-8810766], [118-2018Distribution]
MTPLF [8-Turner2012Regression],[92-6867500],[93-2014Peak],[124-8671682],[90-2014Long]
LTPLF [27],[106-2016Robust],[117-0Long],[70-2008Density]
time series
[84aaf-2012Finding],[99aaa],[102-7399017],[30-Amjady2001Short],[56aad-4202249],[57aat-article], [8-78],[63newm-5398613],
MTPLF [62-2008Electricity],[57aam-2007Monthly],[12-1992New],[7-IEEE],[5-Fong2011The],[80-2011Discrete],[83aaf-2012Forecasting]
Time series
STPLF [66-2009Long],[88aaa-2013A],[22-Choi1996A],[108-7893595]
MTPLF [2-Gupta1971A],[5-Fong2011The],[8-Barakat1989Forecasting],[68-2009The],[80-2011Discrete]
STPLF [9a-1990A],[38aa-J2017Short],[108newd-Laouafi2016Daily]
MTPLF [9s-Barakat1990Short],[12-1992New],[26aa-Masood1997EDSSF]
STPLF [21-1995Short],[85aab-2012Building]
STPLF [61-2008The]
MTPLF [62-2008Electricity]
LTPLF [20]
MTPLF [57aal-Otavio2007Long],[3-2008A],[40-1372805],[44-1412874],[45-1414771],[48-1556396]
LTPLF [37-Saini2002Artificial]
STPLF [13aa-Hsu1992Fuzzy],[30newa-30Kiartzis2000A]
LTPLF [36newL-2002Long]
MTPLF [72-0The]
STPLF [34],[50-2005Comparison],[72-0The],[81-6121762],[85aab-2012Building],[88aaa-2013A],[97newd-FAN20141]
LTPLF [87aaa-El2013Electric]
SVMs STPLF [60-2008Special],[61-2008The],[64-2009Forecasting],[104-2016Peak],[111-2017Combining],[115-8319611],[116-8330143],[128-8791587]
Boosting STPLF [boosting2019-ZHANG2019116358],[boosting2020-LU2020117756]
MTPLF [boosting2018-AHMAD20181008]
LTPLF [boosting2018-AHMAD20181008]
Bagging STPLF
MTPLF [bagging2018-DEOLIVEIRA2018776]
RF STPLF [97newd-FAN20141],[RF2018-WANG201811],[RF2020-SATREMELOY2020114246]
CNN STPLF [128-8791587],[129-8881305]
RNN STPLF [126-2019Deep],[127-2019Evolutionary],[128-8791587]
MTPLF [132-8985197],[134-8994442]
Table 9: Comprehensive summary of the reviewed literature. FS/FT: Feature Selection/Feature Transformation; H: Historical load data; W: Weather variables; C: Calendar variables; E/O: Economic/Other variables; V: Peak values; V+T: Peak values + Occurring time; LP: Load Profiles.

For the input variables, historical load data, weather variables, and calendar variables were usually used in STPLF, whereas economic and other variables such as the population growth rate were frequently used in MTPLF and LTPLF. Moreover, due to the high randomness of peak load, the forecast model is strongly affected by low-probability events such as extreme weather and accidental events. Accidental events vary among individuals/entities; their impact is often limited to a small range and is therefore difficult to forecast. Although extreme weather is also a low-probability event, its impact is usually well studied. For instance, it is necessary to focus on climatic factors such as the maximum (minimum) temperature, the duration of high (low) temperature, and the humidity. The maximum (minimum) temperature determines the peak load value. The duration of high (low) temperature affects the time range in which the peak occurs, and humidity widens the gap between the measured and the perceived temperature, thus affecting consumers’ electricity usage decisions.
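The climatic factors listed above can be turned into daily input features in a few lines. The sketch below is one possible construction, not one taken from the reviewed papers: the function name, the feature names and the 30 °C threshold for "high temperature" are assumptions chosen for illustration.

```python
def daily_weather_features(temps_c, humidity_pct, hot_threshold_c=30.0):
    """Summarise one day of hourly weather readings as forecast inputs."""
    if len(temps_c) != len(humidity_pct) or not temps_c:
        raise ValueError("need equally long, non-empty hourly series")
    # Duration of high temperature: total hours and longest unbroken spell.
    hot_hours = sum(1 for t in temps_c if t >= hot_threshold_c)
    longest_spell = current_spell = 0
    for t in temps_c:
        current_spell = current_spell + 1 if t >= hot_threshold_c else 0
        longest_spell = max(longest_spell, current_spell)
    return {
        "max_temp_c": max(temps_c),
        "min_temp_c": min(temps_c),
        "hot_hours": hot_hours,
        "longest_hot_spell_h": longest_spell,
        "mean_humidity_pct": sum(humidity_pct) / len(humidity_pct),
    }


# Hypothetical readings (six hours shown for brevity).
features = daily_weather_features(
    temps_c=[28, 31, 33, 29, 30, 27],
    humidity_pct=[60, 55, 50, 65, 70, 60],
)
print(features["max_temp_c"], features["hot_hours"],
      features["longest_hot_spell_h"])  # → 33 3 2
```

A cold-weather variant would mirror the same logic with a lower threshold and a "less than or equal" comparison.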

Most of the forecast methods that use improving techniques (e.g. clustering) belong to the advanced stage of peak load demand forecast, which indicates that research attention to this field is increasing. However, it is worth pointing out that although many clustering methods are available, current load curve-based clustering relies heavily on additional physical and socio-economic information from entities/users (in other words, domain knowledge) in order to properly interpret clustering results. More effort is needed on how to effectively incorporate domain knowledge into the forecast methods and improving techniques.

As for the geographic scope of the forecast, researchers in the classic stage usually considered peak load forecast over wide regions or whole countries. With the development of smart grids and the installation of smart meters at the local level, more data with high temporal and spatial resolution are becoming available [4-2013The]. In addition, the increasing penetration of distributed energy resources (e.g. electric vehicles and microgrids) coupled with distributed intelligence and local energy applications [5-Fong2011The] brings the operation and maintenance of power systems into a new era of disaggregated environments. From the perspective of peak load forecast, highly random human activities have a greater impact on forecast performance in small geographic areas (e.g. community level) than at the aggregated level (e.g. region/country) [107-2016Distribution]. Therefore, more research on the interaction between the electricity usage decisions of end users and disaggregated load forecasting is needed in the future.

5 Conclusion and future work

Motivated by the importance of peak load demand forecast from the perspectives of electricity market stakeholders, this paper carried out a systematic review of peak load demand forecast, with the aim of summarizing existing studies on the topic and providing guidance for future research. First, we provided a unified problem definition for peak load demand forecast. Then, the peak load demand forecast methods were categorized into three stages based on their development timeline, and a thorough review of relevant methods in each stage was conducted. Moreover, a comparative analysis of different forecast methods was presented, and useful improving techniques for enhancing the forecast performance were discussed. Finally, a comprehensive summary of reviewed papers on the peak load forecast framework was presented with possible future research directions.

With high-resolution load data (e.g. residential smart meter data) becoming increasingly available, data privacy is an important issue that needs to be addressed. In the new digital era, using encryption algorithms to protect consumers’ private data has become an essential task for researchers [6-Lim2016Security]. There are challenges in terms of compliant electricity data transmission and storage, security and privacy protection [privacy1]. In addition, it is well known that data size and quality usually determine the training quality of machine learning models. In practice, however, the data deemed necessary for a forecast task might be owned by different organizations. To make accurate predictions, it is necessary to combine diverse data sources from different organizations when building the model. This could be achieved by aggregating all the data sources into a third-party central database; however, this approach faces inevitable security risks because the data are stored centrally [privacy2]. Therefore, designing the forecasting framework so that it meets data privacy, security, and regulatory requirements (e.g. through federated learning [yang2019federated]) is an important future research trend in peak load demand forecast.
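The federated idea can be sketched with federated averaging on a deliberately tiny model. In the illustration below, each organization fits a one-feature linear model of peak load on its own data and shares only the model parameters, which a coordinator averages in proportion to local sample counts. Everything here is an assumption made for illustration (function names, the linear model, the learning rate, the noiseless toy data), and the sketch omits the encryption or secure aggregation that a real deployment would need.

```python
def local_update(weights, data, lr=0.01, epochs=20):
    """Gradient descent on one organization's private (x, y) pairs."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error of y ≈ w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(local_weights, sizes):
    """Average local parameters, weighted by local sample counts."""
    total = sum(sizes)
    w = sum(wt[0] * n for wt, n in zip(local_weights, sizes)) / total
    b = sum(wt[1] * n for wt, n in zip(local_weights, sizes)) / total
    return (w, b)


def federated_fit(datasets, rounds=300):
    """Run communication rounds; raw data never leaves its owner."""
    global_weights = (0.0, 0.0)
    sizes = [len(d) for d in datasets]
    for _ in range(rounds):
        local = [local_update(global_weights, d) for d in datasets]
        global_weights = federated_average(local, sizes)
    return global_weights


# Hypothetical private datasets of two organizations, both generated
# from peak = 2 * x + 1 (x could be a scaled temperature feature).
org_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
org_b = [(3.0, 7.0), (4.0, 9.0)]
w, b = federated_fit([org_a, org_b])
print(round(w, 3), round(b, 3))  # approaches 2.0 and 1.0
```

Only the pair (w, b) crosses organizational boundaries in each round, which is the property that makes the approach attractive when raw smart meter data cannot be pooled.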