A minimalist approach to scenario-based forecasting of electric vehicle consumption

07/24/2020 ∙ by Rahul Roy, et al. ∙ University of Bath ∙ Lancaster

Electrification of transport is a key strategy in reducing carbon emissions. Many countries have adopted policies of complete but gradual transformation to electric vehicles (EVs). However, mass EV adoption also means a spike in electricity demand, which in turn can disrupt existing electricity infrastructure. Good EV consumption forecasts are key for distribution network operators (DNOs) to effectively manage demand and capacity. In this paper, we consider a suite of models to forecast EV consumption. More specifically, we evaluate a nested modeling approach for scenario-based forecasting of EV consumption. Using the data collected as part of the Electric Nation trials, we studied statistical models (Time Series Regression and Regression with ARIMA Errors), scalable machine learning systems (Extreme Gradient Boosting or XGB), and artificial neural networks (Long Short-Term Memory Networks or LSTMs). We found that LSTMs delivered the best forecasting performance.




1 Introduction

Globally, the EV charging infrastructure has been rapidly expanding to match the growth of EVs. A study by the UK’s National Grid reveals that there would be 90% penetration of EVs by 2050, leading to an increased energy demand of 46 TWh; this is over and above the total energy demand of 308 TWh in 2016 [robinson2018electric]. My Electric Avenue (MEA), a 3-year project conducted from 2013 to 2015 in the UK to explore the impact of charging clusters of EVs at peak times on electricity networks, predicted that reinforcement of low-voltage (LV) distribution networks would be required when 40-70% of the vehicles would be EVs [godfrey2016carconnect]. MEA observed that, with rising uptake of EVs, there is a high probability of cluster formation within localities, i.e., clusters of consumers with charging requirements occurring at the same time. The additional energy demand caused by such clusters would in turn increase the power demand (rate of consumption of energy measured in kW) and eventually create stress on the distribution networks. Hence, it becomes imperative for the DNOs to have an estimate of the additional consumption of energy from the grids caused solely by EV charging, to ensure seamless demand management, especially during peak hours in a day, in their local distribution networks.

Estimating the energy consumption requirements prior to the widespread adoption of EV charging is part of emergency preparedness in distribution networks, as any additional bulk demand on the network may lead to frequent outages or may cause significant damage to the electric infrastructure. It is now widely acknowledged that significant EV uptake will put stress on distribution networks, prompting many studies to explore the impacts of EVs on electricity infrastructure. For example, in their review on the factors relating to plug-in hybrid EVs (PHEVs) that have an impact on distribution networks, [green2011impact] found that driving patterns, charging characteristics, charge timing and vehicle penetration were most relevant. Besides this, [foley2013impacts] studied the impacts of EV charging in an actual working electricity market in Ireland and showed that EV charging had a significant impact on the wholesale electricity market and off-peak charging was beneficial. Moreover, while [neaimeh2015probabilistic] used a probabilistic approach, combining EV charging and smart meter household consumption data, to highlight the need for better planning for dealing with the stochastic nature of charging demand, [xydas2016data] proposed a ‘risk level’ index using fuzzy logic to assess the impact of EV power demand on distribution networks. Furthermore, based on the findings from the Victorian EV trials in Australia, [khoo2014statistical] projected mean and maximum percentage increases in power demand of 3.27-5.70% and 5.72-9.79% respectively in the summer of 2032/33.

Given its impact on existing infrastructure, naturally, a number of studies focused on predicting either EV energy or power demand using different models and assumptions. For example, [xydas2013electric] implemented data mining methods, such as decision tables, decision trees, artificial neural networks and support vector machines, to forecast EV load (power demand) using data on previous day load, number of the week, day of the week, type of day, number of new plug-ins every half-hour and total charging connections every half-hour. Moreover, [wang2015electric] proposed an offline algorithm based on driving behaviour, road topography information, and traffic situation that gave two energy consumption results, one for the maximum driving speed and the other for the most economical driving speed, to give a first impression to the driver on the possible energy consumption and therefore the range which the EV can cover even before the actual trip. In addition, [wang2016online] also proposed an online energy consumption algorithm that would help in adjusting the energy consumption prediction during driving of battery EVs (BEVs); this would be based on a number of factors, such as vehicle characteristics, driving behaviour, route information, traffic states and weather conditions. Furthermore, while [majidpour2016forecasting] forecast energy consumption for a horizon of 24 hours based on historical data from two different data sets, customer charging profiles and station outlet measurements, [arias2016electric] used historical weather and traffic data to forecast EV charging power demand. Based on studies of the Korean EV market, [moon2018forecasting] estimated the changes in energy demand based on consumer preferences for EVs, charge time of the day and types of EV supply equipment (EVSE); total energy demand was estimated using total EV owners, average distance travelled per day, and average fuel efficiency of current EVs. Moving further, [lopez2018demand] proposed a demand response strategy based on machine learning to control EV charging in response to real-time pricing, such that the overall energy cost of an EV was minimized.

While it is natural for many factors, as explained and used within the above studies, to affect EV consumption, typically, most of them are not available to be used within a forecast model. For example, information such as driver characteristics, including behavioural ones, is hard to obtain. Similarly, route choice and trip details of EV users are unlikely to be available on a continuous basis. This motivated us to consider a more minimalist approach which takes the most basic information, such as EV ownership, day of the week, and season of the year, that is likely to be available in the future, to develop a forecast model that estimates the consumption of energy from the grid caused by EV charging at home. More specifically, in this paper we are interested in the following question: assuming the EV owners have access to unconstrained charging at home, what is the expected EV energy consumption at a given ownership level in a given time period? It is important to note that in several previous studies [xu2019modal, wu2015electric], the term EV consumption indicated the consumption of energy stored in the EV’s battery based on a set of features, such as EV kinematics, driving data, battery state of charge, and so on. However, in our study, EV consumption refers to the consumption of energy by an EV to charge its battery when connected to the electricity grid. We explore several models which take as input a future scenario of total EV ownership and time, hereafter referred to as scenario-based data, to forecast expected EV energy consumption. Moreover, our focus is forecasting energy consumption at the more aggregated level of per day, compared to some studies which focused on inter-day consumption per hour forecasts. Utilities are interested in both the energy consumption (kWh) and the additional load (kW) on LV networks caused by EV charging. While energy consumption drives the income generated by utility companies, additional load affects their costs, as it drives the need for reinforcement of networks. An estimate of the additional energy consumption on the LV distribution networks caused by EV charging is extremely useful for policymakers, utilities, and city and district administration councils, for instance, for planning the roll-out of an EV charging infrastructure in the future.

Several studies also focused on charging demand at public charging points. For example, [van2013data] analyzed the actual usage patterns of public charging infrastructure in the city of Amsterdam, based on more than 109,000 charging events in the year 2012-13. Our focus in this paper is on home EV charging demand.

2 Data

Data in our study were collected as part of the Electric Nation project [electricnation, electricnation2019finalreport], which assessed the impact of EV charging on local electricity networks and identified that smart charging could help in demand management under such scenarios. Data, which contained charging transactions of energy consumption for different EV users, were collected from February 1, 2017 to December 30, 2018 and had a total of 80,313 observations. The transactions were spread across four stages: uncontrolled and trials 1, 2, and 3. While the uncontrolled stage allowed people to charge their EVs without any constraints on the quantum of energy consumed, in trial 1, smart charging was introduced to regulate the EV charging but without informing the consumers. In trial 2, consumers were given mobile applications to enable them to interact with the smart charging system. Trial 3 observations were biased as the consumers were given incentives to charge their EVs at specific times in a day. The processed data, after removing trial 3 data and non-recoverable, non-imputable missing data caused by technical glitches, consisted of 56,637 observations across 13 variables (table 1), which were then transformed into day-wise time series for analyses and modeling.

Note that a different time granularity, such as per hour, is certainly an interesting and relevant choice. However, such granularity was not suitable for our objective; it is more likely a choice in a real-time forecasting scenario and in a data-rich environment.

Variable Description of Variable
Charger ID ID of smart charger installed at consumer’s home
Participant ID ID of consumer participating in trials
Car kW Power rating of battery
Car kWh Energy capacity of battery
Group ID ID of group to which a consumer was assigned during trials
Trial Stage of trials (uncontrolled, 1, 2, or 3)
Adjusted Start Time Time at which EV was plugged-in to a smart charger
Adjusted Stop Time Time at which EV was plugged-out from a smart charger
Consumed kWh Energy consumed during EV charging
Active Charging Start Time at which EV actually started charging after plug-in
Car Make Manufacturer of EV
Car Model Model of EV
EV Type Type of EV (battery-operated, hybrid, or range extender)
Table 1: List of variables in EV charging transaction data

3 Methodology

Mathematically, we consider the following statistical learning problem for our objective:

Y = f(X) + ε

where

Y = energy consumption caused by EV charging

X = set of features used to forecast Y

f = learning algorithm that maps X to Y

ε = random error, independent of X, with mean 0

In this paper, the learning problem is a special case of this formulation, with X = {owners, day, season}, where owners, day, and season are the number of EV owners, the day of the week, and the season of the year respectively.

3.1 Clustering of EV Owners

A total of 26 different battery capacities were present in the data set used. Naturally, each battery capacity combined with car usage pattern could result in a completely different consumption pattern. This implied an important parameter choice in our study: the level of aggregation over battery capacities. Clearly, building a model for each of the 26 different battery capacities was impractical. At the same time, ignoring battery capacities and treating all consumption patterns identically meant ignoring intrinsic variability. A histogram of the battery capacities in the trials, whose variability is caused by the discrete nature of EV battery capacities, is shown below in figure 1.

Figure 1: Histogram of battery capacities

To validate our assumption of varying charging patterns with distinct battery capacities among consumers, k-means clustering was performed by transforming the transaction data to obtain a consumer-wise summary of energy consumption per charge against respective battery capacities. Figure 2 shows that, based on the consumption behaviour of consumers per transaction and their respective battery capacities, they can be grouped into an optimum number of three clusters, as increasing the number of clusters beyond three does not significantly decrease the variability, thereby validating our assumption that one model would fail to capture all the variability of consumer charging behaviour.

As mentioned, clustering suggested an optimum number of three clusters. Table 2 summarises the information about the three clusters. The Min Capacity and Max Capacity are respectively the minimum and maximum kWh capacities of the batteries present in each cluster, while Mean kWh/Charge is the average energy consumed by an EV user per transaction. We observe that as the battery capacities increase, the frequency of charging per day decreases and the mean energy consumed per transaction increases, suggesting that with increasing battery capacities, EV users are less likely to charge every day, but when they do, they consume more energy per transaction.

Figure 2: Optimal number of clusters
Cluster Min Capacity Max Capacity Mean kWh/Charge Charging Frequency/Day
1 4.4 18.7 5.68 0.68
2 22 41 14.30 0.44
3 60 100 26.80 0.36
Table 2: Summary of clusters
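As a rough illustration of this clustering step, the sketch below runs a plain Lloyd's k-means on synthetic (battery capacity, mean kWh per charge) pairs and computes the within-cluster sum of squares used in an elbow check; the synthetic data and group centres are illustrative assumptions loosely mirroring Table 2, not the trial data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the consumer-wise summary used in the paper:
# columns are (battery capacity kWh, mean energy consumed per charge kWh).
data = np.vstack([
    rng.normal([12, 5.7], [3.0, 1.0], size=(40, 2)),    # small batteries
    rng.normal([30, 14.3], [4.0, 2.0], size=(40, 2)),   # medium batteries
    rng.normal([80, 26.8], [10.0, 3.0], size=(20, 2)),  # large batteries
])

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns labels and within-cluster SS."""
    r = np.random.default_rng(seed)
    centers = X[r.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    wcss = sum(((X[labels == j] - centers[j]) ** 2).sum() for j in range(k))
    return labels, wcss

# Elbow check: the drop in within-cluster SS flattens after k = 3.
wcss_by_k = {k: kmeans(data, k)[1] for k in range(1, 7)}
```

The elbow criterion here is the same informal rule used in the paper: pick the smallest k beyond which adding clusters no longer reduces the variability much.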

3.2 Time Series Analysis

Based on the clustering analysis, we transformed the transaction data into three different day-wise time series (since we have less than 2 years’ data, we do not consider annual but only weekly seasonality in this study; hence, the seasonality of the time series is 7), each belonging to a unique cluster. It is worth mentioning that while transforming the transaction data into day-wise time series, several features were extracted which were not present in the transaction data. Table 3 lists all the variables (features and target) in the time series. All the features except for owners were easily extracted from the transaction data when converting them into day-wise time series. For example, trans was the count of all the transactions occurring on a given day, while users was the count of all the people who charged their EVs on a given day. However, owners indicated the count of people who owned EVs on a given day. In a real-world scenario, although the count of people with EVs would not change every day in a distribution network, it would gradually evolve over months and years. In the Electric Nation project, however, the number of participants joining the EV trials increased steadily, at a much higher rate than in a real-world scenario, resulting in the count of people with EVs changing almost every day. In a real-world set-up, DNOs would not have information on the actual number of people charging their vehicles every day, as it is a random variable that depends on many factors, for example, the day of the week and the battery state of charge (SOC), and would almost always be less than the number of EV owners in a network. However, based on charge-point installation notifications, the DNOs would have an estimate of the number of people in their network who own EVs; hence, knowledge of EV owners is essential to the objective. Besides owners, DNOs would have information on the season of the year and the day of the week too. In a nutshell, the information available to the DNOs would most likely contain EV owners, day of the week and season of the year. Any additional information available to the DNOs is uncertain; hence, our objective was to develop a forecast model that could be leveraged by the DNOs to forecast energy consumption based on this minimal available information. Since the number of owners could not be directly extracted from the transaction data, we made a few assumptions to compute it: (1) an EV owner joined the trials whenever he (she) charged his (her) EV for the first time; (2) an EV owner, after signing up, never dropped out of the trials until the trials ended. Under these assumptions, the count of EV owners increased with time during the trials.

In table 3, demand, or the total connected load (kWh) per day, is given by

demand = Σ_i c_i n_i

where

c_i = capacity of battery type i

n_i = number of transactions per day for battery type i
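As a toy check of this definition, summing capacity times transaction count over battery types (the battery labels and counts below are hypothetical):

```python
# Total connected load per day: sum over battery types of
# capacity (kWh) times the number of charging transactions that day.
capacities = {"leaf_24": 24.0, "zoe_41": 41.0, "tesla_75": 75.0}  # hypothetical
trans_per_day = {"leaf_24": 3, "zoe_41": 2, "tesla_75": 1}

demand = sum(capacities[k] * trans_per_day[k] for k in capacities)
```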

Variable Type Description of Variable Notation
day Feature Day of transactions
season Feature Season of the year of transactions
owners Feature Count of people with EVs per day
users Feature Count of people charging their EVs per day
trans Feature Count of transactions per day
demand Feature Total connected load (kWh) per day
consumed Target Total energy consumed (kWh) per day
Table 3: List of variables in day-wise time series data

The three time series generated from the transaction data had missing days between different pairs of dates. To impute the data for those missing days, we adopted a three-step method as discussed in [hyndman2018forecasting]: (1) an STL decomposition [cleveland1990stl] was computed to obtain seasonally adjusted data; (2) linear interpolation was then carried out on the seasonally adjusted data; (3) the seasonal component was added back to the linearly interpolated data. Besides imputing the missing values, a few observations across the three time series were identified as outliers and replaced with suitable values via a two-fold approach: (1) periodic STL decomposition was carried out to identify observations that seemed unusual relative to the rest [hyndman2018forecasting]; (2) time plots were analyzed to identify sudden changes in values, which in turn helped in identifying which observations were apparently unusual. The three time series for clusters 1, 2 and 3 had respectively 572, 573 and 573 observations after the final phase of data processing.
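A simplified sketch of the three-step imputation is given below; per-weekday means stand in for a full STL decomposition, and the series and period of 7 are illustrative assumptions.

```python
import numpy as np

def impute_seasonal(y, period=7):
    """Impute NaNs in a seasonal series: (1) estimate a seasonal component
    (seasonal means here, a simplified stand-in for STL), (2) linearly
    interpolate the seasonally adjusted series, (3) add the seasonal
    component back."""
    y = np.asarray(y, dtype=float)
    idx = np.arange(len(y))
    phase = idx % period
    # (1) seasonal component estimated from observed values only
    seasonal = np.array([np.nanmean(y[phase == p]) for p in range(period)])
    seasonal -= seasonal.mean()                  # centre the component
    adjusted = y - seasonal[phase]
    # (2) linear interpolation over the gaps
    ok = ~np.isnan(adjusted)
    adjusted = np.interp(idx, idx[ok], adjusted[ok])
    # (3) restore seasonality
    return adjusted + seasonal[phase]

# Usage: a weekly-seasonal series with three missing days.
t = np.arange(28, dtype=float)
y_true = 10 + 2 * np.sin(2 * np.pi * t / 7)
y_missing = y_true.copy()
y_missing[[5, 12, 13]] = np.nan
filled = impute_seasonal(y_missing)
```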

3.3 The Nested Modeling Approach

Depending upon what is assumed to be known when forecasting, we can classify forecasts into three categories [hyndman2018forecasting]: (1) ex-ante (to forecast the target, we need to forecast the features, as no information is available on their future values); (2) ex-post (information on features is available prior to forecasting); (3) scenario-based (possible scenarios for the features of interest to the objective are considered). In this study, scenario-based forecasting is the apparent choice.

(a) EV owners
(b) EV consumption (kWh)
Figure 3: Variation of EV owners and consumption

Furthermore, classical univariate methods such as exponential smoothing and ARIMA were not applicable to our scenario-based forecasting objective for the following reasons.

  • The trend (direction in which a time series slopes) in the consumption of energy was governed by the number of EV owners in the trials, which itself was a controlled variable. Hence, the trend of energy consumption during the trials did not represent the real-world scenario. Figures 3(a) and 3(b) show the upward trend of both owners and consumption of energy, indicating that as more people joined the trials every day, the consumption of energy increased. A univariate method would forecast based on the trend captured during the trials, resulting in inaccurate forecasts at a specified period of time in the future.

  • If a univariate method was used for forecasting, we would deviate from the objective of scenario-based forecasting, as the forecasts would correspond to a specific combination of owners, day, and season at a fixed time stamp in the future. In a nutshell, the univariate methods would not be able to generate forecasts for an arbitrary combination of owners, day, and season based on a user’s choice.

This formulation shows that the objective, involving scenario-based data, does not avail information from all the features in the time series data, i.e., while the time series data have six features, the data in the objective involve only three. Figure 4 shows how EV consumption varies within a week as well as across different seasons of the year.

(a) kWh vs day
(b) kWh vs season
Figure 4: Variation of EV consumption (kWh)

Besides this, Figure 5 enumerates the high values of the Pearson’s correlation coefficients among all the numeric variables. This indicates that features other than owners, day, and season might also have an effect on EV consumption, thereby necessitating further inspection. In addition, this also corroborates our assumption that if we use only those features from the time series data that are present in the scenario-based data to fit a model, we fail to make use of the additional information present in the time series data, adversely affecting the forecasting performance. Hence, to ensure that we extract maximal information from the time series data while addressing the mismatch between the feature spaces of the aforementioned data sets, we evaluate a nested modeling approach (figure 6) as explained below.

We observe from figure 5 that the correlation coefficient between demand and trans is 1, i.e., they are perfectly correlated. This is because demand is obtained by linearly transforming trans, as shown in section 3.2. However, demand also includes information on the EV battery capacities and hence might be influential in forecasting EV consumption, as battery capacity is an indicator of an EV’s consumption capacity.

(a) Correlogram (4.4 - 18.7 kWh)
(b) Correlogram (22 - 41 kWh)
(c) Correlogram (60 - 100 kWh)
Figure 5: Correlogram of numeric variables
  • In the real-world set-up, forecasts will be generated using the scenario-based data; this means that in our modeling framework, our test sample should be similar to the scenario-based data. To begin with, we first split the time series data into training and test samples and drop all the features from the test sample except for owners, day, and season, to generate a truncated test sample; this ensures that while the training sample resembles the time series data in feature space, the truncated test sample, with its three features, is equivalent to the scenario-based data of the DNOs. However, the target variable, EV consumption, is retained in both samples.

  • Since DNOs will need to generate forecasts using scenario-based data (equivalent to the truncated test sample) but we also need to leverage the additional information available in the training sample (equivalent to the time series data), we affix new features, called pseudo features or p-features (p stands for pseudo), to the truncated test sample, to obtain a modified test sample; these p-features are the forecasts of those original features, called o-features, which are dropped from the test sample after the train-test split. A p-feature is obtained by: (1) first, fitting a model of the o-feature corresponding to the p-feature, using owners, day, and season from the training sample; (2) second, generating forecasts of the o-feature using owners, day, and season from the truncated test sample, and appending them as the p-feature to the truncated test sample to obtain the modified test sample. For example, we can add a p-feature of users, called users_p, by first fitting a model of users using owners, day, and season from the training sample and then forecasting users using the same features from the truncated test sample, to obtain users_p. We repeat this step for all o-features except owners, day, and season in the training sample, to generate a modified test sample that contains owners, day, and season, and the p-features corresponding to all o-features.

  • After we obtain the modified test sample containing all p-features, we again fit a model of an o-feature (the target o-feature) using a different o-feature (the input o-feature) and generate another set of forecasts of the target o-feature. We compare the MAPEs of the new forecasts and the existing p-feature (already present in the modified test sample) to identify which forecasts yield the lower error. If the MAPE of the new forecasts is less than that of the existing p-feature, we assign the new forecasts to the p-feature, replacing its values; otherwise, we reject the new forecasts. We repeat this step for all o-features to obtain the final modified test sample, keeping in mind the causality among features.

  • We now use the training sample to fit a model for EV consumption using the o-features corresponding to the p-features. We then use the p-features from the final modified test sample to compute the forecasts for EV consumption. In the real-world set-up, DNOs will utilize the scenario-based data as the test sample and append p-features to it using the models of o-features trained on the complete time series data of the trials, as explained in the previous steps. Once the final modified version of the scenario-based data is obtained, EV consumption forecasts would be generated using the EV consumption models trained on the complete time series data, as explained earlier.

While creating pseudo features, the causality between two variables should be taken into consideration, i.e., if X causes Y but not vice-versa, then Y should be forecast as a function of X but not the other way around. For instance, users causes transactions but not vice-versa; hence, transactions should be forecast as a function of users, but not the reverse.

We do not use two or more p-features together to forecast EV consumption. This can be attributed to the causality among features and the fact that one p-feature can be forecast using others, indicating that using multiple p-features would cause the model to over-fit. For example, consider the case of generating trans_p using owners, day, and season. We can also generate trans_p separately using users_p, which is itself generated first using owners, day, and season, only to find that users_p yields a more accurate trans_p than that obtained from owners, day, and season. In that case, we retain the trans_p generated using users_p. We can now forecast EV consumption using either owners, day, and season, or users_p, or trans_p, but not a combination of these, as a combination would mean we indirectly include a feature or a set of features more than once while forecasting EV consumption.

We call this approach nested modeling as we repeatedly fit auxiliary models and generate forecasts internally within a nest-like loop before eventually forecasting EV consumption.
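A minimal sketch of this nested loop with one o-feature (users) is given below; the least-squares linear models, single scenario feature (owners), and synthetic data are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Synthetic day-wise series: owners grows over time; users and consumption
# depend on owners plus noise (illustrative stand-ins for the trial data).
owners = np.linspace(50, 250, n) + rng.normal(0, 3, n)
users = 0.5 * owners + rng.normal(0, 5, n)
consumed = 8.0 * users + rng.normal(0, 20, n)

split = int(0.7 * n)

def fit_predict(x_tr, y_tr, x_te):
    """Least-squares linear model y ~ [1, x], fit on train, applied to test."""
    A_tr = np.column_stack([np.ones(len(x_tr)), x_tr])
    beta, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
    return np.column_stack([np.ones(len(x_te)), x_te]) @ beta

# Step 1: the truncated test sample keeps only the scenario feature(s);
# the o-feature users is dropped from the test sample.
owners_tr, owners_te = owners[:split], owners[split:]

# Step 2: fit an auxiliary model of the o-feature on the training sample
# and forecast it on the test sample, giving the p-feature users_p.
users_p = fit_predict(owners_tr, users[:split], owners_te)

# Step 3: fit the consumption model on the training sample and forecast
# on the test sample via the p-feature.
consumed_hat = fit_predict(users[:split], consumed[:split], users_p)

test_mape = np.mean(np.abs((consumed[split:] - consumed_hat) / consumed[split:]))
```

The key point the sketch captures is that the test-time consumption forecast never touches the true users series, only its forecast users_p, exactly as a DNO with scenario-based data would have to do.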

(a) Step 1
(b) Step 2
(c) Step 3
Figure 6: The nested modeling approach

3.4 Evaluation on Variable Origin

The choice of train-test split is user-specific, and it cannot be ascertained that a given split is better than another. To obviate any bias due to a specific train:test split, we evaluate the forecasting performance on a variable origin. In performance evaluation on a variable origin, we first fit a model on the first 70% of the data (training sample) and then evaluate the performance on the last 30% of the data (test sample). Subsequently, we increase the training sample to 80% and 90% of the data and fit models on these samples; model performances are then evaluated on test samples comprising the last 20% and 10% of the data respectively. The final performance is the mean of the three performances. We choose the mean absolute percentage error (MAPE) as the error metric for performance evaluation.

It is worth mentioning that evaluation of performance on a rolling origin [hyndman2018forecasting] was also an alternative. However, in our case, this method was computationally expensive, as we tuned several hyperparameters across different algorithms to optimize forecasting performance.
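The variable-origin scheme can be sketched as follows; the seasonal-naive model is only an illustrative stand-in for the paper's models, and `fit_forecast` is a hypothetical wrapper name.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    actual, forecast = np.asarray(actual), np.asarray(forecast)
    return np.mean(np.abs((actual - forecast) / actual)) * 100

def evaluate_variable_origin(y, fit_forecast, train_fracs=(0.7, 0.8, 0.9)):
    """Fit on the first 70/80/90% of the series, score the remainder each
    time, and average the three MAPEs, as described in section 3.4."""
    scores = []
    for frac in train_fracs:
        split = int(len(y) * frac)
        fc = fit_forecast(y[:split], len(y) - split)
        scores.append(mape(y[split:], fc))
    return np.mean(scores)

# Usage with a seasonal-naive model (repeat the last observed week).
def seasonal_naive(train, horizon, period=7):
    last = train[-period:]
    return np.tile(last, horizon // period + 1)[:horizon]

y = 100 + 10 * np.sin(2 * np.pi * np.arange(140) / 7)
err = evaluate_variable_origin(y, seasonal_naive)
```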

4 Algorithms and Results

In this section, we discuss four algorithms applied to our learning problem using the nested modeling approach (section 3.3). Before fitting a model, data were normalized using min-max scaling as shown below.

x' = (x − x_min) / (x_max − x_min)

Here, x' is the normalized observation, while x is the actual observation; x_min and x_max are the minimum and maximum values of the feature being normalized. It is important to note that we used the parameters of the training sample to normalize not only the training sample but also the test sample, to avoid information leakage.
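A minimal sketch of this leakage-safe scaling, with illustrative numbers:

```python
import numpy as np

def minmax_fit(train):
    """Learn scaling parameters from the training sample only."""
    return train.min(axis=0), train.max(axis=0)

def minmax_apply(x, lo, hi):
    """Scale with training-sample parameters to avoid information leakage;
    test values outside the training range fall outside [0, 1] by design."""
    return (x - lo) / (hi - lo)

train = np.array([[10.0, 100.0], [20.0, 300.0], [30.0, 200.0]])
test = np.array([[25.0, 400.0]])

lo, hi = minmax_fit(train)
train_scaled = minmax_apply(train, lo, hi)
test_scaled = minmax_apply(test, lo, hi)
```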

Moreover, we also present a quantitative comparison of the performances of the final models to forecast EV consumption.

4.1 Time Series (TS) Regression

In TS regression, a dependent variable is modeled as a weighted linear sum of the independent variables, where all the variables are time series. In our case, we set time series regression as the benchmark algorithm, modeling EV consumption as a weighted linear combination of the features, where the weights are the regression coefficients. For example, the equation

consumed_t = β0 + β1 owners_t + β2 day_t + β3 season_t + ε_t

models consumed as a function of owners, day, and season, with ε_t being the regression error.
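A sketch of such a regression on synthetic data is shown below, with the categorical day-of-week entering via dummy variables; the coefficients and weekly profile are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 140
owners = np.linspace(20, 160, n)
day = np.arange(n) % 7                      # day-of-week index 0..6

# Categorical day enters via dummy variables (one column per day, no
# separate intercept, so the dummies absorb the baseline level).
X = np.column_stack([owners, (day[:, None] == np.arange(7)).astype(float)])

weekly = np.array([1.0, 0.8, 0.9, 1.0, 1.1, 1.6, 1.5])   # weekend bump
consumed = 4.0 * owners + 30 * weekly[day] + rng.normal(0, 3, n)

# Ordinary least squares for the regression coefficients.
beta, *_ = np.linalg.lstsq(X, consumed, rcond=None)
fitted = X @ beta
```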

4.2 Regression with ARIMA errors (reg-ARIMA)

In reg-ARIMA, an ARIMA model is fit on the errors of a regression model and the forecasts from both the regression and the ARIMA components are combined. It is particularly useful when the regression errors are highly correlated with each other, indicating that the regression model does not capture all the information in the data. Mathematically, a reg-ARIMA model to forecast y_t as a function of a feature x_t can be given by:

y_t = β0 + β1 x_t + η_t

where the regression error η_t follows an ARIMA process rather than white noise. Here, the AR and MA polynomial coefficients represent the model parameters corresponding to the AR and MA components of the ARIMA model, and its innovation term is the ARIMA (white-noise) error. In this paper, we implemented the Hyndman-Khandakar algorithm [hyndman2007automatic] to tune the orders of the AR and MA components and subsequently fitted the ARIMA model on the regression errors, which were obtained after fitting a regression model as explained in section 4.1.
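A stripped-down illustration of the two-stage idea, using an AR(1) error process in place of the general ARIMA errors; the data and coefficients are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
x = np.linspace(0, 10, n)

# Regression errors follow an AR(1) process (a minimal stand-in for the
# general ARIMA errors discussed in the text).
eta = np.zeros(n)
for t in range(1, n):
    eta[t] = 0.7 * eta[t - 1] + rng.normal(0, 1)
y = 2.0 + 3.0 * x + eta

# Stage 1: ordinary regression of y on x; residuals hold the leftover signal.
A = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ beta

# Stage 2: fit AR(1) to the residuals by regressing resid[t] on resid[t-1].
phi = (resid[1:] @ resid[:-1]) / (resid[:-1] @ resid[:-1])

# One-step-ahead forecast combines the regression and error components.
x_next = x[-1] + 10 / (n - 1)
y_next = beta[0] + beta[1] * x_next + phi * resid[-1]
```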

4.3 Extreme Gradient Boosting (XGB)

XGB is a scalable tree boosting system that leverages the Gradient Boosting [friedman2001greedy] framework to learn from data and offers superior computational efficiency over other acclaimed machine learning algorithms [xgboost]. The popularity of XGB can be ascertained from the fact that in 2015, 17 out of 29 challenge-winning solutions on the machine learning competition site Kaggle used XGB [chen2016xgboost].

Given the computational resources, we tested 108 combinations of hyperparameters for each iteration of evaluation on a variable origin via random search in the hyperparameter space. Since our data were sequential, we could not use cross-validation via random sub-sampling to tune the hyperparameters, as it would have disrupted the time dynamics of the data. Instead, we implemented time-slicing to split the training data into a variable-length training sub-sample and a fixed-length validation sample, and iteratively increased the size of the training sub-sample by one seasonal period. Figures 7(a) and 7(b) show time-slicing for hyperparameter tuning in XGB.

(a) Time-slicing (first iteration)
(b) Time-slicing (N iterations)
Figure 7: Time-slicing for hyperparameter tuning of XGB
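The time-slicing scheme can be sketched as an expanding-window index generator; the window lengths below are illustrative assumptions, with the step matching the weekly seasonality of 7 days.

```python
def time_slices(n, val_len=7, min_train=28, step=7):
    """Expanding-window time slices: each split uses a longer training
    sub-sample and the next fixed-length block as validation, preserving
    temporal order (no random sub-sampling)."""
    start = min_train
    while start + val_len <= n:
        train_idx = list(range(0, start))
        val_idx = list(range(start, start + val_len))
        yield train_idx, val_idx
        start += step                      # grow by one seasonal period

# Usage: slices over a 70-day training series.
slices = list(time_slices(70))
```

Each candidate hyperparameter combination would be scored on every validation block and the scores averaged, so no validation observation ever precedes its training data.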

4.4 LSTM Networks

LSTMs [hochreiter1997long] are an enhanced variant of Recurrent Neural Networks (RNNs) that overcomes a major limitation of conventional RNNs: the vanishing gradient problem, in which a network fails to learn long-term dependencies [hochreiter1998vanishing]. In a sequence prediction problem, learning long-term temporal dependencies alongside the present state of the system is essential for predicting the future state. Thanks to the gating mechanism inside their specially designed memory cells, LSTMs regulate the flow of information and enforce a constant error flow through the network, thereby avoiding the complications of vanishing and exploding gradients and enabling them to capture long-term temporal dependencies. This ability has piqued researchers' interest in leveraging LSTMs for a plethora of sequence prediction problems. For example, [bandara2019sales] used a special variant of LSTMs, known as LSTMs with peephole connections, to forecast sales demand in e-commerce. Moreover, [du2018time] proposed a sequence-to-sequence deep learning framework based on an LSTM encoder-decoder architecture for multivariate time series forecasting on air quality data. Similarly, [9107249] used the LSTM-based encoder-decoder architecture to predict long-term traffic flows, and [sehovac2020deep] assessed attention mechanisms with different types of RNN cells (vanilla, LSTM, and GRU) and forecasting horizons for electrical load forecasting.
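The gating mechanism described above can be made concrete with a single LSTM time step in NumPy. This is a textbook sketch of the standard cell, not the paper's implementation; the packing of the four gates into one weight matrix and the name `lstm_step` are illustrative choices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. The four gate pre-activations are packed
    row-wise into W (4H x D), U (4H x H), and b (4H,)."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])          # input gate: how much new information enters
    f = sigmoid(z[H:2*H])        # forget gate: how much old cell state is kept
    o = sigmoid(z[2*H:3*H])      # output gate: how much of the cell is exposed
    g = np.tanh(z[3*H:4*H])      # candidate cell update
    c = f * c_prev + i * g       # additive cell update -> near-constant error flow
    h = o * np.tanh(c)           # new hidden state
    return h, c
```

The additive update of the cell state `c` is what lets gradients flow across many time steps without vanishing, which is the property the text refers to as constant error flow.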

In this study, we compared a suite of network architectures, to assess their effect on forecasting performance: vanilla (one hidden layer between the input and output layers) vs stacked (multiple hidden layers between the input and output layers), unidirectional vs bidirectional [schuster1997bidirectional], and vector-output vs encoder-decoder [cho2014learning, sutskever2014sequence]. It is important to note that although both vector-output and encoder-decoder architectures are popular choices when the forecasting horizon is greater than 1, we evaluated them in our study as a special case of the horizon being 1. Leveraging these 2 architectures ensures that, with nominal transformations to the input data and output layers of the networks, the LSTMs can flexibly output forecasts of any horizon, 1 or higher, depending upon user requirements. While designing the networks, we set the maximum number of hidden layers to 2, i.e., the depth of the LSTMs would be either 1 (vanilla) or 2 (stacked) layers. This implies that in the case of the vector-output architecture, the number of LSTM layers cannot exceed 2. In addition, as encoder-decoder architectures have 2 components, an encoder and a decoder, where each component is an LSTM network, we specifically assessed the effect of the encoder depth (vanilla or stacked) on the forecasting performance by tuning the decoder depth for a given encoder depth. The number of neurons in each hidden layer was tuned between 50 and 200. We chose a batch size of 7, mean squared error as the cost function, and Adam [kingma2014adam] as the optimization algorithm to minimize the cost function, with the learning rate ranging from 0.0001 to 0.01 on a logarithmic scale. We also introduced regularization via neuron dropout [hinton2012improving], ranging from 0 (no dropout) to 0.4 (randomly removing 40% of neurons in each iteration of training), to avoid over-fitting. To tune the hyperparameters, we used Bayesian Optimization [snoek2012practical]. Given the computational resources, we set the maximum number of search iterations in Bayesian optimization to 10 and the maximum number of training epochs to 100.
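The search space and budget described above can be sketched in a few lines. The paper used Bayesian optimization [snoek2012practical]; for brevity this sketch substitutes a plain random search over the same ranges, and the names `sample_config`, `search`, and `evaluate` are hypothetical.

```python
import random

def sample_config(rng):
    """One draw from the hyperparameter space given in the text:
    50-200 neurons, dropout 0-0.4, learning rate log-uniform in
    [1e-4, 1e-2], batch size fixed at 7."""
    return {
        "units": rng.randint(50, 200),
        "dropout": rng.uniform(0.0, 0.4),
        "lr": 10 ** rng.uniform(-4.0, -2.0),
        "batch_size": 7,
    }

def search(evaluate, n_iter=10, seed=0):
    """Keep the configuration with the lowest validation score
    (e.g. MAPE after at most 100 training epochs)."""
    rng = random.Random(seed)
    best_score, best_cfg = float("inf"), None
    for _ in range(n_iter):
        cfg = sample_config(rng)
        score = evaluate(cfg)
        if score < best_score:
            best_score, best_cfg = score, cfg
    return best_score, best_cfg
```

A Bayesian optimizer would replace the uniform draws with a surrogate model that proposes promising configurations, which matters precisely when the iteration budget is as small as 10.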

4.5 Results

Table 4 enumerates the lowest MAPE values of EV consumption forecasts for each feature in the data set, across all the 4 algorithms and 3 clusters. The MAPE values of the p-features are tabulated in Appendix A. We observe that for all the 3 clusters, LSTMs deliver the lowest MAPE values, and thereby the best forecasting performance among all the models developed.
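For reference, the error metric reported throughout is the mean absolute percentage error, which can be computed as follows (the function name `mape` is illustrative):

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent.
    Actual values must be non-zero."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))
```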

Cluster Feature (s) for Forecasting Regression reg-ARIMA XGB LSTMs
1 , , 26.80 26.33 22.13 13.14
27.13 26.49 23.20 17.27
26.73 26.05 23.62 16.90
26.50 22.46 21.81 16.89
2 , , 34.61 18.00 17.43 15.41
33.57 21.20 19.31 17.26
32.54 17.96 18.76 17.37
32.47 18.02 17.98 17.41
3 , , 49.40 36.54 33.40 31.35
50.74 42.03 39.00 32.35
49.35 40.06 46.68 32.45
48.73 38.19 42.91 32.16
Table 4: MAPE of EV consumption forecasts for all clusters

We note that out of the 12 best models (4 algorithms × 3 clusters), 6 (3 for time series regression, 2 for reg-ARIMA, and 1 for XGB) use p-features to deliver the best forecasting performance. On investigating the performance of the 2 best algorithms, XGB and LSTMs, we further observe that while XGB attains its lowest MAPE with the p-feature in cluster 1 but without p-features in clusters 2 and 3, LSTMs never use the p-features. However, it is worth mentioning that the XGB and LSTM results are constrained by the number of search iterations during hyperparameter tuning; hence, the possibility of obtaining better forecasting performance with p-features for some or all of the clusters cannot be ruled out. This necessitates appending the relevant p-features to the DNOs' scenario-based data before forecasting EV consumption using nested modeling, as discussed in section 3.3.

Cluster LSTM Architecture LSTM Depth Learning Direction MAPE
1 Encoder-Decoder Vanilla (Encoder) Unidirectional 13.14
Vector-Output Stacked Bidirectional 13.48
Encoder-Decoder Vanilla (Encoder) Bidirectional 13.60
Vector-Output Vanilla Bidirectional 13.70
Encoder-Decoder Stacked (Encoder) Unidirectional 13.93
2 Vector-Output Vanilla Bidirectional 15.41
Vector-Output Vanilla Unidirectional 15.43
Vector-Output Stacked Unidirectional 15.91
Encoder-Decoder Stacked (Encoder) Bidirectional 15.99
Encoder-Decoder Stacked (Encoder) Unidirectional 16.18
3 Vector-Output Vanilla Bidirectional 31.35
Vector-Output Vanilla Unidirectional 31.40
Encoder-Decoder Vanilla (Encoder) Unidirectional 31.50
Encoder-Decoder Stacked (Encoder) Unidirectional 31.59
Vector-Output Stacked Unidirectional 31.63
Table 5: Top 5 LSTM architectures and MAPEs to forecast EV consumption

As discussed in section 4.4, we compared several LSTM architectures and found that no single architecture is suitable for all the clusters. Table 5 lists the top 5 architectures, along with their respective MAPE values, for forecasting EV consumption in each cluster. It is worth mentioning that all these networks use , , and as features. We observe that the MAPE values of the top 5 models in each cluster differ by less than 1 percentage point, implying that any of these architectures can be chosen to forecast EV consumption without a significant loss in forecasting performance. However, following Occam's razor, we should choose the architecture that minimizes model complexity (the fewest model parameters) without detracting from forecasting performance.
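Model complexity can be compared directly via parameter counts. A standard LSTM layer with input dimension D and H units has 4(HD + H² + H) trainable weights (four gates, each with input, recurrent, and bias terms); the concrete feature count and layer width below are illustrative, not taken from the paper.

```python
def lstm_params(input_dim, hidden):
    """Trainable weights in one standard LSTM layer:
    4 gates x (input weights + recurrent weights + biases)."""
    return 4 * (hidden * input_dim + hidden * hidden + hidden)

# Illustrative comparison with 3 input features and 100 units per layer:
vanilla = lstm_params(3, 100)                   # single-layer network
stacked = vanilla + lstm_params(100, 100)       # second layer takes H inputs
```

With these assumed sizes, stacking a second layer roughly triples the weight count, which is the kind of complexity difference Occam's razor would weigh against a sub-percentage-point MAPE gain.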

5 Conclusions

In this study, we evaluated several models, ranging from linear statistical models, such as time series regression, to non-linear artificial neural networks, such as LSTMs, to forecast the daily energy consumption caused by EV charging. We observed that LSTMs delivered the best forecasting performance among all the algorithms considered, with MAPEs of 13.14, 15.41, and 31.35 for clusters 1, 2, and 3 respectively. The models were developed keeping in view the minimal information that would certainly be available to the DNOs in the future. However, it is likely that there are more contributors to the variation in EV consumption, and including more information would support a better understanding of its variability, leading to the identification of more important features for forecasting. In fact, minimal information appears to adversely affect the forecasting performance as we move towards clusters with higher battery capacities (table 4). The decreasing correlation coefficients of with the features in the data set, especially , figure 5, also corroborate that the limited set of features fails to capture significant variability in EV consumption as we move towards EVs with higher battery capacities. Statistical analysis reveals that as we move towards clusters with higher battery capacities, the fraction of EV owners charging their EVs per day drops, leading to a reduced number of total charging transactions per day. This can be attributed to several factors, such as the range of the vehicle (the distance the vehicle can travel before needing to recharge). Given the same vehicle usage (e.g., driving style, weather, etc.), a higher-capacity battery offers a greater range than a smaller-capacity battery, assuming both batteries are initially charged to 100% of their capacities.
Vehicles with a greater battery capacity (higher range) are more likely to complete their next journey without charging again than those with a lower battery capacity, leading to lower charging frequencies. Under such a scenario, if the number of EV owners is the same in two clusters with different battery capacities, fewer EVs would get charged in the cluster with the higher-capacity batteries, as the range of the vehicle would also influence the charging frequency. On a similar note, we can identify more relevant features that might help explain the variability in EV consumption. A few areas worth exploring in future endeavours are summarised below.

  • The framework we consider in this work is that of minimal information that can be used to forecast EV consumption. Within the data available for our study, this information comprises EV ownership, day of the week, and season of the year. However, there is scope to collect additional information that is generally available to city councils and DNOs, such as socio-demographic information on EV owners. It is conceivable that different socio-economic profiles would have different consumption patterns, and it would be interesting to explore models that take these into account. To this end, a future study could combine relevant choice models, which explain EV charging preferences, with forecast models.

  • We can easily verify from figure 3(a) that the number of EV owners, and hence users charging their EVs, is very small in the initial days of the trials, leading to very small energy consumption. Absolute errors on such small values amplify the percentage errors far more than they do on larger values. A high percentage error can therefore be misleading: it fails to capture the real picture and raises a false alarm driven by poor forecasts of small values. Hence, to gain a more realistic understanding of the forecast accuracy of the models, we need to work with data that better reflect the real-world scenario; dropping observations that do not reflect that scenario might help in getting rid of the false alarms.
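The amplification effect described in this point is easy to demonstrate: the same absolute error produces wildly different percentage errors depending on the size of the actual value. The consumption figures below are invented purely for illustration.

```python
def ape(actual, forecast):
    """Absolute percentage error of a single forecast, in percent."""
    return 100.0 * abs(actual - forecast) / actual

# The same 2 kWh absolute miss on an early-trial day vs a mature-trial day:
early = ape(4.0, 6.0)        # tiny consumption -> 50% error
mature = ape(400.0, 402.0)   # same absolute miss -> 0.5% error
```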

  • In our methodology, we could not leverage the autocorrelations among the lagged values of the target variable, as we forecast a specific value of EV consumption from one instance, or observation, of scenario-based data: we forecast EV consumption from a given day's features but do not include information on EV consumption from previous days, losing relevant information. A plausible solution to this issue would be multi-step forecasting, in which the forecasting horizon is greater than 1, i.e., for a given input, instead of forecasting one time-step (e.g., the next day), forecasts are generated for a sequence of time-steps (e.g., the next 7 days). In multi-step forecasting, we train an algorithm to first observe a sequence of inputs, called an input window, and then forecast a sequence of corresponding outputs, called an output window, before moving on to further sequences of input-output pairs; this is unlike the current learning methodology, where both windows have length 1. A multi-step forecasting scenario necessitates large volumes of data so that a large number of training samples, comprising such input and output windows, can be generated.
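Generating such input-output window pairs from a daily series is a simple sliding-window operation; the function name `make_windows` below is illustrative.

```python
def make_windows(series, n_in, m_out):
    """Slide over a sequence to build (input window of length n_in,
    output window of length m_out) training pairs."""
    pairs = []
    for start in range(len(series) - n_in - m_out + 1):
        x = series[start:start + n_in]
        y = series[start + n_in:start + n_in + m_out]
        pairs.append((x, y))
    return pairs
```

Note how quickly the sample count shrinks as the windows grow: a series of length N yields N - n_in - m_out + 1 pairs, which is why multi-step forecasting demands large volumes of data.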

6 Acknowledgement

This study was sponsored by grants from the European Regional Development Fund, EA Technology, and Lancaster University, facilitated by the Centre for Global Eco-Innovation, Lancaster University. We express our gratitude to Dr. Christopher Kirkbride (Lancaster University) and Dr. Florian Dost (The University of Manchester) for their constructive appraisal of the study.


Appendix A MAPE of p-feature Forecasts

Tables 6, 7, and 8 enumerate the lowest MAPE values of the p-feature forecasts, for all possible sets of input features to the p-features, across all the 4 algorithms and 3 clusters.

We see that LSTMs outperform the other algorithms in generating forecasts for the p-features as well. Since can be forecast using , , and or , there are 2 possible sets of input features to forecast . Similarly, can be forecast using , , and or and ; as such, there are 2 possible sets of input features to forecast . It is important to note that since was obtained by linearly transforming , we would not use to forecast .

Feature (s) Cluster Regression reg-ARIMA XGB LSTMs
, , 1 19.38 22.00 18.55 12.01
2 21.99 14.42 13.60 11.44
3 29.42 28.55 23.93 23.87
Table 6: MAPE of p-users () forecasts
Feature (s) Cluster Regression reg-ARIMA XGB LSTMs
, , 1 23.35 23.54 20.16 13.33
2 25.20 13.85 13.91 12.16
3 32.59 31.04 28.24 28.17
1 23.05 23.10 20.71 16.24
2 25.67 15.04 15.31 13.46
3 32.49 30.92 28.52 27.89
Table 7: MAPE of p-trans () forecasts
Feature (s) Cluster Regression reg-ARIMA XGB LSTMs
, , 1 23.52 24.36 20.47 13.96
2 25.45 14.31 13.86 12.58
3 32.33 31.39 28.08 27.55
1 23.44 23.99 20.71 17.16
2 25.92 16.51 15.13 13.71
3 32.05 29.70 28.55 27.89
Table 8: MAPE of p-demand () forecasts