1 Introduction
Globally, EV charging infrastructure has been expanding rapidly to match the growth of EVs. A study by the UK's National Grid projects 90% penetration of EVs by 2050, leading to an additional energy demand of 46 TWh on top of the total energy demand of 308 TWh in 2016 [robinson2018electric]. My Electric Avenue (MEA), a 3-year project conducted from 2013 to 2015 in the UK to explore the impact of charging clusters of EVs at peak times on electricity networks, predicted that reinforcement of low-voltage (LV) distribution networks would be required when 40-70% of the vehicles would be EVs [godfrey2016carconnect]. MEA also observed that, with rising uptake of EVs, there is a high probability of cluster formation within localities [godfrey2016carconnect], i.e., clusters of consumers whose charging requirements occur at the same time. The additional energy demand caused by such clusters would in turn increase the power demand (rate of consumption of energy, measured in kW) and eventually stress the distribution networks. Hence, it is imperative for distribution network operators (DNOs) to have an estimate of the additional energy drawn from the grid solely by EV charging, to ensure seamless demand management in their local distribution networks, especially during daily peak hours.
Estimating the energy consumption requirements prior to the widespread adoption of EV charging is part of emergency preparedness in distribution networks, as any additional bulk demand on the network may lead to frequent outages or cause significant damage to the electric infrastructure. It is now widely acknowledged that significant EV uptake will put stress on distribution networks, prompting many studies to explore the impacts of EVs on electricity infrastructure. For example, in their review of the factors relating to plug-in hybrid EVs (PHEVs) that have an impact on distribution networks, [green2011impact] found that driving patterns, charging characteristics, charge timing and vehicle penetration were most relevant. Besides this, [foley2013impacts] studied the impacts of EV charging in an actual working electricity market in Ireland and showed that EV charging had a significant impact on the wholesale electricity market and that off-peak charging was beneficial. Moreover, while [neaimeh2015probabilistic] used a probabilistic approach, combining EV charging and smart meter household consumption data, to highlight the need for better planning to deal with the stochastic nature of charging demand, [xydas2016data] proposed a 'risk level' index using fuzzy logic to assess the impact of EV power demand on distribution networks. Furthermore, based on the findings from the Victorian EV trials in Australia, [khoo2014statistical] projected the mean and maximum percentage increases in power demand at 3.27-5.70% and 5.72-9.79% respectively for the summer of 2032/33. Given the impact on existing infrastructure, a number of studies naturally focused on predicting either EV energy or power demand using different models and assumptions. For example, [xydas2013electric]
implemented data mining methods, such as decision tables, decision trees, artificial neural networks and support vector machines, to forecast EV load (power demand) using data on the previous day's load, week number, day of the week, type of day, number of new plug-ins every half-hour and total charging connections every half-hour. Moreover,
[wang2015electric] proposed an offline algorithm based on driving behaviour, road topography information, and traffic situation that gave two energy consumption results, one for the maximum driving speed and the other for the most economical driving speed, to give the driver a first impression of the possible energy consumption, and therefore the range the EV can cover, even before the actual trip. In addition, [wang2016online] also proposed an online energy consumption algorithm to adjust the energy consumption prediction while driving battery EVs (BEVs), based on a number of factors such as vehicle characteristics, driving behaviour, route information, traffic states and weather conditions. Furthermore, while [majidpour2016forecasting] forecast energy consumption over a horizon of 24 hours based on historical data from two different data sets, customer charging profiles and station outlet measurements, [arias2016electric] used historical weather and traffic data to forecast EV charging power demand. Based on studies of the Korean EV market, [moon2018forecasting] estimated the changes in energy demand based on consumer preferences for EVs, charge time of the day and types of EV supply equipment (EVSE); total energy demand was estimated using the total number of EV owners, average distance travelled per day, and average fuel efficiency of current EVs. Moving further, [lopez2018demand] proposed a demand response strategy based on machine learning to control EV charging in response to real-time pricing, such that the overall energy cost of an EV was minimized. While it is natural for many factors, as explained and used within the above studies, to affect EV consumption, most of them are typically not available for use within a forecast model. For example, information such as driver characteristics, including behavioural traits, is hard to obtain.
Similarly, route choice and trip details of EV users are unlikely to be available on a continuous basis. This motivated us to consider a more minimalist approach which takes the most basic information, such as EV ownership, day of the week, and season of the year, that is likely to be available in the future, to develop a forecast model that estimates the consumption of energy from the grid caused by EV charging at home. More specifically, in this paper we are interested in the following question: assuming EV owners have access to unconstrained charging at home, what is the expected EV energy consumption at a given ownership level in a given time period? It is important to note that in several previous studies [xu2019modal, wu2015electric], the term EV consumption indicated the consumption of the energy stored in an EV's battery, based on a set of features such as EV kinematics, driving data, battery state of charge, and so on. In our study, however, EV consumption refers to the consumption of energy by an EV to charge its battery when connected to the electricity grid. We explore several models which take as input a future scenario of total EV ownership and time, hereafter referred to as scenario-based data, to forecast the expected EV energy consumption. Moreover, our focus is forecasting energy consumption at the more aggregated level of per day, compared to some studies which focused on intra-day, per-hour forecasts. Utilities are interested in both the energy consumption (kWh) and the additional load (kW) on LV networks caused by EV charging. While energy consumption drives the income generated by utility companies, additional load affects their costs, as it drives the need for reinforcement of networks.
An estimate of the additional energy consumption on LV distribution networks caused by EV charging is extremely useful for policymakers, utilities, and city and district administration councils, for instance, in planning the rollout of EV charging infrastructure in the future.
Several studies also focused on charging demand at public charging points. For example, [van2013data] analyzed the actual usage patterns of public charging infrastructure in the city of Amsterdam, based on more than 109,000 charging events in the years 2012-13. Our focus in this paper is on home EV charging demand.
2 Data
Data in our study were collected as part of the Electric Nation project [electricnation, electricnation2019finalreport]
, which assessed the impact of EV charging on local electricity networks and identified that smart charging could help demand management under such scenarios. The data, containing charging transactions of energy consumption for different EV users, were collected from February 1, 2017 to December 30, 2018 and comprised a total of 80,313 observations. The transactions were spread across four stages: uncontrolled and trials 1, 2, and 3. While the uncontrolled stage allowed people to charge their EVs without any constraints on the quantity of energy consumed, in trial 1, smart charging was introduced to regulate the EV charging but without informing the consumers. In trial 2, consumers were given mobile applications to enable them to interact with the smart charging system. Trial 3 observations were biased, as the consumers were given incentives to charge their EVs at specific times of day. The processed data, after removing trial 3 data and non-recoverable, non-imputable missing data caused by technical glitches, consisted of 56,637 observations across 13 variables (table 1), which were then transformed into day-wise time series for analyses and modeling. Note that a different time granularity, such as per hour, is certainly an interesting and relevant choice; however, such granularity was not suitable for our objective and is more likely a choice in a real-time forecasting scenario and in a data-rich environment.
Variable  Description of Variable 

Charger ID  ID of smart charger installed at consumer's home 
Participant ID  ID of consumer participating in trials 
Car kW  Power rating of battery 
Car kWh  Energy capacity of battery 
Group ID  ID of group to which a consumer was assigned during trials 
Trial  Stage of trials (uncontrolled, 1, 2, or 3) 
Adjusted Start Time  Time at which EV was plugged in to a smart charger 
Adjusted Stop Time  Time at which EV was plugged out from a smart charger 
Consumed kWh  Energy consumed during EV charging 
Active Charging Start  Time at which EV actually started charging after plug-in 
Car Make  Manufacturer of EV 
Car Model  Model of EV 
EV Type  Type of EV (battery-operated, hybrid, or range extender) 
3 Methodology
Mathematically, we consider the following statistical learning problem for our objective:

Y = f(X) + ε,     (3)

where
Y = energy consumption caused by EV charging,
X = set of features used to forecast Y,
f = learning algorithm that maps X to Y, and
ε = random error, independent of X, with mean 0.

In this paper, the learning problem is a special case of (3), with X = {O, D, S}, where O, D, and S are the number of EV owners, the day of the week and the season of the year respectively.
3.1 Clustering of EV Owners
A total of 26 different battery capacities were present in the data set used. Naturally, each battery capacity, combined with car usage pattern, could result in a completely different consumption pattern. This implied an important parameter choice in our study: aggregation at the battery capacity level. Clearly, building a model for each of the 26 battery capacities was impractical. At the same time, ignoring battery capacities and treating all consumption patterns identically meant ignoring intrinsic variability. A histogram depicting the variability of battery capacities, caused by the discrete nature of EV battery capacities in the trials, is shown in figure 1.
To validate our assumption of varying charging patterns across distinct battery capacities among consumers, k-means clustering was performed after transforming the transaction data to obtain a consumer-wise summary of energy consumption per charge against the respective battery capacities. Figure 2 shows that, based on the consumption behaviour of consumers per transaction and their respective battery capacities, they can be grouped into an optimum number of three clusters, as increasing the number of clusters beyond three does not significantly decrease the variability; this validates our assumption that one model would fail to capture all the variability of consumer charging behaviour. Table 2 summarises the three clusters. The minimum and maximum capacities are respectively the smallest and largest battery capacities (kWh) present in each cluster, while the mean kWh/charge is the average energy consumed by an EV user per transaction. We observe that as battery capacities increase, the frequency of charging per day decreases and the mean energy consumed per transaction increases, suggesting that with larger batteries, EV users are less likely to charge every day, but when they do, they consume more energy per transaction.
Cluster  Min Capacity (kWh)  Max Capacity (kWh)  Mean kWh/Charge  Charging Frequency/Day 
1  4.4  18.7  5.68  0.68 
2  22  41  14.30  0.44 
3  60  100  26.80  0.36 
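The clustering step above can be sketched as follows. This is a minimal illustration on synthetic consumer summaries (the actual Electric Nation values are not reproduced), using scikit-learn's KMeans and the within-cluster sum of squares (inertia) as the elbow criterion for choosing the number of clusters.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical consumer-wise summary: one row per consumer, with battery
# capacity (kWh) and mean energy consumed per charging transaction (kWh).
rng = np.random.default_rng(0)
summary = np.vstack([
    np.column_stack([rng.uniform(4, 19, 50), rng.normal(5.7, 1.0, 50)]),
    np.column_stack([rng.uniform(22, 41, 50), rng.normal(14.3, 2.0, 50)]),
    np.column_stack([rng.uniform(60, 100, 30), rng.normal(26.8, 3.0, 30)]),
])

X = StandardScaler().fit_transform(summary)

# Elbow check: within-cluster sum of squares (inertia) for k = 1..6.
inertias = {k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
            for k in range(1, 7)}

# Final grouping with the chosen k = 3.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
```

In practice one would plot `inertias` against k and look for the point beyond which the decrease in variability flattens out, as in figure 2.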
3.2 Time Series Analysis
Based on the clustering analysis, we transformed the transaction data into three different day-wise time series, each belonging to a unique cluster (since we have less than 2 years' data, we consider only weekly, not annual, seasonality in this study; hence, the seasonal period of the time series is 7). It is worth mentioning that while transforming the transaction data into day-wise time series, several features were extracted which were not present in the transaction data. Table 3 lists all the variables (features and target) in the time series. All the features except for owners were easily extracted from the transaction data when converting them into day-wise time series. For example, trans was the count of all the transactions occurring on a given day, while users was the count of all those people who charged their EVs on a given day. However, owners indicated the count of people who owned EVs on a given day. In a real-world scenario, although the count of people with EVs would not change every day in a distribution network, it would gradually evolve over months and years. In the Electric Nation project, however, the number of participants joining the EV trials increased steadily at a much higher rate than in a real-world scenario, resulting in the count of people with EVs changing almost every day. In a real-world set-up, DNOs would not have information on the actual number of people charging their vehicles each day, as it is a random variable that depends on many factors, e.g., day of the week, battery state of charge (SOC), and so on, and would almost always be less than the number of EV owners in a network. However, based on charge-point installation notifications, the DNOs would have an estimate of the number of people in their network who own EVs; hence, knowledge of EV owners is essential to the objective. Besides owners, DNOs would also have information on the season of the year and the day of the week. In a nutshell, the information available to the DNOs would most likely contain EV owners, day of the week and season of the year. Any additional information available to the DNOs is uncertain; hence, our objective was to develop a forecast model that the DNOs could leverage to forecast energy consumption from this minimal available information.
Since the number of owners could not be directly extracted from the transaction data, we made a few assumptions to compute it: (1) an EV owner joined the trials on the day they charged their EV for the first time; (2) an EV owner, after signing up, never dropped out until the trials ended. Under these assumptions, the count of EV owners increased with time during the trials.
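The transformation from transaction data to a day-wise series, including the owners computation under the two assumptions above, can be sketched with pandas; the column names and values here are illustrative, not the actual Electric Nation schema.

```python
import pandas as pd

# Hypothetical transaction log; column names are illustrative.
tx = pd.DataFrame({
    "participant_id": [1, 1, 2, 3, 2, 1],
    "adjusted_start": pd.to_datetime([
        "2017-02-01 18:00", "2017-02-02 19:30", "2017-02-02 08:00",
        "2017-02-03 21:00", "2017-02-03 07:45", "2017-02-03 18:15"]),
    "consumed_kwh": [6.2, 5.8, 14.1, 27.0, 13.5, 6.0],
})
tx["date"] = tx["adjusted_start"].dt.date

# Day-wise aggregation of the features directly available per day.
daily = tx.groupby("date").agg(
    trans=("participant_id", "size"),      # count of transactions per day
    users=("participant_id", "nunique"),   # distinct people charging per day
    consumed=("consumed_kwh", "sum"),      # total energy consumed (kWh)
).reset_index()

# Owners: a participant counts as an owner from their first transaction
# onwards and never drops out (the two assumptions above).
first_seen = tx.groupby("participant_id")["date"].min()
daily["owners"] = daily["date"].map(lambda d: (first_seen <= d).sum())
```

The resulting frame grows by one owner each day a new participant first charges, mirroring the steady intake of trial participants.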
The feature demand (total connected load per day) was computed as

demand_t = Σ_i C_i T_{i,t},

where
C_i = capacity of battery i, and
T_{i,t} = number of transactions on day t for battery capacity C_i.
Variable  Type  Description of Variable  Notation 

day  Feature  Day of the week of transactions  D 
season  Feature  Season of the year of transactions  S 
owners  Feature  Count of people with EVs per day  O 
users  Feature  Count of people charging their EVs per day  U 
trans  Feature  Count of transactions per day  T 
demand  Feature  Total connected load (kWh) per day  L 
consumed  Target  Total energy consumed (kWh) per day  Y 
The three time series generated from the transaction data had missing days between different pairs of dates. To impute the data for those missing days, we adopted a three-step method, as discussed in [hyndman2018forecasting]: (1) an STL decomposition [cleveland1990stl] was computed to obtain seasonally adjusted data; (2) linear interpolation was then carried out on the seasonally adjusted data; (3) the seasonal component was added back to the linearly interpolated data. Besides imputing the missing values, a few observations across the three time series were identified as outliers and replaced with suitable values via a two-fold approach: (1) a periodic STL decomposition was carried out to identify observations that seemed unusual relative to the rest [hyndman2018forecasting]; (2) time plots were analyzed to identify sudden changes in values, which in turn helped in identifying which observations were apparently unusual. The three time series for clusters 1, 2 and 3 had respectively 572, 573 and 573 observations after the final phase of data processing.
3.3 The Nested Modeling Approach
Depending upon what is assumed to be known when forecasting, we can classify forecasts into three categories [hyndman2018forecasting]: (1) ex-ante (to forecast the target, we need to forecast the features, as no information is available on their future values); (2) ex-post (information on the features is available prior to forecasting); (3) scenario-based (possible scenarios for the features of interest to the objective are considered). In this study, scenario-based forecasting is the apparent choice. Furthermore, classical univariate methods such as exponential smoothing and ARIMA were not applicable in our scenario-based forecasting objective, for the following reasons.

The trend (the direction in which a time series slopes) in the consumption of energy was governed by the number of EV owners in the trials, which was itself a controlled variable. Hence, the trend of energy consumption during the trials did not represent the real-world scenario. Figures 3(a) and 3(b) show the upward trend of both owners and energy consumption, indicating that as more people joined the trials each day, the consumption of energy increased. A univariate method would forecast based on the trend captured during the trials, resulting in inaccurate forecasts at a specified period of time in the future.

If a univariate method were used for forecasting, we would deviate from the objective of scenario-based forecasting, as the forecasts would correspond to a specific combination of owners, day, and season at a fixed time stamp in the future. In a nutshell, univariate methods would not be able to generate forecasts for any user-chosen combination of owners, day, and season.
Table 3 shows that the objective, involving scenario-based data, does not avail itself of information from all the features in the time series data, i.e., while the time series data have six features, the data in the objective involve only three. Figure 4 shows how EV consumption varies within a week as well as across different seasons of the year.
Besides this, figure 5 enumerates the high values of the Pearson's correlation coefficients among all the numeric variables. This indicates that features other than owners might also have an effect on EV consumption, thereby necessitating further inspection. In addition, this also corroborates our assumption that if we fit a model using only those features from the time series data that are present in the scenario-based data, we fail to make use of the additional information present in the time series data, adversely affecting the forecasting performance. Hence, to ensure that we extract maximal information from the time series data while addressing the mismatch between the feature spaces of the aforementioned data sets, we evaluate a nested modeling approach (figure 6), as explained below.
We observe from figure 5 that the correlation coefficient between demand and trans is 1, i.e., they are perfectly correlated. This is because demand is obtained by linearly transforming trans, as shown in section 3.2. However, demand also includes information on the EV battery capacities and hence might be influential in forecasting EV consumption, as battery capacity is an indicator of an EV's consumption capacity.
In the real-world set-up, forecasts will be generated using the scenario-based data; this means that in our modeling framework, the test sample should be similar to the scenario-based data. To begin with, we split the time series data into training and test samples and drop all the features from the test sample except for owners, day, and season, to generate a truncated test sample; this ensures that while the training sample resembles the time series data in feature space, the truncated test sample, with its three features, is equivalent to the scenario-based data of the DNOs. However, the target variable, EV consumption, is retained in both samples.

Since DNOs will need to generate forecasts using scenario-based data (equivalent to the truncated test sample), but we also need to leverage the additional information available in the training sample (equivalent to the time series data), we affix new features, called pseudo-features or p-features, to the truncated test sample, to obtain a modified test sample; these p-features are actually forecasts of the original features, called o-features, which were dropped from the test sample after the train-test split. A p-feature is obtained by: (1) fitting a model of an o-feature using owners, day, and season from the training sample; (2) generating forecasts of that o-feature using owners, day, and season from the truncated test sample, and appending them as the p-feature to the truncated test sample to obtain the modified test sample. For example, we can add a p-feature of users by first fitting a model of users on owners, day, and season from the training sample and then forecasting users using the same features from the truncated test sample. We repeat this step for all o-features other than owners, day, and season in the training sample, to generate a modified test sample that contains owners, day, and season along with the p-features corresponding to all the dropped o-features.

After we obtain the modified test sample containing all the p-features, we again fit a model of an o-feature (the target o-feature) using a different o-feature (the input o-feature) and generate another set of forecasts of the target o-feature. We compare the MAPE of these forecasts with the MAPE of the corresponding p-feature already present in the modified test sample, to identify which forecasts yield the lower error. If the new forecasts have a lower MAPE, we assign them to the p-feature, replacing its values; otherwise, we reject them. We repeat this step for all p-features to obtain the final modified test sample, keeping in mind the causality among features.

We now use the training sample to fit a model for EV consumption using the o-features corresponding to the features retained in the final modified test sample. We then use the final modified test sample to compute the forecasts for EV consumption. In the real-world set-up, DNOs will utilize the scenario-based data as the test sample, appending p-features to it using the models of o-features trained on the complete time series data of the trials, as explained in the previous steps. Once the final modified version of the scenario-based data is obtained, EV consumption forecasts would be generated using the EV consumption models trained on the complete time series data, as explained earlier.
While creating pseudo-features, causality between two variables should be taken into consideration, i.e., if X causes Y but not vice versa, then Y should be forecast as a function of X but not the other way around. For instance, users causes transactions but not vice versa; hence, transactions should be forecast as a function of users, but not the reverse.
We do not use two or more p-features together to forecast EV consumption. This can be attributed to the causality among features and the fact that one p-feature can be forecast using another, indicating that using multiple p-features would cause the model to overfit. For example, consider generating the p-feature of trans using owners, day, and season. We can also generate it separately using the p-feature of users, which is itself generated first from owners, day, and season, only to find that the latter yields a more accurate p-feature of trans; in that case, we retain the p-feature of trans generated via the p-feature of users. We can now forecast EV consumption using either owners, day, and season, or the p-feature of users, or the p-feature of trans, but not a combination of these, as a combination would indirectly include a feature or a set of features more than once while forecasting EV consumption.
We call this approach nested modeling, as we repeatedly fit auxiliary models and generate forecasts internally within a nest-like loop before eventually forecasting EV consumption.
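A minimal sketch of the nested modeling loop is given below, on synthetic data and with linear models standing in for the algorithms of section 4; `users_p` denotes the p-feature of users, and all names and coefficients are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

def mape(y, yhat):
    return 100 * np.mean(np.abs((y - yhat) / y))

# Synthetic stand-in for the day-wise series of one cluster; day and
# season enter numerically here for brevity.
rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "owners": np.arange(50, 50 + n),
    "day": np.tile(np.arange(7), n // 7 + 1)[:n],
    "season": np.repeat(np.arange(4), n // 4 + 1)[:n],
})
df["users"] = 0.4 * df["owners"] + rng.normal(0, 3, n)        # an o-feature
df["consumed"] = 9.0 * df["users"] + rng.normal(0, 20, n)     # target

base = ["owners", "day", "season"]
train, test = df.iloc[:400], df.iloc[400:]
truncated_test = test[base]          # mimics the scenario-based data

# Step 1: auxiliary model of the o-feature on the base features; its
# forecasts become the p-feature appended to the truncated test sample.
aux = LinearRegression().fit(train[base], train["users"])
modified_test = truncated_test.assign(users_p=aux.predict(truncated_test))

# Step 2: fit consumption models and compare test MAPEs.
m_base = LinearRegression().fit(train[base], train["consumed"])
m_pfeat = LinearRegression().fit(train[["users"]], train["consumed"])

err_base = mape(test["consumed"], m_base.predict(truncated_test))
err_pfeat = mape(test["consumed"], m_pfeat.predict(
    modified_test[["users_p"]].rename(columns={"users_p": "users"})))
```

Whichever candidate yields the lower MAPE would be retained, subject to the causality constraints discussed above.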
3.4 Evaluation on Variable Origin
The choice of train-test split is user-specific, and it cannot be ascertained that a given split is better than another. To obviate any bias due to a specific train:test split, we evaluate the forecasting performance on a variable origin. In performance evaluation on a variable origin, we first fit a model on the first 70% of the data (training sample) and then evaluate the performance on the last 30% of the data (test sample). Subsequently, we increase the training sample to 80% and 90% of the data and fit models on these samples; model performances are then evaluated on test samples comprising the last 20% and 10% of the data respectively. The final performance is the mean of the three performances. We choose the mean absolute percentage error,

MAPE = (100/n) Σ_{t=1}^{n} |y_t − ŷ_t| / |y_t|,

as the error metric for performance evaluation, where y_t and ŷ_t are the actual and forecast values respectively.
It is worth mentioning that evaluation of performance on a rolling origin [hyndman2018forecasting] was also an alternative. However, in our case this method was computationally expensive, as we tuned several hyperparameters across different algorithms to optimize forecasting performance.
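The variable-origin evaluation can be sketched as follows; the 70/80/90% splits and the MAPE metric follow the description above, while the forecaster passed in is a deliberately trivial placeholder.

```python
import numpy as np

def mape(y, yhat):
    return 100 * np.mean(np.abs((y - yhat) / y))

def evaluate_on_variable_origin(y, X, fit, predict, splits=(0.7, 0.8, 0.9)):
    """Fit on the first f*n observations, test on the remainder, for each
    fraction f; return the mean MAPE over the three origins."""
    n = len(y)
    errors = []
    for f in splits:
        cut = int(n * f)
        model = fit(X[:cut], y[:cut])
        errors.append(mape(y[cut:], predict(model, X[cut:])))
    return float(np.mean(errors))

# Hypothetical usage with a trivial mean forecaster:
y = np.linspace(100, 200, 100) + np.random.default_rng(2).normal(0, 1, 100)
X = np.arange(100).reshape(-1, 1)
score = evaluate_on_variable_origin(
    y, X,
    fit=lambda Xtr, ytr: float(np.mean(ytr)),       # "model" = training mean
    predict=lambda m, Xte: np.full(len(Xte), m),
)
```

Any of the models in section 4 can be plugged in through the `fit`/`predict` callables.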
4 Algorithms and Results
In this section, we discuss four algorithms to solve (3) using the nested modeling approach (section 3.3). Before fitting a model, the data were normalized using min-max scaling:

x′ = (x − x_min) / (x_max − x_min),

where x′ is the normalized observation, x is the actual observation, and x_min and x_max are the minimum and maximum values of the feature being normalized. It is important to note that we used the parameters of the training sample to normalize not only the training sample but also the test sample, to avoid information leakage.
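A minimal sketch of this scaling step, emphasizing that the test sample is transformed with the training sample's parameters:

```python
import numpy as np

def minmax_fit(train):
    # Scaling parameters come from the training sample only,
    # to avoid leaking test-set information.
    return train.min(axis=0), train.max(axis=0)

def minmax_transform(x, lo, hi):
    return (x - lo) / (hi - lo)

train = np.array([[10.0, 100.0], [20.0, 300.0], [30.0, 200.0]])
test = np.array([[25.0, 400.0]])

lo, hi = minmax_fit(train)
train_scaled = minmax_transform(train, lo, hi)
test_scaled = minmax_transform(test, lo, hi)   # may fall outside [0, 1]
```

Note that test values outside the training range legitimately scale outside [0, 1]; clipping them would discard information.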
We also present a quantitative comparison of the performances of the final models in forecasting EV consumption.
4.1 Time Series (TS) Regression
In TS regression, a dependent variable is modeled as a weighted linear sum of independent variables, where all the variables are time series. In our case, we set time series regression as the benchmark algorithm, modeling EV consumption, Y_t, as a weighted linear combination of the features, where the weights are the regression coefficients. For example, the equation

Y_t = β_0 + β_1 O_t + β_2 D_t + β_3 S_t + ε_t

models Y_t as a function of owners (O_t), day (D_t), and season (S_t), with ε_t being the regression error.
4.2 Regression with ARIMA errors (regARIMA)
In regARIMA, an ARIMA model is fitted to the errors of a regression model, and the forecasts from the regression and ARIMA components are combined. It is particularly useful when the regression errors are highly correlated with each other, indicating that the regression model does not capture all the information in the data. Mathematically, a regARIMA model to forecast Y_t as a function of a feature x_t can be given by:

Y_t = β_0 + β_1 x_t + η_t,
η_t = φ_1 η_{t−1} + … + φ_p η_{t−p} + θ_1 ε_{t−1} + … + θ_q ε_{t−q} + ε_t,

where an ARIMA model is fitted to the regression errors η_t. Here, φ and θ represent the model parameters corresponding to the AR and MA components of the ARIMA model, while ε_t is the ARIMA error. In this paper, we implemented the Hyndman-Khandakar algorithm [hyndman2007automatic] to tune the orders of the AR and MA components and subsequently fitted the ARIMA model to the regression errors obtained after fitting a regression model as explained in section 4.1.
4.3 Extreme Gradient Boosting (XGB)
XGB is a scalable tree boosting system that leverages the Gradient Boosting [friedman2001greedy] framework to learn from data and offers superior computational efficiency over other acclaimed machine learning algorithms [xgboost]. The popularity of XGB can be gauged from the fact that in 2015, 17 out of 29 challenge-winning solutions on the machine learning competition site Kaggle used XGB [chen2016xgboost].
Given the computational resources, we tested 108 combinations of hyperparameters for each iteration of evaluation on a variable origin, via random search in the hyperparameter space. Since our data were sequential, we could not use cross-validation via random subsampling to tune the hyperparameters, as it would have disrupted the time dynamics of the data. Instead, we implemented time slicing to split the training data into a variable-length training subsample and a fixed-length validation sample, iteratively increasing the size of the training subsample by one seasonal period. Figures 7(a) and 7(b) show time slicing for hyperparameter tuning in XGB.
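The time-slicing scheme can be sketched with scikit-learn's TimeSeriesSplit, which produces expanding training windows with a fixed-length validation slice; GradientBoostingRegressor stands in for `xgboost.XGBRegressor` (both expose the same scikit-learn interface), and the small search grid is illustrative.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV, TimeSeriesSplit

# Synthetic day-wise data with trend and weekly seasonality.
rng = np.random.default_rng(4)
n = 280
X = np.column_stack([np.arange(n), np.tile(np.arange(7), n // 7)])
y = 0.5 * X[:, 0] + 5 * np.sin(2 * np.pi * X[:, 1] / 7) + rng.normal(0, 1, n)

# Expanding training windows with a fixed 7-day validation slice,
# instead of random subsampling that would disrupt the time dynamics.
cv = TimeSeriesSplit(n_splits=5, test_size=7)

search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_distributions={
        "n_estimators": [50, 100, 200],
        "max_depth": [2, 3, 4],
        "learning_rate": [0.01, 0.05, 0.1],
    },
    n_iter=8, cv=cv, scoring="neg_mean_absolute_percentage_error",
    random_state=0,
).fit(X, y)

best = search.best_params_
```

Each successive fold extends the training subsample by one seasonal period (7 days), mirroring the scheme in figures 7(a) and 7(b).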
4.4 LSTM Networks
LSTMs [hochreiter1997long] are an enhanced version of Recurrent Neural Networks (RNNs) which overcome a major limitation of conventional RNNs: the vanishing gradient problem, in which a network fails to learn long-term dependencies [hochreiter1998vanishing]. In a sequence prediction problem, learning long-term temporal dependencies along with the present state of the system is essential for predicting the future state. LSTMs, thanks to the gating mechanism inside their specially designed memory cells, regulate the flow of information and enforce a constant error flow through the network, thereby obviating the complications of vanishing and exploding gradients and enabling LSTMs to capture long-term temporal dependencies. This ability to learn long-term dependencies has prompted researchers to leverage LSTMs for a plethora of sequence prediction problems. For example, [bandara2019sales] used a special variant of LSTMs, known as LSTMs with peephole connections, to forecast sales demand in e-commerce. Moreover, [du2018time] proposed a sequence-to-sequence deep learning framework based on an LSTM encoder-decoder architecture for multivariate time series forecasting on air quality data. Besides this,
[9107249] used the LSTM-based encoder-decoder architecture to predict long-term traffic flows. Furthermore, [sehovac2020deep] assessed attention mechanisms with different types of RNN cells (vanilla, LSTM, and GRU) and forecasting horizons, for electrical load forecasting. In this study, we compared a suite of network architectures: vanilla (one hidden layer between input and output layers) vs stacked (multiple hidden layers between input and output layers), unidirectional vs bidirectional [schuster1997bidirectional], and vector-output vs encoder-decoder [cho2014learning, sutskever2014sequence], to assess the effect on forecasting performance. It is important to note that although both vector-output and encoder-decoder architectures are popular choices when the forecasting horizon (h) is greater than 1, we evaluated them in our study as a special case for h being 1. Leveraging these two architectures ensures that, with nominal transformations to the input data and output layers of the networks, the LSTMs are flexible enough to output forecasts of any horizon, 1 or higher, depending upon user requirements. While designing the networks, we set the maximum number of hidden layers to 2, i.e., the depth of the LSTMs would be either 1 (vanilla) or 2 (stacked) layers. This implies that in the case of the vector-output architecture, the number of LSTM layers cannot exceed 2. In addition, as encoder-decoder architectures have two components, encoder and decoder, where each component is an LSTM network, we specifically assessed the effect of the encoder depth (vanilla or stacked) on the forecasting performance by tuning the decoder depth for a given encoder depth. Besides, the number of neurons in each hidden layer was tuned between 50 and 200. We chose a batch size of 7, mean squared error as the cost function, and Adam [kingma2014adam] as the optimization algorithm to minimize the cost function, with the learning rate ranging from 0.0001 to 0.01 on a logarithmic scale. In addition, we also introduced regularization via neuron dropout [hinton2012improving], ranging from 0 (no dropout) to 0.4 (randomly removing 40% of neurons in each iteration of training), to avoid overfitting. To tune the hyperparameters, we used Bayesian optimization [snoek2012practical]. Given the computational resources, we set the maximum number of search iterations in Bayesian optimization to 10 and the maximum number of training epochs to 100 for tuning the hyperparameters.
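The gating mechanism described above can be illustrated with a single LSTM cell step in NumPy; the parameter shapes and random initialization are purely illustrative, not the trained networks of this study.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One forward step of a single LSTM cell.

    W, U, b stack the parameters of the input, forget, cell, and
    output gates (4 * hidden rows)."""
    hidden = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[:hidden])                  # input gate
    f = sigmoid(z[hidden:2 * hidden])        # forget gate
    g = np.tanh(z[2 * hidden:3 * hidden])    # candidate cell state
    o = sigmoid(z[3 * hidden:])              # output gate
    c = f * c_prev + i * g                   # additive cell-state update
    h = o * np.tanh(c)                       # gated hidden state
    return h, c

rng = np.random.default_rng(5)
n_in, n_hid = 3, 4
W = rng.normal(0, 0.1, (4 * n_hid, n_in))
U = rng.normal(0, 0.1, (4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)

h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(7, n_in)):         # a week-long input sequence
    h, c = lstm_step(x, h, c, W, U, b)
```

The additive update of the cell state c is what keeps the error flow constant through time, in contrast to the repeated multiplication that causes gradients in conventional RNNs to vanish.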
4.5 Results
Table 4 enumerates the lowest MAPE values of EV consumption forecasts for each feature in the data set across all the 4 algorithms and 3 clusters. The MAPE values of pfeatures are tabulated in Appendix A. We observe that for all the 3 clusters, LSTMs deliver the lowest MAPE values, thereby the best forecasting performance among all the models developed.
Cluster  Feature (s) for Forecasting  Regression  regARIMA  XGB  LSTMs 

1  , ,  26.80  26.33  22.13  13.14 
27.13  26.49  23.20  17.27  
26.73  26.05  23.62  16.90  
26.50  22.46  21.81  16.89  
2  , ,  34.61  18.00  17.43  15.41 
33.57  21.20  19.31  17.26  
32.54  17.96  18.76  17.37  
32.47  18.02  17.98  17.41  
3  , ,  49.40  36.54  33.40  31.35 
50.74  42.03  39.00  32.35  
49.35  40.06  46.68  32.45  
48.73  38.19  42.91  32.16 
We note that out of the 12 best models (4 algorithms 3 clusters), 6 (3 for time series regression, 2 for regARIMA, and 1 for XGB) use pfeatures to deliver the best forecasting performance. On investigating the performance of the 2 best algorithms, XGB and LSTMs, we further observe that to deliver the lowest MAPE values, while XGB uses the pfeature, , in cluster 1 but , , and in clusters 2 and 3, LSTMs never use the pfeatures. However, it is worth mentioning that the XGB and LSTM results are constrained by the number of search iterations during hyperparameter tuning and hence, the possibility of obtaining a better forecasting performance with pfeatures for all or some of the clusters can not be ruled out, thereby necessitating that relevant pfeatures should be appended to the scenariobased data of the DNOs before forecasting EV consumption, using nested modeling as discussed in section 3.3.
Cluster  LSTM Architecture  LSTM Depth  Learning Direction  MAPE 

1  EncoderDecoder  Vanilla (Encoder)  Unidirectional  13.14 
VectorOutput  Stacked  Bidirectional  13.48  
EncoderDecoder  Vanilla (Encoder)  Bidirectional  13.60  
VectorOutput  Vanilla  Bidirectional  13.70  
EncoderDecoder  Stacked (Encoder)  Unidirectional  13.93  
2  VectorOutput  Vanilla  Bidirectional  15.41 
VectorOutput  Vanilla  Unidirectional  15.43  
VectorOutput  Stacked  Unidirectional  15.91  
EncoderDecoder  Stacked (Encoder)  Bidirectional  15.99  
EncoderDecoder  Stacked (Encoder)  Unidirectional  16.18  
3  VectorOutput  Vanilla  Bidirectional  31.35 
VectorOutput  Vanilla  Unidirectional  31.40  
EncoderDecoder  Vanilla (Encoder)  Unidirectional  31.50  
EncoderDecoder  Stacked (Encoder)  Unidirectional  31.59  
VectorOutput  Stacked  Unidirectional  31.63 
As discussed in section 4.4, we compared several LSTM architectures and found that no specific architecture is suitable for all the clusters. Table 5 lists the top 5 architectures along with respective MAPE values for each cluster to forecast EV consumption. It is worth mentioning that all these networks use , , and as features. We observe that the MAPE values for the top 5 models in each cluster differ by less than 1 percentagepoint, implying that any of the architectures can be chosen to forecast EV consumption without significant loss in forecasting performance. However, following the problemsolving principle called Occam’s razor, we should choose an architecture that minimizes model complexity (reduced number of model parameters) without detracting from forecasting performance.
5 Conclusions
In this study, we evaluated several models, ranging from linear statistical models, such as timeseries regression, to nonlinear artificial neural networks, such as LSTMs, to forecast daily energy consumption caused by EV charging. We observed that LSTMs delivered by the best forecasting performance among all the algorithms considered to develop the forecast models for EV consumption, with MAPEs of 13.14, 15.41, and 31.35 for clusters 1, 2, and 3 respectively. The models were developed keeping in view of the minimal information that would certainly be available to the DNOs in the future. However, it is likely that there are more contributors to the variation in EV consumption, and including more information would help in better understanding of its variability, leading to the identification of more important features for forecasting. In fact, minimal information appears to adversely affect the forecasting performance as we move towards clusters with higher battery capacities (table 4). The decreasing correlation coefficients of with features in the data set, especially , figure 5, also corroborate the fact that the limited set of features fail to capture significant variability in EV consumption as we move towards EVs with higher battery capacities. Statistical analysis reveals that as we move towards clusters with higher battery capacities, the fraction of EV owners charging their EVs per day drops, leading to a reduced number of total charging transactions per day. This can be attributed to several factors, such as the range of the vehicle (distance the vehicle can travel before needing recharging). Given the same vehicle usage (e.g. driving style, weather, etc.), a higher capacity battery has a greater range than a smaller capacity battery does, assuming both batteries are initially charged to 100% of their capacities. 
Vehicles with a greater capacity battery (higher range) are more likely to be able to complete their next journey without charging again when compared to those with a lower battery capacity, leading to lower charging frequencies. Under such a scenario, if the number of EV owners are the same in two clusters with different battery capacities, lesser number of EVs would get charged in that cluster which has higher capacity batteries as the range of vehicle would also be influential in determining the charging frequency. On a similar note, we can identify more relevant features that might be influential in explaining the variability in EV consumption. A few areas worth exploring in future endeavors are summarised below.

The framework we consider in this work is that of minimal information that can be used to forecast EV consumption. Within the data available for our study, this information is EV ownership, day of the week and season of the year. However, there is a scope to collect additional information which is generally available with city councils and DNOs, such as sociodemographic information of EV owners. It is conceivable that different socioeconomic profiles would have different consumption patterns. It would be interesting to explore the models which take these into account. To this end, a future study can combine relevant choice models which explain EV charging preferences with forecast models.

We can easily verify from figure 3(a) that the number of EV owners and hence, users charging their EVs is very small in the initial days of the trials, leading to a very small consumption of energy. High absolute errors on such smaller values amplify the percentage errors more than they do on larger values. A high value of percentage error can be misleading as it fails to capture the real picture and, in turn, gives a false alarm caused by the poor forecasts of smaller values. Hence, to have a more realistic understanding of the forecast accuracy of the models, we need to work with data which are more realistic. Dropping observations that do not reflect the realworld scenario might help in getting rid of the false alarms.

In our methodology, we could not leverage the autocorrelations among the lagged values of the target variable as we forecast a specific value of EV consumption based on one instance or observation of scenariobased data, i.e., we forecast EV consumption based on a given day’s features affecting EV consumption but do not include information of EV consumption from previous days, leading to the loss of relevant information in forecasting EV consumption. A plausible solution to resolve this issue would be multistep forecasting, in which, the forecasting horizon is greater than 1, i.e., for a given input, instead of forecasting for one timestep (e.g, next day), forecasts are generated for a sequence of timesteps (e.g., next 7 days). In multistep forecasting, we train an algorithm to firstly, observe a sequence of inputs of length , called an input window of length , and then, forecast a sequence of corresponding outputs of length , called an output window of length , at a time before moving on to further sequences of inputoutput pairs; this approach is unlike the current learning methodology where both and are 1. A multistep forecasting scenario necessitates the availability of large volumes of data so that a large number of training samples, comprising of such input and output windows, can be generated from the data.
6 Acknowledgement
This study was sponsored by grants from the European Regional Development Fund, EA Technology and Lancaster University, facilitated by the Centre for Global EcoInnovation, Lancaster University. We express out gratitude to Dr. Christopher Kirkbride (Lancaster University) and Dr. Florian Dost (The University of Manchester) for their constructive appraisal of the study.
References
Appendix A MAPE of pfeature Forecasts
Tables 6, 7, and 8 enumerate the lowest MAPE values of the pfeature forecasts for all the possible input features to pfeatures across all the 4 algorithms and 3 clusters.
We see that LSTMs outperform other algorithms to generate forecasts for pfeatures as well. Since can be forecast using , , and or , there are 2 possible set of input features to forecast . Similarly, can be forecast using , , and or and as such, there are 2 possible set of input features to forecast . It is important to note that since was obtained by linear transforming , we would not use to forecast .
Feature (s)  Cluster  Regression  regARIMA  XGB  LSTMs 

, ,  1  19.38  22.00  18.55  12.01 
2  21.99  14.42  13.60  11.44  
3  29.42  28.55  23.93  23.87 
Feature (s)  Cluster  Regression  regARIMA  XGB  LSTMs 

, ,  1  23.35  23.54  20.16  13.33 
2  25.20  13.85  13.91  12.16  
3  32.59  31.04  28.24  28.17  
1  23.05  23.10  20.71  16.24  
2  25.67  15.04  15.31  13.46  
3  32.49  30.92  28.52  27.89 
Feature (s)  Cluster  Regression  regARIMA  XGB  LSTMs 

, ,  1  23.52  24.36  20.47  13.96 
2  25.45  14.31  13.86  12.58  
3  32.33  31.39  28.08  27.55  
1  23.44  23.99  20.71  17.16  
2  25.92  16.51  15.13  13.71  
3  32.05  29.70  28.55  27.89 
Comments
There are no comments yet.