Convolutional Neural Networks applied to sky images for short-term solar irradiance forecasting

05/22/2020 ∙ by Quentin Paletta, et al. ∙ University of Cambridge

Despite the advances in the field of solar energy, improvements in solar forecasting techniques, which address the intermittency of electricity production, remain essential for securing its future integration into a wider energy supply. A promising approach to anticipating irradiance changes consists of modeling the cloud cover dynamics from ground-based or satellite images. This work presents preliminary results on the application of deep Convolutional Neural Networks to 2 to 20 min irradiance forecasting using hemispherical sky images and exogenous variables. We evaluate the models on a set of irradiance measurements and corresponding sky images collected in Palaiseau (France) over 8 months with a temporal resolution of 2 min. To outline the learning of neural networks in the context of short-term irradiance forecasting, we implemented visualisation techniques revealing the types of patterns recognised by trained algorithms in sky images. In addition, we show that training models with past samples of the same day improves their forecast skill, relative to the smart persistence model and based on the Mean Square Error, by around 10% on the 10 min ahead prediction. These results emphasise the benefit of integrating previous same-day data in short-term forecasting. This, in turn, can be achieved through model fine-tuning or by using recurrent units to facilitate the extraction of relevant temporal features from past data.




1 Contribution

The approach introduced by Zhang (2018), Siddiqui (2019) and Zhao (2019) is to apply Deep Learning models to extract patterns not only from single-point data but also from sky images. In particular, 2D inputs such as images can be processed by a specific type of neural network termed a CNN. The learning of such models is based on filters that are trained to find task-relevant patterns in an image. Together with other past meteorological data, this information can be given to an Artificial Neural Network (ANN) to forecast irradiance. The global network made of the CNN and the ANN can be trained simultaneously using supervised learning. The original contributions of this study are the following:

  • Siddiqui (2019) uses CNNs to extract patterns from sky images of the past 4 to 6 hours as general information about the cloud cover of that day, in order to adapt forecasts up to 6 h ahead. We, however, set up our protocol to learn how the clouds visible from the lens at time t will induce the variability of the solar resource in the short-term future. In other words, we limit the forecast horizon to a narrower 20-min window, using auxiliary data (past irradiance measurements, the angular position of the sun and its sine / cosine transformations) and images taken at time t and t − 2 min.

  • Instead of training and validating the model on distinct days, which does not let the model learn from past samples of the same day, we tried a new way of generating both sets, which improves forecasting performance by around 10%. With this approach, the model is validated on afternoon samples only, the morning samples being in the training set. This way, the protocol is closer to real-life applications, where past samples of the same day are available for model fine-tuning.

  • To assess the model’s performance, we implemented the MSE forecast skill based on the smart persistence model, which uses only the last in-situ pyranometric measurement. A clear-sky index is computed from that measurement and the concurrent output of a clear-sky model (here the ESRA model). This clear-sky index is then multiplied by the clear-sky model's value at the forecast time to obtain the smart persistence forecast. This gives a reliable performance score for comparing the proposed Deep Learning model with other approaches.

  • To increase the receptive field of the network without using dilated convolution (Fisher Yu, 2016; Siddiqui, 2019), we implemented a ResNet architecture (Kaiming He, 2015), which enables solvers to handle a deeper model.
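The smart persistence baseline and the forecast skill described in the list above can be sketched as follows. This is a minimal illustration, not the authors' code: the skill is assumed to follow the usual definition 1 − MSE_model / MSE_persistence, the clear-sky values are taken as given inputs (the ESRA model itself is not implemented), and all numbers are hypothetical daytime values.

```python
def smart_persistence(ghi_now, clear_sky_now, clear_sky_future):
    """Forecast GHI by assuming the clear-sky index stays constant."""
    # Clear-sky index: measured irradiance relative to the clear-sky model.
    clear_sky_index = ghi_now / clear_sky_now
    return clear_sky_index * clear_sky_future


def mse(predictions, targets):
    """Mean square error between two equally long sequences."""
    return sum((p - y) ** 2 for p, y in zip(predictions, targets)) / len(targets)


def forecast_skill(model_preds, persistence_preds, targets):
    """Assumed definition: 1 - MSE_model / MSE_persistence."""
    return 1.0 - mse(model_preds, targets) / mse(persistence_preds, targets)


# Hypothetical values in W/m^2, for illustration only.
ghi_now = [400.0, 300.0, 500.0]
cs_now = [800.0, 750.0, 820.0]
cs_future = [810.0, 740.0, 830.0]
targets = [450.0, 250.0, 520.0]
model_preds = [440.0, 270.0, 515.0]

persistence_preds = [
    smart_persistence(g, c0, c1) for g, c0, c1 in zip(ghi_now, cs_now, cs_future)
]
skill = forecast_skill(model_preds, persistence_preds, targets)
```

Under this definition a skill of 0 means the model is no better than smart persistence, and a skill of 1 corresponds to a perfect forecast.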

2 Approach

2.1 Dataset

The chosen approach is to provide the Deep Learning model with sky images taken by hemispherical cameras on the ground (see Figure 1) and a range of metadata such as past irradiance measurements or the angular position of the sun.

The dataset used in this study originates from the SIRTA laboratory (Haeffelin, 2005). Samples were collected over a period of seven months, from March 2018 to September 2018. The RGB sky images have a resolution of 768 by 1024 pixels. Samples are taken every two minutes and are composed of two images with different exposures (referred to as long and short exposures, see Figure 1).

Figure 1: Images of the sky taken by a hemispherical camera with short and long exposures (source: Haeffelin (2005))

The short exposure provides more detail on the region close to the sun, whereas the long exposure captures the rest of the sky, and in particular distant clouds. In addition, the dataset contains a range of metadata, in particular the Global Horizontal Irradiance (GHI) measurements, which are averaged over a minute. The angular position of the sun is also available through the zenith and azimuthal angles.

2.2 Network Architecture

The proposed model follows the configuration presented in Figure 2. It is composed of two distinct networks merged into one, which outputs the irradiance estimate. On one side, a CNN made of ResNet units extracts features from the sky images; on the other side, an ANN processes the available auxiliary data (past irradiance measurements, angular position of the sun, etc.). Both outputs are fed into another ANN, which integrates them to give its prediction.

Figure 2: Schematic network architecture

Hyperparameters of the model, such as the number of filters / nodes per layer or the number of layers, are mostly hand-tuned given the model’s performance. Another approach, presented in Section 4, was used to select the number of filters per layer. In addition, the number of layers, and in particular the number of convolutional layers with a stride of 2 (the filter convolves around the input volume by shifting two units at a time), was set such that the output of the last convolutional layer was computed from each pixel of the input images.
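One way to read this criterion is that the receptive field of the last convolutional layer should span the whole input image. The sketch below computes that receptive field for a stack of convolutions. The 3x3 kernel size and the alternating stride pattern are assumptions made for illustration only; the text does not state them.

```python
def receptive_field(layers):
    """Receptive field, in input pixels, of one unit of the last layer.

    `layers` is a list of (kernel_size, stride) pairs, first layer first.
    """
    field = 1  # a single input pixel before any convolution
    jump = 1   # distance, in input pixels, between neighbouring units
    for kernel, stride in layers:
        field += (kernel - 1) * jump
        jump *= stride
    return field


# Hypothetical stack: 3x3 kernels with alternating strides of 1 and 2.
five_pairs = receptive_field([(3, 1), (3, 2)] * 5)  # 125 pixels
six_pairs = receptive_field([(3, 1), (3, 2)] * 6)   # 253 pixels

# With a 150-pixel-wide input, only the deeper stack spans the full image.
input_size = 150
covers_input = six_pairs >= input_size
```

Each stride-2 layer doubles the spacing between neighbouring units, which is why adding strided layers enlarges the receptive field much faster than adding stride-1 layers.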

The CNN architecture is composed of a set of convolutional layers with 32 filters each and strides of 1 or 2. ResNet units using residual connections are implemented to increase the depth of the network. This way, if a set of convolutional layers does not improve the overall performance, the network is able to bypass it through a parallel identity connection (see Figure 3). The vectorisation layer reshapes the 2D output of the last convolutional layer into a 1D layer whose nodes are connected to dense layers with 512 and 64 nodes.

The parallel ANN is composed of two layers of 16 nodes plus a residual connection, and merges with the other network through a concatenation layer. Following this, another set of two densely connected layers integrates the incoming information to output the irradiance estimate through the final node.
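The data flow through the auxiliary branch and the merging layers can be illustrated with a small NumPy forward pass. This is only a sketch with untrained random weights: the number of auxiliary inputs (8), the position of the residual connection, the ReLU activations and the widths of the two merging layers are assumptions, and the 64 image features stand in for the output of the CNN branch.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(x):
    return np.maximum(x, 0.0)


def dense(n_in, n_out):
    """Random weights and zero bias (untrained, for illustration only)."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)


# Auxiliary branch: two layers of 16 nodes.
w1, b1 = dense(8, 16)        # 8 auxiliary inputs is an assumption
w2, b2 = dense(16, 16)
# Merging network: two dense layers, then the final node.
w3, b3 = dense(64 + 16, 32)  # layer widths here are assumptions
w4, b4 = dense(32, 16)
w_out, b_out = dense(16, 1)


def forward(aux, image_features):
    h1 = relu(aux @ w1 + b1)
    h2 = relu(h1 @ w2 + b2)
    aux_out = h1 + h2  # residual (identity) connection, placement assumed
    merged = np.concatenate([image_features, aux_out], axis=-1)
    h = relu(merged @ w3 + b3)
    h = relu(h @ w4 + b4)
    return (h @ w_out + b_out)[..., 0]  # one irradiance estimate per sample


batch = 4
prediction = forward(rng.normal(size=(batch, 8)), rng.normal(size=(batch, 64)))
```

In a trained model the weights of both branches would be learned jointly, since the gradient of the loss flows back through the concatenation into the CNN and the auxiliary ANN.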

Figure 3: Detailed network architecture

2.3 Training and Validation

Each sample given to the network is composed of four images in total: the short and the long exposures taken at time t and t − 2 min. Each 2D input was first downsized to a greyscale resolution of 150x150 px. In parallel, the metadata consist of the past irradiance measurements at time t and t − 2 min, the angular position of the sun at time t defined by its zenith and azimuthal angles, plus the cosine and sine of those angles.
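As an illustration of the auxiliary input described above, the following sketch assembles one metadata vector. The function name, the ordering of the features and the use of degrees for the raw angles are assumptions; the source only lists which quantities are included.

```python
import math


def auxiliary_features(ghi_now, ghi_previous, zenith_deg, azimuth_deg):
    """Build the metadata vector for one sample (hypothetical layout)."""
    zenith = math.radians(zenith_deg)
    azimuth = math.radians(azimuth_deg)
    return [
        ghi_now,            # latest irradiance measurement
        ghi_previous,       # irradiance measurement one step earlier
        zenith_deg,         # raw solar angles
        azimuth_deg,
        math.cos(zenith),   # cosine and sine of those angles
        math.sin(zenith),
        math.cos(azimuth),
        math.sin(azimuth),
    ]


# Azimuths of 359 and 1 degrees are far apart as raw numbers,
# but their cosines coincide and their sines are both close to zero.
just_west_of_north = auxiliary_features(500.0, 480.0, 40.0, 359.0)
just_east_of_north = auxiliary_features(500.0, 480.0, 40.0, 1.0)
```

The sine and cosine terms remove the discontinuity of the azimuth at 0/360 degrees, so two nearly identical sun positions map to nearly identical inputs.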

In a first setting, the model was assessed with training and validation sets of 16,000 and 4,000 samples respectively. Samples collected from 8 am to 7 pm over the seven-month period were randomly allocated to the training and validation sets from two groups comprising data from distinct days. This prevents the model from using training samples of a given day to adapt its forecasts on validation samples of that same day.

In a second setting however, we tried to shift morning samples from the validation set to the training set to see how training the model on past samples of the same day improves its predictions. This configuration could be seen as closer to real life applications as data are streamed continuously and could be used to fine-tune the model given previous measurements.
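The two settings can be sketched on sample timestamps alone. This is a hypothetical illustration: the function names, the 20% validation fraction applied to days, and the noon cut-off between morning and afternoon are assumptions, and the five days of synthetic timestamps only mimic the 2-min sampling between 8 am and 7 pm.

```python
import random
from datetime import datetime, timedelta


def split_distinct_days(samples, val_fraction=0.2, seed=0):
    """First setting: each day goes entirely to training or to validation."""
    days = sorted({s.date() for s in samples})
    random.Random(seed).shuffle(days)
    n_val = max(1, round(val_fraction * len(days)))
    val_days = set(days[:n_val])
    train = [s for s in samples if s.date() not in val_days]
    val = [s for s in samples if s.date() in val_days]
    return train, val


def move_mornings_to_training(train, val, noon_hour=12):
    """Second setting: morning samples of validation days join training."""
    mornings = [s for s in val if s.hour < noon_hour]
    afternoons = [s for s in val if s.hour >= noon_hour]
    return train + mornings, afternoons


# Synthetic timestamps: 5 days, one sample every 2 min from 08:00 to 18:58.
samples = [
    datetime(2018, 3, day, 8, 0) + timedelta(minutes=2 * i)
    for day in range(1, 6)
    for i in range(330)
]

train, val = split_distinct_days(samples)
train_same_day, val_afternoon = move_mornings_to_training(train, val)
```

In the second setting every validation sample is preceded, within the training set, by the morning samples of the same day, which is what allows the model to adapt to that day's conditions.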

The loss function used by the model as a reference to assess its own performance is the unregularised MSE. The learning rate associated with the minimisation of the loss was set to

. The performance of the model was assessed using the forecast skill metrics based on the persistence of the clear-sky index on both training and validation set settings.

Hyperparameter tuning was performed to achieve the best forecasting performance on the 10-min ahead forecast. The same network architecture has been used to train models for the 2-min to 20-min ahead forecasts. The only difference is that the number of filters per convolutional layer was reduced to 16 for the 2-min and 4-min ahead forecasts due to a tendency to overfitting with 32 filters.

3 Results

Figure 4 shows the performance of the model obtained with the same architecture for different time windows from 2 to 20-min. Contrary to Zhang (2018), the longer the forecast window, the better the forecast skill, until it reaches a plateau from the 12-min forecast window, which tempers the conclusion of Zhang (2018) about a decrease of the skill score for longer time horizon forecasts.

Figure 4: Forecasting skill on the validation set using the MSE based on the smart persistence model for different time windows, the training and validation sets being generated from different days

It is also interesting to note that showing past data of the same day during training markedly increases the model’s performance: the skill score on the 10-min ahead forecast, for instance, increases by around 10%, from 0.40 to 0.44 (see Table 1). These results highlight the need to take past samples of the same day into account, even for short-term forecasting. This could be achieved through active learning or by forecasting from a longer sequence of past samples (see Siddiqui (2019)).

Settings                          2-min   10-min   20-min
Distinct days                     0.20    0.40     0.43
Validation on afternoon samples   0.26    0.44     0.45

Table 1: Forecast Skill (MSE) on the validation set based on the smart persistence model
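The "around 10%" figure is a relative gain in skill score, which can be checked directly from the values of Table 1. The short computation below uses only those reported values; the variable names are illustrative.

```python
# (distinct days, validation on afternoon samples) for each forecast horizon
skill = {
    "2-min": (0.20, 0.26),
    "10-min": (0.40, 0.44),
    "20-min": (0.43, 0.45),
}

# Relative gain in percent when same-day morning samples are used for training.
relative_gain = {
    horizon: round(100 * (same_day - distinct) / distinct, 1)
    for horizon, (distinct, same_day) in skill.items()
}
# relative_gain == {'2-min': 30.0, '10-min': 10.0, '20-min': 4.7}
```

The relative gain is largest at the shortest horizon and shrinks as the horizon grows, in line with the skill values of the two settings converging at 20 min.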

4 Model’s learning assessment

Often considered as black boxes, Deep Learning models foster research focusing on their interpretability. In that respect, different methods have been proposed to visualise models’ learning, from intermediate activations and filter visualisation (Chollet, 2018) to feature maps (Zeiler, 2013).

4.1 Intermediate activation visualisation

Visualising the layer activation of a trained model is useful for two reasons. Firstly, one can see what trained filters focus on for a given input. Figure 6 shows the transformation of Figure 5 once passed through the 60 filters of the first convolutional layer. As we can see, some filters focus on specific patterns of the image such as the sun or the distant sky. However, some filters seem to look at useless areas of the image, such as the black background, or even simply return a blank output.

Figure 5: Example of an image of the sky, which is passed through the first layers to visualise intermediate activations (see Figure 6)

This leads to the second use of this analysis technique, which is selecting the number of filters per layer in the network. Figure 6 indicates that the model did not make use of all 60 filters during training. This suggests that decreasing the number of filters in this layer would not penalise the learning process and would reduce the complexity of the model, i.e., shorten the training time while limiting overfitting. In the final configuration, the number of filters per convolutional layer was reduced to 32.
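The visual inspection described here could be complemented by a simple numerical check that flags filters whose activation map is blank. The sketch below is an assumption-laden illustration: the function name, the tolerance and the synthetic activations are hypothetical, and a real check would use the activations of the trained first layer over many images rather than a single one.

```python
import numpy as np


def inactive_filters(activations, tol=1e-6):
    """Indices of filters whose output map is (nearly) constant.

    `activations` has shape (height, width, n_filters).
    """
    spread = activations.max(axis=(0, 1)) - activations.min(axis=(0, 1))
    return np.flatnonzero(spread < tol)


rng = np.random.default_rng(0)
# Synthetic stand-in for the 60 activation maps of the first layer.
acts = rng.random((75, 75, 60))
acts[:, :, [3, 17, 42]] = 0.0  # three filters returning a blank output

blank = inactive_filters(acts)
n_useful = acts.shape[-1] - blank.size
```

A filter that stays blank across a representative set of images contributes little to the prediction, which supports reducing the layer width as done in the final configuration.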

Figure 6: 2D output resulting from the convolution stage by filters of the first layer

4.2 Filter Visualisation

As explained in Chollet (2018), the different patterns learnt by a model can be visualised by generating, for each filter of the network, the visual pattern that it responds to. To quantify the response of a filter to an image, one can define a loss function returning the average value of the output resulting from the convolution by the given filter. Starting from a noisy image, the visual pattern is then formed by gradient ascent on the pixels of the input, so as to maximise the loss function measuring the response of the filter to that input.
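The procedure can be illustrated on a toy case with a single linear convolution filter, for which the gradient of the mean response with respect to the input pixels has a closed form. This is a simplified sketch, not the method applied to the trained network: in a deep model the gradient depends on the current image and would be obtained by automatic differentiation. The edge filter, image size, step count and step size are hypothetical.

```python
import numpy as np


def response(image, kernel):
    """Mean output of a 'valid' cross-correlation of image with kernel."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for u in range(kh):
        for v in range(kw):
            out += kernel[u, v] * image[u:u + oh, v:v + ow]
    return out.mean()


def response_gradient(image_shape, kernel):
    """Gradient of `response` with respect to every input pixel."""
    kh, kw = kernel.shape
    oh, ow = image_shape[0] - kh + 1, image_shape[1] - kw + 1
    grad = np.zeros(image_shape)
    for u in range(kh):
        for v in range(kw):
            # Pixel (p, q) feeds output (p - u, q - v) with weight kernel[u, v].
            grad[u:u + oh, v:v + ow] += kernel[u, v]
    return grad / (oh * ow)


def visualise_filter(kernel, size=16, steps=20, lr=1.0, seed=0):
    rng = np.random.default_rng(seed)
    image = rng.normal(0.0, 0.1, size=(size, size))  # start from noise
    for _ in range(steps):
        grad = response_gradient(image.shape, kernel)
        grad /= np.sqrt(np.mean(grad ** 2)) + 1e-8    # normalise the step
        image += lr * grad                             # gradient ascent
    return image


edge = np.array([[-1.0, 0.0, 1.0]] * 3)  # hypothetical vertical-edge filter
start = np.random.default_rng(0).normal(0.0, 0.1, size=(16, 16))
pattern = visualise_filter(edge)
```

After the ascent steps, `response(pattern, edge)` is larger than `response(start, edge)`, which is the sense in which the generated image is the pattern the filter responds to.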

Some images generated by this method for filters of the first, third and fifth convolutional layers are presented in Figure 7. One can notice that the patterns spotted by filters of the first layers are rather abstract and difficult to interpret. However, as we go deeper into the network, the corresponding filters respond to cloud patterns. Moreover, the higher the convolutional layer, the more complex the corresponding images: a repetition of small patterns for images of the first and third layers, but complex shapes of cloud cover for the fifth layer. In particular, in Figure 7, some filters seem to focus on sparse cloud cover and others on denser cover.

Figure 7: Tuned images leading to strong responses by filters of the first, third and fifth convolutional layers (first, second and third row respectively)

5 Conclusions

This study shows that the Deep Learning framework is a promising approach to irradiance forecasting, with CNNs being able to extract relevant features from sky images. In addition, integrating same-day historical data into model predictions proves to be an important aspect of short-term forecasting.

The authors would like to acknowledge SIRTA for providing the sky images and irradiance measurements used in this study. We also would like to thank Prof. Philippe Blanc for his valuable advice. This research was supported by Engie, EPSRC and the University of Cambridge.



  • Siddiqui (2019) Siddiqui, Talha A., Bharadwaj, Samarth and Kalyanaraman, Shivkumar 2019, A deep learning approach to solar-irradiance forecasting in sky-videos IEEE Winter Conference on Applications of Computer Vision, WACV 2019
  • Kaiming He (2015) Kaiming He, et al., Deep Residual Learning for Image Recognition Arxiv
  • Fisher Yu (2016) Fisher Yu, Vladlen Koltun, Multi-scale Context Aggregation by Dilated Convolutions Arxiv, International Conference on Learning Representation
  • Chollet (2018) Chollet F., Deep Learning with Python Manning
  • Zeiler (2013) Zeiler, Matthew D. and Fergus, Rob, Visualizing and Understanding Convolutional Networks Arxiv 1311.2901
  • Haeffelin (2005) Haeffelin M., Site instrumental de recherche par télédétection atmosphérique (Sirta) International Energy Agency (IEA)
  • Zhang (2018) Zhang J, Verschae R, Nobuhara S, Lalonde J, Deep photovoltaic nowcasting Solar Energy
  • Zhao (2019) Zhao X, Wei H, Wang H, Zhu T, Zhang K, 3D-CNN-based feature extraction of ground-based cloud images for direct normal irradiance prediction Solar Energy