Dennis van der Meer Spatio-temporal probabilistic forecasting of solar power, electricity consumption and net load


Page 1: Spatio-temporal probabilistic forecasting of solar power ...uu.diva-portal.org/smash/get/diva2:1256832/FULLTEXT01.pdfdesigned for electric load data", Applied Energy, Vol. 218, pp

Dennis van der Meer

Spatio-temporal probabilistic forecasting of solar power, electricity consumption and net load


Abstract

The increasing penetration of renewable energy sources into the electricity generating mix poses challenges to the operational performance of the power system. Similarly, the push for energy efficiency and demand response (i.e., when electricity consumers are encouraged to alter their demand by means of a price signal) introduces variability on the consumption side as well.

Forecasting is generally viewed as a cost-efficient method to mitigate the adverse effects of the aforementioned energy transition because it enables a grid operator to reduce the operational risk by, e.g., unit commitment or curtailment. However, deterministic, or point, forecasting is currently still the norm.

This thesis focuses on probabilistic forecasting, a method in which the uncertainty accompanying the forecast is expressed by means of a probability distribution. In this framework, the thesis contributes to the current state-of-the-art by investigating properties of probabilistic forecasts of PV power production, electricity consumption and net load at the residential and distribution level of the electricity grid.

The thesis starts with an introduction to probabilistic forecasting in general and two models in particular: Gaussian processes and quantile regression. The former model has been used to produce probabilistic forecasts of PV power production, electricity consumption and net load of individual residential buildings, which is particularly challenging due to the stochasticity involved but important for home energy management systems and potential peer-to-peer energy trading. Furthermore, both models have been utilized to investigate what effects spatial aggregation and increasing penetration have on the predictive distribution. The results indicated that only 20-25 customers, out of a data set containing 300 customers, need to be aggregated in order to improve the reliability of the probabilistic forecasts. Finally, this thesis explores the potential of Gaussian process ensembles, which is an effective way to improve the accuracy of the forecasts.


All models are wrong, but some are useful.

George Edward Pelham Box


List of papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I D.W. van der Meer, J. Widén, J. Munkhammar, "Review on probabilistic forecasting of photovoltaic power production and electricity consumption", Renewable and Sustainable Energy Reviews, Vol. 81, pp. 1484-1512 (2018).

II D.W. van der Meer, M. Shepero, A. Svensson, J. Widén, J. Munkhammar, "Probabilistic forecasting of electricity consumption, photovoltaic power generation and net demand of an individual building using Gaussian Processes", Applied Energy, Vol. 213, pp. 195-207 (2018).

III D.W. van der Meer, J. Munkhammar, J. Widén, "Probabilistic forecasting of solar power, electricity consumption and net load: Investigating the effect of seasons, aggregation and penetration on prediction intervals", Solar Energy, Vol. 171, pp. 397-413 (2018).

IV D.W. van der Meer, J. Munkhammar, J. Widén, "Probabilistic clear-sky index forecasts using Gaussian process ensembles", in Proceedings of the 2018 World Conference on Photovoltaic Energy Conversion (WCPEC-7) (IEEE Photovoltaic Specialist Conference (PVSC-45)), Waikoloa, Hawaii, June 9-15 (2018). ©2018 IEEE.

Reprints were made with permission from the publishers.


Papers not included in the thesis

V D.W. van der Meer, G. R. Chandra Mouli, G. Morales-España, L. Ramirez Elizondo, P. Bauer, "Energy Management System With PV Power Forecast to Optimally Charge EVs at the Workplace", IEEE Transactions on Industrial Informatics, Vol. 14, pp. 311-320 (2018).

VI M. Shepero, D.W. van der Meer, J. Munkhammar, J. Widén, "Residential probabilistic load forecasting: A method using Gaussian process designed for electric load data", Applied Energy, Vol. 218, pp. 159-172 (2018).

VII D.W. van der Meer, J. Widén, J. Munkhammar, "A comparison of strategies for net demand forecasting in case of photovoltaic power production and electricity consumption", in Proceedings of the 34th European Photovoltaic Solar Energy Conference (EU-PVSEC), Amsterdam, The Netherlands, September 25-29 (2017).

VIII D.W. van der Meer, J. Widén, J. Munkhammar, "Investigating the effect of aggregation on prediction intervals in case of solar power, electricity consumption and net demand forecasting", in Proceedings of the 7th Solar Integration Workshop (SIW), Berlin, Germany, October 24-25 (2017).

IX D.W. van der Meer, J. Andersson, V. Bernström, J. Törnqvist, J. Widén, "Predicting hosting capacity of photovoltaic power production in low-voltage grids using regressive techniques", in Proceedings of the 7th Solar Integration Workshop (SIW), Berlin, Germany, October 24-25 (2017).


Notes on my contributions

My contributions to the appended papers are as follows:

Paper I: I surveyed the literature and wrote the paper.

Paper II: I co-developed the model, performed most of the simulations and wrote most of the paper.

Paper III: I developed the models, performed the simulations and wrote the paper.

Paper IV: I developed the model, performed the simulations and wrote the paper.


Ethical considerations

Ethics is a broad field of research with a long history. Although typically thought of as mainly applicable in studies involving people and animals, ethics plays an important role in all scientific fields, albeit in different forms. In this brief chapter, the focus lies on the justification of this licentiate thesis with regard to society, truth telling and impartiality.

Ethics constitutes an important aspect of one of today's fundamental challenges: climate change. Although the majority of researchers agree that evidence exists for anthropogenic climate change, controversy exists regarding the extent of it [1]. It is, then, important to be objective as to the justification and purpose of research related to mitigating climate change, which concerns renewable energy in this licentiate thesis. Renewable energy sources (RESs) will play an important role in sustainable energy production but they pose challenges to the stability and reliability of the energy system due to their variable nature. It has been hypothesized and shown that the ability to forecast accurately allows for further integration of RESs while reducing energy generation costs and mitigating greenhouse gas (GHG) emissions. In this licentiate thesis, the aim is to advance the field of forecasting and subsequent decision making by considering spatio-temporal probabilistic forecasts, which should, at least ideally, advance the integration of RESs into the electricity generating mix. It is, however, important to point out that neither this licentiate thesis nor RESs in general will be sufficient to mitigate climate change, and other alternatives, e.g., reforestation, carbon capture and storage (CCS) and energy efficiency, require further research efforts as well.

Truth telling and impartiality are vital to sustain the credibility of science. In probabilistic forecasting, the research topic of this licentiate thesis, proper scoring rules are used, which give the highest expected reward, in some sense, when the true probability distribution is reported [2]. Such rules therefore encourage the researcher to be truthful so that he or she may continue to improve the forecast accuracy and maximize the reward in expectation [3]. Impartiality, on the other hand, is not defined mathematically. Consequently, it requires dedication from the researcher to remain impartial and objective. An important aspect of impartiality is conflict of interest, which is defined as the risk that secondary interests may affect the primary interest by unsound judgment or actions [4]. In this licentiate thesis, there is no conflict of interest.
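The property of proper scoring rules mentioned above can be illustrated numerically. The following is a minimal sketch, not taken from the thesis: it assumes observations drawn from a standard normal distribution and compares Gaussian forecasts with three different standard deviations using the closed-form continuous ranked probability score (CRPS) for a Gaussian forecast. The CRPS is negatively oriented, so the highest reward corresponds to the lowest score. The function name `crps_gaussian` and all numerical settings are illustrative assumptions.

```python
import math
import numpy as np

# Vectorized error function (NumPy itself does not provide erf).
_erf = np.vectorize(math.erf)

def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of a N(mu, sigma^2) forecast at observations y (lower is better)."""
    z = (np.asarray(y, dtype=float) - mu) / sigma
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1.0 + _erf(z / math.sqrt(2)))
    return sigma * (z * (2 * cdf - 1) + 2 * pdf - 1 / math.sqrt(math.pi))

# Assumption: the "true" distribution of the observations is N(0, 1).
rng = np.random.default_rng(42)
observations = rng.normal(0.0, 1.0, size=20_000)

# Average score of an overconfident (0.5), truthful (1.0) and underconfident (2.0) forecast.
scores = {s: float(np.mean(crps_gaussian(0.0, s, observations))) for s in (0.5, 1.0, 2.0)}
```

Under these assumptions the forecast that reports the true standard deviation attains the lowest average score, which is the sense in which a proper scoring rule rewards truthful reporting.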


Contents

Ethical considerations

1 Introduction
  1.1 Aim of the thesis
  1.2 Overview of the thesis and the appended papers

2 Background
  2.1 Why do we need to forecast?
    2.1.1 Distributed generation
    2.1.2 Electricity consumption 2.0
    2.1.3 Net load
    2.1.4 Balancing supply and demand
    2.1.5 Deterministic versus probabilistic forecasting
  2.2 How can we forecast?
    2.2.1 Numerical weather prediction
    2.2.2 Satellites
    2.2.3 Statistical machine learning
    2.2.4 Hybrid methods
  2.3 Basic definitions
  2.4 Previous work
  2.5 Research gaps

3 Methodology and data
  3.1 Overview of data
  3.2 Forecast models
    3.2.1 Gaussian process
    3.2.2 Quantile regression
  3.3 Overview of approaches
    3.3.1 The behavior of predictive distributions
    3.3.2 Gaussian process ensembles
  3.4 Cross validation, performance metrics and benchmarks
    3.4.1 Cross validation for time series
    3.4.2 Probabilistic metrics
    3.4.3 Persistence ensemble
    3.4.4 ARIMA

4 Results
  4.1 Temporal forecasting
    4.1.1 Solar power forecasting at the residential level
    4.1.2 Electricity consumption forecasting at the residential level
    4.1.3 Net load forecasting at the residential level
  4.2 Spatio-temporal forecasting
    4.2.1 The effects of aggregation on the predictive distribution
    4.2.2 Moving beyond the unimodal Gaussian distribution

5 Discussion and future work
  5.1 Discussion
  5.2 Future work

6 Conclusion

References


List of abbreviations

Abbreviation  Description
AC            Alternating current
AGC           Automatic generation control
ANN           Artificial neural network
ARIMA         Autoregressive integrated moving average
CDF           Cumulative distribution function
CRPS          Continuous ranked probability score
CSGHI         Clear-sky global horizontal irradiance
CSI           Clear-sky index
DR            Demand response
EC            Electricity consumption
EM            Expectation-maximization algorithm
EV            Electric vehicle
GHG           Greenhouse gas
GHI           Global horizontal irradiance
GP            Gaussian process
HVAC          Heating, ventilating and air conditioning
ISO           Independent system operator
MAE           Mean absolute error
NL            Net load
NWP           Numerical weather prediction
PDF           Probability density function
PeEn          Persistence ensemble
PI            Prediction interval
PICP          Prediction interval coverage probability
PINAW         Prediction interval normalized average width
PLF           Probabilistic load forecasting
PSPF          Probabilistic solar power forecasting
PV            Photovoltaic
P2P           Peer-to-peer
QR            Quantile regression
RES           Renewable energy sources
RMSE          Root mean square error
TSI           Total sky imager
WRF           Weather Research and Forecasting


1. Introduction

The best way to predict the future is to create it.

Abraham Lincoln

Electricity production and consumption are set to change drastically in the near future. Traditionally, electricity production is centralized in power plants that typically rely on burning fossil fuels to convert water into steam, which in turn is converted into kinetic energy via, e.g., a turbine. This process has the advantage that it is scalable and that the output is controllable. However, the fuels used in this process, i.e., coal, natural gas, nuclear and, to a lesser extent, oil, have distinct disadvantages such as greenhouse gas (GHG) emissions or radioactive waste. Nevertheless, these fuels accounted for 76.9% of the total energy carriers used to generate electricity in 2015 [5]. At least partly because of these disadvantages, renewable energy sources (RESs) such as solar and wind power have experienced tremendous growth in recent years. The installed capacity of solar power, or photovoltaic (PV) power, has increased exponentially over the last decade, with a total of 397 GW installed capacity worldwide by the end of 2017 [6]. However, the increasing penetration of PV power in the electricity generation mix poses challenges to the operational performance of the electricity grid due to the variability PV power introduces [7], caused by stochastic atmospheric processes.

Electricity consumption is traditionally predictable, similar to electricity produced by power plants, and is characterized by peaks in the morning and evening. In fact, the first electricity consumption forecasts were simply counts of the number of installed light bulbs, a method that is still in use to forecast streetlight loads [8]. However, energy use as a whole has risen dramatically and it is projected that electricity will account for 40% of the rise in energy use until 2040 [9]. Electric vehicles (EVs¹) in particular play an important role in such a scenario: there were 2 million EVs worldwide in 2016, and this number is expected to increase to 58-200 million by 2030, depending on the policies in place [10]. This poses challenges as well, since uncontrolled charging of EVs can reduce the operational performance of the grid in densely populated areas due to power surges, even at low levels of penetration [11]. Other policies, such as the push for energy efficiency or the adoption of demand response (DR), will likely increase the variability of electricity consumption [12].

¹ Herein, EV is an umbrella term for battery electric vehicles (BEVs) and plug-in hybrid electric vehicles (PHEVs).

In Sweden, the changes in power generation and consumption outlined above have just begun. The cumulative installed capacity of PV power remains relatively low because of low electricity prices and production technologies with relatively low GHG emissions [13]. At the end of 2016, a total of 205 MW was installed, which accounted for 0.1% of total electricity production [13]. In this respect, variability as of yet poses little challenge to the operational performance of the grid. As for electricity consumption, Svenska kraftnät, responsible for the entire Swedish electricity network, ensures balance between supply and demand by purchasing power reserves [14]. However, here too the expectation is that future electricity consumption will call for more advanced methods to maintain grid reliability [14].

It is generally understood that a forecast, i.e., a prediction of a future event over a certain time horizon, is a cost-efficient solution to the issue of variability because of its ability to reduce operational risks, e.g., via curtailment and unit commitment [15, 16]. However, recent studies have mainly focused on deterministic, or point, forecasts, which neglect the intrinsic uncertainty accompanying the forecasts. This calls for studies into probabilistic forecasting of both PV power production and electricity consumption, and potential applications thereof [17, 18].

1.1 Aim of the thesis

As briefly noted above, the increase in variability on both the supply and demand side of the power system requires advanced yet cost-efficient methods that can be used to ensure high operational performance. The main aim of the thesis is therefore to develop models to produce accurate and reliable forecasts. In order to do so, four concrete goals have been formulated:

I Evaluate the state-of-the-art in forecasting of PV power and electricity consumption and condense the knowledge.

II Investigate the properties of probabilistic forecasts of PV power, electricity consumption and net load on a residential level.

III Investigate, building on II, the properties of probabilistic forecasts for increasing levels of aggregation and penetration.

IV Assess the potential of ensemble techniques for probabilistic forecasts.


1.2 Overview of the thesis and the appended papers

The remainder of this licentiate thesis summary is structured as follows: Chapter 2 provides a background on forecasting, touches upon the why and how of it, introduces basic definitions and reviews the state-of-the-art. Chapter 3 introduces the data, methodologies, models and performance metrics that are used throughout this licentiate thesis. Chapter 4 presents the main results from the appended papers and Chapter 5 discusses these results and future work. Conclusions are finally drawn in Chapter 6. The results presented here are based on the appended papers:

I Paper I provides a review of the state-of-the-art in forecasting of PV power, electricity consumption and net load. It focuses mainly on probabilistic forecasting, since that has been identified as a valuable addition to express the uncertainty of a forecast. Besides the review of the literature, the paper also aims at providing an introduction to the various probabilistic methods that can be used to forecast, with a particular focus on statistical and machine learning techniques.

II Paper II introduces and investigates the potential of the Gaussian process (GP), a nonlinear and non-parametric probabilistic model, when applied to electricity consumption, PV power and net load forecasting of individual residential buildings. Residential buildings are typically the most challenging due to the lack of any type of smoothing and therefore form an interesting base case. Besides thorough cross validation of various covariance functions for this particular task, it compares a static to a regularly updated GP designed to reduce the computational burden.

III Paper III aims to fill a research gap, namely, what impact aggregation and penetration have on the accuracy of probabilistic forecasts. This has been well documented in the literature for deterministic forecasting, in particular for irradiance and PV power forecasting, but this is, as far as we are aware, the first attempt to investigate this for probabilistic forecasting. The main motivation for this study is to see to what extent it is possible to improve probabilistic forecasts by simply aggregating customers.

IV Paper IV moves away from the single model approach and investigates the benefits of an ensemble of two GPs, which are trained on subsets of data that are normally distributed and that are subsequently optimally combined via the expectation-maximization (EM) algorithm. Ensembles have been shown to substantially improve the predictive performance by eliminating model or data specific deficiencies, and ensembles have therefore been applied using the aforementioned novel approach.
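As a rough illustration of the combination step mentioned for Paper IV, the sketch below uses the EM algorithm to estimate the mixing weights of two fixed predictive densities. It is not the implementation from Paper IV: the two components are plain Gaussians with assumed means and standard deviations standing in for GP predictive distributions, the data are synthetic, and the function names `gaussian_pdf` and `em_mixture_weights` are hypothetical.

```python
import numpy as np

def gaussian_pdf(y, mean, std):
    """Density of N(mean, std^2) evaluated at y."""
    return np.exp(-0.5 * ((y - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))

def em_mixture_weights(densities, n_iter=200, tol=1e-10):
    """Estimate mixing weights by EM.

    densities has shape (n_obs, n_models) and holds each model's
    predictive density evaluated at the corresponding observation.
    """
    n_models = densities.shape[1]
    weights = np.full(n_models, 1.0 / n_models)
    for _ in range(n_iter):
        # E-step: responsibility of each model for each observation.
        weighted = densities * weights
        resp = weighted / weighted.sum(axis=1, keepdims=True)
        # M-step: new weights are the average responsibilities.
        new_weights = resp.mean(axis=0)
        converged = np.max(np.abs(new_weights - weights)) < tol
        weights = new_weights
        if converged:
            break
    return weights

# Synthetic clear-sky-index-like data (assumption): 70% "clear" and 30% "cloudy" samples.
rng = np.random.default_rng(7)
n_obs = 5000
is_clear = rng.random(n_obs) < 0.7
y = np.where(is_clear, rng.normal(0.9, 0.05, n_obs), rng.normal(0.4, 0.15, n_obs))

# Two fixed component densities standing in for the two ensemble members.
densities = np.column_stack([gaussian_pdf(y, 0.9, 0.05), gaussian_pdf(y, 0.4, 0.15)])
weights = em_mixture_weights(densities)
```

With well-separated components, the estimated weights should land close to the proportions used to generate the data. Only the weights are updated here; the component densities are kept fixed to keep the example short.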


2. Background

This chapter presents the background of the thesis. Section 2.1 aims to make the case for forecasting as a cost-efficient solution to mitigate adverse effects of large scale penetration of PV power and stochastic electricity consumption into the electricity grid of the future. Section 2.2 introduces approaches to forecasting that are currently available, where forecast horizons span from milliseconds to weeks. Section 2.3 provides a gentle introduction to basic definitions that will be used throughout this thesis, whereas Section 2.4 summarizes the state-of-the-art. Section 2.5 concludes this chapter by identifying research gaps that have been addressed in this thesis. Sections 2.2 to 2.5 represent updated information that was first presented in Paper I.

2.1 Why do we need to forecast?

Throughout this thesis, we shall see that variability plays a key role in the power system of the future. Indeed, variability is the main reason for the uncertainty in forecasts, and it is therefore understandable that this area has attracted a considerable amount of research.

2.1.1 Distributed generation

The push for a low-carbon power system requires RESs such as solar power, wind power, biomass and hydro power that do not emit GHGs, or are at least carbon neutral. Whereas the latter two are capable of providing a steady power output over an extended period of time, the former two depend on stochastic atmospheric processes. In case of solar power, clouds represent the main source of variability and hence, uncertainty¹, mainly because numerical weather prediction (NWP) models, which rely on physical representations of the atmosphere, cannot resolve small-scale cloud formation due to their coarse resolution [15, 21, 22].

Currently, penetration levels of PV power are modest. In Germany, for instance, power generated by PV plants covered 7.2% of the total electricity consumption in 2017 [23]. However, the coverage rate can increase to 50% on weekends and holidays, and such levels can cause problems in the power system such as severe voltage fluctuations [23, 24]. It is possible to hedge against such fluctuations by diversification of the electricity generating portfolio, similar to the case of financial engineering. For instance, Olauson et al. [25] thoroughly analyzed two scenarios on various timescales with high or full renewable energy penetration and their impact on the net load, i.e., electric load minus (renewable) electricity production, for Nordic countries. They found that relatively low variability can be guaranteed by introducing an RES mix that has low variability at a particular timescale and they went on to conclude that although a fully renewable Nordic power system is possible, it will require additional peak generating capacity [25].

[Figure 2.1: three panels (Production, Consumption, Net Load) showing normalized power (-) against date, Oct 20 to Oct 24, for 1 customer, 30 customers and 210 customers.]

Figure 2.1. The effect of aggregation on time series that have been normalized between 0 and 1 to allow for visual comparison, since time series are generally scale dependent. The data come from Ausgrid [26] and will be formally introduced in Section 3.1.

¹ Aerosols, formed by, e.g., pollution, desert dust and pollen, also introduce variability by absorbing and scattering irradiance [19], but these are generally not modeled explicitly when forecasting PV power or are incorporated when calculating the clear-sky irradiance, e.g., in the Ineichen model [20].

Variability is also reduced by aggregation, i.e., the combined output from several dispersed generation units. Figure 2.1 shows an example, where the top, middle and bottom facets present PV power production, electricity consumption and net load, respectively. The color and linetype indicate the number of customers that have been aggregated, after which all time series have been normalized between 0 and 1 individually in order to allow for visual comparison. This effect, also referred to as the smoothing effect in case of irradiance and PV power, occurs due to geographical dispersion of PV plants and is well described in the literature [21]. It has been shown that geographical dispersion over the entire planet could almost entirely eliminate variability [21].
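The aggregation and normalization described for Figure 2.1 can be mimicked on synthetic data. The sketch below is an illustration only and does not use the Ausgrid data: it assumes that every customer follows a shared diurnal shape plus independent noise, sums 1, 30 and 210 customers, normalizes each aggregate between 0 and 1, and measures the variability as the standard deviation of the step-to-step changes. The function names and all numerical settings are assumptions; `net_load` implements the definition of net load as electric load minus production.

```python
import numpy as np

def net_load(consumption, production):
    """Net load: electric load minus (renewable) electricity production."""
    return np.asarray(consumption, dtype=float) - np.asarray(production, dtype=float)

def min_max_normalize(series):
    """Scale a series to the range [0, 1]; a constant series maps to zeros."""
    series = np.asarray(series, dtype=float)
    span = series.max() - series.min()
    if span == 0:
        return np.zeros_like(series)
    return (series - series.min()) / span

def step_variability(series):
    """Standard deviation of step-to-step changes of the normalized series."""
    return float(np.std(np.diff(min_max_normalize(series))))

# Synthetic data (assumption): shared diurnal shape plus independent noise per customer.
rng = np.random.default_rng(0)
n_customers, n_steps = 210, 48 * 5  # five days at 30-minute resolution
t = np.arange(n_steps)
diurnal = 0.5 + 0.5 * np.sin(2 * np.pi * t / 48)
loads = diurnal + rng.normal(0.0, 0.5, size=(n_customers, n_steps))

# Aggregate the first n customers, then normalize and measure variability.
variability = {n: step_variability(loads[:n].sum(axis=0)) for n in (1, 30, 210)}
```

Because the independent noise partly cancels in the sum while the shared diurnal shape adds up, the normalized aggregate becomes smoother as more customers are included, which is the behavior Figure 2.1 illustrates.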

The aforementioned measures require strategic planning and can be costly. Moreover, diversification might not be suitable due to geographical constraints and although the smoothing effect reduces the variability, the question “How much power production can one expect?” remains unanswered. Indeed, forecasts allow decision-makers to estimate future production and act accordingly, which has the potential to substantially reduce operational cost and mitigation measures [27]. More specifically, Martinez-Anido et al. [28] showed that savings on variable electricity generation of an independent system operator (ISO) can increase between 0.5 and 11 M$ when the forecast accuracy increases from 0% to 25%, depending on the PV power penetration level, due to the reduced necessity of expensive backup generation.

2.1.2 Electricity consumption 2.0

Traditionally, electricity grids have had a clear hierarchical structure with production capacity on one side, and consumers on the other side. Due to the homogeneity of the electric loads in the past (initially the loads consisted solely of light bulbs), forecasting was straightforward [8]. As wealth increased over time, and following the advent of the transistor and electronic devices, electricity consumption shifted over the course of the day and became increasingly stochastic over short time intervals. Other electric devices such as heating, ventilating and air conditioning (HVAC) devices represent a large share of total electricity consumption in many countries and have drastically increased the sensitivity of electricity consumption to the weather [12, 29].

The aforementioned trends have resulted in electricity consumption that displays clear diurnal patterns. The middle facet of Figure 2.1 clearly shows that, without detailed knowledge of the customer, the consumption of a single customer displays a high level of randomness. However, the load profile smooths significantly when multiple customers are aggregated. This has allowed most utilities to achieve day-ahead forecast errors of around 3% [12], where dummy variables indicating the time of day and day of year, and weather inputs, are important predictors for electricity consumption [30, 31].

Currently, there is a push to electrify the transportation sector by means of EVs, either hybrid or solely powered by a battery. Although the total number of EVs remains modest with 2 million worldwide in 2016, it is estimated to increase to 58-200 million in 2030, depending on the policies put in place [10]. Without any control, opportunistic, i.e., uncoordinated, charging of EVs is expected to add significantly to the typical diurnal electricity consumption peak in the evening [12, 32]. It has been shown that coordinated charging can improve the power quality to a level similar to that without EVs present, while in case no form of charging strategy is imposed, grid reinforcements, which are generally costly, are necessary to increase the hosting capacity [32]. The latter case also requires conventional power plants that are either costly to utilize or are constrained by slow ramp rates to be used more frequently. Either way, load forecasting is imperative to be able to control dispatchable loads such as EVs, or to ramp power plants efficiently.

2.1.3 Net load

The bottom facet of Figure 2.1 presents the net load, i.e., electricity consumption minus PV power production. The figure indicates that clear skies prevailed on the first and last days, which has had a significant impact on the net load. The typical shape of the net load curve is commonly referred to as the duck curve and is indicative of the mismatch between peak electricity consumption and production [33].

In order to cope with the aforementioned mismatch and to benefit from RESs with only limited curtailment, load shifting by means of DR and energy storage, e.g., thermal or electric, are typically suggested on the consumption side [34, 35]. On the production side, real-time unit dispatching on short timescales and unit commitment on longer timescales are important measures to reduce costs and GHG emissions [15]. Consequently, accurate forecasts are important for the measures on either side to be effective and thus play an important role in the smart grid [15, 35].

2.1.4 Balancing supply and demand

Data play a key role when it comes to balancing supply and demand of electricity, since more data containing actionable information are required when the variability increases. For low levels of variability, the supply can be adapted to the demand when changes in the frequency of the grid are detected via automatic generation control (AGC) [36]. However, due to the inertia of the rotating masses involved, this can be relatively slow. When the variability increases, it becomes more important to be able to predict sudden fluctuations, and reliable and correlated data serve as explanatory variables in such forecast models.

The electricity generating industry has invested heavily in order to cope with the increasingly variable production and consumption, e.g., by installing communication devices, high frequency meters and other sensors [8]. The aim of these investments is to increase the volume of data containing actionable information, which can be used in a smart grid. The smart grid is an electricity network in which large volumes of bidirectional data streams, i.e., big data, flow that facilitate solutions related to reliability, (cyber) security, optimization, automation and communication [37]. Therefore, it is hypothesized by researchers that the smart grid enables a sharing economy in the energy sector because it allows for peer-to-peer (P2P) trading of electricity over various time intervals, thus increasing the flexibility of the electricity network and consequently increasing network capacity [38]. The reliability of the smart grid is therefore highly dependent on the accuracy of the forecasts since they constitute a substantial part of the actionable information provided to decision-makers.

[Figure 2.2: number of publications (axis from 0 to 3000) per year (1995 to 2015) for four series: Deterministic Load, Deterministic Solar, Probabilistic Load and Probabilistic Solar.]

Figure 2.2. The number of publications on deterministic and probabilistic forecasting, where it is clear to see that probabilistic forecasting lags deterministic forecasting. Data from www.scopus.com [39].

2.1.5 Deterministic versus probabilistic forecasting

The previous sections stressed the importance of accurate forecasts, both for PV power production and electricity consumption. A great deal of research has been, and still is, devoted to deterministic forecasting. However, since all models are wrong², it seems unrealistic to assign a single value to what is in fact a random variable [40]. In order to deal with the increasing variability, with increasing uncertainty as a consequence, recent research efforts have begun to focus on probabilistic forecasting [2, 12, 41], where the goal is to describe the probability of occurrence by means of a distribution. In practice, probabilistic forecasts allow for risk management in operational planning, e.g., by identifying the amount of backup generation required to integrate variable RESs [42]. For instance, Etingov et al. [18] showed that probabilistic forecasts could enable the California Independent System Operator (CAISO) to reduce the regulation range by 12-31%, particularly during midday. Figure 2.2 presents the results of four straightforward search queries³ on Scopus [39]. The figure reveals the dominance of studies into deterministic forecasting and the slow uptake of probabilistic forecasting in recent years, particularly in the case of solar power. Hong and Fan argued that deterministic forecasts are predominant because probabilistic forecasts were evaluated with deterministic performance metrics, which led researchers to believe that probabilistic forecasts were not as accurate [12].

² George Edward Pelham Box, who went on to say that Occam’s razor should be followed when describing or modeling natural phenomena.

Despite its advantages in, e.g., decision making [42], probabilistic forecasting poses some challenges, not least for the forecaster. For instance, Gneiting and Katzfuss noted that the sharpness of the predictive distributions should be maximized, subject to calibration, i.e., the observations should be similar to random draws from the predictive distributions [2]. Furthermore, while deterministic forecasts are usually assessed by common metrics such as the mean absolute error (MAE) or root mean squared error (RMSE), probabilistic forecasts require more thorough analyses of the predictive distribution, for which a plethora of scoring rules and visual inspection aids exist [2]. Sections 2.3 and 3.4 formally introduce the basic concepts and dedicated performance metrics, respectively. Finally, the resulting predictive distributions usually do not allow a closed-form solution and require Monte Carlo methods to approximate, which can be computationally demanding.
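For reference, the two deterministic metrics named above can be sketched in a few lines. This is a minimal illustration using their standard textbook definitions; the function names and the toy numbers are assumptions and are not taken from the thesis.

```python
import math


def mae(observations, forecasts):
    """Mean absolute error of point forecasts."""
    errors = [abs(y - f) for y, f in zip(observations, forecasts)]
    return sum(errors) / len(errors)


def rmse(observations, forecasts):
    """Root mean squared error of point forecasts."""
    squared = [(y - f) ** 2 for y, f in zip(observations, forecasts)]
    return math.sqrt(sum(squared) / len(squared))


# Toy example: three perfect forecasts and one large miss.
observations = [1.0, 2.0, 3.0, 4.0]
forecasts = [1.0, 2.0, 3.0, 8.0]

print(mae(observations, forecasts), rmse(observations, forecasts))
```

Because the RMSE squares the errors, the single large miss weighs more heavily in it than in the MAE. Both metrics compare one number per time step with the observation and therefore say nothing about the spread of a predictive distribution.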

2.2 How can we forecast?

The present section goes into more detail on the various methods that are currently employed in forecasting. Sections 2.2.1 and 2.2.2 pertain mainly to weather related forecasts, although some outputs of these models, e.g., temperature, are also used in electricity consumption forecasting. It should be noted that the respective research areas are too rich and voluminous to cover in a thesis, let alone in a section, and the interested reader is therefore referred to relevant literature. Figure 2.3 presents a graphic overview of the available methods and their respective temporal and spatial resolutions, which is taken from Paper I.

³ TITLE-ABS-KEY(("electricity consumption" OR load OR "electricity demand") AND (forecast OR predict))
TITLE-ABS-KEY(("solar power" OR pv OR photovoltaic) AND (forecast OR predict))
TITLE-ABS-KEY(("electricity consumption" OR load OR "electricity demand") AND (forecast OR predict) AND (probability OR probabilistic OR "prediction interval"))
TITLE-ABS-KEY(("solar power" OR pv OR photovoltaic) AND (forecast OR predict) AND (probability OR probabilistic OR "prediction interval"))


[Figure 2.3: diagram with temporal resolution (h) and spatial resolution (km) on the axes. Labels in the figure: Solar, Statistical, PLF, Physical, Persistence, TSI, NWP and Satellite images, with the horizons Intra-hour, Intra-day and Day-ahead.]

Figure 2.3. Visualization of the spatial and temporal resolutions of the most commonly applied methods in PV power and electricity consumption forecasting. This figure was presented in Paper I and inspired by [43, 44].

2.2.1 Numerical weather prediction

NWP models rely on extensive mathematical descriptions of the physical processes in the atmosphere and provide the most accurate forecasts for horizons from 6 hours onward [15]. Although the history of NWP models leads back to the beginning of the twentieth century, it was not until the late twentieth century that computing power sufficed to numerically solve the governing equations, e.g., the momentum equations and thermodynamic equation, or approximations thereof. The reader is referred to Warner [45] for an in-depth discussion on atmospheric modeling. NWP models are generally deterministic, which is to say that given identical starting conditions, the model will output the same results [40]. It has been argued that this yields incomplete results for two reasons: (1) the inability of the models to capture the fine scale processes requires approximations that induce uncertainty and (2) dynamical chaos [40, 46]. Since the entire atmosphere cannot be observed, the initial conditions of even a perfect NWP model would still differ from the actual atmospheric conditions, which results in rapid divergence between the model and reality owing to the sensitivity of the governing equations to the initial conditions [40]. It was not until 1992 that the European Centre for Medium-Range Weather Forecasts (ECMWF) began producing ensemble forecasts that represented the uncertainty in their predictions⁴.

Typically, NWP models predict weather related variables at a temporal resolution of 1 hour or lower and at a spatial resolution of 9 × 9 km² or lower, although the Weather Research and Forecasting (WRF) model achieves a spatial resolution of 1 × 1 km² [15]. Due to their coarse resolution in both space and time, NWP models are generally not suitable to directly predict solar irradiance or PV power production at residential or even plant level. Moreover, they are generally biased and post-processing is recommended [40]. However, they can provide valuable information, e.g., temperature, cloud cover or precipitation, for statistical methods, particularly in day-ahead forecasting. It should be noted that NWP forecasts are not an integral part of this thesis.

2.2.2 Satellites

Satellites that observe weather phenomena are commonly geostationary or polar-orbiting, and are located roughly 36000 km and 850 km away, respectively [19]. Cloud motion vectors (CMVs) are extracted from consecutive images for each pixel, which in turn can be used to forecast up to 6 hours ahead at a spatial resolution of 1 km, although this depends on the satellite, see, e.g., the book by Kleissl et al. [19] and references therein. Recent advances in weather satellites have improved the spatio-temporal resolution drastically. For instance, Bright et al. [47] used satellite images from the Himawari-8 satellite and reference PV plants to nowcast, i.e., to estimate at the time the images are issued, PV power at distributed PV plants in Canberra, Australia. The Himawari-8 satellite issues images every 10 minutes at a spatial resolution of 0.5 × 0.5 km² at nadir⁵ and therefore offers a wealth of information to improve nowcasts and forecasts [47]. Another notable example is the study by Lorenzo et al. [48], who used ground measurements and satellite imagery to nowcast via optimal interpolation (OI), also known as Kriging or Gaussian process (GP) regression. It should be noted that satellite derived forecasts are not an integral part of this thesis.

⁴ https://www.ecmwf.int/en/about/media-centre/fact-sheet-ensemble-weather-forecasting, date accessed: 2018-09-03.
⁵ Directly below the satellite; this resolution decreases away from nadir due to the curvature of the surface of the Earth.

2.2.3 Statistical machine learning

Data play a key role in our understanding of our surroundings and at each time instance, huge amounts of data are being created worldwide. In order to predict, whether it concerns the weather or what film we might want to see next, relationships between the variable of interest (the next film) and explanatory variables (the previous films) need to be established, usually through supervised learning [49]. While statistical learning and machine learning share considerable common ground, a key difference is flexibility in the models. Linear regression is a well-known statistical model that is often used to predict or extract information from data, but it is inflexible in the sense that a fixed number of parameters are used, i.e., it is a parametric model, and that an assumption regarding the interdependencies of the data has to be made. Machine learning approaches typically do not make assumptions regarding the parameterization of the model, i.e., they are non-parametric models, and usually do not make assumptions regarding the data. This flexibility tends to improve the predictive performance of these models [50]. A notable exception is the artificial neural network (ANN), which contains parameters such as the number of hidden nodes and the weights connecting the inputs to these hidden nodes, but the number of these parameters can increase drastically, thus adding flexibility. For instance, an ANN that was developed to translate English and French sentences contained 384 million parameters [51]. The line between statistical learning and machine learning is therefore not always a clear one.

Statistical machine learning includes probability theory in order to account for the inherent uncertainties that accompany, among others, modeling, data gathering and data observation. The interested reader is referred to Ghahramani [50] for a gentle introduction. Various (statistical) machine learning models that are commonly applied to probabilistic PV power or load forecasting have been presented in Paper I. Section 3.2 introduces two models that were key in developing Papers II–IV.

2.2.4 Hybrid methods

As mentioned before, NWP models are typically deterministic and biased [40], and post-processing can be applied to produce a probabilistic forecast and reduce the bias. The result is a hybrid between physical and statistical models. Hybrid methods can serve multiple purposes. For instance, Chu et al. [52] used sky-imagery as input to ANNs and support vector machines (SVMs) to improve the forecasts and estimate the uncertainty. The interested reader is referred to Paper I.

2.3 Basic definitions

This section introduces a few basic definitions to familiarize the reader with common concepts and terminology in solar power and electricity consumption forecasting necessary for Section 2.4. Section 3.4 further formalizes the performance metrics. In solar power or irradiance forecasting, it is common to remove deterministic variability present in the global horizontal irradiance (GHI, $G_{t,h}$) via the clear-sky global horizontal irradiance (CSGHI, $G_{t,cs}$) in order to acquire the stationary clear-sky index (CSI, $k_t$), which implies that it has constant mean and that the autocovariance only depends on the distance between time samples [54]. The CSI is formulated as

$$k_t = \frac{G_{t,h}}{G_{t,cs}}. \tag{2.1}$$

[Figure 2.4: panel (a) shows irradiance (W m⁻²) against time of day (09 to 18) for CSGHI and GHI; panel (b) shows the CSI (−) against time of day.]

Figure 2.4. In (a), the clear-sky global horizontal irradiance (CSGHI) and global horizontal irradiance (GHI) at 1 s resolution, measured by a pyranometer on Oahu, Hawaii [53]. In (b), the clear-sky index (CSI) during the same day.

[Figure 2.5: two Gaussian densities, one solid and one dashed, with the prediction interval $\mathrm{PI}_\eta$ marked; the central probability mass is $\eta$ and each tail holds $\frac{1-\eta}{2}$.]

Figure 2.5. Visualization of a prediction interval of a Gaussian forecast with nominal coverage level $\eta$. The dashed Gaussian density represents a probabilistic forecast that is sharper than the solid Gaussian density.

Figure 2.4a presents the GHI and CSGHI, where the latter is deterministic because it is governed by the rotation of the Earth and its orbit around the sun. Figure 2.4b presents the CSI during the same time period, the variability of which is induced by cloud movement and aerosols.
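A minimal sketch of Eq. (2.1) is given below. The clear-sky values are assumed to come from some clear-sky model that is not specified here, the irradiance numbers are made up, and the `min_clear_sky` threshold that masks night-time and low-sun samples is an assumption of this sketch rather than a choice documented in the thesis.

```python
def clear_sky_index(ghi, csghi, min_clear_sky=10.0):
    """Return k_t = G_{t,h} / G_{t,cs} for each time step.

    Time steps where the clear-sky irradiance is below `min_clear_sky`
    (W m^-2) yield None to avoid dividing by values near zero. The
    threshold is an assumption made for this sketch.
    """
    return [
        g / g_cs if g_cs >= min_clear_sky else None
        for g, g_cs in zip(ghi, csghi)
    ]


# Made-up irradiance samples in W m^-2 (illustration only).
ghi = [0.0, 150.0, 600.0, 990.0, 300.0]
csghi = [0.0, 200.0, 800.0, 900.0, 600.0]

csi = clear_sky_index(ghi, csghi)
print(csi)
```

Values above 1 are possible at high temporal resolution, as Figure 2.4b indicates, so the index is deliberately not clipped in this sketch.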


Forecasts are issued for horizons that can be classified as very short, short, medium and long term. However, the time spans vary between electric load and PV power forecasts. In case of the former, Hong and Fan define the cut-off times as one day, two weeks and three years, respectively [12, 55]. In contrast, the cut-off times are 15 minutes, 6 hours, day-ahead and several days ahead in relation to solar power forecasting [17], although it should be noted that this is not standardized in either of the cases.

Section 2.1.5 briefly introduced the notions of reliability and sharpness, characteristics with which probabilistic forecasts can be assessed. It is worth reiterating that the sharpness of the predictive distribution should be maximized, subject to calibration, and that a reliable distribution is said to be calibrated [2, 56]. Forecasts are said to be reliable when quantiles, defined by their nominal probability levels $\tau$, overpredict the target approximately $100\% \cdot \tau$ of the time [57]. For instance, the median quantile $q_{\tau=0.5}$ should over- and underestimate the target approximately 50% of the time. As mentioned above, Section 3.4 elaborates on the performance metrics.
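The reliability criterion for quantiles can be checked empirically, as in the following minimal sketch. It assumes synthetic standard normal observations and uses the true quantiles of that distribution as forecasts, so the forecasts are reliable by construction; the function name is illustrative.

```python
import random
from statistics import NormalDist


def overprediction_rate(quantile_forecasts, observations):
    """Share of time steps where the forecast quantile is at or above the observation."""
    hits = sum(1 for q, y in zip(quantile_forecasts, observations) if q >= y)
    return hits / len(observations)


rng = random.Random(1)
# Synthetic observations from a standard normal distribution (illustration only).
observations = [rng.gauss(0.0, 1.0) for _ in range(20000)]

rates = {}
for tau in (0.1, 0.5, 0.9):
    # The true tau-quantile serves as a perfectly reliable forecast here.
    quantile = NormalDist(0.0, 1.0).inv_cdf(tau)
    rates[tau] = overprediction_rate([quantile] * len(observations), observations)

print(rates)
```

For a reliable forecast, each rate lies close to its nominal level; systematic deviations across levels would point to a biased or poorly dispersed predictive distribution.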

In some situations, the entire probability distribution is not required and particular prediction intervals (PIs) are of interest. These are defined by their nominal coverage level $\eta$, such that $\mathrm{PI}_\eta$ is defined as

$$\mathrm{PI}_\eta = \left[ q_{\tau=\frac{1-\eta}{2}},\; q_{\tau=1-\frac{1-\eta}{2}} \right].$$

For instance, $\mathrm{PI}_{0.8} = [q_{\tau=0.1}, q_{\tau=0.9}]$, in case a nominal coverage level of 0.8 is required. Figure 2.5 presents a visualization of a probabilistic forecast by means of a single PI, while the dashed line represents a sharper probabilistic forecast than the solid line. The nominal coverage level $\eta$ can be used to quantify how many of the observations should be covered by $\mathrm{PI}_\eta$ and it is the coverage probability of a PI that is a quantification of reliability. A PI is reliable when it covers roughly $\eta \cdot 100\%$ of the observations in the test set [58, 59].
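The relation between the nominal coverage level and the bounding quantiles, and the empirical coverage of a PI, can be sketched as follows. The function names and the toy interval bounds are hypothetical; Section 3.4 contains the formal metrics.

```python
def pi_quantile_levels(eta):
    """Nominal probability levels of the lower and upper bound of PI_eta."""
    lower = (1.0 - eta) / 2.0
    return lower, 1.0 - lower


def empirical_coverage(lower_bounds, upper_bounds, observations):
    """Share of observations that fall inside their prediction interval."""
    covered = sum(
        1
        for lo, hi, y in zip(lower_bounds, upper_bounds, observations)
        if lo <= y <= hi
    )
    return covered / len(observations)


levels = pi_quantile_levels(0.8)  # approximately (0.1, 0.9)

# Toy test set with hypothetical interval bounds (illustration only).
observations = [1.0, 2.0, 3.0, 4.0, 5.0]
lower_bounds = [0.5, 1.5, 3.5, 3.0, 4.0]
upper_bounds = [1.5, 2.5, 4.5, 5.0, 4.5]

coverage = empirical_coverage(lower_bounds, upper_bounds, observations)
print(levels, coverage)
```

In this toy case three of the five observations are covered, so a PI with nominal coverage level 0.8 would be judged too narrow, although a real assessment requires a far larger test set.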

2.4 Previous work

Between the moment Paper I was published and the writing of this thesis summary, many papers have been published, most likely enough to write an entire additional review paper. In fact, the ever expanding body of literature has motivated some researchers to explore text mining [60]. This section therefore briefly summarizes the main conclusions from Paper I and then continues to review only the most notable of the newly published works in order to identify trends and research gaps.

Paper I uncovered several research gaps such as the need for the consistent application of probabilistic performance metrics in the case of probabilistic solar power forecasting (PSPF); the need for benchmark data sets, e.g., GEFCom2014 [61] or Oahu, Hawaii [53]; the necessity to take spatio-temporal correlations into account; and the need for alignment of temporal resolutions of solar power data and electricity consumption data to improve intra-day and intra-hour forecasts. Although public data sets remain relatively rare, recent research efforts have clearly improved on the use of probabilistic performance metrics. The following review suggests that there is still a need to consider spatio-temporal correlations.

With the widespread installation of smart meters and PV plants, a wealth of information is available to grid operators. As argued before, the actionable information contained herein can be used to balance supply and demand. Currently, there is a paradigm shift where not only temporal information is used, but also spatial information. The main driver behind this is Tobler’s first law of geography: “Everything is related to everything else, but near things are more related than distant things” [62]. It therefore seems sensible to include information from the surroundings, which is known as spatio-temporal forecasting. In this setting, one can either use the distributed information to predict at the locations where the information originated or use spatial interpolation techniques to predict anywhere in the area defined by the locations. The latter, also known as Kriging, was employed by Jamaly and Kleissl [63], who used it to predict irradiance at arbitrary locations. The advantage of this method is that all dependencies can be modeled using a covariance function, and in their work, the authors proposed a novel anisotropic covariance function to account for irregularities in the spatio-temporal correlations, whilst defining a spatial decorrelation distance. The disadvantage is the high computational burden, which scales as $O(N^3)$ for prediction and inference, and can therefore become prohibitive when the volume of data becomes too large [64]. The alternative, where one predicts solely at the locations where the data are observed, is generally computationally less expensive, despite the requirement that a model is learned for each location of interest. For instance, Agoua et al. [65] used quantile regression (QR) in combination with the least absolute shrinkage and selection operator (LASSO) to forecast PV power at 136 locations. The LASSO reduces the dimension of the problem by selecting only surrounding locations that improve the forecast of the target location. The downside of this method is, however, that the joint distribution between the forecasts at the locations remains unknown.
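One common way to write quantile regression with a LASSO penalty is given below. This is a generic textbook formulation and not necessarily the exact objective used by Agoua et al. [65]. The symbols are assumptions of this sketch: $y_t$ is the target at the location of interest, $\mathbf{x}_t$ collects the explanatory variables from the target and surrounding locations, $N$ is the number of training samples, $\lambda \geq 0$ is the penalty weight and $\rho_\tau$ is the pinball loss.

```latex
\hat{\boldsymbol{\beta}}_\tau
  = \arg\min_{\boldsymbol{\beta}}
    \sum_{t=1}^{N} \rho_\tau\!\left( y_t - \mathbf{x}_t^\top \boldsymbol{\beta} \right)
    + \lambda \lVert \boldsymbol{\beta} \rVert_1,
\qquad
\rho_\tau(u) = u \left( \tau - \mathbb{1}\{u < 0\} \right)
```

The $\ell_1$ penalty drives the coefficients of uninformative locations to exactly zero, which is the selection behavior described above. Since a separate $\hat{\boldsymbol{\beta}}_\tau$ is estimated per quantile level and per location, the formulation also makes clear why no joint distribution across locations is obtained.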

In order to predict up to, e.g., 6 hours into the future with hourly data, it is common practice to train a model for each forecast horizon. Similar to the spatial example above, however, this approach fails to incorporate the dependencies between the consecutive forecasts. Multi-stage decision making processes therefore benefit from multivariate, or joint, probabilistic forecasts from which space-time trajectories can be sampled [42]. Golestaneh et al. [66] used QR to forecast the marginal distributions at each location of interest and coupled these via a Gaussian copula to produce a multivariate normal distribution from which the space-time trajectories could be sampled. The results revealed that their proposed method outperformed the independent methods, i.e., where the trajectories were sampled from the marginal distributions. It should be noted that performance metrics for univariate distributions do not apply to multivariate forecasts, and adapted ones are required, see, e.g., [67]. Another notable example is the study by Möller et al. [68], who employed ensemble Bayesian model averaging (BMA) to produce marginal distributions of 5 NWP variables, after which these were combined into a joint distribution using a Gaussian copula.
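The general idea of coupling marginal quantile forecasts through a Gaussian copula can be sketched as follows. This is a simplified illustration, not the implementation of Golestaneh et al. [66] or Möller et al. [68]: the quantile levels, the marginal quantiles and the correlation matrix are made up, the inverse CDF is a piecewise-linear interpolation, and the tails beyond the outermost quantiles are clamped.

```python
import math
import random
from statistics import NormalDist


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def interpolate_quantile(levels, quantiles, u):
    """Piecewise-linear inverse CDF through the forecast quantiles, clamped at the ends."""
    if u <= levels[0]:
        return quantiles[0]
    if u >= levels[-1]:
        return quantiles[-1]
    for k in range(1, len(levels)):
        if u <= levels[k]:
            w = (u - levels[k - 1]) / (levels[k] - levels[k - 1])
            return quantiles[k - 1] + w * (quantiles[k] - quantiles[k - 1])


def sample_trajectory(levels, marginal_quantiles, chol, rng):
    """Draw one trajectory: correlated normals -> uniforms -> marginal quantiles."""
    z = [rng.gauss(0.0, 1.0) for _ in marginal_quantiles]
    correlated = [sum(chol[i][k] * z[k] for k in range(i + 1)) for i in range(len(z))]
    uniforms = [NormalDist().cdf(c) for c in correlated]
    return [
        interpolate_quantile(levels, q, u)
        for q, u in zip(marginal_quantiles, uniforms)
    ]


# Hypothetical marginal forecasts for three consecutive horizons (illustration only).
levels = [0.1, 0.5, 0.9]
marginal_quantiles = [[0.2, 0.5, 0.8], [0.3, 0.6, 0.9], [0.1, 0.4, 0.7]]
# Assumed correlation between the horizons; in practice it is estimated from data.
correlation = [[1.0, 0.8, 0.5], [0.8, 1.0, 0.8], [0.5, 0.8, 1.0]]

chol = cholesky(correlation)
rng = random.Random(42)
trajectories = [
    sample_trajectory(levels, marginal_quantiles, chol, rng) for _ in range(1000)
]
print(trajectories[0])
```

Each trajectory respects the marginal forecasts, while the correlation matrix makes neighboring horizons move together. Replacing it with the identity matrix would correspond to sampling the marginals independently.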

As the aforementioned studies reveal, spatio-temporal and multivariate probabilistic forecasting are gaining attention due to their importance in decision making processes. Another important aspect of forecasts is that in case of hierarchical time series, the forecasts at the various levels of the hierarchy are aggregate consistent [69]. Hierarchical time series occur in many settings, e.g., tourism and organizations, although the focus here is on the hierarchy of electricity networks. Forecasts are aggregate consistent if the summation of their outputs is equal at each level in the hierarchy. For instance, the sum of the forecasts of each household in the neighborhood should equal the forecast at the neighborhood level. Hyndman et al. [69] introduced a regression model to optimally and simultaneously combine and reconcile forecasts at all levels. Although applied to deterministic forecasting, Yang et al. [70, 71] used regression techniques that were built on the regression model proposed by Hyndman et al. [69] and noted that the reconciled forecasts outperformed the base forecasts by considerable margins at all levels of the hierarchy. Taieb et al. [72] explored two methods of probabilistic forecast combination and reconciliation, namely a bottom-up approach and a two-step approach where mean forecasts were first combined, after which they were reconciled. The former starts by predicting the marginal predictive distribution for each series in the lowest level of the hierarchy, after which the marginal predictive distribution of the level above can be computed by using a copula to model the joint distribution of the series at the lowest level. The second method was built on the game-theoretically optimal reconciliation (GTOP) approach [73], which relied on the combination of all mean forecasts through optimization and a second optimization step that minimized the difference between the combined forecasts and a set of coherent vectors at all levels of the hierarchy.
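Aggregate consistency for point forecasts can be illustrated with the simplest reconciliation strategy, bottom-up summation. The sketch below is not the regression-based reconciliation of Hyndman et al. [69] nor the GTOP approach [73]; the household and neighborhood numbers are made up and the function names are illustrative.

```python
def bottom_up(household_forecasts):
    """Neighborhood forecast obtained by summing household forecasts per time step."""
    return [sum(step) for step in zip(*household_forecasts)]


def is_aggregate_consistent(household_forecasts, neighborhood_forecast, tol=1e-9):
    """Check whether the households sum to the neighborhood forecast at every step."""
    summed = bottom_up(household_forecasts)
    return all(abs(a - b) <= tol for a, b in zip(summed, neighborhood_forecast))


# Made-up forecasts in kW for three households over three time steps.
households = [[0.5, 1.0, 1.5], [1.0, 1.0, 2.0], [0.5, 2.0, 0.5]]
# A neighborhood forecast produced independently of the household models.
independent_neighborhood = [2.0, 4.5, 4.0]

reconciled_neighborhood = bottom_up(households)
print(is_aggregate_consistent(households, independent_neighborhood))
print(is_aggregate_consistent(households, reconciled_neighborhood))
```

The independently produced neighborhood forecast disagrees with the household sum at the second time step, which is the inconsistency that reconciliation methods are designed to remove.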

As mentioned in the introduction of this section, many other interesting studies have been published concerning probabilistic forecasting. For instance, David et al. [56] and Pedro et al. [74] thoroughly investigated short-term probabilistic irradiance forecasting using a wide range of machine learning techniques, where it was noteworthy that no single method performed best. Forecasting net load, introduced in Section 2.1.3, was studied by Chu et al. [75] and Wang et al. [76] for high levels of PV penetration. Chu et al. tried to enhance the forecasts by decomposing the net load in order to remove low frequency components, by dividing the training data into day and night parts and by using sky images as additional explanatory variables, and noted significant improvement in forecast skill. Wang et al. started from the premise that PV power and electricity consumption are easier to predict separately and proposed a framework to decompose the net load and subsequently generate a joint probability distribution of the forecast, and went on to show that their approach was superior to forecasting the net load directly. Wang et al. [77] continued their efforts by proposing a framework to combine probabilistic load forecasts, since the literature has provided overwhelming evidence of the strength of ensemble methods. By averaging quantile regression forecasts in an optimal way through the use of linear programming, the authors improved the pinball score by 4.39% when compared to the best member of the ensemble. Finally, Appino et al. [78] and El-Baz et al. [79] studied applications of probabilistic forecasts, specifically for energy management and scheduling, and showed the improvement in self-consumption and self-sufficiency that can be achieved by using probabilistic forecasts in combination with smart scheduling.
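The pinball score and the averaging of quantile forecasts can be sketched as follows. Note the assumption: the sketch uses fixed, equal weights, whereas Wang et al. [77] obtained the weights through linear programming. The member forecasts and the observation are made up.

```python
def pinball_loss(observation, quantile_forecast, tau):
    """Pinball (quantile) loss of one quantile forecast at nominal level tau."""
    diff = observation - quantile_forecast
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def combine_quantiles(member_forecasts, weights):
    """Weighted average of the members' forecasts for the same quantile level."""
    return sum(w * q for w, q in zip(weights, member_forecasts))


tau = 0.9
observation = 10.0
# Hypothetical 0.9-quantile forecasts from two ensemble members.
members = [8.0, 14.0]
# Equal weights are an assumption of this sketch, not the optimized weights of [77].
combined = combine_quantiles(members, [0.5, 0.5])

member_losses = [pinball_loss(observation, q, tau) for q in members]
combined_loss = pinball_loss(observation, combined, tau)
print(member_losses, combined_loss)
```

In this toy case the members err on opposite sides of the observation, so the average scores better than either of them. That outcome is specific to the example and is not guaranteed in general.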

2.5 Research gaps

Probabilistic forecasting is a rapidly developing research area. However, as Figure 2.2 indicated, it is still a relatively immature field, particularly in the case of PV power. Many lessons can be learned from the field of wind power, where probabilistic forecasting is common and significant advances have been made by collaborations between domain experts and statisticians [42]. Furthermore, leveraging the vast body of literature on deterministic forecasting could also prove beneficial [12]. The following research gaps have been identified and addressed in the appended papers:

• Residential PV power, electricity consumption and net load forecasting has not been studied often, and not at all in this combination, even though this spatial level becomes increasingly important when residential buildings are equipped with energy management systems that are required to make optimal decisions regarding energy distribution. Paper II therefore addresses this research gap by using a GP as a non-parametric, nonlinear probabilistic model in order to model and predict the highly variable time series.

• The effect of aggregation on the accuracy of deterministic forecasts has been well described by the scientific literature, in particular for solar power forecasting, see, e.g., Perez et al. [21] and references therein. Load forecasting typically took place at higher levels of aggregation, although the recent deployment of smart meters has increased the spatio-temporal resolution of the data, and it is well understood that the increase in resolution decreases the accuracy of the forecasts. Few studies have, however, assessed the effect of aggregation on probabilistic forecasts and Paper III aims to contribute there. However, exact inference on the relationship between distance and accuracy was not possible due to the limitations of the data set. Future work will therefore consider data sets where location-specific data are available, although acquiring these is challenging due to the sensitive nature of these data.

• Ensemble forecasts have received considerable attention in the deterministic forecasting literature under the premise that the average of multiple models provides a better estimate than a single model. Wang et al. [77] noted that this had not been done for probabilistic load forecasting and Paper IV addresses this issue for probabilistic CSI forecasting.

• Spatio-temporal forecasting exploits information from the surrounding environment and this information can substantially improve the accuracy of forecasts, as has been shown by many studies into deterministic forecasting. Paper IV makes a first attempt at incorporating information from surrounding sensors. The challenge in this case is, however, to reduce the number of unnecessary explanatory variables, since these only contribute to the variance of the prediction [80].


3. Methodology and data

This chapter introduces the data (Section 3.1), the forecast models (Section 3.2), and the performance metrics and benchmarks (Section 3.4) that have been used in Papers II–IV.

3.1 Overview of data

The data used in Papers II–III come from Ausgrid [26], an Australian utility that operates in the Sydney metropolitan area. Ausgrid monitored the PV power production and electricity consumption of roughly 3000 customers on a half-hourly basis for a period of three years. Because of the sensitivity of these data, particularly that of electricity consumption, Ausgrid sampled 300 customers from the data set, anonymized the data and published them online. Data sets covering both PV power production and electricity consumption allow net load to be studied at the residential level, but such sets are rare. The main motivations for selecting this data set are that the data are publicly available and that they cover both PV power production and electricity consumption.

The data used in Paper IV have been collected by the National Renewable Energy Laboratory (NREL) on the island of Oahu, Hawaii [53]. The data set contains 1-second measurements of the GHI from a network of 17 pyranometers. In order to compute the CSI, these data were divided by the clear-sky irradiance obtained from the CAMS McClear Clear-Sky Irradiation service [81]. Since this service has a temporal resolution of 1 minute, the clear-sky values were assumed constant for the seconds within each minute; see Widén et al. [82] for more details.
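The CSI computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `clear_sky_index`, the argument names and the night-time threshold `eps` are hypothetical and do not come from Paper IV, and the handling of (near) zero clear-sky irradiance is an added safeguard rather than a documented step.

```python
import numpy as np

def clear_sky_index(ghi_1s, clear_sky_1min, eps=1e-6):
    """CSI = GHI / clear-sky irradiance, with every 1-minute clear-sky value
    held constant for the 60 seconds within that minute."""
    ghi_1s = np.asarray(ghi_1s, dtype=float)
    # Repeat each 1-minute value 60 times to match the 1-second GHI series.
    clear_sky_1s = np.repeat(np.asarray(clear_sky_1min, dtype=float), 60)
    if clear_sky_1s.shape != ghi_1s.shape:
        raise ValueError("expected exactly 60 GHI samples per clear-sky value")
    csi = np.full_like(ghi_1s, np.nan)
    valid = clear_sky_1s > eps  # assumption: leave the CSI undefined at night
    csi[valid] = ghi_1s[valid] / clear_sky_1s[valid]
    return csi
```

For two minutes of constant GHI of 400 W/m² and clear-sky values of 800 and 500 W/m², the sketch returns a CSI of 0.5 for the first minute and 0.8 for the second.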

3.2 Forecast models

This section introduces the forecasting models that were used in Papers II–IV. In both cases, the aim is to find a model that best fits the observed data as $y_i = f(x_i) + \varepsilon$, where $f$ is some model¹ and $\varepsilon$ is white noise. Sections 3.2.1 and 3.2.2 show the distinction between statistical machine learning and statistical learning, respectively, as was highlighted in Section 2.2.3.

¹ In the simplest case, we could consider a linear model of the form $f(x_i) = a x_i + b$.


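3.2.1 Gaussian process

A univariate Gaussian probability density function (PDF) is fully parameterized by its mean $\mu$ and variance $\sigma^2$ as follows [83]

$$ f(x; \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right). \tag{3.1} $$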

The Gaussian, or normal, probability distribution plays an important role in probability theory and statistics, and can be extended into a multivariate Gaussian distribution and further extended into a Gaussian process. As such, the GP is defined as a collection of random variables in which any finite subset has a multivariate Gaussian distribution [84].

The GP is closely related to Bayesian learning, which provides a framework to deal with uncertainties through probability theory [50]. In Bayesian learning, a prior distribution represents the initial beliefs of the model and these beliefs are updated as more information becomes available, i.e., when it is possible to condition the prior distribution on the data in order to compute the posterior distribution using Bayes' theorem. In the case of the GP, the initial assumption is that two function value vectors $f(\mathbf{x})$ and $f(\mathbf{x}')$, evaluated at $\mathbf{x}$ and $\mathbf{x}'$, are jointly Gaussian distributed with mean $\boldsymbol{\mu}$ and covariance $\mathbf{K}$ as follows [84]:

$$ p\begin{pmatrix} f(\mathbf{x}) \\ f(\mathbf{x}') \end{pmatrix} \sim \mathcal{N}(\boldsymbol{\mu}, \mathbf{K}), \tag{3.2} $$

where bold lower case letters indicate vectors and bold upper case letters indicate matrices. A GP is therefore also termed a distribution over functions, in the sense that any of the functions $f$ defined by the joint Gaussian distribution is a potential model for the data $\mathbf{x}$, and observing more data eliminates functions that do not explain those observations, i.e., conditioning. This is formalized later in this subsection. In eq. (3.2), $\boldsymbol{\mu}$ consists of a mean function $m(\cdot)$ and is commonly set to zero without loss of generality [84]. The covariance matrix $\mathbf{K}$ encodes the relationship between the input variables through a covariance function $k(\cdot,\cdot)$ that depends on the distance between the input variables. The prior for the function $f$ is then defined as a Gaussian process according to

$$ f(\mathbf{x}) \sim \mathcal{GP}\left(m(\mathbf{x}), k(\mathbf{x}, \mathbf{x}')\right). \tag{3.3} $$

Figures 3.1a and 3.1b illustrate the importance of the covariance function, as it determines the smoothness of the sampled functions. The color gradient indicates the probability density. Figure 3.1a presents 10 samples from a GP prior with zero mean and squared exponential (SE) covariance function:

$$ k(r) = \sigma_f^2 \exp\left(-\frac{r^2}{2\ell^2}\right), \tag{3.4} $$

where $r = \lVert \mathbf{x}_i - \mathbf{x}_j \rVert$, and $\ell$ and $\sigma_f^2$ are hyperparameters representing the characteristic length and amplitude, respectively. Covariance functions that solely depend on $r$ are isotropic, which is typically considered to be unrealistic when the GP is used for spatio-temporal forecasting because weather phenomena are anisotropic [85]. Two methods to address this issue currently prevail in the literature, namely spatial deformations [85, 86] and adjusted covariance functions that consider anisotropy [63, 87].

[Figure 3.1: four panels (a) to (d), each plotting f(x) against x for x between −5 and 5.]

Figure 3.1. Visualizations of the prior (a and b) and posterior (c and d) distributions of GPs with SE (a and c) and Matérn (b and d) covariance functions. The color gradient indicates the density of the probability distribution and therefore the uncertainty. The grey lines represent samples drawn from the prior (a and b) and posterior (c and d) distributions, whereas the black lines represent the means of the distributions. Finally, the points in (c) and (d) represent the observation points.

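The SE covariance function is infinitely differentiable, which Stein [88] argued is typically not realistic for physical processes; he recommended the Matérn 3 or 5 covariance functions instead, i.e., the Matérn class with smoothness parameter 3/2 or 5/2 [89]

$$ k(r) = \sigma_f^2 \left(1 + \frac{\sqrt{3}\,r}{\ell}\right) \exp\left(-\frac{\sqrt{3}\,r}{\ell}\right), \tag{3.5} $$

$$ k(r) = \sigma_f^2 \left(1 + \frac{\sqrt{5}\,r}{\ell} + \frac{5r^2}{3\ell^2}\right) \exp\left(-\frac{\sqrt{5}\,r}{\ell}\right). \tag{3.6} $$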

Figure 3.1b presents 10 samples from a GP with the Matérn 3 covariance function, which is once differentiable [84, 89]. Paper II thoroughly investigates (combinations of) covariance functions for PV power, electricity consumption and net load forecasting for residential customers.
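To make the role of the covariance function concrete, the following Python sketch implements eqs. (3.4) to (3.6) and draws samples from a zero-mean GP prior on a one-dimensional grid. It is an illustration only and not the GPML implementation used in the thesis; the function names and the `jitter` term, which is added to the diagonal purely for numerical stability, are assumptions.

```python
import numpy as np

def se_kernel(r, ell=1.0, sigma_f=1.0):
    """Squared exponential covariance, eq. (3.4)."""
    return sigma_f**2 * np.exp(-r**2 / (2.0 * ell**2))

def matern3_kernel(r, ell=1.0, sigma_f=1.0):
    """Matern 3 covariance, eq. (3.5)."""
    a = np.sqrt(3.0) * r / ell
    return sigma_f**2 * (1.0 + a) * np.exp(-a)

def matern5_kernel(r, ell=1.0, sigma_f=1.0):
    """Matern 5 covariance, eq. (3.6)."""
    a = np.sqrt(5.0) * r / ell
    return sigma_f**2 * (1.0 + a + a**2 / 3.0) * np.exp(-a)

def sample_prior(kernel, x, n_samples=10, jitter=1e-6, seed=0):
    """Draw n_samples functions from a zero-mean GP prior evaluated at x."""
    x = np.asarray(x, dtype=float)
    r = np.abs(x[:, None] - x[None, :])       # pairwise distances, 1-D inputs
    K = kernel(r) + jitter * np.eye(len(x))   # jitter keeps K positive definite
    L = np.linalg.cholesky(K)
    rng = np.random.default_rng(seed)
    return (L @ rng.standard_normal((len(x), n_samples))).T
```

Calling `sample_prior(se_kernel, np.linspace(-5, 5, 50))` and `sample_prior(matern3_kernel, np.linspace(-5, 5, 50))` and plotting the rows gives smooth and rougher sample paths, respectively, in the spirit of Figures 3.1a and 3.1b.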

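The prior does not hold much actionable information. Given a training set $\{\mathbf{X}, \mathbf{y}\}$ of length $N$ and dimension $D$, where $\mathbf{X} \in \mathbb{R}^{N \times D}$ and $\mathbf{y} \in \mathbb{R}^{N}$, and a test input $\mathbf{X}_*$ of length $N_*$, where $\mathbf{X}_* \in \mathbb{R}^{N_* \times D}$, it is possible to express the joint distribution between the training set and the test input as follows:

$$ p\left(\begin{bmatrix} f(\mathbf{X}) \\ f(\mathbf{X}_*) \end{bmatrix}\right) = \mathcal{N}\left(\mathbf{0}, \begin{bmatrix} K(\mathbf{X},\mathbf{X}) & K(\mathbf{X},\mathbf{X}_*) \\ K(\mathbf{X}_*,\mathbf{X}) & K(\mathbf{X}_*,\mathbf{X}_*) \end{bmatrix}\right), \tag{3.7} $$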

where $f(\mathbf{X})$ is the vector of function values that represents the observations $\mathbf{y}$, here assumed to be noiseless, and where the mean function is assumed to be zero. In a time series setting, $\mathbf{X}$ contains $N$ observations of $D$ explanatory variables until time $t-1$, $f(\mathbf{X})$ contains the noiseless observations of the response variable, e.g., PV power, until time $t-1$ and $\mathbf{X}_*$ contains observations of the explanatory variables at time $t$, while we wish to predict function values $f(\mathbf{X}_*)$ for time $t$.² However, rather than predicting suitable functions, it is more efficient to compute the posterior distribution by conditioning the joint distribution in eq. (3.7) on the training data, which results in [84]

$$ p\left(f(\mathbf{X}_*) \mid \mathbf{X}, f(\mathbf{X}), \mathbf{X}_*\right) \sim \mathcal{N}\big(K(\mathbf{X}_*,\mathbf{X}) K(\mathbf{X},\mathbf{X})^{-1} f(\mathbf{X}),\; K(\mathbf{X}_*,\mathbf{X}_*) - K(\mathbf{X}_*,\mathbf{X}) K(\mathbf{X},\mathbf{X})^{-1} K(\mathbf{X},\mathbf{X}_*)\big). \tag{3.8} $$
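The posterior in eq. (3.8) can be evaluated directly with a few matrix operations. The sketch below assumes an SE covariance function and noiseless observations, as in the text; the helper names `se_cov` and `gp_posterior` and the small `jitter` on the diagonal are illustrative assumptions and not part of the GPML toolbox. A Cholesky factor is used instead of an explicit matrix inverse, which is the usual numerically stable choice.

```python
import numpy as np

def se_cov(A, B, ell=1.0, sigma_f=1.0):
    """Squared exponential covariance between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return sigma_f**2 * np.exp(-d2 / (2.0 * ell**2))

def gp_posterior(X, f, X_star, ell=1.0, sigma_f=1.0, jitter=1e-8):
    """Posterior mean and covariance of f(X_star) following eq. (3.8)."""
    K = se_cov(X, X, ell, sigma_f) + jitter * np.eye(len(X))
    K_s = se_cov(X_star, X, ell, sigma_f)         # K(X*, X)
    K_ss = se_cov(X_star, X_star, ell, sigma_f)   # K(X*, X*)
    L = np.linalg.cholesky(K)                     # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, f))  # K^{-1} f
    V = np.linalg.solve(L, K_s.T)                 # L^{-1} K(X, X*)
    mean = K_s @ alpha
    cov = K_ss - V.T @ V                          # K** - K*. K^{-1} K.*
    return mean, cov
```

At the training inputs the posterior mean reproduces the observations and the variance collapses towards zero, whereas far away from the data the posterior reverts to the prior, which is the behavior visible in Figures 3.1c and 3.1d.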

Figures 3.1c and 3.1d present the posterior distribution after having observed 5 data points. Note the rapid increase in uncertainty in between the observations, which is why it is important to learn the hyperparameters $\ell$ and $\sigma_f^2$ by maximizing the log-likelihood of the function values given the training data:

$$ \log p\left(f(\mathbf{X}) \mid \mathbf{X}, \theta\right) = -\frac{1}{2} f(\mathbf{X})^{\top} K(\mathbf{X},\mathbf{X})^{-1} f(\mathbf{X}) - \frac{1}{2} \log \lvert K(\mathbf{X},\mathbf{X}) \rvert - \frac{N}{2} \log 2\pi, \tag{3.9} $$

where $\theta$ contains the hyperparameters.

As mentioned in Section 2.4, a GP has the disadvantage that it is computationally expensive to invert the matrix $\mathbf{K}$, an $\mathcal{O}(N^3)$ operation that is required for learning the hyperparameters and for prediction. Paper II therefore compares a static GP to a dynamic GP where the hyperparameters are regularly updated. Figure 3.2a presents the cross validation procedure to select the appropriate covariance function, further explained in Section 3.4. Figure 3.2b presents the flowchart that highlights the difference between the static and dynamic GP, in which $M$ is the length of the training set of the dynamic GP such that $N \gg M$ and where $N_*$ is the length of the test set. This is motivated by Salcedo-Sanz et al. [90], who noted that the GP showed excellent performance with a limited amount of training data. In this thesis, the GPML package [84] in Matlab is used to implement the GP.
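As an illustration of where the cubic cost arises, the log-likelihood of eq. (3.9) can be evaluated as in the Python sketch below. The function name is hypothetical and the code is not taken from GPML; it assumes that the covariance matrix has already been built for a given set of hyperparameters. The Cholesky factorization is the $\mathcal{O}(N^3)$ step, and it has to be repeated for every candidate set of hyperparameters that an optimizer proposes.

```python
import numpy as np

def log_marginal_likelihood(K, f):
    """Eq. (3.9), evaluated through the Cholesky factor K = L L^T."""
    f = np.asarray(f, dtype=float)
    N = len(f)
    L = np.linalg.cholesky(K)                             # O(N^3) step
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, f))   # K^{-1} f
    log_det = 2.0 * np.sum(np.log(np.diag(L)))            # log |K|
    return -0.5 * f @ alpha - 0.5 * log_det - 0.5 * N * np.log(2.0 * np.pi)
```

Shortening the training set from N to M observations, as the dynamic GP does, reduces the cost of each such evaluation accordingly.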

² Equally, we could say that $\mathbf{X}$ contains observations of the explanatory variables until time $t$ and we wish to predict $f(\mathbf{X}_*)$ for time $t+1$.


[Figure 3.2: two flowcharts. (a) Training/validation data → initial guess of hyperparameters → train and validate as in Fig. 3.3 → assess performance → more covariance functions? If yes, continue with the next covariance function; if no, select the best covariance function and exit. (b) Test data and training data → static or dynamic? Static (S): learn hyperparameters on the training set of length N → forecast on the test set → assess performance and exit. Dynamic (D): learn hyperparameters on M observations of the test set, M << N → forecast on M + 1 until M + N* → end of test set? If no, move M forward with N* and repeat; if yes, assess performance and exit.]

Figure 3.2. In (a), a flowchart that represents the cross validation procedure used in Paper II, which will be further explained in Section 3.4. In (b), a flowchart that represents the functioning of the static and dynamic GPs explored and applied in Paper II and Paper III.
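One possible reading of the dynamic branch in Figure 3.2b is a window of M observations that slides forward by N* observations after every forecast. The Python sketch below only generates the index ranges for such a scheme; the function name is hypothetical and the sliding (rather than expanding) window is an assumption about the flowchart, not a statement of how Paper II implements it.

```python
def dynamic_windows(n_test, M, N_star):
    """Yield (training range, forecast range) pairs over a test set of
    length n_test: train on M observations, forecast the next N_star,
    then move the window forward by N_star."""
    start = 0
    while start + M < n_test:
        stop = min(start + M + N_star, n_test)
        yield range(start, start + M), range(start + M, stop)
        start += N_star
```

With `n_test=10`, `M=4` and `N_star=3`, the sketch yields two windows: training on indices 0 to 3 and forecasting 4 to 6, then training on 3 to 6 and forecasting 7 to 9.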

3.2.2 Quantile regression

Despite the advantages of the GP, its output is a Gaussian distribution, and it has been argued that this is too restrictive for physical processes, see, e.g., [72, 91]. Paper III therefore employs an additional method, namely quantile regression (QR). It was proposed by Koenker and Bassett [92] and aims to estimate the conditional quantiles of the distribution by minimizing a sum of absolute errors in which positive and negative errors are weighted asymmetrically. Unlike the GP, QR assumes a linear relationship between the explanatory variables $\mathbf{X}$ and output $y$:

$$ y_\tau = \boldsymbol{\beta}_\tau \mathbf{X} + \varepsilon, \tag{3.10} $$

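where $\tau$ represents the quantile probability ($0 < \tau < 1$), $\boldsymbol{\beta}$ the parameters and $\varepsilon$ the model error. In order to learn the parameters $\boldsymbol{\beta}$, the following minimization problem is solved:

$$ \boldsymbol{\beta}_\tau = \underset{\boldsymbol{\beta}}{\arg\min} \sum_{i=1}^{N} \rho_\tau\left(y_i - \boldsymbol{\beta}\mathbf{X}_i\right), \tag{3.11} $$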

where $\rho_\tau$ is the asymmetric weight function, also known as the pinball loss function, which is defined as [92]

$$ \rho_\tau(u) = \begin{cases} \tau u & \text{if } u \geq 0 \\ (\tau - 1)u & \text{if } u < 0. \end{cases} \tag{3.12} $$
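The asymmetric weighting in eq. (3.12) is easy to verify numerically. The Python sketch below implements the pinball loss and, as a toy check on hypothetical data, searches for the constant that minimizes the summed loss for $\tau = 0.8$. It uses an intercept-only model and a brute-force search purely for illustration; the thesis itself relies on the quantreg package in R, and this code does not reproduce that package.

```python
import numpy as np

def pinball_loss(u, tau):
    """Pinball loss of eq. (3.12), applied element-wise to residuals u = y - q."""
    u = np.asarray(u, dtype=float)
    return np.where(u >= 0, tau * u, (tau - 1.0) * u)

# Toy check: for an intercept-only model, the constant that minimizes the
# summed pinball loss is an empirical tau-quantile of the observations.
y = np.arange(1.0, 10.0)                      # hypothetical data: 1, 2, ..., 9
candidates = y                                # search over the observed values
losses = [pinball_loss(y - c, 0.8).sum() for c in candidates]
best = candidates[int(np.argmin(losses))]
```

For these data the minimizer is 8, the value that leaves roughly 80% of the observations below it, which shows why minimizing eq. (3.11) yields a quantile rather than the mean.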


As hinted by eq. (3.11), each quantile is estimated independently, which implies that quantile crossing could occur and this violates the monotonicity property [93]

$$ q_{\tau_1} \leq q_{\tau_2} \quad \forall\, \tau_1, \tau_2 \ \text{subject to}\ \tau_1 \leq \tau_2. \tag{3.13} $$

This is circumvented by monotone rearranging when necessary.

In this thesis, the quantreg package in R [94] is used to implement the QR model. QR is a tried and tested method in forecasting, see, e.g., [95, 96, 97, 98]. It should, however, be noted that more advanced techniques exist, such as QR neural networks (QRNNs) [99] and QR forests (QRFs) [100]. These have been used less often in the literature (see, e.g., [101, 102, 103]) but have shown promising results.
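One common form of the monotone rearranging mentioned above is to sort the estimated quantiles at every forecast instance, so that eq. (3.13) holds by construction. The sketch below assumes a matrix with one row per forecast and one column per quantile level, ordered by increasing $\tau$; the function name is hypothetical and the thesis does not state that this exact procedure was used.

```python
import numpy as np

def rearrange_quantiles(q):
    """Sort each row of q (n_forecasts x n_quantiles, columns ordered by
    increasing tau) so that the quantile forecasts no longer cross."""
    return np.sort(np.asarray(q, dtype=float), axis=1)
```

A row such as `[1.0, 3.0, 2.0]`, in which the second and third quantiles cross, becomes `[1.0, 2.0, 3.0]`, while rows that already satisfy the monotonicity property are left unchanged.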

3.3 Overview of approaches

This section introduces the methodologies used in Paper III (Section 3.3.1) and Paper IV (Section 3.3.2). In the former, the Ausgrid data set [26] was used, whereas the data set recorded by a network of pyranometers on Oahu, Hawaii [53], was used in the latter study. The methodologies used in Paper II are straightforward and require no further clarification.

3.3.1 The behavior of predictive distributions

The purpose of Paper III was to investigate the effect that aggregation, penetration and seasonal variations have on predictive distributions. The following paragraphs describe how the data were prepared and how the models were set up to achieve this. It should be noted that the study on penetration is omitted in this thesis for brevity and the interested reader is referred to Paper III.

Aggregation

In order to investigate what effect aggregation has on the predictive distributions, aggregation was considered on two scales. First, customers were aggregated in increments of 1 to a total of 30 customers in order to get insight into the small scale variations. Second, customers were aggregated in increments of 30 to a total of 240 to assess whether further increasing the level of aggregation could be helpful in improving the accuracy. In both cases, the process was repeated 5 times using random samples of customers in order to reduce noise from abnormal customer behavior. These steps were carried out during spring and winter.


Seasons

It is well understood that highly variable time series are more challenging to forecast than time series that are less variable. Few studies have, however, assessed this in the case of probabilistic forecasting. In Paper III, the data were divided by season and the two seasons with highest and lowest variability were used as test cases. Winter represented the case in which PV power variability was relatively low, as opposed to the relatively high variability in electricity consumption during this season. Conversely, spring represented the case in which electricity consumption variability was relatively low, whereas the variability of PV power production was relatively high.

3.3.2 Gaussian process ensembles

The probability distribution of the CSI can be approximated as a multiple-state model using multiple Gaussian densities [82, 104]. Furthermore, it has been shown in the literature that ensembles of forecast models improve the accuracy [105, 106, 107]. In Paper IV, the CSI data were therefore divided such that the resulting two sets were approximately normally distributed, which could further enhance the accuracy of the GP predictions. Both GPs were used on the test set, after which the forecasts were combined as a bimodal Gaussian mixture model (GMM) [49]:

$$ f(x) = (1-\pi) \cdot \mathcal{N}\left(x; \mu_1, \sigma_1^2\right) + \pi \cdot \mathcal{N}\left(x; \mu_2, \sigma_2^2\right), \tag{3.14} $$

where $\pi \in [0,1]$ is the mixing coefficient, and $\mu_1$, $\sigma_1^2$, $\mu_2$ and $\sigma_2^2$ are the forecast means and variances of the individual GPs.

In order to fit the mixing coefficient $\pi$, the expectation-maximization (EM) algorithm was used. The algorithm is initialized with $\pi = 0.5$ and the expectation (eq. (3.15)) and maximization (eq. (3.16)) steps are iterated until the log-likelihood converges [49]:

$$ \gamma_j = \frac{\pi \cdot \mathcal{N}\left(x; \mu_2, \sigma_2^2\right)}{(1-\pi) \cdot \mathcal{N}\left(x; \mu_1, \sigma_1^2\right) + \pi \cdot \mathcal{N}\left(x; \mu_2, \sigma_2^2\right)}, \tag{3.15} $$

$$ \pi = \frac{1}{M} \sum_{j=1}^{M} \gamma_j. \tag{3.16} $$
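The two EM steps can be sketched as follows. The Python code assumes that only the mixing coefficient is fitted, with the component means and variances held fixed, as in eqs. (3.15) and (3.16); the function names, the convergence tolerance and the iteration cap are illustrative assumptions. The means and variances may be scalars or arrays with one entry per observation, which would correspond to one GP forecast per time step.

```python
import numpy as np

def normal_pdf(x, mu, sigma2):
    """Gaussian density N(x; mu, sigma2)."""
    return np.exp(-0.5 * (x - mu) ** 2 / sigma2) / np.sqrt(2.0 * np.pi * sigma2)

def fit_mixing_coefficient(x, mu1, s1, mu2, s2, pi=0.5, tol=1e-10, max_iter=1000):
    """EM for the mixing coefficient only; the component means (mu1, mu2)
    and variances (s1, s2) stay fixed."""
    x = np.asarray(x, dtype=float)
    p1 = normal_pdf(x, mu1, s1)
    p2 = normal_pdf(x, mu2, s2)
    prev = -np.inf
    for _ in range(max_iter):
        mix = (1.0 - pi) * p1 + pi * p2
        loglik = np.sum(np.log(mix))
        if loglik - prev < tol:          # log-likelihood has converged
            break
        prev = loglik
        gamma = pi * p2 / mix            # E-step, eq. (3.15)
        pi = gamma.mean()                # M-step, eq. (3.16)
    return pi
```

On synthetic data in which 30% of the observations come from the second component, the fitted value ends up close to 0.3, provided the two components are reasonably well separated.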

Another ensemble of GPs can be achieved through convolution. Provided the two GPs are independent, the convolution of the Gaussian random variables $X$ and $Y$ with mixing coefficient $\alpha$ amounts to [108]

$$ \alpha X + (1-\alpha) Y \sim \mathcal{N}\left(\alpha\mu_X + (1-\alpha)\mu_Y,\; \alpha^2\sigma_X^2 + (1-\alpha)^2\sigma_Y^2\right). \tag{3.17} $$
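Eq. (3.17) translates into a two-line computation. The sketch below uses a hypothetical function name and simply returns the mean and variance of the combined forecast, assuming independence as stated above.

```python
def combine_gaussians(mu_x, var_x, mu_y, var_y, alpha):
    """Mean and variance of alpha*X + (1 - alpha)*Y for independent
    Gaussian random variables X and Y, following eq. (3.17)."""
    mean = alpha * mu_x + (1.0 - alpha) * mu_y
    var = alpha**2 * var_x + (1.0 - alpha) ** 2 * var_y
    return mean, var
```

For example, combining forecasts with means 1 and 3 and a common variance of 4 with $\alpha = 0.5$ gives a mean of 2 and a variance of 2. Unlike the mixture in eq. (3.14), the result is always unimodal, and for equal weights and equal variances it is sharper than either member.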


[Figure 3.3: two panels (a) and (b), each showing training and validation data points along a time axis, with one row per fold (K-fold).]

Figure 3.3. Time series cross validation in which each row represents a fold and each dot a data point. Using this construction, the natural ordering of the data remains intact [109]. In (a), the procedure for one step ahead forecasts and in (b) the procedure for two step ahead forecasts. From: Paper II.

3.4 Cross validation, performance metrics and benchmarks

3.4.1 Cross validation for time series

Cross validation is a process that concerns model selection and tuning on data separate from the test data [49]. Typically, one divides the training set into $K$ folds, trains the models on $K-1$ folds and validates on the remaining fold, which is repeated $K$ times. The prediction errors are then averaged over the folds and the best performing model is selected. In the case of time series, each month in a year could be envisaged as a fold. Following the cross validation procedure, we can remove one month, e.g., June, train a model on the remaining months and validate it on June. This process can then be repeated 11 more times until each month has been used as a validation set. Finally, the prediction errors are summed and divided by 12 to obtain the average error.

Due to serial correlation in time series it may not be preferable to simply divide the data [109]. Therefore, time series cross validation continues forward in time and keeps the time series intact, as depicted in Figures 3.3a and 3.3b. The predictions in the validation set are assessed by means of a numeric score and averaged over all validation sets.
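The forward-moving folds can be generated as in the Python sketch below. It assumes an expanding training window and a single validation point per fold, which is one way to read Figure 3.3; the function name and the `n_initial` argument (the size of the first training window) are hypothetical.

```python
def time_series_folds(n_obs, n_initial, horizon=1):
    """Yield (training indices, validation index) pairs that move forward
    in time; the validation point lies `horizon` steps after the last
    training observation."""
    for end in range(n_initial, n_obs - horizon + 1):
        yield list(range(end)), end + horizon - 1
```

With five observations and an initial window of three, one step ahead validation yields the folds ([0, 1, 2], 3) and ([0, 1, 2, 3], 4), whereas two step ahead validation yields the single fold ([0, 1, 2], 4). In every fold, the training data precede the validation point, so the natural ordering of the series is preserved.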

3.4.2 Probabilistic metrics

Probabilistic performance metrics can be divided into quantitative and qualitative metrics. The former pertains to metrics that condense the performance into a single number, whereas the latter commonly are visualizations of the performance. This subsection is divided according to this distinction.

Quantitative metrics

Three quantitative metrics are used throughout this thesis, namely the prediction interval coverage probability (PICP), the prediction interval normalized average width (PINAW) and the continuous ranked probability score (CRPS). The PICP is directly related to the nominal coverage level $\eta$, as introduced in Section 2.3, and is defined as [110]:

$$ \mathrm{PICP}(\eta) = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\left\{ y_i \in \left[ q_{i,\tau=\frac{1-\eta}{2}},\; q_{i,\tau=1-\frac{1-\eta}{2}} \right] \right\}, \tag{3.18} $$

where $\mathbf{1}$ is the indicator function that is 1 when the observation $y_i$ lies within the PI and 0 otherwise. As mentioned in Section 2.3, the PIs are considered reliable when they cover roughly $\eta \cdot 100\%$ of the observations [95]. Figure 3.4a visualizes reliable PIs since 6 out of the 10 observations are covered by the PIs with a nominal coverage level of 0.6, i.e., $\mathrm{PICP}(0.6) = 0.6$.

[Figure 3.4: two panels. (a) Observations, the mean forecast and the PI with $\eta = 0.6$, plotted as y against time. (b) The predictive CDF F(x) and the indicator function I(y ≤ x), plotted against x, with the observation y marked.]

Figure 3.4. In (a), a visualization of a reliable probabilistic forecast since 6 out of the 10 observations are covered by the PIs with nominal coverage level 0.6. In (b), a visualization of the CRPS, where the area is a function of the difference between the forecast and the observation, and the sharpness of the forecast.

It is, however, straightforward to produce PIs with $\mathrm{PICP} \approx \eta$, but these might not convey actionable information. It is therefore important to simultaneously assess the sharpness of the PIs via the PINAW, which is defined as [110]:

$$ \mathrm{PINAW}(\eta) = \frac{1}{NR} \sum_{i=1}^{N} \left( q_{i,\tau=1-\frac{1-\eta}{2}} - q_{i,\tau=\frac{1-\eta}{2}} \right), \tag{3.19} $$

where $R$ is a normalization constant, typically the difference between the maximum and minimum value of the test set [110].
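Both interval metrics follow directly from the lower and upper quantile forecasts. The Python sketch below implements eqs. (3.18) and (3.19); the function names are hypothetical, and the arrays `lower` and `upper` are assumed to hold the quantiles at $\tau = (1-\eta)/2$ and $\tau = 1-(1-\eta)/2$ for every forecast instance.

```python
import numpy as np

def picp(y, lower, upper):
    """Share of observations that fall inside their prediction interval,
    following eq. (3.18)."""
    y, lower, upper = (np.asarray(a, dtype=float) for a in (y, lower, upper))
    return np.mean((y >= lower) & (y <= upper))

def pinaw(lower, upper, R):
    """Average interval width normalized by the constant R, eq. (3.19)."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return np.mean(upper - lower) / R
```

As a small example, for observations 1 to 5 with intervals [0, 2], [0, 1], [4, 5], [3, 5] and [6, 7], only the first and fourth observations are covered, so the PICP is 0.4; with R = 5 − 1 = 4 the PINAW is 0.35.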

The choice of nominal coverage level $\eta$ depends on the application of the forecast in cases where an entire probability distribution is unnecessary. One example is that of wind energy trading on the Nord Pool electricity market, where only one PI is required for revenue-maximization strategies [96]. Pinson et al. [111] pointed out that selecting $\eta = 0.50$ leads to the uncomfortable situation where the PIs cover the observations as often as they do not, whereas setting $\eta = 0.90$ likely produces PIs that are too wide. Chatfield [112] indicated that $0.75 < \eta < 0.85$ would be a good compromise. A nominal coverage level $\eta = 0.80$ is used throughout this thesis when calculating PICP and PINAW, except in Paper III, where multiple levels were used.

The CRPS assesses both sharpness and reliability simultaneously, and reduces to the MAE in the case of a deterministic forecast [2]. It therefore has the same units as the dependent variable and can be used to rank various models during cross validation [2]. The CRPS can be formulated as [2, 113]:

$$ \mathrm{CRPS}(F_i, y_i) = \int_{-\infty}^{\infty} \left( F_i(x) - \mathbf{1}\{y_i \leq x\} \right)^2 \, dx, \tag{3.20} $$

$$ \mathrm{CRPS}(F_i, y_i) = \mathbb{E}_F \lvert Y_i - y_i \rvert - \frac{1}{2} \mathbb{E}_F \lvert Y_i - Y_i' \rvert, \tag{3.21} $$

where $Y_i$ and $Y_i'$ are independent samples drawn from the predicted cumulative distribution function (CDF) $F$. Subsequently, the CRPS can be averaged over the test period as follows

$$ \mathrm{CRPS} = \frac{1}{N} \sum_{i=1}^{N} \mathrm{CRPS}(F_i, y_i). \tag{3.22} $$
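For a single forecast, eq. (3.21) can be estimated from samples of the predictive distribution, and for a Gaussian predictive distribution a well-known closed form exists. The Python sketch below shows both; it is not the SpecsVerification implementation, the function names are hypothetical, and the sample-based estimator averages over all pairs of samples (including identical pairs), which is a common but slightly biased choice.

```python
import numpy as np
from math import erf, exp, pi, sqrt

def crps_samples(samples, y):
    """Sample-based estimate of eq. (3.21) from draws of the predictive
    distribution."""
    s = np.asarray(samples, dtype=float)
    term1 = np.mean(np.abs(s - y))                     # E|Y - y|
    term2 = np.mean(np.abs(s[:, None] - s[None, :]))   # E|Y - Y'|
    return term1 - 0.5 * term2

def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of a Gaussian predictive distribution N(mu, sigma^2)."""
    z = (y - mu) / sigma
    cdf = 0.5 * (1.0 + erf(z / sqrt(2.0)))
    pdf = exp(-0.5 * z * z) / sqrt(2.0 * pi)
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / sqrt(pi))
```

When all samples are identical, i.e., a deterministic forecast, the second term vanishes and the score equals the absolute error, which illustrates why the CRPS reduces to the MAE in that case.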

Figure 3.4b visualizes the CRPS and shows that it is a function of both the difference between the forecast and observation, and the sharpness of the forecast. In order to calculate the CRPS, we use the package SpecsVerification [114]. As a final metric, the continuous ranked probability skill score (CRPSS) is employed in Paper IV, which reveals the performance relative to the persistence ensemble (PeEn), introduced in Section 3.4.3, and is defined as:

$$ \mathrm{CRPSS} = 100 \cdot \left( 1 - \frac{\mathrm{CRPS}_m}{\mathrm{CRPS}_{\mathrm{PeEn}}} \right), \tag{3.23} $$

where $m$ indicates the model.

Qualitative metrics

As mentioned in Section 2.1.5, the aim is to maximize the sharpness of probabilistic forecasts, subject to calibration [2]. The reliability diagram provides a clear visual tool to inspect the reliability of a probabilistic forecast. Herein, the observed frequency of overpredictions of quantile $\tau$ is plotted versus the nominal probability of quantile $\tau$. This is defined as [57]

$$ a_\tau = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\{ y_i < q_{i,\tau} \} \quad \forall\, \tau. \tag{3.24} $$

A forecast is said to be reliable when the resulting line lies close to the diagonal. Figures 3.5a to 3.5h present four example forecasts and their reliability diagrams that represent extreme cases. In Figure 3.5a, the forecast means and observations are identical. Although this is desirable in the case of deterministic forecasting, such a “perfect” forecast poses issues in the case of a probabilistic forecast, since a decision maker should be able to rely on the fact that a 90% probability quantile overestimates roughly 90% of the observations. Figure 3.5b shows the consequence of a perfect forecast in terms of the reliability, which is identical to what can be expected from a perfect deterministic forecast. In contrast, Figure 3.5c represents the case in which the predictive distribution equally over- and underestimates the observations, which results in the horizontal reliability diagram shown in Figure 3.5d. In this case, it is impossible to use such a forecast since it is equally likely that a quantile, and a PI by extension, over- or underestimates. Figures 3.5e and 3.5g represent forecasts that are systematically negatively and positively biased, respectively. The reliability diagrams in Figures 3.5f and 3.5h are intuitively 0 and 1 in the case of negatively and positively biased probabilistic forecasts, respectively.

[Figure 3.5: eight panels. Panels (a), (c), (e) and (g) plot y against time, showing the observations and forecast intervals (legend: 0.25, 0.50, 0.75). Panels (b), (d), (f) and (h) plot the observed frequency against the nominal probability, showing the reliability curve and the ideal diagonal.]

Figure 3.5. Example forecasts and the resulting reliability diagrams.

Deviations from the diagonal can, despite reliable forecasts, occur due to a test set of limited length [57]. It is therefore important to indicate the likelihood of the frequencies simultaneously, which can be done through consistency resampling [115]. In consistency resampling, surrogate observations $y_t$ are produced by assuming that the predictions are reliable and therefore $F_t(Y_t) \sim \mathcal{U}[0,1]$, where $\mathcal{U}$ is the uniform distribution. It is then possible to produce surrogate observations via $F_t^{-1}(u_i) = y_{t,i}$, where $u_i$ is a draw from $\mathcal{U}[0,1]$. This process is repeated $N_b$ times, in which frequency $a_{j,\tau}$ is calculated for every repetition $j$ using eq. (3.24). Subsequently, the range of $[a_{1,\tau}, \ldots, a_{N_b,\tau}]$ represents the likelihood of $a_\tau$, which can visually be represented as error bars and is meant to show random variability in the data set. The forecast is said to be reliable when the reliability curve falls within the consistency bars.
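The observed frequencies of eq. (3.24) and the consistency bars described above can be sketched in Python as follows. The sketch relies on a simplification that should be read as an assumption: when the predictive CDF is strictly increasing, a surrogate observation $F_t^{-1}(u)$ lies below the quantile forecast $q_{t,\tau} = F_t^{-1}(\tau)$ exactly when $u < \tau$, so the resampling can be carried out on the uniform draws themselves. The function names, the number of repetitions and the use of a central 90% interval (instead of the full range) are illustrative choices.

```python
import numpy as np

def observed_frequencies(y, q):
    """Eq. (3.24): share of observations below each quantile forecast.
    y has shape (N,), q has shape (N, n_quantiles)."""
    y = np.asarray(y, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.mean(y[:, None] < q, axis=0)

def consistency_bars(taus, n_obs, n_boot=1000, level=0.9, seed=0):
    """Interval of frequencies that a reliable forecast would produce on a
    test set of n_obs observations, for each nominal probability in taus."""
    rng = np.random.default_rng(seed)
    taus = np.asarray(taus, dtype=float)
    u = rng.random((n_boot, n_obs, 1))   # PIT values of surrogate observations
    a = np.mean(u < taus, axis=1)        # frequencies, shape (n_boot, n_quantiles)
    tail = (1.0 - level) / 2.0
    lo, hi = np.quantile(a, [tail, 1.0 - tail], axis=0)
    return lo, hi
```

The bars widen as the test set becomes shorter, which makes explicit how much deviation from the diagonal can be attributed to the limited length of the test set alone.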

3.4.3 Persistence ensemble

The persistence benchmark assumes that the current state persists into the next time step, such that $y_{t+h} = y_t$. It has been applied in many studies because it is challenging to outperform, particularly on short forecast horizons, see, e.g., [52, 116, 117]. The probabilistic counterpart of the persistence is used in this thesis: the persistence ensemble (PeEn) [118]. The 9 preceding observations are ranked in ascending order to construct percentiles 10 to 90, see, e.g., [119].
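The construction of the PeEn is simple enough to state in a few lines of Python. The sketch below is an illustration under the assumption that the k-th smallest of the 9 preceding observations is used as the quantile forecast for $\tau = k/10$; the function name and the `n_members` argument are hypothetical.

```python
import numpy as np

def persistence_ensemble(history, n_members=9):
    """Rank the last n_members observations in ascending order and use them
    as quantile forecasts for tau = 1/(n_members+1), ..., n_members/(n_members+1)."""
    history = np.asarray(history, dtype=float)
    if len(history) < n_members:
        raise ValueError("not enough preceding observations")
    members = np.sort(history[-n_members:])
    taus = np.arange(1, n_members + 1) / (n_members + 1)
    return taus, members
```

With the default of 9 members, the returned probabilities are 0.1, 0.2, ..., 0.9, so the smallest of the preceding observations serves as the 10th percentile and the largest as the 90th percentile.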

3.4.4 ARIMA

The auto-regressive integrated moving average (ARIMA) model is a benchmark that was first proposed by George Box and Gwilym Jenkins in 1970. It is a linear time series model that is widely used in financial engineering and energy engineering, see Box et al. [120] for details. The ARIMA model can be expressed as [54]:

$$ \phi(B)(1-B)^d x_t = \theta(B)\varepsilon_t, \tag{3.25} $$

where $B$ is the backward shift operator, from which it follows that $B x_t = x_{t-1}$, and $d$ is the order of differencing. Further, $\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \ldots - \phi_p B^p$ describes the AR process of order $p$ with $\phi_1, \ldots, \phi_p$ as parameters and $\theta(B) = 1 + \theta_1 B + \theta_2 B^2 + \ldots + \theta_q B^q$ the MA process of order $q$ with $\theta_1, \ldots, \theta_q$ as parameters. The R package forecast [121] is used in this thesis to model the univariate time series with the ARIMA, which assumes the variance of the residuals and one-step ahead forecast to be identical [122].
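As a worked example of the operator notation in eq. (3.25), consider the case $p = d = q = 1$, chosen here purely for illustration. Expanding the product of the AR and differencing polynomials and applying the backward shift operator gives an explicit recursion for $x_t$:

```latex
\begin{align*}
(1 - \phi_1 B)(1 - B)\, x_t &= (1 + \theta_1 B)\, \varepsilon_t \\
\left(1 - (1 + \phi_1) B + \phi_1 B^2\right) x_t &= \varepsilon_t + \theta_1 \varepsilon_{t-1} \\
x_t &= (1 + \phi_1)\, x_{t-1} - \phi_1 x_{t-2} + \varepsilon_t + \theta_1 \varepsilon_{t-1}
\end{align*}
```

The last line shows that the current value depends linearly on the two preceding values and on the current and previous noise terms, which is why the model is referred to as a linear time series model.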


4. Results

This chapter summarizes the results of the appended papers. The sections in this chapter are ordered according to the papers. Section 4.1 summarizes the results of Paper II and encompasses purely temporal forecasting of PV power and electricity consumption of a residential building. Section 4.2 summarizes the results of Paper III and Paper IV, in which various levels of aggregation were examined and information from neighboring locations was incorporated.

4.1 Temporal forecasting

This section summarizes the results first presented in Paper II. The forecasts herein only utilize endogenous information, that is to say that no information other than the time series of interest has been used. This section is further divided into the three topics of this thesis, namely PV power (Section 4.1.1), electricity consumption (Section 4.1.2) and net load (Section 4.1.3) forecasting. It is important to note that Paper II does not contain visual metrics such as the reliability diagram, but these will be presented here, in turn leading to new insights. Furthermore, it is important to point out that the horizontal axes for PICP($\eta$), PINAW($\eta$) and the reliability diagram are not identical: they represent the nominal coverage level $\eta$ in the case of the former two and the nominal probability in the case of the latter. They are, however, on the same scale and are therefore plotted as such.

4.1.1 Solar power forecasting at the residential level

Figure 4.1 presents the PICP($\eta$), the reliability diagram and the PINAW($\eta$) for the PV power forecasts, which, as mentioned, were not included in Paper II. In the case of the PV power forecasts, nighttime values have been excluded since the abundance of zero values tends to distort the results. The two top facets should be close to the diagonal for the forecast to be reliable, whereas the bottom facet indicates the sharpness of various PIs.

[Figure 4.1: three facets (PICP, Reliability, PINAW) plotted against the nominal level (0.1 to 0.9); vertical axis: Score; legend: Dynamic GP, Static GP, ARIMA.]

Figure 4.1. Results of the PV power forecasts from Paper II in terms of PICP(η), the reliability diagram and PINAW(η). In this thesis, nighttime values have been removed from the PV power forecasts.

The top two facets indicate modest unreliability of the GP models. At lower nominal coverage levels, the GPs tend to cover too many of the observations, which effectively implies that the PIs could be sharper. The reliability diagram indicates this as well, particularly the quantile forecasts with nominal probabilities of 0.3 and 0.4. These quantiles underpredict more often, which is why the reliability lies below the diagonal, and this results in high PICP at nominal coverage levels 0.2 (q_{τ=0.60} − q_{τ=0.40}) and 0.3 (q_{τ=0.65} − q_{τ=0.35}). It is also interesting to note that the dynamic GP slightly outperforms the static GP in terms of reliability, but that performance is identical in terms of sharpness. Finally, it is clear that the ARIMA model has poor reliability despite excellent mean prediction performance, see, e.g., Figure 6d in Paper II. A dynamic approach similar to that of the GP could improve upon the displayed reliability.
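The link between the reliability diagram and the PICP described above can be illustrated with a small sketch. It assumes central prediction intervals, so that a nominal coverage level η is bounded by the quantiles at 0.5 − η/2 and 0.5 + η/2, which is consistent with the quantile pairs quoted above. The function names and the toy numbers are hypothetical.

```python
import numpy as np

def central_interval_levels(eta):
    """Quantile levels (tau_low, tau_high) that bound a central PI with nominal coverage eta."""
    return 0.5 - eta / 2.0, 0.5 + eta / 2.0

def observed_frequency(y, q):
    """Share of observations at or below a quantile forecast (ordinate of the reliability diagram)."""
    return float(np.mean(np.asarray(y) <= np.asarray(q)))

tau_low, tau_high = central_interval_levels(0.2)   # the 0.4 and 0.6 quantiles

# Toy data (hypothetical): ten observations and constant quantile forecasts.
y = np.arange(1.0, 11.0)
q_low = np.full_like(y, 2.5)    # issued as the 0.4 quantile, but placed too low
q_high = np.full_like(y, 6.5)   # issued as the 0.6 quantile, well placed

freq_low = observed_frequency(y, q_low)     # falls below the nominal 0.4
freq_high = observed_frequency(y, q_high)   # matches the nominal 0.6
coverage = float(np.mean((y >= q_low) & (y <= q_high)))  # exceeds the nominal 0.2
```

In this toy case the lower quantile lies below the diagonal of the reliability diagram, and the interval between the two quantiles consequently covers more observations than its nominal level suggests.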

4.1.2 Electricity consumption forecasting at the residential level

Figure 4.2 presents the PICP(η), the reliability diagram and the PINAW(η) for the electricity consumption forecasts. Nighttime values have not been removed here and the upper two facets are therefore smooth in case of the GP models. It can be seen from the top two facets, and the middle facet in particular, that the upper quantiles of the GP models tend to overestimate the electricity consumption, whereas the lower quantiles underestimate it. This indicates that the model is underconfident, since the predictive distributions are wider than necessary to cover the observations, which in turn is expressed by the top facet, where the PICP lies above the diagonal. The forecasts are also slightly biased, since q_{τ=0.4} rather than q_{τ=0.5} marks the turning point from which the quantiles overestimate the electricity consumption more often than they ought to. As was the case for PV power, it can be seen that the dynamic GP slightly outperforms the static GP. The results of the ARIMA model indicate poor reliability since the PIs are significantly too wide.

[Figure 4.2: three facets (PICP, Reliability, PINAW) plotted against the nominal level (0.1 to 0.9); vertical axis: Score; legend: Dynamic GP, Static GP, ARIMA.]

Figure 4.2. Results of the electricity consumption forecasts from Paper II in terms of PICP(η), the reliability diagram and PINAW(η).

4.1.3 Net load forecasting at the residential level

Figure 4.3 presents the PICP(η), the reliability diagram and the PINAW(η) results for the net load forecasts. The results of the net load forecasts by the GP models follow a similar trend as the results of the electricity consumption forecasts. However, two notable differences can be observed. Firstly, the reliability has improved, i.e., the PICP and the reliability diagram are closer to the diagonal. Secondly, the sharpness, presented in the bottom facet, has improved. Since PICP and PINAW counteract each other (increasing the sharpness typically results in lower coverage), these observations imply that the inclusion of PV power in the electricity consumption substantially improves the probabilistic forecasts, at least in this case. As regards the ARIMA forecast results, they indicate that combining PV power and electricity consumption positively affects the accuracy as well, although it still does not produce reliable forecasts.

[Figure 4.3: three facets (PICP, Reliability, PINAW) plotted against the nominal level (0.1 to 0.9); vertical axis: Score; legend: Dynamic GP, Static GP, ARIMA.]

Figure 4.3. Results of the net load forecasts from Paper II in terms of PICP(η), the reliability diagram and PINAW(η).

4.2 Spatio-temporal forecasting

This section summarizes the results of Paper III and Paper IV. In the former, time series of the PV power production, electricity consumption and net load of the customers in the Ausgrid data set [26] were aggregated to investigate the effect that has on the predictive distribution. Furthermore, the share of PV power was incremented to assess the effect of increasing penetration on the predictive distribution, although these results are omitted here for brevity and the interested reader is instead referred to Paper III. In Paper IV, exogenous information, i.e., information from correlated locations or variables, was included to enhance the accuracy and investigate the potential of an ensemble of GPs on a pyranometer network on Oahu, Hawaii [53].

4.2.1 The effects of aggregation on the predictive distribution

The methodology described in Section 3.3.1 considered the aggregation of 1 to 30 customers with increments of 1 customer and 1 to 240 customers with increments of 30 customers. This licentiate thesis summary only presents the results of the former; the interested reader is referred to Paper III for results of the latter. Figures 4.4 to 4.6 visualize the effect that aggregation has on probabilistic forecasts of PV power production, electricity consumption and net load, respectively, in terms of (a) PICP(η, C), (b) PINAW(η, C) and (c) CRPS(C), where C indicates the number of customers in the aggregation.

[Figure 4.4: (a) PICP and (b) PINAW over the number of customers (1 to 30, horizontal axis) and η (0.1 to 0.9, vertical axis), in panels by model (GP, QR) and season (Spring, Winter); (c) CRPS against the number of customers, by model (GP, QR) and season (Spring, Winter).]

Figure 4.4. Results of the PV power production forecasts of the GP and QR models, during spring and winter. In (a), the PICP(η, C), in (b) PINAW(η, C) and in (c) CRPS(C), where C indicates the number of customers in the aggregation.

PV power production

Recall that for reliable probabilistic forecasts, the PICP should be approximately equal to the nominal coverage level η. Figure 4.4a indicates that the GP is slightly unreliable in that the PICP is higher than η during both seasons and that the increasing level of aggregation, at least during winter, increases the PICP. The reason for this is the smoothing effect, the implications of which are clearly visible. Figure 4.4b corroborates this observation, since it clearly indicates the improvement in sharpness. However, as the PICP remains too high, it can be concluded that the predictive distributions are not yet sharp enough, since increasing the sharpness would most likely reduce the PICP.
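The smoothing effect mentioned above can be illustrated with a short simulation. This is only a sketch on synthetic data: the customer profiles are drawn independently from a gamma distribution, which is an assumption made for illustration and not a property of the Ausgrid data, where customers are correlated and the effect would saturate differently.

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps, n_customers = 2000, 30

# Synthetic, mutually independent customer profiles (hypothetical gamma-distributed values).
profiles = rng.gamma(shape=2.0, scale=0.5, size=(n_steps, n_customers))

def coefficient_of_variation(x):
    """Standard deviation relative to the mean, a scale-free measure of variability."""
    return float(np.std(x) / np.mean(x))

# Relative variability of the aggregated profile for a growing number of customers C.
cv_by_size = {
    c: coefficient_of_variation(profiles[:, :c].sum(axis=1))
    for c in (1, 5, 20, 30)
}
```

Under these assumptions the relative variability of the aggregate drops quickly for the first few customers and flattens afterwards, which is in line with the diminishing returns of aggregation discussed in this section.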

[Figure 4.5: (a) PICP and (b) PINAW over the number of customers (1 to 30, horizontal axis) and η (0.1 to 0.9, vertical axis), in panels by model (GP, QR) and season (Spring, Winter); (c) CRPS against the number of customers, by model (GP, QR) and season (Spring, Winter).]

Figure 4.5. Results of the electricity consumption forecasts of the GP and QR models, during spring and winter. In (a), the PICP(η, C), in (b) PINAW(η, C) and in (c) CRPS(C), where C indicates the number of customers in the aggregation.

In contrast, the QR model produces markedly constant PICP over the aggregation levels, as shown in Figure 4.4a. The coverage probability is below the nominal level during winter, whereas it shows good correspondence during spring. Interestingly, and unlike the GP, Figure 4.4b shows that the sharpness improves as the level of aggregation increases and that PICP remains constant despite this improvement. This could be due to the fact that the QR model produces a non-parametric distribution in which each quantile forecast has been trained separately.
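The remark that each quantile forecast is trained separately can be made concrete with the pinball (quantile) loss that quantile regression minimizes. The sketch below is not the QR model of Paper III: it fits a constant quantile by grid search on toy data, and all names and values are hypothetical.

```python
import numpy as np

def pinball_loss(y, q, tau):
    """Mean pinball loss of the quantile forecast q at quantile level tau."""
    u = np.asarray(y, dtype=float) - q
    return float(np.mean(np.maximum(tau * u, (tau - 1.0) * u)))

# Toy observations (hypothetical) and a grid of candidate constant forecasts.
y = np.arange(1, 100)
candidates = np.arange(1, 100)

# Each quantile level is fitted on its own, independently of the other levels.
best = {
    tau: int(candidates[np.argmin([pinball_loss(y, c, tau) for c in candidates])])
    for tau in (0.1, 0.5, 0.9)
}
```

Because every quantile level has its own loss and its own fit, the resulting predictive distribution is not tied to a parametric shape, so a change in one quantile does not force a change in the others.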

Figure 4.4c shows the trend in terms of CRPS, which clearly indicates a significant improvement of the forecasts when few customers are aggregated. Furthermore, it also indicates that aggregation above 20 customers does not obviously lead to better forecasts. The results of the aggregation steps from 1 to 240 customers, which can be found in Paper III together with the reliability diagrams, corroborate this.

Electricity consumption

Figure 4.5a reveals that the GP produces unreliable electricity consumption forecasts for lower levels of aggregation, but that the reliability steadily improves as the load profile becomes smoother. Furthermore, it can be observed that the reliability of the GP is higher during spring than during winter (cf. Figure 5a of Paper III), which can be ascribed to the latter period displaying more variance (cf. Figure 2a in Paper III). Furthermore, Figure 4.5b indicates that the sharpness improves as the profile gets smoother and that, at low levels of aggregation, the forecasts are sharper during winter.

[Figure 4.6: (a) PICP and (b) PINAW over the number of customers (1 to 30, horizontal axis) and η (0.1 to 0.9, vertical axis), in panels by model (GP, QR) and season (Spring, Winter); (c) CRPS against the number of customers, by model (GP, QR) and season (Spring, Winter).]

Figure 4.6. Results of the net load forecasts of the GP and QR models, during spring and winter. In (a), the PICP(η, C), in (b) PINAW(η, C) and in (c) CRPS(C), where C indicates the number of customers in the aggregation.

As regards the QR forecasts, Figure 4.5a shows that the forecasts are reliable during spring, even for low levels of aggregation, whereas the forecasts are unreliable during winter. More specifically, the coverage probability is below the nominal level during winter, which, when taking into consideration the reliability diagram in Figure 5d of Paper III, is caused by a positive bias, i.e., an overestimation of the electricity consumption by the lower quantiles.

Finally, Figure 4.5c shows that the forecasts are more accurate during spring than during winter, which may be unexpected given the sharper predictive distributions as presented in Figure 4.5b. However, the CRPS does not directly discriminate between reliability and sharpness and it is therefore important to visually assess the predictive distribution, in addition to numerical assessments. As a final note, it is interesting to point out the sharp increase in CRPS in case of the QR model during winter. These 5 random customers were most likely all straightforward to forecast, despite the 5 random samples meant to compensate for such an effect.
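That the CRPS blends reliability and sharpness into a single number can be seen from a small sketch. It uses the well-known closed-form CRPS of a Gaussian predictive distribution; whether this coincides with the form of eq. (3.21) is an assumption, and the forecasts and the observation are hypothetical.

```python
import math

def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of a Gaussian predictive distribution N(mu, sigma^2) at observation y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

# One observation and three forecasts with the same mean but different spread (hypothetical).
y_obs = 1.0
score_too_sharp = crps_gaussian(0.0, 0.2, y_obs)   # narrow distribution that misses the observation
score_balanced = crps_gaussian(0.0, 1.0, y_obs)    # spread matches the size of the error
score_too_wide = crps_gaussian(0.0, 5.0, y_obs)    # covers the observation but is very wide
```

Both the overly sharp and the overly wide forecast score worse than the balanced one, yet the score alone does not reveal which of the two deficiencies is responsible, which is why visual diagnostics remain useful.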


Net load

Figure 4.6a shows that combining solar power and electricity consumption into the net load improves the forecasts when considering the GP, in particular during winter (cf. Figure 5c of Paper III). It is also interesting to note that the bias of the GP has improved when compared to that of the electricity consumption forecasts, as can be seen from the reliability diagram in Figure 5c of Paper III. The sharpness improves steadily over the aggregation steps, as can be expected.

As with PV power and electricity consumption forecasting, Figure 4.6a shows that the QR model does not cover enough observations to meet the nominal coverage level requirement during winter. The reason for this is most likely the lack of a dedicated validation procedure for each model, which will be discussed further in Section 5.1. The reliability diagram in Figure 5f of Paper III indicates that quantiles {q_{τ=0.4}, ..., q_{τ=0.9}} do not overestimate the net load frequently enough. Furthermore, Figure 4.6b shows that the predictive distributions are sharp during winter; too sharp taking into consideration the PICP results and the reliability diagram.

Figure 4.6c shows that, as was the case for electricity consumption, CRPS is higher during winter than during spring for both models. The reason for this is mainly the lower reliability during winter since the sharpness, especially for the QR model, is in fact too high during this period, which only contributes positively to the CRPS (cf. eq. (3.21)).

4.2.2 Moving beyond the unimodal Gaussian distribution

Figure 4.7a depicts the layout of the pyranometer network on Oahu, Hawaii [53]. The aim is to produce probabilistic forecasts of the CSI for location DH6 using information from neighboring locations. Figure 4.7b presents the data recorded by the network where the test set is representative of the entire data set and has the familiar CSI distribution. The training data set is then divided into overcast (OC) and sunny (SU) days, both approximately normally distributed, on which GPs are trained separately. The predictions of these GPs can then be combined as a GMM or through convolution, as described in Section 3.3.2. In this section, we only present the results at a 1 minute resolution; the interested reader is referred to Paper IV for the results at the 10 second and 15 minute resolutions.

[Figure 4.7: (a) map of the sensor locations (longitude against latitude) with DH6 marked; (b) count against CSI (−) for the subsets OC, SU and Test; (c) observed against nominal probability for PeEn, Overall, GMM and Convolution; (d) PINAW against η for PeEn, Overall, GMM and Convolution.]

Figure 4.7. In (a), a map with the layout of the pyranometer network on Oahu, Hawaii, and the target sensor highlighted in yellow ©2018 IEEE. In (b), a frequency polygon of the CSI recorded by the pyranometers, where the color indicates the subset of the data, i.e., overcast (OC), sunny (SU) and the test set. In (c), the reliability diagram of the predictive distributions at a 1 minute resolution. In (d), the PINAW as a function of the nominal coverage level η.

Figures 4.7c and 4.7d visualize the resulting probabilistic forecasts by means of the reliability diagram and the PINAW as a function of nominal coverage level η, respectively. It can be seen from Figure 4.7c that the PeEn quantiles systematically underestimate the CSI. The three GP models, i.e., the GMM, the convolution and the GP that has been trained on the entire training set, produce better predictive distributions. Interestingly, and in contrast to conclusions in Paper IV, the convolution model performs substantially better than the other models. The reason for this is closely related to a theorem by Gneiting and Ranjan [123], stating that a linear combination of predictive distributions is more dispersed than the least dispersed individual predictive distribution. This implies that the predictive distributions exhibit more spread. Simultaneously, however, the definition of convolution for independent Gaussians presented in eq. (3.17) induces an increase in sharpness due to squaring of the mixing coefficient. The combined effect of the aforementioned improves the reliability and the sharpness. The fact that the GMM does not show substantial improvement over the Overall model despite that it is also an ensemble is related to the fact that the mixing coefficient changes over time depending on the predominant weather type. It is therefore likely that the mixing coefficient is oftentimes close to either 0 or 1 and consequently, the effects of mixing will be limited.
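The difference in spread between the two ways of combining the GPs can be sketched with the first two moments. The sketch assumes that the convolution of eq. (3.17) is the distribution of a weighted sum of two independent Gaussians, so that the mixing coefficient enters the variance squared; this is consistent with the description above but cannot be confirmed from this summary. The function names and the numbers are hypothetical.

```python
def mixture_moments(w, mu1, var1, mu2, var2):
    """Mean and variance of the mixture w*N(mu1, var1) + (1-w)*N(mu2, var2)."""
    mean = w * mu1 + (1.0 - w) * mu2
    var = w * var1 + (1.0 - w) * var2 + w * (1.0 - w) * (mu1 - mu2) ** 2
    return mean, var

def weighted_sum_moments(w, mu1, var1, mu2, var2):
    """Mean and variance of w*X1 + (1-w)*X2 for independent Gaussians X1 and X2."""
    mean = w * mu1 + (1.0 - w) * mu2
    var = w ** 2 * var1 + (1.0 - w) ** 2 * var2
    return mean, var

# Hypothetical overcast and sunny predictions of the CSI with equal weights.
w, mu_oc, var_oc, mu_su, var_su = 0.5, 0.4, 0.04, 1.0, 0.01

mix_mean, mix_var = mixture_moments(w, mu_oc, var_oc, mu_su, var_su)
sum_mean, sum_var = weighted_sum_moments(w, mu_oc, var_oc, mu_su, var_su)

# With a weight of 1 the mixture collapses onto the overcast component.
_, mix_var_w1 = mixture_moments(1.0, mu_oc, var_oc, mu_su, var_su)
```

Both combinations share the same mean, but under these assumptions the mixture is considerably more dispersed than the weighted sum, and a weight close to 0 or 1 leaves the mixture nearly identical to a single component, which limits the effect of mixing.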

Figure 4.7d shows that the PeEn model produces rather wide PIs and is incapable of producing reliable predictive distributions, as was stated before. The three GP models substantially improve on this point, while the convolved GP produces the sharpest predictive distributions. Furthermore, it can be seen that the GMM only marginally improves on the sharpness when compared to the Overall model. It is therefore interesting to conclude that, in fact, straightforward convolution produced the best results, at least in this case.


5. Discussion and future work

This section discusses the results described in the previous chapter and attempts to forecast what can and should be done.

5.1 Discussion

The GP played a central role in Papers II–IV and proved to be highly flexible in predicting non-linear data. Focusing on Paper II, the GP was shown to produce accurate and reliable forecasts of the PV power, whereas in the case of electricity consumption, the predictive distributions were typically too wide. The reason for this is rather straightforward: learning the hyperparameters does not involve maximizing the sharpness, subject to reliability. Since few machine learning techniques are dedicated to such a training strategy, it is often left to the forecaster to construct the model appropriately and add explanatory variables, if possible.¹ The results of Paper II further revealed that combining PV power and electricity consumption as net load allowed for improved reliability of the forecasts, as was also shown in Paper III. We hypothesize that it is caused by the counteracting signals but more research is required, also as to whether these observations hold when considering other data sets. For instance, Wang et al. [76] started from the premise that the net load would be more challenging to forecast due to PV power and proposed to separate the net load signal into PV power production, electricity consumption and the residuals and found that they could improve the forecasts in that way. The result found in Paper II is, however, an interesting result that could improve the accuracy of, e.g., hierarchical forecasts. Finally, a dynamic GP was tested in order to reduce the learning and inference time, and the results showed that it performed very similarly to, if not better than, the static GP at a far lower computational burden.

¹ This is generally a big if, as public data sets are scarce and tend to be void of data other than the variables that were set out to be measured.

The results of Paper III contain various limitations. For instance, only one forecast horizon was explored and the metadata contained postcodes only. Furthermore, the model structure was kept identical over the random aggregation samples. It was, however, the first study that investigated the effect of various factors on probabilistic forecasts. The most important result was that a limited number of customers need to be aggregated in order to substantially improve the accuracy. Whether this holds on the distribution grid level requires more research, but the results herein indicate that it is possible to improve the accuracy while retaining local information. As for the GP and QR models used in this work, they had their strengths and weaknesses that could be exploited in further research. For instance, QR tended to perform poorly during winter but when it would perform well, it would do so already for low levels of aggregation. In contrast, the performance of the GP would steadily increase when the level of aggregation increased. This may be due to the Gaussian assumption, which could be too restrictive at the lower levels of the electricity grid, a view shared by Taieb et al. [72]. Furthermore, the GP was typically less affected by the model structure than the QR due to the fact that it is a non-parametric model. These results can be used in constructing a hierarchy of models required for the various levels of the grid.

Ensembles, which were investigated in Paper IV, have been shown to improve the forecast accuracy. Ensembles have the potential to improve the GP forecasts because they allow for departure from the unimodal Gaussian density, which may have been a reason for the relatively poor performance at lower levels of aggregation in Paper III. The results did not reveal a clear advantage of the GMM over straightforward convolution. In fact, the latter outperformed the GMM, both in terms of reliability and sharpness, because a linear combination of predictive distributions is more dispersed than the least dispersed individual predictive distribution [123], and because the mixing coefficient in the convolution step is squared (cf. eq. (3.17)). Overall, however, the results did reaffirm the general understanding that ensembles improve the forecast accuracy and suggested that the asymmetry of the GMM can further improve the predictive distribution. Further research is, however, required in the area of probabilistic ensembles and additional GPs could be included to assess whether the asymmetry can indeed further improve the forecasts.

5.2 Future work

Probabilistic forecasting is still a relatively young field despite the significant advances made recently, and many interesting opportunities for research remain. One such direction for future work is multivariate forecasting. Currently the most common approach to multiple step ahead forecasts is to train a model for each forecast horizon. However, the forecast errors are uncorrelated in such a setup, which is unrealistic and which can have an adverse effect when such forecasts are used in a decision making process. One might also think of applying multivariate forecasting to the combination of electricity consumption, electricity price, RES and EV forecasting, since these are interdependent. Another research opportunity is to extend the work of Paper III to data where exact locations of sensors are known in order to infer exact relations between the effect of aggregation on the accuracy of probabilistic forecasts as a function of distance. Hierarchical forecasting poses yet another opportunity, where it is particularly challenging to ensure that the probabilistic forecasts are coherent.

A collaboration with the University of California, San Diego (UCSD), has been initiated in order to apply probabilistic forecasts to enhance further grid integration of solar power. An initial study into the benefit of probabilistic forecasts in a stochastic optimization problem is underway, with the aim to mitigate the adverse effect of PV power on the net load by charging EVs with PV power when it is optimal, in some sense, to do so. This study will be continued by means of a research visit at UCSD, where the optimal PV-EV workplace charging strategy (Paper V) will be further developed to enable stochastic optimization and properly take into account the uncertainties.


6. Conclusion

In order to get acquainted with forecasting of photovoltaic (PV) power production and electricity consumption, the state-of-the-art has been reviewed. Besides field specific knowledge, an important conclusion from the review was that probabilistic forecasting is preferred over deterministic forecasting due to its ability to express the uncertainty.

In this thesis, properties of probabilistic forecasts of PV power, electricity consumption, net load and the clear-sky index (CSI) have been investigated. The results revealed that combining PV power production and electricity consumption as the net load improved the results over those of electricity consumption alone. Whether this generalizes requires further study but these results could prove helpful in the case of, e.g., hierarchical forecasts and could reduce the need for dedicated sensors. In addition, we showed that it is possible to achieve similar accuracy and reduce the computation time by an order of 2 by means of a dynamic GP.

It is well known that aggregating time series reduces the variability, and improves the forecast accuracy by extension, but the extent had not been thoroughly studied in case of probabilistic forecasting. The results showed that substantial improvements can be made in terms of both the reliability and the sharpness by aggregating even a small number of customers. This could allow grid operators to preserve local information while still improving their forecasts at the lower levels of the grid. Further study is, however, required in order to establish a relationship between aggregation distance and the improvement that can be expected, if any.

Ensembles are known to improve the accuracy of forecasts as well because these can compensate for model uncertainty. The results indicated that for the CSI, a straightforward convolution outperformed a Gaussian mixture model (GMM), but they also hinted that the asymmetry of a multi-modal Gaussian density could improve the results at certain quantiles. Further research with higher order GMMs is required to explore the potential of these models.


Acknowledgments

First of all, I want to thank my supervisor Joakim Widén and co-supervisor Joakim Munkhammar for giving me the opportunity to pursue the Ph.D. degree and for all their help and support during this process. I also want to thank Gautham Ram Chandra Mouli, my supervisor during my master thesis at the Technical University of Delft and a recently graduated Ph.D. student, for inspiring me to pursue an academic career. A debt of gratitude is owed to my co-authors besides Joakim Widén and Joakim Munkhammar, namely Andreas Svensson and Mahmoud Shepero, who is also my office mate, for broadening and deepening my understanding of probabilistic modeling. Mahmoud, I hope (and forecast) that our research converges and we can work together more often. I would also like to thank all the colleagues at the Built Environment Energy Systems Group for the great work environment and fun after-work activities. Finally, I want to thank Anna and my family for their continuous love and support; it means a lot to me.


References

[1] IPCC. Summary for Policymakers, book section SPM, pages 1–30. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, 2013.

[2] T. Gneiting and M. Katzfuss. Probabilistic Forecasting. Annu. Rev. Stat. Its Appl., 1:125–151, 2014.

[3] T. Gneiting and A. E. Raftery. Strictly proper scoring rules, prediction, and estimation. J. Am. Stat. Assoc., 102(477):359–378, 2007.

[4] Institute of Medicine. Conflict of Interest in Medical Research, Education, and Practice. The National Academies Press, Washington, DC, 2009.

[5] International Energy Agency. Key world energy statistics. Technical report, 2017.

[6] IEA. 2018 Snapshot of Global Photovoltaic Markets. Technical report, 2018.

[7] R. Luthander, D. Lingfors, and J. Widén. Large-scale integration of photovoltaic power in a distribution grid using power curtailment and energy storage. Sol. Energy, 155:1319–1325, 2017.

[8] T. Hong. Energy Forecasting: Past, Present and Future. Foresight Int. J. Forecast., (32):43–49, 2014.

[9] International Energy Agency. World Energy Outlook. Technical report, 2017.

[10] International Energy Agency. Global EV Outlook. Technical report, 2017.

[11] J. W. Eising, T. van Onna, and F. Alkemade. Towards smart grids: Identifying the risks that arise from the integration of energy and transport supply chains. Appl. Energy, 123(2014):448–455, 2014.

[12] T. Hong and S. Fan. Probabilistic Electric Load Forecasting: A Tutorial Review. Int. J. Forecast., 32(3):914–938, 2016.

[13] J. Lindahl. National survey report of PV power applications in Sweden 2016. Technical report, IEA-PVPS, 2017.

[14] A. Nordling. Sweden’s Future Electrical Grid, A Project Report. Technical report, The Royal Swedish Academy of Engineering Sciences, 2017.

[15] R. H. Inman, H. T. C. Pedro, and C. F. M. Coimbra. Solar forecasting methods for renewable energy integration. Prog. Energy Combust. Sci., 39(6):535–576, 2013.

[16] A. Kaur, L. Nonnenmacher, H. T. C. Pedro, and C. F. M. Coimbra. Benefits of solar forecasting for energy imbalance markets. Renew. Energy, 86:819–830, 2016.

[17] N. Abdel-Karim, M. Lauby, J. N. Moura, and T. Coleman. Operational Risk Impact of Flexibility Requirements and Ramp Forecast on the North American Bulk Power System. In 2018 Int. Conf. Probabilistic Methods Appl. to Power Syst. Boise, ID, USA, June 24-28, 2018.

[18] P. Etingov, L. Miller, Z. Hou, Y. Makarov, K. Pennock, P. Beaucage, C. Loutan, and A. Motley. Balancing Needs Assessment Using Advanced Probabilistic Forecasts. In 2018 Int. Conf. Probabilistic Methods Appl. to Power Syst. Boise, ID, USA, June 24-28, 2018.

[19] J. Kleissl. Solar Energy Forecasting and Resource Assessment. AcademicPress, Boston, 2013.

[20] P. Ineichen. Comparison of eight clear sky broadband models against 16independent data banks. Sol. Energy, 80(4):468–478, 2006.

[21] R. Perez, M. David, T. E. Hoff, M. Jamaly, S. Kivalov, J. Kleissl, P. Lauret, andM. Perez. Spatial and Temporal Variability of Solar Energy. Found. TrendsRenew. Energy, 1:1–44, 2016.

[22] V. Larson. Chapter 12 - Forecasting Solar Irradiance with Numerical WeatherPrediction Models. In: Solar energy forecasting and resource assessment.Boston: Acadamic Press; 2013. p. 171 - 194. 2013.

[23] H. Wirth. Recent facts about photovoltaics in Germany. Technical report, Fraunhofer ISE, 2018.

[24] M. Ropp, J. Newmiller, C. Whitaker, and B. Norris. Review of potential problems and utility concerns arising from high penetration levels of photovoltaics in distribution systems. In IEEE Photovolt. Spec. Conf., 2008.

[25] J. Olauson, M. N. Ayob, M. Bergkvist, N. Carpman, V. Castellucci, A. Goude, D. Lingfors, R. Waters, and J. Widén. Net load variability in Nordic countries with a highly or fully renewable power system. Nat. Energy, 1:1–8, 2016.

[26] Ausgrid. Solar home electricity data, 2014. Online; accessed 31 August 2017.

[27] R. Perez, T. E. Hoff, J. Dise, D. Chalmers, and S. Kivalov. Mitigating short-term PV output variability. In Proceedings of the 28th European Photovoltaic Solar Energy Conference and Exhibition, Paris, France, 2013.

[28] C. B. Martinez-Anido, B. Botor, A. R. Florita, C. Draxl, S. Lu, H. F. Hamann, and B. M. Hodge. The value of day-ahead solar power forecasting improvement. Sol. Energy, 129:192–203, 2016.

[29] T. Bossmann and I. Staffell. The shape of future electricity demand: Exploring load curves in 2050s Germany and Britain. Energy, 90:1317–1333, 2015.

[30] I. Drezga and S. Rahman. Input variable selection for ANN-based short-term load forecasting. IEEE Trans. Power Syst., 13:1238–1244, 1998.

[31] M. Q. Raza and A. Khosravi. A review on artificial intelligence based load demand forecasting techniques for smart grid and buildings. Renew. Sustain. Energy Rev., 50:1352–1372, 2015.

[32] K. Clement-Nyns, E. Haesen, and J. Driesen. The Impact of Charging Plug-In Hybrid Electric Vehicles on a Residential Distribution Grid. IEEE Trans. Power Syst., 25:371–380, 2010.

[33] P. Denholm, R. Margolis, and J. Milford. Production Cost Modeling for High Levels of Photovoltaics Penetration. Technical report, National Renewable Energy Laboratory, 2008.

[34] P. Denholm and M. Hand. Grid flexibility and storage required to achieve very high penetration of variable renewable electricity. Energy Policy, 39(3):1817–1830, 2011.

[35] R. Luthander, J. Widén, D. Nilsson, and J. Palm. Photovoltaic self-consumption in buildings: A review. Appl. Energy, 142:80–94, Mar. 2015.

[36] R. H. Miller and J. H. Malinowski. Power system operation. McGraw-Hill Professional, 1994.

[37] Energy independence and security act of 2007, 2007. https://www.gpo.gov/fdsys/pkg/PLAW-110publ140/html/PLAW-110publ140.htm.

[38] F. Li, R. Li, Z. Zhang, M. Dale, D. Tolley, and P. Ahokangas. Big Data Analytics for Flexible Energy Sharing. IEEE Power and Energy Magazine, 18:35–42, 2018.

[39] Scopus. Document search, 2018. Online; accessed 10 August 2018.

[40] D. Wilks. Statistical methods in atmospheric sciences. San Diego: Academic Press, 2006.

[41] D. W. van der Meer, J. Widén, and J. Munkhammar. Review on probabilistic forecasting of photovoltaic power production and electricity consumption. Renew. Sustain. Energy Rev., 81:1484–1512, 2018.

[42] P. Pinson. Wind Energy: Forecasting Challenges for Its Operational Management. Stat. Sci., 28(4):564–585, 2013.

[43] M. Diagne, M. David, P. Lauret, J. Boland, and N. Schmutz. Review of solar irradiance forecasting methods and a proposition for small-scale insular grids. Renew. Sustain. Energy Rev., 27:65–76, 2013.

[44] J. Widén, N. Carpman, V. Castellucci, D. Lingfors, J. Olauson, F. Remouit, M. Bergkvist, M. Grabbe, and R. Waters. Variability assessment and forecasting of renewables: A review for solar, wind, wave and tidal resources. Renew. Sustain. Energy Rev., 44:356–375, 2015.

[45] T. T. Warner. Numerical Weather and Climate Prediction. Cambridge University Press, 2011.

[46] E. N. Lorenz. Deterministic nonperiodic flow. J. Atmospheric Sci., 20:130–141, 1963.

[47] J. M. Bright, S. Killinger, D. Lingfors, and N. A. Engerer. Improved satellite-derived PV power nowcasting using real-time power data from reference PV systems. Sol. Energy, 168:118–139, 2018. Advances in Solar Resource Assessment and Forecasting.

[48] A. T. Lorenzo, M. Morzfeld, W. F. Holmgren, and A. D. Cronin. Optimal interpolation of satellite and ground data for irradiance nowcasting at city scales. Sol. Energy, 144:466–474, 2017.

[49] T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning, volume 27. 2009.

[50] Z. Ghahramani. Probabilistic machine learning and artificial intelligence. Nature, 521:452–459, 2015.

[51] I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to sequence learning with neural networks. In Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, NIPS’14, pages 3104–3112, Cambridge, MA, USA, 2014. MIT Press.

[52] Y. Chu, M. Li, H. T. C. Pedro, and C. F. M. Coimbra. Real-time prediction intervals for intra-hour DNI forecasts. Renew. Energy, 83:234–244, 2015.

[53] M. Sengupta and A. Andreas. Oahu solar measurement grid (1-year archive): 1-second solar irradiance; Oahu, Hawaii (data), 2010. Data retrieved from: https://midcdmz.nrel.gov/oahu_archive/.

[54] R. H. Shumway and D. S. Stoffer. Time Series Analysis and Its Applications (Springer Texts in Statistics). Springer-Verlag, Berlin, Heidelberg, 2005.

[55] T. Hong. Short Term Electric Load Forecasting. PhD thesis, North Carolina State University, 2010.

[56] M. David, M. Aguiar Luis, and P. Lauret. Comparison of intraday probabilistic forecasting of solar irradiance using only endogenous data. Int. J. Forecast., 34(3):529–547, 2018.

[57] P. Pinson, P. McSharry, and H. Madsen. Reliability diagrams for non-parametric density forecasts of continuous variables: Accounting for serial correlation. Q. J. R. Meteorol. Soc., 136:77–90, 2010.

[58] A. Khosravi, S. Nahavandi, D. Creighton, and A. F. Atiya. Lower Upper Bound Estimation Method for Construction of Neural Network-Based Prediction Intervals. IEEE Trans. Neural Networks, 22(3):337–346, 2011.

[59] H. Quan, D. Srinivasan, and A. Khosravi. Uncertainty handling using neural network-based prediction intervals for electrical load forecasting. Energy, 73:916–925, 2014.

[60] D. Yang, J. Kleissl, C. A. Gueymard, H. T. C. Pedro, and C. F. M. Coimbra. History and trends in solar irradiance and PV power forecasting: A preliminary assessment and review using text mining. Sol. Energy, 168:60–101, 2018.

[61] T. Hong, P. Pinson, S. Fan, H. Zareipour, A. Troccoli, and R. J. Hyndman. Probabilistic energy forecasting: Global Energy Forecasting Competition 2014 and beyond. Int. J. Forecast., 32(3):896–913, 2016.

[62] W. R. Tobler. A computer movie simulating urban growth in the Detroit region. Economic Geography, 46:234–240, 1970.

[63] M. Jamaly and J. Kleissl. Spatiotemporal interpolation and forecast of irradiance data using Kriging. Sol. Energy, 158:407–423, 2017.

[64] N. Cressie and G. Johannesson. Fixed rank kriging for very large spatial data sets. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70(1):209–226, 2008.

[65] X. G. Agoua, R. Girard, and G. Kariniotakis. Probabilistic Model for Spatio-Temporal Photovoltaic Power Forecasting. IEEE Trans. Sustain. Energy, -(-):1–9, 2018.

[66] F. Golestaneh, P. Pinson, and H. B. Gooi. Generation and Evaluation of Space-Time Trajectories of Photovoltaic Power. Appl. Energy, 2016.

[67] T. Gneiting, L. I. Stanberry, E. P. Grimit, L. Held, and N. A. Johnson. Assessing probabilistic forecasts of multivariate quantities, with an application to ensemble predictions of surface winds. TEST, 17(2):211, 2008.

[68] A. Möller, A. Lenkoski, and T. L. Thorarinsdottir. Multivariate probabilistic forecasting using ensemble Bayesian model averaging and copulas. Q. J. R. Meteorol. Soc., 139(673):982–991.

[69] R. J. Hyndman, R. A. Ahmed, G. Athanasopoulos, and H. L. Shang. Optimal combination forecasts for hierarchical time series. Comput. Stat. Data Anal., 55(9):2579–2589, September 2011.

[70] D. Yang, H. Quan, V. R. Disfani, and L. Liu. Reconciling solar forecasts: Geographical hierarchy. Sol. Energy, 146:276–286, 2017.

[71] D. Yang, H. Quan, V. R. Disfani, and C. D. Rodriguez-Gallegos. Reconciling solar forecasts: Temporal hierarchy. Sol. Energy, 158:332–346, 2017.

[72] S. B. Taieb, J. W. Taylor, and R. J. Hyndman. Coherent probabilistic forecasts for hierarchical time series. In Proceedings of the 34th International Conference on Machine Learning, pages 3348–3357, Sydney, Australia, 06–11 Aug 2017. PMLR.

[73] T. van Erven and J. Cugliari. Game-theoretically optimal reconciliation of contemporaneous hierarchical time series forecasts. In Modeling and Stochastic Learning for Forecasting in High Dimensions, pages 297–317, Cham, 2015. Springer International Publishing.

[74] H. T. C. Pedro, C. F. M. Coimbra, M. David, and P. Lauret. Assessment of machine learning techniques for deterministic and probabilistic intra-hour solar forecasts. Renew. Energy, 123:191–203, 2018.

[75] Y. Chu, H. T. C. Pedro, A. Kaur, J. Kleissl, and C. F. M. Coimbra. Net load forecasts for solar-integrated operational grid feeders. Sol. Energy, 158:236–246, 2017.

[76] Y. Wang, N. Zhang, Q. Chen, D. S. Kirschen, P. Li, and Q. Xia. Data-Driven Probabilistic Net Load Forecasting with High Penetration of Behind-the-Meter PV. IEEE Trans. Power Syst., pages 1–10, 2017.

[77] Y. Wang, N. Zhang, Y. Tan, T. Hong, D. S. Kirschen, and C. Kang. Combining Probabilistic Load Forecasts. IEEE Trans. Smart Grid, Early access:1–1, 2018.

[78] R. R. Appino, J. A. González Ordiano, R. Mikut, T. Faulwasser, and V. Hagenmeyer. On the use of probabilistic forecasts in scheduling of renewable energy sources coupled to storages. Appl. Energy, 210:1207–1218, 2018.

[79] W. El-Baz, M. Seufzger, S. Lutzenberger, P. Tzscheutschler, and U. Wagner. Impact of probabilistic small-scale photovoltaic generation forecast on energy management systems. Sol. Energy, 165:136–146, 2018.

[80] D. Yang, Z. Ye, L. H. Idris Lim, and Z. Dong. Very short term irradiance forecasting using the lasso. Sol. Energy, 114:314–326, 2015.

[81] SoDa Service. CAMS McClear Service for Estimating Irradiation under Clear-Sky, 2017. Online; accessed 02 May 2017.

[82] J. Widén, M. Shepero, and J. Munkhammar. On the properties of aggregate clear-sky index distributions and an improved model for spatially correlated instantaneous solar irradiance. Sol. Energy, 157:566–580, 2017.

[83] F. M. Dekking, C. Kraaikamp, H. P. Lopuhaä, and L. E. Meester. A Modern Introduction to Probability and Statistics. Springer Texts in Statistics, 2005.

[84] C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, 2006.

[85] D. Yang, C. Gu, Z. Dong, P. Jirutitijaroen, N. Chen, and W. M. Walsh. Solar irradiance forecasting using spatial-temporal covariance structures and time-forward kriging. Renew. Energy, 60:235–245, 2013.

[86] D. Yang, Z. Dong, T. Reindl, P. Jirutitijaroen, and W. M. Walsh. Solar irradiance forecasting using spatio-temporal empirical kriging and vector autoregressive models with parameter shrinkage. Sol. Energy, 103:550–562, 2014.

[87] A. W. Aryaputera, D. Yang, L. Zhao, and W. M. Walsh. Very short-term irradiance forecasting at unobserved locations using spatio-temporal kriging. Sol. Energy, 122:1266–1278, 2015.

[88] M. L. Stein. Interpolation of Spatial Data: Some Theory for Kriging. Springer New York, 1999.

[89] B. Matern. Spatial Variation. Second edition (1986), Springer-Verlag, Berlin, 1960.

[90] S. Salcedo-Sanz, C. Casanova-Mateo, J. Munoz-Mari, and G. Camps-Valls. Prediction of Daily Global Solar Irradiation Using Temporal Gaussian Processes. IEEE Geosci. Remote Sens. Lett., 11(11):1936–1940, 2014.

[91] F. Golestaneh, P. Pinson, and H. B. Gooi. Very Short-Term Nonparametric Probabilistic Forecasting of Renewable Energy Generation; With Application to Solar Energy. IEEE Trans. Power Syst., PP(99):1–14, 2016.

[92] R. Koenker and G. Bassett. Regression Quantiles. Econometrica, 46(1):33–50, 1978.

[93] S. B. Taieb, R. Huser, R. J. Hyndman, and M. G. Genton. Forecasting Uncertainty in Electricity Smart Meter Data by Boosting Additive Quantile Regression. IEEE Trans. Smart Grid, 7(5):2448–2455, 2016.

[94] R. Koenker. quantreg: Quantile Regression in R, version 5.35, 2016.

[95] P. Lauret, M. David, and H. T. C. Pedro. Probabilistic Solar Forecasting Using Quantile Regression Models. Energies, 10, 2017.

[96] J. B. Bremnes. Probabilistic wind power forecasts using local quantile regression. Wind Energy, 7:47–54, 2004.

[97] R. Juban, H. Ohlsson, M. Maasoumy, L. Poirier, and J. Z. Kolter. A multiple quantile regression approach to the wind, solar, and price tracks of GEFCom2014. Int. J. Forecast., 32(3):1094–1102, 2016.

[98] S. Haben and G. Giasemidis. A hybrid model of kernel density estimation and quantile regression for GEFCom2014 probabilistic load forecasting. Int. J. Forecast., 32(3):1017–1022, 2016.

[99] A. J. Cannon. Quantile regression neural networks: Implementation in R and application to precipitation downscaling. Comput. Geosci., 37:1277–1284, 2011.

[100] N. Meinshausen. Quantile Regression Forests. J. Mach. Learn. Res., 7:983–999, 2006.

[101] G. I. Nagy, G. Barta, S. Kazi, G. Borbély, and G. Simon. GEFCom2014: Probabilistic solar and wind power forecasting using a generalized additive tree ensemble approach. Int. J. Forecast., 32(3):1087–1093, 2016.

[102] M. P. Almeida, O. Perpiñán, and L. Narvarte. PV power forecast using a nonparametric PV model. Sol. Energy, 115:354–368, 2015.

[103] Y. He, Q. Xu, J. Wan, and S. Yang. Short-term power load probability density forecasting based on quantile regression neural network and triangle kernel function. Energy, 114:498–512, 2016.

[104] K. G. T. Hollands and H. Suehrcke. A three-state model for the probability distribution of instantaneous solar radiation, with applications. Sol. Energy, 96:103–112, 2013.

[105] L. Delle Monache, F. A. Eckel, D. L. Rife, B. Nagarajan, and K. Searight. Probabilistic Weather Prediction with an Analog Ensemble. Mon. Weather Rev., 141:3498–3516, 2013.

[106] A. Bracale, G. Carpinelli, and P. De Falco. A Probabilistic Competitive Ensemble Method for Short-Term Photovoltaic Power Forecasting. IEEE Trans. Sustain. Energy, 8(2):551–560, Apr. 2017.

[107] Q. Ni, S. Zhuang, H. Sheng, G. Kang, and J. Xiao. An ensemble prediction intervals approach for short-term PV power forecasting. Sol. Energy, 155:1072–1083, Oct. 2017.

[108] A. Papoulis and S. U. Pillai. Probability, random variables, and stochastic processes. McGraw-Hill, 2002.

[109] C. Bergmeir, R. J. Hyndman, and B. Koo. A note on the validity of cross-validation for evaluating autoregressive time series prediction. Comput. Stat. Data Anal., 120:70–83, 2018.

[110] H. Quan, D. Srinivasan, and A. Khosravi. Short-term load and wind power forecasting using neural network-based prediction intervals. IEEE Trans. Neural Networks Learn. Syst., 25(2):303–315, 2014.

[111] P. Pinson, H. Nielsen, J. K. Møller, H. Madsen, and G. N. Kariniotakis. Non-parametric probabilistic forecasts of wind power: Required properties and evaluation. Wind Energy, 10(6):497–516, 2007.

[112] C. Chatfield. Time-Series Forecasting. Chapman & Hall / CRC, Bath, 2000.

[113] H. Hersbach. Decomposition of the continuous ranked probability score for ensemble prediction systems. Wea. Forecasting, 15(5):559–570, 2000.

[114] S. Siegert. SpecsVerification: Forecast Verification Routines for Ensemble Forecasts of Weather and Climate in R, version 0.5-2, 2017.

[115] J. Bröcker and L. A. Smith. Increasing the reliability of reliability diagrams. Wea. Forecasting, 22(3):651–661, 2007.

[116] C. Cornaro, M. Pierro, and F. Bucci. Master optimization process based on neural networks ensemble for 24-h solar irradiance forecast. Sol. Energy, 111:297–312, 2015.

[117] H. Shaker, H. Chitsaz, H. Zareipour, and D. Wood. On comparison of two strategies in net demand forecasting using Wavelet Neural Network. In 2014 North Am. Power Symp. (NAPS 2014), 2014.

[118] S. Alessandrini, L. Delle Monache, S. Sperati, and G. Cervone. An analog ensemble for short-term probabilistic solar power forecast. Appl. Energy, 157:95–110, 2015.

[119] Y. Chu and C. F. M. Coimbra. Short-term probabilistic forecasts for Direct Normal Irradiance. Renew. Energy, 101:526–536, 2017.

[120] G. E. P. Box, G. M. Jenkins, and G. C. Reinsel. Time series analysis: forecasting and control. Englewood Cliffs, N.J., 1994.

[121] R. J. Hyndman and Y. Khandakar. Automatic Time Series Forecasting: The forecast Package for R (Version 8.1). J. Stat. Softw., 27(3):1–22, Jul. 2008.

[122] R. J. Hyndman and G. Athanasopoulos. Forecasting: principles and practice. OTexts.com [Heathmont, Victoria], 2014.

[123] T. Gneiting and R. Ranjan. Combining predictive distributions. Electron. J. Statist., 7:1747–1782, 2016.
