Spatial projections of solar PV installations at subnational level: Accuracy testing of regression models

MULLER, Jonas, TRUTNEVYTE, Evelina

MULLER, Jonas, TRUTNEVYTE, Evelina. Spatial projections of solar PV installations at subnational level: Accuracy testing of regression models. Applied Energy, 2020, vol. 265, p. 114747

DOI: 10.1016/j.apenergy.2020.114747

Available at: http://archive-ouverte.unige.ch/unige:134077

Disclaimer: layout of this document may differ from the published version.

Authors: Jonas Müller1,2*, Evelina Trutnevyte1

1 Renewable Energy Systems, Institute for Environmental Sciences (ISE), Section of Earth and Environmental Sciences, University of Geneva, Switzerland

2 Department of Environmental Systems Science, ETH Zurich, Switzerland

* corresponding author (Uni Carl Vogt, Boulevard Carl Vogt 66, CH-1211 Geneva 4, Switzerland; jonas.edwin.mueller@alumni.ethz.ch)

Abstract

As the growth of solar photovoltaics (PV) accelerates, spatial PV projections at subnational level are necessary for planning grid infrastructure and addressing demand-supply balancing challenges posed by this intermittent source of electricity. Although spatial models of weather-dependent PV productivity are common, few studies have focused on projections of PV installations. This study uses a comprehensive dataset with 68’341 PV installations in Switzerland in order to develop 1- to 5-year-ahead projections of PV installations at a level of 143 Swiss districts. A new modelling methodology is demonstrated, using in-sample and out-of-sample accuracy testing of a multiple linear and two spatial regression models with techno-economic and socio-demographic predictor variables. The results show that exploitable solar PV potential, household size, population density, and electricity prices are predictors with positive effect, and the share of unproductive land area is a predictor of PV installations at a district level with negative effect. Spatial regression models point to the importance of spatial spillovers across proximate districts. The accuracy testing shows that spatial regression models have slightly higher accuracy during in-sample testing of projections, but concerning out-of-sample testing, the multiple linear regression model performs equally well for 1- to 5-year-ahead projections.

Keywords

Solar PV, spatial projections, technology diffusion, spatial energy models, model accuracy, out-of-sample testing

Highlights

- New method for computing 1- to 5-year-ahead spatial projections of PV installations
- Different regression models are compared for in-sample and out-of-sample accuracy
- Spatial regression models have slightly higher accuracy during in-sample testing
- Multiple linear regression model performs equally well in out-of-sample testing

Abbreviations

FIT     Feed-in tariff
LCOE    Levelized cost of electricity
MLR     Multiple linear regression
OLS     Ordinary least squares
PV      Photovoltaics
RMSLE   Root mean square logarithmic error
ROI     Return on investment
SAR     Simultaneous autoregressive model
SEM     Spatial error model

1. Introduction

Resource depletion, climate change, and other environmental and security concerns are the main reasons to foster the transition towards sustainable energy systems [1], especially based on renewable electricity generation [2] and electrification of the building and transport sector [3,4]. For example, Switzerland currently relies heavily on hydropower and nuclear power, which accounted for 60% and 32% respectively of the total electricity generation in 2017 [5].

Following the nuclear accident of Fukushima in 2011, the Swiss Federal Council and the Parliament set the course for an energy transition by prohibiting new nuclear power plants and aiming for a long-term phase-out of five existing reactors [6]. The new Energy Act in 2016 [7] further defined the targets of increasing energy efficiency and expanding domestic renewable electricity generation from 3.7 TWh/year in 2017 (excluding hydropower [5]) to at least 4.4 TWh/year in 2020 and at least 11.4 TWh/year in 2035. Solar photovoltaics (PV) has so far been the fastest growing renewable electricity technology in Switzerland (Figure 1a), thanks to the introduction of a feed-in tariff (FIT) in 2009 and one-time subsidies for smaller installations since 2014 [8]. However, the rate of PV uptake differed considerably among Swiss regions [9], leading to a highly uneven spatial pattern in PV diffusion (Figure 1b and Figures SI1 and SI2 in the Supplementary Information (SI)). This spatial pattern does not purely follow the pattern in global solar irradiation, which indicates the productivity of PV installations (Figure 1c), or the pattern in technical solar PV potential, which combines productivity and the availability of surface area for PV (Figure 1d).

Figure 1: Spatio-temporal patterns in solar PV adoption and PV resource potential in Switzerland: (a) overall increase in cumulative installed PV capacity since 2008 [10]; (b) cumulative installed PV capacity per capita by the end of 2017 [10]; (c) global solar irradiation [11,12]; and (d) yearly exploitable PV potential per capita [12].

As the growth of solar photovoltaics (PV) accelerates, spatial PV projections at subnational level are necessary for planning the grid infrastructure and addressing demand-supply balancing challenges posed by this intermittent source of electricity [13,14]. Although spatial models of weather-dependent PV productivity [15–17] or PV detection from satellite data [18,19] are increasingly common, we are not aware of any study that has focused on producing forward-looking spatial projections of PV installations with spatial regression models.

Typically, energy models have either not considered the spatial dimension [20], assumed spatially even diffusion of PV installations in the future [15], or extrapolated past spatial trends [11]. Plentiful evidence shows that such models poorly project actual PV diffusion even at an aggregated national level [21,22], which makes them even less reliable for spatial projections. Another strand of literature on spatial diffusion investigated techno-economic, socio-demographic and policy predictors behind the spatial diffusion patterns for solar PV [9,23,24] or other technologies [25] but did not develop forward-looking projections.

In order to develop a new methodology to compute reliable forward-looking spatial projections of PV installations, this study uses a comprehensive dataset with 68’341 PV installations in Switzerland at a level of 143 districts and within the timeframe 2010-2017. The study investigates the key predictors of solar PV diffusion in Switzerland at the district level and the accuracy of linear and spatial regression models to project PV diffusion 1- to 5-year-ahead.

Existing literature is reviewed to identify potential PV predictors. These predictors are then integrated into different statistical regression models, which were developed in the last decades for spatial data analysis and spatial econometrics [26] and have recently been applied in the analysis of PV diffusion [24,27–29]. By implementing an in-sample and out-of-sample testing approach, the projection accuracy of the regression models is compared within the timeframe 2010-2017 in order to identify the best performing model to develop forward-looking PV projections to 2022. The study’s focus on producing spatial projections of PV installations and the subsequent out-of-sample testing of these projections are both novel in the energy modelling field and extend previous work [24,27–29] on the analysis of PV diffusion by spatial regression models.

Chapter 2 summarizes the literature review used to select the key predictors of spatial PV diffusion, explains the data gathering process, defines the statistical models, and describes the methodology for the out-of-sample testing. Results are presented in Chapter 3. Chapter 4 discusses the results, the limitations, and future research needs. Conclusions are drawn in Chapter 5.

2. Methodology

The proposed methodology to compute spatial projections of PV installations is summarized in Figure 2. The methodology is demonstrated using a dataset of PV systems installed in Switzerland by the end of 2017 [10] and various techno-economic and socio-demographic predictors (Section 2.1). First, a multiple linear regression model (MLR), a simultaneous autoregressive model (SAR) and a spatial error model (SEM) are fitted (Section 2.2) to identify the key predictors for PV diffusion at a level of 143 districts in Switzerland. Second, the different regression models are compared according to their projection accuracy by in-sample and out-of-sample testing (Section 2.3). Last, the tested models are used to compute 1- to 5-year-ahead projections of spatial PV diffusion in Switzerland by 2022.

Figure 2: Methodology for calculating spatial projections of PV installations and testing their accuracy. Grey boxes indicate the steps of the methodology, blue boxes indicate the outputs.

2.1. Dataset

2.1.1. PV installation data

The PV dataset acquired from Pronovo AG [10] includes 68’341 installations in Switzerland by the end of 2017 that received or were on the waiting list for the FIT or a one-time subsidy. These installations represent 1.8 GW of installed PV capacity by the end of 2017, which is about 90% of the total installed PV capacity in Switzerland [30]. For each PV installation, the dataset provides information on initial operation date, location, and capacity. For the subsequent analysis, four different response variables were used: number of PV projects, installed PV capacity, projects per capita and installed capacity per capita (Table 1). Spatio-temporal data for these variables is shown in Figure SI 1 and Figure SI 2. The PV data was aggregated at the level of 143 Swiss districts, assuming the borders at the beginning of 2019. The average district covers approximately 280 km² and has 60’000 inhabitants; the district is therefore a reasonably small spatial unit compared to methodologically similar studies [24,27,28].

Table 1: Overview of response and predictor variables used in the analysis. All data was collected on a yearly basis at the district level.

Response variables              Explanation
Number of PV projects           Cumulative number of PV projects
PV capacity                     Cumulative installed PV capacity [kW]
Projects per capita             Cumulative number of PV projects per capita
Capacity per capita             Cumulative installed PV capacity per capita [kW]

Techno-economic variables       Unit                Year
Exploitable solar PV potential  GWh                 2017
Electricity price               CHF/kWh             2010-2017
Electricity demand              kWh/capita          2010-2017
Energiestadt¹                   %                   2010-2017
Return on investment (ROI)      %                   2010-2017

Socio-demographic variables     Unit                Year
Population density              Capita/km²          2010-2017
Household size                  Number of persons   2014-2017
Age coefficient²                %                   2011-2017
Green voters³                   %                   2011, 2015
Net income                      CHF/capita          2010-2015
Unproductive area               %                   2004/2009

¹ share of people living in a municipality labelled as Energiestadt

² persons with age > 65 per 100 persons with 20 ≤ age ≤ 65

³ sum of party shares by the Social Democratic, the Green, the Green Liberal, and the Evangelical People’s Party

2.1.2. Techno-economic and socio-demographic data

In order to select predictor variables for the regression models, a literature review was conducted. Figure 3 presents an overview of which predictor variables for spatial PV diffusion have been investigated in the past and found to be key predictors [9,23,24,27,28,31–42]. By the term key predictor, we mean that the respective authors concluded that this predictor contributes to the outcome of spatial PV distribution; this definition is not limited to the p-value of statistical significance. Many studies identified the key predictors for PV diffusion to be techno-economic and socio-demographic variables, such as solar irradiation [24,28,32,34–36], electricity prices [35,36,39], population density [24,28,33,34,38], household size [24,33,37], policy incentives [34,35,39], and homeowner share [24,31,32,34,39]. Conversely, several studies did not find prevalent demographic variables, such as income [23,41], education level [31,34], age [32,34], or environmental attitude [27,28], to be key predictors. In addition to the techno-economic and socio-demographic variables, a vast amount of literature highlighted the importance of peer effects using household- and municipality-level data [23,37–42]. Peer effects were shown to have a strong positive effect on PV diffusion, where both visibility of PV installations and word of mouth play an important role [23,37,41].

Past research demonstrated that peer effects decreased with increasing distance from other PV installations and diminished with increasing time since the last installation [39]. Peer effects were found to be stronger at smaller spatial scales [40]. Other studies [24,27–29] accounted for autocorrelation in spatial data and highlighted that PV diffusion at the district level is affected by spatial spillover effects from neighboring regions. This spillover effect should not be confused with peer effects, which are analyzed at a higher spatial resolution, usually at household level, and are driven by social interaction.

Figure 3: Overview of the predictor variables from the previous studies on spatial PV diffusion (analyzed studies are: [9,23,24,27,28,31–42]).

The predictor variables included in the regression models are shown in Table 1. If the majority of the analyzed studies in Figure 3 found a certain variable to be a key predictor for PV diffusion, the variable was considered in our study. Based on the results from previous Swiss studies [9,23], age, income, and environmental attitudes were also included as predictor variables even if they were not found to be key predictors in other countries. In contrast, other variables (homeowner share, construction activity) did not show any significance in any type of model and were thus excluded from the analysis. Most socio-demographic data from Table 1 is available at the municipality level in Switzerland [43] and was thus aggregated to the level of 143 districts. To account for the different population sizes of municipalities when allocating them to districts, a population-weighted approach was used. As observable from Table 1, data for some socio-demographic variables was available for selected years only, and missing data for the years between 2010 and 2017 was linearly interpolated or extrapolated. Among the Swiss political parties, parliament members of the Social Democratic Party, the Green Party, the Green Liberal Party, and the Evangelical People’s Party were shown to vote mostly in favor of environmental topics [44], and the combined voter share of these parties was used as a proxy for green voters.
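The population-weighted aggregation and the filling of missing years can be sketched as follows. This is a minimal illustration in Python and not the authors' implementation (the study used Microsoft Excel, SQLite and R); the record keys `district`, `population` and `value`, the function names, and all numbers are hypothetical. Extrapolation is assumed to continue the slope of the nearest pair of known years, which the text does not specify.

```python
from collections import defaultdict


def aggregate_to_districts(municipalities):
    """Population-weighted mean of a municipal variable per district.

    `municipalities` is a list of dicts with the hypothetical keys
    'district', 'population' and 'value'.
    """
    weighted_sum = defaultdict(float)
    population = defaultdict(float)
    for m in municipalities:
        weighted_sum[m["district"]] += m["population"] * m["value"]
        population[m["district"]] += m["population"]
    return {d: weighted_sum[d] / population[d] for d in weighted_sum}


def fill_years(known, years):
    """Linearly interpolate between known years and extrapolate beyond them.

    `known` maps year -> value and needs at least two entries. Outside the
    known range, the slope of the nearest pair of known years is continued
    (an assumption made for this sketch).
    """
    xs = sorted(known)
    if len(xs) < 2:
        raise ValueError("at least two known years are required")
    filled = {}
    for year in years:
        if year <= xs[0]:
            x0, x1 = xs[0], xs[1]
        elif year >= xs[-1]:
            x0, x1 = xs[-2], xs[-1]
        else:
            x1 = next(x for x in xs if x >= year)
            x0 = xs[xs.index(x1) - 1]
        slope = (known[x1] - known[x0]) / (x1 - x0)
        filled[year] = known[x0] + slope * (year - x0)
    return filled


# Made-up municipal household sizes for two districts.
records = [
    {"district": "A", "population": 1000, "value": 2.0},
    {"district": "A", "population": 3000, "value": 3.0},
    {"district": "B", "population": 500, "value": 5.0},
]
district_values = aggregate_to_districts(records)

# Made-up green voter shares, known only for the election years 2011 and 2015.
green_voters = fill_years({2011: 30.0, 2015: 34.0}, range(2010, 2018))
```

Weighting by population keeps a large municipality from being averaged on equal terms with a small one in the same district.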

In terms of techno-economic variables, the estimates of the exploitable rooftop PV potential for all Swiss municipalities are publicly available [12]. This variable represents PV electricity production (mean 467 GWh, standard deviation of 359 GWh), considering the resource potential (solar irradiation and availability of roofs), technical constraints (e.g. PV efficiency, angle of roofs), and environmental and legal constraints (e.g. exclusion of heritage-protected buildings) [9]. Electricity prices from the different Swiss utilities were matched to the corresponding municipalities and then districts [45], focusing on the tariff of a 5-room apartment with an electric cooker (mean 0.196 CHF/kWh, standard deviation of 0.028 CHF/kWh). The electricity demand represents the estimations of the yearly demand in a certain district per capita [9,11]. The variable “Energiestadt” represents the share of people living in a municipality which received the label “Energiestadt” for its local energy policy [46]. The return on investment (ROI) was used as a proxy for the financial incentive to install PV panels and was estimated for each district and year following previous work [27]:

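$$ \mathrm{ROI} = \frac{R_{PV} + R_{FIT} - \mathrm{LCOE}}{\mathrm{LCOE}} \qquad (1) $$

$$ \mathrm{LCOE} = \frac{C_0 + \sum_{t=1}^{T} \frac{C_t}{(1+r)^t}}{\sum_{t=1}^{T} \frac{E_t}{(1+r)^t}} \qquad (2) $$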

where ROI is the return on investment in %;

R_PV is the so-called PV tariff which a PV electricity producer receives from the local utility in CHF per kWh and varies between 0.035 CHF/kWh and 0.23 CHF/kWh [47];

R_FIT are the national reimbursements as FIT for electricity fed into the grid in CHF per kWh, assumed as FIT for a 30 kW installation at 0.53 CHF/kWh in 2010 and decreased to 0.16 CHF/kWh in 2017;

LCOE is the levelized cost of electricity in CHF per kWh;

C_0 is the initial investment cost in CHF per kW [48];

C_t are the operation and maintenance costs in CHF per kW in year t, estimated as 2% of the investment cost per year [49];

E_t is the electricity production in kWh per kW in year t [11,12], as a function of the yearly global solar irradiation in a specific district because the solar irradiation is the primary driver of spatial differences in Switzerland;

t is the year of analysis, t = 1, 2, …, T;

T is the lifetime of a PV installation, assumed as 25 years;

r is the discount rate that is represented by the weighted average capital cost of 4.75% [50].
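Equations (1) and (2) can be evaluated with a few lines of code. The sketch below is only an illustration in Python: the function names are hypothetical, the yearly yield E_t is assumed to be constant over the lifetime, and the investment cost of 2000 CHF/kW and the yield of 1000 kWh/kW are made-up inputs. The 2% operation and maintenance share, the 25-year lifetime, the 4.75% discount rate and the tariff values are taken from the definitions above.

```python
def lcoe(c0, om_share, yearly_yield, lifetime=25, rate=0.0475):
    """Levelized cost of electricity in CHF/kWh, following Equation (2).

    c0           initial investment cost in CHF per kW
    om_share     yearly O&M cost as a fraction of c0 (0.02 in the text)
    yearly_yield electricity production in kWh per kW and year
                 (assumed constant over the lifetime in this sketch)
    """
    discounted_cost = c0
    discounted_energy = 0.0
    for t in range(1, lifetime + 1):
        discount = (1.0 + rate) ** t
        discounted_cost += om_share * c0 / discount
        discounted_energy += yearly_yield / discount
    return discounted_cost / discounted_energy


def roi(r_pv, r_fit, lcoe_value):
    """Return on investment as a fraction, following Equation (1).

    Multiply by 100 to express it in %.
    """
    return (r_pv + r_fit - lcoe_value) / lcoe_value


# Hypothetical district: 2000 CHF/kW investment, 1000 kWh/kW yearly yield.
example_lcoe = lcoe(c0=2000.0, om_share=0.02, yearly_yield=1000.0)

# Lower bound of the PV tariff and the 2017 FIT as stated in the text.
example_roi = roi(r_pv=0.035, r_fit=0.16, lcoe_value=example_lcoe)
print(f"LCOE: {example_lcoe:.3f} CHF/kWh, ROI: {100 * example_roi:.1f} %")
```

Because only E_t varies between districts in this setup, spatial differences in ROI would stem from irradiation and from the local PV tariff.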

All LCOE and ROI calculations are based on a 30 kW PV installation. PV installations with a lower capacity have slightly higher investment costs, but also receive a higher FIT. The opposite is valid for PV installations with a higher capacity. Due to right-skewed marginal distributions for most variables, all variables were log-transformed, which led to an improvement of the models. Since the predictor variables differ significantly in units and orders of magnitude, they were standardized by subtracting the mean from each value and dividing the result by the standard deviation. Due to the lack of robust data on self-consumption and its spatial differences across Switzerland, it was assumed that any self-consumed electricity has the same monetary value as the reimbursed electricity.
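The transformation of a predictor can be sketched as below. The example assumes that standardization is applied after the log transform and that the sample standard deviation is used; the text states neither explicitly. The function name and the electricity prices are hypothetical.

```python
import math
import statistics


def log_standardize(values):
    """Log-transform strictly positive values, then standardize them.

    Assumption of this sketch: z-scores are computed on the logged values
    with the sample standard deviation (n - 1 in the denominator).
    """
    logs = [math.log(v) for v in values]
    mean = statistics.fmean(logs)
    sd = statistics.stdev(logs)
    return [(x - mean) / sd for x in logs]


# Made-up electricity prices in CHF/kWh for four districts.
prices = [0.15, 0.18, 0.20, 0.25]
z_prices = log_standardize(prices)
```

After this step each predictor has mean 0 and standard deviation 1, so the parameter estimates of different predictors can be compared directly.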

2.2. Statistical regression models

2.2.1. Multiple linear regression model

First, a multiple linear regression (MLR) model for a single year is used to quantify the relation between a response variable, such as the number of PV installations, and the chosen predictor variables [51]:

$$ y = X\beta + \varepsilon \qquad (3) $$

where 𝑦 is an n × 1 vector of observations of PV installations per district (measured in four ways as described in Section 2.1);

𝑋 is an n × p matrix of observations on the predictor variables in Section 2.1, with the intercept included as the first column;

𝛽 is a p × 1 vector of regression parameters;

𝜀 is an n × 1 vector of error terms;

n is the number of Swiss districts;

p is the number of predictors (including the intercept).

In MLR, the regression parameters are estimated by ordinary least squares (OLS). After checking for linearity, homoscedasticity and normality, the assumption of no spatial autocorrelation was tested using Moran’s I and a Monte Carlo test [52]. If spatial autocorrelation is detected in a dataset, the spatial observations are not independent from each other [53]. Due to the spatial character of the data, strong spatial autocorrelation was indeed observed for all four response variables, suggesting that more sophisticated statistical models such as spatial regression models ought to be tested.
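The autocorrelation check can be illustrated with a short sketch of Moran's I and a permutation-based Monte Carlo test. This is a generic textbook version written in Python and not the routine used in the study (which relied on R); the function names, the one-sided test, the binary chain-shaped neighbor structure and the values are assumptions made for illustration.

```python
import random


def morans_i(values, weights):
    """Moran's I for a list of values and an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * (cross / sum(d * d for d in dev))


def monte_carlo_test(values, weights, n_sim=999, seed=0):
    """One-sided permutation test for positive spatial autocorrelation.

    Values are shuffled across spatial units n_sim times; the pseudo
    p-value is the share of permutations whose Moran's I is at least as
    large as the observed one.
    """
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    at_least_as_large = 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)
        if morans_i(shuffled, weights) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_sim + 1)


# Six hypothetical units arranged in a chain: unit i neighbors unit i + 1.
n_units = 6
chain = [[1 if abs(i - j) == 1 else 0 for j in range(n_units)]
         for i in range(n_units)]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # smooth gradient along the chain

observed_i, p_value = monte_carlo_test(values, chain)
```

A clearly positive Moran's I together with a small pseudo p-value would indicate that neighboring units carry similar values, which violates the independence assumption behind OLS.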

2.2.2. Spatial regression models

Two spatial regression models were used in the analysis: the simultaneous autoregressive (SAR) model [26,54] and the spatial error model (SEM) [54]. While the MLR model is estimated by ordinary least squares (OLS), the SAR and SEM models are estimated by maximum-likelihood estimation (MLE). Thereby, the spatial autoregressive/error parameters are first found by numerical optimization, whereas in a second step, 𝛽 and other parameters are found by generalized least squares (GLS) [55].

The SAR model, specified in Equation (4), assumes that the response variable in a certain spatial unit is affected by its proximate spatial units. Hence, compared to the general regression model in Equation (3), the SAR model is complemented with the spatial autocorrelation term 𝜌𝑊𝑦 [56]. The implied data generating process (DGP) is given in Equation (5).

$$ y = \rho W y + X\beta + \varepsilon \qquad (4) $$

$$ y = (I_n - \rho W)^{-1} X\beta + (I_n - \rho W)^{-1} \varepsilon \qquad (5) $$

where 𝑊 is an n × n spatial weight matrix which defines the neighboring structure of the spatial units (i.e. districts);

𝜌 is the spatial autoregressive parameter;

I_n is an n × n identity matrix.

An alternative is the spatial error model (SEM) [54], described by Equation (6) with the DGP in Equation (7). Here, the spatial lag is modelled in the disturbances 𝑢.

$$ y = X\beta + u \quad \text{with} \quad u = \lambda W u + \varepsilon \qquad (6) $$

$$ y = X\beta + (I_n - \lambda W)^{-1} \varepsilon \qquad (7) $$

where 𝜆 represents the spatial error parameter;

𝑢 is an n × 1 vector of disturbances.
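The step from each model equation to its data generating process is a short rearrangement, written out here as a sketch. It assumes that the matrices $I_n - \rho W$ and $I_n - \lambda W$ are invertible, which the text implies but does not state.

```latex
% SAR: from Equation (4) to Equation (5)
\begin{align*}
y &= \rho W y + X\beta + \varepsilon \\
(I_n - \rho W)\, y &= X\beta + \varepsilon \\
y &= (I_n - \rho W)^{-1} X\beta + (I_n - \rho W)^{-1} \varepsilon
\end{align*}

% SEM: from Equation (6) to Equation (7)
\begin{align*}
u &= \lambda W u + \varepsilon \\
(I_n - \lambda W)\, u &= \varepsilon \\
u &= (I_n - \lambda W)^{-1} \varepsilon \\
y &= X\beta + u = X\beta + (I_n - \lambda W)^{-1} \varepsilon
\end{align*}
```

In the SAR form the inverse multiplies both the predictor term and the error, so a change in a predictor in one district propagates to its neighbors; in the SEM form only the error term is spatially filtered.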

To define the spatial weight matrix in Equations (4) and (6), two different specifications were chosen, as shown in Figure 4: rook contiguity weights and radial distance-based weights [57–59]. Rook contiguity weights determine the spatial influence on the basis of adjacent spatial units [59] and are calculated as:

$$ w_{ij} = \begin{cases} 1 & \text{if } l_{ij} > 0 \\ 0 & \text{if } l_{ij} = 0 \end{cases} \qquad (8) $$

where w_ij is the element in row i and column j of the n × n matrix 𝑊;

l_ij is the length of the shared boundary between district i and district j.

A radial distance-based weight matrix is defined as [59]:

$$ w_{ij} = \begin{cases} 1 & \text{if } 0 \le d_{ij} \le d \\ 0 & \text{if } d_{ij} > d \end{cases} \qquad (9) $$

where d_ij denotes the distance between the centroids of two districts i and j;

d is the threshold distance within which a spatial influence is assumed. In this study, the distance d was chosen such that each district has at least one neighbor.

Figure 4: Spatial neighbor structure of the Swiss districts based on rook contiguity weights and radial distance weights
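Both weighting schemes in Equations (8) and (9) can be sketched in a few lines. The Python example below is illustrative only: the function names, the boundary-length dictionary and the centroid coordinates are hypothetical. Two assumptions go beyond the text: the diagonal of the weight matrix is set to zero, so that a district is not its own neighbor, and the threshold d is taken as the smallest distance that gives every district at least one neighbor.

```python
import math


def rook_weights(n, shared_boundary):
    """Equation (8): 1 if two districts share a boundary of positive length.

    `shared_boundary` maps a pair (i, j) to the boundary length l_ij;
    pairs that are not listed are treated as l_ij = 0.
    """
    w = [[0] * n for _ in range(n)]
    for (i, j), length in shared_boundary.items():
        if i != j and length > 0:
            w[i][j] = w[j][i] = 1
    return w


def distance_weights(centroids, threshold):
    """Equation (9): 1 if the centroid distance is within the threshold.

    The diagonal is kept at 0 (assumption of this sketch).
    """
    n = len(centroids)
    w = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(centroids[i], centroids[j]) <= threshold:
                w[i][j] = 1
    return w


def smallest_threshold(centroids):
    """Smallest d for which every district has at least one neighbor,
    i.e. the largest nearest-neighbor distance."""
    return max(
        min(math.dist(p, q) for j, q in enumerate(centroids) if j != i)
        for i, p in enumerate(centroids)
    )


# Three hypothetical districts with centroid coordinates in km.
centroids = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
d = smallest_threshold(centroids)
w_dist = distance_weights(centroids, d)

# Hypothetical boundary lengths in km; districts 1 and 2 only touch in a point.
w_rook = rook_weights(3, {(0, 1): 2.5, (1, 2): 0.0})
```

In this toy case the distance-based matrix links districts 1 and 2 although they share no boundary segment, which shows how the two neighbor structures can differ.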

2.3. Accuracy evaluation

To evaluate the accuracy of the three regression models, and following a previous typology [60], we distinguish between in-sample and out-of-sample testing (Figure 5). The accuracy of in-sample projections measures the model fit, similar to R², by generating solar PV projections for the same year as the initial fitting (i.e. training of the regression model). The accuracy of out-of-sample projections tests the fitted statistical model against a different dataset, in this case the solar PV diffusion in another year [60]. Comparing in-sample and out-of-sample testing is essential for gaining insight into model performance: models that perform well on in-sample data can still perform poorly on out-of-sample data [61].

Figure 5: In-sample (left) vs. out-of-sample (right) testing of the accuracy of solar PV projections.

Whereas projections are straightforward for the MLR model, a so-called "trend-signal-noise" predictor, which was introduced by Haining [62] and further elucidated by Bivand [61], is used to generate the projections with the spatial regression models. In order to improve the accuracy of projections, a time-lagged response variable y_{t-1}, known from time series analysis [63], is added as a predictor variable to the models.

The root mean square error (RMSE) is the most common model evaluation measure for regression [64]. If the projection is very inaccurate for a single district, however, the RMSE becomes large; it is therefore not robust against outliers [64]. Following [65,66], the root mean square logarithmic error (RMSLE) is chosen as the accuracy evaluation measure instead, because it can handle different orders of magnitude among the districts:

$$ \mathrm{RMSLE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \log(y_i) - \log(\hat{y}_i) \right)^2} \qquad (10) $$

where 𝑦𝑖 is the observed value in the Swiss district 𝑖;

𝑦̂𝑖 is the estimated value by the regression models for the district 𝑖.
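The accuracy measure and the two testing modes can be illustrated with a deliberately simplified sketch in Python. It is not the model of the study: a one-predictor regression of the logged response on its time-lagged value stands in for the MLR, SAR and SEM models, all function names are hypothetical, and the district counts are made up. RMSLE follows Equation (10) and therefore requires strictly positive values.

```python
import math


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def rmsle(observed, predicted):
    """Root mean square logarithmic error, Equation (10)."""
    n = len(observed)
    return math.sqrt(
        sum((math.log(o) - math.log(p)) ** 2 for o, p in zip(observed, predicted)) / n
    )


def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def project(intercept, slope, lagged_counts):
    """Project counts from the lagged counts on the log scale."""
    return [math.exp(intercept + slope * math.log(c)) for c in lagged_counts]


# Made-up cumulative numbers of PV projects in four districts.
counts = {
    2015: [10, 40, 100, 250],
    2016: [14, 52, 125, 300],
    2017: [19, 66, 150, 350],
}

# Training: regress log(y_2016) on the lagged response log(y_2015).
intercept, slope = fit_line(
    [math.log(c) for c in counts[2015]],
    [math.log(c) for c in counts[2016]],
)

# In-sample: project the training year itself.
in_sample_error = rmsle(counts[2016], project(intercept, slope, counts[2015]))

# Out-of-sample: project 2017, a year the model has not seen.
out_of_sample_error = rmsle(counts[2017], project(intercept, slope, counts[2016]))

print(f"in-sample RMSLE: {in_sample_error:.4f}")
print(f"1-year-ahead out-of-sample RMSLE: {out_of_sample_error:.4f}")
```

Because the errors are taken on the log scale, a district with 10 projects and one with 250 projects contribute according to their relative rather than their absolute deviation.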

Data aggregation was conducted with Microsoft Excel and SQLite. All models were implemented within the statistical software R. Spatial weight matrices were also calculated within R and all figures (besides Figure 3) including the maps were created within R.

3. Results

3.1. Analysis of predictor variables for PV diffusion

Table 2 presents the regression results and model evaluation measures for four response variables: number of solar PV projects, PV capacity, number of PV projects per capita, and PV capacity per capita for the year 2017. Depending on the regression model, 66-93% of the variance in the response variable is explained (see R² for MLR and Nagelkerke pseudo R² for SAR and SEM models). The results reveal that for the model specification with number of PV projects per capita the highest share of the variance can be explained, whereas the models describe capacity per capita less well. The Nagelkerke pseudo R² is higher for the SAR and SEM models than the R² of the MLR model for the number of PV projects and projects per capita. In contrast, for PV capacity and capacity per capita, the spatial regression models fit only slightly better. Aside from R², the Akaike Information Criterion (AIC) is another model evaluation measure, where models with smaller AIC values are preferred, and its values in Table 2 support the findings based on R².

Table 2: Standardized parameter estimates and model evaluation measures of five different regression models based on cross-sectional data from 2017: Multiple linear regression (MLR), SAR with rook contiguity weights (SAR.Rook), SAR with radial distance weights (SAR.Dist), SEM with rook contiguity weights (SEM.Rook), and SEM with radial distance weights (SEM.Dist) and for four different response variables: number of PV projects, PV capacity, number of PV projects per capita, and PV capacity per capita.

Projects                      MLR        SAR.Rook   SAR.Dist   SEM.Rook   SEM.Dist
Intercept                     5.85***    4.67***    4.03***    5.83***    5.86***
Expl. PV potential            0.68***    0.65***    0.65***    0.7***     0.68***
Household size                0.28***    0.25***    0.23***    0.21***    0.23***
Population density            0.22***    0.16***    0.15***    0.09*      0.13**
Age coefficient               0.14***    0.14***    0.14***    0.08**     0.11***
Electricity price             0.13***    0.1***     0.09***    0.6*       0.1***
Energiestadt                  0.07       0.08       0.09       0.04       0.04
Green voters                  0.04*      0.04**     0.03**     0.07       0.06
ROI                           0.03       0.02       0.02       0.04**     0.04*
Electricity demand            -0.03      -0.03      -0.03      -0.05*     -0.05*
Net income                    -0.06      -0.06*     -0.05      0.02       0.01
𝜌                                        0.2***     0.31***
𝜆                                                              0.75***    0.74***
AIC                           57.92      45.47      41.72      6.91       28.60
R²                            0.90       0.91       0.91       0.93       0.92
Adj. R²                       0.89
F-statistic                   115.2

Capacity                      MLR        SAR.Rook   SAR.Dist   SEM.Rook   SEM.Dist
Intercept                     9.06***    7.93***    7.65***    9.06***    9.06***
Expl. PV potential            0.7***     0.68***    0.69***    0.72***    0.71***
Household size                0.3***     0.29***    0.28***    0.29***    0.3***
Population density            0.27***    0.24***    0.24***    0.27***    0.28***
Age coefficient               0.01       0.02       0.02       0.02       0.02
Electricity price             0.13***    0.12***    0.11***    0.1**      0.11***
Energiestadt                  0.08*      0.09*      0.09*      0.06       0.06
Green voters                  -0.04      -0.05      -0.05      -0.06      -0.06
ROI                           0.00       0.00       0.00       0.00       0.00
Electricity demand            0.09*      0.08**     0.08**     0.06*      0.07*
Net income                    -0.13***   -0.13***   -0.13***   -0.12**    -0.11**
𝜌                                        0.12*      0.16*
𝜆                                                              0.34*      0.34*
AIC                           114.94     112.34     113.19     110.18     114.01
R²                            0.88       0.88       0.88       0.89       0.88
Adj. R²                       0.87
F-statistic                   97.15

Projects per capita           MLR        SAR.Rook   SAR.Dist   SEM.Rook   SEM.Dist
Intercept                     -4.68***   -2.58***   -2.62***   -4.68***   -4.68***
Expl. PV potential/capita     0.29***    0.25***    0.27***    0.31***    0.31***
Household size                0.24***    0.23***    0.25***    0.21***    0.23***
Age coefficient               0.14***    0.14***    0.15***    0.1***     0.13***
Electricity price             0.09**     0.03       0.05       0.05       0.08**
Energiestadt                  0.05       0.03       0.03       0.01       0.01
Green voters                  0.01       0.01       0.01       0.05*      0.03
ROI                           0.05       0.04       0.03       0.06*      0.06*
Electricity demand            -0.02      -0.03      -0.02      -0.05*     -0.04
Net income                    -0.07*     -0.03      -0.03      0          -0.01
Unproductive area             -0.14***   -0.11***   -0.13***   -0.09*     -0.1**
𝜌                                        0.45***    0.44***
𝜆                                                              0.77***    0.73***
AIC                           65.91      30.01      46.85      6.51       34.60
R²                            0.71       0.78       0.75       0.81       0.77
Adj. R²                       0.69
F-statistic                   33.04

Capacity per capita           MLR        SAR.Rook   SAR.Dist   SEM.Rook   SEM.Dist
Intercept                     -1.47***   -1.16***   -1.12***   -1.47***   -1.47***
Expl. PV potential/capita     0.33***    0.31***    0.32***    0.33***    0.32***
Household size                0.23***    0.23***    0.24***    0.23***    0.23***
Age coefficient               -0.02      0          -0.01      -0.01      -0.01
Electricity price             0.08*      0.05       0.05       0.06       0.07*
Energiestadt                  0.09*      0.07*      0.07*      0.07*      0.07*
Green voters                  -0.06      -0.06      -0.06      -0.06      -0.06
ROI                           0.02       0.02       0.01       0.02       0.02
Electricity demand            0.09*      0.08**     0.09**     0.08*      0.08*
Net income                    -0.12**    -0.1*      -0.1*      -0.11**    -0.11**
Unproductive area             -0.18***   -0.16***   -0.17***   -0.16***   -0.17***
𝜌                                        0.21*      0.24*
𝜆                                                              0.31**     0.22
AIC                           123.95     120.16     121.54     120.12     124.60
R²                            0.66       0.67       0.67       0.67       0.66
Adj. R²                       0.63
F-statistic                   25.12

* p ≤ 0.05, ** p ≤ 0.01, *** p ≤ 0.001

According to Table 2, exploitable solar PV potential, the main predictor variable, has a strong positive effect on PV diffusion in all four response variables and in all five models. Household size, population density, and electricity prices positively affect the PV diffusion for the number of projects and capacity as response variable. In terms of PV projects per capita and PV capacity per capita, both exploitable PV potential and household size are strong positive predictors, whereas unproductive area is a negative predictor for PV diffusion. Net income and electricity demand do not have a significant impact on the number of PV projects and PV projects per capita, but a negative (net income) and positive (electricity demand) association is observed for PV capacity and capacity per capita as response variables. The age parameter does not affect PV capacity and capacity per capita, but has a positive effect on the number of PV projects and PV projects per capita. The effect of the predictor variables Energiestadt, ROI, and green voters is rather low in all models.

Parameter estimates of the spatial autoregressive parameter 𝜌 and the spatial error parameter 𝜆 from SAR and SEM models show high values (Table 2). The SEM models show a particularly high 𝜆, indicating the presence of strong spatial spillovers, which is in line with preliminary tests on spatial autocorrelation. The two spatial parameters are more pronounced for the response variables number of PV projects and PV projects per capita. This is expected, since spillover effects are more pronounced for smaller installations (e.g. installations of private households). In contrast, PV capacity as response variable was dominated by large-sized installations in some selected Swiss districts, leading to lower spillover effects to proximate districts. In addition, the Wald test was used to test for spatial dependence in the spatial regression models. The significance levels of the test results are indicated by stars next to the autoregressive/error parameters 𝜌 and 𝜆 in Table 2. Test results were statistically significant for nearly all spatial regression model combinations (the exception in Table 2 is SEM.Dist for capacity per capita) and therefore justify the inclusion of the spatial terms in Equation (4) and Equation (6). As such, spatial regression models should be preferred over the MLR model.

3.2. Accuracy testing

For in-sample accuracy testing, the MLR, SAR, and SEM models were fitted for each year between 2010 and 2017; Figure 6 shows the RMSLE for the different response variables.

The decreasing RMSLE over the years indicates the improved performance of the regression models due to a richer PV dataset every year. Spatial regression models (SAR and SEM) yield a smaller RMSLE than the MLR model for the number of PV projects as response variable; thus, the autocorrelation term with the significant autoregressive parameter improves the model fit. For the response variable PV capacity, the relative decrease in the RMSLE over the years is higher. Notably, spatial regression models do not yield a much lower RMSLE here. Due to smaller spatial autocorrelation in capacity, spatial regression models are only slightly advantageous, which is also reflected in the lower values of 𝜌 and 𝜆 in Table 2. Similar patterns are observable for number of PV projects per capita and capacity per capita.
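For readers who want to reproduce the accuracy metric, the RMSLE can be sketched as below. The sketch assumes the common definition based on log(1 + x); the definition given in the methods section of this article takes precedence if it differs. The district values are hypothetical and the function name `rmsle` is introduced here for illustration.

```python
import math

def rmsle(observed, predicted):
    """Root mean squared logarithmic error over paired values.

    Assumes the common log(1 + x) form, which is defined for zero counts.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [
        (math.log1p(p) - math.log1p(o)) ** 2
        for o, p in zip(observed, predicted)
    ]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical number of PV projects in four districts.
observed = [120, 45, 300, 8]
predicted = [100, 50, 280, 12]
print(round(rmsle(observed, predicted), 3))
```

Because the error is computed on a logarithmic scale, a given relative deviation contributes similarly in small and large districts, which suits response variables that span several orders of magnitude.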


Figure 6: In-sample accuracy testing of five regression models.

The RMSLE for 1-year-ahead out-of-sample spatial PV projections for 2011-2017 is illustrated in Figure 7. At the beginning of the time series, out-of-sample results reveal a higher RMSLE than the in-sample testing and, in line with the in-sample testing results, the RMSLE subsequently decreases over the years as more data become available. In particular, projections for 2016 and 2017 show RMSLE values that only slightly exceed in-sample results for PV capacity. In contrast to in-sample testing, the SAR and SEM models do not outperform the MLR model for 1-year-ahead projections. The high number of newly installed PV projects or capacity, for instance in 2015, is difficult to capture in all the models. Consequently, all models underestimate the number of projects and the capacity for 2015 and the RMSLE cannot be reduced. Similar patterns are observable for number of PV projects per capita and capacity per capita.

[Figure 6 plot: four panels (Number of PV projects; PV capacity; Projects per capita; Capacity per capita). x-axis: year (ticks 2010 to 2016); y-axis: RMSLE (0.0 to 0.8). Legend: MLR; SAR with rook contiguity weights; SAR with radial distance weights; SEM with rook contiguity weights; SEM with radial distance weights.]


Figure 7: Out-of-sample accuracy testing of five regression models for 1-year-ahead projections.

Although the regression results of the MLR, SAR and SEM models provide insights into the predictors of PV diffusion (Table 2), the out-of-sample accuracy of these models is limited (Figure 7). To further improve the accuracy, a time-lagged response variable $y_{t-1}$ is included as a predictor variable in the regression models, and the results are presented in Figure 8.

A decrease in RMSLE is again observed, although it is not as steady as in the previous results (Figure 7) because the projection is more dependent on the previous year. For example, in 2015, the RMSLE again increased due to the high number of newly installed projects or capacity, which was underestimated by the models. The high RMSLE in 2011 arises because no previous-year data are available for the model fit of 2010.
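The time indexing behind a lagged model can be sketched as follows: the model is fitted on year t with the observation from year t-1 as predictor, and the observation from year t is then fed in to project year t+1. The sketch is deliberately reduced and rests on stated assumptions: it uses the lagged response as the only predictor, works in levels rather than on a transformed scale, and ignores the spatial terms. The function names `fit_lagged` and `project_next` and the district values are hypothetical.

```python
def fit_lagged(y_prev, y_curr):
    """Least-squares fit of y_t = a + b * y_{t-1} across districts."""
    n = len(y_prev)
    if n != len(y_curr) or n < 2:
        raise ValueError("need two equally long series with n >= 2")
    mean_x = sum(y_prev) / n
    mean_y = sum(y_curr) / n
    sxx = sum((x - mean_x) ** 2 for x in y_prev)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(y_prev, y_curr))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def project_next(a, b, y_latest):
    """1-year-ahead projection from the latest observed values."""
    return [a + b * y for y in y_latest]

# Hypothetical number of PV projects in three districts.
y_2015 = [10, 20, 30]
y_2016 = [25, 45, 65]

a, b = fit_lagged(y_2015, y_2016)   # fit for 2016 with lag 2015
y_2017 = project_next(a, b, y_2016) # project 2017 from observed 2016
print(a, b, y_2017)
```

In the full models, the lagged term enters alongside the techno-economic and socio-demographic predictors, so this sketch isolates only the lag mechanism.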

[Figure 7 plot: four panels (Number of PV projects; PV capacity; Projects per capita; Capacity per capita). x-axis: year (2011 to 2017); y-axis: RMSLE (0.0 to 1.2). Legend: MLR; SAR with rook contiguity weights; SAR with radial distance weights; SEM with rook contiguity weights; SEM with radial distance weights.]


Figure 8: Out-of-sample accuracy testing of the five regression models for 1-year-ahead projections with a time-lagged response variable $y_{t-1}$ as predictor.

The RMSLE results in Figure 7 and Figure 8 only depict a mean value for Switzerland. The percentage errors of 1-year-ahead projections for 2012 and 2017 are depicted in Figure 9 for the number of PV projects at a district level (the errors for the three other response variables are shown in Figures SI 3-5). Due to the smaller dataset, percentage errors are much higher for the number of PV projects in 2012, whereas the 1-year-ahead projections for 2017 reveal significantly lower percentage errors because the fitted models from 2016 are more reliable.

Similar to Figure 8, the differences among the models diminish with time, and in 2017, the performance of all models appears similar. The maps illustrate the spatial dependence in the data: since there are fewer data for 2012, the spatial models fail to capture spatial dependence, and red/gray districts are more evenly distributed in Figure 9. In 2017, adjacent districts seem to have a similar model performance. Furthermore, PV capacity is strongly underestimated in most of the districts in 2012 (Figure SI 3) because the observed capacity increased by a factor of 4 from 2010 to 2012. This observation has a significant impact on the out-of-sample projection accuracy, since the capacity in 2012 is projected by the model fit from 2011, for which the capacity from 2010 is used as time-lagged variable. The underestimation for 2012 regarding the number of PV projects is less distinct (cf. Figure 9) because the number of PV projects only increased by a factor of 2.7 from 2010 to 2012.
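The district-level percentage error discussed above can be sketched as below. The sign convention (negative values indicate underestimation) and the handling of districts without observed installations are assumptions of this sketch, and the district names and values are hypothetical.

```python
def percentage_errors(observed, projected):
    """Percentage error per district: 100 * (projected - observed) / observed.

    Negative values mean the model underestimated the observation.
    Districts with an observed value of zero get None (undefined).
    """
    errors = {}
    for district, obs in observed.items():
        if obs == 0:
            errors[district] = None
        else:
            errors[district] = 100.0 * (projected[district] - obs) / obs
    return errors

# Hypothetical observed and projected number of PV projects.
observed = {"District A": 100, "District B": 40, "District C": 0}
projected = {"District A": 80, "District B": 50, "District C": 3}

for district, err in percentage_errors(observed, projected).items():
    print(district, "undefined" if err is None else f"{err:+.1f}%")
```

Unlike the RMSLE, which summarizes all districts in one value, this measure keeps the sign and the location of each error, which is what the maps in Figure 9 display.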

[Figure 8 plot: four panels (Number of PV projects; PV capacity; Projects per capita; Capacity per capita). x-axis: year (2011 to 2017); y-axis: RMSLE (0.0 to 1.2). Legend: MLR; SAR with rook contiguity weights; SAR with radial distance weights; SEM with rook contiguity weights; SEM with radial distance weights.]


Figure 9: Out-of-sample testing by percentage error of 1-year-ahead projections for number of PV projects


In order to analyze the accuracy of projections over longer time horizons, Figure 10 illustrates 1- to 5-year-ahead projections for 2013-2017. Here, the model fit from 2012 is used for projections until 2017. The projected response variable for 2013 was iteratively used for the projection of 2014, and so on for later years. This approach was preferred because it led to more accurate projections than the method in which the model is subsequently refitted for every year to generate another 1-year-ahead projection. The drawback of this method, however, is that the inaccuracy in the fitting year, as indicated by the RMSLE, is propagated to the future projections. This tendency is visible in the poor 5-year-ahead projections of the SAR model with radial-distance weights for the number of PV projects as response variable. The poor projection accuracy is largely due to the deviation observed in the fitting year 2012 that is propagated to subsequent years. Similar patterns are noted for PV capacity, where the slightly higher RMSLE of the MLR and SEM models increases with longer projection horizons. For the number of PV projects per capita and PV capacity per capita, the MLR and especially the SAR models perform poorly for projection horizons up to five years. As shown in Figure 8, SAR models already perform worse for 1-year-ahead projections in 2013 and the error is propagated to later years. Figure SI 6 illustrates 1- to 3-year-ahead projections for 2015-2017 with the fitted models from 2014. The results reveal that fitting with data from 2014 generates much more accurate projections.
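The iterative scheme and the resulting error propagation can be illustrated with a reduced sketch. It assumes a fixed fitted relation of the form y(t+1) = a + b * y(t) with hypothetical coefficients, omits all other predictors and spatial terms, and uses the hypothetical function name `iterate_projection`.

```python
def iterate_projection(a, b, y_start, horizon):
    """Apply y_{t+1} = a + b * y_t repeatedly with a fixed model fit.

    Each projected value is fed back in as the lagged predictor
    for the next year; the model is never refitted.
    """
    path = []
    y = y_start
    for _ in range(horizon):
        y = a + b * y
        path.append(y)
    return path

# Hypothetical coefficients and starting values for one district.
a, b = 5, 2
exact_start = 10      # value without deviation in the fitting year
deviated_start = 11   # same district with a deviation of 1

exact = iterate_projection(a, b, exact_start, horizon=5)
deviated = iterate_projection(a, b, deviated_start, horizon=5)
gaps = [d - e for e, d in zip(exact, deviated)]
print(exact)
print(gaps)  # the initial deviation is scaled by b at every step
```

With a lag coefficient above one, the gap between the two paths widens with each step, which mirrors how a deviation in the fitting year carries over to longer projection horizons.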

Figure 10: Out-of-sample testing for different projection horizons (1- to 5-year-ahead) by model fit from 2012. All models include a time-lagged response variable $y_{t-1}$ as predictor variable.

[Figure 10 plot: four panels (Number of PV projects; PV capacity; Projects per capita; Capacity per capita). x-axis: projection horizon (1 to 5 years); y-axis: RMSLE (0.0 to 1.0 for number of PV projects and PV capacity; 0 to 10 for projects per capita; 0.0 to 2.5 for capacity per capita). Legend: MLR; SAR with rook contiguity weights; SAR with radial distance weights; SEM with rook contiguity weights; SEM with radial distance weights.]


3.3. Spatial solar PV projections up to 2022

After achieving the best projection accuracy with time-lagged models, the five models were used to make 5-year-ahead projections of solar PV diffusion at the district level in Switzerland (see Figure 11). The model fit from 2017 with a time-lagged variable from 2016 is used. All predictor variables are linearly extrapolated up to 2022. After each step of generating projections, the projected number of PV projects is stored and then iteratively used in the next step for a new projection. Figure 11 shows spatial projections of the number of solar PV projects up to 2022; Figures SI 7-9 depict the projections for PV capacity, number of PV projects per capita, and PV capacity per capita, respectively.
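The linear extrapolation of a predictor variable can be sketched as follows. The text does not specify how the extrapolation was carried out, so the least-squares trend over the observed years used here is an assumption, as are the function name `linear_extrapolate` and the example values.

```python
def linear_extrapolate(years, values, target_years):
    """Fit a least-squares line through (year, value) and evaluate it
    at the target years."""
    n = len(years)
    if n != len(values) or n < 2:
        raise ValueError("need two equally long series with n >= 2")
    mean_t = sum(years) / n
    mean_v = sum(values) / n
    stt = sum((t - mean_t) ** 2 for t in years)
    stv = sum((t - mean_t) * (v - mean_v) for t, v in zip(years, values))
    slope = stv / stt
    intercept = mean_v - slope * mean_t
    return [intercept + slope * t for t in target_years]

# Hypothetical predictor series for one district.
years = [2015, 2016, 2017]
values = [10.0, 12.0, 14.0]

future = linear_extrapolate(years, values, range(2018, 2023))
print(dict(zip(range(2018, 2023), future)))
```

The extrapolated predictors for each year would then be combined with the stored projection of the previous year to produce the next projection step.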

As before, a propagation of the model fits from 2017 is observed. Since all models fitted very well to the observed data in 2017, the differences in projections up to 2022 among the regression models are rather small. Figure 11 (b) further illustrates spatially-explicit projections for each district for 2022. Since the number of PV projects is highly dependent on the size of the population and the district, the lower expected number of PV projects in the Alps in Figure 11 appears plausible. Densely populated regions around Lake Geneva and in the Swiss midlands still show a high number of projected PV projects. The highest number of PV projects is expected for the district Bern-Mittelland, where the projected number of PV installations in 2022 varies from 6332 (SEM with rook contiguity weights) to 6546 (MLR) and 6697 (SAR with rook contiguity weights).


Figure 11: 5-year-ahead spatial projections of the number of PV projects up to 2022.

In terms of 5-year-ahead projections of PV capacity (Figure SI 7), the mean Swiss capacity for 2022 across the five models is 3030 MW. Whereas the mean projected number of projects for 2022 more than doubles compared to 2017 (Figure 11), the PV capacity is projected to increase by roughly 70%, suggesting a high number of added smaller-sized PV installations. The spatial patterns are similar for the projections of the number of PV projects (Figure 11 (b)) and PV capacity (Figure SI 7). 5-year-ahead projections up to 2022 for the number of PV projects per capita and PV capacity per capita are presented in Figure SI 8 and Figure SI 9, respectively. The mean projected number of PV projects per capita in 2022 is 0.02 PV projects per capita and the mean PV capacity per capita in 2022 amounts to 0.375 kW/capita.
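The inference about installation size follows from comparing the two growth factors. As a rough worked step, with N denoting the number of PV projects, C the PV capacity, and c̄ = C/N the mean capacity per project (symbols introduced here for illustration), and using only the approximate factors stated above:

```latex
\[
\frac{\bar{c}_{2022}}{\bar{c}_{2017}}
  = \frac{C_{2022}/C_{2017}}{N_{2022}/N_{2017}}
  \lesssim \frac{1.7}{2} = 0.85
\]
```

Under these rounded factors, the mean capacity per project would fall by at least about 15% between 2017 and 2022, which is consistent with a growing share of smaller installations.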

4. Discussion

4.1. Predictors of spatial solar PV diffusion

[Figure 11, panel (a) plot: number of PV projects (y-axis, 0 to 160000) by year (x-axis, 2010 to 2022), with observed values and projections. Legend: Observed; MLR; SAR with rook contiguity weights; SAR with radial distance weights; SEM with rook contiguity weights; SEM with radial distance weights. Panel (b) is not recoverable from the extracted text.]

The regression results in this study are in line with previous studies, which analyzed solar PV diffusion at various spatial levels. Exploitable solar PV potential was also found to be a strong positive predictor of financially-supported PV diffusion at a municipality level in Switzerland [9]. To the authors' best knowledge, these are the only two studies that considered exploitable PV potential as a predictor. The positive effect of household size on PV diffusion at the district level in this study was also identified at higher spatial resolution [33,37], where it is argued that stronger peer effects are induced when more people live in the same household [37]. Due to the focus on districts in the present study, the positive effect of household size would rather be explained by the urban-rural differences that were observed in Switzerland [9], where rural cantons with a larger average household size also exhibit higher PV deployment.

Conversely, one previous study found a negative effect of household size [24], attributing their finding to the higher disposable income for PV by smaller families.

In terms of population density, the positive effect in this study was also affirmed elsewhere [28], whereas other studies found a negative effect [24,34,38]. This disparity can be explained by the context of our study: in Switzerland, the absolute number of PV installations and the capacity are high in mid-sized towns that have a rather high population density and a large average household size. In contrast, cities and alpine regions have smaller household sizes. As such, population density is not a driver of PV diffusion but is either a positive or a negative predictor, depending on the spatial aggregation of the analysis. This can be seen from another Swiss study at the municipality level, which found a negative effect of population density [9] due to an urban-rural divide that was observed at a more precise spatial aggregation. Due to the large number of mountains, lakes and glaciers in Switzerland, the share of unproductive land area was also found to have a negative effect on PV diffusion at the district level.

In line with our findings, the positive effect of high electricity prices on PV diffusion was previously found at a district level [35], municipality level [36], and census block group level [39]. Electricity prices can thus be considered as a decisive predictor for PV diffusion, independent of the spatial level. The other financial variable in this study, the ROI, was slightly positive and only significant for some model combinations (Table 2). In contrast to electricity prices that are widely known to anyone interested in investing in solar PV, the ROI is more difficult to estimate because it depends on PV tariffs and the plant’s productivity (solar irradiation). Consistent with earlier studies (Figure 2), districts with a high share of political parties that promote environmental topics do not necessarily adopt more PV. In Switzerland, districts with a high share of green voters are typically urban districts, which demonstrate several reasons for lower PV investments (e.g. higher tenant share, wider availability of electricity products from the utilities including PV electricity). The effect of net income is slightly negative, due to the fact that urban districts with lower investment incentives for PV
