
Study of the national univariate effects

From the document Benjamin Dubois pour obtenir le grade de (pages 90-97)


3.5 Experiments with independent models

3.5.3 Study of the national univariate effects

In order to better understand the model estimated for the national load forecasting, we illustrate in this section the univariate effects and the distribution of the residuals. This also allows us to identify potential weaknesses of the chosen model. We recall that the national model uses a fictitious weather station whose temperature and cloud cover are computed with the weighted mean given in Table F.1.

In the illustrations of this section, the model is not updated during 2016: we fix the training and test datasets to the 3×52 weeks before and the 52 weeks after Friday, January 1st, 2016, so that both are well balanced and contain approximately the same quantity of data for each month, each day of the week and each hour. Thus the distribution of the data in the test set is representative of operational conditions in terms of the possible values of the input variables.
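The split described above can be sketched with the standard library. This is only an illustration of the date arithmetic: the function name `split_bounds` is hypothetical, and treating the end of the test window as exclusive is an assumption, not something stated in the text.

```python
from datetime import datetime, timedelta

# Split date quoted in the text: Friday, January 1st, 2016.
SPLIT = datetime(2016, 1, 1)

def split_bounds(split=SPLIT, train_weeks=3 * 52, test_weeks=52):
    """Return (train_start, split, test_end) for whole-week windows.

    Hypothetical helper: training covers [train_start, split) and
    testing covers [split, test_end).
    """
    train_start = split - timedelta(weeks=train_weeks)
    test_end = split + timedelta(weeks=test_weeks)
    return train_start, split, test_end

train_start, split, test_end = split_bounds()

# Number of hourly observations in each window.
n_train_hours = int((split - train_start) / timedelta(hours=1))
n_test_hours = int((test_end - split) / timedelta(hours=1))

print(train_start.date(), test_end.date(), n_train_hours, n_test_hours)
```

Because both windows span a whole number of weeks, each hour of the week appears exactly the same number of times within a window, which is what makes the two sets balanced across days of the week and hours.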

Hour of the week The effect estimated for the hour of the week is presented in Figure 3.7. It is one of the most important effects as its amplitude is the second largest, after the effect of the past loads. The average forecasts $\mathbb{E}_{\mathrm{train}}[\hat{\ell} \mid h]$ and $\mathbb{E}_{\mathrm{test}}[\hat{\ell} \mid h]$ are so close to the average target loads $\mathbb{E}_{\mathrm{train}}[\ell \mid h]$ and $\mathbb{E}_{\mathrm{test}}[\ell \mid h]$ that the curves are superimposed. Note that the hour of the week is also included in interactions with the day of the year, the past loads and the indicator of holidays.

Every day of the week, the forecasts are less accurate during the day than during the night. Besides, Mondays clearly represent a problem as the residuals are much larger than on the other days. We believe that this is because the model includes the effect of the load two days before, and Mondays are the only working days preceded by two non-working days. The inverse situation occurs with transitions from Fridays to Saturdays, but it is not as visible in the residuals.
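As a minimal sketch of how such conditional curves can be computed, the snippet below averages the loads, the forecasts and the absolute residuals for each hour of the week. The function names and the toy data are assumptions made for illustration (including the convention that hour 0 is Monday 00:00); they are not the code used to produce Figure 3.7.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def hour_of_week(ts):
    """Hour index in 0..167, with 0 = Monday 00:00 (convention assumed here)."""
    return ts.weekday() * 24 + ts.hour

def conditional_means(timestamps, loads, forecasts):
    """Return {h: (mean load, mean forecast, mean absolute residual)}."""
    buckets = defaultdict(list)
    for ts, load, forecast in zip(timestamps, loads, forecasts):
        buckets[hour_of_week(ts)].append((load, forecast, abs(load - forecast)))
    out = {}
    for h, rows in buckets.items():
        n = len(rows)
        # zip(*rows) yields the three columns: loads, forecasts, |residuals|.
        out[h] = tuple(sum(col) / n for col in zip(*rows))
    return out

# Toy data: two weeks of hourly values, with a forecast error of 1 on Mondays only.
start = datetime(2016, 1, 4)  # a Monday
timestamps = [start + timedelta(hours=i) for i in range(2 * 168)]
loads = [50.0 + (10.0 if ts.hour >= 8 else 0.0) for ts in timestamps]
forecasts = [
    load + (1.0 if ts.weekday() == 0 else 0.0)
    for ts, load in zip(timestamps, loads)
]

stats = conditional_means(timestamps, loads, forecasts)
print(stats[0])       # Monday 00:00
print(stats[24 + 8])  # Tuesday 08:00
```

In this toy series the mean absolute residual is non-zero only for the 24 Monday hours, which is the kind of pattern the bottom-left panel of such a figure would make visible.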

Day of the year The effect of the day of the year is presented in Figure 3.8.

Although the shape of the learned effect in the national model matches the shape of the load over one year, its amplitude is ten times smaller than that of the effect of the hour of the week. The residuals are particularly large during the Christmas period and, while they are on average smaller during summer, they are still large during the summer break.

Temperatures The effects estimated for the temperature, the delayed temperature and the extremal values are presented for the national model in Figures 3.9, 3.10 and F.23 to F.27. The norms of the residuals are larger at cold temperatures, which typically correspond to larger loads.

Note that the shape of the effect $f_5^{s}$ in Figure 3.9 learned for the temperature matches the shape of the conditional load, but this is not the case for the effects learned for the delayed temperature, $f_{6,-24}^{s}$ in Figure 3.10 and $f_{6,-48}^{s}$ in Figure F.23. However, we should not attempt a causal interpretation of these graphs. Indeed, interpreting the learned effects is not trivial because the inputs are highly correlated.
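The difficulty can be illustrated on a toy problem. The sketch below is not the thesis model: it assumes a plain two-variable ridge regression with hypothetical names (`ridge_2d`, `temp`, `temp_lag`) and synthetic data in which the load depends on the current temperature only. Because the delayed temperature is almost a copy of the current one, the penalized fit splits the effect between the two inputs.

```python
import random

def ridge_2d(x1, x2, y, lam):
    """Closed-form ridge fit of y ~ b1*x1 + b2*x2 (no intercept, penalty lam)."""
    a11 = sum(u * u for u in x1) + lam
    a22 = sum(v * v for v in x2) + lam
    a12 = sum(u * v for u, v in zip(x1, x2))
    c1 = sum(u * t for u, t in zip(x1, y))
    c2 = sum(v * t for v, t in zip(x2, y))
    # Solve the 2x2 normal equations with Cramer's rule.
    det = a11 * a22 - a12 * a12
    return (a22 * c1 - a12 * c2) / det, (a11 * c2 - a12 * c1) / det

rng = random.Random(0)
temp = [rng.gauss(0.0, 1.0) for _ in range(1000)]    # centred temperature (synthetic)
temp_lag = [t + rng.gauss(0.0, 0.01) for t in temp]  # nearly identical delayed input
load = [-2.0 * t for t in temp]                      # true effect: current temperature only

b_temp, b_lag = ridge_2d(temp, temp_lag, load, lam=10.0)
print(b_temp, b_lag)
```

Both coefficients come out close to -1 although the data were generated with a coefficient of -2 on `temp` and 0 on `temp_lag`: the sum of the effects is well determined, but the share attributed to each correlated input is not.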

Also, the amplitudes of the estimated effects related to the temperatures are much larger than the amplitude of the effect of the day of the year presented in Figure 3.8: according to the learned model, the variations of the amplitude of the load during the year are explained much more by the changes in the temperatures than by the day of the year.

FIGURE 3.7: Estimated effect of the hour of the week

(top left) Effect $\beta_0 + f_3(h)$ of the hour of the week estimated in the …

Note that the residuals are still quite correlated with the hour of the week.

Past loads The effect $f_{15,-24}(\ell_{-24})$ of the 24 hours-delayed loads is presented in Figure 3.11. It seems almost linear, which corroborates the idea that the past load acts like a corrective term, but the slope in Figure 3.11 is slightly smaller than the slope of the empirical loads $\mathbb{E}[\hat{\ell} \mid \ell_{-24}]$, which is close to 1. This effect is significant since it has the largest amplitude. The estimated effect $f_{15,-48}(\ell_{-48})$ of the 48 hours-delayed loads is presented in Figure F.28.

Indicator of the Christmas period The average values of the national loads, the forecasts and the residuals during and outside the Christmas period are given in Table 3.7, for the training and the test sets. Because this indicator is highly correlated with the temperatures and the day of the year, the average values of the different time series are computed only for observations in December and January.
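A minimal sketch of this restricted comparison is given below. The exact definition of the Christmas period is not given in this section, so `in_christmas_period` uses a purely hypothetical window (24 December to 1 January), and the daily loads are toy values chosen only to make the computation easy to follow.

```python
from datetime import date, timedelta

def in_christmas_period(day):
    """Hypothetical definition of the indicator: 24 December to 1 January."""
    return (day.month == 12 and day.day >= 24) or (day.month == 1 and day.day == 1)

def christmas_contrast(days, loads):
    """Mean load during and outside the Christmas period, December-January only."""
    inside, outside = [], []
    for day, load in zip(days, loads):
        if day.month not in (12, 1):
            continue  # other months are dropped to limit confounding with the season
        (inside if in_christmas_period(day) else outside).append(load)
    return sum(inside) / len(inside), sum(outside) / len(outside)

# Toy daily series: 52 during the period, 60 in the rest of Dec/Jan, 45 otherwise.
days = [date(2015, 1, 1) + timedelta(days=i) for i in range(365)]
loads = [
    52.0 if in_christmas_period(d) else (60.0 if d.month in (12, 1) else 45.0)
    for d in days
]

mean_xmas, mean_rest = christmas_contrast(days, loads)
print(mean_xmas, mean_rest)
```

Had the comparison used the whole year as the reference, the low summer values would have pulled the reference mean down and hidden the difference attributable to the Christmas period.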

The coefficient $\alpha_1$ is negative, as expected because the economic activity decreases during this period. However, its amplitude is 5 times smaller than the difference in the training set between the average load during the Christmas period $\mathbb{E}_{\mathrm{train}}[\ell \mid \mathbf{1}_{\mathrm{xmas}} = 1]$ and the average load $\mathbb{E}_{\mathrm{train}}[\ell \mid \mathbf{1}_{\mathrm{xmas}} = 0, \mathrm{month} \in \{\mathrm{Dec}, \mathrm{Jan}\}]$ during the rest of December and January.

FIGURE 3.8: Estimated effect of the day of the year

(top left) Effect $\beta_0 + f_4(d)$ estimated in the national short-term model for the day of the year after renormalization.

(bottom left) Smoothed and renormalized graphs of the conditional norms of the residuals $\mathbb{E}_{\mathrm{train}}[|\ell - \hat{\ell}| \mid d]$ and $\mathbb{E}_{\mathrm{test}}[|\ell - \hat{\ell}| \mid d]$.

(top right) Conditional loads $\mathbb{E}_{\mathrm{train}}[\ell \mid d]$ and $\mathbb{E}_{\mathrm{test}}[\ell \mid d]$ and conditional forecasts $\mathbb{E}_{\mathrm{train}}[\hat{\ell} \mid d]$ and $\mathbb{E}_{\mathrm{test}}[\hat{\ell} \mid d]$.

The average loads and the forecasts are very close on the training and the test sets, so the curves are superimposed. Note that the large residuals during the Christmas period are not due to boundary effects or to a low density of the data, since the day of the year $d$ is uniformly distributed. Also, the learned effect $\beta_0 + f_4(d)$ presents high frequencies. This may be a sign of overfitting, although the hyperparameters have been selected empirically.

Timestamp The last univariate effect in the model depends on the timestamp. In Figure 3.12, we present the smoothed loads, forecasts and residuals over the training and the test sets. Visually, it is not obvious what the linear effect of the timestamp should be. Yet, the coefficient $\gamma_2$ in front of the linear timestamp $t$ equals −45 in the national short-term model, which corresponds to a decrease of 45 MWh of the hourly load every year. Still, the loads were overestimated in 2016, since the average residuals are mostly negative. Indeed, if the model can know the test set in advance, that is to say when the model is learned with 4 years of data, the learned coefficient is −49. It remains however a rather small value.
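To make the meaning of such a coefficient concrete, the sketch below recovers a yearly trend by ordinary least squares on a synthetic series. It is an illustration under stated assumptions only: the time unit (years), the base level of 55 000 MWh and the helper `ols_slope` are hypothetical, and the thesis model estimates this coefficient jointly with all the other effects rather than in isolation.

```python
def ols_slope(t, y):
    """Least-squares slope of y against t (simple linear regression)."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    num = sum((a - t_mean) * (b - y_mean) for a, b in zip(t, y))
    den = sum((a - t_mean) ** 2 for a in t)
    return num / den

HOURS_PER_YEAR = 24 * 365.25

# One sample per day over a 3 x 52 week window, time expressed in years.
hours = range(0, 3 * 52 * 168, 24)
t_years = [h / HOURS_PER_YEAR for h in hours]

# Synthetic hourly load (MWh) with a pure linear trend of -45 MWh per year.
load_mwh = [55_000.0 - 45.0 * t for t in t_years]

slope = ols_slope(t_years, load_mwh)
print(round(slope, 6))
```

On real data the seasonal and weekly variations are several orders of magnitude larger than a trend of this size, which is consistent with the remark that the linear effect is not visually obvious.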

The statistical significance of these coefficients has not been tested, and using the timestamp in our experiments only led to a minor improvement. A more refined and efficient treatment of the trend over time is proposed by Goude et al. [2013].

FIGURE 3.9: Estimated effect of the temperature

(top left) Effect $\beta_0 + f_5^{s}(T^{s})$ learned by the national short-term model for the weighted temperature defined in Table F.1 after renormalization.

(bottom left) Norm of the conditional residuals $\mathbb{E}_{\mathrm{train}}[|\ell - \hat{\ell}| \mid T^{s}]$ and $\mathbb{E}_{\mathrm{test}}[|\ell - \hat{\ell}| \mid T^{s}]$.

(top right) Conditional loads $\mathbb{E}_{\mathrm{train}}[\ell \mid T^{s}]$ and $\mathbb{E}_{\mathrm{test}}[\ell \mid T^{s}]$ with the conditional forecasts $\mathbb{E}_{\mathrm{train}}[\hat{\ell} \mid T^{s}]$ and $\mathbb{E}_{\mathrm{test}}[\hat{\ell} \mid T^{s}]$.

(bottom right) Density of the data in the training and the test sets.

The illustration of the density of the data shows that the distribution $\mu_{\mathrm{test}}$ of the temperatures in 2016 was different from the distribution $\mu_{\mathrm{train}}$ from 2013 to 2015: this is indeed confirmed by the boxplots in Figure 2.3.

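FIGURE 3.10: Estimated effect of the 24 h-delayed temperature

(top left) Effect $\beta_0 + f_{6,-24}^{s}(T^{s}_{-24})$ learned by the national short-term model for the 24 hours-delayed temperature after renormalization.

(bottom left) Marginal norm of the residuals $\mathbb{E}_{\mathrm{train}}[|\ell - \hat{\ell}| \mid T^{s}_{-24}]$ and $\mathbb{E}_{\mathrm{test}}[|\ell - \hat{\ell}| \mid T^{s}_{-24}]$.

(top right) Target loads $\mathbb{E}_{\mathrm{train}}[\ell \mid T^{s}_{-24}]$ and $\mathbb{E}_{\mathrm{test}}[\ell \mid T^{s}_{-24}]$ with the forecasts $\mathbb{E}_{\mathrm{train}}[\hat{\ell} \mid T^{s}_{-24}]$ and $\mathbb{E}_{\mathrm{test}}[\hat{\ell} \mid T^{s}_{-24}]$.

(bottom right) Density of the data in the training and the test sets.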

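FIGURE 3.11: Estimated effect of the 24 h-delayed load

(top left) Effect $\beta_0 + f_{15,-24}(\ell_{-24})$ learned by the national short-term model for the 24 hours-delayed loads after renormalization.

(bottom left) Conditional norm of the residuals $\mathbb{E}_{\mathrm{train}}[|\ell - \hat{\ell}| \mid \ell_{-24}]$ and $\mathbb{E}_{\mathrm{test}}[|\ell - \hat{\ell}| \mid \ell_{-24}]$.

(top right) Conditional loads $\mathbb{E}_{\mathrm{train}}[\ell \mid \ell_{-24}]$ and $\mathbb{E}_{\mathrm{test}}[\ell \mid \ell_{-24}]$ with the conditional forecasts $\mathbb{E}_{\mathrm{train}}[\hat{\ell} \mid \ell_{-24}]$ and $\mathbb{E}_{\mathrm{test}}[\hat{\ell} \mid \ell_{-24}]$.

(bottom right) Density of the loads in the training and the test sets.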

TABLE 3.7: Estimated effect of the Christmas period

Columns: Christmas period ($\mathbf{1}_{\mathrm{xmas}} = 1$); rest of December and January ($\mathbf{1}_{\mathrm{xmas}} = 0$).

Number of samples and average values of the loads, the forecasts and the residuals during the Christmas period ($\mathbf{1}_{\mathrm{xmas}} = 1$) and during the rest of December and January ($\mathbf{1}_{\mathrm{xmas}} = 0$). The coefficient $\alpha_1$ is the factor in front of the indicator $\mathbf{1}_{\mathrm{xmas}}$ in the national short-term model.

FIGURE 3.12: Forecasts and residuals over the database

(top) Loads and forecasts from 2013 to 2015 for the training set and in 2016 for the test set.

(bottom) Residuals of the national short-term model over the training and the test sets.
