In the document Benjamin Dubois pour obtenir le grade de (pages 120-124)

3.7 Pending questions

In this section, we highlight the problems encountered and the limits of the independent models presented so far.

3.7.1 Important residuals on Mondays

At all levels of aggregation, the short-term models particularly struggle to predict the loads on Mondays, as shown for instance in Figure 3.7. We attribute this to the use of the past loads in the model and to the fact that Mondays are preceded by non-working days. Although introducing the interaction between the past loads and the hour of the week leads to an improvement, it does not entirely solve the problem. How to deal with sequences of non-working and working days is an essential question that remains unanswered so far.
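To make the interaction between the past loads and the hour of the week concrete, here is a minimal sketch. It assumes a one-hot encoding of the 168 hours of the week, which is a simplification and not necessarily the encoding used in this work. The function names, the 24-hour lag and the load value are hypothetical.

```python
from datetime import datetime, timedelta

HOURS_PER_WEEK = 7 * 24

def hour_of_week(ts):
    """Index in [0, 167]; 0 is Monday 00:00 (datetime.weekday() is 0 on Mondays)."""
    return ts.weekday() * 24 + ts.hour

def interaction_features(ts, past_load):
    """Lagged load interacted with a one-hot encoding of the hour of the week.

    A linear model on this vector learns one coefficient per hour of the week
    for the lagged load, so the weight given to a Sunday load in a Monday
    forecast can differ from the weight given to a Tuesday load on a Wednesday.
    """
    features = [0.0] * HOURS_PER_WEEK
    features[hour_of_week(ts)] = past_load
    return features

# Toy example: forecast for Monday 2016-01-04 08:00 from the load 24 hours earlier.
target = datetime(2016, 1, 4, 8)
lag = timedelta(hours=24)         # hypothetical lag
loads = {target - lag: 41.5}      # made-up load value for Sunday 08:00
x = interaction_features(target, loads[target - lag])
```

With this encoding, the coefficient applied to the lagged load is specific to each hour of the week. This explains why the interaction helps on Mondays, without resolving the question of longer sequences of non-working days.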

3.7.2 Different regularizations for the local models

At disaggregated levels, we chose equal hyperparameters for the different time series to forecast, in order to control the dimension of the hyperparameter space during the exploratory part of the work. When trying to fit the best models, this constraint can be relaxed, and we have observed empirically that this leads to better performances on the test year 2016. However, this relaxation increases the risk of overfitting the training sets and makes the search for any further improvement of the models more difficult. At this point, we do not have a clear answer about the best way to proceed.

Note that by choosing the same hyperparameters for the different models within the same aggregation level, that is to say the same number of knots for the features and the same regularization coefficients, we have in a sense already considered the local load forecasting models in a multi-task setting, since choosing the hyperparameters that are best on average lets the different models interact. This is only a first step towards the more general multi-task settings considered in Chapter 4.
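The difference between shared and individual hyperparameters can be sketched as follows. The validation errors, the series names and the grid of regularization coefficients are all made up for the illustration; the two helper functions are hypothetical and do not reproduce the selection procedure of this work.

```python
# Made-up validation errors: errors[series][alpha] for a regularization coefficient alpha.
errors = {
    "substation_A": {0.01: 5.2, 0.1: 4.8, 1.0: 5.5},
    "substation_B": {0.01: 6.1, 0.1: 6.4, 1.0: 7.0},
    "substation_C": {0.01: 4.9, 0.1: 4.1, 1.0: 4.0},
}

def best_shared(errors):
    """Single coefficient minimizing the error averaged over all the series."""
    candidates = next(iter(errors.values())).keys()

    def mean_error(alpha):
        return sum(e[alpha] for e in errors.values()) / len(errors)

    return min(candidates, key=mean_error)

def best_individual(errors):
    """One coefficient per series, each minimizing its own error."""
    return {name: min(e, key=e.get) for name, e in errors.items()}

shared = best_shared(errors)          # one value for the whole aggregation level
individual = best_individual(errors)  # one value per time series
```

In this toy setting the shared search compares 3 candidates, whereas the relaxed search implicitly ranges over 3^3 = 27 combinations. The gap grows exponentially with the number of series, which illustrates both the gain in flexibility and the increased risk of overfitting.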

3.7.3 Possibility of additional information

In addition to the delayed temperatures and their extremal values over 24- or 48-hour windows, we have considered different transformations of the weather data.

However, we did not obtain any clear improvement by feeding the models with, for instance, univariate features for the cloud cover, exponential smoothing or differences of past temperatures.
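For reference, the transformations mentioned above (delays, extremal values over a window, exponential smoothing, differences) can be written in a few lines. This is a sketch on a made-up temperature series; the function names, the window sizes and the smoothing parameter are assumptions, and the handling of the first time steps (truncated windows, `None` for unavailable values) is one possible convention among others.

```python
def delayed(series, lag):
    """Value observed `lag` steps earlier; None where the past is unavailable."""
    return [series[t - lag] if t >= lag else None for t in range(len(series))]

def rolling_extreme(series, window, func):
    """min or max over the last `window` values, current one included.

    The window is truncated at the beginning of the series.
    """
    return [func(series[max(0, t - window + 1): t + 1]) for t in range(len(series))]

def exponential_smoothing(series, alpha):
    """s[0] = x[0], then s[t] = alpha * x[t] + (1 - alpha) * s[t - 1]."""
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

def differences(series, lag=1):
    """x[t] - x[t - lag]; None where the past is unavailable."""
    return [series[t] - series[t - lag] if t >= lag else None for t in range(len(series))]

temps = [2.0, 4.0, 3.0, 7.0, 6.0]  # made-up hourly temperatures

temp_delayed = delayed(temps, 2)
temp_max = rolling_extreme(temps, 3, max)
temp_min = rolling_extreme(temps, 3, min)
temp_smooth = exponential_smoothing(temps, 0.5)
temp_diff = differences(temps)
```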

To improve the forecasts, we believe that the modeling needs other variables instead. However, in order to simplify the set of inputs and focus on the structure of the models, we have ignored, following expert advice, some extra information that is nevertheless known to marginally impact the demand [RTE, 2014].

Wind and humidity. Wind speed and humidity were neglected because they are considered to have only a slight impact on the electricity demand of end-users. However, the recent evolution of the French electric power system should lead us to question this assumption.

Indeed, the time series that we want to forecast are the demands at the different substations, which correspond roughly to the end-user demand minus the local production, as explained in Section 2.1. Although RTE corrected the load time series in the database to account for the local production of energy that reduces the transit of electricity on the high-voltage network, we know that this procedure might be imperfect (cf. Section 2.1). Therefore, changes in weather conditions such as the wind speed and the intensity of the sunlight might still impact the electricity demand through this mechanism, because of the recent development of local renewable energy farms.

Since the sunlight, or more exactly the cloud cover, is part of the inputs of our model, we can hope that the local solar production is automatically taken into account. However, wind speed is not part of the data that we use. Put differently, wind speed may not have a considerable impact on the demand of end-users but probably has one on the load at the substation level. We believe that this question requires further investigation.

Substations for national forecasting. For the national load forecasting problem, the historical model of EDF has access to a fixed weighted mean of the conditions at 32 weather stations [Pierrot and Goude, 2011], as explained in Section 2.5.2. The choice of this meteorological information was not thoroughly questioned.

In order to better understand what this weighted mean is, we illustrate in Figure 3.31 its coordinates in the 2-dimensional space spanned by the first 2 principal components of the matrix whose columns are the temperatures at the different weather stations, after its rows have been centered.

FIGURE 3.31: Projection of stations on 2 principal components. Projection, on the first 2 principal components of the centered temperature matrix, of the time series at the different weather stations. The national uniform mean and the weighted mean (cf. Table F.1) are also plotted with crosses. Legend: individual weather stations, uniform mean, weighted mean.

Besides, we assess in Figures F.66 and F.67 the possibility of summarizing the information at the 32 weather stations in a lower-dimensional vector with the best rank-r approximation of the temperature matrix.
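The projection and the rank-r approximation discussed above can be sketched with a singular value decomposition. The example below uses synthetic temperatures and random weights, neither of which corresponds to the actual data or to the weights of Table F.1. Applying the rank-r approximation to the uncentered matrix is also an assumption made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hours, n_stations = 200, 32

# Synthetic temperatures (made-up data): a common signal plus station-specific effects.
common = 10 + 8 * np.sin(np.linspace(0, 6 * np.pi, n_hours))
X = (common[:, None]
     + rng.normal(0, 2, (n_hours, n_stations))
     + rng.normal(0, 3, (1, n_stations)))

# Center the rows: at each hour, subtract the mean over the stations.
row_means = X.mean(axis=1, keepdims=True)
Xc = X - row_means

# Thin SVD; the first 2 left singular vectors span the projection plane.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
U2 = U[:, :2]
station_coords = U2.T @ Xc                  # shape (2, n_stations)

def project(series):
    """Coordinates of a time series of length n_hours on the first 2 components."""
    return U2.T @ (series - row_means[:, 0])

uniform = np.full(n_stations, 1 / n_stations)
weights = rng.random(n_stations)
weights /= weights.sum()                    # hypothetical weights, NOT those of Table F.1

uniform_coords = project(X @ uniform)       # the origin, by construction
weighted_coords = project(X @ weights)

def best_rank_r(M, r):
    """Best rank-r approximation in Frobenius norm (truncated SVD)."""
    U_, s_, Vt_ = np.linalg.svd(M, full_matrices=False)
    return (U_[:, :r] * s_[:r]) @ Vt_[:r]

r = 5
X_r = best_rank_r(X, r)
relative_error = np.linalg.norm(X - X_r) / np.linalg.norm(X)
```

Because the rows are centered, the uniform mean is mapped to the origin of the plane. Any weighted mean whose weights sum to 1 is mapped to the same weighted combination of the station coordinates, so it lies inside their convex hull when the weights are non-negative.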

In the national model, we could include the first principal components instead of the weighted average to enrich the information about the weather. Alternatively, we tried to learn a linear combination automatically during the optimization of the model, with and without variable selection penalties. However, none of our limited experiments allowed us to conclude that another combination of the weather stations could lead to better numerical performances at the national level. Therefore, we have kept the weighted average of the 32 weather stations for the national model.

Weather forecasts for local models. At the local level, we feed the models with the weather at the 2 closest weather stations. Even though this number was selected empirically, other settings should be considered.

In particular, Météo-France provides richer forecasts, at approximately 4000 points of a grid covering the French territory. Using forecasts with this finer granularity can potentially improve the quality of the load forecasts, as long as the relevant information is selected for each local model.
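Selecting the closest weather stations for a local model can be sketched as a nearest-neighbor search on the sphere. The great-circle (haversine) distance is an assumption, since the distance actually used is not specified here. The coordinates below are approximate city centers used as stand-ins for weather stations, and the substation location is hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def closest_stations(site, stations, k=2):
    """Names of the k stations closest to `site`, nearest first."""
    return sorted(stations, key=lambda name: haversine_km(*site, *stations[name]))[:k]

# Approximate city-center coordinates (latitude, longitude), for illustration only.
stations = {
    "Paris": (48.8566, 2.3522),
    "Lille": (50.6292, 3.0573),
    "Lyon": (45.7640, 4.8357),
    "Marseille": (43.2965, 5.3698),
}
substation = (48.5, 2.5)  # hypothetical substation location
nearest = closest_stations(substation, stations)
```

The same function applies unchanged to a denser grid of forecast points. In that case the parameter `k` and the selection rule become modeling choices to be validated for each local model.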

Special tariffs. In France, some EDF customers benefit from a reduced price on electricity most days of the year in exchange for a high tariff on some (cold) days, when the electricity demand is particularly high and expensive for the producers.

These are the so-called special-tariff contracts [RTE, 2014]. The high-price days are announced the day before and logically impact the demand [Thouvenot, 2015, Figure 2.3]. These days are made public on the Eco2Mix website [RTE, 2019a], but we have ignored them to simplify the modeling.

At the very least, these days can be discarded when evaluating the numerical performances of the model, as done by Pierrot and Goude [2011]. We have tried this and, since the relative order of the models remained the same, all the results in this manuscript include those days. Other demand-response mechanisms, specific to the electricity markets [Wikipedia, 2019], are not public but may also have a significant impact on the demand.

Holidays. In addition to the features related to the day of the year, we have introduced in the load forecasting model an indicator of the Christmas period to take into account the notable decrease of the economic activity during this period.

We have considered using a similar indicator for the summer break, because of the specificity of this period from mid-July to the end of August, but could not conclude that it helped the models.

Given the improvement with the introduction of the Christmas period indicator at the level of substations, we believe that taking into account other vacation periods (winter, Easter, autumn) can potentially also improve the forecasts. However, these vacations are not nationally synchronized and their introduction into the model requires a more refined treatment.
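A period indicator of this kind is a binary feature of the date. The sketch below shows one way to write it for a period that wraps around the new year. The bounds (24 December to 1 January) are hypothetical and are not the ones used in this work.

```python
from datetime import date

def christmas_indicator(day, start=(12, 24), end=(1, 1)):
    """Return 1 if `day` falls in a period wrapping around the new year, else 0.

    `start` and `end` are (month, day) pairs; the default bounds are hypothetical.
    """
    month_day = (day.month, day.day)
    return int(month_day >= start or month_day <= end)

indicators = [christmas_indicator(d) for d in (
    date(2016, 12, 23),
    date(2016, 12, 25),
    date(2017, 1, 1),
    date(2017, 1, 2),
)]
```

For vacations that are not nationally synchronized, the same feature would additionally depend on the zone of each substation, which requires a calendar per zone.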

Ignored Daylight Saving Time. All the collected data were measured and fed into the models hourly in UTC time. Thus, the shifts between Standard Time (ST) and Daylight Saving Time (DST) are ignored during the measuring process. They are also ignored in the modeling so far.

While it is difficult to measure how this change of regime impacts the forecasts when it is not taken into account, we noticed the shift of the morning and evening peaks in Figures 2.14 and 2.9. Besides, we illustrate the national load near the shifts between ST and DST in Figure 3.32. While we notice differences before and after the shift, it is not clear how this issue should be dealt with.
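To illustrate the mechanism, the sketch below converts UTC timestamps to French local time using the European rule: DST starts at 01:00 UTC on the last Sunday of March and ends at 01:00 UTC on the last Sunday of October. It is not a proposed solution; it only shows that the same UTC hour corresponds to two different local hours over the year. The function names are hypothetical and the timestamps are naive datetimes interpreted as UTC.

```python
from datetime import datetime, timedelta

def last_sunday(year, month):
    """Last Sunday of a 31-day month (used here for March and October only)."""
    d = datetime(year, month, 31)
    # weekday() is 0 on Mondays and 6 on Sundays.
    return d - timedelta(days=(d.weekday() + 1) % 7)

def is_dst_utc(ts_utc):
    """True if the UTC timestamp falls within the European DST period."""
    start = last_sunday(ts_utc.year, 3) + timedelta(hours=1)
    end = last_sunday(ts_utc.year, 10) + timedelta(hours=1)
    return start <= ts_utc < end

def local_hour_france(ts_utc):
    """Local hour in metropolitan France: UTC+1 in standard time, UTC+2 in DST."""
    offset = 2 if is_dst_utc(ts_utc) else 1
    return (ts_utc + timedelta(hours=offset)).hour

# The same UTC hour before and after the spring shift of 2016.
before = local_hour_france(datetime(2016, 3, 26, 7))
after = local_hour_france(datetime(2016, 3, 28, 7))
```

A feature such as the local hour, or a DST indicator, could then be added to the inputs. Whether this helps the forecasts is precisely the open question raised above.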

[Figure: plot not recovered; axis label: Load (GWh)]
