

In the document Benjamin Dubois pour obtenir le grade de (Page 80-84)


3.4 Application to the load forecasting problem

Target variables. In Section 2.5.2, we introduced 5 partitions of the $K$ substations that correspond to different levels of aggregation. To instantiate a load forecasting model, consider one element $Z_k \subset [\![1,K]\!]$ of these partitions, that is, a subset of the substations with cardinality $|Z_k|$. As in Equation (2.6), we denote by $(r_\kappa)_{\kappa \in Z_k} \in \mathbb{R}^{|Z_k|}$ the loads of the substations in the area $Z_k$ and we define the aggregated load of this area:

$$\ell_k := \sum_{\kappa \in Z_k} r_\kappa \in \mathbb{R}. \tag{3.27}$$
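The aggregation of Equation (3.27) can be sketched as follows. This is only an illustration: the function name `aggregated_load`, the substation indices and the load values are hypothetical and do not come from the thesis.

```python
def aggregated_load(substation_loads, zone):
    """Sum the loads r_kappa over the substations kappa in the zone Z_k."""
    return sum(substation_loads[kappa] for kappa in zone)


# Hypothetical instantaneous loads for K = 5 substations, indexed 1..K.
loads = {1: 12.0, 2: 7.5, 3: 20.0, 4: 3.5, 5: 9.0}

# One element Z_k of a partition of [[1, K]] (hypothetical choice).
zone = {2, 3, 5}

total = aggregated_load(loads, zone)
print(total)
```

Taking the zone to be the whole set of substations would give the national load, and a singleton would give a substation-level load.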

In this section, we introduce a parametrization to model this aggregated load, which we denote by $\ell$ for simplicity. Its estimator is denoted by $\hat{\ell}$.

There is no reason for the parametrization to be the same at different aggregation levels and, indeed, we found empirically that it should differ. Consequently, we restrict this section to the parametrization of the national models; the parametrizations of the substation-level models are given in Tables F.2 to F.4. These choices are then justified in Sections 3.5 and 3.6.

Inputs. The calendar inputs given to the middle-term model are the timestamp $t$, the hour of the week $h$ and the day of the year $d$. We also use a single binary indicator $\mathbb{1}_{hld}$ for the 11 French holidays and two others, $\mathbb{1}_{hld^-}$ and $\mathbb{1}_{hld^+}$, for the days respectively preceding and following a holiday. We denote by $\mathbb{1}_{xmas}$ an indicator of the two weeks around Christmas and New Year's Eve and by $\mathbb{1}_{sun}$ an indicator of daytime, which equals 1 between sunrise and sunset measured in Paris and 0 otherwise.
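As a minimal sketch of how such calendar inputs could be derived from a timestamp, consider the snippet below. The function names, the convention that the week starts on Monday at 00:00 and the one-entry holiday set are assumptions made for the illustration; the thesis does not specify them here.

```python
from datetime import date, datetime, timedelta


def calendar_inputs(ts):
    """Return (hour of the week, day of the year) for a datetime."""
    # Assumption: hour 0 of the week is Monday 00:00, so values lie in 0..167.
    hour_of_week = ts.weekday() * 24 + ts.hour
    # Day of the year in 1..366.
    day_of_year = ts.timetuple().tm_yday
    return hour_of_week, day_of_year


def holiday_indicators(day, holidays):
    """Return (holiday, day before a holiday, day after a holiday) as 0/1."""
    one_day = timedelta(days=1)
    is_holiday = int(day in holidays)
    is_before = int(day + one_day in holidays)
    is_after = int(day - one_day in holidays)
    return is_holiday, is_before, is_after


# Hypothetical holiday calendar with a single entry, for illustration only.
holidays = {date(2024, 5, 1)}

print(calendar_inputs(datetime(2024, 1, 3, 15)))
print(holiday_indicators(date(2024, 4, 30), holidays))
```

The Christmas and daytime indicators would be built the same way from a date range and from sunrise and sunset times, which are not reproduced in this sketch.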

In Section 2.5.2, the subset of substations $Z_k$ was associated with a subset of the weather stations that we denote by $[\![1,S]\!]$ with $S \in \mathbb{N}$. For $s \in [\![1,S]\!]$ and $\delta \in \{24, 48\}$, we write $T_s$ for the corresponding temperature, $T_s^{\delta}$ for the temperature with a delay of $\delta$ hours, $\overline{T}_s^{\delta}$ for the maximum temperature over the last $\delta$ hours, and $\underline{T}_s^{\delta}$ for the minimum. Finally, the instantaneous cloud cover is denoted by $c_s$.
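The delayed temperature and the extrema over the last $\delta$ hours can be sketched on an hourly series as follows. The function name `delayed_features`, the synthetic temperature profile and the choice to include the current hour in the window are assumptions; the exact window boundaries used in the thesis are not stated in this section.

```python
def delayed_features(temps, t, delta):
    """Return (delayed value, max, min over the last delta hours) at hour t."""
    if t < delta:
        raise ValueError("not enough history for this delay")
    # Assumption: the window covers the delta most recent hours, current hour included.
    window = temps[t - delta + 1 : t + 1]
    return temps[t - delta], max(window), min(window)


# Synthetic hourly temperatures over three days (hypothetical values).
temps = [10 + (hour % 24) * 0.5 for hour in range(72)]

for delta in (24, 48):
    print(delta, delayed_features(temps, t=60, delta=delta))
```

One such triple would be computed for each weather station $s$ and each delay $\delta \in \{24, 48\}$.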

For short-term forecasts, the models additionally have access to the past loads $\ell^{\delta}$ with a delay of $\delta$ hours, for $\delta \in \{24, 48\}$. The inputs to the short-term model are listed in Table 3.2.

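| Category | Type | Name | Symbol |
|---|---|---|---|
| Date and time | cyclic | hour of the week | $h$ |
| Date and time | cyclic | day of the year | $d$ |
| Date and time | indicators | holidays | $\mathbb{1}_{hld}$ |
| Date and time | indicators | days before a holiday | $\mathbb{1}_{hld^-}$ |
| Date and time | indicators | days after a holiday | $\mathbb{1}_{hld^+}$ |
| Date and time | indicators | Christmas period | $\mathbb{1}_{xmas}$ |
| Date and time | indicators | Sun is up | $\mathbb{1}_{sun}$ |
| Date and time | absolute time | timestamp | $t$ |
| Weather | acyclic | temperatures | $T_s$ |
| Weather | acyclic | $\delta$ hours-delayed temperatures | $T_s^{\delta}$ |
| Weather | acyclic | maximum over a $\delta$ hours window | $\overline{T}_s^{\delta}$ |
| Weather | acyclic | minimum over a $\delta$ hours window | $\underline{T}_s^{\delta}$ |
| Weather | acyclic | cloud covers | $c_s$ |
| Past loads | acyclic | $\delta$ hours-delayed load | $\ell^{\delta}$ |

TABLE 3.2: Inputs to the short-term load forecasting model

There are as many copies of the weather-related inputs as weather stations $s \in [\![1,S]\!]$ and one copy of the past information for each $\delta \in \{24, 48\}$.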

Univariate features. To build features as in Section 3.1, we choose a granularity for each input. Based on preliminary experiments, we consider the hour of the week instead of considering the hour of the day and the day of the week separately, as done by Goude et al. [2013] and Thouvenot [2015]. For this input, we choose 168 knots because the expected load conditioned on the hour of the week varies at high frequency (cf. Figure 2.10). This is equivalent to using an indicator for each hour of the week.

For the other continuous inputs, we expect smoother univariate effects given the empirical conditional expectations presented in Section 2.3. The granularity is set to 128 for the day of the year, to 16 for the temperature-related univariate features and to 4 for the past loads in the short-term models. This leads respectively to 128, 17 and 5 knots: the day of the year is cyclic, whereas the temperatures and the past loads are not, so they require one additional knot. Finally, we add a linear function of the timestamp $t$.
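The relation between granularity and number of knots stated above can be written as a short helper. The function `n_knots` is a hypothetical name introduced for this sketch; it only encodes the counting rule given in the text (a cyclic input needs as many knots as intervals, an acyclic input needs one more).

```python
def n_knots(granularity, cyclic):
    """Number of knots for an input split into `granularity` intervals."""
    # On a cyclic input the last knot coincides with the first one,
    # so there are as many knots as intervals; otherwise one more is needed.
    return granularity if cyclic else granularity + 1


settings = {
    "hour of the week": (168, True),
    "day of the year": (128, True),
    "temperature": (16, False),
    "past load": (4, False),
}

for name, (granularity, cyclic) in settings.items():
    print(name, n_knots(granularity, cyclic))
```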

The coefficients for the hour of the week $h$ and the timestamp $t$ are penalized with the Ridge regularization, that is, the squared norm $\|\cdot\|_F^2$ of Equation (3.28).

For the other coefficients, we use the smoothing spline regularization $\Omega_{S2}$, as in [Fan and Hyndman, 2011; Goude et al., 2013; Pierrot and Goude, 2011; Thouvenot, 2015], which penalizes abrupt changes in the consecutive differences between coefficients. It is defined for vectors by:

$$\Omega_{S2} : \theta \in \mathbb{R}^p \mapsto \dots$$

We also include the indicator of the Christmas period in the univariate effects and use the Ridge regularization to penalize the associated coefficient. These univariate features are gathered in Table 3.3 and justified empirically in Section 3.6.4. This setting leads to 366 degrees of freedom for the univariate part of the middle-term model (i.e. without the past loads) and 371 in the short-term model.
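To make the difference between the two regularizations concrete, the sketch below compares a Ridge penalty with a penalty on second-order differences, a common way to penalize changes in consecutive differences between coefficients. It is an assumption that this matches the form of the thesis's smoothing spline regularization; in particular, any normalizing factor and the treatment of cyclic inputs are not reproduced, and both function names are hypothetical.

```python
def ridge_penalty(theta):
    """Sum of squared coefficients."""
    return sum(value * value for value in theta)


def second_difference_penalty(theta):
    """Sum of squared second-order differences of consecutive coefficients."""
    total = 0.0
    for j in range(1, len(theta) - 1):
        # Change between the two consecutive differences around index j.
        second_difference = theta[j + 1] - 2 * theta[j] + theta[j - 1]
        total += second_difference * second_difference
    return total


linear = [0.0, 1.0, 2.0, 3.0]      # constant slope, no abrupt change
spike = [0.0, 0.0, 1.0, 0.0, 0.0]  # abrupt change around the middle

print(ridge_penalty(linear), second_difference_penalty(linear))
print(ridge_penalty(spike), second_difference_penalty(spike))
```

A linear sequence of coefficients is left unpenalized by the second-difference penalty but not by Ridge, whereas an isolated spike is penalized more heavily by the second-difference penalty.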

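| Name | Category | Symbol | Parametrization |
|---|---|---|---|
| hour of the week | cyclic | $h$ | 168 knots |
| day of the year | cyclic | $d$ | 128 knots |
| Christmas period | indicator | $\mathbb{1}_{xmas}$ | indicator |
| timestamp | timestamp | $t$ | linear function |
| temperatures | acyclic | $T_s$ | 17 knots |
| $\delta$ hours-delayed temperatures | acyclic | $T_s^{\delta}$ | 17 knots |
| last maxima over $\delta$ hours | acyclic | $\overline{T}_s^{\delta}$ | 17 knots |
| last minima over $\delta$ hours | acyclic | $\underline{T}_s^{\delta}$ | 17 knots |
| $\delta$ hours-delayed load | acyclic | $\ell^{\delta}$ | 5 knots |

TABLE 3.3: Univariate features for the national forecasts

Set U of univariate features with the corresponding parametrization for the short-term load forecasting model at the national level.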
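The degrees of freedom quoted for the univariate part can be recovered by summing the parametrizations of Table 3.3. The sketch below counts each row of the table once, which is the reading that reproduces the reported totals; how the copies per weather station and per delay enter the count is an assumption not spelled out in this section.

```python
# Number of coefficients per univariate feature, one entry per row of Table 3.3.
middle_term_features = {
    "hour of the week": 168,
    "day of the year": 128,
    "Christmas period indicator": 1,
    "timestamp (linear function)": 1,
    "temperatures": 17,
    "delayed temperatures": 17,
    "last maxima": 17,
    "last minima": 17,
}
past_load_knots = 5  # only used by the short-term model

middle_term_dof = sum(middle_term_features.values())
short_term_dof = middle_term_dof + past_load_knots

print(middle_term_dof)  # 366
print(short_term_dof)   # 371
```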

Bivariate features. There are multiple reasons to believe that interactions between the inputs also play a major role in determining the load:

• the 2-dimensional conditional expectations observed in Section 2.3.3,

• the good results of (highly non-linear) tree-based models,

• intuitively, a holiday will not have the same effect on working and non-working days of the week, and similarly, the past loads with a fixed delay depend on the day of the week.

While interactions of any order can be considered, we have not so far obtained results indicating that explicit interactions between more than two univariate features are useful for forecasting. In fact, interactions between two covariates already introduce an extra layer of complexity. Indeed, as presented in Table 3.4, the number of parameters associated with the unconstrained bivariate features is more than ten times the number of degrees of freedom of the univariate effects. Regularization is therefore essential, to an extent that depends on the quantity of data available to fit the model and on the regularity of the data, which itself depends strongly on the level of aggregation considered.

| Names | Symbols | Parametrization |
|---|---|---|
| cloud covers and day/night | $(c_s, \mathbb{1}_{sun})$ | 3 knots & indicator |
| week hour and holiday | $(h, \mathbb{1}_{hld})$ | 84 knots & indicator |
| week hour and day before a holiday | $(h, \mathbb{1}_{hld^-})$ | 84 knots & indicator |
| week hour and day after a holiday | $(h, \mathbb{1}_{hld^+})$ | 84 knots & indicator |
| week hour and $\delta$ hours-delayed load | $(h, \ell^{\delta})$ | 84 & 5 knots |
| week hour and day of the year | $(h, d)$ | 168 & 32 knots |
| temperatures and day of the year | $(T_s, d)$ | 5 & 32 knots |

TABLE 3.4: Bivariate features for the national forecasts

Set B of bivariate features with the corresponding parametrization for the short-term load forecasting model.
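The claim that the unconstrained bivariate features carry more than ten times as many parameters as the univariate part can be checked with a quick count from Table 3.4. The sketch assumes that an indicator contributes a single row or column to the coefficient matrix and counts each row of the table once; the variable names are hypothetical.

```python
# (size along the first input, size along the second input) per row of Table 3.4.
# Assumption: an indicator contributes a single column to the coefficient matrix.
bivariate_shapes = {
    "cloud covers x day/night": (3, 1),
    "week hour x holiday": (84, 1),
    "week hour x day before a holiday": (84, 1),
    "week hour x day after a holiday": (84, 1),
    "week hour x delayed load": (84, 5),
    "week hour x day of the year": (168, 32),
    "temperatures x day of the year": (5, 32),
}

n_bivariate = sum(rows * cols for rows, cols in bivariate_shapes.values())
n_univariate = 371  # short-term univariate degrees of freedom quoted in the text

print(n_bivariate)                 # 6211
print(n_bivariate / n_univariate)
```

Under these assumptions the interaction between the hour of the week and the day of the year alone accounts for 5376 of the 6211 parameters.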

It is not relevant to impose any of the structures defined in Section 3.3 on interactions between an indicator and another input since the coefficient matrix then has only one column or one row. Besides, we have found empirically that the structures on the interaction matrices described in Section 3.3 are not essential for the other inputs, so we first consider the unstructured case and discuss this question in Section 3.6.6.

Finally, the interactions including an indicator are regularized with the Ridge penalty $\|\cdot\|_F^2$ of Equation (3.28) while the others are penalized with the smoothing spline regularization $\Omega_{S2}$ defined in Equations (3.29) and (3.30).

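Proposed models. The version of Equation (3.15) for the middle-term model is:

$$M_{MT} \sim \beta_0 + \beta_1 \mathbb{1}_{xmas} + \beta_2 t + f_3(h) + f_4(d) + \dots \tag{3.31}$$

The short-term model additionally uses the past loads:

$$M_{ST} \sim M_{MT} + \sum_{\delta \in \{24,48\}} \left[ f_{15,\delta}(\ell^{\delta}) + g_{16,\delta}(h, \ell^{\delta}) \right]. \tag{3.32}$$

As explained in Section 2.7.1, we are interested in minimizing the NMSE, so we define the centered target variable:

$$y = \frac{\ell}{\bar{\ell}} - 1 \in \mathbb{R}, \tag{3.33}$$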

where $\bar{\ell}$ is the empirical expectation of the load $\ell$. The regularized minimization of the NMSE over a training set $[\![1,n]\!]$ is then exactly Equation (3.20), where the covariates $(x_i)_{i=1,\dots,n} \in \mathbb{R}^{n,p}$ are obtained by concatenating the features defined above.
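The centering of Equation (3.33) can be illustrated in a few lines. The function name `center_target` and the load values are hypothetical; the sketch only shows that dividing by the empirical mean and subtracting 1 yields a dimensionless target with zero empirical mean.

```python
def center_target(loads):
    """Map each load to load / mean(loads) - 1."""
    mean_load = sum(loads) / len(loads)
    return [load / mean_load - 1.0 for load in loads]


# Hypothetical training loads.
loads = [50.0, 60.0, 70.0]
y = center_target(loads)

print(y)
print(sum(y) / len(y))  # close to 0 up to floating-point rounding
```

Because the target is expressed relative to the mean load, errors on series of very different magnitudes become comparable, which is convenient when the same procedure is applied at several aggregation levels.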

Finally, when considering the simultaneous forecasting problems of several time series, the corresponding optimization problem is the sum of the individual objectives. It is strictly equivalent to an independent optimization of each problem since there is no coupling in this chapter. Writing this sum allows us to use the general form of Equation (3.25) for the simultaneous optimization of all the models at a given aggregation level. Indeed, each target time series has access to its own design matrix since the past loads are not shared and the associated weather stations might be different.
