Structure of the independent models

Multi-task setting

4.1 Structure of the independent models

We compared in Section 2.4.2 the distribution of the input data for different substations. In Section 3.5.5 and Section 3.5.6, we illustrated the distribution of the coefficients learned by independent single-task models at the local level. In this section, we additionally motivate the multi-task setting by presenting the similarities in the forecasts and in the residuals of these models. The graphs presented in this section follow the estimation of the models with the training years 2013 to 2015 and the test year 2016.

4.1.1 Similarities between models

Because we are interested in coupling the models of the different substations, we first analyze the similarities between the estimated coefficients.

Clustering A natural illustration of the potential similarities between the models is to cluster the learned coefficient vectors and look for a visually apparent structure. However, we do not have a clear interpretation of the clusters presented in Figure F.68. We study the possibility of clustering the coefficients in more detail in Section 4.4.

Rank of the coefficient matrix Alternatively, we propose in Figure 4.1 an illustration of the spectrum of the learned coefficient matrix:

\[ B := \begin{pmatrix} A \\ C \end{pmatrix} \in \mathbb{R}^{p \times K}, \qquad (4.1) \]

where A is the block corresponding to the features shared by all the substations and C is the block corresponding to the individual features in the model defined by Equation (3.25), with the parametrization for the substation level described in Section 3.4. Additionally, the spectra of the prediction matrices are illustrated in Figure F.69.

In both cases, a significant part of the spectrum is concentrated in the first components of the approximations. This leads us to question whether these first components could be shared by the substations. We develop this idea with a low-rank constraint on the coefficient matrix in Section 4.5.
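As a minimal sketch of the quantity plotted in Figure 4.1, the relative Frobenius norm of the residual left by the best rank-r approximation can be read directly off the singular values (Eckart-Young theorem). The function name `low_rank_residual_curve` is hypothetical, and two conventions are assumptions: each row is centered by subtracting its mean over the substations, and the curve is normalized so that it equals 1 at r = 0. The data below are synthetic, not the coefficients of the thesis.

```python
import numpy as np


def low_rank_residual_curve(M):
    """Relative residual norm after the best rank-r approximation, r = 0, ..., min(p, K)."""
    M = np.asarray(M, dtype=float)
    # Assumed convention: center each row over the substations (the columns).
    M_centered = M - M.mean(axis=1, keepdims=True)
    # Singular values, returned in decreasing order.
    s = np.linalg.svd(M_centered, compute_uv=False)
    energy = s ** 2
    # tail[r] = sum of the squared singular values of index >= r, which is
    # the squared Frobenius residual of the best rank-r approximation.
    tail = np.concatenate([np.cumsum(energy[::-1])[::-1], [0.0]])
    # Normalize so that the curve equals 1 at r = 0 (assumes M_centered != 0).
    return np.sqrt(tail / tail[0])


# Toy usage: a synthetic matrix of rank close to 3, with 40 "substations".
rng = np.random.default_rng(0)
B_toy = rng.normal(size=(6, 3)) @ rng.normal(size=(3, 40))
B_toy += 0.01 * rng.normal(size=(6, 40))
curve = low_rank_residual_curve(B_toy)
```

On such a matrix the curve drops sharply once r reaches the underlying rank, which is the kind of behavior that would suggest components shared across substations.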

4.1.2 Commonly structured errors

While Section 4.1.1 is dedicated to the similarities between the local models, we


FIGURE 4.1: Singular values of the coefficient matrices

Norm of the residuals after subtracting the best rank-r approximation of the coefficient matrices whose rows have been centered. Given M ∈ {B, A, C}, we denote by M̃ the same matrix after its rows have been centered and, for r ∈ N, the matrix M̃^(r) is the closest rank-r approximation of M̃ in terms of the Frobenius norm (cf. Figure 2.22 for supplementary details). With the parametrization of the local models described in Chapter 3, the matrix A has dimension (3379, 1751), the matrix C containing the coefficients related to the past loads and the weather conditions has dimension (510, 1751) and the concatenation B has dimension (3889, 1751). The three functions equal 1 for r = 0 but the curves begin at r = 1 because of the logarithmic scale of the x-axis.

Spatial correlation A natural way to represent the residual correlations between the different substations is to display them on maps. It indeed seems reasonable to assume that the residuals of nearby substations are more likely to be correlated.

We present these correlations for 9 different substations in Figure 4.2. Each map is associated with one substation k ∈ [[1, K]], indicated by the black circle, and the color of another area ℓ represents the empirical correlation between the residuals of substation k and those of the corresponding substation ℓ:

\[ \mathrm{corr}^{\mathrm{test}}(y_k - \hat{y}_k,\, y_\ell - \hat{y}_\ell) := \frac{\mathbb{E}^{\mathrm{test}}\left[(y_k - \hat{y}_k)(y_\ell - \hat{y}_\ell)\right]}{\sqrt{\mathrm{Var}^{\mathrm{test}}[y_k - \hat{y}_k]\,\mathrm{Var}^{\mathrm{test}}[y_\ell - \hat{y}_\ell]}}, \qquad (4.2) \]

where Var denotes the variance, y_k, y_ℓ ∈ R are the normalized loads defined in Equation (3.33) and ŷ_k, ŷ_ℓ ∈ R are their estimates.
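The empirical correlation of Equation (4.2) can be sketched as follows for all pairs of substations at once. The function name `residual_correlations` and the layout of the input (one row per test sample, one column per substation) are assumptions made for illustration. The numerator follows (4.2) as written, so the result coincides with the usual Pearson correlation only when the residuals have zero mean over the test set.

```python
import numpy as np


def residual_correlations(residuals):
    """Empirical version of Equation (4.2) for every pair of substations.

    residuals: array of shape (n_samples, K) holding y_k - y_hat_k on the
    test set, one column per substation (assumed layout).
    Returns a (K, K) matrix.
    """
    R = np.asarray(residuals, dtype=float)
    n_samples = R.shape[0]
    # Numerator of (4.2): empirical mean of the products of residuals.
    cross_moments = R.T @ R / n_samples
    # Denominator of (4.2): square root of the product of the variances.
    std = R.std(axis=0)
    return cross_moments / np.outer(std, std)


# Toy usage on synthetic, centered residuals for 4 substations.
rng = np.random.default_rng(0)
toy_residuals = rng.normal(size=(500, 4))
toy_residuals[:, 1] += toy_residuals[:, 0]  # make substations 0 and 1 correlated
toy_residuals -= toy_residuals.mean(axis=0)
toy_corr = residual_correlations(toy_residuals)
```

Row k of the returned matrix holds the values that would be displayed on the map associated with substation k.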

For a given substation, the closest substations in Figure 4.2 are not necessarily the only ones that present a strong positive correlation. However, a small geographical distance between two substations seems to be a strong indicator of potential correlations. Since we are interested in this section in sharing information between different models in a multi-task model, it is essential to ask which substations should be coupled.

[Color scale of Figure 4.2: correlation from −1.00 to 1.00]

FIGURE 4.2: Spatially correlated residuals

To avoid any selection bias, we illustrate the correlation of the residuals in the test year 2016 for the first 9 substations of the database in alphabetical order.

The residuals of the substation in the center are positively correlated with the residuals of geographically close substations. On the contrary, there are negative correlations for the substation in the center right. As for the substation in the top right-hand corner, it has little correlation with the others.

To emphasize the important correlations between the substations and their neighbors, let N^k_{ν′} denote the ν′-th geographically closest neighbor of a substation k. Figure 4.3 illustrates the average over the substations of the correlation between a substation k and its ν′-th closest neighbor, as well as the average correlation with the ν closest neighbors.

It confirms that positive correlations decrease on average with the geographical distance separating two substations.
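The two averages shown in Figure 4.3 can be sketched as follows, given a matrix of residual correlations and the coordinates of the substations. The function name `neighbor_correlation_profiles`, the use of planar coordinates and the Euclidean distance are assumptions made for illustration; the source only speaks of geographically closest neighbors.

```python
import numpy as np


def neighbor_correlation_profiles(corr, coords):
    """Average residual correlation as a function of the neighbor rank.

    corr:   (K, K) matrix of residual correlations.
    coords: (K, 2) planar coordinates of the substations (assumed input).
    Returns (per_rank, cumulative), both of length K - 1:
      per_rank[i]   = average over k of the correlation with the (i+1)-th
                      closest neighbor of k,
      cumulative[i] = average over k of the mean correlation with the
                      (i+1) closest neighbors of k.
    """
    corr = np.asarray(corr, dtype=float)
    coords = np.asarray(coords, dtype=float)
    K = corr.shape[0]
    # Pairwise Euclidean distances (assumed notion of geographical distance).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Force each substation to sort first in its own row, then drop it.
    np.fill_diagonal(dist, -1.0)
    order = np.argsort(dist, axis=1, kind="stable")[:, 1:]
    # sorted_corr[k, i] = correlation between k and its (i+1)-th neighbor.
    sorted_corr = np.take_along_axis(corr, order, axis=1)
    per_rank = sorted_corr.mean(axis=0)
    cumulative = np.cumsum(per_rank) / np.arange(1, K)
    return per_rank, cumulative


# Toy usage: 4 substations on a line, correlation decaying with index gap.
toy_coords = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [7.0, 0.0]])
idx = np.arange(4)
toy_neighbor_corr = 1.0 / (1.0 + np.abs(idx[:, None] - idx[None, :]))
per_rank_toy, cumulative_toy = neighbor_correlation_profiles(toy_neighbor_corr, toy_coords)
```

A decreasing `per_rank` profile is what one would expect if the correlations fade with distance.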

[Figure 4.3 panels. Top: correlation against the ν′-th closest neighbor (0 to 1750). Bottom: correlation against the ν closest neighbors (0 to 1750).]

FIGURE 4.3: Residual correlations with closest neighbors

(top) Average over the substations of the residual correlation in the training and the test sets with the ν′-th geographically closest neighbor.

(bottom) Average over the substations of the residual correlation in the training and the test sets with the ν geographically closest neighbors.

Temporal correlation In addition to the spatial correlations, we illustrate in Figure 4.4 the average quantiles of the residuals and their norms for different values of the hour of the week and of the day of the year, over the test year 2016. The night and the summer are better predicted than the daytime and the winter for all the quantiles. The fact that overestimation and underestimation of the loads occur simultaneously for the 5 quantiles indicates the presence of commonly structured errors: the hardest moments to predict seem common to a majority of the substations. We explore this idea in Section 4.6.

Quantiles and mean over the substations of the average residuals and average norm of the residuals conditioned on the different values of the hour of the week h and of the day of the year d over the test year 2016.

(top left) Average residuals conditioned on the hour of the week, E^test[y − ŷ | h].

(bottom left) Average norm of the residuals conditioned on the hour of the week, E^test[|y − ŷ| | h].

(top right) Average residuals conditioned on the day of the year, E^test[y − ŷ | d].

(bottom right) Average norm of the residuals conditioned on the day of the year, E^test[|y − ŷ| | d].

The hour of the week h = 0 corresponds to Monday at 00:01 and the day d = 0 is the 1st of January. The norms of the residuals are smaller during the 7 nights of the week and from d = 120, that is, the end of April, to d = 270, the end of September. The goal of these graphs is to show that the quantiles have similar variations, which means that the most difficult periods of the week and of the year are the same for all the substations.
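The conditional averages described in this caption, and their quantiles over the substations, can be sketched as follows. The function name `conditional_residual_quantiles`, the input layout and the quantile levels are assumptions: the source mentions 5 quantiles without giving their levels. The sketch also assumes that every period (hour of the week or day of the year) is observed at least once.

```python
import numpy as np


def conditional_residual_quantiles(residuals, period, n_periods, levels):
    """Quantiles over the substations of E[y - y_hat | period] and E[|y - y_hat| | period].

    residuals: (n_samples, K) array of y - y_hat on the test set.
    period:    (n_samples,) integer labels, e.g. the hour of the week in
               0..167 or the day of the year.
    levels:    quantile levels in [0, 1] (hypothetical choice).
    Returns two arrays of shape (len(levels), n_periods).
    """
    R = np.asarray(residuals, dtype=float)
    period = np.asarray(period, dtype=int)
    counts = np.bincount(period, minlength=n_periods)
    sums = np.zeros((n_periods, R.shape[1]))
    abs_sums = np.zeros_like(sums)
    # Unbuffered accumulation of the residuals for each period label.
    np.add.at(sums, period, R)
    np.add.at(abs_sums, period, np.abs(R))
    # Conditional means, one row per period and one column per substation.
    mean_res = sums / counts[:, None]
    mean_abs = abs_sums / counts[:, None]
    # Quantiles taken across the substations, for each period.
    return (np.quantile(mean_res, levels, axis=1),
            np.quantile(mean_abs, levels, axis=1))


# Toy usage: two weeks of hourly data for 3 substations whose residuals
# are constant and equal to -1, 0 and 1.
n_samples = 2 * 168
toy_hours = np.arange(n_samples) % 168
toy_res = np.tile([-1.0, 0.0, 1.0], (n_samples, 1))
q_res, q_abs = conditional_residual_quantiles(toy_res, toy_hours, 168, [0.0, 0.5, 1.0])
```

If the rows of each returned array rise and fall together across the periods, the difficult moments are shared by most substations, which is the point made by Figure 4.4.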
