

From the document: Benjamin Dubois pour obtenir le grade de (pages 57-60)


2.6 Related work - Load forecasting

The relationship between electricity demand, calendar information and weather conditions has been studied by both the statistics and the economics communities for the last few decades. Models based on socio-economic information and on the analysis of electricity end-uses have proved more relevant for long-term horizons of several months or years, while Machine Learning and, more generally, statistical modeling, sustained by the increase in computing power of modern machines and by the vast amount of collected data, have dominated the approaches to load forecasting problems with short-term horizons.

Broad surveys on the topic of load forecasting include [Hahn et al., 2009; Kyriakides and Polycarpou, 2007; Muñoz et al., 2010; Weron, 2007]. The models presented often extend to other quantities of current interest for forecasting and power systems, including energy prices [Nowotarski and Weron, 2018] and renewable energy production, which establishes a connection with short-term weather forecasting techniques [Cros and Pinson, 2018; Messner and Pinson, 2018; Nagbe et al., 2017; Petra et al., 2014].

Note that in operational conditions, most forecasting tools differ from the models encountered in the literature because the forecasts are often manually modified a posteriori by forecasters, in particular to take into account special, one-off events that have a noticeable impact on the electricity demand.

Statistical approach A wide variety of forecasting methods have been proposed to model the electricity load, because no model has proved to be significantly better than the others in all possible settings, even for the short-term forecasting of the aggregated load at regional or national levels. Pioneering works on electricity load forecasting applied classical statistical tools, notably autoregressive models. Their flexibility, and especially their ability to include seasonal components, trends and effects of exogenous variables [Huang and Shih, 2003; Nowicka-Zagrajek and Weron, 2002], have justified their use as classical benchmarks for load forecasting problems. Exponential smoothing techniques have also been considered [Taylor, 2010, 2011].

Multilayer prediction The complex relationships between the input variables and the electricity demand led researchers to consider more sophisticated models. In particular, the universal approximation property of neural networks motivated non-statistical modeling [Hippert et al., 2001; Khotanzad et al., 1997; Kiartzis et al., 1995; Park et al., 1991]. A significant improvement was later obtained in 2004 with Support Vector Machines [Chen et al., 2004].

The potential of tree-based models, able to model high non-linearities with weak learners at a low computational cost, was also assessed by Dudek [2015]. Additionally, the design and the aggregation of specialized load forecasting experts was studied by Devaine et al. [2013]; Gaillard and Goude [2015], with models estimated over different time windows by Pesaran and Pick [2011], and with a procedure of selection for high-dimensional data modeled with functional regression by Mougeot et al. [2015].

Generalized Additive Models From 2011, the successful application to the load forecasting problem and the interpretability of the Generalized Additive Models (GAM) based on the calendar variables, the weather and the past values of the series have motivated a deeper analysis and various extensions [footnote 2]. In particular, they led to an improvement of the predictions compared with the historical additive model used by EDF [Bruhns et al., 2005], which requires expert knowledge to be tuned and is considered to be insufficiently modular. That is why they are given particular attention in this manuscript and are discussed in more detail in Section 2.9.4. Notably, Pierrot and Goude [2011] specialized these models to the modeling of the French national demand, and Goude et al. [2013] pursued this approach to model the electrical load of about 2000 substations of the French distribution network.

Variable selection To adapt to high-dimensional inputs and outputs, the GAM were also combined with a 2-step variable selection procedure [Thouvenot, 2015; Thouvenot et al., 2015]: first, a selection of the relevant input variables with a group-Lasso regularization as in Equation (2.3) and a tuning of the regularization hyperparameters based on a model selection criterion [Akaike, 1974; Craven and Wahba, 1978; Shenoy et al., 2015]; then, a relaxed version of the objective, i.e. without the group-Lasso regularization, to correct the bias induced by the latter [Zhang et al., 2008]. In addition, Thouvenot et al. [2015] provided a statistical analysis of their estimator and proved its consistency for variable selection.
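The two steps described above can be illustrated on a toy linear problem. The sketch below is not the procedure of Thouvenot et al. [2015]: it assumes a plain least-squares loss instead of a GAM, a hand-picked penalty `lam` instead of a model selection criterion, and a basic proximal gradient solver. The function names `group_lasso` and `two_step_fit` and the synthetic data are hypothetical.

```python
import math
import random


def group_lasso(X, y, groups, lam, n_iter=300):
    """Proximal gradient on (1/2n)*||y - X b||^2 + lam * sum_g ||b_g||_2."""
    n, p = len(X), len(X[0])
    # Step 1/L, where L is bounded above by trace(X^T X) / n.
    step = n / sum(v * v for row in X for v in row)
    b = [0.0] * p
    for _ in range(n_iter):
        resid = [sum(xi * bi for xi, bi in zip(row, b)) - yi
                 for row, yi in zip(X, y)]
        grad = [sum(row[j] * r for row, r in zip(X, resid)) / n
                for j in range(p)]
        z = [bj - step * gj for bj, gj in zip(b, grad)]
        for g in groups:  # block soft-thresholding, one group at a time
            norm = math.sqrt(sum(z[j] ** 2 for j in g))
            scale = max(0.0, 1.0 - step * lam / norm) if norm > 0.0 else 0.0
            for j in g:
                b[j] = scale * z[j]
    return b


def two_step_fit(X, y, groups, lam):
    """Step 1: select groups with the penalty. Step 2: refit without it."""
    p = len(X[0])
    b_pen = group_lasso(X, y, groups, lam)
    selected = [k for k, g in enumerate(groups)
                if any(b_pen[j] != 0.0 for j in g)]
    cols = [j for k in selected for j in groups[k]]
    coef = [0.0] * p
    if not cols:
        return selected, coef
    X_sel = [[row[j] for j in cols] for row in X]
    # Relaxed objective: same solver with lam = 0 on the selected columns.
    b_sel = group_lasso(X_sel, y, [list(range(len(cols)))], lam=0.0)
    for j, v in zip(cols, b_sel):
        coef[j] = v
    return selected, coef


# Synthetic data (assumption): only the first group of inputs is relevant.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(200)]
y = [2.0 * row[0] - 1.0 * row[1] + 0.1 * rng.gauss(0.0, 1.0) for row in X]
groups = [[0, 1], [2, 3], [4, 5]]
selected, coef = two_step_fit(X, y, groups, lam=0.5)
```

The penalized fit shrinks the coefficients of the selected group toward zero; the unpenalized refit on the selected columns removes that shrinkage, which is the bias correction mentioned in the text.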

State-Space Models A major shortcoming of the aforementioned models is the difficulty of modeling the non-stationarity of electricity demand. It was addressed with periodic State-Space Models (SSM), able to adapt to changes of regime and to the long-term non-stationarity of the electricity consumption [Dordonnat et al., 2008]. A functional vector autoregressive SSM based only on endogenous data was proposed by Nagbe et al. [2018].

Modeling uncertainty More recently, quantile regression and density forecasting, that is to say the prediction of a whole conditional distribution, have received increasing attention from both the Machine Learning community [Dawid, 1984; Sangnier et al., 2016] and the users of load forecasting models [Hong and Fan, 2016; Shenoy et al., 2015] as well as of wind power forecasting models [Pinson, 2012; Sloughter et al., 2010].

With stochastic process modeling and the estimation of a confidence interval, Antoniadis et al. [2014] extended the work of Antoniadis et al. [2012], whose general principle consists in finding, in the history, observations similar to the present-day context in order to provide a forecast based on a linear combination of these similar observations, where the similarity is measured with the coefficients obtained by a Kernel Wavelet transform [Antoniadis et al., 2006]. A comparable approach based on curve linear regression to forecast a day of consumption given its recent past was developed by Cho et al. [2013, 2015].

Footnote 2: In the electricity load forecasting literature, these models are sometimes called semi-parametric models. This denomination seems less appropriate than GAM since what matters most is their nonlinearity, which leads to the adjective Generalized, and their Additive structure, but not their potentially infinite parametrization. Besides, it is very rare to have a load forecasting model that actually has an infinite number of parameters to estimate, mainly because of the Representer Theorem [Wahba, 1990, and references therein].

Alternatively, bootstrapping methods were considered by Fan and Hyndman [2011] and, more recently, with randomly generated temperature scenarios [Gaillard et al., 2016]. Meanwhile, Gaillard et al. [2016] approached the problem with the pinball loss developed for quantile regression [Koenker, 2005; Koenker and Bassett Jr, 1978]. Finally, an estimation of the time-varying covariance matrix in GAM was studied by Wijaya et al. [2015].
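For reference, the pinball loss at quantile level tau penalizes an under-prediction by tau times the error and an over-prediction by (1 - tau) times the error. The following is a minimal sketch of that standard definition; the function name `pinball_loss` and the numbers are illustrative assumptions, not taken from the cited works.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average pinball loss of predicted tau-quantiles y_pred against y_true."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction (diff >= 0) costs tau * diff,
        # over-prediction (diff < 0) costs (1 - tau) * |diff|.
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


# Illustrative values: for the 0.9-quantile, predicting too low is
# penalized nine times more than predicting too high by the same amount.
loss_too_low = pinball_loss([10.0], [8.0], tau=0.9)
loss_too_high = pinball_loss([10.0], [12.0], tau=0.9)
```

Minimizing this loss over a constant prediction yields the empirical tau-quantile of the observations, which is why it is used to fit and to score quantile forecasts.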

Particularly relevant and studied in meteorology, the problem of choosing an appropriate method to empirically assess and compare density forecasts was addressed by Gneiting et al. [2007], who proposed a study of the probability integral transform histogram, marginal calibration plots, the sharpness diagram and proper scoring rules.
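As an illustration of the first of these diagnostics, the probability integral transform (PIT) evaluates each predictive cumulative distribution function at the corresponding observation; for a calibrated forecaster, the resulting values are uniformly distributed on [0, 1], so their histogram should be flat. The sketch below assumes Gaussian predictive distributions for simplicity; the function names and the numbers are hypothetical.

```python
import math


def gaussian_cdf(x, mean, std):
    """CDF of a normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def pit_values(observations, means, stds):
    """PIT value of each observation under its Gaussian predictive law."""
    return [gaussian_cdf(y, m, s)
            for y, m, s in zip(observations, means, stds)]


def pit_histogram(pit, n_bins=10):
    """Counts of PIT values in n_bins equal-width bins of [0, 1]."""
    counts = [0] * n_bins
    for u in pit:
        counts[min(int(u * n_bins), n_bins - 1)] += 1
    return counts


# Illustrative numbers only (assumption): four observations with the
# mean and standard deviation of their predictive distributions.
observed = [52.0, 47.5, 61.0, 55.0]
pred_mean = [50.0, 50.0, 60.0, 55.0]
pred_std = [2.0, 2.0, 2.0, 2.0]
pit = pit_values(observed, pred_mean, pred_std)
hist = pit_histogram(pit, n_bins=5)
```

With many observations, a U-shaped histogram suggests predictive distributions that are too narrow, and a hump-shaped one suggests distributions that are too wide.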

Multiple output forecasting A large majority of the load forecasting models presented so far focus on the load aggregated at regional or national levels. Still, forecasts at non-aggregated levels were considered for buildings or residential neighborhoods [Kolter and Ferreira, 2011; Wijaya, 2015], homogeneous groups of consumers [Cugliari et al., 2016; Wijaya et al., 2014], and geographical areas [Hong et al., 2014]. Additionally, Thouvenot [2015] studied the local load forecasting problem for 61 of the 1751 substations that we consider in this manuscript, in a region near Lyon and with particular attention paid to the selection of relevant input variables.

Depending on the residential, commercial or industrial nature of the electricity uses contained in these disaggregated time series, the curves may have strong similarities and share an underlying structure. Leveraging such a structure in the modeling to obtain a better generalization performance is the question of interest in Multi-task Learning. The tools developed in this branch of Mathematics have been applied in the last decade to the forecasting of electricity production from renewable sources [Sanandaji et al., 2015; Wytock and Kolter, 2013] and to local load forecasting problems.

Relying on the hierarchical organization of the time series, Auder et al. [2018] studied the individual (household level) and aggregated (national level) load curves to propose a clustering tool with the same wavelet-based notion of similarity as in [Antoniadis et al., 2012]. Instead, Hyndman et al. [2011] introduced a model where the time series of different levels are forecast independently and then optimally combined with a linear regression model consistently with the hierarchical organization of the network.
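To give an idea of what such a combination looks like, consider the simplest hierarchy: one total and m bottom-level series. Stacking a row of ones over the identity gives the summing matrix S, and an ordinary least-squares regression of the independent forecasts on S yields the reconciled bottom-level forecasts (S^T S)^{-1} S^T y_hat. In this two-level case the solution has a closed form: the gap between the total forecast and the sum of the bottom forecasts is split equally among the m + 1 series. The sketch below relies on that closed form; it assumes equal (OLS) weights and a two-level hierarchy, and the function name `reconcile_two_level` is hypothetical.

```python
def reconcile_two_level(total_forecast, bottom_forecasts):
    """OLS reconciliation for a hierarchy made of one total and m bottom series.

    Returns (reconciled_total, reconciled_bottom), where the reconciled
    total equals the sum of the reconciled bottom-level forecasts.
    """
    m = len(bottom_forecasts)
    # Incoherence between the independent forecasts of the two levels.
    gap = total_forecast - sum(bottom_forecasts)
    # Closed form of (S^T S)^{-1} S^T y_hat with S = [ones; identity]:
    # each bottom forecast absorbs a share 1 / (m + 1) of the gap.
    reconciled_bottom = [f + gap / (m + 1) for f in bottom_forecasts]
    return sum(reconciled_bottom), reconciled_bottom


# Illustrative numbers (assumption): the total is forecast at 100 while
# the two substations are forecast at 40 and 50, leaving a gap of 10.
total, bottom = reconcile_two_level(100.0, [40.0, 50.0])
```

Here each bottom forecast is raised by 10/3 and the total is lowered by 10/3, so that the three reconciled values are coherent with the hierarchy.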

Alternatively, Kim and Giannakis [2013] considered a low-rank formulation of multi-task load forecasting problems in an attempt to leverage and potentially reveal the underlying structure of the load curves. Promoting the interpretability of the non-negative matrix factorization formulation [Lee and Seung, 1999, 2001], Mei et al. [2017] studied the problem of time series recovery in the context of incomplete measurements and extended the model with side-information to time series prediction [Mei et al., 2018].

The rising interest in local load curves has additionally motivated the development of methods for the detection of anomalies [Jian et al., 2018, and references therein], which are much more present at disaggregated levels.

Future stakes Major challenges have recently emerged in the electricity sector, such as the adaptation to modern energy markets, the integration of renewable energies and the penetration of electric vehicles, and they are progressively being reflected in the research literature. The installation of smart meters and the conditions necessary to the realization of their potential are also progressively drawing attention, both to take into account Demand-Response mechanisms and to leverage the considerable datasets collected, which certainly leads to Big Data considerations. For instance, Mei et al. [2016] studied the relationship between socio-demographic characteristics and local electricity uses in order to extrapolate the demand in regions with socio-demographic information but few measurements of the electricity demand with smart meters.
