A novel method for decomposing electricity feeder load into elementary profiles from customer information


HAL Id: hal-01558385

https://hal-mines-paristech.archives-ouvertes.fr/hal-01558385

Submitted on 7 Jul 2017


Alexis Gerossier, Thibaut Barbier, Robin Girard

To cite this version:

Alexis Gerossier, Thibaut Barbier, Robin Girard. A novel method for decomposing electricity feeder load into elementary profiles from customer information. Applied Energy, Elsevier, 2017, 203, pp. 752-760. ⟨10.1016/j.apenergy.2017.06.096⟩. ⟨hal-01558385⟩


Alexis Gerossier*, Thibaut Barbier*, Robin Girard

{alexis.gerossier},{thibaut.barbier}@mines-paristech.fr

* Both authors contributed equally to this paper

MINES ParisTech, PSL Research University, Center for processes, renewable energy and systems (PERSEE), 1 rue Claude Daunesse, 06904 Sophia Antipolis, France

June 2017

Abstract

Planning a distribution grid involves making a long-term forecast of sub-hourly demand, which requires modeling the demand and its dynamics with aggregated measurement data. Distribution system operators (DSOs) have been recording the sub-hourly electricity demand delivered by their medium-voltage feeders (around 1,000 to 10,000 customers) for several years. Demand profiles differ widely among the various considered feeders. This is partly due to the varying mix of customer categories from one feeder to another. To overcome this issue, elementary demand profiles are often associated with customer categories and then combined according to a mix description. This paper presents a novel method to estimate elementary profiles that only requires several feeder demand curves and a description of customers. The method relies on a statistical blind source model and a new estimation procedure based on the augmented Lagrangian method. The use of feeders to estimate elementary profiles means that measurements are fully representative and continuously updated. We illustrate the proposed method through a case study comprising around 1,000 feeder demand curves operated by the main French DSO Enedis. We propose an application that uses the obtained profiles to evaluate the contribution of any set of new customers to a feeder peak load. We show that profiles enable a simulation of new unmeasured areas with errors of around 20%. We also show how our method can be used to evaluate the relevancy of different customer categorizations.

Nomenclature

$a^f$: Consumption trend of feeder $f$ relative to temperature
$b^f$: Temperature threshold of feeder $f$
$B$: Matrix of demand profiles of customer categories
$\beta$: Column vector associated with $B$
$c^f_k$: Annual consumption of category $k$ for feeder $f$
$d^f$: Demand of feeder $f$
$d_k$: Elementary profile of customer category $k$
$\varepsilon^f$: Residual term for modeling feeder demand $f$
$F$: Number of feeders
$f$: Feeder
$K$: Number of customer categories
$k$: Customer category
$m_k$: Average demand share of a given category $k$
$p^f_k$: Share of electricity used by category $k$ for feeder $f$
$\sigma^2_k$: Empirical variance of $p^1_k, \dots, p^F_k$
$T$: Number of instants
$t$: Instant
$T^f$: Outside temperature of feeder $f$
$u$: Vector $(1, \dots, 1)^\top$ of length $K$
$v$: Vector $(T^{-1}, \dots, T^{-1})^\top$ of length $T$
$V_{inter}$: Inter-group variability
$V_{tot}$: Total variance
$X$: Matrix of feeder demands
$x$: Column vector associated with $X$
$y$: Year
$\otimes$: Kronecker product

1 Introduction

1.1 Motivation

Electricity represented 18% of total final energy consumption in 2013 [3] and is expected to constitute a quarter of final energy consumption by 2040 [1]. 42% of global CO2 emissions in 2012, i.e. 13.8 gigatons of CO2, are due to electricity and heat production [2]. To reduce CO2 emissions due to electricity, many states are developing energy transition strategies. This kind of transition involves significant changes to electricity flows in the distribution network (with e.g. decentralized production, improved efficiency of buildings and appliances, new uses and demand response enabling energy consumption management [18]).

These changes impact the planning process of distribution system operators (DSOs). The current network planning process considers the two most extreme situations [16], i.e. maximum demand with minimum supply, and maximum supply with minimum demand. While planning with such a method does not require a deep modeling of the different dynamics and their correlations, it does not take into account the aggregation effect between supply and demand [15]. The above-mentioned changes make it necessary to model all of the aggregated demand dynamics.

1.2 Literature review

In this section we present two kinds of existing approaches for modeling aggregated demand. The first is bottom-up, and uses individual customer profiles, which are summed to obtain aggregated demand. The second is a global approach in which the aggregated load curve is directly modeled using aggregated measurement data.

1.2.1 Bottom-up approaches

Measuring the electricity demand of individual electricity customers is a simple way to establish their load profiles and dynamics, and therefore a necessary step in bottom-up modeling. The current smart-meter roll-out in Europe will provide precise measurements of individual demand profiles. Around 80% of customers are scheduled to receive a smart-meter by 2020 [28]. However, this massive deployment is hindered by cost and privacy issues [21]. In 2014, only 23% of smart-meters in the European Union were installed in localized areas for private customers [13]. In some countries, this share is still insufficient to be representative, and the corresponding deployment is too recent to adequately cover long periods. To deal with the lack of individual measurements and characterize the behavior of electricity customers, researchers have attempted to classify them into different categories.

The classification of electricity demand profiles is a flourishing research topic (see reviews [19], [25]). Researchers use individual measurements from smart-meters as input and apply different clustering methods [31]. This reduces the dimension, which makes it easier to manipulate data [22]. With the resulting classification, each customer is associated with a cluster and its corresponding load profile [26]. The classification and the obtained load profiles can be used for a number of applications.

First, a fine classification can be made in order to help decision-makers design personalized policies for specific customers [7].

Secondly, the classification allows a DSO to plan its network and anticipate its investments [23, 27]. For example, the French DSO uses a model named "Bagheera" combining about 50 customer categories to plan its low-voltage network [16]. Classification is combined with the evolution of category distributions to forecast aggregated demand in prospective scenarios [5].

Last, classification and load profiles allow us to understand the contribution made by each category to aggregated demand [27].

Large measurement campaigns are necessary with these methods since a representative set of customers is required. This constraint makes continuous updating of the profiles difficult, which is an issue since it remains necessary to adapt the profiles to changing consumption habits [4, 26].

1.2.2 Global approaches

In global approaches, models forecast aggregated electricity demand with past measurements and explanatory variables, such as expected temperature or sometimes economic progress [30].

In order to obtain past measurements, most DSOs have been recording the electric power delivered by their medium-voltage feeders (around 1,000 to 10,000 customers) for several years. These measurements are aggregated, but exhaustive, since all electricity customers' contributions are taken into account. In global models, this aggregated electricity demand data is considered as a nonlinear, non-stationary series, and is often described as a superposition of several distinct frequencies [29] with daily to monthly periods [8]. Additionally, the demand series can be divided into different parts (e.g. working time, holidays) [9, 17].

The global approach produces accurate forecasts. However, these are based on aggregated past measurements, which are not available when planning a new unmeasured zone. This type of planning is improved with specific information about customers, which DSOs possess thanks to the Customer Information System (CIS) [23]. The CIS stores information on all customers regarding their electric connection to the grid, annual energy consumption, type of contract, and contracted power.

In all of the reviewed global methods [29] for modeling demand dynamics, the explanatory variables used, such as expected temperature or sometimes economic changes [30], do not characterize the feeder-specific local features. In particular, none of them employs CIS general statistics.

Finally, the drawback of these methods when used for planning purposes is that they cannot adapt to a change in the mix of customer categories. For example, in the case of the development of a commercial area in a residential feeder, such methods fail to take into account the corresponding information. If the profile differences between the two sectors are not accounted for, this might result in an overestimation of the future peak and hence an over-sizing of the network.

1.3 Contributions

Our paper presents a novel method to estimate elementary profiles. The proposed method relies on a statistical model that takes into account the mix of customer categories. To do this, we assume that the demands aggregate different shares of elementary profiles associated with different customer categories. These profiles are optimally found by minimizing prediction errors in a new algorithm relying on the augmented Lagrangian method.

Unlike bottom-up methods, our method only requires several feeder demand curves and a description of customers. The advantages of aggregated measurements compared to a set of individual load curves are: the availability of long-term historical data, full representativeness, and continuous updates. We show that the method performs similarly to or better than a bottom-up method in the literature when predicting new local areas.

We illustrate the proposed method through a case study comprising around 1,000 feeder demand curves operated by the main French DSO Enedis. The profiles obtained are essential to size the distribution network. This is illustrated by an application that evaluates the contribution of any set of new customers to a feeder peak load. We show that profiles enable a simulation of new unmeasured areas with errors of around 20%. We also show how our method can be used to evaluate the relevancy of different customer categorizations.

1.4 Description of the paper

In section 2, the methodology is described. A case study is presented in section 3 with the resulting profiles by category. Section 4 describes two applications that use the obtained profiles. One is employed to estimate the contribution of a set of new customers to a feeder peak load. The other evaluates forecasting errors for unmeasured areas, by testing different categories and comparing performances with a similar framework case study in the literature. Finally, some conclusions are presented and discussed in section 5.

2 Methodology

2.1 The problem of recovering load profiles and the forecasting method

Our paper assumes that the sub-hourly demands $d^f(t)$ of a feeder $f$ aggregate different profiles $d_1(t), \dots, d_K(t)$ associated with $K$ categories of customers with weights $p^f_1, \dots, p^f_K$,

$$d^f(t) = \sum_{k=1}^{K} p^f_k \, d_k(t) + \varepsilon^f(t). \tag{1}$$

We take the elementary profiles $d_k(t)$ to be common to all feeders, while the weights vary from one feeder to another. The corresponding residual term $\varepsilon^f(t)$ is meant to be small. The time $t$ can vary along any set. The aim is to recover the unknown elementary electricity profiles $d_k(t)$. For each feeder $f \in \{1, \dots, F\}$, $d^f(t)$ is observed and, thanks to the CIS, for each category $k \in \{1, \dots, K\}$, we also have access to the weight $p^f_k$. The process of obtaining proportions from the CIS and defining categories is the categorization step, and is described in subsection 2.3. Once the $K$ profiles have been obtained on a set of feeders, it is possible to turn Equation (1) into a simulation algorithm. The process is described in Figure 1. In the signal processing community, the corresponding problem is called blind signal separation and is well-known (see e.g. [11]).
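As an illustration of how Equation (1) becomes a simulation algorithm, the following minimal Python sketch mixes already-estimated profiles with the weights of a new feeder, with the residual term set to zero. The function name `simulate_feeder` and the numerical values are hypothetical and are not taken from the paper.

```python
def simulate_feeder(weights, profiles):
    """Mix K elementary profiles into one demand curve (Equation (1), residual set to 0)."""
    if len(weights) != len(profiles):
        raise ValueError("one weight per category profile is required")
    n_instants = len(profiles[0])
    return [
        sum(w * profile[t] for w, profile in zip(weights, profiles))
        for t in range(n_instants)
    ]


# Hypothetical example: K = 2 categories, T = 4 instants, each profile averaging 1.
profiles = [
    [0.6, 0.8, 1.4, 1.2],  # illustrative "residential" profile
    [0.4, 1.6, 1.6, 0.4],  # illustrative "tertiary" profile
]
new_feeder = simulate_feeder([0.75, 0.25], profiles)
```

Because each profile averages 1 and the weights sum to 1, the simulated curve also averages 1; it can then be rescaled by the expected energy of the new feeder.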

2.2 Optimization problem

The aim is to find the elementary profiles $d_k(t)$ from the aggregated demands $d^f(t)$ according to Equation (1). We write and solve the following optimization problem.

To write this optimization problem mathematically, we define a matrix $A$ of size $(F, K)$ whose elements are the proportions $p^f_k$ for $k \in \{1, \dots, K\}$ and $f \in \{1, \dots, F\}$. Aggregated demands $d^f(t)$ for all feeders and instants $\{1, \dots, T\}$ are gathered in a matrix $X$ of size $(F, T)$. We are trying to compute the demand profile $d_k(t)$ for all categories and instants: these unknown values can be put in a matrix $B$ of size $(K, T)$. It is useful to define $\beta$ (resp. $x$), the column vector obtained by stacking the rows of $B$ (resp. $X$) on top of each other. Two constraints limit the values of matrix $B$:

1. Each component of $\beta$ is an electricity demand. Since electricity producers are not considered in this paper, components should be non-negative.

2. For each class $k$, components should have a unit average, i.e. $\sum_t d_k(t) = T$, so that profiles are comparable. To write this constraint in mathematical terms, we define the column vector of length $K$, $u = (1, \dots, 1)^\top$, and the column vector of length $T$, $v = (T^{-1}, \dots, T^{-1})^\top$, so that the unit-average constraint reads, with a Kronecker product $\otimes$, $(I_K \otimes v^\top)\beta = u$.

The optimization problem is then written

$$\min_{\beta} \; \| x - (A \otimes I_T)\beta \|^2 \quad \text{s.t.} \quad \beta \ge 0, \quad (I_K \otimes v^\top)\beta = u. \tag{2}$$
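To make the vectorized notation of problem (2) concrete, the short Python sketch below checks on a toy example that $(A \otimes I_T)\beta$ is the row-stacked product $AB$, and that $(I_K \otimes v^\top)\beta$ returns the mean of each profile. The helper names (`kron`, `identity`, `matvec`, `matmul`) and the numbers are assumptions made for illustration only.

```python
def kron(P, Q):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [p * q for p in p_row for q in q_row]
        for p_row in P
        for q_row in Q
    ]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matvec(M, vec):
    return [sum(m * x for m, x in zip(row, vec)) for row in M]


def matmul(P, Q):
    return [
        [sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
        for i in range(len(P))
    ]


# Hypothetical sizes: F = 3 feeders, K = 2 categories, T = 4 instants.
K, T = 2, 4
A = [[0.75, 0.25], [0.2, 0.8], [0.5, 0.5]]
B = [[0.6, 0.8, 1.4, 1.2], [0.4, 1.6, 1.6, 0.4]]

beta = [value for row in B for value in row]                # rows of B stacked
stacked = matvec(kron(A, identity(T)), beta)                # (A ⊗ I_T) β
direct = [value for row in matmul(A, B) for value in row]   # rows of A B stacked

v_row = [[1.0 / T] * T]                                     # v^T as a 1 x T matrix
profile_means = matvec(kron(identity(K), v_row), beta)      # (I_K ⊗ v^T) β
```

Here `stacked` and `direct` coincide, and `profile_means` equals $u = (1, 1)^\top$ because both toy profiles have a unit average.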

An alternating direction method of multipliers [10] is used to iteratively solve problem (2):

1. minimize the function with the equality constraint by employing the augmented Lagrangian method,

2. retain only positive components to satisfy the positivity constraint,

3. adjust a penalty variable balancing positivity and the minimization.

The algorithm is implemented in the R language [24]. Special care is taken with the first step, since the minimization requires inverting a large matrix of size $K(T + 1)$. With common Kronecker product rules, the matrix to be inverted is reduced to size $K$, which divides the number of flops by approximately $T^3$.
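The following Python sketch solves a toy instance of problem (2). It is not the authors' R implementation and it does not use the alternating direction method of multipliers: for brevity it swaps in plain projected gradient descent, where each profile is projected onto the set of non-negative vectors summing to $T$ (a standard sort-based simplex projection). All names and values are hypothetical.

```python
def project_row(row, total):
    """Euclidean projection of a vector onto {b >= 0, sum(b) = total}."""
    ordered = sorted(row, reverse=True)
    cumulative, shift = 0.0, 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        candidate = (cumulative - total) / count
        if value - candidate > 0:
            shift = candidate  # keep the shift of the largest valid prefix
    return [max(b - shift, 0.0) for b in row]


def estimate_profiles(A, X, iterations=500):
    """Projected gradient descent on ||X - A B||^2 under the two constraints."""
    F, K, T = len(A), len(A[0]), len(X[0])
    # Safe step: the gradient is Lipschitz with constant at most 2 * ||A||_F^2.
    step = 1.0 / (2.0 * sum(a * a for row in A for a in row))
    B = [[1.0] * T for _ in range(K)]  # flat profiles satisfy both constraints
    for _ in range(iterations):
        residual = [[sum(A[f][k] * B[k][t] for k in range(K)) - X[f][t]
                     for t in range(T)] for f in range(F)]
        gradient = [[2.0 * sum(A[f][k] * residual[f][t] for f in range(F))
                     for t in range(T)] for k in range(K)]
        B = [project_row([B[k][t] - step * gradient[k][t] for t in range(T)], float(T))
             for k in range(K)]
    return B


# Hypothetical noise-free example: 4 feeders, 2 categories, 4 instants.
A = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8], [0.7, 0.3]]
B_true = [[0.6, 0.8, 1.4, 1.2], [0.4, 1.6, 1.6, 0.4]]
X = [[sum(a * B_true[k][t] for k, a in enumerate(row)) for t in range(4)] for row in A]
B_hat = estimate_profiles(A, X)
```

On this noise-free example with more feeders than categories, `B_hat` recovers `B_true`. A dense loop like this one is only meant for small examples; the Kronecker-based reduction described above is what makes realistic sizes tractable.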

[Figure 1: flow diagram. A dataset of feeder demands and category proportions (Feeder 1 to Feeder F) feeds the load profiles recovery, which yields one daily profile (0 to 24 h) per category (Category 1 to Category K). The simulation algorithm then combines these profiles with the proportions of a new feeder to produce its demand.]

Figure 1: Diagram detailing the method. A dataset of $F$ feeder measurements is used to find the $K$ category profiles. Once the load profiles have been recovered, a new feeder whose category distribution is known can be run through the simulation algorithm in order to obtain its expected demand.

2.3 Categorization of electricity customers

The aggregated demand profile $d^f(t)$ of a feeder $f$ aggregates a large group of customers (a few thousand). The CIS provides general features on these customers, i.e. annual consumption, type of contract, and contracted power, which can be used to cluster them into $K$ different categories. Once the features are selected, the total annual consumption $c^f_k$ of a category $k \in \{1, \dots, K\}$ in a feeder $f \in \{1, \dots, F\}$ is computed from each annual individual consumption. The corresponding weight $p^f_k$ is a normalized version of this consumption

$$p^f_k = \frac{c^f_k}{\sum_{k'=1}^{K} c^f_{k'}} \in [0, 1]. \tag{3}$$

It is important that the size of the dataset $F$ be larger than the number of categories $K$. Empirically, it was observed that the condition $F > 5K$ is preferable in order to obtain a wide range in the set of category distributions, and thus a more precise result. Features should be general enough to keep a reasonably low $K$ for three reasons: (i) to obtain a robust profile, (ii) to avoid an excessively long computing time, and (iii) to ensure that user privacy is not violated.
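The normalization of Equation (3) can be sketched in a few lines of Python. The function name `category_weights` and the consumption figures are hypothetical; in practice the annual consumptions would come from the CIS.

```python
def category_weights(annual_consumption):
    """Turn the annual consumptions c_k of one feeder into weights p_k (Equation (3))."""
    total = sum(annual_consumption)
    if total <= 0:
        raise ValueError("total annual consumption must be positive")
    return [c / total for c in annual_consumption]


# Hypothetical feeder with three categories (annual consumption in MWh).
weights = category_weights([1200.0, 300.0, 500.0])
```

For this feeder the weights are 0.6, 0.15 and 0.25; they are non-negative and sum to 1 by construction.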

Figure 2 sets out four different categorizations, based on information from the CIS. The first categorization divides the total energy into two groups: residential and tertiary. The second splits the tertiary into 7 categories to make a total of 8 categories, i.e. residential, agriculture, commercial, public equipment, office and hospital, industry, restaurant and hotel, and medium-voltage (MV) customers (e.g. large buildings that have a specific contract with the operator). A 9-group division results from splitting the residential share into two groups: base tariff and special tariff¹. Finally, an even more precise categorization, i.e. 12 groups, is proposed. Commercial buildings are split into 2 categories reflecting low and high annual consumption. Similarly, MV customers are divided into 3 groups: low, medium and high.

¹ The special tariff charges less during fixed off-peak periods (i.e. during the night) but more during peak hours.

In Figure 2, the height of a category $k$ represents its average demand share $m_k = \frac{1}{F} \sum_{f=1}^{F} p^f_k$.

The share in category distribution is different for every feeder. For instance, there are more restaurants in a city center than in a rural area and so the two electricity shares are different. This share has to vary between feeders to efficiently compute the demand profiles. We computed the coefficients of variation

$$\frac{\sigma_k}{m_k} \tag{4}$$

where $\sigma_k^2$ is the empirical variance of $p^1_k, \dots, p^F_k$. The coefficients are always higher than 40%, and thus the different categorizations are sufficiently spread from one feeder to another for our algorithm.

3 Case study

3.1 Data description

In this case study, we use electricity feeder demand measured every ten minutes in 3 geographical regions in France. Data come from the main French DSO, Enedis. Each of the three regions encompasses a large French city and the surrounding countryside. The three cities are Blois, Lyon and Rennes. Each region is divided into around 500 feeders, and each of these feeders provides electricity for about 1,000 customers. For each feeder, we know the demand measured for 4 years from 2010 to 2013. We discard some feeders because the measurements are too sparse and their overall quality is not sufficient. This can result from database errors, from network reconfiguration or from physical damage to the grid [17]. Ultimately, between 200 and 400 feeders are selected for each region.

3.2 Temperature effect and normalization

Aggregated demand measurements cannot be directly compared since some feeders are connected to more customers than others, causing a large discrepancy in average consumption. In order to be used as inputs in the method, measurements therefore need to be pre-processed. The two steps of this pre-processing are: removal of the temperature effect, and normalization by weekly consumption.

[Figure 2: stacked bars of the mean share of demand per category, for 2, 8, 9 and 12 categories. Categories shown: residential (base tariff, special tariff), tertiary, agriculture, commercial (low, high), public equipment, office & hospital, industry, restaurant & hotel, MV customer (low, medium, high).]

Figure 2: Example of different categorizations (in 2, 8, 9 or 12 groups) for the region near Lyon. There are $F = 320$ feeders in this dataset. The height of a division shows the mean share of the category in all feeders in the region.

Electricity demand is mostly influenced by outdoor air temperature, as residents turn on electric devices to adjust their indoor temperature (heating and air conditioning). In France, the air conditioning effect is low and not considered in this paper, but the heating effect is high during cold weather. French electric demand represents 40% of the European thermal sensitivity [14]. Indeed, since most French heating devices are electric, demand strongly increases when temperature decreases. However, this effect is well understood and can be removed and treated separately with a method used by the French TSO [20, pp. 11-12]: one linear regression for each hour of the week. Therefore, for each feeder $f$, we can determine a temperature threshold $b^f$ and a trend $a^f > 0$ such that for each degree colder than threshold $b^f$, demand increases by $a^f$. A new demand series $d^f_1(t)$ is defined from the initial series $d^f_0(t)$:

$$d^f_1(t) = \begin{cases} d^f_0(t) & \text{if } T^f(t) > b^f, \\ d^f_0(t) - a^f \left( b^f - T^f(t) \right) & \text{otherwise,} \end{cases} \tag{5}$$

where $T^f(t)$ is the outside temperature of feeder $f$ at instant $t$. In fact, trends $a^f$ and thresholds $b^f$ are calculated for each hour of the day but the hour index is omitted for clearer notation. The new series is thus supposed to be independent of the temperature, and demand dynamics are supposed to be similar during cold and warm periods.
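A minimal Python sketch of the correction in Equation (5) is given below. It assumes a single trend and threshold for the whole series, whereas the text states that they are estimated per hour; the function name and the numbers are hypothetical.

```python
def remove_heating_effect(demand, temperature, trend, threshold):
    """Apply Equation (5): subtract trend * (threshold - T) whenever T is not above the threshold."""
    return [
        d if temp > threshold else d - trend * (threshold - temp)
        for d, temp in zip(demand, temperature)
    ]


# Hypothetical values: demand in kW, temperature in degrees Celsius,
# trend a = 5 kW per degree, threshold b = 15 degrees.
corrected = remove_heating_effect(
    demand=[100.0, 130.0, 160.0],
    temperature=[18.0, 10.0, 5.0],
    trend=5.0,
    threshold=15.0,
)
```

The first value is left unchanged because 18 degrees is above the threshold; the other two are reduced by 25 kW and 50 kW respectively.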

To obtain comparable measurements between feeders, demand is normalized. Each measurement within a given week is divided by the energy consumed during that week. This total energy can be predicted using different models, such as that employed in [6], and is thereafter supposed to be known. After the normalization, data values fluctuate around a dimensionless value equal to 1.
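The weekly normalization can be sketched as follows. Since the normalized values are said to fluctuate around 1, this sketch assumes that the weekly energy is expressed as the mean demand level of the week; that reading, the function name and the toy numbers are assumptions. With ten-minute data a week holds 1,008 values.

```python
def normalize_by_week(demand, steps_per_week=1008):
    """Divide each value by the mean demand of its week, so each week averages 1."""
    normalized = []
    for start in range(0, len(demand), steps_per_week):
        week = demand[start:start + steps_per_week]
        mean_level = sum(week) / len(week)  # assumed strictly positive
        normalized.extend(value / mean_level for value in week)
    return normalized


# Toy example with 4 values per "week" instead of 1,008.
normalized = normalize_by_week([2.0, 4.0, 6.0, 4.0, 10.0, 30.0, 20.0, 20.0], steps_per_week=4)
```

Both toy weeks end up with a unit average even though the second one consumes five times more energy than the first.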

[Figure 3: deviation from the average weekly consumption as a function of the hour of the day (0 to 24), for the categories Commercial, Public eq., Rest. & hotels and Industry; Lyon 2011.]

Figure 3: Weekday profiles of 4 different categories computed with the algorithm (9 overall categories) using aggregated consumption data relating to Lyon in 2011. Plots represent the variations around the average weekly consumption and not absolute consumptions.

3.3 Profiles

As previously described (see Figure 1), we disaggregated the electricity demand in order to recover a load profile $d_k(t)$ for each category $k \in \{1, \dots, K\}$. The number of overall categories depends on the customer categorization: 2, 8, 9 and 12 categories were tried out (see Figure 2). A total of 12 datasets is formed (for each region: Blois, Lyon and Rennes; and for each year: from 2010 to 2013) and separately used as input into matrix $X$ in problem (2).

Figure 3 presents the profiles obtained for $K = 9$ with only 4 categories shown: commercial, public equipment, restaurant and hotel, industry. Profiles are computed with the demand dataset of Lyon in 2011. Profiles are presented for a typical weekday (144 values, one every 10 minutes). Since we have normalized the data, the variations around the average weekly consumption are displayed. Different effects are noteworthy, e.g. the electricity consumption of commercial buildings increases by around 75% during working hours, and decreases by 50% during the night. Conversely, the consumption of public equipment (mainly public lighting and lifts) greatly increases at night. These profiles are a pertinent way to understand electricity demand patterns. Profiles can be plotted for other datasets (another region or another year) in order to analyze specific characteristics.

4 Applications of the method

4.1 Estimation of the contribution of new customer sets to a feeder peak load

To plan the expansion of a new area, the DSO has to estimate the evolution of peak demand. The profiles obtained enable it to quantify and forecast the contribution of the new set of customers to the peak load demand. Indeed, for a feeder $f$ at year $y_0$ with proportions $p^f_{1,y_0}, \dots, p^f_{K,y_0}$, we can determine the residuals $\varepsilon^f_{y_0}(t)$ in Equation (1), and for new proportions $p^f_{1,y_1}, \dots, p^f_{K,y_1}$ in a future year $y_1$ the forecast demand is obtained by

$$d^f_{y_1}(t) = \sum_{k=1}^{K} p^f_{k,y_1} \, d_k(t) + \varepsilon^f_{y_0}(t). \tag{6}$$

Figure 4 depicts the peak change obtained with this formula in the case of different evolutions for both offices and special-tariff residential consumers. In this case study, the considered feeder is from the Lyon region and has the following distribution of customers: 30% commercial, 15% offices, 30% basic residential and 20% special-tariff residential. The initial peak occurs at 12:10 and is 650 kW. The profiles used are taken from the 9-category breakdown. We quantify the influence on the peak value (black lines with value added to the initial peak value, per 50 kW) by adding an office category load (Y axis) and a special-tariff residential load (X axis). We also depict the evolution of the peak hour (black dashed line). Adding offices contributes to increasing the 12:10 peak, whereas the residential load increases the 23:00 peak, which corresponds to the start of the special-tariff period.

This is an illustration of an application of the method that can, for example, help decision-makers to choose between two projects (offices or a new residential area) and quantify the impact on the existing feeder demand.
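The forecast of Equation (6) and the resulting peak can be sketched in Python as below. The residuals are first obtained from Equation (1) with the current proportions, then reused with the new proportions. All function names and values are hypothetical and the demand is kept dimensionless.

```python
def mix(weights, profiles):
    """Weighted sum of the category profiles at each instant."""
    return [
        sum(w * profile[t] for w, profile in zip(weights, profiles))
        for t in range(len(profiles[0]))
    ]


def feeder_residuals(measured, weights, profiles):
    """Residuals of Equation (1) for the current year."""
    return [m - s for m, s in zip(measured, mix(weights, profiles))]


def forecast_demand(new_weights, profiles, residuals):
    """Equation (6): new mix of profiles plus the residuals of the current year."""
    return [s + r for s, r in zip(mix(new_weights, profiles), residuals)]


def peak(demand):
    """Return (index, value) of the maximum demand."""
    index = max(range(len(demand)), key=lambda t: demand[t])
    return index, demand[index]


# Hypothetical feeder with 2 categories and 4 instants.
profiles = [[0.6, 0.8, 1.4, 1.2], [0.4, 1.6, 1.6, 0.4]]
measured = [0.6, 1.0, 1.4, 1.0]
residuals = feeder_residuals(measured, [0.75, 0.25], profiles)
forecast = forecast_demand([0.25, 0.75], profiles, residuals)
peak_index, peak_value = peak(forecast)
```

Repeating this computation over a grid of candidate proportions gives the kind of peak map summarized in Figure 4.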

4.2 Evaluation, comparison of the method and category relevancy

4.2.1 Simulation evaluation

Thanks to the computed profiles, the aggregated demand of a feeder can be simulated. Each category profile is multiplied by the consumption share of the category. The category distribution is the only information required for the simulation; there is no need for historical demand recordings. We show a simulation example in Figure 5. Demand is simulated with only two categories: residential (green area) and tertiary (orange area). We sum the two profiles multiplied by their respective shares (here 75% residential and 25% tertiary consumption). The measured consumption of a feeder with a 75/25 proportion is superimposed in black. The respective contribution of the two categories at each time step is clearly observable on the aggregated demand.

To assess the quality of the model, we use the Root-Mean-Square Error (RMSE) index. For each region and for each year, we compute profiles for 2, 8, 9 and 12 overall categories and use them to simulate new feeders. We then compare the simulation with actual demand using a leave-$k$-out approach ($k = 50$). This means that a subset of $k$ feeders (that are not used in the training stage) is simulated. An RMSE is obtained for each feeder of this subset and the average value is computed. This process is repeated 100 times to remove the volatility effect caused by the random selection of the 50-feeder subset. Computation takes roughly 16 hours for every region and every year on a 3.50 GHz machine.
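The two building blocks of this evaluation, the RMSE and the random leave-$k$-out split, can be sketched in Python as follows. The profile estimation itself is omitted; the function names, the seed and the toy values are assumptions.

```python
import math
import random


def rmse(simulated, measured):
    """Root-mean-square error between two series of equal length."""
    squared = [(s - m) ** 2 for s, m in zip(simulated, measured)]
    return math.sqrt(sum(squared) / len(squared))


def leave_k_out_split(feeder_ids, k, rng):
    """Randomly hold out k feeders; the others form the training set."""
    held_out = set(rng.sample(feeder_ids, k))
    training = [f for f in feeder_ids if f not in held_out]
    return training, sorted(held_out)


# Hypothetical usage: 320 feeders, 50 of them held out for one repetition.
rng = random.Random(0)
training, held_out = leave_k_out_split(list(range(320)), 50, rng)
error = rmse([1.0, 1.2, 0.8], [1.0, 1.0, 1.0])
```

In a full evaluation, the split would be redrawn for each repetition, profiles estimated on `training`, and the RMSE averaged over the feeders in `held_out`. Because the data are normalized around 1, an RMSE of about 0.163 as in this toy example reads as roughly 16%.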

[Figure 4: peak hour (12:10 or 23:00) and peak change as a function of the added residential load (kW, X axis) and the added office load (kW, Y axis).]

Figure 4: Peak change with a new load in a given feeder.

[Figure 5: demand (MWh, 0 to 1.5) over time from Thu 19 Jul to Sat 21 Jul, split into tertiary and residential parts.]

Figure 5: Simulation for one feeder. The profiles were obtained using demand data from Blois for 2012. The black line represents the actual consumption of the unknown feeder (not used in the training dataset). Our algorithm obtained two profiles: the orange part represents the tertiary demand and the green part the residential demand.

Table 1 reports the average RMSE and its deviation for the Blois, Lyon and Rennes regions during the 4 years for different numbers of categories. As a reminder, with consumption normalization, average consumption is dimensionless and equal to 1 (see Section 3.2). Hence, the RMSE reported is also dimensionless, and can be expressed as a percentage.

4.2.2 Category relevancy

Average RMSE is 22.59% for Blois, 18.16% for Lyon and 22.42% for Rennes with 9 categories. The errors are highly dependent on the regions, meaning that some regions are less predictable than others. Increasing the number of categories improves the overall model quality. The 8-category scheme almost always outperforms the 2-category one (by 2.5%). The 9-category scheme slightly improves results compared to the 8-category version (by 1%), and so dividing customers into basic and special tariffs is meaningful. However, splitting small categories into even smaller categories is not recommended, as can be seen from the poor results of the 12-category scheme. A first reason may come from the use of the CIS for classification: previous works have stated that directly using the CIS classification does not necessarily lead to the best profiles [12].

Another reason can come from the inter-group variability. As in any blind source separation task, a class is easy to recover and predict if it is distinctly separated from the other classes, and if it is observed in many different configurations. The statistics literature proposes many different separation metrics, but the simplest is a ratio between an inter-group variability measure and a total variability. In this context, since the variable of interest is a vector or even a curve, it is not obvious how to define the variability. We propose to define an inter-group variability measure with the weighted distance between $d_k$ and $d^f$

$$V_{inter} = \sum_{f,k} p^f_k \, \| d_k - d^f \|_2^2,$$

and a total variance by $V_{tot} = \sum_f \| d^f \|_2^2$, where $\| x \|_2^2$ is the sum of the squared components of a vector $x$. The ratio between the inter-group variability and the total variance should be as high as possible. Measuring the diversity of configurations in which the final signal is observed can be related to the variance $\sigma_k^2$ and mean $m_k$ of $p^f_k$ among the feeders: the larger this variance and mean, the more accurate the estimation will be. These separation and variability measures can be used to evaluate the value of adding categories. The inter-group variability requires the computation of the $d_k$, but $\sigma^2_k$ and $m_k$ can be computed before any estimation.
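Both indicators can be sketched in Python as below: the ratio $V_{inter} / V_{tot}$, which needs the estimated profiles, and the mean and variance of the shares, which only need the matrix of proportions. The empirical variance is assumed here to be the population variance (division by $F$); the function names and numbers are hypothetical.

```python
def squared_norm(x):
    """Sum of the squared components of a vector."""
    return sum(value * value for value in x)


def variability_ratio(A, profiles, demands):
    """Ratio V_inter / V_tot for proportions A (F x K), profiles (K x T), demands (F x T)."""
    v_inter = sum(
        A[f][k] * squared_norm([dk - df for dk, df in zip(profiles[k], demands[f])])
        for f in range(len(A))
        for k in range(len(profiles))
    )
    v_tot = sum(squared_norm(d) for d in demands)
    return v_inter / v_tot


def share_statistics(A):
    """Mean m_k and variance sigma_k^2 of each category share across feeders."""
    n_feeders = len(A)
    stats = []
    for k in range(len(A[0])):
        shares = [row[k] for row in A]
        mean = sum(shares) / n_feeders
        variance = sum((s - mean) ** 2 for s in shares) / n_feeders  # population variance
        stats.append((mean, variance))
    return stats


# Hypothetical example: 2 feeders, 2 categories, 2 instants.
A = [[0.2, 0.8], [0.6, 0.4]]
profiles = [[1.5, 0.5], [0.5, 1.5]]
demands = [[0.7, 1.3], [1.1, 0.9]]
ratio = variability_ratio(A, profiles, demands)
stats = share_statistics(A)
```

The coefficient of variation of Equation (4) follows from `stats` as the square root of the variance divided by the mean.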

4.2.3 Comparison to other models

Errors are higher than for middle-term forecasting methods, which can reach an RMSE of around 7 to 10% (see e.g. [8], [17]). However, our problem is different: the relationship between the demand for a feeder $f_1$ for a given year $y_0$ and the demand for the same feeder $f_1$ for the next year $y_1$ is much stronger than the relationship between the consumption of a feeder $f_1$ and the consumption of another feeder $f_2$ for the same year $y_0$.

The framework of Andersen et al. [5] is more similar to ours. It presents a model that calculates local consumption by customer category, with specific consumption profiles and different weights in local areas. Unlike ours, their profiles are obtained by clustering representative smart-meter measurements, i.e. with a bottom-up method. Their results from simulating local areas without using past measurements are expressed with the $R^2$ value and are between 0.56 and 0.95 (their mean $R^2$ is 0.84). In their case study, the mean consumption of areas is 55.3 MW while in our case, for a given feeder, it is between 0.5 and 7 MW. In order to compare our method with theirs, we aggregated our areas to obtain similar average power levels and computed the $R^2$ between prediction and measurements. The results are shown in Table 2.

The performance of our method is slightly higher than that of Andersen et al.'s method in the Lyon and Rennes case studies, and similar in the Blois study.
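The comparison can be sketched in Python as follows: feeder curves are summed to form an aggregated area, then the coefficient of determination is computed. The text does not state the exact formula used for $R^2$; the usual definition, one minus the ratio of residual to total sum of squares, is assumed here. Names and values are hypothetical.

```python
def aggregate(feeder_curves):
    """Sum several feeder curves instant by instant."""
    return [sum(values) for values in zip(*feeder_curves)]


def r_squared(predicted, measured):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical group of two feeders over three instants.
measured_area = aggregate([[1.0, 2.0, 3.0], [2.0, 2.0, 2.0]])
predicted_area = aggregate([[1.0, 2.0, 2.5], [2.0, 2.5, 2.0]])
score = r_squared(predicted_area, measured_area)
```

In this toy case the aggregated measurement is 3, 4, 5, the prediction is 3, 4.5, 4.5, and the resulting score is 0.75.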

Area      Avg. demand 2010 (MW)    R²
Blois     31.5                     0.82
Lyon      46.2                     0.88
Rennes    37.4                     0.87

Table 2: Coefficient of determination R² for different areas, showing the predictive performance of our method with a 9-category breakdown. The prediction for a group of 20 feeders is compared to the measured demand of the 20 feeders. We also report the average demands, which are comparable to the areas described by Andersen et al. [5] with similar R² values: on average, they found an R² of 0.84 for predicting different areas with an average demand of 55.3 MW.
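The comparison above can be sketched as follows: feeder curves are summed time step by time step, and R² is then computed between the aggregated prediction and the aggregated measurement. The standard definition R² = 1 − SS_res / SS_tot is assumed here, since the text does not spell it out; the function names and toy values are hypothetical.

```python
def aggregate(curves):
    """Sum several feeder curves (lists of equal length) time step by time step."""
    return [sum(values) for values in zip(*curves)]


def r_squared(measured, predicted):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(measured, predicted))
    ss_tot = sum((y - mean) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical demand curves (MW) for two feeders over four time steps.
measured_feeders = [[1.0, 2.0, 3.0, 2.0], [2.0, 2.0, 4.0, 2.0]]
predicted_feeders = [[1.0, 2.5, 2.5, 2.0], [2.0, 2.5, 3.5, 2.0]]

measured_total = aggregate(measured_feeders)
predicted_total = aggregate(predicted_feeders)
r2 = r_squared(measured_total, predicted_total)
```

Computing R² on the aggregated curves rather than feeder by feeder matches the comparison described in the text, where groups of feeders are pooled to reach power levels similar to those of the reference study.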

5 Conclusions

Our paper has proposed a novel method to estimate elementary profiles. The main assumption of the method is that feeder demands aggregate various shares of elementary profiles associated with different customer categories. The profiles are optimally found by minimizing prediction errors with a new algorithm relying on the augmented Lagrangian method.

Unlike bottom-up methods that require individual load curves, our method only requires several feeder demand curves and a description of customers. One advantage of using aggregated measurements over a set of individual load curves is that they can be updated regularly and are fully representative. At the same time, we have shown that our method performs as well as or better than a bottom-up method from the literature when predicting the demand of a new local area.

The method has been applied in a case study comprising three zones in France, with around 300 available feeder measurements over 4 years per zone. The result is a load profile for each customer category. We have shown that each load profile gathers intrinsic features of the given category.

A first application using the resulting profiles was presented for planning the expansion of a new area at DSO level. The resulting profiles allow for a different quantification and forecasting of the contribution made by a new set of customers to peak load demand. This was illustrated by a case study on a specific feeder, in which the evolution of peak demand when adding two share categories was discussed. A second application of the profiles is to simulate the electricity demand of new unmeasured areas. This can be used to test the relevancy of various types of categorization (2, 8, 9 or 12 groups were tested). By analyzing forecasting errors, we observe that using more categories does not necessarily lead to more efficient models; several causes are discussed.

Further research could investigate an automatic way to create categories, e.g. by maximizing entropy information, in order to obtain the best profiles and minimize prediction errors. Socio-demographic statistics might be efficient to accurately describe categories. Information such as mean household area and building age is very meaningful in electricity demand forecasting, and is thus an area for further research.

6 Acknowledgments

The authors would like to thank Enedis for supplying data to make this work possible, and particularly Nicolas Kong from the Direction Technique, Politiques et Stratégie group, for his valuable expertise on measurements, CIS data, and the Bagheera planning model.

References

[1] International Energy Agency, World Energy Outlook, IEA Publishing, Paris, 2015.

[2] International Energy Agency, CO2 emissions from fuel combustion highlights 2016, IEA Publishing, Paris, 2016, ch. Key trend in CO2 emission from fuel combustion, p. 12.

[3] International Energy Agency, Key World Energy Statistics, IEA Publishing, Paris, 2016.

[4] F. Andersen, H. Larsen, and T. Boomsma, Long-term forecasting of hourly electricity load: Identification of consumption profiles and segmentation of customers, Energy Conversion and Management, 68 (2013), pp. 244–252.

[5] F. Andersen, H. Larsen, and R. Gaardestrup, Long term forecasting of hourly electricity consumption in local areas in Denmark, Applied Energy, 110 (2013), pp. 147–162.

[6] F. Andersen, H. Larsen, N. Juul, and R. Gaardestrup, Differentiated long term projections of the hourly electricity consumption in local areas. The case of Denmark West, Applied Energy, 135 (2014), pp. 523–538.

[7] N. Bassamzadeh and R. Ghanem, Multiscale stochastic prediction of electricity demand in smart grids using Bayesian networks, Applied Energy, 193 (2017), pp. 369–380.

[8] K. G. Boroojeni, M. H. Amini, S. Bahrami, S. Iyengar, A. I. Sarwat, and O. Karabasoglu, A novel multi-time-scale modeling for electric power demand forecasting: From short-term to medium-term horizon, Electric Power Systems Research, 142 (2017), pp. 58–73.

[9] K. G. Boroojeni, S. Mokhtari, M. H. Amini, and S. S. Iyengar, Optimal two-tier forecasting power generation model in smart grids, CoRR, abs/1502.00530 (2015).

[10] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, Foundations and Trends in Machine Learning, 3 (2011), pp. 1–122.

[11] J.-F. Cardoso, Blind signal separation: statistical principles, Proceedings of the IEEE, 86 (1998), pp. 2009–2025.

[12] G. Chicco, R. Napoli, and F. Piglione, Comparisons among clustering techniques for electricity customer classification, IEEE Transactions on Power Systems, 21 (2006), pp. 933–940.

[13] European Commission, Benchmarking smart metering deployment in the EU-27 with a focus on electricity, European Commission, Brussels, June 2014.

[14] Réseau de transport d'électricité, Bilan prévisionnel de l'équilibre offre-demande d'électricité en France, 2016, ch. Consommation d'électricité en France, p. 40.

[15] J. Dickert and P. Schegner, Residential load models for network planning purposes, in 2010 Modern Electric Power Systems, Sept 2010, pp. 1–6.

[16] N. Ding, Load models for operation and planning of electricity distribution networks with metering data, thesis, Université de Grenoble, Nov. 2012.

[17] Y. Goude, R. Nedellec, and N. Kong, Local short and middle term electricity load forecasting with semi-parametric additive models, IEEE Transactions on Smart Grid, 5 (2014), pp. 440–446.

[18] M. Jin, W. Feng, P. Liu, C. Marnay, and C. Spanos, Mod-dr: Microgrid optimal dispatch with demand response, Applied Energy, 187 (2017), pp. 758–776.

[19] K.-L. Zhou, S.-L. Yang, and C. Shen, A review of electric load classification in smart grid environment, Renewable and Sustainable Energy Reviews, 24 (2013), pp. 103–110.

[20] V. Lefieux, Modèles semi-paramétriques appliqués à la prévision des séries temporelles. Cas de la consommation d'électricité, PhD thesis, Université Rennes 2, 2007.

[21] E. McKenna, I. Richardson, and M. Thomson, Smart meter data: Balancing consumer privacy concerns with legitimate applications, Energy Policy, 41 (2012), pp. 807–814.

[22] F. McLoughlin, A. Duffy, and M. Conlon, A clustering approach to domestic electricity load profile characterisation using smart metering data, Applied Energy, 141 (2015), pp. 190–199.

[23] A. Mutanen, M. Ruska, S. Repo, and P. Jarventausta, Customer classification and load profiling method for distribution systems, IEEE Transactions on Power Delivery, 26 (2011), pp. 1755–1763.

[24] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2015.

[25] J. D. Rhodes, W. J. Cole, C. R. Upshaw, T. F. Edgar, and M. E. Webber, Clustering analysis of residential electricity demand profiles, Applied Energy, 135 (2014), pp. 461–471.

[26] T. Räsänen, D. Voukantsis, H. Niska, K. Karatzas, and M. Kolehmainen, Data-based method for creating electricity use load profiles using large amount of customer-specific hourly measured electricity use data, Applied Energy, 87 (2010), pp. 3538–3545.

[27] A. Seppälä, Load research and load estimation in electricity distribution, thesis, Technical Research Center of Finland, VTT Publications, Jan. 1996.

[28] European Parliamentary Research Service, Smart electricity grids and meters in the EU Member States, European Parliament, Brussels, September 2015.

[29] Z. Shao, F. Chao, S.-L. Yang, and K.-L. Zhou, A review of the decomposition methodology for extracting and identifying the fluctuation characteristics in electricity demand forecasting, Renewable and Sustainable Energy Reviews, (2016).

[30] Z. Shao, F. Gao, Q. Zhang, and S.-L. Yang, Multivariate statistical and similarity measure based semiparametric modeling of the probability distribution: A novel approach to the case study of mid-long term electricity consumption forecasting in China, Applied Energy, 156 (2015), pp. 502–518.

[31] J. L. Viegas, S. M. Vieira, R. Melício, V. Mendes, and J. M. Sousa, Classification of new electricity customers based on surveys and smart metering data, Energy, 107 (2016), pp. 804–817.

Region  Year     2 categories   8 categories   9 categories   12 categories
Blois   2010     24.36 (2.32)   23.90 (2.75)*  24.04 (2.99)   26.78 (2.91)
Blois   2011     23.87 (1.62)   22.79 (1.42)*  22.91 (1.16)   24.78 (2.09)
Blois   2012     22.84 (1.26)   22.54 (1.17)   22.09 (1.24)*  24.09 (2.17)
Blois   2013     22.34 (2.06)   22.31 (1.98)   21.32 (1.96)*  23.34 (2.04)
Blois   Average  23.35 (1.86)   22.89 (1.93)   22.59 (1.98)*  24.75 (2.33)
Lyon    2010     19.05 (2.39)   19.42 (2.71)   18.29 (2.24)*  19.23 (1.94)
Lyon    2011     19.28 (1.24)   18.06 (1.55)*  18.56 (1.20)   18.46 (1.42)
Lyon    2012     19.07 (1.35)   18.21 (1.72)*  18.30 (1.34)   19.00 (1.86)
Lyon    2013     18.06 (1.03)   17.92 (2.03)   17.49 (1.12)*  18.68 (1.91)
Lyon    Average  18.87 (1.59)   18.40 (2.05)   18.16 (1.58)*  18.84 (1.79)
Rennes  2010     22.57 (0.96)   21.67 (1.23)   21.59 (1.04)*  22.70 (1.57)
Rennes  2011     22.62 (1.22)   21.54 (1.48)*  21.57 (1.06)   22.10 (1.08)
Rennes  2012     22.75 (1.11)   22.96 (0.99)   22.39 (0.98)*  22.61 (0.84)
Rennes  2013     24.94 (1.03)   23.99 (1.37)*  24.14 (1.26)   24.08 (1.37)
Rennes  Average  23.22 (1.08)   22.54 (1.28)   22.42 (1.09)*  22.87 (1.25)

Table 1: RMSE (in %) of the models for the 3 different zones over the 4 years with a different number of categories. The simulation is run 100 times. We report the average RMSE, with its standard deviation in parentheses. In each row, the best result over the 4 numbers of categories (lowest average RMSE) is marked with an asterisk.



