
Identifying regions for regional frequency analysis using statistical depth function.


Academic year: 2021


IDENTIFYING HOMOGENEOUS REGIONS USING STATISTICAL DEPTH FUNCTION

Hussein Wazneh*¹, Fateh Chebana¹ & Taha Ouarda²

¹ INRS-ETE, University of Quebec; ² Masdar Institute of Science and Technology

I. Introduction

Extreme events (floods, droughts, storms) have serious social and economic consequences.

Local frequency analysis is a commonly used tool to predict the frequency of an extreme event when sufficient hydrological information is available.

Regional frequency analysis (RFA) is used to estimate extreme events at ungauged sites or sites with short records.

RFA is composed of two main steps:

- Delineation of hydrological homogeneous regions (DRH)

- Regional estimation

Traditional approaches to DRH:

- Canonical correlation analysis, region of influence

- Hierarchical clustering (HC)

II. Tools

V. Conclusion

Contact Information

The proposed D-clustering approach is robust.

This algorithm-based procedure automates the optimal choice of the sub-regions with respect to the homogeneity criterion.

It makes the DRH more practical and usable without the user's subjective intervention.

The study of the three regions shows that, compared with the traditional DRH approach, the proposed D-clustering approach leads to:

- more homogeneous sub-regions in terms of the heterogeneity measure H

- more efficient quantile estimates in terms of relative bias and relative root mean square error

Hussein Wazneh, PhD student

490 rue de la couronne

G1K 9A9, Québec, Canada

Tel: 418 654 2430#4468

Email:

[email protected]

5th STAHY International Workshop, Abu Dhabi

Motivation

Spatial Depth function

A depth function is a statistical notion introduced to extend the notion of order to multidimensional spaces and, in particular, to define the median of a multivariate sample.

Spatial depth: the spatial depth of a point $x$ is one minus the amount of probability mass needed at $x$ to make it the spatial (multivariate) median of the data:

$$\mathrm{SPD}(x, \hat{F}) = 1 - \left\| \frac{1}{n} \sum_{i=1}^{n} \frac{x - X_i}{\| x - X_i \|} \right\|, \quad x \in \mathbb{R}^d$$
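The sample spatial depth can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function name `spatial_depth` and the toy data are assumptions, and sample points that coincide with `x` are treated as contributing a zero vector.

```python
import math


def spatial_depth(x, sample):
    """Sample spatial depth of point x with respect to the points in `sample`.

    SPD(x) = 1 - || (1/n) * sum_i (x - X_i) / ||x - X_i|| ||
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must not be empty")
    total = [0.0] * len(x)
    for xi in sample:
        diff = [a - b for a, b in zip(x, xi)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0.0:  # a point coinciding with x has no direction: skip it
            for j, d in enumerate(diff):
                total[j] += d / norm
    return 1.0 - math.sqrt(sum(t * t for t in total)) / n


# Toy data (invented): the four corners of a square.
square = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
centre_depth = spatial_depth((1.0, 1.0), square)    # centre of the square
far_depth = spatial_depth((100.0, 100.0), square)   # point far outside
```

At the centre of the square the four unit vectors cancel, so the depth reaches its maximum of 1; for the distant point the unit vectors are almost parallel and the depth is close to 0.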

Index-flood model

To find the optimal homogeneous sub-regions, the procedure is composed of three main steps :

- Initialization: Use a traditional clustering approach to form the initial K sub-regions (we start with K=2, see Algorithm below);

- Modification: Modify the initial position of sites r using the D-clustering criterion (i.e. each site r is reassigned to the sub-region that maximizes its DeD);

- Homogeneity: Test the homogeneity of the final sub-regions obtained after the modification step. If the obtained sub-regions are not homogeneous, return to the initialization step with K=K+1.
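The three steps above can be sketched as a loop over K. This is only an illustration under stated assumptions: `chunk_split` is a hypothetical stand-in for the traditional clustering (the poster uses Ward), `small_spread` is a hypothetical stand-in for the homogeneity test (the poster uses the H measure), `k_max` is an added safeguard, the modification step is applied as a single pass, and all data are invented.

```python
import math


def spatial_depth(x, sample):
    """Sample spatial depth of x with respect to the points in `sample`."""
    total = [0.0] * len(x)
    for xi in sample:
        diff = [a - b for a, b in zip(x, xi)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0.0:  # a point coinciding with x contributes nothing
            total = [t + d / norm for t, d in zip(total, diff)]
    return 1.0 - math.sqrt(sum(t * t for t in total)) / len(sample)


def reassign(sites, labels, k):
    """Modification step (single pass), using depths from the current partition.

    For a fixed partition, DeD = D_w - D_b is largest when the site sits in the
    sub-region where its depth is highest, so that sub-region is selected.
    """
    regions = [[s for s, c in zip(sites, labels) if c == j] for j in range(k)]
    new_labels = []
    for x in sites:
        depths = [spatial_depth(x, reg) if reg else float("-inf") for reg in regions]
        new_labels.append(max(range(k), key=lambda j: depths[j]))
    return new_labels


def d_clustering(sites, initial_clustering, is_homogeneous, k_max):
    """Return (K, labels) for the first K whose sub-regions all pass the test."""
    for k in range(2, k_max + 1):
        labels = initial_clustering(sites, k)   # 1. initialization
        labels = reassign(sites, labels, k)     # 2. modification
        regions = [[i for i, c in enumerate(labels) if c == j] for j in range(k)]
        if all(is_homogeneous(reg) for reg in regions if reg):  # 3. homogeneity
            return k, labels
    return None  # no homogeneous partition found up to k_max


def chunk_split(sites, k):
    """Hypothetical stand-in for Ward: k equal chunks along the first attribute."""
    order = sorted(range(len(sites)), key=lambda i: sites[i][0])
    labels = [0] * len(sites)
    for rank, i in enumerate(order):
        labels[i] = rank * k // len(sites)
    return labels


# Invented per-site statistic used by the stand-in homogeneity test.
site_stat = [0.20, 0.21, 0.19, 0.35, 0.36, 0.34, 0.50, 0.51, 0.49]


def small_spread(region):
    """Hypothetical stand-in for the H test: small spread of the site statistic."""
    values = [site_stat[i] for i in region]
    return max(values) - min(values) <= 0.05


sites = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0),
         (10.0, 0.0), (11.0, 0.0), (10.5, 1.0),
         (20.0, 0.0), (21.0, 0.0), (20.5, 1.0)]
result = d_clustering(sites, chunk_split, small_spread, k_max=5)
```

With three well-separated groups of sites, K=2 cannot pass the stand-in test because at least one sub-region mixes two groups, so the loop moves on to K=3, where each group forms its own sub-region.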

D-clustering algorithm

Drawbacks related to the traditional HC approach:

- Based on distance measures (e.g., Ward or linkage)

- Uses non-robust statistics (e.g., k-means): sensitive to noise and to outliers

- Requires a pre-selection of the number of sub-regions

Aims of this study:

- Propose a robust approach for DRH based on the notion of depth function

- Make the delineation step automatic and objective

Selected results

Relative error with the (a) Ward and (b) D-clustering approaches and using index-flood and Depth index-flood estimation methods

IV. Case study and results

Example: $X_r = (\mathrm{AREA}_r, H_{m,r})$, with $H_m$ the mean elevation; $K = 3$; $X_1, X_2 \in I(1)$.

$D^w_1 = \mathrm{SPD}(X_1, I(1))$ and $D^w_2 = \mathrm{SPD}(X_2, I(1))$.

Suppose that $\mathrm{SPD}(X_1, I(2)) > \mathrm{SPD}(X_1, I(1)) > \mathrm{SPD}(X_1, I(3))$ and $\mathrm{SPD}(X_2, I(3)) > \mathrm{SPD}(X_2, I(1)) > \mathrm{SPD}(X_2, I(2))$.

In this case $D^b_1 = \mathrm{SPD}(X_1, I(2))$ and $D^b_2 = \mathrm{SPD}(X_2, I(3))$.

III. DRH using spatial depth function

Example: $X = \{x_1, \ldots, x_7\}$; $\mathrm{SPD}(y_1, X) \approx 1$ and $\mathrm{SPD}(y_2, X) \approx 0$.

$X_r = (X_{r1}, \ldots, X_{rp})$: $p$ selected attributes for site $r$;

$K$: number of homogeneous sub-regions; $N$: number of sites in the data set;

$I(k) = \{X_r;\ r \in \text{sub-region } k\}$; $r = 1, \ldots, N$; $k = 1, \ldots, K$.

We define the:

- Within sub-region depth of site $r$: $D^w_r = \mathrm{SPD}(X_r, I(k))$, $r \in k$. $D^w_r$ quantifies how central the site $r$ is with respect to its sub-region $k$.

- Between sub-region depth of site $r$: $D^b_r = \max_{l = 1, \ldots, K;\ l \neq k} \mathrm{SPD}(X_r, I(l))$. $D^b_r$ is the maximum depth of site $r$ with respect to the sub-regions to which it does not belong.

- Deviance sub-region depth of site $r$: $\mathrm{DeD}_r = D^w_r - D^b_r$.

Particular definition

D-clustering criterion

- $\mathrm{DeD}_r > 0$: site $r$ is well classified in its sub-region

- $\mathrm{DeD}_r \approx 0$: site $r$ lies between two sub-regions

- $\mathrm{DeD}_r < 0$: site $r$ is placed in the wrong sub-region

Criterion: in the D-clustering approach, each site $r$ is assigned to the sub-region that maximizes its $\mathrm{DeD}_r$.
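The within, between and deviance depths can be computed directly from the spatial depth. The sketch below is illustrative only: the function names and the three toy sub-regions are invented, and site index 3 is deliberately placed in the wrong sub-region.

```python
import math


def spatial_depth(x, sample):
    """Sample spatial depth of x with respect to the points in `sample`."""
    total = [0.0] * len(x)
    for xi in sample:
        diff = [a - b for a, b in zip(x, xi)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0.0:  # a point coinciding with x contributes nothing
            total = [t + d / norm for t, d in zip(total, diff)]
    return 1.0 - math.sqrt(sum(t * t for t in total)) / len(sample)


def deviance_depth(sites, labels, r):
    """Return (D_w, D_b, DeD) for the site with index r.

    Assumes at least two sub-regions are present in `labels`.
    """
    own = labels[r]
    regions = {}
    for s, c in zip(sites, labels):
        regions.setdefault(c, []).append(s)
    d_w = spatial_depth(sites[r], regions[own])            # within depth
    d_b = max(spatial_depth(sites[r], pts)                 # between depth
              for c, pts in regions.items() if c != own)
    return d_w, d_b, d_w - d_b


# Invented attributes: index 3 is labelled 1 but lies at the centre of sub-region 2.
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0),
         (9.0, 9.0), (11.0, 9.0), (9.0, 11.0), (11.0, 11.0),
         (30.0, 30.0), (31.0, 30.0), (30.0, 31.0)]
labels = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3]

d_w, d_b, ded = deviance_depth(sites, labels, 3)
```

Here the deviance depth of the misplaced site is negative, because its depth with respect to sub-region 2 exceeds its depth within its assigned sub-region 1, so the criterion would move it to sub-region 2.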

Using the D-clustering criterion, the two sites in the example must change sub-regions: site 1 must be assigned to sub-region 2 and site 2 must be assigned to sub-region 3.

Region: Piemonte, Valle d'Aosta and Liguria, three geographical regions in the North-West of Italy.

Data: annual maximum peak flood data of 47 streamflow gauging sites.

Clustering variables: coordinates in the UTM system of the catchment centroids (Xbar and Ybar) and the mean elevation (Hm) of the drainage basin.

Aims:

1) Delineate the homogeneous sub-regions using the Ward and D-clustering approaches;

2) Estimate the flood quantiles for different non-exceedance probabilities, using the sub-regions obtained by the different DRH approaches.

Performance criteria:

Since the delineation step affects the results of estimation of flood quantiles, two categories of performance criteria are used in this study:

Quantile estimation results in % using the index-flood and Depth-based index-flood method for the studied region using D-clustering and Ward approaches

Depth index-flood model

The index-flood model estimates the flood quantile corresponding to a non-exceedance probability $t$ through the expression:

$$\hat{Q}_{i_0}(t) = \hat{\mu}_{i_0}\, q(t; \hat{\theta}^R), \quad 0 < t < 1, \quad \text{and} \quad \hat{\theta}^R_h = \frac{\sum_{l=1}^{N'} n_l\, \hat{\theta}^{(l)}_h}{\sum_{l=1}^{N'} n_l}, \quad h = 1, \ldots, L$$

$i_0$: identifier of the target site; $q(\cdot)$: growth curve function;

$\hat{\mu}_{i_0}$: index flood; $\hat{\theta}^{(l)}_h$: $h$-th parameter obtained from the $l$-th gauged site;

$\hat{\theta}^R$: vector with $L$ components of the regional growth curve parameters;

$n_l$: record length at site $l$;

$N'$: number of sites in the homogeneous sub-region that contains the site $i_0$.

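The depth index-flood model estimates the flood quantile using the same expression as the index-flood model. However, the vector of regional parameters is estimated by:

$$\hat{\theta}^R_h = \frac{\sum_{l=1}^{N'} \varphi\big(\mathrm{Depth}(\hat{\theta}^{(l)}; \hat{\Theta})\big)\, \hat{\theta}^{(l)}_h}{\sum_{l=1}^{N'} \varphi\big(\mathrm{Depth}(\hat{\theta}^{(l)}; \hat{\Theta})\big)}, \quad h = 1, \ldots, L$$

$\varphi$: increasing weight function (e.g., Gompertz, logistic, ...);

$\mathrm{Depth}$: depth function (e.g., Mahalanobis, Tukey, ...);

$\hat{\Theta} = \{\hat{\theta}^{(1)}, \ldots, \hat{\theta}^{(N')}\}$: set containing the vectors of at-site parameters.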
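The difference between the two regional estimates can be illustrated with a short Python sketch. Several choices here are assumptions rather than the poster's settings: the classical weights are taken to be the site record lengths, the spatial depth is used in place of the Mahalanobis or Tukey depth to keep the example dependency-free, the Gompertz constants `b` and `c` are arbitrary, and the at-site parameter vectors are invented.

```python
import math


def spatial_depth(x, sample):
    """Sample spatial depth of x with respect to the points in `sample`."""
    total = [0.0] * len(x)
    for xi in sample:
        diff = [a - b for a, b in zip(x, xi)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0.0:  # a point coinciding with x contributes nothing
            total = [t + d / norm for t, d in zip(total, diff)]
    return 1.0 - math.sqrt(sum(t * t for t in total)) / len(sample)


def gompertz_weight(depth, b=5.0, c=6.0):
    """Increasing weight function of the depth; b and c are illustrative only."""
    return math.exp(-b * math.exp(-c * depth))


def regional_parameters(at_site_params, weights):
    """Weighted average of the at-site parameter vectors, component by component."""
    total = sum(weights)
    n_params = len(at_site_params[0])
    return [sum(w * theta[h] for w, theta in zip(weights, at_site_params)) / total
            for h in range(n_params)]


# Invented at-site parameter vectors; the fifth site is an outlier.
at_site = [(1.0, 0.30), (1.1, 0.32), (0.9, 0.28), (1.0, 0.31), (3.0, 0.90)]
record_lengths = [30, 25, 40, 35, 20]  # assumed classical weights

# Classical index-flood estimate: weights are the record lengths.
classical = regional_parameters(at_site, record_lengths)

# Depth-based estimate: weights are an increasing function of each site's depth.
depths = [spatial_depth(theta, at_site) for theta in at_site]
depth_weights = [gompertz_weight(d) for d in depths]
depth_based = regional_parameters(at_site, depth_weights)
```

In this toy data the outlying fifth site receives one of the smallest depth weights, so the first component of `depth_based` is pulled less towards it than the first component of `classical`.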

a) Hierarchical clustering of stations using the Ward method. The green line represents the cut-off defining the four homogeneous sub-regions used in the Ward approach. The red line represents the cut-off defining the three initial sub-regions used for the D-clustering approach. b) Geographical location of the sites of the four homogeneous sub-regions formed using the Ward method.

Heterogeneity measure H, silhouette SIL and global silhouette GSIL using the traditional Ward and D-clustering approaches.

Sites that have a negative deviance depth (DeD). These sites are moved from their initial sub-regions in the D-clustering approach.

Criteria assessing the clustering approaches: H (heterogeneity measure), SIL (silhouette of sub-region), GSIL (global silhouette)

Criteria assessing the estimation results: RB (relative bias)
