
TESTING THE GOODNESS OF FIT


Goodness of fit of the candidate distributions can be investigated by statistical tests. A quite powerful test is the probability plot correlation coefficient (PPCC) test. Critical values of the PPCC test are given by Vogel and Kroll (1989) for W2 and by Onoz and Bayazit (1998) for P2.

The main purpose of the low flow frequency analysis is to estimate the quantiles corresponding to large return periods. Therefore smaller observations should be given higher weights in parameter estimation. It is observed that at some sites the lower tail of the distribution of minimum flows seems to follow a different curve than the remaining data (see e.g. Fig. 1). In this study modified L-moments called LL-moments, proposed by Onoz and Bayazit (1998), are used to give more weight to the smaller observations. LL-moments differ from L-moments in that the LLm-moment of order r is based on the expectations of the r smallest elements of a subsample of size r+m instead of size r, m being a positive integer. For m=0, LLm-moments reduce to L-moments.
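The definition above can be turned into a sample estimator. The following is a minimal sketch, not the authors' code. It assumes that the LLm-moment uses the same alternating combination of expected order statistics as the ordinary L-moment, applied to the r smallest elements of a subsample of size r+m, and that each expected order statistic is estimated by the standard unbiased combinatorial formula. The function names are hypothetical.

```python
from math import comb


def expected_order_stat(sorted_x, j, size):
    """Unbiased estimate of E[X_{j:size}], the j-th smallest value in a
    random subsample of `size` elements, from an ascending sample."""
    n = len(sorted_x)
    if not 1 <= j <= size <= n:
        raise ValueError("need 1 <= j <= size <= len(sample)")
    # x_(i) is the j-th smallest of a subsample when j-1 of the chosen
    # elements lie below it and size-j lie above it.
    total = sum(
        comb(i - 1, j - 1) * comb(n - i, size - j) * x
        for i, x in enumerate(sorted_x, start=1)
    )
    return total / comb(n, size)


def ll_moment(sample, r, m=0):
    """Sample LLm-moment of order r (assumed definition, see the lead-in).

    Uses the r smallest elements of a subsample of size r + m, so that
    m = 0 gives the ordinary L-moment.
    """
    xs = sorted(sample)
    size = r + m
    terms = (
        (-1) ** k * comb(r - 1, k) * expected_order_stat(xs, r - k, size)
        for k in range(r)
    )
    return sum(terms) / r


def ll_cv(sample, m=0):
    """LLm-CV: ratio of the second-order to the first-order LLm-moment."""
    return ll_moment(sample, 2, m) / ll_moment(sample, 1, m)
```

With m=0 and r=1 the function returns the sample mean. Increasing m shifts the weight toward the smallest observations, because only the lowest elements of each larger subsample enter the estimate.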

Fig. 1: An example of low flow data where smaller observations seem to follow a different curve.

The PPCC test is not sensitive to the method of parameter estimation for 2-parameter distributions like W2 and P2 when a Weibull-type plotting position formula of the form

$$F(x_i) = \frac{i-a}{n+1-2a} \qquad (4)$$

is used (Vogel and Kroll, 1989), where i is the rank of the observation $x_i$ in an ordered sample of size n. The reason for this is that the probability plots of these distributions plot $\ln x_i$ against $\ln[-\ln F(x_i)]$, and with Eq. (4) neither coordinate depends on the estimated parameters.

Therefore PPCC test cannot be used to compare the goodness of fit of these distributions whose parameters are estimated using LLm-moments with different values of m when Eq.(4) is used with any value of a as the plotting position formula.

For the P2 distribution, Onoz and Bayazit (1998) obtained the following expression for the plotting position corresponding to $F(E(x_i))$, where $E(x_i)$ is the expected value of the observation of rank i in a sample of size n (Gumbel 1958):

$$F(x_i) = \left[\prod_{j=0}^{n-i} \frac{i+j}{i+j+\frac{1}{c}}\right]^{c} \qquad (5)$$

Since this formula contains the parameter c, the PPCC test becomes sensitive to the method of parameter estimation when Eq. (5) is used for the P2 distribution. A similar formula could not be obtained for the W2 distribution.
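The contrast between the two plotting positions can be illustrated numerically. The sketch below assumes the product form of Eq. (5), with the product running over j = 0, ..., n-i and the whole product raised to the power c. The function names are hypothetical.

```python
def weibull_type_position(i, n, a=0.0):
    """Eq. (4): F(x_i) = (i - a) / (n + 1 - 2a); free of distribution parameters."""
    return (i - a) / (n + 1 - 2 * a)


def p2_position(i, n, c):
    """Eq. (5): plotting position F(E(x_i)) for the P2 distribution with shape c."""
    product = 1.0
    for j in range(n - i + 1):  # j = 0, ..., n - i
        product *= (i + j) / (i + j + 1.0 / c)
    return product ** c


if __name__ == "__main__":
    n = 10
    for c in (0.72, 0.94, 1.0):
        positions = [round(p2_position(i, n, c), 3) for i in range(1, n + 1)]
        print(f"c = {c}: {positions}")
```

For c=1 the product telescopes to i/(n+1), which is Eq. (4) with a=0. For other values of c the positions shift with c, which is why the PPCC statistic then depends on how c was estimated.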

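Parameters of the W2 distribution can be estimated in terms of LL-moments by the expressions:

$$\mathrm{LL}_m\text{-CV} = \left(1+\frac{m}{2}\right)\left[1-(1+m)^{1/k}(2+m)^{-1/k}\right]$$

$$\mathrm{LL}_{m1} = \frac{a\,\Gamma\!\left(1+\frac{1}{k}\right)}{(1+m)^{1/k}} \qquad (6)$$

where $\mathrm{LL}_{m1}$ is the first order LLm-moment and LLm-CV is the ratio of the second order LLm-moment to $\mathrm{LL}_{m1}$. For the P2 distribution the corresponding equations are: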

$$c = \frac{1}{2+m}\left[\left(1+\frac{m}{2}\right)\frac{1}{\mathrm{LL}_m\text{-CV}}-1\right]$$

$$x_0 = \frac{(1+c)(1+2c)\cdots[1+(1+m)c]}{(1+m)!\,c^{1+m}}\,\mathrm{LL}_{m1} \qquad (7)$$

For m=0, Eq. (6) reduces to Eq. (1) and Eq. (7) to Eq. (3).
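The two estimation steps can be sketched in code as follows. This is an illustration under stated assumptions, not the authors' implementation. The P2 parameters follow in closed form from Eq. (7). For W2 the first line of Eq. (6) has no closed-form inverse, so k is found here by bisection (a choice made for this sketch), and a is then obtained from the fact that the mean of the smallest of 1+m Weibull variates is $a\,\Gamma(1+1/k)\,(1+m)^{-1/k}$. Function names and the bisection bracket are hypothetical.

```python
from math import gamma, factorial


def p2_parameters(llm1, llm_cv, m=0):
    """Eq. (7): closed-form P2 parameters (c, x0) from LLm1 and LLm-CV."""
    c = ((1 + m / 2) / llm_cv - 1) / (2 + m)
    numerator = 1.0
    for j in range(1, m + 2):  # (1 + c)(1 + 2c)...[1 + (1 + m)c]
        numerator *= 1 + j * c
    x0 = numerator * llm1 / (factorial(1 + m) * c ** (1 + m))
    return c, x0


def w2_llm_cv(k, m=0):
    """Eq. (6), first line: LLm-CV of the W2 distribution for shape k."""
    return (1 + m / 2) * (1 - ((1 + m) / (2 + m)) ** (1 / k))


def w2_parameters(llm1, llm_cv, m=0, k_low=0.05, k_high=50.0):
    """Solve Eq. (6) for (k, a); k by bisection since LLm-CV decreases with k."""
    if not w2_llm_cv(k_high, m) < llm_cv < w2_llm_cv(k_low, m):
        raise ValueError("LLm-CV outside the bracketed range of k")
    for _ in range(200):
        k_mid = 0.5 * (k_low + k_high)
        if w2_llm_cv(k_mid, m) > llm_cv:
            k_low = k_mid  # CV too large, so k must be larger
        else:
            k_high = k_mid
    k = 0.5 * (k_low + k_high)
    a = llm1 * (1 + m) ** (1 / k) / gamma(1 + 1 / k)
    return k, a
```

Because the data are divided by the at-site mean, the first order moment for m=0 equals 1, so `p2_parameters(1.0, llm_cv, 0)` returns x0 = (1+c)/c.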

To compare the overall goodness of fit of the W2 distribution whose parameters are estimated by LLm-moments with different values of m, another approach is used, as explained below. The same method is also used for the P2 distribution.

For a certain model (probability distribution function plus parameter estimation method), quantiles corresponding to nonexceedance probabilities 0.1, 0.2, 0.3, ..., 0.7 are computed (higher probabilities are not used since the emphasis is on fitting the lower tail) and compared with the regional average of the at-site data corresponding to those probabilities. Bias and rmse (root mean square error) of the differences between observed and predicted values are calculated for each model. These quantities are used in comparing the goodness of fit of various models.
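A minimal sketch of this comparison is given below. It assumes the P2 form F(x) = (x/x0)^c for the quantile function and takes the difference as observed minus predicted; the sign convention is an assumption. The observed values in the usage part are hypothetical and serve only to show the call.

```python
from math import sqrt


def p2_quantile(p, c, x0):
    """Quantile of the P2 distribution F(x) = (x / x0)^c."""
    return x0 * p ** (1.0 / c)


def bias_and_rmse(observed, predicted):
    """Mean and root-mean-square of the differences observed - predicted."""
    diffs = [o - q for o, q in zip(observed, predicted)]
    bias = sum(diffs) / len(diffs)
    rmse = sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmse


if __name__ == "__main__":
    probs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
    # Hypothetical regional-average observed quantiles (dimensionless).
    observed = [0.16, 0.33, 0.51, 0.69, 0.88, 1.07, 1.26]
    predicted = [p2_quantile(p, c=0.94, x0=1.84) for p in probs]
    bias, rmse = bias_and_rmse(observed, predicted)
    print(f"bias = {bias:.3f}, rmse = {rmse:.3f}")
```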

APPLICATION

The methodology described above is applied to the Gediz river basin in Western Anatolia, Turkey, along the coast of the Aegean Sea. The total area of the basin is about 18000 km². There are 12 flow-gaging stations; one of them is affected by the regulation of the upstream reservoirs and is not considered in this study.

The 7-day annual average minimum flow is chosen as the representative low flow variable. At 5 stations some of the years have zero 7-day minimum flows. Three of these have more than 60% zeros and are not included in the regional analysis. Data of the remaining 8 stations are used in the regional analysis (Table 1).

Record lengths vary from N=17 to 33 years. The number of years with zero flows is $N_0$=4 at one site and $N_0$=1 at another site.

Statistical moments $\mu_k$ (k=1,2,3,4) and quantiles $q_p$ (p=0.10, 0.20, ..., 0.90) are determined at each site. Regression equations are derived between $\ln \mu_k$ and $\ln A$ (Table 2), and between $\ln q_p$ and $\ln A$ (Table 3).
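The log-log regressions amount to fitting a power law by ordinary least squares on logarithms. The sketch below is one way to do it, taking A to be a positive site attribute such as drainage area (the meaning of A is an assumption here, since it is not defined in this section). The function name is hypothetical.

```python
from math import log, sqrt


def power_law_fit(areas, values):
    """Least-squares fit of ln(value) = intercept + exponent * ln(area).

    Returns the exponent, the intercept and the correlation coefficient R.
    All inputs must be positive.
    """
    xs = [log(a) for a in areas]
    ys = [log(v) for v in values]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    syy = sum((y - y_mean) ** 2 for y in ys)
    exponent = sxy / sxx
    intercept = y_mean - exponent * x_mean
    r = sxy / sqrt(sxx * syy)
    return exponent, intercept, r
```

Applied to the moments of order k, the fitted exponent divided by k gives the power reported for each k; applied to a quantile, the exponent itself is the reported power.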

TABLE 1: Characteristics of flow gaging stations

TABLE 2: Power β of the relationship $\mu_k \propto A^{\beta k}$

k    1      2      3      4
β    0.700  0.700  0.697  0.695

TABLE 3: Power α of the relationship $q_p \propto A^{\alpha}$

F    0.1

It is seen that the β coefficient is practically constant, such that $\mu_k$ is proportional to $A^{0.70k}$ (correlation coefficient R=0.75-0.78). The α coefficient varies in a wider range, with an average of 0.70 for p=0.2-0.9 (R=0.63-0.78), indicating that $q_p$ is proportional to $A^{0.70}$, but α=0.60 for p=0.1. These results imply approximate simple scaling of low flows in the region.

Fig. 2: L-CV and L-CS values of the at-site data for m=0.

The discordancy analysis is performed to flag as discordant the sites whose data stand out from those of the other sites. The discordancy measure $D_i$ is computed at each site (Hosking and Wallis 1993). All $D_i$ values are smaller than 2.14, the critical value for 8 sites (Hosking and Wallis 1997), so no site can be rejected as discordant by the criterion for discordancy (Table 1).
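For illustration, the discordancy measure can be sketched as follows, assuming the Hosking and Wallis form $D_i = \frac{N}{3}(u_i-\bar{u})^T A^{-1}(u_i-\bar{u})$, where $u_i$ is the vector of at-site L-CV, L-skewness and L-kurtosis and $A=\sum_i (u_i-\bar{u})(u_i-\bar{u})^T$. The helper names and the sample ratios are hypothetical; the at-site values of this study are not reproduced here.

```python
def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, 4):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):  # back substitution
        s = aug[r][3] - sum(aug[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / aug[r][r]
    return x


def discordancy(ratios):
    """Discordancy measure D_i for each site.

    `ratios` holds one (L-CV, L-skewness, L-kurtosis) triple per site.
    """
    n = len(ratios)
    mean = [sum(u[k] for u in ratios) / n for k in range(3)]
    dev = [[u[k] - mean[k] for k in range(3)] for u in ratios]
    # A = sum over sites of the outer products of the deviations.
    a = [[sum(d[p] * d[q] for d in dev) for q in range(3)] for p in range(3)]
    return [
        n / 3.0 * sum(di * yi for di, yi in zip(d, solve3(a, d)))
        for d in dev
    ]


if __name__ == "__main__":
    # Hypothetical L-moment ratios for 8 sites.
    sites = [
        (0.40, 0.20, 0.15), (0.45, 0.30, 0.10), (0.38, 0.10, 0.20),
        (0.50, 0.25, 0.18), (0.42, 0.35, 0.05), (0.36, 0.15, 0.12),
        (0.48, 0.05, 0.22), (0.41, 0.28, 0.16),
    ]
    for d_i in discordancy(sites):
        print(f"{d_i:.2f}", "discordant" if d_i >= 2.14 else "ok")
```

By construction the $D_i$ values average to 1 over the sites, which gives a quick check on any implementation.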

Zeros are discarded at sites with some zero observations; then the data at each site are made non-dimensional by dividing by the mean.

L-moments of the data at each site are computed. L-variation coefficient L-CV and L-skewness coefficient L-CS of the at-site data are plotted in Fig.2. Weighted average point is also shown in the figure, weights being proportional to record lengths. L-CS versus L-CV relationships of the three probability distributions are shown in Fig.2. It is seen that the curve of LN2 is far from most of the data points, whereas W2 and P2 are almost equally close to the regional point. For this reason LN2 distribution is not considered in the rest of the study.
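The data preparation and the regional averaging described above can be sketched as follows. The function names are hypothetical, and the sketch assumes the mean is taken over the non-zero flows after the zeros are discarded; which record length (with or without zero years) serves as the weight is left to the caller.

```python
def normalize_nonzero(flows):
    """Discard zero flows and divide the remaining flows by their mean."""
    positive = [q for q in flows if q > 0]
    mean = sum(positive) / len(positive)
    return [q / mean for q in positive]


def regional_average(site_values, record_lengths):
    """Average of an at-site statistic with weights proportional to record length."""
    total = sum(record_lengths)
    return sum(n * v for v, n in zip(site_values, record_lengths)) / total


if __name__ == "__main__":
    # Hypothetical at-site L-CV values and record lengths (years).
    l_cv = [0.40, 0.45, 0.38]
    years = [17, 33, 25]
    print(round(regional_average(l_cv, years), 3))
```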

Next, LLm-CV and LLm-CS values are computed from the at-site data for m=1, 2, 3 and 4 (LLm-CV is the LLm-variation coefficient and LLm-CS is the LLm-skewness coefficient) and plotted in Figs. 3-6, respectively. In these figures, the regional weighted average point and the curves of the W2 and P2 distributions are also shown.

It is seen from these figures that for m≥2 the data are better represented by the P2 distribution and the regional point is almost on the curve of this distribution. In light of the LLm-moment diagrams, the P2 distribution seems to fit the data better than W2 for higher values of m.

Parameters of the regional W2 and P2 distributions are estimated using Eqs. (6) and (7) on the basis of the regional weighted average LLm-moments for m=0, 1, 2, 3 and 4. Results are given in Table 4. Parameter values do not change significantly for m≥2.

TABLE 4: Parameters of the distributions estimated by LLm-moments for various values of m

Distribution  Parameter  m=0   m=1   m=2   m=3   m=4
Weibull       k          1.32  1.22  1.15  1.10  1.07
Weibull       a          1.09  1.11  1.17  1.23  1.29
Power         c          0.72  0.89  0.94  0.95  0.96
Power         x0         2.38  1.97  1.84  1.80  1.80

The Power distribution has an upper bound $x_0$. At some stations there are observations above this value. In such cases these points are omitted in testing the goodness of fit of P2. The number of such points, $N_1$, is shown in Table 5. $N_1$ in general increases with m because of the decrease of $x_0$ and the increase of c.

Fig. 3: LLm-CV and LLm-CS values of the at-site data for m=1.

Fig. 4: LLm-CV and LLm-CS values of the at-site data for m=2.

Fig. 5: LLm-CV and LLm-CS values of the at-site data for m=3.

Fig. 6: LLm-CV and LLm-CS values of the at-site data for m=4.

[Figs. 4-6 plot LLm-CV against LLm-CS, with curves for the Power and Weibull 2 distributions and a marker for the regional value.]

TABLE 5: Number of observations above the upper bound x0 of P2 for various values of m

Station no.  m=0  m=1  m=2  m=3  m=4
501          0    2    2    2    2
509          0    0    1    1    1
510          0    3    4    4    4
514          3    4    5    5    5
515          3    5    6    6    6
523          2    3    3    3    3
524          0    0    0    0    1
525          2    2    2    3    3

Goodness of fit of the regional W2 and P2 distributions to the at-site data is checked by the PPCC test. For the W2 distribution critical values of the PPCC test are taken from Vogel and Kroll (1989) and for the P2 distribution from Onoz and Bayazit (1998). Table 6 shows the highest level of significance α at which the distribution passes the test. As stated before, PPCC statistics are independent of the method of parameter estimation for the W2 distribution when a Weibull-type plotting position is used and therefore do not vary with m.

Goodness of fit of the P2 distribution generally improves with the increase of m (with the exception of sites 515 and 525) but does not vary when m exceeds 2. P2 with parameters estimated by LL2-moments has a better fit than W2 at 3 sites but the fit is poorer at 2 sites. In this case the PPCC test is not very helpful in deciding which regional distribution has the best overall fit.

Goodness of fit of the distributions is also checked by the method described before, which is based on the differences between observed and predicted quantiles. This method makes it possible to evaluate the goodness of fit for the region as a whole. Bias and rmse of the differences are given in Table 7.

TABLE 6: Levels of significance at which the distributions pass the PPCC test

Station no.  W2    P2, m=0  P2, m=1  P2, m=2  P2, m=3  P2, m=4
501          0.10  0.10     0.10     0.10     0.10     0.10
509          0.10  0.25     0.50     0.50     0.50     0.50
510          0.25  0.25     0.25     0.50     0.50     0.50
514          0.25  0.01     0.01     0.25     0.25     0.25
515          0.25  0.25     0.05     0.01     0.01     0.01
523          0.25  0.50     0.50     0.50     0.50     0.50
524          0.10  0.05     0.10     0.10     0.10     0.10
525          0.25  0.25     0.10     0.10     0.10     0.10

TABLE 7: Bias and rmse of the differences between observed and predicted quantiles

Distribution  m  Bias    Rmse
W2            0   0.016  0.035
W2            1   0.016  0.036
W2            2  -0.006  0.052
W2            3  -0.034  0.087
W2            4  -0.060  0.121
P2            0  -0.018  0.102
P2            1  -0.014  0.037
P2            2  -0.002  0.018
P2            3   0.004  0.019
P2            4  -0.006  0.026

It is interesting that the bias (in absolute value) and rmse of the differences based on the P2 distribution decrease with the increase of m up to m=2 and then start to increase again. This also holds for the bias of the differences based on W2, but their rmse increases with m. The P2 distribution with parameters based on LL2-moments has both the minimum bias and the minimum rmse.

On the basis of these results the P2 distribution with parameters estimated by LL2-moments is chosen as the regional probability distribution:

$$F(x) = \left(\frac{x}{1.84}\right)^{0.94}, \qquad x \le 1.84 \qquad (8)$$

Here x denotes the annual 7-day minimum flow made non-dimensional by dividing by the at-site mean. It should be noted that this distribution is fitted to the data below $x_0$=1.84. At a site where some of the observations are above this value ($x > x_0 = 1.84$), F(x) corresponding to a certain x should be computed as follows. In this case, Eq. (8) gives the conditional distribution $F(x \mid x \le x_0)$. By the rule of conditional probability F(x) can be computed as

$$F(x) = \frac{N-N_1}{N}\,F(x \mid x \le x_0) \qquad (9)$$

where $N_1$ is the number of observations above $x_0$.
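As a worked illustration of Eqs. (8) and (9), the sketch below evaluates the regional distribution and rescales it for a site with observations above the upper bound. The function names are hypothetical, and the record length used in the example is made up, since the station characteristics are not reproduced here.

```python
def p2_cdf(x, c=0.94, x0=1.84):
    """Eq. (8): regional P2 distribution, valid for 0 <= x <= x0."""
    if not 0 <= x <= x0:
        raise ValueError("Eq. (8) is defined only for 0 <= x <= x0")
    return (x / x0) ** c


def cdf_with_exceedances(x, n_total, n_above, c=0.94, x0=1.84):
    """Eq. (9): F(x) = (N - N1) / N * F(x | x <= x0)."""
    return (n_total - n_above) / n_total * p2_cdf(x, c, x0)


if __name__ == "__main__":
    # Hypothetical site: N = 25 years of record, N1 = 4 values above x0.
    for x in (0.2, 0.5, 1.0, 1.84):
        print(x, round(cdf_with_exceedances(x, n_total=25, n_above=4), 3))
```

At x = x0 the result equals (N - N1)/N, the fraction of the record lying at or below the upper bound.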

On the other hand, at sites with zero observations Eq. (8) or Eq. (9) gives the conditional distribution $F(x \mid x > 0)$. F(x) can be determined as

$$F(x) = \frac{N_0}{N} + \frac{N-N_0}{N}\,F(x \mid x > 0) \qquad (10)$$

CONCLUSIONS

Regional analysis of low flows can be performed following the steps explained in the paper. Having identified a homogeneous region and discarded the discordant sites, a suitable probability distribution function is fitted to the data. Parameters of the distribution function are estimated by the regional weighted average L-moments or LL-moments. LL-moments give greater weight to the lower tail of the data.

The methodology is applied to the Gediz river basin in Turkey. The Power distribution with parameters estimated by LLm-moments with m=2 has the best fit. It is discussed how the data above the upper bound and zero values must be treated in estimating the low flow quantiles.

In this paper, only 2-parameter probability distributions are considered. Similar studies can be carried out using 3-parameter distributions.

REFERENCES

DURRANS, S.R., and S. TOMIC, Regionalization of low flow frequency estimates: an Alabama case study, Water Resour. Bull., 32(1), 23-37, 1996.

GUMBEL, E.J., Statistics of Extremes, 375 pp., Columbia University Press, New York and London, 1958.

GUPTA, V.K., and E. WAYMIRE, Multiscaling properties of spatial rainfall and river flow distributions, Jour. of Geophysical Res., 95(D3), 1999-2009, 1990.

GUPTA, V.K., O.J. MESA, and D.R. DAWDY, Multiscaling theory of flood peaks: Regional quantile analysis, Water Resour. Res., 30(12), 3405-3421, 1994.

GUSTARD, A., and R. GROSS, Regional low flow studies, Chap. 5 in Flow Regimes from Experimental and Network Data (Friend), Institute of Hydrology, Wallingford, 1989.

HOSKING, J.R.M., and J.R. WALLIS, Some statistics useful in regional frequency analysis, Water Resour. Res., 29(2), 271-281, 1993.

HOSKING, J.R.M., and J.R. WALLIS, Regional frequency analysis, IBM Research Division, T.J. Watson Research Center, Cambridge University Press, 1997.

ONOZ, B., and M. BAYAZIT, Power distribution for low stream flows, Second National Hydrology Congress, 1998.

STEDINGER, J.R., R.M. VOGEL, and E. FOUFOULA-GEORGIOU, Frequency analysis of extreme events, Chap. 18 in Handbook of Hydrology, edited by D.R. Maidment, pp. 18.1-18.66, McGraw Hill, 1993.

VOGEL, R.M., and C.N. KROLL, Low-flow frequency analysis using probability-plot correlation coefficients, J. Water Resour. Plann. Manage. Div. Am. Soc. Civ. Eng., 115(3), 338-357, 1989.

VUKMIROVIC, V., J. MALISIC, and D. PAVLOVIC, Some aspects of regional statistical analysis of low flows, Friend Low Flows Expert Meeting, Belgrade, 1998.