Modelling the stage-discharge relationship using models with a curvilinear asymptote

4.1 Introduction

In this chapter, we consider the problem of modelling the stage-discharge relationship in hydrology using nonlinear parametric growth models. Using six data sets taken from the web site of the U.S. Geological Survey, we carry out a comparative study of several nonlinear curve-fitting models. Eight models in total are used in this study. In addition to the exponential model studied in the third chapter, we propose the use of several nonlinear models that allow a global approximation of curves having one or more inflection points together with a horizontal asymptote (Chapter 1) or a linear or curvilinear asymptote (Chapters 2 and 3). The new models proposed are generalisations of the models studied in the second chapter. Several comparison criteria are used. We also compare the fit of each of these models with the rating curve (the curve of the stage-discharge relationship) developed by the U.S. Geological Survey for the six data sets. The results show the usefulness and the performance of these models on these hydrological data sets.

4.2 Result

The text that follows is the article entitled Least squares fitting of the Stage-Discharge relationship using growth models with curvilinear asymptotes, which has been accepted for publication in the Hydrological Sciences Journal. This article was written in collaboration with François Dubeau. A print version of this article will be published in late 2014.

Least squares fitting of the Stage-Discharge relationship

using smooth models with curvilinear asymptotes

Youness Mir and François Dubeau

Département de mathématiques, Université de Sherbrooke, 2500 Boulevard de l'Université, Sherbrooke (Qc), Canada, J1K 2R1

youness.mir@usherbrooke.ca francois.dubeau@usherbrooke.ca

Abstract: In this paper we consider the problem of modelling the stage-discharge relationship by curve fitting using the least squares method. Our basic idea is to present new models which are more flexible and have the ability to model phenomena with increasing or unchanging carrying capacity. The new models present a generalisation of some sigmoid smooth models commonly used in practice. They are characterized by a curvilinear asymptote and may have several inflection points. The use of these models on six real data sets collected from the U.S. Geological Survey's web site proves their performance and their ability to model hydrological phenomena.

Keywords: Hydrology, rating curve, smooth models, curvilinear asymptote, least squares method.

1 Introduction

In hydrology, establishing a reliable relationship between water depth and flow rate (the rating curve) in river and stream channels is of prime importance and has therefore attracted the attention of researchers for over a century. Although this relationship exists in any case and can be studied interchangeably (Braca (2008); Callede et al. (1997); Di Baldassarre and Claps (2011); Dymond and Ross (1982)), in hydrological and hydraulic studies the accuracy of the predicted stage and discharge values depends strongly on the reliability of this relationship as well as on other factors (Domeneghetti et al. (2012); Dymond and Ross (1982); Shrestha et al. (2007)). Several standard methods have been used to derive and improve this relationship, such as power and polynomial regression methods. Although the power model has been the most widely used in river hydraulics over the last decades (Callede et al. (2001); Parodi and Ferraris (2004); Petersen-Øverleir (2004, 2005, 2006)), several authors have shown that this approach often leads to inaccurate estimates of flows, mainly where the stage-discharge relationship does not cover the full range of flows (Schmidt and Yen (2001); Westphal et al. (1999); Wu and Yang (2008)). In addition to being limited to the range of measured data (Di Baldassarre and Claps (2011)), this law does not describe the relationship perfectly, especially when the raw data behave nonlinearly, namely in the presence of one or more inflection points. This change in concavity may be caused by factors such as the geology of the watershed and the morphology of the channel at the gauging station, which influence the high flows of the river. In such a situation, a more objective way to model this relationship is to use sums or piecewise combinations of power functions (Herschy (1995)).

Polynomial models of second or third degree are also commonly used (Braca (2008); Herschy (1995, 1999); McGinn and Chubak (2002); Sivapragasam and Muttil (2005)). Despite their ability to model data with inflection points, polynomial functions are not, in general, monotonically increasing. To extrapolate a stage-discharge relationship beyond the range of measured data, several numerical methods have been used, such as polynomial regression (Braca (2008)), Support Vector Machines (SVM) (Sivapragasam and Muttil (2005)), and Artificial Neural Networks (ANN) (Chalisgaonkar and Jain (2000); Deka and Chandramouli (2003); Habid and Meselhe (2006); Shrestha et al. (2007)). The last approach was first used to model stage-discharge relationships that exhibit hysteresis (Petersen-Øverleir (2006); Tawfik and Fahmy (1997)), a phenomenon that is not taken into account in the present study. Owing to the mathematical relationship existing between water depth and flow rate (Torsten et al. (2002)), many authors suggest the use of smooth models whose several parameters are optimized to produce a best fit (Sivapragasam and Muttil (2005); Torsten et al. (2002)).

In Dubeau et al. (2012) we made a comparative study of several models with sigmoid behaviour on real data. The problem with this approach is that the carrying capacity of all the models considered there becomes constant with time, which is unrealistic in practice, particularly during flood events. In other words, from the hydrologic perspective, as more and more water is delivered by a particular meteorological condition, river levels become progressively higher as discharge monotonically increases. Several authors have reformulated these models to accommodate phenomena with a varying (Meyer and Ausubel (1999); Meyer et al. (1999); Shepherd and Stojkov (2007)) or increasing (Dubeau and Mir (2013, 2014)) carrying capacity.

In this paper we investigate the use of some new mathematical models to derive the stage-discharge relationship of some real data sets by means of the least squares method. As the collected data sets suggest curves having sigmoid behaviour with curvilinear asymptotes, we adjust the non-linear models presented in (Dubeau and Mir (2011); Dubeau et al. (2011, 2012)) and modify their natural horizontal asymptote to accommodate data having curvilinear or oblique asymptotes. In addition to being able to model data with or without inflection points, the resulting models have a more flexible shape and could be an alternative for modelling the stage-discharge relationship over the range of data studied and for extrapolating it beyond the largest observed data. The objectives of the present study are: (1) to indicate the reliability of the modified models and their estimates, (2) to evaluate their performance on stage-discharge data sets, and (3) to compare model fits by means of some known comparison criteria. The numerical results reported demonstrate the efficiency of all the proposed models on the six real data sets considered.

2 Data

The study presented in this paper uses data sets obtained from six gauging stations located in three states of the south and southeast region of the United States of America: Texas, Florida, and Alabama. Some characteristics of these data sets are summarized in Table 1, namely the station names and abbreviations, the river basins, the number of gauging measurements ($K$), and the minimum and maximum annual flows. For each data set, the rating table and the associated rating curve drawn on logarithmic plotting paper are also available. The gauging measurements $\{(h_k, Q_k)\}_{k=1}^{K}$, in meters (m) for gauge height $h_k$ and in cubic meters per second (m$^3$/s) for discharge $Q_k$, are daily and were taken between October 2011 and October 2012. The rating curve is a continuous piecewise linear function described by a set of points $\{(x_i, y_i)\}_{i=0}^{N}$ summarized in the rating table, with $N$ much greater than $K$, such that the abscissas $x_i$ are equally spaced, $\Delta x = x_{i+1} - x_i$ for $i = 0, \dots, N-1$, and by a piecewise linear interpolating function $y(x)$ such that $y(x_i) = y_i$ for $i = 0, \dots, N$. This rating curve is used to predict the value $\hat{Q}_k = y(h_k)$ that approximates $Q_k$ ($k = 1, \dots, K$). The data sets $\{(h_k, Q_k)\}_{k=1}^{K}$ and their corresponding stage-discharge rating curves are plotted in Figures 2(a)-7(a) (stage on the x-axis versus discharge on the y-axis). The six data sets used in this study were selected randomly from the automated USGS data base (the real time stream flow map). Some daily data are not yet approved for publication and may be subject to revision. In addition, according to the USGS web site, "the stage discharge ratings (associated to these data sets) are developed from a graphical analysis of current-meter discharge measurements made over a range of stages and discharges. These relations change over time as the channel features that control the relation between stage and discharge vary". All this information was collected from the U.S. Geological Survey (USGS) Water Resources web site and is available at http://waterwatch.usgs.gov.
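The rating-curve prediction $\hat{Q}_k = y(h_k)$ can be illustrated with a minimal Python sketch, assuming the rating table is stored as a first abscissa, a constant spacing, and a list of ordinates. The name `rating_curve` and the table values are hypothetical and serve only as an illustration. Outside the table range the sketch extends the end segments linearly, which is an assumption, not something stated in the source.

```python
def rating_curve(x0, dx, ys):
    """Return y(x), the piecewise linear interpolant through (x0 + i*dx, ys[i])."""
    n = len(ys) - 1  # N in the text: index of the last rating-table point

    def y(x):
        # Position of x on the equally spaced grid, in units of dx.
        s = (x - x0) / dx
        # Index of the segment containing x, clamped to the table (assumption).
        i = min(max(int(s), 0), n - 1)
        w = s - i  # local coordinate, in [0, 1] when x lies inside the table
        return (1.0 - w) * ys[i] + w * ys[i + 1]

    return y


# Hypothetical rating table: abscissas 1.0, 1.5, 2.0 (m), ordinates in m^3/s.
y = rating_curve(x0=1.0, dx=0.5, ys=[0.0, 2.0, 6.0])

# Predicted discharges for three hypothetical gauge heights h_k.
predicted = [y(h) for h in (1.25, 1.5, 2.0)]
```

Because the abscissas are equally spaced, the segment index follows directly from a division, so no search through the table is needed.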

Table 1: Streamflow gauging stations.

| Station name | Abbreviation | River basin | State | Gaugings ($K$) | Min flow (m$^3$/s) | Max flow (m$^3$/s) | Figure |
|---|---|---|---|---|---|---|---|
| Paint Rock River Near Woodville | PRRNW | Tennessee | Alabama | 366 | 0.3398 | 336.97 | 2(a) |
| Chattooga River Above Gaylesville | CRAG | Coosa | Alabama | 436 | 3.3697 | 148.10 | 3(a) |
| Withlacoochee River Near Pinetta | WRNP | Withlacoochee | Florida | 365 | 2.1238 | 148.95 | 4(a) |
| Ochlockonee River Near Bloxham | ORNB | Ochlockonee | Florida | 454 | 1.4158 | 139.89 | 5(b) |
| Pease River Near Vernon | PRNV | Red | Texas | 366 | 0 | 14.583 | 6(a) |
| Clear Creek Near Sanger | CCNS | Trinity | Texas | 466 | 0 | 126.58 | 7(a) |

3 Models

The basic models $f(t)$, together with the parameter values for which they give an increasing curve with one inflection point, are listed in Table 2. The table also gives the lower bound $T_0$ of their domain of definition. The asymptote of every basic model is given by $y = a$. To obtain a smooth function with a curvilinear asymptote, we set the multiplicative constant $a$ equal to 1 in the basic growth function $f(t)$ and multiply the resulting model equation by an increasing curvilinear function given by $m(t) = p t^{\beta} + q$, with $p \ge 0$, $\beta > 0$, and $q > 0$, to obtain the modified model

$$F(t; \theta) = m(t)\, f(t). \qquad (3.1)$$

This function is well defined for $t \in (\max\{0, T_0\}, +\infty)$. Note that the case $\beta > 1$ gives an increasing convex asymptote with no inflection point or with an even number of inflection points.

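Table 2: Parameter values of the basic smooth models.

| Model | Function | Parameter vector $\zeta$ | $T_0$ |
|---|---|---|---|
| Michaelis-Menten | $f_M(t) = a\,\dfrac{\omega_0 + t^c}{k^c + t^c}$ | $a > 0$, $k > 0$, $c > 1$, $1 > \omega_0 \ge 0$ | $0$ |
| Richards(1) | $f_R(t) = a\left(1 - b e^{-ct}\right)^m$ | $a > 0$, $b > 0$, $c > 0$, $m > 1$ | $\ln(b)/c$ |
| Richards(2) | $f_R(t) = a\left(1 - b e^{-ct}\right)^m$ | $a > 0$, $b < 0$, $c > 0$, $m < 0$ | $-\infty$ |
| Hossfeld | $f_H(t) = \dfrac{a t^c}{k^c + t^c}$ | $a > 0$, $k > 0$, $c > 1$ | $0$ |
| Gompertz | $f_G(t) = a e^{-b e^{-ct}}$ | $a > 0$, $b > 0$, $c > 0$ | $-\infty$ |
| Logistic | $f_L(t) = \dfrac{a}{1 + b e^{-ct}}$ | $a > 0$, $b > 0$, $c > 0$ | $-\infty$ |
| Bridge | $f_B(t) = a\left(1 - e^{-b t^c}\right)$ | $a > 0$, $c > 1$, $b > 0$ | $0$ |
| Exponential | $f_E(t) = a e^{-b/t^c}$ | $a > 0$, $b > 0$, $c \ge 1$ | $0$ |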

The case $0 < \beta < 1$ gives an increasing concave asymptote with an odd or even number of inflection points; the parity of the number of inflection points depends on the values of the parameters $p$ and $\beta$ of the function $m(t)$. The limiting case $\beta = 1$ generates an oblique asymptote with one inflection point (Dubeau and Mir (2013)). The case $p = 0$ retrieves the natural horizontal asymptote of the basic model $f(t)$. In all the cases discussed above, the asymptote is given by $y = (p t^{\beta} + q) f_+$, where $f_+ = \lim_{t \to +\infty} f(t) = 1$ (because we have set $a = 1$). Note that the modified Gompertz model is a limiting case of the modified Richards(1) and Richards(2) models as $m_R \to \pm\infty$, $b_R \to 0^{\pm}$, and $m_R b_R \to b_G$ respectively, and the modified Logistic model is a particular case of the modified Richards(2) model with the parameter $m$ equal to $-1$ and $b$ replaced by $-b$. A special case of the modified Michaelis-Menten (or Morgan) model (with $\omega_0 = 0$) is the modified Hossfeld model (Zeide (1993)). The basic Bridge model is the well known Weibull model (Khamis et al. (2005); Ismail et al. (2003); Kuhi et al. (2010); Zeide (1993)). The graphs of $f$ and $F$ with $0 < \beta < 1$, $\beta = 1$ and $\beta > 1$ for all these models look like the graphs given in Figure 1.
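The construction $F(t;\theta) = m(t) f(t)$ can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `hossfeld`, `gompertz`, and `modified` and all numerical parameter values are hypothetical, and only two of the eight basic models are shown (with $a = 1$, as in the text).

```python
import math


def hossfeld(t, k, c):
    """Basic Hossfeld model with a = 1: t^c / (k^c + t^c), for t > 0."""
    return t**c / (k**c + t**c)


def gompertz(t, b, c):
    """Basic Gompertz model with a = 1: exp(-b * exp(-c * t))."""
    return math.exp(-b * math.exp(-c * t))


def modified(f, p, beta, q):
    """Return F(t) = m(t) * f(t), where m(t) = p * t^beta + q."""
    def F(t):
        return (p * t**beta + q) * f(t)
    return F


# Hypothetical parameter values, chosen only for illustration.
F_oblique = modified(lambda t: gompertz(t, b=5.0, c=1.0), p=0.5, beta=1.0, q=2.0)
F_horizontal = modified(lambda t: gompertz(t, b=5.0, c=1.0), p=0.0, beta=1.0, q=2.0)

# Far to the right, f(t) is essentially 1, so F(t) follows its asymptote m(t).
value_on_oblique = F_oblique(50.0)        # close to m(50) = 0.5 * 50 + 2
value_on_horizontal = F_horizontal(50.0)  # close to q = 2, the case p = 0
```

Setting `p=0.0` recovers the horizontal asymptote of the basic model, scaled by `q`, which mirrors the remark on the case $p = 0$.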

4 Material and methods

The data sets $\{(h_k, Q_k)\}_{k=1}^{K}$ and the eight modified models described in Section 3 were used to investigate the stage-discharge relationship by using the least squares estimator. The parameter vector $\theta = (p, \beta, q, \zeta)$ of each modified model $F(t; \theta)$ is estimated by minimizing the least squares criterion

$$Z(\theta) = \frac{1}{2} \sum_{k=1}^{K} \left[F(h_k; \theta) - Q_k\right]^2 \qquad (4.1)$$

over the feasible set $\Theta$ of parameter vectors $\theta = (p, \beta, q, \zeta)$, where $p \ge 0$, $\beta > 0$, $q \ge 0$, and the parameter vector $\zeta$ is as defined for each basic model in Table 2. We obtain

$$Z^* = Z(\theta^*) = \min_{\theta \in \Theta} Z(\theta), \quad \text{and} \quad \theta^* = \operatorname*{argmin}_{\theta \in \Theta} Z(\theta). \qquad (4.2)$$
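As a concrete reading of criterion (4.1), the following Python sketch evaluates $Z(\theta)$ for a modified Gompertz model on synthetic data. The function names, the ordering of the parameter vector, and the data are hypothetical; the sketch only evaluates the criterion and does not reproduce the optimization itself.

```python
import math


def modified_gompertz(h, theta):
    """F(h; theta) = (p*h^beta + q) * exp(-b*exp(-c*h)), theta = (p, beta, q, b, c)."""
    p, beta, q, b, c = theta
    return (p * h**beta + q) * math.exp(-b * math.exp(-c * h))


def in_feasible_set(theta):
    """Bounds from the text (p >= 0, beta > 0, q >= 0) plus Gompertz b > 0, c > 0."""
    p, beta, q, b, c = theta
    return p >= 0 and beta > 0 and q >= 0 and b > 0 and c > 0


def Z(theta, data, model=modified_gompertz):
    """Least squares criterion (4.1): half the sum of squared residuals."""
    return 0.5 * sum((model(h, theta) - Q) ** 2 for h, Q in data)


# Hypothetical noise-free data generated from a known parameter vector.
theta_true = (0.5, 1.2, 2.0, 4.0, 1.5)
stages = [0.5 + 0.25 * k for k in range(20)]
data = [(h, modified_gompertz(h, theta_true)) for h in stages]

z_true = Z(theta_true, data)                # zero by construction
z_off = Z((0.6, 1.2, 2.0, 4.0, 1.5), data)  # perturbed p, so Z is positive
```

In Python, `scipy.optimize.least_squares` with `method="trf"` and `bounds` is one bound-constrained trust-region reflective solver that could minimize such a criterion. It is mentioned here as a possible substitute, not as the tool used in the study.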

Figure 1: Graphic representation of the basic model $f(t)$ with its asymptote $y = 1$ and the modified model $F(t)$ with its asymptote $y = p t^{\beta} + q$ for $0 < \beta < 1$ (concave asymptote), $\beta = 1$ (oblique asymptote) and $\beta > 1$ (convex asymptote), and their inflection points.

The MATLAB constrained least squares function "lsqnonlin" is used to solve these problems. This function uses the bound-constrained trust-region-reflective algorithm, which is based on the interior-reflective Newton method (Coleman and Li (1994, 1996)). For each data set and each model, we tested 10 different starting points $\theta_0 = (p_0, \beta_0, q_0, \zeta_0)$ drawn at random from continuous uniform distributions over a large region of the parameter space. All the initial values that produce converging sequences lead to the same optimal solution. The MATLAB function "fsolve" was used to calculate, when it exists, an approximation $T^{**}$ of the x-coordinate of the inflection point $(T^*, F^*)$ of each modified model $F$; the approximate ordinate is given by $F^{**} = F(T^{**})$. More than one inflection point might exist.
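Locating inflection points amounts to solving $F''(t) = 0$. The Python sketch below swaps in a different technique from the one named in the text: instead of "fsolve", it scans a grid for sign changes of a finite-difference second derivative and refines each one by bisection. The function names, step sizes, and the test model are assumptions made for illustration.

```python
import math


def second_derivative(F, t, h=1e-3):
    """Central finite-difference approximation of F''(t)."""
    return (F(t + h) - 2.0 * F(t) + F(t - h)) / (h * h)


def inflection_points(F, lo, hi, steps=200, tol=1e-8):
    """Approximate the roots of F'' on [lo, hi] by grid scan plus bisection."""
    roots = []
    width = (hi - lo) / steps
    for i in range(steps):
        a, b = lo + i * width, lo + (i + 1) * width
        da, db = second_derivative(F, a), second_derivative(F, b)
        if da * db >= 0.0:
            continue  # no sign change detected on this cell
        while b - a > tol:
            mid = 0.5 * (a + b)
            dm = second_derivative(F, mid)
            if da * dm <= 0.0:
                b = mid
            else:
                a, da = mid, dm
        roots.append(0.5 * (a + b))
    return roots


def make_modified_logistic(p, beta, q, b, c):
    """F(t) = (p*t^beta + q) / (1 + b*exp(-c*t)), for t > 0."""
    def F(t):
        return (p * t**beta + q) / (1.0 + b * math.exp(-c * t))
    return F


# Check on a case with a known answer: with p = 0 and q = 1 the model is the
# basic logistic curve, whose single inflection point is at t = ln(b) / c.
F = make_modified_logistic(p=0.0, beta=1.0, q=1.0, b=math.exp(2.0), c=1.0)
T_approx = inflection_points(F, lo=0.5, hi=7.5)
F_approx = [F(t) for t in T_approx]  # approximate ordinates F(T)
```

A grid scan can miss two inflection points that fall inside the same cell, so the number of steps would have to be chosen with the expected spacing of the inflection points in mind.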

Table 3: River name abbreviations and the sum of squared errors between river data and the rating curve.

| River | PRRNW | CRAG | WRNP | ORNB | PRNV | CCNS |
|---|---|---|---|---|---|---|
| SSE | 1101.2 | 12.601 | 5.2607 | 154.18 | 4.6955 | 24.035 |

To compare the least squares fits given by our modified models with the rating curves provided on the USGS Water Resources web site, we have indicated in Table 3 the sum of squared errors (SSE) between the data $\{(h_k, Q_k)\}_{k=1}^{K}$ and the corresponding values $\{(h_k, \hat{Q}_k = y(h_k))\}_{k=1}^{K}$ obtained from the rating curve. The SSE is given by the formula

$$SSE = \frac{1}{2} \sum_{k=1}^{K} \left(\hat{Q}_k - Q_k\right)^2. \qquad (4.3)$$

This value will be compared to the optimal value $Z^*$ of (4.1).

Comparison and ranking are made between the different model fits and are based on the Root Mean Square Error ($RMSE$) defined by the formula

$$RMSE = \sqrt{\frac{Z^*}{K - n}}, \qquad (4.4)$$

and the Akaike information criterion ($AIC$) defined by (Akaike (1974))

$$AIC = 2K \ln(RMSE) + 2n. \qquad (4.5)$$

These two criteria are related and take into account the degrees of freedom through the number of observations $K$ and the number of estimated parameters $n$. A smaller value of the $RMSE$ and/or $AIC$ criterion indicates a better fit when comparing models. The goodness of fit was assessed with these two criteria as well as by a visual evaluation of the optimal curves given by the model fits on the data sets.
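The two criteria are simple functions of $Z^*$, $K$, and $n$. The Python sketch below reads (4.4) as $RMSE = \sqrt{Z^*/(K-n)}$, which is an assumption about the formula, and uses hypothetical values for $Z^*$, $K$, and $n$.

```python
import math


def rmse(z_star, K, n):
    """Root mean square error, read here as sqrt(Z* / (K - n))."""
    return math.sqrt(z_star / (K - n))


def aic(z_star, K, n):
    """Akaike information criterion (4.5): 2*K*ln(RMSE) + 2*n."""
    return 2.0 * K * math.log(rmse(z_star, K, n)) + 2.0 * n


# Hypothetical comparison: two fits reach the same optimal value Z* on
# K = 366 observations, one with n = 5 parameters, the other with n = 6.
z_star, K = 50.0, 366
aic_small = aic(z_star, K, n=5)
aic_large = aic(z_star, K, n=6)
```

With equal $Z^*$, the model with more parameters has both the larger $RMSE$ and the larger $AIC$, so both criteria rank the smaller model first.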

To evaluate the reliability of our model fits, we computed a relative Bootstrap confidence interval (BCI) (Efron (1979, 1985)) for the parameters $\theta = (p, \beta, q, \zeta)$, and we drew a Bootstrap confidence band (BCB) around the graph of each optimal modified model $F(t; \theta^*)$ on each data set. The widths of the BCI and the BCB give more information on the variability of the parameters of the modified models. In general, the narrower the BCI and/or the BCB, the higher the data density and the lower the variability and uncertainty in the parameter values. To measure the linear relationship between each pair of parameters of each modified model, we calculated their pairwise correlation. A positive correlation means that the values of the two parameters increase or decrease together. A negative correlation means that if the value of one of them increases, the value of the other decreases, and conversely.

To construct a BCI we proceed as follows. We resample the original data set $\{(h_k, Q_k)\}_{k=1}^{K}$ uniformly with replacement to generate a Bootstrap sample $\{(h_k^*, Q_k^*)\}_{k=1}^{K}$, in which some pairs $(h_k, Q_k)$ may be repeated several times and other pairs may not appear. We treat the resulting Bootstrap sample as a new data set and fit the modified model $F(t; \theta)$ to it, obtaining a new vector of parameter estimates by minimizing the least squares criterion (4.1). We repeat this process a large number of times $B$; here we took $B = 1000$. Let $\{\theta_b^* = (p_b^*, \beta_b^*, q_b^*, \zeta_b^*)\}_{b=1}^{B}$ denote the set of replicate values of the parameters $\theta = (p, \beta, q, \zeta)$. To construct a 95% percentile BCI for a given parameter, we sort its Bootstrap replicate values in ascending order; the lower 95% percentile confidence limit is then the $B \cdot \alpha/2 = 25$-th ranked element and the upper 95% percentile confidence limit is the $B \cdot (1 - \alpha/2) = 975$-th ranked element, where $\alpha = 1 - 0.95 = 0.05$. To obtain the relative lower and upper Bootstrap confidence limits of this parameter, we divide these confidence limits by the optimal value of the parameter given by the fit of the modified model $F(t; \theta)$ to the original data set $\{(h_k, Q_k)\}_{k=1}^{K}$.

To draw the lower and upper BCB for the graph of the optimal modified model $F(h; \theta^*)$, we consider each abscissa $x_i$ of the rating curve described in Section 2 and the set of replicate optimal values $\{F(x_i; \theta_b^*)\}_{b=1}^{B}$ to obtain $y_i^l$ and $y_i^u$, which are respectively the 25-th and the 975-th ranked elements of the replicate optimal values. The lower and upper confidence bands are the continuous piecewise linear functions passing through the points $\{(x_i, y_i^l)\}_{i=0}^{N}$ and $\{(x_i, y_i^u)\}_{i=0}^{N}$, respectively.
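The percentile Bootstrap steps can be sketched in Python as follows. To keep the example self-contained, the refit of the modified model is replaced by a stand-in estimator, the closed-form least squares slope of a line through the origin. That substitution, the function names, and the synthetic data are all assumptions made for illustration.

```python
import random


def slope_through_origin(sample):
    """Stand-in estimator: least squares slope of Q = s * h (closed form)."""
    return sum(h * Q for h, Q in sample) / sum(h * h for h, _ in sample)


def percentile_bci(data, fit, B=1000, alpha=0.05, seed=0):
    """Percentile Bootstrap confidence limits for a scalar estimate."""
    rng = random.Random(seed)
    K = len(data)
    replicates = []
    for _ in range(B):
        # Resample K pairs uniformly with replacement, then refit.
        sample = [data[rng.randrange(K)] for _ in range(K)]
        replicates.append(fit(sample))
    replicates.sort()
    lower = replicates[round(B * alpha / 2) - 1]        # 25th ranked if B = 1000
    upper = replicates[round(B * (1 - alpha / 2)) - 1]  # 975th ranked if B = 1000
    return lower, upper


# Hypothetical data: Q is close to 2*h, with a small alternating perturbation.
data = [(float(k), 2.0 * k + 0.1 * (-1) ** k) for k in range(1, 21)]

estimate = slope_through_origin(data)
lower, upper = percentile_bci(data, slope_through_origin)

# Relative limits: confidence limits divided by the estimate on the original data.
relative_lower, relative_upper = lower / estimate, upper / estimate
```

For a confidence band, the same ranking would be applied at each abscissa $x_i$ to the replicate values $F(x_i; \theta_b^*)$ instead of to a single parameter.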

Finally, we have also computed the values of the adjusted R squared ($R^2_{adj}$). This statistic is defined by the formula (Cameron and Windmeijer (1997); Tarald (1985)):

$$R^2_{adj} = 1 - \frac{(K - 1)\, Z^*}{(K - n)\, SST}, \qquad (4.6)$$

where $Z^*$ is given by (4.2), $SST$ is the sum of squared differences between each observation and the overall mean, $K$ is the number of gauging measurements, and $n$ is the number of parameters of the model. This criterion is frequently used in the context of linear regression to measure the linear relationship between two variables and is hence considered a goodness of fit measure. Since our models are non-linear, it is used here for information only.
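A short Python sketch of this statistic, reading (4.6) as $R^2_{adj} = 1 - (K-1)Z^*/((K-n)\,SST)$ with $Z^*$ taken exactly as defined in (4.1), including its factor $1/2$. The function name and the numerical values are hypothetical.

```python
def adjusted_r2(z_star, observed, n):
    """Adjusted R squared: 1 - (K - 1) * Z* / ((K - n) * SST)."""
    K = len(observed)
    mean = sum(observed) / K
    # SST: sum of squared differences between observations and their mean.
    sst = sum((q - mean) ** 2 for q in observed)
    return 1.0 - (K - 1) * z_star / ((K - n) * sst)


# Hypothetical discharges and optimal value Z* for a 3-parameter model.
observed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
r2_adj = adjusted_r2(z_star=2.1, observed=observed, n=3)
r2_perfect = adjusted_r2(z_star=0.0, observed=observed, n=3)
```

A perfect fit ($Z^* = 0$) gives a value of exactly 1, and the value decreases as $Z^*$ or the number of parameters grows.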

5 Numerical results and discussion

Because the Bridge and Logistic models gave poor least squares fits on the data sets studied here, numerical results for these two models are not included. Moreover, for all the data sets, the numerical results show that the best estimate of $\omega_0$ in the Michaelis-Menten model is close to zero, so that the optimal solutions $\theta^*$ and the optimal values of the Hossfeld ($\omega_0 = 0$) and Michaelis-Menten ($\omega_0 = 0$) models are close together. As these two models gave the same results on all the data studied, we have not added the numerical results of the fit of the Michaelis-Menten model.

We have included in Tables 4, 6, 8, 10, 12, and 14 the optimal solutions $\theta^* = (p^*, \beta^*, q^*, \zeta^*)$, the optimal values $Z^* = Z(\theta^*)$, an approximation $(T^{**}, F^{**})$ of the corresponding inflection points $(T^*, F^*)$, the lower and upper relative confidence limits of each parameter, and the pairwise correlation coefficients among the parameter estimates. The values and the corresponding ranking of the comparison criteria of the remaining models on each data set are summarized in Tables 5, 7, 9, 11, 13, and 15.

From these tables, we observe that all the remaining modified models fit the six data sets very well. Except on the WRNP and CCNS data sets, where the Hossfeld model gives an optimal value $Z^*$ far greater than that given by the other models, the differences between the optimal values $Z^*$ of the modified models fitted to PRRNW, CRAG, ORNB, and PRNV are rather small. Furthermore, on these four rivers, all the modified models appear to be an improvement over the corresponding rating curves obtained from the USGS Water Resources web site, because the largest optimal value $Z^*$ given by the modified models on each of these data sets is smaller than the sum of squared errors SSE reported in Table 3. On WRNP, the Richards(1) and Exponential models produced somewhat smaller optimal values $Z^*$ than the SSE produced by the rating curve. On CCNS, except for the Hossfeld model, whose optimal value $Z^*$ is significantly high, the remaining models gave slightly smaller optimal values than the SSE. A visual comparison of the graphs of the optimal curves and the rating curves of the data sets corroborates this fact. We also observe the same results for the modified Richards(2) and Gompertz models on all the data sets except PRNV, because $m_R^* b_R^* \approx b_G^*$. On PRNV, from Table 4, the Richards(1) and Richards(2) models give the same results because $m^*_{R(1)} b^*_{R(1)} \approx m^*_{R(2)} b^*_{R(2)}$. On ORNB and CCNS, none of the models has an inflection point and the fits remain convex over the whole range of data. On PRRNW, CRAG, WRNP and PRNV, all the modified models have two inflection points. Besides WRNP, on each data set where the inflection points occur, they occur in small intervals and are
