HAL Id: hal-02793398
https://hal.inrae.fr/hal-02793398
Submitted on 5 Jun 2020
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Practical cases and issues related to model fitting
Laurent Saint-André, Gael Sola, Matieu Henry, Nicolas Picard
To cite this version:
Laurent Saint-André, Gael Sola, Matieu Henry, Nicolas Picard. Practical cases and issues related to model fitting. Training Workshop on Tree Allometric Equations, May 2014, Colombo, Sri Lanka. pp.41 slides. ⟨hal-02793398⟩
Practical cases and issues related to model fitting

Dr. Laurent Saint-André, Gael Sola, Matieu Henry, Nicolas Picard
Step by Step

• Exploratory stage: getting a model for each compartment and each stratum (local model)
- What variable is to be used as input data? Or what combination of variables is to be used as input data?
- What is the form of the relationship with each of these variables?
• Aggregation stage: getting a model for each compartment, all strata pooled together (global model)
- What are the relationships between the parameters of the local models and the strata characteristics?
- What is the form of this relationship for each parameter?
• Fitting of the complete model: one system of equations for all compartments and all strata
Linear Models
Linear regression: Principle
The model is written as follows:

$Y_i = a + b \cdot X_i + \varepsilon_i$

where $\varepsilon_i$ is the residual variation not explained by the model, and a and b are the parameters of the model.

Fitting this equation consists in estimating the parameters a and b. Usually we use the least squares method, which consists in finding the values of a and b that minimize the sum of squared errors:

$SSE = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \left( Y_i - a - b \cdot X_i \right)^2$
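The least squares fit above has a closed-form solution; a minimal sketch in Python with numpy, on synthetic illustration data (not the workshop data set):

```python
import numpy as np

# Least-squares fit of Y = a + b*X, as described above.
# Synthetic data for illustration only.
rng = np.random.default_rng(0)
X = np.linspace(1.0, 10.0, 30)
Y = 2.0 + 0.7 * X + rng.normal(0.0, 0.3, X.size)

Xbar, Ybar = X.mean(), Y.mean()
b = np.sum((Y - Ybar) * (X - Xbar)) / np.sum((X - Xbar) ** 2)
a = Ybar - b * Xbar

sse = np.sum((Y - a - b * X) ** 2)  # minimized sum of squared errors
```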
Linear Models
Linear regression: Principle

a and b are two random variables. The covariance between a and b is not null (meaning that the parameters of a given equation are not independent). An unbiased estimation of this covariance is given by:

$\widehat{\mathrm{cov}}(\hat{a}, \hat{b}) = \frac{-\bar{X}\, s^2}{\sum_i (X_i - \bar{X})^2}$

The least squares estimators are:

$\hat{b} = \frac{\sum_i (Y_i - \bar{Y})(X_i - \bar{X})}{\sum_i (X_i - \bar{X})^2}, \qquad \hat{a} = \bar{Y} - \hat{b}\,\bar{X}$

with the residual variance $s^2 = \frac{1}{n-2} \sum_i \left( Y_i - \hat{a} - \hat{b} X_i \right)^2$.

The standard deviations of a and b are given by:

$ect(\hat{b}) = \sqrt{\frac{s^2}{\sum_i (X_i - \bar{X})^2}}, \qquad ect(\hat{a}) = \sqrt{s^2 \left( \frac{1}{n} + \frac{\bar{X}^2}{\sum_i (X_i - \bar{X})^2} \right)}$

and their confidence intervals by:

$\hat{b} \pm t(n-2,\, p/2)\, ect(\hat{b}), \qquad \hat{a} \pm t(n-2,\, p/2)\, ect(\hat{a})$

Usually p = 0.05, to get the parameter values at the 95% confidence level.
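The standard errors and confidence intervals can be computed the same way; a sketch with numpy and scipy.stats, again on synthetic data:

```python
import numpy as np
from scipy import stats

# Standard errors and 95% confidence intervals for a and b,
# using the closed-form least squares fit. Synthetic data.
rng = np.random.default_rng(1)
X = np.linspace(1.0, 10.0, 30)
Y = 2.0 + 0.7 * X + rng.normal(0.0, 0.3, X.size)
n = X.size

Sxx = np.sum((X - X.mean()) ** 2)
b = np.sum((Y - Y.mean()) * (X - X.mean())) / Sxx
a = Y.mean() - b * X.mean()

s2 = np.sum((Y - a - b * X) ** 2) / (n - 2)       # residual variance
se_b = np.sqrt(s2 / Sxx)
se_a = np.sqrt(s2 * (1.0 / n + X.mean() ** 2 / Sxx))

t = stats.t.ppf(1 - 0.025, n - 2)                 # t(n-2, p/2), p = 0.05
ci_b = (b - t * se_b, b + t * se_b)
ci_a = (a - t * se_a, a + t * se_a)
```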
Linear Models

Dep Var: Y   N: 13   Multiple R: 0.828991   Squared multiple R: 0.687225
Adjusted squared multiple R: 0.658791   Standard error of estimate: 0.351724

Effect     Coefficient   Std Error   Std Coef   Tolerance   t         P(2 Tail)
CONSTANT   0.063555      0.378512    0.000000   .           0.16791   0.86970
X          0.704030      0.143206    0.828991   1.000000    4.91621   0.00046

Effect     Coefficient   Lower <95%>   Upper
CONSTANT   0.063555      -0.769544     0.896655
X          0.704030      0.388836      1.019223

Analysis of Variance
Source   Sum-of-Squares   df   Mean-Square   F-ratio   P

Reading the output:
- Coefficient of determination R² (the adjusted R² is the better indicator)
- Values of the parameters: a = intercept, b = the coefficient of X
- Standard deviations of the parameters and their confidence intervals
Linear regression: Analysis of variance
Linear Models
Can we do a linear regression?

[Figures: scatter plots of Y against X, followed by the same data after transformation, asking in each case whether a linear regression can be applied:]
- $Y' = \ln Y$, $X' = \ln X$ (plot of ln Y against ln X)
- $Y' = \ln Y$, $X' = X$ (plot of ln Y against X)
- $Y' = \ln(Y/(1-Y))$, $X' = X$ (plot of ln(Y/(1−Y)) against X)

Linear Models
Why transform the data?

• Power equation: $Y = a\, X^{b}\, e^{\varepsilon}$, i.e. $\ln Y = a' + b \ln X + \varepsilon$ (with $a' = \ln a$)
• Exponential model: $Y = a\, e^{bX + \varepsilon}$, i.e. $\ln Y = a' + b X + \varepsilon$

It is always worthwhile to obtain a linear relationship, because the solution is then explicit. Sometimes the transformation also stabilizes the variance. But this is not always possible, and it may not suit the data set... so try and see!
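As a sketch of the power-equation case, the model can be fitted by ordinary least squares on the log-transformed variables (synthetic data; Python chosen for illustration):

```python
import numpy as np

# Linearizing a power model Y = a * X**b * exp(eps):
# ln Y = ln a + b * ln X + eps, then ordinary least squares on the logs.
# Synthetic data for illustration.
rng = np.random.default_rng(2)
X = np.linspace(1.0, 20.0, 50)
Y = 1.5 * X ** 0.8 * np.exp(rng.normal(0.0, 0.05, X.size))

lnX, lnY = np.log(X), np.log(Y)
b, ln_a = np.polyfit(lnX, lnY, 1)   # slope = b, intercept = ln a
a = np.exp(ln_a)
```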
Linear Models
Are the following equations linear, or can they be transformed to get a linear equation?

[Quiz slide listing seven candidate model forms, among them $Y = bX\varepsilon$ and $Y = b\,e^{X}\varepsilon$, with worked answers such as:]

$\ln Y = \ln(bX\varepsilon) = \ln b + \ln X + \ln\varepsilon$ (yes, linear after a log transformation)

$\ln Y = \ln b + X + \ln\varepsilon$ (yes, linear after a log transformation)
Non-Linear Models
Non-linear regression: Principle

For linear models, the solution is explicit because the derivative of the model with respect to each parameter does not depend on the parameters of the equation. For non-linear models this is not the case: the derivatives depend on the parameters, and solving the system directly is too difficult. It is then necessary to use alternative methods.

Example with the exponential model $Y = a\,e^{bX} + \varepsilon$:

$SS_{res} = \sum_{i=1}^{n} \left( Y_i - a\,e^{b X_i} \right)^2$

Setting the derivatives with respect to a and b to zero gives:

$\sum_{i=1}^{n} \left( Y_i - a\,e^{b X_i} \right) e^{b X_i} = 0$

$\sum_{i=1}^{n} \left( Y_i - a\,e^{b X_i} \right) a\, X_i\, e^{b X_i} = 0$

Both normal equations contain the parameters inside the exponential, so there is no closed-form solution.

Non-Linear Models
Non-linear regression: Principle

To fit a non-linear model, it is necessary to proceed by iterations, which means that initial values must be given to the parameters. When the least squares method is used, the sum of squared errors (SSE) is calculated at each step (i.e. for each new set of parameter estimates). If the procedure is efficient, this SSE decreases at each step; at the end of the process, when the decrease becomes negligible, the model is said to have converged.

The most widely used iterative procedure is the Gauss-Newton one, but many other procedures are available. When there are problems in fitting a model, it is recommended to test several methods (e.g. fractional iteration, Marquardt).
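A sketch of such an iterative fit, using scipy.optimize.curve_fit (a Levenberg-Marquardt-type least squares solver) on a synthetic exponential data set; the p0 argument supplies the initial parameter values discussed above:

```python
import numpy as np
from scipy.optimize import curve_fit

# Iterative least squares fit of the exponential model Y = a * exp(b*X),
# with explicit initial values (p0). Synthetic data for illustration.
rng = np.random.default_rng(3)
X = np.linspace(0.0, 5.0, 40)
Y = 2.0 * np.exp(0.5 * X) + rng.normal(0.0, 0.5, X.size)

def model(x, a, b):
    return a * np.exp(b * x)

# Initial values matter: a bad guess can stop the search in a local minimum.
params, cov = curve_fit(model, X, Y, p0=[1.0, 0.1])
a_hat, b_hat = params
```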
Non-Linear Models
Non-linear regression: Principle

[Figure: iso-value curves (contours at 40, 60, 80, 100) of the sum of squared errors in the (b1, b2) parameter plane, with the successive iterations leading from the initial value of b1 to its final value; a graphical view of model convergence with two parameters.]
Non-Linear Models
Non-linear regression: Principle

[Figure: the same SSE iso-value curves with two different sets of starting points. With the first initial values, the algorithm falls into a local minimum of the SSE (between 20 and 40); with the second, it reaches the absolute minimum (below 20).]

Importance of the initial values given to the parameters.
Dependent variable is Y

Source           Sum-of-Squares   df   Mean-Square
Regression       1.79138E+04      3    5971.269443
Residual         6.711670         20   0.335583
Total            1.79205E+04      23
Mean corrected   3284.344348      22

Raw R-square (1-Residual/Total) = 0.999625
Mean corrected R-square (1-Residual/Corrected) = 0.997956
R(observed vs predicted) square = 0.997965

                                           Wald Confidence Interval
Parameter   Estimate    A.S.E.     Param/ASE   Lower <95%>   Upper
B1          40.269815   0.584758   68.865757   39.050031     41.489600
B2          0.029815    0.001760   16.941467   0.026144      0.033486
B3          1.454754    0.078017   18.646595   1.292013      1.617495

Asymptotic Correlation Matrix of Parameters
      B1          B2         B3
B1    1.000000
B2   -0.910171    1.000000
B3   -0.756906    0.939698   1.000000

Reading the output: the mean-corrected R² should be used; the values of the parameters and their confidence intervals are given in the Wald table.

Non-linear regression: Analysis of variance
Goodness of fit
How to assess the goodness of fit (for linear and non-linear models)?

• R² and the plot of Y against predicted Y
R² is an index of fit, to be used cautiously (see below); maximum value = 1, minimum value = 0.
• Values of the parameters and their confidence intervals
Useful for identifying convergence problems; usually the standard error should not exceed 10% of the parameter value.
• Correlations between parameters
If the correlations are too high, transform the variables or change the model equation.
• The RMSE (root mean square error, or residual standard error)
Gives the error dispersion, to be compared to the average measured values. Usually the model is satisfactory when the RMSE is less than 10% of the measured values.
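The indices above can be computed directly; a small sketch with numpy, where y_obs and y_pred are illustrative placeholders rather than real measurements:

```python
import numpy as np

# Goodness-of-fit indices for a fitted model: R2, RMSE, and RMSE as a
# fraction of the mean observed value. Placeholder data for illustration.
y_obs = np.array([10.0, 12.5, 15.2, 18.1, 21.0, 24.3])
y_pred = np.array([10.4, 12.1, 15.6, 17.8, 21.5, 23.9])

resid = y_obs - y_pred
rmse = np.sqrt(np.mean(resid ** 2))
r2 = 1.0 - np.sum(resid ** 2) / np.sum((y_obs - y_obs.mean()) ** 2)
rel_rmse = rmse / y_obs.mean()   # model judged satisfactory if < 10%
```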
Goodness of fit
How to assess the goodness of fit (for linear and non-linear models)?

Example of normally distributed errors, to be verified with statistical tests (e.g. D'Agostino et al., 1990) and quantile plots.

[Figure: example of errors with ...]

[Figure: error of a linear model when the appropriate model is in fact ...]

Do not listen to the siren song of R²!
Heteroscedasticity
How to deal with heteroscedasticity?

[Figure: scatter plot of Y against X with the error dispersion increasing with X.]

Transformation of the variables, to get a linear model with homogeneous variance. Starting from

$Y = aX + b + \varepsilon$, with $\varepsilon \sim N(0, \sigma X)$,

set $Y' = Y/X$, $X' = 1/X$, $\varepsilon' = \varepsilon/X$, which gives

$Y' = a + b X' + \varepsilon'$, with $\varepsilon' \sim N(0, \sigma)$.

This is equivalent to performing a weighted linear regression.
Heteroscedasticity
How to deal with heteroscedasticity?

More generally, the weighted regression consists in minimizing:

$\sum_{i=1}^{n} w_i \left( Y_i - Y_{pred,i} \right)^2$

where $w_i$ is the weight of observation i, usually $w_i = 1/\sigma_i^2$ with $\sigma_i^2 \propto X_i^z$. The whole challenge consists in finding the appropriate value of z.

[Figure: residual plots for $z = z_{optimum}$, $z < z_{optimum}$ and $z > z_{optimum}$.]
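A weighted fit of this kind can be sketched by solving the weighted normal equations; synthetic heteroscedastic data, with z assumed equal to 2 (standard deviation proportional to X):

```python
import numpy as np

# Weighted linear regression minimizing sum_i w_i * (Y_i - a - b*X_i)**2,
# with w_i = 1 / X_i**z. Synthetic data where sd(eps) = 0.2*X, i.e.
# variance proportional to X**2, so z is taken as 2.
rng = np.random.default_rng(4)
X = np.linspace(1.0, 10.0, 100)
Y = 1.0 + 2.0 * X + rng.normal(0.0, 0.2 * X)

z = 2.0
w = 1.0 / X ** z

# Weighted normal equations for Y = a + b*X.
A = np.column_stack([np.ones_like(X), X])
beta = np.linalg.solve(A.T @ (w[:, None] * A), A.T @ (w * Y))
a_w, b_w = beta
```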
Heteroscedasticity
How to deal with heteroscedasticity?

First option: a rough and simple method that can be used if there are enough data, with $w_i \propto 1/X_i^z$
• Step 1 = split the variable X into k classes centred on $X_k$
• Step 2 = calculate the variance $\sigma_k^2$ of Y within each class
• Step 3 = linear regression of $\log \sigma_k^2$ against $\log X_k$
The slope of this regression is z, which is often rounded to 1 or 2.
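The binning method can be sketched as follows, on synthetic data where the variance is proportional to X² (so z should come out near 2). The preliminary detrending step is an assumption added here, so that the within-class variances reflect the errors rather than the trend:

```python
import numpy as np

# Binning estimate of z: split X into classes, compute the residual
# variance within each class, regress log(variance) on log(class centre).
# Synthetic data with variance proportional to X**2 (true z = 2).
rng = np.random.default_rng(5)
X = rng.uniform(1.0, 20.0, 5000)
Y = 1.0 + 2.0 * X + rng.normal(0.0, 0.5 * X)

# Preliminary OLS fit to remove the trend before computing variances
# (an added assumption, not stated on the slide).
b, a = np.polyfit(X, Y, 1)
resid = Y - (a + b * X)

edges = np.linspace(1.0, 20.0, 11)            # 10 classes
centres = 0.5 * (edges[:-1] + edges[1:])
var_k = np.array([resid[(X >= lo) & (X < hi)].var()
                  for lo, hi in zip(edges[:-1], edges[1:])])

z_hat, _ = np.polyfit(np.log(centres), np.log(var_k), 1)  # slope gives z
```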
Heteroscedasticity
How to deal with heteroscedasticity?

Second option: fitting z iteratively, with $w_i \propto 1/X_i^z$
• Step 1 = fit the weighted model with z fixed at a given value (often 0 at the beginning)
• Step 2 = calculate the Furnival index: $FI = \mathrm{antilog}\!\left( \frac{1}{n} \sum_{k=1}^{n} \log X_k \right) \cdot RMSE$
• Step 3 = go back to step 1, increasing z

Heteroscedasticity
How to deal with heteroscedasticity?

Third option: fitting z together with the other parameters of the model, with $w_i \propto 1/X_i^z$
• Fit by maximum likelihood instead of the least squares method, with a model for the mean ($\mu_i$) and a model for the variance ($\sigma^2 X_i^z$), maximizing the log-likelihood:

$ML = -\frac{1}{2} \sum_i \left[ \log\!\left( 2\pi\, \sigma^2 X_i^z \right) + \frac{(Y_i - \mu_i)^2}{\sigma^2 X_i^z} \right]$
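A sketch of this maximum-likelihood option, minimizing the negative of the log-likelihood above with scipy.optimize.minimize on synthetic data (mean model a + bX, variance model σ²·X^z, true z = 2); starting z at 0 echoes the iterative option:

```python
import numpy as np
from scipy.optimize import minimize

# Joint ML fit of the mean model (a + b*X) and the variance model
# (sigma**2 * X**z). Synthetic data with a=1, b=2, sd = 0.3*X (z = 2).
rng = np.random.default_rng(6)
X = rng.uniform(1.0, 20.0, 2000)
Y = 1.0 + 2.0 * X + rng.normal(0.0, 0.3 * X)

def nll(theta):
    a, b, log_s2, z = theta
    var = np.exp(log_s2) * X ** z
    return 0.5 * np.sum(np.log(2.0 * np.pi * var)
                        + (Y - a - b * X) ** 2 / var)

# Start from an unweighted OLS fit, with z = 0 at the beginning.
b0, a0 = np.polyfit(X, Y, 1)
s2_0 = np.mean((Y - a0 - b0 * X) ** 2)
fit = minimize(nll, x0=[a0, b0, np.log(s2_0), 0.0],
               method="Nelder-Mead", options={"maxiter": 5000})
a_ml, b_ml, log_s2_ml, z_ml = fit.x
```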
Model Choice
How to choose between models?

• If the number of parameters is the same in model 1 and model 2, use the sum of squared errors (SSE): the lowest SSE indicates the best model.
• If the number of parameters differs, check whether the two models are nested or not. For nested models, use an F test based on the SSE of the two models. With $p_1 > p_2$:

$F_{obs} = \frac{(SSE_2 - SSE_1)/(p_1 - p_2)}{SSE_1/(n - p_1)}$

If $F_{obs} > F_{tab}(p_1 - p_2,\, n - p_1)$, then model 1 (the one with more parameters) is the more appropriate.

Examples:
- $Y = a + bD + \varepsilon$ is nested in $Y = a + bD + cD^2H + \varepsilon$
- $Y = a + bD^2H + \varepsilon$ is nested in $Y = a + bD + cD^2H + \varepsilon$
- $Y = a + bD + \varepsilon$ is not nested in $Y = a + cD^2H + \varepsilon$
Advice: decision tree for model selection

- Same dependent variable?
  - No: Furnival index
  - Yes: same number of parameters?
    - Yes: sum of squared errors
    - No: are the models nested?
      - Yes: F test
      - No: Akaike information criterion

Model Choice
How to choose between models?

Don't use the R², because it increases automatically as the number of parameters grows: compare, for example,

$Y = a + bD + cD^2$ and $Y = a + bD + cD^2 + \ldots + kD^5$
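The point is easy to verify: refitting polynomials of increasing degree to the same data never decreases R². A sketch with numpy on synthetic data:

```python
import numpy as np

# R2 increases mechanically with the number of parameters: polynomials
# of increasing degree fitted to the same data give a non-decreasing R2,
# even when the extra terms explain nothing real.
rng = np.random.default_rng(7)
X = np.linspace(0.0, 10.0, 25)
Y = 1.0 + 2.0 * X + rng.normal(0.0, 1.0, X.size)

def r_squared(degree):
    coeffs = np.polyfit(X, Y, degree)
    resid = Y - np.polyval(coeffs, X)
    return 1.0 - np.sum(resid ** 2) / np.sum((Y - Y.mean()) ** 2)

r2_by_degree = [r_squared(d) for d in range(1, 6)]
```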
Step by Step

• Exploratory stage: getting a model for each compartment and each stratum (local model)
- What variable is to be used as input data? Or what combination of variables is to be used as input data?
- What is the form of the relationship with each of these variables?
• Aggregation stage: getting a model for each compartment, all strata pooled together (global model)
- What are the relationships between the parameters of the local models and the strata characteristics?
- What is the form of this relationship for each parameter?
• Fitting of the complete model: one system of equations for all compartments and all strata
Aggregation
Example: Eucalyptus in Congo (Saint-André et al. 2005)
[Figure: leaf biomass (kg DM tree⁻¹) plotted against D²H (m³) for the stand groups GP1, 2 (11-30 months), GP3A, 3B (50-75 months) and GP3C, 3D (135 months), with measured (DMLeaves) and estimated (DMLeavesEst) values; one model parameter shows no variation with stand age, another decreases exponentially with stand age.]

Fitted age by age, then analysis of the parameter variations with stand age:

$LeafBiomass = a + b\, D^2 H + \varepsilon$
Aggregation
[Table: for each compartment (F1 total, F2 aboveground, F3 belowground, F4 leaves, F5 dead branches, F6 living branches, F7 bark, F8 stem, F9 stump, F10 coarse roots, F11 medium roots, F12 fine roots), the stands used for calibration (G1 to G3D, V1 to V9) and the fitted models for the expectation and for the variance, all power functions of $r_{1.3}^2 h$, with age-dependent coefficients for several aboveground compartments.]
The age effect was significant for most of the compartments; we thus obtain a set of equations that can be used whatever the stand age (within the range of the calibration data set, 11 to 135 months).

Example: Eucalyptus in Congo (Saint-André et al. 2005)
[Figure: parameter b (dimensionless) plotted against age (years) for Eucalyptus in Congo, beech in France, and Eucalyptus in Brazil.]

Not only do eucalyptus and fagus show the same pattern, they also follow the same line! (especially for stem wood and branches)

Example: Fagus in France (Genet et al. 2011)
Step by Step

• Exploratory stage: getting a model for each compartment and each stratum (local model)
- What variable is to be used as input data? Or what combination of variables is to be used as input data?
- What is the form of the relationship with each of these variables?
• Aggregation stage: getting a model for each compartment, all strata pooled together (global model)
- What are the relationships between the parameters of the local models and the strata characteristics?
- What is the form of this relationship for each parameter?
• Fitting of the complete model: one system of equations for all compartments and all strata
Taking all compartments into account

The equations were fitted simultaneously, to take cross-compartment correlations into account. This step is important when one wants to simulate biomass estimates with confidence intervals.

• The outputs of SUR (seemingly unrelated regressions) are:
1. Values of the parameters and their confidence intervals
2. Correlation matrix of the parameters (within and between compartments)
3. Residual errors for each compartment
4. Correlation matrix of the errors (between compartments)
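A minimal two-step SUR sketch (equation-by-equation OLS, estimation of the cross-equation residual covariance, then feasible generalized least squares on the stacked system), on synthetic data for two hypothetical compartments; this illustrates the mechanics, not the original fitting code:

```python
import numpy as np
from scipy.linalg import block_diag

# Two-step SUR sketch for two hypothetical compartments whose errors
# are correlated across equations. Synthetic data only.
rng = np.random.default_rng(8)
n = 200
x = rng.uniform(1.0, 10.0, n)
e = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.7], [0.7, 1.0]], size=n)
y1 = 1.0 + 2.0 * x + e[:, 0]          # compartment 1: linear in x
y2 = 0.5 + 0.3 * x ** 2 + e[:, 1]     # compartment 2: linear in x**2

X1 = np.column_stack([np.ones(n), x])
X2 = np.column_stack([np.ones(n), x ** 2])

# Step 1: equation-by-equation OLS, then cross-equation error covariance.
b1 = np.linalg.lstsq(X1, y1, rcond=None)[0]
b2 = np.linalg.lstsq(X2, y2, rcond=None)[0]
R = np.column_stack([y1 - X1 @ b1, y2 - X2 @ b2])
Sigma = R.T @ R / n

# Step 2: feasible GLS on the stacked system, covariance kron(Sigma, I).
Xs = block_diag(X1, X2)
ys = np.concatenate([y1, y2])
Omega_inv = np.kron(np.linalg.inv(Sigma), np.eye(n))
beta = np.linalg.solve(Xs.T @ Omega_inv @ Xs, Xs.T @ Omega_inv @ ys)
# beta stacks [a1, b1, a2, b2], with cross-equation information pooled
```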