• Aucun résultat trouvé

Robust Inference for Generalized Linear Models

N/A
N/A
Protected

Academic year: 2022

Partager "Robust Inference for Generalized Linear Models"

Copied!
37
0
0

Texte intégral

(1)

Article

Reference

Robust Inference for Generalized Linear Models

CANTONI, Eva, RONCHETTI, Elvezio

Abstract

By starting from a natural class of robust estimators for generalized linear models based on the notion of quasi-likelihood, we define robust deviances that can be used for stepwise model selection as in the classical framework. We derive the asymptotic distribution of tests based on robust deviances, and we investigate the stability of their asymptotic level under contamination. The binomial and Poisson models are treated in detail. Two applications to real data and a sensitivity analysis show that the inference obtained by means of the new techniques is more reliable than that obtained by classical estimation and testing procedures.

CANTONI, Eva, RONCHETTI, Elvezio. Robust Inference for Generalized Linear Models.

Journal of the American Statistical Association , 2001, vol. 96, no. 455, p. 1022-1030

DOI : 10.1198/016214501753209004

Available at:

http://archive-ouverte.unige.ch/unige:22899

Disclaimer: layout of this document may differ from the published version.

(2)

Models

Eva Cantoni and Elvezio Ronhetti

Department of Eonometris

University of Geneva

CH - 1211 Geneva 4, Switzerland

May 1999

Revised January 2001

Abstrat

Bystartingfromanaturallassofrobustestimatorsforgeneralizedlinear

models based on the notion of quasi-likelihood, we dene robust devianes

that an be used for stepwise model seletion as in the lassialframework.

We derivetheasymptotidistributionof testsbasedonrobustdevianesand

weinvestigatethestabilityoftheirasymptotilevelunderontamination. The

binomial and Poisson modelsare treated indetail. Two appliations to real

data anda sensitivityanalysis show thatthe inferene obtainedby meansof

thenewtehniquesismorereliablethanthatobtainedbylassialestimation

and testing proedures.

(3)

Generalizedlinearmodels(MCullaghandNelder,1989)areapowerfulandpopular

tehnique for modeling a large variety of data. In partiular, generalized linear

models allowtomodeltherelationshipbetween thepreditors andafuntionof the

mean of the response for ontinuous and disrete response variables. The response

variablesY

i

, for i=1;:::;n are supposed toome from adistribution belonging to

the exponential family, suh that E[Y

i

℄=

i

and V[Y

i

℄=V(

i

) for i=1;:::;n and

i

=g(

i )=x

T

i

; i=1;:::;n; (1)

where 2IR p

isthe vetor of parameters, x

i 2IR

p

, and g(:)is the link funtion.

Thenon-robustnessof themaximumlikelihoodestimatorforhas beenstudied

extensively in the literature: f. for instane the early work of Pregibon (1982) on

logisti regression, Stefanski, Carroll, and Ruppert (1986), Kunsh, Stefanski, and

Carroll (1989), Morgenthaler (1992), and Rukstuhl and Welsh (1999). In more

reent work, Preisserand Qaqish(1999) onsider alass of robustestimators inthe

general framework of generalized estimating equations.

The quasi-likelihood estimatorof the parameter of model (1)(see Wedderburn,

1974,MCullaghandNelder,1989,andHeyde,1997)sharesthesamenon-robustness

properties. This estimatoristhe solutionof the system of estimating equations

n

X

i=1

Q(y

i

;

i )=

n

X

i=1 (y

i

i )

V(

i )

0

i

=0; (2)

where 0

i

=

i

,andQ(y

i

;

i

)isthequasi-likelihoodfuntion. Thesolutionof (2)is

an M-estimator(see Huber, 1981, and Hampel, Ronhetti, Rousseeuw, and Stahel,

1986) dened by the sore funtion

~

(y

i

;

i ) =

(y

i

i )

V(

i )

0

i

. Its inuene funtion

(Hampel, 1974 and Hampel et al., 1986) is proportional to

~

and is unbounded.

(4)

explanatory variables x

i

an have a large inuene on the estimator. Thus, the

quasi-likelihood estimator { as well as the maximum likelihood estimator { is not

robust. Several robust alternatives have been proposed in the literature; see the

referenes given above.

However, in spite of the fair amount of existing literature, robust inferene for

generalized linear models seems to be very limited. Moreover, only the logisti

regression situation is usually onsidered in detail, and the problem of developing

robust alternatives tolassial tests is not addressed globally forthe whole lass of

generalized linear models.

In this paper we propose a robust approah to inferene based on robust de-

vianes whih are natural generalizationsof quasi-likelihoodfuntions. Our robust

devianes are based on the same lass of robust estimators as that proposed by

Preisser and Qaqish (1999) in the more general setup of generalized estimating

equations. Although these estimators are not optimally robust, they form a lass

of M-estimators easy to deal with, and whih admits handy inferene not only for

logisti regression but for the whole lass of generalized linear models.

One ould argue that two alternative approahes ould be onsidered. A rst

possibility would be to view variable seletion as a parametri hypothesis and to

use Wald, sore orlikelihoodratio tests for whih robustversions are available;see

e.g. Heritier and Ronhetti (1994) and Markatou and He (1994). Whilethis would

in priniple be feasible, Wald and sore tests do not seem to be used muh in the

lassialanalysisofgeneralizedlinearmodels. Moreover,robustlikelihoodratiotests

annot be proposed in this ase, beause the optimal robust sore funtion does

not admit an analyti primitive funtion and numerial integration in the spae

(5)

approahwould betorelyonthe robustmodelseletionbasedonAkaikeCriterion,

Mallows' C

p

or similar tehniques; see e.g. Ronhetti and Staudte (1994), Sommer

and Huggins (1996) and Ronhetti (1997) for a review. This approah has the

advantage to perform a full model searh. However, when the number of variables

is moderate to large suh a full searh is impossible and a stepwise seletion is the

only feasible alternative.

For these reasons and in view of the importane of the notion of deviane for

model buildingin generalized linear models, we propose robust devianesbased on

generalizations of quasi-likelihood funtions. The general struture of the lassial

approah by quasi-likelihood is preserved, whih oers the advantage of having ro-

bust tools playing the same role as devianes, anova tables, stepwise proedures,

and so on.

The paper isorganized asfollows. In the next setionwe disuss robustestima-

tors of a generalized linear modelbased on quasi-likelihood. As an illustration, we

fous inpartiular on the estimationof binomialand Poissonmodels. In Setion3,

we disuss inferene and propose afamily of test statistis for modelseletion. We

derive their asymptotidistribution through the development of an asymptotially

equivalent quadratiformand we study their robustness propertiesthrough the in-

uene funtion. Setion4presentssomeomputationalaspetsand Setion5gives

twoappliations. Finally,inSetion6wedisusssomepotentialresearhdiretions.

(6)

2.1 General Denition

We onsider a general lass of M-estimators of Mallows's type, where the inuene

of deviations ony and on xare bounded separately. The estimator is the solution

of the estimating equations:

n

X

i=1 (y

i

;

i

)=0; (3)

where (y;) = (y;)w(x) 0

a(), a() = 1

n P

n

i=1 E[(y

i

;

i )℄w(x

i )

0

i

with the

expetation taken with respet to the onditional distribution of yjx, (;), w(x)

are weight funtions dened below, and

i

=

i

() = g 1

(x T

i

). The onstant

a() ensures the Fisher onsisteny of the estimator. The estimating equation (3)

for generalized linear models is a speial ase of equation (1)p. 575 for generalized

estimating equationsinPreisserand Qaqish(1999),where ourfuntion (y;)w(x)

is (intheir notation) V 1

()w(x;y; )(y ) and a()= 0

V 1

() .

Let y = (y

1

;:::;y

n )

T

and = (

1

;:::;

n )

T

. The estimating equation (3)

orresponds tothe minimization of the quantity

Q

M

(y;)= n

X

i=1 Q

M (y

i

;

i

); (4)

with respet to , where the funtions Q

M (y

i

;

i

)an bewritten as

Q

M (y

i

;

i )=

Z

i

~ s

(y

i

;t)w(x

i )dt

1

n n

X

j=1 Z

j

~

t E

(y

j

;t)w(x

j )

dt; (5)

with ~s suh that (y

i

;s)~ =0, and

~

t suh that E[(y

i

;

~

t)℄ =0. Note that dierenes

of devianes, asthe test statisti(8), are independent of s~and

~

t.

Thestrutureof (3) issuggested bythe lassialquasi-likelihoodequations. The

estimatordened byequation(3)isanM-estimatorharaterizedbythesorefun-

(7)

tion (y

i

;

i

)=(y

i

;

i )w(x

i )

i

a(). Its inuene funtionis then IF(y; ;F)=

M( ;F) 1

(y;), where M( ;F) = E[

(y;)℄; f. Hampel et al. (1986).

Moreover, the estimator has an asymptoti normal distribution with asymptoti

variane =M( ;F) 1

Q( ;F)M( ;F) 1

, where Q( ;F) =E[ (y;) (y;) T

℄.

Itisthenlearthatthehoieofaboundedfuntion ensuresrobustnessbyputting

a bound on the inuene funtion. Therefore, a bounded funtion (y;) is intro-

duedtoontroldeviationsinthey-spae,andleveragepointsaredown-weightedby

the weightsw(x). Simplehoies for(;)and w()suggested by robustestimators

in linear models are (y

i

;

i ) =

(r

i )

1

V 1=2

(i)

(see (6) below) and w(x

i )=

p

1 h

i ,

where h

i

is the i-th diagonal element of the hat matrix H =X(X T

X) 1

X T

. More

sophistiated hoiesfor w()are available(see Staudte and Sheather, 1990, p.258,

for a disussion in linear regression or Carroll and Welsh, 1988). Weights dened

on H do not have high breakdown properties, and from this point of view, other

hoies of w(x

i

) aremore suitable. Forexample, w(x

i

)an behosen asthe inverse

of the Mahalanobisdistane dened through ahigh breakdown estimateof the en-

ter and of the ovariane matrix of the x

i

(see, for example, the minimum volume

ellipsoidestimatororthe minimum ovarianedeterminantestimatorinRousseeuw

and Leroy, 1987, p. 258 .). Finally notie that the hoie of (y

i

;

i ) =

y

i

i

V(

i )

and

w(x

i

) = 1 for all i, reovers the lassial quasi-likelihood estimator, so that for a

judiious hoie of (y

i

;

i

) and ofthe weights w(x

i

), the funtionQ

M

(y;)an be

seen as the robust ounterpart of the lassial quasi-likelihoodfuntion.

The form of this estimator is attrative beause the estimating equation (3)

orresponds to the minimization of (4) and this leads to a natural denition of

robust deviane; see Setion3.1.

(8)

We onsider here the partiular ase of (3), dened by (y

i

;

i ) =

(r

i )

1

V 1=2

(i) ,

where r

i

= y

i

i

V 1=2

(

i )

are the Pearson residuals and

is the Huber funtion dened

by

(r)=

8

<

:

r jrj;

sign(r) jrj>:

(6)

We all the estimator dened in this way, the Mallows quasi-likelihood estimator.

It solvesthe set of estimating equations

n

X

i=1 h

(r

i )w(x

i )

1

V 1=2

(

i )

0

i

a() i

=0; (7)

where a() = 1

n P

n

i=1 E[

(r

i )℄w(x

i )

1

V 1=2

(

i )

0

i

. Using the same notation as in the

linear regressionase, whenw(x

i

)=1 weallthis estimatorHuberquasi-likelihood

estimator.

The tuning onstant is typially hosen to ensure a given level of asymptoti

eÆieny. In Setion 3.2 we propose an alternative proedure for the hoie of

the tuning onstant. a() is a orretion term to ensure Fisher onsisteny; see

Hampel et al. (1986) for general parametri models and He and Simpson (1993),

Setion4.1,forpowerseriesdistributions. Notethata()anbeomputedexpliitly

for binomial and Poisson models and does not require numerial integration; f.

Appendix A. ThematriesM(

;F)andQ(

;F)an alsobeeasilyomputedfor

the Mallows quasi-likelihoodestimator:

Q(

;F)= 1

n X

T

AX a()a()

T

;

where A is adiagonalmatrix with elementsa

i

=E[

(r

i )

2

℄w 2

(x

i )

1

V(

i )

(

i

i )

2

,and

M(

;F)= 1

n X

T

BX;

(9)

whereBisadiagonalmatrixwithelementsb

i

=E [

(r

i )

i

logh(y

i jx

i

;

i )℄

1

V 1=2

(

i )

w(x

i )(

i )

2

,

andh()istheonditionaldensityorprobabilityofy

i jx

i

. WerefertoAppendixBfor

further details and for the omputationof these matriesfor binomialand Poisson

models.

3 Robust Inferene

3.1 Model Seletion Based on Robust Devianes

The funtion Q

M

(y;) dened in (4) and (5) allows to develop robust tools for

inferene and modelseletion basedon robust quasi-devianes.

Denote by a=(a T

(1)

;a T

(2) )

T

the partitionof avetor a into(p q) and q ompo-

nents and the orresponding partitionof a matrix A by

A= 0

A

11 A

12

A

21 A

22 1

A

;

where A

11 2 IR

(p q)(p q)

, A

12 2IR

(p q)q

, A

21 2IR

q(p q)

and A

22 2 IR

qq

.

Toevaluatethe adequayofamodel,wedenearobustgoodness-of-tmeasure

| alled robust quasi-deviane | based on the notion of robust quasi-likelihood

funtion, i.e.

D

QM

(y;)= 2Q

M

(y;)= 2 n

X

i=1 Q

M (y

i

;

i );

where Q

M

is dened by (4)and (5).

D

QM

(y;) desribes the quality of a t and will be used to dene a statisti

for model seletion. Let us onsider the model M

p

, with p parameters. Suppose

that the orresponding set of parameters is = (

1

;:::;

p )

T

= ( T

(1)

; T

(2) )

T

. We

are interested in testing the null hypothesis H

0 :

(2)

= 0. This is equivalent to

(10)

p q p

the sub-model M

p q

holds.

Weestimatethe vetor ofparametersbysolving (3)fortheompletemodel,and

weobtainanestimator

^

of. Underthenullhypothesis,thesameproedureyields

an estimator _

of (

(1)

;0). We write

^

and _

for the estimated linear preditors

assoiated to the estimate

^

and _

respetively. Then, we dene a robust measure

of disrepany between twonested models by

QM

= h

D

QM

(y;)_ D

QM (y;)^

i

= 2 h

n

X

i=1 Q

M (y

i

;^

i )

n

X

i=1 Q

M (y

i

;_

i )

i

; (8)

where the funtion Q

M (y

i

;

i

) isdened by (5).

Thestatisti(8) isinfatageneralizationof thequasi-deviane testforgeneral-

ized linear models, whih isreovered by taking Q

M (y

i

;

i )=

R

i

yi y

i t

V(t)

dt. Moreover,

when the link funtion is the identity (linear regression), (8) beomes the -test

statisti dened in Hampelet al.(1986),Chapter 7.

Thesameformsforthefuntions(y

i

;

i

)andw(x

i

)asintheestimationproblem

an be onsidered here. In partiular, a Mallows quasi-deviane statisti an be

dened by taking (y

i

;

i )=

(r

i )=V

1=2

(

i ).

The following Proposition establishes the asymptoti distribution of the test

statisti (8). We assume the onditions for the existene, onsisteny, and asymp-

toti normality of M-estimators as given by (A.1)-(A.9) in Heritier and Ronhetti

(1994), p. 902. These onditions have been studied by Huber (1967, 1981), Clarke

(1986) and Bednarski (1993).

Proposition 1 Under onditions (A.1)-(A.9) in Heritier and Ronhetti (1994),

[C1℄, [C2℄ of Appendix C, and under H

0 :

(2)

= 0, the test statisti

QM

dened

(11)

nL T

n

C( ;F)L

n +o

P

(1)=nR T

n(2)

M( ;F)

22:1 R

n(2) +o

P

(1); (9)

whereC( ;F)=M 1

( ;F)

~

M +

( ;F), p

n L

n

isnormallydistributedN 0;Q( ;F)

,

M( ;F)

22:1

= M( ;F)

22

M( ;F) T

12

M( ;F) 1

11

M( ;F)

12 , and

p

n R

n

is nor-

mally distributed N 0;M 1

( ;F)Q( ;F)M 1

( ;F)

.

Moreover,

QM

is asymptotially distributed as

q

X

i=1 d

i N

2

i

;

where N

1

;:::;N

q

are independent standard normal variables, d

1

;:::;d

q

are the q

positiveeigenvaluesofthematrixQ( ;F) M 1

( ;F)

~

M +

( ;F)

,and

~

M +

( ;F)

is suh that

~

M +

( ;F)

11

= M( ;F) 1

11 and

~

M +

( ;F)

12

= 0,

~

M +

( ;F)

21

= 0,

~

M +

( ;F)

22

=0.

The proof is given in Appendix D. A similar result an be obtained for the dis-

tribution of

QM

under ontiguous alternatives

(2)

= n 1=2

. In suh a ase

QM

is asymptotially distributed as P

q

i=1 (d

1=2

i N

i +S

T

) 2

, where S is suh that

SS T

= M

22:1

and S T

(M 1

( ;F

0

)Q( ;F

0 )M

1

( ;F

0 ))

22

S = D and D is the

diagonalmatrix with elementsd

1

;:::;d

q .

3.2 Robustness Properties and Choie of the Tuning Con-

stant

The robustness properties of the test based on (8) an be investigated by showing

that a small amount of ontamination at a point z has bounded inuene on the

asymptoti level and power of the test. This ensures the loalstability of the test.

(12)

the breakdown point as dened in He, Simpson, and Portnoy (1990). However, we

fous hereonsmalldeviationswhihareprobablythemain onernattheinferene

stage of a statistialanalysis.

We onsider the sequene of -ontaminations

F

;n

= 1

p

n

F

0 +

p

n

G; (10)

where Gis anarbitrary distribution(see Heritier and Ronhetti,1994) and investi-

gate the asymptoti level of the test under (10).

Proposition 2 Consider a parametri model F

0

and the null hypothesis H

0 :

(2)

=0. DenotebyF (n)

theempirialdistributionandbyU

n

thefuntionalU(F (n)

)

suh that U(F

0

)=0, IF(z;U;F

0

) is bounded and

p

n(U

n

U(F

;n

))N(0;) (11)

uniformly over the -ontamination F

;n

. Let (F) be the level of the test based on

the quadrati form nU T

n AU

n

when the underlying distribution is F. The nominal

level is (F

0 )=

0 .

Then, under the -ontamination F

;n

, we have

lim

n!1 (F

;n )=

0 +

2

T

diag

P

Z

IF(z;U;F

0

)dG(z)

Z

IF(z;U;F

0

)dG(z)

T

P T

+o(

2

);

where=

H

d

1

;:::;d

q (

1

0

;)

=0

,=(

1

;:::;

q )

T

=( 2

1

;:::; 2

q )

T

,H

d

1

;:::;d

q (:;)

is the .d.f. of the random variable P

q

i=1 d

i

2

1 (

2

i ),

1

0

is the (1

0

)-quantile of

P

q

i=1 d

i

2

1

(0), P is an orthogonal matrix suh that P T

DP = A, and D is the di-

agonal matrix with elements d

1

;:::;d

q

, the eigenvalues of A. Moreover, diag(R )

indiates the vetor with omponents the diagonal elements of the matrix R .

(13)

under ontamination is also bounded. The proof of this proposition is presented

in Appendix E. A similar result an be obtained for the power, showing that the

asymptoti power is stableunder ontamination.

Notethatthis propositiongeneralizesthe result ofProposition4inHeritier and

Ronhetti (1994),whih an be reovered by taking =

1

=Æ() and A=I

q .

Thegeneralresultof Proposition2anbeappliedtotherobustquasi-likelihood

test statisti(8) and in the speial ase of a pointmass ontamination G(z)=

z .

This givesthe following Corollary.

Corollary 1 Under onditions (A.1)-(A.9) in Heritier and Ronhetti (1994), and

foranyM-estimator

^

(2)

withboundedinuenefuntion,theasymptoti levelofthe

robust quasi-likelihood test statisti (8) under a point mass ontamination is given

by

lim

n!1 (F

;n )=

0 +

2

T

diag

P IF(z;

^

(2)

;F

0 )IF(z;

^

(2)

;F

0 )

T

P T

+o(

2

);

where P is an orthogonal matrix suh that P T

DP =

22 M

22:1

, is the asymptoti

variane of

^

dened in Setion 2.1, and D is the diagonal matrix with elements

d

1

;:::;d

q

dened in Proposition 1.

The result is obtained by applying Proposition 2 with G(z) =

z

, U =

^

(2) ,

=

22

, A =M

22:1

, and by using the Frehet dierentiability of

^

(2)

; see Heritier

and Ronhetti (1994).

Hene,aboundedinuene M-estimator

^

(2)

ensures aboundonthe asymptoti

levelof the robust quasi-likelihoodtest under ontamination.

(14)

the estimation of parameters an be performed via M-estimation aording to (3),

and the test statisti(8) allows usto make inferene and modelhoie.

Thefuntion(y

i

;

i

)whihappearsinthedenitionofQ

M (y

i

;

i

),isoftentuned

by aonstant;f.forinstane(6). AssuggestedinRonhettiandTrojani(2001),we

anonsidertheproblemfromthepointofviewofinfereneandhoosetheonstant

thatontrolsthe maximalbias ontheasymptotilevelofthetestinaneighborhood

of the model. To serve this last purpose, one an use the Corollary above. The

maximallevelof therobustquasi-likelihoodteststatistiinaneighborhoodofthe

model of radius is given by

=

0 +

2

(

^

(2)

;F

0 )

2

T

diag

P11 T

P T

; (12)

where (

^

(2)

;F

0

)=sup

z jjIF(z;

^

(2)

;F

0

)jj and 1=(1;:::;1) T

.

By(12),we an write

b= 1

r

0

T

diag(P11 T

P T

)

; (13)

where b is the bound on the inuene funtion of the estimator

^

(2)

. Then, for a

xedamountofontaminationandbyimposingamaximalerroronthelevelofthe

test

0

,oneandeterminetheboundbontheinuenefuntionoftheestimator,

and hene the tuning onstant by solving b = (

^

(2)

;F

0 ) =

with respet to .

For example, if q = 1 we have P = 1, diag(P11 T

P T

) = 1, and = 0:1145, see

Ronhetti and Trojani (2001). In pratie, the supremum onz =(y;x) is taken as

the maximum overthe sampleof thesupremum onyjx. Notealsothat the solution

depends onthe unknown parameter

0

; our experieneshows that itdoes not vary

muh for dierent values of , so that one an safely plug-ina reasonable (robust)

estimate. This is valid for a single test. However, in a stepwise proedure (as in

(15)

value of for eah test. Sine this is unreasonable from a pratial point of view,

we suggest to hoose a global value of by solving b = sup

z jjIF(z;

^

;F

0

)jj, based

on the fat that(

^

(2)

;F

0

)=sup

z jjIF(z;

^

(2)

;F

0

)jjjjsup

z IF(z;

^

;F

0 )jj.

4 Computational Aspets

The solution of equation (3) an be obtained numerially by a Newton-Raphson

proedure or by a Fisher soring proedure. In the latter ase, the algorithm is

alsoknown asthe inuene algorithm;f. for instane Hampeletal.(1986), p.263.

However, there is a potential problem with multiple roots of equation (3). In this

ase, wereommend touse abootstraprootsearhasproposed inMarkatou,Basu,

and Lindsay (1998), p.743-744, based on the objetive funtion Q

M

dened in (4)

as aseletion rule; see alsoHanfelt and Liang (1995).

The test statisti

QM

of equation (8), an be omputed diretly. It involves

n one-dimensional integrations, whih are performed numerially. Our experiene

shows that itworks wellforbinomialand Poisson models. Toavoidthesenumerial

integrations { espeially in the ase when n is large { one an onsider using the

asymptoti quadratiforms of Proposition1 given by (9)whihare asymptotially

equivalent to the test statisti

QM

. A systemati study on the omparison of (8)

withtheasymptotiequivalentquadratiforms(9)isleftforfurtherwork. Moreover,

ritialregionsorp-valuesfortheteststatisti

QM

areeasytoobtain. Infat,linear

ombinations of 2

1

variables have been well studied in the literature. Algorithms

for the omputation of these p-values have been proposed among others by Davies

(1980)and by Farebrother (1990). Analytial approximationsof thesedistributions

(16)

S-PLUS (MathSoft, Seattle) routines for estimation and inferene based on ro-

bust quasi-likelihoodare olleted in alibrary and are available from the authors.

5 Appliations

5.1 Binomial models

In this setion, we analyze the damaged arrots dataset. It is taken from Phelps

(1982) and is disussed by Williams (1987) and used in MCullagh and Nelder

(1989) to illustratetehniques for heking for isolateddepartures fromthe model,

beauseofthe presene of anoutlierinthe y-spae. Thedata are issuedfroma soil

experimentandgivetheproportionofarrotsshowing insetdamageina trialwith

three bloks and eight dose levels of insetiide. The logarithm of the dose ranges

from 1.52 to2.36 in anequally spaedgrid. The sample size is 24.

We assumea binomialmodel with logitlink

log

m

=

0 +

1

log(dose )+

2

blok2+

3

blok1 ;

where = E[Y℄ = E[number of damaged arrots℄, bloki; i = 1;2 are indiators

variablestaking the value of 1if measures are taken inblok i and 0otherwise.

Dierent tehniques |plot of devianeresiduals, plot of Pearson residuals and

Cook's distane | show that there is a single large outlier, namely observation 14

(dose level 6and blok2). On the other hand, this observation does not appear

as aleverage point beause itsh

i

value issmall.

In the followingwe ompare the lassialand the robust analysis. The lassial

estimates are obtained by maximum likelihood. The robust estimates are based on

(17)

i

tuning onstantof the Huber funtionis hosen tobe 1:2, whih isobtained by the

proedure desribed at the end of Setion 3.2 with

0

= 0:02, = 0:04 and

=0:1145.

[Table 1 about here.℄

Table 1 shows the eet of observation 14: it seems to inrease the value of

2

orresponding to the variable blok2 . The robust tehnique automatially takes

into aount the partiularity of observation 14: in the estimation proedure, most

oftheobservationsreeiveaweightequalto1,oratleastgreaterthan0.70,whereas

observation 14 reeivesa weight equalto0.26.

Also, the eet of observation 14 is lear on the value of the deviane. This

seems dangerous beause the deviane is used for assessing the signiane of the

variables used for modeling the response. This is onrmed by Table 2, where the

results of a lassialand robust stepwise proedure are ompared.

[Table 2 about here.℄

Thelassialanalysisshows that allthe variables,addedsequentially,are highly

signiant on the basis of their deviane value. Model seletion via a robust step-

wise proedure based on the Huber quasi-deviane dened by equation (8) with

(y

i

;

i )=

(r

i )=V

1=2

(

i

)and =1:2 shows that the variableblok1 is not signif-

iant.

5.2 Poisson models

We use a datasetissued from a study of the diversity of arboreal marsupials in the

Montane ash forest (Australia). This dataset was olleted in view of the man-

(18)

wood prodution, into aount. The study is fully desribed in Lindenmayer etal.

(1990, 1991). The numberof dierent speies of arboreal marsupials (possum) was

observed on 151 dierent 3ha sites with uniform vegetation. For eah site the fol-

lowingmeasures were reorded: numberof shrubs,numberof utstumps frompast

logging operations, number of stags (hollow-bearing trees), a bark index reeting

the quantity ofdeortiatingbark, ahabitatsore indiatingthe suitabilityof nest-

ing and foraging habitat for Leadbeater's possum, the basal area of aaia speies,

the speies of eualypt withthe greatest stand basal area (Eualyptus regnans,Eu-

alyptus delegatensis,Eualyptus nitens),and the aspet ofthe site. Theproblemis

to modelthe relationship between diversity and these other variables.

Weisberg and Welsh (1993) used these data to investigate by nonparametri

tehniques the shape of the link funtion. Their onlusion was that the anonial

linktsthisdatasetwell. Therefore,weonsideraPoissongeneralizedlinearmodels

with log-linkto desribediversity asa funtion of

shrubs +stumps +stags +bark+habitat +aaia +eualyptus +aspet,

where eualyptus is a fator with three levels and aspet is a fator with four

levels. Hene, the model involvesthe estimation of aparameter of dimension 12.

The robust estimation of parameters via a Mallows quasi-likelihood estimator

dened by (7) with tuning onstant = 1:6 and weights w(x

i ) =

p

1 h

i gives

the result of Table 3. In the same table, we report within parentheses the results

obtained by meansof lassialquasi-likelihood. It has tobenotiedthat 4observa-

tions, namely observations 59, 110, 133, 139, reeive a weight with respet to their

residualbetween 0.68and0.88. This showsthatthese4observationsare potentially

inuential not only for the estimation proedure, but also for inferene and model

(19)

many explanatoryvariablesdonot enter signiantly inthemodel,and aredution

of the numberof variablesin the modelis neessary.

[Table 3 about here.℄

We applied a forward stepwise proedure based on quasi-likelihood and on the

robust version of it. Starting from the null modelwhere only the onstant term is

tted, wetestedwhetheritisappropriatetoadd thenext explanatoryvariable. We

hose to retain a variable if the p-value was smaller than 5%. Table 4 shows the

p-value obtained ateahstep of the proedure. Boldp-valuesindiatethe variables

whihhave been retained inthe model.

[Table 4 about here.℄

Asone ansee fromthe table,themodelshosen bythe lassialandthe robust

analysis are essentially the same, even if the p-values involved are sometimes quite

dierent. The variable habitat is at the border of the deision rule and external

onsideration may be used to judge if it has to be kept in the model. It has to be

notied that the orrelationbetween habitat and aaia ishigh (0.54) and one of

these variablesan be dropped.

[Table 5 about here.℄

In the robust nal t, observations 59, 110, 133, 139 reeive a weights with

respet to their residuals between 0.68 and 0.86, as it was already the ase in the

full model. On the other hand, with respet to the inuene of position, the only

observations reeivingaweightless than 0.9isthe rst one. Therewere threeother

(20)

whole set of variables. Probably, this outlyingness was due to some explanatory

variables,whih were not retained in the nal model.

Forthenal modelaspresentedinTable5,weinvestigatethe sensitivityof Mal-

lows quasi-likelihood tests ompared to lassial tests by onsidering the following

proedure: weletthe response ofthe observationreeiving the lowest weight inthe

estimationofthenalmodel,namelyobservation110,spantherangeofvaluesfrom

0 to 6. These values over the range of the response in the sample. In eah situa-

tion, we test the null hypothesis that the oeÆient orresponding to the variable

habitat isequal to0. The p-values of these tests are represented in Figure1.

[Figure1 about here.℄

The p-value of the robust test ( = 1:6) is stable, irrespetive to the response

valuetaken by observation 110. Thisp-value rangesfrom2:6to3:3%. Onthe other

hand, thep-value ofthe lassialtest (=1),varies muhmore: from2:3 to6:5%,

givingrisetoadierentmodelhoie,ifthedeisionruleisset at5%. Moreover,by

lettingobservation110 takearbitrarilylargevalues,the p-valueof therobust testis

bounded, whereas the p-value of the lassialtest ontinues toinrease.

6 Conlusion

Inthispaperweproposedanaturallassofrobusttestingproeduresforgeneralized

linearmodels. Theyare avaluableomplementtolassialtehniquesand aremore

reliable in the presene of outlying points and other deviations from the assumed

model. Further researh inludes the extension of these proedures to generalized

estimatingequationsand tononparametri modelslikegeneralizedadditivemodels.

(21)

Wederive the onstant

a()= 1

n n

X

i=1 E

Y

i

i

V 1=2

(

i )

1

V 1=2

(

i )

0

i

;

forbinomialandPoissonmodels,whihreduestotheomputationofE

Yi i

V 1=2

(

i )

.

Letus denej

1

=b

i V

1=2

(

i

),and j

2

=b

i +V

1=2

(

i ).

The binomial model states that Y

i

B(m

i

;p

i

), so that E [Y

i

℄ =

i

= m

i p

i and

V[Y

i

℄=

i m

i

i

m

i

. Then we have

E

Y

i

i

V 1=2

(

i )

= 1

X

j= 1

j

i

V 1=2

(

i )

P(Y

i

=j)1I

fj2[0;m

i

℄g

= P(Y

i j

2

+1) P(Y

i j

1 )

+

i

V 1=2

(

i )

P(j

1

~

Y

i j

2

1) P(j

1

+1Y

i j

2 )

;

with

~

Y

i

B(m

i 1;p

i ).

ThePoissonmodelstates thatY

i

P(

i

),andhene E[Y

i

℄=V(

i )=

i

. Then,

E

Y

i

i

V 1=2

(

i )

= 1

X

j= 1

j

i

V 1=2

(

i )

P(Y

i

=j)1I

fj0g

= P(Y

i j

2

+1) P(Y

i j

1 )

+

i

V 1=2

(

i )

P(Y

i

=j

1

) P(Y

i

=j

2 )

:

B Asymptoti variane

Werst determinethe matrixQ(

;F)inthepartiularsituationofMallowsquasi-

likelihoodestimator. Using itsdenition, we have

Q(

;F) = E

(r)w(x) 1

V 1=2

()

0

a()

(r)w(x) 1

V 1=2

()

0

a()

T

= 1

n X

T

AX a()a()

T

;

(22)

whereA isthediagonalmatrixwithelementsa

i

=E[

(r

i )

2

℄w 2

(x

i )

1

V(

i )

(

i )

2

,sine

0

i

=

i

i

x

i

. Inthe same manner, writings(y;x;)=

logh(y

i jx

i

;

i

), wederive

the expression of M(

;F),

M(

;F) = E

(r)w(x) 1

V 1=2

()

0

a()

s(y;x; ) T

= 1

n n

X

i=1 E

(r

i )

i

logh(y

i jx

i

;

i )

1

V 1=2

(

i )

w(x

i )

0

i

0T

i

= 1

n X

T

BX;

whereBisthediagonalmatrixwithelementsb

i

=E[

(r

i )

i

logh(y

i jx

i

;

i )℄

1

V 1=2

(i) w(x

i )(

i

i )

2

.

So, the determinationof the asymptoti variane of a Mallows quasi-likelihood

estimatorinvolvesthe omputation of the diagonalterms of the matriesA and B.

Wedeterminethethreeterms:

i g

1

(

i ),E[

(r

i )

2

℄,andE[

(r

i )

i

logh(y

i jx

i

;

i )℄

for binomialand Poisson models.

Forthe binomialmodelwith logitlink

i g

1

(

i )=m

i

exp (

i )

(1+exp (

i ))

2

;

and

E

2

Y

i

i

V 1=2

(

i )

= 2

P(Y j

1

)+P(Y j

2 +1)

+

+ 1

V(

i )

h

2

i m

i (m

i

1)P(j

1

1

~

~

Y j

2

2)+

+ (

i 2

2

i )P(j

1

~

Y j

2

1)+

+

2

i P(j

1

+1Y j

2 )

i

;

with Y B(m

i

;

i ),

~

Y B(m

i 1;

i )and

~

~

Y B(m

i 2;

i ).

(23)

i

logh(y

i jx

i

;

i

) being equal to

V(

i )

, we have

E

(r

i )

i

logh(y

i jx

i

;

i )

=E

Y

i

i

V 1=2

(

i )

Y

i

i

V(

i )

=

=

i

V(

i )

h

P(Y

i j

1 ) P(

~

Y

i j

1

1)+P(

~

Y

i j

2

) P(Y

i j

2 +1)

i

+

+ 1

V 3=2

(

i )

h

2

i m

i (m

i

1)P(j

1

1

~

~

Y

i j

2 2)

+(

i 2

2

i )P(j

1

~

Y

i j

2

1)+ 2

i P(j

1

+1Y

i j

2 )

i

:

For the Poisson model, we use the log-link

i

= g(

i

) = log(

i

) whih leads to

i g

1

(

i

)=exp (

i

). We alsohave

E

2

Y

i

i

V 1=2

(

i )

= 2

P(Y

i j

1

)+P(Y

i j

2 +1)

+

+ 1

V() h

2

P(j

1

1Y

i j

2

2)+( 2 2

)P(j

1 Y

i j

2 1)

+

2

P(j

1

+1Y

i j

2 )

i

:

The sore funtionequals

i

logh(y

i jx

i

;

i )=

Y

i

i

i

= Y

i

i

V(

i )

,so that

E

(r

i )

i

logh(y

i jx

i

;

i )

=E

Y

i

i

V 1=2

(

i )

Y

i

i

V(

i )

=

= P(Y

i

=j

1

)+P(Y

i

=j

2 )

+

+

1

V 3=2

(

i )

2

i P(Y

i

=j

1

1) P(Y

i

=j

1

) P(Y

i

=j

2

1)+P(Y

i

=j

2 )

+

+

i P(j

1 Y

i j

2 1):

C Conditions for Robust Quasi-deviane Tests

[C1℄: DenotebyD

n

theset ofallsamplepointsz

i

,i=1;:::;n forwhihtheseond-

order derivatives 2

Q

M (z

i

;)=

j

k

, i = 1;:::;n; j;k = 1;:::;p exist and

are ontinuous funtions of . It is assumed that lim

n!1 P

(D

n )=1.

(24)

n

1

jk

1

leastupperboundandby

jk (z;

1

;Æ)thegreatestlowerboundof 2

Q

M

(z;)=

j

k ,

with respet to in the interval jj

1

jjÆ.

Moreover, assume that for any sequene fÆ

n

g for whih lim

n!1 Æ

n

=0,

lim

n!1 E

jk

(z;;Æ

n )

= lim

n!1 E

jk

(z;;Æ

n )

=E

2

Q

M

(z;)=

j

k

;

and that there exists a positive suh that the expetations E

2

jk

(z;;Æ)

and E

2

jk

(z;;Æ)

are bounded funtions of and Æ for all and Æ<.

Theseonditionsare obtained by replaing logf(z;)by Q

M

(z;)inthe orre-

sponding lassialresults for the likelihoodratio test; f. Rao (1973),Wald(1943).

D Proof of Proposition 1

First,wederivetheasymptotiequivalentquadratiformof

QM

. Theprooffollows

the same lines asin the lassialtheory.

Therststep oftheproofonsistsinapproximating

QM

underonditions[C1℄-

[C2℄ by

p

n (

^

_

) T

M( ;F) p

n(

^

_

); (14)

viaaTaylorexpansionandbymakinguseofSlutsky'stheorem. Then,underH

0 and

by the asymptoti properties of M-estimators whih hold under onditions (A.1)-

(A.9) of Heritier and Ronhetti (1994), the following distribution equality holds

asymptotially

p

n(

^

_

) D

p

n M 1

( ;F)

~

M +

( ;F)

L

n

; (15)

(25)

where L

n

= 1

n P

n

i=1 (y

i

;

i

) is suh that p

n L

n

N 0;Q( ;F) . Putting (15)

in (14), and taking into aount the symmetry of M( ;F), we nally have, as

n !1,

QM D

nL T

n

C( ;F)L

n

: (16)

(16) an berewritten as

QM D

nR

T

n(2)

M( ;F)

22:1 R

n(2)

;

where M( ;F)

22:1

= M( ;F)

22

M( ;F)

12

M( ;F) 1

11

M( ;F)

12 , and

p

nR

n is

distributed aording to N 0;M 1

( ;F)Q( ;F)M 1

( ;F)

.

Finally, from(16) we onlude that

QM

q

X

i=1 d

i N

2

i

;

where d

i

are the q positive eigenvalues of Q( ;F)C( ;F) and N

1

;:::;N

q

are in-

dependent standard normal variables. Thus, the distribution of

QM

is a linear

ombinationof 2

randomvariableswith 1 degree of freedom.

E Proof of Proposition 2

Byusing (11)andbystandard resultsonthedistributionofquadratiformsinnor-

mal variables, we an say that the statisti nU T

n AU

n

is asymptotially distributed

as P

q

i=1 d

i

2

1 (

2

i

), with () = (

1

();:::;

q ())

T

= p

nPU(F

;n

). Notie that the

distribution depends onlyon the 2

i

()(see Johnson and Kotz (1970), Chapter 29).

Moreover, up to O(1=n), we have that (F

;n

) =1 H

d1;:::;dq (

1 0

;()), with

()=diag (()() T

)=ndiag PU(F

;n )U(F

;n )

T

P T

.

(26)

d

1

;:::;d

q 1

0

(F

;n

)

0

=b() b(0)=b 0

(0)+ 1

2

2

b 00

(0)+o(

2

):

But

b 0

(0)= T

=0

=2n T

diag

P h

U(F

;n )

i

=0 U(F

0 )P

T

=0;

beause U(F

0 )=0.

We alsohave that

b 00

(0) = T

2

2

=0

= T

2ndiag

P h

U(F

;n )

U(F

;n )

T i

=0 P

T

= 2

T

diag

P

Z

IF(z;U;F

0

)dG(z)

Z

IF(z;U;F

0

)dG(z)

T

P T

;

byusingagainthefatthatU(F

0

)=0andbeause

U(F

;n )

=0

= R

IF(z;U;F

0 )

1

p

n dG(z)

(see Hampel etal., 1986,p. 83). This ompletes the proof.

Aknowledgments

EvaCantoni(Eva.Cantonimetri.unige. h)isma^tre-assistante andElvezioRon-

hetti (Elvezio.Ronhettimetri.un ige. h) is Professor, Department of Eono-

metris, CH-1211 Geneva 4, Switzerland. The authors would like to thank the

Editor, the Assoiate Editor, the referees, P. Rousseeuw and M.-P. Vitoria-Feser

for theirvaluable suggestions,and A.Welshfor providingthe data analyzedinSe-

tion 5.2. Thehospitality ofStanford(E.C.) and MIT(E.R.)duringtherevision of

the manusript isgratefully aknowledged.

(27)

Bednarski, T. (1993), \Frehet Dierentiability of Statistial Funtionals and Im-

pliations to Robust Statistis," in New Diretions in Statistial Data Analy-

sis and Robustness, eds. Morgenthaler, S., Ronhetti, E., and Stahel, W. A.,

Basel/Cambridge, MA:Birkhaeuser, pp. 25{34.

Carroll, R.J. and Welsh, A. H. (1988), \A Noteon Asymmetry and Robustness in

Linear Regression," TheAmerian Statistiian, 42,285{287.

Clarke, B. R. (1986), \Nonsmooth Analysis and Frehet Dierentiability of M-

funtionals," Probability Theory and Related Fields,73,197{209.

Davies,R.B.(1980),\[AlgorithmAS155℄TheDistributionofaLinearCombination

of 2

RandomVariables(ASR53: 84V33P366-369),"AppliedStatistis, 29,323{

333.

Farebrother, R. W. (1990), \[Algorithm AS 256℄ The Distribution of a Quadrati

Form inNormal Variables,"Applied Statistis, 39,294{309.

Hampel, F. R. (1974), \The Inuene Curve and Its Role in Robust Estimation,"

Journal of the Amerian StatistialAssoiation, 69,383{393.

Hampel,F.R.,Ronhetti,E.M.,Rousseeuw, P.J.,andStahel,W.A.(1986),Robust

Statistis: The Approah Based on Inuene Funtions, New York: Wiley.

Hanfelt,J.J.and Liang,K.-Y. (1995),\Approximate LikelihoodRatiosforGeneral

EstimatingFuntions," Biometrika,82, 461{477.

(28)

Minimax Versus Loally Linear Estimation," The Annals of Statistis, 21, 314{

337.

He, X., Simpson, D. G., and Portnoy, S. L. (1990), \Breakdown Robustness of

Tests," Journal of the Amerian StatistialAssoiation, 85, 446{452.

Heritier, S. and Ronhetti, E. (1994), \RobustBounded-inuene Tests inGeneral

ParametriModels,"JournaloftheAmerianStatistialAssoiation,89,897{904.

Heyde, C. C. (1997), Quasi-Likelihood and its Appliation, Berlin/New York:

Springer-Verlag.

Huber, P. J. (1967), \The Behavior of the Maximum Likelihood Estimates under

Nonstandard Conditions," in Proeedings of the Fifth Berkeley Symposium on

Mathematial Statistis and Probability, pp. 221{233.

| (1981),Robust Statistis, New York: Wiley.

Imhof, J. P. (1961), \Computing the Distribution of Quadrati Forms in Normal

Variables,"Biometrika,48,352{363.

Johnson, N. L. and Kotz, S. (1970), Continuous Univariate Distributions, vol. 2,

Boston; Geneva, IL: Houghton-Miin.

Kunsh,H. R.,Stefanski, L.A., andCarroll,R. J.(1989), \ConditionallyUnbiased

Bounded-inuene Estimation in General Regression Models, With Appliations

to Generalized Linear Models," Journal of the Amerian Statistial Assoiation,

84,460{466.

(29)

A. P. (1991), \The Conservation of Arboreal Marsupials in the Montane Ash

Forestsofthe CentralHighlandsofVitoria,South-EastAustralia: III.TheHabi-

tatRequirementsofLeadbeater'sPossumGymnobelideusleadbeateriandModels

oftheDiversityandAbundaneofArborealMarsupials,"BiologialConservation,

56,295{315.

Lindenmayer, D. B., Cunningham, R. B., Tanton, M. T., Smith, A. P., and Nix,

H. A. (1990), \The Conservation of Arboreal Marsupials in the Montane Ash

Forests of the Vitoria, South-East Australia, I. Fators Inuening the Ou-

pany of Trees with Hollows," BiologialConservation, 54,111{131.

Markatou,M.,Basu,A.,andLindsay,B.G.(1998),\WeightedLikelihoodEquations

With Bootstrap Root Searh," Journal of the Amerian Statistial Assoiation,

93,740{750.

Markatou, M. and He, X. (1994), \Bounded Inuene and High Breakdown Point

Testing Proedures in Linear Models," Journal of the Amerian Statistial Asso-

iation, 89,543{549.

MCullagh, P.and Nelder,J.A.(1989),GeneralizedLinear Models,London: Chap-

man &Hall, 2nd ed.

Morgenthaler, S. (1992), \Least-absolute-deviations Fits for Generalized Linear

Models," Biometrika,79, 747{754.

Pearson, E. S. (1959), \Note on the Distribution of Non-entral 2

," Biometrika,

46,364.

(30)

response Relationships in Insetiide Evaluation Field Trials," in Leture Notes

in Statistis, No. 14. GLIM.82: Proeedings of the International Conferene on

Generalized Linear Models, ed. Gilhrist, R.,Berlin/New York: Springer-Verlag.

Pregibon, D. (1982), \Resistant Fits for Some Commonly Used Logisti Models

WithMedial Appliations," Biometris,38, 485{498.

Preisser, J. S. and Qaqish, B. F. (1999), \Robust Regression for Clustered Data

with Appliations toBinary Regression," Biometris, 55,574{579.

Rao, C. R. (1973),Linear StatistialInferene, New York: Wiley, 2nd ed.

Ronhetti, E. (1997), \Robustness Aspets of Model Choie," Statistia Sinia, 7,

327{338.

Ronhetti,E. andStaudte, R.G.(1994),\ARobustVersionofMallows'C

p

,"Jour-

nal of the Amerian StatistialAssoiation, 89,550{559.

Ronhetti, E. and Trojani, F. (2001), \Robust Inferene with GMM Estimators,"

Journal of Eonometris, 101, 36{79.

Rousseeuw, P.J.and Leroy,A. M.(1987),Robust RegressionandOutlier Detetion,

New York: Wiley.

Rukstuhl, A.F.and Welsh, A.H.(1999),\RobustFittingofthe BinomialModel,"

Manusript.

Sommer,S. and Huggins,R. (1996),\VariableSeletion using the WaldTest and a

Robust C

p

," Applied Statistis, 45, 15{29.

(31)

York: Wiley.

Stefanski, L.A., Carroll,R.J.,and Ruppert, D. (1986),\OptimallyBoundedSore

Funtions for Generalized Linear Models With Appliations to Logisti Regres-

sion," Biometrika,73,413{424.

Wald, A. (1943), \Test for Statistial Hypotheses Conerning Several Parameters

whentheNumberofObservationsisLarge."Transations of theAmerianMath-

ematialSoiety, 54,426{482.

Wedderburn, R. W. M. (1974), \Quasi-likelihood Funtions, Generalized Linear

Models, and the Gauss-NewtonMethod," Biometrika, 61,439{447.

Weisberg, S. and Welsh, A. H. (1993), \Missing Links," in New Diretions in Sta-

tistial Data Analysis and Robustness, eds. Morgenthaler, S., Ronhetti, E., and

Stahel,W. A., Basel/Cambridge, MA: Birkhaeuser,pp. 275{284.

Williams, D.A. (1987),\GeneralizedLinearModelDiagnostisUsingtheDeviane

and Single Case Deletions,"Applied Statistis, 36, 181{191.

(32)

y110

pvalue

0 1 2 3 4 5 6

0.03 0.04 0.05 0.06

c=1.6 c=Inf

Figure 1: Sensitivity urves of the p-value for Mallows quasi-likelihood tests with

=1:6 (solid line) and =1 (dashed line).

(33)

Interept 1.480 (0.66) 1.939 (0.70)

logdose -1.817 (0.34) -2.049 (0.37)

blok2 0.843 (0.23) 0.685 (0.24)

blok1 0.542 (0.23) 0.450 (0.24)

Table 1: Estimationof by maximumlikelihoodandbythe Huberquasi-likelihood

estimatorwith =1:2. Standard errors are indiated within parentheses.

(34)

NULL 83.34 60.46

logdose 54.73 (0.000) 39.94 (0.000)

blok2 45.59 (0.003) 35.21 (0.017)

blok1 39.98 (0.018) 32.74 (0.085)

Table2: ResidualdevianeandresidualHuberquasi-devianewith=1:2. p-values

are indiated withinparentheses.

(35)

Interept -0.8978 (-0.947) 0.2682 (0.265)

shrubs 0.0099 (0.012) 0.0222 (0.022)

stumps -0.2514 (-0.272) 0.2876 (0.286)

stags 0.0402 (0.040) 0.0113 (0.011)

bark 0.0400 (0.040) 0.0145 (0.014)

aaia 0.0178 (0.018) 0.0107 (0.011)

habitat 0.0714 (0.072) 0.0385 (0.038)

eualyptus nitens 0 (0) - (-)

eualyptus regnans -0.020 (-0.015) 0.1938 (0.192)

eualyptus delegatensis 0.127 (0.115) 0.2738 (0.272)

aspet NW-NE 0 (0) - (-)

aspet NW-SE 0.0601 (0.067) 0.1913 (0.190)

aspet SE-SW 0.0949 (0.117) 0.1920 (0.190)

aspet SW-NW -0.5079 (-0.489) 0.2505 (0.247)

Table 3: CoeÆients estimation and orresponding standard errors for the Poisson

modelwith log-linkof the possum dataset.

(36)

shrubs 0.0871 0.3642

stumps 0.0646 0.2988

stags 0.0000 0.0000

bark 0.0035 0.0039

aaia 0.0002 0.0009

habitat 0.0500 0.0443

eualyptus regnans 0.8754 0.8030

eualyptus delegatensis 0.8591 0.8074

eualyptus nitens 0.5681 0.6461

aspet NW-NE 0.8336 0.9100

aspet NW-SE 0.2612 0.3462

aspet SE-SW 0.1996 0.1646

aspet SW-NW 0.0012 0.0023

Table4: p-valuesofaforwardstepwiseproedureforthePoissonmodelwithlog-link

of the possum dataset.

(37)

Interept -0.7981 (-0.8213) 0.2030 (0.2000)

stags 0.0406 (0.0410) 0.0104 (0.0103)

bark 0.0410 (0.0406) 0.0126 (0.0125)

habitat 0.0143 (0.0136) 0.0098 (0.0097)

aaia 0.0776 (0.0782) 0.0371 (0.0367)

aspet SW-NW -0.6044 (-0.5968) 0.2121 (0.2086)

Table 5: CoeÆients estimation and orresponding standard errors for the nal

Poisson model with log-linkof the possum dataset.

Références

Documents relatifs

While in the framework of robust statistics, robustness of an estimator is measured in terms of the breakdown point (the asymptotic minimum proportion of points which cause

The robustness properties of the approximate ML estimator are analysed by means of the IF. For the simulations we will consider the case of a mixture of normal and binary variables

Chatterjee and Bose [2005] and, in particular, the Generalized Cluster Bootstrap (GCB) for Linear Mixed Models (LMM) [Field et al., 2010], we formu- late a bootstrapping

In spite of their second order accuracy, standard classical inference based on saddlepoint technique can be drastically affected by small deviations from the underlying assumptions

In the presence of small deviations from the model (Figure 2), the χ 2 approximation of the classical test is extremely inaccurate (even for n = 100), its saddlepoint version and

Robust and Accurate Inference for Generalized Linear Models:..

By starting from a natural class of robust estimators for generalized linear models based on the notion of quasi-likelihood, we define robust deviances that can be used for

In spite of their second order accuracy, standard classical inference based on saddlepoint technique can be drastically affected by small deviations from the underlying assumptions