
We showed that the Bayesian approach estimated population size better than linear regression, and that the two approaches gave comparable estimates of the sampling rate. Using data collected during the September 2011 campaign, we simultaneously estimated the population size and the sampling rate, treating vegetation type and time of sampling as fixed effects, and we introduced a random effect to model random variation in the rate linked to the observation unit.

We estimated a sampling rate between 33.9% and 47.4% for shrubs and between 53.6% and 66.7% for dead leaves.

journal homepage: www.elsevier.com/locate/ecolmodel

Bayesian estimation of abundance based on removal sampling under weak assumption of closed population with catchability depending on environmental conditions. Application to tick abundance

S. Bord a,∗, P. Druilhet b, P. Gasqui a, D. Abrial a, G. Vourc’h a

a INRA, UR346 Unité Epidémiologie Animale, Centre INRA de Theix, F-63122 Saint Genès Champanelle, France

b Laboratoire de mathématiques, UMR CNRS 6620 Campus des Cézeaux, B.P. 80026, F-63171 Aubière cedex, France

article info

Article history:
Received 8 July 2013
Received in revised form 15 November 2013
Accepted 1 December 2013
Available online 28 December 2013

Keywords:
Abundance
Sampling-rate
Removal sampling
Sampling design
Hierarchical Bayesian approach
Hayne method

abstract

The estimation of animal abundance is essential to understand population dynamics, species interactions and disease patterns in populations. Estimations of relative abundance classically are based on a single observation of several sites. In this case, the mapping of abundance assumes that the probability of detecting an individual, hence the sampling rate, remains constant across the observed sites. In practice, however, this assumption is often not satisfied as the sampling rate may fluctuate between sites due to random fluctuations and/or fluctuations associated with the sampling process, notably associated with the characteristics of the site. It is therefore important to account for variations in detection probability. Using a removal sampling design, we studied the performance of a Bayesian approach to estimate both sampling rates and abundance under the assumption of a closed population. The assumption of a closed population often is weakened when the number of successive samplings is large. The number of samplings has to be limited and optimal. We therefore examined the minimal number of successive samplings needed to achieve sufficient statistical accuracy while respecting underlying model assumptions. Using the same simulations, we also compared the performance of the Bayesian approach to the performance of the frequentist Hayne method based on linear regression. We show that the Bayesian approach proposed gives generally better estimations of population size than the Hayne method. The two methods give approximately the same results for the estimation of sampling rate. We then studied the variability of detection probability of Ixodes ricinus ticks sampled under several environmental conditions by using a hierarchical Bayesian model with a random effect. The estimated sampling rate $\hat{\tau}_c$ varied between 33.9% and 47.4% for shrubs and 53.6% and 66.7% for dead leaves. The variability of the sampling rate due to the site decreased when the number of successive samplings considered in the model increased. The variability was lower in dead leaves than shrubs. This approach could be used routinely for ecological or epidemiological studies of ticks and species with comparable life histories.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

The estimation of animal abundance is essential in ecology to understand fundamental processes, such as population dynamics and species interactions, as well as in epidemiology to understand and generate disease patterns in populations (Anderson, 1991). In the majority of biological systems, relevant indicators of abundance are based on count point surveys (Alldredge et al., 2007) obtained using convenient and calibrated sampling methods (Anderson, 2001; Pollock et al., 2002). As a part of the population is often not observable, the probability of detecting an individual

Royle, 2010; Pellet and Schmidt, 2005). Consequently, indicators calculated in this way only give an index of relative abundance. These indicators of abundance are implicitly based on the assumption that the sampling rate is constant from site to site (Williams et al., 2002; Pollock et al., 2002; Royle and Dorazio, 2006). However, the sampling rate may depend on environmental conditions, such as weather, season, sampler and habitats. If this is the case, considering the sampling rate to be constant leads to confusion between the variability of the rate and the variability of abundance (Thompson et al., 1998; MacKenzie and Kendall, 2002). Therefore, an effort needs to be made to estimate both the abundance and the

intensive (Dodd and Dorazio, 2004) and may modify the observed site when the number of successive samplings is high. The choice of a given protocol (CMR or RS) and its ease of implementation depend on the species studied. Although available, CMR and RS methods are rarely used for certain species. One such group is ticks, which are the most important vectors of human and animal diseases after mosquitoes (Parola and Raoult, 2001). The classical index of tick abundance is the number of tick nymphs collected by dragging a piece of cloth once over the vegetation of a delimited area, generally 10 m² (Vassallo et al., 2000), in a selection of sites. Host-seeking nymphs, i.e. those waiting for a host on the top of the vegetation, are collected by the drag. The numbers of nymphs collected on the different sites are then compared. The drag method is distinguished from RS methods in that the cloth is dragged only once over each site. To our knowledge, the sampling rate of the drag sampling method has been studied little. Only one study (Talleklint-Eisen and Lane, 2000) has used an RS design to estimate the abundance and the sampling rate of the drag method. In this study, 17 successive samplings were conducted over 23 days. This protocol could not guarantee that the population remained closed over the sampling period. The authors estimated the sampling rate to be 5.9% using the Hayne method (Hayne, 1949).

To estimate parameters, the Hayne method makes a linear regression of the number of successive captures on the cumulative number of captures. The sampling rate and the population size are estimated respectively as the slope of the regression line and as the intersection point between the horizontal axis and regression line. However, the Hayne method is known to produce poor estimations of population size (White et al., 1982), especially when the sampling rate is potentially low (less than 10%) and variable. Moreover, this method does not allow covariate effects to be taken into account, nor does it provide confidence intervals for estimates.
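The regression described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in the cited studies. It assumes that each catch is regressed on the cumulative catch taken before that sampling, so that the fitted slope is the negative of the sampling rate and the x-intercept is the population size. The function name `hayne_estimate` and the example catches are hypothetical.

```python
def hayne_estimate(catches):
    """Hayne-style regression estimate from successive removal catches.

    Regresses each catch X_k on the cumulative catch taken before sampling k.
    Under a removal model E[X_k] = rate * (N0 - cumulative), so the fitted
    slope is -rate and the line crosses the horizontal axis at N0.
    """
    cumulative = []
    total = 0
    for x in catches:
        cumulative.append(total)  # individuals already removed before this pass
        total += x
    n = len(catches)
    mean_c = sum(cumulative) / n
    mean_x = sum(catches) / n
    sxx = sum((c - mean_c) ** 2 for c in cumulative)
    sxy = sum((c - mean_c) * (x - mean_x) for c, x in zip(cumulative, catches))
    slope = sxy / sxx  # fails if every catch is zero (no spread in cumulative)
    intercept = mean_x - slope * mean_c
    rate = -slope
    n0 = -intercept / slope  # x-intercept of the fitted line
    return rate, n0

# Noise-free illustration (made-up numbers): N0 = 100 and rate = 0.5
# give expected catches 50, 25, 12.5, which lie exactly on a line.
rate, n0 = hayne_estimate([50, 25, 12.5])
```

With real, noisy counts the fitted slope can be close to zero or even positive, in which case the intercept-based population estimate becomes unstable or meaningless.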

In this paper, a hierarchical Bayesian approach was used to estimate both the sampling rate and the population size. This approach allows the inclusion of prior knowledge and provides posterior distributions of parameter estimates. Moreover, because it is hierarchical, one can take into account parameters which are either observable or not observable, and which are located at different scales such as the area sampled and the site of sampling (Gelman and Hill, 2006; Cressie et al., 2009). First, we studied the

per-sampling and $N_k$ the remaining population after the $k$th sampling, where $N_k = N_{k-1} - X_k$ for $k \in \{1, \ldots, K\}$. Furthermore, each individual of the closed population was assumed to be captured independently with the same probability of capture $\tau$ (Moran, 1951; Zippin, 1956). Hence, we assumed that $X_k$ followed a binomial distribution with population size $N_{k-1}$ and probability of capture $\tau$ (Eq. (1)):

$$(X_k \mid N_{k-1}, \tau) \sim \mathcal{B}(N_{k-1}, \tau), \quad \text{where } N_k = N_{k-1} - X_k. \qquad (1)$$

The capture probability $\tau$ was considered as independent and the same for all individuals, so we considered that it was equal to the sampling rate, i.e. the percent of captured individuals in the population.
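The binomial removal model of Eq. (1) can be simulated directly, which may help to see how the catches shrink as the population is depleted. The sketch below is an illustration only. The function name `simulate_removal`, the seed and the parameter values are hypothetical, and each binomial draw is built from independent Bernoulli trials so that only the standard library is needed.

```python
import random

def simulate_removal(n0, rate, n_samplings, rng):
    """Simulate catches X_1..X_K with X_k ~ Binomial(N_{k-1}, rate)."""
    remaining = n0
    catches = []
    for _sampling in range(n_samplings):
        # Each remaining individual is caught independently with probability `rate`.
        caught = sum(1 for _individual in range(remaining) if rng.random() < rate)
        catches.append(caught)
        remaining -= caught  # N_k = N_{k-1} - X_k
    return catches

rng = random.Random(2011)  # arbitrary seed, for reproducibility only
catches = simulate_removal(n0=100, rate=0.5, n_samplings=5, rng=rng)
```

Because captured individuals are removed, the catches can never sum to more than the initial population size.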

2.2. Hierarchical Bayesian models

A first HBM (HBM1) assumed that the population size $N_{0s}$ and the sampling rate $\tau_s$ were specific to each site $s$. A second HBM (HBM2) assumed that the sampling rate $\tau_s$ of a given site $s$ was associated with both the effect of sampling conditions $c$ (considered to be a fixed effect) and a small variation due to the site sampled (considered to be a random effect). The logit transformation of the sampling rate $\tau_s$, denoted $\mathrm{logit}(\tau_s)$, was used. The $\mathrm{logit}(\tau_s)$ was decomposed as the sum of the logit transformation of the sampling rate $\tau_c$ under the sampling conditions $c$ and a random effect $\varepsilon_s$ (Eq. (2)):

$$\mathrm{logit}(\tau_s) = \mathrm{logit}(\tau_c) + \varepsilon_s. \qquad (2)$$

The random effect $\varepsilon_s$ was introduced to add a fluctuation of the sampling rate $\tau_s$ due to the observed site $s$. The range of variation of the random effect $\varepsilon_s$, denoted by $\sigma^2_c$, was considered to be specific to each sampling condition $c$, where $\varepsilon_s$ was assumed to follow a normal distribution with zero mean and variance $\sigma^2_c$, which depended on the sampling conditions $c$. The HBM1 and HBM2 models described above are summarised in Figs. 1 and 2 by Directed Acyclic Graphs (DAGs) (Thulasiraman and Swamy, 1992; Clark and Gelfand, 2006). These DAGs represent relationships (lines) between the observed data and the unknown parameters or hypotheses of the model (nodes). The lines represent the relations and the hierarchy between nodes. The nodes symbolise the data observed $X_k$, parameters to estimate $N_{0s}$, $\tau_s$, $\tau_c$, prior distribution of parameters to estimate for $N_{0s}$, $\tau_s$, $\tau_c$ and $\varepsilon_s$ and the distribution of hyper-parameter $\sigma^2_c$. If $\varepsilon_s = 0$ and $\tau_c$ was specific to each site $s$, then the DAG corresponded to the HBM1 model. HBM1 and HBM2 were imple-

Fig. 1. Directed Acyclic Graph for the hierarchical Bayesian model denoted by HBM1 and a priori distributions of parameters $N_0$ and $\tau$. $N_0$ represents the size of population; $\tau$ represents the sampling rate; $X_k$ represents the observed number of captures at the $k$th sampling; $N_k$ represents the size of population after the $k$ first samplings.

by inspecting trace and autocorrelation plots.
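To make the logit decomposition of Eq. (2) concrete, the following sketch draws site-level sampling rates around a condition-level rate. It only illustrates the forward (generative) direction of HBM2 and is not the fitting procedure. The function names and the numerical values (a condition-level rate of 0.4 and a random-effect standard deviation of 0.3) are assumptions chosen for illustration.

```python
import math
import random

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Inverse of logit: maps any real number back into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def site_sampling_rate(rate_c, sd_c, rng):
    """Draw a site-level rate from logit(rate_s) = logit(rate_c) + eps_s."""
    eps_s = rng.gauss(0.0, sd_c)  # site random effect, Normal(0, sd_c^2)
    return inv_logit(logit(rate_c) + eps_s)

rng = random.Random(0)  # arbitrary seed
# Hypothetical values: condition-level rate 0.4, random-effect sd 0.3.
site_rates = [site_sampling_rate(0.4, 0.3, rng) for _ in range(5)]
```

Working on the logit scale keeps every site-level rate strictly between 0 and 1 whatever the size of the random effect, and setting the standard deviation to zero recovers a rate that depends on the sampling condition only.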

2.3. Model checking for observed data

To check whether the model fit the data (Gelman et al., 2004), we tested whether the simulations denoted by $X_{\mathrm{rep}} = (X_{\mathrm{rep},1}, \ldots, X_{\mathrm{rep},K})$ generated under the model were similar to the observed data denoted by $X = (X_1, \ldots, X_K)$. Consequently, we simulated 25,000 sequences of removal sampling $X_{\mathrm{rep}}$ from the posterior predictive distribution. The observed data were compared to the simulated $X_{\mathrm{rep}}$ through predictive bands of overall coverage level $1-\alpha$, for some $0 < \alpha < 1$. The shape of the band was constructed by using equal-tailed predictive intervals for each sampling rank $k \in \{1, \ldots, K\}$ with the same coverage level $1-\alpha_{\mathrm{rank}}$. More precisely, for a given $\alpha_{\mathrm{rank}}$, the inferior and superior limits of the Predictive Interval for rank $k$, denoted by $PI_k$, were given by the $\alpha_{\mathrm{rank}}/2$, resp. $1-\alpha_{\mathrm{rank}}/2$, quantile of the 25,000 simulated values $X_{\mathrm{rep},k}$. The overall level $1-\alpha$ of the predictive band was then estimated by the proportion of $X_{\mathrm{rep}}$ inside the band, i.e. the $X_{\mathrm{rep}}$ such that each $X_{\mathrm{rep},k}$ belongs to $PI_k$. The value of $\alpha_{\mathrm{rank}}$ was such that the overall risk was equal to $\alpha$. The validity of the model was accepted if the proportion of the observed sequences $X = (X_1, \ldots, X_K)$ that belonged to the predictive band was at least $1-\alpha$.
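The band construction described above can be sketched as follows. This is one possible reading, under several assumptions that are not in the text: the empirical quantiles are taken conservatively (floor for the lower limit, ceiling for the upper), the per-rank level is found by scanning a grid downward from the overall level, and the replicates are drawn with fixed parameters as a stand-in for genuine posterior predictive draws. All function names are hypothetical, and 2,000 replicates are used instead of 25,000 to keep the example fast.

```python
import math
import random

def simulate_removal(n0, rate, n_samplings, rng):
    """Stand-in generator: X_k ~ Binomial(N_{k-1}, rate), N_k = N_{k-1} - X_k."""
    remaining, catches = n0, []
    for _sampling in range(n_samplings):
        caught = sum(1 for _ind in range(remaining) if rng.random() < rate)
        catches.append(caught)
        remaining -= caught
    return catches

def rank_intervals(replicates, alpha_rank):
    """Equal-tailed interval PI_k for each sampling rank k."""
    n = len(replicates)
    intervals = []
    for k in range(len(replicates[0])):
        values = sorted(rep[k] for rep in replicates)
        lower = values[math.floor((alpha_rank / 2) * (n - 1))]
        upper = values[math.ceil((1 - alpha_rank / 2) * (n - 1))]
        intervals.append((lower, upper))
    return intervals

def inside(sequence, intervals):
    """True when every X_k of the sequence lies within its interval PI_k."""
    return all(lo <= x <= hi for x, (lo, hi) in zip(sequence, intervals))

def coverage(sequences, intervals):
    """Proportion of sequences lying entirely inside the band."""
    return sum(inside(seq, intervals) for seq in sequences) / len(sequences)

def calibrate_band(replicates, alpha, steps=200):
    """Largest alpha_rank on a grid whose band holds at least 1 - alpha of replicates."""
    for i in range(steps, 0, -1):
        alpha_rank = alpha * i / steps
        intervals = rank_intervals(replicates, alpha_rank)
        if coverage(replicates, intervals) >= 1 - alpha:
            return alpha_rank, intervals
    return 0.0, rank_intervals(replicates, 0.0)  # full range always covers all

rng = random.Random(1)  # arbitrary seed
replicates = [simulate_removal(50, 0.5, 3, rng) for _ in range(2000)]
alpha_rank, band = calibrate_band(replicates, alpha=0.05)
```

The same `coverage` function can then be applied to the observed sequences to check whether the proportion falling inside the band reaches the target level.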

2.4. Simulation scheme

To compare the performance of the Hayne method to that of HBM1, simulations were performed. Different levels of population size $N_0$ and sampling rate $\tau$ were tested to evaluate their impact on the quality of estimators. Three population sizes $N_0$ were considered: 50, 100 and 500. Four values of $\tau$ were tested: 0.10, 0.30, 0.50, 0.70. For each combination of $N_0$ and $\tau$ values, we simulated 1000 sequences of removal sampling $X_1, \ldots, X_{10}$ according to the model described in HBM1 (Eq. (1)). The first $k$ successive samplings of each simulated sequence were considered to estimate $N_0$ and $\tau$, where $k$ ranged from 2 to 10. Each combination of $N_0$, $\tau$ and $k$ values was considered as one scenario. In the end, 108 scenarios were constructed. Based on the 108 scenarios and the 1000 sequences simulated, the estimation of $N_0$ and $\tau$ was performed by both the Hayne method and HBM1.
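The scenario count follows from crossing the three design factors: 3 population sizes, 4 sampling rates and 9 values of k give 3 × 4 × 9 = 108 combinations. A short sketch of the enumeration, with hypothetical variable names, is:

```python
from itertools import product

population_sizes = [50, 100, 500]
sampling_rates = [0.10, 0.30, 0.50, 0.70]
samplings_used = range(2, 11)  # k = 2, ..., 10 first samplings kept

# One scenario per (N0, rate, k) combination.
scenarios = list(product(population_sizes, sampling_rates, samplings_used))
print(len(scenarios))  # 108

def first_samplings(sequence, k):
    """Keep only the first k catches of a simulated sequence X_1..X_10."""
    return sequence[:k]
```

Since only the first k catches of each sequence are used, the same 1000 simulated sequences of length 10 can serve all nine values of k for a given pair of population size and sampling rate.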

2.5. Performance of estimators

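The classical criterion used to evaluate the performance of an estimator $\hat{\theta}$ of a parameter $\theta$ is the Mean Square Error denoted by $MSE_{\hat{\theta}}(\theta)$ (Lehmann and Casella, 1998). $MSE_{\hat{\theta}}(\theta)$ has two components: the variability of the estimator and the distance between the estimator’s expected value and the true value of the parameter being estimated or bias (Eq. (3) and (4)):

$$MSE_{\hat{\theta}}(\theta) = E\big((\hat{\theta} - \theta)^2\big) = \big(\mathrm{bias}(\hat{\theta})\big)^2 + \mathrm{var}(\hat{\theta}) \qquad (3)$$

where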
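The decomposition in Eq. (3) can be checked numerically on a set of simulated estimates. The sketch below is an illustration with made-up numbers. It assumes that expectations are replaced by averages over the simulated estimates and that the variance uses the divisor n (rather than n − 1), which is what makes the identity hold exactly. The function name `mse_decomposition` is hypothetical.

```python
def mse_decomposition(estimates, true_value):
    """Empirical MSE, bias and variance of a list of estimates."""
    n = len(estimates)
    mean_estimate = sum(estimates) / n
    bias = mean_estimate - true_value
    # Divisor n (not n - 1) so that mse == bias**2 + variance exactly.
    variance = sum((e - mean_estimate) ** 2 for e in estimates) / n
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return mse, bias, variance

# Made-up estimates of a parameter whose true value is 10.
mse, bias, variance = mse_decomposition([8, 12, 13], true_value=10)
# Here bias = 1, variance = 14/3 and mse = 17/3 = 1**2 + 14/3.
```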

2.6. Case study: abundance estimation of Ixodes ricinus host-seeking nymphs

Ixodes ricinus host-seeking nymphs were collected from the 15th to the 23rd of September 2011 in the “Parc de la Faisanderie” located in the Forêt de Sénart, France. The vegetation on the observed sites was composed of either dead leaves or young oak (Quercus) and/or hornbeam (Carpinus betulus), considered to be “shrub”. During the sampling period, 60 sites were sampled either between 8 a.m. and 11 a.m. (“morning”) or between 12 p.m. and 5 p.m. (“afternoon”). The 60 sites therefore fell into 4 categories: “shrub sampled in the morning” (n = 8), “shrub sampled in the afternoon” (n = 15), “dead leaves sampled in the morning” (n = 17), “dead leaves sampled in the afternoon” (n = 20).

Nymphs were captured using the drag sampling method described by Vassallo et al. (2000), which consists of dragging a 1 m × 1 m cloth over the vegetation for 10 m. Unfed host-seeking nymphs respond to the mechanical stimuli created by the sweep of the drag cloth by attaching themselves to it. Each 1 × 10 meter sampling site was marked with boundary markers. For each sampling site, 16 successive removal samplings were realised to collect host-seeking nymphs. One sampling was made every two minutes and thirty seconds. The number of nymphs captured at each sampling was recorded.

The variable of interest was the sequence of consecutive numbers of nymphs captured during the successive samplings on a given observation site. Only the first n samplings were kept for the estimation of the host-seeking population size and the sampling rate. n values varied between 3 and 5 to avoid the emergence of uncontrolled phenomena that could contradict the hypothesis of a closed population such as a modification of environmental conditions or the arrival of new host-seeking ticks (Sonenshine, 1994). Data were modelled through HBM1 and HBM2. The posterior distribution of the parameters was obtained by thinning three Markov chains and keeping every 50th of the last 500,000 values (burn-in was set to 500,000). The sampling rate $\tau$ and the size of population $N_0$ were estimated by the mean of the posterior distribution.
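The burn-in and thinning bookkeeping can be illustrated with a plain slice. The sketch below is one reading of the description, under two assumptions: each chain holds 1,000,000 draws in total, and thinning starts at the first draw after burn-in. The function name `thin_chain` and the toy draws are hypothetical, and a `range` object stands in for a real chain.

```python
def thin_chain(chain, burn_in, step):
    """Discard the first `burn_in` draws, then keep every `step`-th one."""
    return chain[burn_in::step]

# Bookkeeping only: a range stands in for one chain of 1,000,000 draws.
kept_per_chain = len(thin_chain(range(1_000_000), burn_in=500_000, step=50))
print(kept_per_chain)  # 10000

# Pooling thinned draws from several chains, then taking the posterior mean.
toy_chains = [[0.40, 0.42, 0.38, 0.41], [0.39, 0.43, 0.40, 0.42]]  # made-up draws
pooled = [draw for chain in toy_chains
          for draw in thin_chain(chain, burn_in=2, step=1)]
posterior_mean = sum(pooled) / len(pooled)
```

Under this reading, three chains would contribute 3 × 10,000 retained draws to the posterior sample.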

To compare the estimated sampling rates of two different conditions $c_1$ and $c_2$, we monitored the posterior distribution of contrasts. The sampling rates of two conditions $c_1$ and $c_2$ are considered to be the same when 0 belongs to the 95% equal-tail credible set of

than the Hayne method. The $RRMSE(\hat{\tau})$ was globally comparable for both methods although it was generally lower for the HBM1 estimator. For $\tau \le 0.3$, the $RRMSE(\hat{N}_0)$ and $RRMSE(\hat{\tau})$ of the HBM1 estimator were globally greater than those of the Hayne estimator. When the true value of $N_0$ was equal to 50 or 100, the $RRMSE(\hat{N}_0)$ of the Hayne estimator was very large.

3.2. Bayesian model estimations for I. ricinus abundance

The comparison of the observed data with the predictive bands of an overall coverage level of 95% showed that one sequence $X = (X_1, \ldots, X_k)$ out of the 60 observed ones was outside the predictive bands for k = 3. For k = 4 and k = 5, 6 and 9 observed sequences, respectively, were outside the predictive bands. The validity of the model considering the first 3 samplings was thus accepted at a 95% confidence level.

The results presented in the preceding section demonstrate that for a population size of $N_0 = 50$ and a sampling rate $\tau = 0.5$, 4 successive samplings are needed to achieve a good estimation quality. Three models were therefore considered using the first three, four and five successive samplings for the application.

The estimated sampling rate $\hat{\tau}_c$ varied between 33.9% and 47.4% for shrubs and 53.6% and 66.7% for dead leaves (Fig. 3). No significant difference in the estimated sampling rate $\hat{\tau}_c$ between the morning and the afternoon was detected whatever the type of vegetation or the number of successive samplings considered. Moreover, there was no significant difference between dead leaves and shrubs sampled in the morning (see Table 2 and Fig. 4). However, in the afternoon, the sampling rate of shrub sites was significantly lower than that of dead leaf sites sampled in the morning or in the afternoon (for k = 4, the median of $\hat{\tau}$ posterior distribution, equal to 34.1% for shrubs in the afternoon versus 63.1% and 58.3% for dead leaves in the morning and in the afternoon respectively). The variability of the sampling rate $\hat{\sigma}^2_c$ due to the site observed decreased when the number of successive samplings considered in the model increased (Table 2). However, the variability was lower in dead leaves than in shrubs (Table 2).

4. Discussion

[Figure 3 panels: sampling-rate axis from 0.0 to 1.0; panel titles K = 4 and K = 5; label “Number of sampling”]

Fig. 3. Posterior distribution of sampling rate $\hat{\tau}$ according to conditions for HBM2 considering the 3, 4 and 5 first samplings for the 4 conditions: shrub sampled in the morning (Sh. Mor.), shrub sampled in the afternoon (Sh. Aft.), dead leaves sampled in the morning (DL Mor.), dead leaves sampled in the afternoon (DL Aft.). The red line represents the mean of posterior distribution of sampling rate $\hat{\tau}$.

[Contrast plot panels, K = 3, axis from −1.0 to 1.0: Sh, Aft vs Mor; Mor, DF vs Sh; DF−Mor vs Sh−Aft]