We showed that the Bayesian approach estimated population size better than linear regression, and that the estimates of the sampling rate were comparable between the two approaches. Using the data collected during the September 2011 field campaign, we simultaneously estimated the population size and the sampling rate, treating vegetation type and time of sampling as fixed effects and introducing a random effect to model random variation in the rate associated with the observation unit.

We estimated a sampling rate between 33.9% and 47.4% for shrubs and between 53.6% and 66.7% for dead leaves.

journal homepage: www.elsevier.com/locate/ecolmodel

Bayesian estimation of abundance based on removal sampling under weak assumption of closed population with catchability depending on environmental conditions. Application to tick abundance

S. Bord a,∗, P. Druilhet, P. Gasqui, D. Abrial, G. Vourc'h

a INRA, UR346 Unité Epidémiologie Animale, Centre INRA de Theix, F-63122 Saint Genès Champanelle, France

b Laboratoire de mathématiques, UMR CNRS 6620 Campus des Cézeaux, B.P. 80026, F-63171 Aubière cedex, France

article info

Article history:
Received 8 July 2013
Received in revised form 15 November 2013
Accepted 1 December 2013
Available online 28 December 2013

Keywords:
Abundance
Sampling-rate
Removal sampling
Sampling design
Hierarchical Bayesian approach
Hayne method

abstract

The estimation of animal abundance is essential to understand population dynamics, species interactions and disease patterns in populations. Estimations of relative abundance classically are based on a single observation of several sites. In this case, the mapping of abundance assumes that the probability of detecting an individual, hence the sampling rate, remains constant across the observed sites. In practice, however, this assumption is often not satisfied as the sampling rate may fluctuate between sites due to random fluctuations and/or fluctuations associated with the sampling process, notably associated with the characteristics of the site. It is therefore important to account for variations in detection probability. Using a removal sampling design, we studied the performance of a Bayesian approach to estimate both sampling rates and abundance under the assumption of a closed population. The assumption of a closed population often is weakened when the number of successive samplings is large. The number of samplings has to be limited and optimal. We therefore examined the minimal number of successive samplings needed to achieve sufficient statistical accuracy while respecting underlying model assumptions. Using the same simulations, we also compared the performance of the Bayesian approach to the performance of the frequentist Hayne method based on linear regression. We show that the Bayesian approach proposed gives generally better estimations of population size than the Hayne method. The two methods give approximately the same results for the estimation of sampling rate. We then studied the variability of detection probability of Ixodes ricinus ticks sampled under several environmental conditions by using a hierarchical Bayesian model with a random effect. The estimated sampling rate $\hat{\tau}_c$ varied between 33.9% and 47.4% for shrubs and 53.6% and 66.7% for dead leaves. The variability of the sampling rate due to the site decreased when the number of successive samplings considered in the model increased. The variability was lower in dead leaves than shrubs. This approach could be used routinely for ecological or epidemiological studies of ticks and species with comparable life histories.

1. Introduction

The estimation of animal abundance is essential in ecology to understand fundamental processes, such as population dynamics and species interactions, as well as in epidemiology to understand and generate disease patterns in populations (Anderson, 1991). In the majority of biological systems, relevant indicators of abundance are based on count point surveys (Alldredge et al., 2007) obtained using convenient and calibrated sampling methods (Anderson, 2001; Pollock et al., 2002). As a part of the population is often not observable, the probability of detecting an individual [...]

[...] Royle, 2010; Pellet and Schmidt, 2005). Consequently, indicators calculated in this way only give an index of relative abundance. These indicators of abundance are implicitly based on the assumption that the sampling rate is constant from site to site (Williams et al., 2002; Pollock et al., 2002; Royle and Dorazio, 2006). However, the sampling rate may depend on environmental conditions, such as weather, season, sampler and habitats. If this is the case, considering the sampling rate to be constant leads to confusion between the variability of the rate and the variability of abundance (Thompson et al., 1998; MacKenzie and Kendall, 2002). Therefore, an effort needs to be made to estimate both the abundance and the [...]

[...] intensive (Dodd and Dorazio, 2004) and may modify the observed site when the number of successive samplings is high. The choice of a given protocol (CMR or RS) and its ease of implementation depend on the species studied. Although available, CMR and RS methods are rarely used for certain species. One such group is ticks, which are the most important vectors of human and animal diseases after mosquitoes (Parola and Raoult, 2001). The classical index of tick abundance is the number of tick nymphs collected by dragging a piece of cloth once over the vegetation of a delimited area, generally 10 m² (Vassallo et al., 2000), in a selection of sites. Host-seeking nymphs, i.e. those waiting for a host on the top of the vegetation, are collected by the drag. The numbers of nymphs collected on the different sites are then compared. The drag method is distinguished from RS methods in that the cloth is dragged only once over each site. To our knowledge, the sampling rate of the drag sampling method has been little studied. Only one study (Talleklint-Eisen and Lane, 2000) has used a RS design to estimate the abundance and the sampling rate of the drag method. In this study, 17 successive samplings were conducted over 23 days. This protocol could not guarantee that the population remained closed over the sampling period. The authors estimated the sampling rate to be 5.9% using the Hayne method (Hayne, 1949).

To estimate parameters, the Hayne method fits a linear regression of the number of captures at each successive sampling on the cumulative number of captures. The sampling rate is estimated as the absolute value of the slope of the regression line (the slope itself is negative), and the population size as the point where the regression line intersects the horizontal axis. However, the Hayne method is known to produce poor estimations of population size (White et al., 1982), especially when the sampling rate is potentially low (less than 10%) and variable. Moreover, this method does not allow covariate effects to be taken into account, nor does it provide confidence intervals for estimates.
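The regression just described can be illustrated with a minimal Python sketch. The function name `hayne_estimate` and the example counts are hypothetical, and taking the catch accumulated before each sampling as the regressor is an assumption about the exact variant of the method.

```python
import numpy as np

def hayne_estimate(catches):
    """Return (sampling rate, population size) from a sequence of removal counts."""
    x = np.asarray(catches, dtype=float)
    # Cumulative catch before each sampling: 0, X1, X1 + X2, ...
    cum_before = np.concatenate(([0.0], np.cumsum(x)[:-1]))
    # Least-squares line: catch = intercept + slope * cumulative catch
    slope, intercept = np.polyfit(cum_before, x, 1)
    tau_hat = -slope               # the fitted slope is negative
    n0_hat = intercept / tau_hat   # where the line meets the horizontal axis
    return tau_hat, n0_hat

# Noise-free illustration: N0 = 100 and a rate of 0.5 give catches 50, 25, 12.5
tau_hat, n0_hat = hayne_estimate([50.0, 25.0, 12.5])
```

On real counts the fitted slope can be close to zero or even positive, in which case the estimate of the population size becomes unstable or meaningless, which is one way to see why the method behaves poorly at low sampling rates.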

In this paper, a hierarchical Bayesian approach was used to estimate both the sampling rate and the population size. This approach allows the inclusion of prior knowledge and provides posterior distributions of parameter estimates. Moreover, because it is hierarchical, one can take into account parameters which are either observable or not observable, and which are located at different scales such as the area sampled and the site of sampling (Gelman and Hill, 2006; Cressie et al., 2009). First, we studied the performance [...]

[...] sampling and $N_k$ the remaining population after the $k$th sampling, where $N_k = N_{k-1} - X_k$ for $k \in \{1, \ldots, K\}$. Furthermore, each individual of the closed population was assumed to be captured independently with the same probability of capture $\tau$ (Moran, 1951; Zippin, 1956). Hence, we assumed that $X_k$ followed a binomial distribution with population size $N_{k-1}$ and probability of capture $\tau$ (Eq. (1)):

$(X_k \mid N_{k-1}, \tau) \sim \mathcal{B}(N_{k-1}, \tau)$, where $N_k = N_{k-1} - X_k$. (1)

The capture probability was considered as independent and the same for all individuals, so we considered that it was equal to the sampling rate, i.e. the percentage of captured individuals in the population.
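The removal model of Eq. (1) can be simulated in a few lines. The sketch below is illustrative only: the function name `simulate_removal` and the parameter values are hypothetical, and the symbol `tau` stands for the sampling rate.

```python
import numpy as np

def simulate_removal(n0, tau, n_samplings, rng):
    """Draw X_1..X_K with X_k ~ Binomial(N_{k-1}, tau) and N_k = N_{k-1} - X_k."""
    remaining = n0
    catches = []
    for _ in range(n_samplings):
        x_k = int(rng.binomial(remaining, tau))  # catch at this sampling
        catches.append(x_k)
        remaining -= x_k                         # closed population: removals only
    return catches, remaining

rng = np.random.default_rng(0)
catches, remaining = simulate_removal(n0=100, tau=0.3, n_samplings=5, rng=rng)
```

Because the population is closed, the catches and the individuals still present always add up to the initial size.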

2.2. Hierarchical Bayesian models

A first HBM (HBM1) assumed that the population size $N_{0s}$ and the sampling rate $\tau_s$ were specific to each site $s$. A second HBM (HBM2) assumed that the sampling rate $\tau_s$ of a given site $s$ was associated with both the effect of sampling conditions $c$ (considered to be a fixed effect) and a small variation due to the site sampled (considered to be a random effect). The logit transformation of the sampling rate $\tau_s$, denoted $\mathrm{logit}(\tau_s)$, was used. $\mathrm{logit}(\tau_s)$ was decomposed as the sum of the logit transformation of the sampling rate $\tau_c$ under the sampling conditions $c$ and a random effect $\varepsilon_s$ (Eq. (2)):

$\mathrm{logit}(\tau_s) = \mathrm{logit}(\tau_c) + \varepsilon_s$. (2)

The random effect $\varepsilon_s$ was introduced to add a fluctuation of the sampling rate $\tau_s$ due to the observed site $s$. The range of variation of the random effect $\varepsilon_s$, denoted by $\sigma_c^2$, was considered to be specific to each sampling condition $c$, where $\varepsilon_s$ was assumed to follow a normal distribution with zero mean and variance $\sigma_c^2$, which depended on the sampling conditions $c$. The HBM1 and HBM2 models described above are summarised in Figs. 1 and 2 by Directed Acyclic Graphs (DAGs) (Thulasiraman and Swamy, 1992; Clark and Gelfand, 2006). These DAGs represent relationships (lines) between the observed data and the unknown parameters or hypotheses of the model (nodes). The lines represent the relations and the hierarchy between nodes. The nodes symbolise the observed data $X_k$, the parameters to estimate $N_{0s}$, $\tau_s$, $\tau_c$, the prior distributions of the parameters $N_{0s}$, $\tau_s$, $\tau_c$ and $\varepsilon_s$, and the distribution of the hyper-parameter $\sigma_c^2$. If $\varepsilon_s = 0$ and $\tau_c$ was specific to each site $s$, then the DAG corresponded to the HBM1 model. HBM1 and HBM2 were implemented [...]

Fig. 1. Directed Acyclic Graph for the hierarchical Bayesian model denoted by HBM1 and a priori distributions of parameters $N_0$ and $\tau$. $N_0$ represents the size of population; $\tau$ represents the sampling rate; $X_k$ represents the observed number of captures at the $k$th sampling; $N_k$ represents the size of population after the first $k$ samplings.

[...] by inspecting trace and autocorrelation plots.
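The decomposition in Eq. (2) can be made concrete with a short Python sketch that generates site-level sampling rates from a condition-level rate and a normal random effect on the logit scale. The function names and the numerical values are hypothetical, and the sketch only simulates from the model; it does not reproduce the authors' fitting procedure.

```python
import numpy as np

def logit(p):
    return np.log(p / (1.0 - p))

def inv_logit(x):
    return 1.0 / (1.0 + np.exp(-x))

def site_sampling_rates(tau_c, sigma_c, n_sites, rng):
    """tau_s = inv_logit(logit(tau_c) + eps_s), with eps_s ~ Normal(0, sigma_c^2)."""
    eps = rng.normal(0.0, sigma_c, size=n_sites)  # site-level random effect
    return inv_logit(logit(tau_c) + eps)

rng = np.random.default_rng(1)
rates = site_sampling_rates(tau_c=0.4, sigma_c=0.5, n_sites=15, rng=rng)
# With no random effect, every site shares the condition-level rate
no_effect = site_sampling_rates(tau_c=0.4, sigma_c=0.0, n_sites=15, rng=rng)
```

Working on the logit scale keeps every site-level rate strictly between 0 and 1, whatever the size of the random effect.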

2.3. Model checking for observed data

To check whether the model fit the data (Gelman et al., 2004), we tested whether the simulations denoted by $X_{rep} = (X_{rep,1}, \ldots, X_{rep,K})$ generated under the model were similar to the observed data denoted by $X = (X_1, \ldots, X_K)$. Consequently, we simulated 25,000 sequences of removal sampling $X_{rep}$ from the posterior predictive distribution. The observed data were compared to the simulated $X_{rep}$ through predictive bands of overall coverage level $1 - \alpha$, for some $0 < \alpha < 1$. The shape of the band was constructed by using equal-tailed predictive intervals for each sampling rank $k \in \{1, \ldots, K\}$ with the same coverage level $1 - \alpha_{rank}$. More precisely, for a given $\alpha_{rank}$, the inferior and superior limits of the Predictive Interval for rank $k$, denoted by $PI_k$, were given by the $\alpha_{rank}/2$, resp. $1 - \alpha_{rank}/2$, quantile of the 25,000 simulated values $X_{rep,k}$. The overall level $1 - \alpha$ of the predictive band was then estimated by the proportion of $X_{rep}$ inside the band, i.e. the $X_{rep}$ such that each $X_{rep,k}$ belongs to $PI_k$. The value of $\alpha_{rank}$ was such that the overall risk was equal to $\alpha$. The validity of the model was accepted if the proportion of the observed sequences $X = (X_1, \ldots, X_K)$ that belonged to the predictive band was at least $1 - \alpha$.
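The construction of the predictive band can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the replicated sequences are drawn here from a removal model with fixed, hypothetical values of the population size and the sampling rate (a real check would use draws from the posterior predictive distribution), and the per-rank level is found by a simple bisection, which is one possible way to tune it.

```python
import numpy as np

def band(reps, alpha_rank):
    """Equal-tailed predictive interval for each sampling rank k."""
    lower = np.quantile(reps, alpha_rank / 2.0, axis=0)
    upper = np.quantile(reps, 1.0 - alpha_rank / 2.0, axis=0)
    return lower, upper

def coverage(reps, alpha_rank):
    """Proportion of replicated sequences lying entirely inside the band."""
    lower, upper = band(reps, alpha_rank)
    inside = np.all((reps >= lower) & (reps <= upper), axis=1)
    return inside.mean()

def calibrate_alpha_rank(reps, alpha, n_iter=40):
    """Bisection for (approximately) the largest alpha_rank with coverage >= 1 - alpha."""
    if coverage(reps, alpha) >= 1.0 - alpha:
        return alpha
    low, high = 0.0, alpha  # coverage is 1 at low and below 1 - alpha at high
    for _ in range(n_iter):
        mid = 0.5 * (low + high)
        if coverage(reps, mid) >= 1.0 - alpha:
            low = mid
        else:
            high = mid
    return low

# Stand-in replicates (hypothetical fixed parameters, not posterior draws)
rng = np.random.default_rng(2)
n_rep, K, n0, tau = 5000, 3, 50, 0.5
reps = np.empty((n_rep, K))
remaining = np.full(n_rep, n0)
for k in range(K):
    reps[:, k] = rng.binomial(remaining, tau)
    remaining = remaining - reps[:, k].astype(int)

alpha_rank = calibrate_alpha_rank(reps, alpha=0.05)
overall = coverage(reps, alpha_rank)
```

The per-rank level is stricter than the overall level because a sequence must fall inside the interval at every rank simultaneously.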

2.4. Simulation scheme

To compare the performance of the Hayne method to that of HBM1, simulations were performed. Different levels of population size $N_0$ and sampling rate $\tau$ were tested to evaluate their impact on the quality of estimators. Three population sizes $N_0$ were considered: 50, 100 and 500. Four values of $\tau$ were tested: 0.10, 0.30, 0.50, 0.70. For each combination of $N_0$ and $\tau$ values, we simulated 1000 sequences of removal sampling $X_1, \ldots, X_{10}$ according to the model described in HBM1 (Eq. (1)). The first $k$ successive samplings of each simulated sequence were considered to estimate $N_0$ and $\tau$, where $k$ ranged from 2 to 10. Each combination of $N_0$, $\tau$ and $k$ values was considered as one scenario. In the end, 108 scenarios (3 × 4 × 9) were constructed. Based on the 108 scenarios and the 1000 sequences simulated, the estimations of $N_0$ and $\tau$ were performed by both the Hayne method and HBM1.
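The scenario count follows directly from the design, as the short enumeration below shows. The variable names are hypothetical; the values are those listed in the text.

```python
from itertools import product

population_sizes = [50, 100, 500]
sampling_rates = [0.10, 0.30, 0.50, 0.70]
numbers_of_samplings = range(2, 11)  # k = 2, ..., 10

# One scenario per (N0, sampling rate, k) combination
scenarios = list(product(population_sizes, sampling_rates, numbers_of_samplings))
n_scenarios = len(scenarios)  # 3 * 4 * 9
```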

2.5. Performance of estimators

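The classical criterion used to evaluate the performance of an estimator $\hat{\theta}$ of a parameter $\theta$ is the Mean Square Error denoted by $\mathrm{MSE}_{\hat{\theta}}(\theta)$ (Lehmann and Casella, 1998). $\mathrm{MSE}_{\hat{\theta}}(\theta)$ has two components: the variability of the estimator and the distance between the estimator's expected value and the true value of the parameter being estimated, or bias (Eq. (3) and (4)):

$\mathrm{MSE}_{\hat{\theta}}(\theta) = E\big((\hat{\theta} - \theta)^2\big) = \big(\mathrm{bias}(\hat{\theta})\big)^2 + \mathrm{var}(\hat{\theta})$ (3)

where [...]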
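The decomposition in Eq. (3) can be checked numerically on a set of simulated estimates. In the sketch below the estimates are hypothetical normal draws standing in for the output of an estimator of a population size of 100, and the bias is taken as the mean of the estimates minus the true value, which is an assumption since the corresponding equation is not shown above.

```python
import numpy as np

def mse_components(estimates, true_value):
    """Empirical MSE, bias and variance of a set of estimates."""
    est = np.asarray(estimates, dtype=float)
    bias = est.mean() - true_value
    variance = est.var()  # population variance (ddof=0), so the identity is exact
    mse = np.mean((est - true_value) ** 2)
    return mse, bias, variance

rng = np.random.default_rng(3)
estimates = rng.normal(loc=105.0, scale=10.0, size=1000)  # hypothetical estimates
mse, bias, variance = mse_components(estimates, true_value=100.0)
```

With the population variance, the empirical MSE equals the squared bias plus the variance up to floating-point error.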

2.6. Case study: abundance estimation of Ixodes ricinus host-seeking nymphs

Ixodes ricinus host-seeking nymphs were collected from the 15th to the 23rd of September 2011 in the "Parc de la Faisanderie" located in the Forêt de Sénart, France. The vegetation on the observed sites was composed of either dead leaves or young oak (Quercus) and/or hornbeam (Carpinus betulus), considered to be "shrub". During the sampling period, 60 sites were sampled either between 8 a.m. and 11 a.m. ("morning") or between 12 p.m. and 5 p.m. ("afternoon"). The 60 sites therefore fell into 4 categories: "shrub sampled in the morning" (n = 8), "shrub sampled in the afternoon" (n = 15), "dead leaves sampled in the morning" (n = 17), "dead leaves sampled in the afternoon" (n = 20).

Nymphs were captured using the drag sampling method described by Vassallo et al. (2000), which consists of dragging a 1 m × 1 m cloth over the vegetation for 10 m. Unfed host-seeking nymphs respond to the mechanical stimuli created by the sweep of the drag cloth by attaching themselves to it. Each 1 m × 10 m sampling site was marked with boundary markers. For each sampling site, 16 successive removal samplings were carried out to collect host-seeking nymphs. One sampling was made every two minutes and thirty seconds. The number of nymphs captured at each sampling was recorded.

The variable of interest was the sequence of consecutive numbers of nymphs captured during the successive samplings on a given observation site. Only the first n samplings were kept for the estimation of the host-seeking population size and the sampling rate. Values of n between 3 and 5 were used to avoid the emergence of uncontrolled phenomena that could contradict the hypothesis of a closed population, such as a modification of environmental conditions or the arrival of new host-seeking ticks (Sonenshine, 1994). Data were modelled through HBM1 and HBM2. The posterior distribution of the parameters was obtained by thinning three Markov chains and keeping every 50th of the last 500,000 values (burn-in was set to 500,000). The sampling rate $\tau$ and the size of population $N_0$ were estimated by the mean of the posterior distribution.
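The thinning settings quoted above determine how many draws are retained. The sketch below assumes that each chain ran for 1,000,000 iterations in total (500,000 of burn-in followed by 500,000 kept for thinning), which is a reading of the text rather than a stated fact; the helper `thin` is hypothetical.

```python
import numpy as np

def thin(chain, burn_in, step):
    """Drop the first `burn_in` draws, then keep every `step`-th draw."""
    return np.asarray(chain)[burn_in::step]

# Retained draws under the settings quoted in the text (assumed chain length)
kept_per_chain = len(range(500_000, 1_000_000, 50))
kept_total = 3 * kept_per_chain

# Small demonstration on a toy chain of 20 values
toy = thin(np.arange(20), burn_in=10, step=5)
```

Under these assumptions each chain contributes 10,000 draws, giving 30,000 draws across the three chains.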

To compare the estimated sampling rates of two different conditions $c_1$ and $c_2$, we monitored the posterior distribution of contrasts. The sampling rates of two conditions $c_1$ and $c_2$ are considered to be the same when 0 belongs to the 95% equal-tailed credible set of [...]
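A contrast of this kind can be examined from posterior draws as in the following sketch. Several points are assumptions: the contrast is taken as the simple difference between the two sampling rates, and the draws are hypothetical Beta samples standing in for MCMC output, not the study's posterior.

```python
import numpy as np

def contrast_interval(draws_c1, draws_c2, level=0.95):
    """Equal-tailed credible interval of the difference between two sets of draws."""
    diff = np.asarray(draws_c1) - np.asarray(draws_c2)
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(diff, [tail, 1.0 - tail])
    # The two rates are treated as the same when 0 lies inside the interval
    return lower, upper, bool(lower <= 0.0 <= upper)

rng = np.random.default_rng(4)
shrub = rng.beta(34, 66, size=10_000)        # hypothetical draws centred near 0.34
dead_leaves = rng.beta(63, 37, size=10_000)  # hypothetical draws centred near 0.63
lower, upper, same = contrast_interval(shrub, dead_leaves)
```

With real MCMC output the two sets of draws should come from the same iterations, so that the difference is computed draw by draw from the joint posterior.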

[...] than the Hayne method. The RRMSE($\hat{\tau}$) was globally comparable for both methods although it was generally lower for the HBM1 estimator. For $\tau \leq 0.3$, the RRMSE($\hat{N}_0$) and RRMSE($\hat{\tau}$) of the HBM1 estimator were globally greater than those of the Hayne estimator. When the true value of $N_0$ was equal to 50 or 100, the RRMSE($\hat{N}_0$) of the Hayne estimator was very large.

3.2. Bayesian model estimations for I. ricinus abundance

The comparison of the observed data with the predictive bands of an overall coverage level of 95% showed that one sequence $X = (X_1, \ldots, X_k)$ out of the 60 observed ones was outside the predictive bands for k = 3. For k = 4 and k = 5, 6 and 9 observed sequences, respectively, were outside the predictive bands. The validity of the model considering the first 3 samplings was thus accepted at a 95% confidence level.

The results presented in the preceding section demonstrate that for a population size of $N_0 = 50$ and a sampling rate $\tau = 0.5$, 4 successive samplings are needed to achieve a good estimation quality. Three models were therefore considered using the first three, four and five successive samplings for the application.

The estimated sampling rate $\hat{\tau}_c$ varied between 33.9% and 47.4% for shrubs and 53.6% and 66.7% for dead leaves (Fig. 3). No significant difference in the estimated sampling rate $\hat{\tau}_c$ between the morning and the afternoon was detected whatever the type of vegetation or the number of successive samplings considered. Moreover, there was no significant difference between dead leaves and shrubs sampled in the morning (see Table 2 and Fig. 4). However, in the afternoon, the sampling rate of shrub sites was significantly lower than that of dead leaf sites sampled in the morning or in the afternoon (for k = 4, the median of the posterior distribution of $\hat{\tau}$ was 34.1% for shrubs in the afternoon versus 63.1% and 58.3% for dead leaves in the morning and in the afternoon, respectively). The variability of the sampling rate $\hat{\sigma}_c^2$ due to the site observed decreased when the number of successive samplings considered in the model increased (Table 2). In addition, the variability was lower in dead leaves than in shrubs (Table 2).

4. Discussion

[Figure 3: panels labelled by number of samplings (K = 4 and K = 5 legible); axes from 0.0 to 1.0.]

Fig. 3. Posterior distribution of sampling rate $\hat{\tau}$ according to conditions for HBM2 considering the first 3, 4 and 5 samplings for the 4 conditions: shrub sampled in the morning (Sh. Mor.), shrub sampled in the afternoon (Sh. Aft.), dead leaves sampled in the morning (DL Mor.), dead leaves sampled in the afternoon (DL Aft.). The red line represents the mean of the posterior distribution of sampling rate $\hat{\tau}$.

[Figure: contrast panels for K = 3, titled "Sh, Aft vs Mor", "Mor, DF vs Sh" and "DF−Mor vs Sh−Aft"; horizontal axes from −1.0 to 1.0.]
