• Aucun résultat trouvé

Simulating Bandwidth Sharing with Pareto distributed File Sizes

N/A
N/A
Protected

Academic year: 2021

Partager "Simulating Bandwidth Sharing with Pareto distributed File Sizes"

Copied!
24
0
0

Texte intégral

(1)

HAL Id: inria-00383079

https://hal.inria.fr/inria-00383079

Submitted on 12 May 2009

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

File Sizes

Eitan Altman, Julio Rojas-Mora, Tania Jimenez

To cite this version:

Eitan Altman, Julio Rojas-Mora, Tania Jimenez. Simulating Bandwidth Sharing with Pareto dis- tributed File Sizes. [Research Report] RR-6926, INRIA. 2009. �inria-00383079�

(2)

a p p o r t

d e r e c h e r c h e

0249-6399ISRNINRIA/RR--6926--FR+ENG

Thème COM

Simulating Bandwidth Sharing with Pareto distributed File Sizes

Julio Rojas-Mora — Tania Jiménez — Eitan Altman

N° 6926

May 2009

(3)
(4)

Centre de recherche INRIA Sophia Antipolis – Méditerranée

JulioRojas-Mora

,Tania Jiménez, Eitan Altman

ThèmeCOMSystèmesommuniants

Équipe-ProjetMaestro

Rapportdereherhe 6926May200920pages

Abstrat: Thetraontheinternethasknowntobeheavytailed: thesizeof

letransfersthroughFTPorHTTPappliations,aswellasthosetransferredby

P2Pappliations hasbeenobservedtohaveaveryheavytail. Typiallymod-

eledasParetodistributed withparameterbetween1.05to1.5, thelesizehas

innitevariane. Thisisthesoureofmanydiultiesin simulatingdatatraf-

: onvergeneisveryslow,simulationshavetobeverylong,andthestandard

methodsforderivingondeneintervals,basedontheCLT,arenotappliable

here. Weillustratethese wellknownproblemsthroughthesimulationstudyof

a proessorsharingqueue, whih is oftenused to model session levelresoure

sharingin theinternet. Wetest bootstrapmethods to aelerate onvergene

and improvethepreisionofsimulations,and testadiret approah toobtain

ondene intervalbasedonthe histogramoftheempirial distributions. The

onlusiondrawnarethenompared tothoseobtainedwhensimulatingin ns2

datatransferusingTCP.

Key-words: M/G/1 queue. Proessor Sharing. Simulations. Condene

interval. Bootstrap.

LIA, University of Avignon, 339, hemindes Meinajaries, Agropar BP 1228, 84911

AvignonCedex9,Frane. {julio.rojas,tania,jimenez}univ-avignon.fr

MaestrogroupINRIA,2004 RoutedesLuioles06902 SophiaAntipolisCedex,Frane.

eitan.altmansophia.inria.fr

(5)

Résumé: Lataille des hierstransférés surl'Internet parFTPouHTTP,

ainsi que eux transférés par les appliations P2P, a souvent été observée et

mesurée. La distribution observée a été à queue lourde et elle a souvent été

modélisée par une une distribution de Pareto ave un paramètre entre 1,05

et 1,5; ela signie que la taille des hiers a une variane innie. Cei est

la soure de nombreuses diultés dans la simulation du tra de données:

la onvergene est très lente, les simulations doivent être très longues, et les

méthodeshabituellesdealuld'intervallesdeonane,baséessurleThéorème

Limite Centrale, ne sont pas appliables. Nous illustrons es problèmes bien

onnus à traversla simulationd'une le "Proessor Sharing",qui est souvent

utiliséepourmodéliserlepartagedesressouresauniveausessiondansl'Internet.

Lesontributionprinipaledenotretravailsont(i)l'appliationdelaméthode

deBootstrapquinouspermetd'aélérerlaonvergeneetd'améliorerlapréision

des simulations, et (ii) l'étude de l'approhe pour obtenir des intervales de

onanebaséssurladistributionempirique. Puisnousomparonslesonlusions

à elles obtenues lors de la simulation en NS2 des transfers de hiers sur

l'Internet.

Mots-lés: FileM/G/1,ProessorSharing,Simulatiojs,IntervaldeConane,

Bootstrap

(6)

1 Introdution

The proessor sharing queue has been perhaps the most ommon model for

bandwidthsharingofsessions(havingthesameRTT)ofdatatransferoverthe

Internet [4℄, along with the Disriminatory Proessor sharing Sharing queue

[7℄ adapted to the ase of unequal round trip times. Several researh groups

[17, 14, 11, 13℄ haveexamined itsvalidity throughsimulations. Theexpeted

sojourntimeofaustomerin aproessorsharingqueueisknowntobeinsensi-

tivetothelesizedistribution(itdependsonthedistributiononlythroughthe

expetation). Its variane,however,isnotinsensitiveanymore,andinpartiu-

lar,itisinnitewhentheservietimeofalient(orequivalentlyinoursetting

-thesizeofalethatistransferred)hasaninniteseondmoment[3℄.

The size of a data transfer overthe Internet is known to be heavy tailed

[12℄. (This is thease both for FTPtransfers aswell asfor the "on"periods

of HTTP transfers). Among several andidates for modeling the distribution

ofthesetransfers,theParetodistributionwithparameterbetween1.05and1.5

has been the one that gave the best t with experiments for the last twenty

years,see[8,2,6,15℄.

Thisveryheavytail hasbeenausing various serious problemsfor simula-

tionsofdatatransfers:

ˆ Theauthorsof[11℄write: "sineasubstantialpartof thedistribution is

in the tail, if the simulation is not runfor very longthe average of the

sampledlesizeswouldbelessthanthenominalaverage,thusleadingto

aloweroeredloadandheneoverestimationofthethroughput."

ˆ Centrallimitbasedondeneintervalsarenotavailablesinetheseond

momentof the number of sessions orof the workload in thesystem are

innite.

ˆ Duetothelastpointsonemayhaveproblemsininterpretationofsimula-

tionresults. Indeed,variouspapersin whih sharingbandwidthismod-

eledbyproessorsharingreportdierenesbetweentheexpetedtheoret-

ialvalueandthesimulationvalue[13,14,11,17℄that varyfromaround

10%in [11℄ andgoupto afatorof10in somesituationsin [13℄. When

suh deviations our,it is importantto know whether theyanbedue

totheimpreisionsin thesimulationortorealphenomena.

ˆ Thewarm-uptimeisextremelylong,see[9℄.

Other approahes to simulating networks with Pareto dis-

tributed le size

Onewayto avoidheavytailsisto trunatethem. Thisis in spiritof thesug-

gestioninthepaper"DiultiesinsimulatingqueueswithParetoservie"[10℄

that says: "Sinefor any nite simulation runlength, there isalways amaxi-

mum value of the random variables generated, we, inatuality, are simulating

atrunatedPareto servie distribution. Ithas alsobeen arguedthat thereisal-

waysamaximumlesizeorlaim amountso,inreality,wearealways dealing

with trunated distributions." Butwhere should we trunate thedistribution?

TrunationatsomesizeM wouldbevalidifthedierenebetweenperformane

(7)

with trunation at anyother valueL that is greater than M hasa negligible

impatonperformane. Simulating withatrunated Pareto distribution may

notbesuientandseveralothertestsoftrunationatlargerthresholdvalues

maybeneeded.

Our ontributions

Our main ontribution is in introduing to the networking ontext statistial

methods that, to the best of our knowledge have notbeenused before in the

networking ontext. Inpartiular,

ˆ Weobtainanimpressiveimprovementinthepreisionofsimulationsusing

thebootstrapapproah;equivalently,thisallowsusto dereaseonsider-

ably the simulation run times or the number of simulations needed in

order toahieveagivenpreisionlevel. Weshall nallypresentdetailed

experimentationswithsomeofthem.

ˆ Condene Intervals Weusethequantilebasedapproahasanalter-

native to the entral limit based approah for obtaining the ondene

intervals. Theentrallimitapproahis notdiretly appliablewhenever

thevarianeisinnite,whihistheasewiththestationarysojourntime

of adata transfersession(having aPareto distribution withashapepa-

rameterKlowerthan2).

ˆ We then apply these ideasdiretly to the simulationof ompeting non-

persistent TCP transfers of les that have heavy tail distribution. We

omparethesimulationresultstothoseobtainedfortheproessorsharing

queue. Weidentify various reasons for the deviation of the behaviorof

TCPfromtheidealproessorsharingmodelandmanageto quantifythe

impatofsomeofthese.

Struture of the paper

We begin by presenting in the next setion a bakground on the bootstrap

methodandonthequantilemethodforobtainingondeneinterval. InSetion

3 we present our approah onerning the use of simulation to estimate the

average number of pakets in the proessor sharing queue. In Setion 4 we

study someissues onerning the simulation of the workload of the proessor

sharingqueue. Wethen presentin Setion 5simulationsof TCP onnetions

that share a ommon bottlenek link and ompare the preision that an be

obtained(with andwithoutthebootstrapapproah)tothepreisionobtained

in simulating the proessor sharing queue. We provide explanations for the

dierenesbetween theTCPsenario andits orresponding proessorsharing

model. Aonludingsetionsummariesthepaper. An appendix lariessome

basiquestionsonerningondeneintervals.

(8)

Algorithm1Bootstrapalgorithmforestimatingthemeanqueuesize

1. Makensimulationsofthe queuequeuesizeand letXi,i= 1, . . . , n,be

theestimationoftheparameterofinterestobtainedfromeahsimulation.

2. Forj = 1, . . . , kdo:

(a) Let X1,j , . . . , Xn,j be a resample, with replaement, taken from

X1, . . . , Xn.

(b) Letθˆj=n−1Pn i=1Xi,j .

3. Calulateθˆ=k−1Pk

j=1θˆj,theMonteCarloapproximationoftheboot- strapestimationofθ.

4. Calulateondeneintervalsforθˆ usingthequantilemethod.

2 Bakground

Bootstrap

BootstrapisamethodreatedbyEfron[20℄fornon-parametrialestimation. Let

θbeaparameterofaompletelyunspeieddistributionF,forwhihwehave

asampleofi.i.d. observationsX1, . . . , Xn,andletθˆbetheestimationmadeof

theparameter. Fromthesample, wewill makek resampleswithreplaement,

X1,j , . . . , Xn,j j= 1, . . . , k,witheahelementhavingprobability1/nofbeing

seleted, and foreah resamplean estimatorθˆj will bealulated. ByMonte

Carloapproximation,thedistributionofθˆisthenestimatedbythedistribution of θˆ. When k → ∞, the estimation of θˆ will be better and, in turn, the

real distribution of θ will also be better estimated. We note that the time

and memoryoverheadfor resamplingand performing thebootstrapalgorithm

are oftenmuhsmallerthan theones neededto reatemoresamples, andan

beperformedwithin averyshort amount oftime. Singh [21℄, and Bikel and

Freedman[19℄aregoodreferenestounderstandtheasymptotiharateristis

ofthebootstrap.

Remark. An alternativeway to aelerate the simulationsis the important

samplingormoregenerally,varianeredutiontehniquesseee.g. [22,Chap4℄

for ageneralintrodution. Theyare dierentthan thebootstrapapproahin

thattheyarebasedonsimulatinganothermodel(e.g. usealargerloadinorder

to obtainbetterestimateof arareeventof reahingalargequeuesize). Then

some knowledge of the system is needed in order to transform the simulated

results of the new model to that of the original one. The bootstrap method

that westudyis apost-simulationapproah: it onernsstatistial proessing

ofsimulatedtraes. Itanbeusedontopofvarianeredutiontehniqueswhen

theyareavailable.

(9)

Quantile-based ondene interval

Assumewewishtoobtaintheondeneintervaloftheestimationofsomepa-

rameterof asimulatedproessXt. Thequantileapproahtoderiveondene

intervalsisbasedonrunninganumberN ofi.i.d. simulations(eahsimulation orrespondsinourasetothequeuelengthproessortofuntions ofthispro-

ess). Wethenusethese toomputetheempirialdistributionof thefuntion

oftherandomvariable.

For(1α)·100% ondene level, the (1α)·100% ondene interval

by the quantile method is the interval between the (α/2)·100- th and the (1α/2)·100-thpointsofthesortedsample.

3 Simulating the queue size of a proessor shar-

ing queue

ThroughaseriesofsimulationsperformedinJAVA(availablefromtheauthors

byrequest) we exhibit thepower ofthe bootstrapapproah: itsability to in-

reasethepreisionofthesimulationofInternettrathatisthrottledbysome

bottleneklinkandatthesametimedereasetherequireddurationofthesimu-

lation. Inthissetionwerestrittostudyoftheproessorsharingqueuewhih

hadoftenbeenproposedasamodelforTCPtransferssharingabottleneklink

intheInternet. A TCPsession isthenrepresentedbyoneustomerin thePS

queue.

Belowweused the PS queue with aPoisson arrivalproesswith a rateof

λ= 1ustomers perseond (A ustomer representsale when the proessor

sharingqueueis used to model le transfers). We onsider three servietime

distributions: Pareto with shape parameter 1.5, Pareto with shape 2.5, and

exponentially distributed. Wevary the average servietime σ soasto obtain

anaverageloadρ=λσ thattakesthevalues0.6and0.7.

Eahoneoftheguresbeloworrespondto thequeuesize oftheproessor

sharing averaging over 100 independent samples. When onsidering the PS

queueasamodel forbandwidthsharingintheInternet,thequeuesize should

beinterpretedasthenumberofongoingsessions.

Thedurationofeahsimulationis6·106seandthereisawarmuptimeof

300000se. Ineahoneofthesenariosdesribedin Figure1,wepresent:

1. thetheoretialsteadystateexpetedqueuesize

2. thevalueobtainedbythesimulations

3. theondeneintervalorrespondingtoa95%perentileofobtainedusing

thequantilemethod

4. the ondene intervalorrespondingto a 95%perentile obtainedafter

applyingthebootstrapmethod.

Wetook100000resamplesoutofour100originalsamplesforthebootstrap.

Figure1displaysthepreisionofthesimulationswithandwithouttheboot-

strap approahfor ρ = 0.6 (up) and ρ = 0.7 (down). The left subgures are

foranexponentialdistributed servietime,themiddleand rightonesarefora

(10)

Pareto distributed servietime withparameterK = 2.5 and K = 1.5,respe-

tively. Theaverageservietimeis thesamein theallthree sub-guresgures

orrespondingto thesameρ. (Itwashosensothat indeedρ=λ·σwillhave

thevalues0.6 and0.7,respetively.)

Weobservethefollowingpointsfrom thesimulations:

ˆ ThesimulationsshowwelltheinsensitivityofthePSregimetotheservie

time distribution. Indeed, in eah set of simulations having the same

loadρ, we see onvergene to the sameaverage ratefor the exponential ase,thePareto distributionwithparameterK= 2.5and forthePareto

distributionwithK= 1.5.

ˆ The95%ondene intervalwithoutbootstrapisaround10timeslarger

than that of the bootstrap (whih establishes learly the advantage of

usingit). Thisisseentoholdforanydurationofthesimulation.

ˆ Weseethattheondeneintervalarearound10timessmallerinthease

ofexponentialand Pareto with shapeparameterof 2.5 than in the ase

ofParetoshapeparameter1.5. Thisanbeexpetedsinethetailofthe

latterdistributionismuh heavier.

ˆ Durationofthesimulation:Forallthreedistributions,andforthedierent

loads, we see that the preision obtained with bootstrap after already

400000se is morethan three timesbetterthan that without bootstrap

after wesee that even after 6millionseonds. It thus seemsthat to get

thesamepreisionwithoutbootstrap,onewouldneedtousesimulations

muhlongerthan15timesasmuhaswithbootstrap.

4 On the simulation of the workload in a queue

with heavy tailed servie time

Thestatespaeneededtorepresenttheevolutionofthenumberofpaketsina

proessorsharingisquiteomplex: weneedtheresidualtimeofeahpaketthat

is in the system. Inontrast, the evolutionworkload in the systemis easyto

desribeasitanbewritten asaonedimensionalreursiveequation. Wethus

hose to illustrate some of the diulties in simulating the proessorsharing

queuebyashortdisussiononerningthesimulationoftheworkloadproess.

We onsider in this Setion the workload of an M/G/1 proessorsharing

queue withaPoisson arrivalproess withrate λand withi.i.d. servietimes.

Theworkloadisthetimeittakestoompletetransmissionofalltheonnetions

thatarepresentin thesystem. Assume throughoutthatρ <1sotheworkload

proessisergodi.

Wenote thattheworkloadproess isthesameastheoneofaFIFOqueue

withthesamearrivalsandservietimes.

In ase servie times do not have a nite seond moment, the expeted

workloadat steady stateis innite. (Indeed, the workloadis invariantunder

theservie disipline,and is thus equalto theoneof FIFOdisipline. Dueto

thePASTAproperty,theexpetedstationaryworkloadequalstothatatarrival

epohs whih equals the expeted waiting time. The latter is innite in the

M/G/1FIFOqueuesine sinethearrivalhasto wait morethan theresidual

(11)

servie time of the ustomer in servie, and the latter is proportional to the

seondmomentoftheservietime.)

Considernow the asewhere the servie time has anite seond moment

but innite third moment. In that ase the expeted stationary workload is

nitebutitsvarianeisinnite.

Assumethat wewishto estimatetheexpetedstationaryworkloadE[V]in

thesystem. Theworkloadsatisesthereursion

V0(v) =v, Vn+1(v) = (Vn(v) + Θnτn)+, n0

whereVnistheworkloadasseenbythentharrival,Θn istheworkloadbrought

bythentharrival,andτn isthetimebetweenthentharrivalepohTn andthe

(n+ 1)tharrivalinstantTn+1. E[V]anthenbeestimatedby Vbn=

Pn i=1Vi

n

forn large(due to the PASTA property, estimation at arrivalinstantsindeed

onvergestothestationaryrandomvariable). Denethebiasfuntion:

Rn(v) =E[Sn(v)]nE[V] where Sn(v) = Xn

i=1

Vi(v).

WenotethatRn(0)ismonotonedereasinginn. Ithasalimitasn→ ∞whih

wedenote by R. Rn(0) isknown asthebias; ifit werezero then Vbn =E[V].

OtherwizethetotalexpetedestimationerrorattimeTn isgivenbyRn(v)/n.

Toobtainanestimationerrorsmallerthanǫ,weneedtohoosensuhthat Rn/n < ǫ. It is advoated in [18℄ to hoose nsuh that R/n < ǫ. It ensures

that|Rn|/n < ǫ. Followingthisapproah,wedisover,unfortunately, thatthe simulationlength should betaken to be innitesine |R| turns to beinnite.

Wenextestablishthisonlusion.

LetVn bethestationary workloadproess WeassumethatVn andVn are

dened on ajoint probability spae: the arrivalproess and servietimes are

thesameinboth. Theonlydiereneisintheinitialstate.

Deneη(v)tobethersttimenthat Vn oupleswithVn. Letθ(v)bethe

rsttimethatithits0. Then

Rn(v) = Xn

i=0

E

1{η(v)> i}[Vi(v)Vi]

Wemakethefollowingkeyobservation: Whenv > V0thenη(v) =θ(v). Indeed,

aslongasVnandVn arebothpositivethenthedierene|VnVn|isonstant.

Itstart dereasingwhen thesmallerof thetwohitszeroand itvanisheswhen

theyarebothzero.

LetQ0(v) =v anddene forn >0: Qn(v) =

Xn

i=0

1{θ(v)> i}Vi(v)

NotethatRn(v)P(V < v)Qn(v).

Références

Documents relatifs

For an alphabet of size at least 3, the average time complexity, for the uniform distribution over the sets X of Set n,m , of the construction of the accessible and

In this paper, we use results on the average number of minimal transversals in random hypergraphs to show that the base of proper premises is, on average, of quasi-polynomial

For the class of the almost-periodic functions, a non normal limit distribution for the normalized error of the time average estimate can be obtained when the normalizing factor

Physically, this means that at energies of this order pions (which are the lightest hadrons) can be produced in collisions of protons in cosmic rays with CMB photons, which prevents

Theorem 3 The topology compression average ratio of a k-connecting unstretched remote spanner in an unit disk graph with n nodes and average degree δ is smaller than O(( k δ ) 2 3 )..

Alladi (Alladi, 1987) proved an Erd˝os–Kac theorem for integers without large prime factors... We follow closely the proofs of Propositions 2 and 3, making appropri-

However, volumes of limit and market orders are equal in the two latter cases, and we have shown in section 3 that these different shapes can actually be obtained with different

Knuth [5] shows ( n 5/3 ) for the average-case of p = 2 passes and Yao [14] derives an expression for the average case for p = 3 that does not give a direct bound but was used by