• Aucun résultat trouvé

Lifetime and availability of data stored on a P2P system: Evaluation of recovery schemes

N/A
N/A
Protected

Academic year: 2023

Partager "Lifetime and availability of data stored on a P2P system: Evaluation of recovery schemes"

Copied!
38
0
0

Texte intégral

(1)

HAL Id: inria-00448100

https://hal.inria.fr/inria-00448100v2

Submitted on 29 Apr 2010

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de

Evaluation of recovery schemes

Abdulhalim Dandoush, Sara Alouf, Philippe Nain

To cite this version:

Abdulhalim Dandoush, Sara Alouf, Philippe Nain. Lifetime and availability of data stored on a P2P

system: Evaluation of recovery schemes. [Research Report] RR-7170, INRIA. 2010, pp.37. �inria-

00448100v2�

(2)

a p p o r t

d e r e c h e r c h e

-6 3 9 9 IS R N IN R IA /R R -- 7 1 7 0 -- F R + E N G

Thème COM

Lifetime and availability of data stored on a P2P system: Evaluation of redundancy and recovery

schemes

Abdulhalim Dandoush — Sara Alouf — Philippe Nain

N° 7170

April 2010

(3)
(4)

AbdulhalimDandoush , Sara Alouf ,PhilippeNain

ThèmeCOMSystèmesommuniants

ProjetMAESTRO

Rapportdereherhe n° 7170April201034pages

Abstrat: ThisreportstudiestheperformaneofPeer-to-Peerstorageandbakupsystems

(P2PSS). Thesesystemsare basedonthree pillars: data fragmentationand dissemination

amongthepeers,redundanymehanismstoopewithpeershurn,andrepairmehanisms

to reoverlost or temporarilyunavailable data. Usually, redundany is ahievedeither by

using repliation orby using erasure odes. A new lass of network oding (regenerating

odes)hasbeenproposedreently. Therefore,wewill adaptourworktothese threeredun-

danyshemes. Weintroduetwomehanismsforreoveringlost data and evaluatetheir

performanebymodelingthemthroughabsorbingMarkovhains. Speially,weevaluate

thequalityofservieprovidedtousersintermsofdurabilityandavailabilityofstoreddata

foreahreoverymehanismanddeduetheimpatofitsparametersonthesystemperfor-

mane. Therstmehanismis entralizedandbasedontheuseofasingleserverthat an

reovermultiplelossesatone. Theseondmehanismisdistributed: reonstrutionoflost

fragmentsisiteratedsequentiallyonmanypeersuntiltherequiredlevelofredundanyisat-

tained. Thekeyassumptionsmadeinthiswork,inpartiular,theassumptionsmadeonthe

reoveryproess andpeeron-timesdistribution, arein agreementwiththeanalysis in [11℄

andin [20℄ respetively. Themodels aretherebygeneralenoughto beappliable tomany

distributedenvironmentsasshownthroughnumerialomputations. Wendthat,instable

environmentssuhas loal areaorresearhinstitute networks wheremahinesare usually

highly available, the distributed-repair sheme in erasure-oded systems oers a reliable,

salableandheapstorage/bakupsolution. Fortheaseofhighly dynamienvironments,

in general, thedistributed-repairshemeis ineient, in partiular to maintainhigh data

availability, unless thedata redundany is high. Using regeneratingodes overomesthis

limitationof thedistributed-repairsheme. P2PSS withentralized-repairsheme aree-

ient in anyenvironmentbut havethe disadvantageof relyingon aentralized authority.

However,theanalysisoftheoverheadost(e.g. omputation,bandwidthostandomplex-

ity)resultingfrom thedierentredundanyshemeswithrespetto theiradvantages(e.g.

simpliity),isleftforfuturework.

(5)

Key-words: Peer-to-Peer network, distributed storage system, performane evaluation,

absorbingMarkovhain,dataavailability,datadurability,reoveryproess

(6)

redondane et de reouvrement

Résumé : Ce rapport étudie les performanes de systèmes pair-à-pair de stokage des

données(P2PSS).Ces systèmes reposentsur troispiliers: la fragmentationdes donnéeset

leurdisséminationhezlespairs,laredondanedesdonnéesandefairefaeauxéventuelles

indisponibilités des pairset l'existened'un méanisme de réparationoude reouvrement

desdonnéesperduesoutemporairementindisponibles. Nousintroduisonsdeuxméanismes

dereouvrementdesdonnéesperduesetenévaluonslesperformanesenlesmodélisantpar

deshaînesde Markovabsorbantes. Plus préisément,nous évaluonslaqualité duservie

rendu aux utilisateurs en terme de longévité et de disponibilité des données de haque

méanisme et en déduisons l'impat de ses paramètressur ses performanes. Le premier

méanismeestentraliséetreposesurl'utilisationd'ununiqueserveurpourlareonstrution

desfragmentsdedonnéeperdus. Leseondméanismeestdistribué: lareonstrutiondes

fragments perdus met en oeuvre, séquentiellement, plusieurs pairs et s'arrête dès que le

niveaude redondanerequis est atteint. Les prinipaleshypothèsesfaitesdans e travail,

notammentellesportantsurladistributionduproessusdereouvrementetdedisponibilité

des pairs,sonten aordaveles hypothèses dans[11,20℄. Nosmodèles, qui généralisent

euxdans [10℄, sontsusammentgénériques pour s'appliquerà diérentsenvironnements

distribués, omme le montrent nos appliations numériques. Nous onstatons que dans

des environnements stables omme les réseaux loaux des instituts de reherhe, où les

mahinessontgénéralementtrèsdisponibles, leméanismede reouvrementdistribuéore

unesolutiondestokageable,évolutiveetpeuoûteuse. Pourdesréseauxtrèsdynamiques

l'eaitéduméanismedistribuéestinversementproportionnelleauniveauderedondane

des données, tout partiulièrement pour e qui onerne leur disponibilité. Les systèmes

destokage quiimplémentent leméanismede reouvremententralisé sonteaesdans

n'importe quelenvironnement,leurtalon d'ahilleétant lespannesduserveurqui rendent

inopéranteslareonstrutiondesdonnéesperdues.

Mots-lés : systèmes pair-à-pair, évaluation de performane, haîne de Markov absor-

bante,approximationhampmoyen

(7)

1 Introdution

Conventionalstoragesolutionsrelyonrobustdediatedserversandmagnetitapesonwhih

dataarestored. Theseequipmentsarereliable,buttheyarealsoexpensiveanddonotsale

well. Thegrowthof storagevolume, bandwidth,and omputationalresouresforPCs has

fundamentallyhangedthewayappliations are onstruted. Almost10 yearsago,anew

networkparadigmhasbeenproposedwhere omputers anbuild avirtualnetwork (alled

overlay)on top of another network or an existing arhiteture (e.g. Internet). This new

networkparadigmhasbeenlabeledpeer-to-peer(P2P) distributednetwork. A peerin this

paradigmis aomputer that play the role of both supplierand onsumerof resoures, in

ontrast to the traditional lient-server model where only servers supply, and omputers

onsume. Appliationsthatusethis distributednetworkprovidesenhanedsalabilityand

servierobustnessasall theonnetedomputers orpeersprovidesomeservies. Peers in

theoverlayanbethoughtofasbeingonnetedbyvirtualorlogiallinks,eah ofwhih

orrespondstoapath,perhapsthroughmanyphysiallinks,intheunderlyingnetwork. As

already mentioned,eah peer reeives/providesa serviefrom/to other peers throughthe

overlaynetwork;examplesofsuhserviesareomputing(sharingtheapaityofitsentral

proessingunit),dataupload(sharingitsbandwidthapaity),datastorage(sharingitsfree

storagespae),aswellassupport toloateresoures,serviesandotherpeers.

ThisP2PmodelhasprovedtobeanalternativetotheClient/Servermodelandapromis-

ingparadigmforGridomputing,lesharing,voieoverIP,bakupandstorageappliations.

Someof thereenteortsforbuilding highlyavailablestorageorbakup systembasedon

the P2P paradigminlude Intermemory [13℄, OeanStore [18℄, CFS [9℄, PAST [24℄, Total

Reall [5℄, UbiStorage [27℄ andTahoe[28℄. Although salableand eonomiallyattrative

ompared to traditional storage/bakup systems,these P2Psystems pose many problems

suh asreliability,dataavailabilityandondentiality.

We andistinguishbetweenbakup and storagesystems. P2P bakup systemsaim to

provide long data lifetime without onstraints on the data availability level orthe reon-

strutiontime. Forthisreason,thebakupsystemdesignersareinterestedinthepermanent

departuresofpeersratherthantheintermediatedisonnetions,ontheontraryto storage

systems,evenifthedisonnetionsdurationswerelong.

1.1 Redundany shemes

In a P2P network, peers are free to leave and join the system at any time. As a result

of theintermittent availability ofpeers, ensuring high availabilityof thestored data isan

interesting and hallenging problem. To ensure data reliability and availability in suh

dynami systems, redundant data is inserted into the system. Existing systems ahieve

redundanyeitherbyrepliation,wheretherearetworepliationlevels,orbyerasureodes

(e.g. [23,6℄). Thetworepliationlevelsareasfollows:

ˆ The whole-le-level repliation sheme. A le

f

is repliated

r

times over

r

dierent

peers(asinPAST[24℄)sothatthetoleraneagainstfailuresorpeersdepartureisequal

(8)

to

r

. Theratio

1/{r + 1}

denestheusefulstoragespaeinthesystem. Hereafter,we

willrefertothisrepliationshemeasrepliation.

ˆ The fragment-levelrepliationsheme. Thisshemeonsistsofdividingthele

f

into

s

equally sized fragments, and then make

r

opies of eah of them and plae one

fragmentopyperpeer,asinCFS[9℄.

Theerasureode(EC) shemeonsistsof dividingthele

f

into

b

equallysizedbloks

(say

S B

bits). Eah blok of data

D

is partitioned into

s

equally sized fragments (say

F EC = S B /s

bits)towhih,usingoneoftheerasureodessheme(e.g. [23,6℄),

r

redundant

fragments are added of the same size. To download a blok of data, any

s

fragments

are needed out of the

s + r

(downloadingthe size of the original blok

S B

). Reovering

any fragment (if it is lost) or adding a new redundant fragment of a given blok of data

requiresthedownloadofanyother

s

fragmentsoutoftheavailablefragmentsofthatblok

(downloadingagain thesize of theoriginal blok

S B

). Therefore,for eah stored blokof

data,thetoleraneagainstfailuresorpeersdepartureisequalto

r

. Theusefulstoragespae

inthesystemisdened bytheratio

s/(s + r)

. Intermemory[13,7℄, OeanStore[18℄, Total Reall[5℄andUbiStorage[27℄aresomeexamplesofexistingP2Psystemsthatuseerasure

oding mehanismsto providesomelevelofsystemreliabilityanddataavailability.

Forthe same amountof redundany, erasure odes provide higher availability of data

thanrepliation[29℄. In[3℄,theauthors showthatanerasureodesshememakesbakup

systemsmoresalablethan repliationand blok-levelrepliation shemesastherequired

availability gets higher. They show as well that the salability of the blok-level sheme

withrespettothetotalstoragerequiredisevenlowerthanthat oftherepliationsheme.

A new lass of odes, so-alled regenerating odes (RC) has been proposed reently

in [12℄. RC an be onsidered a generalization of erasure ode (EC), whih redues the

ommuniationost of EC by slightly inreasing the storage ost. The size of fragments

in RC is larger of that in EC. In[12℄, the authors onsider in Theorem 1, p. 5 a simple

shemeinwhihtheyrequirethatany

s

fragments(theminimumpossible)anreonstrut

theoriginalblokofdata. Allfragmentshaveequalsize

F RC = α ∗ S B

,where

S B

standsfor

thesizeofthegivenblokofdatatobestored. Anewommer(anewpeerinournotation)

produesanewredundantfragmentbyonnetingtoany

s

nodesanddownloading

αS B /s

bits from eah. In this theorem, the authors assume that the soure node of the blok

of data will store initially

n

fragmentsof size

αS B

bits on

n

storage nodes. In addition,

newommersarrivesequentiallyandeahoneonnetstoanarbitrary

k

-subsetofprevious

nodes(inludingpreviousnewommers). Theydene

α c := s

s 2 − s + 1

to be,in theworth

ase,thelowerboundontheminimalamountofdatathatanewomermustdownload. The

worthaseisouredwhenadataolletor(lient)needtoreovertheoriginalblokofdata

from only newomers. Ingeneral, if

α ≥ α c

there exists a linearnetwork ode sothat all

dataolletorsanreonstrut theonsideredblokbydownloading

s

fragmentsfrom any

s

nodes. So,using thissimplesheme ofRC,addinganewredundantfragmentofagiven

blok requires anewpeer to download

1/s

perentof

s

storedfragments(

αS B /s

of eah)

sothatthe newpeer regeneratesonerandom linearombinationof theparts offragments

(9)

alreadydownloaded;thenewpeerwillstoreallthedownloadeddatawhosesizeisequalto

thesizeofthestoredfragmentsinsteadtodownloadtheequivalentoforiginalbloksize,in

theaseofEC,toregenerateonefragmentanddeletinglaterthedownloadedfragments. In

thesameway, downloadingtheblok,by adata olletor,in EC requires thedownloadof

itssize(

s ∗ F EC = S B

),whereinRC,itrequiresthedownloadof

s∗ F RC = s∗ α∗ S B = βS B

,

where

β > 1

. Theauthorsshowthat

β → 1

as

s → ∞

. Untilthetimeofwritingthisreport,

theregeneratingodesis notyet used inanyP2Psystem. However,wewillshowthrough

numerialresultsthat RCisapromissingredundanyshemeforP2Pstorageand bakup

systems.

1.2 Reovery mehanisms and poliies

Using redundany mehanisms withoutrepairing lost data is not eient, as the level of

redundany dereases when peers leave the system. Consequently, P2P storage/bakup

systemsneedto ompensatethelossof databy ontinuously storingadditionalredundant

dataonto newhosts.

Systems may rely on a entral authority that reonstruts fragments when neessary;

thesesystemswillbereferredtoasentralized-reoverysystems. Alternatively,seureagents

runningonnewhostsanreonstrutbythemselvesthedatatobestoredonthehostsdisks.

Suh systemswill be referred to asdistributed-reovery systems. A entralized server an

reoveratonemultiplelossesofthesamedoumentintheentralized-reoverysheme. In

thedistributedase,eahnewhostthankstoitsseureagentreoversonlyonelossper

doument.

Regardlessofthereoverymehanismused,tworepairpoliies anbeenfored. Inthe

eager poliy, when thesystem detets that onehost has left the network, it immediately

initiatesthereonstrutionofthelostdata,andstoresitonanewpeeruponreovery. This

poliy is simple but makes no distintion between permanentdepartures that need to be

reovered,andtransientdisonnetionsthatdonot.

Havingin mind that onnetionsmayexperienetemporary, asopposed to permanent

failures,onemaywantto deployasystemthat deferstherepairbeyondthedetetionof a

rst lossof data. This alternativepoliy, alled lazy, inherently usesless bandwidththan

theeagerpoliy. However,itisobviousthatanextraamountofredundanyisneessaryto

maskandtotoleratehostdeparturesforextendedperiodsoftime.

1.3 Contributions

Theaim ofthis reportis todevelopmathematialmodels toevaluatefundamental perfor-

manemetris(datalifetimeandavailability)ofP2PSS. Ourontributionsareasfollows:

ˆ Analysisofentralizedanddistributed reoverymehanisms.

ˆ Propositionofageneralmodelthatapturesthebehaviorofeager/lazyrepairpoliies

andthethreerepliationshemes,andaommodatesbothtemporaryandpermanent

disonnetionsofpeers.

(10)

ˆ Numerialinvestigationusingrealistiparametersvaluestosupportthemathematial

models andto ompareerasureodeswithregeneratingodes.

ˆ Provision ofguidelines onhowto engineer aP2Pbakuporstorage systemin order

tosatisfygivenrequirements.

Inthefollowing,Setion 2reviews relatedworkand Setion 3introduestheassumptions

andnotationusedthroughoutthereport. InSetion4,wedisusssomepreliminarymaterial

to beused in subsequentsetions. Setions 5and 6are dediated to the modeling of the

entralized-anddistributed-reoverymehanism,respetively. InSetion8,weprovidesome

numerialresultsshowingtheperformaneoftheentralizedanddeentralizedshemeswhen

eithererasureodesorregeneratingodesareused. Setion 9onludesthereport.

2 Related Work and Bakground

Althoughtheliteratureonthearhitetureandlesystemofdistributedbakupandstorage

systemsis abundant, mostof these systemsare ongured statiallyto providedurability

and/oravailabilitywithonlyaursoryunderstandingofhowtheongurationwillimpat

overall performane. Some systems allow data to be repliated and ahed without on-

straintsonthestorageoverheadoronthefrequenyatwhihdataareahedorreovered.

Theseyieldtowasteofbandwidthandstoragevolumeanddonotprovidealearpredened

durabilityand availabilitylevel. Hene,theimportane ofthethorough evaluation ofP2P

storagesystemsbefore theirdeployment.

There havebeen reent modeling eorts fousing on the performane analysis of P2P

bakupandstoragesystemsintermsofdatadurabilityandavailabilitysuhas[22,1℄and[?℄.

However,inallthesemodels,ndingsandonlusionsrelyontheassumptionthatthepeers

availabilityandthereoveryproessareexponentiallydistributed withsomeparameterfor

the sake of simpliation or for the lak of works haraterizing their distribution under

realistisettings andassumptions.

Charaterizingpeersavailabilitybothinloalandwideareaenvironmentshasbeenthe

fous of[20℄. Inthis paper, Nurmi,Brevik andWolski investigate threesets of data,eah

measuringmahine availabilityin adierent setting, and perform goodness-of-t tests on

eah dataset toassesswhihoutof fourdistributions bestts thedata. Theyhavefound

thatahyper-exponentialmodeltsmoreauratelythemahineavailabilitydurationsthan

theexponential,Pareto,orWeibulldistribution.

Tounderstandhowthereoveryproessouldbebettermodeled,weperformedapaket-

levelsimulationanalysis ofthe downloadand thereoveryproessesin erasure-odedsys-

tems; f. [11℄. We ran several experiments and olleted a large number of samples of

theseproessesin alargevarietyofsenarios. Weusedexpetationmaximizationandleast

square estimation algorithms to t the empirial distributions and tested the goodness of

ourtsusingstatistial(Kolmogorov-Smirnovtest)andgraphialmethods. Wefoundthat

thedownloadtimeofafragmentofdataloatedonasinglepeerfollowsapproximatelyan

(11)

exponentialdistribution. Wealso foundthat the reoverytime essentiallyfollowsahypo-

exponentialdistributionwithmanydistintphases.. Wefoundthatthedownloadtimeofa

fragmentofdataloatedonasinglepeerfollowsapproximatelyanexponentialdistribution.

Wealsofoundthatthereoverytimeessentiallyfollowsahypo-exponentialdistributionwith

manydistintphases.

In light of theonlusions of [20℄, namely, that mahine availability is modeled with a

hyper-exponentialdistribution,and buildingon thendingsof[11℄wewillproposein this

reportgeneralandauratemodelsthat arevalidunderdierentdistributedenvironments.

3 System Desription and Assumptions

Inthefollowing,wewilldistinguishthepeers,whihareomputerswheredataisstoredand

whih form astoragesystem, fromtheusers whoseobjetiveis toretrievethedatastored

inthestoragesystem.

We onsider a distributed storage system whih peers randomly join and leave. The

followingassumptionsontheP2PSS designwill beenforedthroughoutthereport:

ˆ Ablokofdata

D

ispartitionedinto

s

equallysizedfragmentstowhih,usingerasure

odesorregeneratingode,

r

redundantfragmentsareadded. Theaseofrepliation- basedredundanyisequallyapturedbythisnotation,aftersetting

s = 1

andletting

the

r

redundantfragmentsbesimplerepliasoftheuniquefragmentoftheblok. This

notationandheneourmodelingisgeneralenoughtostudybothrepliation-based

anderasureode-basedstoragesystems.

ˆ Mainlyforprivayissues,apeeranstoreatmostonefragmentofanydata

D

.

ˆ Weassumethesystemhasperfetknowledgeoftheloationoffragmentsatanygiven

time,e.g. byusingaDistributedHashTable(DHT)oraentralauthority.

ˆ Thesystemkeepstrakofonlythelatest knownloationofeahfragment.

ˆ Overtime,apeeran beeitheronnetedtoordisonnetedfromthestoragesystem.

Atreonnetion,apeer mayormaynotstillstoreitsfragments. Wedenoteby

p

the

probabilitythatapeerthat reonnetsstillstoresitsfragments.

ˆ Thenumberofonnetedpeersatanytimeistypiallymuhlargerthanthenumber

offragmentsassoiatedwith

D

,i.e.,

s +r

. Therefore,weassumethattherearealways

atleast

s + r

onnetedpeershereafterreferredtoasnewpeerswhiharereadyto reeiveandstorefragmentsof

D

.

We refer to as on-time (resp. o-time) atime-interval during whih a peer is always

onneted (resp. disonneted). During a peer's o-time, the fragments stored on this

peer aremomentarily unavailable totheusersofthestoragesystem. Atreonnetion,and

aordingtotheassumptionsabove,thefragmentsstoredonthispeerwillbeavailableonly

(12)

with a persistene probability

p

(and with probability

1 − p

they are lost). In order to

improve data availability and inrease the reliability of the storagesystem, it is therefore

ruialtoreoverfromlossesbyontinuouslymonitoringthesystemandaddingredundany

wheneverneeded.

We will investigatethe performaneof the twodierent repairpoliies: theeagerand

thelazyrepairpoliies. Reallthatinthelazypoliy,therepairisdelayeduntilthenumber

ofunavailablefragmentsreahesagiventhreshold,denoted

k

. Wemusthave

k ≤ r

sine

D

is lostif morethan

r

fragments aremissing from thestoragesystem. Bothrepairpoliies

anberepresentedbythethresholdparameter

k ∈ {1, 2, . . . , r}

,where

k

antakeanyvalue

intheset

{2, . . . , r}

inthelazypoliyand

k = 1

in theeagerpoliy.

Anyrepairpoliyanbeimplementedeitherinaentralizedoradistributedway. Inthe

followingdesription,weassumethatthesystemmisses

k

fragmentsso thatlostfragments

haveto berestored.

In the entralized implementation, aentral authoritywill: (1) downloadin parallel

s

fragmentsfromthepeerswhihareonneted,(2)reonstrut at oneall theunavail-

ablefragments,and(3)uploadthereonstrutedfragmentsinparallelontoasmanynew

peersforstorage. Theentralauthorityupdatesthedatabasereordingfragmentsloations

assoonasalluploadsterminate. Step2exeutesinanegligibletimeomparedtotheexe-

utiontimeofSteps1and3andwill heneforthbeignoredin themodeling. Step1(resp.

Step3)endsexeutingwhenthedownload(resp. upload)ofthelastfragmentisompleted.

Inthedistributedimplementation,a seureagentononenewpeerisnotied

ofthe identityofone out ofthe

k

unavailablefragmentsfor ittoreonstrut it.

Upon notiation, the seure agent (1) downloads

s

fragments (or

s

parts of

s

fragmentsifRCisused)of

D

fromthepeerswhihareonnetedtothestorage

system,(2)reonstrutsthe speied fragmentand storesiton thepeer'sdisk;

(3) subsequently disards the

s

downloaded fragments, in the ase of erasure

ode, so as to meet the privay onstraint that only one fragment ofa blok of

data may be held by a peer. This operationiterates until lessthan

k

fragments are

sensedunavailableandstopsifthenumberofmissingfragmentsreahes

k − 1

. Thereovery

of onefragment lasts mainly for the exeution time of Step 1. We will thus onsider the

reoveryproesstoendwhenthedownloadofthelastfragment(outof

s

)isompleted.

Inbothimplementations,oneafragmentisreonstruted,anyotheropyof

itthat reappears in the systemdue to a peer reonnetion issimplyignored,

as only one loation (the newest) of the fragment is reorded in the system.

Similarly,ifafragmentisunavailable,thesystemknowsofonlyonedisonneted

peerthat stores the unavailablefragment.

Giventhesystemdesription,data

D

anbeeither available, unavailable orlost. Data

D

is saidto beavailable ifany

s

fragmentsoutof the

s + r

fragmentsanbedownloaded

by the usersof the P2PSS. Data

D

is said to be unavailable if less than

s

fragments are

availablefordownload,howeverthemissingfragmentstoomplete

D

areloatedat apeer

oraentralauthorityon whih areoveryproessisongoing. Data

D

is saidto be lostif

therearelessthan

s

fragmentsinthesysteminludingthefragmentsinvolvedinareovery

(13)

proess. We assume that, at time

t = 0

, at least

s

fragments are available so that the

doumentisinitiallyavailable.

Wenowintroduetheassumptionsonsideredin ourmodels.

Assumption1: (o-times) Weassumethatsuessivedurationsofo-timesofapeerare

independent andidentially distributed (iid) randomvariables (rvs) with aommon

exponentialdistributionfuntionwithparameter

λ > 0

.

Assumption1is inagreementwiththeanalysisin[22℄.

Assumption2: (on-times) We assume that suessive durations of on-times of a peer

areiidrvswithaommonhyper-exponentialdistributionfuntionwith

n

phases;the

parametersofphase

i

are

{p i , µ i }

,with

p i

theprobabilitythatphase

i

isseletedand

1/µ i

themeandurationofphase

i

. Wenaturallyhave

P n

i=1 p i = 1

.

Assumption2,with

n > 1

,isinagreementwiththeanalysisin [20℄;when

n = 1

,itisin

agreementwiththeanalysis in[22℄.

Assumption3: (independene) Suessive on-times and o-times are assumed to be

independent. Peersareassumedto behaveindependentlyofeahother.

Assumption4: (download/upload durations) We assume that suessive download

(resp. upload) durationsof a fragment are iid rvs with a ommon exponential dis-

tribution funtion with parameter

α

(resp.

β

). We further assume that onurrent

fragmentsdownloads/uploadsarenotorrelated.

Assumption4issupportedbyourndingsin[11℄. Asalreadymentioned,thefragment

download/upload timewasfoundto followapproximatelyanexponentialdistribution. As

fortheonurrentdownloads/uploads,wehavefound insimulationsthat theseare weakly

orrelatedandlosetobeindependentaslongasthetotalworkloadisequallydistributed

overtheativepeers. Therearetwomainreasonsfortheweakorrelationbetweenonur-

rent downloads/uploadsasobservedin simulations: (i)the good onnetivityof nowadays

orenetworksand(ii)theasymmetryinpeersupstreamanddownstreambandwidths,ason

average,apeertendsto havehigherdownstreamthanupstreambandwidth[25,15℄. So,as

thebottlenekwouldbetheupstream apaityof peers, the fragmentdownloadtimesare

losetobeiidrvs.

AonsequeneofAssumption4isthateahoftheblokdownloadtimeandthedurations

oftheentralizedandthedistributedreoveryproessesisarvfollowingahypo-exponential

distribution [16℄. Indeed, eah of these durations is the summation of independently dis-

tributedexponentialrvs(

s

fortheblokdownloadandinthedistributedsheme,and

s + k

intheentralizedshemeif

k

fragmentsareto bereonstruted)havingeah itsownrate.

Thisisafundamentaldierenewith[1℄wherethereoveryproessisassumedtofollowan

exponentialdistribution.

It is worthmentioningthat the simulationanalysis of [11℄ has onludedthat in most

ases the reovery time follows roughly a hypo-exponential distribution. This result is

(14)

Table1: Systemparameters.

D

Blokofdata

s

Originalnumberoffragmentsforeahblokofdata

r

Numberofredundantfragments

k

Thresholdofthereoveryproess

p

Persisteneprobability

λ

Rateatwhihpeersrejointhesystem

{p i , µ i } i=1,...,n

Parametersofthepeersfailure proess

α

Downloadrateofapiee ofdata(fragment)

β

Uploadrateofafragmentin theentralized-repairsheme

expeted as long as fragments downloads/uploads are exponentially distributed and very

weakly orrelated. It wasalso found in [11℄ that a hypo-exponential model gives a more

reasonableapproximationof thereoveryproess thananexponentialmodelevenin ases

whenthenullhypothesis isrejeted.

Given Assumptions14, the models developed in this report are more generaland/or

morerealistithan thosein [22,1,10℄. Table1reapitulatestheparametersintrodued in

thissetion. Wewillreferto

s

,

r

and

k

astheprotoolparameters,

p

,

λ

and

{p i , µ i } i=1,...,n

asthepeersparameters,and

α

and

β

asthenetworkparameters.

4 Preliminaries and Notation

Wewillfousinthissetiononthedynamiofpeersinthestoragesystem. Inpartiular,we

areinterestedinomputingthestationarydistribution ofpeers. AordingtoAssumptions

13intheprevioussetion,eahtimeapeerrejoinsthesystem,itpiksitson-timeduration

fromanexponentialdistributionhavingparameter

µ i

with probability

p i

,for

i ∈ [1..n]

. In

otherwords,apeeranstayonnetedforashorttime inasession andforalongtime in

anotherone.

This dynamiity anbemodeledas ageneralqueueing network with anarbitrary but

nite number

n

ofdierentlassesof ustomers(peers)and aninnitenumberof servers.

Inthisnetwork,anewustomerenters diretly,withprobability

p i

,aserverwithaservie

rate

µ i

. Dene

P I(~n = (n 1 , . . . , n n )) := lim t→∞ P(N 1 (t) = n 1 , . . . , N n (t) = n n )

to bethe

jointdistributionfuntion ofthenumberof ustomersoflass

1, . . . , n

in steady-state(or, equivalently, thenumber of busy servers) where

N i (t)

is thenumberof peers of lass

i

in

thesystemattime

t

for

i = 1, . . . , n

. Wehavethefollowingknownresults[2,17℄:

P I(~n) = 1/G

n

Y

i=1

ρ n i i n i !

where

ρ i = λp i /µ i

is the rate at whih work enters lass

i

and

G

is the normalizing onstant.

(15)

G = X

~ n∈N n

n

Y

i=1

ρ n i i n i !

Denotetheexpetednumberofustomersoflass

i

inthesystemby

E[n i ]

where

E[n i ] = X

~ n∈N n

n i P I (~n) = ρ i n

Y

l=1

e ρ l ,

for

i = 1, . . . , n

.

Forlateruse,wewillomputetheprobabilityofseletinganewpeerinphase

i

,denoted

by

R(i)

,orequivalentlytheperentageoftheonnetedpeersinphase

i

asfollows:

R(i) = E[n i ] P n

l=1 E[n l ] = ρ i

P n

l=1 ρ l = p i /µ i

P n l=1 p l /µ l

(1)

Weintrodue aswellfuntions

S

and

f

suh that for agiven

n

-tuple

~a = (a 1 , . . . , a n )

,

S(~a) := P n

i=1 a i

and

f i (~a) := a i /S(~a)

.

We onlude this setion by a word on the notation: a subsript

c

(resp.

d

) will

indiate that weare onsidering the entralized (resp. distributed) reoverysheme. The

notation

~e j i

referstoarowvetorofdimension

j

whoseentriesarenullexeptthe

i

-thentry

thatisequalto

1

;thenotation

~ 1 j

referstoaolumnvetorofdimension

j

whoseeahentry

isequalto

1

;andthenotation

~ 0

referstoanullrowvetorofappropriatedimension.

1

l

{A}

is the harateristi funtion of event

A

. The notation

[a] +

refersto

max{a, 0}

. The set

of integersranging from

a

to

b

is denoted

[a..b]

. Given aset of

n

rvs

{B i (t)} i∈[1..n]

,

B(t) ~

denotesthevetor

(B 1 (t), . . . , B n (t))

and

B ~

denotesthestohastiproess

{ B(t), t ~ ≥ 0}

.

5 Centralized Repair Systems

Inthissetion,weaddresstheperformaneofP2PSSusingtheentralized-reoverysheme.

We will fous on a single blok of data

D

, and pay attention only to peers storing

fragmentsofthisblok. Atanytime

t

,thestateofablok

D

anbedesribedbyboththe

numberoffragmentsthatareavailable fordownloadandthestateofthereoveryproess.

Whentriggered,thereoveryproessgoesrstthroughadownloadphase (fragmentsare

downloadedfromonnetedpeerstotheentralauthority)thenthroughanuploadphase

(fragmentsareuploadedto newpeersfromtheentral authority).

Moreformally,weintrodue

n

-dimensionalvetors

X ~ c (t)

,

Y ~ c (t)

,

Z ~ c (t)

,

U ~ c (t)

,and

V ~ c (t)

,

where

n

is the number of phases of the hyper-exponentialdistribution of peers on-times durations, and a

5n

-dimensional vetor

W ~ c (t) = ( X ~ c (t), ~ Y c (t), ~ Z c (t), ~ U c (t), ~ V c (t))

. Vetors

Y ~ c (t)

and

Z ~ c (t)

desribethedownloadphaseofthereoveryproesswhereas

U ~ c (t)

and

V ~ c (t)

desribeitsuploadphase. Theformaldenition ofthesevetorsisasfollows:

ˆ

X ~ c (t) := (X c,1 (t), . . . , X c,n (t))

where

X c,l (t)

is a

[0..s + r]

-valued rv denoting the

numberoffragmentsof

D

storedonpeersthatarein phase

l

attime

t

.

(16)

ˆ

Y ~ c (t) := (Y c,1 (t), . . . , Y c,n (t))

where

Y c,l (t)

isa

[0..s −1]

-valuedrvdenotingthenumber

of fragmentsof

D

beingdownloadedat time

t

to theentral authority frompeers in

phase

l

(one fragmentperpeer).

ˆ

Z ~ c (t) := (Z c,1 (t), . . . , Z c,n (t))

where

Z c,l (t)

isa

[0..s]

-valuedrvdenotingthenumberof

fragmentsof

D

hold attime

t

bytheentralauthorityandwhosedownloadwasdone

frompeersinphase

l

(onefragmentperpeer). Observethatthesepeersmayhaveleft

thesystembytime

t

.

ˆ

U ~ c (t) := (U c,1 (t), . . . , U c,n (t))

where

U c,l (t)

is a

[0..s + r − 1]

-valued rvdenotingthe

numberof(reonstruted) fragmentsof

D

beinguploaded attime

t

from theentral

authoritytonewpeersthat areinphase

l

(one fragmentperpeer).

ˆ

V ~ c (t) := (V c,1 (t), . . . , V c,n (t))

where

V c,l (t)

is a

[0..s + r − 1]

-valued rv denotingthe

numberof(reonstruted)fragmentsof

D

whoseuploadfromtheentralauthorityto

newpeersthatareinphase

l

hasbeenompletedattime

t

(onefragmentperpeer).

Giventheabovedenitions,weneessarilyhave

Y c,l (t) ≤ X c,l (t)

for

l ∈ [1..n]

atanytime

t

.

Thenumberoffragmentsof

D

thatareavailablefordownloadattime

t

isgivenby

S( X ~ c (t))

(reallthedenition ofthefuntion

S

in Setion3). Giventhat

s

fragmentsof

D

needto

bedownloadedtotheentralauthorityduringthedownloadphaseofthereoveryproess,

we will have (during this phase)

S( Y ~ c (t)) + S( Z ~ c (t)) = s

, suh that

S( Y ~ c (t)), S( Z ~ c (t)) ∈ [1..s − 1]

. Onethedownloadphaseisompleted,theentralauthoritywillreonstrut at

one allmissing fragments, that is

s + r − S( X ~ c (t))

. Therefore, during the uploadphase,

we have

S( U ~ c (t)) + S( V ~ c (t)) = s + r − S( X ~ c (t))

. Observethat, one the download phase

is ompleted, the number of available fragments,

S( X ~ c (t))

, may well derease to 0 with

peersallleavingthesystem. Insuhasituation,theentralauthoritywillreonstrut

s + r

fragmentsof

D

. Assoon asthe downloadphaseis ompleted

Y ~ c (t) = ~ 0

and

S ( Z ~ c (t)) = s

.

The end of the upload phase is also the end of the reoveryproess. We will then have

Y ~ c (t) = Z ~ c (t) = U ~ c (t) = V ~ c (t) = ~ 0

untilthereoveryproessisagaintriggered.

Aording to theterminologyintrodued in Setion 3, at time

t

, data

D

is available if

S( X ~ c (t)) ≥ s

,regardlessofthestateofthereoveryproess. Itisunavailable if

S( X ~ c (t)) <

s

but

S( Z ~ c (t))

the number of fragments hold by the entral authorityis larger than

s − S( X ~ c (t))

and at least

s − S( X ~ c (t))

fragmentsoutof

S( Z ~ c (t))

are dierent from those

S( X ~ c (t))

fragmentsavailable onpeers. Otherwise,

D

is onsidered to be lost. Thelatter

situationwillbemodeledbyasinglestate

a

.

Ifareoveryproessisongoing,theexatnumberofdistintfragmentsof

D

thatarein

thesystemountingboththosethatareavailableandthoseholdbytheentralauthority

may be unknown due to peers hurn. However,we are ableto nd a lower bound on it,

namely,

b( X ~ c (t), ~ Y c (t), ~ Z c (t)) :=

n

X

l=1

max{X c,l (t), Y c,l (t) + Z c,l (t)}.

(17)

Infat,the unertainty aboutthe numberof distint fragmentsis aresultof peershurn.

Thatsaid, thisbound isverytightandmostoftengivestheexatnumberof distintfrag-

ments sine peers hurn ours at a muh larger time-sale than a fragment download.

In our modeling, we onsider an unavailable data

D

to beome lost when the bound

b

takesa valuesmaller than

s

. Observe that, ifthe reoveryproess is nottriggered, then

b( X ~ c (t),~ 0,~ 0) = S( X ~ c (t))

givestheexatnumberofdistintfragments.

The system state at time

t

an be represented by the

5n

-dimensional vetor

W ~ c (t)

.

ThankstotheassumptionsmadeinSetion3,themulti-dimensionalproess

W ~ c := { W ~ c (t), t ≥ 0}

isanabsorbinghomogeneousontinuous-timeMarkovhain(CTMC)withasetoftran- sientstates

T c

representing thesituations when

D

iseither available orunavailable and a singleabsorbingstate

a

representingthesituationwhen

D

islost. Aswriting

T c

istedious,we

willsimplysaythat

T c

isasubsetof

[0..s+r] n ×[0..s−1] n ×[0..s] n ×[0..s+r−1] n ×[0..s+r−1] n

.

Theelementsof

T c

mustverifytheonstraintsmentionedabove.

Without loss of generality, we assume that

S( X ~ c (0)) ≥ s

. The innitesimal generator hasthefollowinganonialform

T c a T c

a

Q ~ c R ~ c

~ 0 0

where

R ~ c

isanon-zeroolumnvetorofsize

|T c |

,and

Q ~ c

is

|T c |

-by-

|T c |

matrix. Theelements

of

R ~ c

arethetransitionratesbetweenthetransientstates

w ~ c ∈ T c

andtheabsorbingstate

a

. The diagonalelementsof

Q ~ c

areeahthetotaltransition rateoutoftheorresponding transient state. The other elements of

Q ~ c

are the transition rates between eah pair of

transient states. The non-zero elements of

R ~ c

are, for

S(~y c ) ∈ [1..S (~x c )]

and

S(~z c ) = s − S(~y c )

,

r c (~x c ,~ 0,~ 0,~ 0,~ 0) =

n

X

l=1

x c,l µ l ,

for

S(~x c ) = s.

r c (~x c , ~y c , ~z c ,~ 0,~ 0) =

n

X

l=1

y c,l µ l · 1

l

{b(~x c , ~y c , ~z c ) = s} ,

for

S(~x c ) ∈ [1..s].

Letus proeedtothedenitionofthenon-zeroelementsof

Q ~ c

.

The ase when a peerleaves the system

Therearesevendierentsituationsinthisase. Intherstsituation,eitherthereovery

proesshasnotbeentriggered orithasbut nodownloadhasbeenompleted yet. Inboth

the seond and third situations, the download phase of the reovery proess is ongoing

and at least one downloadis ompleted. However, in the seond situation, the departing

peer does notaet the reoveryproess (either it wasnot involved in it orits fragment

downloadisompleted),unlikewhathappensinthethird situation. Inthethird situation,

afragmentdownloadis interrupteddueto thepeer's departure. Theentralauthoritywill

thenimmediatelystartdownloadingafragmentfromanotheravailablepeerthatisuniformly

(18)

seletedamongallavailablepeersnoturrentlyinvolvedinthereoveryproess. Thefourth

situation arises when a peer leaves the system at the end of the download phase. The

fth situation ours when an available fragment beomes unavailable during the upload

phase. Thesixthsituationourswhen apeer,to whih theentralauthorityisuploading

afragment,leavesthesystem. Thelastsituationarisesbeauseofadepartureofapeer to

whih the entral authority hasompletely uploadedareonstruted fragment. Note that

the uploaded fragmentwas notyet integrated in the available fragments. This is aused

bythe fat that the entral authority updates the databasereordingfragmentsloations

assoon as all uploadsterminate. Tooverom any departure orfailure that ours in the

ontextofoneofthelastthreesituations,theentralauthorityhasthentouploadagainthe

givenfragmentto anew peer. A newseletedpeer would bein phase

m

with probability

R(m)

for

m ∈ [1..n]

. Theelementsof

Q ~ c

orresponding to these seven situations are, for

l ∈ [1..n]

and

m ∈ [1..n]

,

q c ((~x c ,~ 0,~ 0,~ 0,~ 0), (~x c − ~e n l ,~ 0,~ 0,~ 0,~ 0)) = x c,l µ l ,

for

S(~x c ) ∈ [s + 1..s + r].

q c ((~x c , ~y c , ~z c ,~ 0,~ 0), (~x c − ~e n l , ~y c , ~z c ,~ 0,~ 0)) = [x c,l − y c,l ] + µ l ,

for

S(~x c ) ∈ [s..s + r − 1], S(~y c ) ∈ [1..s − 1], S(~z c ) = s − S(~y c );

or

S(~x c ) ∈ [2..s − 1], S (~y c ) ∈ [1..S(~x c ) − 1], S(~z c ) = s − S (~y c ).

q c ((~x c , ~y c , ~z c ,~ 0,~ 0), (~x c − ~e n l , ~y c − ~e n l + ~e n m , ~z c ,~ 0,~ 0)) = y c,l µ l [x c,m − y c,m − z c,m ] + P n

i=1 [x c,i − y c,i − z c,i ] + ,

for

S(~x c ) ∈ [s..s + r − 1], S(~y c ) ∈ [1..s − 1], S(~z c ) = s − S(~y c );

or

S(~x c ) ∈ [2..s − 1], S (~y c ) ∈ [1..S(~x c ) − 1], S(~z c ) = s − S (~y c ).

q c ((~x c ,~ 0, ~z c ,~ 0,~ 0), (~x c − ~e n l ,~ 0, ~z c ,~ 0,~ 0)) = x c,l µ l ,

for

S(~x c ) ∈ [1..s + r − 1], S(~z c ) = s.

q c ((~x c ,~ 0, ~z c , ~u c , ~v c ), (~x c − ~e n l ,~ 0, ~z c , ~u c + ~e n m , ~v c )) = x c,l µ l R(m),

for

S(~x c ) ∈ [1..s + r − 2], S(~z c ) = s, S(~u c ) ∈ [1..s + r − S(~x c ) − 1], S(~v c ) = s + r − S(~x c ) − S (~u c ).

q c ((~x c ,~ 0, ~z c , ~u c , ~v c ), (~x c ,~ 0, ~z c , ~u c − ~e n l + ~e n m , ~v c )) = u c,l µ l R(m),

for

S(~x c ) ∈ [1..s + r − 2], S(~z c ) = s, S(~u c ) ∈ [1..s + r − S(~x c ) − 1], S(~v c ) = s + r − S(~x c ) − S (~u c ), l 6= m.

q c ((~x c ,~ 0, ~z c , ~u c , ~v c ), (~x c ,~ 0, ~z c , ~u c + ~e n m , ~v c − ~e n l )) = v c,l µ l R(m),

for

S(~x c ) ∈ [1..s + r − 2], S(~z c ) = s, S(~u c ) ∈ [1..s + r − S(~x c ) − 1], S(~v c ) = s + r − S(~x c ) − S (~u c ).

The ase when a peerrejoins the system

Reall that thesystem keepstraeof onlythe latestknownloation ofeah fragment.

As suh, one a fragment is reonstruted, any other opy of it that reappears in the

systemdue to a peer reonnetion is simply ignored,as onlyone loation(the newest) of

thefragmentisreorded in thesystem. Similarly, ifafragmentisunavailable,thesystem

knowsofonlyonedisonnetedpeerthatstorestheunavailablefragment. Inthefollowing,

(19)

only relevantreonnetions are onsidered. Forinstane, when the reoveryproess is in

itsuploadphase,anypeerthatrejoinsthesystemdoesnotaetthesystemstatesineall

fragmentshavebeenreonstrutedandarebeinguploadedtotheirnewloations.

There arethreesituations wherereonnetions may berelevant. Intherst, eitherthe

reoveryproesshasnotbeentriggeredorithasbutnodownloadhasbeenompletedyet.In

boththeseondandthirdsituations,thedownloadphaseofthereoveryproessisongoing

andat leastonedownloadis ompleted. However,in thethird situation,there isonly one

missing fragment,so when thepeer storing the missing fragmentsrejoins the system,the

reoveryproessaborts.

Theelementsof

Q ~ c

orrespondingtothesethreesituationsare,for

l ∈ [1..n]

and

S(~z c ) = s − S(~y c )

q c ((~x c ,~ 0,~ 0,~ 0,~ 0), (~x c + ~e n l ,~ 0,~ 0,~ 0,~ 0)) = p l (s + r − S(~x c ))pλ,

for

S(~x c ) ∈ [s..s + r − 1].

q c ((~x c , ~y c , ~z c ,~ 0,~ 0), (~x c + ~e n l , ~y c , ~z c ,~ 0,~ 0)) = p l (s + r − S(~x c ))pλ,

for

S(~x c ) ∈ [s..s + r − 2], S(~y c ) ∈ [1..s − 1];

or

S(~x c ) ∈ [1..s − 1], S(~y c ) ∈ [1..S(~x c )].

q c ((~x c , ~y c , ~z c ,~ 0,~ 0), (~x c + ~e n l ,~ 0,~ 0,~ 0,~ 0)) = p l pλ,

for

S(~x c ) = s + r − 1, S(~y c ) ∈ [1..s − 1].

The ase when onedownload isompleted during the reovery proess

Whenareoveryproessisinitiated,thesystemstateveries

S(~x c ) ∈ [s..s + r − k]

and

~y c = ~z c = ~u c = ~v c = ~ 0

. Theentral authorityselets

s

peers out ofthe

S(~x c )

peers that

are onneted to the system and initiates afragment download from eah. Among the

s

peers that are seleted,

i l

outof

s

would be in phase

l

, for

l ∈ [1..n]

. Let

~i = (i 1 , . . . , i n )

.

We naturallyhave

0 ≤ i l ≤ x c,l

, for

l ∈ [1..n]

, and

S( ~i) = s

. This seletion ours with

probability

g( ~i, ~x c ) :=

Q n l=1

x c,l

i l

S(~ x c ) s

.

Theprobabilitythattherstdownloadtobeompleted outof

s

wasfromapeerinphase

l

isequalto

f l ( ~i) = i l /s

(reallthedenitionof

f

in Setion3). Similarly,whenthenumber

ofongoingdownloadsis

~ y c

, theprobabilitythatthe rstdownloadto beompleted outof

S(~y c )

was fromapeerin phase

l

isequalto

f l (~y c ) = y c,l /S(~y c )

.

The two possible transition rates in suh situations are, for

l ∈ [1..n]

,

m ∈ [1..n]

and

S(~z c ) = s − S(~y c )

,

q c ((~x c ,~ 0,~ 0,~ 0,~ 0), (~x c ,~i − ~e n l , ~e n l ,~ 0,~ 0)) = sα g( ~i, ~x c ) f l ( ~i),

for

S(~x c ) ∈ [s..s + r − k], i m ∈ [0..x c,m ], S( ~i) = s.

q c ((~x c , ~y c , ~z c ,~ 0,~ 0), (~x c , ~y c − ~e n l , ~z c + ~e n l ,~ 0,~ 0)) = S(~y c )α f l (~y c ),

for

S(~x c ) ∈ [s..s + r − 1], S(~y c ) ∈ [1..s − 1];

or

S(~x c ) ∈ [1..s − 1], S(~y c ) ∈ [1..S(~x c )].

The ase when oneupload isompletedduring the reovery proess

(20)

When the downloadphase is ompleted, the systemstate veries

S(~z c ) = s

and

~y c =

~u c = ~v c = ~ 0

. Theentralauthorityselets

s + r − S(~x c )

newpeersthatareonnetedtothe

systemandinitiatesa(reonstruted)fragmentuploadtoeah. Among thepeers thatare

seleted,

i l

outof

s + r − S(~x c )

wouldbeinphase

l

,for

l ∈ [1..n]

. Let

~i = (i 1 , . . . , i n )

. We

naturallyhave

0 ≤ i l ≤ s + r − S(~x c )

,for

l ∈ [1..n]

,and

S( ~i) = s + r − S(~x c )

. Thisseletion

ourswithprobability

h( ~i, ~x c ) :=

s + r − S (~x c ) i 1 , i 2 , . . . , i n

n Y

l=1

R(l) i l

wherethemultinomialoeienthasbeenused. For

l ∈ [1..n]

and

S(~z c ) = s

,weanwrite

q c ((~x c ,~ 0, ~z c ,~ 0,~ 0), (~x c ,~ 0, ~z c ,~i − ~e n l , ~e n l )) = S( ~i)β h( ~i, ~x c ) f l ( ~i),

for

S(~x c ) ∈ [0..s + r − 2], ~i ∈ [0..s + r − S(~x c )] n , S( ~i) = s + r − S(~x c ).

q c ((~x c ,~ 0, ~z c , ~u c , ~v c ), (~x c ,~ 0, ~z c , ~u c − ~e n l , ~v c + ~e n l )) = S(~u c )β f l (~u c ),

for

S(~x c ) ∈ [0..s + r − 2], S(~u c ) ∈ [2..s + r − S(~x c ) − 1], S(~v c ) = s + r − S(~x c ) − S(~u c ).

q c ((~x c ,~ 0, ~z c , ~e n l , ~v c ), (~x c + ~v c + ~e n l ,~ 0,~ 0,~ 0,~ 0)) = β,

for

S(~x c ) ∈ [0..s + r − 2], S(~v c ) = s + r − S(~x c ) − 1.

q c ((~x c ,~ 0, ~z c ,~ 0,~ 0), (~x c + ~e n l ,~ 0,~ 0,~ 0,~ 0)) = R(l)β,

for

S(~x c ) = s + r − 1.

Notethat

Q ~ c

isnotaninnitesimalgeneratorsineelementsinsomerowsdonotsumupto

0

. Thoserowsorrespondtothesystemstateswhereonly

s

distint fragmentsarepresent

inthesystem. Thediagonalelementsof

Q ~ c

are

q c (~ w c , ~ w c ) = −r c ( w ~ c ) − X

~

w c ∈T c −{ w ~ c }

q c ( w ~ c , ~ w c ),

for

w ~ c ∈ T c .

Forillustrationpurposes,wedepitinFig. 1someofthetransitionsoftheabsorbingCTMC

when

n = 2

,

s = 3

,

r = 1

,and

k = 1

.

5.1 Data Lifetime

Thissetionisdevotedtotheanalysisofthelifetimeof

D

. Itwillbeonvenienttointrodue

sets

E I := {(~x c ,~ 0,~ 0,~ 0,~ 0) : ~x c ∈ [0..s + r] n , S(~x c ) = I}

for

I ∈ [s..s + r].

The set

E I

onsists of all states of the proess

W ~ c

in whih the number of fragments of

D

urrentlyavailable is equalto

I

and thereoveryproess either hasnot been triggered

(for

I ∈ [s + r − k + 1..s + r]

) or it has but no download has been ompleted yet (for

I ∈ [s..s + r − k]

). Forany

I

,theardinalof

E I

is

I+n−1

n−1

(thinkofthepossibleseletions

of

n − 1

boxesin arowof

I + n − 1

boxes,soastodelimit

n

groupsofboxessumming up

to

I

).

(21)

Introdue

T c (E I ) := inf{t > 0 : W ~ c (t) = a| W ~ c (0) ∈ E I }

, the time until absorption in

state

a

orequivalentlythetimeuntil

D

islostgiventhattheinitialnumberoffragments

of

D

available in the systemis equal to

I

. In the following,

T c (E I )

will be referredto as

theonditional blok lifetime. Weareinterestedin theonditional probabilitydistribution

funtion,

P (T c (E I ) ≤ t)

,andtheonditional expetation,

E[T c (E I )]

, giventhat

W ~ c (0) ∈ E I

for

I ∈ [s..s + r]

.

Fromthe theory ofabsorbingMarkov hains, weanompute

P(T c ({ w ~ c }) ≤ t)

where

T c h ({ w ~ c })

is the time until absorption in state

a

given that the system initiates in state

~

w c ∈ T c

. Weknowthat(e.g. [19,Lemma2.2℄)

P(T c ({ w ~ c }) ≤ t) = 1 − ~e |T ind( c | w ~ c ) · exp t ~ Q c

· ~ 1 |T c | , t > 0, ~ w c ∈ T c

(2)

where

ind( w ~ c )

referstotheindexofstate

w ~ c

inthematrix

Q ~ c

. Denitionsofvetors

~e j i

and

~ 1 j

were givenat theend ofSetion 3. Observethat theterm

~e |T ind( c | w ~ c ) · exp t ~ Q c

· ~ 1 |T c |

in

theright-handsideof (2)isnothingbutthesummation ofall

|T c |

elementsinrow

ind(~ w c )

ofmatrix

exp t ~ Q c

.

(2, 0), (0, 0), (2, 1), (0, 0), (0, 0) (2, 1), (0, 0), (2, 1), (0, 0), (0, 0)

(2, 1), (1, 0), (1, 1), (0, 0), (0, 0)

(2, 1), (2, 0), (0, 1), (0, 0), (0, 0)

(2, 1), (0, 0), (0, 0), (0, 0), (0, 0) (2, 0), (1, 0), (1, 1), (0, 0), (0, 0)

(2, 0), (2, 0), (0, 1), (0, 0), (0, 0) (2, 0), (0, 0), (2, 1), (1, 0), (0, 1)

(1, 0), (0, 0), (2, 1), (0, 0), (0, 0) (1, 0), (0, 0), (2, 1), (2, 0), (0, 1) (1, 0), (0, 0), (2, 1), (1, 0), (1, 1)

(1, 0), (1, 0), (1, 1), (0, 0), (0, 0)

(1, 1), (1, 0), (1, 1), (0, 0), (0, 0)

(3, 1), (0, 0), (0, 0), (0, 0), (0, 0)

(2, 2), (0, 0), (0, 0), (0, 0), (0, 0)

D

isunavailable

D

isavailable

µ 2

2µ 1

µ 1

µ 1

β

D

islost

β

p 2 λ p 2µ 1 p 1

3α · 1/3 µ 1

3 p 2 λp 2µ 1

2α 3p 1 λp

2 p 2 λp 2p 1 λp α

µ 1

µ 2

2µ 1 + µ 2

2 p 2 λp µ 2

µ 11 + µ 2

1

3µ 1

1 p p λ 2α

α

µ 1

2 p

p λ

p 1 λ p

α R(2)β R(1)β 3β · 3R(1) 2 R(2) · 1/3 2β · 2R(1)R(2) · 1/2

a

Figure1: SometransitionsoftheMarkovhain

W ~ c

when

n = 2

,

s = 3

,

r = 1

,and

k = 1

.

(22)

Let

π ~ x c

denotetheprobabilitythatthesystemstartsinstate

w ~ c = (~x c ,~ 0,~ 0,~ 0,~ 0) ∈ E I

at

time

0

giventhat

W ~ c (0) ∈ E I

. Weanwrite

π ~ x c := P

W ~ c (0) = w ~ c ∈ E I | W ~ c (0) ∈ E I

=

I x c,1 , . . . , x c,n

n Y

l=1

R(l) x c,l .

(3)

Clearly

P

~

w c ∈E I π ~ x c = 1

for

I ∈ [s..s + r]

. Using (2) and (3) and the total probability theoremyields,for

I ∈ [s..s + r]

,

P (T c (E I ) ≤ t) = X

~ w c ∈E I

π ~ x c P (T c ({ w ~ c }) ≤ t)

= 1 − X

~ w c ∈E I

π ~ x c ~e |T ind( c | w ~ c ) · exp t ~ Q c

· ~ 1 |T c | , t > 0.

(4)

Weknowfrom[19,p. 46℄thattheexpetedtimeuntilabsorptiongiventhatthe

W ~ c (0) =

~

w c ∈ T c

anbewrittenas

E [T c ({ w ~ c )}] = −~e |T ind( w ~ c )

c | ·

Q ~ c

−1

· ~ 1 |T c | , w ~ c ∈ T c ,

wheretheexisteneof

Q ~ c

−1

isaonsequeneofthefatthatallstatesin

T c

aretransient

[19,p. 45℄. Theonditionalexpetationof

T c (E I )

isthen(reallthattheelementsof

E I

are

oftheform

(~x c ,~ 0,~ 0,~ 0,~ 0)

)

E [T c (E I )] = X

~ w c ∈E I

π ~ x c E [T c ({ w ~ c })]

= − X

~ w c ∈E I

π ~ x c ~e |T ind( c | w ~ c ) · Q ~ c

−1

· ~ 1 |T c | ,

for

I ∈ [s..s + r].

(5)

5.2 Data Availability

In this setion we introdue dierent metristo quantify the availability of

D

. But rst,

we will study the time during whih

J

fragmentsof

D

are available in the system given

thattherewereinitially

I

fragments. Toformalizethismeasure,weintroduethefollowing

subsetsof

T c

,for

J ∈ [0..s + r]

,

F J := {(~x c , ~y c , ~z c , ~u c , ~v c ) ∈ T c : S(~x c ) = J }

The set

F J

onsists of all states of proess

W ~ c

in whih the number of fragments of

D

urrentlyavailableisequalto

J

,regardlessofthestateofthereoveryproess. Thesubsets

F J

formapartitionof

T c

. Wemaydenenow

T c (E I , F J ) :=

Z T c (E I )

0

1

l

n

W ~ c (t) ∈ F J | W ~ c (0) ∈ E I

o dt.

Références

Documents relatifs

The ASSIC project focused on various aspects of the statistics of the O-A differences (biases, standard deviations, and MIPAS evaluation), and on various regions of the

Here we process space-geodetic data for 37 permanent GNSS stations distributed along Nubia, Somalia and Victoria to (1) compute the motion of the three tectonic units in the

13 Sezione INFN di Bari, Bari, Italy 14 Sezione INFN di Bologna, Bologna, Italy 15 Sezione INFN di Cagliari, Cagliari, Italy 16 Sezione INFN di Ferrara, Ferrara, Italy 17 Sezione

The main result of the present study is that a simple widespread and inexpensive computer assistance based on common sense can improve nutrition support prescription, with a

Testing was not without its challenges. Initial efforts focused solely on validation of specific functional blocks of the circuit and rating validation. It was important

Under the assumption of large but finite storages, the flow of material is approximated by a diffusion process with reflecting boundary conditions, independent of the

Given that we now have the required information about the storage location of all replicas of the data segment to be erased, we have to design data erasure mechanism.. Towards this,

c) Cache Manager: periodically computes the ideal cache configuration (i.e., what objects should be cached and how many chunks for each) based on object popularity statis- tics from