Finding Good Partners in Availability-aware P2P Networks

(1)

HAL Id: inria-00352529

https://hal.inria.fr/inria-00352529

Submitted on 13 Jan 2009

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of

sci-entific research documents, whether they are

pub-lished or not. The documents may come from

teaching and research institutions in France or

abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est

destinée au dépôt et à la diffusion de documents

scientifiques de niveau recherche, publiés ou non,

émanant des établissements d’enseignement et de

recherche français ou étrangers, des laboratoires

publics ou privés.

Finding Good Partners in Availability-aware P2P

Networks

Stevens Le Blond, Fabrice Le Fessant, Erwan Le Merrer

To cite this version:

Stevens Le Blond, Fabrice Le Fessant, Erwan Le Merrer. Finding Good Partners in Availability-aware

P2P Networks. [Research Report] RR-6795, INRIA. 2009, pp.17. �inria-00352529�

(2)

a p p o r t

d e r e c h e r c h e

IS

S

N

0

2

4

9 -6

3

9

9 IS

R

N

IN

R

IA

/R

R

--6

7

9

5 --F

R

+

E

N

G

Thème COM

INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE

Finding Good Partners in Availability-aware P2P

Networks

Stevens Le Blond — Fabrice Le Fessant — Erwan Le Merrer

N° 6795

(3)

(4)

Centre de recherche INRIA Saclay – Île-de-France

Parc Orsay Université

4, rue Jacques Monod, 91893 ORSAY Cedex

Téléphone : +33 1 72 92 59 00

Networks Stevens Le Blond

∗

, Fabri eLe Fessant

†

,Erwan LeMerrer

‡

ThèmeCOMSystèmes ommuni ants Équipes-ProjetsAsapet Planète

Rapportdere her he n°6795January200914pages

Abstra t:

Inthispaper,westudytheproblemofndingpeersmat hingagiven avail-abilitypatternin apeer-to-peer(P2P) system. Werstprovetheexisten eof su hpatternsinanewtra eoftheeDonkeynetwork, ontainingthesessionsof 14Mpeersover27days. Wealsoshowthat,usingonly7daysofhistory,a sim-ple predi tor an sele t predi table peers and su essfully predi ttheir online periods for thenextweek. Then, motivated bypra ti alexamples, wespe ify twoformalproblemsofavailabilitymat hingthatariseinrealappli ations: dis- onne tion mat hing, where peers look for partners expe ted to dis onne t at thesametime, andpresen e mat hing, wherepeerslook forpartnersexpe ted to be online simultaneously in the future. As a s alable and inexpensive so-lution, we propose to use epidemi proto ols for topologymanagement, su h asT-Man;weprovide orrespondingmetri sfor both mat hingproblems. Fi-nally, weevaluated this solutionby simulatingtwo P2Pappli ations overour real tra e: task s heduling and le storage. Simulationsshowed that our sim-ple solution provided good partners fast enough to mat h the needs of both appli ations, andthat onsequently,these appli ationsperformedase iently at amu h lower ost. Webelieve that this work will beuseful formany P2P appli ationsforwhi hithasbeenshownthat hoosinggoodpartners,basedon theiravailability,drasti allyimprovestheire ien y.

Key-words: availability,peer-to-peer,mat hing,epidemi ,proto ols

∗

INRIASophiaAntipolis

†

INRIASa lay

‡

(5)

réseau pair-à-pair en fon tion de sa disponibilité

Résumé:

Dans e papier, nousétudions leproblématique detrouverdes partenaires suivantun ritèrededisponibilitédansunréseaupair-à-pair.Nous ommençons parmontrer l'existen e derégularitésde disponibilité dans une nouvelletra e duréseaueDonkey, ontenantlessessions de14Mde pairssur27 jours. Nous montrons aussi que, en utilisant 7 jours d'historique, une prédi teur simple peut séle tionnerdes pairsprévisibles et prédire ave su ès leurspériodesde disponibilitésur lasemainesuivante. Ensuite, nousspé ionsdeux problèmes formelsde séle tionen fon tionde ladisponibilité, qui seprésentent dansdes appli ationsréelles: laséle tionpourladé onnexion,quire her helespairsqui sedé onne terontprobablementenmêmetemps,etlaséle tionpourlaprésen e, qui re her helespairs quiserontprobablement présentsen même tempsdans lefutur. Commesolution peu oûteuse et passantà l'é helle, nous proposons d'utiliser desproto olesépidémiques degestion detopologie, telsque T-Man; nousfournissonslesmétriques orrespondantànosdeuxproblèmes. Finalement, nousavonsévalué ette solutionen simulantdeux appli ationspair-à-pair sur notretra eréelle. Lessimulationsontmontréquenotresimplesolutionfournit debonspartenairessusammentvitepourlesbesoinsdesdeuxappli ations,et qu'en onséquen e, es appli ations fon tionnent aussi e a ement àun oût bienmoindre. Nouspensonsque etravailserautilepourtouteslesappli ations pair-à-pairpourlesquelsil aétémontré quele hoixdebonspartenairespeut augmenter onsidérablementlesperforman es.

(6)

0 100000

200000

300000

400000

500000

600000

0

5

10

15

20

25

30 Number of Peers

Days

Global System Availability

Online Peers

(a)SystemAvailability

1000

10000

100000

0

1

2

3

4

5

6

7

8

9 10 11 12 13 14

Number of peers (log scale)

Best pattern size (days)

Distribution of the Best Sizes of Patterns

(b) Auto-Correlation on Ses-sions

0

0.2

0.4

0.6

0.8

1

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

1

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1 Prediction Success

Availability

CDF of Peers (normalized)

Predictability of selected peers

prediction (predictable peers)

prediction (randomized bitmaps)

Availability

( ) Predi tion versusA vailabil-ity

Figure1: (a): Diurnalpatternsareobviouslyvisibleontheglobalsystem avail-ability. (b) The auto- orrelation on the sessions shows that the best pattern size is oneday, followedbyoneweek. ( ) Whereas availabilitydeterminesthe predi tion with random bitmaps, daily patterns improve the predi tion with real bitmaps (e.g. for 60% of peers (x=0.4), 50% of predi tions (y=0.5) are su essful,but only25%withrandombitmaps).

1 Introdu tion

Churnisoneofthemost riti al hara teristi sofpeer-to-peer(P2P)networks, as the permanent ow of peer onne tions and dis onne tions an seriously hamperthee ien y of appli ations[9℄. Fortunately, it hasbeenshownthat, formanypeers,theseeventsgloballyobeysomeavailabilitypatterns([18,19,2℄), andso, anbepredi tedfromtheuptimehistoryofthosepeers[15℄.

Totakeadvantageofthese predi tions,appli ationsneed tobeableto dy-nami allyndgoodpartnersforpeers,a ordingtotheseavailabilitypatterns, even in large-s aleunstru tured networks. The intrinsi onstitution of those networksmakespure randommat hingte hniquesto betime-ine ientfa ing hurn.

Inthis paper,we study ageneri te hniqueto dis oversu h partners, and applyitfortwoparti ularmat hing problems: dis onne tion mat hing, where peerslookfor partnersexpe ted to dis onne tat thesametime, andpresen e mat hing, where peers look for partners expe ted to beonline simultaneously in thefuture. These problemsarespe iedin Se tion 3. Wethenexplainthat T-Man[12℄,astandardepidemi algorithmfortopologymanagement,isagood andidatetosolvetheseproblems. However,inorderto onvergetothedesired state or topology (here mat hed peers), T-Man needs an a urate metri to ompute the distan e between peers. In Se tion 4, we des ribe how T-Man worksandproposeaparti ularmetri forea hof ourmat hingproblems.

Toevaluatethee ien yofourproposal,wesimulateanappli ationforea h mat hingproblem: anappli ationoftasks heduling,wheretasksofmultiple

(7)

motejobsarestartedbyallthepeersin thenetwork(dis onne tionmat hing), andanappli ationofP2Ple-system,wherepeersrepli atelesonotherpeers tohavethemhighlyavailable(presen emat hing). Torunoursimulationsona realisti workload,we olle tedanewtra eofpeeravailabilityontheeDonkey le-sharingnetwork. Withthe onne tionsanddis onne tionof14Mpeersover 27days,thistra eisthelargestavailableworkload, on erningpeers' availabil-ity. InSe tion2,weshowthat peersinthis tra eexhibit availabilitypatterns, and,usingasimple7-daypredi tor,thatitispossibletosele tpredi tablepeers andsu essfullypredi ttheirbehavioroverthefollowingweek.

Oursimulationresults,inSe tion5,showthatourT-Manbasedsolutionis ableto providegood partners to all peers, forbothappli ations. Using avail-abilitypatterns,bothappli ationsareabletokeepthesameperforman e,while onsuming 30% less resour es, ompared to a random sele tion of partners. Moreover, T-Man is s alable and inexpensive, making the solutionusable for anyappli ationandnetworksize.

We believe that manyP2P systemsand appli ations anbenet from this work,asalot of availability-awareappli ations havebeen proposed in the lit-erature[3,8,17, 5,22℄. Close to ourwork,[9℄ showsthat strategiesbased on the longest urrent uptime are more e ient than uptime-agnosti strategies for repli a pla ement; [15℄ introdu es sophisti ated availabilitypredi tors and showsthat they anbeverysu essful. However,tothebestofourknowledge, this paper is the rst to deal with the problem of nding the best partners a ordingto availabilitypatternsin alarge-s alenetwork. Moreover,previous resultsareoften omputedonsyntheti tra esorsmalltra esofP2Pnetworks.

2 Availability Patterns in eDonkey

Inthisse tion,wedes ribethe hara teristi softhetra ewe olle tedforthe needsofthis study. With afew thousandpeersonlineat thesametime, most other tra es olle tedonP2P systems[18, 10, 2℄la k massive onne tionand dis onne tiontrends,forthestudyofavailabilitypatternsonalarges ale.

2.1 The eDonkey Tra e

In 2007, we olle ted the onne tion and dis onne tion events from the logs of one of the main eDonkey servers in Europe. Our tra e, available on our website [1℄, ontains more than 200 millions of onne tions by more than 14 millions of peers, over a period of 27 days. To analyse this tra e, we rst ltereduseless onne tions(shorterthan 10minutes) andsuspi ious ones(too repetitive,simultaneousorwith hangingidentiers),leadingtoalteredtra e of12millionpeers.

Thenumberofpeersonlineatthesametimeinthelteredtra eisusually morethan 300,000,asshown byFig. 1(a). Global diurnalpatterns of around 100,000 usersare also learlyvisible: asshownby previousstudies [11℄, most eDonkeyusersarelo atedinEurope,andso,theirdailyoineperiodsareonly partially ompensatedby onne tionsfromother ontinents.

Foreverypeer in theltered tra e, the auto- orrelationon its availability periodswas omputedon14days,withastepofone minute. Foragivenpeer, the period for whi h the auto- orrelation is maximum gives its best pattern

(8)

size. ThenumberofpeerswithagivenbestpatternsizeisplottedonFig.1(b), andshows,as ouldbeexpe ted,that thebest patternsizeisaday,andmu h further,aweek.

2.2 Filtering and Predi tion

Weimplementedastraightforwardpredi tor,thatusesa7-daywindowof avail-ability historyto omputethe daily pattern of apeer: for ea h interval of 10 minutesin aday, itsvalue is the numberof daysin theweek where the peer wasavailableduring thatfullinterval.

Thispredi torhastwopurposes: (1)Itshouldhelptheappli ationtode ide whi hpeersarepredi table,and thus, anbenetfromanimprovedqualityof servi e. Thisgivesanin entiveto peersto parti ipateregularlytothesystem; (2) itshouldhelp theappli ation topredi tfuture onne tionsand dis onne -tionsofthesele tedpeers. Tosele tpredi tablepeers,thepredi tor omputes, forea hpeer, themaximumandthemean ovarian eofthepeerdailypattern. Forthispaper,we omputedaset, alledpredi tableset, ontaining19,600peers whosemaximumisatleast5(predi tionthreshold),andwhosemean ovarian e isgreaterthan28( learbehavior). Wealsoremovedthepeerswhose availabil-ity was smaller than 0.1 (useless peers) or greater then 0.9 (they would bias positivelyourexperiments).

For everypeer in the predi table set, the predi tor predi ts that the peer willbeonlineinagivenintervalifthepeer'sdailypatternvalueforthatinterval isat least5,and otherwisepredi tsnothing (weneverpredi tthat apeerwill beoine). Theratioofsu essfulpredi tionsafteraweekforthefullfollowing weekisplottedonFig.1( ). Itshowsthatpredi tions annotbeonlyexplained bya identalavailability,andprovethepresen eofavailabilitypatternsinthe tra e.

Wepurposely hoseaverysimplepredi tor,asweareinterestedinshowing that patterns of presen eare visibleand anbenetappli ations, even witha worst- aseapproa h. Therefore,weexpe tthatbetterresultswouldbea hieved usingmoresophisti atedpredi tors,su hasdes ribedin[15℄,andforanoptimal patternsize ofonedayinsteadofaweek.

3 Problem Spe i ation

Thisse tionpresentstwoavailabilitymat hingproblems,dis onne tion mat h-ing and presen e mat hing. Ea h problem is abstra ted from the needs of a pra ti al P2P appli ation that we des ribe afterward. But rst, we start by introdu ingoursystemmodel.

3.1 System and Network Model

We assumea fully- onne tedasyn hronous P2P network of

N

nodes, with

N

usuallyrangingfromthousandstomillionsofnodes. Weassumethatthereisa onstantbound

n

c

onthenumberofsimultaneous onne tionsthat apeer an engage in,typi allymu hsmallerthan

N

. Whenpeersleavethesystem,they dis onne tsilently. However,weassumethatdis onne tionsare dete tedafter atime

∆

disc

,forexamplethirtyse ondswithTCPkeep-alive.

(9)

Forea hpeer

x

,weassumetheexisten eofanavailabilitypredi tion

P r

x

_(t)

, startingatthe urrenttime

t

andforaperiod

T

inthefuture,su hthat

P r

x

(t)

is a set of non-overlapping intervals during whi h

x

is expe ted to be online. Sin ethese predi tionsare basedonprevious measuresof availabilityforpeer

x

,weassumethatsu hmeasuresarereliable,eveninthepresen eofmali ious peers[16, 14℄.

We note

S

P r

x

_(t)

the set dened by the unionof theintervals of

P r

x

_(t)

, and

||S||

thesizeofaset

S

.

3.2 The Problem of Dis onne tion Mat hing

Intuitively, the problem of Dis onne tion Mat hing is, for a peer online at a giventime,tondasetofotheronlinepeerswhoareexpe tedtodis onne tat thesametime.

Formally, for a peer

x

online at time

t

, an online peer

y

is a better mat h forDis onne tion Mat hingthananonlinepeer

z

if

|t

x

_{− t}

y

_{| < |t}

x

_{− t}

z

_|

,where

[t, t

x

_{[∈ P r}

x

_(t)

,

[t, t

y

_{[∈ P r}

y

_(t)

and

[t, t

z

_{[∈ P r}

z

_(t)

. TheproblemofDis onne tion Mat hing

DM (n)

isto dis overthe

n

best mat hesofonlinepeersatanytime.

Theproblem ofdis onne tion mat hingarises inappli ations whereapeer tries to nd partners with whom it wants to ollaborate until the end of its session.

An exampleof su h an appli ation is task s heduling in P2Pnetworks. In Zorilla[7℄,apeer ansubmita omputationtaskof

n

jobstothesystem. Insu h a ase, thepeertries to lo ate

n

online peers (withexpanding ringsear h)to be omepartnersforthetask,andexe utesthe

n

jobsonthesepartners. When the omputation is over, the peer olle ts the

n

results from the

n

partners. With dis onne tionmat hing, su hasystembe omesmu h moree ient: by hoosing partners who are likely to dis onne t at the sametime asthe peer, thesystemin reasestheprobabilitythat(1)ifthepeerdoesnotdis onne ttoo early, its partners will havetime to nish exe uting their jobs before dis on-ne tingandhewillbeableto olle ttheresults,and(2)ifthepeerdis onne ts beforetheendofthe omputation,partnerswillnotwasteunne essaryresour es astheyarealsolikelytodis onne tat thesametime.

3.3 The Problem of Presen e Mat hing

Intuitively, theproblem of Presen e Mat hing is, for a peer online at agiven time, to nd aset of other online peers who are expe ted to be onne ted at thesametimeinthefuture.

Formally, fora peer

x

onlineat time

t

, an online peer

y

is abetter mat h forUnfair Presen eMat hingthananonlinepeer

z

if:

||

[

P r

z

(t) ∩

[

P r

x

(t)|| < ||

[

P r

y

(t) ∩

[

P r

x

(t)||

(10)

Thisproblemisqualiedasunfair,sin epeerswhoarealwaysonlineappear tobebestmat hesforallotherpeersinthesystem,whereasonlyother always-onpeersarebestmat hesforthem. Sin esomefairnessiswantedinthesystem, oineperiodsshouldalsobe onsidered. Consequently,

y

isabettermat hthan

z

forPresen eMat hingif:

||

S

P r

z

_{(t) ∩}

S

_{P r}

x

_(t)||

||

S

P r

z

_{(t) ∪}

S

_{P r}

x

_(t)|

<

||

S

P r

y

_{(t) ∩}

S

_{P r}

x

_(t)||

||

S

P r

y

_{(t) ∪}

S

_{P r}

x

_(t)||

TheproblemofPresen eMat hing

P M (n)

istodis overthe

n

bestmat hes ofonlinepeersatanytime.

Theproblemofpresen emat hingarisesinappli ationswhereapeerwants to ndpartnersthat willbeavailableat thesametimein othersessions. This istypi allythe asewhenhugeamountofdatahavetobetransferred,andthat partnerswill haveto ommuni atealottousethatdata.

An example of su h an appli ation is storage of les in P2P networks [4℄. Forexample, in Pasti he [6℄, ea h peer in the systemhas to nd other peers to storeitsles. Sin eles anonlybeused whenthepeer isonline, thebest partners fora peer (atequivalentstability) arethe peers whoare expe tedto beonlinewhen thepeeritselfisonline.

Moreover,inaP2Pba kup system[8℄,peersusuallyrepla etherepli athat annotbe onne tedforagivenperiod,tomaintainagivenlevelofdata redun-dan y. Usingpresen emat hing,su happli ations anin reasetheprobability of beingableto onne tto alltheirpartners, thus redu ingtheirmaintenan e ost.

4 Uptime Mat hing with Epidemi Proto ols

Wethinkthatepidemi proto ols[20,21,13℄aregoodapproximatesolutionsfor these mat hing problems. Here,wepresentoneof these proto ols, T-Man[12℄ and, sin e su h proto ols rely heavily on appropriate metri s, we propose a metri forea hmat hingproblem.

4.1 Distributed Mat hing with T-Man

T-Man is awell-knownepidemi proto ol,usuallyused to asso iate ea h peer in thenetwork with aset ofgoodpartners, given ametri (distan e fun tion) betweenpeers. Eveninlarge-s alenetworks,T-Man onvergesfast,andprovides agoodapproximationoftheoptimalsolutioninafewrounds,whereea hround ostsonlyfourmessagesin averageperpeer.

InT-Man,ea hpeermaintainstwosmallsets,itsrandomviewanditsmetri view, whi h are, respe tively, some random neighbors, and the urrent best andidatesforpartnership,a ordingtothemetri inuse. Duringea hround, every peer updates its views: with one random peer in its random view, it

(11)

merges the two random views, and keeps the most re ently seen peers in its randomview;withthebestpeerin itsmetri view,itmergesalltheviews,and keepsonlythebestpeers,a ordingto themetri ,initsmetri view.

This double s heme guarantees a permanent shue of the random views, whileensuringfast onvergen eofthemetri viewstowardstheoptimalsolution. Consequently, the hoi eofagood metri isveryimportant. Wepropose su h metri sforthetwoavailabilitymat hingproblemsin thenextpart.

4.2 Metri s for AvailabilityMat hing

To ompute e iently the distan e between peers in T-Man, the predi tion

P r

x

_(t)

is approximatedbyabitmap ofsize

m

,

pred

x

,where entry

pred

x

[i]

is 1 if

[i × T /m, (i + 1) × T /m[

is in ludedin anintervalof

P r

x

_(t)

for

0 ≤ i < m

. 4.2.1 Dis onne tion Mat hing

The metri omputes the time between the dis onne tions of two peers. In aseofequality,thePM-distan eof4.2.2isusedtopreferpeerswiththesame availabilityperiods:

DM-distan e

(x, y) = |I

x

_{− I}

y

_|+

PM-distan e

(x, y)

where

I

x

_{= min{0 ≤ i < m|pred}

x

_{[i] = 1 ∧ pred}

x

_{[i + 1] = 0}}

4.2.2 Presen e Mat hing

The metri rst omputes the ratio of o-availability (time where bothpeers were simultaneouslyonline)ontotalavailability(time whereat least onepeer was online). Sin ethe distan e should be loseto 0when peers are lose, we thenreversethevalueon[0,1℄:

PM-distan e

(x, y) = 1 −

P

0≤i<m

min(pred

x

[i],pred

y

[i])

P

0 ≤i<m

max(pred

x

[i],pred

y

[i])

Notethat,whilethePM-distan evalueisin [0,1℄,theDM-distan evalueis in[0,m℄.

5 Simulations and Results

We evaluated the performan e of T-Man plus the metri s of Se tion 4.2, by simulatingthetwoappli ationsofSe tion3ontheeDonkeytra eofSe tion 2.

5.1 General Simulation Setup

Asimulator wasdevelopedfrom s rat hto run thesimulationsonaLinux3.2 GHzXeon omputer,forthe19,600peersofthepredi tablesetfromSe tion2.2. Theirbehaviorson14-dayswereextra tedfrom theeDonkeytra e: therst7 dayswereusedto omputeapredi tion,andthat predi tion,withoutupdates, wasusedtoexe utetheproto olonthefollowingsevendays. Duringoneround ofthe simulator, all onlinepeers in random orderevaluate oneT-Man round, orrespondingtooneminuteofthetra e. Asexplainedlater,bothappli ations weredelayedbyaperiodof10minutesafterapeerwould omeonlinetoallow

(12)

0 5000

10000

15000

20000

25000

30000

UptimeRandom

Number of Tasks

Impact of Disconnection Matching for P2P scheduling

Completed Tasks

Aborted Tasks

Week Mean

Day +7

Day + 1

Figure 2: A task isa setof threeremotejobs of4 hoursstarted by everypeer, ten minutesafter omingonline. Ataskis su essfulwhenthe peeranditspartnersare still onlineafter 4hours to olle t the results. Using availability predi tions, apeer ande idenottostartataskexpe tedtoabort,leadingtofewerabortedtasks. Using dis onne tionmat hing,it anndgoodpartnersandit anstill ompletealmostas manytasksasthemu hmoreexpensiverandomstrategy.

(13)

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1 Uptime

Random

Uptime

Random

Uptime

Random

Co-Availability

Impact of Presence Matching for P2P File Storage

1 repl

2 repl

3 repl

4 repl

5 repl

6 repl

7 repl

8 repl

9 repl

Week

Day + 7

Day + 1

Figure3: 10minutesafter omingonlineforthersttime,ea hpeer reatesagiven numberofrepli aforitsdata.Co-availabilityisdenedbythesimultaneouspresen e ofthepeerandatleastonerepli a. Usingpresen emat hing,fewerrepli asareneeded toa hievebetter resultsthanusingarandom hoi eof partners. Eventhe 7thday, usinga6-day oldpredi tion, the systemstill performsmu hmore e iently,almost ompensatingthegeneralloss inavailability.

(14)

T-Mantoprovideausefulmetri view. The omputationofa ompleterundid notex eedtwohoursand 6GBofmemoryfootprint.

5.2 Evaluation of Dis onne tion Mat hing

The task s heduling appli ation of Se tion 3.2 wassimulated to evaluate the performan eof T-Manand theDM-distan e metri . Inthesimulations,every peerstartedataskafter10minutesonline: ataskranthree jobsof4hourson remotepartners,andwas ompletedifthepeeranditspartnerswerestillonline after 4 hoursto olle t theresults. A peer ouldde ide notto start atask if the predi tion of its own availability fore ast that he would go oine before ompletion of the task. The number of aborted/ ompleted tasks is plotted on Fig. 2, for the rst day, the seventh day and the whole week for either dis onne tion mat hing(uptime) orrandom hoi e(randompeer hosenin T-Manrandomview).

Predi tionof availabilityde reasedalot thenumberof abortedtasks,and, with fewer started tasks, dis onne tion mat hing ompleted almost the same numberoftasksasrandommat hing,evenoverthefullweek,whenthe predi -tionwassupposedtobelessa urate(seeauto- orrelationinSe tion 2.1).

5.3 Evaluation of Presen e Mat hing

The P2P le storage of Se tion 3.3 was also simulated with T-Man and the PM-distan emetri . Everypeerrepli ateditsdataonitspartners,tenminutes after omingonlinefor thersttime, inthe hopeof usingitsremote datathe next time itwould be online. The o-availability of thepeer and at leastone repli aisplotted onFig.3,fordierentnumberofrepli as.

Usingpresen emat hing,fewerrepli aswereneededtoa hievebetterresults thanusingarandom hoi eofpartners. Asintheprevioussimulations,week-old predi tionsperformedstillbetterthanrandom hoi e.

6 Dis ussion and Con lusion

Inthispaper,weshowedthatepidemi proto olsfortopologymanagement an be e ient to nd good partners in availability-aware networks. Simulations provedthat,usingoneoftheseproto olsandappropriatemetri s,su h appli a-tions anbelessexpensiveandstillperformwithanequivalentorbetterquality of servi e. Weusedaworst- ases enario: asimplepredi tor,and atra e ol-le ted from a highly volatile le-sharing network, where only a small subset of peers provide predi table behaviors. Consequently, we expe t that a real appli ationwouldtakeevenmorebenetfromavailabilitymat hingproto ols. Inparti ular,untilthiswork,availability-awareappli ationswerelimitedto usingpredi tionsoravailabilityinformationtobetter hooseamongalimitedset ofneighbors. Thisworkopensthedoorto newavailability-awareappli ations, wherebestpartnersare hosenamongallavailablepeersinthenetwork. Itisa useful omplementtotheworkdoneonmeasuringavailability[16,14℄andusing these measurestopredi tfutureavailability[15℄.

(15)

Referen es

[1℄ Tra e. http://fabri e.lefessant.net/tra es/edonkey2.

[2℄ Ranjita Bhagwan, Stefan Savage, and Georey Voelker. Understanding availability. InIPTPS,Int'lWork. onPeer-to-Peer Systems,2003.

[3℄ RanjitaBhagwan,KiranTati,Yu-ChungCheng, StefanSavage,and Geof-frey M. Voelker. Total re all: system support for automated availability management. In NSDI, Symp. on Networked Systems Design and Imple-mentation,2004.

[4℄ Jean-Mi helBus a,FabioPi oni,andPierreSens.Pastis: Ahighly-s alable multi-userpeer-to-peerlesystem. InEuro-Par,2005.

[5℄ Byung-Gon Chun, Frank Dabek, Andreas Haeberlen, Emil Sit, Hakim Weatherspoon, M. Frans Kaashoek, John Kubiatowi z, and Robert Mor-ris. E ientrepli a maintenan efordistributed storagesystems. InNSDI, Symp.on NetworkedSystemsDesignandImplementation,2006.

[6℄ Landon P. Cox, Christopher D. Murray, and Brian D. Noble. Pasti he: Making ba kup heap and easy. In OSDI, Symp. on Operating Systems DesignandImplementation, 2002.

[7℄ Niels Drost, Rob V. van Nieuwpoort, and Henri E. Bal. Simple lo ality-aware o-allo ationin peer-to-peer super omputing. InGP2P, Int'lWork. onGlobal andPeer-2-Peer Computing,2006.

[8℄ AlessandroDuminu o,ErnstWBiersa k,andTaoukEnNajjary.Proa tive repli ationindistributedstoragesystemsusingma hineavailability estima-tion. In CoNEXT, Int'l Conf. on emerging Networking EXperiments and Te hnologies,2007.

[9℄ P. Brighten Godfrey, S ott Shenker, and Ion Stoi a. Minimizing hurn in distributed systems. In SIGCOMM, Conf. on Appli ations, Te hnologies, Ar hite tures, andProto ols for ComputerCommuni ations, 2006.

[10℄ SaikatGuha,NeilDaswani,andRaviJain. An ExperimentalStudyofthe SkypePeer-to-Peer VoIP System. InIPTPS, Int'l Work. on Peer-to-Peer Systems,2006.

[11℄ SidathB.Handurukande,Anne-MarieKermarre ,Fabri eLeFessant, Lau-rentMassoulié,and SimonPatarin. Peersharingbehaviourin theedonkey network,andimpli ations forthedesignof server-lessle sharingsystems. InEuroSys, 2006.

[12℄ Mark Jelasity and Ozalp Babaoglu. T-man: Gossip-based overlay topol-ogy management. In ESOA, Intl'l Work. on Engineering Self-Organising Systems,2005.

[13℄ Mar -Olivier Killijian, Ludovi Courtès, and David Powell. A Surveyof CooperativeBa kup Me hanisms. Te hni alReport06472,LAAS,2006.

(16)

[14℄ Fabri e Le Fessant, Cigdem Sengul, and Anne-Marie Kermarre . Pa e-maker: Fighting Selshness in Availability-Aware Large-S ale Networks. Te hni alReportRR-6594,INRIA, 2008.

[15℄ JamesW. Mi kensandBrianD. Noble. Exploiting availabilitypredi tion in distributed systems. InNSDI, Symp. onNetworkedSystemsDesign and Implementation,2006.

[16℄ Ramses MoralesandIndranilGupta. AVMON:Optimaland s alable dis- overyof onsistentavailabilitymonitoringoverlaysfordistributedsystems. InICDCS, Int'lConf.onDistributedComputingSystems, 2007.

[17℄ Jan Sa ha, Jim Dowling, Raymond Cunningham, and René Meier. Dis- overy of stable peers in a self-organising peer-to-peer gradient topology. InDAIS,Int'lConf.onDistributedAppli ations andInteroperableSystems, 2006.

[18℄ S. Saroiu,P. KrishnaGummadi,andS.D.Gribble. Ameasurementstudy ofpeer-to-peerlesharingsystems. InMMCN, MultimediaComputingand Networking,2002.

[19℄ Daniel Stutzba h and Reza Rejaie. Understanding hurn in peer-to-peer networks. InIMC, InternetMeasurement Conf.,2006.

[20℄ Spyros Voulgaris, Daniela Gavidia, and Maarten van Steen. CYCLON: Inexpensive membership management for unstru tured P2P overlays. J. NetworkSyst. Manage., 13(2),2005.

[21℄ Spyros VoulgarisandMaarten vanSteen. An epidemi proto olfor man-aging routingtables in verylarge peer-to-peer networks. In DSOM, Int'l Work. onDistributedSystems: Operations andManagement,,2003.

[22℄ Qin Xin, Thomas S hwarz, and Ethan L. Miller. Availability in global peer-to-peer storage systems. In WDAS, Work. on Distributed Data and Stru tures,2004.

Contents

1 Introdu tion 3

2 Availability Patterns in eDonkey 4 2.1 TheeDonkeyTra e. . . 4 2.2 FilteringandPredi tion . . . 5

3 ProblemSpe i ation 5

3.1 SystemandNetworkModel . . . 5 3.2 TheProblemofDis onne tionMat hing . . . 6 3.3 TheProblemofPresen eMat hing . . . 6

(17)

4 UptimeMat hingwith Epidemi Proto ols 7

4.1 DistributedMat hing withT-Man . . . 7

4.2 Metri sforAvailabilityMat hing . . . 8

4.2.1 Dis onne tion Mat hing . . . 8

4.2.2 Presen eMat hing . . . 8

5 Simulationsand Results 8 5.1 GeneralSimulationSetup . . . 8

5.2 EvaluationofDis onne tionMat hing . . . 11

5.3 EvaluationofPresen eMat hing . . . 11

6 Dis ussionand Con lusion 11

(18)

Centre de recherche INRIA Saclay – Île-de-France

Parc Orsay Université - ZAC des Vignes

4, rue Jacques Monod - 91893 Orsay Cedex (France)

Centre de recherche INRIA Bordeaux – Sud Ouest : Domaine Universitaire - 351, cours de la Libération - 33405 Talence Cedex

Centre de recherche INRIA Grenoble – Rhône-Alpes : 655, avenue de l’Europe - 38334 Montbonnot Saint-Ismier

Centre de recherche INRIA Lille – Nord Europe : Parc Scientifique de la Haute Borne - 40, avenue Halley - 59650 Villeneuve d’Ascq

Centre de recherche INRIA Nancy – Grand Est : LORIA, Technopôle de Nancy-Brabois - Campus scientifique

615, rue du Jardin Botanique - BP 101 - 54602 Villers-lès-Nancy Cedex

Centre de recherche INRIA Paris – Rocquencourt : Domaine de Voluceau - Rocquencourt - BP 105 - 78153 Le Chesnay Cedex

Centre de recherche INRIA Rennes – Bretagne Atlantique : IRISA, Campus universitaire de Beaulieu - 35042 Rennes Cedex

Centre de recherche INRIA Sophia Antipolis – Méditerranée : 2004, route des Lucioles - BP 93 - 06902 Sophia Antipolis Cedex

Éditeur

INRIA - Domaine de Voluceau - Rocquencourt, BP 105 - 78153 Le Chesnay Cedex (France)

http://www.inria.fr