HAL Id: inria-00432741
https://hal.inria.fr/inria-00432741
Submitted on 17 Nov 2009
Finding Good Partners in Availability-aware P2P
Networks
Stevens Le Blond, Fabrice Le Fessant, Erwan Le Merrer
To cite this version:
Stevens Le Blond, Fabrice Le Fessant, Erwan Le Merrer. Finding Good Partners in Availability-aware
P2P Networks. International Symposium on Stabilization, Safety, and Security of Distributed Systems
(SSS’09), Nov 2009, Lyon, France. ⟨inria-00432741⟩
Stevens Le Blond (1), Fabrice Le Fessant (2), Erwan Le Merrer (3)
(1) INRIA Sophia Antipolis, stevens.le_blond@inria.fr
(2) INRIA Saclay, fabrice.le_fessant@inria.fr
(3) INRIA Rennes, elemerre@irisa.fr

We study the problem of finding peers matching a given availability
pattern in a peer-to-peer (P2P) system. Motivated by practical
examples, we specify two formal problems of availability matching that
arise in real applications: disconnection matching, where peers look
for partners expected to disconnect at the same time, and presence
matching, where peers look for partners expected to be online
simultaneously in the future. As a scalable and inexpensive solution,
we propose to use epidemic protocols for topology management; we
provide corresponding metrics for both matching problems. We evaluated
this solution by simulating two P2P applications, task scheduling and
file storage, over a new trace of the eDonkey network, the largest
available with availability information. We first proved the existence
of regularity patterns in the sessions of 14M peers over 27 days. We
also showed that, using only 7 days of history, a simple predictor
could select predictable peers and successfully predict their online
periods for the next week. Finally, simulations showed that our simple
solution provided good partners fast enough to match the needs of both
applications, and that consequently, these applications performed as
efficiently at a much lower cost. We believe that this work will be
useful for many P2P applications for which it has been shown that
choosing good partners, based on their availability, drastically
improves their performance and stability.
1 Introduction
Churn is one of the most critical characteristics of peer-to-peer
(P2P) networks, as the permanent flow of peer connections and
disconnections can seriously hamper the efficiency of applications [9].
Fortunately, it has been shown that peer availability follows
patterns ([21,22,2]), and so can be predicted from the uptime history
of those peers [18].
To take advantage of these predictions, applications need to be able to
dynamically find good partners for peers, according to these
availability patterns, even in large-scale unstructured networks. The
intrinsic constitution of those networks makes pure random matching
techniques time-inefficient in the face of churn. Basic usage of
prediction based on node availability exists in the literature, e.g.
for file replication [16].
In this paper, we study a generic technique to discover such partners,
and apply it to two particular matching problems: disconnection
matching, where peers look for partners expected to disconnect at the
same time, and presence matching, where peers look for partners
expected to be online simultaneously in the future. These problems are
specified in Section 2.
We then propose to use standard epidemic protocols for topology
management to solve these problems (see e.g. [12,24]); such protocols
have proven to be efficient for a large panel of applications, from
overlay slicing [13] to IP-TV overlay maintenance [14] for example.
However, in order to converge to the desired state or topology (here
matched peers), those protocols require good metrics to compute the
distance between peers. Such metrics and a well-known epidemic
protocol, T-Man [12], are described in Section 3.
To evaluate the efficiency of our proposal, we simulated an
application for each matching problem: an application of task
scheduling, where tasks of multiple remote jobs are started by all the
peers in the network (disconnection matching), and an application of
P2P file-system, where peers replicate files on other peers to have
them highly available (presence matching). These applications are
specified in Section 5.
To run our simulations on a realistic workload, we collected a new
trace of peer availability on the eDonkey file-sharing network. With
the connections and disconnections of 14M peers over 27 days, this
trace is the largest available workload concerning peers' availability.
We show that peers in this trace exhibit availability patterns and,
using a simple 7-day predictor, that it is possible to select
predictable peers and successfully predict their behavior over the
following week. The new eDonkey trace and this simple predictor are
studied in Section 4.
Our simulation results showed that our T-Man based solution is able
to provide good partners to all peers, for both applications. Using
[...] partners. Moreover, T-Man is scalable and inexpensive, making
the solution usable for any application and network size. These
results are detailed in Section 6.
We believe that many P2P systems and applications can benefit from
this work, as a lot of availability-aware applications have been
proposed in the literature [3,8,20,5,25]. Close to our work, Godfrey et
al. [9] show that strategies based on the longest current uptime are
more efficient than uptime-agnostic strategies for replica placement;
Mickens et al. [18] introduce sophisticated availability predictors and
show that they can be very successful. However, to the best of our
knowledge, this paper is the first to deal with the problem of finding
the best partners according to availability patterns in a large-scale
network. Moreover, previous results are often computed on synthetic
traces or small traces of P2P networks.
2 Problem Specification
This section presents two availability matching problems, disconnection
matching and presence matching. Each problem is abstracted from the
needs of a practical P2P application that we describe afterward. But
first, we start by introducing our system model.
2.1 System and Network Model
We assume a fully-connected asynchronous P2P network of $N$ nodes, with
$N$ usually ranging from thousands to millions of nodes. We assume that
there is a constant bound $n_c$ on the number of simultaneous
connections that a peer can engage in, typically much smaller than $N$.
When peers leave the system, they disconnect silently. However, we
assume that disconnections are detected after a time $\Delta_{disc}$,
for example 30 seconds with TCP keep-alive.
For each peer $x$, we assume the existence of an availability
prediction $Pr_x(t)$, starting at the current time $t$ and for a period
$T$ in the future, such that $Pr_x(t)$ is a set of non-overlapping
intervals during which $x$ is expected to be online. Since these
predictions are based on previous measures of availability for peer
$x$, we assume that such measures are reliable, even in the presence of
malicious peers [19,17].
We note $\bigcup Pr_x(t)$ the set defined by the union of the intervals
of $Pr_x(t)$.
Fig. 1. Disconnection Matching: peer $y$ is a better match than peer
$z$ for peer $x$.
2.2 The Problem of Disconnection Matching
Intuitively, the problem of Disconnection Matching is, for a peer online
at a given time, to find a set of other online peers who are expected to
disconnect at the same time.
Formally, for a peer $x$ online at time $t$, an online peer $y$ is a
better match for Disconnection Matching than an online peer $z$ if
$|t_x - t_y| < |t_x - t_z|$, where $[t, t_x[ \in Pr_x(t)$,
$[t, t_y[ \in Pr_y(t)$ and $[t, t_z[ \in Pr_z(t)$. The problem of
Disconnection Matching $DM(n)$ is to discover the $n$ best matches of
online peers at any time.
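As an illustration, the ranking implied by this definition can be sketched in a few lines of Python. This is only a centralized sketch under simplifying assumptions: the function name `best_disconnection_matches` and the dictionary `predicted_end`, which maps each online peer $p$ to the end $t_p$ of its current predicted interval $[t, t_p[$, are hypothetical and chosen for the example.

```python
from typing import Dict, List


def best_disconnection_matches(x: str, predicted_end: Dict[str, float], n: int) -> List[str]:
    """Return the n online peers whose predicted disconnection time is closest to x's.

    predicted_end[p] is t_p, the end of the predicted interval [t, t_p[
    of online peer p (hypothetical input format for this sketch).
    """
    t_x = predicted_end[x]
    candidates = [p for p in predicted_end if p != x]
    # A smaller |t_x - t_p| means a better match.
    candidates.sort(key=lambda p: abs(t_x - predicted_end[p]))
    return candidates[:n]


# Toy example: times are minutes from now.
ends = {"x": 100.0, "y": 110.0, "z": 40.0, "w": 95.0}
print(best_disconnection_matches("x", ends, 2))  # ['w', 'y']
```

In a real deployment no peer knows all predictions, which is why a distributed protocol is needed to approximate this ranking.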
The problem of disconnection matching typically arises in applications
where a peer tries to find partners with whom it wants to collaborate
until the end of its session, in particular when starting such a
collaboration might be expensive in terms of resources.
An example of such an application is task scheduling in P2P networks.
In Zorilla [7] for example, a peer can submit a computation task of $n$
jobs to the system. In such a case, the peer tries to locate $n$ online
peers (with expanding ring search) to become partners for the task, and
executes the $n$ jobs on these partners. When the computation is over,
the peer collects the $n$ results from the $n$ partners. With
disconnection matching, such a system becomes much more efficient: by
choosing partners who are likely to disconnect at the same time as the
peer, the system increases the probability that:
- If the peer does not disconnect too early, its partners will have
  time to finish executing their jobs before disconnecting and it will
  be able to collect the results;
- If the peer disconnects before the end of the computation, partners
  will not waste unnecessary resources as they are also likely to
  disconnect at the same time.
Fig. 2. Presence Matching: peer $y$ is a better match than peer $z$ for
peer $x$.
2.3 The Problem of Presence Matching
Intuitively, the problem of Presence Matching is, for a peer online at a
given time, to find a set of other online peers who are expected to be
connected at the same time in the future.
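Formally, for a peer $x$ online at time $t$, an online peer $y$ is a
better match for Unfair Presence Matching than an online peer $z$ if:
$$\left\| \bigcup Pr_z(t) \cap \bigcup Pr_x(t) \right\| < \left\| \bigcup Pr_y(t) \cap \bigcup Pr_x(t) \right\|$$
Here $\|S\|$ is read as the total duration covered by the set $S$.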
This problem is called unfair, since peers who are always online appear
to be best matches for all other peers in the system, whereas only other
always-on peers are best matches for them. Since some fairness is wanted
in most P2P systems, offline periods should also be considered.
Consequently, $y$ is a better match than $z$ for Presence Matching if:
$$\frac{\left\| \bigcup Pr_z(t) \cap \bigcup Pr_x(t) \right\|}{\left\| \bigcup Pr_z(t) \cup \bigcup Pr_x(t) \right\|} < \frac{\left\| \bigcup Pr_y(t) \cap \bigcup Pr_x(t) \right\|}{\left\| \bigcup Pr_y(t) \cup \bigcup Pr_x(t) \right\|}$$
The problem of Presence Matching $PM(n)$ is to discover the $n$ best
matches of online peers at any time.
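The difference between the unfair and the fair criterion can be checked on a small example. The following Python sketch computes both quantities directly on interval sets; the helper names (`total_length`, `intersection`, `presence_score`) and the representation of a prediction as a sorted list of half-open `(start, end)` pairs are assumptions made for this illustration only.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # half-open interval [start, end[


def total_length(intervals: List[Interval]) -> float:
    """Total duration covered by non-overlapping intervals."""
    return sum(end - start for start, end in intervals)


def intersection(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersect two sorted lists of non-overlapping intervals."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        start, end = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance the list whose current interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def presence_score(px: List[Interval], py: List[Interval]) -> float:
    """Ratio of co-availability to total availability (higher is better)."""
    inter = total_length(intersection(px, py))
    union = total_length(px) + total_length(py) - inter
    return inter / union if union > 0 else 0.0


x = [(0, 4), (10, 14)]
y = [(0, 4), (10, 13)]   # behaves almost like x
z = [(0, 24)]            # always online
print(total_length(intersection(x, y)), total_length(intersection(x, z)))  # 7 8
print(presence_score(x, y), presence_score(x, z))
```

With the unfair criterion the always-on peer `z` wins (8 hours of overlap against 7), whereas the ratio prefers `y`, whose offline periods also coincide with those of `x`.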
The problem of presence matching arises in applications where a peer
wants to find partners that will be available at the same time in other
sessions. This is typically the case when a huge amount of data has to
be transferred, and partners will have to communicate a lot to use that
data.
An example of such an application is storage of files in P2P networks
[4]. For example, in Pastiche [6], each peer in the system has to find
other peers to store its files. Since files can only be used when the
peer is online, the best partners for a peer (at equivalent stability)
are the peers who are expected to be online when the peer itself is
online.
Moreover, in a P2P backup system [8], peers usually replace the replicas
that cannot be reached for a given period, to maintain a given level of
data redundancy. Using presence matching, such applications can increase
the probability of being able to connect to all their partners, thus
reducing [...]
We think that epidemic protocols [12,23,15,24] are good approximate
solutions for these matching problems. Here, we present one of these
protocols, T-Man [12], and, since such protocols rely heavily on
appropriate metrics, we propose a metric for each matching problem.
3.1 Distributed Matching with T-Man
T-Man is a well-known epidemic protocol, usually used to associate each
peer in the network with a set of good partners, given a metric
(distance function) between peers. Even in large-scale networks, T-Man
converges fast, and provides a good approximation of the optimal
solution in a few rounds, where each round costs only four messages on
average per peer.
In T-Man, each peer maintains two small sets, its random view and
its metric view, which are, respectively, some random neighbors, and the
current best candidates for partnership, according to the metric in use.
During each round, every peer updates its views: with one random peer
in its random view, it merges the two random views, and keeps the most
recently seen peers in its random view; with the best peer in its metric
view, it merges all the views, and keeps only the best peers, according
to the metric, in its metric view.
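The round described above can be sketched as a small single-process simulation. This is a simplified illustration and not the T-Man implementation of [12]: the class `Peer`, the helpers `merge_recent` and `tman_round`, the view size of 3, the rule that entries received from the partner count as fresher, and the toy distance `abs(a - b)` on integer peer names are all assumptions of this sketch. Message exchange and timestamps are not modelled.

```python
import random


class Peer:
    """State kept by one peer: a random view and a metric view (lists of names)."""

    def __init__(self, name, random_view, view_size=3):
        self.name = name
        self.random_view = list(random_view)  # freshest entries first (assumption)
        self.metric_view = []                 # current best partners
        self.view_size = view_size


def merge_recent(fresh, old, owner, size):
    """Merge two views, fresh entries first, without duplicates or the owner."""
    merged = []
    for p in fresh + old:
        if p != owner and p not in merged:
            merged.append(p)
    return merged[:size]


def tman_round(peer, peers, distance, rng):
    """One simplified round for `peer`; `peers` maps names to Peer objects."""
    if not peer.random_view:
        return
    # Step 1: exchange random views with one random neighbour.
    other = peers[rng.choice(peer.random_view)]
    mine, theirs = peer.random_view, other.random_view
    peer.random_view = merge_recent([other.name] + theirs, mine, peer.name, peer.view_size)
    other.random_view = merge_recent([peer.name] + mine, theirs, other.name, other.view_size)
    # Step 2: exchange all views with the best peer known so far.
    pool = peer.metric_view or peer.random_view
    best = peers[min(pool, key=lambda q: distance(peer.name, q))]
    union = set(peer.random_view + peer.metric_view + best.random_view + best.metric_view)
    union |= {peer.name, best.name}
    for a in (peer, best):
        ranked = sorted(union - {a.name}, key=lambda q: (distance(a.name, q), q))
        a.metric_view = ranked[:a.view_size]


rng = random.Random(0)
names = list(range(20))
peers = {n: Peer(n, rng.sample([m for m in names if m != n], 3)) for n in names}
dist = lambda a, b: abs(a - b)  # toy metric standing in for DM- or PM-distance
for _ in range(10):
    order = names[:]
    rng.shuffle(order)
    for n in order:
        tman_round(peers[n], peers, dist, rng)
print(peers[10].metric_view)
```

After a few rounds the metric views tend to contain names close to the owner's, which is the behaviour the protocol relies on; any distance function can be plugged in through the `distance` argument.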
This double scheme guarantees a permanent shuffle of the random
views, while ensuring fast convergence of the metric views towards the
optimal solution. Consequently, the choice of a good metric is very
important. We propose such metrics for the two availability matching
problems in the next part.
3.2 Metrics for Availability Matching
To compute efficiently the distance between peers, the prediction
$Pr_x(t)$ is approximated by a bitmap of size $m$, $pred_x$, where entry
$pred_x[i]$ is 1 if $[i \times T/m, (i + 1) \times T/m[$ is included in
an interval of $Pr_x(t)$, for $0 \le i < m$. Note that these metrics can
be used with any epidemic protocol, not only with T-Man.
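The bitmap approximation can be illustrated as follows. The function name `prediction_to_bitmap` and the convention that intervals are given as offsets from the current time $t$ (so that they lie in $[0, T[$) are assumptions of this sketch.

```python
from typing import List, Tuple


def prediction_to_bitmap(intervals: List[Tuple[float, float]], horizon: float, m: int) -> List[int]:
    """Approximate a prediction by a bitmap of m slots over [0, horizon[.

    Slot i is set to 1 only if [i*T/m, (i+1)*T/m[ lies entirely inside
    one predicted online interval (intervals are offsets from time t).
    """
    slot = horizon / m
    bitmap = []
    for i in range(m):
        lo, hi = i * slot, (i + 1) * slot
        covered = any(start <= lo and hi <= end for start, end in intervals)
        bitmap.append(1 if covered else 0)
    return bitmap


# T = 24 hours, m = 6 slots of 4 hours each.
print(prediction_to_bitmap([(0, 9), (16, 24)], 24, 6))  # [1, 1, 0, 0, 1, 1]
```

Note that the slot covering hours 8 to 12 is 0 because the first interval ends at hour 9: a slot counts as online only when it is fully covered.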
Fig. 3. Diurnal patterns are clearly visible when we plot the number of
online peers at any time in our 27-day eDonkey trace. Depending on the
time of the day, between 300,000 and 600,000 users are connected to a
single eDonkey server. [Plot "Global System Availability": number of
online peers (0 to 600,000) against days (0 to 30).]
Disconnection Matching. The metric computes the time between the
disconnections of two peers. In case of equality, the PM-distance of
Section 3.2 is used to prefer peers with the same availability periods:
$$\text{DM-distance}(x, y) = |I_x - I_y| + \text{PM-distance}(x, y)$$
where
$$I_x = \min\{0 \le i < m \mid pred_x[i] = 1 \wedge pred_x[i + 1] = 0\}$$
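A possible reading of this metric on bitmaps is sketched below. Three points are assumptions of the sketch rather than statements of the text: the tie-break is implemented by adding the PM-distance (a value in [0,1]) to the difference of disconnection indices, a peer is treated as offline beyond the last slot, and a peer that is never predicted online gets index $m$. The function names are hypothetical.

```python
from typing import List


def first_disconnection(pred: List[int]) -> int:
    """Index I of the first slot i with pred[i] = 1 and pred[i + 1] = 0."""
    m = len(pred)
    for i in range(m):
        nxt = pred[i + 1] if i + 1 < m else 0  # assumption: offline past the horizon
        if pred[i] == 1 and nxt == 0:
            return i
    return m  # assumption: no predicted online slot at all


def pm_distance(px: List[int], py: List[int]) -> float:
    """1 - (co-availability / total availability), computed slot by slot."""
    both = sum(min(a, b) for a, b in zip(px, py))
    either = sum(max(a, b) for a, b in zip(px, py))
    return 1.0 - both / either if either else 1.0


def dm_distance(px: List[int], py: List[int]) -> float:
    """Gap between first disconnections, with PM-distance as additive tie-break."""
    return abs(first_disconnection(px) - first_disconnection(py)) + pm_distance(px, py)


x = [1, 1, 1, 0, 0, 0]
y = [1, 1, 1, 0, 1, 1]  # disconnects in the same slot as x
z = [1, 1, 0, 0, 0, 0]  # disconnects one slot earlier
print(first_disconnection(x), first_disconnection(y), first_disconnection(z))  # 2 2 1
print(dm_distance(x, y) < dm_distance(x, z))  # True
```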
Presence Matching. The metric first computes the ratio of
co-availability (time where both peers were simultaneously online) on
total availability (time where at least one peer was online). Since the
distance should be close to 0 when peers are close, we then reverse the
value on [0,1]:
$$\text{PM-distance}(x, y) = 1 - \frac{\sum_{0 \le i < m} \min(pred_x[i], pred_y[i])}{\sum_{0 \le i < m} \max(pred_x[i], pred_y[i])}$$
Note that, while the PM-distance value is in [0,1], the DM-distance
value is in [0,m].
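As a small worked example of the PM-distance, consider two hypothetical bitmaps of size $m = 4$, chosen only for illustration:

```latex
\begin{align*}
pred_x &= (1, 1, 0, 0), \qquad pred_y = (0, 1, 1, 0) \\
\sum_{0 \le i < 4} \min(pred_x[i], pred_y[i]) &= 0 + 1 + 0 + 0 = 1 \\
\sum_{0 \le i < 4} \max(pred_x[i], pred_y[i]) &= 1 + 1 + 1 + 0 = 3 \\
\text{PM-distance}(x, y) &= 1 - \tfrac{1}{3} = \tfrac{2}{3}
\end{align*}
```

Two identical non-empty bitmaps give a distance of 0, and two bitmaps that are never online in the same slot give a distance of 1, which matches the stated range [0,1].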
4 Simulation Settings
We evaluated our solution based on T-Man on two applications, one
for each matching problem. In this section, we describe our simulation
settings. In particular, we describe the characteristics of the trace we
collected for the needs of this study, with more than 300,000 online
peers over 27 days. With a few thousand peers online at the same time,
most other traces collected on P2P systems [21,10,2] lack massive
connection and disconnection trends, for the study of availability
patterns on a large scale.
Fig. 4. Peers achieve their best auto-correlation (resemblance between
sessions after a given period) for a one-day period or a one-week
period. Consequently, peers are highly likely to connect at almost the
same time the next day or the next week. [Plot "Distribution of the Best
Sizes of Patterns": number of peers (log scale) against best pattern
size in days (0 to 14).]
4.1 A new eDonkey Trace
In 2007, we collected the connection and disconnection events from the
logs of one of the main eDonkey servers in Europe. eDonkey is currently
the most used P2P file-sharing network in the world. Our trace,
available on our website [1], contains more than 200 million
connections by more than 14 million peers, over a period of 27 days. To
analyse this trace, we first filtered out useless connections (shorter
than 10 minutes) and suspicious ones (too repetitive, simultaneous or
with changing identifiers), leading to a filtered trace of 12 million
peers.
The number of peers online at the same time in the filtered trace is
usually more than 300,000, as shown by Fig. 3. Global diurnal patterns
of around 100,000 users are also clearly visible: as shown by previous
studies [11], most eDonkey users are located in Europe, and so, their
daily offline periods are only partially compensated by connections from
other continents.
For every peer in the filtered trace, the auto-correlation on its
availability periods was computed on 14 days, with a step of one minute.
For a given peer, the period for which the auto-correlation is maximum
gives its best pattern size. The number of peers with a given best
pattern size is plotted on Fig. 4, and shows, as could be expected, that
the best pattern [...]
Our goal in these simulations was to evaluate the efficiency of our
matching protocol, and not the efficiency of availability predictors, as
already done in [18]. As a consequence, we implemented a very
straightforward predictor, that uses a 7-day window of availability
history to compute the daily pattern of a peer: for each interval of 10
minutes in a day, its value is the number of days in the week where the
peer was available during that full interval:
$$pattern_p[i] = \sum_{d \in [0:6]} history_p[d \times 24 \times 60/10 + i]$$
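The formula translates almost directly into code. In the sketch below, `history` is assumed to be a flat list of 7 × 144 zeros and ones (one entry per 10-minute interval, 1 meaning the peer was available during the full interval); the names `SLOTS_PER_DAY` and `daily_pattern` are hypothetical.

```python
from typing import List

SLOTS_PER_DAY = 24 * 60 // 10  # 144 intervals of 10 minutes per day
DAYS = 7


def daily_pattern(history: List[int]) -> List[int]:
    """pattern[i] = number of days (0..7) the peer was available in slot i."""
    if len(history) != DAYS * SLOTS_PER_DAY:
        raise ValueError("history must cover exactly 7 days of 10-minute slots")
    return [
        sum(history[d * SLOTS_PER_DAY + i] for d in range(DAYS))
        for i in range(SLOTS_PER_DAY)
    ]


# Toy history: online every day from 10:00 to 11:00 (slots 60..65),
# plus slot 100 on the first two days only.
history = [0] * (DAYS * SLOTS_PER_DAY)
for d in range(DAYS):
    for i in range(60, 66):
        history[d * SLOTS_PER_DAY + i] = 1
history[0 * SLOTS_PER_DAY + 100] = 1
history[1 * SLOTS_PER_DAY + 100] = 1

pattern = daily_pattern(history)
print(pattern[60], pattern[100], pattern[0])  # 7 2 0
```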
This predictor has two purposes:
- It should help the application to decide which peers are predictable,
  and thus, which peers can benefit from an improved quality of service.
  This gives an incentive for peers to participate regularly in the
  system;
- It should help the application to predict future connections and
  disconnections of the selected peers.
To select predictable peers, the predictor computes, for each peer, the
maximum and the mean covariance of the peer's daily pattern. For these
simulations, we computed a set, called predictable set, containing peers
matching the following properties:
- The maximum value in $pattern$ is at least 5: each peer was available
  at least five days during the last week exactly at the same time;
- The average covariance in $pattern$ is greater than 28: each peer has a
  sharply-shaped behavior;
- Peer availability is greater than 0.1: peers have to contribute enough
  to the system;
- Peer availability is smaller than 0.9: peers which are always online
  would bias positively our simulations.
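The four conditions can be combined into a single filter, sketched below. Since the text does not spell out how the average covariance is computed, the sketch takes it as a precomputed argument; it also assumes that peer availability is the fraction of time the peer was online. The function name `is_predictable` is hypothetical.

```python
from typing import List


def is_predictable(pattern: List[int], mean_covariance: float, availability: float) -> bool:
    """Apply the four selection thresholds listed above.

    mean_covariance is supplied by the caller (its computation is not
    specified here); availability is assumed to be a fraction in [0, 1].
    """
    return (
        max(pattern) >= 5             # same slot on at least 5 days of the week
        and mean_covariance > 28      # sharply-shaped behavior
        and 0.1 < availability < 0.9  # contributes enough, but is not always on
    )


print(is_predictable([0, 5, 7, 2], 30.0, 0.5))   # True
print(is_predictable([0, 4, 3, 2], 30.0, 0.5))   # False: maximum below 5
print(is_predictable([0, 5, 7, 2], 30.0, 0.95))  # False: almost always online
```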
In our eDonkey trace, this predictable set contains 19,600 such peers.
Note that this relatively small number of peers, w.r.t. the total number
of peers in our trace, does not mean that eDonkey peers are not
predictable: our trace concerns only a part of eDonkey users at measure
time (around 10%, those connected to eDonkey Server N.2). Users that
leave may join another server (e.g. Server N.1, a larger one), which
makes them invisible in our trace, even though they are still using
eDonkey. For every peer in the set, the predictor predicts that the peer
will be online in a given [...] otherwise predicts nothing (we never
predict that a peer will be offline). The ratio of successful
predictions after a week for the full following week is plotted on
Fig. 5. It shows that predictions cannot be explained by accidental
availability alone, and proves the presence of availability patterns in
the trace.
Fig. 5. Whereas availability determines the prediction with random
bitmaps, daily patterns improve the prediction with real bitmaps (e.g.
for 60% of peers (x=0.4), 50% of predictions (y=0.5) are successful, but
only 25% with random bitmaps). [Plot "Predictability of selected peers":
prediction success and availability (0 to 1) against the CDF of peers
(normalized); series: prediction (predictable peers), prediction
(randomized bitmaps), availability.]
We purposely chose a very simple predictor, as we are interested in
showing that patterns of presence are visible and can benefit
applications, even with a worst-case approach. Therefore, we expect that
better results would be achieved using more sophisticated predictors,
such as described in [18], and for an optimal pattern size of one day
instead of a week.
4.3 General Simulation Setup
A simulator was developed from scratch to run the simulations on a Linux
3.2 GHz Xeon computer, for the 19,600 peers of the predictable set from
Section 4.2. Their behaviors over 14 days were extracted from the
eDonkey trace: the first 7 days were used to compute a prediction, and
that prediction, without updates, was used to execute the protocol on
the following seven days. During one round of the simulator, all online
peers in random order evaluate one T-Man round, corresponding to one
minute of the trace. As explained later, both applications were delayed
by a period [...] two hours and 6 GB of memory footprint.
5 Simulated Applications
In this section, we describe the two applications that we used to
illustrate the need for an efficient protocol for distributed
availability matching. Our goal is not to improve the performance of
these applications, as this can be done by an aggressive greedy
algorithm, but to save resources using availability information.
5.1 Disconnection Matching: Task Scheduling
To evaluate the efficiency of T-Man and the DM-distance metric, we
simulated a distributed task scheduling application. In this
application, every peer starts a task after 10 minutes online: a task is
composed of 3 jobs of 4 hours on remote partners, and is completed if
the peer and its partners are still online after 4 hours to collect the
results.
The first 2 hours of each job are devoted to the download of the data
needed for the computation from a central server. As a consequence, a
peer can decide not to start a task to save the bandwidth of the central
server. In our simulation, such a decision is taken when the prediction
of the peer availability shows that the peer is going to go offline
before completion of the task.
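This decision rule can be sketched on a prediction bitmap. The sketch assumes slots of 10 minutes (so that a 4-hour job spans 24 slots) and a hypothetical function name `should_start_task`; the slot length actually used for this decision is not stated in the text.

```python
from typing import List

JOB_SLOTS = 4 * 60 // 10  # a 4-hour job spans 24 slots of 10 minutes (assumed slot size)


def should_start_task(pred: List[int], now_slot: int, job_slots: int = JOB_SLOTS) -> bool:
    """Start only if the peer is predicted online for the whole job duration."""
    window = pred[now_slot:now_slot + job_slots]
    # A window cut short by the end of the bitmap counts as "not predicted online".
    return len(window) == job_slots and all(window)


pred = [1] * 30 + [0] * 10  # predicted online for the next 5 hours, then offline
print(should_start_task(pred, 0))   # True: slots 0..23 are all online
print(should_start_task(pred, 6))   # True: slots 6..29 are all online
print(should_start_task(pred, 7))   # False: slot 30 is predicted offline
```

A peer that declines to start a task downloads nothing from the central server, which is where the bandwidth savings come from.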
5.2 Presence Matching: P2P File-Storage
To evaluate the efficiency of T-Man and the PM-distance metric, we
simulated a P2P file storage application. In this application, every
peer replicates its data to its partners, ten minutes after coming
online for the first time, in the hope that it will be able to use this
remote data the next time it is online.
The size of the data of each peer is supposed to be large, hundreds of
megabytes for example. As a consequence, it is important for the system
to use as little redundancy as possible to achieve high co-availability
of data (i.e. availability of the peer and at least one of its data
replicas). Finding good partners in the network is expected to provide
replicas which are more likely to be available at the same time as the
peer, thus decreasing [...]
Fig. 6. A task is a set of three remote jobs of 4 hours started by every
peer, ten minutes after coming online. A task is successful when the
peer and its partners are still online after 4 hours to collect the
results. Using availability predictions, a peer can decide not to start
a task expected to abort, leading to fewer aborted tasks. Using
disconnection matching, it can find good partners and it can still
complete almost as many tasks as the much more expensive random
strategy. [Plot "Impact of Disconnection Matching for P2P scheduling":
number of completed and aborted tasks (0 to 30,000) for the Uptime and
Random strategies, for Day + 1, Day + 7 and the week mean.]
6 Simulation Results
In this section, we present the results of our simulations of the two
applications. We are not interested in the raw performance of these
applications, but in the savings that could be achieved by using
availability information and partner matching.
6.1 Results for Disconnection Matching
We compared Disconnection Matching with a Random choice of partners
(actually, using partners within the T-Man random view) for the
distributed task scheduling application. The number of completed tasks
and the number of aborted tasks are plotted on Fig. 6, for the first
day, the 7th day and the whole week.
Prediction of availability decreased the number of aborted tasks by 68%
on average over a week, corresponding to 50% of bandwidth savings on the
data server, while decreasing the number of completed tasks by only 17%.
These results were largely improved using one-day prediction, since
[...] in Section 4.1). Indeed, bandwidth savings were about 43% for
Disconnection Matching, while completing 20% more tasks. Thus, it is
much more interesting from a performance point of view to use one-day
prediction every day instead of one-week prediction, although savings
are still possible with one-week predictions.
Fig. 7. 10 minutes after coming online for the first time, each peer
creates a given number of replicas for its data. Co-availability is
defined by the simultaneous presence of the peer and at least one
replica. Using presence matching, fewer replicas are needed to achieve
better results than using a random choice of partners. Even the 7th day,
using a 6-day old prediction, the system still performs much more
efficiently, almost compensating the general loss in availability.
[Plot "Impact of Presence Matching for P2P File Storage":
co-availability (0.3 to 1) with 1 to 9 replicas for the Uptime and
Random strategies, for Day + 1, Day + 7 and the week.]
6.2 Results for Presence Matching
We compared Presence Matching with a Random choice of replica locations
for the P2P file-system application. The co-availability of the peer and
at least one replica is plotted on Fig. 7, for different numbers of
replicas.
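To make the plotted quantity concrete, co-availability can be computed from availability bitmaps as sketched below. The normalization by the peer's own online time is an assumption of this sketch (the text only defines co-availability as the simultaneous presence of the peer and at least one replica), and the function name `co_availability` is hypothetical.

```python
from typing import List


def co_availability(peer: List[int], replicas: List[List[int]]) -> float:
    """Fraction of the peer's online slots in which at least one replica is online."""
    online = sum(peer)
    if online == 0:
        return 0.0
    covered = sum(
        1 for i, up in enumerate(peer)
        if up and any(r[i] for r in replicas)
    )
    return covered / online


peer = [1, 1, 1, 1, 0, 0]
r1 = [1, 1, 0, 0, 0, 1]
r2 = [0, 0, 1, 0, 1, 1]
print(co_availability(peer, [r1]))      # 0.5
print(co_availability(peer, [r1, r2]))  # 0.75
```

Adding replicas can only increase this value, and well-matched replicas increase it faster, which is the effect measured in the comparison.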
Using presence matching, fewer replicas were needed to achieve better
results than using a random choice of partners. For example, 1 replica
with Presence Matching gives a better co-availability than 2 replicas
with Random Choice; 5 replicas with Presence Matching give a
co-availability of 95%, which is only achieved using 9 replicas with
Random Choice. As for the other application, week-old predictions still
performed better than [...]
7 Conclusion
In this paper, we showed that epidemic protocols for topology
management can be efficient to find good partners in availability-aware
networks. Simulations proved that, using one of these protocols and
appropriate metrics, such applications can be less expensive and still
perform with an equivalent or better quality of service. We used a
worst-case scenario: a simple predictor, and a trace collected from a
highly volatile file-sharing network, where only a small subset of peers
provide predictable behaviors. Consequently, we expect that a real
application would benefit even more from availability matching
protocols.
In particular, until this work, availability-aware applications were
limited to using predictions or availability information to better
choose among a limited set of neighbors. This work opens the door to new
availability-aware applications, where best partners are chosen among
all available peers in the network. It is a useful complement to the
work done on measuring availability [19,17] and using these measures to
predict future availability [18].
References
1. Trace. http://fabrice.lefessant.net/traces/edonkey2.
2. Bhagwan, R., Savage, S., and Voelker, G. Understanding availability.
   In IPTPS, Int'l Work. on Peer-to-Peer Systems (2003).
3. Bhagwan, R., Tati, K., Cheng, Y.-C., Savage, S., and Voelker, G. M.
   Total recall: system support for automated availability management.
   In NSDI, Symp. on Networked Systems Design and Implementation (2004).
4. Busca, J.-M., Picconi, F., and Sens, P. Pastis: A highly-scalable
   multi-user peer-to-peer file system. In Proceedings of Euro-Par
   (2005).
5. Chun, B.-G., Dabek, F., Haeberlen, A., Sit, E., Weatherspoon, H.,
   Kaashoek, M. F., Kubiatowicz, J., and Morris, R. Efficient replica
   maintenance for distributed storage systems. In NSDI, Symp. on
   Networked Systems Design and Implementation (2006).
6. Cox, L. P., Murray, C. D., and Noble, B. D. Pastiche: Making backup
   cheap and easy. In OSDI, Symp. on Operating Systems Design and
   Implementation (2002).
7. Drost, N., van Nieuwpoort, R. V., and Bal, H. E. Simple
   locality-aware co-allocation in peer-to-peer supercomputing. In
   GP2P, Int'l Work. on Global and Peer-2-Peer Computing (2006).
8. Duminuco, A., Biersack, E. W., and En Najjary, T. Proactive
   replication in distributed storage systems using machine
   availability estimation. In CoNEXT, Int'l Conf. on emerging
   Networking EXperiments and Technologies (2007).
9. Godfrey, P. B., Shenker, S., and Stoica, I. Minimizing churn in
   distributed systems. In SIGCOMM, Conf. on Applications,
   Technologies, Architectures, and Protocols for Computer
   Communications (2006).
10. Guha, S., Daswani, N., and Jain, R. An Experimental Study of the
    Skype [...]
11. [...] and Patarin, S. Peer sharing behaviour in the eDonkey
    network, and implications for the design of server-less file
    sharing systems. In EuroSys (2006).
12. Jelasity, M., and Babaoglu, O. T-Man: Gossip-based overlay topology
    management. In ESOA, Int'l Work. on Engineering Self-Organising
    Systems (2005).
13. Jelasity, M., and Kermarrec, A.-M. Ordered slicing of very
    large-scale overlay networks. IEEE International Conference on
    Peer-to-Peer Computing (2006), 117–124.
14. Kermarrec, A.-M., Le Merrer, E., Liu, Y., and Simon, G. Surfing
    peer-to-peer IPTV: Distributed channel switching. Proceedings of
    Euro-Par (2009).
15. Killijian, M.-O., Courtès, L., and Powell, D. A Survey of
    Cooperative Backup Mechanisms. Tech. Rep. 06472, LAAS, 2006.
16. Kim, K. Lifetime-aware replication for data durability in P2P
    storage network. IEICE Transactions 91-B, 12 (2008), 4020–4023.
17. Le Fessant, F., Sengul, C., and Kermarrec, A.-M. Pacemaker:
    Fighting Selfishness in Availability-Aware Large-Scale Networks.
    Tech. Rep. RR-6594, INRIA, 2008.
18. Mickens, J. W., and Noble, B. D. Exploiting availability prediction
    in distributed systems. In NSDI, Symp. on Networked Systems Design
    and Implementation (2006).
19. Morales, R., and Gupta, I. AVMON: Optimal and scalable discovery of
    consistent availability monitoring overlays for distributed
    systems. In ICDCS, Int'l Conf. on Distributed Computing Systems
    (2007).
20. Sacha, J., Dowling, J., Cunningham, R., and Meier, R. Discovery of
    stable peers in a self-organising peer-to-peer gradient topology.
    In DAIS, Int'l Conf. on Distributed Applications and Interoperable
    Systems (2006).
21. Saroiu, S., Gummadi, P. K., and Gribble, S. A measurement study of
    peer-to-peer file sharing systems. In MMCN, Multimedia Computing
    and Networking (2002).
22. Stutzbach, D., and Rejaie, R. Understanding churn in peer-to-peer
    networks. In IMC, Internet Measurement Conf. (2006).
23. Voulgaris, S., Gavidia, D., and van Steen, M. CYCLON: Inexpensive
    membership management for unstructured P2P overlays. J. Network
    Syst. Manage. 13, 2 (2005).
24. Voulgaris, S., van Steen, M., and Iwanicki, K. Proactive
    gossip-based management of semantic overlay networks: Research
    articles. Concurr. Comput.: Pract. Exper. 19, 17 (2007), 2299–2311.
25. Xin, Q., Schwarz, T., and Miller, E. L. Availability in global
    peer-to-peer