HAL Id: inria-00111963
https://hal.inria.fr/inria-00111963v2
Submitted on 8 Dec 2006
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of
sci-entific research documents, whether they are
pub-lished or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
Yves Robert, Anne Benoit, Veronika Rehn
To cite this version:
Yves Robert, Anne Benoit, Veronika Rehn. Strategies for Replica Placement in Tree Networks.
[Re-search Report] RR-6012, LIP RR-2006-30, INRIA, LIP. 2006, pp.39. �inria-00111963v2�
inria-00111963, version 2 - 8 Dec 2006
a p p o r t
d e r e c h e r c h e
S
N
0
2
4
9
-6
3
9
9
IS
R
N
IN
R
IA
/R
R
--6
0
1
2
--F
R
+
E
N
G
Thème NUM
Strategies for Replica Placement in Tree Networks
Anne Benoit — Veronika Rehn — Yves Robert
N° 6012
Unité de recherche INRIA Rhône-Alpes
655, avenue de l’Europe, 38334 Montbonnot Saint Ismier (France)
Téléphone : +33 4 76 61 52 00 — Télécopie +33 4 76 61 52 52
Anne Benoit, Veronika Rehn , Yves Robert ThèmeNUMSystèmesnumériques
ProjetGRAAL
Rapportdere her he n°6012November200636pages
Abstra t: Inthispaper,wedis ussand ompareseveralpoli iestopla erepli asintreenetworks, subje tto server apa ityandQoS onstraints. The lientrequestsareknownbeforehand,while the number and lo ation of the servers are to be determined. The standard approa h in the literature is to enfor e that all requests of a lient be served by the losest server in the tree. Weintrodu eandstudy twonewpoli ies. Intherstpoli y, allrequestsfrom agiven lient are still pro essedbythesameserver,but thisserver anbelo atedanywhere in thepathfrom the lienttotheroot. Inthese ondpoli y,therequestsofagiven lient anbepro essedbymultiple servers.
Onemajor ontributionofthispaperistoassesstheimpa tofthesenewpoli iesonthetotal repli ation ost. Anotherimportantgoalistoassesstheimpa tofserverheterogeneity,bothfrom atheoreti alandapra ti alperspe tive. Inthispaper,weestablishseveralnew omplexityresults, andprovideseverale ientpolynomialheuristi sforNP- ompleteinstan esoftheproblem. These heuristi sare omparedto anabsolute lowerbound provided bytheformulationof theproblem intermsofthesolutionofanintegerlinearprogram.
Key-words: Repli a pla ement, tree networks, a ess poli y, s heduling, omplexity results, heuristi s,heterogeneous lusters.
Résumé : Dans e rapportnous présentonset omparons plusieurspolitiquesde pla ement de répliquessurdesarbres,prenanten ompteàlafoisdes ontraintesliéesàla apa itédetraitement de haque serveur et des ontraintes de type QoS(qualité de servi e). Les requêtesdes lients sont onnuesavantexé ution,alorsquelenombreet l'empla ementdesrépliques(serveurs)sont àdéterminerparl'algorithmede pla ement. L'appro he lassiqueimposequetouteslesrequêtes d'un lientdonnésoienttraitéesparunseulserveur,àsavoirlepluspro hedu lientdansl'arbre. Nous introduisons deux nouvelles politiques de pla ement. Dans la première, haque lient a toujoursunserveurunique,mais e dernierpeutêtresituén'importeoùsurle heminquimène du lient à la ra ine dans l'arbre. Ave la deuxième politique, les requêtes d'un même lient peuventêtretraitées parplusieursserveurssur emême hemin.
Nousmontronsque esdeuxnouvellespolitiquesdepla ementsontàmêmederéduirefortement le oût total de la répli ation. Un autre obje tif de e travail est l'analyse de l'impa t de l'hétérogénéité de la plate-forme, à la fois d'un point de vue théorique et pratique. Sur le plan théorique, nous établissons plusieurs résultats de omplexité, dans les adres homogène et hétérogène, pour l'appro he lassique et les nouvelles politiques. Surle plan pratique, nous on evons des heuristiques polynomiales pour les instan es ombinatoires du problème. Nous omparonslesperforman esde esheuristiquesen lesrapportantàuneborne inférieureabsolue surle oûttotaldelarépli ation; etteborneestobtenueparrelaxationd'unprogrammelinéaire ennombreentiersqui ara tériselasolutionoptimaleduproblème.
Mots- lés: Pla ementderépliques,réseauxenarbre,ordonnan ement, omplexité,heuristiques, grappesde al ul hétérogènes.
Contents
1 Introdu tion 4
2 Framework 5
2.1 Denitionsandnotations . . . 5
2.2 Probleminstan es . . . 6
2.2.1 Constraints . . . 6
2.2.2 Obje tivefun tion . . . 7
2.2.3 Simpliedproblems . . . 7
3 A esspoli ies 7 3.1 Impa tofthea ess poli y ontheexisten eofasolution. . . 8
3.2 Upwards versusClosest . . . 8
3.3 Multiple versusUpwards . . . 9
3.4 LowerboundfortheRepli aCountingproblem . . . 10
4 Complexity results 11 4.1 WithhomogeneousnodesandtheMultiple strategy . . . 12
4.1.1 Algorithmformultiple servers. . . 12
4.1.2 Example. . . 14
4.1.3 Proofof optimality . . . 14
4.2 WithhomogeneousnodesandtheUpwards strategy . . . 20
4.3 Withheterogeneousnodes . . . 21
5 Linearprogramming formulation 22 5.1 Singleserver. . . 22
5.2 Multipleservers. . . 23
5.3 AnILP-basedlowerbound . . . 24
6 Heuristi sforthe Repli a Costproblem 24 6.1 Closest. . . 25
6.2 Upwards . . . 26
6.3 Multiple . . . 27
7 Experiments: omparisonsof dierent a esspoli ies 29 7.1 Obtainingalowerbound . . . 29
7.2 Experimental plan . . . 29
7.3 Results. . . 30
8 Extensions 33 8.1 Withseveralobje ts . . . 33
8.2 More omplexobje tivefun tions . . . 33
9 Relatedwork 34
1 Introdu tion
Inthispaper,we onsider thegeneralproblem ofrepli apla ementin treenetworks. Informally, there are lients issuing requests to be satised by servers. The lients are known (both their positionin the treeand theirnumberofrequests), whilethe numberandlo ationof theservers are to be determined. A lientis aleaf node of thetree, and its requests anbe served byone orseveralinternal nodes. Initially, there arenorepli a;when anodeisequipped witharepli a, it anpro ess anumberofrequests,uptoits apa itylimit. Nodesequippedwitharepli a,also alledservers, anonlyserve lientslo atedin theirsubtree(sothattheroot,ifequippedwitha repli a, anserveany lient);thisrestri tionisusuallyadoptedtoenfor ethehierar hi alnature ofthetargetappli ationplatforms,whereanodehasknowledgeonlyofitsparentand hildrenin thetree.
The rule of the game is to assign repli as to nodes so that some optimization fun tion is minimized. Typi ally, this optimization fun tion is the total utilization ost of the servers. If all the nodes are identi al, this redu es to minimizing the number of repli as. If the nodes are heterogeneous,itisnaturaltoassigna ostproportionaltotheir apa ity(sothatonerepli aona node apableofhandling
200
requestsisequivalenttotworepli asonnodesof apa ity100
ea h). The oreof the paper is devoted to the study of the previous optimization problem, alled Repli aPla ementinthefollowing. Additional onstraintsareintrodu ed,su has guarantee-ingsomeQualityofServi e(QoS):therequestsmustbeservedinlimitedtime,therebyprohibiting tooremoteorhard-to-rea hrepli alo ations. Also,theowofrequeststhroughalinkinthetree annotex eedsomebandwidth-related apa ity. Wefo usonoptimizingthetotalutilization ost (orrepli anumberinthehomogeneous ase). Thereisabun hofpossibleextensions: dealingwith several obje ttypesrather thanone, in luding ommuni ationtime into the obje tivefun tion, taking into a ount anupdate ostof therepli as, and soon. Forthe sakeof larity wedevote aspe ialse tion (Se tion 8) to formulate theseextensions, and to des ribe whi h situations our resultsandalgorithms anstillapplyto.We point out that the distribution tree ( lients and nodes) is xed in our approa h. This keyassumptionis quitenaturalforabroadspe trum ofappli ations, su h asele troni ,ISP, or VODservi edelivery. Theroot serverhastheoriginal opyof thedatabasebut annotserveall lientsdire tly,soadistributiontreeisdeployedto provideahierar hi aland distributeda ess to repli asof theoriginal data. On the ontrary,in other, morede entralized, appli ations(e.g. allo ating Web mirrors in distributed networks), a two-step approa h is used: rst determine a good distribution tree in an arbitrary inter onne tion graph, and then determine a good pla ement ofrepli as among thetree nodes. Both stepsare interdependent,and the problemis mu hmore omplex,duetothe ombinatorialsolutionspa e(thenumberof andidatedistribution treesmaywellbeexponential).
ManyauthorsdealwiththeRepli aPla ementoptimizationproblem,andwesurveyrelated work inSe tion 9. Theobje tiveofthispaperistwofold: (i)introdu ingtwonewa ess poli ies and omparingthemwiththestandardapproa h;(ii)assessingtheimpa tofserverheterogeneity ontheproblem.
Inmost,ifnotall,papersfromtheliterature,allrequestsofa lientareservedbythe losest repli a,i.e. therstrepli afoundintheuniquepathfromthe lienttotherootinthedistribution tree. ThisClosest poli yissimpleandnatural,butmaybeunduly restri tive,leadingtoawaste of resour es. We introdu e and study two dierent approa hes: in the rst one, we keep the restri tionthat allrequests from agiven lientare pro essed by thesamerepli a, but we allow lientrequeststo traverse serverssoasto be pro essedby otherrepli as lo atedhigher in the path( losertotheroot). We allthisapproa htheUpwardspoli y. Thetrade-oftoexploreisthe following: theClosest poli yassignsrepli asatproximityofthe lients,but mayneedtoallo ate toomanyof them if somelo al subtreeissues agreat numberof requests. The Upwards poli y will ensureabetterresour e usage,load-balan ingthe pro ess of requestson alargers ale; the possibledrawba kisthat requestswill beservedby remoteservers,likelyto takelongertimeto pro ess them. Taking QoS onstraints into a ount would typi ally be more important for the Upwards poli y.
Inthese ondapproa h,wefurtherrelaxa ess onstraintsandgrantthepossibilityfora lient tobeassignedseveralrepli as. WiththisMultiple poli y,thepro essingofagiven lient'srequests willbesplitamongseveralserverslo atedinthetreepathfromthe lienttotheroot. Obviously, thispoli yis themostexible,and likelyto a hievethebest resour eusage. Theonlydrawba k isthe(modest)additional omplexityindu edbythefa tthatrequestsmustnowbetaggedwith therepli a serverIDin addition tothe lient ID.As alreadystated, onemajor obje tiveofthis paperisto omparethesethreea ess poli ies, Closest,Upwards andMultiple.
These ond major ontribution of the paper is to assess the impa t of serverheterogeneity, both from a theoreti al and a pra ti al perspe tive. Re ently, several variantsof the Repli a Pla ementoptimizationproblem withthe Closest poli y havebeenshown tohavepolynomial omplexity. Inthispaper,weestablishseveralnew omplexityresults.Thoseforthehomogeneous asearesurprising:forthesimplestinstan ewithoutQoSnorbandwidth onstraints,theMultiple poli yispolynomial(asClosest)whileUpwardsisNP-hard. Thethreepoli iesturnouttobe NP- ompleteforheterogeneousnodes,whi hprovidesyetanotherexampleoftheadditionaldi ulties indu edbyresour eheterogeneity. Onthemorepra ti alside,weprovideanoptimalalgorithm fortheMultiple problemwith homogeneousnodes,and severalheuristi sforallthree poli iesin theheterogeneous ase. We omparethese heuristi sthroughsimulations ondu tedforproblem instan es without QoSnor bandwidth onstraints. Another ontribution is that weare ableto assesstheabsoluteperforman eoftheheuristi s,notjust omparingonetotheother,owingtoa lowerboundprovidedbyanewformulationoftheRepli a Pla ementproblemintermsofan integerlinearprogram: the relaxationofthis program to therational numbers providesalower boundtothesolution ost(whi hisnotalwaysfeasible).
Therestofthepaperisorganizedasfollows. Se tion2isdevotedtoadetailedpresentationof thetargetoptimizationproblems. InSe tion3weintrodu ethethreea esspoli ies,andwegive afewmotivatingexamples. NextinSe tion4wepro eedtothe omplexityresultsforthesimplest versionoftheRepli aPla ementproblem,bothinthehomogeneousandheterogeneous ases. Se tion5dealswiththeformulationfortheRepli aPla ementproblemintermsofaninteger linear program. In Se tion 6 we introdu e several polynomial heuristi s to solvethe Repli a Pla ementproblem with the dierent a ess poli ies. These heuristi s are ompared through simulations,whoseresultsareanalyzedinSe tion7. Se tion8dis ussesvariousextensionstothe Repli aPla ementproblemwhileSe tion9isdevotedtoanoverviewofrelatedwork. Finally, westatesome on ludingremarksinSe tion 10.
2 Framework
Thisse tionisdevotedtoapre isestatementoftheRepli aPla ementoptimizationproblem. Westartwithsomedenitionsandnotations. Nextweoutlinethesimplestinstan eoftheproblem. Thenwedes ribeseveraltypesof onstraintsthat anbeaddedtotheformulation.
2.1 Denitions and notations
We onsideradistributiontree
T
whosenodesarepartitionedintoasetof lientsC
andasetof nodesN
. ThesetoftreeedgesisdenotedasL
. The lientsareleafnodesofthetree,whileN
is theset ofinternal nodes. It would beeasyto allow lient-server nodeswhi hplayboththerule ofa lientand ofan internalnode(possibly aserver), bydividingsu hanode into twodistin t nodesinthetree, onne tedbyanedgewithzero ommuni ation ost.A lient
i ∈ C
ismakingrequests todatabaseobje ts. Forthesakeof larity,werestri tthe presentation to asingleobje ttype,hen easingledatabase. Wedealwith severalobje ttypes inSe tion 8.A node
j ∈ N
may ormay not have been provided with a repli a of the database. Nodes equippedwitharepli a (i.e. servers) anpro ess requestsfrom lientsin theirsubtree. Inother words,thereisauniquepathfrom a lienti
totherootofthetree,andea hnodein thispathis eligibletopro esssomeoralltherequestsissuedbyi
whenprovidedwitharepli a.Let
r
betherootofthetree. Ifj ∈ N
,then hildren(j)
isthesetof hildrenofnodej
. Ifk 6= r
isanynodeinthetree(leaforinternal),parent(k)
isitsparentinthetree. Ifl : k → k
′
=
parent
(k)
isanylink in thetree,then su(l)
is thelinkk
′
→
parent
(k
′
)
(whenitexists). Let An estors
(k)
denote theset ofan estors ofnodek
, i.e. thenodesin theuniquepath that leadsfromk
upto therootr
(k
ex luded). Ifk
′
∈
An estors
(k)
,thenpath[k → k
′
]
denotesthesetoflinksinthepath from
k
tok
′
;also, subtree
(k)
isthesubtreerootedink
,in ludingk
. Weintrodu emorenotationstodes ribeoursysteminthefollowing. Clients
i ∈ C
Ea h lienti
(leafofthetree)issendingr
i
requestspertimeunit. Forsu h requests,therequiredQoS(typi ally,aresponsetime)isdenotedqi
,andweneedtoensure that thisQoSwillbesatisedforea h lient. Nodes
j ∈ N
Ea hnodej
(internalnodeofthetree)hasapro essing apa ityWj
,whi h is the totalnumberof requests that it anpro ess pertime-unit when ithas arepli a. A ostisalsoasso iatedtoea hnode,sj
,whi hrepresentsthepri etopaytopla earepli a atthisnode. Withasingleobje ttypeitisquitenaturaltoassumethat sj
isproportional to Wj
: the morepowerfulaserver, themore ostly. Butwith several obje tswe may use non-relatedvaluesof apa ityand ost. Communi ation links
l ∈ L
Theedges ofthe tree representthe ommuni ationlinks betweennodes(leafandinternal). Weassign a ommuni ationtime omml
onlinkl
whi h is thetime requiredto send a request through the link. Moreover, BWl
is the maximum numberofrequeststhatlinkl
an transmitpertimeunit.2.2 Problem instan es
For ea h lient
i ∈ C
, letServers(i) ⊆ N
be theset of serversresponsiblefor pro essing at least oneof itsrequests. Wedo notspe ify herewhi h a ess poli y is enfor ed(e.g. oneormultiple servers), wedefer this to Se tion 3. Instead, weletr
i,s
bethe numberof requestsfrom lienti
pro essedbyservers
(of ourse,P
s∈
Servers(i)
r
i,s
= r
i
). Inthefollowing,R
isthesetofrepli as:R = {s ∈ N | ∃i ∈ C , s ∈
Servers(i)} .
2.2.1 Constraints
Threemain typesof onstraintsare onsidered.
Server apa ity The onstraintthatnoserver apa ity anbeex eededispresentin all vari-antsoftheproblem:
∀s ∈ R,
X
i∈C|s∈
Servers(i)
r
i,s
≤
Ws
QoS Someproblem instan esenfor e aqualityofservi e: thetimeto transferarequestfrom a lienttoarepli aserverisboundedbyaquantityq
i
. Thistranslatesinto:∀i ∈ C, ∀s ∈
Servers(i),
X
l∈
path[i→s]
omm
l
≤
qi
.
NotethatitwouldbeeasytoextendtheQoS onstraintsoastotakethe omputation ost of a requestin additionto its ommuni ation ost. This former ost isdire tly relatedto the omputationalspeedoftheserverandtheamountof omputation(inops)requiredfor ea hrequest.
Link apa ity Someprobleminstan esenfor eaglobal onstraintonea h ommuni ationlink
l ∈ L
:X
i∈C,s∈
Servers(i)|l∈
path[i→s]
2.2.2 Obje tivefun tion
Theobje tivefun tion fortheRepli a Pla ementproblemisdened as: Min
X
s∈R
s
s
Asalreadypointedout,itisfrequentlyassumedthatthe ostofaserverisproportionaltoits apa ity,soin someprobleminstan eswelets
s
=
Ws
.2.2.3 Simpliedproblems
Wedeneafewsimplied probleminstan esinthefollowing:
QoS=distan e We an simplify the expression of the ommuni ationtime in the QoS on-straintandonly onsiderthedistan e(innumberofhops)betweena lientanditsserver(s). TheQoS onstraintisthen
∀i ∈ C, ∀s ∈
Servers(i), d(i, s) ≤
qi
where the distan e
d(i, s) = |
path[i → s]|
is thenumberof ommuni ationlinks betweeni
ands
.No QoS Wemayfurthersimplifytheproblem,by ompletelysuppressingtheQoS onstraints. Inthis ase,theservers anbeanywhereinthetree,theirlo ationisindierenttothe lient. No link apa ity Wemay onsidertheproblemassuminginnitelink apa ity,i.e. not
bound-ingthetotaltra onanylinkin anadmissiblesolution.
Only server apa ities The problem without QoS and link apa ities redu es to nding a validsolutionofminimal ost,wherevalid meansthat noserver apa ityisex eeded. We nameRepli a Costthisfundamental problem.
Repli a ounting We anfurthersimplifythepreviousRepli aCostprobleminthe homo-geneous ase: withidenti al servers,theRepli a Costproblem amountsto minimizethe numberofrepli asneededtosolvetheproblem. Inthis ase,thestorage osts
j
issetto1
forea hnode. We allthisproblemRepli a Counting.3 A ess poli ies
Inthis se tionwereview theusualpoli ies enfor ingwhi h repli a is a essedby agiven lient. Considerthat ea h lient
i
is makingr
i
requestspertime-unit. There aretwos enariosfor the numberofserversassignedtoea h lient:Single server Ea h lient
i
isassignedasingleserverserver(i)
,thatisresponsibleforpro essing allitsrequests.Multipleservers A lient
i
maybeassigned severalserversin aset Servers(i)
. Ea h servers ∈
Servers(i)
willhandleafra tionr
i,s
oftherequests. Of ourseP
s∈
Servers(i)
r
i,s
= r
i
. To thebest of ourknowledge, the single serverpoli y has beenenfor ed in all previous ap-proa hes. Oneobje tiveofthispaperistoassesstheimpa tofthisrestri tionontheperforman e ofdatarepli ationalgorithms. Thesingleserverpoli ymayproveausefulsimpli ation,butmay omeat thepri eofanon-optimalresour eusage.Intheliterature, thesingleserverstrategyisfurther onstrainedto theClosest poli y. Here, theserverof lient
i
is onstrainedtobetherstserverfoundonthepaththatgoesfromi
upwards to therootof thetree. Inparti ular, onsider a lienti
andits serverserver(i)
. Then anyotherlientnode
i
′
residinginthe subtreerootedin server
(i)
will beassignedaserverin that subtree. Thisforbidsrequestsfromi
′
totraverse server
(i)
andbeservedhigher( losertotherootinthe tree).Werelaxthis onstraintintheUpwards poli ywhi histhegeneralsingleserverpoli y. Noti e that a solutionto Closest alwaysis a solutionto Upwards, thus Upwards is alwaysbetter than Closest in termsof the obje tivefun tion. Similarly, the Multiple poli y is always better than Upwards,be auseitisnot onstrainedbythesingleserverrestri tion.
Thefollowingse tionsillustratethethreepoli ies. Se tion3.1providessimpleexampleswhere thereisavalidsolutionforagivenpoli y,butnoneforamore onstrainedone. Se tion3.2shows that Upwards an bearbitrarily better thanClosest, while Se tion 3.3 shows that Multiple an bearbitrarily better thanUpwards. We on ludewith an exampleshowingthat the ost of an optimal solutionof theRepli a Counting problem (for any poli y) anbe arbitrarily higher thantheobviouslowerbound
P
i∈C
r
i
W
,
where
W
istheserver apa ity.3.1 Impa t of the a ess poli y on the existen e of a solution
We onsider here a very simpleinstan e of the Repli a Counting problem. In this example thereare twonodes,
s
1
beingtheunique hildofs
2
,thetreeroot (seeFigure1). Ea h node an pro essW= 1
request. (b) (a) ( )W = 1
1
s
2
s
1
1
1
s
2
s
1
s
2
s
1
2
Figure1: A esspoli ies.
If
s
1
hasone lient hildmaking1
request,theproblemhasasolutionwithallthreepoli ies, pla ingarepli aons
1
orons
2
indierently(Figure1(a)). If
s
1
hastwo lient hildren,ea hmaking1
request,theproblemhasnomoresolutionwith Closest. However,wehaveasolutionwithbothUpwards and Multiple ifwepla erepli as onbothnodes. Ea hserverwillpro esstherequestofoneofthe lients(Figure1(b)). Finally,ifs
1
hasonlyone lient hildmaking2
requests,onlyMultiple hasasolutionsin eweneedto pro essonerequeston
s
1
and theotherons
2
, thusrequestingmultiple servers (Figure1( )).This exampledemonstratestheusefulnessofthe newpoli ies. The Upwards poli yallowsto ndsolutionswhenthe lassi alClosest poli ydoesnot. ThesameholdstrueforMultiple versus Upwards. Inthefollowing,we omparethe ostofsolutionsobtainedwithdierentstrategies.
3.2 Upwards versus Closest
In the following example, we onstru t an instan e of Repli a Counting where the ost of theUpwards poli y isarbitrarilylowerthanthe ostof theClosest poli y. We onsiderthetree networkof Figure2,where thereare
2n + 2
internalnodes, ea h withWj
= W = n
, and2n + 1
lients,ea hwithr
i
= r = 1
.WiththeUpwards poli y,wepla ethreerepli asin
s
2n
,s
2n+1
ands
2n+2
. Allrequests anbe satisedwiththese threerepli as.s
1
1
1
s
2n
s
2n+2
s
2n+1
W = n
1
Figure2: Upwards versusClosest
When onsideringtheClosest poli y,rstweneedtopla earepli ain
s
2n+2
to overits lient. Then, Eitherwepla earepli aon
s
2n+1
. Inthis ase,thisrepli aishandlingn
requests,butthere remainn
otherrequestsfromthe2n
lientsinitssubtreethat annotbepro essedbys
2n+2
. Thus,weneedtoaddn
repli as betweens
1
..s
2n
. Otherwise,
n − 1
requestsofthe2n
lientsinthesubtreeofs
2n+1
anbepro essedbys
2n+2
in additiontoitsown lient. Weneedtoaddn + 1
extrarepli asamongs
1
, s
2
, . . . , s
2n
. Inboth ases,wearepla ingn+2
repli as,insteadofthe3
repli asneededwiththeUpwardspoli y. This proves that Upwards an be arbitrary better than Closest on some Repli a Counting instan es.3.3 Multiple versus Upwards
Inthisse tion webuildan instan eoftheRepli a Countingproblem whereMultiple istwi e better than Upwards. We do not know whether there exist instan es of Repli a Counting wheretheperforman eratioofMultiple versusUpwards ishigherthan
2
(andwe onje turethat this is not the ase). However, we also build an instan e of the Repli a Costproblem (with heterogeneousnodes)whereMultiple isarbitrarilybetterthanUpwards.Westartwiththehomogeneous ase. Considertheinstan eofRepli aCountingrepresented inFigure3,with
3n + 1
nodesof apa ityWj
= W = 2n
. Therootr
hasn + 1
hildren,n
nodes labeleds
1
tos
n
anda lientwithr
i
= n
. Ea hnodes
j
hastwo hildrennodes,labeledv
j
andw
j
for1 ≤ j ≤ n
. Ea hnodev
j
hasaunique hild,a lientwithr
i
= n
requests;ea hnodew
j
hasa unique hild,a lientwithr
i
= n + 1
requests.TheMultiple poli y assigns
n + 1
repli as, oneto the rootr
and one to ea h nodes
j
. The repli a ins
j
anpro essallthe2n + 1
requestsin its subtreeex eptone,whi h ispro essed by theroot.FortheUpwards poli y,weneedtoassignonerepli ato
r
,to overits lient. Thisrepli a an pro essn
otherrequests, forinstan ethosefrom the lient hildofv
1
. Weneedto pla eat least arepli a ins
1
orinw
1
,and2(n − 1)
repli asinv
j
andw
j
for2 ≤ j ≤ n
. Thisleadstoatotalof2n
repli as, hen eaperforman efa tor2n
n+1
whoselimitisto2
whenn
tendsto innity.We now pro eed to the heterogeneous ase. Consider the instan e of Repli a Cost rep-resented in Figure 4, with
3
nodess
1
,s
2
ands
3
, and2
lients. The apa ity ofs
1
ands
2
is W1
=
W2
= n
while that ofs
3
is W3
= Kn
, whereK
is arbitrarily large. Re all that in the Repli aCostproblem,weletsj
=
Wj
forea hnode. Multiple assigns2
repli as,ins
1
ands
2
, hen ehas ost2n
. TheUpwards poli yassignsarepli atos
1
to overits hild,andthen annotn + 1
n
n + 1
s
1
s
2
W = 2n
r
v
2
v
1
n + 1
s
n
v
n
n
w
1
w
2
w
n
n
n
Figure3: Multiple versusUpwards,homogeneousplatforms.
n + 1
s
1
, W
1
=
n
s
2
, W
2
=
n
s
3
, W
3
=
Kn
n − 1
Figure4: Multiple versusUpwards,heterogeneousplatforms.
use
s
2
to pro ess the requests of the hildin its subtree. It must pla e arepli a ins
3
, hen ea nal ostn + Kn = (K + 1)n
arbitrarilyhigherthanMultiple.3.4 Lower bound for the Repli a Counting problem
Obviously, the ostof an optimal solutionof theRepli a Countingproblem (for any poli y) annotbelowerthantheobviouslowerbound
l
P
i∈C
r
i
W
m
,where
W
istheserver apa ity. Indeed, this orrespondsto asolutionwherethetotalrequestloadissharedasevenlyaspossibleamong therepli as.Thefollowinginstan eof Repli a Countingshowsthat theoptimal ost anbearbitrarily higherthanthislowerbound. ConsiderFigure5,with
n + 1
nodesof apa ityWj
= W
,Therootr
hasn + 1
hildren,n
nodes labeleds
1
tos
n
, and a lient withr
i
= W
. Ea h nodes
j
has a unique hild,a lientwithr
i
= W/n
(assumewithoutlossofgeneralitythatW
isdivisiblebyn
). The lowerbound isl
P
i∈C
r
i
W
m
=
2W
W
= 2
. However, ea h of the three poli ies Closest, Upwards andMultiplewillassignarepli atotherootto overits lient,andwillthenneedn
extrarepli as, oneper lientofs
j
,1 ≤ j ≤ n
. The total ostisthusn + 1
repli as, arbitrarilyhigher thanthe lowerbound.s
1
W/n
W/n
r
s
n
W
Figure5: Thelowerbound annotbeapproximatedforRepli a Counting.
All the examples in Se tions 3.1 to 3.4 give an insight of the ombinatorial nature of the Repli a Pla ementoptimization problem, even in its simplest variants Repli a Cost and Repli a Counting. The followingse tion orroboratesthis insight: mostproblems areshown NP-hard,eventhoughsomevariantshavepolynomial omplexity.
4 Complexity results
One major goal of this paper is to assess the impa t of the a ess poli y on the problem with homogeneousvsheterogeneousservers. Werestri ttothesimplestproblem,namelytheRepli a Cost problem introdu ed in Se tion 2.2.3. We onsider a tree
T = C ∪ N
, no QoS onstraint, and innite link apa ities. Ea h lienti ∈ C
hasr
i
requests; ea h nodej ∈ N
has pro essing apa ity Wj
and storage ostsj
= W
j
. This simpleproblem omes in twoavors,either with homogeneousnodes(Wj
=
Wforallj ∈ N
), orwithheterogeneousnodes(serverswithdierent apa ities/ osts).Inthe singleserver versionof the problem, weneed to nd a serverserver
(i)
for ea h lienti ∈ C
. LetServers bethesetofservers hosenamongthenodesinN
. Theonly onstraintisthat server apa ities annotbeex eeded: thistranslatesintoX
i∈C,
server(i)=j
r
i
≤ W
j
forallj ∈
Servers.
The obje tiveis to nd avalid solutionof minimal storage ost
P
j∈
ServersW
j
. Note that with homogeneous nodes, the problem redu es to nd the minimum number of servers, i.e. to the Repli aCountingproblem. AsoutlinedinSe tion3,therearetwovariantsofthesingleserver versionoftheproblem, namelytheClosest andtheUpwards strategies.
IntheMultiple poli y withmultiple serversper lient,letServersbethesetofservers hosen amongthenodesin
N
;forany lienti ∈ C
andanynodej ∈ N
,letr
i,j
bethenumberofrequests fromi
that arepro essedbyj
(r
i,j
= 0
ifj /
∈
Servers ). WeneedtoensurethatX
j∈N
r
i,j
= r
i
foralli ∈ C.
The apa ity onstraintnowwrites
X
i∈C
r
i,j
≤ W
j
forallj ∈
Servers,
whiletheobje tivefun tion isthesameasforthesingleserverversion.
Thede isionproblemsasso iatedwiththepreviousoptimizationproblemsareeasyto formu-late: givenaboundonthenumberofservers(homogeneousversion) oronthetotalstorage ost (heterogeneousversion),isthereavalidsolutionthat meetsthebound?
Homogeneous Heterogeneous Closest polynomial[2,9℄ NP- omplete Upwards NP- omplete NP- omplete Multiple polynomial NP- omplete
Table1: Complexityresultsforthedierentinstan esoftheRepli a Costproblem.
Table1 apturesthe omplexityresults. These omplexity resultsareallnew,ex ept forthe Closest/Homogeneous ombination. The NP- ompleteness of the Upwards/Homogeneous ase omes as a surprise, sin e all previously known instan es were shown to be polynomial, using dynami programmingalgorithms. Inparti ular,theClosest/Homogeneousvariantremains poly-nomial whenadding ommuni ation osts [2℄ orQoS onstraints[9℄. PreviousNP- ompleteness results involved generalgraphs rather than trees, and the ombinatorial nature of the problem amefrom thedi ultyto extra tagoodrepli a treeoutofanarbitrary ommuni ationgraph. Herethetreeisxed,but theproblem remains ombinatorial duetoresour eheterogeneity.
4.1 With homogeneous nodes and the Multiple strategy
Theorem1. Theinstan eofthe Repli a Countingproblem withthe Multiplestrategy an be solved inpolynomialtime.
Proof. Weoutlinebelowanoptimalalgorithmto solvetheproblem. Theproof ofoptimalityis quitete hni al,sothereadermaywantto skipitatrstreading.
4.1.1 Algorithmformultiple servers
We propose agreedy algorithm to solvethe Repli a Counting problem. Let W be the total numberofrequeststhat aserver an handle.
Thisalgorithmworksinthreepasses: rstwesele tthenodeswhi hwillhavearepli ahandling exa tlyWrequests. Thenase ondpassallowsus tosele tsomeextraserverswhi harefullling theremainingrequests. Finally,weneedtode ideforea hserverhowmanyrequestsofea h lient itispro essing.
Weassumethatea h node
i
knowsitsparentparent(i)
andits hildren hildren(i)
in thetree. We introdu e a new variable whi h is the ow oming up in the tree (requests whi h are not alreadyfullledbyaserver). Itisdenotedbyowi
fortheowbetweeni
andparent(i)
. Initially,∀i ∈ C
owi
= r
i
and∀i ∈ N
owi
= −1
. Moreover,theset ofrepli asisemptyinthebeginning:repl = ∅
.Pass 1 We greedily sele t in this step some nodes whi h will pro ess W requests and whi h are as loseto theleavesas possible. We pla earepli a onsu h nodes(see Algorithm1). Pro edurepass1is alledwith
r
(rootofthetree)asaparameter,anditgoesdownthetree re ursivelyinorder to omputetheows. Whenaowex eedsW , wepla earepli asin e the orrespondingserverwillbefullyused, andweremovethepro essedrequestsfrom the owgoingupwards.At theend, if
f low
r
= 0
or (f low
r
≤
W andr /
∈ repl
), wehaveanoptimal solutionsin e allrepli as whi h havebeenpla edarefully usedandallrequestsaresatisedbyaddinga repli a inr
iff low
r
6= 0
. Inthis aseweskippass2andgodire tlyto pass3.Otherwise, we need someextra repli as sin e somerequests are not satisedyet, and the root annotsatisfyalltheremainingrequests. Topla e theseextrarepli as,wegothrough pass2.
Pass 2 Inthispass,weneedtosele tthenodeswheretoaddrepli as. Todoso,whilethereare toomany requestsgoing upto the root, wesele t thenodewhi h anpro essthe highest number of requests, and we pla e a repli a there. The number of requests that a node
pro edurepass1(node
s ∈ N
) beginf low
s
= 0
;for
i ∈
hildren(s)
doif
f low
i
== −1
thenpass1(i)
;//Re ursive all.f low
s
= f low
s
+ f low
i
; endif
f low
s
≥
Wthenf low
s
= f low
s
−
W; repl = {s} ∪ repl
; endAlgorithm1: Pro edurepass1
j ∈ N
aneventuallypro essistheminimumoftheowsbetweenj
andtherootr
,denoteduf low
j
(forusefulow). Indeed, somerequestsmayhavenoserveryet,buttheymightbe pro essedbyaserveronthepathbetweenj
andr
,wherearepli ahasbeenpla edinpass1. Algorithm2detailsthispass.If weexit this pass with
f inish = −1
, this meansthat we havetried to pla e repli as on all nodes,but thissolutionis notfeasiblesin e thereare still somerequestswhi hare not pro essedgoinguptotheroot. Inthis ase,theoriginalprobleminstan ehadnosolution. However,ifwesu eedtopla erepli assu hthatf low
r
= 0
,wehaveasetofrepli aswhi h su eedtopro essallrequests. Wethengothroughpass3toassignrequeststoservers,i.e. to omputehowmanyrequestsofea h lientshouldbepro essedbyea hserver.while
f low
r
6= 0
dof reenode = N \ repl
;if
f reenode == ∅
thenf inish = −1;
exittheloop; // Atea h step,assign1repli aandre- omputeows.child =
hildren(r); uf low
r
= f low
r
; whilechild! = ∅
doremove
j
fromchild
;uf low
j
= min(f low
j
, uf low
parent(j)
)
;child = child ∪
hildren(j)
; end// The usefulows havebeen omputed, sele tthemax. maxuow=0;
for
j ∈ f reenode
doif
uf low
j
> maxuf low
thenmaxuf low = uf low
j
; maxnode = j
; endif
maxuf low 6= 0
thenrepl = repl ∪ {maxnode}
; // Update theows upwards.for
j ∈
An estors(maxnode) ∪ {maxnode}
dof low
j
= f low
j
− maxuf low
; endelse
f inish = −1;
exittheloop; endAlgorithm2: Pass2
Pass 3 Thispassisinfa tstraightforward,startingfromtheleavesanddistributingtherequests to theserversfrom thebottom until the top ofthe tree. Wede ide for instan e to ae t requestsfrom lientsstartingtotheleft. Pro edurepass3is alledwith
r
(rootofthetree) as a parameter, and it goes down the tree re ursively ( .f. Algorithm 3). Fori ∈ C
,r
′
i
is the number of requests of
i
not yet ae ted to a server(initiallyr
′
i
= r
i
).w
s,i
is therequestsae ted to
s
.C(s)
isthesetof lientsinsubtree(s)
whi hstillhavesomerequests notae ted. Initially,C(i) = {i}
fori ∈ C
,andC(s) = ∅
otherwise.Notethat aserverwhi hwas omputingWrequestsinpass1mayendup omputingfewer requests ifoneof itsdes endantsin thetree hasearned arepli a in pass2. Butthis does notae ttheoptimalityoftheresult,sin ewekeepthesamenumberofrepli as.
pro edurepass3(node
s ∈ N
) beginw
s
= 0
;for
i ∈
hildren(s)
doif
C(i) = ∅
thenpass3(i)
; //Re ursive all.C(s) = C(s) ∪ C(i)
; end ifs ∈ repl
then fori ∈ C(s)
do ifr
′
(i) ≤
W
− w
s
thenC(s) = C(s) \ {i}; w
s,i
= r
′
i
;w
s
= w
s
+ r
′
i
; r
′
i
= 0
; end ifC(s) 6= ∅
thenLeti ∈ C(s); x =
W− w
s
; r
′
i
= r
′
i
− x; w
s,i
= x; w
s
=
W; end endAlgorithm3: Pro edurepass3
Theproofin Se tion4.1.3showstheequivalen ebetweenthesolutionbuiltbythisalgorithm and any optimal solution,thus proving theoptimalityof the algorithm. The following example illustratesthestepbystepexe utionofthealgorithm.
4.1.2 Example
Figure 6(a) provides anexample of network on whi h weare pla ing repli as with theMultiple strategy. ThenetworkisthushomogeneousandwexW
= 10
.Pass 1ofthe algorithm isquitestraightforwardto unroll, andFigure 6(b) indi ates theow onea hlink andthesaturatedrepli asarethebla knodes.
Duringpass2,wesele tthenodesofmaximumusefulow. Figure6( )representstheseuseful ows;weseethatnode
n
4
is theonewiththemaximumusefulow(7
),soweassign itarepli a andupdatetheusefulows. All theusefulowsarethenredu eddownto1
sin ethereisonly1
requestgoingthroughtherootn
1
. Therstnodeofmaximumusefulow1
tobesele tedisn
2
, whi hissettobearepli aofpass2. Theowattherootisthen0
anditis theendofpass2.Finally,pass3ae tstheserverstothe lientsandde ideswhi hrequestsareservedbywhi h repli a(Figure6(d)). Forinstan e,the lientwith
12
requestssharesitsrequestsbetweenn
10
(10
requests)andn
2
(2
requests). Requestsare ae ted from thebottom ofthe treeupto the top. Note that the rootn
1
, even though itwasa saturatedrepli a of pass 1, hasonly5
requests to pro eedin theend.4.1.3 Proof of optimality
Let
R
opt
beanoptimalsolutiontoaninstan e oftheproblem. The oreoftheproof onsistsin transformingthissolutionintoanequivalent anoni aloptimalsolutionR
can
. Wewillthenshow thatouralgorithmisbuildingthis anoni alsolution,andthusitisprodu inganoptimalsolution.Ea h server
s ∈ R
opt
isservingw
s,i
requestsof lienti ∈
subtree(s) ∩ C
,andw
s
=
X
i∈
subtree(s)∩C
w
s,i
≤ W.
Forea h
i ∈ C
,w
s,i
= 0
ifs ∈ N
isnotarepli a,and,P
00
00
11
11
00
11
00
00
11
11
00
11
(a) Initial network
n
1
n
2
n
3
n
4
2
2
n
5
n
6
n
7
n
8
n
9
n
10
n
11
12
1
1
9
7
W = 10
n
1
n
2
n
3
n
4
2
2
n
5
n
6
n
7
n
8
n
9
n
10
n
11
12
1
1
9
7
7
2
2
2
4
6
2
12
1
1
7
7
4
1
(b) Pass 13
7
4
8
4
3
7
3
3
4
2
2
12
1
1
9
7
7
3
4
n
1
n
2
n
3
n
4
2
2
n
5
n
6
n
7
n
8
n
9
n
10
n
11
12
1
1
9
7
7
2
2
2
4
2
( )Pass 23
(d) Pass 32
2
10
2
1
1
1
4
3
7
3
4
8
9
2
6
1
1
1
1
1
1
1
4
3
3
3
3
4
4
8
7
2
Figure 6: AlgorithmfortheRepli a CountingproblemwiththeMultiple strategy.
Wedenetheow ofnode
k
, owk
,bythenumberofrequestsgoingthroughthisnodeupto itsparents. Thus, fori ∈ C
,f low
i
= r
i
,whileforanodes ∈ N
,f low
s
=
X
i∈
hildren(s)
f low
i
− w
s
.
Thetotalowgoingthroughthetree,
tf low
,isdenedinasimilarway,ex eptthatwedonot removefrom theowtherequestspro essedby arepli a,i.e.tf low
s
=
P
i∈
hildren(s)
tf low
i
. We thushavetf low
s
=
X
i∈
subtree(s)∩C
r
i
.
Thesevariablesare ompletelydenedbythenetwork andtheoptimalsolution
R
opt
.Arstlemmashowsthatitispossibleto hangerequestassignmentswhilekeepinganoptimal solution. Theowsneedtobere omputedafteranysu hmodi ation.
Lemma1. Let
s ∈ N ∩ R
opt
beaserver su hthatw
s
<
W. If
tf low
s
≥
W,we an hangetherequestassignmentbetweenrepli asoftheoptimalsolution, in su hawaythatw
s
=
W . Otherwise, we an hangethe requestassignmentso that
w
s
= tf low
s
.Proof. Firstwepointoutthat the lientsin subtree
(s)
an allbeservedbys
,and sin eR
opt
is asolution, these requestsare served byarepli a somewhere in thetree. We do notmodify the optimalityofthesolutionby hangingthew
s,i
,itjust ae tstheowsofthesolution. Thus, for agiven lienti ∈
subtree(s) ∩ C
,ifthereisarepli as
′
6= s
onthepathbetween
i
andtheroot,we an hangetheassignmentoftherequestsof lienti
. Letx = max(w
s
′
,i
,
W− w
s
)
. Thenwemovex
requests,i.e.w
s
′
,i
= w
s
′
,i
− x
andw
s,i
= w
s,i
+ x
. Fromthedenition oftf low
s
,weobtainthe result,ifwemoveallpossiblerequeststos
untiltherearenomorerequestsinthesubtreeoruntils
ispro essingW
requests.We now introdu e a new denition, ompletely independent from the optimal solution but relatedtothetreenetwork. The anoni alow isobtainedbydistinguishingnodeswhi hre eive aowgreaterthan
W
from theother nodes. We omputethe anoni alowcf low
of thetree, independentlyoftherepli apla ement,anddeneasubsetofnodeswhi haresaturated,SN
. We also omputethenumberofsaturatednodesin subtree(k)
,denotednsn
k
,foranynodek ∈ C ∪ N
ofthetree.For
i ∈ C
,cf low
i
= r
i
andnsn
i
= 0
,andwethen omputere ursivelythe anoni alowsfor nodess ∈ N
. Letf
s
=
P
i∈
hildren(s)
cf low
i
and
x
s
=
P
i∈
hildren(s)
nsn
i
. If
f
s
≥
Wthens ∈ SN
,cf low
s
= f
s
−
Wandnsn
s
= x
s
+ 1
. Otherwise,s
isnotsaturated,cf low
s
= f
s
andnsn
s
= x
s
. We andedu efrom thesedenitionsthefollowingresults:Proposition 1. Anon saturatednode always has a anoni al ow beinglessthanW:
∀s ∈ N \ SN cf low
s
<
WLemma2. For allnodes
s ∈ C ∪ N
,cf low
s
= tf low
s
− nsn
s
×
W. Corollary 1. Forallnodess ∈ C ∪ N
,tf low
s
≥ nsn
s
×
W.Proof. Proposition1istrivialdueto thedenition ofthe anoni alow. Lemma2 an beprovedre ursivelyonthetree.
Thispropertyis trueforthe lients: for
i ∈ C
,nsn
i
= 0
andtf low
i
= cf low
i
= r
i
. Lets ∈ N
,andletusassumethat thepropositionistrueforall hildrenofs
. Then,∀j ∈
hildren(s) cf low
j
= tf low
j
− nsn
j
×
W.
If
s /
∈ SN
,nsn
s
=
P
j∈
hildren(s)
nsn
j
andcf low
s
=
X
j∈
hildren(s)
cf low
j
=
X
j∈
hildren(s)
(tf low
j
− nsn
j
×
W) = tf low
s
− nsn
s
×
W Ifs ∈ SN
,nsn
s
=
P
j∈
hildren(s)
nsn
j
+ 1
andcf low
s
=
X
j∈
hildren(s)
cf low
j
−
W=
X
j∈
hildren(s)
(tf low
j
− nsn
j
×
W) −
W= tf low
s
− (nsn
s
− 1) ×
W−
W= tf low
s
− nsn
s
×
Wwhi hprovesthe result. Corollary1istriviallydedu ed fromLemma 2sin e
cf low
is apositive fun tion.Wealsoshowthatitisalwayspossibletomovearepli a intoafreeserverwhi h isoneofits an estorsinthetree,whilekeepinganoptimalsolution:
Proposition 2. Let
R
opt
be an optimal solution, and lets ∈ R
opt
. If∃s
′
∈
An estors
(s) \ R
opt
thenR
′
Proof.
s
′
anhandle allrequestswhi hwerepro essed by
s
sin es ∈
subtree(s
′
)
. Wejust need toredene
w
s
′
,i
= w
s,i
foralli ∈ C
andthenw
s,i
= 0
.Wearenowready to transform
R
opt
into anewoptimal solution,R
sat
,byredistributing the requestsamongtherepli asandmovingsomerepli as,inordertopla earepli aatea hsaturated node, andae tingWrequeststo thisrepli a. This transformationis donestartingat theleaves ofthetree,and onsideringallnodesofSN
. Nothingneedstobedonefortheleaves(the lients) sin etheyarenotinSN
.Letus onsider
s ∈ SN
, and assumethat theoptimal solutionhasalready been modiedto pla earepli a,andassignitWrequests,onallnodesinsubSN = SN ∩
subtree(s) \ {s}
.Weneedtodierentiatetwo ases:
1. If
s ∈ R
opt
, we do not need to move any repli a. However, ifw
s
6=
W , we hange the assignmentofsomerequestswhilekeepingthesamerepli asin orderto obtainaworkload of W on servers
. We donot remove requestsfrom the saturatedserversofsubSN
whi h havealreadybeenlled. Corollary1ensuresthattf low
s
≥ nsn
s
×
W , and(nsn
s
− 1) ×
W requests should notmovesin etheyare ae ted tothensn
s
− 1
serversofsubSN
. There arethusstill morethanWrequestsof lientsofsubtree(s)
whi h anpossiblybemovedons
usingLemma 1.2. If
s /
∈ R
opt
, we need to move a repli a ofR
opt
and pla e it ins
without hanging the optimalityofthesolution. Wedierentiatetwosub ases.(a) If
∃s
1
∈
subtree(s) ∩ R
opt
\ SN
, then therepli a pla edons
1
anbemoved ins
by applyingProposition2. Then,ifw
s
6=
W,weapply ase1abovetosaturatetheserver. (b) Otherwise,alltherepli aspla edinsubtree(s)
arealsoinSN
,and theow onsumed bythealreadymodiedoptimalalgorithmisexa tly(nsn
s
− 1) ×
W. Itiseasytosee thattheow(oftheoptimalsolution)ats
isexa tlyequaltothetotalowminusthe onsumedow. Therefore,f low
s
= tf low
s
− (nsn
s
− 1) ×
W ,andwiththeappli ation ofCorollary1,f low
s
≥
W.Theideanow onsistsinae tingtherequestsofthisowtonode
s
byremovingwork fromtherepli asupwardstotheroot,andrearrangetheremainingrequeststoremove onerepli a. Theowf low
s
isgoingupwardstobepro essedbysomeofthenr
s
repli as in An estors(s) ∩ R
opt
, denoteds
1
, ..., s
nr
s
,s
1
beingthe losestnodefroms
. We an removeW of these requests from theowand ae t them to anew repli a pla edins
. Letw
s
k
,s
=
P
j∈
subtree(s)∩C
w
s
k
,j
. WehaveP
k=1..nr
s
w
s
k
,s
= f low
s
. Wemovetheserequestsfrom
s
k
tos
, startingwithk = 1
. Thus,after themodi ation,w
s
1
,s
= 0
. It is howeverpossible thatw
s
1
6= 0
sin es
1
maypro essrequests whi h are not oming from subtree(s)
. Inthis ase, wearesurethat wehaveremovedenoughrequestsfroms
k
,k = 2..nr
s
whi h aninstead pro ess requests still in hargeofs
1
. We an then removetherepli ainitiallypla edins
1
.This way, we have not hanged the assignment on repli as in
subSN
, but we have pla edarepli a ins
whi h ispro essing Wrequests. Sin ewehave atthe sametime removedtherstrepli aonthepathfroms
totheroot(s
1
),wehavenot hangedthe numberofrepli asandthesolutionisstilloptimal.On ewehaveappliedthispro edureuptotheroot,wehaveanoptimalsolution
R
sat
inwhi h allnodesofSN
havebeenpla edarepli aandarepro essingWrequests. Wewillnot hangethe assignmentof these repli asanymorein thefollowing. Free nodesin thenew solutionare alled F-nodes,whilerepli aswhi harenotinSN
are alled PS-nodes,forpartially saturated.In anext step, we further modify the
R
sat
optimal solutionin order to obtainwhat we all the anoni al solutionR
can
. To do so, we hange the request assignment of the PS-nodes: we saturate someof themas mu haswe anandweintegratetheminto thesubsetofnodesSN
, redeningthecf low
a ordingly. Attheend ofthepro ess,SN = R
can
.The
cf low
isstill theow whi h hasnotbeenpro essedbyasaturatednode in thesubtree, andthuswe anexpressitin amoregeneralway:cf low
s
= tf low
s
−
X
s
′
∈SN ∩
subtree
(s)
Notethatthis istotallyequivalentto thepreviousdenitionwhilewehavenotmodied
SN
. Wealsointrodu eanewowdenition,thenon-saturatedowofs
,nsf low
s
,whi h ountsthe requestsgoingthroughnodes
andnotservedbyasaturatedserveranywherein thetree. Thus,nsf low
s
= cf low
s
−
X
i∈
hildren(s)∩C
X
s
′
∈
An estors(s)∩SN
w
s
′
,i
.
This ow representsthe requests that anpotentiallybeserved by
s
while keepingall nodesof SNsaturated.Lemma3. Inasaturatedoptimalsolution,there annotexistaPS-nodeinthesubtreeofanother PS-node.
Proof. Thenon-saturatedowis
nsf low
s
≤ cf low
s
sin ewefurtherremovefrom the anoni al owsomerequestswhi hareae tedupwardsin thetreeto somesaturatedservers.Let
s ∈ R
sat
\ SN
beaPS-node. Its anoni alowiscf low
s
< W
. It anpotentiallypro ess alltherequestsofthesubtreewhi harenotae tedtoasaturatedserverupwardsordownwards in the tree, thusnsf low
s
requests. Sin ensf low
s
≤ cf low
s
< W
, we an hange the request assignmentto assignall thesensf low
s
requeststos
, removingeventuallysomework fromother non-saturated repli as upwardsor downwardswhi h were pro essing these requests. Thus, the repli aonnodes
ispro essingalltherequestsofsubtree(s)
whi h arenotpro essedbysaturated nodes.Iftherewasanonsaturatedrepli ainsubtree
(s)
,it ouldthusberemovedsin ealltherequests arepro essedbys
. ThismeansthatasolutionwithaPS-nodeinthesubtreeofanotherPS-node isnotoptimal,thusprovingthelemma.Atthis point, we an movethePS-nodesashighaspossiblein
R
sat
. Lets
beaPS-node. If thereisafreenodes
′
inAn estors
(s)
thenwe anmovetherepli afroms
tos
′
usingProposition2. Lemma3ensuresthat therearenootherPS-nodesin subtree
(s
′
)
.
Allfurthermodi ationswillonlyalternodeswhi hhavenoPS-nodesintheiran estors. We dene
N
′
= {s|
An estors(s) \ SN = ∅}
. Lets ∈ N
′
.nsf low
s
= cf low
s
−
P
i∈
hildren(s)∩C
P
s
′
∈
An estors(s)
w
s
′
,i
sin e all an estorsofs
areinSN
. Thus,nsf low
s
=
X
s
′
∈
subtree(s)\SN
w
s
′
.
Bydenition,
∀s ∈ N nsf low
s
≤ cf low
s
. Moreover, ifs /
∈ SN
, thennsf low
s
= w
s
sin e subtree(s) \ SN
isredu edtos
(nootherPS-nodeunder thePS-nodes
,from Lemma3).Weintrodu eanewowdenition,theusefulow,whi hintuitivelyrepresentsthenumberof requeststhat anpossiblybepro essedon
s
withoutremovingrequestsfrom asaturatedserver.uf low
s
=
min
s
′
∈
An estors(s)∪{s}
{cf low
s
′
}
Lemma4. Lets ∈ N
′
. Then
nsf low
s
≤ uf low
s
. Proof. Lets
′
∈
An estors(s)
. Sin es ∈ N
′
,s
′
∈ SN
.cf low
s
′
≥ nsf low
s
′
=
X
s
′′
∈
subtree(s
′
)\SN
w
s
′′
Butsin e
s ∈
subtree(s
′
)
, subtree
(s) \ SN ⊆
subtree(s
′
) \ SN
, hen e
nsf low
s
≤ nsf low
s
′
. Note thatnsf low
isanonde reasingfun tion (whengoingupthetree).Thus,
∀s
′
∈
An estors
(s) ∪ {s}
,nsf low
s
≤ cf low
s
′
, and by denition of the useful ow,Nowwestartthemodi ationoftheoptimalsolutioninordertoobtainthe anoni alsolution. Atea hstep,wesele tanode
s ∈ N \ SN
maximizingtheusefulow. Ifthereareseveralnodes of identi aluf low
, we sele t the rst one in a depth-rst traversal of the tree. We will prove that we anae tuf low
s
requeststo thisnodewithoutunsaturatinganyserverofSN.s
is then onsideredasasaturatednode,were omputethe anoni alows(andthustheusefulows)and reiterate thepro ess untilcf low
r
= 0
, whi h means that all the requestshave been ae ted to saturatedservers.Letus explainhowto reassigntherequestsin orderto saturate
s
withuf low
s
requests. The ideaistoremovesomerequestsfromAn estors(s)
inordertosaturates
,andthentosaturatethe an estorsofs
again,byae tingthemsomerequests omingfromothernonsaturatedservers.First,wenotethat
uf low
s
≤ cf low
r
= nsf low
r
. Thus,uf low
s
≤
X
s
′
∈N \SN
w
s
′
= w
s
+
X
s
′
∈P S
w
s
′
wherePSisthesetofnonsaturatednodeswithout
s
. Letx = uf low
s
− w
s
. Ifx = 0
,s
isalready saturated. Otherwise,weneedtoreassignx
requeststos
. Fromthepreviousequation,we ansee thatP
s
′
∈P S
w
s
′
≥ uf low
s
− w
s
= x
. There arethusenoughrequestshandled bynon saturated nodeswhi h anbepassedtos
.Thenumberof requestsofsubtree
(s) ∩ C
handled byAn estors(s)
isX
s
′
∈
An estors(s)
X
i∈
subtree(s)∩C
w
s
′
,i
= cf low
s
− nsf low
s
bydenitionoftheow. Or
cf low
s
− nsf low
s
≥ uf low
s
− w
s
= x
sothereareatleastx
requests thats
antakefrom itsan estors.Let
a
1
=
parent(s), ..., a
k
= r
bethean estorsofs
.x
j
=
P
i∈
subtree(s)∩C
w
a
j
,i
is theamountof requests thats
antakefroma
j
. We hoose arbitrarywhere to take therequestsifP
j
x
j
> x
, and donot modify the assignment ofthe other requests. We thus assumein thefollowingthatP
j
x
j
= x
. Sin e thesex
j
requests are oming from a lient in subtree(s)
, we anassign them tos
, andtherearenowonlyW− x
j
requestshandledbya
j
,whi hmeansthata
j
istemporarily unsaturated. However,wehavegivenx
extrarequeststos
,hen es
ispro essingw
s
+ x = uf low
s
requests.We nally need to reassign requests to
a
j
, j = 1..k
in order to saturate these nodes again, taking requestsoutof nodesinP S
(non saturatednodesother thans
). This isdoneiteratively startingwithj = 1
andgoinguptotheroota
k
. Atea hstepj
,weassumethata
j
′
, j
′
< j
have alreadybeensaturatedagainandweshouldnotmoverequestsawayfromthem. However,we an stilleventuallytakerequestsawayfrom
a
j
′′
, j
′′
> j
. Inordertosaturate
a
j
,weneedtotake: eitherrequestsfromsubtree
(a
j
) ∩ C
whi hare urrentlyhandledbya
j
′′
, j
′′
> j
,butwithout movingrequestswhi harealreadyae tedto
s
(i.e.P
j
′′
>j
x
j
′′
); orrequests from non saturatedserversin subtree
(a
j
)
, ex eptrequests froms
and requests alreadygiventos
that shouldnotbemovedanymore(i.e.P
j
′
<j
x
j
′
).Thenumberofrequeststhatwe anpotentiallyae tto
a
j
istherefore:X =
X
s
′
∈
subtree(a
j
)\SN \{s}
w
s
′
+
X
i∈
subtree(a
j
)∩C
X
s
′
∈
An estors(a
j
)
w
s
′
,i
−
X
j
′
<j
x
j
′
−
X
j
′′
>j
x
j
′′
Letusshowthat
X ≥ x
j
. Thenwe anusetheserequeststo saturatea
j
again.cf low
a
j
= nsf low
a
j
+
X
i∈
subtree(a
j
)∩C
X
s
′
∈
An estors(a
j
)
w
s
′
,i
= w
s
+X+
X
j
′
<j
x
j
′
+
X
j
′′
>j
x
j
′′
= X+w
s
+x−x
j
Butcf low
a
j
≥ uf low
s
anduf low
s
− w
s
= x
soc
1
c
2
c
3
n
1
n
2
n
m
c
3m
Figure7: Theplatformusedin theredu tionforTheorem2.
Itisthuspossibletosaturate
s
andthenkeepitsan estorssaturated. Atthispoint,s
be omes anodeofSN
and we an re ompute the anoni al andnon saturated ows. We haveremoveduf low
s
requests whi h were pro essed by nonsaturated servers,so thecf low
andnsf low
of all an estorsofs
,in ludings
,shouldbede reasedbyuf low
s
.Inparti ular,at theroot,
cf low
r
= cf low
r
− uf low
s
,whi hprovesthat the ontributionofs
oncf low
r
isuf low
s
.In the last stepof the proof, we showthat thenumberof repli as in the modied anoni al solutionattheendoftheiteration
R
can
= SN
hasexa tlythesamenumberofrepli asthanR
sat
. Inthesaturatedsolution, ea h PS-nodes
is pro essingnsf low
s
requests,while in the anoni al solution,it isuf low
s
. However,ateverystep whenadding asaturatednodes
,wehaveuf low
s
greaterthananyofthensf low
s. Itisthus easytoseethatthenumberofnodesinthe anoni al solution is less orequal to the number of nodes in the saturated solution. Sin e the saturated solutionisoptimal,|R
can
| = |R
sat
|
,whi h ompletestheproof.Ouralgorithmbuilds
R
can
inpolynomialtime,whi hassessesthe omplexityoftheproblem.4.2 With homogeneous nodes and the Upwards strategy
Theorem 2. The instan e of the Repli a Counting problem with the Upwards strategy is NP- ompleteinthe strongsense.
Proof. The problem learly belongs to the lass NP: given a solution, it is easy to verify in polynomialtimethatallrequestsareservedandthatnoserver apa ityisex eeded. Toestablish the ompletenessinthestrongsense,weusearedu tionfrom3-PARTITION[3℄. We onsideran instan e
I
1
of3-PARTITION:given3m
positiveintegersa
1
, a
2
, . . . , a
3m
su hthatB/4 < a
i
< B/2
for1 ≤ i ≤ 3m
, andP
3m
i=1
a
i
= mB
, anwepartition these integersintom
triples, ea hof sumB
? Webuild thefollowinginstan eI
2
of Repli aCounting(seeFigure7):
3m
lientsc
i
withr
i
= a
i
for1 ≤ i ≤ 3m
.
m
internal nodesn
j
withWj
=
sj
= B
for1 ≤ j ≤ m
.-The hildrenof
n
1
areallthe3m
lientsc
i
,anditsparentisn
2
.- For
2 ≤ j ≤ m
, theonly hildofn
j
isn
j−1
. For1 ≤ j ≤ m − 1
, theparentofn
j
isn
j+1
(hen en
m
istheroot).Finally, we ask whether there exists a solution with total storage ost