Strategies for Replica Placement in Tree Networks


HAL Id: inria-00111963

https://hal.inria.fr/inria-00111963v2

Submitted on 8 Dec 2006


Yves Robert, Anne Benoit, Veronika Rehn

To cite this version:

Yves Robert, Anne Benoit, Veronika Rehn. Strategies for Replica Placement in Tree Networks.

[Research Report] RR-6012, LIP RR-2006-30, INRIA, LIP. 2006, pp.39. inria-00111963v2


Research report

ISSN 0249-6399, ISRN INRIA/RR--6012--FR+ENG

Thème NUM


N° 6012


Unité de recherche INRIA Rhône-Alpes

655, avenue de l’Europe, 38334 Montbonnot Saint Ismier (France)

Téléphone : +33 4 76 61 52 00 — Télécopie +33 4 76 61 52 52

Theme NUM: Numerical systems

Project GRAAL

Research report no. 6012, November 2006, 36 pages

Abstract: In this paper, we discuss and compare several policies to place replicas in tree networks, subject to server capacity and QoS constraints. The client requests are known beforehand, while the number and location of the servers are to be determined. The standard approach in the literature is to enforce that all requests of a client be served by the closest server in the tree. We introduce and study two new policies. In the first policy, all requests from a given client are still processed by the same server, but this server can be located anywhere in the path from the client to the root. In the second policy, the requests of a given client can be processed by multiple servers.

One major contribution of this paper is to assess the impact of these new policies on the total replication cost. Another important goal is to assess the impact of server heterogeneity, both from a theoretical and a practical perspective. In this paper, we establish several new complexity results, and provide several efficient polynomial heuristics for NP-complete instances of the problem. These heuristics are compared to an absolute lower bound provided by the formulation of the problem in terms of the solution of an integer linear program.

Key-words: Replica placement, tree networks, access policy, scheduling, complexity results, heuristics, heterogeneous clusters.

Résumé: In this report, we present and compare several policies for placing replicas on trees, taking into account both constraints on the processing capacity of each server and QoS (quality of service) constraints. The client requests are known before execution, while the number and location of the replicas (servers) are to be determined by the placement algorithm. The classical approach requires that all requests of a given client be processed by a single server, namely the one closest to the client in the tree. We introduce two new placement policies. In the first one, each client still has a single server, but this server can be located anywhere on the path from the client to the root of the tree. With the second policy, the requests of a given client can be processed by several servers on this same path.

We show that these two new placement policies can strongly reduce the total replication cost. Another objective of this work is to analyze the impact of platform heterogeneity, from both a theoretical and a practical point of view. On the theoretical side, we establish several complexity results, in the homogeneous and heterogeneous settings, for the classical approach and for the new policies. On the practical side, we design polynomial heuristics for the combinatorial instances of the problem. We compare the performance of these heuristics against an absolute lower bound on the total replication cost; this bound is obtained by relaxing an integer linear program that characterizes the optimal solution of the problem.

Mots-clés: Replica placement, tree networks, scheduling, complexity, heuristics, heterogeneous computing clusters.

Contents

1 Introduction
2 Framework
    2.1 Definitions and notations
    2.2 Problem instances
        2.2.1 Constraints
        2.2.2 Objective function
        2.2.3 Simplified problems
3 Access policies
    3.1 Impact of the access policy on the existence of a solution
    3.2 Upwards versus Closest
    3.3 Multiple versus Upwards
    3.4 Lower bound for the Replica Counting problem
4 Complexity results
    4.1 With homogeneous nodes and the Multiple strategy
        4.1.1 Algorithm for multiple servers
        4.1.2 Example
        4.1.3 Proof of optimality
    4.2 With homogeneous nodes and the Upwards strategy
    4.3 With heterogeneous nodes
5 Linear programming formulation
    5.1 Single server
    5.2 Multiple servers
    5.3 An ILP-based lower bound
6 Heuristics for the Replica Cost problem
    6.1 Closest
    6.2 Upwards
    6.3 Multiple
7 Experiments: comparisons of different access policies
    7.1 Obtaining a lower bound
    7.2 Experimental plan
    7.3 Results
8 Extensions
    8.1 With several objects
    8.2 More complex objective functions
9 Related work

1 Introduction

In this paper, we consider the general problem of replica placement in tree networks. Informally, there are clients issuing requests to be satisfied by servers. The clients are known (both their position in the tree and their number of requests), while the number and location of the servers are to be determined. A client is a leaf node of the tree, and its requests can be served by one or several internal nodes. Initially, there is no replica; when a node is equipped with a replica, it can process a number of requests, up to its capacity limit. Nodes equipped with a replica, also called servers, can only serve clients located in their subtree (so that the root, if equipped with a replica, can serve any client); this restriction is usually adopted to enforce the hierarchical nature of the target application platforms, where a node has knowledge only of its parent and children in the tree.

The rule of the game is to assign replicas to nodes so that some optimization function is minimized. Typically, this optimization function is the total utilization cost of the servers. If all the nodes are identical, this reduces to minimizing the number of replicas. If the nodes are heterogeneous, it is natural to assign a cost proportional to their capacity (so that one replica on a node capable of handling 200 requests is equivalent to two replicas on nodes of capacity 100 each). The core of the paper is devoted to the study of the previous optimization problem, called Replica Placement in the following. Additional constraints are introduced, such as guaranteeing some Quality of Service (QoS): the requests must be served in limited time, thereby prohibiting too remote or hard-to-reach replica locations. Also, the flow of requests through a link in the tree cannot exceed some bandwidth-related capacity. We focus on optimizing the total utilization cost (or replica number in the homogeneous case). There are several possible extensions: dealing with several object types rather than one, including communication time into the objective function, taking into account an update cost of the replicas, and so on. For the sake of clarity we devote a special section (Section 8) to formulating these extensions, and to describing the situations to which our results and algorithms still apply.

We point out that the distribution tree (clients and nodes) is fixed in our approach. This key assumption is quite natural for a broad spectrum of applications, such as electronic, ISP, or VOD service delivery. The root server has the original copy of the database but cannot serve all clients directly, so a distribution tree is deployed to provide a hierarchical and distributed access to replicas of the original data. On the contrary, in other, more decentralized, applications (e.g. allocating Web mirrors in distributed networks), a two-step approach is used: first determine a good distribution tree in an arbitrary interconnection graph, and then determine a good placement of replicas among the tree nodes. Both steps are interdependent, and the problem is much more complex, due to the combinatorial solution space (the number of candidate distribution trees may well be exponential).

Many authors deal with the Replica Placement optimization problem, and we survey related work in Section 9. The objective of this paper is twofold: (i) introducing two new access policies and comparing them with the standard approach; (ii) assessing the impact of server heterogeneity on the problem.

In most, if not all, papers from the literature, all requests of a client are served by the closest replica, i.e. the first replica found in the unique path from the client to the root in the distribution tree. This Closest policy is simple and natural, but may be unduly restrictive, leading to a waste of resources. We introduce and study two different approaches: in the first one, we keep the restriction that all requests from a given client are processed by the same replica, but we allow client requests to traverse servers so as to be processed by other replicas located higher in the path (closer to the root). We call this approach the Upwards policy. The trade-off to explore is the following: the Closest policy assigns replicas at proximity of the clients, but may need to allocate too many of them if some local subtree issues a great number of requests. The Upwards policy will ensure a better resource usage, load-balancing the process of requests on a larger scale; the possible drawback is that requests will be served by remote servers, likely to take longer time to process them. Taking QoS constraints into account would typically be more important for the Upwards policy.

In the second approach, we further relax access constraints and grant the possibility for a client to be assigned several replicas. With this Multiple policy, the processing of a given client's requests will be split among several servers located in the tree path from the client to the root. Obviously, this policy is the most flexible, and likely to achieve the best resource usage. The only drawback is the (modest) additional complexity induced by the fact that requests must now be tagged with the replica server ID in addition to the client ID. As already stated, one major objective of this paper is to compare these three access policies, Closest, Upwards and Multiple.

The second major contribution of the paper is to assess the impact of server heterogeneity, both from a theoretical and a practical perspective. Recently, several variants of the Replica Placement optimization problem with the Closest policy have been shown to have polynomial complexity. In this paper, we establish several new complexity results. Those for the homogeneous case are surprising: for the simplest instance without QoS nor bandwidth constraints, the Multiple policy is polynomial (as Closest) while Upwards is NP-hard. The three policies turn out to be NP-complete for heterogeneous nodes, which provides yet another example of the additional difficulties induced by resource heterogeneity. On the more practical side, we provide an optimal algorithm for the Multiple problem with homogeneous nodes, and several heuristics for all three policies in the heterogeneous case. We compare these heuristics through simulations conducted for problem instances without QoS nor bandwidth constraints. Another contribution is that we are able to assess the absolute performance of the heuristics, not just comparing one to the other, owing to a lower bound provided by a new formulation of the Replica Placement problem in terms of an integer linear program: the relaxation of this program to the rational numbers provides a lower bound to the solution cost (which is not always feasible).

The rest of the paper is organized as follows. Section 2 is devoted to a detailed presentation of the target optimization problems. In Section 3 we introduce the three access policies, and we give a few motivating examples. Next in Section 4 we proceed to the complexity results for the simplest version of the Replica Placement problem, both in the homogeneous and heterogeneous cases. Section 5 deals with the formulation for the Replica Placement problem in terms of an integer linear program. In Section 6 we introduce several polynomial heuristics to solve the Replica Placement problem with the different access policies. These heuristics are compared through simulations, whose results are analyzed in Section 7. Section 8 discusses various extensions to the Replica Placement problem while Section 9 is devoted to an overview of related work. Finally, we state some concluding remarks in Section 10.

2 Framework

This section is devoted to a precise statement of the Replica Placement optimization problem. We start with some definitions and notations. Next we outline the simplest instance of the problem. Then we describe several types of constraints that can be added to the formulation.

2.1 Definitions and notations

We consider a distribution tree $T$ whose nodes are partitioned into a set of clients $C$ and a set of nodes $N$. The set of tree edges is denoted as $L$. The clients are leaf nodes of the tree, while $N$ is the set of internal nodes. It would be easy to allow client-server nodes which play both the role of a client and of an internal node (possibly a server), by dividing such a node into two distinct nodes in the tree, connected by an edge with zero communication cost.

A client $i \in C$ is making requests to database objects. For the sake of clarity, we restrict the presentation to a single object type, hence a single database. We deal with several object types in Section 8.

A node $j \in N$ may or may not have been provided with a replica of the database. Nodes equipped with a replica (i.e. servers) can process requests from clients in their subtree. In other words, there is a unique path from a client $i$ to the root of the tree, and each node in this path is eligible to process some or all the requests issued by $i$ when provided with a replica.

Let $r$ be the root of the tree. If $j \in N$, then $\mathrm{children}(j)$ is the set of children of node $j$. If $k \neq r$ is any node in the tree (leaf or internal), $\mathrm{parent}(k)$ is its parent in the tree. If $l : k \to k' = \mathrm{parent}(k)$ is any link in the tree, then $\mathrm{succ}(l)$ is the link $k' \to \mathrm{parent}(k')$ (when it exists). Let $\mathrm{Ancestors}(k)$ denote the set of ancestors of node $k$, i.e. the nodes in the unique path that leads from $k$ up to the root $r$ ($k$ excluded). If $k' \in \mathrm{Ancestors}(k)$, then $\mathrm{path}[k \to k']$ denotes the set of links in the path from $k$ to $k'$; also, $\mathrm{subtree}(k)$ is the subtree rooted in $k$, including $k$. We introduce more notations to describe our system in the following.
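The tree notation above can be made concrete with a small sketch. This is only an illustration under stated assumptions: the `Tree` class, its child-to-parent dictionary representation, and the node names are hypothetical and do not come from the report.

```python
from dataclasses import dataclass


@dataclass
class Tree:
    """Distribution tree stored as a child -> parent map (hypothetical layout).

    The root maps to None. Clients and internal nodes share the same map.
    """
    parent: dict

    def children(self, j):
        """children(j): nodes whose parent is j."""
        return [k for k, p in self.parent.items() if p == j]

    def ancestors(self, k):
        """Ancestors(k): nodes on the path from k up to the root, k excluded."""
        out = []
        p = self.parent[k]
        while p is not None:
            out.append(p)
            p = self.parent[p]
        return out

    def path(self, k, k_anc):
        """path[k -> k']: links (child, parent) from k up to its ancestor k_anc.

        Assumes k_anc really is an ancestor of k.
        """
        links = []
        while k != k_anc:
            links.append((k, self.parent[k]))
            k = self.parent[k]
        return links

    def subtree(self, k):
        """subtree(k): all nodes of the subtree rooted in k, k included."""
        nodes = [k]
        for c in self.children(k):
            nodes.extend(self.subtree(c))
        return nodes


# Tiny example: client c1 under node s1, s1 under the root r.
tree = Tree(parent={"r": None, "s1": "r", "c1": "s1"})
print(tree.ancestors("c1"), tree.path("c1", "r"), tree.subtree("s1"))
```

A parent map keeps `ancestors` and `path` trivial to write; a production implementation would probably also cache the children lists.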

Clients $i \in C$: Each client $i$ (leaf of the tree) is sending $r_i$ requests per time unit. For such requests, the required QoS (typically, a response time) is denoted $q_i$, and we need to ensure that this QoS will be satisfied for each client.

Nodes $j \in N$: Each node $j$ (internal node of the tree) has a processing capacity $W_j$, which is the total number of requests that it can process per time unit when it has a replica. A cost $s_j$ is also associated to each node, which represents the price to pay to place a replica at this node. With a single object type it is quite natural to assume that $s_j$ is proportional to $W_j$: the more powerful a server, the more costly. But with several objects we may use non-related values of capacity and cost.

Communication links $l \in L$: The edges of the tree represent the communication links between nodes (leaf and internal). We assign a communication time $\mathrm{comm}_l$ on link $l$, which is the time required to send a request through the link. Moreover, $\mathrm{BW}_l$ is the maximum number of requests that link $l$ can transmit per time unit.

2.2 Problem instances

For each client $i \in C$, let $\mathrm{Servers}(i) \subseteq N$ be the set of servers responsible for processing at least one of its requests. We do not specify here which access policy is enforced (e.g. one or multiple servers); we defer this to Section 3. Instead, we let $r_{i,s}$ be the number of requests from client $i$ processed by server $s$ (of course, $\sum_{s \in \mathrm{Servers}(i)} r_{i,s} = r_i$). In the following, $R$ is the set of replicas:

$$R = \{ s \in N \mid \exists i \in C,\ s \in \mathrm{Servers}(i) \}.$$

2.2.1 Constraints

Three main types of constraints are considered.

Server capacity: The constraint that no server capacity can be exceeded is present in all variants of the problem:

$$\forall s \in R, \quad \sum_{i \in C \mid s \in \mathrm{Servers}(i)} r_{i,s} \le W_s$$

QoS: Some problem instances enforce a quality of service: the time to transfer a request from a client to a replica server is bounded by a quantity $q_i$. This translates into:

$$\forall i \in C,\ \forall s \in \mathrm{Servers}(i), \quad \sum_{l \in \mathrm{path}[i \to s]} \mathrm{comm}_l \le q_i.$$

Note that it would be easy to extend the QoS constraint so as to take the computation cost of a request in addition to its communication cost. This former cost is directly related to the computational speed of the server and the amount of computation (in flops) required for each request.

Link capacity: Some problem instances enforce a global constraint on each communication link $l \in L$:

$$\sum_{i \in C,\ s \in \mathrm{Servers}(i) \mid l \in \mathrm{path}[i \to s]} r_{i,s} \le \mathrm{BW}_l$$
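The three constraints can be checked mechanically for a candidate assignment. The sketch below is a hypothetical helper, not code from the report: the function name, the dictionary-based inputs, and the convention of naming a link by its lower endpoint are all assumptions made for illustration. It also checks the defining identity that each client has all of its $r_i$ requests served.

```python
def satisfies_constraints(parent, r, W, comm, BW, q, x):
    """Check server capacity, QoS and link capacity for x[(i, s)] = r_{i,s}.

    A link is identified by its lower endpoint k, i.e. the link k -> parent[k]
    (an assumption of this sketch).
    """
    # Every client must have all of its requests served.
    for i, ri in r.items():
        if sum(n for (c, _), n in x.items() if c == i) != ri:
            return False
    load, traffic = {}, {}
    for (i, s), n in x.items():
        if n == 0:
            continue
        load[s] = load.get(s, 0) + n
        delay, k = 0, i
        while k != s:                  # walk up path[i -> s]
            if parent[k] is None:      # reached the root: s is not an ancestor of i
                return False
            delay += comm[k]
            traffic[k] = traffic.get(k, 0) + n
            k = parent[k]
        if delay > q[i]:               # QoS constraint
            return False
    capacity_ok = all(load[s] <= W[s] for s in load)      # server capacity
    links_ok = all(traffic[k] <= BW[k] for k in traffic)  # link capacity
    return capacity_ok and links_ok


# Chain c1 -> s1 -> r; client c1 issues 2 requests, split between s1 and r.
parent = {"r": None, "s1": "r", "c1": "s1"}
r = {"c1": 2}
W = {"s1": 1, "r": 1}
comm = {"c1": 1, "s1": 1}
BW = {"c1": 2, "s1": 1}
x = {("c1", "s1"): 1, ("c1", "r"): 1}
print(satisfies_constraints(parent, r, W, comm, BW, {"c1": 2}, x))
print(satisfies_constraints(parent, r, W, comm, BW, {"c1": 1}, x))
```

With $q_{c1} = 2$ the split assignment is valid; tightening the QoS bound to 1 rejects it, because the request served at the root crosses two links.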

2.2.2 Objective function

The objective function for the Replica Placement problem is defined as:

$$\min \sum_{s \in R} s_s$$

As already pointed out, it is frequently assumed that the cost of a server is proportional to its capacity, so in some problem instances we let $s_s = W_s$.
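As a small illustration of the objective, the sketch below evaluates the cost of a replica set. The function name `replica_cost` and the node names are hypothetical; the numbers reuse the example from the introduction (one node of capacity 200 versus two nodes of capacity 100).

```python
def replica_cost(replicas, s):
    """Objective value: sum of the storage costs s[j] over the replica set."""
    return sum(s[j] for j in replicas)


W = {"big": 200, "a": 100, "b": 100}
s = dict(W)                      # cost proportional to capacity: s_j = W_j
one_big = replica_cost({"big"}, s)
two_small = replica_cost({"a", "b"}, s)
print(one_big, two_small)

unit = {j: 1 for j in W}         # unit costs: the objective counts replicas
print(replica_cost({"a", "b"}, unit))
```

With $s_j = W_j$ the two placements cost the same, which is the equivalence stated in the introduction; with unit costs the objective simply counts replicas.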

2.2.3 Simplified problems

We define a few simplified problem instances in the following:

QoS=distance: We can simplify the expression of the communication time in the QoS constraint and only consider the distance (in number of hops) between a client and its server(s). The QoS constraint is then

$$\forall i \in C,\ \forall s \in \mathrm{Servers}(i), \quad d(i, s) \le q_i$$

where the distance $d(i, s) = |\mathrm{path}[i \to s]|$ is the number of communication links between $i$ and $s$.

No QoS: We may further simplify the problem, by completely suppressing the QoS constraints. In this case, the servers can be anywhere in the tree, their location is indifferent to the client.

No link capacity: We may consider the problem assuming infinite link capacity, i.e. not bounding the total traffic on any link in an admissible solution.

Only server capacities: The problem without QoS and link capacities reduces to finding a valid solution of minimal cost, where valid means that no server capacity is exceeded. We name Replica Cost this fundamental problem.

Replica counting: We can further simplify the previous Replica Cost problem in the homogeneous case: with identical servers, the Replica Cost problem amounts to minimizing the number of replicas needed to solve the problem. In this case, the storage cost $s_j$ is set to 1 for each node. We call this problem Replica Counting.

3 Access policies

In this section we review the usual policies enforcing which replica is accessed by a given client. Consider that each client $i$ is making $r_i$ requests per time unit. There are two scenarios for the number of servers assigned to each client:

Single server: Each client $i$ is assigned a single server $\mathrm{server}(i)$, that is responsible for processing all its requests.

Multiple servers: A client $i$ may be assigned several servers in a set $\mathrm{Servers}(i)$. Each server $s \in \mathrm{Servers}(i)$ will handle a fraction $r_{i,s}$ of the requests. Of course $\sum_{s \in \mathrm{Servers}(i)} r_{i,s} = r_i$.

To the best of our knowledge, the single server policy has been enforced in all previous approaches. One objective of this paper is to assess the impact of this restriction on the performance of data replication algorithms. The single server policy may prove a useful simplification, but may come at the price of a non-optimal resource usage.

In the literature, the single server strategy is further constrained to the Closest policy. Here, the server of client $i$ is constrained to be the first server found on the path that goes from $i$ upwards to the root of the tree. In particular, consider a client $i$ and its server $\mathrm{server}(i)$. Then any other client node $i'$ residing in the subtree rooted in $\mathrm{server}(i)$ will be assigned a server in that subtree. This forbids requests from $i'$ to traverse $\mathrm{server}(i)$ and be served higher (closer to the root in the tree).

We relax this constraint in the Upwards policy, which is the general single server policy. Notice that a solution to Closest is always a solution to Upwards, thus Upwards is always at least as good as Closest in terms of the objective function. Similarly, the Multiple policy is always at least as good as Upwards, because it is not constrained by the single server restriction.
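The difference between the policies comes down to which replicas on the client's path to the root may be used. A minimal sketch, assuming a hypothetical child-to-parent dictionary and helper names that are not from the report:

```python
def servers_on_path(parent, replicas, client):
    """Replicas found on the path from the client up to the root, nearest first.

    Under Upwards, the single server of the client is any one element of this
    list; under Multiple, the requests may be split among several of them.
    """
    found = []
    k = parent[client]
    while k is not None:
        if k in replicas:
            found.append(k)
        k = parent[k]
    return found


def closest_server(parent, replicas, client):
    """Closest policy: the first replica met when going up, or None."""
    on_path = servers_on_path(parent, replicas, client)
    return on_path[0] if on_path else None


parent = {"r": None, "s1": "r", "c1": "s1"}
print(closest_server(parent, {"r", "s1"}, "c1"))
print(servers_on_path(parent, {"r", "s1"}, "c1"))
```

Closest leaves no freedom once the replica set is fixed, which is why a Closest solution is always an Upwards solution but not the other way around.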

The following sections illustrate the three policies. Section 3.1 provides simple examples where there is a valid solution for a given policy, but none for a more constrained one. Section 3.2 shows that Upwards can be arbitrarily better than Closest, while Section 3.3 shows that Multiple can be arbitrarily better than Upwards. We conclude with an example showing that the cost of an optimal solution of the Replica Counting problem (for any policy) can be arbitrarily higher than the obvious lower bound

$$\left\lceil \frac{\sum_{i \in C} r_i}{W} \right\rceil,$$

where $W$ is the server capacity.

3.1 Impact of the access policy on the existence of a solution

We consider here a very simple instance of the Replica Counting problem. In this example there are two nodes, $s_1$ being the unique child of $s_2$, the tree root (see Figure 1). Each node can process $W = 1$ request.

[Figure 1: Access policies. Panels (a), (b) and (c); nodes $s_1$ and $s_2$ with $W = 1$.]

If $s_1$ has one client child making 1 request, the problem has a solution with all three policies, placing a replica on $s_1$ or on $s_2$ indifferently (Figure 1(a)).

If $s_1$ has two client children, each making 1 request, the problem has no more solution with Closest. However, we have a solution with both Upwards and Multiple if we place replicas on both nodes. Each server will process the request of one of the clients (Figure 1(b)).

Finally, if $s_1$ has only one client child making 2 requests, only Multiple has a solution since we need to process one request on $s_1$ and the other on $s_2$, thus requesting multiple servers (Figure 1(c)).

This example demonstrates the usefulness of the new policies. The Upwards policy allows to find solutions when the classical Closest policy does not. The same holds true for Multiple versus Upwards. In the following, we compare the cost of solutions obtained with different strategies.
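The three cases can be replayed with a brute-force search over replica sets. This is an illustrative sketch only, practical for tiny trees: all function names are hypothetical, the tree is a child-to-parent dictionary, and the Multiple feasibility test uses a bottom-up greedy absorption of requests, which is an assumption of this sketch rather than the algorithm of Section 4.1.

```python
from itertools import combinations, product


def path_up(parent, k):
    """Nodes on the path from k up to the root, k excluded."""
    out, k = [], parent[k]
    while k is not None:
        out.append(k)
        k = parent[k]
    return out


def ok_closest(parent, r, W, repl):
    """Each client goes to the first replica on its path; check capacities."""
    load = {}
    for i, ri in r.items():
        srv = next((s for s in path_up(parent, i) if s in repl), None)
        if srv is None:
            return False
        load[srv] = load.get(srv, 0) + ri
    return all(v <= W for v in load.values())


def ok_upwards(parent, r, W, repl):
    """Try every choice of a single server per client (exponential)."""
    clients = list(r)
    options = [[s for s in path_up(parent, i) if s in repl] for i in clients]
    for choice in product(*options):
        load = {}
        for i, s in zip(clients, choice):
            load[s] = load.get(s, 0) + r[i]
        if all(v <= W for v in load.values()):
            return True
    return False


def ok_multiple(parent, r, W, repl):
    """Each replica absorbs up to W of the requests flowing up through it."""
    def flow(k):
        f = r[k] if k in r else sum(flow(c) for c in parent if parent[c] == k)
        return f - min(W, f) if k in repl else f
    root = next(k for k, p in parent.items() if p is None)
    return flow(root) == 0


def min_replicas(ok, parent, r, W):
    """Smallest feasible replica count, or None if the policy has no solution."""
    nodes = [k for k in parent if k not in r]
    for size in range(len(nodes) + 1):
        for repl in combinations(nodes, size):
            if ok(parent, r, W, set(repl)):
                return size
    return None


chain = {"s2": None, "s1": "s2"}           # s1 is the unique child of root s2
cases = {
    "a": (dict(chain, c="s1"), {"c": 1}),
    "b": (dict(chain, c1="s1", c2="s1"), {"c1": 1, "c2": 1}),
    "c": (dict(chain, c="s1"), {"c": 2}),
}
results = {
    name: [min_replicas(ok, p, r, 1) for ok in (ok_closest, ok_upwards, ok_multiple)]
    for name, (p, r) in cases.items()
}
print(results)  # [Closest, Upwards, Multiple] per case; None means no solution
```

The search is exponential in the number of nodes and clients, so it is only meant to confirm small hand-built examples such as Figure 1.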

3.2 Upwards versus Closest

In the following example, we construct an instance of Replica Counting where the cost of the Upwards policy is arbitrarily lower than the cost of the Closest policy. We consider the tree network of Figure 2, where there are $2n + 2$ internal nodes, each with $W_j = W = n$, and $2n + 1$ clients, each with $r_i = r = 1$.

With the Upwards policy, we place three replicas in $s_{2n}$, $s_{2n+1}$ and $s_{2n+2}$. All requests can be satisfied with these three replicas.

[Figure 2: Upwards versus Closest. Nodes $s_1, \dots, s_{2n}$, $s_{2n+1}$ and $s_{2n+2}$ with $W = n$; each client issues 1 request.]

When considering the Closest policy, first we need to place a replica in $s_{2n+2}$ to cover its client. Then,

Either we place a replica on $s_{2n+1}$. In this case, this replica is handling $n$ requests, but there remain $n$ other requests from the $2n$ clients in its subtree that cannot be processed by $s_{2n+2}$. Thus, we need to add $n$ replicas between $s_1 .. s_{2n}$.

Otherwise, $n - 1$ requests of the $2n$ clients in the subtree of $s_{2n+1}$ can be processed by $s_{2n+2}$ in addition to its own client. We need to add $n + 1$ extra replicas among $s_1, s_2, \dots, s_{2n}$.

In both cases, we are placing $n + 2$ replicas, instead of the 3 replicas needed with the Upwards policy. This proves that Upwards can be arbitrarily better than Closest on some Replica Counting instances.
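To make the word "arbitrarily" explicit, the ratio of the two replica counts derived in this example can be written out. The cost symbols below are shorthand introduced here for illustration, not notation from the report:

```latex
\[
\frac{\mathrm{cost}_{\mathit{Closest}}}{\mathrm{cost}_{\mathit{Upwards}}}
  = \frac{n+2}{3}
  \longrightarrow \infty \quad (n \to \infty).
\]
```

The ratio grows linearly with $n$, so no constant bounds the gap between the two policies on this family of instances.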

3.3 Multiple versus Upwards

In this section we build an instance of the Replica Counting problem where Multiple is (asymptotically) twice as good as Upwards. We do not know whether there exist instances of Replica Counting where the performance ratio of Multiple versus Upwards is higher than 2 (and we conjecture that this is not the case). However, we also build an instance of the Replica Cost problem (with heterogeneous nodes) where Multiple is arbitrarily better than Upwards.

We start with the homogeneous case. Consider the instance of Replica Counting represented in Figure 3, with $3n + 1$ nodes of capacity $W_j = W = 2n$. The root $r$ has $n + 1$ children, $n$ nodes labeled $s_1$ to $s_n$ and a client with $r_i = n$. Each node $s_j$ has two children nodes, labeled $v_j$ and $w_j$ for $1 \le j \le n$. Each node $v_j$ has a unique child, a client with $r_i = n$ requests; each node $w_j$ has a unique child, a client with $r_i = n + 1$ requests.

The Multiple policy assigns $n + 1$ replicas, one to the root $r$ and one to each node $s_j$. The replica in $s_j$ can process all the $2n + 1$ requests in its subtree except one, which is processed by the root.

For the Upwards policy, we need to assign one replica to $r$, to cover its client. This replica can process $n$ other requests, for instance those from the client child of $v_1$. We need to place at least a replica in $s_1$ or in $w_1$, and $2(n - 1)$ replicas in $v_j$ and $w_j$ for $2 \le j \le n$. This leads to a total of $2n$ replicas, hence a performance factor $\frac{2n}{n+1}$ whose limit is 2 when $n$ tends to infinity.

We now proceed to the heterogeneous case. Consider the instance of Replica Cost represented in Figure 4, with 3 nodes $s_1$, $s_2$ and $s_3$, and 2 clients. The capacity of $s_1$ and $s_2$ is $W_1 = W_2 = n$ while that of $s_3$ is $W_3 = Kn$, where $K$ is arbitrarily large. Recall that in the Replica Cost problem, we let $s_j = W_j$ for each node. Multiple assigns 2 replicas, in $s_1$ and $s_2$, hence has cost $2n$. The Upwards policy assigns a replica to $s_1$ to cover its child, and then cannot use $s_2$ to process the requests of the child in its subtree. It must place a replica in $s_3$, hence a final cost $n + Kn = (K + 1)n$ arbitrarily higher than Multiple.

[Figure 3: Multiple versus Upwards, homogeneous platforms. Root $r$, nodes $s_j$, $v_j$, $w_j$ for $1 \le j \le n$ with $W = 2n$; clients issuing $n$ or $n + 1$ requests.]

[Figure 4: Multiple versus Upwards, heterogeneous platforms. Nodes $s_1$ ($W_1 = n$), $s_2$ ($W_2 = n$) and $s_3$ ($W_3 = Kn$); clients issuing $n + 1$ and $n - 1$ requests.]
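The two performance ratios of this section can be written side by side as a short worked computation, using only the replica counts and costs stated above:

```latex
\[
\frac{2n}{n+1} = 2 - \frac{2}{n+1} \longrightarrow 2 \quad (n \to \infty),
\qquad
\frac{(K+1)\,n}{2n} = \frac{K+1}{2} \longrightarrow \infty \quad (K \to \infty).
\]
```

The homogeneous ratio stays strictly below 2 for every finite $n$, in line with the conjecture stated at the start of the section, while the heterogeneous ratio is unbounded in $K$.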

3.4 Lower bound for the Replica Counting problem

The cost of an optimal solution of the Replica Counting problem (for any policy) cannot be lower than the obvious lower bound $\left\lceil \frac{\sum_{i \in C} r_i}{W} \right\rceil$, where $W$ is the server capacity. Indeed, this corresponds to a solution where the total request load is shared as evenly as possible among the replicas.

The following instance of Replica Counting shows that the optimal cost can be arbitrarily higher than this lower bound. Consider Figure 5, with $n + 1$ nodes of capacity $W_j = W$. The root $r$ has $n + 1$ children, $n$ nodes labeled $s_1$ to $s_n$, and a client with $r_i = W$. Each node $s_j$ has a unique child, a client with $r_i = W/n$ (assume without loss of generality that $W$ is divisible by $n$). The lower bound is $\left\lceil \frac{\sum_{i \in C} r_i}{W} \right\rceil = \frac{2W}{W} = 2$. However, each of the three policies Closest, Upwards and Multiple will assign a replica to the root to cover its client, and will then need $n$ extra replicas, one per client of $s_j$, $1 \le j \le n$. The total cost is thus $n + 1$ replicas, arbitrarily higher than the lower bound.
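The lower bound is a one-line computation. The sketch below evaluates it on the Figure 5 instance for one arbitrarily chosen pair of values ($W = 12$, $n = 4$); the function name and these numbers are assumptions made for illustration.

```python
def lower_bound(requests, W):
    """Ceiling of (total number of requests) / W, in integer arithmetic."""
    return -(-sum(requests) // W)


W, n = 12, 4                            # hypothetical values, W divisible by n
requests = [W] + [W // n] * n           # root client, then one client per s_j
bound = lower_bound(requests, W)
print(bound, n + 1)                     # bound versus the n + 1 replicas needed
```

The bound stays at 2 whatever the value of $n$, while the text above shows that every policy needs $n + 1$ replicas on this instance.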

[Figure 5: The lower bound cannot be approximated for Replica Counting. Root $r$ with a client issuing $W$ requests; nodes $s_1$ to $s_n$, each with a client issuing $W/n$ requests.]

All the examples in Sections 3.1 to 3.4 give an insight of the combinatorial nature of the Replica Placement optimization problem, even in its simplest variants Replica Cost and Replica Counting. The following section corroborates this insight: most problems are shown NP-hard, even though some variants have polynomial complexity.

4 Complexity results

One major goal of this paper is to assess the impact of the access policy on the problem with homogeneous vs heterogeneous servers. We restrict to the simplest problem, namely the Replica Cost problem introduced in Section 2.2.3. We consider a tree $T = C \cup N$, no QoS constraint, and infinite link capacities. Each client $i \in C$ has $r_i$ requests; each node $j \in N$ has processing capacity $W_j$ and storage cost $s_j = W_j$. This simple problem comes in two flavors, either with homogeneous nodes ($W_j = W$ for all $j \in N$), or with heterogeneous nodes (servers with different capacities/costs).

In the single server version of the problem, we need to find a server $\mathrm{server}(i)$ for each client $i \in C$. Let $\mathrm{Servers}$ be the set of servers chosen among the nodes in $N$. The only constraint is that server capacities cannot be exceeded: this translates into

$$\sum_{i \in C,\ \mathrm{server}(i) = j} r_i \le W_j \quad \text{for all } j \in \mathrm{Servers}.$$

The objective is to find a valid solution of minimal storage cost $\sum_{j \in \mathrm{Servers}} W_j$. Note that with homogeneous nodes, the problem reduces to finding the minimum number of servers, i.e. to the Replica Counting problem. As outlined in Section 3, there are two variants of the single server version of the problem, namely the Closest and the Upwards strategies.

In the Multiple policy with multiple servers per client, let $\mathrm{Servers}$ be the set of servers chosen among the nodes in $N$; for any client $i \in C$ and any node $j \in N$, let $r_{i,j}$ be the number of requests from $i$ that are processed by $j$ ($r_{i,j} = 0$ if $j \notin \mathrm{Servers}$). We need to ensure that

$$\sum_{j \in N} r_{i,j} = r_i \quad \text{for all } i \in C.$$

The capacity constraint now writes

$$\sum_{i \in C} r_{i,j} \le W_j \quad \text{for all } j \in \mathrm{Servers},$$

while the objective function is the same as for the single server version.

The decision problems associated with the previous optimization problems are easy to formulate: given a bound on the number of servers (homogeneous version) or on the total storage cost (heterogeneous version), is there a valid solution that meets the bound?

            Homogeneous         Heterogeneous
Closest     polynomial [2, 9]   NP-complete
Upwards     NP-complete         NP-complete
Multiple    polynomial          NP-complete

Table 1: Complexity results for the different instances of the Replica Cost problem.

Table 1 captures the complexity results. These complexity results are all new, except for the Closest/Homogeneous combination. The NP-completeness of the Upwards/Homogeneous case comes as a surprise, since all previously known instances were shown to be polynomial, using dynamic programming algorithms. In particular, the Closest/Homogeneous variant remains polynomial when adding communication costs [2] or QoS constraints [9]. Previous NP-completeness results involved general graphs rather than trees, and the combinatorial nature of the problem came from the difficulty to extract a good replica tree out of an arbitrary communication graph. Here the tree is fixed, but the problem remains combinatorial due to resource heterogeneity.

4.1 With homogeneous nodes and the Multiple strategy

Theorem 1. The instance of the Replica Counting problem with the Multiple strategy can be solved in polynomial time.

Proof. We outline below an optimal algorithm to solve the problem. The proof of optimality is quite technical, so the reader may want to skip it at first reading.

4.1.1 Algorithm for multiple servers

We propose a greedy algorithm to solve the Replica Counting problem. Let $W$ be the total number of requests that a server can handle.

This algorithm works in three passes: first we select the nodes which will have a replica handling exactly $W$ requests. Then a second pass allows us to select some extra servers which are fulfilling the remaining requests. Finally, we need to decide for each server how many requests of each client it is processing.

We assume that each node $i$ knows its parent $\mathrm{parent}(i)$ and its children $\mathrm{children}(i)$ in the tree. We introduce a new variable which is the flow coming up in the tree (requests which are not already fulfilled by a server). It is denoted by $\mathit{flow}_i$ for the flow between $i$ and $\mathrm{parent}(i)$. Initially, $\forall i \in C$, $\mathit{flow}_i = r_i$ and $\forall i \in N$, $\mathit{flow}_i = -1$. Moreover, the set of replicas is empty in the beginning: $\mathit{repl} = \emptyset$.

Pass 1: We greedily select in this step some nodes which will process $W$ requests and which are as close to the leaves as possible. We place a replica on such nodes (see Algorithm 1). Procedure pass1 is called with $r$ (root of the tree) as a parameter, and it goes down the tree recursively in order to compute the flows. When a flow reaches or exceeds $W$, we place a replica since the corresponding server will be fully used, and we remove the processed requests from the flow going upwards.

At the end, if $\mathit{flow}_r = 0$ or ($\mathit{flow}_r \le W$ and $r \notin \mathit{repl}$), we have an optimal solution since all replicas which have been placed are fully used and all requests are satisfied by adding a replica in $r$ if $\mathit{flow}_r \neq 0$. In this case we skip pass 2 and go directly to pass 3.

Otherwise, we need some extra replicas since some requests are not satisfied yet, and the root cannot satisfy all the remaining requests. To place these extra replicas, we go through pass 2.
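Pass 1 as described above can be sketched in Python. This is an illustrative transcription under stated assumptions, not the authors' code: the `children` dictionary, the node names, and the use of "not yet in `flow`" in place of the $-1$ marker are choices made here.

```python
def pass1(children, r, W, root):
    """Bottom-up pass: place a replica wherever the upward flow reaches W.

    children: hypothetical map internal node -> list of children
    r:        map client -> number of requests
    Returns (flow, repl) with flow[k] = requests leaving k towards its parent.
    """
    flow = dict(r)          # clients start with flow_i = r_i
    repl = set()

    def visit(s):
        flow[s] = 0
        for i in children.get(s, []):
            if i not in flow:           # internal node not computed yet
                visit(i)                # recursive call
            flow[s] += flow[i]
        if flow[s] >= W:                # a replica here is fully used
            flow[s] -= W
            repl.add(s)

    visit(root)
    return flow, repl


# Root r has a client c0 (2 requests) and a node s1 with clients c1 (3), c2 (4).
children = {"r": ["s1", "c0"], "s1": ["c1", "c2"]}
flow, repl = pass1(children, {"c0": 2, "c1": 3, "c2": 4}, W=5, root="r")
print(flow["r"], repl)
```

On this small instance pass 1 places one replica on `s1` and leaves 4 requests at the root, which has no replica; by the rule stated above, one more replica on the root finishes the job and pass 2 is skipped.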

procedure pass1(node s ∈ N)
begin
    flow_s = 0;
    for i ∈ children(s) do
        if flow_i == -1 then pass1(i);  // Recursive call.
        flow_s = flow_s + flow_i;
    end
    if flow_s ≥ W then
        flow_s = flow_s - W;
        repl = {s} ∪ repl;
    end
end

Algorithm 1: Procedure pass1

Pass 2: In this pass, we need to select the nodes where to add replicas. To do so, while there are too many requests going up to the root, we select the node which can process the highest number of requests, and we place a replica there. The number of requests that a node $j \in N$ can eventually process is the minimum of the flows between $j$ and the root $r$, denoted $\mathit{uflow}_j$ (for useful flow). Indeed, some requests may have no server yet, but they might be processed by a server on the path between $j$ and $r$, where a replica has been placed in pass 1. Algorithm 2 details this pass.

If we exit this pass with $finish = -1$, this means that we have tried to place replicas on all nodes, but this solution is not feasible since there are still some requests which are not processed going up to the root. In this case, the original problem instance had no solution. However, if we succeed in placing replicas such that $flow_r = 0$, we have a set of replicas which can process all requests. We then go through pass 3 to assign requests to servers, i.e. to compute how many requests of each client should be processed by each server.

while flow_r ≠ 0 do
    freenode = N \ repl;
    if freenode == ∅ then
        finish = −1; exit the loop;
    // At each step, assign 1 replica and re-compute flows.
    child = children(r); uflow_r = flow_r;
    while child != ∅ do
        remove j from child;
        uflow_j = min(flow_j, uflow_{parent(j)});
        child = child ∪ children(j);
    end
    // The useful flows have been computed, select the max.
    maxuflow = 0;
    for j ∈ freenode do
        if uflow_j > maxuflow then
            maxuflow = uflow_j; maxnode = j;
    end
    if maxuflow ≠ 0 then
        repl = repl ∪ {maxnode};
        // Update the flows upwards.
        for j ∈ Ancestors(maxnode) ∪ {maxnode} do
            flow_j = flow_j − maxuflow;
        end
    else
        finish = −1; exit the loop;
end

Algorithm 2: Pass 2
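The second pass can be sketched as follows in Python. The data layout is an assumption made for illustration: `children` maps internal nodes to their children, `nodes` lists the internal nodes in the order used to break ties, and `flow` holds the flows left by pass 1 (the values below are those obtained on a made-up instance with three clients of 8, 7 and 6 requests and $W = 10$).

```python
def pass2(root, children, nodes, flow, repl):
    """Greedy second pass (sketch of Algorithm 2).
    Updates flow and repl in place; returns False when no feasible
    placement exists (the paper's finish = -1)."""
    parent = {c: p for p, cs in children.items() for c in cs}
    while flow[root] != 0:
        free = [n for n in nodes if n not in repl]
        if not free:
            return False
        # Useful flow: minimum of the flows on the path up to the root.
        uflow = {root: flow[root]}
        todo = list(children[root])
        while todo:
            j = todo.pop()
            uflow[j] = min(flow[j], uflow[parent[j]])
            todo.extend(children.get(j, []))
        # Free node of maximum useful flow (first one wins on ties).
        best = max(free, key=lambda n: uflow[n])
        if uflow[best] == 0:
            return False
        repl.add(best)
        moved = uflow[best]
        # Update the flows from the chosen node up to the root.
        j = best
        while j is not None:
            flow[j] -= moved
            j = parent.get(j)
    return True


# Hypothetical state after pass 1 with W = 10: the root r is a saturated
# replica and 11 requests still reach it.
children = {"r": ["a", "b", "d"], "a": ["c1"], "b": ["c2"], "d": ["c3"]}
nodes = ["r", "a", "b", "d"]
flow = {"c1": 8, "c2": 7, "c3": 6, "a": 8, "b": 7, "d": 6, "r": 11}
repl = {"r"}

ok = pass2("r", children, nodes, flow, repl)
print(ok, sorted(repl), flow["r"])
```

In this run node `a` is selected first (useful flow 8), after which every remaining useful flow is capped by the 3 requests still reaching the root, and `b` is selected next. Three replicas serve the 21 requests, which matches the lower bound $\lceil 21 / 10 \rceil = 3$.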

Pass 3 This pass is in fact straightforward, starting from the leaves and distributing the requests to the servers from the bottom until the top of the tree. We decide for instance to assign requests from clients starting from the left. Procedure pass3 is called with $r$ (root of the tree) as a parameter, and it goes down the tree recursively (cf. Algorithm 3). For $i \in C$, $r'_i$ is the number of requests of $i$ not yet assigned to a server (initially $r'_i = r_i$). $w_{s,i}$ is the number of requests of client $i$ assigned to server $s$, and $w_s$ is the total number of requests assigned to $s$. $C(s)$ is the set of clients in $\mathrm{subtree}(s)$ which still have some requests not assigned. Initially, $C(i) = \{i\}$ for $i \in C$, and $C(s) = \emptyset$ otherwise.

Note that a server which was computing $W$ requests in pass 1 may end up computing fewer requests if one of its descendants in the tree has earned a replica in pass 2. But this does not affect the optimality of the result, since we keep the same number of replicas.

procedure pass3(node s ∈ N)
begin
    w_s = 0;
    for i ∈ children(s) do
        if C(i) = ∅ then pass3(i); // Recursive call.
        C(s) = C(s) ∪ C(i);
    end
    if s ∈ repl then
        for i ∈ C(s) do
            if r'_i ≤ W − w_s then
                C(s) = C(s) \ {i}; w_{s,i} = r'_i;
                w_s = w_s + r'_i; r'_i = 0;
        end
        if C(s) ≠ ∅ then
            Let i ∈ C(s); x = W − w_s; r'_i = r'_i − x; w_{s,i} = x; w_s = W;
    end
end

Algorithm 3: Procedure pass3
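The assignment pass can be sketched in Python as below. Several choices are assumptions made for the example and are not in the text: the function returns the list of clients that still have unassigned requests (the role played by $C(s)$), `remaining` stores the $r'_i$, `w` stores the $w_{s,i}$ under keys `(server, client)`, and the partial assignment is skipped when the server is already full. The instance is made up.

```python
def pass3(s, children, remaining, repl, W, w):
    """Assign requests to replicas from the leaves upwards
    (sketch of Algorithm 3). Returns the clients of subtree(s)
    that still have unassigned requests."""
    pending = []
    for child in children[s]:
        if child in children:
            # Internal node: handle its subtree first (recursive call).
            pending += pass3(child, children, remaining, repl, W, w)
        elif remaining[child] > 0:
            pending.append(child)
    if s in repl:
        load, left = 0, []
        for i in pending:
            if remaining[i] <= W - load:
                # All pending requests of client i fit on s.
                w[(s, i)] = remaining[i]
                load += remaining[i]
                remaining[i] = 0
            else:
                left.append(i)
        if left and load < W:
            # Fill the spare capacity with part of one client's requests.
            i, x = left[0], W - load
            w[(s, i)] = x
            remaining[i] -= x
        pending = left
    return pending


# Hypothetical instance with W = 10 and replicas on a and r.
children = {"r": ["a", "c3"], "a": ["c1", "c2"]}
remaining = {"c1": 7, "c2": 6, "c3": 4}
repl = {"a", "r"}
w = {}

left = pass3("r", children, remaining, repl, 10, w)
print(w, left)
```

Client `c2` ends up split between `a` (3 requests) and `r` (3 requests), which is exactly what the Multiple strategy allows.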

The proof in Section 4.1.3 shows the equivalence between the solution built by this algorithm and any optimal solution, thus proving the optimality of the algorithm. The following example illustrates the step by step execution of the algorithm.

4.1.2 Example

Figure 6(a) provides an example of network on which we are placing replicas with the Multiple strategy. The network is thus homogeneous and we fix $W = 10$.

Pass 1 of the algorithm is quite straightforward to unroll. Figure 6(b) indicates the flow on each link, and the saturated replicas are the black nodes.

During pass 2, we select the nodes of maximum useful flow. Figure 6(c) represents these useful flows; we see that node $n_4$ is the one with the maximum useful flow (7), so we assign it a replica and update the useful flows. All the useful flows are then reduced down to 1 since there is only 1 request going through the root $n_1$. The first node of maximum useful flow 1 to be selected is $n_2$, which is set to be a replica of pass 2. The flow at the root is then 0 and it is the end of pass 2.

Finally, pass 3 assigns the servers to the clients and decides which requests are served by which replica (Figure 6(d)). For instance, the client with 12 requests shares its requests between $n_{10}$ (10 requests) and $n_2$ (2 requests). Requests are assigned from the bottom of the tree up to the top. Note that the root $n_1$, even though it was a saturated replica of pass 1, has only 5 requests to process in the end.

4.1.3 Proof of optimality

Let $R_{opt}$ be an optimal solution to an instance of the problem. The core of the proof consists in transforming this solution into an equivalent canonical optimal solution $R_{can}$. We will then show that our algorithm is building this canonical solution, and thus it is producing an optimal solution.

Each server $s \in R_{opt}$ is serving $w_{s,i}$ requests of client $i \in \mathrm{subtree}(s) \cap C$, and
\[ w_s = \sum_{i \in \mathrm{subtree}(s) \cap C} w_{s,i} \le W. \]
For each $i \in C$, $w_{s,i} = 0$ if $s \in N$ is not a replica, and $\sum \ldots$

Figure 6: Algorithm for the Replica Counting problem with the Multiple strategy ($W = 10$). Panels: (a) Initial network; (b) Pass 1; (c) Pass 2; (d) Pass 3.

We define the flow of node $k$, $flow_k$, by the number of requests going through this node up to its parent. Thus, for $i \in C$, $flow_i = r_i$, while for a node $s \in N$,
\[ flow_s = \sum_{i \in \mathrm{children}(s)} flow_i - w_s. \]
The total flow going through the tree, $tflow$, is defined in a similar way, except that we do not remove from the flow the requests processed by a replica, i.e. $tflow_s = \sum_{i \in \mathrm{children}(s)} tflow_i$. We thus have
\[ tflow_s = \sum_{i \in \mathrm{subtree}(s) \cap C} r_i. \]
These variables are completely defined by the network and the optimal solution $R_{opt}$.

A first lemma shows that it is possible to change request assignments while keeping an optimal solution. The flows need to be recomputed after any such modification.

Lemma 1. Let $s \in N \cap R_{opt}$ be a server such that $w_s < W$.
If $tflow_s \ge W$, we can change the request assignment between replicas of the optimal solution, in such a way that $w_s = W$.
Otherwise, we can change the request assignment so that $w_s = tflow_s$.

Proof. First we point out that the clients in $\mathrm{subtree}(s)$ can all be served by $s$, and since $R_{opt}$ is a solution, these requests are served by a replica somewhere in the tree. We do not modify the optimality of the solution by changing the $w_{s,i}$, it just affects the flows of the solution. Thus, for a given client $i \in \mathrm{subtree}(s) \cap C$, if there is a replica $s' \neq s$ on the path between $i$ and the root, we can change the assignment of the requests of client $i$. Let $x = \min(w_{s',i}, W - w_s)$: we cannot take more requests than $s'$ serves for $i$, nor more than the spare capacity of $s$. Then we move $x$ requests, i.e. $w_{s',i} = w_{s',i} - x$ and $w_{s,i} = w_{s,i} + x$. From the definition of $tflow_s$, we obtain the result, if we move all possible requests to $s$ until there are no more requests in the subtree or until $s$ is processing $W$ requests.
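The elementary move used in this proof can be written as a small Python helper. This is a sketch under assumptions: the assignment is stored in a dictionary `w` keyed by `(server, client)`, the names `s`, `top`, `c1`, `c2` are made up, and the amount moved is bounded both by what the other replica currently serves for the client and by the spare capacity of $s$, hence a minimum.

```python
def move_requests(w, s, s_prime, i, W):
    """Move as many requests of client i as possible from s_prime to s
    without exceeding the capacity W of s. Returns the amount moved."""
    load_s = sum(v for (srv, _), v in w.items() if srv == s)
    x = min(w.get((s_prime, i), 0), W - load_s)
    if x > 0:
        w[(s_prime, i)] -= x
        w[(s, i)] = w.get((s, i), 0) + x
    return x


# Hypothetical assignment: s serves 4 requests, its ancestor "top" serves 8.
W = 10
w = {("s", "c1"): 4, ("top", "c1"): 3, ("top", "c2"): 5}

moved_c2 = move_requests(w, "s", "top", "c2", W)
moved_c1 = move_requests(w, "s", "top", "c1", W)
load_s = sum(v for (srv, _), v in w.items() if srv == "s")
print(moved_c2, moved_c1, load_s)
```

After the two moves server `s` processes exactly $W$ requests, as in the first case of Lemma 1.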

We now introduce a new definition, completely independent from the optimal solution but related to the tree network. The canonical flow is obtained by distinguishing nodes which receive a flow greater than $W$ from the other nodes. We compute the canonical flow $cflow$ of the tree, independently of the replica placement, and define a subset of nodes which are saturated, $SN$. We also compute the number of saturated nodes in $\mathrm{subtree}(k)$, denoted $nsn_k$, for any node $k \in C \cup N$ of the tree.

For $i \in C$, $cflow_i = r_i$ and $nsn_i = 0$, and we then compute recursively the canonical flows for nodes $s \in N$. Let $f_s = \sum_{i \in \mathrm{children}(s)} cflow_i$ and $x_s = \sum_{i \in \mathrm{children}(s)} nsn_i$. If $f_s \ge W$ then $s \in SN$, $cflow_s = f_s - W$ and $nsn_s = x_s + 1$. Otherwise, $s$ is not saturated, $cflow_s = f_s$ and $nsn_s = x_s$. We can deduce from these definitions the following results:

Proposition 1. A non saturated node always has a canonical flow being less than $W$:
\[ \forall s \in N \setminus SN, \quad cflow_s < W. \]

Lemma 2. For all nodes $s \in C \cup N$, $cflow_s = tflow_s - nsn_s \times W$.

Corollary 1. For all nodes $s \in C \cup N$, $tflow_s \ge nsn_s \times W$.

Proof. Proposition 1 is trivial due to the definition of the canonical flow. Lemma 2 can be proved recursively on the tree.

This property is true for the clients: for $i \in C$, $nsn_i = 0$ and $tflow_i = cflow_i = r_i$.

Let $s \in N$, and let us assume that the property is true for all children of $s$. Then, $\forall j \in \mathrm{children}(s)$, $cflow_j = tflow_j - nsn_j \times W$.

If $s \notin SN$, $nsn_s = \sum_{j \in \mathrm{children}(s)} nsn_j$ and
\[ cflow_s = \sum_{j \in \mathrm{children}(s)} cflow_j = \sum_{j \in \mathrm{children}(s)} (tflow_j - nsn_j \times W) = tflow_s - nsn_s \times W. \]

If $s \in SN$, $nsn_s = \sum_{j \in \mathrm{children}(s)} nsn_j + 1$ and
\[ cflow_s = \sum_{j \in \mathrm{children}(s)} cflow_j - W = \sum_{j \in \mathrm{children}(s)} (tflow_j - nsn_j \times W) - W = tflow_s - (nsn_s - 1) \times W - W = tflow_s - nsn_s \times W, \]
which proves the result. Corollary 1 is trivially deduced from Lemma 2 since $cflow$ is a nonnegative function.
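The recursive definition of the canonical flow, and the identity of Lemma 2, can be checked numerically with a short Python sketch. The tree encoding (`children`, `requests`) and the instance are assumptions made for the example.

```python
def canonical(s, children, requests, W, cflow, nsn, tflow, SN):
    """Compute cflow, nsn and tflow for subtree(s), and collect the
    saturated nodes in SN, following the recursive definition."""
    if s in requests:
        # Client: cflow_i = tflow_i = r_i and nsn_i = 0.
        cflow[s] = tflow[s] = requests[s]
        nsn[s] = 0
        return
    for child in children[s]:
        canonical(child, children, requests, W, cflow, nsn, tflow, SN)
    f = sum(cflow[c] for c in children[s])
    x = sum(nsn[c] for c in children[s])
    tflow[s] = sum(tflow[c] for c in children[s])
    if f >= W:
        # s is saturated.
        SN.add(s)
        cflow[s] = f - W
        nsn[s] = x + 1
    else:
        cflow[s] = f
        nsn[s] = x


# Hypothetical instance with W = 10.
children = {"r": ["a", "b"], "a": ["c1", "c2"], "b": ["c3"]}
requests = {"c1": 7, "c2": 6, "c3": 9}
W = 10

cflow, nsn, tflow, SN = {}, {}, {}, set()
canonical("r", children, requests, W, cflow, nsn, tflow, SN)

# Lemma 2: cflow_s = tflow_s - nsn_s * W for every node and client.
lemma2_holds = all(cflow[k] == tflow[k] - nsn[k] * W for k in cflow)
print(sorted(SN), cflow["r"], tflow["r"], nsn["r"], lemma2_holds)
```

Here `a` and `r` are saturated, the total flow at the root is 22, and the canonical flow at the root is $22 - 2 \times 10 = 2$.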

We also show that it is always possible to move a replica into a free server which is one of its ancestors in the tree, while keeping an optimal solution:

Proposition 2. Let $R_{opt}$ be an optimal solution, and let $s \in R_{opt}$. If $\exists s' \in \mathrm{Ancestors}(s) \setminus R_{opt}$ then $R' \ldots$

Proof. $s'$ can handle all requests which were processed by $s$ since $s \in \mathrm{subtree}(s')$. We just need to redefine $w_{s',i} = w_{s,i}$ for all $i \in C$ and then $w_{s,i} = 0$.

We are now ready to transform $R_{opt}$ into a new optimal solution, $R_{sat}$, by redistributing the requests among the replicas and moving some replicas, in order to place a replica at each saturated node and assign $W$ requests to this replica. This transformation is done starting at the leaves of the tree, and considering all nodes of $SN$. Nothing needs to be done for the leaves (the clients) since they are not in $SN$.

Let us consider $s \in SN$, and assume that the optimal solution has already been modified to place a replica, and assign it $W$ requests, on all nodes in $subSN = SN \cap \mathrm{subtree}(s) \setminus \{s\}$.

We need to differentiate two cases:

1. If $s \in R_{opt}$, we do not need to move any replica. However, if $w_s \neq W$, we change the assignment of some requests while keeping the same replicas in order to obtain a workload of $W$ on server $s$. We do not remove requests from the saturated servers of $subSN$ which have already been filled. Corollary 1 ensures that $tflow_s \ge nsn_s \times W$, and $(nsn_s - 1) \times W$ requests should not move since they are assigned to the $nsn_s - 1$ servers of $subSN$. There are thus still at least $W$ requests of clients of $\mathrm{subtree}(s)$ which can possibly be moved on $s$ using Lemma 1.

2. If $s \notin R_{opt}$, we need to move a replica of $R_{opt}$ and place it in $s$ without changing the optimality of the solution. We differentiate two subcases.

(a) If $\exists s_1 \in \mathrm{subtree}(s) \cap R_{opt} \setminus SN$, then the replica placed on $s_1$ can be moved in $s$ by applying Proposition 2. Then, if $w_s \neq W$, we apply case 1 above to saturate the server.

(b) Otherwise, all the replicas placed in $\mathrm{subtree}(s)$ are also in $SN$, and the flow consumed by the already modified optimal solution is exactly $(nsn_s - 1) \times W$. It is easy to see that the flow (of the optimal solution) at $s$ is exactly equal to the total flow minus the consumed flow. Therefore, $flow_s = tflow_s - (nsn_s - 1) \times W$, and with the application of Corollary 1, $flow_s \ge W$.

The idea now consists in assigning the requests of this flow to node $s$ by removing work from the replicas upwards to the root, and rearranging the remaining requests to remove one replica. The flow $flow_s$ is going upwards to be processed by some of the $nr_s$ replicas in $\mathrm{Ancestors}(s) \cap R_{opt}$, denoted $s_1, \ldots, s_{nr_s}$, $s_1$ being the closest node from $s$. We can remove $W$ of these requests from the flow and assign them to a new replica placed in $s$. Let $w_{s_k,s} = \sum_{j \in \mathrm{subtree}(s) \cap C} w_{s_k,j}$. We have $\sum_{k=1..nr_s} w_{s_k,s} = flow_s$. We move these requests from $s_k$ to $s$, starting with $k = 1$. Thus, after the modification, $w_{s_1,s} = 0$. It is however possible that $w_{s_1} \neq 0$ since $s_1$ may process requests which are not coming from $\mathrm{subtree}(s)$. In this case, we are sure that we have removed enough requests from $s_k$, $k = 2..nr_s$, which can instead process the requests still handled by $s_1$. We can then remove the replica initially placed in $s_1$.

This way, we have not changed the assignment on replicas in $subSN$, but we have placed a replica in $s$ which is processing $W$ requests. Since we have at the same time removed the first replica on the path from $s$ to the root ($s_1$), we have not changed the number of replicas and the solution is still optimal.

Once we have applied this procedure up to the root, we have an optimal solution $R_{sat}$ in which a replica has been placed on every node of $SN$ and is processing $W$ requests. We will not change the assignment of these replicas any more in the following. Free nodes in the new solution are called F-nodes, while replicas which are not in $SN$ are called PS-nodes, for partially saturated.

In a next step, we further modify the $R_{sat}$ optimal solution in order to obtain what we call the canonical solution $R_{can}$. To do so, we change the request assignment of the PS-nodes: we saturate some of them as much as we can and we integrate them into the subset of nodes $SN$, redefining the $cflow$ accordingly. At the end of the process, $SN = R_{can}$.

The $cflow$ is still the flow which has not been processed by a saturated node in the subtree, and thus we can express it in a more general way:
\[ cflow_s = tflow_s - \sum_{s' \in SN \cap \mathrm{subtree}(s)} w_{s'}. \]

Note that this is totally equivalent to the previous definition as long as we have not modified $SN$. We also introduce a new flow definition, the non-saturated flow of $s$, $nsflow_s$, which counts the requests going through node $s$ and not served by a saturated server anywhere in the tree. Thus,
\[ nsflow_s = cflow_s - \sum_{i \in \mathrm{subtree}(s) \cap C} \ \sum_{s' \in \mathrm{Ancestors}(s) \cap SN} w_{s',i}. \]
This flow represents the requests that can potentially be served by $s$ while keeping all nodes of $SN$ saturated.

Lemma 3. In a saturated optimal solution, there cannot exist a PS-node in the subtree of another PS-node.

Proof. The non-saturated flow is $nsflow_s \le cflow_s$ since we further remove from the canonical flow some requests which are assigned upwards in the tree to some saturated servers.

Let $s \in R_{sat} \setminus SN$ be a PS-node. Its canonical flow is $cflow_s < W$. It can potentially process all the requests of the subtree which are not assigned to a saturated server upwards or downwards in the tree, thus $nsflow_s$ requests. Since $nsflow_s \le cflow_s < W$, we can change the request assignment to assign all these $nsflow_s$ requests to $s$, possibly removing some work from other non-saturated replicas upwards or downwards which were processing these requests. Thus, the replica on node $s$ is processing all the requests of $\mathrm{subtree}(s)$ which are not processed by saturated nodes.

If there was a non saturated replica in $\mathrm{subtree}(s)$, it could thus be removed since all the requests are processed by $s$. This means that a solution with a PS-node in the subtree of another PS-node is not optimal, thus proving the lemma.

At this point, we can move the PS-nodes as high as possible in $R_{sat}$. Let $s$ be a PS-node. If there is a free node $s'$ in $\mathrm{Ancestors}(s)$ then we can move the replica from $s$ to $s'$ using Proposition 2. Lemma 3 ensures that there are no other PS-nodes in $\mathrm{subtree}(s')$.

All further modifications will only alter nodes which have no PS-nodes in their ancestors. We define $N' = \{s \mid \mathrm{Ancestors}(s) \setminus SN = \emptyset\}$. Let $s \in N'$. Then $nsflow_s = cflow_s - \sum_{i \in \mathrm{subtree}(s) \cap C} \sum_{s' \in \mathrm{Ancestors}(s)} w_{s',i}$ since all ancestors of $s$ are in $SN$. Thus,
\[ nsflow_s = \sum_{s' \in \mathrm{subtree}(s) \setminus SN} w_{s'}. \]

By definition, $\forall s \in N$, $nsflow_s \le cflow_s$. Moreover, if $s \notin SN$, then $nsflow_s = w_s$ since $\mathrm{subtree}(s) \setminus SN$ is reduced to $s$ (no other PS-node under the PS-node $s$, from Lemma 3).

We introduce a new flow definition, the useful flow, which intuitively represents the number of requests that can possibly be processed on $s$ without removing requests from a saturated server.
\[ uflow_s = \min_{s' \in \mathrm{Ancestors}(s) \cup \{s\}} \{ cflow_{s'} \} \]
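The useful flow can be computed for all nodes in one top-down traversal, since $uflow_s = \min(cflow_s, uflow_{\mathrm{parent}(s)})$. The Python sketch below illustrates this; the tree encoding and the canonical flow values are made up for the example.

```python
def useful_flows(root, children, cflow):
    """Return uflow[s] = min of cflow over s and all its ancestors,
    computed from the root downwards."""
    uflow = {root: cflow[root]}
    stack = [root]
    while stack:
        p = stack.pop()
        for c in children.get(p, []):
            # The minimum over the path to the root extends the parent's one.
            uflow[c] = min(cflow[c], uflow[p])
            stack.append(c)
    return uflow


# Hypothetical canonical flows on a four-node tree.
children = {"r": ["a", "b"], "a": ["d"]}
cflow = {"r": 5, "a": 8, "b": 3, "d": 9}

uflow = useful_flows("r", children, cflow)
print(uflow)
```

Node `d` has a canonical flow of 9, but only 5 requests reach the root, so at most 5 requests can usefully be taken over by a replica placed on `d`.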

Lemma 4. Let $s \in N'$. Then $nsflow_s \le uflow_s$.

Proof. Let $s' \in \mathrm{Ancestors}(s)$. Since $s \in N'$, $s' \in SN$.
\[ cflow_{s'} \ge nsflow_{s'} = \sum_{s'' \in \mathrm{subtree}(s') \setminus SN} w_{s''} \]
But since $s \in \mathrm{subtree}(s')$, $\mathrm{subtree}(s) \setminus SN \subseteq \mathrm{subtree}(s') \setminus SN$, hence $nsflow_s \le nsflow_{s'}$. Note that $nsflow$ is a nondecreasing function (when going up the tree).

Thus, $\forall s' \in \mathrm{Ancestors}(s) \cup \{s\}$, $nsflow_s \le cflow_{s'}$, and by definition of the useful flow, $nsflow_s \le uflow_s$.

Now we start the modification of the optimal solution in order to obtain the canonical solution. At each step, we select a node $s \in N \setminus SN$ maximizing the useful flow. If there are several nodes of identical $uflow$, we select the first one in a depth-first traversal of the tree. We will prove that we can assign $uflow_s$ requests to this node without unsaturating any server of $SN$. $s$ is then considered as a saturated node, we recompute the canonical flows (and thus the useful flows) and reiterate the process until $cflow_r = 0$, which means that all the requests have been assigned to saturated servers.

Let us explain how to reassign the requests in order to saturate $s$ with $uflow_s$ requests. The idea is to remove some requests from $\mathrm{Ancestors}(s)$ in order to saturate $s$, and then to saturate the ancestors of $s$ again, by assigning them some requests coming from other non saturated servers.

First, we note that $uflow_s \le cflow_r = nsflow_r$. Thus,
\[ uflow_s \le \sum_{s' \in N \setminus SN} w_{s'} = w_s + \sum_{s' \in PS} w_{s'} \]
where $PS$ is the set of non saturated nodes without $s$. Let $x = uflow_s - w_s$. If $x = 0$, $s$ is already saturated. Otherwise, we need to reassign $x$ requests to $s$. From the previous equation, we can see that $\sum_{s' \in PS} w_{s'} \ge uflow_s - w_s = x$. There are thus enough requests handled by non saturated nodes which can be passed to $s$.

The number of requests of $\mathrm{subtree}(s) \cap C$ handled by $\mathrm{Ancestors}(s)$ is
\[ \sum_{s' \in \mathrm{Ancestors}(s)} \ \sum_{i \in \mathrm{subtree}(s) \cap C} w_{s',i} = cflow_s - nsflow_s \]
by definition of the flow. Now $cflow_s - nsflow_s \ge uflow_s - w_s = x$, so there are at least $x$ requests that $s$ can take from its ancestors.

Let $a_1 = \mathrm{parent}(s), \ldots, a_k = r$ be the ancestors of $s$. $x_j = \sum_{i \in \mathrm{subtree}(s) \cap C} w_{a_j,i}$ is the amount of requests that $s$ can take from $a_j$. We choose arbitrarily where to take the requests if $\sum_j x_j > x$, and do not modify the assignment of the other requests. We thus assume in the following that $\sum_j x_j = x$. Since these $x_j$ requests are coming from a client in $\mathrm{subtree}(s)$, we can assign them to $s$, and there are now only $W - x_j$ requests handled by $a_j$, which means that $a_j$ is temporarily unsaturated. However, we have given $x$ extra requests to $s$, hence $s$ is processing $w_s + x = uflow_s$ requests.

We finally need to reassign requests to $a_j$, $j = 1..k$, in order to saturate these nodes again, taking requests out of nodes in $PS$ (non saturated nodes other than $s$). This is done iteratively starting with $j = 1$ and going up to the root $a_k$. At each step $j$, we assume that $a_{j'}$, $j' < j$, have already been saturated again and we should not move requests away from them. However, we can still possibly take requests away from $a_{j''}$, $j'' > j$. In order to saturate $a_j$, we need to take:

- either requests from $\mathrm{subtree}(a_j) \cap C$ which are currently handled by $a_{j''}$, $j'' > j$, but without moving requests which are already assigned to $s$ (i.e. $\sum_{j'' > j} x_{j''}$);

- or requests from non saturated servers in $\mathrm{subtree}(a_j)$, except requests from $s$ and requests already given to $s$ that should not be moved any more (i.e. $\sum_{j' < j} x_{j'}$).

The number of requests that we can potentially assign to $a_j$ is therefore:
\[ X = \sum_{s' \in \mathrm{subtree}(a_j) \setminus SN \setminus \{s\}} w_{s'} + \sum_{i \in \mathrm{subtree}(a_j) \cap C} \ \sum_{s' \in \mathrm{Ancestors}(a_j)} w_{s',i} - \sum_{j' < j} x_{j'} - \sum_{j'' > j} x_{j''} \]
Let us show that $X \ge x_j$. Then we can use these requests to saturate $a_j$ again.

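\[ cflow_{a_j} = nsflow_{a_j} + \sum_{i \in \mathrm{subtree}(a_j) \cap C} \ \sum_{s' \in \mathrm{Ancestors}(a_j)} w_{s',i} = w_s + X + \sum_{j' < j} x_{j'} + \sum_{j'' > j} x_{j''} = X + w_s + x - x_j \]
But $cflow_{a_j} \ge uflow_s$ and $uflow_s - w_s = x$, so $X + w_s + x - x_j \ge w_s + x$, that is, $X \ge x_j$.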

Figure 7: The platform used in the reduction for Theorem 2.

It is thus possible to saturate $s$ and then keep its ancestors saturated. At this point, $s$ becomes a node of $SN$ and we can recompute the canonical and non saturated flows. We have removed $uflow_s$ requests which were processed by non saturated servers, so the $cflow$ and $nsflow$ of all ancestors of $s$, including $s$, should be decreased by $uflow_s$.

In particular, at the root, $cflow_r$ is updated to $cflow_r - uflow_s$, which proves that the contribution of $s$ on $cflow_r$ is $uflow_s$.

In the last step of the proof, we show that the modified canonical solution at the end of the iteration, $R_{can} = SN$, has exactly the same number of replicas as $R_{sat}$. In the saturated solution, each PS-node $s$ is processing $nsflow_s$ requests, while in the canonical solution, it is $uflow_s$. However, at every step when adding a saturated node $s$, we have $uflow_s$ at least as large as any of the $nsflow$ values. It is thus easy to see that the number of nodes in the canonical solution is less than or equal to the number of nodes in the saturated solution. Since the saturated solution is optimal, $|R_{can}| = |R_{sat}|$, which completes the proof.

Our algorithm builds $R_{can}$ in polynomial time, which establishes the polynomial complexity of the problem.

4.2 With homogeneous nodes and the Upwards strategy

Theorem 2. The instance of the Replica Counting problem with the Upwards strategy is NP-complete in the strong sense.

Proof. The problem clearly belongs to the class NP: given a solution, it is easy to verify in polynomial time that all requests are served and that no server capacity is exceeded. To establish the completeness in the strong sense, we use a reduction from 3-PARTITION [3]. We consider an instance