Performance evaluation of redirection schemes in content distribution networks


Jussi Kangasharju, Keith W. Ross
Institut Eurecom, B.P. 193
Sophia Antipolis, France

James W. Roberts
France Telecom R&D
Issy-les-Moulineaux, France

Abstract

Content distribution on the Web is moving from an architecture where objects are placed on a single, designated server to an architecture where objects are replicated on geographically distributed servers and clients transparently access a nearby copy of an object. In this paper we study how the different redirection schemes used in modern content distribution networks affect the user-perceived performance in normal Web page viewing. Using both simulations and experiments with real Web servers we show that redirection schemes that require clients to retrieve different parts of a Web page from different servers yield sub-optimal performance compared to schemes where a client accesses only one server for all the parts of a Web page. This implies that when replicating Web pages, we should treat the whole page (HTML and images) as a single entity.

1 Introduction

Content distribution on the Web is moving from an architecture where objects are placed on a single, designated server to an architecture where objects are replicated on geographically distributed servers and clients transparently access a nearby copy of an object [1,2,6,11,12]. The new architectures are constructed from a set of servers, which we call content servers, that contain copies of the objects. These copies can be created statically using some pre-determined rules, or dynamically on-demand depending on the load and client request patterns. When a client wants to retrieve an object, it contacts a mapping service that provides the client with an address of a content server that has a copy of the requested object. There have already been some proposals for such architectures [4,10,15] and several companies have started to offer dynamic content distribution services over their own networks [1,2,6,11].

A vital component of a content distribution architecture is a method for redirecting clients to the content servers. What is common to most of the proposed architectures is that the client is redirected to the content server by the system. This means that the system must contain mechanisms for determining the best content server for each client. On the other hand, the proposed architectures are transparent to the client, i.e., they do not require modifying the clients or installing new software at the client side. We discuss the details of the different redirection methods in Section 2.

In this paper we study how different redirection schemes affect the user-perceived performance. As the measure for performance we use the total time to download all objects on a Web page, i.e., both the HTML and embedded images. Some redirection schemes require that the client retrieve some part of a Web page (e.g., the HTML part) from one server and the remaining parts from another server. If the client is using the persistent connections of HTTP/1.1 [7], this means that the client cannot benefit from previous requests that have opened the underlying TCP congestion window; instead the client must open a new connection to another server, and this connection will initially suffer from a small congestion window. Of course, if the new server is significantly closer than the old server, the client can retrieve the remaining objects faster from the new server.
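The congestion-window penalty described above can be illustrated with a simple slow-start model. This is a back-of-the-envelope sketch of our own, not the paper's simulation code; the segment count, initial windows, and RTT are assumed values.

```python
import math

def slowstart_rounds(segments, initial_cwnd=1):
    """Round trips TCP slow-start needs to deliver `segments` full-size
    segments, doubling the congestion window every round."""
    sent, cwnd, rounds = 0, initial_cwnd, 0
    while sent < segments:
        sent += cwnd
        cwnd *= 2
        rounds += 1
    return rounds

def download_time(segments, rtt, initial_cwnd=1, handshake=True):
    """Rough latency model: an optional SYN handshake (one RTT) plus one
    RTT per slow-start round; serialization time is ignored."""
    return (rtt if handshake else 0.0) + slowstart_rounds(segments, initial_cwnd) * rtt

# 40 KB of image data is ~28 segments of 1460 bytes (the MSS used in Section 3).
segments = math.ceil(40_000 / 1460)
# Warm persistent connection whose window has already grown to 8 segments:
warm = download_time(segments, rtt=0.1, initial_cwnd=8, handshake=False)
# Cold connection to a new server at the same RTT:
cold = download_time(segments, rtt=0.1, initial_cwnd=1, handshake=True)
```

With equal RTTs the cold connection takes roughly twice as long here (6 RTTs versus 3), which is why the new server must be substantially closer before switching pays off.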

Using simulations and experiments on the Web we evaluate the performance of different redirection strategies and how they affect the download time of the whole Web page in different situations. We evaluate the performance of the redirection strategies using both multiple parallel connections and persistent connections with pipelining.

1.1 Related Work

Nielsen et al. [14] studied the performance of persistent connections and pipelining; their results show that pipelining is essential to make persistent connections perform better than multiple, non-persistent connections in parallel. Although modern browsers implement persistent connections, they do not implement pipelining [18]. For this reason the popular browsers open several persistent connections in parallel to a server.

Recently several companies [1,2,6,11] have begun to offer content distribution services. In their services, the content is distributed over several geographically dispersed servers and clients are directed to one of these servers using DNS redirection. We discuss DNS redirection in more detail in Section 2.

Rodriguez et al. [16] study parallel access schemes where the client requests different parts of one object from different servers. Their scheme is designed for large objects and is not well suited for typical web page viewing; it also requires modifications to client software. In our work we study the performance of currently employed redirection schemes, which redirect the client to a single server but require no modifications to client software.

This paper is organized as follows. Section 2 presents the different redirection schemes used in real-world systems. Section 3 describes the model used in our simulations and Section 4 presents the results obtained in the simulations. Section 5 presents the results obtained in experiments on the real network. Section 6 discusses the implications of our results. Finally, Section 7 concludes the paper and presents directions for future work.

2 Redirection to Servers

Clients can be redirected to servers with several different methods. For example, the origin server could redirect clients using the appropriate HTTP reply codes, the client could be given a list of alternative content servers, or the system could use other mechanisms, such as DNS redirection. These mechanisms all impose different overheads on the user-perceived performance, which we discuss in Section 6. For the remainder of this paper we assume that the system uses DNS redirection (or a similar method) because of its widespread use in the real world.

Currently the content distribution companies redirect clients using DNS redirection in two different ways. In both redirection schemes the client sends a DNS query to the authoritative DNS server of the content distribution company, which replies with an IP address for a content server that the authoritative DNS server deems to be the best for the client. (The reply can include the IP addresses of multiple servers, but modern clients use only one of them.) The advantage of using DNS redirection is that it does not require any modifications to the client software, because URLs typically identify hosts by their names.
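As a sketch, the authoritative server's decision can be thought of as a lookup keyed on the querying client's network; the hostnames and RTT figures below are invented for illustration and do not describe any particular distributor's system.

```python
# Hypothetical RTT measurements (ms) from each client network to each
# content server; a real system would gather these continuously.
SERVER_RTT_MS = {
    "client-net-eu": {"cs-paris.example.net": 15, "cs-nyc.example.net": 90},
    "client-net-us": {"cs-paris.example.net": 95, "cs-nyc.example.net": 12},
}

def resolve(client_net):
    """Answer a DNS query with the content server the authoritative DNS
    server deems best for this client (here: lowest measured RTT)."""
    candidates = SERVER_RTT_MS[client_net]
    return min(candidates, key=candidates.get)
```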

The two different schemes are as follows. In the first scheme, which we call full redirection, the content distributor has complete control over the DNS mapping of the origin server. When a client requests any object from the origin server, it will get redirected to a content server. This scheme requires that either all content servers replicate all the content from the origin server, or that the content servers act as surrogate proxies for the origin server. A major advantage of full redirection is that it adapts dynamically to new hot-spots because all client requests are redirected to geographically dispersed content servers.

The second scheme, which we call selective redirection, goes as follows. The references to replicated objects are changed to point to a server in the content distribution network. When the client wants to retrieve a replicated object, it resolves the hostname, which redirects it to a content server. In this scheme, the replicated objects appear to be simply objects that are served from another server. An advantage of this scheme is that the content servers only need to hold the content that has been replicated. In modern content distribution networks that use selective redirection, the burden of deciding which objects to replicate is placed on the content provider. A system using selective redirection is slower to adapt to hot-spots because it must first identify them, change the references to the new hot objects, and possibly replicate the objects to content servers. In addition, clients that have cached references to the new hot objects (e.g., caching the HTML page referencing a hot image) would not be redirected, but would instead go to the origin server, thus negating the benefits of using a content distribution network.
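The reference rewriting that selective redirection relies on can be sketched as follows; the CDN hostname and the `src` attribute matching are illustrative assumptions, not a description of any particular distributor's system.

```python
import re

# Hypothetical hostname served by the content distributor's DNS.
CDN_HOST = "cdn.example-distributor.net"

def rewrite_references(html, replicated):
    """Selective redirection: rewrite only the tagged (replicated) object
    references so their hostname resolves to a content server; all other
    references still name the origin server."""
    def rewrite(match):
        url = match.group(1)
        if url in replicated:
            return 'src="http://%s/%s"' % (CDN_HOST, url.lstrip("/"))
        return match.group(0)   # untouched: served by the origin
    return re.sub(r'src="([^"]+)"', rewrite, html)

page = '<img src="/logo.gif"><img src="/photo.jpg">'
out = rewrite_references(page, replicated={"/logo.gif"})
```

Only `/logo.gif` ends up pointing at the distributor; `/photo.jpg` still goes to the origin, which is exactly the split that slows adaptation to new hot objects.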

3 Simulation Model

For the simulations we used the NS network simulator [13]. We used a very simple network topology, shown in Figure 1: C is the client, S is the server, and L is the link between the two. To represent different network conditions we varied the delay, bandwidth, and loss rate on the link L. We used FullTCP agents at both the client and server and we set the MSS to 1460 bytes. This is the MSS that an Ethernet-connected machine would obtain, and we have obtained the same MSS on real connections to distant servers from our local Ethernet. As suggested in [14], we disabled Nagle's algorithm on the server's TCP agent.

Figure 1: Simulation model (client C connected to server S over link L)

In all of our simulations, the client first sent a request to the server, and the server replied with one file (the HTML file). When the client had received all of this file, it requested the images from the server.

We observed that because all our HTTP connections were short, the underlying TCP connection never progressed beyond slow-start. Therefore the bandwidth of the link had only a minimal effect on the overall download time. This is also shown by the graphs in Figure 2: the download time is practically constant once the bandwidth is greater than 1 Mbit/s. This is also true in the case when the loss rate on the link is high (Figure 2b, averaged over 3000 simulation runs).

[Figure 2 plots download time in seconds against bandwidth in Mbps, with one curve per RTT (20, 60, 100, 120, 160, and 200 ms): (a) no loss; (b) 5% loss rate.]

Figure 2: Effect of bandwidth on download time

Because of the negligible effect of the bandwidth, we compared the different redirection schemes only by varying the round-trip times and loss rates. Our simulation model does not account for server loads or for how the client obtains the redirection; we discuss these issues in Section 6.
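The negligible effect of bandwidth can be reproduced with a crude latency model (our own illustration, with assumed page size): as long as each slow-start burst serializes in less than one RTT, the link speed drops out of the total.

```python
def transfer_time(nbytes, rtt, bandwidth_bps, mss=1460):
    """Crude model of a short transfer: each slow-start round costs the
    larger of one RTT or the burst's serialization time."""
    segments = -(-nbytes // mss)          # ceil division
    sent, cwnd, total = 0, 1, 0.0
    while sent < segments:
        burst = min(cwnd, segments - sent)
        total += max(rtt, burst * mss * 8 / bandwidth_bps)
        sent += burst
        cwnd *= 2
    return total

# A 30 KB page at RTT 100 ms: going from 1 to 10 Mbit/s changes nothing,
# while dropping to 100 kbit/s finally makes bandwidth the bottleneck.
t_1m = transfer_time(30_000, 0.1, 1_000_000)
t_10m = transfer_time(30_000, 0.1, 10_000_000)
t_100k = transfer_time(30_000, 0.1, 100_000)
```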

3.1 Redirection Schemes

We compared the redirection schemes for two different clients: the first does not implement pipelining and opens several persistent connections in parallel; the second implements pipelining. Given the typical number of embedded images on a page (see Section 3.2) and the number of persistent connections opened by popular browsers (2 or 6, see [18]), we assumed that the client opening parallel connections cannot retrieve all images with one set of connections. Instead, it first sends one batch of requests and, when it receives the replies, it requests the second batch; this matches the behavior of a client implementing persistent connections without pipelining.
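The batching behavior above can be stated as a one-line model; this is our sketch, with the image count and connection counts taken from Section 3.2 and [18].

```python
import math

def request_batches(num_images, num_connections):
    """A client without pipelining keeps at most one outstanding request
    per connection, so fetching all images takes ceil(images/connections)
    batches, each costing at least one round trip."""
    return math.ceil(num_images / num_connections)

# ~16 embedded images on a typical page, with 2 or 6 parallel persistent
# connections as in popular browsers:
batches_2 = request_batches(16, 2)   # 8 batches
batches_6 = request_batches(16, 6)   # 3 batches
```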

The baseline for our comparisons was a scheme where the client retrieves all objects from the origin server. We compared this baseline scheme to other schemes where the client retrieves the HTML from the origin server and is redirected to another server for the images. To model the second server, we simply used two models from Figure 1 in parallel; the only connection between them was that when the first model had completed its download, it triggered the second model. We used six different round-trip time values for the servers: 10, 20, 60, 100, 120, and 160 ms. We have observed that the value of 160 ms is quite typical from Europe to popular web sites in the US, and the smaller values reflect conditions within the US.

Our model does not account for the delay caused by a potential DNS lookup to get the address of the second server. We also assume that the parallel connections do not interfere with one another, and represent them by one connection which retrieves one fifth of the image data on the page. In reality, the download time of the parallel client would depend on how the available bandwidth is shared between all the parallel connections. Modern clients also start requesting embedded images as soon as they have seen the references to them, instead of waiting for the whole HTML page to download. Our model does not take this fully into account, but the behavior of our model is appropriate for images that are referenced near the end of the HTML page.

3.2 Files

To estimate the size of a Web page, we downloaded all the home pages of the most popular sites from Hot100.com [8]. We found that the mean file size is 20 KB. This only includes the actual HTML for the page and does not include any embedded objects. The mean amount of embedded image data on a page is 40 KB, the mean number of embedded images on a page is 15.5, and the mean size of a single embedded image is 2.5 KB. Most of the HTML pages are at least 5 KB and almost none of the HTML pages are larger than 45 KB. To retrieve a typical home page with all the embedded images we need to transfer around 50-60 KB, and the total amount of data (HTML and images) can be as high as 250 KB. We constructed four different-sized pages in order to cover as many different real pages as possible. Table 1 shows the different pages and the amount of HTML and image data on each of them. We also show the amount of image data retrieved by the parallel client, which was equal to one fifth of the total image data.

Page      HTML    Images   Parallel
Small      5 KB   10 KB     2 KB
Medium    10 KB   20 KB     4 KB
Large     20 KB   40 KB     8 KB
X-Large   40 KB   80 KB    16 KB

Table 1: Different pages used in simulations

4 Simulation Results

In this section we present the results from our simulations. We ran simulations for all the different parameter values (RTT and loss rate) but, because of space limitations, we only show some of the combinations here. The results for the simulation runs not shown here were similar.

4.1 Loss Free Conditions

We first simulated retrievals under completely loss-free conditions. In Figure 3 we show the performance of the baseline scheme, i.e., retrieving everything from one server. We plot one curve for each different origin server (RTT_o), and a point on a curve shows the download time, relative to the download time from the origin server, if we had used a server with the RTT given by the x-axis value for all the objects. The plot shows the download times for a client using parallel connections. We observed only very small differences between different file sizes.

[Figure 3 plots relative download time against the RTT (in ms) of the server used for all objects, with one curve per origin-server RTT (RTT_o = 20, 60, 100, 120, and 160 ms).]

Figure 3: Performance of the baseline scheme

In Figure 4 we show the relative download time of the selective redirection scheme for a client using parallel connections and for the Medium and Large pages. We plot one curve for each different origin server (RTT_o). A point on a curve shows the download time relative to the baseline that a client would obtain if the origin server had a round-trip time of RTT_o and the new server had the RTT on the x-axis. The baseline in these graphs refers to retrieving all objects from the origin server with round-trip time RTT_o. As we can see, typically we need the RTT to the new server to be less than 75% of RTT_o in order for switching to the new server to be worthwhile. In Figure 5 we show the results for a client using pipelining. We can see that in this case, the RTT to the new server should be less than 50% of RTT_o.

[Figure 4 plots download time relative to baseline against the new server's RTT (in ms), one curve per RTT_o (20, 60, 100, 120, and 160 ms): (a) Medium page; (b) Large page.]

Figure 4: Download times with parallel connections

Comparing Figures 4 and 5, we see that for parallel connections the slope of the curves is smaller, meaning that a small reduction in RTT yields only a small reduction in total download time. For pipelining the slope is larger, meaning larger gains for small reductions in RTT. Overall, we see that the maximum gain in download time is around 30% of the baseline. This is achieved when the new server is extremely close (RTT around 10 ms).

[Figure 5 plots download time relative to baseline against the new server's RTT (in ms), one curve per RTT_o (20, 60, 100, 120, and 160 ms): (a) Medium page; (b) Large page.]

Figure 5: Download times with pipelining

As we can see in Figure 3, if the origin server is slow (100 ms or more), we can get impressive gains if we are able to access a nearby server for all of the page. By choosing to go to a nearby server instead of the origin server for all of the page, we can download it in 10% of the time it would have taken to get the page from the origin server. The possible gains of going to a nearby fast server are much greater than the achievable gains obtained with schemes where the client must switch servers during the download. This holds for both parallel persistent connections and pipelining connections. We also see that the schemes which switch servers (Figures 4 and 5) can obtain at most a 30% reduction in download time. In the same situation, by going directly to the nearby fast server, the client would reduce the download time by 90%. In other words, even if the new server were fast enough to warrant switching to it, the client would be much better off going to that fast server for all the objects. We can conclude that under good network conditions, switching servers during download gives sub-optimal performance compared to a scheme which redirects the clients to a good server for all the objects.

4.2 Simulations with Loss in Network

We then ran the same simulations using a 2% loss rate on all the links in the simulation. The results in all cases were similar to the ones obtained under loss-free conditions. Figure 6 shows the performance of the baseline scheme, i.e., retrieving everything from the origin server with round-trip time RTT_o. As with the no-loss situation, the differences between different file sizes were very small.

When we compare Figures 3 and 6, we see that in the no-loss situation the maximum gains are larger than in the situation where there is loss on the link. We believe this is because the underlying TCP connection is still in slow-start and therefore its RTT estimate has not yet adapted well enough to the link RTT. Hence, the RTT estimates for the links with small RTTs are too high, and discovering a lost packet takes more time than it would if the TCP connection knew the RTT better.

In Figure 7 we show the relative download times of the selective redirection scheme for a client using parallel connections and the Medium and Large pages, averaged over 3000 simulation runs.

[Figure 6 plots relative download time against the RTT (in ms) of the server used for all objects, one curve per RTT_o (20, 60, 100, 120, and 160 ms).]

Figure 6: Performance of the baseline scheme with loss

[Figure 7 plots download time relative to baseline against the new server's RTT (in ms), one curve per RTT_o (60, 100, 120, and 160 ms): (a) Medium page; (b) Large page.]

Figure 7: Download times with parallel connections with loss

As we can see, the general form of the curves matches those obtained in loss-free conditions (Figures 4 and 5). The only difference is that the point where switching servers would become useful is lower than in the loss-free case. Furthermore, the maximum gain in download time is less than 20%, and for Small pages switching servers always resulted in a slower download. We believe that the reason for the stricter RTT requirement for the new server is the possibility of losing TCP SYN packets when opening the connection to the new server. Most of the downloads took less than 2 seconds from start to finish, but a lost SYN packet caused a 6-second timeout. This slows down the connection to the new server considerably and gives a substantial advantage to using the persistent connection to the old server. Also, the TCP connections are in slow-start and therefore do not have an accurate estimate of the RTT, so discovering lost packets will take longer on the new connection. If a packet on the persistent connection is lost, it will be discovered faster, either because of duplicate ACKs or because the TCP agents have an idea of the connection round-trip time and know when to expect packets.
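The SYN-loss penalty can be quantified with a small expected-value sketch. This is our simplification: a fixed 6 s retransmission timeout and independent losses, ignoring exponential backoff.

```python
def expected_setup_penalty(loss_rate, syn_timeout=6.0):
    """Expected extra connection-setup delay from lost SYNs: the expected
    number of failed attempts before one succeeds is p/(1-p), and each
    failure costs a full retransmission timeout."""
    return syn_timeout * loss_rate / (1.0 - loss_rate)

# At the 2% loss rate of Section 4.2, a fresh connection carries an
# expected ~0.12 s setup penalty -- a noticeable fraction of downloads
# that mostly finished in under 2 s, with a 6 s stall in the worst case.
penalty = expected_setup_penalty(0.02)
```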

The results confirm our conclusions from the loss-free case. It is preferable not to switch servers during the download of a single page. Switching servers greatly limits the gains in performance obtained from using a nearby server for all the objects. This means that a system which forces clients to switch servers during the download of a single page will provide clients with sub-optimal service: either the new server is not fast enough, or, even if it is, the client should have been redirected to the fast server for all the objects.

5 Experiments

To validate the results obtained from our simulations, we ran several experiments in which we retrieved objects from real, replicated web sites. We used three different sites, Apache [3], Debian [5], and Squid [17], and chose several mirror servers of those sites for our experiments. From each of the three main sites we selected some files that matched the files in our simulations as closely as possible. For Apache and Squid our files were close to the Small page in Table 1, and for Debian they were similar to the Medium page.

Our goal was to have our client behave like a modern browser, i.e., no pipelining. We chose one HTML file and one image file from each site and divided the servers into pairs. The client would first request both files from both servers in the pair, and then request the HTML from the first server and the image from the other server in the pair. Before requesting the objects we performed a DNS query on the hostnames in order to eliminate the effects of long DNS lookups. We ran our experiments several hundred times, at different times of day and on several days.
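The measurement loop can be sketched as below; the connection class is injectable so the scheme can be dry-run without a network, and the hostnames in the comment are placeholders, not the actual mirrors used.

```python
import time
from http.client import HTTPConnection

def timed_download(fetches, connect=HTTPConnection):
    """Time a sequence of (host, path) retrievals, reusing one persistent
    HTTP/1.1 connection per host, as the experiment client does. Returns
    the elapsed time and the number of connections opened."""
    conns = {}
    start = time.monotonic()
    for host, path in fetches:
        conn = conns.setdefault(host, connect(host))
        conn.request("GET", path)
        conn.getresponse().read()
    elapsed = time.monotonic() - start
    for conn in conns.values():
        conn.close()
    return elapsed, len(conns)

# Scheme AB: HTML from server A, image from server B (two connections);
# scheme BB: everything from server B (one persistent connection), e.g.:
#   timed_download([("mirror-a.example.org", "/index.html"),
#                   ("mirror-b.example.org", "/logo.gif")])
```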

We show the results from 7 of our experiments in Table 2. In each experiment we used a different pair of servers to get as many different combinations as possible. The RTTs to the two servers, shown in columns RTT_A and RTT_B, reflect typical RTT values from our client machine to the servers in the experiment. We used server A as the baseline and show the relative download times for two other download schemes in columns AB and BB. Column AB refers to experiments where the client retrieved the HTML from server A and the image data from server B. In column BB we show the relative download time when the client retrieved everything from server B.

Experiment   RTT_A    RTT_B   AB    BB
Apache-1      80 ms    25 ms  0.77  0.14
Apache-2      60 ms    50 ms  1.34  0.58
Debian-1     100 ms    40 ms  0.98  0.25
Debian-2     180 ms    90 ms  0.97  0.72
Debian-3      80 ms    65 ms  1.26  0.79
Squid-1      200 ms    45 ms  0.73  0.20
Squid-2       70 ms    45 ms  1.03  0.59

Table 2: Results from experiments

The results we obtained in our experiments closely match those we obtained in our simulations. In fact, for all the experiments shown in Table 2, our simulations correctly estimated whether switching servers would result in a gain in relative download time or not.

6 Discussion

Our results show that the client can download a whole web page fastest if it is using persistent connections to a nearby content server (possibly using parallel persistent connections for embedded images). Of the two redirection schemes, full and selective redirection, full redirection achieves this goal easily since all requests to the origin server are redirected to a nearby content server. Selective redirection can achieve it only if all objects on a web page are replicated in the same way, and thus client requests for the objects would be redirected to the same content server. This puts the burden of ensuring efficiency on the party deciding which objects to replicate. Some modern systems put this responsibility on the content provider by allowing the content provider to tag individual objects for replication. We feel that ensuring efficient delivery of content to the clients should be the responsibility of the content distributor; in fact, efficient delivery of content is exactly the reason why content providers enlist the services of content distributors.

Our work is based on the assumption that the client is able to open persistent connections to all servers, although we do not assume pipelining of requests. Even though the content provider may run a web server that does not implement persistent connections, the content distributor can implement persistent connections in its own content servers. If the content provider's web server does not implement persistent connections, then the RTT threshold for switching servers would be higher (because new connections to the origin server would go through slow-start again). If a suitable content server exists, the client would be better off using that server for all objects.

Our simulation model does not account for two important factors, namely server loads and redirection costs. A major reason for distributing content on several servers is to take load off the origin server and distribute it among the content servers. Increasing the load on the origin server in our model would have the effect of making it more attractive to switch servers, i.e., it would lower the RTT threshold for switching. On the other hand, our model assumes that the client knows the address of the new content server. In reality, the client would have to obtain this address somehow before contacting the server. This would make switching servers less attractive, i.e., raise the RTT threshold.

The cost of getting the redirection depends on the redirection technique used. If the system is using DNS redirection, then the client can expect to spend at least 200 ms getting the address of the content server [9]. In the worst case, the DNS lookup can take several seconds to resolve. In our simulations and experiments all the downloads took only a few seconds at most, so a long DNS lookup would have caused a significant slow-down for switching servers. If the client needs to contact the origin server to get the redirection, this adds at least one round-trip time to the origin server.
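The arithmetic is easy to make concrete (a sketch with assumed numbers, using the 200 ms lower bound from [9]):

```python
def time_with_redirection(download_s, dns_lookup_s=0.2, origin_rtt_s=0.0):
    """User-perceived time when switching servers: the DNS lookup for the
    content server's address, plus an extra origin round trip if the
    redirection itself comes from the origin server, plus the download."""
    return dns_lookup_s + origin_rtt_s + download_s

# A 1.5 s baseline download that switching would shorten by 30% loses the
# entire gain to a single 450 ms DNS lookup:
gain = 1.5 - time_with_redirection(1.5 * 0.7, dns_lookup_s=0.45)
```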

7 Conclusion

In this paper we have evaluated the performance of the different client redirection schemes used in modern content distribution networks. Using both simulations and experiments on the real network, we have found that redirection schemes which force clients to retrieve the objects on a web page from multiple servers always yield sub-optimal performance, in terms of the overall client download time, compared to schemes which allow the client to retrieve all objects from one good server. This implies that when replicating web pages, we must treat the HTML page and the embedded images as a single entity and replicate either all or none of them. Full redirection achieves this goal and yields superior performance compared to selective redirection, which may split the web page between several servers.

In our future work we will expand our simulation models to include several clients and explore the effects of different network topologies on the results. We will improve our simulation models by including other parameters, such as server load. We will also carry out a more extensive set of experiments to validate our conclusions.

References

[1] Adero. <URL:http://www.adero.com/>.

[2] Akamai. <URL:http://www.akamai.com>.

[3] The Apache Software Foundation. <URL:http://www.apache.org/>.

[4] S. M. Baker and B. Moon. Distributed cooperative web servers. In Proceedings of The Eighth International WWW Conference, Toronto, Canada, May 11-14, 1999.

[5] Debian GNU/Linux web site. <URL:http://www.debian.org>.

[6] Digital Island Inc. <URL:http://www.digisle.net>.

[7] R. T. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee. RFC 2616: Hypertext Transfer Protocol -- HTTP/1.1, June 1999.

[8] 100hot sites. <URL:http://www.hot100.com/>.

[9] J. Kangasharju and K. W. Ross. A replicated architecture for the domain name system. In Proceedings of IEEE Infocom, Tel Aviv, Israel, March 26-30, 2000.

[10] J. Kangasharju, K. W. Ross, and J. W. Roberts. Locating copies of objects using the domain name system. In Proceedings of the 4th Web Caching Workshop, San Diego, CA, March 31 - April 2, 1999.

[11] Mirror Image Internet. <URL:http://www.mirror-image.com>.

[12] A distributed testbed for national information provisioning. <URL:http://ircache.nlanr.net/>.

[13] UCB/LBNL/VINT network simulator - ns (version 2). <URL:http://www-mash.cs.berkeley.edu/ns/>.

[14] H. F. Nielsen, J. Gettys, A. Baird-Smith, E. Prud'hommeaux, H. W. Lie, and C. Lilley. Network performance effects of HTTP/1.1, CSS1, and PNG. In Proceedings of ACM SIGCOMM, Cannes, France, September 14-18, 1997.

[15] M. Rabinovich and A. Aggarwal. RaDaR: A scalable architecture for a global web hosting service. In Proceedings of The Eighth International WWW Conference, Toronto, Canada, May 11-14, 1999.

[16] P. Rodriguez, A. Kirpal, and E. W. Biersack. Parallel-access for mirror sites in the Internet. In Proceedings of IEEE Infocom, Tel Aviv, Israel, March 26-30, 2000.

[17] Squid Internet Object Cache. <URL:http://squid.nlanr.net/>.

[18] Z. Wang and P. Cao. Persistent connection behavior of popular browsers. <URL:http://www.cs.wisc.edu/~cao/papers/persistent-connection.html>.


Jussi Kangasharju received his Master of Science in Technology from the Department of Computer Science and Engineering at Helsinki University of Technology, Finland, in 1998. He received his DEA from Ecole Superieure des Sciences Informatiques (ESSI), France, in 1998. He is currently pursuing PhD studies at the University of Nice (Sophia Antipolis) on web content distribution in the Multimedia Communications Department of Eurecom. His research interests include content distribution, web caching, and Internet protocols.

Keith Ross received his Ph.D. from the University of Michigan in 1985 (Program in Computer, Information and Control Engineering). He was a professor at the University of Pennsylvania from 1985 through 1997. At the University of Pennsylvania, his primary appointment was in the Department of Systems Engineering and his secondary appointment was in the Wharton School. He joined the Multimedia Communications Department at Institut Eurecom in January 1998, and became department chairman in October 1998. In Fall 1999, while remaining a professor at Institut Eurecom, he co-founded and became CEO of Wimba.com. Keith Ross has published over 50 papers and written two books. He has served on the editorial boards of five major journals, and has served on the program committees of major networking conferences, including Infocom and Sigcomm. He has supervised more than ten Ph.D. theses. His research and teaching interests include multimedia networking, asynchronous learning, Web caching, streaming audio and video, and traffic modeling.

Jim Roberts has a BSc in mathematics from the University of Surrey, UK, and a PhD from the University of Paris. Since graduating in 1970, he has worked in the field of teletraffic engineering and performance evaluation. He is currently with France Telecom R&D. His research has been mainly in the field of multiservice networks, ISDN, ATM and IP. He is the author of many papers in this field and editor of the book "Broadband Network Teletraffic", the final report of the European COST 257 project, which he chaired. He received the best paper award for Infocom 99. He is a member of several journal editorial boards and conference programme committees in the networking field and participates in standardization activities at the ITU.
