The Model - Scalable content distribution in the Internet

As shown in Figure 2.2, the Internet connecting the server and the receivers can be modeled as a hierarchy of ISPs, each ISP with its own autonomous administration. We shall make the reasonable assumption that the Internet hierarchy consists of three tiers of ISPs: institutional networks, regional networks, and national backbones. All of the clients are connected to the institutional networks (typically via a LAN or modem connection); the institutional networks are connected to the regional networks; the regional networks are connected to the national networks. The national networks are also interconnected, sometimes by transoceanic links. We shall focus on a model with two national networks, with one of the national networks containing all of the clients and the other national network containing the origin servers.

Figure 2.2: Network topology. [Diagram labels: Institutional Network (×4), Regional Network (×2), National Network (×2), Origin Servers]

In order to have a common basis for the comparison of caching versus multicast, as shown in Figure 2.3 we model the underlying network topology of the national and regional networks as a full $O$-ary tree (a full $O$-ary tree has proved to be a good model for network topologies, providing very realistic results [69]). Let $O$ be the nodal outdegree of the tree. Let $H$ be the number of network links between the root node of a national network and the root node of a regional network. $H$ is also the number of links between the root node of a regional network and the root node of an institutional network. Let $z$ be the number of links between an origin server and the root node (i.e., the international path). Let $d$ be the propagation delay on one link, homogeneous for all links. Let $l$ be the level of the tree, $0 \le l \le 2H + z$, where $l = 0$ represents the top node of the institutional network and $l = 2H + z$ represents the origin server.

...

Figure 2.3: The tree model. [Diagram labels: Clients, LAN, Institutional Networks, Regional Networks, National Network, International Path, Origin Server, link counts $H$ and $z$, level markers $l$]

We assume that bandwidth is homogeneous within each ISP. Let $C_I$, $C_R$, and $C_N$ be the bandwidth capacity of the links at the institutional, regional, and national networks, respectively. Let $C$ be the bottleneck link capacity on the international path. Receivers are only on the leaves of the tree and not on the intermediate nodes. We assume that the network is lossless.

2.2.1 Document Model

In order to not obscure the key points, we make a number of simplifying assumptions. We assume that all documents are of the same size, $S$ bytes (we consider a document to be a Web page or an in-lined image). We assume that each institution issues requests at a rate of $\lambda_{LAN}$. We assume that there are $N_{HC}$ hot-changing documents, all of which are candidates for CMP. From each institution the request rate for any one of the hot-changing documents is the same and is denoted by $\lambda_I$. We assume that these requests arrive according to a Poisson process with rate $\lambda_I$. The assumption of Poisson arrivals is a reasonable one [45], [10], [68]. The total request rate from an institution for the hot-changing documents is $\lambda_{LAN}^{HC} = N_{HC} \lambda_I$. The rate of the remaining “background traffic” is $\lambda_{LAN}^{B} = \lambda_{LAN} - \lambda_{LAN}^{HC}$. Finally, let $\Delta$ be the update period of a hot-changing document. Initially we assume that all hot-changing documents change periodically every $\Delta$ seconds. In this case, caches do not need to contact the origin server to check for the document’s consistency, and the $N_{HC}$ hot-changing documents can be removed from the caches every $\Delta$ seconds. We shall also consider non-periodic updates.
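As a minimal sketch of the rate bookkeeping in the document model, the functions below compute the hot-changing and background request rates of one institution. The function names are hypothetical. The last function is an addition that the text does not state: it uses the standard property of a Poisson process that the probability of at least one arrival in an interval of length $\Delta$ is $1 - e^{-\lambda_I \Delta}$.

```python
import math


def hot_changing_rate(n_hc: int, lam_i: float) -> float:
    """Total request rate of one institution for the hot-changing documents."""
    return n_hc * lam_i


def background_rate(lam_lan: float, n_hc: int, lam_i: float) -> float:
    """Rate of the remaining background traffic of one institution."""
    rate = lam_lan - hot_changing_rate(n_hc, lam_i)
    if rate < 0:
        raise ValueError("hot-changing traffic exceeds the total request rate")
    return rate


def prob_request_within_period(lam_i: float, delta: float) -> float:
    """Probability that an institution requests a given hot-changing
    document at least once during one update period (Poisson arrivals)."""
    return 1.0 - math.exp(-lam_i * delta)
```

The check in `background_rate` only guards against inconsistent inputs; the model itself implies that the hot-changing traffic is a part of the total traffic of an institution.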

For notational convenience, we use

$$\lambda_l = \begin{cases} \lambda_I & \text{for } l = 0 \\ O \, \lambda_{l-1} & \text{for } 1 \le l \le 2H \end{cases}$$

for the aggregate request rate from all institutions below the multicast tree rooted at level $l$. Also, $\lambda_{tot} = \lambda_I O^{2H}$ denotes the total request rate, aggregated over all institutions.
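The aggregate request rate can be illustrated with a short sketch. It assumes that a subtree rooted at level $l$ of the full $O$-ary tree contains $O^l$ institutions, so that the aggregate rate at that level is $O^l$ times the per-institution rate; the function names and the restriction to levels $0$ to $2H$ are choices made for this example.

```python
def aggregate_rate(lam_i: float, O: int, H: int, level: int) -> float:
    """Aggregate request rate for one document from all institutions
    below a node at the given level (assumes O**level institutions)."""
    if not 0 <= level <= 2 * H:
        raise ValueError(f"level must lie in 0..{2 * H}")
    return lam_i * O ** level


def total_rate(lam_i: float, O: int, H: int) -> float:
    """Total request rate, aggregated over all O**(2H) institutions."""
    return aggregate_rate(lam_i, O, H, 2 * H)
```

Under this assumption the rate grows by a factor of $O$ with every level, and the value at the national root, level $2H$, equals the total request rate.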

2.2.2 Hierarchical Caching Model

Caches are usually placed at the access points between two different networks to reduce the cost of traveling through a new network. As shown in Figure 2.4, we make this assumption for all of the network levels. In one country there is one national network with one (logical) national cache. There are $O^H$ regional networks and every one has one (logical) regional cache. There are $O^{2H}$ institutional networks and every one has one (logical) institutional cache. Caches are placed on height $0$ of the tree (level 1 in the cache hierarchy), height $H$ of the tree (level 2 in the cache hierarchy), and height $2H$ of the tree (level 3 of the hierarchy).
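The cache placement can be tabulated with a small sketch. The function name `cache_layout` and the dictionary keys are hypothetical; the counts and heights are the ones stated above for a tree with outdegree $O$ and $H$ links per tier.

```python
def cache_layout(O: int, H: int) -> list:
    """One entry per level of the cache hierarchy: where the caches sit
    in the tree and how many (logical) caches exist at that level."""
    return [
        {"cache_level": 1, "name": "institutional", "tree_height": 0, "count": O ** (2 * H)},
        {"cache_level": 2, "name": "regional", "tree_height": H, "count": O ** H},
        {"cache_level": 3, "name": "national", "tree_height": 2 * H, "count": 1},
    ]
```

With the illustrative values `O=4` and `H=3`, which are not taken from the text, this gives 4096 institutional caches, 64 regional caches, and one national cache.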

If a requested document is not found in the cache hierarchy the national cache requests the document directly from the server.

Caches are connected to their ISPs via access links. We assume that the capacity of the access link at every level is equal to the network link capacity at that level, i.e., $C_N$, $C_R$, and $C_I$ at the national, regional, and institutional levels, respectively.

...

Figure 2.4: The tree model. Caching placement.

For simplicity we assume that clients’ local caches are disabled. (We could include local client caches in the model, but they would only complicate the analysis without changing the main conclusions.) The hit rate is the percentage of requests satisfied by the caching hierarchy for all documents. The cumulative hit rate at a cache level or below for the institutional, the regional, and the national caches is given by $HIT_I$, $HIT_R$, and $HIT_N$. To keep the analysis simple we assume some typical values for the hit rates, such as $HIT_I = 0.5$, $HIT_R = 0.6$, and $HIT_N = 0.7$ [85] [79]. In Section 4.4.1 we derive analytical expressions for the hit rates at different caching levels.
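Because the hit rates are cumulative, the share of requests served at each individual level is the difference between consecutive values. The sketch below makes this explicit; the function name and the dictionary keys are hypothetical, and the share labelled `origin` is simply the remainder that misses in the whole hierarchy.

```python
def per_level_shares(hit_i: float, hit_r: float, hit_n: float) -> dict:
    """Split cumulative hit rates into the fraction of requests that is
    served by each cache level and by the origin server."""
    if not 0.0 <= hit_i <= hit_r <= hit_n <= 1.0:
        raise ValueError("cumulative hit rates must be non-decreasing and lie in [0, 1]")
    return {
        "institutional": hit_i,
        "regional": hit_r - hit_i,
        "national": hit_n - hit_r,
        "origin": 1.0 - hit_n,
    }
```

With the typical values quoted above (0.5, 0.6, 0.7), roughly half of the requests are served by the institutional cache, about a tenth each by the regional and the national cache, and the remaining three tenths by the origin server.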

A caching hierarchy cannot satisfy all requests arriving to it. Requests not satisfied by the caching hierarchy are called misses. We assume that each cache has infinite storage capacity, since it is becoming very common to have caches with huge effective storage capacities. We also ignore non-cacheable misses (e.g., dynamic documents generated from CGI scripts), as they do not significantly affect the main conclusions of this chapter. In our analysis we consider hierarchical caching; however, the caching hierarchy can easily be replaced by a caching infrastructure with a mesh configuration as described in [93]. Thus, throughout this thesis we will use the terms caching hierarchy and caching infrastructure interchangeably.

2.2.3 CMP Model

For CMP we assume that the same hierarchical caching infrastructure is in place, and that most of the documents are distributed with the caching infrastructure. However, the $N_{HC}$ hot-changing documents are distributed with CMP.

For the multicasting, we assume the origin server to be connected to the clients via a core-based tree [14, 29]. The server sends to the core, which is the root of a shortest-path tree [32], where a receiver is connected to the core via a shortest path through the network. Let $\mu_{cmp}$ be the multicast transmission rate for a single document. In Section 2.5.2 we shall show how $\mu_{cmp}$ can be calculated.
