• Aucun résultat trouvé

Performance of a Server Cluster with Parallel Processing and Randomized Load Balancing

N/A
N/A
Protected

Academic year: 2021

Partager "Performance of a Server Cluster with Parallel Processing and Randomized Load Balancing"

Copied!
6
0
0

Texte intégral

(1)

HAL Id: hal-01306343

https://hal.archives-ouvertes.fr/hal-01306343

Submitted on 22 Apr 2016

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of

sci-entific research documents, whether they are

pub-lished or not. The documents may come from

teaching and research institutions in France or

abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est

destinée au dépôt et à la diffusion de documents

scientifiques de niveau recherche, publiés ou non,

émanant des établissements d’enseignement et de

recherche français ou étrangers, des laboratoires

publics ou privés.

Performance of a Server Cluster with Parallel

Processing and Randomized Load Balancing

Thomas Bonald, Céline Comte

To cite this version:

Thomas Bonald, Céline Comte. Performance of a Server Cluster with Parallel Processing and

Ran-domized Load Balancing. [Research Report] Telecom ParisTech. 2016. �hal-01306343�

(2)

Performance of a Server Cluster with Parallel Processing and

Randomized Load Balancing

T. Bonald and C. Comte

April 22, 2016

Abstract

We consider a cluster of servers where each incoming job is assigned d servers chosen uniformly at random, for some fixed d ≥ 2. Jobs are served in parallel and the resource allocation is balanced fairness. We provide a recursive formula for computing the exact mean service rate of each job. The complexity is polynomial in the number of servers.

1

Model

We consider a cluster of S servers, each with service rate µ. Jobs arrive according to a Poisson process with intensity λ. Each incoming job is assigned d servers chosen uniformly at random, for some fixed d ≥ 2. Each job is processed in parallel by its assigned servers, the overall service capacity being allocated according to balanced fairness [1].

There are N = Sd classes of jobs, defined by the assigned servers. Let I = {1, . . . , N } be the set of classes. We denote by Si⊂ {1, . . . , S} the set of servers assigned to each job of class i and, for each set A ⊂ I of classes, we denote by S(A) the set of servers assigned to jobs whose class belongs to A. Let x = (xi)i∈I be the network state, where xi is the number of ongoing class-i jobs. We denote by φi(x) the total service rate of class-i jobs in state x. The corresponding vector lies in the capacity set:

C = ( φ ∈ RN+ : ∀A ⊂ I, X i∈A φi≤ µ|S(A)| ) ,

where |A| is the cardinal of the set A. The capacity set is a polymatroid, and it follows from [3] that balanced fairness is Pareto-efficient. Specifically,

∀i ∈ I, φi(x) =

( Φ(x−e

i)

Φ(x) if xi> 0, 0 otherwise,

where the function Φ is defined by the recursion Φ(0) = 1 and, using the notation Ax= {i ∈ I : xi> 0}, ∀x ∈ NN\ {0}, Φ(x) =

P

i∈AxΦ(x − ei)

µ|S(Ax)|

. (1)

Under the stability condition λ < Sµ, the stationary distribution of the system state is given by

∀x ∈ NN, π(x) = π(0)Φ(x) λ N

|x|

, (2)

where |x| = P

i∈Ixi is the total number of jobs. This stationary distribution is insensitive to the job size distribution beyond the mean.

(3)

2

Performance

We are interested in the mean service rate, defined by γ = E( P i∈Iφi(X)) E(P i∈IXi) ,

where X is a random variable distributed according to the stationary distribution π. Observe that, by symmetry, γ is the mean service rate of any job (whatever its class), and that γ ≤ dµ. Moreover, by work conservation,

γ = λ

E(P i∈IXi)

. (3)

In particular, it follows from Little’s law that 1/γ in the mean job duration.

There is no explicit formula for computing γ with a low complexity. In particular, the recursive formula of de Veciana and Shah [3] does not apply because the capacity set is not a symmetric polymatroid. We use the recent results of Gardner et. al. [2] to derive an explicit recursive formula, whose complexity is linear in SN (thus polynomial in S). We denote the system load by

ρ = λ Sµ. Proposition 1 We have γ = G/F with

G = N X n=0 S X m=0 Gn,m and F = N X n=0 S X m=0 Fn,m,

where Gn,m and Fn,m are given by the recursions G0,0 = 1, F0,0 = 0,

Gn,m= ρS N m − nρSN   m d  − n + 1  Gn−1,m+ min(d,m) X r=1 S − m + r r m − r d − r  Gn−1,m−r  , Fn,m= ρS N m − nρSN   m d  − n + 1  Fn−1,m+ min(d,m) X r=1 S − m + r r m − r d − r  Fn−1,m−r  + 1 λ m m − nρSN Gn,m if d ≤ m ≤ S and m d ≤ n ≤ m d, Gn,m= Fn,m= 0 otherwise. Proof. By symmetry, we have for any i ∈ I,

γ = λ/N E(Xi) . Let G(λ1, . . . , λN) = X x∈NN Φ(x)Y i∈I λxi i and G = G λ N, . . . , λ N  . In view of (2), we have γ = G/F , with

F = ∂G ∂λi  λ N, . . . , λ N  .

(4)

We first prove the recursion for computing G. For any A ⊂ I, let GA(λ1, . . . , λN) = X x∈NN:A x=A Φ(x)Y i∈I λxi i and GA= GA  λ N, . . . , λ N  . Observe that G = X A⊂I GA.

Now let S(A) be the set of servers that can serve jobs of classes in A. Let n = |A| be the number of active classes and m = |S(A)| be the number of busy servers. In view of (1), we have

GA= P i∈A λ NGA\{i} mµ − nNλ . (4)

For all n = 0, 1, . . . , N and m = 0, 1, . . . , S, let Gn,m= X A⊂I |A|=n,|S(A)|=m GA. Observe that G = N X n=0 S X m=0 Gn,m

Moreover, G0,0= 1 and Gn,m= 0 unless d ≤ m ≤ S andmd ≤ n ≤ md. For such a pair n, m, we deduce from (4) that Gn,m= ρS N m − nρSN X A⊂I |A|=n,|S(A)|=m X i∈A GA\{i}. (5)

We can rewrite the sum as follows: X A⊂I |A|=n,|S(A)|=m X i∈A GA\{i}= X i∈I X A⊂I,i∈A, |A|=n,|S(A)|=m GA\{i}, =X i∈I X B⊂I,i /∈B, |B|=n−1,|S(B∪{i})|=m GB, = min(d,m) X r=0 X B⊂I, |B|=n−1, |S(B)|=m−r X i∈I\B, |S(B∪{i})|=m GB, =m d  − n + 1  Gn−1,m+ min(d,m) X r=1 S − m + r r m − r d − r  Gn−1,m−r,

which is the announced result.

We now show the recursion for F . For any A ⊂ I and i ∈ I, let FA= ∂GA ∂λi  λ N, . . . , λ N  .

(5)

Observe that

F = X A⊂I

FA.

For all n = 0, 1, . . . , N and m = 0, 1, . . . , S, we can define by symmetry, Fn,m= X A⊂I,i∈A, |A|=n,|S(A)|=m FA= 1 N X i∈I X A⊂I,i∈A, |A|=n,|S(A)|=m FA and we have F = N X n=0 S X m=0 Fn,m. Similarly, F0,0= 0 and Fn,m= 0 unless d ≤ m ≤ S and

m

d ≤ n ≤ m

d. For any i ∈ I and A ⊂ I with |A| = n and |S(A)| = m, we have

FA= 1 µm − nSρN     GA+ GA\{i}+ λ N X j∈A,j6=i FA\{j}    so that Fn,m= 1 N 1 µm − nρSN ( X i∈I X A⊂I,i∈A |A|=n,|S(A)|=m GA+ X i∈I X A⊂I,i∈A |A|=n,|S(A)|=m GA\{i} + λ N X i∈I X A⊂I,i∈A |A|=n,|S(A)|=m X j∈A,j6=i FA\{j}. ) (6)

The first term of the sum is

X A⊂I |A|=n |S(A)|=m X i∈A GA= n X A⊂I |A|=n |S(A)|=m GA= nGn,m.

By equation (5), the second term of the sum is X i∈I X A⊂I,i∈A |A|=n,|S(A)|=m GA\{i}= X A⊂I |A|=n,|S(A)|=m X i∈A GA\{i}= m − nSρN ρS N Gn,m.

Thus the first two terms of equation (6) are equal to

nGn,m+ m − nρSN ρS N Gn,m= mN ρS Gn,m.

(6)

Now the third term of (6) is X i∈I X A⊂I,i∈A |A|=n,|S(A)|=m X j∈A,j6=i FA\{j}, =X i∈I X j∈I,j6=i X A⊂I,i,j∈A |A|=n,|S(A)|=m FA\{j}, =X i∈I X j∈I,j6=i X B⊂I,i∈B,j /∈B |B|=n−1,|S(B∪{j})|=m FB, = min(d,m) X r=0 X i∈I X B⊂I,i∈B |B|=n−1,|S(B)|=m−r X j∈I\B,|S(B∪{j})|=m FB, =X i∈I X B⊂I,i∈B |B|=n−1,|S(B)|=m m d  − n + 1  FB, + min(d,m) X r=1 X i∈I X B⊂I,i∈B |B|=n−1,|S(B)|=m−r m − r d − r S − m + r r  FB, = Nm d  − n + 1  Fn−1,m+ N min(d,m) X r=1 m − r d − r S − m + r r  Fn−1,m−r.

Summing the two parts of equation (6) gives the announced result. 

References

[1] T. Bonald, L. Massouli´e, A. Prouti`ere, and J. Virtamo. A queueing analysis of max-min fairness, pro-portional fairness and balanced fairness. Queueing Syst., 53(1-2):65–84, 2006.

[2] K. Gardner, S. Zbarsky, M. Harchol-Balter, and A. Scheller-Wolf. Analyzing response time in the redundancy-d system. Technical report, Technical Report CMU-CS-15-141, 2015.

[3] V. Shah and G. de Veciana. Performance evaluation and asymptotics for content delivery networks. In Proceedings of IEEE Infocom. IEEE, 2014.

Références

Documents relatifs

• We design a resilient scheduling algorithm, called Lpa-List, that relies on a local processor allocation strategy and list scheduling to achieve O(1)-approximation for some

There are two main issues when clustering based on unsupervised learn- ing, such as Growing Neural Gas (GNG) [9], is performed on vast high dimensional data collection – the fast

Elkihel, Load balancing methods and parallel dynamic programming algorithm using dominance technique applied to th 0-1 knapsack prob- lem Journal of Parallel and Distributed

In Figure 3, we consider uniformly distribution jobs sizes and we represent the evolution of the degradation factor for different values of K (note that the x-axis is in the

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des

Overall, we observe that whenever the analyzed input space is small enough (i.e., queries D − F ), the size of the neural network has little influence on the input space coverage

Les applications de cette technique pour le pied bot syndromique sont par contre moins habituelles et loin d’être consensuelles .Nous rapportons notre expérience dans

III About the rheological behavior at high shear strain: flow properties of the melt 145 8 Modeling flow-induced crystallization in startup of shear flow 147 8.1