
7 Offline Max-stretch Optimization and Pareto Optimality

7.3 Heuristic Pareto Minimization of Max Weighted Flow on Unrelated Machines

Here, we target the more general case of the max weighted flow, as we will later need to look at the special case of max-flow minimization.

Algorithm 2 presents the solution we propose for the general case. The solution for the uni-processor case cannot be straightforwardly extended to the general case, as the Earliest Deadline First algorithm is obviously not optimal for non-uniform machines. Once again, we (try to) recursively optimize the max weighted flow of the jobs. We compute the best achievable max weighted flow for the jobs whose weighted flow is not yet fixed, and we (try to) minimize the number of jobs whose weighted flow is equal to this maximum. As always, the target max weighted flow defines a deadline for each job in FreeStretch. We first minimize the number of distinct deadlines d such that there always is a job whose deadline is d and which is completed at date d. Then we minimize the number of (problematic) jobs, i.e., of jobs which are completed at their deadline.
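
To make the link between a target max weighted flow and the per-job deadlines concrete, here is a minimal Python sketch (our own illustration, with hypothetical names and data; we assume the weighted flow of Jj is wj (Cj − rj)). Requiring wj (Cj − rj) ≤ S amounts to the deadline d̄j = rj + S / wj; for the stretch (wj = 1/pj) this is the formula rj + S × pj used at Step 20 of Algorithm 2.

    def deadlines_for_target(jobs, S):
        """Deadline of each job for a candidate max weighted flow S.

        jobs maps a job name to (release date r, weight w); asking for a
        weighted flow w * (C - r) <= S is equivalent to C <= r + S / w.
        """
        return {name: r + S / w for name, (r, w) in jobs.items()}

    # Hypothetical toy instance: for the stretch, w = 1/p.
    jobs = {"J1": (0.0, 0.5), "J2": (2.0, 1.0)}
    print(deadlines_for_target(jobs, S=3.0))  # {'J1': 6.0, 'J2': 5.0}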

We first show that Algorithm 2 is correct. Then we come back to Step 15, which is not fully defined.

Lemma 3. Algorithm 2 produces a valid schedule.

Algorithm 2: Heuristic Pareto minimization of max weighted flow.

 1   FixedStretch ← ∅
 2   FreeStretch ← {J1, ..., Jn}
 3   while FreeStretch ≠ ∅ do
 4       Compute the minimum max weighted flow S of the jobs in FreeStretch, taking into account that, for any job Jj such that (Jj, Sj) ∈ FixedStretch, Jj has exactly a stretch of Sj
 5       foreach Jj ∈ FreeStretch do
             …
11           In the set of time intervals defined by the release dates and deadlines (see Section 6.1.1), let I_td be the time interval ending at date d: sup I_td = d
12           Solve System (5) (which attempts to complete strictly before d all …)
             …
15           … weighted flow of S, and such that all the other jobs in Sd can simultaneously have a max weighted flow strictly smaller than S
16           foreach Jj ∈ Sd' do
17               FreeStretch ← FreeStretch \ {Jj}
18               FixedStretch ← FixedStretch ∪ {(Jj, S)}
19   foreach (Jj, Sj) ∈ FixedStretch do
20       d̄j ← rj + Sj × pj
21   Build a schedule according to the solution of Linear Program 1.

Proof. The proof of correctness of Algorithm 2 follows from the proof of correctness of Algorithm 1, except for the loop at Step 10. We therefore have to prove two properties: 1) System (5) has a null solution for deadline d if and only if, whatever the schedule, there exists a job Jj such that Cj = d̄j = d (we then say that d is a "tight" deadline); 2) there exists a valid schedule under which, whatever the deadline d which is not tight, there is no job Jj such that Cj = d̄j = d.

We now prove the first property. Suppose d ∈ D is not a tight deadline. Then there exists a schedule Θ such that all jobs whose deadline is d complete strictly before the date d. We consider the time interval I_td which ends at date d (see Section 6), and any processor Pi. As no job whose deadline is d completes at date d, right before that date either:

1. Pi is idle, and then:

\[ \sup I_{t_d} - \inf I_{t_d} > \sum_{j} \alpha_{i,j}^{(t)} \, p_{i,j} \geq \sum_{j \mid \bar{d}_j = d} \alpha_{i,j}^{(t)} \, p_{i,j}. \]

2. Pi processes a job whose deadline is strictly greater than d, and then:

\[ \sum_{j \mid \bar{d}_j = d} \alpha_{i,j}^{(t)} \, p_{i,j} < \sum_{j} \alpha_{i,j}^{(t)} \, p_{i,j} \leq \sup I_{t_d} - \inf I_{t_d}. \]

In all cases:

\[ \sup I_{t_d} - \inf I_{t_d} - \sum_{j \mid \bar{d}_j = d} \alpha_{i,j}^{(t)} \, p_{i,j} > 0. \]

Thus, we can pick for δ the strictly positive value

\[ \min_{i} \left( \sup I_{t_d} - \inf I_{t_d} - \sum_{j \mid \bar{d}_j = d} \alpha_{i,j}^{(t)} \, p_{i,j} \right). \]

Therefore, if δ = 0, d is a tight deadline.

Conversely, if δ > 0, we take any solution to System (5) and then, on each processor and during each time interval, we schedule the fractions α(t)i,j Earliest Deadline First. As δ > 0, whatever the processor,

\[ \inf I_{t_d} + \sum_{j \mid \bar{d}_j = d} \alpha_{i,j}^{(t)} \, p_{i,j} < \sup I_{t_d} \]

and, thus, all jobs whose deadlines are d are completed strictly before the date d.
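
To illustrate the quantity used above, here is a small Python sketch (our own naming; the fractions and processing times are assumed given) that computes, for a deadline d, the margin min over the processors of sup I_td − inf I_td − Σ_{j | d̄j = d} α(t)i,j p_{i,j}; the deadline d is tight precisely when no valid choice of fractions yields a positive margin.

    def tightness_margin(interval, alpha, p, job_deadline, d):
        """Margin of the interval I_td ending at date d, for given fractions.

        interval:      (inf, sup) of I_td, with sup == d
        alpha[i][j]:   fraction of job j processed on processor i during I_td
        p[i][j]:       time needed by processor i to process the whole of job j
        job_deadline:  deadline of each job
        """
        lo, hi = interval
        return min(
            hi - lo - sum(a * p[i][j] for j, a in row.items() if job_deadline[j] == d)
            for i, row in alpha.items()
        )

    # Hypothetical toy data: interval [2, 5], two processors, both jobs have deadline 5.
    alpha = {"P1": {"J1": 0.5}, "P2": {"J2": 0.25}}
    p = {"P1": {"J1": 4.0}, "P2": {"J2": 8.0}}
    print(tightness_margin((2.0, 5.0), alpha, p, {"J1": 5.0, "J2": 5.0}, 5.0))  # 1.0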

We now prove the second property. Let d1 and d2 be two deadlines in D. Let Θ1 and Θ2 be two schedules such that under Θ1 (resp. Θ2) no job Jj is such that Cj = d̄j = d1 (resp. Cj = d̄j = d2). We denote by α(t,1)i,j (resp. α(t,2)i,j) the fraction of job Jj processed on processor Pi during the time interval It under the schedule Θ1 (resp. Θ2). We then define a third schedule, Θ3, by

\[ \alpha_{i,j}^{(t,3)} = \tfrac{1}{2} \left( \alpha_{i,j}^{(t,1)} + \alpha_{i,j}^{(t,2)} \right), \]

and by scheduling, on each processor and during each interval, the fractions Earliest Deadline First. One can easily check that Θ3 is a valid schedule and that, under Θ3, there is no job Jj such that Cj = d̄j = d1 or Cj = d̄j = d2. An immediate induction gives us the desired property.
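
The schedule Θ3 above is simply the interval-by-interval average of Θ1 and Θ2; since all the constraints defining a valid fractional schedule are linear, the average of two valid schedules is valid. A minimal Python sketch, with our own (hypothetical) data layout:

    def average_schedules(alpha1, alpha2):
        """alpha3 = (alpha1 + alpha2) / 2, where each schedule maps
        (interval, processor, job) to the fraction of the job processed there."""
        keys = set(alpha1) | set(alpha2)
        return {k: 0.5 * (alpha1.get(k, 0.0) + alpha2.get(k, 0.0)) for k in keys}

    # Toy example: schedule 1 finishes J1 in interval t1, schedule 2 in interval t2;
    # the average splits J1 evenly between the two intervals.
    theta1 = {("t1", "P1", "J1"): 1.0}
    theta2 = {("t2", "P1", "J1"): 1.0}
    print(average_schedules(theta1, theta2))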

Step 15 does not specify how the set Sd' should be computed, especially as we would like this set to be as small as possible. In fact, in the general case this problem is NP-complete, as shown by the proof of the next theorem, which states the complexity of the general max-flow minimization, and thus of the general case.

Theorem 12. The Pareto minimization of max-flow on unrelated machines, ⟨R | div | Fmax Pareto⟩, is NP-complete.

As we do not have any release dates in the above theorem, we in fact prove that ⟨R | div | Cmax Pareto⟩ is NP-complete. In fact, we prove an even stronger result, namely that minimizing the number of jobs whose completion date is equal to the makespan is NP-complete on unrelated machines, and under the divisible load model.

Proof. This result is proved with a reduction from Minimum Hitting Set [15].

Let us consider any instance I1 of Minimum Hitting Set. I1 is defined by a collection C = {S1, ..., S|C|} of subsets of a finite set S and by an integer K. The question is: is there a subset S' of S such that |S'| ≤ K and such that S' contains at least one element from each subset in C, i.e., for each i ∈ [1, |C|], Si ∩ S' ≠ ∅? Without loss of generality, we assume that S = ∪i Si.

From instance I1 of Minimum Hitting Set, we now build an instance I2 of our problem. I2 is made of n = |S| jobs, and we identify the jobs J1, ..., J|S| with the elements x1, ..., x|S| of S. The size Wj of job Jj is equal to the number of subsets containing xj: Wj = |{Si ∈ C | xj ∈ Si}|. We will have to schedule these jobs on m = |C| processors, and we identify the processors with the subsets S1, ..., S|C|. We define the computational characteristics of the processors as follows:

\[ p_{i,j} = \begin{cases} \dfrac{1}{|S_i|} & \text{if } x_j \in S_i, \\[1ex] +\infty & \text{otherwise.} \end{cases} \]

Here, the question is: is there a schedule for which the number of jobs whose flow is equal to the optimal max-flow is less than or equal to K?

We first remark that the optimal maximum flow is equal to 1. Indeed, the total load to be processed is

\[ \sum_{j} W_j = \sum_{j} \left| \{ S_i \in C \mid x_j \in S_i \} \right| = \sum_{i} |S_i| \]

and, at best, processor Pi can process a load of size |Si| during a unit of time. Therefore, the optimal max-flow is greater than or equal to 1. A max-flow of 1 is realized by any schedule under which processor Pi devotes a fraction 1/|Si| of the time interval [0, 1] to any job Jj such that xj ∈ Si. Indeed, under such a schedule, the share of job Jj processed during the time interval [0, 1] by a processor Pi such that xj ∈ Si is equal to 1. Therefore, the overall share of job Jj processed during that time interval is equal to

\[ \sum_{i \mid x_j \in S_i} 1 = \left| \{ S_i \in C \mid x_j \in S_i \} \right| = W_j. \]
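
As a sanity check of the reduction, the following Python sketch (our own, purely illustrative encoding of the instances) builds I2 from a collection of subsets and verifies that the schedule just described, where Pi devotes a fraction 1/|Si| of [0, 1] to every job xj ∈ Si, indeed processes exactly Wj units of each job by date 1.

    from fractions import Fraction

    def build_instance(subsets):
        """Build I2: job sizes W and processing times p from a hitting-set instance.

        W[x] is the number of subsets containing x, and p[i][x] = 1/|S_i| when
        x is in S_i (entries are simply omitted when p would be infinite)."""
        elements = set().union(*subsets)
        W = {x: sum(1 for S in subsets if x in S) for x in elements}
        p = {i: {x: Fraction(1, len(S)) for x in S} for i, S in enumerate(subsets)}
        return W, p

    def processed_shares(subsets, p):
        """Units of each job processed in [0, 1] when P_i spends 1/|S_i| on each x in S_i."""
        shares = {}
        for i, S in enumerate(subsets):
            for x in S:
                # time devoted = 1/|S_i|; processing one unit takes p[i][x] = 1/|S_i|
                shares[x] = shares.get(x, 0) + Fraction(1, len(S)) / p[i][x]
        return shares

    # Hypothetical toy instance: S = {a, b, c}, C = {{a, b}, {b, c}, {a, c}}.
    subsets = [{"a", "b"}, {"b", "c"}, {"a", "c"}]
    W, p = build_instance(subsets)
    print(W == processed_shares(subsets, p))  # True: every job gets exactly W_j units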

Furthermore, this proof shows that if any processor is (at least) partially idle during the time interval [0, 1], then the max-flow achieved will be strictly greater than 1. Therefore, under any schedule achieving the optimal max-flow there is, on each processor, a job which is run until the date 1, and thus which has a flow of 1. The set of jobs whose flow is 1 in I2 then equivalently defines a hitting set of S in I1.

Minimum Hitting Set is equivalent to Minimum Set Cover [14]. Therefore, one of the best polynomial-time algorithms to approximate Minimum Hitting Set is the greedy algorithm which, at each step, picks the element that belongs to the largest number of still un-hit subsets. This greedy algorithm has an approximation ratio of 1 + ln |S| [18, 32], where |S| is the size of the set.
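
Below is a minimal Python sketch of this greedy heuristic (our own implementation; instances are encoded as in the previous sketch). On ties, any of the maximizing elements may be picked.

    def greedy_hitting_set(subsets):
        """Greedy (1 + ln|S|)-approximation of Minimum Hitting Set.

        At each step, pick the element belonging to the largest number of
        subsets that are not hit yet, until every subset is hit."""
        remaining = [set(S) for S in subsets if S]
        chosen = set()
        while remaining:
            counts = {}
            for S in remaining:
                for x in S:
                    counts[x] = counts.get(x, 0) + 1
            best = max(counts, key=counts.get)   # element hitting the most un-hit subsets
            chosen.add(best)
            remaining = [S for S in remaining if best not in S]
        return chosen

    print(greedy_hitting_set([{"a", "b"}, {"b", "c"}, {"a", "c"}]))
    # e.g. {'a', 'b'}: two elements are needed, since no single element hits all three subsets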

We do not know the complexity of the Pareto minimization of the max-stretch. Given how efficient the greedy heuristic is for the Minimum Hitting Set problem, we simply suggest using it in practice to implement Step 15. Furthermore, one can easily see that when the set Sd at Step 14 is always reduced to a singleton, Algorithm 2 produces an optimal schedule. Therefore:

Theorem 13. Algorithm 2 produces a Pareto optimal schedule for max-stretch minimization on unrelated machines under the divisible load model if the set Sd at Step 14 is always reduced to a singleton.

We believe that, in practice, the set Sd will always be reduced to a singleton, and thus that Algorithm 2 will always produce optimal schedules. (Note that the case of jobs with the same size and the same release date is not a problem.)