Energy-efficient and thermal-aware resource management for heterogeneous datacenters


HAL Id: hal-01153804

https://hal.archives-ouvertes.fr/hal-01153804

Submitted on 20 May 2015



Hongyang Sun, Patricia Stolf, Jean-Marc Pierson, Georges da Costa

To cite this version:

Hongyang Sun, Patricia Stolf, Jean-Marc Pierson, Georges da Costa. Energy-efficient and thermal-aware resource management for heterogeneous datacenters. Sustainable Computing: Informatics and Systems, Elsevier, 2014, vol. 4 (n° 4), pp. 292-306. 10.1016/j.suscom.2014.08.005. hal-01153804


Open Archive TOULOUSE Archive Ouverte (OATAO)

OATAO is an open access repository that collects the work of Toulouse researchers and

makes it freely available over the web where possible.

This is an author-deposited version published in :

http://oatao.univ-toulouse.fr/

Eprints ID : 13225

To link to this article :

DOI:10.1016/j.suscom.2014.08.005

URL :

http://dx.doi.org/10.1016/j.suscom.2014.08.005

To cite this version :

Sun, Hongyang and Stolf, Patricia and Pierson, Jean-Marc

and Da Costa, Georges Energy-efficient and thermal-aware resource

management for heterogeneous datacenters. (2014) Sustainable Computing, vol.

4 (n° 4). pp. 292-306. ISSN 2210-5379


Energy-efficient and thermal-aware resource management for heterogeneous datacenters

Hongyang Sun∗, Patricia Stolf, Jean-Marc Pierson, Georges Da Costa

IRIT, University of Toulouse, 118 Route de Narbonne, F-31062 Toulouse Cedex 9, France

Keywords: Datacenter heterogeneity; Online scheduling; Server placement; Cooling; Multi-objective optimization

Abstract

In this paper, we study energy-, thermal- and performance-aware resource management in heterogeneous datacenters. Witnessing the continuous development of heterogeneity in datacenters, we are confronted with their different behaviors in terms of performance, power consumption and thermal dissipation: indeed, heterogeneity at server level lies both in the computing infrastructure (computing power, electrical power consumption) and in the heat removal systems (different enclosures, fans, thermal sinks). The physical locations of the servers also become important with heterogeneity, since some servers can (over)heat others. While many studies address these parameters independently (most of the time performance and power or energy), we show in this paper the necessity of tackling all these aspects for an optimal management of the computing resources. This leads to improved energy usage in a heterogeneous datacenter, including the cooling of the computer rooms. We build our approach on the concept of heat distribution matrix to handle the mutual influence of the servers in heterogeneous environments, which is novel in this context. We propose a heuristic to solve the server placement problem and we design a generic greedy framework for the online scheduling problem. We derive several single-objective heuristics (for performance, energy, cooling) and a novel fuzzy-based priority mechanism to handle their tradeoffs. Finally, we show results using extensive simulations fed with actual measurements on heterogeneous servers.

1. Introduction

Recent years have witnessed the development of heterogeneity in clusters and datacenters. Two main reasons have led to this situation. The first is the maintenance and evolution of the components in the datacenters: different generations of computers are commonly seen in production datacenters, since the owners do not replace everything at each update. The second is driven by the idea that heterogeneity might be the key to achieving energy-proportional computing [5,9], especially for high-performance computing applications.

Many recent studies have raised alarms about the energy consumption of datacenters. For instance, Koomey's report [21] claims that today's datacenters are consuming nearly 2% of the global energy, and up to half of that is spent on cooling-related activities [33]. This generally results in very poor Power Usage Effectiveness (PUE).

∗ Corresponding author.

E-mail addresses: sun@irit.fr (H. Sun), stolf@irit.fr (P. Stolf), pierson@irit.fr (J.-M. Pierson), dacosta@irit.fr (G. Da Costa).

In this paper, we study the multi-objective resource management problem for heterogeneous datacenters. Besides the performance criterion, we also consider the energy consumption of the servers and their thermal impact on the datacenter cooling. The aim of our work is to optimize these objectives and to explore their tradeoffs. In particular, the energy consumption is partly due to the cooling efficiency in the datacenter [25,38], which is related to both the physical placement of the servers and the scheduling strategies when jobs dynamically enter and leave the system. The latter also affects the performance and the energy consumed by the servers.

Server placement in a computer room has been relatively less studied, especially its impact on the cooling efficiency. The main reason for this lack of attention is that, when servers are homogeneous, their relative positions have no impact on the performance and computing energy. However, server placement can have an impact on the cooling infrastructure. The main observation is that one server might contribute to the temperature rise at the inlets of the other servers, due to the recirculation of heat in a datacenter. Such mutual influence can be modeled by a heat distribution matrix among the servers. If one wants to


air temperature has to be decreased accordingly by the cooling system, which in turn increases its energy consumption. In the presence of heterogeneous servers with different power consumptions and hence heat dissipation, the problem of finding the optimal placement becomes complicated and, to the best of our knowledge, has not been studied. Since it is not feasible to change the positions of the servers in a datacenter dynamically, we focus on static placement to minimize the cooling cost induced by different configurations.

With a given server placement, the traditional problem of job scheduling in the heterogeneous environment remains. Many previous works (e.g., [4,40]) considered only the performance criterion and hence focused on the jobs' execution times. In order to address the power consumption issue in datacenters, however, application scheduling must employ a multi-objective approach by considering performance, energy and cooling together. To account for the fact that a scheduler has no future knowledge (jobs arrive over time), we need an online scheduling strategy. Instead of designing different independent algorithms, we design a greedy online scheduling framework that can be adapted easily by redefining the cost function, from a single objective to two or more objectives. To tackle the energy-performance tradeoff, we further introduce a fuzzy-based priority approach, which allows exploring the potential improvement in one objective while relaxing the other objective up to an acceptable range. This approach can be extended to incorporate more than two objectives in the framework. Its principle is not limited to the case at hand and can potentially be applied to other multi-objective optimization problems.

The main contributions of this paper are the following:

• A static server placement heuristic to reduce the cooling cost for the servers in a datacenter.

• A greedy scheduling framework and several cost functions to tackle single-objective scheduling (for performance, energy, and cooling).

• A fuzzy-based priority approach to handle the tradeoff between two conflicting objectives, and its extension to multi-objective optimization.

These proposals are supported by extensive simulations conducted using real hardware specifications and software benchmarks, as well as an experimentally verified cooling model and heat distribution matrix [39,38]. Specifically for the hardware, a server system with high packing density and integrated cooling support is chosen for the experiments, which we believe represents well an emerging class of highly integrated energy-efficient servers. The results demonstrate the flexibility of our scheduling framework and confirm the effectiveness of the fuzzy-based approach for exploring the energy-performance tradeoff in heterogeneous datacenter environments. Our static server placement heuristic is also shown to provide a much improved thermal distribution, leading to a significant reduction in cooling cost.

The rest of this paper is organized as follows. Section 2 formally states the system model and the scheduling problem. Section 3 describes our greedy server placement heuristic. Section 4 presents the job scheduling framework, various cost functions and the fuzzy-based priority approach. The simulation results are shown in Section 5. Section 6 reviews some related work, and Section 7 summarizes the paper and addresses future directions.

2. Problem statement

2.1. System model

Motivated by the placement of physical servers and the scheduling of high-performance computing (HPC) applications in heterogeneous datacenters, we consider the following system model: A set $M = \{M_1, M_2, \ldots, M_m\}$ of $m$ servers (or machines) needs to be placed inside a computer room (or datacenter) with a set of $m$ rack slots, denoted by $S = \{S_1, S_2, \ldots, S_m\}$.¹ Each server $M_j \in M$ consists of $L_j$ processors of the same type (possibly on different boards), but the type and the number of processors may vary for different servers, rendering the system heterogeneous. Each server consumes a base power $U_j^{\mathrm{base}}$ to support the basic operations of the infrastructure backbone, such as monitoring, networking and cooling (for instance fans). A set $J = \{J_1, J_2, \ldots, J_n\}$ of $n$ jobs arrive at the system over time, and they need to be assigned in an online manner to the servers. Each job $J_i \in J$ has a release time $r_i$ and a processor requirement $l_i$ that must be granted in order to run on any server. Executing job $J_i$ on server $M_j$ incurs a processing time $P_{i,j}$ and a power consumption $U_{i,j}$, both of which are server-dependent and become known upon the job's arrival by prior profiling of the applications. In particular, the profiled application power consumption is assumed to include the leakage power.

2.2. Scheduling model

We study two orthogonal problems that deal with the placements of hardware and software, respectively. We call the two problems static server placement and online job scheduling. The former concerns the positioning of physical servers in the datacenter, which as explained in Section 3 will have an impact on the cooling energy in a heterogeneous environment. The latter concerns the dynamic assignment of workloads to the servers, which will impact energy (due to both computing and cooling) as well as performance.

For the first problem of static server placement, each server needs to be physically and statically placed in advance in one of the available rack slots in the datacenter. In particular, we are looking for a mapping $\pi: \{1, 2, \ldots, m\} \to \{1, 2, \ldots, m\}$ from rack slots to servers so that each slot $S_k$ is filled with a server $M_{\pi(k)}$. The objective is to minimize the cooling cost. More details about this problem will be described in Section 3.

Given a particular server placement, an online scheduling strategy is then required to assign the jobs to the servers for execution. Specifically, each arrived job $J_i \in J$ must be assigned irrevocably to a server with at least $l_i$ idle processors, and without any knowledge of the future arriving jobs. Once the job has been assigned, no preemption or migration is allowed, which is typically assumed for HPC applications since they tend to incur a significant cost in terms of data reallocation.

At any time $t$, the total computing power of server $M_j$ is the sum of its base power and the power consumed for executing all jobs assigned to it, i.e.,

$$U_j^{\mathrm{comp}}(t) = U_j^{\mathrm{base}} + \sum_{i=1}^{n} \delta_{i,j}(t) \cdot U_{i,j}, \tag{1}$$

where $\delta_{i,j}(t)$ is a binary variable that takes value 1 if job $J_i$ is running on server $M_j$ at time $t$ and 0 otherwise. In order to optimize performance, we do not allow processor sharing among the jobs. Thus, each server at any time can only host a subset of the jobs whose total processor requirements are no more than the server's total number of available processors, i.e., $\sum_{i=1}^{n} \delta_{i,j}(t) \cdot l_i \le L_j$ for all $1 \le j \le m$ at all times $t$.

¹ In this paper, we assume that the number of rack slots is equal to the number of servers to be placed, which represents a common scenario in small- and medium-size datacenters.
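As a minimal illustration of Eq. (1) and the no-sharing capacity constraint, the following Python sketch tracks the jobs running on one server. The `Server` class, its field names and the `(processors, power)` tuple layout are hypothetical choices made for this example only; they are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """One server M_j (hypothetical container used only for illustration)."""
    base_power: float        # U_j^base, in watts
    num_processors: int      # L_j
    # Each running job is stored as (processors l_i, power U_{i,j}).
    running: list = field(default_factory=list)

    def computing_power(self) -> float:
        """Eq. (1): base power plus the power of every job currently running."""
        return self.base_power + sum(power for _, power in self.running)

    def free_processors(self) -> int:
        """Processors not held by any running job."""
        return self.num_processors - sum(procs for procs, _ in self.running)

    def can_host(self, procs: int) -> bool:
        """Capacity constraint: jobs never share processors."""
        return procs <= self.free_processors()

    def start(self, procs: int, power: float) -> None:
        """Start a job, refusing it if the capacity constraint would break."""
        if not self.can_host(procs):
            raise ValueError("not enough idle processors")
        self.running.append((procs, power))
```

For example, a server with a base power of 50 W and 8 processors that runs one 4-processor job drawing 30 W has a computing power of 80 W and can still host a job needing up to 4 processors.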

2.3. Cooling model

To characterize the cost of cooling, we consider a standard datacenter layout, where server racks are organized in rows with alternating cold and hot aisles. The computer room air conditioning (CRAC) unit supplies cool air to the cold aisles through raised floor vents. Each server $M_j \in M$ in the racks is oriented such that it draws cool air with temperature $T_j^{\mathrm{in}}$ from the inlet and dissipates hot air with temperature $T_j^{\mathrm{out}}$ to the outlet. Assuming that the computing power consumed by a server is completely transformed into heat, the relationship between the power consumption and the inlet/outlet temperature of server $M_j$ at any time $t$ can be characterized, following Tang et al. [39], by

$$T_j^{\mathrm{out}}(t) = T_j^{\mathrm{in}}(t) + K_j \cdot U_j^{\mathrm{comp}}(t), \tag{2}$$

where $K_j = \frac{1}{\rho f_j c}$, with $\rho$ denoting the air density (in kg/m³), $f_j$ the air flow rate of server $M_j$ (in m³/s), and $c$ the air heat capacity² (in J/(°C kg)).

Due to complex air flow patterns, typical datacenters experience the so-called heat recirculation phenomenon, where the hot air from the server outlets recirculates in the room and is mixed with the supplied cool air from the CRAC, causing the temperature at the server inlets to be higher than that of the supplied air. Prior studies [39,38] have characterized this phenomenon with a heat distribution matrix $D$ by assuming a fixed air flow pattern in the room and conservation of energy as described by Eq. (2). We adopt this approach here. Let each element $d_{j,k} \in D$ represent the temperature increase at the inlet of server $M_j$ per unit of power consumed by server $M_k$.³ Combining the heat contributions from all servers, the inlet temperature of server $M_j$ at time $t$ is given by the following equation:

$$T_j^{\mathrm{in}}(t) = T^{\mathrm{sup}}(t) + \sum_{k=1}^{m} d_{j,k} \cdot U_k^{\mathrm{comp}}(t), \tag{3}$$

where $T^{\mathrm{sup}}(t)$ denotes the supplied air temperature at time $t$, which should be adjusted to prevent the inlet temperature of any server from going beyond a redline temperature $T^{\mathrm{red}}$; otherwise, the electronic components may not work reliably or are at risk of being damaged. Hence, the supplied air temperature should be set at most to

$$T^{\mathrm{sup}}(t) = T^{\mathrm{red}} - \max_{j=1,\ldots,m} \sum_{k=1}^{m} d_{j,k} \cdot U_k^{\mathrm{comp}}(t). \tag{4}$$

The cooling cost is specified as

$$U^{\mathrm{cool}}(t) = \frac{\sum_{j=1}^{m} U_j^{\mathrm{comp}}(t)}{\mathrm{CoP}(T^{\mathrm{sup}}(t))}, \tag{5}$$

where CoP is the coefficient of performance, defined as the ratio of the amount of heat to be removed to the energy that needs to be consumed in order to perform the cooling [25]. This coefficient characterizes the efficiency of the CRAC unit, and is an increasing (usually non-linear) function of the supplied air temperature. Intuitively, it means that the CRAC unit needs to work harder and thus consumes more energy in order to provide cooler air to the computer room.

² The air heat capacity specifies the energy required to change the temperature of one unit mass of air by one unit degree.

³ Technically speaking, $d_{j,k}$ represents the temperature increase for the server at slot $S_j$ due to the power consumption by the server at slot $S_k$. For convenience, we simply assume that the servers are renamed such that server $M_j$ is placed in slot $S_j$ for all $1 \le j \le m$.
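The chain from Eq. (3) to Eq. (5) can be sketched in a few lines of Python. This is only an illustrative reading of the cooling model: the function names are hypothetical, the heat distribution matrix is passed as a plain list of rows, and the CoP curve is supplied by the caller as any increasing function of the supply temperature.

```python
def inlet_temperature_rise(D, power):
    """Recirculation term of Eq. (3): entry j is sum_k d[j][k] * U_k^comp."""
    return [sum(d_jk * u_k for d_jk, u_k in zip(row, power)) for row in D]


def supply_temperature(D, power, t_red):
    """Eq. (4): highest supply temperature keeping every inlet at or below t_red."""
    return t_red - max(inlet_temperature_rise(D, power))


def cooling_power(D, power, t_red, cop):
    """Eq. (5): total computing power divided by CoP at the supply temperature."""
    return sum(power) / cop(supply_temperature(D, power, t_red))


if __name__ == "__main__":
    # Toy inputs, chosen only to exercise the functions (not measured data).
    D = [[0.01, 0.0],
         [0.0, 0.02]]
    power = [100.0, 100.0]                   # U_k^comp in watts
    toy_cop = lambda t: 0.1 * t + 1.0        # hypothetical increasing CoP curve
    print(supply_temperature(D, power, 30.0))
    print(cooling_power(D, power, 30.0, toy_cop))
```

Because the CoP increases with the supply temperature, any reduction of the largest inlet temperature rise lowers the cooling power for the same computing load.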

2.4. Optimization objectives

We consider the following bi-objective optimization problem: optimizing the performance of the jobs and minimizing the energy consumption of the datacenter, due to both computing and cooling.⁴

For performance, we use the average response time of the jobs as the metric, and it is defined as

$$R_{\mathrm{ave}} = \frac{1}{n} \sum_{i=1}^{n} (c_i - r_i), \tag{6}$$

where $c_i$ and $r_i$ denote the completion time and release time of job $J_i$, respectively.

The energy consumption comes from two sources: computing and cooling. The one due to computing is given by the total computing power of all servers integrated over time, i.e.,

$$E_{\mathrm{comp}} = \int_{t_1}^{t_2} \sum_{j=1}^{m} U_j^{\mathrm{comp}}(t)\, dt, \tag{7}$$

where $[t_1, t_2]$ denotes the interval of interest, during which all jobs arrive and complete their executions. This computing energy can be further divided into two parts, namely, the static part due to the base power consumption, i.e.,

$$E_{\mathrm{comp}}^{\mathrm{stat}} = (t_2 - t_1) \cdot \sum_{j=1}^{m} U_j^{\mathrm{base}}, \tag{8}$$

and the dynamic part due to the power consumed for executing the jobs, i.e.,

$$E_{\mathrm{comp}}^{\mathrm{dync}} = \sum_{i=1}^{n} \sum_{j=1}^{m} \delta_{i,j} \cdot P_{i,j} \cdot U_{i,j}, \tag{9}$$

where $\delta_{i,j} = 1$ if job $J_i$ is assigned to server $M_j$ and 0 otherwise.

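The energy spent on cooling is the total cooling power integrated over time, i.e.,

$$E_{\mathrm{cool}} = \int_{t_1}^{t_2} U^{\mathrm{cool}}(t)\, dt, \tag{10}$$

and as with computing energy, cooling energy can also be broken into a static part and a dynamic part. Specifically, the static part is the cooling energy that will be spent during interval $[t_1, t_2]$ even if no job arrives, i.e.,

$$E_{\mathrm{cool}}^{\mathrm{stat}} = \int_{t_1}^{t_2} \frac{\sum_{j=1}^{m} U_j^{\mathrm{base}}(t)}{\mathrm{CoP}\left(T^{\mathrm{red}} - \max_{j} \sum_{k} d_{j,k} \cdot U_k^{\mathrm{base}}(t)\right)}\, dt, \tag{11}$$

and the dynamic part is the difference between the total cooling energy and the static one, i.e.,

$$E_{\mathrm{cool}}^{\mathrm{dync}} = E_{\mathrm{cool}} - E_{\mathrm{cool}}^{\mathrm{stat}}. \tag{12}$$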

⁴ The energy consumed by other parts of the datacenter, such as lighting, are

In this paper, we assume that all servers are turned on all the time to sustain the servers' infrastructure backbone, so the static energy due to both computing and cooling is independent of the workload and the job scheduling strategy. On the other hand, the total dynamic energy given by

$$E_{\mathrm{total}}^{\mathrm{dync}} = E_{\mathrm{comp}}^{\mathrm{dync}} + E_{\mathrm{cool}}^{\mathrm{dync}} \tag{13}$$

is closely related to job scheduling, and it will be the focus of this study.

Due to the heterogeneity of the servers in the datacenter, different job scheduling strategies may result in very different job response time, computing energy and cooling cost. While a specific scheduling strategy may optimize one objective, these different objectives can be conflicting with each other, making the optimization difficult. In Section 4, we will propose and evaluate online scheduling algorithms to address both performance and energy as well as to deal with their tradeoffs.

3. Static server placement and a greedy heuristic

In this section, we consider the problem of static server placement. We first motivate the study from the perspective of cooling in heterogeneous datacenters. We then formulate the problem and present a greedy heuristic.

3.1. Motivation

The literature contains extensive studies on virtual machine placement (e.g., [6,15,44]) for datacenters, but the placement of physical servers has received little attention. There are two main reasons. First, many traditional datacenters are homogeneous, so different placements of identical servers do not make a difference. Second, traditional metrics such as job performance and energy consumption (due to computing) are independent of the servers' relative positions, so they are unaffected by the different placement configurations.

As far as the cooling cost is concerned for heterogeneous datacenters, however, the placement of the physical servers will have an impact. In particular, the studies in [39,38] have shown that the heat recirculation phenomenon in typical datacenters exhibits the following properties:

(1) Different rack positions tend to behave differently in terms of heat recirculation. Typically, servers located at the upper parts of the racks "inhale" more recirculated hot air while servers located at the lower parts "contribute" more hot air to recirculate in the room.

(2) In a closed computer room with fixed locations of all major objects and without moving objects, the air flow pattern that characterizes the heat recirculation among different rack positions is relatively stable.

While the first property suggests that the heat distribution matrix tends to be highly asymmetric, the second property assures that the matrix does not change significantly with different workloads in the servers or different positions of the servers. In the next section, we will rely on workload placement (or job scheduling) techniques to manage the cooling cost together with other objectives. Here, we focus on arranging the positions of the servers with different power profiles. The goal is to reduce the maximum inlet temperature of the servers so as to minimize the cooling cost under a given load condition.

To illustrate the effectiveness of this approach, consider a simple datacenter with two servers, two rack slots, and the following heat distribution matrix:

$$D = \begin{pmatrix} 0.002 & 0.004 \\ 0.001 & 0.002 \end{pmatrix}.$$

Suppose the two servers consume an average power of 100 W and 200 W, respectively. By placing the first server in slot 1 and the second server in slot 2, their inlet temperatures increase by 1 °C and 0.5 °C respectively according to Eq. (3). By simply swapping the positions of the two servers, their temperature increases will now become 0.4 °C and 0.8 °C. The 0.2 °C difference in the maximum inlet temperature of these two configurations directly determines the temperature of the supplied air by Eq. (4), and therefore impacts the cooling cost. For instance, consider a redline temperature of 25 °C and the following CoP model for a water-chilled CRAC unit in an HP datacenter [25,38]:

$$\mathrm{CoP}(T) = 0.0068\,T^2 + 0.0008\,T + 0.458. \tag{14}$$

According to Eqs. (4) and (5), the cooling costs for the two placement configurations are 68.275 W and 67.269 W, respectively. The impact will be more significant with a lower redline temperature or a more skewed heat distribution matrix, or when the servers are consuming more power. The problem will also become more challenging when there is a large number of servers/positions, since exhaustive search will no longer be possible. The next section considers this general case and proposes a heuristic algorithm for the problem.
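The two-server example can be checked by exhaustive search over both placements. The sketch below is an illustrative script, not part of the paper: the variable and function names are hypothetical, while the matrix, the power values, the redline temperature and the CoP model are the ones given in the text. It should reproduce the two cooling costs quoted above (to three decimals).

```python
from itertools import permutations

# Data of the two-server example in the text.
D = [[0.002, 0.004],
     [0.001, 0.002]]
REF_POWER = [100.0, 200.0]   # average power of servers 1 and 2, in watts
T_RED = 25.0                 # redline temperature, in degrees Celsius


def cop(t):
    """CoP model of Eq. (14)."""
    return 0.0068 * t * t + 0.0008 * t + 0.458


def cooling_cost(placement):
    """Cooling power by Eqs. (4) and (5); placement[k] is the server in slot k."""
    slot_power = [REF_POWER[j] for j in placement]
    rise = [sum(d * u for d, u in zip(row, slot_power)) for row in D]
    t_sup = T_RED - max(rise)
    return sum(slot_power) / cop(t_sup)


# Exhaustive search: feasible here only because there are 2! placements.
costs = {p: cooling_cost(p) for p in permutations(range(len(REF_POWER)))}
best = min(costs, key=costs.get)

if __name__ == "__main__":
    for placement, cost in costs.items():
        print(placement, round(cost, 3))
    print("best placement:", best)
```

The number of placements grows as m!, which is why this brute-force check does not scale beyond a handful of servers.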

3.2. Greedy heuristic

To reduce the cooling cost, we should minimize the maximum temperature increase at the inlet of any server in the datacenter. As we have seen previously, this is determined by both the heat-distribution matrix and the power consumption profile of all servers. While the former is relatively stable and can be measured using a sensor-based approach [39], the latter essentially depends on the servers' workloads, which can vary with time. To cope with this uncertainty, we characterize the power consumption of each server statically using the average power it consumes when executing historical workloads. This provides a reasonable estimation of the server's typical power consumption during runtime. We call this static value the reference power, and use it to determine the placement of the servers.

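Let $U_j^{\mathrm{ref}}$ denote the reference power of server $M_j \in M$. The static server placement problem can then be formulated as follows: find a mapping $\pi: \{1, 2, \ldots, m\} \to \{1, 2, \ldots, m\}$ from rack slots to servers, so as to

$$\text{minimize} \;\; \max\, D \cdot U_{\pi}^{\mathrm{ref}}, \tag{15}$$

where $U_{\pi}^{\mathrm{ref}} = [U_{\pi(1)}^{\mathrm{ref}}, U_{\pi(2)}^{\mathrm{ref}}, \ldots, U_{\pi(m)}^{\mathrm{ref}}]^{T}$. Finding the optimal placement turns out to be an NP-hard problem for an arbitrary heat-distribution matrix and reference power vector. Appendix A provides the NP-hardness proof.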

Given the hardness result, we design a heuristic algorithm for the static server placement problem based on a greedy allocation strategy. Algorithm 1 presents the pseudocode of our greedy server placement (GSP) heuristic.

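Algorithm 1. Greedy server placement (GSP)

Input: The set $M = \{M_1, M_2, \ldots, M_m\}$ of $m$ servers, and the reference power $U_j^{\mathrm{ref}}$ of each server $M_j \in M$; the set $S = \{S_1, S_2, \ldots, S_m\}$ of $m$ rack slots, and the heat distribution matrix $D$.

Output: A mapping $\pi$ from rack slots to servers.

1: Sort the servers in descending order of reference power, i.e., $U_1^{\mathrm{ref}} \ge U_2^{\mathrm{ref}} \ge \cdots \ge U_m^{\mathrm{ref}}$
2: Initialize $T_l^{\mathrm{incr}} = 0$ for all $1 \le l \le m$
3: for each server $M_j \in M$ do
4:     $k^* = 0$ and $T_{\max}^{\mathrm{incr}}(k^*) = \infty$
5:     for each slot $S_k \in S$ do
6:         $T_{\max}^{\mathrm{incr}}(k) = \max_{l=1,\ldots,m} (T_l^{\mathrm{incr}} + d_{l,k} \cdot U_j^{\mathrm{ref}})$
7:         if $T_{\max}^{\mathrm{incr}}(k) < T_{\max}^{\mathrm{incr}}(k^*)$ then
8:             $T_{\max}^{\mathrm{incr}}(k^*) = T_{\max}^{\mathrm{incr}}(k)$ and $k^* = k$
9:         end if
10:     end for
11:     Place server $M_j$ to slot $S_{k^*}$, i.e., $\pi(k^*) = j$
12:     Update $T_l^{\mathrm{incr}} = T_l^{\mathrm{incr}} + d_{l,k^*} \cdot U_j^{\mathrm{ref}}$ for all $1 \le l \le m$
13:     Update $S = S \setminus S_{k^*}$
14: end for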

First, GSP sorts the servers in descending order of reference powers (Line 1). Since the servers that consume more power on average will have larger contributions to the temperature increases at all inlets, they are placed first to have more flexibility in the slot selection and so to avoid high peak temperature. Let $T_l^{\mathrm{incr}}$ denote the existing temperature increase at the inlet of slot $S_l$, and it is initially set to zero for all inlets (Line 2). Let $T_{\max}^{\mathrm{incr}}(k)$ denote the maximum temperature increase if the next server $M_j \in M$ is placed in slot $S_k$, i.e.,

$$T_{\max}^{\mathrm{incr}}(k) = \max_{l=1,\ldots,m} \left(T_l^{\mathrm{incr}} + d_{l,k} \cdot U_j^{\mathrm{ref}}\right). \tag{16}$$

Server $M_j$ will be placed in one of the remaining slots $S_{k^*} \in S$ that minimizes the maximum temperature increase, i.e., $k^* = \arg\min_k T_{\max}^{\mathrm{incr}}(k)$. The temperature increase at all inlets will then be updated and the filled slot $S_{k^*}$ will be removed from the available set $S$ (Lines 12 and 13). The algorithm iterates over all servers and terminates after the last one is placed.

For the complexity of the algorithm, sorting and initialization take $O(m \log m)$ time. In the iteration, placing each server incurs $O(m^2)$ time as all remaining slots are examined to determine the maximum temperature increase at all inlets. Therefore, the overall complexity is $O(m^3)$. This is reasonable even for a large number of servers, since the process is performed relatively infrequently: a new placement of the servers is only necessary if there is a significant alteration to the datacenter layout or when some servers are removed and new ones are introduced.
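The GSP steps of Algorithm 1 can be sketched in Python as follows. This is one possible transcription, under the assumptions that servers and slots are indexed from 0, that the heat distribution matrix is a list of rows, and that ties between slots are resolved in favor of the lowest slot index (as the strict comparison in Line 7 implies). The function name and return value are hypothetical.

```python
import math


def greedy_server_placement(ref_power, D):
    """Greedy server placement (GSP) sketch.

    ref_power[j] is the reference power of server j; D[l][k] is the inlet
    temperature rise at slot l per watt consumed in slot k.
    Returns (placement, peak) where placement[k] is the server put in slot k
    and peak is the resulting maximum inlet temperature increase.
    """
    m = len(ref_power)
    # Line 1: most power-hungry servers are placed first.
    order = sorted(range(m), key=lambda j: ref_power[j], reverse=True)
    incr = [0.0] * m                 # Line 2: T_l^incr for every inlet
    free_slots = set(range(m))
    placement = [None] * m

    for j in order:                                   # Line 3
        best_slot, best_peak = None, math.inf         # Line 4
        for k in sorted(free_slots):                  # Line 5
            # Line 6 / Eq. (16): peak inlet rise if server j goes to slot k.
            peak = max(incr[l] + D[l][k] * ref_power[j] for l in range(m))
            if peak < best_peak:                      # Lines 7-8
                best_slot, best_peak = k, peak
        placement[best_slot] = j                      # Line 11
        for l in range(m):                            # Line 12
            incr[l] += D[l][best_slot] * ref_power[j]
        free_slots.remove(best_slot)                  # Line 13

    return placement, max(incr)


if __name__ == "__main__":
    D = [[0.002, 0.004],
         [0.001, 0.002]]
    print(greedy_server_placement([100.0, 200.0], D))
```

On the two-server example of Section 3.1, this sketch puts the 200 W server in slot 1 and the 100 W server in slot 2, which is the placement with the lower peak temperature increase (0.8 °C).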

4. Online job scheduling and a fuzzy-based priority approach

Once the servers have been placed in a datacenter, they will start operation by executing the applications or jobs. In practice, jobs are submitted by different users over time, so each job must be assigned to a server without knowing future job arrivals. This section considers online job scheduling under a given server placement to optimize performance and energy, and to deal with their tradeoffs.

4.1. Greedy scheduling framework

All online scheduling algorithms described in this section fall under a greedy scheduling framework (GSF), which is invoked whenever a new job arrives or an existing job completes execution. Algorithm 2 presents the pseudocode of this framework.

Algorithm 2. Greedy scheduling framework (GSF)

Input: Job queue $Q$, and for each job $J_i \in Q$, the processor requirement $l_i$, processing time $P_{i,j}$ and power consumption $U_{i,j}$; server set $M$, and for each server $M_j \in M$, the number $L'_j$ of available processors, which is initialized to $L'_j = L_j$.

Output: Assignments of the newly arrived job and the jobs in $Q$ to the servers in $M$.

1: if a new job $J_i$ arrives then
2:     $j^* = 0$ and $H_{i,j^*} = \infty$
3:     for each server $M_j \in M$ do
4:         if $L'_j \ge l_i$ and $H_{i,j} < H_{i,j^*}$ then
5:             $H_{i,j^*} = H_{i,j}$ and $j^* = j$
6:         end if
7:     end for
8:     if $H_{i,j^*} \ne \infty$ then
9:         Assign job $J_i$ to server $M_{j^*}$
10:         Update $L'_{j^*} = L'_{j^*} - l_i$
11:     else
12:         Put job $J_i$ in queue $Q$ in shortest job first order
13:     end if
14: else if a job $J_i$ completes execution on server $M_j$ then
15:     Update $L'_j = L'_j + l_i$
16:     for each job $J_k \in Q$ do
17:         if $L'_j \ge l_k$ then
18:             Assign job $J_k$ to server $M_j$
19:             Update $L'_j = L'_j - l_k$
20:         end if
21:     end for
22: end if

The variable $H_{i,j}$ shown in the pseudocode represents the cost of assigning job $J_i$ to server $M_j$. Specifically, $H_{i,j}$ can be a single-objective cost function of job response time, energy consumption, etc. (see Section 4.2), or it can be a composite cost function of two or more objectives (see Section 4.3).

For each newly arrived job $J_i$, among the servers that have enough available processors to host it, the server with the minimum cost in terms of $H_{i,j}$ will be chosen for assigning the job (Lines 2–9). This makes the scheduling framework greedy. If no server has enough processors to host it, the job will be put in a waiting queue $Q$ in shortest job first (SJF) order, which is known to optimize the average response time [35] (Line 12). Note that although the processing times of the jobs are server-dependent, their relative sizes are assumed to be consistent on different servers, i.e., a fast server is fast for all jobs. Hence, SJF can be realized by using any server as the reference for comparing the jobs' processing times. When a job completes execution on a server and therefore releases the occupied processors, the waiting jobs in the queue will be tested in sequence to see if they can be assigned to this server (Lines 16–18). Whenever a job is assigned or a running job completes execution, the number of available processors on the server will be updated (Lines 10, 15, 19). Under this greedy scheduling framework, the assignment of each job takes $O(m)$ time, so the overall complexity is $O(mn)$ for assigning $n$ jobs.
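The two event handlers of Algorithm 2 can be sketched as a small Python class. This is a simplified illustration rather than the authors' simulator: the `Job` and `GreedyScheduler` names are hypothetical, the cost function is passed in as a callable `cost(job, j)`, server 0 is arbitrarily used as the reference server for the SJF order, and a queued job is removed from the queue once it is started.

```python
import bisect
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    procs: int          # processor requirement l_i
    proc_time: tuple    # P_{i,j}: processing time on each server j


class GreedyScheduler:
    def __init__(self, capacities, cost):
        self.free = list(capacities)   # available processors per server
        self.cost = cost               # callable: cost(job, j) -> H_{i,j}
        self.queue = []                # waiting jobs in SJF order

    def on_arrival(self, job):
        """Lines 1-13: choose the feasible server of minimum cost, else queue."""
        best, best_cost = None, math.inf
        for j, free in enumerate(self.free):
            if free >= job.procs:
                c = self.cost(job, j)
                if c < best_cost:
                    best, best_cost = j, c
        if best is None:
            # SJF insertion, comparing processing times on reference server 0.
            keys = [q.proc_time[0] for q in self.queue]
            self.queue.insert(bisect.bisect_right(keys, job.proc_time[0]), job)
            return None
        self.free[best] -= job.procs
        return best

    def on_completion(self, job, j):
        """Lines 14-22: release processors, then start queued jobs that fit."""
        self.free[j] += job.procs
        started = []
        for waiting in list(self.queue):
            if self.free[j] >= waiting.procs:
                self.free[j] -= waiting.procs
                self.queue.remove(waiting)
                started.append(waiting)
        return started
```

A usage example with the processing time as cost: `GreedyScheduler([4, 2], cost=lambda job, j: job.proc_time[j])` sends each arriving job to the fastest server that can currently host it and queues it otherwise. Swapping in a different `cost` callable changes the heuristic without touching the framework.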

The next two sections will describe heuristic algorithms that minimize different single- and multi-objective cost functions depending on the optimization criteria.

4.2. Single-objective scheduling

Single-objective scheduling considers one optimization criterion when deciding where to assign each job. In this section, we will present several single-objective scheduling heuristics. Some of them will also be used as the base algorithms for designing the more complex multi-objective scheduling heuristics in the next section.

First, the following describes some single-objective heuristics proposed in the literature [25,38].

• Uniform: Assign each job randomly to a server according to the

• MinHR: Assign each job to a server that contributes minimally to the heat recirculation in the room. The cost function is defined as

$$H_{i,j}^{HR} = \sum_{k=1}^{m} d_{k,j}. \tag{17}$$

• CoolestInlet: Assign each job to a server with the lowest temperature at its inlet. The cost function is defined as

$$H_{i,j}^{CI} = T_j^{\mathrm{in}}, \tag{18}$$

where $T_j^{\mathrm{in}}$ denotes the current temperature at the inlet of server $M_j$.

Note that, in [25,38], these heuristics were applied in the offline setting, where the information of all jobs is available to the scheduler. Here, they are cast as online heuristics. While the aim of Uniform is to balance the workload on all servers, MinHR and CoolestInlet attempt to minimize the overall heat recirculation and to achieve a uniform temperature distribution, respectively. However, these heuristics were proposed for homogeneous datacenter environments, and therefore do not consider job-specific characteristics. The following heuristics take job-dependent information into account by minimizing the response time, energy consumption, and temperature, respectively.

• Perf-Aware: Assign job $J_i$ to a server that renders the minimum response time. The cost function is defined as

$$H_{i,j}^{P} = P_{i,j}, \tag{19}$$

where $P_{i,j}$ denotes the execution time of job $J_i$ on server $M_j$.

• Energy-Aware: Assign job $J_i$ to a server that incurs the minimum dynamic energy consumption due to both computing and cooling. The cost function is defined as

$$H_{i,j}^{E} = E_{\mathrm{total}}^{\mathrm{dync}}(\delta_{i,j} = 1), \tag{20}$$

where $E_{\mathrm{total}}^{\mathrm{dync}}$ is the total dynamic energy defined in Eq. (13), and it is evaluated based on the currently running jobs and with job $J_i$ assigned to server $M_j$, i.e., $\delta_{i,j} = 1$.

• Thermal-Aware: Assign job $J_i$ to a server that minimizes the maximum inlet temperature. The cost function is defined as

$$H_{i,j}^{T} = \max_{k=1,\ldots,m} \left( T_k^{\mathrm{in}} + \sum_{k=1}^{m} d_{k,j} \cdot U_{i,j} \right), \tag{21}$$

where $T_k^{\mathrm{in}}$ denotes the current temperature at the inlet of server $M_k$, and $U_{i,j}$ denotes the power consumption of job $J_i$ on server $M_j$.

Except for Uniform, all heuristics above break ties by randomly selecting a server with the best cost function. The difference between CoolestInlet and Thermal-Aware is that the former considers the current inlet temperature before the job is assigned, whereas the latter considers the resulting temperature if the job is assigned to the server. Note that all of these heuristics make greedy decisions locally for each arriving job, so they are not guaranteed to provide the optimal global cost.
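Three of the simpler cost functions above, together with the random tie-breaking, can be written as plain Python functions. The sketch below is illustrative only: the function names, the argument layout (lists indexed by server, a nested list for processing times) and the `rng` parameter are assumptions made for this example.

```python
import random


def min_hr_cost(D, j):
    """Eq. (17): column sum of D, i.e. heat server j adds to all inlets per watt."""
    return sum(row[j] for row in D)


def coolest_inlet_cost(inlet_temp, j):
    """Eq. (18): current inlet temperature of server j."""
    return inlet_temp[j]


def perf_aware_cost(proc_time, i, j):
    """Eq. (19): processing time of job i on server j."""
    return proc_time[i][j]


def choose_server(candidates, cost, rng=random):
    """Pick a minimum-cost server, breaking ties uniformly at random."""
    costs = {j: cost(j) for j in candidates}
    best = min(costs.values())
    ties = [j for j in candidates if costs[j] == best]
    return rng.choice(ties)
```

For instance, `choose_server([0, 1], lambda j: min_hr_cost(D, j))` returns the server whose column of `D` has the smallest sum, which is the MinHR choice among the candidate servers.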

4.3. Multi-objective scheduling with fuzzy-based priority

Scheduling jobs to optimize two or more objectives usually requires exploring the tradeoff between the conflicting goals. In this section, we propose a novel fuzzy-based priority approach to handle such a tradeoff.

4.3.1. Fuzzy-based priority for bi-objective scheduling

We first consider optimizing two objectives, for which we define the following composite cost function:

$$H_{i,j}^{X,Y} = \langle H_{i,j}^{X}(f),\, H_{i,j}^{Y} \rangle. \tag{22}$$

In this case, the objectives $X$ and $Y$ are considered one after another by first selecting all servers that offer the best performance in terms of $X$, and then selecting among this subset any server that offers the best performance in terms of $Y$. To avoid depriving the second objective altogether, a fuzzy factor $f$, where $f \in [0, 1]$, is used to relax the selection criterion for the first objective up to a predefined margin (in percentage). The purpose is to explore any potential improvement for $Y$ while maintaining the performance for $X$ within a user-defined range of acceptance. The approach will be particularly effective if a small compromise in $X$ can lead to a large improvement in $Y$. Setting $f = 0$ indicates the high importance of $X$ that should not be compromised at all, while setting $f = 1$ suggests that $X$ does not matter in the optimization. Varying $f$ in between gives the user a flexible and intuitive way to specify the tradeoff between the two objectives.

To implement the fuzzy-based priority approach in the online Greedy Scheduling Framework (GSF) as shown in Algorithm 2, the cost function for the first objective $X$ needs to be normalized between 0 and 1 in order to take the fuzzy factor into account, i.e.,

$$\bar{H}_{i,j}^{X} = \frac{H_{i,j}^{X} - H_{i,\min}^{X}}{H_{i,\max}^{X} - H_{i,\min}^{X}}, \tag{23}$$

where $H_{i,\min}^{X}$ and $H_{i,\max}^{X}$ denote the minimum and maximum costs in terms of objective $X$ among all available servers to assign job $J_i$. The implementation then relies on the following rule for comparing the relative cost of assignment on any two servers.

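Fuzzy-based priority rule (for two objectives): The costs incurred by assigning job $J_i$ to any two servers $M_{j_1}$ and $M_{j_2}$ satisfy $H_{i,j_1}^{X,Y} < H_{i,j_2}^{X,Y}$ if and only if one of the following conditions holds, where $\bar{H}^{X}$ denotes the normalized cost of Eq. (23):

• $\bar{H}_{i,j_1}^{X} \le f < \bar{H}_{i,j_2}^{X}$, or

• $\bar{H}_{i,j_1}^{X} \le f$ and $\bar{H}_{i,j_2}^{X} \le f$ and $H_{i,j_1}^{Y} < H_{i,j_2}^{Y}$, or

• $\bar{H}_{i,j_1}^{X} < \bar{H}_{i,j_2}^{X} \le f$ and $H_{i,j_1}^{Y} = H_{i,j_2}^{Y}$, or

• $f < \bar{H}_{i,j_1}^{X} < \bar{H}_{i,j_2}^{X}$, or

• $f < \bar{H}_{i,j_1}^{X} = \bar{H}_{i,j_2}^{X}$ and $H_{i,j_1}^{Y} < H_{i,j_2}^{Y}$.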

This rule can be applied to optimize any two objectives, as long as they have well-defined cost functions, such as the ones given in Section 4.2. The value of the fuzzy factor as well as the priority depend on the relative importance of the two objectives to optimize, which can be determined by the user or the system administrator.
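The rule can be read as a comparator applied to normalized costs. The following is a minimal Python sketch of that reading, not the authors' implementation: the function names `normalize`, `fuzzy_less` and `pick_server` are hypothetical, costs are assumed to be plain floats, and servers with identical costs for $X$ are assumed to normalize to 0.

```python
from typing import Sequence


def normalize(costs: Sequence[float]) -> list[float]:
    """Rescale costs to [0, 1] as in Eq. (23); all-equal costs map to 0 (assumption)."""
    lo, hi = min(costs), max(costs)
    if hi == lo:
        return [0.0 for _ in costs]
    return [(c - lo) / (hi - lo) for c in costs]


def fuzzy_less(x1: float, y1: float, x2: float, y2: float, f: float) -> bool:
    """Return True if server 1 is strictly cheaper than server 2 under the rule.

    x1, x2: normalized costs for the first objective X.
    y1, y2: costs for the second objective Y.
    f:      fuzzy factor in [0, 1].
    """
    if x1 <= f < x2:                # condition 1: only server 1 is within the margin
        return True
    if x1 <= f and x2 <= f:         # conditions 2 and 3: Y decides, X breaks ties
        return y1 < y2 or (y1 == y2 and x1 < x2)
    if f < x1 and f < x2:           # conditions 4 and 5: X decides, Y breaks ties
        return x1 < x2 or (x1 == x2 and y1 < y2)
    return False                    # server 2 is within the margin, server 1 is not


def pick_server(x_costs: Sequence[float], y_costs: Sequence[float], f: float) -> int:
    """Index of a cheapest server under the composite cost of Eq. (22)."""
    xs = normalize(x_costs)
    best = 0
    for j in range(1, len(xs)):
        if fuzzy_less(xs[j], y_costs[j], xs[best], y_costs[best], f):
            best = j
    return best


if __name__ == "__main__":
    x = [10.0, 11.0, 20.0]   # illustrative costs for X (e.g., energy)
    y = [9.0, 3.0, 1.0]      # illustrative costs for Y (e.g., response time)
    for f in (0.0, 0.2, 1.0):
        print(f, pick_server(x, y, f))
```

With these illustrative numbers, relaxing $f$ from 0 to 0.2 already moves the choice from the first server to the second, whose cost for $Y$ is much lower, which is the situation the approach is designed to exploit.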

4.3.2. Extension to multi-objective scheduling

The fuzzy-based priority approach can be extended to include more than two objectives. As in the bi-objective case, we can optimize a sequence of objectives one after another, while using a (possibly different) fuzzy factor to specify the acceptable range for each objective. The following illustrates this method with a composite cost function consisting of $s$ objectives:

$$H_{i,j}^{X_1,X_2,\ldots,X_s} = \langle H_{i,j}^{X_1}(f_1),\, H_{i,j}^{X_2}(f_2),\, \ldots,\, H_{i,j}^{X_s} \rangle. \qquad (24)$$

In this case, the servers that are ranked among the top $f_1$ percent in terms of objective $X_1$ will be selected first. Then, within this subset, the ones that fall into the top $f_2$ percent in terms of objective $X_2$


Fig. 1. Comparison of the fuzzy-based priority approach with four other approaches in bi-objective scheduling. Each dot represents a potential solution, and the solution returned by each approach is indicated.

objective is considered. Finally, a server that survives the first $s-1$ rounds of selection and has the best performance in terms of the last objective $X_s$ will be chosen as the final winner.

Again, the order of the priorities and the values of the fuzzy factors should be determined by the relative importance of different objectives to optimize.
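The round-by-round selection can be sketched as successive filtering. The sketch below is only one possible reading: it assumes that each round keeps the servers whose normalized cost, recomputed over the current survivors as in Eq. (23), is at most the fuzzy factor of that round. The name `select_server` and the numbers in the example are hypothetical.

```python
def normalize(costs):
    """Rescale costs to [0, 1]; all-equal costs map to 0 (assumption)."""
    lo, hi = min(costs), max(costs)
    return [0.0 if hi == lo else (c - lo) / (hi - lo) for c in costs]


def select_server(cost_columns, fuzzy_factors):
    """Pick a server for s objectives listed in priority order.

    cost_columns[k][j] is the cost of server j for objective k.
    fuzzy_factors holds s - 1 factors in [0, 1], one per objective except the last.
    """
    if len(fuzzy_factors) != len(cost_columns) - 1:
        raise ValueError("need one fuzzy factor per objective except the last")
    survivors = list(range(len(cost_columns[0])))
    for costs, f in zip(cost_columns[:-1], fuzzy_factors):
        # Normalize over the surviving servers only, then keep those within the margin.
        norm = normalize([costs[j] for j in survivors])
        survivors = [j for j, h in zip(survivors, norm) if h <= f]
    last = cost_columns[-1]
    # The best survivor for the last objective is the final winner.
    return min(survivors, key=lambda j: last[j])


if __name__ == "__main__":
    energy = [100.0, 105.0, 150.0, 200.0]   # objective X1 (illustrative)
    inlet = [30.0, 26.0, 25.0, 20.0]        # objective X2 (illustrative)
    time = [900.0, 800.0, 500.0, 300.0]     # objective X3 (illustrative)
    print(select_server([energy, inlet, time], [0.5, 0.5]))
```

The survivor set is never empty for $f \ge 0$, since the server with the minimum cost in a round always has normalized cost 0.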

4.3.3. Comparison with other approaches

We now comment on the similarities and differences of the fuzzy-based priority approach in comparison with a few other multi-objective optimization approaches commonly found in the literature. Fig. 1 illustrates the basic principles of these approaches using bi-objective scheduling as an example. Section 6 describes some related work on the applications of these approaches in multi-objective scheduling.

(1) Simple priority. This is a special case of the fuzzy-based priority approach with fuzzy factor $f = 0$. It is usually applied in settings where strict priorities are imposed on different objectives. This approach provides a better result for the first objective, but may lead to much worse performance for the second one. In contrast, the fuzzy-based priority approach is more effective in settings with soft (or non-strict) priorities, especially if an objective with slightly lower priority can be significantly improved with just a little compromise for a high-priority objective.

(2) Pareto frontier. This approach returns a set of nondominated solutions⁵ to the user instead of only one solution. It is widely applied in offline settings to quantify the tradeoffs among different objectives. In the context of online scheduling, however, multiple solutions are hard to maintain over time, and one of the intermediate solutions must be selected on-the-fly in order to decide where each job should be assigned.

(3) Constraint optimization. This approach optimizes one objective subject to certain constraints imposed on the other(s). It is commonly applied in environments with strict or clearly-defined requirements, e.g., job deadline or energy budget. Instead of using an absolute value as the constraint, the fuzzy-based priority approach specifies the constraint as a relative threshold, i.e., fuzzy factor, in terms of percentage.

⁵ A solution is called nondominated if no other solution has better performance in terms of all the objectives.

Table 1
Values of the parameters used in the simulation.

| Parameter | Value |
| --- | --- |
| Air density ($\rho$) | 1.168 kg/m³ |
| Air flow rate ($f_j$) | 0.1 m³/s |
| Air heat capacity ($c$) | 1004 J/(°C kg) |
| Base power ($U_j^{base}$) | 130 W |
| Redline temperature ($T_{red}$) | 25 °C |

(4) Weighted sum. This approach transforms multiple objectives into a single one by optimizing a weighted combination. Although priorities are not explicitly specified, it uses weights to indicate the relative importance of the objectives. As different objectives can have different units, they are often normalized in order to be combined. However, it may not be intuitive to set the values of the weights, e.g., for time and energy.

Compared to simple priority and constraint optimization, fuzzy-based priority is particularly suitable for scheduling HPC applications in datacenters, where no strict constraints or priorities are normally imposed on job performance or energy consumption. Compared to weighted sum, fuzzy-based priority provides an intuitive alternative for describing the tradeoffs while specifying the soft preference of the user on the priority of the objectives. Setting an appropriate fuzzy factor encodes such a preference in an online manner. As shown in Fig. 1, the solution returned by fuzzy-based priority (and the other approaches) when scheduling an individual job actually lies on the Pareto frontier.
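The four selection principles can be contrasted on a small set of candidate assignments. The sketch below is illustrative only: the candidate points, the budget, the weight and all function names are hypothetical, both objectives are minimized, and dominance is taken in the usual sense (no worse in both objectives and strictly better in one), which is slightly stricter than the wording of footnote 5. Ties in the selected objective are broken by the other objective so that the returned point is nondominated.

```python
def normalize(values):
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]


def pareto_front(points):
    """Indices of points not dominated by any other point (minimization)."""
    front = []
    for i, (xi, yi) in enumerate(points):
        dominated = any(
            xj <= xi and yj <= yi and (xj < xi or yj < yi)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


def simple_priority(points):
    # Strict priority: best X first, Y only breaks ties.
    return min(range(len(points)), key=lambda i: (points[i][0], points[i][1]))


def fuzzy_priority(points, f):
    # Keep points whose normalized X is within the margin f, then take the best Y.
    nx = normalize([p[0] for p in points])
    keep = [i for i in range(len(points)) if nx[i] <= f]
    return min(keep, key=lambda i: (points[i][1], points[i][0]))


def constrained(points, x_budget):
    # Best Y subject to an absolute bound on X (the bound is assumed feasible).
    keep = [i for i in range(len(points)) if points[i][0] <= x_budget]
    return min(keep, key=lambda i: (points[i][1], points[i][0]))


def weighted_sum(points, w):
    # Minimize w * X + (1 - w) * Y on normalized objectives, with 0 < w < 1.
    nx = normalize([p[0] for p in points])
    ny = normalize([p[1] for p in points])
    return min(range(len(points)), key=lambda i: w * nx[i] + (1 - w) * ny[i])


POINTS = [(1.0, 9.0), (2.0, 4.0), (4.0, 3.0), (6.0, 1.0), (5.0, 5.0), (3.0, 8.0)]
FRONT = pareto_front(POINTS)
CHOICES = {
    "simple priority": simple_priority(POINTS),
    "fuzzy priority (f=0.3)": fuzzy_priority(POINTS, 0.3),
    "constraint (X <= 4.5)": constrained(POINTS, 4.5),
    "weighted sum (w=0.5)": weighted_sum(POINTS, 0.5),
}

if __name__ == "__main__":
    for name, idx in CHOICES.items():
        print(name, POINTS[idx], "on front:", idx in FRONT)
```

The difference between the approaches is therefore not whether the returned point is nondominated, but which point of the frontier is picked and how the user expresses that preference.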

5. Performance evaluations

In this section, we will evaluate the proposed online scheduling heuristics with the fuzzy-based priority approach and the greedy heuristic for server placement. The evaluations are performed by simulation using the Data Center Workload and Resource Management Simulator (DCworms) [22].

5.1. Simulation setup

5.1.1. Datacenter configuration

We simulate a datacenter with 50 servers, which has the same configuration as the one considered in [38]. Specifically, the datacenter consists of two rows of racks in a typical cold aisle and hot aisle layout. The cool air is supplied by the CRAC unit from the cold aisle between the two rows. Each row has five racks and each rack contains five servers. The server platform used in the simulation is based on Christmann's Resource Efficient Cluster Server (RECS) unit [8], which is a multi-node computer system consisting of 18 processors. The datacenter consists of 900 processors in total. The RECS platform is chosen because it represents an emerging class of high-density and energy-efficient servers with built-in power and temperature sensors and integrated cooling support.

Table 1 shows the parameters used in the simulation, whose values are based on real measurements in a RECS unit. From the first three parameters, the heat recirculation matrix $D$ is derived by assuming the same air flow pattern as the one measured in [39,38]. The coefficient of performance (CoP) is based on the one in an HP datacenter [25] as shown by Eq. (14).

5.1.2. Processor types

To construct a heterogeneous datacenter, we select a set of five nondominated processors in terms of performance and energy indices (the smaller the better). The performance index of a processor is calculated as the reciprocal of its performance score measured


Fig. 2. The performance and energy indices of 500+ processors released by Intel between 2009 and 2013. Five processors (marked) in the Pareto frontier are selected for our simulation. [Axes: performance index (horizontal), energy index (vertical). Marked processors: Xeon E5-2697 v2, Core i7-4770R, Core i7-4960HQ, Xeon E3-1230L v3, Core i7-4600U.]

Table 2
Passmark scores (as of January 2014) and TDPs of five types of processors used in the simulation.

| Processor | Passmark | TDP (W) |
| --- | --- | --- |
| Intel Core i7-4770R | 10,381 | 65 |
| Intel Core i7-4960HQ | 10,310 | 47 |
| Intel Core i7-4600U | 4498 | 15 |
| Intel Xeon E5-2697 v2 | 19,125 | 130 |
| Intel Xeon E3-1230L v3 | 7344 | 25 |

benchmark results as the processor's performance indicator. The energy index is simply the product of the processor's performance index and its Thermal Design Power (TDP), which gives a relative indicator (compared to other processors) on the average energy the processor consumes when running typical benchmarks.

Fig. 2 plots the two indices for more than 500 types of processors released by Intel between 2009 and 2013, among which five processors in the Pareto frontier are selected (marked in the figure). Table 2 shows the Passmark scores and TDPs of the five selected processors. We choose these processors because they form a nondominated set, making the scheduling problem non-trivial. In this case, no processor is dominated by others in terms of both performance and energy consumption; hence a tradeoff exists when assigning a job to different processor types. In the simulation, each type of processor makes up 10 RECS servers with 180 computing nodes in total.
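The two indices and the nondominance of the selected set can be checked directly from the values in Table 2. The following sketch computes the performance index as the reciprocal of the Passmark score and the energy index as its product with the TDP, as defined above; the helper names `indices` and `is_dominated` are hypothetical.

```python
# Passmark score and TDP (W) copied from Table 2.
PROCESSORS = {
    "Core i7-4770R": (10381, 65),
    "Core i7-4960HQ": (10310, 47),
    "Core i7-4600U": (4498, 15),
    "Xeon E5-2697 v2": (19125, 130),
    "Xeon E3-1230L v3": (7344, 25),
}


def indices(passmark, tdp_watts):
    """Return (performance index, energy index); smaller is better for both."""
    perf_index = 1.0 / passmark
    energy_index = perf_index * tdp_watts
    return perf_index, energy_index


INDEX = {name: indices(*spec) for name, spec in PROCESSORS.items()}


def is_dominated(name, table):
    """True if another processor is no worse in both indices and better in one."""
    p, e = table[name]
    return any(
        q <= p and g <= e and (q < p or g < e)
        for other, (q, g) in table.items()
        if other != name
    )


if __name__ == "__main__":
    # List processors from fastest to slowest; the energy index decreases along the way.
    for name, (p, e) in sorted(INDEX.items(), key=lambda kv: kv[1][0]):
        print(f"{name:18s} perf={p:.3e} energy={e:.5f} dominated={is_dominated(name, INDEX)}")
```

Sorted by performance index, the energy index decreases monotonically from the Xeon E5-2697 v2 to the Core i7-4600U, so a faster processor in this set always costs more energy per unit of work.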

5.1.3. Benchmarks and workloads

The benchmarks used in the simulation consist of the following high-performance computing applications, which are included in DCworms.

• fft: a program to compute Fast Fourier Transforms.

• c-ray: a ray tracing software.

• abinit: a tool to compute material properties at the atom level.

• linpack: a library for performing numerical linear algebra.

• tar: a program to create and manipulate tar archives.

These benchmarks exhibit a large spectrum of behaviors, from CPU intensive to memory intensive, to communication and I/O intensive. More explanation and rationale of this choice can be found in [10]. To profile the execution time and power consumption of these benchmarks, an application-specific approach [22] was adopted. Specifically, average measurements are collected for each application with different input parameters on Intel Core i7-2715QE, a less powerful processor available in our RECS testbed. The results are then translated to our target platforms using the

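Table 3
Average execution time (in seconds) and power consumption (in Watts) of each benchmark on each type of processor.

| Benchmark | Metric | Core i7-4770R | Core i7-4960HQ | Core i7-4600U | Xeon E5-2697 v2 | Xeon E3-1230L v3 |
| --- | --- | --- | --- | --- | --- | --- |
| fft | Time (s) | 3400 | 3450 | 7850 | 1850 | 4800 |
| fft | Power (W) | 62.27 | 45.03 | 14.37 | 124.54 | 23.95 |
| c-ray | Time (s) | 1150 | 1200 | 2700 | 650 | 1650 |
| c-ray | Power (W) | 33.70 | 24.37 | 7.78 | 67.41 | 12.96 |
| abinit | Time (s) | 1700 | 1750 | 3950 | 950 | 2450 |
| abinit | Power (W) | 36.11 | 26.11 | 8.33 | 72.22 | 13.89 |
| linpack | Time (s) | 3350 | 3400 | 7700 | 1850 | 4750 |
| linpack | Power (W) | 53.81 | 38.91 | 12.42 | 107.61 | 20.69 |
| tar | Time (s) | 2000 | 2050 | 4600 | 1100 | 2800 |
| tar | Power (W) | 50.92 | 36.82 | 11.75 | 101.83 | 19.58 |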

relative performance and power indicators as shown in Table 2. Table 3 details the average execution time and the corresponding power consumption of the benchmarks on each of the five selected processors.
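The time and power values in Table 3 already expose the tradeoff faced by the scheduler. The sketch below multiplies the two to obtain a per-processor dynamic energy for each benchmark and reports the fastest and the most frugal processor type. It is a back-of-the-envelope illustration only: it ignores base power, cooling and queueing, and the names `energy_kj` and `extremes` are hypothetical.

```python
PROCESSORS = ["Core i7-4770R", "Core i7-4960HQ", "Core i7-4600U",
              "Xeon E5-2697 v2", "Xeon E3-1230L v3"]

# Values copied from Table 3, in the processor order above.
TIME_S = {
    "fft":     [3400, 3450, 7850, 1850, 4800],
    "c-ray":   [1150, 1200, 2700, 650, 1650],
    "abinit":  [1700, 1750, 3950, 950, 2450],
    "linpack": [3350, 3400, 7700, 1850, 4750],
    "tar":     [2000, 2050, 4600, 1100, 2800],
}
POWER_W = {
    "fft":     [62.27, 45.03, 14.37, 124.54, 23.95],
    "c-ray":   [33.70, 24.37, 7.78, 67.41, 12.96],
    "abinit":  [36.11, 26.11, 8.33, 72.22, 13.89],
    "linpack": [53.81, 38.91, 12.42, 107.61, 20.69],
    "tar":     [50.92, 36.82, 11.75, 101.83, 19.58],
}


def energy_kj(benchmark):
    """Energy (kJ) of one run on one processor: execution time times power."""
    return [t * p / 1000.0 for t, p in zip(TIME_S[benchmark], POWER_W[benchmark])]


def extremes(benchmark):
    """Return (fastest processor, processor with the lowest energy)."""
    times = TIME_S[benchmark]
    energies = energy_kj(benchmark)
    fastest = PROCESSORS[times.index(min(times))]
    frugal = PROCESSORS[energies.index(min(energies))]
    return fastest, frugal


if __name__ == "__main__":
    for bench in TIME_S:
        print(bench, extremes(bench), [round(e, 1) for e in energy_kj(bench)])
```

For every benchmark the fastest type is the Xeon E5-2697 v2 and the lowest-energy type is the Core i7-4600U, which is consistent with the ordering of the indices derived from Table 2.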

Each job is randomly selected from one of these benchmarks and the number of processors it requires is randomly generated from 1 to 8 with uniform distribution. Following the definition in [11], the system load is defined to be

$$\rho = \frac{\lambda \cdot E[P]}{\sum_{j=1}^{m} L_j}, \qquad (25)$$

where $\lambda$ is the arrival rate (in number of jobs per hour), $E[P]$ is the average sequential execution time of the jobs on all processor types (roughly 4.5 hours) and $\sum_{j=1}^{m} L_j$ is the total number of processors, which is 900 in the simulation. Jobs arrive according to a Poisson process, and the arrival rate is increased from 20 to 200 with a fixed arrival duration of 8 hours. The total number of jobs ranges from 160 to 1600, and the system load is between 0.1 and 1.
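The quoted range of the system load follows directly from Eq. (25). The sketch below reproduces the arithmetic with the figures given in the text (900 processors, roughly 4.5 hours of sequential work per job, 8 hours of arrivals) and adds a simple Poisson arrival generator based on exponential inter-arrival times. The function names and the fixed seed are hypothetical and not taken from DCworms.

```python
import random

# Datacenter size from Section 5.1.1: 2 rows x 5 racks x 5 servers x 18 processors.
ROWS, RACKS_PER_ROW, SERVERS_PER_RACK, PROCS_PER_SERVER = 2, 5, 5, 18
TOTAL_PROCESSORS = ROWS * RACKS_PER_ROW * SERVERS_PER_RACK * PROCS_PER_SERVER


def system_load(rate_per_hour, mean_seq_hours=4.5, processors=TOTAL_PROCESSORS):
    """Eq. (25): arrival rate times mean sequential work, divided by capacity."""
    return rate_per_hour * mean_seq_hours / processors


def expected_jobs(rate_per_hour, duration_hours=8.0):
    """Expected number of arrivals of a Poisson process over the arrival window."""
    return rate_per_hour * duration_hours


def poisson_arrivals(rate_per_hour, duration_hours=8.0, seed=0):
    """Arrival times (hours) drawn with exponential inter-arrival times."""
    rng = random.Random(seed)
    t, arrivals = 0.0, []
    while True:
        t += rng.expovariate(rate_per_hour)
        if t >= duration_hours:
            return arrivals
        arrivals.append(t)


if __name__ == "__main__":
    for rate in (20, 100, 200):
        print(rate, system_load(rate), expected_jobs(rate), len(poisson_arrivals(rate)))
```

An arrival rate of 20 jobs per hour thus corresponds to a load of 0.1 and an expected 160 jobs, and a rate of 200 to a load of 1 and an expected 1600 jobs.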

5.2. Simulation results

This section presents the simulation results. First, we evaluate the performance of various online scheduling heuristics with a fixed placement for the servers. We then study the impact of different placement configurations on the performance of the scheduling heuristics. All results are obtained by carrying out the experiments 10 times and taking the average.

5.2.1. Results of single-objective scheduling heuristics

We first evaluate the online scheduling heuristics for a single objective. The results are used as references for exploring the energy-performance tradeoff in the next section. In both cases, the server placement is fixed with each type of processor occupying 10 contiguous server slots over two racks, according to the order specified in Table 2.

Six heuristics presented in Section 4.2 are evaluated, namely, Uniform, MinHR, CoolestInlet, Perf-Aware, Energy-Aware and Thermal-Aware. Fig. 3 presents the results of these heuristics. As we can see in Fig. 3(a), Perf-Aware has significantly better average job response time compared to the other heuristics, especially under light system loads. This is because all jobs in Perf-Aware are assigned to high-performance (faster) processors before slower ones whenever possible. For the same reason, Perf-Aware also has better makespan (completion time of the last finished job) and processor utilization (ratio between the utilized processor cycles and all processor cycles during the simulation period), as shown in Fig. 3(b) and (c). Note that the processor utilizations remain under 70% even when the system load reaches 1. This is partly due to the fragmented processors in some servers that cannot be utilized


Fig. 3. Performance of six single-objective online scheduling heuristics. The legend applies to all subfigures. [Panels, each plotted against load ρ: (a) average response time (secs); (b) makespan (secs); (c) processor utilization (%); (d) total energy consumption (kWh); (e) computing energy consumption (kWh); (f) cooling energy consumption (kWh). Legend: Uniform, MinHR, CoolestInlet, Perf-Aware, Energy-Aware, Thermal-Aware.]

Fig. 3(d) compares the total (dynamic) energy consumption of the scheduling heuristics, and Fig. 3(e) and (f) show the energy consumed for computing and cooling, separately. For all heuristics, the energy consumption increases with the system load or the total number of jobs in the arrival interval. Energy-Aware consumes less total energy compared to the other heuristics, since jobs are assigned to processors with better energy efficiency. The improvement is more significant in terms of computing energy. For the cooling part, MinHR and Thermal-Aware consume roughly the same energy as Energy-Aware, since they are designed to minimize the heat recirculation and the maximum inlet temperature, which in turn increases the supply temperature in the room and hence directly impacts the cooling cost. Fig. 4 shows the average supply temperature of the different scheduling heuristics in the simulation period. Indeed, Thermal-Aware and MinHR are better than Energy-Aware in terms of the average supply temperature by up to 1.3 °C and 1.6 °C, respectively.
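To see why a supply temperature that is higher by one or two degrees matters, the following sketch evaluates the cooling energy for two supply temperatures. Eq. (14) is not reproduced in this section, so the sketch assumes the quadratic CoP model commonly cited for the HP Labs Utility Data Center, $CoP(T) = 0.0068T^2 + 0.0008T + 0.458$, and assumes that cooling energy equals computing energy divided by the CoP. Both are assumptions to be checked against Eq. (14) and the cooling model of the paper; the function names and the 100 kWh and 20 °C figures are purely illustrative.

```python
def cop(supply_temp_c):
    """Coefficient of performance at a given supply temperature (assumed model)."""
    return 0.0068 * supply_temp_c ** 2 + 0.0008 * supply_temp_c + 0.458


def cooling_energy_kwh(computing_kwh, supply_temp_c):
    """Cooling energy needed to remove the heat of a given computing energy (assumption)."""
    return computing_kwh / cop(supply_temp_c)


if __name__ == "__main__":
    computing = 100.0                      # kWh, illustrative value
    for temp in (20.0, 21.3, 21.6):        # baseline and the two quoted increases
        print(temp, round(cop(temp), 3), round(cooling_energy_kwh(computing, temp), 2))
```

Because the assumed CoP grows quadratically with the supply temperature, even a small increase in the supply temperature reduces the cooling energy for the same computing load.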

As the system load increases further and hence the processor utilization becomes higher, the performance of all heuristics tends to converge, since all servers are roughly equally loaded under all

Fig. 4. Average supply temperature of the heuristics. [Axes: load ρ (horizontal), average supply temperature in °C (vertical). Legend: Uniform, MinHR, CoolestInlet, Perf-Aware, Energy-Aware, Thermal-Aware.]

heuristics. In particular for Energy-Aware, some jobs are forced to be assigned to the high-performance servers since the energy-efficient ones are all occupied, resulting in improved average job response time.

5.2.2. Energy-performance tradeoff with fuzzy-based priority

We now evaluate the effectiveness of the fuzzy-based priority approach for exploring the energy-performance tradeoff in online scheduling. To this end, we consider the composite cost function $H_{i,j}^{E,P} = \langle H_{i,j}^{E}(f),\, H_{i,j}^{P} \rangle$ that optimizes the energy consumption followed by the job response time.

Fig. 5 shows the results of minimizing $H_{i,j}^{E,P}$ when the fuzzy factor $f$ is increased from 0 to 1 at three different system loads (0.2, 0.5 and 0.8). The values of both objectives are plotted as a function of $f$, with energy consumption shown on the left Y axis and average response time on the right. In addition, the figure also shows the results when $f = -1$ and $f = 2$, denoting the cases where the scheduling decision is based solely on the first objective (energy) and the second objective (response time). The two cases are equivalent to the single-objective heuristics Energy-Aware and Perf-Aware, respectively.

As we can see, the average response time improves with increased fuzzy factor at the expense of the energy consumption under all system loads. However, the improvement can be significant even before a major compromise in energy consumption is observed. For instance, at medium load ($\rho = 0.5$), the response time is reduced by about 1000 seconds when $f$ reaches 0.6 without much increase in the energy consumption. Similar results can also be observed at light load and heavy load. The fuzzy-based priority approach can take advantage of such characteristics by setting suitable fuzzy factors in order to achieve a desirable energy-performance tradeoff in the online setting.
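The effect of sweeping the fuzzy factor can be illustrated for a single job using the fft row of Table 3. The sketch below takes energy (time times power) as the first objective and execution time as the second, and reports the processor type selected for a few values of $f$. It is a static, single-job illustration under strong simplifications (no queueing, no base power, no cooling), so it does not reproduce the simulated curves of Fig. 5; the name `choose` and the sampled values of $f$ are hypothetical.

```python
# fft row of Table 3: execution time (s) and power (W) per processor type.
FFT = {
    "Core i7-4770R": (3400, 62.27),
    "Core i7-4960HQ": (3450, 45.03),
    "Core i7-4600U": (7850, 14.37),
    "Xeon E5-2697 v2": (1850, 124.54),
    "Xeon E3-1230L v3": (4800, 23.95),
}

ENERGY_J = {name: t * p for name, (t, p) in FFT.items()}
E_MIN, E_MAX = min(ENERGY_J.values()), max(ENERGY_J.values())


def normalized_energy(name):
    """Energy cost rescaled to [0, 1] over the five processor types, as in Eq. (23)."""
    return (ENERGY_J[name] - E_MIN) / (E_MAX - E_MIN)


def choose(f):
    """Fastest processor among those whose normalized energy is at most f."""
    candidates = [name for name in FFT if normalized_energy(name) <= f]
    # Ties in execution time are broken by lower energy.
    return min(candidates, key=lambda name: (FFT[name][0], ENERGY_J[name]))


if __name__ == "__main__":
    for f in (0.0, 0.02, 0.4, 0.85, 1.0):
        name = choose(f)
        print(f, name, FFT[name][0], round(ENERGY_J[name] / 1000.0, 1))
```

In this toy setting, a margin of about 2% on energy already moves the choice from the Core i7-4600U (7850 s) to the Xeon E3-1230L v3 (4800 s), which illustrates how a small compromise on the first objective can yield a large gain on the second.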

Fig. 6 shows the energy-performance tradeoff curve for $H_{i,j}^{E,P} = \langle H_{i,j}^{E}(f),\, H_{i,j}^{P} \rangle$


Fig. 5. Bi-objective scheduling for $H_{i,j}^{E,P} = \langle H_{i,j}^{E}(f),\, H_{i,j}^{P} \rangle$ with different fuzzy factors at three system loads. The legend applies to all subfigures. [Panels: (a) load ρ = 0.2; (b) load ρ = 0.5; (c) load ρ = 0.8. Axes: fuzzy factor f (horizontal), total energy consumption in kWh (left), average response time in secs (right). Legend: Energy, Time.]

Fig. 6. Energy-performance tradeoff curve for $H_{i,j}^{E,P} = \langle H_{i,j}^{E}(f),\, H_{i,j}^{P} \rangle$ at three system loads. The legend applies to all subfigures. [Panels: (a) load ρ = 0.2; (b) load ρ = 0.5; (c) load ρ = 0.8. Axes: total energy consumption in kWh (horizontal), average response time in secs (vertical). Legend: Uniform, MinHR, CoolestInlet, Perf-Aware, Energy-Aware, Thermal-Aware.]

results of the six single-objective heuristics are also shown in the figure under the respective load. We can see that MinHR and Thermal-Aware lie around the curve (or even slightly to the left of the curve in the case of MinHR), indicating that they achieve fairly efficient tradeoffs between job response time and energy consumption. On the other hand, Uniform and CoolestInlet are completely dominated by the curve, which suggests that they provide less attractive tradeoff results.

Fig. 7 plots the tradeoff curves achieved by optimizing the heat recirculation and the maximum inlet temperature followed by the job response time, i.e., with cost functions $H_{i,j}^{HR,P} = \langle H_{i,j}^{HR}(f),\, H_{i,j}^{P} \rangle$ and $H_{i,j}^{T,P} = \langle H_{i,j}^{T}(f),\, H_{i,j}^{P} \rangle$. The results under three different system loads are shown alongside the ones for $H_{i,j}^{E,P}$. The curves indicate that the two heuristics are able to provide better tradeoffs in the medium to high energy range (e.g., between 150 and 220 kWh for MinHR at $\rho = 0.5$) while the tradeoff remains efficient for the cost function $H_{i,j}^{E,P}$ when the energy consumption is close to the minimum. The results demonstrate the flexibility of the fuzzy-based priority approach in exploring the energy-performance tradeoff in online scheduling. The approach can be potentially applied to other multi-objective optimization problems.

5.2.3. Evaluation of server placement strategies

We now study the impact of server placement on the performance of the online scheduling heuristics. Besides the simple location-based placement used in the previous evaluations, which we call LOC, we generate three additional placements for the servers. One is based on our GSP heuristic and the other two are based on its variations. We call the three placement configurations GSP1, GSP2 and GSP3, respectively. The two variants (GSP2 and GSP3) are obtained in a similar fashion as GSP1. In particular, in GSP2 the servers are sorted in ascending order of reference power instead of descending order, and in GSP3 each server is assigned to a remaining rack slot that maximizes the maximum inlet temperature instead of minimizing it. Apparently, these two heuristics are counter-intuitive and are expected to provide undesirable configurations. The purpose of including them

Fig. 7. Energy-performance tradeoff curves for $H_{i,j}^{E,P}$, $H_{i,j}^{HR,P}$ and $H_{i,j}^{T,P}$ at three system loads. The legend applies to all subfigures. [Panels: (a) load ρ = 0.2; (b) load ρ = 0.5; (c) load ρ = 0.8. Vertical axis: average response time (secs). Legend: $H_{i,j}^{E,P}$, $H_{i,j}^{HR,P}$, $H_{i,j}^{T,P}$.]


Fig. 8. Inlet temperature distribution of the 50 servers under four different server placements. The maximum inlet temperature of each placement is indicated in the legend and by the horizontal line. [Axes: server (horizontal), inlet temperature in °C (vertical). Legend: GSP1 (32.2 °C), GSP2 (46.4 °C), GSP3 (48.4 °C), LOC (40.1 °C).]

is to demonstrate the impact of different server placements on a scheduling algorithm's performance, especially on the cooling cost.

Fig. 8 shows the inlet temperature distribution of the 50 servers under the four placement configurations. In all cases, each processor is loaded with the average power consumption of the benchmarks shown in Table 3. As we can see, GSP1 has better thermal balance than the other configurations. Specifically, it improves LOC by about 8 °C in terms of the maximum inlet temperature and improves GSP2 and GSP3 by over 14 °C and 16 °C, respectively.

Figs. 9 and 10 show the performance of Perf-Aware and Energy-Aware under the four server placements at different system loads. In both heuristics, job response time and computing energy are not affected by different configurations. However, GSP1 has reduced cooling energy compared to the other configurations. This is particularly evident under heavy system load, where all servers are almost fully and equally loaded, thus their power consumption ratios closely match those of the average values used in the server placement heuristic. Under light system load, however, the servers could experience unbalanced loads, which causes their power consumption ratios to deviate from those of the average values. As a result, the advantage of GSP1 becomes smaller or even disappears, but since the overall energy consumption is small in this case, the impact of server placement is not significant.

A quite similar effect on the cooling energy can be observed for Thermal-Aware and MinHR as shown in Figs. 11 and 12. Notice that, for these two heuristics, different server placements also lead to a tradeoff between job response time and computing energy. To further investigate the tradeoff efficiency, Fig. 13 shows the energy-performance tradeoff curves for three heuristics with cost functions $H_{i,j}^{E,P}$, $H_{i,j}^{HR,P}$ and $H_{i,j}^{T,P}$ at load $\rho = 0.8$ under different server placements. We can see that, although the tradeoff remains, in all cases GSP1 provides the best cooling energy and hence improves the overall tradeoff efficiency. Note that MinHR and Perf-Aware behave exactly the same under GSP1, since servers with faster processors and hence higher power consumption are placed in the slots with less heat recirculation. Therefore, the same performance and energy are observed for $H_{i,j}^{HR,P}$ regardless of the fuzzy factor, as shown in Fig. 13(b).

The results confirm that strategic server placement indeed improves the thermal balance in a heterogeneous datacenter, which helps reduce the cooling cost. This is achieved with little impact on the job response time and computing energy, or the tradeoff between them.

Fig. 9. Performance of Perf-Aware under different server placements and system loads. The legend applies to all subfigures. [Panels, each plotted against load ρ: (a) average response time (secs); (b) computing energy consumption (kWh); (c) cooling energy consumption (kWh). Legend: GSP1, GSP2, GSP3, LOC.]

[Fig. 10. Panels, each plotted against load ρ: (a) average response time (secs); (b) computing energy consumption (kWh); (c) cooling energy consumption (kWh). Legend: GSP1, GSP2, GSP3, LOC.]


Fig. 11. Performance of Thermal-Aware under different server placements and system loads. The legend applies to all subfigures. [Panels, each plotted against load ρ: (a) average response time (secs); (b) computing energy consumption (kWh); (c) cooling energy consumption (kWh). Legend: GSP1, GSP2, GSP3, LOC.]

Fig. 12. Performance of MinHR under different server placements and system loads. The legend applies to all subfigures. [Panels, each plotted against load ρ: (a) average response time (secs); (b) computing energy consumption (kWh); (c) cooling energy consumption (kWh). Legend: GSP1, GSP2, GSP3, LOC.]

Fig. 13. Energy-performance tradeoff curves for $H_{i,j}^{E,P}$, $H_{i,j}^{HR,P}$ and $H_{i,j}^{T,P}$ under four different server placements at load ρ = 0.8. The legend applies to all subfigures. [Panels: (a) $H_{i,j}^{E,P}$; (b) $H_{i,j}^{HR,P}$; (c) $H_{i,j}^{T,P}$. Vertical axis: average response time (secs). Legend: GSP1, GSP2, GSP3, LOC.]

6. Related work

In this section, we review some related work in the literature on multi-objective scheduling and thermal-aware scheduling for datacenters.

6.1. Multi-objective scheduling

Scheduling with multiple conflicting objectives has attracted much attention in many optimization problems. Section 4.3 described a few commonly used approaches. The following reviews some applications of these approaches in various problem domains.

(1) Simple priority. This is a simple priority-based approach to optimize multiple objectives in sequence. Assayad et al. [2] introduced a bi-criteria compromise function to set priorities between makespan and reliability for scheduling real-time applications. To minimize carbon emission and to maximize profit, two-step policies were proposed by Garg et al. [18] to map applications to heterogeneous datacenters based on the relative priority of the two objectives. Du et al. [12] proposed heuristics to optimize the QoS for interactive services before considering energy consumption on multicore processors with DVFS (Dynamic Voltage & Frequency Scaling) capability.

(2) Pareto frontier. This approach is often used in the offline setting to generate a set of nondominated solutions. Durillo et al. [13] applied this technique to trade off makespan and energy consumption for heterogeneous servers. Torabi et al. [41] used particle swarm optimization to approximate the Pareto frontier for the unrelated machine scheduling problem with uncertainties in the inputs. Gao et al. [15] utilized ant colony optimization to obtain the Pareto frontier for resource wastage and power consumption in virtual machine placement. Evolutionary