HAL Id: hal-00301427
https://hal.archives-ouvertes.fr/hal-00301427
Submitted on 21 Jul 2008
A parallelisable multi-level banded diffusion scheme for
computing balanced partitions with smooth boundaries
François Pellegrini
To cite this version:
François Pellegrini. A parallelisable multi-level banded diffusion scheme for computing balanced partitions with smooth boundaries. EuroPar, Aug 2007, Rennes, France. pp.195-204, 10.1007/978-3-540-74466-5_22. hal-00301427
A parallelisable multi-level banded diffusion scheme for computing balanced partitions with smooth boundaries
François Pellegrini
ENSEIRB, LaBRI and INRIA Futurs, Université Bordeaux I
351, cours de la Libération, 33405 TALENCE, FRANCE
pelegrin@labri.fr
Abstract. Graph partitioning algorithms have yet to be improved, because graph-based local optimization algorithms do not compute smooth and globally-optimal frontiers, while global optimization algorithms are too expensive to be of practical use on large graphs. This paper presents a way to integrate a global optimization, diffusion algorithm in a banded multi-level framework, which dramatically reduces problem size while yielding balanced partitions with smooth boundaries. Since all of these algorithms do parallelize well, high-quality parallel graph partitioners built using these algorithms will have the same quality as state-of-the-art sequential partitioners.
1 Introduction
Graph partitioning is a ubiquitous technique which has applications in many fields of computer science and engineering, such as workload balancing in parallel computing, database storage, VLSI design or bio-informatics. It is mostly used to help solve domain-dependent optimization problems modeled in terms of weighted or unweighted graphs, where finding good solutions amounts to computing, possibly recursively in a divide-and-conquer framework, small vertex or edge cuts that evenly balance the weights of the graph parts.

Many algorithms have been proposed to compute efficient partitions of any graphs, such as graph or evolutionary algorithms, spectral methods, or linear optimization methods. Basically, all of these methods belong to two distinct classes: global methods, which consider all of the graph data, and local optimization heuristics, which try to improve locally a preexisting partition. Global methods often yield better results, but their costs dramatically increase along with problem size, which makes them practically impossible to use for graphs comprising several tens of millions of vertices, which are the graphs now being considered in many scientific engineering problems.
The multi-level approach [5,6] has been a quite successful attempt to combine both approaches. It consists in repeatedly computing a set of increasingly coarse graphs, by means of matchings which collapse vertices and edges, until the coarsest graph obtained is no larger than a few hundred vertices, then computing a separator on this coarsest graph, and projecting back this separator, from coarser to finer graphs, up to the original graph. Most often, a local optimization algorithm, such as Kernighan-Lin [7] or Fiduccia-Mattheyses [4] (FM), is used in the uncoarsening phase to refine the partition that is projected back at every level, such that the granularity of the solution is the one of the original graph and not the one of the coarsest graph, as illustrated in Figure 1. This approach improves quality over plain graph algorithms, and speed over plain global optimization algorithms, by taking the best of both worlds. Global optimization algorithms can be used on small graphs to give the general direction of the partition to set, and inexpensive local optimization algorithms can be used at low cost on finer graphs with tens of millions of vertices.

Fig. 1. Multi-level framework for computing a bipartition of a graph. (Diagram labels: coarsening phase, initial partitioning, uncoarsening phase, projected partition, refined partition.)
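As an illustration of one coarsening step, the following minimal sketch collapses a greedy matching into a coarser graph. The heavy-edge selection rule, the dictionary-based graph representation and the function name `coarsen` are assumptions made for this example; they are not taken from the paper or from Scotch.

```python
def coarsen(adj, vwgt):
    """One coarsening step (illustrative sketch).

    adj:  {v: {u: edge_weight}} symmetric adjacency (assumed representation)
    vwgt: {v: vertex_weight}
    Returns (coarse_adj, coarse_vwgt, fine_to_coarse).
    """
    mate = {}
    for v in adj:
        if v in mate:
            continue
        # Assumed rule: match v with its unmatched neighbour of heaviest edge.
        candidates = [(w, u) for u, w in adj[v].items() if u not in mate]
        if candidates:
            _, u = max(candidates)
            mate[v], mate[u] = u, v
        else:
            mate[v] = v  # v stays alone in its coarse vertex

    # Number the coarse vertices: both ends of a matched edge share one index.
    fine_to_coarse, next_id = {}, 0
    for v in adj:
        if v not in fine_to_coarse:
            fine_to_coarse[v] = fine_to_coarse[mate[v]] = next_id
            next_id += 1

    # Collapse vertices and edges, summing weights.
    coarse_adj = {c: {} for c in range(next_id)}
    coarse_vwgt = {c: 0 for c in range(next_id)}
    for v in adj:
        cv = fine_to_coarse[v]
        coarse_vwgt[cv] += vwgt[v]
        for u, w in adj[v].items():
            cu = fine_to_coarse[u]
            if cu != cv:
                coarse_adj[cv][cu] = coarse_adj[cv].get(cu, 0) + w
    return coarse_adj, coarse_vwgt, fine_to_coarse


# A path of four unit-weight vertices collapses into two coarse vertices.
path4 = {0: {1: 1}, 1: {0: 1, 2: 1}, 2: {1: 1, 3: 1}, 3: {2: 1}}
coarse_adj, coarse_vwgt, mapping = coarsen(path4, {v: 1 for v in path4})
```

Applying such a step repeatedly yields the hierarchy of graphs on which the coarsest partition is computed and then projected back; the `mapping` dictionary is what a projection step would use to copy part labels from coarse to fine vertices.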
However, the quality of partitions produced by this approach is not as good as the one that would be yielded by plain global optimization algorithms. Coarsening artifacts, as well as the meshing topology of the original graphs, trap local optimization algorithms in local optima of their cost functions, such that frontiers are often made of non-optimal sets of segments, as illustrated in Figure 5.a.

This paper describes an efficient way to integrate diffusion schemes into a multi-level framework, so as to compute partitions with small and smooth frontiers in a time equivalent in magnitude to the one of state-of-the-art local optimization algorithms. It is organized as follows. After presenting related works in Section 2, we introduce in Section 3 our multi-level banded diffusion scheme, and show some partitioning and mapping results, obtained with Scotch 5.0, in Section 4. Then comes the conclusion.
2 Related works
Many authors had already noticed that partitions yielded by local optimization [...] that such partitions were not fitted for their purpose, as subdomains with longer frontiers or irregular shapes resulted in a larger number of iterations to achieve convergence. To measure the quality of each of the parts, several authors defined a metric called aspect ratio, which can be thought of in 2D as a measure of the perimeter of a part with respect to the square root of its area. The more compact a part is, the smaller its aspect ratio value is, as ideal parts are of circular shape in the Euclidean space.
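To make the 2D intuition concrete, the sketch below takes the aspect ratio to be the perimeter divided by the square root of the area. This exact normalization is an assumption chosen for illustration; the authors cited in this section use several variants.

```python
import math


def aspect_ratio(perimeter, area):
    """Perimeter relative to the square root of the area (assumed definition)."""
    return perimeter / math.sqrt(area)


circle = aspect_ratio(2 * math.pi * 1.0, math.pi * 1.0 ** 2)  # disc of radius 1
square = aspect_ratio(4 * 1.0, 1.0 ** 2)                      # unit square
strip = aspect_ratio(2 * (10.0 + 0.1), 10.0 * 0.1)            # thin 10 x 0.1 rectangle
```

Under this definition the disc reaches the smallest value, 2√π, the unit square gives 4, and the elongated strip gives a much larger value, which matches the statement that compact parts have small aspect ratios.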
In [3], Diekmann et al. evidenced such a behavior, and proposed both a measure of the aspect ratio of the parts and a set of heuristics to create and refine the partitions, with the objective of decreasing their aspect ratio. Among these algorithms is a bubble-growing algorithm. This algorithm is based on the observation that sets of soap bubbles self-organize so as to minimize the surface of their interfaces, which is indeed what is expected from a partitioning algorithm. Consequently, the authors' idea was to grow, from as many seed vertices as the desired number of parts, a collection of expanding bubbles, by performing breadth-first traversals rooted at these seed vertices. Once every graph vertex has been assigned to some part, each part computes its center based on the graph distance metric. These center vertices are taken as new seeds and the expansion process is started again, until it converges, that is, until centers of subdomains no longer move. An important drawback of this method is that it does not guarantee that all parts will hold the same number of vertices, which requires calling other heuristics in turn to perform load balancing. Also, all of the graph vertices must be visited many times, which makes this algorithm quite expensive, all the more so when it is combined with costly algorithms such as simulated annealing, and the computation of the aspect ratio requires some knowledge of the geometry of the graphs, which is not always available.
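The bubble-growing loop described above can be sketched as follows for an unweighted graph. This is only a minimal illustration under stated assumptions: the center of a part is taken to be the vertex of smallest eccentricity inside the part, ties are broken by vertex index, and the names `bfs` and `grow_bubbles` are hypothetical. It is not the implementation of [3].

```python
from collections import deque


def bfs(adj, sources, allowed=None):
    """Multi-source breadth-first traversal; returns (distance, owner) maps."""
    dist = {s: 0 for s in sources}
    owner = {s: s for s in sources}
    queue = deque(sources)
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist and (allowed is None or u in allowed):
                dist[u] = dist[v] + 1
                owner[u] = owner[v]
                queue.append(u)
    return dist, owner


def grow_bubbles(adj, seeds, max_rounds=20):
    """Grow parts from the seeds, then re-center them until seeds stop moving."""
    seeds = list(seeds)
    for _ in range(max_rounds):
        _, owner = bfs(adj, seeds)
        centers = []
        for s in seeds:
            part = {v for v, o in owner.items() if o == s}
            # Assumed center rule: smallest eccentricity within the part.
            centers.append(min(sorted(part),
                               key=lambda v: max(bfs(adj, [v], part)[0].values())))
        if centers == seeds:
            break
        seeds = centers
    return seeds, bfs(adj, seeds)[1]


# Path of eight vertices, with two badly placed initial seeds.
path8 = {v: [u for u in (v - 1, v + 1) if 0 <= u < 8] for v in range(8)}
final_seeds, owner = grow_bubbles(path8, [0, 1])
```

On this small path the seeds migrate towards the middle of each half. Nothing in the loop enforces equal part sizes in general, which is the load balancing drawback noted in the text, and every round traverses the whole graph several times.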
In [8], Meyerhenke and Schamberger further explore the bubble model, and devise a way to grow the bubbles by solving, possibly in parallel, systems of linear equations, instead of iteratively computing bubble centers. This method yields partitions of high quality too, but is very slow, even in parallel [9], and the load balancing problem is also not addressed, which requires resorting to a greedy load balancing algorithm afterwards.

In [13], Wan et al. explore a diffusive model, called the influence model, where vertices impact their neighbors by diffusing to them information on their current state. This model also does not handle load balancing properly.
3 Multi-level banded diffusion scheme
In spite of their better quality, all of the above diffusion schemes have two drawbacks: first, they do not naturally balance loads between parts and second, they are expensive as they involve all of the graph vertices. The method that we [...]
3.1 The jug of the Danaides
The diffusion scheme that we propose can apply to an arbitrary number of parts, but for the sake of clarity we will describe it in the context of graph bipartitioning, that is, with two parts only. We model the graph to bipartition in the following way, depicted in Figure 2. Nodes are represented as barrels of infinite capacity, which leak such that one unit of liquid at most drips per unit of time. When graph vertices are weighted, always with integer weights, the maximum quantity of liquid to be lost per unit of time is equal to the weight of the vertex. Graph edges are modeled by pipes of section equal to their weight. In both parts, a source vertex is chosen, to which a source pipe is connected, which flows in $|V|/2$ units of liquid per unit of time. Two sorts of liquids are in fact injected in the system: scotch in the first pipe, and anti-scotch in the second pipe, such that when some quantity of scotch mixes with the same quantity of anti-scotch, both vanish. To ease the writing of the algorithm in the bipartitioning case, scotch is represented by positive quantities and anti-scotch is represented by negative ones, so that mutual destruction naturally takes place when adding any two quantities of opposite signs.
The diffusion algorithm performs as outlined in Figure 3. For each time step, and for each vertex, the amount of liquid (whether scotch or anti-scotch) which remains after some has leaked is spread across the connecting pipes towards the neighboring barrels, according to their relative sections. This process could be iterated until convergence, but in fact it is only performed for a number of steps sufficient to achieve sign stability. Indeed, we are not interested in complete convergence, but in the stability of the signs of all content quantities borne by graph vertices, which indicate whether scotch or anti-scotch dominates in the barrels, that is, if some vertex belongs to part 0 or 1. Since $|V|$ units of both liquids are injected on the whole per unit of time, and since all of the barrels can leak the same overall amount in the same time, the system is bound to converge, all the more that liquid can disappear by collision of scotch and anti-scotch. As in the bubble schemes, what is expected is that a smooth front will be created between the two parts. The purpose of the algorithm [...] the cut. In fact, unlike all of the algorithms presented in the previous section, our method privileges load balancing over cut minimization. For this latter criterion, we rely on an additional feature of our scheme, as explained below.

    while (number of passes to do) {
      reset contents of new array to 0;
      old[s0] ← old[s0] − |V|/2;            /* Refill source barrels */
      old[s1] ← old[s1] + |V|/2;
      for (all vertices v in graph) {
        c ← old[v];                         /* Get contents of barrel */
        if (|c| > weight[v]) {              /* If not all contents have leaked */
          c ← c − weight[v] ∗ sign(c);      /* Compute what will remain */
          σ ← Σ_{e=(v,v′)} weight[e];       /* Sum weights of all adjacent edges */
          for (all edges e = (v, v′)) {     /* For all edges adjacent to v */
            f ← c ∗ weight[e] / σ;          /* Fraction to be spread to v′ */
            new[v′] ← new[v′] + f;          /* Accumulate spread contributions */
          }
        }
      }
      swap old and new arrays;
    }

Fig. 3. Sketch of the jug-of-the-Danaides diffusion algorithm. Scotch, represented as positive quantities, flows from the source of part 1, while anti-scotch, represented as negative quantities, flows from the source of part 0. For each step, the current and new contents of every vertex are stored in arrays old and new, respectively.
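The sketch of Figure 3 can be turned into a small runnable function. The version below is a direct transcription under a few assumptions that are not in the paper: the graph is stored as a dictionary of weighted adjacency dictionaries, the number of steps is a fixed parameter instead of a sign-stability test, and the function name `diffuse` is hypothetical.

```python
def diffuse(adj, vwgt, s0, s1, steps):
    """Jug-of-the-Danaides diffusion, transcribed from the Figure 3 sketch.

    adj:   {v: {u: edge_weight}} symmetric adjacency (assumed representation)
    vwgt:  {v: integer vertex weight}, the leak rate of each barrel
    s0,s1: source vertices of parts 0 (anti-scotch, < 0) and 1 (scotch, > 0)
    steps: fixed number of time steps (assumption; the paper stops on sign stability)
    """
    n = len(adj)                        # |V|
    old = {v: 0.0 for v in adj}
    for _ in range(steps):
        new = {v: 0.0 for v in adj}     # reset contents of new array
        old[s0] -= n / 2                # refill source barrels
        old[s1] += n / 2
        for v in adj:
            c = old[v]
            if abs(c) > vwgt[v]:        # not all contents have leaked
                c -= vwgt[v] * (1 if c > 0 else -1)
                sigma = sum(adj[v].values())
                for u, w in adj[v].items():
                    new[u] += c * w / sigma
        old = new                       # swap old and new arrays
    return old


# Path of six unit-weight vertices with the sources at both ends.
path6 = {v: {u: 1 for u in (v - 1, v + 1) if 0 <= u < 6} for v in range(6)}
contents = diffuse(path6, {v: 1 for v in path6}, s0=0, s1=5, steps=40)
parts = [1 if contents[v] > 0 else 0 for v in range(6)]
```

On this symmetric example the signs settle after two steps and split the path in the middle, even though the contents themselves keep changing slightly from one step to the next. This illustrates why sign stability, rather than full convergence, is the relevant stopping criterion.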
3.2 Band graphs in a multi-level scheme
Our diffusion algorithm, as such, presents two weaknesses: nothing is said about the selection of the seed vertices, and performing such iterations over all of the graph vertices is very expensive compared to local optimization algorithms which only consider vertices in the immediate vicinity of the frontiers.

To address these two problems concurrently, we use a method we have developed in [1], illustrated in Figure 4. It consists in using a multi-level scheme in which refinement algorithms are not applied to the full graphs but to band graphs that contain vertices that are at most at some small distance, typically 3, from the projected separator. In these band graphs, two additional anchor vertices represent all of the removed vertices of each part, and are connected to the last band layers of vertices of each of the parts. The vertex weight of the anchor vertices is equal to the sum of the vertex weights of all of the vertices they replace, to preserve the balance of the two band parts.
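A band graph of this kind can be sketched as follows. Several choices here are assumptions made for the example and are not specified in the text: the separator is taken to be the set of vertices having a neighbor in the other part, anchor edges receive weight 1, and the names `band_graph`, `anchor0` and `anchor1` are hypothetical.

```python
from collections import deque


def band_graph(adj, vwgt, part, width=3):
    """Extract the band of given width around the frontier, plus two anchors.

    adj:  {v: {u: edge_weight}}, vwgt: {v: weight}, part: {v: 0 or 1}
    """
    # Assumed separator: vertices with at least one neighbour in the other part.
    frontier = [v for v in adj if any(part[u] != part[v] for u in adj[v])]
    dist = {v: 0 for v in frontier}
    queue = deque(frontier)
    while queue:
        v = queue.popleft()
        if dist[v] == width:
            continue                    # do not grow beyond the last layer
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)

    band = set(dist)
    band_adj = {v: {u: w for u, w in adj[v].items() if u in band} for v in band}
    band_vwgt = {v: vwgt[v] for v in band}
    for p, anchor in ((0, "anchor0"), (1, "anchor1")):
        band_adj[anchor] = {}
        # The anchor carries the weight of all removed vertices of its part.
        band_vwgt[anchor] = sum(vwgt[v] for v in adj
                                if part[v] == p and v not in band)
        for v in band:
            if part[v] == p and dist[v] == width:   # last layer of part p
                band_adj[anchor][v] = 1             # assumed unit edge weight
                band_adj[v][anchor] = 1
    return band_adj, band_vwgt


# Path of twelve unit-weight vertices cut in the middle, band of width 2.
path12 = {v: {u: 1 for u in (v - 1, v + 1) if 0 <= u < 12} for v in range(12)}
halves = {v: 0 if v < 6 else 1 for v in path12}
band_adj, band_vwgt = band_graph(path12, {v: 1 for v in path12}, halves, width=2)
```

The total vertex weight of the band graph equals that of the full graph, so a balanced bipartition of the band graph corresponds to a balanced bipartition of the original graph.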
The underlying reasoning of this pre-constrained banding scheme is that since every refinement is classically performed by means of a local algorithm, which perturbs only in a limited way the position of the projected separator, local refinement algorithms need only to be passed a subgraph that contains the [...]

Fig. 4. [...] around the projected finer separator, with anchor vertices representing all of the removed vertices in each part. After some optimization algorithm (whether local or global) is applied, the refined band separator is projected back to the full graph, and the uncoarsening process goes on.

[...] when performing Fiduccia-Mattheyses refinement on band graphs that contain only vertices that are at distance at most 3 from the projected separators, the quality of the finest separator not only remains constant, but even significantly improves in most cases. Our interpretation is that this pre-constrained banding prevents local optimization algorithms from exploring and being trapped in local optima that would be too far from the global optimum sketched at the coarsest level of the multi-level process.
Such a banded scheme is ideal for using our diffusion scheme, as anchor vertices represent a natural choice to be taken as seed vertices. Indeed, the most important problem for bubble-growing algorithms is the determination of the seed vertices from which bubbles are grown, which requires expensive processes involving all of the graph vertices [3,8]. Since anchor vertices are connected to all of the vertices of the last layers, the diffused liquids flow as a front as if they originated from the farthest vertices from the frontier, which is indeed what would happen if they flowed from the center of a bubble having the frontier as its perimeter.
3.3 Parallelization
Our diffusion algorithm has the additional interest of being highly scalable. If we assume that full graphs, as well as band graphs, are distributed across processors such that every processor holds a fraction of the graph vertices along with their adjacency lists, like what is done for instance in PT-Scotch [2], the parallel version of Scotch, the parallel version of the algorithm is straightforward. Every processor performs its local update and computes the contributions it has to spread to distant neighbors, after which these contributions are sent to their destination processors in order to be aggregated. In order to overlap communication with computation, vertices that have distant neighbors can be processed first, then communications are started, and vertices with purely local adjacency lists [...]
Table 1. [...] except thread. |V| and |E| are the vertex and edge cardinalities, in thousands.

    Graph           |V| (×10^3)   |E| (×10^3)   Average degree
    altr4                    26           163            12.50
    audikw1                 944         38354            81.28
    auto                    449          3315            14.77
    bmw32                   227          5531            48.65
    body                     45           164             7.26
    bracket                  63           367            11.71
    conesphere1m           1055          8023            15.21
    ocean                   143           410             5.71
    oilpan                   74          1762            47.77
    pwt                      37           145             7.93
    thread                   30          2220           149.32
4 Experimental results
The diffusion algorithm discussed above has been implemented, as a sequential graph bipartitioning method, in version 5.0 of the Scotch [10] graph partitioning and static mapping software. Its k-way implementation is not yet available, because it requires more coding, including a k-way band extraction algorithm which does not exist to date. All of the necessary floating-point arithmetic has been implemented in single precision.
The tests were run on a Lenovo ThinkPad T60 laptop, with an Intel dual-core T2400 processor running at 1.8 GHz and 1 GB of memory. As we ran sequential tests only, the dual-core feature of the processor is not relevant. The test graphs we have used in our experiments are listed in Table 1. These graphs were partitioned into 2 to 128 parts, and the three quality metrics that we consider are the number of cut edges, called Cut, a load imbalance ratio equal to the size of the largest part divided by the average size, called MaCut, and the maximum diameter of the parts, referred to as MDi, which is an indirect metric of the shape of the partition, and is usable even in the case of graphs of unknown or nonexistent geometry. This latter metric is insufficient, as it does not really capture the smoothness of the interfaces, since irregularly shaped parts can still have small diameters; the best proof would have been to run an iterative solver and measure convergence rates based on the numbers of iterations. This work is in progress.
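The three metrics can be computed as in the following sketch. The graph representation and the function names are assumptions made for this example, and the diameter is obtained by brute-force breadth-first traversals confined to each part, which is only practical on small graphs; it is not how Scotch evaluates these quantities.

```python
from collections import deque


def cut_size(adj, part):
    """Total weight of edges whose ends lie in different parts (Cut)."""
    return sum(w for v in adj for u, w in adj[v].items()
               if v < u and part[v] != part[u])


def imbalance_ratio(vwgt, part, k):
    """Load of the largest part divided by the average part load."""
    loads = [0] * k
    for v, p in part.items():
        loads[p] += vwgt[v]
    return max(loads) / (sum(loads) / k)


def max_part_diameter(adj, part):
    """Largest eccentricity found when traversals stay inside one part (MDi)."""
    def eccentricity(src):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if u not in dist and part[u] == part[src]:
                    dist[u] = dist[v] + 1
                    queue.append(u)
        return max(dist.values())
    return max(eccentricity(v) for v in adj)


# Path of six unit-weight vertices split into parts of sizes 2 and 4.
path = {v: {u: 1 for u in (v - 1, v + 1) if 0 <= u < 6} for v in range(6)}
labels = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1, 5: 1}
cut = cut_size(path, labels)
ratio = imbalance_ratio({v: 1 for v in path}, labels, 2)
diameter = max_part_diameter(path, labels)
```

In this example one edge is cut, the largest part holds 4 vertices against an average of 3, and the largest part is a path of four vertices, hence a diameter of 3.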
Three diffusion heuristics were compared against the classical strategy implemented in Scotch 4.0, referred to as RMF in the following, which performs recursive bipartitioning with bipartitions computed in a multi-level way, using FM refinement.

The first method, RMBD, uses the same recursive bipartitioning and multi-level strategy, but banded diffusion is performed during the multi-level refinement steps. The results achieved with this method validate our approach: the obtained partitions have very smooth boundaries (see Figure 5.b), and are adequately balanced if the number of diffusion iterations is sufficiently high, as shown in Table 2.
Table 2. Evolution of the cut size (∆Cut), of the load imbalance ratio (∆MaCut) and of the maximum diameter of the parts (∆MDi) produced by various partitioning heuristics with respect to the RMF strategy, averaged over all test graphs and numbers of parts. Figures below partitioning strategy names indicate the number of diffusion steps performed.

    Method           RMBD                              RMBDF            RMBaDF
    Steps            500      200      100      40     500      40      40
    ∆Cut (%)      +19.51   +20.01   +18.15  +21.49   +2.26   +3.10   -3.17
    ∆MaCut (%)     +0.58    +1.12    +1.80   +9.76   -0.95   -0.29   -0.21
    ∆MDi (%)       +3.86    +1.92    +4.69   +5.43   +2.26   +3.10   -3.24
    ∆Time (×)      21.31     9.33     5.33    2.93   21.47    2.99    3.07

When performing 100 diffusion steps, the average MaCut value for RMBD is 1.046, only 1.80% higher than the one of RMF. However, the maximum diameter MDi is not significantly reduced, and is even increased on average by 4.69% with respect to RMF. This method is also 5.33 times slower than RMF and increases the cut by about 20%, which makes it of little practical use.

We have therefore experimented a second method, RMBDF, where the classical FM algorithm is applied to the band graph after the diffusion algorithm. The idea of this strategy is to benefit from the global optimization capabilities brought by the diffusion algorithm, while locally optimizing the frontier afterward. Even when performing 40 diffusion steps only, the smoothness of the boundaries is preserved and parts are more balanced, while the cut is only increased by 3.10% with respect to RMF. This strategy is also only three times slower than RMF, which is extremely fast for a diffusion-based algorithm.

In order to favor the minimization of diameters, we have modified our diffusion method so as to double at each step the amount of liquid borne by every vertex, in an avalanche-like process. This method is referred to as aD. It is no longer bound to converge, and indeed causes overflows for large numbers of diffusion steps, but gives good results for small numbers of iterations. As a matter of fact, we can see in Table 2 that the RMBaDF method is the most efficient one on average, and yields better results than the classical RMF method while still providing smooth boundaries, as evidenced in Figure 5.c.
For the sake of comparison, we compare in Table 3 some of our results against the ones obtained with K-MeTiS. K-MeTiS uses direct k-way partitioning instead of recursive bipartitioning, which usually makes it more efficient when the number of parts increases, and also much faster (from 10 to 20 times). As analyzed in [11], the performance of recursive bipartitioning methods tends to decrease when the number of parts increases, which should limit the efficiency of RMBDF methods for large numbers of parts. A full k-way diffusion algorithm [...]

Table 3. [...] of the parts (MDi), between three heuristics: multi-level with FM refinement (RMF, as implemented in Scotch 4.0), multi-level with banded diffusion and FM refinements (RMBaDF), and K-MeTiS.

    Test case                         Number of parts
                                 2       4       8      16      32      64     128
    altr4   RMF         Cut   1688    3197    4978    7788   11905   17656   24478
                        MDi     50      52      40      33      25      21      14
            RMBaD(40)F  Cut   1621    3203    5017    7776   11980   17669   24831
                        MDi     48      46      41      30      25      18      14
            K-MeTiS     Cut   1670    3233    4981    8115   12147   17355   24058
                        MDi     48      45      41      34      26      22      14
    bmw32   RMF         Cut  17271   54424   84222  120828  181844  267427  394418
                        MDi     93     116     130     106      74     120      68
            RMBaD(40)F  Cut  16032   54446   83422  124945  183454  275594  411154
                        MDi     91     130      96      84      68      63      56
            K-MeTiS     Cut  15529   55506   92658  125686  193169  286111  420965
                        MDi     87     108      99      87      70      61      68
5 Conclusion and future work
In this paper, we have presented a diffusion algorithm which, used in a multi-level banded framework, results in smoother partition frontiers and more compact parts. Used in our banded context, this algorithm is fast enough to be used on very large graphs, as it is only about three times slower than classical local optimization schemes. The 2-way sequential version has been integrated in version 5.0 of Scotch.

This algorithm is also easily parallelizable and highly scalable, which makes it a very good candidate for the realization of a fast and efficient parallel graph partitioner, taking advantage of the parallel multi-level and band graph extraction routines already developed in PT-Scotch in the context of sparse matrix reordering.

Even more than classical FM-like algorithms, this algorithm is constrained by the greedy nature of the recursive bipartitioning scheme, which prevents the global improvement of frontiers computed at previous stages. A full k-way version of the algorithm is therefore under development, which extends the 2-way model by considering $k$ different liquids having the same mutual annihilation properties, such that when $p$ different liquids are mixed in the same barrel, only the most abundant one remains. This behavior is equivalent to the one of our algorithm in the 2-way case. Using a native k-way scheme should also significantly reduce running times compared to recursive bipartitioning. A parallel version is also [...]

Fig. 5. Partition of graph altr4 into 8 parts using three different strategies. The segmented frontiers produced by FM-like algorithms are clearly evidenced in Figure a. RMBD produces the smoothest boundaries, as shown in Figure b. RMBaDF takes the best of both worlds, in Figure c.
References
1. C. Chevalier and F. Pellegrini. Improvement of the efficiency of genetic algorithms for scalable parallel graph partitioning in a multi-level framework. In Proc. Europar, pages 243-252, 2006. http://www.labri.fr/~pelegrin/papers/scotch_efficientga.pdf.
2. C. Chevalier and F. Pellegrini. PT-Scotch: A tool for efficient parallel graph ordering. Submitted to Parallel Computing, Dec 2006. http://www.labri.fr/~pelegrin/papers/scotch_parallelordering_parcomp.pdf.
3. R. Diekmann, R. Preis, F. Schlimbach, and C. Walshaw. Aspect ratio for mesh partitioning. In Proc. Europar'98, LNCS 1470, pages 347-351, 1998.
4. C. M. Fiduccia and R. M. Mattheyses. A linear-time heuristic for improving network partitions. In Proc. 19th Design Automat. Conf., pages 175-181. IEEE, 1982.
5. B. Hendrickson and R. Leland. A multilevel algorithm for partitioning graphs. In Proceedings of Supercomputing, 1995.
6. G. Karypis and V. Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. on Scientific Computing, 20(1):359-392, 1998.
7. B. W. Kernighan and S. Lin. An efficient heuristic procedure for partitioning graphs. BELL System Technical Journal, 49:291-307, Feb 1970.
8. H. Meyerhenke and S. Schamberger. Balancing parallel adaptive FEM computations by solving systems of linear equations. In Proc. Europar, pages 209-219, 2005.
9. H. Meyerhenke and S. Schamberger. A parallel shape optimizing load balancer. In Proc. Europar'2006, LNCS 4128, pages 232-242, 2006.
10. Scotch: Static mapping, graph partitioning, and sparse matrix block ordering package. http://www.labri.fr/~pelegrin/scotch/.
11. H. D. Simon and S.-H. Teng. How good is recursive bipartition. SIAM J. Scientific Computing, 18(5):1436-1445, Sep 1997.
12. R. Vanderstraeten, R. Keunings, and C. Farhat. Beyond conventional mesh partitioning algorithms. In SIAM Conf. on Par. Proc., pages 611-614, 1995.
13. Y. Wan, S. Roy, A. Saberi, and B. Lesieutre. A stochastic automaton-based algorithm for flexible and distributed network partitioning. In Proc. Swarm Intelligence Symposium, pages 273-280. IEEE, 2005.