HAL Id: hal-02499452
https://hal.archives-ouvertes.fr/hal-02499452
Submitted on 5 Mar 2020
Pierre Degond, Marina Ferreira, Sébastien Motsch
To cite this version:
Pierre Degond, Marina Ferreira, Sébastien Motsch. Damped Arrow-Hurwicz algorithm for sphere packing. Journal of Computational Physics, Elsevier, 2017, 332, pp.47-65. doi:10.1016/j.jcp.2016.11.047. hal-02499452
Damped Arrow–Hurwicz algorithm for sphere packing
Pierre Degond a, Marina A. Ferreira a, Sebastien Motsch b
a Department of Mathematics, South Kensington Campus, Imperial College London, SW7 2AZ London, United Kingdom
b School of Mathematical & Statistical Sciences, Arizona State University, Tempe, AZ 85287-1804, United States
Article info

Article history:
Received 16 May 2016
Received in revised form 28 October 2016
Accepted 30 November 2016
Available online 1 December 2016

Keywords:
Non-convex minimization problem
Sphere packing problem
Non-overlapping constraints

Abstract

We consider algorithms that, from an arbitrary sampling of N spheres (possibly overlapping), find a close packed configuration without overlapping. These problems can be formulated as minimization problems with non-convex constraints. For such packing problems, we observe that the classical iterative Arrow–Hurwicz algorithm does not converge. We derive a novel algorithm from a multi-step variant of the Arrow–Hurwicz scheme with damping. We compare this algorithm with classical algorithms belonging to the class of linearly constrained Lagrangian methods and show that it performs better. We provide an analysis of the convergence of these algorithms in the simple case of two spheres in one spatial dimension. Finally, we investigate the behaviour of our algorithm when the number of spheres is large in two and three spatial dimensions.

© 2016 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
1. Introduction
Particle packing problems can be encountered in many different systems, from the formation of planets or cells in live tissues to the dynamics of crowds of people. They have been widely investigated in the study of granular media [1], glasses [2] and liquids [3]. More recently, particle packings have proved to be important tools in biology [4] and social sciences, in particular in crowd dynamics [5].
Packing problems give rise to NP-hard non-convex optimization problems [6] and the optimal solution is in general not unique, since permutations, rotations or reflections may generate equivalent solutions. We refer the reader to [6] for a review on packing problems. In the literature, one can find numerical studies involving particles with various shapes such as ellipses [7] or even non-convex particles [8]. However, in the present work we assume that the particles are simply identical spheres with diameter $d$ in $\mathbb{R}^b$, $b = 1, 2, 3$, but the methodology is general and will be extended to other cases in future work. We consider algorithms that, given an initial configuration of $N$ spheres (possibly overlapping), find a nearby packed configuration without overlapping. Indeed, in many natural systems individuals or particles only seek to achieve a locally optimal solution. Therefore, it is more likely that they reach a local configuration that does not necessarily correspond to a global optimum. By combining our method with, for example, simulated annealing techniques [9], we could convert our algorithms into global minimum search algorithms. This is, however, not the objective we pursue here.

Classical procedures to solve non-convex minimization problems include Uzawa–Arrow–Hurwicz type algorithms [10], augmented Lagrangian [11,12], linearly constrained Lagrangian (LCL) and sequential quadratic programming (SQP) [13], among others. The SQP and the Uzawa–Arrow–Hurwicz algorithms are widely used. However, they require the Hessian matrix of the function to be minimized to be positive definite, which is not always the case in this type of problem (see the example presented in section 3). In general, all these methods perform well with a small number of particles. However, we are interested in the case where this number becomes large.

E-mail address: pierre.degond@gmail.com (P. Degond).
In [14,15] the authors study the shape of three-dimensional clusters of atoms under the effect of soft potentials by using molecular dynamics. This approach differs from ours with regard to the non-overlapping constraints, which are approximated by soft potentials, producing soft dynamics. Although it is more costly when dealing with a large number of particles, we have opted for the hard dynamics approach, since it allows for a higher precision in the treatment of the constraints. This proves effective when dealing with interaction between rigid bodies, where the effect of the rigid boundaries plays an important role. This motivates the present work.
We start in section 2.1 by presenting two formulations of the problem. The first one is the classical minimization approach. The second one considers a constrained dynamical system in the spirit of [16]. We also present two equivalent types of non-overlapping constraints involving smooth or non-smooth functions which are found in the literature [6,17]. To solve the non-convex minimization problems arising in these formulations, we first consider in section 2.2 the classical Arrow–Hurwicz algorithm (AHA). In section 2.3 we introduce a novel multi-step scheme based on a second-order ODE interpretation of the minimization problem: the damped Arrow–Hurwicz algorithm (DAHA). We test the DAHA against two methods taken from the widely known class of linearly constrained Lagrangian algorithms [13,16]. These algorithms consist of a sequence of convex minimization problems; we therefore refer to them as nested algorithms (NA), and the two variants are referred to as the NAP and the NAV. The convergence of the four algorithms (the AHA, DAHA, NAP and NAV) is analyzed in section 3 and Appendix A for the case of two spheres in one dimension. In section 4 the algorithms are numerically compared for the case of many spheres in two dimensions. A brief numerical study of the packing density in two and three dimensions is also presented. Finally, conclusions and future work are presented in section 5. We also refer to [18] for a detailed analysis of the minimization problem. In particular, we prove that minimizers are not saddle points of the Lagrangian. This analysis requires developments that are beyond the scope of this paper.
2. The damped Arrow–Hurwicz algorithm (DAHA)

2.1. Minimization problems for sphere packing
We first recall two different formulations of generic minimization problems. Let $N$ and $b$ be two given positive integers. We consider first the problem of finding a configuration $\bar X$ such that
$$\bar X \in \operatorname*{argmin}_{\phi_{k\ell}(X)\le 0,\ k,\ell=1,\dots,N,\ k<\ell} W(X), \tag{2.1}$$
where $W:\mathbb{R}^{bN}\to\mathbb{R}$ is a convex function (not necessarily strictly convex). The functions $\phi_{k\ell}:\mathbb{R}^{bN}\to\mathbb{R}$, $k,\ell=1,\dots,N$, $k<\ell$, are continuous but not necessarily convex. We suppose that $W$ has a minimum in the set of admissible solutions $\{X\in\mathbb{R}^{bN} \mid \phi_{k\ell}(X)\le 0\}$. Under these conditions, $\bar X$ exists but may not be unique. We also assume that $\phi_{k\ell}$, $k,\ell=1,\dots,N$, $k<\ell$, and $W$ are $C^1$ functions in the neighbourhood of $\bar X$.

In what follows, $d$ will denote the diameter of a sphere, $N$ the number of spheres, $b$ the spatial dimension, $X$ the positions of the centers of the spheres and $\phi_{k\ell}$ the non-overlapping constraint functions between the $k$th and $\ell$th spheres. The non-overlapping constraints for a system of identical spheres in $\mathbb{R}^b$ can be expressed by means of a smooth or a non-smooth function as specified below. Although leading to equivalent constraints, each form has an impact on the convergence of the numerical method towards a local minimizer, as we will see in sections 3 and 4.

Definition 2.1. We call non-smooth form of the constraint functions (NS) the following function
$$\phi_{k\ell}(X) = d - |X_k - X_\ell|, \qquad k,\ell=1,\dots,N,\ k\neq\ell,$$
and smooth form of the constraint functions (S) the following function
$$\phi_{k\ell}(X) = d^2 - |X_k - X_\ell|^2, \qquad k,\ell=1,\dots,N,\ k\neq\ell.$$
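The two forms of Definition 2.1 can be evaluated for all pairs at once. The following is a minimal sketch, assuming the configuration is stored as an `(N, b)` NumPy array; the function names `phi_ns` and `phi_s` are illustrative and not taken from the paper.

```python
import numpy as np

def phi_ns(X, d):
    """Non-smooth form: phi_kl = d - |X_k - X_l| for all pairs k < l."""
    k, l = np.triu_indices(X.shape[0], k=1)
    return d - np.linalg.norm(X[k] - X[l], axis=1)

def phi_s(X, d):
    """Smooth form: phi_kl = d^2 - |X_k - X_l|^2 for all pairs k < l."""
    k, l = np.triu_indices(X.shape[0], k=1)
    return d**2 - np.sum((X[k] - X[l])**2, axis=1)

# Two unit-diameter spheres in the plane whose centres are 0.5 apart.
X = np.array([[0.0, 0.0], [0.5, 0.0]])
print(phi_ns(X, 1.0))  # positive entry: the pair overlaps
print(phi_s(X, 1.0))   # same sign as the non-smooth form
```

Both forms are positive exactly when the pair overlaps, which is why they define the same admissible set even though their gradients differ.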
An illustration of the non-overlapping constraints, as well as a possible solution for $N = 7$, is presented in Fig. 1.

Fig. 1. Representation of the non-overlapping constraints (a) and a possible optimal solution of (2.1) for $N = 7$ (b).

We now present a second formulation, which consists in solving a minimization problem associated with a discrete dynamical system which has $\bar X$ as a fixed point. Let $|\cdot|$ denote the Euclidean norm on $\mathbb{R}^b$. The problem is formulated iteratively: given an initial configuration $X^0 = \{X^0_i\}_{i=1,\dots,N}$, we pass from iterate $X^p$ to iterate $X^{p+1}$ as follows
$$\begin{cases}
X^{p+1} = X^p + \tau V^{p+1}, & \text{(a)}\\[4pt]
V^{p+1} \in \operatorname*{argmin}\limits_{\phi_{k\ell}(X^p+\tau V)\le 0,\ k,\ell=1,\dots,N,\ k<\ell}\ \dfrac12\sum\limits_{i=1}^{N} |V_i + \nabla_{X_i}W(X^p)|^2, & \text{(b)}
\end{cases} \tag{2.2}$$
where $\tau>0$ is a given parameter and $V=\{V_i\}_{i=1,\dots,N}$. We define $\tilde X$ as a fixed point of this problem. Consequently, $\tilde X$ satisfies
$$0 \in \operatorname*{argmin}_{\phi_{k\ell}(\tilde X+\tau V)\le 0,\ k,\ell=1,\dots,N,\ k<\ell}\ \frac12\sum_{i=1}^{N} |V_i + \nabla_{X_i}W(\tilde X)|^2. \tag{2.3}$$
Note that the minima of (2.2)(b) exist, but may not be unique.
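The minimization problem (2.1) can be formulated in terms of the Lagrangian $\mathcal{L}:\mathbb{R}^{bN}\times(\mathbb{R}^+_0)^{N(N-1)/2}\to\mathbb{R}$ defined by
$$\mathcal{L}(X,\lambda) = W(X) + \sum_{k,\ell\in\{1,\dots,N\},\,k<\ell}\lambda_{k\ell}\,\phi_{k\ell}(X),$$
where $\lambda=\{\lambda_{k\ell}\}_{k,\ell=1,\dots,N,\,k<\ell}$ represents the set of Lagrange multipliers. If $\bar X$ is a solution of the minimization problem (2.1), then the Abadie constraint qualification (ACQ) [19] holds at $\bar X$ and, consequently, there exists $\bar\lambda\in(\mathbb{R}^+_0)^{N(N-1)/2}$ such that $(\bar X,\bar\lambda)$ is a critical point of the Lagrangian, namely, $(\bar X,\bar\lambda)$ satisfies the KKT conditions [20,21]:
$$\begin{cases}
\nabla_{X_i}\mathcal{L}(\bar X,\bar\lambda) = 0, \quad i=1,\dots,N,\\[4pt]
\big(\nabla_{\lambda_{k\ell}}\mathcal{L}(\bar X,\bar\lambda) = 0 \text{ and } \bar\lambda_{k\ell}\ge 0\big) \ \text{ or } \ \big(\nabla_{\lambda_{k\ell}}\mathcal{L}(\bar X,\bar\lambda) < 0 \text{ and } \bar\lambda_{k\ell} = 0\big), \quad k,\ell=1,\dots,N,\ k<\ell,
\end{cases}$$
which is equivalent to
$$\begin{cases}
\nabla_{X_i}W(\bar X) + \sum\limits_{k,\ell\in\{1,\dots,N\},\,k<\ell}\bar\lambda_{k\ell}\,\nabla_{X_i}\phi_{k\ell}(\bar X) = 0, \quad i=1,\dots,N,\\[4pt]
\big(\phi_{k\ell}(\bar X) = 0 \text{ and } \bar\lambda_{k\ell}\ge 0\big) \ \text{ or } \ \big(\phi_{k\ell}(\bar X) < 0 \text{ and } \bar\lambda_{k\ell} = 0\big), \quad k,\ell=1,\dots,N,\ k<\ell.
\end{cases} \tag{2.4}$$
We have reduced our original problem (2.1) to a critical-point system, with a possible enlargement of the set of solutions. Contrary to convex optimization, in the case of packing problems these critical points may not be saddle points. In reference [18] we provide a detailed analysis of this point, which requires new technical developments that go beyond the scope of the present paper.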
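We also formulate the minimization problem (2.2)(b) in terms of a Lagrangian $\mathcal{L}_p$,
$$\mathcal{L}_p(V,\mu) = \frac12\sum_{i=1}^{N}|V_i + \nabla_{X_i}W(X^p)|^2 + \sum_{k,\ell\in\{1,\dots,N\},\,k<\ell}\mu_{k\ell}\,\phi_{k\ell}(X^p+\tau V),$$
where $\mu=\{\mu_{k\ell}\}_{k,\ell=1,\dots,N,\,k<\ell}$ is the set of Lagrange multipliers associated with the constraints. The gradients of the Lagrangian are given by
$$\nabla_{V_i}\mathcal{L}_p(V,\mu) = V_i + \nabla_{X_i}W(X^p) + \tau\sum_{k,\ell\in\{1,\dots,N\},\,k<\ell}\mu_{k\ell}\,\nabla_{X_i}\phi_{k\ell}(X^p+\tau V), \quad i=1,\dots,N,$$
$$\nabla_{\mu_{k\ell}}\mathcal{L}_p(V,\mu) = \phi_{k\ell}(X^p+\tau V), \quad k,\ell=1,\dots,N.$$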
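The dynamical system is written: $\tilde X^{p+1} = \tilde X^p + \tau\tilde V^{p+1}$ such that $(\tilde V^{p+1},\tilde\mu^{p+1})$ is a solution of the critical-point problem
$$\begin{cases}
\tilde V_i^{p+1} + \nabla_{X_i}W(\tilde X^p) + \tau\sum\limits_{k,\ell\in\{1,\dots,N\},\,k<\ell}\tilde\mu_{k\ell}^{p+1}\,\nabla_{X_i}\phi_{k\ell}(\tilde X^p+\tau\tilde V^{p+1}) = 0, \quad i=1,\dots,N,\\[4pt]
\big(\phi_{k\ell}(\tilde X^p+\tau\tilde V^{p+1}) = 0 \text{ and } \tilde\mu_{k\ell}^{p+1}\ge 0\big) \ \text{ or } \ \big(\phi_{k\ell}(\tilde X^p+\tau\tilde V^{p+1}) < 0 \text{ and } \tilde\mu_{k\ell}^{p+1} = 0\big), \quad k,\ell=1,\dots,N,\ k<\ell.
\end{cases} \tag{2.5}$$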
Likewise, the fixed point $\tilde X$ of the dynamical system is defined such that there exists $\tilde\mu$ such that
$$\begin{cases}
\nabla_{X_i}W(\tilde X) + \tau\sum\limits_{k,\ell\in\{1,\dots,N\},\,k<\ell}\tilde\mu_{k\ell}\,\nabla_{X_i}\phi_{k\ell}(\tilde X) = 0, \quad i=1,\dots,N,\\[4pt]
\big(\phi_{k\ell}(\tilde X) = 0 \text{ and } \tilde\mu_{k\ell}\ge 0\big) \ \text{ or } \ \big(\phi_{k\ell}(\tilde X) < 0 \text{ and } \tilde\mu_{k\ell} = 0\big), \quad k,\ell=1,\dots,N,\ k<\ell.
\end{cases} \tag{2.6}$$
Then, it is clear that problems (2.4) and (2.6) are equivalent for all values of $\tau>0$ by setting $\bar\lambda = \tau\tilde\mu$. However, the choice of $\tau$ is important to ensure convergence of the dynamical system (2.5) to the fixed point.

As will become clear below, all functions $W$ and $\phi_{k\ell}$ used throughout the paper satisfy the conditions considered in this section. The nonlinear systems (2.4) or (2.5) will have to be solved by an iterative algorithm. We now present the algorithms considered in the paper.

2.2. The Arrow–Hurwicz algorithm (AHA)
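The classical Arrow–Hurwicz iterative algorithm [10] searches for a saddle point of the Lagrangian by alternating steps in the direction of $-\nabla_X\mathcal{L}$ and $+\nabla_\lambda\mathcal{L}$. Using this idea, a saddle point is then a steady-state solution of the Arrow–Hurwicz system of ODEs (AHS), which is defined next.

Definition 2.2. The Arrow–Hurwicz system (AHS) is defined by
$$\begin{cases}
\dot X_i = -\alpha\Big(\nabla_{X_i}W(X) + \sum\limits_{k,\ell\in\{1,\dots,N\},\,k<\ell}\lambda_{k\ell}\,\nabla_{X_i}\phi_{k\ell}(X)\Big), \quad i=1,\dots,N, & \text{(a)}\\[6pt]
\dot\lambda_{k\ell} = \begin{cases} 0, & \text{if } \lambda_{k\ell}=0 \text{ and } \phi_{k\ell}(X)<0,\\ \beta\,\phi_{k\ell}(X), & \text{otherwise}, \end{cases} \quad k,\ell=1,\dots,N,\ k<\ell, & \text{(b)}
\end{cases} \tag{2.7}$$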
where $\alpha$ and $\beta$ are positive constants. Considering a small time-step $\Delta t$, a semi-implicit Euler discretization scheme of the previous system leads to the Arrow–Hurwicz algorithm (AHA), which is defined iteratively by
$$\begin{cases}
X_i^{n+1} = X_i^n - \alpha\Big[\nabla_{X_i}W(X^n) + \sum\limits_{k,\ell\in\{1,\dots,N\},\,k<\ell}\lambda_{k\ell}^n\,\nabla_{X_i}\phi_{k\ell}(X^n)\Big], \quad i=1,\dots,N,\\[6pt]
\lambda_{k\ell}^{n+1} = \max\{0,\ \lambda_{k\ell}^n + \beta\,\phi_{k\ell}(X^{n+1})\}, \quad k,\ell=1,\dots,N,\ k<\ell,
\end{cases} \tag{2.8}$$
where $\alpha$ and $\beta$ now correspond to $\tilde\alpha = \alpha\Delta t$ and $\tilde\beta = \beta\Delta t$ and the tildes have been dropped for simplicity.

The original AHA was formulated using a fully explicit Euler scheme, but it has proved more accurate to use a semi-implicit scheme. Finding a local steady-state solution of (2.7)(a)–(2.7)(b) in the case of a packing problem is not always possible, because it often happens that no critical point is a saddle point [18]. This manifests itself by the existence of periodic solutions of the AHS which do not converge to the critical point. In order to overcome this difficulty we propose the damped Arrow–Hurwicz algorithm, which is presented next. This method is based on a modification of the dynamics of the AHS that transforms an unstable critical point into an asymptotically stable one. The performance of our method will be tested by comparing it with previous approaches [13,16], which are based on a modification of the Lagrangian by linearly approximating the constraints. These approaches are presented in section 2.4.
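One iteration of the AHA (2.8) can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: it uses the smooth constraints of Definition 2.1, takes for $W$ the quadratic potential (3.1) introduced later in the paper (whose gradient is $X_i$ minus the centre of mass), and stores positions as an `(N, b)` array. All function names are hypothetical.

```python
import numpy as np

def grad_W(X):
    # Gradient of W(X) = 1/(2N) * sum_{i<j} |X_i - X_j|^2 (assumed potential).
    return X - X.mean(axis=0)

def pairs(X):
    # Index arrays of all pairs k < l and the differences X_k - X_l.
    k, l = np.triu_indices(X.shape[0], k=1)
    return k, l, X[k] - X[l]

def constraint_force(X, w):
    # sum_{k<l} w_kl * grad_{X_i} phi_kl, with grad_{X_k} phi_kl = -2 (X_k - X_l).
    k, l, diff = pairs(X)
    F = np.zeros_like(X)
    np.add.at(F, k, -2.0 * w[:, None] * diff)
    np.add.at(F, l, 2.0 * w[:, None] * diff)
    return F

def aha_step(X, lam, d, alpha, beta):
    # Explicit update of the positions with the current multipliers.
    X_new = X - alpha * (grad_W(X) + constraint_force(X, lam))
    # Multiplier update uses the new positions (semi-implicit), then projects on lam >= 0.
    _, _, diff_new = pairs(X_new)
    phi_new = d**2 - np.sum(diff_new**2, axis=1)
    lam_new = np.maximum(0.0, lam + beta * phi_new)
    return X_new, lam_new
```

The multiplier array `lam` has one entry per pair, ordered as returned by `np.triu_indices`. A pair with a positive multiplier is pushed apart, while the potential pulls every sphere towards the centre of mass.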
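2.3. The damped Arrow–Hurwicz algorithm

In order to avoid periodic solutions we will add a damping term as described below. Note that we are not interested in the transient dynamics of the system, but rather in its asymptotic behaviour.

We propose the following definition.

Definition 2.3. We define the damped Arrow–Hurwicz system (DAHS) as
$$\begin{cases}
\ddot X_i = -\alpha^2\Big[\nabla_{X_i}W(X) + \sum\limits_{k,\ell\in\{1,\dots,N\},\,k<\ell}\lambda_{k\ell}\,\nabla_{X_i}\phi_{k\ell}(X)\Big] - \alpha\beta\sum\limits_{k,\ell\in\{1,\dots,N\},\,k<\ell}\phi_{k\ell}(X)\,\lambda_{k\ell}\,\nabla_{X_i}\phi_{k\ell}(X) - c\,\dot X_i, \quad i=1,\dots,N, & \text{(a)}\\[6pt]
\dot\lambda_{k\ell} = \begin{cases} 0, & \text{if } \lambda_{k\ell}=0 \text{ and } \phi_{k\ell}(X)<0,\\ \beta\,\phi_{k\ell}(X), & \text{otherwise}, \end{cases} \quad k,\ell=1,\dots,N,\ k<\ell, & \text{(b)}
\end{cases} \tag{2.9}$$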
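where $\alpha$, $\beta$ and $c$ are positive constants, and the damped Arrow–Hurwicz algorithm (DAHA) as the corresponding semi-implicit discrete scheme:
$$\begin{cases}
X_i^{n+1} = \dfrac{1}{1+c/2}\Big(2X_i^n - (1-c/2)\,X_i^{n-1}\Big) - \dfrac{\alpha^2}{1+c/2}\Big[\nabla_{X_i}W(X^n) + \sum\limits_{k,\ell\in\{1,\dots,N\},\,k<\ell}\lambda_{k\ell}^n\,\nabla_{X_i}\phi_{k\ell}(X^n)\Big]\\[6pt]
\qquad\qquad - \dfrac{\alpha\beta}{1+c/2}\sum\limits_{k,\ell\in\{1,\dots,N\},\,k<\ell}\phi_{k\ell}(X^n)\,\lambda_{k\ell}^n\,\nabla_{X_i}\phi_{k\ell}(X^n), \quad i=1,\dots,N, & \text{(a)}\\[6pt]
\lambda_{k\ell}^{n+1} = \max\{0,\ \lambda_{k\ell}^n + \beta\,\phi_{k\ell}(X^{n+1})\}, \quad k,\ell=1,\dots,N,\ k<\ell, & \text{(b)}
\end{cases} \tag{2.10}$$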
where $\alpha$, $\beta$ and $c$ now correspond to numerical parameters. Note that the DAHA is a multi-step scheme, since not only one, but two previous configurations $X^{n-1}$ and $X^n$ are used to obtain $X^{n+1}$. By setting $c=2$, the method is reduced to a one-step method.

A second-order ODE system with damping has previously been proposed within the scope of convex programming [22,23]. Besides covering the non-convex case, our approach differs from this with regard to the extra $\alpha\beta$ term in equation (2.9)(a).

In the following we present the derivation of the DAHS. We start by considering the AHS (2.7)(a)–(2.7)(b) presented in the previous section. We then take the second-order version of (2.7)(a). For each $i=1,\dots,N$ we have
$$\ddot X_i = -\alpha\sum_{m=1}^{N}\nabla_{X_m}\Big(\nabla_{X_i}W(X) + \sum_{k,\ell\in\{1,\dots,N\},\,k<\ell}\lambda_{k\ell}\,\nabla_{X_i}\phi_{k\ell}(X)\Big)\dot X_m - \alpha\sum_{k,\ell\in\{1,\dots,N\},\,k<\ell}\dot\lambda_{k\ell}\,\nabla_{X_i}\phi_{k\ell}(X). \tag{2.11}$$
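Using (2.7)(b), we can replace $\dot\lambda_{k\ell}$ in (2.11) by $\beta\,\phi_{k\ell}(X)\,H(\lambda_{k\ell})$, where $H$ is the Heaviside function. Moreover, in order to keep the same steady states as the AHS, we replace $H(\lambda_{k\ell})$ by $\lambda_{k\ell}$, as at equilibrium $\lambda_{k\ell}\phi_{k\ell}=0$. Note that other choices could be made, such as a power of $\lambda_{k\ell}$ for instance, which would influence the speed of convergence of the algorithm. However, we do not explore this aspect further here. We get
\begin{align}
\ddot X_i = {} & -\alpha\sum_{m=1}^{N}\nabla_{X_m}\Big(\nabla_{X_i}W(X) + \sum_{k,\ell\in\{1,\dots,N\},\,k<\ell}\lambda_{k\ell}\,\nabla_{X_i}\phi_{k\ell}(X)\Big)\dot X_m \tag{2.12}\\
& -\alpha\beta\sum_{k,\ell\in\{1,\dots,N\},\,k<\ell}\phi_{k\ell}(X)\,\lambda_{k\ell}\,\nabla_{X_i}\phi_{k\ell}(X). \tag{2.13}
\end{align}
It turns out that passing to the second order introduces exponentially growing modes (see Remark 2.1).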
Remark 2.1. Consider the simple ODE $\dot u = -\alpha u$ whose solution is $u(t) = u_0e^{-\alpha t}$, where $u_0$ is the initial configuration. Differentiating both sides of the equation and substituting $\dot u$ by $-\alpha u$ yields $\ddot u = \alpha^2u$, whose solution now includes an exponentially growing mode: $u(t) = c_1e^{-\alpha t} + c_2e^{\alpha t}$, where $c_1$ and $c_2$ are real constants determined by the initial configurations.

In order to remove these modes, we replace the term in (2.12) by a simple second-order dynamics in the force field given by the right-hand side of (2.7)(a). We get:
$$\ddot X_i = -\alpha^2\Big[\nabla_{X_i}W(X) + \sum_{k,\ell\in\{1,\dots,N\},\,k<\ell}\lambda_{k\ell}\,\nabla_{X_i}\phi_{k\ell}(X)\Big] - \alpha\beta\sum_{k,\ell\in\{1,\dots,N\},\,k<\ell}\phi_{k\ell}(X)\,\lambda_{k\ell}\,\nabla_{X_i}\phi_{k\ell}(X). \tag{2.14}$$
Now, we just add a velocity damping term of the form $-c\dot X_i$ and we finally obtain (2.9)(a). We end up with the system (2.9)(a)–(2.9)(b).

Remark 2.2. We can interpret the first term on the right-hand side of (2.14) as a second-order dynamics version of (2.7)(a). Denoting by $T_1$ and $T_2$ the terms in (2.14) which are multiplied by $-\alpha^2$ and $-\alpha\beta$, respectively, we recover (2.7)(a) in an over-damped limit $\varepsilon\ddot X_i + \varepsilon\alpha\beta T_2 = -\alpha^2T_1 - c\dot X_i$, with $\varepsilon\to0$ and $c=1$.

Proposition 2.4. The AHS (2.7)(a)–(2.7)(b) and the DAHS (2.9)(a)–(2.9)(b) have the same equilibrium solutions.

Proof. If $(\lambda^*,X^*)$ is a steady state of the AHS, then for each pair either $\phi_{k\ell}(X^*)=0$ or $\lambda^*_{k\ell}=0$. Consequently, $\lambda^*_{k\ell}\,\phi_{k\ell}(X^*)=0$, which implies that the second sum in equation (2.9)(a) vanishes and $\ddot X^*=0$. Using a similar argument we conclude that a steady state of the DAHS is also a steady state of the AHS. □
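The two-step update (2.10) can be sketched in the same spirit. The code below is a minimal illustration under assumptions, not the authors' implementation: smooth constraints, the quadratic potential (3.1) for $W$, positions stored as an `(N, b)` array, and a hypothetical function name `daha_step`.

```python
import numpy as np

def daha_step(X, X_prev, lam, d, alpha, beta, c):
    """One iteration of (2.10) with phi_kl = d^2 - |X_k - X_l|^2."""
    k, l = np.triu_indices(X.shape[0], k=1)
    diff = X[k] - X[l]
    phi = d**2 - np.sum(diff**2, axis=1)

    def pair_sum(w):
        # sum_{k<l} w_kl * grad_{X_i} phi_kl, with grad_{X_k} phi_kl = -2 (X_k - X_l)
        F = np.zeros_like(X)
        np.add.at(F, k, -2.0 * w[:, None] * diff)
        np.add.at(F, l, 2.0 * w[:, None] * diff)
        return F

    grad_W = X - X.mean(axis=0)      # gradient of the assumed potential (3.1)
    T1 = grad_W + pair_sum(lam)      # bracket multiplied by alpha^2 in (2.10)(a)
    T2 = pair_sum(phi * lam)         # sum multiplied by alpha*beta in (2.10)(a)
    X_new = (2.0 * X - (1.0 - c / 2.0) * X_prev
             - alpha**2 * T1 - alpha * beta * T2) / (1.0 + c / 2.0)

    # Multiplier update (2.10)(b) evaluated at the new positions.
    diff_new = X_new[k] - X_new[l]
    phi_new = d**2 - np.sum(diff_new**2, axis=1)
    lam_new = np.maximum(0.0, lam + beta * phi_new)
    return X_new, lam_new
```

With `c = 2` the coefficient of `X_prev` is zero, so the scheme reduces to a one-step method, as noted in the text.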
2.4. Previous approaches

A common approach to solve the generic minimization problems (2.1) and (2.2)(a)–(2.2)(b) is based on the linearization of the constraint functions $\phi_{k\ell}$ around a certain configuration $X^p$, which we denote by $\phi^p_{k\ell}(X)$, i.e.,
$$\phi^p_{k\ell}(X) = \phi_{k\ell}(X^p) + \nabla_X\phi_{k\ell}(X^p)\cdot(X - X^p). \tag{2.15}$$
The solution $X^{p+1}$ of the resulting linearly constrained optimization problem is used to improve the linearization of the constraint functions and this process is iterated until convergence. Note that this transformation turns the non-convex minimization problems (2.1) and (2.2)(a)–(2.2)(b) into a sequence of convex problems, for which there are many tools available [24]. We have chosen the Arrow–Hurwicz algorithm; however, any other method for convex optimization problems would suit our purpose.

This method belongs to the class of linearly constrained Lagrangian (LCL) methods [13] which have been used for large constrained optimization problems.
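The linearization (2.15) can be illustrated for a single pair. The sketch below assumes the smooth form of the constraint and an `(N, b)` position array; the helper names are hypothetical. Since the smooth form is concave in $X$, its tangent approximation lies above it, so a configuration that satisfies the linearized constraint also satisfies the original one.

```python
import numpy as np

def phi_s(X, d, k, l):
    # Smooth constraint for the pair (k, l).
    return d**2 - np.sum((X[k] - X[l])**2)

def phi_s_linearized(X, Xp, d, k, l):
    """phi_kl(Xp) + grad phi_kl(Xp) . (X - Xp) for the smooth form."""
    diff_p = Xp[k] - Xp[l]
    # grad phi_kl is nonzero only on spheres k and l: -2 diff_p and +2 diff_p.
    return (phi_s(Xp, d, k, l)
            - 2.0 * diff_p @ (X[k] - Xp[k])
            + 2.0 * diff_p @ (X[l] - Xp[l]))

Xp = np.array([[0.0, 0.0], [0.5, 0.0]])    # linearization point
X = np.array([[-0.2, 0.1], [0.9, 0.0]])    # trial configuration
print(phi_s_linearized(X, Xp, 1.0, 0, 1), phi_s(X, 1.0, 0, 1))
```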
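2.4.1. The nested algorithm for the positions (NAP)

Consider the system (2.4) with linearized constraint functions. We propose the following definition.

Definition 2.5 (Nested Algorithm for the Positions (NAP)). Let $(X^p,\lambda^p)$ be given. Define $X^{p,0}=X^p$, $\lambda^{p,0}=\lambda^p$ and $\phi^p_{k\ell}$ as in (2.15). For a given $(X^{p,n},\lambda^{p,n})$, let the step of the inner loop be defined as
$$\begin{cases}
X_i^{p,n+1} = X_i^{p,n} - \alpha\Big[\nabla_{X_i}W(X^{p,n}) + \sum\limits_{k,\ell\in\{1,\dots,N\},\,k<\ell}\lambda_{k\ell}^{p,n}\,\nabla_{X_i}\phi^p_{k\ell}(X^{p,n})\Big], \quad i=1,\dots,N, & \text{(a)}\\[6pt]
\lambda_{k\ell}^{p,n+1} = \max\big\{0,\ \lambda_{k\ell}^{p,n} + \beta\,\phi^p_{k\ell}(X^{p,n+1})\big\}, \quad k,\ell=1,\dots,N,\ k<\ell, & \text{(b)}
\end{cases} \tag{2.16}$$
then $(X^{p+1},\lambda^{p+1}) = \lim_{n\to\infty}(X^{p,n},\lambda^{p,n})$.

If we only compute one step of the inner loop per iteration of the outer loop, we get a variant of the AHA formulation, where $\phi_{k\ell}(X^{p+1})$ is replaced by $\phi^p_{k\ell}(X^{p+1})$ in (2.16)(b).

2.4.2. The nested algorithm for the velocities (NAV)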
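We consider the minimization problem (2.5) with linearized constraint functions.

Definition 2.6 (Nested Algorithm for the Velocities (NAV)). Let $\tau>0$ and $(X^p,V^p,\mu^p)$ be given. Define $V^{p,0}=V^p$, $\mu^{p,0}=\mu^p$ and $\phi^p_{k\ell}$ as in (2.15). For a given $(V^{p,n},\mu^{p,n})$, let the step of the inner loop be defined as
$$\begin{cases}
V_i^{p,n+1} = V_i^{p,n} - \alpha\Big(V_i^{p,n} + \nabla_{X_i}W(X^p) + \tau\sum\limits_{k,\ell\in\{1,\dots,N\},\,k<\ell}\mu_{k\ell}^{p,n}\,\nabla_{X_i}\phi^p_{k\ell}(X^p+\tau V^{p,n})\Big), \quad i=1,\dots,N, & \text{(a)}\\[6pt]
\mu_{k\ell}^{p,n+1} = \max\big\{0,\ \mu_{k\ell}^{p,n} + \beta\,\phi^p_{k\ell}(X^p+\tau V^{p,n+1})\big\}, \quad k,\ell=1,\dots,N, & \text{(b)}
\end{cases} \tag{2.17}$$
then $(V^{p+1},\mu^{p+1}) = \lim_{n\to\infty}(V^{p,n},\mu^{p,n})$ and $X^{p+1} = X^p + \tau V^{p+1}$.

The NAV corresponds to an adaptation of the method developed by Maury in [16].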
3. Linear analysis

3.1. Preliminaries

Under the assumptions considered in the previous section, the associated ODE systems are piecewise smooth. In particular, they are smooth in a neighbourhood of $\bar X$, which allows us to carry out the linear stability analysis in order to study the local convergence of the solution towards a steady state.

We consider here a physical system where $N$ rigid spheres in $\mathbb{R}^b$ attract each other through a global potential which is given by a quadratic function of the relative distance,
$$W(X) = \frac{1}{2N}\sum_{i,j\in\{1,\dots,N\},\,i<j}|X_i - X_j|^2. \tag{3.1}$$
Definition 3.1. A steady state $x^*$ of the ODE system $\dot x = f(x)$, $t\ge0$, is called
• stable (in the sense of Lyapunov) if for all $\varepsilon>0$, there exists a $\delta>0$ such that $\|\bar x(0)-x^*\|<\delta$ implies $\|\bar x(t)-x^*\|<\varepsilon$, for all $t>0$ and for all solutions $\bar x$;
• asymptotically stable if it is stable and $\lim_{t\to\infty}\|\bar x(t)-x^*\|=0$;
• unstable if it is not stable.

Note that this definition assumes that the initial configuration is chosen close enough to the steady state. Alternative notions of stability could have been used [25,26]. The one we consider here allows us to get insight into the behaviour of the algorithm, as described below. The next theorem allows us to obtain conclusions about the original nonlinear system from the corresponding linearized system.
Theorem 3.2. Consider the ODE system $\dot x = f(x)$ and a steady state $x^*$, where $f$ is smooth at $x^*$. If $x^*$ is an asymptotically stable (unstable) solution of the linearized system about $x^*$, i.e., $\dot{\tilde x} = f'(x^*)(\tilde x - x^*)$, then it is an asymptotically stable (unstable) solution of the original system.

Proof. See [27], Thm. 2.42, p. 158. □

In order to ensure convergence of the ODE system towards a steady state, we only need to ensure that the eigenvalues of $f'(x^*)$ all have negative real part. If at least one eigenvalue has positive real part, then $x^*$ is unstable, and if all eigenvalues are purely imaginary, then $x^*$ is a center equilibrium, i.e. if a solution starts near it then it will be periodic around it. In the latter case, we cannot conclude anything about the nonlinear system. The analysis presented next is made for the case of two spheres in $\mathbb{R}$.

3.2. The Arrow–Hurwicz algorithm (AHA)

3.2.1. AHA-NS
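Let $\phi(X) = d - |X|$ and consider the potential (3.1). The ODE system associated with the AHA-NS in the case of two spheres in $\mathbb{R}$ where one sphere is fixed at the origin can be written as
$$\begin{cases}
\dot X = -\alpha\Big(1 - \dfrac{\lambda}{|X|}\Big)X, & \text{(a)}\\[6pt]
\dot\lambda = \begin{cases} 0, & \text{if } \lambda=0 \text{ and } d<|X|,\\ \beta\,(d-|X|), & \text{otherwise}. \end{cases} & \text{(b)}
\end{cases} \tag{3.2}$$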
Lemma 3.3. The steady states of the system (3.2)(a)–(3.2)(b), $(X^*,\lambda^*)=(d,d)$ and $(X^*,\lambda^*)=(-d,d)$, are both asymptotically stable, for any $\alpha$ and $\beta$ positive.

Proof. Since the dynamics around each steady state is identical, we only need to carry out the analysis of the first steady state. Suppose $X>0$ and consider the change of variables $Y=X-d$ and $\mu=\lambda-d$. The system in the new variables is given in matrix form by
$$\begin{bmatrix}\dot Y\\ \dot\mu\end{bmatrix} = A\begin{bmatrix}Y\\ \mu\end{bmatrix}, \qquad A = \begin{bmatrix}-\alpha & \alpha\\ -\beta & 0\end{bmatrix}.$$
We want the eigenvalues of the matrix $A$ to be real and negative in order to have a fast convergence to the steady state. The roots of the characteristic polynomial $P(\lambda) = \lambda^2 + \alpha\lambda + \beta\alpha$ both have negative real part, therefore the steady state is asymptotically stable. □
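The conclusion of Lemma 3.3 can be checked numerically for given parameter values. The following sketch builds the matrix $A$ of the proof and prints its eigenvalues; the helper name `aha_ns_matrix` and the parameter values are illustrative choices.

```python
import numpy as np

def aha_ns_matrix(alpha, beta):
    # Linearization of (3.2) around (X*, lambda*) = (d, d) in the variables (Y, mu).
    return np.array([[-alpha, alpha],
                     [-beta, 0.0]])

for alpha, beta in [(1.0, 1.0), (4.0, 1.0), (8.0, 1.0)]:
    eig = np.linalg.eigvals(aha_ns_matrix(alpha, beta))
    print(alpha, beta, eig)
```

The discriminant of $\lambda^2+\alpha\lambda+\alpha\beta$ is $\alpha^2-4\alpha\beta$: for $\alpha<4\beta$ the eigenvalues are complex (damped oscillations), for $\alpha=4\beta$ they coincide at $-\alpha/2$, and for $\alpha>4\beta$ they are real and distinct, with one of them closer to zero than $-\alpha/2$.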
Any solution to the ODE system (3.2)(a)–(3.2)(b) converges to a steady state for all $\alpha,\beta>0$ and the fastest convergence is achieved when $\alpha=4\beta$. Contrary to the one-dimensional case, in higher spatial dimensions the constraints are no longer piecewise linear. Consequently, we cannot directly extrapolate the conclusions drawn in this section. In particular, in dimension $b=2$, the numerical simulations show oscillations around the steady state for $N>3$ without ever converging to it. The non-convergence in this case is due to the non-existence of a saddle point of the Lagrangian [18].

3.2.2. AHA-S
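Let $\phi(X) = d^2 - |X|^2$ and consider the potential (3.1). The ODE system associated with the AHA-S in the case of two spheres in $\mathbb{R}$ where one sphere is fixed at the origin can be written as
$$\begin{cases}
\dot X = -\alpha\,(1-2\lambda)\,X, & \text{(a)}\\[6pt]
\dot\lambda = \begin{cases} 0, & \text{if } \lambda=0 \text{ and } d<|X|,\\ \beta\,(d^2-X^2), & \text{otherwise}. \end{cases} & \text{(b)}
\end{cases} \tag{3.3}$$

Fig. 2. Phase portrait of the system (3.3)(a)–(3.3)(b) with $(\alpha,\beta,d)=(0.01,0.01,2)$ and initial condition $X_0=0.2$. The dynamics do not converge to the equilibrium $(2,1/2)$.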
Lemma 3.4. The steady states of the system corresponding to the linearization of (3.3)(a)–(3.3)(b), $(X^*,\lambda^*)=(d,1/2)$ and $(X^*,\lambda^*)=(-d,1/2)$, are both center equilibria, for any $\alpha$ and $\beta$ positive.

Proof. As before, we will only carry out the analysis of the first steady state. Suppose $X>0$ and consider the change of variables $Y=X-d$ and $\mu=\lambda-1/2$. The linearized system in the new variables is given in matrix form by
$$\begin{bmatrix}\dot Y\\ \dot\mu\end{bmatrix} = A\begin{bmatrix}Y\\ \mu\end{bmatrix}, \qquad A = \begin{bmatrix}0 & 2d\alpha\\ -2d\beta & 0\end{bmatrix}.$$
The roots of the characteristic polynomial $P(\lambda) = \lambda^2 + 4d^2\alpha\beta$ are both purely imaginary, therefore the steady state of the linearized system is a center equilibrium. □
The linear analysis does not allow us to conclude anything about the asymptotic behaviour of the nonlinear system (see Theorem 3.2). Nevertheless, the phase portrait plotted in Fig. 2 indicates that a solution to the nonlinear system should converge towards a periodic orbit around the steady state. As we will see in the next section, the damping term applied to the Arrow–Hurwicz system (2.7)(a)–(2.7)(b) ensures asymptotic stability of the steady state, under certain conditions on the parameters.
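The persistent oscillation can be reproduced with a few lines of code. The sketch below iterates a semi-implicit discretization of (3.3) in the spirit of (2.8), with the parameter values quoted in the caption of Fig. 2; the starting multiplier `lam0 = 0`, the number of steps and the function name are assumptions made for illustration.

```python
import numpy as np

def simulate_aha_s(alpha, beta, d, X0, lam0=0.0, n_steps=5000):
    """Iterate the discrete AHA-S for two spheres in one dimension."""
    X, lam = X0, lam0
    traj = np.empty(n_steps)
    for n in range(n_steps):
        X = X - alpha * (1.0 - 2.0 * lam) * X          # (3.3)(a) with the current multiplier
        lam = max(0.0, lam + beta * (d**2 - X**2))     # (3.3)(b) with the new position
        traj[n] = X
    return traj

traj = simulate_aha_s(alpha=0.01, beta=0.01, d=2.0, X0=0.2)
tail = traj[-1000:]
print(tail.min(), tail.max())   # the range of X over the last 1000 iterations
```

In this setting the position keeps oscillating around $X=d$ instead of settling, which is consistent with the center equilibrium found in Lemma 3.4.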
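3.3. The damped Arrow–Hurwicz algorithm (DAHA)

3.3.1. DAHA-NS

Let $\phi(X) = d - |X|$ and consider the potential (3.1). The ODE system associated with the DAHA-NS in the case of two spheres in $\mathbb{R}$ where one sphere is fixed at the origin can be written as
$$\begin{cases}
\ddot X = -\alpha^2\Big(1 - \dfrac{\lambda}{|X|}\Big)X + \alpha\beta\lambda\,(d-|X|)\,\dfrac{X}{|X|} - c\dot X, & \text{(a)}\\[6pt]
\dot\lambda = \begin{cases} 0, & \text{if } \lambda=0 \text{ and } d<|X|,\\ \beta\,(d-|X|), & \text{otherwise}. \end{cases} & \text{(b)}
\end{cases} \tag{3.4}$$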
Lemma 3.5. Let $\alpha,\beta,c>0$. If $(\alpha+\beta d)c - \beta\alpha>0$, then the steady states of the system (3.4)(a)–(3.4)(b), $(X^*,\dot X^*,\lambda^*)=(d,0,d)$ and $(X^*,\dot X^*,\lambda^*)=(-d,0,d)$, are both asymptotically stable.

Proof. Suppose $X>0$ and consider the change of variables $Y=X-d$, $Z=\dot Y$ and $\mu=\lambda-d$. The linearized system in the new variables is given in matrix form by
$$\begin{bmatrix}\dot Y\\ \dot Z\\ \dot\mu\end{bmatrix} = A\begin{bmatrix}Y\\ Z\\ \mu\end{bmatrix}, \qquad A = \begin{bmatrix}0 & 1 & 0\\ -\alpha^2-\alpha\beta d & -c & \alpha^2\\ -\beta & 0 & 0\end{bmatrix}.$$
The eigenvalues of the matrix $A$ are the roots of the characteristic polynomial in $\lambda$, which is given by $P(\lambda) = \lambda^3 + c\lambda^2 + (\alpha^2+\alpha\beta d)\lambda + \beta\alpha^2$.

Consider in general a cubic polynomial of the form $P(\lambda) = \lambda^3 + c_2\lambda^2 + c_1\lambda + c_0$, with $c_0,c_1,c_2\in\mathbb{R}^+$. Let $z_1$, $z_2$ and $z_3$ be the (complex) roots of this polynomial. We want to ensure that all roots have negative real part. Since all coefficients are positive, if the roots are real then they must be negative. Suppose now that two roots are complex conjugate, for example, $z_1=a+ib$, $z_2=a-ib$, $a,b\in\mathbb{R}$ and $z_3\in\mathbb{R}^-$. In order to find a condition on the coefficients which ensures that $a$ is negative, we start by relating the coefficients of the polynomial to its roots:
$$z_1+z_2+z_3 = -c_2, \qquad z_1z_2+z_1z_3+z_2z_3 = c_1, \qquad z_1z_2z_3 = -c_0.$$
Rewriting in terms of $a$, $b$ and $z_3$ we get
$$2a+z_3 = -c_2, \qquad a^2+b^2+2az_3 = c_1, \qquad (a^2+b^2)\,z_3 = -c_0. \tag{3.5}$$
From (3.5) we deduce that $a$ satisfies the cubic equation
$$8a^3 + 8c_2a^2 + 2(c_1+c_2^2)\,a + c_1c_2 - c_0 = 0.$$
Consequently, if $c_1c_2-c_0>0$, all coefficients of this cubic are positive and $a$ is necessarily negative.

Back to our case, we have $c_2=c$, $c_1=\alpha^2+\alpha\beta d$ and $c_0=\beta\alpha^2$, and a sufficient condition for the steady state to be asymptotically stable is $(\alpha^2+\alpha\beta d)c - \beta\alpha^2>0$, i.e., $(\alpha+\beta d)c - \beta\alpha>0$. Note that since the steady state is asymptotically stable as a solution to the linearized system, it is also asymptotically stable as a solution to the nonlinear system (see Theorem 3.2). □
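As an illustration, the discrete scheme (2.10) can be applied to the two-sphere setting of (3.4). The sketch below is not taken from the paper: the parameter values $\alpha=\beta=d=1$, $c=2$, the start from rest (`X_prev = X0`) and the function name are assumptions, chosen so that the condition of Lemma 3.5 holds, since $(\alpha+\beta d)c-\beta\alpha = 3 > 0$.

```python
def daha_ns_1d(alpha, beta, c, d, X0, lam0=0.0, n_steps=200):
    """Scheme (2.10) for two spheres in one dimension, non-smooth constraint.

    Assumes X never becomes exactly zero, so that X / |X| is well defined.
    """
    X_prev, X, lam = X0, X0, lam0
    for _ in range(n_steps):
        s = 1.0 if X > 0 else -1.0            # X / |X|
        T1 = X - lam * s                      # grad W + lam * grad phi, with grad phi = -s
        T2 = (d - abs(X)) * lam * (-s)        # phi * lam * grad phi
        X_new = (2.0 * X - (1.0 - c / 2.0) * X_prev
                 - alpha**2 * T1 - alpha * beta * T2) / (1.0 + c / 2.0)
        X_prev, X = X, X_new
        lam = max(0.0, lam + beta * (d - abs(X)))   # multiplier update at the new position
    return X, lam

X, lam = daha_ns_1d(alpha=1.0, beta=1.0, c=2.0, d=1.0, X0=0.5)
print(X, lam)   # both values approach d = 1, the steady state (d, 0, d)
```

For these values the iterates approach the contact configuration $X=d$ with multiplier $\lambda=d$, in line with the asymptotic stability stated in the lemma.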
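3.3.2. DAHA-S

Let $\phi(X) = d^2 - |X|^2$ and consider the potential (3.1). For the case of two spheres in $\mathbb{R}$ where one sphere is fixed at the origin, the ODE system associated with the DAHA-S can be written as
$$\begin{cases}
\ddot X = -\alpha^2(1-2\lambda)\,X + 2\alpha\beta\lambda\,(d^2-|X|^2)\,X - c\dot X, & \text{(a)}\\[6pt]
\dot\lambda = \begin{cases} 0, & \text{if } \lambda=0 \text{ and } d<|X|,\\ \beta\,(d^2-|X|^2), & \text{otherwise}. \end{cases} & \text{(b)}
\end{cases} \tag{3.6}$$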
Lemma 3.6. Let $\alpha,\beta,c>0$. If $c-2\alpha>0$, then the steady states of the system (3.6)(a)–(3.6)(b), $(X^*,\dot X^*,\lambda^*)=(d,0,1/2)$ and $(X^*,\dot X^*,\lambda^*)=(-d,0,1/2)$, are both asymptotically stable.

Proof. As before, suppose $X>0$ and consider the change of variables $Y=X-d$, $Z=\dot Y$ and $\mu=\lambda-1/2$. The linearized system in the new variables is given in matrix form by
$$\begin{bmatrix}\dot Y\\ \dot Z\\ \dot\mu\end{bmatrix} = A\begin{bmatrix}Y\\ Z\\ \mu\end{bmatrix}, \qquad A = \begin{bmatrix}0 & 1 & 0\\ -2\alpha\beta d^2 & -c & 2d\alpha^2\\ -2d\beta & 0 & 0\end{bmatrix}.$$
The eigenvalues of the matrix $A$ are the roots of the characteristic polynomial in $\lambda$:
$$P(\lambda) = \lambda^3 + c\lambda^2 + 2\alpha\beta d^2\lambda + 4d^2\beta\alpha^2.$$
Using the same reasoning as before we have $c_2=c$, $c_1=2\alpha\beta d^2$ and $c_0=4d^2\beta\alpha^2$. A sufficient condition for the steady state to be asymptotically stable is $2c\alpha\beta d^2 - 4d^2\beta\alpha^2>0$, i.e., $c-2\alpha>0$. □
Remark 3.1. We see that as long as the damping coefficient, $c$, is large enough, the sufficient conditions for stability of both the DAHA-NS and DAHA-S are fulfilled. Furthermore, the parameter space corresponding to the stability of the DAHA-NS is larger than the one of the DAHA-S.

The corresponding analyses for the NAP and the NAV algorithms are presented in Appendix A.
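The sufficient conditions of Lemmas 3.5 and 3.6 can be compared against the eigenvalues of the linearized systems. The sketch below writes the two matrices in the state ordering $(Y, Z, \mu)$, obtained by linearizing (3.4) and (3.6) around $X=d$; the function names and the parameter values in the usage lines are illustrative assumptions.

```python
import numpy as np

def daha_ns_matrix(alpha, beta, c, d):
    # Linearization of (3.4) around (d, 0, d).
    return np.array([[0.0, 1.0, 0.0],
                     [-(alpha**2 + alpha * beta * d), -c, alpha**2],
                     [-beta, 0.0, 0.0]])

def daha_s_matrix(alpha, beta, c, d):
    # Linearization of (3.6) around (d, 0, 1/2).
    return np.array([[0.0, 1.0, 0.0],
                     [-2.0 * alpha * beta * d**2, -c, 2.0 * d * alpha**2],
                     [-2.0 * d * beta, 0.0, 0.0]])

def is_linearly_stable(A):
    # Asymptotic stability of the linear system: all eigenvalues in the open left half-plane.
    return bool(np.max(np.linalg.eigvals(A).real) < 0.0)

alpha = beta = d = 1.0
for c in (0.25, 1.0, 3.0):
    print(c,
          is_linearly_stable(daha_ns_matrix(alpha, beta, c, d)),
          is_linearly_stable(daha_s_matrix(alpha, beta, c, d)))
```

With $\alpha=\beta=d=1$ the conditions read $c>1/2$ for the DAHA-NS and $c>2$ for the DAHA-S, so the intermediate value $c=1$ gives a stable linearization for the non-smooth form only, which illustrates Remark 3.1.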
4. Numerical results

In this section we investigate and compare the numerical results obtained from the damped Arrow–Hurwicz algorithms (DAHA-NS, DAHA-S) and the nested algorithms (NAP-NS, NAP-S and NAV-NS) for the potential defined in (3.1). Due to the difficulty in finding the optimal parameters $(\alpha,\beta)$ for each method and for each $N$, we have restricted this study to the cases $N=7$ and $N=100$ in two spatial dimensions (i.e. $b=2$). We address the convergence time and the robustness of the convergence time with respect to the initial configurations. Additionally, we compare the accuracy of the methods for the case $N=7$ only. Indeed, in the case $N=7$, the stable steady state of the dynamical systems associated with the algorithms is unique (apart from translations, rotations and reflections) and is represented in Fig. 1b. This guarantees that all algorithms converge to the same minimum for any initial configuration. In particular, this allows us to assess the accuracy of the algorithms by comparing the computed minimum with the exact one. We finally show some examples of configurations obtained with the DAHA-S for the case $N=2000$ in two and three dimensions.

In order to adjust the spatial dimensions, the numerical parameters must satisfy $\alpha,\beta,c\sim O(1)$ for the methods with the non-smooth form of the constraint functions and $\alpha,c\sim O(1)$ and $\beta\sim O(1/d^2)$ for the methods with the smooth form of the constraint functions. In the following we have considered $d=1$.

In order to be able to compare the nested algorithms with the DAHA regarding convergence time, we only consider the evolution of $X$ and $V$ and we do not consider the evolution of $\lambda$. We denote by $\|\cdot\|$ the Euclidean norm in $\mathbb{R}^{bN}$. For a given small and positive $\varepsilon_{\text{inner}}$, the stopping criterion for the minimization algorithms associated with the NAP is given by the following condition on the relative error
$$\frac{\|X^{n+1}-X^n\|}{\|X^n\|} < \varepsilon_{\text{inner}}. \tag{4.1}$$
For the case of the minimization problem formulated in terms of the velocities, the stopping criterion is similar, but instead of $X$ we write $V$ and instead of normalizing by $\|V^n\|$ we normalize by $\|X^n\|$, yielding
$$\frac{\|V^{n+1}-V^n\|}{\|X^n\|} < \frac{\varepsilon_{\text{inner}}}{\tau}. \tag{4.2}$$
By using the Euler step $X^{n+1} = X^p + \tau V^{n+1}$, which gives $X^{n+1}-X^n = \tau\,(V^{n+1}-V^n)$, we see that the two conditions (4.1) and (4.2) are equivalent. As we will see, in order to get a fast convergence with the nested algorithms, one does not need to wait for the convergence of the inner loop. We introduce a new parameter, $I_{\text{inner}}$, which stands for the maximum number of iterations of the inner loop allowed per outer-loop iteration. Finally, the stopping criterion for both the outer loop of the NAP and the NAV, as well as for the DAHA, reads
$$\frac{\|X^{p+1}-X^p\|}{\|X^p\|} < \varepsilon. \tag{4.3}$$
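The relative-error test (4.3) can be wrapped around any update rule. The sketch below is a generic illustration: `step` is a hypothetical callable mapping a configuration to the next one, and the default values of `eps` and `max_iter` are arbitrary choices.

```python
import numpy as np

def relative_change(X_new, X_old):
    """||X_new - X_old|| / ||X_old|| with the Euclidean norm in R^{bN}."""
    return np.linalg.norm(X_new - X_old) / np.linalg.norm(X_old)

def iterate_until_converged(step, X0, eps=1e-6, max_iter=10_000):
    """Apply `step` until the relative change drops below eps; return (X, iterations)."""
    X = X0
    for n in range(1, max_iter + 1):
        X_new = step(X)
        if relative_change(X_new, X) < eps:
            return X_new, n
        X = X_new
    return X, max_iter

# Usage with a toy contraction towards 1 (not a packing algorithm).
X_final, n_iter = iterate_until_converged(lambda X: 0.5 * X + 0.5, np.array([[2.0, 2.0]]))
print(n_iter, X_final)
```

For an `(N, b)` array, `np.linalg.norm` without further arguments returns the Frobenius norm, which coincides with the Euclidean norm of the stacked configuration vector.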
The assessment and comparison of the methods will be made through the comparison of statistical indicators obtained by averaging certain quantities over a set of different initial configurations. These indicators are introduced below.
Definition 4.1. Consider a set of $m$ initial configurations for which an algorithm converges, i.e., the stopping criterion is satisfied in a finite number of iterations. Let $T_\ell$ be the number of iterations needed for the algorithm to converge when starting with the $\ell$th initial configuration. Let $A_{ij}$ be the overlapping area of spheres $i$ and $j$ at convergence and $A_{\text{total}} = N\pi(d/2)^2$. We define the following statistical indicators: the mean convergence time, the variance of the convergence time and the mean proportion of overlapping area per sphere as
$$\bar T = \frac{1}{m}\sum_{\ell=1}^{m}T_\ell, \qquad \sigma^2 = \frac{1}{m-1}\sum_{\ell=1}^{m}(T_\ell-\bar T)^2 \qquad\text{and}\qquad \bar A = \frac{1}{m\,N\,A_{\text{total}}}\sum_{i,j\in\{1,\dots,N\},\,i<j}A_{ij},$$
respectively.
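The first two indicators, and the pairwise overlap area that enters the third one, can be computed as follows. This is a sketch under assumptions: the overlap is evaluated in two dimensions with the standard lens-area formula for two equal disks, and the function names are illustrative.

```python
import numpy as np

def convergence_time_stats(T):
    """Mean and unbiased sample variance (denominator m - 1) of the convergence times."""
    T = np.asarray(T, dtype=float)
    return T.mean(), T.var(ddof=1)

def overlap_area_2d(dist, d):
    """Area of the intersection of two disks of diameter d whose centres are dist apart."""
    if dist >= d:
        return 0.0
    return 0.5 * d**2 * np.arccos(dist / d) - 0.5 * dist * np.sqrt(d**2 - dist**2)

print(convergence_time_stats([10, 12, 14]))
print(overlap_area_2d(0.0, 1.0), overlap_area_2d(1.0, 1.0))
```

Using `ddof=1` matches the factor $1/(m-1)$ in the definition of $\sigma^2$.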
The indicator $\bar T$ measures the efficiency of an algorithm with respect to the convergence time, $\bar A$ and $W$ measure the accuracy of the final configuration and $\sigma^2$ measures the robustness of the convergence time with respect to the initial configurations. For simplicity we assume that the time interval between iterations is constant and invariant among the different algorithms. As a consequence of this simplification, we will use the number of iterations as the time unit of $\bar T$.

4.1. Case N = 7

We present a detailed numerical study for the case of $N=7$ spheres in dimension $b=2$. The 20 different initial configurations considered in this section were generated from a standard Gaussian distribution. We choose the tolerances $\varepsilon=10^{-6}$ and $\varepsilon_{\text{inner}}=10^{-9}$ and the maximum number of iterations of the inner loop $I_{\text{inner}}=10$. In order to study the relation between the damping parameter $c$ and the convergence time of the DAHA with smooth and non-smooth constraints, we plot in Fig. 3 the maximum number of iterations over 20 different randomly generated initial configurations as a function of $c\in(0,10]$. We observe that the lowest convergence time is attained when $c\approx2$, for both the DAHA with the smooth and with the non-smooth constraints. In Fig. 4 we plot the relative error as a function of the iteration number, $n$, for different values of $c$. If $c=0$ we observe that the relative error oscillates and never drops below $10^{-1}$. As we increase $c$ the oscillations tend to diminish. In the following we have used $c=2$. Note that this choice for $c$ eliminates the dependence on $X^{n-1}$ in (2.10)(a)–(2.10)(b). In this case, the DAHA can be seen as a discretization of the following first-order ODE system:
⎪ ⎪
⎪ ⎪
⎪ ⎪
⎪ ⎨
⎪ ⎪
⎪ ⎪
⎪ ⎪
⎪ ⎪
⎩
X
˙
i= − 1
2 α
2[∇
XiW(
X) +
k,∈{1,...,N},k<
λ
k∇
Xiφ
k(
X) ]
− 1
2 α β
k,∈{1,...,N},k<