an author's
https://oatao.univ-toulouse.fr/23683
https://doi.org/10.1016/j.ast.2019.03.041
Bartoli, Nathalie and Lefebvre, Thierry and Dubreuil, Sylvain and Olivanti, Romain and Priem, Rémy and Bons,
Nicolas and Martins, Joaquim R.R. A. and Morlier, Joseph Adaptive modeling strategy for constrained global
optimization with application to aerodynamic wing design. ( 2019) Aerospace Science and Technology. ISSN
1270-9638
Adaptive
modeling
strategy
for
constrained
global
optimization
with
application
to
aerodynamic
wing
design
N. Bartoli
a,∗
,
T. Lefebvre
a,
S. Dubreuil
a,
R. Olivanti
b,
R. Priem
a,
N. Bons
c,
J.R.R.A. Martins
c,
J. Morlier
daONERA/DTIS,UniversitédeToulouse,Toulouse,France bUniversitédeToulouse,ISAE-SUPAERO,Toulouse,France cUniversityofMichigan,AnnArbor,MI,48109,UnitedStates
dUniversitédeToulouse,ISAE-SUPAERO-INSA-MinesAlbi-UPS,CNRSUMR5312,InstitutClémentAder,Toulouse,France
a
b
s
t
r
a
c
t
Keywords: Surrogatemodeling Globaloptimization Multimodaloptimization MixtureofexpertsAerodynamicshapeoptimization Wingdesign
Surrogate models are often used to reduce the cost of design optimization problems that involve computationallycostlymodels,suchascomputationalfluiddynamicssimulations.However,thenumber ofevaluationsrequiredbysurrogatemodelsusuallyscalespoorlywiththenumberofdesignvariables, andthereisaneedforbothbetterconstraintformulationsandmultimodalfunctionhandling.Toaddress thisissue,wedeveloped asurrogate-based gradient-freeoptimizationalgorithmthatcan handlecases where the functionevaluations are expensive,the computational budget islimited, the functions are multimodal, and the optimization problem includes nonlinear equality or inequality constraints. The proposed algorithm—superefficient globaloptimizationcoupledwith mixtureofexperts(SEGOMOE)— cantacklecomplexconstraineddesignoptimizationproblemsthroughtheuseofanenrichmentstrategy based on a mixture of experts coupled with adaptive surrogate models. The performance of this approachwasevaluatedforanalyticconstrainedandunconstrainedproblems,aswellasforamultimodal aerodynamicshapeoptimizationproblemwith17designvariablesandanequalityconstraint.Ourresults showedthatthemethodisefficientandthattheoptimumismuchlessdependentonthestartingpoint thantheconventionalgradient-basedoptimization.
1. Introduction
Inaerodynamicshapeoptimization,realistic wingdesign
prob-lems require both a large number of design variables and
con-straints.Atypical probleminvolvesafew tensofdesignvariables andobjectiveandconstraintfunctionsthatrequiresignificantCPU time.Becausethecomputationalbudgetisusually restricted,only alimitednumberofobjectiveandconstraintfunctionevaluations
(afew hundreds) can be performed to findthe best design. The
emergence of adjoint methods [1–3] was a considerable
break-through in the field of aerodynamic shape optimization because
it renders the cost of computing objective and constraint gradi-ents independentof the numberof design variables. In conjunc-tionwithgradient-basedoptimizers,thisprovidesanefficientway
to convergeto a localminimum in a high-dimensional problem.
Althoughairfoil andwingdesign optimization problemsare uni-modal [4,5], once the wing planform is allowed to change, the
*
Correspondingauthor.E-mailaddress:[email protected](N. Bartoli).
design space becomes multimodal [6]. More generally, in many engineeringdesignoptimizationproblems,we donotknowa pri-ori whether thedesign spaceis multimodalornot and perform-ing a multistart is oftenintractable in termsof CPUtime. When the gradient information is not available, an interesting option is to use gradient-free optimizers known as derivative-free opti-mization (DFO) methods [7–10]. Stochastic DFO methods mostly fall intothe categoryof evolutionaryalgorithms andare efficient to solve constrained problems [11]. One of the most promising evolutionaryalgorithms thatcan handleconstraintsisthe covari-ance matrix adaptation evolution strategy (CMA-ES) [12] and its variants [13–15]. They have gained popularity in the solution of high-dimensional problems.Theironly drawback isstill thelarge numberoffunctionevaluationsrequiredtofindtheoptimum [16]. Deterministic DFO methods,such asmesh adaptivedirect search (MADS), coordinate search (CS), and generalized pattern search (GPS),havebeenproposed [10].Thesemethodshavebeen imple-mentedindifferentpackagessuchasNOMAD [17],HOPSPACK [18], andtheDFLLibrary [19].DeterministicDFO methodsperform lo-calsearchesandthereforedependontheinitialstartingpoint.For
solving a multimodalproblem, a multistartapproach is generally usedandthenumberoffunctionevaluationsincreases,oftenwith
more than a thousand calls. In addition, some DFO algorithms,
suchasNOMAD,artificiallyincreasethenumberofinequality con-straintsto manageequalityconstraints,makingconvergenceeven moredifficult.Surrogate-basedoptimization(SBO)orBayesian op-timization(BO)approachesbuildaninexpensive approximationof theoriginal function that canbe used torapidly findan approx-imate optimum [20–23]. In BO, an acquisition function is built from Gaussian process approximations of the objective and con-straintfunctions [24].ForsomeBOframeworks [17,25],inequality constraints are an issue. Most of the cited algorithms consider equalityconstraintsastwo inequality constraints,which becomes intractable for large numbers of constraints. The augmented La-grangian framework ALBO [26] handles mixed-constrained prob-lems; however,itisstill notsuitable forsolvinglarge-scale prob-lems ina reasonable time.Various reviewsonSBO canbe found intheliterature [27–30].
AnotherBOalternative is basedon sequentialenrichment ap-pliedtoefficientglobaloptimization(EGO) [21] usingan adaptive surrogate model.EGOuses a krigingmodel (also calledGaussian process inthe machine learning community[31]) asasubstitute forhigh-fidelity models,takingadvantageofthepredictionofthe variance that is built into these models to inform the adaptive sampling [32–34]. Wessing and Preuss [16] performed a
com-parison betweenEGOand CMA-ES formultimodalunconstrained
problems.TheyconcludedthatEGOiseffectiveinfindingmultiple optimawhenthebudgetoffunctionevaluationsislimited.To han-dleconstrainedproblems,Sasenaet al. [35] proposedanextension
of EGO called SuperEGO. SBO has been applied to aerodynamic
shape optimizationproblemsin previous research efforts [36,37], some ofwhichhaveusedadaptivesampling [38–40]. Despitethe workscitedabove,thenumberofdesignvariablesthathavebeen
handled so far is insufficient for wing design optimization and
multimodalitywitharestrictedbudgetisstillanissue.
Forrealistic aircraftwingshapeoptimizationproblems,the re-quired number of design variables exceeds 200 [4], and,
there-fore, trying to directly solve the problems using EGO with a
conventional kriging approach is not feasible. We recently
pro-posed to use EGO with a new type of kriging model adapted
tohigh-dimensionalproblems(namely,KPLSandKPLS+K) [41,42]. Theresultingoptimizationapproach,whichwecallSEGOKPLS(+K),
was demonstrated in problems with up to 50 design variables
that were solved using approximately a hundred function evalu-ations [43]. To handle nonlinear functionsthat vary significantly within a wide domain, researchers have proposed to cluster the dataandconstructanassemblyoflocalsurrogates,knownas mix-tureofexperts (MOE),whichfacilitateglobaloptimization [44–48].
The first contribution of this study is the extension of the
SEGOKPLS algorithm to handlehighly nonlinear and non-smooth
functionsby usingMOE withKPLSorKPLS+Kmodelsasthelocal experts.Wehavepreviouslypresentedanapproachcombining Su-perEGOandMOE(SEGOMOE),withpreliminaryresultsforanalytic functions [49].We startby constructingsurrogate modelsforthe objective and constraint functions by combining automatic clus-teringand best expertselection. Then, we approach thesolution iteratively by balancing the exploration and exploitation phases with a new proposed criterion for the acquisition function.
An-alytic test cases are presentedto compare SEGOMOEwith other
DFOalgorithmssuchasCOBYLA,NOMAD,orALBO.
Thesecondcontributionisthecomparisonoftheproposed ap-proachwithagradient-basedalgorithmforanaerodynamicshape optimizationproblem.Thisproblemisbasedon abenchmark
de-velopedby the AIAA Aerodynamic Design andOptimization
Dis-cussion Group (ADODG) [50]. The case that we solve is a
sim-plified version of ADODG Case 6, which is a subsonic wing
de-signproblemwithamultimodalobjective.Theaerodynamicmodel is expensive, requiring high-performance parallel computing re-sources to solve the Reynolds-averaged Navier–Stokes equations usingcomputationalfluiddynamics(CFD).Theobjectiveisto min-imize the drag coefficient at a given lift coefficient. The design variables are the angle of attack, twist distribution,and dihedral distributionofthewing,foratotalof17designvariables.Multiple optima have beenidentified by using a gradient-basedoptimizer starting fromdifferent design points. The goalof this studywas
to compare the effectiveness of SEGOMOE versus an established
gradient-basedapproachinconvergingtotheglobaloptimum. We starttherestofthispaperbysummarizingthepreviously developed methodsthat are relevanttothe presentworkandby introducing a new enrichment criterion (i.e.,WB2S). We demon-strate theabilityoftheproposed approachto findtheglobal op-tima for five analytic test cases that are multimodal. Then, we
demonstrate the effectiveness of SEGOMOE in finding the global
optimumoftheaerodynamicshapeoptimizationproblemand dis-cusshowitcomparestoagradient-basedmethod.
2. TheSEGOMOEapproach
Inthissection,wedescribetheproposedapproach(SEGOMOE), whichsolvestheconstrainedoptimizationproblem
minx∈ y
(x)
s.t. c1
(x)
≤
0, . . . ,
cm(x)
≤
0,
(1) where
⊂ R
d definesthe design space. Equalityconstraints can beconsideredaparticularcasewhereci(
x)
=
0.Westartthissectionwithadescriptionofpreviouslydeveloped techniques(Section2.1)andfollowwithadescriptionofthenovel contributionsinthepresentwork(Section2.2).
2.1. BackgroundonSEGO
TheapproachproposedinthispaperbuildsuponSEGO,which, inturn,isbasedonEGO.Therefore,westart withan overviewof EGO andfollowwithan explanationof howconstraintsare han-dled inSEGO.Wealsoincludea descriptionofamorelocalinfill criterion (WB2). Finally, we describe the techniques that we use tohandlehigh-dimensionaldesignspacesefficiently(namely,KPLS andKPLS+K).
2.1.1. OriginalEGO
Asmentioned intheIntroduction,theproposedalgorithmwas inspired by EGO [21]. The main idea of the EGOapproach is to assume that the unknown objective function is a realization of a Gaussian process (GP) (alsoknown askriging [31,51]). This re-quires aninitial designofexperiments (DOE)ofnDOE points X
=
x(i)
,
i=
1, . . . ,
nDOE
withx(i)
∈ R
d,wheretheobjectivefunctionis evaluated to obtain the samples y
=
y(
x(i)),
i=
1, . . . ,
nDOE
withy(
x(i))
∈ R
.Then,wecanbuildaconditionedGPthat approx-imatestheobjectivefunctionatanypointx byaGaussianrandom variableYˆ
(
x)
withameanofˆ
y
(x)
= ˆ
μ
+
rtxXR−1y−
1μ
ˆ
(2)andastandarddeviationof
ˆ
s2
(x)
= ˆ
σ
21−
rtxXR−1rxX,
(3)where 1 is an nDOE
×
1 column vector of 1’s, rxX= {
k(
x,
x(i)),
i
=
1,
. . . ,
nDOE}
, R is the covariancematrixofcomponents Ri j=
k(
x(i),
x(j))
,andk(.,
.)
isagivencovariancefunction.Inthiswork, weusethesquaredexponentialcovariancefunctionk
(x,
x)
= ˆ
σ
2 d i=1 exp−θ
i xi−
xi 2∀θ
i∈ R
+,
∀
i∈ [
1, . . . ,
d].
(4)The scalar parameters
μ
ˆ
,σ
ˆ
, andθ
i,
i=
1,
. . . ,
d, can be foundusing a number of methods; here, we use likelihood
maximiza-tion [22,31].UsingthisGaussianapproximation,theEGOalgorithm initiallyproposedbyJonesetal. [21] iterativelyaddspointstothe DOEto increase theaccuracy ofthe GP.Tofind newpoints xnew
thathelptheoveralloptimization,theEGOalgorithmusesthe ex-pectedimprovement (EI)criterion,
EI
(x)
= E
max
(
0,
ymin− ˆ
Y(x))
,
(5)where ymin is theminimum value ofthe objectivefunction over
theDOE,i.e., ymin
=
min i∈[1,...,nDOE]y
(
x(i))
.BecauseYˆ
(
x)
isaGaussian randomvariabledefinedbyits mean (2) andits variance (3),the EIcriterioncanbewrittenasEI
(x)
=
⎧
⎪
⎨
⎪
⎩
ymin− ˆ
y(x)
ymin− ˆy(x) ˆ s(x)
+ ˆ
s(x)φ
ymin− ˆy(x) ˆ s(x),
ifˆ
s>
0 0,
ifˆ
s=
0,
(6)where
φ (.)
istheprobability densityfunctionand(.)
isthe cu-mulativedistributionfunctionofthestandardnormaldistribution. Thisexpressionhighlightsthetrade-offbetweenexploitation ofthe Gaussian surrogate model andexploration of the design space. If(.)
is large whenˆ
y(
x)
is small compared to ymin, then the EIcriterionpromotesexploitation.Ontheotherhand,if
φ (.)
islarge whensˆ
(
x)
islarge,thenitpromotesexploration.Each iteration of the EGO algorithm consists of three main steps:
1. Construct a conditioned GP defined by a mean (2) and the variance (3) basedonaDOEofsizenDOE.
2. Maximizetheexpectedimprovement (6).
3. AddanewpointtotheDOE(xnew)andevaluateit( ynew). This iterativeprocess is repeated from an initial DOE until con-vergence.Theusualstoppingcriterionisthemaximumnumberof functionevaluationscorrespondingtoacomputationalbudget. 2.1.2. Handlingconstraints
The original EGO algorithm outlined above was designed to
minimize unconstrained functions. Because mostengineering de-sign problems are subject to constraints, it is crucial to have a soundstrategyto handledesignconstraints. Althoughwe can al-ways add constraint functions to the objective as penalties, this approachisinefficientandinaccurate.Toaddressthisissue,Sasena et al. [52] proposedan approachtheycalledsuper-efficientglobal optimization (SEGO), which can solve general nonlinearly mixed constrainedproblems.AsintheEGOapproach,theobjective func-tion y isapproximatedbyaGP.Toconsiderthem nonlinear con-straintsci
,
i=
1,
· · · ,
m, weconstructasurrogatemodelforeachconstraint ci, denoted by
ˆ
ci, usually based on the same DOE asthatused forthe objectivefunction.This resultsin thefollowing constrainedoptimizationproblem:
max
x∈f
EI
(x),
(7)where the feasible domain
f is definedby the nonlinear
con-straints,i.e.,
f
:= {
x∈ R
d: ˆ
c1(x)
≤
0, . . . ,
cˆ
m(x)
≤
0},
(8)whereequalityconstraints,c
ˆ
i(
x)
=
0,couldreplaceorbeaddedtothesetofinequalityconstraintsabove.Ateachiteration,thepoint
xnew that solves thisoptimization problem is added to the DOE andtheprocess isrepeateduntiltheconvergence.Evenifapoint
xnewisnotfeasible,evaluatingthetruefunctionsaddsinformation
totheDOE.
Inthisprocedure,theiterativeconstructionofthevarious sur-rogatemodels(oneforthe objectivefunctionandm forthe con-straints)isdrivenonly bytheEI oftheobjectivefunction. There-fore, if the surrogate models of the constraints are not accurate enough,theaccuracyoftheoptimumiscompromised.
In some examples,the optimalsolution ofEq. (7) cannot sat-isfy the “true” constraint functions ci
(
x)
because only the meanvalueoftheGPc
ˆ
i(
x)
isusedtoapproximatetheconstraintsduringthe optimization process. To consider the associated error esti-mation
ˆ
s2ci
(
x)
,we should implementdifferentstrategies.Forhan-dlingconstraintsinaBayesianoptimizationframeworkconsidering the errorestimation, various approachesare possible: probabilis-tic [53], expected violation [54], predictive entropy search with constraints [55],orslackvariableswithLagrangianformulationto handle mixed-constraint problems [26]. Work is still in progress whenitcomestoaddressingthischallenge [56].
2.1.3. Infillcriterion
Theoptimizationproblem (7) ismultimodal,anditis challeng-ingtofindtheglobalmaximumwithoutincurringalarge compu-tationalcost.Toimprovetheefficiencyofthisoptimization,Sasena et al. [52] recommendedtousethefollowingcriterionnamed “lo-catingtheregionalextreme”proposedbyWatsonandBarnes [57]:
WB2
(x)
= − ˆ
y(x)
+
EI(x),
(9)wherethemeanvalueoftheGaussiansurrogateissubtractedfrom the EI. The result isthat thiscriterion (WB2) isless multimodal thantheEI,whicheasesthesolutionoftheglobaloptimization,as demonstratedbySasena [58].
Nonetheless, the WB2 criterion can, in certain cases, prevent the algorithm fromfinding the global minimum,yielding a local minimuminstead.In fact,themagnitudeofthetermEI
(
x)
is ex-pectedtodecreaseduringtheiterativeprocess,astheGPsurrogatemodel becomes more accurate inthe promising areasof the
de-signspace. Thus,after afew iterations,the WB2criterionis only drivenbythemeanvalueoftheGPsurrogatemodel,whichmakes thealgorithmfocusonlyontheexploitation ofthesurrogatemodel ratherthanontheexploration ofthedesignspace.Toaddressthese issues,wedevelopedanewcriterionthatbuildsonWB2,whichis describedinSection2.2.1.
2.1.4. Handlingalargenumberofdesignvariables
One ofthe issues withSEGO isthe scalabilitywiththe num-berofdesignvariablesd.ToconstructtheGPoftheobjectiveand the constraint functions, we have to estimate the hyperparame-ters
θ
i,
i=
1,
. . . ,
d ofthecovariancefunction (4). Theestimationofthesehyperparametersby maximizationofthelikelihood func-tioncanbetime-consuming,especiallywhenthenumberofdesign variablesishigh(d
>
10).Thisisbecausethelikelihoodfunctionis multimodal,requiringalargenumberofinversionsofthe correla-tionmatrixtomaximizeit.Arecently developedsurrogate technique, krigingwith partial leastsquares(KPLS), was proposedby Bouhlelet al. [41] to han-dle the large number of variables forGaussian processes (up to 100variables). Thistechnique usesthe partialleastsquares(PLS) methodtoreducethenumberofhyperparametersandthesizeof theestimationproblem.Bouhlelet al. [42] subsequentlyimproved KPLS by addinga newstep in the construction of the surrogate
model, which improves the accuracy for high-dimensional
prob-lems (KPLS+K).These approacheswere usedwithin the SEGO
al-gorithm,andtheywere demonstratedtobeefficientforproblems withupto50designvariables [43].
Inthisstudy,weuseKPLSandKPLS+Ktomodelboththe objec-tivefunctionandtheconstraints.TheuseoftheKPLS(+K)method
in the SEGO algorithm is what enables us to solve nonlinearly
constrainedoptimizationproblemswitha largenumberofdesign variables(d
>
10).2.2. SEGOMOEanditsimprovements
In this section, we describe the improvements we made to
SEGOMOEthat constitute the contributions ofthe present study. Thesecontributionsincludea newenrichmenttechnique that im-proves the overall performance ofthe optimization, and an MOE approachthat improvestheaccuracyofthesurrogatemodelover awiderangeinthedesignspace.
2.2.1. Newinfillcriterion
As discussed in Section 2.1.3, the WB2 criterion (9) improves someoftheissuesofEI;however,thelackofscalingbetweenthe EI termandthepredictionofthemodelcan compromisethe ex-plorationpropertiesofthemethod.Therefore,weaddthescaling
WB2S
(x)
=
s EI(x)
− ˆ
y(x),
(10)wheres isanon-negativescale,EIisgivenbyEq. (9),andy
ˆ
(
x)
is givenbyEq. (2).Thisnewcriterion(WB2S)hastwoobjectives:
1. Keep the exploration property of the expected improvement metricoverthefeasibledesignspace
f.
2. KeeptheoriginalpropertiesoftheWB2metricby smoothing ittofacilitateoptimization.
The second condition is fulfilledasin the original WB2criterion bypenalizing theexpectedimprovementwiththeterm
− ˆ
y(
x)
in Eq. (10).Thefirstconditionismoredifficulttosatisfyandwe use thefollowingheuristic.Intuitively,thisconditioncanbetranslated to arg max x∈f EI(x)
≈
arg max x∈f WB2S(x).
(11)Toenforcethiscondition,weneedtoatleastensurethat,forx
=
arg maxx∈fEI(
x)
, the following inequality is verified: sEI(
x)
>
ˆ
y
(
x)
. As a consequence, we define s= β| ˆ
y(
x)
|/
EI(
x)
, whereβ >
1. This heuristic approach does not guarantee that Eq. (11) is true because thisdepends on the variation of y(
x)
overf.
However, by setting a relatively large value forthe parameter
β
, weexpectthisheuristictogiveanacceptableapproximation.Finally,anewapproximationisperformedtoimprovethe com-putationalefficiencyoftheproposedcriterion.Findingxover
f
is a difficult task, andwe prefer to redefinex withthe follow-ingapproximation: x
=
arg maxx∈X0EI
(
x)
,wherethefinitesubset X0⊂
f contains thestarting points usedinthemultistartopti-mizationofthecriterion.Thefollowingprocedureisusedto com-putes
>
0:1. Compute EI foreach starting point in theoptimization (in a multistartapproach).
2. Evaluate the prediction ofthe surrogate model forthe point withthehighestEI.
3. Computethescales suchthat
s
=
⎧
⎨
⎩
β
| ˆy xstart,EImax| EIxstart,EImax if EI xstart,EImax=
0 1 if EIxstart,EImax=
0,
(12)where
β >
1 is a scalingfactor.In ourexperiments,we have foundthatafixedvalueofβ
=
100 workswell;however,this couldbe adjustedduringoptimizationiftheEIvaluesaretoo smallcomparedtotheyˆ
(
x)
values.Someanalyticexperiments willbepresentedinSection3.3tostudythesensitivity analy-sisoftheβ
parameter.Toillustratethedifferentbehaviorsassociatedwitheach crite-rion, we provideaone-dimensional exampleinFig.1.The objec-tivefunction,itsassociatedGPmean,andtheGPuncertainty(with a confidenceinterval of 99%)are plottedinFig.1(a). InFig. 1(c), theexplorationphaseislessobviouswiththeWB2criterion com-paredto thatwiththe EIorthe WB2Scriterion inFigs.1(b)and 1(d), respectively. The maximum value of WB2 is located close to the minimumofthe krigingprediction (see Fig.1 atx
≈
0.
4), which illustratesthat, inthatcase, WB2focusesmoreonthe ex-ploitationofthesurrogatemodelratherthanontheexploration.In contrast,theEI andWB2Scriteriareachtheir maximumatx=
0, whichallowstoexploreapromisingareaofthedesignspace.The two advantages of WB2S are illustrated in Fig. 1(d). Similarly to the EI criterion, the WB2S criterion facilitates exploration of the threelocalmaxima.Ontheother hand,theWB2criterionismore unimodal, whichpreventstheoptimizerfromgettingstuckinflat areasintheintervalx∈ [
0.
8,
1]
,wherethegradientiszero. 2.2.2. MixtureofexpertsOneofthemaincontributionsofthisstudyisthecombination
ofMOE withEGO.ThemotivationforusingMOE comesfrom
in-dustrial optimizationproblemsforwhichboththeobjective func-tion andthe constraints mightbe strongly nonlinear, discontinu-ous, orboth.Insuch cases,agoodapproximationoverthe whole designspaceby akrigingmodelmightbeinaccurateorrequirea large-sizeDOE.
To dealwith theapproximation of highly nonlinear functions, some researchershave proposed the MOE technique [59,60]. The keyideaistoconstructdifferentlocalapproximations(experts)for differentdomainsinthedesignspace. Theselocalapproximations canbetailoredtodealwithdisparatelocaltrendsinthefunction, includingflatregions,discontinuities,andstrongnonlinearities.
We use MOE with KPLS(+K) surrogate models as the experts
andimplementedthesemethods intheSurrogateModeling
Tool-box [61]. MOE relies on the expectation-maximization algorithm forGaussianmixturemodels [62].Theinputspaceisclustered to-getherwithitsoutputvaluebymeansofaparameterestimationof thejointdistribution.Alocalexpert(e.g.,polynomialfit,radial ba-sisfunctions,andkriging)isthenbuiltoneachcluster,andallthe localexpertsarethencombinedusingtheGaussianmixturemodel
parameters found by the expectation-maximization algorithm to
obtainaglobalmodel.
Inthisapproach,the Gaussianmixture modelisusedto com-bine the data to both partition the input space and derive the
mixing proportion. To perform the clustering, we need n
in-puts, X
=
x(i),
i=
1, . . . ,
n,andthecorrespondingoutputs, y=
y
(
x(i)),
i=
1, . . . ,
n. Therefore, we can only know the cluster posteriorprobabilitiesofvectorslike(
x(i),
y(
x(i)))
∈ R
d+1.Topre-dict the cluster posterior probabilities ofa sample knowing only its inputs, wemust projecteach multivariateGaussian functionk intheGaussianmixturemodel(trainedind
+
1 dimensions)onto theinputspace,whichhasd dimensions.Thus,foreachclusterk,wecreateamultivariateGaussian func-tionin
(
d+
1)
-dimensionalspacewiththecovariancematrix,k
=
kX
ν
kν
T kξ
k,
(13)outputs given by a local model corresponding to the cluster are allowed.Thismethodcanbeefficientfordiscontinuous functions. Moreover,themixtureofexpertscanalsopredicttheuncertainty: for one sample, we choose the uncertainty corresponding to its cluster.
Forbothrecombinations,themixtureofexpertsbasedon krig-ingmodelsapproximatesfunctionswithaheterogeneousbehavior, providing a global modelwith a prediction of both the Jacobian andtheuncertainty.Theproposedalgorithmautomaticallychooses thebesttypeofrecombination,accordingtoacross-validation pro-cedure.TheMOEcanbewrittenasthesum
K
i=1α
iN
(
yˆ
i,
ˆ
s2i)
=
N
K i=1α
iyˆ
i,
K i=1α
2iˆ
s2i=
N
ˆ
y,
ˆ
s2,
(18) whereα
i isdefinedbyEq. (17),and yˆ
i(
x)
andsˆ
2i(
x)
aregivenbyEqs. (2) and (3),respectively. Thus, theMOE can alsopredict the uncertainty.Thisinformationisusefultodefinetheinfillcriterion usedintheEGOorSEGOalgorithm.TheEI,WB2,andWB2S crite-riadefinedby Eqs.(6), (9),and (10), respectively,areadapted for
theMOEmodelsusing
EIMOE
(x)
=
⎧
⎪
⎨
⎪
⎩
ymin− ˆ
y(x)
ymin− ˆy(x) ˆ s(x)
+ ˆ
s(x)φ
ymin− ˆy(x) ˆ s(x),
ifsˆ
>
0 0,
ifsˆ
=
0,
(19) whereφ (.)
istheprobability densityfunction and(.)
isthe cu-mulativedistributionfunction ofthestandardnormaldistributionN (
0,
1)
. The only difference betweenEq. (6) and Eq. (19) is the natureof y.ˆ
BecauseMOEhasmultiplemodels,ˆ
y isderived from Eq. (18) andexpressedasacombinationofkriging-basedlocal ex-perts,i.e.,ˆ
y(x)
=
K i=1α
iyˆ
i(x),
(20)where y
ˆ
i(
x)
isgivenby Eq. (2). Theassociated varianceˆ
s2 isex-pressedinthesameway,i.e.,
ˆ
s2(x)
=
K i=1α
2iˆ
s2i(x),
(21) whereˆ
s2i isgivenby Eq. (3).Becausetheinfill criterionconsiders onlytheobjectivefunctionwiththeknowledgeoftheuncertainty estimation(seeEq. (3)),wehaveawiderchoicefortheconstraint surrogatemodels. Thesecouldbe asingle surrogate oramixture oflocalexperts.Analyticderivativesoftheinfillcriterion(EI,WB2, orWB2S)arealsoavailableandcomputedforgradient-based opti-mization.2.2.4. TheSEGOMOEalgorithm
Once recombinationis achieved, MOE is used in combination
with the SEGO algorithm to solve global optimization problems
subjectto nonlinear constraints involving a large number of de-signvariables (d
>
10) andpotentially highly nonlinear objective andconstraintfunctions.Figure2showsaflowchartofthe SEGO-MOEalgorithmusedinthisstudy.Themainstepsinthealgorithm areasfollows:1. ConstructtheinitialDOEandbuildtheassociatedMOEmodels fortheobjectiveandconstraintfunctions.
2. Maximizetheinfillcriterion(EI,WB2,orthenewcriterion de-fined in Section 2.2.1) subject to the design constraints and variablebounds,andproposethenewenrichmentpoint.
Fig. 2. Overview of the SEGOMOE algorithm.
3. Computethevaluesofthe objectiveandconstraintfunctions atthenewenrichmentpoint.
4. Checkifthenew enrichmentpoint isin thefeasible domain ornot,andidentifyinactive,active,andviolatedconstraints. 5. Returnto Step 2 andupdate theDOE until thestopping
cri-terion is met. A common criterion is the maximum number
offunctioncallscorrespondingtotheavailable computational budget.
Tosolvetheconstrainedoptimizationproblem(withinequality and/orequalityconstraints)inStep 2,wecanuseagradient-based algorithmoraderivative-freeoptimizer.Becauseweuselocal opti-mizers,we performamultistartoptimizationwithdifferent start-ing points (10 points by default). The gradient-based optimizers usetheanalyticJacobianoftheMOE(asdescribedinSection2.2.3) to compute the gradients ofboth theobjective function and the constraintsoftheglobaloptimizationproblem (7).
Regarding thenumberofclustersinvolved intheMOE,Bartoli et al. [49] compared the efficiency ofSEGOMOE with K clusters versusone clusterontheMOPTAtestcase [63].Itwasfoundthat the number of function evaluations could be decreased by auto-matically choosing the number of clusters K (see [49] for more detailsontheproposedstrategy)andthat,ineverytestcase
stud-ied, the use of MOE in the EGO approach never increases the
numberoffunctionevaluationsrequiredforconvergence.Asa con-sequence,inthefollowing,theSEGOMOEframework isusedwith theautomatic clusternumberapproachalreadypresentedby Bar-toli et al. [49]. The algorithm is implemented within the NASA OpenMDAO framework [64] and,additionally, can use surrogates availablewithintheSurrogateModelingToolbox [61].
3. Analyticbenchmarkproblems
This section presents numerical results that demonstrate the
ability of SEGOMOE to solve multimodal analytic optimization
problems. A more realistic application is presented in Section 4. Here, five multimodal analytic optimization problems (three
con-sidered. The three unconstrained problems are well known
two-dimensional benchmark analytic functions: the six-hump
camel-backfunction,the Michalewiczfunction,andthe Ackleyfunction. Thefirstconstrainedproblemisatwo-dimensionalanalytic prob-lemproposedbyParret al. [65] thathastwodesignvariablesand oneinequalityconstraint.Thesolutionconsistsofoneglobal
min-imumandtwo local minima.The second constrained problemis
afour-dimensional analytic problemproposed by [26] that hasa (known)linearobjective andtwo constraints:one inequality and one equality. Before dealing with the numerical results, we re-view first the performance criteria based either onthe objective
value at the optimal point or on the global minimum location
and the choice of the
β
parameter for the WB2S criterion (seeSection 2.2.1). We also introduce derivative-free optimizers to
comparetheirperformance withthatofSEGOMOE. Wechooseto
considerCOBYLA,NOMAD,andALBO, whicharethree
derivative-freeoptimizersavailableasanopen-sourcetoolboxthatcanhandle nonlinearconstraints.
3.1.Performancecriteria
We use three performance criteria to compare the results
for SEGOMOE with those for the other optimization algorithms
(COBYLA,NOMAD,andALBO).
1. Percentageofconvergedruns
2. Meannumberoffunctionevaluationsforconvergedruns 3. Standarddeviationofthe numberoffunction evaluationsfor
convergedruns.
The convergence is assessed by measuring the error in the
ob-jective value or theproximity ofthe design variables. When the objectivevalueisused,therelativeerroris
RE
=
|
f∗
opt
−
fref∗|
|
fref∗|
,
(22)where fopt∗ is thebest point given by each algorithm and fref∗ is
the known reference solution given in Appendix A. When using
theproximity of thedesign variables, we measurethe difference betweentwosolutionsx1andx2 using
μ
prox(x
1,
x2)
=
1−
1 d d i=1 x2i−
x1i ubi−
lbi,
(23)whereubi andlbi denotethe upperbound andthe lowerbound,
respectively,oftheithdesignvariable.Thedistancesarescaledto
[
0,
1]
toconferthesameweightonthedesignvariables.In the test cases presented here, one of the solutions is the referenceandtheother one istheoptimalpointprovided bythe algorithm.Toensurethat anoptimizationisconverged,we check that1
−
μ
prox(
x1,
x2)
≤
,where
isagiventhresholdvalue(10−3
bydefault).The typeofvalidationandtheconvergencethreshold arespecifiedforeachproblem.Itiscrucialtolookatthiscriterion becausetheothertwoareonlycomputedforconvergedruns. 3.2.Derivative-freeoptimizers:COBYLA,NOMAD,andALBO
To evaluate the surrogate-based strategies, we consider the otherderivative-free algorithms(COBYLA, NOMAD,andALBO) for
the constrained and unconstrained analytic test cases. COBYLA
(ConstrainedOptimization BYLinearApproximation) isatrust re-gion optimization method that uses an approximating linear in-terpolation model of the objective andconstraint functions [66].
NOMADisa derivative-freealgorithmbasedon aC++
implemen-tationofthemesh adaptivedirectsearch (MADS)algorithm [17],
which is designed to solve difficult blackbox optimization prob-lems.NOMADdiscretizesthedesignspaceintoameshthatrefines adaptively tofind bettersuccessive solutions. Accordingto Audet etal. [67],NOMADisintendedfortime-consumingblackbox sim-ulation witha small numberof variables. AnotherBO algorithm, ALBO [26], is used here to handle mixed-constrained problems.
ALBO combinesan unconstrained BOframework withthe
classi-calaugmentedLagrangian(AL)framework[68].Originallydesigned fortheequalityconstraintproblems [69],ithasbeenextendedto inequality constraints by means of the slack variables. The ALBO procedureis thesameastheALframework exceptthatthe min-imization ofthe AL function is replaced by the maximization of an acquisitionfunction.The newacquisition functionisnot given explicitlybutonlythrough an estimationmethod.We foundthat ALBOisnot suitable forthesolutionoflarge-scale problemsina reasonabletime.
We tested COBYLA and NOMAD on three well-known
two-dimensionalmultimodalunconstrainedproblemsandtestedALBO
onafour-dimensionalmixed-constrainedproblemtocomparethe
convergence tothe globalminimum andthenumber offunction
evaluations.
BothCOBYLAandNOMADrequiretheusertospecifythe max-imum numberoffunction evaluations.For COBYLA,thisis setto 500evaluations,whichismuchhigherthanthenumberof evalu-ationsrequiredtoreachtheconvergenceintheanalytictestcases
we solve. For NOMAD,we try different valuesfor the maximum
number of evaluations keeping in mind that our budget is
lim-ited to 1000 calls. If no stopping criterion is specified, NOMAD stopsassoonasthemeshsizereachesagiventolerance.Because COBYLAandNOMADrequireastartingpoint,itmaybenecessary tousemultistarttofindtheglobaloptimumofthesemultimodal functions.Tousethesamesetofstarting pointsbetweenthetwo algorithms,wegenerateaninitialsetof100pointsusingLatin hy-percubesampling(LHS) [70].
Asexplainedearlier,SEGOMOEusesanLHSstrategytogenerate theinitialDOEtobuildtheGPapproximationsoftheobjectiveand theconstraints.Thus,tocomparethethreeinfill criteria(EI,WB2, and WB2S), we use thesame set of initial LHS-generated points foreachcriterionwithdifferentnumbersofsamplingpoints(5,10, 20,and30points).ThissetofinitialLHS-generatedpoints isalso usedforALBO,which,likeSEGOMOE,requiresaninitialsampleof pointstobuildaGPapproximation.Thesizeofthesamplingsetis alsoastudiedparameter.
WeusedthePythonlibraryScipy [71] forCOBYLA,thePython NOMADframework [72],andDiceOptim [73] withthedefault set-tingsforALBO.
3.3. SensitivityanalysisoftheWB2Sinfillcriterion
Tofindtherelevantvaluesforthe
β
parameterintheWB2S in-fillcriterion(defined inSection2.2.1),we performeda sensitivity analysisstudyfortwoanalyticfunctions(MichalewiczandAckley) definedin AppendicesA.2 andA.1. Thesensitivity analysisstudy consistedof100runsfor32valuesofβ
∈ [
10−6,
109]
,resultingin a totalof3200runsforeachfunction. AninitialDOE of5points was used forthe twotest cases. Thepercentage ofsolutions ob-tainedforvaryingβ
isplottedinFig.3.The success rate for the Michalewicz problem was 100% for
β >
1, as shown in Fig. 3(a). This was expected because s in Eq. (12) increases withβ
and the WB2S criterion tends to the EI criterion whenβ
tends to infinity, ascan be seenin Eq. (10). On the other hand, forβ <
1, the WB2S criterion tends to thesurrogate-based criterion (SBO approach), which minimizes the
kriging-basedsurrogatemodelwithoutanyexploration.Therefore, smallvaluesof
β
leadtopoorperformanceinbothproblems.Fig. 3. Effect of the value ofβin the WB2S infill criterion. One hundred DOE points (initial set with 5 points) were tested with a total of 3200 runs for each function.
FortheAckleyfunction,thepercentageofproblemssolvedwith WB2Sinitiallyincreasedwithincreasing
β
andthentendedto de-creaseforβ >
105 (seeFig. 3(b)).As aconsequence,one canseethattherangeof
β
valuesleadingtoahighersuccessrateisquite largeβ
∈ [
10,
105]
.Thus,evenifthisrangeisproblemdependent (especially the upper bound, as illustrated by the comparisonof Fig.3(a)andFig.3(b)),one mightbeconfident inchoosingan in-termediatevalueforβ
.Inthefollowing,β
=
100 isretained. 3.4. AnalyticunconstrainedmultimodalproblemsAs mentioned above, the main objective of this section is to
demonstratethe performance of SEGOMOEformultimodal
prob-lems. In particular, we want to compare the performance ofthe differentinfillcriteria(EI,WB2,andWB2S).Wefirstconsiderthree analyticunconstrainedmultimodalfunctions:
•
Thesix-hump camel-backfunction (definedin AppendixA.3) isatwo-dimensionalfunctionwithsixminima—twoofwhich are global—thatissmooth andeasy to optimize.The conver-genceisassessedbasedontheobjectivevalueandathreshold of10−3fortheassociatedrelativeerrorgivenbyEq. (22).•
The Michalewicz function (defined in Appendix A.2) exhibitsvalleysandridgeswithatunable steepness.The convergence is assessed based on the objective value and a threshold of 10−3 fortheassociatedrelativeerror (22).
•
TheAckleyfunction(defined inAppendix A.1foran arbitrary numberofdimensionsd)ischaracterizedbymanylocalmin-ima, which makes it difficult to optimize. The value of the
globalminimumisindependentofthenumberofdimensions
andis located inan area of steep gradient. We considerthe d
=
2 case.Theconvergenceisassessedbasedonthe proxim-ityofthe designvariableswiththereferencesolution,asthe relativeerrorontheobjectivefunctionvalueisnotdefinedin thatcase( f(
x∗)
=
0).TheproximityiscomputedwithEq. (23) andathresholdissetto10−3.ForSEGOMOE,asstatedabove,the samesetsofDOE(oneset per initial size) are used for all the criteria for a fair compari-son. Universal kriging models (with squared exponential correla-tion functions and linear regression functions) are used aslocal experts to build the surrogate model of the objective f
(
x)
. The iteration budget is set to 300evaluations. Each computation canbe stopped early, either because convergence has been reached
or because of a singularity arising when enrichment points are tooclose. Theoptimizationoftheinfill SEGOMOEcriterionis per-formedusingtheSLSQPalgorithm [74] withamultistartapproach. The gradients forSLSQP are computedanalytically forspeed and accuracy. The convergence is assessed witha threshold of 10−3
for the relative error (see Eq. (22)) or the proximity index (see Eq. (23)),withaniterationbudgetof300functionevaluations.The resultsaresummarizedinTable1.
Table 1
SEGOMOEresultsfor3600runs(100percombinationofDOEsize,criterion,and function)forthesix-humpcamel-back,Michalewicz,andAckleyproblems,showing thepercentageofrunsthatconvergedtotheanalyticsolution(witharelativeerror thresholdoraproximityindexof10−3).Thebestresultsareshowninbold.
Criterion Results DOE 5 DOE 10 DOE 20 DOE 30 Six-hump camel-back function
WB2 Converged 94% 100% 100% 100% Mean (evaluations) 23 29 40 43 σ (evaluations) 6 7 7 4 EI Converged 100% 100% 100% 100% Mean (evaluations) 39 40 42 44 σ (evaluations) 4 4 4 3 WB2S Converged 100% 99% 100% 100% Mean (evaluations) 39 39 42 44 σ (evaluations) 3 3 4 3 Michalewicz function WB2 Converged 71% 76% 83% 87% Mean (evaluations) 25 31 36 46 σ (evaluations) 14 24 15 13 EI Converged 100% 100% 100% 100% Mean (evaluations) 47 45 51 59 σ (evaluations) 36 43 31 21 WB2S Converged 100% 99% 100% 100% Mean (evaluations) 48 46 51 55 σ (evaluations) 36 33 27 15 Ackley function WB2 Converged 32% 58% 90% 91% Mean (evaluations) 41 55 45 50 σ (evaluations) 30 38 27 17 EI Converged 71% 98% 98% 98% Mean (evaluations) 95 73 56 80 σ (evaluations) 96 68 30 56 WB2S Converged 97% 100% 100% 100% Mean (evaluations) 107 60 58 68 σ (evaluations) 90 45 26 23
In thistable, highersuccessrateswere obtainedindecreasing orderby WB2S(10times),EI (8times),andWB2(3times).WB2
seems moreefficient whenjudged onthebasis ofthe meanand
standard deviationof the evaluations. However, it is also impor-tant toconsider itsrateofconvergence.Forthe Michalewiczand theAckleyfunction,asignificantnumberofrunsdidnotconverge
(between 70% and 40% for the initial DOEs with fewer than 20
points),althoughtheyconvergedwiththeEI andtheWB2S crite-rion.Thiscanbeexplainedbythemultimodalityoftheconsidered functions, assome runsperformedwiththeWB2criterion ended up converging near the local minima.This issue is mitigated as the size of the initial DOE is increased. The use of more points improvesexploitationandreducesthelikelihoodofbeingtrapped near the localoptima. The EI and theWB2S criterion performed similarlyacross allsizes oftheDOEintermsofboth rateof
con-Table 2
Six-hump,Michalewicz,and Ackleyresults usingCOBYLAor NOMAD(100runs). Differentstoppingcriteriarelativetothemaximumnumberofblackboxevaluations weretested.Thebestresultsareshowninbold.
Algorithm COBYLA NOMAD
MAX_BB_EVAL 500 100 300 without Six-hump camel-back Converged 77% 93% 95% 95% Mean (evaluations) 51 100 294 308 σ (evaluations) 5 0 10 25 Michalewicz Converged 29% 87% 89% 89% Mean (evaluations) 60 100 290 300 σ (evaluations) 27 0 13 26 Ackley Converged 38% 98% 100% 100% Mean (evaluations) 59 100 300 499 σ (evaluations) 5 0 0 12
vergenceandmeannumberofevaluations.Thestandarddeviation oftheWB2Swas slightlybetter forDOEsizes rangingfrom10to 30. Thisbehaviorwasexpected,asexplainedinSection2.2.1.
ThecomparisonresultsbetweenCOBYLAandNOMADarelisted
in Table 2. For COBYLA, the convergence to the global minima
wasobtainedwithasignificantly lower successratecompared to
thatforSEGOMOE. However,when COBYLAconverges,itrequires
roughlythesamenumberoffunctionevaluations(between51and 60) asthatofSEGOMOE.Table2showsalsotheNOMADresultsfor thethreeperformance criteriaasafunctionofthestopping
crite-rionchosen (
MAX_BB_EVAL
foramaximumnumberofblackboxevaluationsof100,300,andnolimit).Thus,thenumberof
evalua-tionswithNOMADwasmuchhigherthanthatwiththeSEGOMOE
approachwithasimilarsuccessrate.
TheefficiencyofSEGOMOEwasvalidatedforthreemultimodal
unconstrained problems compared to the two other
derivative-freealgorithms(COBYLAandNOMAD).Intermsofcriteria,EI and WB2Syieldedsimilarresultsintwoproblems,whereasWB2S per-formedbetterinthethirdone.
3.5.Analyticconstrainedmultimodalproblems
Toevaluate theability ofSEGOMOEto tacklemultimodaland constrainedproblems,weusedthefollowingtwotestcases:
•
The modified Branin problem, which is a two-dimensionalnonlinearproblemwithasinglenonlinearconstraint [65].This modification, which was proposed by Parr et al. [65], adapts the original Branin functionto havea single globaloptimum andtwolocalones,ratherthanthreeoptimaofequalvalue.
min
x∈[−5,10]×[0,15]f
(x)
g
(x)
≥
0,
(24)where f
(
x)
and g(
x)
are as detailed in Appendix A.4. The feasible space for this problem consists of isolatedfeasi-ble regions within their design space, which often occurs
forseverely constrainedpracticalcases.The problemfeatures three distinct feasible regions delimited by green curves in Fig.4,makingitanexcellenttestforconstraintstrategies [25, 65].
•
The Linear-Ackley-Hartman (LAH)mixed-constrainedproblemwith fourinput dimensions, a linear objective,andtwo con-straintsliketheonedescribedbyPichenyet al. [26]
⎧
⎪
⎨
⎪
⎩
min x∈[0,1]4 f(x)
g(x)
≤
0 h(x)
=
0 (25)The first inequality constraint is the Ackley function in four dimensions, andthe secondone isan equalityconstraint
fol-lowing the Hartman four-dimensional function. The problem
isdenoted by LAHin thefollowing,anddetails on the func-tion expressions and the referencesolution are given in Ap-pendixA.5.
For these two constrained test cases, we compared SEGOMOE,
COBYLA,andNOMADwiththeALBOframework.AstheLAH
prob-lemisamixed-constrainedproblem,theequalityconstrainth
(
x)
in Eq. (25) hasbeentransformed intotwo inequality constraintsforCOBYLA andNOMAD.SEGOMOEand ALBOare both ableto
con-sidermixed-constraintproblems.
In our tests, we sought to quantify the impact ofthe size of theinitial DOEandthe choiceofinfillcriterion forSEGOMOE.As for theunconstrained case, the results were obtained by consid-eringuniversalkriging models(withsquaredexponential correla-tion functionsandlinear regressionfunctions)aslocalexpertsto buildthesurrogatemodelsofboththeobjective f
(
x)
andthe con-straints g(
x)
and h(
x)
. The iteration budget was still set to 300 evaluations.OptimizationunderconstraintsoftheinfillSEGOMOE criterionwas stillperformedusingtheSLSQP algorithm [74] with amultistartapproach.BecausetheinitialDOEwascomputedusing anenhancedstochasticevolutionary(ESE)algorithm [70] for build-ingtheLHS,weranthesamecase100timestoobtainstatistically significant results.Furthermore,toreduce thebiasthat could ex-istbetweentheresultsobtainedwithdifferentinfillcriteriashould differentbatchesofinitialDOEsbeused,wekeptthesameinitial batchesforthecriteria(size-wise)bymeansofahotstartfeatureimplementedinSEGOMOE.ThesameDOEswereusedtoinitialize
theALBOframework.
To compare the performances of the different algorithms
(SEGOMOE, ALBO, COBYLA, and NOMAD), we used the criteria
definedinSection3.1.Forthemodified Braninproblem, the
con-vergence was assessed for each feasible point found using the
relative error of the objective with respect to the known solu-tion(seeEq. (22)),andaconvergencethresholdof10−3.Apointis
onlyapriorifeasiblewhentherelative violationoftheconstraint islessthan10−4.FortheLAHproblem,aconvergencethresholdof
10−3 (see Eq. (23))onthe proximityindexwas used.The results
oftheoptimizationsaresummarizedinTable3.
From Table 3, we can see that the convergence success rate increaseswiththesizeoftheinitialDOEforallthecriteria
consid-ered,reachingnearly100%forSEGOMOEwithWB2orWB2Swhen
morethan20pointswereused.Whencomparingthethree crite-riaweused,wefoundthatWB2Swassignificantlybetterthanthe other two for sparse initial DOEs (sizes of 5 and 10), for which
it had a 69% success rate for modified Branin and 10% for the
LAHfunction.Thiscanbeattributedtothefactthat,inWB2S,the scaling eases exploration. Forall criteria, the meanof the SEGO-MOE cost increased withthe size ofthe initial DOE; however, it is importantto note that the variance decreased.The lower suc-cess rates for the sparser initial DOEs can be explained by the inaccuracyofthesurrogateoftheconstraint,whichpreventedthe enrichmentoptimizerfromsearchingthepromisingareas.
If the WB2 and WB2S criteria both gave a convergence
suc-cessratecloseto100%foralltheDOEsizes (10,20,and30)with asimilar numberoffunctionevaluations(meanandvariance)for the LAH function, the EI criterion was less efficient on this
ex-ample. Concerningthe other BOalgorithm, ALBOconvergedwith
a success rateof lessthan 45% with a large number offunction evaluations(greaterthan 50).The ALBOresults ontheLAH
func-Table 3
Resultsfor3200runs(100percombinationofDOEsizeandcriterion)ofthe modi-fiedBraninandtheLAHproblems.DOEsrangefrom5to30points,andthenumber offunctionevaluationstoreachtheminimumvalueisreportedbothforSEGOMOE dependingontheinfillcriterionused(EI,WB2,orWB2S)andforALBO.The num-berofconvergedrunstotheanalyticsolution(witharelativeerrorthresholdora proximityindexof10−3)isgiveninpercent.Thebestresultsareshowninbold.
Criterion Results DOE 5 DOE 10 DOE 20 DOE 30 Modified Branin function
WB2 Converged 62% 78% 93% 99% Mean (evaluations) 38 39 53 51 σ(evaluations) 28 22 20 9 EI Converged 50% 75% 96% 100% Mean (evaluations) 27 45 49 45 σ(evaluations) 22 29 21 8 WB2S Converged 69% 76% 90% 100% Mean (evaluations) 34 39 49 47 σ(evaluations) 20 17 19 9 ALBO Converged 41% 32% 35% 19% Mean (evaluations) 58 57 70 70 σ(evaluations) 15 13 8 7 LAH function WB2 Converged 100% 100% 100% 100% Mean (evaluations) 17 19 28 37 σ(evaluations) 4 3 3 3 EI Converged 96% 86% 80% 68% Mean (evaluations) 45 33 43 43 σ(evaluations) 31 15 15 11 WB2S Converged 100% 100% 100% 100% Mean (evaluations) 18 20 29 37 σ(evaluations) 3 2 3 3 ALBO Converged 5% 8% 13% 27% Mean (evaluations) 29 50 65 84 σ(evaluations) 11 7 8 8
tionarequitedifferentfromthosepresentedbyPichenyet al. [26]; thiscouldbe explainedby thedifference intherelative violation oftheconstraint,i.e.,10−4 insteadof10−2.
To visualize the enrichment process in two dimension, Fig. 4 showsthemodified Branin function, wherethethree feasible
re-gions are shown in green. The global optimum of the modified
Branin caseisat theborder ofthe feasibleregion atthe bottom
right-hand corner ofthe contourplots. When the WB2Sor WB2
infillcriterionwasused,thesolutionsuccessfullyconvergedtothe globalminimum.However,whentheconventionalEIcriterionwas used,thesolutionremainedtrappedinthefeasibleregion discov-eredby the initial DOE.This showsthat the numerical improve-ment of WB2S relative to EI eases the enrichment optimization, allowingabetterexploration.Moreover,theincreasedcontribution
of EI within WB2S, compared to WB2, enabled the optimizer to
identifythesecond feasibleregionand, therefore,toconvergefor thisparticularconfiguration.
As for the unconstrained test cases, comparisons were
per-formed with the DFO algorithms (COBYLA and NOMAD), as
re-ported inTable 4, where 100 runs (correspondingto 100
differ-ent starting points froman LHS) were performedand the same
thresholdcriterionof10−3 fortherelative errorortheproximity
index was used. For the modified Branin problem, the
conver-gence successrate was very low for COBYLA(12%). For NOMAD,
theconstraintwastreatedwiththeProgressiveBarrieroption[75], whichneedstobesatisfiedonlyatthesolutionandnotnecessarily attheintermediatepoints (relaxableconstraint).The convergence successratewasverylow(lessthan27%),andthenumberof func-tionevaluationswasstillsignificant(morethan300).Thesevalues
werecomparedwithfewerthan51evaluationsfortheSEGOMOE
results(seeTable3). FortheLAHproblem,COBYLAwith53%
con-Table 4
ModifiedBraninandLAHproblemswithCOBYLAorNOMAD(100runs).Different stoppingcriteriarelativetothemaximumnumberofevaluationsweretested.The bestresultsareshowninbold.
Algorithms COBYLA NOMAD
MAX_BB_EVAL 500 100 300 Without Modified Branin Converged 12% 20% 27% 27% Mean (evaluations) 60 100 300 641 σ(evaluations) 6 0 0 123 LAH Converged 53% 0% 0% – Mean (evaluations) 42 – – – σ(evaluations) 11 – – –
vergence successratehada numberoffunction evaluationsquite
comparableto theSEGOMOE-EIresults. ForNOMAD,theequality
constraintwastreatedastwoinequalityconstraintswiththe Pro-gressive Barrieroption[75].Theconvergencesuccessratewas0% forthe10−3 thresholdandwas verylow(2%)ifweincreasedthe
thresholdvalueto10−1 (requiring500functionevaluationsinthis case).InthisLAHtestcase, amaximumnumberoffunction
eval-uations
MAX_BB_EVAL
is required toobtain a reasonable a CPUtime.Theobtainedvalueswerecomparedwithfewerthan43 eval-uationsfortheSEGOMOEresults(seeTable3).Tosumuponthese
two constrainedtestcases,SEGOMOEwiththeEI,WB2, orWB2S
criterion was still more efficient than ALBO, NOMAD,or COBYLA
intermsofbothconvergencesuccessrateandnumberoffunction evaluations.
In thefive analyticmultimodal optimizationproblems consid-eredinthissection,EIandWB2Shada similarbehaviorinsome cases, which justifies the use of a scalar multiplier s
>
1 in the WB2S formula to promote the exploration phase. In other cases, WB2S gave better results than those of EI; especially when the size of the initial DOE was small, its convergence success ratewas closeto100%.WhenEIfoundaglobaloptimum,WB2S
man-agedtofinditalso,althoughthereciprocitywasnotalways satis-fied.
4. Applicationtowingaerodynamicshapeoptimization
Inthe previoussections,we demonstratedtheperformance of
SEGOMOEforanalytic cases.Wenow demonstrateitona design
optimization problemthat ismore representative ofa real-world problem:anonlinearlyconstrainedwingaerodynamicshape opti-mizationproblemwheretheobjectivefunctionismultimodal. We
compareSEGOMOEwithagradient-basedoptimizer(SNOPT)that
requiresamultistartapproachtoreachtheglobaloptimum.These comparisons are based on the performance criteria described in Section3.1.
The final problem is a simplified version of an aerodynamic
shapeoptimizationproblemdefinedbytheADODG,whichhasone
equalityconstraintandexhibitsmultimodality.
4.1. High-fidelityaerodynamicshapeoptimizationframework
To perform the aerodynamic shape optimization, we use a
framework that combines a Reynolds-averaged Navier–Stokes
(RANS) CFD solver, a geometry parametrization engine, and a
mesh perturbation algorithm. The CFD solver is ADflow, which
uses asecond-orderfinite-volume schemetosolvethe compress-ible Euler equations,laminar Navier–Stokes, and RANS equations (steady, unsteady, and time periodic) on overset meshes [76,77]. The Spalart–Allmaras turbulencemodel [78] is used to complete
Table 5
DefinitionofthesimplifiedADODGCase 6optimizationproblem. Function/
variable
Description Quantity Range Minimize CD Drag coefficient 1
with respect to α Angle of attack 1 [−3.0,6.0] (◦)
θ Twist 8 [−3.12,3.12] (◦)
δ Dihedral 8 [−0.25,0.25] (δ/c) Total variables 17
subject to CL=0.2625 Lift coefficient 1 Total constraints 1
we reducethe numberofvariables. Thiscasewas devisedto ex-ploretheexistenceofmultiplelocalminimainaerodynamicwing design.Thebaselinegeometryisarectangularwingwitha chord of1.0 manda NACA 0012airfoilcrosssection withasharp
trail-ing edge. The semi-span is 3.06 m and the wing is fitted with
a rounded wingtip cap. In the full benchmark problem, the
op-timizer is given freedom to change the twist, chord, dihedral, sweep, span, and sectional shape variables. In the modified ver-sion used for this study, we reduced the variables to twist and dihedral, for a total of 17 design variables. The geometry is pa-rameterizedusing theFFD approachimplemented inpyGeo [79], which allows the definition of globaldesign variables with con-trol of the sections of the B-spline control points. Nine sections
are defined along the span of the wing with heavier clustering
towardthewingtip.Thetwistvariablesrotateeightofthese sec-tions (excluding the root section) about the quarter-chord. Like-wise, each of the eight dihedral variables controls the vertical displacementof one ofthe spanwisesections,excluding the root section. Theangleof attack canbe varied toallow the optimizer to satisfy the lift constraint. The objective of the problem is to performalift-constraineddragminimizationofasimplewing un-dersubsonicflow(M∞ = 0.5),usingtheEulerequations.Themain characteristicsoftheoptimizationproblemaresummarizedin Ta-ble5.
4.3. Gradient-basedoptimizationresults
The SEGOMOE approach is compared to SNOPT on the same
test case. These gradient-based results,aswell asthose of other cases relatedto ADODG Case 6, are presented inmore detail by Bons et al. [6]. Here, we just compare the SEGOMOE results for theEuler-basedtwistanddihedraloptimizationcase,and summa-rizethecorresponding resultsforcompleteness.The optimization problem is solved using SNOPT [83] starting from the 15 ran-domshapesshowninFig.5(a),whereweuseacoarsemeshgrid (L3 mesh with180K cells). The average numberof iterations re-quiredtoconvergeisapproximately100.Nevertheless,inaddition
Table 6
ResultsobtainedwithSNOPTandADflowusingtheadjointmethod.Theglobal op-timumiswritteninbold.
CD×104(drag counts) Number of runs Mean constraint violation
39.9091 5 7×10−10 40.1971 3 1×10−9 40.1972 3 1×10−8 40.1975 2 6×10−10 40.1986 1 8×10−10 40.1995 1 5×10−10
tothestandardevaluation,ADflowalsoneedstocomputethe gra-dients foreachaerodynamic-related function(dragandlift),each ofwhichrequiresan adjointsolution.Therefore,forthisproblem,
the mean evaluation budget is equivalent to approximately 300
standard ADflowcalls. Theoptimalshapesare showninFig.5(b), andthe optimalresultsaresummarized inTable6.Thisproblem has multiple local optima and a single global one, although the designsareverycloseintermsofdragvalue:39.9091dragcounts fortheglobaloptimumandaround40.1980dragcountsforthe lo-cal ones.Ofthe15runs,5convergedtowardtheglobaloptimum,
which is characterized by an upward winglet. For the other 10
runs, multiplelocalminimawere found forshapeswitha down-wardwinglet.
The radar chart in Fig. 6 shows the values of the 17 design variablesforthevariousoptima.Wecanseethatall localminima differ only slightly on the twist parameters and on the dihedral parametersintheinnerpartofthewing.Themultimodalityofthis problem, together withthe complexity oftheequalityconstraint, makesthisachallengingprovinggroundforSEGOMOE.
4.4. SEGOMOEresults
We performeda series of studies similar to those performed for theanalytic functions inSections 3.4and3.5.The runs were performedusingafixed evaluationbudget of500evaluations, in-cluding the initial DOE anditerations. Sixinitial DOE sizes were considered,rangingfrom1
×
d=
17 to6×
d=
102.Giventhe di-mensionoftheproblem,KPLS+Ksurrogateswereselectedaslocal expertsoftheMOEbecausewehavefoundinpreviousstudiesthat theyofferthebesttrade-offbetweenefficiencyandaccuracy [42]. ThetoleranceontheCL wassetto10−5,andtheenrichmentstepwas performedusingthe SLSQPoptimizer [74] with50randomly pickedstartingpointsateachstep.
Both the WB2 andthe WB2S criterion were tested using the
samebatchesofinitialDOEs(18DOEsperbatch,computedusing an optimized LHSalgorithm) for eachsize ofinitial DOE,leading to atotalof216runs. Aspreviously mentioned,thisallowsusto
Fig. 5. SNOPTresultsforwingoptimizationwithrespecttothetwistanddihedralvariables.Multiplelocalminimawerefoundwhenstartingfrom15randomlygenerated initialconfigurations.