Adaptive modeling strategy for constrained global optimization with application to aerodynamic wing design

(1)

an author's

https://oatao.univ-toulouse.fr/23683

https://doi.org/10.1016/j.ast.2019.03.041

Bartoli, Nathalie and Lefebvre, Thierry and Dubreuil, Sylvain and Olivanti, Romain and Priem, Rémy and Bons,

Nicolas and Martins, Joaquim R.R. A. and Morlier, Joseph Adaptive modeling strategy for constrained global

optimization with application to aerodynamic wing design. ( 2019) Aerospace Science and Technology. ISSN

1270-9638

(2)

Adaptive

modeling

strategy

for

constrained

global

optimization

with

application

to

aerodynamic

wing

design

N. Bartoli

a,

∗

,

T. Lefebvre

a

,

S. Dubreuil

a

,

R. Olivanti

b

,

R. Priem

a

,

N. Bons

c

,

J.R.R.A. Martins

c

,

J. Morlier

d

a_ONERA/DTIS,_Université_de_Toulouse,_Toulouse,_France b_Université_de_Toulouse,_{ISAE-SUPAERO,}_Toulouse,_France c_University_of_Michigan,_Ann_Arbor,_MI,_48109,_United_States

d_Université_de_Toulouse,_{ISAE-SUPAERO-INSA-Mines}_Albi-UPS,_CNRS_UMR5312,_Institut_Clément_Ader,_Toulouse,_France

a

b

s

t

r

a

c

t

Keywords: Surrogatemodeling Globaloptimization Multimodaloptimization Mixtureofexperts

Aerodynamicshapeoptimization Wingdesign

Surrogate models are often used to reduce the cost of design optimization problems that involve computationallycostlymodels,suchascomputationalfluiddynamicssimulations.However,thenumber ofevaluationsrequiredbysurrogatemodelsusuallyscalespoorlywiththenumberofdesignvariables, andthereisaneedforbothbetterconstraintformulationsandmultimodalfunctionhandling.Toaddress thisissue,wedeveloped asurrogate-based gradient-freeoptimizationalgorithmthatcan handlecases where the functionevaluations are expensive,the computational budget islimited, the functions are multimodal, and the optimization problem includes nonlinear equality or inequality constraints. The proposed algorithm—superefficient globaloptimizationcoupledwith mixtureofexperts(SEGOMOE)— cantacklecomplexconstraineddesignoptimizationproblemsthroughtheuseofanenrichmentstrategy based on a mixture of experts coupled with adaptive surrogate models. The performance of this approachwasevaluatedforanalyticconstrainedandunconstrainedproblems,aswellasforamultimodal aerodynamicshapeoptimizationproblemwith17designvariablesandanequalityconstraint.Ourresults showedthatthemethodisefficientandthattheoptimumismuchlessdependentonthestartingpoint thantheconventionalgradient-basedoptimization.

1. Introduction

Inaerodynamicshapeoptimization,realistic wingdesign

prob-lems require both a large number of design variables and

con-straints.Atypical probleminvolvesafew tensofdesignvariables andobjectiveandconstraintfunctionsthatrequiresigniﬁcantCPU time.Becausethecomputationalbudgetisusually restricted,only alimitednumberofobjectiveandconstraintfunctionevaluations

(afew hundreds) can be performed to ﬁndthe best design. The

emergence of adjoint methods [1–3] was a considerable

break-through in the ﬁeld of aerodynamic shape optimization because

it renders the cost of computing objective and constraint gradi-ents independentof the numberof design variables. In conjunc-tionwithgradient-basedoptimizers,thisprovidesaneﬃcientway

to convergeto a localminimum in a high-dimensional problem.

Althoughairfoil andwingdesign optimization problemsare uni-modal [4,5], once the wing planform is allowed to change, the

*

Correspondingauthor.

E-mailaddress:[email protected](N. Bartoli).

design space becomes multimodal [6]. More generally, in many engineeringdesignoptimizationproblems,we donotknowa pri-ori whether thedesign spaceis multimodalornot and perform-ing a multistart is oftenintractable in termsof CPUtime. When the gradient information is not available, an interesting option is to use gradient-free optimizers known as derivative-free opti-mization (DFO) methods [7–10]. Stochastic DFO methods mostly fall intothe categoryof evolutionaryalgorithms andare eﬃcient to solve constrained problems [11]. One of the most promising evolutionaryalgorithms thatcan handleconstraintsisthe covari-ance matrix adaptation evolution strategy (CMA-ES) [12] and its variants [13–15]. They have gained popularity in the solution of high-dimensional problems.Theironly drawback isstill thelarge numberoffunctionevaluationsrequiredtoﬁndtheoptimum [16]. Deterministic DFO methods,such asmesh adaptivedirect search (MADS), coordinate search (CS), and generalized pattern search (GPS),havebeenproposed [10].Thesemethodshavebeen imple-mentedindifferentpackagessuchasNOMAD [17],HOPSPACK [18], andtheDFLLibrary [19].DeterministicDFO methodsperform lo-calsearchesandthereforedependontheinitialstartingpoint.For

(3)

solving a multimodalproblem, a multistartapproach is generally usedandthenumberoffunctionevaluationsincreases,oftenwith

more than a thousand calls. In addition, some DFO algorithms,

suchasNOMAD,artificiallyincreasethenumberofinequality con-straintsto manageequalityconstraints,makingconvergenceeven moredifficult.Surrogate-basedoptimization(SBO)orBayesian op-timization(BO)approachesbuildaninexpensive approximationof theoriginal function that canbe used torapidly findan approx-imate optimum [20–23]. In BO, an acquisition function is built from Gaussian process approximations of the objective and con-straintfunctions [24].ForsomeBOframeworks [17,25],inequality constraints are an issue. Most of the cited algorithms consider equalityconstraintsastwo inequality constraints,which becomes intractable for large numbers of constraints. The augmented La-grangian framework ALBO [26] handles mixed-constrained prob-lems; however,itisstill notsuitable forsolvinglarge-scale prob-lems ina reasonable time.Various reviewsonSBO canbe found intheliterature [27–30].

AnotherBOalternative is basedon sequentialenrichment ap-pliedtoeﬃcientglobaloptimization(EGO) [21] usingan adaptive surrogate model.EGOuses a krigingmodel (also calledGaussian process inthe machine learning community[31]) asasubstitute forhigh-ﬁdelity models,takingadvantageofthepredictionofthe variance that is built into these models to inform the adaptive sampling [32–34]. Wessing and Preuss [16] performed a

com-parison betweenEGOand CMA-ES formultimodalunconstrained

problems.TheyconcludedthatEGOiseffectiveinﬁndingmultiple optimawhenthebudgetoffunctionevaluationsislimited.To han-dleconstrainedproblems,Sasenaet al. [35] proposedanextension

of EGO called SuperEGO. SBO has been applied to aerodynamic

shape optimizationproblemsin previous research efforts [36,37], some ofwhichhaveusedadaptivesampling [38–40]. Despitethe workscitedabove,thenumberofdesignvariablesthathavebeen

handled so far is insuﬃcient for wing design optimization and

multimodalitywitharestrictedbudgetisstillanissue.

Forrealistic aircraftwingshapeoptimizationproblems,the re-quired number of design variables exceeds 200 [4], and,

there-fore, trying to directly solve the problems using EGO with a

conventional kriging approach is not feasible. We recently

pro-posed to use EGO with a new type of kriging model adapted

tohigh-dimensionalproblems(namely,KPLSandKPLS+K) [41,42]. Theresultingoptimizationapproach,whichwecallSEGOKPLS(+K),

was demonstrated in problems with up to 50 design variables

that were solved using approximately a hundred function evalu-ations [43]. To handle nonlinear functionsthat vary signiﬁcantly within a wide domain, researchers have proposed to cluster the dataandconstructanassemblyoflocalsurrogates,knownas mix-tureofexperts (MOE),whichfacilitateglobaloptimization [44–48].

The ﬁrst contribution of this study is the extension of the

SEGOKPLS algorithm to handlehighly nonlinear and non-smooth

functionsby usingMOE withKPLSorKPLS+Kmodelsasthelocal experts.Wehavepreviouslypresentedanapproachcombining Su-perEGOandMOE(SEGOMOE),withpreliminaryresultsforanalytic functions [49].We startby constructingsurrogate modelsforthe objective and constraint functions by combining automatic clus-teringand best expertselection. Then, we approach thesolution iteratively by balancing the exploration and exploitation phases with a new proposed criterion for the acquisition function.

An-alytic test cases are presentedto compare SEGOMOEwith other

DFOalgorithmssuchasCOBYLA,NOMAD,orALBO.

Thesecondcontributionisthecomparisonoftheproposed ap-proachwithagradient-basedalgorithmforanaerodynamicshape optimizationproblem.Thisproblemisbasedon abenchmark

de-velopedby the AIAA Aerodynamic Design andOptimization

Dis-cussion Group (ADODG) [50]. The case that we solve is a

sim-pliﬁed version of ADODG Case 6, which is a subsonic wing

de-signproblemwithamultimodalobjective.Theaerodynamicmodel is expensive, requiring high-performance parallel computing re-sources to solve the Reynolds-averaged Navier–Stokes equations usingcomputationalfluiddynamics(CFD).Theobjectiveisto min-imize the drag coefficient at a given lift coefficient. The design variables are the angle of attack, twist distribution,and dihedral distributionofthewing,foratotalof17designvariables.Multiple optima have beenidentified by using a gradient-basedoptimizer starting fromdifferent design points. The goalof this studywas

to compare the effectiveness of SEGOMOE versus an established

gradient-basedapproachinconvergingtotheglobaloptimum. We starttherestofthispaperbysummarizingthepreviously developed methodsthat are relevanttothe presentworkandby introducing a new enrichment criterion (i.e.,WB2S). We demon-strate theabilityoftheproposed approachto ﬁndtheglobal op-tima for ﬁve analytic test cases that are multimodal. Then, we

demonstrate the effectiveness of SEGOMOE in ﬁnding the global

optimumoftheaerodynamicshapeoptimizationproblemand dis-cusshowitcomparestoagradient-basedmethod.

2. TheSEGOMOEapproach

Inthissection,wedescribetheproposedapproach(SEGOMOE), whichsolvestheconstrainedoptimizationproblem

min

x∈ y

(x)

s.t. c1

(x)

≤

0

, . . . ,

cm

(x)

≤

0

,

(1) where

⊂ R

d deﬁnesthe design space. Equalityconstraints can beconsideredaparticularcasewhereci

(

x

)

=

0.

Westartthissectionwithadescriptionofpreviouslydeveloped techniques(Section2.1)andfollowwithadescriptionofthenovel contributionsinthepresentwork(Section2.2).

2.1. BackgroundonSEGO

TheapproachproposedinthispaperbuildsuponSEGO,which, inturn,isbasedonEGO.Therefore,westart withan overviewof EGO andfollowwithan explanationof howconstraintsare han-dled inSEGO.Wealsoincludea descriptionofamorelocalinﬁll criterion (WB2). Finally, we describe the techniques that we use tohandlehigh-dimensionaldesignspaceseﬃciently(namely,KPLS andKPLS+K).

2.1.1. OriginalEGO

Asmentioned intheIntroduction,theproposedalgorithmwas inspired by EGO [21]. The main idea of the EGOapproach is to assume that the unknown objective function is a realization of a Gaussian process (GP) (alsoknown askriging [31,51]). This re-quires aninitial designofexperiments (DOE)ofnDOE points X

=

x(i)

_,

_i

₌

₁

_{, . . . ,}

_n

DOE

withx(i)

_{∈ R}

d_,_where_the_objective_function

is evaluated to obtain the samples y

=

y

(

x(i)

_),

_i

₌

₁

_{, . . . ,}

_n

DOE

withy

(

x(i)

₎

_{∈ R}

_._Then,_we_can_build_a_conditioned_GP_that approx-imatestheobjectivefunctionatanypointx byaGaussianrandom variableY

ˆ

(

x

)

withameanof

ˆ

y

(x)

= ˆ

μ

+

rt_xXR−1

y

−

1

μ

ˆ

(2)

andastandarddeviationof

ˆ

s2

(x)

= ˆ

σ

2

1

−

rt_xXR−1rxX

,

(3)

where 1 is an nDOE

×

1 column vector of 1’s, rxX

= {

k

(

x

,

x(i)

),

i

=

1

,

. . . ,

nDOE

}

, R is the covariancematrixofcomponents Ri j

=

k

(

x(i)

_,

_x(j)

₎

_,_and_k

_(.,

_.)

_is_a_given_covariance_function._In_this_work, weusethesquaredexponentialcovariancefunction

(4)

k

(x,

x

)

= ˆ

σ

2 d

i=1 exp

−θ

i

xi

−

xi

2

∀θ

i

∈ R

+

,

∀

i

∈ [

1

, . . . ,

d

].

(4)

The scalar parameters

μ

ˆ

,

σ

ˆ

, and

θ

i

,

i

=

1

,

. . . ,

d, can be found

using a number of methods; here, we use likelihood

maximiza-tion [22,31].UsingthisGaussianapproximation,theEGOalgorithm initiallyproposedbyJonesetal. [21] iterativelyaddspointstothe DOEto increase theaccuracy ofthe GP.Toﬁnd newpoints xnew

thathelptheoveralloptimization,theEGOalgorithmusesthe ex-pectedimprovement (EI)criterion,

EI

(x)

= E

max

(

0

,

ymin

− ˆ

Y

(x))

,

(5)

where ymin is theminimum value ofthe objectivefunction over

theDOE,i.e., ymin

=

min i∈[1,...,nDOE]

y

(

x(i)

)

.BecauseY

ˆ

(

x

)

isaGaussian randomvariabledeﬁnedbyits mean (2) andits variance (3),the EIcriterioncanbewrittenas

EI

(x)

=

⎧

⎪

⎨

⎪

⎩

ymin

− ˆ

y

(x)

ymin− ˆy(x) ˆ s(x)

+ ˆ

s

(x)φ

,

if

ˆ

s

>

0 0

,

if

ˆ

s

=

0

,

(6)

where

φ (.)

istheprobability densityfunctionand

(.)

isthe cu-mulativedistributionfunctionofthestandardnormaldistribution. Thisexpressionhighlightsthetrade-offbetweenexploitation ofthe Gaussian surrogate model andexploration of the design space. If

(.)

is large when

ˆ

y

(

x

)

is small compared to ymin, then the EI

criterionpromotesexploitation.Ontheotherhand,if

φ (.)

islarge whens

ˆ

(

x

)

islarge,thenitpromotesexploration.

Each iteration of the EGO algorithm consists of three main steps:

1. Construct a conditioned GP deﬁned by a mean (2) and the variance (3) basedonaDOEofsizenDOE.

2. Maximizetheexpectedimprovement (6).

3. AddanewpointtotheDOE(xnew)andevaluateit( ynew). This iterativeprocess is repeated from an initial DOE until con-vergence.Theusualstoppingcriterionisthemaximumnumberof functionevaluationscorrespondingtoacomputationalbudget. 2.1.2. Handlingconstraints

The original EGO algorithm outlined above was designed to

minimize unconstrained functions. Because mostengineering de-sign problems are subject to constraints, it is crucial to have a soundstrategyto handledesignconstraints. Althoughwe can al-ways add constraint functions to the objective as penalties, this approachisineﬃcientandinaccurate.Toaddressthisissue,Sasena et al. [52] proposedan approachtheycalledsuper-eﬃcientglobal optimization (SEGO), which can solve general nonlinearly mixed constrainedproblems.AsintheEGOapproach,theobjective func-tion y isapproximatedbyaGP.Toconsiderthem nonlinear con-straintsci

,

i

=

1

,

· · · ,

m, weconstructasurrogatemodelforeach

constraint ci, denoted by

ˆ

ci, usually based on the same DOE as

thatused forthe objectivefunction.This resultsin thefollowing constrainedoptimizationproblem:

max

x∈f

EI

(x),

(7)

where the feasible domain

f is deﬁnedby the nonlinear

con-straints,i.e.,

f

:= {

x

∈ R

d

: ˆ

c1

(x)

≤

0

, . . . ,

c

ˆ

m

(x)

≤

0

},

(8)

whereequalityconstraints,c

ˆ

i

(

x

)

=

0,couldreplaceorbeaddedto

thesetofinequalityconstraintsabove.Ateachiteration,thepoint

xnew that solves thisoptimization problem is added to the DOE andtheprocess isrepeateduntiltheconvergence.Evenifapoint

xnew_is_not_feasible,_evaluating_the_true_functions_adds_information

totheDOE.

Inthisprocedure,theiterativeconstructionofthevarious sur-rogatemodels(oneforthe objectivefunctionandm forthe con-straints)isdrivenonly bytheEI oftheobjectivefunction. There-fore, if the surrogate models of the constraints are not accurate enough,theaccuracyoftheoptimumiscompromised.

In some examples,the optimalsolution ofEq. (7) cannot sat-isfy the “true” constraint functions ci

(

x

)

because only the mean

valueoftheGPc

ˆ

i

(

x

)

isusedtoapproximatetheconstraintsduring

the optimization process. To consider the associated error esti-mation

ˆ

s2

ci

(

x

)

,we should implementdifferentstrategies.For

han-dlingconstraintsinaBayesianoptimizationframeworkconsidering the errorestimation, various approachesare possible: probabilis-tic [53], expected violation [54], predictive entropy search with constraints [55],orslackvariableswithLagrangianformulationto handle mixed-constraint problems [26]. Work is still in progress whenitcomestoaddressingthischallenge [56].

2.1.3. Inﬁllcriterion

Theoptimizationproblem (7) ismultimodal,anditis challeng-ingtoﬁndtheglobalmaximumwithoutincurringalarge compu-tationalcost.Toimprovetheeﬃciencyofthisoptimization,Sasena et al. [52] recommendedtousethefollowingcriterionnamed “lo-catingtheregionalextreme”proposedbyWatsonandBarnes [57]:

WB2

(x)

= − ˆ

y

(x)

+

EI

(x),

(9)

wherethemeanvalueoftheGaussiansurrogateissubtractedfrom the EI. The result isthat thiscriterion (WB2) isless multimodal thantheEI,whicheasesthesolutionoftheglobaloptimization,as demonstratedbySasena [58].

Nonetheless, the WB2 criterion can, in certain cases, prevent the algorithm fromﬁnding the global minimum,yielding a local minimuminstead.In fact,themagnitudeofthetermEI

(

x

)

is ex-pectedtodecreaseduringtheiterativeprocess,astheGPsurrogate

model becomes more accurate inthe promising areasof the

de-signspace. Thus,after afew iterations,the WB2criterionis only drivenbythemeanvalueoftheGPsurrogatemodel,whichmakes thealgorithmfocusonlyontheexploitation ofthesurrogatemodel ratherthanontheexploration ofthedesignspace.Toaddressthese issues,wedevelopedanewcriterionthatbuildsonWB2,whichis describedinSection2.2.1.

2.1.4. Handlingalargenumberofdesignvariables

One ofthe issues withSEGO isthe scalabilitywiththe num-berofdesignvariablesd.ToconstructtheGPoftheobjectiveand the constraint functions, we have to estimate the hyperparame-ters

θ

i

,

i

=

1

,

. . . ,

d ofthecovariancefunction (4). Theestimation

ofthesehyperparametersby maximizationofthelikelihood func-tioncanbetime-consuming,especiallywhenthenumberofdesign variablesishigh(d

>

10).Thisisbecausethelikelihoodfunctionis multimodal,requiringalargenumberofinversionsofthe correla-tionmatrixtomaximizeit.

Arecently developedsurrogate technique, krigingwith partial leastsquares(KPLS), was proposedby Bouhlelet al. [41] to han-dle the large number of variables forGaussian processes (up to 100variables). Thistechnique usesthe partialleastsquares(PLS) methodtoreducethenumberofhyperparametersandthesizeof theestimationproblem.Bouhlelet al. [42] subsequentlyimproved KPLS by addinga newstep in the construction of the surrogate

(5)

model, which improves the accuracy for high-dimensional

prob-lems (KPLS+K).These approacheswere usedwithin the SEGO

al-gorithm,andtheywere demonstratedtobeeﬃcientforproblems withupto50designvariables [43].

Inthisstudy,weuseKPLSandKPLS+Ktomodelboththe objec-tivefunctionandtheconstraints.TheuseoftheKPLS(+K)method

in the SEGO algorithm is what enables us to solve nonlinearly

constrainedoptimizationproblemswitha largenumberofdesign variables(d

>

10).

2.2. SEGOMOEanditsimprovements

In this section, we describe the improvements we made to

SEGOMOEthat constitute the contributions ofthe present study. Thesecontributionsincludea newenrichmenttechnique that im-proves the overall performance ofthe optimization, and an MOE approachthat improvestheaccuracyofthesurrogatemodelover awiderangeinthedesignspace.

2.2.1. Newinﬁllcriterion

As discussed in Section 2.1.3, the WB2 criterion (9) improves someoftheissuesofEI;however,thelackofscalingbetweenthe EI termandthepredictionofthemodelcan compromisethe ex-plorationpropertiesofthemethod.Therefore,weaddthescaling

WB2S

(x)

=

s EI

(x)

− ˆ

y

(x),

(10)

wheres isanon-negativescale,EIisgivenbyEq. (9),andy

ˆ

(

x

)

is givenbyEq. (2).

Thisnewcriterion(WB2S)hastwoobjectives:

1. Keep the exploration property of the expected improvement metricoverthefeasibledesignspace

f.

2. KeeptheoriginalpropertiesoftheWB2metricby smoothing ittofacilitateoptimization.

The second condition is fulﬁlledasin the original WB2criterion bypenalizing theexpectedimprovementwiththeterm

− ˆ

y

(

x

)

in Eq. (10).Theﬁrstconditionismorediﬃculttosatisfyandwe use thefollowingheuristic.Intuitively,thisconditioncanbetranslated to arg max x∈f EI

(x)

≈

arg max x∈f WB2S

(x).

(11)

Toenforcethiscondition,weneedtoatleastensurethat,forx

₌

arg maxx∈fEI

(

x

)

, the following inequality is veriﬁed: sEI

(

x

₎

_>

ˆ

y

(

x

₎

_. _As _a _consequence, _we _deﬁne _s

_{= β| ˆ}

_y

₍

_x

₎

_|/

_EI

₍

_x

₎

_, _where

β >

1. This heuristic approach does not guarantee that Eq. (11) is true because thisdepends on the variation of y

(

x

₎

_over

f.

However, by setting a relatively large value forthe parameter

β

, weexpectthisheuristictogiveanacceptableapproximation.

Finally,anewapproximationisperformedtoimprovethe com-putationaleﬃciencyoftheproposedcriterion.Findingx_over

f

is a diﬃcult task, andwe prefer to redeﬁnex withthe follow-ingapproximation: x

₌

_{arg max}

x∈X0EI

(

x

)

,wheretheﬁnitesubset X0

⊂

f contains thestarting points usedinthemultistart

opti-mizationofthecriterion.Thefollowingprocedureisusedto com-putes

>

0:

1. Compute EI foreach starting point in theoptimization (in a multistartapproach).

2. Evaluate the prediction ofthe surrogate model forthe point withthehighestEI.

3. Computethescales suchthat

s

=

⎧

⎨

⎩

β

| ˆy xstart,EImax| EIxstart,EImax if EI

xstart,EImax

=

0 1 if EI

xstart,EImax

=

0

,

(12)

where

β >

1 is a scalingfactor.In ourexperiments,we have foundthataﬁxedvalueof

β

=

100 workswell;however,this couldbe adjustedduringoptimizationiftheEIvaluesaretoo smallcomparedtothey

ˆ

(

x

)

values.Someanalyticexperiments willbepresentedinSection3.3tostudythesensitivity analy-sisofthe

β

parameter.

Toillustratethedifferentbehaviorsassociatedwitheach crite-rion, we provideaone-dimensional exampleinFig.1.The objec-tivefunction,itsassociatedGPmean,andtheGPuncertainty(with a conﬁdenceinterval of 99%)are plottedinFig.1(a). InFig. 1(c), theexplorationphaseislessobviouswiththeWB2criterion com-paredto thatwiththe EIorthe WB2Scriterion inFigs.1(b)and 1(d), respectively. The maximum value of WB2 is located close to the minimumofthe krigingprediction (see Fig.1 atx

≈

0

.

4), which illustratesthat, inthatcase, WB2focusesmoreonthe ex-ploitationofthesurrogatemodelratherthanontheexploration.In contrast,theEI andWB2Scriteriareachtheir maximumatx

=

0, whichallowstoexploreapromisingareaofthedesignspace.The two advantages of WB2S are illustrated in Fig. 1(d). Similarly to the EI criterion, the WB2S criterion facilitates exploration of the threelocalmaxima.Ontheother hand,theWB2criterionismore unimodal, whichpreventstheoptimizerfromgettingstuckinﬂat areasintheintervalx

∈ [

0

.

8

,

1

]

,wherethegradientiszero. 2.2.2. Mixtureofexperts

Oneofthemaincontributionsofthisstudyisthecombination

ofMOE withEGO.ThemotivationforusingMOE comesfrom

in-dustrial optimizationproblemsforwhichboththeobjective func-tion andthe constraints mightbe strongly nonlinear, discontinu-ous, orboth.Insuch cases,agoodapproximationoverthe whole designspaceby akrigingmodelmightbeinaccurateorrequirea large-sizeDOE.

To dealwith theapproximation of highly nonlinear functions, some researchershave proposed the MOE technique [59,60]. The keyideaistoconstructdifferentlocalapproximations(experts)for differentdomainsinthedesignspace. Theselocalapproximations canbetailoredtodealwithdisparatelocaltrendsinthefunction, includingﬂatregions,discontinuities,andstrongnonlinearities.

We use MOE with KPLS(+K) surrogate models as the experts

andimplementedthesemethods intheSurrogateModeling

Tool-box [61]. MOE relies on the expectation-maximization algorithm forGaussianmixturemodels [62].Theinputspaceisclustered to-getherwithitsoutputvaluebymeansofaparameterestimationof thejointdistribution.Alocalexpert(e.g.,polynomialﬁt,radial ba-sisfunctions,andkriging)isthenbuiltoneachcluster,andallthe localexpertsarethencombinedusingtheGaussianmixturemodel

parameters found by the expectation-maximization algorithm to

obtainaglobalmodel.

Inthisapproach,the Gaussianmixture modelisusedto com-bine the data to both partition the input space and derive the

mixing proportion. To perform the clustering, we need n

in-puts, X

=

x(i)

_,

_i

₌

₁

_{, . . . ,}

_n

_,_and_the_{corresponding}_outputs, _y

₌

y

(

x(i)

_),

_i

₌

₁

_{, . . . ,}

_n

_. _Therefore, _we _can _only _know _the _cluster posteriorprobabilitiesofvectorslike

(

x(i)

_,

_y

₍

_x(i)

₎₎

_{∈ R}

d+1_._To

pre-dict the cluster posterior probabilities ofa sample knowing only its inputs, wemust projecteach multivariateGaussian functionk intheGaussianmixturemodel(trainedind

+

1 dimensions)onto theinputspace,whichhasd dimensions.

Thus,foreachclusterk,wecreateamultivariateGaussian func-tionin

(

d

+

1

)

-dimensionalspacewiththecovariancematrix,

k

=

_kX

ν

k

ν

T k

ξ

k

,

(13)

(6)

(7)

outputs given by a local model corresponding to the cluster are allowed.Thismethodcanbeeﬃcientfordiscontinuous functions. Moreover,themixtureofexpertscanalsopredicttheuncertainty: for one sample, we choose the uncertainty corresponding to its cluster.

Forbothrecombinations,themixtureofexpertsbasedon krig-ingmodelsapproximatesfunctionswithaheterogeneousbehavior, providing a global modelwith a prediction of both the Jacobian andtheuncertainty.Theproposedalgorithmautomaticallychooses thebesttypeofrecombination,accordingtoacross-validation pro-cedure.TheMOEcanbewrittenasthesum

K

i=1

α

i

N

(

y

ˆ

i

,

ˆ

s2i

)

=

N

_K

i=1

α

iy

ˆ

i

,

K

i=1

α

2_i

ˆ

s2_i

=

N

ˆ

y

,

ˆ

s2

,

(18) where

α

i isdeﬁnedbyEq. (17),and y

ˆ

i

(

x

)

ands

ˆ

2i

(

x

)

aregivenby

Eqs. (2) and (3),respectively. Thus, theMOE can alsopredict the uncertainty.Thisinformationisusefultodefinetheinfillcriterion usedintheEGOorSEGOalgorithm.TheEI,WB2,andWB2S crite-riadefinedby Eqs.(6), (9),and (10), respectively,areadapted for

theMOEmodelsusing

EIMOE

(x)

=

⎧

⎪

⎨

⎪

⎩

ymin

− ˆ

y

(x)

+ ˆ

s

(x)φ

,

ifs

ˆ

>

0 0

,

ifs

ˆ

=

0

,

(19) where

φ (.)

istheprobability densityfunction and

(.)

isthe cu-mulativedistributionfunction ofthestandardnormaldistribution

N (

0

,

1

)

. The only difference betweenEq. (6) and Eq. (19) is the natureof y.

ˆ

BecauseMOEhasmultiplemodels,

ˆ

y isderived from Eq. (18) andexpressedasacombinationofkriging-basedlocal ex-perts,i.e.,

ˆ

y

(x)

=

K

i=1

α

iy

ˆ

i

(x),

(20)

where y

ˆ

i

(

x

)

isgivenby Eq. (2). Theassociated variance

ˆ

s2 is

ex-pressedinthesameway,i.e.,

ˆ

s2

(x)

=

K

i=1

α

2_i

ˆ

s2_i

(x),

(21) where

ˆ

s2_i isgivenby Eq. (3).Becausetheinﬁll criterionconsiders onlytheobjectivefunctionwiththeknowledgeoftheuncertainty estimation(seeEq. (3)),wehaveawiderchoicefortheconstraint surrogatemodels. Thesecouldbe asingle surrogate oramixture oflocalexperts.Analyticderivativesoftheinﬁllcriterion(EI,WB2, orWB2S)arealsoavailableandcomputedforgradient-based opti-mization.

2.2.4. TheSEGOMOEalgorithm

Once recombinationis achieved, MOE is used in combination

with the SEGO algorithm to solve global optimization problems

subjectto nonlinear constraints involving a large number of de-signvariables (d

>

10) andpotentially highly nonlinear objective andconstraintfunctions.Figure2showsaﬂowchartofthe SEGO-MOEalgorithmusedinthisstudy.Themainstepsinthealgorithm areasfollows:

1. ConstructtheinitialDOEandbuildtheassociatedMOEmodels fortheobjectiveandconstraintfunctions.

2. Maximizetheinﬁllcriterion(EI,WB2,orthenewcriterion de-ﬁned in Section 2.2.1) subject to the design constraints and variablebounds,andproposethenewenrichmentpoint.

Fig. 2. Overview of the SEGOMOE algorithm.

3. Computethevaluesofthe objectiveandconstraintfunctions atthenewenrichmentpoint.

4. Checkifthenew enrichmentpoint isin thefeasible domain ornot,andidentifyinactive,active,andviolatedconstraints. 5. Returnto Step 2 andupdate theDOE until thestopping

cri-terion is met. A common criterion is the maximum number

offunctioncallscorrespondingtotheavailable computational budget.

Tosolvetheconstrainedoptimizationproblem(withinequality and/orequalityconstraints)inStep 2,wecanuseagradient-based algorithmoraderivative-freeoptimizer.Becauseweuselocal opti-mizers,we performamultistartoptimizationwithdifferent start-ing points (10 points by default). The gradient-based optimizers usetheanalyticJacobianoftheMOE(asdescribedinSection2.2.3) to compute the gradients ofboth theobjective function and the constraintsoftheglobaloptimizationproblem (7).

Regarding thenumberofclustersinvolved intheMOE,Bartoli et al. [49] compared the eﬃciency ofSEGOMOE with K clusters versusone clusterontheMOPTAtestcase [63].Itwasfoundthat the number of function evaluations could be decreased by auto-matically choosing the number of clusters K (see [49] for more detailsontheproposedstrategy)andthat,ineverytestcase

stud-ied, the use of MOE in the EGO approach never increases the

numberoffunctionevaluationsrequiredforconvergence.Asa con-sequence,inthefollowing,theSEGOMOEframework isusedwith theautomatic clusternumberapproachalreadypresentedby Bar-toli et al. [49]. The algorithm is implemented within the NASA OpenMDAO framework [64] and,additionally, can use surrogates availablewithintheSurrogateModelingToolbox [61].

3. Analyticbenchmarkproblems

This section presents numerical results that demonstrate the

ability of SEGOMOE to solve multimodal analytic optimization

problems. A more realistic application is presented in Section 4. Here, ﬁve multimodal analytic optimization problems (three

(8)

con-sidered. The three unconstrained problems are well known

two-dimensional benchmark analytic functions: the six-hump

camel-backfunction,the Michalewiczfunction,andthe Ackleyfunction. Theﬁrstconstrainedproblemisatwo-dimensionalanalytic prob-lemproposedbyParret al. [65] thathastwodesignvariablesand oneinequalityconstraint.Thesolutionconsistsofoneglobal

min-imumandtwo local minima.The second constrained problemis

afour-dimensional analytic problemproposed by [26] that hasa (known)linearobjective andtwo constraints:one inequality and one equality. Before dealing with the numerical results, we re-view ﬁrst the performance criteria based either onthe objective

value at the optimal point or on the global minimum location

and the choice of the

β

parameter for the WB2S criterion (see

Section 2.2.1). We also introduce derivative-free optimizers to

comparetheirperformance withthatofSEGOMOE. Wechooseto

considerCOBYLA,NOMAD,andALBO, whicharethree

derivative-freeoptimizersavailableasanopen-sourcetoolboxthatcanhandle nonlinearconstraints.

3.1.Performancecriteria

We use three performance criteria to compare the results

for SEGOMOE with those for the other optimization algorithms

(COBYLA,NOMAD,andALBO).

1. Percentageofconvergedruns

2. Meannumberoffunctionevaluationsforconvergedruns 3. Standarddeviationofthe numberoffunction evaluationsfor

convergedruns.

The convergence is assessed by measuring the error in the

ob-jective value or theproximity ofthe design variables. When the objectivevalueisused,therelativeerroris

RE

=

|

f

∗

opt

−

fref∗

|

f_ref∗

|

,

(22)

where f_opt∗ is thebest point given by each algorithm and f_ref∗ is

the known reference solution given in Appendix A. When using

theproximity of thedesign variables, we measurethe difference betweentwosolutionsx1andx2 using

μ

prox

(x

1

,

x2

)

=

1

−

1 d d

i=1

x2i

−

x1i

ubi

−

lbi

,

(23)

whereubi andlbi denotethe upperbound andthe lowerbound,

respectively,oftheithdesignvariable.Thedistancesarescaledto

[

0

,

1

]

toconferthesameweightonthedesignvariables.

In the test cases presented here, one of the solutions is the referenceandtheother one istheoptimalpointprovided bythe algorithm.Toensurethat anoptimizationisconverged,we check that1

−

μ

prox

(

x1

,

x2

)

≤

,where

isagiventhresholdvalue(10−3

bydefault).The typeofvalidationandtheconvergencethreshold arespeciﬁedforeachproblem.Itiscrucialtolookatthiscriterion becausetheothertwoareonlycomputedforconvergedruns. 3.2.Derivative-freeoptimizers:COBYLA,NOMAD,andALBO

To evaluate the surrogate-based strategies, we consider the otherderivative-free algorithms(COBYLA, NOMAD,andALBO) for

the constrained and unconstrained analytic test cases. COBYLA

(ConstrainedOptimization BYLinearApproximation) isatrust re-gion optimization method that uses an approximating linear in-terpolation model of the objective andconstraint functions [66].

NOMADisa derivative-freealgorithmbasedon aC++

implemen-tationofthemesh adaptivedirectsearch (MADS)algorithm [17],

which is designed to solve difficult blackbox optimization prob-lems.NOMADdiscretizesthedesignspaceintoameshthatrefines adaptively tofind bettersuccessive solutions. Accordingto Audet etal. [67],NOMADisintendedfortime-consumingblackbox sim-ulation witha small numberof variables. AnotherBO algorithm, ALBO [26], is used here to handle mixed-constrained problems.

ALBO combinesan unconstrained BOframework withthe

classi-calaugmentedLagrangian(AL)framework[68].Originallydesigned fortheequalityconstraintproblems [69],ithasbeenextendedto inequality constraints by means of the slack variables. The ALBO procedureis thesameastheALframework exceptthatthe min-imization ofthe AL function is replaced by the maximization of an acquisitionfunction.The newacquisition functionisnot given explicitlybutonlythrough an estimationmethod.We foundthat ALBOisnot suitable forthesolutionoflarge-scale problemsina reasonabletime.

We tested COBYLA and NOMAD on three well-known

two-dimensionalmultimodalunconstrainedproblemsandtestedALBO

onafour-dimensionalmixed-constrainedproblemtocomparethe

convergence tothe globalminimum andthenumber offunction

evaluations.

BothCOBYLAandNOMADrequiretheusertospecifythe max-imum numberoffunction evaluations.For COBYLA,thisis setto 500evaluations,whichismuchhigherthanthenumberof evalu-ationsrequiredtoreachtheconvergenceintheanalytictestcases

we solve. For NOMAD,we try different valuesfor the maximum

number of evaluations keeping in mind that our budget is

lim-ited to 1000 calls. If no stopping criterion is speciﬁed, NOMAD stopsassoonasthemeshsizereachesagiventolerance.Because COBYLAandNOMADrequireastartingpoint,itmaybenecessary tousemultistarttoﬁndtheglobaloptimumofthesemultimodal functions.Tousethesamesetofstarting pointsbetweenthetwo algorithms,wegenerateaninitialsetof100pointsusingLatin hy-percubesampling(LHS) [70].

Asexplainedearlier,SEGOMOEusesanLHSstrategytogenerate theinitialDOEtobuildtheGPapproximationsoftheobjectiveand theconstraints.Thus,tocomparethethreeinﬁll criteria(EI,WB2, and WB2S), we use thesame set of initial LHS-generated points foreachcriterionwithdifferentnumbersofsamplingpoints(5,10, 20,and30points).ThissetofinitialLHS-generatedpoints isalso usedforALBO,which,likeSEGOMOE,requiresaninitialsampleof pointstobuildaGPapproximation.Thesizeofthesamplingsetis alsoastudiedparameter.

WeusedthePythonlibraryScipy [71] forCOBYLA,thePython NOMADframework [72],andDiceOptim [73] withthedefault set-tingsforALBO.

3.3. SensitivityanalysisoftheWB2Sinﬁllcriterion

Toﬁndtherelevantvaluesforthe

β

parameterintheWB2S in-fillcriterion(defined inSection2.2.1),we performeda sensitivity analysisstudyfortwoanalyticfunctions(MichalewiczandAckley) definedin AppendicesA.2 andA.1. Thesensitivity analysisstudy consistedof100runsfor32valuesof

β

∈ [

10−6

,

109

]

,resultingin a totalof3200runsforeachfunction. AninitialDOE of5points was used forthe twotest cases. Thepercentage ofsolutions ob-tainedforvarying

β

isplottedinFig.3.

The success rate for the Michalewicz problem was 100% for

β >

1, as shown in Fig. 3(a). This was expected because s in Eq. (12) increases with

β

and the WB2S criterion tends to the EI criterion when

β

tends to inﬁnity, ascan be seenin Eq. (10). On the other hand, for

β <

1, the WB2S criterion tends to the

surrogate-based criterion (SBO approach), which minimizes the

kriging-basedsurrogatemodelwithoutanyexploration.Therefore, smallvaluesof

β

leadtopoorperformanceinbothproblems.

(9)

Fig. 3. Effect of the value ofβin the WB2S inﬁll criterion. One hundred DOE points (initial set with 5 points) were tested with a total of 3200 runs for each function.

FortheAckleyfunction,thepercentageofproblemssolvedwith WB2Sinitiallyincreasedwithincreasing

β

andthentendedto de-creasefor

β >

105 _(see_Fig. ₃_(b))._As _a_consequence,_one _can_see

thattherangeof

β

valuesleadingtoahighersuccessrateisquite large

β

∈ [

10

,

105

]

.Thus,evenifthisrangeisproblemdependent (especially the upper bound, as illustrated by the comparisonof Fig.3(a)andFig.3(b)),one mightbeconﬁdent inchoosingan in-termediatevaluefor

β

.Inthefollowing,

β

=

100 isretained. 3.4. Analyticunconstrainedmultimodalproblems

As mentioned above, the main objective of this section is to

demonstratethe performance of SEGOMOEformultimodal

prob-lems. In particular, we want to compare the performance ofthe differentinﬁllcriteria(EI,WB2,andWB2S).Weﬁrstconsiderthree analyticunconstrainedmultimodalfunctions:

•

Thesix-hump camel-backfunction (deﬁnedin AppendixA.3) isatwo-dimensionalfunctionwithsixminima—twoofwhich are global—thatissmooth andeasy to optimize.The conver-genceisassessedbasedontheobjectivevalueandathreshold of10−3_for_the_associated_relative_error_given_by_{Eq. (}₂₂_).

•

The Michalewicz function (deﬁned in Appendix A.2) exhibits

valleysandridgeswithatunable steepness.The convergence is assessed based on the objective value and a threshold of 10−3 fortheassociatedrelativeerror (22).

•

TheAckleyfunction(deﬁned inAppendix A.1foran arbitrary numberofdimensionsd)ischaracterizedbymanylocal

min-ima, which makes it diﬃcult to optimize. The value of the

globalminimumisindependentofthenumberofdimensions

andis located inan area of steep gradient. We considerthe d

=

2 case.Theconvergenceisassessedbasedonthe proxim-ityofthe designvariableswiththereferencesolution,asthe relativeerrorontheobjectivefunctionvalueisnotdeﬁnedin thatcase( f

(

x∗

)

=

0).TheproximityiscomputedwithEq. (23) andathresholdissetto10−3.

ForSEGOMOE,asstatedabove,the samesetsofDOE(oneset per initial size) are used for all the criteria for a fair compari-son. Universal kriging models (with squared exponential correla-tion functions and linear regression functions) are used aslocal experts to build the surrogate model of the objective f

(

x

)

. The iteration budget is set to 300evaluations. Each computation can

be stopped early, either because convergence has been reached

or because of a singularity arising when enrichment points are tooclose. Theoptimizationoftheinﬁll SEGOMOEcriterionis per-formedusingtheSLSQPalgorithm [74] withamultistartapproach. The gradients forSLSQP are computedanalytically forspeed and accuracy. The convergence is assessed witha threshold of 10−3

for the relative error (see Eq. (22)) or the proximity index (see Eq. (23)),withaniterationbudgetof300functionevaluations.The resultsaresummarizedinTable1.

Table 1

SEGOMOEresultsfor3600runs(100percombinationofDOEsize,criterion,and function)forthesix-humpcamel-back,Michalewicz,andAckleyproblems,showing thepercentageofrunsthatconvergedtotheanalyticsolution(witharelativeerror thresholdoraproximityindexof10−3_)._The_best_results_are_shown_in_bold.

Criterion Results DOE 5 DOE 10 DOE 20 DOE 30 Six-hump camel-back function

WB2 Converged 94% 100% 100% 100% Mean (evaluations) 23 29 40 43 σ (evaluations) 6 7 7 4 EI Converged 100% 100% 100% 100% Mean (evaluations) 39 40 42 44 σ (evaluations) 4 4 4 3 WB2S Converged 100% 99% 100% 100% Mean (evaluations) 39 39 42 44 σ (evaluations) 3 3 4 3 Michalewicz function WB2 Converged 71% 76% 83% 87% Mean (evaluations) 25 31 36 46 σ (evaluations) 14 24 15 13 EI Converged 100% 100% 100% 100% Mean (evaluations) 47 45 51 59 σ (evaluations) 36 43 31 21 WB2S Converged 100% 99% 100% 100% Mean (evaluations) 48 46 51 55 σ (evaluations) 36 33 27 15 Ackley function WB2 Converged 32% 58% 90% 91% Mean (evaluations) 41 55 45 50 σ (evaluations) 30 38 27 17 EI Converged 71% 98% 98% 98% Mean (evaluations) 95 73 56 80 σ (evaluations) 96 68 30 56 WB2S Converged 97% 100% 100% 100% Mean (evaluations) 107 60 58 68 σ (evaluations) 90 45 26 23

In thistable, highersuccessrateswere obtainedindecreasing orderby WB2S(10times),EI (8times),andWB2(3times).WB2

seems moreeﬃcient whenjudged onthebasis ofthe meanand

standard deviationof the evaluations. However, it is also impor-tant toconsider itsrateofconvergence.Forthe Michalewiczand theAckleyfunction,asigniﬁcantnumberofrunsdidnotconverge

(between 70% and 40% for the initial DOEs with fewer than 20

points),althoughtheyconvergedwiththeEI andtheWB2S crite-rion.Thiscanbeexplainedbythemultimodalityoftheconsidered functions, assome runsperformedwiththeWB2criterion ended up converging near the local minima.This issue is mitigated as the size of the initial DOE is increased. The use of more points improvesexploitationandreducesthelikelihoodofbeingtrapped near the localoptima. The EI and theWB2S criterion performed similarlyacross allsizes oftheDOEintermsofboth rateof

(10)

con-Table 2

Six-hump,Michalewicz,and Ackleyresults usingCOBYLAor NOMAD(100runs). Differentstoppingcriteriarelativetothemaximumnumberofblackboxevaluations weretested.Thebestresultsareshowninbold.

Algorithm COBYLA NOMAD

MAX_BB_EVAL 500 100 300 without Six-hump camel-back Converged 77% 93% 95% 95% Mean (evaluations) 51 100 294 308 σ (evaluations) 5 0 10 25 Michalewicz Converged 29% 87% 89% 89% Mean (evaluations) 60 100 290 300 σ (evaluations) 27 0 13 26 Ackley Converged 38% 98% 100% 100% Mean (evaluations) 59 100 300 499 σ (evaluations) 5 0 0 12

vergenceandmeannumberofevaluations.Thestandarddeviation oftheWB2Swas slightlybetter forDOEsizes rangingfrom10to 30. Thisbehaviorwasexpected,asexplainedinSection2.2.1.

ThecomparisonresultsbetweenCOBYLAandNOMADarelisted

in Table 2. For COBYLA, the convergence to the global minima

wasobtainedwithasigniﬁcantly lower successratecompared to

thatforSEGOMOE. However,when COBYLAconverges,itrequires

roughlythesamenumberoffunctionevaluations(between51and 60) asthatofSEGOMOE.Table2showsalsotheNOMADresultsfor thethreeperformance criteriaasafunctionofthestopping

crite-rionchosen (

MAX_BB_EVAL

foramaximumnumberofblackbox

evaluationsof100,300,andnolimit).Thus,thenumberof

evalua-tionswithNOMADwasmuchhigherthanthatwiththeSEGOMOE

approachwithasimilarsuccessrate.

TheeﬃciencyofSEGOMOEwasvalidatedforthreemultimodal

unconstrained problems compared to the two other

derivative-freealgorithms(COBYLAandNOMAD).Intermsofcriteria,EI and WB2Syieldedsimilarresultsintwoproblems,whereasWB2S per-formedbetterinthethirdone.

3.5.Analyticconstrainedmultimodalproblems

Toevaluate theability ofSEGOMOEto tacklemultimodaland constrainedproblems,weusedthefollowingtwotestcases:

•

The modiﬁed Branin problem, which is a two-dimensional

nonlinearproblemwithasinglenonlinearconstraint [65].This modiﬁcation, which was proposed by Parr et al. [65], adapts the original Branin functionto havea single globaloptimum andtwolocalones,ratherthanthreeoptimaofequalvalue.

min

x∈[−5,10]×[0,15]f

(x)

g

(x)

≥

0

,

(24)

where f

(

x

)

and g

(

x

)

are as detailed in Appendix A.4. The feasible space for this problem consists of isolated

feasi-ble regions within their design space, which often occurs

forseverely constrainedpracticalcases.The problemfeatures three distinct feasible regions delimited by green curves in Fig.4,makingitanexcellenttestforconstraintstrategies [25, 65].

•

The Linear-Ackley-Hartman (LAH)mixed-constrainedproblem

with fourinput dimensions, a linear objective,andtwo con-straintsliketheonedescribedbyPichenyet al. [26]

⎧

⎪

⎨

⎪

⎩

min x∈[0,1]4 f

(x)

g

(x)

≤

0 h

(x)

=

0 (25)

The ﬁrst inequality constraint is the Ackley function in four dimensions, andthe secondone isan equalityconstraint

fol-lowing the Hartman four-dimensional function. The problem

isdenoted by LAHin thefollowing,anddetails on the func-tion expressions and the referencesolution are given in Ap-pendixA.5.

For these two constrained test cases, we compared SEGOMOE,

COBYLA,andNOMADwiththeALBOframework.AstheLAH

prob-lemisamixed-constrainedproblem,theequalityconstrainth

(

x

)

in Eq. (25) hasbeentransformed intotwo inequality constraintsfor

COBYLA andNOMAD.SEGOMOEand ALBOare both ableto

con-sidermixed-constraintproblems.

In our tests, we sought to quantify the impact ofthe size of theinitial DOEandthe choiceofinﬁllcriterion forSEGOMOE.As for theunconstrained case, the results were obtained by consid-eringuniversalkriging models(withsquaredexponential correla-tion functionsandlinear regressionfunctions)aslocalexpertsto buildthesurrogatemodelsofboththeobjective f

(

x

)

andthe con-straints g

(

x

)

and h

(

x

)

. The iteration budget was still set to 300 evaluations.OptimizationunderconstraintsoftheinfillSEGOMOE criterionwas stillperformedusingtheSLSQP algorithm [74] with amultistartapproach.BecausetheinitialDOEwascomputedusing anenhancedstochasticevolutionary(ESE)algorithm [70] for build-ingtheLHS,weranthesamecase100timestoobtainstatistically significant results.Furthermore,toreduce thebiasthat could ex-istbetweentheresultsobtainedwithdifferentinfillcriteriashould differentbatchesofinitialDOEsbeused,wekeptthesameinitial batchesforthecriteria(size-wise)bymeansofahotstartfeature

implementedinSEGOMOE.ThesameDOEswereusedtoinitialize

theALBOframework.

To compare the performances of the different algorithms

(SEGOMOE, ALBO, COBYLA, and NOMAD), we used the criteria

deﬁnedinSection3.1.Forthemodiﬁed Braninproblem, the

con-vergence was assessed for each feasible point found using the

relative error of the objective with respect to the known solu-tion(seeEq. (22)),andaconvergencethresholdof10−3_._A_point_is

onlyapriorifeasiblewhentherelative violationoftheconstraint islessthan10−4_._For_the_LAH_problem,_a_convergence_threshold_of

10−3 _(see _{Eq. (}₂₃₎₎_on_the _proximity_index_was _used._The _results

oftheoptimizationsaresummarizedinTable3.

From Table 3, we can see that the convergence success rate increaseswiththesizeoftheinitialDOEforallthecriteria

consid-ered,reachingnearly100%forSEGOMOEwithWB2orWB2Swhen

morethan20pointswereused.Whencomparingthethree crite-riaweused,wefoundthatWB2Swassigniﬁcantlybetterthanthe other two for sparse initial DOEs (sizes of 5 and 10), for which

it had a 69% success rate for modiﬁed Branin and 10% for the

LAHfunction.Thiscanbeattributedtothefactthat,inWB2S,the scaling eases exploration. Forall criteria, the meanof the SEGO-MOE cost increased withthe size ofthe initial DOE; however, it is importantto note that the variance decreased.The lower suc-cess rates for the sparser initial DOEs can be explained by the inaccuracyofthesurrogateoftheconstraint,whichpreventedthe enrichmentoptimizerfromsearchingthepromisingareas.

If the WB2 and WB2S criteria both gave a convergence

suc-cessratecloseto100%foralltheDOEsizes (10,20,and30)with asimilar numberoffunctionevaluations(meanandvariance)for the LAH function, the EI criterion was less eﬃcient on this

ex-ample. Concerningthe other BOalgorithm, ALBOconvergedwith

a success rateof lessthan 45% with a large number offunction evaluations(greaterthan 50).The ALBOresults ontheLAH

(11)

func-Table 3

Resultsfor3200runs(100percombinationofDOEsizeandcriterion)ofthe modi-ﬁedBraninandtheLAHproblems.DOEsrangefrom5to30points,andthenumber offunctionevaluationstoreachtheminimumvalueisreportedbothforSEGOMOE dependingontheinﬁllcriterionused(EI,WB2,orWB2S)andforALBO.The num-berofconvergedrunstotheanalyticsolution(witharelativeerrorthresholdora proximityindexof10−3₎_is_given_in_percent._The_best_results_are_shown_in_bold.

Criterion Results DOE 5 DOE 10 DOE 20 DOE 30 Modiﬁed Branin function

WB2 Converged 62% 78% 93% 99% Mean (evaluations) 38 39 53 51 σ(evaluations) 28 22 20 9 EI Converged 50% 75% 96% 100% Mean (evaluations) 27 45 49 45 σ(evaluations) 22 29 21 8 WB2S Converged 69% 76% 90% 100% Mean (evaluations) 34 39 49 47 σ(evaluations) 20 17 19 9 ALBO Converged 41% 32% 35% 19% Mean (evaluations) 58 57 70 70 σ(evaluations) 15 13 8 7 LAH function WB2 Converged 100% 100% 100% 100% Mean (evaluations) 17 19 28 37 σ(evaluations) 4 3 3 3 EI Converged 96% 86% 80% 68% Mean (evaluations) 45 33 43 43 σ(evaluations) 31 15 15 11 WB2S Converged 100% 100% 100% 100% Mean (evaluations) 18 20 29 37 σ(evaluations) 3 2 3 3 ALBO Converged 5% 8% 13% 27% Mean (evaluations) 29 50 65 84 σ(evaluations) 11 7 8 8

tionarequitedifferentfromthosepresentedbyPichenyet al. [26]; thiscouldbe explainedby thedifference intherelative violation oftheconstraint,i.e.,10−4 insteadof10−2.

To visualize the enrichment process in two dimension, Fig. 4 showsthemodiﬁed Branin function, wherethethree feasible

re-gions are shown in green. The global optimum of the modiﬁed

Branin caseisat theborder ofthe feasibleregion atthe bottom

right-hand corner ofthe contourplots. When the WB2Sor WB2

inﬁllcriterionwasused,thesolutionsuccessfullyconvergedtothe globalminimum.However,whentheconventionalEIcriterionwas used,thesolutionremainedtrappedinthefeasibleregion discov-eredby the initial DOE.This showsthat the numerical improve-ment of WB2S relative to EI eases the enrichment optimization, allowingabetterexploration.Moreover,theincreasedcontribution

of EI within WB2S, compared to WB2, enabled the optimizer to

identifythesecond feasibleregionand, therefore,toconvergefor thisparticularconﬁguration.

As for the unconstrained test cases, comparisons were

per-formed with the DFO algorithms (COBYLA and NOMAD), as

re-ported inTable 4, where 100 runs (correspondingto 100

differ-ent starting points froman LHS) were performedand the same

thresholdcriterionof10−3 fortherelative errorortheproximity

index was used. For the modiﬁed Branin problem, the

conver-gence successrate was very low for COBYLA(12%). For NOMAD,

theconstraintwastreatedwiththeProgressiveBarrieroption[75], whichneedstobesatisﬁedonlyatthesolutionandnotnecessarily attheintermediatepoints (relaxableconstraint).The convergence successratewasverylow(lessthan27%),andthenumberof func-tionevaluationswasstillsigniﬁcant(morethan300).Thesevalues

werecomparedwithfewerthan51evaluationsfortheSEGOMOE

results(seeTable3). FortheLAHproblem,COBYLAwith53%

con-Table 4

ModiﬁedBraninandLAHproblemswithCOBYLAorNOMAD(100runs).Different stoppingcriteriarelativetothemaximumnumberofevaluationsweretested.The bestresultsareshowninbold.

Algorithms COBYLA NOMAD

MAX_BB_EVAL 500 100 300 Without Modiﬁed Branin Converged 12% 20% 27% 27% Mean (evaluations) 60 100 300 641 σ(evaluations) 6 0 0 123 LAH Converged 53% 0% 0% – Mean (evaluations) 42 – – – σ(evaluations) 11 – – –

vergence successratehada numberoffunction evaluationsquite

comparableto theSEGOMOE-EIresults. ForNOMAD,theequality

constraintwastreatedastwoinequalityconstraintswiththe Pro-gressive Barrieroption[75].Theconvergencesuccessratewas0% forthe10−3 _threshold_and_was _very_low_(2%)_if_we_increased_the

thresholdvalueto10−1 (requiring500functionevaluationsinthis case).InthisLAHtestcase, amaximumnumberoffunction

eval-uations

MAX_BB_EVAL

is required toobtain a reasonable a CPU

time.Theobtainedvalueswerecomparedwithfewerthan43 eval-uationsfortheSEGOMOEresults(seeTable3).Tosumuponthese

two constrainedtestcases,SEGOMOEwiththeEI,WB2, orWB2S

criterion was still more eﬃcient than ALBO, NOMAD,or COBYLA

intermsofbothconvergencesuccessrateandnumberoffunction evaluations.

In theﬁve analyticmultimodal optimizationproblems consid-eredinthissection,EIandWB2Shada similarbehaviorinsome cases, which justiﬁes the use of a scalar multiplier s

>

1 in the WB2S formula to promote the exploration phase. In other cases, WB2S gave better results than those of EI; especially when the size of the initial DOE was small, its convergence success rate

was closeto100%.WhenEIfoundaglobaloptimum,WB2S

man-agedtoﬁnditalso,althoughthereciprocitywasnotalways satis-ﬁed.

4. Applicationtowingaerodynamicshapeoptimization

Inthe previoussections,we demonstratedtheperformance of

SEGOMOEforanalytic cases.Wenow demonstrateitona design

optimization problemthat ismore representative ofa real-world problem:anonlinearlyconstrainedwingaerodynamicshape opti-mizationproblemwheretheobjectivefunctionismultimodal. We

compareSEGOMOEwithagradient-basedoptimizer(SNOPT)that

requiresamultistartapproachtoreachtheglobaloptimum.These comparisons are based on the performance criteria described in Section3.1.

The ﬁnal problem is a simpliﬁed version of an aerodynamic

shapeoptimizationproblemdeﬁnedbytheADODG,whichhasone

equalityconstraintandexhibitsmultimodality.

4.1. High-ﬁdelityaerodynamicshapeoptimizationframework

To perform the aerodynamic shape optimization, we use a

framework that combines a Reynolds-averaged Navier–Stokes

(RANS) CFD solver, a geometry parametrization engine, and a

mesh perturbation algorithm. The CFD solver is ADﬂow, which

uses asecond-orderﬁnite-volume schemetosolvethe compress-ible Euler equations,laminar Navier–Stokes, and RANS equations (steady, unsteady, and time periodic) on overset meshes [76,77]. The Spalart–Allmaras turbulencemodel [78] is used to complete

(12)

(13)

Table 5

DeﬁnitionofthesimpliﬁedADODGCase 6optimizationproblem. Function/

variable

Description Quantity Range Minimize CD Drag coeﬃcient 1

with respect to α Angle of attack 1 [−3.0,6.0] (◦)

θ Twist 8 [−3.12,3.12] (◦)

δ Dihedral 8 [−0.25,0.25] (δ/c) Total variables 17

subject to CL=0.2625 Lift coeﬃcient 1 Total constraints 1

we reducethe numberofvariables. Thiscasewas devisedto ex-ploretheexistenceofmultiplelocalminimainaerodynamicwing design.Thebaselinegeometryisarectangularwingwitha chord of1.0 manda NACA 0012airfoilcrosssection withasharp

trail-ing edge. The semi-span is 3.06 m and the wing is ﬁtted with

a rounded wingtip cap. In the full benchmark problem, the

op-timizer is given freedom to change the twist, chord, dihedral, sweep, span, and sectional shape variables. In the modiﬁed ver-sion used for this study, we reduced the variables to twist and dihedral, for a total of 17 design variables. The geometry is pa-rameterizedusing theFFD approachimplemented inpyGeo [79], which allows the deﬁnition of globaldesign variables with con-trol of the sections of the B-spline control points. Nine sections

are deﬁned along the span of the wing with heavier clustering

towardthewingtip.Thetwistvariablesrotateeightofthese sec-tions (excluding the root section) about the quarter-chord. Like-wise, each of the eight dihedral variables controls the vertical displacementof one ofthe spanwisesections,excluding the root section. Theangleof attack canbe varied toallow the optimizer to satisfy the lift constraint. The objective of the problem is to performalift-constraineddragminimizationofasimplewing un-dersubsonicﬂow(M_∞ = 0.5),usingtheEulerequations.Themain characteristicsoftheoptimizationproblemaresummarizedin Ta-ble5.

4.3. Gradient-basedoptimizationresults

The SEGOMOE approach is compared to SNOPT on the same

test case. These gradient-based results,aswell asthose of other cases relatedto ADODG Case 6, are presented inmore detail by Bons et al. [6]. Here, we just compare the SEGOMOE results for theEuler-basedtwistanddihedraloptimizationcase,and summa-rizethecorresponding resultsforcompleteness.The optimization problem is solved using SNOPT [83] starting from the 15 ran-domshapesshowninFig.5(a),whereweuseacoarsemeshgrid (L3 mesh with180K cells). The average numberof iterations re-quiredtoconvergeisapproximately100.Nevertheless,inaddition

Table 6

ResultsobtainedwithSNOPTandADﬂowusingtheadjointmethod.Theglobal op-timumiswritteninbold.

CD×104_{(drag counts)} _{Number of runs} _{Mean constraint violation}

39.9091 5 7×10−10 40.1971 3 1×10−9 40.1972 3 1×10−8 40.1975 2 6×10−10 40.1986 1 8×10−10 40.1995 1 5×10−10

tothestandardevaluation,ADﬂowalsoneedstocomputethe gra-dients foreachaerodynamic-related function(dragandlift),each ofwhichrequiresan adjointsolution.Therefore,forthisproblem,

the mean evaluation budget is equivalent to approximately 300

standard ADﬂowcalls. Theoptimalshapesare showninFig.5(b), andthe optimalresultsaresummarized inTable6.Thisproblem has multiple local optima and a single global one, although the designsareverycloseintermsofdragvalue:39.9091dragcounts fortheglobaloptimumandaround40.1980dragcountsforthe lo-cal ones.Ofthe15runs,5convergedtowardtheglobaloptimum,

which is characterized by an upward winglet. For the other 10

runs, multiplelocalminimawere found forshapeswitha down-wardwinglet.

The radar chart in Fig. 6 shows the values of the 17 design variablesforthevariousoptima.Wecanseethatall localminima differ only slightly on the twist parameters and on the dihedral parametersintheinnerpartofthewing.Themultimodalityofthis problem, together withthe complexity oftheequalityconstraint, makesthisachallengingprovinggroundforSEGOMOE.

4.4. SEGOMOEresults

We performeda series of studies similar to those performed for theanalytic functions inSections 3.4and3.5.The runs were performedusingaﬁxed evaluationbudget of500evaluations, in-cluding the initial DOE anditerations. Sixinitial DOE sizes were considered,rangingfrom1

×

d

=

17 to6

×

d

=

102.Giventhe di-mensionoftheproblem,KPLS+Ksurrogateswereselectedaslocal expertsoftheMOEbecausewehavefoundinpreviousstudiesthat theyofferthebesttrade-offbetweeneﬃciencyandaccuracy [42]. ThetoleranceontheCL wassetto10−5,andtheenrichmentstep

was performedusingthe SLSQPoptimizer [74] with50randomly pickedstartingpointsateachstep.

Both the WB2 andthe WB2S criterion were tested using the

samebatchesofinitialDOEs(18DOEsperbatch,computedusing an optimized LHSalgorithm) for eachsize ofinitial DOE,leading to atotalof216runs. Aspreviously mentioned,thisallowsusto

Fig. 5. SNOPTresultsforwingoptimizationwithrespecttothetwistanddihedralvariables.Multiplelocalminimawerefoundwhenstartingfrom15randomlygenerated initialconﬁgurations.

(14)

(15)

Adaptive modeling strategy for constrained global optimization with application to aerodynamic wing design

an author's

https://oatao.univ-toulouse.fr/23683

https://doi.org/10.1016/j.ast.2019.03.041

Bartoli, Nathalie and Lefebvre, Thierry and Dubreuil, Sylvain and Olivanti, Romain and Priem, Rémy and Bons,

Nicolas and Martins, Joaquim R.R. A. and Morlier, Joseph Adaptive modeling strategy for constrained global

optimization with application to aerodynamic wing design. ( 2019) Aerospace Science and Technology. ISSN

1270-9638

Adaptive

modeling

strategy

for

constrained

global

optimization

with

application

to

aerodynamic

wing

design

N. Bartoli

∗

,

T. Lefebvre

,

S. Dubreuil

,

R. Olivanti

,

R. Priem

,

N. Bons

,

J.R.R.A. Martins

,

J. Morlier

a

b

s

t

r

a

c

t

*

(x)

(x)

≤

, . . . ,

(x)

≤

,

⊂ R

(

)

=

=

,

=

, . . . ,

∈ R

=

(

),

=

, . . . ,

(

)

∈ R

ˆ

(

)

ˆ

(x)

= ˆ

μ

+



−

_,

₌

_{, . . . ,}

_{∈ R}

_),

₌

_{, . . . ,}

₎

_{∈ R}

_,

₎

_(.,

_.)