PROBLEMS WITH SEQUENTIAL CONVEX
PROGRAMMING
Pierre DUYSINX
Michael BRUYNEEL
Claude FLEURY
Institute of Mechanics and CivilEngineering
University of Liege,
Chemindes Chevreuils 1,Building B52
B-4000Liege, Belgium
1 INTRODUCTION
Structural optimization consists in formulating the design problem of structural
compo-nentsasan optimizationprobleminorder to usethepowerof mathematical optimization
tools.
From a mathematical point of view, a quite general statement of the optimization
problemisgiven asfollowing
(P) 8 > > < > > : min x g 0 (x) s.t.: g j (x) g j j =1:::m x i x i x i i=1:::n (1) The function g 0
is the objective function of the problem, i.e. a cost function or a
performanceindexthatonewantstominimizeinordertohaveabetterdesign. Intopology
optimization,thisisforexamplethecomplianceofthestructureundertheconsideredload
case.
The set of constraint functions g
j
(in number m) are expressing the restrictions the
designis subject to inorder to be feasible. Forexample these functions are some bounds
upon a stress measure to have resistance, restricted displacements, a volume resource or
perimeterbound:::
The n variables x
i
are the design variables of the problem, that is, the parameters,
whichcanbemodied,toimprovethedesign. Inthetopologyoptimizationcontext,thex
i
Thedesignvariablevaluesarealsosubjecttosomeverysimplerestrictions,becauseofsome
physicalormathematicalreasons(thedensityliesbetween0and1,andtheorientationcan
besearchedbetween0and becauseofperiodicity). Otherconstraintsondesignvariables
can come from technological or manufacturing arguments like prescribed density regions
in topologyoptimization. Allthese specialconstraints on design variablesare called side
constraints. They generally callfor a special treatment inthe algorithm because of their
very simplestructure.
The direct solution of optimization problem (P) is totally prohibitive in structural
optimization,andfurthermoreintopologyoptimization,becauseofthecomputationalcost
ofthestructural and sensitivityanalysis oftheproblem. Indeedtheproblemofstructural
optimizationishighlynon-linear,andimplicitintermsofthedesignvariablessothateach
functionevaluation wouldrequirea Finiteelement (F.E.) analysis.
As soon as the seventies, Schmidt and his co-workers (e.g. Schmidt and Farshi [34 ],
Schmidt and Fleury [35]) proposed an interesting way to circumvent the problem while
usingmathematicalprogrammingtools. Theapproximation conceptsapproach replacesthe
primary optimization problem(P) witha sequence of explicit approximate sub-problems
having a simple algebraic structure, and built from the available information (function
values, rst and second order derivatives) at the current design point or at the former
iterationpoints. ( ~ P) 8 > > < > > : min x ~ g 0 (x) s.t.: g~ j (x) g j j =1:::m x i x i x i j =1:::n (2) whereg~ 0 and g~ j
denotes theapproximationsof objective functiong
0
and constraint
func-tionsg
j
. Theseapproximationscanberegardedassomekindofexpansionsoftheresponse
functionsaroundthecurrentdesignpointx k
the most famousones remain CONLIN by Fleuryand Braibant [24 ] and MMA (Method
ofMovingAsymptotes) bySvanberg[38]
However the success of the approximation strategy comes from the fact that the
sub-problems( ~
P)canbesolvedeÆcientlywithadaptedmathematicalprogrammingalgorithms.
Up to now themosteÆcient strategyis thedualmethod proposedinitiallybyFleury[17]
but used by many other authors like Svanberg [38 ]. Dual methods are well adapted to
structural problems, because the dimensionality of the dual solution space is generally
much lower than the primal design space. With eÆcient algorithms, dual solvers are
able to solve sub-problems within reasonable computational time. For sizing and shape
optimization, solution time is less than 1 percent of the F.E. computation time. When
dealingwith compliancetopologyproblems, thesame character ispreserved.
Tosummarize,thesolutionstrategyofoptimizationproblemsisbasedonthefollowing
steps:
1. For the current design characterized by the design variables x k
, perform a Finite
Element analysis andits sensitivityanalysis.
2. From theresultsof thecurrent structuralanalysis,generate an approximate( ~
P).
3. Solvethe sub-problem( ~
P)with aneÆcient solver, like dualsolver.
4. Adoptthesolutionoftheapproximatesub-problemx ?
asathenewdesignx k+1
and
go backto 1untilconvergence.
The strategy combines both concepts of approximation and dual solution and it is now
generally known as the sequential convex programming approach (SCP) as suggested by
cations. Onecancometo anearlystationnarysolutiongenerallywithin20 iterationsteps,
independently of the number of design variables. For topology optimization convergence
speedcan be a bit slower, because of the highercomplexity of the problem;it is notrare
to have to wait for50 or100 iterationsforhavinga stablesolution.
Beforestudyingthenumericalsolutiontechniquesintodetails,itisimportanttoremind
the main characteristics of optimal material distribution problems in order to be able to
selectthemostsuitedalgorithmstosolvetheseproblems. Topologyoptimizationproblems
arelargescaleoptimization problems. ItresultsthatextensionoftraditionalSCPmethods
mustbe consideredcarefully.
Thenumberofdesignvariablesislargeorverylarge,i. e. inusualtopologyproblems,
one hasto considerfrom1.000 to 100.000 densityvariables.
Thenumberofconstraintsisgenerallyquitelowwhentheproblemformulationrelies
on global constraints. For example, in compliance type problems, one has to take
into account asmanycompliance responsesasthenumberofloadcases plusvolume
and perimeter constraints. This means in turn that one has to consider around 10
restrictions. The number of restrictions being much smaller than the number of
designvariables,we areinafavorablesituation forapplyingdualalgorithms.
Thissituationistotallydierentwhenlocal constraintslikestressconstraintsare
con-sideredinthe designproblem. Indeedone hasto copewith one constraint pernite
element,whichmeansanumberofconstraint thatisofthesameorder ofmagnitude
asthenumberofdesignvariables. Thiscouldalsobethecaseifslopeconstraintsare
includedinthedesignasinSigmundandPetersson [31 ]. Thustheproblemisalarge
scale probleminthe designvariable space and in the dualspace. Under these
con-ditions,advantages of dualalgorithms arelower, butwork donein Ref.[13 ] showed
thatdual solvers are stillableto producesolutions witha computationtime that is
of the same order of magnitude as the F.E. analysis even when there are around a
thousandofconstraints.
Thereisalso anothercharacteristic relatedto stressconstraint whichcomplicatesthetask
of the optimizer: the singularity phenomenon of stress constraints. This diÆculty is
al-leviated byusinga perturbation technique (-relaxation technique as proposed by Cheng
and Guo [9]). So to considerstress constraints, one hasto designa additionalstrategy to
manage theextraperturbationparameter.
This lectureis devotedto explainthebasicelementsof SCPmethod andinparticular
its application to topology optimization problems, which are very large scale problems.
Despite thehighlevelof integrationof thedierentconcepts inavailableimplementations
ofSCP,thisstudywillstilldistinguishinthispresentationthetwomainconceptsaspointed
outby[20 , 22 ]:
insection 3.
Finally as our goal is to apply Sequential Convex programming to topology problems,
dierent issues particular to topology problems will be reviewed in section 4 and 5. At
rst, we will seehow an eÆcient approximation can be builtfor theperimeter constraint
(section 4). Then we will have a careful study of the management of the -relaxation
technique ofstress constraint (section 5).
2 DUAL SOLUTION ALGORITHMS
2.1 Lagrange function
Supposethatwe have to solve thefollowingnon linearoptimization problem:
min x f(x) s.t. g j (x) 0 j=1;:::;m (3)
wherefunctions f(x) and g
j
(x) areassumed to becontinuousand dierentiable.
It is classicto denetheLagrangefunctionassociatedto problem(3):
L(x;)=f(x)+ m X j=1 j g j (x) (4) where the j
0 are Lagrange multipliers associated to each constraint g
j
(and so in
number m as the constraints). The new variables
j
are generally called dual variables
since there is a one-to-one association with design constraints. Conversely the original
designvariablesxof theproblemaresaid primal variables ofthe problem.
One can remark that the Lagrange transformation replaces the constraints g
j (x) 0 bya linear term j g j
(x)in the objective function. This can be interpretedas adding to
objectivefunctionf(x)alinearcost,withmarginalprice
j
,whichhastobepaidwhenever
theconstraint isviolated.
Lagrange function transforms the optimization constrained problem into an
uncon-strainedproblem. Theobjectivefunctionofthisnewoptimizationproblemispreciselythe
Lagrangian functionL(x;) and thedesignvariablesareboththexand variables.
min
x max
0
L(x;) (5)
Maximizationoveraectsaninnitecost(penalty)toLagrangefunctionwhenconstraints
are violated g
j
(x) > 0. The price of this transformation is that the dimension of the
Theorem 1 Necessary conditions of optimality of contrained problems.
Ifx ?
isanoptimum of problem(3),andifx ?
isa regularpoint,Thenonecan ndavector
of Langrangemultipliers ? =( ? 1 ;:::; ? m ) such that @f(x ) @x i + m X j=1 j @g j (x ) @x i = 0 8i (6) g j (x )0 (7) j 0 (8) j g j (x )=0 8j (9)
The Karush-Kuhn-Tucker (KKT) conditions stated as below consist of four types of
conditions,namely
Stationarityofthe LagrangefunctionL(x;) withrespectto x,
Primalfeasibility,which meansthat x ?
isa feasiblepoint,
Dual feasibility,which meansthatthe Lagrangemultipliers ?
are nonnegative,
Complementaryslackness, whichmeansthattheLagrangemultiplierscorresponding
to inactive constraintsare zero.
Remark 1 A point x ?
is a regular point of the problem, if all the gradientvectorsrg
j of
active constraints (i.e. g
j (x
?
)=0) arelinearly independent.
2.3 Introduction to duality
Asremarked above,therst KKT condition(6) impliesthe solutionof thesystem:
r
x L(x;
?
) = 0 (10)
Thisconditionis equivalentto say thatx ?
isthesolutionof minimizationproblem:
min
x
L(x; ?
)
Imaginenowthat, foranyLagrangevector,i.e.
=( 1 ;:::; m ) such that j 0 j=1:::m
one can ndthesolutionofthe minimizationproblem(called Lagrangian problem)
min
x
thenew dualvariables.
x = x() (12)
Substitutingprimalvariablesintermsoffunction(12),itispossibletorewriteLagrange
functionL(x;)interms ofdual variables solely
`() = L(x();) = f(x())+ m X j=1 j g j (x()) (13)
Thisfunction`() iscalled dualfunction of theproblem.
Optimizationproblemthatisrelatedtodualfunctionisamaximizationproblem,called
dualproblem (whileoriginalproblemis calledprimal problem):
max j `() s.t. j 0 j=1;::: ;m (14)
Solution of dual problem determines the optimal Lagrange multipliers ?
, which satisfy
KKT conditions. One can showthat dualproblemhasthefollowingproperties:
Ifprimalproblemis a minimizationproblem,dual problem isa maximization
prob-lem;
Dual problempossessesa solutionifprimalproblemdoes;
A solutionofdualproblemalso provides asolutionto primalproblem.
2.4 Weak duality
Withverylittleassumptionsonf andg
j
,onecan statethemostgeneraldenitionofdual
functionasfollows:
`() = inf
x
L(x;) (15)
So dual function exists even for non convex problems or discrete valued design variable
problems, etc.
Withveryweakassumptionsonf andg
j
,onecanprovethat`()isconcave. However,
dualfunctionis non-smooth, sothat derivative mustbe replacedbythesub-gradient.
Dual problem isdened asthemaximizationproblem
max j `() s.t. j 0 j=1;::: ;m (16)
local optimum of `() is a global optimum. Therefore the dual problem will in general
be more easily solved than the primal problem (for example when dealing with integer
programming). Thisisone ofthereasonswhytheconceptof dualityhasbeenfoundto be
veryusefulinmathematicalprogramming. Howeverbecauseofthenon-smoothnessofdual
function, one has to resort to non smooth maximization algorithms, which are generally
lesseÆcient.
General properties of dualfunction
For any feasiblex and forany Lagrangevector ,there holds:
f(x) `() (17)
Goingalong, if ?
is thesolutionof dualproblem, and ifx ?
is theoptimumof theprimal
problem,one has
f(x ?
) `( ?
) (18)
Thismeansthatdual functionisalwaysa lowerboundofprimalobjective functionvalue.
Thegap G = f(x ?
) `( ?
) iscalledduality gap. In addition,primalpointassociatedto
optimalLagrangemultiplierthroughLagragian problem
arg inf
x
L(x; ?
) (19)
maynotbe realizedormaynotbe afeasiblepoint
2.5 Strong duality
Nowsupposethatwewant to solvethe convex problem:
min x2X f(x) s.t. g j (x) 0 j=1;:::;m (20)
wherefunctions f(x) and g
j
(x) areassumed to beC 1
(i.e. continuousand dierentiable),
and convex. The set X of feasibledesignvariablesis also convex.
The setX is usuallymadeofsideconstraintson designvariables,which isobviouslya
convex set: X = fx i j x i x i x i i=1:::ng (21)
In addition,it isassumed that theSlater condition(orqualicationof the constraint)
isfullled:
9x~ : g
j
then, thereis atleast oneLagrangemultiplier vector ? such that f(x ? ) = min x L(x; ? ) (23)
and thereis no dualitygap
f(x ?
) = `( ?
) (24)
Corollary 1 Necessary and suÆcientconditions of optimality:
If the problem (20) is a convex and if functions f(x) and g
j
(x) are dierentiable and if
Slater condition is satised, then x ?
is the primal optimal point and ?
the optimal dual
point if and only ifKarush-Kuhn-Tucker conditions aresatised
Corollary 2 If theproblem (20)isaconvexandif Slaterconditionissatised, theprimal
problem (20) is completely equivalentto solve the dual problem:
min j `() s.t. j 0 j=1;:::;m (25)
withdual function
`() = min
x2X
L(x;) (26)
This is a consequence of the nature of the Lagrange function. If f(x) and g
j
(x) are
C 1
andconvex functions,Lagrange functionL(x;)hasa saddle pointin (x ? ; ? ),which meansthat: L(x ? ;) L(x ? ; ? ) L(x; ? ) (27)
Thenceitispossibleto showtheequivalencebetweenthefollowingoptimizationproblems:
min x2X max 0 L(x;) , max 0 min x2X L(x;) (28)
Aslefthandsideproblemisanequivalentofprimalproblem,andrightproblemisthedual
Concavity
Dual function`()is concave.
Lower boundand duality gap
Generally speaking, dual function is a lower bound of primal function, but for convex
problems, thereis no dualitygap:
`( ?
) = f(x ?
)
Thismeansthat thedualsolutionis equivalentto theprimalsolution.
Gradient of dualfunction
The rst derivatives of dual function with respect to Lagrange multiplier is simply the
constraint valueassociatedto thatmultiplierinpointx():
@`() @ k = g k (x()) (29)
The proofof thisformulaiseasy when derivingdual function:
@`() @ k = @ @ k ff(x())+ m X j=1 j g j (x())g = n X i=1 2 4 @f(x()) @x i + m X j=1 j @g j (x()) @x i 3 5 @x i @ k + g k (x())
The rst contribution vanishes because it is the optimality conditions of the Lagrangian
problemleadingto denitionofx().
From result (29), its comesthat the evaluation of the gradient of the dual function is
very easy to calculate.
Optimality conditions of dual problem
Itcomesfromthepreviouspropertythattheoptimalityconditionsofthequasi-unconstrained
dualproblemcan bewrittenas follows:
@`() @ k = 0 if k > 0 (30) @`() @ k < 0 if k = 0 (31)
A rst algorithm for dual maximization
These conditions indicate that the maximum of the dual function is attained when the
constraints are satised exactly (as equalities) for
j
> 0 and as inequalities if
j = 0.
to maximizethedual function: + = + r` (32) or + j = j + g j (x()) (33)
whereisastepsize. Thusadualvariables
j
increasesifthecorrespondingconstraintg
j is
violated,whereasitdecreases(possiblyreachingzero)ifg
j
isnegative(andsatised). From
theseconsiderations,itresultsanintuitiveinterpretationofthedualmethodapproach: the
approachattemptsto satisfytheinequalityconstraintsbyadjustingthevaluesofthedual
variables.
Sincenonnegativityconstraintsareveryeasytotakeintoaccount,classicalalgorithms
forunconstrained maximizationcan be readilyadapted to solvequasi-unconstrained dual
problem. Inparticular,conjugate gradientmethod iswellsuitedsincecomputationof the
gradient is straight forward: it needs computingtheconstraintsvaluesg
j
(x()) solely.
Separable case
The main diÆculty of the dual approach is to compute the relationshipsbetween primal
and dualmethodsx():
x() = arg min
x2X
L(x;) (34)
Solvingtheminimizationproblemhastoberepeatedalotoftimes,andthismightleadtoa
prohibitivecomputationalcost. Howeveriftheproblemhassomecertainspecialstructure,
asseparableproblems, thisisnotcumbersome. Inadditionto convexity,separabilityisan
essentialpropertyfordualformulation to be eÆcient.
A functionf(x) is said to be separableif thefunction be written assum of functions
f
i (x
i
),whichdependsonlyon thesingle variable x
i . f(x) = n X i=1 f i (x i ) (35)
Separablefunctionsbenetformsomecomputationallyimportantproperties. Inparticular,
theHessian matrixof such functionisdiagonal.
The constrained problem (20) is a separable problem programming problem if each
functionf(x), g
j
(x)is itself separable. Thisimpliesthatthe Lagrangian functionL(x;)
isseparabletoo. L(x;) = n X i=1 L i (x i ;)
Lagrangian problems min x2X L(x;) = n X i=1 min x i 2X i L i (x i ;) (36)
whereeachone dimensionalproblemcan be solvedseparately. Thusthedualfunctioncan
bewritten asfollows: `() = n X i=1 ` i () = n X i=1 f min x i 2X i L i (x i ;)g (37)
Inmaycases thesingle-variableminimizationproblemhasa simplealgebraicstructure
and itcan besolved inclosedform,thusyieldingan explicitdualfunction.
Second orderderivatives of dual function
To get second order derivatives of dual function, one derives the expressionof rst order
derivativesgiven in(29) one more time,whichgives:
@ 2 `() @ k @ l = @ @ l g k (x()) = n X i=1 @g k @x i @x i () @ l (38)
One remarks immediately that because of the presence of term @x
i ()
@
l
the second order
derivatives can be discontinuous. Indeed, when sideconstraints are taken into account in
Lagrangian problem(11) in order to be treated explicitly,the function relationship x()
can berst order discontinuous. Thiswillbe explainedinmore detailslater.
Evaluationof thisterm @xi()
@
l
can bemade fromKKTconditions. DerivationsofKKT
conditions(6) withrespectto
l gives: n X k=1 2 4 @ 2 f @x i @x k + m X j=1 j @ 2 g j @x i @x k 3 5 @x k () @ l + @g l () @x i = 0
Goingtomatrixnotations,onerecognizestheHessianmatrixoftheLagrangefunctionand
gradient vector of constraint g
l
withrespectto primalvariables:
r 2 xx L(x;) @x() @ l + r x g l = 0
If side-constraints are treated explicitly, one has to restrict thisrelation to free variables
(variables, which are not xed at their lower or upper bounds). So one has to consider
matrixGinsteadofr 2
xx
L. MatrixGisobtainedfromr 2
xx
Lindeletingrowsandcolumns
correspondingto xed variables. Onegets now:
G @x() @ l + r x g l = 0 (39)
constraints N = [ rg 1 :::rg m ]
and ifone introduces result(39) into (38), one getsthe nalexpressionofHessian of dual
function: @ 2 `() @ k @ l = N T G 1 N (40)
We must repeatthat thisHessian is onlycontinuous by pieces. Discontinuityof dual
Hessian occursalong hyper-planesof equations
x i () = x i and x i () = x i (41)
This has a major impact on dual maximization, since greatest care has to given to that
property to built second order algorithms for dual maximization. Classical Newton or
quasi-Newton methods can only be applied on subregions where the set of free and xed
designvariablesisfreezed. BuildinganeÆcientstrategytondtheoptimalsetoffreeand
xed designvariablesis one ofthe majorissue ofdual optimizerconstruction.
2.7 Application to quadratic problems with linear constraints
Weillustratethedualapproachonthequadraticproblemwithlinearinequalityconstraints.
The problemstatement is thefollowing:
min x 1 2 x T x s.t. C T x d
whereCisanmmatrixoftheconstraintgradients(whichareareassumedtobelinearly
dependent). TheLagrangian functionof theproblem is:
L(x;) = 1 2 x T x T (C T x d)
Optimalitycondition(KKTconditions) are:
r x L =x T C = 0 C T x + d 0 0 T (C T x d) = 0
TherstconditionisthesolutionoftheLagrangianproblem,whichleadstotherelationship
betweenprimaland dualvariables:
Replacing primalvariablesinto Lagrangian function,one gets thedual function: `() = 1 2 x() T x() T (C T x() d) = 1 2 T C T C T (C T C d) = 1 2 T C T C + T d
Gradient of dualfunctionis givenby:
r`() = C T C + d = d C T x()
which isthevalue ofprimalconstraints, inagreement withthegeneral theory.
Fromthisexpression, one getsalso easilytheHessian matrixof dualfunction:
r 2
`() = C T
C
This Hessian matrix is constant. Thus dual functionof a quadraticproblem is quadratic
function. It hasthefully explicitform:
`() = 1 2 T A + T d whereA=C T
Cdenotesthenegative ofthedual Hessian matrix.
In the case of equality constraint, dual variables are unrestricted in sign. Therefore
the maximum of dual function can simply be obtained by stating that its gradient must
vanish:
r`() = A + d = 0
Thisleadsto thesolution:
?
= A 1
d
From thedual solution, one recovers theoptimalprimalsolutionwith thehelp of
primal-dualrelationships:
x ?
= C ?
In the case of inequalityconstraints, the dual variables are subject to non negativity
sideconstraints, and onehas to solve dualmaximizationproblem:
max 1 2 T A + T d s.t. j 0
ers primalsolutionfrom thesame primal-dualrelationshipsasbefore:
x ?
= C ?
Dual maximization is the most natural and rigorous way to select automatically the
optimum set of Lagrange multipliers. In addition, as the number of dual variables is
oftensmaller thanthe numberof primalvariables, thedimensionalityof the optimization
problemwe actuallysolve is smaller.
Example of quadratic separableproblem
min x1;x2 1 2 x 2 1 + 1 2 x 2 2 s.t. x 1 + x 2 4 x 1 x 2 4
2.8 Treatment of side constraints
Let us consider the following separable, quadratic problem, with linear inequality and
side-constraints: min x 1 2 n X i=1 x 2 i s.t. n X i=1 c ij x i d j j=1:::m (42) x i x i x i i=1:::n
The side constraints, which imposelower and upper boundson the design variablescan,
of course, be considered as linear inequality constraints. However this would increase
dramaticallythenumberofdualvariables,andthusthiswouldreducepotentialadvantage
ofdualmethodbyincreasedimensionalityofdualworkspace. Therefore,eÆciencycallsfor
a particular and separate treatment of these very simple constraints, apart from general
constraints.
The Lagragian problemis
min x i x i x i 1 2 n X i=1 x 2 i m X j=1 j ( n X i=1 c ij x i d j ) (43)
Because of the separabilityproperty, the n-dimensional problem can be split into n
one-dimensionalproblemsrelativeto each variable "i":
min x i x i x i L i (x i ;) = 1 2 x 2 i ( m X j=1 j c ij ) x i (44)
The solutions of these problems areobtained byexpressingthe optimalityconditions,
i.e. vanishingthe rstorder derivativewithrespect to x
i : @L i @x i = x i ( m X j=1 j c ij ) = 0 (45)
Thisyieldsthe primal-dualrelationshipinclosedform:
x ? i = ( m X j=1 j c ij ) (46)
This expression holdsif the sideconstraint are ignored. Now to enforce side-constraints,
one hasto considerthree situations (seegure 3):
x i ()=x ? i if x i x ? i x i ; (47) x i () = x i if x ? i x i ; (48) x i () = x i if x ? i x i : (49)
Even ifthesolutionintroducesdierentconditions,theprimal-dualrelationsx=x() are
still available under closed form. Nonetheless considering dierent cases leads to a
non-smoothexpression. Indeedthederivationofthisexpressionisdiscontinuouswhenchanging
from one conditionto another,thatis when axed variablebecome freeorconversely.
The dual function is formed by inserting these primal-dual relationships into the
La-grangianfunction,whichleadsto thefollowingstatementof thedualproblem(onecannot
ndanymore an explicitexpressionof dualfunction):
max 1 2 n X i=1 x 2 i () m X j=1 j ( n X i=1 c ij x i () d j ) s.t. j 0
Firstderivativesofdualfunctionaregivenbyprimalvariableconstraint,inwhichprimal
variableshasbeenreplacedbytheirexpressioninterms oftheLagrange multipliers:
@` @ j = d j ( n X i=1 c ij x i ()) (50)
Second derivativesof dualfunction, giving Hessianmatrix, areeasilyavailabletoo:
A jk = @ 2 ` @ j @ k = @ @ k " d j n X i=1 c ij x i () # = n X i=1 c ij @x i @ k (51)
Now the situation is much more complicated, because of the separate treatment of side
constraints, which introduces non smooth primal-dual relations. On one hand for free
variables,wehave: @x i @ k = c ik if x i x i x i (52)
Ontheother hand, forxedvariables, itis obvious that
@x i @ k = 0 if x i =x i or x i = x i (53)
ThereforetheHessian matrixtakestheform:
A jk = @ 2 ` @ j @ k = = X i2F c ij c ik (54)
eachtimethatafreeprimalvariablebecomesxedorconversely,sinceonechangestheset
F of freevariablesinformula (54). From theprimaldualrelationshipsit isclearthat the
dualspaceispartitionedinseveral regionsseparatedbysecond-order discontinuityplanes.
Theseplanesare denes herebytheirequation:
( m X j=1 j c ij ) = x i ( m X j=1 j c ij ) = x i (55)
Example quadratic problem withseparable treatment of sideconstraints
min x 1 ;x 2 x 2 1 + x 2 2 s.t. x 1 + x 2 1 1=4 x i 4
2.9 Dual solution scheme of MMA sub-problems
Anticipatinga littlebit on structuralapproximations, itis ratherinteresting to studythe
principlesofdualsolutionsappliedtoMMAapproximatedproblems. After normalization,
theMMA sub-problemcan be writteninthefollowingform:
min x n X i=1 p i0 U ij x i + n X i=1 q i0 x i L ij s.t. n X i=1 p ij U ij x i + n X i=1 q ij x i L ij d j j=1:::m (56) x i x i x i i=1:::n
Onemustnoticethattheasymptotes maydependonbothvariableandconstraint indices:
U
ij and L
ij .
In order to simplify the notations, we adopt the following notation:
0
= 1, so that
Lagrangian functionisgiven by:
L(x;) = m X j=0 j ( n X i=1 p ij U ij x i + n X i=1 q ij x i L ij d j ) (57)
The Lagragian problemis
min x i x i x i L(x;) (58)
dimensionalproblemsrelativeto each variable "i": min x i x i x i L i (x i ;) = m X j=0 j p ij U ij x i + m X j=0 j q ij x i L ij (59)
For apure MMA[38 ] approximation,theasymptotes are thesame foreach constraint
and they do not depend on the approximation index j: U
ij = U i and L ij = L j . The
solution of this problem can be solved explicitly for each variable and gives rise to the
primal-dualrelations. Optimalityconditionsof theLagrangian problemare:
P m j=0 j p ij (U i x i ) 2 P m j=0 j q ij (x i L i ) 2 = 0 (60)
Fromthisexpression,oneextractsthecandidateoptimalvalueofvariableinterms ofdual
variables: x ? i () = U i + L i +1 (61) where = s P m j=0 j p ij P m j=0 j q ij (62)
For an approximation of the GMMA family [36], the primal-dualrelationshipsare no
longer explicit since each asymptote depends now on both the primal variables (index
"i") and on the constraint (index "j"). As the Lagrangian problem no longer admits a
closedsolutionform,a Newton-Raphsonscheme isadopted (as inRef.[42]). However the
approximationisseparableandnone-dimensionalnumericalminimizationsareperformed.
The iterationscheme forprimal-dualrelationshipsx
i () isgiven by: x i ( + ) = x i () @L i =@x i @ 2 L i =@x 2 i (63) where @L i @x i = m X j=0 j p ij (U ij x i ) 2 m X j=0 j q ij (x i L ij ) 2 (64) @ 2 L i @x 2 i = 2 m X j=0 j p ij (U ij x i ) 3 2 m X j=0 j q ij (x i L ij ) 3 (65)
considerthree situations: x i ()=x ? i if x i x ? i x i ; (66) x i () = x i if x ? i x i ; (67) x i () = x i if x ? i x i : (68)
ValuesofprimalvariablesintermsofLagrangemultipliersaretheninsertedinLagrange
function to calculate dual function value. Gradient of dual function is also given by the
valueof primalconstraints.
When primal-dual relationships are not available in closed form, each evaluation
re-quires the solution of non linear problems, which needs an additional numeric eort. In
orderto circumventtheproblem,Fleury[20 ]suggested tobreakthesolutionoftheconvex
subproblem itself into a sequence of quadratic separable sub-subproblems. As explained
in the former section, these problems can be solved eÆciently by dual methods. So the
non-explicitcharacter of primal-dualrelationshipsisnota realobstacle inpractice.
3 STRUCTURAL APPROXIMATIONS
3.1 Characteristics of approximations
When building approximation schemes, one pursues several (and sometimes antagonist)
goals. It isgoodto explicitlygive them before reviewingthedierent schemes.
The rst constraint one hasto keepinmindis thatthe solutionofthe generated
sub-problemmustremainrigorousandthelesscomputationallyexpensive. Inorderto beable
to resort to dual methods, one would like, at rst, that the solution of the dual problem
is the same as the solution of primal sub-problem. Then the power of the dual method
holdsifthecomputationeorttomove tothedualspaceissmalltosave thebenetofthe
dualsolution. To match theseconditions,thesub-problems( ~
P)andtherelatedstructural
approximations mustbe:
Convex. The convexity of the sub-problems insures that there is a unique solution
and that the solution of the dual problem is the solution of the original problem.
In addition the dual problem is concave and eÆcient algorithms can exploit this
property.
Separable. The separability is essential to arrive at relations between the primal
design variables and the Lagrange multipliers that are easy to compute. Primal
variables are given by an uncoupled system of equations in terms of the Lagrange
multipliers, which is possible to solve independently for each primal variable. F
ur-thermore for most of the approximation schemes, it is even possible to express the
solutioninclosedform. Separabilityisanessentialpropertytoreducestoaminimum
ary solutionwithin a minimumnumberof iterations withoutconstraint violations.
Obvi-ouslythenumberofstructural analysescan be largelyreducedwhenappropriateschemes
areused. To thisendone would like thatthestructural approximationsare:
Precise inorder to give thebestt to thereal responsesinthelargestneighborhood
possible. Then the higher the quality of the approximation is the quicker is the
convergencespeedto asolution.
Conservative enough inorder to generate asequence of steadily improved iterations
which providefeasiblesolutionsat any stage ofthe optimizationprocess.
Conserva-tive approximations are obtained by increasing the value of the convexity terms in
order to reduce the sizeof the trustregion and to avoid constraint violations. This
argument isgenerally antagonist withtheprecision property.
Furthermore the overall computational time to come to an optimum depends also upon
the numerical work that is necessary to buildthe approximation. To this end one wants
to lookfor
Aminimumcomputationaleort togeneratetheapproximation,butalsotocalculate
thethe necessaryinformationthat is requiredto builtit. Forexample computation
of truesecond order sensitivityisgenerally expensive so thatit is generallyavoided
and replaced by approximated second order information e.g. with Quasi-Newton
techniques. Another exampleis the cost of thesolutionof the primal-dual
relation-ships: Too complex approximation schemes are generally highly penalized because
they require elaborated and expensive numerical solutions of the primal-dual
sys-tem,which,inits turn,increasesthenumericaleortto formulatethedualproblem.
Therefore, itcomesthat reducing theoverall computational eortisalso antagonist
to the accuracy and precision criteria,so that a trade ohas be made between the
two criteria.
Thislistis nonextensive,one must alsokeepinmindthattheapproximationscheme and
theapproximatedsub-problemsmustbe
Robust, in order to be able to construct the approximation in an automatic and
reliablemannerand ina lotof situations.
Flexible,tobeabletobeusedforvariouskindsofstructuralandgeometricresponses.
Overall convergent inorder to be able to endup ina optimized solution(may be a
local optimum)from anystartingpoint design.
:::
The approximation schemes aregeneratedthroughrst orsecond orderTaylor
expan-sionofthedesignfunctiong
j
(x)aroundcurrentdesignpointx k
. Thedierentschemesare
i
whichmeansthattheprecisionoftheapproximationisrestrictedto aneighborhoodofthe
current designpoint.
Wearenowgoingto reviewthemostimportantschemesandseetheimprovements(or
drawbacks) thateach one brings.
3.2 Linear approximation and sequential linear programming
When thinking about local approximation, the most natural and simple approximation
schemeisatheTaylorexpansionaroundcurrentdesignpointx 0
restricted tolinearterms.
The linearapproximation writes:
~ g j (x) = g j (x 0 )+ n X i=1 @g(x 0 ) @x i (x i x 0 i ) (69)
The choice of linear approximation is natural when it corresponds to the nature of the
restriction. Forexample itis exact forthevolumerestriction when usingSIMP material.
It isalso mandatorywhen dealingwith equalityconstraints.
Whenthelinearapproximationisappliedtoeach restrictionoftheproblem,one
trans-formstheoriginalprobleminto a sequenceof linear programming problems:
min x i g 0 (x 0 )+rg 0 (x 0 ) T (x x 0 ) s.t. g j (x 0 )+rg j (x 0 ) T (x x 0 ) 0 (70) x i x i x i
With the general formulation of linear programming, it is possible to treat a wide
spectrum of problems independently of the nature of the restrictions. But, may be the
most interesting advantage of sequential linear programming, comes from the fact that
the solution of this kind of sub-problems can be realized eÆciently with the help of
lin-ear programming algorithms like a SIMPLEX algorithm or a primal-dual interior point
method [29 ]whichis very welladaptedto verylarge scaleproblems.
However the sequential linear approximation strategy has also some drawbacks. Due
to the lack of convexity of the sub-problem, one can have some problem to stabilize the
convergenceprocess. Onecanhavesomeoscillationsoftheconvergenceoralotofunfeasible
designs during iteration history. To overcome the diÆculty, one has to add some
move-limits to playtherole of a'trust region':
max(x i ;x 0 i i ) x i min(x i ;x 0 i + i ) (71)
For topology problems, we suggest an adaptive move-limit strategy which reduces the
designintervalwhenthedesignvariableoscillatesandwhichopensitwhentheconvergence
process isstable.
Foriteration k=1and 2,take aninitial 10percent move-limits
i = 0:1(x i x i ) (72)
metriclaminatesubjectto shear and torsionloads[7 ]
Foriterationsk >2,updatemove-limits according to thefollowingrule:
(k+1) i =0:7 (k+1) i if x (k+1) i x (k) i <0 (73) (k+1) i =1:2 (k+1) i if x (k+1) i x (k) i 0 (74)
Whentheoriginalproblemisverynon-linear,itissometimesnecessarytoadoptveryclose
move-limits,whichreducesalottheconvergencespeed. Itisusualthat100iterationsmay
benecessary to come to astable solutionin topologyoptimization.
3.3 Reciprocal variable expansion
As soon as the beginning of the seventies, Schmidt and his co-authors (see for example
Ref. [34 ])showedthatan approximation scheme intermsofreciprocalvariablesy
i =1=x
i
isfavorableto reduce thenon-linearityof responsesin sizingproblems. Thusagoodlocal
approximation is realized when expanding the responses g
j (x) in terms of intermediate variablesy i =1=x i
to better capturethenon-linearcharacter ofthefunction:
~ g j (x) = g j (x 0 )+ n X i=1 (x 0 i ) 2 @g(x 0 ) @x i ( 1 x i 1 x 0 i ) (75)
Foralotofstructuralapplications,thereciprocalscheme(75)leadstosuccessfulresults
laminatesubjectto shear and torsion loads[7 ]
designproblem. In factthescheme iseÆcient whenall therst derivativesofthefunction
arenegative(likeindeterminatestructures),sincetheapproximation(75)hasonlypositive
curvature terms and is conservative. However when the problem is highly non-linear or
when dealing withother kindsof problems like shape orcompositeproblems, Fleuryand
Braibant [24 ] observed that the derivatives can have mixed signs and that convergence
processisnomore stable,becausetheapproximation(75)isnolongerconservativeforthe
variableswhich have positive derivatives.
3.4 Mixed linearization approximation: CONLIN
ToovercomediÆcultyofreciprocalvariableandlinearexpansion,FleuryandBraibant[24]
expressed theidea to combine reciprocal expansion(when thederivative is negative) and
linearapproximation(whenthederivativehasapositive value) ina singlemixed
approxi-mationscheme: ~ g j (x)=g j (x 0 )+ X + @g j @x i (x i x 0 i ) X (x 0 i ) 2 @g j @x i ( 1 x i 1 x 0 i ) (76) where P +
isthesumoverall theterms forwhichthederivative ispositiveand P
isthe
sum over allthe termsforwhich thederivativeis negative.
Thescheme(76)isunconditionnallyconvexsinceallitssecondderivativesarepositives,
that'swhyitis namedCONLIN asConvexLinearization. Moreover, it wasdemonstrated
subjectto shear and torsionloads[7]
be generated withlinear and reciprocal variables. This convex character gives riseto the
conservativecharacter ofCONLIN, which meansthatthe approximation (76)tends to lie
inthefeasibledomainoftheconstraint. ItfollowsthattheCONLIN methodmostlytends
to generate feasiblenew solutions. Innumerous applications,CONLIN technique leadsto
a stable convergence and a feasible sequence of steadily improved designs. Convergence
speedisgenerallyfast: 10to20F.E.analysesaregenerallynecessaryto reachastationary
solutioninsizingoptimization.
AnotherstrengthofCONLINcomesfromthefactthatitisverywellsuitedtoausedual
optimizers, because of theconvex and separablecharacter of theapproximatedfunctions,
the sub-problemlends itself to an eÆcient solutionprocedure based on dual method and
second order maximizationalgorithm(see Fleury[18]).
ThemaindrawbackofCONLINisrelatedtothefactthatthecurvatureofthe
approxi-mationisxedandthereisnowayfortheusertomodifyitexceptbyusingtrickychangeof
designvariables. Thisalsosometimesintroducesa badttingoftheapproximationto the
real function. So despite numerous successes, the mixed linearizationleads sometimes to
toosloworunstableconvergencehistoriesaswell. Famousexampleofdivergenceprocesses
were reported bySvanberg[38 ].
3.5 Method of Moving Asymptotes: MMA
To beableto adjustthecurvatureoftheapproximation andtohavea better ttingto the
subjectto shear and torsionloads[7]
Svanberg [38 ]whichextends theconvex linearizationscheme.
~ g j (x)=r 0 j + n X i=1 p ij U i x i + n X i=1 q ij x i L i (77) with p ij = maxf0;(U i x i ) 2 @g j @x i g q ij = maxf0; (x i L i ) 2 @g j @x i g (78) and where r 0 j
collects all zero order terms that are adjustedto t to the constraint value
inx 0
.
Once again the scheme can regarded as a rst order Taylor expansion in terms of
intermediatevariables1=(U
i x i )or1=(x i L i
)dependinguponthesignofthederivatives.
Ofcourse, one hastheasymptotes valuesaresuchthat L k i <x k i <U k i .
The twosets ofverticalasypmtotes U
i andL
i
generalizetheverticalasymptotes
intro-ducedinCONLINwiththechangeofvariablesy
i =1=x
i
. InfacttheCONLIN
approxima-tioncan berecovered fromthegeneral statement byassumingL
i
=0 forthelower bound
and U
i
!1 forthelowerbound.
The two sets of asymptotes also play a role of move-limits. This analysis has been
subjectto shear and torsionloads[7]
pointmethodsinwhichside-constraints(herelowerbound)aretakenintoaccountthrough
barrierfunctionsof thetype [6 , 28 ]:
ln(x i x i ) or 1 x i x i (79)
ToadjusttheMMAapproximation,one hastoplaywiththeparametersofthe
approx-imationi.e. the positionof the verticalasymptotes. One could calculatethat the second
derivatives(and theconvexity aswell)increasewhen theasymptotes are pushedcloser to
thedesignpoint. Andconverselytheconvexityoftheapproximationisreducedwhenthey
aremoved awayfrom thedesignpointx k
.
One diÆculty of the method comes from the fact that the automatic choice of these
curvature parameters remains diÆcult. Svanberg [38 ] proposed a heuristic strategy for
asymptotesupddatebasedonthedesignvariableoscillations. Fortheiterationsk =1and
2,thedefaultvaluesof theasymptotes areadopted:
L k i = x k i s 0 (x i x i ) U k i = x k i + s 0 (x i x i ) (80) with s 0
= 0:5 is suggested by Svanberg. Then asymptotes are updated in the following
ways.
For k >2, the update scheme is the following. When convergence process is smooth
(x k 2 i x k 1 i ):(x k 1 i x k i
L k i = x k i s 1 (x k 1 i L k 1 i ) U k i = x k i + s 1 (U k 1 i x k 1 i ) (81) withs 1
less than1and generally chosen as1:2.
While when convergence history oscillates (x k 2 i x k 1 i ):(x k 1 i x k i ) < 0, one has
to close the asymptotes towards the design point to increase convexity. So one gets the
followingupdaterules
L k i = x k i (x k 1 i L k 1 i )=s 2 U k i = x k i + (U k 1 i x k 1 i )=s 2 (82) withs 2
greater than1 andgenerally chosen ass
1 or
p
s
1
OnemayalsonoticethatZhangandFleury[41 ]proposedanalternativestrategybased
on thettingtheapproximationto theprevious functionvalue.
3.6 Globally Convergent Method of Moving Asymptotes: GCMMA
Asone can remarkin gures8 and 9 that MMA (but also COLNLIN) ismade of sum of
monotonousfunctions. Whenoneisfacedtonon-monotonousproblems(whichisoftenthe
casewithcompositestructuresforexample),itismandatorytorestrictthedesignvariable
motion in one direction. Therefore, as suggested by Svanberg [38 ], a move-limit strategy
isgenerally adopted incombination withMMA.
max(x i ;0:9L j +0:1x 0 i ) x i min(x i ;0:1x 0 i +0:9U i ) (83)
More recently, Svanberg [39 ] proposed a new extension of MMA method which uses
simultaneouslybothasymptotesinordertocreate anon monotonousapproximation
func-tion. The generalstatement of theapproximation issimilarto MMA formula:
~ g j (x)=r 0 j + n X i=1 p k ij U i x i + n X i=1 q k ij x i L i (84) butcoeÆcients p k ij and q k ij
arebothnon zeroingeneral. Theyare aschosen asfollows:
p k ij = (U k i x k i ) 2 maxf0; @g j (x k ) @x i g + k j 2 (U k i L k i ) ! (85) q k ij = (x k i L k i ) 2 maxf0; @g j (x k ) @x i g + k j 2 (U k i L k i ) ! (86) where k j
are strictly positive parameters (to insure the convexity of the approximation)
whichareupdatedtogetherwiththeasymptotesL k
i
andU k
i
subjectto shear and torsionloads[7]
ofthecomplementarytermin k
j
thatallowstheapproximationto benon-monotonousby
usinginthesame timethe twoasymptotes L k i andU k i .
Mobile asymptotes are updatedwith thesame ruleas inMMA (see formula (80-82)).
Theadditionalparameters k
j
arechosenasfollows. Forthetworstiteration(k=1),one
choose
1
j
=" 8j2f0;1:::;mg 0<"1 (87)
Inthe lateriterations(k2),the parameters k
i
areupdatedaccording to
k j =2 k 1 j if g~ k 1 j (x k ) < g j (x k ) k j = k 1 j if g~ k 1 j (x k ) g j (x k ) (88)
Further(in order to prove theconvergence!), if
~ g k 1 j (x k ) g j (x k ) 8j2f0;1:::;mg
theasymptotes shouldnowinstead be updatedas
L k i = x k i (x k 1 i L k 1 i ) U k i = x k i + (U k 1 i x k 1 i ) (89)
Eventhoughthatforthisscheme,onecanprovethegloballyconvergentcharacterofthe
inmostcases, itconverges more slowly thantheoriginalMMA (onproblemswhereMMA
doesconverge). Thereasonforthisisthatsincetheparameters k
i
areincreased,andnever
decreased,theapproximationsbecome increasinglyconservative. Thismayeventuallylead
to very smallstepsintheiterationprocess.
3.7 Example
To illustrate the dierence between the various approximation schemes, we now plot the
contoursg(x)=0ofthereal and approximatedconstraintsfora simpleanalyticexample.
The functionunderstudyis
g(x)=5x
2 x
2
1
Firstorder derivativesof functiong are:
@g @x 1 = 2x 1 @g @x 2 = 5
The current approximation point is (x 0
1 ;x
0
2
) = (4;4) where the function and derivatives
valuesare: g(x 0 ) = 4 rg(x 0 ) = [ 8; 5] T
Reciprocal scheme is not conservative and lies in the unfeasible region of the true
constraint. Linear,CONLIN,MMAand GCMMAaregraduallymore conservative andlie
inthefeasiblepart ofthedesignspace.
Remark that as CONLIN, MMA, etc. are monotonous approximation, so that the
contoursareopen curves, whileGCMMAwhichisanon monotonousapproximationleads
to a closedcontourcurve.
3.8 Second order schemes
An alternative strategyto remedy to CONLIN'sproblems consistsinimprovingthe
qual-ity oftheapproximation and inintroducingsecond order derivative information. Smaoui,
FleuryandSchmit[36 ]proposedageneralizedMMAinwhichtheasymptotesare
automat-icallyselectedtomatchthesecondorderderivatives. AbitlaterFleury[19 ]generalizedthe
useof second order derivativesto several convex otherapproximations(likethe quadratic
diagonalschemeandtheexpansionintermsofapowerofthedesignvariables)andshowed
there is a bigpay-o related to the use of the second order informationin structural
ap-proximations.
3.8.1 The Generalized Method of Moving Asymptotes (GMMA)
Theapproximationofthegeneralizedmethodof movingasymptotescan bewritteninthe
followingform: ~ g(x 0 )=c 0 + n X i=1 a i x i b i (90)
g(x) =5x 2 x 2 1 at point (x 1 ;x 2 ) = (4;4)
AstheoriginalMMA[38],thisapproximationcan easilymatch thefunctionvalueand
itsrst derivativesatcurrentdesignpointx 0
. Ifthesecond orderinformationisavailable,
Smaoui et al. [36 ] showed that the asymptotes could be selected automatically too. The
parameters ofthe expansionare given by:
a i = (x 0 i b i ) 2 @g(x 0 ) @x i b i = x 0 i + 2 @g(x 0 ) @x i =max (; @ 2 g(x 0 ) @x 2 i ) 0<1 (91)
whiletheparameterc
0
isdenedtomeettheconstraintvalueatthedesignpoint. Thesmall
positivenumberguarantees theconvexityoftheapproximation. Thisapproximationisa
generalizationofthepureMethodofMovingAsymptotesof[38],since,here,eachconstraint
hasgotits ownset ofasymptotes.
3.8.2 Quadratic Separable Approximations
ThequadraticapproximationisthesecondorderTaylor'sexpansionofthegivenstructural
response. Nevertheless, the full second order expansion introduces coupling terms and
the approximation is no more separable. Furthermore, the full second order sensitivity
analysis can be expensive forstructural problems. So Fleury [19 ] proposed to neglect the
~ g(x)=g(x 0 ) + n X i=1 @g(x 0 ) @x i (x i x 0 i ) + 1 2 n X i=1 ( @ 2 g(x 0 ) @x 2 i +Æ ii )(x i x 0 i ) 2 (92)
The additional termsÆ
ii
aregenerally addedinorder to reinforcetheconvexity of the
separableapproximationandto playtherole of'movelimits'inestablishingatrustregion
around thedesignpoint. Forexample, thepenaltytermsÆ
ii
can beestimated sothat the
unconstrainedoptimumiskeptwithinagiven distance of theexpansionpoint.
jx i x 0 i j = j @g @x i =( @ 2 g @x 2 i +Æ ii )j < x 0 i (93)
3.8.3 Approximation procedure of diagonal second derivatives
A study [15, 16 ] based on the theory of Quasi-Newton updates preserving the diagonal
structurehasshownthat thebestapproximationof secondorder diagonaltermsis simply
given by: B ii ' @g(x k ) @x i @g(x k 1 ) @x i x k i x k 1 i (94)
This rather intuitive result consists in "making nite dierences between the computed
rst derivativesintwo successive iterationpointsandinignoringthecross derivatives".
This theoretical result was adapted to the characteristics of structural optimization
problems to yield quickly convergent estimates of the Hessian. This adaptation relieson
thekey roleof thereciprocaldesignvariablesto reduce thenon-linearityofthestructural
responses. The Hessian is updated in the space of reciprocal design variables and then
converted into curvaturesintermsofthedirectvariablesto beusedintheapproximation.
The initial guess of the Hessian is also very important. Starting in the reciprocal design
space from a diagonal matrix of smallterms restores thecurvaturesof CONLIN which is
generallya good startingpoint.
As in Ref. [15, 16 ] this approximate second order information can be introduced in
two high quality approximations: the second order Method of Moving Asymptotes and
thequadraticseparableapproximation. Theperformancesofthisprocedurearevery close
fromtheresultsobtainedwhileusingthesameapproximationswithanalyticsecondorder
derivatives[19 ]. Inevery cases these performancesare at leastasgoodas thebestresults
withMMA [38 ] orCONLIN [24 ].
3.8.4 Comparison of approximation in topology optimization [10 ]
Firstorder convex approximations
In Refs. [10 , 11 ], we tested rst order approximations in solvingmaterial distribution
that 20 to 40 iterations more are generally necessary to arrive to a stationary solution.
Nevertheless,ifthese performances(interms ofthenumberofiterations)arecomparedto
resultsobtained withstandard implementations of optimalitycriteria (like inRef. [3 , 5 ]),
CONLIN or MMA give equal or better performances. Among the approximations, the
expansion in terms of power p of the design variables is often too conservative and the
convergence rate is the slowest. Performance of CONLIN is generally very satisfactory.
Due to the weaker curvatures of CONLIN and MMA in the rst iterations, it is worth
usingmove-limits to have a smoothconvergencehistory.
From our experiences, the CONLIN (Fleury and Braibant [24 ]) approximations give
rise to good results in topology design. For the compliancesthat are self-adjoint, all the
derivatives are negative and CONLIN restores the reciprocal design variables expansion
that is well known to reduce the non-linearity of the structural responses. But
convex-ity and conservativity properties of the approximation in CONLIN are important when
treatingeigenfrequenciesorconstraintswhoserstderivativeshavemixedsigns. Themain
disadvantage of CONLIN is that the approximations introduce xed curvatures, so that
theapproximationmaybetoo much ortoolittleconservative. Thiscangiveriseto aslow
or unstable convergence towards the optimum. To remedy this, we select the MMA [38]
approximationschemewhichgeneralisesandimprovestheCONLINschemebyintroducing
two sets ofasymptotes. The choice of themovingasymptotes providesthe wayto modify
thecurvatureand to t better to thecharacteristicsof theproblem.
Because of the exibility introduced by the moving asymptotes, MMA ts better to
the convexity of the problemand MMA is oftena bit quickerthan CONLIN. It was also
observed thatforverydiÆcultproblems, MMAwasmorestablethanCONLIN.Therefore
itis generallychosen asdefaultalgorithm.
Nevertheless,wecanconcludethatbothCONLINandMMAleadtosatisfactoryresults
fortopologydesignandimproveoftengreatlytheperformanceofthesolutionprocedure. In
manyproblemsweobservedthatasolutionisoftenachievedin30to50iterationsdepending
on the diÆculty of the problem and the precision of the stopping criterion. One strong
advantageofCONLINandMMAarisesfromtheveryreliabledualsolversthatareusedto
solve the associatedconvex sub-problems. On anotherhand, one major drawbackof rst
order approximationsis thatwe can observe a decelerationoftheprogression towards the
optimum oncethe algorithm arrivesinthe neighbourhood of theoptimum. To accelerate
theconvergencerateinthenalstage,oneneedsbetterapproximationsbasedoncurvature
information(Fleury[19 ]).
We should also brie y discuss the selection of a termination criteria. We prefer not
usingterminationcriteriabasedonlimitedimprovementoftheobjectivefunctionsincethe
optimum is very at and the value of the objective functiondecreases very slowly during
morethanhalfoftheiterations. So,suchamethodwillleadterminationoftheoptimization
process ata tooearlystage. Whenall issaidand done,one looksfortheoptimalmaterial
distribution,sowethinkthatitisbettertoadoptaterminationcriteriabasedonthedesign
or the maximum modication (innite norm). As the problem under consideration is a
constrained optimizationproblem, one can also useKuhn-Tucker conditions. Satisfaction
ofanyone ofthese lastterminationcriteria avoids prematurestopping.
Second order convex approximations [11 ]
Second order are high quality approximations that are indeed more precise and that
lead to faster convergence rates. Nevertheless, second order sensitivityis very onerous to
compute and to store sothat theoverall overall cost of theoptimization run withsecond
ordersensitivitycanbesimilarto thecostofanoptimizationrunthatwouldbemadewith
rst order approximations even if the number of iteration is greater [30 ]. When the size
oftheproblemincreases,computingandstoring second orderderivativesbecomes quickly
cumbersome and theproblembecomes impossibleto manage.
To be able to use second order approximation schemes with large scale optimization
problems,wedevelopeda newprocedure to generateanestimationof thecurvature
infor-mation with a small computation cost [15 , 16 ]. As separable approximations needs only
diagonalsecondderivatives,theideaistobuiltanestimationofthecurvatureinformation
witha quasi-Newton updateable to preserve diagonal structureof the Hessianestimates.
Thisupdateschemeisderivedformthegeneraltheoryofquasi-Newtonupdatewithsparse
Hessian estimatesmade byThapa [40 ]. The diagonalversionof theBFGSupdate[15 , 16 ]
thatwe implementedisvery un-expensiveevenforlargescaleproblemssinceitintroduces
only vectors manipulations. In [10 ], it was observed that for a given topology problem,
the time spent in the diagonal BFGS update is only 3 % of the time spent in the
opti-mizer CONLIN[18 , 20 ] andonly0.01 %ofthe timeneededfor sensitivityanalysis witha
commercialniteelement package.
Theestimatedsecondorderinformationisintroducedintotwowellknownsecondorder
approximations. The rst one is a second order version of MMA proposed by Smaoui et
al.[36]. Thesecondapproximationistheseparablequadraticapproximation suggestedby
Fleury[19 ]. CombiningdiagonalBFGS updatewithboththese approximationsgivesvery
interesting results that leads to important savings in terms of number of iterations and
of computation time. This conclusion can be explained as follow. Firstly, the estimation
of the curvature improvesgreatly the quality of the approximation with onlythe help of
the accumulated rst order information. Secondly, instead of ignoring the second order
coupling terms, diagonal BFGS provides a way to take them into account by correction
terms on the diagonal coming from the diagonal update. Due to our initial guess of the
Hessian,one can observe, intherst iterations,a convergence historythat isvery similar
to rst order approximations. But after some iterations, the update procedure improves
theestimation of theHessian and one can see a real advantage inthe convergencespeed.
Around an accumulation point satisfying the optimality conditions, we could observe a
convergence speed superior to rst order methods, sometimes closed from super-linear
behaviour.
Asaconclusion,secondorderapproximationscanadvantageouslybeusedforlargescale
whichisclosetothereciprocaldesignvariableexpansionresultsinsimilarcharacteristicsas
rst order approximations, duringthe rst iterations. Thischoice generally yieldsa good
descent rate of the objective functionin the beginning. Then, since theHessian estimate
isimprovedand theapproximation isenrichedbythiscurvature information,itleadsto a
better convergence rate duringnal convergence and the number of iterations to reach a
stationary solutionisreduced.
Attention must nevertheless be paid to the well-known fact that second order
algo-rithmsaremore sensitiveto localoptima. Thisfact wasobserved also withourprocedure
and particularly with the quadratic separable approximation. This drawback can be
at-tenuated by adding move-limits or by adding additional convex terms in the quadratic
approximation.
3.9 Topological optimization of short cantilever beam
Finallywe can illustrate theapplication of four approximations -CONLIN, MMA, MMA
second order with diagonal BFGS (DQNMMA), and quadratic separable approximation
with diagonal BFGS (DQNQUA)-, in the context of a topology optimization problem,
the short cantilever beam problemwhose geometry of the problem is given at gure 12.
Thisdesignis a classicalbench-mark of topologyoptimization. For thesake of simplicity,
the material law is simply given bya cubic relationbetween the rigidity and the relative
density [4]: E = 3 E 0 and = 0
. The Poisson's ratio and the Young's modules of
the solid are:
0
= 0:3 and E
0
= 100GPa. The compliance under the given load case
is minimized whilethe volume is bounded to 37:5% of the volume of thedesign domain.
The design domain is discretized by a regular mesh of 1040 nite elements of degree 2.
The niteelement analysis and the sensitivityanalysis are performed with the SAMCEF
package.
Theproblemissolvedwiththefourdierentapproximationsforcompliance(thevolume
is linearized). Since all the rst derivatives of compliance are negative, CONLIN and
the reciprocal variables expansion are the same. We also implemented a MMA scheme
similarto Svanberg'sone. Then, we usethe highquality approximations DQNMMAand
Figure14: Compliancehistory
Historiesofcompliancearegivenat gure14. At rstglance,thedierentconvergence
curvesareverysimilarandthedierent algorithmstendtowards localoptimawithnearly
the same compliances. Nevertheless, when material distributions are visualized, the
op-timal distributions reveal a very similartopology, except in the quadratic approximation
(Figure 13 presents the material distribution obtained with CONLIN). Thus several
op-timaexistwhen intermediatedensities arehighlypenalizedandattention mustbepaidto
local congurations. Also, second order approximationsare partly more sensitive to local
optima.
Firstorderapproximationsgivesmoothandmonotonehistorycurves. Atthebeginning,
the descent rate is good but, after more or less 10 iterations, compliance reduction is
seriously slowed down when close to the optimum. Progression becomes much slower.
ofthe designvariablesbetweentwo iterations (gure 16) diminishslowly,even withsmall
oscillations.
Secondorderapproximationsalsoresultsinconvergencecurveswithaverygooddescent
rate. The progression towards the optimum is not handicapped too much by the non
monotonebehaviouroftherstiterations(iterations3and4). Thatfactmaybeduetoan
uncertainvalueoftheestimationofthecurvaturesbythediagonalBFGS.Afterthisphase
which isnecessaryto stabilizetheestimationsof thecurvatures, second orderinformation
gives rise to a very good descent rate. When in the at part of the compliance curve,
second order information preservesa good convergence speedand continues to accelerate
the progression towards a stationary solution. This can be clearly noted on the mean
modicationsofthedesignvariablesorontheKKTmodulus. Stationarityisreachedmuch
fasterwithsecondorderschemesanddiagonalBFGSthanwithrstorderapproximations.
AspredictedbyThapa's theory[40 ],werecoverinfacttheasymptoticsuperlineardescent
rate of quasi-Newton methodsin the neighbourhood of the optimum. This characteristic
saves a high number of iterations. The quadratic scheme and the MMA second order
method come to stationarity in30 to 40 iterations whileCONLIN and MMA needs more
than40 iterationsto satisfya weaker terminationcriteria.
Finally one can also compare the two proposed termination criteria: the modulus of
the Lagrangian function (in gure 15) and the mean modicationof thedesign variables
betweentwo iterations(in gure16). Onecan observe thatbothhavea parallelevolution
and they are in perfect correlation. As predicted by the theory they give an equivalent
informationand theybe bothused to predict convergenceintopology optimization
prob-lems.
With the DQNMMA and DQNQUA approximations, the objective function is
modi-iterations
cations can be observed, whereasthe convergence of theother rst order schemesis not
achieved yet. CONLIN and MMA's nalconvergencerates aremuch slower.
4 PERIMETER APPROXIMATION [11 ]
Perimeterconstraint hasa majorplace intopologydesign, becauseit isaveryinteresting
alternative to optimalrealxation usingoptimal microstructures asin the homogenization
method [1 ,5 ]. Inengineering applications, perimeter controlalso seemsfurtherattractive
than the rigoroushomogenization method because it allows to use penalization of
inter-mediatedensitiesandto generatecleardensitydistributionswithwellseparatedvoidsand
solidszones, so that an unambiguous macroscopic topology often appear. Ambrosio and
Buttazzo [2] demonstrated that the design problem with perimeter penalization is
well-posed. Applicationofperimetercontrolto topology designof structures waspresentedby
Haberetal.[25 ]. TheworkpresentedhereaimsatremedingsomediÆcultiesthatoccurin
thenumericalimplementationsoftheperimetercontrol. WefocusonprovidinganeÆcient
numericalstrategytotake perimeterconstraintinto accountinorderto useperimeter
con-trolasareal practical designtool. The perimeter isa ratherdiÆcultconstraint to satisfy
and to approximate, asitwillbeseen.
Webasedournumericalexperiencesonpowerlawmodels(alsocalledSIMP materials)
toprovideinthesametimeacontinuousapproximationofthedesigndistributionproblem
as well as a penalization of the intermediate densities. SIMP material is also easy to
implementinanyindustrialcode. Finally,SIMPmaterialintroduces onlyonevariableper
element,sothat thesizeoftheproblemis keptat a minimum.
AlthoughintheoriginalworkofHaberetal.[25 ]theperimetercontrolwastreatedwith
as a constrained problem in which perimeter is one of the inequality constraints. If one
wants to control theperimeter by prescribinga target value with a penalization, one can
use a relaxation technique and introducean additional variable Æ, which is quadratically
penalized in the objective function (the pds factor is a tuning parameter to control the
relative weight of thepenalization compared to themagnitudeofthe objective function).
min 1 1Æ 2 f T u + pdsÆ 2 s.t. V V (95) P P + (Æ 1)P
P isan allowablemaximumviolation ofthe target bound
P givebythe user. Standard
versions of solvers like CONLIN [22 ] orMMA [38 ] are able to handle thesolution of this
kindofoptimization problemswithrelaxation.
Whenthedensityvariescontinuously,Haber,etal. [25 ]proposeto replacethe
geomet-rical measure of the perimeter by the total variation of thedensity . Fora densityeld
which iselement byelement constant,wecan write:
P = X k l k q <> 2 k + 2 (96) where <> k
is the jump of material density trough the element interface k of length
l
k
. The parameter is a small positive number to guarantee the dierentiability of the
measure. Valuesof aregenerally taken between10 2
to 10 4
.
Figure17sketchestheperimetermeasureofasquareelementofdensitysurroundedby
fourelementsofdensity
1 , 2 , 3 ,and 4
approximate with classical schemes. Monotonous approximationslike CONLIN orMMA
give rise to oscillatory behaviours. Furthermore, the perimeter is globally non linear and
there are important couplings between neighboringnite element (F.E.) densities. Thus,
thetrustregionof separableapproximationsisnarrow.
Nevertheless,inordertotreatproblemswithalargenumberofvariablesandtousedual
solvers, we need a convex and separableapproximation. From our numerical experience,
good resultsareexpected withaquadratic separableapproximation ofthe generalform:
~ P()=P( 0 ) + n X i=1 @P @ i ( i 0 i ) + 1 2 n X i=1 a i ( i 0 i ) 2 (97)
Themainproblemisnowto choosethesecondorder termsa
i
carefully. Theirvaluesmust
beacompromisebetweenprecisionandconservativity. Toosmallvaluesofa
i
wouldimply
important constraint violations whiletoo large valuesof these second order terms would
lead to a freeze the motion of the variables. On one hand, by selecting a
i
, one will try
to t thetrue shape of theconstraint whileon theother hand, these coeÆcients willplay
the role of move-limits that restrict thevalidity of the approximation. Also, the analytic
second orderderivativesarenotusefulsincethey arezero, except inangularpointswhere
they arevery large (ordo notexist). So thechoice of thearticial curvaturesis basedon
a heuristicrule,whichis explainedinthenext section.
4.1 A heuristic estimation of curvatures for perimeter approximation
In the following, we develop a heuristic estimation of curvatures for approximating the
perimeter when there is one density variable
i
per element as it is for SIMP materials.
These estimates of the curvatures are based on a bound over individual contributionsof
each element. According to thequadratic approximation, thecontribution of element "i"
to globalperimeteris:
~ P i ()=P i ( 0 i ) + @P @ i ( i 0 i ) + 1 2 a i ( i 0 i ) 2 (98) P i ( 0 i
) isthe contribution of element "i" to theperimeterwith thecurrent distributionof
density. This contribution of element "i" is maximum when the density jump across the
element interfaces becomes equal to unity. Suppose now that this situation happens at
point ?
i
. Then one can writethesecond order coeÆcients intermofthenew parameter:
a i = 2 P k2Ki l k ( p 1+ 2 q j 0 i 0 k j 2 + 2 ) @P @i ( ? i 0 i ) ( ? i 0 i ) 2 (99)
wherethesum isrealized over thesetK
i
ofthe interfacesof element "i".
The question is now to nd thepoint ?
i
where thissituation could probablyhappen.
If theseparabilityhypothesis were true, thepoint ?
i
could be chosen at theboundaryof
the admissible set of
i
, that is when
i