Solution of topology optimization problems with sequential convex programming

(1)

PROBLEMS WITH SEQUENTIAL CONVEX

PROGRAMMING

Pierre DUYSINX

Michael BRUYNEEL

Claude FLEURY

Institute of Mechanics and CivilEngineering

University of Liege,

Chemindes Chevreuils 1,Building B52

B-4000Liege, Belgium

1 INTRODUCTION

Structural optimization consists in formulating the design problem of structural

compo-nentsasan optimizationprobleminorder to usethepowerof mathematical optimization

tools.

From a mathematical point of view, a quite general statement of the optimization

problemisgiven asfollowing

(P) 8 > > < > > : min x g 0 (x) s.t.: g j (x) g j j =1:::m x i x i x i i=1:::n (1) The function g 0

is the objective function of the problem, i.e. a cost function or a

performanceindexthatonewantstominimizeinordertohaveabetterdesign. Intopology

optimization,thisisforexamplethecomplianceofthestructureundertheconsideredload

case.

The set of constraint functions g

j

(in number m) are expressing the restrictions the

designis subject to inorder to be feasible. Forexample these functions are some bounds

upon a stress measure to have resistance, restricted displacements, a volume resource or

perimeterbound:::

The n variables x

i

are the design variables of the problem, that is, the parameters,

whichcanbemodied,toimprovethedesign. Inthetopologyoptimizationcontext,thex

i

(2)

Thedesignvariablevaluesarealsosubjecttosomeverysimplerestrictions,becauseofsome

physicalormathematicalreasons(thedensityliesbetween0and1,andtheorientationcan

besearchedbetween0and becauseofperiodicity). Otherconstraintsondesignvariables

can come from technological or manufacturing arguments like prescribed density regions

in topologyoptimization. Allthese specialconstraints on design variablesare called side

constraints. They generally callfor a special treatment inthe algorithm because of their

very simplestructure.

The direct solution of optimization problem (P) is totally prohibitive in structural

optimization,andfurthermoreintopologyoptimization,becauseofthecomputationalcost

ofthestructural and sensitivityanalysis oftheproblem. Indeedtheproblemofstructural

optimizationishighlynon-linear,andimplicitintermsofthedesignvariablessothateach

functionevaluation wouldrequirea Finiteelement (F.E.) analysis.

As soon as the seventies, Schmidt and his co-workers (e.g. Schmidt and Farshi [34 ],

Schmidt and Fleury [35]) proposed an interesting way to circumvent the problem while

usingmathematicalprogrammingtools. Theapproximation conceptsapproach replacesthe

primary optimization problem(P) witha sequence of explicit approximate sub-problems

having a simple algebraic structure, and built from the available information (function

values, rst and second order derivatives) at the current design point or at the former

iterationpoints. ( ~ P) 8 > > < > > : min x ~ g 0 (x) s.t.: g~ j (x) g j j =1:::m x i x i x i j =1:::n (2) whereg~ 0 and g~ j

denotes theapproximationsof objective functiong

0

and constraint

func-tionsg

j

. Theseapproximationscanberegardedassomekindofexpansionsoftheresponse

functionsaroundthecurrentdesignpointx k

(3)

the most famousones remain CONLIN by Fleuryand Braibant [24 ] and MMA (Method

ofMovingAsymptotes) bySvanberg[38]

However the success of the approximation strategy comes from the fact that the

sub-problems( ~

P)canbesolvedeÆcientlywithadaptedmathematicalprogrammingalgorithms.

Up to now themosteÆcient strategyis thedualmethod proposedinitiallybyFleury[17]

but used by many other authors like Svanberg [38 ]. Dual methods are well adapted to

structural problems, because the dimensionality of the dual solution space is generally

much lower than the primal design space. With eÆcient algorithms, dual solvers are

able to solve sub-problems within reasonable computational time. For sizing and shape

optimization, solution time is less than 1 percent of the F.E. computation time. When

dealingwith compliancetopologyproblems, thesame character ispreserved.

Tosummarize,thesolutionstrategyofoptimizationproblemsisbasedonthefollowing

steps:

1. For the current design characterized by the design variables x k

, perform a Finite

Element analysis andits sensitivityanalysis.

2. From theresultsof thecurrent structuralanalysis,generate an approximate( ~

P).

3. Solvethe sub-problem( ~

P)with aneÆcient solver, like dualsolver.

4. Adoptthesolutionoftheapproximatesub-problemx ?

asathenewdesignx k+1

and

go backto 1untilconvergence.

The strategy combines both concepts of approximation and dual solution and it is now

generally known as the sequential convex programming approach (SCP) as suggested by

(4)

cations. Onecancometo anearlystationnarysolutiongenerallywithin20 iterationsteps,

independently of the number of design variables. For topology optimization convergence

speedcan be a bit slower, because of the highercomplexity of the problem;it is notrare

to have to wait for50 or100 iterationsforhavinga stablesolution.

Beforestudyingthenumericalsolutiontechniquesintodetails,itisimportanttoremind

the main characteristics of optimal material distribution problems in order to be able to

selectthemostsuitedalgorithmstosolvetheseproblems. Topologyoptimizationproblems

arelargescaleoptimization problems. ItresultsthatextensionoftraditionalSCPmethods

mustbe consideredcarefully.

Thenumberofdesignvariablesislargeorverylarge,i. e. inusualtopologyproblems,

one hasto considerfrom1.000 to 100.000 densityvariables.

Thenumberofconstraintsisgenerallyquitelowwhentheproblemformulationrelies

on global constraints. For example, in compliance type problems, one has to take

into account asmanycompliance responsesasthenumberofloadcases plusvolume

and perimeter constraints. This means in turn that one has to consider around 10

restrictions. The number of restrictions being much smaller than the number of

designvariables,we areinafavorablesituation forapplyingdualalgorithms.

Thissituationistotallydierentwhenlocal constraintslikestressconstraintsare

con-sideredinthe designproblem. Indeedone hasto copewith one constraint pernite

element,whichmeansanumberofconstraint thatisofthesameorder ofmagnitude

asthenumberofdesignvariables. Thiscouldalsobethecaseifslopeconstraintsare

includedinthedesignasinSigmundandPetersson [31 ]. Thustheproblemisalarge

scale probleminthe designvariable space and in the dualspace. Under these

con-ditions,advantages of dualalgorithms arelower, butwork donein Ref.[13 ] showed

thatdual solvers are stillableto producesolutions witha computationtime that is

of the same order of magnitude as the F.E. analysis even when there are around a

thousandofconstraints.

Thereisalso anothercharacteristic relatedto stressconstraint whichcomplicatesthetask

of the optimizer: the singularity phenomenon of stress constraints. This diÆculty is

al-leviated byusinga perturbation technique (-relaxation technique as proposed by Cheng

and Guo [9]). So to considerstress constraints, one hasto designa additionalstrategy to

manage theextraperturbationparameter.

This lectureis devotedto explainthebasicelementsof SCPmethod andinparticular

its application to topology optimization problems, which are very large scale problems.

Despite thehighlevelof integrationof thedierentconcepts inavailableimplementations

ofSCP,thisstudywillstilldistinguishinthispresentationthetwomainconceptsaspointed

outby[20 , 22 ]:

(5)

insection 3.

Finally as our goal is to apply Sequential Convex programming to topology problems,

dierent issues particular to topology problems will be reviewed in section 4 and 5. At

rst, we will seehow an eÆcient approximation can be builtfor theperimeter constraint

(section 4). Then we will have a careful study of the management of the -relaxation

technique ofstress constraint (section 5).

2 DUAL SOLUTION ALGORITHMS

2.1 Lagrange function

Supposethatwe have to solve thefollowingnon linearoptimization problem:

min x f(x) s.t. g j (x) 0 j=1;:::;m (3)

wherefunctions f(x) and g

j

(x) areassumed to becontinuousand dierentiable.

It is classicto denetheLagrangefunctionassociatedto problem(3):

L(x;)=f(x)+ m X j=1 j g j (x) (4) where the j

0 are Lagrange multipliers associated to each constraint g

j

(and so in

number m as the constraints). The new variables

j

are generally called dual variables

since there is a one-to-one association with design constraints. Conversely the original

designvariablesxof theproblemaresaid primal variables ofthe problem.

One can remark that the Lagrange transformation replaces the constraints g

j (x) 0 bya linear term j g j

(x)in the objective function. This can be interpretedas adding to

objectivefunctionf(x)alinearcost,withmarginalprice

j

,whichhastobepaidwhenever

theconstraint isviolated.

Lagrange function transforms the optimization constrained problem into an

uncon-strainedproblem. Theobjectivefunctionofthisnewoptimizationproblemispreciselythe

Lagrangian functionL(x;) and thedesignvariablesareboththexand variables.

min

x max

0

L(x;) (5)

Maximizationoveraectsaninnitecost(penalty)toLagrangefunctionwhenconstraints

are violated g

j

(x) > 0. The price of this transformation is that the dimension of the

(6)

Theorem 1 Necessary conditions of optimality of contrained problems.

Ifx ?

isanoptimum of problem(3),andifx ?

isa regularpoint,Thenonecan ndavector

of Langrangemultipliers ? =( ? 1 ;:::; ? m ) such that @f(x ) @x i + m X j=1 j @g j (x ) @x i = 0 8i (6) g j (x )0 (7) j 0 (8) j g j (x )=0 8j (9)

The Karush-Kuhn-Tucker (KKT) conditions stated as below consist of four types of

conditions,namely

Stationarityofthe LagrangefunctionL(x;) withrespectto x,

Primalfeasibility,which meansthat x ?

isa feasiblepoint,

Dual feasibility,which meansthatthe Lagrangemultipliers ?

are nonnegative,

Complementaryslackness, whichmeansthattheLagrangemultiplierscorresponding

to inactive constraintsare zero.

Remark 1 A point x ?

is a regular point of the problem, if all the gradientvectorsrg

j of

active constraints (i.e. g

j (x

?

)=0) arelinearly independent.

2.3 Introduction to duality

Asremarked above,therst KKT condition(6) impliesthe solutionof thesystem:

r

x L(x;

?

) = 0 (10)

Thisconditionis equivalentto say thatx ?

isthesolutionof minimizationproblem:

min

x

L(x; ?

)

Imaginenowthat, foranyLagrangevector,i.e.

=( 1 ;:::; m ) such that j 0 j=1:::m

one can ndthesolutionofthe minimizationproblem(called Lagrangian problem)

min

x

(7)

thenew dualvariables.

x = x() (12)

Substitutingprimalvariablesintermsoffunction(12),itispossibletorewriteLagrange

functionL(x;)interms ofdual variables solely

`() = L(x();) = f(x())+ m X j=1 j g j (x()) (13)

Thisfunction`() iscalled dualfunction of theproblem.

Optimizationproblemthatisrelatedtodualfunctionisamaximizationproblem,called

dualproblem (whileoriginalproblemis calledprimal problem):

max j `() s.t. j 0 j=1;::: ;m (14)

Solution of dual problem determines the optimal Lagrange multipliers ?

, which satisfy

KKT conditions. One can showthat dualproblemhasthefollowingproperties:

Ifprimalproblemis a minimizationproblem,dual problem isa maximization

prob-lem;

Dual problempossessesa solutionifprimalproblemdoes;

A solutionofdualproblemalso provides asolutionto primalproblem.

2.4 Weak duality

Withverylittleassumptionsonf andg

j

,onecan statethemostgeneraldenitionofdual

functionasfollows:

`() = inf

x

L(x;) (15)

So dual function exists even for non convex problems or discrete valued design variable

problems, etc.

Withveryweakassumptionsonf andg

j

,onecanprovethat`()isconcave. However,

dualfunctionis non-smooth, sothat derivative mustbe replacedbythesub-gradient.

Dual problem isdened asthemaximizationproblem

max j `() s.t. j 0 j=1;::: ;m (16)

(8)

local optimum of `() is a global optimum. Therefore the dual problem will in general

be more easily solved than the primal problem (for example when dealing with integer

programming). Thisisone ofthereasonswhytheconceptof dualityhasbeenfoundto be

veryusefulinmathematicalprogramming. Howeverbecauseofthenon-smoothnessofdual

function, one has to resort to non smooth maximization algorithms, which are generally

lesseÆcient.

General properties of dualfunction

For any feasiblex and forany Lagrangevector ,there holds:

f(x) `() (17)

Goingalong, if ?

is thesolutionof dualproblem, and ifx ?

is theoptimumof theprimal

problem,one has

f(x ?

) `( ?

) (18)

Thismeansthatdual functionisalwaysa lowerboundofprimalobjective functionvalue.

Thegap G = f(x ?

) `( ?

) iscalledduality gap. In addition,primalpointassociatedto

optimalLagrangemultiplierthroughLagragian problem

arg inf

x

L(x; ?

) (19)

maynotbe realizedormaynotbe afeasiblepoint

2.5 Strong duality

Nowsupposethatwewant to solvethe convex problem:

min x2X f(x) s.t. g j (x) 0 j=1;:::;m (20)

wherefunctions f(x) and g

j

(x) areassumed to beC 1

(i.e. continuousand dierentiable),

and convex. The set X of feasibledesignvariablesis also convex.

The setX is usuallymadeofsideconstraintson designvariables,which isobviouslya

convex set: X = fx i j x i x i x i i=1:::ng (21)

In addition,it isassumed that theSlater condition(orqualicationof the constraint)

isfullled:

9x~ : g

j

(9)

then, thereis atleast oneLagrangemultiplier vector ? such that f(x ? ) = min x L(x; ? ) (23)

and thereis no dualitygap

f(x ?

) = `( ?

) (24)

Corollary 1 Necessary and suÆcientconditions of optimality:

If the problem (20) is a convex and if functions f(x) and g

j

(x) are dierentiable and if

Slater condition is satised, then x ?

is the primal optimal point and ?

the optimal dual

point if and only ifKarush-Kuhn-Tucker conditions aresatised

Corollary 2 If theproblem (20)isaconvexandif Slaterconditionissatised, theprimal

problem (20) is completely equivalentto solve the dual problem:

min j `() s.t. j 0 j=1;:::;m (25)

withdual function

`() = min

x2X

L(x;) (26)

This is a consequence of the nature of the Lagrange function. If f(x) and g

j

(x) are

C 1

andconvex functions,Lagrange functionL(x;)hasa saddle pointin (x ? ; ? ),which meansthat: L(x ? ;) L(x ? ; ? ) L(x; ? ) (27)

Thenceitispossibleto showtheequivalencebetweenthefollowingoptimizationproblems:

min x2X max 0 L(x;) , max 0 min x2X L(x;) (28)

Aslefthandsideproblemisanequivalentofprimalproblem,andrightproblemisthedual

(10)

Concavity

Dual function`()is concave.

Lower boundand duality gap

Generally speaking, dual function is a lower bound of primal function, but for convex

problems, thereis no dualitygap:

`( ?

) = f(x ?

)

Thismeansthat thedualsolutionis equivalentto theprimalsolution.

Gradient of dualfunction

The rst derivatives of dual function with respect to Lagrange multiplier is simply the

constraint valueassociatedto thatmultiplierinpointx():

@`() @ k = g k (x()) (29)

The proofof thisformulaiseasy when derivingdual function:

@`() @ k = @ @ k ff(x())+ m X j=1 j g j (x())g = n X i=1 2 4 @f(x()) @x i + m X j=1 j @g j (x()) @x i 3 5 @x i @ k + g k (x())

The rst contribution vanishes because it is the optimality conditions of the Lagrangian

problemleadingto denitionofx().

From result (29), its comesthat the evaluation of the gradient of the dual function is

very easy to calculate.

Optimality conditions of dual problem

Itcomesfromthepreviouspropertythattheoptimalityconditionsofthequasi-unconstrained

dualproblemcan bewrittenas follows:

@`() @ k = 0 if k > 0 (30) @`() @ k < 0 if k = 0 (31)

A rst algorithm for dual maximization

These conditions indicate that the maximum of the dual function is attained when the

constraints are satised exactly (as equalities) for

j

> 0 and as inequalities if

j = 0.

(11)

to maximizethedual function: + = + r` (32) or + j = j + g j (x()) (33)

whereisastepsize. Thusadualvariables

j

increasesifthecorrespondingconstraintg

j is

violated,whereasitdecreases(possiblyreachingzero)ifg

j

isnegative(andsatised). From

theseconsiderations,itresultsanintuitiveinterpretationofthedualmethodapproach: the

approachattemptsto satisfytheinequalityconstraintsbyadjustingthevaluesofthedual

variables.

Sincenonnegativityconstraintsareveryeasytotakeintoaccount,classicalalgorithms

forunconstrained maximizationcan be readilyadapted to solvequasi-unconstrained dual

problem. Inparticular,conjugate gradientmethod iswellsuitedsincecomputationof the

gradient is straight forward: it needs computingtheconstraintsvaluesg

j

(x()) solely.

Separable case

The main diÆculty of the dual approach is to compute the relationshipsbetween primal

and dualmethodsx():

x() = arg min

x2X

L(x;) (34)

Solvingtheminimizationproblemhastoberepeatedalotoftimes,andthismightleadtoa

prohibitivecomputationalcost. Howeveriftheproblemhassomecertainspecialstructure,

asseparableproblems, thisisnotcumbersome. Inadditionto convexity,separabilityisan

essentialpropertyfordualformulation to be eÆcient.

A functionf(x) is said to be separableif thefunction be written assum of functions

f

i (x

i

),whichdependsonlyon thesingle variable x

i . f(x) = n X i=1 f i (x i ) (35)

Separablefunctionsbenetformsomecomputationallyimportantproperties. Inparticular,

theHessian matrixof such functionisdiagonal.

The constrained problem (20) is a separable problem programming problem if each

functionf(x), g

j

(x)is itself separable. Thisimpliesthatthe Lagrangian functionL(x;)

isseparabletoo. L(x;) = n X i=1 L i (x i ;)

(12)

Lagrangian problems min x2X L(x;) = n X i=1 min x i 2X i L i (x i ;) (36)

whereeachone dimensionalproblemcan be solvedseparately. Thusthedualfunctioncan

bewritten asfollows: `() = n X i=1 ` i () = n X i=1 f min x i 2X i L i (x i ;)g (37)

Inmaycases thesingle-variableminimizationproblemhasa simplealgebraicstructure

and itcan besolved inclosedform,thusyieldingan explicitdualfunction.

Second orderderivatives of dual function

To get second order derivatives of dual function, one derives the expressionof rst order

derivativesgiven in(29) one more time,whichgives:

@ 2 `() @ k @ l = @ @ l g k (x()) = n X i=1 @g k @x i @x i () @ l (38)

One remarks immediately that because of the presence of term @x

i ()

@

l

the second order

derivatives can be discontinuous. Indeed, when sideconstraints are taken into account in

Lagrangian problem(11) in order to be treated explicitly,the function relationship x()

can berst order discontinuous. Thiswillbe explainedinmore detailslater.

Evaluationof thisterm @xi()

@

l

can bemade fromKKTconditions. DerivationsofKKT

conditions(6) withrespectto

l gives: n X k=1 2 4 @ 2 f @x i @x k + m X j=1 j @ 2 g j @x i @x k 3 5 @x k () @ l + @g l () @x i = 0

Goingtomatrixnotations,onerecognizestheHessianmatrixoftheLagrangefunctionand

gradient vector of constraint g

l

withrespectto primalvariables:

r 2 xx L(x;) @x() @ l + r x g l = 0

If side-constraints are treated explicitly, one has to restrict thisrelation to free variables

(variables, which are not xed at their lower or upper bounds). So one has to consider

matrixGinsteadofr 2

xx

L. MatrixGisobtainedfromr 2

xx

Lindeletingrowsandcolumns

correspondingto xed variables. Onegets now:

G @x() @ l + r x g l = 0 (39)

(13)

constraints N = [ rg 1 :::rg m ]

and ifone introduces result(39) into (38), one getsthe nalexpressionofHessian of dual

function: @ 2 `() @ k @ l = N T G 1 N (40)

We must repeatthat thisHessian is onlycontinuous by pieces. Discontinuityof dual

Hessian occursalong hyper-planesof equations

x i () = x i and x i () = x i (41)

This has a major impact on dual maximization, since greatest care has to given to that

property to built second order algorithms for dual maximization. Classical Newton or

quasi-Newton methods can only be applied on subregions where the set of free and xed

designvariablesisfreezed. BuildinganeÆcientstrategytondtheoptimalsetoffreeand

xed designvariablesis one ofthe majorissue ofdual optimizerconstruction.

2.7 Application to quadratic problems with linear constraints

Weillustratethedualapproachonthequadraticproblemwithlinearinequalityconstraints.

The problemstatement is thefollowing:

min x 1 2 x T x s.t. C T x d

whereCisanmmatrixoftheconstraintgradients(whichareareassumedtobelinearly

dependent). TheLagrangian functionof theproblem is:

L(x;) = 1 2 x T x T (C T x d)

Optimalitycondition(KKTconditions) are:

r x L =x T C = 0 C T x + d 0 0 T (C T x d) = 0

TherstconditionisthesolutionoftheLagrangianproblem,whichleadstotherelationship

betweenprimaland dualvariables:

(14)

Replacing primalvariablesinto Lagrangian function,one gets thedual function: `() = 1 2 x() T x() T (C T x() d) = 1 2 T C T C T (C T C d) = 1 2 T C T C + T d

Gradient of dualfunctionis givenby:

r`() = C T C + d = d C T x()

which isthevalue ofprimalconstraints, inagreement withthegeneral theory.

Fromthisexpression, one getsalso easilytheHessian matrixof dualfunction:

r 2

`() = C T

C

This Hessian matrix is constant. Thus dual functionof a quadraticproblem is quadratic

function. It hasthefully explicitform:

`() = 1 2 T A + T d whereA=C T

Cdenotesthenegative ofthedual Hessian matrix.

In the case of equality constraint, dual variables are unrestricted in sign. Therefore

the maximum of dual function can simply be obtained by stating that its gradient must

vanish:

r`() = A + d = 0

Thisleadsto thesolution:

?

= A 1

d

From thedual solution, one recovers theoptimalprimalsolutionwith thehelp of

primal-dualrelationships:

x ?

= C ?

In the case of inequalityconstraints, the dual variables are subject to non negativity

sideconstraints, and onehas to solve dualmaximizationproblem:

max 1 2 T A + T d s.t. j 0

(15)

ers primalsolutionfrom thesame primal-dualrelationshipsasbefore:

x ?

= C ?

Dual maximization is the most natural and rigorous way to select automatically the

optimum set of Lagrange multipliers. In addition, as the number of dual variables is

oftensmaller thanthe numberof primalvariables, thedimensionalityof the optimization

problemwe actuallysolve is smaller.

Example of quadratic separableproblem

min x1;x2 1 2 x 2 1 + 1 2 x 2 2 s.t. x 1 + x 2 4 x 1 x 2 4

2.8 Treatment of side constraints

Let us consider the following separable, quadratic problem, with linear inequality and

side-constraints: min x 1 2 n X i=1 x 2 i s.t. n X i=1 c ij x i d j j=1:::m (42) x i x i x i i=1:::n

The side constraints, which imposelower and upper boundson the design variablescan,

of course, be considered as linear inequality constraints. However this would increase

dramaticallythenumberofdualvariables,andthusthiswouldreducepotentialadvantage

ofdualmethodbyincreasedimensionalityofdualworkspace. Therefore,eÆciencycallsfor

a particular and separate treatment of these very simple constraints, apart from general

constraints.

The Lagragian problemis

min x i x i x i 1 2 n X i=1 x 2 i m X j=1 j ( n X i=1 c ij x i d j ) (43)

Because of the separabilityproperty, the n-dimensional problem can be split into n

one-dimensionalproblemsrelativeto each variable "i":

min x i x i x i L i (x i ;) = 1 2 x 2 i ( m X j=1 j c ij ) x i (44)

(16)

The solutions of these problems areobtained byexpressingthe optimalityconditions,

i.e. vanishingthe rstorder derivativewithrespect to x

i : @L i @x i = x i ( m X j=1 j c ij ) = 0 (45)

Thisyieldsthe primal-dualrelationshipinclosedform:

x ? i = ( m X j=1 j c ij ) (46)

This expression holdsif the sideconstraint are ignored. Now to enforce side-constraints,

one hasto considerthree situations (seegure 3):

x i ()=x ? i if x i x ? i x i ; (47) x i () = x i if x ? i x i ; (48) x i () = x i if x ? i x i : (49)

Even ifthesolutionintroducesdierentconditions,theprimal-dualrelationsx=x() are

still available under closed form. Nonetheless considering dierent cases leads to a

non-smoothexpression. Indeedthederivationofthisexpressionisdiscontinuouswhenchanging

from one conditionto another,thatis when axed variablebecome freeorconversely.

The dual function is formed by inserting these primal-dual relationships into the

La-grangianfunction,whichleadsto thefollowingstatementof thedualproblem(onecannot

ndanymore an explicitexpressionof dualfunction):

max 1 2 n X i=1 x 2 i () m X j=1 j ( n X i=1 c ij x i () d j ) s.t. j 0

(17)

Firstderivativesofdualfunctionaregivenbyprimalvariableconstraint,inwhichprimal

variableshasbeenreplacedbytheirexpressioninterms oftheLagrange multipliers:

@` @ j = d j ( n X i=1 c ij x i ()) (50)

Second derivativesof dualfunction, giving Hessianmatrix, areeasilyavailabletoo:

A jk = @ 2 ` @ j @ k = @ @ k " d j n X i=1 c ij x i () # = n X i=1 c ij @x i @ k (51)

Now the situation is much more complicated, because of the separate treatment of side

constraints, which introduces non smooth primal-dual relations. On one hand for free

variables,wehave: @x i @ k = c ik if x i x i x i (52)

Ontheother hand, forxedvariables, itis obvious that

@x i @ k = 0 if x i =x i or x i = x i (53)

ThereforetheHessian matrixtakestheform:

A jk = @ 2 ` @ j @ k = = X i2F c ij c ik (54)

(18)

eachtimethatafreeprimalvariablebecomesxedorconversely,sinceonechangestheset

F of freevariablesinformula (54). From theprimaldualrelationshipsit isclearthat the

dualspaceispartitionedinseveral regionsseparatedbysecond-order discontinuityplanes.

Theseplanesare denes herebytheirequation:

( m X j=1 j c ij ) = x i ( m X j=1 j c ij ) = x i (55)

Example quadratic problem withseparable treatment of sideconstraints

min x 1 ;x 2 x 2 1 + x 2 2 s.t. x 1 + x 2 1 1=4 x i 4

2.9 Dual solution scheme of MMA sub-problems

Anticipatinga littlebit on structuralapproximations, itis ratherinteresting to studythe

principlesofdualsolutionsappliedtoMMAapproximatedproblems. After normalization,

theMMA sub-problemcan be writteninthefollowingform:

min x n X i=1 p i0 U ij x i + n X i=1 q i0 x i L ij s.t. n X i=1 p ij U ij x i + n X i=1 q ij x i L ij d j j=1:::m (56) x i x i x i i=1:::n

Onemustnoticethattheasymptotes maydependonbothvariableandconstraint indices:

U

ij and L

ij .

In order to simplify the notations, we adopt the following notation:

0

= 1, so that

Lagrangian functionisgiven by:

L(x;) = m X j=0 j ( n X i=1 p ij U ij x i + n X i=1 q ij x i L ij d j ) (57)

The Lagragian problemis

min x i x i x i L(x;) (58)

(19)

dimensionalproblemsrelativeto each variable "i": min x i x i x i L i (x i ;) = m X j=0 j p ij U ij x i + m X j=0 j q ij x i L ij (59)

For apure MMA[38 ] approximation,theasymptotes are thesame foreach constraint

and they do not depend on the approximation index j: U

ij = U i and L ij = L j . The

solution of this problem can be solved explicitly for each variable and gives rise to the

primal-dualrelations. Optimalityconditionsof theLagrangian problemare:

P m j=0 j p ij (U i x i ) 2 P m j=0 j q ij (x i L i ) 2 = 0 (60)

Fromthisexpression,oneextractsthecandidateoptimalvalueofvariableinterms ofdual

variables: x ? i () = U i + L i +1 (61) where = s P m j=0 j p ij P m j=0 j q ij (62)

For an approximation of the GMMA family [36], the primal-dualrelationshipsare no

longer explicit since each asymptote depends now on both the primal variables (index

"i") and on the constraint (index "j"). As the Lagrangian problem no longer admits a

closedsolutionform,a Newton-Raphsonscheme isadopted (as inRef.[42]). However the

approximationisseparableandnone-dimensionalnumericalminimizationsareperformed.

The iterationscheme forprimal-dualrelationshipsx

i () isgiven by: x i ( + ) = x i () @L i =@x i @ 2 L i =@x 2 i (63) where @L i @x i = m X j=0 j p ij (U ij x i ) 2 m X j=0 j q ij (x i L ij ) 2 (64) @ 2 L i @x 2 i = 2 m X j=0 j p ij (U ij x i ) 3 2 m X j=0 j q ij (x i L ij ) 3 (65)

(20)

considerthree situations: x i ()=x ? i if x i x ? i x i ; (66) x i () = x i if x ? i x i ; (67) x i () = x i if x ? i x i : (68)

ValuesofprimalvariablesintermsofLagrangemultipliersaretheninsertedinLagrange

function to calculate dual function value. Gradient of dual function is also given by the

valueof primalconstraints.

When primal-dual relationships are not available in closed form, each evaluation

re-quires the solution of non linear problems, which needs an additional numeric eort. In

orderto circumventtheproblem,Fleury[20 ]suggested tobreakthesolutionoftheconvex

subproblem itself into a sequence of quadratic separable sub-subproblems. As explained

in the former section, these problems can be solved eÆciently by dual methods. So the

non-explicitcharacter of primal-dualrelationshipsisnota realobstacle inpractice.

3 STRUCTURAL APPROXIMATIONS

3.1 Characteristics of approximations

When building approximation schemes, one pursues several (and sometimes antagonist)

goals. It isgoodto explicitlygive them before reviewingthedierent schemes.

The rst constraint one hasto keepinmindis thatthe solutionofthe generated

sub-problemmustremainrigorousandthelesscomputationallyexpensive. Inorderto beable

to resort to dual methods, one would like, at rst, that the solution of the dual problem

is the same as the solution of primal sub-problem. Then the power of the dual method

holdsifthecomputationeorttomove tothedualspaceissmalltosave thebenetofthe

dualsolution. To match theseconditions,thesub-problems( ~

P)andtherelatedstructural

approximations mustbe:

Convex. The convexity of the sub-problems insures that there is a unique solution

and that the solution of the dual problem is the solution of the original problem.

In addition the dual problem is concave and eÆcient algorithms can exploit this

property.

Separable. The separability is essential to arrive at relations between the primal

design variables and the Lagrange multipliers that are easy to compute. Primal

variables are given by an uncoupled system of equations in terms of the Lagrange

multipliers, which is possible to solve independently for each primal variable. F

ur-thermore for most of the approximation schemes, it is even possible to express the

solutioninclosedform. Separabilityisanessentialpropertytoreducestoaminimum

(21)

ary solutionwithin a minimumnumberof iterations withoutconstraint violations.

Obvi-ouslythenumberofstructural analysescan be largelyreducedwhenappropriateschemes

areused. To thisendone would like thatthestructural approximationsare:

Precise inorder to give thebestt to thereal responsesinthelargestneighborhood

possible. Then the higher the quality of the approximation is the quicker is the

convergencespeedto asolution.

Conservative enough inorder to generate asequence of steadily improved iterations

which providefeasiblesolutionsat any stage ofthe optimizationprocess.

Conserva-tive approximations are obtained by increasing the value of the convexity terms in

order to reduce the sizeof the trustregion and to avoid constraint violations. This

argument isgenerally antagonist withtheprecision property.

Furthermore the overall computational time to come to an optimum depends also upon

the numerical work that is necessary to buildthe approximation. To this end one wants

to lookfor

Aminimumcomputationaleort togeneratetheapproximation,butalsotocalculate

thethe necessaryinformationthat is requiredto builtit. Forexample computation

of truesecond order sensitivityisgenerally expensive so thatit is generallyavoided

and replaced by approximated second order information e.g. with Quasi-Newton

techniques. Another exampleis the cost of thesolutionof the primal-dual

relation-ships: Too complex approximation schemes are generally highly penalized because

they require elaborated and expensive numerical solutions of the primal-dual

sys-tem,which,inits turn,increasesthenumericaleortto formulatethedualproblem.

Therefore, itcomesthat reducing theoverall computational eortisalso antagonist

to the accuracy and precision criteria,so that a trade ohas be made between the

two criteria.

Thislistis nonextensive,one must alsokeepinmindthattheapproximationscheme and

theapproximatedsub-problemsmustbe

Robust, in order to be able to construct the approximation in an automatic and

reliablemannerand ina lotof situations.

Flexible,tobeabletobeusedforvariouskindsofstructuralandgeometricresponses.

Overall convergent inorder to be able to endup ina optimized solution(may be a

local optimum)from anystartingpoint design.

:::

The approximation schemes aregeneratedthroughrst orsecond orderTaylor

expan-sionofthedesignfunctiong

j

(x)aroundcurrentdesignpointx k

. Thedierentschemesare

(22)

i

whichmeansthattheprecisionoftheapproximationisrestrictedto aneighborhoodofthe

current designpoint.

Wearenowgoingto reviewthemostimportantschemesandseetheimprovements(or

drawbacks) thateach one brings.

3.2 Linear approximation and sequential linear programming

When thinking about local approximation, the most natural and simple approximation

schemeisatheTaylorexpansionaroundcurrentdesignpointx 0

restricted tolinearterms.

The linearapproximation writes:

~ g j (x) = g j (x 0 )+ n X i=1 @g(x 0 ) @x i (x i x 0 i ) (69)

The choice of linear approximation is natural when it corresponds to the nature of the

restriction. Forexample itis exact forthevolumerestriction when usingSIMP material.

It isalso mandatorywhen dealingwith equalityconstraints.

Whenthelinearapproximationisappliedtoeach restrictionoftheproblem,one

trans-formstheoriginalprobleminto a sequenceof linear programming problems:

min x i g 0 (x 0 )+rg 0 (x 0 ) T (x x 0 ) s.t. g j (x 0 )+rg j (x 0 ) T (x x 0 ) 0 (70) x i x i x i

With the general formulation of linear programming, it is possible to treat a wide

spectrum of problems independently of the nature of the restrictions. But, may be the

most interesting advantage of sequential linear programming, comes from the fact that

the solution of this kind of sub-problems can be realized eÆciently with the help of

lin-ear programming algorithms like a SIMPLEX algorithm or a primal-dual interior point

method [29 ]whichis very welladaptedto verylarge scaleproblems.

However the sequential linear approximation strategy has also some drawbacks. Due

to the lack of convexity of the sub-problem, one can have some problem to stabilize the

convergenceprocess. Onecanhavesomeoscillationsoftheconvergenceoralotofunfeasible

designs during iteration history. To overcome the diÆculty, one has to add some

move-limits to playtherole of a'trust region':

max(x i ;x 0 i i ) x i min(x i ;x 0 i + i ) (71)

For topology problems, we suggest an adaptive move-limit strategy which reduces the

designintervalwhenthedesignvariableoscillatesandwhichopensitwhentheconvergence

process isstable.

Foriteration k=1and 2,take aninitial 10percent move-limits

i = 0:1(x i x i ) (72)

(23)

metriclaminatesubjectto shear and torsionloads[7 ]

Foriterationsk >2,updatemove-limits according to thefollowingrule:

(k+1) i =0:7 (k+1) i if x (k+1) i x (k) i <0 (73) (k+1) i =1:2 (k+1) i if x (k+1) i x (k) i 0 (74)

Whentheoriginalproblemisverynon-linear,itissometimesnecessarytoadoptveryclose

move-limits,whichreducesalottheconvergencespeed. Itisusualthat100iterationsmay

benecessary to come to astable solutionin topologyoptimization.

3.3 Reciprocal variable expansion

As soon as the beginning of the seventies, Schmidt and his co-authors (see for example

Ref. [34 ])showedthatan approximation scheme intermsofreciprocalvariablesy

i =1=x

i

isfavorableto reduce thenon-linearityof responsesin sizingproblems. Thusagoodlocal

approximation is realized when expanding the responses g

j (x) in terms of intermediate variablesy i =1=x i

to better capturethenon-linearcharacter ofthefunction:

~ g j (x) = g j (x 0 )+ n X i=1 (x 0 i ) 2 @g(x 0 ) @x i ( 1 x i 1 x 0 i ) (75)

Foralotofstructuralapplications,thereciprocalscheme(75)leadstosuccessfulresults

(24)

laminatesubjectto shear and torsion loads[7 ]

designproblem. In factthescheme iseÆcient whenall therst derivativesofthefunction

arenegative(likeindeterminatestructures),sincetheapproximation(75)hasonlypositive

curvature terms and is conservative. However when the problem is highly non-linear or

when dealing withother kindsof problems like shape orcompositeproblems, Fleuryand

Braibant [24 ] observed that the derivatives can have mixed signs and that convergence

processisnomore stable,becausetheapproximation(75)isnolongerconservativeforthe

variableswhich have positive derivatives.

3.4 Mixed linearization approximation: CONLIN

ToovercomediÆcultyofreciprocalvariableandlinearexpansion,FleuryandBraibant[24]

expressed theidea to combine reciprocal expansion(when thederivative is negative) and

linearapproximation(whenthederivativehasapositive value) ina singlemixed

approxi-mationscheme: ~ g j (x)=g j (x 0 )+ X + @g j @x i (x i x 0 i ) X (x 0 i ) 2 @g j @x i ( 1 x i 1 x 0 i ) (76) where P +

isthesumoverall theterms forwhichthederivative ispositiveand P

isthe

sum over allthe termsforwhich thederivativeis negative.

Thescheme(76)isunconditionnallyconvexsinceallitssecondderivativesarepositives,

that'swhyitis namedCONLIN asConvexLinearization. Moreover, it wasdemonstrated

(25)

subjectto shear and torsionloads[7]

be generated withlinear and reciprocal variables. This convex character gives riseto the

conservativecharacter ofCONLIN, which meansthatthe approximation (76)tends to lie

inthefeasibledomainoftheconstraint. ItfollowsthattheCONLIN methodmostlytends

to generate feasiblenew solutions. Innumerous applications,CONLIN technique leadsto

a stable convergence and a feasible sequence of steadily improved designs. Convergence

speedisgenerallyfast: 10to20F.E.analysesaregenerallynecessaryto reachastationary

solutioninsizingoptimization.

AnotherstrengthofCONLINcomesfromthefactthatitisverywellsuitedtoausedual

optimizers, because of theconvex and separablecharacter of theapproximatedfunctions,

the sub-problemlends itself to an eÆcient solutionprocedure based on dual method and

second order maximizationalgorithm(see Fleury[18]).

ThemaindrawbackofCONLINisrelatedtothefactthatthecurvatureofthe

approxi-mationisxedandthereisnowayfortheusertomodifyitexceptbyusingtrickychangeof

designvariables. Thisalsosometimesintroducesa badttingoftheapproximationto the

real function. So despite numerous successes, the mixed linearizationleads sometimes to

toosloworunstableconvergencehistoriesaswell. Famousexampleofdivergenceprocesses

were reported bySvanberg[38 ].

3.5 Method of Moving Asymptotes: MMA

To beableto adjustthecurvatureoftheapproximation andtohavea better ttingto the

(26)

Svanberg [38 ]whichextends theconvex linearizationscheme.

~ g j (x)=r 0 j + n X i=1 p ij U i x i + n X i=1 q ij x i L i (77) with p ij = maxf0;(U i x i ) 2 @g j @x i g q ij = maxf0; (x i L i ) 2 @g j @x i g (78) and where r 0 j

collects all zero order terms that are adjustedto t to the constraint value

inx 0

.

Once again the scheme can regarded as a rst order Taylor expansion in terms of

intermediatevariables1=(U

i x i )or1=(x i L i

)dependinguponthesignofthederivatives.

Ofcourse, one hastheasymptotes valuesaresuchthat L k i <x k i <U k i .

The twosets ofverticalasypmtotes U

i andL

i

generalizetheverticalasymptotes

intro-ducedinCONLINwiththechangeofvariablesy

i =1=x

i

. InfacttheCONLIN

approxima-tioncan berecovered fromthegeneral statement byassumingL

i

=0 forthelower bound

and U

i

!1 forthelowerbound.

The two sets of asymptotes also play a role of move-limits. This analysis has been

(27)

pointmethodsinwhichside-constraints(herelowerbound)aretakenintoaccountthrough

barrierfunctionsof thetype [6 , 28 ]:

ln(x i x i ) or 1 x i x i (79)

ToadjusttheMMAapproximation,one hastoplaywiththeparametersofthe

approx-imationi.e. the positionof the verticalasymptotes. One could calculatethat the second

derivatives(and theconvexity aswell)increasewhen theasymptotes are pushedcloser to

thedesignpoint. Andconverselytheconvexityoftheapproximationisreducedwhenthey

aremoved awayfrom thedesignpointx k

.

One diÆculty of the method comes from the fact that the automatic choice of these

curvature parameters remains diÆcult. Svanberg [38 ] proposed a heuristic strategy for

asymptotesupddatebasedonthedesignvariableoscillations. Fortheiterationsk =1and

2,thedefaultvaluesof theasymptotes areadopted:

L k i = x k i s 0 (x i x i ) U k i = x k i + s 0 (x i x i ) (80) with s 0

= 0:5 is suggested by Svanberg. Then asymptotes are updated in the following

ways.

For k >2, the update scheme is the following. When convergence process is smooth

(x k 2 i x k 1 i ):(x k 1 i x k i

(28)

L k i = x k i s 1 (x k 1 i L k 1 i ) U k i = x k i + s 1 (U k 1 i x k 1 i ) (81) withs 1

less than1and generally chosen as1:2.

While when convergence history oscillates (x k 2 i x k 1 i ):(x k 1 i x k i ) < 0, one has

to close the asymptotes towards the design point to increase convexity. So one gets the

followingupdaterules

L k i = x k i (x k 1 i L k 1 i )=s 2 U k i = x k i + (U k 1 i x k 1 i )=s 2 (82) withs 2

greater than1 andgenerally chosen ass

1 or

p

s

1

OnemayalsonoticethatZhangandFleury[41 ]proposedanalternativestrategybased

on thettingtheapproximationto theprevious functionvalue.

3.6 Globally Convergent Method of Moving Asymptotes: GCMMA

Asone can remarkin gures8 and 9 that MMA (but also COLNLIN) ismade of sum of

monotonousfunctions. Whenoneisfacedtonon-monotonousproblems(whichisoftenthe

casewithcompositestructuresforexample),itismandatorytorestrictthedesignvariable

motion in one direction. Therefore, as suggested by Svanberg [38 ], a move-limit strategy

isgenerally adopted incombination withMMA.

max(x i ;0:9L j +0:1x 0 i ) x i min(x i ;0:1x 0 i +0:9U i ) (83)

More recently, Svanberg [39 ] proposed a new extension of MMA method which uses

simultaneouslybothasymptotesinordertocreate anon monotonousapproximation

func-tion. The generalstatement of theapproximation issimilarto MMA formula:

~ g j (x)=r 0 j + n X i=1 p k ij U i x i + n X i=1 q k ij x i L i (84) butcoeÆcients p k ij and q k ij

arebothnon zeroingeneral. Theyare aschosen asfollows:

p k ij = (U k i x k i ) 2 maxf0; @g j (x k ) @x i g + k j 2 (U k i L k i ) ! (85) q k ij = (x k i L k i ) 2 maxf0; @g j (x k ) @x i g + k j 2 (U k i L k i ) ! (86) where k j

are strictly positive parameters (to insure the convexity of the approximation)

whichareupdatedtogetherwiththeasymptotesL k

i

andU k

i

(29)

ofthecomplementarytermin k

j

thatallowstheapproximationto benon-monotonousby

usinginthesame timethe twoasymptotes L k i andU k i .

Mobile asymptotes are updatedwith thesame ruleas inMMA (see formula (80-82)).

Theadditionalparameters k

j

arechosenasfollows. Forthetworstiteration(k=1),one

choose

1

j

=" 8j2f0;1:::;mg 0<"1 (87)

Inthe lateriterations(k2),the parameters k

i

areupdatedaccording to

k j =2 k 1 j if g~ k 1 j (x k ) < g j (x k ) k j = k 1 j if g~ k 1 j (x k ) g j (x k ) (88)

Further(in order to prove theconvergence!), if

~ g k 1 j (x k ) g j (x k ) 8j2f0;1:::;mg

theasymptotes shouldnowinstead be updatedas

L k i = x k i (x k 1 i L k 1 i ) U k i = x k i + (U k 1 i x k 1 i ) (89)

Eventhoughthatforthisscheme,onecanprovethegloballyconvergentcharacterofthe

(30)

inmostcases, itconverges more slowly thantheoriginalMMA (onproblemswhereMMA

doesconverge). Thereasonforthisisthatsincetheparameters k

i

areincreased,andnever

decreased,theapproximationsbecome increasinglyconservative. Thismayeventuallylead

to very smallstepsintheiterationprocess.

3.7 Example

To illustrate the dierence between the various approximation schemes, we now plot the

contoursg(x)=0ofthereal and approximatedconstraintsfora simpleanalyticexample.

The functionunderstudyis

g(x)=5x

2 x

2

1

Firstorder derivativesof functiong are:

@g @x 1 = 2x 1 @g @x 2 = 5

The current approximation point is (x 0

1 ;x

0

2

) = (4;4) where the function and derivatives

valuesare: g(x 0 ) = 4 rg(x 0 ) = [ 8; 5] T

Reciprocal scheme is not conservative and lies in the unfeasible region of the true

constraint. Linear,CONLIN,MMAand GCMMAaregraduallymore conservative andlie

inthefeasiblepart ofthedesignspace.

Remark that as CONLIN, MMA, etc. are monotonous approximation, so that the

contoursareopen curves, whileGCMMAwhichisanon monotonousapproximationleads

to a closedcontourcurve.

3.8 Second order schemes

An alternative strategyto remedy to CONLIN'sproblems consistsinimprovingthe

qual-ity oftheapproximation and inintroducingsecond order derivative information. Smaoui,

FleuryandSchmit[36 ]proposedageneralizedMMAinwhichtheasymptotesare

automat-icallyselectedtomatchthesecondorderderivatives. AbitlaterFleury[19 ]generalizedthe

useof second order derivativesto several convex otherapproximations(likethe quadratic

diagonalschemeandtheexpansionintermsofapowerofthedesignvariables)andshowed

there is a bigpay-o related to the use of the second order informationin structural

ap-proximations.

3.8.1 The Generalized Method of Moving Asymptotes (GMMA)

Theapproximationofthegeneralizedmethodof movingasymptotescan bewritteninthe

followingform: ~ g(x 0 )=c 0 + n X i=1 a i x i b i (90)

(31)

g(x) =5x 2 x 2 1 at point (x 1 ;x 2 ) = (4;4)

AstheoriginalMMA[38],thisapproximationcan easilymatch thefunctionvalueand

itsrst derivativesatcurrentdesignpointx 0

. Ifthesecond orderinformationisavailable,

Smaoui et al. [36 ] showed that the asymptotes could be selected automatically too. The

parameters ofthe expansionare given by:

a i = (x 0 i b i ) 2 @g(x 0 ) @x i b i = x 0 i + 2 @g(x 0 ) @x i =max (; @ 2 g(x 0 ) @x 2 i ) 0<1 (91)

whiletheparameterc

0

isdenedtomeettheconstraintvalueatthedesignpoint. Thesmall

positivenumberguarantees theconvexityoftheapproximation. Thisapproximationisa

generalizationofthepureMethodofMovingAsymptotesof[38],since,here,eachconstraint

hasgotits ownset ofasymptotes.

3.8.2 Quadratic Separable Approximations

ThequadraticapproximationisthesecondorderTaylor'sexpansionofthegivenstructural

response. Nevertheless, the full second order expansion introduces coupling terms and

the approximation is no more separable. Furthermore, the full second order sensitivity

analysis can be expensive forstructural problems. So Fleury [19 ] proposed to neglect the

(32)

~ g(x)=g(x 0 ) + n X i=1 @g(x 0 ) @x i (x i x 0 i ) + 1 2 n X i=1 ( @ 2 g(x 0 ) @x 2 i +Æ ii )(x i x 0 i ) 2 (92)

The additional termsÆ

ii

aregenerally addedinorder to reinforcetheconvexity of the

separableapproximationandto playtherole of'movelimits'inestablishingatrustregion

around thedesignpoint. Forexample, thepenaltytermsÆ

ii

can beestimated sothat the

unconstrainedoptimumiskeptwithinagiven distance of theexpansionpoint.

jx i x 0 i j = j @g @x i =( @ 2 g @x 2 i +Æ ii )j < x 0 i (93)

3.8.3 Approximation procedure of diagonal second derivatives

A study [15, 16 ] based on the theory of Quasi-Newton updates preserving the diagonal

structurehasshownthat thebestapproximationof secondorder diagonaltermsis simply

given by: B ii ' @g(x k ) @x i @g(x k 1 ) @x i x k i x k 1 i (94)

This rather intuitive result consists in "making nite dierences between the computed

rst derivativesintwo successive iterationpointsandinignoringthecross derivatives".

This theoretical result was adapted to the characteristics of structural optimization

problems to yield quickly convergent estimates of the Hessian. This adaptation relieson

thekey roleof thereciprocaldesignvariablesto reduce thenon-linearityofthestructural

responses. The Hessian is updated in the space of reciprocal design variables and then

converted into curvaturesintermsofthedirectvariablesto beusedintheapproximation.

The initial guess of the Hessian is also very important. Starting in the reciprocal design

space from a diagonal matrix of smallterms restores thecurvaturesof CONLIN which is

generallya good startingpoint.

As in Ref. [15, 16 ] this approximate second order information can be introduced in

two high quality approximations: the second order Method of Moving Asymptotes and

thequadraticseparableapproximation. Theperformancesofthisprocedurearevery close

fromtheresultsobtainedwhileusingthesameapproximationswithanalyticsecondorder

derivatives[19 ]. Inevery cases these performancesare at leastasgoodas thebestresults

withMMA [38 ] orCONLIN [24 ].

3.8.4 Comparison of approximation in topology optimization [10 ]

Firstorder convex approximations

In Refs. [10 , 11 ], we tested rst order approximations in solvingmaterial distribution

(33)

that 20 to 40 iterations more are generally necessary to arrive to a stationary solution.

Nevertheless,ifthese performances(interms ofthenumberofiterations)arecomparedto

resultsobtained withstandard implementations of optimalitycriteria (like inRef. [3 , 5 ]),

CONLIN or MMA give equal or better performances. Among the approximations, the

expansion in terms of power p of the design variables is often too conservative and the

convergence rate is the slowest. Performance of CONLIN is generally very satisfactory.

Due to the weaker curvatures of CONLIN and MMA in the rst iterations, it is worth

usingmove-limits to have a smoothconvergencehistory.

From our experiences, the CONLIN (Fleury and Braibant [24 ]) approximations give

rise to good results in topology design. For the compliancesthat are self-adjoint, all the

derivatives are negative and CONLIN restores the reciprocal design variables expansion

that is well known to reduce the non-linearity of the structural responses. But

convex-ity and conservativity properties of the approximation in CONLIN are important when

treatingeigenfrequenciesorconstraintswhoserstderivativeshavemixedsigns. Themain

disadvantage of CONLIN is that the approximations introduce xed curvatures, so that

theapproximationmaybetoo much ortoolittleconservative. Thiscangiveriseto aslow

or unstable convergence towards the optimum. To remedy this, we select the MMA [38]

approximationschemewhichgeneralisesandimprovestheCONLINschemebyintroducing

two sets ofasymptotes. The choice of themovingasymptotes providesthe wayto modify

thecurvatureand to t better to thecharacteristicsof theproblem.

Because of the exibility introduced by the moving asymptotes, MMA ts better to

the convexity of the problemand MMA is oftena bit quickerthan CONLIN. It was also

observed thatforverydiÆcultproblems, MMAwasmorestablethanCONLIN.Therefore

itis generallychosen asdefaultalgorithm.

Nevertheless,wecanconcludethatbothCONLINandMMAleadtosatisfactoryresults

fortopologydesignandimproveoftengreatlytheperformanceofthesolutionprocedure. In

manyproblemsweobservedthatasolutionisoftenachievedin30to50iterationsdepending

on the diÆculty of the problem and the precision of the stopping criterion. One strong

advantageofCONLINandMMAarisesfromtheveryreliabledualsolversthatareusedto

solve the associatedconvex sub-problems. On anotherhand, one major drawbackof rst

order approximationsis thatwe can observe a decelerationoftheprogression towards the

optimum oncethe algorithm arrivesinthe neighbourhood of theoptimum. To accelerate

theconvergencerateinthenalstage,oneneedsbetterapproximationsbasedoncurvature

information(Fleury[19 ]).

We should also brie y discuss the selection of a termination criteria. We prefer not

usingterminationcriteriabasedonlimitedimprovementoftheobjectivefunctionsincethe

optimum is very at and the value of the objective functiondecreases very slowly during

morethanhalfoftheiterations. So,suchamethodwillleadterminationoftheoptimization

process ata tooearlystage. Whenall issaidand done,one looksfortheoptimalmaterial

distribution,sowethinkthatitisbettertoadoptaterminationcriteriabasedonthedesign

(34)

or the maximum modication (innite norm). As the problem under consideration is a

constrained optimizationproblem, one can also useKuhn-Tucker conditions. Satisfaction

ofanyone ofthese lastterminationcriteria avoids prematurestopping.

Second order convex approximations [11 ]

Second order are high quality approximations that are indeed more precise and that

lead to faster convergence rates. Nevertheless, second order sensitivityis very onerous to

compute and to store sothat theoverall overall cost of theoptimization run withsecond

ordersensitivitycanbesimilarto thecostofanoptimizationrunthatwouldbemadewith

rst order approximations even if the number of iteration is greater [30 ]. When the size

oftheproblemincreases,computingandstoring second orderderivativesbecomes quickly

cumbersome and theproblembecomes impossibleto manage.

To be able to use second order approximation schemes with large scale optimization

problems,wedevelopeda newprocedure to generateanestimationof thecurvature

infor-mation with a small computation cost [15 , 16 ]. As separable approximations needs only

diagonalsecondderivatives,theideaistobuiltanestimationofthecurvatureinformation

witha quasi-Newton updateable to preserve diagonal structureof the Hessianestimates.

Thisupdateschemeisderivedformthegeneraltheoryofquasi-Newtonupdatewithsparse

Hessian estimatesmade byThapa [40 ]. The diagonalversionof theBFGSupdate[15 , 16 ]

thatwe implementedisvery un-expensiveevenforlargescaleproblemssinceitintroduces

only vectors manipulations. In [10 ], it was observed that for a given topology problem,

the time spent in the diagonal BFGS update is only 3 % of the time spent in the

opti-mizer CONLIN[18 , 20 ] andonly0.01 %ofthe timeneededfor sensitivityanalysis witha

commercialniteelement package.

Theestimatedsecondorderinformationisintroducedintotwowellknownsecondorder

approximations. The rst one is a second order version of MMA proposed by Smaoui et

al.[36]. Thesecondapproximationistheseparablequadraticapproximation suggestedby

Fleury[19 ]. CombiningdiagonalBFGS updatewithboththese approximationsgivesvery

interesting results that leads to important savings in terms of number of iterations and

of computation time. This conclusion can be explained as follow. Firstly, the estimation

of the curvature improvesgreatly the quality of the approximation with onlythe help of

the accumulated rst order information. Secondly, instead of ignoring the second order

coupling terms, diagonal BFGS provides a way to take them into account by correction

terms on the diagonal coming from the diagonal update. Due to our initial guess of the

Hessian,one can observe, intherst iterations,a convergence historythat isvery similar

to rst order approximations. But after some iterations, the update procedure improves

theestimation of theHessian and one can see a real advantage inthe convergencespeed.

Around an accumulation point satisfying the optimality conditions, we could observe a

convergence speed superior to rst order methods, sometimes closed from super-linear

behaviour.

Asaconclusion,secondorderapproximationscanadvantageouslybeusedforlargescale

(35)

whichisclosetothereciprocaldesignvariableexpansionresultsinsimilarcharacteristicsas

rst order approximations, duringthe rst iterations. Thischoice generally yieldsa good

descent rate of the objective functionin the beginning. Then, since theHessian estimate

isimprovedand theapproximation isenrichedbythiscurvature information,itleadsto a

better convergence rate duringnal convergence and the number of iterations to reach a

stationary solutionisreduced.

Attention must nevertheless be paid to the well-known fact that second order

algo-rithmsaremore sensitiveto localoptima. Thisfact wasobserved also withourprocedure

and particularly with the quadratic separable approximation. This drawback can be

at-tenuated by adding move-limits or by adding additional convex terms in the quadratic

approximation.

3.9 Topological optimization of short cantilever beam

Finallywe can illustrate theapplication of four approximations -CONLIN, MMA, MMA

second order with diagonal BFGS (DQNMMA), and quadratic separable approximation

with diagonal BFGS (DQNQUA)-, in the context of a topology optimization problem,

the short cantilever beam problemwhose geometry of the problem is given at gure 12.

Thisdesignis a classicalbench-mark of topologyoptimization. For thesake of simplicity,

the material law is simply given bya cubic relationbetween the rigidity and the relative

density [4]: E = 3 E 0 and = 0

. The Poisson's ratio and the Young's modules of

the solid are:

0

= 0:3 and E

0

= 100GPa. The compliance under the given load case

is minimized whilethe volume is bounded to 37:5% of the volume of thedesign domain.

The design domain is discretized by a regular mesh of 1040 nite elements of degree 2.

The niteelement analysis and the sensitivityanalysis are performed with the SAMCEF

package.

Theproblemissolvedwiththefourdierentapproximationsforcompliance(thevolume

is linearized). Since all the rst derivatives of compliance are negative, CONLIN and

the reciprocal variables expansion are the same. We also implemented a MMA scheme

similarto Svanberg'sone. Then, we usethe highquality approximations DQNMMAand

(36)

Figure14: Compliancehistory

Historiesofcompliancearegivenat gure14. At rstglance,thedierentconvergence

curvesareverysimilarandthedierent algorithmstendtowards localoptimawithnearly

the same compliances. Nevertheless, when material distributions are visualized, the

op-timal distributions reveal a very similartopology, except in the quadratic approximation

(Figure 13 presents the material distribution obtained with CONLIN). Thus several

op-timaexistwhen intermediatedensities arehighlypenalizedandattention mustbepaidto

local congurations. Also, second order approximationsare partly more sensitive to local

optima.

Firstorderapproximationsgivesmoothandmonotonehistorycurves. Atthebeginning,

the descent rate is good but, after more or less 10 iterations, compliance reduction is

seriously slowed down when close to the optimum. Progression becomes much slower.

(37)

ofthe designvariablesbetweentwo iterations (gure 16) diminishslowly,even withsmall

oscillations.

Secondorderapproximationsalsoresultsinconvergencecurveswithaverygooddescent

rate. The progression towards the optimum is not handicapped too much by the non

monotonebehaviouroftherstiterations(iterations3and4). Thatfactmaybeduetoan

uncertainvalueoftheestimationofthecurvaturesbythediagonalBFGS.Afterthisphase

which isnecessaryto stabilizetheestimationsof thecurvatures, second orderinformation

gives rise to a very good descent rate. When in the at part of the compliance curve,

second order information preservesa good convergence speedand continues to accelerate

the progression towards a stationary solution. This can be clearly noted on the mean

modicationsofthedesignvariablesorontheKKTmodulus. Stationarityisreachedmuch

fasterwithsecondorderschemesanddiagonalBFGSthanwithrstorderapproximations.

AspredictedbyThapa's theory[40 ],werecoverinfacttheasymptoticsuperlineardescent

rate of quasi-Newton methodsin the neighbourhood of the optimum. This characteristic

saves a high number of iterations. The quadratic scheme and the MMA second order

method come to stationarity in30 to 40 iterations whileCONLIN and MMA needs more

than40 iterationsto satisfya weaker terminationcriteria.

Finally one can also compare the two proposed termination criteria: the modulus of

the Lagrangian function (in gure 15) and the mean modicationof thedesign variables

betweentwo iterations(in gure16). Onecan observe thatbothhavea parallelevolution

and they are in perfect correlation. As predicted by the theory they give an equivalent

informationand theybe bothused to predict convergenceintopology optimization

prob-lems.

With the DQNMMA and DQNQUA approximations, the objective function is

(38)

modi-iterations

cations can be observed, whereasthe convergence of theother rst order schemesis not

achieved yet. CONLIN and MMA's nalconvergencerates aremuch slower.

4 PERIMETER APPROXIMATION [11 ]

Perimeterconstraint hasa majorplace intopologydesign, becauseit isaveryinteresting

alternative to optimalrealxation usingoptimal microstructures asin the homogenization

method [1 ,5 ]. Inengineering applications, perimeter controlalso seemsfurtherattractive

than the rigoroushomogenization method because it allows to use penalization of

inter-mediatedensitiesandto generatecleardensitydistributionswithwellseparatedvoidsand

solidszones, so that an unambiguous macroscopic topology often appear. Ambrosio and

Buttazzo [2] demonstrated that the design problem with perimeter penalization is

well-posed. Applicationofperimetercontrolto topology designof structures waspresentedby

Haberetal.[25 ]. TheworkpresentedhereaimsatremedingsomediÆcultiesthatoccurin

thenumericalimplementationsoftheperimetercontrol. WefocusonprovidinganeÆcient

numericalstrategytotake perimeterconstraintinto accountinorderto useperimeter

con-trolasareal practical designtool. The perimeter isa ratherdiÆcultconstraint to satisfy

and to approximate, asitwillbeseen.

Webasedournumericalexperiencesonpowerlawmodels(alsocalledSIMP materials)

toprovideinthesametimeacontinuousapproximationofthedesigndistributionproblem

as well as a penalization of the intermediate densities. SIMP material is also easy to

implementinanyindustrialcode. Finally,SIMPmaterialintroduces onlyonevariableper

element,sothat thesizeoftheproblemis keptat a minimum.

AlthoughintheoriginalworkofHaberetal.[25 ]theperimetercontrolwastreatedwith

(39)

as a constrained problem in which perimeter is one of the inequality constraints. If one

wants to control theperimeter by prescribinga target value with a penalization, one can

use a relaxation technique and introducean additional variable Æ, which is quadratically

penalized in the objective function (the pds factor is a tuning parameter to control the

relative weight of thepenalization compared to themagnitudeofthe objective function).

min 1 1Æ 2 f T u + pdsÆ 2 s.t. V V (95) P P + (Æ 1)P

P isan allowablemaximumviolation ofthe target bound

P givebythe user. Standard

versions of solvers like CONLIN [22 ] orMMA [38 ] are able to handle thesolution of this

kindofoptimization problemswithrelaxation.

Whenthedensityvariescontinuously,Haber,etal. [25 ]proposeto replacethe

geomet-rical measure of the perimeter by the total variation of thedensity . Fora densityeld

which iselement byelement constant,wecan write:

P = X k l k q <> 2 k + 2 (96) where <> k

is the jump of material density trough the element interface k of length

l

k

. The parameter is a small positive number to guarantee the dierentiability of the

measure. Valuesof aregenerally taken between10 2

to 10 4

.

Figure17sketchestheperimetermeasureofasquareelementofdensitysurroundedby

fourelementsofdensity

1 , 2 , 3 ,and 4

(40)

approximate with classical schemes. Monotonous approximationslike CONLIN orMMA

give rise to oscillatory behaviours. Furthermore, the perimeter is globally non linear and

there are important couplings between neighboringnite element (F.E.) densities. Thus,

thetrustregionof separableapproximationsisnarrow.

Nevertheless,inordertotreatproblemswithalargenumberofvariablesandtousedual

solvers, we need a convex and separableapproximation. From our numerical experience,

good resultsareexpected withaquadratic separableapproximation ofthe generalform:

~ P()=P( 0 ) + n X i=1 @P @ i ( i 0 i ) + 1 2 n X i=1 a i ( i 0 i ) 2 (97)

Themainproblemisnowto choosethesecondorder termsa

i

carefully. Theirvaluesmust

beacompromisebetweenprecisionandconservativity. Toosmallvaluesofa

i

wouldimply

important constraint violations whiletoo large valuesof these second order terms would

lead to a freeze the motion of the variables. On one hand, by selecting a

i

, one will try

to t thetrue shape of theconstraint whileon theother hand, these coeÆcients willplay

the role of move-limits that restrict thevalidity of the approximation. Also, the analytic

second orderderivativesarenotusefulsincethey arezero, except inangularpointswhere

they arevery large (ordo notexist). So thechoice of thearticial curvaturesis basedon

a heuristicrule,whichis explainedinthenext section.

4.1 A heuristic estimation of curvatures for perimeter approximation

In the following, we develop a heuristic estimation of curvatures for approximating the

perimeter when there is one density variable

i

per element as it is for SIMP materials.

These estimates of the curvatures are based on a bound over individual contributionsof

each element. According to thequadratic approximation, thecontribution of element "i"

to globalperimeteris:

~ P i ()=P i ( 0 i ) + @P @ i ( i 0 i ) + 1 2 a i ( i 0 i ) 2 (98) P i ( 0 i

) isthe contribution of element "i" to theperimeterwith thecurrent distributionof

density. This contribution of element "i" is maximum when the density jump across the

element interfaces becomes equal to unity. Suppose now that this situation happens at

point ?

i

. Then one can writethesecond order coeÆcients intermofthenew parameter:

a i = 2 P k2Ki l k ( p 1+ 2 q j 0 i 0 k j 2 + 2 ) @P @i ( ? i 0 i ) ( ? i 0 i ) 2 (99)

wherethesum isrealized over thesetK

i

ofthe interfacesof element "i".

The question is now to nd thepoint ?

i

where thissituation could probablyhappen.

If theseparabilityhypothesis were true, thepoint ?

i

could be chosen at theboundaryof

the admissible set of

i

, that is when

i