A filter-trust-region method for unconstrained optimization

(1)

RESEARCH OUTPUTS / RÉSULTATS DE RECHERCHE

Author(s) - Auteur(s) :

Publication date - Date de publication :

Permanent link - Permalien :

Rights / License - Licence de droit d’auteur :

Institutional Repository - Research Portal

Dépôt Institutionnel - Portail de la Recherche

researchportal.unamur.be

University of Namur

A filter-trust-region method for unconstrained optimization

Gould, Nick; Sainvitu, Caroline; Toint, Philippe

Published in:

SIAM Journal on Optimization

DOI:

10.1137/040603851

Publication date:

2006

Document Version

Early version, also known as pre-print

Link to publication

Citation for pulished version (HARVARD):

Gould, N, Sainvitu, C & Toint, P 2006, 'A filter-trust-region method for unconstrained optimization', SIAM Journal

on Optimization, vol. 16, no. 2, pp. 341-357. https://doi.org/10.1137/040603851

General rights

Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners

and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.

• Users may download and print one copy of any publication from the public portal for the purpose of private study or research.

• You may not further distribute the material or use it for any profit-making activity or commercial gain

• You may freely distribute the URL identifying the publication in the public portal ?

Take down policy

If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately

and investigate your claim.

(2)

forun onstrained optimization byN.I.M.Gould 1 ,C.Sainvitu 2 andPh. L.Toint 2 Report04/03 February5,2004 1

ComputationalS ien e andEngineeringDepartment

RutherfordAppletonLaboratory,

Chilton,Oxfordshire,England

Email: gouldrl.a .uk

2

DepartmentofMathemati s,

UniversityofNamur,

61,ruedeBruxelles,B-5000Namur, Belgium,

Email: aroline.sainvitufundp.a .be

(3)

optimization

Ni k I. M.Gould, CarolineSainvitu and

Philippe L.Toint

5February 2004

Abstra t

Anewlter-trust-regionalgorithm for solvingun onstrainednonlinear

opti-mization problems is introdu ed. Based on the lter te hnique introdu ed by

Flet herandLeyer,itextendsanexistingte hniqueofGould,LeyerandToint

(SIAMJ. Optim.,to appear 2004)for nonlinear equations andnonlinear

least-squares to the fullygeneral un onstrainedoptimization problem. The new

al-gorithmisshowntobeglobally onvergentto atleastonese ond-order riti al

point,andnumeri al experimentsindi atethatitisvery ompetitivewithmore

lassi altrust-regionalgorithms.

1 Introdu tion

Sin e ltermethods havebeenintrodu ed for onstrained nonlinear optimization by

Flet her andLeyer [5℄, they haveenjoyed onsiderable interestin their original

do-main of appli ation [1, 4, 6, 7, 16, 17℄. More re ently, they havebeen extended by

Gould, Leyer and Toint [8, 12℄ to the nonlinear feasibility problem (in luding

non-linear equations and nonlinear least-squares), whi h is to minimize the norm of the

violations of aset of (possibly nonlinearand/or non onvex) onstraints. Sin e

non-linearleast-squares anbeseenasaspe ialized aseofun onstrainedoptimization, it

is naturalto onsider the further extension ofthe lterte hniques to general

un on-strainedoptimizationproblems: thisistheobje tofthepresentpaper.

Thepresentationisorganizedasfollows. Se tion2introdu estheproblemandthe

new algorithm,whose global onvergen eto pointssatisfyingse ond-orderoptimality

onditionsis shown in Se tion 3.1. Theresultsof numeri al experien ewith thenew

method are dis ussed in Se tion 4 and some on lusions and perspe tivesare nally

(4)

We onsiderthefollowingun onstrainedminimizationproblem

min x2IR

n

f(x); (2.1)

where f is atwi e ontinuously dierentiable fun tion of the variables x 2 IR n

. An

eÆ ient te hnique for solving this problem is to use Newton's method, whi h, from

a urrentiteratex k

, omputesatrialsteps k

byminimizing amodel oftheobje tive

fun tion onsistingoftherstthreetermsofitsTaylor'sexpansionaroundx k ,yielding atrial point x + k =x k +s k :

Unfortunately,itiswell-knownthatsu hanalgorithmmaynotalwaysbewell-dened

(whentheTaylor'smodelisnon onvex),or onvergentfromanyinitialpointx 0

. These

diÆ ulties anbe ir umventedbyrestri tingthemodelminimizationtoatrustregion

ontainingx k

,inamannerthatisnowwellestablished(seeConn,GouldandToint[2℄

foranextensivedes riptionoftrust-regionmethodsandtheirproperties). Wepropose

to further extend su h methods by introdu ing a multidimensional lter te hnique,

whose aim is to en ourage onvergen e to rst-order riti al points by driving every

omponentoftheobje tive'sgradient

r x f(x) def = g(x)=( g 1 (x);:::;g n (x)) T to zero.

2.1 Computing a trial point

Before indi ating how to apply our lter te hnique, we start by des ribing how to

omputethetrialpointx + k =x k +s k

from a urrentiterate x k

. Atea hiteration,we

dene themodeloftheobje tivefun tion tobe

m k (x k +s)=f(x k )+g T k s+ 1 2 s T H k s; whereH k isasymmetri approximationtor xx f(x k

),and onsideratrust-region

en-tered atx k B k =fx k +sjksk k g;

where we believe this model to be adequate. A trial step s k

is then omputed by

minimizing themodel(possiblyonlyapproximately). Atvarian ewith lassi al

trust-regionmethods, wedonotrequireherethat

ks k

k k

(2.2)

at everyiterationofouralgorithm. The onvergen eanalysisthat followsrequires,as

is ommonintrust-regionmethods[2,Chapter6℄,that thisstepprovides,atiteration

k,asuÆ ient de reaseonthe model, whi histo saythat

m k (x k ) m k (x k +s k ) md max kg k kmin kg k k k ; k ;j k jmin[ 2 k ; 2 k ℄ (2.3)

(5)

md k

Hessianofthemodelm k ,i.e. k def = 1+max x2B k kr xx m k (x k )k and k =min[0; min [H k

℄℄ . Althoughthis onditionseemste hni al,thereareeÆ ient

numeri al methods to ompute s k

that guarantee that it holds (see [9, 13℄, or,more

generally,[2,Chapter7℄). Typi altrust-regionalgorithms thenevaluatetheobje tive

fun tion at thetrialpointanda eptx + k

asthenewiterate ifthe redu tiona hieved

in the obje tive fun tion is at least a fra tion of that predi ted by the model. The

trust-regionradius k

isalsopossiblyenlargedifthisisthe ase,oritisredu edifthe

a hievedredu tionistoosmall.

2.2 The multidimensional lter

Wenow onsider usingalterme hanismto potentiallya eptx + k

asthenewiterate

more often. The notionof lter isbased onthat ofdominan e: for ourproblem, we

saythat apointx 1 dominatesapointx 2 whenever jg i (x 1 )jjg i (x 2 )j forall i=1;:::;n: Thus, ifiterate x 1 dominatesiterate x 2

and ifwefo usourattentionon onvergen e

to rst-order riti al points only, the latter is of no real interest to us sin e x 1

is at

least as goodasx 2

for ea h of the omponentsof thegradient. All we needto do is

to remember iterates that are not dominated by other iterates by using a stru ture

alled alter. We deneamultidimensional lter F asa listof n-tuplesof theform

(g k ;1 ;:::;g k ;n ),whereg k ;i def = g i (x k ),su h that,ifg k andg ` belongtoF,then jg k ;j j<jg `;j

j foratleastone j2f1;:::;ng: (2.4)

Filter methods proposeto a eptanew trialiterate x + k

ifitisnotdominatedbyany

otheriterate inthelter.

However, we do not wish to a ept a new point x + k

if one of the omponents of

g(x + k

)isarbitrarily losetobeingdominatedbyanotherpointalreadyinthelter. In

order to avoidthis situation,weslightlystrengthenoura eptability testandwesay

that anewtrialpointx + k

isa eptable for thelter F ifandonlyif

8g l 2F 9j2f1;:::;ng : jg j (x + k )jjg j;l j g kg l k; (2.5) where g 2(0;1= p

n)isasmallpositive onstant. Ifaniterate x k

isa eptablein the

senseof(2.5),wemaywishtoaddittothelterandremovefromiteveryg ` 2F su h that jg j;` j>jg j;k jfor allj2f1;:::;ng.

If theme hanismdes ribed sofar is adequatefor onvexproblems (where azero

gradientisbothne essaryandsuÆ ientforse ond-order riti ality),itmaybe

(6)

ify the lterme hanismto ensure that the lteris reset to the empty set after ea h

iterationgivingsuÆ ientdes entontheobje tivefun tionatwhi hthemodelm k

was

dete tedtobenon onvex,andsetanupperboundonthea eptableobje tivefun tion

valuestoensurethattheobtainedde reaseis permanent.

Weare now ableto ombine these ideas intoan algorithm, whose main obje tive

is to letthelter playthe majorrole in ensuringglobal onvergen e within \ onvex

basins",fallingba kontheusualtrust-regionmethodonlyifthingsdonotgowellor

ifnegative urvatureisen ountered.

Algorithm 2.1 Filter-Trust-Region Algorithm

Step 0: Initialization.

An initial point x 0

and an initial trust-region radius 0

> 0 are given. The

onstants g 2(0;1= p n), 1 ; 2 ; 1 , 2 and 3

arealsogivenandsatisfy

0< 1 2 <1 and 0< 1 2 <1 3 : (2.6) Computef(x 0 )andg(x 0

),setk=0. InitializethelterF totheemptyset and

hoose f sup

f(x 0

). Dene two ags RESTRICT and NONCONVEX,the formerto

beunset.

Step 1: Determinea trialstep.

Compute a nite step s k

that \suÆ iently redu es" the model m k

, i.e. that

satises (2.3) and that also satises ks k k k if RESTRICT is set or if m k is

non onvex. In thelatter ase,set NONCONVEX; otherwiseunsetit. Computethe

trialpointx + k =x k +s k . Step 2: Computef(x + k

)anddenethefollowingratio

k = f(x k ) f(x + k ) m k (x k ) m k (x + k ) : Iff(x + k )>f sup ,setx k +1 =x k

,setRESTRICTandgotoStep 4.

Step 3: Test to a ept the trialstep.

Computeg + k =g(x + k ). Ifx + k

isa eptableforthelterF andNONCONVEXisunset:

Setx k +1

=x + k

,unsetRESTRICTandaddg + k

tothelterF ifeither k < 1 orks k k> k . Ifx + k

isnota eptableforthelterF orNONCONVEXisset:

If k 1 andks k k k ,then setx k +1 =x + k

,unsetRESTRICTandifNONCONVEXisset,setf sup

=

f(x k +1

)andreinitializethelterF totheemptyset;

elsesetx k +1

=x k

(7)

Ifks k

k k

,updatethetrust-regionradiusby hoosing

k +1 2 8 > < > : [ 1 k ; 2 k ℄ if k < 1 ; [ 2 k ; k ℄ if k 2[ 1 ; 2 ); [ k ; 3 k ℄ if k 2 ; (2.7) otherwise,set k +1 = k

. In rementkbyoneandgoto Step1.

Notethat, asstated, ouralgorithm la ksaformalstopping riterion. Inpra ti e,

onewouldobviouslystopthe al ulationifkg k

kfallsbelowsomeuser-denedtoleran e

and NONCONVEXis unset, orif somexedmaximum number ofiterationsis ex eeded.

Also note that our onditionson the stepmight impose to re ompute s k

within the

trust region ifnegative urvaturewasdis overedforthe model only after omputing

a stepbeyond the trust-region boundary. Fortunately, this is typi allya very heap

al ulation and an bea hieved byba ktra king[14℄ orbyother suitable restri tion

te hniques[9℄.

3 Global onvergen e

Global onvergen e properties of Algorithm 2.1 will be proved under the following

assumptions.

A1 f is twi e ontinuouslydierentiableonIR n

.

A2 Theiterates x k

remainin a losed,boundeddomain ofIR n

.

A3 Forallk,themodelm k

istwi edierentiableonIR n

andhasauniformlybounded

Hessian. A4 Forallk, m k (x k )=f(x k )andg k =r x m k (x k )=r x f(x k ):

NotethatA1,A2andA3togetherimplythatthereexist onstants l , u l , ufh 1 and umh 1su hthat f(x k )2[ l ; u ℄; kr xx f(x k )k ufh and kH k k umh 1 (3.8)

forallk. Combiningthiswiththedenitionof k ,wehavethat k umh (3.9)

forallkandallxinthe onvexhulloffx k

g. Forthepurposeofouranalysis,weshall

onsider S =fk j x k +1 =x k +s k g;

theset ofsu essful iterations,

A=fk j x + k

(8)

D=fk j k

1

g;

theset ofsuÆ ient des entiterations, and

N =fk j NONCONVEXisset g;

theset ofnon onvexiterations. ObservethatAS and

S\N =D\N: (3.10)

We on ludethisse tionbystatinga ru ialpropertyofthealgorithm.

Lemma3.1 Wehavethat,for all k0,

f(x 0 ) f(x k +1 ) k X j=0 j2S\N [f(x j ) f(x j+1 )℄: (3.11) Proof. DenotingS\N =fk i

g,weobservethattheme hanismofthealgorithm

ensuresthat f(x k i+1 )f(x ` )<f(x k i )

foralliandallk i

+1`k i+1

. Thisdire tly impliesthedesiredinequality. 2

3.1 Convergen e to Criti al Points

Werstprovethe onvergen eofouralgorithmto rst-order riti alpoints.

Ourrststepistoprovethat,aslongasarst-order riti alpointisnotapproa hed,

we do not have innitely many su essful non onvex iterations in the ourse of the

algorithm. Westartbyre allingthreeresultsfrom[2℄in orderto showthat the

trust-regionradiusisboundedawayfromzerointhis ase.

Lemma3.2 SupposethatA1-A4hold andthat ks k

k k

. Thenwe havethat

jf(x k +s k ) m k (x k +s k )j ubh 2 k ; (3.12) where x k +s k 2B k and ubh def = max[ ufh ; umh ℄: (3.13)

Theproofisidenti altothatofTheorem6.4.1in[2℄butwenowneedtomakethe

addi-tional assumptionthatks k

k k

expli it(instead ofbeingimpli it,in thisreferen e,

in thedenitionofatrust-regionstep).

Wenowshowthatthetrust-regionradiusmustin reaseifthe urrentiterateisnot

(9)

k k g k 6=0andthat k md kg k k(1 2 ) ubh : (3.14) Then iteration k 2 and k +1 k : (3.15)

The proof is thesame asTheorem 6.4.2in [2℄ when ks k

k k

, while (3.15) follows

fromStep4whenks k k> k asthen k +1 = k

. Asa onsequen e,weobtainthatthe

radius annotbe ometoosmallaslongasarst-order riti alpointisnotapproa hed.

Lemma3.4 Suppose that A1-A4 hold and that there exists a onstant lbg >0 su h that kg k k lbg

for allk. Then thereisa onstant lbd >0su hthat k lbd (3.16) for all k.

Proof. Assumethatiterationkistherstsu h that

k +1 1 md lbg (1 2 ) ubh (3.17)

Thismeansthatthetrust-regionradiushasbeende reasedatiterationk,whi hin

turn implies, from the onditionin Step 4of thealgorithm, that ks k

k k

. We

alsohavethat 1

k

k +1

, andhen ethat

k md lbg (1 2 ) ubh

Our assumption onthenorm ofthe gradientthen impliesthat (3.14) holds. This

and thefa t thatks k

k k

thus givethat(3.15)issatised. Butthis ontradi ts

thefa tthatiterationkistherstsu hthat(3.17)holds,andourinitialassumption

isthereforeimpossible. Thisyieldsthedesired on lusionwith

lb d = 1 md lbg (1 2 ) ubh 2

Wenowprovethe ru ialresultthatthenumberofsu essfulnon onvexiterations

mustbeniteunlessarst-order riti alpointisapproa hed.

Theorem3.5 SupposethatA1-A4 hold andthatthereexistsa onstant lbg >0su h that kg k k lbg

for all k. Then there an only be nitely many su essful non onvex

iterations inthe ourse ofthe algorithm, i.e. jS\Nj<+1.

Proof. Suppose, for the purpose of obtaining a ontradi tion, that there are

innitelymanysu essfulnon onvexiterations,whi hweindexbyS\N =fk i

(10)

k 1

in S\N,whi hinturn implies,with(2.3),that, fork2S\N,

f(x k ) f(x k +1 ) 1 [m k (x k ) m k (x k +s k )℄ 1 md kg k kmin kg k k k ; k 1 md lbg min lbg umh ; lb d ;

wherewehaveusedtheLemma3.4,(3.9)andourlowerboundonthegradientnorm

to obtain the last inequality. Combining now this bound with (3.11), we dedu e

that f(x 0 ) f(x k +1 ) k X j=0 j2S\N [f(x j ) f(x j+1 )℄& k 1 md lbg min lbg umh ; lb d ; where& k

=jf1;:::;kg\S\Nj. Aswehavesupposedthatthereareinnitelymany

su essfulnon onvexiterations,wehavethat

lim k !1 & k =+1; and [f(x 0 ) f(x k +1

)℄ is unbounded above, whi h ontradi ts the fa t that the

obje tive fun tion is bounded below, as stated in (3.8). Our initial assumption

must then be false, andthe set S\N of su essful non onvexiterations mustbe

nite. 2

Wenowestablishthe riti alityofthelimitpointofthesequen eofiterates when

there areonlynitely manysu essfuliterations.

Theorem3.6 SupposethatA1-A4and(2.3)holdandthatthereareonlynitelymany

su essful iterations,i.e. jSj<+1. Then x k

=x

for all suÆ ientlylarge k,andx

isrst-order riti al.

Proof. Letk 0

betheindexofthelastsu essfuliterate. Thenx =x k0+1 =x k0+j and k 0 +j < 1 forall j>0: (3.18)

Nowobservethat RESTRICTis set by thealgorithm in the ourse of every

unsu - essful iteration. This ag mustthus be set atthe beginning ofeveryiteration of

index k 0 +j+1for j >0. Asa onsequen e,ks k 0 +j+2 k k 0 +j+2 forall j >0.

This, (3.18)andtheme hanismofStep 4ofthealgorithmthenimplythat

lim k !1

k

=0: (3.19)

Assume now,forthe purposeof establishinga ontradi tion,that kg k0+1

k" for

some">0. ThenLemma3.4impliesthat(3.19)isimpossibleandwededu ethat

kg k0+j

(11)

Having proved the desired onvergen e property for the ase where S is nite,

we restri t our attention, for the rest of this se tion, to the ase where there are

innitelymanysu essfuliterations,i.e. jSj=+1. Werstinvestigatewhathappens

ifinnitelymanyvaluesareaddedtothelterinthe ourseofthealgorithm.

Theorem3.7 SupposethatA1-A4hold andthatjAj=jSj=+1. Then

liminf k !1

kg k

k=0: (3.20)

Proof. Assume,forthepurposeofobtaininga ontradi tion,that,forallklarge

enough, kg k k lbg (3.21) for some lbg

>0. Thisbound and Theorem3.5 then implythat jS\Nj isnite

and therefore that the lter is no longer reset to the empty set for k suÆ iently

large. Moreover,sin eourassumptionsimplythatfkg ki+1

kgisboundedaboveand

below,theremustexist asubsequen efk ` gfk i+1 gwherefk i g=Asu hthat lim `!1 g k` =g 1 with kg 1 k lbg : (3.22) Bydenitionoffk ` g,x k `

isa eptableforthelterforevery`,whi himplies,sin e

the lter is not reset for ` large enough, that, for ea h ` suÆ iently large, there

exists anindex j2f1;:::;ngsu hthat

jg k ` ;j j jg k ` 1 ;j j< g kg k ` 1 k: (3.23)

But(3.21) impliesthat kg k`

1 k

lbg

forall`suÆ ientlylarge. Hen ewededu e

from (3.23)that jg k`;j j jg k` 1;j j< g lbg

for all ` suÆ iently large. But the left-hand side of this inequality tends to zero

when`tendstoinnitybe auseof(3.22),yieldingthedesired ontradi tion. Hen e

(3.20) holds. 2

Considernowthe asewherethenumberofiteratesaddedtothelterinthe ourse

ofthealgorithmisnite.

Theorem3.8 Suppose that A1-A4 hold and that jSj = +1 but jAj < +1. Then

(3.20) holds.

Proof. Assume,againfor thepurposeof obtaininga ontradi tion, that(3.21)

holds for all k large enough and for some lbg

> 0. The niteness of jAj then

impliesthat k 1 andthatks k k k

forallk2SsuÆ ientlylarge. Ifwedene

&

p;k

=jfp;:::;kg\Sj,wethenobtainthat

f(x p ) f(x k +1 )= k X j=p j2S [f(x j ) f(x j+1 )℄& p;k 1 md lbg min lbg umh ; lb d ;

(12)

derivetheinequality. But& p;k

tendstoinnitywithkforaxedpsuÆ ientlylarge

sin ejSjisinnite,andweagainderivea ontradi tionfromthefa tthatf(x k +1

)

thenbe omesunboundedbelow. Thelimit(3.20)thenfollows. 2

Bythetwolasttheorems,wehavethatatleastoneofthelimitpointsofthesequen e

of iteratesgeneratedbythealgorithm satisestherst-orderne essary ondition. As

thefollowingexampleshows,this annotbeimprovedwithoutmodifyingthealgorithm.

Example3.1 Considertheobje tivefun tion

f(x)=x 3

(3x 4);

whi hhasa(degenerate (1)

) riti alpointat x=0,and itsglobalminimizeratx=1.

WewillshowthatitispossibleforAlgorithm2.1to onstru titeratesforwhi hx 2k = 1 2 k andx 2k +1 = 5 4

for k=0;1;2;:::; learlythere aretwolimitpoints,x L =0and x R = 5 4

,butonlytherstis riti al.

Let 0

>2,and suppose that g

< 1 2

and that thetrust-regionupdating s heme

(2.7)isspe i ally k +1 = 8 > < > : 1 2 k if k < 1 ; k if 1 k < 2 and 2 k if 2 k : (3.24)

Nowsupposethat

F=ff 0 (x 2k )gf 12(1+ 1 2 k ) 1 2 2k g and 2k >2: (3.25)

We then showthat the aboveiterationis possible forAlgorithm 2.1, andthat (3.25)

will persist. Considerrstx 2k = 1 2 k

, andthe onvexmodel

m 2k (x 2k +s)=f(x 2k )+sf 0 (x 2k )+ 1 2 s 2 h 2k ; where h 2k = f 0 (x 2k ) 5 4 x 2k >0:

Then theun onstrainedglobal minimizerof m 2k is s 2k = 5 4 x 2k , ands 2k will

suÆ- ientlyredu ethemodelwithin thetrustregionsin e 2k >2> 5 4 +( 1 2 ) k . Moreover m 2k (x 2k ) m 2k (x 2k +s 2k )= 1 2 (f 0 (x 2k )) 2 h 2k = 1 2 ( 5 4 x 2k )f 0 (x 2k )!0 while f(x 2k ) f(x 2k +s 2k )=f(x 2k ) f( 5 4 )>f(0) f( 5 4 )= 125 256 >0 andthus 2k 2 (3.26)

for largeenoughk. The trialpoint x 2k

+s 2k

is not a eptablefor thelter sin eits

gradientisf 0 ( 5 4 )= 75 16 f 0 (x 2k

),butitisana eptablepointbe ausethetrustregion

(1)

(13)

boundisina tiveandbe auseof(3.26). Thusx 2k +1 =x 2k +s 2k = 4 ,while(3.24)and (3.26) ensurethat 2k +1 =2 2k . Now onsiderx 2k +1 = 5 4

,andthe onvexmodel

m 2k +1 (x 2k +1 +s)=f(x 2k +1 )+sf 0 (x 2k +1 )+ 1 2 s 2 h 2k +1 ; where h 2k +1 = f 0 (x 2k +1 ) x 2k +1 + 1 2 k +1 >0:

As before, the un onstrained global minimizer of m 2k +1 is s 2k +1 = x 2k +1 1 2 k +1 , ands 2k +1

willsuÆ ientlyredu ethemodelwithinthetrustregionsin e 2k +1 >4> 5 4 +( 1 2 ) k . Althoughf(x 2k +1 ) f(x 2k +1 +s 2k +1 )<0andhen e 2k +1 <0; (3.27) x 2k +1 +s 2k +1 = 1 2 k +1

isa eptablefortheltersin eitiseasyto he kthat

jf 0 (x 2k +1 +s 2k +1 )j=jf 0 ( 1 2 k +1 )j< 1 2 jf 0 (x 2k )j: Hen e x 2k +2 = x 2k +1 +s 2k +1 = 1 2 k +1

. Moreoever (3.24) and (3.27) imply that

f 0 (x 2k +2 ) repla es f 0 (x 2k

) in the lter, and that 2k +2 = 1 2 2k +1 = 2k , and thus that (3.25)persists.

It is un lear how to modify the algorithm to enfor e the property that all limit

pointsarerst-order riti alwithoutadversely ae tingits numeri albehaviour. We

have onsiderednotallowinglteriterationswhentheratiobetweenthe urrent

gradi-entnormandthesmallestgradientnormfoundsofarex eedssomepres ribed(large)

onstant. Whilesu hamodi ationdoesnotappeartoae ttheresultsofour

numer-i alexperiments,todatewehavebeenunabletoshowthatthemodi ationyieldsthe

desired on lusion. Sin ewebelievethatthelikelihoodofthealgorithm onvergingto

morethanasinglelimitpointisverysmall(aswitheverytrust-regionmethodweare

awareof),theissuereallyisofmostlytheoreti alinterest.

We thus pursue our analysis by examining onvergen e to se ond-order riti al

points under the assumption that there is only one limit point. As in [2℄, we also

assumethefollowing.

A5 The matrix H k is arbitrarily lose to r xx f(x k

) whenever arst-order riti al

pointisapproa hed,i.e.

lim k !1 kr xx f(x k ) H k k=0 whenever lim k !1 kg k k=0: (Noti ethat h 2k

!0,andthusthat A5holdsin theaboveexample.)

Wearethenabletoderivethefollowingtheorem.

Theorem3.9 Suppose that A1-A5 hold and that the omplete sequen e of iterates

fx k

g onvergetothe uniquelimit pointx

. Then x

(14)

that[2,Lemma6.5.3℄isvalidinour ontext. Observealsothatourpreviousresults

implythat

g(x

)=0: (3.28)

Forthepurposeofderivinga ontradi tion, assumenowthat

def = min [r xx f(x )℄<0: (3.29)

Then, usingA5and(3.28), wededu ethatthereexistsak 0

su hthat, forkk 0 , min [H k ℄< 1 2 <0;

and, onsequently,thatk2N and

ks k k k (3.30) forkk 0

. OursuÆ ientde rease ondition(2.3)thenensuresthat, forkk 0 , m k (x k ) m k (x k +s k ) 1 2 md j jmin [ 1 4 2 ; 2 k ℄: (3.31)

Considernowtheratioofa hievedversuspredi tedredu tion k

inthe asewhere

k 1 2 j

j. Applying[2,Lemma 6.5.3℄to the ompletesequen efx k

g,wededu e

from (3.30)thattheremustexistak 1 k 0 andaÆ 1 2(0; 1 2 j j℄su h that k 2 forall kk 1 su h that k Æ 1 :

Asa onsequen e,ea hiterationwherethesetwo onditionsholdmustbevery

su - essfulandthealgorithmthenguaranteesthat k +1

k

. Thisandtheinequality

1 Æ 1 <Æ 1 1 2 j

jinturnimplythat

k min[ 1 Æ 1 ; k0 ℄ def = Æ 2 (3.32) for all k k 1

. Foreverysu essful iterationk k 1

, we then obtainfrom (3.31)

that f(x k ) f(x k +1 ) 1 2 1 md j jmin[ 1 4 2 ;Æ 2 2 ℄>0:

Remembering now that k 2 N for k k 1

(and thus that jNj = 1), we obtain

from (3.11)that jS\Nj,and hen ejSj, mustbenite, whi h inturn impliesthat

thetrust-regionradiustendstozero. Butthis ontradi ts(3.32). Hen eourinitial

assumption(3.29) mustbefalseandtheproofis omplete. 2

4 Numeri al experiments

Wenowreporttheresultsobtainedbyrunningouralgorithmonthesetof160

un on-strained (2)

problemsfromtheCUTEr olle tion[10℄. Thenamesoftheproblems with

theirdimensions (3)

aredetailedin Table4.1.

(2)

Weex ludedproblemBROYDN7Dbe auseofitsmultiplelo alminima. (3)

(15)

AIRCRFTB 5 DQRTIC 5000 OSBORNEA 5

ALLINITU 4 EDENSCH 10000 OSBORNEB 11

ARGLINA 200 EG2 1000 PALMER1C 8

ARGLINB 200 EIGENALS 2550 PALMER1D 7

ARGLINC 200 EIGENBLS 2550 PALMER2C 8

ARWHEAD 5000 EIGENCLS 2652 PALMER3C 8

BARD 3 ENGVAL1 10000 PALMER4C 8

BDQRTIC 5000 ENGVAL2 2 PALMER5C 6

BEALE 2 ERRINROS 50 PALMER6C 8

BIGGS3 3 EXPFIT 2 PALMER7C 8

BIGGS5 5 EXTROSNB 1000 PALMER8C 8

BIGGS6 6 FMINSRF2 5625 PARKCH 15

BOX2 2 FMINSURF 49 PENALTY1 1000

BOX3 3 FREUROTH 5000 PENALTY2 200

BRKMCC 2 GENROSE 500 PENALTY3 200

BROWNAL 200 GROWTHLS 3 POWELLSG 5000

BROWNBS 2 GULF 3 POWER 100

BROWNDEN 4 HAIRY 2 QUARTC 5000

BRYBND 5000 HATFLDD 3 RAYBENDL 2046

CHAINWOO 4000 HATFLDE 3 RAYBENDS 2046

CHNROSNB 50 HEART6LS 6 ROSENBR 2

CLIFF 2 HEART8LS 8 S308 2

CLPLATEA 10100 HELIX 3 SBRYBND 500

CLPLATEB 4970 HIELOW 3 SCHMVETT 5000

CLPLATEC 4970 HILBERTA 2 SCOSINE 5000

COSINE 10000 HILBERTB 10 SCURLY10 100

CRAGGLVY 5000 HIMMELBB 2 SCURLY20 100

CUBE 2 HIMMELBF 4 SCURLY30 100

CURLY10 10000 HIMMELBG 2 SENSORS 100

CURLY20 10000 HIMMELBH 2 SINEVAL 2

CURLY30 1000 HYDC20LS 99 SINQUAD 10000

DECONVU 61 JENSMP 2 SISSER 2

DENSCHNA 2 KOWOSB 4 SNAIL 2

DENSCHNB 2 LIARWHD 5000 SPARSINE 5000

DENSCHNC 2 LMINSURF 5329 SPARSQUR 10000

DENSCHND 3 LOGHAIRY 2 SPMSRTLS 4900

DENSCHNE 3 MANCINO 100 SROSENBR 5000

DENSCHNF 2 MARATOSB 2 SSC 4900

DIXMAANA 9000 MEXHAT 2 STRATEC 10

DIXMAANB 9000 MEYER3 3 TESTQUAD 5000

DIXMAANC 9000 MINSURF 36 TOINTGOR 50

DIXMAAND 9000 MOREBV 5000 TOINTGSS 5000

DIXMAANE 9000 MSQRTALS 1024 TOINTPSP 50

DIXMAANF 9000 MSQRTBLS 1024 TOINTQOR 50

DIXMAANG 9000 NCB20 5010 TQUARTIC 5000

DIXMAANH 9000 NCB20B 5000 TRIDIA 5000

DIXMAANI 9000 NLMSURF 5329 VARDIM 200

DIXMAANJ 9000 NONCVXU2 5000 VAREIGVL 50

DIXMAANK 9000 NONCVXUN 5000 VIBRBEAM 8

DIXMAANL 9000 NONDIA 5000 WATSON 12

DIXON3DQ 10000 NONDQUAR 5000 WOODS 10000

DJTL 2 NONMSQRT 100 YFITU 3

(16)

performed in double pre isionon aDell Latitude C840 portable omputer(1.6 Mhz,

1Gbyte ofRAM)under RedHat 9.0Linuxand theLaheyFortran ompiler(version

L6.10a)withdefaultoptions. Allattemptstosolvethetestproblemswerelimitedtoa

maximumof1000iterationsor1hourofCPUtime. Thevalues 1 =0:625, 2 =0:25, 3 =2, 1 =0:01, 2 =0:9, 0 =1and g =min 0:001; 1 2 p n wereused.

Twoparti ularvariantswere tested. Therst ( alleddefault) isthe algorithmas

des ribedabove,where,exa trstandse ondderivativesareusedandwhere,atea h

iteration, thetrialpoint is omputed byapproximatelyminimizing m k

(x k

+s) using

the Generalized Lan zos Trust-Region algorithm of [9℄ (without pre onditioning) as

implementedin theGALAHADlibrary[11℄. Thispro edureisterminatedatthersts

forwhi h krm k (x k +s)kmin h 0:1; p max( M ;krm k (x k )k) i krm k (x k )k; (4.33) where M

isthema hinepre ision. Inaddition,we hoose

f sup =min(10 6 f(x 0 );f(x 0 )+1000)

at Step 0 of the algorithm. Based on pra ti alexperien e [12℄, we also impose that

ks k

k1000 k

at alliterationsfollowingtherstoneat whi h arestri tedstepwas

taken. Finally, thealgorithmstops if

krf(x k )k10 6 p n: (4.34)

These ondalgorithmi variantisthepuretrust-regionversion,thatisthesame

algo-rithmwiththeex eptionthatnotrialpointisevera eptedin thelter.

Onthe160problems,thedefaultversionsu essfullysolved144andthepure

trust-region143. Failurealwayso ursbe ausethemaximaliteration ountisrea hedbefore

onvergen eisde lared,withtheex eptionofthetrust-regionvariantfailingonMEYER3

be ausetheproblem isjudgedto betooill- onditioned. Theltervariantisthusjust

asreliable (4)

as thetrust-regionversion.

Figures 4.1, 4.2 and 4.3 give the performan e proles for the twovariants for

it-erations, pu-timeand thetotalamountof onjugate-gradientiterations,respe tively.

Performan e proles give, for every 1, the proportion p() of test problems on

whi hea h onsidered algorithmi varianthasaperforman e withinafa tor ofthe

best (see[3℄ for a more omplete dis ussion). When omparing CPU times, we also

takeintoa ountina ura iesintiming by onsideringrun-timesasindistinguishable

iftheydierbylessthan1se ondorlessthan5%.

(4)

Thetwovariants onsistentlyfailonCHAINWOO,HYDC20LS,LMINSURF,LOGHAIRY,MEYER3, NLMSURF,

(17)

1

2

3

4

5

6

7

8

9

10

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1 σ

p(

σ

)

Default

Pure trust region

LANB

Figure4.1: Iterationsperforman eprolesforthetwovariantsandLANCELOTB

1

2

3

4

5

6

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1 σ

p(

σ

)

Default

Pure trust region

LANB

(18)

1

2

3

4

5

6

7

8

9

10

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1 σ

p(

σ

)

Default

Pure trust region

LANB

Figure 4.3: CG iterationsperforman eprolesforthetwovariantsandLANCELOT B

Itis notdiÆ ultto seein these guresthat theltervariantis signi antlymore

eÆ ientthanthepuretrust-regionmethodintermsofthenumberofiterations(whi h

is identi al to the number of fun tion and gradient evaluations). Its advantage is

smaller but signi ant in termsof onjugate-gradientsiterations,but is osetby the

additional ost of the lter management operations. As a result, both variants are

essentially omparableintermsof pu-timeeÆ ien y,withaveryslightadvantagefor

thedefaultmethod.

Theproles also in lude a omparison with LANCELOT-B, oneof theGALAHAD

odes[11℄. Thisisanon-monotonetrust-regionalgorithm(see[15℄or[2,Se tion10.1℄),

whi hweusedunpre onditionedwith 0

=1andwithitsothersettingsattheirdefault

values. Againthismethod,whi hsu essfullysolves141outof160problems,appears

to be onsistentlyinferiorto thenewlteralgorithm.

5 Con lusion

We havepresentedalteralgorithm forun onstrainedoptimizationandhaveshown,

under standardassumptions, thatitprodu esatleastarst-order riti alpoint,

irre-spe tiveofthe hosenstartingpoint. Undermildadditional onditions,wealsoproved,

onvergen eofthe ompletesequen eofiterates anonlyo urtoase ond-order

(19)

when omparingthenewalgorithmtoapuretrust-regionmethod,signi antgainsin

thenumberof iterationsandfun tion/gradientevaluations anbea hieved.

Referen es

[1℄ C.M.ChinandR.Flet her.Convergen epropertiesofSLP-lteralgorithmsthat

takesEQPsteps. Mathemati al Programming, Series A,96(1):161{177,2003.

[2℄ A.R.Conn,N.I.M.Gould,andPh.L.Toint.Trust-RegionMethods.Number01

in MPS-SIAMSeriesonOptimization.SIAM,Philadelphia,USA,2000.

[3℄ E. D. Dolan and J. J. More. Ben hmarking optimization software with

perfor-man eproles. Mathemati al Programming,91(2):201{213,2002.

[4℄ R.Flet her,N.I.M.Gould,S.Leyer,Ph.L.Toint,andA.Wa hter. Global

on-vergen eoftrust-regionSQP-lteralgorithmsfornonlinearprogramming. SIAM

Journalon Optimization,13(3):635{659,2002.

[5℄ R. Flet herand S. Leyer. Nonlinear programmingwithouta penalty fun tion.

Mathemati al Programming,91(2):239{269,2002.

[6℄ R.Flet her,S.Leyer,andPh.L.Toint.Ontheglobal onvergen eofalter-SQP

algorithm. SIAMJournalonOptimization,13(1):44{59,2002.

[7℄ C.C. Gonzaga,E.Karas,and M.Vanti. A globally onvergentltermethod for

nonlinearprogramming. SIAMJournalonOptimization,13(3):646{669,2003.

[8℄ N.I.M.Gould,S.Leyer,andPh.L.Toint.Amultidimensionallteralgorithmfor

nonlinearequationsandnonlinearleast-squares. SIAMJournalon Optimization,

(toappear),2004.

[9℄ N. I.M. Gould,S. Lu idi, M. Roma,and Ph.L.Toint. Solving thetrust-region

subproblemusingtheLan zosmethod. SIAMJournalonOptimization,9(2):504{

525,1999.

[10℄ N. I.M. Gould, D. Orban, and Ph. L.Toint. CUTEr, a ontrainedand

un on-strainedtestingenvironment,revisited.Transa tionsoftheACMonMathemati al

Software,29(4):373{394,2003.

[11℄ N.I.M.Gould,D. Orban,andPh.L.Toint. GALAHAD|alibraryofthread-safe

Fortran 90 pa kages for large-s ale nonlinear optimization. Transa tions of the

ACMon Mathemati alSoftware,29(4):353{372,2003.

[12℄ N.I.M.GouldandPh.L.Toint.FILTRANE,aFortran95lter-trust-region

pa k-ageforsolvingsystemsofnonlinearequalities,nonlinearinequalitiesandnonlinear

least-squaresproblems.Te hni alReport03/15,RutherfordAppletonLaboratory,

(20)

S ienti andStatisti al Computing,4(3):553{572,1983.

[14℄ J.No edaland Y. Yuan. Combiningtrustregion andline sear hte hniques. In

Y.Yuan,editor, Advan es inNonlinearProgramming,pages153{176,Dordre ht,

TheNetherlands,1998.KluwerA ademi Publishers.

[15℄ Ph.L.Toint. Anon-monotone trust-regionalgorithmfornonlinearoptimization

subje tto onvex onstraints. Mathemati al Programming,77(1):69{94,1997.

[16℄ M. Ulbri h, S. Ulbri h, and L. N. Vi ente. A globally onvergent primal-dual

interiorpointltermethod fornon onvexnonlinearprogramming. Mathemati al

Programming, SeriesA, (toappear),2004.

[17℄ A. Wa hter and L. T. Biegler. Global and lo al onvergen e of line sear h

l-ter methods for nonlinear programming. Te hni al Report CAPD B-01-09,

De-partmentofChemi alEngineering,CarnegieMellonUniversity,Pittsburgh,USA,