A NNALES DE L ’I. H. P., SECTION C
S. M. R OBINSON
Bundle-based decomposition : conditions for convergence
Annales de l’I. H. P., section C, tome S6 (1989), p. 435-447
<
http://www.numdam.org/item?id=AIHPC_1989__S6__435_0>
© Gauthier-Villars, 1989, tous droits réservés.
L’accès aux archives de la revue « Annales de l’I. H. P., section C » (
http://www.elsevier.com/locate/anihpc) implique l’accord avec les condi- tions générales d’utilisation (
http://www.numdam.org/conditions). Toute uti- lisation commerciale ou impression systématique est constitutive d’une infraction pénale. Toute copie ou impression de ce fichier doit conte- nir la présente mention de copyright.
Article numérisé dans le cadre du programme Numérisation de documents anciens mathématiques
http://www.numdam.org/
CONDITIONS FOR CONVERGENCE
S. M. ROBINSON University OJ
Wisconsin-Madison 1513University Avenue, Madison,
WI 53706Abstract
Bundle-based
decomposition
is arecently proposed
method for decentralized convexoptimization. Computational
tests indicate that it is very fast. In this paper we exhibit conditions for convergence of the method. In the process westudy
conditions forlinearly-
constrained
approximate
minimization of a convex function.1. Introduction.
Bundle-based
decomposition (BBD)
is arecently proposed
method forsolving
the convexoptimization problem
-where the
fi
are. closed proper convex functions on a ER"B
and eachAi
is a linear transformation from to R"B Theproblem ( 1.1 ) represents
a decentralizedoptimization
with certain overall constraintsconnecting
the individualproblems.
Themethod in
question
was described in[11],
and extensivecomputational
tests arereported
in
[9].
These tests showed the method to be very fastcompared
both to MINOS 5.0[10]
and to the Ho-Loute "advanced
implementation"
ofDantzig-Wolfe decomposition [3,4].
After the user
prescribes
certainparameters
the BBD methodproduces,
in a finitenumber of
steps, approximate primal
and dual solutions of( 1.1 ).
In this paper weidentify
conditions on the
problem (1.1)
under which the method converges: thatis,
under whichthe
parameters
can, inprinciple,
be set so that thecomputed
solutions lie within anypreassigned
tolerance of an actualpair
ofprimal
and dual solutions of( 1.1 ) . Thus,
theanalysis
here contributes apriori
convergenceconditions,
whereas in[9,
Th.3.7]
Medhidevelops a posteriori
error information.The rest of the paper consists of three sections. In
§2
weanalyze
the BBD method toestablish
properties
of theapproximate
solutions itproduces.
We show thatthey satisfy
The research reported here was sponsored by the National Science Foundation under Grant
CCR-8502202, and by the International Institute for Applied Systems Analysis (IIASA), Laxenburg, Austria.
The US Government has certain rights in this material, and is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon.
certain "e-first-order"
optimality
conditionsgiven by Strodiot, Nguyen,
and Heukemes[14],
and we characterize
points satisfying
those conditions in terms ofapproximate optimization
of a certainperturbed
dualpair
of convexprogramming problems.
In
§3
we introduce asimple
characterization of local boundedness formultifunctions,
and use it to show that the inverse of the multifunction associated with the e-first-order conditions is Hausdorff upper semicontinuous at interior
points
of itsimage. Further,
we obtain anexpression
for the interior of thatimage
and we show that it isindependent
of the tolerance ~.In
§4
we translate theinteriority
information obtained in§3
into apair
ofsimple
conditions on the
optimization problem (1.1);
these amount to a Slater conditionplus
acompactness assumption
on the level sets of the essentialobjective
function. Then we show that under these two conditions the BBD method converges in the sense described above.2. The BBD method and the e-first-order conditions.
The BBD method solves
( 1.1 ) by dualizing
withrespect
to theequality
constraint toproduce
a concave dualobjective
functionUnder the technical
assumptions
thatand that there exists
Pô
withwe have
where is the set of
points solving
the decentralizedsubproblem
The BBD
procedure
uses the bundle method[7]
to find anapproximate
minimizer of g,using (2.3)
and(2.4)
tocompute subgradients of g.
Since the way in which the method usesthis information is
important
to ouranalysis,
we describe it inenough
detail todevelop
the facts we shall need later.
The user of the method
prescribes
two smalltolerances,
E and b. At the termination of the bundlealgorithm
one has dual elementspi , ... , pk
and associatedprimal
elementst
20141,..., n ; j
=1, ... , k ~ having
thefollowing properties:
(1)
Xjiminimizes f=(-) - (Aipj,.)
for each iand j :
thatis,
(2)
Withd~
= a -E: 1
we have from(2.3)
(3)
There existAi,...~
allnon-negative,
withE~=i~ =
1 and such that with~ = we have
and
where
one has 0
by (2.6).
The method takes
p* = pk
to be theapproximate
dual solution for( 1.1 ).
To constructan
approximate primal
solution(31 , ... , 3n )
it setsnote that
so that
(2.7) implies
Therefore
if ð is small then( i 1, ... ,
isnearly
feasible for( 1.1 ) .
The
objective
of this paper can now beprecisely
stated as follows: exhibit conditionson the
problem ( 1.1 )
under which for eachpositive
Ty there exists apositive ~
so thatwhenever
max{ ~, E} ’Y
there arepoints (~... xn ) solving ( 1.1 )
andp* maximizing
thedual
objective 9,
such thatwhere
(31 , ... , 3n)
andp*
are thepoints produced by
thealgorithm
as described above.We shall obtain these conditions in
§4; they
turn out to bestrengthened
versions of the technicalassumptions (2.1 )
and(2.2).
In the remainder of this section we rewrite the information in
(2.5) through (2.10)
ina more
manageable
form. To do so we let 2: =(~i,... Xn)
ERN,
where N = 1 ni, and we definen
so that A :
R~ 2014~
Rm and Ax =~~
We use a similar convention for x and x, as well as forProposition
2.1. Theapproximate
solutions ~ andp* produced by
the BBD methodsatisfy
_ _ . , . _that
is,
and
Proof:
We have =xi
so we can rewrite(2.5)
asand so x3 E for
each j.
Hence for each z* andeach j,
The
quantity
in brackets can be rewritten asComparing
this with(2.9)
andusing p*
=p~
anddJ
= a - we see that this isjust
~j,so we have
Now
multiplying
thisinequality by Aj
andsumming over j,
we obtainthat
is,
i E which isequivalent
to(2.12).
Theproof
of(2.13)
consists ofmultiplying
the definitiondj
= a -Arj by Åj
andsumming over j. 1
The form in which
(2.11)
is writtenemphasizes
its closeness to the standard first-orderoptimality
conditions. Infact, (2.11)
amounts to aslight perturbation
of the "e-first-order"optimality
conditions ofStrodiot, Nguyen,
and Heukemes[14], specialized
to thepresent
case: the
perturbation
consists in thereplacement
of a zeroby
-d in the left side of theinclusion.
The
analysis
in[14] emphasized establishing
necessary and sufficient conditions for E-optimality
in the presence of a constraintqualification.
For thesimpler problem
with whichwe are concerned
here,
the conditions(2.11 )
have a very clear and directinterpretation,
which we
give
in thefollowing proposition.
Init,
we consider thepair
ofoptimization problems
and
where
Note that
(2.14)
and(2.15)
are dual to each other under theduality
structuregenerated by
which is a
slight perturbation (by d)
of that used togenerate
the dualob jective g
of theBBD method. The
function 9d is g - (., d~.
Proposition
2.2. Thefollowing
areequivalent:
(i) x andp* satisfy (2.11).
(ii)
Ax = a - d and,f (~) - 9d(p*)
~.Proof: x
andp* satisfy (2.11)
if andonly
if Ax = a - d andA*p*
E~~f(x).
Thesecond of these relations can be written as
so
(ii)
holds.Reversing
theargument
shows that(ii) implies (i). I
Now define a multifunction M with
arguments (~,r*,~) by
that
is,
for each fM(6,
.,.)
is the multifunction inverse to that on theright
side of(2.11).
. With this notation
M(0, 0, 0)
is theproduct
of theprimal
and dual solution sets of( 1.1 ),
where the
duality
structure is that used in the BBD method:namely, (2.16)
with d = 0.Therefore our aim of
proving
the BBD methodconvergent
will be achieved if we can show that when f,r*,
and s aresufficiently
close to zero, eachpoint
ofM(e, r*, s)
will lie withina
predetermined
distance of somepoint
ofM(0, 0, 0).
This amounts toproving
that M isHausdorff
upper semicontinuous at(0, 0, 0)
relative toR+
xRN
x Rm. In the next sectionwe exhibit conditions under which this will be true.
3.
Semicontinuity
of solutions to the E- first-order conditions.In
§2
we observed that the critical issue inproving
convergence of the BBD method was to show that theoperator M, expressing
solutions of theperturbed
~-first-order conditions in terms of theperturbations
and the tolerance E, was Hausdorff upper semicontinuous at(0, 0, 0).
In this section we prove thisby showing
that M islocally
bounded under certainassumptions.
We then conclude that M isactually
Hausdorff upper semicontinuous asdesired. Then in
§4
weanalyze
therequired assumptions
and relate them toproperties
of the minimizationproblem ( 1.1 ), thereby developing
conditions on( 1.1 )
under which the BBD method will converge.To
begin
theanalysis
of localboundedness,
we consider a multifunction G fromRk
toR~ . By definition,
G islocally
bounded at apoint
Xo ERk
if there is someneighborhood
N of xo such that
G(N) (that is,
the setU{
is bounded. Thefollowing simple proposition
characterizes local boundedness.Proposition
3.1. Let G be a multifunction fromRk
toRl.
Then G islocally
boundedat To E
Rk
if andonly
if for each y near To,Proof (only if):
Choose aneighborhood V
of xo smallenough
so thatG’(V)
C??B
forsome 7y, where B is the unit ball. Let y E
R~.
Then for each x E V and each x* EG(x), (x*, y - x)
Henceand therefore
(3.1)
holds. Note that ifG(V)
=0
the limitsuperior
is -~by
definition.(if):
Assume that(3.1 )
holds foreach y
near Xo. If G is notlocally
bounded at :cothen there is a
sequence { converging
to xo, withx~
~G(a’n)
such that~~j)
> n forn =
1, 2, ....
There is no loss inassuming
that converges to somepoint
zo . Nowchoose any y near xa.
By (3.1)
there is some 03B3 such that for each n,x*n,y - xn~
;.Dividing
thisinequality by
andtaking
thelimit,.we
find thatHowever,
=1,
so(3.2)
cannot hold for every such y. Therefore G islocally
boundedat
We consider
briefly
some classes of multifunctions thatsatisfy (3.1 ). First,
considermonotone
operators:
thatis,
multifunctions G :R~ 2014~ Rk having
the property that for each xl and 2:2 in dom G(= f ~
ERk G’(x) 5~ 0 })
and eachyi
Eand y2
EG(3-2),
one has
For such an
operator G,
if ~o E int dom G then for any y near any fixedy*
EG(y) ,
anyx near Xo and
any x*
EG(x),
we havetherefore
and
(3.1 )
holds. In this case the result ofProposition
3.1 is aspecial
case of Rockafellar’s theorem on the local boundedness of monotoneoperators [12],
and of Kato’s earlier results[5,6].
These results hold in much moregeneral
spacesand,
asmight
beexpected,
theirproofs
are much more substantial than that ofProposition
3.1.Next,
consider for some fixed E > 0 the multifunctionGe
definedby
Suppose
that a-i and x2belong
todom ~~f;
letp*1
andpi
bearbitrary,
and let(r*i, si)
Efor i =
1, 2.
Thenand
so
by
addition we find thatwhere we have used the natural extension of the inner
product
to Since thismultifunction
Ge
satisfies aninequality
similar to that satisfiedby
monotoneoperators,
we can use an
argument
similar to the onejust
made to show thatG
islocally
boundedat each
point
of int domGE .
Observe that since the
key inequalities
used above for monotoneoperators
and for theoperator G
aresymmetric
inarguments
andvalues,
the local boundedness conclusions hold also for the inverses of suchoperators,
where the inverse of a multifunction F :R~ 2014~ R/
is the multifunction
F-1 : RI --> Rk
definedby
Since the effective domain
of F-1
is then theimage
of F(written
imf,
this is the set~z~Rk F(x)),
the local boundedness assertions for the inverses hold at interiorpoints
ofthe
images
of theoriginal
multifunctions.Also,
note that thegraph
of theoperator GE
definedby (3.3)
can be written aswhere we have identified with its
2:*
EBE,f (~) ~.
As whene > 7~ the same
isotonicity
holds for thegraph
ofGE.
Inparticular,
for any sets U andV,
if E > q then and
G~ (V).
Therefore ifG7~ is locally
boundedsomewhere,
then the same boundapplies
toWe can summarize these observations in the
following corollary.
Corollary
3.2. Let E > 0 and letGe
be definedby (3.3). belongs
to the interiorof im GE,
then there exist aneighborhood
N of(rô , so )
and a bounded setV,
such that for each 7J E[0, f], 1 (N)
c V.We can see from the results
already proved
that we will need toidentify points
in theinterior of im
GE.
Thefollowing
theorem characterizes suchpoints:
infact,
it characterizes the closure and interior of im(aEg
+H), where g
is any closed proper convex function and H is asingle-valued,
continuous monotoneoperator.
In this sense it extends the fact thatim ~~g ~ im ~g,
where we write D to indicate that the sets C and D have the sameclosure and the same interior.
Theorem 3.3.
Let g
be a closed proper convex function onR~
and H asingle-valued,
continuous monotone
operator
fromRk
to itself such that8g
-f- H is maximal monotone.Th en for each f >
0,
im(~~g
+im (8g
+(im 8g)
+H(domBg).
Proo f:
Denoteby Ho
the restriction of H to dom8g.
ThenHo
is monotone, domdom Ho,
and8g
+Ho
=8g
+H,
which is maximal monotone.By
the theorem of Brézis and Haraux[2,
Th.4]
one hasim (8g
+im~g
+im Ho .
Butim Ho
=H(dom 8g),
sothis proves the second "^-_’" claim.
For the
first,
note that thegraph
inclusionproperty implies
im(~7+~) D
imand therefore this inclusion holds also for the closures and the interiors of these sets. Write
S’E
forim(~~g
+H)
and S forim (8g
+H),
and suppose that we could prove clSe
C cl S.We know that int S = int cl S
[1,
p.33],
and therefore we would have int int S = int cl S D int cl int5’c,
implying
that all of the sets in this chain of inclusions are the same. Therefore we will have finished theproof
if we can show that cl5’~
C cl S.Since
im~~g
C clim 8g
and dom~~g
C cl dom8g ,
we havewhere we have used the second "^-_’"
relation, already proved.
Nowby taking
theclosure
(of the left side
above,
we obtain cl6’c
= cl S asrequired. I
.
It is worth
remarking
that we do not ingeneral
haveequality,
even when H =0,
asthe
example g(x)
= e-z shows. Hereim 8g
=(2014oo.O),
but for E >0, im 8e g
=(-00,0].
Now recall that at the end of
§2
wepointed
out that the convergenceproperty
we wanted amounted to Hausdorffupper semicontinuity
of a certain multifunction. For amultifunction F
fromRk
toRl,
we say F is Hausdorff upper semicontinuous atR~
if for
each ~
> 0 there is someneighborhood
N of xo such thatF(N)
C +where B is the unit ball. As
might
beexpected,
thisproperty
isclosely
related to localboundedness. Specifically,
we say that F is closed at zo ifwhere is the
neighborhood system
at a:o. This amounts tosaying
that if zo and yn EF(xn)
for each n,with yn ~
yo, then yo EF(xo ).
Now it is easy to show that if F is closed at xo andlocally
boundedthere,
then it is Hausdorff upper semicontinuousat 2’o- This
fact, together
with what we haveproved
up to now, leads to thefollowing
continuity
result for solutions of the f-first-order conditions.Theorem 3.4. Let M be defined
by (2.I 7)
and let e > 0. Then M is Hausdorff upper semicontinuous at(6,r*,.s)~
relative toR+
xRN
xR m,
wheneverand
Proof:
We aregoing
to show that the(r*, s) satisfying (3.4)
and(3.5)
are thosebelonging
to the interior of theimage
of theoperator GE
definedby (3.3). By
Theorem 3.3 this is also the interior ofim Gu
for some a > f. ThenCorollary
3.2 shows that for someneighborhood
N of(r*, s)
and all q E[0, a] , Gn 1 (N)
is contained in some bounded set V.It follows that the
image
under M of aneighborhood
of(~,r*,s)
inR+
xRN
x R’n isbounded;
therefore M islocally
bounded at(f, r*, s).
If we considersn) converging
to
(e, r* , s)
and let E with(xn, pn) converging
to(xo,Põ),
then foreach n we have
and
Now
(3.6)
can be rewritten as’
taking
the limit andusing
the lowersemicontinuity
off
andf *
we find thatthat
is,
r* + E while we have s =Axo -
a from(3.7).
Hence EM(e, r* , s)
and therefore M is closed at(e, r* , s).
But this shows that M is Hausdorff upper semicontinuous at(~,r*,~),
as claimed.Now it remains to show that
(3.4)
and(3.5)
describe thepairs (r*, s)
in int imGE.
Applying
Theorem 3.3 withwe find that
where
g(x,p*) = ,f (~).
Nowand
Therefore
Now we
always
haveso these two sets have the same affine hull.
Therefore
A similar
argument using
the relation riA(C)
=A(ri C)
establishes thatTherefore,
as
required. I
Theorem 3.4
gives
ageneral
criterion for Hausdorff uppersemicontinuity
of the solu-tions to the e-first-order
optimality
conditions. In the next section weapply
this criterion to establish conditions for convergence of bundle-baseddecomposition.
4.
Application : Convergence
of the BBD method.In this section we
apply
Theorem 3.4 to prove convergence of the bundle-baseddecompo-
sition method discussed in
§2.
In the notation of thattheorem,
we want to prove that M is Hausdorff upper semicontinuous at(0, 0, 0)
relative toR+
xRN
x Therefore weneed to
verify (3.4)
for r* = 0 and(3.5)
for s = 0. The first of these conditions says thatwhere L* is the
subspace
im A* and I denotes the indicator function. This isequivalent (for example, by [8,
Lemma6])
towhere rec
f
denotes the recession function off.
Since7~
ispositively homogeneous,
it isits own recession
function;
as it alsoequals IL,
where L =ker A,
we see that(3.4)
withr* = 0 is
equivalent
to the assertion thatf
has no directions of recession in ker A. From[13,
Th.8.7]
we find that this isequivalent
to thefollowing compact-level-set
condition:By [13,
Th.27.1(f)],
we know that the set in(4.1)
iscompact
for eachreal 03B3
aslong
asfor some
real 03B3
it isnonempty
andbounded,
the boundednesssufficing
forcompactness
becausef
is assumed closed.Condition
(3.5)
with s = 0 isdirectly interpretable
as thefollowing Slater-type
con-dition :
For any d near
0,
thesystem
Ar = a - d has a solution x Edom f. (4.2)
It is worth
noting
that(4.1)
and(4.2)
arestrengthened
formsof, respectively,
theconditions
(2.2)
and(2.1)
used in thedevelopment
of the BBDmethod; essentially,
"ri"has been
replaced by
"int." Thefollowing
theorem shows that thisstrengthening
enablesus to conclude a
priori
that the method converges.Theorem 4.1. For i =
1, ... , n,
letfi
be closed proper convex functions from R"’ to( -00,
andAi
be linear transformations from to let a E Assume thefollowing:
(i)
For each d near 0 inRm,
thesystem .
is
solvable.
(ii)
For some real ~y, the setis
nonempty
and bounded.Then
for each ~
> 0 there exist b > 0 and f > 0 such thatp*,
and dsatisfy (2.5)-(2.10),
then there exist x1,..., xn andp*
such that(x1,...,xn)
minimizes1 ,f=(~i)
on theset { (~1, ... , = ca ~,
andp*
maximizes the functionand such that
Proof: (i)
and(ii)
areequivalent
to(4.2)
and(4.1) respectively,
and we have shownthese to be
equivalent
to(3.5)
with s = 0 and(3.4)
with r* = 0.Applying
Theorem3.4 with E =
0,
r* =0,
and s =0,
we find that the multifunction M definedby (2.17)
is Hausdorff upper semicontinuous at
(0, 0, 0)
relative toR+
xRN
x Rffi. This meansthat if E and 6 are taken to be
sufficiently
smallpositive numbers,
and if $ asrequired by (2.7),
then eachpoint
ofAf(~,0,2014~)
will lie within anypreassigned positive
distance from the set
M(0, 0,0).
ButM(0,0,0)
is the setof { (~1, ... , p-’" } having
hteoptimality properties
claimed in the statement of Theorem4.1,
andM(E, 0, -d) contains, by Proposition 2.1, all { (d-i,... satisfying (2.5)-(2.10). I
References
[1]
H.Brézis, Opérateurs
Maximaux Monotones etSemi-Groupes
de Contractions dans lesEspaces
deHilbert,
North-Holland Mathematics Studies No. 5(North-Holland, Amsterdam, 1973).
[2]
H. Brézis and A.Haraux, "Image
d’une sommed’opérateurs
monotones etapplica- tions,"
Israel Journal of Mathematics 23(1976)
165-186.[3]
J. K. Ho and E.Loute,
"An advancedimplementation
of theDantzig-Wolfe decompo-
sition
algorithm
for linearprogramming,"
in: G. B.Dantzig,
M. A. H.Dempster,
andM.
Kallio, eds., Large-Scale
LinearProgramming (International
Institute forApplied Systems Analysis, Laxenburg, Austria, 1981)
425-460.[4]
J. K. Ho and E.Loute,
"DECOMP User’sGuide," unpublished manuscript.
[5]
T.Kato, "Demicontinuity, hemicontinuity,
andmonotonicity,"
Bull. Amer. Math.Soc. 70
(1964)
548-550.[6]
T.Kato, "Demicontinuity, hemicontinuity,
andmonotonicity. II,"
Bull. Amer. Math.Soc. 73
(1967)
886-889.[7]
C.Lemaréchal,
J. -J.Strodiot,
and A.Bihain,
"On a bundlealgorithm
for nons-mooth
optimization,"
in: O. L.Mangasarian,
R. R.Meyer,
and S. M.Robinson, eds.,
Nonlinear
Programming
4(Academic Press,
NewYork, 1981)
245-282.[8]
L. McLinden and R. C.Bergstrom,
"Preservation of convergence of convex sets and functions in finitedimensions,"
Trans. Amer. Math. Soc. 268(1981)
127-142.[9]
D.Medhi, "Decomposition
of structuredlarge
scaleoptimization problems
andparallel
computing," Dissertation, Computer
SciencesDepartment, University
of Wisconsin- Madison(Madison,
WI53706, 1987).
See also twoJuly
1988manuscripts by
thisauthor: "Bundle-based
decomposition
for structuredlarge-scale
convexoptimiza-
tion
problems:
error estimate andcomputational experience,"
and "Parallel bundle- baseddecomposition
forlarge-scale
structured mathematicalprogramming problems,"
AT&T Bell
Laboratories, Holmdel,
NJ 07733.[10]
B. A.Murtagh
and M. A.Saunders,
"MINOS 5.0 User’sGuide,"
TechnicalReport
No.SOL
83-20, Systems Optimization Laboratory, Department
ofOperations Research,
StanfordUniversity (Stanford,
CA94305,
December1983).
[11]
S. M.Robinson,
"Bundle-baseddecomposition: Description
andpreliminary results,"
in: A.
Prékopa,
J.Szelezsán,
and B.Strazicky, eds., System Modelling
andOptimiza-
tion
(Lecture
Notes in Control and Information Sciences No.84, Springer-Verlag, Berlin, 1986)
751-756.[12]
R. T.Rockafellar,
"Local boundedness ofnonlinear,
monotoneoperators," Michigan
Mathematical Journal 16
(1969)
397-407.[13]
R. T.Rockafellar,
ConvexAnalysis (Princeton University Press, Princeton, NJ, 1970).
[14]
J. -J.Strodiot,
V.H.Nguyen,
and N.Heukemes,
"~-optimal
solutions in nondiffer-entiable
programming
and some relatedquestions,"
Math.Programming
25(1983)
307-328.