Existence of the Limit Value of Two Person Zero-Sum Discounted Repeated Games via Comparison Theorems

(1)

Existence of the Limit Value of Two Person Zero-Sum Discounted Repeated Games via Comparison Theorems

^∗

Sylvain Sorin^† and Guillaume Vigeral^‡

Communicated by Irinel Chiril Dragan

Abstract

We give new proofs of existence of the limit of the discounted values for two person zero- sum games in the three following frameworks: absorbing, recursive, incomplete information.

The idea of these new proofs is to use some comparison criteria.

AMS Classification: 91A15, 91A20, 49J40, 47J20

Keywords: stochastic games, repeated games, incomplete information, asymptotic value, comparison principle, variational inequalities

1 Introduction

The purpose of this article is to present a unified approach to the existence of the limit value for two person zero-sum discounted games. The main tools used in the proofs are

- the fact that the discounted value satisfies the Shapley equation [1],

- properties of accumulation points of the discounted values, and of the corresponding optimal strategies,

- comparison of two accumulation points leading to uniqueness and characterization.

∗The research of the first author was supported by grant ANR-08-BLAN- 0294-01 (France). The research of the second author was supported by grant ANR-10-BLAN 0112 (France)

†Combinatoire et Optimisation, IMJ, CNRS UMR 7586, Faculté de Mathématiques, Université P. et M. Curie Paris 6, Tour 15-16, 1er etage, 4 Place Jussieu, 75005 Paris and Laboratoire d’Econométrie, Ecole Polytechnique, France [email protected]

‡Corresponding author ; Université Paris-Dauphine, CEREMADE, Place du Maréchal De Lattre de Tassigny.

75775 Paris cedex 16, France [email protected]

(2)

We apply this program for three well known classes of games, each time covering the case where action spaces are compact.

For absorbing games, the results are initially due to Kohlberg [2] for finitely many actions, later extended in Rosenberg and Sorin [3] for the compact case. An explicit formula for the limit was recently obtained in Laraki, [4] and we obtain a related one. The case of recursive games was first handled in Everett [5], with a different notion of limit value involving asymptotic payoff on plays. It was later shown by Sorin [6] that these results implied also the existence of the limit value for two person zero-sum discounted games. The last class corresponds to games with incomplete information, where the results were initially obtained in Aumann and Maschler [7] and Mertens and Zamir [8] (including also the asymptotic study of the finitely repeated games). In that case, we follow a quite similar approach to Laraki [9].

2 Model, Notations and Basic Lemmas

Let G be a two person zero-sum stochastic game defined by a finite state space Ω, compact metric action spacesIandJ for player 1 and 2 (with mixed extensionsX = ∆(I)andY = ∆(J), respectively, where for a compact metric spaceC,∆(C)denotes the set of Borel probabilities on C, endowed with the weak-?topology ), a separately continuous real bounded payoffgonI×J×Ω and a separately continuous transitionρfrom I×J×Ωto ∆(Ω).

The game is played in discrete time. At stage t, given the state ω_t, the payers choose moves i_t ∈ I, j_t ∈ J, the stage payoff is g_t = g(i_t, j_t, ω_t) and the new state ω_t+1 is selected according to ρ(it, jt, ωt), and is announced to the players. Given λ ∈]0,1], the total evaluation in the λ- discounted game isP∞

t=1λ(1−λ)^t−1gt.

The Shapley operatorΦ(λ, f)[1] is then defined, forλ∈[0,1]andf in some closed subsetF0

of the set of bounded functions fromΩtoIR, by the formula

Φ(λ, f)(ω) = min

Y max

X

λg(x, y, ω) + (1−λ)Eρ(x,y,ω)f(·)

= max

X min

Y

λg(x, y, ω) + (1−λ)E_ρ(x,y,ω)f(·) ,

whereg andρare bilinearly extended to X×Y. Forλ >0, the only fixed point ofΦ(λ,·)is the valuevλ of the discounted game [1].

The sets of optimal actions of each player in the above formula are denoted byXλ(f)(ω)and Y_λ(f)(ω). Let X = X^Ω and, similarly, Y = Y^Ω. For simplicity, for any (x,y) ∈ X×Y we

(3)

denote ρ(x,y, ω) :=ρ(x(ω),y(ω), ω). Moreover, define Xλ(f) :=Q

ω∈ΩXλ(f)(ω) andYλ(f) :=

Q

ω∈ΩYλ(f)(ω).

Sdenotes the set of fixed points of the projective operatorΦ(0, .), andS0is the set of accumulation points of the family{vλ}as λgoes to 0.

The following lemmas are easy to establish in this finite state framework:

Lemma 2.1. S₀⊂ S.

Lemma 2.2. Assume that vλ_n converges to v ∈ S0 and that some sequence of optimal actions xλ_n∈Xλ_n(vλ_n)converges to x. Thenx∈X0(v).

Lemma 2.3. Let v andv⁰ be in S andΩ1 = Argmax(v−v⁰). For any x∈X0(v),y ∈Y0(v⁰), andω∈Ω1, the probabilityρ(x,y, ω) is supported byΩ1.

Proof. Sincev∈ S andx∈X0(v):

v(ω) = Φ(0, v)(ω)≤E_ρ(x,y,ω)v(·).

Using a dual inequality as well:

v(ω)−v⁰(ω)≤E_ρ(x,y,ω)(v−v⁰)(·),

and the result follows.

3 Absorbing Games

We consider here a special class of stochastic games, as defined in Section 2. We are given two separately continuous (payoff) functionsg,g^∗ from I×J to[−1,1], and a separately continuous (probability of absorption) functionpfromI×J to[0,1].

The repeated game with absorbing states is played in discrete time as follows. At staget= 1,2, ...

(if absorption has not yet occurred) player 1 chooses i_t∈Iand, simultaneously, player 2 chooses j_t∈J:

(i) the payoff at stagetisg(i_t, j_t);

(ii) with probability p^∗(i_t, j_t) := 1−p(i_t, j_t), absorption is reached and the payoff in all future stagess > tisg^∗(it, jt);

(iii) with probabilityp(it, jt), the situation is repeated at staget+ 1.

(4)

Recall that the asymptotic analysis for these games is due to Kohlberg [2] in the case whereI andJ are finite.

As usual, denoteX:= ∆(I)andY := ∆(J);g,pandp^∗are bilinearly extended toX×Y. Let p^∗(x, y)g^∗(x, y) := R

I×Jp^∗(i, j)g^∗(i, j)x(di)y(dj). g^∗(x, y) is thus the expected absorbing payoff, conditionally to absorption.

The Shapley operator of the game is then defined onIRby

Φ(λ, f) := min

y∈Y max

x∈X{λg(x, y) + (1−λ)(p(x, y)f+p^∗(x, y)g^∗(x, y)}

:= max

x∈Xmin

y∈Y{λg(x, y) + (1−λ)(p(x, y)f+p^∗(x, y)g^∗(x, y)}.

In this framework, we can prove a stronger version of Lemma 2.3:

Lemma 3.1.

i) Let f ∈IR such that f ≥Φ(0, f)andy∈Y0(f). Then, for any x∈X,

p^∗(x, y)>0 =⇒ f ≥g^∗(x, y).

ii) Letf ∈IR such thatf ≤Φ(0, f)andx∈X0(f). Then, for any y∈Y,

p^∗(x, y)>0 =⇒ f ≤g^∗(x, y).

Proof. We provei). Givenx∈X andy∈Y₀(f),

f ≥Φ(0, f)≥p(x, y)f+p^∗(x, y)g^∗(x, y)

andp(x, y) = 1−p^∗(x, y), hence the result.

Given λ∈]0,1[, x∈X and y ∈Y, let rλ(x, y) be the induced payoff in the discounted game by the corresponding stationary strategies: rλ(x, y) :=Ex,yP

λ(1−λ)^t−1gt. Lemma 3.2.

rλ(x, y)≤











g(x, y), if p^∗(x, y) = 0, max(g(x, y), g^∗(x, y)), if p^∗(x, y)>0.

Proof.

r_λ(x, y) =λg(x, y) + (1−λ) [p(x, y)r_λ(x, y) +p^∗(x, y)g^∗(x, y)] ;

(5)

hence

rλ(x, y) = λg(x, y) + (1−λ)p^∗(x, y)g^∗(x, y) λ+ (1−λ)p^∗(x, y) .

The previous lemma implies:

Lemma 3.3. Let λ∈]0,1[,x_λ∈X_λ(v_λ)andy∈Y; then

vλ≤











g(xλ, y), ifp^∗(xλ, y) = 0, max(g(x_λ, y), g^∗(x_λ, y)), ifp^∗(x_λ, y)>0.

Proof. Sincexλ is optimal in the discounted game, for anyy∈Y,

vλ≤rλ(xλ, y)

and the assertion follows from Lemma 3.2.

Combining the preceding lemmas yields:

Proposition 3.1. Assume that v_λ_n → v and x_λ_n → x with x_λ_n ∈ X_λ_n(v_λ_n). Let v⁰ such that v⁰≥Φ(0, v⁰)andy∈Y0(v⁰); then

v≤max(g(x, y), v⁰).

Proof. For any n and any y ∈ Y, Lemma 3.3 implies that either vλ_n ≤ g(xλ_n, y) or that p^∗(xλ_n, y) > 0, and vλ_n ≤ max(g(xλ_n, y), g^∗(xλ_n, y)). In the second case, since y ∈ Y0(v⁰), the first assertion in Lemma 3.1 ensures that g^∗(xλ_n, y)≤v⁰, so in both cases we get the inequality vλn ≤max(g(xλn, y), v⁰). Passing to the limit yields the result.

Corollary 3.1. v_λ converges as λgoes to 0.

Proof. Suppose, on the contrary, that there are two sequencesv_λ_n→v andv_λ⁰

n→v⁰ withv > v⁰. Up to an extraction, one can assume that xλ_n ∈ Xλ_n(vλ_n)converges to xand, similarly, yλ⁰_n ∈ Yλ⁰_n(vλ⁰_n) converges to y. By Lemma 2.2, v⁰ = Φ(0, v⁰)and y ∈ Y0(v⁰), so applying Proposition 3.1 we get v ≤ max(g(x, y), v⁰), hence v ≤ g(x, y). A dual reasoning yields v⁰ ≥ g(x, y), a contradiction.

We now identify the limitv of the absorbing game.

(6)

Definition 3.1. Define the functionW :X×Y →IRby

W(x, y) := med g(x, y), sup

x⁰;p^∗(x⁰,y)>0

g^∗(x⁰, y), inf

y⁰;p^∗(x,y⁰)>0g^∗(x, y⁰)

! ,

wheremed(·,·,·)denotes the median of three numbers, with the usual convention that a supremum (resp., an infimum) over an empty set equals−∞(resp.,+∞).

Corollary 3.2. The limit v is the value of the zero-sum game, denoted byΥ, with action spaces X andY and payoffW.

Proof. It is enough to show that v ≤ w := sup_xinfyW(x, y) as a dual argument yields the conclusion. Assume, by contradiction, thatw < v.

Letε >0withw+ 2ε < v. Considerx∈X0(v)an accumulation point ofxλ∈Xλ(vλ)and let y be anε-best response toxin the gameΥ. Lemma 3.1ii)implies that

inf

y⁰;p^∗(x,y⁰)>0

g^∗(x, y⁰)≥v > w+ε≥W(x, y),

so that

W(x, y) = max g(x, y), sup

x⁰;p^∗(x⁰,y)>0

g^∗(x⁰, y)

! .

Thus, sup

x⁰;p^∗(x⁰,y)>0

g^∗(x⁰, y) ≤ w+ε < v−ε and, similarly, g(x, y) < v−ε. The corresponding inequalities hold withx_λ, forλsmall enough:

p^∗(x_λ, y)[g^∗(x_λ, y)−(v−ε)]≤0, g(x_λ, y)≤v−ε,

leading by Lemma 3.2 tovλ≤v−ε, a contradiction.

Remark 3.1. The proof of Corollary 3.2 establishes in itself the existence of the limitv(by doing the same reasoning with any accumulation point ofvλ).

Furthermore, notice that this proves that the gameΥhas a value, which is not obvious a priori.

4 Recursive Games

Recursive games are another special class of stochastic games, as defined in Section 2. We are given a finite setΩ = Ω0∪Ω^∗, two compact metric setsI andJ, a payoff functiong^∗ from Ω^∗ to R, and a separately continuous function ρfrom I×J×Ω₀ to ∆(Ω). Ω^∗ is the set of absorbing

(7)

states, whileΩ0 is the set of recursive states.

The repeated recursive game is played in discrete time as follows. At stage t = 1,2, ..., if absorbtion has not yet occurred and the current state is ωt ∈ Ω0, player 1 chooses it ∈I and, simultaneously, player 2 choosesjt∈J:

(i) the payoff at stagetis 0;

(ii) the stateω_t+1 is chosen with probability distributionρ(ω_t+1|i_t, j_t, ω_t);

(iii) ifω_t+1∈Ω^∗, absorbtion is reached and the payoff in all future stagess > t isg^∗(ω_t+1);

(iv) ifωt+1∈Ω0, absorbtion is not reached and the game continues.

The study of those recursive games was first done by Everett [5], who proved that the game has a value when considering the asymptotic payoff on plays.

As before, denoteX := ∆(I)andY := ∆(J),X:=X^Ωand, similarly,Y=Y^Ω;ρis bilinearly extended toX×Y. Recall that in this framework, the Shapley operator is defined fromF1:=R^Ω⁰ to itself by

Φ(λ, f)(ω) := min

y∈Y max

x∈X

(

(1−λ) X

ω⁰∈Ω

ρ(ω⁰|x, y, ω)f(ω⁰) )

:= max

x∈Xmin

y∈Y

(

(1−λ) X

ω⁰∈Ω

ρ(ω⁰|x, y, ω)f(ω⁰) )

,

where, by convention,f(ω⁰) =g^∗(ω⁰)wheneverω⁰∈Ω^∗.

Proposition 4.1. Letv∈ S₀, andv⁰ such thatmax_Ωv(ω)−v⁰(ω)>0. Assume that the inequality v⁰(ω)≥Φ(0, v⁰)(ω)holds for allω∈Ω₁:= Argmax_Ω(v−v⁰). Thenv(·)≤0 onΩ₁.

Proof. Denote byΩ2 theArgmax of v on the set Ω1; it is enough to prove that v(·)≤0 onΩ2, so we assume the contrary. Up to extraction, vλ_n → v, xλ_n ∈ Xλ_n(vλ_n) → x and there exists ω0 ∈Ω2, which realizes the maximum ofvλ_n onΩ2 for everyn. In particular, v(ω0)>0. Since xλ_n is optimal, we get, for anyy∈Y:

vλn(ω0)≤(1−λn)



 X

ω⁰∈Ω2

ρ(ω⁰|xλn,y, ω0)vλn(ω⁰) + X

ω⁰∈Ω\Ω2

ρ(ω⁰|xλn,y, ω0)vλn(ω⁰)



,

so, by definition ofω₀,

(1−(1−λ_n)ρ(Ω₂|x_λ_n,y, ω₀))v_λ_n(ω₀)≤(1−λ_n) X

ω⁰∈Ω\Ω2

ρ(ω⁰|x_λ_n,y, ω₀)v_λ_n(ω⁰).

(8)

For simplicity, denoteρn:=ρ(Ω2|xλ_n,y, ω0). Ifρn= 1for infinitely many n, we immediately get v(ω0)≤0 and the requested contradiction, hence we assume that it is not the case. Hence, up to an extraction,µn defined byµn(w⁰) = ρ(ω⁰|xλn,y, ω0)

1−ρn

is a probability measure onΩ\Ω2. Then, fornlarge enough, we get an analogue of Lemma 3.3:

v_λ_n(ω₀) ≤ 1−λ_n 1−(1−λn)ρn

X

ω⁰∈Ω\Ω₂

ρ(ω⁰|xλn,y, ω₀)v_λ_n(ω⁰) (1)

= (1−λ_n)(1−ρ_n) λn+ (1−λn)(1−ρn)

X

ω⁰∈Ω\Ω₂

ρ(ω⁰|x_λ_n,y, ω₀) 1−ρn

v_λ_n(ω⁰) (2)

≤ max



0, X

ω⁰∈Ω\Ω2

µn(ω⁰)vλ_n(ω⁰)



. (3)

On the other hand, choose nowy∈Y₀(v⁰). Sinceω₀∈Ω₂,

v⁰(ω₀) ≥ Φ(0, v⁰)(ω₀)

≥



 X

ω⁰∈Ω2

ρ(ω⁰|xλn,y, ω0)v⁰(ω⁰) + X

ω⁰∈Ω\Ω2

ρ(ω⁰|xλn,y, ω0)v⁰(ω⁰)



,

so using the fact thatv⁰ is constant onΩ₂, we get an analogue to Lemma 3.1:

v⁰(ω₀)≥ X

ω⁰∈Ω\Ω2

µ_n(ω⁰)v⁰(ω⁰). (4)

Letting n go to infinity in inequalities (3) and (4), and using v(ω₀)>0, we obtain by com- pactness the existence ofµ∈∆(Ω\Ω2)such that

v(ω0) ≤ X

ω⁰∈Ω\Ω2

µ(ω⁰)v(ω⁰), (5)

v⁰(ω₀) ≥ X

ω⁰∈Ω\Ω₂

µ(ω⁰)v⁰(ω⁰). (6)

Substracting (6) from (5) yields

(v−v⁰)(ω₀)≤ X

ω⁰∈Ω\Ω2

µ(ω⁰)(v−v⁰)(ω⁰),

and since ω0 ∈Ω1 = Argmax_Ω(v−v⁰), this implies that the support of µis included in Ω1 and that (5) is an equality. This, in turn, forces the support ofµto be included inΩ2= Argmax_Ω₁v,

(9)

a contradiction to the construction ofµ.

Corollary 4.1. vλ converges as λgoes to 0.

Proof. Assume that there are two accumulation pointsvandv⁰withmaxΩ{v−v⁰}>0, and denote Ω₁= Argmax_Ω(v−v⁰). Then Proposition 4.1 implies thatv(·)≤0onΩ₁. A dual argument yields thatv⁰(·)≥0onΩ₁, a contradiction.

We now recover a characterization of the limit due to Everett [5]:

Corollary 4.2. S0⊂L⁺∩L⁻, whereA is the closure ofA and

L⁺:=











f ∈R^Ω,

Φ(0, f)(ω)≤f(ω) ∀ω∈Ω₀ Φ(0, f)(ω) =f(ω) =⇒ f(ω)≥0 f(ω)≥g^∗(ω) ∀ω∈Ω^∗











, (7)

and symmetrically

L⁻:=











f ∈R^Ω,

Φ(0, f)(ω)≥f(ω) ∀ω∈Ω0

Φ(0, f)(ω) =f(ω) =⇒ f(ω)≤0 f(ω)≤g^∗(ω) ∀ω∈Ω^∗











. (8)

We will need the following lemma:

Lemma 4.1. For anyε≥0, there exist Ω⁰ ⊂Ω₀ andv⁰∈ F1such that the couple(Ω⁰, v⁰)satisfies a) v⁰(ω) =g^∗(ω)for allω∈Ω^∗.

b) v⁰(ω) =v(ω)−ε onΩ⁰.

c) v(ω)≥v⁰(ω)> v(ω)−ε onΩ0\Ω⁰. d) For any ω∈Ω0\Ω⁰,Φ(0, v⁰)(ω)> v⁰(ω).

e) For anyω∈Ω⁰,Φ(0, v⁰)(ω) =v⁰(ω).

Proof. This was proved in [10], but we recall the proof for the sake of completeness.

Let E be the set of couples (Ω⁰⁰, v⁰⁰) such that Ω⁰⁰ ⊂ Ω₀, v⁰⁰ ∈ F1, and (Ω⁰⁰, v⁰⁰) satisfies properties a) to d). This set is nonempty since (Ω₀, v−ε1ω∈Ω0)∈E. Since Ω₀ is finite, we can choose a couple(Ω⁰, v⁰)in E such that there is no (Ω⁰⁰, v⁰⁰)in E withΩ⁰⁰(Ω⁰. LetΩe be the set

(10)

on whichΦ(0, v⁰)(ω) =v⁰(ω); we now prove thatΩ = Ωe ⁰, hence that(Ω⁰, v⁰)also satisfies property e).

By contradiction, assume thatΩe (Ω⁰and consider, for smallα >0,vα:=v⁰+α1ω∈Ω⁰\eΩ. The couple(eΩ, vα)clearly satisfies properties a) to c) forα < ε. It also satisfies property d) forαsmall enough by continuity ofΦ(0,·). So, forαsmall enough, the couple(eΩ, v_α)is inE, contradicting the minimality ofΩ⁰.

We can now prove Corollary 4.2:

Proof of Corollary 4.2. Letv∈ S0, letε >0and define(v⁰,Ω⁰)as in Lemma 4.1. By properties a) to c) ,kv−v⁰k_∞≤ε. IfΩ⁰=∅, then property d) implies that v⁰∈L⁻. IfΩ⁰ is nonempty, then, by properties b), c) and e),Ω⁰= Argmax(v−v⁰)andΦ(0, v⁰)(·) =v⁰(·)onΩ⁰. Hence, Proposition 4.1 yields thatv(·)≤0onΩ⁰. Sov⁰(·)≤0onΩ⁰ andv⁰ ∈L⁻as well. This implies thatv∈L⁻. By duality,v∈L⁺.

Remark 4.1. This corollary implies in itself thatv_λconverges, as there is at most one element in the intersection, see [3] and Proposition 9 in [6].

5 Games with Incomplete Information

We consider here two person zero-sum games with incomplete information (independent case and standard signalling). πis a product probabilityp⊗q on a finite product spaceK×L, with p∈P = ∆(K),q ∈Q= ∆(L). g is a payoff function from I×J ×K×Lto IRwhere I and J are finite action sets. Given the parameter (k, `)selected according to π, each player knows one component (kfor player 1, `for player 2) and holds a prior on the other component. From stage 1 on, the parameter is fixed, the repeated game with payoffg(·,·, k, `)is played. The moves of the players at stage t are {it, jt}, the payoff is gt =g(it, jt, k, `) and the information of the players after stagetis {it, jt}. X = ∆(I)^K andY = ∆(J)^L are the type-dependent mixed action sets of the players;g is extended onX×Y ×K×Lbyg(x, y, p, q) =P

k,` p^kq^`g(x^k, y^`, k, `).

Given (x, y, p, q), let x(i) = P

kx^k_ip^k be the probability of action i and p(i) be the conditional probability onK given the actioni, explicitlyp^k(i) = ^p_x(i)^k^x^kⁱ (and, similarly, foryandq).

While this framework is not a particular case of section 2, since the set P×Qthat will play the role of the state space is not finite, it is still possible to introduce a Shapley operator for this

(11)

game. This operator is defined on the setF2of continuous concave-convex fonctions onP×Qby:

Φ(λ, f)(p, q) := min

y∈Y

max

x∈X







λg(p, q, x, y) + (1−λ)X

i,j

x(i)y(j)f(p(i), q(j))







(9)

:= max

x∈X

min

y∈Y







λg(p, q, x, y) + (1−λ)X

i,j

x(i)y(j)f(p(i), q(j))







(10)

and the value v_λ of the λ-discounted game is the unique fixed point of Φ(λ, .) on F2. These relations are due to Aumann and Maschler (1966) [7] and Mertens and Zamir (1971) [8].

X_λ(f)(p, q)denotes the set of optimal strategies of player 1 inΦ(λ, f)(p, q).

In this framework, anyf ∈ F2is a fixed point of the projective operatorΦ(0, .), that isF2=S.

Note that, if C is a bound for the payoff function g, then any vλ is bounded by C as well, and is moreoverC-Lipschitz. The family{vλ} is thus relatively compact for the topology of uniform convergence, henceS0, the set of accumulation points of the family{vλ}, is nonempty.

To ease the notations, we will denote the sum P

i,jx(i)y(j)f(p(i), q(j))by E_ρ(x,p)×ρ⁰(y,q)f(˜p,q).˜ Note thatp˜only depends onxandp, and thatq˜only depends ony andq.

For anyf ∈ F2, Jensen’s inequality ensures that E_ρ(x,p)f(˜p, q)≤f(p, q). The strategies of player 1 for which the equality holds for all f ∈ F₂ are called non revealing. Their set is denoted N R(p) :={x∈X; ˜p=p, ρ(x, p)a.s.}. The setN R(q) of non revealing strategies of player 2 is defined similarly.

Finally, the non-revealing valueuis

u(p, q) := min

y∈N R(q) max

x∈N R(p)g(x, y, p, q) = max

x∈N R(p) min

y∈N R(q)g(x, y, p, q).

The existence of limvλ was first proved in [7] for games with incomplete information on one side. It was then generalized in [8] for games with incomplete information on both sides, with a characterization of the limitvbeing the only solution of the system

v=Cav_pmin(u, v), v=Vex_qmax(u, v),

where Cav(f) (resp. Vex(f)) denotes the smallest concave function in the first variable which is larger thanf (resp., the largest convex function in the second variable which is smaller thanf).

A shorter proof of this result (including characterization) was established in [9]. The tools used in the following proof are quite similar to the one used in [9], but the structure differs.

(12)

Lemmas 2.1 and 2.2 still hold in this framework; we now prove a more precise version of Lemma 2.3 using the geometry ofP×Q. LetC(P×Q)be the set of real continuous functions onP×Q.

Lemma 5.1. Let v ∈ S and let f ∈C(P×Q) be concave with respect to the first variable. If (p, q)is an extreme point of Argmax(v−f), thenX0(v)(p, q)⊂N R(p).

Proof. Letx∈X0(v)(p, q)andy∈N R(q); then

v(p, q)≤E_ρ(x,p)×ρ⁰_(y,q)v(˜p,q) =˜ E_ρ(x,p)v(˜p, q),

while, by Jensen’s inequality,

f(p, q)≥E_ρ(x,p)f(˜p, q);

so

E_ρ(x,p)(v−f)(˜p, q)≥(v−f)(p, q).

Since(p, q)∈Argmax(v−f),

E_ρ(x,p)(v−f)(˜p, q) = (v−f)(p, q),

and (˜p, q) ∈ Argmax(v−f), ρ(x, p) a.s. Since (p, q) is an extreme point of Argmax(v−f), it follows thatp˜=p, ρ(x, p)a.s.andx∈N R(p).

Remark that v∈ F2 implies thatN R(p)⊂X0(v)(p, q) sincev is a saddle function; hence, in fact,N R(p) =X0(v)(p, q)in the previous lemma.

Note the analogy between Lemma 5.1 and Lemma 3.1. Lemma 3.3 also has an analogue in this setup:

Lemma 5.2. Let xλ∈Xλ(vλ)(p, q)andy∈N R(q), then

vλ(p, q)≤g(xλ, y, p, q).

Proof. By definition ofv_λ andx_λ,

vλ(p, q) ≤ λg(xλ, y, p, q) + (1−λ)E_ρ(x_λ_,p)×ρ0(y,q)vλ(˜p,q)˜

≤ λg(x_λ, y, p, q) + (1−λ)v_λ(p, q),

using Jensen’s inequality and the fact thaty ∈N R(q). Hencev_λ(p, q)≤g(x_λ, y, p, q).

(13)

Recall that S0⊂ S is the set of accumulation points of{vλ}for the uniform norm.

Proposition 5.1. Let v∈ S0.

i) Letf ∈C(P×Q)be concave with respect to the first variable. Then, at any extreme point (p, q) of Argmax(v−f),

v(p, q)≤u(p, q).

ii) Let f⁰ ∈C(P×Q) be convex with respect to the second variable. Then, at any extreme point (p, q)of Argmin(v−f⁰),

v(p, q)≥u(p, q).

Proof. We provei). Apply Lemma 5.2 to any sequence{vλ_n}converging tov. By Lemma 2, there existsx∈X0(v)(p, q)such that

v(p, q)≤ inf

y∈N R(q)g(x, y, p, q).

Lemma 5.1 implies thatx∈N R(p)(sincev∈ S), and the result follows by definition ofu.

ii)is established in a dual way.

Proposition 5.1 implies the following corollaries of existence and caracterization oflimvλ: Corollary 5.1. vλ converges uniformly asλtend to 0.

Proof. Let v and v⁰ in S0 and let (p, q) be any extreme point of Argmax(v−v⁰). Since v⁰ is concave in its first variable, Proposition 5.1 i)with f =v⁰ implies that v(p, q)≤u(p, q). Apply now Proposition 5.1 ii) to f⁰ = v to get v⁰(p, q) ≥ u(p, q). This yields v(p, q) ≤ v⁰(p, q), hence v≤v⁰, thus uniqueness.

Corollary 5.2. Any accumulation point v ofv_λ satisfies the Mertens-Zamir system:

v=Cavpmin(u, v), v=Vexqmax(u, v).

Proof. Letvbe an accumulation point of the family{vλ}. We only prove thatv≤Cavpmin(u, v).

Sincev is concave inp, the other inequality is trivial, and a dual argument gives the dual equality.

Denote f = Cavpmin(u, v), and let (p, q) be any extreme point of Argmax(v −f). Since f is concave inp, Proposition 5.1 implies thatv(p, q)≤u(p, q). Hence,

v(p, q)≤min(u, v)(p, q)≤f(p, q)