Existence of the Limit Value of Two Person Zero-Sum Discounted Repeated Games via Comparison Theorems

Sylvain Sorin and Guillaume Vigeral

Communicated by Irinel Chiril Dragan

Abstract

We give new proofs of existence of the limit of the discounted values for two person zero-sum games in the three following frameworks: absorbing, recursive, incomplete information.

The idea of these new proofs is to use some comparison criteria.

AMS Classification: 91A15, 91A20, 49J40, 47J20

Keywords: stochastic games, repeated games, incomplete information, asymptotic value, comparison principle, variational inequalities

1 Introduction

The purpose of this article is to present a unified approach to the existence of the limit value for two person zero-sum discounted games. The main tools used in the proofs are

- the fact that the discounted value satisfies the Shapley equation [1],

- properties of accumulation points of the discounted values, and of the corresponding optimal strategies,

- comparison of two accumulation points leading to uniqueness and characterization.

The research of the first author was supported by grant ANR-08-BLAN-0294-01 (France). The research of the second author was supported by grant ANR-10-BLAN 0112 (France).

Combinatoire et Optimisation, IMJ, CNRS UMR 7586, Faculté de Mathématiques, Université P. et M. Curie Paris 6, Tour 15-16, 1er étage, 4 Place Jussieu, 75005 Paris and Laboratoire d’Econométrie, Ecole Polytechnique, France [email protected]

Corresponding author; Université Paris-Dauphine, CEREMADE, Place du Maréchal De Lattre de Tassigny, 75775 Paris cedex 16, France [email protected]


We apply this program for three well known classes of games, each time covering the case where action spaces are compact.

For absorbing games, the results are initially due to Kohlberg [2] for finitely many actions, later extended in Rosenberg and Sorin [3] for the compact case. An explicit formula for the limit was recently obtained in Laraki [4], and we obtain a related one. The case of recursive games was first handled in Everett [5], with a different notion of limit value involving asymptotic payoff on plays. It was later shown by Sorin [6] that these results also implied the existence of the limit value for two person zero-sum discounted games. The last class corresponds to games with incomplete information, where the results were initially obtained in Aumann and Maschler [7] and Mertens and Zamir [8] (including also the asymptotic study of the finitely repeated games). In that case, we follow an approach quite similar to that of Laraki [9].

2 Model, Notations and Basic Lemmas

Let $G$ be a two person zero-sum stochastic game defined by a finite state space $\Omega$, compact metric action spaces $I$ and $J$ for player 1 and 2 (with mixed extensions $X = \Delta(I)$ and $Y = \Delta(J)$, respectively, where for a compact metric space $C$, $\Delta(C)$ denotes the set of Borel probabilities on $C$, endowed with the weak-$\star$ topology), a separately continuous real bounded payoff $g$ on $I \times J \times \Omega$ and a separately continuous transition $\rho$ from $I \times J \times \Omega$ to $\Delta(\Omega)$.

The game is played in discrete time. At stage $t$, given the state $\omega_t$, the players choose moves $i_t \in I$, $j_t \in J$, the stage payoff is $g_t = g(i_t, j_t, \omega_t)$ and the new state $\omega_{t+1}$ is selected according to $\rho(i_t, j_t, \omega_t)$, and is announced to the players. Given $\lambda \in \,]0,1]$, the total evaluation in the $\lambda$-discounted game is $\sum_{t=1}^{\infty} \lambda (1-\lambda)^{t-1} g_t$.
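The discounted evaluation is a weighted average of the stage payoffs, with weights $\lambda(1-\lambda)^{t-1}$ summing to 1. A minimal sketch, assuming a finite (truncated) payoff stream given as a Python list; the function name `discounted_evaluation` is a hypothetical helper, not part of the paper:

```python
def discounted_evaluation(payoffs, lam):
    """Return sum over t >= 1 of lam * (1 - lam)**(t - 1) * g_t for a finite stream."""
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must lie in ]0, 1]")
    total = 0.0
    weight = lam  # weight of stage t = 1
    for g_t in payoffs:
        total += weight * g_t
        weight *= 1.0 - lam  # weight of the next stage
    return total


# A constant stream of 1's evaluates to (almost) 1 once the horizon is long
# enough for the neglected tail (1 - lam)**horizon to be negligible.
approx_one = discounted_evaluation([1.0] * 2000, 0.05)
```

For $\lambda = 1$ only the first stage counts, while small $\lambda$ spreads the weight over roughly $1/\lambda$ stages.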

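The Shapley operator $\Phi(\lambda, f)$ [1] is then defined, for $\lambda \in [0,1]$ and $f$ in some closed subset $\mathcal{F}_0$ of the set of bounded functions from $\Omega$ to $\mathbb{R}$, by the formula

\[
\begin{aligned}
\Phi(\lambda, f)(\omega) &= \min_{Y} \max_{X} \left\{ \lambda g(x, y, \omega) + (1-\lambda)\, E_{\rho(x,y,\omega)} f(\cdot) \right\} \\
&= \max_{X} \min_{Y} \left\{ \lambda g(x, y, \omega) + (1-\lambda)\, E_{\rho(x,y,\omega)} f(\cdot) \right\},
\end{aligned}
\]

where $g$ and $\rho$ are bilinearly extended to $X \times Y$. For $\lambda > 0$, the only fixed point of $\Phi(\lambda, \cdot)$ is the value $v_\lambda$ of the discounted game [1].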

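The sets of optimal actions of each player in the above formula are denoted by $X_\lambda(f)(\omega)$ and $Y_\lambda(f)(\omega)$. Let $\mathbf{X} = X^{\Omega}$ and, similarly, $\mathbf{Y} = Y^{\Omega}$. For simplicity, for any $(\mathbf{x}, \mathbf{y}) \in \mathbf{X} \times \mathbf{Y}$ we denote $\rho(\mathbf{x}, \mathbf{y}, \omega) := \rho(\mathbf{x}(\omega), \mathbf{y}(\omega), \omega)$. Moreover, define $\mathbf{X}_\lambda(f) := \prod_{\omega \in \Omega} X_\lambda(f)(\omega)$ and $\mathbf{Y}_\lambda(f) := \prod_{\omega \in \Omega} Y_\lambda(f)(\omega)$.

$\mathcal{S}$ denotes the set of fixed points of the projective operator $\Phi(0, \cdot)$, and $\mathcal{S}_0$ is the set of accumulation points of the family $\{v_\lambda\}$ as $\lambda$ goes to 0.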

The following lemmas are easy to establish in this finite state framework:

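Lemma 2.1. $\mathcal{S}_0 \subset \mathcal{S}$.

Lemma 2.2. Assume that $v_{\lambda_n}$ converges to $v \in \mathcal{S}_0$ and that some sequence of optimal actions $\mathbf{x}_{\lambda_n} \in \mathbf{X}_{\lambda_n}(v_{\lambda_n})$ converges to $\mathbf{x}$. Then $\mathbf{x} \in \mathbf{X}_0(v)$.

Lemma 2.3. Let $v$ and $v'$ be in $\mathcal{S}$ and $\Omega_1 = \mathrm{Argmax}(v - v')$. For any $\mathbf{x} \in \mathbf{X}_0(v)$, $\mathbf{y} \in \mathbf{Y}_0(v')$, and $\omega \in \Omega_1$, the probability $\rho(\mathbf{x}, \mathbf{y}, \omega)$ is supported by $\Omega_1$.

Proof. Since $v \in \mathcal{S}$ and $\mathbf{x} \in \mathbf{X}_0(v)$:

\[
v(\omega) = \Phi(0, v)(\omega) \le E_{\rho(\mathbf{x},\mathbf{y},\omega)} v(\cdot).
\]

Using a dual inequality as well:

\[
v(\omega) - v'(\omega) \le E_{\rho(\mathbf{x},\mathbf{y},\omega)} (v - v')(\cdot),
\]

and the result follows.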
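The two steps left implicit in this proof can be spelled out as follows. This is only a sketch of the intended argument, written with the notation of Lemma 2.3; the symbol $m$ for the maximum of $v - v'$ is introduced here for convenience.

```latex
% Dual inequality: y is optimal for the minimizer in \Phi(0, v'), and v' \in \mathcal{S}.
v'(\omega) = \Phi(0, v')(\omega) \ge E_{\rho(\mathbf{x},\mathbf{y},\omega)} v'(\cdot).

% Subtracting from v(\omega) \le E_{\rho(\mathbf{x},\mathbf{y},\omega)} v(\cdot):
(v - v')(\omega) \le E_{\rho(\mathbf{x},\mathbf{y},\omega)} (v - v')(\cdot).

% For \omega \in \Omega_1, with m := \max (v - v'):
m = (v - v')(\omega) \le E_{\rho(\mathbf{x},\mathbf{y},\omega)} (v - v')(\cdot) \le m,

% so the expectation of the nonnegative function m - (v - v') is zero, and
% \rho(\mathbf{x},\mathbf{y},\omega) only charges states \omega' with (v - v')(\omega') = m,
% that is, states of \Omega_1.
```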

3 Absorbing Games

We consider here a special class of stochastic games, as defined in Section 2. We are given two separately continuous (payoff) functions $g$, $g^*$ from $I \times J$ to $[-1,1]$, and a separately continuous (probability of absorption) function $p^*$ from $I \times J$ to $[0,1]$.

The repeated game with absorbing states is played in discrete time as follows. At stage $t = 1, 2, \dots$ (if absorption has not yet occurred) player 1 chooses $i_t \in I$ and, simultaneously, player 2 chooses $j_t \in J$:

(i) the payoff at stage $t$ is $g(i_t, j_t)$;

(ii) with probability $p^*(i_t, j_t)$, absorption is reached and the payoff in all future stages $s > t$ is $g^*(i_t, j_t)$;

(iii) with probability $p(i_t, j_t) := 1 - p^*(i_t, j_t)$, the situation is repeated at stage $t+1$.


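Recall that the asymptotic analysis for these games is due to Kohlberg [2] in the case where $I$ and $J$ are finite.

As usual, denote $X := \Delta(I)$ and $Y := \Delta(J)$; $g$, $p$ and $p^*$ are bilinearly extended to $X \times Y$. Let $p^*(x, y)\, g^*(x, y) := \int_{I \times J} p^*(i, j)\, g^*(i, j)\, x(di)\, y(dj)$. $g^*(x, y)$ is thus the expected absorbing payoff, conditional on absorption.

The Shapley operator of the game is then defined on $\mathbb{R}$ by

\[
\begin{aligned}
\Phi(\lambda, f) &:= \min_{y \in Y} \max_{x \in X} \left\{ \lambda g(x, y) + (1-\lambda)\bigl( p(x, y) f + p^*(x, y) g^*(x, y) \bigr) \right\} \\
&\phantom{:}= \max_{x \in X} \min_{y \in Y} \left\{ \lambda g(x, y) + (1-\lambda)\bigl( p(x, y) f + p^*(x, y) g^*(x, y) \bigr) \right\}.
\end{aligned}
\]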
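Since $\Phi(\lambda,\cdot)$ is a $(1-\lambda)$-contraction, its fixed point $v_\lambda$ can be approximated by iterating the operator. The sketch below is not taken from the paper. It assumes two pure actions per player, so that each evaluation of $\Phi$ reduces to a 2x2 matrix game solved in closed form, and it uses the classical Big Match as test data. The names `value_2x2`, `shapley_absorbing` and `discounted_value` are hypothetical helpers.

```python
def value_2x2(a):
    """Value of the zero-sum 2x2 matrix game a (row player maximizes)."""
    maxmin = max(min(a[0]), min(a[1]))
    minmax = min(max(a[0][0], a[1][0]), max(a[0][1], a[1][1]))
    if maxmin == minmax:  # saddle point in pure actions
        return maxmin
    (a11, a12), (a21, a22) = a
    # No pure saddle point: both players mix, classical closed form.
    return (a11 * a22 - a12 * a21) / (a11 + a22 - a12 - a21)


def shapley_absorbing(lam, f, g, p_star, g_star):
    """Phi(lam, f) for an absorbing game with two pure actions per player (assumption)."""
    def cell(i, j):
        cont = (1 - p_star[i][j]) * f + p_star[i][j] * g_star[i][j]
        return lam * g[i][j] + (1 - lam) * cont
    return value_2x2([[cell(0, 0), cell(0, 1)], [cell(1, 0), cell(1, 1)]])


def discounted_value(lam, g, p_star, g_star, tol=1e-12, max_iter=1_000_000):
    """Approximate the fixed point of Phi(lam, .) by successive iterations."""
    f = 0.0
    for _ in range(max_iter):
        nxt = shapley_absorbing(lam, f, g, p_star, g_star)
        if abs(nxt - f) < tol:
            return nxt
        f = nxt
    return f


# Big Match (illustrative data, not from the paper): the top row absorbs.
G = [[1.0, 0.0], [0.0, 1.0]]        # stage payoffs g(i, j)
P_STAR = [[1.0, 1.0], [0.0, 0.0]]   # absorption probabilities p*(i, j)
G_STAR = [[1.0, 0.0], [0.0, 0.0]]   # absorbing payoffs g*(i, j); bottom row unused

values = [discounted_value(lam, G, P_STAR, G_STAR) for lam in (0.5, 0.1, 0.01)]
```

On this example the fixed point equals 1/2 for every discount factor, so the family $v_\lambda$ trivially converges; other data would in general give values that depend on $\lambda$.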

In this framework, we can prove a stronger version of Lemma 2.3:

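Lemma 3.1.

i) Let $f \in \mathbb{R}$ be such that $f \ge \Phi(0, f)$ and $y \in Y_0(f)$. Then, for any $x \in X$,

\[
p^*(x, y) > 0 \implies f \ge g^*(x, y).
\]

ii) Let $f \in \mathbb{R}$ be such that $f \le \Phi(0, f)$ and $x \in X_0(f)$. Then, for any $y \in Y$,

\[
p^*(x, y) > 0 \implies f \le g^*(x, y).
\]

Proof. We prove i). Given $x \in X$ and $y \in Y_0(f)$,

\[
f \ge \Phi(0, f) \ge p(x, y) f + p^*(x, y) g^*(x, y)
\]

and $p(x, y) = 1 - p^*(x, y)$, hence the result.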
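For completeness, the final step of this proof is the following one-line computation (a sketch in the notation above, where $p^*$ is the absorption probability and $p = 1 - p^*$):

```latex
f \ge p(x,y)\, f + p^*(x,y)\, g^*(x,y)
\iff \bigl(1 - p(x,y)\bigr) f \ge p^*(x,y)\, g^*(x,y)
\iff p^*(x,y)\, f \ge p^*(x,y)\, g^*(x,y),
% and dividing by p^*(x,y) > 0 gives f \ge g^*(x,y).
```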

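Given $\lambda \in \,]0,1[$, $x \in X$ and $y \in Y$, let $r_\lambda(x, y)$ be the induced payoff in the discounted game by the corresponding stationary strategies: $r_\lambda(x, y) := E_{x,y} \sum_{t=1}^{\infty} \lambda (1-\lambda)^{t-1} g_t$.

Lemma 3.2.

\[
r_\lambda(x, y) \le
\begin{cases}
g(x, y), & \text{if } p^*(x, y) = 0, \\
\max\bigl(g(x, y), g^*(x, y)\bigr), & \text{if } p^*(x, y) > 0.
\end{cases}
\]

Proof.

\[
r_\lambda(x, y) = \lambda g(x, y) + (1-\lambda) \bigl[ p(x, y)\, r_\lambda(x, y) + p^*(x, y)\, g^*(x, y) \bigr];
\]

hence

\[
r_\lambda(x, y) = \frac{\lambda g(x, y) + (1-\lambda)\, p^*(x, y)\, g^*(x, y)}{\lambda + (1-\lambda)\, p^*(x, y)}.
\]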
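The closed form shows that $r_\lambda(x, y)$ is a convex combination of $g(x, y)$ and $g^*(x, y)$, which is the content of Lemma 3.2. A minimal numerical check, assuming scalar inputs standing for $g(x,y)$, $p^*(x,y)$ and $g^*(x,y)$; both function names are hypothetical:

```python
def stationary_payoff(lam, g, p_star, g_star):
    """Closed form of r_lambda for stationary play (scalar inputs)."""
    return (lam * g + (1 - lam) * p_star * g_star) / (lam + (1 - lam) * p_star)


def stationary_payoff_series(lam, g, p_star, g_star, horizon=5000):
    """Truncated series: stage t pays g if play is still unabsorbed, else g_star."""
    total, weight, alive = 0.0, lam, 1.0
    for _ in range(horizon):
        # alive = probability that absorption has not occurred before this stage
        total += weight * (alive * g + (1 - alive) * g_star)
        weight *= 1 - lam
        alive *= 1 - p_star
    return total


closed = stationary_payoff(0.2, 0.3, 0.5, -0.4)
series = stationary_payoff_series(0.2, 0.3, 0.5, -0.4)
```

The two computations agree up to the truncation error of the series, and the result lies between the two payoffs.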

The previous lemma implies:

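Lemma 3.3. Let $\lambda \in \,]0,1[$, $x_\lambda \in X_\lambda(v_\lambda)$ and $y \in Y$; then

\[
v_\lambda \le
\begin{cases}
g(x_\lambda, y), & \text{if } p^*(x_\lambda, y) = 0, \\
\max\bigl(g(x_\lambda, y), g^*(x_\lambda, y)\bigr), & \text{if } p^*(x_\lambda, y) > 0.
\end{cases}
\]

Proof. Since $x_\lambda$ is optimal in the discounted game, for any $y \in Y$,

\[
v_\lambda \le r_\lambda(x_\lambda, y)
\]

and the assertion follows from Lemma 3.2.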

Combining the preceding lemmas yields:

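Proposition 3.1. Assume that $v_{\lambda_n} \to v$ and $x_{\lambda_n} \to x$ with $x_{\lambda_n} \in X_{\lambda_n}(v_{\lambda_n})$. Let $v'$ be such that $v' \ge \Phi(0, v')$ and $y \in Y_0(v')$; then

\[
v \le \max\bigl(g(x, y), v'\bigr).
\]

Proof. For any $n$ and any $y \in Y$, Lemma 3.3 implies that either $v_{\lambda_n} \le g(x_{\lambda_n}, y)$ or that $p^*(x_{\lambda_n}, y) > 0$, and $v_{\lambda_n} \le \max\bigl(g(x_{\lambda_n}, y), g^*(x_{\lambda_n}, y)\bigr)$. In the second case, since $y \in Y_0(v')$, the first assertion in Lemma 3.1 ensures that $g^*(x_{\lambda_n}, y) \le v'$, so in both cases we get the inequality $v_{\lambda_n} \le \max\bigl(g(x_{\lambda_n}, y), v'\bigr)$. Passing to the limit yields the result.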

Corollary 3.1. $v_\lambda$ converges as $\lambda$ goes to 0.

Proof. Suppose, on the contrary, that there are two sequences $v_{\lambda_n} \to v$ and $v_{\lambda'_n} \to v'$ with $v > v'$. Up to an extraction, one can assume that $x_{\lambda_n} \in X_{\lambda_n}(v_{\lambda_n})$ converges to $x$ and, similarly, $y_{\lambda'_n} \in Y_{\lambda'_n}(v_{\lambda'_n})$ converges to $y$. By Lemma 2.2, $v' = \Phi(0, v')$ and $y \in Y_0(v')$, so applying Proposition 3.1 we get $v \le \max\bigl(g(x, y), v'\bigr)$, hence $v \le g(x, y)$. A dual reasoning yields $v' \ge g(x, y)$, a contradiction.

We now identify the limit $v$ of the absorbing game.


Definition 3.1. Define the function $W : X \times Y \to \mathbb{R}$ by

\[
W(x, y) := \mathrm{med}\left( g(x, y),\ \sup_{x';\, p^*(x', y) > 0} g^*(x', y),\ \inf_{y';\, p^*(x, y') > 0} g^*(x, y') \right),
\]

where $\mathrm{med}(\cdot,\cdot,\cdot)$ denotes the median of three numbers, with the usual convention that a supremum (resp., an infimum) over an empty set equals $-\infty$ (resp., $+\infty$).
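The median and the empty-set conventions can be illustrated as follows. This is a minimal sketch, not the authors' construction: it assumes the two families of absorbing payoffs have already been computed and are passed as finite lists, which is a simplification of the supremum and infimum over mixed actions. The names `med` and `W` follow the definition; the argument names are hypothetical.

```python
import math


def med(a, b, c):
    """Median of three extended real numbers."""
    return sorted((a, b, c))[1]


def W(g_xy, absorbing_vs_y, absorbing_vs_x):
    """Median payoff W(x, y) from pre-computed ingredients.

    g_xy:            the non-absorbing payoff g(x, y)
    absorbing_vs_y:  values g*(x', y) over the x' with p*(x', y) > 0
    absorbing_vs_x:  values g*(x, y') over the y' with p*(x, y') > 0
    """
    sup_term = max(absorbing_vs_y, default=-math.inf)  # sup over empty set = -inf
    inf_term = min(absorbing_vs_x, default=math.inf)   # inf over empty set = +inf
    return med(g_xy, sup_term, inf_term)
```

When neither player can trigger absorption, both lists are empty and $W(x, y)$ reduces to $g(x, y)$.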

Corollary 3.2. The limit $v$ is the value of the zero-sum game, denoted by $\Upsilon$, with action spaces $X$ and $Y$ and payoff $W$.

Proof. It is enough to show that $v \le w := \sup_x \inf_y W(x, y)$, as a dual argument yields the conclusion. Assume, by contradiction, that $w < v$.

Let $\varepsilon > 0$ with $w + 2\varepsilon < v$. Consider $x \in X_0(v)$ an accumulation point of $x_\lambda \in X_\lambda(v_\lambda)$ and let $y$ be an $\varepsilon$-best response to $x$ in the game $\Upsilon$. Lemma 3.1 ii) implies that

\[
\inf_{y';\, p^*(x, y') > 0} g^*(x, y') \ge v > w + \varepsilon \ge W(x, y),
\]

so that

\[
W(x, y) = \max\left( g(x, y),\ \sup_{x';\, p^*(x', y) > 0} g^*(x', y) \right).
\]

Thus, $\sup_{x';\, p^*(x', y) > 0} g^*(x', y) \le w + \varepsilon < v - \varepsilon$ and, similarly, $g(x, y) < v - \varepsilon$. The corresponding inequalities hold with $x_\lambda$, for $\lambda$ small enough:

\[
p^*(x_\lambda, y) \bigl[ g^*(x_\lambda, y) - (v - \varepsilon) \bigr] \le 0, \qquad g(x_\lambda, y) \le v - \varepsilon,
\]

leading by Lemma 3.2 to $v_\lambda \le v - \varepsilon$, a contradiction.

Remark 3.1. The proof of Corollary 3.2 establishes in itself the existence of the limit $v$ (by doing the same reasoning with any accumulation point of $v_\lambda$).

Furthermore, notice that this proves that the game $\Upsilon$ has a value, which is not obvious a priori.

4 Recursive Games

Recursive games are another special class of stochastic games, as defined in Section 2. We are given a finite set $\Omega = \Omega^0 \cup \Omega^*$, two compact metric sets $I$ and $J$, a payoff function $g^*$ from $\Omega^*$ to $\mathbb{R}$, and a separately continuous function $\rho$ from $I \times J \times \Omega^0$ to $\Delta(\Omega)$. $\Omega^*$ is the set of absorbing states, while $\Omega^0$ is the set of recursive states.

The repeated recursive game is played in discrete time as follows. At stage $t = 1, 2, \dots$, if absorption has not yet occurred and the current state is $\omega_t \in \Omega^0$, player 1 chooses $i_t \in I$ and, simultaneously, player 2 chooses $j_t \in J$:

(i) the payoff at stage $t$ is 0;

(ii) the state $\omega_{t+1}$ is chosen with probability distribution $\rho(\omega_{t+1} \mid i_t, j_t, \omega_t)$;

(iii) if $\omega_{t+1} \in \Omega^*$, absorption is reached and the payoff in all future stages $s > t$ is $g^*(\omega_{t+1})$;

(iv) if $\omega_{t+1} \in \Omega^0$, absorption is not reached and the game continues.

The study of those recursive games was first done by Everett [5], who proved that the game has a value when considering the asymptotic payoff on plays.

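As before, denote $X := \Delta(I)$ and $Y := \Delta(J)$, $\mathbf{X} := X^{\Omega^0}$ and, similarly, $\mathbf{Y} = Y^{\Omega^0}$; $\rho$ is bilinearly extended to $X \times Y$. Recall that in this framework, the Shapley operator is defined from $\mathcal{F}_1 := \mathbb{R}^{\Omega^0}$ to itself by

\[
\begin{aligned}
\Phi(\lambda, f)(\omega) &:= \min_{y \in Y} \max_{x \in X} \left\{ (1-\lambda) \sum_{\omega' \in \Omega} \rho(\omega' \mid x, y, \omega)\, f(\omega') \right\} \\
&\phantom{:}= \max_{x \in X} \min_{y \in Y} \left\{ (1-\lambda) \sum_{\omega' \in \Omega} \rho(\omega' \mid x, y, \omega)\, f(\omega') \right\},
\end{aligned}
\]

where, by convention, $f(\omega') = g^*(\omega')$ whenever $\omega' \in \Omega^*$.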

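Proposition 4.1. Let $v \in \mathcal{S}_0$, and $v'$ be such that $\max_{\omega} \bigl( v(\omega) - v'(\omega) \bigr) > 0$. Assume that the inequality $v'(\omega) \ge \Phi(0, v')(\omega)$ holds for all $\omega \in \Omega_1 := \mathrm{Argmax}(v - v')$. Then $v(\cdot) \le 0$ on $\Omega_1$.

Proof. Denote by $\Omega_2$ the Argmax of $v$ on the set $\Omega_1$; it is enough to prove that $v(\cdot) \le 0$ on $\Omega_2$, so we assume the contrary. Up to extraction, $v_{\lambda_n} \to v$, $\mathbf{x}_{\lambda_n} \in \mathbf{X}_{\lambda_n}(v_{\lambda_n}) \to \mathbf{x}$ and there exists $\omega_0 \in \Omega_2$, which realizes the maximum of $v_{\lambda_n}$ on $\Omega_2$ for every $n$. In particular, $v(\omega_0) > 0$. Since $\mathbf{x}_{\lambda_n}$ is optimal, we get, for any $\mathbf{y} \in \mathbf{Y}$:

\[
v_{\lambda_n}(\omega_0) \le (1-\lambda_n) \left( \sum_{\omega' \in \Omega_2} \rho(\omega' \mid \mathbf{x}_{\lambda_n}, \mathbf{y}, \omega_0)\, v_{\lambda_n}(\omega') + \sum_{\omega' \in \Omega \setminus \Omega_2} \rho(\omega' \mid \mathbf{x}_{\lambda_n}, \mathbf{y}, \omega_0)\, v_{\lambda_n}(\omega') \right),
\]

so, by definition of $\omega_0$,

\[
\bigl( 1 - (1-\lambda_n)\, \rho(\Omega_2 \mid \mathbf{x}_{\lambda_n}, \mathbf{y}, \omega_0) \bigr)\, v_{\lambda_n}(\omega_0) \le (1-\lambda_n) \sum_{\omega' \in \Omega \setminus \Omega_2} \rho(\omega' \mid \mathbf{x}_{\lambda_n}, \mathbf{y}, \omega_0)\, v_{\lambda_n}(\omega').
\]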


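For simplicity, denote $\rho_n := \rho(\Omega_2 \mid \mathbf{x}_{\lambda_n}, \mathbf{y}, \omega_0)$. If $\rho_n = 1$ for infinitely many $n$, we immediately get $v(\omega_0) \le 0$ and the requested contradiction, hence we assume that it is not the case. Hence, up to an extraction, $\mu_n$ defined by

\[
\mu_n(\omega') = \frac{\rho(\omega' \mid \mathbf{x}_{\lambda_n}, \mathbf{y}, \omega_0)}{1 - \rho_n}
\]

is a probability measure on $\Omega \setminus \Omega_2$. Then, for $n$ large enough, we get an analogue of Lemma 3.3:

\[
\begin{aligned}
v_{\lambda_n}(\omega_0) &\le \frac{1-\lambda_n}{1 - (1-\lambda_n)\rho_n} \sum_{\omega' \in \Omega \setminus \Omega_2} \rho(\omega' \mid \mathbf{x}_{\lambda_n}, \mathbf{y}, \omega_0)\, v_{\lambda_n}(\omega') && (1) \\
&= \frac{(1-\lambda_n)(1-\rho_n)}{\lambda_n + (1-\lambda_n)(1-\rho_n)} \sum_{\omega' \in \Omega \setminus \Omega_2} \frac{\rho(\omega' \mid \mathbf{x}_{\lambda_n}, \mathbf{y}, \omega_0)}{1 - \rho_n}\, v_{\lambda_n}(\omega') && (2) \\
&\le \max\left\{ 0,\ \sum_{\omega' \in \Omega \setminus \Omega_2} \mu_n(\omega')\, v_{\lambda_n}(\omega') \right\}. && (3)
\end{aligned}
\]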
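The passage from (1) to (3) rests on two elementary facts, sketched below. The symbols $\theta_n$ and $s_n$ are introduced here only for this explanation and do not appear in the original argument.

```latex
% Identity behind (1) = (2):
1 - (1-\lambda_n)\rho_n = \lambda_n + (1-\lambda_n)(1-\rho_n).

% Set
\theta_n := \frac{(1-\lambda_n)(1-\rho_n)}{\lambda_n + (1-\lambda_n)(1-\rho_n)} \in [0, 1[,
\qquad
s_n := \sum_{\omega' \in \Omega \setminus \Omega_2} \mu_n(\omega')\, v_{\lambda_n}(\omega').

% Then (2) reads \theta_n s_n, and since 0 \le \theta_n < 1:
\theta_n s_n \le s_n \ \text{ if } s_n \ge 0,
\qquad
\theta_n s_n \le 0 \ \text{ if } s_n < 0,
% which gives \theta_n s_n \le \max\{0, s_n\}, that is, (3).
```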

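On the other hand, choose now $\mathbf{y} \in \mathbf{Y}_0(v')$. Since $\omega_0 \in \Omega_2$,

\[
v'(\omega_0) \ge \Phi(0, v')(\omega_0) \ge \sum_{\omega' \in \Omega_2} \rho(\omega' \mid \mathbf{x}_{\lambda_n}, \mathbf{y}, \omega_0)\, v'(\omega') + \sum_{\omega' \in \Omega \setminus \Omega_2} \rho(\omega' \mid \mathbf{x}_{\lambda_n}, \mathbf{y}, \omega_0)\, v'(\omega'),
\]

so using the fact that $v'$ is constant on $\Omega_2$, we get an analogue to Lemma 3.1:

\[
v'(\omega_0) \ge \sum_{\omega' \in \Omega \setminus \Omega_2} \mu_n(\omega')\, v'(\omega'). \qquad (4)
\]

Letting $n$ go to infinity in inequalities (3) and (4), and using $v(\omega_0) > 0$, we obtain by compactness the existence of $\mu \in \Delta(\Omega \setminus \Omega_2)$ such that

\[
v(\omega_0) \le \sum_{\omega' \in \Omega \setminus \Omega_2} \mu(\omega')\, v(\omega'), \qquad (5)
\]

\[
v'(\omega_0) \ge \sum_{\omega' \in \Omega \setminus \Omega_2} \mu(\omega')\, v'(\omega'). \qquad (6)
\]

Subtracting (6) from (5) yields

\[
(v - v')(\omega_0) \le \sum_{\omega' \in \Omega \setminus \Omega_2} \mu(\omega')\, (v - v')(\omega'),
\]

and since $\omega_0 \in \Omega_1 = \mathrm{Argmax}(v - v')$, this implies that the support of $\mu$ is included in $\Omega_1$ and that (5) is an equality. This, in turn, forces the support of $\mu$ to be included in $\Omega_2 = \mathrm{Argmax}_{\Omega_1} v$, a contradiction to the construction of $\mu$.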

Corollary 4.1. $v_\lambda$ converges as $\lambda$ goes to 0.

Proof. Assume that there are two accumulation points $v$ and $v'$ with $\max\{v - v'\} > 0$, and denote $\Omega_1 = \mathrm{Argmax}(v - v')$. Then Proposition 4.1 implies that $v(\cdot) \le 0$ on $\Omega_1$. A dual argument yields that $v'(\cdot) \ge 0$ on $\Omega_1$, a contradiction.

We now recover a characterization of the limit due to Everett [5]:

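Corollary 4.2. $\mathcal{S}_0 \subset \overline{L^+} \cap \overline{L^-}$, where $\overline{A}$ is the closure of $A$ and

\[
L^+ := \left\{ f \in \mathbb{R}^{\Omega} :\
\begin{array}{l}
\Phi(0, f)(\omega) \le f(\omega) \quad \forall \omega \in \Omega^0, \\
\Phi(0, f)(\omega) = f(\omega) \implies f(\omega) \ge 0, \\
f(\omega) \ge g^*(\omega) \quad \forall \omega \in \Omega^*
\end{array}
\right\}, \qquad (7)
\]

and symmetrically

\[
L^- := \left\{ f \in \mathbb{R}^{\Omega} :\
\begin{array}{l}
\Phi(0, f)(\omega) \ge f(\omega) \quad \forall \omega \in \Omega^0, \\
\Phi(0, f)(\omega) = f(\omega) \implies f(\omega) \le 0, \\
f(\omega) \le g^*(\omega) \quad \forall \omega \in \Omega^*
\end{array}
\right\}. \qquad (8)
\]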

We will need the following lemma:

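Lemma 4.1. For any $\varepsilon > 0$, there exist $\Omega' \subset \Omega^0$ and $v' \in \mathcal{F}_1$ such that the couple $(\Omega', v')$ satisfies

a) $v'(\omega) = g^*(\omega)$ for all $\omega \in \Omega^*$.

b) $v'(\omega) = v(\omega) - \varepsilon$ on $\Omega'$.

c) $v(\omega) \ge v'(\omega) > v(\omega) - \varepsilon$ on $\Omega^0 \setminus \Omega'$.

d) For any $\omega \in \Omega^0 \setminus \Omega'$, $\Phi(0, v')(\omega) > v'(\omega)$.

e) For any $\omega \in \Omega'$, $\Phi(0, v')(\omega) = v'(\omega)$.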

Proof. This was proved in [10], but we recall the proof for the sake of completeness.

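Let $E$ be the set of couples $(\Omega'', v'')$ such that $\Omega'' \subset \Omega^0$, $v'' \in \mathcal{F}_1$, and $(\Omega'', v'')$ satisfies properties a) to d). This set is nonempty since $(\Omega^0, v - \varepsilon \mathbf{1}_{\omega \in \Omega^0}) \in E$. Since $\Omega^0$ is finite, we can choose a couple $(\Omega', v')$ in $E$ such that there is no $(\Omega'', v'')$ in $E$ with $\Omega'' \subsetneq \Omega'$. Let $\widetilde{\Omega}$ be the set on which $\Phi(0, v')(\omega) = v'(\omega)$; we now prove that $\widetilde{\Omega} = \Omega'$, hence that $(\Omega', v')$ also satisfies property e).

By contradiction, assume that $\widetilde{\Omega} \subsetneq \Omega'$ and consider, for small $\alpha > 0$, $v_\alpha := v' + \alpha \mathbf{1}_{\omega \in \Omega' \setminus \widetilde{\Omega}}$. The couple $(\widetilde{\Omega}, v_\alpha)$ clearly satisfies properties a) to c) for $\alpha < \varepsilon$. It also satisfies property d) for $\alpha$ small enough by continuity of $\Phi(0, \cdot)$. So, for $\alpha$ small enough, the couple $(\widetilde{\Omega}, v_\alpha)$ is in $E$, contradicting the minimality of $\Omega'$.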

We can now prove Corollary 4.2:

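Proof of Corollary 4.2. Let $v \in \mathcal{S}_0$, let $\varepsilon > 0$ and define $(\Omega', v')$ as in Lemma 4.1. By properties a) to c), $\|v - v'\| \le \varepsilon$. If $\Omega' = \emptyset$, then property d) implies that $v' \in L^-$. If $\Omega'$ is nonempty, then, by properties b), c) and e), $\Omega' = \mathrm{Argmax}(v - v')$ and $\Phi(0, v')(\cdot) = v'(\cdot)$ on $\Omega'$. Hence, Proposition 4.1 yields that $v(\cdot) \le 0$ on $\Omega'$. So $v'(\cdot) \le 0$ on $\Omega'$ and $v' \in L^-$ as well. This implies that $v \in \overline{L^-}$. By duality, $v \in \overline{L^+}$.

Remark 4.1. This corollary implies in itself that $v_\lambda$ converges, as there is at most one element in the intersection, see [3] and Proposition 9 in [6].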

5 Games with Incomplete Information

We consider here two person zero-sum games with incomplete information (independent case and standard signalling). $\pi$ is a product probability $p \otimes q$ on a finite product space $K \times L$, with $p \in P = \Delta(K)$, $q \in Q = \Delta(L)$. $g$ is a payoff function from $I \times J \times K \times L$ to $\mathbb{R}$ where $I$ and $J$ are finite action sets. Given the parameter $(k, \ell)$ selected according to $\pi$, each player knows one component ($k$ for player 1, $\ell$ for player 2) and holds a prior on the other component. From stage 1 on, the parameter is fixed and the repeated game with payoff $g(\cdot, \cdot, k, \ell)$ is played. The moves of the players at stage $t$ are $\{i_t, j_t\}$, the payoff is $g_t = g(i_t, j_t, k, \ell)$ and the information of the players after stage $t$ is $\{i_t, j_t\}$. $X = \Delta(I)^K$ and $Y = \Delta(J)^L$ are the type-dependent mixed action sets of the players; $g$ is extended on $X \times Y \times P \times Q$ by $g(x, y, p, q) = \sum_{k, \ell} p^k q^\ell g(x^k, y^\ell, k, \ell)$.

Given $(x, y, p, q)$, let $x(i) = \sum_k x^k_i p^k$ be the probability of action $i$ and $p(i)$ be the conditional probability on $K$ given the action $i$, explicitly $p^k(i) = \frac{p^k x^k_i}{x(i)}$ (and, similarly, for $y$ and $q$).
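The posterior computation is plain Bayes updating and can be sketched as follows. The representation is an assumption made for illustration: `p` is a list indexed by the types in $K$, and `x[k][i]` is the probability that type $k$ plays action $i$. The function names are hypothetical.

```python
def action_distribution(p, x):
    """x(i) = sum over k of p[k] * x[k][i]: total probability of each action i."""
    n_actions = len(x[0])
    return [sum(p[k] * x[k][i] for k in range(len(p))) for i in range(n_actions)]


def posteriors(p, x):
    """Map each action i with x(i) > 0 to the posterior p(i) on K."""
    marg = action_distribution(p, x)
    return {
        i: [p[k] * x[k][i] / marg[i] for k in range(len(p))]
        for i in range(len(marg))
        if marg[i] > 0
    }


def is_non_revealing(p, x, tol=1e-12):
    """True when every posterior reached with positive probability equals p."""
    return all(
        abs(post[k] - p[k]) <= tol
        for post in posteriors(p, x).values()
        for k in range(len(p))
    )


prior = [0.5, 0.5]
revealing = [[1.0, 0.0], [0.0, 1.0]]   # each type plays its own action
pooling = [[0.3, 0.7], [0.3, 0.7]]     # both types play the same mixed action
partial = [[0.8, 0.2], [0.4, 0.6]]
```

With `pooling` the posteriors coincide with the prior, whereas `revealing` sends the posterior to a vertex of $\Delta(K)$. In every case the posteriors average back to the prior under the action distribution, which is the martingale property behind the use of Jensen's inequality in this section.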

While this framework is not a particular case of Section 2, since the set $P \times Q$ that will play the role of the state space is not finite, it is still possible to introduce a Shapley operator for this game. This operator is defined on the set $\mathcal{F}_2$ of continuous concave-convex functions on $P \times Q$ by:

\[
\Phi(\lambda, f)(p, q) := \min_{y \in Y} \max_{x \in X} \left\{ \lambda g(x, y, p, q) + (1-\lambda) \sum_{i, j} x(i)\, y(j)\, f(p(i), q(j)) \right\} \qquad (9)
\]

\[
\phantom{\Phi(\lambda, f)(p, q) :}= \max_{x \in X} \min_{y \in Y} \left\{ \lambda g(x, y, p, q) + (1-\lambda) \sum_{i, j} x(i)\, y(j)\, f(p(i), q(j)) \right\} \qquad (10)
\]

and the value $v_\lambda$ of the $\lambda$-discounted game is the unique fixed point of $\Phi(\lambda, \cdot)$ on $\mathcal{F}_2$. These relations are due to Aumann and Maschler (1966) [7] and Mertens and Zamir (1971) [8].

$X_\lambda(f)(p, q)$ denotes the set of optimal strategies of player 1 in $\Phi(\lambda, f)(p, q)$.

In this framework, any $f \in \mathcal{F}_2$ is a fixed point of the projective operator $\Phi(0, \cdot)$, that is $\mathcal{F}_2 = \mathcal{S}$.

Note that, if $C$ is a bound for the payoff function $g$, then any $v_\lambda$ is bounded by $C$ as well, and is moreover $C$-Lipschitz. The family $\{v_\lambda\}$ is thus relatively compact for the topology of uniform convergence, hence $\mathcal{S}_0$, the set of accumulation points of the family $\{v_\lambda\}$, is nonempty.

To ease notation, we will denote the sum $\sum_{i, j} x(i)\, y(j)\, f(p(i), q(j))$ by $E_{\rho(x,p) \times \rho'(y,q)} f(\tilde{p}, \tilde{q})$. Note that $\tilde{p}$ only depends on $x$ and $p$, and that $\tilde{q}$ only depends on $y$ and $q$.

For any $f \in \mathcal{F}_2$, Jensen’s inequality ensures that $E_{\rho(x,p)} f(\tilde{p}, q) \le f(p, q)$. The strategies of player 1 for which the equality holds for all $f \in \mathcal{F}_2$ are called non revealing. Their set is denoted $NR(p) := \{x \in X;\ \tilde{p} = p,\ \rho(x, p)\text{-a.s.}\}$. The set $NR(q)$ of non revealing strategies of player 2 is defined similarly.

Finally, the non-revealing value $u$ is

\[
u(p, q) := \min_{y \in NR(q)} \max_{x \in NR(p)} g(x, y, p, q) = \max_{x \in NR(p)} \min_{y \in NR(q)} g(x, y, p, q).
\]

The existence of $\lim v_\lambda$ was first proved in [7] for games with incomplete information on one side. It was then generalized in [8] for games with incomplete information on both sides, with a characterization of the limit $v$ being the only solution of the system

\[
v = \mathrm{Cav}_p \min(u, v), \qquad v = \mathrm{Vex}_q \max(u, v),
\]

where $\mathrm{Cav}(f)$ (resp. $\mathrm{Vex}(f)$) denotes the smallest concave function in the first variable which is larger than $f$ (resp., the largest convex function in the second variable which is smaller than $f$).
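To make the operator $\mathrm{Cav}$ concrete, the sketch below computes a discretized concave envelope in the simplest case, assuming $K$ has two elements so that $p$ is identified with a number $t \in [0,1]$, and assuming $f$ is only known on a finite increasing grid. It uses a standard upper convex hull scan, which is an illustration and not a method taken from the paper. The name `cav_on_grid` is hypothetical, and $\mathrm{Vex}$ can be obtained as $-\mathrm{Cav}(-f)$.

```python
def cav_on_grid(ts, fs):
    """Concave envelope of the points (ts[n], fs[n]), evaluated at each ts[n].

    ts must be strictly increasing and contain at least two points.
    """
    hull = []  # indices of the upper convex hull, left to right
    for n in range(len(ts)):
        # Pop the last hull point while it lies on or below the chord
        # joining its predecessor to the new point.
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (ts[b] - ts[a]) * (fs[n] - fs[a]) - (fs[b] - fs[a]) * (ts[n] - ts[a])
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(n)
    env, seg = [], 0
    for n in range(len(ts)):
        # Move to the hull segment whose right end is at or beyond ts[n].
        while ts[hull[seg + 1]] < ts[n]:
            seg += 1
        a, b = hull[seg], hull[seg + 1]
        w = (ts[n] - ts[a]) / (ts[b] - ts[a])
        env.append((1 - w) * fs[a] + w * fs[b])  # linear interpolation on the hull
    return env


grid = [0.0, 0.25, 0.5, 0.75, 1.0]
two_peaks = [0.0, 1.0, 0.0, 1.0, 0.0]
envelope = cav_on_grid(grid, two_peaks)
```

The envelope bridges the dip between the two peaks and leaves a function that is already concave unchanged.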

A shorter proof of this result (including characterization) was established in [9]. The tools used in the following proof are quite similar to the ones used in [9], but the structure differs.


Lemmas 2.1 and 2.2 still hold in this framework; we now prove a more precise version of Lemma 2.3 using the geometry of $P \times Q$. Let $C(P \times Q)$ be the set of real continuous functions on $P \times Q$.

Lemma 5.1. Let $v \in \mathcal{S}$ and let $f \in C(P \times Q)$ be concave with respect to the first variable. If $(p, q)$ is an extreme point of $\mathrm{Argmax}(v - f)$, then $X_0(v)(p, q) \subset NR(p)$.

Proof. Let $x \in X_0(v)(p, q)$ and $y \in NR(q)$; then

\[
v(p, q) \le E_{\rho(x,p) \times \rho'(y,q)} v(\tilde{p}, \tilde{q}) = E_{\rho(x,p)} v(\tilde{p}, q),
\]

while, by Jensen’s inequality,

\[
f(p, q) \ge E_{\rho(x,p)} f(\tilde{p}, q);
\]

so

\[
E_{\rho(x,p)} (v - f)(\tilde{p}, q) \ge (v - f)(p, q).
\]

Since $(p, q) \in \mathrm{Argmax}(v - f)$,

\[
E_{\rho(x,p)} (v - f)(\tilde{p}, q) = (v - f)(p, q),
\]

and $(\tilde{p}, q) \in \mathrm{Argmax}(v - f)$, $\rho(x, p)$-a.s. Since $(p, q)$ is an extreme point of $\mathrm{Argmax}(v - f)$, it follows that $\tilde{p} = p$, $\rho(x, p)$-a.s., and $x \in NR(p)$.

Remark that $v \in \mathcal{F}_2$ implies that $NR(p) \subset X_0(v)(p, q)$ since $v$ is a saddle function; hence, in fact, $NR(p) = X_0(v)(p, q)$ in the previous lemma.

Note the analogy between Lemma 5.1 and Lemma 3.1. Lemma 3.3 also has an analogue in this setup:

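Lemma 5.2. Let $x_\lambda \in X_\lambda(v_\lambda)(p, q)$ and $y \in NR(q)$; then

\[
v_\lambda(p, q) \le g(x_\lambda, y, p, q).
\]

Proof. By definition of $v_\lambda$ and $x_\lambda$,

\[
\begin{aligned}
v_\lambda(p, q) &\le \lambda g(x_\lambda, y, p, q) + (1-\lambda)\, E_{\rho(x_\lambda,p) \times \rho'(y,q)} v_\lambda(\tilde{p}, \tilde{q}) \\
&\le \lambda g(x_\lambda, y, p, q) + (1-\lambda)\, v_\lambda(p, q),
\end{aligned}
\]

using Jensen’s inequality and the fact that $y \in NR(q)$. Hence $v_\lambda(p, q) \le g(x_\lambda, y, p, q)$.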


Recall that $\mathcal{S}_0 \subset \mathcal{S}$ is the set of accumulation points of $\{v_\lambda\}$ for the uniform norm.

Proposition 5.1. Let $v \in \mathcal{S}_0$.

i) Let $f \in C(P \times Q)$ be concave with respect to the first variable. Then, at any extreme point $(p, q)$ of $\mathrm{Argmax}(v - f)$,

\[
v(p, q) \le u(p, q).
\]

ii) Let $f' \in C(P \times Q)$ be convex with respect to the second variable. Then, at any extreme point $(p, q)$ of $\mathrm{Argmin}(v - f')$,

\[
v(p, q) \ge u(p, q).
\]

Proof. We prove i). Apply Lemma 5.2 to any sequence $\{v_{\lambda_n}\}$ converging to $v$. By Lemma 2.2, there exists $x \in X_0(v)(p, q)$ such that

\[
v(p, q) \le \inf_{y \in NR(q)} g(x, y, p, q).
\]

Lemma 5.1 implies that $x \in NR(p)$ (since $v \in \mathcal{S}$), and the result follows by definition of $u$.

ii) is established in a dual way.

Proposition 5.1 implies the following corollaries of existence and characterization of $\lim v_\lambda$:

Corollary 5.1. $v_\lambda$ converges uniformly as $\lambda$ tends to 0.

Proof. Let $v$ and $v'$ be in $\mathcal{S}_0$ and let $(p, q)$ be any extreme point of $\mathrm{Argmax}(v - v')$. Since $v'$ is concave in its first variable, Proposition 5.1 i) with $f = v'$ implies that $v(p, q) \le u(p, q)$. Apply now Proposition 5.1 ii) to $f' = v$ to get $v'(p, q) \ge u(p, q)$. This yields $v(p, q) \le v'(p, q)$, hence $v \le v'$, thus uniqueness.

Corollary 5.2. Any accumulation point $v$ of $v_\lambda$ satisfies the Mertens-Zamir system:

\[
v = \mathrm{Cav}_p \min(u, v), \qquad v = \mathrm{Vex}_q \max(u, v).
\]

Proof. Let $v$ be an accumulation point of the family $\{v_\lambda\}$. We only prove that $v \le \mathrm{Cav}_p \min(u, v)$. Since $v$ is concave in $p$, the other inequality is trivial, and a dual argument gives the dual equality.

Denote $f = \mathrm{Cav}_p \min(u, v)$, and let $(p, q)$ be any extreme point of $\mathrm{Argmax}(v - f)$. Since $f$ is concave in $p$, Proposition 5.1 implies that $v(p, q) \le u(p, q)$. Hence,

\[
v(p, q) \le \min(u, v)(p, q) \le f(p, q),
\]

so that the maximum of $v - f$ is nonpositive, that is, $v \le f$.
