Some Results on the Existence of Nash Equilibria for Non-Zero Sum Games with Incomplete Information 1)
By S. Sorin, Paris 2)
Abstract: We prove the existence of Nash equilibria for two person non-zero sum repeated games with lack of information on one side and two states of nature.
1. Introduction
For repeated games with complete information, the set of Nash Equilibrium Payoffs (NEP) coincides with the set of the feasible individually rational payoffs. In particular, this set is not empty. For repeated games with incomplete information, the situation is the following:
- If there is a lack of information on both sides, the set of NEP may be empty [this was proved, even for zero-sum games, by Aumann/Maschler, 1967; Stearns].
- However, for zero-sum repeated games with lack of information on one side, Aumann/Maschler [1966] proved the existence of a value, hence of a NEP.
We prove here that the set of NEP is also non-empty for non-zero sum games with lack of information on one side, with only two states of nature. Moreover, these NEP can be achieved by a special kind of strategy, called "joint plans", which were introduced by Aumann/Maschler/Stearns [1968].
2. Description of the Game
Let $K$ be a finite set and $P$ the simplex of probabilities on $K$. We are given two families of $|I| \times |J|$ matrices: $(A^k)$, $(B^k)$, $k \in K$, corresponding to the payoffs of players 1 and 2.
For each $p$ in $P$, $G_\infty(p)$ is the game with infinitely many stages defined as follows:
- at stage 0, k is chosen according to p and is told to player 1;
- at stage $m = 1, 2, \dots$, knowing the previous $(m-1)$-stage history, namely $h_{m-1} = (i_1, j_1, \dots, i_{m-1}, j_{m-1})$, player 1 (resp. player 2) chooses $i_m$ in $I$ (resp. $j_m$ in $J$) and $(i_m, j_m)$ is told to both players.

1) A first version of this paper was written in June 1980 at the Institute for Advanced Studies (Jerusalem). I am greatly indebted to S. Zamir for very stimulating discussions on this subject.
2) Present address: Laboratoire d'Econométrie, Université Paris 6, 4 Place Jussieu, F-75230 Paris.
0020-7276/83/040193-20$02.50 © 1983 Physica-Verlag, Vienna.

All the previous description is common knowledge.
Given an $n$-stage history we define the payoffs
$$g_n^k(1) = \frac{1}{n} \sum_{m=1}^{n} a^k_{i_m j_m} \quad \text{and} \quad g_n^k(2) = \frac{1}{n} \sum_{m=1}^{n} b^k_{i_m j_m}.$$
Let us denote by $H_m$ the set of $m$-stage histories and by $H_m^1$ the set of $m$-stage 1-histories, namely the set of sequences of $m$ moves of player 1. (Define $H_0$ and $H_0^1$ to be reduced to one point.) Let $I^*$ (resp. $J^*$) be the set of probabilities on $I$ (resp. $J$).
A strategy $\sigma$ of player 1 in $G_\infty$ is a $|K|$-tuple $(\sigma^k)$ where for each $k$ in $K$, $\sigma^k$ is a function from $H = \bigcup_{m \ge 0} H_m$ into $I^*$. A strategy $\tau$ of player 2 in $G_\infty$ is a function from $\bigcup_{m \ge 0} H_m$ into $J^*$.
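Now given $\sigma$ and $\tau$ we introduce in $G_\infty(p)$
$$\gamma_n^k(1)(\sigma, \tau) = E_{\sigma^k, \tau}\bigl(g_n^k(1)\bigr),$$
$$\gamma_n(1)(\sigma, \tau) = E_{p, \sigma, \tau}\bigl(g_n^k(1)\bigr),$$
$$\gamma_n(2)(\sigma, \tau) = E_{p, \sigma, \tau}\bigl(g_n^k(2)\bigr).$$
The last two are the expected payoffs of each player up to stage $n$ given $\sigma$ and $\tau$.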
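Definition: $(\sigma^*, \tau^*)$ is a Nash equilibrium in $G_\infty(p)$ if:
(i) $\gamma_n(i)(\sigma^*, \tau^*)$ converges as $n$ goes to infinity to some $\gamma(i)(\sigma^*, \tau^*)$, $i = 1, 2$.
(ii) For each $\sigma$ and each $\tau$,
$$\limsup_{n \to \infty} \gamma_n(1)(\sigma, \tau^*) \le \gamma(1)(\sigma^*, \tau^*),$$
$$\limsup_{n \to \infty} \gamma_n(2)(\sigma^*, \tau) \le \gamma(2)(\sigma^*, \tau^*).$$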
Remark: Since we want to prove the existence of NEP we choose a "strong" definition of Nash equilibria.
3. Joint Plan Equilibrium
Definition: [Aumann/Maschler/Stearns].
A joint plan is a triple $(S, x, \gamma)$ where:
- $S$ is the set of "signals", i.e. a subset of $H_n^1$, for some $n$.
- $x$ (signalling strategy) is a $|K|$-tuple where for each $k$ in $K$, $x^k$ is a probability on $S$ such that for each $s$ in $S$, $x(s) = \sum_k p^k x^k(s)$ is strictly positive. (Otherwise we can drop this element from the set $S$.)
- $\gamma$ (contract) is a $|S|$-tuple where for each $s$ in $S$, $\gamma^s$ is a correlated strategy, i.e. a probability on $I \times J$ (the entries of the payoff matrices).
Given a joint plan we introduce the following:
- For each $s$ in $S$, $p(s)$ (conditional probability given $s$) is a probability on $K$ defined by $p^k(s) = p^k x^k(s) / x(s)$ for all $k$ in $K$.
- For each $s$ in $S$ and $k$ in $K$
$$\alpha^k(s) = \sum_{i,j} a^k_{ij} \gamma^s_{ij}, \qquad \beta^k(s) = \sum_{i,j} b^k_{ij} \gamma^s_{ij} \quad \text{(joint plan vector payoffs)}$$
$$\beta(s) = \sum_k p^k(s) \beta^k(s),$$
$$\bar\alpha^k = \max_{t \in S} \alpha^k(t), \qquad \bar\alpha = (\bar\alpha^k)_{k \in K}.$$
A joint plan is non revealing (NR) if $p(s) = p$ for all $s$ in $S$.
Lemma 1: (Type dependent lottery) [Aumann/Maschler, 1966].
Assume $|I| \ge 2$. Then, for each finite set of distinct probabilities on $K$, $p^*(t)$, $t \in T$, satisfying:
$$\sum_t \lambda(t) = 1, \qquad \sum_t \lambda(t) p^*(t) = p, \qquad 0 < \lambda(t) \le 1 \quad \forall t \in T,$$
there exists a set of signals $S$, a bijection $f$ from $S$ to $T$ and a signalling strategy $x$ such that:
$$x(s) = \lambda(f(s)), \qquad p(s) = p^*(f(s)) \quad \forall s \in S.$$
Proof: Choose some set $S$ such that $|T| = |S|$ (possible since $|I| \ge 2$, by taking $n$ large enough that $|I|^n \ge |T|$), a bijection $f$ from $S$ to $T$ and define $x^k(s) = \lambda(f(s)) \, p^{*k}(f(s)) / p^k$, for each $k$ in $K$ such that $p^k > 0$. □
Remark: Then if $|I| \ge 2$ a joint plan can generate any finite set of probabilities containing $p$ in their convex hull.
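The construction in the proof of Lemma 1 can be checked numerically. The following is a minimal sketch, assuming two states with a strictly positive prior; the function names `signalling_strategy` and `induced` and the numerical values are illustrative assumptions, not taken from the paper. It builds $x^k(s) = \lambda(s) p^{*k}(s) / p^k$ and recomputes the total signal probabilities and the posteriors.

```python
from fractions import Fraction as F

def signalling_strategy(p, posteriors, weights):
    """x[k][s] = weights[s] * posteriors[s][k] / p[k], as in the proof of Lemma 1.

    Assumes every p[k] is strictly positive (hypothetical helper)."""
    states = range(len(p))
    signals = range(len(weights))
    return [[weights[s] * posteriors[s][k] / p[k] for s in signals] for k in states]

def induced(p, x):
    """Return (x(s), p(s)) for every signal s: total probability and posterior."""
    states = range(len(p))
    signals = range(len(x[0]))
    total = [sum(p[k] * x[k][s] for k in states) for s in signals]
    post = [[p[k] * x[k][s] / total[s] for k in states] for s in signals]
    return total, post

# Illustrative data: the prior (1/2, 1/2) is the average of the two target posteriors.
p = [F(1, 2), F(1, 2)]
posteriors = [[F(1, 4), F(3, 4)], [F(3, 4), F(1, 4)]]
weights = [F(1, 2), F(1, 2)]

x = signalling_strategy(p, posteriors, weights)
total, post = induced(p, x)
```

Exact rational arithmetic is used so that the identities $x(s) = \lambda(f(s))$ and $p(s) = p^*(f(s))$ can be compared with equality rather than up to rounding.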
Lemma 2: (frequency strategy) [Aumann/Maschler/Stearns].
Given any correlated strategy $\gamma$ there exists $h_\infty$ in $H_\infty$ such that for all $i$ in $I$, all $j$ in $J$
$$\frac{1}{n} \bigl| \{ m \mid 1 \le m \le n, \ (i_m, j_m) = (i, j) \} \bigr| \xrightarrow[n \to \infty]{} \gamma_{ij}.$$
Proof: Let us denote by $1, \dots, t, \dots, |T|$ the entries $(i, j)$ in the support of $\gamma$ and by $(C_t)$, $t \in T$, the corresponding weights $(\gamma_{ij})$. Define inductively an $\mathbb{N} \times |T|$ matrix $D$ with elements $d_{n,t}$ as follows:
$$d_{1,1} = 1,$$
$$d_{n,1} = \begin{cases} 0 & \text{if } \dfrac{1}{n} \sum_{m=1}^{n-1} d_{m,1} \ge C_1, \\ 1 & \text{otherwise,} \end{cases}$$
and for $t > 1$
$$d_{n,t} = \begin{cases} 0 & \text{if } \dfrac{1}{n} \sum_{m=1}^{n-1} d_{m,t} \ge \sum_{h=1}^{t} C_h \text{ and } d_{n,t-1} = 0, \\ 1 & \text{otherwise.} \end{cases}$$
Given this matrix the history is constructed in the following way: For each stage $n$, $n = 1, 2, \dots$, let $t(n) = \min \{ t \mid d_{n,t} = 1 \}$. (Note that $d_{n,|T|} = 1$ for all $n$.) Then player 1 plays $i$ and player 2 plays $j$ such that $(i, j)$ is the entry $t(n)$. It is easy to see that the proportion of $n$ such that $d_{n,t} = 1$ converges to $\sum_{h=1}^{t} C_h$, hence that the proportion of $n$ such that $t(n) = t$ converges to $\sum_{h=1}^{t} C_h - \sum_{h=1}^{t-1} C_h = C_t$. □
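The inductive rule in the proof of Lemma 2 is easy to simulate. Below is a minimal sketch, assuming the running averages are taken over the stages before $n$ and that the weights sum to one; the names `frequency_history` and `empirical_frequencies` and the weights used are hypothetical. Entries are indexed from 0 in the code.

```python
def frequency_history(weights, n_stages):
    """Pure history t(1), ..., t(n_stages) built from the 0/1 matrix D of Lemma 2.

    weights[t] is the target frequency C_t of support entry t (0-based here)."""
    size = len(weights)
    cumulative, running = [], 0.0
    for c in weights:
        running += c
        cumulative.append(running)
    counts = [0] * size  # counts[t] = number of earlier stages m with d[m][t] == 1
    history = []
    for n in range(1, n_stages + 1):
        row = []
        for t in range(size):
            previous_is_zero = (t == 0) or (row[t - 1] == 0)
            if counts[t] / n >= cumulative[t] and previous_is_zero:
                row.append(0)
            else:
                row.append(1)
        row[-1] = 1  # last column is always 1; forced here to guard against rounding
        for t in range(size):
            counts[t] += row[t]
        history.append(row.index(1))  # t(n) = smallest t with d[n][t] == 1
    return history

def empirical_frequencies(history, size):
    """Proportion of stages at which each entry was played."""
    n = len(history)
    return [history.count(t) / n for t in range(size)]

weights = [0.5, 0.3, 0.2]  # illustrative correlated strategy on three entries
history = frequency_history(weights, 10000)
frequencies = empirical_frequencies(history, len(weights))
```

The resulting history is deterministic, which is what the equilibrium construction later relies on: any departure from it is immediately observable by the other player.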
Let us now define $A(p) = \sum_k p^k A^k$ and $B(p) = \sum_k p^k B^k$ and let $a(p)$ (resp. $b(p)$) be the value of $A(p)$ (resp. $B(p)$) viewed as a 0-sum two-person game with Player 1 (resp. 2) as the maximizer. (Note that $a(\cdot)$ and $b(\cdot)$ are continuous on $P$.)
For each mapping $f$ from $P$ to $\mathbb{R}$, Cav $f$ (resp. Vex $f$) denotes the smallest concave (resp. greatest convex) function from $P$ to $\mathbb{R}$ above $f$ (resp. below $f$).
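When $|K| = 2$, $P$ can be identified with $[0, 1]$ through $q = p^1$, and Cav $f$ can be approximated as the upper concave envelope of sampled values. The sketch below is only an illustration under that assumption: `cav_on_grid` is a hypothetical helper, and the result is exact only when the kinks of the envelope lie on the grid. The sample function is the value $a(q) = \max(-(1-q), -q)$ of a game in which player 1 picks one of two rows, with $A^1 = (0, -1)^T$ and $A^2 = (-1, 0)^T$ (values chosen for illustration).

```python
from fractions import Fraction

def cav_on_grid(xs, ys):
    """Upper concave envelope of the points (xs[i], ys[i]), evaluated at each xs[i].

    xs must be strictly increasing and contain at least two points."""
    hull = []  # indices of the vertices of the upper hull, left to right
    for i in range(len(xs)):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            # b is dropped when it lies on or below the chord from a to i
            cross = (xs[b] - xs[a]) * (ys[i] - ys[a]) - (ys[b] - ys[a]) * (xs[i] - xs[a])
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(i)
    envelope = []
    seg = 0
    for i in range(len(xs)):
        while hull[seg + 1] < i:
            seg += 1
        a, b = hull[seg], hull[seg + 1]
        w = (xs[i] - xs[a]) / (xs[b] - xs[a])  # linear interpolation weight
        envelope.append(ys[a] + w * (ys[b] - ys[a]))
    return envelope

grid = [Fraction(k, 10) for k in range(11)]
a_vals = [max(-(1 - q), -q) for q in grid]  # convex, zero at both extreme points
cav_a = cav_on_grid(grid, a_vals)
```

Here $a$ is strictly below its concavification on the open interval, so the set $P^* = \{q \mid a(q) < \text{Cav } a(q)\}$ used later is the whole interior of $P$ in this illustration.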
Definition: A set $Q$ in $\mathbb{R}^K$ is approachable by player 2 in $G_\infty$ with $\tau$ if for each $\varepsilon > 0$, there exists $N$ such that for any function $\sigma$ from $\bigcup_{m \ge 0} H_m$ into $I^*$, $n \ge N$ implies
$$E_{\sigma, \tau}\bigl(d(Q, g_n(1))\bigr) < \varepsilon$$
where $d$ is the Euclidean distance in $\mathbb{R}^K$ and $g_n(1)$ is the $|K|$-vector $(g_n^k(1))$, $k \in K$.
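Lemma 3:
(i) For each $p$ in $P$ player 1 has a strategy $\bar\sigma(p) = \bar\sigma$ in $G_\infty(p)$ such that for all $\tau$
$$\limsup_{n \to \infty} \gamma_n(2)(\bar\sigma, \tau) \le \text{Vex } b(p).$$
(ii) For each $\alpha$ in $\mathbb{R}^K$ such that
$$\alpha \cdot p \ge a(p) \quad \forall p \in P$$
the set $Q(\alpha) = \{ y \in \mathbb{R}^K \mid y^k \le \alpha^k, \ \forall k \in K \}$ is approachable by player 2 with some $\bar\tau(\alpha)$.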
Proof: These properties follow from a general result concerning zero-sum repeated games [Aumann/Maschler, 1966; see also Sorin, § 15-19] applied here for the zero-sum games with matrices $(A^k)$ and $(B^k)$. □
Proposition 1: A sufficient condition for a joint plan to yield a Nash equilibrium payoff is:
(i) $\beta(s) \ge \text{Vex } b(p(s))$ $\forall s \in S$ (individual rationality for player 2)
(ii) $\forall k \in K$, $\forall s \in S$, $p^k x^k(s) > 0 \Rightarrow \alpha^k(s) = \bar\alpha^k$ (no-cheating condition)
(iii) $p \cdot \bar\alpha \ge a(p)$ $\forall p \in P$ (individual rationality for player 1).
Proof: Let $(S, x, \gamma)$ be a joint plan for which the payoff satisfies (i), (ii), (iii). We shall construct $\sigma^*$, $\tau^*$, Nash equilibrium strategies leading to the same payoff.
Assume $S \subset H_n^1$. Choose a family $(\bar\sigma_s)$, $s$ in $S$, and $\bar\tau(\bar\alpha)$ satisfying Lemma 3 for the family $(p(s))$, $s$ in $S$, and $\bar\alpha$, respectively. For each $s$ in $S$, let $h^s_\infty = (i^s_1, j^s_1, i^s_2, j^s_2, \dots)$ be an infinite history satisfying Lemma 2 for $\gamma^s$.
In order to define the strategy of player 1, we proceed in two steps. First, for each $s$ in $S$, we introduce $\sigma^*_s$ given as follows:
- the $n$-tuple $(\sigma^*_s(h_m); m = 0, \dots, n-1)$ is equal to $s$ for each $n$-tuple $(h_m; m = 0, \dots, n-1)$.
- for $m = 0, 1, \dots$, let $t(h_{n+m}) = \min \bigl( \{ l; 1 \le l \le m, \ j_{n+l} \ne j^s_l \}, +\infty \bigr)$ and define:
$$\sigma^*_s(h_{n+m}) = \begin{cases} i^s_{m+1} & \text{if } t(h_{n+m}) = +\infty, \\ \bar\sigma_s(h_{n+m} - h_{n+m_0}) & \text{if } t(h_{n+m}) = m_0, \end{cases}$$
where $h_{n+m} - h_{n+m_0}$ is the history $(i_{n+m_0+1}, j_{n+m_0+1}, \dots, i_{n+m}, j_{n+m})$.
Now $\sigma^{*k}$ is a mixture of the strategies $\sigma^*_s$ according to the probability $x^k$ on $S$.
The strategy of player 2 is defined as follows:
- $\tau^*(h_m)$ is arbitrary for $0 \le m < n$.
- For $m = 0, 1, \dots$, define
$$u(h_{n+m}) = \begin{cases} 0 & \text{if } (i_1, \dots, i_n) \notin S, \\ \min \bigl( \{ l; 1 \le l \le m, \ i^s_l \ne i_{n+l} \}, +\infty \bigr) & \text{if } (i_1, \dots, i_n) = s, \end{cases}$$
and let
$$\tau^*(h_{n+m}) = \begin{cases} j^s_{m+1} & \text{if } u(h_{n+m}) = +\infty \text{ and } (i_1, \dots, i_n) = s, \\ \bar\tau(\bar\alpha)(h_{n+m} - h_{n+m_0}) & \text{if } u(h_{n+m}) = m_0. \end{cases}$$
In words, player 1 uses a mixture of strategies of the following type: play according to some signal $s$ up to stage $n$ and then follow the history $h^s_\infty$ as long as player 2 is also following it. If player 2 deviates, player 1 uses $\bar\sigma_s$ to punish him. As for player 2, after observing some signal $s$, he plays according to $h^s_\infty$ as long as player 1 is doing the same and punishes him by using $\bar\tau(\bar\alpha)$ if $s \notin S$ or if player 1 deviates from $h^s_\infty$.
It is clear that if $(\sigma^*_s, \tau^*)$ is played the signal will be $s$, the history after $s$ will be $h^s_\infty$, hence the limit of the vector payoff will be $(\alpha^k(s), \beta^k(s))$ (using Lemma 2) and $\gamma_n(i)(\sigma^*, \tau^*)$ will converge as $n$ goes to infinity, $i = 1, 2$.
Now assume that player 1 uses $\sigma^*$ and let $\tau$ be a strategy of player 2. If $\tau$ differs from $\tau^*$ only before stage $n$ the corresponding payoff is the same. Now after stage $m$, $m \ge n$, given $\sigma^*$ and the signal $s$ player 2 can compute the posterior probability, which is constant and equal to $p(s)$ from stage $n$ on, and the game is actually equivalent to the game $G_\infty(p(s))$ starting at stage $n + 1$. Since $\tau^*$ specifies a pure move at each stage $m \ge n + 1$, any deviation from $\tau^*$ at such a stage $m$ will be detected by player 1 who will play $\bar\sigma_s$ from stage $m + 1$ on.
It follows then from Lemma 3 and condition (i) that conditionally on $s$
$$\limsup_{n \to \infty} \gamma_n(2)(\sigma^*, \tau) \le \text{Vex } b(p(s)) \le \beta(s) = \lim_{n \to \infty} \gamma_n(2)(\sigma^*, \tau^*),$$
hence player 2 has no profitable deviation.
Assume now that player 2 uses $\tau^*$ and let $\sigma$ be a strategy of player 1 in $G_\infty(p)$.
If player 1 makes an observable deviation, namely if he is not using a mixture of the strategies $(\sigma^*_s)$, $s \in S$, $\tau^*$ will coincide with $\bar\tau(\bar\alpha)$ from some stage on and it is easy to see that $Q(\bar\alpha)$ will be approachable by $\tau^*$. This implies that for all $k$
$$\limsup_{n \to \infty} \gamma_n^k(1)(\sigma, \tau^*) \le \bar\alpha^k = \lim_{n \to \infty} \gamma_n^k(1)(\sigma^*, \tau^*).$$
Finally if player 1 uses a mixture on $S$ which differs from $x$ (say $y$) he cannot benefit because, conditionally on $s$,
$$\lim_{n \to \infty} \gamma_n^k(1)(\sigma, \tau^*) = \alpha^k(s)$$
and hence
$$\limsup_{n \to \infty} \gamma_n^k(1)(\sigma, \tau^*) = \sum_s y^k(s) \alpha^k(s) \le \bar\alpha^k \quad \text{(by condition (ii)).} \quad □$$
Remark 1: Aumann/Maschler/Stearns [1968] used the same kind of strategies to prove Proposition 1 with conditions (i), (ii) and (iv) with
(iv) $p \cdot \alpha(s) \ge a(p)$ $\forall p \in P$ and $\forall s \in S$.
Obviously, (iv) implies (iii) but there are games where no joint plan satisfies (i), (ii), (iv).
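Example
$$A^1 = \begin{pmatrix} 0 \\ -1 \end{pmatrix}, \qquad A^2 = \begin{pmatrix} -1 \\ 0 \end{pmatrix}.$$
Condition (iv) implies $\alpha^1(s) \ge 0$, $\alpha^2(s) \ge 0$ for all $s$ in $S$ and there is no $\gamma^s$ satisfying this condition.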
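The claim in the example can be checked by a direct search. The following is a rough sketch, assuming the matrices are the two column vectors $A^1 = (0, -1)^T$ and $A^2 = (-1, 0)^T$, so that a correlated strategy reduces to a probability $(\gamma_1, \gamma_2)$ on player 1's two rows; the helper name `vector_payoff` and the grid resolution are hypothetical choices.

```python
from fractions import Fraction as F

A1 = [F(0), F(-1)]  # payoffs of player 1 in state 1, rows 1 and 2 (single column)
A2 = [F(-1), F(0)]  # payoffs of player 1 in state 2

def vector_payoff(gamma):
    """Vector payoff (alpha^1, alpha^2) of a correlated strategy gamma on the rows."""
    alpha1 = sum(g * a for g, a in zip(gamma, A1))
    alpha2 = sum(g * a for g, a in zip(gamma, A2))
    return alpha1, alpha2

# Scan the probabilities (g, 1 - g) on a grid and keep the best worst-case coordinate.
grid = [(F(k, 100), 1 - F(k, 100)) for k in range(101)]
best = max(min(vector_payoff(gamma)) for gamma in grid)
```

Since $\alpha^1 = -\gamma_2$ and $\alpha^2 = -\gamma_1$, the smaller coordinate is $-\max(\gamma_1, \gamma_2)$, which is at most $-1/2$; the grid search reproduces this bound, so both coordinates can never be non-negative at once.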
Remark 2: It is easy to see that the above proof can be slightly modified in order to imply Proposition 1 with a stronger definition of Nash Equilibrium, namely (i) and
(iii) $\forall \varepsilon > 0$, $\exists N$ such that for each $\sigma$, $\tau$ and $n \ge N$
$$\gamma_n(1)(\sigma, \tau^*) \le \gamma(1)(\sigma^*, \tau^*) + \varepsilon, \qquad \gamma_n(2)(\sigma^*, \tau) \le \gamma(2)(\sigma^*, \tau^*) + \varepsilon.$$
4. Preliminary Results
Let us denote by $\sigma$ (resp. $\tau$) the elements of $I^*$ (resp. $J^*$). For each $p$ in $P$ we define $T(p) = \{ \tau \in J^* \mid \tau \text{ is optimal for player 2 in } B(p) \}$.
A correlated strategy $\gamma$ is independent and 2-safe ($i - 2$) at $p$ if there exist $\sigma$ in $I^*$ and $\tau$ in $T(p)$ such that $\gamma_{ij} = \sigma_i \tau_j$ for all $(i, j)$ (denoted by $\gamma = \sigma \otimes \tau$).
A joint plan is ($i - 2$) at $p$ if for all $s$ in $S$, $\gamma^s$ is ($i - 2$) at $p(s)$.
Definition: $G_\infty$ has a *Nash Equilibrium at $p$ if there exists a NR joint plan ($i - 2$) at $p$ such that
$$\sigma A(q) \tau = \alpha \cdot q \ge a(q) \quad \forall q \in P. \qquad (1)$$
Let
$$X = \{ p \in P \mid G_\infty \text{ has a *N.E. at } p \}.$$
Remarks: It follows from the definition and Proposition 1 that if $G_\infty$ has a *N.E. at $p$ then $G_\infty(p)$ has a N.E. (Condition (i) is satisfied since $\tau \in T(p)$ implies $\sigma B(p) \tau = \beta \ge b(p) \ge \text{Vex } b(p)$; condition (ii) is void since the joint plan is N.R., and condition (iii) is implied by (1).)
On the other hand, $X$ may be empty (see the example above).
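We now have to introduce some more notation:
$$\forall \tau \in J^*, \ \forall q \in P, \quad f_\tau(q) = \max_{\sigma \in I^*} \sigma A(q) \tau.$$
$$C(\tau) = \{ (q, t); \ q \in P, \ t \in \mathbb{R}, \ t \ge f_\tau(q) \}$$
$$D = \{ (q, t); \ q \in P, \ t \in \mathbb{R}, \ t \le \text{Cav } a(q) \}$$
$$D_1 = \{ (q, t); \ q \in P, \ t \in \mathbb{R}, \ t \le a(q) \}.$$
(Note that all these sets are closed in $\mathbb{R}^{|K|+1}$ and that $C(\tau)$ and $D$ are convex sets.) Finally let
$$\bar T = \{ \tau \in J^* \mid [D \cap C(\tau)]^\circ = \emptyset \} = \{ \tau \in J^* \mid f_\tau(q) \ge \text{Cav } a(q) \text{ on } P \}$$
(where $Q^\circ$ denotes the interior of $Q$), and note that $\bar T$ is closed.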
Lemma 4: $p \in X \iff T(p) \cap \bar T \ne \emptyset$.
Proof: If $p \in X$, let $\sigma$ and $\tau$ satisfy (1). Then for each $q$ in $P$
$$f_\tau(q) \ge \sigma A(q) \tau = \alpha \cdot q \ge \text{Cav } a(q).$$
Reciprocally, choose $\tau \in T(p) \cap \bar T$, hence $[D \cap C(\tau)]^\circ = \emptyset$. These convex sets can now be weakly separated, hence let $l \in \mathbb{R}^K$ be such that
$$f_\tau(q) \ge l \cdot q \ge \text{Cav } a(q) \quad \forall q \in P.$$
Then for each $q$ in $P$ there exists some $\sigma$ in $I^*$ satisfying $\sigma A(q) \tau - l \cdot q \ge 0$.
The minimax theorem then implies the existence of $\sigma^*$ in $I^*$ such that $\sigma^* A(q) \tau - l \cdot q \ge 0$ $\forall q \in P$.
It follows that $\gamma = \sigma^* \otimes \tau$ is a N.R. joint plan ($i - 2$) at $p$ satisfying (1), hence $p \in X$. □
Lemma 5: $[D_1 \cap C(\tau)]^\circ = \emptyset$ $\forall \tau \in J^*$.
Proof: Since $a(\cdot)$ is the value of $A(\cdot)$, $f_\tau(q) = \max_\sigma \sigma A(q) \tau$ is always greater than or equal to $a(q)$, hence the result. □
Corollary 1: If $a(\cdot)$ is concave on $P$ then $X = P$.
Proof: In this case $D = D_1$ and Lemmas 4 and 5 give the result. □
Corollary 2: $X$ is closed.
Proof: Since the correspondence $T(\cdot)$ is u.h.c. and $\bar T$ is closed, the assertion follows from Lemma 4. □
Corollary 3: $F(\tau) = \{ q \in P \mid f_\tau(q) < \text{Cav } a(q) \}$ is convex and included in $P^* = \{ q \in P \mid a(q) < \text{Cav } a(q) \}$.
Proof: $f_\tau$ is convex and Lemma 5 implies the second statement. □
5. The Case $|K| = 2$
Theorem: For each $p \in P$, $G_\infty(p)$ has a Nash Equilibrium.
The theorem will follow from the next and more precise proposition.
Proposition 2: For each $p \in P$, there exists a joint plan ($i - 2$) at $p$ satisfying Proposition 1.
The proposition is true if $p$ is an extreme point of $P$ and also if $|I| = 1$ (since in this case $a(\cdot)$ is concave).
Assume then $|I| \ge 2$ and let us start with a specific couple $(p_0, \tau_0)$ with $p_0 \in P^\circ$, $p_0 \notin X$, $\tau_0 \in T(p_0)$.
Note that $P^*$ is a collection (finite since $a(\cdot)$ is algebraic) of open intervals on each of which Cav $a(\cdot)$ is linear. Hence by Corollary 3 $F(\tau_0)$ is included in some non-empty interval $(q_1, q_2)$ such that:
$$a(q_i) = \text{Cav } a(q_i), \quad i = 1, 2,$$
$$a(q) < \text{Cav } a(q) \text{ on } (q_1, q_2),$$
and there exists $l = (l^1, l^2)$ in $\mathbb{R}^2$ with
$$\text{Cav } a(q) = l \cdot q \text{ on } [q_1, q_2].$$
Let us now define, for each $\tau$ in $J^*$
$$R(\tau) = \Bigl\{ p \in P \ \Big| \ f_\tau(p) - l \cdot p = \min_{q \in P} \bigl( f_\tau(q) - l \cdot q \bigr) = z(\tau) \Bigr\}.$$
Lemma 6:
(i) The correspondence $R(\cdot)$ is convex valued and u.h.c.
(ii) The function $z(\cdot)$ is continuous.
(iii) $z(\tau) \ge 0$ implies $\tau \in \bar T$.
Proof: $(\tau, p) \mapsto f_\tau(p)$ is continuous, hence the graph of $R$ is closed so (i) and (ii) follow. As for (iii), $z(\tau) \ge 0$ gives $f_\tau(p) \ge l \cdot p$ on $P$ but $l$ is a supporting line at Cav $a$, hence $\tau \in \bar T$. □
Let Fr($P$) be the extreme points of $P$ (i.e. $(1, 0)$ and $(0, 1)$) and $\bar X = X \cup \text{Fr}(P)$.
Since $\bar X$ is closed (Corollary 2) we can define $p_1$ and $p_2$ in $P$ by
$$p_1^1 = \min \{ q^1; \ p_0^1 < q^1 \le 1, \ q \in \bar X \}$$
$$p_2^1 = \max \{ q^1; \ 0 \le q^1 < p_0^1, \ q \in \bar X \}$$
hence $p_0 \in (p_1, p_2)$.
We first need a general lemma:
Lemma 7: Let $\varphi$ be a correspondence from $U$ to $V$ such that
(i) $U$ is connected;
(ii) $\varphi$ is u.h.c. and $\varphi(u)$ is connected for each $u \in U$;
then $\varphi(U) = \{ v \mid v \in \varphi(u) \text{ for some } u \in U \}$ is connected.
Proof: Assume the contrary and let $V_1$, $V_2$ be a partition of $\varphi(U)$ into two disjoint non-void open sets. By (ii) for each $u$ we have $\varphi(u) \subset V_1$ or $\varphi(u) \subset V_2$. But now $U_i = \{ u \in U \mid \varphi(u) \subset V_i \}$, $i = 1, 2$, is non-void and open since $\varphi$ is u.h.c. Hence $U_1$, $U_2$ is a partition of $U$ into two disjoint non-void open sets, which contradicts (i). □
Lemma 8:
(i) $z(\tau) < 0$ $\forall \tau \in T((p_1, p_2))$
(ii) $R(T((p_1, p_2))) \subset (q_1, q_2)$
(iii) For each $\tau \in J^*$ such that $R(\tau) \subset (q_1, q_2)$ there exists $\sigma \in I^*$ with
$$\sigma A(p) \tau - l \cdot p = z(\tau) \text{ on } P. \qquad (2)$$
Proof: By definition of $p_1$ and $p_2$, $(p_1, p_2) \cap X = \emptyset$, hence $T((p_1, p_2)) \cap \bar T = \emptyset$ and (i) follows from Lemma 6. By Corollary 3 this implies that $q_i \notin R(T((p_1, p_2)))$. But we know that $R(\tau_0) \subset F(\tau_0) \subset (q_1, q_2)$. Since $T$ and $R$ are u.h.c. and convex valued, Lemma 7 then implies that $R(T((p_1, p_2)))$ is connected, hence (ii). Finally, by definition of $z(\tau)$ we know that for each $p \in P$ there exists $\sigma$ with $\sigma A(p) \tau - l \cdot p \ge z(\tau)$. Hence by the minimax theorem we get $\sigma$ in $I^*$ such that
$$\sigma A(p) \tau - l \cdot p \ge z(\tau) \quad \forall p \in P.$$
Since the above inequality is actually an equality for some $p$ with $p \in R(\tau) \subset (q_1, q_2) \subset P^\circ$, this implies that the equality holds on $P$. □
Lemma 9: If $p_1 \in X$ there exist $\sigma_1 \in I^*$ and $\tau_1 \in T(p_1)$ such that $\sigma_1 A(p) \tau_1 = l \cdot p$ on $P$.
Proof: Let us consider a sequence $(\tilde p_m, \tau_m)$ where $\tilde p_m \in (p_1, p_2)$ and converges to $p_1$ and $\tau_m \in T(\tilde p_m)$. Using Lemma 8 (ii) and (iii) we can choose $\sigma_m$ such that $(\sigma_m, \tau_m)$ satisfy (2). Since $I^*$ and $J^*$ are compact we can assume (by selecting a subsequence) that $(\sigma_m, \tau_m)$ converge to some $(\sigma, \tau)$. Since $z(\cdot)$ is continuous, $T(\cdot)$ and $R(\cdot)$ u.h.c., we have by Lemma 8 (i) and (ii):
$$\tau \in T(p_1), \quad R(\tau) \cap [q_1, q_2] \ne \emptyset, \quad z(\tau) \le 0 \quad \text{and} \quad \sigma A(p) \tau - l \cdot p = z(\tau) \text{ on } P.$$
Now there are two cases:
- If $\tau \in \bar T$, then $z(\tau) = 0$, otherwise for some $p \in [q_1, q_2]$ it would be true that $f_\tau(p) < l \cdot p = \text{Cav } a(p)$. Thus we can choose $\sigma_1 = \sigma$ and $\tau_1 = \tau$.
- If $\tau \notin \bar T$, then $z(\tau) < 0$, and Corollary 3 implies $R(\tau) \subset (q_1, q_2)$. But by definition of $p_1$, $\bar T \cap T(p_1) \ne \emptyset$. Since $\bar T$ is closed and $T(p_1)$ convex we can now find $\tau_1 \in \bar T \cap T(p_1)$ such that:
$$[\tau, \tau_1) \subset T(p_1) \quad \text{and} \quad [\tau, \tau_1) \cap \bar T = \emptyset.$$
Using Lemma 7 we get as in Lemma 8:
$$R([\tau, \tau_1)) \subset (q_1, q_2) \quad \text{and} \quad z(\cdot) < 0 \text{ on } [\tau, \tau_1).$$
It remains to construct a sequence $\tau^*_m$ such that $\tau^*_m \in [\tau, \tau_1)$ and $\tau^*_m$ converges to $\tau_1$, and a (sub)-sequence $\sigma^*_m$ converging to some $\sigma_1$ such that $(\sigma^*_m, \tau^*_m)$ satisfy (2). Since we are now in the previous case, the result follows. □
Lemma 10: If $p_1 \notin X$ there exist $\sigma_1 \in I^*$, $\tau_1 \in T(p_1)$ and $\tilde l^2 < l^2$ such that $\sigma_1 A(p) \tau_1 = l^1 p^1 + \tilde l^2 p^2$ on $P$.
Proof: Since $p_1 \notin X$ we have $p_1 = (1, 0)$.
Since $T(p_1) \cap \bar T = \emptyset$ we can choose any $\tau_1 \in T(p_1)$ and as in Lemma 9 prove that $R(\tau_1) \subset (q_1, q_2)$ and $z(\tau_1) < 0$. Now by Corollary 3, this implies $f_{\tau_1}(p_1) \ge l^1$.
Since $f_{\tau_1}$ is continuous there exists some $\tilde l^2 < l^2$ such that:
$$f_{\tau_1}(p) \ge l^1 p^1 + \tilde l^2 p^2 \text{ on } P, \text{ with equality for some } p' \ne p_1.$$
Let $q_i \in \{ q_1, q_2 \}$ with $q_i^2 > 0$, then we have:
$$f_{\tau_1}(q_i) \ge l \cdot q_i > l^1 q_i^1 + \tilde l^2 q_i^2.$$
Hence we have $p' \in P^\circ$.
By using again the minimax theorem we get the existence of some $\sigma_1 \in I^*$ with
$$\sigma_1 A(p) \tau_1 = l^1 p^1 + \tilde l^2 p^2 \text{ on } P. \quad □$$
Proof of Proposition 2
Player 1 chooses some $S = \{ s_1, s_2 \}$ and $x$ according to Lemma 1 in order to achieve $p_1 = p(s_1)$ and $p_2 = p(s_2)$. The correlated strategy at $p_i$ is now $\sigma_i \otimes \tau_i$ defined by Lemma 9 (if $p_i \in X$) or Lemma 10 (if $p_i \notin X$).
By construction $\tau_i \in T(p_i)$, hence condition (i) of Proposition 1 holds.
By definition $l = (l^1, l^2)$ satisfies condition (iii).
As for condition (ii):
- if $p_1 \in X$ then $\alpha(s_1) = l$;
- if $p_1 \notin X$ then $p_1 = (1, 0)$ which implies $x^2(s_1) = 0$.
But we have $\alpha^2(s_2) = l^2 > \alpha^2(s_1) = \tilde l^2$ and similarly for $p_2$, hence $p^k x^k(s_i) > 0 \Rightarrow \alpha^k(s_i) = l^k$; $i = 1, 2$; $k = 1, 2$.
This completes the proof. □
References
Aumann, R.J., and M. Maschler: Game Theoretic Aspects of Gradual Disarmament. Development of Utility Theory for Arms Control and Disarmament, Chap. V, Report to the U.S.A.C.D.A., Contract S.T. 80, prepared by Mathematica, Princeton, N.J. 1966.
- : Repeated Games with Incomplete Information: A Survey of Recent Results. Models of Gradual Reduction of Arms, Chap. III, Report to the U.S.A.C.D.A., Contract S.T. 116, prepared by Mathematica, Princeton, N.J. 1967.
Aumann, R.J., M. Maschler, and R.E. Stearns: Repeated Games of Incomplete Information: An Approach to the Non-Zero-Sum Case. The Indirect Measurement of Utility, Chap. IV, Report to the U.S.A.C.D.A., Contract S.T. 143, prepared by Mathematica, Princeton, N.J. 1968.
Sorin, S.: Une introduction aux jeux répétés à deux joueurs, à somme nulle et à information incomplète. Cahiers du Groupe de Mathématiques Economiques no. 1, Laboratoire d'Econométrie, Université Paris VI, Paris 1979. (English Version: I.M.S.S.S. Stanford University, TR 312, 1980).
Stearns, R.E.: A Formal Information Concept for Games with Incomplete Information. Models of Gradual Reduction of Arms, Chap. IV, Report to the U.S.A.C.D.A., Contract S.T. 116, prepared by Mathematica, Princeton, N.J. 1967.
Received April 1980
(revised version September 1982)