A note on “Big-Match”

(1)

A NOTE ON `BIG-MATCH'

JEAN-MICHEL COULOMB

Abstract. We present a very simple proof of the existence of the value for `Big Match' rst shown by Blackwell and Ferguson (1968).

1. `Big-Match' game

A `Big-Match' game is a two person zero-sum repeated game with absorbing states, represented by the following ^f

TB

^g^f1

2^g payo matrix

C

^?:

TB

a

1 2^?¹

a

^?²

b

¹

b

²

where as usual a star

?

denotes an absorbing payo. A play of the game is made of an innite number of stages. At stage

k

1 player I (resp. II) selects one of the rows (resp. column

j

^k). Each player is told afterwards the choice of his opponent. One assumes perfect recall: both players remember the sequence of their own and their opponent's moves. The rst stage where player I plays T is denoted by

. One denes the payo

z

^k at stage

k

by:

z

^k =

b

^j^k if

k <

a

^j otherwise.

Since one considers only the payos, the strategic choices of the players become irrelevant after stage

. Therefore a strategy

(resp.

) of player I (resp. II) species at stage

k

a probability

^k(

j

¹

::: j

^{k ;1}) (

^k for short) to play T (resp.

^k(

j

¹

::: j

^{k ;1}) to choose column 1) when player II's previous moves are

j

¹

::: j

^{k ;1}. The law of

is induced by the couple of strategies (

)

:

The expected average payo up to stage

n

under (

) is:

n(

) = IE(

z

¹+

:::

+

z

ⁿ

n

⁾

= IE^Xⁿ

k =1

P

(

=

k

^j

j

¹

::: j

^{k ;1})(

n

^;

k

+ 1)

a

^j^k+^P^{k ;1}^h=1

b

^j^h

n

^]

+

P

(

> n

^j

j

¹

::: j

^n;1)

P

n

h=1

b

^j^h

n

:

(1)

URL address of the journal: http://www.emath.fr/ps/.

Received by the journal November 15, 1995. Revised June 5, 1996. Accepted for publication June 20, 1996.

c Societe de Mathematiques Appliquees et Industrielles. Typeset by L^ATEX.

(2)

A

ⁿ(

) =

A

ⁿ(

j

¹

::: j

^n;1) =^X

k =1

n

^;

k

+ 1

n P

⁽

=

k

^j

j

¹

::: j

^{k ;1})

a

^j^k

:

Call the set of probability distributions

y

= (

y

¹

y

²) on ^f1

2^g

:

Let

W

=

min

^{y 2}

max

(

y

¹

a

¹ +

y

²

a

²

y

¹

b

¹+

y

²

b

²) be the value of the following matrix

C

obtained from

C

^? by deleting the `

?

's:

a

¹

a

²

b

¹

b

² One has the following result (3]):

Theorem 1.

W

is the value of the `Big-Match' game.

Remark that player II can guarantee

W

by playing i.i.d. an optimal strategy in the game

C

. In order to show that player I can guarantee

W

, we use a new approach based on two functions of a state variable.

remarks1. Our denition of the value, the same as 5] and 6], is stronger than the one used in 3]. Here we do not associate a payo to any play of the `Big-Match' game. We show the

-optimality of the expected average payo up to stage

n

when

n

becomes large, uniformly with respect to all strategies of player II. On the contrary, in 3] the payo of a play is dened by limsup^n!1 ^z¹^+:::+zⁿ ⁿ

:

2. The functions that we use do not depend on the matrix

C

^? (at least under assuption 1).

In 3], one studies the case:

C

^?= 1^? 0^? 0 1

2. Heuristic

We introduce a strategy which looks very much like the one introduced by Blackwell and Ferguson 3]. However our point of view seems to be quite dierent.

One starts from a simple remark: each time player I plans to choose T with a strictly positive probability, he takes the risk to get a bad i.e.

strictly negative absorbing payo. Furthermore he may accumulate these risks throughout the play. The new idea consists of monitoring at every stage the overall risk in the future that player I is ready to take by using a function of a state variable. More precisely from stage

k

, the overall risk of player I is linked to

^k,

a

^j^k and the overall risk from stage

k

+ 1.

Player I uses two functions

f

and ~

of a parameter

x

depending on player II's previous moves. The overall risk of player I (resp. the probability to play T) is represented by ~

(

x

) (resp.

f

(

x

)). These are given by the following (

M

is a large number to be xed later):

f

(

x

) =

8

>

<

>

:

1 + (

M

1+

x

)² if

M

+

x

1

12 otherwise

9

>

=

>

(3)

and

~

(

x

) =

8

<

:

;

(

M

1+

x

) if

M

+

x

1

;1 otherwise

By straightforward computations we show that

f

and

satisfy (to some extend one can compare with 1] and 2]):

Lemma 2. For all ^j

^j1

=

2 and for all

x

we have:

f

(

x

) + (1^;

f

(

x

))~

(

x

^;

)~

(

x

)

:

(2) To nd both functions, our heuristic is based on the functional inequation (2) which leads us to a set of dierential equations. More precisely, up to second order one has:

(1^;

f

(

x

))~

(

x

^;

) ^'(1^;

f

(

x

))(~

(

x

)^;

~

⁰(

x

) +

²

2 ~

⁰⁰(

x

))

:

To get rid of rst degree terms in (2) one sets:

(1^;

f

(

x

))~

⁰(

x

) =

f

(

x

)

:

Therefore one must verify:

;

f

(

x

)~

(

x

) +

²

2 (1^;

f

(

x

))~

⁰⁰(

x

)0

:

It will hold if one sets the equality for

= 1 i.e.:

2

f

(

x

)~

(

x

) = (1^;

f

(

x

))~

⁰⁰(

x

)

:

Under the condition 0

< f

(

x

)

<

1, one obtains:

2~

(

x

)~

⁰(

x

) = ~

⁰⁰(

x

)

:

3. Main Proof

Given past moves

j

¹

::: j

^{k ;1} of player II, the expression

x

^k =^;^P^{k ;1}^h=1

a

^j^h (

x

¹ = 0) will be used as state variable. The strategy

^M of player I consists of playing T at stage

k

with probability

^k =

f

(

x

^k).

Proposition 3. For all

n

1 and for all strategy

of player II, one has:

IE

A

ⁿ(

^M)^; 1

M :

⁽³⁾

Proposition 3 is a consequence of the following lemma since the function

~

is always negative.

Lemma 4. For all

n

1 and for all strategy

of player II, one has:

IE(

A

ⁿ(

^M) + 1

nP

^M⁽

> n

^j

j

¹

::: j

^n;1)~

(

x

ⁿ⁺¹))^; 1 Proof of the lemma:

M :

For all

k

= 1

::: n

, let us apply lemma 2 with

x

=

x

^k and

=

a

^j^k:

^k

a

^j^k + (1^;

^k)~

(

x

^{k +1})

~(

x

^k)

:

(4)

by ⁿ

P

(

> k

1

j ::: j

) to deduce that:

n

^;

k

+ 1

n P

^M⁽

=

k

^j

j

¹

::: j

^{k ;1})

a

^j^k+

n

^;

k

n

^{k +1}

n

^;

k

+ 1

n

^k since by negativity

n

^;

k

n

^{k +1}

n

^;

k

+ 1

n

^{k +1}

:

If one sums all these in- equalities and remarks that

¹ =^;1

=M

, then one obtains lemma 4.

This implies that the absorbing part of the payo is `controlled'. However it remains to study the non absorbing part of the payo.

Lemma 5. For all

>

0 and for all

k

1, if 1

k

X

h=1

b

^j^k ^;

then:

P

^M(

> k

^j

j

¹

::: j

^{k ;1})(

M

² 1 +

M

²⁾^{k ]}

:

Proof:

Assume that:

b

^j¹+

:::

+

b

^j^k

k <

^;

:

(5)

Remark that for all

k

^;

k

⁰

< k

one has:

b

^j¹ +

:::

+

b

^j^k⁰

<

0

:

Since

W

= 0, this implies that^;

a

^j¹ ^;

:::

^;

a

^j^k⁰ =

x

^k⁰⁺¹0 and

^{k +1}=

f

(

x

^k⁰⁺¹) 1

1 +

M

²

:

One then gets:

P

^M(

> k

^j

j

¹

::: j

^{k ;1}) = ^Y^k

h=1

(1^;

^h)(

M

² 1 +

M

²⁾^{k ]}

:

Finally we obtain:

Proposition 6. ⁸

>

0

⁹

N >

0

⁸

n

N

⁸

: IE(^Xⁿ

k =1

k

^;1

n P

^M⁽

=

k

^j

j

¹

::: j

^{k ;1})

P

k ;1

h=1

b

^j^h

k

^;1 +

P

^M(

> n

^j

j

¹

::: j

^n;1)

P

n

h=1

b

^j^h

n

⁾ ^;²

:

(6) Proof:

Take

N

¹ 1 such that (^1+M^M²²)^N¹^]

< =

2 and

N

² such that ^N1^N2

< =

2

:

Let

k

be the rst

k > N

¹ such that (5) holds. Therefore by lemma 5, one has

P

^M(

> k

^j

j

¹

::: j

^{k ;1}) ²

:

Remark that for all

n

N

²:

n

X

k =1

k

^;1

n P

^M⁽

=

k

^j

j

¹

::: j

^{k ;1})

P

k ;1

h=1

b

^j^h

k

^;1 +

P

^M(

> n

^j

j

¹

::: j

^n;1)

P

n

h=1

b

^j^h

n

(5)

;

N

¹

N

² ^;

P

^M(

N

¹+ 1

<

k

^j

j

¹

::: j

^{k ;1})^;

2

;2

:

Propositions 3 and 6 show that player I can guarantee 0 since (1) is the sum of (3) and (6). This ends the proof of theorem 1.

Acknowedgements

I would like to thank S. Sorin for his precious help and an anonymous referee for his very careful reading. The impulsion of this work originates from discussions with A. Neyman concerning my PhD dissertation.

References

Blackwell D., (1969), Innite ^G-Games with Imperfect Information, Applicationes Mathematicae, Hugo Steinhaus Jubilee Volume, X, 99-101.

Blackwell D., (1989), Operator Solution of Innite^G-Games of Imperfect Information, InProbability, Statistics and Mathematics. Papers in Honor of Samuel Karlin, ed. by T.W. Anderson, K.B. Athreya and D.L. Iglehart, Academic Press, 83-87.

Blackwell D. and Ferguson T.S., (1968), The Big Match, Ann. of Math. Stat.,39, 159-163.

Coulomb J.M., (1992), Repeated Games with Absorbing States and no Signals, Int. J.

of Game Theory,21, 161-174.

Kohlberg E., (1974), Repeated Games with Absorbing States,Ann. of Stat.,2, 724-738.

Mertens J-F. and Neyman A., (1981), Stochastic Games,Int. J. of Game Theory,10, 53-56.

UFR SEGMI, Universite Paris X, 200, avenue de la Republique, 92001 Nan- terre Cedex. E-mail: [email protected]