• Aucun résultat trouvé

Merging, Reputation, and Repeated Games with Incomplete Information

N/A
N/A
Protected

Academic year: 2022

Partager "Merging, Reputation, and Repeated Games with Incomplete Information"

Copied!
35
0
0

Texte intégral

(1)

Merging, Reputation, and Repeated Games with Incomplete Information

Sylvain Sorin*

Laboratoire d’Econom´etrie, Ecole Polytechnique, 1 rue Descartes, 75005 Paris, France, and MODAL’X and THEMA, U.F.R. SEGMI, Universit´e Paris X Nanterre,

200 Avenue de la R´epublique, 92001 Nanterre Cedex, France E-mail: [email protected]

Received September 8, 1997

We relate and unify several results that appeared in the following domains: merg- ing of probabilities, perturbed games and reputation phenomena, and repeated games with incomplete information.Journal of Economic LiteratureClassification Numbers: C72, D83. © 1999 Academic Press

1. PRESENTATION

Consider a discrete stochastic process on a spaceX that follows a prob- ability distribution Q. The paths of the form x= x1; : : : ; xn; : : : are an- nounced stage after stage to an observer who ignoresQbut holds ana priori probability or belief, P, on the process. On each path x, at each stagen, the probability of the observer on the future behavior of the process is re- vised, giving rise to a new belief which is the conditional distribution of P given x1; : : : ; xn‘. On the other hand, Q also determines conditional probabilities.

We first consider conditions for merging, namely, convergence of these beliefs (deduced from P) to the corresponding conditional probabilities (givenQ). The initial concept of merging, due to Blackwell and Dubins, re- quires the two probabilities to be close to each other for all future events.

This notion and the corresponding Blackwell and Dubins theorem (Re- sult I) deal with asymptotic properties. They will be used in the framework of multistage games with undiscounted payoffs. On the other hand, in the discounted case only events in the near future matter. The corresponding notion of proximity for probabilities considers only these events. This leads

*I thank Abraham Neyman for extensive comments.

0899-8256/99 $30.00 274

Copyright © 1999 by Academic Press All rights of reproduction in any form reserved.

(2)

to weak merging introduced by Kalai and Lehrer. However for the appli- cations to discounted games a stronger property involving uniform rate of weak merging is needed. This corresponds to Result II. These two results (I and II) are not comparable: the first one says that two sequences of prob- abilities are asymptotically closed on a large set of events, the second one that they are closed on a smaller set of events but at a uniform rate.

The above results on merging directly apply to the study of multistage two-person games of incomplete information on one side, where uncer- tainty concerns either the payoffs (like in repeated games with incomplete information) or the strategies (like in perturbed games and reputation phe- nomena). In both models the informed Player 1 is one of several types k and her opponent holds an initial probabilitypon this set of typesK. We study the set of equilibrium payoffs in such games and also compute bounds on the advantage the informed player can achieve due to her opponent’s uncertainty. We show that these properties crucially depend on the relative patience of the players and on the specification of the signalling mechanism used during the play.

The argument goes as follows: consider equilibrium strategies for the players. In the standard signalling case the stochastic process x1; : : : ; xn; : : :‘ is simply the play i1; j1‘; : : : ;in; jn‘; : : :‘, namely, the sequence of moves used and observed stage after stage by the players. The belief P of the uninformed Player 2 on the plays of the game is generated by her own strategy, the collection of type-dependent strategies of her opponent and the initial distributionp on types. On the other hand, for each typek of Player 1, the (true) distribution Qk of the process is generated by her strategy and the strategy of the uninformed agent. This framework pro- vides specific relations between distributionsP andQk which are sufficient for merging to occur. It follows that from the point of view of the unin- formed player, P and Qk induce conditional probabilities on the future plays that are close to each other from some stage on. By the equilibrium condition, Player 2 is playing a best reply. We thus derive a bound on her payoff under the distribution P, hence by merging similar results under Qk. This provides conditions satified by the probability Qk, thus finally a bound on the informed player’s payoff while using her strategy generating Qk (under the given strategy of Player 2).

This general methodology is first applied to two-person repeated games with incomplete information on one side and known own payoffs. It gives a short proof of the characterization of the set of equilibrium payoffs in the undiscounted case, using Result I. The same characterization is ob- tained, using the parallel Result II, in the discounted framework where the informed Player 1 is infinitely more patient than her opponent. This condi- tion on the time preferences of the agents plays a crucial role and is actually necessary.

(3)

The characterization shows moreover that the payoff set for the informed player depends only on the support of the distribution p on types. Hence it allows for an interpretation in terms of perturbation when most of the probability is on the “true" type. A quantitymA; B‘, is explicitly computed.

It corresponds to the best lower bound on equilibrium payoffs for Player 1 in the following class of games: the perturbation is in terms of Player 1’s payoffs, the true payoffs for Players 1 and 2 being defined by the matrices AandB, respectively .

We then consider reputation phenomena or incomplete information in terms of strategies. Here the strategy the uninformed player is facing is a mixture of the actual strategy of the informed Player and of some pertur- bation. Each such reputation model is identified by three parameters: one concerns Player 1 and is the nature of the perturbations, the other is the type of payoffs of the players (basically their relative duration), the third one is the information transmitted during the play.

When playing against a sequence of Players 2, each one living for one stage, Player 1 can build a quite strong reputation. In fact, for the unin- formed Player 2, the knowledge of the probability induced byP (hence by Qk) on the next stage implies the knowledge of the one-stage strategy (of the informed player) she is facing. Since Player 2 is playing a best reply it is simple for Player 1 to give intructions to her and a boundwA; B‘, bet- ter than the one obtained in the previous frameworkmA; B‘, is actually reached. On the other hand when both players live forever with undis- counted or discounted payoffs, but again in the latter case the informed player being much more patient than her opponent, the bound obtained with the same merging tools, is lower and coincides withmA; B‘.

Facing longer lived opponents may be worse for the informed player.

Even after obtaining, by merging, a good approximation of Qk from P, Player 2 may be afraid of playing a best reply to the true and unknown strategy of Player 1: by experimenting she could be punished and then obtain a worse payoff. This phenomena, reminiscent of the concept of con- jectural equilibria, explains why reputation effects are not monotonic with respect to the length of the interaction.

However if the uninformed player is using a completely mixed strategy, she generates against any perturbation of Player 1 a “revealing distribu- tion.” In this case, for Player 2, playing a best reply under P essentially implies playing a true best reply. It follows that by adding some noise to the model, so that the distribution on the signals to Player 1 is completely mixed, one can extend the precise monitoring results of the short lived situ- ation to the case where Player 2 lives forn periods. The stochastic process is now on Player 2’s signals and the weak merging refers to the next n stages. The game looks like a repeated version of the normalized n stage games with standard signalling.

(4)

In addition one can, in this situation, take advantage of the length of Player 2’s life to build more elaborate perturbations in terms of monitoring.

This leads to the best achievable payoff from the point of view of Player 1, denoted bysA; B‘.

Another approach to avoid the “conjectural effect” is to introduce more complex perturbations, such that there is an infinite hierarchy of punish- ments on plays compatible with the same perturbation. Rather than play- ing an active role, the uninformed player, trying to avoid punishment, is finally confronted with the perturbation and the merging property implies that she eventually identifies it.

This paper is basically an original presentation of a large number of re- sults that appeared in different articles. It shows that a few basic properties in the theory of merging underline most of the proofs. The advantages of this approach are in several directions.

Concerning the proofs of the results, their basic structure is described explicitly: in order to get properties on the payoff of Player 1 under Qk, one studies the payoff of Player 2 under P and then one uses the merg- ing property. This allows also to shorten drastically the proofs (especially Proposition 2.5, and those in Sections 3 and 4). Finally it enables to extend the results: for example, to a sequence of finite lived Player 2 (in Sections 3, 4, and 6) or to perturbation in mixed strategies (Section 7).

Also this methodology allows to replace a series of specific martingale convergence results—that seem to be adapted to each particular case—by a general property independent on the probability space and much easier to use: see the change of filtrations, deterministic in 3.4 and random in Sec- tion 7. It also helps in understanding the importance of some hypotheses:

for example, the need for the informed player to be much more patient than the other or the impact of the signalling structure on merging.

Finally this unified presentation reveals hidden connections: on one hand between incomplete information games with known payoffs and reputation phenomena, see the reasons for getting the same boundmA; B‘ in Sec- tions 3 and 4; on the other hand between the undiscounted model and the one with a patient informed player, see 3.3 versus 3.4 and 4.4 versus 4.5. It also facilitates the comparison of reputation properties when facing a se- quence of short or long lived players and clarifies the relations between the different lower bounds on the equilibrium payoffs of the informed player:

mA; B‘,wA; B‘, and sA; B‘.

This methodology has the large potential of applications since it relies on a general bound on the number of bad observations in a merging frame- work. This property is clearly independent of the length of the players, of the type of signals, and of the nature of the perturbations.

The content of the paper is as follows:

Part 2 recalls first (Result I) the initial and fundamental result on merging of Blackwell and Dubins (1962), the alternative formulation (weak merging)

(5)

and refinements of Kalai and Lehrer (1994). Then (Result II) a related uniform bound on the speed of convergence is given with a new proof of the property of Fudenberg and Levine (1992) in Proposition 2.5. A new result analog to theirs is also obtained as Proposition 2.4.

Part 3 builds on the above properties to characterize the set of equilib- rium payoffs in two-person repeated games with incomplete information and known own payoffs. Result I is used for the undiscounted case (3.2) in the spirit of Shalev (1994) and Koren (1992) and one relies on Result II for the discounted case with a more patient informed Player in 3.4, following Cripps and Thomas (1995b). The corresponding lower bound on the payoff of the informed player for payoff-perturbed games, due to Shalev (1994) and Israeli (1996) is presented in 3.3.

The remainder of the paper deals with reputation phenomena. The lower bounds on the informed player’s payoff at equilibrium depend upon the relative sizes of the players, the observability conditions, and the nature of the perturbation.

Part 4 is devoted to the standard signalling case with different sizes of players, starting with the seminal works of Fudenberg and Levine (1989, 1992), then following Cripps and Thomas (1995a) and Cripps et al.(1996) using Result I or II.

Part 5 deals with the case of a repeated game with signals, according to Fudenberg and Levine (1992) and studies the relation with conjectural equilibrium.

Part 6 considers games with noise where the whole perturbed strategy can be identified by the uninformed player hence allowing more precise monitoring, see Aoyagi (1996), Celentaniet al.(1996).

Part 7 shows, following Evans and Thomas (1997), that with more com- plex perturbations (with unbounded memory) the informed player can force her opponent to follow any prespecified individually rational path, thus achieving the best boundsA; B‘.

2. NOTIONS OF MERGING

Let;F; Q‘be a probability space equipped with a filtration”Fn•(in- creasing sequence of sub σ-fields of F) that generatesFxF =σS

nFn‘.

We assume that each σ-fieldFn is generated by a countable partition Fn and we denote byFnw‘the atom ofFn containingω.

Let P be another probability distribution on ;F‘.Q defines the law of the process x1; : : : ; xn; : : :‘ withxn=Fnω‘(note that here xn deter- mines x1; : : : ; xn‘) and P corresponds to the beliefs of an observer. Fn describes the information available at stagen: givenω, the future behav- ior is governed by Q·ŽFnω‘‘ ≡ Q:ŽFn‘ω‘ and the beliefs are given by P·ŽFnω‘‘ ≡P:ŽFn‘ω‘.

(6)

2.1. Merging

A notion of merging ofPtoQrefers to a type of convergence, under the true distributionQ, of the beliefsP·ŽFnω‘‘to the conditional probability distributionsQ·ŽFnω‘‘. It corresponds to the following situation: observ- ing stage after stage the realization of the process, the observer eventually is able to make good predictions. The first approach and basic properties are due to Blackwell and Dubins (1962).

Let us first define:

fnP; Q‘ω‘ =sup

A∈FŽPAŽFnω‘‘ −QAŽFnω‘‘Ž:

Definition 2.1. P mergestoQif:

fnP; Q‘ω‘ −→0 asn−→ ∞Qa.s.

The main result is

Theorem 2.2(Blackwell and Dubins, 1962). Assume that Q is abso- lutely continuous with respect toP (PQ). ThenP merges toQ.

Comments. 1. Kalai and Lehrer (1994) have shown conversely that if P merges to Q for all filtrations Fn that generate F, then P Q. It fol- lows that the notion of merging is actually independent of the filtration. It depends only on the data;F; P; Q‘:P merges toQiffPQ.

2. A simple case where P Q holds is obtained when P = ρ Q+

1−ρ‘Q0, for some 0< ρ≤1, whereQ0is another probability distribution on ;F‘. In this case one says that the beliefP contains agrain of truth of sizeρaboutQ(Kalai and Lehrer, 1994).

2.2. Weak Merging

Kalai and Lehrer (1994) also introduced a weaker notion of convergence based on the following function:

enP; Q‘ω‘ = sup

A∈Fn+1ŽPAŽFnω‘‘ −QAŽFnω‘‘Ž:

Definition 2.3. P weakly mergestoQalong”Fn• if:

enP; Q‘ω‘ −→0 asn−→ ∞Qa.s.

This requires, with Qprobability 1, the conditional distributions of P and Q at stage n to be close, for n large enough and for all events that occur at the next stage. Obviously this property extends to all events in a bounded future but not to all events inF like withfnP; Q‘in the definition of merging.

Kalai and Lehrer (1994) have shown that weak merging depends on the filtration”Fn•. Necessary or sufficient conditions for weak merging (in particular weaker than absolute continuity) and refinements can be found in Kalai and Lehrer (1994), Lehrer and Smorodinsky (1996a, 1996b, 1997).

(7)

2.3. Uniform Weak Merging

The following notion is not involving tail events, like in the case of merg- ing, but is stronger than weak merging. It is needed for the study of the discounted framework.

Consider the space of the process as an oriented graph with nodesω; n‘

in×corresponding toFnω‘. Givenδpositive, letAδdenote the nodes whereP andQareδ weakly far away. Explicitly:

Aδ= ”w; n‘ ∈×yenP; Q‘ω‘ ≥δ•:

The following two lemmas control, under a grain of truth hypothesis, the size of the setAδ.

The first lemma considers Aδn the section of Aδ for a fixed n; i.e., the set of pathsω for which the event: “the one stage prediction at stage n is inaccurate by more thanδ” holds.

The second one deals, on each trajectoryω, withAδω‘which is the set of stagesn where this event occurs.

Lemma 2.4. Given any positive constantsδ,ε, and ρ, there existsM = Mδ; ε; ρ‘ such that for any P; Q with P =ρQ+ 1−ρ‘Q0 and ρ≥ ρ, one has for all stagesn except at mostM of them:

QAδn‘ ≤ε:

Proof. We considerP as a probability distribution on an extended prob- ability space, where first a lottery chooses between QandQ0 and then the process follows the selected probability. Let B be the event: “the process follows the distibutionQ.” UnderP,Bhas an initial probabilityρ=q1. We introduceqm=PBŽFm−1‘which defines, under P, a martingaleq= ”qm• with values in’0;1“, hence satisfies:

EP X

m=1

EPqm+1−qm‘2ŽFm−1‘

=EP X

m=1

qm+1−qm‘2

≤1: (1) We obtain a bound of the probability ofAδnby studying the variation ofq.

LetM denote the set of stages whereEPqm+1−qm‘2‘ ≥1/M. By (1), its cardinality satisfies #M≤M.

Now for a stagem /∈M, the boundEPEPqm+1−qm‘2ŽFm−1‘‘ ≤1/M implies thatPωyEPqm+1−qm‘2ŽFm−1ω‘‘ ≥1/√

M‘ ≤1/√ M. Define the conditional “one-stage variation” of the martingaleqby:

Vmq‘ω‘ =EPŽqm+1−qmŽ ŽFm−1ω‘‘:

Hence for m /∈ M, one obtains from the previous inequality and Cauchy Schwarz inequality: PωyVmq‘ω‘ ≥1/√4

M‘ ≤1/√

M as well. Also, de- noting by Dmω‘the family of atoms of the partition ofFm−1ω‘, one has

(8)

an explicit formula for the variation:

Vmq‘ω‘ = X

a∈Dmω‘

PaŽFm−1ω‘‘Žqm+1a‘ −qmω‘Ž:

Now for eacha∈Dmω‘one has

qm+1a‘ =qmω‘QaŽFm−1ω‘‘

PaŽFm−1ω‘‘ ; so that one obtains:

Vmq‘ω‘ =qmω‘emP; Q‘ω‘: (2) The distance between the conditional probabilitiesemP; Q‘is thus propor- tional to the one-stage variationVmq‘. It remains to bound the coefficient qm which is the conditional probability ofB.

Letm0t‘ = ”ω∈y qmω‘ ≤tρ•be the set of paths where this poste- rior probability ofBhas decreased by a ratio at leastt. Forω /∈m0t‘the bound onem follows from (2):

Qω /∈m0t‘y emP; Q‘ω‘ ≥1/tρ√4

M‘ ≤1/ρ√ M:

Finally on 0mt‘, ρQFm−1ω‘‘/PFm−1ω‘‘ = qmω‘ ≤ tρ. Thus QFm−1ω‘‘ ≤t PFm−1ω‘‘. By summing one obtains the following bound on the probability under Qof this set of “low posteriors”:Qm0t‘‘ ≤t.

Choose t=ε/2 andM such that both √

M ≥2/ερ and√4

M ≥1/tδρ to get the result.

The previous lemma was computing for a given stage the probability of the set of paths going through Aδ at that stage. This probability can be reduced as small as we want except on a uniformly finite number of stages.

By uniform we mean that it does not depend on the space  and not on the probabilitiesP andQunder consideration but only on the boundρ on the size of the grain of truth.

The next lemma checks, on each path, the number of nodes inAδ. Except on a set of paths of probability as small as wanted this number is uniformly bounded.

Lemma 2.5(Fudenberg and Levine, 1992). Given any positive constants δ, ε, andρ, there exists M =Mδ; ε; ρ‘such that for any P; Q withP = ρQ+ 1−ρ‘Q0andρ≥ρ, one has:

Qωy #Aδω‘ ≥M‘ ≤ε:

(9)

Proof. We use the notations and results from the proof of the previous lemma.

Let

1M; α‘ = ”w∈y #”my Vmq‘ω‘ ≥α• ≥M•

be the set of paths ω where the martingale has a conditional one-stage variation greater thanα on more thanM stages. From (1) one obtains by Cauchy–Schwarz inequality: α2M P1M; α‘‘ ≤ 1. Using (2), it follows that, except on1M; α‘, there are at mostM stages whereemP; Q‘ω‘ ≥ α/qmω‘.

Let us consider the set of paths exhibiting eventually low posterior dis- tributions:

2t‘ = ”ω∈y qnω‘ ≤tρ; for somen•:

One introduces the partition: 2t‘ = S

mm2t‘ with m2t‘ = ”ω ∈

y qnω‘> tρ; forn < m, andqmω‘ ≤tρ•. Forωin m2t‘,qmω‘ ≤tρ, hence as in the previous proof QFm−1ω‘‘ ≤ tPFm−1ω‘‘. So that Qm2t‘‘ ≤tPm2t‘‘and alsoQ2t‘‘ ≤tP2t‘‘ ≤t.

Given a pointω /∈2t‘,qmω‘ ≥tρ. If in additionω /∈1M; α‘, there are at mostM stagesn where:

emP; Q‘ ‘ω≥ α tρ:

Therefore, except on the set 1M; α‘ ∪W2t‘, ω ∈Aα/tρn for at most M stages. Chooset=ε/2, thenα=tρδand finallyM≥2/α2ερ‘to get the result.

Comments. 1. Note that the previous results are independent of the filtrationFn. They give, under a strong form of absolute continuity (P with grain ofQat leastρ), a uniform bound on the set of nodes ω; n‘ where enP; Q‘ω‘is large. In other words, the number of observations that may lead to bad predictions is uniformly bounded.

2.To have a control on this number of stages is crucial in a discounted framework, by opposition to the undiscounted case where asymptotic results (like upper density) suffice.

3. REPEATED GAMES WITH INCOMPLETE INFORMATION 3.1. Presentation

Consider repeated two-person games with incomplete information on one side introduced by Aumann and Maschler (1995) and described as

(10)

follows: Player 1’s (resp., Player 2’s) stage payoff is defined by a finiteI×J matrixAk(resp.,Bk), wherekbelongs to a finite setK. The game is played in stages: at stage 0, k is selected according to a probabilityp,p∈1K‘, the simplex on K. Player 1 is told thek chosen while Player 2 knows only p. At stagen+1, given the previous history of lengthn,hn∈Hn, which is the sequence of moveshn = i1; j1; : : : ; in; jn‘up to that stage, both play- ers choose a move, say in+1 and jn+1, this couple is announced to both and the stage payoff is an+1 =Akin+1;jn+1 for Player 1 and bn+1 =Bkin+1;jn+1 for Player 2. Note that this payoff is not announced. As usual, strategies, σ = ”σk• for Player 1 and τ for Player 2, can be represented as map- pings from the set of historiesH=S

n≥0Hn to mixed moves, namely,1I‘

and1J‘.6andT denote the corresponding sets of strategies. On the set of plays H = I×J‘, the σ-fields Hn are generated by Hn and gen- erate H. Together with p,σ, and τ define a probability distribution on

K×H;2K⊗H‘, hence on the stream of payoffs.

Let anσ; τ‘ = Ep; σ; τn1Pn

m=1am‘ (and similarly bnσ; τ‘) denote the average expected payoff of Player 1 (resp., Player 2) and let aknσk; τ‘ = Ek; σk; τ1nPn

m=1am‘ specify this payoff given k so that anσ; τ‘ = P pk akn σk; τ‘. Similarly, for 0 < λ ≤ 1, aλσ; τ‘ = Ep; σ; τP k

m=1λ1− λ‘m−1am‘ is the average discounted payoff of Player 1 and so on. 0np‘

(resp., 0λp‘, 0p‘) denotes the n stage (resp., λ discounted, infinitely repeated) game. In this last case, the payoffs are not well defined, however one introduces a set of equilibrium payoffs using the following procedure:

Definition 3.1(Hart, 1985). Given a Banach limit L, an L equi- librium σ; τ‘ is an equilibrium in the game with payoff γLσ; τ‘ = Lanσ; τ‘; bnσ; τ‘‘ (or equivalently with vector payoff ”Laknσk; τ‘‘•

for the informed Player 1). Ep‘ is the set of all L equilibrium payoffs obtained asL varies.

Definition 3.2(Sorin, 1990, 1992). A pair α; β‘ in 2 is a uniform equilibrium payoff if for any positive ε, there exists a couple of strategies

σ; τ‘ inducing anε-equilibrium in any sufficiently large game 0np‘with a payoff withinεofα; β‘. Formally:

∀ε >0 ∃σ; τ‘;∃Nx ∀n≥N ∀σ0; τ0‘ anσ; τ‘ +ε≥α≥anσ0; τ‘ −ε bnσ; τ‘ +ε≥β≥bnσ; τ0‘ −ε or equivalently, for α; β‘ ∈K+1

aknσk; τ‘ +ε≥αk≥aknσ0k; τ‘ −ε ∀k bnσ; τ‘ +ε≥β≥bnσ; τ0‘ −ε:

(11)

Note thatσ; τ‘is also a 2ε-equilibrium in any discounted game0λp‘with small discount factorλ.

Hart (1985) gave a characterization of Ep‘ proving in addition that any point in it is a uniform equilibrium payoff.

There are two important subclasses of these games with lack of informa- tion on one side:

— zero-sum games correspond to the case whereAk= −Bk, for allk;

— “known own payoffs” games are obtained when Player 2’s payoff is independent ofk,Bk=B.

Note that in the first case the uninformed Player 2 does not know her own payoff.

3.2. Known Own Payoffs Games

We consider here the second subclass for which an explicit characteri- zation ofEp‘ is much easier (Shalev, 1994). One can deduce it from the general characterization of Hart (1985), using properties of bimartingales due to Aumann and Hart (1986), or directly, even for the case of lack of in- formation on both sides (but with known own payoffs), like in Koren (1992) (see also Forges, 1992).

We first introduce few notations. For M an I ×J matrix, x in 1I‘, and y in 1J‘, xMy denotes P

i; jxiMi; jyj, val1M = maxxminy xMy and val2M =maxyminx xMy. For π in 1I×J‘, one defines Œπ; M = Pi; jπi; jMi; j. We also useŒ ;  for the scalar product in K.Ap‘ is the average matrix P

kpkAk.p0 means that p has full support. For all x in1I‘, letx denote the strategy in the repeated game that playsxi.i.d.

We assume #I ≥ 2. To make computations simpler we assume that all matrices are of norm less than 1:Ak ≤1,B ≤1.

We recall now basic results fromapproachability theory, due to Blackwell (1956).

Definition 3.3. Letc∈K. An orthant Oc‘ = ”c• −+K is approach- able by Player 2 in the game with vector payoffs ”Ak• if: ∀ε >0;∃τ and

∃N such that forn≥N,aknσk; τ‘ ≤ck+ε, for allσk and allk.

Blackwell’s theorem (1956) implies

Proposition 3.4. A necessary and sufficient condition forOc‘to be ap- proachable by Player 2 is:

(1) val1Aq‘ ≤ Œc; q,∀q∈1K‘

or equivalentlyOc‘being convex:

(2) ∀x∈1I‘,∃y∈1J‘withxAky≤ck,∀k∈K.

(12)

Let us finally define a set of correlated distributions onI×Jby:5K‘ =

”π= ”πk•yπk∈1I×J‘; k∈Kwith:

(i) P

kŒqkAk; πk ≥val1Aq‘; ∀q∈1K‘

(ii) ŒB; πk ≥val2B; ∀k

(iii) ŒAk; πk ≥ ŒAk; πk0; ∀k; k0∈K•

The set of payoffs induced by these distributions is: Ep‘ = ”α; β‘ ∈ K+1y ∃π∈5K‘with:

(iv) ŒAk; πk =αk; ∀k∈K (v) ŒB;P

kpkπk =β•.

The following characterization means that these distributions generate all equilibrium payoffs.

Proposition 3.5(Shalev, 1994; Koren, 1992). Assumep0. Hence, Ep‘ =Ep‘:

Proof. (1) To prove that any payoff in Ep‘ is a uniform equilibrium payoff is easy. The equilibrium strategies are as follows: Player 1 announces her type k and then both players follow a prespecified playwk in H on which the correlated frequency of the moves converges toπk. The induced payoffs satisfy then (iv) and (v). Any deviation duringwk is detectable and can be punished by condition (i) (using Proposition 3.4) for Player 1 and (ii) for Player 2. Finally condition (iii) prevents any undetectable deviation of Player 1 (pretending being of typek0while being of typek) from being profitable.

(2) For the converse we now show that any L equilibrium payoff is in Ep‘. Hence letσ; τ‘be anL equilibrium with payoffα; β‘ ∈K+1.

Denote by P (resp., Pk) the probability induced by p; σ; τ‘ (resp., p; σk; τ‘on the set of plays (H;H). E(resp., Ek) is the corresponding expectation. Obviously one has P=P

kpkPk so that PPk for any k and pk is the size of the grain of truth ofPk inP.

Define θni; j‘ = n1Pn

m=11”i; j•im; jm‘and let πi; jk =LEk”θni; j‘•‘be the asymptotic expected empirical frequency (a.e.e.f.) of the moves under σk andτ. Clearly, (iv) and (v) are satisfied byα; β‘andπ.

We claim thatπ belongs to5K‘. To prove (i), note that otherwise (by Proposition 2.1) there existsq∈1K‘andx∈1I‘such thatP

kqkxAky >

Œα; q, ∀y ∈1J‘. If π denotes the a.e.e.f. corresponding tox; τ‘, one obtains P

kqkŒπ; Ak>Œα; q so that for somekx Œπ; Ak > αk. To follow σ unless in state k where x is used would then be a profitable deviation for Player 1.

The equilibrium property for Player 1 implies obviously that the “no cheating condition” (iii) is satisfied.

(13)

It thus remains to prove (ii) and this relies on merging. The equilibrium condition for Player 2 implies obviously that his payoff is individually ra- tional at the start of the game: ŒB;P

kpkπk ≥val2B and also along the play. Hence, for any m,Palmost surely:

ŒB;LE”θni; j‘ŽHm•‘ ≥val2B: (3) Theorem 2.2 implies that P merges to Pk, in particular one has, taking expectation and limit, Pk a.s.,

LEk”θni; j‘ŽHm•‘ −LE”θni; j‘ŽHm•‘ −→0 asm−→ ∞; (4) which says that the asymptotic distributions under P orPk on the moves in the future, computed at any stagemlarge enough, are close one to the other. Thus, (3) and (4) together imply that,Pk a.s.:

m−→∞lim ŒB;LEk”θni; j‘ŽHm•‘ ≥val2B: (5) Since

EkLEk”θni; j‘ŽHm•‘‘ =LEkEk”θni; j‘ŽHm•‘‘

=LEk”θni; j‘•‘

i; jk ; one obtains from (5) condition (ii).

Basically the merging property allows to deduce from the individual ra- tionality condition ŒB; π ≥val2B, the more precise property: ŒB; πk ≥ val2B,∀k.

Corollary 3.6. Ep‘is nonempty.

Proof. We prove that5K‘is nonempty. Chooseyoptimal for Player 2 in game B, letxk be a best reply of Player 1 toy in game Ak and define πk =xk⊗y, i.e.,πi; jk =xkiyj. Conditions (ii) and (iii) are clearly satisfied.

Finally ifx0is a best reply of Player 1 toyinAq‘, one hasx0Aky≤xkAky for allk, thus val1Aq‘ ≤x0Aq‘y≤P

kqkxkAky= Œα; q.

Comments. 1.Part 1 of the previous proof (use of Blackwell’s approach- ability criteria and joint plans) is now standard, see Aumann and Maschler (1995), Sorin (1983), Hart (1985).

2. The proof of Corollary 3.6 for the class of games with incomplete information on one side is much more difficult, see Sorin (1983) for #K=2 and Simon et al. (1995) for the general case. Note that Koren (1992) has an example showing that the existence result does not extend to the class of known own payoff games with lack of information on both sides.

3. The set of vector payoffs equilibria of Player 1 is the projection of Ep‘ onK and depends only on the payoff matrices in the support of p.

Forp0, denote it byFA1; : : : ; AKyB‘.

(14)

3.3. Games with Perturbed Payoffs

We consider here a two-person nonzero-sum repeated game defined by a pair ofI×J matricesAandBand perturbed in the following way. With a small positive probability Player 2 believes that Player 1’s payoff is given by one of finitely many matrices A2; : : : ; AK. Assuming this uncertainty known by Player 1, the situation is then analogous to the incomplete infor- mation game described in 3.2, withA=A1, in particular the set of vector equilibrium payoffs for Player 1 is independent of the precise specification of the perturbation probability as long as it has full support. We are inter- ested in the payoff of the true type, more precisely we define her smallest equilibrium payoff:

mA; A2; : : : ; AKyB‘ =min”α1yα∈FA1; : : : ; AKyB‘•:

Let also:

mA; B‘ = sup

x∈1I‘ inf

y∈1J‘”xAyyxBy≥val2B‘•: (6) Then one has the following bounds:

Proposition 3.7 (Shalev, 1994; Israeli, 1996). (1) mA; A2; : : : ; AKy B‘ ≤mA; B‘, for any familyA2; : : : ; AK

(2)mA;−ByB‘ ≥mA; B‘.

Proof. (1) We adapt Forges’s idea (personal communication). By defini- tion of mA; B‘,8= ”x; y‘yxAy≤mA; B‘; xBy ≥val2B• is nonempty.

For eachkchoosexkandyk that achieveαk =max”xAkyy x; y‘ ∈8•. Fi- nally let πk=xk⊗yk. We now prove thatπ ∈5K‘. Conditions (ii) and (iii) are clearly satisfied. Also Œα; q = P

kqkmax”xAkyy x; y‘ ∈ 8• ≥ max”xP

kqkAk‘yy x; y‘ ∈8• ≥val1Aq‘.

(2) For the converse, let α be an equilibrium payoff for the game

A;−ByB‘. Conditions (i) and (ii) together imply α2 = val1−B‘ =

−val2B. Again (i) and Proposition 3.4 lead to: ∀x; ∃y; xAy ≤ α1 and x−B‘y ≤α2. Thus, α1 ≥miny”xAyxxBy≥val2B• for allx, which gives point 2).

Comments. 1.The interpretation is that the most advantageous pertur- bation for Player 1 (in terms of lower bound on her equilibrium payoffs) is the one that induces maximal constraints on Player 2: this is the case when facing the perturbed type, Player 2 plays a two-person zero-sum game, i.e., A2= −B. Compared to the Folk theorem, the individually rational level of the informed player increases from val1AtomA; B‘.

2.This result indicates a strong discontinuity w.r.t.pof the set of equi- librium payoffs for the undiscounted case. As soon as the slightest amount

(15)

of uncertainty is present the informed player can build a reputation, see Shalev (1994), Israeli (1996).

3. Note that the bound mA; B‘ may not be reached. Choose, for example:

A=

−1 1

−1 0

B= 2 1

0 1

:

Here mA; B‘ = 12, and for x = 12 there is no constraint on y and minyxAy= −1.

3.4. Discounted Case

We consider now the case where both players use personal discount fac- tors to evaluate their payoffs. For Player 1 the payoff is thus aλ1σ; τ‘ = E P

m=1λ1 1−λ1‘m−1am‘ with 0 < λ1 ≤ 1 and similarly bλ2σ; τ‘ = E P

m=1λ21−λ2‘m−1bm‘for Player 2. We denote by0λ1; λ2‘p‘the cor- responding game with incomplete information.

The results of this section are due to Cripps and Thomas (1995b). They first show that if the informed player is “infinitely patient” with respect to the other one, she can do as well as in the undiscounted case. This means in particular that in perturbed games like in Section 3.3, she can drastically reduce to her advantage her set of equilibrium payoffs, compared to the complete information case.

Proposition 3.8(Cripps and Thomas, 1995b). For anyp0, any dis- count factor λ2 and any positiveε, there exists λ11p; λ2; ε‘such that, ifλ1≤λ1, any equilibrium vector payoffαof Player 1 in0λ12‘p‘is within ε of FA1; : : : ; AKyB‘.

Proof. We use the notations introduced in the proof of Proposition 3.5.

Letσ; τbe an equilibrium in0λ12p‘. Givenλ2andη >0, letNsuch that the weight on stages 1; : : : ; N, givenλ2 is at least 1−η (N is the approx- imate horizon of Player 2). We define the λ2-average frequency between stagesmandnby

θλ2m; n‘i; j‘ = Xn

r=mλ21−λ2‘r−m1”i; j•ir; jr‘;

and the corresponding payoff for Player 2 by:

bλ2m; n‘ = Xn

r=mλ21−λ2‘r−mbr = ŒB; θλ2m; n‘:

The equilibrium condition for Player 2 implies as in section 3.2 that she is individually rational on each play. Hence, for allm,Pa.s.,

Ebλ2m+1;∞‘ŽHm‘ ≥val2B;

(16)

which implies, by the choice ofN:

Ebλ2m+1; m+N‘ŽHm‘ ≥val2B−η:

In particular, for allm,Pa.s., the payoff on themth block of sizeNsatisfies:

Ebλ2mN+1;m+1‘N‘ŽHmN‘ ≥val2B−η: (7) We now use Lemma 2.4 for the filtration Fn =HnN correponding to the sequence of blocks of sizeN andδ,εto be chosen later. One obtains that, for all indices m/∈M, with #M≤M:

Pkn

hmNy sup

A∈Hm+1‘N

ŽPAŽhmN‘ −PkAŽhmN‘Ž ≥δo

≤ε: (8)

Hence for m /∈ M, (7) and (8) imply that, with probability at least 1− ε under Pk, a similar bound holds for the payoffs of Player 2 evaluated underPk:

Ekbλ2mN+1;m+1‘N‘ŽFm‘ ≥val2B−η−2δ:

So that form /∈M, taking expectation

Ekbλ2mN+1;m+1‘N‘‘ ≥val2B−η−2δ−2ε;

and finally, in this case, by the choice ofN:

Ekbλ2mN+1;∞‘‘ ≥val2B−2η−2δ−2ε: (9) This last inequality means that except for finitely many values of m, the payoff of Player 2, at the beginning of the mth block is still (almost) indi- vidually rational underPk (and not only underP).

We want to deduce properties concerning the frequencies evaluated ac- cording to Player 1’s criteria. Recall that if λ1 ≤λ2, θλ1m;∞‘ is in the convex hull of the familyθλ2n;∞‘,n≥m. Define nowλ1≤λ2 so that the weight of the firstMN stages givenλ1is less thanη. From (9) one obtains, forλ1≤λ1,

Ekbλ1mN+1;∞‘‘ +2η≥val2B−2η−2δ−2ε;

hence in particular, lettingπk=Ekθλ11;∞‘‘, one deduces

ŒB; πk ≥val2B−4η−2δ−2ε; (10) which means that also under Player 1’s evaluation, Player 2’s payoff is al- most individually rational, givenPk.

Note that the equilibrium vector payoff of Player 1 is αk = ŒAk; πk.

The individual rationality condition for Player 1 implies, as in the proof of Proposition 3.5, that Œα; q ≥val1Aq‘, for allq in1K‘, using Propo- sition 3.4. Finally the equilibrium condition obviously gives: Œπk; Ak ≥

Œπk0; Ak, for allk; k0:To finish the proof of Proposition 3.8, we need

(17)

Lemma 3.9. Define, for ζ a real number and p 0: 5ζK‘ = ”π =

”πk•yπk∈1I×J‘; k∈K with:

(i) Œα; q ≥val1Aq‘ −ζ; ∀q∈1K‘

(ii) ŒB; πk ≥val2B−ζ; ∀k

(iii) ŒAk; πk ≥ ŒAk; πk0 −ζ; ∀k; k0∈K•.

Then, for anyε positive there exists a positiveζ such that5ζK‘is included in theε-neighborhood of5K‘.

Proof of the Lemma. The lemma says that the correspondance ζ 7→

5ζK‘ is u.h.c., with 5K‘ = 50K‘ and this is clear since all the con- straints are continuous inζ.

Let nowβ be defined byŒB;P

kpkπk and recall that the map from5 toEis nonexpansive. Note that (i)–(iii) are satisfied withζ =4η+2δ+2ε.

Givenε, choose ζ according to the lemma, then η=δ=ε=ζ/8 to get Proposition 3.8.

Comment. 1. The result holds a fortiori if Player 1’s payoff is undis- counted, and also if Player 1 is facing a sequence of Player 2 that lives for a finite number of stages (see Section 4.1).

2.The previous proof uses in a crucial way the fact that the boundλ1 is a function of λ22 determines the horizonN. This allows to use weak merging through a new normalization of the process, but Player 1 has to be “infinitely” more patient than Player 2.

3. In fact a new and important result of Cripps and Thomas (1995b) shows that for p1 near 1 and λ12 near 0 the range of the component α1 of the equilibrium payoff of Player 1 is like in the complete information case (Folk theorem) with payoffsA1 andB.

The next result shows that for small discount factors the set of equilib- rium payoffs almost containsEp‘.

Proposition 3.10(Cripps and Thomas, 1995b). Assume p 0 and ε positive. Then there exist λ = λε‘, such that any α; β‘ in E−εp‘ =

”α; β‘y ∃π ∈ 5−εK‘;ŒAk; πk = αk; ∀k ∈ K; ŒB;P

kpkπk =β• is an equilibrium payoff of 0λ1; λ2‘p‘for anyλ1≤λ,λ2 ≤λ.

Sketch of Proof. The proof follows Part 1 of the proof of Proposi- tion 3.5. Choose a smooth playwk (cf., Sorin, 1992, p. 82) then discounted factors small enough to generate the payoffs and to respect the individ- ual rational constraints, taking into account the stages needed to code the different types k. (Note that the Definition 3.3 of approachability implies continuity with respect to the discount factor.)

(18)

Comments. Altogether the previous results show that the limit behavior, for discount factors near 0, of the set of equilibrium payoffs of 0λ1; λ2‘p‘

depends crucially on the relative size of the discount factors. There is a path λ2; λ1λ2‘‘ going to 0;0‘ on which continuity to Ep‘ holds. The lim inf of these families of sets always contains the interior of Ep‘ but might be much larger.

4. REPUTATION: BASIC RESULTS 4.1. Presentation

A two-person game defined by two I×J payoff matrices A and B is played repeatedly. We consider first the case of standard signalling where, after each stagen, the chosen actionsin; jn‘are announced to both players.

Let an = Ain; jn; bn = Bin; jn be the corresponding payoffs for Players 1 and 2, respectively. We assume that Player 1 is a long run player, hence plays at every stage. Her opponent might be of several types: either there is a sequence of short lived Players 2 where each one lives for finitely many periods and is then replaced, or a single long Player 2 is present. In all cases, the agents playing at stage n are aware of the whole past history hn−1 of moves until that stage.

Gµ1; µ2‘ is the game with payoffs parameters µ1 andµ211 corre- sponds to the case where Player 1’s payoff is discounted with factorλ1 and µ1 = ∞stands for the undiscounted case (see 3.1 for the evaluation of the payoffs). As for Player 2, both values µ22 or µ2 = ∞are feasible but in addition we consider the case µ2 =n where n is the duration of each Player 2’s life.

We add now a perturbation to this game: there are some strategies (in the repeated game) to which, with positive probability, Player 1 is committed.

Moreover this fact is known to Player 2.

The basic idea of the literature on reputation, see Kreps and Wilson (1982), Milgrom and Roberts (1982), Kreps et al.(1982), is that Player 1 can use this uncertainty, namely, play some specific strategy, say ϕ, in the repeated game to build a reputation effect for being of this type, i.e., being in fact committed toϕ. Once convinced of facing this type Player 2 should adapt and this could benefit Player 1. Clearly this possibility depends on both players’ characteristics µ1 and µ2. (The influence of the signalling structureis considered in the next sections).

We exhibit a lower bound on Player 1’s payoff (normal type) in any equi- librium of a perturbation of the game Gµ1; µ2‘ where with positive proba- bilityρshe is of “typeϕ.” (This does not prevent other perturbations to be present as well). The result is obtained by computing the payoff of Player 1

(19)

mimicking her typeϕ and relies on merging properties and on the behav- ior of Player 2 at equilibrium. Denote this (class of) perturbed game by Gµ1; µ2‘ρ; ϕ‘.

4.2. General Properties

Letσ; τ‘be an equilibrium in the perturbed gameGµ1; µ2‘ρ; ϕ‘.eσ de- notes the strategy of Player 1 defined byσ, the commitment types and the perturbation probability. Formallyσ˜ can be written asσ˜ =θσ+ 1−θ‘σ0 withθ≥ρ, andσ0being of the formσ0= ρθϕ+ 1−ρθ‘ϕ0.P is the proba- bility on H;H‘induced byeσ; τ‘while Qis the one generated by the type ϕ and τ. Obviously one hasP =ρQ+ 1−ρ‘Q0, for some distribu- tion Q0. It follows thatP merges to Q and that the merging properties of Section 2.3 hold. (ρis the size of the grain of truth.)

The general procedure to obtain the lower boundwon player is as fol- lows:

Step (1) uses the fact that Player 2 is playing a best reply toeσ inGµ1; µ2‘ to get at each stage a lower bound under P on her conditional expected payoff for the future. This gives a corresponding property on the conditional probabilities on histories.

Step (2) deduces from the merging results of Section 2 a similar property on the conditional probabilities on histories when Player 1 is of typeϕ(i.e., underQ).

Step (3) computes then the lowest payoff of Player 1 under typeϕ, given this property. Since Player 1 can always choose to mimic any perturbation this gives a bound on her payoff.

Note that the two first steps of this procedure are actually similar to those used in the proofs of Propositions 3.5 and 3.8.

Let us also underline the following: If Player 1 is less patient than Player 2, a large part of Player 2’s payoff is achieved while Player 1’s pay- off is essentially over, hence Player 1 cannot efficiently monitor Player 2.

We thus consider here only the case where Player 1 is more patient than Player 2.

4.3. The Caseµ2=1; Fudenberg and Levine (1989, 1992)

The results in this framework correspond to the first systematic treatment of reputation bounds in the literature, after the study of a collection of specific cases.

Givenxin 1I‘, we assume that the perturbationϕisx, i.e., playing x i.i.d. We first deal with Step 1.

Let hn ∈ Hn. Since Player 2 is playing an equilibrium in a one-stage game,τhn‘is a best reply toeσhn‘, withP probability 1.

(20)

Step 2 indicates that, underQ, Player 2 is almost always playing an almost best reply tox:

Lemma 4.1. Givenη0>0, there exists M such that, except at most onM stagesm, one has:

EQbm‘ ≥max

y xBy−η0:

Proof. Denote by BR2 the best reply correspondence of Player 2 and more generally, forε≥0,BR2εz‘ = ”y∈1J‘y zBy≥zBy0−ε, for ally0•, so thatBR2 =BR20. Givenx∈1I‘, choose δ >0 such that x0−x ≤δ implies BR2x0‘ ⊂BR2x‘. We use Lemma 2.4 with this δ and ε=η0 to define an exceptional set of stages M of cardinality M. For m /∈ M one has, with Q probability at least η0, emP; Q‘w‘ ≤ δ, hence in particular

eσhm‘ −x ≤δ. So that the strategy that Player 2 is facing afterhmdiffers from x by less than δ. Since she is playing a best reply τhm‘ belongs to BR2x‘.

We are now ready for Step 3. Since Player 1 is patient, the finite set of exceptional stages defined in the previous result does not affect her payoff.

Proposition 4.2(Fudenberg and Levine, 1992). For allxin 1I‘,ρ >

0,η >0, there exists λ1 such that for any λ1≤λ1, any equilibrium payoff a of Player 1 in the gameGλ1;ρ; x‘satisfies:

a≥min

y ”xAyy y∈BR2x‘• −η:

Proof. Let gx‘ = miny”xAyy y ∈ BR2x‘• and ϕx‘ = xBy for y ∈ BR2x‘. Let η0 >0 be such that xBy≥ ϕx‘ −η0 implies xAy ≥gx‘ − η/2. Apply then the previous lemma with η0, and let λ1 be such that the weight ofM stages is less than η/2. The bound on Player 1’s payoff then follows.

Define

wA; B‘ = sup

x∈1I‘ inf

y∈1J‘”xAyy xBy≥xBy0;∀y0∈1J‘•:

This corresponds to the best previous lower bound of Propositin 4.2 if we let the perturbation x vary.

Corollary 4.3 (Fudenberg and Levine, 1992). If the perturbation of the gameGλ1;1‘has full support on1I‘(interpreted as i.i.d. strategies), then for anyη >0, there existsλ1 such that for anyλ1 ≤λ1, any equilibrium payoffa of Player 1 satisfies:

a≥wA; B‘ −η:

(21)

It follows easily that any equilibrium payoff of Player 1 in the corresponding perturbation of the game G∞;1‘, where her payoff is undiscounted is also greater thanwA; B‘.

Finally one can show, see Fudenberg and Levine (1992), that this bound is tight.

Note that in the present case with µ2 = 1 and standard signalling, Player 2 can anticipate the mixed move of her opponent and play a stage after stage best reply. The bound we obtained has thus a Stackelberg flavour.

In the general case Player 2 is only playing a best reply to some strategy that coincides with Player 1’s strategy on the set of histories compatible with both players’ strategies. In particular the merging ofP toQdoes not imply the merging of the probability induced byeσ andτ0 to the one induced by

ϕ; τ0‘, for an alternative strategyτ06=τ.

4.4. G∞;∞‘; Cripps and Thomas (1995a)

This case is the extreme opposite of the previous one since both players have undiscounted payoffs and, as expected, the results are quite similar to those of Section 3.3.: equilibrium payoffs of incomplete information games in the undiscounted case.

Step 1 can be written, with the notation of Section 3.2 as:

For allm,P almost surely:

ŒB;LEP”θni; j‘ŽHm•‘ ≥val2B: (11) For Step 2, Theorem 2.2 implies that P merges toQ, in particular one has,Qa.s.:

LEQ”θni; j‘ŽHm•‘ −LEP”θni; j‘ŽHm•‘ −→0 asm−→ ∞: (12) Equations (11) and (12) imply that,Qa.s.:

m−→∞lim ŒB;LEQ”θni; j‘ŽHm•‘ ≥val2B: (13) Hence, taking expectation, the payoff of Player 2 givenQsatisfies:

ŒB;LEQ”θni; j‘•‘ ≥val2B:

For Step 3, just write:

a≥ ŒA;LEQ”θni; j‘•‘:

Assume that the perturbation ϕis equal to some x. Then the asymptotic frequencyLEQ”θni; j‘•‘will be of the formx⊗y. Thus we obtain:

Références

Documents relatifs

A simple kinetic model of the solar corona has been developed at BISA (Pier- rard &amp; Lamy 2003) in order to study the temperatures of the solar ions in coronal regions

We define a partition of the set of integers k in the range [1, m−1] prime to m into two or three subsets, where one subset consists of those integers k which are &lt; m/2,

This nding suggests that, even though some criminals do not commit much crime, they can be key players because they have a crucial position in the network in terms of

However, the current relative uncertainty on the experimental determinations of N A or h is three orders of magnitude larger than the ‘possible’ shift of the Rydberg constant, which

However, the voltage clamp experiments (e.g. Harris 1981, Cristiane del Corso et al 2006, Desplantez et al 2004) have shown that GJCs exhibit a non-linear voltage and time dependence

Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of

That is, we provide a recursive formula for the values of the dual games, from which we deduce the computation of explicit optimal strategies for the players in the repeated game

Assuming the correctness of QED calculations in electronic hydrogen atom, the problem can be solved by shifting by 5 standard deviations the value of the Ryd- berg constant, known