Jérôme Renault, M2 Ecomath TSE
This version: August 30, 2011. https://sites.google.com/site/jrenaultsite/enseignement
Contents
Introduction
1 Strategic games
1.1 Model
1.2 Dominant strategy equilibrium
1.3 Nash equilibrium
1.3.1 Definition and first properties
1.3.2 Existence of a Nash equilibrium
1.3.3 Iterated elimination of strictly dominated strategies
1.4 Mixed strategies
1.4.1 Finite games and mixed strategies
1.4.2 Elimination of strategies strictly dominated by a mixed strategy
1.4.3 Generalization of the mixed extension
1.5 Rationalizability
1.6 Zero-sum games
1.6.1 Definition, value and optimal strategies
1.6.2 Links with Nash equilibria and the MinMax theorem
2 Extensive games
2.1 Introduction
2.2 Model
2.3 Associated strategic form
2.4 Games with perfect information
2.5 Behavior strategies
2.6 Sequential Rationality
2.7 Subgame-perfect, Bayesian-perfect and Sequential Equilibria
2.7.1 Subgame-perfect equilibria
2.7.2 Bayesian-perfect equilibria
2.7.3 Sequential equilibria
3 Bayesian games and games with incomplete information
3.1 Modeling incomplete information
3.2 Bayesian games
3.3 Games with incomplete information
4 Correlated Equilibrium
4.1 Examples
4.2 Definitions
4.3 Canonical correlated equilibrium
4.4 Extensive-form correlated equilibrium and communication equilibrium
5 Introduction to repeated games
5.1 Examples
5.2 Model of standard repeated games
5.2.1 Histories and plays
5.2.2 Strategies
5.2.3 Payoffs
5.3 Feasible and individually rational payoffs
5.4 The Folk theorems
5.4.1 The Folk theorems for G∞
5.4.2 The discounted Folk theorems
5.4.3 The finitely repeated Folk theorems
5.5 Extensions: examples
6 Exercises
6.1 Strategic games
6.2 Extensive-form games
6.3 Bayesian games and games with incomplete information
6.4 Correlated Equilibrium
6.5 Repeated Games
Introduction
A strategic interaction is a situation where:
a) there are several persons (in a broad sense: physical persons, legal persons, animals, software, automata...) called players,
b) each of these players has to do something (choice of actions or strategies),
c) the utility (happiness, money transfer,...) that each player will get from the interaction does not only depend on his own choice, but may also depend on the choices of the other players.
Game theory studies strategic interactions, called games. Such situations are extremely frequent, and in social sciences almost all studied phenomena have a strategic aspect. The following are some examples of games.
1. An auction to buy an indivisible good: a bidder prefers to be the unique person to submit a price, and if it is not the case, he will prefer the other proposed prices to be very low.
2. An oligopoly where several firms selling the same good have to choose their own price: a firm usually prefers that the other firms put high prices.
3. An election, e.g. the presidential election in France.
4. Fun games (games in the usual sense), such as chess or poker.
One could add price discrimination, insurance policy and bonus-malus contracts, financial markets etc. Games are indeed so widely present that it is difficult to say something both general and useful in applications. Here are a few main questions:
- How can a strategic interaction be modeled?
- Is it possible to determine what “rational” players should play? What does it mean to play strategically, or to play well?
- When does strategic play lead to good (socially optimal) outcomes? How can we construct mechanisms leading to such games, and avoid playing other games?
These notes¹ are no more than an introduction to non-cooperative game theory; there are no chapters dedicated to cooperative games. The objective is to introduce and study strategic concepts, and mathematical formalism and rigour will largely be used. This text also contains interpretations of the situations or concepts and a few economic examples, but nothing is said about interpretations or applications to biology (evolution theory), or to computer science and algorithmic game theory (cryptography, automata, congestion models...).
References:
• A Course in Game Theory, M.J. Osborne and A. Rubinstein. MIT Press, 1994.
• Game Theory: Analysis of Conflict, R.B. Myerson. Harvard University Press, 1991.
• Game Theory, D. Fudenberg and J. Tirole. MIT Press, 1991.
• Game Theory for Applied Economists, R. Gibbons. Princeton University Press, 1992.
• Stability and Perfection of Nash Equilibria, E. van Damme. Springer, 1991.
¹ Thanks to Stephen Wolff for many language corrections.
Chapter 1
Strategic games
Strategic games are used to study strategic interactions with only one stage (“one-shot games”), and are also fundamental when we do not want to take into account the explicit time structure of the interaction.
1.1 Model
Definition 1.1.1. A strategic game (also strategic-form game, or normal-form game) is described as $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$ where:
a) $N$ is a non-empty set called the set of players;
b) for each player $i$ in $N$, $A^i$ is a non-empty set called the set of actions (or strategies) of player $i$;
c) for each player $i$ in $N$, $g^i$ is a mapping from $\prod_{j \in N} A^j$ to $\mathbb{R}$ called the payoff function of player $i$. □
One can think of a strategic game as a “simultaneous one-shot game”.
More precisely, the interaction is the following. Each player $i$ has to select an action $a^i$ in $A^i$. The choices of the players are simultaneous. At the end of the game, if each player $j$ has chosen $a^j$ in $A^j$, the payoff (or utility) to each player $i$ is given by $g^i(a)$, where $a = (a^j)_{j \in N}$. The goal of each player is to maximize his own payoff. All players know $G$.
Remarks 1.1.2. For the interpretation, it is not strictly necessary that the choices of actions be simultaneous. It is enough that, at the moment when a player chooses his action, he is not aware of the choices possibly already made by the other players. For example, each player may write his selected action in a sealed envelope, and finally all the envelopes are collected and opened by a referee.
It is sometimes also assumed that all players know that all players know $G$, and also that all players know that all players know that all players know $G$, etc. When these assumptions are made ad infinitum, the game $G$ is said to be common knowledge among the players. This happens for example when a referee publicly explains the rules of the game to the players before they play.
Moreover, it is sometimes assumed that all players are clever or “rational”, and that all players know that all players are rational, and that all players know that all players know that all players are rational, etc., up to common knowledge of rationality. However, be careful: it is far from easy to define what rationality means here.
In this text we won’t pay much attention to these considerations on common knowledge and rationality. We will study and compute well-defined mathematical concepts, which are meaningful for these interpretations.
Let’s see a few examples of strategic games.
Example 1.1.3. There are two players: $N = \{1, 2\}$. Player 1 has to select a row, which may be either Top or Bottom: $A^1 = \{T, B\}$. Player 2 has to select a column, Left or Right: $A^2 = \{L, R\}$. The payoffs, i.e. the mappings $g^1$ and $g^2$, are given by the following matrix:
        L        R
T     (1,1)    (3,0)
B     (0,3)    (0,0)
The entries of the matrix represent elements of $A^1 \times A^2$. In each entry there is a pair of real numbers: the first component is the payoff to player 1, the second is the payoff to player 2. For example, in the $(T, R)$ cell we read $(3,0)$: it means that $g^1(T, R) = 3$ and $g^2(T, R) = 0$. In the entry $(B, R)$ one reads $(0,0)$: consequently $g^1(B, R) = 0$ and $g^2(B, R) = 0$, etc. This (bi-)matrix is thus a convenient tool to represent the payoff functions.
A strategic game with 2 players, each of them having a finite number of actions, will almost always be represented by a matrix as in the above example. Player 1 will choose a row, player 2 will choose a column, the first component of the selected entry will be the payoff of player 1 and the second component the payoff of player 2.
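As a small illustration that is not part of the original notes, the bimatrix of Example 1.1.3 can be stored as a dictionary mapping each action profile to its payoff pair. The names `A1`, `A2`, `payoffs` and `g` are assumptions made for this sketch only.

```python
# Example 1.1.3 stored as a dictionary: action profile -> (payoff to P1, payoff to P2).
A1 = ["T", "B"]          # actions (rows) of player 1
A2 = ["L", "R"]          # actions (columns) of player 2
payoffs = {
    ("T", "L"): (1, 1), ("T", "R"): (3, 0),
    ("B", "L"): (0, 3), ("B", "R"): (0, 0),
}

def g(i, a):
    """Payoff g^i(a) of player i (1 or 2) at the action profile a = (a1, a2)."""
    return payoffs[a][i - 1]

# Reading the (T, R) cell of the matrix.
print(g(1, ("T", "R")), g(2, ("T", "R")))
```

A dictionary keyed by profiles mirrors the definition of $g^i$ as a mapping on $A^1 \times A^2$; a pair of arrays would work equally well.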
Example 1.1.4.
        L        R
T     (1,0)    (0,1)
B     (0,0)    (1,1)
What should player 1 play?
Example 1.1.5. Matching Pennies
        L         R
T     (1,−1)   (−1,1)
B     (−1,1)   (1,−1)
Quite often, a strategic interaction contains several stages, and we will see in chapter 2 how these games can still be analysed with strategic games.
Here is a simple example.
Example 1.1.6. There are 2 players, and the strategic interaction is given by the following “tree”:
                      x0 (P1)
                L   /         \   R
              x1 (P2)         x2 (P2)
          l1 /     \ r1   l2 /     \ r2
           x3       x4     x5       x6
        (10,9)   (0,10)   (1,2)   (3,1)
Such a tree is interpreted as follows. The game starts at the highest node $x_0$, which is called the root of the tree. At this node player 1 is playing and has to choose between $L$ and $R$.
Suppose that player 1 chooses $L$ at $x_0$. Then the game goes to $x_1$. Player 2 now has to play; he knows that the game is at $x_1$, so he knows that player 1 has chosen $L$. Then player 2 has to choose between $l_1$ and $r_1$. If he selects $l_1$, the game is over and the payoff is $(10,9)$. As before, the first component is the payoff to player 1 (here 10), and the second the payoff to player 2 (here 9). Finally, if player 2 chooses $r_1$, then the game is over and the payoff is $(0,10)$.
Suppose now that player 1 plays $R$ at $x_0$. Similarly, the game goes to $x_2$, player 2 has to play and knows that the game is at $x_2$. He has the choice between $l_2$ and $r_2$. In each case, the game is over and the payoffs are written under the terminal node which is reached.
We have just described a strategic interaction having several stages by means of a tree; we will call it an extensive-form game. It is however possible to represent this game as a simultaneous one-shot game, i.e. as a strategic game, as follows.
The set of players is $N = \{1, 2\}$. The set of strategies of player 1 is simply $A^1 = \{L, R\}$. The situation is different for player 2. Imagine that before playing the game, player 2 wants to describe his strategy to a friend: he should specify what he will play if player 1 plays $L$, but also what he will play if player 1 plays $R$. Consequently, he should communicate an element of $\{l_1, r_1\}$ and an element of $\{l_2, r_2\}$. We thus set: $A^2 = \{l_1, r_1\} \times \{l_2, r_2\} = \{(l_1, l_2), (l_1, r_2), (r_1, l_2), (r_1, r_2)\}$. We determine the payoff functions by following, for each strategy pair in $A^1 \times A^2$, the play induced on the tree by the strategy pair. We thus get:
       (l1, l2)   (l1, r2)   (r1, l2)   (r1, r2)
L      (10,9)     (10,9)     (0,10)     (0,10)
R      (1,2)      (3,1)      (1,2)      (3,1)
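The construction of the strategic form from the tree can be sketched in a few lines. This is only an illustration under assumed names (`tree`, `play`, and the encoding of the tree as nested dictionaries are not from the notes): a strategy of player 2 is a pair of moves, one for each of his nodes, and the payoff is read by following the induced play.

```python
from itertools import product

# Tree of Example 1.1.6: player 1's move leads to a node of player 2,
# and each move of player 2 at that node leads to a terminal payoff pair.
tree = {
    "L": {"l1": (10, 9), "r1": (0, 10)},   # node x1
    "R": {"l2": (1, 2), "r2": (3, 1)},     # node x2
}

A1 = ["L", "R"]
# A strategy of player 2 is a pair: (move at x1, move at x2).
A2 = list(product(["l1", "r1"], ["l2", "r2"]))

def play(a1, a2):
    """Payoff pair reached when player 1 plays a1 and player 2 follows the plan a2."""
    move_at_x1, move_at_x2 = a2
    reply = move_at_x1 if a1 == "L" else move_at_x2
    return tree[a1][reply]

# One row of the payoff matrix per action of player 1.
for a1 in A1:
    print(a1, [play(a1, a2) for a2 in A2])
```

Each printed row lists the payoffs in the column order $(l_1,l_2), (l_1,r_2), (r_1,l_2), (r_1,r_2)$.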
Fix for a while a strategic game $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$. We will always use the following notations.
Notations 1.1.7.
$$A = \prod_{i \in N} A^i.$$
An element $a = (a^i)_{i \in N}$ in $A$ is called an action (or strategy) profile. Fix now a player $i$ in $N$; this player will sometimes simply be denoted by P$i$. We put:
$$A^{-i} = \prod_{j \in N \setminus \{i\}} A^j.$$
For $a = (a^j)_{j \in N} \in A$, we write $a^{-i} = (a^j)_{j \in N \setminus \{i\}} \in A^{-i}$; $a^{-i}$ corresponds to the action profile played by all players but player $i$. With a small abuse of notation, we will write $a = (a^i, a^{-i})$ when we want to focus on player $i$’s action. In general, the exponent $-i$ represents “all the players except player $i$”.
Finally, we will denote by $g$ the vector payoff function of the game. More precisely, $g$ is the mapping from $A$ to $\mathbb{R}^N$ which associates to every action profile $a$ the vector payoff $g(a) = (g^i(a))_{i \in N}$ in $\mathbb{R}^N$. □
We now turn to the meaning of “playing well” in a normal-form game. In example 1.1.3, “it seems clear” that player 1 should play $T$, because his payoff will be 1 or 3 rather than 0. Similarly, player 2 should play $L$. So in example 1.1.3 the action profile $(T, L)$ emerges as the unique rational outcome of the game.
Let us now consider example 1.1.4. If player 2 is “rational”, he is going to play $R$ to gain 1 rather than 0, independently of the move of player 1. Player 1, being clever, should anticipate this and play $B$ and not $T$. The profile $(B, R)$ appears here to be the unique rational outcome of the game.
It is a priori not clear what should be played in example 1.1.5; we will come back to it later. Let us finally consider example 1.1.6. The following argument is based on the game tree. If the game is at $x_1$, player 2 is going to play $r_1$ to get a payoff of 10 instead of 9. If the game is at $x_2$, player 2 will choose $l_2$ to get a payoff of 2 instead of 1. Player 1 “should” anticipate this, and consequently choose $R$ at $x_0$ to get a payoff of 1 instead of 0. Thus $(R, (r_1, l_2))$ seems to be the rational outcome of the game. We will come back later to this argument, called “backwards induction”, and to its validity.
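The informal argument on the tree of example 1.1.6 can be replayed mechanically. The sketch below is an illustration only: the nested-dictionary encoding and the names `tree`, `plan2` and `choice1` are assumptions, and the code handles just this two-stage tree, not general extensive-form games.

```python
# Tree of Example 1.1.6, encoded as nested dictionaries (an assumed encoding).
tree = {
    "L": {"l1": (10, 9), "r1": (0, 10)},   # node x1, player 2 moves
    "R": {"l2": (1, 2), "r2": (3, 1)},     # node x2, player 2 moves
}

# Step 1: at each of his nodes, player 2 picks the move maximizing his own payoff.
plan2 = {a1: max(moves, key=lambda m: moves[m][1]) for a1, moves in tree.items()}

# Step 2: player 1 anticipates these replies and maximizes his own payoff.
choice1 = max(tree, key=lambda a1: tree[a1][plan2[a1]][0])

print(choice1, (plan2["L"], plan2["R"]), tree[choice1][plan2[choice1]])
```

There are no ties in this example, so the maximizers are unique; with ties, `max` would silently pick one of them, which is one reason the validity of the argument deserves the later discussion.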
It should be clear that the 3 previous paragraphs follow common sense but have no mathematical meaning yet, since we have not yet defined any solution concept for strategic games, nor have we defined the meanings of “rational” or “playing well”.
We now give a few general definitions.
Definition 1.1.8. A strategic game is finite if the set of players and the sets of actions are all finite. □
Definition 1.1.9. Fix $i$ in $N$, and two actions $a^i$ and $b^i$ of player $i$ in $A^i$.
$b^i$ strictly dominates $a^i$ if:
$$\forall a^{-i} \in A^{-i}, \quad g^i(b^i, a^{-i}) > g^i(a^i, a^{-i}).$$
$b^i$ weakly dominates $a^i$ if:
$$\forall a^{-i} \in A^{-i}, \; g^i(b^i, a^{-i}) \ge g^i(a^i, a^{-i}), \quad \text{and} \quad \exists a^{-i} \in A^{-i}, \; g^i(b^i, a^{-i}) > g^i(a^i, a^{-i}).$$
$a^i$ is strictly (resp. weakly) dominated if there exists a strategy $c^i$ in $A^i$ such that $c^i$ strictly (resp. weakly) dominates $a^i$. □
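For finite games, the two dominance relations can be checked by direct enumeration. The following sketch is only illustrative; the function names and the way player 1’s payoffs of example 1.1.3 are passed in are assumptions.

```python
def strictly_dominates(g_i, b, a, others):
    """True if b gives a strictly higher payoff than a against every profile in others."""
    return all(g_i(b, o) > g_i(a, o) for o in others)

def weakly_dominates(g_i, b, a, others):
    """True if b is never worse than a, and strictly better against at least one profile."""
    return (all(g_i(b, o) >= g_i(a, o) for o in others)
            and any(g_i(b, o) > g_i(a, o) for o in others))

# Player 1's payoffs in Example 1.1.3 (rows T, B; columns L, R).
g1 = {("T", "L"): 1, ("T", "R"): 3, ("B", "L"): 0, ("B", "R"): 0}

def payoff1(own, other):
    """Payoff of player 1 when he plays `own` and player 2 plays `other`."""
    return g1[(own, other)]

print(strictly_dominates(payoff1, "T", "B", ["L", "R"]))
```

With these definitions, strict domination implies weak domination, since the set of opposing profiles is non-empty.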
The next example, called the “prisoner’s dilemma”, is very famous.
Example 1.1.10.
        L         R
T     (1,1)    (−1,2)
B     (2,−1)   (0,0)
Strategy $T$ of player 1 is strictly dominated by strategy $B$. Regarding player 2’s actions, $L$ is strictly dominated by $R$.
Numerous stories can illustrate this strategic interaction (prisoners having to confess or deny a crime, nuclear arms race,...); let us just mention here the following version. Initially, player 1 possesses an apple, and player 2 has a banana. Both players should simultaneously decide whether to keep their fruit or to give it to the other player (to give corresponds to action $T$ for player 1 and to action $L$ for player 2). It happens that player 1 prefers the banana and player 2 the apple. More precisely, each player has the following preferences, strictly ranked from the best one to the worst one: to have both fruits comes first, then comes having his favorite fruit only, next is having his initial fruit only, and the worst is getting no fruit at all.
The following notion will play a great role in the sequel.
Definition 1.1.11. Fix $i$ in $N$, $a^i$ in $A^i$, and $a^{-i}$ in $A^{-i}$. We say that $a^i$ is a best reply (or best response) of player $i$ against $a^{-i}$ if:
$$\forall b^i \in A^i, \quad g^i(a^i, a^{-i}) \ge g^i(b^i, a^{-i}).$$
□
In words, $a^i$ is a best reply against $a^{-i}$ if player $i$ should play $a^i$ when the other players play according to $a^{-i}$. This is equivalent to:
$$g^i(a^i, a^{-i}) = \max_{b^i \in A^i} g^i(b^i, a^{-i}).$$
The existence of (at least one) best reply of player $i$ against $a^{-i}$ is equivalent to the existence of the maximum of the set $\{g^i(b^i, a^{-i}), b^i \in A^i\}$. When $A^i$ is finite, this maximum always exists, so player $i$ always has at least one best reply against any action profile in $A^{-i}$. In some cases, the maximum may not exist, and player $i$ does not have any best reply against $a^{-i}$. In general, it is often the case that several best replies exist. For example, in example 1.1.6, the best replies of player 2 against $L$ are $(r_1, l_2)$ and $(r_1, r_2)$.
Note that a strictly dominated strategy can never be a best reply.
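In a finite game, best replies can be computed by taking the maximum and collecting every action that attains it. The sketch below does this for player 2 in the strategic form of example 1.1.6; the names `g2`, `A2` and `best_replies_of_player2` are illustrative assumptions.

```python
# Player 2's payoffs in the strategic form of Example 1.1.6.
g2 = {
    ("L", ("l1", "l2")): 9,  ("L", ("l1", "r2")): 9,
    ("L", ("r1", "l2")): 10, ("L", ("r1", "r2")): 10,
    ("R", ("l1", "l2")): 2,  ("R", ("l1", "r2")): 1,
    ("R", ("r1", "l2")): 2,  ("R", ("r1", "r2")): 1,
}
A2 = [("l1", "l2"), ("l1", "r2"), ("r1", "l2"), ("r1", "r2")]

def best_replies_of_player2(a1):
    """All strategies of player 2 attaining the maximum payoff against a1."""
    best = max(g2[(a1, a2)] for a2 in A2)
    return [a2 for a2 in A2 if g2[(a1, a2)] == best]

print(best_replies_of_player2("L"))
print(best_replies_of_player2("R"))
```

Against $L$, only the move at $x_1$ matters for player 2, which is why two strategies tie.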
The first solution concept we introduce is that of equilibrium in dominant strategies.
1.2 Dominant strategy equilibrium
Definition 1.2.1. Fix $i$ in $N$, and $a^i$ in $A^i$.
$a^i$ is a dominant strategy of player $i$ if, whatever the actions of the other players, player $i$ should play $a^i$, i.e. if:
$$\forall a^{-i} \in A^{-i}, \forall b^i \in A^i, \quad g^i(a^i, a^{-i}) \ge g^i(b^i, a^{-i}).$$
□
Equivalently, a dominant strategy of player $i$ is a strategy of this player which is a best reply against any action profile of the other players.
Definition 1.2.2. An equilibrium in dominant strategies of $G$ is an action profile $a = (a^i)_{i \in N}$ in $A$ such that for each player $i$ in $N$, $a^i$ is a dominant strategy of player $i$. □
In example 1.1.3, $(T, L)$ is an equilibrium in dominant strategies. In the prisoner’s dilemma (example 1.1.10), $(B, R)$ is an equilibrium in dominant strategies. Even if it is socially preferable that the players exchange their fruits, it is believed that rational players won’t. Equilibria in dominant strategies do not exist in examples 1.1.4, 1.1.5 and 1.1.6. Be careful: there may exist several equilibria in dominant strategies, and even several dominant strategy equilibrium payoffs.
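The claim about the prisoner’s dilemma can be verified by enumeration. The following is a sketch for 2-player finite games only, with players indexed 0 and 1 for convenience; `is_dominant`, `profile` and `equilibria` are names introduced here, not notation from the notes.

```python
from itertools import product

# Prisoner's dilemma of Example 1.1.10: profile -> (payoff of P1, payoff of P2).
A = [["T", "B"], ["L", "R"]]
g = {("T", "L"): (1, 1), ("T", "R"): (-1, 2),
     ("B", "L"): (2, -1), ("B", "R"): (0, 0)}

def is_dominant(i, ai):
    """True if ai is a best reply of player i (0 or 1) against every action of the opponent."""
    def profile(own, other):
        # Put player i's action in the correct coordinate of the profile.
        return (own, other) if i == 0 else (other, own)
    return all(g[profile(ai, o)][i] >= g[profile(b, o)][i]
               for o in A[1 - i] for b in A[i])

# A profile is an equilibrium in dominant strategies if both components are dominant.
equilibria = [a for a in product(*A) if all(is_dominant(i, a[i]) for i in range(2))]
print(equilibria)
```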
Second-price auctions are important examples where a dominant strategy equilibrium exists (see exercise 6.1.3). In a general strategic interaction, however, the best reply of a player typically does depend on the actions of the other players, and dominant strategies seldom exist. The following concept of Nash equilibrium is fundamental.
1.3 Nash equilibrium
1.3.1 Definition and first properties
Definition 1.3.1. A Nash equilibrium is an action profile such that each player plays a best reply against the strategies of the other players. Formally, let $a = (a^i)_{i \in N}$ be an action profile in $A$. Then $a$ is a Nash equilibrium of $G$ if and only if:
$$\forall i \in N, \forall b^i \in A^i, \quad g^i(a^i, a^{-i}) \ge g^i(b^i, a^{-i}).$$
Equivalently: for all $i$ in $N$, $a^i$ is a best reply against $a^{-i}$.
When $a$ is a Nash equilibrium of $G$, the vector $(g^i(a))_{i \in N}$ in $\mathbb{R}^N$ is called a Nash equilibrium payoff of $G$. □
If the other players play according to $a^{-i}$, then player $i$ should play $a^i$ rather than any other action $b^i$.
In other words, a Nash equilibrium is an action profile such that there is no unilateral deviation (i.e. deviation by a single player) which is strictly profitable.
The simplest interpretation is that of a contract between the players. Assume that the players agree, in one way or another, to play the action profile $a$. If a player $i$ thinks that the other players are going to respect the contract, i.e. are going to play according to $a^{-i}$, then player $i$ cannot do better than playing $a^i$, i.e. than respecting the contract himself. With this interpretation, a Nash equilibrium is nothing more than a stable contract.
We can also think of social norms or rules in force in our societies. In some countries, people drive on the left. Any driver starting to drive on the right of the road would increase his risk of accident and consequently decrease his utility. Thus, everybody driving on the left corresponds to a Nash equilibrium.
Regarding the existence and the number of Nash equilibria in strategic games, nothing can a priori be excluded. In examples 1.1.3, 1.1.4, 1.1.6 and 1.1.10, there is a unique Nash equilibrium, respectively: $(T, L)$, $(B, R)$, $(R, (r_1, l_2))$, and $(B, R)$. There exists no Nash equilibrium in example 1.1.5 (although we will see later how to extend this game and obtain the existence of a Nash equilibrium in “mixed strategies”). And it is not unusual to have several Nash equilibria, as in the next example.
Example 1.3.2.
        L        R
T     (2,1)    (0,0)
B     (0,0)    (1,2)
This game is called the “battle of the sexes”. One may think of a couple “player 1”, “player 2”, who would like first and foremost to spend the evening together. Each player has to choose between going Dancing (actions $T$ and $L$) or going to watch Boxing (actions $B$ and $R$), and choices are supposed to be simultaneous. Here, both $(T, L)$ and $(B, R)$ are Nash equilibria of the game.
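For finite 2-player games, the definition of a Nash equilibrium can be tested profile by profile by checking every unilateral deviation. The sketch below is an illustration with assumed names (`pure_nash_equilibria`, `battle`, `pennies`); it is applied to the battle of the sexes and to Matching Pennies (example 1.1.5).

```python
from itertools import product

def pure_nash_equilibria(A1, A2, g):
    """Profiles (a1, a2) at which neither player has a strictly profitable unilateral deviation."""
    result = []
    for a1, a2 in product(A1, A2):
        p1_ok = all(g[(a1, a2)][0] >= g[(b1, a2)][0] for b1 in A1)
        p2_ok = all(g[(a1, a2)][1] >= g[(a1, b2)][1] for b2 in A2)
        if p1_ok and p2_ok:
            result.append((a1, a2))
    return result

rows, cols = ["T", "B"], ["L", "R"]
battle = {("T", "L"): (2, 1), ("T", "R"): (0, 0),
          ("B", "L"): (0, 0), ("B", "R"): (1, 2)}
pennies = {("T", "L"): (1, -1), ("T", "R"): (-1, 1),
           ("B", "L"): (-1, 1), ("B", "R"): (1, -1)}

print(pure_nash_equilibria(rows, cols, battle))
print(pure_nash_equilibria(rows, cols, pennies))
```

The second call returns an empty list, in line with the absence of a Nash equilibrium in Matching Pennies.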
The notion of Nash equilibrium is weaker than the one of equilibrium in dominant strategies, as the following proposition shows.
Proposition 1.3.3. An equilibrium in dominant strategies is a Nash equilibrium.
Proof: Let $a = (a^i)_{i \in N}$ be an equilibrium in dominant strategies of $G$. Fix a player $i$ in $N$. Since $a^i$ is a dominant strategy of player $i$, it is a best reply against any action profile of the other players, and in particular it is a best reply against $a^{-i}$. Hence $a$ is a Nash equilibrium of $G$.
∎
The converse of proposition 1.3.3 is clearly false. However, we have the following result.
Proposition 1.3.4. If $a = (a^i)_{i \in N}$ is a Nash equilibrium, then for each $i$ in $N$ the strategy $a^i$ is not strictly dominated.
Proof: Let $a = (a^i)_{i \in N}$ be a Nash equilibrium of $G$, and let $i$ be in $N$. The strategy $a^i$ is a best reply against $a^{-i}$, and consequently cannot be strictly dominated.
∎
The following example shows that a Nash equilibrium may consist of weakly dominated strategies.
Example 1.3.5.
        L        R
T     (1,1)    (0,0)
B     (0,0)    (0,0)
$(B, R)$ (as well as $(T, L)$) is a Nash equilibrium. However, $B$ is weakly dominated by $T$, and $R$ is weakly dominated by $L$.
1.3.2 Existence of a Nash equilibrium
We now look for conditions yielding the existence of a Nash equilibrium. Let us start with a little bit of algebra and analysis.
Definition 1.3.6. Let $X$ be a convex subset of a real vector space. A mapping $f$ from $X$ to the reals is said to be quasi-concave if:
$$\forall x \in X, \forall y \in X, \forall \lambda \in [0,1], \quad f(\lambda x + (1-\lambda) y) \ge \min\{f(x), f(y)\}.$$
Lemma 1.3.7. Let $X$ be a convex subset of a real vector space, and $f$ be a mapping from $X$ to the reals. $f$ is quasi-concave if and only if for any real $\alpha$, the set $\{x \in X, f(x) \ge \alpha\}$ is convex.
Proof:
Assume that $f$ is quasi-concave. Fix $\alpha$ in $\mathbb{R}$, and put $A = \{x \in X, f(x) \ge \alpha\}$. We will show that $A$ is convex. Let $x$ and $y$ be in $A$, and $\lambda$ be in $[0,1]$. Write $z = \lambda x + (1-\lambda) y$. Since $f$ is quasi-concave, we have $f(z) \ge \min\{f(x), f(y)\}$. Since $x$ and $y$ are in $A$, $\min\{f(x), f(y)\} \ge \alpha$. Consequently $f(z) \ge \alpha$, and $z \in A$. So $A$ is convex.
Conversely, assume that for each real $\alpha$, the set $\{x \in X, f(x) \ge \alpha\}$ is convex. Consider $x$, $y$ in $X$, and $\lambda$ in $[0,1]$. Put $\alpha = \min\{f(x), f(y)\} \in \mathbb{R}$, and $A = \{z \in X, f(z) \ge \alpha\}$. We have $f(x) \ge \alpha$ and $f(y) \ge \alpha$, so both $x$ and $y$ are in $A$. By assumption $A$ is convex, so $\lambda x + (1-\lambda) y$ is in $A$. So $f(\lambda x + (1-\lambda) y) \ge \alpha$, and $f$ is quasi-concave.
∎
Recall that $f$ is concave if and only if: $\forall x \in X, \forall y \in X, \forall \lambda \in [0,1]$, $f(\lambda x + (1-\lambda) y) \ge \lambda f(x) + (1-\lambda) f(y)$. Because $\lambda f(x) + (1-\lambda) f(y) \ge \min\{f(x), f(y)\}$ always holds for $\lambda$ in $[0,1]$, a concave function is always quasi-concave. The converse does not hold in general: for example, any non-decreasing mapping from $\mathbb{R}$ to $\mathbb{R}$ is quasi-concave, and so is any non-increasing mapping from $\mathbb{R}$ to $\mathbb{R}$. And the mapping $x \mapsto x^2$, defined over $\mathbb{R}_+$, is both convex and quasi-concave.
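The difference between the two inequalities can be explored numerically for $x \mapsto x^2$ on $\mathbb{R}_+$. The sketch below only tests the inequalities on a finite grid, so a positive answer is a sanity check and not a proof; the function names, the grid and the tolerance `1e-12` are arbitrary assumptions.

```python
import itertools

def is_quasi_concave_on_grid(f, points, lambdas):
    """Check f(l*x + (1-l)*y) >= min(f(x), f(y)) on a finite grid (a sanity check only)."""
    return all(f(l * x + (1 - l) * y) >= min(f(x), f(y)) - 1e-12
               for x, y in itertools.product(points, repeat=2) for l in lambdas)

def is_concave_on_grid(f, points, lambdas):
    """Check f(l*x + (1-l)*y) >= l*f(x) + (1-l)*f(y) on the same grid."""
    return all(f(l * x + (1 - l) * y) >= l * f(x) + (1 - l) * f(y) - 1e-12
               for x, y in itertools.product(points, repeat=2) for l in lambdas)

def square(x):
    return x * x

points = [k / 4 for k in range(0, 21)]      # grid on [0, 5]
lambdas = [k / 10 for k in range(0, 11)]    # grid on [0, 1]

print(is_quasi_concave_on_grid(square, points, lambdas))
print(is_concave_on_grid(square, points, lambdas))
```

The concavity test fails, for instance, at $x = 0$, $y = 5$, $\lambda = 1/2$, where $f(2.5) = 6.25 < 12.5$.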
We shall also use the notion of correspondence.
Definition 1.3.8. Given two sets $X$ and $Y$, a correspondence (or multi-valued mapping) from $X$ to $Y$ is a mapping from $X$ to the set $\mathcal{P}(Y)$ of subsets of $Y$. We write:
$$F : X \rightrightarrows Y, \quad x \longmapsto F(x) \subset Y$$
to denote the correspondence associating to every $x$ in $X$ the subset $F(x)$ of $Y$.
The graph of the correspondence $F$ is then defined as:
$$\text{Graph } F = \{(x, y) \in X \times Y, \; y \in F(x)\}.$$
And an element $x$ in $X$ is said to be a fixed point of the correspondence $F$ if $x \in F(x)$. □
Notice that if $f$ is a mapping from $X$ to $Y$, we can define the correspondence $F$ from $X$ to $Y$ which associates to each $x$ of $X$ the singleton $F(x) = \{f(x)\}$. Then, the graph of the correspondence $F$ coincides with the graph of the mapping $f$.
We will use, without proof, the following result. Recall that a Euclidean space is simply a real vector space of finite dimension. In such a space, all norms are equivalent, and compact subsets coincide with closed and bounded subsets.
Theorem 1.3.9. (Kakutani’s fixed point theorem)
Let $X$ be a convex and compact subset of a Euclidean space, and let $F : X \rightrightarrows X$ be a correspondence with compact graph such that for each $x$ in $X$, the set $F(x)$ is convex, compact and non-empty.
Then $F$ has a fixed point, i.e. there exists $x^*$ in $X$ such that $x^* \in F(x^*)$.
Notice that if the graph of $F$ is compact, then automatically $F(x)$ is compact for each $x$ in $X$. Hence the hypothesis “$F(x)$ compact for each $x$” could simply be removed from the statement of the theorem. The proof goes beyond the scope of these notes; let us just mention a strong link with Brouwer’s fixed point theorem. If $f$ is a mapping from $X$ to $X$, define the correspondence $F$ from $X$ to itself associating to each $x$ in $X$ the singleton $\{f(x)\}$. Notice that a singleton is always non-empty, convex and compact. Hence to apply Kakutani’s theorem it is enough to satisfy the condition that the graph of $F$ (or $f$) is compact. Since $X$ is compact, this graph is closed in $X \times X$ if and only if $f$ is continuous. We then obtain the following famous result.
Theorem 1.3.10. (Brouwer’s fixed point theorem) Let $X$ be a non-empty convex compact subset of a Euclidean space, and $f$ be a continuous mapping from $X$ to $X$. Then $f$ has a fixed point, i.e. one can find $x^*$ in $X$ such that $f(x^*) = x^*$.
Brouwer’s theorem can thus be proved using Kakutani’s theorem. But we have admitted Kakutani’s result, and as a matter of fact it is usual in mathematics to proceed in the reverse order, i.e. to first prove Brouwer’s theorem and then to use it to get Kakutani’s theorem. We now come back to game theory proper.
Fix a strategic game $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$.
Definition 1.3.11.
1) For a player $i$ in $N$ and an element $a^{-i}$ in $A^{-i}$, we denote by $R^i(a^{-i})$ the set of best replies of player $i$ against $a^{-i}$. We have:
$$R^i(a^{-i}) = \{a^i \in A^i, \; g^i(a^i, a^{-i}) = \max_{b^i \in A^i} g^i(b^i, a^{-i})\},$$
with the convention: $R^i(a^{-i}) = \emptyset$ if $\max_{b^i \in A^i} g^i(b^i, a^{-i})$ does not exist.
2) Fix $i$ in $N$. We define
$$R^i : A^{-i} \rightrightarrows A^i, \quad a^{-i} \longmapsto R^i(a^{-i}).$$
$R^i$ is called the best reply correspondence of player $i$.
3) The best reply correspondence of the game $G$ is defined as:
$$R : A \rightrightarrows A, \quad a = (a^i)_{i \in N} \longmapsto R(a) = \prod_{i \in N} R^i(a^{-i}) = \{(b^i)_{i \in N} \in A, \; \forall i \in N, \; b^i \in R^i(a^{-i})\}.$$
□
By definition, a Nash equilibrium of $G$ is an action profile $a = (a^i)_{i \in N}$ such that for each $i$ in $N$, $a^i$ is a best reply against $a^{-i}$. As a consequence we have the following result.
Lemma 1.3.12. A strategy profile is a Nash equilibrium of $G$ if and only if it is a fixed point of the best reply correspondence of $G$.
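Lemma 1.3.12 can be illustrated on example 1.3.5 by building the best reply correspondence explicitly and listing its fixed points. This is a sketch for a 2-player finite game; the function names `R1`, `R2` and `R` mirror the notation above, but the code itself is an assumption, not part of the notes.

```python
from itertools import product

A1, A2 = ["T", "B"], ["L", "R"]
# Example 1.3.5: profile -> (payoff of P1, payoff of P2).
g = {("T", "L"): (1, 1), ("T", "R"): (0, 0),
     ("B", "L"): (0, 0), ("B", "R"): (0, 0)}

def R1(a2):
    """Best replies of player 1 against a2."""
    best = max(g[(b1, a2)][0] for b1 in A1)
    return {b1 for b1 in A1 if g[(b1, a2)][0] == best}

def R2(a1):
    """Best replies of player 2 against a1."""
    best = max(g[(a1, b2)][1] for b2 in A2)
    return {b2 for b2 in A2 if g[(a1, b2)][1] == best}

def R(a):
    """Best reply correspondence of the game: R(a) = R1(a2) x R2(a1)."""
    a1, a2 = a
    return set(product(R1(a2), R2(a1)))

# Fixed points of R are exactly the Nash equilibria.
fixed_points = [a for a in product(A1, A2) if a in R(a)]
print(fixed_points)
```

Note that $R(B, R)$ is the whole of $A$ here, since every action is a best reply when all payoffs in the relevant row or column are equal.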
We can now give sufficient conditions for the existence of a Nash equilib- rium.
Theorem 1.3.13. (Glicksberg’s theorem)
Let $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$ be a strategic game where $N = \{1, ..., n\}$ is a finite set. We assume that:
1) For each $i$ in $N$, $A^i$ is a convex compact subset of a Euclidean space.
2) For each $i$ in $N$, $g^i$ is continuous.
3) For each $i$ in $N$ and each $a^{-i}$ in $A^{-i}$, the mapping $A^i \longrightarrow \mathbb{R}$, $a^i \longmapsto g^i(a^i, a^{-i})$ is quasi-concave.
Then a Nash equilibrium of $G$ exists.
Moreover, the set of Nash equilibria of $G$ is closed in $A$. The set of Nash equilibrium payoffs of $G$ is a compact subset of $\mathbb{R}^n$. □
Proof:
The set of action profiles is by definition $A = \prod_{i \in N} A^i$. Since each action set $A^i$ is a subset of a finite-dimensional real vector space, $A$ also is a subset of a Euclidean space. Moreover, each $A^i$ being non-empty, convex and compact, so is $A$. We will apply Kakutani’s theorem to the best reply correspondence $R$.
Fix $i$ in $N$ and $a^{-i}$ in $A^{-i}$. The set of best replies of player $i$ against $a^{-i}$ is:
$$R^i(a^{-i}) = \{a^i \in A^i, \; g^i(a^i, a^{-i}) = \max_{b^i \in A^i} g^i(b^i, a^{-i})\}.$$
Since $g^i$ is continuous and $A^i$ is compact, $R^i(a^{-i})$ is non-empty and closed in $A^i$, hence compact. Notice that we can also write: $R^i(a^{-i}) = \{a^i \in A^i, \; g^i(a^i, a^{-i}) \ge \max_{b^i \in A^i} g^i(b^i, a^{-i})\}$. By the quasi-concavity assumption 3), and by lemma 1.3.7, $R^i(a^{-i})$ is convex. So $R^i(a^{-i})$ is non-empty, convex and compact.
For each $a$ in $A$, $R(a) = \prod_{i \in N} R^i(a^{-i})$ is thus non-empty, convex and compact. We now show that the graph of $R$ is closed in $A \times A$.
Consider a sequence $(a_t, b_t)_{t \ge 0}$ with values in Graph $R$ which converges to a limit $(a, b)$ in $A \times A$. Let us show that $(a, b) \in$ Graph $R$.
For each $t \ge 0$, write $a_t = (a^i_t)_{i \in N}$ and $b_t = (b^i_t)_{i \in N}$. Since $(a_t, b_t) \in$ Graph $R$, we have by definition:
$$\forall i \in N, \forall c^i \in A^i, \quad g^i(b^i_t, a^{-i}_t) \ge g^i(c^i, a^{-i}_t).$$
The mapping $g^i$ being continuous, we can pass to the limit when $t$ goes to infinity and obtain:
$$\forall i \in N, \forall c^i \in A^i, \quad g^i(b^i, a^{-i}) \ge g^i(c^i, a^{-i}).$$
This means that for each player $i$, $b^i$ is a best reply against $a^{-i}$. So $b \in R(a)$, and finally the graph of $R$ is closed. Since $A \times A$ is compact, the graph of $R$ is compact.
Consequently, we can apply Kakutani’s theorem to $R$, and we get the existence of an element $a^*$ of $A$ such that $a^* \in R(a^*)$. By lemma 1.3.12, $a^*$ is a Nash equilibrium of $G$.
Denote now by $NE$ the set of Nash equilibria of $G$. We have $NE = \{a \in A, \; a \in R(a)\} = \{a \in A, \; (a, a) \in \text{Graph } R\}$. Since the graph of $R$ is closed in $A \times A$, and since the mapping from $A$ to $A \times A$ associating to each $a$ the pair $(a, a)$ is continuous, we obtain that $NE$ is closed in $A$, hence is compact.
The set of Nash equilibrium payoffs of $G$ is $\{g(a), a \in NE\} = g(NE)$. For each player $i$, $g^i$ is continuous, hence so is $g$. The set $NE$ being compact, $g(NE)$ is a continuous image of a compact set, hence also is a compact set. The set of Nash equilibrium payoffs of $G$ is thus compact.
∎
1.3.3 Iterated elimination of strictly dominated strategies
We now show that, when computing Nash equilibria, it is always possible to first remove a strategy which is strictly dominated.
Fix a player $i$ and a strictly dominated strategy $b^i$ of this player. By proposition 1.3.4, $b^i$ is not played in any Nash equilibrium. Consider now the game $G'$ obtained from $G$ by removing the strategy $b^i$. To be precise, $G'$ is the game $(N, (A'^j)_{j \in N}, (g'^j)_{j \in N})$, where: $A'^i = A^i \setminus \{b^i\}$, for each player $j$ different from $i$ we have $A'^j = A^j$, and for each action profile $a'$ in $A' = \prod_{j \in N} A'^j$, we simply have $g'^j(a') = g^j(a')$ for every $j$ in $N$.
Proposition 1.3.14. Let $G = (N, (A^i)_{i \in N}, (g^i)_{i \in N})$ be a strategic-form game, $i$ be a player in $N$ and $b^i \in A^i$ be a strictly dominated strategy of player $i$. Let $G'$ be the game obtained from $G$ when the strategy $b^i$ has been removed.
Then the Nash equilibria of $G$ and $G'$ coincide. □
Proof:
1) Let $a = (a^j)_{j \in N}$ be a Nash equilibrium of $G$. By proposition 1.3.4, $a^i$ is not strictly dominated, so $a^i \ne b^i$. The action profile $a$ can thus be played in $G'$. For each player $j$ in $N$, $a^j$ is a best reply against $a^{-j}$ in $G$, and since all actions available to $j$ in $G'$ are also available in $G$, $a^j$ also is a best reply against $a^{-j}$ in $G'$. So $a$ is a Nash equilibrium of $G'$.
2) Let $a = (a^j)_{j \in N}$ be a Nash equilibrium of $G'$. Then $a$ is playable in $G$. For each player $j$ in $N \setminus \{i\}$, $a^j$ is a best reply against $a^{-j}$ in $G'$, and $A'^j = A^j$, so $a^j$ is a best reply against $a^{-j}$ in $G$. We now show that $a^i$ is a best reply against $a^{-i}$ in $G$. We have: $\forall c^i \in A^i \setminus \{b^i\}$, $g^i(a^i, a^{-i}) \ge g^i(c^i, a^{-i})$. In addition $b^i$ is strictly dominated in $G$, so there exists $c^i$ in $A^i \setminus \{b^i\}$ such that: $g^i(b^i, a^{-i}) < g^i(c^i, a^{-i}) \le g^i(a^i, a^{-i})$. Consequently we have: $\forall c^i \in A^i$, $g^i(a^i, a^{-i}) \ge g^i(c^i, a^{-i})$, and $a^i$ is a best reply against $a^{-i}$ in $G$. So $a$ is a Nash equilibrium of $G$.
∎
Notice that part 2) of the proof applies as soon as $b^i$ is a weakly dominated strategy of player $i$. As a consequence, when removing a weakly dominated strategy, the set of Nash equilibria can only decrease: the new set of Nash equilibria is included in the previous one. Notice also that proposition 1.3.14 only applies when removing one (or any finite number of) strictly dominated strategies. Be careful: one may create new Nash equilibria by removing an infinite number of strictly dominated strategies. (Exercise: find an example.)
Proposition 1.3.14 is sometimes very useful when computing the set of Nash equilibria of a particular game. One can eliminate a finite number of strictly dominated strategies and obtain a game $G'$. Then some strategies may become strictly dominated in $G'$, and one can start again and eliminate a finite number of them. A new game is obtained and again, some strategies may become strictly dominated, etc. We can continue until no strategy is strictly dominated, and if it happens that in the remaining game all strategy profiles yield the same vector payoff, we say that the initial game $G$ is solvable via iterated elimination of strictly dominated strategies.
Remark 1.3.15. The idea of iterated elimination of strictly dominated strategies can be thought of as follows. It is “rational” to think that a “rational” player is not going to play a strictly dominated strategy. If all players are “rational” and think that the other players are “rational”, all players are going to consider the game $G'$. If all players think that all players think that all players are “rational”, we can continue: all players know that $G'$ is considered, so a strictly dominated strategy in $G'$ may be removed, etc. At each step, by proposition 1.3.14, the set of Nash equilibria remains the same.
Example 1.3.16. Let $G$ be the following 2-player game:
$$\begin{array}{c|cc} & l & r \\ \hline T & (3,0) & (2,1) \\ M & (0,0) & (3,1) \\ B & (1,1) & (1,0) \end{array}$$
Strategy $B$ is strictly dominated by $T$ for player 1. The game $G'$, obtained from $G$ by elimination of $B$, is given by:
$$\begin{array}{c|cc} & l & r \\ \hline T & (3,0) & (2,1) \\ M & (0,0) & (3,1) \end{array}$$
Now, $l$ is strictly dominated by $r$ for player 2 in $G'$. One can remove $l$ and obtain the game $G''$ represented by:
$$\begin{array}{c|c} & r \\ \hline T & (2,1) \\ M & (3,1) \end{array}$$
But $T$ is now strictly dominated by $M$, hence we finally obtain:
$$\begin{array}{c|c} & r \\ \hline M & (3,1) \end{array}$$
The unique Nash equilibrium of $G$ is the action profile $(M, r)$.
The game of exercise 6.1.13 has lots of strictly dominated strategies.
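The elimination steps of example 1.3.16 can be replayed mechanically. The following Python sketch is only an illustration: the function name `iterated_elimination` and the dictionary encoding of the game are assumptions made here, and only domination by another pure strategy is tested.

```python
from itertools import product

def iterated_elimination(actions, payoff):
    """Repeatedly remove pure strategies strictly dominated by another pure strategy.

    actions: list of lists, actions[i] = pure strategies of player i
    payoff:  dict mapping each action profile (tuple) to the tuple of payoffs
    """
    actions = [list(a) for a in actions]
    changed = True
    while changed:
        changed = False
        for i in range(len(actions)):
            others = [actions[j] for j in range(len(actions)) if j != i]

            def g(ai, rest):
                # Payoff of player i when he plays ai and the others play rest.
                profile = rest[:i] + (ai,) + rest[i:]
                return payoff[profile][i]

            for b in list(actions[i]):
                dominated = any(
                    all(g(c, rest) > g(b, rest) for rest in product(*others))
                    for c in actions[i] if c != b
                )
                if dominated:
                    actions[i].remove(b)
                    changed = True
    return actions

# Payoffs of example 1.3.16 (rows T, M, B for player 1; columns l, r for player 2).
game = {
    ("T", "l"): (3, 0), ("T", "r"): (2, 1),
    ("M", "l"): (0, 0), ("M", "r"): (3, 1),
    ("B", "l"): (1, 1), ("B", "r"): (1, 0),
}
reduced = iterated_elimination([["T", "M", "B"], ["l", "r"]], game)
```

On this game the sketch removes $B$, then $l$, then $T$, and `reduced` is `[['M'], ['r']]`, in agreement with the profile $(M, r)$ found by hand.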
1.4 Mixed strategies
We come back to example 1.1.5 (Matching Pennies):
$$\begin{array}{c|cc} & L & R \\ \hline T & (1,-1) & (-1,1) \\ B & (-1,1) & (1,-1) \end{array}$$
Formally, no Nash equilibrium exists here. Imagine, however, that you have to write a computer program for this game (as player 1), and that this program will be run on a website where everyone can come and play for real, maybe several times, the payoff being (plus or minus) one euro for each game. How should you write such a program? A program always playing $T$ (or $B$) would soon lose a lot of money. It is clear here that there exists an appropriate way to proceed, which is to write a program that selects, independently at each run, $T$ or $B$ with probability 1/2. Even if the users know that you have written such a program, they cannot profit from this and obtain a positive expected gain: whether they play $L$ or $R$, their expected payoff is 0. (And it is now enough for you to charge a 5-cent fee for every play to quickly make a profit.)
But how would you play in the following game?
$$\begin{array}{c|cc} & L & R \\ \hline T & (2,-2) & (-1,1) \\ B & (-1,1) & (1,-1) \end{array}$$
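One way to explore the question numerically is to compute what a given lottery of player 1 earns on average against each column. The Python sketch below is a rough exploration, not a solution method: the helper name `payoff_vs_columns` and the coarse grid of probabilities are assumptions made here, and only player 1's payoffs are stored since player 2 receives the opposite in both matrices.

```python
from fractions import Fraction

def payoff_vs_columns(matrix, p):
    """Expected payoff of player 1 against each column when he plays the
    first row with probability p and the second row with probability 1 - p.
    matrix[r][c] is the payoff of player 1 in row r, column c."""
    return [p * matrix[0][c] + (1 - p) * matrix[1][c] for c in range(2)]

matching_pennies = [[1, -1], [-1, 1]]
second_game = [[2, -1], [-1, 1]]

half = Fraction(1, 2)
uniform_in_pennies = payoff_vs_columns(matching_pennies, half)  # 0 against L and R
uniform_in_second = payoff_vs_columns(second_game, half)        # 1/2 against L, 0 against R

# Coarse scan over p = 0, 1/10, ..., 1: which p has the best worst-case payoff?
grid = [Fraction(k, 10) for k in range(11)]
best_p = max(grid, key=lambda p: min(payoff_vs_columns(second_game, p)))
```

On this grid the scan selects `best_p = 2/5`, for which player 1 obtains 1/5 against both columns, whereas the uniform lottery only secures 0 in the second game.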
1.4.1 Finite games and mixed strategies
We consider here a finite strategic game $G = (N, (A_i)_{i \in N}, (g_i)_{i \in N})$. To simplify notation, we will assume that $N = \{1, \dots, n\}$, where $n$ is the number of players. We will define an extended game $\tilde G$, obtained from $G$ by allowing the players to select their actions randomly and considering expected payoffs.
Notation 1.4.1. Given a finite set $S$, we denote by $\Delta(S)$ the set of probabilities over $S$ (endowed with the $\sigma$-algebra of all subsets of $S$). A probability on $S$ will be written $x = (x(s))_{s \in S}$, with $x(s) \ge 0$ for each $s$ in $S$ and $\sum_{s \in S} x(s) = 1$. We have:
$$\Delta(S) = \Big\{ x = (x(s))_{s \in S} \in \mathbb{R}^S : \ \forall s \in S,\ x(s) \ge 0, \ \text{and} \ \sum_{s \in S} x(s) = 1 \Big\}.$$
□
Given $x$ in $\Delta(S)$, $x(s)$ is the probability of $s$ under $x$, i.e. the probability that the lottery $x$ selects $s$. $S$ being finite, $\mathbb{R}^S$ is a Euclidean space. For $x$ in $\Delta(S)$, we have $x(s) \in [0,1]$ for each $s$ in $S$. Hence $\Delta(S)$ is bounded. It is also a closed subset of $\mathbb{R}^S$, hence $\Delta(S)$ is a compact set. If $x$ and $y$ are in $\Delta(S)$, and $\lambda$ is in $[0,1]$, one defines the convex combination $\lambda x + (1-\lambda) y = (z(s))_{s \in S} \in \Delta(S)$, with $z(s) = \lambda x(s) + (1-\lambda) y(s)$ for each $s$. Hence $\Delta(S)$ is a convex subset of $\mathbb{R}^S$.
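The membership conditions of $\Delta(S)$ and the convex combination can be illustrated on a two-element set. This is a minimal Python sketch; the dictionary representation and the names `is_probability` and `convex_combination` are assumptions chosen for the illustration.

```python
from fractions import Fraction

def is_probability(x):
    """True if x (dict: element of S -> weight) has nonnegative weights summing to 1."""
    return all(w >= 0 for w in x.values()) and sum(x.values()) == 1

def convex_combination(lam, x, y):
    """Return lam * x + (1 - lam) * y, computed coordinate by coordinate."""
    return {s: lam * x[s] + (1 - lam) * y[s] for s in x}

x = {"T": Fraction(2, 3), "B": Fraction(1, 3)}
y = {"T": Fraction(0), "B": Fraction(1)}
z = convex_combination(Fraction(1, 4), x, y)  # weights 1/6 on T and 5/6 on B
```

Exact fractions are used so that the test "the weights sum to 1" is not affected by floating-point rounding.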
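Definition 1.4.2. Let $G = (N, (A_i)_{i \in N}, (g_i)_{i \in N})$ be a finite strategic game, with $N = \{1, \dots, n\}$. The mixed extension of $G$ is the strategic game $\tilde G = (N, (\Delta(A_i))_{i \in N}, (\tilde g_i)_{i \in N})$, where for each player $i$ in $N$ the payoff function $\tilde g_i$ is defined by: $\forall (x_1, \dots, x_n) \in \prod_{j \in N} \Delta(A_j)$,
$$\tilde g_i(x_1, \dots, x_n) = \mathbb{E}_{x_1 \otimes \dots \otimes x_n}(g_i) = \sum_{(a_1, \dots, a_n) \in A} x_1(a_1) \cdots x_n(a_n)\, g_i(a_1, \dots, a_n).$$
□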
In $\tilde G$, each player can play a lottery, i.e. a probability over actions. If each player $i$ in $N$ plays the lottery $x_i$ on $A_i$, this defines a probability $x_1 \otimes x_2 \otimes \dots \otimes x_n$ over $A$. $x_1 \otimes x_2 \otimes \dots \otimes x_n$ is called the (direct) product of the probabilities $(x_i)_{i \in N}$, meaning that the lotteries used by the players are independent. The players want to maximize their expected payoff (to some extent, this can be justified via the notion of von Neumann-Morgenstern utility functions).
As an example, suppose that in Matching Pennies player 1 chooses the probability $(x_1(T), x_1(B)) = (2/3, 1/3)$ whereas player 2 chooses the probability $(x_2(L), x_2(R)) = (2/5, 3/5)$. This induces the product probability $x_1 \otimes x_2$ over the entries of the matrix: $(T, L)$ has probability $(2/3)(2/5) = 4/15$, $(T, R)$ has probability $(2/3)(3/5) = 6/15$, $(B, L)$ has probability $(1/3)(2/5) = 2/15$, and $(B, R)$ has probability $(1/3)(3/5) = 3/15$. The expected payoffs are
$$\tilde g_1(x_1, x_2) = (4/15)(1) + (6/15)(-1) + (2/15)(-1) + (3/15)(1) = -1/15,$$
and $\tilde g_2(x_1, x_2) = +1/15$.
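The sum defining the payoff in the mixed extension can be evaluated directly from the Matching Pennies matrix. The Python sketch below is an illustration under assumed conventions: the name `expected_payoff`, the 0-based player index, and the dictionary encoding of the game and of the lotteries are choices made here.

```python
from fractions import Fraction
from itertools import product

def expected_payoff(payoff, strategies, i):
    """Expected payoff of player i (0-based index) under independent lotteries.

    payoff:     dict mapping pure profiles (tuples) to payoff tuples
    strategies: list of dicts, strategies[j][a] = probability that player j plays a
    """
    total = Fraction(0)
    for profile in product(*(s.keys() for s in strategies)):
        # Probability of the profile under the product probability.
        prob = Fraction(1)
        for s, a in zip(strategies, profile):
            prob *= s[a]
        total += prob * payoff[profile][i]
    return total

matching_pennies = {
    ("T", "L"): (1, -1), ("T", "R"): (-1, 1),
    ("B", "L"): (-1, 1), ("B", "R"): (1, -1),
}
x1 = {"T": Fraction(2, 3), "B": Fraction(1, 3)}
x2 = {"L": Fraction(2, 5), "R": Fraction(3, 5)}
payoff_1 = expected_payoff(matching_pennies, [x1, x2], 0)
payoff_2 = expected_payoff(matching_pennies, [x1, x2], 1)
```

With these lotteries the sketch gives `payoff_1 = -1/15` and `payoff_2 = 1/15`; the two values sum to zero, as they must in this game.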
Vocabulary: For $i$ in $N$, an element of $\Delta(A_i)$ is called a mixed strategy of player $i$ in $G$.
For $a_i$ in $A_i$, the action $a_i$ is identified with the probability giving weight 1 to action $a_i$. So we identify $a_i$ with the Dirac measure $\delta_{a_i} = (x_i(b_i))_{b_i \in A_i}$ in $\Delta(A_i)$, where $x_i(a_i) = 1$ and $x_i(b_i) = 0$ for each $b_i \ne a_i$ (this allows us to denote by $2/3\, T + 1/3\, B$ the mixed strategy playing $T$ with probability 2/3 and $B$ with probability 1/3). In contrast with other probabilities, $a_i \in A_i$ is called a pure strategy of player $i$. A pure strategy is simply a particular case of a mixed strategy. If $a = (a_i)_{i \in N}$ is a pure strategy profile in $A$, we fortunately have $g_i(a) = \tilde g_i(a)$ for each $i$.
Notice that each mixed strategy $x_i = (x_i(a_i))_{a_i \in A_i}$ can now be written: $x_i = \sum_{a_i \in A_i} x_i(a_i)\, a_i$. Since $x_i(a_i) \ge 0$ for each $a_i$ and $\sum_{a_i \in A_i} x_i(a_i) = 1$, the mixed strategy $x_i$ is a convex combination of pure strategies of player $i$. The set $\Delta(A_i)$ of mixed strategies of player $i$ is the convex hull of the set $A_i$ of pure strategies of player $i$, i.e. $\Delta(A_i)$ is the smallest subset of $\mathbb{R}^{A_i}$ which is convex and contains all elements of $A_i$.
The following elementary formula, obtained by conditioning the payoff of player i with the random variable of his own action in Ai, will be helpful later.
Lemma 1.4.3. Let $i$ be a player in $N$, and let $x = (x_1, \dots, x_n)$ in $\Delta(A_1) \times \dots \times \Delta(A_n)$ be a mixed strategy profile. We have:
$$\tilde g_i(x) = \sum_{a_i \in A_i} x_i(a_i)\, \tilde g_i(a_i, x_{-i}).$$
□
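Proof:
$$\begin{aligned}
\tilde g_i(x) &= \sum_{(a_1, \dots, a_n) \in A} x_1(a_1) \cdots x_n(a_n)\, g_i(a_1, \dots, a_n) \\
&= \sum_{a_i \in A_i} x_i(a_i) \sum_{a_{-i} \in A_{-i}} x_1(a_1) \cdots x_{i-1}(a_{i-1})\, x_{i+1}(a_{i+1}) \cdots x_n(a_n)\, g_i(a_1, \dots, a_n) \\
&= \sum_{a_i \in A_i} x_i(a_i)\, \tilde g_i(a_i, x_{-i}).
\end{aligned}$$
□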
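As a worked instance of this conditioning formula (an illustration only, taking the Matching Pennies payoffs and assuming the lotteries $x_1 = (2/3, 1/3)$ on $(T, B)$ and $x_2 = (2/5, 3/5)$ on $(L, R)$), one first computes the payoff of each pure action of player 1 against $x_2$, then averages with the weights of $x_1$:

$$\begin{aligned}
\tilde g_1(T, x_2) &= (2/5)(1) + (3/5)(-1) = -1/5, \\
\tilde g_1(B, x_2) &= (2/5)(-1) + (3/5)(1) = 1/5, \\
\tilde g_1(x_1, x_2) &= (2/3)(-1/5) + (1/3)(1/5) = -1/15.
\end{aligned}$$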
Notice that if $a = (a_i)_{i \in N}$ is a pure strategy profile in $A$, we have $g_i(a) = \tilde g_i(a)$ for each $i$.
Lemma 1.4.4. “A Nash equilibrium in pure strategies remains a Nash equilibrium in mixed strategies.”
Formally, if $a \in A$ is a Nash equilibrium of $G$, then the element $a$, seen as a mixed strategy profile, is a Nash equilibrium of $\tilde G$. □
Proof: Let $a = (a_1, \dots, a_n)$ be a Nash equilibrium of $G$. Fix $i$ in $N$ and $x_i = (x_i(b_i))_{b_i \in A_i}$ in $\Delta(A_i)$. We have:
$$\tilde g_i(x_i, a_{-i}) = \sum_{b_i \in A_i} x_i(b_i)\, g_i(b_i, a_{-i}) \le \sum_{b_i \in A_i} x_i(b_i)\, g_i(a_i, a_{-i}) = 1 \cdot g_i(a) = \tilde g_i(a).$$
So $a_i$ is a best reply of player $i$ against $a_{-i}$ in $\tilde G$. This being true for each player $i$, $a$ is a Nash equilibrium of $\tilde G$. □
A Nash equilibrium in mixed strategies is often called a mixed Nash equilibrium. The following result is fundamental.
Theorem 1.4.5. (Nash’s theorem) In a finite game, there exists a Nash equilibrium in mixed strategies. Moreover, the set of mixed Nash equilibria, as well as the set of mixed Nash equilibrium payoffs, are compact. □
Proof: We will use theorem 1.3.13.
Let $i$ be a player in $N$. $\Delta(A_i)$ is a nonempty convex compact subset of the Euclidean space $\mathbb{R}^{A_i}$. The mapping $\tilde g_i$ is multilinear, hence continuous. Fix $x_{-i}$ in $\prod_{j \ne i} \Delta(A_j)$, and denote by $h$ the mapping from $\Delta(A_i)$ to $\mathbb{R}$ such that:
$$\forall x_i \in \Delta(A_i), \quad h(x_i) = \tilde g_i(x_1, \dots, x_i, \dots, x_n) = \tilde g_i(x_i, x_{-i}).$$
$h$ associates, to each probability $x_i$ on $A_i$, the payoff of player $i$ if he plays $x_i$ against $x_{-i}$. By lemma 1.4.3, we have $h(x_i) = \sum_{a_i \in A_i} x_i(a_i)\, h(a_i)$, so $h$ is affine, hence concave and a fortiori quasi-concave.
Glicksberg’s theorem consequently applies to $\tilde G$ and the results easily follow. □
The next proposition will be used in practice to compute mixed Nash equilibria.
Proposition 1.4.6. “Characterization of mixed strategy Nash equilibria”
Let $G = (N, (A_i)_{i \in N}, (g_i)_{i \in N})$ be a finite game, and let $x = (x_i)_{i \in N}$ in $\prod_{i \in N} \Delta(A_i)$ be a mixed strategy profile. We have the following equivalence:
$$x \text{ is a mixed Nash equilibrium of } G \iff \Big( \forall i \in N,\ \forall a_i \in A_i \text{ s.t. } x_i(a_i) > 0, \quad \tilde g_i(a_i, x_{-i}) = \max_{b_i \in A_i} \tilde g_i(b_i, x_{-i}) \Big).$$
□
In words, $x$ is a Nash equilibrium of $\tilde G$ if and only if: “each pure strategy played with positive probability is a best reply against the strategies of the other players”.
Proof: Fix a player $i$. Assume that the other players play according to $x_{-i}$; we study the best replies of player $i$ against $x_{-i}$. As in the proof of Nash’s theorem, denote by $h$ the mapping from $\Delta(A_i)$ to $\mathbb{R}$ such that:
$$\forall y_i \in \Delta(A_i), \quad h(y_i) = \tilde g_i(y_i, x_{-i}) = \sum_{a_i \in A_i} y_i(a_i)\, h(a_i).$$
$h$ is affine, and $\Delta(A_i)$ is the convex hull of a finite number of points. Let us first show that $h$ attains its maximum at a point in $A_i$. Put $\alpha = \max_{a_i \in A_i} h(a_i)$. $\alpha$ is the best payoff which can be obtained by player $i$ while using pure strategies against $x_{-i}$. We have: $\forall y_i \in \Delta(A_i)$, $h(y_i) = \sum_{a_i \in A_i} y_i(a_i)\, h(a_i) \le \sum_{a_i \in A_i} y_i(a_i)\, \alpha = \alpha$. Thus against $x_{-i}$, player $i$ cannot obtain a payoff greater than $\alpha$ even if he uses mixed strategies. So $\alpha = \max_{y_i \in \Delta(A_i)} h(y_i)$. Against $x_{-i}$, player $i$ always has a best reply in pure strategies.
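Moreover, let $y_i$ be a mixed strategy of player $i$. We have:
$$\begin{aligned}
y_i \text{ is a best reply against } x_{-i} &\iff \tilde g_i(y_i, x_{-i}) = \alpha, \\
&\iff \sum_{a_i \in A_i} y_i(a_i)\, \tilde g_i(a_i, x_{-i}) = \alpha, \\
&\iff \sum_{a_i \in A_i} y_i(a_i) \big( \tilde g_i(a_i, x_{-i}) - \alpha \big) = 0.
\end{aligned}$$
In addition we have for each $a_i$ in $A_i$: $y_i(a_i) \ge 0$ and $\tilde g_i(a_i, x_{-i}) - \alpha \le 0$, hence the product $y_i(a_i) \big( \tilde g_i(a_i, x_{-i}) - \alpha \big)$ is nonpositive. Since a sum of nonpositive terms is 0 if and only if all terms are 0, we get:
$$\begin{aligned}
y_i \text{ is a best reply against } x_{-i} &\iff \forall a_i \in A_i,\ y_i(a_i) \big( \tilde g_i(a_i, x_{-i}) - \alpha \big) = 0, \\
&\iff \forall a_i \in A_i \text{ s.t. } y_i(a_i) > 0,\ \tilde g_i(a_i, x_{-i}) = \alpha.
\end{aligned}$$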
We finally obtain that $y_i$ is a best reply against $x_{-i}$ if and only if $y_i$ is a convex combination of the pure strategies being best replies against $x_{-i}$.
Since $x$ is a Nash equilibrium if and only if $x_i$ is a best reply against $x_{-i}$ for each $i$, the proposition is proved. □
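The characterization of proposition 1.4.6 translates into a finite test on a candidate profile. The Python sketch below is one possible way to write it, with assumed names (`pure_vs_mixed`, `is_mixed_nash`) and an assumed dictionary encoding in which each lottery lists every pure action of the player, including those of probability 0. It is tried on Matching Pennies and on the game with payoffs $(2,-2)$, $(-1,1)$, $(-1,1)$, $(1,-1)$.

```python
from fractions import Fraction
from itertools import product

def pure_vs_mixed(payoff, strategies, i, ai):
    """Expected payoff of player i when he plays the pure action ai and
    every other player j draws his action from strategies[j]."""
    others = [j for j in range(len(strategies)) if j != i]
    total = Fraction(0)
    for rest in product(*(strategies[j].keys() for j in others)):
        prob = Fraction(1)
        for j, aj in zip(others, rest):
            prob *= strategies[j][aj]
        profile = list(rest)
        profile.insert(i, ai)  # put player i's action back at position i
        total += prob * payoff[tuple(profile)][i]
    return total

def is_mixed_nash(payoff, strategies):
    """Support test: each pure action played with positive probability must
    reach the best pure-strategy payoff against the other players' lotteries."""
    for i, xi in enumerate(strategies):
        values = {ai: pure_vs_mixed(payoff, strategies, i, ai) for ai in xi}
        best = max(values.values())
        if any(xi[ai] > 0 and values[ai] != best for ai in xi):
            return False
    return True

pennies = {("T", "L"): (1, -1), ("T", "R"): (-1, 1),
           ("B", "L"): (-1, 1), ("B", "R"): (1, -1)}
second = {("T", "L"): (2, -2), ("T", "R"): (-1, 1),
          ("B", "L"): (-1, 1), ("B", "R"): (1, -1)}

half = Fraction(1, 2)
uniform_ok = is_mixed_nash(pennies, [{"T": half, "B": half}, {"L": half, "R": half}])
skewed_ok = is_mixed_nash(pennies, [{"T": Fraction(2, 3), "B": Fraction(1, 3)},
                                    {"L": Fraction(2, 5), "R": Fraction(3, 5)}])
second_ok = is_mixed_nash(second, [{"T": Fraction(2, 5), "B": Fraction(3, 5)},
                                   {"L": Fraction(2, 5), "R": Fraction(3, 5)}])
```

Here `uniform_ok` and `second_ok` are true while `skewed_ok` is false: against $(2/5, 3/5)$ in Matching Pennies, $T$ yields $-1/5$ and $B$ yields $1/5$, so a lottery giving positive weight to $T$ is not a best reply. Exact fractions are used because the test relies on equality of payoffs.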
Remark 1.4.7. On the interpretation of mixed Nash equilibria.
The probabilities do not necessarily represent lotteries used by the players before choosing their pure action. A mixed strategy Nash equilibrium $(x_1, \dots, x_n)$ can also represent a situation where:
- each player $i$ has a belief over the pure strategies to be employed by the other players. More precisely, each player $i$ estimates the action to be played by a player $j \ne i$ with the probability $x_j$. Hence here $x_j$ is seen as the belief, or subjective probability, of the other players over the action to be played by player $j$. It is important to notice that all players $i$ different from $j$ share the same belief $x_j$ over the action of player $j$ (this is not always justified in practice).
- each player $i$ plays (randomly or not, in any manner he likes...) a pure strategy $a_i$ in $A_i$ which is optimal given the belief of player $i$, i.e. which is a best reply against $x_{-i}$ (see proposition 1.4.6).
1.4.2 Elimination of strategies strictly dominated by a mixed strategy
We come back to the elimination of strictly dominated strategies. Fix a finite game $G$ as before, and assume that player $i$ has a pure strategy $b_i$ which is strictly dominated in $G$, or more generally in the mixed extension $\tilde G$ of $G$ (see example 1.4.9 later). That is, there exists a mixed strategy $z_i$ in $\Delta(A_i)$ which strictly dominates $b_i$, i.e. which satisfies: $\forall x_{-i} \in \prod_{j \ne i} \Delta(A_j)$, $\tilde g_i(z_i, x_{-i}) > \tilde g_i(b_i, x_{-i})$.
We now show that any mixed strategy $x_i$ in $\Delta(A_i)$ playing $b_i$ with positive probability is also strictly dominated in $\tilde G$. Consider $x_i$ in $\Delta(A_i)$ such