to solve regarding games in our context is the existence of a winning strategy for the player modeling the system. This is now well understood. We know that total information parity games enjoy the memoryless determinacy property [18], ensuring that in each game one of the players has a winning strategy, and that a winning strategy exists if and only if there is a memoryless winning strategy, i.e. a strategy that depends only on the last visited node of the graph and not on the history of the play. However, partial information games do not enjoy this property, since the player may need memory to win the game. On the other hand, regarding tool implementations, the field of two-player games has not reached the maturity attained in the area of model checkers. For total information games, with the notable exceptions of pgsolver [24], which provides a platform for implementing algorithms that solve parity games, and Uppaal-TiGa [30], which solves timed games very efficiently (but is restricted to reachability conditions), few implementations are available. SAT implementations of restricted types of games have also been proposed [17], as well as a reduction of parity games to SAT [21]. As for partial information games, even fewer attempts have been made. To our knowledge, only alpaga [1] solves partial information games, but its explicit input format does not allow it to solve real-life instances.
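Memoryless determinacy can be illustrated on the simpler reachability objective, where the classical attractor construction directly yields a memoryless winning strategy. The following sketch uses a hypothetical four-node game graph of our own; it is not taken from pgsolver or any tool cited above.

```python
# Attractor computation on a toy reachability game (hypothetical graph).
# Player 0 nodes pick one successor; Player 1 nodes are adversarial.

def attractor(nodes, owner, succ, target):
    """Nodes from which Player 0 can force a visit to `target`,
    together with a memoryless strategy witnessing it."""
    win = set(target)
    strategy = {}
    changed = True
    while changed:
        changed = False
        for v in nodes:
            if v in win:
                continue
            succs = succ[v]
            if owner[v] == 0 and any(s in win for s in succs):
                # Player 0 moves into the winning region: record one choice
                strategy[v] = next(s for s in succs if s in win)
                win.add(v)
                changed = True
            elif owner[v] == 1 and all(s in win for s in succs):
                # Player 1 cannot escape the winning region
                win.add(v)
                changed = True
    return win, strategy

nodes = ["a", "b", "c", "d"]
owner = {"a": 0, "b": 1, "c": 0, "d": 1}
succ = {"a": ["b", "c"], "b": ["a", "d"], "c": ["d"], "d": ["d"]}
win, strategy = attractor(nodes, owner, succ, {"d"})
```

The strategy returned depends only on the current node, never on the history of the play, which is exactly the memoryless property discussed above.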

procedure in a game in extensive form, which accounts for the players' negotiation possibilities. As a topic for further research, Myerson (1984b)'s value should be further investigated and challenged.
The previous paragraphs illustrate that, even in the absence of strategic externalities, the axiomatic approach to cooperation, which was so fruitful under complete information, to date yields much less clear-cut conclusions when negotiation takes place between privately informed players. As recalled above, Myerson (1984a) proposes a partial axiomatization of a bargaining solution in the case of two equally powerful players. Focusing on the issue of information revelation at the negotiation stage, de Clippel and Minelli (2004) pursue this analysis. They provide cooperative and noncooperative characterizations of Myerson (1983, 1984a)'s solutions under the additional assumption that types become verifiable at the stage where a mechanism is used to make decisions. de Clippel and Minelli (2004) propose in particular a refinement of Wilson (1978)'s coarse core. An obvious open problem is the extension of these results when types remain unverifiable, even at the decision stage. In any case, this contribution, as other recent ones, e.g., Serrano and Vohra (2007), detailed in section 2, and de Clippel (2005), indicates that the analysis of simple, explicit negotiation procedures is a promising approach given the state of the art.

• The low level is a set of algorithms for solving subproblems of the game. Classical state-space algorithms can be used, but not exclusively.
We believe that for almost all games it is possible to determine primitive game elements that have to reach some goal. In puzzles like the 24-tile puzzle, the agents could be defined as the tiles. In this representation, each tile aims to reach its final destination but cannot move without altering the position of other agents. In the game of Sokoban, all the agents are instances of stones of the maze and thus have the same characteristics. For other games, however, we could define agents that have their own personality. In the game of solitaire, for example, the agents could be the 52 cards; each agent is then unique. Note that for such an imperfect information game, we must consider that only a subset of the agents is visible. The other agents can thus be seen as being in an unknown queue, waiting to enter play.
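The tile-as-agent representation can be sketched as follows, here on the smaller 8-puzzle for brevity; the class and field names are our own illustration, not an implementation from any of the papers above.

```python
# Each tile is an agent with its own goal cell; its "distance to goal" is
# the number of moves it would need if no other agent were in the way.

from dataclasses import dataclass

@dataclass
class TileAgent:
    tile: int     # tile label
    pos: tuple    # current (row, col)
    goal: tuple   # target (row, col)

    def distance_to_goal(self):
        # Manhattan distance between current and goal cell
        return abs(self.pos[0] - self.goal[0]) + abs(self.pos[1] - self.goal[1])

# A slightly scrambled 3x3 board; 0 marks the blank.
board = [[1, 2, 3],
         [4, 0, 5],
         [7, 8, 6]]
agents = []
for r in range(3):
    for c in range(3):
        t = board[r][c]
        if t != 0:
            # goal of tile t in the solved board: row-major order
            agents.append(TileAgent(t, (r, c), divmod(t - 1, 3)))

total_distance = sum(a.distance_to_goal() for a in agents)
```

Summing the per-agent distances recovers the classical Manhattan-distance heuristic, which shows how the agent decomposition connects back to standard state-space search.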


In this paper, we focus our attention on simple stochastic tail games. By “simple” we mean that players have perfect information and take their decisions turn by turn, whereas in stochastic games as introduced initially by Shapley [Sha53] players take their decisions concurrently. A tail winning condition is one where the winner of a play does not depend on any finite prefix of the play; only the long-term behaviour of the play matters. This class encompasses games for verification and mean-payoff games. From a verification perspective, tail conditions correspond to cases where local glitches are tolerated at the beginning of a run, as long as the specification is met in the long run, e.g. in self-stabilising protocols.
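The prefix-independence of a tail condition can be seen concretely on the mean-payoff value: replacing a finite prefix leaves the long-run average unchanged. A toy numerical check, with weights of our own choosing:

```python
# Two plays that differ only in a finite prefix have (numerically) the
# same long-run average payoff.

def average(play, n):
    """Average payoff of the first n steps of a play (a function i -> weight)."""
    return sum(play(i) for i in range(n)) / n

def play_with_prefix(prefix, cycle):
    """The play prefix . cycle^omega, as a function of the step index."""
    def payoff(i):
        if i < len(prefix):
            return prefix[i]
        return cycle[(i - len(prefix)) % len(cycle)]
    return payoff

p1 = play_with_prefix([100, -50, 7], [1, 3])   # glitchy finite prefix
p2 = play_with_prefix([], [1, 3])              # no glitches
gap = abs(average(p1, 100_000) - average(p2, 100_000))
```

Both averages converge to the cycle average 2, so the glitchy prefix is irrelevant in the limit, exactly as the tail condition requires.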

value corresponding to the bit of the same index in the incremented version of the described number (this can be computed on the fly). This marking by Adam is done by him playing a distinguished action; the checking is done deterministically (thanks to a counter). One also uses this binary encoding of the index of the cell in the following way: whenever Adam marks a symbol that he claims will be incorrectly updated in the next configuration, a bit of its binary encoding is guessed (i.e. randomly chosen) and its index is stored, observed by neither of the players. Later, when Adam indicates the supposedly corresponding symbol in the next configuration, the guessed bit is checked and should match: if it does not, the play goes to a final state and Eve wins; otherwise one proceeds as previously explained (i.e. one checks whether the symbol is correct: if not, the play restarts; otherwise the play goes to a final state and Eve wins). Hence, the game's state also stores (hidden from both players) the value and index of the randomly chosen bit.
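The bit-guessing check can be simulated in isolation. The sketch below is our own simplified toy, not the exact gadget: one bit position of the claimed cell index is chosen at random and hidden, and the check passes iff the index later pointed to agrees on that hidden bit.

```python
# An honest claim (same index both times) always passes; a cheating claim
# is caught whenever the hidden bit is one where the two indices differ.

import random

def check_indices(claimed_index, actual_index, n_bits, rng):
    k = rng.randrange(n_bits)            # hidden bit position
    stored = (claimed_index >> k) & 1    # hidden bit value
    return ((actual_index >> k) & 1) == stored

rng = random.Random(0)
# An honest Adam always passes the check:
honest = all(check_indices(i, i, 8, rng) for i in range(256))
# A cheating Adam pointing at a different cell fails with positive
# probability (indices 5 and 7 differ in one of the 8 bits):
caught = sum(not check_indices(5, 7, 8, rng) for _ in range(1000))
```

Because neither player observes the hidden bit, Adam cannot tailor his later answer to it, which is what makes the positive catching probability effective.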

agents i would like to see in her coalitions and which agents she would like not to: For instance, if ≻1 is 2 ≻1 3 ≻1 4, we know that
1 prefers 2 to 3 and 3 to 4, but nothing tells us whether 1 prefers being with 2 (respectively, 3 and 4) to being alone, that is, whether the absolute desirability of 2, 3, and 4 is positive or negative (of course, if it is negative for 3, it is also negative for 4, etc.). So, both ways are insufficiently informative: Specifying only a partition into positive and negative agents (“friends” and “enemies”) does not tell which of her friends i prefers to which other agents, and which of her enemies she wants to avoid most. On the other hand, specifying a ranking over agents does not say which agents i prefers to be with rather than being alone. Here we propose a model that integrates the models of Cases 1, 3, and 4: Each agent i first subdivides the other agents into three groups, her friends, her enemies, and an intermediate type of agents on which she has neither a positive nor a negative opinion, and then specifies a ranking of her friends and enemies. Based on this representation, we consider a natural extension of a player's preference, the generalized Bossong–Schweigert extension (see [8, 14]), which is a partial order over coalitions containing the player. A related model can be found in the context of matching theory: Responsive preferences are studied in bipartite many-to-one matching markets and consider the comparison of one participant to another, although not with a distinction between friends and enemies (see, e.g., [19, 20]). In the following, we consider different ways to deal with incomparabilities within these partial orders. A first approach is to leave incomparabilities open and define notions such as “possible” and “necessary” stability concepts. A second approach is to define comparability functions in order to determine the relation between incomparable coalitions that extend
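A minimal sketch of such a responsive-style comparison, in the spirit of the generalized Bossong–Schweigert extension (a simplification of our own, not the exact definition of [8, 14]): coalition S is weakly preferred to T if T's friends can be matched to weakly better friends in S and S's enemies to weakly worse enemies in T.

```python
# `friends` / `enemies` are agent i's rankings, best friend and worst
# enemy first. Coalitions S, T are sets of other agents.

def weakly_prefers(friends, enemies, S, T):
    rank_f = {a: k for k, a in enumerate(friends)}   # 0 = best friend
    rank_e = {a: k for k, a in enumerate(enemies)}   # 0 = worst enemy
    sf = sorted(rank_f[a] for a in S if a in rank_f)
    tf = sorted(rank_f[a] for a in T if a in rank_f)
    se = sorted(rank_e[a] for a in S if a in rank_e)
    te = sorted(rank_e[a] for a in T if a in rank_e)
    # every friend of T is matched by a weakly better friend of S ...
    friends_ok = len(sf) >= len(tf) and all(s <= t for s, t in zip(sf, tf))
    # ... and every enemy of S is matched by a weakly worse enemy of T
    enemies_ok = len(se) <= len(te) and all(s >= t for s, t in zip(se, te))
    return friends_ok and enemies_ok

# Neither coalition dominates the other here: a genuine incomparability.
incomparable = (not weakly_prefers(["b", "c"], ["e", "d"], {"b", "e"}, {"c"})
                and not weakly_prefers(["b", "c"], ["e", "d"], {"c"}, {"b", "e"}))
```

The last two calls exhibit a pair of coalitions that the partial order leaves incomparable, which is exactly the situation the "possible/necessary" stability notions above are designed to handle.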

vary the number of players from 2 to 6, the number of types from 2 to 8, and the number of actions from 2 to 15. For each combination of parameters, we generated 50 instances and measured the time necessary to obtain a Π-NE (or a negative result). In the following we present results for 3 game classes: Covariant games, Dispersion games, and the Traveler's Dilemma game. All experiments were conducted on a workstation with an Intel Xeon E5540 processor and 64 GB of RAM. We used CPLEX [CPLEX, 2009] as a MILP solver. We also implemented the transformation of the Π-game into a normal form game (T˜G) in Java 8. This method, which is exponential in time and space, cannot be considered a solving method, and this is supported by the experimental results. The implementations of the T˜G and the MILP solver are available online [Ben Amor et al., 2019]. In our evaluation, we bounded the execution time to 10 minutes, as in the experiments of [Sandholm et al., 2005; Porter et al., 2008].

In this paper, we remove those two restrictions by considering concurrent stochastic games with imperfect information. These are finite-state games in which, at each round, the two players choose an action simultaneously and independently. Then a successor state is chosen according to some fixed probability distribution depending on the previous state and on the pair of actions chosen by the players. Imperfect information is modeled as follows: both players have an equivalence relation over states and, instead of observing the exact state, they only see to which equivalence class it belongs. Therefore, if two partial plays are indistinguishable by some player, he should behave the same in both of them. Note that this model naturally captures several models studied in the literature [1, 11, 7, 8]. The winning conditions we consider here are reachability (is a final state eventually visited?), Büchi (is some final state visited infinitely often?) and their dual versions, safety and co-Büchi.
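The observation constraint can be made concrete with a tiny toy model (state and class names are our own): a player sees only the equivalence class of each state, so two plays with the same observation sequence must be played identically.

```python
# Observation map: each state is replaced by its equivalence class.

def observe(play, obs):
    """The observation sequence a player with map `obs` sees along a play."""
    return tuple(obs[s] for s in play)

obs_player1 = {"s0": "A", "s1": "A", "s2": "B"}   # s0 and s1 indistinguishable
play1 = ["s0", "s2"]
play2 = ["s1", "s2"]
indistinguishable = observe(play1, obs_player1) == observe(play2, obs_player1)
# Any valid strategy for player 1 must pick the same action after both plays.
```

This is precisely why strategies in such games are functions of observation sequences rather than of state sequences.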

I. INTRODUCTION
In this work we address the problem of computing a strong Stackelberg equilibrium (SSE) in a stochastic game (SG). Given a set of states, we model a two-player perfect information dynamic in which one player, called the Leader or player A, observes the current state and decides, possibly according to a probability distribution f, among a set of available actions. Then the other player, called the Follower or player B, observes the strategy of player A and plays his best response, denoted g. We represent a two-person stochastic discrete game G by
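For intuition, the Leader-commitment logic can be sketched on a one-shot 2×2 bimatrix stage game (the payoffs are our own toy numbers, not the paper's model): the Leader commits to a mixture, the Follower best-responds, and, as in a strong Stackelberg equilibrium, ties are broken in the Leader's favor.

```python
# Brute-force grid over the Leader's mixtures on a 2x2 bimatrix game.

def follower_best_responses(x, B):
    """Follower actions maximizing expected payoff against leader mixture x."""
    vals = [sum(x[i] * B[i][j] for i in range(len(x))) for j in range(len(B[0]))]
    best = max(vals)
    return [j for j, v in enumerate(vals) if v >= best - 1e-9]

def sse_value(A, B, steps=1000):
    """Leader's value under commitment; ties go to the Leader (the 'strong' part)."""
    best = float("-inf")
    for k in range(steps + 1):
        x = (k / steps, 1 - k / steps)
        js = follower_best_responses(x, B)
        best = max(best, max(sum(x[i] * A[i][j] for i in range(2)) for j in js))
    return best

A = [[2, 4], [1, 3]]   # Leader payoffs
B = [[1, 0], [0, 2]]   # Follower payoffs
v = sse_value(A, B)    # close to 3 + 2/3: commit to weight 2/3 on the top row
```

Committing to a mixture here earns the Leader strictly more than any pure-strategy play, which is the basic reason Stackelberg values differ from simultaneous-move equilibria.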

Digital Object Identifier 10.4230/LIPIcs...
1 Introduction
Two-player zero-sum perfect information games played on finite (directed) graphs are the canonical model for formalizing the reactive synthesis problem [24, 1]. Unfortunately, this mathematical model is often too coarse an abstraction of reality. First, realistic systems are usually made up of several components, each with its own objective. These objectives are not necessarily antagonistic. Hence, the setting of non-zero-sum graph games needs to be investigated; see [9] and additional references therein. Second, in systems made of several components, each component usually has only a partial view of the entire system. Hence it is natural to study games with imperfect information [25, 13]. In this paper, we investigate the notion of admissible strategies for infinite-duration non-zero-sum games played on graphs in which players have imperfect information.

As a subset of general zero-sum imperfect information games, stacked matrix games can be solved by general techniques such as creating a single-matrix game in which individual moves represent pure strategies in the original game. However, because this transformation leads to an exponential blowup, it can only be applied to tiny problems. In their landmark paper, [77] define the sequence form game representation, which avoids the redundancies present in the above game transformation and reduces the game value computation time to polynomial in the game tree size. In the experimental section we present data showing that even for small stacked matrix games, the sequence form approach requires a large amount of memory and therefore cannot solve larger problems. The main reason is that the algorithm does not detect the regular information set structure present in stacked matrix games, and also computes mixed strategies for all information sets, which may not be necessary. To overcome these problems, [56] introduce a lossless abstraction for games with certain regularity constraints and show that Nash equilibria found in the often much smaller game abstractions correspond to ones in the original game. General stacked matrix games do not fall into the game class considered in that paper, but the general idea of preprocessing games to transform them into smaller, equivalent ones may also apply to stacked matrix games.
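The size gap behind this blowup can be sketched with back-of-the-envelope counts (a toy uniform-tree assumption of our own; the sequence count "one per (information set, action) pair plus the empty sequence" is the standard one for the sequence form):

```python
# Rows of the induced single matrix vs. rows of the sequence form LP.

def single_matrix_rows(info_sets, actions):
    # one row per pure strategy: an action fixed at every information set
    return actions ** info_sets

def sequence_form_rows(info_sets, actions):
    # one row per sequence: the empty sequence plus one per (set, action)
    return 1 + info_sets * actions

blowup = single_matrix_rows(20, 2) / sequence_form_rows(20, 2)   # > 25000
```

Even at 20 binary information sets the single-matrix representation is four orders of magnitude larger, which is why the transformation only applies to tiny problems.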


CONCLUSIONS
We considered lock acquisition games with partial, asymmetric information. Agents attempt to control the rates of their Poisson clocks to acquire two locks; the first agent to get both locks receives the reward. There is a deadline before which the locks are to be acquired, only the first agent to contact a lock can acquire it, and the agents are not aware of the acquisition status of the others. It is possible that an agent continues its acquisition attempts while the lock is already acquired by another agent. The agents pay a cost proportional to their rates of acquisition. We proposed a new approach to solve these asymmetric and non-classical information games: “open-loop control till the information update”. With this approach we have dynamic programming equations applicable at state-change update instances, and each stage of the dynamic programming equations is then solved by optimal control theory based tools (HJB equations). We showed that a pair of (available) state-dependent time-threshold policies forms a Nash equilibrium. We also conjectured the results for the games with N agents.
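The race dynamics can be illustrated with a Monte-Carlo toy (simplified so that each agent races for the two locks independently instead of contending per lock; rates and deadline are our own illustration, not the paper's equilibrium policies):

```python
# Time to get both locks = sum of two exponential waits with the agent's rate.

import random

def race(rate1, rate2, deadline, rng):
    """Return 1 or 2 for the agent getting both locks first,
    or 0 if neither makes the deadline."""
    t1 = rng.expovariate(rate1) + rng.expovariate(rate1)
    t2 = rng.expovariate(rate2) + rng.expovariate(rate2)
    if min(t1, t2) > deadline:
        return 0
    return 1 if t1 < t2 else 2

rng = random.Random(1)
# The faster agent wins most races, but pays a higher acquisition cost:
wins1 = sum(race(3.0, 1.0, deadline=2.0, rng=rng) == 1 for _ in range(20000))
```

The trade-off the equilibrium analysis resolves is visible here: raising one's rate raises the winning probability and the proportional cost at the same time.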

1 Introduction
Recently de Alfaro, Henzinger and Majumdar [4] introduced a new variant of µ-calculus: discounted µ-calculus. As has been known since the seminal paper [6] of Emerson and Jutla, µ-calculus is strongly related to parity games, and this relationship is preserved even for stochastic games [5]. In this context it is natural to ask whether there is a class of games that corresponds to the discounted µ-calculus of [4]. A partial answer to this question was given in [8], where an appropriate class of infinite discounted games was introduced. However, in [8], only deterministic systems were considered and the much more challenging problem of stochastic games was left open. In the present paper we return to this problem, but in the context of perfect information stochastic games. The most basic and usually non-trivial question is whether the games we consider admit “simple” optimal strategies for both players. We give a positive answer: for all games presented in this paper, both players have pure stationary optimal strategies. Since our games contain parity games as a very special case, our paper extends the result known for perfect information parity games [2, 10, 3, 14].


price dynamics are nearly constant. This result is completely different from De Meyer’s (2010).
In the N-stage repeated zero-sum game, in each stage player 2 makes a lottery involving two prices such that the conditional expectation of the price equals the value L. Playing in this way entails a negative price, which has no natural interpretation in economics. However, we cannot impose a restriction requiring a positive price, because this would violate the invariance axiom of the natural trading mechanism: the value of the game must remain unchanged if one shifts the liquidation value L by a constant amount. This result might be improved in the event that player 2 is risk averse. Therefore, in section 2.4, we discuss a non-zero-sum one-shot game in which player 2 is risk averse. In this setting, we show that the value of the game is positive under more relaxed conditions on the joint distribution of M and L. We conjecture that in a repeated game, player 2 cannot guarantee the value of the game to be zero by slightly modifying his optimal strategy in each stage. Given the complexity of analysing the repeated game and characterizing the price dynamics in this setting, we leave such work to further research.
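The two-price lottery can be written out with toy numbers of our own: the expectation of the price equals L by construction, and a large enough spread forces one of the two prices to be negative, which is exactly the interpretation problem noted above.

```python
# Prices (high, low) with prob_high*high + (1-prob_high)*low == L.

def two_price_lottery(L, spread, prob_high):
    high = L + (1 - prob_high) * spread
    low = L - prob_high * spread
    return high, low

high, low = two_price_lottery(L=1.0, spread=4.0, prob_high=0.5)
expectation = 0.5 * high + 0.5 * low   # back to L, as required
```

Shifting L by a constant shifts both prices by the same constant, so the construction is compatible with the invariance axiom, while the lower price can indeed go negative.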


This result implies that the outsiders' strategy profile p_{N\S} that best punishes coalition S as a first mover (α-approach) also best punishes S as a second mover (β-approach).
4 Partial characterization of the core
In this section, we first provide an example in which convexity does not hold for a large class of Bertrand oligopoly TU-games with transferable technologies. Then, we show that the convexity property is satisfied if the firms' marginal costs are not too heterogeneous. Finally, even though the core cannot always be fully characterized, we identify a subset of payoff vectors with a symmetric geometric structure that is easy to compute.
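The convexity property in question, v(S ∪ T) + v(S ∩ T) ≥ v(S) + v(T) for all coalitions S and T, can be checked by brute force on small games. A sketch with toy characteristic functions of our own (not Bertrand oligopoly games):

```python
# Brute-force supermodularity check for a TU-game given as a dict
# from frozensets of players to worths.

from itertools import combinations

def coalitions(players):
    return [frozenset(c) for r in range(len(players) + 1)
            for c in combinations(players, r)]

def is_convex(v, players):
    cs = coalitions(players)
    return all(v[S | T] + v[S & T] >= v[S] + v[T] for S in cs for T in cs)

players = (1, 2)
v_convex = {frozenset(): 0, frozenset({1}): 1, frozenset({2}): 1,
            frozenset({1, 2}): 3}
v_not = {frozenset(): 0, frozenset({1}): 2, frozenset({2}): 2,
         frozenset({1, 2}): 3}
```

In the first game the grand coalition generates a surplus over the singletons (3 ≥ 1 + 1), while in the second it does not (3 < 2 + 2), which is the single inequality that fails.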

Abstract: We present the application of the ALG2 algorithm [1] to numerically solve variational mean field games [5].
Key-words: Mean field games, Augmented Lagrangian, Douglas–Rachford
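As a self-contained illustration of the Douglas–Rachford splitting idea underlying such augmented-Lagrangian schemes (a toy feasibility problem of our own for two lines in the plane; this is not the ALG2 algorithm applied to mean field games):

```python
# Douglas-Rachford iteration x <- x + P_B(2 P_A(x) - x) - P_A(x);
# the "shadow" iterate P_A(x) converges to a point of A ∩ B (here, the origin).

def proj_A(p):                         # projection onto the line y = 0
    return (p[0], 0.0)

def proj_B(p):                         # projection onto the line y = x
    m = (p[0] + p[1]) / 2
    return (m, m)

def douglas_rachford(x, iters):
    for _ in range(iters):
        a = proj_A(x)
        r = (2 * a[0] - x[0], 2 * a[1] - x[1])   # reflect through A
        b = proj_B(r)
        x = (x[0] + b[0] - a[0], x[1] + b[1] - a[1])
    return proj_A(x)

point = douglas_rachford((1.0, 1.0), 200)
```

Only the two projection (or, more generally, proximal) operators are needed, which is the feature that makes splitting methods attractive for the large variational problems arising in mean field games.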
We acknowledge the financial support of ANR Isotace (ANR-12-MONU-0013) for this research. ∗ INRIA, BP 105, 78153 Le Chesnay Cedex