
II Partial Observability 127


Academic year: 2021


Contents

1 Introduction 11

1.1 Reactive Systems . . . 11

1.2 Synthesis and Games . . . 12

1.3 Contributions . . . 15

2 Preliminaries 19

2.1 Notations and Conventions . . . 19

2.2 Languages, Automata, and Topology . . . 21

2.3 Quantitative Payoff Functions . . . 24

2.4 Computational Complexity . . . 25

2.5 Recurrent Problems . . . 26

2.5.1 Quantified Boolean formulas . . . 26

2.5.2 Counter machines . . . 26

3 Quantitative Games 29

3.1 Games Played on Graphs . . . 29

3.1.1 Winning condition . . . 32

3.1.2 Values of a game . . . 33

3.1.3 Classical games . . . 35

3.2 Games Played on Automata . . . 40

3.3 Non-Zero-Sum Games . . . 43

I Regret 45

4 Background I: Non-Zero-Sum Solution Concepts 47

4.1 Regret Definition . . . 48

4.2 Examples . . . 49

4.3 Prefix Independization . . . 51

4.4 Contributions . . . 52

5 Minimizing Regret Against an Unrestricted Adversary 55

5.1 Additional Preliminaries for Regret . . . 55

5.2 Lower Bounds . . . 56

5.3 Upper Bounds for Prefix-Independent Functions . . . 58

5.4 Upper Bounds for Discounted Sum . . . 61

5.4.1 Deciding 0-regret . . . 62

5.4.2 Deciding r-regret . . . 65

5.4.3 Simple regret-minimizing behaviors . . . 69

6 Minimizing Regret Against Positional Adversaries 73

6.1 Lower Bounds . . . 74

6.2 Upper Bounds for Prefix-Independent Functions . . . 80

6.3 Upper Bounds for Discounted Sum . . . 86

6.3.1 Deciding 0-regret . . . 87

6.3.2 Deciding r-regret . . . 90

7 Minimizing Regret Against Eloquent Adversaries 99

7.1 Lower Bounds . . . 102

7.2 Upper Bound for 0-Regret . . . 114

7.2.1 Existence of regret-free strategies . . . 114

7.2.2 Regular words suffice for Adam . . . 116

7.3 Upper Bounds for Prefix-Independent Functions . . . 120

7.4 Upper Bounds for Discounted Sum . . . 123

7.4.1 Deciding r-regret: determinizable cases . . . 123

7.4.2 The ε-gap promise problem . . . 124

II Partial Observability 127

8 Background II: Partial-Observation Games are Hard 129

8.1 Observable Determinacy . . . 131

8.2 Contributions . . . 132

9 Partial-Observation Energy Games 135

9.1 The Energy Objective . . . 135

9.2 Undecidability of the Unknown Initial Credit Problem . . . 136

9.3 The Fixed Initial Credit Problem . . . 139

9.3.1 Upper bound . . . 139

9.3.2 Lower bound . . . 143

10 Partial-Observation Mean-Payoff Games 151

10.1 Undecidability of Mean-Payoff Games with Partial Observation . . . 154

10.2 Strategy Transfer from the Unfolding . . . 155

10.2.1 Strategy transfer for Eve . . . 159

10.2.2 Strategy transfer for Adam . . . 160

10.3 Decidable Classes of MPGs with Limited Observation . . . 162

10.3.1 Forcibly terminating games . . . 162

10.3.2 Forcibly first abstract cycle games . . . 169

10.3.3 First abstract cycle games . . . 173

10.4 Decidable Classes of MPGs with Partial Observation . . . 174

11 Partial-Observation Window Mean-Payoff Games 181

11.1 Window Mean-Payoff Objectives . . . 183

11.1.1 Relations among objectives . . . 184

11.1.2 Lower bounds . . . 185

11.2 DirFix games . . . 188

11.2.1 A symbolic algorithm for DirFix games . . . 192

11.3 Fix games . . . 195

11.4 UFix games . . . 197

12 Conclusion and Future Work 201

12.1 Summary . . . 201

12.2 Conclusion . . . 202

12.3 Future Work . . . 202
