HAL Id: hal-01917798
https://hal.inria.fr/hal-01917798
Submitted on 9 Nov 2018
HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.
A study of general and security Stackelberg game formulations
Carlos Casorrán, Bernard Fortz, Martine Labbé, Fernando Ordóñez
To cite this version:
Carlos Casorrán, Bernard Fortz, Martine Labbé, Fernando Ordóñez. A study of general and security Stackelberg game formulations. European Journal of Operational Research, Elsevier, 2019, 278 (3), pp.855 - 868. �10.1016/j.ejor.2019.05.012�. �hal-01917798�
A study of general and security Stackelberg game formulations
Carlos Casorr´ana,b,c,*, Bernard Fortza,b, Martine Labb´ea,b, and Fernando Ord´o˜nezc
aD´epartement d’Informatique, Universit´e Libre de Bruxelles, Brussels, Belgium
bINOCS, INRIA Lille Nord-Europe, Lille, France
cDepartamento de Ingenier´ıa Industrial, Universidad de Chile, Santiago, Chile
*Corresponding author. E-mail: [email protected]
Abstract
In this paper, we analyze different mathematical formulations for general Stackelberg games (GSGs) and Stackelberg security games (SSGs). We consider GSGs in which a single leader commits to a utility maximizing strategy knowing that one of ppossible followers optimizes its own utility taking this leader strategy into account. SSGs are a type of GSG that arise in security applications where the strategies of the leader consist in protecting subsets of targets and the strategies of the p followers consist in attacking a single target. We compare existing mixed integer linear programming (MILP) formulations for GSGs, sorting them according to the tightness of their linear programming (LP) relaxations. We show that SSG formulations are projections of GSG formulations and exploit this link to derive a new SSG MILP formulation that i) has the tightest LP relaxation known among SSG MILP formulations and ii) its LP relaxation coincides with the convex hull of feasible solutions in the case of a single follower. We present computational experiments empirically comparing the difficulty of solving the formulations in the general and security settings. The new SSG MILP formulation is computationally efficient, in particular as the problem size increases.
Keywords: Integer programming, discrete optimization, game theory, bilevel opti- mization.
1 Introduction
Stackelberg games model situations where players strive to optimize their individual ob- jectives in a single sequential encounter. These models assume a player, referred to as the leader, can commit to a strategy that optimizes its utility function and then players that respond to the leader’s decision, referred to as followers, take this decision into account
when deciding how to optimize their own utility functions. Stackelberg games were intro- duced to model market competition [von Stackelberg, 2011] and have been used in diverse applications since, such as traffic equilibrium [Krichene et al., 2014], network toll setting [Labb´e et al., 1998], and security [Brown et al., 2006, Jain et al., 2010].
In this work we consider normal form Stackelberg games with finite sets of actions for the leader and followers. We refer to these as general Stackeblerg games (GSG). The utility functions of GSGs are described by matrices, where each combination of actions for the leader and follower gives a reward value for each participant. Selecting a single action corresponds to a pure strategy, while a mixed strategy corresponds to a probability distribution over the set of actions for the player. Therefore, for GSGs the utility functions are bilinear functions of the players’ mixed strategies.
Stackelberg games can be expressed as bilevel optimization problems, where the top level represents the leader’s decision problem and includes the followers’ responses as the optimal solution to the second level problem [Colson et al., 2007]. Mixed integer formulations of GSGs have been introduced thanks to the bilinear objective functions and the linearization of the second level optimality conditions with the use of integer variables [Bard, 1998]. The manner in which the bilinear objectives and second level problem optimality conditions are linearized give rise to the different mixed integer linear programming (MILP) formula- tions considered in this work. For instance, using big M constraints to linearize both the leader objective and the second level optimality conditions give rise to the (D2) formulation [Kiekintveld et al., 2009]. The (DOBSS) formulation considers a single big M constraint but introduces new variables representing the product of the leader and follower strategies, [Paruchuri et al., 2008]. Finally, (MIP-p-G) is a formulation without big M constraints [Yin and Tambe, 2012]. Which of these MILP formulations of the bilevel stackelberg game problem is more convenient for computational efficiency is an underlying question of this work. When the leader in a GSG faces a single follower the problem can be solved in polynomial time, see [Conitzer and Sandholm, 2006]. The same reference shows that if there are multiple followers then the problem is NP-hard. A solution for the multiple fol- lowers problem can be obtained by using the algorithm for the single follower instance on a Harsanyi transformation of the problem, [Harsanyi and Selten, 1972], which combines the multiple adversaries into a single adversary with exponentially many actions. Solution methods based on mixed integer formulations of the multiple follower problem have been presented, for example, by [Jain et al., 2011] and [Yang et al., 2013].
Recent work has applied Stackelberg games in security settings where a leader has a limited budget to protect a set of targets while a follower aims to attack a single target. In
this domain, the payoff matrices are structured with only two payoff values for every partic- ipant depending on whether or not the defender strategy protects the target attacked. We refer to problems that have this structure as Stackelberg security games (SSGs), which are introduced in detail in Section 2. Some SSG applications have included assigning Federal Air Marshals to transatlantic flights [Jain et al., 2010], determining randomized port and waterways patrols for the U.S. Coast Guard [Shieh et al., 2012], preventing fare evasion in public transport systems [Yin et al., 2012], and protecting endangered wildlife [Yang et al., 2014]. The SSG models considered are closely related to the Interdiction games lit- erature, [McMasters and Mustin, 1970], specially when there is a fortification step. Such fortification-interdiction problems are multi-level optimization problems where a defender decides a limited fortification of a network, so that an interdictor (attacker) blocks a num- ber of edges in the network and an operator tries to maximize flow or minimize a path over the network. If the optimal operation response can be subsumed in the interdictor’s decision problem, then the problem has the structure of a Stackelberg security game. There are many variants and extensions of such fortification-interdiction games that allow multi- ple/sequential interdictions and problem specific formulations and algorithms, see reviews in [Smith and Lim, 2008, Snyder et al., 2016, Fischetti et al., 2018]. However, to the best of our knowledge there is no polyhedral study of different mixed integer optimization for- mulations that arise due to the bilevel nature of the interaction between the defender and the attacker.
In this paper we focus on the polyhedral analysis of different mixed integer formulations for GSGs and SSGs. In particular we provide the following four key contributions. First, we provide an exhaustive comparative study of existing MILP formulations for Stackelberg games. Starting from the natural bilevel representation of Stackelberg games, we use well- known integer programming techniques such as Fourier-Motzkin elimination [Dantzig and Eaves, 1973] and Reformulation Linearization Technique [Sherali and Adams, 1994] to derive known MILP formulations. Our study leads to a ranking of these MILP formulations in terms of the strength of their linear programming (LP) relaxations. Second, we explicit a formal link through projections of variables between the polyhedra of the LP relaxation of the GSGs formulations and those of SSGs. This allows to extend our study of GSG formulations to the security setting, leading to a comparison of SSG MILP formulations.
Third, we derive (SDOBSSq,y,s) and (MIP-p-Sq,y), two new SSG MILP formulations. We show that (MIP-p-Sq,y) is the MILP formulation with the tightest linear relaxation among SSG formulations. We further show that if we restrict (MIP-p-Sq,y) to a single attacker type, its LP relaxation provides a complete linear description of the convex hull of its feasible
solutions. Fourth, we provide computational experiments that compare solution times of the MILP formulations in both settings. Our experiments show that the formulations with the tightest LP relaxations have faster solution times as the problem size increases. In particular (MIP-p-Sq,y) scales better than competing formulations, being able to tackle larger-sized instances.
The remainder of this paper is organized as follows. In Section 2, we define general and security Stackelberg games. In Section 3, we derive GSG formulations from the lit- erature. We provide theoretical results comparing the formulations presented. In Section 4, we describe and analyze computational experiments for the formulations in Section 3.
In Section 5, we present SSG formulations using projections, in the appropriate space of variables, of the formulations in Section 3, and derive (SDOBSSq,y,s) and (MIP-p-Sq,y), new MILP formulations for SSGs. We then extend our theoretical comparisons of the general formulations to the security formulations. In Section 6, we describe and analyze the compu- tational experiments for the security formulations. We conclude with some closing remarks in Section 7.
2 Notation and definition of the problem
In this section, we provide a formal definition of the two types of problems we study.
2.1 General Stackelberg games–GSGs
Let K be the set of p followers. We denote by I the set of leader pure strategies and by J the set of follower pure strategies. The leader has a known probability of facing follower k∈K, denoted by πk∈[0,1]. We denote the n-dimensional simplex by Sn={a∈[0,1]n: Pn
h=1ah = 1}. A mixed strategy for the leader consists in a vector x∈ S|I| such that for i ∈ I, xi is the probability with which the leader plays pure strategy i. Analogously, a mixed strategy for a follower k ∈ K is a vector qk ∈ S|J| such that, qjk is the probability with which followerkreplies with pure strategyj∈J. The rewards or payoffs for the leader and each follower, resulting from their choice of strategy, are encoded in a different matrix for each follower. These payoff matrices are denoted by (Rk, Ck), where Rk ∈ R|I|×|J| is the leader’s reward matrix when facing follower k ∈ K and Ck ∈ R|I|×|J| is the reward matrix for follower k. The expected reward of the leader and follower k, respectively, can be expressed as follows:
X
i∈I
X
j∈J
X
k∈K
πkRkijxiqkj, (1) X
i∈I
X
j∈J
Cijkxiqkj, ∀k∈K. (2)
For all k ∈ K, we define the function Bk : S|I| −→ S|J| as the function that, given the leader’s mixed strategy x, returns a best response qk for each follower k. The solution concept used in these games is the Strong Stackelberg Equilibrium (SSE), introduced in [Leitman, 1978] and defined below.
Definition 1. A profile of mixed strategies (x,{Bk(x)}k∈K) form an SSE if:
1. The leader always plays a payoff-maximizing strategy:
xTRkBk(x)≥x0TRkBk(x0) ∀x0 ∈S|I|,∀k∈K.
2. Each follower always plays a best-response,Bk(x)∈Fk(x), where ∀k∈K, Fk(x) = arg max
qk
{xTCkqk:qk ∈S|J|} is the set of best responses for each follower.
3. Each follower breaks ties optimally in favor of the leader:
xTRkBk(x)≥xTRkqk ∀qk∈Fk(x).
An SSE assumes that the follower breaks ties in favor of the leader by choosing, when indifferent between different follower strategies, the strategy that maximizes the payoff of the leader. An SSE is in practice always achievable as the leader can always induce one by selecting a sub-optimal mixed strategy arbitrarily close to the equilibrium, causing the follower to prefer the desired strategy [von Stackelberg, 2011].
Proposition 1. For any leader strategy x and any k∈ K, there is a best response to the k-th follower’s problem that is given by a vector qk ∈ {0,1}|J| such that P
j∈Jqjk= 1.
Proof. Assume that Bk(x) = ¯qk 6∈ {0,1}|J|. We show that any canonical vector ejk such that ¯qjk>0, is also a best response vector,i.e.,ejk ∈Fk(x) and xTRkejk ≥xTRkqk for all qk ∈Fk(x). Since ¯qk =P
j∈Jq¯jkejk, with ejk ∈S|J|, andxTCkejk ≤xTCkq¯k for all j∈J, we have that xTCkq¯k = P
j∈Jq¯jk(xTCkejk) ≤ P
j∈Jq¯kj(xTCkq¯k) = xTCkq¯k. This implies that for any ¯qkj > 0 we have xTCkejk =xTCkq¯k, giving ejk ∈Fk(x). A similar argument shows that for any j such that ¯qjk > 0 we have xTRkejk = xTRkq¯k; Hence, ejk is a best
response vector.
This result shows that we can restrict the follower’s best response only to pure strategies without influencing the SSE solution concept, as done in [Paruchuri et al., 2008].
In mathematical optimization, Stackelberg games are formulated as bilevel programming (BP) problems [Bracken and McGill, 1973]. In BP the optimization problems have two
levels where the top level problem considers some variables that are the optimal solution to another, second level, optimization problem. Important BP surveys are those by [Kolstad, 1985, Savard, 1989, Anandalingam and Friesz, 1992, Labb´e and Violin, 2016]. In our setting, the first level problem corresponds to the leader’s decision problem and the nested problem corresponds to the follower’s decision problem. The following model, (BIL-p-Gx,q), is a bilevel program for the general Stackelberg game problem:
(BIL-p-Gx,q) Maxx,q
X
i∈I
X
j∈J
X
k∈K
πkRkijxiqkj (3)
s.t. x∈S|I| (4)
qk∈arg maxrk
X
i∈I
X
j∈J
Cijkxirkj
∀k∈K, (5) rjk∈ {0,1} ∀j∈J,∀k∈K, (6)
X
j∈J
rjk= 1 ∀k∈K. (7)
The objective function maximizes the leader’s expected reward. Condition (4) charac- terizes the mixed strategies considered by the leader. The second level problem defined by (5)-(7) indicates that the follower maximizes its own payoff by giving a best response with a pure strategy to the leader’s mixed strategy. Recall that such a pure strategy always exists as shown in Proposition 1. If there are multiple optimal strategies for the follower, the main level problem selects the one that benefits the objective of the leader.
2.2 Stackelberg security games–SSGs
In a Stackelberg security game (SSG) the defender allocates security resources to protect a subset of targets. Let J be the set of n targets that could be attacked and assume there are security resources to protect up to m < n of these targets. The setI of defender pure strategies is composed by all Pm
i=1 n
i
subsets of at mostm targets ofJ that the defender can protect simultaneously. With a slight abuse of notation, we refer toi∈I in this context as both the index running through the set of defender pure strategies I and as i⊂ J the corresponding subset of J with at mostmtargets that are protected by security resources.
Similar to GSGs, the elements j∈J constitute the pure strategies of each attacker, which for SSG represents the single target attacked by the follower. In SSGs, payoffs for the players only depend on whether the target attacked is protected or not. This means that many of the strategies have identical payoffs. The authors in [Kiekintveld et al., 2009] use this fact to construct a compact representation of the payoffs.
We denote by Dk the utility of the defender when facing an attackerk∈K and by Ak
the utility of attacker k. Associated with each target and each player there are two payoffs depending on whether or not the target is protected, see Table 1. [Kiekintveld et al., 2009]
Protected Unprotected Defender Dk(j|p) Dk(j|u) Attacker Ak(j|p) Ak(j|u)
Table 1: Payoff structure in an SSG when targetj is attacked by an attackerk take advantage of the aforementioned compact representation to define a protection vector c whose components, cj, represent the frequency with which target j is protected. The components of the vector csatisfy
cj = X
i∈I:j∈i
xi ∀j ∈J, (8)
i.e., the frequency with which targetjis protected is expressed as the sum of all probabilities of the strategies that protect that target. Variablesqjkindicate whether an attackerkstrikes a target j.
The defender’s and attacker k’s expected rewards, are, respectively:
X
j∈J
X
k∈K
πkqkj{cjDk(j|p) + (1−cj)Dk(j|u)}, (9) X
j∈J
qkj{cjAk(j|p) + (1−cj)Ak(j|u)}, ∀k∈K. (10) As with GSGs, such a game can be modeled by means of bilevel programming.
(BIL-p-Sx,c,q)
Max X
j∈J
X
k∈K
πkqjk{cjDk(j|p) + (1−cj)Dk(j|u)}
s.t. (4),(8), qk ∈arg maxrk
X
j∈J
rkj(cjAk(j|p) + (1−cj)Ak(j|u)
∀k∈K,
rkj ∈ {0,1} ∀j∈J,∀k∈K,
X
j∈J
rkj = 1 ∀k∈K.
The objective function maximizes the defender’s expected reward. Constraints (4) and (8) characterize the exponentially many mixed strategies considered by the defender and relate them to the frequencies with which targets are protected. The remaining constraints constitute the second level optimization problem which ensures that the attacker maximizes
its profit by attacking a single target that is the best response to the defender’s selected strategy. Notice that a more compact formulation–one involving a polynomial number of variables and constraints–can be obtained if projecting out the exponentially many x variables does not lead to exponentially many constraints. This would give a polynomial size formulation involving only the c and the q variables. Given an optimal solution to this compact formulation–an optimal protection vectorc and an optimal attack vectorq–a probability vectorx, solution to this game in extensive form, can be obtained by solving the system of linear inequalities defined by conditions (4) and (8). As this system involvesn+ 1 equalities, there exists a solution in which the number of variables xi with a positive value is not larger thann+ 1,i.e., the output size of an SSG, under extensive form, is polynomial in the input size. See Section 5 for more details.
3 General Stackelberg games–GSGs
In Section 3.1, we present equivalent MILP formulations for thepfollower GSG. In Section 3.2 we compare the polyhedra of the LP relaxations for the different formulations.
3.1 General Stackelberg games: single level formulations
[Paruchuri et al., 2008] tackle the problem of solving the bilevel formulation presented earlier, (BIL-p-Gx,q) by using a MILP reformulation. They replace the second level nested optimization problem, described by (5)-(7), by the following set of constraints:
X
j∈J
qjk= 1 ∀k∈K, (11)
qjk∈ {0,1} ∀j ∈J,∀k∈K, (12)
0≤(sk−X
i∈I
Cijkxi)≤(1−qjk)·M ∀j ∈J,∀k∈K, (13) where sk ∈ R for all k ∈ K and M is an arbitrarily large positive constant. The two inequalities in constraints (13) ensure that qkj = 1 only for a pure strategy that maxi- mizes the follower’s payoff. The problem defined by (3)-(4) and (11)-(13) is referred to as (QUADx,q,s). It is possible to eliminate the nonlinearity in the objective function of (BIL- p-Gx,q) by adding additional variables that represent the product between x and q. To be more precise, use zijk =xiqjk for alli∈I, j ∈J and k ∈K. This gives rise to formulation
(DOBSSq,z,s) introduced in [Paruchuri et al., 2008]:
(DOBSSq,z,s) Max X
i∈I
X
j∈J
X
k∈K
πkRkijzijk s.t. (11),(12),
X
j∈J
zijk =X
j∈J
z1ij ∀i∈I,∀k∈K, (14)
X
i∈I
zijk =qkj ∀j ∈J,∀k∈K, (15) zijk ≥0 ∀i∈I,∀j ∈J,∀k∈K, (16) 0≤sk−X
i∈I
X
j0∈J
Cijkzijk0
≤(1−qkj)·M ∀j ∈J,∀k∈K, (17) s∈R|K|.
Alternatively the quadratic term in the objective of (BIL-p-Gx,q) can be addressed by adding
|K|new variables and introducing a second family of constraints involving a big M constant.
This gives rise to formulation (D2x,q,s,f) below (a DOBSS variant with 2 big M constraints that has appeared in [Kiekintveld et al., 2009]):
(D2x,q,s,f) Max X
k∈K
πkfk (18)
s.t. (4),(11)−(13), fk≤X
i∈I
Rijkxi+ (1−qkj)·M ∀j∈J,∀k∈K, (19)
s, f ∈R|K| ∀k∈K.
Additionally, we project the real variables sk in constraints (13) and (17) out by using Fourier-Motzkin elimination [Dantzig and Eaves, 1973]. This gives rise to constraints:
X
i∈I
(Cijk −Ci`k)xi≤(1−q`k)·M ∀j, `∈J,∀k∈K, (20) X
i∈I
X
j0∈J
(Cijk −Ci`k)zijk0 ≤(1−q`k)·M ∀j, `∈J,∀k∈K. (21) Replacing (13) by (20) in (D2x,q,s,f) and (17) by (21) in (DOBSSq,z,s) yields (D2x,q,f) and (DOBSSq,z). We analyze the behavior of these last two new formulations compared to that of (D2x,q,s,f) and (DOBSSq,z,s) to see if removing variablessat the expense of adding constraints is worthwile.
Another equivalent MILP formulation for thep-follower GSG can be obtained by replacing constraints (17) with the following set of constraints:
X
i∈I
(Cijk −Ci`k)zkij ≥0 ∀j, `∈J,∀k∈K. (22)
These constraints are derived by multiplying constraints (20) by q`k, reorganizing and re- placing the nonlinear termsxiqjk by zijk. This leads to (MIP-p-Gq,z):
(MIP-p-Gq,z) Max X
i∈I
X
j∈J
X
k∈K
πkRkijzijk
s.t. (11),(12),(14)−(16),(22).
The linear relaxation of (MIP-p-Gq,z) appears in [Yin and Tambe, 2012]. The MILP for- mulation is a p-follower extension to the single follower formulation (MIP-1-Gq,z), due to [Conitzer and Korzhyk, 2011]. Formal proofs that the formulations seen thus far are equiva- lent MILP formulations,i.e., that they are valid for thep-follower GSG, appear in [Paruchuri et al., 2008] for (DOBSSq,z,s) and [Paruchuri et al., 2008] and [Kiekintveld et al., 2009]
for (D2x,q,s,f). These proofs show that each of them is equivalent to (QUADx,q,s). The equivalence of (DOBSSz,q) and (D2x,q,f) is obtained from the Fourier-Motzkin elimination procedure [Dantzig and Eaves, 1973]. The equivalence proof for (MIP-p-Gq,z) is analogous to the proof used to show the equivalence for (DOBSSq,z,s) and is omitted here.
[Paruchuri et al., 2008] state that the big M constants used are arbitrarily large. To be as computationally competitive as possible, we provide the tightest value for each big M constant in the formulations discussed thus far.
Proposition 2. The tightest values for the positive constants M are:
1. In (19), M = maxi∈I{max`∈JRki`−Rkij} ∀j∈J,∀k∈K.
2. In (13) and (17),M = maxi∈I{max`∈JCi`k −Cijk} ∀j ∈J,∀k∈K.
3. In (20) and (21),M = maxi∈I{Cijk −Ci`k},∀j, `∈J,∀k∈K.
3.2 Comparison of the formulations
Given a formulation F, we denote by F its linear (continuous) relaxation and by P(F) the polyhedral feasible region of F. Further, let Q = {(x, z) ∈ Rn×Rm : Ax+Bz ≤ d}.
Then the projection of Q into the x-space, denoted P rojxQ, is the polyhedron given by P rojxQ={x∈Rn:∃z∈Rm for which (x, z)∈Q}, see [Pochet and Wolsey, 2006].
First, we introduce an additional formulation which we denote by (DOBSSx,q,z,s,f). This formulation is equivalent to (DOBBSq,z,s), in the sense that the values of their LP relaxations coincide. In this formulation, we introduce variablesfkfor allk∈Kto rewrite the objective function so that it matches the objective function of (D2x,q,s,f). We also add variables xi for all i∈I by rewriting (14) asP
j∈Jzijk =xi for alli∈I and all k∈K. Using this last
condition, we can simplify (17) to (13). The formulation (DOBSSx,q,z,s,f) is as follows.
(DOBSSx,q,z,s,f) Max X
k∈K
πkfk
s.t. (11)−(13),(15),(16), fk =X
i∈I
X
j∈J
Rkijzkij ∀k∈K, (23) X
j∈J
zijk =xi ∀i∈I,∀k∈K, (24) s∈R|K|.
Further, note that from the Fourier Motzkin elimination procedure we have that P(D2x,q,f) =P rojx,q,fP(D2x,q,s,f) and,
P(DOBSSq,z) =P rojq,zP(DOBSSq,z,s).
Proposition 3. P rojx,q,s,fP(DOBSSx,q,z,s,f)⊆ P(D2x,q,s,f). Further, there exist instances for which the inclusion is strict.
Proof. Note that all the constraints of P(D2x,q,s,f) can be found in the description of P(DOBSSx,q,z,s,f) except for constraints (4) and (19). Constraints (4) are implied by con- straints (11), (15), (16) and (24).
Further, the projection of P(DOBSSx,q,z,s,f) on the (x, q, s, f)-space can be obtained by applying Farkas’ Lemma [Farkas, 1902]. Constraints (15), (16), (23) and (24) are the only ones involving variables zijk and are separable by k∈K. For a fixed k∈K the projection is given by:
Ak ={(x, q, f) :αfk+X
i∈I
βixi+X
j∈J
γjqkj ≥0∀(α, γ, β) :
αRkij +βi+γj ≥0 ∀i∈I,∀j ∈J} (25) For a fixed j ∈J, defineα=−1, βi =Rijk for all i∈I,γj = 0 and γ` = maxi∈I(Ri`k −Rkij) for all `∈J with `6=j. This definition of the parameters satisfies αRkij +βi+γj ≥0 for all i∈I,j ∈J. Substituting these parameters in the generic constraints ofAk yields
fk≤X
i∈I
Rkijxi+ X
`∈J:`6=j
maxi∈I (Rki`−Rkij)qk` ∀j ∈J,∀k∈K. (26) Constraints (26) imply constraints (19) for the tight value of M provided in Proposition 2 since for all j ∈J and k∈K,
X
`∈J:`6=j
maxi∈I (Rki`−Rkij)qk` ≤max
i∈I
max`∈J Rki`−Rkij
X
`∈J:`6=j
qk` = max
i∈I
max`∈J Rki`−Rkij
(1−qjk).
This proves the inclusion. To show that the inclusion may be strict, consider the following example where |I|=|J|= 3 and|K|= 1. Let the payoff matrix for the game be
(R, C) =
(1,0) (0,0) (0,0) (0,0) (1,0) (0,0) (0,0) (0,0) (0,0)
and consider the point defined by x = (1,0,0)t, q = (13,13,13)t, s= 10 and f = 2/3. Such a point is feasible for (D2x,q,s,f) but violates constraints (26) for j = 2 and is therefore
infeasible for P rojx,q,s,fP(DOBSSx,q,z,s,f).
Next, we compare the polyhedra P(MIP-p-Gq,z) and P rojq,zP(DOBSSq,z,s).
Theorem 1. P(MIP-p-Gq,z)⊆ P(DOBSSq,z)=P rojq,zP(DOBSSq,z,s). Further, there exist instances for which the inclusion is strict.
Proof. The description ofP(DOBSSq,z) differs from that ofP(MIP-p-Gq,z) by only one set of constraints: (21) must hold instead of (22). Hence, the remainder of the proof consists in showing that (21) are implied by (11), (14)-(16), (22) and the nonnegativity of the q variables. The LHS of (21) can be rewritten as:
X
i∈I
(Cijk −Ci`k)zi`k +X
i∈I
X
j0∈J:j06=`
(Cijk −Ci`k)zijk0 ≤X
i∈I
X
j0∈J:j06=`
(Cijk −Ci`k)zkij0, using (22),
≤max
i∈I {Cijk −Ci`k} X
j0∈J:j06=`
X
i∈I
zijk0 ≤M X
j0∈J:j06=`
qjk0, given Proposition 2 and (15)
=M(1−q`k), by (11).
To show that the inclusion may be strict consider the p-follower GSG between a leader and a fixed follower k∈K where the payoff bimatrix is:
(Rk, Ck) =
(0,1) (1,0) (0,0) (0,0)
The point with coordinates x= (1/2,1/2)t, qk= (1/2,1/2)t and
zk=
1/4 1/4 1/4 1/4
has an objective value of 1/4 and is feasible in P(DOBSSq,z). However it is not a feasible point in P(MIP-p-Gq,z) as it doesn’t verify constraints (22) when j= 2 and`= 1.
From an interpretation point of view, (MIP-p-Gq,z) can be seen as the result of applying Reformulation Linearization Technique (RLT) [Sherali and Adams, 1994] to (DOBSSq,z).
Indeed, by multiplying both sides of constraints (20) by variable q`k and noticing that qk`(1−q`k) = 0 sinceq is binary, one obtainsP
i∈I(Cijk −Ci`k)xiq`k≤0 which, once linearized by introducing variables zi`k, yields (22).
For a given formulation F, we denote its optimal value by v(F) and the optimal value of its LP relaxation by v(F). Since (D2x,q,s,f) and (DOBSSx,q,s,f) and (DOBSSq,z) and (MIP-p-Gq,z) have the same objective function, the following corollary holds.
Corollary 1. v(MIP-p-Gq,z)≤v(DOBSSq,z) =v(DOBSSx,q,s,f)≤v(D2x,q,s,f).
Finally, when (MIP-p-G) is restricted to a single follower type, [Conitzer and Korzhyk, 2011] showed that the integrality costraints are redundant, i.e., the remaining constraints in (MIP-1-G) provide a complete linear description of the convex hull of feasible solutions.
4 Computational experiments for GSGs
Here, we present computational experiments for the formulations in Section 3. The machine used for these experiments is an Intel Core i7-4930K CPU, 3.40GHz, equipped with 64 GB of RAM, 6 cores, 12 threads and running the Ubuntu operating system release 12.10 (kernel Linux 3.5.0-41-generic). The experiments were coded in the programming language Python and GUROBI version 6.5.1 was the optimization solver used with a 3 hour solution time limit.
The instances solved in the computational experiments are randomly generated. We con- sider two different ways of randomly generating the payoff matrices for the leader and the different follower types. First, we consider matrices where all the elements are randomly generated between 0 and 10 and second, we consider matrices where 90% of the values are between 0 and 10 but we allow for 10% of the data to deviate between 0 and 100. In the first case we say that there is no variability in the payoff matrices, in the sense that all the data is uniformly distributed, whereas in the second case, we refer to the payoff matrices as matrices with variability.
A general Stackelberg game instance is defined by three parameters: |I|, the number of leader pure strategies, |J|, the number of follower pure strategies and |K|, the number of follower types. For the purpose of these experiments, we have considered instances where
|I| ∈ {10,20,30}, |J| ∈ {10,20,30} and |K| ∈ {2,4,6}. For each instance size, 5 instances are generated without variability in the payoff matrices and 5 are generated with variability.
In total, we consider 135 instances without variability and 135 instances with variability.
Performance profiles summarize our results, with respect to the following 4 measures: to- tal running time employed to solve the integer problem, running time employed to solve
the linear relaxation of the integer problem, total number of nodes explored in the branch and bound (B&B) tree and percentage optimality gap at the root node. The percentage optimality gap at the root node is calculated by comparing the optimal values of the for- mulation and of its LP relaxation: v(F)−v(Fv(F) ) ·100. A performance profile graph plots the total percentage of problems solved for each value of these measures.
We study the behavior of (D2x,q,s,f), (D2x,q,f), (DOBSSq,z,s), (DOBSSq,z) and (MIP-p-Gq,z).
Figures 1 and 2 compare the performance profiles when the payoff matrices are generated without variability and with variability, respectively.
We observe that the instances where variability is introduced in the payoff matrices
0.01 0.1 1 10 100 1000 10000
0 10 20 30 40 50 60 70 80 90 100
Time to solve the integer problem (s.)
% of problems solved
0.00010 0.0010 0.0100 0.1000 1.0000
10 20 30 40 50 60 70 80 90 100
Time to solve the LP relaxation (s.)
% of problems solved
1 100 10000 1000000 100000000
0 10 20 30 40 50 60 70 80 90 100
Number of nodes in the B&B tree
% of problems solved
1 10 100 1000
0 10 20 30 40 50 60 70 80 90 100
% optimality gap at the root node
% of problems solved
Figure 1: GSGs: |I| ∈ {10,20,30},|J| ∈ {10,20,30},|K| ∈ {2,4,6}–without variability.
solve faster than those where no variability is considered. When there is no variability, (DOBSSq,z,s) and (MIP-p-Gq,z) are the two most competitive formulations. (D2x,q,s,f) can also be solved efficiently for the mid-range instances but slows down for the more difficult instances. Introducing variability in the payoff matrices, however, leads to a dominance of (MIP-p-Gq,z) with (DOBSSq,z,s) coming in a close second and (D2x,q,s) becoming noncom- petitive for these instances. Regarding the time spent solving the linear relaxation of the problems, formulation (MIP-p-Gq,z) is the hardest to solve due to the fact that is has the most variables and constraints, O(|K||J|2). On the other hand, (D2x,q,s,f), withO(|K||J|) variables and constraints, is the fastest. With respect to the number of nodes and gap percentage, our theoretical findings are corroborated: (MIP-p-Gq,z) is the tightest formu- lation and therefore uses the fewest nodes. This is even more the case when variability is
0.01 0.1 1 10 100 1000 0
10 20 30 40 50 60 70 80 90 100
Time to solve the integer problem (s.)
% of problems solved
0.00010 0.0010 0.0100 0.1000 1.0000
10 20 30 40 50 60 70 80 90 100
Time to solve the LP relaxation (s.)
% of problems solved
1 10 100 1000 10000 100000 1000000
0 10 20 30 40 50 60 70 80 90 100
Number of nodes in the B&B tree
% of problems solved
0.01 0.1 1 10 100 1000
0 10 20 30 40 50 60 70 80 90 100
% optimality gap at the root node
% of problems solved
Figure 2: GSGs: |I| ∈ {10,20,30},|J| ∈ {10,20,30},|K| ∈ {2,4,6}–with variability.
introduced.
Table 2 summarizes the mean percentage optimality gap at the root node obtained across the instances solved. Finally, note that the formulations obtained through Fourier-Motzkin, (D2x,q,f) and (DOBSSq,z), explore slightly less nodes in the B&B tree than their counter- parts, (D2x,q,s,f) and (DOBSSq,z,s), but because of the increase in the number of constraints, the time to solve each linear relaxation increases. This increases the overall solution time of the Fourier-Motzkin formulations.
(D2x,q,s,f) (DOBSSq,z,s) (MIP-p-Gq,z)
Mean % opt. gap (no variability) 117.68 23.01 9.94
Mean % opt. gap (with variability) 103.44 40.74 5.17
Total mean % opt. gap 110.56 31.88 7.56
Table 2: Mean percentage optimality gap at the root node recorded for GSG formulations.
5 Stackelberg security games-SSGs
In this section, we derive three SSG formulations: (ERASERc,q,s,f), due to [Kiekintveld et al., 2009], and (SDOBSSq,y,s) and (MIP-p-Sq,y). We derive these formulations by explor- ing the inherent link between the general setting, considered up to now and the security setting, defined in Section 2.2. In this setting, the defender pure strategies i∈I correspond to the different ways in which up to m targets can be protected simultaneously. With a
slight abuse of notation, i ∈ I refers both to the index running through the set of pure strategies I and to the subset of at mostm targets protected by pure strategyi∈I. Recall that the payoff matrices of SSGs satisfy:
Rkij =
Dk(j|p) ifj∈i Dk(j|u) ifj /∈i
(27) Cijk =
Ak(j|p) ifj∈i Ak(j|u) ifj /∈i
(28)
The payoff for the leader that commits to a pure strategy i∈I and a follower of type k∈K responds by selecting strategyj∈J is either a reward if pure strategyi∈I protects attacked target j ∈ J, or, a penalty if strategy i does not protect target j. The same argument explains the link between payoffs for the attackers.
5.1 Stackelberg security games: single level formulations
The first formulation we derive is based on (D2x,q,s,f). Consider (D2c,x,q,s,f), an extended description of (D2x,q,s,f) where we introduce the c variables through constraints (8) (see Section 2.2). We further use relations (27) and (28) to adapt the payoff structure:
(D2c,x,q,s,f)
Max X
k∈K
πkfk
s.t. (4),(8),(11),(12),
0≤sk−Ak(j|p)cj −Ak(j|u)(1−cj)≤(1−qkj)·M ∀j∈J,∀k∈K, (29) fk≤Dk(j|p)cj+Dk(j|u)(1−cj) + (1−qjk)·M ∀j∈J,∀k∈K, (30) s, f ∈RK.
This extended formulation is equivalent to (D2x,q,s,f), because, even though they are defined in different spaces of variables, the value of their LP relaxations coincide.
The formulation above has a large number of non-negative variables since in the security setting, the set I of all defender pure strategies is exponential in the number of targets as it contains all subsets of at mostm targets ofJ that the defender can protect simultaneously.
In order to avoid having exponentially many non-negative variables in our formulation, we project out variables xi, i ∈ I, from the formulation. Note that only constraints (4) and (8) involve said variables.
Proposition 4. Consider the following two sets:
A=P rojc
n
(x, c)∈R|I|×R|J|: (4),(8) o
B =
c∈R|J|:X
j∈J
cj ≤m, cj ∈[0,1]∀j∈J
Then, A=B.
Proof. Observe first that using Farkas’ Lemma [Farkas, 1902]:
A=
c∈R|J|:X
j∈J
αjcj+α|J|+1≥0∀α∈R|J|+1:
X
j∈J:j∈i
αj+α|J|+1 ≥0 ∀i∈I :|i| ≤m and α|J|+1 ≥0
,
Thus A⊆B. Indeed, the following 2|J|+ 1 vectors inR|J|+1:
∀j ∈J, ej ∈R|J|+1:ejj = 1, ejk= 0 ∀k∈J :k6=j and ej|J|+1 = 0,
∀j∈J, fj ∈R|J|+1 :fjj =−1, fkj = 0 ∀k∈J :k6=j and f|J|+1j = 1 and g∈R|J|+1:gj =−1 ∀j∈J and g|J|+1=m,
satisfy P
j∈J:j∈iαj +α|J|+1 ≥0 and α|J|+1 ≥0. Additionally, when we substitute the above vectors into the generic constraints defining A, they yield all the constraints defining B.
To show that A=B, it remains to show that any other inequality X
j∈J
αjcj+α|J|+1 ≥0 (31)
such that α satisfies X
j∈J:j∈i
αj+α|J|+1≥0 ∀i∈I :|i| ≤m and α|J|+1≥0, (32) is dominated by some nonnegative linear combination of the constraints defining B.
First, note that we can restrict our attention to constraints such that αj ≤0 for all j ∈J. If there exists ˆj ∈J such that αˆj >0, since α must satisfy (32) and |i\ {ˆj}| ≤ |i| ≤ m, it follows that ¯α with ¯αˆj = 0 and ¯αj =αj for allj ∈J\ {ˆj}also satisfies (32) and sincec≥0, we have that
X
j∈J
¯
αjcj+ ¯α|J|+1 ≤X
j∈J
αjcj+α|J|+1.
Therefore, the constraint defined by α is dominated by the constraint defined by ¯α. We thus distinguish two cases of α satisfying (32):
Case 1. |{j:αj <0}|=k≤m, and Case 2. |{j:αj <0}|=k > m.
In Case 1, by considering a linear combination of inequalities cj ≤ 1 for 1 ≤ j ≤ k with respective weights −αj ≥0, we obtain that:
0≤
k
X
j=1
αjcj−
k
X
j=1
αj ≤X
j∈J
αjcj+α|J|+1, since αj = 0 for all j > k and α satisfies (32) fori={1, . . . , k}.
For Case 2, assume w.l.o.g that α1 ≤ α2 ≤ . . . ≤ αk < 0 and αj = 0 for all j > k.
Then, build a linear combination of inequality P
j∈Jcj ≤ m with weight −αm ≥ 0 and inequalitiescj ≤1 for 1≤j≤m with respective weightsαm−αj ≥0. The valid inequality thus obtained is:
0≤
m
X
j=1
αjcj +X
j>m
αmcj−
m
X
j=1
αj ≤X
j∈J
αjcj −
m
X
j=1
αj, since αj ≥αm for all j > m
≤X
j∈J
αjcj +α|J|+1,
since α satisfies (32) fori={1, . . . , m}.
Proposition 4 leads to the following formulation based on (D2c,x,q,s,f):
(ERASERc,q,s,f)
Max X
k∈K
πkfk
s.t. (11),(12),(29),(30), X
j∈J
cj ≤m,
0≤cj ≤1 ∀j∈J,
s, f ∈RK.
The above formulation involves a polynomial number of variables and constraints and was presented in [Kiekintveld et al., 2009]. The next result is also an immediate consequence of Proposition 4.
Corollary 2. P rojc,q,s,fP(D2c,x,q,s,f) =P(ERASERc,q,s,f).
We now derive new SSG formulations based on (DOBSSq,z,s) and (MIP-p-Gq,z). We first present extended descriptions of both formulations by consideringy`jk variables satisfying:
y`jk = X
i∈I:`∈i
zijk ∀j, `∈J,∀k∈K. (33)
We use (27) and (28) to adapt the payoffs to the security setting leading to:
(DOBSSq,z,y,s)
Max X
j∈J
X
k∈K
{πk(Dk(j|p)yjjk +Dk(j|u)(qjk−yjjk))} (34) s.t. (11),(12),(14)−(16),(33),
0≤sk−Ak(j|p)X
j0∈J
yjjk0− Ak(j|u)(1−X
j0∈J
yjjk0)≤(1−qkj)·M ∀j∈J,∀k∈K, (35)
s∈R|K|. (36)
(MIP-p-Gq,z,y) Max X
j∈J
X
k∈K
πk(Dk(j|p)yjjk +Dk(j|u)(qjk−ykjj)) s.t. (11),(12),(14)−(16),(33),
Ak(j|p)ykjj+Ak(j|u)(qjk−ykjj)−
Ak(`|p)y`jk −Ak(`|u)(qjk−yk`j)≥0 ∀j, `∈J,∀k∈K.
(37) Further, consider the following constraints:
X
j∈J
y`jk =X
j∈J
y`j1 ∀`∈J,∀k∈K, (38)
and let us define the following polyhedra C and D:
C :={(q, z, y, s)∈[0,1]|K||J|×[0,1]|K||I||J|×[0,1]|K||J|2 ×R|K|: (11),(15),(16),(33),(35),(36),(38)}
D:={(q, z, y)∈[0,1]|K||J|×[0,1]|K||I||J|×[0,1]|K||J|2 : (11),(15),(16),(33),(37),(38)}
Lemma 1. C ⊇ P(DOBSSq,z,y,s) andD⊇ P(MIP-p-Gq,z,y)
Proof. Consider constraints (14) and sum over alli∈I such that`∈i:
X
i∈I:`∈i
X
j∈J
zijk =X
i∈I:`∈i
X
j∈J
zij1 ∀`∈J,∀k∈K. (39)
Applying (33) to (39) yields (38) and the result follows.
We now project thezvariables from the larger polyhedraCand D. Said variables only appear in constraints (15), (16) and (33).
Lemma 2. Consider the following two sets;
X =P rojq,yn
(q, z, y)∈R|K||J|
2+|K||J|+|I||J||K|: (15),(16),(33)o
Y ={(q, y)∈R|K||J|
2+|K||J|
:X
`∈J
y`jk ≤mqkj ∀j∈J,∀k∈K, 0≤y`jk ≤qjk ∀j, `∈J,∀k∈K}
Then, X =Y.
Proof. Note that constraints (15), (16) and (33) can be treated independently for each k ∈ K and each j ∈ J. First consider the case where qˆˆk
j = 0 for ˆj ∈ J and ˆk ∈ K.
Constraints (15) then imply that for all i∈I,zˆk
iˆj = 0 and constraints (33) forceykˆ
`ˆj = 0 for all `∈J and the result holds. For all j ∈J,k∈K such that qjk6= 0, consider xi =zijk/qjk and c` =y`jk/qjk and apply Propostion 4. The result follows.
Consider P rojq,y,sC and P rojq,yD as the feasible regions of the linear relaxations of two MILP formulations–(SDOBSSq,y,s) and (MIP-p-Sq,y)–where we maximize the objective function (34) under the additional requirement that the q variables be binary. Hence, we present (SDOBSSq,y,s), a security formulation based on (DOBSSq,z,y,s),
(SDOBSSq,y,s)
Max X
j∈J
X
k∈K
πk(Dk(j|p)ykjj+Dk(j|u)(qjk−yjjk)) s.t. (11),(12),(35),(38)
X
`∈J
y`jk ≤mqkj ∀j ∈J,∀k∈K, (40)
0≤y`jk ≤qjk ∀j, `∈J,∀k∈K, (41) s∈R|K|.
And we also present (MIP-p-Sq,y), a security formulation based on (MIP-p-Gq,z,y),
(MIP-p-Sq,y) Max X
j∈J
X
k∈K
πk(Dk(j|p)yjjk +Dk(j|u)(qjk−yjjk))
s.t. (11),(12),(37),(38),(40),(41)
The following corollaries are an immediate consequence of Lemmas 1 and 2.
Corollary 3. P rojq,y,sP(DOBSSq,z,y,s)⊆ P(SDOBSSq,y,s).
Corollary 4. P rojq,yP(MIP-p-Gq,z,y)⊆ P(MIP-p-Sq,y).