



2.4.3 Shrink the encoding

As the state space explosion is ultimately a representation problem, the third strategy aims at using an efficient representation of the Kripke structure. Instead of reducing the number of states to explore, this strategy reduces the memory footprint of the state space at the price of additional computation.

Compression

This first tactic still enumerates each and every state, but stores each of them in a compressed form.

Bit-state hashing As the unicity table, the data structure that remembers whether a state has already been visited, consumes most of the memory, this technique [Hol03, Hol97] proposes to replace it by a simpler version. Instead of storing a state, it only stores whether the state has already been visited. For that purpose, a big bit array is used. As soon as a state is encountered, a hash code is computed and the corresponding bit in the table (i.e., the hash code serves as an index) is set to one. The problem with this approach is that a hash function is usually not injective. The probability of collision can, however, be computed and there exist approaches to reduce it [KL04]. This technique often serves as a pre-computation step to prepare and optimize the actual model checking.
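To make the tactic concrete, the following Python sketch (purely illustrative; the hash function and table size are arbitrary choices, not those of any particular tool) shows the core of a bit-state table: a fixed-size bit array indexed by a hash of the state, which may produce false positives on collisions.

    import hashlib

    class BitStateTable:
        """Sketch of a bit-state hash table: one bit per hash bucket."""

        def __init__(self, size_in_bits=1 << 23):
            self.size = size_in_bits
            self.bits = bytearray(size_in_bits // 8)

        def _index(self, state_bytes):
            # Any reasonably uniform hash works; md5 is just a convenient choice here.
            digest = hashlib.md5(state_bytes).digest()
            return int.from_bytes(digest[:8], "big") % self.size

        def mark_visited(self, state_bytes):
            i = self._index(state_bytes)
            self.bits[i // 8] |= 1 << (i % 8)

        def seen(self, state_bytes):
            # May answer True for a state never visited (hash collision).
            i = self._index(state_bytes)
            return bool(self.bits[i // 8] & (1 << (i % 8)))

    # Usage: only explore successors of states whose bit is not yet set.
    table = BitStateTable()
    state = b"p1=idle;p2=critical"
    if not table.seen(state):
        table.mark_visited(state)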

Recursive indexing compression This compression method [Hol03, Hol97] is based on the fact that, in asynchronous systems, components can only take a limited number of local states and that a global state is a concatenation of local states. Informally, the goal is to only store pointers to local states and to concatenate the pointers to form the global state, instead of the local states themselves. The more symmetry between the local states of the different modules, the more reuse and thus the better the reduction. Of course, to be efficient the size of the local states must be greater than the size of the pointer. When local states are themselves composed of sub-components, the method can be applied recursively.
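The following sketch illustrates the principle, assuming each global state is a tuple of hashable local states; the class and function names are ours, chosen only for illustration.

    class ComponentTable:
        """Interns the local states of one component and hands out small indices."""

        def __init__(self):
            self.index_of = {}   # local state -> index
            self.states = []     # index -> local state

        def intern(self, local_state):
            if local_state not in self.index_of:
                self.index_of[local_state] = len(self.states)
                self.states.append(local_state)
            return self.index_of[local_state]

    # One table per component; a compressed global state is a tuple of indices.
    tables = [ComponentTable(), ComponentTable()]

    def compress(global_state):
        return tuple(t.intern(ls) for t, ls in zip(tables, global_state))

    def decompress(indices):
        return tuple(t.states[i] for t, i in zip(tables, indices))

    g1 = (("idle", 0), ("waiting", 42))
    g2 = (("idle", 0), ("critical", 42))   # shares its first local state with g1
    c1, c2 = compress(g1), compress(g2)
    assert decompress(c1) == g1 and decompress(c2) == g2

The shared local state ("idle", 0) is stored only once; each compressed global state then only costs a few small indices.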

∆-State Instead of storing each and every state, this approach [EPP05] proposes to store a full state only once every k steps and, in between, to only save the delta from one state to the next. Of course, the choice of k is a trade-off between the memory saved and the time that is necessary to rebuild the intermediate states.
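A minimal sketch of the trade-off, assuming states are dictionaries of variable valuations (the encoding and names are ours): a full snapshot is kept every k steps and the intermediate states are rebuilt by replaying deltas.

    def diff(prev, curr):
        """Delta from prev to curr: changed or added keys, plus removed keys."""
        changed = {k: v for k, v in curr.items() if prev.get(k) != v}
        removed = [k for k in prev if k not in curr]
        return changed, removed

    def apply_delta(prev, delta):
        changed, removed = delta
        state = dict(prev)
        state.update(changed)
        for k in removed:
            state.pop(k, None)
        return state

    class DeltaTrace:
        """Stores a full snapshot every k steps and deltas in between."""

        def __init__(self, k=4):
            self.k = k
            self.entries = []    # ("full", state) or ("delta", (changed, removed))
            self._last = None

        def append(self, state):
            if len(self.entries) % self.k == 0:
                self.entries.append(("full", dict(state)))
            else:
                self.entries.append(("delta", diff(self._last, state)))
            self._last = dict(state)

        def rebuild(self, i):
            """Rebuild state i by replaying the deltas since the last snapshot."""
            j = i
            while self.entries[j][0] != "full":
                j -= 1
            state = dict(self.entries[j][1])
            for _, delta in self.entries[j + 1:i + 1]:
                state = apply_delta(state, delta)
            return state

    trace = DeltaTrace(k=3)
    for step in range(7):
        trace.append({"pc": step, "x": step % 2})
    assert trace.rebuild(5) == {"pc": 5, "x": 1}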

Symbolic model checking

Unlike the previous approaches, symbolic model checking does not encode the state space explicitly. It rather uses a symbolic (i.e., implicit) encoding of the state space. The rest of the state of the art is dedicated to symbolic model checking and the techniques to implement it.

2.4.4 Symbolic model checking

In the early 90's, McMillan [BCM+92] realized that, using Bryant's ROBDDs [Bry86] (a variant of DDs), it was possible to encode the transition relation symbolically. Because ROBDDs are well suited to express regularity in the state space of circuits and protocols, they provide an encoding that is, in the best case, logarithmic in the number of states and linear in the worst case. Symbolic model checking is polynomial in the size of the state space representation.

The basic idea is to encode the characteristic function of the set of states and the transition relation as a Boolean expression. A subset S0 of the set of states S (or of any set) can be represented through its characteristic function χS0 : S → B such that:

χS0(s) = 1 if s ∈ S0, and χS0(s) = 0 otherwise.
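As a toy illustration (this is not how an SMC engine is implemented; actual tools use decision diagrams, as discussed below), sets of states can be manipulated purely through their characteristic functions:

    # A "set" of states is represented only by its characteristic function chi: S -> bool.
    def chi_of(explicit_subset):
        return lambda s: s in explicit_subset

    def union(chi1, chi2):
        return lambda s: chi1(s) or chi2(s)

    def intersection(chi1, chi2):
        return lambda s: chi1(s) and chi2(s)

    chi_a = chi_of({"s1", "s3"})
    chi_b = chi_of({"s2", "s3"})
    chi_u = union(chi_a, chi_b)
    assert chi_u("s2") and not intersection(chi_a, chi_b)("s1")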

Then, SMC leverages the canonical representation of Boolean expressions provided by ROBDDs to perform very efficient operations on them. The interesting part is that the size of the representation does not depend directly on the size of the encoded state space. Hence, symbolic model checking often has a sub-linear complexity with respect to the size of the state space. The commutative diagram in Fig. 2.4 explains the symbolic approach and its correctness principle. Let S be the set of all possible states of a Kripke structure, and S0 ∈ P(S) the set of its initial states.

Let N be a transition relation on the set of states and τ be the transitive closure of N. The set Ŝ0 of states reachable from S0 can be calculated by applying τ : P(S) → P(S), the transitive closure of the application of the transition relation N, to S0 (top-left part of Fig. 2.4).

The calculated sets of states must be encoded in some domain DD (encS : P(S) → DD, link between the top and bottom parts of Fig. 2.4). The transitive closure is encoded as well (encT(τ) : DD → DD, bottom-left part of Fig. 2.4). Similarly to the calculation of the state space, the property check (|=P) is encoded and performed on the symbolic encoding (encP(|=φ)).

    P(S) ---τ---> P(S) ---|=P---> P(S)
      |encS         |encS           |encS
      v             v               v
      DD --encT(τ)--> DD --encP(|=φ)--> DD

Figure 2.4: The symbolic approach and its correctness principle

The central point of SMC is the choice of a DD that encodes sets of states with a sub-linear complexity. McMillan proposes ROBDDs as the co-domain for encS (which encodes a set of states) and pairs of ROBDDs for encT (which encodes the transition relation). Most SMC approaches rely on DDs to encode the characteristic function that represents the state space. To support different kinds of models and different techniques, researchers developed a complete zoology of DDs. As a detailed survey is beyond the scope of this monograph, interested readers may refer to [Lin09, LPAK+10]. We only give a brief overview of the most prominent categories and particularly of the categories this work is based on.

Decision Diagrams

A Decision Diagram encodes a set of sequences of variable assignments. Each sequence encodes a state or a transition (a pair of states). Some variable assignments may be shared by several sequences; therefore, modifying one assignment may impact a set of states. DDs have several relevant properties. For instance, their size (i.e., the number of arcs and nodes) does not depend directly on the quantity of stored data. This size can be exponentially smaller w.r.t. the actual size of the stored data. In addition, thanks to the canonicity (i.e., a unique representation) and unicity (i.e., only one instance of the representation in memory) that are intrinsically built-in (i.e., flyweight pattern, hash-consing), the equality of two DDs is checked in constant time. This property enables fast memoization [Wik12b, Mic68] and provides efficient fixed point computation. Fixed point is a very important operation in the SMC framework. The way of manipulating DDs depends on the particular kind. Some are mainly restricted to the Shannon decomposition [Sha38], which supports set-theoretic operations such as union, intersection and difference as well as existential operations (e.g., ROBDDs); others also support user-defined functions (e.g., DDDs, Hierarchical Set Decision Diagrams (SDDs)).

Most of the time, a path of the DD represents a state of the system. However, in some encodings, a variable interleaving is used to represent the one-step transition relation in pre/post form. In our work, a path of the DD does represent a state of the system.

ROBDD This paragraph presents ROBDDs. As this material is common knowledge, readers familiar with the subject may safely skip ahead to the following paragraphs.

ROBDDs [Bry86] were the first category of DDs used in model checking. ROBDDs were originally dedicated to the efficient representation of Boolean expressions. In their seminal paper [BCM+92], Burch et al. proposed to use ROBDDs to encode finite state spaces. In this setting, the state space is encoded as a ROBDD and the transition relation is encoded as pairs of ROBDDs. In hardware systems, which were the initial case study, a vector of Boolean variables encodes the state.

Therefore, the characteristic function of the states was of the form χS0 : B^n → B, where n stands for the number of variables used to encode a state. The basic principle relies on the Shannon decomposition of Boolean expressions:

β_{x1,...,xn} = (x1 ∧ β_{1,x2,...,xn}) ∨ (¬x1 ∧ β_{0,x2,...,xn})

where β_{x1,...,xn} is a Boolean expression over the variables x1, ..., xn and β_{0,x2,...,xn} stands for the assignment of x1 to 0 in β. Fig. 2.5 presents an example of the Shannon decomposition of the Boolean function (x1 ∧ x2 ∧ x3 ∧ x4) ∨ (x1 ∨ x2) ∧ (x3 ∨ x4).

A decision tree represents the decomposition. Solid lines stand for an assignment to 1 whereas dashed lines stand for an assignment to 0. The square nodes (i.e., the terminals) represent the result of the function. Most of the time, only the support of the characteristic function is represented (i.e., the values that do exist in the set) and therefore, when all paths respect the same variable order, the paths that lead to the terminal 0 can be ignored.
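The decomposition is easy to mimic programmatically. In the Python sketch below (a toy illustration; the example function is transcribed as printed above), the two cofactors obtained by fixing x1 are computed and the decomposition is checked on all valuations:

    def f(x1, x2, x3, x4):
        # The running example: (x1 and x2 and x3 and x4) or ((x1 or x2) and (x3 or x4)).
        return (x1 and x2 and x3 and x4) or ((x1 or x2) and (x3 or x4))

    def cofactor(func, position, value):
        """Fix the variable at the given position to a constant Boolean value."""
        def restricted(*args):
            full = list(args)
            full.insert(position, value)
            return func(*full)
        return restricted

    f1 = cofactor(f, 0, True)    # beta_{1,x2,x3,x4}
    f0 = cofactor(f, 0, False)   # beta_{0,x2,x3,x4}

    # Shannon decomposition: f == (x1 and f1) or (not x1 and f0) for every valuation.
    for bits in range(16):
        x1, x2, x3, x4 = (bool(bits >> i & 1) for i in range(4))
        assert f(x1, x2, x3, x4) == ((x1 and f1(x2, x3, x4)) or (not x1 and f0(x2, x3, x4)))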

Figure 2.5: Shannon decomposition of (x1 ∧ x2 ∧ x3 ∧ x4) ∨ (x1 ∨ x2) ∧ (x3 ∨ x4)

Obviously, some of the subgraphs are identical (e.g., the x3 ∨ x4 subgraphs, the x4 nodes and the terminal nodes) and can therefore be merged. Therefore, ROBDDs (as well as all other DDs) are Directed Acyclic Graphs (DAGs), in which the nodes represent variables and the arcs carry assignments to these variables. Besides, if the ordering (e.g., x1 < x2 < x3 < x4) is the same along all the paths, nodes whose two arcs lead to the same node can be removed (i.e., don't cares). Applying these reductions recursively until no more apply leads to Fig. 2.6, which is the reduced form (Reduced Ordered Binary Decision Diagram) of Fig. 2.5.
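As a sketch of how these reduction rules translate into code (a minimal, hypothetical implementation, not a production BDD package), node creation goes through a unicity table so that identical subgraphs are merged and don't-care nodes are never built:

    class BDDNode:
        """Minimal reduced, ordered BDD node built through a unicity table."""

        ZERO, ONE = object(), object()   # the two terminal nodes
        _unique = {}                     # (var, id(low), id(high)) -> node

        def __init__(self, var, low, high):
            self.var, self.low, self.high = var, low, high

        @classmethod
        def mk(cls, var, low, high):
            if low is high:                      # don't-care node: both arcs equal
                return low
            key = (var, id(low), id(high))       # identical subgraphs share one node
            if key not in cls._unique:
                cls._unique[key] = cls(var, low, high)
            return cls._unique[key]

    # Building (x1 and x2) twice yields the very same node object.
    n1 = BDDNode.mk("x1", BDDNode.ZERO, BDDNode.mk("x2", BDDNode.ZERO, BDDNode.ONE))
    n2 = BDDNode.mk("x1", BDDNode.ZERO, BDDNode.mk("x2", BDDNode.ZERO, BDDNode.ONE))
    assert n1 is n2   # canonicity: structural equality is pointer equality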

Fig. 2.7 presents the same function encoded by ROBDDs with two different orderings. The left version (7 nodes) is smaller than the right one (15 nodes) and thus the left ordering (x1 < x2 < x3 < x4 < x5 < x6) is better than the right one (x1 < x3 < x5 < x2 < x4 < x6). Using a good (resp. bad) ordering for representing the same function can lead to exponentially smaller (resp. bigger) ROBDDs.

As the ordering has to be defined prior to the ROBDD construction, the question of selecting a specific ordering naturally arises. Unfortunately, finding the optimal ordering is known to be NP-complete [BW96] and therefore we rely on heuristics to find a good ordering. Some techniques propose to reorganize the variable order at runtime [Rud93]. Set-theoretic operations (∪, ∩, \) between ROBDDs are computed in polynomial time with respect to the size of the operands. Most of the operations on ROBDDs are based on the apply operation, which basically applies the Shannon decomposition and is the basis of the set-theoretic operations such as union, intersection and difference as well as existential operations.

Figure 2.6: ROBDD of (x1 ∧ x2 ∧ x3 ∧ x4) ∨ (x1 ∨ x2) ∧ (x3 ∨ x4)

User-defined functions are not foreseen in this framework.

Thanks to the canonicity and unicity of the representation (i.e., flyweight pattern, hash-consing) that is intrinsically built-in, the equality of two Binary Decision Diagrams (BDDs) is checked in constant time. This property enables fast memoization [Wik12b, Mic68] and provides efficient fixed point computation. Caching the result of the application of an operation to a given list of BDDs only requires saving a tuple ⟨operation, inputBDD1, ..., inputBDDn, outputBDD⟩. The next time an operation is performed on the same parameter list, the result will be extracted from the cache rather than computed again. ROBDDs are covered in great detail in [MT98].
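A minimal sketch of such an operation cache (illustrative only; this is not the interface of any specific BDD library), assuming hash-consed diagrams so that the identity of a node can serve as its cache key:

    # Because hash-consing makes equal diagrams identical objects, a result cache
    # can be keyed on the operation name and the identities of the operands.
    _op_cache = {}

    def cached_apply(op_name, operation, *operands):
        key = (op_name,) + tuple(id(d) for d in operands)
        if key not in _op_cache:
            _op_cache[key] = operation(*operands)   # computed only once per operand tuple
        return _op_cache[key]

    # Usage, with frozensets standing in for decision diagrams.
    a, b = frozenset({1, 2}), frozenset({2, 3})
    u1 = cached_apply("union", frozenset.union, a, b)
    u2 = cached_apply("union", frozenset.union, a, b)   # served from the cache
    assert u1 is u2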

From now on, we may refer to ROBDDs simply as BDDs.

MDD DDs rapidly took off in model checking; people, however, noticed that the expressive power of BDDs was limited and that it was not sufficient for software systems. Indeed, BDDs only encode Boolean functions; therefore, to encode variables of more complex types such as integer domains, a BDD encoding requires log2(|domain|) variables. To overcome such limitations, more expressive DDs have been developed and used. One of these evolutions is the MDD [KB90], introduced in model checking by Ciardo et al. [CLS00]. State space and transition relation are encoded similarly to ROBDDs.

Figure 2.7: Same Boolean expression encoded with two different variable orderings: (a) good ordering (7 nodes); (b) bad ordering (15 nodes)

Unlike BDDs, MDDs encode functions of the type:

f : {0, .., Kn} × ... × {0, .., K1} → B, where K1, ..., Kn, n ∈ N

Concurrent systems consisting of multiple subsystems give rise to state spaces whose characteristic function is of the form {0, .., Kn} × ... × {0, .., K1} → B. Since we assume that each local state space Si of the system is finite, we may identify each local state with an integer in the range 0, 1, ..., |Si| − 1. State spaces may thus be represented naturally via MDDs.
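The contrast with BDDs can be sketched in a few lines of Python (a toy illustration; the subsystem sizes are made up): an MDD spends a single variable per subsystem, whereas a BDD needs roughly log2(|Si|) Boolean variables for each of them.

    import math

    def bdd_variables_needed(domain_size):
        """Number of Boolean variables to encode one value in 0..domain_size-1."""
        return max(1, math.ceil(math.log2(domain_size)))

    def to_booleans(value, domain_size):
        """BDD-style encoding of a bounded integer as a vector of Booleans."""
        n = bdd_variables_needed(domain_size)
        return tuple(bool(value >> i & 1) for i in range(n))

    local_sizes = [3, 5, 2]   # |S1|, |S2|, |S3| (made-up subsystem sizes)
    mdd_state = (2, 4, 1)     # MDD-style: one integer variable per subsystem

    bdd_state = tuple(bit for value, size in zip(mdd_state, local_sizes)
                      for bit in to_booleans(value, size))
    print(len(mdd_state), "MDD variables vs", len(bdd_state), "BDD variables")  # 3 vs 6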

MDDs have been successfully used in stochastic model checking in Smart [CLS00, CLS01, Min01, CC04]. More recently, MDDs have been integrated into the stochastic model checker GreatSPN [BBDM10].

DDD DDDs [CEPA+02] are more expressive than MDDs as they may handle any, possibly infinite, scalar domain. They also brought a powerful and easy-to-use framework for their manipulation. DDDs propose so-called DD-homomorphisms (DDHoms) that allow a clear separation of concerns between data and operations.

Unlike MDDs, manipulations are inductively defined (not performed in place). The DDD structure is also more flexible as it allows repeated variables and, besides, the variables' domains do not have to be known a priori. Finally, the length of a path is flexible. Unlike for BDDs or MDDs, the transition relation is not encoded in the DDs but with user-defined operations expressed as DD-homomorphisms (DDHoms). Nevertheless, DDDs follow the same paradigm, that is, one path is one state.

Definition 2.4.6 (Data Decision Diagrams). — Given E a set of variables, the DDD set DDD is the least set such that:

• {0,1} ⊆ DDD

• ⟨e, α⟩ ∈ DDD with:

– e ∈ E, with E the set of DDD variables.

– Dom(e) represents the domain of the variable e ∈ E.

– α : Dom(e) → DDD, a total function s.t. {x | x ∈ Dom(e) ∧ α(x) ≠ 0} is finite.

Notation: e —x→ d denotes the DDD ⟨e, α⟩ with α(x) = d and ∀y ∈ Dom(e) s.t. x ≠ y, α(y) = 0.

As for BDDs, the terminal 1 (resp. 0) stands for an existing (resp. non-existing) sequence of assignments. Although, in the seminal paper [CEPA+02], a third terminal called ⊤ is introduced to handle operations between incompatible DDDs, in this work we only consider operations between DDDs that are compatible.

Definition 2.4.7 (DDD compatibility). — Two DDDs are said compatible iff all of their sequences (i.e., paths) are compatible. Two sequences s, s′ are compatible (noted s ≈ s′) iff s = s′ = 1, or s = e —x→ d and s′ = e′ —x′→ d′ are such that e = e′ and d ≈ d′ if x = x′.

Fig. 2.8 presents an example of a DDD. It is built over the following set of variables: E = {a, b} where Dom(a) = {2, 5} and Dom(b) = {2, 3}. It represents the following union: a —2→ b —2→ 1 + a —2→ b —3→ 1 + a —5→ b —2→ 1 + a —5→ b —3→ 1. This illustrates the sharing among the encoded states: the Cartesian product of the variables' domains is encoded in an efficient way.

Figure 2.8: DDD encoding the union a —2→ b —2→ 1 + a —2→ b —3→ 1 + a —5→ b —2→ 1 + a —5→ b —3→ 1
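To give a feel for the one path is one state encoding and for the sharing illustrated by Fig. 2.8, here is a toy Python rendition of a DDD as nested nodes (our own encoding, purely for illustration; real DDD implementations rely on hash-consed nodes):

    ONE = "1"   # the accepting terminal

    # A DDD node is (variable, {value: successor}); each path is one state.
    b_node = ("b", {2: ONE, 3: ONE})
    a_node = ("a", {2: b_node, 5: b_node})   # the sub-DDD rooted at b is shared

    def paths(node, prefix=()):
        """Enumerate the assignment sequences (states) encoded by a DDD."""
        if node is ONE:
            yield prefix
            return
        var, arcs = node
        for value, succ in arcs.items():
            yield from paths(succ, prefix + ((var, value),))

    assert sorted(paths(a_node)) == [
        (("a", 2), ("b", 2)), (("a", 2), ("b", 3)),
        (("a", 5), ("b", 2)), (("a", 5), ("b", 3)),
    ]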

Since DDDs represent sets, we can define the usual set operations on them, such as ∪DDD, ∩DDD and \DDD. For a definition of the set operations on DDDs, see [CEPA+02]. The product operation (⊗DDD) simply replaces the terminals of the first operand by the second operand; it corresponds to computing the Cartesian product of two DDDs. As for BDDs, the results of set-theoretic operations can be cached.

Unlike with Binary Decision Diagrams, operators are not limited to those previously defined. Indeed, one of the strengths of DDD-like structures is their support of so-called inductive homomorphisms, namely operations that are inductively defined on the structure of the DDD and that are compatible with the union operator. This compatibility, called homomorphism, makes user-defined operations highly efficient. A homomorphism is a mapping φ from DDD to itself s.t. φ(0) = 0 and φ(d ∪ d′) = φ(d) ∪ φ(d′), ∀d, d′ ∈ DDD.

The union (∪) and the composition (◦) of two homomorphisms are homomorphisms. Since a decision diagram is inductively defined, operations on it can also be inductively defined. This allows the user to give a local definition of the homomorphism, i.e., what it should do with a given pair ⟨variable, value⟩ or with the terminal 1.

Definition 2.4.8 (Inductive Homomorphisms on DDD). — Given E a set of variables, let {φe,x | e ∈ E, x ∈ Dom(e)} be a family of homomorphisms and d1 ∈ DDD; for any d ∈ DDD, the inductive homomorphism φ is defined by:

φ(d) = d1 if d = 1, φ(d) = 0 if d = 0, and φ(d) = ∪ { φe,x(α(x)) | x ∈ Dom(e), α(x) ≠ 0 } if d = ⟨e, α⟩.

d1 represents the DDD returned whenever a homomorphism reaches the terminal node. Please note that induction is not mandatory: each and every homomorphism φe,x decides whether to propagate to the sub-graph. As for the set operations, φe,x(α(x)) can be evaluated lazily, saving both memory and processing time.

Example 2.4.8. — Let us suppose we want to define a user-defined function φaddn that adds n to every value greater than zero and returns the terminal 1 when reaching it.

Similarly to BDDs, thanks to the canonicity induced by hash-consing, each step of the application of a homomorphism can be cached efficiently. This is especially useful when computing the fixed point (i.e., φn = φn+1) of the application of a given homomorphism, as it avoids computing the step φn+1, which is returned by the cache. We note φ*(d) the fixed point application of φ on d.
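Using the same toy encoding of a DDD as nested (variable, {value: successor}) nodes as above, a possible rendition of such a homomorphism is the following sketch (purely illustrative; the names and encoding are ours, not those of the DDD framework):

    ONE = "1"   # accepting terminal of the toy DDD encoding

    def phi_add(node, n):
        """Toy inductive homomorphism: adds n to every value strictly greater than
        zero along the arcs and maps the terminal 1 to itself (d1 = 1)."""
        if node is ONE:
            return ONE
        var, arcs = node
        return (var, {(value + n if value > 0 else value): phi_add(succ, n)
                      for value, succ in arcs.items()})

    d = ("a", {0: ("b", {2: ONE}), 5: ("b", {0: ONE})})
    assert phi_add(d, 10) == ("a", {0: ("b", {12: ONE}), 15: ("b", {0: ONE})})

In a real implementation, each recursive call would of course go through the operation cache mentioned above, which is what makes repeated and fixed point applications cheap.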

SDD To handle more complex structures, [CTM05] introduces SDDs. They are very similar to DDDs in that they handle paths of different lengths or several occurrences of the same variable, and they also use homomorphisms as their manipulation framework. The basic difference is that SDDs label arcs with sets instead of scalar values. Of course, as SDDs themselves represent sets, they can be used to label the arcs, thus making the DDs hierarchical.

Definition 2.4.10 (Set Decision Diagrams). — Given E a set of variables, the SDD set SDD is the least set such that:

• {0,1} ⊆ SDD

• ∀e ∈ E, Dom(e) ⊆ SDD

Similarly to DDDs, sequences, the product operator (⊗SDD) and the set operators (∪SDD, ∩SDD and \SDD) are defined on SDD. For a definition, see [CTM05]. One can define SDD homomorphisms that are similar to their DDD equivalents.

Definition 2.4.11 (SDD compatibility). — Two SDDs σ1 and σ2 are said compatible iff either σ1 = σ2 = 1 or all of their sequences are compatible; the compatibility of sequences is defined analogously to Def. 2.4.7.

Example 2.4.11. — The SDD of Fig. 2.9 represents 9 paths or states. In this example, an SDD on the set of variables E1 = {p1, p2} embeds other SDDs on the set of variables E2 = {a, b, c, d}. Again, the power of SDDs lies in the symbolic encoding of Cartesian products. Using SDDs, thanks to the sets, we end up with a two-dimensional symbolic encoding.


Figure 2.9: Hierarchical Set Decision Diagram representing 9 paths
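A rough sketch of the hierarchy in the same toy style (our own encoding and made-up values, not the SDD data structure of [CTM05]): arcs are labelled with sets, and those labels could themselves be decision diagrams, which is what makes the structure hierarchical.

    ONE = "1"

    # Outer SDD over {p1, p2}: each arc carries a SET of values as its label.
    sdd = ("p1", [(frozenset({1, 2, 3}),
                   ("p2", [(frozenset({4, 5, 6}), ONE)]))])

    def count_paths(node):
        """Number of assignment sequences (states) represented by the SDD."""
        if node is ONE:
            return 1
        _, arcs = node
        return sum(len(label) * count_paths(succ) for label, succ in arcs)

    assert count_paths(sdd) == 9   # 3 values for p1 times 3 values for p2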

As for DDDs, SDDs support set-theoretic operations, fixed point, and concatenation as well as user-defined operations called DDHoms. New SDDs are created by concatenation and union. In the first case, the operation is straightforward: since both operands are compatible, so is the result. In the second case, however, a canonization operation is required to guarantee that the new structure satisfies the criteria of Def. 2.4.10 and Def. 2.4.11. The union operation, which carries out the canonization process, is known to be quadratic in the number of nodes. The details of the definition of the union operation can be found in [TM04]. In this work, we propose an alternate definition in Section 5.4.2.

The definition of homomorphisms on SDDs is very similar to that of DDDs given in Def. 2.4.8. One can note that, since it is possible to embed DDDs into SDDs, it is also possible to embed DDD homomorphisms into SDD homomorphisms.

Besides, let idSDD be the identity morphism such that ∀s ∈ SDD, idSDD(s) = s.

Definition 2.4.13 (Inductive Homomorphisms on SDD). — Given E a set of variables, let φe,x with e ∈ E and x ∈ Dom(e) be a family of homomorphisms and d1 ∈ SDD an SDD; the inductive homomorphism is then defined analogously to Def. 2.4.8. As for the set operations, inductive homomorphism applications can be cached, saving both memory and processing time.

Once created, a DD is never altered, as homomorphisms are purely functional. When a set operation or a homomorphism is applied to a DD, a new DD is created. This DD is checked for existence in a unicity table: either the DD has already been registered and its reference is returned, or the new reference is added to the unicity table. From an implementation point of view, we can leverage the canonicity of the representation (guaranteed by the DD creation and DD union operators) in order to implement constant-time equality between DDs and thus efficient caching. Set operations and homomorphisms are applied on an inductive structure, and thus each processing step can be put in the cache for further use. This is very useful to save computing time, as it also applies to sub-operations.

Similarly to DDDs, the results of set-theoretic operations as well as of the application of user-defined homomorphisms can be cached.

The transitive closure

In this subsection, we present the different evolutions of state space generation using DDs. Computing the state space of finite systems amounts to applying a set of
