HAL Id: inria-00591593
https://hal.inria.fr/inria-00591593
Submitted on 9 May 2011
HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.
Statistical Model Checking: An Overview
Axel Legay, Benoît Delahaye, Saddek Bensalem
To cite this version:
Axel Legay, Benoît Delahaye, Saddek Bensalem. Statistical Model Checking: An Overview. Runtime
Verification, Nov 2010, Malta, Malta. �10.1007/978-3-642-16612-9_11�. �inria-00591593�
Axel Legay
1, Benoˆıt Delahaye
1, and Saddek Bensalem
21
INRIA/IRISA, Rennes, France
2
Verimag Laboratory, Universit´ e Joseph Fourier Grenoble, CNRS, France
Abstract. Quantitative properties of stochastic systems are usually specified in logics that allow one to compare the measure of executions satisfying certain temporal properties with thresholds. The model check- ing problem for stochastic systems with respect to such logics is typically solved by a numerical approach [31,8,35,22,21,5] that iteratively com- putes (or approximates) the exact measure of paths satisfying relevant subformulas; the algorithms themselves depend on the class of systems being analyzed as well as the logic used for specifying the properties. An- other approach to solve the model checking problem is to simulate the system for finitely many executions, and use hypothesis testing to infer whether the samples provide a statistical evidence for the satisfaction or violation of the specification. In this tutorial, we survey the statis- tical approach, and outline its main advantages in terms of efficiency, uniformity, and simplicity.
1 Introduction and Context
Quantitative properties of stochastic systems are usually specified in logics that allow one to compare the measure of executions satisfying certain temporal prop- erties with thresholds. The model checking problem for stochastic systems with respect to such logics is typically solved by a numerical approach that iteratively computes (or approximates) the exact measure of paths satisfying relevant sub- formulas. The algorithm for computing such measures depends on the class of stochastic systems being considered as well as the logics used for specifying the correctness properties. Model checking algorithms for a variety of contexts have been discovered [2,13,8] and there are mature tools (see e.g. [25,7]) that have been used to analyze a variety of systems in practice.
Despite the great strides made by numerical model checking algorithms, there are many challenges. Numerical algorithms work only for special systems that have certain structural properties. Further the algorithms require a lot of time and space, and thus scaling to large systems is a challenge. Finally, the logics for which model checking algorithms exist are extensions of classical temporal logics, which are often not the most popular among engineers.
Another way to verify quantitative properties is to use a simulation-based approach. The key idea is to deduce whether or not the system satisfies the property by observing some of its executions with a monitoring procedure [1,19], and use hypothesis testing to infer whether the samples provide a statistical
G. Ro¸su et al. (Eds.): RV 2010, LNCS 6418, pp. 122–135, 2010.
c Springer-Verlag Berlin Heidelberg 2010
evidence for the satisfaction or violation of the specification [42]. Of course, in contrast to a numerical approach, a simulation-based solution does not guarantee a correct result. However, it is possible to bound the probability of making an error. Simulation-based methods are known to be far less memory and time intensive than numerical ones, and are sometimes the only option [41].
The crux of the statistical model checking approach is that since sample ex- ecutions of a stochastic system are drawn according to the distribution defined by the system, they can be used to get estimates of the probability measure on executions. Starting from time-bounded Probabilistic Computational Tree Logic properties [42], the technique has been extended to handle properties with un- bounded until operators [33], as well as to black-box systems [32,37]. Tools based on this idea have been built [34,39], and they have been used to analyze many systems.
This approach enjoys many advantages. First, these algorithms only require that the system be executable (or rather, sample executions be drawn according to the measure space defined by the system). Thus, it can be applied to larger class of systems than numerical model checking algorithms including black-box systems and infinite state systems. Second the approach can be generalized to a larger class of properties, including Fourier transform based logics. Finally, the algorithm is easily parallelizable, which can help scale to large systems. However, the statistical approach also has some disadvantages when compared with the numerical approach. First, it only provides probabilistic guarantees about the correctness of the algorithms answer. Next, the sample size grows very large if the model checker’s answer is required to be highly accurate. Finally, the statistical approach only works for purely probabilistic systems, i.e., those that do not have any nondeterminism. Furthermore, since statistical tests are used to determine the correctness of a system, the approach only works for systems that
“robustly” satisfy a given property, i.e., the actual measure of paths satisfying a given subformula, is bounded away from the thresholds to which it is compared in the specification.
In this tutorial, we will overview existing statistical model checking algorithms and discuss their efficiency. We will also overview existing tools and case studies and discuss future work.
2 Our Objective
We consider a stochastic system S and a property φ . An execution of S is a possibly infinite sequence of states of S . Our objective is to solve the probabilistic model checking problem, i.e., to decide whether S satisfies φ with a probability greater or equal to a certain threshold θ . The latter is denoted S | = P
≥θ( φ ), where P is called a probabilistic operator. This paper will overview solutions to this problem. These solutions depend on the nature of S and φ . We consider three cases.
1. We first assume that S is a white-box system, i.e., that one can generate
as much executions of the system as we want. We also assume that φ does
not contain probabilistic operators. In Section 3, we recall basic statistical algorithms that can be used to verify bounded properties (i.e., properties that can be monitored on fixed-length execution) of white-box systems.
2. In Section 4, we discuss extensions to the full probabilistic computation tree logic[8]. There, we consider the case where φ can also contain probabilistic operators and the case where it has to be verified on infinite executions.
3. In Section 5, we briefly discuss the verification of black-box systems, i.e.
systems for which a part of the probability distribution is unknown.
In Section 6, we will present various experiments that show that (1) statistical model checking algorithms can be more efficient than numerical ones, and (2) statistical model checking algorithms can be applied to solve problems that are beyond the scope of numerical methods. Finally, Section 7 discusses our vision of the future of statistical model checking.
Remark 1. The objective of the tutorial is not to feed the reader with technical details, but rather to introduce statistical model checking, and outline its main advantages in terms of efficiency, uniformity, and simplicity.
Remark 2. There are other techniques that allow to estimate the probability for S to satisfy φ . These approaches (that are based on Monte-Carlo techniques) will not be covered in this paper. The interested reader is redirected to [17,20,26] for more details.
Remark 3. Statistical Model Checking also applies to non stochastic systems [17].
This topic will not be covered in this tutorial.
3 Statistical Model Checking : The Beginning
In this section, we overview several statistical model checking techniques. We assume that S is a white-box system and that φ is a bounded property. By bounded properties, we mean properties that can be defined on finite executions of the system. In general, the length of such executions has to be precomputed.
Let B
ibe a discrete random variable with a Bernoulli distribution of param- eter p . Such a variable can only take two values 0 and 1 with P r [ B
i= 1] = p and P r [ B
i= 0] = 1 − p . In our context, each variable B
iis associated with one simulation of the system. The outcome for B
i, denoted b
i, is 1 if the simulation satisfies φ and 0 otherwise. To make sure that the above approach works, one has to make sure that one can get the result of any experiment in a finite amount of time. In general, this means that we are considering bounded properties, i.e., properties that can be decided on finite executions.
Remark 4. All the results presented in this section are well-known mathemati-
cal results coming from the area of statistics. As we shall see, these results are
sufficient to verify bounded properties of a large class of systems. As those prop-
erties are enough in many practical applications, one could wonder whether the
contribution of the computer scientist should not be at the practical level rather
than at the theoretical one.
Before going further one should answer one last question: “What is the class of models that can be considered?” In fact, statistical model checking can be applied to any stochastic system and logic on which one can define a probability space for the property under consideration. Hence, the approach provides a uniform way for the verification of a wide range of temporal logic properties over vari- ous stochastic models, including Markov Chains or Continuous Timed Markov Chains [35,3,2]. In general, it is not necessary to make the hypothesis that the system has the Markovian property
1, except when working with nested formulas (see Section 4). It is worth mentioning that the technique cannot be applied to systems that combine both nondeterministic and stochastic aspects (such as Markov Decision Processes). Indeed, the simulation-based approach could not distinguish between the probability distributions that are sampled.
3.1 Qualitative Answer Using Statistical Model Checking
The main approaches [38,32] proposed to answer the qualitative question are based on hypothesis testing. Let p = P r ( φ ), to determine whether p ≥ θ , we can test H : p ≥ θ against K : p < θ . A test-based solution does not guarantee a correct result but it is possible to bound the probability of making an error.
The strength ( α, β ) of a test is determined by two parameters, α and β , such that the probability of accepting K (respectively, H ) when H (respectively, K ) holds, called a Type-I error (respectively, a Type-II error ) is less or equal to α (respectively, β ).
A test has ideal performance if the probability of the Type-I error (respectively, Type-II error) is exactly α (respectively, β ). However, these requirements make it impossible to ensure a low probability for both types of errors simultaneously (see [38] for details). A solution to this problem is to relax the test by working with an indifference region ( p
1, p
0) with p
0≥p
1( p
0− p
1is the size of the region).
In this context, we test the hypothesis H
0: p ≥ p
0against H
1: p ≤ p
1instead of H against K . If the value of p is between p
1and p
0(the indifference region), then we say that the probability is sufficiently close to θ so that we are indifferent with respect to which of the two hypotheses K or H is accepted. The thresholds p
0and p
1are generally defined in term of the single threshold θ , e.g., p
1= θ − δ and p
0= θ + δ . We now need to provide a test procedure that satisfies the requirements above. In the next two subsections, we recall two solutions proposed by Younes in [38,43].
Single Sampling Plan. To test H
0against H
1, we specify a constant c . If
ni=1
b
iis larger than c , then H
0is accepted, else H
1is accepted. The difficult part in this approach is to find values for the pair ( n, c ), called a single sampling plan (SSP in short), such that the two error bounds α and β are respected. In practice, one tries to work with the smallest value of n possible so as to minimize the number of simulations performed. Clearly, this number has to be greater if α and β are smaller but also if the size of the indifference region is smaller. This results in
1
The Markovian property ensures that the probability to go from a state s to a next
state only depends on s, not on the states that have been visited before reaching s.
an optimization problem, which generally does not have a closed-form solution except for a few special cases [38]. In his thesis [38], Younes proposes a binary search based algorithm that, given p
0, p
1, α, β , computes an approximation of the minimal value for c and n .
Remark 5. There are many variants of this algorithm. As an example, in [33], Sen et al. proposes to accept H
0if
(n
i=1bi)
n
≥p . Here, the difficulty is to find a value for n such that the error bounds are valid.
Sequential probability ratio test. The sample size for a single sampling plan is fixed in advance and independent of the observations that are made. However, taking those observations into account can increase the performance of the test.
As an example, if we use a single plan ( n, c ) and the m > c first simulations satisfy the property, then we could (depending on the error bounds) accept H
0without observing the n − m other simulations. To overcome this problem, one can use the sequential probability ratio test (SPRT in short) proposed by Wald [36]. The approach is briefly described below.
In SPRT, one has to choose two values A and B ( A > B ) that ensure that the strength of the test is respected. Let m be the number of observations that have been made so far. The test is based on the following quotient:
p
1mp
0m=
m i=1P r ( B
i= b
i| p = p
1)
P r ( B
i= b
i| p = p
0) = p
d1m(1 − p
1)
m−dmp
d0m(1 − p
0)
m−dm, (1) where d
m=
mi=1
b
i. The idea behind the test is to accept H
0if
pp1m0m
≥ A , and H
1if
pp1m0m
≤ B . The SPRT algorithm computes
pp1m0m
for successive values of m until either H
0or H
1is satisfied; the algorithm terminates with probabil- ity 1[36]. This has the advantage of minimizing the number of simulations. In his thesis [38], Younes proposed a logarithmic based algorithm SPRT that given p
0, p
1, α and β implements the sequential ratio testing procedure.
Discussion. Computing ideal values A
idand B
idfor A and B in order to make sure that we are working with a test of strength ( α, β ) is a laborious procedure (see Section 3.4 of [36]). In his seminal paper [36], Wald showed that if one de- fines A
id≥A =
(1−βα )and B
id≤ B =
(1−αβ ), then we obtain a new test whose strength is ( α
, β
), but such that α
+ β
≤ α + β , meaning that either α
≤α or β
≤ β . In practice, we often find that both inequalities hold. This is illustrated with the following example taken from [38].
Example 1. Let p
0= 0 . 5, p
1= 0 . 3, α = 0 . 2 and β = 0 . 1. If we use A
id≥A =
(1−β)
α