
Heterogeneous Contact Rates over Random Connectivity Graphs

Performance Modeling for Heterogeneous Mobility and

3.1 Performance under Heterogeneous Mobility

3.1.1 Heterogeneous Contact Rates over Random Connectivity Graphs

Definition 3.1.1 (Heterogeneous Full Contact Network). The sequence of contact events between each pair of nodes $\{i, j\}$ is independent of other pairs and is given by a Poisson process with rate $\lambda_{ij}$. The contact rates $\lambda_{ij}$ are drawn independently from an arbitrary distribution $f_\lambda(\lambda)$, $\lambda \in (0, \infty)$, with finite mean $\mu_\lambda$ and variance $\sigma_\lambda^2$.

Figure 3.1: Markov chain for epidemic spreading over a homogeneous network with $N$ nodes

Figure 3.2: Markov chain for epidemic spreading over a heterogeneous network with 4 nodes

The above model is a generalization of the standard IID inter-contact model, used also in the previous chapter, which assumes $\lambda_{ij} = \lambda > 0$ for all pairs. Different choices of $f_\lambda(\lambda)$ can describe a significantly broader range of scenarios. For example, large $\sigma_\lambda^2$ values imply that the contact frequencies between different pairs are very heterogeneous, e.g. some pairs will rarely contact each other while others do so much more often.

An $f_\lambda(\lambda)$ that is symmetric around $\mu_\lambda$ (e.g. a uniform distribution) implies a balanced number of high and low contact rates, while a right-skewed $f_\lambda(\lambda)$ (e.g. Pareto) describes a network in which most pairs have large inter-contact times but a few contact each other very frequently. Small $\mu_\lambda$ values could correspond to slow-moving nodes, e.g. pedestrians, or to large geographical areas. Finally, multi-modal $f_\lambda(\lambda)$ functions might approximate scenarios with some hierarchical structure. Note that, conditional on $f_\lambda$, we get a class of random matrices $\Lambda = \{\lambda_{ij}\}$. Our goal is to analyze the expected performance of epidemic spreading across the possible instances in this class.
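As an illustration of how one instance of the class $\Lambda$ could be drawn, the following is a minimal sketch, not the procedure used for the results in this chapter. The function names (`sample_contact_rates`, `pair_rates`, `pareto_rate`) and the Pareto shape parameter are assumptions made for the example.

```python
import random
import statistics


def sample_contact_rates(n, draw_rate, seed=None):
    """Return a symmetric n x n matrix of pairwise contact rates.

    draw_rate is a hypothetical callback: it receives a random.Random
    instance and returns one positive rate drawn from f_lambda.
    """
    rng = random.Random(seed)
    rates = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            lam = draw_rate(rng)
            rates[i][j] = rates[j][i] = lam  # a contact involves both nodes
    return rates


def pair_rates(rates):
    """Flatten the upper triangle: one rate per unordered pair {i, j}."""
    n = len(rates)
    return [rates[i][j] for i in range(n) for j in range(i + 1, n)]


def pareto_rate(rng, shape=3.0):
    """Right-skewed example for f_lambda (assumed shape; finite variance)."""
    return rng.paretovariate(shape)


rates = sample_contact_rates(50, pareto_rate, seed=1)
values = pair_rates(rates)
mu = statistics.mean(values)
cv = statistics.pstdev(values) / mu
print(f"empirical mean rate = {mu:.3f}, empirical CV = {cv:.3f}")
```

Swapping `pareto_rate` for, say, `lambda rng: rng.uniform(0.5, 1.5)` gives a symmetric $f_\lambda$ with the same interface.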

To understand the complexity of the problem, let us first assume the simple case of $\lambda_{ij} = \lambda > 0$. We can model epidemic spreading with a pure-birth Markov chain, as depicted in Fig. 3.1, where a state $k$ denotes the number of "infected" nodes (i.e. nodes with the message). In this homogeneous contact network, it is easy to show that the step time $T_{k,k+1}$ (i.e. the time to move from state $k$ to state $k+1$) is exponentially distributed with rate $k(N-k)\lambda$, because each of the $k(N-k)$ infected-susceptible pairs meets according to an independent Poisson process. Its expected value is then given by $E[T_{k,k+1}] = \frac{1}{k(N-k)\lambda}$ and, therefore, one could straightforwardly calculate the expected spreading time (broadcast, anycast or unicast).
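For the homogeneous case, the expected broadcast time follows by summing the expected step times over $k = 1, \dots, N-1$. The sketch below is only an illustration of that sum; the function names are chosen for the example and are not from the original text.

```python
def expected_step_time(k, n, lam):
    """E[T_{k,k+1}] = 1 / (k (N - k) lambda) in the homogeneous network."""
    return 1.0 / (k * (n - k) * lam)


def expected_broadcast_time(n, lam):
    """Expected time until all n nodes hold the message, starting from one."""
    return sum(expected_step_time(k, n, lam) for k in range(1, n))


def harmonic(m):
    """m-th harmonic number, used for the closed-form cross-check."""
    return sum(1.0 / i for i in range(1, m + 1))


n, lam = 100, 0.01
direct = expected_broadcast_time(n, lam)
# Partial fractions give sum_k 1/(k(N-k)) = (2/N) * H_{N-1}.
closed_form = 2.0 * harmonic(n - 1) / (n * lam)
print(direct, closed_form)
```

The two printed values should agree up to floating-point rounding, which is a quick sanity check on the summation.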

While we could still use a Markov chain for the heterogeneous contact network, in order to find $E[T_{k,k+1}]$ we now need to know exactly which nodes are included in the $k$ infected nodes. As an example, in Fig. 3.2 we present the Markov chain of epidemic message spreading in a heterogeneous network with four nodes, $\{A, B, C, D\}$. This Markov chain is composed of 15 states, whereas the respective Markov chain of a homogeneous network with 4 nodes would be composed of only 4 states. Hence, it becomes evident that the complexity increases quickly, even for this simple 4-node network. In a network with $N$ nodes there will be $\binom{N}{k}$ different states for step $k$, each with a potentially different probability.
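The state counts quoted above can be reproduced with a few lines. This is a small sketch with illustrative function names, counting every non-empty set of infected nodes as one state of the heterogeneous chain.

```python
from math import comb


def states_at_step(n, k):
    """Number of distinct infected sets of size k among n nodes."""
    return comb(n, k)


def total_states_heterogeneous(n):
    """All non-empty infected sets: sum_k C(n, k) = 2^n - 1."""
    return sum(states_at_step(n, k) for k in range(1, n + 1))


def total_states_homogeneous(n):
    """Only the number of infected nodes matters: states k = 1..n."""
    return n


for n in (4, 10, 20, 50):
    print(n, total_states_homogeneous(n), total_states_heterogeneous(n))
```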

Lemma 1. The expected delay for the transition from step $k$ to step $k+1$ is given by

While keeping track of the probabilities in the above lemma could be done recursively, the state space grows exponentially fast, so even numerical solutions [126] are infeasible beyond very simple problems. Instead, we prove that, in the limit of large $N$ (number of nodes), the majority of such starting states become statistically equivalent and the approximation error from using the mean value for $S_k^m$ goes to 0.

This is captured by the following main result. The proof is technical and can be found in [29].

Theorem 1.1. As the network size increases, the relative error $RE_k$ between the expected step delay $E[T_{k,k+1}]$ and the quantity $\frac{1}{k(N-k)\mu_\lambda}$ converges to zero in probability, i.e. $RE_k \xrightarrow{p} 0$ as $N \to \infty$.

In Table 3.1, we present the values of the relative error $RE_k$ (Theorem 1.1) in synthetic simulation scenarios with different network sizes $N$ and contact rate heterogeneity $CV_\lambda = \frac{\sigma_\lambda}{\mu_\lambda}$. The values in Table 3.1 correspond to the relative error $RE_k$ averaged over all the steps $k$ of the epidemic process and over 100 different network instances $\Lambda$ with equivalent characteristics ($N$, $f_\lambda$). It can be seen that in networks with higher heterogeneity ($CV_\lambda$) the errors are larger, as our theory predicts. However, as the network size increases, the errors for all scenarios become very small.

The decrease of the relative errors can also be observed in Fig. 3.3, where we present the distribution (boxplots) of the values of $RE_k$ over the different network instances. Here, the relative errors are not averaged over steps; instead, we present $RE_k$ at the steps that correspond to 20% (e.g. in the scenario with $N = 100$, the relative error at step $k = 20$) and 70% of the spreading process, in Fig. 3.3(a) and Fig. 3.3(b), respectively. These plots show that the error not only decreases on average, but does so for almost all instances of that contact class.
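A rough way to reproduce this kind of experiment is sketched below. It is an illustration under stated assumptions, not the simulator behind Table 3.1 or Fig. 3.3: rates are drawn from an exponential distribution (mean 1, so $CV_\lambda = 1$), only one network instance is used per size, and the relative error is assumed to be $|E[T_{k,k+1}] - \frac{1}{k(N-k)\mu_\lambda}| / E[T_{k,k+1}]$, with $E[T_{k,k+1}]$ estimated by replaying the epidemic. All function names are hypothetical.

```python
import random


def sample_rates(n, rng):
    """Symmetric matrix of i.i.d. exponential rates (assumed f_lambda)."""
    rates = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            rates[i][j] = rates[j][i] = rng.expovariate(1.0)
    return rates


def mean_step_delays(rates, runs, rng):
    """Estimate E[T_{k,k+1}] for k = 1..n-1 by replaying the epidemic.

    Each step adds 1/S, the conditional mean of an exponential step time
    with total rate S, instead of sampling the step time itself.
    """
    n = len(rates)
    totals = [0.0] * n
    for _ in range(runs):
        source = rng.randrange(n)
        # pressure[j]: total rate at which infected nodes meet susceptible j
        pressure = {j: rates[source][j] for j in range(n) if j != source}
        for k in range(1, n):
            totals[k] += 1.0 / sum(pressure.values())
            nodes = list(pressure)
            weights = [pressure[j] for j in nodes]
            newly_infected = rng.choices(nodes, weights=weights)[0]
            del pressure[newly_infected]
            for j in pressure:
                pressure[j] += rates[newly_infected][j]
    return [total / runs for total in totals[1:]]


def average_relative_error(n, mu, runs, rng):
    """Relative error (assumed definition) averaged over all steps k."""
    delays = mean_step_delays(sample_rates(n, rng), runs, rng)
    errors = []
    for k, simulated in enumerate(delays, start=1):
        predicted = 1.0 / (k * (n - k) * mu)
        errors.append(abs(simulated - predicted) / simulated)
    return sum(errors) / len(errors)


rng = random.Random(7)
for n in (20, 40):
    print(n, round(average_relative_error(n, 1.0, 100, rng), 3))
```

Averaging over many network instances and trying other rate distributions would be needed before comparing numbers with Table 3.1.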

While the above approximation becomes exact for large $N$, the error can be non-negligible for finite networks. For such networks we can derive a better, second-order approximation using the Delta method known from statistics. Details can be found in [29].

Lemma 2. For a heterogeneous contact network following Definition 3.1.1, the expected step delay can be approximated by

E[Tk,k+1] = 1
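For orientation, a textbook second-order Delta-method expansion of $E[1/S]$ can be worked out as follows. This is a sketch under an explicit assumption and is not claimed to be the exact statement of the lemma in [29]: $S_k$ denotes the sum of the $k(N-k)$ contact rates between infected and susceptible nodes, and these rates are treated as i.i.d. draws from $f_\lambda$, independent of which nodes are infected.

```latex
% Assumption: S_k is a sum of k(N-k) i.i.d. rates with mean \mu_\lambda
% and variance \sigma_\lambda^2; g(s) = 1/s, so g''(s) = 2/s^3.
\begin{align*}
E[S_k] &= k(N-k)\,\mu_\lambda, \qquad
\operatorname{Var}[S_k] = k(N-k)\,\sigma_\lambda^2, \\
E\!\left[\frac{1}{S_k}\right]
  &\approx \frac{1}{E[S_k]} + \frac{\operatorname{Var}[S_k]}{E[S_k]^{3}} \\
  &= \frac{1}{k(N-k)\,\mu_\lambda}
     \left(1 + \frac{\sigma_\lambda^2}{k(N-k)\,\mu_\lambda^2}\right)
   = \frac{1}{k(N-k)\,\mu_\lambda}
     \left(1 + \frac{CV_\lambda^2}{k(N-k)}\right).
\end{align*}
```

Under this assumption the correction term vanishes as $k(N-k)$ grows and increases with $CV_\lambda$, which matches the qualitative trend in Table 3.1.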

Table 3.1: Relative step delay error $RE_k$, averaged over all steps and over 100 network instances

                      N = 50    N = 100   N = 200   N = 500
$CV_\lambda = 0.5$     2.8%      2.7%      2.6%      2.5%
$CV_\lambda = 1$       4.2%      3.1%      2.7%      2.6%
$CV_\lambda = 1.5$     8.2%      4.6%      3.2%      2.6%
$CV_\lambda = 3$      34.1%     15.3%      8.2%      3.8%

[Figure 3.3: two boxplot panels of $RE_k$ (vertical axis, 0 to 1) against network size $N \in \{20, 50, 100, 200, 500\}$ (horizontal axis): (a) spreading step 20%, (b) spreading step 70%]

Figure 3.3: Relative step error at step (a) $k = 0.2 \cdot N$ (i.e. when the message has spread to 20% of the network) and (b) $k = 0.7 \cdot N$. Each boxplot corresponds to a different network size $N$ (with $\mu_\lambda = 1$ and $CV_\lambda = 1.5$) and shows the distribution of the relative step error $RE_k$ over 100 different network instances of that size.

The above results can be further generalized to networks where some nodes never meet. Specifically, we can assume an underlying Poisson connectivity graph with parameter $p_s$, where a fraction $1 - p_s$ of randomly chosen pairs do not communicate at all (i.e. $\lambda_{ij} = 0$), and the non-zero contact rates are again drawn according to Def. 3.1.1. Then, the following corollary holds:

Corollary 1. Under a Heterogeneous Poisson Mixing Contact Network, where a contact pair either never meets (with probability $1 - p_s$) or meets regularly (with probability $p_s$) according to Definition 3.1.1, all previous theoretical results hold after substituting the moments of the contact rate distribution ($\mu_\lambda$ and $\sigma_\lambda^2$) with the expressions

$$\mu_{\lambda(p)} = p_s \cdot \mu_\lambda, \tag{3.5}$$

$$\sigma_{\lambda(p)}^2 = p_s \cdot \left[\sigma_\lambda^2 + \mu_\lambda^2 \cdot (1 - p_s)\right]. \tag{3.6}$$

With these basic results in hand, we can derive or approximate the performance of epidemic routing as well as other DTN routing variants (e.g. Spray and Wait). Details and additional plots for synthetic scenarios, which corroborate our results, can be found in [29].
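The substitution in Eqs. (3.5) and (3.6) can be evaluated directly, as in the following small sketch. The function name `sparse_moments` and the example parameter values are assumptions made for illustration.

```python
def sparse_moments(mu, var, p_s):
    """Moments of the contact rate when a pair is connected with prob. p_s.

    mu and var are the mean and variance of the non-zero rates;
    the return value is (mu_p, var_p) as in Eqs. (3.5) and (3.6).
    """
    if not 0.0 < p_s <= 1.0:
        raise ValueError("p_s must lie in (0, 1]")
    mu_p = p_s * mu
    var_p = p_s * (var + mu ** 2 * (1.0 - p_s))
    return mu_p, var_p


# Example values (assumed): mu = 1, CV = 1.5 on connected pairs, p_s = 0.3.
mu_p, var_p = sparse_moments(1.0, 1.5 ** 2, 0.3)
cv_p = var_p ** 0.5 / mu_p
print(f"mu_p = {mu_p:.3f}, var_p = {var_p:.3f}, effective CV = {cv_p:.3f}")
```

Note that sparsity raises the effective coefficient of variation even when the non-zero rates are unchanged, since the zero-rate pairs add spread around a smaller mean.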

Figure 3.4: Box-plots of the message delivery delay under (a) epidemic, (b) 2-hop routing, and (c) SnW (with L = 6 copies) routing. On each box, the central horizontal line is the median, the edges of the box are the 25th and 75th percentiles, the whiskers extend to the most extreme data points not considered outliers, and outliers are plotted individually as crosses. The thick (black) horizontal lines represent the theoretical values predicted by our model.

In Table, we just show the analytical expressions for these protocols, for reference.

We also show sample performance results for real mobility traces and realistic mobility models. Analytical results against synthetic contact scenarios conforming to the assumptions of our model show remarkable accuracy, even for modestly sized networks, and are omitted here. The traces we consider are: (i) Cabspotting [127], which contains GPS coordinates from 536 taxi cabs collected over 30 days in San Francisco, and (ii) Infocom [128], which contains traces of Bluetooth sightings of 78 mobile nodes from the 4-day iMotes experiment during Infocom 2006. In addition to these real traces, we generated mobility traces with two recent mobility models that have been shown to capture well different aspects of real mobility traces, namely TVCM [129] and SLAW [130]. In order to compare with analysis, we parse each trace and estimate the mean contact rate for all pairs $\{i, j\}$. We then produce estimates for the 1st and 2nd moments of these rates, $\hat{\mu}_\lambda$ and $\hat{\sigma}_\lambda^2$, as well as the percentage of connected pairs $\hat{p}$, and use them in our analytical expressions.
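The estimation step described above could look roughly like the sketch below. The trace format (a list of `(time, node_a, node_b)` contact events), the estimator (number of contacts divided by trace duration) and the choice to compute the moments over connected pairs only are all assumptions made for this example; the original parsing procedure is not specified here.

```python
from collections import Counter
import statistics


def estimate_contact_statistics(contacts, nodes, duration):
    """Return (mu_hat, var_hat, p_hat) from a list of contact events.

    contacts: iterable of (time, node_a, node_b) tuples (hypothetical format)
    nodes:    list of all node identifiers in the trace
    duration: length of the observation window, in the trace's time unit
    """
    counts = Counter(
        frozenset((a, b)) for _, a, b in contacts if a != b
    )
    total_pairs = len(nodes) * (len(nodes) - 1) // 2
    # Per-pair rate estimate: contacts observed per unit time.
    pair_rate = [count / duration for count in counts.values()]
    if not pair_rate:
        return 0.0, 0.0, 0.0
    mu_hat = statistics.mean(pair_rate)
    var_hat = statistics.pvariance(pair_rate)
    p_hat = len(pair_rate) / total_pairs
    return mu_hat, var_hat, p_hat


# Toy trace (made up for the example): four nodes, three contact events.
toy_contacts = [(1.0, "a", "b"), (2.0, "b", "a"), (3.0, "a", "c")]
toy_nodes = ["a", "b", "c", "d"]
mu_hat, var_hat, p_hat = estimate_contact_statistics(toy_contacts, toy_nodes, 10.0)
print(mu_hat, var_hat, p_hat)
```

The resulting triple plays the role of $\mu_\lambda$, $\sigma_\lambda^2$ and the fraction of connected pairs in the analytical expressions.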

Fig. 3.4 compares the theoretical performance of three protocols, namely epidemic routing, 2-hop routing, and Spray and Wait (SnW), according to our analytical expressions, with simulation results. Source-destination pairs are chosen randomly in different runs and messages are generated at random points of the trace. In the next chapter, we will consider scenarios where traffic demand is not random among pairs of nodes.

The first thing to observe is that delay values span a wide range for different source-destination pairs. This implies a large amount of heterogeneity in the "reachability" of different nodes. Our analytical predictions are shown as thick dark horizontal lines. As can be seen, our result is in most cases close to the median and in almost all cases between the 25th and 75th percentiles of the delay observed in both the real traces and the mobility models.

It is somewhat remarkable that our delay predictors are close to the actual results (qualitatively or even quantitatively in some cases) in a range of real or realistic scenarios; studies of these scenarios reveal considerable differences from the much simpler contact classes for which our results are derived. We should also be careful not to jump to generalizations about the accuracy of these results in all real scenarios, as we are aware of situations that could force our predictors to err significantly. Nevertheless, we believe these results are quite promising in the direction of finding simple, usable analytical expressions even for complex, heterogeneous contact scenarios.