Revisiting Pitfalls of DTN Datasets Statistical Analysis

Academic year: 2021

Gwilherm Baudic, Tanguy Pérennou and Emmanuel Lochin

firstname.lastname@isae.fr

DMIA, ISAE, University of Toulouse, France


Contents

1 Introduction

2 Datasets and assumptions

3 Impact of assumptions on dataset analyses

4 Checklist proposal


Introduction

Datasets are key to DTN performance evaluation, but...

Issues

Data collection is hard to set up

Traces do not capture limitations on node buffers and transfer bandwidth

They may miss some contact opportunities


Datasets studied

Characteristics

                     Rollernet   MIT         Infocom 2005
Technology           Bluetooth   Bluetooth   Bluetooth
Duration (days)      0.125       284         3
Granularity (s)      15          300         120
Internal nodes       62          89          41
Internal contacts    60,146      114,046     22,459


Assumptions

In the following, we focus on:

Choice of nodes

Symmetry of the pairs

Minimum number of contacts

Treatment of 0-second contacts

Dataset time span

Inter-contact definition


Impact on dataset analyses (1/5)

Baseline assumptions

Choice of nodes: internal only.

Symmetry of the pairs: asymmetrical.

Minimum number of contacts: not enforced.

0-second contacts: extended to 1 second.

Power-law parameters α and xmin: xmin is set to the measurement granularity.
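A minimal sketch of the baseline fit, assuming the Pareto tail is written as P[X>x] = (x/xmin)^(-alpha) for x >= xmin. The slides do not say which convention their alpha follows; under the density-exponent convention of Clauset et al. the value would be this estimate plus 1. The function name and return layout are illustrative only.

```python
import math

def pareto_shape_mle(samples, xmin):
    """Maximum-likelihood shape for the tail P[X>x] = (x/xmin)**(-alpha).

    Assumption: xmin is fixed in advance (here, the measurement granularity).
    Returns (alpha, number of samples kept, number of samples discarded).
    """
    tail = [x for x in samples if x >= xmin]
    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum == 0.0:
        raise ValueError("need at least one sample strictly above xmin")
    return len(tail) / log_sum, len(tail), len(samples) - len(tail)

# Hypothetical durations in seconds, with xmin set to a 15 s granularity.
alpha, kept, dropped = pareto_shape_mle([1, 15, 30, 60, 120], 15)
print(alpha, kept, dropped)
```

Returning the number of discarded samples makes it visible how much of the dataset a given xmin silently removes from the fit.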


Impact on dataset analyses (2/5)

0-second contacts

First 5000 seconds of Rollernet

Figure: CCDF P[X>x] versus time (s), logarithmic axes. Curves: Rollernet 0s->1s, Rollernet >0s, Rollernet >=15s.


0-second contacts

MIT

Figure: CCDF P[X>x] versus time (s), logarithmic time axis, linear probability axis. Curves: MIT 284 days 0s->1s; Pareto alpha = 1.534, xmin = 300; MIT 284 days >0s.
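The three treatments compared in these plots can be sketched as follows, assuming the input is a plain list of durations in seconds. The policy names and both function names are hypothetical labels chosen for this illustration, not taken from the slides.

```python
from bisect import bisect_right

def apply_zero_policy(durations, policy, granularity=None):
    """Apply one of the three treatments of very short contacts."""
    if policy == "extend":                   # "0s->1s": keep them, lengthened to 1 s
        return [d if d > 0 else 1 for d in durations]
    if policy == "drop_zero":                # ">0s": discard 0-second contacts
        return [d for d in durations if d > 0]
    if policy == "drop_below_granularity":   # ">=15s" for Rollernet
        if granularity is None:
            raise ValueError("granularity is required for this policy")
        return [d for d in durations if d >= granularity]
    raise ValueError("unknown policy: " + policy)

def empirical_ccdf(values):
    """Return [(x, P[X > x])] for each distinct x, in increasing order."""
    ordered = sorted(values)
    n = len(ordered)
    return [(x, (n - bisect_right(ordered, x)) / n) for x in sorted(set(ordered))]

durations = [0, 0, 15, 30]  # made-up example values
print(empirical_ccdf(apply_zero_policy(durations, "extend")))
print(empirical_ccdf(apply_zero_policy(durations, "drop_zero")))
```

On this toy input, half of the samples are 0-second contacts, so the two policies yield visibly different curves from the same trace.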


Impact on dataset analyses (3/5)

Pareto lower bound estimation

Measurement granularity vs. estimation (Clauset et al.)

Infocom 2005 (granularity = 120 s)

Figure: CCDF P[X>x] versus time (s), logarithmic axes. Curves: data; Pareto alpha = 1.886, xmin = 120; Pareto alpha = 2.676, xmin = 1402.
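A rough sketch of lower-bound estimation in the spirit of Clauset et al.: each candidate xmin gets a maximum-likelihood fit, and the candidate with the smallest Kolmogorov-Smirnov distance between the tail data and the fitted model is kept. This is a continuous approximation written for illustration; it assumes the tail form P[X>x] = (x/xmin)^(-alpha), and the function names and the min_tail cutoff are hypothetical, not the authors' procedure.

```python
import math

def ks_distance(tail, xmin, alpha):
    """KS distance between sorted tail data and the fitted Pareto CDF."""
    n = len(tail)
    worst = 0.0
    for i, x in enumerate(tail):
        model_cdf = 1.0 - (x / xmin) ** (-alpha)
        # Compare with the empirical CDF just before and just after x.
        worst = max(worst, abs(model_cdf - i / n), abs(model_cdf - (i + 1) / n))
    return worst

def select_xmin(samples, min_tail=10):
    """Return (ks, xmin, alpha, discarded) for the best candidate xmin."""
    ordered = sorted(samples)
    best = None
    for xmin in sorted(set(ordered)):
        tail = [x for x in ordered if x >= xmin]
        if len(tail) < min_tail or tail[-1] == xmin:
            break  # tails only shrink as xmin grows, so stop here
        alpha = len(tail) / sum(math.log(x / xmin) for x in tail)
        ks = ks_distance(tail, xmin, alpha)
        if best is None or ks < best[0]:
            best = (ks, xmin, alpha, len(ordered) - len(tail))
    if best is None:
        raise ValueError("not enough data to fit a tail")
    return best

# Synthetic data: exact quantiles of a Pareto tail with shape 2 and xmin 1.
data = [(1 - (k - 0.5) / 200) ** (-0.5) for k in range(1, 201)]
ks, xmin, alpha, discarded = select_xmin(data)
print(xmin, alpha, discarded)
```

The last element of the result counts the samples that fall below the selected xmin and are therefore ignored by the fit.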


Impact on dataset analyses (4/5)

Trace length

Rollernet

Figure: CCDF P[X>x] versus time (s), logarithmic axes. Curves: Rollernet 5000s, Rollernet full trace.


Impact on dataset analyses (5/5)

External nodes

First 5000 seconds of Rollernet

Figure: CCDF P[X>x] versus time (s), logarithmic axes. Curves: Rollernet internal, Rollernet internal+external.
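The node and time-span choices can be made explicit in a small pre-processing step. The sketch below assumes each contact is a tuple (node_a, node_b, start, end) with times in seconds, and that a contact belongs to the window if it starts before the cutoff; both the record layout and that rule are assumptions made for this illustration.

```python
def select_contacts(contacts, internal_ids, include_external=False, max_time=None):
    """Keep contacts matching the chosen node set and time window."""
    kept = []
    for node_a, node_b, start, end in contacts:
        if max_time is not None and start >= max_time:
            continue  # contact starts after the analysed window
        both_internal = node_a in internal_ids and node_b in internal_ids
        if not include_external and not both_internal:
            continue  # at least one endpoint is an external device
        kept.append((node_a, node_b, start, end))
    return kept

# Made-up records: nodes 1 and 2 are internal, node 99 is external.
contacts = [(1, 2, 0, 15), (1, 99, 100, 100), (2, 1, 6000, 6030)]
internal = {1, 2}
print(select_contacts(contacts, internal, max_time=5000))
print(select_contacts(contacts, internal, include_external=True, max_time=5000))
```

Keeping these choices as explicit parameters makes it easier to report them alongside the results.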


Checklist proposal

Did I discard some values or periods of the dataset?

Ex.: 0-second contacts, weekends...

Did the fitting method discard some data?

Ex.: Pareto lower bound xmin.

Did I change some values?


Conclusions

Contributions

Summary of pre-analysis assumptions from the literature

Study of their influence on statistical analyses

Strong influence of 0-second contacts and Pareto lower bound estimation

Weaker influence of trace length and external nodes

Checklist proposal

Future work

Research the other assumptions

Extend the work to pairwise metrics
