• Aucun résultat trouvé

On the Quest for Representative Behavioral Datasets: Mobility and Content Demand

N/A
N/A
Protected

Academic year: 2021

Partager "On the Quest for Representative Behavioral Datasets: Mobility and Content Demand"

Copied!
2
0
0

Texte intégral

(1)

HAL Id: hal-01323917

https://hal.inria.fr/hal-01323917

Submitted on 31 May 2016

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of sci-entific research documents, whether they are pub-lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

On the Quest for Representative Behavioral Datasets: Mobility and Content Demand

Guangshuo Chen, Sahar Hoteit, Aline Carneiro Viana, Marco Fiore

To cite this version:

Guangshuo Chen, Sahar Hoteit, Aline Carneiro Viana, Marco Fiore. On the Quest for Representative Behavioral Datasets: Mobility and Content Demand. CHIST-ERA Projects Seminar 2016, May 2016, Bern, Switzerland. �hal-01323917�

(2)

On the Quest for Representative

Behavioral Datasets: Mobility and

Content Demand

Guangshuo CHEN, Sahar HOTEIT, Aline C. Viana, Marco FIORE

{guangshuo.chen, sahar.hoteit, aline.viana}@inria.fr, marco.fiore@ieiit.cnr.it

1. Objectives

Understand correlations between human mobility and data demand. Objectives:

• Extract fully-featured dataset from cellular traces;

• Construct mobility: identify Home locations, and estimate trajecto-ries;

• Characterize user behaviors in terms of time, space and volume (data).

2. Dataset Description

17,366 subscribers are extracted by applying a series of filters on cellular traces, consisting of 2,398,392 calls and 954,737 sessions in 4 weeks.

Session Location Estimation

Using call detail records, we infer users’ locations for data sessions.

,

Call Data Session

Based on trajectory estimation, at least 70% of sessions (80% of vol-ume) are traceable for 80% of subscribers.

% of number of session in total per (user,day)

0 20 40 60 80 100 CDF 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Case 1 Case 2 Traceable Untraceable

% of volume of traffic in total per (user,day)

0 20 40 60 80 100 CDF 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Case 1 Case 2 Traceable Untraceable

Decomposition of Cells in Mexico City

Cellular Cells (Voronoi) Population Density

3. Analysis

Home Location Identification

For each user, the most visited cell between 10pm and 7am is identified as his/her HOME. Confidence 0 0.2 0.4 0.6 0.8 1 CDF 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 All Day HOME WORK Clustering Sessions

Sessions (space, time, volume) are clustered by DBScan.

For each cluster, Relative Cohesion is calculated as a function of time, space or volume, respectively.

RC(∗) = P p∈C dist (∗)(p, c)2 P p∈C dist(p, c)2 (1)

RC(loc) + RC(time) + RC(vol) = 1 (2)

4. Future Works

• Estimate trajectories’ incomplete information; • Model volume demand;

• Characterize content demand;

• Link and predict mobility (important locations) and volume/content demand simultaneously.

Références

Documents relatifs

Pour communiquer directement avec un auteur, consultez la première page de la revue dans laquelle son article a été publié afin de trouver ses coordonnées.. Si vous n’arrivez pas

To name just a few, such soft constraints concern the choice of the price adjustment process, the interpretative and descriptive potential of conditions for stability, the

The question therefore addressed in this text, based on a field study of newborn screening for cystic fibrosis in France, is that of the link between the quest

In particular fluctuations in laser frequency or power can couple to interferometer length asymmetries to give dangerous phase noises than can compete with the shot noise.. The laser

When we monitored tRNA ribocc in Scp160p-depleted cells, it turned out that the tRNA species the most depleted from the ribosome are the ones that are less optimized for tRNA

The tail distributions of the number of different GSM antennas, WiFi access points, and POI per user (Figure 3a) ; and the number of unique users per GSM antenna, WiFi access point,

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des

Les faciès de vasiëre interne, à sédimentation détritique fine, sont caractérisés par une macrofaune réduite, et un microplancton abondant et bien conservé» Les