HAL Id: hal-02605894
https://hal.inrae.fr/hal-02605894
Submitted on 6 Jul 2020HAL is a multi-disciplinary open access
archive for the deposit and dissemination of sci-entific research documents, whether they are pub-lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.
Spatial uncertainty propagation in ICT data analysis
Maxime Lenormand, Thomas Louail, Marc Barthélemy, J.J. Ramasco
To cite this version:
Maxime Lenormand, Thomas Louail, Marc Barthélemy, J.J. Ramasco. Spatial uncertainty propaga-tion in ICT data analysis. Urban Net 2016, Sep 2016, Amsterdam, Netherlands. pp.18. �hal-02605894�
Maxime Lenormand | Irstea, France
21 September 2016
UrbanNet 2016 | Amsterdam
Spatial uncertainty propagation in
ICT data analysis
Thomas Louail | IFISC, Spain Marc Barthelemy | CEA, France
Motivation
"Reality"
Data
(Spatial) Uncertainty Propagation
Example: population size of Netherlands. We may estimate
it as 16,800,000 milion. Perhaps in reality it is
16,967,234; hence error of -1%
(Spatial) Uncertainty Propagation
Example: population size of Netherlands. We may estimate
it as 16,800,000 milion. Perhaps in reality it is
16,967,234; hence error of -1%
Error is usually not know because reality is not know
Crosschecking information
Crosschecking mobility information
Lenormand et al. (2014) Cross-checking different sources of mobility information.
PlosOne, 9(8):e105407. ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 10−4 10−3 10−2 10−1 100 10−4 10−3 10−2 10−1 100
Commuters (Mobile Phone)
C om m ut er s (T w itt er ) ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● 10−4 10−3 10−2 10−1 10−4 10−3 10−2 10−1
Commuters (Mobile Phone)
C om m ut er s (C en su s) ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 10−4 10−3 10−2 10−1 10−4 10−3 10−2 10−1 Commuters (Twitter) C om m ut er s (C en su s) 10−710−610−510−410−310−210−1 10−7 10−6 10−5 10−4 10−3 10−2 10−1 Commuters (Census) C om m ut er s (B B V A )
(Spatial) Uncertainty Propagation
Example: population size of Netherlands. We may estimate
it as 16,800,000 milion. Perhaps in reality it is
16,967,234; hence error of -1%
Error is usually not know because reality is not know
Crosschecking information
Uncertainty propagation analysis
Lenormand et al. (2016) Is spatial information in ICT data reliable?
300,000 mobile phone users' trajectories x 25 two-week periods
Inferring land use and identifying home-work locations
from mobile phone activity
Time (hours)
6 12 18 0 6 12 18 0 6 12 18 0 6 12 18 0 6 12 18 0 6 12 18 0 6 12 18
Lenormand et al. (2015) Comparing and modeling land use organization in cities.
Royal Society Open Science 2, 15052015.
Land use detection
Extraction of 50 independent samples based on
150,000 users activity during on week
Land use detection
0.20 0.25 0.30 0.35 0.40 0.45 0.20 0.25 0.30 0.35 0.40 0.45 0.20 0.25 0.30 0.35 0.40 0.45Residential 0.15 0.20 0.25 0.30 0.35 0.40 0.15 0.20 0.25 0.30 0.35 0.40 0.15 0.20 0.25 0.30 0.35 0.40 Fr action of mobile phone user Business 0.30 0.35 0.40 0.45 0.50 0.55 0.30 0.35 0.40 0.45 0.50 0.55 0.30 0.35 0.40 0.45 0.50 0.55 6 12180 61218 0 612180 6 12180 612180 61218 0 61218 Nightlife Time of day
Land use detection
0 10 20 30 40 50 60Residential Business Nightlife Other
P
ercentage of total surf
ace
Land use type (a) 60 70 80 90 100 0.00 0.05 0.10 0.15
Shared surface area (%)
PDF (b) Residential Business Nightlife Total
Land use detection
50 100 150 200 250 300 65 70 75 80 85 90 95 Number of users (x 103) Shared surf ace area (%) Voronoi Grid 1km Grid 2km Grid 3km 1 2 3 4 5 6 7 8 9 10 11 12 12 11 10 9 8 7 6 5 4 3 2 1Four weeks period
Four w eeks per iod −15 −10 −5 0 5 S − S
Song et al. (2010) Limits of predictability in human mobility. Science 327, 1018-1021. 52% 5% 6% 27%
Home
Most frequented location between 7pm and 7am
Work
Most frequented location between 8am and 5pm on weekdays
Origin-Destination Matrix
Tij: number of individuals living in
cell i and working in cell j
Most frequented locations
Most frequented locations
50 100 150 200 250 300 0.65 0.70 0.75 0.80 0.85 0.90 0.95 Number of users (x 103) λc (a) 50 100 150 200 250 300 0.4 0.6 0.8 Number of users (x 103) λc * (b) 50 100 150 200 250 300 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Number of users (x 103) λl (c) 50 100 150 200 250 300 0.994 0.995 0.996 0.997 0.998 Number of users (x 103) λl (d) Voronoi Grid 1km Grid 2km Grid 3kmMost frequented locations
1 2 3 4 5 6 7 8 9 10 11 12 12 11 10 9 8 7 6 5 4 3 2 1Four weeks period
Four w eeks per iod −0.06 −0.04 −0.02 0.00 0.02 λc− λc
Robustness of home-work locations identified for each user
Most frequented locations
5 10 15 20 25 74 76 78 80 82 84 86 88 Nh Accur acy Home Work
Good agreement between land uses identified from 100,000 users activity signals, with an average of 75% of shared surface area.
Uncertainty on the journey-to-work commuting network is highly dependent of the spatial resolution.
More studies in this spirit need to be done to assess the biases and uncertainty associated with this new data sources.
Take home messages...
Lenormand et al. (2016) Is spatial information in ICT data reliable?