Session: Big Data & Transport Business Convention on Big Data Université Paris-Saclay, 25 novembre 2015
Mobility Analytics through Social and Personal Data
Pierre Senellart
Analyzing Transportation and Mobility Data
Consider an analytics task on transportation and mobility:
Macro-level: How is transportation infrastructure affected by the creation of Université Paris-Saclay?
Micro-level: What are the chances I will be late if I have a meeting with John today?
Current approaches
Current approaches to transportation and mobility analytics:
Deployingsensor networks(LIDAR, cameras, etc.)
Leveraging onconnected devices used for other means (car GPS systems, speed cameras, smart card validators for public
transportation, mobile cell towers, toll gates,etc.) Personalizedsurveys
Potential issues:
Costof deploying new sensors
Datanot publicly available, sometimes preciously guarded by companies as valuable assets
Privacy concerns, data may not be easily redistributed
Latencyin installing new systems and acquiring new forms of data Costly and hard to set up at the macro-level,inaccessible at the micro-level
Current approaches
Current approaches to transportation and mobility analytics:
Deployingsensor networks(LIDAR, cameras, etc.)
Leveraging onconnected devices used for other means (car GPS systems, speed cameras, smart card validators for public
transportation, mobile cell towers, toll gates,etc.) Personalizedsurveys
Potential issues:
Costof deploying new sensors
Datanot publicly available, sometimes preciously guarded by companies as valuable assets
Privacy concerns, data may not be easily redistributed
Latencyin installing new systems and acquiring new forms of data Costly and hard to set up at the macro-level,inaccessible at the micro-level
Current approaches
Current approaches to transportation and mobility analytics:
Deployingsensor networks(LIDAR, cameras, etc.)
Leveraging onconnected devices used for other means (car GPS systems, speed cameras, smart card validators for public
transportation, mobile cell towers, toll gates,etc.) Personalizedsurveys
Potential issues:
Costof deploying new sensors
Datanot publicly available, sometimes preciously guarded by companies as valuable assets
Privacy concerns, data may not be easily redistributed
Latencyin installing new systems and acquiring new forms of data Costly and hard to set up at the macro-level,inaccessible at the micro-level
Social Web: Abundant Source of Data
Hundreds of millions of messages postedeach day on platforms such as Twitter
Widespread use of social mediain city inhabitants (68% of Singaporeans regularly use social media)
Crowdsourcingplatforms (e.g., Amazon Mechanical Turk) can be used to obtain new data or to annotate existing data.
Can be combined withuser-contributedand open data of high quality (e.g., Open Street Map, RATP open data with all locations of bus stops)
A large part of this isfully accessible, free or cheap to obtain Give access to both objectiveand subjective information Complementary to traditional data collection
The Social Web, the Crowd as virtual sensors
Personal Information:
Rich Information at the Individual’s Level
Individuals interested in analytics on issuesmattering to them On these issues, wealth of personal information:
Trajectories tracked by mobile phones (Google Timeline, Apple Location Services)
Calendar information describing appointments, including addresses Contacts
Emails with information about appointments, trips
Virtual assistant services (Wipolo, Tripit) with detailed information about trips
Own and relations’ social media information, with timestamped and geolocated events, pictures, messages
etc.
Digital tracesextremely valuable, especially for mobility analytics, but only exploited by Google, Facebook, Apple, . . . , for their own purposes Givedigital tracesback to users, and the way to exploit them!
This talk
3 use cases for urban data collection from the Web (current research activities at Télécom ParisTech):
User Trajectories and Mobility Networks
Social Sensing for Trajectories of Moving Objects Asking the Right Questions to the Crowd
Outline
User Trajectories and Mobility Networks
Social Sensing for Trajectories of Moving Objects
Asking the Right Questions to the Crowd
Semantics from users’ GPS tracks
Users often equipped with a smartphone with capability of recordingregular timestamped locations from GPS data, WiFi network or cell tower triangulation
Indeed, such information oftenalready recorded to support some applications (Google Now, Frequent Locations, etc.)
Tracks are noisy,incomplete, and not matched with actual transit networks
Little added value for the user so far
Smarter urban mobility
[Montoya et al., 2015]Multimodal
J ourney Smartphone Transportation
Network , , , Multimodal
Itinerary Transit
Schedules , Multimodal Itinerary A
B
A 123
8:45 8:50 9:18 9:23
9:409:48
10:00 10:10 8:45
8:50 9:18 9:23 9:309:31
9:409:48
9:559:56 10:00 10:10
Dynamic Bayesian Network Particle Filter Location
Dynamics Context
Path generation Projection Time
Series
3-axis
Clean-up
Transit Line Recognition Time
Problem: Smart and adaptive recommendations for mobility in cities (transit, bike rental, car, etc.)
Challenge: Should take into accountpersonal information
(calendar, etc.),past trajectories,public information about transit networks and traffic
Methodology: Map GPS tracks to routesof public transport, learn route patterns, connect to personal information
Outline
User Trajectories and Mobility Networks
Social Sensing for Trajectories of Moving Objects
Asking the Right Questions to the Crowd
Geolocation from the Social Web
Much social Web content annotated with location information:
Twitter or Facebook smartphone apps can add location to posts Pictures uploaded on Flickr or Instagram from GPS-equipped cameras (e.g., smartphones) often have geolocation data Possible to use information extraction techniques to extract location information from text and semi-structured information
Mobility in smart cities from social data
Problem: Inferpatterns of mobilities of populations within a city from social Web data (e.g., Instagram)
Challenge: Partialand noisy data
Methodology: Aggregate information from numerous users, to identifyhotspots(e.g., tourist attractions) and paths between hotspots; clusterpopulations according to their mobility history
Social sensing of moving objects
[Ba et al., 2014]
Problem: Infertrajectories and meta-information of moving objects from Web and social Web data (e.g., Flickr)
Challenge: Uncertainty and
inconsistencyin extracted information Methodology: Data cleaningby filtering incorrect locations, and truth discovery to identify reliable sources
Outline
User Trajectories and Mobility Networks
Social Sensing for Trajectories of Moving Objects
Asking the Right Questions to the Crowd
Optimizing crowd queries
[Amsterdamer et al., 2013a,b]
Problem: How to answer a specific mining need (e.g., mobility patterns) by asking a sequence of questionson crowdsourcing platforms
Challenge: Crowd accesses are costly;
individual questions asked (e.g., “How often do you use a bike?”, “What is your most common transportation option to your workplace?”) are not independent of each other
Methodology: Maintain at all times a current knowledge of the world and estimate what is the best question(s) to ask given this knowledge.
Merci.
Contributions from, and joint work with:
Télécom ParisTech. Talel Abdessalem, Antoine Amarilli, Mouhamadou Lamine Ba, Anis Bouriga, Sébastien Montenez
ENS Cachan & INRIA Saclay. Serge Abiteboul INRIA Saclay & ENGIE. David Montoya
Tel Aviv University. Yael Amsterdamer, Yael Grossman, Tova Milo National Univ ˙of Singapore. Stéphane Bressan
A*STAR, Singapore. Huayu Wu
Bibliography I
Yael Amsterdamer, Yael Grossman, Tova Milo, and Pierre Senellart.
Crowd mining. InProc. SIGMOD, pages 241–252, New York, USA, June 2013a.
Yael Amsterdamer, Yael Grossman, Tova Milo, and Pierre Senellart.
Crowd miner: Mining association rules from the crowd. InProc.
VLDB, pages 241–252, Riva del Garda, Italy, August 2013b.
Demonstration.
M. Lamine Ba, Sébastien Montenez, Talel Abdessalem, and Pierre Senellart. Monitoring moving objects using uncertain web data. In Proc. SIGSPATIAL, pages 565–568, Dallas, USA, November 2014.
Demonstration.
David Montoya, Serge Abiteboul, and Pierre Senellart. Hup-me:
Inferring and reconciling a timeline of user activity with smartphone and personal data. InProc. SIGSPATIAL, Seattle, USA, November 2015.