• Aucun résultat trouvé

Privacy-preserving Wi-Fi tracking systems

N/A
N/A
Protected

Academic year: 2021

Partager "Privacy-preserving Wi-Fi tracking systems"

Copied!
2
0
0

Texte intégral

(1)

HAL Id: hal-01535820

https://hal.archives-ouvertes.fr/hal-01535820

Submitted on 13 Feb 2020

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Privacy-preserving Wi-Fi tracking systems

Célestin Matte, Marine Minier, Mathieu Cunche, Franck Rousseau

To cite this version:

Célestin Matte, Marine Minier, Mathieu Cunche, Franck Rousseau. Privacy-preserving Wi-Fi tracking systems. Citi Lab PhD day, Apr 2016, Villeurbanne, France. 2015. �hal-01535820�

(2)

Privacy-preserving Wi-Fi tracking systems

C ´elestin Matte

1

, Marine Minier

1

, Mathieu Cunche

1

, Franck Rousseau

2

1

Universit ´e de Lyon, INRIA, INSA-Lyon, CITI-INRIA, France

2

Grenoble Institute of Technology, CNRS Grenoble Informatics Laboratory UMR 5217, Grenoble, France

W I -F I TRACKING SYSTEMS

I Wi-Fi-enabled smartphones frequently emit frames because they scan for access points

I Each frame contains a unique identifier of the device: the MAC address

I Wi-Fi tracking systems monitor and store these frames in order to make statistics:

I number of visitors

I frequency of visits

I travel paths

I etc.

P RIVACY ISSUES

I Lot of private information stored - much more than needed to make basic statistics

I Anyone with access - legitimate or not - to that data can get a lot of information about passers-by

I In France, made illegal by the “Informatique et Libert ´e” law (LIL). The CNIL made precise

recommandations for this case of information collecting so that one knows if they respect the law [1]:

I Data must be deleted when the person exits the place

I Used algorithm must ensure a strong collision rate, i.e. an identifier must correspond to several people

I People must give their explicit consent it if one wants to store data for a longer period (opt-in system).

I We can work on storing only information valuable for statistics

I There is no miracle solution

I We aim to propose a privacy/utility tradeoff

O BJECTIVES

I Short term:

I First, comparing the different methods

I Finding a good privacy metric

I Long term:

I Having a ready-to-use system with the same functionalities than existing Wi-Fi tracking systems, augmented with privacy-by-design

D ATA COLLECTION

I We need datasets to perform all our tests, containing logs of raw MAC addresses seen at a certain time at a certain place.

I Legislation forbids the collection of such datasets, as they contain personal identifiers

I Need consent of all concerned people

I difficult to set up

I Possible solutions:

I getting the consent of concerned people (one try at ACM Middleware, small dataset obtained)

I generate synthetic datasets from models

P RIVACY METRICS

I We need metrics to quantify “privacy”, in order to evaluate our methods

I Possible metrics are (non-exhaustive list):

I K-anonymity

I Collision rate: how many devices may be identified by the same identifier?

I entropy?

I Other considerations:

I Is the collision rate evenly distributed?

I What happens on extreme values? (e.g. very few number of passers-by)

I We have to determine which one (among these ones or other ones) best fits ours needs

H ASHING

I Principle: Using a simple cryptographic hash function, such as MD5 or functions of the SHA family, with no salt

I Problems:

I Easy to reverse in the case of MAC addresses [2]

I Almost no collision

I Almost useless

H ASHING AND TRONCATION

I Principle: Same as hashing, but only keep a small part of the result

I the less you keep, the more collisions you’ll get

I Also possible to troncate before hashing, in order to manually increase collision rate (first 3 bytes of the MAC address are the constructor identifier)

I Problem: Still easy to reverse?

B ITMAPS

I Principle: logarithmic repartition of the hashes [4]

I Problems:

I Unequal collision probability (half MAC addresses will set first bit ot one while only one will set the last bast to one)

I Very approximate for low numbers of MAC addresses (which is our case most of the time)

B LOOM FILTERS

I Data structure useful for many applications

I Insertion and search in O(1)

I ...but false positives

I ...which we use for privacy purpose

I We aim to be able to say that a person may have been here, but never be sure about it

I Principle: about the same as bitmaps, but use several hashing functions spreading bits uniformly on the array of bits

aa:bb:cc:dd:ee:ff

0

0 1 1 1 0 0 0 0 0 0 0 0 0 0

1

1 1

Bloom filter

gives the approximate number of visitors (with m length of the filter,

k number of hash functions, X number of ones) [3]

0 1 2 3 4 5 6 ... m

h2() h1() h3() hash functions give index number

0 1 1

I Problems:

I It may be possible to know for sure that some MAC addresses were seen

I May still give bad results for low numbers of MAC addresses (tests ongoing)

O THER POSSIBILITIES

I Using other phone characterisitics instead of MAC addresses (plenty of fingerprinting techniques exist) [5]

I Hashing with salts (keys), and destroying keys after a predefined period of time. Key destruction is not a trivial problem.

D EPLOYMENT

I Simple tools: raspberry pis with Wi-Fi dongles

I Partnership with UrbaLyon

I Searching for partners with an infrastructure to share (electricity + network access)

F UNDING

I Project founded by Academic Research Community (ARC), Rh ˆone-Alpes region (ARC 7).

R EFERENCES

[1] CNIL’s obligations

http://www.cnil.fr/linstitution/actualite/article/article/

mesure-de-frequentation-et-analyse-du-comportement-des- consommateurs-dans-les-magasins/, accessed on 2015.04.07.

[2] Demir, Levent and Cunche, Mathieu and Lauradoux, C ´edric Analysing the privacy policies of Wi-Fi trackers and statistics.

[3] Swamidass, S. Joshua; Baldi, Pierre (2007)

Mathematical correction for fingerprint similarity measures to improve chemical retrieval [4] Flajolet, Philippe and Martin, G Nigel

Probabilistic counting

[5] Xu, Qiang and Zheng, Rong and Saad, Walid and Han, Zhu

Device Fingerprinting in Wireless Networks: Challenges and Opportunities

Privacy-preserving Wi-Fi tracking systems CITI 2015 C. Matte, M. Minier, M.Cunche, F. Rousseau

Références

Documents relatifs

2 - Avec le “WI-FI” (appuyez sur “UNE AUTRE OPTION”) - si bluetooth indisponible sur dispositif Smartphone ou Tablet par exemple (point 13b)... FRANÇAIS 13a) CONFIGURATION AVEC

Les calculs formulés à partir de l’outil de comptabilité carbone deviennent performatifs en opérant à trois niveaux : à un premier niveau, ils produisent des effets

Our protocol consists of an initialization phase, the phase that prepares a sensor to enter the supply chain. Then, the authentication phase to verify the legitimacy of the site

[r]

Text format could be used to analyze sig- nal strength for a given access point in another program like Excel.. Latitude and Longitude are also exported in the Decimal

Utilisation du Questionnaire d’habiletés auditives “ Parole, audition spatiale et qualité d’audition ” (“ Speech, Spatial and Qualities of Hearing Scale) (SSQ) en

Qu’il s’agisse d’étendre un réseau local ou d’ouvrir des services spécifiques, le niveau de sécurité des communications WiFi, donc le type de réseau WiFi (c.f. section 2 page

L’Ile-de-France comme le reste de la France présente de réels problèmes de répartition des chirurgiens-dentistes sur le territoire, ce qui ne risque pas de s’arranger