
Towards Scalable, Efficient and Privacy Preserving Machine Learning



HAL Id: hal-01956155

https://hal.archives-ouvertes.fr/hal-01956155

Submitted on 14 Dec 2018


Towards Scalable, Efficient and Privacy Preserving Machine Learning

Rania Talbi, Sara Bouchenak

To cite this version:

Rania Talbi, Sara Bouchenak. Towards Scalable, Efficient and Privacy Preserving Machine Learning. 2018 ACM/IFIP International Middleware Conference, Doctoral Symposium, Dec 2018, Rennes, France. hal-01956155


Towards Scalable, Efficient and Privacy Preserving Machine Learning

Rania Talbi, Sara Bouchenak
INSA Lyon, France
{firstname.lastname}@insa-lyon.fr

December 10th, Middleware 2018 Doctoral Symposium, Rennes, France

Context and Motivation

[Figure: motivating scenario. Each company C_i holds its local bank transactions B_i; a central supervision authority A runs data mining M for fraud detection over the union of these transactions, M(⋃ B_i), in order to identify a fraudulent company C_F.]

DynAmic Privacy Preserving machine Learning Framework (DAPPLE)

[Figure: DAPPLE overview. Each data owner DO_i sends its encrypted local training data chunk [S_ki]_pk_i to the Classification Service Provider (CSP). The CSP performs privacy-preserving classifier learning and incremental updates of the encrypted data model [w_k]_pk_w as new chunks arrive. A classification querier Q_j submits an encrypted classification query [X_j]_pk_j and receives the encrypted classification response [C_j]_pk_j through privacy-preserving class prediction.]
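To make the roles and message types above concrete, here is a minimal Python sketch. Every class and method name is a hypothetical placeholder of ours, not DAPPLE's actual interface, and the learning and prediction protocols themselves are deliberately omitted.

# Sketch only (not DAPPLE's API): the roles and encrypted messages of the
# architecture above, with hypothetical placeholder types.
from dataclasses import dataclass
from typing import List

@dataclass
class Ciphertext:
    key_id: str        # which public key this ciphertext is under
    payload: bytes     # opaque encrypted bytes

@dataclass
class EncryptedTrainingChunk:
    # [S_ki]_pk_i: a local training data chunk encrypted by data owner DO_i.
    data_owner_id: str
    records: List[Ciphertext]

@dataclass
class EncryptedQuery:
    # [X_j]_pk_j: a classification query encrypted under the querier's key.
    querier_id: str
    features: List[Ciphertext]

@dataclass
class EncryptedResponse:
    # [C_j]_pk_j: the class label, encrypted under the querier's key.
    querier_id: str
    label: Ciphertext

class ClassificationServiceProvider:
    """CSP: stores only the encrypted model [w_k]_pk_w, never any plaintext."""

    def __init__(self) -> None:
        self.encrypted_model: List[Ciphertext] = []

    def ingest(self, chunk: EncryptedTrainingChunk) -> None:
        # Privacy-preserving classifier learning / incremental model update
        # (protocol omitted in this sketch).
        raise NotImplementedError

    def classify(self, query: EncryptedQuery) -> EncryptedResponse:
        # Privacy-preserving class prediction over the encrypted query.
        raise NotImplementedError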

Objectives

• Minimize the computational costs incurred by privacy preservation.
• Provide an end-to-end privacy-preserving outsourced data classification service.
• Enable a set of mutually untrusted data owners to obtain a global view of the union of their data without breaching the privacy of any of them.
• Enable dynamic data model updates when new training data samples become available.

Preliminary results

• We use a synthetic dataset for fraud detection in a B2B network.
• The dataset contains 1000 bank transactions with 9 attributes each.
• We compare our work to the Ciphermed framework [8].

Related work

[Figure: taxonomy of privacy-preserving machine learning (PPML). Approaches differ along three axes: the ML algorithm (clustering [1], classification [2], association rule mining [3], ...), the privacy-preservation objective (ML output protection vs. original data protection), and the architecture (distributed [4] vs. outsourced [5]). Privacy-preservation techniques are either cryptographic (SMC/HE, garbled circuits, oblivious transfer) or non-cryptographic (privacy-preserving data publishing), and they trade off privacy, runtime, and utility.]

Design principles

• Cryptographic protection of the data model, the training data, and the classification queries and responses.
• Decent privacy and utility levels.
• Building blocks based on partially homomorphic encryption (PHE).
• Efficient runtime.
• Entirely outsourced ML computations over encrypted data.

• Combine PHE with cryptographic blinding (DTPKC cryptosystem [6]), e.g. [x]_pk ⊗ [r]_pk = [x ⊕ r]_pk: a ciphertext of x is homomorphically combined with a random mask r, so that only the blinded value x ⊕ r is ever exposed. Between the two parties U1 and U2, the protocol proceeds as follows: (1) blind the inputs; (2) partially decrypt the blinded values; (3) decrypt the blinded values; (4) run the operation over the blinded values; (5) remove the blinding from the result (a minimal sketch is given below).
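As an illustration of this blind / decrypt-blinded / compute / unblind pattern, here is a minimal sketch of a secure multiplication of two encrypted values. It is an assumption-laden stand-in: it uses the Paillier scheme from the python-paillier (phe) package instead of DTPKC [6], collapses DTPKC's two-server partial decryption into a single key holder, and the mask sizes and variable names are our own choices.

# Sketch only: Paillier (python-paillier) as a stand-in for DTPKC [6].
# U1 holds ciphertexts but no secret key; U2 holds the decryption key and
# only ever sees blinded values.
import secrets
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

# U1's encrypted inputs.
enc_a = public_key.encrypt(7)
enc_b = public_key.encrypt(5)

# (1) U1 blinds the inputs with fresh random masks (homomorphic addition).
r_a = secrets.randbelow(2 ** 64)
r_b = secrets.randbelow(2 ** 64)
blinded_a = enc_a + r_a            # [a + r_a]
blinded_b = enc_b + r_b            # [b + r_b]

# (2)-(4) U2 decrypts the blinded values and runs the operation on them;
# it learns a + r_a and b + r_b, but neither a nor b.
blinded_product = private_key.decrypt(blinded_a) * private_key.decrypt(blinded_b)
enc_blinded_product = public_key.encrypt(blinded_product)

# (5) U1 removes the blinding homomorphically, using
# (a + r_a)(b + r_b) = ab + a*r_b + b*r_a + r_a*r_b.
enc_ab = enc_blinded_product - (enc_a * r_b) - (enc_b * r_a) - r_a * r_b

assert private_key.decrypt(enc_ab) == 7 * 5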

We implemented the VFDT incremental decision tree learning algorithm [7] on top of our privacy-preserving building blocks. A naive approach combines low-level PP building blocks; a first optimization uses inline building blocks; a second optimization relies on parallel computing.
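At the heart of VFDT is the Hoeffding-bound split test: a leaf is split only once enough samples have been seen that the best attribute's gain provably exceeds the runner-up's. Below is a minimal, plaintext sketch of that test; the function names, default parameters, and example attributes are our own, and a privacy-preserving variant would evaluate these statistics through encrypted building blocks instead.

# Minimal sketch of the VFDT / Hoeffding-tree split decision [7].
import math
from typing import Dict

def hoeffding_bound(value_range: float, delta: float, n_samples: int) -> float:
    # epsilon = sqrt(R^2 * ln(1/delta) / (2 * n))
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2 * n_samples))

def should_split(gains: Dict[str, float], n_samples: int,
                 delta: float = 1e-7, value_range: float = 1.0,
                 tie_threshold: float = 0.05) -> bool:
    # Split when the best attribute's information gain exceeds the second
    # best by more than epsilon, or when the two are an effective tie.
    ranked = sorted(gains.values(), reverse=True)
    if len(ranked) < 2:
        return False
    epsilon = hoeffding_bound(value_range, delta, n_samples)
    return (ranked[0] - ranked[1]) > epsilon or epsilon < tie_threshold

# Example: after 2000 transactions, "amount" leads "country" by 0.08 bits.
print(should_split({"amount": 0.21, "country": 0.13, "sector": 0.02}, 2000))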

References

[1] X. Hu et al.: Privacy-Preserving K-Means Clustering Upon Negative Databases. ICONIP (4) 2018.
[2] S. Kim et al.: Privacy-Preserving Naive Bayes Classification Using Fully Homomorphic Encryption. ICONIP (4) 2018: 349-358.
[3] L. Liu et al.: Privacy-Preserving Mining of Association Rule on Outsourced Cloud Data from Multiple Parties. ACISP 2018: 431-451.
[4] H. Yu et al.: Privacy-Preserving SVM Classification on Vertically Partitioned Data. PAKDD 2006: 647-656.
[5] T. Li et al.: Outsourced privacy-preserving classification service over encrypted data. J. Network and Computer Applications 106: 100-110 (2018).
[6] X. Liu et al.: An Efficient Privacy-Preserving Outsourced Calculation Toolkit With Multiple Keys. IEEE Trans. Information Forensics and Security 11(11): 2401-2414 (2016).

[7] P. Domingos et al.: Mining high-speed data streams. KDD 2000: 71-80.

[8] R. Bost et al.: Machine Learning Classification over Encrypted Data. NDSS 2015.

2018 ACM/IFIP International Middleware Conference, Doctoral Symposium, Rennes, France.
