Security-preserving Support Vector Machine with Fully Homomorphic Encryption

Saerom Park*, Jaeyun Kim, Joohee Lee, Junyoung Byun, Jung Hee Cheon, Jaewook Lee**

Seoul National University

1 Gwanak-ro, Gwanak-gu, Seoul 08826, South Korea
*drsaerompark@gmail.com, **jaewook@snu.ac.kr

Abstract

Recently, security issues have become increasingly important when applying machine learning models to real-world problems. It is necessary to preserve data privacy when using sensitive data and to protect the information of a trained model against intentional attacks. In this paper, we propose a security-preserving learning framework for the support vector machine model using fully homomorphic encryption.

Our approach trains the model on the encrypted domain to preserve data and model privacy with reduced communication between the servers. The proposed procedure includes our protocol, data structure, and homomorphic evaluation.

As machine learning models have been effectively applied to various real-world problems, collecting data from various sources has become crucial for many service providers (Park, Hah, and Lee 2017). In this situation, privacy issues have become a major problem for the learning community. Therefore, it is necessary to preserve the privacy of the data and the security of the learning process without compromising the performance of the learning algorithms.

Homomorphic encryption (HE) enables computations on encrypted data (ciphertexts) that are equivalent to operations on decrypted data (plaintexts) (Gentry 2009). In recent years, privacy-preserving machine learning applications have been developed, but they suffer from high computational cost and a huge memory burden. Cheon et al. developed a homomorphic encryption scheme for approximate arithmetic (HEAAN) (Cheon et al. 2017).

The support vector machine (SVM) is one of the most effective machine learning algorithms for classification (Cortes and Vapnik 1995). Both the training data and the model parameters should be protected to apply the SVM model in a security-preserving scenario. Therefore, in this paper, we propose an implementation of the training phase of the SVM classifier on the encrypted domain with the HEAAN scheme.

Design Components

Fully Homomorphic Encryption

Fully homomorphic encryption (FHE) is a cryptographic scheme which aims to enable homomorphic operations, such as additions and multiplications, on encrypted data. Recently, the HEAAN scheme (Cheon et al. 2017) was developed to carry out approximate computations efficiently.

Copyright held by authors.

For the (leveled) HEAAN scheme of depth $L$, we set the parameters: a power of two $M$, integers $p$, $q_0$, $q_L = p^L \cdot q_0$, $h$ and $P$, and a real $\sigma$ for $\lambda$-bit security. The specifications of the algorithm are as follows.

- $(pk, sk, evk) \leftarrow \mathrm{KeyGen}(1^\lambda)$.

- $\vec{c} \leftarrow \mathrm{Encrypt}(pk, m)$: Sample $\vec{r} \in \mathcal{R}^2$ and $e_0, e_1 \leftarrow DG(\sigma^2)$. Output $\vec{c} \leftarrow \vec{r} \cdot pk + (m + e_0, e_1) \in \mathcal{R}_{q_L}^2$.

- $m \leftarrow \mathrm{Decrypt}(sk, \vec{c} = (c_0, c_1))$: Output $m \leftarrow c_0 + c_1 \cdot s \bmod q_\ell$.

- $\vec{c}_{\mathrm{add}} \leftarrow \mathrm{Add}(\vec{c}_1, \vec{c}_2)$: Output $\vec{c}_{\mathrm{add}} \leftarrow \vec{c}_1 + \vec{c}_2 \bmod q_\ell$.

- $\vec{c}_{\mathrm{mult}} \leftarrow \mathrm{Mult}(\vec{c}_1, \vec{c}_2)$: Set $\vec{c}_1 = (b_1, a_1)$ and $\vec{c}_2 = (b_2, a_2)$. Let $(d_0, d_1, d_2) \leftarrow (b_1 b_2,\ a_1 b_2 + a_2 b_1,\ a_1 a_2) \in \mathcal{R}_{q_\ell}^3$. Output $\vec{c}_{\mathrm{mult}} \leftarrow (d_0, d_1) + \lfloor P^{-1} \cdot (d_2 \cdot evk \ (\mathrm{mod}\ P \cdot q_\ell)) \rceil \in \mathcal{R}_{q_\ell}^2$.

- $\vec{c}\,' \leftarrow \mathrm{Rescaling}_{\ell \to \ell'}(\vec{c})$: For a level-$\ell$ ciphertext $\vec{c}$, output $\vec{c}\,' \leftarrow \lfloor \frac{q_{\ell'}}{q_\ell} \cdot \vec{c} \rceil \in \mathcal{R}_{q_{\ell'}}^2$ at level $\ell'$.

For more details, we refer the reader to (Cheon et al. 2017).
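The fixed-point flavor of these operations can be illustrated without any cryptography: HEAAN encodes a real number at a scaling factor $p$, and Rescaling divides by $p$ after each multiplication to keep the scale constant at the cost of one level. The following toy sketch uses plain integers (no encryption, no ring structure) purely to illustrate that scale bookkeeping:

```python
# Toy illustration of HEAAN-style scaled (approximate) arithmetic.
# A real x is encoded as round(x * p); multiplying two encodings yields
# scale p^2, and "rescaling" divides by p to restore scale p, introducing
# a small rounding error -- exactly the approximation HEAAN tolerates.
p = 2 ** 20  # scaling factor; each rescaling consumes one level

def encode(x: float) -> int:
    return round(x * p)

def decode(c: int) -> float:
    return c / p

def mult_and_rescale(c1: int, c2: int) -> int:
    d = c1 * c2          # scale is now p^2
    return round(d / p)  # Rescaling: back to scale p

c = mult_and_rescale(encode(1.5), encode(2.25))
assert abs(decode(c) - 3.375) < 1e-5
```

In the real scheme the same division-and-rounding is applied to ring elements modulo $q_\ell$, which is why the ciphertext modulus shrinks from $q_\ell$ to $q_{\ell'}$ at each level.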

Least Squares Support Vector Machine

The SVM aims to find the maximum-margin hyperplane, given the training examples $\{(x_1, y_1), \ldots, (x_n, y_n)\} \subset \mathbb{R}^d \times \{-1, 1\}$. To train the SVM model over encrypted data, we use a nonlinear least-squares SVM with a basis function $\phi(x) : \mathbb{R}^d \to \mathbb{R}^l$ that minimizes the following optimization problem (Suykens and Vandewalle 1999):

$$\min_{w, b, e} \ \frac{1}{n} \sum_{i=1}^{n} e_i^2 + \lambda \|w\|^2 \quad (1)$$

$$\text{s.t.} \quad y_i [w \cdot \phi(x_i) + b] = 1 - e_i, \quad \forall i = 1, \ldots, n,$$

where $w \in \mathbb{R}^l$. To solve problem (1), we construct the Lagrangian function $L(w, b, e, \alpha) = \lambda \|w\|^2 + \frac{1}{n} \sum_{i=1}^{n} e_i^2 - \sum_{i=1}^{n} \alpha_i \{ y_i [w \cdot \phi(x_i) + b] + e_i - 1 \}$. From the optimality conditions of the Lagrangian function, the following linear system is obtained by eliminating $w$ and $e$:

$$A \tilde{b} = \begin{pmatrix} 0 & y^T \\ y & \Omega + \lambda I_n \end{pmatrix} \begin{pmatrix} b \\ \alpha \end{pmatrix} = \begin{pmatrix} 0 \\ 1_n \end{pmatrix} = \tilde{1}, \quad (2)$$

where $A \in \mathbb{R}^{(n+1) \times (n+1)}$ and $\Omega_{i,j} = k(x_i, x_j)$. We introduce an additional least-squares problem, solved with the gradient descent method, for the system (2), whose convergence can be guaranteed by the well-posedness of $A$. The matrix $A$ can be decomposed as follows:

$$A = \begin{pmatrix} 1 \\ y \end{pmatrix} \begin{pmatrix} 1 & y^T \end{pmatrix} \circ \begin{pmatrix} 0 & 1^T \\ 1 & \Omega \end{pmatrix} + \eta \begin{pmatrix} 0 & 0^T \\ 0 & I_n \end{pmatrix}, \quad (3)$$

where $\circ$ represents the Hadamard product. By the Schur product theorem, the first term becomes positive semidefinite with a Mercer kernel. Therefore, the linear system (2) is well-posed for some $\eta > 0$, and the convergence of the iterative method can be guaranteed. With these design components, the intermediate decryptions for training the SVM model can be reduced.

Figure 1: Our procedure with protocol
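As a plaintext sanity check (outside the encrypted domain), the system (2) can be built for a toy dataset and its well-posedness verified numerically. The RBF kernel, dataset, and constants below are illustrative assumptions, not the paper's experimental setup:

```python
import numpy as np

# Build the LSSVM system (2) for a toy dataset and check it is well-posed,
# so gradient descent on the least-squares problem ||A b - 1~||^2 converges.
rng = np.random.default_rng(0)
n, lam = 20, 0.1
X = rng.normal(size=(n, 2))
y = np.sign(X[:, 0] + X[:, 1])          # toy labels in {-1, +1}

def rbf(X, gamma=1.0):
    # Mercer (RBF) kernel matrix: k(x_i, x_j) = exp(-gamma ||x_i - x_j||^2)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

Omega = rbf(X)
A = np.zeros((n + 1, n + 1))            # block matrix of system (2)
A[0, 1:] = y
A[1:, 0] = y
A[1:, 1:] = Omega + lam * np.eye(n)

# Well-posedness: smallest eigenvalue of A^T A is strictly positive.
assert np.linalg.eigvalsh(A.T @ A).min() > 0

b_tilde = np.linalg.solve(A, np.r_[0.0, np.ones(n)])  # [b; alpha]
```

On the encrypted domain the direct solve is unavailable, which is why the paper resorts to the iterative method described next.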

Implementation

In this paper, we propose a protocol which reduces the communication needed to construct the kernel matrix and to learn the model parameters with FHE operations. Figure 1 illustrates the whole procedure of our protocol, which secures both server-side and data-provider-side information because all operations can be performed on the encrypted domain.

Data Structure

As mentioned previously, we first compute the kernel matrix from the encrypted data. For simplicity, we assume that the kernel matrix is encrypted as a single ciphertext. We want to build a parallelizable procedure because of the computational cost and large memory footprint of ciphertexts. The HEAAN scheme supports slot-wise operations over ciphertexts, so matrix operations should be replaced with additions and Hadamard products. We propose a data structure for calculating the inner-product matrix. Assume that $Z \in \mathbb{R}^{n \times d}$ and $ZZ^T, \hat{Z}_j \in \mathbb{R}^{n \times n}$:

$$ZZ^T = \hat{Z}_1^T \circ \hat{Z}_1 + \cdots + \hat{Z}_d^T \circ \hat{Z}_d,$$

$$Z = \begin{pmatrix} z_{11} & \cdots & z_{1d} \\ \vdots & \ddots & \vdots \\ z_{n1} & \cdots & z_{nd} \end{pmatrix}, \quad \hat{Z}_j = \begin{pmatrix} z_{1j} & z_{2j} & \cdots & z_{nj} \\ \vdots & \vdots & \ddots & \vdots \\ z_{1j} & z_{2j} & \cdots & z_{nj} \end{pmatrix}.$$

The multiplication of ciphertexts in this operation can be performed on $d$ different machines in parallel.
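The identity behind this data structure can be sanity-checked in plaintext NumPy. The sketch below models only the Hadamard-product decomposition, not the HEAAN slot packing itself:

```python
import numpy as np

# ZZ^T recovered as a sum of d Hadamard products, one per feature.
# Each term uses only slot-wise multiplication and addition, so each of
# the d terms can be evaluated on its own machine in parallel.
rng = np.random.default_rng(1)
n, d = 5, 3
Z = rng.normal(size=(n, d))

acc = np.zeros((n, n))
for j in range(d):
    Z_hat = np.tile(Z[:, j], (n, 1))  # every row is the j-th column of Z
    acc += Z_hat.T * Z_hat            # Hadamard product (slot-wise Mult)

assert np.allclose(acc, Z @ Z.T)
```

Entry-wise, $(\hat{Z}_j^T \circ \hat{Z}_j)_{ik} = z_{ij} z_{kj}$, so summing over $j$ yields exactly $(ZZ^T)_{ik}$.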

Security-Preserving Iterative Training

Table 1: Classification accuracy for real datasets

In this study, to efficiently implement the gradient descent method, we utilize the symmetry of the matrix (3) in matrix multiplication. To learn $\tilde{b}$, the update equation is $\tilde{b}_{k+1} = \tilde{b}_k - \alpha A^T (A \tilde{b}_k - \tilde{1})$. Matrix-vector multiplication on the encrypted domain is efficiently implemented by rotating the slots. We use the pre-computed $A^T A$, exploiting the symmetry, instead of two matrix-vector multiplications per iteration. Unrolling two iterations, the resulting update equation is $\tilde{b}_{k+2} = M \tilde{b}_k + c$, where $M = (I - \alpha A^T A)^2$ and $c = \alpha (2I - \alpha A^T A) A^T \tilde{1}$ are pre-computed.

Comparison

Our procedure replaces the dual convex optimization with numerical gradient descent to implement the security-preserving LSSVM. To illustrate the effect of this replacement, we compare the classification performances with RBF and polynomial kernels on several real datasets used in (Suykens and Vandewalle 1999). Table 1 shows that the replacement does not severely affect the performances. From these results, we can expect to develop a security-preserving SVM without compromising performance.

Conclusion

In this paper, we present a new framework to train the LSSVM model using HEAAN. This framework includes a protocol, a data structure, and a security-preserving iterative training procedure. Our method addresses the computational cost and memory burden that are common problems in applications of HE schemes. The proposed solution shows the potential for practical utilization of machine learning models without concerns about security and privacy issues.

Acknowledgement

This work was supported in part by NRF of Korea grants No. 2018R1D1A1A02085851, 2016R1A2B3014030, and 2017R1A5A1015626.

References

[Cheon et al. 2017] Cheon, J. H.; Kim, A.; Kim, M.; and Song, Y. 2017. Homomorphic encryption for arithmetic of approximate numbers. In ASIACRYPT 2017, 409–437. Springer.

[Cortes and Vapnik 1995] Cortes, C., and Vapnik, V. 1995. Support-vector networks. Machine Learning 20(3):273–297.

[Gentry 2009] Gentry, C. 2009. A fully homomorphic encryption scheme. Ph.D. Dissertation, Stanford University. http://crypto.stanford.edu/craig.

[Park, Hah, and Lee 2017] Park, S.; Hah, J.; and Lee, J. 2017. Inductive ensemble clustering using kernel support matching. Electronics Letters 53(25):1625–1626.

[Suykens and Vandewalle 1999] Suykens, J. A., and Vandewalle, J. 1999. Least squares support vector machine classifiers. Neural Processing Letters 9(3):293–300.
