A case study of using RM-ODP in mobile cloud computing applications


Computers and Software

(IRECOS)


Privacy-Preserving Distributed Collaborative Filtering Using Secure Set Operations

Chongjing Sun, Yan Fu, Hui Gao, Junlin Zhou

Abstract – At present, collaborative filtering is widely used in many fields such as e-commerce and search engines. To produce better recommendations, many data owners want to collaborate with each other to build a shared model. Because of privacy concerns, however, a data owner is reluctant to reveal its data to others. To solve this problem, we present a privacy-preserving approach using secure set operations and encryption methods. In our method, the private set intersection cardinality protocol is first adopted to compute the user similarities. Homomorphic encryption is then used to compute the predicted rating values for the unrated items. Finally, the model recommends the top-k unrated items to each user.

We show that the distributed collaborative filtering based on our approach can provide zero loss of accuracy in the recommendation while preserving the privacy of different data owners. Copyright © 2013 Praise Worthy Prize S.r.l. - All rights reserved.

Keywords: Privacy Preserving, Set Operations, Collaborative Filtering

Nomenclature

$U$: The set of all users

$I$: The set of all items

$I_u$: The set of items that user $u$ rated

$r_{ui}$: Rating value that user $u$ gave to item $i$

$s_{uv}$: Similarity of users $u$ and $v$

$O^{t}$: The $t$-th participant in the distributed system

$u_i^{t}$: The $i$-th user of the participant $O^{t}$

$I_i^{t}$: The item set rated by $u_i^{t}$

$\mathbb{Z}_q$: The integers from 0 to $q-1$

$\Pi(\cdot)$: Random permutation function

$H(\cdot)$: Hash function

$E_{pk}(\cdot)$: Encrypt a plaintext with public key $pk$

$D_{sk}(\cdot)$: Decrypt a ciphertext with private key $sk$

I. Introduction

Nowadays, the explosive growth of the information on the Web leads to the information overload problem, which makes people get lost in all of the massive information. To provide better services to users and make more benefit from the product selling, many information filtering and recommendation techniques have been proposed [1], such as the classic collaborative filtering, the location-based recommendation service [2], and the context-dependent recommendation [3].

These techniques help people filter out redundant information, shorten the search time, and find the personalized items they are most interested in.

Recommender systems play an important role in filtering information, and many approaches have been proposed.

Among them, Collaborative Filtering (CF) is a classic technique that is widely used in many e-commerce sites.

Usually, small or start-up companies do not have enough data to provide satisfying recommendations to their customers. They want to collaborate with other companies to build a shared recommender system which can provide better recommendations. The problem is that the other companies are reluctant to share their data out of concern for the privacy of their customers.

For example, some customers buy some private products and do not want others to know. The data sharing may violate the privacy of these users. Under this condition, we consider how to build a shared recommender system without disclosing the privacy.

In this paper, we focus on binary user-item ratings, such as whether or not a user bought a product. Hence the private information is whether a user bought or rated an item. Based on secure set operations and homomorphic encryption, we propose an algorithm which can build the shared recommender model without disclosing private information and with zero loss of accuracy. User-based CF mainly has two steps: similarity computation and rating score prediction.

In the first step, we adopt the private set intersection cardinality protocol to compute the similarity between users without revealing the true ratings of each user. In the second step, we design an approach based on homomorphic encryption to generate the predicted ratings for the unrated items. Finally, the top-ranked unrated items are recommended to each user based on the predicted ratings.

To solve privacy and security problems, the corresponding techniques have developed very fast, especially in the mobile [4] and RFID [5] areas. In this paper, we focus on the privacy problem under distributed CF, which runs the algorithms on rating data stored in multiple repositories. Polat and Du [6] proposed privacy-preserving algorithms for Collaborative Filtering recommendation on horizontally or vertically partitioned data. Yakut and Polat [7] presented privacy-preserving schemes to make item-based predictions on arbitrarily distributed data.

Both works cause accuracy loss when recommending items to users. Polat and Du [8] designed methods for privacy-preserving CF on vertically distributed data, which select all the users as the targeted user’s neighbours. In our work, we only select the top-k users as its neighbours. Considering the distributed CF techniques, Kaleli and Polat [9] achieved privacy preservation for a model-based CF (the naive Bayesian classifier-based CF) recommendation. Yakut and Polat [10] also gave a solution for privacy-preserving model-based CF (the SVD-based CF) recommendation. In our work, we solve the privacy-preserving problem for memory-based CF: secure set operations and homomorphic encryption are combined to design a new privacy-preserving scheme, which can conduct shared user-based CF recommendation without any accuracy loss.

The rest of this paper is organized as follows. In Section 2, we introduce the preliminaries and define the research problem in this paper. In Section 3, we devise the privacy-preserving distributed collaborative filtering approach, and evaluate it in Section 4. Finally, we conclude this paper in Section 5.

II. Preliminaries and Problem Definition

In this paper, we focus on the recommender systems based on the collaborative filtering technique, which is one of the most successful technologies. Specifically, this work solves the privacy-preserving problem concerning the user-based CF model, in which there is a list of users $U = \{u_1, u_2, \ldots, u_n\}$ and a list of items $I = \{i_1, i_2, \ldots, i_m\}$. All the binary ratings can be summarized in a user-item table, which contains the rating scores $r_{ui}$ provided by user $u$ for item $i$. $r_{ui}$ is set to 1 if $u$ has rated item $i$; otherwise it is 0. Each user $u$ has a list of rated items $I_u = \{\, i \mid i \in I,\ r_{ui} \neq 0 \,\}$.

Many metrics have been proposed to compute the similarity between two users. Suppose the rating vectors of users u and v are r_u and r_v respectively. Similarity measures for binary ratings are listed in Table I.

For binary ratings, the Cosine measure is equivalent to Salton’s measure. As our work adopts secure set operations, we put emphasis on the measures based on set operations. After the similarities between any two users are obtained, the predicted rating score for an item can be calculated by Formula (1):

$$r_{ui} = \frac{\sum_{v \in N_u} s_{uv}\, r_{vi}}{\sum_{v \in N_u} s_{uv}} \tag{1}$$

where $N_u$ denotes the top-k most similar users of the target user u.
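Formula (1) can be illustrated with a minimal, non-private Python sketch. The data layout (a dictionary mapping each user to the set of items they rated), the choice of the Jaccard measure, and the names `jaccard`, `predict` and `ratings` are assumptions made for this example only, not part of the original method description.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a & b| / |a | b| (0.0 when both sets are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def predict(ratings: dict, u: str, k: int) -> dict:
    """Predicted score for every item u has not rated, following Formula (1)."""
    # Similarity of u to every other user, then keep the k most similar (N_u).
    sims = {v: jaccard(ratings[u], items) for v, items in ratings.items() if v != u}
    neighbours = sorted(sims, key=lambda v: (-sims[v], v))[:k]
    denom = sum(sims[v] for v in neighbours)
    if denom == 0:
        return {}
    candidates = set().union(*(ratings[v] for v in neighbours)) - ratings[u]
    # r_vi is 1 when v rated i and 0 otherwise, so the numerator sums s_uv over raters.
    return {
        i: sum(sims[v] for v in neighbours if i in ratings[v]) / denom
        for i in candidates
    }


# Hypothetical toy data: each user maps to the set of items they rated.
ratings = {
    "u1": {"a", "b", "c"},
    "u2": {"a", "b", "d"},
    "u3": {"b", "c", "d", "e"},
    "u4": {"f"},
}
scores = predict(ratings, "u1", k=2)
```

With this toy data the two neighbours of `u1` are `u2` (similarity 0.5) and `u3` (similarity 0.4), so item `d`, rated by both, scores 1.0 and item `e`, rated only by `u3`, scores 0.4 / 0.9.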

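TABLE I. SIMILARITY MEASURES [11]

| Name of measure | Formula of similarity $s_{uv}$ |
|---|---|
| Cosine | $\dfrac{r_u \cdot r_v}{\lVert r_u \rVert_2 \, \lVert r_v \rVert_2}$ |
| Salton | $\dfrac{\lvert I_u \cap I_v \rvert}{\sqrt{\lvert I_u \rvert \, \lvert I_v \rvert}}$ |
| Jaccard | $\dfrac{\lvert I_u \cap I_v \rvert}{\lvert I_u \cup I_v \rvert}$ |
| Dice | $\dfrac{2\,\lvert I_u \cap I_v \rvert}{\lvert I_u \rvert + \lvert I_v \rvert}$ |
| LHN-I | $\dfrac{\lvert I_u \cap I_v \rvert}{\lvert I_u \rvert \, \lvert I_v \rvert}$ |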



The secure computation technique [12] needs to be designed for the computation between partners without leaking the information. In our work, we adopt the secure set operations and homomorphic encryption to attain our task, the privacy-preserving collaborative filtering. The secure set operations allow one party to compute the result of a set operation with another party, such that they learn nothing about the inputs of each other beyond the result of the set operation.

From the previous analysis, we only need the cardinalities of the set intersection and set union. Private Set Intersection Cardinality (PSI-CA) and Private Set Union Cardinality (PSU-CA) protocols have been studied to ensure that the parties learn only the size of the set intersection or union.

De Cristofaro et al. [13] proposed solutions to this problem whose complexity is linear in the size of the input sets.
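To make concrete why the intersection cardinality is enough, the following Python sketch computes the set-based measures of Table I from only three numbers: the two set sizes and the size of the intersection. The function name `set_similarities` and the sample sets are hypothetical; the sketch also checks numerically that Cosine on binary vectors coincides with Salton.

```python
import math


def set_similarities(size_u: int, size_v: int, inter: int) -> dict:
    """Set-based measures of Table I from |I_u|, |I_v| and |I_u & I_v| only."""
    if size_u == 0 or size_v == 0:
        return {"salton": 0.0, "jaccard": 0.0, "dice": 0.0, "lhn1": 0.0}
    union = size_u + size_v - inter  # inclusion-exclusion gives |I_u | I_v|
    return {
        "salton": inter / math.sqrt(size_u * size_v),
        "jaccard": inter / union,
        "dice": 2 * inter / (size_u + size_v),
        "lhn1": inter / (size_u * size_v),
    }


# Hypothetical rated-item sets over items 1..5.
I_u, I_v = {1, 2, 3, 4}, {3, 4, 5}
sims = set_similarities(len(I_u), len(I_v), len(I_u & I_v))

# Cosine on the equivalent binary rating vectors, for comparison with Salton.
r_u = [1 if i in I_u else 0 for i in range(1, 6)]
r_v = [1 if i in I_v else 0 for i in range(1, 6)]
dot = sum(a * b for a, b in zip(r_u, r_v))
cosine = dot / (math.sqrt(sum(r_u)) * math.sqrt(sum(r_v)))
```

Because every measure depends on the ratings only through these cardinalities, a protocol that reveals just the intersection size (together with the set sizes) is sufficient for the similarity step.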

Homomorphic encryption is a technique which allows certain operations to be performed on the ciphertext. Given two messages $m_1$ and $m_2$, additively homomorphic encryption schemes satisfy the following properties:

$$D_{sk}\big(E_{pk}(m_1) \cdot E_{pk}(m_2)\big) = m_1 + m_2 \tag{2}$$

$$D_{sk}\big(E_{pk}(m_1)^{m_2}\big) = m_1 \cdot m_2 \tag{3}$$

We adopt the Paillier cryptosystem [14] in our method, a classic additively homomorphic encryption system.
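Properties (2) and (3) can be checked with a toy Paillier implementation. This is a minimal sketch for illustration only: the primes 1789 and 1861 are far too small to be secure, the generator is fixed to $g = n + 1$ (a common simplification), and the function names `keygen`, `encrypt` and `decrypt` are chosen for this example. It assumes Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import random


def keygen(p: int = 1789, q: int = 1861):
    """Toy key generation from two small primes (not secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """E_pk(m) = g^m * r^n mod n^2 with g = n + 1 and random r coprime to n."""
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return pow(n + 1, m, n * n) * pow(r, n, n * n) % (n * n)


def decrypt(n: int, sk, c: int) -> int:
    """D_sk(c) = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) / n."""
    lam, mu = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n


n, sk = keygen()
c1, c2 = encrypt(n, 17), encrypt(n, 25)

sum_plain = decrypt(n, sk, c1 * c2 % (n * n))    # Formula (2): 17 + 25
prod_plain = decrypt(n, sk, pow(c1, 25, n * n))  # Formula (3): 17 * 25
```

Both identities hold modulo $n$, so the plaintexts and their sums must stay below $n$ for the results to be read as ordinary integers.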


Problem definition. Suppose that there are p parties and m items in the distributed system, and the data is horizontally partitioned. Each party has a number of users who rate the items, i.e., $O^{t}$ has the following users:

$$U^{t} = \{u_1^{t}, u_2^{t}, \ldots, u_{n_t}^{t}\}$$

User $u_i^{t}$ has an item rating vector $R_i^{t}$, and the item set rated by $u_i^{t}$ is represented by $I_i^{t}$.

The overall architecture of the privacy-preserving distributed CF system is depicted in Fig. 1.

Fig. 1. The distributed collaborative filtering infrastructure

The p parties cooperate with each other to establish a better shared CF model while preserving the privacy of their users' preferences, i.e., the ratings for items. We design the protocols under the semi-honest model [15], which means that each participant follows the protocol strictly but may keep the intermediate data to try to infer more information.

It is reasonable for small or medium companies to build a shared collaborative filtering model under the semi-honest model, in which they want to get more benefit from the data sharing without the invasion of other’s privacy.

III. New Privacy-Preserving Distributed Collaborative Filtering

In this section, we combine the secure set operation with the encryption technique to design the privacy-preserving CF schemes under the distributed system.

Private similarity computation

In order to recommend items to a target user, we need to compute the similarities between this user and all the others. As the other users are distributed among different parties, we need to design a Private Similarity Computation (PSC) which can compute the similarity between users without leaking the personal ratings. To explain our method clearly, we give the PSC for data distributed over two parties, which can be easily extended to multiple parties.

Suppose that two parties Alice and Bob have $n_a$ and $n_b$ users respectively. Then we simplify the representation of the similarity matrix of these users as:

$$S = \begin{pmatrix} A & C \\ C^{T} & B \end{pmatrix} \tag{4}$$

where $A$ and $B$ are $n_a \times n_a$ and $n_b \times n_b$ matrices respectively, which represent the similarities among the users belonging to Alice and among the users belonging to Bob. $C$ is an $n_a \times n_b$ matrix representing the similarity between two users, one from Alice and the other from Bob. To select the top-k most similar users to a targeted user, Alice only needs to know the matrices $A$ and $C$, while Bob only needs to know $B$ and $C$. Therefore, the problem reduces to computing the matrix $C$ in a secure manner without revealing the detailed rating scores. Based on the PSI-CA [13], we propose the private similarity computation as shown in Protocol 1.

Protocol 1. Private set intersection cardinality

Input. Alice: $I_1^{a}, I_2^{a}, \ldots, I_{n_a}^{a}$. Bob: $I_1^{b}, I_2^{b}, \ldots, I_{n_b}^{b}$. Operations written on a set are applied to each of its elements.

Step 1 (Alice). Choose $r_1^{a}, r_2^{a} \leftarrow \mathbb{Z}_q$ and compute $x = g^{r_1^{a}}$. For $1 \le i \le n_a$: $HI_i^{a} = H(I_i^{a})$ and $RH_i^{a} = (HI_i^{a})^{r_2^{a}}$.

Step 2 (Alice to Bob). Send $x, RH_1^{a}, \ldots, RH_{n_a}^{a}$.

Step 3 (Bob). Choose $r_1^{b}, r_2^{b} \leftarrow \mathbb{Z}_q$ and compute $y = g^{r_1^{b}}$. For $1 \le i \le n_a$: $DR_i^{a} = \Pi\big((RH_i^{a})^{r_2^{b}}\big)$. For $1 \le j \le n_b$: $HI_j^{b} = H(I_j^{b})$, $RH_j^{b} = x^{r_1^{b}} (HI_j^{b})^{r_2^{b}}$ and $DH_j^{b} = H'(RH_j^{b})$.

Step 4 (Bob to Alice). Send $y, DR_1^{a}, \ldots, DR_{n_a}^{a}$ and $DH_1^{b}, \ldots, DH_{n_b}^{b}$.

Step 5 (Alice). For $1 \le i \le n_a$: $TR_i^{a} = y^{r_1^{a}} (DR_i^{a})^{1/r_2^{a} \bmod q}$ and $DH_i^{a} = H'(TR_i^{a})$.

Output. $|DH_i^{a} \cap DH_j^{b}|$ for $1 \le i \le n_a$ and $1 \le j \le n_b$.

Alice has $n_a$ users, and each user has a rated item set $I_i^{a}$. Alice and Bob share the common primes $p$ and $q$ with $q \mid p-1$, where $p$ can be 1024 or 2048 bits long and $q$ 160 or 224 bits long. The protocol works with a generator $g$ of the subgroup of size $q$ and two hash functions, $H : \{0,1\}^{*} \to \mathbb{Z}_p^{*}$ and $H' : \{0,1\}^{*} \to \{0,1\}^{k}$, where $k$ is the security parameter.

From Protocol 1, we can see that Alice learns nothing about which items are the intersection as Bob shuffled the set of Alice. The similar privacy proof can be found in [13].
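The message flow of Protocol 1 can be sketched in Python as follows. This is an illustrative toy under several assumptions that are not in the paper: the group parameters are generated on the fly with a 61-bit subgroup order instead of the 160 or 224 bits suggested above, SHA-256 stands in for both $H$ and $H'$, exponents are drawn from $1$ to $q-1$ so that the inverse of $r_2^{a}$ exists, and both parties run in one process rather than exchanging messages over a network.

```python
import hashlib
import random


def is_prime(n: int) -> bool:
    """Miller-Rabin with a fixed set of bases (adequate for this demo)."""
    small = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41)
    if n < 2:
        return False
    for s in small:
        if n % s == 0:
            return n == s
    d, r = n - 1, 0
    while d % 2 == 0:
        d, r = d // 2, r + 1
    for a in small:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True


Q = 2**61 - 1  # subgroup order q (a Mersenne prime); toy size only
K = next(k for k in range(2, 10**6, 2) if is_prime(k * Q + 1))
P = K * Q + 1  # prime p with q | p - 1
G = next(g for g in (pow(h, K, P) for h in range(2, 100)) if g != 1)


def h_group(item) -> int:
    """H: hash an item into the order-q subgroup of Z_p^*."""
    digest = hashlib.sha256(str(item).encode()).digest()
    return pow(int.from_bytes(digest, "big") % P, K, P)


def h_prime(x: int) -> str:
    """H': hash a group element to a fixed-length tag."""
    return hashlib.sha256(str(x).encode()).hexdigest()


def psi_ca(alice_sets, bob_sets):
    """Return the matrix of |I_i^a & I_j^b| following the steps of Protocol 1."""
    # Step 1 (Alice): blind every hashed item with r2a.
    r1a, r2a = random.randrange(1, Q), random.randrange(1, Q)
    x = pow(G, r1a, P)
    rh_a = [[pow(h_group(i), r2a, P) for i in s] for s in alice_sets]
    # Step 3 (Bob): re-blind and shuffle Alice's values, tag his own items.
    r1b, r2b = random.randrange(1, Q), random.randrange(1, Q)
    y = pow(G, r1b, P)
    dr_a = []
    for blinded in rh_a:
        row = [pow(v, r2b, P) for v in blinded]
        random.shuffle(row)  # the permutation hides which items match
        dr_a.append(row)
    xb = pow(x, r1b, P)
    dh_b = [{h_prime(xb * pow(h_group(i), r2b, P) % P) for i in s} for s in bob_sets]
    # Step 5 (Alice): strip r2a, tag, and count matching tags.
    inv = pow(r2a, -1, Q)  # 1 / r2a mod q
    ya = pow(y, r1a, P)
    dh_a = [{h_prime(ya * pow(v, inv, P) % P) for v in row} for row in dr_a]
    return [[len(da & db) for db in dh_b] for da in dh_a]


alice_sets = [{"i1", "i2", "i3"}, {"i4"}]
bob_sets = [{"i2", "i3", "i5"}, {"i4", "i1"}]
counts = psi_ca(alice_sets, bob_sets)
```

In this toy run `counts[i][j]` equals the plain intersection size of Alice's i-th set and Bob's j-th set, while Alice only ever sees shuffled, doubly blinded values.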

Correctness. For the i-th user of Alice and the j-th user of Bob, the rated item sets $I_i^{a}$ and $I_j^{b}$ are processed by Protocol 1 as follows:

$$DH_i^{a} = H'\!\left(y^{r_1^{a}} \, (DR_i^{a})^{1/r_2^{a} \bmod q}\right) = H'\!\left(g^{r_1^{a} r_1^{b}} \, H(I_i^{a})^{r_2^{b}}\right)$$

$$DH_j^{b} = H'\!\left(x^{r_1^{b}} \, (HI_j^{b})^{r_2^{b}}\right) = H'\!\left(g^{r_1^{a} r_1^{b}} \, H(I_j^{b})^{r_2^{b}}\right)$$

Protocol 2. Secure top-n recommendation

Goal. Party $O^{t}$ recommends the top-n items to its user $u_i^{t}$.

Party $O^{t}$:

1. $S, Id^{P}, Id^{U} \leftarrow \mathrm{top\_neigs}(u_i^{t}, k)$
2. $(pk, sk) \leftarrow \mathrm{Paillier\_crypt}$
3. $a^{t} \leftarrow \mathrm{aggregate}(Id^{U}, O^{t})$ and $p^{t} = E_{pk}(a^{t})$
4. Send $pk, p^{t}, Id^{P}, Id^{U}$ to the other parties.

All other parties:

5. $C, f \leftarrow \mathrm{reorder}(Id^{P})$ and $s_0 = p^{t}$
6. For $j = 1 : |C| - 1$, party $O^{x}$ with $x = f(j)$ does: $a^{x} \leftarrow \mathrm{aggregate}(Id^{U}, O^{x})$; $s_j = E_{pk}(a^{x}) \cdot s_{j-1}$; send $s_j$ to $O^{f(j+1)}$. End
7. Party $O^{f(|C|)}$ does: $a^{f(|C|)} \leftarrow \mathrm{aggregate}(Id^{U}, O^{f(|C|)})$; $s_{|C|} = E_{pk}(a^{f(|C|)}) \cdot s_{|C|-1}$.

Party $O^{t}$:

8. $r_{u_i^{t}} = D_{sk}(s_{|C|}) \,/\, \mathrm{sum}(S)$
9. Return the top-n unrated items.

Returning to the correctness of Protocol 1: if two items $i^{a} \in I_i^{a}$ and $i^{b} \in I_j^{b}$ are equal, $i^{a} = i^{b}$, then there must exist two values $d^{a} \in DH_i^{a}$ and $d^{b} \in DH_j^{b}$ with $d^{a} = d^{b}$. Alice therefore learns the set intersection cardinality by counting the number of matching pairs.

Suppose that the number of items rated by each user can be shared with all parties. Then the similarity shown in Table I can be directly computed after the set intersection cardinality is securely learned.

Matrix C in Formula (4) can be computed as Formula (5):

$$C_{ij} = \frac{|I_i^{a} \cap I_j^{b}|}{|I_i^{a} \cup I_j^{b}|} = \frac{|DH_i^{a} \cap DH_j^{b}|}{|I_i^{a}| + |I_j^{b}| - |DH_i^{a} \cap DH_j^{b}|} \tag{5}$$

Other similarity measures can be computed similarly.

Finally, Alice shares the matrix C with Bob. If there are p parties in the distributed system, then each pair of them needs to cooperate to securely compute their matrix C.
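As a worked instance of the Jaccard entry of matrix C, with hypothetical numbers chosen only for illustration: suppose Alice's i-th user rated 5 items, Bob's j-th user rated 4 items, and the private set intersection cardinality protocol reports 3 matching values. Then

$$C_{ij} = \frac{3}{5 + 4 - 3} = \frac{3}{6} = 0.5$$

The union size in the denominator is never computed on the item sets themselves; it follows from the two shared set sizes and the intersection cardinality.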

III.1. Secure Top-N Item Recommendation

For a targeted user, the top-k neighbors are selected according to their similarities, and the rating predictions are produced by aggregating the ratings of its neighbors.

Suppose that the party $O^{t}$ wants to recommend items to its customer $u_i^{t}$. Protocol 2 shows the secure top-n item recommendation. The aggregate function in the protocol is defined in Formula (6):

$$\mathrm{aggregate}(Id^{U}, O^{x}) = \sum_{v \,\mid\, v = u_j^{x},\ j \in Id^{U}} s_{u_i^{t} v}\, r_v \tag{6}$$

The party $O^{t}$ first selects the top-k most similar users, which are usually distributed among different parties. The protocol gives the party indices $Id^{P}$ and the selected user indices $Id^{U}$ in each party. Then $O^{t}$ generates a pair of public and secret keys using the Paillier cryptosystem. According to Formula (2), the product of the encrypted ratings decrypts to the sum of the ratings.

Hence, the summation can be calculated without disclosing the rating value of each user. Finally, $O^{t}$ decrypts the ciphertext, obtains the predicted rating of each item, and recommends the top-n items to the targeted user.
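The encrypted aggregation chain can be sketched as follows. This is a simplified single-process illustration, not the authors' implementation: it uses a toy Paillier instance with insecure small primes, stores each similarity as an integer (similarity times 100, an assumption made here because Paillier plaintexts are integers), and represents each party by a hypothetical list of `(scaled_similarity, rating_vector)` pairs for its selected neighbours.

```python
import math
import random

P_, Q_ = 1789, 1861  # toy primes, far too small for real use
N, N2 = P_ * Q_, (P_ * Q_) ** 2
LAM = (P_ - 1) * (Q_ - 1) // math.gcd(P_ - 1, Q_ - 1)
MU = pow(LAM, -1, N)  # valid for generator g = N + 1


def encrypt(m: int) -> int:
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return pow(N + 1, m, N2) * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    return (pow(c, LAM, N2) - 1) // N * MU % N


def aggregate(neighbours):
    """Per-item sum of s_uv * r_vi over one party's selected neighbours."""
    n_items = len(neighbours[0][1])
    return [sum(s * r[i] for s, r in neighbours) for i in range(n_items)]


# Hypothetical selected neighbours: (similarity * 100, binary rating vector).
own = [(50, [1, 0, 1, 0])]                      # neighbours inside O^t
others = [
    [(40, [0, 1, 1, 0]), (30, [1, 1, 0, 0])],   # first party in the chain
    [(20, [0, 0, 1, 1])],                       # second party in the chain
]
sum_s = 50 + 40 + 30 + 20                       # sum(S), known to O^t

# O^t encrypts its own aggregate; each party multiplies in its encrypted share.
s = [encrypt(a) for a in aggregate(own)]
for party in others:
    s = [c * encrypt(a) % N2 for c, a in zip(s, aggregate(party))]

# Only O^t holds the secret key, so only O^t can read the totals.
totals = [decrypt(c) for c in s]
predicted = [t / sum_s for t in totals]
rated_by_target = {2}                           # items the target already rated
top_n = sorted(
    (i for i in range(len(predicted)) if i not in rated_by_target),
    key=lambda i: -predicted[i],
)[:2]
```

Each intermediate party sees only ciphertexts under the key of $O^{t}$, and $O^{t}$ sees only the aggregated totals, which is the property the chain is meant to provide. The scaling factor cancels in the final division because both the totals and `sum_s` carry it.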

IV. Experimental Evaluation

By analyzing our privacy-preserving protocols, we can conclude that the protocols incur zero loss of accuracy.

Therefore, in this section we show the improvement in recommendation quality when the parties cooperate with each other.

The datasets include Epinions [16] and Friendfeed [17].

We sample the original data; the resulting Epinions dataset contains 4726 users, 3907 items and 164221 ratings in total. In comparison, Friendfeed contains 3133 users who collected 4956 items, with 92351 ratings. The metrics used to evaluate the recommendation are precision, recall, F1 and HD. Their definitions can be found in [18].
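For orientation, the sketch below computes precision, recall and F1 for a single user's recommendation list using the standard textbook definitions. The exact definitions used in the experiments are those of [18] and may differ in detail (for example in how values are averaged over users); HD is not reproduced here. The function name and sample data are hypothetical.

```python
def precision_recall_f1(recommended, relevant):
    """Standard per-user precision, recall and F1 of a top-n list."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical example: 4 recommended items, 3 held-out relevant items.
p, r, f1 = precision_recall_f1(["a", "b", "c", "d"], {"b", "d", "e"})
```

Here two of the four recommended items are relevant, so precision is 1/2, recall is 2/3 and F1 is 4/7.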

The first experiment illustrates the improvement when each party cooperates with all the others. We divide each dataset into 5 parties. For the users in each party, we recommend the top-n items by analyzing the k most similar users within this party (isolated) compared with the k most similar users across all parties (cooperated). Figs. 2 and 3 show the results of the experiments conducted on the two datasets respectively. The x-axis represents the i-th party. Clearly, on both datasets, the recommendation results on the cooperated data are better than the results on the isolated data.

Next, we measure how much improvement can be obtained when a party cooperates with different numbers of parties.

We recommend the top-n items to the users of the first party by analyzing the k most similar users of this party compared with the users from p cooperating parties, where p ranges from 2 to 5. Figs. 4 and 5 show the results on the Friendfeed and Epinions datasets respectively. The trend is that the values of the measures increase as the number of cooperating parties increases.

However, when the number is 4 on Epinions, the values decrease slightly, which suggests that this party contains some users who act as noise for the users of the first party. We will address this problem in future work to avoid the decrease.

V. Conclusion

In this paper, we focus on the privacy problem concerning how to build a shared collaborative filtering model without disclosing any user’s privacy. For this problem, we designed a solution under the semi-honest model.

Theoretical analysis shows that our scheme, which combines secure set operations with encryption, preserves privacy while maintaining the accuracy of the rating prediction. The experimental results show that the recommendation accuracy can be improved by cooperating with others.

However, a noisy party may decrease the accuracy. In future work, we will focus on this problem.

Acknowledgements

This research work was supported by National Natural Science Foundation of China under Grant No.61003231.

Fig. 2. Recommendation results on Epinions. Panels: (a) Precision, (b) Recall, (c) F1, (d) HD; x-axis: different parties (1 to 5); series: Isolated, Cooperated.

Fig. 3. Recommendation results on Friendfeed. Panels: (a) Precision, (b) Recall, (c) F1, (d) HD; x-axis: different parties (1 to 5); series: Isolated, Cooperated.

Fig. 4. Cooperation with different number of parties on Friendfeed. Panels: (a) Precision, (b) Recall, (c) F1, (d) HD; x-axis: number of cooperated parties (1 to 5).

Fig. 5. Cooperation with different number of parties on Epinions. Panels: (a) Precision, (b) Recall, (c) F1, (d) HD; x-axis: number of cooperated parties (1 to 5).

References

[1] Sneha, Y.S., Mahadevan, G., Parvathi, R.M.S., Recommender system based on user ratings: A comprehensive study and future challenges, (2013) International Review on Computers and Software (IRECOS), 8 (7), pp. 1624-1635.

[2] Wu, J., Wu, Z., Mobile location-aware personalized recommendation with clustering-based collaborative filtering, (2012) International Review on Computers and Software (IRECOS), 7 (5), pp. 2231-2238.

[3] Yao, L., Yang, W., A context-aware recommender for trustworthy service, (2012) International Review on Computers and Software (IRECOS), 7 (6), pp. 3354-3359.

[4] Tripathy, P.K., Biswal, D., Multiple server indirect security authentication protocol for mobile networks using elliptic curve cryptography (ECC), (2013) International Review on Computers and Software (IRECOS), 8 (7), pp. 1571-1577.

[5] Eslamnezhad Namin, M., Badihiyeh Aghdam, F., Hosseinzadeh, M., A secure and efficient RFID mutual authentication protocol, (2011) International Journal on Communications Antenna and Propagation (IRECAP), 1 (5), pp. 429-433.
