PUDA - Privacy and Unforgeability for Data Aggregation


Iraklis Leontiadis, Kaoutar Elkhiyaoui, Melek Önen, Refik Molva

EURECOM, Sophia Antipolis, France

{firstname.lastname}@eurecom.fr

Abstract. Existing work on data collection and analysis for aggregation focuses mainly on confidentiality: the untrusted Aggregator learns only the aggregation result, and individual data inputs are not divulged. In this paper we extend the existing models with stronger security requirements. Apart from the privacy requirements with respect to the individual inputs, we ask for unforgeability of the aggregate result. We first define the new security requirements of the model. We then instantiate a protocol for private and unforgeable aggregation for multiple independent users. That is, multiple unsynchronized users who own personal sensitive information contribute their values in a secure way without interacting with each other: the Aggregator learns the result of a function without learning individual values, and moreover it constructs a proof, forwarded to a verifier, that convinces the latter of the correctness of the computation.

Our protocol is provably secure in the random oracle model.

1 Introduction

With the advent of the Big Data era, research on privacy-preserving data collection and analysis is intensifying. Users continuously produce data that becomes valuable whenever an Aggregator is interested in aggregating it. We therefore consider a scenario whereby an Aggregator collects individual data from multiple users who do not interact with each other and executes a function which outputs an aggregate value. This result is then forwarded to the Data Analyzer, who can finally extract useful information about the entire population. Various motivating examples of this generic scenario exist in the real world:

– The analysis of different user profiles and the derivation of statistics can help recommendation engines provide targeted advertisements. In such scenarios a service provider would collect data from each individual user (e.g., online purchases), thus acting as an Aggregator, and compute an on-demand aggregate value upon receiving a request from the advertisement company. The latter, acting as a Data Analyzer, will further infer some statistics in order to send the appropriate advertisements to each category of users.

– Data aggregation is a promising tool in the field of healthcare research. Different types of data sensed by body sensors (e.g., blood pressure) are collected on a large scale in data enclaves, which can be considered as Aggregators. Health scientists who act as Data Analyzers are interested in inferring some statistical information from these data without having access to each individual input (for privacy and performance reasons). An aggregate value computed over a large population would give very useful information for deriving statistical models, evaluating therapeutic performance or learning the likelihood of upcoming patients’ diseases.

Unfortunately, existing solutions only focus on the problem of data confidentiality and consider the Aggregator to be honest-but-curious: the Aggregator is curious about the content of each individual data input, but performs the aggregation operation correctly. In this paper we consider a more powerful security model and assume that the Aggregator is untrusted: the Aggregator may provide a bogus aggregate value to the Data Analyzer. In order to protect against such malicious behavior, we propose that along with the aggregate value, the Aggregator provides a proof of the correctness of the computation of the aggregate result.

The underlying idea of our solution is that each user encrypts its data according to the scheme of Shi et al. [15] using its own secret encryption key, and sends the resulting ciphertext to the untrusted Aggregator. Users also homomorphically tag their data using two layers of randomness with two different keys, and forward the tags to the Aggregator. The latter computes the sum by applying operations on the ciphertexts, and it also computes a proof of the correctness of the result from the tags. The Aggregator finally sends the result and the proof to the Data Analyzer, which verifies the correctness of the computation. We also require that the Data Analyzer does not communicate with individual users and that the result is publicly verifiable. Moreover, similarly to the existing solutions, the proposed protocol assures obliviousness against the Aggregator and the Data Analyzer in the multi-user setting, meaning that neither the Data Analyzer nor the Aggregator learns individual data inputs.

To the best of our knowledge we are the first to define a model for Privacy and Unforgeability for Data Aggregation (PUDA). We also instantiate a PUDA scheme which mainly pursues the following three objectives:

– Multi-user setting where multiple users produce personal sensitive data without interacting with each other.

– Public verifiability of the aggregate value.

– Privacy of individual data for all participants.

2 Problem Statement

We envision a scenario whereby a set of users $\mathbb{U} = \{U_i\}_{i=1}^{n}$ produce sensitive data inputs $x_{i,t}$ at each time interval $t$. These individual data are first encrypted into ciphertexts $c_{i,t}$ and then forwarded to an untrusted Aggregator $\mathcal{A}$. Aggregator $\mathcal{A}$ aggregates all the received ciphertexts, decrypts the aggregate and forwards the resulting plaintext to a Data Analyzer $\mathcal{DA}$ together with a cryptographic proof that assures the correctness of the aggregation operation, which in this paper corresponds to the sum of the users’ individual data. An important criterion that we aim to fulfill in this paper is to ensure that Data Analyzer $\mathcal{DA}$ verifies the correctness of the Aggregator’s output without compromising users’ privacy. Namely, at the end of the verification operation, both Aggregator $\mathcal{A}$ and Data Analyzer $\mathcal{DA}$ learn nothing but the value of the aggregation.


While homomorphic signatures proposed in [4, 10] seem to answer the verifiability requirement, authors in those papers only consider scenarios where a single user generates data.

With the aim of assuring both individual users’ privacy and unforgeable aggregation, we first come up with a generic model for privacy-preserving and unforgeable aggregation that identifies the algorithms necessary to implement such functionalities and defines the corresponding privacy and security models. Furthermore, we propose a concrete solution which combines an existing privacy-preserving aggregation scheme [15] with an additively homomorphic tag designed for bilinear groups.

Specifically, a scheme that allows a malicious Aggregator to compute the sum of users’ data in a privacy-preserving manner and to produce a proof of correct aggregation starts by running a setup phase. During setup, each user receives a secret key that will be used to encrypt the user’s private input and to generate the corresponding authentication tag; the Aggregator $\mathcal{A}$ and the Data Analyzer $\mathcal{DA}$, on the other hand, are provided with a secret decryption key and a public verification key, respectively. After the key distribution, each user sends its data encrypted and authenticated to Aggregator $\mathcal{A}$, while making sure that the computed ciphertext and the matching authentication tag leak no information about its private input. On receiving users’ data, Aggregator $\mathcal{A}$ first aggregates the received ciphertexts and decrypts the sum using its decryption key, then uses the received authentication tags to produce a proof that demonstrates the correctness of the decrypted sum. Finally, Data Analyzer $\mathcal{DA}$ verifies the correctness of the aggregation using the public verification key.

2.1 PUDA Model

A PUDA scheme consists of the following algorithms:

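– $\mathsf{Setup}(1^\kappa) \to (\mathcal{P}, \mathsf{SK}_A, \{\mathsf{SK}_i\}_{U_i \in \mathbb{U}}, \mathsf{VK})$: A randomized algorithm run by a trusted dealer $\mathcal{KD}$, which on input of a security parameter $\kappa$ outputs the public parameters $\mathcal{P}$ that will be used by subsequent algorithms, the Aggregator $\mathcal{A}$’s secret key $\mathsf{SK}_A$, the secret keys $\mathsf{SK}_i$ of users $U_i$ and the public verification key $\mathsf{VK}$.

– $\mathsf{EncTag}(t, \mathsf{SK}_i, x_{i,t}) \to (c_{i,t}, \sigma_{i,t})$: A randomized algorithm which on input of time interval $t$, secret key $\mathsf{SK}_i$ of user $U_i$ and data $x_{i,t}$, encrypts $x_{i,t}$ to get a ciphertext $c_{i,t}$ and computes a tag $\sigma_{i,t}$ that authenticates $x_{i,t}$.

– $\mathsf{Aggregate}(\mathsf{SK}_A, \{c_{i,t}\}_{U_i \in \mathbb{U}}, \{\sigma_{i,t}\}_{U_i \in \mathbb{U}}) \to (\mathsf{sum}_t, \sigma_t)$: A deterministic algorithm run by the Aggregator $\mathcal{A}$. It takes as inputs Aggregator $\mathcal{A}$’s secret key $\mathsf{SK}_A$, ciphertexts $\{c_{i,t}\}_{U_i \in \mathbb{U}}$ and authentication tags $\{\sigma_{i,t}\}_{U_i \in \mathbb{U}}$, and outputs in cleartext the sum $\mathsf{sum}_t$ of the values $\{x_{i,t}\}_{U_i \in \mathbb{U}}$. Moreover, it computes a proof $\sigma_t$ attesting to the correctness of $\mathsf{sum}_t$, using the authentication tags $\{\sigma_{i,t}\}_{U_i \in \mathbb{U}}$.

– $\mathsf{Verify}(\mathsf{VK}, \mathsf{sum}_t, \sigma_t) \to \{0,1\}$: A deterministic algorithm executed by the Data Analyzer $\mathcal{DA}$. It outputs $1$ if Data Analyzer $\mathcal{DA}$ is convinced that $\mathsf{sum}_t = \sum_{U_i \in \mathbb{U}} x_{i,t}$, and $0$ otherwise.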

2.2 Security Model

In this paper, we only focus on the adversarial behavior of Aggregator $\mathcal{A}$. The rationale behind this is that Aggregator $\mathcal{A}$ is the only party in the protocol that sees all the messages exchanged during the protocol execution: namely, Aggregator $\mathcal{A}$ has access to users’ ciphertexts and it is the party that interacts directly with the Data Analyzer.

It follows that ensuring security properties against the Aggregator also ensures these security properties against both Data Analyzer $\mathcal{DA}$ and external parties.

In accordance with previous work [11, 15], we formalize the property of Aggregator obliviousness, which ensures that at the end of a protocol execution, Aggregator $\mathcal{A}$ only learns the sum of users’ inputs $x_{i,t}$ and nothing else. Also, we enhance the security definitions of data aggregation with the notion of aggregate unforgeability. As the name implies, aggregate unforgeability guarantees that Aggregator $\mathcal{A}$ cannot forge a valid proof $\sigma_t$ for a sum $\mathsf{sum}_t$ that was not computed correctly from users’ inputs (i.e. cannot generate a proof for $\mathsf{sum}_t \neq \sum x_{i,t}$).

Aggregator Obliviousness. Aggregator obliviousness ensures that when users $U_i$ provide Aggregator $\mathcal{A}$ with ciphertexts $c_{i,t}$ and authentication tags $\sigma_{i,t}$, Aggregator $\mathcal{A}$ cannot learn any information about individual inputs $x_{i,t}$ other than the sum value $\sum x_{i,t}$. We extend the existing definition of Aggregator obliviousness (cf. [11, 12, 15]) so as to capture the fact that Aggregator $\mathcal{A}$ not only has access to ciphertexts $c_{i,t}$, but also has access to the authentication tags $\sigma_{i,t}$ that enable Aggregator $\mathcal{A}$ to generate proofs of correct aggregation.

Similarly to the work of [11, 15], we formalize Aggregator obliviousness using an indistinguishability-based game in which Aggregator $\mathcal{A}$ accesses the following oracles:

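– $\mathcal{O}_{\mathsf{Setup}}$: When called by Aggregator $\mathcal{A}$, this oracle initializes the system parameters; it then gives the public parameters $\mathcal{P}$, the Aggregator’s secret key $\mathsf{SK}_A$ and the public verification key $\mathsf{VK}$ to $\mathcal{A}$.

– $\mathcal{O}_{\mathsf{Corrupt}}$: When queried by Aggregator $\mathcal{A}$ with a user $U_i$’s identifier $\mathsf{uid}_i$, this oracle provides Aggregator $\mathcal{A}$ with $U_i$’s secret key, denoted $\mathsf{SK}_i$.

– $\mathcal{O}_{\mathsf{EncTag}}$: When queried with time $t$, user $U_i$’s identifier $\mathsf{uid}_i$ and a data point $x_{i,t}$, this oracle outputs the ciphertext $c_{i,t}$ and the authentication tag $\sigma_{i,t}$ of $x_{i,t}$ computed using $U_i$’s secret key $\mathsf{SK}_i$.

– $\mathcal{O}_{\mathsf{AO}}$: When called with a subset of users $\mathbb{S} \subset \mathbb{U}$ and with two time-series $(U_i, t, x^0_{i,t})_{U_i \in \mathbb{S}}$ and $(U_i, t, x^1_{i,t})_{U_i \in \mathbb{S}}$ such that $\sum x^0_{i,t} = \sum x^1_{i,t}$, this oracle flips a random coin $b \in \{0,1\}$ and returns an encryption of the time-series $(U_i, t, x^b_{i,t})_{U_i \in \mathbb{S}}$ (that is, the tuple of ciphertexts $\{c^b_{i,t}\}_{U_i \in \mathbb{S}}$) and the corresponding authentication tags $\{\sigma^b_{i,t}\}_{U_i \in \mathbb{S}}$.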

Aggregator $\mathcal{A}$ accesses the aforementioned oracles during a learning phase (cf. Algorithm 1) and a challenge phase (cf. Algorithm 2). In the learning phase, $\mathcal{A}$ calls oracle $\mathcal{O}_{\mathsf{Setup}}$, which in turn returns the public parameters $\mathcal{P}$, the public verification key $\mathsf{VK}$ and the Aggregator’s secret key $\mathsf{SK}_A$. It also interacts with oracle $\mathcal{O}_{\mathsf{Corrupt}}$ to learn the secret keys $\mathsf{SK}_i$ of users $U_i$, and with oracle $\mathcal{O}_{\mathsf{EncTag}}$ to get a set of ciphertexts $c_{i,t}$ and authentication tags $\sigma_{i,t}$.

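In the challenge phase, Aggregator $\mathcal{A}$ chooses a subset $\mathbb{S}^*$ of users that were not corrupted in the learning phase, and a challenge time interval $t^*$ for which it did not make an encryption query. Oracle $\mathcal{O}_{\mathsf{AO}}$ then receives two time-series $\mathcal{X}^0_{t^*} = (U_i, t^*, x^0_{i,t^*})_{U_i \in \mathbb{S}^*}$ and $\mathcal{X}^1_{t^*} = (U_i, t^*, x^1_{i,t^*})_{U_i \in \mathbb{S}^*}$ from $\mathcal{A}$, such that $\sum_{U_i \in \mathbb{S}^*} x^0_{i,t^*} = \sum_{U_i \in \mathbb{S}^*} x^1_{i,t^*}$. Then oracle $\mathcal{O}_{\mathsf{AO}}$ flips a random coin $b \xleftarrow{\$} \{0,1\}$ and returns to $\mathcal{A}$ the ciphertexts $\{c^b_{i,t^*}\}_{U_i \in \mathbb{S}^*}$ and the matching authentication tags $\{\sigma^b_{i,t^*}\}_{U_i \in \mathbb{S}^*}$.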

At the end of the challenge phase, Aggregator $\mathcal{A}$ outputs a guess $b^*$ for the bit $b$.

We say that Aggregator $\mathcal{A}$ succeeds in the Aggregator obliviousness game if its guess $b^*$ equals $b$.

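Algorithm 1: Learning phase of the obliviousness game

    $(\mathcal{P}, \mathsf{SK}_A, \mathsf{VK}) \leftarrow \mathcal{O}_{\mathsf{Setup}}(1^\kappa)$;
    // $\mathcal{A}$ executes the following a polynomial number of times
    $\mathsf{SK}_i \leftarrow \mathcal{O}_{\mathsf{Corrupt}}(\mathsf{uid}_i)$;
    // $\mathcal{A}$ is allowed to call $\mathcal{O}_{\mathsf{EncTag}}$ for all users $U_i$
    $(c_{i,t}, \sigma_{i,t}) \leftarrow \mathcal{O}_{\mathsf{EncTag}}(t, \mathsf{uid}_i, x_{i,t})$;

Algorithm 2: Challenge phase of the obliviousness game

    $\mathcal{A} \to t^*, \mathbb{S}^*$;
    $\mathcal{A} \to \mathcal{X}^0_{t^*}, \mathcal{X}^1_{t^*}$;
    $(c^b_{i,t^*}, \sigma^b_{i,t^*})_{U_i \in \mathbb{S}^*} \leftarrow \mathcal{O}_{\mathsf{AO}}(\mathcal{X}^0_{t^*}, \mathcal{X}^1_{t^*})$;
    $\mathcal{A} \to b^*$;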

Definition 1 (Aggregator Obliviousness). Let $\Pr[\mathcal{A}^{AO}]$ denote the probability that Aggregator $\mathcal{A}$ outputs $b^* = b$. Then an aggregation protocol is said to ensure Aggregator obliviousness if for any polynomially bounded Aggregator $\mathcal{A}$ the probability $\Pr[\mathcal{A}^{AO}] \le \frac{1}{2} + \epsilon(\kappa)$, where $\epsilon$ is a negligible function and $\kappa$ is the security parameter.

Aggregate Unforgeability. We augment the security requirements of data aggregation with the requirement of aggregate unforgeability. More precisely, we assume that Aggregator $\mathcal{A}$ is not only interested in compromising the privacy of users participating in the data aggregation protocol, but also interested in tampering with the sum of users’ inputs. That is, Aggregator $\mathcal{A}$ may sometimes have an incentive to feed Data Analyzer $\mathcal{DA}$ erroneous sums. Along these lines, we define aggregate unforgeability as the security feature that ensures that Aggregator $\mathcal{A}$ cannot convince Data Analyzer $\mathcal{DA}$ to accept a bogus sum, as long as users $U_i$ in the system are honest (i.e. they always submit their correct input and do not collude with the Aggregator $\mathcal{A}$).

In compliance with previous work [7, 10] on homomorphic signatures, we formalize aggregate unforgeability via a game in which Aggregator $\mathcal{A}$ accesses oracles $\mathcal{O}_{\mathsf{Setup}}$ and $\mathcal{O}_{\mathsf{EncTag}}$. Furthermore, given that anyone holding the public verification key $\mathsf{VK}$ can execute the algorithm $\mathsf{Verify}$, we assume that Aggregator $\mathcal{A}$ runs the algorithm $\mathsf{Verify}$ by itself during the unforgeability game.

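As shown in Algorithm 3, Aggregator $\mathcal{A}$ enters the aggregate unforgeability game by querying the oracle $\mathcal{O}_{\mathsf{Setup}}$ with a security parameter $\kappa$. Oracle $\mathcal{O}_{\mathsf{Setup}}$ accordingly returns public parameters $\mathcal{P}$, verification key $\mathsf{VK}$ and the secret key $\mathsf{SK}_A$ of Aggregator $\mathcal{A}$. Moreover, Aggregator $\mathcal{A}$ calls oracle $\mathcal{O}_{\mathsf{EncTag}}$ with tuples $(t, \mathsf{uid}_i, x_{i,t})$ in order to receive the ciphertext $c_{i,t}$ encrypting $x_{i,t}$ and the matching authentication tag $\sigma_{i,t}$, both computed using user $U_i$’s secret key $\mathsf{SK}_i$. Note that for each time interval $t$, Aggregator $\mathcal{A}$ is allowed to query oracle $\mathcal{O}_{\mathsf{EncTag}}$ for user $U_i$ only once. In other words, Aggregator $\mathcal{A}$ cannot submit two distinct queries to oracle $\mathcal{O}_{\mathsf{EncTag}}$ with the same time interval $t$ and the same user identifier $\mathsf{uid}_i$. Without loss of generality, we suppose that for each time interval $t$, Aggregator $\mathcal{A}$ invokes oracle $\mathcal{O}_{\mathsf{EncTag}}$ for all users $U_i$ in the system.

Algorithm 3: Learning phase of the aggregate unforgeability game

    $(\mathcal{P}, \mathsf{SK}_A, \mathsf{VK}) \leftarrow \mathcal{O}_{\mathsf{Setup}}(1^\kappa)$;
    // $\mathcal{A}$ executes the following a polynomial number of times
    // $\mathcal{A}$ is allowed to call $\mathcal{O}_{\mathsf{EncTag}}$ for all users $U_i$
    $(c_{i,t}, \sigma_{i,t}) \leftarrow \mathcal{O}_{\mathsf{EncTag}}(t, \mathsf{uid}_i, x_{i,t})$;

Algorithm 4: Challenge phase of the aggregate unforgeability game

    $(t^*, \mathsf{sum}_{t^*}, \sigma_{t^*}) \leftarrow \mathcal{A}$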

At the end of the aggregate unforgeability game (see Algorithm 4), Aggregator $\mathcal{A}$ outputs a tuple $(t^*, \mathsf{sum}_{t^*}, \sigma_{t^*})$. We say that Aggregator $\mathcal{A}$ wins the aggregate unforgeability game if one of the following statements holds:

1. $\mathsf{Verify}(\mathsf{VK}, \mathsf{sum}_{t^*}, \sigma_{t^*}) \to 1$ and Aggregator $\mathcal{A}$ never made a query to oracle $\mathcal{O}_{\mathsf{EncTag}}$ that comprises time interval $t^*$. In the remainder of this paper, we call this type of forgery a Type I Forgery.

2. $\mathsf{Verify}(\mathsf{VK}, \mathsf{sum}_{t^*}, \sigma_{t^*}) \to 1$ and Aggregator $\mathcal{A}$ has made a query to oracle $\mathcal{O}_{\mathsf{EncTag}}$ for time $t^*$, but the sum $\mathsf{sum}_{t^*} \neq \sum_{U_i} x_{i,t^*}$. In what follows, we call this type of forgery a Type II Forgery.

Definition 2 (Aggregate Unforgeability). Let $\Pr[\mathcal{A}^{AU}]$ denote the probability that Aggregator $\mathcal{A}$ wins the aggregate unforgeability game, that is, the probability that Aggregator $\mathcal{A}$ outputs a Type I Forgery or a Type II Forgery that will be accepted by algorithm $\mathsf{Verify}$.

An aggregation protocol is said to ensure aggregate unforgeability if for any polynomially bounded adversary $\mathcal{A}$, $\Pr[\mathcal{A}^{AU}] \le \epsilon(\kappa)$, where $\epsilon$ is a negligible function in the security parameter $\kappa$.
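As a reading aid, the two winning conditions of the game can be written as a small bookkeeping routine that a challenger might run once the adversary has output its tuple. The sketch below is illustrative only: the function name `classify_forgery`, the `queries` dictionary and the boolean `accepted` (standing for the output of Verify) are assumptions introduced here and are not part of the formal game.

```python
from typing import Dict, Optional

# Hypothetical bookkeeping: queries[t][uid] = x_{i,t} records every O_EncTag query.
Queries = Dict[int, Dict[int, int]]


def classify_forgery(queries: Queries, t_star: int, sum_star: int,
                     accepted: bool) -> Optional[str]:
    """Return "Type I", "Type II", or None if the adversary does not win."""
    if not accepted:                      # Verify(VK, sum*, sigma*) output 0
        return None
    if t_star not in queries:             # no O_EncTag query involved t*
        return "Type I"
    honest_sum = sum(queries[t_star].values())
    if sum_star != honest_sum:            # accepted sum differs from the true sum
        return "Type II"
    return None                           # correct sum: not a forgery
```

For instance, with queries recorded only for time interval 1, an accepted output for time interval 2 would be classified as Type I, while an accepted output for time interval 1 with a wrong sum would be classified as Type II.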

3 Idea of our PUDA protocol

In an extended model with an untrusted Aggregator, it is of utmost importance to design a solution in which the untrusted Aggregator cannot provide bogus results to the Data Analyzer. Such a solution will use a proof system that enables the Data Analyzer to verify the correctness of the computation. Yet verifiability should be achieved without


sacrificing privacy. Towards this goal, we propose a protocol that relies on the following techniques:

– A homomorphic encryption algorithm that allows the Aggregator to compute the sum without divulging individual data.

– A homomorphic tag that allows each user to authenticate the data input $x_{i,t}$, in such a way that the Aggregator can use the collected tags to construct a proof that demonstrates to the Data Analyzer $\mathcal{DA}$ the correctness of the aggregate sum.

Concisely, a set of non-interacting users are connected to personal services and devices that produce personal data. Without any coordination, each user chooses a random tag key $\mathsf{tk}_i$ and sends an encoding of it to the key dealer (in the instantiation of Section 4.2, this encoding is $g_2^{\mathsf{tk}_i}$). After collecting the encoded keys from all users, the key dealer publishes the public verification key $\mathsf{VK}$ of this group of users. This verification key is computed as a function of the encoded tag keys. Later, the key dealer gives each user in the system an encryption key $\mathsf{ek}_i$ that will be used to compute the user’s ciphertexts. Accordingly, the secret key $\mathsf{SK}_i$ of each user is defined as the pair of tag key $\mathsf{tk}_i$ and encryption key $\mathsf{ek}_i$. Finally, the key dealer provides the Aggregator with the secret key $\mathsf{SK}_A$, computed from the sum of the encryption keys $\mathsf{ek}_i$ (in Section 4.1, $\mathsf{SK}_A = -\sum_{i=1}^{n} \mathsf{ek}_i$), and goes off-line.

Now at each time interval $t$, each user employs its secret key $\mathsf{SK}_i$ to compute a ciphertext based on the encryption algorithm of Shi et al. [15] and a homomorphic tag on its sensitive data input. When the Aggregator collects the ciphertexts and the tags from all users, it computes the sum $\mathsf{sum}_t$ of users’ data and a proof $\sigma_t$ for the sum, and forwards the sum and the proof to the Data Analyzer. At the final step of the protocol, the Data Analyzer verifies the validity of the result $\mathsf{sum}_t$ with the verification key $\mathsf{VK}$ and proof $\sigma_t$. Although the modification seems straightforward, the proof for Type II Forgery turns out to be challenging.

Thanks to the homomorphic encryption algorithm of Shi et al. [15] and the way in which we construct our homomorphic tags, we show that our protocol ensures Aggregator obliviousness. Moreover, we show that the Aggregator cannot forge bogus results.

Finally, we note that the Data Analyzer $\mathcal{DA}$ does not keep any state with respect to users’ transcripts, be they ciphertexts or tags; it only holds the public verification key, the sum $\mathsf{sum}_t$ and the proof $\sigma_t$.

4 PUDA Instantiation

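Let $\mathbb{G}_1, \mathbb{G}_2, \mathbb{G}_T$ be cyclic groups of large prime order $p$, and let $g_1, g_2$ be generators of $\mathbb{G}_1, \mathbb{G}_2$ respectively. We say that $e : \mathbb{G}_1 \times \mathbb{G}_2 \to \mathbb{G}_T$ is a bilinear map if the following properties are satisfied:

1. Bilinearity: $e(g_1^a, g_2^b) = e(g_1, g_2)^{ab}$, where $(g_1, g_2) \in \mathbb{G}_1 \times \mathbb{G}_2$ and $a, b \in \mathbb{Z}_p$.

2. Computability: there exists an efficient algorithm that computes $e(g_1^a, g_2^b)$, where $(g_1, g_2) \in \mathbb{G}_1 \times \mathbb{G}_2$ and $a, b \in \mathbb{Z}_p$.

3. Non-degeneracy: $e(g_1, g_2) \neq 1$.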

For encryption and sum computation we employ the discrete-logarithm-based encryption scheme of Shi et al. [15]:


4.1 Shi-Chan-Rieffel-Chow-Song Scheme

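– $\mathsf{Setup}(1^\kappa)$: Let $\mathbb{G}_1$ be a group of large prime order $p$. A trusted key dealer $\mathcal{KD}$ selects a hash function $H : \{0,1\}^* \to \mathbb{G}_1$. Furthermore, $\mathcal{KD}$ selects secret encryption keys $\mathsf{ek}_i \in \mathbb{Z}_p$ uniformly at random. $\mathcal{KD}$ distributes to each user $U_i$ the secret key $\mathsf{ek}_i$, and it also sends the secret key $\mathsf{SK}_A = -\sum_{i=1}^{n} \mathsf{ek}_i$ to the Aggregator.

– $\mathsf{Encrypt}(\mathsf{ek}_i, x_{i,t})$: Each user $U_i$ encrypts the value $x_{i,t}$ using its secret encryption key $\mathsf{ek}_i$ and outputs the corresponding ciphertext $c_{i,t} = H(t)^{\mathsf{ek}_i} g_1^{x_{i,t}} \in \mathbb{G}_1$.

– $\mathsf{Aggregate}(\{c_{i,t}\}_{U_i \in \mathbb{U}}, \{\sigma_{i,t}\}_{U_i \in \mathbb{U}}, \mathsf{SK}_A)$: Upon receiving all the ciphertexts $\{c_{i,t}\}_{i=1}^{n}$, the Aggregator computes:

$$V_t = \Big(\prod_{i=1}^{n} c_{i,t}\Big) H(t)^{\mathsf{SK}_A} = H(t)^{\sum_{i=1}^{n} \mathsf{ek}_i}\, g_1^{\sum_{i=1}^{n} x_{i,t}}\, H(t)^{-\sum_{i=1}^{n} \mathsf{ek}_i} = g_1^{\sum_{i=1}^{n} x_{i,t}} \in \mathbb{G}_1.$$

Finally $\mathcal{A}$ learns the sum $\mathsf{sum}_t = \sum_{i=1}^{n} x_{i,t} \in \mathbb{Z}_p$ by computing the discrete logarithm of $V_t$ to the base $g_1$. The sum computation is correct as long as $\sum_{i=1}^{n} x_{i,t} < p$.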

4.2 PUDA Scheme

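In what follows we describe our PUDA protocol:

– $\mathsf{Setup}(1^\kappa)$: $\mathcal{KD}$ outputs $(p, g_1, g_2, \mathbb{G}_1, \mathbb{G}_2, \mathbb{G}_T)$ for an efficiently computable bilinear map $e : \mathbb{G}_1 \times \mathbb{G}_2 \to \mathbb{G}_T$, where $g_1$ and $g_2$ are two random generators of the multiplicative groups $\mathbb{G}_1$ and $\mathbb{G}_2$ respectively, and $p$ is a prime number that denotes the order of all the groups $\mathbb{G}_1$, $\mathbb{G}_2$ and $\mathbb{G}_T$. Moreover, a secret key $a$ is selected by $\mathcal{KD}$. Each $U_i$ selects a random tag key $\mathsf{tk}_i \in \mathbb{Z}_p$ independently and forwards $g_2^{\mathsf{tk}_i}$ to $\mathcal{KD}$. $\mathcal{KD}$ publishes the verification key $\mathsf{VK} = (\mathsf{vk}_1, \mathsf{vk}_2) = (g_2^{\sum_{i=1}^{n} \mathsf{tk}_i}, g_2^{a})$ and distributes to each user $U_i \in \mathbb{U}$ the secret key $g_1^{a} \in \mathbb{G}_1$ through a secure channel. Thus the secret keys of the scheme are $\mathsf{SK}_i = (\mathsf{ek}_i, \mathsf{tk}_i, g_1^{a})$. After publishing the public parameters $\mathcal{P} = (H, p, g_1, g_2, \mathbb{G}_1, \mathbb{G}_2, \mathbb{G}_T)$ and the verification key $\mathsf{VK}$, $\mathcal{KD}$ goes off-line and does not participate further in any protocol phase.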

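– $\mathsf{EncTag}(t, \mathsf{SK}_i = (\mathsf{ek}_i, \mathsf{tk}_i, g_1^{a}), x_{i,t})$: At each time interval $t$, each user $U_i$ encrypts the data value $x_{i,t}$ with its secret encryption key $\mathsf{ek}_i$, using the encryption algorithm described in Section 4.1, which results in a ciphertext

$$c_{i,t} = H(t)^{\mathsf{ek}_i} g_1^{x_{i,t}} \in \mathbb{G}_1.$$

$U_i$ also constructs a tag $\sigma_{i,t} \in \mathbb{G}_1$ with its secret tag key $(\mathsf{tk}_i, g_1^{a})$:

$$\sigma_{i,t} = H(t)^{\mathsf{tk}_i} (g_1^{a})^{x_{i,t}} \in \mathbb{G}_1.$$

Finally $U_i$ sends $(c_{i,t}, \sigma_{i,t})$ to $\mathcal{A}$.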

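– $\mathsf{Aggregate}(\mathsf{SK}_A, \{c_{i,t}\}_{U_i \in \mathbb{U}}, \{\sigma_{i,t}\}_{U_i \in \mathbb{U}})$: Aggregator $\mathcal{A}$ computes the sum $\mathsf{sum}_t = \sum_{i=1}^{n} x_{i,t}$ by using the $\mathsf{Aggregate}$ algorithm presented in Section 4.1. Moreover, $\mathcal{A}$ aggregates the corresponding tags as follows:

$$\sigma_t = \prod_{i=1}^{n} \sigma_{i,t} = \prod_{i=1}^{n} H(t)^{\mathsf{tk}_i} (g_1^{a})^{x_{i,t}} = H(t)^{\sum \mathsf{tk}_i} (g_1^{a})^{\sum x_{i,t}}$$

$\mathcal{A}$ finally forwards $\mathsf{sum}_t$ and $\sigma_t$ to Data Analyzer $\mathcal{DA}$.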


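– $\mathsf{Verify}(\mathsf{VK}, \mathsf{sum}_t, \sigma_t)$: During the verification phase, $\mathcal{DA}$ verifies the correctness of the computation with the verification key $\mathsf{VK} = (\mathsf{vk}_1 = g_2^{\sum \mathsf{tk}_i}, \mathsf{vk}_2 = g_2^{a})$, by checking the following equality:

$$e(\sigma_t, g_2) \stackrel{?}{=} e(H(t), \mathsf{vk}_1)\, e(g_1^{\mathsf{sum}_t}, \mathsf{vk}_2)$$

Verification correctness follows from the properties of bilinear pairings:

$$
\begin{aligned}
e(\sigma_t, g_2) &= e\Big(\prod_{i=1}^{n} \sigma_{i,t},\, g_2\Big) = e\Big(\prod_{i=1}^{n} H(t)^{\mathsf{tk}_i} g_1^{a x_{i,t}},\, g_2\Big) \\
&= e\Big(H(t)^{\sum_{i=1}^{n} \mathsf{tk}_i}\, g_1^{a \sum_{i=1}^{n} x_{i,t}},\, g_2\Big) \\
&= e\Big(H(t)^{\sum_{i=1}^{n} \mathsf{tk}_i},\, g_2\Big)\, e\Big(g_1^{a \sum_{i=1}^{n} x_{i,t}},\, g_2\Big) \\
&= e\Big(H(t),\, g_2^{\sum_{i=1}^{n} \mathsf{tk}_i}\Big)\, e\Big(g_1^{\sum_{i=1}^{n} x_{i,t}},\, g_2^{a}\Big) \\
&= e\Big(H(t),\, g_2^{\sum_{i=1}^{n} \mathsf{tk}_i}\Big)\, e\big(g_1^{\mathsf{sum}_t},\, g_2^{a}\big) \\
&= e(H(t), \mathsf{vk}_1)\, e(g_1^{\mathsf{sum}_t}, \mathsf{vk}_2)
\end{aligned}
$$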

5 Analysis

5.1 Aggregator Obliviousness

Theorem 1. The proposed solution achieves aggregator obliviousness in the random oracle model under the decisional Diffie-Hellman (DDH) assumption inG1.

Proof. Assume there is an aggregator $\mathcal{A}$ which breaks the obliviousness of the PUDA scheme with a non-negligible advantage $\epsilon$. We build in what follows an adversary $\mathcal{B}$ who uses $\mathcal{A}$ as a subroutine to break the aggregator obliviousness of the private streaming aggregation (PSA) protocol presented in [15], which is guaranteed under DDH. Without loss of generality we name the oracles that the adversary $\mathcal{B}$ has access to in the PSA scheme as follows: $\mathcal{O}^{\mathsf{PSA}}_{\mathsf{Setup}}$, $\mathcal{O}^{\mathsf{PSA}}_{\mathsf{Corrupt}}$, $\mathcal{O}^{\mathsf{PSA}}_{\mathsf{Encrypt}}$ and $\mathcal{O}^{\mathsf{PSA}}_{\mathsf{AO}}$.

We consider in PSA, as in PUDA, that there are $n$ users $U_i$ and that each of these users possesses a secret encryption key $\mathsf{ek}_i$. In the following, we show how adversary $\mathcal{B}$ simulates the aggregator obliviousness game presented in Algorithms 1 and 2 to aggregator $\mathcal{A}$, and how it thereby breaks the aggregator obliviousness of PSA.

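Learning phase: In the learning phase, adversary $\mathcal{B}$ proceeds as follows. Whenever $\mathcal{A}$ calls oracle $\mathcal{O}_{\mathsf{Setup}}$ with a security parameter $\kappa$, $\mathcal{B}$ queries oracle $\mathcal{O}^{\mathsf{PSA}}_{\mathsf{Setup}}$ with the same security parameter. Oracle $\mathcal{O}^{\mathsf{PSA}}_{\mathsf{Setup}}$ in turn outputs the public parameters, which are composed of a hash function $H : \{0,1\}^* \to \mathbb{G}_1$ and a generator $g_1$ of the group $\mathbb{G}_1$ of safe prime order $p$, together with the aggregator’s secret key $\mathsf{SK}_A = -\sum_{i=1}^{n} \mathsf{ek}_i$. $\mathcal{B}$ then selects the parameters of a bilinear pairing $(e, g_1, g_2, \mathbb{G}_1, \mathbb{G}_2, \mathbb{G}_T)$. $\mathcal{B}$ chooses uniformly at random $a, \{r_i\}_{U_i \in \mathbb{U}} \in \mathbb{Z}_p$ and defines the verification key $\mathsf{VK}$ as follows:

$$\mathsf{VK} = \Big(g_2^{-a\,\mathsf{SK}_A + \sum_{i=1}^{n} r_i},\, g_2^{a}\Big) = \Big(g_2^{a \sum_{i=1}^{n} \mathsf{ek}_i + \sum_{i=1}^{n} r_i},\, g_2^{a}\Big) = \Big(g_2^{\sum_{i=1}^{n} (a\,\mathsf{ek}_i + r_i)},\, g_2^{a}\Big)$$

This entails that $\mathsf{tk}_i$ is defined as $\mathsf{tk}_i = a\,\mathsf{ek}_i + r_i$. Finally $\mathcal{B}$ forwards to $\mathcal{A}$ the public parameters $\mathcal{P} = (H, p, g_1, g_2, \mathbb{G}_1, \mathbb{G}_2, \mathbb{G}_T)$, the verification key $\mathsf{VK} = (g_2^{\sum_{i=1}^{n} \mathsf{tk}_i}, g_2^{a})$ and the secret key of the Aggregator $\mathsf{SK}_A$.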


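Whenever $\mathcal{A}$ calls oracle $\mathcal{O}_{\mathsf{Corrupt}}$ with a user’s identifier $\mathsf{uid}_i$, $\mathcal{B}$ relays the query $\mathsf{uid}_i$ to $\mathcal{O}^{\mathsf{PSA}}_{\mathsf{Corrupt}}$ of the PSA scheme, which in turn outputs the secret encryption key $\mathsf{ek}_i$ of user $U_i$. $\mathcal{B}$ then returns the secret key $\mathsf{SK}_i = (\mathsf{ek}_i, \mathsf{tk}_i) = (\mathsf{ek}_i, a\,\mathsf{ek}_i + r_i)$.

Whenever $\mathcal{A}$ calls oracle $\mathcal{O}_{\mathsf{EncTag}}$ with query $(t, \mathsf{uid}_i, x_{i,t})$, $\mathcal{B}$ forwards the query to the $\mathcal{O}^{\mathsf{PSA}}_{\mathsf{Encrypt}}$ oracle, which returns the appropriate ciphertext $c_{i,t} = H(t)^{\mathsf{ek}_i} g_1^{x_{i,t}}$. $\mathcal{B}$ then computes the tag associated with ciphertext $c_{i,t}$ as $\sigma_{i,t} = (c_{i,t})^{a} H(t)^{r_i} = H(t)^{a\,\mathsf{ek}_i + r_i} g_1^{a x_{i,t}} = H(t)^{\mathsf{tk}_i} g_1^{a x_{i,t}}$ and transmits ciphertext $c_{i,t}$ and tag $\sigma_{i,t}$ to $\mathcal{A}$.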

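Challenge phase: In the challenge phase, $\mathcal{A}$ chooses a set of users $\mathbb{S}^*$ that have not been corrupted during the learning phase and a time interval $t^*$ for which $\mathcal{A}$ did not make a query to oracle $\mathcal{O}_{\mathsf{EncTag}}$. $\mathcal{A}$ then submits two time-series $\mathcal{X}^*_0 = (U_i, t^*, x^0_{i,t^*})_{U_i \in \mathbb{S}^*}$ and $\mathcal{X}^*_1 = (U_i, t^*, x^1_{i,t^*})_{U_i \in \mathbb{S}^*}$ to $\mathcal{O}_{\mathsf{AO}}$, such that $\sum x^0_{i,t^*} = \sum x^1_{i,t^*}$. $\mathcal{B}$ simulates this oracle as follows:

It forwards the series $\mathcal{X}^*_0$ and $\mathcal{X}^*_1$ to $\mathcal{O}^{\mathsf{PSA}}_{\mathsf{AO}}$, which chooses uniformly at random a bit $b \xleftarrow{\$} \{0,1\}$ and returns to $\mathcal{B}$ the ciphertexts $\{c^b_{i,t^*}\}_{U_i \in \mathbb{S}^*}$ encrypting time-series $\mathcal{X}^*_b$. Next, $\mathcal{B}$ constructs for all $U_i$ in $\mathbb{S}^*$ the tag $\sigma^b_{i,t^*}$ corresponding to ciphertext $c^b_{i,t^*}$ by computing:

$$\sigma^b_{i,t^*} = (c^b_{i,t^*})^{a} H(t^*)^{r_i} = \big(H(t^*)^{\mathsf{ek}_i} g_1^{x^b_{i,t^*}}\big)^{a} H(t^*)^{r_i} = H(t^*)^{a\,\mathsf{ek}_i + r_i}\, g_1^{a x^b_{i,t^*}} = H(t^*)^{\mathsf{tk}_i}\, g_1^{a x^b_{i,t^*}}$$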

Note that $\sigma^b_{i,t^*}$ corresponds to a correctly computed tag for input $x^b_{i,t^*}$. Finally, $\mathcal{B}$ forwards $\{(c^b_{i,t^*}, \sigma^b_{i,t^*})\}_{U_i \in \mathbb{S}^*}$ to $\mathcal{A}$. At this point, the simulated view of aggregator $\mathcal{A}$ is computationally indistinguishable from its view in an actual aggregator obliviousness game as defined in Algorithms 1 and 2. This leads to correct verification of the sum computed by $\mathcal{A}$, more precisely:

$$
\begin{aligned}
e\Big(\prod_{i \in \mathbb{S}^*} \sigma^b_{i,t^*},\, g_2\Big) &= e\Big(\prod_{i=1}^{n} H(t^*)^{\mathsf{tk}_i}\, g_1^{a x^b_{i,t^*}},\, g_2\Big) \\
&= e\Big(H(t^*),\, g_2^{a \sum_{i=1}^{n} \mathsf{ek}_i + \sum_{i=1}^{n} r_i}\Big)\, e\Big(g_1^{\sum_{i=1}^{n} x^b_{i,t^*}},\, g_2^{a}\Big) \\
&= e(H(t^*), \mathsf{vk}_1)\, e\Big(g_1^{\sum_{i=1}^{n} x^b_{i,t^*}},\, \mathsf{vk}_2\Big)
\end{aligned}
$$

It follows that if aggregator $\mathcal{A}$ is able to output a correct guess $b^*$ for the bit $b$ with a non-negligible advantage $\epsilon$ (i.e. is able to break the aggregator obliviousness of our scheme), then $\mathcal{B}$ will break the aggregator obliviousness of the PSA scheme with the same non-negligible advantage $\epsilon$ by outputting the guess $b^*$.

Since the PSA scheme ensures aggregator obliviousness under the DDH assumption in $\mathbb{G}_1$, we can conclude that our scheme also ensures aggregator obliviousness:

$\Pr[\mathcal{A}^{AO}] \le \frac{1}{2} + \epsilon(\kappa)$ as long as DDH holds in $\mathbb{G}_1$.
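The key step of the reduction above is that $\mathcal{B}$ can derive a well-formed tag from a PSA ciphertext without knowing $\mathsf{ek}_i$. A minimal numeric check of that identity is sketched below, under the assumption (introduced here, not in the paper) that group elements are modelled by their discrete logarithms modulo a toy prime `P`; the function names are hypothetical and the model carries no security meaning.

```python
import secrets

P = 2**61 - 1   # toy prime order; exponents stand in for elements of G1


def rand_exp() -> int:
    return secrets.randbelow(P - 1) + 1


def psa_ciphertext(h: int, ek: int, x: int) -> int:
    """log of c = H(t)^{ek} * g1^{x}, where h = log_{g1} H(t)."""
    return (h * ek + x) % P


def simulated_tag(c: int, a: int, h: int, r: int) -> int:
    """log of c^{a} * H(t)^{r}, computed by B from the ciphertext alone."""
    return (a * c + h * r) % P


def honest_tag(h: int, tk: int, a: int, x: int) -> int:
    """log of H(t)^{tk} * (g1^{a})^{x}, as an honest user would compute it."""
    return (h * tk + a * x) % P
```

With `tk = (a * ek + r) % P`, the values returned by `simulated_tag` and `honest_tag` coincide for any input `x`, which is exactly the equality $(c_{i,t})^{a} H(t)^{r_i} = H(t)^{\mathsf{tk}_i} g_1^{a x_{i,t}}$ used in the proof.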


5.2 Aggregate Unforgeability

We first introduce a new assumption that is used in the security analysis of our PUDA instantiation. Our new assumption, hereafter named LEOM, is a variant of the LRSW assumption [14], which is proven secure in the generic model [16] and is used for the construction of the CL signatures [5]. W.l.o.g. we assume a set $I$ of size $n$ and an index $t$. The $\mathcal{O}_{\mathsf{LEOM}}$ oracle chooses $\{\gamma_i\}_{i=1}^{n}$, $\forall i \in I$, and $\delta \in \mathbb{Z}_p$ uniformly at random, and keeps them secret. It also gives the public key $(g_2^{\sum_{i=1}^{n} \gamma_i}, g_2^{\delta})$ to the adversary and chooses $\alpha \in \mathbb{G}_1$ at random. The adversary makes bulk queries $(i, t, \{x_{i,t}\}_{i=1}^{n})$, $\forall i \in I$, and the $\mathcal{O}_{\mathsf{LEOM}}$ oracle chooses $\beta_t \in \mathbb{Z}_p$ uniformly at random and replies with $\{(\alpha, \beta_t, \beta_t^{\gamma_i} \alpha^{\delta x_{i,t}})\}_{i=1}^{n}$ for each different $t$. $\mathcal{O}_{\mathsf{LEOM}}$ aborts if it receives a bulk query for a $t$ for which there is $i' \in I$ with $i = i'$ and $x_{i,t} \neq x'_{i,t}$. In the end the adversary succeeds if it outputs a tuple $(t, z, \alpha, \beta_t, \beta_t^{\sum_{i=1}^{n} \gamma_i} \alpha^{\delta z})$ for a $t$ in which $\sum_{i=1}^{n} x_{i,t} \neq z$.

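Theorem 2. (LEOM Assumption) Let $\mathcal{G}$ be an algorithm that on input the security parameter $\kappa$ outputs the parameters of a bilinear group $G = (e, \mathbb{G}_1, \mathbb{G}_2, g_1, g_2, p)$. Define $\Delta = g_2^{\delta}$, $\Gamma = g_2^{\sum_{i=1}^{n} \gamma_i} \in \mathbb{G}_2$ for $\delta, \gamma_i \in \mathbb{Z}_p$, $\forall i \in I$. Consider an oracle $\mathcal{O}_{\mathsf{LEOM}}$ that on input a set of queries $(i, t, \{x_{i,t}\}_{i=1}^{n})$ responds with $(\alpha, \beta_t, \beta_t^{\gamma_i} \alpha^{x_{i,t} \delta})$ for uniformly random elements $\alpha \in \mathbb{G}_1$, $\beta_t \in \mathbb{Z}_p$.

Then for all probabilistic polynomial time adversaries $\mathcal{A}$ the probability:

$$
\Pr\Big[G \leftarrow \mathcal{G}(1^\kappa);\ \delta, \gamma_i \in \mathbb{Z}_p;\ \big(\Delta = g_2^{\delta},\ \Gamma = g_2^{\sum_{i=1}^{n} \gamma_i}\big);\ (t, z, a, b, c) \leftarrow \mathcal{A}^{\mathcal{O}_{\mathsf{LEOM}}(i, t, \{x_{i,t}\}_{i=1}^{n})} : \Big(z \neq \sum_{i=1}^{n} x_{i,t},\ t\Big) \wedge a = \alpha \wedge b = \beta_t \wedge c = \beta_t^{\sum_{i=1}^{n} \gamma_i} \alpha^{z\delta}\Big] \le \epsilon_2(\kappa)
$$

Due to space limitations, the security evidence for the LEOM assumption is deferred to the Appendix.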

We show in our analysis that a Type I Forgery implies a break of the BCDH assumption, and next that a Type II Forgery implies a break of the LEOM assumption.

Theorem 3. Our scheme achieves aggregate unforgeability against a Type I Forgery under the BCDH assumption in the random oracle model.

Proof. We show how to build an attacker $\mathcal{B}$ that solves BCDH in $(\mathbb{G}_1, \mathbb{G}_2, \mathbb{G}_T)$.

Let $g_1$ and $g_2$ be two generators of $\mathbb{G}_1$ and $\mathbb{G}_2$ respectively. $\mathcal{B}$ receives the challenge $(g_1, g_2, g_1^{a}, g_1^{b}, g_1^{c}, g_2^{a}, g_2^{b})$ from the BCDH oracle $\mathcal{O}_{\mathsf{BCDH}}$ and is asked to output $e(g_1, g_2)^{abc} \in \mathbb{G}_T$. $\mathcal{B}$ simulates the interaction with $\mathcal{A}$ in two phases (Setup, Learning) as follows:

Setup:

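– To simulate the $\mathcal{O}^{\mathcal{A}}_{\mathsf{Setup}}$ oracle, $\mathcal{B}$ selects uniformly at random $2n$ keys $\{k_i\}_{i=1}^{n}, \{y_i\}_{i=1}^{n} \in \mathbb{Z}_p$ and outputs the public parameters $\mathcal{P} = (\kappa, p, g_1, g_2, \mathbb{G}_1, \mathbb{G}_2)$, the verification key $\mathsf{VK} = (\mathsf{vk}_1, \mathsf{vk}_2) = (g_2^{b \sum_{i=1}^{n} k_i}, g_2^{a})$ and the secret key of the Aggregator $\mathsf{SK}_A = -\sum_{i=1}^{n} y_i$.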


Learning phase

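– $\mathcal{A}$ is allowed to query the random oracle $H$ for any time interval. $\mathcal{B}$ constructs an $H$-list and responds to $\mathcal{A}$’s query as follows:

1. If query $t$ already appears in an $H$-tuple $\langle t : r_t, \mathsf{coin}(t), H(t) \rangle$ of the $H$-list, it responds to $\mathcal{A}$ with $H(t)$.

2. Otherwise it selects a random number $r_t \in \mathbb{Z}_p$ and flips a random coin $\mathsf{coin}(t) \xleftarrow{\$} \{0,1\}$. With probability $p$, $\mathsf{coin}(t) = 0$ and $\mathcal{B}$ answers with $H(t) = g_1^{r_t}$. Otherwise, if $\mathsf{coin}(t) = 1$, $\mathcal{B}$ responds with $H(t) = g_1^{c r_t}$. $\mathcal{B}$ then updates the $H$-list with the new $H$-tuple $\langle t : r_t, \mathsf{coin}(t), H(t) \rangle$.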

– WheneverAsubmits a query (t,uidi, xi,t) to theO^A_EncTag,Bconstructs aT−list and responds as follows:

1. If at time intervaltAhas never queried before theO^A_EncTagoracle then:

(a) Binitializes variableΣ_t= 0.

(b) B calls the simulated random oracle, receives the result for H(t) and ap- pends the tupleH-tupleht:r_t,coin(t), H(t)ito theH−list.

(c) Ifcoin(t) = 1thenBstops the simulation.

(d) Otherwise it chooses the secret tag key ki wherei = uidito be used as secret tag key from the set of{ki}keys, chosen byBin theSetupphase.

(e) Bsends toAthe tagσi,t=g₁^r^t^bkⁱgâx₁ î,t=H(t)^bkⁱg₁âxî,t, which is a valid tag for the valuexi,t. Notice thatBcan correctly compute the tag without knowingaandbfrom theBCDHproblem parametersgâ₁, g₁^b.

(f) Bchooses also a secret encryption keyyi ∈ {yi}ⁿ_i=1 ∈ Zp and computes the ciphertext asci,t=H(t)^yⁱg₁^x^i,t. The simulation is correct sinceAcan check that the sumPn

i=1xi,t corresponds to the ciphertexts given byB with its decryption key SKA = −Pn

i=1yi, considering the attacker has made distinct encryption queries for all thenusers in the scheme at a time intervalt.

(g) B sets Σ_t = Σ_t + x_i,t and updates the T−list with the tuple:

ht,uid_i, x_i,t, σ_i,ti

2. Else ifT−listcontainsi⁰ =uid_iandx_i,t =x⁰_i,tthenBfetches the corre- spondingσ_i,tfrom the list and forwards it toA.

3. Else ifT−listcontainsi⁰ =uidiandxi,t 6=x⁰_i,tthenBaborts.

4. Otherwise ($0 < cnt_t < n$), $B$ looks in the H-list for the tuple indexed by $t$ in order to get $\langle t : r_t, \mathrm{coin}(t), H(t)\rangle$. If the tuple does not exist then $B$ tosses a random coin, and if $\mathrm{coin}(t) = 1$ then $B$ aborts. If $\mathrm{coin}(t) = 0$ then $B$ computes the tag exactly as in steps 1(d) to 1(g): it chooses a key $k_i$, where $i = uid_i$, from the selected keys $\{k_i\}$. It constructs the tag as $\sigma_{i,t} = g_1^{r_t b k_i} g_1^{a x_{i,t}} = H(t)^{b k_i} g_1^{a x_{i,t}}$ and the ciphertext as $c_{i,t} = H(t)^{y_i} g_1^{x_{i,t}}$. Finally $B$ sets $\Sigma_t = \Sigma_t + x_{i,t}$ and updates the T-list with the tuple $\langle t, uid_i, x_{i,t}, \sigma_{i,t}\rangle$.
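The answers of the simulated $O^A_{EncTag}$ oracle for $\mathrm{coin}(t) = 0$ can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the group is a tiny, insecure order-`Q` subgroup of $\mathbb{Z}_P^*$ standing in for $\mathbb{G}_1$, and the function names `simulate_enc_tag` and `aggregate_decrypt` are hypothetical. It shows that the tag needs only $g_1^a$ and $g_1^b$, and that $SK_A = -\sum y_i$ strips the masks from the product of ciphertexts.

```python
# Toy parameters (assumption, insecure): order-Q subgroup of Z_P^*, generator G1.
P, Q, G1 = 2039, 1019, 4


def simulate_enc_tag(g1_a, g1_b, r_t, k_i, y_i, x):
    """Answer one EncTag query when coin(t) = 0, i.e. H(t) = g1^{r_t}."""
    h_t = pow(G1, r_t, P)
    # H(t)^{b k_i} = (g1^b)^{r_t k_i}: the exponent b itself is never needed.
    tag = pow(g1_b, (r_t * k_i) % Q, P) * pow(g1_a, x, P) % P
    # c_{i,t} = H(t)^{y_i} * g1^{x}
    cipher = pow(h_t, y_i, P) * pow(G1, x, P) % P
    return tag, cipher


def aggregate_decrypt(ciphers, h_t, sk_a, max_sum):
    """Multiply ciphertexts, remove the masks with SK_A, brute-force the sum."""
    prod = 1
    for c in ciphers:
        prod = prod * c % P
    unmasked = prod * pow(h_t, sk_a % Q, P) % P   # equals g1^{sum of x}
    for s in range(max_sum + 1):                  # small-range discrete log
        if pow(G1, s, P) == unmasked:
            return s
    return None
```

The brute-force search in `aggregate_decrypt` is only practical because the toy sums are small; it stands in for whatever bounded discrete-log step the Aggregator uses.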

Now, when $B$ receives the forgery $(\mathrm{sum}_{t^*}, \sigma_{t^*})$ at time interval $t = t^*$, it continues if $\mathrm{sum}_{t^*} \neq \Sigma_t$. $B$ first queries the H-tuple for time $t^*$ in order to fetch the appropriate tuple.

– If $\mathrm{coin}(t^*) = 0$ then $B$ aborts.


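– If $\mathrm{coin}(t^*) = 1$ then, since $A$ outputs a valid forged $\sigma_{t^*}$ at $t^*$, the following equation should hold:

$$e(\sigma_{t^*}, g_2) = e(H(t^*), vk_1)\, e(g_1^{\mathrm{sum}_{t^*}}, vk_2)$$

which is true when $A$ makes $n$ queries for time interval $t^*$ for distinct users to the $O^A_{EncTag}$ oracle during the Learning phase. As such $\sigma_{t^*} = g_1^{c r_t b \sum k_i}\, g_1^{a\,\mathrm{sum}_{t^*}}$. Finally $B$ outputs:

$$
\begin{aligned}
e\Big(\Big(\frac{\sigma_{t^*}}{g_1^{a\,\mathrm{sum}_{t^*}}}\Big)^{\frac{1}{r_t \sum k_i}},\, g_2^{a}\Big)
&= e\Big(\Big(\frac{g_1^{c r_t b \sum k_i}\, g_1^{a\,\mathrm{sum}_{t^*}}}{g_1^{a\,\mathrm{sum}_{t^*}}}\Big)^{\frac{1}{r_t \sum k_i}},\, g_2^{a}\Big) \\
&= e\Big(\big(g_1^{c r_t b \sum k_i}\big)^{\frac{1}{r_t \sum k_i}},\, g_2^{a}\Big) = e(g_1^{bc}, g_2^{a}) = e(g_1, g_2)^{abc}
\end{aligned}
$$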
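The algebra of this extraction step can be checked numerically. The Python sketch below is a toy model and not a real pairing: every group element is represented by its discrete logarithm modulo a prime `Q`, so that the pairing $e(g_1^u, g_2^v) = e(g_1, g_2)^{uv}$ becomes multiplication of exponents. The names `pair` and `extract_bcdh` and the choice of `Q` are assumptions made for illustration. In this model nothing is hidden, so it only confirms that the exponents cancel as claimed.

```python
Q = 2**61 - 1  # prime order of the toy groups (assumption)


def pair(u, v):
    """Toy pairing on discrete logs: e(g1^u, g2^v) = gT^{u*v}."""
    return u * v % Q


def extract_bcdh(sigma_star, a_times_sum, r_t, k_sum, g2_a):
    """Return the exponent of e(g1, g2)^{abc} from a forged aggregate tag.

    sigma_star  : log of the forged tag, c*r_t*b*k_sum + a*sum_star
    a_times_sum : log of (g1^a)^{sum_star}, computable from g1^a alone
    g2_a        : log of g2^a
    """
    stripped = (sigma_star - a_times_sum) % Q    # sigma* / g1^{a sum*}
    inv = pow(r_t * k_sum % Q, -1, Q)            # 1 / (r_t * sum k_i), Python 3.8+
    return pair(stripped * inv % Q, g2_a)        # e(g1^{bc}, g2^a)
```

The modular inverse requires $r_t \sum k_i \not\equiv 0 \pmod{Q}$, which mirrors the implicit requirement in the proof that this exponent be invertible.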

Let $A^{AU1}$ be the event that $A$ successfully forges a Type I forgery $\sigma_t$ for our PUDA protocol, which happens with some non-negligible probability $\epsilon'$. Then $\Pr[B^{BCDH}] = \Pr[\mathrm{event}_0]\,\Pr[\mathrm{event}_1]\,\Pr[A^{AU1}] = p(1-p)^{q_H-1}\epsilon'$, for $q_H$ random oracle queries with the probability $\Pr[\mathrm{coin}(t) = 0] = p$. This contradicts the assumed hardness of the BCDH problem, and therefore $\Pr[A^{AU1}] \le \epsilon_1$, where $\epsilon_1$ is a negligible function.

Theorem 4. Our scheme guarantees aggregate unforgeability against a Type II Forgery under the LEOM assumption in the random oracle model.

Proof. (Sketch) The $O^A_{EncTag}$ oracle behaves equivalently to the oracle in the LEOM assumption. $B$ chooses secret encryption keys $\{ek_i\}_{i=1}^{n}$ and sends to $A$ the secret decryption key $SK_A = -\sum_{i=1}^{n} ek_i$. $B$ also receives the public key $(vk_1 = g_2^{\sum_{i=1}^{n}\gamma_i},\ vk_2 = g_2^{\delta})$ from the $O_{LEOM}$ oracle and forwards it to $A$ along with the public parameters $P = (\kappa, p, g_1 = \alpha, g_2, \mathbb{G}_1, \mathbb{G}_2)$. For a random oracle query $H(t)$ the simulator $B$ queries $O_{LEOM}$ with input $(i \in I, t, x_{i,t} \xleftarrow{\$} \mathbb{Z}_p)$, which replies with $(a = \alpha \wedge b = \beta_t \wedge c = \beta_t^{\gamma_i}\alpha^{x_{i,t}\delta})$. Finally $B$ forwards $H(t) = \beta_t$ to $A$. For queries $(i = uid, t, x_{i,t})$ to the $O^A_{EncTag}$ oracle the simulator $B$ returns $\sigma_{i,t} = \beta_t^{\gamma_i}\alpha^{\delta x_{i,t}}$ from the $O_{LEOM}$ oracle as a tag, and constructs the ciphertext as $c_{i,t} = \beta_t^{ek_i} g_1^{x_{i,t}}$. $A$ is able to correctly verify the sum; more precisely:

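$$
\begin{aligned}
e\Big(\prod_{i=1}^{n}\sigma_{i,t},\, g_2\Big)
&= e\Big(\prod_{i=1}^{n}\beta_t^{\gamma_i}\alpha^{\delta x_{i,t}},\, g_2\Big)
= e\Big(\beta_t^{\sum_{i=1}^{n}\gamma_i}\,\alpha^{\delta\sum_{i=1}^{n}x_{i,t}},\, g_2\Big) \\
&= e\Big(\beta_t,\, g_2^{\sum_{i=1}^{n}\gamma_i}\Big)\, e\Big(\alpha^{\sum_{i=1}^{n}x_{i,t}},\, g_2^{\delta}\Big)
= e(\beta_t, vk_1)\, e\Big(\alpha^{\sum_{i=1}^{n}x_{i,t}},\, vk_2\Big)
\end{aligned}
$$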

Therefore, from the point of view of $A$, the tags $\sigma_{i,t} = \beta_t^{\gamma_i}\alpha^{\delta x_{i,t}}$ correspond to well-formed verifiable tags. Notice that if $A$ outputs a Type II Forgery with non-negligible probability, then $B$ breaks the LEOM assumption with non-negligible probability as well. This leads to a contradiction under the LEOM assumption and accordingly $\Pr[A^{AU2}] \le \epsilon_2$ for a negligible function $\epsilon_2$. We conclude that our scheme guarantees aggregate unforgeability for a Type II Forgery under the LEOM assumption in the random oracle model.


Participant      Computation                          Communication
User             2 EXP + 1 MUL                        2·l
Aggregator       (n−1) MUL                            2·l
Data Analyzer    3 PAIR + 1 EXP + 1 MUL + 1 HASH      -

Table 1: Performance of tag computation, proof construction and verification operations. l denotes the bit-size of the prime number p.

To conclude the analysis, the success probability for the aggregate unforgeability game, $\Pr[A^{AU}]$, is taken over the union of the success probabilities for the two types of forgeries. As such

$\Pr[A^{AU}] = \Pr[A^{AU1}] + \Pr[A^{AU2}] \le \epsilon_1(\kappa) + \epsilon_2(\kappa)$, where $\epsilon_1$ and $\epsilon_2$ are negligible functions.

5.3 Performance Evaluation

In this section we analyze the extra overhead of ensuring the aggregate unforgeability property in our PUDA instantiation. First, we consider a theoretical evaluation of the mathematical operations that each participant of the protocol (user, Aggregator or Data Analyzer) has to perform on the verifiability transcripts: the computation of the tag by each user, the construction of the proof by the Aggregator and the verification of the proof by the Data Analyzer. We also present an experimental evaluation that shows the practicality of our scheme.

To allow the Data Analyzer to verify the correctness of computations performed by an untrusted Aggregator, each user selects uniformly at random a secret key $tk_i \in \mathbb{Z}_p$. The key dealer distributes $g_1^{a} \in \mathbb{G}_1$ to each user and publishes $g_2^{a} \in \mathbb{G}_2$, which calls for two exponentiations: one in $\mathbb{G}_1$ and one in $\mathbb{G}_2$. At each time interval $t$ each user computes $\sigma_{i,t} = H(t)^{tk_i}(g_1^{a})^{x_{i,t}} \in \mathbb{G}_1$, which entails two exponentiations and one multiplication in $\mathbb{G}_1$. To compute $\sigma_t = \prod_{i=1}^{n}\sigma_{i,t}$ the Aggregator performs $n-1$ multiplications in $\mathbb{G}_1$. Finally the Data Analyzer verifies by checking the equality $e(\sigma_t, g_2) \stackrel{?}{=} e(H(t), vk_1)\, e(g_1^{\mathrm{sum}_t}, vk_2)$, which asks for three pairing evaluations, one hash in $\mathbb{G}_1$, one exponentiation in $\mathbb{G}_1$ and one multiplication in $\mathbb{G}_T$ (see Table 1). The efficiency of PUDA stems from the fact that verification takes constant time with respect to the number of users. This is of crucial importance since the Data Analyzer may have limited computational power.
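The tag, aggregation and verification steps just described can be traced end to end in a toy model. The Python sketch below is not a real pairing implementation: each group element is represented by its discrete logarithm modulo a prime `Q`, so multiplication of elements becomes addition of exponents and the pairing becomes multiplication of exponents. It also assumes $vk_1 = g_2^{\sum_i tk_i}$ and $vk_2 = g_2^{a}$, which is inferred from the proof of Theorem 4 rather than stated in this section. All function names are hypothetical.

```python
import random

Q = 2**61 - 1  # prime order of the toy groups (assumption)


def pair(u, v):
    """Toy pairing on discrete logs: e(g1^u, g2^v) = gT^{u*v}."""
    return u * v % Q


def make_tag(h_t, tk_i, a, x):
    """sigma_{i,t} = H(t)^{tk_i} * (g1^a)^{x}, as an exponent."""
    return (h_t * tk_i + a * x) % Q


def aggregate(tags):
    """sigma_t = product of the user tags: n-1 multiplications in G1."""
    return sum(tags) % Q


def verify(sigma_t, h_t, vk1, vk2, sum_t):
    """Check e(sigma_t, g2) == e(H(t), vk1) * e(g1^{sum_t}, vk2)."""
    lhs = pair(sigma_t, 1)                               # g2 has log 1
    rhs = (pair(h_t, vk1) + pair(sum_t, vk2)) % Q        # product in GT
    return lhs == rhs


if __name__ == "__main__":
    rng = random.Random(0)
    a, h_t = rng.randrange(1, Q), rng.randrange(1, Q)
    tks = [rng.randrange(1, Q) for _ in range(5)]
    xs = [3, 1, 4, 1, 5]
    sigma_t = aggregate(make_tag(h_t, tk, a, x) for tk, x in zip(tks, xs))
    print(verify(sigma_t, h_t, sum(tks) % Q, a, sum(xs)))      # honest sum
    print(verify(sigma_t, h_t, sum(tks) % Q, a, sum(xs) + 1))  # altered sum
```

The verifier's work in `verify` does not depend on how many tags were aggregated, which is the constant-time property noted above.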

We implemented the verification functionalities of PUDA with the Charm cryptographic framework [1, 2]. For pairing computations, Charm relies on the PBC library [13], which is written in C. All of our benchmarks were executed on an Intel Core i5 CPU M 560 @ 2.67GHz × 4 with 8GB of memory, running Ubuntu 12.04 32-bit. Charm uses three types of asymmetric pairings: MNT159, MNT201 and MNT224, and we ran our benchmarks with all three. The timings for the underlying mathematical group operations are summarized in Table 3. There is a vast difference in the computation time of operations between $\mathbb{G}_1$ and $\mathbb{G}_2$ for all the curves. The reason is that the bit-length of elements in $\mathbb{G}_2$ is much larger than in $\mathbb{G}_1$.


Operation    MNT159     MNT201     MNT224
Tag          1.2 ms     1.8 ms     2.2 ms
Verify       28.3 ms    42.7 ms    53.5 ms

Table 2: Computational cost of PUDA operations with respect to different pairings.

Operation     MNT159       MNT201       MNT224
HASH in G1    0.139 ms     0.346 ms     0.296 ms
HASH in G2    25.667 ms    41.628 ms    48.305 ms
MUL in G1     0.004 ms     0.0006 ms    0.006 ms
MUL in G2     0.040 ms     0.051 ms     0.054 ms
MUL in GT     0.012 ms     0.015 ms     0.016 ms
EXP in G1     0.072 ms     0.092 ms     0.099 ms
EXP in G2     0.615 ms     0.757 ms     0.784 ms
PAIR          7.077 ms     10.674 ms    13.105 ms

Table 3: Average computation overhead of the underlying mathematical group operations for different types of curves.

As shown in Table 2, the computation of the tags $\sigma_{i,t}$ incurs an overhead on the order of milliseconds, which increases gradually with the bit size of the underlying elliptic curve. The Data Analyzer performs pairing evaluations and computations in the target group whose cost is independent of the number of users.
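As a rough cross-check, the operation counts of Table 1 can be combined with the primitive timings of Table 3 to estimate the cost of Tag and Verify. The Python sketch below does only that arithmetic; the dictionary layout and the function name `estimate_ms` are assumptions, and the numbers are copied from Table 3 as printed. The resulting sums come out below the measured values in Table 2 for all three curves; the source does not explain the gap, so the estimates should be read as lower bounds on the group-operation cost only.

```python
# Per-operation timings in milliseconds, copied from Table 3.
PRIMITIVE_MS = {
    "MNT159": {"HASH_G1": 0.139, "MUL_G1": 0.004, "MUL_GT": 0.012,
               "EXP_G1": 0.072, "PAIR": 7.077},
    "MNT201": {"HASH_G1": 0.346, "MUL_G1": 0.0006, "MUL_GT": 0.015,
               "EXP_G1": 0.092, "PAIR": 10.674},
    "MNT224": {"HASH_G1": 0.296, "MUL_G1": 0.006, "MUL_GT": 0.016,
               "EXP_G1": 0.099, "PAIR": 13.105},
}

# Operation counts from Table 1 and the accompanying text.
TAG_OPS = {"EXP_G1": 2, "MUL_G1": 1}
VERIFY_OPS = {"PAIR": 3, "HASH_G1": 1, "EXP_G1": 1, "MUL_GT": 1}


def estimate_ms(ops, curve):
    """Sum count * per-operation time for one curve."""
    costs = PRIMITIVE_MS[curve]
    return sum(count * costs[op] for op, count in ops.items())


if __name__ == "__main__":
    for curve in PRIMITIVE_MS:
        tag = estimate_ms(TAG_OPS, curve)
        ver = estimate_ms(VERIFY_OPS, curve)
        print(f"{curve}: tag ~ {tag:.3f} ms, verify ~ {ver:.3f} ms")
```

In every case the three pairings dominate the verification estimate, consistent with the PAIR row being by far the most expensive primitive used by the Data Analyzer.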

6 Related Work

In [6], the authors proposed a solution based on homomorphic message authenticators to verify the computation of generic functions on outsourced data. Each data input is authenticated with an authentication tag. A composition of the tags is computed by the cloud in order to verify the correctness of the output of a program P. Thanks to the homomorphic properties of the tags, the user can verify the correctness of the program. The main drawback of the solution is that, in order to verify the correctness of the computation, the user has to perform computations that take exactly the same time as the computation of the function f. Backes et al. [3] proposed a generic solution for efficient verification of bounded-degree polynomials in time less than the evaluation of f. The solution is based on closed-form efficient pseudorandom functions (PRFs). Contrary to our solution, neither of these solutions provides individual privacy, and they are not designed for a multi-user scenario.

Catalano et al. [8] employed a nifty technique to allow single users to verify computations on encrypted data. The idea is to re-randomize the ciphertext and sign it with a homomorphic signature. Computations are then performed on both the randomized ciphertext and the original one. However, the untrusted aggregator is not allowed to learn the aggregate value in cleartext, since the protocols are geared for cloud-based scenarios.

In the multi-user setting, Choi et al. [9] proposed a protocol in which multiple users outsource their inputs to an untrusted server along with the definition of a functionality f. The server computes the result in a privacy-preserving manner without learning the result, and the computation is verified by a user that has contributed to the function input. The users are forced to operate in a non-interactive model, whereby they cannot communicate with each other. The underlying machinery entails a novel proxy-based oblivious transfer protocol, which along with a fully homomorphic scheme and garbled circuits allows for verifiability and privacy. However, the need of fully homo-