Aggregation
Iraklis Leontiadis, Kaoutar Elkhiyaoui, Melek Önen, Refik Molva
EURECOM, Sophia Antipolis, France
{firstname.lastname}@eurecom.fr
Abstract. Existing work on data collection and analysis for aggregation focuses mainly on confidentiality: the untrusted Aggregator learns only the aggregation result, and individual data inputs are never divulged. In this paper we extend the existing models with stronger security requirements. Apart from the privacy requirements with respect to the individual inputs, we ask for unforgeability of the aggregate result. We first define the new security requirements of the model. We then instantiate a protocol for private and unforgeable aggregation for multiple independent users. That is, multiple unsynchronized users who own personal sensitive information contribute their values in a secure way without interacting with each other: the Aggregator learns the result of a function without learning individual values, and moreover, it constructs a proof that is forwarded to a verifier and convinces the latter of the correctness of the computation.
Our protocol is provably secure in the random oracle model.
1 Introduction
With the advent of the Big Data era, research on privacy-preserving data collection and analysis is flourishing. Users continuously produce data that becomes valuable whenever an Aggregator is interested in aggregating it. We therefore consider a scenario whereby an Aggregator collects individual data from multiple users who do not interact with each other and executes a function which outputs an aggregate value. This result is then forwarded to the Data Analyzer, who can finally extract useful information about the entire population. Various motivating examples for this generic scenario exist in the real world:
– The analysis of different user profiles and the derivation of statistics can help recommendation engines provide targeted advertisements. In such scenarios a service provider would collect data from each individual user (e.g., online purchases), thus acting as an Aggregator, and compute an on-demand aggregate value upon receiving a request from the advertisement company. The latter, acting as a Data Analyzer, will then infer some statistics in order to send the appropriate advertisements to each category of users.
– Data aggregation is a promising tool in the field of healthcare research. Different types of data sensed by body sensors (e.g., blood pressure) are collected at large scale in data enclaves, which can be considered as Aggregators. Health scientists who act as Data Analyzers are interested in inferring statistical information from these data without having access to each individual input (for privacy and performance reasons). An aggregate value computed over a large population would give very useful information for deriving statistical models, evaluating therapeutic performance or learning the likelihood of upcoming patients' diseases.
Unfortunately, existing solutions only focus on the problem of data confidentiality and consider the Aggregator to be honest-but-curious: the Aggregator is curious to discover the content of each individual data input, but performs the aggregation operation correctly. In this paper we consider a more powerful security model and assume that the Aggregator is untrusted: the Aggregator may provide a bogus aggregate value to the Data Analyzer. In order to protect against such malicious behavior, we propose that along with the aggregate value, the Aggregator provides a proof of the correctness of the computation of the aggregate result.
The underlying idea of our solution is that each user encrypts its data according to the scheme of Shi et al. [15] using its own secret encryption key, and sends the resulting ciphertext to the untrusted Aggregator. Users also homomorphically tag their data using two layers of randomness with two different keys, and they forward the tags to the Aggregator. The latter computes the sum by applying operations on the ciphertexts, and it also computes a proof of the correctness of the result from the tags. The Aggregator finally sends the result and the proof to the Data Analyzer, which verifies the correctness of the computation. We also require that verification work without any communication between the Data Analyzer and the users, and that the result be publicly verifiable. Moreover, similarly to the existing solutions, the proposed protocol assures obliviousness against the Aggregator and the Data Analyzer in the multi-user setting, meaning that neither the Data Analyzer nor the Aggregator learns individual data inputs.
To the best of our knowledge we are the first to define a model for Privacy and Unforgeability for Data Aggregation (PUDA). We also instantiate a PUDA scheme which mainly pursues the following three objectives:
– Multi-user setting where multiple users produce personal sensitive data without interacting with each other.
– Public verifiability of the aggregate value.
– Privacy of individual data for all participants.
2 Problem Statement
We envision a scenario whereby a set of users $U = \{U_i\}_{i=1}^n$ produce sensitive data inputs $x_{i,t}$ at each time interval $t$. These individual data are first encrypted into ciphertexts $c_{i,t}$ and then forwarded to an untrusted Aggregator $A$. Aggregator $A$ aggregates all the received ciphertexts, decrypts the aggregate and forwards the resulting plaintext to a Data Analyzer $DA$ together with a cryptographic proof that assures the correctness of the aggregation operation, which in this paper corresponds to the sum of the users' individual data. An important criterion that we aim to fulfill in this paper is to ensure that Data Analyzer $DA$ verifies the correctness of the Aggregator's output without compromising users' privacy. Namely, at the end of the verification operation, both Aggregator $A$ and Data Analyzer $DA$ learn nothing but the value of the aggregation.
While the homomorphic signatures proposed in [4, 10] seem to answer the verifiability requirement, the authors of those papers only consider scenarios where a single user generates data.
With the aim of assuring both individual users' privacy and unforgeable aggregation, we first come up with a generic model for privacy-preserving and unforgeable aggregation that identifies the algorithms necessary to implement such functionalities and defines the corresponding privacy and security models. Furthermore, we propose a concrete solution which combines an existing privacy-preserving aggregation scheme [15] with an additively homomorphic tag designed for bilinear groups.
Notably, a scheme that allows a malicious Aggregator to compute the sum of users' data in a privacy-preserving manner and to produce a proof of correct aggregation will start by running a setup phase. During setup, each user receives a secret key that will be used to encrypt the user's private input and to generate the corresponding authentication tag; the Aggregator $A$ and the Data Analyzer $DA$, on the other hand, are provided with a secret decryption key and a public verification key, respectively. After the key distribution, each user sends its data encrypted and authenticated to Aggregator $A$, while making sure that the computed ciphertext and the matching authentication tag leak no information about its private input. On receiving users' data, Aggregator $A$ first aggregates the received ciphertexts and decrypts the sum using its decryption key, then uses the received authentication tags to produce a proof that demonstrates the correctness of the decrypted sum. Finally, Data Analyzer $DA$ verifies the correctness of the aggregation using the public verification key.
2.1 PUDA Model
A PUDA scheme consists of the following algorithms:
– Setup($1^\kappa$) → $(P, SK_A, \{SK_i\}_{U_i \in U}, VK)$: A randomized algorithm run by a trusted dealer $KD$, which on input of a security parameter $\kappa$ outputs the public parameters $P$ that will be used by subsequent algorithms, the Aggregator $A$'s secret key $SK_A$, the secret keys $SK_i$ of users $U_i$ and the public verification key $VK$.
– EncTag($t, SK_i, x_{i,t}$) → $(c_{i,t}, \sigma_{i,t})$: A randomized algorithm which on input of time interval $t$, secret key $SK_i$ of user $U_i$ and data $x_{i,t}$, encrypts $x_{i,t}$ to get a ciphertext $c_{i,t}$ and computes a tag $\sigma_{i,t}$ that authenticates $x_{i,t}$.
– Aggregate($SK_A, \{c_{i,t}\}_{U_i \in U}, \{\sigma_{i,t}\}_{U_i \in U}$) → $(sum_t, \sigma_t)$: A deterministic algorithm run by the Aggregator $A$. It takes as inputs Aggregator $A$'s secret key $SK_A$, ciphertexts $\{c_{i,t}\}_{U_i \in U}$ and authentication tags $\{\sigma_{i,t}\}_{U_i \in U}$, and outputs in cleartext the sum $sum_t$ of the values $\{x_{i,t}\}_{U_i \in U}$. Moreover, it computes a proof $\sigma_t$ attesting to the correctness of $sum_t$, using the authentication tags $\{\sigma_{i,t}\}_{U_i \in U}$.
– Verify($VK, sum_t, \sigma_t$) → $\{0, 1\}$: A deterministic algorithm that is executed by the Data Analyzer $DA$. It outputs 1 if Data Analyzer $DA$ is convinced that $sum_t = \sum_{U_i \in U} x_{i,t}$, and 0 otherwise.
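The four algorithms above can be read as an interface. The following is a minimal sketch of that interface in Python; the class name `PUDAScheme`, the method names and the loose `Any` types are assumptions made for illustration and are not part of the model itself.

```python
from abc import ABC, abstractmethod
from typing import Any, Sequence, Tuple


class PUDAScheme(ABC):
    """Hypothetical interface mirroring Setup, EncTag, Aggregate and Verify."""

    @abstractmethod
    def setup(self, kappa: int) -> Tuple[Any, Any, Sequence[Any], Any]:
        """Run by the trusted dealer KD; returns (P, SK_A, [SK_i], VK)."""

    @abstractmethod
    def enc_tag(self, t: int, sk_i: Any, x_it: int) -> Tuple[Any, Any]:
        """Run by user U_i; returns (c_{i,t}, sigma_{i,t})."""

    @abstractmethod
    def aggregate(self, sk_a: Any, ciphertexts: Sequence[Any],
                  tags: Sequence[Any]) -> Tuple[int, Any]:
        """Run by the Aggregator A; returns (sum_t, sigma_t)."""

    @abstractmethod
    def verify(self, vk: Any, sum_t: int, sigma_t: Any) -> bool:
        """Run by the Data Analyzer DA; True iff the proof is accepted."""
```

Note that `verify` takes only the public verification key, the claimed sum and the proof, which reflects the requirement that the Data Analyzer never sees individual ciphertexts or tags.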
2.2 Security Model
In this paper, we only focus on the adversarial behavior of Aggregator $A$. The rationale behind this is that Aggregator $A$ is the only party in the protocol that sees all the messages exchanged during the protocol execution: namely, Aggregator $A$ has access to users' ciphertexts and it is the party that interacts directly with the Data Analyzer. It follows that by ensuring security properties against the Aggregator, one ensures by the same token these security properties against both Data Analyzer $DA$ and external parties.
In accordance with previous work [11, 15], we formalize the property of Aggregator obliviousness, which ensures that at the end of a protocol execution, Aggregator $A$ only learns the sum of users' inputs $x_{i,t}$ and nothing else. Also, we enhance the security definitions of data aggregation with the notion of aggregate unforgeability. As the name implies, aggregate unforgeability guarantees that Aggregator $A$ cannot forge a valid proof $\sigma_t$ for a sum $sum_t$ that was not computed correctly from users' inputs (i.e., it cannot generate a proof for $sum_t \neq \sum x_{i,t}$).
Aggregator Obliviousness. Aggregator obliviousness ensures that when users $U_i$ provide Aggregator $A$ with ciphertexts $c_{i,t}$ and authentication tags $\sigma_{i,t}$, Aggregator $A$ cannot learn any information about individual inputs $x_{i,t}$ other than the sum value $\sum x_{i,t}$. We extend the existing definition of Aggregator obliviousness (cf. [11, 12, 15]) so as to capture the fact that Aggregator $A$ not only has access to ciphertexts $c_{i,t}$, but also has access to the authentication tags $\sigma_{i,t}$ that enable Aggregator $A$ to generate proofs of correct aggregation.
Similarly to the work of [11, 15], we formalize Aggregator obliviousness using an indistinguishability-based game in which Aggregator $A$ accesses the following oracles:
– $\mathcal{O}_{Setup}$: When called by Aggregator $A$, this oracle initializes the system parameters; it then gives the public parameters $P$, the Aggregator's secret key $SK_A$ and the public verification key $VK$ to $A$.
– $\mathcal{O}_{Corrupt}$: When queried by Aggregator $A$ with a user $U_i$'s identifier $uid_i$, this oracle provides Aggregator $A$ with $U_i$'s secret key, denoted $SK_i$.
– $\mathcal{O}_{EncTag}$: When queried with time $t$, user $U_i$'s identifier $uid_i$ and a data point $x_{i,t}$, this oracle outputs the ciphertext $c_{i,t}$ and the authentication tag $\sigma_{i,t}$ of $x_{i,t}$ computed using $U_i$'s secret key $SK_i$.
– $\mathcal{O}_{AO}$: When called with a subset of users $S \subset U$ and with two time-series $(U_i, t, x^0_{i,t})_{U_i \in S}$ and $(U_i, t, x^1_{i,t})_{U_i \in S}$ such that $\sum x^0_{i,t} = \sum x^1_{i,t}$, this oracle flips a random coin $b \in \{0, 1\}$ and returns an encryption of the time-series $(U_i, t, x^b_{i,t})_{U_i \in S}$ (that is, the tuple of ciphertexts $\{c^b_{i,t}\}_{U_i \in S}$) and the corresponding authentication tags $\{\sigma^b_{i,t}\}_{U_i \in S}$.
Aggregator $A$ accesses the aforementioned oracles during a learning phase (cf. Algorithm 1) and a challenge phase (cf. Algorithm 2). In the learning phase, $A$ calls oracle $\mathcal{O}_{Setup}$, which in turn returns the public parameters $P$, the public verification key $VK$ and the Aggregator's secret key $SK_A$. It also interacts with oracle $\mathcal{O}_{Corrupt}$ to learn the secret keys $SK_i$ of users $U_i$, and with oracle $\mathcal{O}_{EncTag}$ to get a set of ciphertexts $c_{i,t}$ and authentication tags $\sigma_{i,t}$.
In the challenge phase, Aggregator $A$ chooses a subset $S^*$ of users that were not corrupted in the learning phase, and a challenge time interval $t^*$ for which it did not make an encryption query. Oracle $\mathcal{O}_{AO}$ then receives two time-series $X^0_{t^*} = (U_i, t^*, x^0_{i,t^*})_{U_i \in S^*}$ and $X^1_{t^*} = (U_i, t^*, x^1_{i,t^*})_{U_i \in S^*}$ from $A$, such that $\sum_{U_i \in S^*} x^0_{i,t^*} = \sum_{U_i \in S^*} x^1_{i,t^*}$. Then oracle $\mathcal{O}_{AO}$ flips a random coin $b \xleftarrow{\$} \{0, 1\}$ and returns to $A$ the ciphertexts $\{c^b_{i,t^*}\}_{U_i \in S^*}$ and the matching authentication tags $\{\sigma^b_{i,t^*}\}_{U_i \in S^*}$.
At the end of the challenge phase, Aggregator $A$ outputs a guess $b^*$ for the bit $b$. We say that Aggregator $A$ succeeds in the Aggregator obliviousness game if its guess $b^*$ equals $b$.
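Algorithm 1: Learning phase of the obliviousness game
    $(P, SK_A, VK) \leftarrow \mathcal{O}_{Setup}(1^\kappa)$;
    // $A$ executes the following a polynomial number of times
    $SK_i \leftarrow \mathcal{O}_{Corrupt}(uid_i)$;
    // $A$ is allowed to call $\mathcal{O}_{EncTag}$ for all users $U_i$
    $(c_{i,t}, \sigma_{i,t}) \leftarrow \mathcal{O}_{EncTag}(t, uid_i, x_{i,t})$;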
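Algorithm 2: Challenge phase of the obliviousness game
    $A \rightarrow t^*, S^*$;
    $A \rightarrow X^0_{t^*}, X^1_{t^*}$;
    $(c^b_{i,t^*}, \sigma^b_{i,t^*})_{U_i \in S^*} \leftarrow \mathcal{O}_{AO}(X^0_{t^*}, X^1_{t^*})$;
    $A \rightarrow b^*$;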
Definition 1 (Aggregator Obliviousness). Let $\Pr[A^{AO}]$ denote the probability that Aggregator $A$ outputs $b^* = b$. Then an aggregation protocol is said to ensure Aggregator obliviousness if for any polynomially bounded Aggregator $A$ the probability $\Pr[A^{AO}] \le \frac{1}{2} + \epsilon(\kappa)$, where $\epsilon$ is a negligible function and $\kappa$ is the security parameter.
Aggregate Unforgeability. We augment the security requirements of data aggregation with the requirement of aggregate unforgeability. More precisely, we assume that Aggregator $A$ is not only interested in compromising the privacy of users participating in the data aggregation protocol, but also interested in tampering with the sum of users' inputs. That is, Aggregator $A$ may sometimes have an incentive to feed Data Analyzer $DA$ erroneous sums. Along these lines, we define aggregate unforgeability as the security feature that ensures that Aggregator $A$ cannot convince Data Analyzer $DA$ to accept a bogus sum, as long as users $U_i$ in the system are honest (i.e., they always submit their correct input and do not collude with the Aggregator $A$).
In compliance with previous work [7, 10] on homomorphic signatures, we formalize aggregate unforgeability via a game in which Aggregator $A$ accesses oracles $\mathcal{O}_{Setup}$ and $\mathcal{O}_{EncTag}$. Furthermore, given that anyone holding the public verification key $VK$ can execute the algorithm Verify, we assume that Aggregator $A$ runs the algorithm Verify by itself during the unforgeability game.
As shown in Algorithm 3, Aggregator $A$ enters the aggregate unforgeability game by querying the oracle $\mathcal{O}_{Setup}$ with a security parameter $\kappa$. Oracle $\mathcal{O}_{Setup}$ accordingly returns public parameters $P$, verification key $VK$ and the secret key $SK_A$ of Aggregator $A$. Moreover, Aggregator $A$ calls oracle $\mathcal{O}_{EncTag}$ with tuples $(t, uid_i, x_{i,t})$ in order to receive the ciphertext $c_{i,t}$ encrypting $x_{i,t}$ and the matching authentication tag $\sigma_{i,t}$, both computed using user $U_i$'s secret key $SK_i$. Note that for each time interval $t$, Aggregator $A$ is allowed to query oracle $\mathcal{O}_{EncTag}$ for user $U_i$ only once. In other words, Aggregator $A$ cannot submit two distinct queries to oracle $\mathcal{O}_{EncTag}$ with the same time interval $t$ and the same user identifier $uid_i$. Without loss of generality, we suppose that for each time interval $t$, Aggregator $A$ invokes oracle $\mathcal{O}_{EncTag}$ for all users $U_i$ in the system.
Algorithm 3: Learning phase of the aggregate unforgeability game
    $(P, SK_A, VK) \leftarrow \mathcal{O}_{Setup}(1^\kappa)$;
    // $A$ executes the following a polynomial number of times
    // $A$ is allowed to call $\mathcal{O}_{EncTag}$ for all users $U_i$
    $(c_{i,t}, \sigma_{i,t}) \leftarrow \mathcal{O}_{EncTag}(t, uid_i, x_{i,t})$;
Algorithm 4: Challenge phase of the aggregate unforgeability game
    $(t^*, sum_{t^*}, \sigma_{t^*}) \leftarrow A$
At the end of the aggregate unforgeability game (see Algorithm 4), Aggregator $A$ outputs a tuple $(t^*, sum_{t^*}, \sigma_{t^*})$. We say that Aggregator $A$ wins the aggregate unforgeability game if one of the following statements holds:
1. Verify($VK, sum_{t^*}, \sigma_{t^*}$) → 1 and Aggregator $A$ never made a query to oracle $\mathcal{O}_{EncTag}$ that comprises time interval $t^*$. In the remainder of this paper, we call this type of forgery a Type I Forgery.
2. Verify($VK, sum_{t^*}, \sigma_{t^*}$) → 1 and Aggregator $A$ has made a query to oracle $\mathcal{O}_{EncTag}$ for time $t^*$, but the sum $sum_{t^*} \neq \sum_{U_i} x_{i,t^*}$. In what follows, we call this type of forgery a Type II Forgery.
Definition 2 (Aggregate Unforgeability). Let $\Pr[A^{AU}]$ denote the probability that Aggregator $A$ wins the aggregate unforgeability game, that is, the probability that Aggregator $A$ outputs a Type I Forgery or a Type II Forgery that will be accepted by algorithm Verify.
An aggregation protocol is said to ensure aggregate unforgeability if for any polynomially bounded adversary $A$, $\Pr[A^{AU}] \le \epsilon(\kappa)$, where $\epsilon$ is a negligible function in the security parameter $\kappa$.
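The winning condition of the aggregate unforgeability game can be sketched as a small classifier. This is an illustrative sketch only: the function name `classify_forgery` and the bookkeeping dictionary `queried` (mapping each time interval to the inputs submitted to the EncTag oracle) are assumptions introduced here, and the Boolean `verifies` stands for the outcome of running Verify on the adversary's output.

```python
from typing import Dict, List, Optional


def classify_forgery(t_star: int, sum_star: int, verifies: bool,
                     queried: Dict[int, List[int]]) -> Optional[str]:
    """Return 'Type I', 'Type II', or None if the output is not a winning forgery."""
    if not verifies:
        return None                       # Verify rejected: no forgery
    if t_star not in queried:
        return "Type I"                   # valid proof for a never-queried interval
    if sum_star != sum(queried[t_star]):
        return "Type II"                  # valid proof for a wrong sum
    return None                           # honest sum with a valid proof


# Inputs 3, 5, 7 were submitted to the EncTag oracle at time interval 1.
log = {1: [3, 5, 7]}
print(classify_forgery(2, 10, True, log))   # Type I
print(classify_forgery(1, 16, True, log))   # Type II
print(classify_forgery(1, 15, True, log))   # None
```

The last call shows that outputting the correct sum together with its valid proof does not count as a win, which is why both forgery types are needed to cover every way of cheating.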
3 Idea of our PUDA protocol
In an extended model with an untrusted Aggregator, it is of utmost importance to design a solution in which the untrusted Aggregator cannot provide bogus results to the Data Analyzer. Such a solution will use a proof system that enables the Data Analyzer to verify the correctness of the computation. Yet verifiability should be achieved without sacrificing privacy. Towards this goal, we propose a protocol that relies on the following techniques:
– A homomorphic encryption algorithm that allows the Aggregator to compute the sum without divulging individual data.
– A homomorphic tag that allows each user to authenticate the data input $x_{i,t}$, in such a way that the Aggregator can use the collected tags to construct a proof that demonstrates to the Data Analyzer $DA$ the correctness of the Aggregator's sum.
Concisely, a set of non-interacting users are connected to personal services and devices that produce personal data. Without any coordination, each user chooses a random tag key $tk_i$ and sends an encoding thereof to the key dealer. After collecting all encoded keys from the users, the key dealer publishes the public verification key $VK$ of this group of users. This verification key is computed as a function of the encodings of the tag keys. Later, the key dealer gives to each user in the system an encryption key $ek_i$ that will be used to compute the user's ciphertexts. Accordingly, the secret key $SK_i$ of each user is defined as the pair of tag key $tk_i$ and encryption key $ek_i$. Finally, the key dealer provides the Aggregator with secret key $SK_A$, computed as the sum of encryption keys $ek_i$, and goes off-line.
Now at each time interval $t$, each user employs its secret key $SK_i$ to compute a ciphertext based on the encryption algorithm of Shi et al. [15] and a homomorphic tag on its sensitive data input. When the Aggregator collects the ciphertexts and the tags from all users, it computes the sum $sum_t$ of users' data and a proof $\sigma_t$ for the sum, and forwards the sum and the proof to the Data Analyzer. At the final step of the protocol, the Data Analyzer uses the verification key $VK$ and the proof $\sigma_t$ to verify the validity of the result $sum_t$. Although the modification seems straightforward, the proof for Type II Forgery turns out to be challenging.
Thanks to the homomorphic encryption algorithm of Shi et al. [15] and the way in which we construct our homomorphic tags, we show that our protocol ensures Aggregator obliviousness. Moreover, we show that the Aggregator cannot forge bogus results.
Finally, we note that the Data Analyzer $DA$ does not keep any state with respect to users' transcripts, be they ciphertexts or tags; it only holds the public verification key, the sum $sum_t$ and the proof $\sigma_t$.
4 PUDA Instantiation
Let $G_1, G_2, G_T$ be cyclic groups of large prime order $p$ and $g_1, g_2$ generators of $G_1, G_2$, respectively. We say that $e : G_1 \times G_2 \to G_T$ is a bilinear map if the following properties are satisfied:
1. Bilinearity: $e(g_1^a, g_2^b) = e(g_1, g_2)^{ab}$, where $(g_1, g_2) \in G_1 \times G_2$ and $a, b \in \mathbb{Z}_p$.
2. Computability: there exists an efficient algorithm that computes $e(g_1^a, g_2^b)$, where $(g_1, g_2) \in G_1 \times G_2$ and $a, b \in \mathbb{Z}_p$.
3. Non-degeneracy: $e(g_1, g_2) \neq 1$.
For encryption and sum computation we employ the discrete-logarithm-based encryption scheme of Shi et al. [15]:
4.1 Shi-Chan-Rieffel-Chow-Song Scheme
– Setup($1^\kappa$): Let $G_1$ be a group of large prime order $p$. A trusted key dealer $KD$ selects a hash function $H : \{0, 1\}^* \to G_1$. Furthermore, $KD$ selects secret encryption keys $ek_i \in \mathbb{Z}_p$ uniformly at random. $KD$ distributes to each user $U_i$ the secret key $ek_i$ and it also sends the secret key $sk_A = -\sum_{i=1}^n ek_i$ to the Aggregator.
– Encrypt($ek_i, x_{i,t}$): Each user $U_i$ encrypts the value $x_{i,t}$ by using its secret encryption key $ek_i$ and outputs the corresponding ciphertext $c_{i,t} = H(t)^{ek_i} g_1^{x_{i,t}} \in G_1$.
– Aggregate($\{c_{i,t}\}_{U_i \in U}, sk_A$): Upon receiving all the ciphertexts $\{c_{i,t}\}_{i=1}^n$, the Aggregator computes:
$$V_t = \Big(\prod_{i=1}^n c_{i,t}\Big) H(t)^{sk_A} = H(t)^{\sum_{i=1}^n ek_i}\, g_1^{\sum_{i=1}^n x_{i,t}}\, H(t)^{-\sum_{i=1}^n ek_i} = g_1^{\sum_{i=1}^n x_{i,t}} \in G_1.$$
Finally $A$ learns the sum $sum_t = \sum_{i=1}^n x_{i,t} \in \mathbb{Z}_p$ by computing the discrete logarithm of $V_t$ to the base $g_1$. The sum computation is correct as long as $\sum_{i=1}^n x_{i,t} < p$.
4.2 PUDA Scheme
In what follows we describe our PUDA protocol:
– Setup($1^\kappa$): $KD$ outputs $(p, g_1, g_2, G_1, G_2, G_T)$ for an efficiently computable bilinear map $e : G_1 \times G_2 \to G_T$, where $g_1$ and $g_2$ are two random generators of the multiplicative groups $G_1$ and $G_2$ respectively and $p$ is a prime number that denotes the order of all the groups $G_1$, $G_2$ and $G_T$. Moreover, a secret key $a$ is selected by $KD$. Each $U_i$ selects a random tag key $tk_i \in \mathbb{Z}_p$ independently and forwards $g_2^{tk_i}$ to $KD$. $KD$ publishes the verification key $VK = (vk_1, vk_2) = (g_2^{\sum_{i=1}^n tk_i}, g_2^a)$ and distributes to each user $U_i \in U$ the secret key $g_1^a \in G_1$ through a secure channel. Thus the secret keys of the scheme are $SK_i = (ek_i, tk_i, g_1^a)$. After publishing the public parameters $P = (H, p, g_1, g_2, G_1, G_2, G_T)$ and the verification key $VK$, $KD$ goes off-line and does not participate further in any protocol phase.
– EncTag($t, SK_i = (ek_i, tk_i, g_1^a), x_{i,t}$): At each time interval $t$, each user $U_i$ encrypts the data value $x_{i,t}$ with its secret encryption key $ek_i$, using the encryption algorithm described in Section 4.1, which results in a ciphertext
$$c_{i,t} = H(t)^{ek_i} g_1^{x_{i,t}} \in G_1.$$
$U_i$ also constructs a tag $\sigma_{i,t} \in G_1$ with its secret tag key $(tk_i, g_1^a)$:
$$\sigma_{i,t} = H(t)^{tk_i} (g_1^a)^{x_{i,t}} \in G_1.$$
Finally $U_i$ sends $(c_{i,t}, \sigma_{i,t})$ to $A$.
– Aggregate($SK_A, \{c_{i,t}\}_{U_i \in U}, \{\sigma_{i,t}\}_{U_i \in U}$): Aggregator $A$ computes the sum $sum_t = \sum_{i=1}^n x_{i,t}$ by using the Aggregate algorithm presented in Section 4.1. Moreover, $A$ aggregates the corresponding tags as follows:
$$\sigma_t = \prod_{i=1}^n \sigma_{i,t} = \prod_{i=1}^n H(t)^{tk_i} (g_1^a)^{x_{i,t}} = H(t)^{\sum tk_i} (g_1^a)^{\sum x_{i,t}}.$$
$A$ finally forwards $sum_t$ and $\sigma_t$ to Data Analyzer $DA$.
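– Verify($VK, sum_t, \sigma_t$): During the verification phase, $DA$ verifies the correctness of the computation with the verification key $VK = (vk_1 = g_2^{\sum tk_i}, vk_2 = g_2^a)$, by checking the following equality:
$$e(\sigma_t, g_2) \stackrel{?}{=} e(H(t), vk_1)\, e(g_1^{sum_t}, vk_2).$$
Verification correctness follows from the properties of bilinear pairings:
$$e(\sigma_t, g_2) = e\Big(\prod_{i=1}^n \sigma_{i,t}, g_2\Big) = e\Big(\prod_{i=1}^n H(t)^{tk_i} g_1^{a x_{i,t}}, g_2\Big) = e\Big(H(t)^{\sum_{i=1}^n tk_i}\, g_1^{a \sum_{i=1}^n x_{i,t}}, g_2\Big)$$
$$= e\Big(H(t)^{\sum_{i=1}^n tk_i}, g_2\Big)\, e\Big(g_1^{a \sum_{i=1}^n x_{i,t}}, g_2\Big) = e\Big(H(t), g_2^{\sum_{i=1}^n tk_i}\Big)\, e\Big(g_1^{\sum_{i=1}^n x_{i,t}}, g_2^a\Big)$$
$$= e\Big(H(t), g_2^{\sum_{i=1}^n tk_i}\Big)\, e(g_1^{sum_t}, g_2^a) = e(H(t), vk_1)\, e(g_1^{sum_t}, vk_2).$$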
5 Analysis
5.1 Aggregator Obliviousness
Theorem 1. The proposed solution achieves aggregator obliviousness in the random oracle model under the decisional Diffie-Hellman (DDH) assumption in $G_1$.
Proof. Assume there is an aggregator $A$ which breaks the obliviousness of the PUDA scheme with a non-negligible advantage $\epsilon$. We build in what follows an adversary $B$ who uses $A$ as a subroutine to break the aggregator obliviousness of the private streaming aggregation (PSA) protocol presented in [15], which is guaranteed under DDH. Without loss of generality we call the oracles that the adversary $B$ has access to in the PSA scheme as follows: $\mathcal{O}^{PSA}_{Setup}$, $\mathcal{O}^{PSA}_{Corrupt}$, $\mathcal{O}^{PSA}_{Encrypt}$, and $\mathcal{O}^{PSA}_{AO}$.
We consider in PSA, as in PUDA, that there are $n$ users $U_i$ and that each of these users possesses a secret encryption key $ek_i$. In the following, we show how an adversary $B$ simulates the aggregator obliviousness game presented in Algorithms 1 and 2 to aggregator $A$ and how it thereby breaks the aggregator obliviousness of PSA.
Learning phase: In the learning phase, adversary $B$ proceeds as follows. Whenever $A$ calls oracle $\mathcal{O}_{Setup}$ with a security parameter $\kappa$, $B$ queries oracle $\mathcal{O}^{PSA}_{Setup}$ with the same security parameter. Oracle $\mathcal{O}^{PSA}_{Setup}$ in turn outputs the public parameters, which are composed of a hash function $H : \{0, 1\}^* \to G_1$ and a generator $g_1$ of the group $G_1$ of safe prime order $p$, together with the aggregator's secret key $SK_A = -\sum_{i=1}^n ek_i$. $B$ then selects the parameters of a bilinear pairing $(e, g_1, g_2, G_1, G_2, G_T)$. $B$ chooses $a$ and $\{r_i\}_{U_i \in U}$ uniformly at random from $\mathbb{Z}_p$ and defines the verification key $VK$ as follows:
$$VK = \Big(g_2^{-a\,SK_A + \sum_{i=1}^n r_i},\ g_2^a\Big) = \Big(g_2^{a \sum_{i=1}^n ek_i + \sum_{i=1}^n r_i},\ g_2^a\Big) = \Big(g_2^{\sum_{i=1}^n (a\,ek_i + r_i)},\ g_2^a\Big).$$
This entails that $tk_i$ is defined as $a\,ek_i + r_i$. Finally $B$ forwards to $A$ the public parameters $P = (H, p, g_1, g_2, G_1, G_2, G_T)$, the verification key $VK = (g_2^{\sum_{i=1}^n tk_i}, g_2^a)$ and the secret key $SK_A$ of the Aggregator.
Whenever $A$ calls oracle $\mathcal{O}_{Corrupt}$ with a user's identifier $uid_i$, $B$ relays the query $uid_i$ to $\mathcal{O}^{PSA}_{Corrupt}$ of the PSA scheme, which in turn outputs the secret encryption key $ek_i$ of user $U_i$. $B$ then returns the secret key $SK_i = (ek_i, tk_i) = (ek_i, a\,ek_i + r_i)$.
Whenever $A$ calls oracle $\mathcal{O}_{EncTag}$ with query $(t, uid_i, x_{i,t})$, $B$ forwards the query to the $\mathcal{O}^{PSA}_{Encrypt}$ oracle, which returns the appropriate ciphertext $c_{i,t} = H(t)^{ek_i} g_1^{x_{i,t}}$. $B$ then computes the tag associated with ciphertext $c_{i,t}$ as $\sigma_{i,t} = (c_{i,t})^a H(t)^{r_i} = H(t)^{a\,ek_i + r_i} g_1^{a x_{i,t}} = H(t)^{tk_i} g_1^{a x_{i,t}}$ and transmits ciphertext $c_{i,t}$ and tag $\sigma_{i,t}$ to $A$.
Challenge phase: In the challenge phase, $A$ chooses a set of users $S^*$ that have not been corrupted during the learning phase and a time interval $t^*$ for which $A$ did not make a query to oracle $\mathcal{O}_{EncTag}$. $A$ then submits two time-series $X^0_{t^*} = (U_i, t^*, x^0_{i,t^*})_{U_i \in S^*}$ and $X^1_{t^*} = (U_i, t^*, x^1_{i,t^*})_{U_i \in S^*}$ to $\mathcal{O}_{AO}$, such that $\sum x^0_{i,t^*} = \sum x^1_{i,t^*}$. $B$ simulates this oracle as follows:
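It forwards the series $X^0_{t^*}$ and $X^1_{t^*}$ to $\mathcal{O}^{PSA}_{AO}$, which chooses a bit $b \xleftarrow{\$} \{0, 1\}$ uniformly at random and returns to $B$ the ciphertexts $\{c^b_{i,t^*}\}_{U_i \in S^*}$ encrypting time-series $X^b_{t^*}$. Next, $B$ constructs for all $U_i$ in $S^*$ the tag $\sigma^b_{i,t^*}$ corresponding to ciphertext $c^b_{i,t^*}$ by computing:
$$\sigma^b_{i,t^*} = (c^b_{i,t^*})^a H(t^*)^{r_i} = \Big(H(t^*)^{ek_i} g_1^{x^b_{i,t^*}}\Big)^a H(t^*)^{r_i} = H(t^*)^{a\,ek_i + r_i}\, g_1^{a x^b_{i,t^*}} = H(t^*)^{tk_i}\, g_1^{a x^b_{i,t^*}}.$$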
Note that $\sigma^b_{i,t^*}$ corresponds to a correctly computed tag for input $x^b_{i,t^*}$. Finally, $B$ forwards $\{(c^b_{i,t^*}, \sigma^b_{i,t^*})\}_{U_i \in S^*}$ to $A$. At this point, the simulated view of aggregator $A$ is computationally indistinguishable from its view in an actual aggregator obliviousness game as defined in Algorithms 1 and 2. This leads to correct verification of the sum computed by $A$, more precisely:
$$e\Big(\prod_{i \in S^*} \sigma^b_{i,t^*}, g_2\Big) = e\Big(\prod_{i=1}^n H(t^*)^{tk_i} g_1^{a x^b_{i,t^*}}, g_2\Big)$$
$$= e\Big(H(t^*), g_2^{a \sum_{i=1}^n ek_i + \sum_{i=1}^n r_i}\Big)\, e\Big(g_1^{\sum_{i=1}^n x^b_{i,t^*}}, g_2^a\Big) = e(H(t^*), vk_1)\, e\Big(g_1^{\sum_{i=1}^n x^b_{i,t^*}}, vk_2\Big).$$
It follows that if aggregator $A$ is able to output a correct guess $b^*$ for the bit $b$ with a non-negligible advantage $\epsilon$ (i.e., is able to break the aggregator obliviousness of our scheme), then $B$ will break the aggregator obliviousness of the PSA scheme with the same non-negligible advantage $\epsilon$ by outputting the guess $b^*$.
Since the PSA scheme ensures aggregator obliviousness under the DDH assumption in $G_1$, we can conclude that our scheme also ensures aggregator obliviousness: $\Pr[A^{AO}] \le \frac{1}{2} + \epsilon(\kappa)$ as long as DDH holds in $G_1$.
5.2 Aggregate Unforgeability
We first introduce a new assumption that is used in the security analysis of our PUDA instantiation. Our new assumption, hereafter named LEOM, is a variant of the LRSW assumption [14], which is proven secure in the generic model [16] and is used in the construction of CL signatures [5]. Without loss of generality we assume a set $I$ of size $n$ and an index $t$. The $\mathcal{O}_{LEOM}$ oracle chooses $\{\gamma_i\}_{i=1}^n$ for all $i \in I$ and $\delta \in \mathbb{Z}_p$ uniformly at random and keeps them secret. It also gives the public key $(g_2^{\sum_{i=1}^n \gamma_i}, g_2^\delta)$ to the adversary and chooses $\alpha \in G_1$ at random. The adversary makes bulk queries $(i, t, \{x_{i,t}\}_{i=1}^n)$ for all $i \in I$, and the $\mathcal{O}_{LEOM}$ oracle chooses $\beta_t \in \mathbb{Z}_p$ uniformly at random and replies with $\{(\alpha, \beta_t, \beta_t^{\gamma_i} \alpha^{\delta x_{i,t}})\}_{i=1}^n$ for each different $t$. $\mathcal{O}_{LEOM}$ aborts if it receives a bulk query for a $t$ for which there is an $i' \in I$ with $i = i'$ and $x_{i,t} \neq x'_{i,t}$. In the end the adversary succeeds if it outputs a tuple $(t, z, \alpha, \beta_t, \beta_t^{\sum_{i=1}^n \gamma_i} \alpha^{\delta z})$ for a $t$ for which $\sum_{i=1}^n x_{i,t} \neq z$.
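Theorem 2. (LEOM Assumption) Let $\mathcal{G}$ be an algorithm that on input the security parameter $\kappa$ outputs the parameters of a bilinear group $G = (e, G_1, G_2, g_1, g_2, p)$. Define $\Delta = g_2^\delta$ and $\Gamma = g_2^{\sum_{i=1}^n \gamma_i}$, with $(\Delta, \Gamma) \in G_2^2$, for $\delta, \gamma_i \in \mathbb{Z}_p$, $\forall i \in I$. Consider an oracle $\mathcal{O}_{LEOM}$ that on input a set of queries $(i, t, \{x_{i,t}\}_{i=1}^n)$ responds with $(\alpha, \beta_t, \beta_t^{\gamma_i} \alpha^{x_{i,t} \delta})$ for uniformly random elements $\alpha \in G_1$, $\beta_t \in \mathbb{Z}_p$.
Then for all probabilistic polynomial time adversaries $A$ the probability:
$$\Pr\Big[G \leftarrow \mathcal{G}(1^\kappa);\ \delta, \gamma_i \in \mathbb{Z}_p;\ \big(\Delta = g_2^\delta,\ \Gamma = g_2^{\sum_{i=1}^n \gamma_i}\big);\ (t, z, a, b, c) \leftarrow A^{\mathcal{O}_{LEOM}(i, t, \{x_{i,t}\}_{i=1}^n)} :$$
$$\Big(z \neq \sum_{i=1}^n x_{i,t},\ t\Big) \wedge a = \alpha \wedge b = \beta_t \wedge c = \beta_t^{\sum_{i=1}^n \gamma_i} \alpha^{z\delta}\Big] \le \epsilon_2(\kappa).$$
Due to space limitations, the security evidence for LEOM is deferred to the Appendix.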
We show in our analysis that a Type I Forgery implies a break of the BCDH assumption, and next that a Type II Forgery implies a break of the LEOM assumption.
Theorem 3. Our scheme achieves aggregate unforgeability against a Type I Forgery under the BCDH assumption in the random oracle model.
Proof. We show how to build an attacker $B$ that solves BCDH in $(G_1, G_2, G_T)$. Let $g_1$ and $g_2$ be two generators of $G_1$ and $G_2$ respectively. $B$ receives the challenge $(g_1, g_2, g_1^a, g_1^b, g_1^c, g_2^a, g_2^b)$ from the BCDH oracle $\mathcal{O}_{BCDH}$ and is asked to output $e(g_1, g_2)^{abc} \in G_T$. $B$ simulates the interaction with $A$ in the two phases (Setup, Learning) as follows:
Setup:
– To simulate the $\mathcal{O}^A_{Setup}$ oracle, $B$ selects uniformly at random $2n$ keys $\{k_i\}_{i=1}^n, \{y_i\}_{i=1}^n \in \mathbb{Z}_p$ and outputs the public parameters $P = (\kappa, p, g_1, g_2, G_1, G_2)$, the verification key $VK = (vk_1, vk_2) = (g_2^{b \sum_{i=1}^n k_i}, g_2^a)$ and the secret key of the Aggregator $SK_A = -\sum_{i=1}^n y_i$.
Learning phase
– $A$ is allowed to query the random oracle $H$ for any time interval. $B$ constructs an $H$-list and responds to $A$'s query as follows:
1. If query $(t)$ already appears in an $H$-tuple $\langle t : r_t, coin(t), H(t) \rangle$ of the $H$-list, it responds to $A$ with $H(t)$.
2. Otherwise it selects a random number $r_t \in \mathbb{Z}_p$ and flips a random coin $coin(t) \xleftarrow{\$} \{0, 1\}$. With probability $p$, $coin(t) = 0$ and $B$ answers with $H(t) = g_1^{r_t}$. Otherwise, if $coin(t) = 1$, $B$ responds with $H(t) = g_1^{c r_t}$. In both cases $B$ updates the $H$-list with the new $H$-tuple $\langle t : r_t, coin(t), H(t) \rangle$.
– Whenever $A$ submits a query $(t, uid_i, x_{i,t})$ to $\mathcal{O}^A_{EncTag}$, $B$ constructs a $T$-list and responds as follows:
1. If $A$ has never before queried the $\mathcal{O}^A_{EncTag}$ oracle at time interval $t$, then:
(a) $B$ initializes variable $\Sigma_t = 0$.
(b) $B$ calls the simulated random oracle, receives the result for $H(t)$ and appends the $H$-tuple $\langle t : r_t, coin(t), H(t) \rangle$ to the $H$-list.
(c) If $coin(t) = 1$ then $B$ stops the simulation.
(d) Otherwise it chooses the key $k_i$, where $i = uid_i$, to be used as secret tag key from the set of keys $\{k_i\}$ chosen by $B$ in the Setup phase.
(e) $B$ sends to $A$ the tag $\sigma_{i,t} = g_1^{r_t b k_i} g_1^{a x_{i,t}} = H(t)^{b k_i} g_1^{a x_{i,t}}$, which is a valid tag for the value $x_{i,t}$. Notice that $B$ can correctly compute the tag without knowing $a$ and $b$, using the BCDH problem parameters $g_1^a, g_1^b$.
(f) $B$ also chooses a secret encryption key $y_i \in \{y_i\}_{i=1}^n \subset \mathbb{Z}_p$ and computes the ciphertext as $c_{i,t} = H(t)^{y_i} g_1^{x_{i,t}}$. The simulation is correct since $A$ can check that the sum $\sum_{i=1}^n x_{i,t}$ corresponds to the ciphertexts given by $B$ with its decryption key $SK_A = -\sum_{i=1}^n y_i$, considering that the attacker has made distinct encryption queries for all the $n$ users in the scheme at time interval $t$.
(g) $B$ sets $\Sigma_t = \Sigma_t + x_{i,t}$ and updates the $T$-list with the tuple $\langle t, uid_i, x_{i,t}, \sigma_{i,t} \rangle$.
2. Else if the $T$-list contains $i' = uid_i$ and $x_{i,t} = x'_{i,t}$, then $B$ fetches the corresponding $\sigma_{i,t}$ from the list and forwards it to $A$.
3. Else if the $T$-list contains $i' = uid_i$ and $x_{i,t} \neq x'_{i,t}$, then $B$ aborts.
4. Otherwise ($0 < cnt_t < n$), $B$ looks in the $H$-list for the tuple indexed by $t$ in order to get $\langle t : r_t, coin(t), H(t) \rangle$. If the tuple does not exist, then $B$ tosses a random coin, and if $coin(t) = 1$ then $B$ aborts. If $coin(t) = 0$ then $B$ computes the tag exactly as in steps 1(d), (e), (f) and (g): it chooses a key $k_i$, where $i = uid_i$, from the selected keys $\{k_i\}$. It constructs the tag as $\sigma_{i,t} = g_1^{r_t b k_i} g_1^{a x_{i,t}} = H(t)^{b k_i} g_1^{a x_{i,t}}$ and the ciphertext as $c_{i,t} = H(t)^{y_i} g_1^{x_{i,t}}$. Finally $B$ sets $\Sigma_t = \Sigma_t + x_{i,t}$ and updates the $T$-list with the tuple $\langle t, uid_i, x_{i,t}, \sigma_{i,t} \rangle$.
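Now, when $B$ receives the forgery $(sum_{t^*}, \sigma_{t^*})$ at time interval $t = t^*$, it continues if $sum_{t^*} \neq \Sigma_{t^*}$. $B$ first looks up the $H$-list for time $t^*$ in order to fetch the appropriate tuple.
– If $coin(t^*) = 0$ then $B$ aborts.
– If $coin(t^*) = 1$ then, since $A$ outputs a valid forged $\sigma_{t^*}$ at $t^*$, the following equation should hold:
$$e(\sigma_{t^*}, g_2) = e(H(t^*), vk_1)\, e(g_1^{sum_{t^*}}, vk_2),$$
which is true when $A$ makes $n$ queries for time interval $t^*$ for distinct users to the $\mathcal{O}^A_{EncTag}$ oracle during the Learning phase. As such, $\sigma_{t^*} = g_1^{c r_t b \sum k_i}\, g_1^{a\,sum_{t^*}}$. Finally $B$ outputs:
$$e\Big(\Big(\frac{\sigma_{t^*}}{g_1^{a\,sum_{t^*}}}\Big)^{\frac{1}{r_t \sum k_i}}, g_2^a\Big) = e\Big(\Big(\frac{g_1^{c r_t b \sum k_i}\, g_1^{a\,sum_{t^*}}}{g_1^{a\,sum_{t^*}}}\Big)^{\frac{1}{r_t \sum k_i}}, g_2^a\Big) = e\Big(\big(g_1^{c r_t b \sum k_i}\big)^{\frac{1}{r_t \sum k_i}}, g_2^a\Big) = e(g_1^{bc}, g_2^a) = e(g_1, g_2)^{abc}.$$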
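Let $\mathcal{A}^{AU1}$ be the event that $\mathcal{A}$ successfully produces a Type I forgery $\sigma_t$ for our PUDA protocol, and suppose that it happens with some non-negligible probability $\epsilon'$. Then $\Pr[\mathcal{B}^{BCDH}] = \Pr[\mathsf{event}_0]\,\Pr[\mathsf{event}_1]\,\Pr[\mathcal{A}^{AU1}] = p(1-p)^{q_H-1}\epsilon'$, for $q_H$ random oracle queries and with probability $\Pr[\mathsf{coin}(t) = 0] = p$. Since this would be non-negligible, we reach a contradiction with the assumed hardness of the BCDH problem, and finally $\Pr[\mathcal{A}^{AU1}] \leq \epsilon_1$, where $\epsilon_1$ is a negligible function.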
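To get a feel for the size of the reduction loss, the expression $p(1-p)^{q_H-1}\epsilon'$ can be evaluated numerically. The sketch below takes the expression exactly as stated in the proof. The function names are hypothetical, and the choice $p = 1/q_H$ is not made in the text: it is simply the value that maximizes $p(1-p)^{q_H-1}$ on $(0,1)$, obtained by setting the derivative to zero.

```python
import math


def reduction_success(p, q_h, eps):
    """Evaluate p * (1 - p)**(q_h - 1) * eps, as stated in the proof."""
    return p * (1.0 - p) ** (q_h - 1) * eps


def best_coin_bias(q_h):
    """Value of p in (0, 1) that maximizes p * (1 - p)**(q_h - 1)."""
    return 1.0 / q_h


def lower_bound(q_h, eps):
    """Standard bound eps / (e * q_h), reached up to a constant at p = 1/q_h."""
    return eps / (math.e * q_h)
```

With this choice of $p$, the success probability of $\mathcal{B}$ degrades only linearly in the number of random oracle queries, which is the usual shape of such coin-tossing reductions.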
Theorem 4. Our scheme guarantees aggregate unforgeability against a Type II Forgery under the LEOM assumption in the random oracle model.
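Proof. (Sketch) The $\mathcal{O}^{\mathcal{A}}_{\mathsf{EncTag}}$ oracle behaves equivalently to the oracle in the LEOM assumption. $\mathcal{B}$ chooses secret encryption keys $\{ek_i\}_{i=1}^{n}$ and sends to $\mathcal{A}$ the secret decryption key $SK_A = -\sum_{i=1}^{n} ek_i$. $\mathcal{B}$ also receives the public key $(vk_1 = g_2^{\sum_{i=1}^{n}\gamma_i}, vk_2 = g_2^{\delta})$ from the $\mathcal{O}_{LEOM}$ oracle and forwards it to $\mathcal{A}$ along with the public parameters $P = (\kappa, p, g_1 = \alpha, g_2, \mathbb{G}_1, \mathbb{G}_2)$. For a random oracle query $H(t)$ the simulator $\mathcal{B}$ queries $\mathcal{O}_{LEOM}$ with input $(i \in I, t, x_{i,t} \leftarrow_{\$} \mathbb{Z}_p)$, which replies with $(a = \alpha \wedge b = \beta_t \wedge c = \beta_t^{\gamma_i}\alpha^{x_{i,t}\delta})$. Finally $\mathcal{B}$ forwards $H(t) = \beta_t$ to $\mathcal{A}$. For queries $(i = \mathsf{uid}, t, x_{i,t})$ to the $\mathcal{O}^{\mathcal{A}}_{\mathsf{EncTag}}$ oracle the simulator $\mathcal{B}$ returns $\sigma_{i,t} = \beta_t^{\gamma_i}\alpha^{\delta x_{i,t}}$ from the $\mathcal{O}_{LEOM}$ oracle as a tag, and constructs the ciphertext as $c_{i,t} = \beta_t^{ek_i} g_1^{x_{i,t}}$. $\mathcal{A}$ is able to correctly verify the sum, more precisely: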
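$$\begin{aligned}
e\Big(\prod_{i=1}^{n}\sigma_{i,t}, g_2\Big)
&= e\Big(\prod_{i=1}^{n}\beta_t^{\gamma_i}\alpha^{\delta x_{i,t}}, g_2\Big)
= e\Big(\beta_t^{\sum_{i=1}^{n}\gamma_i}\,\alpha^{\delta\sum_{i=1}^{n}x_{i,t}}, g_2\Big) \\
&= e\Big(\beta_t, g_2^{\sum_{i=1}^{n}\gamma_i}\Big)\, e\Big(\alpha^{\sum_{i=1}^{n}x_{i,t}}, g_2^{\delta}\Big)
= e(\beta_t, vk_1)\, e\Big(\alpha^{\sum_{i=1}^{n}x_{i,t}}, vk_2\Big)
\end{aligned}$$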
Therefore, from the point of view of $\mathcal{A}$, the tags $\sigma_{i,t} = \beta_t^{\gamma_i}\alpha^{\delta x_{i,t}}$ correspond to well formed verifiable tags. Notice that if $\mathcal{A}$ outputs a Type II Forgery with some non-negligible probability, then $\mathcal{B}$ breaks the LEOM assumption with non-negligible probability as well. This leads to a contradiction under the LEOM assumption and accordingly $\Pr[\mathcal{A}^{AU2}] \leq \epsilon_2$ for a negligible function $\epsilon_2$. We conclude that our scheme guarantees aggregate unforgeability for a Type II Forgery under the LEOM assumption in the random oracle model.
Participant      Computation                        Communication
User             2 EXP + 1 MUL                      2·l
Aggregator       (n−1) MUL                          2·l
Data Analyzer    3 PAIR + 1 EXP + 1 MUL + 1 HASH    -
Table 1: Performance of tag computation, proof construction and verification operations. l denotes the bit-size of the prime number p.
To conclude the analysis, the success probability $\Pr[\mathcal{A}^{AU}]$ in the aggregate unforgeability game is taken over the union of the success probabilities for the two types of forgeries. As such
$$\Pr[\mathcal{A}^{AU}] = \Pr[\mathcal{A}^{AU1}] + \Pr[\mathcal{A}^{AU2}] \leq \epsilon_1(\kappa) + \epsilon_2(\kappa)$$
where $\epsilon_1$ and $\epsilon_2$ are negligible functions.
5.3 Performance Evaluation
In this section we analyze the extra overhead of ensuring the aggregate unforgeability property in our PUDA instantiation. First, we consider a theoretical evaluation of the mathematical operations that each participant of the protocol, be it a user, the Aggregator or the Data Analyzer, has to perform for the verifiability transcripts, that is, the computation of the tag by each user, of the proof by the Aggregator, and the verification of the proof by the Data Analyzer. We also present an experimental evaluation that shows the practicality of our scheme.
To allow the Data Analyzer to verify the correctness of computations performed by an untrusted Aggregator, each user selects a secret key $tk_i \in \mathbb{Z}_p$ uniformly at random. The key dealer distributes $g_1^{a} \in \mathbb{G}_1$ to each user and publishes $g_2^{a} \in \mathbb{G}_2$, which calls for two exponentiations: one in $\mathbb{G}_1$ and one in $\mathbb{G}_2$. At each time interval $t$ each user computes $\sigma_{i,t} = H(t)^{tk_i}(g_1^{a})^{x_{i,t}} \in \mathbb{G}_1$, which entails two exponentiations and one multiplication in $\mathbb{G}_1$. To compute $\sigma_t$, the Aggregator performs $n-1$ multiplications in $\mathbb{G}_1$: $\sigma_t = \prod_{i=1}^{n}\sigma_{i,t}$. Finally the Data Analyzer verifies by checking the equality $e(\sigma_t, g_2) \stackrel{?}{=} e(H(t), vk_1)\, e(g_1^{\mathsf{sum}_t}, vk_2)$, which requires three pairing evaluations, one hash in $\mathbb{G}_1$, one exponentiation in $\mathbb{G}_1$ and one multiplication in $\mathbb{G}_T$ (see Table 1). The efficiency of PUDA stems from the fact that verification runs in constant time with respect to the number of users. This is of crucial importance since the Data Analyzer may have limited computational power.
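The algebra behind tagging, aggregation and verification can be checked with a small Python sketch. It is a deliberately insecure toy model and not the authors' implementation: every group element is represented by its discrete logarithm modulo a prime `Q`, so that multiplication in a group becomes addition of exponents and the pairing $e(g_1^{u}, g_2^{v}) = g_T^{uv}$ becomes a product modulo `Q`. The function names are hypothetical, the user is handed `a` directly instead of $g_1^{a}$, and $vk_1 = g_2^{\sum_i tk_i}$, $vk_2 = g_2^{a}$ are assumptions chosen to be consistent with the verification equation above.

```python
import hashlib

Q = 2**61 - 1  # toy prime group order (assumption); real curves use much larger orders


def hash_to_exp(t):
    """Discrete log of H(t) in the toy model, derived from SHA-256."""
    digest = hashlib.sha256(str(t).encode()).digest()
    return int.from_bytes(digest, "big") % Q


def tag(tk_i, a, t, x):
    """Log of sigma_{i,t} = H(t)^{tk_i} * (g1^a)^{x}."""
    return (hash_to_exp(t) * tk_i + a * x) % Q


def aggregate(tags):
    """Log of sigma_t: a product in G1 is a sum of logs."""
    return sum(tags) % Q


def pair(u, v):
    """Log (base g_T) of e(g1^u, g2^v)."""
    return u * v % Q


def verify(sigma_t, total, t, vk1, vk2):
    """Check e(sigma_t, g2) == e(H(t), vk1) * e(g1^total, vk2) on logs."""
    lhs = pair(sigma_t, 1)
    rhs = (pair(hash_to_exp(t), vk1) + pair(total, vk2)) % Q
    return lhs == rhs
```

For instance, with tag keys `[3, 5, 8]`, `a = 17` and inputs `[10, 20, 30]` at the same `t`, `verify` accepts the aggregate tag together with the claimed sum 60 and rejects it together with any other claimed sum modulo `Q`. The verifier's work does not depend on the number of users, as in the cost analysis above.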
We implemented the verification functionalities of PUDA with the Charm cryptographic framework [1, 2]. For pairing computations, it inherits the PBC [13] library, which is also written in C. All of our benchmarks were executed on an Intel Core i5 CPU M 560 @ 2.67GHz × 4 with 8GB of memory, running Ubuntu 12.04 32bit. Charm uses three types of asymmetric pairings: MNT159, MNT201 and MNT224. We ran our benchmarks with these three types of asymmetric pairings. The timings for all the underlying mathematical group operations are summarized in Table 3. There is a vast difference in the computation time of operations between $\mathbb{G}_1$ and $\mathbb{G}_2$ for all the curves. The reason is that the bit-length of elements in $\mathbb{G}_2$ is much larger than in $\mathbb{G}_1$.
Operation    MNT159     MNT201     MNT224
Tag          1.2 ms     1.8 ms     2.2 ms
Verify       28.3 ms    42.7 ms    53.5 ms
Table 2: Computational cost of PUDA operations with respect to different pairings.
Operation     MNT159       MNT201       MNT224
HASH in G1    0.139 ms     0.346 ms     0.296 ms
HASH in G2    25.667 ms    41.628 ms    48.305 ms
MUL in G1     0.004 ms     0.0006 ms    0.006 ms
MUL in G2     0.040 ms     0.051 ms     0.054 ms
MUL in GT     0.012 ms     0.015 ms     0.016 ms
EXP in G1     0.072 ms     0.092 ms     0.099 ms
EXP in G2     0.615 ms     0.757 ms     0.784 ms
PAIR          7.077 ms     10.674 ms    13.105 ms
Table 3: Average computation overhead of the underlying mathematical group operations for different types of curves.
As shown in Table 2, the computation of the tags $\sigma_{i,t}$ incurs an overhead on the order of milliseconds, which increases gradually with the bit size of the underlying elliptic curve. The Data Analyzer performs pairing evaluations and computations in the target group, and its cost is independent of the number of users.
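As a rough cross-check, the operation counts of Table 1 can be combined with the per-operation timings of Table 3. The sketch below does only that arithmetic. The dictionary and function names are hypothetical, the timings are copied as reported (including the 0.0006 ms entry for MUL in G1 on MNT201), and the results are estimates from primitive costs, not measurements.

```python
# Per-operation timings in milliseconds, copied from Table 3 (G1, GT and pairing rows only).
TABLE3_MS = {
    "MNT159": {"HASH_G1": 0.139, "MUL_G1": 0.004, "MUL_GT": 0.012,
               "EXP_G1": 0.072, "PAIR": 7.077},
    "MNT201": {"HASH_G1": 0.346, "MUL_G1": 0.0006, "MUL_GT": 0.015,
               "EXP_G1": 0.092, "PAIR": 10.674},
    "MNT224": {"HASH_G1": 0.296, "MUL_G1": 0.006, "MUL_GT": 0.016,
               "EXP_G1": 0.099, "PAIR": 13.105},
}


def tag_estimate_ms(curve):
    """User cost per Table 1: 2 EXP + 1 MUL in G1."""
    ops = TABLE3_MS[curve]
    return 2 * ops["EXP_G1"] + ops["MUL_G1"]


def aggregate_estimate_ms(curve, n):
    """Aggregator cost per Table 1: (n - 1) MUL in G1."""
    return (n - 1) * TABLE3_MS[curve]["MUL_G1"]


def verify_estimate_ms(curve):
    """Data Analyzer cost per Table 1: 3 PAIR + 1 EXP + 1 MUL (in GT) + 1 HASH."""
    ops = TABLE3_MS[curve]
    return 3 * ops["PAIR"] + ops["EXP_G1"] + ops["MUL_GT"] + ops["HASH_G1"]
```

For MNT159 the verification estimate comes to about 21.5 ms, below the 28.3 ms measured in Table 2. The text does not break down the difference, so the estimates are best read as lower bounds on the arithmetic alone. Only `aggregate_estimate_ms` depends on the number of users `n`.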
6 Related Work
In [6], the authors proposed a solution based on homomorphic message authenticators in order to verify the computation of generic functions on outsourced data. Each data input is authenticated with an authentication tag. A composition of the tags is computed by the cloud in order to verify the correctness of the output of a program P. Thanks to the homomorphic properties of the tags, the user can verify the correctness of the program. The main drawback of the solution is that, in order to verify the correctness of the computation, the user has to perform computations that take exactly the same time as the computation of the function f. Backes et al. [3] proposed a generic solution for efficient verification of bounded degree polynomials in time less than the evaluation of f. The solution is based on a closed form efficient pseudorandom function PRF. Contrary to our solution, neither of these solutions provides individual privacy, and they are not designed for a multi-user scenario.
Catalano et al. [8] employed a nifty technique to allow single users to verify computations on encrypted data. The idea is to re-randomize the ciphertext and sign it with a homomorphic signature. Computations are then performed on the randomized ciphertext and the original one. However, the untrusted aggregator is not allowed to learn the aggregate value in cleartext, since the protocols are geared towards cloud-based scenarios.
In the multi-user setting, Choi et al. [9] proposed a protocol in which multiple users outsource their inputs to an untrusted server along with the definition of a functionality f. The server computes the result in a privacy preserving manner without learning the result, and the computation is verified by a user that has contributed to the function input. The users are forced to operate in a non-interactive model, whereby they cannot communicate with each other. The underlying machinery entails a novel proxy-based oblivious transfer protocol, which along with a fully homomorphic scheme and garbled circuits allows for verifiability and privacy. However, the need of fully homo-