• Aucun résultat trouvé

Breaking barriers in secret sharing

N/A
N/A
Protected

Academic year: 2021

Partager "Breaking barriers in secret sharing"

Copied!
61
0
0

Texte intégral

(1)

Breaking Barriers in Secret Sharing

by Tianren Liu

B.Eng., Tsinghua University. (2014)

S.M., Massachusetts Institute of Technology. (2016)

Submitted to the Department of Electrical Engineering and Computer Science in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy at the

MASSACHUSETTS INSTITUTE OF TECHNOLOGY September 2019

c

2019 Massachusetts Institute of Technology. All rights reserved

Signature of Author . . . . Department of Electrical Engineering and Computer Science August 30, 2019

Certified by . . . . Vinod Vaikuntanathan Associate Professor of Electrical Engineering and Computer Science Thesis Supervisor

Accepted by . . . . Leslie A. Kolodziejski Professor of Electrical Engineering and Computer Science Chair, Department Committee on Graduate Students

(2)
(3)

Breaking Barriers in Secret Sharing

by Tianren Liu

Submitted to the Department of Electrical Engineering and Computer Science in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

Abstract

In this thesis, we study secret sharing schemes for general (non-threshold) access functions. In a secret sharing scheme for n parties associated to a monotone function F : {0, 1}n→ {0, 1}, a dealer distributes shares of a secret among n parties. Any subset of parties T ⊆ [n] can jointly reconstruct the secret if F(T ) = 1, and should have no information about the secret if F(T ) = 0. One of the major long-standing questions in information-theoretic cryptography is to determine the minimum size of the shares in a secret-sharing scheme for an access function F. There is an exponential gap between lower and upper bounds for share size: the best known scheme for general monotone functions has shares of size 2n−o(n), while the best lower bound is n2/ log(n).

In this thesis, we improve this more-than-30-year-old upper bound by construct-ing a secret sharconstruct-ing scheme for any access function with shares of size 20.994n and a linear secret sharing scheme for any access function with shares of size 20.999n. As a contribution of independent interest, we also construct a secret sharing scheme with shares of size 2O(˜

n) for a family of 2(n/2n) monotone access functions, out of a total of

2(

n

n/2)·(1+O(log n/n)) of them.

As an intermediate result, we construct the first conditional disclosure of secrets (CDS) with sub-exponential communication complexity. CDS is a variant of secret sharing, in which a group of parties want to disclose a secret to a referee the parties’ respective inputs satisfy some predicate.

The key conceptual contribution of this thesis is a novel connection between secret sharing and CDS, and the notion of (2-server, information-theoretic) private informa-tion retrieval.

Thesis Supervisor: Vinod Vaikuntanathan

(4)
(5)

First and foremost, I am forever thankful to my advisor Vinod Vaikuntanathan, for the freedom and support he gave me.

Becoming his student was an incredible piece of luck. Five years is too short.1

I am thankful to my committee member, Yael Tauman Kalai, Yuval Ishai and Ronald Rivest.

I am grateful to Hoeteck Wee and Yuval Ishai.

They hosted me in Paris and in Haifa, and have been supporting me ever since that.

I am forever thankful to John Steinberger, my undergraduate advisor in Tsinghua. John introduced me to theoretical computer science, and in particular, cryptography.

Taking his courses is a pure joy.

Finally, I would like to thank my parents, Jingtai and Hongmei, for the everlasting love they gave me.

(6)
(7)

Bibliographic Notes

This thesis is based on the following three papers.

[1] Tianren Liu, Vinod Vaikuntanathan, and Hoeteck Wee. Conditional disclosure of secrets via nonlinear reconstruction. In Advances in Cryptology CRYPTO 2017 -37th Annual International Cryptology Conference, Santa Barbara, CA, USA, August 20-24, 2017, Proceedings, Part I, pages 758–790, 2017.

[2] Tianren Liu, Vinod Vaikuntanathan, and Hoeteck Wee. Towards breaking the expo-nential barrier for general secret sharing. In Advances in Cryptology - EUROCRYPT 2018 - 37th Annual International Conference on the Theory and Applications of Cryp-tographic Techniques, Tel Aviv, Israel, April 29 - May 3, 2018 Proceedings, Part I, pages 567–596, 2018.

[3] Tianren Liu and Vinod Vaikuntanathan. Breaking the circuit-size barrier in secret sharing. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, Los Angeles, CA, USA, June 25-29, 2018, pages 699–708, 2018.

The following papers were published during my Ph.D. studies, but are not included in this thesis.

[4] Tianren Liu and Vinod Vaikuntanathan. On basing private information retrieval on NP-hardness. In Theory of Cryptography - 13th International Conference, TCC 2016-A, Tel Aviv, Israel, January 10-13, 2016, Proceedings, Part I, pages 372–386, 2016. [5] Tianren Liu. On basing search SIVP on NP-hardness. In Theory of Cryptography

- 16th International Conference, TCC 2018, Panaji, India, November 11-14, 2018, Proceedings, Part I, pages 98–119, 2018.

[6] Melissa Chase, Yevgeniy Dodis, Yuval Ishai, Daniel Kraschewski, Tianren Liu, Rafail Ostrovsky, and Vinod Vaikuntanathan. Reusable non-interactive secure computation. In Advances in Cryptology - CRYPTO 2019 - 39th Annual International Cryptology Conference, Santa Barbara, CA, USA, August 18-22, 2019, Proceedings, Part III, pages 462–488, 2019.

(8)

Contents

1 Introduction 1

1.1 Preview of Our Techniques . . . 4

1.2 Related Work . . . 8

1.3 Subsequent Work . . . 8

2 Preliminaries 10 3 2-Party CDS 16 3.1 PIR Encoding Implies 2-Server PIR Protocols . . . 17

3.2 PIR Encoding with Linear HD Function . . . 17

3.3 PIR Encoding with Quadratic HD Function . . . 18

3.4 PIR Encoding of Sub-polynomial Length . . . 19

3.5 2-Party CDS based on PIR Encoding . . . 21

4 Multi-Party CDS 23 4.1 A Framework for Multi-Party CDS . . . 23

4.2 Decomposable PIR Encoding from Decomposable Matching Vectors . . . 25

4.3 A Special-Purpose PSM Protocol . . . 28

4.4 Instantiating the Multi-Party CDS Framework . . . 30

4.5 Linear CDS . . . 30

5 General Secret Sharing 33 5.1 Proof Overview . . . 33

5.2 Intermediate Access Function Families . . . 37

5.3 Reduction: Step 1. . . 38

5.4 Reduction: Step 2. . . 40

5.5 Reduction: Step 3. . . 41

5.6 Reduction: Step 4. . . 43

5.7 Reduction to CDS . . . 44

5.8 Better General Secret Sharing Schemes . . . 46

(9)

Chapter 1

Introduction

Secret sharing [Sha79, Bla79] is a powerful cryptographic technique that allows a dealer to distribute shares of a secret to n parties such that certain authorized subsets of parties, and only they, can recover the secret. The original definition of secret sharing is what we now call a t-out-of-n threshold secret sharing scheme, where any set of t or more parties can recover the secret, and no subset of fewer than t parties can learn any information about the secret whatsoever. E.g., in 2-out-of-2 secret sharing, the shares are two random strings such that their XOR is the secret. This construction can be easily generalized to n-out-of-n secret sharing. Shamir [Sha79] shows how to construct t-out-of-n threshold secret sharing for any 1 ≤ t ≤ n using polynomials over a finite field.

A set of parties that can recover the secret is called an authorized set. In Shamir’s threshold secret sharing, authorized sets are all the sets of size at least t. This was generalized by Ito, Saito and Nishizeki [ISN89] to allow authorized sets to be specified by any monotone function. More precisely, a secret sharing scheme realizing a monotone function F : {0, 1}n

{0, 1} is simply a randomized algorithm that on input a secret s, outputs n shares s1, . . . , sn

such that for any (x1, . . . , xn) ∈ {0, 1}n, the collection of shares {si : xi = 1} determine

the secret if F(x1, . . . , xn) = 1 and reveal nothing about the secret otherwise.1 Here “reveal

nothing” means the joint distribution of these shares does not depend on the value of the secret. It is easy to see that t-out-of-n threshold secret sharing corresponds to the special case where F is the threshold function that outputs 1 if and only if at least t of the n input bits are 1.

While the landscape of threshold secret sharing is relatively well-understood, even very basic information-theoretic questions about the more general notion of secret sharing remain embarrassingly open. It is simple to construct a secret sharing scheme realizing any monotone function F : {0, 1}n → {0, 1} where each share is at most 2n bits [ISN89]; the share size can

be improved to O(2n/n) bits [BL88]. We also know that there is an (explicit) monotone

function F : {0, 1}n → {0, 1} that requires a total share size of Ω(n2/ log n) bits [Csi97], a

far cry from the upper bounds. No better lower bounds are known, even in a non-explicit

1The typical formulation of secret sharing refers to a dealer that holds a secret distributing shares to n

parties, such that only certain subsets of parties – described by a so-called access function or access structure – can reconstruct the secret. In our formulation, the randomized algorithm corresponds to the dealer, si

corresponds to the share given to party i, xi∈ {0, 1} indicates whether party i is present in a subset, and F

(10)

sense.

The 2n−o(n) Size Barrier. Closing the exponential gap between the afore-mentioned upper

bounds and lower bounds is a long-standing open problem in cryptography. All known secret sharing schemes [ISN89, BL88, BI92, KW93] require 2n−o(n) share size. The general consensus appears to be that the upper bound is almost tight; see, e.g., [Bei11]. In the secret sharing scheme presented by Benaloh and Leichter [BL88], the access function is written as a monotone formula, and the share size equals the number of gates in the formula (if all gates have fan-in-2). Thus by counting argument, since there are 22n−o(n)

different monotone functions, to construct secret sharing scheme for the worst monotone function under such paradigm, the share size is at least Θ(2n/n).

Karchmer and Wigderson [KW93] present a different framework based on (monotone) span programs to construct secret sharing schemes. The framework captures all linear secret sharing schemes, which are secret sharing schemes where the shares are linear functions of the secret and the randomness, over a fixed finite field. Assume the shares of a linear secret sharing scheme are ` field elements, it’s easy to show that `-field-element randomness is sufficient, and the secret sharing scheme can be specified by its ` × ` coefficient matrix.2 Thus by counting argument, ` ≥ ˜Ω(2n/2) for the worst monotone function. Although all

previous constructions of linear secret sharing for the worst function need 2n−o(n) share size,3

potentially, the share size could be improved to ˜O(2n/2).

In this thesis, we construct the first secret sharing scheme with 2cn share size, where the exponent c is a constant less than 1. Such improvement is also achieved by linear secret sharing, for a slightly worse exponent.

Theorem 1 (Informal). For every monotone access function F : {0, 1}n → {0, 1}, there is

a secret sharing scheme with total share size 20.994n, and a linear secret share scheme with

total share size 20.999n.

The Representation Size Barrier. All known constructions of secret sharing schemes represent the target access functions in a concrete computational model, be it (monotone) circuits, formulas, branching programs or span programs [ISN89,BL88, BI92, KW93]. As a result, the share size in these schemes is polynomially related to the bit-length of the repre-sentation.4 Thus, a reasonable conjecture is that the share size is polynomially related to the

Kolmogorov complexity of the access function. Note that such conjecture is also compatible with Theorem 1.

In this thesis, we show there exists monotone functions whose share size is sub-polynomial in their Kolmogorov complexity.

Theorem 2 (Informal). There is a family of 22n−o(n)

monotone functions, such that every monotone access function in the family has a secret sharing scheme with total share size 2O(˜

√ n).

2The coefficient matrix is the (monotone) span program that computes the access function. 3All the secret sharing schemes in [ISN89, BL88,BI92,KW93] are linear.

4We remark that this state of affairs appears to be true even for the relaxed notion of computationally

(11)

Since the family has 22n−o(n)

monotone functions, counting tells us that the description complexity of most functions in the collection is 2n−o(n). The secret sharing scheme in

Theo-rem 2 is highly non-linear. Non-linearity is necessary since any linear secret sharing scheme for the class has complexity polynomially related to the Kolmogorov, via the monotone span program connection.

Conditional Disclosure of Secrets. As an important intermediate step, we studied conditional disclosure of secrets (CDS) [GIKM00], which is a variant of secret sharing: two parties want to disclose a secret to another party called a referee if and only if their respective inputs satisfy some fixed predicate P : [N ] × [N ] → {0, 1} (illustrated by Figure 1.1). Concretely, Alice holds x ∈ [N ], Bob holds y ∈ [N ] and in addition, they both hold a secret s ∈ {0, 1} (along with some additional private randomness w). The referee knows both x and y but not the secret s or randomness w; Alice and Bob each send a single message to the referee, and want to disclose s to the referee if and only if P(x, y) = 1. How many bits do Alice and Bob need to communicate to the referee?

This is a very simple and natural model where non-private computation requires very little communication (just a single bit), whereas the best upper bound for private computation is exponential. Indeed, in the non-private setting, Alice or Bob can send s to the referee, upon which the referee computes P(x, y) and decides whether to output s or not. This trivial protocol with one-bit communication is not private because the referee learns s even when the predicate is false. In contrast, in the private setting, we have a big gap between upper and lower bounds. The best upper bound we have for CDS for general predicates P requires that Alice and Bob each transmit O(N1/2) bits [BIKK14, GKW15], and the best known lower bound is Ω(log N ) [GKW15,AARV17]. A central open problem is to narrow this gap, namely:

Do there exist CDS protocols for general predicates P : [N ] × [N ] → {0, 1} with o(N1/2) communication?

In this thesis, we address this question in the affirmative, giving a protocol with sub-polynomial No(1) communication.

Theorem 3 (Informal). For any predicate P : [N ] × [N ] → {0, 1}, there is a 2-party CDS protocol with 2O(˜ √log N ) communication.

Multi-party CDS is a natural extension of CDS, where Alice and Bob are replaced by k parties. Informal, k-party CDS allows k parties to disclose a secret to the referee if and only if the k parties’ respective inputs satisfy a fixed predicate P : [N ] × [N ] × . . . × [N ]

| {z }

k times

→ {0, 1}. Concretely, the i-th party hold xi ∈ [N ]; all k parties know a secret bit and some shared

private randomness; each of them sends a message to the referee, such that the referee, who knows (x1, . . . , xk), learns the secret bit if and only if P(x1, . . . , xk) = 1. In this thesis, we

extend our 2-party CDS protocol into multi-party CDS protocol.

Theorem 4 (Informal). For any predicate P : [N ]k → {0, 1}, there is a k-party CDS protocol

with 2O(˜

(12)

One might consider the generalization from 2-party CDS to multi-party CDS as a dynamic process, where the input domain of each party is [N ], and number of parties keeps increasing. There is an alternative interpretation of the generalization, which gives a more intuitive view of our construction: Consider the total input length as a fixed invariant, and number of parties keeps increasing. As the input is evenly split among a greater number of parties, every single party holds a shorter input.

When the total input length is fixed, increasing the number of parties makes the problem harder (i.e. high communication complexity). Let n denote the total input length. For the simplest 2-party case, the formal version of Theorem 3 gives 2-party CDS with 2O(

√ n log n)

communication. The hardest case is when there are n parties, so that each party only has a 1-bit input.5 The formal version of Theorem 3 shows that despite the number of parties,

there is a multi-party CDS with 2O(

n log n) communication. Thus we can argue that our

generalization from 2-party CDS to multi-party CDS only introduces a minor communication overhead – an extra √log n factor in the exponent.

1.1

Preview of Our Techniques

Connection to PIR. The key conceptual contribution of this thesis is a surprising con-nection between private information retrieval (PIR) and secret sharing, via CDS as an inter-mediate step. A similar connection was recently observed by Beimel et al. between PIR and a related model called private simultaneous messages (PSM) [BIKK14]. Private information retrieval (PIR) – in particular, 2-server information-theoretic PIR – is a protocol that allows a client to retrieve the i-th item of a database from two servers without leaking the index i to any single server [CKGS98].

From their definitions (visualized by Figure 1.1), CDS and PIR don’t seem to have much in common. In CDS, the only privacy requirement is to hide a 1-bit secret from the referee if the inputs don’t satisfy a fixed predicate, while in PIR, the privacy requirement is stronger, namely to hide a (multi-bit) index from each of the two servers. Nevertheless, we construct a very general transformation from PIR protocols to CDS protocols. As long as a PIR protocol satisfies some mild constraints (which all existing protocols do), we show how to transform it into a CDS protocol while preserving the communication complexity. In this section, we will present the simplest PIR and CDS protocols, and point out the similarities between them, so that the reader gets a flavor of the generic transformation that will be presented in Chapter 3.

Let us start with PIR protocols. To be concrete, let N be the length of the database. The client knows an index i ∈ [N ], both servers know D ∈ {0, 1}N, and the client should

learn the i-th bit D[i] from the protocol. All know PIR protocols have only 2 rounds of communication. In the first round, the client generates a query q1 (resp. q2) and sends it to

the first server (resp. second server); the server computes an answer a1 (resp. a2) based on

the query and the database D, and sends it back to the client as a response. The privacy requirement says the distribution of q1 (resp. q2) doesn’t depend on the index i.

Here is a simple textbook example of 2-server PIR protocol [CKGS98]: Consider the database D as an ` × (N/`) matrix. Then any index i ∈ [N ] can be split into its higher half

(13)

conditional disclosure of secrets (CDS) Referee x, y Alice x, s Bob y, s private w learns s iff P(x, y) = 1

private information retrieval (PIR)

Client i Server D Server D private w learns D[i] Figure 1.1: Pictorial representations of CDS and PIR.

iH ∈ [`] and lower half iL ∈ [N/`] accordingly, such that D[i] is in the iH-th row and iL-th

column of the matrix. Thus we have D[i] = e|i

H · D · eiL, where ej denotes the vector that

are all zero except its j-th bit equals 1. This observation leads to a PIR protocol where the sender’s queries are `-bit long and the receivers’ answers are (n/`)-bit long.

• The client samples a random vector r ← {0, 1}`, sends q

1 := r to the first server, sends

q2 := r ⊕ eiH to the second server.

• For either server t ∈ {1, 2}, upon receiving qt, sends at:= D · qt to the client.

• The client, upon receiving a1, a2 from the servers, output e|iH · (a1⊕ a2).

Each server learns no information about the index i, as the distribution of each query is always uniform no matter what the value of i is. The client’s output equals D[i] because

e|iH · (a1⊕ a2) = e|iH · (D · q1⊕ D · q2) = e |

iH · D · (q1⊕ q2) = e |

iH · D · eiL = D[i].

Under the generic transformation in Chapter 3, this PIR protocol can be transformed into the CDS protocol presented by Gay, Kerenidis and Wee [GKW15]. Gay et al. consider a harder problem, where Alice’s input x ∈ [N ] is replaced by a table D ∈ {0, 1}N such

that D[i] := P(x, i). Then Bob’s input should be viewed as an index, denoted by i to be consistent with the PIR notation. Similarly, the table can be viewed as an ` × (N/`) matrix, and the index can be split into its higher half iH ∈ [`] and lower half iL ∈ [N/`]. And the

CDS protocol in [GKW15] is as follows:

• Alice and Bob sample r ← {0, 1}`, r0 ← {0, 1}N/` from their shared randomness.

• Alice sends mA:= D · r ⊕ r0 to the referee.

• Bob sends mB,1:= r ⊕ seiL and mB,2 := r 0[i

H] to the referee.

(14)

The referee outputs s when D[i] = 1 as e|i H · (mA⊕ D · mB,1) ⊕ mB,2= e | iH · (D · r ⊕ r 0⊕ D · m B) ⊕ r0[iH] = e|iH · D · (r ⊕ mB) = e|iH · D · seiL = sD[i].

As to privacy, the referee’s view doesn’t depend on the secret when D[i] = 0: (mA, mB,1) is

uniform random as it’s one-time padded by (r, r0), and mB,2 = e|iH · (mA⊕ D · mB,1) when

D[i] = 0.

There are some striking similarities between the above PIR protocol and CDS protocol. There is an apparent mapping between the components of the PIR protocol and the compo-nents of the CDS protocol. The database in PIR is Alice’s input in CDS; the index in PIR is Bob’s input in CDS; the query to the first server is in the shared randomness in CDS; the query to the second server is in Bob’s message when s = 1; Alice’s message is the one-time pad of the first server’s response; and the second server’s response is locally computed by the referee when s = 1.

We show that these similarities are not a coincidence. Indeed, we show a generic transfor-mation from PIR to CDS that preserves total communication complexity (Chapter3). The transformation requires some mild constraints, which are satisfied by all known PIR proto-cols, including the state-of-the-art PIR protocol with 2O(˜ √log N )communication [DG15]. Thus

it yields a CDS protocol of the same communication complexity, which proves Theorem 3. From CDS to Secret Sharing. As shown by [BIKK14], 2-party CDS is equivalent to secret sharing for the so-call “forbidden bipartite graph” [SS97]. In forbidden bipartite graph secret sharing, the access function is specified by a bipartite graph: each vertex is a party; a group of vertices can recover the secret if and only if they contain two vertices from the same side, or two adjacent vertices. Given a 2-party CDS protocol for predicate P : [N ] × [N ] → {0, 1}, the corresponding bipartite graph is the one with N parties on each side, whose adjacency matrix is P. It’s easy to construct a secret sharing scheme for this forbidden bipartite graph where the share size ≈ the communication complexity of the CDS protocol, as follows.

• Sample random r1, r2 ← {0, 1} and the shared randomness w of the CDS protocol.

• For each i ∈ [N ], the i-th share is Alice’s message on input i, secret r1⊕ r2⊕ s and

shared randomness w.

• For each i ∈ [N ], the (N + i)-th share is Bob’s message on input i, secret r1⊕ r2 ⊕ s

and shared randomness w;

• Additionally, give r1 to each of the first N parties; share s among the first N parties

using 2-out-of-N threshold secret sharing; give r2 to each of the the later N parties;

share s among the later N parties using 2-out-of-N threshold secret sharing.

In this secret sharing scheme, two vertices on the same side can recover the secret from the 2-out-of-N threshold secret sharing; two adjacent vertices have r1, r2 and can recover

(15)

r1⊕ r2⊕ s from the CDS messages. The privacy of this secret sharing scheme is also easy

to verify.

Similarly, we show multi-party CDS implies secret sharing for slice functions.6 Slice function is a generalization of threshold function. A function F : {0, 1}n → {0, 1} is a slice

function if there exists a threshold t ∈ [N ] such that

x1+ . . . + xn < t =⇒ F(x1, . . . , xn) = 0,

x1+ . . . + xn > t =⇒ F(x1, . . . , xn) = 1.

That is, the definition doesn’t restrict the output value if the input’s Hamming weight equals the threshold. There are 2(nt) different slice functions for threshold t. We show that for any

slice function F : {0, 1}n→ {0, 1}, a secret sharing scheme realizing F can be constructed as

from n-party CDS protocol for F.

In Theorem 4, we construct k-party CDS protocols for any predicate P : [N ]k → {0, 1}

with communication complexity 2O(˜

k log N ) (see Chapter 4). By combining with our

con-struction of slice function secret sharing (let N = 2 and k = n), it implies secret sharing schemes for arbitrary slice functions with share size 2O(˜ √n). Note that there are more than

2(n/2n ) different slice functions, thus the bit-length of the representation of most slice functions

is at least ˜Ω(2n). This proves Theorem 2.

General Secret Sharing. We construct the first secret sharing scheme for arbitrary mono-tone functions with O(2cn) share size, where constant c < 1. Our construction can be

explained by combining the formula-based paradigm for constructing general secret shar-ing [BL88] with the slice functions secret sharing with sub-exponential share size.

Any monotone function is computed by a monotone formula, i.e. a formula using only (unbounded fan-in) AND, OR as gates. Benaloh and Leichter present a formula-based gen-eral secret sharing scheme, such that the share size is no more than the formula size [BL88], as follows.

• Fix an monotone formula computing the access function; • Tag the final output wire with the secret s;

• Propagate from the output wire to input wires: For each OR gate, tag its input wires by the tag of its output wire; For each AND gate, tag its input wires by an additive secret sharing of the tag of its output wire;

• The share of the i-th party consists of all tags on its input wires.

We generalize this framework by extending the base gates to include all slice functions, then the propagation step is replaced as

• Propagate from the output wire to input wires: For each F gate, share the tag of its output wire using a secret sharing scheme realizing F, tag its input wires by the generated shares.

(16)

The share size of the resulting scheme is larger than the formula size by a sub-exponential factor (if the formula is of constant depth).

The new framework is complemented by a formula size upper bound that is implic-itly shown in Chapter 5: Every monotone function is computed by a constant-depth size-O(20.994n) monotone formula if the base gates also include all n-bit-to-1-bit slice functions.

Therefore, we construct secret sharing schemes for general monotone functions with share size O(20.994n).

1.2

Related Work

Amortization. For both CDS and secret sharing, an interesting question is whether the efficiency can be improved for long secrets. For threshold secret sharing, this question is well understood. When the secret is long, the share size equals the secret size [Sha79]. On the other hand, when the secret is 1-bit, the share size is at least log(n) − O(1) [KN90,BGK16]. Recently, Applebaum et al. [AARV17] show an elegant 2-party CDS protocol for long secret, where for sufficiently long secret (double exponential in the input length), the message size is input length times the secret size. The result was later improved by Applebaum and Arkis [AA18], where they construct multi-party CDS for long secret where the message size is only 4 times the secret size.

Private Simultaneous Messages (PSM). PSM, introduced by Feige, Kilian and Naor [FKN94], is a primitive similar to CDS. In a k-party PSM protocol specified by function F : [N ]k → {0, 1}, there are k parties, each holds a private input in [N ] and a common random string, and sends a message to the referee. The referee, upon receiving k messages, learns the value of F applied to the private inputs without learning extra information about the inputs. Beimel et al. [BIKK14] construct 2-party PSM with communication complexity O(√N ). The best lower bound for 2-party PSM is 3 log N −O(log log N ) [AHMS18,FKN94]. Beimel, Kushilevitz and Nissim [BKN18] generalized the construction as a k-party PSM with communication size O(poly(k)·Nk/2). Assouline and Liu [AL19] improve the communication complexity to Ok(N

k−1 2 ).

Limitation of Our Secret Sharing Construction. The share size of our general secret sharing scheme is roughly the monotone formula size for computing the access function, where the formula is constant-depth and uses AND, OR and all slice functions as gates. As observed by Beimel and Kamath [Bei19,Kam19], the size of such formula is lower bounded by 2nΩ(1), as all slice functions are computed by small monotone real formulas [Ros97], and the size of monotone real formula (computing arbitrary monotone boolean function) is lower bounded by 2nΩ(1) [Pud97,HC99].

1.3

Subsequent Work

Following on the heels of our work, Applebaum et al. [ABF+19] show a general secret sharing

(17)

multi-party CDS, thus can also be explained in the formula-based paradigm. Implicitly, Applebaum et al. show every monotone function is computed by a constant-depth size-O(20.892n) monotone formula if the base gates also include slice functions. For the case of linear secret sharing, Applebaum et al. improve the share size to O(20.942n), following the

same paradigm.

Beimel and Peter [BP19] further improve the share size of general secret share to 20.762n, and their schemes are linear. Their secret share schemes are based on robust CDS, a variant of 2-party CDS. In robust CDS, even if both Alice and Bob generates multiple messages according to multiple different inputs, these messages leak no information about the secret unless there is a pair of inputs that satisfies the predicate.

(18)

Chapter 2

Preliminaries

Let N := {0, 1, . . .} denote the set of all nature numbers, and let [n] := {1, . . . , n}. Let F denote a field, R denote a ring. For prime power p, let Fp denote the unique finite field

of size p. A vector will be denoted by a bold face lowercase letter. For a vector v, let v[i] denote its i-th entry. For vectors v, w of same length, let hv, wi denote their inner product. For a set T , let 2T denote the set consisting of all subsets of T . For a set T and non-negative

integer k, let Tk denote the set consisting of all size-k subsets of T .

2.1

Partitions

Definition 1. A k-partition of [n] is a tuple of d disjoint subsets P1, . . . , Pk ⊆ [n] such that

Sk

t=1Pt= [n].

Definition 2. For an integer k divides n, an even k-partition of [n] is a k-partition (P1, . . . , Pk)

such that |P1| = . . . = |Pk| = n/k.

2.2

Private Information Retrieval (PIR)

Definition 3 (2-server Private Information Retrieval (PIR) [CKGS98]). A 2-server PIR pro-tocol consists of a randomness space W and a tuple of deterministic functions (Q1, Q2, A1, A2, R)

Qj : [n] × W → {0, 1}ccj,q for all j ∈ {1, 2},

Aj : {0, 1}n× {0, 1}ccj,q → {0, 1}ccj,a for all j ∈ {1, 2},

R : {0, 1}cc1,a × {0, 1}cc2,a × [n] × W → {0, 1},

where n is the data size and cc1,q+ cc2,q+ cc1,a+ cc2,a is the total communication complexity.

A PIR protocol satisfies the following properties:

(prefect correctness.) For all index i ∈ [n], data D ∈ {0, 1}n and randomness w ∈ W,

 q1 = Q1(i, w)  ∧q2 = Q2(i, w)  ∧a1 = A1(D, q1)  ∧a2 = A2(D, q2)  =⇒ R(a1, a2, i, w) = D[i].

(19)

(information-theoretic privacy.) For all j ∈ {1, 2}, the distributions of Qj(i, w) and

Qj(i0, w) are identicial for all i, i0 ∈ [n] if w is randomly sampled.

2.3

Private Simultaneous Messages (PSM)

Definition 4 (private simultaneous message (PSM)). A k-party PSM functionality is spec-ified by its input spaces X1, . . . , Xk, output space Y, and a mapping f : X1× . . . × Xk → Y.

A PSM protocol for functionality f consists of a randomness space W and a tuple of deterministic functions (M1, . . . , Mk, R)

Mi : Xi× W → {0, 1}cci, for all i ∈ [k],

R : {0, 1}cc1 × . . . {0, 1}cck → {0, 1},

where cci is the communication complexity of the i-th party, cc := cc1+ . . . + cck is the total

communication complexity.

A PSM protocol for f satisfies the following properties:

(correctness.) For all input (x1, . . . , xk) ∈ X1× . . . × Xk and randomness w ∈ W,

R(M1(x1, w), . . . , Mk(xk, w)) = f (x1, . . . , xk)

(privacy.) There exists a randomized simulator S, such that for any input (x1, . . . , xk) ∈

X1× . . . × Xk, the joint distribution of M1(x1, w), . . . , Mk(xk, w) is perfectly

indistin-guishable from S(f (x1, . . . , xk)), where the distributions are taken over w ← W and

the coin tosses of S.

2.4

Conditional Disclosure of Secrets (CDS)

In a k-party CDS scheme, there are k parties who know a secret message s and jointly hold input x. The goal is to let another party, called the referee, learn the secret s if and only if P(x) = 1, for a fixed predicate P. These parties cannot communicate with each other, but instead they have access to a common random string (CRS). Their each sends a single message to the CDS referee. At the end of which the referee, who already knows x, should learn s if and only if P(x) = 1.

Definition 5 (conditional disclosure of secrets (CDS) [GIKM00]). Let input spaces X1, . . . , Xk,

secret space {0, 1} and randomness space W be finite sets. Fix a predicate P : X1× X2 ×

. . . × Xk → {0, 1}. A cc-conditional disclosure of secrets (CDS) protocol for P is a tuple of

deterministic functions (B1, . . . , Bk, C)

Transmitting functions Bi : {0, 1} × Xi× W → {0, 1}cc

Reconstruction function C : X1× . . . × Xk× {0, 1}cc×k → {0, 1}

(20)

(reconstruction.) For all (x1, . . . , xk) ∈ X1 × . . . × Xk such that P(x1, . . . , xk) = 1, for all

w ∈ W, and for all s ∈ {0, 1}:

C(x1, . . . , xk, B1(s, x1; w), . . . , Bk(s, xk; w)) = s .

(privacy.) There exists a randomized algorithm S such that for all input tuple (x1, . . . , xk) ∈

X1×. . .×Xksatisfying P(x1, . . . , xk) = 0 and for any secret s ∈ {0, 1}, the joint

distribu-tion of B1(s, x1; w), . . . , Bk(s, xk; w) is perfectly indistinguishable from S(x1, . . . , xk),

where the randomness are taken over w← W and the coin tosses of S.r

Definition 6 (linear CDS). An CDS scheme (B1, . . . , Bk, C) is linear over binary field if the

randomness space W is a vector space over binary field and Bi is a linear function on the

secret and the randomness, i.e. for all i ∈ [k], xi ∈ Xi, the mapping (s, w) 7→ Bi(s, xi, w) is

linear.

2-party CDS Predicates. We consider the following 2-party CDS predicate families: • Index INDEXN: X := {0, 1}N, Y := [N ] and

PINDEX(D, i) = D[i].

• All (“worst”) functions ALLN: a fixed function F : [N ] × [N ] → {0, 1}, X = Y := [N ]

PALL(x, y) = F (x, y)

Multi-party CDS Predicates. We consider the following predicates: • Index INDEXk+1

N : An index i ∈ [N ] is distributed amongst the first k parties. Let

X1 = · · · = Xk := [ k

N ], Xk+1 = {0, 1}N and under the natural mapping [N ] 3 i 7→

(i1, . . . , ik) ∈ ([ k

√ N ])k,

PINDEX(i1, . . . , ik, D) = 1 iff D[i] = 1

Note that D can also be interpreted as the characteristic vector of a subset of [N ]. • All (“worst”) predicates ALLk

N: An index i ∈ [N ] is distributed among the k parties

as before and the predicate is specified by a truth table. Let X1 = . . . = Xk := [ k

√ N ] and there is a fixed public function F : [N ] → {0, 1}. under the natural mapping i ∈ [N ] 7→ (i1, . . . , ik) ∈ [

k

√ N ]k,

PALL(i1, . . . , ik) := F (i).

ALLkN is an easier predicate (family) than INDEXk+1N , as any CDS protocol for INDEXk+1N implies a CDS protocol for ALLkN with the same total communication complexity.

The 2-party predicate INDEXN is the same as INDEX2N. And the 2-party parties

(21)

2.5

Secret Sharing

Definition 7 (general secret sharing). A general secret sharing scheme over n parties is spec-ified by a a monotone boolean function F : 2[n]→ {0, 1}. For any monotone F : 2[n] → {0, 1},

an information-theoretic secret-sharing scheme realizing access function F is a randomized algorithm

Share : {0, 1} × W → ({0, 1}ss)n

that on input a secret bit, outputs n shares s1, . . . , sn ∈ {0, 1}ss satisfying the following

properties:

(correctness.) For all T ⊆ [n] that F(T ) = 1, there exists a reconstruction algorithm CT : ({0, 1}ss)|T | → {0, 1} such that for all σ ∈ {0, 1}, w ∈ W,

Share(σ; w) = (s1, . . . , sn) =⇒ CT((xi)i∈T) = σ.

(privacy.) For all T ⊆ [n] that F(T ) = 0, there exists a simulator whose output on empty input is perfectly indistinguishable from the joint distribution of (si)i∈T for any secret

σ ∈ {0, 1}, where (s1, . . . , sn) := Share(σ; w) and the randomness are taken over w r

← W and the coin tosses of the simulator.

ss is the share size of this secret sharing scheme Share.

Definition 8 (linear secret sharing). A linear secret sharing scheme is a secret sharing scheme where all operations are linear. For example, Share : {0, 1} × W → ({0, 1}ss)n is a

linear secret sharing scheme over binary field if W is a vector space over binary field and Share is a linear function.

Definition 9 (share size complexity). For any monotone boolean function F, the share size of F, denoted by ss(F), is the minimum integer such that there exists a secret sharing scheme realizing F whose share size is ss(F).

Similarly, the linear secret-sharing share size of F, denoted by sslin(F), is defined as the

minimum integer such that there exists a linear secret sharing scheme realizing F whose share size is sslin(F). By definition, sslin(F) ≤ ss(F).

Definition 10 (minimal authorized sets, maximal unauthorized sets). For any monotone function F, a subset T ⊆ [n] is an authorized set according to F if F(T ) = 1, and T is unauthorized set otherwise. Let F−1(1) denote all authorized subsets according to F and F−1(0) denote all unauthorized subsets according to F.

T is a minimal authorized set if T is an authorized set and all proper subsets of T are unauthorized sets. T is a maximal unauthorized set if T is an unauthorized set and all proper supersets of T are authorized sets. Let min F−1(1) denote all minimal authorized subsets according to F and max F−1(0) denote all maximal unauthorized subsets according to F.

(22)

Access Function Families. An access function family over n parties is a collection of monotone functions F : 2[n] → {0, 1}. For any access function family F, the share size of F,

denoted by ss(F), is defined as ss(F) := maxF∈Fss(F).

In the introduction, we already mentioned the following access function families. A few auxiliary access function families will also be defined on demand in Section 5.2.

• All monotone functions: Fn contains all function F : 2[n] → {0, 1} satisfying ∀T ⊆

T0 ⊆ [n], F(T ) ≤ F(T0).

• Threshold functions: Fn

thrshcontains all threshold function Fthres-t for t ∈ [1, n] where

Fthres-t is defined as and ∀T ⊆ [n], Fthres-t(T ) = 1 ⇐⇒ |T | ≥ t.

• Slice functions: Fn

t contains all monotone function F ∈ Fnsuch that ∀T ⊆ [n], (|T | <

t =⇒ F(T ) = 0) ∧ (|T | > t =⇒ F(T ) = 1). • Fat-slice functions: Fn

[a,b] contains all monotone function F ∈ F

n such that ∀T ⊆

[n], (|T | < a =⇒ F(T ) = 0) ∧ (|T | > b =⇒ F(T ) = 1).

Previous Results on Information-Theoretic Secret Sharing.

Lemma 2.1 (conjunction and disjunction). For any access functions F1, F2 ∈ Fn,

ss(F1∨ F2) ≤ ss(F1) + ss(F2), ss(F1∧ F2) ≤ ss(F1) + ss(F2),

sslin(F1∨ F2) ≤ sslin(F1) + sslin(F2), sslin(F1∧ F2) ≤ sslin(F1) + sslin(F2).

Theorem 2.2 (threshold, [Sha79, KN90, BGK16]). For all n ∈ N, sslin(Fnthrsh) ≤ blog nc,

which is optimal as ss(Fnthrsh) = Θ(log n).

Lemma 2.3. For any non-trivial access function F ∈ Fn, for each authorized set A according

to F, there exists a minimal authorized set A0 such that A0 ⊆ A. Symmetrically, for each unauthorized set B according to F, there exists a maximal unauthorized set B0 such that B0 ⊇ B.

2.6

Matching Vector Family

Definition 11 (Matching vector family). For integers N, `, `0, an (N, `, `0)-matching vector family is a collection of N pairs of vectors (ui, vi) for 1 ≤ i ≤ N , where all pairs are in

F`2× F` 0

3, such that for any i, j ∈ [N ]

• hui, uji ∈ {0, 1}, hvi, vji ∈ {0, 1},

• (hui, uji = 0) ∧ (hvj, vji = 0) ⇐⇒ i = j.

Every (N, `, `0)-matching vector family is an (N, max(`, `0))-matching vector family. The later notation is used when we don’t care about the trade-off between ui’s length and vi’s

(23)

This definition is a variant of the one from [Yek08, Efr12], and fits our purposes better. The magical fact about matching vector families is that they exist for lengths `, `0 that are significantly less than N .

Lemma 2.4 ([Gro00]). For every integer N , there is a (N, `)-matching vector family of length ` = 2O(√log N log log N ) = No(1).

2.7

PIR Encoding

Definition 12 (PIR encoding). For any fields (or rings) F, F0 and integers N, `, `0, N vectors u1, . . . , uN ∈ F` and N vectors v1, . . . , vN ∈ F0`

0

form an (N, ` · log |F|, `0 · log |F0|)-PIR

encoding, if for every D ∈ {0, 1}N, there exists a mapping HD : F` → F0` 0

, such that for every i ∈ [N ], w ∈ F`.

hvi, HD(ui+ w) − HD(w)i = 0 ⇐⇒ D[i] = 0.

Every (N, cc, cc0)-PIR encoding is an (N, max(cc, cc0))-PIR encoding. The later notation is used when we don’t care about the trade-off between ui’s length and vi’s length.

(24)

Chapter 3

2-Party CDS

In this chapter, we will show how to construct 2-party CDS protocols, and reveal a connection between CDS and private information retrieval (PIR). As a recap, in a CDS protocol for a predicate P : [N ] × [N ] → {0, 1}, Alice who holds x ∈ [N ] and Bob who holds y ∈ [N ] jointly disclose a secret to a referee if and only if P(x, y) = 1. We start by considering a harder CDS problem, replacing Alice’s input by a table D ∈ {0, 1}N, and considering Bob’s input is an

index i ∈ [N ], such that the referee learns the secret if and only if D[i] = 1. Recall that the referee knows both index i and table D.

Overview We start by presenting a high-level overview of the PIR-to-CDS connection, hiding several technical details. As mentioned earlier, our CDS protocols draw upon tech-niques for non-linear reconstruction developed in the context of information-theoretic PIR. From previous PIR protocols [CKGS98, DG15], we extract a structure called PIR encoding. An (N, `)-PIR encoding is a collection of N length-` vectors u1, . . . , uN, such that for any

database D ∈ {0, 1}N, there exists a length-preserving function H

D, and

∀i and w, hHD(ui+ w), uii − hHD(w), uii = D[i],

where h , i denotes inner product (over a ring that is not specified here). Equation (3.1) also explains why this collection of vectors is called PIR encoding, as it implies a 2-server PIR protocol: The client who knows i can send w, w + ui to the 2 servers, and the servers

answer by HD(w) and HD(ui+ w). Equation (3.1) immediately implies that for all i, D, w

and s ∈ {0, 1}, we have

hHD(sui+ w), uii − hHD(w), uii = sD[i]. (3.1)

Then a 2-party CDS protocol with communication O(`) is constructed as follows: • Alice and Bob share randomness w, r (hidden from the referee);

• Alice sends m1

A:= HD(w) + r.

• Bob sends m1

B:= sui + w and m2B := hui, ri.

(25)

• The referee can now compute sD[i] (and thus s when D[i] = 1) given D, i, (m1

A, m1B, m2B)

using the relation

sD[i] = hHD(sui+ w | {z } m1 B ), uii − hHD(w) + r | {z } m1 A , uii + hr, uii | {z } m2 B

which follows readily from (3.1).

The total communication complexity is O(`). Privacy follows fairly readily from the fact that the joint distribution of (m1

A, m1B) is uniformly random, and that m2B is completely

determined given (m1

A, m1B) and sD[i] along with D, i.

3.1

PIR Encoding Implies 2-Server PIR Protocols

As the name suggests, any PIR encoding scheme can be used to construct a 2-server informa-tion theoretic private informainforma-tion retrieval (PIR) scheme. Indeed, the 2-server PIR schemes of [CKGS98, DG15] can all be viewed as instances of this paradigm.

Lemma 3.1. Every (N, cc, cc0)-PIR encoding implies a 2-server information-theoretic PIR protocol whose communication complexity is 2(cc + cc0).

Moreover, the communication complexity doesn’t have to be balanced. The client’s query length is of cc-bit and the server’s answer length is of cc0-bit.

Proof. For any (N, cc, cc0)-PIR encoding consisting of u1, . . . , uN ∈ F` and v1, . . . , vN ∈ F0` 0

, a PIR protocol can be easily constructed as the following:

Let F` be the randomness space. The querying algorithms are Q

1(i, w) = w, Q2(i, w) =

ui+w. The privacy requirement is satisfied as the two queries forms an additive secret sharing

of ui. The answering algorithm are Aj(D, q) = HD(q) for j ∈ {1, 2}. The reconstruction

algorithm is

R(a1, a2, i, w) =

(

1, if hvi, a1 − a2i 6= 0,

0, if hvi, a1 − a2i = 0.

The correctness comes directly from the definition of PIR encoding.

3.2

PIR Encoding with Linear H

D

Function

Lemma 3.2. For any `, there exists a (N, `, dN/`e)-PIR encoding, in which HD is a linear

function.

Proof. W.l.o.g. assume ` divides n. Split the index i into i1 ∈ [`], i2 ∈ [N/`] in the natural

way, such that D[i1, i2] – when D is interpreted as an ` × (N/`) matrix – always equals D[i].

Then construct a PIR encoding as the following • ui = ei1 ∈ F ` 2, • vi = ei2 ∈ F N/` 2 ,

(26)

• HD(x) = D|x.

This is a valid PIR encoding as

hHD(ui + w) − HD(w), vii = hD|(ei1 + w) − D |w, e i2i = hD|ei1, ei2i = e | i1Dei2 = D[i1, i2] = D[i].

3.3

PIR Encoding with Quadratic H

D

Function

Lemma 3.3. There exists a (N, O(√3

N ))-PIR encoding, in which HDis a quadratic function.

Combining with Lemma 3.1, this PIR encoding implies the 2-server PIR protocol with

3

N communication that was constructed in [CKGS98].

Proof. W.l.o.g. assume N is a cubic number. Split the index i into i1, i2, i3 ∈ [ 3

N ] in the natural way, such that ei = ei1 ⊗ ei2 ⊗ ei3.

Then construct a PIR encoding as the following • ui ∈ F3· 3 √ N 2 is the concatenation of ei1, ei2, ei3. • vi ∈ F1+3· 3 √ N 2 is the concatenation of 1, ei1, ei2, ei3. • HD : F3· 3 √ N 2 → F 1+3·√3N

2 is defined as the following. Split the input x ∈ F 3·√3N 2 into x1, x2, x3 ∈ F 3 √ N

2 , the output of HD(x) is the concatenation of hD, x1⊗ x2⊗ x3i and

hD, (x1+ ei1) ⊗ x2⊗ x3i for i1 ∈ {1, . . . , 3 √ N }, hD, x1⊗ (x2+ ei2) ⊗ x3i for i2 ∈ {1, . . . , 3 √ N }, hD, x1⊗ x2⊗ (x3+ ei3)i for i3 ∈ {1, . . . , 3 √ N }. Let w ∈ F3·3 √ N 2 be the concatenation of w1, w2, w3 ∈ F 3 √ N 2 . Then hHD(w), vii = hD, w1⊗ w2⊗ w3i + hD, (w1+ ei1) ⊗ w2⊗ w3i + hD, w1 ⊗ (w2+ ei2) ⊗ w3i + hD, w1⊗ w2⊗ (w3+ ei3)i Similarly, hHD(ui+w), vii = hD, (w1+ei1)⊗(w2+ei2)⊗(w3+ei3)i+hD, w1⊗(w2+ei2)⊗(w3+ei3)i + hD, (w1+ ei1) ⊗ w2⊗ (w3+ ei3)i + hD, (w1+ ei1) ⊗ (w2 + ei2) ⊗ w3i

Note that addition and subtraction are the same operation in F2. We have

hHD(ui+ w) − HD(w), vii = X b1,b2,b3∈{0,1} hD, (w1+ b1ei1) ⊗ (w2+ b2ei2) ⊗ (w3+ b3ei3)i = hD, ei1 ⊗ ei2 ⊗ ei3i = hD, eii = D[i].

(27)

3.4

PIR Encoding of Sub-polynomial Length

Lemma 3.4. Every (N, `, `0)-matching vector family implies a (N, ` + 1, (`0+ 1) · log 3)-PIR encoding.

Corollary 3.5. There exists a (N, 2O(

log N log log N ))-PIR encoding.

Combining with Lemma 3.1, this PIR encoding implies the 2-server PIR protocol with sub-polynomial communication that was constructed in [DG15].

Construction Overview Let ((u1, v1), . . . , (uN, vN)) ∈ (F`2×F` 0

3)N be a (N, `, `

0)-matching

vector family. For any i ∈ [N ], w ∈ F`

2, [DG15] shows that the value of D[i] is determined

by the following four numbers:

N X j=1 (−1)hw,uji· D[j], D vi, N X j=1 (−1)hw,uji· D[j] · v j E , N X j=1 (−1)hw+ui,uji· D[j], D vi, N X j=1 (−1)hw+ui,uji· D[j] · v j E . (3.2)

Following [DG15], define four index sets, S0,0, S0,1, S1,0, S1,1, such that

St,t0 := {j ∈ [N ]|hui, uji = t, hvi, vji = t0}.

(St,t0 implicitly depends on i.) By the definition of matching vector family, S0,0, S0,1, S1,0, S1,1

is a partition of [N ], and S0,0 = {i}. Also define ct,t0 as

ct,t0 :=

X

j∈St,t0

(−1)hw,uji· D[j].

(ct,t0 implicitly depends on i, D, w.) In particular, c0,0 = (−1)hw,uii· D[i].

Then the values in (3.2) can be represented as linear combinations of c0,0, c0,1, c1,0, c1,1,

N X j=1 (−1)hw,uji· D[j] = c 0,0+ c0,1+ c1,0+ c1,1, D vi, N X j=1 (−1)hw,uji· D[j] · v j E = c0,1 + c1,1, N X j=1 (−1)hw+ui,uji· D[j] = c 0,0+ c0,1− c1,0− c1,1, D vi, N X j=1 (−1)hw+ui,uji· D[j] · v j E = c0,1 − c1,1.

(28)

The linear transformation is full-rank, thus c0,0 is determined by 2c0,0 = N X j=1 (−1)hw,uji· D[j] − D vi, N X j=1 (−1)hw,uji· D[j] · v j E + N X j=1 (−1)hw+ui,uji· D[j] − D vi, N X j=1 (−1)hw+ui,uji· D[j] · v j E .

Notice that c0,0 = 0 ⇐⇒ D[i] = 0, therefore u1, . . . , uN, v1, . . . , vN is very close to a PIR

encoding. And the mapping HD should be similar to x 7→

PN j=1(−1)

hx,uji· D[j] · v j.

Proof of Lemma 3.4. Construct a PIR encoding u01, . . . , u0N, v10, . . . , vN0 as follows: Let (ui, vi)Ni=1

be the given matching vector family. • u0

i ∈ F 1+`

2 is the concatenation of 1 and ui.

• v0 i ∈ F

1+`0

3 is the concatenation of 1 and vi.

• HD(x) is the concatenation of PNj=1(−1)hx,u 0 ji· D[j] and −PN j=1(−1) hx,u0 ji· D[j] · v j.

For any w ∈ F1+`2 , by defining St,t0 the same way as in the construction overview and

defining ct,t0 as ct,t0 :=P j∈St,t0(−1) hw,u0 ji· D[j] for t, t0 ∈ {0, 1}, we have N X j=1 (−1)hw,u0ji· D[j] = c 0,0+ c0,1+ c1,0+ c1,1, D vi, N X j=1 (−1)hw,u0ji· D[j] · v j E = c0,1 + c1,1, N X j=1 (−1)hw+u0i,u0ji· D[j] = −c 0,0− c0,1+ c1,0+ c1,1, D vi, N X j=1 (−1)hw+u0i,u0ji· D[j] · v j E = − c0,1 + c1,1. Therefore hHD(u0i+ w) − HD(w), v0ii = N X j=1 (−1)hw+u0i,u 0 ji· D[j] − D vi, N X j=1 (−1)hw+u0i,u 0 ji· D[j] · v j E − N X j=1 (−1)hw,u0ji· D[j] + D vi, N X j=1 (−1)hw,u0ji· D[j] · v j E = c0,0 = (−1)hw,u0ii· D[i].

(29)

3.5

2-Party CDS based on PIR Encoding

Lemma 3.6. Every (N, cc, cc0)-PIR encoding implies a 2-party CDS for INDEXN

(Fig-ure 3.1) whose communication complexity is O(cc + cc0).

CDS for INDEXN using PIR encoding

Public Knowledge: A PIR encoding consisting of u1, . . . , uN ∈ F` and

v1, . . . , vN ∈ F0` 0

.

Alice’s Input: D ∈ {0, 1}N.

Bob’s Input: i ∈ [N ] and s ∈ {0, 1}. Shared Randomness: w ∈ F`, r ∈ F0`0 • Alice sends m1 A:= HD(w) + r ∈ F0` 0 . • Bob sends m1 B := sui+ w ∈ F`, m2B := hvi, ri ∈ F0 • Referee outputs 1 if hvi, HD(m1B)i − hvi, m1Ai + m 2 B 6= 0, and 0 otherwise.

Figure 3.1: 2-party CDS protocol using PIR encoding.

Protocol Overview. Due to the definition of PIR encoding,

hvi, HD(s · ui+ w) − HD(w)i = 0 ⇐⇒ s · D[i] = 0. (3.3)

Our CDS protocol is essentially a PSM protocol computing hvi, HD(s · ui+ w) − HD(w)i.

When D[i] = 1, whether the output is zero or non-zero leaks s; and when D[i] = 0, the output is always zero thus leaks no information about s.

Proof of Lemma 3.6. The correctness of the protocol is straight-forward. The referee outputs 1 if and only if hvi, HD(m1B)i − hvi, m1Ai + m2B 6= 0, whose left side equals

hvi, HD(m1B)i − hvi, m1Ai + m 2 B = hvi, HD(s · ui+ w)i − hvi, HD(w) + ri + hvi, ri = hvi, HD(s · ui+ w) − HD(w)i. (3.4)

Thus by equation (3.3), when D[i] = 1, the referee will output 1 if and only if s = 1. Privacy follows from by the following observations:

• the joint distribution of Alice’s message m1

A and Bob’s first message m1B is uniformly

(30)

• combining equations (3.3) and (3.4), we have m2

B = −hvi, HD(m1B)i + hvi, m1Ai when

D[i] = 0.

The observations implies a simulator, whose complexity is roughly the same as HD.

Corollary 3.7. There exist CDS protocols for INDEXN and ALLN

• with unbalanced communication complexity O(`), O(dN/`e) and linear reconstruction [GKW15];

• with communication complexity O(√3

N ) and quadratic reconstruction; • with communication complexity 2O(√log N log log N ).

(31)

Chapter 4

Multi-Party CDS

In this chapter, we construct multi-party CDS protocols, by extending the technique for constructing 2-party CDS protocol in Chapter3. In a k-party CDS associated to a predicate P : [N ]k→ {0, 1}: there are k parties P

1, . . . , Pk who know i1, . . . , ik∈ [N ] respectively, and

they jointly disclose a secret to a referee if and only if P(i1, . . . , ik) = 1.

We start by considering a harder problem by adding an extra party Pk+1, and letting

only the referee and Pk+1know the predicate P. Then the new party Pk+1 is essentially Alice

in the two 2-party CDS, and P1, . . . , Pk jointly act as an equivalence of Bob, who knows

index i ∈ [Nk]. In our 2-party CDS, Bob sends sui + w and hui, ri to the referee, where

(ui)i∈[Nk] is a collection of Nk fixed vectors, called PIR encoding. In the multi-party CDS,

index i ∈ [Nk] is distributed among P

1, . . . , Pk so that each of them only knows one piece

of i. Ideally, we want a protocol that allows P1, . . . , Pk to jointly disclose sui+ w, hui, ri to

the referee without leaking more information. This demand is exactly captured by private simultaneous messages (PSM) protocols [FKN94, IK97]. Ishai and Kushilevitz [IK00, IK02] show an efficient PSM protocol exists if the function is simple enough – in particular, if the function (i, s, w, r) 7→ (sui+ w, hui, ri) can be computed by small formulas. The main

difficulty is whether the function i 7→ ui can be computed by small formulas.

In our 2-party CDS protocol, the PIR encoding are matching vectors constructed by Grolmusz [Gro00]. Unfortunately, in Grolmusz’s construction, the function i 7→ ui is not

computed by small formulas. We resolve the problem by revisiting the construction of matching vectors, tweaking it so that each bit of ui is computed by a formula of size O(k),

at the cost of slightly increasing the vector length. The new PIR encoding allows parties P1, . . . , Pk to jointly emulate Bob, with small communication overhead.

4.1

A Framework for Multi-Party CDS

In this section, we describe a framework for constructing multi-party conditional disclosure of information (CDS) protocols. The framework is an extension of the 2-party CDS protocol constructed in Section 3.5, The extension relies on one extra property of the PIR encoding: There has to exist a efficient PSM protocol for a functionality associated to the PIR encoding.

(32)

A Special-Purpose PSM Protocol. In the 2-pary CDS protocol (Figure 3.1), Alice’s input is a truth-table D, and Bob’s input is an index i ∈ [N ]. Bob sends the pair (w + sui, hvi, ri), where s is the secret bit, w, r are random vectors sample from CRS and ui, vi

are picked from a PIR encoding.

In (k + 1)-party CDS for INDEXk+1N , the index i = (i1, . . . , ik) is divided equally among

the first k parties and the last party (nickname Alice) holds the database D. Our plan is to have the first k parties jointly emulate what Bob did in the 2-party CDS. We describe our CDS protocol in Figure4.1, which assumes a PSM protocol for computing the following k-party functionality

Faux: [ k

N ] × . . . × [√kN ] × ({0, 1} × F`× F0`0) → F`× F0 where Faux(i1, . . . , ik; (s, w, r)) 7→ (w + sui, hvi, ri)

(4.1)

Here, w, r and s are common inputs and the index i is divided equally among the first k parties. This, in particular, will enable the k parties to emulate the Bob of 2-party CDS, and jointly send w + sui and hvi, ri to the referee without revealing any extra information.

Our multi-party CDS construction relies on a PIR encoding such that there is a PSM for this special-purpose functionality with small communication complexity.

Theorem 4.1. Assume that there is an (N, cc, cc0)-PIR encoding scheme u1, . . . , uN, v1, . . . , vN

and a PSM for functionality Fauxas defined in (4.1) with communication complexity ccPSM(N, `, k),

then there is a (k + 1)-party CDS protocol for INDEXk+1N (Figure 4.1) with communication complexity cc0+ ccPSM(N, `, k).

Proof. Both the correctness and the privacy of the k-party protocol in Figure 4.1 are quite straight-forward.

Correctness. By the correctness of the PSM protocol, we have m1

B = w + sui and m2B =

hvi, ri. Then correctness of the k-party CDS protocol comes from the same argument as

Lemma 3.6. Note that the referee learns

hvi,HD(m1B)i − hvi, mAi + m2B

= hvi, HD(w + sui)i − hvi, HD(w) + ri + hvi, ri

= hvi, HD(w + sui) − HD(w)i

which, when D[i] = 1, is non-zero if and only if s = 1.

Privacy. Privacy follows by combining the privacy of 2-party CDS and the nested PSM together.

• When D[i] = 0, the joint distribution of mA, m1B, m + B2 can be perfectly simulated

without knowing s (Lemma 3.6);

• Then the distribution mpsm,1, . . . , mpsm,kcan be perfectly simulated merely from (m1B, m2B),


CDS for INDEX^{k+1}_N

Public Knowledge: An (N, cc, cc′)-PIR encoding (u_1, . . . , u_N, v_1, . . . , v_N) and a PSM protocol (B_1, . . . , B_k, C) for F_aux in (4.1).
Input of P_t (1 ≤ t ≤ k): i_t ∈ [N^{1/k}].
Input of P_{k+1} (Alice): D ∈ {0, 1}^N.
Shared Randomness: w ∈ F^ℓ, r ∈ F′^{ℓ′}, secret s ∈ {0, 1}, and randomness crs′ for the nested PSM protocol.

The protocol proceeds as follows.
• For 1 ≤ t ≤ k, the t-th party sends m_psm,t := B_t(i_t, s, w, r; crs′).
• P_{k+1} (Alice) sends m_A := H_D(w) + r ∈ F′^{ℓ′}.
• The referee computes (m^1_B, m^2_B) := C(m_psm,1, . . . , m_psm,k).
• The referee outputs 1 if ⟨v_i, H_D(m^1_B)⟩ − ⟨v_i, m_A⟩ + m^2_B ≠ 0, and outputs 0 otherwise.

Figure 4.1: (k + 1)-party CDS for INDEX^{k+1}_N from PIR encodings and PSM.

Efficiency. The last party (Alice) sends m_A ∈ F′^{ℓ′}, which accounts for cc′ bits. The other parties send cc_PSM(N, ℓ, k) bits in total.
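As a sanity check of the correctness calculation above, the following minimal Python sketch (not from the thesis) verifies the cancellation ⟨v_i, H_D(m^1_B)⟩ − ⟨v_i, m_A⟩ + m^2_B = ⟨v_i, H_D(w + s·u_i) − H_D(w)⟩ with toy parameters; H_D is only a placeholder map here (an arbitrary matrix), since the cancellation uses nothing beyond the bilinearity of ⟨·,·⟩, and F = F′ = F_q for simplicity.

import random

q, ell, ellp = 11, 6, 4
H = [[random.randrange(q) for _ in range(ell)] for _ in range(ellp)]   # placeholder for H_D
H_D = lambda z: [sum(a * b for a, b in zip(row, z)) % q for row in H]
inner = lambda a, b: sum(x * y for x, y in zip(a, b)) % q

u_i = [random.randrange(q) for _ in range(ell)]
v_i = [random.randrange(q) for _ in range(ellp)]
w = [random.randrange(q) for _ in range(ell)]
r = [random.randrange(q) for _ in range(ellp)]
s = random.randint(0, 1)

m1B = [(a + s * b) % q for a, b in zip(w, u_i)]          # m^1_B = w + s*u_i
m2B = inner(v_i, r)                                      # m^2_B = <v_i, r>
mA = [(a + b) % q for a, b in zip(H_D(w), r)]            # m_A = H_D(w) + r

referee = (inner(v_i, H_D(m1B)) - inner(v_i, mA) + m2B) % q
target = inner(v_i, [(a - b) % q for a, b in zip(H_D(m1B), H_D(w))])
assert referee == target    # equals <v_i, H_D(w + s*u_i) - H_D(w)>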

4.2

Decomposable PIR Encoding from Decomposable Matching Vectors

The two ingredients used to construct multi-party CDS, namely PIR encodings and the special-purpose PSM protocol, are connected by the property of decomposability.

Definition 13 (k-decomposability). Let N = B^k be a k-th power. A family of vectors (u_i)_{i=1}^N is k-decomposable if there exist vectors (u_{1,i})_{i=1}^B, . . . , (u_{k,i})_{i=1}^B such that, under the natural mapping i ↦ (i_1, . . . , i_k) ∈ [B]^k,

u_i = u_{1,i_1} ◦ · · · ◦ u_{k,i_k}

for all i ∈ [N]. Here, ◦ denotes component-wise multiplication over the appropriate field or ring.

A k-decomposable matching vector family is a matching vector family (u_i)_{i∈[N]}, (v_i)_{i∈[N]} such that both (u_i)_{i∈[N]} and (v_i)_{i∈[N]} are k-decomposable. A k-decomposable PIR encoding family is a PIR encoding (u_i)_{i∈[N]}, (v_i)_{i∈[N]} such that both (u_i)_{i∈[N]} and (v_i)_{i∈[N]} are k-decomposable.
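To make Definition 13 concrete, the following minimal Python sketch (toy parameters and arbitrary hypothetical block vectors, not the actual matching-vector construction) builds a k-decomposable family by taking component-wise products of per-block vectors under the natural index mapping i ↦ (i_1, . . . , i_k).

from itertools import product

p, B, k, ell = 5, 3, 2, 4                    # toy modulus, base, arity, vector length

# hypothetical per-block vectors u_{t,i} for t = 1..k and i in [B]
blocks = [[[(7 * t + 2 * i + j) % p for j in range(ell)] for i in range(B)]
          for t in range(k)]

def compose(idx):
    """u_i = u_{1,i_1} o ... o u_{k,i_k}: component-wise product mod p."""
    vec = [1] * ell
    for t, i_t in enumerate(idx):
        vec = [(a * b) % p for a, b in zip(vec, blocks[t][i_t])]
    return vec

# the family (u_i)_{i in [B^k]} defined this way is k-decomposable by construction
family = {idx: compose(idx) for idx in product(range(B), repeat=k)}
print(family[(0, 2)])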


Decomposability ensures that the functionality F_aux defined in (4.1) is sufficiently simple that there exists an efficient PSM for it. Put together, the two ingredients fulfill the assumptions of Theorem 4.1 and give us a communication-efficient multi-party CDS protocol. Note that k-decomposability is an overkill for ensuring an efficient PSM protocol for F_aux.

In this section, we construct a k-decomposable (N, cc)-PIR encoding with length cc = 2^{O(√(log N)·log log N)}. Our result also allows a trade-off between the vector lengths: we construct a k-decomposable (N, cc, cc′)-PIR encoding with cc = 2^{O((log N)^α · log log N)}, cc′ = 2^{O((log N)^{1−α} · log log N)} for any constant α ∈ (0, 1). In order to construct a k-decomposable (N, cc, cc′)-PIR encoding, it suffices to construct a k-decomposable (N, cc − 1, cc′/log 3 − 1)-matching vector family, because Lemma 3.4 has shown how any matching vector family can be converted into a PIR encoding by appending "1" to every vector, and this operation preserves decomposability.

Lemma 4.2 (decomposable matching vectors family). For integers N and k ≤ log N, and real number 0 < α < 1, there exists a k-decomposable (N, ℓ, ℓ′)-matching vectors family where ℓ = 2^{O((log N)^α · log log N)}, ℓ′ = 2^{O((log N)^{1−α} · log log N)}.

Corollary 4.3 (decomposable PIR encoding). For integers N and k ≤ log N, and real number 0 < α < 1, there exists a k-decomposable (N, cc, cc′)-PIR encoding where cc = 2^{O((log N)^α · log log N)}, cc′ = 2^{O((log N)^{1−α} · log log N)}.

As a result of independent interest, this PIR encoding implies a PIR protocol with unbalanced communication complexity.

Corollary 4.4 (PIR with unbalanced communication). For any real number 0 < α < 1, there exists a 2-server information-theoretic PIR protocol where the client's queries are of length 2^{O((log N)^α · log log N)} and the servers' answers are of length 2^{O((log N)^{1−α} · log log N)}.

In order to prove Lemma 4.2, we tweak the previous matching vector families [BBR94, Gro00], enabling the new properties of decomposability and a length trade-off at the cost of slightly increasing the vector length.

Lemma 4.5. For any prime p and integers n, r, d, there exists a degree-(p^d − 1) multivariate polynomial f : F_p^n → F_p such that for any x ∈ {0, 1}^n

f(x) = 0 if ‖x‖_1 ≡ r (mod p^d), and f(x) = 1 otherwise.

Proof. The target polynomial can be constructed using symmetric polynomials. For any integer m, the degree-(p^d − 1) homogeneous symmetric polynomial sym : F_p^m → F_p is defined as

sym(x) := Σ_{S ⊆ [m], |S| = p^d − 1} Π_{i∈S} x[i].

Then for any x ∈ {0, 1}^m,

sym(x) = (‖x‖_1 choose p^d − 1) mod p = 1 if ‖x‖_1 ≡ −1 (mod p^d), and 0 otherwise,

where the second equality follows from Lucas' theorem.

(35)

Then the target polynomial f can be built on top of sym: let

f(x) := 1 − sym(x, 1, . . . , 1),

where the number of appended 1's is p^d − 1 − r. Indeed, sym(x, 1, . . . , 1) = 1 if and only if ‖x‖_1 + (p^d − 1 − r) ≡ −1 (mod p^d), i.e., if and only if ‖x‖_1 ≡ r (mod p^d).

Lemma 4.6. For any prime p, integers d, r and N = 2^k, there exists a k-decomposable vector family u_1, . . . , u_N ∈ F_p^ℓ such that ℓ ≤ 2 · (2k + 1)^{p^d − 1} and, for any i, j ∈ [N] (viewing i, j as bit strings in {0, 1}^k),

⟨u_i, u_j⟩ = 0 if the number of bit positions on which i and j agree is ≡ r (mod p^d), and 1 otherwise.

Proof. For any index i = (i_1, . . . , i_k) ∈ {0, 1}^k, define the vector x_i := (i_1, 1 − i_1, . . . , i_k, 1 − i_k). Then for any indexes i, j, the Hamming weight ‖x_i ◦ x_j‖_1 equals the number of bit positions on which i and j agree.

Let f be the degree-(p^d − 1) polynomial from Lemma 4.5. Then

f(x_i ◦ x_j) = 0 if the number of bit positions on which i and j agree is ≡ r (mod p^d), and 1 otherwise.

Considering all monomials of f, there exist sets S_1, . . . , S_ℓ ⊆ [2k] and coefficients c_1, . . . , c_ℓ ∈ F_p such that

f(y) = Σ_{t=1}^{ℓ} c_t^2 · Π_{j∈S_t} y[j].        (4.2)

We also have ℓ ≤ 2 · (2k + 1)^{p^d − 1}: as the degree of f is no more than p^d − 1, there are at most (2k + 1)^{p^d − 1} monomials, and if the coefficient of a monomial is not a quadratic residue in F_p, we split it into the sum of two quadratic residues.¹

Then we can define the decomposable vector family as

u_i[t] := c_t · Π_{j∈S_t} x_i[j].

Then ⟨u_i, u_j⟩ = f(x_i ◦ x_j), which equals 0 if the number of bit positions on which i and j agree is ≡ r (mod p^d), and 1 otherwise.

The decomposability is straightforward: one feasible decomposition is defined by

u_{s,i_s}[t] = 0 if (2s − 1 ∈ S_t and i_s = 0) or (2s ∈ S_t and i_s = 1), and 1 otherwise,

for s > 1; and u_{1,i_1}[t] is defined similarly, except that it also carries the multiplicative factor c_t.

¹ For this thesis, the prime p is either 2 or 3. Thus the only quadratic non-residue is 2 ∈ F_3, which equals 1 + 1, the sum of two quadratic residues. The proof that every quadratic non-residue is the sum of two quadratic residues in a general F_p is omitted.
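The key identity in this proof, ⟨u_i, u_j⟩ = f(x_i ◦ x_j), holds for any polynomial written in the form (4.2). The following minimal Python sketch (toy parameters and a random hypothetical list of monomials (S_t, c_t), rather than the actual f) checks it numerically.

import random
from math import prod

p, k = 3, 4

def x(idx):                                   # x_i = (i_1, 1-i_1, ..., i_k, 1-i_k)
    return [b for i in idx for b in (i, 1 - i)]

# hypothetical monomial list (S_t, c_t); in the proof these come from expanding f
monomials = [(random.sample(range(2 * k), random.randint(1, 3)),
              random.randrange(1, p)) for _ in range(10)]

def u(idx):                                   # u_i[t] = c_t * prod_{j in S_t} x_i[j]
    xi = x(idx)
    return [(c * prod(xi[j] for j in S)) % p for S, c in monomials]

def f(y):                                     # f(y) = sum_t c_t^2 * prod_{j in S_t} y[j]
    return sum(c * c * prod(y[j] for j in S) for S, c in monomials) % p

for _ in range(20):
    i = [random.randint(0, 1) for _ in range(k)]
    j = [random.randint(0, 1) for _ in range(k)]
    lhs = sum(a * b for a, b in zip(u(i), u(j))) % p
    rhs = f([a * b for a, b in zip(x(i), x(j))])
    assert lhs == rhs                         # <u_i, u_j> = f(x_i o x_j)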


Now we are ready to prove Lemma 4.2 by applying Lemma 4.6 twice. It suffices to consider the case N = 2^k. Indeed, we may assume N = 2^{bk} is a power of 2^k: otherwise we replace N by the smallest power of 2^k greater than N, which is at most N². The case we consider then gives a bk-decomposable matching vector family of the desired length, and any vector family that is bk-decomposable is also k-decomposable.

Proof of Lemma 4.2. W.l.o.g. assume N = 2^k. Let 2^d be the smallest power of 2 greater than k^α, and 3^{d′} the smallest power of 3 greater than k^{1−α}.

By Lemma 4.6, there exists a k-decomposable vector family u_1, . . . , u_N ∈ F_2^ℓ such that

⟨u_i, u_j⟩ = 0 if the number of bit positions on which i and j agree is ≡ k (mod 2^d), and 1 otherwise,

and ℓ ≤ 2 · (2k + 1)^{2^d − 1} = 2^{O((log N)^α log log N)}. Also by Lemma 4.6, there exists a k-decomposable vector family v_1, . . . , v_N ∈ F_3^{ℓ′} such that

⟨v_i, v_j⟩ = 0 if the number of bit positions on which i and j agree is ≡ k (mod 3^{d′}), and 1 otherwise,

and ℓ′ ≤ 2 · (2k + 1)^{3^{d′} − 1} = 2^{O((log N)^{1−α} log log N)}.

Finally, (u_i)_{i∈[N]}, (v_i)_{i∈[N]} form a matching vector family, because

⟨u_i, u_j⟩ = 0 ∧ ⟨v_i, v_j⟩ = 0 ⟺ the number of bit positions on which i and j agree is ≡ k (mod 2^d · 3^{d′})
⟺ the number of bit positions on which i and j agree equals k ⟺ i = j.

The second equivalence holds because the number of agreeing positions lies in {0, . . . , k} and 2^d · 3^{d′} > k^α · k^{1−α} = k.
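As a small numeric illustration of this proof (not from the thesis), take N = 2^16, i.e. k = 16, and α = 1/2. The following Python sketch computes the moduli 2^d, 3^{d′} and the length bounds, and checks exhaustively that no agreement count below k is congruent to k modulo both moduli, since 2^d · 3^{d′} > k.

k, alpha = 16, 0.5                            # toy parameters: N = 2^16

two_d = 1
while two_d <= k**alpha:                      # smallest power of 2 greater than k^alpha
    two_d *= 2
three_d = 1
while three_d <= k**(1 - alpha):              # smallest power of 3 greater than k^(1-alpha)
    three_d *= 3

ell_bound = 2 * (2 * k + 1) ** (two_d - 1)    # bound on l  from Lemma 4.6 (p = 2)
ellp_bound = 2 * (2 * k + 1) ** (three_d - 1) # bound on l' from Lemma 4.6 (p = 3)
print(two_d, three_d, ell_bound, ellp_bound)  # 8, 9, 2*33^7, 2*33^8

assert two_d * three_d > k
for agree in range(k):                        # i != j agree on at most k-1 positions
    assert not (agree % two_d == k % two_d and agree % three_d == k % three_d)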

4.3

A Special-Purpose PSM Protocol

Given the decomposable PIR encoding from Section 4.2, the final piece required to instantiate the framework in Section 4.1 is a special-purpose PSM protocol for the functionality described in Equation (4.1).

Lemma 4.7. For fields F, F′, integers N, ℓ, ℓ′ and k ≤ log N, if the vector families u_1, . . . , u_N ∈ F^ℓ and v_1, . . . , v_N ∈ F′^{ℓ′} are k-decomposable, then there is a PSM for the functionality

F_aux : [N^{1/k}] × · · · × [N^{1/k}] × ({0, 1} × F^ℓ × F′^{ℓ′}) → F^ℓ × F′
F_aux(i_1, . . . , i_k; (s, w, r)) ↦ (s·u_i + w, ⟨v_i, r⟩)        (4.1)

with communication complexity O(k²·ℓ·log|F| + k²·ℓ′·log|F′|) per party.

In order to construct an efficient PSM protocol for this specialized functionality, we (a) show that each output coordinate of the functionality can be computed by a small branching program, and (b) use the fact that there are efficient PSM protocols for branching programs over rings, in which the communication is polynomial in the size of the branching program.


Lemma 4.8 ([IK02, CFIK03]). For any finite ring R and any size-m branching program over R, there is a multi-party PSM protocol for this branching program in which every party sends O(m²) ring elements.

Proof of Lemma 4.7. Let (u_{1,i})_{i=1}^{N^{1/k}}, . . . , (u_{k,i})_{i=1}^{N^{1/k}} ⊆ F^ℓ be the decomposition of (u_i)_{i=1}^N; then under the natural mapping i ↦ (i_1, . . . , i_k) ∈ [N^{1/k}]^k, we have

u_i = u_{1,i_1} ◦ · · · ◦ u_{k,i_k}

for all i ∈ [N].

For every 1 ≤ j ≤ ℓ, the j-th coordinate of s·u_i + w can be written as

(s·u_i + w)[j] = s·u_{1,i_1}[j]·u_{2,i_2}[j] · · · u_{k,i_k}[j] + w[j].

Since the t-th party (1 ≤ t ≤ k) can compute u_{t,i_t} locally, the j-th coordinate of s·u_i + w can be computed by a size-O(k) branching program over F. By Lemma 4.8, there exists a PSM protocol for (s·u_i + w)[j] using O(k² · log|F|) communication per party. Therefore, there is a PSM protocol for

F_aux^{(1)}(i_1, . . . , i_k; s, w) := s·u_i + w ∈ F^ℓ,

with O(k²·ℓ·log|F|) communication per party.
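As an aside (not from the thesis), one way to see why each such coordinate is a size-O(k) iterated product is to write s·u_{1,i_1}[j] · · · u_{k,i_k}[j] + w[j] as a product of k + 1 small matrices over a toy field F_q, one factor per party; the following minimal Python sketch checks this.

import random

q, k = 7, 5
a = [random.randrange(q) for _ in range(k)]   # a_t = u_{t,i_t}[j], known locally to party t
s, w_j = random.randint(0, 1), random.randrange(q)

def matmul(A, B):
    return [[sum(A[r][m] * B[m][c] for m in range(2)) % q for c in range(2)]
            for r in range(2)]

M = [[s, w_j], [0, 1]]                        # encodes "s * ( ... ) + w[j]"
for a_t in a:
    M = matmul(M, [[a_t, 0], [0, 1]])         # multiply in one factor per party

expected = s
for a_t in a:
    expected = (expected * a_t) % q
expected = (expected + w_j) % q

assert (M[0][0] + M[0][1]) % q == expected    # top row sums to s*prod(a_t) + w[j]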

Let (v_{1,i})_{i=1}^{N^{1/k}}, . . . , (v_{k,i})_{i=1}^{N^{1/k}} ⊆ F′^{ℓ′} be the decomposition of (v_i)_{i=1}^N. Again, for all 1 ≤ t ≤ k, the t-th party can compute v_{t,i_t} locally. The functionality

F_aux^{(2)}(i_1, . . . , i_k; r) := ⟨v_i, r⟩ ∈ F′

can be written as

⟨v_i, r⟩ = Σ_{j=1}^{ℓ′} v_{1,i_1}[j]·v_{2,i_2}[j] · · · v_{k,i_k}[j]·r[j],

which is a branching program of size O(ℓ′·k). By Lemma 4.8, there is a PSM protocol for ⟨v_i, r⟩ with O(k²·ℓ′²·log|F′|) communication per party. We'd like to point out that, although we are going to optimize the communication to O(k²·ℓ′·log|F′|), this PSM protocol is already sufficiently strong for the main results of this thesis.

To improve the communication, sample a random vector z ∈ F′^{ℓ′} from the CRS such that Σ_{j=1}^{ℓ′} z[j] = 0. Then v_i ◦ r + z is a randomized encoding of ⟨v_i, r⟩, in the sense that v_i ◦ r + z reveals ⟨v_i, r⟩ and nothing else. The j-th coordinate of v_i ◦ r + z can be written as

(v_i ◦ r + z)[j] = v_{1,i_1}[j]·v_{2,i_2}[j] · · · v_{k,i_k}[j]·r[j] + z[j],

which again is a branching program with O(k) nodes. By Lemma 4.8, there exists a PSM protocol evaluating it with O(k²·log|F′|) communication per party. Clearly, once the referee gets all the coordinates of v_i ◦ r + z, he can add them up to obtain ⟨v_i, r⟩. Thus there is a PSM for F_aux^{(2)} with O(k²·ℓ′·log|F′|) communication per party, and combining it with the protocol for F_aux^{(1)} yields the claimed PSM for F_aux.
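The compression step can be illustrated directly. The following minimal Python sketch (toy prime field, not from the thesis) masks v_i ◦ r with a random vector whose coordinates sum to zero; summing the coordinates of the encoding recovers ⟨v_i, r⟩, while the encoding reveals nothing beyond that sum.

import random

q, ellp = 7, 5
v_i = [random.randrange(q) for _ in range(ellp)]
r = [random.randrange(q) for _ in range(ellp)]

z = [random.randrange(q) for _ in range(ellp - 1)]
z.append((-sum(z)) % q)                        # mask with coordinates summing to 0

encoding = [(a * b + c) % q for a, b, c in zip(v_i, r, z)]

# decoding: the masks cancel, leaving <v_i, r>
assert sum(encoding) % q == sum(a * b for a, b in zip(v_i, r)) % q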

