
Information Flow in Interactive Systems*

Mário S. Alvim¹, Miguel E. Andrés², and Catuscia Palamidessi¹

¹ INRIA and LIX, École Polytechnique, Palaiseau, France
² Institute for Computing and Information Sciences, The Netherlands

Abstract. We consider the problem of defining the information leakage in interactive systems where secrets and observables can alternate during the computation. We show that the information-theoretic approach which interprets such systems as (simple) noisy channels is not valid anymore. However, the principle can be recovered if we consider more complicated types of channels, which in Information Theory are known as channels with memory and feedback. We show that there is a complete correspondence between interactive systems and this kind of channel. Furthermore, we show that the capacity of the channels associated to such systems is a continuous function of the Kantorovich metric.

1 Introduction

Information leakage refers to the problem that the observable parts of the behavior of a system may reveal information that we would like to keep secret. In recent years, there has been a growing interest in the quantitative aspects of this problem, partly because it is convenient to represent the partial knowledge of the secrets as a probability distribution, and partly because the mechanisms to protect the information may use randomization to obfuscate the relation between the secrets and the observables.

Among the quantitative approaches, some of the most popular ones are based on Information Theory [5, 12, 4, 16]. The system is interpreted as an information-theoretic channel, where the secrets are the input and the observables are the output. The channel matrix is constituted by the conditional probabilities $p(b|a)$, defined as the measure of the executions that give observable $b$ within those which contain the secret $a$. The leakage is represented by the mutual information, and the worst-case leakage by the capacity of the channel.

In the above works, the secret value is assumed to be chosen at the beginning of the computation. In this paper, we are interested in interactive systems, i.e. systems in which secrets and observables can alternate during the computation and influence each other. Examples of interactive protocols include auction protocols like [21, 18, 17]. Some of these have become very popular thanks to their integration in Internet-based electronic commerce platforms [9, 10, 14]. As for interactive programs, examples include web servers, GUI applications, and command-line programs [3].

We investigate the applicability of the information-theoretic approach to interactive systems. In [8] it was proposed to define the matrix elements $p(b|a)$ as the measure of the traces with (secret, observable)-projection $(a, b)$, divided by the measure of the traces with secret projection $a$. This follows the definition of conditional probability in terms of joint and marginal probability. However, it does not define an information-theoretic channel. In fact, by definition a channel should be invariant with respect to the input distribution, and such a construction is not, as shown by the following example.

* This work has been partially supported by the project ANR-09-BLAN-0345 CPP and by the INRIA DRI Équipe Associée PRINTEMPS.

Example 1. Figure 1 represents a web-based interaction between one seller and two possible buyers, rich and poor. The seller offers two different products, cheap and expensive, with given probabilities. Once the product is offered, each buyer may try to buy the product, with a certain probability. For simplicity we assume that the buyers' offers are exclusive. We assume that the offers are observable, in the sense that they are made public on the website, while the identity of the buyer that actually buys the product should be secret to an external observer. The symbols $r, s, t, \bar r, \bar s, \bar t$ represent the probabilities, with the convention that $\bar r = 1 - r$ (and similarly for $\bar s$ and $\bar t$).

[Figure 1. Interactive system of Example 1: the seller offers cheap with probability $r$ and expensive with probability $\bar r$; after cheap, poor buys with probability $s$ and rich with probability $\bar s$; after expensive, poor buys with probability $t$ and rich with probability $\bar t$.]

Following [8] we can compute the conditional probabilities as $p(b|a) = p(a,b)/p(a)$, thus obtaining the matrix in Table 1.

However, the matrix is not invariant with respect to the input distribution. For instance, if we fix $r = \bar r = 0.5$ and consider two different input distributions, obtained by varying the values of $(s, t)$, we get two different matrices of conditional probabilities, which are represented in Table 2. Hence when the secrets occur after the observables we cannot consider the conditional probabilities as representing a (classical) channel, and we cannot apply the standard information-theoretic concepts. In particular, we cannot adopt the (classical) capacity to represent the worst-case leakage, since the capacity is defined by maximizing over all possible input distributions while keeping the channel matrix fixed.

Table 1. Conditional probabilities of Example 1:

          cheap                                  expensive
  poor    $rs/(rs + \bar r t)$                   $\bar r t/(rs + \bar r t)$
  rich    $r\bar s/(r\bar s + \bar r \bar t)$    $\bar r \bar t/(r\bar s + \bar r \bar t)$

The first contribution of this paper is to consider an extension of the theory of channels which makes the information-theoretic approach applicable also to the case of interactive systems. It turns out that a richer notion of channel, known in Information Theory as channels with memory and feedback, serves our purposes. The dependence of inputs on previous outputs corresponds to feedback, and the dependence of outputs on previous inputs and outputs corresponds to memory.

Table 2. Two different channel matrices induced by two different input distributions:

(a) $r = 1/2$, $s = 2/5$, $t = 3/5$:

          cheap    expensive    input distribution
  poor    2/5      3/5          $p(\mathit{poor}) = 1/2$
  rich    3/5      2/5          $p(\mathit{rich}) = 1/2$

(b) $r = 1/2$, $s = 1/10$, $t = 3/10$:

          cheap    expensive    input distribution
  poor    1/4      3/4          $p(\mathit{poor}) = 1/5$
  rich    9/16     7/16         $p(\mathit{rich}) = 4/5$
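To see the dependence concretely, the following sketch (illustrative only; the function name and the use of exact fractions are our own) recomputes the matrices of Table 2 from the joint probabilities read off Figure 1.

```python
from fractions import Fraction as F

def conditional_matrix(r, s, t):
    """Example 1: the seller offers 'cheap' with probability r and 'expensive'
    with probability 1-r; after 'cheap' the buyer is 'poor' with probability s,
    after 'expensive' with probability t (otherwise 'rich').
    Returns p(b|a) = p(a,b)/p(a) for a in {poor, rich}, b in {cheap, expensive}."""
    joint = {
        ('poor', 'cheap'):     r * s,
        ('rich', 'cheap'):     r * (1 - s),
        ('poor', 'expensive'): (1 - r) * t,
        ('rich', 'expensive'): (1 - r) * (1 - t),
    }
    marginal = {a: joint[(a, 'cheap')] + joint[(a, 'expensive')]
                for a in ('poor', 'rich')}
    return {ab: joint[ab] / marginal[ab[0]] for ab in joint}

# Same r, two different choices of (s, t): the "channel matrix" changes,
# so it cannot be a classical, input-independent channel.
print(conditional_matrix(F(1, 2), F(2, 5), F(3, 5)))    # matrix of Table 2(a)
print(conditional_matrix(F(1, 2), F(1, 10), F(3, 10)))  # matrix of Table 2(b)
```

Running it with the two parameter choices of Table 2 yields two different matrices, confirming that the construction is not input-independent.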


A second contribution of our work is the proof that the channel capacity is a continuous function of the Kantorovich metric on interactive systems. This was pointed out also in [8]; however, their construction does not work in our case because (as far as we understand) it assumes that the probability of a secret action, at any point of the computation, is not 0. This assumption is not guaranteed in our case and therefore we had to proceed differently.

A more complete version of this paper (with proofs) is available online [1].

2 Preliminaries

2.1 Concepts from Information Theory

For more detailed information on this part we refer to [6]. Let $A, B$ denote two random variables with corresponding probability distributions $p_A(\cdot), p_B(\cdot)$, respectively. We shall omit the subscripts when they are clear from the context. Let $\mathcal{A} = \{a_1, \ldots, a_n\}$, $\mathcal{B} = \{b_1, \ldots, b_m\}$ denote, respectively, the sets of possible values for $A$ and for $B$.

The entropy of $A$ is defined as $H(A) = -\sum_{a \in \mathcal{A}} p(a) \log p(a)$ and it measures the uncertainty of $A$. It takes its minimum value $H(A) = 0$ when $p_A(\cdot)$ is a Dirac delta. The maximum value $H(A) = \log |\mathcal{A}|$ is obtained when $p_A(\cdot)$ is the uniform distribution. Usually the base of the logarithm is set to be 2 and the entropy is measured in bits. The conditional entropy of $A$ given $B$ is $H(A|B) = -\sum_{b \in \mathcal{B}} p(b) \sum_{a \in \mathcal{A}} p(a|b) \log p(a|b)$, and it measures the uncertainty of $A$ when $B$ is known. We can prove that $0 \le H(A|B) \le H(A)$. The minimum value, 0, is obtained when $A$ is completely determined by $B$. The maximum value $H(A)$ is obtained when $A$ and $B$ are independent.

The mutual information between $A$ and $B$ is defined as $I(A;B) = H(A) - H(A|B)$, and it measures the amount of information about $A$ that we gain by observing $B$. It can be shown that $I(A;B) = I(B;A)$ and $0 \le I(A;B) \le H(A)$.

The entropy and mutual information respect the chain laws. Namely, given a sequence of random variables $A_1, A_2, \ldots, A_k$ and $B$, we have:

$$H(A_1, A_2, \ldots, A_k) = \sum_{i=1}^{k} H(A_i \mid A_1, \ldots, A_{i-1}) \qquad (1)$$

$$I(A_1, A_2, \ldots, A_k; B) = \sum_{i=1}^{k} I(A_i; B \mid A_1, \ldots, A_{i-1}) \qquad (2)$$
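As a quick sanity check on these definitions (a self-contained sketch with invented numbers, not taken from the paper), the following helpers compute entropies and mutual information from a finite joint distribution and verify the chain rule (2) for $k = 2$:

```python
from math import log2
from itertools import product

def marginal(p, idx):
    """Marginal of the joint distribution p (dict: outcome tuple -> probability)
    over the coordinates listed in idx."""
    m = {}
    for outcome, pr in p.items():
        key = tuple(outcome[i] for i in idx)
        m[key] = m.get(key, 0.0) + pr
    return m

def entropy(p):
    return -sum(pr * log2(pr) for pr in p.values() if pr > 0)

def mi(p, a, b):
    """I(A;B) = H(A) + H(B) - H(A,B), computed from the joint p."""
    return entropy(marginal(p, a)) + entropy(marginal(p, b)) - entropy(marginal(p, a + b))

def cmi(p, a, b, c):
    """I(A;B|C) = H(A,C) + H(B,C) - H(C) - H(A,B,C)."""
    H = lambda idx: entropy(marginal(p, idx))
    return H(a + c) + H(b + c) - H(c) - H(a + b + c)

# Invented joint distribution over (A1, A2, B): A1, A2 uniform and independent,
# B copies A1 with probability 3/4.
p = {(a1, a2, b): (3/16 if b == a1 else 1/16)
     for a1, a2, b in product((0, 1), repeat=3)}

lhs = mi(p, (0, 1), (2,))                            # I(A1, A2 ; B)
rhs = mi(p, (0,), (2,)) + cmi(p, (1,), (2,), (0,))   # I(A1; B) + I(A2; B | A1)
assert abs(lhs - rhs) < 1e-12                        # chain rule (2) with k = 2
```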

A (discrete memoryless) channel is a tuple $(\mathcal{A}, \mathcal{B}, p(\cdot|\cdot))$, where $\mathcal{A}, \mathcal{B}$ are the sets of input and output symbols, respectively, and $p(b_j \mid a_i)$ is the probability of observing the output symbol $b_j$ when the input symbol is $a_i$. An input distribution $p(a_i)$ over $\mathcal{A}$ determines, together with the channel, the joint distribution $p(a_i, b_j) = p(b_j \mid a_i) \cdot p(a_i)$ and consequently $I(A;B)$. The maximum of $I(A;B)$ over all possible input distributions is the channel's capacity. Shannon's famous result states that the capacity coincides with the maximum rate at which information can be transmitted using the channel.
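For a discrete memoryless channel the capacity can be approximated numerically with the standard Blahut-Arimoto iteration; a minimal sketch (ours, not part of the paper, which is concerned with the feedback case) is:

```python
import numpy as np

def blahut_arimoto(W, iterations=200):
    """Capacity (in bits) of a discrete memoryless channel with row-stochastic
    matrix W[a, b] = p(b|a), via the Blahut-Arimoto fixed-point iteration."""
    def divergence_rows(p):
        q = np.broadcast_to(p @ W, W.shape)       # output distribution q(b), one row per input
        logs = np.zeros_like(W)
        mask = W > 0                              # 0 log 0 = 0 convention
        logs[mask] = np.log2(W[mask] / q[mask])
        return np.sum(W * logs, axis=1)           # D(a) = sum_b p(b|a) log2(p(b|a)/q(b))

    p = np.full(W.shape[0], 1.0 / W.shape[0])     # start from the uniform input distribution
    for _ in range(iterations):
        p = p * np.exp2(divergence_rows(p))
        p /= p.sum()
    return float(np.sum(p * divergence_rows(p)))  # = I(A;B) at the returned input distribution

# Binary symmetric channel with crossover 0.1: capacity 1 - H(0.1), about 0.531 bits.
bsc = np.array([[0.9, 0.1],
                [0.1, 0.9]])
print(round(blahut_arimoto(bsc), 3))
```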

In this paper we consider input and output sequences instead of just symbols.


Convention 1. Let $\mathcal{A} = \{a_1, \ldots, a_n\}$ be a finite set of $n$ different symbols (alphabet). When we have a sequence of symbols (ordered in time), we use a Greek letter with a subscript, $\alpha_t$, to denote the symbol at time $t$. The notation $\alpha^t$ stands for the sequence $\alpha_1 \alpha_2 \ldots \alpha_t$. For instance, in the sequence $a_3 a_7 a_5$, we have $\alpha_2 = a_7$ and $\alpha^2 = a_3 a_7$.

Convention 2. Let $X$ be a random variable. $X^t$ denotes the sequence of $t$ consecutive occurrences $X_1, \ldots, X_t$ of the random variable $X$.

When the channel is used repeatedly, the discrete memoryless channel described above represents the case in which the behavior of the channel at the present time does not depend upon the past history of inputs and outputs. If this assumption does not hold, then we have a channel with memory. Furthermore, if the outputs from the channel can be fed back to the encoder, thus influencing the generation of the next input symbol, then the channel is said to be with feedback; otherwise it is without feedback.

Equation (3) below makes explicit the probabilistic behavior of channels with respect to these classifications. Consider a general channel from $\mathcal{A}$ to $\mathcal{B}$ with associated random variables $A$ for the input and $B$ for the output. Using the notation introduced in Convention 1, the channel behavior after $T$ uses can be fully described by the joint probability $p(\alpha^T, \beta^T)$.

Using probability laws we derive:

$$p(\alpha^T, \beta^T) = \prod_{t=1}^{T} p(\alpha_t \mid \alpha^{t-1}, \beta^{t-1})\, p(\beta_t \mid \alpha^{t}, \beta^{t-1}) \qquad \text{(by the expansion law)} \qquad (3)$$

The first term $p(\alpha_t \mid \alpha^{t-1}, \beta^{t-1})$ indicates that the probability of $\alpha_t$ depends not only on $\alpha^{t-1}$, but also on $\beta^{t-1}$ (feedback). The second term $p(\beta_t \mid \alpha^{t}, \beta^{t-1})$ indicates that the probability of each $\beta_t$ depends on the previous history of inputs $\alpha^t$ and outputs $\beta^{t-1}$ (memory).

If the channel is without feedback, then we have that $p(\alpha_t \mid \alpha^{t-1}, \beta^{t-1}) = p(\alpha_t \mid \alpha^{t-1})$, and if the channel is without memory, then we have also $p(\beta_t \mid \alpha^{t}, \beta^{t-1}) = p(\beta_t \mid \alpha_t)$. From these we derive $p(\beta^T \mid \alpha^T) = \prod_{t=1}^{T} p(\beta_t \mid \alpha_t)$, which is the classic equation for discrete memoryless channels without feedback.
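Equation (3) can be read as a recipe for building the joint distribution of input and output histories from the two families of kernels. The sketch below (with hypothetical kernels, purely for illustration) unrolls the product for binary alphabets and $T = 2$ and checks that the result is a probability distribution:

```python
from itertools import product

A, B = ('a1', 'a2'), ('b1', 'b2')
T = 2

def p_input(a, a_hist, b_hist):
    """Hypothetical p(alpha_t | alpha^{t-1}, beta^{t-1}): with feedback, deterministically
    echo the index of the last observed output; uniform at the first step."""
    if b_hist:
        return 1.0 if A.index(a) == B.index(b_hist[-1]) else 0.0
    return 0.5

def p_output(b, a_hist, b_hist):
    """Hypothetical p(beta_t | alpha^t, beta^{t-1}): the output copies the current
    input with probability 0.8, regardless of the earlier history."""
    return 0.8 if B.index(b) == A.index(a_hist[-1]) else 0.2

joint = {}
for history in product(A, B, A, B):              # history = (alpha_1, beta_1, alpha_2, beta_2)
    pr, a_hist, b_hist = 1.0, (), ()
    for t in range(T):
        a, b = history[2 * t], history[2 * t + 1]
        pr *= p_input(a, a_hist, b_hist)         # feedback term p(alpha_t | alpha^{t-1}, beta^{t-1})
        a_hist += (a,)
        pr *= p_output(b, a_hist, b_hist)        # memory term   p(beta_t  | alpha^t,    beta^{t-1})
        b_hist += (b,)
    joint[history] = pr

# equation (3) indeed yields a probability distribution over the histories
assert abs(sum(joint.values()) - 1.0) < 1e-12
```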

Let $(\mathcal{V}, \mathcal{K})$ be a Borel space and let $(\mathcal{X}, \mathcal{B}_{\mathcal{X}})$ and $(\mathcal{Y}, \mathcal{B}_{\mathcal{Y}})$ be Polish spaces equipped with their Borel $\sigma$-algebras. Let $\rho(dx \mid v)$ be a family of measures on $\mathcal{X}$ given $\mathcal{V}$. Then $\rho(dx \mid v)$ is a stochastic kernel if and only if $\rho(\cdot \mid v)$ is a random variable from $\mathcal{V}$ into $\mathcal{P}(\mathcal{X})$, the space of probability measures on $\mathcal{X}$.

2.2 Probabilistic automata

A function $\mu : S \to [0,1]$ is a discrete probability distribution on a countable set $S$ if $\sum_{s \in S} \mu(s) = 1$ and $\mu(s) \ge 0$ for all $s$. The set of all discrete probability distributions on $S$ is $\mathcal{D}(S)$.

A probabilistic automaton [15] is a quadruple $M = (S, \mathcal{L}, \hat s, \vartheta)$ where $S$ is a countable set of states, $\mathcal{L}$ a finite set of labels or actions, $\hat s$ the initial state, and $\vartheta$ a transition function $\vartheta : S \to \wp_f(\mathcal{D}(\mathcal{L} \times S))$. Here $\wp_f(X)$ is the set of all finite subsets of $X$. If $\vartheta(s) = \emptyset$ then $s$ is a terminal state. We write $s \to \mu$ for $\mu \in \vartheta(s)$, $s \in S$.


Moreover, we write $s \xrightarrow{\ell} r$ for $s, r \in S$ whenever $s \to \mu$ and $\mu(\ell, r) > 0$. A fully probabilistic automaton is a probabilistic automaton satisfying $|\vartheta(s)| \le 1$ for all states. When $\vartheta(s) \neq \emptyset$ we overload the notation and denote by $\vartheta(s)$ the distribution outgoing from $s$.

A path in a probabilistic automaton is a sequence $\sigma = s_0 \xrightarrow{\ell_1} s_1 \xrightarrow{\ell_2} \cdots$ where $s_i \in S$, $\ell_i \in \mathcal{L}$ and $s_i \xrightarrow{\ell_{i+1}} s_{i+1}$. A path can be finite, in which case it ends with a state. A path is complete if it is either infinite or finite ending in a terminal state. Given a finite path $\sigma$, $\mathit{last}(\sigma)$ denotes its last state. Let $\mathit{Paths}_s(M)$ denote the set of all paths, $\mathit{Paths}^\star_s(M)$ the set of all finite paths, and $\mathit{CPaths}_s(M)$ the set of all complete paths of an automaton $M$, starting from the state $s$. We will omit $s$ if $s = \hat s$. Paths are ordered by the prefix relation, which we denote by $\le$. The trace of a path is the sequence of actions, finite or infinite, obtained by removing the states; hence for the above $\sigma$ we have $\mathit{trace}(\sigma) = \ell_1 \ell_2 \ldots$. If $\mathcal{L}' \subseteq \mathcal{L}$, then $\mathit{trace}_{\mathcal{L}'}(\sigma)$ is the projection of $\mathit{trace}(\sigma)$ on the elements of $\mathcal{L}'$.

Let $M = (S, \mathcal{L}, \hat s, \vartheta)$ be a (fully) probabilistic automaton, $s \in S$ a state, and let $\sigma \in \mathit{Paths}^\star_s(M)$ be a finite path starting in $s$. The cone generated by $\sigma$ is the set of complete paths $\langle \sigma \rangle = \{\sigma' \in \mathit{CPaths}_s(M) \mid \sigma \le \sigma'\}$. Given a fully probabilistic automaton $M = (S, \mathcal{L}, \hat s, \vartheta)$ and a state $s$, we can calculate the probability value, denoted by $P_s(\sigma)$, of any finite path $\sigma$ starting in $s$ as follows: $P_s(s) = 1$ and $P_s(\sigma \xrightarrow{\ell} s') = P_s(\sigma)\, \mu(\ell, s')$, where $\mathit{last}(\sigma) \to \mu$.

Let $\Omega_s \triangleq \mathit{CPaths}_s(M)$ be the sample space, and let $\mathcal{F}_s$ be the smallest $\sigma$-algebra generated by the cones. Then $P$ induces a unique probability measure on $\mathcal{F}_s$ (which we will also denote by $P_s$) such that $P_s(\langle \sigma \rangle) = P_s(\sigma)$ for every finite path $\sigma$ starting in $s$. For $s = \hat s$ we write $P$ instead of $P_{\hat s}$.

Given a probability space $(\Omega, \mathcal{F}, P)$ and two events $A, B \in \mathcal{F}$ with $P(B) > 0$, the conditional probability of $A$ given $B$, $P(A \mid B)$, is defined as $P(A \cap B)/P(B)$.
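As a small illustration of these definitions (the automaton below is invented for the purpose and is not taken from the paper), a fully probabilistic automaton can be represented as a map from states to their single outgoing distribution, and $P_s(\sigma)$ computed by multiplying transition probabilities along the path:

```python
# A fully probabilistic automaton: theta maps each state to its single outgoing
# distribution over (label, next state) pairs; terminal states map to None.
theta = {
    's0': {('l1', 's1'): 0.5, ('l2', 's2'): 0.5},
    's1': {('l3', 's3'): 1.0},
    's2': None,   # terminal
    's3': None,   # terminal
}

def path_probability(path):
    """P_s(sigma) for a finite path [s0, (l1, s1), (l2, s2), ...]: the product
    of the probabilities of its transitions (also the measure of the cone <sigma>)."""
    prob, current = 1.0, path[0]
    for label, nxt in path[1:]:
        prob *= theta[current].get((label, nxt), 0.0)
        current = nxt
    return prob

print(path_probability(['s0', ('l1', 's1'), ('l3', 's3')]))  # 0.5
print(path_probability(['s0', ('l2', 's2')]))                # 0.5
```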

3 Discrete channels with memory and feedback

We adopt the model proposed in [19] for discrete channels with memory and feedback.

Such a model, represented in Figure 2, can be decomposed into sequential components as follows. At time $t$ the internal channel's behavior is represented by the conditional probabilities $p(\beta_t \mid \alpha^t, \beta^{t-1})$. The internal channel takes the input $\alpha_t$ and, according to the history of inputs and outputs up to that moment, $\alpha^t, \beta^{t-1}$, produces an output symbol $\beta_t$. The output is then fed back to the encoder with delay one. On the other side, at time $t$ the encoder takes the message and the past output symbols $\beta^{t-1}$, and produces a channel input symbol $\alpha_t$. At the final time $T$ the decoder takes all the channel outputs $\beta^T$ and produces the decoded message $\hat W$. The order is the following:

Message $W$, $\alpha_1, \beta_1, \alpha_2, \beta_2, \ldots, \alpha_T, \beta_T$, Decoded Message $\hat W$

Let us describe such a channel in more detail. Let $\mathcal{A}$ and $\mathcal{B}$ be two finite sets. Let $\{A_t\}_{t=1}^{T}$ (channel's input) and $\{B_t\}_{t=1}^{T}$ (channel's output) be families of random variables in $\mathcal{A}$ and $\mathcal{B}$ respectively. Moreover, let $\mathcal{A}^T$ and $\mathcal{B}^T$ represent their $T$-fold product spaces. A channel is a family of stochastic kernels $\{p(\beta_t \mid \alpha^t, \beta^{t-1})\}_{t=1}^{T}$.

Let $\mathcal{F}_t$ be the set of all measurable maps $\varphi_t : \mathcal{B}^{t-1} \to \mathcal{A}$ endowed with a probability distribution, and let $F_t$ be the corresponding random variable. Let $\mathcal{F}^T$ and $F^T$ denote the Cartesian product on the domain and the corresponding random variable, respectively.


[Figure 2. Model for a discrete channel with memory and feedback: a message $W$ selects the code functions $\varphi^T$; at time $t$ the encoder emits $\alpha_t = \varphi_t(\beta^{t-1})$, the channel $\{p(\beta_t \mid \alpha^t, \beta^{t-1})\}_{t=1}^{T}$ produces $\beta_t$, which is fed back to the encoder with delay one; at time $T+1$ the decoder outputs $\hat W = \gamma(\beta^T)$.]

A channel code function is an element $\varphi^T = (\varphi_1, \ldots, \varphi_T) \in \mathcal{F}^T$.

Note that, by probability laws, $p(\varphi^T) = \prod_{t=1}^{T} p(\varphi_t \mid \varphi^{t-1})$. Hence the distribution on $\mathcal{F}^T$ is uniquely determined by a sequence $\{p(\varphi_t \mid \varphi^{t-1})\}_{t=1}^{T}$. We will use the notation $\varphi^t(\beta^{t-1})$ to represent the $\mathcal{A}$-valued $t$-tuple $(\varphi_1, \varphi_2(\beta^1), \ldots, \varphi_t(\beta^{t-1}))$.

In Information Theory this kind of channel is used to encode and transmit messages. If $\mathcal{W}$ is a message set of cardinality $M$ with typical element $w$, endowed with a probability distribution, a channel code is a set of $M$ channel code functions $\varphi^T[w]$, interpreted as follows: for message $w$, if at time $t$ the channel feedback is $\beta^{t-1}$, then the channel encoder outputs $\varphi_t[w](\beta^{t-1})$. A channel decoder is a map from $\mathcal{B}^T$ to $\mathcal{W}$ which attempts to reconstruct the input message after observing all the output history $\beta^T$ from the channel.

3.1 Directed information and capacity of channels with feedback

In classical Information Theory, the channel capacity, which is related to the channel's transmission rate by Shannon's fundamental result, can be obtained as the supremum of the mutual information over all possible input distributions. In the presence of feedback, however, this correspondence does not hold anymore. More specifically, mutual information no longer represents the information flow from $\alpha^T$ to $\beta^T$. Intuitively, this is due to the fact that mutual information expresses correlation, and therefore it is increased by feedback. But feedback, i.e. the way the output influences the next input, is part of the a priori knowledge, and therefore should not be counted when we measure the output's contribution to the reduction of the uncertainty about the input. If we want to maintain the correspondence with the transmission rate and with information flow, we need to replace mutual information with directed information [13].

Definition 1. In a channel with feedback, the directed information from input $A^T$ to output $B^T$ is defined as $I(A^T \to B^T) = \sum_{t=1}^{T} I(\alpha^t; \beta_t \mid \beta^{t-1})$. In the other direction, the directed information from $B^T$ to $A^T$ is defined as $I(B^T \to A^T) = \sum_{t=1}^{T} I(\beta^{t-1}; \alpha_t \mid \alpha^{t-1})$.

Note that the directed information defined above is not symmetric: the flow from $A^T$ to $B^T$ takes into account the correlation between $\alpha^t$ and $\beta_t$, while the flow from $B^T$ to $A^T$ is based on the correlation between $\beta^{t-1}$ and $\alpha_t$. Intuitively, this is because $\alpha_t$ influences $\beta_t$, but, in the other direction, it is $\beta^{t-1}$ that influences $\alpha_t$.

It can be proved [19] that $I(A^T; B^T) = I(A^T \to B^T) + I(B^T \to A^T)$. If a channel does not have feedback, then $I(B^T \to A^T) = 0$ and $I(A^T; B^T) = I(A^T \to B^T)$.

In a channel with feedback the information transmitted is the directed information, not the mutual information. The following example should help in understanding why.

Example 2. Consider the discrete memoryless channel with input alphabet $\mathcal{A} = \{a_1, a_2\}$ and output alphabet $\mathcal{B} = \{b_1, b_2\}$ whose matrix is represented in Table 3.

Table 3. Channel matrix for Example 2:

        b1     b2
  a1    0.5    0.5
  a2    0.5    0.5

Suppose that the channel is used with feedback, in such a way that, for all $t$, $\alpha_{t+1} = a_1$ if $\beta_t = b_1$, and $\alpha_{t+1} = a_2$ if $\beta_t = b_2$. It is easy to show that if $t \ge 2$ then $I(A^t; B^t) \neq 0$. However, there is no leakage from $A^t$ to $B^t$, since the rows of the matrix are all equal. We have indeed that $I(A^t \to B^t) = 0$, and the mutual information $I(A^t; B^t)$ is only due to the feedback information flow $I(B^t \to A^t)$.
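This can be checked numerically. A self-contained sketch (ours; $T = 2$, with the symbols encoded as 0/1) builds the joint distribution over $(A_1, B_1, A_2, B_2)$ induced by the feedback strategy above and compares mutual and directed information:

```python
from math import log2
from itertools import product

def H(joint, idx):
    """Entropy of the marginal of joint (dict: tuple -> prob) on the coordinates idx."""
    m = {}
    for outcome, pr in joint.items():
        key = tuple(outcome[i] for i in idx)
        m[key] = m.get(key, 0.0) + pr
    return -sum(pr * log2(pr) for pr in m.values() if pr > 0)

# Joint distribution over (A1, B1, A2, B2): A1 uniform, B1 uniform and independent
# of A1 (all rows of Table 3 are equal), A2 copies B1 (the feedback strategy),
# B2 uniform and independent of everything else.
joint = {}
for a1, b1, b2 in product((0, 1), repeat=3):
    joint[(a1, b1, b1, b2)] = 0.125               # the third coordinate is a2 = b1

# I(A^2; B^2) with A^2 = (A1, A2) at coordinates (0, 2) and B^2 = (B1, B2) at (1, 3)
mutual = H(joint, (0, 2)) + H(joint, (1, 3)) - H(joint, (0, 1, 2, 3))
# I(A^2 -> B^2) = I(A^1; B1) + I(A^2; B2 | B1)
directed = (H(joint, (0,)) + H(joint, (1,)) - H(joint, (0, 1))) + \
           (H(joint, (0, 2, 1)) + H(joint, (3, 1)) - H(joint, (1,)) - H(joint, (0, 2, 1, 3)))
print(mutual, directed)                           # 1.0 and 0.0
```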

The concept of capacity is generalized for channels with feedback as follows. Let $\mathcal{D}_T = \{\{p(\alpha_t \mid \alpha^{t-1}, \beta^{t-1})\}_{t=1}^{T}\}$ be the set of all input distributions. For finite $T$, the capacity of a channel $\{p(\beta_t \mid \alpha^t, \beta^{t-1})\}_{t=1}^{T}$ is:

$$C_T = \sup_{\mathcal{D}_T} \frac{1}{T}\, I(A^T \to B^T) \qquad (4)$$

4 Interactive systems as channels with memory and feedback

(General) Interactive Information Hiding Systems ([2]) are a variant of probabilistic automata in which we separate the actions into secret and observable ones; "interactive" means that secret and observable actions can interleave and influence each other.

Definition 2. A general IIHS is a quadruple $\mathcal{I} = (M, \mathcal{A}, \mathcal{B}, \mathcal{L}_\tau)$, where $M$ is a probabilistic automaton $(S, \mathcal{L}, \hat s, \vartheta)$, $\mathcal{L} = \mathcal{A} \cup \mathcal{B} \cup \mathcal{L}_\tau$ where $\mathcal{A}$, $\mathcal{B}$, and $\mathcal{L}_\tau$ are pairwise disjoint sets of secret, observable, and internal actions respectively, and $\vartheta(s) \subseteq \mathcal{D}((\mathcal{B} \cup \mathcal{L}_\tau) \times S)$ implies $|\vartheta(s)| \le 1$, for all $s$. The condition on $\vartheta$ ensures that all observable transitions are fully probabilistic.

Assumption. In this paper we assume that general IIHSs are normalized, i.e. once unfolded, all the transitions between two consecutive levels have either secret labels only, or observable labels only. Moreover, the occurrences of secret and observable labels alternate between levels. We will call secret states the states from which only secret-labeled transitions are possible, and observable states the others. Finally, we assume that for every $s$ and $\ell$ there exists a unique $r$ such that $s \xrightarrow{\ell} r$. Under this assumption the traces of a computation determine the final state, as expressed by the next proposition. In the following, $trace_{\mathcal{A}}$ and $trace_{\mathcal{B}}$ indicate the projection of the traces on secret and observable actions, respectively. Given a general IIHS, it is always possible to find an equivalent one that satisfies these assumptions; the interested reader can find in [1] the formal definition of the transformation.


Proposition 1. Let $\mathcal{I} = (M, \mathcal{A}, \mathcal{B}, \mathcal{L}_\tau)$ be a general IIHS. Consider two paths $\sigma$ and $\sigma'$. Then $trace_{\mathcal{A}}(\sigma) = trace_{\mathcal{A}}(\sigma')$ and $trace_{\mathcal{B}}(\sigma) = trace_{\mathcal{B}}(\sigma')$ implies $\sigma = \sigma'$.

In the following, we will consider two particular cases: the fully probabilistic IIHSs, where there is no nondeterminism, and the secret-nondeterministic IIHSs, where each secret choice is fully nondeterministic. The latter will be called simply IIHSs.

Definition 3. Let $\mathcal{I} = ((S, \mathcal{L}, \hat s, \vartheta), \mathcal{A}, \mathcal{B}, \mathcal{L}_\tau)$ be a general IIHS. Then $\mathcal{I}$ is:

– fully probabilistic if $\vartheta(s) \subseteq \mathcal{D}(\mathcal{A} \times S)$ implies $|\vartheta(s)| \le 1$ for each $s \in S$;
– secret-nondeterministic if, for each $s \in S$, $\vartheta(s) \subseteq \mathcal{D}(\mathcal{A} \times S)$ implies that there exist states $s_i$ such that $\vartheta(s) = \{\delta(a_i, s_i)\}_{i=1}^{n}$, where $\delta(a_i, s_i)$ denotes the point (Dirac) distribution on $(a_i, s_i)$.

We show now how to construct a channel with memory and feedback from an IIHS. We will see that an IIHS corresponds precisely to a channel as determined by its stochastic kernel, while a fully probabilistic IIHS determines, additionally, the input distribution. In the following, we consider an IIHS $\mathcal{I} = ((S, \mathcal{L}, \hat s, \vartheta), \mathcal{A}, \mathcal{B}, \mathcal{L}_\tau)$ in normalized form. Given a path $\sigma$ of length $2t-1$, we denote $trace_{\mathcal{A}}(\sigma)$ by $\alpha^t$, and $trace_{\mathcal{B}}(\sigma)$ by $\beta^{t-1}$.

Definition 4. For each $t$, the channel's stochastic kernel corresponding to $\mathcal{I}$ is defined as $p(\beta_t \mid \alpha^t, \beta^{t-1}) = \vartheta(q)(\beta_t, q')$, where $q$ is the state reached from the root via the path $\sigma$ whose input trace is $\alpha^t$ and whose output trace is $\beta^{t-1}$.

Note that $q$ and $q'$ in the previous definition are well defined: by Proposition 1, $q$ is unique, and since the choice of $\beta_t$ is fully probabilistic, $q'$ is also unique.

If $\mathcal{I}$ is fully probabilistic, then it also determines the input distribution and the dependency of $\alpha_t$ upon $\beta^{t-1}$ (feedback) and $\alpha^{t-1}$.

Definition 5. If $\mathcal{I}$ is fully probabilistic, the associated channel has a conditional input distribution for each $t$, defined as $p(\alpha_t \mid \alpha^{t-1}, \beta^{t-1}) = \vartheta(q)(\alpha_t, q')$, where $q$ is the state reached from the root via the path $\sigma$ whose input trace is $\alpha^{t-1}$ and whose output trace is $\beta^{t-1}$.
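Definitions 4 and 5 amount to reading the kernels directly off the unfolded tree. A sketch of this extraction (the tree encoding, the toy probabilities and the helper names are ours) is:

```python
# A normalized, fully probabilistic IIHS as an unfolded tree: each node carries
# its outgoing distribution over (label, child) pairs; secret and observable
# levels alternate, starting with a secret level (toy labels and probabilities).
def node(edges):
    return {'edges': edges}

leaf = node({})
iihs = node({                                                   # secret level (t = 1)
    'x': (0.3, node({'o1': (0.6, leaf), 'o2': (0.4, leaf)})),   # then an observable level
    'y': (0.7, node({'o1': (0.1, leaf), 'o2': (0.9, leaf)})),
})

def interleave(alphas, betas):
    """The trace a1 b1 a2 b2 ...; alphas may be one element longer than betas."""
    trace = []
    for i, a in enumerate(alphas):
        trace.append(a)
        if i < len(betas):
            trace.append(betas[i])
    return trace

def state_after(root, trace):
    """Follow a trace from the root; by Proposition 1 the reached state is unique."""
    q = root
    for label in trace:
        _, q = q['edges'][label]
    return q

def output_kernel(root, alphas, betas):
    """Definition 4: p(beta_t | alpha^t, beta^{t-1}), read off the state reached via
    the path with secret trace alpha^t and observable trace beta^{t-1}."""
    q = state_after(root, interleave(alphas, betas))
    return {b: pr for b, (pr, _) in q['edges'].items()}

def input_distribution(root, alphas, betas):
    """Definition 5: p(alpha_t | alpha^{t-1}, beta^{t-1}) (fully probabilistic IIHSs only)."""
    q = state_after(root, interleave(alphas, betas))
    return {a: pr for a, (pr, _) in q['edges'].items()}

print(input_distribution(iihs, [], []))   # {'x': 0.3, 'y': 0.7}
print(output_kernel(iihs, ['x'], []))     # {'o1': 0.6, 'o2': 0.4}
```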

4.1 Lifting the channel inputs to reaction functions

Definitions 4 and 5 define the joint probabilities $p(\alpha^t, \beta^t)$ for a fully probabilistic IIHS. We still need to show in what sense these define an information-theoretic channel.

The $\{p(\beta_t \mid \alpha^t, \beta^{t-1})\}_{t=1}^{T}$ determined by the IIHS correspond to a channel's stochastic kernel. The problem resides in the conditional probabilities $\{p(\alpha_t \mid \alpha^{t-1}, \beta^{t-1})\}_{t=1}^{T}$. In an information-theoretic channel, the value of $\alpha_t$ is determined in the encoder by a deterministic function $\varphi_t(\beta^{t-1})$. However, inside the encoder there is no possibility for a probabilistic description of $\alpha_t$. Furthermore, in our setting the concept of encoder makes no sense as there is no information to encode. A solution to this problem is to externalize the probabilistic behavior of $\alpha_t$: the code functions become simple reaction functions $\varphi_t$ that depend only on $\beta^{t-1}$ (the message $w$ does not play a role any more), and these reaction functions are endowed with a probability distribution that generates the probabilistic behavior of the values of $\alpha_t$.


Definition 6. A reactor is a distribution on reaction functions, i.e., a stochastic kernel $\{p(\varphi_t \mid \varphi^{t-1})\}_{t=1}^{T}$. A reactor $R$ is consistent with a fully probabilistic IIHS $\mathcal{I}$ if it induces the compatible distribution $Q(\varphi^T, \alpha^T, \beta^T)$ such that, for every $1 \le t \le T$, $Q(\alpha_t \mid \alpha^{t-1}, \beta^{t-1}) = p(\alpha_t \mid \alpha^{t-1}, \beta^{t-1})$, where the latter is the probability distribution induced by $\mathcal{I}$.

The main result of this section states that for any fully probabilistic IIHS there is a reactor that generates the probabilistic behavior of the IIHS.

Theorem 3. Given a fully probabilistic IIHS $\mathcal{I}$, we can construct a channel with memory and feedback, and a probability distribution $Q(\varphi^T, \alpha^T, \beta^T)$, which corresponds to $\mathcal{I}$ in the sense that, for every $t$, $\alpha^t$ and $\beta^t$, with $1 \le t \le T$, $Q(\alpha^t, \beta^t) \stackrel{\text{def}}{=} \sum_{\varphi^T} Q(\varphi^T, \alpha^t, \beta^t) = p(\alpha^t, \beta^t)$ holds, where $p(\alpha^t, \beta^t)$ is the joint probability of input and output traces induced by $\mathcal{I}$.

Corollary 1. Let $\mathcal{I}$ be a fully probabilistic IIHS. Let $\{p(\beta_t \mid \alpha^t, \beta^{t-1})\}_{t=1}^{T}$ be a sequence of stochastic kernels and $\{p(\alpha_t \mid \alpha^{t-1}, \beta^{t-1})\}_{t=1}^{T}$ a sequence of input distributions defined by $\mathcal{I}$ according to Definitions 4 and 5. Then the reactor $R = \{p(\varphi_t \mid \varphi^{t-1})\}_{t=1}^{T}$ compatible with $\mathcal{I}$ is given by:

$$p(\varphi_1) = p(\alpha_1 \mid \alpha^0, \beta^0) = p(\alpha_1) \qquad (5)$$

$$p(\varphi_t \mid \varphi^{t-1}) = \prod_{\beta^{t-1}} p\big(\varphi_t(\beta^{t-1}) \mid \varphi^{t-1}(\beta^{t-2}), \beta^{t-1}\big), \qquad 2 \le t \le T \qquad (6)$$

Figure 3 depicts the model for IIHSs. Note that, in relation to Figure 2, there are some simplifications: (1) no message $w$ is needed; (2) the decoder is not used. At the beginning, a reaction function sequence $\varphi^T$ is chosen, and then the channel is used $T$ times. At each use $t$, the encoder decides the next input symbol $\alpha_t$ based on the reaction function $\varphi_t$ and the fed-back output $\beta^{t-1}$. Then the channel produces an output $\beta_t$ based on the stochastic kernel $p(\beta_t \mid \alpha^t, \beta^{t-1})$. The output is then fed back to the encoder with delay one.

[Figure 3. Channel with memory and feedback model for IIHSs: reaction functions $\varphi^T$ feed an "interactor" $\{\alpha_t = \varphi_t(\beta^{t-1})\}_{t=1}^{T}$, whose output $\alpha_t$ enters the channel $\{p(\beta_t \mid \alpha^t, \beta^{t-1})\}_{t=1}^{T}$; the channel output $\beta_t$ is fed back to the interactor with delay one.]

We conclude this section by remarking an intriguing coincidence: the notion of reaction function sequence $\varphi^T$ on IIHSs corresponds to the notion of deterministic scheduler. In fact, each reaction function $\varphi_t$ selects the next step $\alpha_t$ on the basis of $\beta^{t-1}$ and $\alpha^{t-1}$ (generated by $\varphi^{t-1}$), and $\beta^{t-1}, \alpha^{t-1}$ represent the path up to that state.


5 Leakage in Interactive Systems

In this section we propose a notion of information flow based on our model. We follow the idea of defining leakage and maximum leakage using the concepts of mutual information and capacity (see for instance [4]), making the necessary adaptations.

Since the directed information $I(A^T \to B^T)$ is a measure of how much information flows from $A^T$ to $B^T$ in a channel with feedback (cf. Section 3.1), it is natural to consider it as a measure of leakage of information by the protocol.

Definition 7. The information leakage of an IIHS is defined as:

$$I(A^T \to B^T) = \sum_{t=1}^{T} H(A_t \mid A^{t-1}, B^{t-1}) \;-\; H(A^T \mid B^T)$$

Note that $\sum_{t=1}^{T} H(A_t \mid A^{t-1}, B^{t-1})$ can be seen as the entropy $H_R$ of the reactor $R$.

Compare this definition with the classical information-theoretic approach to information leakage: when there is no feedback, the leakage is defined as

$$I(A^T; B^T) = H(A^T) - H(A^T \mid B^T) \qquad (7)$$

The principle behind (7) is that the leakage is equal to the difference between the a priori uncertainty $H(A^T)$ and the a posteriori uncertainty $H(A^T \mid B^T)$ (the gain in knowledge about the secret obtained by observing the output). Our definition maintains the same principle, with the proviso that the a priori uncertainty is now represented by $H_R$.
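Assuming the joint distribution $p(\alpha^T, \beta^T)$ has already been computed (for instance by unrolling equation (3) as sketched in Section 3), Definition 7 can be evaluated directly. The following sketch (ours, with an invented joint distribution for $T = 2$) does so:

```python
from math import log2
from itertools import product

def H(joint, idx):
    """Entropy of the marginal of joint (dict: tuple -> prob) on the coordinates idx."""
    m = {}
    for outcome, pr in joint.items():
        key = tuple(outcome[i] for i in idx)
        m[key] = m.get(key, 0.0) + pr
    return -sum(pr * log2(pr) for pr in m.values() if pr > 0)

def cond_H(joint, a, b):
    """H(A | B) = H(A, B) - H(B)."""
    return H(joint, a + b) - H(joint, b)

def leakage(joint, T):
    """Definition 7, with outcomes ordered (A1, B1, ..., AT, BT):
    sum_t H(A_t | A^{t-1}, B^{t-1})  -  H(A^T | B^T)."""
    a_idx = tuple(2 * t for t in range(T))          # coordinates of A1..AT
    b_idx = tuple(2 * t + 1 for t in range(T))      # coordinates of B1..BT
    reactor_entropy = sum(cond_H(joint, (2 * t,), tuple(range(2 * t))) for t in range(T))
    return reactor_entropy - cond_H(joint, a_idx, b_idx)

# Invented joint over (A1, B1, A2, B2): secrets uniform and independent, each
# observable copies the current secret with probability 0.8 (so no feedback here,
# and the leakage coincides with I(A^2; B^2)).
joint = {}
for a1, b1, a2, b2 in product((0, 1), repeat=4):
    joint[(a1, b1, a2, b2)] = 0.25 * (0.8 if b1 == a1 else 0.2) * (0.8 if b2 == a2 else 0.2)

print(round(leakage(joint, 2), 4))    # about 0.5561 bits
```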

5.1 Maximum leakage as capacity

In the case of secret-nondeterministic IIHSs, we have a stochastic kernel but no distribution on the code functions. In this case it seems natural to consider the worst leakage over all possible distributions on code functions. This is exactly the concept of capacity.

Definition 8. The maximum leakage of an IIHS is defined as the capacity $C_T$ of the associated channel with memory and feedback.

6 Modeling IIHSs as channels: An example

In this section we show the application of our approach to the Cocaine Auction Protocol [17]. Let us imagine a situation where several mob individuals are gathered around a table. An auction is about to be held in which one of them offers his next shipment of cocaine to the highest bidder. The seller describes the merchandise and proposes a starting price. The others then bid increasing amounts until there are no bids for 30 consecutive seconds. At that point the seller declares the auction closed and arranges a secret appointment with the winner to deliver the goods.

The basic protocol is fairly simple and is organized as a succession of rounds of bidding. Round $i$ starts with the seller announcing the bid price $b_i$ for that round. Buyers have $t$ seconds to make an offer (i.e. to say yes, meaning "I'm willing to buy at the current bid price $b_i$"). As soon as one buyer anonymously says yes, he becomes the winner $w_i$ of that round and a new round begins. If nobody says anything for $t$ seconds, round $i$ is concluded by timeout and the auction is won by the winner $w_{i-1}$ of the previous round, if one exists. If the timeout occurs during round 0, this means that nobody made any offer at the initial price $b_0$, so there is no sale.

Although our framework allows the formalization of this protocol for an arbitrary number of bidders and bidding rounds, for illustration purposes we will consider the case of two bidders (Candlemaker and Scarface) and two rounds of bids. Furthermore, we assume that the initial bid is always 1 dollar, so the first bid does not need to be announced by the seller. In each turn the seller can choose how much he wants to increase the current bid. This is done by adding an increment to the last bid. There are two options of increments, namely inc1 (1 dollar) and inc2 (2 dollars). In this way, $b_{i+1}$ is either $b_i + \mathit{inc1}$ or $b_i + \mathit{inc2}$. We can describe this protocol as a normalized IIHS $\mathcal{I} = (M, \mathcal{A}, \mathcal{B}, \mathcal{L}_\tau)$, where $\mathcal{A} = \{\mathit{Candlemaker}, \mathit{Scarface}, a\}$ is the set of secret actions, $\mathcal{B} = \{\mathit{inc1}, \mathit{inc2}, b\}$ is the set of observable actions, $\mathcal{L}_\tau = \emptyset$ is the set of hidden actions, and the probabilistic automaton $M$ is represented in Figure 4. For clarity, we omit transitions with probability 0 in the automaton. Note that the special secret action $a$ represents the situation where neither Candlemaker nor Scarface bids. The special observable action $b$ is only possible after no one has bid; it signals the end of the auction, after which no bid is allowed anymore.

[Figure 4. The Cocaine Auction example: the normalized probabilistic automaton $M$. From the root, the secret transitions Candlemaker, Scarface and $a$ are taken with probabilities $p_1, p_2, p_3$; they are followed by observable transitions inc1, inc2 (probabilities $q_4, \ldots, q_7$) or, after $a$, by $b$ with probability 1; a second layer of secret transitions (probabilities $p_9, \ldots, p_{20}$) and observable transitions (probabilities $q_{22}, \ldots, q_{40}$) follows, and the auction ends with $b$.]

Table 4 shows all the stochastic kernels for this example. The formalization of this protocol in terms of IIHSs using our framework makes it possible to prove the claim in [17] suggesting that if the seller knows the identity of the bidders then the (strong) anonymity guarantees are not provided anymore.

7 Topological properties of IIHSs and their Capacity

In this section we show how to extend to IIHSs the notion of pseudometric defined in [8] for Concurrent Labelled Markov Chains, and we prove that the capacity of the corresponding channels is a continuous function of this pseudometric. The metric construction is sound for general IIHSs, but the result on capacity is only valid for secret-nondeterministic IIHSs.

Given a set of states $S$, a pseudometric (or distance) is a function $d$ that yields a non-negative real number for each pair of states and satisfies the following: $d(s,s) = 0$; $d(s,t) = d(t,s)$; and $d(s,t) \le d(s,u) + d(u,t)$.


Table 4. Stochastic kernels for the Cocaine Auction example.

(a) $t = 1$, $p(\beta_1 \mid \alpha_1, \beta^0)$:

  $\alpha_1 \to \beta_1$    inc1   inc2   b
  Candlemaker               q4     q5     0
  Scarface                  q6     q7     0
  a                         0      0      1

(b) $t = 2$, $p(\beta_2 \mid \alpha^2, \beta^1)$:

  $\alpha_1, \beta_1, \alpha_2 \to \beta_2$     inc1   inc2   b
  Candlemaker, inc1, Candlemaker                q22    q23    0
  Candlemaker, inc1, Scarface                   q24    q25    0
  Candlemaker, inc1, a                          0      0      1
  Candlemaker, inc2, Candlemaker                q27    q28    0
  Candlemaker, inc2, Scarface                   q29    q30    0
  Candlemaker, inc2, a                          0      0      1
  Scarface, inc1, Candlemaker                   q32    q33    0
  Scarface, inc1, Scarface                      q34    q35    0
  Scarface, inc1, a                             0      0      1
  Scarface, inc2, Candlemaker                   q37    q38    0
  Scarface, inc2, Scarface                      q39    q40    0
  Scarface, inc2, a                             0      0      1
  a, b, a                                       0      0      1
  all other lines                               0      0      1

(The $q_i$ are the probabilities of the corresponding observable transitions in Figure 4.)

We say that a pseudometric $d$ is $c$-bounded if $\forall s, t : d(s,t) \le c$, where $c$ is a positive real number. We now define a complete lattice on pseudometrics, and define the distance between IIHSs as the greatest fixpoint of a distance transformation, in line with the coinductive theory of bisimilarity.

Definition 9. $\mathcal{M}$ is the class of 1-bounded pseudometrics on states with the ordering $d \preceq d'$ if $\forall s, s' \in S : d(s,s') \ge d'(s,s')$.

It is easy to see that $(\mathcal{M}, \preceq)$ is a complete lattice. In order to define pseudometrics on IIHSs, we now need to lift the pseudometrics on states to pseudometrics on distributions in $\mathcal{D}(\mathcal{L} \times S)$. Following standard lines [20, 8, 7], we apply the construction based on the Kantorovich metric [11].

Definition 10. For $d \in \mathcal{M}$, and $\mu, \mu' \in \mathcal{D}(\mathcal{L} \times S)$, we define $d(\mu, \mu')$ (overloading the notation $d$) as

$$d(\mu, \mu') = \max \sum_{(\ell_i, s_i) \in \mathcal{L} \times S} \big(\mu(\ell_i, s_i) - \mu'(\ell_i, s_i)\big)\, x_i$$

where the maximization is over all possible values of the $x_i$'s, subject to the constraints $0 \le x_i \le 1$ and $x_i - x_j \le \hat d\big((\ell_i, s_i), (\ell_j, s_j)\big)$, where $\hat d\big((\ell_i, s_i), (\ell_j, s_j)\big) = 1$ if $\ell_i \neq \ell_j$, and $\hat d\big((\ell_i, s_i), (\ell_j, s_j)\big) = d(s_i, s_j)$ otherwise.

It can be shown that with this definition $d(\mu, \mu')$ is a pseudometric on $\mathcal{D}(\mathcal{L} \times S)$.
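The maximization in Definition 10 is a finite linear program, so the lifted distance can be computed with an off-the-shelf LP solver. A sketch (ours, using scipy; the toy support and ground distances are invented) is:

```python
import numpy as np
from scipy.optimize import linprog

def kantorovich(mu1, mu2, dhat):
    """Lifting of Definition 10: maximize sum_i (mu1_i - mu2_i) x_i subject to
    0 <= x_i <= 1 and x_i - x_j <= dhat[i][j], where mu1, mu2 are distributions
    over a common support of (label, state) pairs and dhat is the ground distance."""
    n = len(mu1)
    c = -(np.asarray(mu1, dtype=float) - np.asarray(mu2, dtype=float))  # linprog minimizes
    rows, rhs = [], []
    for i in range(n):
        for j in range(n):
            if i != j:
                row = np.zeros(n)
                row[i], row[j] = 1.0, -1.0
                rows.append(row)
                rhs.append(dhat[i][j])
    res = linprog(c, A_ub=np.array(rows), b_ub=np.array(rhs),
                  bounds=[(0.0, 1.0)] * n, method='highs')
    return -res.fun

# Toy support (l, s1), (l, s2), (l', s3): s1 and s2 at ground distance 0.2,
# pairs with different labels at distance 1.
mu1 = [0.5, 0.3, 0.2]
mu2 = [0.3, 0.5, 0.2]
dhat = [[0.0, 0.2, 1.0],
        [0.2, 0.0, 1.0],
        [1.0, 1.0, 0.0]]
print(kantorovich(mu1, mu2, dhat))   # 0.2 * 0.2 = 0.04
```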

Definition 11. $d \in \mathcal{M}$ is a bisimulation metric if, for all $\epsilon \in [0, 1)$, $d(s, s') \le \epsilon$ implies that if $s \to \mu$, then there exists some $\mu'$ such that $s' \to \mu'$ and $d(\mu, \mu') \le \epsilon$.

The greatest bisimulation metric is $d_{\max} = \bigsqcup \{ d \in \mathcal{M} \mid d \text{ is a bisimulation metric} \}$. We now characterize $d_{\max}$ as a fixed point of a monotonic function $\Phi$ on $\mathcal{M}$. For simplicity, from now on we consider only the distance between states belonging to different IIHSs with disjoint sets of states.


Definition 12. Given two IIHSs with transition relations $\vartheta$ and $\vartheta'$ respectively, and a pseudometric $d$ on states, define $\Phi : \mathcal{M} \to \mathcal{M}$ as:

$$\Phi(d)(s, s') = \begin{cases} \max_i d(s_i, s'_i) & \text{if } \vartheta(s) = \{\delta(a_1, s_1), \ldots, \delta(a_m, s_m)\} \text{ and } \vartheta'(s') = \{\delta(a_1, s'_1), \ldots, \delta(a_m, s'_m)\} \\ d(\mu, \mu') & \text{if } \vartheta(s) = \{\mu\} \text{ and } \vartheta'(s') = \{\mu'\} \\ 0 & \text{if } \vartheta(s) = \vartheta'(s') = \emptyset \\ 1 & \text{otherwise} \end{cases}$$

It is easy to see that the definition of $\Phi$ is a particular case of the function $F$ defined in [8, 7]. Hence it can be proved, by adapting the proofs of the analogous results in [8, 7], that $\Phi(d)$ is a pseudometric, and that $d$ is a bisimulation metric iff $d \preceq \Phi(d)$. This implies that $d_{\max} = \bigsqcup \{ d \in \mathcal{M} \mid d \preceq \Phi(d) \}$, and, still as a particular case of $F$ in [8, 7], we have that $\Phi$ is monotonic on $\mathcal{M}$. By Tarski's fixed point theorem, $d_{\max}$ is the greatest fixed point of $\Phi$. Furthermore, in [1] we show that $d_{\max}$ is indeed a bisimulation metric, and that it is the greatest bisimulation metric. In addition, the finite branchingness of IIHSs ensures that the closure ordinal of $\Phi$ is $\omega$ (cf. Lemma 3.10 in the full version of [8]). Therefore one can show that $d_{\max} = \bigsqcap \{ \Phi^i(\top) \mid i \in \mathbb{N} \}$, where $\top$ is the greatest pseudometric (i.e. $\top(s, s') = 0$ for every $s, s'$), and $\Phi^0(\top) = \top$.

Given two IIHSs $\mathcal{I}$ and $\mathcal{I}'$, with initial states $\hat s$ and $\hat s'$ respectively, we define the distance between $\mathcal{I}$ and $\mathcal{I}'$ as $d(\mathcal{I}, \mathcal{I}') = d_{\max}(\hat s, \hat s')$. The next theorem states the continuity of the capacity w.r.t. the metric on IIHSs. It is crucial that the IIHSs are secret-nondeterministic (while the definition of the metric holds in general).

Theorem 4. Consider two normalized IIHSs $\mathcal{I}$ and $\mathcal{I}'$, and fix a $T > 0$. For every $\epsilon > 0$ there exists $\nu > 0$ such that if $d(\mathcal{I}, \mathcal{I}') < \nu$ then $|C_T(\mathcal{I}) - C_T(\mathcal{I}')| < \epsilon$.

We conclude this section with an example showing that the continuity result for the capacity does not hold if the construction of the channel is done starting from a system in which the secrets are endowed with a probability distribution. This is also the reason why we could not simply adopt the proof technique of the continuity result in [8] and we had to come up with a different reasoning.

Example 3. Consider the two following programs, where $a_1, a_2$ are secrets, $b_1, b_2$ are observables, $\parallel$ is the parallel operator, and $+_p$ is a binary probabilistic choice that assigns probability $p$ to the left branch and probability $1-p$ to the right one.

s) $(\mathit{send}(a_1) +_p \mathit{send}(a_2)) \parallel \mathit{receive}(x).\mathit{output}(b_2)$

t) $(\mathit{send}(a_1) +_q \mathit{send}(a_2)) \parallel \mathit{receive}(x).\mathit{if}\; x = a_1 \;\mathit{then}\; \mathit{output}(b_1) \;\mathit{else}\; \mathit{output}(b_2)$

Table 5 shows the fully probabilistic IIHSs corresponding to these programs, and their associated channels, which in this case (since the secret actions are all at the top level) are classical channels, i.e. memoryless and without feedback. As usual for classical channels, they do not depend on $p$ and $q$. It is easy to see that the capacity of the first channel is 0 and the capacity of the second one is 1. Hence their difference is 1, independently of $p$ and $q$.

Let now $p = 0$ and $q = \epsilon$. It is easy to see that the distance between $s$ and $t$ is $\epsilon$. Therefore (when the automata have probabilities on the secrets), the capacity is not a continuous function of the distance.
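The two capacities can be confirmed with a small self-contained computation (ours; a brute-force search over binary input distributions rather than a closed form):

```python
from math import log2

def mutual_information(p_in, W):
    """I(A;B) for a binary-input channel W[a][b] = p(b|a) and input distribution p_in."""
    q = [sum(p_in[a] * W[a][b] for a in range(2)) for b in range(2)]
    return sum(p_in[a] * W[a][b] * log2(W[a][b] / q[b])
               for a in range(2) for b in range(2) if p_in[a] * W[a][b] > 0)

def capacity(W, steps=1000):
    """Brute-force the (classical) capacity over a grid of input distributions."""
    return max(mutual_information([k / steps, 1 - k / steps], W) for k in range(steps + 1))

W_s = [[0.0, 1.0], [0.0, 1.0]]   # channel of program s: both secrets yield b2
W_t = [[1.0, 0.0], [0.0, 1.0]]   # channel of program t: the output reveals the secret
print(capacity(W_s), capacity(W_t))   # 0.0 and 1.0
```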


Table 5. The IIHSs of Example 3 and their corresponding channels. In the IIHS for $s$, the secret $a_1$ is chosen with probability $p$ and $a_2$ with probability $1-p$; in both cases the observable $b_2$ is produced with probability 1. In the IIHS for $t$, the secret $a_1$ is chosen with probability $q$ and $a_2$ with probability $1-q$; after $a_1$ the observable is $b_1$, after $a_2$ it is $b_2$.

(a) channel for $s$:        (b) channel for $t$:

        b1    b2                     b1    b2
  a1    0     1               a1     1     0
  a2    0     1               a2     0     1

8 Conclusion and future work

In this paper we have investigated the problem of information leakage in interactive systems, and we have proved that these systems can be modeled as channels with memory and feedback. The situation is summarized in Table 6(a). The comparison with the classical situation of non-interactive systems is represented in Table 6(b). Furthermore, we have proved that the channel capacity is a continuous function of the Kantorovich metric.

Table 6. (a) IIHSs as automata, their counterparts as channels, and the corresponding notion of leakage:

– Normalized IIHSs with nondeterministic inputs and probabilistic outputs correspond to a sequence of stochastic kernels $\{p(\beta_t \mid \alpha^t, \beta^{t-1})\}_{t=1}^{T}$; the leakage is the capacity.
– Normalized IIHSs with a deterministic scheduler solving the nondeterminism correspond to a sequence of stochastic kernels $\{p(\beta_t \mid \alpha^t, \beta^{t-1})\}_{t=1}^{T}$ together with a reaction function sequence $\varphi^T$.
– Fully probabilistic normalized IIHSs correspond to a sequence of stochastic kernels $\{p(\beta_t \mid \alpha^t, \beta^{t-1})\}_{t=1}^{T}$ together with a reactor $\{p(\varphi_t \mid \varphi^{t-1})\}_{t=1}^{T}$; the leakage is the directed information $I(A^T \to B^T)$.

(b) Classical channels vs. channels with memory and feedback:

– Classical channels: the protocol is modeled with independent uses of the channel, often a unique use. Channels with memory and feedback: the protocol is modeled with several consecutive uses of the channel.
– Classical channels: the channel is from $\mathcal{A}^T$ to $\mathcal{B}^T$, i.e. its input is a single string $\alpha^T = \alpha_1 \ldots \alpha_T$ of secret symbols and its output is a single string $\beta^T = \beta_1 \ldots \beta_T$ of observable symbols. Channels with memory and feedback: the channel is from $\mathcal{F}$ to $\mathcal{B}$, i.e. its input is a reaction function $\varphi_t$ and its output is an observable $\beta_t$.
– Classical channels: the channel is memoryless and in general the absence of feedback is implicitly assumed. Channels with memory and feedback: the channel has memory; despite the fact that the channel from $\mathcal{F}$ to $\mathcal{B}$ does not have feedback, the internal stochastic kernels do.
– Classical channels: the capacity is calculated using mutual information $I(A^T; B^T)$. Channels with memory and feedback: the capacity is calculated using directed information $I(A^T \to B^T)$.

For future work we would like to provide algorithms to compute the leakage and maximum leakage of interactive systems. These problems are very challenging given the exponential growth of reaction functions (needed to compute the leakage) and the quantification over infinitely many reactors (given by the definition of maximum leakage in terms of capacity). One possible solution is to study the relation between deterministic schedulers and sequences of reaction functions. In particular, we believe that for each sequence of reaction functions and distribution over it there exists a probabilistic scheduler for the automata representation of the secret-nondeterministic IIHS. In this way, the problem of computing the leakage and maximum leakage would reduce to a standard probabilistic model checking problem (where the challenge is to compute probabilities ranging over infinitely many schedulers).

In addition, we plan to investigate measures of leakage for interactive systems other than mutual information and capacity.

References

1. M. S. Alvim, M. E. Andrés, and C. Palamidessi. Information Flow in Interactive Systems, 2010. http://hal.archives-ouvertes.fr/inria-00479672/en/.
2. M. E. Andrés, C. Palamidessi, P. van Rossum, and G. Smith. Computing the leakage of information-hiding systems. In Proc. of TACAS, volume 6015 of LNCS, pages 373–389. Springer, 2010.
3. A. Bohannon, B. C. Pierce, V. Sjöberg, S. Weirich, and S. Zdancewic. Reactive noninterference. In Proc. of CCS, pages 79–90. ACM, 2009.
4. K. Chatzikokolakis, C. Palamidessi, and P. Panangaden. Anonymity protocols as noisy channels. Inf. and Comp., 206(2–4):378–401, 2008.
5. D. Clark, S. Hunt, and P. Malacaria. Quantified interference for a while language. In Proc. of QAPL 2004, volume 112 of ENTCS, pages 149–166. Elsevier, 2005.
6. T. M. Cover and J. A. Thomas. Elements of Information Theory. J. Wiley & Sons, Inc., 1991.
7. Y. Deng, T. Chothia, C. Palamidessi, and J. Pang. Metrics for action-labelled quantitative transition systems. In Proc. of QAPL, volume 153 of ENTCS, pages 79–96. Elsevier, 2006.
8. J. Desharnais, R. Jagadeesan, V. Gupta, and P. Panangaden. The metric analogue of weak bisimulation for probabilistic processes. In Proc. of LICS, pages 413–422. IEEE, 2002.
9. Ebay website. http://www.ebay.com/.
10. Ebid website. http://www.ebid.net/.
11. L. Kantorovich. On the transfer of masses (in Russian). Doklady Akademii Nauk, 5(1):1–4, 1942. Translated in Management Science, 5(1):1–4, 1958.
12. P. Malacaria. Assessing security threats of looping constructs. In Proc. of POPL, pages 225–235. ACM, 2007.
13. J. L. Massey. Causality, feedback and directed information. In Proc. of the 1990 Intl. Symposium on Information Theory and its Applications, 1990.
14. Mercadolibre website. http://www.mercadolibre.com/.
15. R. Segala. Modeling and Verification of Randomized Distributed Real-Time Systems. PhD thesis, 1995. Tech. Rep. MIT/LCS/TR-676.
16. G. Smith. On the foundations of quantitative information flow. In Proc. of FOSSACS, volume 5504 of LNCS, pages 288–302. Springer, 2009.
17. F. Stajano and R. J. Anderson. The cocaine auction protocol: On the power of anonymous broadcast. In Information Hiding, pages 434–447, 1999.
18. S. Subramanian. Design and verification of a secure electronic auction protocol. In Proc. of SRDS, pages 204–210. IEEE, 1998.
19. S. Tatikonda and S. K. Mitter. The capacity of channels with feedback. IEEE Transactions on Information Theory, 55(1):323–349, 2009.


20. F. van Breugel and J. Worrell. Towards quantitative verification of probabilistic transition systems. In Proc. of ICALP, volume 2076 of LNCS, pages 421–432. Springer, 2001.
21. W. Vickrey. Counterspeculation, Auctions, and Competitive Sealed Tenders. The Journal of Finance, 16(1):8–37, 1961.
