Formal approaches to information hiding : An analysis of interactive systems, statistical disclosure control, and refinement of specifications


HAL Id: tel-00639948

https://tel.archives-ouvertes.fr/tel-00639948v3

Submitted on 13 Feb 2012

Mário Alvim

To cite this version:

Mário Alvim. Formal approaches to information hiding: An analysis of interactive systems, statistical disclosure control, and refinement of specifications. Cryptography and Security [cs.CR]. Ecole Polytechnique X, 2011. English. ⟨tel-00639948v3⟩

PhD Thesis

Specialty: Computer Science

Formal approaches to information hiding: An analysis of interactive systems, statistical disclosure control, and refinement of specifications

Mário S. Alvim

LIX, École Polytechnique

Palaiseau, France

Supervisor: Catuscia Palamidessi

Rapporteurs: Gilles Barthe, Michael Mislove

Examiners: Béatrice Bérard, Stéphanie Delaune, Loïc Hélouët, Daniel Le Métayer, Geoffrey Smith

École Polytechnique

Équipe Comète

Centre National de la Recherche Scientifique

Contents

List of Figures
List of Tables
Acknowledgements

1 Introduction
1.1 Information hiding
1.2 Qualitative and quantitative approaches to information hiding: a brief history
1.2.1 The qualitative approach
1.2.2 The quantitative approach
1.3 Case studies of information hiding
1.3.1 Quantitative information flow and anonymity
1.3.2 Statistical disclosure control
1.3.3 Refining specifications into implementations
1.4 Plan of the thesis and contribution
1.5 Publications

2 Preliminaries
2.1 Probability spaces
2.2 Probabilistic automata
2.3 CCS with internal probabilistic choice

3 The rationale behind the use of information theory for leakage
3.1 Information theory and communication
3.2 Information theory and information flow
3.3 Uncertainty and leakage
3.3.1 Shannon entropy
3.3.2 Min-entropy
3.3.3 Guessing entropy
3.3.5 Comparison and discussion

4 Information flow in interactive systems
4.1 Interactive systems
4.2 Discrete channels with memory and feedback
4.2.1 The power of feedback
4.2.2 Directed information and capacity of channels with feedback
4.3 Interactive systems as channels with memory and feedback
4.3.1 Construction of the channel associated to an IIHS
4.3.2 Lifting the channel inputs to reaction functions
4.4 Leakage in interactive systems
4.5 An example: the Cocaine Auction protocol
4.5.1 Calculating the information leakage
4.6 Topological properties of IIHSs and their capacity
4.7 Related work
4.8 Chapter summary and discussion

5 Differential privacy: the trade-off between leakage and utility
5.1 Differential privacy
5.1.1 Formal definition
5.1.2 Alternative interpretation in the case of cliques
5.2 A model of utility and privacy for statistical databases
5.2.1 Leakage about an individual
5.2.2 A note on the choice of values
5.2.3 The questions we explore with the help of our model
5.3 Graph symmetries
5.4 Deriving the relation between differential privacy and quantitative information flow on the basis of the graph structure
5.4.1 Assumptions and notation
5.4.2 The matrix transformation
5.4.3 The bound on the a posteriori entropy of the channel
5.5 Application to leakage
5.5.1 Measuring the leakage about an individual
5.6 Application to utility
5.6.1 The bound on the utility
5.7 Related work
5.8 Chapter summary and discussion

6 Safe equivalences for security properties
6.1 The use of equivalences in security
6.2 Distributed systems and components
6.2.1 Tagged Probabilistic Automata
6.2.3 Distributed systems
6.3 Admissible schedulers
6.3.1 Restricting global schedulers
6.3.2 Restricting local schedulers
6.4 Safe equivalences
6.4.1 Safe complete traces
6.4.2 Safe bisimilarity
6.5 Safe nondeterministic information hiding
6.6 Related work
6.7 Chapter summary and discussion

7 Conclusion

List of Figures

1.1 An example of the dining cryptographers protocol
1.2 The Crowds protocol at work
2.1 The semantics of $\mathrm{CCS}_p$
4.1 Interactive system of Example 1
4.2 Model for discrete channel with memory and feedback
4.3 Scheme of secret transitions for secret-nondeterministic IIHSs
4.4 Local transformation in an IIHS tree
4.5 Transformation in an IIHS tree
4.6 The normalized IIHS for the extended website example
4.7 Channel with memory and feedback model for IIHS
4.8 Cocaine auction example
4.9 Comparison between the leakage in Examples a, b, and c
5.1 Randomized function $K$
5.2 Leakage and utility for oblivious mechanisms
5.3 Some distance-regular graphs with degree $3$
5.4 Some $VT^{+}$ graphs
5.5 Some $(\mathit{Val}^{u}, \sim)$ graphs
5.6 Venn diagram for the classes of graphs considered in this section. Here $S^{*} = \{\mathit{Val}^{u} \mid |\mathit{Val}| = 2,\ u \leq 2\}$
5.7 Steps of the matrix transformation for distance-regular and $VT^{+}$ graphs
5.8 The relation between elements of a row $i$ and the elements in the diagonal
5.9 Graphs of $\mathit{Bnd}(u, v, \epsilon)$ for $u = 100$ and $v = 2$ (lowest line), $v = 10$ (intermediate line), and $v = 100$ (highest line), respectively
5.10 Universe and highest min-entropy leakage matrix giving $\epsilon$-differential privacy for Example 7
6.1 Execution trees for Example 10

List of Tables

4.1 Channel matrix for Example 1
4.2 Two different channel matrices induced by two different input distributions for Example 1
4.3 Channel matrix for binary erasure channel
4.4 General form of channel matrix
4.5 A possible evolution of the binary channel with time, for $W = 011$ and $T = 3$
4.6 Channel matrix for Example 5
4.7 Stochastic kernels for the Cocaine Auction example
4.8 Reaction functions for the cocaine auction example
4.9 Values of the probabilities in Figure 4.8 for Examples a, b, and c
4.10 Values of the entropy and directed information for Examples a, b, and c, where $I(A^T; B^T) = H(A^T) - H(A^T \mid B^T)$ and $I(A^T \to B^T) = H_R - H(A^T \mid B^T)$
4.11 The IIHSs of Example 6 and their corresponding channels
4.12 Summary of results
5.1 Mechanisms for the city with higher number of votes for candidate $\mathit{cand}$

Acknowledgements

Praise the bridge that carried you over.

George Colman

Every piece of work is produced within a context, and naturally this thesis is no exception. I want to dedicate this space to express my gratitude to some people that have helped to create an environment of scientific, material, and emotional support, which was crucial to the development of my work over the past three years. I am deeply grateful to all these people, and the influence they have had in this work is only a small part of the influence and importance they have in my life.

First of all, I will always be deeply grateful to Catuscia Palamidessi for her outstanding work as my thesis supervisor. During these three years she has provided a stimulating and exciting scientific environment, endowed with all the material and logistic support a student could ever need. Her passion for science is contagious, and her brilliance and persistence are qualities that I can only hope to be fortunate enough to achieve someday. And not only is she a widely recognized researcher, but she is also a remarkable human being, whose kindness and ethics have set an example that I will always keep with me in academia and for life. It is with sincere joy that I can say that, besides the fruitful scientific cooperation, we were able to create a deep link of friendship, and I will do my best so that both can last for life.

Another person of fundamental importance in my path to this day is Elaine Pimentel. As the first scientific tutor I have ever had, and later as my Master's program supervisor, she was the one welcoming me to the fascinating world of academia. She guided my first steps in research, and her dedication and intelligence are remarkable. Elaine was the strongest supporter I have ever had for doing a doctoral program abroad, especially in the early times when not even my family was convinced yet it was a good idea. More than once Elaine was a thoughtful friend and a wise advisor, who helped me figure out solutions for practical problems that, in some moments, made me doubt I could get to the end of this program. Thank you, Elaine, very much for it all.

I would also like to thank the CNRS (Centre National de la Recherche Scientifique) [...] the funds for these three years of research in France. I also thank INRIA for all the financial and logistical support with respect to scientific conferences, events, and work trips.

I am grateful to the members of my jury, who kindly gave their time to go through my work and evaluate it. Thanks to Béatrice Bérard, Stéphanie Delaune, Loïc Hélouët, Daniel Le Métayer, and Geoffrey Smith. And special thanks to my rapporteurs Michael Mislove and Gilles Barthe, who produced the evaluation report for my thesis. I am honored to have had the opportunity to have such a highly qualified jury.

I would also like to thank all the people from the Graduate School (École Doctorale) of École Polytechnique, especially Audrey Lémarechal for her help with the documentation regarding my stay in France, Fabrice Baronnet for his administrative work, and Christine Ferret for everything involving the thesis defense.

I feel especially fortunate for having had the opportunity to work in such a stimulating environment as is the LIX laboratory (Laboratoire d'Informatique de l'École Polytechnique), and in particular the Comète team. It is with a weight in my heart that I leave all these amazing people. I am deeply grateful to Frank Valencia, who gave me one of the warmest welcomes I got in my new life in Europe. Frank was not only a teacher, but a colleague, a gym companion, and a good friend. I am also grateful to his wife, Sara Södergren, and to their son, Felipe Valencia, for all the good moments shared. Thanks also to Andrés Aristizábal for his kindness and for always being ready to help; to Carlos Olarte for the help, friendship and good moments shared together (I will never forget that it was Carlos who took me on my first walk in Paris, and introduced me to the Eiffel Tower); to Sophia Knight for the shared laughs, food, jokes and complaints that make our love-hate friendship unique; and to Justin Dean, Sophia's husband, who is a remarkably kind and smart guy with an interesting view of life. Thanks to Dale Miller, for always having wise advice to offer when I needed it, and thanks as well to Catuscia and Dale's kids, Nadia and Alexis, for the good moments shared. I am also grateful to Christele Braun, Ehab El Salamouny, Jéremy Dubrueil, Jesus Aranda, Lili Xu, Luis Pino, Marco Giunti, Marco Stronati, Nicolás Bordenabe, Raluca Diaconu, Romain Beauxis, and Sylvain Pradalier, who, even if I did not have the opportunity to work with them directly, helped make LIX such a great environment.

I would like to thank my co-authors, with whom I have had the opportunity not only to cooperate scientifically, but also to create friendships. Thanks to Miguel E. Andrés for our fruitful collaboration, the constant good mood, and the always stimulating joke-fights. Thanks to Konstantinos (Kostas) Chatzikokolakis for all the work we developed together, the enlightening discussions about so many subjects, and the good moments shared. Thanks to Pierpaolo Degano, with whom I had the pleasure of collaborating and learning [...]

The team of administrative support at LIX was also fundamental for my work. I would like to thank Marie-Jeanne Gaffard for her remarkable competence and dedication, which have frequently saved me from a great deal of trouble. Her professional behavior is a model to be followed, and I wish I could encounter people like her everywhere I will ever work. I am also grateful to Valérie Lecomte, for the countless times she helped me, even when it was not her duty, always with the characteristic competence and sympathy. I cannot forget Corinne Poulain, who guided me through the endless administrative maze when I arrived in France. Thanks also to James Régis for the technical support; and to Isabelle Biercewicz and Lydie Fontaine for the assistance in my first years at LIX. I would also like to say a couple of words about Ryna Lam Pech, whose cheerful smile and always good mood made each coffee time in the cafeteria an even more enjoyable moment.

I am also grateful to the experienced scientists who have shared part of their vast knowledge with me, either in conferences, workshops or informal meetings, and reinforced my view that people in academia are not only brilliant, but usually good human beings as well. Thanks especially to Geoffrey Smith for sharing his expertise with me in so many insightful conversations, and for organizing the exciting workshop on information flow at Florida International University. I will not forget the hospitality he, his wife Elena, his sons Daniel and David, and the adorable Yoshi offered in Miami. Thanks also to Prakash Panangaden for the lectures at the SFM-10:QAPL summer school in Bertinoro, and also for the opportunity to participate in the workshop on quantum and classical information flow at the Bellairs Research Institute.

I cannot proceed without mentioning all the amazing friends I have in Brazil, who were fundamental in the background that brought me here. Even being far away, they are constantly in my mind, and I always count down the days to the next time I will see them again. Thanks to Aline Miranda, whom I have had the privilege of knowing and whose friendship I enjoy very much; to Aline Resende, an incredible friend, on whom I know I can always count at any time of day or night, and with whom I have had some of the most joyful and memorable moments of all my life; to Anísio Lacerda, the talented and sensible guy whom I always enjoyed talking to about any subject (serious or not); to Deznie Lopes, who always has a smile to offer; to Katia Lage, the sweet and kind friend who is always there to help others; to Lara Coelho, the funny and practical girl, whose visit to Paris was one of the highlights of my time in the city; and to Marina Cruz, my childhood friend, the one I have known for the longest in my life and whose love always warms up my heart. I am also deeply grateful to Adriani Quatrini, who played such an important role of support and understanding during one of the darkest moments in my last three years; and to Giselle Moura, who has cared so much for me and was the main force driving the process that literally changed my face and, therefore, my life (for much better).

[...] arrived in Europe I did not know a single person on this side of the Atlantic Ocean. It was a big turning point in my life, and I am so glad that I decided to come, for these three years in Paris were not only a period of professional growth, but also of incredible personal learning. I have had the pleasure of meeting here some of the most remarkable human beings I have ever met, both at the professional and personal levels. In particular, our sweet, sweet Maunoury, the building shared as home by so many foreign students at École Polytechnique, has been the stage of countless adventures, memorable moments, and deep learning. Without the companionship of the people I met there, I would not have been able to enjoy my stay in France as much, and therefore my work would not have been as productive. I would like to thank each and every one of the people I met in Maunoury for the friendship that has marked me so deeply. Also, I want to thank each one for particular things that I will keep in my memory forever. Thanks to Saddaf Shabbir for all the philosophical discussions by the lake during summer (or until late night otherwise), that have enlightened me so much in so many subjects; to Andreas Engelhardt for the constant companionship and mutual understanding which have so many times lightened the weight of being abroad; to Nadia Vertti for the happiness and cheerfulness that could always make me smile at any time; to Keesjan de Vries for all the awesome trips shared together (Do you wanna know why? Well...); to Ricardo Kawahara for sharing the fun of nights out, and also the frustration of the way back home by the Noctilien 122; to Michał Zydor for the uncountable movies seen together in Paris; to Fabien Immler for being my German little brother; to Alex Rinke for the hospitality during the winter holidays in Berlin in 2009/2010; to Oliver Valencia for the fun moments at Bbar; to Kalle Backlund, Anna Folke Larsen, and Uli Schneider for all the unforgettable evenings at their place in Rue Guisarde and at Chez Georges; to Steffen Lohrey and Marie Le Mouel for the nice evenings watching Audrey Hepburn movies in my room; to Chiara Altomare, Manuele Aufiero, Paolo Carozzo and Lorenzo Sponza (the Italian mafia) for the constant cheerfulness in our beloved international kitchen; to Benjamin Mosk for the energy to never say no to a night out dancing; to Maria Rosario (Charo) Mestre for the company not only in Paris, but also in Frankfurt; to Álvaro Izquierdo for the constant company in the gym, and the fun trips together; to Amy Gilson, Anton Karrman, Davi Vasconcellos, Leland Ellison, Lysandra Alves, and Michael Martin for the unforgettable Summer of 2009; to Citlali Cabrera for her kindness in every moment, and the nice dinners she offered to me; to Igor Reshetnyak for always being ready to help in anything; to Théo Touvet for the rare example of confident and unique life choices; to Tomás Lungenstrass for the constant smile and good mood; and to François Wirion and Julia Duras for the first moments shared in the doctoral program. Thanks also to Alex Lang, Alfredo Parra, Daniel Ruiz, Federico Cárdenas, Benjamin Uekermann, Fredrik Hallgren, Henri de Belsunce, Herbert Mangesius, Ivan [...] Chojecki, Sara Rome, and Seydou Traoré for all the unforgettable moments. I cannot forget Hannah Schneider and Sofia Karlsson, who have not lived in Maunoury but are part of the family, and I would like to thank them for the friendship and hospitality when I visited both Cologne and Stockholm.

It was not only on campus, however, that I met friends. Among the many amazing people I met in Paris, and all over the world, are Alexandra Silva, good company in several conferences and summer schools, whom I hope to meet often, both as a friend and as a colleague; Diogo Arbigaus, the kind and good friend who, even though he is Brazilian, I have met only in Paris; Maria Poulaki, whose refreshing company and kindness always make me feel good; Nicolás Lopez and all the Spanish crowd, whose parties in Rue Soufflot will be always in my memory; and Izabel Rezende, a family member away from home, who was an essential and kind support during my stay in France.

I often say that we do not have much control over our lives, and that the best we can do is to try to be prepared enough to catch a good opportunity when it shows up. Today I can look back and be glad to say that I caught at least two life-time opportunities in the past three years. The first one was on the 1st of October 2008, when I landed in Paris to start my doctoral program at École Polytechnique. The second one was on the 19th of March 2010, when I met Trevor Ray Tisler. Meeting him was a turning point in my life, and his emotional support has paved the road so I could work with a lighter spirit. I am grateful for the patience with which he has revised my English writing so many times, the dedication he has shown to me even being overseas for over a year now, and for his love, support and presence in my life.

Finally, I would like to thank my family, of whom I am so proud, for the love and support during my whole life, and especially during the challenges these past three years have imposed on me. Thanks to my mother, Maria Angélica, who has always been a model human being for me, as a strong yet sweet woman, and who gives me strength in hard moments and shares my joy in the good ones; to my brother Marco Antônio, who has set an example for me with his dedication, ethical behavior and kindness that are a constant in everything he does; to my brother Marcus Vinícius, whose particular sense of humor and tough behavior are not enough to hide a kind heart and a person one can always count on; to my step-father Mario Montoya, who is a remarkable human being, and who has given me more support, understanding and love than my biological father has ever done; to my sisters-in-law Luciana Salomão and Débora Pires, for being like real sisters, and for the countless joyful moments shared; and to my cousin Adriana de Lima, for always being by my side and supporting me.

I apologize to the several people that played an important role in my way and who have not found their name mentioned here: I am sorry if my memory played a trick on me.

Mário S. Alvim

In this thesis we consider the problem of information hiding in the scenarios of interactive systems, statistical disclosure control, and refinement of specifications. We apply quantitative approaches to information flow in the first two cases, and we propose improvements for the usual solutions based on process equivalences for the third case.

In the first scenario we consider the problem of defining the information leakage in interactive systems where secrets and observables can alternate during the computation and influence each other. We show that the information-theoretic approach which interprets such systems as (simple) noisy channels is not valid. The principle can be recovered, however, if we consider channels of a more complicated kind, that in information theory are known as channels with memory and feedback. We show that there is a complete correspondence between interactive systems and these channels, and we propose the use of directed information from input to output as the real measure of leakage in interactive systems. We also show that our model is a proper extension of the classical one, i.e. in the absence of interactivity the model of channels with memory and feedback collapses into the model of memoryless channels without feedback.

In the second scenario we consider the problem of statistical disclosure control, which concerns how to reveal accurate statistics about a set of respondents while preserving the privacy of individuals. We focus on the concept of differential privacy, a notion that has become very popular in the database community. Roughly, the idea is that a randomized query mechanism provides sufficient privacy protection if the ratio between the probabilities that two adjacent datasets give a certain answer is bounded by a constant. We observe the similarity of this goal with the main concern in the field of information flow, namely limiting the possibility of inferring the secret information from the observables. We show how to model the query system in terms of an information-theoretic channel, and we compare the notion of differential privacy with that of min-entropy leakage. We show that differential privacy implies a bound on the min-entropy leakage, and we also consider the utility of the randomization mechanism, which represents how close the randomized answers are, on average, to the real ones. Finally we show that the notion of differential privacy implies a tight bound on utility, and we propose a method that under certain conditions builds an optimal randomization mechanism.

Moving the focus away from quantitative approaches, in the third scenario we address the problem of using process equivalences to characterize information-hiding properties (for instance secrecy, anonymity and non-interference). In the literature, some works have used this approach, based on the principle that a protocol $P$ with a variable [...]. We show that, in the presence of nondeterminism, the above principle may rely on the assumption that the scheduler works for the benefit of the protocol, and this is usually not a safe [...] refining a specification into an implementation, since usually the former is more abstract than the latter, and the refinement process involves reducing the nondeterminism. The scheduler is, in this sense, a final product of the refinement process, after all the nondeterminism is ruled out. We present a formalism in which we can specify admissible schedulers and, correspondingly, safe versions of complete-trace equivalence and bisimulation. We prove that safe bisimulation is still a congruence. Finally, we show that safe equivalences can be used to establish information-hiding [...]

Introduction

There are two mistakes one can make along the road to truth: not going all the way, and not starting.

Gautama Siddharta

1.1 Information hiding

In the last few decades the amount of information flowing through computational systems has increased dramatically. Never before in history has a society been so dependent on such a huge amount of information being generated, transmitted and processed. It is expected that this solid trend of increase will continue in the near future, if not virtually indefinitely, reinforcing the need for efficient and safe ways to cope with this reality.

Although the efficient and broad dissemination of information is a goal in many situations, there are instances where the disclosure of information is undesirable or even unacceptable. The field of information hiding concerns the problem of guaranteeing that part of the information relative to an event is kept secret. In computer science, the term information hiding encompasses a large spectrum of fields. Different fields have distinct historical motivations, and the resulting research in each followed a unique path. The variation of the subfields of information hiding depends on three main factors: (i) what one wants to keep secret; (ii) from which adversary or attacker one wants to keep it secret; and (iii) how powerful the adversary or attacker is.

The field of confidentiality (or secrecy) refers to the problem of keeping an action secret. One application of confidentiality is cryptographic protocols, where the sender and the receiver of a message can be known, but the contents of the message itself are considered to be sensitive information. Generally, we can say that confidentiality concerns data, while the field of privacy [...] be interested in protecting the information about someone (a credit card number, for instance) or the person's identity itself. Anonymity is the field that concerns the protection of the identities of agents involved in events. In principle, anonymity can be related to both the active agent (often the sender of a message), or to the passive agent (often the receiver of a message). For instance, in the case of a journalist receiving information from a confidential source, the identity of the sender is intended to be secret. As for the case of an intelligence agency sending a coded message to a spy, the identity of the receiver is confidential information. There is yet another kind of anonymity, sometimes referred to as unlinkability, where the identity of agents and actions performed are public information, but the linkage between agents and the actions performed should not be determined. One example of unlinkability is a confidential voting system, where both the voters and the final vote count are in the public domain, but the relationship between the voters' identities and the ballots cast is protected.

One application of privacy that has drawn a lot of attention in recent years is the problem of statistical databases. A statistic is a quantity computed from a sample, and the goal of statistical disclosure control is to enable the user of the database to learn properties of the population as a whole, while maintaining the privacy of individuals in the sample. The field of statistical databases highlights the delicate equilibrium between the benefits and the drawbacks of the spread of information. A practical example occurs in medical research, where it is desirable that a great number of individuals agree to give their personal medical information. With the information acquired, researchers or public authorities can calculate a series of statistics from the sample (such as the average age of people with a particular condition) and decide, say, how much money the health care system should spend next year in the treatment of a specific disease. It is in the interest of each individual, however, that her participation in the sample will not harm her privacy. In our example, the individuals usually do not want to have disclosed their specific status with relation to a given disease, not even to the users querying the database. Some studies, e.g. [Joi01], suggest that when individuals are guaranteed anonymity and privacy they tend to be more cooperative in giving personal information.

Another important field of information hiding is information flow, which concerns the leakage of classified information via public outputs in programs and systems. Consider a system that asks its users for a password to grant them access to some functionality. Naturally, the password itself is intended to be secret; however, an attacker trying to guess it will always get an observable reaction from the system, whether the response is an acceptance or a rejection of the entered code. In either case, the observable behavior of the system reveals some information about the password, because even if it is not guessed correctly, at least the search space is narrowed (even if, in this case, only slightly).
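The narrowing of the search space can be made concrete with a small sketch. It assumes a hypothetical space of 10,000 equally likely four-digit codes and an attacker who observes only acceptance or rejection; the function names, the uniform prior, and the guessed code are illustrative assumptions, not part of the thesis.

```python
import math


def remaining_candidates(candidates, guess, accepted):
    """Return the secrets still consistent with the observed reaction."""
    candidates = set(candidates)
    if accepted:
        # An acceptance pins the secret down to the guess itself.
        return candidates & {guess}
    # A rejection only rules out the single guessed value.
    return candidates - {guess}


def uncertainty_bits(candidates):
    """Shannon entropy (in bits) of a uniform distribution over candidates."""
    return math.log2(len(candidates)) if candidates else 0.0


# Hypothetical search space: all four-digit codes, assumed equally likely.
pins = {f"{n:04d}" for n in range(10_000)}

after_reject = remaining_candidates(pins, "1234", accepted=False)

# Uncertainty removed by this one rejection: small, but strictly positive.
leak_bits = uncertainty_bits(pins) - uncertainty_bits(after_reject)
print(f"candidates left: {len(after_reject)}, leaked: {leak_bits:.6f} bits")
```

Under these assumptions a single rejection removes only a tiny fraction of a bit, which matches the remark that the search space is narrowed only slightly, whereas an acceptance removes all remaining uncertainty.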

[...] mutually exclusive. In a system where public outputs can reveal the identity of agents, for instance, both the problems of information flow and of anonymity are present. The classification is usually based more on the contextual motivation for the problem than on a rigid taxonomy of subfields. In fact, in recent years there has been an active line of research exploring the similarities between problems such as the foundations of anonymity and information flow, and also privacy and information flow. The result has been an increasing convergence between these fields. In this thesis we explore the similarities between information flow, statistical databases, and anonymity.

In a broader context, the importance of information hiding goes far beyond the realm of computer science, and there are a lot of subtle questions that need to be considered carefully. From a political and even philosophical perspective, the unrestricted use of privacy protection can be controversial. Even though it is broadly accepted that people should have the right to exchange e-mails privately, to vote in democratic elections anonymously, and to express their ideas on the Internet freely, there are situations where information protection policies can be argued to have serious drawbacks. The same mechanism that grants a political activist anonymity and free speech on the Internet, while living under a repressive government, also grants a pedophile anonymity to broadcast harmful material. This balance between freedom and control in the virtual media has been the subject of passionate discussion. Independently of whether one's goal is to maximize or to minimize the degree of information protection in a given situation, it is anyway desirable to measure the extent to which the information is protected, to define which specific definition of protection the information falls under, and from whom the information is protected.

In this thesis we avoid the controversy of deciding in which cases the application and extent of information hiding methods are justifiable. Rather, our focus is on measuring the degree of information protection offered by a system, thus making evaluation and comparison of different systems possible. Specifically, we are interested in using concepts of information theory to quantify the leakage of information.

1.2 Qualitative and quantitative approaches to information hiding: a brief history

Historically, the research on information hiding has evolved from the simple but imprecise qualitative approach toward the more refined, but at the same time more complex, quantitative approach. In the following sections we will briefly overview both. We do not intend to provide here an exhaustive study of the subject, but rather to highlight some of the most important contributions [...]

1.2.1 The qualitative approach

The qualitative approach emerged first in the literature of information hiding. The central idea is that, by observing the output of a system, the adversary cannot be completely sure of what the secret information is. The principle of confusion says that for every observable output generated by a secret input, there is another secret that could also have generated the same output. In anonymity, for instance, this corresponds to the concept of possible innocence, i.e. the impossibility of identifying the culprit with certainty by only observing the system's output. The principle of confusion does not take into consideration the adversary's certainty about the value of the secret: it is enough that there be an alternative hypothesis, no matter how unlikely it is. This is also known as the possibilistic approach.

One of the first developments in this field dates from 1976, when Bell and La Padula defined the model of multilevel security systems [BLP76]. In this model the components of a system are classified as either subjects, i.e. active entities such as users or processes, or as objects, i.e. passive entities such as files. The subjects are divided into trusted and untrusted entities, and the authors define restrictions on how to manage untrusted objects. The rule "no read up or write down" states that untrusted entities can read only from objects of the same or lower levels, and that they can only write into objects of the same or higher levels. This model was developed to support different levels of security, and aimed to ensure that information only flows from lower to higher levels and never in the opposite direction. Each input into and output from the system is labeled with a security level. Any pair of an input and its corresponding output is called an event. A view of a security level $l$ corresponds to the events at level $l$ or lower, and all the events of a higher level are hidden to level $l$.

Usually in this model only two levels are distinguished: high and low. The high level corresponds to sensitive information, which should only be available to some users with special privileges, while the low level corresponds to public information accessible to everyone. The goal of secure information flow analysis is, in this context, to avoid leakage from the high level to the low level.

Bell and La Padula's model, however, did not address the problem of leakage of information due to covert channels. A covert channel is a way of transmitting information from the high to the low environment by means not designed or intended for this purpose. Consider, for instance, a system where a [...] can be established as follows. The low user sends a file to the high user, who then uses her power of deciding whether to grant or to deny $\ell$ further access to it to encode a message. In a later stage, $\ell$ tries to write in the file, and an access failure can be interpreted as the bit $0$, while a success can be interpreted as the bit $1$. In this way any message can eventually be sent through the covert channel from the corrupted high user to the low one.
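The file-permission covert channel just described can be simulated in a few lines. This is only an illustrative sketch: the function names and the ASCII bit encoding are assumptions made here, not part of the original model, and the "file system" is reduced to a single boolean permission per round.

```python
def send_covertly(message_bits):
    """Simulate the file-permission covert channel, one bit per round.

    The high user grants (bit 1) or denies (bit 0) the low user write
    access to the shared file. The low user then attempts a write and
    decodes a success as 1 and an access failure as 0.
    """
    received = []
    for bit in message_bits:
        access_granted = (bit == 1)        # high user encodes the bit in the permission
        write_succeeded = access_granted   # low user's write works iff access was granted
        received.append(1 if write_succeeded else 0)
    return received


def text_to_bits(text):
    """Hypothetical helper: ASCII text to a flat list of bits (8 per character)."""
    return [int(b) for byte in text.encode("ascii") for b in format(byte, "08b")]


def bits_to_text(bits):
    """Hypothetical helper: inverse of text_to_bits."""
    chunks = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    return bytes(int("".join(map(str, c)), 2) for c in chunks).decode("ascii")


decoded = bits_to_text(send_covertly(text_to_bits("secret")))
```

No explicit high-to-low data flow occurs in this sketch; the message travels only through the success or failure of an operation that the access-control rules permit.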

To cope with the threat of covert channels, Goguen and Meseguer developed the concept of noninterference [GM82]. A system is noninterfering when the actions of high users do not alter what can be seen by low users. In other words, the low outputs of the system will only reflect the values of the low inputs, independently of what the high inputs are (if any). The authors proposed a model of noninterference that separated the system from the security policies. Their model, nevertheless, was only appropriate for deterministic systems.

Noninterference, however, may be a too restrictive concept for several practical applications. It does not allow, for instance, the summarization of data. It is often the case that a system allows statistical (or summarizing) functions (e.g. mean, total number) to be calculated on its high inputs and then disclosed to low users, even if the high inputs themselves are supposed to be kept secret. These systems are typical in the area of statistical databases, and we will discuss this issue in more detail in Section 1.3.2. Clearly, a system that allows the summarization of high data for the low environment violates noninterference, since a change on the high input may affect the low output.

Considering this problem, in 1986 Sutherland [D.S86] proposed the concept of nondeducibility on inputs, which focuses not on whether the output is affected according to a change in the input, but on whether it is possible to deduce the input from the output. Under this definition, a system may allow summarization of data and still be secure, since the output of a statistical function does not necessarily allow the adversary to deduce what the inputs are. One drawback of the concept of nondeducibility on inputs is that it assumes that the strongest form of the principle of confusion is enough to ensure security. Notably, it relies on the assumption that no high value can be ruled out after observing a low value. This is not a strong enough security guarantee in many real systems. In some cases, even if no high value can be ruled out as a possibility, a single value (or a small set of values) can be much more likely than the others, and in practice it makes little sense to consider the alternatives. This criticism can be seen as an early attempt to consider a quantitative approach for information flow, where it is taken into consideration how much an attacker learns (or does not learn) about the secret matters.

Another important issue in security systems is the problem of compositionality. In [McC87], McCullough pointed out the importance of hook-up security, i.e. the compositionality of multi-user systems. Usually, real systems are far too complex to be analyzed as a whole, especially because the task of designing and implementing a system is normally divided between teams. Each team is responsible for a number of components that, in a later stage, will be put to work together. It is highly desirable that security properties [...] that the final composite system is also secure. McCullough showed that the concepts of multilevel security systems, noninterference, and nondeducibility on inputs are not composable. As a replacement, he proposed the concept of restrictiveness, according to which no high level information should affect the behavior of the system, as seen by a low user.

In [WJ90] Wittbold and Johnson addressed the question of nondeducibility on inputs under a different perspective, showing that it is not a guarantee of absence of leakage. Consider the following algorithm, where $H$ and $L$ stand for the high and the low environments, respectively. Here we assume the variables $x$ and $y$ are binary, and the randomized command $x \leftarrow 0 \oplus_{0.5} 1$ assigns to $x$ either the value $0$ or the value $1$ with $0.5$ probability each.

    while true do
        x ← 0 ⊕_{0.5} 1;
        output x to H;
        input y from H;
        output (x XOR y) to L
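The loop above can be simulated to see how a cooperating high environment might exploit it. The sketch below is an illustration under stated assumptions: the strategy "answer $y = x \oplus m$ to send the bit $m$" and all function names are introduced here for the example. Since $x \oplus (x \oplus m) = m$, the low environment receives exactly $m$ in every round.

```python
import random


def run_round(high_strategy, rng):
    """One iteration of the loop; returns the bit that L observes."""
    x = rng.randint(0, 1)     # x <- 0 (+)_{0.5} 1: a fair random bit
    y = high_strategy(x)      # "output x to H; input y from H"
    return x ^ y              # "output (x XOR y) to L"


def transmit(message_bits, seed=0):
    """Assumed strategy for H: reply y = x XOR m in order to send bit m."""
    rng = random.Random(seed)
    return [run_round(lambda x, m=m: x ^ m, rng) for m in message_bits]


received = transmit([1, 0, 1, 1, 0])
```

If H instead chooses $y$ without looking at $x$, the value seen by L is a fair coin flip regardless of $y$, which is why a single observation by L does not rule out any particular high input.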

reality in many practical situations. In many cases some information leakage is tolerable or even intentional. Take an election protocol. After the final vote count is released, there are fewer possible hypotheses concerning who voted for whom than the hypotheses available before the votes were cast. In this example there is a natural leakage of information, since the uncertainty about the sensitive information decreases after the observation of the protocol's output. This leakage occurs, however, as a necessary functionality of the protocol.

In fact, in most real systems noninterference cannot be achieved, as typical systems will always leak some information. This does not mean, however, that [...] varies from system to system. Therefore it is important to quantify how much leakage a system allows. Quantitative methods are useful to evaluate the extent to which a system is secure, and to compare it to other systems.

One of the first attempts to quantify information leakage was made by [...] $\ell_{s'}$ is the low information in $s'$. Her definition of leakage was:

$$M_1 = H(h_s \mid \ell_s) - H(h_s \mid \ell_{s'})$$

If the quantity $M_1$ is positive, then it is considered to be the leakage of information. This measure of leakage, however, does not consider the history of low inputs, a problem pointed out by Clark, Hunt and Malacaria in [CHM07]. Without the history one cannot summate the increase in knowledge (or decrease in uncertainty) that accumulates between the low states $s$ and $s'$. They proposed, instead, the following measure of leakage:

$$M_2 = H(h_s \mid \ell_s) - H(h_s \mid \ell_{s'}, \ell_s)$$

Since $H(X \mid Y, Z) \leq H(X \mid Y)$ for all random variables $X$, $Y$ and $Z$, we have $M_1 \leq M_2$. The quantity $M_2$ corresponds to the Shannon conditional mutual information $I(h_s; \ell_{s'} \mid \ell_s)$.
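To see the difference between the two measures numerically, the following sketch computes $M_1$ and $M_2$ from a joint distribution over triples $(h_s, \ell_s, \ell_{s'})$. The toy system is an assumption chosen for illustration: $h_s$ and $\ell_s$ are independent fair bits and $\ell_{s'} = h_s \oplus \ell_s$. The helper names are likewise hypothetical.

```python
from collections import defaultdict
from math import log2


def marginal(joint, idx):
    """Marginal distribution of the coordinates listed in idx."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idx)] += p
    return out


def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def cond_entropy(joint, target, given):
    """H(target | given) = H(target, given) - H(given)."""
    return entropy(marginal(joint, target + given)) - entropy(marginal(joint, given))


def leakage_m1(joint):
    # M1 = H(h_s | l_s) - H(h_s | l_s')
    return cond_entropy(joint, (0,), (1,)) - cond_entropy(joint, (0,), (2,))


def leakage_m2(joint):
    # M2 = H(h_s | l_s) - H(h_s | l_s', l_s)
    return cond_entropy(joint, (0,), (1,)) - cond_entropy(joint, (0,), (2, 1))


# Outcomes are (h_s, l_s, l_s') with l_s' = h_s XOR l_s, each with probability 1/4.
joint = {(h, l, h ^ l): 0.25 for h in (0, 1) for l in (0, 1)}
m1, m2 = leakage_m1(joint), leakage_m2(joint)
```

In this toy system $\ell_{s'}$ alone is independent of $h_s$, so $M_1$ evaluates to 0 bits, while remembering $\ell_s$ reveals $h_s$ completely and $M_2$ evaluates to 1 bit, in line with $M_1 \leq M_2$.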

In 1987, Millen made a formal connection between information flow and [...] is zero. In other words, noninterference is a sufficient condition for absence of information flow.

In 1990, Massey gave an important contribution to the field of information theory, which influenced the further development of quantitative information flow. In [Mas90] he showed that the usual definition of discrete memoryless (i.e. history-independent) channels used at that time in fact did not take into account the possibility for the use of feedback. He highlighted the conceptual difference between causality and statistical dependence, and presented an accurate mathematical description of discrete memoryless channels that allowed feedback. Then he introduced the concept of directed information, which captures the idea of causality between the input and the output of a channel, and argued that in the presence of feedback, directed information is a more appropriate measure of the flow of information from input to output than mutual information.

1 The concepts of entropy, conditional entropy and mutual information will be defined formally in Chapter 3. For the moment it is enough to know that entropy is a measure of the uncertainty of a random variable; conditional entropy is a measure of the uncertainty of one random variable given another random variable; and mutual information is a measure [...]

In the same year, McLean also considered the concept of time in the description of systems by proposing his Flow Model [McL90]. According to this model, there is a flow of information only when a high user $H$ assigns values to objects in a state that precedes the state in which a low user $L$ makes her assignment. In this situation only part of the correlation between high and low information is considered as leakage. This addressed the problem of causality, but this model was too general, and relatively difficult to apply.

In [Gra91] Gray worked on bridging the gap between the overly complicated Flow Model and the more practical, yet restricted, approach of Millen. Gray used a general-purpose probabilistic (as opposed to nondeterministic) state machine that resembled Millen's model. In Gray's model, the value $T(s, I, s', O)$ represents the probability of a given state $s$ evolving into another state $s'$, under the input $I$, and producing output $O$. The channels are partitioned into two sets, $H$ and $L$, representing the channels connected to high and low processes, respectively. The high and the low environments can communicate only through their interactions with the system, as no other form of communication between them is allowed. Gray wanted to take time and causality into consideration in his definition of leakage, and he did so by allowing feedback and memory in his model. His formulation of a security guarantee was the following:

$$P(L_I \cap L_O \cap H_I \cap H_O) > 0 \Longrightarrow P(\ell \mid L_I \cap L_O \cap H_I \cap H_O) = P(\ell \mid L_I \cap L_O) \qquad (1.1)$$

where $L_I$ and $L_O$ represent the history of low inputs and outputs, respectively, and $H_I$ and $H_O$ represent the history of high inputs and outputs, respectively. The symbol $\ell$ represents the final output event on channels in the low environment. The formulation (1.1) states that the probability of a low output may depend on the previous history of the low environment, but not on the previous history of the high environment.

Gray also tried to generalize the concept of capacity to the case of channels with memory and feedback. He provided a formula expressing the flow of information from the whole history of inputs and outputs (during the time period $0 \ldots t-1$) to the low output (at time $t$), and conjectured that the capacity of the channel would be:

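$$C \stackrel{\mathrm{def}}{=} \lim_{n \to \infty} C_n$$

where

$$C_n \stackrel{\mathrm{def}}{=} \max_{H,L} \frac{1}{n} \sum_{i=1}^{n} [\ldots];$$

and $\mathit{In\_Seq\_Event}_{A,t}$ is the input history at channel $A$ (where $A$ stands for $L$ or $H$) up to time $t-1$, $\mathit{Out\_Seq\_Event}_{A,t}$ is the output history at channel $A$ up to time $t-1$, and $\mathit{Final\_Out\_Event}_{L,t}$ is the low output event at time $t$. Gray showed that the absence of information flow implies that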

. Grayshowed that the absen e of information ow impliesthat apa ityas formulatedin(1.2)iszero. Healso onje turedthatthisdenitionof apa ity

would orrespond to thenotion of maximum transmission rate supported by

the hannel. As pointed out in [AAP11℄, however, the problem with Gray's

onje ture is the following. For an output at time

t

,the only ausal relation onsideredistheonewiththehistoryofinputsuptotime

t−1

,whiletheee t that theinputat time

t

itselfmayhave ontheoutputisignored. Inthis way, (1.2) doesnotexpress the omplete ausal relationbetweeninputandoutput.

The orre tnotionof apa ityinthepresen eofmemoryandfeedba k, whi h

orrespondstothe maximumtransmissionratefor the hannel, wasproposed

in 2009 by Tatikonda and Mitter [TM09℄, and it willbe dis ussed later on in

Chapter 4.

A similar formal approach, although with different motivations, was presented by McIver and Morgan in [MM03]. They focused on the problem of preserving security guarantees while refining specifications into implementations. The authors used an equation similar to (1.3), but in the context of sequential programming languages enriched with probabilities. Their aim was to protect the high values during the whole execution of the program, instead of the initial high values only. In other words, they wanted to assure that if the high information is not known by the low environment at the beginning of the computation, then it cannot be inferred at any later stage. They proved that, for deterministic programs, if the final values of the high objects are protected, then the initial values are protected as well. McIver and Morgan also defined the concept of information escape as:

$$H(h \mid \ell) - H(h' \mid \ell')$$

where $H(h \mid \ell)$ represents the uncertainty (conditional entropy) of the high information given the low information at the beginning of the computation, and $H(h' \mid \ell')$ represents the same uncertainty at the end of the computation. They defined the channel capacity as the least upper bound of information escape over all possible input distributions. In this context, a system is considered secure if it has capacity equal to zero. One advantage of this model is that it is not necessary to keep track of the whole history of the computation, but on the other hand it can be applied only in scenarios where the adversary does [...]

In Chapter 3 we will take up again the discussion of quantitative approaches to information flow based on information theory. For the moment we will focus on some topics related to information hiding that are of special relevance for this thesis.

1.3 Case studies of information hiding

In this section we present three case studies of information hiding that we address in this thesis.

1. The case of quantitative information flow, i.e. how much about the secret information an adversary can learn by observing the system's output, and by knowing how the system works. We give special attention to the broadly studied problem of anonymity, which can be seen as a particular case of the more general problem of information flow where the secret information is the identity of the agents.

2. The question of statistical disclosure control, which concerns the problem of allowing users of a database to obtain meaningful answers to statistical queries, while protecting the privacy of the individuals participating in the database. We focus on differential privacy, an approach to this problem that has drawn a lot of attention in recent years.

3. The problem of preserving security guarantees while deriving implementations from specifications. Usually specifications are more abstract than implementations, i.e. they present more nondeterminism. The task of implementing a system reduces the nondeterminism of the specification, and if it is not done carefully, an implementation may rule out possibilities allowed by the specification that are essential for the security guarantees.

1.3.1 Quantitative information flow and anonymity

Anonymity is one of the most studied subjects of information hiding. The research in this area has been active in the past several years, and the advances made can be extended to the more general scenario of information flow. As briefly introduced in Section 1.1, anonymity concerns the protection of the identities of the agents involved in the events.

With the advent of the Internet, the protection of anonymity has become an issue in the daily life of millions of people around the world. The importance of anonymity is even more evident concerning the protection of freedom of speech, a situation that is particularly delicate in countries under repressive regimes.

Pfitzmann, Dresden and Hansen [PDH08] have proposed a standard terminology for anonymity concepts. In their work there are three different notions [...]

• Sender anonymity: when the identity of the originator should be protected;

• Receiver anonymity: when the identity of the recipient should be protected;

• Unlinkability: when it might be known that an agent $A$ originated a message and an agent $B$ received a message, yet it should not be known whether the message sent by $A$ was actually the one received by $B$.

Reiter and Rubin also gave a classification of the types of adversary in an anonymity system in [RR98], where they also proposed the anonymity protocol Crowds (see Section 1.3.1). In their work, they considered that the adversary can be an eavesdropper simply observing the traffic of messages on the network, or she can be an active attacker (i.e. a collaboration between senders, between receivers, or between others taking part in the system), or even a combination of the previous two types. The authors also defined a hierarchy of anonymity degrees that a system can provide. In decreasing order of strength, the proposed scale is listed below. In this list, let $s, s'$ denote secrets and $o$ an observable, i.e. a particular action or output of the system that is distinguishable from the point of view of the attacker.

Strong anonymity: From the attacker's point of view, the observables produced by the system do not increase her knowledge about the secret information, i.e. the identity of the individual involved in an event. Chaum also described the concept of strong anonymity in his work on the Dining Cryptographers protocol [Cha88]. It represents the ideal situation where the execution of the protocol does not give to the adversary any extra information about the secrets. The concept was formalized as follows.

$$\forall s, o \quad p(s \mid o) = p(s) \qquad (1.4)$$

This definition is the equivalent of probabilistic noninterference. In [CP06], Chatzikokolakis and Palamidessi showed that the condition expressed by (1.4) is equivalent to:

$$\forall s, s', o \quad p(o \mid s) = p(o \mid s') \qquad (1.5)$$

i.e. the probability of the system producing an observable is the same, no matter what the secret information is. This definition is known as equality of likelihoods and is advantageous as it does not depend on the probability distribution on secrets.

Another definition of strong anonymity, more restrictive, was proposed by Halpern and O'Neill [HO03, HP05]. It is equivalent to each of the previous definitions ((1.4) or (1.5)) plus the assumption that the input [...] confidence in her guess about the secret, and defined strong anonymity as:

$$\forall s, s', o \quad p(s \mid o) = p(s' \mid o) \qquad (1.6)$$

The formulation (1.6) is also known as conditional anonymity and corresponds to the level of anonymity called beyond suspicion in Reiter and Rubin's classification.

Beyond suspicion: From the attacker's point of view, an agent is no more likely to be the culprit than any other agent in the system. It can be formalized as in (1.6).

Probable innocence: From the attacker's point of view, an agent does not appear more likely to be involved in an event than not to be involved. Formally:

$$\forall s, o \quad p(s \mid o) \leq 0.5 \qquad (1.7)$$

The formulation (1.7), however, is not broadly accepted as the definition of probable innocence. In [CP06], Chatzikokolakis and Palamidessi showed that the property that Reiter and Rubin indeed proved for the Crowds protocol in [RR98] was:

$$\forall s, o \quad p(o \mid s) \leq 0.5 \qquad (1.8)$$

Possible innocence: From the attacker's point of view, there is always a non-zero probability that the agent involved in the event is someone else. Formally:

$$\forall s, o. \; p(s \mid o) > 0 \Longrightarrow \exists s'. \; p(s' \mid o) > 0$$

The above hierarchy gives a richer classification of the degree of protection offered by a system than would be possible with simpler possibilistic models.
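The conditions (1.4) to (1.7) and possible innocence can be checked mechanically on a small system. The sketch below is an illustration only: it assumes the system is given as a prior `prior[s]` and a channel matrix `channel[s][o]` $= p(o \mid s)$, all function names are hypothetical, and possible innocence is read with $s' \neq s$ (following the phrase "someone else").

```python
EPS = 1e-9  # tolerance for floating-point comparisons


def posteriors(prior, channel):
    """Return {o: [p(s|o) for each s]} for every observable o with p(o) > 0."""
    result = {}
    for o in range(len(channel[0])):
        joint = [prior[s] * channel[s][o] for s in range(len(prior))]
        p_o = sum(joint)
        if p_o > 0:
            result[o] = [j / p_o for j in joint]
    return result


def strong_anonymity(prior, channel):      # (1.4): p(s|o) = p(s)
    return all(abs(post[s] - prior[s]) < EPS
               for post in posteriors(prior, channel).values()
               for s in range(len(prior)))


def equal_likelihoods(channel):            # (1.5): p(o|s) = p(o|s')
    return all(abs(row[o] - channel[0][o]) < EPS
               for row in channel for o in range(len(row)))


def beyond_suspicion(prior, channel):      # (1.6): p(s|o) = p(s'|o)
    return all(abs(p - post[0]) < EPS
               for post in posteriors(prior, channel).values() for p in post)


def probable_innocence(prior, channel):    # (1.7): p(s|o) <= 0.5
    return all(p <= 0.5 + EPS
               for post in posteriors(prior, channel).values() for p in post)


def possible_innocence(prior, channel):
    # Assumed reading with s' != s: no observable singles out one secret.
    return all(sum(p > EPS for p in post) >= 2
               for post in posteriors(prior, channel).values())


# Example: three secrets, two observables, identical rows (no dependence on s).
channel = [[0.3, 0.7], [0.3, 0.7], [0.3, 0.7]]
skewed = [0.5, 0.3, 0.2]
uniform = [1 / 3, 1 / 3, 1 / 3]
identity = [[1.0, 0.0], [0.0, 1.0]]   # the output reveals the secret
```

With the skewed prior, the example channel satisfies (1.4) and (1.5) but not (1.6), since the posterior simply equals the non-uniform prior. With the uniform prior it also satisfies (1.6). The identity channel fails even possible innocence.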

Among the quantitative approaches to anonymity, two are of special interest to us: the ones based on information-theoretic concepts and the ones based on the Bayes risk. In the following sections we give a brief overview of these two approaches. These concepts will be revisited in more detail in Chapter 3.

Anonymity protocols as noisy channels

Information theoretic approaches to anonymity, and more generally to information flow, rely on concepts such as entropy and mutual information to measure the adversary's lack of information about the secret before and after observing the system's output. Typically the system is seen as a noisy channel and the concept of noninterference corresponds to the converse of the channel capacity. There are several works in the literature that have proposed measures of [...] [SD02, DSCP02, ZB05, DPW06]. In [CPP08a] Chatzikokolakis, Palamidessi and Panangaden proposed the concept of conditional capacity to cope with the situation where some leakage of information is intended by the system. Consider again the election protocol example. By design, the final vote counting needs to be announced and it usually increases the attacker's knowledge about the secret. In this situation, the leakage should be calculated modulo the information that is supposed to be disclosed, i.e. the vote count. In this work the authors also proposed methods to calculate the channel capacity exploiting some symmetries present in several practical systems.

Hypothesis testing and Bayes risk

In some real-world scenarios an individual faces the following situation: she is interested in the value of some random variable $A \in \mathcal{A}$ but she has access only to the values of another random variable $O \in \mathcal{O}$. She knows that $A$ and $O$ are correlated by a known conditional probability distribution. This situation occurs in several fields, for instance in medicine (to make a diagnosis, the physician has access to a list of symptoms, but not to the disease itself). The attempt to infer $A$ from $O$ is known as the problem of hypothesis testing. Here we are interested in the use of hypothesis testing in the context of anonymity [...] as known, and then deriving from that and from the knowledge about how the system works, an a posteriori probability distribution after some fact has been observed. It is well known that the best strategy for the adversary is to apply the MAP rule (Maximum A posteriori Probability rule), which, as the name suggests, chooses the hypothesis with the maximum probability for the given observation. Here, by best strategy we mean the one that induces the smallest probability of error in guessing the hypothesis, which in this case corresponds to the Bayes risk.
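The MAP rule and the resulting Bayes risk can be computed directly from a prior and a channel matrix. The following sketch assumes the hypothesis space and the observation space are finite and indexed by integers, with `channel[a][o]` $= p(o \mid a)$. The function names and the example numbers are hypothetical.

```python
from itertools import product


def map_rule(prior, channel):
    """For each observation o, pick the hypothesis a maximizing p(a) * p(o|a)."""
    n_a, n_o = len(prior), len(channel[0])
    return [max(range(n_a), key=lambda a: prior[a] * channel[a][o])
            for o in range(n_o)]


def error_probability(rule, prior, channel):
    """Probability of a wrong guess when rule[o] is guessed upon observing o."""
    return 1.0 - sum(prior[rule[o]] * channel[rule[o]][o] for o in range(len(rule)))


def bayes_risk(prior, channel):
    """Error probability of the MAP rule."""
    return error_probability(map_rule(prior, channel), prior, channel)


def brute_force_min_error(prior, channel):
    """Smallest error over every deterministic decision rule (exponential, demo only)."""
    n_a, n_o = len(prior), len(channel[0])
    return min(error_probability(rule, prior, channel)
               for rule in product(range(n_a), repeat=n_o))


# Assumed example: two hypotheses, two observations.
prior = [0.7, 0.3]
channel = [[0.8, 0.2],
           [0.1, 0.9]]
risk = bayes_risk(prior, channel)
```

For this example the joint probabilities are 0.56 and 0.14 for the first hypothesis and 0.03 and 0.27 for the second, so the MAP rule guesses the first hypothesis on the first observation and the second on the other, for a Bayes risk of 1 - (0.56 + 0.27) = 0.17. The brute-force search confirms that no other deterministic rule does better.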

In [CPP08b] Chatzikokolakis, Palamidessi and Panangaden explored the hypothesis testing approach to anonymity, in a scenario where the adversary has one single try to guess the secret (after exactly one observation). They associated the level of anonymity to the probability of error, i.e. the probability of an attacker making a wrong guess about the secret. In order to consider the worst case scenario and to give upper bounds for the level of anonymity provided, the adversary is assumed to use the MAP rule strategy. In this case, the probability of error corresponds to the Bayes risk, and the degree of protection offered by a protocol corresponds to the Bayes risk associated with [...]

In [Smi07, Smi09] Smith also considered the scenario of one-try attacks and proposed the notion of vulnerability, which takes into consideration the probability that the adversary can guess the secret correctly after observing the behavior of the system only once. Smith proposed the framework of min-entropy leakage, which is closely related to the Bayes risk, but is different as it uses the concept of entropy (more precisely min-entropy) and formalizes leakage in information theoretic terms.
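As a rough illustration of how vulnerability and min-entropy leakage relate to the one-try scenario, the sketch below uses the commonly stated definitions, which this section does not spell out and which are therefore assumptions here: prior vulnerability $V(S) = \max_s p(s)$, posterior vulnerability $V(S \mid O) = \sum_o \max_s p(s)\,p(o \mid s)$, and leakage $\log_2 \big( V(S \mid O) / V(S) \big)$. Function names are hypothetical.

```python
from math import log2


def prior_vulnerability(prior):
    """Probability of guessing the secret in one try before any observation."""
    return max(prior)


def posterior_vulnerability(prior, channel):
    """Probability of guessing the secret in one try after one observation."""
    n_s, n_o = len(prior), len(channel[0])
    return sum(max(prior[s] * channel[s][o] for s in range(n_s))
               for o in range(n_o))


def min_entropy_leakage(prior, channel):
    """Leakage in bits: log2 of the ratio of posterior to prior vulnerability."""
    return log2(posterior_vulnerability(prior, channel) / prior_vulnerability(prior))


uniform4 = [0.25, 0.25, 0.25, 0.25]
# A channel that reveals the secret exactly (identity matrix).
revealing = [[1.0 if o == s else 0.0 for o in range(4)] for s in range(4)]
# A channel whose single output is independent of the secret.
constant = [[1.0] for _ in range(4)]
```

Under these definitions the posterior vulnerability equals one minus the Bayes risk of the MAP adversary. On four equally likely secrets the revealing channel leaks 2 bits and the constant channel leaks 0 bits.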

In Chapter 3 we will present a deeper discussion about the use of information theory for the formalization of information flow, including the notions of Shannon entropy, mutual information and the framework of min-entropy leakage for one-try attacks. First, however, we will review some fundamental anonymity protocols in the literature.

Examples of anonymity protocols

On the Internet, every computer has a unique IP address which specifies the computer's logical location in the topology of the network. This IP address is usually sent along with any request originating from the computer. Even if the computer uses an IP address for a single session via an ISP (Internet Service Provider), the identification can be logged and retrieved later with the ISP's compliance. One common way to try to preserve anonymity is to use a proxy, i.e. an intermediary computer that gathers all the requests of a group of computers and serves as a unique gate for any communication with the world outside of the network. For practical purposes, it is as if all the requests originated from the proxy, and the members of the group are indistinguishable from the point of view of an outside observer. One drawback of the use of proxies is that it creates a single point of failure, decreasing the network's robustness.

The problem illustrated above is one of the motivations for the use of communication protocols specifically designed to protect anonymity. In this section we review two of the most fundamental, and probably most famous, examples of anonymity protocols in the literature: the dining cryptographers protocol, and the Crowds protocol.

The dining cryptographers. The dining cryptographers protocol was proposed by Chaum in [Cha88]. It is one of the first anonymity protocols in the literature, and it is one of the few protocols that can assure strong anonymity. The protocol is usually presented in a simplified scenario, where three cryptographers employed by the NSA (the National Security Agency of the United States) are having dinner in a restaurant. At the end of the dinner, the NSA decides whether it will pay the bill itself or whether it will assign the duty of paying to one of the cryptographers at the table. In the case the NSA decides that one of the cryptographers will pay, it announces the decision secretly to [...] will pay the bill or not, without revealing the identity of the payer. In other words, to an external observer (and to the non-paying cryptographers as well), the only accessible information is whether the NSA is paying or not, but not the identity of the cryptographer paying (if any). We assume that the NSA does not disclose its decision to anyone but to the cryptographer it chooses (again, if any), and that the solution should be distributed, i.e. only message passing between agents is allowed, and no centralized agent coordinates the process.

The dining cryptographers protocol solves this problem as shown schematically in Figure 1.1. Each cryptographer ($Crypt_0$, $Crypt_1$ and $Crypt_2$) tosses a coin that is visible only to himself and to his right-hand neighbor. In this way every cryptographer has a shared coin with each of the other two. After all three coins ($c_0$, $c_1$ and $c_2$) are tossed, each cryptographer checks whether the two coins visible to him agree (both are heads or both are tails) or disagree (one is heads and the other is tails). Then they announce publicly agree or disagree, according to the result they obtained with their coins. The only exception is that, if a cryptographer is paying, he will announce the opposite of what he sees, i.e. he will announce disagree in the case that his coins agree and agree if they do not. It can be proven that if the number of disagrees is even, then the NSA is paying, and if the number of disagrees is odd, then one of the cryptographers is paying. Moreover, if the coins are all fair, the protocol offers strong anonymity in the following sense: the execution of the protocol does not provide to an external observer enough evidence to change her knowledge about which cryptographer is the payer, if any. In other words the probability of any cryptographer being the payer, under the adversary's point of view, does not change after the observation of the protocol's execution.
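Both claims in the paragraph above (the parity rule and the indistinguishability of the payer under fair coins) can be checked exhaustively for three cryptographers. The sketch below is an illustration with hypothetical names. It assumes cryptographer $i$ sees his own coin $c_i$ and the coin of his left neighbor, and encodes disagree as 1 and agree as 0.

```python
from collections import Counter
from itertools import product


def announcements(payer, coins):
    """Public announcements (1 = disagree, 0 = agree) for one protocol run.

    payer is None when the NSA pays, otherwise the index of the paying
    cryptographer. coins[i] is the coin tossed by cryptographer i.
    """
    n = len(coins)
    result = []
    for i in range(n):
        disagree = coins[i] ^ coins[(i - 1) % n]  # own coin vs. left neighbor's coin
        if payer == i:
            disagree ^= 1                          # the payer announces the opposite
        result.append(disagree)
    return tuple(result)


def cryptographer_pays(announced):
    """Parity rule: an odd number of 'disagree' means a cryptographer pays."""
    return sum(announced) % 2 == 1


def announcement_distribution(payer):
    """Counts of each announcement tuple over all 8 equally likely coin outcomes."""
    return Counter(announcements(payer, coins)
                   for coins in product((0, 1), repeat=3))


distributions = [announcement_distribution(p) for p in range(3)]
```

Because the three distributions coincide, the announcements have the same likelihood whichever cryptographer pays, which is the equality-of-likelihoods form (1.5) of strong anonymity restricted to the case where a cryptographer is the payer.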

The dining cryptographers protocol can be generalized to any number of graph nodes (i.e. cryptographers) and any type of graph connectivity (i.e. the shared coins between pairs of cryptographers). Then the same solution can be used for anonymous communication as follows. Each pair of connected nodes shares a common secret (the value of the coin) of length $n$, equal to the length of the transmitted data. It is assumed that the coins are drawn uniformly from the set of possible secrets. Each node then computes the binary sum (XOR operation) of all its shared secrets and announces the result. The only exception is that the node that wants to transmit adds the datum, also of length $n$, to the sum it announces. It can be shown that the total sum of the announcements of all nodes is equal to the data to be transmitted, since each secret is counted twice (once by each node that can see it) and, therefore, is canceled out by the XOR operation. The protocol works under the assumption that only one node at a time tries to transmit, and if it is the case that more than one sender wants to transmit at the same time, the conflict needs to be solved by some sort of coordinator.
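The cancellation argument can be illustrated with a short simulation of one transmission round. This is a sketch under assumptions: the graph is given as a list of edges (pairs of distinct node indices), secrets are represented as integers of `bit_length` bits (the $n$ of the text), and the function names are hypothetical.

```python
import secrets
from functools import reduce
from operator import xor


def dc_net_round(num_nodes, edges, sender, datum, bit_length):
    """Return the public announcements of every node for one round.

    Each edge carries a fresh uniformly random shared secret. sender may be
    None, in which case nobody transmits.
    """
    shared = {edge: secrets.randbits(bit_length) for edge in edges}
    announced = []
    for node in range(num_nodes):
        # XOR of all secrets this node shares with its neighbors.
        value = reduce(xor, (key for edge, key in shared.items() if node in edge), 0)
        if node == sender:
            value ^= datum          # the transmitting node adds its datum
        announced.append(value)
    return announced


def recover(announced):
    """XOR of all announcements: every shared secret appears twice and cancels."""
    return reduce(xor, announced, 0)


# Assumed example: a ring of five nodes plus one extra chord.
edges = [(i, (i + 1) % 5) for i in range(5)] + [(0, 2)]
datum = 0b10110011
result = recover(dc_net_round(5, edges, sender=3, datum=datum, bit_length=8))
```

The recovered value is the datum regardless of the random secrets drawn, since each secret contributes to exactly the two announcements of the nodes on its edge. When no node transmits, the announcements XOR to zero.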

One drawback of the dining cryptographers protocol is its inefficiency: