HAL Id: tel-00639948
https://tel.archives-ouvertes.fr/tel-00639948v3
Submitted on 13 Feb 2012
Formal approaches to information hiding: An analysis of
interactive systems, statistical disclosure control, and
refinement of specifications
Mário Alvim
To cite this version:
Mário Alvim. Formal approaches to information hiding: An analysis of interactive systems, statistical disclosure control, and refinement of specifications. Cryptography and Security [cs.CR]. Ecole Polytechnique X, 2011. English. tel-00639948v3
PhD Thesis
Specialty: Computer Science
Formal approaches to information hiding:
An analysis of interactive systems, statistical disclosure control, and refinement of specifications
Mário S. Alvim
LIX, École Polytechnique
Palaiseau, France
Supervisor: Catuscia Palamidessi
Rapporteurs: Gilles Barthe, Michael Mislove
Examiners: Béatrice Bérard, Stéphanie Delaune, Loïc Hélouët, Daniel Le Métayer, Geoffrey Smith
École Polytechnique
Équipe Comète
Centre National de la Recherche Scientifique
Contents
List of Figures
List of Tables
Acknowledgements
1 Introduction
1.1 Information hiding
1.2 Qualitative and quantitative approaches to information hiding: a brief history
1.2.1 The qualitative approach
1.2.2 The quantitative approach
1.3 Case studies of information hiding
1.3.1 Quantitative information flow and anonymity
1.3.2 Statistical disclosure control
1.3.3 Refining specifications into implementations
1.4 Plan of the thesis and contribution
1.5 Publications
2 Preliminaries
2.1 Probability spaces
2.2 Probabilistic automata
2.3 CCS with internal probabilistic choice
3 The rationale behind the use of information theory for leakage
3.1 Information theory and communication
3.2 Information theory and information flow
3.3 Uncertainty and leakage
3.3.1 Shannon entropy
3.3.2 Min-entropy
3.3.3 Guessing entropy
3.3.5 Comparison and discussion
4 Information flow in interactive systems
4.1 Interactive systems
4.2 Discrete channels with memory and feedback
4.2.1 The power of feedback
4.2.2 Directed information and capacity of channels with feedback
4.3 Interactive systems as channels with memory and feedback
4.3.1 Construction of the channel associated to an IIHS
4.3.2 Lifting the channel inputs to reaction functions
4.4 Leakage in interactive systems
4.5 An example: the Cocaine Auction protocol
4.5.1 Calculating the information leakage
4.6 Topological properties of IIHSs and their capacity
4.7 Related work
4.8 Chapter summary and discussion
5 Differential privacy: the trade-off between leakage and utility
5.1 Differential privacy
5.1.1 Formal definition
5.1.2 Alternative interpretation in the case of cliques
5.2 A model of utility and privacy for statistical databases
5.2.1 Leakage about an individual
5.2.2 A note on the choice of values
5.2.3 The questions we explore with the help of our model
5.3 Graph symmetries
5.4 Deriving the relation between differential privacy and quantitative information flow on the basis of the graph structure
5.4.1 Assumptions and notation
5.4.2 The matrix transformation
5.4.3 The bound on the a posteriori entropy of the channel
5.5 Application to leakage
5.5.1 Measuring the leakage about an individual
5.6 Application to utility
5.6.1 The bound on the utility
5.7 Related work
5.8 Chapter summary and discussion
6 Safe equivalences for security properties
6.1 The use of equivalences in security
6.2 Distributed systems and components
6.2.1 Tagged Probabilistic Automata
6.2.3 Distributed systems
6.3 Admissible schedulers
6.3.1 Restricting global schedulers
6.3.2 Restricting local schedulers
6.4 Safe equivalences
6.4.1 Safe complete traces
6.4.2 Safe bisimilarity
6.5 Safe nondeterministic information hiding
6.6 Related work
6.7 Chapter summary and discussion
7 Conclusion
List of Figures
1.1 An example of the dining cryptographers protocol
1.2 The Crowds protocol at work
2.1 The semantics of CCS$_p$
4.1 Interactive system of Example 1
4.2 Model for discrete channel with memory and feedback
4.3 Scheme of secret transitions for secret-nondeterministic IIHSs
4.4 Local transformation in an IIHS tree
4.5 Transformation in an IIHS tree
4.6 The normalized IIHS for the extended website example
4.7 Channel with memory and feedback model for IIHS
4.8 Cocaine auction example
4.9 Comparison between the leakage in Examples a, b, and c
5.1 Randomized function $K$
5.2 Leakage and utility for oblivious mechanisms
5.3 Some distance-regular graphs with degree $3$
5.4 Some $VT^{+}$ graphs
5.5 Some $(Val^{u}, \sim)$ graphs
5.6 Venn diagram for the classes of graphs considered in this section. Here $S^{*} = \{Val^{u} \mid |Val| = 2, u \le 2\}$
5.7 Steps of the matrix transformation for distance-regular and $VT^{+}$ graphs
5.8 The relation between elements of a row $i$ and the elements in the diagonal
5.9 Graphs of $Bnd(u, v, \epsilon)$ for $u = 100$ and $v = 2$ (lowest line), $v = 10$ (intermediate line), and $v = 100$ (highest line), respectively
5.10 Universe and highest min-entropy leakage matrix giving $\epsilon$-differential privacy for Example 7
6.1 Execution trees for Example 10
List of Tables
4.1 Channel matrix for Example 1
4.2 Two different channel matrices induced by two different input distributions for Example 1
4.3 Channel matrix for binary erasure channel
4.4 General form of channel matrix
4.5 A possible evolution of the binary channel with time, for $W = 011$ and $T = 3$
4.6 Channel matrix for Example 5
4.7 Stochastic kernels for the Cocaine Auction example
4.8 Reaction functions for the cocaine auction example
4.9 Values of the probabilities in Figure 4.8 for Examples a, b, and c
4.10 Values of the entropy and directed information for Examples a, b, and c, where $I(A^T; B^T) = H(A^T) - H(A^T \mid B^T)$ and $I(A^T \to B^T) = H_R - H(A^T \mid B^T)$
4.11 The IIHSs of Example 6 and their corresponding channels
4.12 Summary of results
5.1 Mechanisms for the city with higher number of votes for candidate $cand$
Acknowledgements
Praise the bridge that carried you over.
George Colman
Every piece of work is produced within a context, and naturally this thesis is no exception. I want to dedicate this space to express my gratitude to some people that have helped to create an environment of scientific, material, and emotional support, which was crucial to the development of my work over the past three years. I am deeply grateful to all these people, and the influence they have had in this work is only a small part of the influence and importance they have in my life.
First of all, I will always be deeply grateful to Catuscia Palamidessi for her outstanding work as my thesis supervisor. During these three years she has provided a stimulating and exciting scientific environment, endowed with all the material and logistic support a student could ever need. Her passion for science is contagious, and her brilliance and persistence are qualities that I can only hope to be fortunate enough to achieve someday. And not only is she a widely recognized researcher, but she is also a remarkable human being, whose kindness and ethics have set an example that I will always keep with me in academia and for life. It is with sincere joy that I can say that, besides the fruitful scientific cooperation, we were able to create a deep link of friendship, and I will do my best so that both can last for life.
Another person of fundamental importance in my path to this day is Elaine Pimentel. As the first scientific tutor I have ever had, and later as my Master's program supervisor, she was the one welcoming me to the fascinating world of academia. She guided my first steps in research, and her dedication and intelligence are remarkable. Elaine was the strongest supporter I have ever had for doing a doctoral program abroad, especially in the early times when not even my family was convinced yet it was a good idea. More than once Elaine was a thoughtful friend and a wise advisor, who helped me figure out solutions for practical problems that, in some moments, made me doubt I could get to the end of this program. Thank you, Elaine, very much for it all.
I would also like to thank the CNRS (Centre National de la Recherche Scientifique) for the funds for these three years of research in France. I also thank INRIA for all the financial and logistical support with respect to scientific conferences, events, and work trips.
I am grateful to the members of my jury, who kindly gave their time to go through my work and evaluate it. Thanks to Béatrice Bérard, Stéphanie Delaune, Loïc Hélouët, Daniel Le Métayer, and Geoffrey Smith. And special thanks to my rapporteurs Michael Mislove and Gilles Barthe, who produced the evaluation report for my thesis. I am honored to have had the opportunity to have such a highly qualified jury.
I would also like to thank all the people from the Graduate School (École Doctorale) of École Polytechnique, especially Audrey Lémarechal for her help with the documentation regarding my stay in France, Fabrice Baronnet for his administrative work, and Christine Ferret for everything involving the thesis defense.
I feel especially fortunate for having had the opportunity to work in such a stimulating environment as is the LIX laboratory (Laboratoire d'Informatique de l'École Polytechnique), and in particular the Comète team. It is with a weight in my heart that I leave all these amazing people. I am deeply grateful to Frank Valencia, who gave me one of the warmest welcomes I got in my new life in Europe. Frank was not only a teacher, but a colleague, a gym companion, and a good friend. I am also grateful to his wife, Sara Södergren, and to their son, Felipe Valencia, for all the good moments shared. Thanks also to Andrés Aristizábal for his kindness and for always being ready to help; to Carlos Olarte for the help, friendship and good moments shared together (I will never forget that it was Carlos who took me on my first walk in Paris, and introduced me to the Eiffel Tower); to Sophia Knight for the shared laughs, food, jokes and complaints that make our love-hate friendship unique; and to Justin Dean, Sophia's husband, who is a remarkably kind and smart guy with an interesting view of life. Thanks to Dale Miller, for always having wise advice to offer when I needed it, and thanks as well to Catuscia and Dale's kids, Nadia and Alexis, for the good moments shared. I am also grateful to Christele Braun, Ehab El Salamouny, Jéremy Dubrueil, Jesus Aranda, Lili Xu, Luis Pino, Marco Giunti, Marco Stronati, Nicolás Bordenabe, Raluca Diaconu, Romain Beauxis, and Sylvain Pradalier, who, even if I did not have the opportunity to work with them directly, helped make LIX such a great environment.
I would like to thank my co-authors, with whom I have had the opportunity not only to cooperate scientifically, but also to create friendships. Thanks to Miguel E. Andrés for our fruitful collaboration, the constant good mood, and the always stimulating joke-fights. Thanks to Konstantinos (Kostas) Chatzikokolakis for all the work we developed together, the enlightening discussions about so many subjects, and the good moments shared. Thanks to Pierpaolo Degano, with whom I had the pleasure of collaborating and learning [...]
The team of administrative support at LIX was also fundamental for my work. I would like to thank Marie-Jeanne Gaffard for her remarkable competence and dedication, which have frequently saved me from a great deal of trouble. Her professional behavior is a model to be followed, and I wish I could encounter people like her everywhere I will ever work. I am also grateful to Valérie Lecomte, for the countless times she helped me, even when it was not her duty, always with the characteristic competence and sympathy. I cannot forget Corinne Poulain, who guided me through the endless administrative maze when I arrived in France. Thanks also to James Régis for the technical support; and to Isabelle Biercewicz and Lydie Fontaine for the assistance in my first years at LIX. I would also like to say a couple of words about Ryna Lam Pech, whose cheerful smile and always good mood made each coffee time in the cafeteria an even more enjoyable moment.
I am also grateful to the experienced scientists who have shared part of their vast knowledge with me, either in conferences, workshops or informal meetings, and reinforced my view that people in academia are not only brilliant, but usually good human beings as well. Thanks especially to Geoffrey Smith for sharing his expertise with me in so many insightful conversations, and for organizing the exciting workshop on information flow at Florida International University. I will not forget the hospitality he, his wife Elena, his sons Daniel and David, and the adorable Yoshi offered in Miami. Thanks also to Prakash Panangaden for the lectures at the SFM-10:QAPL summer school in Bertinoro, and also for the opportunity to participate in the workshop on quantum and classic information flow at the Bellairs Research Institute.
I cannot proceed without mentioning all the amazing friends I have in Brazil, who were fundamental in the background that brought me here. Even being far away, they are constantly in my mind, and I always count down the days to the next time I will see them again. Thanks to Aline Miranda, whom I have had the privilege of knowing and whose friendship I enjoy very much; to Aline Resende, an incredible friend, on whom I know I can always count at any time of day or night, and with whom I have had some of the most joyful and memorable moments of all my life; to Anísio Lacerda, the talented and sensible guy whom I always enjoyed talking to about any subject (serious or not); to Deznie Lopes, who always has a smile to offer; to Katia Lage, the sweet and kind friend who is always there to help others; to Lara Coelho, the funny and practical girl, whose visit to Paris was one of the highlights of my time in the city; and to Marina Cruz, my childhood friend, the one I have known for the longest in my life and whose love always warms up my heart. I am also deeply grateful to Adriani Quatrini, who played such an important role of support and understanding during one of the darkest moments in my last three years; and to Giselle Moura, who has cared so much for me and was the main force driving the process that literally changed my face and, therefore, my life (for much better).
[...] arrived in Europe I did not know a single person on this side of the Atlantic Ocean. It was a big turning point in my life, and I am so glad that I decided to come, for these three years in Paris were not only a period of professional growth, but also of incredible personal learning. I have had the pleasure of meeting here some of the most remarkable human beings I have ever met, both at the professional and personal levels. In particular, our sweet, sweet Maunoury, the building shared as home by so many foreign students at École Polytechnique, has been the stage of countless adventures, memorable moments, and deep learning. Without the companionship of the people I met there, I would not have been able to enjoy my stay in France as much, and therefore my work would not have been as productive. I would like to thank each and every one of the people I met in Maunoury for the friendship that has marked me so deeply. Also, I want to thank each one for particular things that I will keep in my memory forever. Thanks to Saddaf Shabbir for all the philosophical discussions by the lake during summer (or until late night otherwise), that have enlightened me so much in so many subjects; to Andreas Engelhardt for the constant companionship and mutual understanding which have so many times lightened the weight of being abroad; to Nadia Vertti for the happiness and cheerfulness that could always make me smile at any time; to Keesjan de Vries for all the awesome trips shared together (Do you wanna know why? Well...); to Ricardo Kawahara for sharing the fun of nights out, and also the frustration of the way back home by the Noctilien 122; to Michał Zydor for the uncountable movies seen together in Paris; to Fabien Immler for being my German little brother; to Alex Rinke for the hospitality during the winter holidays in Berlin in 2009/2010; to Oliver Valencia for the fun moments at Bbar; to Kalle Backlund, Anna Folke Larsen, and Uli Schneider for all the unforgettable evenings at their place in Rue Guisarde and at Chez Georges; to Steffen Lohrey and Marie Le Mouel for the nice evenings watching Audrey Hepburn movies in my room; to Chiara Altomare, Manuele Aufiero, Paolo Carozzo and Lorenzo Sponza (the Italian mafia) for the constant cheerfulness in our beloved international kitchen; to Benjamin Mosk for the energy to never say no to a night out dancing; to Maria Rosario (Charo) Mestre for the company not only in Paris, but also in Frankfurt; to Álvaro Izquierdo for the constant company in the gym, and the fun trips together; to Amy Gilson, Anton Karrman, Davi Vasconcellos, Leland Ellison, Lysandra Alves, and Michael Martin for the unforgettable Summer of 2009; to Citlali Cabrera for her kindness in every moment, and the nice dinners she offered to me; to Igor Reshetnyak for always being ready to help in anything; to Théo Touvet for the rare example of confident and unique life choices; to Tomás Lungenstrass for the constant smile and good mood; and to François Wirion and Julia Duras for the first moments shared in the doctoral program. Thanks also to Alex Lang, Alfredo Parra, Daniel Ruiz, Federico Cárdenas, Benjamin Uekermann, Fredrik Hallgren, Henri de Belsunce, Herbert Mangesius, Ivan [...] Chojecki, Sara Rome, and Seydou Traoré for all the unforgettable moments. I cannot forget Hannah Schneider and Sofia Karlsson, who have not lived in Maunoury but are part of the family, and I would like to thank them for the friendship and hospitality when I visited both Cologne and Stockholm.
It was not only on campus, however, that I met friends. Among the many amazing people I met in Paris, and all over the world, are Alexandra Silva, good company in several conferences and summer schools, whom I hope to meet often, both as a friend and as a colleague; Diogo Arbigaus, the kind and good friend who, even though he is Brazilian, I have met only in Paris; Maria Poulaki, whose refreshing company and kindness always make me feel good; Nicolás Lopez and all the Spanish crowd, whose parties in Rue Soufflot will be always in my memory; and Izabel Rezende, a family member away from home, who was an essential and kind support during my stay in France.
I often say that we do not have much control over our lives, and that the best we can do is to try to be prepared enough to catch a good opportunity when it shows up. Today I can look back and be glad to say that I caught at least two life-time opportunities in the past three years. The first one was on the 1st of October 2008, when I landed in Paris to start my doctoral program at École Polytechnique. The second one was on the 19th of March 2010, when I met Trevor Ray Tisler. Meeting him was a turning point in my life, and his emotional support has paved the road so I could work with a lighter spirit. I am grateful for the patience with which he has revised my English writing so many times, the dedication he has shown to me even being overseas for over a year now, and for his love, support and presence in my life.
Finally, I would like to thank my family, of whom I am so proud, for the love and support during my whole life, and especially during the challenges these past three years have imposed on me. Thanks to my mother, Maria Angélica, who has always been a model human being for me, as a strong yet sweet woman, and who gives me strength in hard moments and shares my joy in the good ones; to my brother Marco Antônio, who has set an example for me with his dedication, ethical behavior and kindness that are a constant in everything he does; to my brother Marcus Vinícius, whose particular sense of humor and tough behavior are not enough to hide a kind heart and a person one can always count on; to my step-father Mario Montoya, who is a remarkable human being, and who has given me more support, understanding and love than my biological father has ever done; to my sisters-in-law Luciana Salomão and Débora Pires, for being like real sisters, and for the countless joyful moments shared; and to my cousin Adriana de Lima, for always being by my side and supporting me.
I apologize to the several people that played an important role in my way and who have not found their name mentioned here: I am sorry if my memory played a trick on me.
Mário S. Alvim
Abstract
In this thesis we consider the problem of information hiding in the scenarios of interactive systems, statistical disclosure control, and refinement of specifications. We apply quantitative approaches to information flow in the first two cases, and we propose improvements for the usual solutions based on process equivalences for the third case.
In the first scenario we consider the problem of defining the information leakage in interactive systems where secrets and observables can alternate during the computation and influence each other. We show that the information-theoretic approach which interprets such systems as (simple) noisy channels is not valid. The principle can be recovered, however, if we consider channels of a more complicated kind, that in information theory are known as channels with memory and feedback. We show that there is a complete correspondence between interactive systems and these channels, and we propose the use of directed information from input to output as the real measure of leakage in interactive systems. We also show that our model is a proper extension of the classical one, i.e. in the absence of interactivity the model of channels with memory and feedback collapses into the model of memoryless channels without feedback.
In the second scenario we consider the problem of statistical disclosure control, which concerns how to reveal accurate statistics about a set of respondents while preserving the privacy of individuals. We focus on the concept of differential privacy, a notion that has become very popular in the database community. Roughly, the idea is that a randomized query mechanism provides sufficient privacy protection if the ratio between the probabilities that two adjacent datasets give a certain answer is bounded by a constant. We observe the similarity of this goal with the main concern in the field of information flow, namely limiting the possibility of inferring the secret information from the observables. We show how to model the query system in terms of an information-theoretic channel, and we compare the notion of differential privacy with that of min-entropy leakage. We show that differential privacy implies a bound on the min-entropy leakage, and we also consider the utility of the randomization mechanism, which represents how close the randomized answers are, on average, to the real ones. Finally we show that the notion of differential privacy implies a tight bound on utility, and we propose a method that under certain conditions builds an optimal randomization mechanism.
Moving the focus away from quantitative approaches, in the third scenario we address the problem of using process equivalences to characterize information-hiding properties (for instance secrecy, anonymity and non-interference). In the literature, some works have used this approach, based on the principle that a protocol $P$ with a variable $x$ satisfies such property if and only if, for every pair of secrets $s_1$ and $s_2$, $P[s_1/x]$ is equivalent to $P[s_2/x]$. We show that, in the presence of nondeterminism, the above principle may rely on the assumption that the scheduler works for the benefit of the protocol, and this is usually not a safe [...] refining a specification into an implementation, since usually the former is more abstract than the latter, and the refinement process involves reducing the nondeterminism. The scheduler is, in this sense, a final product of the refinement process, after all the nondeterminism is ruled out. We present a formalism in which we can specify admissible schedulers and, correspondingly, safe versions of complete-trace equivalence and bisimulation. We prove that safe bisimulation is still a congruence. Finally, we show that safe equivalences can be used to establish information-hiding [...]
Introduction
There are two mistakes one can make along the road to truth: not going all the way, and not starting.
Gautama Siddhartha
1.1 Information hiding
In the last few decades the amount of information flowing through computational systems has increased dramatically. Never before in history has a society been so dependent on such a huge amount of information being generated, transmitted and processed. It is expected that this solid trend of increase will continue in the near future, if not virtually indefinitely, reinforcing the need for efficient and safe ways to cope with this reality.
Although the efficient and broad dissemination of information is a goal in many situations, there are instances where the disclosure of information is undesirable or even unacceptable. The field of information hiding concerns the problem of guaranteeing that part of the information relative to an event is kept secret. In computer science, the term information hiding encompasses a large spectrum of fields. Different fields have distinct historical motivations, and the research in each has followed its own path. The variation of the subfields of information hiding depends on three main factors: (i) what one wants to keep secret; (ii) from which adversary or attacker one wants to keep it secret; and (iii) how powerful the adversary or attacker is.
The field of confidentiality (or secrecy) refers to the problem of keeping an action secret. One application of confidentiality is cryptographic protocols, where the sender and the receiver of a message can be known, but the contents of the message itself are considered to be sensitive information. Generally, we can say that confidentiality concerns data, while the field of privacy [...] be interested in protecting the information about someone (a credit card number, for instance) or the person's identity itself. Anonymity is the field that concerns the protection of the identities of agents involved in events. In principle, anonymity can relate either to the active agent (often the sender of a message) or to the passive agent (often the receiver of a message). For instance, in the case of a journalist receiving information from a confidential source, the identity of the sender is intended to be secret. As for the case of an intelligence agency sending a coded message to a spy, the identity of the receiver is confidential information. There is yet another kind of anonymity, sometimes referred to as unlinkability, where the identity of agents and actions performed are public information, but the linkage between agents and the actions performed should not be determined. One example of unlinkability is a confidential voting system, where both the voters and the final vote count are in the public domain, but the relationship between the voters' identities and the ballots cast is protected.
One application of privacy that has drawn a lot of attention in recent years is the problem of statistical databases. A statistic is a quantity computed from a sample, and the goal of statistical disclosure control is to enable the user of the database to learn properties of the population as a whole, while maintaining the privacy of individuals in the sample. The field of statistical databases highlights the delicate equilibrium between the benefits and the drawbacks of the spread of information. A practical example occurs in medical research, where it is desirable that a great number of individuals agree to give their personal medical information. With the information acquired, researchers or public authorities can calculate a series of statistics from the sample (such as the average age of people with a particular condition) and decide, say, how much money the health care system should spend next year in the treatment of a specific disease. It is in the interest of each individual, however, that her participation in the sample will not harm her privacy. In our example, the individuals usually do not want to have disclosed their specific status with relation to a given disease, not even to the users querying the database. Some studies, e.g. [Joi01], suggest that when individuals are guaranteed anonymity and privacy they tend to be more cooperative in giving personal information.
Another important field of information hiding is information flow, which concerns the leakage of classified information via public outputs in programs and systems. Consider a system that asks its users for a password to grant them access to some functionality. Naturally, the password itself is intended to be secret; however, an attacker trying to guess it will always get an observable reaction from the system, whether the response is an acceptance or a rejection of the entered code. In either case, the observable behavior of the system reveals some information about the password, because even if it is not guessed correctly, at least the search space is narrowed (even if, in this case, only slightly).
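The narrowing of the search space in the password example can be made concrete with a small sketch. It assumes, for illustration only, that the attacker considers all candidate passwords equally likely (a uniform prior, which the text does not state) and that uncertainty is measured as the base-2 logarithm of the number of remaining candidates. The function names are hypothetical.

```python
import math


def uncertainty_bits(candidates: int) -> float:
    """Bits needed to single out one password among equally likely candidates."""
    if candidates < 1:
        raise ValueError("there must be at least one candidate password")
    return math.log2(candidates)


def bits_revealed_by_rejection(candidates: int) -> float:
    """Drop in uncertainty after one wrong guess is rejected.

    Assumption: a rejection rules out exactly one candidate, so the
    search space shrinks from `candidates` to `candidates - 1`.
    """
    if candidates < 2:
        raise ValueError("a rejection requires at least two candidates")
    return uncertainty_bits(candidates) - uncertainty_bits(candidates - 1)


if __name__ == "__main__":
    # Hypothetical example: a 4-digit numeric code has 10**4 candidates.
    print(bits_revealed_by_rejection(10**4))
```

Under these assumptions, a rejection with only two candidates reveals one full bit, while a rejection among ten thousand candidates reveals only a small fraction of a bit, which matches the remark that the search space is narrowed only slightly.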
[...] mutually exclusive. In a system where public outputs can reveal the identity of agents, for instance, both the problems of information flow and of anonymity are present. The classification is usually based more on the contextual motivation for the problem than on a rigid taxonomy of subfields. In fact, in recent years there has been an active line of research exploring the similarities between problems such as the foundations of anonymity and information flow, and also privacy and information flow. The result has been an increasing convergence between these fields. In this thesis we explore the similarities between information flow, statistical databases, and anonymity.
In a broader context, the importance of information hiding goes far beyond the realm of computer science, and there are a lot of subtle questions that need to be considered carefully. From a political and even philosophical perspective, the unrestricted use of privacy protection can be controversial. Even though it is broadly accepted that people should have the right to exchange e-mails privately, to vote in democratic elections anonymously, and to express their ideas on the Internet freely, there are situations where information protection policies can be argued to have serious drawbacks. The same mechanism that grants a political activist anonymity and free speech on the Internet, while living under a repressive government, also grants a pedophile anonymity to broadcast harmful material. This balance between freedom and control in the virtual media has been the subject of passionate discussion. Independently of whether one's goal is to maximize or to minimize the degree of information protection in a given situation, it is anyway desirable to measure the extent to which the information is protected, to define which specific definition of protection the information falls under, and from whom the information is protected.
In this thesis we avoid the controversy of deciding in which cases the application and extent of information hiding methods are justifiable. Rather, our focus is on measuring the degree of information protection offered by a system, thus making evaluation and comparison of different systems possible. Specifically, we are interested in using concepts of information theory to quantify the leakage of information.
1.2 Qualitative and quantitative approaches to information hiding: a brief history
Historically, the research on information hiding has evolved from the simple but imprecise qualitative approach toward the more refined, but at the same time more complex, quantitative approach. In the following sections we will briefly overview both. We do not intend to provide here an exhaustive study of the subject, but rather to highlight some of the most important contributions [...]
1.2.1 The qualitative approach
The qualitative approach emerged first in the literature of information hiding. The central idea is that, by observing the output of a system, the adversary cannot be completely sure of what the secret information is. The principle of confusion says that for every observable output generated by a secret input, there is another secret that could also have generated the same output. In anonymity, for instance, this corresponds to the concept of possible innocence, i.e. the impossibility of identifying the culprit with certainty by only observing the system's output. The principle of confusion does not take into consideration the adversary's certainty about the value of the secret: it is enough that there be an alternative hypothesis, no matter how unlikely it is. This is also known as the possibilistic approach.
One of the first developments in this field dates from 1976, when Bell and
La Padula defined the model of multilevel security systems [BLP76]. In this
model the components of a system are classified as either subjects, i.e. active
entities such as users or processes, or as objects, i.e. passive entities such
as files. The subjects are divided into trusted and untrusted entities, and the
authors define restrictions on how to manage untrusted objects. The rule "no
read up or write down" states that untrusted entities can read only from
objects of the same or lower levels, and that they can only write into objects
of the same or higher levels. This model was developed to support different
levels of security, and aimed to ensure that information only flows from lower
to higher levels and never in the opposite direction. Each input into and
output from the system is labeled with a security level. Any pair of an input
and its corresponding output is called an event. A view of a security level $l$
corresponds to the events at level $l$ or lower, and all the events of a higher
level are hidden to level $l$.

Usually in this model only two levels are distinguished: high and low. The
high level corresponds to sensitive information, which should only be available
to some users with special privileges, while the low level corresponds to
public information accessible to everyone. The goal of secure information flow
analysis is, in this context, to avoid leakage from the high level to the low
level.
Bell and La Padula's model, however, did not address the problem of leakage of
information due to covert channels. A covert channel is a way of transmitting
information from the high to the low environment by means not designed or
intended for this purpose. Consider, for instance, a system where a low user
$\ell$ can send a file to a high user $h$, and $h$ has the power to redefine
the access rights to the file. The user $h$ can either maintain the permission
of $\ell$ to write in the file, or she can change the policy so $\ell$ no
longer has access to it. In this scenario, a covert channel between a corrupted
high user $h$ and low user $\ell$ can be established as follows. The low user
sends a file to the high user, who then uses her power of deciding whether to
grant or to deny $\ell$ further access to it to encode a message. In a later
stage, $\ell$ tries to write in the file, and an access failure can be
interpreted as the bit 0, while a success can be interpreted as the bit 1. In
this way any message can eventually be sent through the covert channel from the
corrupted high user to the low one.

To cope with the threat of covert channels, Goguen and Meseguer developed the
concept of noninterference [GM82]. A system is noninterfering when the actions
of high users do not alter what can be seen by low users. In other words, the
low outputs of the system will only reflect the values of the low inputs,
independently of what the high inputs are (if any). The authors proposed a
model of noninterference that separated the system from the security policies.
Their model, nevertheless, was only appropriate for deterministic systems.
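The file-permission covert channel described earlier in this section is exactly a situation in which a high action alters what the low user sees, so it violates noninterference. The following is a minimal sketch of that channel, under our own simplifying assumptions: the class `SharedFile` and the function `send_bits` are hypothetical names introduced only for illustration, and one bit is sent per permission change.

```python
class SharedFile:
    """Toy file whose write permission for the low user is set by the high user."""

    def __init__(self):
        self.low_may_write = True

    def set_low_access(self, allowed):
        # Action of the high user h: grant or deny further access to the low user.
        self.low_may_write = allowed

    def low_try_write(self):
        # Observation of the low user: does the write attempt succeed?
        return self.low_may_write


def send_bits(bits):
    """Encode each bit as deny (0) or grant (1); decode it from the write outcome."""
    shared = SharedFile()
    received = []
    for bit in bits:
        shared.set_low_access(bit == 1)    # corrupted high user encodes the bit
        success = shared.low_try_write()   # low user attempts to write
        received.append(1 if success else 0)
    return received


message = [1, 0, 1, 1, 0]
decoded = send_bits(message)
```

In this toy model the low observation is a function of the high action alone, which is the opposite of what noninterference requires.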
Noninterference, however, may be too restrictive a concept for several
practical applications. It does not allow, for instance, the summarization of
data. It is often the case that a system allows statistical (or summarizing)
functions (e.g. mean, total number) to be calculated on its high inputs and
then disclosed to low users, even if the high inputs themselves are supposed to
be kept secret. These systems are typical in the area of statistical databases,
and we will discuss this issue in more detail in Section 1.3.2. Clearly, a
system that allows the summarization of high data for the low environment
violates noninterference, since a change in the high input may affect the low
output.
Considering this problem, in 1986 Sutherland [D.S86] proposed the concept of
nondeducibility on inputs, which focuses not on whether the output is affected
according to a change in the input, but on whether it is possible to deduce the
input from the output. Under this definition, a system may allow summarization
of data and still be secure, since the output of a statistical function does
not necessarily allow the adversary to deduce what the inputs are. One drawback
of the concept of nondeducibility on inputs is that it assumes that the
strongest form of the principle of confusion is enough to ensure security.
Notably, it relies on the assumption that no high value can be ruled out after
observing a low value. This is not a strong enough security guarantee in many
real systems. In some cases, even if no high value can be ruled out as a
possibility, a single value (or a small set of values) can be much more likely
than the others, and in practice it makes little sense to consider the
alternatives. This criticism can be seen as an early attempt to consider a
quantitative approach for information flow, where it is taken into
consideration how much an attacker learns (or does not learn) about the secret
matters.
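The criticism above can be made concrete with a small numeric sketch. The system below is our own illustrative assumption, not an example from the cited works: a one-bit high value is copied to the low output with probability 0.99 and flipped otherwise. No high value can ever be ruled out, yet one of them becomes overwhelmingly likely after a single observation.

```python
def posterior(prior, channel, observed):
    """Bayes update: p(h | o) for every high value h, given p(h) and p(o | h)."""
    joint = {h: prior[h] * channel[h][observed] for h in prior}
    total = sum(joint.values())
    return {h: p / total for h, p in joint.items()}


# Hypothetical system: channel[h][o] = p(o | h).
prior = {0: 0.5, 1: 0.5}
channel = {0: {0: 0.99, 1: 0.01},
           1: {0: 0.01, 1: 0.99}}

post = posterior(prior, channel, observed=0)

# Possibilistic view: which high values remain possible after seeing o = 0?
still_possible = sorted(h for h, p in post.items() if p > 0)
```

Here `still_possible` contains both high values, so the possibilistic criterion is satisfied, while `post[0]` is 0.99: the adversary is almost certain of the secret.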
Another important issue in security systems is the problem of
compositionality. In [McC87], McCullough pointed out the importance of hook-up
security, i.e. the compositionality of multi-user systems. Usually, real
systems are far too complex to be analyzed as a whole, especially because the
task of designing and implementing a system is normally divided between teams.
Each team is responsible for a number of components that, in a later stage,
will be put to work together. It is highly desirable that security properties
[...] that the final composite system is also secure. McCullough showed that
the concepts of multilevel security systems, noninterference, and
nondeducibility on inputs are not composable. As a replacement, he proposed the
concept of restrictiveness, according to which no high level information should
affect the behavior of the system, as seen by a low user.
In [WJ90] Wittbold and Johnson addressed the question of nondeducibility on
inputs under a different perspective, showing that it is not a guarantee of
absence of leakage. Consider the following algorithm, where $H$ and $L$ stand
for the high and the low environments, respectively. Here we assume the
variables $x$ and $y$ are binary, and the randomized command
$x \leftarrow 0 \oplus_{0.5} 1$ assigns to $x$ either the value 0 or the value
1 with 0.5 probability each.

    while true do
        x ← 0 ⊕_{0.5} 1;
        output x to H;
        input y from H;
        output (x XOR y) to L;
    end while

In the above algorithm, the low environment only has access to the value
($x$ XOR $y$). Note, however, that the high environment $H$ learns the value of
$x$ before having to choose the value of $y$, and therefore it can use this
knowledge to encode a message: to transmit the bit 0, $H$ chooses $y = x$, and
to transmit the bit 1, $H$ chooses $y = 1 - x$. It is clear that there is some
flow of information from the high to the low environment, even though $L$
cannot deduce the high input $y$ from the low output ($x$ XOR $y$). Hence,
satisfying nondeducibility on inputs does not guarantee a system to be secure.
Wittbold and Johnson defined, then, the concept of nondeducibility on
strategies, which means that regardless of what view $L$ has of the machine, no
strategy is excluded from being used by $H$.
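The encoding strategy available to $H$ in the algorithm above can be checked with a short simulation. This is only a sketch: the function names `run_round` and `transmit` and the seeded generator are our own assumptions, and each loop iteration of the algorithm is modeled as one call to `run_round`.

```python
import random


def run_round(bit_to_send, rng):
    """One iteration of the loop: returns the value output to L."""
    x = rng.randint(0, 1)                  # x <- 0 (+)_0.5 1, then shown to H
    # Strategy of H: y = x transmits the bit 0, y = 1 - x transmits the bit 1.
    y = x if bit_to_send == 0 else 1 - x
    return x ^ y                           # output (x XOR y) to L


def transmit(bits, seed=0):
    """H sends a whole message, one bit per round; returns what L observes."""
    rng = random.Random(seed)
    return [run_round(b, rng) for b in bits]


message = [1, 0, 0, 1, 1]
observed_by_low = transmit(message)
```

Since `x ^ x` is 0 and `x ^ (1 - x)` is 1 for a binary `x`, the low environment recovers the message exactly, whatever values the coin takes.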
1.2.2 The quantitative approach
The qualitative approach, although simple and easy to apply, does not reflect
reality in many practical situations. In many cases some information leakage
is tolerable or even intentional. Take an election protocol. After the final
vote count is released, there are fewer possible hypotheses concerning who
voted for whom than the hypotheses available before the votes were cast. In
this example there is a natural leakage of information, since the uncertainty
about the sensitive information decreases after the observation of the
protocol's output. This leakage occurs, however, as a necessary functionality
of the protocol.

In fact, in most real systems noninterference cannot be achieved, as typical
systems will always leak some information. This does not mean, however, that
[...] varies from system to system. Therefore it is important to quantify how
much leakage a system allows. Quantitative methods are useful to evaluate the
extent to which a system is secure, and to compare it to other systems.
One of the first attempts to quantify information leakage was made by Denning
in 1982. In [Den82] she defined the leakage from a state $s$ to a state $s'$ as
the decrease in uncertainty about the high information in $s$ resulting from
the low information in $s'$. She used the concept of conditional entropy¹
$H(h_s \mid \ell_{s'})$, where $h_s$ is the high information in $s$ and
$\ell_{s'}$ is the low information in $s'$. Her definition of leakage was:

$$M_1 = H(h_s \mid \ell_s) - H(h_s \mid \ell_{s'})$$

If the quantity $M_1$ is positive, then it is considered to be the leakage of
information. This measure of leakage, however, does not consider the history of
low inputs, a problem pointed out by Clark, Hunt and Malacaria in [CHM07].
Without the history one cannot account for the increase in knowledge (or
decrease in uncertainty) that accumulates between the low states $s$ and $s'$.
They proposed, instead, the following measure of leakage:

$$M_2 = H(h_s \mid \ell_s) - H(h_s \mid \ell_{s'}, \ell_s)$$

Since $H(X \mid Y, Z) \le H(X \mid Y)$ for all random variables $X$, $Y$ and
$Z$, we have $M_1 \le M_2$. The quantity $M_2$ corresponds to the Shannon
conditional mutual information $I(h_s; \ell_{s'} \mid \ell_s)$.
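The relation between the two measures can be checked numerically. The sketch below is ours and makes several assumptions for illustration: the joint distribution over $(h_s, \ell_s, \ell_{s'})$ is given explicitly as a dictionary, all variables are binary, and the helper names `cond_entropy`, `leakages` and `random_joint` are hypothetical.

```python
import itertools
import math
import random


def cond_entropy(joint, target, given):
    """H(target | given) in bits; `joint` maps outcome tuples to probabilities,
    `target` is a tuple position and `given` a tuple of positions."""
    p_given, p_both = {}, {}
    for outcome, p in joint.items():
        g = tuple(outcome[i] for i in given)
        p_given[g] = p_given.get(g, 0.0) + p
        key = (outcome[target], g)
        p_both[key] = p_both.get(key, 0.0) + p
    return -sum(p * math.log2(p / p_given[g])
                for (_, g), p in p_both.items() if p > 0)


def leakages(joint):
    """Return (M1, M2) for a joint distribution over (h_s, l_s, l_s')."""
    H, L, L2 = 0, 1, 2
    base = cond_entropy(joint, H, (L,))
    m1 = base - cond_entropy(joint, H, (L2,))      # Denning's measure
    m2 = base - cond_entropy(joint, H, (L2, L))    # history-aware measure
    return m1, m2


def random_joint(rng):
    """A random joint distribution over three binary variables."""
    outcomes = list(itertools.product((0, 1), repeat=3))
    weights = [rng.random() for _ in outcomes]
    total = sum(weights)
    return {o: w / total for o, w in zip(outcomes, weights)}


# Case where the history matters: l_s already reveals h_s, l_s' is pure noise.
history_case = {(0, 0, 0): 0.25, (0, 0, 1): 0.25,
                (1, 1, 0): 0.25, (1, 1, 1): 0.25}
m1_hist, m2_hist = leakages(history_case)
```

In `history_case` the measure $M_1$ comes out negative (the state $s'$ alone says less than $s$ did), whereas $M_2$ is zero, which matches the intuition that nothing new is learned.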
In 1987, Millen made a formal connection between information flow and Shannon
information theory by relating noninterference and mutual information [Mil87].
In Millen's model, a computer system is seen as a channel whose input is a
sequence $W$, possibly generated by a set of users, and whose output (after the
computation is completed) is $Y$. The random variable $X$ represents a
subsequence of $W$ generated by a user $U$, while $\bar{X}$ represents the high
inputs generated by users other than $U$. Millen showed that in deterministic
systems, if $X$ and $\bar{X}$ are independent and $X$ is not interfering with
$Y$, then the Shannon mutual information $I(X; Y)$ between $X$ and $Y$ is zero.
In other words, noninterference is a sufficient condition for absence of
information flow.

¹ The concepts of entropy, conditional entropy and mutual information will be
defined formally in Chapter 3. For the moment it is enough to know that entropy
is a measure of the uncertainty of a random variable; conditional entropy is a
measure of the uncertainty of one random variable given another random
variable; and mutual information is a measure [...]

In 1990, Massey made an important contribution to the field of information
theory, which influenced the further development of quantitative information
flow. In [Mas90] he showed that the usual definition of discrete memoryless
(i.e. history-independent) channels used at that time in fact did not take into
account the possibility of feedback. He highlighted the conceptual difference
between causality and statistical dependence, and presented an accurate
mathematical description of discrete memoryless channels that allowed feedback.
Then he introduced the concept of directed information, which captures the idea
of causality between the input and the output of a channel, and argued that in
the presence of feedback, directed information is a more appropriate measure of
the flow of information from input to output than mutual information.
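For orientation, the two quantities contrasted above are commonly written as follows. The notation is an assumption on our part and is not fixed by this section: $X^i$ denotes the input sequence $X_1, \ldots, X_i$, $Y^{i-1}$ the output sequence $Y_1, \ldots, Y_{i-1}$, and $n$ the number of channel uses.

```latex
% Directed information: each output is related only to the inputs fed so far.
I(X^n \to Y^n) \;=\; \sum_{i=1}^{n} I\bigl(X^{i} ;\, Y_i \mid Y^{i-1}\bigr)

% Mutual information (chain rule): each output is related to the whole input sequence.
I(X^n ;\, Y^n) \;=\; \sum_{i=1}^{n} I\bigl(X^{n} ;\, Y_i \mid Y^{i-1}\bigr)
```

The only difference is $X^{i}$ versus $X^{n}$ inside the sum: directed information does not credit an output with dependence on inputs that have not yet been supplied, which is what makes it sensitive to causality.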
In the same year, McLean also considered the concept of time in the
description of systems by proposing his Flow Model [McL90]. According to this
model, there is a flow of information only when a high user $H$ assigns values
to objects in a state that precedes the state in which a low user $L$ makes her
assignment. In this situation only part of the correlation between high and low
information is considered as leakage. This addressed the problem of causality,
but this model was too general, and relatively difficult to apply.
In [Gra91] Gray worked on bridging the gap between the overly complicated Flow
Model and the more practical, yet restricted, approach of Millen. Gray used a
general-purpose probabilistic (as opposed to nondeterministic) state machine
that resembled Millen's model. In Gray's model, the value $T(s, I, s', O)$
represents the probability of a given state $s$ evolving into another state
$s'$, under the input $I$, and producing output $O$. The channels are
partitioned into two sets, $H$ and $L$, representing the channels connected to
high and low processes, respectively. The high and the low environments can
communicate only through their interactions with the system, as no other form
of communication between them is allowed. Gray wanted to take time and
causality into consideration in his definition of leakage, and he did so by
allowing feedback and memory in his model. His formulation of a security
guarantee was the following:

$$P(L_I \cap L_O \cap H_I \cap H_O) > 0 \implies P(\ell \mid L_I \cap L_O \cap H_I \cap H_O) = P(\ell \mid L_I \cap L_O) \qquad (1.1)$$

where $L_I$ and $L_O$ represent the history of low inputs and outputs,
respectively, and $H_I$ and $H_O$ represent the history of high inputs and
outputs, respectively. The symbol $\ell$ represents the final output event on
channels in the low environment. The formulation (1.1) states that the
probability of a low output may depend on the previous history of the low
environment, but not on the previous history of the high environment.
Gray also tried to generalize the concept of capacity to the case of channels
with memory and feedback. He provided a formula expressing the flow of
information from the whole history of inputs and outputs (during the time
period $0 \ldots t-1$) to the low output (at time $t$), and conjectured that
the capacity of the channel would be:

$$C \stackrel{\mathrm{def}}{=} \lim_{n \to \infty} C_n \qquad (1.2)$$

where

$$C_n \stackrel{\mathrm{def}}{=} \max_{H,L} \frac{1}{n} \sum_{i=1}^{n} I(\mathit{In\_Seq\_Event}_{H,t}, \mathit{Out\_Seq\_Event}_{H,t};\ \mathit{Final\_Out\_Event}_{L,t} \mid \mathit{In\_Seq\_Event}_{L,t}, \mathit{Out\_Seq\_Event}_{L,t}) \qquad (1.3)$$

and $\mathit{In\_Seq\_Event}_{A,t}$ is the input history at channel $A$ (where
$A$ stands for $L$ or $H$) up to time $t-1$, $\mathit{Out\_Seq\_Event}_{A,t}$
is the output history at channel $A$ up to time $t-1$, and
$\mathit{Final\_Out\_Event}_{L,t}$ is the low output event at time $t$. Gray
showed that the absence of information flow implies that capacity as formulated
in (1.2) is zero. He also conjectured that this definition of capacity would
correspond to the notion of maximum transmission rate supported by the channel.
As pointed out in [AAP11], however, the problem with Gray's conjecture is the
following. For an output at time $t$, the only causal relation considered is
the one with the history of inputs up to time $t-1$, while the effect that the
input at time $t$ itself may have on the output is ignored. In this way, (1.2)
does not express the complete causal relation between input and output. The
correct notion of capacity in the presence of memory and feedback, which
corresponds to the maximum transmission rate for the channel, was proposed in
2009 by Tatikonda and Mitter [TM09], and it will be discussed later on in
Chapter 4.
A similar formal approach, although with different motivations, was presented
by McIver and Morgan in [MM03]. They focused on the problem of preserving
security guarantees while refining specifications into implementations. The
authors used an equation similar to (1.3), but in the context of sequential
programming languages enriched with probabilities. Their aim was to protect the
high values during the whole execution of the program, instead of the initial
high values only. In other words, they wanted to assure that if the high
information is not known by the low environment at the beginning of the
computation, then it cannot be inferred at any later stage. They proved that,
for deterministic programs, if the final values of the high objects are
protected, then the initial values are protected as well. McIver and Morgan
also defined the concept of information escape as:

$$H(h \mid \ell) - H(h' \mid \ell')$$

where $H(h \mid \ell)$ represents the uncertainty (conditional entropy) of the
high information given the low information at the beginning of the computation,
and $H(h' \mid \ell')$ represents the same uncertainty at the end of the
computation. They defined the channel capacity as the least upper bound of
information escape over all possible input distributions. In this context, a
system is considered secure if it has capacity equal to zero. One advantage of
this model is that it is not necessary to keep track of the whole history of
the computation, but on the other hand it can be applied only in scenarios
where the adversary does [...]

In Chapter 3 we will take up again the discussion of quantitative approaches
to information flow based on information theory. For the moment we will focus
on some topics related to information hiding that are of special relevance for
this thesis.
1.3 Case studies of information hiding

In this section we present three case studies of information hiding that we
address in this thesis.

1. The case of quantitative information flow, i.e. how much about the secret
   information an adversary can learn by observing the system's output, and by
   knowing how the system works. We give special attention to the broadly
   studied problem of anonymity, which can be seen as a particular case of the
   more general problem of information flow where the secret information is
   the identity of the agents.

2. The question of statistical disclosure control, which concerns the problem
   of allowing users of a database to obtain meaningful answers to statistical
   queries, while protecting the privacy of the individuals participating in
   the database. We focus on differential privacy, an approach to this problem
   that has drawn a lot of attention in recent years.

3. The problem of preserving security guarantees while deriving
   implementations from specifications. Usually specifications are more
   abstract than implementations, i.e. they present more nondeterminism. The
   task of implementing a system reduces the nondeterminism of the
   specification, and if it is not done carefully, an implementation may rule
   out possibilities allowed by the specification that are essential for the
   security guarantees.
1.3.1 Quantitative information flow and anonymity

Anonymity is one of the most studied subjects of information hiding. The
research in this area has been active in the past several years, and the
advances made can be extended to the more general scenario of information flow.
As briefly introduced in Section 1.1, anonymity concerns the protection of the
identities of the agents involved in the events.

With the advent of the Internet, the protection of anonymity has become an
issue in the daily life of millions of people around the world. The importance
of anonymity is even more evident concerning the protection of freedom of
speech, a situation that is particularly delicate in countries under repressive
regimes.
Pfitzmann, Dresden and Hansen [PDH08] have proposed a standard terminology for
anonymity concepts. In their work there are three different notions:

• Sender anonymity: when the identity of the originator should be protected;
• Receiver anonymity: when the identity of the recipient should be protected;
• Unlinkability: when it might be known that an agent $A$ originated a message
  and an agent $B$ received a message, yet it should not be known whether the
  message sent by $A$ was actually the one received by $B$.

Reiter and Rubin also gave a classification of the types of adversary in an
anonymity system in [RR98], where they also proposed the anonymity protocol
Crowds (see Section 1.3.1). In their work, they considered that the adversary
can be an eavesdropper simply observing the traffic of messages on the network,
or she can be an active attacker (i.e. a collaboration between senders, between
receivers, or between others taking part in the system), or even a combination
of the previous two types. The authors also defined a hierarchy of anonymity
degrees that a system can provide. In decreasing order of strength, the
proposed scale is listed below. In this list, let $s, s'$ denote secrets and
$o$ an observable, i.e. a particular action or output of the system that is
distinguishable from the point of view of the attacker.

Strong anonymity: From the attacker's point of view, the observables produced
    by the system do not increase her knowledge about the secret information,
    i.e. the identity of the individual involved in an event. Chaum also
    described the concept of strong anonymity in his work on the Dining
    Cryptographers protocol [Cha88]. It represents the ideal situation where
    the execution of the protocol does not give to the adversary any extra
    information about the secrets. The concept was formalized as follows.

    $$\forall s, o \quad p(s \mid o) = p(s) \qquad (1.4)$$

    This definition is the equivalent of probabilistic noninterference. In
    [CP06], Chatzikokolakis and Palamidessi showed that the condition expressed
    by (1.4) is equivalent to:

    $$\forall s, s', o \quad p(o \mid s) = p(o \mid s') \qquad (1.5)$$

    i.e. the probability of the system producing an observable is the same, no
    matter what the secret information is. This definition is known as
    equality of likelihoods and is advantageous as it does not depend on the
    probability distribution on secrets.

    Another definition of strong anonymity, more restrictive, was proposed by
    Halpern and O'Neill [HO03, HP05]. It is equivalent to each of the previous
    definitions ((1.4) or (1.5)) plus the assumption that the input [...]
    confidence in her guess about the secret, and defined strong anonymity as:

    $$\forall s, s', o \quad p(s \mid o) = p(s' \mid o) \qquad (1.6)$$

    The formulation (1.6) is also known as conditional anonymity and
    corresponds to the level of anonymity called beyond suspicion in Reiter and
    Rubin's classification.

Beyond suspicion: From the attacker's point of view, an agent is no more
    likely to be the culprit than any other agent in the system. It can be
    formalized as in (1.6).

Probable innocence: From the attacker's point of view, an agent does not
    appear more likely to be involved in an event than not to be involved.
    Formally:

    $$\forall s, o \quad p(s \mid o) \le 0.5 \qquad (1.7)$$

    The formulation (1.7), however, is not broadly accepted as the definition
    of probable innocence. In [CP06], Chatzikokolakis and Palamidessi showed
    that the property that Reiter and Rubin indeed proved for the Crowds
    protocol in [RR98] was:

    $$\forall s, o \quad p(o \mid s) \le 0.5 \qquad (1.8)$$

Possible innocence: From the attacker's point of view, there is always a
    non-zero probability that the agent involved in the event is someone else.
    Formally:

    $$\forall s, o. \quad p(s \mid o) > 0 \implies \exists s'.\ p(s' \mid o) > 0$$

The above hierarchy gives a richer classification of the degree of protection
offered by a system than would be possible with simpler possibilistic models.
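The degrees of the hierarchy above can be tested mechanically on a small system. The following sketch is ours and rests on stated assumptions: a system is represented by a matrix `channel[s][o]` holding $p(o \mid s)$ together with a prior `prior[s]`, all function names are hypothetical, and for possible innocence we read the witness $s'$ as different from $s$, following the prose ("someone else").

```python
def posteriors(prior, channel):
    """post[o][s] = p(s | o); observables with p(o) = 0 are reported as None."""
    n_secrets, n_obs = len(prior), len(channel[0])
    post = []
    for o in range(n_obs):
        joint = [prior[s] * channel[s][o] for s in range(n_secrets)]
        p_o = sum(joint)
        post.append([j / p_o for j in joint] if p_o > 0 else None)
    return post


def strong_anonymity(channel, tol=1e-12):
    """Equality of likelihoods (1.5): all rows of the channel matrix coincide."""
    first = channel[0]
    return all(abs(row[o] - first[o]) <= tol
               for row in channel for o in range(len(first)))


def beyond_suspicion(prior, channel, tol=1e-12):
    """Formulation (1.6): all secrets are equally likely after any observation."""
    return all(max(p) - min(p) <= tol
               for p in posteriors(prior, channel) if p is not None)


def probable_innocence(prior, channel, tol=1e-12):
    """Formulation (1.7): no posterior probability exceeds 0.5."""
    return all(max(p) <= 0.5 + tol
               for p in posteriors(prior, channel) if p is not None)


def possible_innocence(prior, channel):
    """At least two secrets keep non-zero posterior for every observable."""
    return all(sum(1 for x in p if x > 0) >= 2
               for p in posteriors(prior, channel) if p is not None)


uniform = [1 / 3, 1 / 3, 1 / 3]
flat = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]               # reveals nothing
skewed = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```

With the uniform prior, `flat` satisfies every level, `skewed` satisfies only possible innocence, and `identity` satisfies none, which illustrates the decreasing order of strength.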
Among the quantitative approaches to anonymity, two are of special interest to
us: the ones based on information-theoretic concepts and the ones based on the
Bayes risk. In the following sections we give a brief overview of these two
approaches. These concepts will be revisited in more detail in Chapter 3.

Anonymity protocols as noisy channels

Information-theoretic approaches to anonymity, and more generally to
information flow, rely on concepts such as entropy and mutual information to
measure the adversary's lack of information about the secret before and after
observing the system's output. Typically the system is seen as a noisy channel
and the concept of noninterference corresponds to the converse of the channel
capacity.

There are several works in the literature that have proposed measures of [...]
[SD02, DSCP02, ZB05, DPW06]. In [CPP08a] Chatzikokolakis, Palamidessi and
Panangaden proposed the concept of conditional capacity to cope with the
situation where some leakage of information is intended by the system. Consider
again the election protocol example. By design, the final vote counting needs
to be announced and it usually increases the attacker's knowledge about the
secret. In this situation, the leakage should be calculated modulo the
information that is supposed to be disclosed, i.e. the vote count. In this work
the authors also proposed methods to calculate the channel capacity exploiting
some symmetries present in several practical systems.
Hypothesis testing and Bayes risk

In some real-world settings an individual faces the following situation: she
is interested in the value of some random variable $A \in \mathcal{A}$ but she
has access only to the values of another random variable $O \in \mathcal{O}$.
She knows that $A$ and $O$ are correlated by a known conditional probability
distribution. This situation occurs in several fields, for instance in medicine
(to make a diagnosis, the physician has access to a list of symptoms, but not
to the disease itself). The attempt to infer $A$ from $O$ is known as the
problem of hypothesis testing. Here we are interested in the use of hypothesis
testing in the context of anonymity (and information flow). More specifically,
the adversary tries to infer the secret $A$ given that she has access to the
observables $O$ and she knows how the system works, i.e. how the probabilities
of $O$ are conditioned with relation to $A$.

A commonly studied approach to the problem is based on the Bayesian method and
consists of assuming the a priori probability distribution on $A$ as known, and
then deriving from that and from the knowledge about how the system works, an
a posteriori probability distribution after some fact has been observed. It is
well known that the best strategy for the adversary is to apply the MAP rule
(Maximum A posteriori Probability rule), which as the name suggests, chooses
the hypothesis with the maximum probability for the given observation. Here, by
best strategy we mean the one that induces the smallest probability of error in
guessing the hypothesis, which in this case corresponds to the Bayes risk.
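As a small worked example of the MAP rule and of the resulting probability of error, consider the sketch below. The representation is our assumption: `prior[a]` holds the a priori probability of hypothesis `a`, `channel[a][o]` holds $p(o \mid a)$, and the numbers are made up for illustration.

```python
def map_guess(prior, channel, o):
    """MAP rule: pick the hypothesis a maximising p(a) * p(o | a),
    which is proportional to the a posteriori probability p(a | o)."""
    return max(range(len(prior)), key=lambda a: prior[a] * channel[a][o])


def bayes_risk(prior, channel):
    """Probability that the MAP guess is wrong, averaged over observables."""
    n_obs = len(channel[0])
    p_correct = sum(max(prior[a] * channel[a][o] for a in range(len(prior)))
                    for o in range(n_obs))
    return 1.0 - p_correct


prior = [0.5, 0.5]
channel = [[0.8, 0.2],    # p(o | a = 0)
           [0.3, 0.7]]    # p(o | a = 1)

guesses = [map_guess(prior, channel, o) for o in range(2)]
risk = bayes_risk(prior, channel)   # 1 - (0.5*0.8 + 0.5*0.7) = 0.25
```

The adversary guesses hypothesis 0 on observing 0 and hypothesis 1 on observing 1, and is wrong a quarter of the time; no other guessing rule does better on this system.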
In [CPP08b] Chatzikokolakis, Palamidessi and Panangaden explored the
hypothesis testing approach to anonymity, in a scenario where the adversary
has one single try to guess the secret (after exactly one observation). They
associated the level of anonymity to the probability of error, i.e. the
probability of an attacker making a wrong guess about the secret. In order to
consider the worst case scenario and to give upper bounds for the level of
anonymity provided, the adversary is assumed to use the MAP rule strategy. In
this case, the probability of error corresponds to the Bayes risk, and the
degree of protection offered by a protocol corresponds to the Bayes risk
associated with [...]

In [Smi07, Smi09] Smith also considered the scenario of one-try attacks and
proposed the notion of vulnerability, which takes into consideration the
probability that the adversary can guess the secret correctly after observing
the behavior of the system only once. Smith proposed the framework of
min-entropy leakage, which is closely related to the Bayes risk, but is
different as it uses the concept of entropy (more precisely min-entropy) and
formalizes leakage in information-theoretic terms.
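The relation between vulnerability, min-entropy leakage and the Bayes risk can be illustrated numerically. The formulas used below are the usual ones from Smith's framework and are not spelled out in this section, so they should be read as our assumption here (the formal treatment is deferred to Chapter 3): prior vulnerability is $\max_s p(s)$, posterior vulnerability is $\sum_o \max_s p(s)\,p(o \mid s)$, and leakage is the base-2 logarithm of their ratio. Function names and numbers are illustrative.

```python
import math


def prior_vulnerability(prior):
    """Probability of guessing the secret in one try before any observation."""
    return max(prior)


def posterior_vulnerability(prior, channel):
    """Probability of guessing the secret in one try after one observation."""
    n_obs = len(channel[0])
    return sum(max(prior[s] * channel[s][o] for s in range(len(prior)))
               for o in range(n_obs))


def min_entropy_leakage(prior, channel):
    """Leakage in bits: decrease of min-entropy caused by the observation."""
    return math.log2(posterior_vulnerability(prior, channel)
                     / prior_vulnerability(prior))


prior = [0.5, 0.5]
channel = [[0.8, 0.2],
           [0.3, 0.7]]

v_post = posterior_vulnerability(prior, channel)   # 0.75, i.e. 1 - Bayes risk
leak = min_entropy_leakage(prior, channel)         # log2(0.75 / 0.5)
```

On this example the posterior vulnerability equals one minus the probability of error of the MAP rule, which is the sense in which the two notions are closely related.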
In Chapter 3 we will present a deeper discussion about the use of information
theory for the formalization of information flow, including the notions of
Shannon entropy, mutual information and the framework of min-entropy leakage
for one-try attacks. First, however, we will review some fundamental anonymity
protocols in the literature.

Examples of anonymity protocols

On the Internet, every computer has a unique IP address which specifies the
computer's logical location in the topology of the network. This IP address is
usually sent along with any request originating from the computer. Even if the
computer uses an IP address for a single session via an ISP (Internet Service
Provider), the identification can be logged and retrieved later with the ISP's
compliance. One common way to try to preserve anonymity is to use a proxy,
i.e. an intermediary computer that gathers all the requests of a group of
computers and serves as a unique gate for any communication with the world
outside of the network. For practical purposes, it is as if all the requests
originated from the proxy, and the members of the group are indistinguishable
from the point of view of an outside observer. One drawback presented by the
use of proxies is that it creates single points of failure, decreasing the
network's robustness.

The problem illustrated above is one of the motivations for the use of
communication protocols specifically designed to protect anonymity. In this
section we review two of the most fundamental, and probably most famous,
examples of anonymity protocols in the literature: the dining cryptographers
protocol, and the Crowds protocol.
The dining cryptographers. The dining cryptographers protocol was proposed by
Chaum in [Cha88]. It is one of the first anonymity protocols in the literature,
and it is one of the few protocols that can assure strong anonymity.

The protocol is usually presented in a simplified scenario, where three
cryptographers employed by the NSA (the National Security Agency of the United
States) are having dinner in a restaurant. At the end of the dinner, the NSA
decides whether it will pay the bill itself or whether it will assign the duty
of paying to one of the cryptographers at the table. In the case the NSA
decides that one of the cryptographers will pay, it announces the decision
secretly to [...] will pay the bill or not, without revealing the identity of
the payer. In other words, to an external observer (and to the non-paying
cryptographers as well), the only accessible information is whether the NSA is
paying or not, but not the identity of the cryptographer paying (if any). We
assume that the NSA does not disclose its decision to anyone but to the
cryptographer it chooses (again, if any), and that the solution should be
distributed, i.e. only message passing between agents is allowed, and no
centralized agent coordinates the process.

The dining cryptographers protocol solves this problem as shown schematically
in Figure 1.1. Each cryptographer ($Crypt_0$, $Crypt_1$ and $Crypt_2$) tosses a
coin that is visible only to himself and to his right-hand neighbor. In this
way every cryptographer has a shared coin with each of the other two. After all
three coins ($c_0$, $c_1$ and $c_2$) are tossed, each cryptographer checks
whether the two coins visible to him agree (both are heads or both are tails)
or disagree (one is heads and the other is tails). Then they announce publicly
agree or disagree, according to the result they obtained with their coins. The
only exception is that, if a cryptographer is paying, he will announce the
opposite of what he sees, i.e. he will announce disagree in the case that his
coins agree and agree if they do not. It can be proven that if the number of
disagrees is even, then the NSA is paying, and if the number of disagrees is
odd, then one of the cryptographers is paying. Moreover, if the coins are all
fair, the protocol offers strong anonymity in the following sense: the
execution of the protocol does not provide to an external observer enough
evidence to change her knowledge about which cryptographer is the payer, if
any. In other words the probability of any cryptographer being the payer, under
the adversary's point of view, does not change after the observation of the
protocol's execution.
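The parity argument for the three-cryptographer scenario can be checked with a short simulation. This is a sketch under our own conventions: coins are bits, cryptographer `i` sees coin `i` and the coin of the neighbor on his other side (index `i - 1` modulo 3), and the function names are hypothetical.

```python
import random


def dining_cryptographers(payer, rng):
    """Run one round with three cryptographers.
    `payer` is None when the NSA pays, otherwise the index (0, 1 or 2) of the
    paying cryptographer. Returns the announcements (True means 'disagree')."""
    n = 3
    coins = [rng.randint(0, 1) for _ in range(n)]
    announcements = []
    for i in range(n):
        # Cryptographer i compares his own coin with the one shared by his neighbor.
        disagree = coins[i] != coins[(i - 1) % n]
        if payer == i:
            disagree = not disagree        # the payer announces the opposite
        announcements.append(disagree)
    return announcements


def a_cryptographer_pays(announcements):
    """Odd number of 'disagree' announcements: someone at the table is paying."""
    return sum(announcements) % 2 == 1


rng = random.Random(42)
outcome_nsa = a_cryptographer_pays(dining_cryptographers(None, rng))
outcome_crypt1 = a_cryptographer_pays(dining_cryptographers(1, rng))
```

Each coin enters exactly two comparisons, so without a payer the number of disagreements is always even; a single payer flips one announcement and makes it odd, independently of the coin values.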
The dining cryptographers protocol can be generalized to any number of graph
nodes (i.e. cryptographers) and any type of graph connectivity (i.e. the shared
coins between pairs of cryptographers). Then the same solution can be used for
anonymous communication as follows. Each pair of nodes shares a common secret
(the value of the coin) of length $n$, equal to the length of the transmitted
data. It is assumed that the coins are drawn uniformly from the set of possible
secrets. Each node then computes the binary sum (XOR operation) of all its
shared secrets and announces the result. The only exception is that the node
that wants to transmit adds the datum, also of length $n$, to the sum it
announces. It can be shown that the total sum of the announcements of all nodes
is equal to the data to be transmitted, since each secret is counted twice
(once by each node that can see it) and, therefore, is canceled out by the XOR
operation. The protocol works under the assumption that only one node at a time
tries to transmit, and if it is the case that more than one sender wants to
transmit at the same time, the conflict needs to be solved by some sort of
coordinator.
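The cancellation argument for the generalized protocol can be sketched as follows. The encoding is our assumption: nodes are numbered from 0, the connectivity graph is a list of edges, each edge carries one shared secret of `n_bits` bits represented as a Python integer, and the function names are hypothetical.

```python
import random
from functools import reduce
from operator import xor


def dc_net_round(num_nodes, edges, sender, message, n_bits, rng):
    """One transmission round. Returns the list of public announcements."""
    # One uniformly drawn n-bit secret per pair of connected nodes.
    secrets = {edge: rng.getrandbits(n_bits) for edge in edges}
    announcements = []
    for node in range(num_nodes):
        # Binary sum (XOR) of all the secrets this node shares.
        value = reduce(xor, (k for (a, b), k in secrets.items() if node in (a, b)), 0)
        if node == sender:
            value ^= message               # the sender adds the datum
        announcements.append(value)
    return announcements


def recover(announcements):
    """Total XOR of the announcements: every shared secret cancels out."""
    return reduce(xor, announcements, 0)


rng = random.Random(7)
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]   # a ring of four nodes plus a chord
message = 0b1011
announced = dc_net_round(4, edges, sender=2, message=message, n_bits=4, rng=rng)
received = recover(announced)
```

Because every secret appears in exactly two announcements, the total XOR equals the message regardless of which node sent it, which is why the announcements alone do not identify the sender.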
One drawback of the dining cryptographers protocol is its inefficiency: