Encoding the program correctness proofs as programs
in PCC technology
Thesis presented
to the Faculté des études supérieures of Université Laval
as part of the master's program in computer science
for the degree of Maître ès sciences (M.Sc.)
Faculté des sciences et de génie
UNIVERSITÉ LAVAL
QUÉBEC
2008
One of the difficulties in the practical application of Proof-Carrying Code (PCC) is the difficulty of communicating the proofs, because they are generally large.
The approaches proposed to mitigate this problem create new problems, notably the enlargement of the Trusted Computing Base: a bug can then cause a malicious program to be accepted.
Instead of transmitting a proof with the code, we propose to transmit a proof generator in a generic, extended PCC framework (EPCC, Extended Proof-Carrying Code). A generator is a program whose sole function is to produce a proof, namely the proof that the code it accompanies is correct. The main purpose of the generator is to be a more compact representation of the proof that it serves to generate. The EPCC framework enables the execution of the proof generator
One of the key issues with the practical applicability of Proof-Carrying Code (PCC) and its related methods is the difficulty in communicating the proofs, which are inherently large.
The approaches proposed to alleviate this suffer from drawbacks of their own, especially the enlargement of the Trusted Computing Base, in which any bug may cause an unsafe program to be accepted.
We propose to transmit, instead, a proof generator for the program in question in a generic extended PCC framework (EPCC). A proof generator aims to be a more compact representation of its resulting proof. The EPCC framework enables the execution
I am greatly indebted to my advisor, Danny Dubé, for his many contributions to this work; he is a thoughtful person and a real problem solver. He has allowed me much freedom to explore many different ideas throughout my master's studies and has encouraged all of my aspirations. Danny's guidance and insights have been an invaluable asset to my research, and I sincerely hope that this work reflects his usual high standard for quality.
I would also like to thank the other members of my M.Sc. committee, Béchir Ktari and the late Clermont Dupuis, for their encouragement and for spending their precious time to read my thesis and provide comments, advice, and questions.
Josée Desharnais, Béchir Ktari, Mohamed Mejri, and Jean-Marie Beaulieu must be thanked as professors for their understanding and their excellent classes at Laval University. I interacted with Béchir more than the others. There were many times I visited his office for various problems. He was always reachable; his door was always open. Though we did not use the NWCC compiler, I would like to thank Nils Weller for helping me to learn how NWCC works.
I would also like to thank all the people in the administrative offices at Laval University, especially Lynda Goullet, for helping me out with all the paperwork.
My deepest thanks to my parents, without whom I would not be here. They have provided me with loving support throughout my childhood and adulthood, and have been a great source of inspiration. I am very grateful to my sister Setareh for her constant love and support.
Last and most significantly, I express my love and appreciation for my wife, Sara Shanian, who tirelessly supported me and encouraged me throughout my master's
Résumé
Abstract
Acknowledgments
Contents
List of Figures
1 Introduction
1.1 Software Security
1.2 Security design principles
1.2.1 Least privilege
1.2.2 Minimum trusted computing base
1.3 Current approaches
1.3.1 Authentication
1.3.2 Code Analysis
1.4 Thesis contribution
1.5 Thesis outline
2 Extended Proof-Carrying Code
2.1 PCC-based Approaches
2.1.1 PCC: Characteristics, and Obstacles
2.1.2 Foundational PCC
2.1.3 Oracle-Based PCC
2.2 Extended Proof-Carrying Code framework
2.2.1 Sending a Proof generator
2.2.2 Kolmogorov complexity
2.2.3 Extended Proof-Carrying Code framework
2.3 EPCC Applications
3 The VEP Virtual Machine
3.1 Machine Design
3.1.1 Captured goals and requirements
3.1.2 Machine Type
3.1.3 Instruction Set Architecture
3.2 Memory Management
3.3 Security Requirements
3.4 Security Enforcement
3.4.1 Initial security enforcement
3.4.2 Global security enforcement
3.4.3 Instruction-wise security enforcement
3.4.4 Example
3.5 The VEP Versus Other VMs
4 Sample Use of EPCC
4.1 Assembler
4.1.1 Statements
4.1.2 Label field
4.1.3 Operation field
4.1.4 Operand field
4.1.5 Assembler process
4.2 Compiler
4.2.1 Type checking phase
4.2.2 Code generation phase
4.3 Proof Generator
4.4 Overall view
5 Conclusion
5.1 Future work
A Instruction Set Descriptions
A.1 Stack Manipulation Instructions
A.1.1 POP
A.1.2 POKE
A.1.3 PEEK
A.1.4 LOAD1
A.1.5 LOAD2
A.1.6 LOAD3
A.1.7 LOAD4
A.1.8 PUSH-PC
A.2 Program Control Instructions
A.2.1 HALT
A.2.2 NOP
A.3 Arithmetical Instructions
A.3.1 ADD
A.3.2 SUB
A.3.3 MUL
A.3.4 DIV
A.3.5 MOD
A.4 Comparison Instructions
A.4.1 EQU
A.4.2 NEQ
A.4.3 LTH
A.4.4 LEQ
A.4.5 ISPAIR
A.5 Bitwise Logical Instructions
A.5.1 BAND
A.5.2 BOR
A.5.3 BNOT
A.5.4 BSHIFT
A.6 Heap Manipulation Instructions
A.6.1 CONS
A.6.2 CAR
A.6.3 CDR
A.7 Jump Instructions
A.7.1 JUMP
A.7.2 JMPR
A.7.3 JMPRT
A.7.4 JMPRF
A.8 Compact Instructions
A.8.1 POKE-i
A.8.2 PEEK-i
A.8.3 LOADi
1.1 The TCB of execution monitors
1.2 The TCB in general static analysis framework
1.3 The TCB in simplified PCC framework
2.1 Safe codes in PCC point of view
2.2 Traditional Proof-Carrying Code framework
2.3 Trusted computing base in PCC
2.4 Verification condition generation
2.5 Foundational proof-carrying code framework
2.6 Trusted computing base in FPCC
2.7 Oracle-based proof-carrying code framework
2.8 Trusted computing base in OPCC
2.9 The framework of the generic Extended Proof-carrying code
2.10 Consumer side in PCC versus its extended version
2.11 Consumer side in FPCC versus its extended version
2.12 Consumer side in OPCC versus its extended version
3.1 Virtual Machine
3.2 Data types in the VEP
3.3 Schemata of the stack, the heap, and the code space in the VEP
3.4 Instruction distribution approximation
3.5 Initial instruction set
3.6 Opcodes of the initial instruction set
3.7 Example of PEEK-i
3.8 Example of POKE-i
3.9 Complete instructions set: instructions and their corresponding opcodes
3.10 Reference counting examples
3.11 The top element of the stack (wrapped_data) is passed to the recursive algorithm (ifpair_reduce) before popping. If the top element is a pair, by popping we are actually releasing a reference to that pair, so the counter field of that pair should be decreased by one. Furthermore, if the reference count falls to zero, the reference counts of children objects should be decremented before the object is reclaimed
3.12 An example of an intertwined code
3.13 Instruction-wise security enforcement
3.14 Example codes
3.15 Stack schemata of the example
3.16 The TCB sizes of various JVMs (in thousands of lines of code): Kaffe is an open-source nonoptimizing Java JIT, BulletTrain is a highly optimizing Java compiler
4.1 Extending the PCC framework
4.2 The arithmetic operators in the VEP assembly language
4.3 Example of expression evaluation during assembly time
4.4 Syntactical rules for expressions
4.5 Examples of operand size change
4.6 The integral promotion rules
4.7 A sample C code which calculates the 13th Fibonacci number
4.8 Generated assembly code for the C code presented in Figure 4.7
4.9 The components of the Proof generator Builder
Introduction
1.1 Software Security
The rapid growth of the Internet and networks goes along with a rising need for security for users. Instant access to a huge number of untrusted and malicious software programs through the Internet on the one hand, and a lack of security on the other hand, because security is hard and expensive to set up and annoying to live with, make it easier for intruders to carry out their plans.
The latest Internet Security Outlook Report issued by CA Inc. details the extent of the malicious software problem [3]. It argues that malicious software (malware) is becoming more sophisticated, and that the increasing use of obfuscation techniques to hide in plain sight will help criminals conceal their activities. In 2007 alone, malware volumes grew by a factor of 16. It also warns that, since much of the criminal activity is targeted at nations with large populations of Internet users, the malware and software failure problem is an international issue.
Furthermore, the requirement of fully trusted software for ultra-expensive and vital projects clearly shows the lack of the technological basis required to address these new computer security needs [1]. Since the methods used to implement security policies are less expensive and more flexible in software than in hardware, security is increasingly becoming a software issue [4].
Recently, there have been some explorations in the domain of programming
approaches [5]. In other words, language-based security, as defined by Schneider, is "a set of techniques based on programming language theory and implementation, including semantics, types, optimization, and verification, brought to bear on the security question" [5].
By that definition, the approach proposed by this thesis falls within the framework of language-based security.
1.2 Security design principles
Language-based security is conceptually a combination of two classic computer security principles:
1.2.1 Least privilege
About thirty years ago, Jerry Saltzer and Mike Schroeder described a design principle known as the Principle of Least Privilege [7]. According to this principle, throughout execution, each principal should be given the minimum resources necessary to accomplish its task. In other words, depending on the context, every process, user, or program must be able to access only the information and resources that are necessary for its legitimate purpose.
As a result of this restricted access, untrusted code will not have the means or the rights to perform a system-level operation that uses a vulnerability in another application to exploit the whole system. Least privilege thus results in a system with better security. Moreover, restricting sensitive system operations leads to better system stability.
As a real-world example of the principle of least privilege, we can refer to the need-to-know restriction rule in the military. Under need-to-know restrictions, even if one has all the necessary official approvals (such as a security clearance) to access certain information, one would not be given access to such information unless one has a specific need to know; that is, access to the information must be necessary for the conduct of
1.2.2 Minimum trusted computing base
The trusted computing base (TCB) of a computer system is the set of components in which the occurrence of bugs might put the security properties of the entire system in danger [9]. The smaller the TCB of a system is, the less probable a security compromise is. Thus, the best way to ensure that a system is secure is to keep its TCB small and simple.
If an attacker can replace or modify any component of the TCB, or if the TCB simply contains a buggy component, the TCB can no longer be trusted. Conceptually, the Minimum trusted computing base principle can be applied to any trust management system, where the trusted party is the TCB and the trusting party is the computing system.
An important property of the principles of Least Privilege and Minimal TCB is their independence from the architecture, computer speed, and other variable dimensions of computer systems. This keeps them sound and sensible over time.
1.3 Current approaches
Our daily life is becoming more and more dependent on computers. We have access to networks and to a huge number of untrusted software programs through the Internet. Suppose we wish to download and run a program from an unknown or untrusted source. One of our biggest concerns would be the protection of our computers from this untrusted code. The following classes of approaches are currently used to reach this goal.
1.3.1 Authentication
The idea in authentication is that one accepts only code that comes equipped with the signature of a trusted producer. Ideally, since this producer is trusted, the code can also be trusted. Microsoft Authenticode is a popular example of a code-signing mechanism that assures users that the software they download from the Internet is authentic and has not been tampered with [10]. This class of approaches has several drawbacks.
of Minimum trusted computing base can be applied to any trust management system, where the trusted party is the TCB and the trusting party is the computing system. Hence, considering the trusted companies as the TCB, in order to decide whether a piece of code can be trusted or not, we need to add each and every trusted company to the TCB, which enlarges the TCB. Trusting a producer does not necessarily mean that the produced code is safe and performs as it claims. As a popular example, we can refer to Microsoft Word 97, which contained a hidden pinball game [11]. Thus, it is clear that we need approaches which can ensure that the untrusted code can cause no harm to the computing system.
1.3.2 Code Analysis
One way to ensure that untrusted code can cause no harm to the computing system is to understand the code's behavior. Code analysis, also known as program analysis, is the process of focusing on specific aspects of a piece of code to gain an understanding of its behavior. This automatic analysis can be done at different periods of the code's lifetime (compile time, execution time).
Dynamic Analysis
Dynamic code analysis techniques observe the execution of the code and perform an appropriate action before the code violates the security policy. The action taken against a security violation can be halting the execution of the code, auditing the code, or simply writing to a log file. SASI [12], Naccio [13], and Polymer [14] are among the best-known approaches in dynamic analysis.
Execution monitor
The systems that perform dynamic code analysis are called execution monitors. In dynamic analysis, it is important to create a safe analysis environment, also known as a sandbox. Possible analysis environments are the operating system (Figure 1.1(b)), the code space (Figure 1.1(c)), and a wrapper program (Figure 1.1(d)).
Operating systems may have distinct user rights management, timer interrupts,
usual means by which access to services is controlled. It is simply a list of the services available, each with a list of the entities (users, programs, processes) permitted to use the service. Currently, most desktop machines are configured as single-user, so applications have complete access to the machine's resources.
When the code space is the analysis environment, some lines of code should be inserted before each memory access and each control transfer to ensure that those accesses are valid.
Wrappers, interpreters, and virtual machines are environments that intercept (and interpret or redirect) the instructions issued by the wrapped code. The wrapped code does not execute directly on the underlying hardware but is instead interpreted by another program. Every instruction of the wrapped code is thus executed only after it has been checked and found not to violate the security policy being enforced. In this way, a broad range of security policies can be applied.
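The wrapper style of monitoring described above can be illustrated with a minimal sketch. The three-opcode instruction format, the `run_monitored` interpreter, and the `no_write_policy` predicate are all hypothetical names introduced here for illustration; they are not taken from SASI, Naccio, Polymer, or the VEP. The point is only that each instruction is checked against the policy before it is interpreted, and that execution stops on the first violation.

```python
from typing import Callable, List, Tuple

Instruction = Tuple[str, int]  # (opcode, operand); a hypothetical toy format


class PolicyViolation(Exception):
    """Raised when the next instruction would violate the policy."""


def run_monitored(program: List[Instruction],
                  allowed: Callable[[Instruction, List[int]], bool]) -> List[int]:
    """Interpret `program` on a stack, checking every instruction first."""
    stack: List[int] = []
    for instr in program:
        # Check before executing: fail-stop on the first violation.
        if not allowed(instr, stack):
            raise PolicyViolation(f"blocked {instr}")
        op, arg = instr
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "WRITE":
            pass  # stands in for a sensitive operation; no real effect here
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack


def no_write_policy(instr: Instruction, stack: List[int]) -> bool:
    """Hypothetical policy: forbid WRITE and forbid ADD on a short stack."""
    op, _ = instr
    if op == "WRITE":
        return False
    if op == "ADD" and len(stack) < 2:
        return False
    return True
```

Because the policy is an ordinary predicate over the next instruction and the current state, a different policy can be enforced by passing a different function, at the cost of one check per executed instruction.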
Figure 1.1: The TCB of execution monitors (panels (a) to (d); diagram labels: Code, OS-based Monitor, Instrumented code, Wrapper, TCB)
Execution monitors (EMs) are usually easy to implement, and they can work with binary code, which makes them language independent. Despite these strong points, EMs suffer from some drawbacks. System resources are engaged by the monitor the whole time the code is running, and even if one run of the code was safe, we cannot be sure about the next runs. Thus, the monitor must work every time we run the code, for the whole running time, which results in a large overhead. Another disadvantage of the monitor lies in its usual fail-stop behavior. Fail-stop treatment does not fit well in projects where the price of stopping the program is high (e.g., a satellite navigation system [1]). Furthermore, it is of high importance for an approach to be in accordance with the principles of security design. Obviously, execution monitors are in partial
(a)), unfortunately, the growth in the size of the TCB is inevitable for all three types of execution monitors regardless of their implementation.
Static Analysis
As we mentioned in the previous section, one of the major drawbacks of execution monitors lies in their fail-stop behavior against malicious code. The fail-stop behavior itself is a consequence of runtime investigation, which leaves the monitor no option but to stop the code in order to prevent a security violation. The stoppage may then lead to a great loss of time, money, or even other forms of security in systems where the renewal of the untrusted code is hard or even impossible (safety-critical or huge scientific projects in outer space may end up with a denial of service due to the stoppage of a piece of code as a result of a simple mistake in the code [2]).
Furthermore, the execution monitor can only confirm the safety of the current trace of the code's execution up to the current moment in the trace, whereas, for the code to be safe, the set of all possible traces of the code must be proven safe. Therefore, the safety of the code cannot be proven through execution monitoring.
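The gap between one observed trace and the set of all traces can be made concrete with a toy example. The `run` function, its single-byte input domain, and the "never write" policy below are assumptions chosen only to keep the illustration small enough to enumerate.

```python
from typing import List


def run(x: int) -> List[str]:
    """Hypothetical untrusted code: the trace of actions it performs on input x."""
    trace = ["read"]
    if x == 7:  # a rare input that triggers the forbidden action
        trace.append("write")
    return trace


def trace_is_safe(trace: List[str]) -> bool:
    """Toy policy (an assumption): a trace is safe if it never writes."""
    return "write" not in trace


# A monitor observing one execution can vouch only for that execution.
observed_run_safe = trace_is_safe(run(0))

# Safety of the code requires every possible trace to be safe.
code_is_safe = all(trace_is_safe(run(x)) for x in range(256))
```

Here the observed run is safe while the code is not, which is exactly the situation a monitor cannot rule out in advance. Real programs rarely have an input space small enough to enumerate, which is why a static argument over all behaviors is needed.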
Thus, it would be reasonable to analyze the code without actually executing it (as opposed to dynamic analysis). The family of code analysis techniques in which the code is analyzed by tools to produce useful information without actually executing programs built from that code is called static analysis.
Static analysis techniques find out a piece of code's possible behaviors prior to its execution, rejecting code whose set of possible behaviors includes unacceptable behavior. This a priori understanding of the code amounts to proving the code's safety. This can be done by providing the computing system at the destination (i.e., the code consumer) with a safety prover which can verify the code's safety upon receiving the code. Figure 1.2 shows the TCB for the general static analysis framework, in which the growth in size of the TCB depends on the size of the safety checking system.
Figure 1.2: The TCB in general static analysis framework (diagram labels: Code, Safety checking system, Execute, Reject, TCB)
Typed assembly language (TAL) [15, 16], Flint [35], and Proof-carrying code (PCC) [17] are among the best-known approaches in static analysis.
Proof-Carrying Code
A safety verification system should be capable of verifying the safety of every piece of untrusted code. Since verifying the safety of the received code is a hard task, it inevitably needs a complex verifier. Complexity is the worst enemy of security; that is, placing a big and potentially buggy program in the TCB compromises the security of the computing system.
In order to ease this situation, Necula and Lee introduced the Proof-Carrying Code (PCC) approach [20, 17, 18, 21]. The main idea behind this approach is to shift the roots of the problem out of the TCB. This is done by breaking the verifier system into two components, a complex Safety Prover and a simple Proof checker, and placing the Safety Prover on the producer side and the Proof checker on the consumer side (i.e., in the TCB).
In this way, the burden of proving the safety of the untrusted code is shifted onto the shoulders of the producer. In this framework, the producer first proves the safety of the code, then attaches the safety proof to the code and sends it to the consumer. On the consumer side, before executing the code, the proof checker checks, roughly speaking, the code against the accompanying proof. Upon success, the code can be executed.
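To give a feel for why the consumer-side checker can stay small, the following is a minimal sketch of a proof checker for a toy logic. The formula encoding (atoms as strings, implications as `("->", A, B)` tuples), the step format, and the single modus ponens rule are assumptions made for illustration; the logic actually used in PCC is richer. Finding a derivation may require search on the producer side, whereas checking one is a single linear pass.

```python
from typing import List, Set, Tuple, Union

# Formula: an atom (str) or an implication ("->", premise, conclusion).
Formula = Union[str, Tuple[str, "Formula", "Formula"]]
# Step: ("axiom", formula) or ("mp", i, j), with i and j earlier step indices.
Step = Tuple


def check_proof(axioms: Set[Formula], proof: List[Step], goal: Formula) -> bool:
    """Return True iff `proof` is a valid derivation of `goal` from `axioms`."""
    derived: List[Formula] = []
    for step in proof:
        if len(step) == 2 and step[0] == "axiom":
            if step[1] not in axioms:
                return False
            derived.append(step[1])
        elif len(step) == 3 and step[0] == "mp":
            _, i, j = step
            if not (isinstance(i, int) and isinstance(j, int)
                    and 0 <= i < len(derived) and 0 <= j < len(derived)):
                return False
            premise, implication = derived[i], derived[j]
            # Modus ponens: from A and A -> B, conclude B.
            if not (isinstance(implication, tuple)
                    and implication[0] == "->"
                    and implication[1] == premise):
                return False
            derived.append(implication[2])
        else:
            return False
    return bool(derived) and derived[-1] == goal
```

A malformed or reordered step makes the check fail, so the consumer never has to trust the producer's prover, only this small loop.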
As shown in Figure 1.3, PCC has a relatively small TCB. Moreover, since PCC performs static analysis, once the safety of a piece of untrusted code has been checked successfully before the first run, there is no need to check the code before subsequent runs. As a result, we obtain a computing system with less overhead and more security [22, 23]. These benefits make PCC one of the strongest candidate frameworks for mobile-code security. Unfortunately, this approach suffers from some shortcomings. Apart from the difficulty of building or generating the proofs for the code, one of the crucial obstacles to the practical applicability of the Proof-Carrying Code technique is the size of the proofs that must accompany the code. To be precise, the difficulty of communicating the proofs, which are inherently large, makes the PCC
Figure 1.3: The TCB in simplified PCC framework (diagram labels: Producer, Prover, Code, Proof, Consumer, Proof checker, TCB, Execute, Reject)
1.4 Thesis contribution
As discussed above, the presence of untrusted and malicious code and the absence of the necessary security bases and frameworks for the safe use of mobile code are among the main concerns of our computer-dependent world. The problem with the current approaches is either that they fail to obey the security principles or that they are hard to apply.
The trustworthiness of proof-carrying code is an important advantage over approaches that involve the use of complex security systems on the consumer side. However, this approach has scaling issues because the size of the safety proofs increases quickly. The approaches proposed to alleviate this suffer from drawbacks of their own, especially the enlargement of the Trusted Computing Base, in which any bug may cause an unsafe program to be accepted.
In contrast to these developments, we have worked on a hybrid approach with some modifications at the framework level, which essentially solves PCC's scalability issue. In our approach, instead of transmitting the proofs, a proof generator for the code in question is sent. The new extended framework enables the execution of the proof generator on the consumer side in a secure manner. In short, this thesis makes the following contributions to the field of mobile code security and verification.
• We show the design of a generic extended framework for proof-carrying code (EPCC). We show that the designed framework is tamperproof and can resolve some key issues of the previous frameworks.
• We present the design of a safe and small virtual machine (VEP) introduced in the EPCC framework. For this, we implemented the VEP to work as an online
• We show empirically that the EPCC and its conjoint virtual machine (the VEP) can be used in an industrial-strength framework. To show this, we implemented the necessary programs to complete the end-to-end chain. For this, we implemented an assembler for the VEP bytecode. We also implemented a C compiler to target the assembler language. Finally, in order to have a runnable EPCC framework, we made use of GUNzip to build a sample proof generator.
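The idea of shipping a proof generator instead of the proof can be sketched with a decompressor playing the role of the generator. This is only an illustration under stated assumptions: Python's `zlib` module stands in for the decompression-based sample generator mentioned above, the function names and the size bound are hypothetical, and in the actual framework the generator runs on the VEP virtual machine rather than in the host interpreter.

```python
import zlib


def make_generator(proof: bytes) -> bytes:
    """Producer side: a compressed proof stands in for the proof generator."""
    return zlib.compress(proof, 9)


def run_generator(generator: bytes, max_proof_size: int) -> bytes:
    """Consumer side: regenerate the proof, refusing oversized output."""
    decompressor = zlib.decompressobj()
    # Ask for one byte more than the bound so that an overflow is detectable.
    proof = decompressor.decompress(generator, max_proof_size + 1)
    if len(proof) > max_proof_size:
        raise ValueError("generated proof exceeds the size bound")
    if not decompressor.eof:
        raise ValueError("generator ended before the proof was complete")
    return proof
```

The regenerated proof must still be passed to the ordinary proof checker, so a faulty or malicious generator can at worst waste bounded resources or produce a proof that is rejected; the bound on the output size is one simple example of the kind of restriction the consumer has to impose on an untrusted generator.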
1.5 Thesis outline
The remainder of the thesis is organized as follows. In Chapter 2, we present our generic Extended Proof-Carrying Code (EPCC) framework, survey related work, and assess the merits and drawbacks of each approach. Chapter 3 discusses the design of the Virtual machine for Extended PCC (VEP). It describes the trade-offs we had to make when designing the VEP and discusses the way in which the VEP works. In Chapter 4, we make the whole system work by bridging from theory to practice. In this chapter, we present a sample implementation of an EPCC framework. Finally, we summarize
Extended Proof-Carrying Code
In this chapter, we present our generic Extended Proof-Carrying Code framework. First, we introduce and motivate Proof-Carrying Code (PCC) and mention its limitations. Then, we survey related work and assess each approach's merits and drawbacks from our perspective. Finally, we propose our new approach.
2.1 PCC-based Approaches
In PCC-based approaches, an untrusted code producer must convince a code consumer of the safety of his code by sending a proof. In this section, we examine Proof-Carrying Code and its limitations together with similar approaches. We visualize the approaches using data flow diagrams. Rectangles denote components that take data, represented in ovals, as input or output such data. Arrows indicate how the data flows, and the policies are shown by vertical scrolls.
2.1.1 PCC: Characteristics, and Obstacles
Characteristics
Proof-Carrying Code [20, 17, 18, 19, 21] refers to a static analysis approach in which
Since the verification is done before the execution of the code, PCC, like other static analysis approaches, enforces security conservatively. This conservative behavior is due to the lack of access to useful runtime information (such as inputs and variable values). Thus, using PCC can lead to some false negatives. Figure 2.1 shows how PCC treats received code. The darker region shows the code that PCC accepts as safe, while the white section between the origin axis and the darker region represents the safe code that PCC does not accept, referred to as false negatives.
Figure 2.1: Safe codes in PCC point of view (diagram labels: Malicious codes, Safe codes)
False negatives are the price that PCC pays to gain assured security. This clearly shows the importance of the certainty of security as one of the raisons d'être of the PCC approach.
To find out other important aspects of the PCC approach, we briefly describe the original PCC framework.
In a PCC system, there are typically two main parties: (1) a code producer, who builds machine code along with its safety proof, and (2) a code consumer, who wishes to run the compiled code as long as it satisfies the safety policy. In reality, the communication between these two parties is more complicated and consists of a multi-step interaction between the producer and the consumer. In the first step, the producer sends a program consisting of the code and additional annotations. These annotations consist of loop invariants and function pre- and post-conditions and provide more information about the code. This additional information makes the following step of the procedure easier.
In the next step, the consumer applies the Verification Condition Generator (VCGen) to the received annotated code, according to his/her particular safety policy, to generate a verification condition. A verification condition is a logical formula that, if satisfied, implies that the code satisfies the safety policy [18]. Here, the additional annotations can be used to make the Verification Condition Generator's job easier. Generating
Figure 2.2: Traditional Proof-Carrying Code framework (diagram labels: Producer, Program, Code, VC Generator, Verification Condition, Theorem Prover, Proof of VC, Consumer, Code, Proof, VC Generator, Verification Condition, Proof Checker, Safety Policies, CPU)
the producer.
The producer runs a theorem prover (in many cases along with the necessary human intervention) to obtain a proof of the received verification condition. The theorem prover uses the axioms defined as part of the consumer's safety policy. Thus, a proof of the verification condition constitutes a proof of the safety of the code with respect to the consumer's safety policy.
In general, proving the verification condition is resource-consuming, which can result in low performance. Furthermore, since the theorem prover is a complex program, it should not be placed on the consumer side. Therefore, in PCC the theorem prover is on the producer side. The beauty of this technique is that the difficult and complex job is not done by the code consumer during any step of PCC.
In the next step, the producer submits the proof to the consumer. The code consumer checks the proof before executing the code submitted by the producer; that is, he must verify that the received proof is really a proof of the verification condition. The code consumer runs the proof checker to verify that the proof is indeed a valid proof of the verification condition constructed in the previous phase. If the proof check succeeds, the consumer can then execute the code safely. Once the safety of an untrusted code is
As shown in Figure 2.2, it is possible to simplify the dialogue between the code producer and the code consumer, assuming that they share the same VCGen (the code producer has a copy of the VCGen). In this way, the code consumer receives the annotated code with its proof attached and runs the verification condition generator. Then, he checks the proof against the verification condition, and if the check succeeds he can execute the code safely.
One of the most important properties of the PCC framework is that PCC programs are tamper-proof. That is, an intruder cannot modify the code or the proof in a way that results in the execution of malicious code on the consumer side. Any attempt to tamper with either the code or the proof results in a validation error at proof checking. In the few cases where the code or the proof is modified such that the validation still succeeds, the new code is also safe.
TCB Components
As we mentioned earlier, PCC intends to have a small trusted computing base by shifting the hard task of proving the safety of the code to the producer side. Figure 2.3 shows the main components of the TCB in the original PCC framework. In practice, the trusted computing base in the PCC framework is composed of the following.
Figure 2.3: Trusted computing base in PCC (diagram labels: VC Generator, Proof Checker, Safety Policies, Hardware, Computing system)
• Hardware: we suppose that the actual machine hardware will operate as expected.
• Logical structures:
  – Logic: The logic is used to express the safety policies and proofs in a formal way. The proof of the soundness of the logic is done by hand, so a small and simple logic is more convenient.
  – Safety Policy: The policies which are expressed in the syntax of the logic
    ∗ Proof rules: The rules which are necessary for proving the verification condition. Two sample proof rules are presented in Figure 2.4(a).
    ∗ Decoder: The decoder interprets the semantics of an individual instruction and produces the local safety condition for executing that instruction and a set of possible machine states resulting from the execution of the instruction.
• Proof checker: An implementation of the PCC logic which makes it possible to express statements in the syntax of the logic and to check proofs mechanically is called a proof checker. Proof checkers are usually simple programs, and proof checking is a straightforward task. Assuming the logic has all the desirable properties, we must trust that the proof checker implementation is correct.
• Verification Condition generator: VCGen is responsible for handling the control-flow aspects of the code and produces a verification condition whose validity entails the safety of the code. As shown in Figure 2.4(c), the VCGen computes the verification condition of the entire program by combining all of the local safe-progress conditions (the safety conditions of each instruction) identified by the decoder. For this, the weakest pre-condition of the program is computed, starting from the post-condition and working backward. Figure 2.4(d) shows the pre-condition and the post-condition of the sample source code presented in Figure 2.4(b). Figure 2.4(e) shows a sample definition of the VCGen (we refer interested readers to [19] for detailed descriptions of the definitions).
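The backward computation of a weakest pre-condition can be sketched for the simplest case, a straight-line sequence of assignments, where the rule is wp(x := e, Q) = Q[e/x]. The expression encoding (integers, variable names, and `(op, left, right)` tuples) and the function names below are assumptions made for this sketch; the VCGen described above works on machine instructions through the decoder and also handles control flow and loop invariants, which are omitted here.

```python
import operator

# Expr: int literal | variable name (str) | (op, left, right)
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "<=": operator.le, "<": operator.lt, "==": operator.eq}


def subst(expr, var, replacement):
    """Return expr with every occurrence of variable `var` replaced."""
    if isinstance(expr, str):
        return replacement if expr == var else expr
    if isinstance(expr, tuple):
        op, left, right = expr
        return (op, subst(left, var, replacement), subst(right, var, replacement))
    return expr  # integer literal


def wp(program, post):
    """Weakest pre-condition of a list of (var, expr) assignments."""
    condition = post
    # Start from the post-condition and work backward through the program.
    for var, expr in reversed(program):
        condition = subst(condition, var, expr)
    return condition


def evaluate(expr, env):
    """Evaluate an expression in an environment mapping variables to ints."""
    if isinstance(expr, str):
        return env[expr]
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPS[op](evaluate(left, env), evaluate(right, env))
    return expr


# y := x + 1; z := y * 2, with post-condition z <= 10
program = [("y", ("+", "x", 1)), ("z", ("*", "y", 2))]
vc = wp(program, ("<=", "z", 10))
```

For this program the resulting condition mentions only the input `x`, namely `(x + 1) * 2 <= 10`, so it can be proven (or refuted) without running the code.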
Obstacles
The total size of the mentioned components constituting the TCB in proof-carrying code is approximately 15000 to 20000 lines of code. Any bug in these components can compromise the security of the whole system. It is often possible to use the elusive standard of residual defect density as a metric for faultiness, measuring the number of faults that remain in software code at the delivery point. A typical target in software development is to achieve a residual defect density of less than one error per one thousand lines of non-comment source code (KLOC) [37, 38]. However, leading-edge software development organizations typically achieve a defect density of about 2 defects/KLOC [39].
Figure 2.4: Verification condition generation. (a) Sample proof rules; (b) Sample source code; (c) The verification condition for the entire program; (d) The pre-condition and the post-condition of the sample code; (e) Sample verification condition generator definition. The figures are taken from “Proof-Carrying Code” by G. Necula (POPL'97).
about the TCB grows along with its number of lines. Therefore, to have a safe and
implementable PCC framework, one of the obstacles ahead is its relatively large TCB.
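To make the concern concrete, the figures quoted above can be combined in a back-of-the-envelope estimate. The sketch below simply multiplies code size by defect density; treating the "less than one error per KLOC" target as exactly 1 defect/KLOC is a simplifying assumption, and the function name is hypothetical.

```python
# Expected residual defects = code size (KLOC) x defect density (defects/KLOC).
def expected_defects(kloc, defects_per_kloc):
    return kloc * defects_per_kloc

tcb_sizes_kloc = (15, 20)  # TCB size range quoted in the text
densities = (1, 2)         # assumed target density and typically achieved density

low = expected_defects(min(tcb_sizes_kloc), min(densities))
high = expected_defects(max(tcb_sizes_kloc), max(densities))
print(f"expected residual defects in the TCB: {low} to {high}")
```

Under these assumptions a TCB of this size would be expected to ship with somewhere between 15 and 40 residual defects, any one of which could in principle affect safety.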
Besides the problem of the large TCB, there is also the matter of proof size. In
principle, the proofs can be exponentially large, as mentioned by Peter Lee: “...as a
general matter, the size of the binaries is an issue that must be addressed carefully” [23].
Thus, one of the crucial obstacles for the practical applicability of Proof-Carrying Code
and related techniques is the size of the proofs that must accompany the code. It is
important to have a compact representation of proofs because they may be sent
through communication networks. In the traditional PCC framework, it was not unusual
to see proofs that were 1000 times larger than the associated code which made the use
enough flexibility. That is, the producer is constrained to submit a proof in a logic
which has been imposed by the consumer. Even if the producer finds it possible
to build a simpler proof in a higher-order logic, he is forced to build the proof in the
consumer's logic, which might result in an overweight proof.
To sum up, the PCC technology faces the following obstacles:
1. proofs are large (practicality);
2. the TCB is relatively large (security);
3. the producer must do the hard work with inadequate means (flexibility).
Any solution that combats these obstacles and refines the PCC
technology should respect the following fundamental characteristics of the PCC approach:
1. give the highest priority to security (raison d'être);
2. intend to have a small TCB;
3. leave the easier tasks to the consumer;
4. cannot be tampered with.
In the following sub-sections we summarize two other PCC-based approaches
proposed to combat the mentioned obstacles. We try to make the different approaches
comparable by highlighting two aspects: the trusted computing base size and the size
of the safety proofs in those systems.
2.1.2 Foundational PCC
In order to relieve the second drawback discussed in the foregoing sub-section, Appel
introduced the notion of foundational proof-carrying code (FPCC) [28].
Although the TCB components in traditional PCC framework are simple, Appel
To explain more clearly, if there exists a bug in either the VCGen or the typing
rules then the TCB becomes vulnerable. As a matter of fact, a proof-carrying code
certifying compiler for Java named Special-J [24] happened to have errors in its typing
rules, discovered by League et al. [30]. Needless to say, this bug affects the overall safety
of the PCC system.
Figure 2.5: Foundational proof-carrying code framework. (Diagram labels: Producer, Program, Code, Theorem Prover, Proof, Consumer, Proof Checker, CPU.)
Foundational Proof-Carrying Code aims to further reduce the TCB size by an order
of magnitude. That is, emphasizing a minimal TCB, they removed the VCGen and the
safety policy from the consumer side, as shown in Figure 2.5.
Foundational PCC is based on the idea of defining the semantics of the machine
instructions and the proof rules using only a foundational mathematical logic. In this
way, Appel et al. avoid using the VCGen by defining the operational semantics of
machine instructions and the safety policies in a higher-order logic. This is done by
modeling each machine instruction as a transition from one machine state (set of
memory and registers) to another machine state. Then, the safety policy is defined
in the following way: A given state is safe if, for any state reachable by an arbitrary
sequence of legal instructions, there is a safe successor state. Hence, a code is safe if we
get a safe state. For this, type checking rules are proved as lemmas. Then, the proofs
are constructed and checked once per system, and rules are used many times to check
programs. In this way, a safety proof is the application (derivation) of type checking
rules and it shares proofs of common lemmas [29]. FPCC uses a higher order logic
with few axioms of arithmetic, from which it is possible to build up most of modern
There are three main components in a Foundational PCC system: a theorem prover,
a proof checker, and a safety proof of the code. The theorem prover should produce a
proof of safety to accompany the code. The proof checker verifies the safety
proof before the program gets executed.
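The state-based safety definition used above can be pictured on a toy finite transition system. The sketch below is only an illustration: FPCC establishes safety by a proof in higher-order logic, not by exploring states, and the function name, the modeling of halting as a self-loop and the two sample transition tables are assumptions of this example.

```python
from collections import deque

def is_safe(start, step):
    """Treat a state as safe if no state reachable from it is stuck.

    `step` maps a state to the list of its successor states. A state with no
    successor is 'stuck', meaning no legal instruction applies to it.
    """
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        successors = step(state)
        if not successors:  # stuck state reached: the start state is unsafe
            return False
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

# Hypothetical machines, with the program counter as the whole state.
good = {0: [1], 1: [2], 2: [2]}  # state 2 models a clean halt as a self-loop
bad = {0: [1], 1: [3]}           # state 3 has no legal successor

good_is_safe = is_safe(0, lambda s: good.get(s, []))
bad_is_safe = is_safe(0, lambda s: bad.get(s, []))
```

Real machine state spaces are far too large to enumerate in this way, which is why FPCC relies on lemmas about typing rules instead.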
Safety Proof Size
The proofs in Foundational PCC, in comparison with traditional PCC, are more
complicated to produce and, as Appel himself stated, can explode exponentially. Therefore,
the proof size, which is a crucial obstacle for the practical applicability of Proof-Carrying
Code and related techniques, remains unsolved. According to Necula, the proof size
in Foundational PCC is 20% bigger than the proof size in traditional PCC. This makes
the proof communication harder and the use of Foundational PCC even less practical
than the traditional PCC [27].
Trusted Computing Base
Foundational PCC is concerned with minimizing the trusted computing base of the
system, which does not include the VCGen, as shown in Figure 2.6. FPCC, in principle, is strictly
more secure than traditional PCC because it has a smaller trusted computing base.
With this technique, the Verification Condition Generator is removed from the TCB, and
the TCB becomes minimal.
Figure 2.6: Trusted computing base in FPCC. (Diagram labels: Proof Checker, Hardware, Computing system.)
2.1.3 Oracle-Based PCC
One of the main impediments to scalability in traditional PCC is that the proofs can
be very large. In order to alleviate this problem, Necula proposed a new strategy called
in Figure 2.7, this change in strategy led to a change in the framework, namely, they
assumed that the consumer uses a non-deterministic proof checker.
Figure 2.7: Oracle-based proof-carrying code framework. (Diagram labels: Producer, Program, Code, VC Generator, Verification Condition, Theorem Prover, Proof witness, Consumer, Non-Deterministic Proof Checker, Safety Policies, CPU.)
In order to make use of the new non-deterministic proof checker, they replaced the
proof by an oracle string which guides the non-deterministic checker. Every time the
checker must make a choice between the possible ways to proceed, it consults some bits
from the oracle.
To be more precise, the untrusted theorem prover on the left-hand side records a
sequence of bits that shows which sub-goals failed and needed backtracking. Then, the
producer sends this bit-stream to the consumer. On the consumer side, the received
bit-stream works as an oracle which can be used by the trusted non-deterministic
proof checker to avoid backtracking. It goes without saying that the oracle, like proofs
in PCC, need not be trusted. That is, if the oracle is wrong, then the trusted checker
will go wrong, and will fail to find the proof.
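The oracle mechanism described above can be sketched on a toy goal-directed prover. This is an illustration under stated assumptions, not Necula's checker: the rule table, goal names and function names are hypothetical, and the oracle is kept as a list of small integers, each of which could be encoded in a few bits.

```python
# Each goal has a list of alternative rule bodies; a body is a list of sub-goals.
RULES = {
    "safe": [["typed", "aligned"], ["trusted"]],
    "typed": [["missing_axiom"], []],  # the first alternative fails, the second is an axiom
    "aligned": [[]],
}

def prove(goal):
    """Producer side: backtracking search that records the choices that worked."""
    for index, body in enumerate(RULES.get(goal, [])):
        choices = [index]
        for sub in body:
            sub_choices = prove(sub)
            if sub_choices is None:
                break  # this alternative failed: backtrack to the next one
            choices += sub_choices
        else:
            return choices
    return None

def check(goal, oracle):
    """Consumer side: replay the recorded choices without ever backtracking."""
    alternatives = RULES.get(goal, [])
    if not oracle or not 0 <= oracle[0] < len(alternatives):
        return False, oracle  # a wrong oracle only makes the check fail
    body, rest = alternatives[oracle[0]], oracle[1:]
    for sub in body:
        ok, rest = check(sub, rest)
        if not ok:
            return False, rest
    return True, rest

oracle = prove("safe")
```

The checker performs one lookup per goal, so its work is linear in the size of the proof it reconstructs, and a corrupted oracle can cause rejection but never acceptance of an unprovable goal.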
In this approach, the trusted non-deterministic proof checker is, in fact, a
non-deterministic theorem prover. This theorem prover is given the task of proving the
verification condition. Whenever the prover has to pick from $n$
choices, it reads some bits from an oracle string to resolve that choice. As a result, the oracle is used to drive the theorem prover to a final proof without search, and as such, the oracle string can
Safety Proof Size
The oracle-based proof-carrying code is efficient. Experimental evidence shows that
oracle strings, as suggested by Necula, can be about 1/8 of the code size and about 30
times smaller than proofs in traditional PCC [27]. However, Wu [31] found the code
size relation deceptive: Unfortunately, this statistic is somewhat misleading. [...]
a machine language program and a proof witness. The SpecialJ proof-carrying Java
system on which Necula measured oracle-based checking transmits three components:
the machine code, the proof, and a Java class file. The Java class file, as is usual in
any Java system, contains descriptions of the types of all procedures (methods) in the
program (untrusted code), including formal parameter and result types. However, the
1/8 size figure does not include the Java class files.
While the small size and low cost of checking an oracle string are appealing, a
potential problem with them is that there are no currently known ways to manipulate
or compose them. Thus, oracle strings for subprograms might be hard to use directly
when trying to find certificates for larger programs (oracle strings are based on guiding
the search for cut-free proofs). They are also fragile in the sense that small changes
in the formula to be proved or in the version of the theorem prover can invalidate an
oracle string.
Trusted Computing Base
The downside of Oracle-based PCC is that, as shown in Figure 2.8, it involves
complex trusted components, such as a type system with axiomatic rules for
memory safety, the VCGen, and the non-deterministic proof checker. Any flaw in the
implementation of these components can compromise the safety of the system.
Figure 2.8: Trusted computing base in OPCC. (Diagram labels: VC Generator, Non-deterministic Proof Checker, Safety Policies, Hardware, Computing system.)
and as the second principle of security design suggests, any bug in the TCB may cause
an unsafe program to be accepted. For example, the Special-J system showed a critical
leak in its type axioms [30]. Unfortunately, the large size of the TCB in
OPCC goes against the first and the third characteristics of a PCC approach, as mentioned
in Section 2.1.1.
2.2 Extended Proof-Carrying Code framework
In this section we study the Extended Proof-Carrying Code framework. First, we
explain the idea behind our proposed approach. Then, we present the framework and
discuss its properties.
2.2.1 Sending a Proof generator
As we mentioned earlier, one of the crucial issues for the practical applicability of
Proof-Carrying Code and its related techniques is the size of the proofs that must accompany
the code. Therefore, it is desirable that proofs be represented in a compact format.
One way to reach this goal is proof optimization, in which the proofs are built in a more
compact form and can be interpreted as proofs of the original form [32, 33]. The best
proof optimization approaches result in proofs which are 15-30 times smaller than the
original proof and pay the price of the enlargement of the TCB [27, 31, 26]. We are
not in favor of compromising the security of the system by a big TCB expansion simply
because the proofs are too large.
Another way of compacting the proofs is through data compression. Data
compression techniques try to find more compact representations for data, from which the
original data can be reconstructed exactly. Many such algorithms compress data by
searching for more efficient encodings that take advantage of repetition in the data.
These techniques are not well exploited in the PCC framework for the following
reasons. The consumer of compressed data must first decompress it, which requires a safe
decompressor on the consumer side. Generating the proof of safety for a normal
decompressor (a relatively big program with about 3000 lines of code) is a difficult task not
worth performing because such a decompressor would be a specific decompressor that
cannot work with a proof compressed by an appropriate but
which is appropriate for the safety proof of a code.
We present in this thesis an extended framework that allows the PCC proofs to be
represented as programs. This helps us avoid paying a proof-size price and enables
PCC to handle even very large programs. The idea behind the new framework, which
we are going to present, is inspired by Kolmogorov complexity. We introduce the
notion of Kolmogorov complexity in the following sub-section.
2.2.2 Kolmogorov complexity
Roughly speaking, the Kolmogorov complexity of a string is the length of the shortest computer
program that produces that string, i.e., that computes it, prints it, and then halts.
One important observation is that this measure of complexity indicates how much a
string (or, in the context of proof-carrying code, a proof) can be compressed: the ideal
compressed form for a given proof is the shortest program that outputs that proof.
Formally, the Kolmogorov complexity $K_U(x)$ of a string $x$ is defined as the length $\ell$ of the shortest program capable of producing $x$ on a universal computer $U$ such as a Turing machine. This complexity is incomputable.
\[ K_U(x) = \min_{p \in \{0,1\}^*} \{ \ell(p) : p \text{ on } U \text{ outputs } x \} \]
The definition depends on the specific computer programming language and the
universal computer that is used. We define these two components according to our generic
extended PCC framework, which we present next.
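The gap between the length of a string and the length of a program that prints it can be seen in a small experiment. The sketch below takes Python as the "universal computer", which is an assumption of this example only; the proof text and the generator source are hypothetical. The length of the generator is merely an upper bound on the Kolmogorov complexity, which itself cannot be computed.

```python
import contextlib
import io

# A hypothetical, highly regular "proof": the same inference step repeated.
proof = "apply rule_42 to goal;\n" * 10_000

# A short program (the "generator") whose only job is to print that proof.
generator_source = 'print("apply rule_42 to goal;\\n" * 10_000, end="")'

# Run the generator and capture what it prints.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    exec(generator_source)  # acceptable here only because we wrote the source ourselves

regenerated = buffer.getvalue()
print(len(generator_source), "characters of program yield", len(regenerated), "characters of proof")
```

Calling `exec` on a string is safe in this sketch only because the source is our own; a generator received from an untrusted producer has to run in a controlled environment instead.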
2.2.3 Extended Proof-Carrying Code framework
The idea behind the Extended Proof-Carrying Code (EPCC) is simply to send the
proof in the form of a program. In this way, we make it possible for the producer to
send a proof generator instead of the proof, where, according to Kolmogorov complexity,
the proof generator ideally can be the shortest program which can output the original
proof. For this to work, the consumer should be capable of running the proof generator
Proposed generic framework
In order to benefit from the above idea in an organized manner, we propose a generic
EPCC framework. A diagram of an EPCC system is given in Figure 2.9. In an EPCC
system, there are two main parties: (1) a code producer, who sends a code along with its
safety proof generator, on the left-hand side and (2) a code consumer, who wishes to
run the code, provided that it is proven safe by the system, on the right-hand side.
The communication between these two parties may consist of a multi-step
interaction between the producer and the consumer depending on the proof-carrying code
framework that they extend. Generally, at the first step, the producer runs a theorem
prover to get a safety proof of the code he intends to send. Here, in contrast with other
PCC frameworks, the producer is not forced to generate the safety proof in the logic
that the consumer imposes.
Figure 2.9: The framework of the generic Extended Proof-carrying code. (Diagram labels: Producer, Program, Code, Theorem Proving System, Proof, Proof generator Builder, Proof generator, Consumer, VEP, Proof Checking System, CPU.)
The producer can use this opportunity to build the proof in a logic (e.g., a
higher-order logic) that results in a smaller proof. In other words, the producer has the
possibility of reducing the size of the safety proof by using a custom logic which can be
later converted (translated) to the logic set by the consumer.
Then, the producer writes a proof generator. In accordance with the Kolmogorov
complexity, this proof generator can, in principle, be the shortest program which can
output the safety proof in the format which is acceptable to the consumer. That is
to say, the generic EPCC framework provides the producer with the opportunity of
In the next step, the producer submits the code accompanied by its safety proof
generator to the consumer. The consumer is required to check the proof before executing
the code submitted by the producer. Therefore, he runs the safety proof generator on
the Virtual machine of EPCC (VEP) and obtains the safety proof. Then he runs the
proof checker. After the proof check succeeds, the consumer can repeatedly execute the
code safely. As one can easily observe, the EPCC framework, like PCC, is
tamper-proof.
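The consumer-side steps just described can be summarized in a short sketch. All names here are hypothetical placeholders: `toy_vep` is not a sandbox and `toy_checker` is not a proof checker, they only stand in for the trusted components so that the order of the steps can be exercised.

```python
def consume(code, proof_generator, run_on_vep, check_proof, execute):
    """Consumer-side steps: generate the proof, check it, then run the code."""
    proof = run_on_vep(proof_generator)  # untrusted generator, trusted VEP
    if proof is None or not check_proof(code, proof):
        return None                      # reject: the code is never executed
    return execute(code)

# Hypothetical stand-ins for the trusted components.
def toy_vep(generator):
    try:
        return generator()
    except Exception:
        return None  # a failing generator yields no proof

def toy_checker(code, proof):
    return proof == "proof-of:" + code

def toy_cpu(code):
    return "ran " + code

accepted = consume("add r1, r2", lambda: "proof-of:add r1, r2", toy_vep, toy_checker, toy_cpu)
rejected = consume("add r1, r2", lambda: "garbage", toy_vep, toy_checker, toy_cpu)
```

The point of the ordering is that the output of the generator is never trusted by itself: a wrong or malicious generator can only cause the proof check to fail.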
One of the crucial components in the EPCC framework is the VEP, which is a part of
the trusted computing base of the EPCC. Safe execution of the proof generator depends
on the safety of the VEP and the way it imposes the security requirements. Here, we
touch on some important aspects of the VEP, and later, in Chapter 3, we study the
design of the VEP thoroughly. In the following, we discuss the ways in which the VEP
provides us with the necessary basis for applying the Kolmogorov complexity idea and
enables the execution of the proof generator at the consumer side in a secure manner.
The VEP: A Universal Computer
A universal computer is a computer which is capable of universal computation. That
is, given a description of any other computer or program and some data, a universal
computer can perfectly emulate this second computer or program [34]. The best-known
contender for the title of universal computer is the Turing machine. A Turing machine
is a computing machine which has a number $n$ of one-way infinite tapes, divided into cells, one next to the other. The cells of the tapes can be blank or contain a symbol from some finite alphabet. The first of the tapes is known as the input tape, on which
a string of symbols is written, and the last of the tapes is known as the output tape,
where the result of the Turing machine for that input is written. The other $n - 2$
tapes can be thought of as auxiliary tapes. On each tape the Turing machine has what is called a head. At any one time, a head sits on a particular cell and can read the symbol
which is written on that cell, write a symbol onto that cell and move to the left or to
the right or stay put (in some models the tape moves and the head is stationary). It is
worth mentioning that a Turing machine can equivalently process a single infinite tape.
The current implementation of the VEP is a stack-based machine which is equivalent
in computing power to a Turing machine. The VEP reads the code and performs actions
on its Stack and Heap. Here, the code space can be regarded as a read-only tape and
Knowing that the VEP has finite resources raises the question of whether it can be
considered a universal computer destination for the proof generator according to Kolmogorov
complexity. The answer is yes, because in a finite amount of time, a
universal computer can only manipulate a finite amount of data which fits in finite
resources. In this way, the VEP can be considered as a universal computer destination
for the proof generator.
The VEP: An Execution Monitor
The proof generators in the EPCC framework are untrusted programs which have to be
executed on the consumer side. Since running untrusted programs on the consumer side
is against the raison d'être of the PCC approach and can compromise the security of
the system, we need a security mechanism for running the proof generator safely. For
this to happen, the VEP should provide a tightly-controlled set of resources for proof
generators to run in. Network access, the ability to inspect the host system, reading
from input devices and writing into file streams should be disallowed. In this sense, the
VEP ought to be an execution monitor.
As we mentioned in Section 1.3.2, two main drawbacks of execution monitors
are their high overhead and their fail-stop manner of handling unsafe code. Here,
we discuss whether each of these issues applies to the VEP.
As for the overhead, in execution monitoring, the system resources are engaged by
the monitor the whole time the code is running, and even if one run of the code was safe,
we cannot be sure about the next runs. Thus, the overhead is the result of the system
resources being engaged by the monitor for each and every period of the code execution.
Overheads are usually quantifiable costs of some kind. If we denote the cost as $C$ and the total cost of a monitor as $C_T$, we can express $C_T$, the total cost of an execution monitor $EM$, as:
\[ C_T(EM) \approx \sum_{n=1}^{m} \left( t_{EM} \cdot \mathrm{avg}(C(EM)) \right) \]
where $n$ counts the times the execution monitor $EM$ runs, $t_{EM}$ is the execution time period of the $EM$, $m$ is the total number of runs and $\mathrm{avg}(C(EM))$ is the average cost of running the monitor $EM$ per CPU cycle, which can be defined as:
\[ \mathrm{avg}(C(EM)) = \frac{\sum_{t=1}^{t'} C_t^{EM}}{t'} \]
where $C_t^{EM}$ is the cost of the monitor in the period $t$. Now we can formulate the problem as follows:
\[ \lim_{m \to \infty} C_T(EM) = \infty \tag{2.1} \]
\[ \lim_{t_{EM} \to \infty} C_T(EM) = \infty \tag{2.2} \]
As shown in Equations 2.1 and 2.2, an unbounded number of runs and an unbounded execution
time of the execution monitor can each place an unbounded cost on the system which
uses the execution monitor.
In the case of EPCC, we need the execution monitor VEP to run only a single
time, in which the proof generator outputs the proof or fails. Therefore, for
Equation 2.1, the number of runs $m$ is bounded to $1$. Now, if we can run the monitor for a limited period of time, we can bound Equation 2.2. For that reason, the VEP runs for a limited number of CPU cycles, which is set in a prior agreement between the
producer and the consumer, and checked during the execution of the code. Hence, the
problem can be bounded as follows:
\[ C_T(EM) \le t_{EM} \cdot \max(C(EM)) \]
In this way, the VEP can enforce fine-grained memory safety, control-flow safety, and
type safety through execution monitoring with an insignificant constant cost.
As for the second drawback of execution monitors, the fail-stop manner is aligned
with the safety of the EPCC framework. That is, we need the VEP to act in a fail-stop
manner to prevent an unsafe proof generator from continuing its execution. Therefore, not
only does the fail-stop manner have no dangerous consequence, but it is also required. Thus,
the major drawbacks of using execution monitors are negligible when using the VEP
as an execution monitor.
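The combination of a cycle budget and fail-stop behaviour can be sketched with a tiny stack machine. This is not the VEP and does not use its instruction set: the opcodes `push`, `add`, `emit`, `jump` and `halt`, the `VepFault` exception and the limits passed to `run` are all hypothetical choices for illustration.

```python
class VepFault(Exception):
    """Raised to stop the generator as soon as a limit or check is violated."""

def run(program, max_cycles, max_stack):
    """Execute a list of (opcode, argument) pairs and return the output list."""
    stack, output, pc, cycles = [], [], 0, 0
    while True:
        if cycles >= max_cycles:
            raise VepFault("cycle budget exhausted")
        if not 0 <= pc < len(program):
            raise VepFault("control left the code space")
        op, arg = program[pc]
        cycles += 1
        pc += 1
        if op == "push":
            if not isinstance(arg, int) or len(stack) >= max_stack:
                raise VepFault("bad operand or stack limit exceeded")
            stack.append(arg)
        elif op == "add":
            if len(stack) < 2:
                raise VepFault("stack underflow")
            stack.append(stack.pop() + stack.pop())
        elif op == "emit":
            if not stack:
                raise VepFault("stack underflow")
            output.append(stack.pop())
        elif op == "jump":
            if not isinstance(arg, int):
                raise VepFault("bad jump target")
            pc = arg
        elif op == "halt":
            return output
        else:
            raise VepFault("unknown instruction")

well_behaved = [("push", 2), ("push", 3), ("add", None), ("emit", None), ("halt", None)]
endless_loop = [("jump", 0)]
result = run(well_behaved, max_cycles=100, max_stack=8)
```

Running `endless_loop` with any finite `max_cycles` raises `VepFault` instead of hanging, so the cost of monitoring stays bounded by the agreed budget, and every check happens before the instruction takes effect.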
The VEP: Accordance with Security Design Principles
It is of high importance for an approach to be in accordance with the principles of
security design. Obviously, the VEP and other execution monitors are in partial accordance
with the least privilege principle, as they are intended to perform such a task.
In addition to this natural accordance of the VEP with the least privilege principle,
there is an agreement in which the producer and the consumer settle the amount of
resources that can be used by the proof generator. Among these resources are the heap,
the stack, and the code space of the proof generator. Any violation of the agreement
by the proof generator leads to the termination of its execution. In this way, the
VEP puts the principle of least privilege strictly into practice.
With regard to the second principle of security design, we set a criterion for
the size of the TCB. The criterion was to design and build the VEP in such a way that the
enlargement of the TCB is less than the difference between the size of the TCB in
Oracle-based PCC and the size of the TCB in traditional PCC in terms of lines
of code. That is, we aimed to implement the VEP such that the security of EPCC is
stronger than that of Oracle-based PCC according to the second principle of security design.
The size difference between the two versions of the TCB in traditional PCC and
Oracle-based PCC is about 2000-3000 lines of code. Interestingly, the current version
of the VEP is less than 300 lines of code, which is much smaller than the standard we
set. Since the VEP consists of a small number of lines, it can be verified easily by pen
and paper. Furthermore, in principle, the VEP does not need to increase the size of the
TCB if it is possible (without difficulty) to prove it safe in a PCC framework.
2.3 EPCC Applications
In this section, we present some of the possible applications of the EPCC framework.
For this, we start by studying the benefits of employing EPCC on the traditional PCC
framework (i.e., extending the traditional PCC framework in a way that it can accept
a proof generator). Then we propose employing EPCC for FPCC and OPCC as
two other PCC techniques and discuss the possible benefits. It is important to mention that
only the first application, which is an EPCC version of the traditional PCC framework,
is implemented as a part of this work (detailed information about the implemented
framework is given in Chapter 4) and the two other EPCC frameworks are presented as feasible
propositions.
Extending traditional PCC
version of traditional PCC is shown on the right-hand side. The dialogue between
the producer and the consumer remains the same as in traditional PCC except for some
minor modifications. In the extended version of traditional PCC, instead of accompanying
the code with a safety proof, the producer accompanies it with a safety proof
generator which he has built and custom-made earlier. On the consumer side and upon
reception of the proof generator, the consumer safely executes the proof generator on
the VEP and obtains the proof. The generated proof is then given to the proof checker.
The proof checker checks the generated proof against the verification condition and if
the checking is successful, the consumer can run the code safely.
Figure 2.10: Consumer side in PCC versus its extended version. (Diagram labels: PCC Consumer, EPCC Consumer, Code, Proof, Proof generator, VEP, VC Generator, Verification Condition, Proof Checker, Safety Policies, CPU.)
The safety proofs in PCC are represented in the Edinburgh Logical Framework (LF) [36].
A logical framework is a formal system in which other logics can be readily represented.
The typical LF representations of the proofs are large, due to a significant amount
of redundancy. Storing these proofs in a format that requires less space than usual
(e.g., compressing them) would alleviate the problem of proof size in communications,
because it enables devices to transmit or store the same amount of data in fewer bits.
Lossless data compression techniques work best on data with repetition in its
representation. Therefore, the fact that proofs contain many repeated patterns of proof rules
and redundant arguments makes them suitable for lossless data compression. To gain
a better compression, the data compression algorithm can be custom-made in keeping
with the content of the proof which is going to be sent.
In our experiment using the new strategy of EPCC with an off-the-shelf
compression technique, the type safety proof generators average 5% of the original proofs, which
is about 30 times smaller than before. Interested readers can consult Chapter 4 to
obtain more information about the results and the end-to-end implementation of the
sibility of sending a proof generator. This gives the producer a chance to build
a compact and specialized proof generator which can output the same proof on the
consumer side. In this way, the proof size issue can be alleviated while the parties
are provided with a more flexible framework in which the original logic of the proof
generator can be different from that of the generated proof.
Extending Foundational PCC
Figure 2.11 shows the consumer side in the Foundational PCC framework on the
left-hand side and its EPCC version on the right-hand side.
Figure 2.11: Consumer side in FPCC versus its extended version. (Diagram labels: FPCC Consumer, EPCC Consumer, Code, Proof, Proof generator, VEP, Proof Checker, CPU.)
In the FPCC framework, the proofs are bigger than the proofs in traditional PCC,
which makes the scalability of FPCC harder. By extending the FPCC framework, the
producer can send a proof generator whose size can be a fraction of the original proof
size. In this way the crucial obstacles for the practical applicability of FPCC can be
alleviated. Since the size of the VEP in Extended FPCC is smaller than the size of the
VCGen in the traditional PCC framework, in principle, extended FPCC could be immediately
more secure than the traditional PCC because it has a smaller TCB. The dialogue in
extended FPCC is similar to the one in FPCC, except that in extended FPCC the
consumer executes the proof generator on the VEP to obtain the proof.
Extending Oracle-based PCC
In Figure 2.12, the consumer side in Oracle-based PCC is shown on the left-hand side.
the right-hand side. On this side, the code is accompanied by a proof generator. The
consumer can execute the proof generator on the VEP and obtain the proof. Then the
proof checker checks the generated proof against the verification condition and if the
checking is successful, the consumer can run the code safely.
Figure 2.12: Consumer side in OPCC versus its extended version. (Diagram labels: OPCC Consumer, EPCC Consumer, Code, Proof, Proof generator, VEP, VC Generator, Verification Condition, Proof Checker, Non-deterministic Proof Checker, Safety Policies, CPU.)
By extending the Oracle-based PCC framework, we provide the producer with the
possibility of sending a proof generator. The proof generator can use the oracle idea
to generate the complete proof as the output. Since the VEP is smaller than the size
difference between the non-deterministic proof checker and the original PCC proof checker,
the TCB size issue can be alleviated while the parties are provided with a more flexible
framework in which the original logic of the proof generator can be different from that of the
generated proof.
2.4 Overview
The Extended Proof-Carrying Code framework aims to make the PCC idea more scalable
and practical by alleviating the proof size issue while respecting the characteristics of
the PCC technique.
EPCC provides the code consumer with the luxury of using a safe environment in
which a big class of proof generators can be executed in a secure manner, regardless
of the original logic in which the proofs were represented. In this way, EPCC leaves
the easier tasks to the consumer and gives adequate means to the producer to do the
hard task. This major flexibility for the consumer and producer, in addition to the
alleviation of the proof size issue, is gained through a minor TCB extension of less
The VEP Virtual Machine
This chapter discusses the design of the Virtual machine for Extended PCC (VEP).
It describes the trade-offs we had to make when designing the VEP and discusses the
way in which the VEP works.
3.1 Machine Design
In this section, we present the design process of the VEP. A virtual machine is a
functional simulation of a computer and its associated devices [41], which is implemented
by adding software to an execution platform to give it the appearance of a different
platform, which may have an instruction set that differs from that implemented on the
underlying real hardware. Figure 3.1 shows the idea of using virtual machine by the
Figure 3.1: Virtual Machine. (Diagram labels: Host Computing System, Virtual Machine, Untrusted Code.)
nections. On the other hand, a virtual machine, which is the virtualizing software, can
translate (completely or partially) the instruction set architecture of the original
platform, so that the untrusted code sees a different instruction set architecture from the
one supported by the platform. That is, a virtual machine can work as a (partial or
complete) emulator which executes programs written for the virtual machine instruction
set on a machine that executes a different instruction set. Having a restricted instruction
set (e.g., the unnecessary instructions which give the potential to write unsafe code
are omitted) and safe emulation (i.e., performing necessary checks before executing an
instruction) by the virtual machine can both improve the security of the system. In this
way, a virtual machine can be used to increase security, provide enhanced performance
and simplify software migration.
3.1.1 Captured goals and requirements
The virtual machine design process starts by capturing the requirements. In the EPCC
framework, we execute the proof generator on the VEP. The proof generator can be
a package of a decompression algorithm and the compressed proof. In this way, by
executing the proof generator, the consumer is actually decompressing the compressed
proof. We used the GUNZip algorithm as a representative of algorithms within the
decompression techniques area. As a guideline, we tried to design the VEP in a way
that it can support an efficient execution of programs written in a broad range of
languages.
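The idea of a proof generator as a package of a decompressor and a compressed proof can be sketched with Python's standard `gzip` module. This is only an illustration: the proof text is hypothetical, the closure stands in for generator code that would really run on the VEP, and the size of the decompressor itself is not counted here.

```python
import gzip

# Hypothetical proof text with the kind of repetition found in real proofs.
proof = ("(and_intro (type_ok r1 int) (type_ok r2 int))\n" * 2000).encode("ascii")

def build_generator(proof_bytes):
    """Producer side: package the compressed proof with its decompression step."""
    payload = gzip.compress(proof_bytes)

    def generator():
        return gzip.decompress(payload)  # running the generator regenerates the proof

    return generator, len(payload)

generator, shipped_size = build_generator(proof)
regenerated = generator()
print(f"proof: {len(proof)} bytes, compressed payload: {shipped_size} bytes")
```

In an actual EPCC setting the decompression routine would have to be expressed in the VEP instruction set and shipped with the payload, so its size adds to the size of the generator.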
The requirements of a virtual machine are mainly concerned with properties
such as size, portability, performance, memory consumption, scalability, security, etc.
In the case of the VEP, we dealt with the following requirements:
1. The VEP should provide us with a platform which has the potential of working
with the Kolmogorov ideal compressor. According to the Kolmogorov complexity,
this ideal compressor runs on a universal computer.
2. It should enable the execution of the proof generator at the consumer side in
a secure manner. That is, the VEP should provide a tightly controlled set of
resources for the proof generator. Network access, the ability to inspect the host
system, reading from input devices and writing into file streams should be disallowed.
Therefore, the VEP should be able to perform execution monitoring.