Encoding the program correctness proofs as programs in PCC technology

Thesis presented

to the Faculty of Graduate Studies of Université Laval

as part of the master's program in computer science

for the degree of Master of Science (M.Sc.)

Faculty of Science and Engineering

UNIVERSITÉ LAVAL

QUÉBEC

2008

Résumé

One of the difficulties in the practical application of Proof-Carrying Code (PCC) is the difficulty of communicating the proofs, since they are generally large.

The approaches proposed to alleviate this problem create new problems, notably the enlargement of the Trusted Computing Base: a bug may then cause a malicious program to be accepted.

Instead of transmitting a proof with the code, we propose to transmit a proof generator within a generic, extended PCC framework (Extended Proof-Carrying Code, EPCC). A generator is a program whose sole function is to produce a proof, namely the proof that the code it accompanies is correct. The main purpose of the generator is to be a more compact representation of the proof it generates. The EPCC framework enables the execution of the proof generator [...]

Abstract

One of the key issues with the practical applicability of Proof-Carrying Code (PCC) and its related methods is the difficulty of communicating the proofs, which are inherently large.

The approaches proposed to alleviate this suffer from drawbacks of their own, especially the enlargement of the Trusted Computing Base, in which any bug may cause an unsafe program to be accepted.

We propose to transmit, instead, a proof generator for the program in question in a generic extended PCC framework (EPCC). A proof generator aims to be a more compact representation of its resulting proof. The EPCC framework enables the execution [...]

Acknowledgments

I am greatly indebted to my advisor, Danny Dubé, for his many contributions to this work; he is a thoughtful person and a real problem solver. He has allowed me much freedom to explore many different ideas throughout my master's studies and has encouraged all of my aspirations. Danny's guidance and insights have been an invaluable asset to my research, and I sincerely hope that this work reflects his usual high standard for quality.

I would also like to thank the other members of my M.Sc. committee, Béchir Ktari and the late Clermont Dupuis, for their encouragement and for spending their precious time to read my thesis and provide comments, advice, and questions.

Josée Desharnais, Béchir Ktari, Mohamed Mejri, and Jean-Marie Beaulieu must be thanked as professors for their understanding and their excellent classes at Laval University. I interacted with Béchir more than the others. There were many times I visited his office for various problems. He was always reachable; his door was always open. Though we did not use the NWCC compiler, I would like to thank Nils Weller for helping me to learn how NWCC works.

I would also like to thank all the people in the administrative offices at Laval University, especially Lynda Goullet, for helping me out with all the paperwork.

My deepest thanks to my parents, without whom I would not be here. They have provided me with loving support throughout my childhood and adulthood, and have been a great source of inspiration. I am very grateful to my sister Setareh for her constant love and support.

Last and most significantly, I express my love and appreciation for my wife, Sara Shanian, who tirelessly supported me and encouraged me throughout my master's [...]

Contents

Résumé
Abstract
Acknowledgments
Contents
List of Figures

1 Introduction
  1.1 Software Security
  1.2 Security design principles
    1.2.1 Least privilege
    1.2.2 Minimum trusted computing base
  1.3 Current approaches
    1.3.1 Authentication
    1.3.2 Code Analysis
  1.4 Thesis contribution
  1.5 Thesis outline

2 Extended Proof-Carrying Code
  2.1 PCC-based Approaches
    2.1.1 PCC: Characteristics, and Obstacles
    2.1.2 Foundational PCC
    2.1.3 Oracle-Based PCC
  2.2 Extended Proof-Carrying Code framework
    2.2.1 Sending a Proof generator
    2.2.2 Kolmogorov complexity
    2.2.3 Extended Proof-Carrying Code framework
  2.3 EPCC Applications

3 The VEP Virtual Machine
  3.1 Machine Design
    3.1.1 Captured goals and requirements
    3.1.2 Machine Type
    3.1.3 Instruction Set Architecture
  3.2 Memory Management
  3.3 Security Requirements
  3.4 Security Enforcement
    3.4.1 Initial security enforcement
    3.4.2 Global security enforcement
    3.4.3 Instruction-wise security enforcement
    3.4.4 Example
  3.5 The VEP Versus Other VMs

4 Sample Use of EPCC
  4.1 Assembler
    4.1.1 Statements
    4.1.2 Label field
    4.1.3 Operation field
    4.1.4 Operand field
    4.1.5 Assembler process
  4.2 Compiler
    4.2.1 Type checking phase
    4.2.2 Code generation phase
  4.3 Proof Generator
  4.4 Overall view

5 Conclusion
  5.1 Future work

A Instruction Set Descriptions
  A.1 Stack Manipulation Instructions
    A.1.1 POP
    A.1.2 POKE
    A.1.3 PEEK
    A.1.4 LOAD1
    A.1.5 LOAD2
    A.1.6 LOAD3
    A.1.7 LOAD4
    A.1.8 PUSH-PC
  A.2 Program Control Instructions
    A.2.1 HALT
    A.2.2 NOP
  A.3 Arithmetical Instructions
    A.3.1 ADD
    A.3.2 SUB
    A.3.3 MUL
    A.3.4 DIV
    A.3.5 MOD
  A.4 Comparison Instructions
    A.4.1 EQU
    A.4.2 NEQ
    A.4.3 LTH
    A.4.4 LEQ
    A.4.5 ISPAIR
  A.5 Bitwise Logical Instructions
    A.5.1 BAND
    A.5.2 BOR
    A.5.3 BNOT
    A.5.4 BSHIFT
  A.6 Heap Manipulation Instructions
    A.6.1 CONS
    A.6.2 CAR
    A.6.3 CDR
  A.7 Jump Instructions
    A.7.1 JUMP
    A.7.2 JMPR
    A.7.3 JMPRT
    A.7.4 JMPRF
  A.8 Compact Instructions
    A.8.1 POKE-i
    A.8.2 PEEK-i
    A.8.3 LOADi

List of Figures

1.1 The TCB of execution monitors
1.2 The TCB in general static analysis framework
1.3 The TCB in simplified PCC framework

2.1 Safe codes in PCC point of view
2.2 Traditional Proof-Carrying Code framework
2.3 Trusted computing base in PCC
2.4 Verification condition generation
2.5 Foundational proof-carrying code framework
2.6 Trusted computing base in FPCC
2.7 Oracle-based proof-carrying code framework
2.8 Trusted computing base in OPCC
2.9 The framework of the generic Extended Proof-carrying code
2.10 Consumer side in PCC versus its extended version
2.11 Consumer side in FPCC versus its extended version
2.12 Consumer side in OPCC versus its extended version

3.1 Virtual Machine
3.2 Data types in the VEP
3.3 Schemata of the stack, the heap, and the code space in the VEP
3.4 Instruction distribution approximation
3.5 Initial instruction set
3.6 Opcodes of the initial instruction set
3.7 Example of PEEK-i
3.8 Example of POKE-i
3.9 Complete instructions set: instructions and their corresponding opcodes
3.10 Reference counting examples
3.11 The top element of the stack (wrapped_data) is passed to the recursive algorithm (ifpair_reduce) before popping. If the top element is a pair, by popping we are actually releasing a reference to that pair, so the counter field of that pair should be decreased by one. Furthermore, if the reference count falls to zero, the reference counts of children objects should be decremented before the object is reclaimed
3.12 An example of an intertwined code
3.13 Instruction-wise security enforcement
3.14 Example codes
3.15 Stack schemata of the example
3.16 The TCB sizes of various JVMs (in thousands of lines of code): Kaffe is an open-source non-optimizing Java JIT, BulletTrain is a highly optimizing Java compiler.

4.1 Extending the PCC framework
4.2 The arithmetic operators in the VEP assembly language
4.3 Example of expression evaluation during assembly time
4.4 Syntactical rules for expressions
4.5 Examples of operand size change
4.6 The integral promotion rules
4.7 A sample C code which calculates the 13th Fibonacci number
4.8 Generated assembly code for the C code presented in Figure 4.7
4.9 The components of the Proof generator Builder

Introduction

1.1 Software Security

The rapid growth of the Internet and networks goes along with a rising need for security for users. Instant access to a huge number of untrusted and malicious software programs through the Internet on one hand, and a lack of security due to the difficulty and expense of setting it up and the annoyance of living with it on the other hand, make it easier for intruders to carry out their plans.

The latest Internet Security Outlook Report issued by CA Inc. details the extent of the malicious software problem [3]. It argues that malicious software (malware) is becoming more sophisticated, and that the increasing use of obfuscation techniques to hide in plain sight will help criminals conceal their activities. In 2007 alone, malware volumes grew by a factor of 16. It also warns that, since much of the criminal activity is targeted at nations with large populations of Internet users, the problem of malware and software failure is an international issue.

Furthermore, the requirement of fully trusted software for ultra-expensive and vital projects clearly shows the lack of the technological basis needed to address these new computer security needs [1]. Since the methods used to implement security policies are less expensive and more flexible in software than in hardware, security is increasingly becoming a software issue [4].

Recently, there have been some explorations on the domain of programming [...] approaches [5]. In other words, language-based security, as defined by Schneider, is "a set of techniques based on programming language theory and implementation, including semantics, types, optimization, and verification, brought to bear on the security question" [5].

By that definition, the approach proposed by this thesis falls within the framework of language-based security.

1.2 Security design principles

Language-based security is conceptually a combination of two classic computer security principles:

1.2.1 Least privilege

About thirty years ago, Jerry Saltzer and Mike Schroeder described a design principle known as the Principle of Least Privilege [7]. According to this principle, throughout execution, each principal should be given the minimum resources necessary to accomplish its task. In other words, depending on the context, every process, user, or program must be able to access only such information and resources as are necessary to its legitimate purpose.

As a result of this restricted access, the untrusted code will not have the means and rights to perform a system-level operation in order to use a vulnerability in other applications to exploit the whole system. Indeed, least privilege results in a system with better security. Moreover, the restriction on sensitive system operations in this principle leads to better system stability.

As a real-world example of the principle of least privilege, we can refer to the "need to know" restriction rule in the military. Under need-to-know restrictions, even if one has all the necessary official approvals (such as a security clearance) to access certain information, one would not be given access to such information unless one has a specific need to know; that is, access to the information must be necessary for the conduct of [...]

1.2.2 Minimum trusted computing base

The trusted computing base (TCB) of a computer system is the set of components in which the occurrence of bugs might put the security properties of the entire system in danger [9]. Rationally, the smaller the TCB of a system is, the less probable a compromise in security would be. Thus, the best way to assure that a system is secure is to keep its TCB small and simple.

If an attacker can replace or modify any component of the TCB or, simply, if there exists a buggy component in the TCB, the TCB can no longer be trusted. Conceptually, the Minimum trusted computing base principle can be applied to any trust management system, where the trusted party is the TCB and the trusting party is the computing system.

An important property of the principles of Least Privilege and Minimal TCB is their independence from the architecture, computer speed, and other variable dimensions of computer systems. This keeps them sound and sensible over time.

1.3 Current approaches

Our daily life is becoming more and more dependent on computers. We have access to networks and to a huge number of untrusted software programs through the Internet. Suppose we wish to download and run a program from an unknown or untrusted source. One of our biggest concerns would be the protection of our computers from this untrusted code. The following classes of approaches are currently used to reach this goal.

1.3.1 Authentication

The idea in Authentication is that one accepts only the codes that come equipped with a signature of a trusted producer. Ideally, since this producer is trusted, the code can also be trusted. Microsoft Authenticode is a popular example of a code-signing mechanism to assure users that software they download from the Internet is authentic and has not been tampered with [10]. This class of approaches has several drawbacks.

[...] of Minimum trusted computing base can be applied to any trust management system, where the trusted party is the TCB and the trusting party is the computing system. Hence, considering the trusted companies as the TCB, in order to decide whether a code can be trusted or not, we need to add each and every trusted company to the TCB, which leads to enlargement of the TCB. Trusting a producer does not necessarily mean that the produced code is safe and performs as it claims. As a popular example, we can refer to Microsoft Word 97, which contained a hidden pinball game [11]. Thus, it is clear that we need approaches which can ensure that the untrusted code can cause no harm to the computing system.

1.3.2 Code Analysis

One way to ensure that the untrusted code can cause no harm to the computing system is to understand the code behavior. Code analysis, also known as program analysis, is the process of focusing on specific aspects of a code to gain an understanding of the code behavior. This automatic analysis can be done in different periods of the code lifetime (compile time, execution time).

Dynamic Analysis

Dynamic code analysis techniques observe the execution of a code and perform an appropriate action before the code violates the security policy. The action performed against the security violation can be discontinuing the execution of the code, auditing the code, or simply making a log file. SASI [12], Naccio [13], and Polymer [14] are among the best-known approaches in dynamic analysis.

Execution monitor

The systems that perform the dynamic code analysis are called execution monitors. In dynamic analysis, it is important to create a safe analysis environment, also known as a sandbox. Possible analysis environments are the operating system (Figure 1.1(b)), the code space (Figure 1.1(c)), and a wrapper program (Figure 1.1(d)).

The operating systems may have distinct user rights management, timer interrupts, [...] usual means by which access to services is controlled. It is simply a list of the services available, each with a list of the entities (users, programs, processes) permitted to use the service. Currently, most desktop machines are configured as single-user, so applications have complete access to the machine resources.

When the code space is the analysis environment, some lines of code should be inserted before each memory access and each control transfer to ensure that those accesses are valid.

Wrappers, interpreters, and virtual machines are environments that intercept (and interpret or redirect) the instructions issued by the wrapped code. The wrapped code does not execute directly on the underlying hardware but instead is interpreted by another program. Every wrapped code instruction is thus executed only after it has been checked and found not to be violating the security policy being enforced. In this way, a broad range of security policies can be applied.
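The wrapper style of monitoring described above can be illustrated with a minimal sketch. The toy stack instruction set, the `MAX_STACK` limit, and the `PolicyViolation` name are assumptions made for this example only; they are not the VEP instruction set presented later in the thesis.

```python
class PolicyViolation(Exception):
    """Raised when the wrapped code is about to break the (toy) security policy."""


MAX_STACK = 4  # hypothetical policy: the stack may never hold more than 4 cells


def run_monitored(program):
    """Interpret `program`, checking every instruction before executing it."""
    stack = []
    for op, *args in program:
        # Check phase: performed before the instruction has any effect.
        if op == "push":
            if len(args) != 1 or len(stack) >= MAX_STACK:
                raise PolicyViolation("malformed push or stack overflow")
        elif op in ("add", "div"):
            if len(stack) < 2:
                raise PolicyViolation("stack underflow")
            if op == "div" and stack[-1] == 0:
                raise PolicyViolation("division by zero")
        else:
            raise PolicyViolation(f"unknown instruction {op!r}")
        # Execute phase: only reached when the check succeeded.
        if op == "push":
            stack.append(args[0])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if op == "add" else a // b)
    return stack
```

Calling `run_monitored([("push", 1), ("push", 0), ("div",)])` raises `PolicyViolation` before the division is attempted. This is the fail-stop behavior discussed in this section: the monitor can only halt the wrapped code once a violation is imminent.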

[Figure: four panels (a) to (d) with the labels Code, TCB, OS-based Monitor, Instrumented code, and Wrapper]

Figure 1.1: The TCB of execution monitors

Execution monitors (EMs) are usually easy to implement and they can work with binary codes, which makes them language independent. Despite these strong points, EMs suffer from some drawbacks. The system resources are engaged by the monitor the whole time the code is running, and even if one run of the code was safe, we cannot be sure about the next runs. Thus, the monitor should work every time we run the code, for the whole running time, which results in a big overhead. Another disadvantage of the monitor lies in its usual fail-stop behavior. Fail-stop treatment does not fit well in projects where the price of stopping the program is high (e.g., satellite navigation system [1]). Furthermore, it is of high importance for an approach to be in accordance with the principles of security design. Obviously, execution monitors are in partial [...] (a)), unfortunately, the growth in the size of the TCB is inevitable for all three types of execution monitors regardless of their implementation.

Static Analysis

As we mentioned in the previous section, one of the major drawbacks of the execution monitors lies in their fail-stop behavior against malicious codes. The fail-stop behavior itself is a consequence of the runtime investigation, which leaves the monitor with no way but to stop the code in order to prevent any security violation from happening. The stoppage may then lead to a great loss in time, money, or even other forms of security in systems where the renewal of the untrusted code is hard or even impossible (safety-critical or huge scientific projects in outer space may end up with denial of service due to the stoppage of a piece of code as a result of a simple mistake in the code [2]).

Furthermore, the execution monitor can only confirm the safety of the current trace of the code execution up to the current moment in the trace, whereas for a code to be safe, the set of all possible traces of the code should be proven safe. Therefore, the safety of the code cannot be proven through execution monitoring.
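The difference between observing one trace and covering all traces can be made concrete with a small sketch. The `untrusted` function, the buffer size, and the finite input domain are all hypothetical; the example assumes the inputs can be enumerated, whereas real static analyses reason about the code symbolically.

```python
BUFFER_SIZE = 8  # hypothetical policy: written indices must stay below 8


def untrusted(x):
    """Toy untrusted code: returns the buffer index it would write to."""
    return x * 3 % 10  # yields an index in 0..9


def monitored_run(x):
    """Dynamic check: observes a single trace, the one for input x."""
    return untrusted(x) < BUFFER_SIZE


def all_traces_safe(inputs):
    """Exhaustive check over a finite input domain, i.e. over every trace."""
    return all(untrusted(x) < BUFFER_SIZE for x in inputs)


one_run_ok = monitored_run(1)              # index 3: this run is safe
every_run_ok = all_traces_safe(range(10))  # input 3 gives index 9: unsafe
```

A monitor that only ever sees input 1 reports no violation, although the code is unsafe for input 3.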

Thus, it would be reasonable to do the code analysis without actually executing the code (as opposed to dynamic analysis). A family of code analysis techniques in which the code is analyzed by tools to produce useful information without actually executing programs built from that code is called static analysis.

Static analysis techniques find out a code's possible behavior prior to its execution, rejecting a code whose set of possible behaviors includes unacceptable behavior. This a priori understanding of the code is the same as proving the code safety. This can be done by providing the computing system at the destination (i.e., the code consumer) with a safety prover which can verify the code safety upon receiving the code. Figure 1.2 shows the TCB for the general static framework, in which the growth in size of the TCB depends on the size of the safety checking system.

[Figure: diagram with the labels TCB, Code, Safety checking system, Execute, and Reject]

Figure 1.2: The TCB in general static analysis framework

Typed assembly language (TAL) [15, 16], Flint [35], and Proof-carrying code (PCC) [17] are among the best-known approaches in static analysis.

Proof-Carrying Code

A safety verification system should be capable of verifying the safety of every untrusted code. Since verifying the safety of the received code is a hard task, it inevitably would need a complex verifier. Complexity is the worst enemy of security; that is, placing a big and potentially buggy program in the TCB compromises the security of the computing system.

In order to ease this situation, Necula and Lee introduced the Proof-Carrying Code (PCC) approach [20, 17, 18, 21]. The main idea behind this approach is to shift the roots of the problem out of the TCB. This is done by breaking the verifier system into two components, a complex Safety Prover and a simple Proof checker, and placing the Safety Prover on the producer side and the Proof checker on the consumer side (i.e., in the TCB).

In this way, the burden of proving the safety of the untrusted code is shifted onto the shoulders of the producer. Therefore, in this framework, the producer first proves the safety of the code, then attaches the safety proof to the code and sends it to the consumer. On the consumer side and before executing the code, roughly, the proof checker checks the code against the accompanying proof. Upon success, the code can be executed.
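The producer/consumer exchange just described can be sketched with a deliberately simple stand-in for logical proofs. In this sketch, which is an assumption of this example and not the PCC logic, the safety policy is "the stack never underflows", the code is a straight-line list of toy instructions, and the "proof" is the claimed stack depth before each instruction.

```python
# op: (stack depth required before the instruction, change in depth)
EFFECT = {"push": (0, +1), "pop": (1, -1), "add": (2, -1)}


def prove(code):
    """Producer side: build the 'proof', or return None if the code is unsafe."""
    depths, depth = [], 0
    for op in code:
        need, delta = EFFECT[op]
        if depth < need:
            return None  # the code would underflow: no proof exists
        depths.append(depth)
        depth += delta
    return depths


def check(code, proof):
    """Consumer side (in the TCB): accept only if every proof step is valid."""
    if proof is None or len(proof) != len(code):
        return False
    expected = 0
    for op, claimed in zip(code, proof):
        if op not in EFFECT or claimed != expected or claimed < EFFECT[op][0]:
            return False
        expected += EFFECT[op][1]
    return True


code = ["push", "push", "add", "pop"]
proof = prove(code)                 # [0, 1, 2, 1]
accepted = check(code, proof)       # True: the code may be executed
tampered = check(["push", "add", "add", "pop"], proof)  # False: rejected
```

In this toy setting the checker could recompute the depths itself; in PCC proper, finding a proof is far harder than checking one, which is why only the checker needs to sit on the consumer side.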

As shown in Figure 1.3, PCC has a relatively small TCB. Moreover, since PCC does the static analysis, once the safety of an untrusted code is checked successfully before the first run, there is no need to check the code before the next runs. As a result, we will have a computing system with less overhead and more security [22, 23].

These benefits nominate PCC as one of the strongest frameworks to be used in mobile-code security. Unfortunately, this approach suffers from some shortcomings. Apart from the difficulty of building or generating the proofs for the code, one of the crucial obstacles for the practical applicability of the Proof-Carrying Code technique is the size of the proofs that must accompany the code. To be precise, the difficulty of communicating the proofs, which are inherently large, makes the PCC [...]

[Figure: diagram with a Producer side (Code, Prover, Proof) and a Consumer side (Code, Proof, Proof checker inside the TCB, Execute, Reject)]

Figure 1.3: The TCB in simplified PCC framework

1.4 Thesis contribution

As discussed above, the presence of untrusted and malicious codes and the absence of the necessary security bases and frameworks for safe use of mobile codes are among the main concerns of our computer-dependent world. The problem with the current approaches is either their failure to obey the security principles or their hard applicability.

The trustworthiness of proof-carrying code is an important advantage over approaches that involve the use of complex security systems on the consumer side. However, this approach has scaling issues because the size of the safety proofs increases quickly. The approaches proposed to alleviate this suffer from drawbacks of their own, especially the enlargement of the Trusted Computing Base, in which any bug may cause an unsafe program to be accepted.

In contrast to these developments, we have worked on a hybrid approach with some modifications at the framework level, which essentially solves PCC's scalability issue. In our approach, instead of transmitting the proofs, a proof generator for the code in question is sent. The new modified extended framework enables the execution of the proof generator on the consumer side in a secure manner. In short, this thesis makes the following contributions to the field of mobile code security and verification.

• We show the design of a generic extended framework for proof-carrying code (EPCC). We show that the designed framework is tamperproof and can resolve some key issues of the previous frameworks.

• We present the design of a safe and small virtual machine (VEP) introduced in the EPCC framework. For this, we implemented the VEP to work as an online [...]

• We show empirically that the EPCC and its conjoint virtual machine (the VEP) can be used in an industrial-strength framework. To show this, we implemented the necessary programs to complete the end-to-end chain. For this, we implemented an assembler for the VEP bytecode. We also implemented a C compiler to target the assembler language. Finally, in order to have a runnable EPCC framework, we made use of GUNzip to build a sample proof generator.
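The idea of shipping a compact generator instead of the proof itself can be illustrated with a decompressor, in the spirit of the GUNzip-based sample generator mentioned above. The sketch below is only an analogy under stated assumptions: it uses Python's standard `zlib` module rather than a program running on the VEP, the proof text is made up, and the `max_size` limit is a hypothetical safeguard.

```python
import zlib


def make_generator(proof: bytes) -> bytes:
    """Producer side: a compact form from which the proof can be regenerated."""
    return zlib.compress(proof, 9)


def run_generator(generator: bytes, max_size: int = 1 << 20) -> bytes:
    """Consumer side: regenerate the proof, refusing oversized or cut-off output."""
    decompressor = zlib.decompressobj()
    proof = decompressor.decompress(generator, max_size)
    if decompressor.unconsumed_tail or not decompressor.eof:
        raise ValueError("proof exceeds the size limit or the stream is truncated")
    return proof


# Made-up, highly repetitive proof text standing in for a real safety proof.
proof = b"(and (safe r0) (safe r1))\n" * 1000
generator = make_generator(proof)
regenerated = run_generator(generator)
```

Here `generator` is much shorter than `proof`, and the regenerated proof would still have to pass the ordinary proof check before the code is executed, so the decompressor itself does not need to be trusted for safety.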

1.5 Thesis outline

The remainder of the thesis is organized as follows. In Chapter 2, we present our generic Extended Proof-Carrying Code (EPCC) framework, take a glance over related works, and assess each approach's merits and drawbacks. Chapter 3 discusses the design of the Virtual machine for Extended PCC (VEP). It describes the trade-offs we had to make when designing the VEP and discusses the way in which the VEP works. In Chapter 4, we make the whole system work by bridging from theory to practice. In this chapter, we present a sample implementation of an EPCC framework. Finally, we summarize [...]

Extended Proof-Carrying Code

In this chapter, we present our generic Extended Proof-Carrying Code framework. First we introduce and motivate Proof-Carrying Code (PCC) along with mentioning its limitations. Then, we take a glance over related works and assess each approach's merits and drawbacks from our perspective. Finally, we propose our new approach.

2.1 PCC-based Approaches

In PCC-based approaches, an untrusted code producer must convince a code consumer of the safety of his code by sending a proof. In this section, we peruse Proof-Carrying Code and its limitations together with its similar approaches. It is worth mentioning that we visualize the approaches using data flow diagrams. Rectangles denote components that take data, represented in ovals, as input or output such data. Arrows indicate how the data flows, and the policies are shown by vertical scrolls.

2.1.1 PCC: Characteristics, and Obstacles

Characteristics

Proof-Carrying Code [20, 17, 18, 19, 21] refers to a static analysis approach in which [...]

Since the verification is done before the execution of the code, like other static analysis approaches, PCC enforces the security conservatively. This conservative behavior is due to the lack of access to useful runtime information (like inputs and variable values). Thus, using PCC can lead to some false negatives. Figure 2.1 shows how PCC takes care of received codes. The darker region shows the codes which are accepted as safe codes by PCC, while the white section between the origin axis and the darker region represents the safe codes which are not accepted by PCC, referred to as false negatives.

[Figure: regions labeled Malicious codes and Safe codes]

Figure 2.1: Safe codes in PCC point of view

The false negatives are the price that PCC pays to gain assured security. This clearly shows the importance of the certainty of security as one of the raisons d'être of the PCC approach.

To find out other important aspects of the PCC approach, we briefly talk about the original framework of PCC.

In a PCC system, there are typically two main parties: (1) a code producer, who builds machine code along with its safety proof, and (2) a code consumer, who wishes to run the compiled code as long as it satisfies the safety policy. In reality, the communication between these two parties is more complicated and consists of a multi-step interaction between the producer and the consumer. In the first step, the producer sends a program consisting of the code and additional annotations. These annotations consist of loop invariants and function pre- and post-conditions and provide more information about the code. This additional information makes the following step of the procedure easier.

In the next step, the consumer applies the Verification Condition Generator (VCGen) to the received annotated code, according to his/her particular safety policy, to generate a verification condition. A verification condition is a logical formula that, if satisfied, implies that the code satisfies the safety policy [18]. Here, additional annotations can be used to make the Verification Condition Generator's job easier. Generating [...] the producer.

[Figure: data flow diagram with a Producer side (Program, Code, VC Generator, Verification Condition, Theorem Prover, Proof of VC) and a Consumer side (Code, Proof, VC Generator, Verification Condition, Proof Checker, Safety Policies, CPU)]

Figure 2.2: Traditional Proof-Carrying Code framework

The producer runs a theorem prover (in many cases along with necessary human interventions) to get a proof of the received verification condition. The theorem prover uses the axioms defined as part of the consumer's safety policy. Thus, a proof of the verification condition constitutes a proof of safety of the code with respect to the consumer's safety policy.

In general, proving the verification condition is resource-consuming, which can result in low performance. Furthermore, considering that the theorem prover is a complex program, it could not be placed on the consumer side. Therefore, in PCC the theorem prover is on the producer side. Thus, it is the beauty of this technique that the difficult and complex job is not done by the code consumer during any step of PCC.

In the next step, the producer submits the proof to the consumer. The code consumer checks the proof before executing the code submitted by the producer; that is, the consumer runs the proof checker to verify that the proof is indeed a valid proof of the verification condition constructed in the previous phase. If the proof check succeeds, the consumer can then execute the code safely. Once the safety of an untrusted code is [...]

As shown in Figure 2.2, it is possible to simplify the dialogue between the code producer and the code consumer, assuming that they share the same VCGen (the code producer has a copy of the VCGen). In this way, the code consumer receives the annotated code attached to its proof and runs the verification condition generator. Then, he checks the proof against the verification condition and, if the check succeeds, he can execute the code safely.

One of the most important properties of the PCC framework is that PCC programs are tamper-proof. That is, an intruder cannot modify the code or the proof in a way that results in the execution of a malicious code on the consumer side. Any attempt to tamper with either the code or the proof results in a validation error at proof checking. In the few cases when the code or the proof is modified such that the validation still succeeds, the new code is also safe.

TCB Components

As we mentioned earlier, PCC intends to have a small trusted computing base by shifting the hard task of proving the safety of the code to the producer side. Figure 2.3 shows the main components of the TCB in the original PCC framework. In practice, the trusted computing base in the PCC framework is composed of the following.

[Figure: computing system containing the VC Generator, Proof Checker, Safety Policies, and Hardware]

Figure 2.3: Trusted computing base in PCC

•

Hardware: wesupposethatthea tualma hinehardwarewilloperateasexpe ted.

•

Logi alstru tures:

Logi : The logi is used to express the safety poli ies and proof in a formal

way. Proof of the soundness of the logi is done by hand, so a small and

simple logi would be more onvenient.

Safety Poli y: The poli ies whi h are expressed in the syntax of the logi

(24)

∗

Proof rules: The rules whi h are ne essary for proving the veri ation ondition. Two sample proofrules are represented in Figure2.4(a).

∗

De oder: The de oder interprets the semanti s ofanindividual instru -tion and results in the lo alsafety ondition for exe utingthat

instru -tion and aset ofpossible ma hine statesresultingfromthe exe utionof

the instru tion.

•

Proof he ker: An implementation of the PCC logi whi h makes it possible to express statements in the syntax of the logi and me hani ally he k proofs, is

alled Proof- he ker. The proof- he kers are usually simple programs and the

proof he king is a straight-forward task. Assuming logi has all the desirable

properties, we must trust that the proof he ker implementationis orre t.

• Verification Condition generator: The VCGen is responsible for handling the control-flow aspects of the code and produces a verification condition whose validity entails the safety of the code. As shown in Figure 2.4(c), the VCGen computes the verification condition of the entire program by combining all of the local safe-progress conditions (safety conditions of each instruction) identified by the decoder. For this, the weakest pre-condition of the program is computed, starting from the post-condition and working backwards. Figure 2.4(d) shows the pre-condition and the post-condition of the sample source code presented in Figure 2.1(b). Figure 2.4(e) shows a sample definition of the VCGen (we refer interested readers to [19] for detailed descriptions of the definitions).
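The backward computation described for the VCGen can be illustrated with a minimal sketch. It is not the VCGen of [19]: it assumes a hypothetical straight-line program of assignments, each carrying a local safety condition (the role played by the decoder), and it represents predicates as Python functions over a machine state rather than as logical formulas. A real VCGen builds a formula that is later proved, whereas this sketch only evaluates the condition on concrete states. All names (`wp`, `Instr`, `A`) are illustrative assumptions.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]
Pred = Callable[[State], bool]

# An instruction is (target register, expression, local safety condition).
Instr = Tuple[str, Callable[[State], int], Pred]


def wp(program: List[Instr], post: Pred) -> Pred:
    """Weakest pre-condition of a straight-line program, computed backwards."""
    cond = post
    for target, expr, safe in reversed(program):
        # Default arguments freeze the current instruction and the condition
        # that must hold after it.
        def step(s: State, target=target, expr=expr, safe=safe, after=cond) -> bool:
            # The instruction must be locally safe, and the state it produces
            # must satisfy the condition computed so far.
            return safe(s) and after({**s, target: expr(s)})
        cond = step
    return cond


# Hypothetical program:  i := i + 1 ;  x := A[i]
A = [10, 20, 30, 40]
program: List[Instr] = [
    ("i", lambda s: s["i"] + 1, lambda s: True),
    ("x", lambda s: A[s["i"]], lambda s: 0 <= s["i"] < len(A)),  # read must be in bounds
]

pre = wp(program, post=lambda s: True)
print(pre({"i": 2}), pre({"i": 3}))
```

Here the pre-condition holds for an initial `i` of 2 (the read is `A[3]`) and fails for an initial `i` of 3 (the read would be out of bounds), which mirrors how the local safety conditions are combined into a condition on the program entry.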

Obstacles

The total size of the mentioned components constituting the TCB in proof-carrying code is approximately 15000 to 20000 lines of code. Any bug in these components can compromise the security of the whole system. It is often possible to use the elusive standard of residual defect density as a metric for faultiness to measure the number of faults that remain in a software code at the delivery point. A typical target in software development is to achieve a residual defect density of less than one error per one thousand lines of non-comment source code (KLOC) [37, 38]. However, leading edge software development organizations typically achieve a defect density of about 2 defects/KLOC [39].

Figure 2.4: Verification condition generation. Panels: (a) sample proof rules; (b) sample source code; (c) the verification condition for the entire program; (d) the pre-condition and the post-condition of the sample code; (e) sample verification condition generator definition. The figures are taken from "Proof Carrying Code" by G. Necula (POPL'97).

about the TCB grows along with its number of lines. Therefore, to have a safe and implementable PCC framework, one of the obstacles ahead is its relatively large TCB.

Besides the problem of the large TCB, there is also the matter of proof size. In principle, the proofs can be exponentially large, as mentioned by Peter Lee: "...as a general matter, the size of the binaries is an issue that must be addressed carefully" [23]. Thus, one of the crucial obstacles for the practical applicability of Proof-Carrying Code and related techniques is the size of the proofs that must accompany the code. It is important to have a compact representation of proofs because they are possibly sent through communication networks. In the traditional PCC framework, it was not unusual to see proofs that were 1000 times larger than the associated code which made the use

enough flexibility. That is, the producer is constrained to submit a proof in a logic which has been imposed by the consumer. Thus, even if the producer finds it possible to build a simpler proof in a higher-order logic, he is forced to build the proof in the consumer's logic, which might result in an overweight proof.

To sum up, the PCC technology faces the following obstacles:
1. proofs are large (practicality);
2. the TCB is relatively large (security);
3. the producer must do the hard work with inadequate means (flexibility).

Any solution to combat these obstacles and to refine the PCC technology should respect the following fundamental characteristics of the PCC approach:
1. give the highest priority to security (raison d'être);
2. intend to have a small TCB;
3. leave the easier tasks to the consumer;
4. cannot be tampered with.

In the following sub-sections we summarize two other PCC-based approaches proposed to combat the mentioned obstacles. We try to make the different approaches comparable by highlighting two aspects: the trusted computing base size and the size of the safety proofs in those systems.

2.1.2 Foundational PCC

In order to relieve the second drawback discussed in the foregoing sub-section, Appel introduced the notion of foundational proof-carrying code (FPCC) [28].

Although the TCB components in the traditional PCC framework are simple, Appel

To explain more clearly, if there exists a bug in either the VCGen or the typing rules then the TCB becomes vulnerable. As a matter of fact, a proof-carrying code certifying compiler for Java named Special-J [24] happened to have errors in its typing rules, discovered by League et al. [30]. Needless to say, this bug affects the overall safety of the PCC system.

Figure 2.5: Foundational proof-carrying code framework (producer: program code, theorem prover; transmitted: code, proof; consumer: proof checker, CPU)

Foundational Proof-Carrying Code aims to further reduce the TCB size by an order of magnitude. That is, emphasizing a minimal TCB, they removed the VCGen and the safety policy from the consumer side, as shown in Figure 2.5.

Foundational PCC is based on the idea of defining the semantics of the machine instructions and the proof rules using only a foundational mathematical logic. In this way, Appel et al. avoid using the VCGen by defining the operational semantics of machine instructions and the safety policies in a higher-order logic. This is done by modeling the machine instruction with a transition from one machine state (set of memory and registers) to another machine state. Then, the safety policy is defined in the following way: A given state is safe if, for any state reachable by an arbitrary sequence of legal instructions, there is a safe successor state. Hence, a code is safe if we get a safe state. For this, type checking rules are proved as lemmas. Then, the proofs are constructed and checked once per system, and rules are used many times to check programs. In this way, a safety proof is the application (derivation) of type checking rules and it shares proofs of common lemmas [29]. FPCC uses a higher-order logic with few axioms of arithmetic, from which it is possible to build up most of modern

There are three main components in a Foundational PCC system: a theorem prover, a proof checker, and a safety proof of the code. The theorem prover should produce a proof of safety to accompany the code. The proof checker verifies the safety proof before the program gets executed.

Safety Proof Size

The proofs in Foundational PCC, in comparison with traditional PCC, are more complicated to produce and, as Appel himself stated, can explode exponentially. Therefore, the proof size, which is a crucial obstacle for the practical applicability of Proof-Carrying Code and related techniques, remains unsolved. According to Necula, the proof size in Foundational PCC is 20% bigger than the proof size in traditional PCC. This makes the proof communication harder and the use of Foundational PCC even less practical than traditional PCC [27].

Trusted Computing Base

Foundational PCC is concerned with minimizing the trusted computing base of the system, which does not include the VCGen, as shown in Figure 2.6. FPCC, in principle, is strictly more secure than traditional PCC because it has a smaller trusted computing base. With this technique, the Verification Condition Generator is removed from the TCB, and the TCB becomes minimal.

Figure 2.6: Trusted computing base in FPCC (components: proof checker, hardware of the computing system)

2.1.3 Oracle-Based PCC

One of the main impediments to scalability in traditional PCC is that the proofs can be very large. In order to alleviate this problem, Necula proposed a new strategy called

in Figure 2.7, this change in strategy led to a change in the framework, namely, they assumed that the consumer uses a non-deterministic proof checker.

Figure 2.7: Oracle-based proof-carrying code framework (producer: program code, VC generator, verification condition, theorem prover, proof witness; transmitted: code, proof witness; consumer: VC generator, verification condition, non-deterministic proof checker, safety policies, CPU)

In order to make use of the new non-deterministic proof checker, they replaced the proof by an oracle string which guides the non-deterministic checker. Every time the checker must make a choice between the possible ways to proceed, it consults some bits from the oracle.

To be more precise, the untrusted theorem prover on the left-hand side records a sequence of bits that shows which sub-goals failed and needed backtracking. Then, the producer sends this bit-stream to the consumer. On the consumer side, the received bit-stream works as an oracle which can be used by the trusted non-deterministic proof checker to avoid backtracking. It goes without saying that the oracle, like proofs in PCC, need not be trusted. That is, if the oracle is wrong, then the trusted checker will go wrong and will fail to find the proof.
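The oracle mechanism can be sketched on a toy example. The following is an illustration under stated assumptions, not Necula's implementation: the rule base `RULES`, the goal names, and the fixed-width bit encoding of choices are all hypothetical. The producer performs a backtracking search and records which alternative finally succeeded at each choice point; the consumer replays those choices without any search and simply fails if the oracle is wrong.

```python
from math import ceil, log2
from typing import Dict, List, Optional

# Hypothetical rule base: each goal has alternative bodies (lists of sub-goals).
RULES: Dict[str, List[List[str]]] = {
    "safe": [["typed", "aligned"], ["typed", "bounded"]],
    "typed": [[]],             # axiom
    "bounded": [[]],           # axiom
    "aligned": [["missing"]],  # dead end: "missing" has no rule
}


def bits_for(n: int) -> int:
    """Number of oracle bits needed to select one of n alternatives."""
    return ceil(log2(n)) if n > 1 else 0


def prove_with_search(goal: str) -> Optional[str]:
    """Untrusted producer: backtracking search that returns the oracle bits."""
    alts = RULES.get(goal, [])
    width = bits_for(len(alts))
    for index, body in enumerate(alts):
        bits = format(index, f"0{width}b") if width else ""
        for sub in body:
            sub_bits = prove_with_search(sub)
            if sub_bits is None:
                break              # this alternative failed: backtrack
            bits += sub_bits
        else:
            return bits            # every sub-goal was proved
    return None


def check_with_oracle(goal: str, oracle: str) -> bool:
    """Trusted consumer: no search, each choice is resolved by oracle bits."""
    pos = 0

    def prove(g: str) -> bool:
        nonlocal pos
        alts = RULES.get(g, [])
        if not alts:
            return False
        width = bits_for(len(alts))
        chunk = oracle[pos:pos + width]
        if len(chunk) < width:
            return False           # oracle exhausted
        pos += width
        index = int(chunk, 2) if width else 0
        if index >= len(alts):
            return False
        return all(prove(sub) for sub in alts[index])

    return prove(goal)


oracle = prove_with_search("safe")
print(repr(oracle), check_with_oracle("safe", oracle))
```

In this toy setting a single bit replaces the whole search: the first alternative for `safe` leads to a dead end, so the producer records the second one, and a consumer given the wrong bit rejects the goal rather than accepting it.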

In this approach, the trusted non-deterministic proof checker is, in fact, a non-deterministic theorem prover. This theorem prover is given the task of proving the verification condition. Whenever the prover has to pick from $n$ choices, it reads some bits from an oracle string to resolve that choice. As a result, the oracle is used to drive the theorem prover to a final proof without search, and as such, the oracle string can

Safety Proof Size

Oracle-based proof-carrying code is efficient. Experimental evidence shows that oracle strings, as suggested by Necula, can be about 1/8 of the code size and about 30 times smaller than proofs in traditional PCC [27]. However, Wu [31] found the code size relation deceptive: Unfortunately, this statistic is somewhat misleading. [...] a machine language program and a proof witness. The SpecialJ proof-carrying Java system on which Necula measured oracle-based checking transmits three components: the machine code, the proof, and a Java class file. The Java class file, as is usual in any Java system, contains descriptions of the types of all procedures (methods) in the program (untrusted code), including formal parameter and result types. However, the 1/8 size figure does not include the Java class files.

While the small size and low cost of checking an oracle string are appealing, a potential problem with them is that there are no currently known ways to manipulate or compose them. Thus, oracle strings for subprograms might be hard to use directly when trying to find certificates for larger programs (oracle strings are based on guiding the search for cut-free proofs). They are also fragile in the sense that small changes in the formula to be proved or in the version of the theorem prover can invalidate an oracle string.

Trusted Computing Base

The downside of Oracle-based PCC is that, as shown in Figure 2.8, it involves complex trusted components, such as a type system with axiomatic rules for memory safety, the VCGen, and the non-deterministic proof checker. Any flaw in the implementation of these components can compromise the safety of the system.

Figure 2.8: Trusted computing base in OPCC (components: VC generator, non-deterministic proof checker, safety policies, hardware of the computing system)

and as the second principle of security design suggests, any bug in the TCB may cause an unsafe program to be accepted. For example, the Special-J system showed a critical leak in its type axioms [30]. Unfortunately, the big size of the TCB in OPCC goes against the first and the third characteristics of a PCC approach, as mentioned in Section 2.1.1.

2.2 Extended Proof-Carrying Code framework

In this section we study the Extended Proof-Carrying Code framework. First, we explain the idea behind our proposed approach. Then, we present the framework and discuss its properties.

2.2.1 Sending a Proof generator

As we mentioned earlier, one of the crucial issues for the practical applicability of Proof-Carrying Code and its related techniques is the size of the proofs that must accompany the code. Therefore, it is desirable that proofs be represented in a compact format. One way to reach this goal is proof optimization, in which the proofs are built in a more compact form and can be interpreted as proofs of the original form [32, 33]. The best proof optimization approaches result in proofs which are 15-30 times smaller than the original proof and pay the price of the enlargement of the TCB [27, 31, 26]. We are not in favor of compromising the security of the system by a big TCB expansion simply because the proofs are too large.

Another way of compacting the proofs is through data compression. Data compression techniques try to find more compact representations for data, from which the original data can be reconstructed exactly. Many such algorithms compress data by searching for more efficient encodings that take advantage of repetition in the data. These techniques are not well exploited in the PCC framework for the following reasons. The consumer of compressed data must first decompress it, which requires a safe decompressor on the consumer side. Generating the proof of safety for a normal decompressor (a relatively big program with about 3000 lines of code) is a difficult task not worth performing, because such a decompressor would be a specific decompressor that cannot work with a proof compressed by an appropriate but

which is appropriate for the safety proof of a code.

We present in this thesis an extended framework that allows the PCC proofs to be represented as programs. This helps us not to pay a proof-size price and enables PCC to handle even very large programs. The idea behind the new framework, which we are going to present, is inspired by Kolmogorov complexity. We introduce the notion of Kolmogorov complexity in the following sub-section.

2.2.2 Kolmogorov complexity

Roughly speaking, the Kolmogorov complexity of a string is the length of the shortest computer program that produces that string, i.e., that computes it, prints it, and then halts. One important observation is that this measure of complexity indicates how much a string (or, in the context of proof-carrying code, a proof) can be compressed: the ideal compressed form for a given proof is the shortest program that outputs that proof.

Formally, the Kolmogorov complexity $K_U(x)$ of a string $x$ is defined as the length $\ell$ of the shortest program capable of producing $x$ on a universal computer $U$ such as a Turing machine. This complexity is incomputable.

$$K_U(x) = \min_{p \in \{0,1\}^*} \{\ell(p) : p \text{ on } U \text{ outputs } x\}$$

The definition depends on the specific computer programming language and the universal computer that is used. We define these two components according to our generic extended PCC framework, which we present next.
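Although $K_U(x)$ itself is incomputable, any concrete description of $x$ gives an upper bound on it, up to the constant size of the program that decodes the description. The sketch below uses the zlib-compressed form as one such description. It is only an illustration: the function name `description_length` and the sample strings are assumptions, and zlib is in no way the ideal compressor.

```python
import random
import zlib


def description_length(x: bytes) -> int:
    """Length in bytes of one particular description of x: its zlib-compressed form."""
    return len(zlib.compress(x, 9))


# A highly regular string, loosely resembling a repetitive proof term.
regular = b"(and_i (axiom p) (axiom q))\n" * 1000

# Pseudo-random bytes of the same length: little structure for zlib to exploit.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(len(regular)))

# The description is lossless: the original string is recovered exactly.
assert zlib.decompress(zlib.compress(regular, 9)) == regular

print(len(regular), description_length(regular), description_length(noise))
```

The regular string admits a description far shorter than itself, while the pseudo-random bytes do not shrink under this particular compressor. This is the sense in which the measure indicates how much a proof can be compressed.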

2.2.3 Extended Proof-Carrying Code framework

The idea behind the Extended Proof-Carrying Code (EPCC) is simply to send the proof in the form of a program. In this way, we make it possible for the producer to send a proof generator instead of the proof, where, according to Kolmogorov complexity, the proof generator ideally can be the shortest program which can output the original proof. For this to work, the consumer should be capable of running the proof generator

Proposed generic framework

In order to benefit from the above idea in an organized manner, we propose a generic EPCC framework. A diagram of an EPCC system is given in Figure 2.9. In an EPCC system, there are two main parties: (1) a code producer, on the left-hand side, who sends a code along with its safety proof generator, and (2) a code consumer, on the right-hand side, who wishes to run the code, provided that it is proven safe by the system.

The communication between these two parties may consist of a multi-step interaction between the producer and the consumer, depending on the proof-carrying code framework that they extend. Generally, at the first step, the producer runs a theorem prover to get a safety proof of the code he intends to send. Here, in contrast with other PCC frameworks, the producer is not forced to generate the safety proof in the logic that the consumer imposes.

Figure 2.9: The framework of the generic Extended Proof-carrying code (producer: program code, theorem proving system, proof, proof generator builder; transmitted: code, proof generator; consumer: VEP, proof, proof checking system, CPU)

The producer can use this opportunity to build the proof in a logic (e.g., a higher-order logic) that results in a smaller proof. In other words, the producer has the possibility of reducing the size of the safety proof by using a custom logic which can be later converted (translated) to the logic set by the consumer.

Then, the producer writes a proof generator. In accordance with Kolmogorov complexity, this proof generator can, in principle, be the shortest program which can output the safety proof in the format which is acceptable to the consumer. That is to say, the generic EPCC framework provides the producer with the opportunity of

In the next step, the producer submits the code accompanied by its safety proof generator to the consumer. The consumer is required to check the proof before executing the code submitted by the producer. Therefore, he runs the safety proof generator on the Virtual machine of EPCC (VEP) and obtains the safety proof. Then he runs the proof checker. After the proof check succeeds, the consumer can repeatedly execute the code safely. As one can easily observe, the EPCC framework, like PCC, is tamper-proof.
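The consumer-side steps above can be summarized in a short sketch. The names `accept_code`, `run_on_vep`, and `toy_checker` are hypothetical stand-ins introduced for illustration only; in particular, the toy checker does not reflect how a real proof checker validates a proof against a verification condition.

```python
from typing import Callable, Optional


def accept_code(code: bytes,
                run_on_vep: Callable[[], Optional[bytes]],
                check_proof: Callable[[bytes, bytes], bool]) -> bool:
    """Return True only if the generated proof validates the code."""
    # Step 1: run the untrusted proof generator inside the VEP.
    proof = run_on_vep()
    if proof is None:
        # The generator failed or was stopped by the VEP.
        return False
    # Step 2: the trusted proof checker decides.
    return check_proof(code, proof)


# Toy stand-in for a proof checker, for illustration only.
def toy_checker(code: bytes, proof: bytes) -> bool:
    return proof == b"safe:" + code


code = b"\x01\x02\x03"
print(accept_code(code, lambda: b"safe:" + code, toy_checker))
print(accept_code(code, lambda: b"tampered", toy_checker))
```

The point of the sketch is that the generator's output is never trusted: a tampered or failing generator can only cause rejection, because acceptance always goes through the proof checker.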

One of the crucial components in the EPCC framework is the VEP, which is a part of the trusted computing base of EPCC. Safe execution of the proof generator depends on the safety of the VEP and the way it imposes the security requirements. Here, we mention some important aspects of the VEP, and later, in Chapter 3, we study the design of the VEP thoroughly. In the following, we discuss the ways in which the VEP provides us with the necessary basis for applying the Kolmogorov complexity idea and enables the execution of the proof generator at the consumer side in a secure manner.

The VEP: A Universal Computer

A universal computer is a computer which is capable of universal computation. That is, given a description of any other computer or program and some data, a universal computer can perfectly emulate this second computer or program [34]. The best-known contender for the title of universal computer is the Turing machine. A Turing machine is a computing machine which has a number $n$ of one-way infinite tapes, divided into cells, one next to the other. The cells of the tapes can be blank or contain a symbol from some finite alphabet. The first of the tapes is known as the input tape, on which a string of symbols is written, and the last of the tapes is known as the output tape, where the result of the Turing machine for that input is written. The other $n - 2$ tapes can be thought of as auxiliary tapes. On each tape the Turing machine has what is called a head. At any one time, a head sits on a particular cell and can read the symbol which is written on that cell, write a symbol onto that cell, and move to the left or to the right or stay put (in some models the tape moves and the head is stationary). It is worth mentioning that a Turing machine can equivalently process a single infinite tape.

The current implementation of the VEP is a stack-based machine which is equivalent in computing power to a Turing machine. The VEP reads the code and performs actions on its Stack and Heap. Here, the code space can be regarded as a read-only tape and

Knowing that the VEP has finite resources raises the question of whether it can be considered a universal computer destination for the proof generator according to Kolmogorov complexity. The answer is yes, because in a finite amount of time, a universal computer can only manipulate a finite amount of data, which fits in finite resources. In this way, the VEP can be considered as a universal computer destination for the proof generator.

The VEP: An Execution Monitor

The proof generators in the EPCC framework are untrusted programs which have to be executed on the consumer side. Since running untrusted programs on the consumer side is against the raison d'être of the PCC approach and can compromise the security of the system, we need a security mechanism for running the proof generator safely. For this to happen, the VEP should provide a tightly-controlled set of resources for proof generators to run in. Network access, the ability to inspect the host system, or reading from input devices and writing into file streams should be disallowed. In this sense, the VEP ought to be an execution monitor.

As we mentioned in Section 1.3.2, two main drawbacks of execution monitors are their high overhead and their fail-stop manner of handling unsafe code. Here, we discuss whether each of these issues arises for the VEP.

As for the overhead, in execution monitoring, the system resources are engaged by the monitor the whole time the code is running, and even if one run of the code was safe, we cannot be sure about the next runs. Thus, the overhead is the result of the system resources being engaged by the monitor for each and every period of the code execution.

Overheads are usually quantifiable costs of some kind. If we denote the cost as $C$ and the total cost of a monitor as $C_T$, we can express the total cost $C_T$ of an execution monitor $EM$ as:

$$C_T(EM) \approx \sum_{n=1}^{m} \bigl(t_{EM} \cdot \operatorname{avg}(C(EM))\bigr)$$

where $n$ indexes the runs of the execution monitor $EM$, $t_{EM}$ is the execution time period of $EM$, $m$ is the total number of runs, and $\operatorname{avg}(C(EM))$ is the average cost of running the monitor $EM$ per CPU cycle, which can be defined as:

$$\operatorname{avg}(C(EM)) = \frac{\sum_{t=1}^{t'} C_t^{EM}}{t'}$$

where $C_t^{EM}$ is the cost of running $EM$ during the period $t$. Now we can formulate the problem as follows:

$$\lim_{m \to \infty} C_T(EM) = \infty \tag{2.1}$$

$$\lim_{t_{EM} \to \infty} C_T(EM) = \infty \tag{2.2}$$

As shown in Equations 2.1 and 2.2, an unbounded number of runs and an unbounded execution time of the execution monitor can each place an unbounded cost on the system which uses the execution monitor.

In the case of EPCC, we need the execution monitor VEP to run only a single time, in which the proof generator outputs the proof or fails. Therefore, for Equation 2.1, the number of runs $m$ is bounded to $1$. Now, if we can run the monitor for a limited period of time, we can bound Equation 2.2. For that reason, the VEP runs for a limited number of CPU cycles, which is set in a beforehand agreement between the producer and the consumer, and checked during the execution of the code. Hence, the problem can be bounded as follows:

$$C_T(EM) \le t_{EM} \cdot \max(C(EM))$$

In this way, the VEP can enforce fine-grained memory safety, control-flow safety, and type safety through execution monitoring with an insignificant constant cost.

As for the second drawback of execution monitors, the fail-stop manner is aligned with the safety of the EPCC framework. That is, we need the VEP to act in a fail-stop manner to prevent an unsafe proof generator from continuing its execution. Therefore, not only does the fail-stop manner have no dangerous consequence, but it is also required. Thus, the major drawbacks of using execution monitors are negligible when using the VEP as an execution monitor.
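A bounded, fail-stop monitor of this kind can be sketched with a toy stack machine. The instruction set below (`push`, `add`, `emit`, `jmp`, `halt`) and the function `run` are assumptions made for illustration and are not the actual VEP design, which is presented in Chapter 3. The sketch shows only the two ideas discussed here: a cycle budget agreed beforehand, and immediate termination on any violation.

```python
from typing import List, Optional, Tuple

Instr = Tuple[str, int]  # (opcode, argument); the argument is ignored by some opcodes


def run(program: List[Instr], max_cycles: int, max_stack: int) -> Optional[List[int]]:
    """Run a toy stack program; return its output, or None on any violation."""
    stack: List[int] = []
    output: List[int] = []
    pc = 0
    cycles = 0
    while 0 <= pc < len(program):
        cycles += 1
        if cycles > max_cycles:
            return None                  # cycle budget exhausted: fail-stop
        op, arg = program[pc]
        pc += 1
        if op == "push":
            if len(stack) >= max_stack:
                return None              # stack limit exceeded
            stack.append(arg)
        elif op == "add":
            if len(stack) < 2:
                return None              # stack underflow
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "emit":
            if not stack:
                return None
            output.append(stack.pop())
        elif op == "jmp":
            pc = arg                     # target is range-checked by the loop condition
        elif op == "halt":
            return output
        else:
            return None                  # unknown instruction
    return None                          # control left the code space


well_behaved = [("push", 2), ("push", 3), ("add", 0), ("emit", 0), ("halt", 0)]
endless_loop = [("jmp", 0)]
print(run(well_behaved, max_cycles=100, max_stack=8))
print(run(endless_loop, max_cycles=100, max_stack=8))
```

Because every instruction costs one cycle and the budget is fixed in advance, the total monitoring cost of a run is bounded regardless of what the untrusted program does, and a non-terminating program is simply cut off.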

The VEP: Accordance with Security Design Principles

It is of high importance for an approach to be in accordance with the principles of security design. Obviously, the VEP and other execution monitors are in partial accordance with the least privilege principle, as they are intended to perform such a task.

In addition to this natural accordance of the VEP with the least privilege principle,

an agreement in which the producer and the consumer settle the possible amount of resources that can be used by the proof generator. Among these resources are the heap, the stack, and the code space of the proof generator. Any violation of the agreement by the proof generator leads to the discontinuation of its execution. In this way, the VEP puts the principle of least privilege strictly into practice.

With regard to the second principle of security design, we set a criterion for the size of the TCB. The criterion was to design and build the VEP in a way that the enlargement of the TCB be less than the difference between the size of the TCB in Oracle-based PCC and the size of the TCB in traditional PCC in terms of lines of code. That is, we aimed to implement the VEP such that the security of EPCC be stronger than that of Oracle-based PCC according to the second principle of security design.

The size difference between the two versions of the TCB in traditional PCC and Oracle-based PCC is about 2000-3000 lines of code. Interestingly, the current version of the VEP is less than 300 lines of code, which is much smaller than the standard we set. Since the VEP consists of a small number of lines, it can be verified easily by pen and paper. Furthermore, in principle, the VEP does not need to increase the size of the TCB if it would be possible (without difficulty) to prove it safe in a PCC framework.

2.3 EPCC Applications

In this section, we present some of the possible applications of the EPCC framework. For this, we start by studying the benefits of employing EPCC on the traditional PCC framework (i.e., extending the traditional PCC framework in a way that it can accept a proof generator). Then we propose the employment of EPCC for FPCC and OPCC as two other PCC techniques, and their possible benefits. It is important to mention that only the first application, which is an EPCC version of the traditional PCC framework, is implemented as a part of this work (detailed information about the implemented framework is given in Chapter 4), and the two other EPCC frameworks are presented as feasible propositions.

Extending traditional PCC

version of traditional PCC is shown on the right-hand side. The dialogue between the producer and the consumer remains the same as in traditional PCC except for some minor modifications. In the extended version of traditional PCC, instead of accompanying the code with a safety proof, the producer accompanies it with a safety proof generator which he has built and custom-made earlier. On the consumer side, upon reception of the proof generator, the consumer safely executes the proof generator on the VEP and obtains the proof. The generated proof is then given to the proof checker. The proof checker checks the generated proof against the verification condition and, if the checking is successful, the consumer can run the code safely.

Figure 2.10: Consumer side in PCC versus its extended version (PCC consumer: code, proof, VC generator, verification condition, proof checker, safety policies, CPU; EPCC consumer: code, proof generator, VEP, proof, VC generator, verification condition, proof checker, safety policies, CPU)

The safety proofs in PCC are represented in the Edinburgh Logical Framework (LF) [36]. A logical framework is a formal system in which other logics can be readily represented. The typical LF representations of the proofs are large, due to a significant amount of redundancy. Storing these proofs in a format that requires less space than usual (e.g., compressing them) would alleviate the problem of proof size in communications, because it enables devices to transmit or store the same amount of data in fewer bits. Lossless data compression techniques work best on data with repetition in its representation. Therefore, the fact that proofs contain many repeated patterns of proof rules and redundant arguments makes them suitable for lossless data compression. To gain a better compression, the data compression algorithm can be custom-made in keeping with the content of the proof which is going to be sent.
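The effect of redundancy on lossless compression can be illustrated with a small sketch. The proof text below is a made-up, LF-like string with repeated rule applications, not an actual PCC proof, and `proof_generator` is a hypothetical stand-in that packages a compressed proof with its decompression step. The ratio it prints says nothing about the results reported in this thesis.

```python
import zlib

# Hypothetical LF-like proof text with the kind of repetition described above.
proof = "".join(
    f"(and_i (mem_safe r{i % 4}) (type_ok r{i % 4} int))\n" for i in range(2000)
).encode()

compressed = zlib.compress(proof, 9)


def proof_generator() -> bytes:
    """Stand-in for a proof generator: regenerates the proof exactly."""
    return zlib.decompress(compressed)


ratio = len(compressed) / len(proof)
print(len(proof), len(compressed), round(ratio, 4))
```

Since the decompression is lossless, the consumer obtains exactly the proof the producer built, so the proof checker's verdict is unaffected by the choice of compression.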

In our experiment using the new strategy of EPCC with an off-the-shelf compression technique, the type safety proof generators average 5% of the size of the original proofs, which is about 30 times smaller than before. Interested readers can consult Chapter 4 to obtain more information about the results and the end-to-end implementation of the

possibility of sending a proof generator. This gives a chance to the producer to build a compact and specialized proof generator which can output the same proof on the consumer side. In this way, the proof size issue can be alleviated while the parties are provided with a more flexible framework in which the original logic of the proof generator can be different from that of the generated proof.

Extending Foundational PCC

Figure 2.11 shows the consumer side in the Foundational PCC framework on the left-hand side and its EPCC version on the right-hand side.

Figure 2.11: Consumer side in FPCC versus its extended version (FPCC consumer: code, proof, proof checker, CPU; EPCC consumer: code, proof generator, VEP, proof, proof checker, CPU)

In the FPCC framework, the proofs are bigger than the proofs in traditional PCC, which makes the scalability of FPCC harder. By extending the FPCC framework, the producer can send a proof generator whose size can be a fraction of the original proof size. In this way the crucial obstacles for the practical applicability of FPCC can be alleviated. Since the size of the VEP in extended FPCC is smaller than the size of the VCGen in the traditional PCC framework, in principle, extended FPCC could be immediately more secure than traditional PCC because it has a smaller TCB. The dialogue in extended FPCC is similar to the one in FPCC, except that in extended FPCC the consumer executes the proof generator on the VEP to obtain the proof.

Extending Oracle-based PCC

In Figure 2.12, the consumer side in Oracle-based PCC is shown on the left-hand side.

the right-hand side. On this side, the code is accompanied by a proof generator. The consumer can execute the proof generator on the VEP and obtain the proof. Then the proof checker checks the generated proof against the verification condition and, if the checking is successful, the consumer can run the code safely.

Figure 2.12: Consumer side in OPCC versus its extended version (OPCC consumer: code, proof, VC generator, verification condition, non-deterministic proof checker, safety policies, CPU; EPCC consumer: code, proof generator, VEP, proof, VC generator, verification condition, proof checker, safety policies, CPU)

By extending the Oracle-based PCC framework, we provide the producer with the possibility of sending a proof generator. The proof generator can use the oracle idea to generate the complete proof as the output. Since the VEP is smaller than the size difference between the non-deterministic proof checker and the original PCC proof checker, the TCB size issue can be alleviated while the parties are provided with a more flexible framework in which the original logic of the proof generator can be different from that of the generated proof.

2.4 Overview

The Extended Proof-Carrying Code framework aims to make the PCC idea more scalable and practical by alleviating the proof size issue while respecting the characteristics of the PCC technique.

EPCC provides the code consumer with the luxury of using a safe environment in which a big class of proof generators can be executed in a secure manner, regardless of the original logic in which the proofs were represented. In this way, EPCC leaves the easier tasks to the consumer and gives adequate means to the producer to do the hard task. This major flexibility for the consumer and producer, in addition to the alleviation of the proof size issue, is gained through a minor TCB extension of less

The VEP Virtual Machine

This chapter discusses the design of the Virtual machine for Extended PCC (VEP). It describes the trade-offs we had to make when designing the VEP and discusses the way in which the VEP works.

3.1 Machine Design

In this section, we present the design process of the VEP. A virtual machine is a functional simulation of a computer and its associated devices [41], which is implemented by adding software to an execution platform to give it the appearance of a different platform, which may have an instruction set that differs from that implemented on the underlying real hardware. Figure 3.1 shows the idea of using a virtual machine by the

Figure 3.1: Virtual Machine (layers: host computing system, virtual machine, untrusted code)

connections. On the other hand, a virtual machine, which is the virtualizing software, can translate (completely or partially) the instruction set architecture of the original platform, so that the untrusted code sees a different instruction set architecture from the one supported by the platform. That is, a virtual machine can work as a (partial or complete) emulator which executes programs written for the virtual machine instruction set on a machine that executes a different instruction set. Having a restricted instruction set (e.g., the unnecessary instructions which give the potential to write unsafe code are omitted) and safe emulation (i.e., performing necessary checks before executing an instruction) by the virtual machine can both improve the security of the system. In this way, a virtual machine can be used to increase security, provide enhanced performance and simplify software migration.

3.1.1 Captured goals and requirements

The virtual machine design process starts by capturing the requirements. In the EPCC framework, we execute the proof generator on the VEP. The proof generator can be a package of a decompression algorithm and the compressed proof. In this way, by executing the proof generator, the consumer is actually decompressing the compressed proof. We used the GUNZip algorithm as a representative of algorithms within the decompression techniques area. As a guideline, we tried to design the VEP in a way that it can support an efficient execution of programs written in a broad range of languages.

The requirements of a virtual machine are mainly concerned with properties such as size, portability, performance, memory consumption, scalability, security, etc. In the case of the VEP, we dealt with the following requirements:

1. The VEP should provide us with a platform which has the potential of working with the Kolmogorov ideal compressor. According to Kolmogorov complexity, this ideal compressor runs on a universal computer.
2. It should enable the execution of the proof generator at the consumer side in a secure manner. That is, the VEP should provide a tightly controlled set of resources for the proof generator. Network access, the ability to inspect the host system, or reading from input devices and writing into file streams should be disallowed. Therefore, the VEP should be able to perform execution monitoring.