Advanced Topics in Types and Programming Languages
Benjamin C. Pierce, editor
The MIT Press
Cambridge, Massachusetts
Contents
Preface
1 ML
By François Pottier and Didier Rémy
1.1 Preliminaries
1.2 What is ML?
1.3 Constraints
1.4 HM(X)
1.5 A purely constraint-based type system: PCB(X)
1.6 Constraint generation
1.7 Type soundness
1.8 Constraint solving
1.9 From ML-the-calculus to ML-the-programming-language
1.10 Universal quantification in constraints
1.11 Rows
2 Logical Relations and a Case Study in Equivalence Checking
By Karl Crary
2.1 The Equivalence Problem
2.2 Untyped Equivalence Checking
2.3 Type-Driven Equivalence
2.4 An Equivalence Algorithm
2.5 Completeness: A First Attempt
2.6 Logical Relations
2.7 A Monotone Logical Relation
2.9 The Fundamental Theorem
2.10 Notes
3 Typed Operational Reasoning
By Andrew Pitts
3.1 Introduction
3.2 Motivating Examples
3.3 The Language
3.4 Contextual Equivalence
3.5 An Operationally-Based Logical Relation
3.6 Operational Extensionality
3.7 Notes
4 Dependent Types
By David Aspinall and Martin Hofmann
4.1 Pure first-order dependent types
4.2 Properties
4.3 Algorithmic typing and equality
4.4 Dependent sum types
4.5 The Calculus of Constructions
4.6 Relating abstractions: Pure Type Systems
4.7 Implementation
4.8 Further reading
5 Effect Types and Region-based Memory Management
By Fritz Henglein, Henning Makholm, and Henning Niss
5.1 Type-based program analysis
5.2 Value flow analysis
5.3 Effects
5.4 Region-based memory management
5.5 The Tofte-Talpin type system
5.6 Region inference
5.7 More powerful models for region-based memory management
5.8 Practical region-based memory management systems
6 Substructural Type Systems
By David Walker
6.1 Structural Properties
6.3 Extensions and Variations
6.4 An Ordered Type System
6.5 Further Applications
6.6 Notes
7 Proof-Carrying Code
By George Necula
7.1 Overview of Proof Carrying Code
7.2 Formalizing the Safety Policy
7.3 Verification-Condition Generation
7.4 Soundness Proof
7.5 The Representation and Checking of Proofs
7.6 Proof Generation
7.7 PCC Beyond Types
7.8 Conclusion
8 Typed Assembly Language
By Greg Morrisett
8.1 TAL-0: Control-Flow-Safety
8.2 The TAL-0 Type System
8.3 TAL-1: Simple Memory-Safety
8.4 TAL-1 Changes to the Type System
8.5 Compiling to TAL-1
8.6 Some Real World Issues
8.7 Scaling to Other Language Features
8.8 Conclusions
9 Design Issues in Advanced Module Systems
By Robert Harper and Benjamin C. Pierce
9.1 Basic Modularity
9.2 Phase Distinction and Phase Separation
9.3 Data Abstraction
9.4 Hierarchical Modularity
9.5 Families of Interfaces
9.6 Families of Modules
9.7 Advanced Topics
9.8 Relation to Existing Languages
9.9 History and Further Reading
10 Type Definitions
By Christopher A. Stone
10.1 Definitions in the Typing Context
10.2 Definitions in Modules
10.3 Singleton Kinds
10.4 Notes
A Notational Conventions
A.1 Metavariable Names
A.2 Rule Naming Conventions
B Solutions to Selected Exercises
References
Index
1 ML
By François Pottier and Didier Rémy
1.1 Preliminaries
Names and renaming
Mathematicians and computer scientists use names to refer to arbitrary or unknown objects in the statement of a theorem, to refer to the parameters of a function, etc. Names are convenient because they are understandable by humans; nevertheless, they can be tricky. An in-depth treatment of the difficulties associated with names and renaming is beyond the scope of the present chapter: we encourage the reader to study Gabbay and Pitts' excellent series of papers (Gabbay and Pitts, 2002; Pitts, 2002b). Here, we merely recall a few notions that are used throughout this chapter. Consider, for instance, an inductive definition of the abstract syntax of a simple programming language, the pure $\lambda$-calculus:
$$t ::= z \mid \lambda z.t \mid t\ t$$
Here, the meta-variable z ranges over an infinite set of variables (that is, names), while the meta-variable t ranges over terms. As usual in mathematics, we write "the variable z" and "the term t" instead of "the variable denoted by z" and "the term denoted by t". The above definition states that a term may be a variable z, a pair of a variable and a term, written $\lambda z.t$, or a pair of terms, written $t_1\ t_2$. However, this is not quite what we need. Indeed, according to this definition, the terms $\lambda z_1.z_1$ and $\lambda z_2.z_2$ are distinct, while we would like them to be a single mathematical object, because we intend $\lambda z.z$ to mean "the function that maps z to z", a meaning that is independent of the name z. To achieve this effect, we complete the above definition by stating that the construction $\lambda z.t$ binds z within t. One may also say that $\lambda z$ is a binder whose scope is t. Then, $\lambda z.t$ is no longer a pair: rather, it is an abstraction of the variable z within the term t. Abstractions have the property that the identity of the bound variable does not matter; that is, $\lambda z_1.z_1$ and $\lambda z_2.z_2$ are the same term. Informally, we say that terms are considered equal modulo $\alpha$-conversion.
Footnote: The (currently unfinished) code that accompanies this chapter may be found at http://pauillac.inria.fr/~remy/mlrow/. For space reasons, some material, including proofs, exercises, and more, has been left out of this version. In the future, a full version of this chapter that includes the missing material will be made available at the same address. In spite of these omissions, this chapter is still oversize with respect to Benjamin's 100 page barrier: we currently have roughly 135 pages of text and 15 pages of solutions to exercises. We would appreciate comments and suggestions from the proofreaders as to how this chapter ...
Once the position and scope of binders are known, several standard notions follow, such as the set of free variables of a term t, written $fv(t)$, and the capture-avoiding substitution of a term $t_1$ for a variable z within a term $t_2$, written $[z \mapsto t_1]t_2$. For conciseness, we write $fv(t_1, t_2)$ for $fv(t_1) \cup fv(t_2)$. A term is said to be closed when it has no free variables.
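
The notions of free variables, capture-avoiding substitution, and equality modulo $\alpha$-conversion can be made concrete with a small sketch. The Python code below is only an illustration under stated assumptions: the class names `Var`, `Lam`, `App` and the helpers `fresh` and `alpha_eq` are hypothetical and do not come from the code that accompanies the chapter.

```python
from dataclasses import dataclass
from itertools import count

# Hypothetical representation of t ::= z | \z.t | t t (names are assumptions).
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def fv(t):
    """Set of free variables of the term t."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Lam):
        return fv(t.body) - {t.param}
    return fv(t.fun) | fv(t.arg)

_counter = count()

def fresh(avoid):
    """Pick a variable name that is fresh for the set `avoid`."""
    while True:
        name = f"z{next(_counter)}"
        if name not in avoid:
            return name

def subst(z, t1, t2):
    """Capture-avoiding substitution [z -> t1]t2."""
    if isinstance(t2, Var):
        return t1 if t2.name == z else t2
    if isinstance(t2, App):
        return App(subst(z, t1, t2.fun), subst(z, t1, t2.arg))
    if t2.param == z:
        return t2  # z is rebound here, so it has no free occurrence below
    if t2.param in fv(t1):
        # Rename the binder first so that it cannot capture a variable of t1.
        new = fresh(fv(t1) | fv(t2.body) | {z})
        renamed = subst(t2.param, Var(new), t2.body)
        return Lam(new, subst(z, t1, renamed))
    return Lam(t2.param, subst(z, t1, t2.body))

def alpha_eq(a, b, pairs=()):
    """Equality modulo alpha-conversion; `pairs` lists matched binders, innermost first."""
    if isinstance(a, Var) and isinstance(b, Var):
        for x, y in pairs:
            if a.name == x or b.name == y:
                return a.name == x and b.name == y
        return a.name == b.name
    if isinstance(a, Lam) and isinstance(b, Lam):
        return alpha_eq(a.body, b.body, ((a.param, b.param),) + pairs)
    if isinstance(a, App) and isinstance(b, App):
        return alpha_eq(a.fun, b.fun, pairs) and alpha_eq(a.arg, b.arg, pairs)
    return False
```

In this sketch, substituting z for y inside $\lambda z.y$ renames the binder before substituting, so the result is an abstraction whose body is the free variable z rather than the identity function. Representing terms with explicit names and renaming on demand is only one possible design; it is chosen here because it mirrors the informal definitions most directly.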
A renaming is a total bijective mapping from variables to variables that affects only a finite number of variables. The sole property of a variable is its identity, that is, the fact that it is distinct from other variables. As a result, at a global level, all variables are interchangeable: if a theorem holds in the absence of hypotheses about any particular variable, then any renaming of it holds as well. We often make use of this fact. When proving a theorem T, we say that a hypothesis H may be assumed without loss of generality (w.l.o.g.) if the theorem T follows from the theorem $H \Rightarrow T$ via a renaming argument, which is usually left implicit.
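If $\bar{z}_1$ and $\bar{z}_2$ are sets of variables, we write $\bar{z}_1 \mathrel{\#} \bar{z}_2$ as a shorthand for $\bar{z}_1 \cap \bar{z}_2 = \varnothing$, and say that $\bar{z}_1$ is fresh for $\bar{z}_2$ (or vice versa). We say that $\bar{z}$ is fresh for t if and only if $\bar{z} \mathrel{\#} fv(t)$ holds.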
In this chapter, we work with several distinct varieties of names: program variables, memory locations, and type variables, the latter of which may be further divided into kinds. We draw names of different varieties from disjoint sets, each of which is infinite.
1.2 What is ML?
The name "ML" appeared during the late seventies. It then referred to a general-purpose programming language that was used as a meta-language (whence its name) within the theorem prover LCF (Gordon, Milner, and Wadsworth, 1979b). Since then, several new programming languages, each of which offers several different implementations, have drawn inspiration from it. So, what does "ML" stand for today?
For a semanticist, "ML" might stand for a programming language featuring memory cells called references, exception handling, automatic memory management, and a call-by-value semantics. This view encompasses the Standard ML (Milner, Tofte, and Harper, 1990) and Caml (Leroy, 2000) families of programming languages. We refer to it as ML-the-programming-language.
For a type theorist, "ML" might stand for a particular breed of type systems, based on the simply-typed $\lambda$-calculus, but extended with a simple form of polymorphism introduced by let declarations. These type systems have decidable type inference; their type inference algorithms crucially rely on first-order unification and can be made efficient in practice. In addition to Standard ML and Caml, this view encompasses programming languages such as Haskell (Hudak, Peyton Jones, Wadler, Boutel, Fairbairn, Fasel, Guzman, Hammond, Hughes, Johnsson, Kieburtz, Nikhil, Partain, and Peterson, 1992) and Clean (Brus, van Eekelen, van Leer, and Plasmeijer, 1987), whose semantics is rather different (indeed, it is pure and lazy) but whose type system fits this description. We refer to it as ML-the-type-system. It is also referred to as Hindley and Milner's type discipline in the literature.
For us, "ML" might also stand for the particular programming language whose formal definition is given and studied in this chapter. It is a core calculus featuring first-class functions, let declarations, and constants. It is equipped with a call-by-value semantics. By customizing constants and their semantics, one may recover data structures, references, and more. We refer to this particular calculus as ML-the-calculus.
Why study ML-the-type-system today, such a long time after its initial discovery? One may think of at least two reasons.
First, its treatment in the literature is often cursory, because it is considered either as a simple extension of the simply-typed $\lambda$-calculus (TAPL Chapter 9) or as a subset of Girard and Reynolds' System F (TAPL Chapter 23). The former view is supported by the claim that the let construct, which distinguishes ML-the-type-system from the simply-typed $\lambda$-calculus, may be understood as a simple textual expansion facility. However, this view only tells part of the story, because it fails to give an account of the principal types property enjoyed by ML-the-type-system, leads to a naïve type inference algorithm whose time complexity is exponential, and breaks down when the language is extended with side effects, such as state or exceptions. The latter view is supported by the fact that every type derivation within ML-the-type-system is also a valid type derivation within an implicitly-typed variant of System F. Such a view is correct, but again fails to give an account of type inference for ML-the-type-system, since type inference for System F is undecidable (Wells, 1999).
Second, existing accounts of type inference for ML-the-type-system (Milner, Jones, 1999) usually involve heavy manipulations of type substitutions. Such a ubiquitous use of type substitutions is often quite obscure. Furthermore, actual implementations of the type inference algorithm do not explicitly manipulate substitutions; instead, they extend a standard first-order unification algorithm, where terms are updated in place as new equations are discovered (Huet, 1976). Thus, it is hard to tell, from these accounts, how to write an efficient type inference algorithm for ML-the-type-system. Yet, in spite of the increasing speed of computers, efficiency remains crucial when ML-the-type-system is extended with expensive features, such as Objective Caml's object types (Rémy and Vouillon, 1998) or polymorphic methods (Garrigue and Rémy, 1999).
For these reasons, we believe it is worth giving an account of ML-the-type-system that focuses on type inference and strives to be at once elegant and faithful to an efficient implementation. To achieve these goals, we forego type substitutions and instead put emphasis on constraints, which offer a number of advantages. First, constraints allow a modular presentation of type inference as the combination of a constraint generator and a constraint solver. Such a decomposition allows reasoning separately about when a program is correct, on the one hand, and how to check whether it is correct, on the other hand. It has long been standard in the setting of the simply-typed $\lambda$-calculus (TAPL Chapter 22), but, to the best of our knowledge, has never been proposed for ML-the-type-system. Second, it is often natural to define and implement the solver as a constraint rewriting system. Then, the constraint language allows reasoning not only about correctness (is every rewriting step meaning-preserving?) but also about low-level implementation details, since constraints are the data structures manipulated throughout the type inference process. For instance, describing unification in terms of multi-equations (Jouannaud and Kirchner, 1991) allows reasoning about the sharing of nodes in memory, which a substitution-based approach cannot account for. Last, constraints are more general than type substitutions, and allow describing many extensions of ML-the-type-system, among which extensions with recursive types, rows, subtyping, first-order unification under a mixed prefix, and more.
Before delving into the details of this new presentation of ML-the-type-system, however, it is worth recalling its standard definition. Thus, in what follows, we first define the syntax and operational semantics of the programming language ML-the-calculus, and equip it with a type system, known as Damas and Milner's type system.
$x, y ::=$                                        Identifiers:
    $z$                                           Variable
    $m$                                           Memory location
    $c$                                           Constant
$t ::=$                                           Expressions:
    $x$                                           Identifier
    $\lambda z.t$                                 Function
    $t\ t$                                        Application
    $\mathsf{let}\ z = t\ \mathsf{in}\ t$         Local definition
$v, w ::=$                                        Values:
    $z$                                           Variable
    $m$                                           Memory location
    $\lambda z.t$                                 Function
    $c\ v_1 \ldots v_k$                           Data, where $c \in Q^+ \wedge k \le a(c)$
    $c\ v_1 \ldots v_k$                           Partial application, where $c \in Q^- \wedge k < a(c)$
$E ::=$                                           Evaluation contexts:
    $[]$                                          Empty context
    $E\ t$                                        Left side of an application
    $v\ E$                                        Right side of an application
    $\mathsf{let}\ z = E\ \mathsf{in}\ t$         Local definition
Figure 1-1: Syntax of ML-the-calculus
ML-the-calculus
The syntax of ML-the-calculus is defined in Figure 1-1. It is made up of several syntactic categories.
Identifiers group several kinds of names that may be referenced in a program: variables, memory locations, and constants. We let x and y range over identifiers. Variables (sometimes called program variables to avoid ambiguity) are names that may be bound to values using $\lambda$ or let binding forms; in other words, they are names for function parameters or local definitions. We let z and f range over program variables. We sometimes write $\_$ for a program variable that does not occur free within its scope: for instance, $\lambda\_.t$ stands for $\lambda z.t$, provided z is fresh for t. Memory locations are names that represent memory addresses. By convention, memory locations never appear in source programs, that is, programs that are submitted to a compiler. They only appear during execution, when new memory blocks are allocated. Constants are fixed names for primitive values and operations, such as integer literals and integer arithmetic operations. Constants are elements of a finite or infinite set Q. They are never subject to $\alpha$-conversion. Program variables, memory locations, and constants belong to distinct syntactic classes and may never be confused.
The set of constants Q is kept abstract, so most of our development is independent of its concrete definition. We assume that every constant c has a nonnegative integer arity $a(c)$. We further assume that Q is partitioned into subsets of constructors $Q^+$ and destructors $Q^-$. Constructors and destructors operate on values.
1.2.1 Example [Integers]: For every integer n, one may introduce a nullary constructor $\hat{n}$. In addition, one may introduce a binary destructor $\hat{+}$, whose applications are written infix, so $t_1 \mathbin{\hat{+}} t_2$ stands for the double application $\hat{+}\ t_1\ t_2$ of the destructor $\hat{+}$ to the expressions $t_1$ and $t_2$. □
Expressions (also known as program terms or programs) are the main syntactic category. Indeed, unlike procedural languages such as C and Java, functional languages, including ML-the-programming-language, suppress the distinction between expressions and statements. Expressions include identifiers, $\lambda$-abstractions, applications, and local definitions. The $\lambda$-abstraction $\lambda z.t$ represents the function of one parameter named z whose result is the expression t, or, in other words, the function that maps z to t. Note that the variable z is bound within the term t, so (for instance) $\lambda z_1.z_1$ and $\lambda z_2.z_2$ are the same object. The application $t_1\ t_2$ represents the result of calling the function $t_1$ with actual parameter $t_2$, or, in other words, the result of applying $t_1$ to $t_2$. Application is left-associative, that is, $t_1\ t_2\ t_3$ stands for $(t_1\ t_2)\ t_3$. The construct $\mathsf{let}\ z = t_1\ \mathsf{in}\ t_2$ represents the result of evaluating $t_2$ after binding the variable z to $t_1$. Note that the variable z is bound within $t_2$, but not within $t_1$, so for instance $\mathsf{let}\ z_1 = z_1\ \mathsf{in}\ z_1$ and $\mathsf{let}\ z_2 = z_1\ \mathsf{in}\ z_2$ are the same object. The construct $\mathsf{let}\ z = t_1\ \mathsf{in}\ t_2$ has the same meaning as $(\lambda z.t_2)\ t_1$, but is dealt with in a more flexible way by ML-the-type-system. To sum up, the syntax of ML-the-calculus is that of the pure $\lambda$-calculus, extended with memory locations, constants, and the let construct.
Values form a subset of expressions. They are expressions whose evaluation is completed. Values include identifiers, $\lambda$-abstractions, and applications of constants, of the form $c\ v_1 \ldots v_k$, where k does not exceed c's arity if c is a constructor, and k is smaller than c's arity if c is a destructor. In what follows, we are often interested in closed values, that is, values that do not contain any free program variables. We use the meta-variables v and w for values.
1.2.2 Example: The integer literals $\ldots, \widehat{-1}, \hat{0}, \hat{1}, \ldots$ are nullary constructors, so they are values. Integer addition $\hat{+}$ is a binary destructor, so it is a value, and so is every partial application $\hat{+}\ v$. Thus, both $\hat{+}\ \hat{1}$ and $\hat{+}\ \hat{+}$ are values. An application of $\hat{+}$ to two values, such as $\hat{2} \mathbin{\hat{+}} \hat{2}$, is not a value. □
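
The definition of values depends only on arities and on the constructor/destructor split, which a short sketch can make explicit. The Python code below is a hedged illustration: the tables `ARITY` and `CONSTRUCTORS`, the string names `"+"` and `"pair"`, and the convention that integer literals are constants whose name is a Python `int` are all assumptions introduced for this example.

```python
from dataclasses import dataclass

# Hypothetical expression forms (names are assumptions for this sketch).
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Loc:
    name: str

@dataclass(frozen=True)
class Const:
    name: object  # an int for an integer literal, a str for an operation

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

# Assumed arities a(c) and constructor set Q+; integer literals are nullary constructors.
ARITY = {"+": 2, "pair": 2}
CONSTRUCTORS = {"pair"}

def arity(c):
    return 0 if isinstance(c.name, int) else ARITY[c.name]

def is_constructor(c):
    return isinstance(c.name, int) or c.name in CONSTRUCTORS

def spine(t):
    """Split an application c t1 ... tk into its head and its argument list."""
    args = []
    while isinstance(t, App):
        args.append(t.arg)
        t = t.fun
    return t, args[::-1]

def is_value(t):
    """Check membership in the syntactic class of values v, w."""
    if isinstance(t, (Var, Loc, Lam)):
        return True
    head, args = spine(t)
    if not isinstance(head, Const):
        return False
    k = len(args)
    # Constructors allow k <= a(c); destructors require k < a(c).
    within = k <= arity(head) if is_constructor(head) else k < arity(head)
    return within and all(is_value(a) for a in args)
```

With these assumptions, a partial application of addition is classified as a value, while a full application of addition to two literals is not, since it can still be reduced.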
1.2.3 Example [Pairs]: Let $(\cdot,\cdot)$ be a binary constructor. If $t_1$ and $t_2$ are expressions, then the double application $(\cdot,\cdot)\ t_1\ t_2$ may be called the pair of $t_1$ and $t_2$, and may be written $(t_1, t_2)$. By the definition above, $(t_1, t_2)$ is a value if and only if $t_1$ and $t_2$ are both values. □
Stores are finite mappings from memory locations to closed values. A store $\mu$ represents a heap, that is, a collection of memory blocks, each of which is allocated at a particular address in memory and may contain pointers to other elements of the heap. ML-the-programming-language allows overwriting the contents of an existing memory block, an operation sometimes referred to as a side effect. In the operational semantics, this effect is achieved by mapping an existing memory location to a new value. We write $\varnothing$ for the empty store. We write $\mu[m \mapsto v]$ for the store that maps m to v and otherwise coincides with $\mu$. When $\mu$ and $\mu'$ have disjoint domains, we write $\mu\mu'$ for their union. We write $dom(\mu)$ for the domain of $\mu$ and $range(\mu)$ for the set of memory locations that appear in its codomain.
The operational semantics of a purely functional language, such as the pure $\lambda$-calculus, may be defined as a rewriting system on expressions. Because ML-the-calculus has side effects, however, we define its operational semantics as a rewriting system on configurations. A configuration $t/\mu$ is a pair of an expression t and a store $\mu$. The memory locations in the domain of $\mu$ are considered bound within $t/\mu$, so (for instance) $m_1/(m_1 \mapsto \hat{0})$ and $m_2/(m_2 \mapsto \hat{0})$ are the same object. In what follows, we are often interested in closed configurations, that is, configurations $t/\mu$ such that t has no free program variables and every memory location that appears within t or within the range of $\mu$ is in the domain of $\mu$. If t is a source program, its evaluation begins within an empty store, that is, with the configuration $t/\varnothing$. Because, by convention, source programs do not contain memory locations, this is a closed configuration. Furthermore, we shall see that all reducts of a closed configuration are closed as well. Please note that, instead of separating expressions and stores, it is possible to make store fragments part of the syntax of expressions; this idea, proposed in (Crank and Felleisen, 1991), is reminiscent of the encoding of reference cells in process calculi (Turner, 1995; Fournet and Gonthier, 1996).
A context is an expression where a single subexpression has been replaced with a hole, written $[]$. Evaluation contexts form a strict subset of contexts. In an evaluation context, the hole is meant to highlight a point in the program where it is valid to apply a reduction rule. Thus, the definition of evaluation contexts determines a reduction strategy: it tells where and in what order reduction steps may occur. For instance, the fact that $\lambda z.[]$ is not an evaluation context means that the body of a function is never evaluated, that is, not until the function is applied, see R-Beta below. The fact that $t\ E$ is an evaluation context only if t is a value means that, to evaluate an application $t_1\ t_2$, one should fully evaluate $t_1$ before attempting to evaluate $t_2$. More generally, in the case of a multiple application, it means that arguments should be evaluated from left to right. Of course, other choices could be made: for instance, defining $E ::= \ldots \mid t\ E \mid E\ v \mid \ldots$ would enforce a right-to-left evaluation order, while defining $E ::= \ldots \mid t\ E \mid E\ t \mid \ldots$ would leave the evaluation order unspecified, allowing reduction within both subexpressions, and making evaluation nondeterministic. The fact that $\mathsf{let}\ z = v\ \mathsf{in}\ E$ is not an evaluation context means that the body of a local definition is never evaluated, that is, not until the definition itself is reduced, see R-Let below. We write $E[t]$ for the expression obtained by replacing the hole in E with the expression t.
$$(\lambda z.t)\ v \to [z \mapsto v]t \quad \text{(R-Beta)}$$
$$\mathsf{let}\ z = v\ \mathsf{in}\ t \to [z \mapsto v]t \quad \text{(R-Let)}$$
$$\frac{t/\mu \xrightarrow{\delta} t'/\mu'}{t/\mu \to t'/\mu'} \quad \text{(R-Delta)}$$
$$\frac{t/\mu \to t'/\mu' \qquad dom(\mu'') \mathrel{\#} dom(\mu') \qquad range(\mu'') \mathrel{\#} dom(\mu' \setminus \mu)}{t/\mu\mu'' \to t'/\mu'\mu''} \quad \text{(R-Extend)}$$
$$\frac{t/\mu \to t'/\mu'}{E[t]/\mu \leadsto E[t']/\mu'} \quad \text{(R-Context)}$$
Figure 1-2: Semantics of ML-the-calculus
Figure 1-2 defines first a relation $\to$ between configurations, then a relation $\leadsto$ between closed configurations. If $t/\mu \to t'/\mu'$ or $t/\mu \leadsto t'/\mu'$ holds, then we say that the configuration $t/\mu$ reduces to the configuration $t'/\mu'$; the ambiguity involved in this definition is benign. If $t/\mu \to t'/\mu$ holds for every store $\mu$, then we write $t \to t'$ and say that the reduction is pure.
The key reduction rule is R-Beta, which states that a function application $(\lambda z.t)\ v$ reduces to the function body, namely t, where every occurrence of the formal argument z has been replaced with the actual argument v. The $\lambda$ construct, which prevented the function body t from being evaluated, disappears, so the new term may (in general) be further reduced. Because ML-the-calculus adopts a call-by-value strategy, rule R-Beta is applicable only if the actual argument is a value v. In other words, a function cannot be invoked until its actual argument has been fully evaluated. Rule R-Let is very similar to R-Beta. Indeed, it specifies that $\mathsf{let}\ z = v\ \mathsf{in}\ t$ has the same behavior, with respect to reduction, as $(\lambda z.t)\ v$. We remark that substitution of a value for a program variable throughout a term is expensive, so R-Beta and R-Let are never implemented literally: they are only a simple specification. Actual implementations usually employ runtime environments, which may be understood as a form of explicit substitutions (Abadi, Cardelli, Curien, and Lévy, 1991). Please note that our choice of a call-by-value reduction strategy is fairly arbitrary, and has essentially no impact on the type system; the programming language Haskell (Hudak, Peyton Jones, Wadler, Boutel, Fairbairn, Fasel, Guzman, Hammond, Hughes, Johnsson, Kieburtz, Nikhil, Partain, and Peterson, 1992), whose reduction strategy is known as lazy or call-by-need, also relies on Hindley and Milner's type discipline.
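
To see how the evaluation contexts and the rules R-Beta and R-Let fit together, here is a minimal Python sketch of a one-step reduction function for the pure fragment (no constants, no store). It is an illustration only: the class names, `step`, and `evaluate` are hypothetical, and the substitution is deliberately naive, which is adequate only under the assumption that programs are closed, so that every substituted value is closed and no capture can occur. Unlike the text, it substitutes literally instead of using runtime environments.

```python
from dataclasses import dataclass

# Hypothetical syntax for the pure fragment (names are assumptions).
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

@dataclass(frozen=True)
class Let:
    name: str
    bound: object
    body: object

def is_value(t):
    return isinstance(t, (Var, Lam))

def subst(z, v, t):
    """[z -> v]t, assuming v is closed so that no capture can occur."""
    if isinstance(t, Var):
        return v if t.name == z else t
    if isinstance(t, Lam):
        return t if t.param == z else Lam(t.param, subst(z, v, t.body))
    if isinstance(t, App):
        return App(subst(z, v, t.fun), subst(z, v, t.arg))
    # Let: the bound variable scopes over the body only.
    body = t.body if t.name == z else subst(z, v, t.body)
    return Let(t.name, subst(z, v, t.bound), body)

def step(t):
    """Perform one reduction step, or return None if t is irreducible."""
    if isinstance(t, App):
        if not is_value(t.fun):                 # context E t
            f = step(t.fun)
            return None if f is None else App(f, t.arg)
        if not is_value(t.arg):                 # context v E
            a = step(t.arg)
            return None if a is None else App(t.fun, a)
        if isinstance(t.fun, Lam):              # R-Beta
            return subst(t.fun.param, t.arg, t.fun.body)
        return None
    if isinstance(t, Let):
        if not is_value(t.bound):               # context let z = E in t
            b = step(t.bound)
            return None if b is None else Let(t.name, b, t.body)
        return subst(t.name, t.bound, t.body)   # R-Let
    return None                                 # no reduction under a lambda

def evaluate(t):
    """Iterate `step` until no rule applies."""
    nxt = step(t)
    while nxt is not None:
        t, nxt = nxt, step(nxt)
    return t
```

The order of the tests in `step` is what encodes the left-to-right, call-by-value strategy: the function position is reduced first, then the argument, and only then is R-Beta tried.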
Rule R-Delta describes the semantics of constants. It merely states that a certain relation $\xrightarrow{\delta}$ is a subset of $\to$. Of course, since the set of constants is unspecified, the relation $\xrightarrow{\delta}$ must be kept abstract as well. We require that, if $t/\mu \xrightarrow{\delta} t'/\mu'$ holds, then
(i) t is of the form $c\ v_1 \ldots v_n$, where c is a destructor of arity n; and
(ii) $dom(\mu)$ is a subset of $dom(\mu')$.
Condition (i) ensures that $\delta$-reduction concerns full applications of destructors only, and that these are evaluated in accordance with the call-by-value strategy. Condition (ii) ensures that $\delta$-reduction may allocate new memory locations, but not deallocate existing locations. In particular, a garbage collection operator, which destroys unreachable memory cells, cannot be made available as a constant. Doing so would not make much sense anyway in the presence of R-Extend, which states that any valid reduction is also valid in a larger store. Condition (ii) allows proving that, if $t/\mu$ reduces to $t'/\mu'$, then $dom(\mu)$ is a subset of $dom(\mu')$; this is left as an exercise to the reader.
1.2.4 Example [Integers, continued]: The operational semantics of integer addition may be defined as follows:
$$\hat{k}_1 \mathbin{\hat{+}} \hat{k}_2 \xrightarrow{\delta} \widehat{k_1 + k_2} \quad \text{(R-Add)}$$
The left-hand term is the double application $\hat{+}\ \hat{k}_1\ \hat{k}_2$, while the right-hand term is the integer literal $\hat{k}$, where k is the sum of $k_1$ and $k_2$. The distinction between object level and meta level (that is, between $\hat{k}$ and k) is needed here to avoid ambiguity. □
1.2.5 Example [Pairs, continued]: In addition to the pair constructor defined in Example 1.2.3, we may introduce two destructors $\pi_1$ and $\pi_2$ of arity 1. We may define their operational semantics as follows, for $i \in \{1, 2\}$:
$$\pi_i\ (v_1, v_2) \xrightarrow{\delta} v_i \quad \text{(R-Proj)}$$
Thus, our treatment of constants is general enough to account for pair construction and destruction; we need not build these features explicitly into the language. □
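
The $\delta$-rules for integers and pairs can be read as a partial function on full applications of destructors. The following Python sketch illustrates this under stated assumptions: constants are encoded by a hypothetical `Const` class whose name is an `int` for a literal and one of the strings `"+"`, `"pair"`, `"proj1"`, `"proj2"` otherwise, and arguments are assumed to already be values, as condition (i) on $\delta$-reduction requires.

```python
from dataclasses import dataclass

# Hypothetical encoding of constants and applications (assumptions for this sketch).
@dataclass(frozen=True)
class Const:
    name: object  # an int for a literal, a str for an operation

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def spine(t):
    """Split c v1 ... vn into its head and its list of arguments."""
    args = []
    while isinstance(t, App):
        args.append(t.arg)
        t = t.fun
    return t, args[::-1]

def pair(v1, v2):
    """The double application of the binary constructor for pairs."""
    return App(App(Const("pair"), v1), v2)

def is_literal(v):
    return isinstance(v, Const) and isinstance(v.name, int)

def delta(t):
    """Return the delta-reduct of t, or None when no rule applies."""
    head, args = spine(t)
    if head == Const("+") and len(args) == 2 and all(map(is_literal, args)):
        return Const(args[0].name + args[1].name)               # R-Add
    if head in (Const("proj1"), Const("proj2")) and len(args) == 1:
        inner, parts = spine(args[0])
        if inner == Const("pair") and len(parts) == 2:
            return parts[0] if head.name == "proj1" else parts[1]  # R-Proj
    return None
```

Returning `None` when no rule matches reflects the fact that the $\delta$ relation is partial: a destructor applied to arguments of the wrong shape simply has no reduct.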
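1.2.6 Exercise [Booleans, Recommended, ★★]: Let true and false be nullary constructors. Let if be a ternary destructor. Extend the operational semantics with
$$\mathsf{if}\ \mathsf{true}\ v_1\ v_2 \xrightarrow{\delta} v_1 \quad \text{(R-True)}$$
$$\mathsf{if}\ \mathsf{false}\ v_1\ v_2 \xrightarrow{\delta} v_2 \quad \text{(R-False)}$$
Let us use the syntactic sugar $\mathsf{if}\ t_0\ \mathsf{then}\ t_1\ \mathsf{else}\ t_2$ for the triple application $\mathsf{if}\ t_0\ t_1\ t_2$. Explain why these definitions do not quite provide the expected behavior. Without modifying the semantics of if, suggest a new definition of the syntactic sugar $\mathsf{if}\ t_0\ \mathsf{then}\ t_1\ \mathsf{else}\ t_2$ that corrects the problem. □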
1.2.7 Example [Sums]: Booleans may in fact be viewed as a special case of the more general concept of sum. Let $\mathsf{inj}_1$ and $\mathsf{inj}_2$ be unary constructors, called respectively left and right injections. Let case be a ternary destructor, whose semantics is defined as follows, for $i \in \{1, 2\}$:
$$\mathsf{case}\ (\mathsf{inj}_i\ v)\ v_1\ v_2 \xrightarrow{\delta} v_i\ v \quad \text{(R-Case)}$$
Here, the value $\mathsf{inj}_i\ v$ is being scrutinized, while the values $v_1$ and $v_2$, which are typically functions, represent the two arms of a standard case construct. The rule selects an appropriate arm (here, $v_i$) based on whether the value under scrutiny was formed using a left or right injection. The arm $v_i$ is executed and given access to the data carried by the injection (here, v). □
1.2.8 Exercise [★, ↛]: Explain how to encode true, false and the if construct in terms of sums. Check that the behavior of R-True and R-False is properly emulated. □
1.2.9 Example [References]: Let ref and ! be unary destructors. Let := be a binary destructor. We write $t_1 := t_2$ for the double application $:=\ t_1\ t_2$. Define the operational semantics of these three destructors as follows:
$$\mathsf{ref}\ v/\varnothing \xrightarrow{\delta} m/(m \mapsto v) \quad \text{if } m \text{ is fresh for } v \quad \text{(R-Ref)}$$
$$!m/(m \mapsto v) \xrightarrow{\delta} v/(m \mapsto v) \quad \text{(R-Deref)}$$
$$m := v/(m \mapsto v_0) \xrightarrow{\delta} v/(m \mapsto v) \quad \text{(R-Assign)}$$
According to R-Ref, evaluating $\mathsf{ref}\ v$ allocates a fresh memory location m and binds v to it. Because configurations are considered equal up to $\alpha$-conversion of memory locations, the choice of the name m is irrelevant, provided it is chosen fresh for v, so as to prevent inadvertent capture of the memory locations that appear free within v. By R-Deref, evaluating $!m$ returns the value bound to the memory location m within the current store. By R-Assign, evaluating $m := v$ discards the value $v_0$ currently bound to m and produces a new store where m is bound to v. Here, the value returned by the assignment $m := v$ is v itself; in ML-the-programming-language, it is usually a nullary constructor (), pronounced unit. □
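
The three rules for references can be sketched as functions from a value and a store to a result and a new store. The Python code below is a hedged illustration rather than an implementation from the chapter: stores are modeled as dictionaries that are never mutated in place, locations are strings produced by a hypothetical counter, and freshness is obtained by never reusing a name. The rules are stated in the text on minimal stores; the sketch works on arbitrary larger stores, which is what R-Extend permits.

```python
from itertools import count

_locations = count()  # hypothetical supply of location names

def ref(v, store):
    """R-Ref: allocate a fresh location bound to v; returns (m, new_store)."""
    m = f"m{next(_locations)}"
    while m in store:            # defensive: skip names already in the domain
        m = f"m{next(_locations)}"
    return m, {**store, m: v}

def deref(m, store):
    """R-Deref: read the value bound to m; the store is unchanged."""
    return store[m], store       # raises KeyError if m is a dangling location

def assign(m, v, store):
    """R-Assign: rebind m to v and return v; m must already be allocated."""
    if m not in store:
        raise KeyError(m)
    return v, {**store, m: v}
```

A short usage example under the same assumptions: starting from the empty store `{}`, `ref(0, {})` yields a location and a one-cell store, and a later `assign` produces a new store in which that location maps to the new value. Note that no function here ever shrinks the domain of the store, in line with condition (ii) on $\delta$-reduction.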
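1.2.10 Example [Recursion]: Let fix be a binary destructor, whose operational semantics is:
$$\mathsf{fix}\ v_1\ v_2 \xrightarrow{\delta} v_1\ (\mathsf{fix}\ v_1)\ v_2 \quad \text{(R-Fix)}$$
fix is a fixpoint combinator: it effectively allows recursive definitions of functions. Indeed, the construct $\mathsf{letrec}\ f = \lambda z.t_1\ \mathsf{in}\ t_2$ provided by ML-the-programming-language may be viewed as syntactic sugar for $\mathsf{let}\ f = \mathsf{fix}\ (\lambda f.\lambda z.t_1)\ \mathsf{in}\ t_2$. □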
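
The unfolding performed by R-Fix can be mimicked in a call-by-value host language. The Python sketch below is an illustration only: `fix` and `sum_to` are hypothetical names, and Python closures stand in for $\lambda$-abstractions. Because fix is a binary destructor, the partial application `fix(v1)` is a value and is not unfolded until a second argument arrives, which is what keeps the definition from looping.

```python
def fix(v1):
    """fix v1 v2 -> v1 (fix v1) v2; unfolding waits for the second argument."""
    return lambda v2: v1(fix(v1))(v2)

# letrec sum_to = \n. ... in ...  written as  fix (\f. \n. ...)
sum_to = fix(lambda f: lambda n: 0 if n == 0 else n + f(n - 1))
```

Here `sum_to(4)` unfolds the fixpoint once per recursive call and adds the integers from 4 down to 0.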
Rule R-Context completes the definition of the operational semantics by defining $\leadsto$, a relation between closed configurations, in terms of $\to$. The rule states that reduction may take place not only at the term's root, but also deep inside it, provided the path from the root to the point where reduction occurs forms an evaluation context. This is how evaluation contexts determine an evaluation strategy. As a purely technical point, because $\leadsto$ relates closed configurations only, we do not need to require that the memory locations in $dom(\mu' \setminus \mu)$ be fresh for E: indeed, every memory location that appears within E must be a member of $dom(\mu)$.
1.2.11 Exercise [★★, Recommended, ↛]: Assuming the availability of Booleans and conditionals, integer literals, subtraction, multiplication, integer comparison, and a fixpoint combinator, most of which were defined in previous examples, define a function that computes the factorial of its integer argument, and apply it to $\hat{3}$. Determine, step by step, how this expression reduces to a value. □
It is straightforward to check that, if $t/\mu$ reduces to $t'/\mu'$, then t is not a value. In other words, values are irreducible: they represent a completed computation. The proof is left as an exercise to the reader. The converse, however, does not hold: if $t/\mu$ is irreducible with respect to $\leadsto$, then t is not necessarily a value. In that case, the configuration $t/\mu$ is said to be stuck. It represents a runtime error, that is, a situation that does not allow computation to proceed, yet is not considered a valid outcome. A closed source program t is said to go wrong if and only if the configuration $t/\varnothing$ reduces to a stuck configuration.
1.2.12 Example: Runtime errors typically arise when destructors are applied to arguments of an unexpected nature. For instance, the expressions $\hat{+}\ \hat{1}\ m$ and $\pi_1\ \hat{2}$ and $!\hat{3}$ are stuck, regardless of the current store. The program $\mathsf{let}\ z = \hat{+}\ \hat{+}\ \mathsf{in}\ z\ \hat{1}$ is not stuck, because $\hat{+}\ \hat{+}$ is a value. However, its reduct through R-Let is $\hat{+}\ \hat{+}\ \hat{1}$, which is stuck, so this program goes wrong. The primary purpose of type systems is to prevent such situations from arising. □
1.2.13 Example: The configuration $!m/\mu$ is stuck if m is not in the domain of $\mu$. In that case, however, $!m/\mu$ is not closed. Because we consider $\leadsto$ as a relation between closed configurations only, this situation cannot arise. In other words, the semantics of ML-the-calculus never allows the creation of dangling pointers. As a result, no particular precautions need be taken to guard against them. Several strongly typed programming languages do nevertheless allow dangling pointers in a controlled fashion (Tofte and Talpin, 1997; Crary, Walker, and Morrisett, 1999b; DeLine and Fähndrich, 2001; Grossman, Morrisett, Jim, Hicks, Wang, and Cheney, 2002a). □
Damas and Milner's type system
ML-the-type-system was originally defined by Milner (1978). Here, we reproduce the definition given a few years later by Damas and Milner (1982), which is written in a more standard style: typing judgements are defined inductively by a collection of typing rules. We refer to this type system as DM.
To begin, we must define types. In DM, like in the simply-typed $\lambda$-calculus, types are first-order terms built out of type constructors and type variables. We begin with several considerations concerning the specification of type constructors.
First, we do not wish to fix the set of type constructors. Certainly, since ML-the-calculus has functions, we need to be able to form an arrow type $T \to T'$ out of arbitrary types $T$ and $T'$; that is, we need a binary type constructor $\to$. However, because ML-the-calculus includes an unspecified set of constants, we cannot say much else in general. If constants include integer literals and integer operations, as in Example 1.2.1, then a nullary type constructor int is needed; if they include pair construction and destruction, as in Examples 1.2.3 and 1.2.5, then a binary type constructor $\times$ is needed; and so on.
Second, it is common to refer to the parameters of a type constructor by position, that is, by numeric index. For instance, when one writes $T \to T'$, it is understood that the type constructor $\to$ has arity 2, that $T$ is its first parameter, known as its domain, and that $T'$ is its second parameter, known as its codomain. Here, however, we refer to parameters by names, known as directions. For instance, we define two directions domain and codomain and let the type constructor $\to$ have arity $\{domain, codomain\}$. The extra generality afforded by directions is exploited in the definition of nonstructural subtyping (Example 1.3.9) and in the definition of rows (Section 1.11).
Last, we allow types to be classified using kinds. As a result, every type constructor must come not only with an arity, but with a richer signature, which describes the kinds of the types to which it is applicable and the kind of the type that it produces. A distinguished kind $\star$ is associated with "normal" types, that is, types that are directly ascribed to expressions and values. For instance, the signature of the type constructor $\to$ is $\{domain \mapsto \star, codomain \mapsto \star\} \Rightarrow \star$, because it is applicable to two normal types and produces a normal type. Introducing kinds other than $\star$ allows viewing some terms as ill-formed types; this is illustrated, for instance, in Section 1.11. In the simplest case, however, $\star$ is really the only kind, so the signature of a type constructor is nothing but its arity (a set of directions), and every term is a well-formed type, provided every application of a type constructor respects its arity.
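1.2.14 Definition: Let d range over a finite or denumerable set of directions. Let $\kappa$ range over a finite or denumerable set of kinds. Let $\star$ be a distinguished kind. Let K range over partial mappings from directions to kinds. Let F range over a finite or denumerable set of type constructors, each of which has a signature of the form $K \Rightarrow \kappa$. The domain of K is referred to as the arity of F, while $\kappa$ is referred to as its image kind. We write $\kappa$ instead of $K \Rightarrow \kappa$ when K is empty. Let $\to$ be a type constructor of signature $\{domain \mapsto \star, codomain \mapsto \star\} \Rightarrow \star$. □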
The type constructors and their signatures collectively form a signature S. In the following, we assume that a fixed signature S is given and that every type constructor in it has finite arity, so as to ensure that types are machine representable. However, in Section 1.11, we shall explicitly work with several distinct signatures, some of which involve type constructors of denumerable arity.
A type variable is a name that is used to stand for a type. For simplicity, we assume that every type variable is branded with a kind, or, in other words, that type variables of distinct kinds are drawn from disjoint sets. Each of these sets of type variables is individually subject to $\alpha$-conversion: that is, renamings must preserve kinds. Attaching kinds to type variables is only a technical convenience: in practice, every operation performed during type inference preserves the property that every type is well-kinded, so it is not necessary to keep track of the kind of every type variable. It is only necessary to check that all types supplied by the user, within type declarations, type annotations, or module interfaces, are well-kinded.
1.2.15 Definition: For every kind $\kappa$, let $V_\kappa$ be a disjoint, denumerable set of type variables. Let X, Y, and Z range over the set V of all type variables. Let $\bar{X}$ and $\bar{Y}$ range over finite sets of type variables. We write $\bar{X}\bar{Y}$ for the set $\bar{X} \cup \bar{Y}$ and often write X for the singleton set $\{X\}$. We write $ftv(o)$ for the set of free type variables of an object o. □
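The set of types, ranged over by T, is the free many-kinded term algebra that arises out of the type constructors and type variables.
1.2.16 Definition: A type of kind $\kappa$ is either a member of $V_\kappa$, or a term of the form $F\,\{d_1 \mapsto T_1, \ldots, d_n \mapsto T_n\}$, where F has signature $\{d_1 \mapsto \kappa_1, \ldots, d_n \mapsto \kappa_n\} \Rightarrow \kappa$ and $T_1, \ldots, T_n$ are types of kind $\kappa_1, \ldots, \kappa_n$, respectively. □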
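
Definition 1.2.16 amounts to a simple well-kindedness check over a signature. The Python sketch below illustrates it under stated assumptions: the dictionary `SIGNATURE`, the kind names `"*"` and `"row"`, and the helpers `tapp`, `kind_of`, and `ftv` are all hypothetical, and the signature shown contains only the arrow constructor and a nullary `int` for the sake of the example.

```python
from dataclasses import dataclass

STAR = "*"  # the distinguished kind of "normal" types

# Hypothetical signature S: constructor -> ({direction: kind}, image kind).
SIGNATURE = {
    "->": ({"domain": STAR, "codomain": STAR}, STAR),
    "int": ({}, STAR),
}

@dataclass(frozen=True)
class TVar:
    name: str
    kind: str = STAR   # every type variable is branded with a kind

@dataclass(frozen=True)
class TApp:
    con: str
    args: tuple        # sorted tuple of (direction, type) pairs

def tapp(con, **args):
    """Build F {d1 -> T1, ..., dn -> Tn} from keyword arguments."""
    return TApp(con, tuple(sorted(args.items())))

def kind_of(ty):
    """Return the kind of ty, or raise TypeError if ty is ill-kinded."""
    if isinstance(ty, TVar):
        return ty.kind
    params, image = SIGNATURE[ty.con]
    args = dict(ty.args)
    if set(args) != set(params):
        raise TypeError(f"{ty.con}: expected directions {sorted(params)}")
    for direction, expected in params.items():
        if kind_of(args[direction]) != expected:
            raise TypeError(f"{ty.con}.{direction}: expected kind {expected}")
    return image

def ftv(ty):
    """Free type variables; types have no binders, so every variable is free."""
    if isinstance(ty, TVar):
        return {ty}
    result = set()
    for _, arg in ty.args:
        result |= ftv(arg)
    return result
```

Keying arguments by direction rather than by position follows the convention of the text; a positional encoding would work equally well for a signature in which directions are implicitly ordered.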
As a notational convention, we assume that, for every type constructor $F$,
the directions that form the arity of $F$ are implicitly ordered, so that we
may say that $F$ has signature
$\kappa_1 \otimes \cdots \otimes \kappa_n \Rightarrow \kappa$ and employ the
syntax $F\,T_1 \ldots T_n$ for applications of $F$. Applications of the type
constructor $\to$ are written infix and associate to the right, so
$T \to T' \to T''$ stands for $T \to (T' \to T'')$.
In order to give meaning to the free type variables of a type, or, more
generally, of a typing judgement, traditional presentations of
ML-the-type-system, including Damas and Milner's, employ type substitutions.
Most of our presentation avoids substitutions and uses constraints instead.
However, we do need substitutions on a few occasions, especially when relating
our presentation to Damas and Milner's.
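1.2.17 Definition: A type substitution $\theta$ is a total, kind-preserving
mapping of type variables to types that is the identity everywhere but on a
finite subset of $\mathcal{V}$, which we call the domain of $\theta$ and write
$\mathit{dom}(\theta)$. The range of $\theta$, which we write
$\mathit{range}(\theta)$, is the set $\mathit{ftv}(\theta(\mathit{dom}(\theta)))$.
A type substitution may naturally be viewed as a total, kind-preserving
mapping of types to types. In the following, we write $\bar{X} \mathbin{\#} \theta$
for $\bar{X} \mathbin{\#} (\mathit{dom}(\theta) \cup \mathit{range}(\theta))$.
We write $\theta \setminus \bar{X}$ for the restriction of $\theta$ outside
$\bar{X}$, that is, the restriction of $\theta$ to
$\mathcal{V} \setminus \bar{X}$. We sometimes let $\varphi$ denote a type
substitution. $\Box$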
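If $\vec{X}$ and $\vec{T}$ are respectively a vector of distinct type variables
and a vector of types of the same (finite) length, such that, for every index
$i$, $X_i$ and $T_i$ have the same kind, then $[\vec{X} \mapsto \vec{T}]$
denotes the substitution that maps $X_i$ to $T_i$ for every index $i$. The
domain of $[\vec{X} \mapsto \vec{T}]$ is a subset of $\bar{X}$, the set
underlying the vector $\vec{X}$. Its range is a subset of
$\mathit{ftv}(\bar{T})$, where $\bar{T}$ is the set underlying the vector
$\vec{T}$. Every substitution $\theta$ may be written under the form
$[\vec{X} \mapsto \vec{T}]$, where $\bar{X} = \mathit{dom}(\theta)$. Then,
$\theta$ is idempotent if and only if
$\bar{X} \mathbin{\#} \mathit{ftv}(\bar{T})$ holds.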
As pointed out earlier, types are first-order terms; that is, in the grammar
of types, none of the productions binds a type variable. As a result, every
type variable that appears within a type $T$ appears free within $T$. This
situation is identical to that of the simply-typed $\lambda$-calculus. Things
become more interesting when we introduce type schemes. As its name implies, a
type scheme may describe an entire family of types; this effect is achieved
via universal quantification over a set of type variables.
1.2.18 Definition: A type scheme $S$ is an object of the form
$\forall \bar{X}.T$, where $T$ is a type of kind $\star$ and the type
variables $\bar{X}$ are considered bound within $T$. $\Box$
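One may view the type $T$ as the trivial type scheme $\forall \varnothing.T$,
where no type variables are universally quantified, so types may be viewed as
a subset of type schemes. The type scheme $\forall \bar{X}.T$ may be viewed as
a finite way of describing the possibly infinite family of types of the form
$[\vec{X} \mapsto \vec{T}]T$, where $\vec{T}$ is arbitrary. Such types are
called instances of the type scheme $\forall \bar{X}.T$. Note that, throughout
most of this chapter, we work with constrained type schemes, a generalization
of DM type schemes (Definition 1.3.2).

$$\frac{\Gamma(x) = S}{\Gamma \vdash x : S} \quad \text{(dm-Var)}$$

$$\frac{\Gamma; z : T \vdash t : T'}{\Gamma \vdash \lambda z.t : T \to T'} \quad \text{(dm-Abs)}$$

$$\frac{\Gamma \vdash t_1 : T \to T' \qquad \Gamma \vdash t_2 : T}{\Gamma \vdash t_1\ t_2 : T'} \quad \text{(dm-App)}$$

$$\frac{\Gamma \vdash t_1 : S \qquad \Gamma; z : S \vdash t_2 : T}{\Gamma \vdash \mathsf{let}\ z = t_1\ \mathsf{in}\ t_2 : T} \quad \text{(dm-Let)}$$

$$\frac{\Gamma \vdash t : T \qquad \bar{X} \mathbin{\#} \mathit{ftv}(\Gamma)}{\Gamma \vdash t : \forall \bar{X}.T} \quad \text{(dm-Gen)}$$

$$\frac{\Gamma \vdash t : \forall \bar{X}.T}{\Gamma \vdash t : [\vec{X} \mapsto \vec{T}]T} \quad \text{(dm-Inst)}$$

Figure 1-3: Typing rules for DM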
Typing environments, or environments for short, are used to collect
assumptions about an expression's free identifiers.
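1.2.19 Definition: An environment $\Gamma$ is a finite ordered sequence of
pairs of a program identifier and a type scheme. We write $\varnothing$ for
the empty environment and $;$ for the concatenation of environments. An
environment may be viewed as a finite mapping from program identifiers to type
schemes by letting $\Gamma(x) = S$ if and only if $\Gamma$ is of the form
$\Gamma_1; x : S; \Gamma_2$, where $\Gamma_2$ contains no assumption about
$x$. The set of defined program identifiers of an environment $\Gamma$,
written $\mathit{dpi}(\Gamma)$, is defined by
$\mathit{dpi}(\varnothing) = \varnothing$ and
$\mathit{dpi}(\Gamma; x : S) = \mathit{dpi}(\Gamma) \cup \{x\}$. $\Box$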
To complete the definition of Damas and Milner's type system, there remains to
define typing judgements. A typing judgement takes the form
$\Gamma \vdash t : S$, where $t$ is an expression of interest, $\Gamma$ is an
environment, which typically contains assumptions about $t$'s free program
identifiers, and $S$ is a type scheme. Such a judgement may be read: under
assumptions $\Gamma$, the expression $t$ has the type scheme $S$. By abuse of
language, it is sometimes said that $t$ has type $S$. A typing judgement is
valid (or holds) if and only if it may be derived using the rules that appear
in Figure 1-3. An expression $t$ is well-typed within the environment $\Gamma$
if and only if some judgement of the form $\Gamma \vdash t : S$ holds; it is
ill-typed within $\Gamma$ otherwise.
Rule dm-Var allows fetching a type scheme for an identifier $x$ from the
environment. It is equally applicable to program variables, memory locations,
and constants. If no assumption concerning $x$ appears in the environment
$\Gamma$, then the rule isn't applicable. In that case, the expression $x$ is
ill-typed within $\Gamma$. Assumptions about constants are usually collected
in a so-called initial environment $\Gamma_0$. It is the environment under
which closed programs are typechecked, so every subexpression is typechecked
under some extension of $\Gamma_0$. Of course, the type schemes assigned by
$\Gamma_0$ to constants must be consistent with their operational semantics;
we say more about this later (Section 1.7). Rule dm-Abs specifies how to
typecheck a $\lambda$-abstraction $\lambda z.t$. Its premise requires the body
of the function, namely $t$, to be well-typed under an extra assumption, which
causes all free occurrences of $z$ within $t$ to receive a common type $T$.
Its conclusion forms the arrow type $T \to T'$ out of the types of the
function's formal parameter, namely $T$, and result, namely $T'$. It is worth
noting that this rule always augments the environment with a type $T$ (recall
that, by convention, types form a subset of type schemes) but never with a
nontrivial type scheme. dm-App states that the type of a function application
is the codomain of the function's type, provided that the domain of the
function's type is a valid type for the actual argument. dm-Let closely
mirrors the operational semantics: whereas the semantics of the local
definition $\mathsf{let}\ z = t_1\ \mathsf{in}\ t_2$ is to augment the runtime
environment by binding $z$ to the value of $t_1$ prior to evaluating $t_2$,
the effect of dm-Let is to augment the typing environment by binding $z$ to a
type scheme for $t_1$ prior to typechecking $t_2$. dm-Gen turns a type into a
type scheme by universally quantifying over a set of type variables that do
not appear free in the environment; this restriction is discussed in
Example 1.2.20 below. dm-Inst, on the contrary, turns a type scheme into one
of its instances, which may be chosen arbitrarily. These two operations are
referred to as generalization and instantiation. The notion of type scheme and
the rules dm-Gen and dm-Inst are characteristic of ML-the-type-system: they
distinguish it from the simply-typed $\lambda$-calculus.
1.2.20 Example: It is unsound to allow generalizing type variables that appear
free in the environment. For instance, consider the typing judgement
$z : X \vdash z : X$ (1), which, according to dm-Var, is valid. Applying an
unrestricted version of dm-Gen to it, we obtain
$z : X \vdash z : \forall X.X$ (2), whence, by dm-Inst,
$z : X \vdash z : Y$ (3). By dm-Abs and dm-Gen, we then have
$\varnothing \vdash \lambda z.z : \forall XY.X \to Y$. In other words, the
identity function has unrelated argument and result types! Then, the
expression $(\lambda z.z)\ \hat{0}\ \hat{0}$, which reduces to the stuck
expression $\hat{0}\ \hat{0}$, has type scheme $\forall Z.Z$. So, well-typed
programs may cause runtime errors: the type system is unsound.

What happened? It is clear that the judgement (1) is correct only because the
type assigned to $z$ is the same in its assumption and in its right-hand side.
For the same reason, the judgements (2) and (3), the former of which may be
written $z : X \vdash z : \forall Y.Y$, are incorrect. Indeed, such judgements
defeat the very purpose of environments, since they disregard their
assumption. By universally quantifying over $X$ in the right-hand side only,
we break the connection between occurrences of $X$ in the assumption, which
remain free, and occurrences in the right-hand side, which become bound. This
is correct only if there are in fact no free occurrences of $X$ in the
assumption. $\Box$
It is a key feature of ML-the-type-system that dm-Abs may only introduce a
type $T$, rather than a type scheme, into the environment. Indeed, this allows
the rule's conclusion to form the arrow type $T \to T'$. If instead the rule
were to introduce the assumption $z : S$ into the environment, then its
conclusion would have to form $S \to T'$, which is not a well-formed type. In
other words, this restriction is necessary to preserve the stratification
between types and type schemes. If we were to remove this stratification, thus
allowing universal quantifiers to appear deep inside types, we would obtain an
implicitly-typed version of System F (TAPL Chapter 23). Type inference for
System F is undecidable (Wells, 1999), while type inference for
ML-the-type-system is decidable, as we show later, so this design choice has a
rather drastic impact.
1.2.21 Exercise [$\star$, Recommended]: Build a type derivation for the
expression $\lambda z_1.\mathsf{let}\ z_2 = z_1\ \mathsf{in}\ z_2$ within DM. $\Box$
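1.2.22 Exercise [$\star$, Recommended]: Let $\mathsf{int}$ be a nullary type
constructor of signature $\star$. Let $\Gamma_0$ consist of the bindings
$\hat{+} : \mathsf{int} \to \mathsf{int} \to \mathsf{int}$ and
$\hat{k} : \mathsf{int}$, for every integer $k$. Can you find derivations of
the following valid typing judgements? Which of these judgements are valid in
the simply-typed $\lambda$-calculus, where
$\mathsf{let}\ z = t_1\ \mathsf{in}\ t_2$ is syntactic sugar for
$(\lambda z.t_2)\ t_1$?

$$\Gamma_0 \vdash \lambda z.z : \mathsf{int} \to \mathsf{int}$$
$$\Gamma_0 \vdash \lambda z.z : \forall X.X \to X$$
$$\Gamma_0 \vdash \mathsf{let}\ f = \lambda z.z\ \hat{+}\ \hat{1}\ \mathsf{in}\ f\ \hat{2} : \mathsf{int}$$
$$\Gamma_0 \vdash \mathsf{let}\ f = \lambda z.z\ \mathsf{in}\ f\ f\ \hat{2} : \mathsf{int}$$

Show that the expressions $\hat{1}\ \hat{2}$ and $\lambda f.(f\ f)$ are
ill-typed within $\Gamma_0$. Could these expressions be well-typed in a more
powerful type system? $\Box$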
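1.2.23 Exercise [$\star\star$]: In fact, the rules shown in Figure 1-3 are not
exactly Damas and Milner's original rules. In (Damas and Milner, 1982), the
generalization and instantiation rules are:

$$\frac{\Gamma \vdash t : S \qquad X \notin \mathit{ftv}(\Gamma)}{\Gamma \vdash t : \forall X.S} \quad \text{(dm-Gen')}$$

$$\frac{\Gamma \vdash t : \forall \bar{X}.T \qquad \bar{Y} \mathbin{\#} \mathit{ftv}(\forall \bar{X}.T)}{\Gamma \vdash t : \forall \bar{Y}.[\vec{X} \mapsto \vec{T}]T} \quad \text{(dm-Inst')}$$

where $\forall X.S$ stands for $\forall X\bar{X}.T$ if $S$ stands for
$\forall \bar{X}.T$. Show that the combination of dm-Gen' and dm-Inst' is
equivalent to the combination of dm-Gen and dm-Inst. $\Box$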
DM enjoys a number of nice theoretical properties, which have practical
implications. First, under suitable hypotheses about the semantics of
constants, about the type schemes that they receive in the initial
environment, and (in the presence of side effects) under a slight restriction
of the syntax of let constructs, it is possible to show that the type system
is sound: that is, well-typed (closed) programs do not go wrong. This
essential property ensures that programs that are accepted by the typechecker
may be compiled without runtime checks. Furthermore, it is possible to show
that there exists an algorithm that, given a (closed) environment $\Gamma$ and
a program $t$, tells whether $t$ is well-typed with respect to $\Gamma$, and
if so, produces a principal type scheme $S$. A principal type scheme is such
that (i) it is valid, that is, $\Gamma \vdash t : S$ holds, and (ii) it is
most general, that is, every judgement of the form $\Gamma \vdash t : S'$
follows from $\Gamma \vdash t : S$ by dm-Inst and dm-Gen. (For the sake of
simplicity, we have stated the properties of the type inference algorithm only
in the case of a closed environment $\Gamma$; the specification is slightly
heavier in the general case.) This implies that type inference is decidable:
the compiler does not require expressions to be annotated with types. It also
implies that, under a fixed environment $\Gamma$, all of the type information
associated with an expression $t$ may be summarized in the form of a single
(principal) type scheme, which is very convenient.
Road map
Before proving the above claims, we first generalize our presentation by
moving to a constraint-based setting. The necessary tools, namely the
constraint language, its interpretation, and a number of constraint
equivalence laws, are introduced in Section 1.3. In Section 1.4, we describe
the standard constraint-based type system HM(X) (Odersky, Sulzmann, and Wehr,
1999a; Sulzmann, Müller, and Zenger, 1999; Sulzmann, 2000). We prove that,
when constraints are made up of equations between free, finite terms, HM(X) is
a reformulation of DM. In the presence of a more powerful constraint language,
HM(X) is an extension of DM. In Section 1.5, we propose an original
reformulation of HM(X), dubbed PCB(X), whose distinctive feature is to exploit
type scheme introduction and instantiation constraints. In Section 1.6, we
show that, thanks to the extra expressive power afforded by these constraint
forms, type inference may be viewed as a combination of constraint generation
and constraint solving, as promised earlier. Indeed, we define a constraint
generator and relate it with PCB(X). Then, in Section 1.7, we give a type
soundness theorem. It is stated purely in terms of constraints, but, thanks to
the results developed in the previous sections, applies equally to PCB(X),
HM(X), and DM.
Throughout this core material, the syntax and interpretation of constraints
are left partly unspecified. Thus, the development is parameterized with
respect to them; hence the unknown X in the names HM(X) and PCB(X). We really
describe a family of constraint-based type systems, all of which share a
common constraint generator and a common type soundness proof. Constraint
solving, however, cannot be independent of X: on the contrary, the design of
an efficient solver is heavily dependent on the syntax and interpretation of
constraints. In Section 1.8, we consider constraint solving in the particular
case where constraints are made up of equations interpreted in a free tree
model, and define a constraint solver on top of a standard first-order
unification algorithm.
The remainder of this chapter deals with extensions of the framework. In
Section 1.9, we explain how to extend ML-the-calculus with a number of
features, including data structures, pattern matching, and type annotations.
In Section 1.10, we extend the constraint language with universal
quantification and describe a number of extra features that require this
extension, including a different flavor of type annotations, polymorphic
recursion, and first-class universal and existential types. Last, in
Section 1.11, we extend the constraint language with rows and describe their
applications, which include extensible variants and records.
1.3 Constraints
In this section, we define the syntax and logical meaning of constraints. Both
are partly unspecified. Indeed, the set of type constructors
(Definition 1.2.14) must contain at least the binary type constructor $\to$,
but might contain more. Similarly, the syntax of constraints involves a set of
so-called predicates on types, which we require to contain at least a binary
subtyping predicate $\leq$, but might contain more. Furthermore, the logical
interpretation of type constructors and of predicates is left almost entirely
unspecified. This freedom allows reasoning not only about Damas and Milner's
type system, but also about a family of constraint-based extensions of it.

Type constructors other than $\to$ and predicates other than $\leq$ will never
explicitly appear in the definition of our constraint-based type systems,
precisely because the definition is parametric with respect to them. They can
(and usually do) appear in the type schemes assigned to constructors and
destructors by the initial environment $\Gamma_0$.
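The introduction of subtyping has little impact on the complexity of our
proofs, yet increases the framework's expressive power. When subtyping is not
desired, we interpret the predicate $\leq$ as equality.

$\sigma ::=$                                              type scheme:
    $\forall \bar{X}[C].T$
$C, D ::=$                                                constraint:
    $\mathsf{true}$                                       truth
    $\mathsf{false}$                                      falsity
    $P\ T_1 \ldots T_n$                                   predicate application
    $C \wedge C$                                          conjunction
    $\exists \bar{X}.C$                                   existential quantification
    $\mathsf{def}\ x : \sigma\ \mathsf{in}\ C$            type scheme introduction
    $x \preceq T$                                         type scheme instantiation
$\Gamma ::=$                                              typing environments:
    $\varnothing$
    $\Gamma; x : \sigma$
$C, D ::=$                                                syntactic sugar for constraints:
    $\ldots$                                              as before
    $\sigma \preceq T$                                    Definition 1.3.3
    $\mathsf{let}\ x : \sigma\ \mathsf{in}\ C$            Definition 1.3.3
    $\exists \sigma$                                      Definition 1.3.3
    $\mathsf{def}\ \Gamma\ \mathsf{in}\ C$                Definition 1.3.4
    $\mathsf{let}\ \Gamma\ \mathsf{in}\ C$                Definition 1.3.4
    $\exists \Gamma$                                      Definition 1.3.4

Figure 1-4: Syntax of type schemes and constraints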
Syntax
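We now define the syntax of constrained type schemes and of constraints, and
introduce some extra constraint forms as syntactic sugar.
1.3.1 Definition: Let $P$ range over a finite or denumerable set of
predicates, each of which has a signature of the form
$\kappa_1 \otimes \cdots \otimes \kappa_n \Rightarrow \cdot$, where
$n \geq 0$. Let $\leq$ be a distinguished predicate of signature
$\star \otimes \star \Rightarrow \cdot$. $\Box$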
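1.3.2 Definition: The syntax of type schemes and constraints is given in
Figure 1-4. It is further restricted by the following requirements. In the
type scheme $\forall \bar{X}[C].T$ and in the constraint $x \preceq T$, the
type $T$ must have kind $\star$. In the constraint $P\ T_1 \ldots T_n$, the
types $T_1, \ldots, T_n$ must have kind $\kappa_1, \ldots, \kappa_n$,
respectively, if $P$ has signature
$\kappa_1 \otimes \cdots \otimes \kappa_n \Rightarrow \cdot$. We write
$\forall \bar{X}.T$ for $\forall \bar{X}[\mathsf{true}].T$, which allows
viewing DM type schemes as a subset of constrained type schemes. $\Box$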
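We write $T_1 \leq T_2$ for the binary predicate application
$\leq T_1\ T_2$, and call it a subtyping constraint. By convention, $\exists$
and $\mathsf{def}$ bind tighter than $\wedge$; that is,
$\exists \bar{X}.C \wedge D$ is $(\exists \bar{X}.C) \wedge D$ and
$\mathsf{def}\ x : \sigma\ \mathsf{in}\ C \wedge D$ is
$(\mathsf{def}\ x : \sigma\ \mathsf{in}\ C) \wedge D$. In
$\forall \bar{X}[C].T$, the type variables $\bar{X}$ are bound within $C$ and
$T$. In $\exists \bar{X}.C$, the type variables $\bar{X}$ are bound within
$C$. The sets of free type variables of a type scheme $\sigma$ and of a
constraint $C$, written $\mathit{ftv}(\sigma)$ and $\mathit{ftv}(C)$,
respectively, are defined accordingly. In
$\mathsf{def}\ x : \sigma\ \mathsf{in}\ C$, the identifier $x$ is bound within
$C$. The sets of free program identifiers of a type scheme $\sigma$ and of a
constraint $C$, written $\mathit{fpi}(\sigma)$ and $\mathit{fpi}(C)$,
respectively, are defined accordingly. Please note that $x$ occurs free in the
constraint $x \preceq T$.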
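We immediately introduce a number of derived constraint forms:
1.3.3 Definition: Let $\sigma$ be $\forall \bar{X}[C].T$. If
$\bar{X} \mathbin{\#} \mathit{ftv}(T')$ holds, then $\sigma \preceq T'$ (read:
$T'$ is an instance of $\sigma$) stands for the constraint
$\exists \bar{X}.(C \wedge T \leq T')$. We write $\exists \sigma$ (read:
$\sigma$ has an instance) for $\exists \bar{X}.C$ and
$\mathsf{let}\ x : \sigma\ \mathsf{in}\ C$ for
$\exists \sigma \wedge \mathsf{def}\ x : \sigma\ \mathsf{in}\ C$. $\Box$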
Constrained type schemes generalize Damas and Milner's type schemes, while our
definition of instantiation constraints generalizes Damas and Milner's
instance relation (Definition 1.2.18). Let us draw a comparison. First, Damas
and Milner's instance relation yields a "yes/no" answer, and is purely
syntactic: for instance, the type $Y \to Z$ is not an instance of
$\forall X.X \to X$ in Damas and Milner's sense, because $Y$ and $Z$ are
distinct type variables. In our presentation, on the other hand,
$\forall X.X \to X \preceq Y \to Z$ is not an assertion; rather, it is a
constraint, which by definition is
$\exists X.(\mathsf{true} \wedge X \to X \leq Y \to Z)$. We later prove that
it is equivalent to $\exists X.(Y \leq X \wedge X \leq Z)$ and to $Y \leq Z$,
or, if subtyping is interpreted as equality, to $Y = Z$. That is,
$\sigma \preceq T'$ represents a condition on (the types denoted by) the type
variables in $\mathit{ftv}(\sigma, T')$ for $T'$ to be an instance of
$\sigma$, in a logical, rather than purely syntactic, sense. Second, the
definition of instantiation constraints involves subtyping, so as to ensure
that any supertype of an instance of $\sigma$ is again an instance of $\sigma$
(see rule C-ExTrans in Figure 1-6 and Lemma 1.3.17). This is consistent with
the purpose of subtyping, which is to allow supplying a subtype where a
supertype is expected (TAPL Chapter 15). Third and last, every type scheme now
carries a constraint. The constraint $C$, whose free type variables may or may
not be members of $\bar{X}$, restricts the instances of the type scheme
$\forall \bar{X}[C].T$. This is expressed in the instantiation constraint
$\exists \bar{X}.(C \wedge T \leq T')$, where the values that $\bar{X}$ may
assume are restricted by the requirement that $C$ be satisfied. This
requirement vanishes in the case of DM type schemes, where $C$ is
$\mathsf{true}$. Our notions of constrained type scheme and of instantiation
constraint are standard: they are exactly those of HM(X) (Odersky, Sulzmann,
and Wehr, 1999a).
The constraint $\mathsf{true}$, which is always satisfied, mainly serves to
indicate the absence of a nontrivial constraint, while $\mathsf{false}$, which
has no solution, may be understood as the indication of a type error.
Composite constraints include conjunction and existential quantification,
which have their standard meaning, as well as type scheme introduction and
type scheme instantiation constraints, which are similar to Gustavsson and
Svenningsson's constraint abstractions (2001b). In short, the construct
$\mathsf{def}\ x : \sigma\ \mathsf{in}\ C$ binds the name $x$ to the type
scheme $\sigma$ within the constraint $C$. If $C$ contains a subconstraint of
the form $x \preceq T$, where this occurrence of $x$ is free in $C$, then this
subconstraint acquires the meaning $\sigma \preceq T$. Thus, the constraint
$x \preceq T$ is indeed an instantiation constraint, where the type scheme
that is being instantiated is referred to by name. The constraint
$\mathsf{def}\ x : \sigma\ \mathsf{in}\ C$ may be viewed as an explicit
substitution of the type scheme $\sigma$ for the name $x$ within $C$. Later
(Section 1.5), we use such explicit substitutions to supplant typing
environments. That is, where Damas and Milner's type system augments the
current typing environment (dm-Abs, dm-Let), we introduce a new $\mathsf{def}$
binding in the current constraint; where it looks up the current typing
environment (dm-Var), we employ an instantiation constraint. The point is that
it is then up to a constraint solver to choose a strategy for reducing
explicit substitutions (for instance, one might wish to simplify $\sigma$
before substituting it for $x$ within $C$), whereas the use of environments in
standard type systems such as DM and HM(X) imposes an eager substitution
strategy, which is inefficient and thus never literally implemented. The use
of type scheme introduction and instantiation constraints allows separating
constraint generation and constraint solving without compromising efficiency,
or, in other words, without introducing a gap between the description of the
type inference algorithm and its actual implementation. Although the algorithm
that we plan to describe is not new, its description in terms of constraints
is: to the best of our knowledge, the only close relative of our
$\mathsf{def}$ constraints is to be found in (Gustavsson and Svenningsson,
2001b). Fähndrich, Rehof, and Das's instantiation constraints (2000) are also
related, but may be recursive and are meant to be solved using a
semi-unification procedure, as opposed to a unification algorithm extended
with facilities for creating and instantiating type schemes, as in our case.
One consequence of introducing constraints inside type schemes is that some
type schemes have no instances at all, or have instances only if a certain
constraint holds. For instance, the type scheme
$\sigma = \forall X[\mathsf{bool} = \mathsf{int}].X$, where the nullary type
constructors $\mathsf{int}$ and $\mathsf{bool}$ have distinct interpretations,
has no instances; that is, no constraint of the form $\sigma \preceq T'$ has a
solution. The type scheme $\sigma = \forall Z[X = Y \to Z].Z$ has an instance
only if $X = Y \to Z$ holds for some $Z$; in other words, for every $T'$,
$\sigma \preceq T'$ entails $\exists Z.(X = Y \to Z)$. (We define entailment
on page 29.) We later prove that the constraint $\exists \sigma$ is equivalent
to $\exists Z.\sigma \preceq Z$, where $Z \notin \mathit{ftv}(\sigma)$
(Exercise 1.3.23). That is, $\exists \sigma$ expresses the requirement that
$\sigma$ have an instance. Type schemes that do not have an instance indicate
a type error, so in many situations, one wishes to avoid them; for this
reason, we often use the constraint form
$\mathsf{let}\ x : \sigma\ \mathsf{in}\ C$, which requires $\sigma$ to have an
instance and at the same time associates it with the name $x$. Because the
$\mathsf{def}$ form is more primitive, it is easier to work with at a low
level, but it is no longer explicitly used after Section 1.3; we always use
$\mathsf{let}$ instead.
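1.3.4 Definition: Environments $\Gamma$ remain as in Definition 1.2.19, except
DM type schemes $S$ are replaced with constrained type schemes $\sigma$. We
write $\mathit{dfpi}(\Gamma)$ for
$\mathit{dpi}(\Gamma) \cup \mathit{fpi}(\Gamma)$. We define
$\mathsf{def}\ \varnothing\ \mathsf{in}\ C = C$ and
$\mathsf{def}\ \Gamma; x : \sigma\ \mathsf{in}\ C = \mathsf{def}\ \Gamma\ \mathsf{in}\ \mathsf{def}\ x : \sigma\ \mathsf{in}\ C$.
Similarly, we define $\mathsf{let}\ \varnothing\ \mathsf{in}\ C = C$ and
$\mathsf{let}\ \Gamma; x : \sigma\ \mathsf{in}\ C = \mathsf{let}\ \Gamma\ \mathsf{in}\ \mathsf{let}\ x : \sigma\ \mathsf{in}\ C$.
We define $\exists \varnothing = \mathsf{true}$ and
$\exists(\Gamma; x : \sigma) = \exists \Gamma \wedge \mathsf{def}\ \Gamma\ \mathsf{in}\ \exists \sigma$. $\Box$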
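In order to establish or express certain laws of equivalence between
constraints, we need constraint contexts. A context is a constraint with zero,
one, or several holes, written $[]$. The syntax of contexts is as follows:

$$\mathcal{C} ::= [] \mid C \mid \mathcal{C} \wedge \mathcal{C} \mid \exists \bar{X}.\mathcal{C} \mid \mathsf{def}\ x : \sigma\ \mathsf{in}\ \mathcal{C} \mid \mathsf{def}\ x : \forall \bar{X}[\mathcal{C}].T\ \mathsf{in}\ C$$

The application of a constraint context $\mathcal{C}$ to a constraint $C$,
written $\mathcal{C}[C]$, is defined in the usual way. Because a context may
have any number of holes, $C$ may disappear or be duplicated in the process.
Because a hole may appear in the scope of a binder, some of $C$'s free type
variables and free program identifiers may become bound in $\mathcal{C}[C]$.
We write $\mathit{dtv}(\mathcal{C})$ and $\mathit{dpi}(\mathcal{C})$ for the
sets of type variables and program identifiers, respectively, that
$\mathcal{C}$ may thus capture. We write
$\mathsf{let}\ x : \forall \bar{X}[\mathcal{C}].T\ \mathsf{in}\ C$ for
$\exists \bar{X}.\mathcal{C} \wedge \mathsf{def}\ x : \forall \bar{X}[\mathcal{C}].T\ \mathsf{in}\ C$.
Being able to state such a definition is why we require multi-hole contexts.
We let $\mathcal{X}$ range over existential constraint contexts, defined by
$\mathcal{X} ::= [] \mid \exists \bar{X}.\mathcal{X}$.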
Meaning
We have defined the syntax of constraints and given an informal description of
their meaning. We now give a formal definition of the interpretation of
constraints. We begin with the definition of a model:
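1.3.5 Definition: For every kind $\kappa$, let $\mathcal{M}_\kappa$ be a
nonempty set, whose elements are the ground types of kind $\kappa$. In the
following, $t$ ranges over $\mathcal{M}_\kappa$, for some $\kappa$ that may be
determined from the context. For every type constructor $F$ of signature
$K \Rightarrow \kappa$, let $F$ denote a total function from $\mathcal{M}_K$
into $\mathcal{M}_\kappa$, where the indexed product $\mathcal{M}_K$ is the
set of all mappings of domain $\mathit{dom}(K)$ that map every
$d \in \mathit{dom}(K)$ to an element of $\mathcal{M}_{K(d)}$. For every
predicate $P$ of signature
$\kappa_1 \otimes \cdots \otimes \kappa_n \Rightarrow \cdot$, let $P$ denote a
predicate on $\mathcal{M}_{\kappa_1} \times \cdots \times \mathcal{M}_{\kappa_n}$.
We require the predicate $\leq$ on $\mathcal{M}_\star \times \mathcal{M}_\star$
to be a partial order. $\Box$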
For the sake of convenience, we abuse notation and write $F$ for both the type
constructor and its interpretation; similarly for predicates. We freely assume
that a binary equality predicate, whose interpretation is equality on
$\mathcal{M}_\kappa$, is available at every kind $\kappa$, so $T_1 = T_2$,
where $T_1$ and $T_2$ have kind $\kappa$, is a well-formed constraint.
By varying the set of type constructors, the set of predicates, the set of
ground types, and the interpretation of type constructors and predicates, one
may define an entire family of related type systems. We informally refer to
the collection of these choices as X. Thus, the type systems HM(X) and PCB(X),
described in Sections 1.4 and 1.5, are parameterized by X.
The following examples give standard ways of defining the set of ground types
and the interpretation of type constructors.
1.3.6 Example [Syntactic models]: For every kind $\kappa$, let
$\mathcal{M}_\kappa$ consist of the closed types of kind $\kappa$. Then,
ground types are types that do not have any free type variables, and form the
so-called Herbrand universe. Let every type constructor $F$ be interpreted as
itself. Models that define ground types and interpret type constructors in
this manner are referred to as syntactic. $\Box$
1.3.7 Example [Tree models]: Let a path $\pi$ be a finite sequence of
directions. The empty path is written $\epsilon$ and the concatenation of the
paths $\pi$ and $\pi'$ is written $\pi \cdot \pi'$. Let a tree be a partial
function $t$ from paths to type constructors whose domain is nonempty and
prefix-closed and such that, for every path $\pi$ in the domain of $t$, if the
type constructor $t(\pi)$ has signature $K \Rightarrow \kappa$, then
$\pi \cdot d \in \mathit{dom}(t)$ is equivalent to $d \in \mathit{dom}(K)$
and, furthermore, for every $d \in \mathit{dom}(K)$, the type constructor
$t(\pi \cdot d)$ has image kind $K(d)$. If $\pi$ is in the domain of $t$, then
the subtree of $t$ rooted at $\pi$, written $t/\pi$, is the partial function
$\pi' \mapsto t(\pi \cdot \pi')$. A tree is finite if and only if it has
finite domain. A tree is regular if and only if it has a finite number of
distinct subtrees. Every finite tree is thus regular. Let $\mathcal{M}_\kappa$
consist of the finite (resp. regular) trees $t$ such that $t(\epsilon)$ has
image kind $\kappa$: then, we have a finite (resp. regular) tree model.

If $F$ has signature $K \Rightarrow \kappa$, one may interpret $F$ as the
function that maps $T \in \mathcal{M}_K$ to the ground type
$t \in \mathcal{M}_\kappa$ defined by $t(\epsilon) = F$ and $t/d = T(d)$ for
$d \in \mathit{dom}(T)$, that is, the unique ground type whose head symbol is
$F$ and whose subtree rooted at $d$ is $T(d)$. Then, we have a free tree
model. Please note that free finite tree models coincide with syntactic
models, as defined in the previous example. $\Box$
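Finite trees in the sense of Example 1.3.7 are easy to model directly as finite maps from paths to type constructors. The Python sketch below is only an illustration: paths are tuples of direction names, the table `ARITY` is a made-up signature, and kinds are ignored (every constructor is assumed to have image kind $\star$).

```python
# Hypothetical signature: each constructor is mapped to its directions.
ARITY = {"->": ("domain", "codomain"), "int": (), "bool": ()}

def is_tree(t):
    """t maps paths (tuples of directions) to constructors. Its domain must
    be nonempty and prefix-closed, and the children of every node must be
    exactly the directions of that node's constructor."""
    if () not in t:
        return False
    for path, f in t.items():
        if path and path[:-1] not in t:
            return False
        children = {p[-1] for p in t
                    if len(p) == len(path) + 1 and p[:-1] == path}
        if children != set(ARITY[f]):
            return False
    return True

def subtree(t, path):
    """t/pi is the partial function pi' |-> t(pi . pi')."""
    n = len(path)
    return {p[n:]: f for p, f in t.items() if p[:n] == path}

def make(f, **children):
    """Free interpretation of F: the tree whose head symbol is f and whose
    subtree rooted at direction d is children[d]."""
    t = {(): f}
    for d, child in children.items():
        for p, g in child.items():
            t[(d,) + p] = g
    return t

int_t = make("int")
# int -> (int -> int)
f_t = make("->", domain=int_t, codomain=make("->", domain=int_t, codomain=int_t))
```

Because the domain of such a dictionary is finite, every tree built this way is finite, hence regular.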
Rows (Section 1.11) are interpreted in a tree model, albeit not a free one.
The following examples suggest different ways of interpreting the subtyping
predicate.
1.3.8 Example [Equality models]: The simplest way of interpreting the
subtyping predicate is to let $\leq$ denote equality on every
$\mathcal{M}_\kappa$. Models that do so are referred to as equality models.
When no predicate other than equality is available, we say that the model is
equality-only. $\Box$
1.3.9 Example [Structural, nonstructural subtyping]: Let a variance $\nu$ be a
nonempty subset of $\{-, +\}$, written $-$ (contravariant), $+$ (covariant),
or $\pm$ (invariant) for short. Define the composition of two variances as an
associative commutative operation with $+$ as neutral element and such that
$-- = +$ and $\pm- = \pm\pm = \pm$. Now, consider a free (finite or regular)
tree model, where every direction $d$ comes with a fixed variance $\nu(d)$.
Define the variance $\nu(\pi)$ of a path $\pi$ as the composition of the
variances of its elements. Let $\leqslant$ be a partial order on type
constructors such that (i) if $F_1 \leqslant F_2$ holds and $F_1$ and $F_2$
have signature $K_1 \Rightarrow \kappa_1$ and $K_2 \Rightarrow \kappa_2$,
respectively, then $K_1$ and $K_2$ agree on the intersection of their domains
and $\kappa_1$ and $\kappa_2$ coincide; and (ii)
$F_0 \leqslant F_1 \leqslant F_2$ implies
$\mathit{dom}(F_0) \cap \mathit{dom}(F_2) \subseteq \mathit{dom}(F_1)$. Let
$\leqslant^{+}$, $\leqslant^{-}$, and $\leqslant^{\pm}$ stand for $\leqslant$,
$\geqslant$, and $=$, respectively. Then, define the interpretation of
subtyping as follows: if $t_1, t_2 \in \mathcal{M}_\kappa$, let
$t_1 \leq t_2$ hold if and only if, for every path
$\pi \in \mathit{dom}(t_1) \cap \mathit{dom}(t_2)$,
$t_1(\pi) \leqslant^{\nu(\pi)} t_2(\pi)$ holds. It is not difficult to check
that $\leq$ is a partial order on every $\mathcal{M}_\kappa$. The reader is
referred to (Kozen, Palsberg, and Schwartzbach, 1995) for more details about
this construction. Models that define subtyping in this manner are referred to
as nonstructural subtyping models.
A simple nonstructural subtyping model is obtained by letting the directions
$\mathit{domain}$ and $\mathit{codomain}$ be contra- and covariant,
respectively, and introducing, in addition to the type constructor $\to$, two
type constructors $\bot$ and $\top$ of signature $\star$. This gives rise to a
model where $\bot$ is the least ground type, $\top$ is the greatest ground
type, and the arrow type constructor is, as usual, contravariant in its domain
and covariant in its codomain.
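The interpretation of nonstructural subtyping on finite trees can be sketched as follows. The code is illustrative only and rests on several assumptions of ours: variances are encoded as sets of signs so that composition is a pointwise product, trees are dictionaries from paths to constructor names, and the ordering on constructors is taken to be `bot` below every constructor and `top` above every constructor.

```python
# Variances as nonempty subsets of {-1, +1}; composition multiplies signs.
PLUS, MINUS, INV = frozenset({1}), frozenset({-1}), frozenset({1, -1})
VARIANCE = {"domain": MINUS, "codomain": PLUS}  # assumed directions

def compose(u, v):
    return frozenset(a * b for a in u for b in v)

def path_variance(path):
    v = PLUS  # neutral element: the variance of the empty path
    for d in path:
        v = compose(v, VARIANCE[d])
    return v

def leq_con(f, g):
    """Assumed partial order on constructors: bot <= F <= top, F <= F."""
    return f == g or f == "bot" or g == "top"

def leq_con_at(f, g, v):
    """Ordering at variance v: + compares as is, - swaps the arguments, and
    the invariant case requires both, which amounts to equality."""
    return all(leq_con(f, g) if s == 1 else leq_con(g, f) for s in v)

def subtype(t1, t2):
    """t1 <= t2 iff the head symbols agree, up to variance, at every path
    that belongs to both domains."""
    return all(leq_con_at(t1[p], t2[p], path_variance(p))
               for p in t1.keys() & t2.keys())

def arrow(a, b):
    t = {(): "->"}
    t.update({("domain",) + p: f for p, f in a.items()})
    t.update({("codomain",) + p: f for p, f in b.items()})
    return t

bot, top = {(): "bot"}, {(): "top"}
```

With these definitions, `bot` is a subtype of `arrow(bot, top)` even though the two trees have different shapes, which is the hallmark of nonstructural subtyping.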
A typical use of nonstructural subtyping is in type systems for records. One
may, for instance, introduce a covariant direction $\mathit{content}$ of kind
$\star$, a kind $\diamond$, a type constructor $\mathsf{abs}$ of signature
$\diamond$, a type constructor $\mathsf{pre}$ of signature
$\{\mathit{content} \mapsto \star\} \Rightarrow \diamond$, and let
$\mathsf{pre} \leqslant \mathsf{abs}$. This gives rise to a model where
$\mathsf{pre}\ t \leq \mathsf{abs}$ holds for every $t \in \mathcal{M}_\star$.
This form of subtyping is called nonstructural because comparable ground types
may have different shapes, such as $\bot$ and $\bot \to \top$, or
$\mathsf{pre}\ \top$ and $\mathsf{abs}$. Nonstructural subtyping has been
studied, for example, in (Kozen, Palsberg, and Schwartzbach, 1995; Palsberg,
Wand, and O'Keefe, 1997; Pottier, 2001b; Niehren and Priesnitz, 2003).
Section 1.11 says more about typechecking operations on records.
An important particular case arises when any two type constructors related by
$\leqslant$ have the same arity. In that case, it is not difficult to show
that any two ground types related by subtyping must have the same shape, that
is, if $t_1 \leq t_2$ holds, then $\mathit{dom}(t_1)$ and $\mathit{dom}(t_2)$
coincide. For this reason, such an interpretation of subtyping is usually
referred to as atomic or structural subtyping. It has been studied in the
finite (Mitchell, 1984, 1991b; Frey, 1997; Rehof, 1997; …) and regular
(…, 1993) cases. Structural subtyping is often used in automated program
analyses that enrich standard types with atomic annotations without altering
their shape. $\Box$
Our last example suggests a predicate other than equality and subtyping.
1.3.10 Example [Conditional constraints]: Consider a nonstructural subtyping
model. For every type constructor $F$ of image kind $\kappa$ and for every
kind $\kappa'$, let $(F \leqslant \cdot \Rightarrow \cdot \leq \cdot)$ be a
predicate of signature
$\kappa \otimes \kappa' \otimes \kappa' \Rightarrow \cdot$. Thus, if $T_0$ has
kind $\kappa$ and $T_1$, $T_2$ have the same kind, then
$F \leqslant T_0 \Rightarrow T_1 \leq T_2$ is a well-formed constraint, called
a conditional subtyping constraint. Its interpretation is defined as follows:
if $t_0 \in \mathcal{M}_\kappa$ and $t_1, t_2 \in \mathcal{M}_{\kappa'}$, then
$F \leqslant t_0 \Rightarrow t_1 \leq t_2$ holds if and only if
$F \leqslant t_0(\epsilon)$ implies $t_1 \leq t_2$. In other words, if $t_0$'s
head symbol exceeds $F$ according to the ordering on type constructors, then
the subtyping constraint $t_1 \leq t_2$ must hold; otherwise, the conditional
constraint holds vacuously. Conditional constraints have been studied e.g. in
(Reynolds, 1969a; Heintze, 1993; Aiken, Wimmers, and Lakshman, 1994; Pottier,
2000; Su and Aiken, 2001). $\Box$
Many other kinds of constraints exist; see e.g. (Comon, 1993).

Throughout this chapter, we assume (unless stated otherwise) that the set of
type constructors, the set of predicates, and the model (which, together, form
the parameter X) are arbitrary and fixed.

As usual, the meaning of a constraint is a function of the meaning of its free
type variables, which is given by a ground assignment. The meaning of free
program identifiers may be defined as part of the constraint, if desired,
using a $\mathsf{def}$ prefix, so it need not be given by a separate
assignment.
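1.3.11 Definition: A ground assignment $\phi$ is a total, kind-preserving
mapping from $\mathcal{V}$ into $\mathcal{M}$. Ground assignments are extended
to types by $\phi(F\ T_1 \ldots T_n) = F(\phi(T_1), \ldots, \phi(T_n))$. Then,
for every type $T$ of kind $\kappa$, $\phi(T)$ is a ground type of kind
$\kappa$. Whether a constraint $C$ holds under a ground assignment $\phi$,
written $\phi \vdash C$ (read: $\phi$ satisfies $C$), is defined by the rules
in Figure 1-5. A constraint $C$ is satisfiable if and only if
$\phi \vdash C$ holds for some $\phi$. It is false if and only if
$\phi \vdash \mathsf{def}\ \Gamma\ \mathsf{in}\ C$ holds for no ground
assignment $\phi$ and environment $\Gamma$. $\Box$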
Let us now explain the rules that define constraint satisfaction (Figure 1-5).
They are syntax-directed: that is, to a given constraint, at most one rule
applies. It is determined by the nature of the first construct that appears
under a maximal $\mathsf{def}$ prefix. CM-True states that a constraint of the
form $\mathsf{def}\ \Gamma\ \mathsf{in}\ \mathsf{true}$ is a tautology, that
is, holds under every ground assignment. No rule matches constraints of the
form $\mathsf{def}\ \Gamma\ \mathsf{in}\ \mathsf{false}$, which means that such