A Versatile Constraint-Based Type Inferene System
Franois Pottier
Franois.Pottierinria.fr
Abstrat. Theombinationofsubtyping,onditionalonstraintsandrowsyieldsa
powerfulonstraint-basedtypeinferenesystem. Weillustratethislaimbypropos-
ingsolutionstothreedeliatetypeinfereneproblems: \aurate"patternmath-
ings, reord onatenation, and rst-lass messages. Previouslyknown solutions
involvedadierenttehniquein eah ase;ourtheoretial ontributionisinusing
onlyasinglesetoftools.Onthepratialside,thisallowsallthreeproblemstoben-
etfromaommonsetofonstraintsimpliationtehniques,aformaldesription
ofwhihisgivenin anappendix.
CRClassiation: F.3.3[LogisandMeaningsofPrograms℄: StudiesofProgram
Construts|TypeStruture.
Key words: Constraint-based type inferene. Subtyping. Rows. Conditional
onstraints.
1. Introdution
Type inferene is the task of examining a program whih laks some (or
evenall)typeannotations, andreoveringenoughtypeinformationtomake
itaeptable bya type heker. Itsoriginal, and most obvious,appliation
isto free theprogrammerfrom theburden ofmanuallyproviding these an-
notations,thusmaking statitypingalessdrearydisipline. However, type
inferene has also seen heavy use as a simple, modular way of formulating
programanalyses.
The design of a type inferene system an be inuened by its purpose.
When used as a user-visible way of enforing a oding disipline, it might
be desirable to make it simple and somewhat rigid. When used invisibly
aspart of a ompiler's optimization proess,on theother hand, maximum
preision may be desired. Regardless of this distintion, however, power-
ful type inferene tehniques are often made a neessity by the advaned
featuresfoundinmanyreent programminglanguages.
Thispaperpresentsaommonsolutiontoseveralseeminglyunrelatedtype
inferene problems, using an existing framework for subtyping-onstraint-
based type inferene [14 ℄, equipped withonditional onstraints inspiredby
Aiken,Wimmersand Lakshman[2 ℄ and withrows alaRemy[19 , 21 ℄.
INRIARoquenourt,BP105,78153 LeChesnayCedex,Frane.
Constraint-Based Type Inferene
Subtypingisapartialorderon types,denedsothatanobjetofasubtype
maysafelybesuppliedwhereveranobjetofasupertypeisexpeted. Type
inferene in the presene of subtyping reets this basi priniple. Every
timea piee of data is passedfrom a produer to a onsumer, theformer's
output type is required to be a subtype of thelatter's inputtype. This re-
quirementisexpliitlyreordedbyreating asymbolisubtyping onstraint
betweenthese types. Thus, eah potential dataow disovered inthe pro-
gram yields one onstraint. This fat allows viewing a onstraint set as a
diretedapproximationoftheprogram'sdataowgraph{regardlessofour
partiulardenitionof subtyping.
Varioustype inferenesystems basedonsubtypingonstraintsexist. One
mayiteworksbyAikenetal. [1,2 ,5℄,thepresentauthor[16 ,17 ℄, Trifonov
and Smith[29 ℄, aswellasOdersky et al.'sabstrat framework HM (X) [14 ,
28 ,27 ℄. Relatedsystems inludeset-basedanalysis [9 ,6 ℄andtype inferene
systems basedon featureonstraints[11 , 12 ℄ orprediateonstraints[10 ℄.
Conditional Constraints
In many onstraint-based systems, the expression if e
0
then e
1
else e
2
is,at best, desribedby
1
^
2
where
i
standsfore
i
'stype,and standsforthewholeexpression'stype.
This amounts to stating that \the value of e
1
(resp. e
2
) may beome the
value of the whole expression", regardless of the test's outome. A more
preise desription { \if e
0
may evaluate to true (resp. false), then the
value of e
1
(resp e
2
) may beome thevalue of the whole expression"{ an
begiven usingonditional onstraints:
true
0
?
1
^ false
0
?
2
Introduingtests into onstraints allows keeping trak of some of the pro-
gram's ontrol ow {that is,mirroring, at thelevelof types, theway eval-
uationisaeted bythe outomeof atest.
Conditional setexpressions were introdued byReynolds[25 ℄ asa means
ofsolvingsetonstraintsinvolvingstrittype onstrutors anddestrutors.
Heintze [9℄ uses them to formulate an analysis whih ignores \dead ode".
He also introdues ase onstraints, whih allow ignoring the eet of a
branh,ina aseonstrut, unless itis atually liableto be taken. Aiken,
WimmersandLakshman[2℄useonditionaltypes,togetherwithintersetion
types, forthispurpose.
In thepresent paper,we suggesta single notionof onditional onstraint,
whih isomparableinexpressive powerto theabove onstruts, andlends
itself to a simpleand eÆient implementation. (Asimilarhoie was made
independentlybyFahndrih[5℄.) Weemphasizeits useasawaynotonlyof
introduingsomeontrol intotypes,butalsoofdelaying typeomputations,
thusintroduingsome \laziness" into type inferene.
Rows
Designing a type system for a programming language with reords, or ob-
jets, requires some way of expressing labelled produts of types, where
labels areeldormethodnames. Dually,ifthelanguageallowsmanipulat-
ingstrutured data, thenits type system is likelyto require labelledsums,
wherelabels arenamesof dataonstrutors.
Wand [30℄ and Remy [19 , 21 ℄ elegantly deal with both problems at one
by introduing notationto expressdenumerable, indexed familiesof types,
alledrows:
::=;;:::;'; ;:::ja:; j
(Here, ranges over types, and a;b;::: range over indies.) An unknown
row may be representedby a row variable, exatly as in the ase of types.
(Bylakofsymbols,wewillnotsyntatiallydistinguishplaintypevariables
and row variables.) The term a :; represents a row whose element at
indexais,and whoseother elements aregivenby. The term stands
for a row whose element at any index is . These statements are given
formalmeaningbyinterpreting rowsina logialmodel wherethefollowing
equationshold:
a:
a
; (b:
b
; )=b:
b
; (a:
a
; )
=a:;
If desired, some type onstrutors may be lifted to the level of rows, i.e.
viewedasrowonstrutorsaswell. Forinstane,toliftthetypeonstrutor
!,we extendthesyntax of rows:
::=:::j!
The term ! 0
is logially interpreted as the row obtained by applying
thetype onstrutor !, point-wise, to the rows and 0
. As a result, the
logialmodelsatisesthe followingequations:
(a:; )!(a: 0
; 0
)=a:( ! 0
); (! 0
)
!
0
=( ! 0
)
Moredetailsaregiven inSetion2.
Rows oer a partiularly straightforward way of desribing operations
whih treat all labels (exept possibly a nite number thereof) uniformly.
Beauseevery failityavailableat theleveloftypes(e.g. onstrutors, on-
straints)an alsobemadeavailableatthelevelofrows,adesriptionofthe
operation's eet on a singlelabel,written usingtypes,an also be readas
a desription of the entire operation, written using rows. This interesting
PuttingIt All Together
Our point isto show that the ombination of the three onepts disussed
above yields a very expressive system, whih allows type inferene for a
number of advaned language features. Among these, \aurate" pattern
mathing onstruts, reord onatenation, and rst-lass messages will
be disussed in this paper. Our system allows performing type inferene
for all of these features at one. Furthermore, eÆieny issuesonerning
onstraint-based type inferene systems have already been studied [5 , 17 ℄.
This existing knowledge benets our system, whih may thus be used to
eÆiently performtypeinferene forall ofthe above features.
In this paper, we fous on appliations of ourtype system, i.e. we show
how it allows solving eah of the problems mentioned above. Formal de-
nitionsof ouronstraint resolution and simpliation algorithmsappearin
Appendix A. Furthermore, a robust prototype implementation is publily
available [18 ℄. We do notprove that thetypes given to the three problem-
ati operations disussed in this paper are sound, but we believe this is a
straightforwardtask.
The paper is organized as follows. Setion 2 gives a detailed tehnial
presentation of the type system. Setion 3 gives an informal explanation
of the potential osts and benets of using onditional onstraints. Se-
tions 4, 5, and 6 disuss type inferene for \aurate" pattern mathings,
reordonatenation, and rst-lass messages, respetively,within oursys-
tem. Setion7givesseveralexamples,whihshowwhat inferredtypeslook
like in pratie. Setion 8 sums up our ontribution. Lastly, Appendix A
givesdenitionsand proofsforseveral onstraint manipulationalgorithms.
2. Formal Presentation of the System
This setion gives an in-depth formal presentation of our type system, in
its most general form. Muh of it maybe skipped on a rst reading { the
followingsetionsdesribethe systemina moregentle fashion. Thereader
maywish to ome bakto thissetionat a laterstage.
We dene our type system as an instane of the parametri framework
HM(X) [14 , 28 ,27 ℄. To do so, we simply dene aonstraint system, alled
SRC(forsubtyping-rows-onditionals),givingrisetoHM (SRC). Byre-using
existingmaterial,wesavedenitionsandproofs,andemphasizethefatthat
ourapproahis standard.
In order to retaina measure of generality, SRC isitself parameterized by
a ground signature,whihis a suintdesriptionof a type algebraand of
itsintendedsubtypeordering. GroundsignaturesaredenedinSetion2.1.
Given suh a ground signature, we expliitly dene the syntax of types
and onstraints (Setion 2.2), a logial model within whih they may be
interpreted(Setion 2.3),and theinterpretation itself (Setion2.4).
2.1 Assumptions
Agroundsignatureonsistsofthreeomponents: aseriesofsymbol latties,
indexed by kinds, a set of parameter labels (eah of whih is either o- or
ontra-variant, desribes either a row or a plain type parameter, and has
a xed kind), and a desription of eah symbol's arity as a nite set of
parameterlabels.
Definition 1. Let K be a nite set of kinds. For every kind 2 K , let
S
bea lattieof symbols, withoperations ?
, >
,
, t
andu
. Dene
S=℄
2K S
.
Let L +
and L be denumerable sets of parameter labels. Dene L =
L +
℄L . Let L
row
L be a distinguished subset of row parameter labels.
Let kindbea total mapping of L into K .
Let a be a total mapping from S to nite subsets of L, suh that:
Æ for all s
0
;s
1
;s
2 2S
, s
0
s
1
s
2
implies a(s
0 )\a(s
2
)a(s
1 );
Æ forany nitesubsetS of S
,a(t
S)anda(u
S)aresubsetsof [a(S).
Note that this implies a(?
)=a(>
)=?.
Theinformation desribed above forms a ground signature.
Therstonditionbearingonaisneessarytoguaranteethattheorderings
dogiveriseto anorderingongroundtypes(denedinSetion2.3). The
seondonemakesthedenitionofsome onstraintmanipulationalgorithms
moreonvenient (see Denition23 inAppendixA).
Example 1. Assume there is only one kind ?. Dene S
?
= f?;!;>g,
where?
?
!
?
>. Let L =fdomg,L +
=frnggand L
row
=?. Dene
a(?)=a(>)=?anda(!)=fdom;rngg. Thisdenesagroundsignature,
whihallows typingthepure-alulus.
Example 2. Dene three kinds N, R and V, orresponding to normal,
reord eld and variant eld types, respetively. Let S
N
be the at lat-
tie whose elements other than ? and > are !, fg and [℄. Let S
R
be the lattie with least element Bot, greatest element Any, and whose
other elements are Abs, Pre and Either, ordered by Abs
R
Either and
Pre
R
Either. Let S
V
be the lattie with least element Abs, greatest el-
ement Any, and whose only other element is Pre. (By abuse of language,
we are giving idential names to symbols in S
R
and in S
V
. This remains
non-ambiguous as long as all terms onsidered have known kinds.) Let
L =fdomg,L +
=fontent ;ontents;rngg and L
row
=fontentsg. Dene
a(!)= fdom ;rngg, a(fg) =a([℄) =fontentsg, a(Pre) = a(Either) =
fontentg,anda(?)=a(>)=a(Bot)=a(Abs)=a(Any)=?. Thisdenes
agroundsignature,whihisexpressiveenoughto desribeallprogramming
languagefeaturesonsideredinthispaper. Inpartiular,allofitsexpressive
powerwillbe exploitedto desriberst-lass messagesinSetion6.
::=;;'; ;:::js(
l )
l 2a(s)
jr :; j
C::=truejC^C j9:C j js? (sprime inS
)
Figure1: Syntaxoftypesandonstraints
In the rest of this formal presentation, we assume given a xed, arbi-
trary ground signature. In Setions 4{7, we will use the ground signature
desribedinExample2 above, butwe willre-introdueitstep by step.
2.2 Syntax of Types and Constraints
The (raw) syntax of types and onstraints is given inFig. 1. ;;'; ;:::
denotetype variables. Atypeterms() anbeformedbypikingasymbol
s2S andafamilyoftypeparameters,indexedaordingto thearityofs,
i.e. mustbe of theform (
l )
l 2a(s)
. Lastly,types may also be rows, whih
denotefamiliesoftypesindexed byadenumerableset ofrow labels R. The
term r :; 0
(where r 2 R) represents a row whose element at index r is
, and whose other elements are given bythe row 0
. The term stands
fora rowwhoseelement at any indexis.
The onstraint language oers standard onstruts (truth, onjuntion,
projetion [14 ℄), subtyping onstraints, and onditional onstraints. The
latter are of the form s ? , where s must satisfy the following
ondition: foranynitesubsetSofS
,s
(t
S)implies9s 0
2S s
s
0
.
In other words, s must be a prime element of its symbol lattie S
. This
ensuresthata onditionalonstraintbearingontheleastupperboundof a
set of variables, e.g. (s
1
t:::t
n
)?, is equivalent to a onjuntion
of onditional onstraintsbearingon its members:
V
n
i=1 (s
i
?). It is a
neessary ondition for the orretness of the garbage olletion algorithm
(see Theorem3in AppendixA).
Our denitionof onditional onstraints is dissymmetri. Indeed, ondi-
tionsmustbeoftheforms;onditionsoftheform saredisallowed.
Themotivationforthisdeisionis toallowtheonstraintsolvingalgorithm
toignoreonditionalonstraintsunlesstheironditionmust besatised(see
Denition19inAppendixA). Ifbothformsofonditionswereallowedtoo-
exist,thelanguagewouldbeomeexpressiveenough to enode disjuntions
ofonstraints,making onstraint solvingmore ostly.
To ensure that only meaningful types and onstraints an be built, we
equipthemwithkinding andsorting rules. Thegrammarofsorts isdened
by& ::=TypejRow(R ), whereR ranges over nitesubsets ofR. Forevery
kind and every sort &,we assume given a distint,denumerableset V
&
of
type variables. We dene judgements of the form ` : (resp. ` : &),
meaning that the type has kind (resp. sort &), and judgements of the
2V
`:
s2S
8l2a(s) `l:kind(l )
`s(
l )
l2a(s) :
`
1 :
`2:
`(r:
1
;
2 ):
` :
` :
`true
`C1 `C2
`C1^C2
`C
`9:C
`1:
`2:
`12
s2S `0: `12
`s0?12
Figure2: Kindingrules
2V
&
`:&
a(s)\L
row
=?
8l2a(s) `l:&
`s(
l )
l2a(s) :&
a(s)\Lrow6=?
8l2a(s)nL
row
`
l :Type
8l2a(s)\Lrow `l:Row(?)
`s(
l )
l2a(s) :Type
`1:Type `2:Row(R℄frg)
`(r:
1
;
2
):Row(R )
`:Type
`:Row(R )
`true
`C1 `C2
`C
1
^C
2
`C
`9:C
`
1
:& `
2 :&
`12
`
0
:& `
1
:& `
2 :&
`s0?12
Figure3: Sortingrules
form`C,meaningthattheonstraintC iswell-kinded(resp. well-sorted).
Thekindingrules,giveninFig.2,simplyenforethekinddisiplinerequired
bytheground signature. The sortingrules,displayedinFig.3, ensurethat
only meaningful row terms are built. Intuitively, the sort Type desribes
plain types, while the sort Row(R ) desribes families of types indexed by
RnR . In other words, a row of sort Row (R ) gives information about all
row labels exept those in R . For more details, we refer the readerto [21 ℄
orto [20 , setion5℄.
Beforemovingon,let uspointoutthatatermmayhaveseveralsorts, for
two distint reasons. First, a uniformrow maybe viewed as desribing
any (o-nite) number of entries, i.e. it may have any sort Row (R ). As a
result, the row term r
1 :
1
; :::; r
n :
n
; may have any sort Row(R ),
provided fr
1
;:::;r
n
g\R = ?. Suh a term will be required to have sort
Row (?)onlywhenusedasthel-parameterofatypeonstrutorsexpeting
a full row in l-position (i.e. l 2 a(s)\L
row
). Seond, a type onstrutor
s with non-row parameters (i.e. a(s)\ L
row
= ?) an be used at any
sort &. For instane, if r :
0
; 0
0
and r :
1
; 0
1
have sort Row (R ), then
(r:
0
; 0
0
)!(r:
1
; 0
1
) hassort Row(R ) aswell. Itslogialinterpretation
willbe thesame asthatof r :
0
!
1
; 0
0
! 0
1 .
slightlymore subtle: themeaningofatermdependsonthesort at whihit
isviewed. Fortunately,themeaningofaonstraintwillremainindependent
ofthe sortof its omponents.
2.3 Logial Model
We now dene the logial model within whih our onstraints are inter-
preted. Informallyspeaking,itisthetermalgebra generatedbytheground
signature at hand. However, things are made more omplex by our desire
to have reursive types 1
and bythepresene ofrows.
Definition 2. Let A bethe alphabet formed of all letters l2LnL
row and
all omposite letters lr, where l 2 L
row
and r 2 R. To every l 2 L, we
assoiate a subset A
l
of the alphabet, dened by A
l
= flg if l 2 LnL
row ,
and A
l
=flr;r2Rg otherwise.
A pathp isanitestringover thealphabet A,i.e. an elementof A
. The
letter denotes the empty path. A ground tree t is a partial funtion from
A
into S, whosedomain is non-empty and prex-losed, suh that, for all
paths p2dom(t) and for all labels l2L,
Æ if l2 a(t(p)), then p:A
l
is a subset of dom(t), whose image through t
is a subsetof S
, where =kind(l);
Æ otherwise, p:A
l
lies outside of dom(t).
The head onstrutor of a ground term t, written hd(t), is t(). Given
p2dom(t), the subtree of trooted at p, written t:p, is the tree q7!t(p:q).
Given p, l suh that l 2 a(t(p))\L
row
, the subrow of t rooted at (p;l) is
the funtionr 2R7! t(p:(lr)). A funtionissaid to be quasi-onstant i
its o-restrition to some nite set is a onstant funtion. A groundtree is
regular i it has a nite number of subtrees. A ground tree t is a ground
typeiit isregularandallof itssubrowsarequasi-onstant. Wedenote the
setofgroundtypes byT. A groundtypethas kind ifandonlyif t()2S
.
We denote the set of ground types of kind by T
.
Then, we equip every T
with an ordering . Beause ground types are
innitetrees,annotbedenedeasilybystruturalindution;instead,it
isdened asthelimitof adereasing sequeneof pre-orders.
Definition 3. A family of pre-orders over every T
is dened as follows.
Let
0
beuniformlytrueover everyT
. Then,forany k 2N and t;t 0
2T
,
dene t
k+1 t
0
as the onjuntion of the following onditions:
Æ t()
t
0
();
Æ 8l2a(t())\a(t 0
())nL
row
t:l l
k t
0
:l;
Æ 8l2a(t())\a(t 0
())\L
row
8r2R t:(lr) l
k t
0
:(lr).
1
The presene of reursive types removes the need to hek whether all solutions of
a onstraint are yli, whih,in the presene of subtyping relationships between type
(We let t l
k t
0
stand for t
k t
0
when l 2 L +
and t 0
k
t when l 2 L .)
Subtyping,denotedby,istheintersetionofthesepre-orders; itisalattie
on every T
.
The subtyping relationship is strutural: t and t 0
are related ifand only
iftheirheadonstrutors t()and t 0
() arerelated inthelattie of symbols
and,forevery labelldened bybotht and t 0
,theirl-sub-termsare related
(either o- or ontra-variantly, depending on the variane of l). It is, in
general,non-atomi: type onstrutors ofdierent aritiesmayberelated.
2.4 Logial Interpretation
Thereremainstogive an interpretation oftypesand onstraintswithinthe
model. Itisparameterizedbyagroundsubstitution,whihgivesmeaningto
anyfreetypevariables. Itmapstypestogroundtypes,ortofamiliesthereof
(aordingto theirsort), andonstraintsto Boolean values.
Definition 4. A ground substitution is a funtion of domain V, whih
mapsV Type
into T
, andwhih maps V Row (R)
intothe set of quasi-onstant
funtions of RnR into T
.
Definition 5. The interpretation of a type of sort &, under a ground
substitution,written (
&
), orsimply () when& an bedetermined from
the ontext, isdened as follows.
Æ If is a type variable , then (
&
) isthe imageof through .
Æ If is of the form s(
l )
l 2a(s)
and & = Type , then (
&
) is the ground
type t suh that t() = s, t:l = (
l
) whenever l 2 a(s)nL
row and
t:(lr)=(
l
)(r) whenever l2a(s)\L
row
andr 2R.
Æ If isoftheforms(
l )
l 2a(s)
and& =Row(R ),then,foreveryr2RnR ,
(
&
)(r) is the ground type t suh that t() = s and t:l = (
l )(r)
whenever l2a(s).
Æ If is of the form r :
1
;
2
and & = Row (R ), then (
&
)(r) = (
1 )
and, for every r 0
2Rn(R[frg), (
&
)(r 0
)=(
2 )(r
0
).
Æ If is of the form
0
and & = Row(R ), then, for every r 2 RnR ,
(
&
)(r)=(
0 ).
Definition 6. The onstraint satisfation prediate `, whose arguments
are a ground substitution and a well-sorted onstraint C, is dened as
follows.
Æ `true holds.
Æ `C
1
^C
2
holds i `C
1
and`C
2 hold.
Æ `9:C holdsithereexistsagroundsubstitution 0
,whihoinides
with outside of , suh that 0
`C holds.
Æ If ` ; :Type ,then ` holds i( )( ) holds.
Æ If `
1
;
2
:Row (R ), then `
1
2
holds i, for every r 2 RnR ,
(
1
)(r)(
2
)(r) holds.
Æ If `
0
;
1
;
2
:Type , then ` s
0
?
1
2
holds i s
S (
0 )()
implies (
1
)(
2 ).
Æ If `
0
;
1
;
2
: Row(R ), then ` s
0
?
1
2
holds i, for every
r 2RnR , s
S (
0
)(r)() implies (
1
)(r)(
2 )(r).
Thisdenitioniswell-formedbeause,eventhoughthetypeswhihappear
inaonstraintmayhaveseveral admissiblesorts,allofthemgiverisetothe
same interpretation.
Lastly, onstraint entailmentis given its usualdenition: CC 0
holdsif
and onlyif,foreveryground substitution, `C implies`C 0
.
2.5 TheType System HM(SRC)
We refer to the onstraint logi dened in Setions 2.1{2.4 as SRC. It is a
sound onstraint system in the sense of [14 ℄; thus, it gives rise to a type
system, namelyHM (SRC),forthe-aluluswith let.
We do not repeat the typing rules of HM (X) in this paper. For our
purposes,suÆeittoreallthattypeshemesareoftheform ::=8[C℄: .
Whenall of atype sheme's variablesareuniversallyquantied,weusually
write\ where C".
The-aluluswithletisalimitedprogramminglanguage. Toextendit,
wewilldenenewprimitiveoperations,equippedwithoperationalsemantis
and appropriate type shemes. However, no extension to the type system
itself will be neessary. This explains why we do not desribe it further.
Instead,wewillfousourintereston writing expressive typeshemes.
3. About Conditional Constraints
Theontentofthissetionisinformal. Itshowshowonditionalonstraints
an be usedto gainextra typingexibility,and whywe mightwant to use
them onlysparingly.
In aall-by-valuelanguage, ifanexpressione
2
diverges,then sodoesany
appliation (e
1 e
2
). In partiular, ife
2
has type ?, then (e
1 e
2
) may safely
be given type ? as well. In other words, if it an be proven that e
1 will
neverbe alled,thenits returntype an be disarded.
It is possible to make a type system aware of this fat. To do so, one
merelyintroduesa new typing rule:
C ;( ;x :?)`e:
C ; `x:e:?!?
As a result, the type system is no longer syntax-direted: typing a -
abstration involves ahoie between thisruleand the usual-abstration
ases separately, sine that would have exponential ost. Instead, a natu-
ralsolutionis to usea single type inferene rule,whih emits aonditional
onstraint,along thelinesof
C ;( ;x:)`
I
e: ; fresh
(?<? )^C ; `
I
x:e:!
Aslongasthefuntionisn'tinvoked,?remainsanadmissiblesolutionforits
argumenttype. So,theonditionalonstrainthasnoeet,andremains
unonstrained,meaningthatthefuntionproduesno result. However, ifa
allto this funtionis laterdisovered,then willbe onstrained to some
valuegreaterthan?. Thiswilltriggertheonditionalonstraint,and!
willbeomealowerboundforthefuntion'stype,meaningthatthefuntion
produesa result oftype .
This tehnique allows designing a \lazy" type inferene system, whih
ignores the type of an expression unless it appears liable to be evaluated.
Heintze[9 ℄usesonditionaltypesforthisverypurpose. Infat,itispossible
toarrythisideaeven further,andto ignorenotonlytheexpression'stype,
butalso its eet on thetypingenvironment. This would involve replaing
(?<? )^C abovewith?<?( ^C);thus, theonstraintC,
whihdesribestherequirementsofthefuntiononerningitsenvironment,
would also be subjet to the ondition?< . This idea appears,under a
dierentformulation,ine.g. [26 ℄.
Despite their theoretial appeal, though, these proposals seem a bit ex-
treme. Theyproduealargenumberofonditionalonstraints,makingtype
inferene less eÆient, beause potential onstraint simpliations are de-
layed. Thus,inapratialsystem, \laziness"shouldbeusedonlysparingly.
We proposeto buildit into thetypesof a fewprimitive operations, rather
than to hard-wire it into thetyping rules. We willillustrate this priniple
inthefollowingsetions.
4. Aurate Analysis of Pattern Mathings
Whenfaedwithapatternmathingonstrut,mostexistingtypeinferene
systemsadopt asimple, onservative approah: assuming thateah branh
maybe taken,they letitontributeto thewholeexpression'stype. Amore
auratesystem shouldusetypes to prove that ertain branhes annot be
taken, andprevent them from ontributing.
In thissetion,we desribe suh a system. Theessential idea{ introdu-
ing a onditional onstrut at the level of types { is due to [9 , 2℄. Some
novelty residesin ourtwo-step presentation, whih we believe helpsisolate
independentonepts. First,weonsidertheasewhereonlyone dataon-
strutorexists. Then, we easily move to thegeneral ase, by enrihing the
typealgebra withrows.
4.1 TheBasi Case
We assume thelanguageallows buildingand aessingtagged values.
e::=:::jPrejPre 1
A singledataonstrutor,Pre, allows buildingtagged values,whilethede-
strutorPre 1
allowsaessingtheirontents. Thisrelationshipisexpressed
bythefollowingredutionrule:
Pre 1
v
1 (Prev
2
) redues to (v
1 v
2 )
The rule states that Pre 1
rst takes the tag othe value v
2
, then passes
itto thefuntionv
1 .
At thelevelof types,weintroduea(unary)varianttypeonstrutor[℄.
Also, we establish a distintion between so-alled \normal types," written
,and \eldtypes,"written .
::=;;;:::j?j>j ! j[℄
::='; ;:::jAbsjPre jAny
A subtype ordering over eld types is dened straightforwardly: Absis its
leastelement,Anyis its greatest,and Preis aovariant type onstrutor.
The dataonstrutor Preisgiven thefollowingtypesheme:
Pre : ![Pre℄
Notie that there is no way of buildinga value of type [Abs℄. Thus, if an
expressionhas this type, then itmustdiverge. This explains ourhoie of
names. If an expressionhas type[Abs℄, thenits value must be \absent";if
ithastype[Pre℄,thensome value oftype maybe\present".
The datadestrutor Pre 1
isdesribed asfollows:
Pre 1
: (!)!['℄!
where 'Pre
Pre'?
The onditional onstraint allows (Pre 1
e
1 e
2
) to reeive type ? when e
2
hastype[Abs℄,reetingthefatthatPre 1
isn'tinvokeduntile
2
produes
some value. Indeed, as long as ' equals Abs, the onstraint is vauously
satised, so is unonstrained and assumes its mostpreise value,namely
?. However, as soon as Pre ' holds, must be satised as well.
Then,Pre 1
's type beomesequivalentto (!) ![Pre℄!, whih
isits usualML type.
4.2 TheGeneral Case
We now move to a languagewitha denumerablesetof dataonstrutors.
e::=:::jKjK 1
jlose
(We let K, L;::: stand for data onstrutors.) An expression may be
tagged, as before, by applying a data onstrutor to it. Aessing tagged
values beomes slightly more omplex, beause multiple tags exist. The
semantisoftheelementarydatadestrutor, K 1
,isgiven bythefollowing
redutionrules:
K 1
v
1 v
2 (Kv
3
) reduesto (v
1 v
3 )
K 1
v
1 v
2 (L v
3
) reduesto (v
2 (Lv
3
)) when K6=L
Aording to these rules,ifthe value v
3
arriestheexpeted tag,then it is
passedto the funtionv
1
. Otherwise,the value { stillarrying its tag{ is
passed to the funtion v
2
. Lastly, a speial value, lose, is added to the
language,butno additionalredutionruleisdened forit.
How do we modify our type algebra to aommodate multiple data on-
strutors? In Setion 4.1, we used eld types to enode informationabout
ataggedvalue'spresene orabsene. Here,weneedexatlythesame infor-
mation,butthistimeaboutevery tag. So,weneedtomanipulateafamilyof
eldtypes,indexedbytags. Todoso, weaddone layertothetypealgebra:
rows of eld types.
::=;;;:::j?j>j ! j[℄
::='; ;:::jK :; j
::='; ;:::jAbsjPre jAny
We an nowextendtheprevioussetion's proposal, asfollows:
K : ![K :Pre; Abs℄
K 1
: (!)!([K :Abs; ℄!)![K :'; ℄!
where 'Pre
Pre'?
lose : [Abs℄!?
K 1
'stypeshemeinvolvesthesame onstraintsasinthebasiase. Using
asingle row variable,namely ,intwo distintpositionsallows expressing
thefatthatvaluesarryinganytagotherthanK willbepassedunmodied
to K 1
'sseond argument.
lose's argument type is [Abs℄, whih prevents it from ever being in-
voked. This aords with the fat that lose does nothave an assoiated
redutionrule. Itplaystherole of a funtiondenedbyzeroases.
Thissystemoersextensible patternmathings: anyk-aryase onstrut
lose; it will reeive the desired, aurate type. (This fat is illustrated
by Example3 in Setion 7.) Thus, no spei language onstrut ortype
inferene ruleis neededto deal withthem.
5. Reord Conatenation
Statitypingforreordoperationsisawidelystudiedproblem[4 ,15℄. Com-
monoperationsinludeseletion,extension, restrition,andonatenation.
The latter omes in two avors: symmetri and asymmetri. The former
requiresitsargumentstohavedisjointsetsofelds,whereasthelattergives
preedene to theseond one whena onitours.
Of these operations, onatenation is probablythe most diÆult to deal
with, beause its behavior varies aording to the presene or absene of
eah eld in its two arguments. This has led many authors to restrit
their attention to type heking, and to not address the issue of type in-
ferene[8 ℄. An inferene algorithm forasymmetri onatenation was sug-
gested by Wand [30 ℄. He uses disjuntions of onstraints, however, whih
gives his system exponential omplexity. Remy [22 ℄ suggests an enoding
of onatenation into -abstration and reord extension, whene an infer-
ene algorithm may be derived. Unfortunately, its power is somewhat de-
reased bysubtle interations with ML'srestrited polymorphism; further-
more,theenodingisexposedtotheuser. Inlaterwork[23 ℄,Remysuggests
a diret, onstraint-based algorithm, whih involves a speial form of on-
straints. Sulzmann[27℄followsasimilarrouteandreatesaustominstane
of HM (X), again involving speial-purposeonatenation onstraints. Our
approahisinspiredfromRemy'slaterpaper,butre-formulatedintermsof
onditional onstraints, thus showing that no ad ho onstraint forms are
neessary.
Again,ourpresentationisintwosteps. Thebasiase, wherereordsonly
haveoneeld,istakledusingsubtypingandonditionalonstraints. Then,
rows allowusto easilytransfer ourresultsto thease ofmultipleelds.
5.1 TheBasi Case
We assume a language equipped withone-eld reords, whose unique eld
maybeeither \absent" or\present". Morepreisely,weassume aonstant
data onstrutor Abs, and a unary data onstrutor Pre; a \reord" is a
valuebuiltwithone oftheseonstrutors. Adatadestrutor,Pre 1
,allows
aessing the ontents of a non-empty reord. Lastly, the language oers
asymmetri and symmetri onatenation primitives, written and ,
respetively.
e::=:::jAbsjPrejPre 1
jj
The relationshipbetweenreord reation and reordaessis expressed by
asimple redutionrule:
1
Thesemantisof asymmetri reordonatenation isgiven asfollows:
v
1
Abs reduesto v
1
v
1
(Prev
2
) reduesto Prev
2
(In eah of these rules, the value v
1
is required to be a reord.) Lastly,
symmetrionatenation isdened by
Absv
2
redues to v
2
v
1
Abs redues to v
1
(Inthese two rules,v
1 and v
2
arerequiredto be reords.)
The onstrution of our type algebra is similar to the one performed in
Setion 4.1. We introdue a (unary) reord type onstrutor, as well as a
distintionbetweennormaltypes andeld types:
::=;;;:::j?j>j ! jfg
::='; ;:::jBotjAbsjPre jEither jAny
Letusexplain,step bystep,ourdenitionofeld types. Ourrst, natural
step is to introdue type onstrutors Abs and Pre, whih allow desrib-
ing values built with the data onstrutors Abs and Pre. The former is a
onstant typeonstrutor,whilethelatter isunaryand ovariant.
Many type systems for reord languages dene Pre to be a subtype of
Abs. Thisallows areordwhoseeldispresent topretenditisnot, leading
toalassitheoryofreordswhoseeldsmaybe\forgotten"viasubtyping.
However, when the language oers reord onatenation, suh a denition
isn't appropriate. Why? Conatenation { asymmetri or symmetri { in-
volvesahoiebetweentworedutionrules,whihisperformedbymathing
one, orboth, of the arguments against thedata onstrutors Abs and Pre.
If, at the level of types, we allowa non-empty reordto masquerade asan
empty one, then it beomes impossible, based on the arguments' types, to
nd out whih rule applies, and to determine the type of the operation's
result. In summary, in the presene of reord onatenation, no subtyp-
ing relationship must exist between Pre and Abs. (This problemis well
desribed{ althoughnotsolved { in[4 ℄.)
This leadsusto making Abs and Preinomparable. One thishoie has
beenmade,ompletingthedenitionofeldtypesisratherstraightforward.
Beause our system requires type onstrutors to form a lattie, we dene
a least element Bot, and a greatest element Any. Lastly, we introdue a
unary, ovariant type onstrutor, Either, whih we dene as the least
upperboundof Absand Pre, so thatAbst(Pre) equals Either. This
optional renement allows us to keep trak of a eld's type, even when its
presene is not asertained. (These ideas are due to Remy, who arries
themfurtherinthe aseof objets [24 ℄.) The lattie ofeld typesisshown
Any
Either
OO
Abs
77 o
o o o o
Pre
hh PPP
PP
Bot
ggOO OO O
66 n
n n n n
Figure 4: Thelattieofreordeldtypes
Letusnowassigntypestotheprimitiveoperationsoeredbythelanguage.
Reord reationand aess reeivetheirusualtypes:
Abs : fAbsg
Pre : !fPreg
Pre 1
: fPreg!
There remains to ome up with orret, preise types for both avors of
reordonatenation. The key idea is simple. Asshown byits operational
semantis, (either avor of) reord onatenation is really a funtion de-
nedbyases over thedata onstrutors Abs and Pre { and Setion 4 has
shownhowto aurately desribesuh afuntion. Letusbegin,then, with
asymmetri onatenation:
: f'
1
g!f'
2
g!f'
3 g
where '
2
Either
2
Abs'
2
?'
1 '
3
Pre'
2
?Pre
2 '
3
Clearly, eah onditional onstraint mirrors one of the redution rules. In
the seond onditional onstraint, we assume
2
is the type of the seond
reord's eld { ifit has one. The rst subtypingonstraint represents this
assumption. Notie that we use Pre
2
, rather than '
2
, as the seond
branh'sresulttype;thisisstritlymore preise,beause '
2
maybe ofthe
form Either
2 .
Lastly,weturn to symmetri onatenation:
: f'
1
g!f'
2
g!f'
3 g
where Abs'
1
?'
2 '
3
Abs'
2
?'
1 '
3
Pre'
1
?'
2 Abs
Pre'
2
?'
1 Abs
Again,eahofthersttwoonstraintsmirrorsaredutionrule. Thelasttwo
(Thearefulreaderwill notiethatany one of these two onstraintswould
infatsuÆe; bothare keptforsymmetry.)
In both ases, the operation's desription in terms of onstraints losely
resemblesitsoperationaldenition. Automatiallyderivingtheformerfrom
thelatterseems possible;thisisan area forfutureresearh.
5.2 TheGeneral Case
Wenowmovetoalanguagewithadenumerablesetofreordlabels,written
l,m,et. Thelanguageallowsreatingtheemptyreord,aswellasanyone-
eldreord;italsooersseletionandonatenationoperations. Extension
andrestrition an beeasilyadded, ifdesired.
e::=?jfl=egje:ljj
We do not give the semantis of the language, whih should hopefully be
learenough.
Attheleveloftypes,weagainintroduerowsofeldtypes,denoted by.
Furthermore, we introduerows of normaltypes, denoted by %. Lastly, we
lifttheve eldtype onstrutors to thelevelof rows.
::=;;;:::j?j>j ! jfg
::='; ;:::jBotjAbsjPre jEither jAny
%::=;;;:::jl:; %j
::='; ;:::jl:; jjBotjAbsjPre%jEither%jAny
Thisallows writing omplexonstraints between rows, suh as 'Pre ,
where'andarerowvariables. Aonstraintbetweenrowsisinterpretedas
an innite family of onstraints between types, obtained omponent-wise.
That is, (l : ' 0
; ' 0 0
) Pre (l : 0
; 0 0
) has the same logial meaning as
(' 0
Pre 0
)^(' 0 0
Pre 00
). (SeeSetion2 fordetails.)
We maynowgive typesto theprimitivereordoperations. Creationand
seletionareeasilydealt with:
? : fAbsg
fl=g : !fl:Pre; Absg
:l : fl:Pre; Anyg!
Interestingly,thetypesofbothonatenationoperationsareunhanged from
theprevioussetion{atleast, syntatially. (We donotrepeatthem here.)
A subtle dierene lies in the fat that all variables involved must now be
readas rowvariables, rather thanas type variables. Inshort, the previous
setion exhibited onstraints whih desribe onatenation, at the level of
a single reord eld; here, the row mahinery allows us to repliate these
onstraintsoveraninnitesetoflabels. Thisinreaseinpoweromesalmost
6. First-Class Messages
Inmanyurrentobjet-orientedlanguages,messagesdo nothave rst-lass
status. Thatis,wheneveramessageissenttoanobjet,itsnameisxedand
mustbe expliitlymentioned;onlythemessage parametersandthereeiver
objet are allowed to vary dynamially. Some languages, however, allow
rst-lass(alsoknownas\dynami")messages. Thatis,theyallowmessages
to exist as autonomous entities,whih maybe omputedin arbitrary ways
beforebeingsentto an objet.
Wewillviewobjetsasreordsoffuntions,andmessagesastaggedvalues,
made up of a label and a parameter. Indeed, this simple view suÆes to
exhibitthe type inferene problemwe areinterested in. Thus, we onsider
alanguage withreordsand dataonstrutors, asdesribedinSetions 4.2
and 5.2. Furthermore, we let reord labels and data onstrutors range
overa singlename spae,thatof message labels. A primitivemessage-send
operation,written#, isdenedas follows:
#fm=v
1
;:::g(mv
2
) reduesto (v
1 v
2 )
Inplainwords,#examines its seondargument,whih mustbesome mes-
sage m with parameter v
2
. It then looks up the method named m in the
reeiver objet, and applies the method's ode, v
1
, to the message param-
eter. Put another way, if r is a reord of funtions, then (#r) ats as a
funtion dened by ases. Thus, # is nothing but a witness of the well-
known isomorphismwhihexistsbetween theserepresentations.
6.1 TheProblem
In a language without rst-lass messages, every message-send operation
must involve a xed message label. So, instead of a single, generi opera-
tion suh as #, the language provides a family of primitive message-send
operations,indexed bymessage labels.
In ourview of objets asreords of funtions,these operationsare den-
able within the language. Indeed, the operation #m, whih allows send-
ing the message m to the objet o with parameter p, may be dened as
o:p:(o:m p). Then,type infereneyields
#m : fm:Pre(!); Anyg!!
Beause the message labelm is statially known, it an be expliitly men-
tionedin thetype sheme, making it easy to requirethe reeiverobjetto
arryan appropriatemethod.
In alanguagewithrst-lassmessages, ontheother hand,m isnolonger
known. AsaresultofthisdiÆulty,muhoftheinitialworkontypedobjet-
oriented languages has ignored the issue of rst-lass messages. Gaster [7,
hapter 7℄studiesa statitype system where#(underthe nameof sumE-
onlyifallfuntionsstoredinr havethesamereturntype. Thisonditionis
learly too restritive for our purposes: an objet must be allowed to on-
tain methods with dierent return types. Nishimura [13 ℄ suggests a type
inferene system for an objet-oriented language with rst-lass messages,
in the style of Ohori's seond-order typed reord alulus [15 ℄. It is later
re-formulatedbyMullerandNishimura[12 ℄. The newpresentationisbased
on feature onstraints, inluding a new form of onstraints, speially in-
tendedto modelthebehaviorofa generimessage-sendoperation. Bugliesi
andCrafa[3℄alsoattempttopresentasimpliedviewofNishimura'soriginal
work. However, they hoose a higher-order type system, thus abandoning
typeinferene.
6.2 A Solution
We said above that, given a reordr,the partialappliation(#r) yieldsa
funtiondenedbyases. Indeed,given atagged value (mv), itwillinvoke
anappropriatepieeofode,seleted aordingtothelabelm. Goodpoint
{ this paper is preisely onerned with ways of giving aurate types to
funtionsdenedbyases. Wehaveshownhowonditionalonstraintsallow
ignoring (the return type of) a branh, unless it is liable to be taken. In
objet-orientedterms,they allowignoring(the returntypeof) anymethod
whih is provably unrelated with the message at hand. This solves the
ruialproblemwithrst-lass messages.
Here,wehooseto deal diretly withthease ofmultiplemessage labels,
even though the two-step presentation adopted in Setions 4 and 5 would
stillmake sensehere. Therefore, we propose:
# : f'g![ ℄!
where Pre
Pre ?'Pre(!)
(Here,all variablesexept arerowvariables.) Theoperation's rst (resp.
seond) argument is required to be an objet (resp. a message), whose
ontents (resp. possible values) are desribed bythe row variable ' (resp.
). The rst onstraint merely lets stand for the type of the message
parameter. The onditional onstraint, whih involves three rows, should
again be understoodas a family, indexed by message labels,of onditional
onstraintsbetween eldtypes. The onditional onstraint assoiated with
somelabelmwillbetriggeredonlyif 's element at indexm isoftheform
Pre , i.e. only if the message's label may be m. When it is triggered, its
right-hand sidebeomes ative, with a three-fold eet. First, ''s element
at index m must be of the form Pre( ! ), i.e. the reeiver objet must
arry a method labeled m. Seond, the method's argument type must be
(a supertype of) 's element at label m, i.e. the method must be able to
aeptthemessage'sparameter. Third,themethod'sresulttypemustbe(a
subtypeof) , i.e. the result type of thewhole operation willbe (atleast)
This proposal shows that type inferene for rst-lass messages an be
performed usingexisting tools, with no need for dediated theoretial ma-
hinery. It also shows that rst-lass messages are naturally ompatible
with all operations on reords, inluding onatenation { a question left
unanswered byNishimura[13 ℄.
7. Examples
This setion illustrates the proposals made in the previous setions with
shortexamples.
Example 3. Weonsiderlistsbuiltoutoftwodataonstrutors,N andC.
The funtion ar,whih returns the rst element of a list,if it exists, and
E otherwise (where E is another data onstrutor, standing for error), is
denedasfollows:
ar = (N 1
E
(C 1
((x;r):x)
lose))
Then,ar'sinferred type sheme is
ar : [N :'; C : ; Abs℄!
where 'Pre
Pre'?[E:Pre; Abs℄
Pre( >)
Pre ?
(Beause thelanguageonlyoers unaryonstrutors,N and E mustarry
someargument,whihremainsunspeiedhere; standsforits type. Usu-
ally,one identieswithsome unittype.) The rstonditionalonstraint
above tells that the rst branh of ar's denition { namely E { will not
betakenunless 'is\present", i.e. unlessar'sargument istagged N. The
nextone tellsthattheseondbranh{namely (x;r):x{ willnotbetaken
unless it is tagged C. No other tags are allowed, beause ar's argument
type involves the row (N : '; C : ; Abs), whose projetion on any tag
other thanN and C is Abs.
What happens when applying ar? The type inferred for the expression
ar(C(1;C(true;N()))),whereitispassedaheterogeneous,2-elementlist,
isint. Inotherwords,thisexpressionisstatiallyfoundnottoprodueE,
beausethe rstonditional onstraint isnot triggered.
Example 4. We dene a funtionwhih reads theeld l outof a reordr
andreturnsadefaultvalued ifr hasnosuheld. Itisgivenbyextrat=
d:r:(fl=dgr):l. Inoursystem, extrat's inferredtype is
extrat : !fl:'; g!
where 'Either Either
Abs'? Abs ?AbsAny
Pre'? Pre ?PreAny
The rst onstraint retrieves r:l's type and names it , regardless of the
eld'spresene. (Iftheeldturnsouttobeabsent,willbeunonstrained.)
The left-hand onditional onstraints learly speify the dependeny be-
tweentheeld's preseneand thefuntion'sresult.
The right-hand onditional onstraints have tautologous onlusions {
therefore,theyare superuous. Theyremain onlybeause oururrent on-
straint simpliationalgorithmsare\lazy"and ignoreanyonditional on-
straints whose ondition hasnot yet beenfullled. This problem ould be
xed by making the simpliation algorithm slightly more aggressive, i.e.
byallowingittohekwhether theonlusionofa onditionalonstraint is
redundant,regardless ofits ondition.
The type inferred forextrat 0 fl= 1g and extrat 0 fm=1g is int.
Thus, in many ases, one need not be aware of the omplexity hidden in
extrat'stype.
Example 5. We assumegiven an objeto, of thefollowingtype:
o: f getText:Pre(unit!string);
setText:Pre(string!unit);
selet:Pre(intint!unit);
Abs g
omayrepresent,forinstane,aneditabletexteldinagraphiuserinterfae
system. Itsmethodsallowprogrammatiallygettingandsettingitsontents,
aswellasseletinga portionoftext.
Next, we assumea listdata struture,equippedwith asimpleiterator:
iter :(!unit)! list!unit
The following expression reates a list of messages, and uses iter to send
eahof them inturnto o:
iter(#o)[setText\Hello!" ; selet(0;5)℄
This expression is well-typed, beause o ontains appropriate methods to
deal witheah of these messages, and beause these methodsreturn unit,
asexpeted by iter. The expression's type is of ourse unit, iter's return
type.
Here isa similarexpression,whih involvesa getTextmessage:
Thistime, itis ill-typed. Indeed,sending asetText messageto o produes
aresultof typeunit,whilesendingitagetTextmessage produesa result
oftypestring. Thus,(#o)'sresulttypemustbe>,thejoinofthesetypes.
Thismakes(#o)anunaeptableargumentforiter,sinethelatterexpets
afuntion whosereturntype isunit.
8. Conlusion
Inthispaper,wehaveadvoatedtheuseofaonstraint-basedtypeinferene
system equipped with subtyping, rows and onditional onstraints. This
provides a ommon solution to several diÆult type inferene problems,
whih, sofar, had beenaddressed using speialforms of onstraints. From
a pratial point of view, it allows them to benet from known onstraint
simpliationtehniques(see AppendixA),leadingto aneÆient inferene
algorithm[18 ℄.
Our system subsumes Remy's proposal for reord onatenation [23 ℄, as
well as Muller and Nishimura's view of rst-lass messages [12 ℄. Aiken,
WimmersandLakshman's\soft"typesystem[2 ℄ismore preisethanours,
beause it interprets onstraints in a riher logial model, but otherwise
oerssimilarfeatures.
Thedesignofatypeinferenesysteminvolvestwoorthogonalomponents:
a set of typing rules and a onstraint language (together with its logial
interpretation). Asto the former, we have suggested usingHM(X) [14 , 28 ,
27 ℄, whose formulation appears most elegant, but other hoies would be
possible (see e.g. [10 , 17 ℄). The fous of the paper is really on the latter:
our aim wasto nda onstraint language expressive enough to aurately
desribe the features of the programming language at hand. One should
emphasize the fat that we do not, a priori, view the onstraint system
SRC as better (simpler, more elementary, more anonial, et.) than its
ompetitors. We merely take its wide appliability as evidene of the fat
thatitisomparativelymoregeneral-purpose(lessadho)thansomeofits
predeessors.
To onlude, we hope this paper illustrateshow a small numberof well-
understoodlogimehanismsallowbuildinganadvanedtypeinferenesys-
tem.
Aknowledgements
Thanksto JaquesGarrigue,MartinMuller,DidierRemyandMartinSulz-
mannfor stimulating disussions. DidierRemyalso proof-reada version of
the manusript. Lastly, I would like to thank the anonymous referees for
theirsuggestions.
Appendix A. Algorithms and Proofs
This appendix ontains a formal desription of onstraint resolution and
simpliationalgorithms,inthepreseneofatomiand onditionalsubtyp-
ingonstraints. Resolutionisrequiredindeterminingwhethera programis
type-orret; simpliationiskey to ahievingreasonable eÆieny.
Thesystemdesribedinthisappendixdoesnothaverows,oraseparation
of typesinto distint kinds, butotherwise hasall features presentedin the
body of this paper. Adding rows to this formal desription would require
work,butshouldnotposeanyforeseeablediÆulty,sinetheoneptofrow
isessentiallyorthogonaltothenotionofsubtyping. Addingkindsshouldbe
routine. A referene implementationof thefullsystem, inludingrows and
kinds,isavailable[18 ℄.
Thisappendixdesribesan extensionof[17℄withonditionalonstraints.
Thus, most proofs presented here are partial, and desribe only the mod-
iations required to aommodate onditional onstraints. However, all
denitionsand statementsareomplete.
This appendix is laidout as follows. First, we review all neessary on-
epts,inludingground types, types, onstraints, and type shemes. Then,
wegiveaonstraintresolutionalgorithm,andthreeonstraintsimpliation
algorithms.
Throughoutthis appendix, we use a ouple of notationalshortuts. If P
isa logiprediate,then
8`C P() standsfor 8 (`C))P()
9`C P() standsfor 9 (`C)^P()
AppendixA.1 GroundTypes
AsinSetion2,ourformaldevelopment isparameterizedwithanarbitrary
ground signature (see Denition 1). We assume that it denes only one
kind,sowe writeS and Tinsteadof S
and T
. We write?
S ,>
S ,
S ,t
S
andu
S
instead of?
,>
,
,t
and u
. We also assumeL
row
=?. The
model(T;), isdened asinDenitions2 and 3.
In thisappendix, we use the letter to denote either a ground type, or
a type, and sometimes both at the same time (see e.g. Denition 17 and
Theorem2). We willtry to preserve alear distintionwhenever possible.
Theorem1. T,equippedwith,isalattie. Itslattieoperations, denoted
by t andu, are haraterized by the following identities:
(
1 2
2
)()=
1 ()2
S
2 ()
8l2dom(
1 2
2 ) (
1 2
2 ):l=
1 :l2
l
2 :l
where 2 may standfor tor u. (We let 2 l
stand for 2 when l2L +
; when
l l
seondequation,
1
:l(resp.
2
:l)maybeundened; in suha ase,it should
bereadas the neutral elementof 2 l
.
Note that, beause of the last requirement of Denition 1, at least one of
1
:l and
2
:l must be denedinthe seondequation above.
We let ? (resp. >) stand forthe ground type suh that dom() =fg
and ()=?
S
(resp. >
S ).
Appendix A.2 Types
Types aredenedas inSetion2.2, exeptrow termsare disallowed.
Definition 7. LetV bea denumerablesetof typevariables,denotedby ,
, et. The set of types, denoted by T, is the term algebra T(;V). In
other words, atype iseithera typevariable, ora onstrutedterm, of the
form s(
l )
l 2a(s)
, wheres2S is 's head onstrutor,also written hd().
Definition 8. A ground substitution is a totalmapping from type vari-
ables to ground types. Ground substitutions are straightforwardly extended
to types.
Appendix A.3 Constraintsand Type Shemes
We now give syntax and semantis for three kinds of onstraints: atomi
onstraints, onditions and onditional onstraints. In eah ase, the nota-
tion `
k
meansthat the ground substitution k-satises theonstraint
. The notation ` means that satises , and holds, by denition, if
and onlyif`
k
holdsforall k2N +
.
Definition 9. An atomi onstraint is a pair of types, written
1
2 . A
ground substitution k-satises it i (
1 )
k (
2 ).
Definition 10. A ondition is a pair of a symbol s 2 S and of a type ,
written s,wheresmustbea prime elementofS. Agroundsubstitution
satises s i s
S
hd(()).
Definition 11. A onditional onstraint is a pair of a ondition and of
an atomi onstraint, written s ?
1
2
. A ground substitution
k-satises it i `s implies `
k
1
2 .
Having dened onstraints, we may dene notions of satisfation and en-
tailment on onstraint sets. They are dened in the usual way. We also
introduea non-standardnotion of pre-satisfation (resp. pre-entailment),
whihislogiallyweaker(resp. stronger)thanitsstandardounterpart,be-
auseit ignoresonditionalonstraints. Thesenotionsarepurelytehnial;
they areusedonlywithinourproofs.
Definition 12. Let C bea setof onstraints, bothatomi andonditional.
A ground substitution is a pre-solution of C i ` holdsfor all atomi
2C. isa solutionof C i ` holdsfor all 2C. We write ` pre
C
in the former ase, and `C in the latter.
Let bea onstraint. C pre-entails, whih wewrite C pre
, i8` pre
C `. C entails , whih we write C, i 8`C `.
We now denetype shemes. They are onstrainedpolymorphitypes, i.e.
typesontaining variables whosepossibleinstantiationsare restritedbya
onstraintset. Forsimpliity,weonlyonsiderlosed typeshemes,i.e. type
shemeswhihhave nofreetypevariables. Althoughsomewhatunommon,
typesystemsexistwhihrespetthisrestrition(see[29 ,17 ℄). Itshouldalso
be possibletoextend ourresultsto thease ofarbitrary type shemes.
Definition 13. A typesheme isa pair of a type andof a onstraint set
C,written 8C :.
Atype sheme representsasetof groundtypes, whihweallits denota-
tion. Eahof these ground typesrepresent one possibleorretbehaviorof
theprogramdesribedby. Atypeshemewhosedenotationisempty(i.e.
whoseonstraint sethas nosolution) thus represents anill-typedprogram.
Definition 14. The denotation JKof a typesheme isthe unionof the
upper ones generated by its ground instanes withrespet to . That is,
J8C :K=f 0
;9`C () 0
g
Atypesheme whosedenotationisbiggerrepresentsalargersetofpossible
behaviors; thus, it is more general. This notion allows omparing type
shemes, while aounting for polymorphism and subtyping at the same
time. It was introduedin[29℄, whereitwaswritten 8
;we denote it4.
Definition 15. Given two type shemes
1 and
2
, the former is said to
be moregeneral than the latter iJ
1 KJ
2
K;we shall then write
1 4
2 .
Inother words,
1
ismoregeneral than
2
iforany groundinstaneof
2 ,
thereexists a smallergroundinstaneof
1
. Formally,
(8C
1 :
1
)4(8C
2 :
2 )
isthus equivalentto
8
2
`C
2 9
1
`C
1
1 (
1 )
2 (
2 )
Wewrite
1
2
when
1 4
2 and
2 4
1 .
The relation oers a speiationof onstraint simpliation. Indeed, a
type sheme an be simplied into a type sheme 0
onlyif 0
. One
0
that is not a requirement; it is rather to be viewed as an implementation
detail.
We onlude this setion with a denition of what it means for a type
sheme to be made up of small terms. All of the algorithms dened here
will expet this property to hold, and will preserve it, making it a global
invariant. Thishoie simpliesdenitions and proofs. Furthermore, from
a pratial point of view, it allows enforing maximum sharing, sine it
requires every sub-term to be \named" by a type variable, allowing our
minimizationalgorithm to identifysub-terms.
Definition 16. A smallterm is a onstruted type term whose strit sub-
termsaretype variables. A type sheme8C : is madeup ofsmallterms i
it satises the following onditions:
Æ is a typevariable;
Æ for all (
1
2
) 2C, either
1 and
2
are type variables, or one is a
variable andthe other is a small term.
Æ for all (s?
1
2
)2C, ,
1 and
2
are typevariables.
Every type sheme an be turnedinto an equivalent type sheme whih is
made up of smallterms. (In pratie, thiswouldbe done when onverting
type shemesinputby theuserinto some internalrepresentation.)
Appendix A.4 Solving Constraints
Webeginwithafundamentaltehnialresult,whihdesribesaweak,suÆ-
ientonditionforaonstraintsettohaveasolution. Itshallformthebasis
for the proof of the losure algorithm. We prove a fairly powerful version
of thisresult,allowing groundonstants to appear inonstraints. (If these
onstraintswereto bewritten,someniterepresentationofthese onstants
wouldbe required;however,suhisnotthease here.) Thanksto thisgen-
eralization, thisresult willalso form thebasis forthe proofof thegarbage
olletion algorithm.
Definition 17. A onstraint set withgroundonstants is a onstraint set
C, where atomi onstraints may involve either two variables, one variable
anda small term, or onevariable and a groundtype,and where onditional
onstraints have their usual form. Dene the assertion C +1
1
2 to
mean
8k0 8` pre
k
C `
k+1
1
2
Dene C
#
()=f; 62V^ 2Cg andC
"
()=f; 62V^ 2
Cg. C is said to be weakly losed i the following onditions aremet:
(1) 2C and 2C imply 2C;
(2) 2C and 2C
#
() imply 9 0
2C
#
() C +1
0
;
(3) 2C and 0
2C
"
() imply 9 2C
"
() C
+1
0
;
(4) 2C
#
() and 0
2C
"
() imply C +1
0
;
(5) 2C and s?2C implys?2C;
(6) s?2C, 2C
#
() and s
S
hd() imply C pre
.
C
#
() ontains all lower bounds of whih are not type variables; thus,
every 2 C
#
() must be either a small type term, or a ground type. A
similarremarkholdsonerningC
"
().
Conditions1and5abovearepurelysyntatitransitivityonditions. Con-
ditions2 to 4also involve transitivity,buttheuse of +1
allows expressing
these onditions in a logial, rather than syntati, way, making them less
restritive.
Theorem2. LetC beaonstraintsetwithgroundonstants. IfC isweakly
losed, then C has a solution.
Proof. Note that this proof only uses Conditions 2, 4 and 6 of Deni-
tion 17. The other onditions shall be required by further theorems, suh
astheorretness proof ofgarbage olletion.
The rst step of the proof onsists inexhibiting a ground substitution
suh that, forall 2fv(C), ()equals tf(); 2 C
#
()g, and proving
that is a pre-solution of C. In fat, this step oinides with the lassi
proofperformed intheabsene ofonditional onstraints [17 ℄;we shallnot
repeatithere.
The seond step onsists inprovingthat is a fullsolutionof C. Pika
onditionalonstraint s?2C. Assume `s. By denitionof ,
thisstatement an be written
s
S
hd(())=hd(tf(); 2C
#
()g)
=t
S
fhd(()); 2C
#
()g
=t
S
fhd(); 2C
#
()g
(Theidentityhd(())=hd() stemsfrom thefat that iseither asmall
term,oragroundonstant,withaxedheadonstrutor.) Consideringthat
sis prime (see Denition10), thisentails s
S
hd() forsome 2 C
#
().
We an then apply Condition 6 of Denition 17, whih yields C pre
.
Sine is a pre-solution of C, this implies ` . We have thus veried
`s?,provingthat isa solutionof C.
2
Equippedwiththistehnialresult,wearenowreadytodeneaonstraint
resolutionalgorithm. Itisbasedonasimplelosureomputation. Webegin
by deningan auxiliaryonstraint deompositionfuntion, whih breaksa
onstraintdown into aset of equivalent onstraints.
Definition 18. Given types
1 and
2
, sub(
1
2
) is dened as
Æ f
1
2 g, if
1 or
2
isa variable;
Æ f
1 :l
l
2
:l;l 2 dom(
1
)\dom(
2
)g, if
1 and
2
are onstruted
terms suh that hd(
1 )
S hd(
2 ).
Note that sub(
1
2
) isundened when hd(
1 ) 6
S hd(
2
); indeed, suh
aonstraintis learlyunsatisable.
Usingthisauxiliaryfuntion,wean nowdesribethelosure onditions:
Definition 19. Let C be a onstraint set, made up of small terms. C is
said to be losed i
(1) 2C and 0
2C imply sub(
0
)C;
(2) 2C and s?2C imply s?2C;
(3) s?2C, 2C
#
() and s
S
hd() imply2C.
Condition 1 is the lassi losure ondition, found e.g. in [17 ℄; it involves
transitivityandstruturaldeomposition. Condition2isatransitivityon-
dition onerning onditional onstraints. Condition 3 requires that the
onlusionof a onditional onstraint whoseondition mustbe satisedbe
dishargedinto theonstraint set.
It is easy to hek that losure impliesweak losure [17 ℄. This yields an
algorithm to deidewhether a onstraint set C has a solution: attempt to
omputethesmallestlosedsetC
ontainingit,byrepeatedappliationof
theabovethreerules. Eahrulepreservestheset'ssolutionspae. So,ifthe
omputation sueeds,then C has a solution;if, on the other hand, itfails
(beause sub isappliedoutside ofits domain),then C hasno solution.
Consideraonditionalonstraints?. Aordingto thelosurerules
above, the atomi onstraint willhave no eet on the onstraint resolu-
tion proess untilit isdisharged byrule3. That is,willbeignored until
thealgorithmdisoverssomeevidenethattheonditionsmust besat-
ised. This explains why onditional onstraints delay type omputations,
as mentioned in the body of this paper. The algorithm will not speulate
abouttheonsequeneoftheonditionalonstraint,shoulditsonditionbe
satised;rather, itwaitsuntilit hasno hoie butsatisfy.
Appendix A.5 Polarity
We now denehowto assoiate apolarity witheah typevariable inatype
sheme whose onstraint setif(weakly)losed. Thisnotionwillbe usedin
thedenitionofall three onstraint simpliationalgorithms.
Definition 20. Consideratypesheme =8C :Æ,madeupofsmallterms,
whereC is weakly losed. Theset of positive variables of , and the setof
negative variables of , respetively denoted by fv +
() and fv (), are the
smallest subsets P and N of fv() suh that
Æ Æ 2P;
Æ 82P 8 2C
#
() split()(N;P);
Æ 82N 8 2C
"
() split()(P;N);
Æ 82N s? 2C ) 2P ^ 2N.
where the auxiliary funtion split maps a small term to an element of
2 V
2 V
, as follows:
split()=(f:l;l2L g;f:l; l2L +
g)
Atypevariableissaidtobebipolar ifitispositiveandnegative,andneutral
ifitisneither. fv +
()and fv ()anbeomputedintimelinearinthesize
of , usinga simple x-point alulation. Every type sheme is equivalent
toatypeshemewithnobipolarvariables;wedonotprovethisresulthere.
AppendixA.6 Garbage Colletion
Knowingthepolarityof eahvariableallowsustothrowawaymanyredun-
dantonstraints, asshownbythefollowingdenitionand theorem.
Definition 21. Consider as in Denition 20. The image of through
garbageolletion, denoted by GC(), isthe typesheme8D:Æ,whereD is
a subsetof C dened asfollows:
Æ 2D i 2C, 2fv () and 2fv +
();
Æ D
#
() equals C
#
() if 2fv +
(), and? otherwise;
Æ D
"
() equals C
"
() if 2fv (), and? otherwise;
Æ s? 2Di s? 2C and 2fv ().
This denitionis mostly idential to the one in[17 ℄; onlythe fourth point
is new, and speies that a onditional onstraint is redundant unless it
bears on anegative variable. In operationalterms,a onditionalonstraint
s ? an be triggered only if reeives a lower boundwhih exeeds
s. Consideringthatonlynegative variablesanreeivenew lowerboundsin
thefuture,thisonstraint hasno eet unless is negative.
Theorem3. Consider as in Denition 21. Then GC().
Proof. Write 0
=GC(). Sine 0
hasfeweronstraints,it islearthat
0
4 . So, we need to prove 4 0
. Aording to Denition 15, this is
equivalentto
8 0
`D 9`C (Æ) 0
(Æ)
Piksome 0
`D. WenowwishtoprovethatC[fÆ 0
(Æ)gadmitsasolu-
tion. Thisisaonstraintsetwithgroundonstants,asperDenition17. We
shallmeet ourgoalbyprovingthatthefollowingonstraintset|a superset
oftheprevious one|isweakly losed:
C[f 0
();2fv ()^ 2C r
g
[f
0
();2fv +
()^ 2C r
g