
A Versatile Constraint-Based Type Inference System

François Pottier

Francois.Pottier@inria.fr

Abstract. The combination of subtyping, conditional constraints and rows yields a powerful constraint-based type inference system. We illustrate this claim by proposing solutions to three delicate type inference problems: "accurate" pattern matchings, record concatenation, and first-class messages. Previously known solutions involved a different technique in each case; our theoretical contribution is in using only a single set of tools. On the practical side, this allows all three problems to benefit from a common set of constraint simplification techniques, a formal description of which is given in an appendix.

CR Classification: F.3.3 [Logics and Meanings of Programs]: Studies of Program Constructs -- Type Structure.

Key words: Constraint-based type inference. Subtyping. Rows. Conditional constraints.

1. Introduction

Type inference is the task of examining a program which lacks some (or even all) type annotations, and recovering enough type information to make it acceptable to a type checker. Its original, and most obvious, application is to free the programmer from the burden of manually providing these annotations, thus making static typing a less dreary discipline. However, type inference has also seen heavy use as a simple, modular way of formulating program analyses.

The design of a type inference system can be influenced by its purpose. When used as a user-visible way of enforcing a coding discipline, it might be desirable to make it simple and somewhat rigid. When used invisibly as part of a compiler's optimization process, on the other hand, maximum precision may be desired. Regardless of this distinction, however, powerful type inference techniques are often made a necessity by the advanced features found in many recent programming languages.

This paper presents a common solution to several seemingly unrelated type inference problems, using an existing framework for subtyping-constraint-based type inference [14], equipped with conditional constraints inspired by Aiken, Wimmers and Lakshman [2] and with rows à la Rémy [19, 21].

INRIA Rocquencourt, BP 105, 78153 Le Chesnay Cedex, France.

Constraint-Based Type Inference

Subtyping is a partial order on types, defined so that an object of a subtype may safely be supplied wherever an object of a supertype is expected. Type inference in the presence of subtyping reflects this basic principle. Every time a piece of data is passed from a producer to a consumer, the former's output type is required to be a subtype of the latter's input type. This requirement is explicitly recorded by creating a symbolic subtyping constraint between these types. Thus, each potential data flow discovered in the program yields one constraint. This fact allows viewing a constraint set as a directed approximation of the program's data flow graph, regardless of our particular definition of subtyping.

Various type inference systems based on subtyping constraints exist. One may cite works by Aiken et al. [1, 2, 5], the present author [16, 17], Trifonov and Smith [29], as well as Odersky et al.'s abstract framework HM(X) [14, 28, 27]. Related systems include set-based analysis [9, 6] and type inference systems based on feature constraints [11, 12] or predicate constraints [10].

Conditional Constraints

In many constraint-based systems, the expression $\mathsf{if}\ e_0\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2$ is, at best, described by

\[ \alpha_1 \le \alpha \;\wedge\; \alpha_2 \le \alpha \]

where $\alpha_i$ stands for $e_i$'s type, and $\alpha$ stands for the whole expression's type. This amounts to stating that "the value of $e_1$ (resp. $e_2$) may become the value of the whole expression", regardless of the test's outcome. A more precise description, "if $e_0$ may evaluate to true (resp. false), then the value of $e_1$ (resp. $e_2$) may become the value of the whole expression", can be given using conditional constraints:

\[ \mathsf{true} \le \alpha_0 \;?\; \alpha_1 \le \alpha \;\wedge\; \mathsf{false} \le \alpha_0 \;?\; \alpha_2 \le \alpha \]

Introducing tests into constraints allows keeping track of some of the program's control flow, that is, mirroring, at the level of types, the way evaluation is affected by the outcome of a test.
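The difference between the two descriptions can be made concrete with a small sketch. It is only an illustration under simplifying assumptions that are not part of the paper: a "type" is modelled as a set of value tags, subtyping as set inclusion, and the function name `type_of_if` is hypothetical.

```python
# Toy model (an assumption for illustration): a type is a frozenset of tags,
# and subtyping is set inclusion. This is not the paper's type algebra.

def type_of_if(cond, then_t, else_t, conditional=True):
    """Least type of `if e0 then e1 else e2` in the toy model.

    cond   -- type of the test e0, a subset of {"true", "false"}
    then_t -- type of e1
    else_t -- type of e2
    """
    result = frozenset()
    # Plain subtyping constraints always let both branches contribute.
    # Conditional constraints let a branch contribute only when its guard
    # (true <= cond, resp. false <= cond) is satisfied.
    if not conditional or "true" in cond:
        result |= then_t
    if not conditional or "false" in cond:
        result |= else_t
    return result


INT = frozenset({"int"})
STRING = frozenset({"string"})
ALWAYS_TRUE = frozenset({"true"})

coarse = type_of_if(ALWAYS_TRUE, INT, STRING, conditional=False)
precise = type_of_if(ALWAYS_TRUE, INT, STRING)
```

With a test that can only yield true, the unconditional description lets both branches contribute, whereas the conditional one keeps only the first branch.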

Conditional set expressions were introduced by Reynolds [25] as a means of solving set constraints involving strict type constructors and destructors. Heintze [9] uses them to formulate an analysis which ignores "dead code". He also introduces case constraints, which allow ignoring the effect of a branch, in a case construct, unless it is actually liable to be taken. Aiken, Wimmers and Lakshman [2] use conditional types, together with intersection types, for this purpose.

In the present paper, we suggest a single notion of conditional constraint, which is comparable in expressive power to the above constructs, and lends itself to a simple and efficient implementation. (A similar choice was made independently by Fähndrich [5].) We emphasize its use as a way not only of introducing some control into types, but also of delaying type computations, thus introducing some "laziness" into type inference.

Rows

Designing a type system for a programming language with records, or objects, requires some way of expressing labelled products of types, where labels are field or method names. Dually, if the language allows manipulating structured data, then its type system is likely to require labelled sums, where labels are names of data constructors.

Wand [30] and Rémy [19, 21] elegantly deal with both problems at once by introducing notation to express denumerable, indexed families of types, called rows:

\[ \rho ::= \alpha, \beta, \ldots, \varphi, \psi, \ldots \mid a : \tau ; \rho \mid \partial\tau \]

(Here, $\tau$ ranges over types, and $a, b, \ldots$ range over indices.) An unknown row may be represented by a row variable, exactly as in the case of types. (For lack of symbols, we will not syntactically distinguish plain type variables and row variables.) The term $a : \tau ; \rho$ represents a row whose element at index $a$ is $\tau$, and whose other elements are given by $\rho$. The term $\partial\tau$ stands for a row whose element at any index is $\tau$. These statements are given formal meaning by interpreting rows in a logical model where the following equations hold:

\[ a : \tau_a ; (b : \tau_b ; \rho) = b : \tau_b ; (a : \tau_a ; \rho) \]
\[ \partial\tau = a : \tau ; \partial\tau \]

If desired, some type constructors may be lifted to the level of rows, i.e. viewed as row constructors as well. For instance, to lift the type constructor $\to$, we extend the syntax of rows:

\[ \rho ::= \ldots \mid \rho \to \rho \]

The term $\rho \to \rho'$ is logically interpreted as the row obtained by applying the type constructor $\to$, point-wise, to the rows $\rho$ and $\rho'$. As a result, the logical model satisfies the following equations:

\[ (a : \tau ; \rho) \to (a : \tau' ; \rho') = a : (\tau \to \tau') ; (\rho \to \rho') \]
\[ \partial\tau \to \partial\tau' = \partial(\tau \to \tau') \]

More details are given in Section 2.
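The row equations can be checked on a small executable model. The sketch below is an assumption made for illustration, not the paper's logical model: a row is represented by a default element together with finitely many exceptions, which is enough to represent the rows written with the syntax above when no row variable occurs. The names `Row`, `uniform`, `extend` and `arrow` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """A row without variables: a default element plus finitely many exceptions."""
    default: object          # element found at every label not listed in entries
    entries: tuple = ()      # sorted (label, element) pairs differing from default

    def get(self, label):
        return dict(self.entries).get(label, self.default)


def uniform(t):
    """The row written `∂t`: every label maps to t."""
    return Row(t)


def extend(label, t, row):
    """The row written `label : t ; row`."""
    elements = dict(row.entries)
    elements[label] = t
    # Normalise so that two rows denoting the same family compare equal.
    kept = tuple(sorted((l, e) for l, e in elements.items() if e != row.default))
    return Row(row.default, kept)


def arrow(r1, r2):
    """Point-wise lifting of the arrow constructor to rows."""
    labels = {l for l, _ in r1.entries} | {l for l, _ in r2.entries}
    result = uniform(("->", r1.default, r2.default))
    for l in labels:
        result = extend(l, ("->", r1.get(l), r2.get(l)), result)
    return result


rho = uniform("unit")
commuted_1 = extend("a", "int", extend("b", "bool", rho))
commuted_2 = extend("b", "bool", extend("a", "int", rho))
```

In this model, the order in which labels are listed is irrelevant, a uniform row absorbs an entry equal to its default, and the lifted arrow distributes over row extension, which mirrors the four equations stated above.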

Rows offer a particularly straightforward way of describing operations which treat all labels (except possibly a finite number thereof) uniformly. Because every facility available at the level of types (e.g. constructors, constraints) can also be made available at the level of rows, a description of the operation's effect on a single label, written using types, can also be read as a description of the entire operation, written using rows. This interesting [...]

Putting It All Together

Our point is to show that the combination of the three concepts discussed above yields a very expressive system, which allows type inference for a number of advanced language features. Among these, "accurate" pattern matching constructs, record concatenation, and first-class messages will be discussed in this paper. Our system allows performing type inference for all of these features at once. Furthermore, efficiency issues concerning constraint-based type inference systems have already been studied [5, 17]. This existing knowledge benefits our system, which may thus be used to efficiently perform type inference for all of the above features.

In this paper, we focus on applications of our type system, i.e. we show how it allows solving each of the problems mentioned above. Formal definitions of our constraint resolution and simplification algorithms appear in Appendix A. Furthermore, a robust prototype implementation is publicly available [18]. We do not prove that the types given to the three problematic operations discussed in this paper are sound, but we believe this is a straightforward task.

The paper is organized as follows. Section 2 gives a detailed technical presentation of the type system. Section 3 gives an informal explanation of the potential costs and benefits of using conditional constraints. Sections 4, 5, and 6 discuss type inference for "accurate" pattern matchings, record concatenation, and first-class messages, respectively, within our system. Section 7 gives several examples, which show what inferred types look like in practice. Section 8 sums up our contribution. Lastly, Appendix A gives definitions and proofs for several constraint manipulation algorithms.

2. Formal Presentation of the System

This section gives an in-depth formal presentation of our type system, in its most general form. Much of it may be skipped on a first reading, since the following sections describe the system in a more gentle fashion. The reader may wish to come back to this section at a later stage.

We define our type system as an instance of the parametric framework HM(X) [14, 28, 27]. To do so, we simply define a constraint system, called SRC (for subtyping-rows-conditionals), giving rise to HM(SRC). By re-using existing material, we save definitions and proofs, and emphasize the fact that our approach is standard.

In order to retain a measure of generality, SRC is itself parameterized by a ground signature, which is a succinct description of a type algebra and of its intended subtype ordering. Ground signatures are defined in Section 2.1. Given such a ground signature, we explicitly define the syntax of types and constraints (Section 2.2), a logical model within which they may be interpreted (Section 2.3), and the interpretation itself (Section 2.4).

2.1 Assumptions

A ground signature consists of three components: a series of symbol lattices, indexed by kinds, a set of parameter labels (each of which is either co- or contra-variant, describes either a row or a plain type parameter, and has a fixed kind), and a description of each symbol's arity as a finite set of parameter labels.

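Definition 1. Let $\mathcal{K}$ be a finite set of kinds. For every kind $\kappa \in \mathcal{K}$, let $\mathcal{S}_\kappa$ be a lattice of symbols, with operations $\bot_\kappa$, $\top_\kappa$, $\le_\kappa$, $\sqcup_\kappa$ and $\sqcap_\kappa$. Define $\mathcal{S} = \biguplus_{\kappa \in \mathcal{K}} \mathcal{S}_\kappa$.

Let $\mathcal{L}^+$ and $\mathcal{L}^-$ be denumerable sets of parameter labels. Define $\mathcal{L} = \mathcal{L}^+ \uplus \mathcal{L}^-$. Let $\mathcal{L}_{row} \subseteq \mathcal{L}$ be a distinguished subset of row parameter labels. Let $kind$ be a total mapping of $\mathcal{L}$ into $\mathcal{K}$.

Let $a$ be a total mapping from $\mathcal{S}$ to finite subsets of $\mathcal{L}$, such that:

• for all $s_0, s_1, s_2 \in \mathcal{S}_\kappa$, $s_0 \le_\kappa s_1 \le_\kappa s_2$ implies $a(s_0) \cap a(s_2) \subseteq a(s_1)$;

• for any finite subset $S$ of $\mathcal{S}_\kappa$, $a(\sqcup_\kappa S)$ and $a(\sqcap_\kappa S)$ are subsets of $\bigcup a(S)$.

Note that this implies $a(\bot_\kappa) = a(\top_\kappa) = \emptyset$.

The information described above forms a ground signature.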

The first condition bearing on $a$ is necessary to guarantee that the orderings $\le_\kappa$ do give rise to an ordering on ground types (defined in Section 2.3). The second one makes the definition of some constraint manipulation algorithms more convenient (see Definition 23 in Appendix A).

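Example 1. Assume there is only one kind $\star$. Define $\mathcal{S}_\star = \{\bot, \to, \top\}$, where $\bot \le_\star {\to} \le_\star \top$. Let $\mathcal{L}^- = \{dom\}$, $\mathcal{L}^+ = \{rng\}$ and $\mathcal{L}_{row} = \emptyset$. Define $a(\bot) = a(\top) = \emptyset$ and $a(\to) = \{dom, rng\}$. This defines a ground signature, which allows typing the pure $\lambda$-calculus.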

Example 2. Define three kinds N, R and V, corresponding to normal, record field and variant field types, respectively. Let $\mathcal{S}_N$ be the flat lattice whose elements other than $\bot$ and $\top$ are $\to$, $\{\}$ and $[\,]$. Let $\mathcal{S}_R$ be the lattice with least element Bot, greatest element Any, and whose other elements are Abs, Pre and Either, ordered by $\mathsf{Abs} \le_R \mathsf{Either}$ and $\mathsf{Pre} \le_R \mathsf{Either}$. Let $\mathcal{S}_V$ be the lattice with least element Abs, greatest element Any, and whose only other element is Pre. (By abuse of language, we are giving identical names to symbols in $\mathcal{S}_R$ and in $\mathcal{S}_V$. This remains non-ambiguous as long as all terms considered have known kinds.) Let $\mathcal{L}^- = \{dom\}$, $\mathcal{L}^+ = \{content, contents, rng\}$ and $\mathcal{L}_{row} = \{contents\}$. Define $a(\to) = \{dom, rng\}$, $a(\{\}) = a([\,]) = \{contents\}$, $a(\mathsf{Pre}) = a(\mathsf{Either}) = \{content\}$, and $a(\bot) = a(\top) = a(\mathsf{Bot}) = a(\mathsf{Abs}) = a(\mathsf{Any}) = \emptyset$. This defines a ground signature, which is expressive enough to describe all programming language features considered in this paper. In particular, all of its expressive power will be exploited to describe first-class messages in Section 6.
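As a sanity check, the two conditions that Definition 1 imposes on the arity function can be verified mechanically for the record field lattice of Example 2. The following sketch is an illustration only; the encoding of the lattice as a table of upper covers and the names `ORDER`, `ARITY`, `convex` and `bounded` are assumptions, not part of the paper.

```python
from itertools import product

# Record field symbols (kind R) of Example 2, each with its immediate successors.
ORDER = {"Bot": {"Abs", "Pre"}, "Abs": {"Either"}, "Pre": {"Either"},
         "Either": {"Any"}, "Any": set()}
# Arities of these symbols, as given in Example 2.
ARITY = {"Bot": set(), "Abs": set(), "Any": set(),
         "Pre": {"content"}, "Either": {"content"}}


def leq(s, t):
    """Lattice ordering: reflexive-transitive closure of the cover relation."""
    return s == t or any(leq(u, t) for u in ORDER[s])


def join(s, t):
    """Least upper bound of two symbols."""
    uppers = [u for u in ORDER if leq(s, u) and leq(t, u)]
    return next(u for u in uppers if all(leq(u, v) for v in uppers))


def meet(s, t):
    """Greatest lower bound of two symbols."""
    lowers = [u for u in ORDER if leq(u, s) and leq(u, t)]
    return next(u for u in lowers if all(leq(v, u) for v in lowers))


def convex(symbols):
    """First condition: s0 <= s1 <= s2 implies a(s0) & a(s2) <= a(s1)."""
    return all(ARITY[s0] & ARITY[s2] <= ARITY[s1]
               for s0, s1, s2 in product(symbols, repeat=3)
               if leq(s0, s1) and leq(s1, s2))


def bounded(symbols):
    """Second condition, checked on pairs: the arities of a join and of a
    meet are contained in the union of the arities of the two operands."""
    return all(ARITY[join(s, t)] | ARITY[meet(s, t)] <= ARITY[s] | ARITY[t]
               for s, t in product(symbols, repeat=2))


signature_ok = convex(ORDER) and bounded(ORDER)
```

Checking pairs covers every non-empty finite subset, since joins and meets of larger subsets are obtained by iteration; the empty subset corresponds to the requirement that Bot and Any have empty arity.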

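\[ \tau ::= \alpha, \beta, \varphi, \psi, \ldots \mid s(\tau_l)_{l \in a(s)} \mid r : \tau ; \tau \mid \partial\tau \]
\[ C ::= \mathsf{true} \mid C \wedge C \mid \exists\bar\alpha.C \mid \tau \le \tau \mid s \le \tau \;?\; \tau \le \tau \qquad (s \text{ prime in } \mathcal{S}_\kappa) \]

Figure 1: Syntax of types and constraints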

In the rest of this formal presentation, we assume given a fixed, arbitrary ground signature. In Sections 4-7, we will use the ground signature described in Example 2 above, but we will re-introduce it step by step.

2.2 Syntax of Types and Constraints

The (raw) syntax of types and constraints is given in Fig. 1. $\alpha, \beta, \varphi, \psi, \ldots$ denote type variables. A type term $s(\bar\tau)$ can be formed by picking a symbol $s \in \mathcal{S}$ and a family of type parameters, indexed according to the arity of $s$, i.e. $\bar\tau$ must be of the form $(\tau_l)_{l \in a(s)}$. Lastly, types may also be rows, which denote families of types indexed by a denumerable set of row labels $\mathcal{R}$. The term $r : \tau ; \tau'$ (where $r \in \mathcal{R}$) represents a row whose element at index $r$ is $\tau$, and whose other elements are given by the row $\tau'$. The term $\partial\tau$ stands for a row whose element at any index is $\tau$.

The constraint language offers standard constructs (truth, conjunction, projection [14]), subtyping constraints, and conditional constraints. The latter are of the form $s \le \tau \;?\; \tau \le \tau$, where $s$ must satisfy the following condition: for any finite subset $S$ of $\mathcal{S}_\kappa$, $s \le_\kappa (\sqcup_\kappa S)$ implies $\exists s' \in S \;\; s \le_\kappa s'$. In other words, $s$ must be a prime element of its symbol lattice $\mathcal{S}_\kappa$. This ensures that a conditional constraint bearing on the least upper bound of a set of variables, e.g. $(s \le \alpha_1 \sqcup \ldots \sqcup \alpha_n) \;?\; \tau \le \tau'$, is equivalent to a conjunction of conditional constraints bearing on its members: $\bigwedge_{i=1}^{n} (s \le \alpha_i \;?\; \tau \le \tau')$. It is a necessary condition for the correctness of the garbage collection algorithm (see Theorem 3 in Appendix A).
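The primality condition can be tested by brute force on a finite lattice. The sketch below enumerates every subset of the record field lattice of Example 2 and reports which symbols are prime in the sense just defined. The encoding and the names `UP`, `lub` and `is_prime` are assumptions made for this illustration.

```python
from itertools import combinations

# Record field symbols of Example 2, each with its immediate successors.
UP = {"Bot": {"Abs", "Pre"}, "Abs": {"Either"}, "Pre": {"Either"},
      "Either": {"Any"}, "Any": set()}


def leq(s, t):
    """Lattice ordering: reflexive-transitive closure of the cover relation."""
    return s == t or any(leq(u, t) for u in UP[s])


def lub(subset):
    """Least upper bound of a finite subset (Bot for the empty subset)."""
    uppers = [u for u in UP if all(leq(s, u) for s in subset)]
    return next(u for u in uppers if all(leq(u, v) for v in uppers))


def is_prime(s):
    """s is prime iff, whenever s lies below the join of a finite subset,
    it already lies below one member of that subset."""
    symbols = list(UP)
    subsets = (c for n in range(len(symbols) + 1)
               for c in combinations(symbols, n))
    return all(any(leq(s, t) for t in sub)
               for sub in subsets if leq(s, lub(sub)))


primes = [s for s in UP if is_prime(s)]
```

Under this encoding, Either fails the test because it lies below the join of Abs and Pre without lying below either of them, and Bot fails it because it lies below the join of the empty subset.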

Our definition of conditional constraints is dissymmetric. Indeed, conditions must be of the form $s \le \tau$; conditions of the form $\tau \le s$ are disallowed. The motivation for this decision is to allow the constraint solving algorithm to ignore conditional constraints unless their condition must be satisfied (see Definition 19 in Appendix A). If both forms of conditions were allowed to co-exist, the language would become expressive enough to encode disjunctions of constraints, making constraint solving more costly.

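To ensure that only meaningful types and constraints can be built, we equip them with kinding and sorting rules. The grammar of sorts is defined by $\varsigma ::= \mathsf{Type} \mid \mathsf{Row}(R)$, where $R$ ranges over finite subsets of $\mathcal{R}$. For every kind $\kappa$ and every sort $\varsigma$, we assume given a distinct, denumerable set $\mathcal{V}^\kappa_\varsigma$ of type variables. We define judgements of the form $\vdash \tau : \kappa$ (resp. $\vdash \tau : \varsigma$), meaning that the type $\tau$ has kind $\kappa$ (resp. sort $\varsigma$), and judgements of the form $\vdash C$, meaning that the constraint $C$ is well-kinded (resp. well-sorted). The kinding rules, given in Fig. 2, simply enforce the kind discipline required by the ground signature. The sorting rules, displayed in Fig. 3, ensure that only meaningful row terms are built. Intuitively, the sort $\mathsf{Type}$ describes plain types, while the sort $\mathsf{Row}(R)$ describes families of types indexed by $\mathcal{R} \setminus R$. In other words, a row of sort $\mathsf{Row}(R)$ gives information about all row labels except those in $R$. For more details, we refer the reader to [21] or to [20, section 5].

\[ \frac{\alpha \in \mathcal{V}^\kappa}{\vdash \alpha : \kappa} \qquad \frac{s \in \mathcal{S}_\kappa \quad \forall l \in a(s) \;\; \vdash \tau_l : kind(l)}{\vdash s(\tau_l)_{l \in a(s)} : \kappa} \qquad \frac{\vdash \tau_1 : \kappa \quad \vdash \tau_2 : \kappa}{\vdash (r : \tau_1 ; \tau_2) : \kappa} \qquad \frac{\vdash \tau : \kappa}{\vdash \partial\tau : \kappa} \]
\[ \frac{}{\vdash \mathsf{true}} \qquad \frac{\vdash C_1 \quad \vdash C_2}{\vdash C_1 \wedge C_2} \qquad \frac{\vdash C}{\vdash \exists\bar\alpha.C} \qquad \frac{\vdash \tau_1 : \kappa \quad \vdash \tau_2 : \kappa}{\vdash \tau_1 \le \tau_2} \qquad \frac{s \in \mathcal{S}_\kappa \quad \vdash \tau_0 : \kappa \quad \vdash \tau_1 \le \tau_2}{\vdash s \le \tau_0 \;?\; \tau_1 \le \tau_2} \]

Figure 2: Kinding rules

\[ \frac{\alpha \in \mathcal{V}_\varsigma}{\vdash \alpha : \varsigma} \qquad \frac{a(s) \cap \mathcal{L}_{row} = \emptyset \quad \forall l \in a(s) \;\; \vdash \tau_l : \varsigma}{\vdash s(\tau_l)_{l \in a(s)} : \varsigma} \]
\[ \frac{a(s) \cap \mathcal{L}_{row} \ne \emptyset \quad \forall l \in a(s) \setminus \mathcal{L}_{row} \;\; \vdash \tau_l : \mathsf{Type} \quad \forall l \in a(s) \cap \mathcal{L}_{row} \;\; \vdash \tau_l : \mathsf{Row}(\emptyset)}{\vdash s(\tau_l)_{l \in a(s)} : \mathsf{Type}} \]
\[ \frac{\vdash \tau_1 : \mathsf{Type} \quad \vdash \tau_2 : \mathsf{Row}(R \uplus \{r\})}{\vdash (r : \tau_1 ; \tau_2) : \mathsf{Row}(R)} \qquad \frac{\vdash \tau : \mathsf{Type}}{\vdash \partial\tau : \mathsf{Row}(R)} \]
\[ \frac{}{\vdash \mathsf{true}} \qquad \frac{\vdash C_1 \quad \vdash C_2}{\vdash C_1 \wedge C_2} \qquad \frac{\vdash C}{\vdash \exists\bar\alpha.C} \qquad \frac{\vdash \tau_1 : \varsigma \quad \vdash \tau_2 : \varsigma}{\vdash \tau_1 \le \tau_2} \qquad \frac{\vdash \tau_0 : \varsigma \quad \vdash \tau_1 : \varsigma \quad \vdash \tau_2 : \varsigma}{\vdash s \le \tau_0 \;?\; \tau_1 \le \tau_2} \]

Figure 3: Sorting rules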

Before moving on, let us point out that a term may have several sorts, for two distinct reasons. First, a uniform row $\partial\tau$ may be viewed as describing any (co-finite) number of entries, i.e. it may have any sort $\mathsf{Row}(R)$. As a result, the row term $r_1 : \tau_1 ; \ldots ; r_n : \tau_n ; \partial\tau$ may have any sort $\mathsf{Row}(R)$, provided $\{r_1, \ldots, r_n\} \cap R = \emptyset$. Such a term will be required to have sort $\mathsf{Row}(\emptyset)$ only when used as the $l$-parameter of a type constructor $s$ expecting a full row in $l$-position (i.e. $l \in a(s) \cap \mathcal{L}_{row}$). Second, a type constructor $s$ with non-row parameters (i.e. $a(s) \cap \mathcal{L}_{row} = \emptyset$) can be used at any sort $\varsigma$. For instance, if $r : \tau_0 ; \tau'_0$ and $r : \tau_1 ; \tau'_1$ have sort $\mathsf{Row}(R)$, then $(r : \tau_0 ; \tau'_0) \to (r : \tau_1 ; \tau'_1)$ has sort $\mathsf{Row}(R)$ as well. Its logical interpretation will be the same as that of $r : \tau_0 \to \tau_1 ; \tau'_0 \to \tau'_1$.

[...] slightly more subtle: the meaning of a term depends on the sort at which it is viewed. Fortunately, the meaning of a constraint will remain independent of the sort of its components.

2.3 Logical Model

We now define the logical model within which our constraints are interpreted. Informally speaking, it is the term algebra generated by the ground signature at hand. However, things are made more complex by our desire to have recursive types¹ and by the presence of rows.

Definition 2. Let $\mathcal{A}$ be the alphabet formed of all letters $l \in \mathcal{L} \setminus \mathcal{L}_{row}$ and all composite letters $lr$, where $l \in \mathcal{L}_{row}$ and $r \in \mathcal{R}$. To every $l \in \mathcal{L}$, we associate a subset $\mathcal{A}_l$ of the alphabet, defined by $\mathcal{A}_l = \{l\}$ if $l \in \mathcal{L} \setminus \mathcal{L}_{row}$, and $\mathcal{A}_l = \{lr \,;\, r \in \mathcal{R}\}$ otherwise.

A path $p$ is a finite string over the alphabet $\mathcal{A}$, i.e. an element of $\mathcal{A}^*$. The letter $\epsilon$ denotes the empty path. A ground tree $t$ is a partial function from $\mathcal{A}^*$ into $\mathcal{S}$, whose domain is non-empty and prefix-closed, such that, for all paths $p \in dom(t)$ and for all labels $l \in \mathcal{L}$,

• if $l \in a(t(p))$, then $p.\mathcal{A}_l$ is a subset of $dom(t)$, whose image through $t$ is a subset of $\mathcal{S}_\kappa$, where $\kappa = kind(l)$;

• otherwise, $p.\mathcal{A}_l$ lies outside of $dom(t)$.

The head constructor of a ground tree $t$, written $hd(t)$, is $t(\epsilon)$. Given $p \in dom(t)$, the subtree of $t$ rooted at $p$, written $t.p$, is the tree $q \mapsto t(p.q)$. Given $p$, $l$ such that $l \in a(t(p)) \cap \mathcal{L}_{row}$, the subrow of $t$ rooted at $(p, l)$ is the function $r \in \mathcal{R} \mapsto t(p.(lr))$. A function is said to be quasi-constant iff its co-restriction to some finite set is a constant function. A ground tree is regular iff it has a finite number of subtrees. A ground tree $t$ is a ground type iff it is regular and all of its subrows are quasi-constant. We denote the set of ground types by $\mathcal{T}$. A ground type $t$ has kind $\kappa$ if and only if $t(\epsilon) \in \mathcal{S}_\kappa$. We denote the set of ground types of kind $\kappa$ by $\mathcal{T}_\kappa$.

Then, we equip every $\mathcal{T}_\kappa$ with an ordering $\le_\kappa$. Because ground types are infinite trees, $\le_\kappa$ cannot be defined easily by structural induction; instead, it is defined as the limit of a decreasing sequence of pre-orders.

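Definition 3. A family of pre-orders over every $\mathcal{T}_\kappa$ is defined as follows. Let $\le_0$ be uniformly true over every $\mathcal{T}_\kappa$. Then, for any $k \in \mathbb{N}$ and $t, t' \in \mathcal{T}_\kappa$, define $t \le_{k+1} t'$ as the conjunction of the following conditions:

• $t(\epsilon) \le_\kappa t'(\epsilon)$;

• $\forall l \in a(t(\epsilon)) \cap a(t'(\epsilon)) \setminus \mathcal{L}_{row} \quad t.l \le^l_k t'.l$;

• $\forall l \in a(t(\epsilon)) \cap a(t'(\epsilon)) \cap \mathcal{L}_{row} \quad \forall r \in \mathcal{R} \quad t.(lr) \le^l_k t'.(lr)$.

(We let $t \le^l_k t'$ stand for $t \le_k t'$ when $l \in \mathcal{L}^+$ and $t' \le_k t$ when $l \in \mathcal{L}^-$.) Subtyping, denoted by $\le$, is the intersection of these pre-orders; it is a lattice on every $\mathcal{T}_\kappa$.

¹ The presence of recursive types removes the need to check whether all solutions of a constraint are cyclic, which, in the presence of subtyping relationships between type [...]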

The subtyping relationship is structural: $t$ and $t'$ are related if and only if their head constructors $t(\epsilon)$ and $t'(\epsilon)$ are related in the lattice of symbols and, for every label $l$ defined by both $t$ and $t'$, their $l$-sub-terms are related (either co- or contra-variantly, depending on the variance of $l$). It is, in general, non-atomic: type constructors of different arities may be related.
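For finite ground types, the structural reading of subtyping can be written as a direct recursion. The sketch below is a simplification under explicit assumptions: it uses the signature of Example 1 (bottom, arrow, top), it handles finite trees only, and it therefore sidesteps the limit construction that the definition needs for recursive types. The tuple encoding and the names `RANK`, `arrow` and `subtype` are hypothetical.

```python
# Symbols of Example 1, ordered as a chain: bot <= arrow <= top.
RANK = {"bot": 0, "arrow": 1, "top": 2}

BOT = ("bot",)
TOP = ("top",)


def arrow(dom, rng):
    """Finite ground type with head symbol arrow and parameters dom, rng."""
    return ("arrow", dom, rng)


def subtype(t, u):
    """Structural subtyping on finite ground types (no recursive types here)."""
    # The head symbols must be related in the symbol lattice.
    if RANK[t[0]] > RANK[u[0]]:
        return False
    # Only labels present in both arities are compared; bot and top have none.
    if t[0] == u[0] == "arrow":
        # dom is contravariant, rng is covariant.
        return subtype(u[1], t[1]) and subtype(t[2], u[2])
    return True


narrow = arrow(TOP, BOT)   # accepts anything, returns nothing
wide = arrow(BOT, TOP)     # accepts nothing, may return anything
```

The comparison of `bot` with an arrow type succeeds even though the two symbols have different arities, which is the non-atomic behaviour mentioned above.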

2.4 Logical Interpretation

It remains to give an interpretation of types and constraints within the model. It is parameterized by a ground substitution, which gives meaning to any free type variables. It maps types to ground types, or to families thereof (according to their sort), and constraints to Boolean values.

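Definition 4. A ground substitution $\rho$ is a function of domain $\mathcal{V}$, which maps $\mathcal{V}^\kappa_{\mathsf{Type}}$ into $\mathcal{T}_\kappa$, and which maps $\mathcal{V}^\kappa_{\mathsf{Row}(R)}$ into the set of quasi-constant functions of $\mathcal{R} \setminus R$ into $\mathcal{T}_\kappa$.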

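Definition 5. The interpretation of a type $\tau$ of sort $\varsigma$, under a ground substitution $\rho$, written $\rho(\tau^\varsigma)$, or simply $\rho(\tau)$ when $\varsigma$ can be determined from the context, is defined as follows.

• If $\tau$ is a type variable $\alpha$, then $\rho(\tau^\varsigma)$ is the image of $\alpha$ through $\rho$.

• If $\tau$ is of the form $s(\tau_l)_{l \in a(s)}$ and $\varsigma = \mathsf{Type}$, then $\rho(\tau^\varsigma)$ is the ground type $t$ such that $t(\epsilon) = s$, $t.l = \rho(\tau_l)$ whenever $l \in a(s) \setminus \mathcal{L}_{row}$ and $t.(lr) = \rho(\tau_l)(r)$ whenever $l \in a(s) \cap \mathcal{L}_{row}$ and $r \in \mathcal{R}$.

• If $\tau$ is of the form $s(\tau_l)_{l \in a(s)}$ and $\varsigma = \mathsf{Row}(R)$, then, for every $r \in \mathcal{R} \setminus R$, $\rho(\tau^\varsigma)(r)$ is the ground type $t$ such that $t(\epsilon) = s$ and $t.l = \rho(\tau_l)(r)$ whenever $l \in a(s)$.

• If $\tau$ is of the form $r : \tau_1 ; \tau_2$ and $\varsigma = \mathsf{Row}(R)$, then $\rho(\tau^\varsigma)(r) = \rho(\tau_1)$ and, for every $r' \in \mathcal{R} \setminus (R \cup \{r\})$, $\rho(\tau^\varsigma)(r') = \rho(\tau_2)(r')$.

• If $\tau$ is of the form $\partial\tau_0$ and $\varsigma = \mathsf{Row}(R)$, then, for every $r \in \mathcal{R} \setminus R$, $\rho(\tau^\varsigma)(r) = \rho(\tau_0)$.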

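Definition 6. The constraint satisfaction predicate $\rho \vdash C$, whose arguments are a ground substitution $\rho$ and a well-sorted constraint $C$, is defined as follows.

• $\rho \vdash \mathsf{true}$ holds.

• $\rho \vdash C_1 \wedge C_2$ holds iff $\rho \vdash C_1$ and $\rho \vdash C_2$ hold.

• $\rho \vdash \exists\bar\alpha.C$ holds iff there exists a ground substitution $\rho'$, which coincides with $\rho$ outside of $\bar\alpha$, such that $\rho' \vdash C$ holds.

• If $\vdash \tau, \tau' : \mathsf{Type}$, then $\rho \vdash \tau \le \tau'$ holds iff $\rho(\tau) \le \rho(\tau')$ holds.

• If $\vdash \tau_1, \tau_2 : \mathsf{Row}(R)$, then $\rho \vdash \tau_1 \le \tau_2$ holds iff, for every $r \in \mathcal{R} \setminus R$, $\rho(\tau_1)(r) \le \rho(\tau_2)(r)$ holds.

• If $\vdash \tau_0, \tau_1, \tau_2 : \mathsf{Type}$, then $\rho \vdash s \le \tau_0 \;?\; \tau_1 \le \tau_2$ holds iff $s \le_{\mathcal{S}} \rho(\tau_0)(\epsilon)$ implies $\rho(\tau_1) \le \rho(\tau_2)$.

• If $\vdash \tau_0, \tau_1, \tau_2 : \mathsf{Row}(R)$, then $\rho \vdash s \le \tau_0 \;?\; \tau_1 \le \tau_2$ holds iff, for every $r \in \mathcal{R} \setminus R$, $s \le_{\mathcal{S}} \rho(\tau_0)(r)(\epsilon)$ implies $\rho(\tau_1)(r) \le \rho(\tau_2)(r)$.

This definition is well-formed because, even though the types which appear in a constraint may have several admissible sorts, all of them give rise to the same interpretation.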

Lastly, constraint entailment is given its usual definition: $C \Vdash C'$ holds if and only if, for every ground substitution $\rho$, $\rho \vdash C$ implies $\rho \vdash C'$.

2.5 The Type System HM(SRC)

We refer to the constraint logic defined in Sections 2.1-2.4 as SRC. It is a sound constraint system in the sense of [14]; thus, it gives rise to a type system, namely HM(SRC), for the $\lambda$-calculus with let.

We do not repeat the typing rules of HM(X) in this paper. For our purposes, suffice it to recall that type schemes are of the form $\sigma ::= \forall\bar\alpha[C].\tau$. When all of a type scheme's variables are universally quantified, we usually write "$\tau$ where $C$".

The $\lambda$-calculus with let is a limited programming language. To extend it, we will define new primitive operations, equipped with operational semantics and appropriate type schemes. However, no extension to the type system itself will be necessary. This explains why we do not describe it further. Instead, we will focus our interest on writing expressive type schemes.

3. About Conditional Constraints

The content of this section is informal. It shows how conditional constraints can be used to gain extra typing flexibility, and why we might want to use them only sparingly.

In a call-by-value language, if an expression $e_2$ diverges, then so does any application $(e_1\ e_2)$. In particular, if $e_2$ has type $\bot$, then $(e_1\ e_2)$ may safely be given type $\bot$ as well. In other words, if it can be proven that $e_1$ will never be called, then its return type can be discarded.

It is possible to make a type system aware of this fact. To do so, one merely introduces a new typing rule:

\[ \frac{C, (\Gamma; x : \bot) \vdash e : \tau}{C, \Gamma \vdash \lambda x.e : \bot \to \bot} \]

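As a result, the type system is no longer syntax-directed: typing a $\lambda$-abstraction involves a choice between this rule and the usual $\lambda$-abstraction [...] cases separately, since that would have exponential cost. Instead, a natural solution is to use a single type inference rule, which emits a conditional constraint, along the lines of

\[ \frac{C, (\Gamma; x : \alpha) \vdash_I e : \tau \qquad \alpha, \beta \text{ fresh}}{(\bot < \alpha \;?\; \tau \le \beta) \wedge C, \; \Gamma \vdash_I \lambda x.e : \alpha \to \beta} \]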

As long as the function isn't invoked, $\bot$ remains an admissible solution for its argument type $\alpha$. So, the conditional constraint has no effect, and $\beta$ remains unconstrained, meaning that the function produces no result. However, if a call to this function is later discovered, then $\alpha$ will be constrained to some value greater than $\bot$. This will trigger the conditional constraint, and $\alpha \to \tau$ will become a lower bound for the function's type, meaning that the function produces a result of type $\tau$.

This technique allows designing a "lazy" type inference system, which ignores the type of an expression unless it appears liable to be evaluated. Heintze [9] uses conditional types for this very purpose. In fact, it is possible to carry this idea even further, and to ignore not only the expression's type, but also its effect on the typing environment. This would involve replacing $(\bot < \alpha \;?\; \tau \le \beta) \wedge C$ above with $\bot < \alpha \;?\; (\tau \le \beta \wedge C)$; thus, the constraint $C$, which describes the requirements of the function concerning its environment, would also be subject to the condition $\bot < \alpha$. This idea appears, under a different formulation, in e.g. [26].

Despite their theoretical appeal, though, these proposals seem a bit extreme. They produce a large number of conditional constraints, making type inference less efficient, because potential constraint simplifications are delayed. Thus, in a practical system, "laziness" should be used only sparingly. We propose to build it into the types of a few primitive operations, rather than to hard-wire it into the typing rules. We will illustrate this principle in the following sections.

4. Accurate Analysis of Pattern Matchings

When faced with a pattern matching construct, most existing type inference systems adopt a simple, conservative approach: assuming that each branch may be taken, they let it contribute to the whole expression's type. A more accurate system should use types to prove that certain branches cannot be taken, and prevent them from contributing.

In this section, we describe such a system. The essential idea, introducing a conditional construct at the level of types, is due to [9, 2]. Some novelty resides in our two-step presentation, which we believe helps isolate independent concepts. First, we consider the case where only one data constructor exists. Then, we easily move to the general case, by enriching the type algebra with rows.

4.1 The Basic Case

We assume the language allows building and accessing tagged values.

\[ e ::= \ldots \mid \mathsf{Pre} \mid \mathsf{Pre}^{-1} \]

A single data constructor, $\mathsf{Pre}$, allows building tagged values, while the destructor $\mathsf{Pre}^{-1}$ allows accessing their contents. This relationship is expressed by the following reduction rule:

\[ \mathsf{Pre}^{-1}\ v_1\ (\mathsf{Pre}\ v_2) \quad \text{reduces to} \quad (v_1\ v_2) \]

The rule states that $\mathsf{Pre}^{-1}$ first takes the tag off the value $v_2$, then passes it to the function $v_1$.

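At the level of types, we introduce a (unary) variant type constructor $[\cdot]$. Also, we establish a distinction between so-called "normal types," written $\tau$, and "field types," written $\phi$.

\[ \tau ::= \alpha, \beta, \gamma, \ldots \mid \bot \mid \top \mid \tau \to \tau \mid [\phi] \]
\[ \phi ::= \varphi, \psi, \ldots \mid \mathsf{Abs} \mid \mathsf{Pre}\ \tau \mid \mathsf{Any} \]

A subtype ordering over field types is defined straightforwardly: Abs is its least element, Any is its greatest, and Pre is a covariant type constructor.

The data constructor Pre is given the following type scheme:

\[ \mathsf{Pre} : \alpha \to [\mathsf{Pre}\ \alpha] \]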

Notice that there is no way of building a value of type $[\mathsf{Abs}]$. Thus, if an expression has this type, then it must diverge. This explains our choice of names. If an expression has type $[\mathsf{Abs}]$, then its value must be "absent"; if it has type $[\mathsf{Pre}\ \tau]$, then some value of type $\tau$ may be "present".

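The data destructor $\mathsf{Pre}^{-1}$ is described as follows:

\[ \mathsf{Pre}^{-1} : (\alpha \to \beta) \to [\varphi] \to \gamma \quad \text{where} \quad \varphi \le \mathsf{Pre}\ \alpha, \quad \mathsf{Pre} \le \varphi \;?\; \beta \le \gamma \]

The conditional constraint allows $(\mathsf{Pre}^{-1}\ e_1\ e_2)$ to receive type $\bot$ when $e_2$ has type $[\mathsf{Abs}]$, reflecting the fact that $\mathsf{Pre}^{-1}$ isn't invoked until $e_2$ produces some value. Indeed, as long as $\varphi$ equals Abs, the constraint is vacuously satisfied, so $\gamma$ is unconstrained and assumes its most precise value, namely $\bot$. However, as soon as $\mathsf{Pre} \le \varphi$ holds, $\beta \le \gamma$ must be satisfied as well. Then, $\mathsf{Pre}^{-1}$'s type becomes equivalent to $(\alpha \to \beta) \to [\mathsf{Pre}\ \alpha] \to \beta$, which is its usual ML type.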
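The way the conditional constraint selects the result type can be mimicked by a very small function. This is a sketch under loud assumptions: field types are encoded as the string "Abs" or a pair ("Pre", t), normal types as plain strings, and `result_type` (a hypothetical name) returns the least admissible result type rather than solving constraints in general.

```python
BOTTOM = "bot"   # stands for the least normal type


def guard_holds(phi):
    """The condition `Pre <= phi`: the head symbol of phi is at least Pre."""
    return isinstance(phi, tuple) and phi[0] == "Pre"


def result_type(beta, phi):
    """Least gamma satisfying the conditional constraint `Pre <= phi ? beta <= gamma`.

    beta -- result type of the function passed to the destructor
    phi  -- field type describing the tagged argument
    """
    # When the guard fails, the constraint is vacuous and gamma may be bottom.
    # When it holds, beta must flow into gamma, so the least choice is beta.
    return beta if guard_holds(phi) else BOTTOM


absent_case = result_type("int", "Abs")
present_case = result_type("int", ("Pre", "string"))
```

When the argument's field type is Abs, the application receives the bottom type; when it is of the form Pre t, the usual ML result type is recovered.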

4.2 The General Case

We now move to a language with a denumerable set of data constructors.

\[ e ::= \ldots \mid K \mid K^{-1} \mid \mathsf{close} \]

(We let $K$, $L$, ... stand for data constructors.) An expression may be tagged, as before, by applying a data constructor to it. Accessing tagged values becomes slightly more complex, because multiple tags exist. The semantics of the elementary data destructor, $K^{-1}$, is given by the following reduction rules:

\[ K^{-1}\ v_1\ v_2\ (K\ v_3) \quad \text{reduces to} \quad (v_1\ v_3) \]
\[ K^{-1}\ v_1\ v_2\ (L\ v_3) \quad \text{reduces to} \quad (v_2\ (L\ v_3)) \quad \text{when } K \ne L \]

According to these rules, if the value $v_3$ carries the expected tag, then it is passed to the function $v_1$. Otherwise, the value, still carrying its tag, is passed to the function $v_2$. Lastly, a special value, $\mathsf{close}$, is added to the language, but no additional reduction rule is defined for it.
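The two reduction rules translate directly into an interpreter-style sketch. The code below is illustrative only: tagged values are encoded as (tag, payload) pairs, the destructor takes its three arguments at once instead of being curried, and `close` raises an exception to model the absence of a reduction rule. All names are hypothetical.

```python
def tag(k):
    """Data constructor K: builds the tagged value (k, v) from v."""
    return lambda v: (k, v)


def untag(k):
    """Elementary destructor for the tag k."""
    def destructor(v1, v2, tagged):
        label, payload = tagged
        # Matching tag: strip it and pass the payload to v1.
        # Any other tag: pass the value, still tagged, to v2.
        return v1(payload) if label == k else v2(tagged)
    return destructor


def close(tagged):
    """No reduction rule exists for close: evaluation is stuck if it is reached."""
    raise ValueError(f"unmatched tag: {tagged[0]}")


def describe(value):
    """A two-branch case construct built from nested destructors and close."""
    return untag("Int")(
        lambda n: f"int {n}",
        lambda rest: untag("Str")(lambda s: f"string {s}", close, rest),
        value,
    )


first = describe(tag("Int")(3))
second = describe(tag("Str")("x"))
```

A value carrying a tag that none of the destructors expects eventually reaches `close`, which is the dynamic counterpart of the situation the type system is meant to rule out.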

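How do we modify our type algebra to accommodate multiple data constructors? In Section 4.1, we used field types to encode information about a tagged value's presence or absence. Here, we need exactly the same information, but this time about every tag. So, we need to manipulate a family of field types, indexed by tags. To do so, we add one layer to the type algebra: rows of field types.

\[ \tau ::= \alpha, \beta, \gamma, \ldots \mid \bot \mid \top \mid \tau \to \tau \mid [\rho] \]
\[ \rho ::= \varphi, \psi, \ldots \mid K : \phi ; \rho \mid \partial\phi \]
\[ \phi ::= \varphi, \psi, \ldots \mid \mathsf{Abs} \mid \mathsf{Pre}\ \tau \mid \mathsf{Any} \]

We can now extend the previous section's proposal, as follows:

\[ K : \alpha \to [K : \mathsf{Pre}\ \alpha ; \partial\mathsf{Abs}] \]
\[ K^{-1} : (\alpha \to \beta) \to ([K : \mathsf{Abs} ; \psi] \to \gamma) \to [K : \varphi ; \psi] \to \gamma \quad \text{where} \quad \varphi \le \mathsf{Pre}\ \alpha, \quad \mathsf{Pre} \le \varphi \;?\; \beta \le \gamma \]
\[ \mathsf{close} : [\partial\mathsf{Abs}] \to \bot \]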

$K^{-1}$'s type scheme involves the same constraints as in the basic case. Using a single row variable, namely $\psi$, in two distinct positions allows expressing the fact that values carrying any tag other than $K$ will be passed unmodified to $K^{-1}$'s second argument.

$\mathsf{close}$'s argument type is $[\partial\mathsf{Abs}]$, which prevents it from ever being invoked. This accords with the fact that $\mathsf{close}$ does not have an associated reduction rule. It plays the role of a function defined by zero cases.

This system offers extensible pattern matchings: any k-ary case construct [...] $\mathsf{close}$; it will receive the desired, accurate type. (This fact is illustrated by Example 3 in Section 7.) Thus, no specific language construct or type inference rule is needed to deal with them.

5. Record Concatenation

Static typing for record operations is a widely studied problem [4, 15]. Common operations include selection, extension, restriction, and concatenation. The latter comes in two flavors: symmetric and asymmetric. The former requires its arguments to have disjoint sets of fields, whereas the latter gives precedence to the second one when a conflict occurs.

Of these operations, concatenation is probably the most difficult to deal with, because its behavior varies according to the presence or absence of each field in its two arguments. This has led many authors to restrict their attention to type checking, and to not address the issue of type inference [8]. An inference algorithm for asymmetric concatenation was suggested by Wand [30]. He uses disjunctions of constraints, however, which gives his system exponential complexity. Rémy [22] suggests an encoding of concatenation into $\lambda$-abstraction and record extension, whence an inference algorithm may be derived. Unfortunately, its power is somewhat decreased by subtle interactions with ML's restricted polymorphism; furthermore, the encoding is exposed to the user. In later work [23], Rémy suggests a direct, constraint-based algorithm, which involves a special form of constraints. Sulzmann [27] follows a similar route and creates a custom instance of HM(X), again involving special-purpose concatenation constraints. Our approach is inspired by Rémy's later paper, but re-formulated in terms of conditional constraints, thus showing that no ad hoc constraint forms are necessary.

Again, our presentation is in two steps. The basic case, where records only have one field, is tackled using subtyping and conditional constraints. Then, rows allow us to easily transfer our results to the case of multiple fields.

5.1 The Basic Case

We assume a language equipped with one-field records, whose unique field may be either "absent" or "present". More precisely, we assume a constant data constructor $\mathsf{Abs}$, and a unary data constructor $\mathsf{Pre}$; a "record" is a value built with one of these constructors. A data destructor, $\mathsf{Pre}^{-1}$, allows accessing the contents of a non-empty record. Lastly, the language offers asymmetric and symmetric concatenation primitives, written $@$ and $@@$, respectively.

\[ e ::= \ldots \mid \mathsf{Abs} \mid \mathsf{Pre} \mid \mathsf{Pre}^{-1} \mid @ \mid @@ \]

The relationship between record creation and record access is expressed by a simple reduction rule:

\[ \mathsf{Pre}^{-1}\ (\mathsf{Pre}\ v) \quad \text{reduces to} \quad v \]

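The semantics of asymmetric record concatenation is given as follows:

\[ v_1\ @\ \mathsf{Abs} \quad \text{reduces to} \quad v_1 \]
\[ v_1\ @\ (\mathsf{Pre}\ v_2) \quad \text{reduces to} \quad \mathsf{Pre}\ v_2 \]

(In each of these rules, the value $v_1$ is required to be a record.) Lastly, symmetric concatenation is defined by

\[ \mathsf{Abs}\ @@\ v_2 \quad \text{reduces to} \quad v_2 \]
\[ v_1\ @@\ \mathsf{Abs} \quad \text{reduces to} \quad v_1 \]

(In these two rules, $v_1$ and $v_2$ are required to be records.)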
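These reduction rules can be transcribed as two small functions on one-field records. The sketch is illustrative only: records are encoded as the tuple ("Abs",) or the pair ("Pre", v), and a stuck term is modelled by raising an exception. The names `ABS`, `pre`, `concat_asym` and `concat_sym` are hypothetical.

```python
ABS = ("Abs",)   # the empty one-field record


def pre(v):
    """The one-field record whose field is present and holds v."""
    return ("Pre", v)


def concat_asym(r1, r2):
    """Asymmetric concatenation: the second record takes precedence when present."""
    return r1 if r2 == ABS else r2


def concat_sym(r1, r2):
    """Symmetric concatenation: defined only when at least one record is empty."""
    if r1 == ABS:
        return r2
    if r2 == ABS:
        return r1
    # No reduction rule applies when both fields are present.
    raise ValueError("both records define the field")


kept = concat_asym(pre(1), ABS)
overridden = concat_asym(pre(1), pre(2))
merged = concat_sym(ABS, pre(3))
```

In both functions the result is chosen by inspecting which constructor built the arguments, which is why the types of the arguments must reveal whether a field is absent or present.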

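The construction of our type algebra is similar to the one performed in Section 4.1. We introduce a (unary) record type constructor, as well as a distinction between normal types and field types:

\[ \tau ::= \alpha, \beta, \gamma, \ldots \mid \bot \mid \top \mid \tau \to \tau \mid \{\phi\} \]
\[ \phi ::= \varphi, \psi, \ldots \mid \mathsf{Bot} \mid \mathsf{Abs} \mid \mathsf{Pre}\ \tau \mid \mathsf{Either}\ \tau \mid \mathsf{Any} \]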

Let us explain, step by step, our definition of field types. Our first, natural step is to introduce type constructors Abs and Pre, which allow describing values built with the data constructors Abs and Pre. The former is a constant type constructor, while the latter is unary and covariant.

Many type systems for record languages define $\mathsf{Pre}\ \tau$ to be a subtype of $\mathsf{Abs}$. This allows a record whose field is present to pretend it is not, leading to a classic theory of records whose fields may be "forgotten" via subtyping. However, when the language offers record concatenation, such a definition isn't appropriate. Why? Concatenation, whether asymmetric or symmetric, involves a choice between two reduction rules, which is performed by matching one, or both, of the arguments against the data constructors Abs and Pre. If, at the level of types, we allow a non-empty record to masquerade as an empty one, then it becomes impossible, based on the arguments' types, to find out which rule applies, and to determine the type of the operation's result. In summary, in the presence of record concatenation, no subtyping relationship must exist between $\mathsf{Pre}\ \tau$ and $\mathsf{Abs}$. (This problem is well described, although not solved, in [4].)

This leads us to making Abs and Pre incomparable. Once this choice has been made, completing the definition of field types is rather straightforward. Because our system requires type constructors to form a lattice, we define a least element Bot, and a greatest element Any. Lastly, we introduce a unary, covariant type constructor, Either, which we define as the least upper bound of Abs and Pre, so that $\mathsf{Abs} \sqcup (\mathsf{Pre}\,\tau)$ equals $\mathsf{Either}\,\tau$. This optional refinement allows us to keep track of a field's type, even when its presence is not ascertained. (These ideas are due to Rémy, who carries them further in the case of objects [24].) The lattice of field types is shown in Figure 4.

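\[ \mathsf{Bot} \le \mathsf{Abs} \le \mathsf{Either}\,\tau \le \mathsf{Any} \qquad\qquad \mathsf{Bot} \le \mathsf{Pre}\,\tau \le \mathsf{Either}\,\tau \le \mathsf{Any} \]

Figure 4: The lattice of record field types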
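The ordering of Figure 4 can be checked mechanically. The Python sketch below encodes constructor names only (the parameter of Pre and Either is ignored, an assumption made to keep the example short); the names COVERS, up_set, leq and join are hypothetical.

```python
# Covering relation of the lattice: each constructor maps to the
# constructors that sit immediately above it.
COVERS = {
    "Bot": {"Abs", "Pre"},
    "Abs": {"Either"},
    "Pre": {"Either"},
    "Either": {"Any"},
    "Any": set(),
}


def up_set(c):
    """All constructors greater than or equal to c."""
    seen, todo = {c}, [c]
    while todo:
        for d in COVERS[todo.pop()]:
            if d not in seen:
                seen.add(d)
                todo.append(d)
    return seen


def leq(c1, c2):
    return c2 in up_set(c1)


def join(c1, c2):
    """Least upper bound: the least element common to both up-sets."""
    common = up_set(c1) & up_set(c2)
    return next(c for c in common if all(leq(c, d) for d in common))
```

In this encoding, Abs and Pre are incomparable and their join is Either, as required by the discussion above.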

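Let us now assign types to the primitive operations offered by the language. Record creation and access receive their usual types:

\[
\begin{array}{rcl}
\mathsf{Abs} & : & \{\mathsf{Abs}\} \\
\mathsf{Pre} & : & \alpha \to \{\mathsf{Pre}\,\alpha\} \\
\mathsf{Pre}^{-1} & : & \{\mathsf{Pre}\,\alpha\} \to \alpha
\end{array}
\]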

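There remains to come up with correct, precise types for both flavors of record concatenation. The key idea is simple. As shown by its operational semantics, (either flavor of) record concatenation is really a function defined by cases over the data constructors Abs and Pre, and Section 4 has shown how to accurately describe such a function. Let us begin, then, with asymmetric concatenation:

\[
\begin{array}{rl}
@ : & \{\varphi_1\} \to \{\varphi_2\} \to \{\varphi_3\} \\
\text{where} & \varphi_2 \le \mathsf{Either}\,\alpha_2 \\
 & \mathsf{Abs} \le \varphi_2 \;?\; \varphi_1 \le \varphi_3 \\
 & \mathsf{Pre} \le \varphi_2 \;?\; \mathsf{Pre}\,\alpha_2 \le \varphi_3
\end{array}
\]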

Clearly, each conditional constraint mirrors one of the reduction rules. In the second conditional constraint, we assume $\alpha_2$ is the type of the second record's field, if it has one. The first subtyping constraint represents this assumption. Notice that we use $\mathsf{Pre}\,\alpha_2$, rather than $\varphi_2$, as the second branch's result type; this is strictly more precise, because $\varphi_2$ may be of the form $\mathsf{Either}\,\alpha_2$.
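As a worked instance of this remark (a sketch; the base type int is an assumption, borrowed from the examples of Section 7), take $\varphi_1 = \mathsf{Pre}\,\mathsf{int}$ and $\varphi_2 = \mathsf{Either}\,\mathsf{int}$. Both conditions hold, since Abs and Pre both lie below Either, so both conclusions become active:

```latex
\[
\begin{array}{ll}
\varphi_2 \le \mathsf{Either}\,\alpha_2
  & \text{requires } \mathsf{int} \le \alpha_2 \text{; the least choice is } \alpha_2 = \mathsf{int} \\
\mathsf{Abs} \le \varphi_2 \;?\; \varphi_1 \le \varphi_3
  & \text{triggered, yields } \mathsf{Pre}\,\mathsf{int} \le \varphi_3 \\
\mathsf{Pre} \le \varphi_2 \;?\; \mathsf{Pre}\,\alpha_2 \le \varphi_3
  & \text{triggered, yields } \mathsf{Pre}\,\mathsf{int} \le \varphi_3
\end{array}
\]
```

The least solution is $\varphi_3 = \mathsf{Pre}\,\mathsf{int}$: the result's field is known to be present. Had the second branch used $\varphi_2$ as its result type, the conclusion $\mathsf{Either}\,\mathsf{int} \le \varphi_3$ would have forced $\varphi_3$ up to $\mathsf{Either}\,\mathsf{int}$, losing this information.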

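Lastly, we turn to symmetric concatenation:

\[
\begin{array}{rl}
@@ : & \{\varphi_1\} \to \{\varphi_2\} \to \{\varphi_3\} \\
\text{where} & \mathsf{Abs} \le \varphi_1 \;?\; \varphi_2 \le \varphi_3 \\
 & \mathsf{Abs} \le \varphi_2 \;?\; \varphi_1 \le \varphi_3 \\
 & \mathsf{Pre} \le \varphi_1 \;?\; \varphi_2 \le \mathsf{Abs} \\
 & \mathsf{Pre} \le \varphi_2 \;?\; \varphi_1 \le \mathsf{Abs}
\end{array}
\]

Again, each of the first two constraints mirrors a reduction rule. The last two state that, if one argument's field may be present, then the other's must be absent. (The careful reader will notice that any one of these two constraints would in fact suffice; both are kept for symmetry.)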

In both cases, the operation's description in terms of constraints closely resembles its operational definition. Automatically deriving the former from the latter seems possible; this is an area for future research.

5.2 The General Case

We now move to a language with a denumerable set of record labels, written l, m, etc. The language allows creating the empty record, as well as any one-field record; it also offers selection and concatenation operations. Extension and restriction can be easily added, if desired.

\[ e ::= \varnothing \mid \{l = e\} \mid e.l \mid @ \mid @@ \]

We do not give the semantics of the language, which should hopefully be clear enough.

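At the level of types, we again introduce rows of field types, denoted by $\Phi$. Furthermore, we introduce rows of normal types, denoted by $\varrho$. Lastly, we lift the five field type constructors to the level of rows.

\[
\begin{array}{rcl}
\tau & ::= & \alpha, \beta, \gamma, \ldots \mid \bot \mid \top \mid \tau \to \tau \mid \{\Phi\} \\
\phi & ::= & \varphi, \psi, \ldots \mid \mathsf{Bot} \mid \mathsf{Abs} \mid \mathsf{Pre}\,\tau \mid \mathsf{Either}\,\tau \mid \mathsf{Any} \\
\varrho & ::= & \alpha, \beta, \gamma, \ldots \mid l : \tau;\, \varrho \mid \partial\tau \\
\Phi & ::= & \varphi, \psi, \ldots \mid l : \phi;\, \Phi \mid \partial\phi \mid \mathsf{Bot} \mid \mathsf{Abs} \mid \mathsf{Pre}\,\varrho \mid \mathsf{Either}\,\varrho \mid \mathsf{Any}
\end{array}
\]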

This allows writing complex constraints between rows, such as $\varphi \le \mathsf{Pre}\,\alpha$, where $\varphi$ and $\alpha$ are row variables. A constraint between rows is interpreted as an infinite family of constraints between types, obtained component-wise. That is, $(l : \varphi';\, \varphi'') \le \mathsf{Pre}\,(l : \alpha';\, \alpha'')$ has the same logical meaning as $(\varphi' \le \mathsf{Pre}\,\alpha') \wedge (\varphi'' \le \mathsf{Pre}\,\alpha'')$. (See Section 2 for details.)
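The component-wise reading can be illustrated with a small Python sketch. It is purely illustrative: the Row class, its at method and the componentwise function are hypothetical names, entries are kept as opaque strings, and the single entry tagged "*" stands for the infinitely many labels that neither row mentions explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """Finitely many explicit labels, plus one entry for all other labels."""
    explicit: tuple  # tuple of (label, entry) pairs
    rest: str

    def at(self, label):
        return dict(self.explicit).get(label, self.rest)


def componentwise(lhs, rhs):
    """Expand the row constraint `lhs <= Pre rhs` into one constraint per
    explicitly mentioned label, plus one standing for all remaining labels."""
    labels = sorted({l for l, _ in lhs.explicit} | {l for l, _ in rhs.explicit})
    family = [(l, f"{lhs.at(l)} <= Pre {rhs.at(l)}") for l in labels]
    family.append(("*", f"{lhs.rest} <= Pre {rhs.rest}"))
    return family


# The example from the text: (l : phi'; phi'') <= Pre (l : alpha'; alpha'')
lhs = Row((("l", "phi'"),), "phi''")
rhs = Row((("l", "alpha'"),), "alpha''")
family = componentwise(lhs, rhs)
```

Here family contains the constraint on label l and the constraint on the remainders, matching the conjunction given above.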

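We may now give types to the primitive record operations. Creation and selection are easily dealt with:

\[
\begin{array}{rcl}
\varnothing & : & \{\partial\mathsf{Abs}\} \\
\{l = \cdot\} & : & \alpha \to \{l : \mathsf{Pre}\,\alpha;\, \partial\mathsf{Abs}\} \\
\cdot.l & : & \{l : \mathsf{Pre}\,\alpha;\, \partial\mathsf{Any}\} \to \alpha
\end{array}
\]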

Interestingly, the types of both concatenation operations are unchanged from the previous section, at least syntactically. (We do not repeat them here.) A subtle difference lies in the fact that all variables involved must now be read as row variables, rather than as type variables. In short, the previous section exhibited constraints which describe concatenation, at the level of a single record field; here, the row machinery allows us to replicate these constraints over an infinite set of labels. This increase in power comes almost for free.

6. First-Class Messages

In many current object-oriented languages, messages do not have first-class status. That is, whenever a message is sent to an object, its name is fixed and must be explicitly mentioned; only the message parameters and the receiver object are allowed to vary dynamically. Some languages, however, allow first-class (also known as "dynamic") messages. That is, they allow messages to exist as autonomous entities, which may be computed in arbitrary ways before being sent to an object.

We will view objects as records of functions, and messages as tagged values, made up of a label and a parameter. Indeed, this simple view suffices to exhibit the type inference problem we are interested in. Thus, we consider a language with records and data constructors, as described in Sections 4.2 and 5.2. Furthermore, we let record labels and data constructors range over a single name space, that of message labels. A primitive message-send operation, written #, is defined as follows:

\[ \#\;\{m = v_1; \ldots\}\;(m\;v_2) \quad\text{reduces to}\quad (v_1\;v_2) \]

In plain words, # examines its second argument, which must be some message m with parameter $v_2$. It then looks up the method named m in the receiver object, and applies the method's code, $v_1$, to the message parameter. Put another way, if r is a record of functions, then (# r) acts as a function defined by cases. Thus, # is nothing but a witness of the well-known isomorphism which exists between these representations.
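The untyped behavior of # is easy to reproduce in a dynamically typed language. The Python sketch below is only an illustration: objects are dictionaries of functions, messages are (label, parameter) pairs, and send and handler are hypothetical names for # and for its partial application (# r).

```python
from functools import partial
from typing import Any, Callable, Dict, Tuple

Obj = Dict[str, Callable[[Any], Any]]  # an object: a record of functions
Message = Tuple[str, Any]              # a message: a label and a parameter


def send(receiver: Obj, message: Message) -> Any:
    """Look up the method named by the message's label and apply it to
    the message's parameter. A missing method raises KeyError, which
    plays the role of a stuck term in this sketch."""
    label, parameter = message
    method = receiver[label]
    return method(parameter)


def handler(receiver: Obj) -> Callable[[Message], Any]:
    """The partial application (# r): a function defined by cases."""
    return partial(send, receiver)


counter = {"succ": lambda n: n + 1, "show": lambda n: str(n)}
results = [handler(counter)(m) for m in [("succ", 1), ("show", 7)]]
```

The two methods of counter have different return types, which is exactly the situation a type system for first-class messages must accommodate.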

6.1 The Problem

In a language without first-class messages, every message-send operation must involve a fixed message label. So, instead of a single, generic operation such as #, the language provides a family of primitive message-send operations, indexed by message labels.

In our view of objects as records of functions, these operations are definable within the language. Indeed, the operation #m, which allows sending the message m to the object o with parameter p, may be defined as $\lambda o.\lambda p.(o.m\;p)$. Then, type inference yields

\[ \#m : \{m : \mathsf{Pre}(\alpha \to \beta);\, \partial\mathsf{Any}\} \to \alpha \to \beta \]

Because the message label m is statically known, it can be explicitly mentioned in the type scheme, making it easy to require the receiver object to carry an appropriate method.

In a language with first-class messages, on the other hand, m is no longer known. As a result of this difficulty, much of the initial work on typed object-oriented languages has ignored the issue of first-class messages. Gaster [7, chapter 7] studies a static type system where # (under the name of sumE- [...] only if all functions stored in r have the same return type. This condition is clearly too restrictive for our purposes: an object must be allowed to contain methods with different return types. Nishimura [13] suggests a type inference system for an object-oriented language with first-class messages, in the style of Ohori's second-order typed record calculus [15]. It is later re-formulated by Müller and Nishimura [12]. The new presentation is based on feature constraints, including a new form of constraints, specifically intended to model the behavior of a generic message-send operation. Bugliesi and Crafa [3] also attempt to present a simplified view of Nishimura's original work. However, they choose a higher-order type system, thus abandoning type inference.

6.2 A Solution

We said above that, given a record r, the partial application (# r) yields a function defined by cases. Indeed, given a tagged value (m v), it will invoke an appropriate piece of code, selected according to the label m. Good point: this paper is precisely concerned with ways of giving accurate types to functions defined by cases. We have shown how conditional constraints allow ignoring (the return type of) a branch, unless it is liable to be taken. In object-oriented terms, they allow ignoring (the return type of) any method which is provably unrelated with the message at hand. This solves the crucial problem with first-class messages.

Here, we choose to deal directly with the case of multiple message labels, even though the two-step presentation adopted in Sections 4 and 5 would still make sense here. Therefore, we propose:

\[
\begin{array}{rl}
\# : & \{\varphi\} \to [\psi] \to \beta \\
\text{where} & \psi \le \mathsf{Pre}\,\alpha \\
 & \mathsf{Pre} \le \psi \;?\; \varphi \le \mathsf{Pre}(\alpha \to \beta)
\end{array}
\]

(Here, all variables except $\beta$ are row variables.) The operation's first (resp. second) argument is required to be an object (resp. a message), whose contents (resp. possible values) are described by the row variable $\varphi$ (resp. $\psi$). The first constraint merely lets $\alpha$ stand for the type of the message parameter. The conditional constraint, which involves three rows, should again be understood as a family, indexed by message labels, of conditional constraints between field types. The conditional constraint associated with some label m will be triggered only if $\psi$'s element at index m is of the form $\mathsf{Pre}\,\tau$, i.e. only if the message's label may be m. When it is triggered, its right-hand side becomes active, with a three-fold effect. First, $\varphi$'s element at index m must be of the form $\mathsf{Pre}(\tau_1 \to \tau_2)$, i.e. the receiver object must carry a method labeled m. Second, the method's argument type must be (a supertype of) $\alpha$'s element at label m, i.e. the method must be able to accept the message's parameter. Third, the method's result type must be (a subtype of) $\beta$, i.e. the result type of the whole operation will be (at least) that of the method.

This proposal shows that type inference for first-class messages can be performed using existing tools, with no need for dedicated theoretical machinery. It also shows that first-class messages are naturally compatible with all operations on records, including concatenation, a question left unanswered by Nishimura [13].

7. Examples

This section illustrates the proposals made in the previous sections with short examples.

Example 3. We consider lists built out of two data constructors, N and C. The function car, which returns the first element of a list, if it exists, and E otherwise (where E is another data constructor, standing for error), is defined as follows:

\[ \mathit{car} = (N^{-1}\;E\;(C^{-1}\;(\lambda(x, r).x)\;\mathit{close})) \]

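Then, car's inferred type scheme is

\[
\begin{array}{rl}
\mathit{car} : & [N : \varphi;\; C : \psi;\; \partial\mathsf{Abs}] \to \gamma \\
\text{where} & \varphi \le \mathsf{Pre}\,\alpha \\
 & \mathsf{Pre} \le \varphi \;?\; [E : \mathsf{Pre}\,\alpha;\; \partial\mathsf{Abs}] \le \gamma \\
 & \psi \le \mathsf{Pre}(\beta \times \top) \\
 & \mathsf{Pre} \le \psi \;?\; \beta \le \gamma
\end{array}
\]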

(Because the language only offers unary constructors, N and E must carry some argument, which remains unspecified here; $\alpha$ stands for its type. Usually, one identifies $\alpha$ with some unit type.) The first conditional constraint above tells that the first branch of car's definition, namely E, will not be taken unless $\varphi$ is "present", i.e. unless car's argument is tagged N. The next one tells that the second branch, namely $\lambda(x, r).x$, will not be taken unless it is tagged C. No other tags are allowed, because car's argument type involves the row $(N : \varphi;\; C : \psi;\; \partial\mathsf{Abs})$, whose projection on any tag other than N and C is Abs.

What happens when applying car? The type inferred for the expression car (C (1, C (true, N ()))), where it is passed a heterogeneous, 2-element list, is int. In other words, this expression is statically found not to produce E, because the first conditional constraint is not triggered.

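Example 4. We define a function which reads the field l out of a record r and returns a default value d if r has no such field. It is given by extract = $\lambda d.\lambda r.(\{l = d\} \mathbin{@} r).l$. In our system, extract's inferred type is

\[ \mathit{extract} : \alpha \to \{l : \varphi;\; \psi\} \to \gamma \]
\[
\begin{array}{rll}
\text{where} & \varphi \le \mathsf{Either}\,\beta & \psi \le \mathsf{Either}\,\delta \\
 & \mathsf{Abs} \le \varphi \;?\; \alpha \le \gamma & \mathsf{Abs} \le \psi \;?\; \partial\mathsf{Abs} \le \partial\mathsf{Any} \\
 & \mathsf{Pre} \le \varphi \;?\; \beta \le \gamma & \mathsf{Pre} \le \psi \;?\; \mathsf{Pre}\,\delta \le \partial\mathsf{Any}
\end{array}
\]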

The first constraint retrieves r.l's type and names it $\beta$, regardless of the field's presence. (If the field turns out to be absent, $\beta$ will be unconstrained.) The left-hand conditional constraints clearly specify the dependency between the field's presence and the function's result.

The right-hand conditional constraints have tautologous conclusions; therefore, they are superfluous. They remain only because our current constraint simplification algorithms are "lazy" and ignore any conditional constraints whose condition has not yet been fulfilled. This problem could be fixed by making the simplification algorithm slightly more aggressive, i.e. by allowing it to check whether the conclusion of a conditional constraint is redundant, regardless of its condition.

The type inferred for extract 0 {l = 1} and extract 0 {m = 1} is int. Thus, in many cases, one need not be aware of the complexity hidden in extract's type.

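Example 5. We assume given an object o, of the following type:

\[
\begin{array}{rl}
o : \{ & \mathit{getText} : \mathsf{Pre}(\mathsf{unit} \to \mathsf{string}); \\
 & \mathit{setText} : \mathsf{Pre}(\mathsf{string} \to \mathsf{unit}); \\
 & \mathit{select} : \mathsf{Pre}(\mathsf{int} \times \mathsf{int} \to \mathsf{unit}); \\
 & \partial\mathsf{Abs} \;\}
\end{array}
\]

o may represent, for instance, an editable text field in a graphic user interface system. Its methods allow programmatically getting and setting its contents, as well as selecting a portion of text.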

Next, we assume a list data structure, equipped with a simple iterator:

\[ \mathit{iter} : (\alpha \to \mathsf{unit}) \to \alpha\;\mathsf{list} \to \mathsf{unit} \]

The following expression creates a list of messages, and uses iter to send each of them in turn to o:

\[ \mathit{iter}\;(\#\;o)\;[\mathit{setText}\;\text{"Hello!"};\; \mathit{select}\;(0, 5)] \]

This expression is well-typed, because o contains appropriate methods to deal with each of these messages, and because these methods return unit, as expected by iter. The expression's type is of course unit, iter's return type.

Here is a similar expression, which involves a getText message:

[...]

This time, it is ill-typed. Indeed, sending a setText message to o produces a result of type unit, while sending it a getText message produces a result of type string. Thus, (# o)'s result type must be $\top$, the join of these types. This makes (# o) an unacceptable argument for iter, since the latter expects a function whose return type is unit.
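The reasoning of Example 5 can be replayed on ground types with a short Python sketch. It is a deliberately simplified illustration, not the inference algorithm: base types are plain strings, subtyping between them is taken to be equality (so that two distinct result types join to "top"), and send_result_type is a hypothetical name. The receiver maps each present method to its argument and result types; the message maps each label it may carry to its parameter type.

```python
def join(t1, t2):
    """Join under the flat-subtyping assumption of this sketch."""
    return t1 if t1 == t2 else "top"


def send_result_type(receiver, message):
    """Return the least result type of sending `message` to `receiver`,
    or raise TypeError if some triggered constraint cannot be satisfied."""
    result = None  # no conditional constraint triggered yet
    for label, parameter in message.items():
        # The conditional constraint at `label` is triggered, because the
        # message may carry this label.
        if label not in receiver:
            raise TypeError(f"receiver has no method {label}")
        argument, method_result = receiver[label]
        if argument != parameter:
            raise TypeError(f"method {label} cannot accept {parameter}")
        result = method_result if result is None else join(result, method_result)
    return result


o = {
    "getText": ("unit", "string"),
    "setText": ("string", "unit"),
    "select": ("int * int", "unit"),
}
well_typed = send_result_type(o, {"setText": "string", "select": "int * int"})
ill_typed = send_result_type(o, {"setText": "string", "getText": "unit"})
```

With the first message type, the result is unit, which iter accepts; with the second, the result is the join of unit and string, which iter rejects.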

8. Conclusion

In this paper, we have advocated the use of a constraint-based type inference system equipped with subtyping, rows and conditional constraints. This provides a common solution to several difficult type inference problems, which, so far, had been addressed using special forms of constraints. From a practical point of view, it allows them to benefit from known constraint simplification techniques (see Appendix A), leading to an efficient inference algorithm [18].

Our system subsumes Rémy's proposal for record concatenation [23], as well as Müller and Nishimura's view of first-class messages [12]. Aiken, Wimmers and Lakshman's "soft" type system [2] is more precise than ours, because it interprets constraints in a richer logical model, but otherwise offers similar features.

The design of a type inference system involves two orthogonal components: a set of typing rules and a constraint language (together with its logical interpretation). As to the former, we have suggested using HM(X) [14, 28, 27], whose formulation appears most elegant, but other choices would be possible (see e.g. [10, 17]). The focus of the paper is really on the latter: our aim was to find a constraint language expressive enough to accurately describe the features of the programming language at hand. One should emphasize the fact that we do not, a priori, view the constraint system SRC as better (simpler, more elementary, more canonical, etc.) than its competitors. We merely take its wide applicability as evidence of the fact that it is comparatively more general-purpose (less ad hoc) than some of its predecessors.

To conclude, we hope this paper illustrates how a small number of well-understood logic mechanisms allow building an advanced type inference system.

Acknowledgements

Thanks to Jacques Garrigue, Martin Müller, Didier Rémy and Martin Sulzmann for stimulating discussions. Didier Rémy also proof-read a version of the manuscript. Lastly, I would like to thank the anonymous referees for their suggestions.

Appendix A. Algorithms and Proofs

This appendix contains a formal description of constraint resolution and simplification algorithms, in the presence of atomic and conditional subtyping constraints. Resolution is required in determining whether a program is type-correct; simplification is key to achieving reasonable efficiency.

The system described in this appendix does not have rows, or a separation of types into distinct kinds, but otherwise has all features presented in the body of this paper. Adding rows to this formal description would require work, but should not pose any foreseeable difficulty, since the concept of row is essentially orthogonal to the notion of subtyping. Adding kinds should be routine. A reference implementation of the full system, including rows and kinds, is available [18].

This appendix describes an extension of [17] with conditional constraints. Thus, most proofs presented here are partial, and describe only the modifications required to accommodate conditional constraints. However, all definitions and statements are complete.

This appendix is laid out as follows. First, we review all necessary concepts, including ground types, types, constraints, and type schemes. Then, we give a constraint resolution algorithm, and three constraint simplification algorithms.

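Throughout this appendix, we use a couple of notational shortcuts. If P is a logic predicate, then

\[
\begin{array}{rcl}
\forall \rho \vdash C \;\; P(\rho) & \text{stands for} & \forall \rho \;\; (\rho \vdash C) \Rightarrow P(\rho) \\
\exists \rho \vdash C \;\; P(\rho) & \text{stands for} & \exists \rho \;\; (\rho \vdash C) \wedge P(\rho)
\end{array}
\]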

Appendix A.1 Ground Types

As in Section 2, our formal development is parameterized with an arbitrary ground signature (see Definition 1). We assume that it defines only one kind, so we write S and T without a kind subscript. Likewise, we write $\bot_S$, $\top_S$, $\le_S$, $\sqcup_S$ and $\sqcap_S$ for the least element, greatest element, ordering, join and meet of S. We also assume $L_{\mathrm{row}} = \varnothing$. The model $(T, \le)$ is defined as in Definitions 2 and 3.

In this appendix, we use the letter $\tau$ to denote either a ground type, or a type, and sometimes both at the same time (see e.g. Definition 17 and Theorem 2). We will try to preserve a clear distinction whenever possible.

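Theorem 1. T, equipped with $\le$, is a lattice. Its lattice operations, denoted by $\sqcup$ and $\sqcap$, are characterized by the following identities:

\[ (\tau_1 \mathbin{\Box} \tau_2)(\epsilon) = \tau_1(\epsilon) \mathbin{\Box_S} \tau_2(\epsilon) \]
\[ \forall l \in \mathrm{dom}(\tau_1 \mathbin{\Box} \tau_2) \quad (\tau_1 \mathbin{\Box} \tau_2).l = \tau_1.l \mathbin{\Box_l} \tau_2.l \]

where $\Box$ may stand for $\sqcup$ or $\sqcap$. (We let $\Box_l$ stand for $\Box$ when $l \in L^+$; when $l \in L^-$, $\sqcup_l$ and $\sqcap_l$ stand for $\sqcap$ and $\sqcup$, respectively.) In the second equation, $\tau_1.l$ (resp. $\tau_2.l$) may be undefined; in such a case, it should be read as the neutral element of $\Box_l$.

Note that, because of the last requirement of Definition 1, at least one of $\tau_1.l$ and $\tau_2.l$ must be defined in the second equation above.

We let $\bot$ (resp. $\top$) stand for the ground type $\tau$ such that $\mathrm{dom}(\tau) = \{\epsilon\}$ and $\tau(\epsilon) = \bot_S$ (resp. $\top_S$).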

Appendix A.2 Types

Types are defined as in Section 2.2, except row terms are disallowed.

Definition 7. Let V be a denumerable set of type variables, denoted by $\alpha$, $\beta$, etc. The set of types, denoted by T, is the term algebra $T(\Sigma, V)$. In other words, a type $\tau$ is either a type variable, or a constructed term, of the form $s(\tau_l)_{l \in a(s)}$, where $s \in S$ is $\tau$'s head constructor, also written $\mathrm{hd}(\tau)$.

Definition 8. A ground substitution $\rho$ is a total mapping from type variables to ground types. Ground substitutions are straightforwardly extended to types.

Appendix A.3 Constraints and Type Schemes

We now give syntax and semantics for three kinds of constraints: atomic constraints, conditions and conditional constraints. In each case, the notation $\rho \vdash_k c$ means that the ground substitution $\rho$ k-satisfies the constraint c. The notation $\rho \vdash c$ means that $\rho$ satisfies c, and holds, by definition, if and only if $\rho \vdash_k c$ holds for all $k \in \mathbb{N}^+$.

Definition 9. An atomic constraint is a pair of types, written $\tau_1 \le \tau_2$. A ground substitution $\rho$ k-satisfies it iff $\rho(\tau_1) \le_k \rho(\tau_2)$.

Definition 10. A condition is a pair of a symbol $s \in S$ and of a type $\tau$, written $s \le \tau$, where s must be a prime element of S. A ground substitution $\rho$ satisfies $s \le \tau$ iff $s \le_S \mathrm{hd}(\rho(\tau))$.

Definition 11. A conditional constraint is a pair of a condition and of an atomic constraint, written $s \le \tau \;?\; \tau_1 \le \tau_2$. A ground substitution $\rho$ k-satisfies it iff $\rho \vdash s \le \tau$ implies $\rho \vdash_k \tau_1 \le \tau_2$.
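Definitions 9 to 11 can be exercised on a toy model. The Python sketch below makes a strong simplifying assumption: ground types are constant symbols only, taken from the field type constructors of Section 5.1 with their parameters omitted, so that k-satisfaction coincides with plain satisfaction. The names ABOVE, leq, value, satisfies_atomic and satisfies_conditional are hypothetical.

```python
# Up-sets of the field type constructors (parameters omitted).
ABOVE = {
    "Bot": {"Bot", "Abs", "Pre", "Either", "Any"},
    "Abs": {"Abs", "Either", "Any"},
    "Pre": {"Pre", "Either", "Any"},
    "Either": {"Either", "Any"},
    "Any": {"Any"},
}


def leq(s1, s2):
    return s2 in ABOVE[s1]


def value(rho, t):
    """A variable is looked up in rho; a constant symbol denotes itself."""
    return rho.get(t, t)


def satisfies_atomic(rho, constraint):
    lhs, rhs = constraint
    return leq(value(rho, lhs), value(rho, rhs))


def satisfies_conditional(rho, conditional):
    """conditional = (s, tested, conclusion), read as s <= tested ? conclusion."""
    symbol, tested, conclusion = conditional
    if not leq(symbol, value(rho, tested)):
        return True  # the condition does not hold: nothing is required
    return satisfies_atomic(rho, conclusion)


# Asymmetric concatenation, at the level of constructors only.
asym = [
    ("Abs", "phi2", ("phi1", "phi3")),
    ("Pre", "phi2", ("Pre", "phi3")),
]
rho = {"phi1": "Pre", "phi2": "Abs", "phi3": "Pre"}
```

Under rho, only the first condition holds, and its conclusion is satisfied; assigning Abs to phi3 instead would violate it.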

Having defined constraints, we may define notions of satisfaction and entailment on constraint sets. They are defined in the usual way. We also introduce a non-standard notion of pre-satisfaction (resp. pre-entailment), which is logically weaker (resp. stronger) than its standard counterpart, because it ignores conditional constraints. These notions are purely technical; they are used only within our proofs.

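Definition 12. Let C be a set of constraints, both atomic and conditional. A ground substitution $\rho$ is a pre-solution of C iff $\rho \vdash c$ holds for all atomic $c \in C$. $\rho$ is a solution of C iff $\rho \vdash c$ holds for all $c \in C$. We write $\rho \vdash^{\mathrm{pre}} C$ in the former case, and $\rho \vdash C$ in the latter.

Let c be a constraint. C pre-entails c, which we write $C \Vdash^{\mathrm{pre}} c$, iff $\forall \rho \vdash^{\mathrm{pre}} C \;\; \rho \vdash c$. C entails c, which we write $C \Vdash c$, iff $\forall \rho \vdash C \;\; \rho \vdash c$.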

We now define type schemes. They are constrained polymorphic types, i.e. types containing variables whose possible instantiations are restricted by a constraint set. For simplicity, we only consider closed type schemes, i.e. type schemes which have no free type variables. Although somewhat uncommon, type systems exist which respect this restriction (see [29, 17]). It should also be possible to extend our results to the case of arbitrary type schemes.

Definition 13. A type scheme $\sigma$ is a pair of a type $\tau$ and of a constraint set C, written $\forall C.\,\tau$.

A type scheme represents a set of ground types, which we call its denotation. Each of these ground types represents one possible correct behavior of the program described by $\sigma$. A type scheme whose denotation is empty (i.e. whose constraint set has no solution) thus represents an ill-typed program.

Definition 14. The denotation $[\![\sigma]\!]$ of a type scheme $\sigma$ is the union of the upper cones generated by its ground instances with respect to $\le$. That is,

\[ [\![\forall C.\,\tau]\!] = \{\tau' \;;\; \exists \rho \vdash C \;\; \rho(\tau) \le \tau'\} \]

A type scheme whose denotation is bigger represents a larger set of possible behaviors; thus, it is more general. This notion allows comparing type schemes, while accounting for polymorphism and subtyping at the same time. It was introduced in [29], where it was written $\le^{\forall}$; we denote it $\preceq$.

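Definition 15. Given two type schemes $\sigma_1$ and $\sigma_2$, the former is said to be more general than the latter iff $[\![\sigma_1]\!] \supseteq [\![\sigma_2]\!]$; we shall then write $\sigma_1 \preceq \sigma_2$.

In other words, $\sigma_1$ is more general than $\sigma_2$ iff for any ground instance of $\sigma_2$, there exists a smaller ground instance of $\sigma_1$. Formally,

\[ (\forall C_1.\,\tau_1) \preceq (\forall C_2.\,\tau_2) \]

is thus equivalent to

\[ \forall \rho_2 \vdash C_2 \;\; \exists \rho_1 \vdash C_1 \;\; \rho_1(\tau_1) \le \rho_2(\tau_2) \]

We write $\sigma_1 \approx \sigma_2$ when $\sigma_1 \preceq \sigma_2$ and $\sigma_2 \preceq \sigma_1$.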

The relation $\approx$ offers a specification of constraint simplification. Indeed, a type scheme $\sigma$ can be simplified into a type scheme $\sigma'$ only if $\sigma \approx \sigma'$. One [...] that is not a requirement; it is rather to be viewed as an implementation detail.

We conclude this section with a definition of what it means for a type scheme to be made up of small terms. All of the algorithms defined here will expect this property to hold, and will preserve it, making it a global invariant. This choice simplifies definitions and proofs. Furthermore, from a practical point of view, it allows enforcing maximum sharing, since it requires every sub-term to be "named" by a type variable, allowing our minimization algorithm to identify sub-terms.

Definition 16. A small term is a constructed type term whose strict sub-terms are type variables. A type scheme $\forall C.\,\tau$ is made up of small terms iff it satisfies the following conditions:

◦ $\tau$ is a type variable;
◦ for all $(\tau_1 \le \tau_2) \in C$, either $\tau_1$ and $\tau_2$ are type variables, or one is a variable and the other is a small term;
◦ for all $(s \le \tau \;?\; \tau_1 \le \tau_2) \in C$, $\tau$, $\tau_1$ and $\tau_2$ are type variables.

Every type scheme can be turned into an equivalent type scheme which is made up of small terms. (In practice, this would be done when converting type schemes input by the user into some internal representation.)
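One possible way of performing this conversion is sketched below in Python; it is an assumption of this illustration, not a description of the reference implementation. Variables are strings, constructed terms are tuples (head, child, ...), and every constructed term is named by a fresh variable that is constrained to equal it through a pair of inequalities. The function name to_small_terms and the fresh-name scheme "_t0", "_t1", ... are hypothetical; user variables are assumed not to clash with these names.

```python
import itertools


def to_small_terms(constraints, body):
    """Return (constraints', variable) such that every output constraint
    relates two variables, or a variable and a small term."""
    fresh = (f"_t{i}" for i in itertools.count())
    out = []

    def name(term):
        if isinstance(term, str):  # already a variable
            return term
        head, *children = term
        small = (head, *[name(child) for child in children])
        v = next(fresh)
        out.append((v, small))  # v <= small
        out.append((small, v))  # small <= v, hence v equals small
        return v

    for lhs, rhs in constraints:
        left, right = name(lhs), name(rhs)
        out.append((left, right))
    return out, name(body)


# The type (a -> b) -> c, with no constraints.
cs, root = to_small_terms([], ("->", ("->", "a", "b"), "c"))
```

After the call, root is a variable, and cs names the inner arrow as well as the outer one, so that structurally equal sub-terms could later be identified by comparing their names' bounds.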

Appendix A.4 Solving Constraints

We begin with a fundamental technical result, which describes a weak, sufficient condition for a constraint set to have a solution. It shall form the basis for the proof of the closure algorithm. We prove a fairly powerful version of this result, allowing ground constants to appear in constraints. (If these constraints were to be written, some finite representation of these constants would be required; however, such is not the case here.) Thanks to this generalization, this result will also form the basis for the proof of the garbage collection algorithm.

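Definition 17. A constraint set with ground constants is a constraint set C, where atomic constraints may involve either two variables, one variable and a small term, or one variable and a ground type, and where conditional constraints have their usual form. Define the assertion $C \Vdash^{+1} \tau_1 \le \tau_2$ to mean

\[ \forall k \ge 0 \;\; \forall \rho \vdash^{\mathrm{pre}}_{k} C \;\; \rho \vdash_{k+1} \tau_1 \le \tau_2 \]

Define $C^{\downarrow}(\alpha) = \{\tau \;;\; \tau \notin V \wedge \tau \le \alpha \in C\}$ and $C^{\uparrow}(\alpha) = \{\tau \;;\; \tau \notin V \wedge \alpha \le \tau \in C\}$. C is said to be weakly closed iff the following conditions are met:

(1) $\alpha \le \beta \in C$ and $\beta \le \gamma \in C$ imply $\alpha \le \gamma \in C$;
(2) $\alpha \le \beta \in C$ and $\tau \in C^{\downarrow}(\alpha)$ imply $\exists \tau' \in C^{\downarrow}(\beta) \;\; C \Vdash^{+1} \tau \le \tau'$;
(3) $\alpha \le \beta \in C$ and $\tau' \in C^{\uparrow}(\beta)$ imply $\exists \tau \in C^{\uparrow}(\alpha) \;\; C \Vdash^{+1} \tau \le \tau'$;
(4) $\tau \in C^{\downarrow}(\alpha)$ and $\tau' \in C^{\uparrow}(\alpha)$ imply $C \Vdash^{+1} \tau \le \tau'$;
(5) $\alpha \le \beta \in C$ and $(s \le \alpha \;?\; c) \in C$ imply $(s \le \beta \;?\; c) \in C$;
(6) $(s \le \alpha \;?\; c) \in C$, $\tau \in C^{\downarrow}(\alpha)$ and $s \le_S \mathrm{hd}(\tau)$ imply $C \Vdash^{\mathrm{pre}} c$.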

$C^{\downarrow}(\alpha)$ contains all lower bounds of $\alpha$ which are not type variables; thus, every $\tau \in C^{\downarrow}(\alpha)$ must be either a small type term, or a ground type. A similar remark holds concerning $C^{\uparrow}(\alpha)$.

Conditions 1 and 5 above are purely syntactic transitivity conditions. Conditions 2 to 4 also involve transitivity, but the use of $\Vdash^{+1}$ allows expressing these conditions in a logical, rather than syntactic, way, making them less restrictive.

Theorem 2. Let C be a constraint set with ground constants. If C is weakly closed, then C has a solution.

Proof. Note that this proof only uses Conditions 2, 4 and 6 of Definition 17. The other conditions shall be required by further theorems, such as the correctness proof of garbage collection.

The first step of the proof consists in exhibiting a ground substitution $\rho$ such that, for all $\alpha \in \mathrm{fv}(C)$, $\rho(\alpha)$ equals $\sqcup\{\rho(\tau) \;;\; \tau \in C^{\downarrow}(\alpha)\}$, and proving that $\rho$ is a pre-solution of C. In fact, this step coincides with the classic proof performed in the absence of conditional constraints [17]; we shall not repeat it here.

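The second step consists in proving that $\rho$ is a full solution of C. Pick a conditional constraint $(s \le \alpha \;?\; c) \in C$. Assume $\rho \vdash s \le \alpha$. By definition of $\rho$, this statement can be written

\[
\begin{array}{rcl}
s \le_S \mathrm{hd}(\rho(\alpha)) & = & \mathrm{hd}(\sqcup\{\rho(\tau) \;;\; \tau \in C^{\downarrow}(\alpha)\}) \\
 & = & \sqcup_S \{\mathrm{hd}(\rho(\tau)) \;;\; \tau \in C^{\downarrow}(\alpha)\} \\
 & = & \sqcup_S \{\mathrm{hd}(\tau) \;;\; \tau \in C^{\downarrow}(\alpha)\}
\end{array}
\]

(The identity $\mathrm{hd}(\rho(\tau)) = \mathrm{hd}(\tau)$ stems from the fact that $\tau$ is either a small term, or a ground constant, with a fixed head constructor.) Considering that s is prime (see Definition 10), this entails $s \le_S \mathrm{hd}(\tau)$ for some $\tau \in C^{\downarrow}(\alpha)$. We can then apply Condition 6 of Definition 17, which yields $C \Vdash^{\mathrm{pre}} c$. Since $\rho$ is a pre-solution of C, this implies $\rho \vdash c$. We have thus verified $\rho \vdash s \le \alpha \;?\; c$, proving that $\rho$ is a solution of C. $\Box$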

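Equipped with this technical result, we are now ready to define a constraint resolution algorithm. It is based on a simple closure computation. We begin by defining an auxiliary constraint decomposition function, which breaks a constraint down into a set of equivalent constraints.

Definition 18. Given types $\tau_1$ and $\tau_2$, $\mathrm{sub}(\tau_1 \le \tau_2)$ is defined as

◦ $\{\tau_1 \le \tau_2\}$, if $\tau_1$ or $\tau_2$ is a variable;
◦ $\{\tau_1.l \le_l \tau_2.l \;;\; l \in \mathrm{dom}(\tau_1) \cap \mathrm{dom}(\tau_2)\}$, if $\tau_1$ and $\tau_2$ are constructed terms such that $\mathrm{hd}(\tau_1) \le_S \mathrm{hd}(\tau_2)$.

Note that $\mathrm{sub}(\tau_1 \le \tau_2)$ is undefined when $\mathrm{hd}(\tau_1) \not\le_S \mathrm{hd}(\tau_2)$; indeed, such a constraint is clearly unsatisfiable.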

Using this auxiliary function, we can now describe the closure conditions:

Definition 19. Let C be a constraint set, made up of small terms. C is said to be closed iff

(1) $\tau \le \alpha \in C$ and $\alpha \le \tau' \in C$ imply $\mathrm{sub}(\tau \le \tau') \subseteq C$;
(2) $\alpha \le \beta \in C$ and $(s \le \alpha \;?\; c) \in C$ imply $(s \le \beta \;?\; c) \in C$;
(3) $(s \le \alpha \;?\; c) \in C$, $\tau \in C^{\downarrow}(\alpha)$ and $s \le_S \mathrm{hd}(\tau)$ imply $c \in C$.

Condition 1 is the classic closure condition, found e.g. in [17]; it involves transitivity and structural decomposition. Condition 2 is a transitivity condition concerning conditional constraints. Condition 3 requires that the conclusion of a conditional constraint whose condition must be satisfied be discharged into the constraint set.

It is easy to check that closure implies weak closure [17]. This yields an algorithm to decide whether a constraint set C has a solution: attempt to compute the smallest closed set containing it, by repeated application of the above three rules. Each rule preserves the set's solution space. So, if the computation succeeds, then C has a solution; if, on the other hand, it fails (because sub is applied outside of its domain), then C has no solution.

Consider a conditional constraint $s \le \alpha \;?\; c$. According to the closure rules above, the atomic constraint c will have no effect on the constraint resolution process until it is discharged by rule 3. That is, c will be ignored until the algorithm discovers some evidence that the condition $s \le \alpha$ must be satisfied. This explains why conditional constraints delay type computations, as mentioned in the body of this paper. The algorithm will not speculate about the consequence of the conditional constraint, should its condition be satisfied; rather, it waits until it has no choice but satisfy c.
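The closure computation can be prototyped in a few lines. The Python sketch below is a naive fixed-point iteration written for illustration; it makes no claim about how the reference implementation [18] proceeds. It assumes a toy signature (a flat symbol lattice with "bot" and "top", and a single binary arrow constructor "->"); variables are strings, small terms are tuples, atomic constraints are pairs (lhs, rhs) and conditional constraints are triples (s, alpha, c). All names (VARIANCE, Unsatisfiable, sub, close) are hypothetical.

```python
VARIANCE = {"->": (-1, +1), "int": (), "bool": (), "bot": (), "top": ()}


class Unsatisfiable(Exception):
    pass


def is_var(t):
    return isinstance(t, str)


def leq_s(s1, s2):
    """Flat symbol lattice: bot below everything, top above everything."""
    return s1 == s2 or s1 == "bot" or s2 == "top"


def sub(lhs, rhs):
    """Decompose lhs <= rhs; fails when the head constructors clash."""
    if is_var(lhs) or is_var(rhs):
        return {(lhs, rhs)}
    if not leq_s(lhs[0], rhs[0]):
        raise Unsatisfiable(f"{lhs[0]} is not below {rhs[0]}")
    if lhs[0] != rhs[0]:
        return set()  # distinct heads share no label in this signature
    return {(a, b) if sign > 0 else (b, a)
            for sign, a, b in zip(VARIANCE[lhs[0]], lhs[1:], rhs[1:])}


def close(atomic, conditional):
    """Saturate the constraint set under the three closure rules."""
    atomic, conditional = set(atomic), set(conditional)
    while True:
        new_atomic, new_conditional = set(), set()
        for lower, mid in atomic:
            if is_var(mid):
                # Rule 1: transitivity through mid, then decomposition.
                for mid2, upper in atomic:
                    if mid2 == mid:
                        new_atomic |= sub(lower, upper)
            for s, alpha, c in conditional:
                if is_var(lower) and is_var(mid) and alpha == lower:
                    # Rule 2: the condition moves up along lower <= mid.
                    new_conditional.add((s, mid, c))
                if not is_var(lower) and alpha == mid and leq_s(s, lower[0]):
                    # Rule 3: the condition must hold; discharge c.
                    new_atomic.add(c)
        if new_atomic <= atomic and new_conditional <= conditional:
            return atomic, conditional
        atomic |= new_atomic
        conditional |= new_conditional
```

A conclusion such as x <= y stays dormant for as long as the tested variable has no lower bound whose head exceeds s; once such a bound appears, the conclusion joins the atomic constraints and may expose an inconsistency, in which case Unsatisfiable is raised.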

Appendix A.5 Polarity

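We now define how to associate a polarity with each type variable in a type scheme whose constraint set is (weakly) closed. This notion will be used in the definition of all three constraint simplification algorithms.

Definition 20. Consider a type scheme $\sigma = \forall C.\,\delta$, made up of small terms, where C is weakly closed. The set of positive variables of $\sigma$, and the set of negative variables of $\sigma$, respectively denoted by $\mathrm{fv}^+(\sigma)$ and $\mathrm{fv}^-(\sigma)$, are the smallest subsets P and N of $\mathrm{fv}(\sigma)$ such that

◦ $\delta \in P$;
◦ $\forall \alpha \in P \;\; \forall \tau \in C^{\downarrow}(\alpha) \;\; \mathrm{split}(\tau) \subseteq (N, P)$;
◦ $\forall \alpha \in N \;\; \forall \tau \in C^{\uparrow}(\alpha) \;\; \mathrm{split}(\tau) \subseteq (P, N)$;
◦ $\forall \alpha \in N \;\; (s \le \alpha \;?\; \beta \le \gamma) \in C \Rightarrow \beta \in P \wedge \gamma \in N$,

where the auxiliary function split maps a small term to an element of $2^V \times 2^V$, as follows:

\[ \mathrm{split}(\tau) = (\{\tau.l \;;\; l \in L^-\},\; \{\tau.l \;;\; l \in L^+\}) \]

A type variable is said to be bipolar if it is positive and negative, and neutral if it is neither. $\mathrm{fv}^+(\sigma)$ and $\mathrm{fv}^-(\sigma)$ can be computed in time linear in the size of $\sigma$, using a simple fix-point calculation. Every type scheme is equivalent to a type scheme with no bipolar variables; we do not prove this result here.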
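The fix-point calculation can be sketched as follows in Python. This is a naive iteration meant to illustrate Definition 20, not the linear-time procedure mentioned above. It assumes a toy signature with a single contravariant/covariant arrow constructor; lower and upper map each variable to its non-variable lower and upper bounds, and conditional constraints are triples (s, alpha, (beta, gamma)). The names VARIANCE, split and polarities are hypothetical.

```python
VARIANCE = {"->": (-1, +1), "int": ()}


def split(term):
    """Children of a small term, as (contravariant ones, covariant ones)."""
    head, *children = term
    signed = list(zip(VARIANCE[head], children))
    negative = {c for sign, c in signed if sign < 0}
    positive = {c for sign, c in signed if sign > 0}
    return negative, positive


def polarities(root, lower, upper, conditional):
    """Least sets (P, N) closed under the four rules of the definition."""
    pos, neg = {root}, set()
    changed = True
    while changed:
        before = (len(pos), len(neg))
        for a in list(pos):
            for t in lower.get(a, ()):
                n, p = split(t)
                neg |= n
                pos |= p
        for a in list(neg):
            for t in upper.get(a, ()):
                n, p = split(t)
                pos |= n
                neg |= p
            for _, tested, (b, c) in conditional:
                if tested == a:
                    pos.add(b)
                    neg.add(c)
        changed = (len(pos), len(neg)) != before
    return pos, neg


pos, neg = polarities(
    "f",
    lower={"f": {("->", "a", "b")}},
    upper={"a": {("->", "x", "y")}},
    conditional=[("int", "a", ("u", "v"))],
)
```

In this example, f and b are positive because f is the scheme's body and b is its result; a is negative as an argument, which in turn makes x positive, y negative, and activates the fourth rule for u and v.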

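Appendix A.6 Garbage Collection

Knowing the polarity of each variable allows us to throw away many redundant constraints, as shown by the following definition and theorem.

Definition 21. Consider $\sigma$ as in Definition 20. The image of $\sigma$ through garbage collection, denoted by $\mathrm{GC}(\sigma)$, is the type scheme $\forall D.\,\delta$, where D is a subset of C defined as follows:

◦ $\alpha \le \beta \in D$ iff $\alpha \le \beta \in C$, $\alpha \in \mathrm{fv}^-(\sigma)$ and $\beta \in \mathrm{fv}^+(\sigma)$;
◦ $D^{\downarrow}(\alpha)$ equals $C^{\downarrow}(\alpha)$ if $\alpha \in \mathrm{fv}^+(\sigma)$, and $\varnothing$ otherwise;
◦ $D^{\uparrow}(\alpha)$ equals $C^{\uparrow}(\alpha)$ if $\alpha \in \mathrm{fv}^-(\sigma)$, and $\varnothing$ otherwise;
◦ $(s \le \alpha \;?\; \beta \le \gamma) \in D$ iff $(s \le \alpha \;?\; \beta \le \gamma) \in C$ and $\alpha \in \mathrm{fv}^-(\sigma)$.

This definition is mostly identical to the one in [17]; only the fourth point is new, and specifies that a conditional constraint is redundant unless it bears on a negative variable. In operational terms, a conditional constraint $s \le \alpha \;?\; \beta \le \gamma$ can be triggered only if $\alpha$ receives a lower bound which exceeds s. Considering that only negative variables can receive new lower bounds in the future, this constraint has no effect unless $\alpha$ is negative.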
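Once polarities are known, Definition 21 amounts to a filter over the constraint set. The Python sketch below assumes that the sets of positive and negative variables have already been computed, and uses the same hypothetical encoding as before: variables are strings, small terms are tuples, atomic constraints are pairs and conditional constraints are triples (s, alpha, c). The name garbage_collect is hypothetical.

```python
def is_var(t):
    return isinstance(t, str)


def garbage_collect(atomic, conditional, pos, neg):
    """Keep only the constraints retained by the four points of the definition."""
    kept_atomic = set()
    for lhs, rhs in atomic:
        if is_var(lhs) and is_var(rhs):
            keep = lhs in neg and rhs in pos  # first point
        elif is_var(rhs):
            keep = rhs in pos                 # lower bound of a positive variable
        else:
            keep = lhs in neg                 # upper bound of a negative variable
        if keep:
            kept_atomic.add((lhs, rhs))
    kept_conditional = {(s, a, c) for s, a, c in conditional if a in neg}
    return kept_atomic, kept_conditional


atomic = {
    ("a", "b"),
    ("b", "a"),
    (("int",), "b"),
    (("int",), "a"),
    ("a", ("int",)),
}
conditional = {("int", "a", ("x", "y")), ("int", "b", ("x", "y"))}
kept_atomic, kept_conditional = garbage_collect(atomic, conditional, pos={"b"}, neg={"a"})
```

With a negative and b positive, the constraints b <= a and int <= a are discarded, as is the conditional constraint bearing on b.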

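Theorem 3. Consider $\sigma$ as in Definition 21. Then $\sigma \approx \mathrm{GC}(\sigma)$.

Proof. Write $\sigma' = \mathrm{GC}(\sigma)$. Since $\sigma'$ has fewer constraints, it is clear that $\sigma' \preceq \sigma$. So, we need to prove $\sigma \preceq \sigma'$. According to Definition 15, this is equivalent to

\[ \forall \rho' \vdash D \;\; \exists \rho \vdash C \;\; \rho(\delta) \le \rho'(\delta) \]

Pick some $\rho' \vdash D$. We now wish to prove that $C \cup \{\delta \le \rho'(\delta)\}$ admits a solution. This is a constraint set with ground constants, as per Definition 17. We shall meet our goal by proving that the following constraint set (a superset of the previous one) is weakly closed: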

C[f 0

();2fv ()^ 2C r

g

[f

0

();2fv +

()^ 2C r

g