
François Pottier

Francois.Pottier@inria.fr

August 25, 2000

Abstract

This paper offers a theoretical study of constraint simplification, a fundamental issue for the designer of a practical type inference system with subtyping.

In the simpler case where constraints are equations, a simple isomorphism between constrained type schemes and finite state automata yields a complete constraint simplification method. Using it as a guide for the intuition, we move on to the case of subtyping, and describe several simplification algorithms. Although no longer complete, they are conceptually simple, efficient, and very effective in practice.

Overall, this paper gives a concise theoretical account of the techniques found at the core of our type inference system. Our study is restricted to the case where constraints are interpreted in a non-structural lattice of regular terms. Nevertheless, we highlight a small number of general ideas, which explain our algorithms at a high level and may be applicable to a variety of other systems.

1 Introduction

1.1 Subtyping and type inference

In a typed programming language, a function application $(e_1\; e_2)$ is legal if and only if there exists a type $\tau_2$ which is both a valid type for the argument $e_2$ and a valid domain type for the function $e_1$.

In the simply-typed $\lambda$-calculus, the set of all valid types of a given (un-annotated) expression $e$ has a very regular structure: it is either empty, or exactly the set of all substitution instances of a most general type. Then, inferring the (most general) type of an expression reduces to solving a set of equations between types [Wan87]. The addition of let-polymorphism, as done in ML [Mil78], essentially preserves this fact.

These systems have type instantiation as their only notion of type compatibility. In particular, they view any two ground types as incompatible unless they are equal. For instance, assume machine integers and floating-point numbers are described by two base types, namely int and real. Then, the application (fact x) is illegal if fact and x have (most general) types int $\to$ int and real, respectively. This is a good point, since it is certainly a programming error. On the other hand, if log and n have (most general) types real $\to$ real and int, respectively, then the application (log n) is deemed illegal as well. Yet, because integers are mathematically a subset of reals, one may actually wish for this term to be accepted.

To overcome this limitation, Mitchell [Mit84] suggests enriching these type systems with subtyping. This involves introducing a partial order $\le$ on types, together with a new typing rule, stating that if $\tau \le \tau'$ holds (read: if $\tau$ is a subtype of $\tau'$) then every expression of type $\tau$ has type $\tau'$ as well. For instance, choosing the strict ordering int $<$ real causes (fact x) to remain ill-typed, while (log n) becomes well-typed, because n : int now implies n : real. Subtyping is not, in general, limited to base types: Cardelli [Car88] equips record types with a natural subtyping relation, allowing information about any number of fields to be discarded. In addition to its intrinsic interest, such a system provides a possible basis for the study of object-oriented languages.

Address: INRIA Rocquencourt, BP 105, 78153 Le Chesnay Cedex, France. This paper is to appear in Information & Computation.

Systems equipped with subtyping have a combination of type instantiation and subtyping as their notion of type compatibility. As a result, the type inference problem no longer reduces to solving a set of equations. Instead, it requires solving a set of inequalities, usually called constraints [Mit84, Pal95]. This process is theoretically straightforward, but costly, because the efficient unification algorithms developed to solve equations [JK90] can no longer be used.

Why, then, should we wish to perform type inference? Would it not be sufficient to require the programmer to supply type annotations, and merely check their consistency? Let us give two reasons why type inference is useful. First, it frees the programmer from the burden of declaring the type of every program variable (a tedious task in many widespread languages) and allows him to naturally write polymorphic code. Second, type inference may be viewed as a simple way of describing program analyses [PO95], whose results may be used, for instance, to drive compiler optimizations.

1.2 Simplification

Our aim, then, is to study the type inference problem in the presence of subtyping, and to compare it with the original problem, where subtyping is reduced to equality.

The constraint system to be solved is the same in both cases; its size is linear in the program size. (Though let-polymorphism may, in fact, cause it to grow exponentially, it is an accepted fact that it "usually" does not.) However, while equations can be solved in quasi-linear time, solving inequalities between (non-atomic) terms typically requires (at least) cubic-time algorithms [AW93, Pal95, MR00]. Thus, an efficiency problem appears.

Every unification problem admits a most general solution. Thus, in the absence of subtyping, every program has a most general type. It is often compact and easily intelligible. On the other hand, many classes of subtyping problems do not have most general solutions. Then, describing the set of all valid types of a given program requires printing the constraint system itself, which often involves many auxiliary type variables. Thus, a readability issue also arises.

To address these problems, it seems necessary to simplify systems of subtyping constraints, i.e. to reduce them to smaller, equivalent systems. This topic has received continued attention in the past few years [Aik94, AF96, AWP96, FFSA98, AFFS98, Fah99, EST95a, TS96, FF96, FF97, Fla97, Pot96, Pot98a, Pot98b, Pot98]. Indeed, designing a reasonable simplification algorithm is not easy. It must be correct and efficient. Ideally, it should also be complete, i.e. produce optimal results. Unfortunately, achieving completeness involves solving the constraint entailment problem, which may be much more complex than constraint solving. In our framework, for instance, entailment has been shown PSPACE-hard, but its decidability is still unsettled [HR98, NP99]. For this reason, practical constraint simplification algorithms are often incomplete.

1.3 Choices

Defining a type system with subtyping involves two main choices. First, one must choose a constraint logic, i.e. define a constraint language and its interpretation within a model. Second, one must define a set of typing rules; the rules reduce the typing problem to a series of assertions expressed within the constraint logic. These two choices are mostly orthogonal, as pointed out by [OSW99].

As far as the first choice is concerned, the array of possibilities is extremely wide. The model may have covariant type constructors only, or it may have contravariant constructors as well. (In the former case, constraint systems may have smallest solutions.) It may or may not have recursive types. (If present, they may give smoother mathematical properties to the model, leading to simpler algorithms.) The model, equipped with the subtype ordering, may or may not form a lattice. (If it does, then more aggressive simplifications become valid.) Types may be interpreted as ideals [MPS86] or as terms. (The former interpretation assigns more precise meanings to union and intersection types. On the other hand, it may be more complex; axioms such as $\bot = c(\bot)$, where $c$ is any unary strict type constructor, make constraint solving more difficult.) When types are interpreted as terms, subtyping may be atomic (i.e. only constant type constructors may be comparable), structural (i.e. only type constructors of identical arity may be comparable), or non-structural (even type constructors with different arities may be comparable). Constraints may be interpreted within a fixed model, or within a family thereof. (If the former, then deeper simplifications are usually possible. On the other hand, user-extensible subtype hierarchies require the latter.)

Changes in the constraint logic greatly affect the complexity of the resolution and entailment problems (as well as the formulation of the corresponding algorithms). For this reason, we will focus on a single case, while hoping that (some of) our methods may be applicable to (some) other logics. More specifically, we choose to interpret types in the fixed model of all regular terms generated by $\bot$, $\to$ and $\top$, with arities 0, 2 and 0, respectively. Subtyping is interpreted in the model by ordering these constructors as given and viewing $\to$ as a contra/co-variant type constructor. This yields a non-structural subtyping relationship, which forms a lattice. Although this case may seem very simple, generalizing it to more elaborate non-structural term lattices is straightforward (see e.g. [Pot00a]) and requires no fundamental changes to the theory or to the algorithms.

The second choice definitely has less impact on the system as a whole. Although many variants have appeared in the literature, most of them are very close in spirit. The idea is to extend the Hindley-Milner type discipline [Mil78] with constraints, while keeping let-polymorphism. Perhaps the most elegant formal exposition of this idea is the system HM(X) by Odersky, Sulzmann et al. [OSW99, SMZ99]. Here, however, we will use a set of typing rules inspired by Trifonov and Smith [TS96], with a few technical modifications to the type inference rules. This somewhat uncommon presentation allows us to deal with closed (i.e. fully universally quantified) type schemes only, making a formal description of constraint simplification, the central topic of the present paper, easier.

1.4 Overview

In this paper, we present a type inference system with subtyping, designed with constraint simplification in mind. Its inference rules are written so as to generate amenable constraint systems. We describe three simplification algorithms, designed to be used in combination with one another; they are simple, efficient and effective. We emphasize the parallel between the case of equality and that of subtyping, and show that these algorithms are based, in both cases, on the same broad ideas. In fact, in the case of equality, their combination yields a complete simplification strategy. Although it is no longer complete in the case of subtyping, we believe it produces good results in practice.

This paper is laid out as follows. Section 2 introduces the necessary theoretical background, namely a set of ground types ordered by subtyping, a core language, a set of typing rules, and a deterministic algorithm, which maps an expression to a constrained type scheme. The constraints thus generated may be viewed, at will, as equations or as subtyping constraints. In the former case, we obtain a type inference system close to that of ML; in the latter, a system equipped with the full power of subtyping. Section 3 studies the simpler one, and suggests a complete simplification method by borrowing concepts from automata theory. This section should help the reader form general intuitions about the structure and behavior of constraints. We hope these ideas are applicable in other contexts; in particular, they may be transferred to our more complex system, which is the topic of section 4. In this section, which forms the theoretical body of the paper, we formally describe and prove several constraint simplification algorithms, based on the same ideas. Section 5 shows these algorithms at work on a simple example. Section 6 reviews related work.

This paper borrows ideas from several existing works. One of its novel aspects is their seamless integration: we describe a clean, simple theory, which leads directly to an efficient implementation [Pot00b]. Another contribution is in the area of presentation. First, thanks to a carefully thought-out mathematical layout, we are able to present our formal results with almost no auxiliary steps, and with substantially smaller proofs than in earlier works. Second, we highlight the similarity of our methods with those applicable in the case of equality constraints; by doing so, we hope to help the reader grasp the essential ideas behind our algorithms. Thus, this paper may constitute a good introduction to the theoretical issues behind constraint-based type inference.

Before beginning our technical exposition, let us recall that the focus of this paper is on constraint simplification. Because of this decision, several issues related to the design of a constraint-based type inference system have been left aside. Among them, one may mention certain fundamental theoretical results, such as type safety; various implementation concerns, including efficiency measurements; extensions of the core language necessary to obtain a full-blown programming language; etc. These issues are discussed at length in [Pot98, Pot98b]. Lastly, we do not address the issue of entailment, i.e. we do not attempt to give an algorithm to decide whether two given type schemes are in the subsumption relation. Indeed, we do not have a need for such an algorithm, because all of the simplification algorithms presented in this paper provably preserve the meaning of their input. Nevertheless, the entailment problem is closely linked to the issue of constraint simplification; we refer the interested reader to [AC93, KPS93, Pot98, HR98, NP99].

2 A constraint-based type inference system

2.1 Ground types

Ground types are the regular trees built with the elementary constructors $\bot$, $\top$ and $\to$. They are the simplest kind of types, since they are (possibly recursive) types without variables. They are monomorphic; polymorphism shall be introduced later by considering type schemes which denote sets of ground types.

Definition 1 Let the ground signature $\Sigma_g$ consist of $\bot$ and $\top$ with arity 0 and $\to$ with arity 2. A path $p$ is a finite string of 0's and 1's, i.e. an element of $\{0,1\}^*$. $\epsilon$ denotes the empty path. The length of a path $p$ is denoted by $|p|$. Its parity $\pi(p)$ is the number of 0's it contains, taken modulo 2. A ground tree $\tau$ is a partial function from paths into $\Sigma_g$, whose domain is non-empty and prefix-closed, and such that $\tau(p0)$ and $\tau(p1)$ are defined iff $\tau(p) = {\to}$. Given $p \in \mathrm{dom}(\tau)$, the subtree of $\tau$ rooted at $p$, written $\tau|_p$, is the tree $q \mapsto \tau(pq)$. A tree is finite iff its domain is finite. A tree is regular iff it has a finite number of subtrees. A ground type is a regular ground tree. We denote the set of ground types by $T$. If $\tau_0$ and $\tau_1$ are trees, $\tau_0 \to \tau_1$ stands for the tree $\tau$ defined by $\tau(\epsilon) = {\to}$, $\tau(0p) = \tau_0(p)$ and $\tau(1p) = \tau_1(p)$.
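As an illustration of Definition 1, here is a minimal Python sketch that represents a finite ground tree as a dictionary from paths (strings over "0" and "1") to constructors. The representation and the names `leaf`, `arrow`, `parity`, `subtree` and `is_ground_tree` are assumptions made for this example only; regular but infinite trees are not covered.

```python
# Constructors of the ground signature (names chosen for this sketch).
BOT, TOP, ARROW = "bot", "top", "->"

def leaf(c):
    """The one-node tree labeled with the constant constructor c."""
    return {"": c}

def arrow(t0, t1):
    """The tree t with t(eps) = ->, t(0p) = t0(p) and t(1p) = t1(p)."""
    tree = {"": ARROW}
    tree.update({"0" + p: c for p, c in t0.items()})
    tree.update({"1" + p: c for p, c in t1.items()})
    return tree

def parity(path):
    """Number of 0's in the path, taken modulo 2."""
    return path.count("0") % 2

def subtree(tree, p):
    """The tree q -> tree(pq), for p in the domain of tree."""
    return {q[len(p):]: c for q, c in tree.items() if q.startswith(p)}

def is_ground_tree(tree):
    """Check the domain conditions of Definition 1 on a finite map."""
    if "" not in tree:
        return False
    for p, c in tree.items():
        if set(p) - {"0", "1"} or c not in (BOT, TOP, ARROW):
            return False
        if p and p[:-1] not in tree:          # prefix-closed domain
            return False
        kids = (p + "0" in tree, p + "1" in tree)
        if kids != ((c == ARROW),) * 2:       # children exist iff label is ->
            return False
    return True
```

For instance, `arrow(leaf(BOT), arrow(leaf(TOP), leaf(BOT)))` stands for $\bot \to (\top \to \bot)$, and the path "10" leads to its $\top$ node.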

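The set of ground types is equipped with a partial order, called subtyping.

Definition 2 A family of orderings $\le_k$ over ground types is defined inductively as follows. First, $\le_0$ is uniformly true. Second, for any $k \in \mathbb{N}^+$, $\tau \le_{k+1} \tau'$ holds iff at least one of the following is true:

- $\tau = \bot$;
- $\tau' = \top$;
- $\exists \tau_0\, \tau_1\, \tau'_0\, \tau'_1 \quad \tau = \tau_0 \to \tau_1$, $\tau' = \tau'_0 \to \tau'_1$, $\tau'_0 \le_k \tau_0$ and $\tau_1 \le_k \tau'_1$.

Subtyping, denoted by $\le$, is the intersection of these orderings.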
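The approximations $\le_k$ of Definition 2 can be sketched in Python as follows. This is a hedged illustration restricted to finite ground types, encoded (as an assumption of this example) as the strings "bot" and "top" or as tuples `("->", domain, codomain)`; the names `leq_k`, `height` and `leq` are hypothetical. For finite types, checking $\le_k$ with $k$ at least the height of both trees is enough to decide $\le$, because the recursion never reaches level 0.

```python
BOT, TOP = "bot", "top"

def arrow(dom, cod):
    """Finite arrow type, encoded as a tagged tuple."""
    return ("->", dom, cod)

def leq_k(k, t, u):
    """The k-th approximation of subtyping; level 0 is uniformly true."""
    if k == 0:
        return True
    if t == BOT or u == TOP:
        return True
    if isinstance(t, tuple) and isinstance(u, tuple):
        # Contravariant in the domain, covariant in the codomain.
        return leq_k(k - 1, u[1], t[1]) and leq_k(k - 1, t[2], u[2])
    return False

def height(t):
    """Number of nodes on the longest branch of a finite type."""
    if isinstance(t, tuple):
        return 1 + max(height(t[1]), height(t[2]))
    return 1

def leq(t, u):
    """Subtyping on finite ground types, decided at a deep enough level."""
    return leq_k(max(height(t), height(u)), t, u)
```

Note that `leq_k(1, arrow(BOT, TOP), arrow(TOP, BOT))` holds although the two types are not in the subtyping relation: level 1 inspects head constructors only.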

$(T, \le)$ forms a lattice. Its operators $\sqcup$ and $\sqcap$ can be defined in several ways, e.g. using automata products, finite approximations or a fix-point theorem. But their definition is of little interest in itself, and we shall be content with the following characterization.

Theorem 1 The set of ground types $T$, equipped with the subtyping relation, is a lattice. We denote its least upper bound and greatest lower bound operators by $\sqcup$ and $\sqcap$, respectively. These operators are of course associative and commutative. In addition, they are characterized by the following identities:

$$\bot \sqcup \tau = \tau \qquad \bot \sqcap \tau = \bot$$
$$\top \sqcup \tau = \top \qquad \top \sqcap \tau = \tau$$
$$(\tau_1 \to \tau_2) \sqcup (\tau'_1 \to \tau'_2) = (\tau_1 \sqcap \tau'_1) \to (\tau_2 \sqcup \tau'_2)$$
$$(\tau_1 \to \tau_2) \sqcap (\tau'_1 \to \tau'_2) = (\tau_1 \sqcup \tau'_1) \to (\tau_2 \sqcap \tau'_2)$$
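On finite ground types, the identities of Theorem 1 can be read directly as a recursive definition. The following Python sketch does so under the same assumed encoding as before (strings "bot" and "top", tuples `("->", domain, codomain)`); the function names `join` and `meet` are chosen for this example, and the general case of regular (possibly infinite) types would need one of the constructions mentioned above.

```python
BOT, TOP = "bot", "top"

def arrow(dom, cod):
    """Finite arrow type, encoded as a tagged tuple."""
    return ("->", dom, cod)

def join(t, u):
    """Least upper bound of two finite ground types."""
    if t == BOT:
        return u
    if u == BOT:
        return t
    if t == TOP or u == TOP:
        return TOP
    # Both are arrows: meet the domains, join the codomains.
    return arrow(meet(t[1], u[1]), join(t[2], u[2]))

def meet(t, u):
    """Greatest lower bound of two finite ground types."""
    if t == TOP:
        return u
    if u == TOP:
        return t
    if t == BOT or u == BOT:
        return BOT
    # Both are arrows: join the domains, meet the codomains.
    return arrow(join(t[1], u[1]), meet(t[2], u[2]))
```

For example, joining $\bot \to \bot$ with $\top \to \top$ yields $\bot \to \top$, while their meet is $\top \to \bot$.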

2.2 Types

Wewillsoondesribeourtypesystem,whihisalogiforderivingtypingjudgmentsabout

programs. Wewishthesystemtoenjoymostgeneraltypings: so,informallyspeaking,the

setofaprogram'sgroundtypesshould beexpressiblewithasingletypingjudgment. That

is, apossibly inniteset of possiblyinnite groundtypesshould be desribed by asingle

logialassertion|whih mustbenite. Toallowthis, wenowintroduetypes,whih may

ontaintypevariables. Usingreursiveonstraintsonvariables,anygivengroundtypean

be nitely desribed; in addition, quantiation over type variables allowsgiving a nite

desription of ertain innite sets of ground types. To sum up, type variables serve two

dierentpurposes: theyenodereursivestruture, andtheyallowpolymorphism.

Definition 3 Let $V$ be a denumerable set of type variables, denoted by $\alpha$, $\beta$, etc. The set of types, denoted by $\mathcal{T}$, is defined by

$$\tau ::= \alpha \mid \bot \mid \top \mid \tau \to \tau$$

A type is said to be constructed iff it is not a variable.

Definition 4 A ground substitution is a total mapping from type variables to ground types. A renaming is a bijection between two subsets of $V$. Ground substitutions and renamings are straightforwardly extended to types.

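Definition 5 The sets of positive and negative free variables of a type $\tau$, respectively denoted by $\mathrm{fv}^+(\tau)$ and $\mathrm{fv}^-(\tau)$, are defined by

$$\mathrm{fv}^+(\alpha) = \{\alpha\} \qquad \mathrm{fv}^-(\alpha) = \emptyset$$
$$\mathrm{fv}^+(\bot) = \emptyset \qquad \mathrm{fv}^-(\bot) = \emptyset$$
$$\mathrm{fv}^+(\top) = \emptyset \qquad \mathrm{fv}^-(\top) = \emptyset$$
$$\mathrm{fv}^+(\tau_0 \to \tau_1) = \mathrm{fv}^-(\tau_0) \cup \mathrm{fv}^+(\tau_1) \qquad \mathrm{fv}^-(\tau_0 \to \tau_1) = \mathrm{fv}^+(\tau_0) \cup \mathrm{fv}^-(\tau_1)$$

The set of free variables of $\tau$, denoted by $\mathrm{fv}(\tau)$, is defined by

$$\mathrm{fv}(\tau) = \mathrm{fv}^+(\tau) \cup \mathrm{fv}^-(\tau)$$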
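The equations defining $\mathrm{fv}^+$ and $\mathrm{fv}^-$ translate into two mutually recursive functions. The sketch below is an illustration only: it assumes types are encoded as strings ("bot", "top", or any other string for a type variable) and tuples `("->", domain, codomain)`, and the names `fv_pos`, `fv_neg` and `fv` are hypothetical.

```python
BOT, TOP = "bot", "top"

def arrow(dom, cod):
    """Arrow type, encoded as a tagged tuple."""
    return ("->", dom, cod)

def fv_pos(t):
    """Variables occurring positively in t."""
    if isinstance(t, str):
        return set() if t in (BOT, TOP) else {t}
    _, dom, cod = t
    return fv_neg(dom) | fv_pos(cod)

def fv_neg(t):
    """Variables occurring negatively in t."""
    if isinstance(t, str):
        return set()
    _, dom, cod = t
    return fv_pos(dom) | fv_neg(cod)

def fv(t):
    """All free variables of t."""
    return fv_pos(t) | fv_neg(t)
```

In $(\alpha \to \beta) \to \alpha$, for instance, both occurrences of $\alpha$ are positive and the occurrence of $\beta$ is negative.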

2.3 Constrained type schemes

Like that of ML, our type system offers let-polymorphism. Thus, typing judgments associate programs not merely with types, but with type schemes.

A constrained type scheme is essentially a type (its body) where variables are allowed to assume arbitrary values, within the limits of certain constraints. Hence, a type scheme represents a set of ground types, which is obtained, roughly speaking, by applying all solutions of the constraints to the body.

Constraint-based type systems have appeared in order to deal with subtyping assumptions in typing judgments. However, they can also describe classic equality-based systems, such as ML itself. For this reason, we will give two variants of our type system: one where constraints are to be interpreted as equations, and one where they truly denote subtyping relationships. The former is of course simpler, but still interesting, because it presents many common points with the latter, especially in the area of constraint simplification, where the same broad concepts apply. Studying it first will allow us to identify methods which generally apply to all constraint-based systems, as opposed to those specific to our interpretation of subtyping.

However, even in the simpler case, our system exhibits a significant departure from ML, because, following Trifonov and Smith [TS96], we choose a formulation where all type schemes are closed, i.e. with no free type variables.

This decision gives rise to a system where type schemes are stand-alone: their meaning does not depend on any external assumptions. (Defining the denotation of a type scheme with free type variables would require supplying an assignment of ground types to these free variables.) It also removes the need to maintain a global constraint set, constraining those variables which are free in the environment, since there are none. Furthermore, we will notice that two distinct branches of a type inference derivation now share no type variables. These properties lead to a simplification, and a better understanding, of the theory, as well as to a more straightforward implementation.

In ML, it is incorrect to generalize over a type variable if it appears free in the environment. So, how can we hope to be able to universally quantify over all variables? The solution is to move the environment into the type scheme itself. This presentation is known as $\lambda$-lifting, for it essentially amounts to pretending that we are dealing solely with closed program terms. Its functioning will be detailed by the typing rules (see section 2.4). More precisely, information concerning let-bound variables remains stored inside an external environment, while information about $\lambda$-bound variables appears in a context which is part of type schemes.

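Definition 6 Assume an ordering $\trianglelefteq$ on ground types, which can be chosen to be $=$ or $\le$.

The forthcoming definitions depend on the choice of $\trianglelefteq$, so we end up defining two variants of the type system, based either on equality or on subtyping.

Definition 8 A ground context is a finite map from $\lambda$-identifiers to ground types. The ordering $\trianglelefteq$ is extended to ground contexts as follows:

$$A \trianglelefteq A' \iff \forall x \in \mathrm{dom}(A') \quad x \in \mathrm{dom}(A) \wedge A(x) \trianglelefteq A'(x)$$

By extension, a pair of a ground context $A$ and a ground type $\tau$, written $A \Rightarrow \tau$, is also called a ground type. The ordering $\trianglelefteq$ is extended to such ground types by setting

$$(A \Rightarrow \tau) \trianglelefteq (A' \Rightarrow \tau') \iff (A' \trianglelefteq A) \wedge (\tau \trianglelefteq \tau')$$

Definition 9 A context $A$ is a finite map from $\lambda$-identifiers to types. If $x \notin \mathrm{dom}(A)$, then $A[x \mapsto \tau]$ is the context which extends $A$ by mapping $x$ to $\tau$. If $x \in \mathrm{dom}(A)$, then $A \setminus x$ is the context which is undefined at $x$ and which coincides with $A$ elsewhere. By extension, a pair of a context $A$ and a type $\tau$, written $A \Rightarrow \tau$, is also called a type. Ground substitutions are extended straightforwardly to contexts and types.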

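Definition 10 A constraint is a pair of types, written $\tau \trianglelefteq \tau'$. A ground substitution $\rho$ is a solution of it iff $\rho(\tau) \trianglelefteq \rho(\tau')$; we then write $\rho \vdash \tau \trianglelefteq \tau'$. When $\trianglelefteq$ stands for $\le$, we say $\rho$ is a $k$-solution of $\tau \le \tau'$ iff $\rho(\tau) \le_k \rho(\tau')$; we then write $\rho \vdash_k \tau \le \tau'$. We write $\rho \vdash C$ (resp. $\rho \vdash_k C$) when $\rho \vdash c$ (resp. $\rho \vdash_k c$) holds for all $c \in C$.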

Definition 11 Type schemes are defined by

$$\sigma ::= A \Rightarrow \tau \mid C$$

where $A$ denotes a context, $\tau$ a type and $C$ a constraint set. (The symbol $\mid$ should be interpreted here as a literal, not as a choice.) Let $\mathrm{fv}(\sigma)$ stand for the set of all type variables which appear in $A$, $\tau$ or $C$. The order of $\sigma$ is $|\mathrm{fv}(\sigma)|$.

Intuitively speaking, all variables of a type scheme are to be considered as universally quantified. However, we shall not write any quantifiers explicitly. Formally speaking, no implicit $\alpha$-conversion is allowed on type schemes; $\alpha$-conversion shall be dealt with explicitly. This decision allows a rigorous description of the way fresh variables and renamings are handled.

We now define the denotation of a type scheme as a set of ground types.

Definition 12 The denotation $[\![\sigma]\!]$ of a type scheme $\sigma$ is the union of the $\trianglelefteq$-upper cones generated by its ground instances. That is,

$$[\![A \Rightarrow \tau \mid C]\!] = \{ A' \Rightarrow \tau' \;;\; \exists \rho \vdash C \quad \rho(A \Rightarrow \tau) \trianglelefteq A' \Rightarrow \tau' \}$$

Informally speaking, a type scheme is simply a way of describing a set of ground types. Thus, its denotation is precisely this set, i.e. the set of ground types which the program would receive in a system without polymorphism. Since subtyping allows weakening a program's ground type, it is natural for a scheme's denotation to be upward closed, hence the use of upper cones in its definition. It is now clear that a type scheme is more general than another one iff it represents a larger set of ground types; thus, subsumption between type schemes is defined as set-theoretic inclusion of their denotations, as follows.

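Definition 13 Given two type schemes $\sigma_1$ and $\sigma_2$, the former is said to be more general than the latter iff $[\![\sigma_1]\!] \supseteq [\![\sigma_2]\!]$; we shall then write $\sigma_1 \preccurlyeq \sigma_2$. In other words, $\sigma_1$ is more general than $\sigma_2$ iff for any ground instance of $\sigma_2$, there exists a ground instance of $\sigma_1$ which is smaller with respect to $\trianglelefteq$. Formally,

$$(A_1 \Rightarrow \tau_1 \mid C_1) \preccurlyeq (A_2 \Rightarrow \tau_2 \mid C_2)$$

is equivalent to

$$\forall \rho_2 \vdash C_2 \quad \exists \rho_1 \vdash C_1 \quad \rho_1(A_1 \Rightarrow \tau_1) \trianglelefteq \rho_2(A_2 \Rightarrow \tau_2)$$

We write $\sigma_1 \approx \sigma_2$ when $\sigma_1 \preccurlyeq \sigma_2$ and $\sigma_2 \preccurlyeq \sigma_1$.

The relation $\preccurlyeq$ was introduced in [TS96], where it is written $\le^\forall$.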

2.4 Typing rules

The language we are interested in is core ML, that is, a $\lambda$-calculus equipped with a let construct. For the sake of simplicity, we separate $\lambda$-bound identifiers from let-bound ones, by placing them in two distinct syntactic classes.

Definition 14 Assume given a denumerable set of let-identifiers, denoted by $X$, $Y$, ... Expressions are defined by

$$e ::= x \mid \lambda x.e \mid e\;e \mid X \mid \text{let } X = e \text{ in } e$$

Definition 15 Environments are defined by

$$\Gamma ::= \emptyset \mid \Gamma; X : \sigma$$

Environment access is defined, as usual, by

$$(\Gamma; X : \sigma)(X) = \sigma \qquad (\Gamma; Y : \sigma)(X) = \Gamma(X) \text{ when } X \neq Y$$

Note that environments contain information about let-bound variables only. Associating types to $\lambda$-bound variables is done inside type schemes, as shown by the typing rules given in figure 1.

Definition 16 An expression $e$ is well typed in an environment $\Gamma$ iff there exists a type scheme $\sigma$, whose denotation is non-empty, such that $\Gamma \vdash e : \sigma$.

Recall that the denotation of a type scheme $A \Rightarrow \tau \mid C$ is non-empty if and only if $C$ admits a solution. Thus, to determine whether a program is well-typed, one must not only build a typing derivation, but also make sure that it yields a solvable constraint set.

Also, recall that the relation $\preccurlyeq$, as well as the notion of denotation, depend on our choice of $\trianglelefteq$. So, there are two variants of this type system, one based on equality, the other based on subtyping.

In this system, one rule is devoted to each syntactic construct; in addition, rule (Sub), called the subsumption rule, allows reformulating the type scheme at any point, with great flexibility. It allows arbitrary $\alpha$-conversions, as well as simplifications of the constraint system.

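These rules aim at simplicity. Still, we expect the unfamiliar reader to wonder why contexts are made part of type schemes. Let us explain. Contexts are part of the $\lambda$-lifting mechanism, which allows us to emulate the behavior of ML, while using universally quantified variables exclusively. But how can we express "monomorphic" types, since all variables must be universally quantified? Here is an example. Consider the expression

$$\lambda x.\,\text{let } Y = x \text{ in } \lambda f.(f\;Y\;Y)$$

Let us type this expression in ML. $Y$'s type is a monomorphic variable $\alpha$. So, the two uses of $Y$ do not involve any instantiation, and the expression's type is $\alpha \to (\alpha \to \alpha \to \beta) \to \beta$. In our system, on the contrary, $Y$'s type is $(x : \alpha) \Rightarrow \alpha$, according to rule (Var). Here, $\alpha$ is (implicitly) universally quantified. So, if one were free to use rule (Sub) to perform renamings, the two uses of $Y$ could yield two distinct schemes $(x : \alpha) \Rightarrow \alpha$ and $(x : \alpha') \Rightarrow \alpha'$. However, the typing rule for function application requires that its two branches share the same context. So, necessarily, $\alpha$ and $\alpha'$ must be the same variable, and the sub-expression $\lambda f.(f\;Y\;Y)$ has type $(x : \alpha) \Rightarrow (\alpha \to \alpha \to \beta) \to \beta$. Once the $\lambda$-abstraction is performed, the whole expression receives type $\alpha \to (\alpha \to \alpha \to \beta) \to \beta$, as expected. To sum up, all variables which appear in the context actually have monomorphic behavior; this is caused by a sharing constraint on contexts, which is enforced whenever two branches of the derivation are brought together. So, we are able to do away with the notion of unquantified type variable; nonetheless, the system is correct, as stated below.

$$\frac{A(x) = \tau}{\Gamma \vdash x : A \Rightarrow \tau \mid C}\ \text{(Var)}$$

$$\frac{\Gamma \vdash e : A[x \mapsto \tau] \Rightarrow \tau' \mid C}{\Gamma \vdash \lambda x.e : A \Rightarrow \tau \to \tau' \mid C}\ \text{(Abs)}$$

$$\frac{\Gamma \vdash e_1 : A \Rightarrow \tau_2 \to \tau \mid C \qquad \Gamma \vdash e_2 : A \Rightarrow \tau_2 \mid C}{\Gamma \vdash e_1\;e_2 : A \Rightarrow \tau \mid C}\ \text{(App)}$$

$$\frac{\Gamma(X) = \sigma}{\Gamma \vdash X : \sigma}\ \text{(LetVar)}$$

$$\frac{\Gamma \vdash e_1 : \sigma_1 \qquad \Gamma; X : \sigma_1 \vdash e_2 : \sigma_2}{\Gamma \vdash \text{let } X = e_1 \text{ in } e_2 : \sigma_2}\ \text{(Let)}$$

$$\frac{\Gamma \vdash e : \sigma \qquad \sigma \preccurlyeq \sigma'}{\Gamma \vdash e : \sigma'}\ \text{(Sub)}$$

Figure 1: Typing rules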

Statement 1 Let $e$ be an expression satisfying the following two conditions:

- each $\lambda$-identifier is bound at most once within $e$;
- if $\text{let } X = e_1 \text{ in } e_2$ is a sub-expression of $e$, then $X$ appears free within $e_2$.

Assume $e$ to be well-typed in the empty environment. Then $e$ is safe with respect to a call-by-value semantics of the language.

The above two conditions are technical. The first one is made necessary by the way we "lift" $\lambda$-binders through let binders; the second one is required to make rule (Let) safe with respect to a call-by-value semantics [TS96]. They are not restrictive, since any expression can be rewritten, without altering its semantics, so as to satisfy them. Indeed, to satisfy the first condition, an appropriate renaming of $\lambda$-bound variables shall do; to fulfill the second one, it suffices to replace the construct $\text{let } X = e_1 \text{ in } e_2$ with $\text{let } X = e_1 \text{ in } (\lambda\_.e_2)\,X$ whenever $X$ does not appear free in $e_2$.

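$$\frac{\alpha, \beta \notin F}{[F]\ \Gamma \vdash_I x : [F \cup \{\alpha, \beta\}]\ (x \mapsto \alpha) \Rightarrow \beta \mid \{\alpha \trianglelefteq \beta\}}\ (\text{Var}_i)$$

$$\frac{[F]\ \Gamma \vdash_I e : [F']\ A \Rightarrow \tau' \mid C \qquad A(x) = \tau \qquad \alpha \notin F'}{[F]\ \Gamma \vdash_I \lambda x.e : [F' \cup \{\alpha\}]\ (A \setminus x) \Rightarrow \alpha \mid C \cup \{\tau \to \tau' \trianglelefteq \alpha\}}\ (\text{Abs}_i)$$

$$\frac{[F]\ \Gamma \vdash_I e : [F']\ A \Rightarrow \tau' \mid C \qquad x \notin \mathrm{dom}(A) \qquad \alpha, \beta \notin F'}{[F]\ \Gamma \vdash_I \lambda x.e : [F' \cup \{\alpha, \beta\}]\ A \Rightarrow \alpha \mid C \cup \{\beta \to \tau' \trianglelefteq \alpha\}}\ (\text{Abs'}_i)$$

$$\frac{\begin{array}{c}
[F]\ \Gamma \vdash_I e_1 : [F']\ A_1 \Rightarrow \tau_1 \mid C_1 \qquad [F']\ \Gamma \vdash_I e_2 : [F'']\ A_2 \Rightarrow \tau_2 \mid C_2 \\
{[F'']}\ A_1 \wedge A_2 = [F''']\ A \mid C_m \qquad \alpha, \beta \notin F''' \\
C = C_1 \cup C_2 \cup C_m \cup \{\beta \trianglelefteq \alpha,\ \tau_1 \trianglelefteq \tau_2 \to \beta\}
\end{array}}{[F]\ \Gamma \vdash_I e_1\;e_2 : [F''' \cup \{\alpha, \beta\}]\ A \Rightarrow \alpha \mid C}\ (\text{App}_i)$$

$$\frac{\Gamma(X) = \sigma \qquad \rho \text{ renaming of } \sigma \qquad \mathrm{rng}(\rho) \cap F = \emptyset}{[F]\ \Gamma \vdash_I X : [F \cup \mathrm{rng}(\rho)]\ \rho(\sigma)}\ (\text{LetVar}_i)$$

$$\frac{[F]\ \Gamma \vdash_I e_1 : [F']\ \sigma_1 \qquad [F']\ \Gamma; X : \sigma_1 \vdash_I e_2 : [F'']\ \sigma_2}{[F]\ \Gamma \vdash_I \text{let } X = e_1 \text{ in } e_2 : [F'']\ \sigma_2}\ (\text{Let}_i)$$

Figure 2: Type inference rules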

The reader may point out that these conditions are not preserved by reduction, which poses a problem when attempting to express a subject reduction property. However, we shall not attempt to prove statement 1 in this paper, because we choose to focus on the issue of constraint simplification. We remove these conditions and give a full subject reduction proof, for the case where $\trianglelefteq$ stands for the subtyping relation, in [Pot98b, Pot98]. Doing so requires a more complex formulation of the type system, which is why we choose simplicity here.

Lastly, one may notice that safety, with respect to any semantics, comes for free in the pure $\lambda$-calculus, since there are no possible execution errors. However, the safety proof given in [Pot98b, Pot98] is not based on this remark, and can be extended to more complex calculi.

2.5 Type inference rules

The typing rules introduced above cannot be directly used to infer an expression's type. First, they are not syntax directed, because of rule (Sub). Second, rule (App) places sharing constraints on its premises: $A$, $\tau_2$ and $C$ appear in both premises. So, we now define a set of type inference rules, which specify a type reconstruction algorithm; they are given in figure 2. The main difference with the typing rules is the disappearance of the subtyping rule, which has been built into the application rule. (The "$[F]$" annotations, although noisy, are trivial; they allow an explicit treatment of fresh variables.)

Rule ($\text{App}_i$) uses the following definition, which describes how contexts are brought together.

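Definition 17 The assertion $[F]\ A_1 \wedge A_2 = [F']\ A \mid C$ stands, by definition, for the following conjunction:

- $\mathrm{dom}(A) = \mathrm{dom}(A_1) \cup \mathrm{dom}(A_2)$;
- $\forall x \in \mathrm{dom}(A_1) \cap \mathrm{dom}(A_2) \quad A(x) \in V \setminus F$;
- $\forall x \in \mathrm{dom}(A_1) \setminus \mathrm{dom}(A_2) \quad A(x) = A_1(x)$;
- $\forall x \in \mathrm{dom}(A_2) \setminus \mathrm{dom}(A_1) \quad A(x) = A_2(x)$;
- $F' = F \cup \{A(x) \;;\; x \in \mathrm{dom}(A_1) \cap \mathrm{dom}(A_2)\}$;
- $C = \{A(x) \trianglelefteq A_i(x) \;;\; x \in \mathrm{dom}(A_1) \cap \mathrm{dom}(A_2),\ i \in \{1, 2\}\}$.

Informally speaking, we say that $A$ is the meet of the two contexts $A_1$ and $A_2$. It is essentially the least demanding context which guarantees that both $A_1$'s and $A_2$'s expectations about the expression's runtime environment are fulfilled.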
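The conjunction of Definition 17 can be read as a small procedure that builds the meet of two contexts. The Python sketch below is an illustration under assumptions of this example: contexts are dictionaries from identifiers to type variable names, a constraint is a pair `(lower, upper)`, and fresh variables are generated as `v0`, `v1`, ... while avoiding the set `used` (which plays the role of $F$). The name `meet_contexts` is hypothetical, and the sketch picks a distinct fresh variable for each shared identifier.

```python
import itertools

def meet_contexts(used, a1, a2):
    """Return (F', A, C) such that [F] A1 ^ A2 = [F'] A | C holds."""
    counter = itertools.count()

    def fresh(avoid):
        # Pick the first name of the form v<n> that is not already taken.
        while True:
            name = f"v{next(counter)}"
            if name not in avoid:
                return name

    f_out = set(used)
    context, constraints = {}, set()
    for x in sorted(set(a1) | set(a2)):
        if x in a1 and x in a2:
            # Shared identifier: a fresh variable below both expectations.
            v = fresh(f_out)
            f_out.add(v)
            context[x] = v
            constraints.add((v, a1[x]))
            constraints.add((v, a2[x]))
        elif x in a1:
            context[x] = a1[x]
        else:
            context[x] = a2[x]
    return f_out, context, constraints
```

With `a1 = {"x": "a", "y": "b"}` and `a2 = {"x": "c"}`, the result keeps `y` mapped to `b` and maps `x` to a fresh variable constrained to lie below both `a` and `c`.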

The type inference rules are sound and complete with respect to the typing rules; that is, they infer a most general type scheme for the expression at hand.

Statement 2 The type inference rules are correct with respect to the typing rules; that is, $[F]\ \Gamma \vdash_I e : [F']\ \sigma$ implies $\Gamma \vdash e : \sigma$.

Statement 3 The type inference rules are complete with respect to the typing rules. That is, if $\Gamma \vdash e : \sigma$ then, for any finite $F \subset V$, there exists a finite $F' \subset V$ and a type scheme $\sigma' \preccurlyeq \sigma$ such that $[F]\ \Gamma \vdash_I e : [F']\ \sigma'$. Furthermore, $\sigma'$ is uniquely determined, up to a renaming, by $\Gamma$ and $e$.

These rules are very close, in spirit, to those of Trifonov and Smith [TS96]. However, we have brought a few subtle, but important modifications, so as to produce type schemes which satisfy a couple of interesting properties. First, any such scheme is made up of small terms only. Second, when $\trianglelefteq$ stands for the subtyping relationship, the scheme contains no bipolar variables. Both properties shall be used throughout the paper to simplify statements and proofs. We prove the former here; the latter is introduced in section 4.2.

Definition 18 A small term is a type term of the form $\bot$, $\top$ or $\alpha_0 \to \alpha_1$, i.e. a term whose strict sub-terms are type variables. A type scheme $A \Rightarrow \tau \mid C$ is made up of small terms iff it satisfies the following conditions:

- for all $x \in \mathrm{dom}(A)$, $A(x)$ is a type variable;
- $\tau$ is a type variable;
- for all $(\tau \trianglelefteq \tau') \in C$, either $\tau$ and $\tau'$ are type variables, or one is a variable and the other is a small term.

Theorem 2 If $[F]\ \Gamma \vdash_I e : [F']\ \sigma$, then $\sigma$ is made up of small terms.

Proof. Straightforward induction on the structure of the type inference derivation. $\Box$

The small terms property allows reasoning about sharing between sub-terms, and is a key requirement in our formulation of minimization (see section 4.5). It is already to be found, for instance, in the theory of unification [Hue76]. Among works more closely related to ours, Aiken and Wimmers [AW92] and Palsberg [Pal95] use a similar convention.

We are done introducing our type inference system, which specifies how to associate a constrained type scheme with a given program. We shall now focus our attention on the main issue of interest here: how to simplify an inferred type scheme, without affecting its meaning. We begin, in this section, with the simpler case where $\trianglelefteq$ is chosen to be $=$, i.e. where constraints are equations.

In common presentations of equality-based type systems, no equations appear; instead, their most general unifiers are computed directly. Here, however, we explicitly deal with equality constraints, so as to highlight the similarity with the more complex case of subtyping constraints.

In this section, and in this section only, we choose to deal with simplified type schemes, of the form $\tau \mid C$. We shall not concern ourselves with contexts, because their presence does not add any difficulty to the simplification issue.

3.1 Preliminaries

Let us begin with a few straightforward facts concerning term automata [KPS93].

Definition 19 A term automaton is a tuple $\mathcal{A} = (Q, q_0, \delta, l)$ where:

- $Q$ is a finite set of states,
- $q_0 \in Q$ is the start state,
- $\delta : Q \times \{0,1\} \to Q$ is a (partial) transition function,
- $l : Q \to \Sigma_g \cup V$ is a labeling function,

such that for any state $q \in Q$ and for any $i \in \{0,1\}$, $\delta(q, i)$ is defined iff $l(q) = {\to}$.

A state $q \in Q$ is said to be free iff its label is a variable, i.e. $l(q) \in V$. The order of $\mathcal{A}$ is the number of its states, i.e. $|Q|$.

A term automaton is essentially a way of representing a type term, possibly recursive and possibly with free type variables. Such a representation is more compact than a classic tree representation, because of its ability to express sharing between nodes.

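Definition 20 Let $\mathcal{A} = (Q, q_0, \delta, l)$ be a term automaton. Extend $\delta$ to a partial function $\hat\delta : Q \times \{0,1\}^* \to Q$. Then, $\mathcal{A}$ describes a function $\tau_{\mathcal{A}}$ from paths into $\Sigma_g \cup V$, defined by $p \mapsto l(\hat\delta(q_0, p))$.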
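Definitions 19 and 20 can be illustrated with a small Python data structure. This is a sketch under assumptions of this example: states are arbitrary hashable values other than `None`, the transition function is a dictionary keyed by `(state, "0")` or `(state, "1")`, and labels are the strings "bot", "top", "->" or a variable name. The class name `TermAutomaton` and its methods are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class TermAutomaton:
    states: set    # Q
    start: object  # q0
    delta: dict    # partial map (state, "0" | "1") -> state
    label: dict    # map state -> "bot" | "top" | "->" | variable name

    def run(self, path):
        """Extended transition function: follow path from the start state."""
        q = self.start
        for i in path:
            q = self.delta.get((q, i))
            if q is None:          # transition undefined: path not in domain
                return None
        return q

    def tree(self, path):
        """The label found at the end of path, or None if undefined."""
        q = self.run(path)
        return None if q is None else self.label[q]

    def order(self):
        """Number of states."""
        return len(self.states)

# A two-state automaton for the recursive term t = a -> t.
rec = TermAutomaton(
    states={0, 1},
    start=0,
    delta={(0, "0"): 1, (0, "1"): 0},
    label={0: "->", 1: "a"},
)
```

The automaton `rec` has order 2, yet it describes an infinite tree: every path of the form $1^n$ leads to an arrow node, and every path $1^n0$ leads to the variable `a`.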

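Rather than viewing an automaton as a type term, possibly containing type variables, we can also choose to view it as a set of ground types.

Definition 21 Let $\mathcal{A}$ be a term automaton. The ground instance of $\mathcal{A}$ through a ground substitution $\rho$ is the ground type $\tau$ defined as follows: for all paths $p$,

- if $\tau_{\mathcal{A}}(p) \in \Sigma_g$, then $\tau(p) = \tau_{\mathcal{A}}(p)$;
- if $\tau_{\mathcal{A}}(p)$ is a type variable $\alpha \in V$, then $\tau|_p = \rho(\alpha)$.

The denotation of a term automaton $\mathcal{A}$ is the set of its ground instances.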

Statement 4 A term automaton's denotation is non-empty.

Statement 5 Two term automata $\mathcal{A}$ and $\mathcal{B}$ have the same denotation iff $\tau_{\mathcal{A}}$ and $\tau_{\mathcal{B}}$ are equal up to a renaming of variables.

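$$\frac{\alpha = e \qquad \alpha = e'}{\alpha = e = e'}\ \text{(Fuse)}$$

$$\frac{e = \top = \top}{e = \top}\ (\text{Decompose}_\top)$$

$$\frac{e = \bot = \bot}{e = \bot}\ (\text{Decompose}_\bot)$$

$$\frac{e = \alpha_0 \to \alpha_1 = \beta_0 \to \beta_1}{e = \alpha_0 \to \alpha_1 \qquad \alpha_0 = \beta_0 \qquad \alpha_1 = \beta_1}\ (\text{Decompose}_\to)$$

Figure 3: Solving multi-equations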

3.2 Simplifying multi-equations

The type inference algorithm generates equations. However, it is best to introduce a more general notion of multi-equation, as is often done in works on unification [Hue76, JK90, Rem92].

Definition 22 A multi-equation is a set of terms $\{\tau_1, \ldots, \tau_n\}$, written $\tau_1 = \cdots = \tau_n$. An equality constraint $\tau_1 = \tau_2$ can be viewed as a multi-equation. The notion of solution is extended straightforwardly to multi-equations and to sets thereof. A multi-equation is made up of small terms iff all of its members are variables or small terms.

In order to determine that a program is well-typed, we need to make sure that its associated type scheme has a non-empty denotation, i.e. that its constraint set has a solution. This is done by applying a set of rewriting rules to the multi-equation set, as follows.

Theorem 3 Consider a type scheme $\sigma = \alpha_0 \mid C$, where $C$ is a multi-equation set, made up of small terms. Rewrite $C$ according to the rules of figure 3, until none applies; let $C'$ denote the result of this process. Then, $C'$ is also made up of small terms, and has the same solutions as $C$. Furthermore,

- if $C'$ contains at least one multi-equation of the form $e = \tau = \tau'$, where neither $\tau$ nor $\tau'$ are variables, then $[\![\sigma]\!]$ is empty;
- otherwise, $C'$ is said to be in canonical form. It can easily be viewed as a term automaton, whose order equals that of $\sigma$, and whose denotation coincides with $[\![\sigma]\!]$. As a corollary, $[\![\sigma]\!]$ is non-empty.

Proof. With an appropriate definition of weight (e.g. give weight 1 to variables, $\bot$ and $\top$, and weight 2 to the $\to$ symbol), it is easy to verify that each rewriting rule causes the total weight of the multi-equation set to decrease. Hence, the process must terminate. Each rewriting rule obviously preserves the solution space, as well as the small terms property.

Assume $C'$ contains a multi-equation with two non-variable terms. Then, these terms must have incompatible head constructors, because none of the decomposition rules in figure 3 applies. So, $C'$ has no solution. On the other hand, assume $C'$ is in canonical form;

[...] rule (Fuse) no longer applies, each variable appears in at most one multi-equation. In each multi-equation, choose a unique representative, equal to its non-variable term when it has one, and to an arbitrary member otherwise. For each $\alpha \in \mathrm{fv}(\sigma)$, let $\mathrm{repr}(\alpha)$ denote the representative of $\alpha$'s multi-equation, if $\alpha$ appears in some multi-equation, and $\alpha$ itself otherwise. Define a term automaton $A = (Q, q_0, \delta, l)$ as follows:

- $Q = \mathrm{fv}(\sigma)$;
- $q_0 = \alpha_0$;
- for $i \in \{0,1\}$, $\delta(\alpha, i) = \beta_i$ when $\mathrm{repr}(\alpha) = \beta_0 \to \beta_1$;
- $l(\alpha) = \mathrm{repr}(\alpha)(\epsilon)$.

It is straightforward to verify that the ground instances of this automaton are exactly those of $\sigma$; hence, its denotation coincides with $[\![\sigma]\!]$. The non-emptiness result stems from statement 4. □
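
The rewriting process of theorem 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: variables are encoded as strings, constructed small terms as tuples (`("top",)`, `("bot",)`, `("->", a, b)`), and the function name `solve` is hypothetical. Because multi-equations are sets, the two constant decomposition rules are applied implicitly when duplicates collapse.

```python
def is_var(t):
    return isinstance(t, str)          # variables are strings; constructed terms are tuples

def solve(multi_equations):
    """Rewrite until no rule applies. Returns the canonical multi-equations
    (as sets), or None when two incompatible constructors meet."""
    pending = [set(m) for m in multi_equations]
    canon = []                         # invariant: no variable occurs in two members
    while pending:
        m = pending.pop()
        # (Fuse): absorb every finished multi-equation sharing a variable with m
        for other in [o for o in canon if any(is_var(t) and t in o for t in m)]:
            canon.remove(other)
            m |= other
        terms = [t for t in m if not is_var(t)]
        if len({t[0] for t in terms}) > 1:
            return None                # incompatible head constructors: no solution
        # (Decompose): keep one constructed term, equate the arguments of the others
        for t in terms[1:]:
            m.discard(t)
            if t[0] == "->":
                pending.append({terms[0][1], t[1]})
                pending.append({terms[0][2], t[2]})
        canon.append(m)
    return [m for m in canon if len(m) > 1]   # drop trivial one-element sets

result = solve([{"a0", "a1"},
                {"a1", ("->", "a2", "a3")},
                {"a0", ("->", "a4", "a5")}])
```

On this input, the two arrow terms end up in the same multi-equation, one of them is dropped, and the equations `a2 = a4` and `a3 = a5` are produced. No occurs check is performed, in keeping with the use of recursive (regular) terms.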

Theorem 3 yields an algorithm to determine whether a type scheme has a non-empty denotation; this makes type inference decidable. However, it also shows that a canonical type scheme can be viewed as a term automaton; we now establish the converse, showing that the two notions are equivalent.

Theorem 4 Let $A$ be a term automaton. Then, there exists a canonical type scheme $\sigma$, of the same order, whose denotation coincides with $A$'s.

Proof. Assume $A = (Q, q_0, \delta, l)$. Choose some injective map $q \in Q \mapsto \alpha_q \in V$. Define a multi-equation set $C$ by

- for each $\alpha \in \mathrm{rng}(l)$, $\{\alpha_q \,;\, l(q) = \alpha\} \in C$;
- for each $q \in Q$ such that $l(q) = \bot$, $\{\alpha_q, \bot\} \in C$;
- for each $q \in Q$ such that $l(q) = \top$, $\{\alpha_q, \top\} \in C$;
- for each $q \in Q$ such that $l(q) = {\to}$, $\{\alpha_q, \alpha_{\delta(q,0)} \to \alpha_{\delta(q,1)}\} \in C$.

Define $\sigma = \alpha_{q_0} \mid C$. It is straightforward to verify that $[\![\sigma]\!]$ coincides with $A$'s denotation. □

The equivalence between canonical type schemes and term automata gives rise to an essential idea: the well-known minimization procedure for finite-state automata carries over to canonical type schemes.

Theorem 5 Let $\sigma$ be a canonical type scheme. Among the canonical type schemes equivalent to $\sigma$, there is one of minimal order, which can be computed in time $O(n \log n)$, where $n$ is the order of $\sigma$.

Proof. Thanks to theorems 3 and 4, we can state the problem in terms of automata. Given an automaton $A$, of order $n$, we must compute an automaton $B$, whose denotation equals that of $A$, and which is minimal for this property. According to statement 5, we can equivalently require $\tau_A = \tau_B$. Hence, the problem simply consists in minimizing the labeled finite state automaton $A$, which can be done in time $O(n \log n)$ [Hop71]. □

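$$
\alpha_0 \quad\text{where}\quad
\begin{array}{ll}
\alpha_0 = \alpha_1 = \alpha_2 \to \alpha_3 & \alpha_5 = \alpha_6 = \top \\
\alpha_3 = \alpha_5 \to \alpha_6 & \alpha_2 = \alpha_4
\end{array}
$$

Figure 4: A sample type scheme, in canonical form

[Diagram: an automaton with states $\alpha_0$ ($\to$), $\alpha_1$ ($\to$), $\alpha_2$ (free), $\alpha_3$ ($\to$), $\alpha_4$ (labeled $\alpha_2$), $\alpha_5$ ($\top$) and $\alpha_6$ ($\top$); each $\to$ state carries two outgoing edges, labeled 0 and 1.]

Figure 5: The same, viewed as an automaton

[Diagram: an automaton with states $\alpha_0$ ($\to$), $\alpha_2$ (free), $\alpha_3$ ($\to$) and $\alpha_5$ ($\top$), with edges labeled 0 and 1.]

Figure 6: The minimized automaton

$$
\alpha_0 \quad\text{where}\quad
\begin{array}{ll}
\alpha_0 = \alpha_2 \to \alpha_3 & \alpha_5 = \top \\
\alpha_3 = \alpha_5 \to \alpha_5 &
\end{array}
$$

Figure 7: The same, viewed again as a type scheme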

[...] a measure of its complexity, in quasi-linear time. Figures 4 to 7 illustrate this procedure. Our starting point is a type scheme whose multi-equation set has been put in canonical form by applying the rules of figure 3. Theorem 3 allows us to view it as an automaton (figure 5), which we then minimize. Minimization is a well-known, two-step process: first eliminate any states not reachable from the start state, then merge equivalent states. In broad terms, two states are equivalent if their labels are equal and if they carry transitions, with equal labels, whose end states are in turn equivalent. This process yields the automaton shown in figure 6. Finally, theorem 4 allows us to turn this automaton back into a type scheme. Of course, thinking in terms of automata allows a simple explanation of the process, but isn't mandatory; the minimization procedure can be described directly in terms of multi-equations, if one so wishes.
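
The two-step process can be sketched in a few lines. The sketch below uses Moore-style partition refinement, which is simpler than the $O(n \log n)$ algorithm cited above but quadratic in the worst case; this substitution is deliberate. The function name `minimize`, the dictionary encoding, and the state names `a0` to `a6` are assumptions; the example automaton is the one obtained by applying the construction of theorem 3 to the scheme of figure 4, as read here.

```python
def minimize(start, label, delta):
    """Remove unreachable states, then merge equivalent ones."""
    # Step 1: keep only the states reachable from the start state.
    reach, stack = {start}, [start]
    while stack:
        q = stack.pop()
        for i in (0, 1):
            t = delta.get((q, i))
            if t is not None and t not in reach:
                reach.add(t)
                stack.append(t)
    # Step 2: start from "same label", then split blocks until stable.
    ids = {}
    block = {q: ids.setdefault(label[q], len(ids)) for q in reach}
    while True:
        ids = {}
        refined = {q: ids.setdefault((block[q],
                                      block.get(delta.get((q, 0))),
                                      block.get(delta.get((q, 1)))), len(ids))
                   for q in reach}
        stable = len(ids) == len(set(block.values()))
        block = refined
        if stable:
            break
    # Quotient: one representative per block (the smallest name, for determinism).
    rep = {}
    for q in sorted(reach):
        rep.setdefault(block[q], q)
    r = lambda q: rep[block[q]]
    return (r(start),
            {r(q): label[q] for q in reach},
            {(r(q), i): r(t) for (q, i), t in delta.items() if q in reach})

label = {"a0": "->", "a1": "->", "a2": "a2", "a3": "->",
         "a4": "a2", "a5": "top", "a6": "top"}
delta = {("a0", 0): "a2", ("a0", 1): "a3", ("a1", 0): "a2", ("a1", 1): "a3",
         ("a3", 0): "a5", ("a3", 1): "a6"}
start, min_label, min_delta = minimize("a0", label, delta)
```

The states `a1` and `a4` disappear because they are unreachable, and `a6` is merged into `a5`, leaving four states.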

How does this procedure compare to the usual resolution process used in ML type inference? An ML typechecker computes the most general solution of the equation set, using unification. This essentially amounts to putting the type scheme in canonical form, by applying the rules of figure 3, then merging all members of a single multi-equation. Our algorithm goes one step further, since variables belonging to different multi-equations can also be merged, provided they stand for equivalent states of the automaton. In fact, our simplification procedure is complete: it yields a type scheme with a minimal number of variables. Since our schemes are made up of small terms, this is a meaningful measure of their complexity.

Theoretically speaking, our decision of working with small terms allows us to easily highlight the isomorphism between type schemes and term automata. More intuitively, one might say that breaking a large type term down into a series of small terms, linked together by equations, essentially amounts to labelling each node of the original term with a type variable. Identifying variables is then tantamount to sharing nodes in the original type term, thus yielding a more compact representation. Of course, a user is likely to prefer a more readable representation, with fewer variables and larger terms; it is easy to revert to such a representation for display purposes. (For instance, the type scheme of figure 7 can be printed as $\alpha_2 \to \top \to \top$.) This is already the case in typical ML implementations, where types are internally represented by directed acyclic graphs, but printed as trees. It is important to carefully distinguish the two representations, since the latter is typically exponentially larger. In other words, an internal representation must favor efficiency; converting to an external representation, which offers better readability, must be delayed until the result is ready to be shown to the user.
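
The size gap between the two representations is easy to observe on a small, artificial example. In the sketch below, the encoding of equations as a dictionary and the function name `tree_size` are assumptions; the scheme is a chain of $n$ equations of the form $\alpha_i = \alpha_{i+1} \to \alpha_{i+1}$, ending with $\alpha_n = \top$.

```python
def tree_size(var, eqs, memo=None):
    """Number of nodes in the tree obtained by fully expanding `var`.
    Only meaningful for acyclic schemes; memoization keeps the count linear-time."""
    memo = {} if memo is None else memo
    if var not in memo:
        term = eqs.get(var)
        if term is None or term in ("top", "bot"):
            memo[var] = 1                       # a leaf: variable or constant
        else:
            dom, cod = term                     # an arrow small term
            memo[var] = 1 + tree_size(dom, eqs, memo) + tree_size(cod, eqs, memo)
    return memo[var]

n = 20
eqs = {f"a{i}": (f"a{i+1}", f"a{i+1}") for i in range(n)}
eqs[f"a{n}"] = "top"

shared_size = len(eqs)               # one small term per variable
printed_size = tree_size("a0", eqs)  # size of the tree a user would see
```

With 21 equations, the printed tree has $2^{21} - 1$ nodes, which is why conversion to the external form is best delayed until display time.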

To conclude, we have studied a complete simplification procedure for constrained type schemes, in the case where constraints are equations. It consists of three main steps: putting the constraints in canonical form, eliminating unreachable variables, and merging equivalent variables. We shall now move on to the case of subtyping, and discover that, although details become more complex, the same broad ideas apply.

4 Simplifying subtyping constraints

4.1 Solving constraints

As in section 3, our first task is to find an algorithm to decide whether a given constraint set has a solution. Indeed, doing so is required to determine whether a program is well-typed. Our goal, in this section, is to describe such an algorithm.

We begin with a fundamental technical result, which describes a weak, sufficient condition for a constraint set to have a solution. It will form the basis for the proof of the constraint solving algorithm. We prove a fairly powerful version of this result, allowing

[...] down these extended constraints would require some finite representation; however, we will not need to do so.) Thanks to this generalization, this result also forms the basis for the proof of the garbage collection algorithm (see section 4.3).

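Definition 23 A constraint set with ground constants is a set $C$ of subtyping constraints of the form $\tau \le \tau'$, where $\tau$ and $\tau'$ are either two variables, one variable and a small term, or one variable and a ground type. Define the assertion $C \Vdash_{+1} \tau \le \tau'$ to mean

$$\forall k \ge 0 \quad \forall \rho \vdash_k C \quad \rho \vdash_{k+1} \tau \le \tau'$$

Define $C^\downarrow(\alpha) = \{\tau \,;\, \tau \notin V \wedge \tau \le \alpha \in C\}$ and $C^\uparrow(\alpha) = \{\tau \,;\, \tau \notin V \wedge \alpha \le \tau \in C\}$. $C$ is said to be weakly closed iff the following conditions are met:

1. $\alpha \le \beta \in C$ and $\beta \le \gamma \in C$ imply $\alpha \le \gamma \in C$;
2. $\alpha \le \beta \in C$ and $\tau \in C^\downarrow(\alpha)$ imply $\exists \tau' \in C^\downarrow(\beta) \quad C \Vdash_{+1} \tau \le \tau'$;
3. $\alpha \le \beta \in C$ and $\tau' \in C^\uparrow(\beta)$ imply $\exists \tau \in C^\uparrow(\alpha) \quad C \Vdash_{+1} \tau \le \tau'$;
4. $\tau \in C^\downarrow(\alpha)$ and $\tau' \in C^\uparrow(\alpha)$ imply $C \Vdash_{+1} \tau \le \tau'$.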

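Theorem 6 Let $C$ be a constraint set with ground constants. If $C$ is weakly closed, then $C$ has a solution.

Proof. Note that this proof only uses conditions 2 and 4 of definition 23. The other conditions shall be required by further theorems, such as theorem 11.

Let $V = \mathrm{fv}(C)$. Consider the set $T^V$ of ground substitutions of domain $V$. We define a map $S$ from $T^V$ into itself by

$$\rho \mapsto \Big(\alpha \mapsto \bigsqcup_{\tau \in C^\downarrow(\alpha)} \rho(\tau)\Big)$$

Assuming $T^V$ is viewed as a metric space, equipped with the usual distance between (tuples of) infinite trees [Cou83], it is easy to verify that $S$ is $\frac{1}{2}$-contractive. Thus, it has a unique fix-point $\rho$.

We shall now verify that $\rho$ is a solution of $C$. This is done by proving that it is a $k$-solution of $C$, for all $k \ge 0$, by induction over $k$. The base case is immediate, since $\le_0$ is uniformly true (see definition 2). It remains to prove, assuming $\rho \vdash_k C$, that $\rho \vdash_{k+1} C$.

Consider a constraint of the form $\alpha \le \beta \in C$. Because $C$ satisfies condition 2 of definition 23, we have $\forall \tau \in C^\downarrow(\alpha) \ \exists \tau' \in C^\downarrow(\beta) \quad C \Vdash_{+1} \tau \le \tau'$. Since $\rho \vdash_k C$, this implies $\forall \tau \in C^\downarrow(\alpha) \ \exists \tau' \in C^\downarrow(\beta) \quad \rho(\tau) \le_{k+1} \rho(\tau')$, which in turn entails $\big(\bigsqcup_{\tau \in C^\downarrow(\alpha)} \rho(\tau)\big) \le_{k+1} \big(\bigsqcup_{\tau' \in C^\downarrow(\beta)} \rho(\tau')\big)$. This statement is none other than $\rho(\alpha) \le_{k+1} \rho(\beta)$.

Next, consider a constraint of the form $\tau \le \alpha \in C$, where $\tau \notin V$. Then, $\tau \in C^\downarrow(\alpha)$. So, by definition of $\rho$, $\rho(\tau) \le \rho(\alpha)$. In particular, $\rho(\tau) \le_{k+1} \rho(\alpha)$.

Finally, consider a constraint of the form $\alpha \le \tau' \in C$, where $\tau' \notin V$. Then, $\tau' \in C^\uparrow(\alpha)$. Pick some $\tau \in C^\downarrow(\alpha)$. Then, condition 4 of definition 23, together with our induction hypothesis, yield $\rho \vdash_{k+1} \tau \le \tau'$, i.e. $\rho(\tau) \le_{k+1} \rho(\tau')$. Since this holds for all $\tau \in C^\downarrow(\alpha)$, we also have $\big(\bigsqcup_{\tau \in C^\downarrow(\alpha)} \rho(\tau)\big) \le_{k+1} \rho(\tau')$, i.e. $\rho(\alpha) \le_{k+1} \rho(\tau')$. This concludes the proof. □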

Theorem 6 is a nice tool to exhibit solutions of a constraint set. However, it is not clear, given an arbitrary constraint set, how it can be put in weakly closed form. So, we shall now define a stronger, but simpler, notion of closure, which can be computed more easily. This is the notion originally proposed by Eifrig, Smith and Trifonov [EST95b].

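[...] members are variables or small terms down into a set of equivalent constraints:

$$
\begin{array}{ll}
\mathrm{sub}(\alpha \le \tau) = \{\alpha \le \tau\} & \mathrm{sub}(\tau \le \alpha) = \{\tau \le \alpha\} \\
\mathrm{sub}(\bot \le \tau) = \varnothing & \mathrm{sub}(\tau \le \top) = \varnothing \\
\multicolumn{2}{l}{\mathrm{sub}(\alpha_0 \to \alpha_1 \le \alpha_0' \to \alpha_1') = \{\alpha_0' \le \alpha_0,\ \alpha_1 \le \alpha_1'\}}
\end{array}
$$

Definition 25 Let $C$ be a constraint set, made up of small terms. $C$ is said to be closed iff whenever $\{\tau \le \alpha,\ \alpha \le \tau'\} \subseteq C$, $\mathrm{sub}(\tau \le \tau')$ is defined and included in $C$. From now on, a type scheme $A \Rightarrow \alpha \mid C$ is said to be closed iff $C$ is closed.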

In plain words, the above definition means that a constraint set is closed iff it is stable through a combination of transitivity and structural decomposition. Let us now verify, as announced, that closure entails weak closure; which means, considering theorem 6, that any closed constraint set admits a solution.

Theorem 7 Any closed constraint set $C$ is weakly closed.

Proof. It is clear that $C$ satisfies condition 1 of definition 23.

Assume $\alpha \le \beta \in C$. Let $\tau \in C^\downarrow(\alpha)$. Because $C$ is closed, $\mathrm{sub}(\tau \le \beta) = \{\tau \le \beta\} \subseteq C$. So, $\tau \in C^\downarrow(\beta)$. This is sufficient to establish condition 2 of definition 23; just pick $\tau' = \tau$. Symmetrically, condition 3 is satisfied.

Now, assume $\tau \in C^\downarrow(\alpha)$ and $\tau' \in C^\uparrow(\alpha)$. Because $C$ is closed, $\mathrm{sub}(\tau \le \tau')$ is defined and part of $C$. Thus, any $k$-solution of $C$ is, in particular, a $k$-solution of $\mathrm{sub}(\tau \le \tau')$. Moreover, considering the definition of sub, it is easy to verify that any $k$-solution of $\mathrm{sub}(\tau \le \tau')$ is a $(k+1)$-solution of $\tau \le \tau'$. Condition 4 of definition 23 ensues. □

To conclude this section, we present an algorithm which puts a given constraint set in closed form, if it has a solution, and fails otherwise. This algorithm is used to determine whether a given program is well-typed. Its poor time complexity, $O(n^3)$, and the size of its output, $O(n^2)$, are among the main reasons why constraint simplification is required.

Theorem 8 Let $C$ be a constraint set, made up of small terms. Let $C^2$ denote

$$C \cup \bigcup_{\{\tau \le \alpha,\ \alpha \le \tau'\} \subseteq C} \mathrm{sub}(\tau \le \tau')$$

If the sequence $C, C^2, C^4, \ldots$ is infinite, then it reaches a fix-point $C^\infty$, which is the smallest closed constraint set containing $C$; its solution space is equal to $C$'s and non-empty. ($C^\infty$ is called the closure of $C$.) Otherwise, $C$ has no solution.

Proof. For an arbitrary $C$, it is clear that $C^2$ is equivalent to $C$ if it is defined, and that $C$ has no solution otherwise (i.e. if sub is applied outside of its domain). Thus, if some element of the sequence is undefined, then $C$ has no solution. Otherwise, the sequence must reach a fix-point $C^\infty$, because any newly created constraint involves existing terms, and there is only a finite number of such constraints. It is clear that $C^\infty$ is the smallest closed set containing $C$. According to theorem 7, $C^\infty$ is also weakly closed; by theorem 6, it admits a solution. □
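
The iteration of theorem 8 can be sketched directly. This is a naive illustration, not the incremental algorithm used in practice; the encoding (variables as strings, constructed small terms as tuples, constraints as pairs `(lhs, rhs)`) and the function names are assumptions. The decomposition function follows the usual reading of sub: variables are left alone, $\bot$ on the left or $\top$ on the right yields nothing, and arrows are contravariant in their domain.

```python
def is_var(t):
    return isinstance(t, str)

def sub(lhs, rhs):
    """Break lhs <= rhs into equivalent constraints; None where sub is undefined."""
    if is_var(lhs) or is_var(rhs):
        return {(lhs, rhs)}
    if lhs[0] == "bot" or rhs[0] == "top":
        return set()
    if lhs[0] == "->" and rhs[0] == "->":
        # contravariant domains, covariant codomains
        return {(rhs[1], lhs[1]), (lhs[2], rhs[2])}
    return None

def closure(constraints):
    """Naive fix-point iteration; returns the closure, or None if unsatisfiable."""
    C = set(constraints)
    while True:
        new = set()
        for (t, a) in C:
            if not is_var(a):
                continue
            for (b, u) in C:
                if b == a:                      # found t <= a and a <= u
                    s = sub(t, u)
                    if s is None:
                        return None             # sub applied outside its domain
                    new |= s
        if new.issubset(C):
            return C
        C |= new

closed = closure({(("->", "d", "e"), "a"), ("a", ("->", "b", "c"))})
```

Here the constraints `d -> e <= a` and `a <= b -> c` meet at `a`, so the closure gains `b <= d` and `e <= c`.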

While building a type inference derivation, we wish to make sure, at every step, that the expression at hand is well-typed, so as to detect errors as soon as possible. So, we must maintain our constraint sets in closed form. This may be done incrementally, taking advantage of the fact that each type inference rule adds a few fresh constraints to a closed constraint set; an incremental algorithm is described in [Pot98, Pot98b]. Of course, if we use such an algorithm, then our simplification algorithms must preserve the closure property; this ensures that we may perform simplifications transparently at any point.

If $\sigma$ is the type scheme associated to an expression $e$, it would be interesting to distinguish the type variables of $\sigma$ which represent an input (i.e. some data expected by the expression $e$) from those which represent an output (i.e. some result supplied by $e$). We shall annotate each type variable with a $-$ sign in the former case, and with a $+$ sign in the latter case. Of course, it is possible for a variable to carry both signs at once; we call such a variable bipolar. Some variables, on the other hand, carry no sign at all; we call those neutral. Thus, we shall associate a pair of Boolean flags, which we call polarity, to each variable. This information will serve to guide all of our simplification algorithms.

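Definition 26 Consider a weakly closed type scheme $\sigma = (A \Rightarrow \alpha \mid C)$, made up of small terms. The set of positive variables of $\sigma$, and the set of negative variables of $\sigma$, respectively denoted by $\mathrm{fv}^+(\sigma)$ and $\mathrm{fv}^-(\sigma)$, are the smallest subsets $P$ and $N$ of $\mathrm{fv}(\sigma)$ such that

$$
\begin{array}{l}
\alpha \in P \\
\mathrm{rng}(A) \subseteq N \\
\forall \beta \in P \quad \mathrm{fv}^+(C^\downarrow(\beta)) \subseteq P \ \wedge\ \mathrm{fv}^-(C^\downarrow(\beta)) \subseteq N \\
\forall \beta \in N \quad \mathrm{fv}^+(C^\uparrow(\beta)) \subseteq N \ \wedge\ \mathrm{fv}^-(C^\uparrow(\beta)) \subseteq P
\end{array}
$$

Polarities may be easily computed as a smallest fix-point. The time required is linear in the size of the constraint set. Indeed, visiting a variable's constructed lower (resp. upper) bounds has to be done at most once, namely when the variable first becomes positive (resp. negative). Thus, each constraint is traversed at most once; whence the result.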
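
A worklist makes the linear-time argument concrete. The sketch below is an illustration under stated assumptions: variables are strings, constructed small terms are tuples such as `("->", dom, cod)`, constraints are pairs `(lhs, rhs)`, and the names `polarities` and `fv_pm` are hypothetical. Each variable's constructed bounds are visited at most once per sign.

```python
def is_var(t):
    return isinstance(t, str)

def fv_pm(term):
    """(positive, negative) variable occurrences of a constructed small term."""
    if term[0] == "->":
        return {term[2]}, {term[1]}   # codomain is positive, domain is negative
    return set(), set()               # top and bot carry no variables

def polarities(body, context_vars, constraints):
    lower, upper = {}, {}             # constructed bounds, indexed by variable
    for lhs, rhs in constraints:
        if is_var(rhs) and not is_var(lhs):
            lower.setdefault(rhs, []).append(lhs)
        if is_var(lhs) and not is_var(rhs):
            upper.setdefault(lhs, []).append(rhs)
    P, N = set(), set()
    todo = [(body, True)] + [(v, False) for v in context_vars]
    while todo:
        v, positive = todo.pop()
        target = P if positive else N
        if v in target:
            continue                  # bounds already visited for this sign
        target.add(v)
        for term in (lower if positive else upper).get(v, []):
            pos, neg = fv_pm(term)
            # lower bounds of a positive variable keep signs;
            # upper bounds of a negative variable flip them
            todo += [(x, positive) for x in pos] + [(x, not positive) for x in neg]
    return P, N

# body f, empty context, constraints: x -> y <= f, x <= y, z <= top
P, N = polarities("f", [], {(("->", "x", "y"), "f"), ("x", "y"), ("z", ("top",))})
```

In this example `f` and `y` come out positive, `x` negative, and `z` neutral, since nothing connects it to the body or the context.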

Trifonov and Smith [TS96] introduced polarities as a refinement of our notion of reachability [Pot96], which would only detect neutral variables, and used them to drive garbage collection (see section 4.3). However, they did not mention certain useful properties of polarities, which we shall now describe.

Intuitively speaking, each positive variable of $\sigma$ represents a piece of data computed by $e$ and accessible as a part of its result. Assume $e$ is placed inside a context $C$, yielding an expression $C[e]$ whose associated scheme is $\sigma'$. $C[e]$'s result might still contain some parts of $e$'s result, meaning that the corresponding variables are still positive in $\sigma'$; others may have been dropped, meaning that the corresponding variables are no longer positive in $\sigma'$. However, any value computed by $e$, but inaccessible through its result, obviously remains inaccessible through $C[e]$'s result; which means that any variables not positive in $\sigma$ cannot become positive in $\sigma'$. An analogous property holds for negative variables. In other words, polarities decrease as one walks down a type inference derivation. This property is formalized by the following theorem.

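Theorem 9 Consider an instance of one of the type inference rules of figure 2, whose output is a type scheme $\sigma$. Pick some $\alpha \in \mathrm{fv}(\sigma)$, and assume $\alpha$ also appears in $\sigma'$, where $\sigma'$ is one of the rule's premises. Then, $\alpha \in \mathrm{fv}^+(\sigma)$ (resp. $\mathrm{fv}^-(\sigma)$) implies $\alpha \in \mathrm{fv}^+(\sigma')$ (resp. $\mathrm{fv}^-(\sigma')$).

Proof. The only non-trivial case is that of rule $(\text{App}_i)$. We use the notations of figure 2. For $i \in \{1,2\}$, let $\sigma_i = (A_i \Rightarrow \alpha_i \mid C_i)$; assume $C_i$ is closed. Define

$$
\begin{array}{l}
P = \mathrm{fv}^+(\sigma_1) \cup \mathrm{fv}^+(\sigma_2) \cup \{\beta\} \\
N = \mathrm{fv}^-(\sigma_1) \cup \mathrm{fv}^-(\sigma_2) \cup \{\alpha\} \cup \mathrm{fv}(A)
\end{array}
$$

We wish to show that $P$ and $N$ are conservative approximations of the polarities in $\sigma$, i.e. that they satisfy the recursive equations of definition 26. However, recall that computing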

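[...] to $C^\infty$, not to $C$ itself; we need some information about $C^\infty$ in order to prove that the equations hold.

Let the assertion $\tau^+$ stand for the conjunction $\mathrm{fv}^+(\tau) \subseteq P \wedge \mathrm{fv}^-(\tau) \subseteq N$. (The assertion $\tau^-$ is defined symmetrically.) Notice that $C_1 \cup C_2$ is closed, because these sets have disjoint domains. Let us call "new" the constraints in $C_m \cup \{\alpha \le \beta,\ \alpha_1 \le \alpha_2 \to \alpha\}$, as well as any constraints arising from the subsequent closure computation. It is straightforward to verify that whenever a small term $\tau$ appears on the left-hand (resp. right-hand) side of a new constraint, then $\tau^+$ (resp. $\tau^-$) holds.

This guarantees that the equations of definition 26, applied to $A \Rightarrow \beta \mid C^\infty$, are satisfied by $P$ and $N$. Because $\mathrm{fv}^+(\sigma)$ and $\mathrm{fv}^-(\sigma)$ are the smallest solutions of these equations, we have $\mathrm{fv}^+(\sigma) \subseteq P$ and $\mathrm{fv}^-(\sigma) \subseteq N$. In particular, $\mathrm{fv}^+(\sigma) \cap \mathrm{fv}(\sigma_i) \subseteq \mathrm{fv}^+(\sigma_i)$ and $\mathrm{fv}^-(\sigma) \cap \mathrm{fv}(\sigma_i) \subseteq \mathrm{fv}^-(\sigma_i)$; which is the desired result. □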

Theorem 9 guarantees that a variable's polarity decreases during its lifetime. As a corollary, if the type inference rules are written so as to never cause a fresh variable to be bipolar (and so they are), then no bipolar variables can ever appear in a type inference derivation.

Theorem 10 Assume $[F]\ \Gamma \vdash_I e : [F']\ \sigma$. If none of the $\Gamma(X)$, for $X \in \mathrm{dom}(\Gamma)$, contains a bipolar variable, then neither does $\sigma$.

Proof. First, we check that whenever a fresh variable is created by one of the type inference rules, it is not bipolar. Consider, for instance, rule $(\text{Var}_i)$. It creates two variables $\alpha$ and $\beta$. The former appears in the context of the type scheme, while the latter appears in its body. Hence, $\alpha$ is negative, and $\beta$ is positive. According to definition 26, polarities can only travel from a variable to a small term, so the constraint $\alpha \le \beta$ does not cause $\alpha$ (resp. $\beta$) to become positive (resp. negative). Note, on the other hand, that in the type scheme $(x \mapsto \alpha) \Rightarrow \alpha$, $\alpha$ is bipolar; splitting $\alpha$ into two variables $\alpha$ and $\beta$, linked by a constraint, is the technical trick which allows us not to create any bipolar variables. Rule $(\text{App}_i)$ contains a similar trick.

Second, theorem 9 tells us that if a variable is bipolar at a certain point, then it must have been so since the moment it was created. According to the previous paragraph, this is impossible; whence the result. □

This result is used to simplify various definitions and proofs, in particular concerning garbage collection and canonization. Of course, we will need to prove that our simplification algorithms also cause polarities to decrease, so we can perform simplifications at any point without breaking this property.

4.3 Garbage collection

Computing the closure of a constraint set typically yields a large number of constraints. Many of them are useful as intermediate steps of the closure computation, but are no longer essential once it is over. More precisely, we shall now show that the only meaningful constraints in a closed scheme $A \Rightarrow \alpha \mid C$ are the following:

- those which link a positive (resp. negative) variable $\beta$ to an element of $C^\downarrow(\beta)$ (resp. $C^\uparrow(\beta)$); they give information about the structure of a piece of data supplied (resp. expected) by the expression;
- those which link a negative variable to a positive one; they represent a possible flow of data from one of the expression's inputs to one of its outputs.

[...] we can simply forget about them; this process, proposed by Trifonov and Smith [TS96], is called garbage collection. Note that all neutral variables are discarded; in our analogy with section 3, garbage collection corresponds to the removal of unreachable nodes in a finite automaton. It does more than that, however, since it also removes certain edges between reachable nodes.

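Definition 27 Consider $\sigma$ as in definition 26. The image of $\sigma$ through garbage collection, denoted by $GC(\sigma)$, is the type scheme $A \Rightarrow \alpha \mid D$, where $D$ is a subset of $C$ defined as follows:

- $\beta \le \gamma \in D$ iff $\beta \le \gamma \in C$, $\beta \in \mathrm{fv}^-(\sigma)$ and $\gamma \in \mathrm{fv}^+(\sigma)$;
- $D^\downarrow(\beta)$ equals $C^\downarrow(\beta)$ if $\beta \in \mathrm{fv}^+(\sigma)$, and $\varnothing$ otherwise;
- $D^\uparrow(\beta)$ equals $C^\uparrow(\beta)$ if $\beta \in \mathrm{fv}^-(\sigma)$, and $\varnothing$ otherwise.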
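
Garbage collection amounts to a single filtering pass over the constraint set, once polarities are known. In the sketch below, the function name `gc`, the encoding of constraints as pairs `(lhs, rhs)` (variables as strings, constructed small terms as tuples), and the way the polarity sets `P` and `N` are passed in are all assumptions made for illustration.

```python
def is_var(t):
    return isinstance(t, str)

def gc(constraints, P, N):
    """Keep only the constraints that definition 27 retains."""
    kept = set()
    for lhs, rhs in constraints:
        if is_var(lhs) and is_var(rhs):
            keep = lhs in N and rhs in P   # flow from an input to an output
        elif is_var(rhs):
            keep = rhs in P                # constructed lower bound of a positive variable
        elif is_var(lhs):
            keep = lhs in N                # constructed upper bound of a negative variable
        else:
            keep = False                   # does not occur in sets made up of small terms
        if keep:
            kept.add((lhs, rhs))
    return kept

C = {("a", "b"), ("b", "c"), ("a", "c"),
     (("->", "x", "y"), "c"), ("x", ("top",)), ("b", ("top",))}
D = gc(C, P={"c", "y"}, N={"a", "x"})
```

The neutral variable `b` disappears entirely: the constraints `a <= b`, `b <= c` and `b <= top` are dropped, while their transitive consequence `a <= c` survives.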

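Theorem 11 Consider $\sigma$ as in definition 27. Then $\sigma$ and $GC(\sigma)$ are equivalent.

Proof. Write $\sigma' = GC(\sigma)$. Since $\sigma'$ has fewer constraints, it is clear that $\sigma' \preccurlyeq \sigma$. So, we need to prove $\sigma \preccurlyeq \sigma'$. According to definition 13, this is equivalent to

$$\forall \rho' \vdash D \quad \exists \rho \vdash C \quad \rho(A \Rightarrow \alpha) \le \rho'(A \Rightarrow \alpha)$$

Pick some $\rho' \vdash D$. We now wish to prove that the following constraint set with ground constants (see definition 23) admits a solution:

$$C \cup \{\alpha \le \rho'(\alpha)\} \cup \{\rho'(A(x)) \le A(x) \,;\, x \in \mathrm{dom}(A)\}$$

We shall do so by proving that the following constraint set (which contains the previous one, according to definition 26) is weakly closed:

$$
\begin{array}{l}
C \cup \{\rho'(\beta) \le \gamma \,;\, \beta \in \mathrm{fv}^-(\sigma) \wedge \beta \le \gamma \in C^r\} \\
\quad\ \cup\ \{\beta \le \rho'(\gamma) \,;\, \gamma \in \mathrm{fv}^+(\sigma) \wedge \beta \le \gamma \in C^r\}
\end{array}
$$

(where $C^r$ denotes the reflexive closure of $C$, i.e. $\beta \le \gamma \in C^r$ iff $\beta = \gamma$ or $\beta \le \gamma \in C$). Let $E$ denote this set.

Because $C$ satisfies condition 1 of definition 23, so does $E$. Using the same property, it is easy to check that $E$ satisfies conditions 2 and 3. There remains to check condition 4. Assume $\tau \in E^\downarrow(\gamma)$ and $\tau' \in E^\uparrow(\gamma)$. Four cases arise, depending on whether $\tau$ and $\tau'$ are small terms or ground terms:

- Both $\tau$ and $\tau'$ are small terms. Then, $\tau \in C^\downarrow(\gamma)$ and $\tau' \in C^\uparrow(\gamma)$. The result is immediate, considering $C$ meets condition 4.

- Both $\tau$ and $\tau'$ are ground terms. Then, according to the definition of $E$, $\tau$ is equal to $\rho'(\beta)$, for some $\beta \in \mathrm{fv}^-(\sigma)$ such that $\beta \le \gamma \in C^r$. Symmetrically, $\tau'$ is of the form $\rho'(\beta')$, for some $\beta' \in \mathrm{fv}^+(\sigma)$ such that $\gamma \le \beta' \in C^r$. Because $C$ satisfies condition 1 of definition 23, $\beta \le \beta' \in C^r$. If $\beta = \beta'$, then $\tau = \tau'$ and the result is immediate. So, we can assume $\beta \le \beta' \in C$. Since $\beta \in \mathrm{fv}^-(\sigma)$ and $\beta' \in \mathrm{fv}^+(\sigma)$, definition 27 specifies that $\beta \le \beta' \in D$. Since $\rho' \vdash D$, $\rho'(\beta) \le \rho'(\beta')$; that is, $\tau \le \tau'$ holds.

- $\tau$ is a small term and $\tau'$ is a ground term. As before, $\tau'$ is of the form $\rho'(\beta')$, for some $\beta' \in \mathrm{fv}^+(\sigma)$ such that $\gamma \le \beta' \in C^r$. On the other hand, we must have $\tau \in C^\downarrow(\gamma)$. If $\gamma \le \beta' \in C$, considering that $C$ satisfies condition 2 of definition 23, there exists a small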

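term $\tau'' \in C^\downarrow(\beta')$ such that $C \Vdash_{+1} \tau \le \tau''$. If, on the other hand, $\gamma = \beta'$, then the same holds (simply pick $\tau'' = \tau$). Pick some $\rho \vdash_k E$. We then have $\rho(\tau) \le_{k+1} \rho(\tau'')$. Furthermore, because $\beta' \in \mathrm{fv}^+(\sigma)$, definition 27 specifies that $\tau'' \in D^\downarrow(\beta')$. Since $\rho' \vdash D$, this entails $\rho'(\tau'') \le \rho'(\beta')$. Now, we need to reason by cases on the structure of $\tau''$:

  - Assume $\tau''$ is of the form $\delta_0 \to \delta_1$. Since $\beta' \in \mathrm{fv}^+(\sigma)$, definition 26 specifies that $\delta_1 \in \mathrm{fv}^+(\sigma)$ and $\delta_0 \in \mathrm{fv}^-(\sigma)$. According to the definition of $E$, $\delta_1 \le \rho'(\delta_1) \in E$. Since $\rho \vdash_k E$, this implies $\rho(\delta_1) \le_k \rho'(\delta_1)$. Symmetrically, $\rho'(\delta_0) \le_k \rho(\delta_0)$. As a consequence, $\rho(\delta_0 \to \delta_1) \le_{k+1} \rho'(\delta_0 \to \delta_1)$. In other words, $\rho(\tau'') \le_{k+1} \rho'(\tau'')$.

  - Assume $\tau''$ is equal to $\bot$ or $\top$. Then, the same holds, i.e. $\rho(\tau'') \le_{k+1} \rho'(\tau'')$.

  We can now combine, by transitivity, the three results obtained above:

  $$\rho(\tau) \le_{k+1} \rho(\tau'') \le_{k+1} \rho'(\tau'') \le \rho'(\beta')$$

  This implies $\rho(\tau) \le_{k+1} \rho'(\beta')$. That is, $\rho \vdash_{k+1} \tau \le \tau'$, which is the desired result.

- The last case is symmetrical to the previous one. □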

It is easy to check that garbage collection preserves polarities. Furthermore, provided bipolar variables are disallowed, its output is closed, as stated below. This important remark was missing from [TS96].

Theorem 12 Consider $\sigma$ as in definition 27. If $\mathrm{fv}^+(\sigma) \cap \mathrm{fv}^-(\sigma) = \varnothing$, then $GC(\sigma)$ is closed.

Proof. Write $GC(\sigma) = A \Rightarrow \alpha \mid D$, as in definition 27. As per definition 25, assume $\{\tau \le \beta,\ \beta \le \tau'\} \subseteq D$. Then, $\beta \in \mathrm{fv}^+(\sigma)$, because it appears on the right-hand side of a constraint in $D$. Symmetrically, $\beta \in \mathrm{fv}^-(\sigma)$. This is impossible, by hypothesis, so $D$ is (vacuously) closed. □

4.4 Canonization

In section 3, in order to view a multi-equation system as a finite state automaton, we required it to be in canonical form, i.e. to equate each variable with at most one non-variable term. Similarly, in the case of subtyping, we say that a constraint set is in canonical form iff each variable has exactly one non-variable lower (resp. upper) bound. We shall require this property before we attempt to minimize constraint sets. In this section, we give an algorithm, called canonization, which computes a canonical form of an arbitrary constraint set.

Definition 28 Let $\sigma = A \Rightarrow \alpha \mid C$ be a type scheme, made up of small terms, containing no bipolar variables, such that $\sigma = GC(\sigma)$.

Let $V$ (resp. $W$) range over non-empty subsets of $\mathrm{fv}^-(\sigma)$ (resp. $\mathrm{fv}^+(\sigma)$). For each such $V$ (resp. $W$) of cardinality greater than 1, pick a fresh variable $\gamma_V$ (resp. $\lambda_W$). (By fresh variables, we mean that these variables are pairwise distinct, and distinct from $\sigma$'s variables.) Define the rewriting function $r^-$ (resp. $r^+$) according to figure 8. The first two lines define $r^-$ (resp. $r^+$) on non-empty sets of negative (resp. positive) variables; they are then extended to sets of negative (resp. positive) small terms.

The image of $\sigma$ through canonization, denoted by $\mathrm{Can}(\sigma)$, is $A \Rightarrow \alpha \mid D$, where the constraint set $D$ is given by figure 9. It is clear that $\mathrm{Can}(\sigma)$ is in canonical form.

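$$
\begin{array}{ll}
r^+(\{\alpha\}) = \alpha & r^-(\{\alpha\}) = \alpha \\
r^+(W) = \lambda_W \ \text{when}\ |W| > 1 & r^-(V) = \gamma_V \ \text{when}\ |V| > 1 \\
r^+(\{\bot\} \cup S) = r^+(S) & r^-(\{\top\} \cup S) = r^-(S) \\
r^+(\{\top\} \cup S) = \top & r^-(\{\bot\} \cup S) = \bot \\
r^+(\varnothing) = \bot & r^-(\varnothing) = \top \\
\multicolumn{2}{l}{r^+(\{\alpha_1 \to \beta_1, \ldots, \alpha_n \to \beta_n\}) = r^-(\{\alpha_1, \ldots, \alpha_n\}) \to r^+(\{\beta_1, \ldots, \beta_n\})} \\
\multicolumn{2}{l}{r^-(\{\alpha_1 \to \beta_1, \ldots, \alpha_n \to \beta_n\}) = r^+(\{\alpha_1, \ldots, \alpha_n\}) \to r^-(\{\beta_1, \ldots, \beta_n\})}
\end{array}
$$

Figure 8: Definition of the rewriting functions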
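
The two rewriting functions are symmetric, so a single function parameterized by a sign can sketch both. Everything about the encoding below is an assumption: variables are strings, constructed small terms are tuples, and the helper `fresh` stands in for the choice of a fresh variable indexed by a set, using a made-up naming scheme. The sketch follows the reading of figure 8 in which $r^+$ combines lower bounds and $r^-$ combines upper bounds.

```python
BOT, TOP = ("bot",), ("top",)

def is_var(t):
    return isinstance(t, str)

def fresh(sign, variables):
    # Hypothetical naming scheme for the fresh variable indexed by a set of variables.
    return ("lam" if sign > 0 else "gam") + "{" + ",".join(sorted(variables)) + "}"

def r(sign, terms):
    """sign = +1 computes r+, sign = -1 computes r-."""
    S = set(terms)
    unit, absorbing = (BOT, TOP) if sign > 0 else (TOP, BOT)
    if absorbing in S:
        return absorbing              # e.g. r+({top} U S) = top
    S.discard(unit)                   # e.g. r+({bot} U S) = r+(S)
    if not S:
        return unit                   # e.g. r+(empty set) = bot
    if all(is_var(t) for t in S):
        return next(iter(S)) if len(S) == 1 else fresh(sign, S)
    assert all(not is_var(t) and t[0] == "->" for t in S), "expected arrows only"
    # the domain flips the sign, the codomain keeps it
    return ("->", r(-sign, [t[1] for t in S]), r(sign, [t[2] for t in S]))

bound = r(+1, [("->", "a", "b"), ("->", "c", "d"), BOT])
```

In this example the two arrows are merged into one whose domain is the fresh variable standing for the set `{a, c}` and whose codomain is the fresh variable standing for `{b, d}`. Since `fresh` is deterministic, the same set always yields the same name, which mimics creating each fresh variable only once.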

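$$
\begin{array}{ll}
\multicolumn{2}{l}{r^-(V) \le r^+(W) \in D \ \text{ iff } \ \exists \alpha \in V \ \ \exists \beta \in W \quad \alpha \le \beta \in C} \\
D^\downarrow(\alpha) = \{r^+(C^\downarrow(\alpha))\} & D^\uparrow(\alpha) = \{r^-(C^\uparrow(\alpha))\} \\
D^\downarrow(\gamma_V) = \{\bot\} & D^\uparrow(\lambda_W) = \{\top\} \\
D^\downarrow(\lambda_W) = \{r^+(\bigcup_{\alpha \in W} C^\downarrow(\alpha))\} & D^\uparrow(\gamma_V) = \{r^-(\bigcup_{\alpha \in V} C^\uparrow(\alpha))\}
\end{array}
$$

Figure 9: Canonization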

[...] upper bounds and greatest lower bounds of existing variables. For instance, "$\alpha \sqcup \beta$" may be represented by a fresh variable $\lambda_{\{\alpha,\beta\}}$, together with the constraints $\alpha \le \lambda_{\{\alpha,\beta\}}$ and $\beta \le \lambda_{\{\alpha,\beta\}}$. A straightforward definition of canonization, based on this principle, is given by Trifonov and Smith [TS96]. However, it involves intermediate closure computations, which generate many superfluous constraints. For instance, $\alpha$ and $\beta$ above must be positive, because the least upper bound expressions which arise during canonization always involve positive variables. Since there are no bipolar variables, $\alpha$ and $\beta$ cannot be negative. So, the fresh constraints $\alpha \le \lambda_{\{\alpha,\beta\}}$ and $\beta \le \lambda_{\{\alpha,\beta\}}$ shall be removed by the next garbage collection pass. In between, though, these constraints will take part in a closure computation, and their transitive consequences may survive garbage collection. Rather than going through the process of adding superfluous constraints as part of canonization, performing a closure computation, and then eliminating them, we give a more detailed description of canonization, whose output is provably closed, and which does not generate these unnecessary constraints, thus saving time.

For the sake of simplicity, our definition creates an exponential number of fresh variables. Of course, an implementation shall create a fresh $\gamma_V$ or $\lambda_W$ only on demand, i.e. when it appears in the constructed bound of an existing variable, which may be an original variable $\alpha$, or may itself be a $\gamma$ or a $\lambda$.

Considering our strong hypotheses on $\sigma$, it is easy to prove that $\mathrm{Can}(\sigma)$ is closed. Furthermore, we may prove that existing variables see their polarity decrease during canonization. These results mean that we may apply canonization transparently at any point of the type inference process, while still performing incremental closure computations, and relying on the assumption that no bipolar variables exist. They are proved below.

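Theorem 13 Consider $\sigma$ as in definition 28. Then, $\mathrm{Can}(\sigma)$ is closed. Furthermore,

$$
\begin{array}{l}
\mathrm{fv}^+(\mathrm{Can}(\sigma)) \subseteq \{\lambda_W\} \cup \mathrm{fv}^+(\sigma) \\
\mathrm{fv}^-(\mathrm{Can}(\sigma)) \subseteq \{\gamma_V\} \cup \mathrm{fv}^-(\sigma)
\end{array}
$$

As a corollary, there are no bipolar variables in $\mathrm{Can}(\sigma)$.

Proof. We first verify that two constraints of $D$ involving variables can never be combined by transitivity. It suffices to notice that $r^-(V)$ can never be equal to $r^+(W)$, because the former is of the form $\gamma_V$ or $\alpha \in \mathrm{fv}^-(\sigma)$, while the latter is of the form $\lambda_W$ or $\alpha \in \mathrm{fv}^+(\sigma)$. Since $\sigma$ has no bipolar variables, $\mathrm{fv}^+(\sigma) \cap \mathrm{fv}^-(\sigma) = \varnothing$.

To fully verify the requirement of definition 25, it essentially suffices to further notice that $\sigma = GC(\sigma)$. This implies that for all $\alpha \in \mathrm{fv}^+(\sigma)$ (resp. $\alpha \in \mathrm{fv}^-(\sigma)$), $C^\uparrow(\alpha)$ (resp. $C^\downarrow(\alpha)$) is empty; which implies $D^\uparrow(\alpha) = \{\top\}$ (resp. $D^\downarrow(\alpha) = \{\bot\}$). The desired property follows easily; thus, $\mathrm{Can}(\sigma)$ is closed.

This result allows us to compute polarities. We verify that $\{\lambda_W\} \cup \mathrm{fv}^+(\sigma)$ and $\{\gamma_V\} \cup \mathrm{fv}^-(\sigma)$ satisfy the fix-point equations of definition 26, applied to $\mathrm{Can}(\sigma)$. To do so, it suffices to notice that a $\lambda_W$ never appears in negative (resp. positive) position in a non-variable lower (resp. upper) bound (a symmetric result holds of $\gamma_V$) and that any $\alpha \in \mathrm{fv}(\sigma)$ appears in fewer positions than in $C$. □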

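We are now ready to prove the correctness of the canonization algorithm.

Theorem 14 Consider $\sigma$ as in definition 28. Then $\sigma$ and $\mathrm{Can}(\sigma)$ are equivalent.

Proof. Let us use the notations of definition 28. We first show that $\mathrm{Can}(\sigma) \preccurlyeq \sigma$, i.e.

$$\forall \rho \vdash C \quad \exists \rho' \vdash D \quad \rho'(A \Rightarrow \alpha) \le \rho(A \Rightarrow \alpha)$$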