Typing Non-Functional Constructs

5.2 Issues in Typing Non-Functional Objects

In this section, we discuss the issues involved in inferring sound typings for non-functional constructs such as I-structures in Id or the ref construct in ML. We informally describe some extensions to the Hindley/Milner type system that have been proposed in the literature to handle these constructs.

5.2.1 Store and Store Typings

The basic conceptual shift in going from a functional view of the world to an imperative one is to expand the notion of expressions manipulating values to that of expressions manipulating objects. In the imperative view, there is a well-defined notion of an instance of an object separate from its contents or value. This corresponds to a location in the run-time store where this object may be found. This gives rise to a sharp difference between the functional typing mechanism, where we assign a type to every value (or an expression that evaluates to a value), and the imperative world, where we have to assign a type to an object (or an expression that evaluates to an object) which may not yet have a value associated with it. The problem with objects whose values are unknown or updatable is that all values (or other objects) ever stored into a given object must be of the same type. Otherwise, we may easily generate unsound typings and run into run-time type errors. The following example from Tofte's thesis [57] (chapter 3, page 31) illustrates this problem.

let r = ref (λx.x)
in (r := λx.x+1; (!r) true)

Here, ref creates an assignable storage location that is initialized to the identity function, but is later overwritten with the successor function. Therefore its later use ("!" is the dereferencing operator) in an application to the boolean true is a type error. But this may go undetected if we assign the type $\forall\alpha.\ \alpha \to (\alpha\ \mathit{ref})$ to ref and allow generalization of the type $(\alpha \to \alpha)\ \mathit{ref}$ assigned to r in the usual manner in a let-binding.

The problem is that the location created by the use of ref is a single object and can have at most one type. Assigning types to address locations in the store is what Tofte calls a store typing. He goes on to show that generalization over type variables that are free in the current store typing gives rise to unsound type inference and should not be permitted.
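As a minimal illustration of what a store typing records, the following Python sketch models a run-time store in which every location carries exactly one type. The names Store, alloc, assign and deref, and the use of strings as type tags, are assumptions made for this illustration; they are not part of Tofte's formal development.

```python
class Store:
    """A toy run-time store paired with a store typing (location -> type tag)."""

    def __init__(self):
        self.values = {}   # location -> current contents
        self.typing = {}   # location -> the single type the location may hold

    def alloc(self, value, ty):
        """Allocate a fresh location; its type is fixed at creation."""
        loc = len(self.values)
        self.values[loc] = value
        self.typing[loc] = ty
        return loc

    def assign(self, loc, value, ty):
        """Update a location, insisting that the new contents keep its type."""
        if self.typing[loc] != ty:
            raise TypeError(f"location {loc} holds {self.typing[loc]}, not {ty}")
        self.values[loc] = value

    def deref(self, loc):
        return self.values[loc]


store = Store()
# r = ref (λx.x), with the type already fixed by the later use (!r) true
r = store.alloc(lambda x: x, "bool -> bool")
try:
    # r := λx.x+1 needs int -> int, which the same location cannot also have
    store.assign(r, lambda x: x + 1, "int -> int")
    rejected = False
except TypeError:
    rejected = True
```

Generalizing the type of r would amount to letting the single entry in the typing table stand for both bool -> bool and int -> int at once, which is exactly what the store typing rules out.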

5.2.2 Approximating the Store Typing

The difficulty in preventing such generalizations is that the store and the store typing come into existence only at run-time. So we need a conservative approximation of the store typing for the purpose of static type analysis in order to figure out what type variables may be free in the store typing.

The type checker must also know when new type variables are introduced in the store typing (and possibly, when some type variables may be removed from it). This means that there has to be some indication that evaluating a particular expression will cause allocation of new objects, possibly with free type variables embedded in their type, and therefore may involve adding new type variables to the store typing. For example, the evaluation of ref itself does not expand the store; it is the evaluation of its application to another expression that creates a new reference and has to be recorded in the store typing. Again, we will have to make a conservative approximation of the actual run-time behaviour of the expression.

Different authors have taken different positions in this matter. We will outline three schemes by means of examples and describe our position in the next section.

Approximation I

Tofte [57] makes a broad syntactic classification of expressions into two categories: applications and let-expressions, termed expansive, and identifiers and $\lambda$-abstractions, termed non-expansive. The idea is that only expansive expressions can ever lead to allocation of an object, which in turn implies an expansion of the store typing. Moreover, he classifies type variables into two categories as well: imperative (meta-variable $u$) and applicative (meta-variable $t$) type variables. Storage allocating functions bind and return imperative type variables, while the usual functional definitions use applicative type variables. Thus, the built-in storage allocator in ML, ref, has the type $\forall u.\ u \to (u\ \mathit{ref})$. Using the above extension to the Hindley/Milner typing framework, Tofte derives typings of the usual form,

$$TE \vdash e : \tau$$

The key idea is that while typing an expression "let $x = e_1$ in $e_2$" using the LET-rule, the imperative type variables in the type of $e_1$ that would otherwise be closed are not allowed to be generalized if the expression $e_1$ is expansive. This is the conservative guess Tofte makes in order to recognize a type variable entering the store typing. Thus, if $e_1$ is the identifier ref or a $\lambda$-abstraction with imperative type variables in its type signature (possibly due to occurrences of ref inside it), it can still be generalized since those expressions are taken to be non-expansive, but an application of ref to another expression, or its use in another let-expression, cannot be generalized since it is deemed expansive.

Coming back to the problematic example above, the type of the let-bound identifier r will be inferred as $(u \to u)\ \mathit{ref}$ and will not be generalized, since the application ref (λx.x) in the binding is rightly believed to be expansive. Therefore, the first statement will unify the type of the location r to be (int → int) ref, and the system will detect the type error in the second expression while applying the contents of r to a bool.
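The syntactic test and its effect on the LET-rule can be sketched in a few lines of Python. This is only an illustration of the classification described above, not Tofte's inference algorithm: the expression classes and the function names is_expansive and generalizable are hypothetical, and the set of closable type variables is assumed to have been computed elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


@dataclass(frozen=True)
class Let:
    name: str
    bound: object
    body: object


def is_expansive(expr):
    """Syntactic test: applications and let-expressions are expansive."""
    return isinstance(expr, (App, Let))


def generalizable(expr, closable_vars):
    """Type variables the LET-rule may quantify for a binding `let x = expr`.

    `closable_vars` maps each type variable that is not free in the type
    environment to its kind, "imperative" or "applicative".
    """
    if is_expansive(expr):
        # Imperative variables may have entered the store typing: keep them free.
        return {v for v, kind in closable_vars.items() if kind == "applicative"}
    return set(closable_vars)


identity = Lam("x", Var("x"))
binding = App(Var("ref"), identity)   # ref (λx.x) : (u -> u) ref
```

With these definitions, generalizable(binding, {"u": "imperative"}) is empty, so r stays monomorphic, whereas the bare identifier Var("ref") may still have u generalized.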

Approximation II

Damas, in his thesis [19], takes a slightly different position⁷. He generates typings of the form,

$$TE \vdash e : \sigma \ast \Delta$$

where $\sigma$ is a type-scheme and $\Delta$ is the set of types of all objects to which new references may be created as a side effect of evaluating $e$. This is his approximation for the set of types that may occur in the store typing. We will call this set the immediate store types. In addition, each function type-scheme carries a list of types $\Delta'$ that denotes an upper bound for the types of objects to which new references will be created as a side effect of applying that function. We will call these types future store types. At each application of such a function, fresh instances of its future store types are added to the set of immediate store types, signifying that new objects are allocated in the store when this function is invoked, expanding the corresponding store typing. $\lambda$-abstractions simply convert the immediate store types of their body into future store types, again denoting the fact that the definition itself does not create any new objects but its application will. Finally, generalization over type variables occurring in the immediate store types is prohibited, which guarantees the soundness of the type inference system.

One immediate consequence of this simple two-level partitioning is that all embedded future store types in curried $\lambda$-abstractions are pushed up to the first $\lambda$-abstraction. Thus, the very first application will cause all of them to be added to the set of immediate store types, even if the actual allocation of the storage object does not occur until further applications. We will come back to this point later.

In Damas' system, ref is typed as,

$$TE \vdash \mathit{ref} : (\forall\alpha.\ \alpha \to \alpha\ \mathit{ref} \ast \{\alpha\}) \ast \{\} \qquad (5.2)$$

The immediate store types are empty, denoting the fact that merely mentioning ref does not allocate any new references, but the future store types are non-empty, signifying that each application of ref will indeed cause expansion of the store and the store typing.
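The bookkeeping of immediate and future store types can be approximated by a small Python sketch. It is a deliberately simplified model, not Damas' system: it tracks symbolic labels instead of real types, it does not create fresh instances at applications, it treats λ-bound parameters as having no future store types, and it drops the future store types of an application's result, so it is not a sound inference algorithm for higher-order programs. All class and function names, and the label "a" standing for the type variable of ref, are assumptions made for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lit:
    value: object


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


@dataclass(frozen=True)
class Let:
    name: str
    bound: object
    body: object


EMPTY = frozenset()


def store_types(expr, env):
    """Return (immediate, future) store-type labels for `expr`.

    `env` maps an identifier to the future store types of the function it names.
    """
    if isinstance(expr, Lit):
        return EMPTY, EMPTY
    if isinstance(expr, Var):
        return EMPTY, env.get(expr.name, EMPTY)
    if isinstance(expr, Lam):
        imm, fut = store_types(expr.body, {**env, expr.param: EMPTY})
        # A definition allocates nothing; everything is deferred to application.
        return EMPTY, imm | fut
    if isinstance(expr, App):
        imm_f, fut_f = store_types(expr.fn, env)
        imm_a, _ = store_types(expr.arg, env)
        # Applying a function turns its future store types into immediate ones.
        return imm_f | imm_a | fut_f, EMPTY
    if isinstance(expr, Let):
        imm_b, fut_b = store_types(expr.bound, env)
        imm_e, fut_e = store_types(expr.body, {**env, expr.name: fut_b})
        return imm_b | imm_e, fut_e
    raise TypeError(f"unknown expression: {expr!r}")


ENV = {"ref": frozenset({"a"})}   # mirrors typing (5.2): no immediate, future {a}

# ref (λx.x): the application makes the label immediate, so no generalization
ref_id = App(Var("ref"), Lam("x", Var("x")))
# let x = 1 in λy.!(ref y): allocation is still suspended under the λ
f_def = Let("x", Lit(1), Lam("y", App(Var("!"), App(Var("ref"), Var("y")))))
# λa.λb.ref b: the nested future store types are pushed up to the first λ
curried = Lam("a", Lam("b", App(Var("ref"), Var("b"))))
```

In this model, f_def has no immediate store types and may be generalized, while a single application of curried already moves its label into the immediate set, which is the coarseness of the two-level partitioning noted above.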

⁷ Our account of Damas' solution is based on its lucid exposition by Tofte in [57], chapter 6.

Damas' system makes finer distinctions among expressions in deciding which expressions allocate objects and when that allocation occurs than Tofte's system, which uses a simple syntactic classification to achieve that. So it should not surprise us if Damas' system admits programs that Tofte's system rejects. The following example shows this.

let f = let x = 1 in λy.!(ref y)
in f 1, f true

Tofte's system does not generalize over the inferred type $(u \to u)$ for f because the inner let-expression is deemed to be expansive. In Damas' system, by contrast, the $\lambda$-abstraction returned by the inner let-expression does not have any immediate store types but has a non-empty future store type list. This rightly permits multiple instantiations in the following tuple expression, creating two different reference locations that store 1 and true respectively.

Approximation III

Even though Damas' system attempts to predict and propagate information about the point of allocation of store objects that may be hidden under $\lambda$-abstractions, it performs poorly with curried function definitions. It knows only about a single level of $\lambda$-nesting that hides future store types from immediate store types. It does not distinguish between curried functions where the store object is created after the very first application and those where it is created in subsequent applications. The somewhat contrived Id example shown below makes this distinction clear.

def store a i v = { a[i] = v;

The definitions f1, f2, and f3 all have the same type. The difference lies in the time of allocation of their I-structure array cells. In the case of f1, this allocation has already occurred at the time of definition of f1, and the corresponding type variable *0 has already entered the store typing. In the case of f2 and f3, the allocation occurs after one and two applications respectively, and until that happens, each of them can be instantiated over different types without the danger of unsound type inference, since each instantiation will eventually generate a fresh array cell.

Following an idea due to David MacQueen that is currently used in Standard ML of New Jersey [2, 3], we may add this flexibility to Damas' system as follows. Each type variable inside a type expression is associated with a natural number called its rank. This identifies the depth of $\lambda$-nesting of the given type variable after which it will enter the set of immediate store types. This is exactly the number of $\lambda$-abstractions suspending the creation of a storage object in the associated expression. For each application of the current expression, the rank of its type variables is reduced by 1, and those with rank equal to the current level of $\lambda$-nesting are added to the set of immediate store types. This gives much finer-grained control over which type variables may enter the store typing and when. In this scheme, an applicative type variable has infinite rank.

This system effectively generalizes the concept of future store types of Damas' system to arbitrary levels of $\lambda$-nesting. Unfortunately, very little documentation is available about its implementation, and to our knowledge, an in-depth theoretical analysis has not yet been carried out.

From the document An Incremental Type Inference System for the Programming Language Id (pages 158-163)