A Hoare Logic for Call-by-Value Functional Programs

(1)

A Hoare Logic for Call-by-Value Functional Programs

Yann R´egis-Gianas¹ and Fran¸cois Pottier²

1 INRIA Saclay - ˆIle-de-France, ProVal, Orsay, F-91893 LRI, Universit´e Paris-Sud, CNRS, Orsay, F-91405

2 INRIA Paris - Rocquencourt, Gallium - Domaine de Voluceau - F-78153

Abstract. We present a Hoare logic for a call-by-value programming language equipped with recursive, higher-order functions, algebraic data types, and a polymorphic type system in the style of Hindley and Milner.

It is the theoretical basis for a tool that extracts proof obligations out of programs annotated with logical assertions. These proof obligations, expressed in a typed, higher-order logic, are discharged using off-the- shelf automated or interactive theorem provers. Although the technical apparatus that we exploit is by now standard, its application to call- by-value functional programming languages appears to be new, and (we claim) deserves attention. As a sample application, we check the partial correctness of a balanced binary search tree implementation.

1 Introduction

Hoare logic [1–3] is a discipline for annotating programs with logical formulae, known as assertions, and for extracting logical formulae, known as proof obligations, out of such annotated programs. The validity of the proof obligations, which can be verified either manually or mechanically, entails the correctness of the annotated program. That is, it guarantees that the assertions are correct static predictions of the program’s dynamic behavior.

Hoare logic was originally designed for a “while language”, that is, a simple imperative programming language, equipped with an iteration construct and a fixed number of global, mutable variables. Recursive, higher-order procedures were the subject of much attention in the late 1970’s and early 1980’s [4–8].

More recently, heap-allocated, mutable data structures, as well as object-oriented features, have been deeply investigated. This has led to the development of practical specification languages and tools targeting, for instance, Java [9–11], C [12] and C# [13].

We would like to put forth the thesis that this traditional focus on imperative programming languages has been, to some extent, detrimental: it has consumed a great amount of energy, while comparatively little effort was being devoted to the key features that will be required in order for the methodology to scale up, such as modularity and abstraction. We would also like to raise a question:

since functional programs are significantly easier to check for correctness, why hasn’t this activity become routine in the functional programming community, forty years after Floyd and Hoare’s seminal papers?

(2)

On the cost of imperative programming There are several reasons why functional programming can be considered superior to imperative programming [14]. One of them is that functional programs are easier to reason about. In other words, there is a cost to reasoning about state.

In a typical modern imperative programming language, all heap-allocated data is mutable. As a result, instead of reasoning in terms of high-level entities such as, say, pairs, lists, trees, etc., programmers are forced to reason in terms of a view of the heap as a graph. More concretely, they must write down and prove formulae that involve mappings of memory addresses to memory blocks [15, 12].

The possibility of aliasing means that, whenever some memory block is written, the memory that is accessible through every type-compatible pointer is potentially affected. This makes it difficult to reason about the effects of a single write operation, and creates the problem of representation exposure [16, 17]. In order to address this issue, researchers have developed linear types and regions [18], ownership types [19], and separation logic [20], among other approaches.

Our research agenda We do not claim that the above issues are not worth in- vestigating: on the contrary, they are quite fascinating. However, it is a pity that we do not, today, have mature tools for checking the correctness of functional programs. This explains why, in this paper, we study a Hoare logic for (call-by-value) functional programs without state.

The programs that we are interested in checking rely heavily on (possibly higher-order) functions, algebraic data structures, and type polymorphism. We claim that it is quite easy to extract succinct and natural proof obligations out of such programs, provided, of course, that they are annotated with specifications.

There are two benefits to be reaped by not reasoning about state. As far as the user is concerned, this leads to simpler specifications and proof obligations.

As far as the implementor is concerned, this saves a large part of the “implementation budget”, which can then be spent on features such as type polymorphism, type abstraction, and modularity. The importance of these features cannot be overstated: in the end, the key to success is the ability to develop and check program components independently.

Contribution In this paper, we present the design of a typed, polymorphic, higher-order programming language, where programs can be annotated with assertions expressed in a typed, polymorphic, higher-order logic. We define a procedure for extracting proof obligations out of programs, and show that it is sound. A publicly available prototype tool [21] has been developed, which works in conjunction with the interactive theorem prover Coq [22], with the automated first-order theorem prover Alt-Ergo [23], or with both at once. This tool has been used to check the partial correctness of several non-trivial data structure implementations, including balanced binary search trees and purely functional double-ended queues [24]. We hope to publish detailed accounts of these implementations in the future.

(3)

Highlights of our approach Here are some of the key technical features of our approach.

We focus on partial correctness. We do not require programs to terminate, and do not generate proof obligations to ensure termination. It is up to the user to determine which properties of the code are of sufficient interest to deserve proof, and to insert assertions where desired. At one extreme, a program that contains no assertions leads to no proof obligations. There is no cost to be paid up front for using our methodology.

Our preconditions are prescriptive: it is impossible to call a function unless its precondition F1 holds. A descriptive interpretation of preconditions can be simulated by using the preconditiontrueand the postcondition F1⇒F2. This allows unconditional invocation, and states that the function’s result must satisfy F2if its argument satisfiesF1.

Values, programs, types, and logical formulae are distinct syntactic cate- gories. Proofs do not necessarily appear within programs: proof obligations are delegated to an external theorem prover, which may or may not require or produce explicit proof terms.

We do not embed values, programs, or formulae within types. Thus, our types are first-order terms: they include type variables, parameterized algebraic data types, and function types, just as in ML. As a result, type inference in the style of Milner [25] is possible, and implemented in our tool [21]. Type inference does not generate any proof obligations. We do not have dependent types, such as lists indexed with an integer length [26], but simulate them as follows. Instead of declaring thatxhas typelistn, we declare thatxhas typelist, and assert the logical formula length(x) =n, where the function length is inductively defined at the logical level.

Formulae can refer to values, but not to expressions. This is important, because values are pure, whereas expressions are potentially impure. Although our logic cannot explicitly reason about state, it is nevertheless soundly applicable to programs that involve non-termination, non-determinism, input/output, or mutable state. (Reading an input stream, or dereferencing a pointer to mutable storage, can be viewed as non-deterministic operations.) In that case, it allows establishing properties that do not depend on the behavior of any impure operation. This means, for instance, that we can prove the partial correctness of a functional program even if it has been instrumented with possibly impure debugging, profiling, or logging instructions.

In our programming language, functions, which are potentially impure, are values, so they can appear within formulae. But what does it mean for a formula to refer to a computational function f of type, say, τ¹ −→^. τ²? Our answer is to view f, at the logical level, as a pair of predicates, which represent f’s precondition and postcondition. In other words, when used within a formula,f has type (roughly) (τ1→prop)×(τ1→τ2→prop). The two pair projections, written pre and post, can be used to refer to the pair components. That is, pre(f) andpost(f) offer lightweight notations for referring tof’s precondition and postcondition. Whenfis a known (let-bound) function, this mechanism can

(4)

be viewed merely as offering abbreviations for known formulae. However, when f is unknown (λ- or∀-bound), it becomes key to writing natural specifications for higher-order functions (§7.5).

In summary, although the technical apparatus that we exploit is by now standard, we believe that it is worth drawing attention to the combination of power and simplicity offered by our technical choices. If extended with a suitable module system, and equipped with a compilation path down to, say, Objective Caml [27], our tool could be used to construct correct purely functional program components, possibly for use within larger, partly imperative programs.

Outline of the paper The paper is laid out as follows. First, we briefly introduce a higher-order logic, in which assertions and proof obligations are expressed (§2).

Then, we present the syntax and call-by-value semantics of a core functional programming language whose expressions carry explicit assertions (§3). We describe the type system, as well as the procedure for extracting proof obligations out of programs (§4). We present a few extensions of the language (§5) and dis- cuss how proof obligations are transformed for submission to external theorem provers (§6). Last, we present a few excerpts of our balanced binary search tree implementation (§7) and review related work (§8).

2 The underlying logic

2.1 Syntax

We rely on a mostly standard higher-order logic [28] whose types and terms appear in Figure 1. Types θ include type variables α, parameterized inductive types, function types, product types, and the typepropof logical propositions.

In the following, the syntax of terms is extended with standard syntactic sugar for falsity, disjunction, implication, equivalence, existential quantification, etc.

The typing rules appear in Figure 2. In general, we write t for terms of arbitrary type. We write F for formulae, that is, terms of type prop, and P for predicates, that is, terms of type θ → prop. The binary operator #, used in several definitions, expresses the fact that two objects have no common free names.

Our logic is not simply-typed. Because our computational language (§3) is polymorphic, and because we wish to lift every computational value up to the logical level, we need polymorphism at the logical level as well. For this reason, we have logical type schemes ς ::= ∀α. θ, where ¯¯ α is a vector of distinct type variables. Every occurrence of a variablexis explicitly applied to a type vector ¯θ, which states how the type scheme associated withxis instantiated. For this reason also, we introduce universal quantification over type variables, and usefacts of the form∀α.F¯ . Facts are not formulae: they do not appear in Figure 1. Facts appear only within computational-level type environments Γ (§3.1, Figure 3).

The extension of higher-order logic with this very simple form of explicit quantification over types is embedded within the Calculus of Inductive Constructions (§2.2).

(5)

Logical Types

θ ::=α Variable

|dθ¯ Data

|θ→θ Function

|θ×θ Product

|prop Proposition

ς ::=∀¯α. θ Scheme

Logical Type Environments

∆ ::=∅ Nil

|∆,(x:ς) Variable

|∆,α¯ Type Variables

Logical Terms

t, F, P ::=xθ¯ Variable

|Dθ¯(t, . . . , t) Data

|λ(x:θ).t Abstraction

|t(t) Application

|(t, t) Product

|π₁ Projection (also writtenpre)

|π₂ Projection (also writtenpost)

|true Truth

|t=t Equality

|t∧t Conjunction

| ¬t Negation

| ∀(x:θ).t Universal Quantification Fig. 1.The logic (syntax)

The logic offers parameterized inductive types. We assume that each inductive type constructor dcarries a fixed integer arity, and that every application dθ¯is arity-consistent. We further assume thatdcomes with a finite number of data constructorsD, each of which is assigned a type scheme of the form:

∀α. θ¯ ¹×. . .×θⁿ →dα¯

We impose a positivity condition [29], which is informally summed up as follows:

in the above type scheme, the type constructord(or any type constructor whose definition is mutually recursive with the definition ofd) must not appear under the left-hand side of an arrow withinθ1, . . . , θⁿ.

Although there is an introduction form for inductive types, namely the application of a data constructorD, no elimination form is provided here. We can get away with this omission because the process ofextracting proof obligations, which is the focus of the present paper, requires no such forms. Of course, when it comes todischargingproof obligations, that is, proving theorems, then inductive definitions and proofs become necessary.

(6)

(∆,(x:ς))(x) =ς

(∆,(x₁:ς))(x₂) =∆(x₂) ifx₁#x₂ (∆,α)(x) =¯ ∆(x) if ¯α#∆(x)

∆(x) =∀¯α. θ

∆⊢xθ¯: [ ¯α7→θ]θ¯

D:∀¯α. θ₁×. . .×θn→dα¯

∀i ∆⊢ti: [ ¯α7→θ]θ¯ i

∆⊢Dθ¯(t1, . . . , tⁿ) :dθ¯

∆,(x:θ₁)⊢t:θ₂

∆⊢λ(x:θ1).t:θ1→θ2

∆⊢t1:θ1→θ2

∆⊢t₂:θ₁

∆⊢t₁(t₂) :θ₂

∀i ∆⊢ti:θi

∆⊢(t₁, t₂) :θ₁×θ₂

∆⊢t:θ₁×θ₂

∆⊢πi(t) :θi ∆⊢true:prop

∀i ∆⊢ti:θ

∆⊢t1=t2:prop

∀i ∆⊢ti:prop

∆⊢t1∧t2:prop

∆⊢t:prop

∆⊢ ¬t:prop

∆,(x:θ)⊢t:prop

∆⊢ ∀(x:θ).t:prop

∆,α¯⊢t:prop

∆⊢ ∀¯α.t:prop

Fig. 2.The logic (type system)

2.2 Interpretation

Our higher-order logic is embedded within the Calculus of Inductive Construc- tions [29, 30], abbreviated to CiC in the sequel. Indeed, each type of our logic can be translated into a term of CiC whose type is Type0. This guarantees that the translation of polymorphic quantification only introduces type variables of type Type0 in CiC. Each construct of our logic is directly mapped to its coun- terpart in CiC. This interpretation guarantees that our logic is consistent and validates a number of laws that are used in establishing the soundness of our system (§4.8).

3 The computational language

3.1 Syntax

The syntax of our programming language appears in Figure 3. It is equipped with an ML-style type system [25], so typesτ and type schemesσare distinguished.

Types include type variables, parameterized algebraic data types, and function types. We write −→^. for the computational function type constructor, so as to distinguish it from the logical function type constructor, written→(Figure 1).

We impose a syntactic separation between values and expressions, and require both operands of the function application operator, as well ascase scrutinees, to be values. This imposes a style, reminiscent ofA-normal form [31], where the result of every intermediate computation is named via aletconstruct. Of course, such a style is quite user-unfriendly, so, in practice, we offer an unrestricted

(7)

Computational Types

τ ::=α Variable

|dτ¯ Data

|τ −→^. τ Function

σ ::=∀¯α. τ Scheme

Computational Type Environments

Γ ::=∅ Nil

|Γ,(x:σ) Variable

|Γ,α¯ Type Variables

|Γ,∀α.F¯ Assumption

Values

v ::=x¯τ Variable

|Dτ¯(v, . . . , v) Data

|funf(x:τ /F) : (x:τ /F) = e Recursive Function

Patterns

p ::=x¯τ Variable

|Dτ¯(p, . . . , p) Data

Expressions

e ::=v Value

|v(v) Function Application

|let(xα¯:τ /F) =eine Local Binding

|casevofc Pattern Matching

Cases

c ::=∅ Nil

|(p7→e)8c Cons

Fig. 3.The computation language (syntax)

surface language, and automatically translate it down to the kernel language described here.

The language supports type inference in the style of Hindley and Milner.

However, in this paper, we are not concerned with type inference, so we work with explicitly-typed programs. This is visible (i) in the syntax of values and patterns, where variables and data constructors are annotated with vectors of types that indicate how polymorphic type schemes are instantiated, (ii) at fun andletconstructs, where bound variables are annotated with types, and (iii) at letconstructs, where a vector of type variables ¯αcan be explicitly bound.

A function definition takes the general form:

funf(x1:τ1/F1) : (x2:τ2/F2) = e

(8)

The symbol / should be read “where”. Every function is recursive, so that f is bound within e. The formal parameterx1 is bound within the precondition F1, within the postconditionF2, and within e. The variable x2, which stands for the result of the function, is bound within the postconditionF2. We require every function to be annotated with an explicit precondition and postcondition (if missing,trueis assumed).

A local variable definition takes the general form:

let(xα¯:τ /F) =e1ine2

The local variablexis bound withinF and withine2. The type variables ¯αare bound withinτ,F, and e1. The propositionF serves as a postcondition fore1. If it is missing, a default postcondition is assumed, whose definition is deferred to§3.3.

A case analysis takes the general form:

casevofc

Here,cis a possibly empty sequence of cases (i.e., branches). Each branch is of the form (p7→e), where the variables that appear in the pattern pare bound withine. Patterns must be linear, that is, a pattern cannot bind a variable twice.

3.2 Lifting computational entities to the logical level

In a Hoare logic, formulae refer to values. That is, ifxis bound, at the computational level, by afun,let, orcaseconstruct, then it is possible for a formulaF, embedded in the code within the scope ofx, to refer tox. This raises two questions: first, if x has computational type τ, what is its logical type, to be used when typecheckingF? Second, if, for the purposes of evaluation,xis substituted with a computational valuev, what is the corresponding logical value, to be used when interpretingF?

The problem of lifting types and values to the logical level is trivial in a first-order language. Indeed, the type algebra only contains basic types which are translated to type constants (intis mapped toint). Besides, computational values are essentially first-order terms, interpreted as data in the logic. Yet, in an higher-order language, functions are first-class values. What should be the logical reflection of their code ?

We answer these questions by lifting both computational types and computational values up to the logical level (Figure 4). That is, to each computational typeτ, we associate a logical type ⌈τ⌉, and to each computational valuev, we associate a logical term ⌈v⌉, with the intended property that if v has computational type τ, then⌈v⌉has logical type ⌈τ⌉. Patterns are lifted too. Because patterns form a subset of values, no extra definitions are needed.

As announced (§1), computational functions are reflected, at the logical level, as pairs of a precondition and postcondition. This is made explicit in the lifting of computational function types:

⌈τ1

−→. τ2⌉= (⌈τ1⌉ →prop)×(⌈τ1⌉ → ⌈τ2⌉ →prop)

(9)

Types

⌈α⌉=α

⌈dτ¯⌉=d⌈¯τ⌉

⌈τ1

−→. τ2⌉= (⌈τ1⌉ →prop)×(⌈τ1⌉ → ⌈τ2⌉ →prop)

Type schemes

⌈∀¯α. τ⌉=∀¯α.⌈τ⌉

Type environments

⌈∅⌉=∅

⌈Γ,(x:σ)⌉=⌈Γ⌉,(x:⌈σ⌉)

⌈Γ,α⌉¯ =⌈Γ⌉,α¯

⌈Γ,∀¯α.F⌉=⌈Γ⌉

Values

⌈xτ¯⌉=x⌈¯τ⌉

⌈Dτ¯(v₁, . . . , vn)⌉=D⌈¯τ⌉(⌈v1⌉, . . . ,⌈vⁿ⌉)

⌈funf(x1:τ1/F1) : (x2:τ2/F2) = e⌉= (λ(x1:⌈τ1⌉).F1, λ(x1:⌈τ1⌉).λ(x2:⌈τ2⌉).F2) Fig. 4.Lifting computational types and values to the logical level

The first component of the pair, which represents the function’s precondition, is abstracted over the function’s argument, while the second component, which represents the postcondition, is abstracted over both argument and result.

As a result of this definition, iff is bound, at the computational level, to a function of type τ¹ −→^. τ², then a formula embedded within the code, in the scope off, viewsf as a pair of predicates, and can refer topre(f) andpost(f).

(Recall that, as per Figure 1, pre and post are sugar for the projections π¹ and π².) Note thatf does not denote a logical function. Within a formula, an applicationf(t) does not make sense: it is ill-typed.

Values of computational function type (that is, λ-abstractions) are lifted up to the logical level in a way that is consistent with this definition. A function’s precondition and postcondition alone determine how it is lifted: its code is ignored. (The conformance of a function’s body to its declared pre- and postcondition is checked, of course, via a proof obligation: see ruleFunin Figure 6.) This reflects a philosophy in which the only way of reasoning about the behavior of a function value is via its specification: code never appears within formulae.

In order to lift algebraic data types, we lift every algebraic data type definition into an isomorphic inductive type definition. So, for every computational- level algebraic data type constructor d, there must be a logical-level inductive type constructor, also writtend, of identical arity. For every computational-level data constructor

D:∀α. τ¯ 1×. . .×τⁿ→dα,¯

(10)

there must be a logical-level data constructor

D:∀α.¯ ⌈τ1⌉ ×. . .× ⌈τⁿ⌉ →dα.¯

Due to the manner in which computational function types are lifted, the positivity condition (§2) requires the type constructor dto not appear under any side of a computational arrow withinτ¹, . . . , τⁿ. This can be a limitation (§9).

3.3 Inferring strongest postconditions

In order to simplify the definition of the procedure that extracts proof obligations, we have required everyletconstruct to carry an explicit postcondition for its left-hand sub-expression (§3.1). In practice, however, annotating every let construct would be quite unpleasant, so it is desirable to construct a reasonable postcondition when the user does not provide one.

Ideally, the formula that we should construct in such a situation is the strongest postcondition of the left-hand sub-expression. Our logic is, in fact, sufficiently powerful to express strongest postconditions for every construct in our programming language. For instance, the strongest postcondition for a value v isλx.(x=⌈v⌉). The strongest postcondition for a function applicationv1(v2) ispost(⌈v1⌉)(⌈v2⌉). We could go on and explain how to construct strongest postconditions forletandcaseconstructs. However, in these two cases, they would be complex formulae, involving existential quantification and disjunction.

Eventually, the postconditions carried byletconstructs become part of proof obligations, where they appear as hypotheses. For this reason, we do not want them to be too complex: we wish to produce simple, comprehensible proof obligations.

Our answer to this issue is to construct strongest postconditions for values and function applications, as suggested above, but not for let and case constructs: instead, we rely on the user-provided postcondition, if there is one, or use the trivial postconditiontrue, otherwise.

In practice, when is it necessary for the user to provide an explicit annotation?

The left-hand side of aletconstruct can be one of four expression forms: a value, a function application, alet form, or acaseform. In the first two cases, we do use a strongest postcondition. The third case can be made to never happen, up to a conversion toA-normal form [31]. Only the last case remains. In summary, the only case where our simple-minded approach may call for an explicit, user- provided annotation is that of aletconstruct whose left-hand sub-expression is a caseconstruct.

3.4 Notions of substitution

Neither types nor formulae influence execution, but do appear in the syntax of values and expressions, in order to allow stating subject reduction and proving the soundness of our Hoare logic. So, the operational semantics reduces expressions that contain explicit types and formulae. To ensure that these annotations

(11)

remain consistent as expressions are transformed, we must define a few slightly non-standard notions of substitution.

A single type variable α can appear within logical types as well as within computational types. Similarly, a single variablexcan appear within formulae as well as within expressions. For this reason, we write [α7→τ] for the substitution that replaces every free occurrence of αat the computational level with τ and every free occurrence ofαat the logical level with⌈τ⌉. Similarly, we write [x7→v]

for the substitution that replaces every free occurrence ofxat the computational level withv and every free occurrence ofxat the logical level with⌈v⌉.

We have annotated let constructs with explicit type abstractions and occurrences of variables with explicit type applications. As a result, contracting a let-redex requires contracting type-level β-redexes as well. In order to do so, we write [x7→Λα.v] for a substitution that replaces every variable occurrence¯ of the form xτ¯ with [¯α 7→ τ]v. Again, this replacement is performed at both¯ computational and logical levels, up to a lifting operation in the latter case.

Last, the notation [x 7→ v], which denotes a substitution of a value for a variable, is extended to the notation [p7→v], which, whenpdoes not matchv, is undefined, and, whenpdoes matchv, denotes a simultaneous substitution of values for variables, as follows. The formal definition is:

[Dτ¯(p¹, . . . , pⁿ)7→D¯τ(v¹, . . . , vⁿ)]

stands for

[p¹7→v¹]∪. . .∪[pⁿ7→vⁿ]

Because patterns are linear, this is a union of substitutions whose domains are pairwise disjoint.

3.5 Operational semantics

A standard small-step, call-by-value operational semantics appears in Figure 5.

There are three kinds of redexes (β,let, andcase) and one evaluation context (the left-hand side of aletconstruct). An expression is stuck if it is irreducible and not a value. It is easy to check that an expression is stuck if and only if it contains, within an evaluation context, a sub-expression of the form v1(v2), wherev1is not a syntactic function, or of the form casevof∅.

4 The type system and proof system

We now equip the computational language with an ML-style type system and with a proof system (a Hoare logic), which can be viewed as an algorithm for extracting proof obligations out of well-typed programs. For the sake of succinct- ness, both are described using a single set of judgements, which assert at once that a program is well-typed and is annotated with consistent formulae.

In practice, our tool [21] first checks that the program is well-typed, and, at the same time, infers any omitted type annotations. Then, a set of proof

(12)

v1(v2)→[x7→v2][f7→v1]e

ifv₁ isfunf(x:τ /F) : (. . .) = e let(xα¯:τ /F) =vine→[x7→Λ¯α.v]e

casevof(p7→e)8c→[p7→v]e

if [p7→v] is defined casevof(p7→e)8c→casevofc

if [p7→v] is undefined let(xα¯:τ /F) =e1ine2→let(xα¯:τ /F) =e^′₁ine2

ife₁→e^′₁ Fig. 5.Operational semantics

obligations, expressed in our typed higher-order logic, is extracted. The fact that the program (including embedded formulae) is well-typed guarantees that the proof obligations are in turn well-typed.

4.1 Environments

The syntax of type environments Γ appears in Figure 3. As is standard, type environments bind variables and type variables. Environments also contain assumptions, that is, formulae that become hypotheses when proof obligations are emitted. An environment of the formΓ,∀¯α.F is well-formed when∀α.F¯ has type propunder⌈Γ⌉.

4.2 Proof obligations

A proof obligation is a judgement of the formΓ |=F, whereF has typeprop under ⌈Γ⌉. The semantics of the judgment is the validity of the interpretation ofF in CiC under the interpretation of the environmentΓ, which is decided via an external theorem prover.

4.3 Judgements

The proof system is defined via three judgements, which state properties about values, patterns, and expressions, respectively:

Values Γ ⊢v :τ (Figure 6) Patterns Γ ⊢p:τ (Figure 7) ExpressionsΓ ⊢e:τ{P}(Figure 8)

4.4 Values

The judgementΓ ⊢v :τ (Figure 6) states that, under the type environmentΓ, the value v has type τ. No precondition or postcondition appear in the judgement. Indeed, because values require no computation, they never have a precondition. Furthermore, because all values can be lifted up to the logical level, they

(13)

(Γ,(x:σ))(x) =σ

(Γ,(x₁:σ))(x₂) =Γ(x₂) ifx₁#x₂ (Γ,α)(x) =¯ Γ(x) if ¯α#Γ(x) (Γ,∀¯α.F)(x) =Γ(x)

Var

Γ(x) =∀¯α. τ Γ ⊢xτ¯: [ ¯α7→τ¯]τ

Data

D:∀¯α. τ₁×. . .×τn→dα¯

∀i Γ ⊢vⁱ: [ ¯α7→¯τ]τⁱ Γ ⊢D¯τ(v₁, . . . , vn) :dτ¯

Fun

f#F₁, F₂

⌈Γ,(x₁:τ₁)⌉ ⊢F₁:prop ⌈Γ,(x₁:τ₁),(x₂:τ₂)⌉ ⊢F₂:prop Γ,(f:τ₁−→^. τ₂), f=⌈funf . . .⌉,(x₁:τ₁), F₁⊢e:τ₂{λ(x2:⌈τ2⌉).F2}

Γ ⊢funf(x₁ :τ₁/F₁) : (x₂:τ₂/F₂) = e:τ₁−→^. τ₂

Fig. 6.The computation language (proof system: values)

don’t need an explicit postcondition: the strongest possible postcondition of a valuevis simply equality with⌈v⌉.

RulesVarandData are straightforward. RuleFunis more complex. Two premises require the precondition F¹ and postcondition F² to be well-formed formulae, under appropriate environments. The last premise checks that the function’s body conforms to the function’s specification. In order to do so, the type environment is extended with bindings for f and x¹. It is also extended with the hypothesis

f =⌈funf . . .⌉,

which by definition of lifting (Figure 4) is synonymous for f = (λ(x¹:⌈τ1⌉).F1, λ(x¹:⌈τ1⌉).λ(x2:⌈τ2⌉).F2).

This hypothesis gives meaning to occurrences ofpre(f) andpost(f) within the body of the function, allowing recursive calls tof to be checked. Last, the environment is also extended with the preconditionF1, which means that, within the body of the function, the precondition is assumed to hold. Under this extended environment, the body of the function is required to produce a value that meets the postconditionλ(x²:⌈τ2⌉).F2.

It is not difficult to see thatΓ ⊢v:τ implies⌈Γ⌉ ⊢ ⌈v⌉:⌈τ⌉. This property is required for the typing rules to construct only well-formed formulae.

4.5 Patterns

The judgement Γ ⊢p:τ (Figure 7) states that a value of typeτ can safely be matched against the pattern p, giving rise to (exactly) the bindings described

(14)

Pat-Var (x:τ)⊢x:τ

Pat-Data

D:∀¯α. τ₁×. . .×τn→dα¯

∀i Γⁱ⊢pⁱ: [ ¯α7→¯τ]τⁱ Γ₁, . . . , Γn⊢Dτ¯(p₁, . . . , pn) :dτ¯

Fig. 7.The computation language (proof system: patterns)

Value Γ ⊢v:τ Γ |=P(⌈v⌉) Γ ⊢v:τ{P}

App

Γ ⊢v₁:τ₁−→^. τ₂ Γ ⊢v₂:τ₁ Γ |=pre(⌈v1⌉)(⌈v2⌉) Γ |=post(⌈v1⌉)(⌈v2⌉)⇒P

Γ ⊢v₁(v₂) :τ₂{P}

Let

x#P

⌈Γ,α,¯ (x:τ1)⌉ ⊢F :prop Γ,α¯⊢e₁:τ₁{λ(x:⌈τ1⌉).F} Γ,(x:∀¯α. τ1),∀¯α.[x7→xα]F¯ ⊢e2:τ2{P}

Γ ⊢let(xα¯:τ1/F) =e1ine2:τ2{P}

Case-Nil

Γ ⊢v:τ Γ |=false Γ ⊢casevof∅:τ^′{P}

Case-Cons

Γ ⊢v:τ Γ^′⊢p:τ p#v, P Γ, Γ^′,⌈v⌉=⌈p⌉ ⊢e:τ^′{P} Γ,(∀Γ^′.⌈v⌉ 6=⌈p⌉)⊢casevofc:τ^′{P}

Γ ⊢casevof(p7→e)8c:τ^′{P}

Fig. 8.The computation language (proof system: expressions)

byΓ. As in ML, these bindings are monomorphic (seePat-Var). Because patterns are linear, the type environments Γ1, . . . , Γⁿ in Pat-Data have disjoint domains.

4.6 Expressions

The judgementΓ ⊢e:τ{P}(Figure 8) states that, under the type environment Γ, the expression e has type τ and (if it terminates) produces a value whose logical reflection satisfies the predicate P. In such a judgement, P has type

⌈τ⌉ →propunder⌈Γ⌉.

RuleValue directly reflects this intended meaning: the judgementΓ ⊢ v : τ {P} holds if and only if v has type τ under Γ and its logical reflection ⌈v⌉

provably satisfiesP under the hypotheses found inΓ. The premiseΓ |=P(⌈v⌉) is a proof obligation.

RuleApprequires the functionv1and its actual argumentv2to have matching computational types. Furthermore, it emits two proof obligations, stating

(15)

that (i) the actual argument must satisfy the function’s precondition, and (ii) the function’s postcondition must imply the desired postcondition P. In the last premise, we write P^′ ⇒ P, where P^′ and P have type ⌈τ2⌉ → prop, for

∀(x:⌈τ2⌉).(P^′(x)⇒P(x)), wherexis fresh forP^′ andP.

RuleLetchecks thate1has typeτ1and thate1complies with the postcondi- tionF. Then, the rule performs type generalization, in the style of Milner [25], so thate2is checked under the assignment (x:∀α. τ¯ 1). The hypothesisFis changed into∀α.[x¯ 7→xα]F¯ , so as to reflect the fact thatxnow has polymorphic type.

In the operational semantics, a let construct behaves just like a β-redex.

This suggests that it could perhaps be treated as syntactic sugar, obviating the need for theLetrule. However, this is not possible, for two reasons. One is that let allows type generalization, as explained above, whereas aβ-redex does not.

The other is that an appropriate postcondition for the function λx.e² cannot be determined prior to extracting proof obligations: indeed, it has to be λx.P, whereP is computed only at extraction time.

Rule Case-Nil emits the proof obligation Γ |= false, which requires the conjunction of hypotheses found withinΓ to be inconsistent. This ensures that a caseconstruct with zero branches is never executed.

RuleCase-Consrequires the valuev and the patternpto have a common type τ. The environment Γ^′ collects the variables bound by p, together with their types. Under the hypothesis that a certain instance ofpmatchesv, which is expressed by extending Γ with Γ^′ and with the hypothesis ⌈v⌉ = ⌈p⌉, the branche must have the desired typeτ^′ and meet the desired postconditionP. Furthermore, under the hypothesis that no instance of p matches v, which is written ∀Γ^′.⌈v⌉ 6=⌈p⌉, the remaining branches must have typeτ^′ and meet the postconditionP. (Our use of⌈p⌉ exploits the fact that patterns form a subset of values, a welcome but unessential property.)

When checking acase construct withn branches, the (k+ 1)-th branch is checked under the assumption that none of the patterns p¹, . . . , p^k match the valuev. In particular, for k=n, the conjunction of all hypotheses of the form (∀Γi^′.⌈v⌉ 6=⌈pi⌉) is required to be inconsistent. This ensures that control cannot fall off the end of acaseconstruct, or, in other words, that the case analyses are exhaustive. Today’s ML and Haskell compilers implement a sound approximation to this check, using a purely syntactic criterion. We also implement this syntactic criterion: when it succeeds, emitting a proof obligation is unnecessary.

4.7 Algorithmic reading

The judgement Γ ⊢e:τ {P} defines an algorithm for generating proof obligations. All four parameters of the judgement, namelyΓ,e,τ, andP, are inputs of the algorithm, which attempts to build a derivation of the judgement by starting at the root of the expressioneand working its way down into the sub-expressions ofe. As the algorithm descends, enteringfun,let, andcaseconstructs, the environment Γ grows, accumulating new bindings and assumptions. At the same time, the postcondition P is propagated down, in a very straightforward process. Atlet constructs, this propagation process relies on the (default or user-

(16)

provided, see §3.3) annotation in order to determine which postcondition must be propagated into the left-hand sub-expression. The output of the algorithm consists of the proof obligations, of the formΓ |=F, carried by the leaves of the derivation (seeValue,App, andCase-Nil).

4.8 Soundness

The soundness of our type system and proof system is established in a standard, syntactic manner. The proofs appear in the first author’s dissertation [32]. It states that the types and logical assertions carried by a program are a sound approximation of its dynamic semantics.

Lemma 1 (Environment Weakening). Γ1, F, Γ2 ⊢ e : τ {P} and Γ1 |= F implyΓ1, Γ2⊢e:τ {P}.

Lemma 2 (Postcondition Weakening). Γ ⊢ e:τ {P¹} and Γ |=P¹ ⇒ P² implyΓ ⊢e:τ{P²}.

Lemma 3 (Type Substitution). Let φ stand for [¯α7→τ¯]. Then,Γ¹,α, Γ¯ ² ⊢ e:τ{P} andα¯ #dom(Γ²)imply

Γ¹, φ(Γ²)⊢φ(e) :φ(τ²){φ(P)}

Lemma 4 (Value Substitution). Let ρ stand for[x7→Λα.v]. Then,¯ Γ1,(x:

∀α. τ¯ ¹), Γ²⊢e:τ²{P}and Γ¹,α¯⊢v:τ¹ andx6∈dom(Γ²)imply Γ1, ρ(Γ2)⊢ρ(e) :τ2{ρ(P)}

Lemma 5 (Pattern Matching). Let ∅ ⊢ v : τ and Γ^′ ⊢ p : τ and p # v.

Then, [p7→v] is defined if and only if the formula∃Γ^′.⌈v⌉=⌈p⌉is valid.

Theorem 6 (Subject Reduction). Γ ⊢e:τ{P} ande→e^′ implyΓ ⊢e^′ : τ {P}.

Theorem 7 (Progress). ∅ ⊢ e:τ {P} implies that e is either reducible or a value v such thatP(⌈v⌉)is valid.

5 A few extensions

Extra assertions The following construct allows inserting an assertion at an arbitrary point in the code:

assertF ine

This construct requires F to hold: a proof obligation is emitted. It has no computational content: dynamically, it behaves like e. It is syntactic sugar for let(x: unit/F) = ()ine, wherexis fresh. It is particularly useful when our tool is used in conjunction with an automated theorem prover: if the theorem prover fails to discharge a proof obligation, the user can use assert to cut the proof

(17)

into smaller, easier steps (if the proof obligation is in fact valid) or to find out what is wrong with the specification (if the proof obligation is in fact invalid).

The constructabsurd, which statically requiresfalseto hold, marks a piece of code as inaccessible. It is syntactic sugar for a case construct with zero branches.

Ghost variables and ghost parameters It is sometimes desirable to explicitly introduce aghost variable, that is, a name for a witness to an existentially quantified hypothesis. For this purpose, we suggest writing

let logic x:θ/F ine

This construct binds x within F and e. It requires the assertion ∃(x : θ).F to hold, and introduces F as a new hypothesis into the context. Assertions embedded withinecan refer tox, and their proofs can exploit the hypothesisF. However, occurrences of x at the computational level within e are forbidden, since “let logic” has no computational content.

Similarly, it is sometimes desirable to abstract a function with respect to a ghost parameter x, like this:

funf[x:θ](x1:τ1/F1) : (x2:τ2/F2) = e

The brackets bind a ghost parameterxwithinF¹,F², ande. (Again, occurrences of xat the computational level withine are forbidden.) Note thatθcan be an arbitrary logical type, so this extension allows explicitly abstracting a function with respect to a proposition or predicate, if desired (see§7.5). Ghost variables and ghost parameters can in principle be viewed as syntactic sugar and translated away [33]. In a realistic implementation, however, they should be primitive notions.

6 Interfacing with external theorem provers

The overall verification process, implemented in our prototype tool, is composed of three main steps. First, type inference translates an implicitly typed source code into an explicitly typed internal language, very similar to the language formalized in§3. Second, the rules of the proof system defined in§4 are applied, producing a set of proof obligations. Third, these proof obligations are turned into goals of the two external provers Coq [22] and Alt-Ergo [23]. We describe this last step in the following.

6.1 Coq

Our typed, higher-order logic is easily embedded within the Calculus of Induc- tive Constructions, which underlies Coq. As a result, exporting proof obligations to Coq is a simple matter of pretty-printing. Implicit type instantiations

(18)

are handled by Coq’s system of implicit arguments. We could have made type instantiations explicit but this would have worsened readability.

Coq is an interactive theorem prover. In order to discharge a proof obligation, the user writes a proof script. An open problem is how to maintain these scripts as the source code of the program evolves. The location in the code where a proof obligation arises might change. The statement of a proof obligation might change as well. Perhaps a solution would be to allow only explicitly-stated, explicitly- named, lemmas to be proved interactively, and to rely solely on an automated theorem prover for discharging anonymous proof obligations, possibly by appeal to an explicit lemma.

6.2 Alt-Ergo

Alt-Ergo [23] is a fully automated theorem prover for a typed, polymorphic, first-order logic. Its design is partly inspired by Simplify [34]. However, Alt- Ergo’s logic is typed and polymorphic, whereas Simplify’s is untyped. This makes Alt-Ergo superior, from our point of view, to Simplify. Indeed, provided our proof obligations lie in the first-order fragment of our logic, they can be directly exported towards Alt-Ergo. If, on the other hand, we wished to use Simplify, we would have to encode our typed, polymorphic logic into Simplify’s untyped logic. Such encodings have been studied [35], but are complex and costly. Of course, the trivial encoding that erases all types is unsound.

In addition to first-order logic, Alt-Ergo has native support for linear arith- metic and for the theory of constructors (that is, function symbolsf such that f(x) =f(y) implies x=y). The latter is useful for reasoning efficiently about algebraic data structures.

In the general case, our proof obligations are most naturally expressed in a higher-order logic, as shown in this paper. However, higher-order logic can be encoded into first-order logic. A standard encoding introduces “apply” predicates that help simulateβ-conversion [36].

Perhaps surprisingly, in our case, this encoding can be made to look fairly natural. The symbolspre andpost, which so far have stood for the pair projections, can be turned into predicates and simulate not only projection, but also application. Furthermore, we can makepre a binary predicate andposta ternary predicate, avoiding curried function applications. That is, instead of the higher-order formula:

f = (λ(x1:⌈τ1⌉).F1, λ(x1:⌈τ1⌉).λ(x2:⌈τ2⌉).F2), we can write:

∀(x1:⌈τ1⌉).(pre(f, x1)⇔F¹)

∧ ∀(x1:⌈τ1⌉).∀(x2:⌈τ2⌉).(post(f, x1, x2)⇔F2)

The pair and the threeλ-abstractions have beenη-expanded, and the projection and application symbols have been fused into applications of pre and post.

Provided F1 andF2 are first-order formulae, this is a first-order formula.

(19)

typetree = Empty: tree

Node : (int×tree×elt×tree)→tree

fixpointelements : tree→set = Empty→empty

Node (h, l, x, r)→elements (l)∪singleton (x)∪elements (r)

inductivebst : tree→prop = bst (Empty)

∀(h, l, x, r).

bst (l)andbst (r)andsup (x, elements (l))andinf (x, elements (r))

⇒bst (Node (h, l, x, r))

Fig. 9.Definitions for binary search trees

Under this encoding, the definition of the lifting operation on computational types is modified so that the computational function type constructor is no longer interpreted:

⌈τ1

−→. τ2⌉=⌈τ1⌉−→ ⌈τ^. 2⌉

That is, we make −→^. an uninterpreted binary type constructor at the logical level, so that the lifting of types becomes the identity. Thus, in the above formula, f has logical typeτ¹−→^. τ². The type schemes assigned topreandpostare as follows:

pre:∀α¹α².(α¹−→^. α²)×α¹→prop post:∀α1α2.(α1

−→. α2)×α1×α2→prop

These declarations are admissible by Alt-Ergo. We believe that it should be possible to go a long way with first-order logic alone, even when the program exploits higher-order functions. However, at present, more practical experience is needed in order to support this conjecture.

7 Application: finite sets as binary search trees

As an initial benchmark for our tool [21], we have transcribed Objective Caml’s library implementation of finite sets, represented as balanced binary search trees, into our programming language. The code is presented in the concrete syntax of our prototype implementation.

7.1 Parameters

In the following, we fix a type “elt” of elements. We assume that an algebraic data type “bool”, whose data constructors are “true” and “false”, is available.

(20)

let recmem bst (t, x)wherebst (t)

returnsbwhere((b = true)⇔(x∈elements (t)))

=matchtwith Empty→false Node (h, l, y, r)→

if(x=y)thentrue

else if (x<y)thenmem bst (l, x) elsemem bst (r, x)

end

Fig. 10.Membership in a binary search tree

We assume that an equality check over elements, written “=”, is given. It is a function of computational type elt×elt−→^. bool, whose specification could be written as follows:

post(=, x¹, x², b)⇔(b= true⇔x¹=x²)

Similarly, we assume that an ordering relation, written “<”, of logical type elt→elt→prop, is given, together with an ordering check, also written “<”, of computational type elt×elt−→^. bool, such that the latter decides the former.

We assume that a type of sets of elements, written “set”, is available at the logical level, together with the standard operations (empty set, singleton set, union, membership, etc.) and a number of axioms or theorems that describe the properties of these operations.

In a full-scale programming language, our balanced binary search tree implementation would be a functor, parameterized over the types “elt” and “set”, as well as as their operations and axioms.

7.2 Definitions

Figure 9 contains the definition of the algebraic data type “tree”, of the logical- level inductive function “elements”, and of the inductive predicate “bst”. (The concrete syntax is provisional.) A binary tree is either empty or a binary node, carrying a root element, left and right sub-trees, and a cached measure of the tree’s height. Our binary search trees are intended to implement a finite set abstraction. The logical function “elements” maps a binary tree to the finite set that it represents. It is defined by induction over the algebraic data type “tree”.

The property of being a binary search tree is defined by the inductive predicate

“bst”.

In the definition of “bst”, the types of the universally quantified variables “x”,

“l”, “r”, “h”, “y” are inferred. The types of the function “elements” and of the predicate “bst” could also be inferred, if desired. In practice, type annotations can always be omitted, except where polymorphic recursion is required.

(21)

The definition of “bst” constrains neither the shape of the tree nor the cached height information. This is done by another inductive predicate, named “avl”

(not shown). In contrast with the “dependent types” [26, 37, 38] and “general- ized algebraic data types” [39] schools, we favor a programming style in which invariants are not necessarily hardwired into data structures at definition time.

7.3 Membership in a binary search tree

Figure 10 shows a function, “member”, that checks whether an element “x” is a member of a tree “t”. The precondition “bst(t)” requires “t” to be a binary search tree, but does not require it to be balanced, since this is not necessary for correctness. If one wished to (informally) ensure a logarithmic complexity bound, one could strengthen the precondition by adding the requirement “avl(t)”. This illustrates how a single data structure can be equipped with multiple invariants, not all of which are necessarily enforced at all times. The postcondition states that the Boolean result tells whether “x” is a member of the set implemented by the tree “t”. No type annotations are needed in this definition. All types are inferred.

7.4 First-order iteration

We now define and specify first-order, persistent iterators [40] over binary search trees. Their expressive power surpasses that of “fold” (§7.5), yet their specification is simpler.

The implementation appears in Figure 11. An iterator is represented as a list of trees, which can be thought of as a stack in a depth-first traversal of some larger tree. (The definition of the type “list”, whose constructors are “Nil” and

“Cons”, is omitted.)

To an iterator “i”, there corresponds a set of elements, which we write

“remaining(i)”. Its inductive definition is simply the union of the sets of elements of the trees in the list.

An iterator is well-formed only if the trees that it contains have disjoint sets of elements. This is expressed by the inductive predicate “ok”.

The function “iterator” creates an iterator “i” out of a tree “t”, and satisfies the postcondition “ok(i)∧elements(t)≡remaining(i)”, where≡stands for extensional equality of sets (which may, or may not, coincide with definitional equality). This initial iterator is simply the singleton list [t].

The function “next”, when applied to an iterator “i”, returns either nothing or a pair of a new iterator “i’” and an element “x”. The postcondition describes how these values are related. (The definition of the type “option”, whose constructors are “None” and “Some”, is omitted.)

Figure 12 shows how iterators are used. Here, the client is a function that counts the number of elements in a tree. It does not depend on the internals of the tree data structure: it only depends on the specification of iterators, which is expressed in terms of abstract (logical-level) sets. So, this client code could be placed in another module, without access to the definition of trees.

(22)

typeiterator = list (tree)

fixpointremaining : iterator→set = Nil→empty

Cons (t, ts)→elements (t)∪remaining (ts)

inductiveok : iterator→prop = ok (Nil)

∀(t, ts).

(elements (t)∩remaining (ts))≡emptyandbst (t)andok (ts)

⇒ok (Cons (t, ts))

letiterator (t)wherebst (t)

returnsiwhere(ok (i)andremaining (i)≡elements (t)) = Cons (t, Nil)

let recnext (i)whereok (i) returnsoix

where((oix = None⇒remaining (i)≡empty) and(∀(i’, x). oix = Some ((i’, x))

⇒(remaining (i)≡(singleton (x)∪remaining (i’)) andnot (x∈remaining (i’))andok (i’))))

=matchiwith Nil→None

Cons (Empty, ts)→next (ts)

Cons (Node (h, l, x, r), ts)→Some ((Cons (l, Cons (r, ts)), x)) end

Fig. 11.Iterators over binary search trees

leteval cardinal (t)wherebst (t)

returnsnwhere(n = cardinal (elements (t))) = let reccount (i, n)

whereok (i)andn + cardinal (remaining (i)) = cardinal (elements (t)) returnsn’where(n’ = cardinal (elements (t)))

=matchnext (i)with None→n

Some ((i’, x))→count (i’, n+1) end

in

count (iterator (t), 0)

Fig. 12.A sample client of the iterator abstraction

(23)

predicatehereditary (inv, s, f) =

∀(x, s’, accu’).

((s’∪singleton (x))⊆sandnot (x∈s’)andinv (accu’, s’∪singleton (x)))⇒ (pre (f) (accu’, x)and(∀accu”. (post (f) (accu’, x) (accu”)⇒inv (accu”, s’))))

lemmahereditary subset: ∀(s, s’, inv, f).

(s’⊆sandhereditary (inv, s, f))⇒hereditary (inv, s’, f)

let recfold [s, inv] (accu, t, f)

wherebst (t)andelements (t)⊆sandinv (accu, s)andhereditary (inv, s, f) returnsaccu’whereinv (accu’, s\elements (t))

=matchtwith Empty→accu Node ( , l, x, r)→

letaccu l = fold [s, inv] (accu, l, f)in letaccu x = f (accu l, x)in

fold [s\(elements (l)∪singleton (x)), inv] (accu x, r, f) end

Fig. 13.Higher-order iteration over binary search trees

The “eval cardinal” function performs a loop, expressed as an internal recursive function, with an integer accumulator n. It corresponds directly to a foreach construct in Java or C#. The precondition of this internal function represents the loop invariant: the number of elements counted so far, plus the number of elements remaining to be seen, equals the total number of elements of the set. The postcondition is simply the precondition, specialized to the case where no elements remain.

The precondition of “count” must also state that “i” is an “ok” iterator, even though it does not have to know about the definition of “ok”. This is somewhat undesirable. In the future, we will want to allow defining a dependent sum type of the form “i : iteratorwhereok(i)”, and exporting it as an abstract type.

The definition of “eval cardinal” is syntactically somewhat heavy, as it is expressed in our core language. In a full-scale programming language, a more palatable syntax for loops could be introduced, and desugared into recursive functions and iterators. A single formula, the loop invariant, would have to be written down, instead of two formulae in this low-level version of the code.

7.5 Higher-order iteration

We now present a specification of the classic “fold” higher-order function over sets implemented as binary search trees. The specification is rather more complex than that of first-order iterators, for at least two reasons. First, the specification must mention the client’s state (the accumulator) and invariant. Second, because

(24)

letincr (x, z)returnsywhere(y = x + 1) = x+1

predicatecardinal inv (t) =

fun(accu, s)→(accu + cardinal (s) = cardinal (elements(t)))

lemmais hereditary cardinal inv:

∀t. hereditary (cardinal inv (t), elements (t), incr)

leteval cardinal (t)wherebst (t)

returnsxwhere(x = cardinal (elements (t)))

= fold [elements (t), cardinal inv (t)] (0, t, incr)

Fig. 14.A sample client of the fold operator

the code is not tail-recursive, some information is implicitly encoded within the stack, and a ghost parameter is used to make it explicit in the specification.

The function “fold” is parameterized over two ghost variables, namely the client invariant “inv” and a set “s” of remaining elements. In the case of first- order iterators, the former was unnecessary because the client retains control over the desired invariant, and the latter was unnecessary because the set of remaining elements was directly expressed as “remaining(i)”. Here, the set of remaining elements is implicit in the stack, so a ghost variable must be used in order to refer to it.

The precondition of “fold” expresses the following requirements. First, “t”

must a binary search tree. Second, the elements of “t” must form a subset of “s”.

This reflects that, in general, “t” is a sub-tree of a larger tree over which iteration is taking place. Third, the invariant must initially hold. Last, the invariant must be hereditary: that is, at any time, if an element “x” is picked among the remaining elements, the invariant guarantees that it is legal to apply “f” to the current accumulator and to “x”, and guarantees that the new accumulator thus obtained will still satisfy the invariant.

This definition is certainly somewhat overwhelming. It shows, at the same time, that it is possible to specify and exploit higher-order functions in our framework, and that there is a cost in complexity to be paid for doing so. More experience is needed before we can tell how easily higher-order functions can be defined and used in practice.

7.6 Quantitative results

The binary search tree library contains 22 functions. The development is composed of 108 lemmas, 603 lines of specification and 247 lines of code. The factor of 3 in size between specification and code does not necessarily mean that specifications must be heavy: in a realistic system, a large part of the specification would be imported from a standard library. 749 proof obligations are generated

(25)

and are proven automatically by Alt-Ergo [23]. Only one lemma, stating that the height of a tree is nonnegative, requires an induction in Coq [22]; the other lemmas are proven automatically by Alt-Ergo. About 80% of the proofs require less than 5 seconds to be proven by Alt-Ergo. Yet, about 10% of the proofs require from 10 to 30 minutes. A forthcoming extension of Alt-Ergo with support for reasoning modulo associativity and commutativity of some set operations (such as set union) would perhaps improve these results.

8 Related work

The roots of our work lie in Hoare logic [1, 2]. Extensions of Hoare logic with support for recursive, higher-order procedures were heavily studied in the late 1970’s and early 1980’s [4–8]. In particular, the issue of completeness received a lot of attention after Clarke [4] proved that there can be no sound and complete Hoare logic for a programming language equipped with recursive, higher-order procedures and global variables. Clarke’s result, however, is based upon the assumption that formulae and proof obligations are expressed in a first-order logic. Damm and Josko [6] point out that, by moving to higher-order logic, it is possible to work around Clarke’s negative result. In this paper, we follow Damm and Josko and allow specifications to be expressed in higher-order logic. The intuitive justification for this approach is that, if functions can abstract over functions, then specifications must abstract over specifications.

Our work has been strongly inspired by several existing, practical tools for checking imperative programs [41, 10, 42, 12, 11, 13]. This paper is an attempt to exploit the strengths of these works while steering away from imperative programming and placing renewed emphasis on polymorphism and modularity.

Our method for generating proof obligations is particularly straightforward:

it appears in its entirety in Figure 8. In comparison with the method used in ESC/Java [43], we avoid a translation to “passive form” because we have no assignments to begin with. We avoid the exponential explosion that could follow from the interplay between sequences and alternatives by requiring sequences (that is, letconstructs) to carry user-provided postconditions (§3.3).

Our system is not sound with respect to a call-by-name dynamic semantics.

There are at least two reasons for this fact. First, some divergent expressions admit falseas a valid postcondition. If such an expression e¹ is made the first component of a sequence, as in “letx/false = e¹ in e²”, then second compo- nente² is checked under the assumptionfalse. As a result, all of the the proof obligations found withine²are vacuously satisfied. This is sound under call-by- value evaluation, becausee2is never executed. It is unsound under call-by-name evaluation, becausee2is executed immediately (after bindingxto a suspension).

The second reason is that, in a call-by-name semantics, every type is inhabited by a bottom value, and some types are inhabited by infinite values. This is not reflected in the way we lift computational values and types up to the logical level.

(26)

Scott’s logic of computable functions [44] interpretsλ-terms in a denotational model, where equality implies, or coincides with, observational equivalence. It comes with a set of sound deduction rules, and allows explicit reasoning about divergence and equality of computations. It admits call-by-value and call-by- name variants. It was implemented as early as 1972 by Milner [45]. More recent implementations [46–48] embed Scott’s LCF within some form of higher-order logic. In a somewhat similar vein, Longley and Pollack [49] embed the functional core of Standard ML, via a fully abstract denotational semantics, into higher- order logic.

Our approach is less elaborate: by focusing on partial correctness, by adopting a call-by-value semantics, and by lifting only values, as opposed to expressions, up to the logical level, we are able to ignore non-termination issues entirely, and to work with value spaces that do not have bottom elements or definedness orderings. By contrast, tools or approaches that focus on lazy functional programs, such as Programatica [50, 51] or the Cover translator [52], require reasoning about non-termination, resulting in proof obligations that can become cluttered with definedness side conditions. The simplicity of our approach comes at a cost: our system can neither establish termination of an expression nor reason about observational equality of expressions.

Honda and Yoshida [53] define a Hoare logic for call-by-value higher-order functions, to which our system seems rather analogous. A technical difference is that Honda and Yoshida allow expressions (including, in particular, function applications) to appear within formulae, and interpret equality as observational equality; whereas we only lift values to the logical level, and interpret equality as equality of values. Honda and Yoshida’s system does not seem to have been implemented.

Smith [54,§4.4.1] defines a type theory with partial objects, where the type A¯ contains the possibly non-terminating computations that yield a result of typeA. Smith notes that the fixed point axiom, which has type (A→A)→A, is sound only at admissible types. As an example of a non-admissible type, he offers a type D whose definition can be read: “D is the type of the partial functionsgof naturals to naturals such thatgdiverges for at least one input”. It is easy to construct a function of typeD→D whose least fixed point is in fact a total function: this shows that D is not admissible. A reviewer of an earlier version of the present paper noted that “gdiverges for at least one input” seems expressible, in our system, as ∃x.∀y.¬post(g)(x)(y), and wondered if Smith’s example could be adapted to show that our system is unsound. One should note, first, that although this formula indeed represents a sufficient condition forgto diverge for at least one input, it is not anecessarycondition. Indeed, the predicatepost(g) denotes the programmer-provided postcondition ofg; it does not denote the actual semantics ofg. Second, when the programmer supplies an explicit definition of the predicate post(g) (which he must do), this definition cannot refer to g itself. As a result, there is no way that the postcondition associated withg can be the self-referent “g diverges for at least one input”.