
(1)

An overview of alphaCaml

François Pottier

September 2005

(2)

Introduction

A specification language

Implementation techniques

Translating specifications

Conclusion

(3)

Motivation

Our programming languages do not support abstract syntax with binders in a satisfactory way.

Hand-coding the operations that deal with lexical scope (capture-avoiding substitution, etc.) is tedious and error-prone.

How about a more declarative, robust, automated approach?

– cf. Shinwell’s Fresh O’Caml, Cheney’s FreshLib.
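To make the tedium concrete, here is a minimal hand-written sketch of capture-avoiding substitution in plain OCaml. Everything in it is an assumption made for illustration: variables are strings, and the `fresh` generator (hypothetical) produces names of the form `_n` that are assumed not to occur in user-written terms.

```ocaml
(* Plain OCaml, no binder support: variables are just strings. *)
type term =
  | EVar of string
  | EApp of term * term
  | ELam of string * term

module S = Set.Make (String)

(* Free variables of a term. *)
let rec fv = function
  | EVar x -> S.singleton x
  | EApp (m, n) -> S.union (fv m) (fv n)
  | ELam (x, m) -> S.remove x (fv m)

(* Hypothetical fresh-name generator: a global counter. Names of the
   form "_n" are assumed not to occur in user-written terms. *)
let counter = ref 0
let fresh () = incr counter; Printf.sprintf "_%d" !counter

(* subst x n m computes m[n/x], renaming bound variables when needed. *)
let rec subst x n = function
  | EVar y -> if y = x then n else EVar y
  | EApp (m1, m2) -> EApp (subst x n m1, subst x n m2)
  | ELam (y, m) when y = x -> ELam (y, m)   (* x is shadowed: stop *)
  | ELam (y, m) when S.mem y (fv n) ->
      (* y would capture a free variable of n: rename it first. *)
      let y' = fresh () in
      ELam (y', subst x n (subst y (EVar y') m))
  | ELam (y, m) -> ELam (y, subst x n m)
```

Forgetting either of the two guarded `ELam` cases still type-checks, which is exactly the kind of silent error the talk is concerned with.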

(4)

Three facets

Let’s distinguish three facets of the problem:

• a specification language,

• an implementation technique,

• an automated translation of the former to the latter.

In this talk, I emphasize the first aspect.

(5)

A specification language

(6)

Prior art

There have been a few proposals to enrich algebraic specification languages with names and abstractions.

An abstraction usually takes the form ⟨a⟩e, or ⟨a₁, …, aₙ⟩e, or, as in Fresh Objective Caml, ⟨e₁⟩e₂.

Abstraction is always binary: the names (or atoms) a that appear on the left-hand side are bound, and their scope is the expression e that appears on the right-hand side.

(7)

Example: pure λ-calculus

Pure λ-calculus:

M ::= a | M M | λa.M

is modelled in Fresh Objective Caml as follows:

bindable_type var

type term =
  | EVar of var
  | EApp of term * term
  | ELam of ⟨var⟩ term

(8)

A more delicate example

Let’s add simultaneous definitions:

M ::= … | let a₁ = M₁ and … and aₙ = Mₙ in M

The atoms aᵢ are bound, so they must lie within the abstraction’s left-hand side. The terms Mᵢ are outside the abstraction’s lexical scope, so they must lie outside of the abstraction:

type term =
  | ...
  | ELet of term list * ⟨var list⟩ term

(9)

Another delicate example

Simultaneous recursive definitions pose a similar problem:

M ::= … | letrec a₁ = M₁ and … and aₙ = Mₙ in M

The terms Mᵢ are now inside the abstraction’s lexical scope, so they must lie within the abstraction’s right-hand side:

type term =
  | ...
  | ELetRec of ⟨var list⟩ (term list * term)

(10)

The problem

The root of the problem is the assumption that lexical and physical structure should coincide.
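A small sketch of what this costs in practice, using plain OCaml stand-ins (an assumption: variables are strings and the pair `(binders, body)` plays the role of a binary abstraction). The definitions `aᵢ = Mᵢ` that the programmer wrote side by side end up split across two lists and must be re-paired by hand.

```ocaml
(* Plain OCaml stand-ins: no real abstractions, variables are strings. *)
type var = string

type term =
  | EVar of var
  | EApp of term * term
  | ELet of term list * (var list * term)      (* M_i outside the scope *)
  | ELetRec of var list * (term list * term)   (* M_i inside the scope *)

(* Recover the definitions "a_i = M_i". List.combine raises
   Invalid_argument when the two lists have different lengths,
   an invariant that the type does not express. *)
let definitions (t : term) : (var * term) list =
  match t with
  | ELet (ms, (xs, _)) -> List.combine xs ms
  | ELetRec (xs, (ms, _)) -> List.combine xs ms
  | EVar _ | EApp _ -> []
```

Note also that `ELet` and `ELetRec` need two different physical layouts for what is, on paper, the same list of definitions.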

(11)

A solution

Within an abstraction, alphaCaml distinguishes three basic components: binding occurrences of names, expressions that lie within the abstraction’s lexical scope, and expressions that lie outside the scope.

These components are assembled using sums and products, giving rise to a syntactic category of so-called patterns. Abstraction becomes unary and holds a pattern.

(12)

Back to pure λ-calculus

Pure λ-calculus is modelled in alphaCaml as follows:

sort var

type term =
  | EVar of atom var
  | ELam of ⟨lamp⟩

type lamp binds var =
  atom var * inner term

(13)

A second look at simultaneous definitions

Simultaneous definitions are modelled without difficulty:

type term =
  | ...
  | ELet of ⟨letp⟩

type letp binds var =
  binding list * inner term

type binding binds var =
  atom var * outer term
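To see what the inner and outer annotations mean operationally, here is a hand-written sketch of a free-variable computation over a plain OCaml stand-in for this specification. It is an illustration only: atoms are strings (an assumption), the annotations survive only as comments, and `EApp` is added so that the example has something to traverse.

```ocaml
module S = Set.Make (String)

(* Plain OCaml stand-in for the specification: atoms are strings. *)
type term =
  | EVar of string
  | EApp of term * term
  | ELet of letp
and letp = binding list * term     (* the body is "inner" *)
and binding = string * term        (* the definition is "outer" *)

let rec fv = function
  | EVar x -> S.singleton x
  | EApp (m, n) -> S.union (fv m) (fv n)
  | ELet (bindings, body) ->
      (* Atoms in binding position are bound by the whole pattern. *)
      let bound = S.of_list (List.map fst bindings) in
      (* Outer terms: their free variables escape unchanged. *)
      let outer =
        List.fold_left (fun acc (_, m) -> S.union acc (fv m)) S.empty bindings
      in
      (* Inner terms: the bound atoms are removed. *)
      S.union outer (S.diff (fv body) bound)
```

For `let a = b and b = c in a b`, this yields the free variables `b` and `c`: the `b` in the first definition is outside the scope and is not captured.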

(14)

More advanced examples

Abstract syntax for patterns in an Objective Caml-like programming language could be declared like this:

type pattern binds var =
  | PWildcard
  | PVar of atom var
  | PRecord of pattern StringMap.t
  | PInjection of [ constructor ] * pattern list
  | PAnd of pattern * pattern
  | POr of pattern * pattern
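Since the whole type is declared `binds var`, every `atom var` in it is a binding occurrence. A hand-written sketch of the corresponding "bound atoms" function, over a plain OCaml stand-in, might look as follows. Atoms and constructors are represented as strings here, which is an assumption made purely for illustration.

```ocaml
module StringMap = Map.Make (String)
module S = Set.Make (String)

type constructor = string   (* assumption: constructors are plain strings *)

type pattern =
  | PWildcard
  | PVar of string
  | PRecord of pattern StringMap.t
  | PInjection of constructor * pattern list
  | PAnd of pattern * pattern
  | POr of pattern * pattern

(* Collect every atom that the pattern binds. *)
let rec bound = function
  | PWildcard -> S.empty
  | PVar x -> S.singleton x
  | PRecord fields ->
      StringMap.fold (fun _ p acc -> S.union (bound p) acc) fields S.empty
  | PInjection (_, ps) ->
      List.fold_left (fun acc p -> S.union acc (bound p)) S.empty ps
  | PAnd (p, q) | POr (p, q) -> S.union (bound p) (bound q)
```

This sketch does not check well-formedness conditions such as both branches of `POr` binding the same atoms.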

(15)

Implementation techniques

(16)

Three known techniques

1. de Bruijn indices. Require shifting, which is fragile. No freshening. Generic equality and hashing functions respect α-equivalence.

2. Atoms. Require freshening upon opening abstractions. No shifting. Require custom equality and hashing functions.

3. Pollack mix: free names as atoms and bound names as indices. Analogous to 2, except generic equality and hashing respect α-equivalence.

alphaCaml follows 2.
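The remark about generic equality can be illustrated with a small sketch in the style of technique 3 (not the one alphaCaml uses). Named terms are converted to a representation where bound variables are indices and free variables keep their names; OCaml's structural equality on the result then coincides with α-equivalence. All type and function names below are made up for the example.

```ocaml
(* Named terms: structural equality distinguishes α-equivalent terms. *)
type named =
  | Var of string
  | App of named * named
  | Lam of string * named

(* Bound variables as indices (number of binders crossed),
   free variables as names. *)
type db =
  | DBound of int
  | DFree of string
  | DApp of db * db
  | DLam of db

(* Position of x in the environment, innermost binder first. *)
let rec index x = function
  | [] -> None
  | y :: ys -> if x = y then Some 0 else Option.map succ (index x ys)

let rec convert env = function
  | Var x -> (match index x env with Some i -> DBound i | None -> DFree x)
  | App (m, n) -> DApp (convert env m, convert env n)
  | Lam (x, m) -> DLam (convert (x :: env) m)

(* Generic (structural) equality now respects α-equivalence. *)
let alpha_equal m n = convert [] m = convert [] n
```

With atoms (technique 2), the same comparison needs a custom equality function, since two α-equivalent terms may carry different bound atoms.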

(17)

Some more details

Atoms are represented as pairs of an integer and a string. The latter is used only as a hint for display.
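A minimal sketch of such an atom module, assuming a global counter as the source of fresh integers. The signature and names are hypothetical and are not taken from the alphaCaml library; the point is only that identity lives in the integer and the string never takes part in comparisons.

```ocaml
module Atom : sig
  type t
  val fresh : string -> t        (* the string is only a display hint *)
  val equal : t -> t -> bool
  val compare : t -> t -> int
  val hash : t -> int
  val to_string : t -> string
end = struct
  type t = int * string

  (* Stateful generator: each call yields a new integer identity. *)
  let counter = ref 0
  let fresh hint = incr counter; (!counter, hint)

  (* Custom equality, ordering and hashing look at the integer only. *)
  let equal (i, _) (j, _) = Int.equal i j
  let compare (i, _) (j, _) = Int.compare i j
  let hash (i, _) = Hashtbl.hash i

  let to_string (i, hint) = Printf.sprintf "%s/%d" hint i
end
```

Two atoms created with the same hint are distinct, for example `Atom.equal (Atom.fresh "x") (Atom.fresh "x")` is `false`.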

Sets of atoms and renamings are encoded as Patricia trees.

Renamings are suspended and composed at abstractions, which allows linear-time term traversals.

Even though the fresh atom generator has state, closed terms can safely be marshalled to disk.

(18)

Translating specifications

(19)

Types

The specification of pure λ-calculus is translated down to Objective Caml as follows. Atoms and abstractions are abstract.

type var = Var.Atom.t

type term =
  | EVar of var
  | ELam of opaque_lamp

and lamp =
  var * term

(20)

Code

Opening an abstraction automatically freshens its bound atoms.

val open_lamp : opaque_lamp -> lamp

val create_lamp : lamp -> opaque_lamp

This enforces Barendregt’s informal convention.
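The following is a hand-written approximation of this interface, not the code that alphaCaml generates. It assumes atoms are non-abstract integer/string pairs, adds an `EApp` constructor, and renames eagerly when opening, whereas the actual implementation suspends renamings. It shows how a client can write capture-avoiding substitution without any freshness side conditions.

```ocaml
type var = int * string            (* atom: identity and display hint *)

let counter = ref 0
let fresh (_, hint) : var = incr counter; (!counter, hint)

module Syntax : sig
  type term =
    | EVar of var
    | EApp of term * term
    | ELam of opaque_lamp
  and lamp = var * term
  and opaque_lamp                  (* abstract: clients must open it *)
  val create_lamp : lamp -> opaque_lamp
  val open_lamp : opaque_lamp -> lamp
end = struct
  type term =
    | EVar of var
    | EApp of term * term
    | ELam of opaque_lamp
  and lamp = var * term
  and opaque_lamp = lamp

  (* Rename atom x into y everywhere, binders included; y is fresh. *)
  let ren x y z = if fst z = fst x then y else z
  let rec rename x y = function
    | EVar z -> EVar (ren x y z)
    | EApp (m, n) -> EApp (rename x y m, rename x y n)
    | ELam (z, m) -> ELam (ren x y z, rename x y m)

  let create_lamp l = l
  (* Opening freshens the bound atom. *)
  let open_lamp (x, m) = let y = fresh x in (y, rename x y m)
end

open Syntax

(* Client code: the atom obtained by opening is fresh, so it can
   neither capture a variable of n nor shadow x. *)
let rec subst x n = function
  | EVar y -> if fst y = fst x then n else EVar y
  | EApp (m1, m2) -> EApp (subst x n m1, subst x n m2)
  | ELam a ->
      let (y, m) = open_lamp a in
      ELam (create_lamp (y, subst x n m))
```

Compared with a hand-coded version over strings, the `ELam` case has no guards: the freshness reasoning has moved into `open_lamp`.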

More boilerplate is generated for computing sets of free or bound atoms, applying renamings, helping clients succinctly define transformations (such as capture-avoiding substitution), etc.

(21)

Conclusion

(22)

Status

alphaCaml is available. There are very few known users so far.

The distribution comes with two demos:

• a naïve typechecker and evaluator for F≤

• a naïve evaluator for a calculus of mixins (Hirschowitz et al.)

These limited experiments are encouraging.

(23)

Limitations

One must go through open functions to examine abstractions. Deep pattern matching is impossible.

Clients can write meaningless code, such as a function that pretends to collect the bound atoms in an expression.

(24)

Towards alpha-(your-favorite-prover-here)?

How about translating a specification language like alphaCaml’s into theorems (recursion and induction principles) and proofs?

– cf. Pitts, Urban and Tasson, Norrish...