Evaluation is MSOL compatible

HAL Id: hal-00773126

https://hal.inria.fr/hal-00773126v2

Submitted on 14 Jan 2013

Sylvain Salvati, Igor Walukiewicz

Université de Bordeaux, INRIA, CNRS, LaBRI UMR5800 LaBRI Bât A30, 351 crs Libération, 33405 Talence, France

Abstract

We consider simply-typed lambda calculus with fixpoint operators. Evaluation of a term gives as a result the Böhm tree of the term. We show that evaluation is compatible with monadic second-order logic (MSOL). This means that for a fixed finite vocabulary of terms, the MSOL properties of Böhm trees of terms are effectively MSOL properties of terms themselves. Theorems of this kind have been known for some graph operations: unfolding, and Muchnik iteration. Similarly to those results, our main theorem has diverse applications. It can be used to show decidability results, to construct classes of graphs with decidable MSOL theory, or to obtain MSOL formulas expressing behavioral properties of terms. Another application is decidability of a control-flow synthesis problem.

1 Introduction

Rice's theorem tells us that no non-trivial property of the behaviour of a Turing machine can be decided by looking at the machine itself. In this paper we consider a much simpler abstract computing system: simply-typed lambda-calculus with fixpoint operators. We denote it λY. A behaviour of a λY-term is its Böhm tree. Since not all λY-terms have normal forms, the Böhm tree is a standard choice for the result of a computation of a term. To express properties of results we use monadic second-order logic (MSOL) because it is a fundamental logic over trees. The transfer theorem we prove says that every MSOL property of Böhm trees is effectively an MSOL property of terms. In other words, we show that MSOL is compatible with evaluation.

lambda-calculus through infinite Böhm trees. Such a tree is just a normal form of the term, if the term has one. Otherwise it is a potentially infinite tree representing the visible part of the infinite computation of the term. In this paper we consider also infinite λY-terms. This is less standard but introduces relatively few complications while bringing real strengthening of the main theorem.

Under a different syntax, λY-calculus has also been intensively studied by the language-theoretic community. One can cite the PhD thesis of Fischer [Fis68] on macro languages, the work of Engelfriet and Schmidt on IO and OI [ES77, ES78], or the work of Damm on (safe) recursive schemes [Dam82]. More recently Knapik, Niwinski and Urzyczyn [KNU02] considered recursive schemes as generators of infinite trees, and studied the model-checking problem for such trees. After a series of intermediate results [AdMO05, KNUW05, Aeh07], Ong [Ong06] has shown that the model-checking problem of MSOL properties for such trees is decidable. It was already clear to Engelfriet and Schmidt as well as to Damm that the grammars or recursive schemes they study are a different representation of λY-terms (and their subclasses). Indeed, trees generated by recursive schemes are just Böhm trees of the corresponding terms of λY-calculus. So, for example, the theorem of Ong can be rephrased as saying that Böhm trees of finite λY-terms have a decidable MSOL theory. The transfer theorem presented in this paper implies this decidability result.

Our transfer theorem says that for a fixed finite vocabulary of terms, an MSOL formula ϕ can be effectively transformed into an MSOL formula $\widehat{\varphi}$ such that for every term M over the fixed vocabulary: M satisfies $\widehat{\varphi}$ iff the Böhm tree of M satisfies ϕ. The result is stronger than Ong’s theorem in at least two aspects. First, it holds also for infinite λY-terms. Second, and more importantly, the theorem gives an effective reduction of one theory to another. For example, since finiteness of a tree is definable in MSOL, we immediately obtain that if we restrict to λY-terms over a fixed finite vocabulary then the set of terms having a (finite) normal form is MSOL definable. In the last section of the paper we give several other applications of the additional power provided by the transfer theorem.

show the analogous result for evaluation instead of unfolding or Muchnik iteration. The operation of evaluation cannot be directly compared to the other two since it works on different objects (trees with back edges instead of graphs). Yet in a well-known context where these operations can be compared, the evaluation operation is strictly stronger. Indeed, every tree in the pushdown hierarchy [Cau02] is a Böhm tree of some term, but not vice versa [Par12].

Related work: This work can be seen as a generalization of Ong’s theorem, in the same way as compatibility of MSOL with unfolding is a generalization of Rabin’s theorem. Moreover, the unfolding theorem is in some sense also a special case of our main theorem. Knapik and Courcelle [CK02a] have used the unfolding theorem to prove a special case of our theorem for infinite terms of order 1. The proof presented here is based on our proof of Ong’s theorem using Krivine machines [SW11].

MSOL properties of higher-order systems are an active area of research. After the seminal paper of Knapik, Niwinski and Urzyczyn [KNU02], Caucal introduced the pushdown hierarchy [Cau02], which has since been an object of intensive study from a logical point of view [CW03]. It has been shown that many interesting properties of higher-order programs can be analyzed with recursive schemes and automata [Kob09b, Kob09a, Kob09c, KO11, OR11]. The decidability result of Ong has been revisited in a number of ways [HMOS08, KO09, SW11, BCHS12].

Finally, we introduce the notion of canonical form of a term.

Section 3 presents the main theorem. For this it describes how terms are represented as logical structures so that we can talk about their MSOL properties. Our representation requires that we have a fixed finite set of λ-variables. At the same time we do not need to restrict the number of Y-variables. We show that the theorem is not likely to hold if the number of λ-variables is not fixed. The overview of the proof of the theorem is given in the following section.

Section 5 gives three applications of the transfer theorem. We explain how to obtain formulas expressing computational properties of terms. We show decidability of higher-order matching for terms over a fixed vocabulary [SS03]. We present a synthesis result that allows one to construct λY-programs from λY-modules.

In the conclusion section we give more relations between the transfer theorem and other results in the literature.

The extended version of the paper [SW13] presents all the proofs as well as some more comments and discussions.

2 Infinitary λY-calculus

2.1 Syntax and operational semantics

The set of types is constructed from a unique basic type 0 using a binary operation →. Thus 0 is a type and if α, β are types, so is (α → β). As usual, to use fewer parentheses, we consider that → associates to the right. For example, 0 → 0 → 0 stands for (0 → (0 → 0)). We will write $0^i \to 0$ as a short notation for 0 → 0 → ⋯ → 0 → 0, where there are i + 1 occurrences of 0. The order of a type is defined by: order(0) = 1, and order(α → β) = max(1 + order(α), order(β)).

A signature, denoted Σ, is a set of typed constants, that is, symbols with associated types. Of special interest to us will be tree signatures where all constants other than the special constant Ω have order at most 2. Observe that types of order 2 have the form $0^i \to 0$ for some i. For simplicity of notation we will always assume that i = 2, but of course our results do not depend on this convention.
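The definitions of types and of their order can be sketched in a few lines of Python. This is only an illustration: the class names `Base` and `Arrow` and the helper `tree_type` are hypothetical names introduced here, not notation from the paper.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Base:
    """The unique basic type 0."""

    def __str__(self):
        return "0"


@dataclass(frozen=True)
class Arrow:
    """The type (src -> dst); the arrow associates to the right."""
    src: "Type"
    dst: "Type"

    def __str__(self):
        # Parenthesise only an arrow type in argument position.
        left = f"({self.src})" if isinstance(self.src, Arrow) else str(self.src)
        return f"{left} -> {self.dst}"


Type = Union[Base, Arrow]


def order(t):
    """order(0) = 1 and order(a -> b) = max(1 + order(a), order(b))."""
    if isinstance(t, Base):
        return 1
    return max(1 + order(t.src), order(t.dst))


def tree_type(i):
    """The type 0^i -> 0, written with i + 1 occurrences of 0."""
    t = Base()
    for _ in range(i):
        t = Arrow(Base(), t)
    return t
```

With these definitions, `tree_type(2)` is the type 0 → 0 → 0 of a binary constant and has order 2, while a type taking an argument of type 0 → 0 has order 3.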

The terms will be built over two disjoint countable sets of typed variables: λ-variables and Y-variables. We shall write $x^\alpha$ for a λ-variable of type α, and $x^\alpha$ for a Y-variable of type α. In this paper, we work with potentially infinite λY-terms. We assume that for every type α we have a constant $\Omega^\alpha$ to denote the undefined term of type α. We will also have typed application symbols $@^\alpha$, and typed binders $Y^\alpha$ as well as $\lambda^{\alpha\to\beta}$. For all types α we define simultaneously the sets of infinite terms of type α as trees satisfying the following conditions.

• A node labelled by $\Omega^\alpha$, a λ-variable $x^\alpha$, a Y-variable $x^\alpha$, or a constant $c^\alpha$ is a term of type α.

• A tree with the root labelled $@^\beta$ having as the left subtree a term of type α → β and as the right subtree a term of type α, is a term of type β.

• A tree with the root labelled $\lambda^{\alpha\to\beta} x^\alpha$ with the unique immediate subtree being a term of type β, is a term of type α → β.

• A tree with the root labelled $Y^\alpha x^\alpha$ with the unique immediate subtree being a term of type α, is a term of type α.

Some examples of infinitary λY-terms, as well as trees that are not terms, are presented in Figure 1.

Notice that all variables and constructors carry type labeling that makes typing of a term unique. We shall often omit those labels when they are unnecessary for the understanding or when they can be inferred from the context. We will also use standard conventions and write (M N) for M @ N, and $N_0 N_1 \dots N_p$ for $(\dots(N_0 N_1) \dots N_p)$.

innocent since a term may contain infinitely many free variables. Nevertheless, a careful use of de Bruijn indices is, as in the finite case, a way to represent equivalence classes of terms modulo α-conversion. However, in the sequel, we are going to work sometimes on particular representatives of α-equivalence classes assuming certain properties on the naming of variables such as the Barendregt convention on Y-variables (every Y-binder binds a distinct Y-variable), and the use of finitely many λ-variables. To make things clear, when working up to α-conversion, we will speak about terms, and when working on particular instances of α-equivalence classes, we will speak about concrete terms.

This definition of infinitary terms comes with two main differences with respect to the usual one given for the untyped infinitary calculus. First, the typing discipline rules out terms with infinite sequences of λ-abstractions (cf. Figure 1). The second difference is that we use the Y combinator as a binder and we distinguish between Y-variables and λ-variables. Notice that the definition of infinitary terms allows infinite sequences of Y-abstractions (cf. Figure 1).

The reason why we need to distinguish between λ-variables and Y-variables is that the main theorem we prove is about terms which use finitely many λ-variables but possibly infinitely many Y-variables. As a small remark, if the main theorem did not need to make this assumption, we could simply get rid of Y-binders. Indeed the term Y x.N has the same Böhm tree (see Section 2.2) as the term rec(λx.N[x/x]) where rec = λf.f(f(f(f ...))). This shows that λ-abstraction, or parameter instantiation, is more powerful than recursive definition in the context of the infinitary λ-calculus.

Figure 1: Examples of infinitary terms and non-terms (panels: Terms, Non-term)

to avoid the infinite overhead of computation that each substitution may require if it were to be completely performed.

We may now define β-contraction on infinitary terms as the straightforward extension of β-contraction on finite terms. Concerning δ-contraction, we need to adapt its definition to our slightly modified syntax. The relation of δ-contraction →δ is the smallest relation that is compatible with the syntax of infinitary λY-calculus and so that Y x.M δ-contracts to M[x := Y x.M]. We let βδ-contraction, →βδ, be the union of the relations →β and →δ. Of course, the subject reduction property for simple types transfers from finite terms to infinite ones so that the reduction preserves typing.

As we are interested in infinitary terms as computational devices, we need to choose what we consider to be a value or the output of the computation performed by those terms. We thus introduce the notion of weak head normal form and of weak head reduction.

Definition 1 An infinitary term M is in weak head normal form if it is of the form $\lambda x^\alpha. N$ for some term N, or if it is of the form $h N_1 \dots N_n$, with h being either a variable or a constant different from Ω.

When M is an infinitary term of the form $(\lambda x. P) P_1 \dots P_n$ or of the form $(Y x. P) P_1 \dots P_n$, then M has a head-redex. Reducing M to $P[x := P_1] P_2 \dots P_n$ or to $P[x := Y x.P] P_1 \dots P_n$, respectively, is called head-contracting M. We write $M \to_h N$ when M head-contracts to N; we write $M \to_h^* N$ when M head-reduces to N in some finite number of head-contraction steps.
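As a small worked illustration of head-contraction (the term is chosen here for illustration and does not come from the text), take a binary constant a and the closed term Y x. a x x of type 0. It has a head-redex of the second form with n = 0, and a single head-contraction step already yields a weak head normal form:

```latex
\[
  Y x.\, a\, x\, x
  \;\to_h\;
  (a\, x\, x)[x := Y x.\, a\, x\, x]
  \;=\;
  a\,(Y x.\, a\, x\, x)\,(Y x.\, a\, x\, x)
\]
```

The result has the form $h N_1 N_2$ with h = a, a constant different from Ω, so it is in weak head normal form in the sense of Definition 1.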

The reason we use weak head normal forms as values, instead of the usual notion of head normal form, is that we are going to use the Krivine machine to compute values. The Krivine machine computes weak head normal forms of terms and not head normal forms.

2.2 Böhm trees

Now that we have settled a reduction strategy together with a notion of value, we may define a notion of normal form for infinitary terms, namely Böhm trees. We also show that in a strong sense Böhm trees are actual normal forms of infinitary terms. There is no particular difficulty in defining the notions of this subsection for all infinite λY-terms. Yet, since we consider infinitary λY-terms as generators of infinite trees, we are really interested only in closed terms of type 0. We choose to present the definitions only for such terms. In order to further simplify the notation we will from the start consider only terms over a tree signature, and assume that all the constants are binary.

1. if $M \to_h^* b N_1 N_2$ where b is a constant different from Ω, then BT(M) is a tree with root labelled b and with $BT(N_1)$ and $BT(N_2)$ as its subtrees.

2. otherwise BT (M ) = Ωα.

Observe that in our case Böhm trees are just labelled infinite binary trees. In a sense that can be made precise, Böhm trees are normal forms of terms. As such they are terms too, but due to their special shape we do not need to use application nodes to represent them. In Figure 2 we present a Böhm tree and its representation as a term.

The reader may be surprised that we talk about Böhm trees while in general Krivine machines compute Lévy-Longo trees. It turns out that the two notions coincide when working with tree signatures and terms of type 0. This is why we have preferred to use the better known notion.

Figure 2: Böhm tree and the associated term

Even though we have defined Böhm trees (or Lévy-Longo trees) using a particular reduction strategy, they really are the unique normal forms of infinitary terms modulo βδ-reductions of arbitrary ordinal length, provided we add the reduction rule that allows terms without weak head normal form to be reduced to Ω. We use the rest of this section to establish this fact. The results we are about to show are known in the literature on untyped infinitary λ-calculus (see Kennaway et al. [KKSdV97] and Terese [KdV03]). We here adapt those results to the simply typed λY-calculus and we present them in a way that better suits our needs. Notice that, even though we have defined Böhm trees only for closed terms of atomic types built on a tree signature, all the results we mention here hold without those restrictions.

$u = (\lambda z^0. a z^0)\,((\lambda z^0. a z^0)\,((\lambda z^0. a z^0) \cdots))$

requires reducing ω redexes so as to obtain the term that is the solution of the equation v = av:

$v = a\,(a\,(a \cdots))$

If we were to reduce the term $(\lambda x^0. b x^0) u$ we may first reduce u in ω steps to v and we would obtain a term $(\lambda x^0. b x^0) v$ that can be reduced to bv with one more step. With this example we have constructed a reduction of length ω + 1. It thus appears natural to define reductions of arbitrary ordinal length. The natural way of defining such sequences of reductions is to define them as continuous functions from ordinals to λ-terms. This is one of the reasons why we need the constants $\Omega^\alpha$ as part of our language for defining infinitary terms. The constant $\Omega^\alpha$ stands for the undefined term of type α and allows us to define a natural partial order on infinitary terms. This is the least order ⊑ that is compatible with the syntax and so that for every term M in Λα,∞(Ω), $\Omega^\alpha$ ⊑ M. Notice that, whenever M ⊑ N, M and N need to be terms that have the same type.

Definition 3 Given a relation R on λ-terms and an ordinal γ, a γ, R-reduction sequence of type α is a function ϕ that maps ordinals δ ≤ γ to infinitary terms of Λα,∞(Ω) so that:

1. if δ < γ, we have ϕ(δ) R ϕ(δ + 1),

2. if δ ≤ γ is a limit ordinal, then for every term M ⊑ ϕ(δ) there is an ordinal θ < δ so that M ⊑ ϕ(θ).

When R = →βδ, and there is an α, →βδ-reduction sequence ϕ such that ϕ(0) = M and ϕ(α) = N, we write $M \longrightarrow^\alpha_\infty N$, or we write $M \longrightarrow_\infty N$ when there is α such that $M \longrightarrow^\alpha_\infty N$. The notion of reduction sequences we use is called weakly convergent in the literature.

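terms $I = \lambda x^0. x^0$, $M = \lambda x^0. I(f^{0\to 0} x^0)$ and $J = \lambda x^0. f^{0\to 0} x^0$, then the term $(Y f^{0\to 0}. M)\, x^0$ can be reduced in the following ways:

$$
\begin{aligned}
(Y f^{0\to 0}. M)\, x^0 &\longrightarrow_\infty (\lambda x^0. I((Y f^{0\to 0}. M)\, x^0))\, x^0 \\
&\longrightarrow_\infty (\lambda x^0. I((\lambda x^0. I((Y f^{0\to 0}. M)\, x^0))\, x^0))\, x^0 \\
&\longrightarrow_\infty I(I((Y f^{0\to 0}. M)\, x^0)) \\
&\longrightarrow_\infty I^\omega
\end{aligned}
$$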

where $I^\omega$ is the term satisfying the syntactic identity $I\, I^\omega = I^\omega$, that is, $I^\omega = I(I(I \cdots))$.

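The term $(Y f^{0\to 0}. M)\, x^0$ can also be reduced as follows:

$$(Y f^{0\to 0}. M)\, x^0 \longrightarrow_\infty (\lambda x^0. (Y f^{0\to 0}. J)\, x^0)\, x^0 \longrightarrow_\infty (\lambda x^0. (Y J)\, x^0)\, x^0 \longrightarrow^\omega_\infty u$$

where u is the infinite term satisfying the syntactic identity $u = (\lambda x^0. u)\, x^0$, that is, $u = (\lambda x^0. (\lambda x^0. (\cdots)\, x^0)\, x^0)\, x^0$.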

Since, for any redex that is contracted in $I^\omega$, the result of the contraction is $I^\omega$ again, and similarly u can only be reduced to u, it is obvious that there is no term P so that $I^\omega \longrightarrow_\infty P$ and $u \longrightarrow_\infty P$. Nevertheless we do not assume that those terms are meaningful values; we rather assume that meaningful values are terms in weak-head normal form.

With this notion of value we may enrich our operational semantics with a reduction to Ω for terms that do not yield a value.

Definition 4 We introduce the relation →Ω, the Ω-contraction, so that, given a term M of Λα,∞(Ω0), if there is no term N in weak-head normal form so that $M \longrightarrow_\infty N$, then M →Ω $\Omega^\alpha$. We let →βδΩ be the union of →βδ and →Ω.

The operational semantics −→Ω,∞ on infinitary terms has nice properties; in particular it has the Church-Rosser property (a consequence of Terese [KdV03], Theorem 12.9.6, p. 699).

Theorem 5 If M −→Ω,∞ N1 and M −→Ω,∞ N2, then, there is P so that N1 −→Ω,∞ P and N2 −→Ω,∞ P .

This confluence property partly grounds the claim that Böhm trees of terms are their normal forms, since it shows that every term has a unique normal form. It remains to see that these unique normal forms are the Böhm trees. The main problem is that there is a gap between the relation →Ω, which may reduce a term M to Ω only when there is no infinitary reduction that turns M into a weak-head normal form, and the Böhm tree of a term M, which is Ω when there is no finite head-reduction that turns M into a weak-head normal form. It is precisely this gap that Lemma 6 fills. We now give its proof.

Lemma 6 Given M in Λα,∞(Ω), if there is a term N in weak-head normal form so that $M \longrightarrow_\infty N$, then there is k < ω and P in weak-head normal form so that $M \longrightarrow^k_{h,\infty} P$.

Proof

Let γ be an ordinal so that $M \longrightarrow^\gamma_\infty N$ and let ϕ be the γ, →βδ-reduction sequence reducing M to N. In case γ < ω, it is easy to see that there is a finite term M′ ⊑ M which can be put in finitely many steps of head-contraction in weak-head normal form P′. Therefore performing those head-contraction steps on M yields a term P so that P′ ⊑ P, and therefore P is also in weak-head normal form.

In case γ ≥ ω, there is a finite ordinal k and a limit ordinal γ′ so that γ = γ′ + k. Let P = ϕ(γ′); we have that $P \longrightarrow^k_\infty N$. As N is in weak-head normal form, there is a finite term P′ ⊑ P that head-reduces in finitely many steps to a term P″ in weak-head normal form. By definition of γ, →βδ-reduction sequences, there is θ < γ′ so that P′ ⊑ ϕ(θ). As a consequence, ϕ(θ) head-reduces in finitely many steps to a weak-head normal form. Thus, by transfinite induction, M head-reduces to a weak-head normal form in finitely many steps.

Theorem 7 Given M in Λα,∞(Ω), there is an ordinal γ ≤ ω so that $M \longrightarrow^\gamma_{\Omega,\infty} BT(M)$.

2.3 Canonical terms

Infinitary terms may contain infinitely many free λ-variables. But, as we already said, we are interested in closed terms and the subterms of closed terms contain only finitely many free λ-variables. Thus, from now on, we restrict our attention to terms with finitely many free λ-variables.

It will be convenient to work with terms in what we call canonical form. This form makes it possible to separate λ-variables from Y-variables, isolating recursion from parameter instantiation in a way that will prove useful.

Definition 8 An infinitary term M is in canonical form when none of its subterms of the form Y x. N contains a free λ-variable.

There is a simple process that transforms every term M with a finite set of free λ-variables into a canonical form. For this it suffices to abstract away free λ-variables in every subterm Y x. N of M. More precisely, if $\{x_1, \dots, x_n\}$ is the set of free λ-variables of N (which is necessarily finite) we perform the replacement

$$Y x.\, N \;\mapsto\; (Y y.\, \lambda x_1 \dots \lambda x_n.\, N[x := y\, x_1 \dots x_n])\, x_1 \dots x_n$$

With standard techniques based on approximations, it is rather direct to show that the new term has the same Lévy-Longo tree as M.

Proposition 1 For every infinite closed λY-term M of type 0 over a tree signature, the result of the above operation is a canonical term generating the same Böhm tree.
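To make the replacement concrete, here is a small worked instance. The subterm and the names a, z, y are chosen for illustration only: a is a binary constant, z is a free λ-variable of type 0, and y is a fresh Y-variable of type 0 → 0.

```latex
\[
  Y x.\, a\, z\, x
  \;\mapsto\;
  (Y y.\, \lambda z.\, (a\, z\, x)[x := y\, z])\, z
  \;=\;
  (Y y.\, \lambda z.\, a\, z\, (y\, z))\, z
\]
```

The only free λ-variable of N = a z x is z, so n = 1. In the result, the subterm Y y. λz. a z (y z) contains no free λ-variable, as Definition 8 requires, and both sides unfold to a z (a z (a z ⋯)).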

2.4 Krivine machine

We are now going to introduce the notion of Krivine machine that will allow us to compute the normal forms of infinite terms.

Let us fix for this section a concrete canonical term M of type 0. Working with concrete terms gives us some control over Y-variables. We will assume that in the term M every Y-binder binds a distinct Y-variable and that there is no occurrence of Ω. This assumption on the names of Y-variables allows us to stipulate the existence of a function term that maps a Y-variable $x^\alpha$ to the subterm $Y x^\alpha. N$ of M where it is bound. In later sections, we will represent terms as infinite graphs and the function term will be implemented directly with an edge between Y-binders and the positions they bind in the term.

The Krivine machine [Kri07] is an abstract machine that computes the weak head normal form of a term, using explicit substitutions, called environments. Environments are functions assigning closures to variables, and closures themselves are pairs consisting of a term and an environment. This mutually recursive definition is schematically represented by the grammar:

C ::= (N, ρ)        ρ ::= ∅ | ρ[x ↦ C]

As in this grammar, we will use ∅ for the empty environment. The notation ρ[x ↦ C] represents the environment which associates the same closure as ρ to variables except for the variable x that it maps to C. The terms N that can be used to form a closure must be subterms of the term M we are considering. This restriction is harmless since, when computing the normal form of M, the Krivine machine only needs closures made with subterms of M (see Theorem 11).

Since we work within a typed context, these two notions follow the typing discipline: in an environment the types of a variable and the closure it is assigned to must be the same. The type of a closure (N, ρ) is simply the type of the term N. We require that in a closure (N, ρ), the environment is defined for every free λ-variable of N, while the values of Y-variables are given by the function term.

Intuitively a closure C denotes a closed λ-term E(C) as follows. A closure (N, ρ) denotes a λ-term obtained by substituting: (i) for every free λ-variable x of N the term denoted by the closure ρ(x), and (ii) for every free Y -variable x the term term(x). It is important to note that the definition of an environment is inductive, so every environment has finite depth. More precisely this means that every sequence of the form:

$\rho_0(x_0) = (N_1, \rho_1),\ \rho_1(x_1) = (N_2, \rho_2),\ \dots$

is finite and ends in the empty environment.

A stack $S \equiv C_1 \dots C_n$ is a possibly empty sequence of closures. We use ⊥ to denote the empty stack.

2. ρ is an environment defined for all free variables of N ;

3. S is a stack $C_1 \dots C_k$, where k and the types of the closures are determined by the type of N: the type of $C_i$ is $\alpha_i$ where the type of N is $\alpha_1 \to \dots \to \alpha_k \to 0$.

A configuration (N, ρ, S) represents an infinitary term:

$E((N, \rho, S)) = E(N, \rho)\, E(C_1) \dots E(C_n)$

where $C_1 \dots C_n$ are the closures on the stack S. Notice that the third condition in the definition of configurations implies that E(N, ρ, S) needs to be a term of type 0. Observe also that E(N, ρ, S) may not be a subterm of M even though N as well as every term appearing in ρ, $C_1, \dots, C_n$ is.

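The transition rules of the Krivine machine are:

$$
\begin{aligned}
(\lambda x.N,\ \rho,\ (K, \rho')S) &\to (N,\ \rho[x \mapsto (K, \rho')],\ S) \\
(Y x.N,\ \rho,\ S) &\to (N,\ \rho,\ S) \\
(N K,\ \rho,\ S) &\to (N,\ \rho,\ (K, \rho)S) \\
(x,\ \rho,\ S) &\to (N,\ \rho',\ S) \quad \text{where } (N, \rho') = \rho(x) \text{ and } x \text{ is a } \lambda\text{-variable} \\
(x,\ \rho,\ S) &\to (\mathit{term}(x),\ \emptyset,\ S) \quad \text{where } x \text{ is a } Y\text{-variable}
\end{aligned}
$$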

It is rather straightforward to check that these computation rules transform configurations of the Krivine machine into configurations of the Krivine machine. This means that the typing properties of environments, and the fact that only subterms of M are used in a closure or as the main term of a configuration, are preserved by those reduction rules. The first two rules do β-contraction and δ-contraction respectively. The application rule creates a closure and puts it on the stack. The λ-variable rule looks up the meaning of the variable in the environment. The Y-variable rule replaces the variable by its definition. Since we work with concrete canonical terms, there are no free λ-variables in term(x) so the environment can be discarded.
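The five transition rules can be sketched as a small Python interpreter. This is a minimal sketch under assumptions that are not part of the paper: terms are finite and encoded as nested tuples, type annotations are dropped, constants may have any arity, and divergence is approximated by a step budget `fuel`. The names `app`, `binders`, `run` and `kt` are hypothetical helpers introduced here.

```python
# Terms are nested tuples (an encoding assumed for this sketch only):
#   ("const", b)   ("lvar", x)   ("yvar", x)
#   ("app", M, N)  ("lam", x, N) ("Y", x, N)

def app(head, *args):
    """Build the left-nested application head a1 ... an."""
    for arg in args:
        head = ("app", head, arg)
    return head


def binders(term, table=None):
    """The function `term` of the text: Y-variable name -> its binder Y x.N."""
    table = {} if table is None else table
    if term[0] == "Y":
        table[term[1]] = term
    for sub in term[1:]:
        if isinstance(sub, tuple):
            binders(sub, table)
    return table


def run(term, env, stack, table, fuel):
    """Apply the transition rules until a constant is reached or fuel runs out."""
    for _ in range(fuel):
        tag = term[0]
        if tag == "lam":    # beta rule; well-typed configurations have a non-empty stack here
            closure, stack = stack[0], stack[1:]
            env = {**env, term[1]: closure}
            term = term[2]
        elif tag == "Y":    # delta rule: enter the body of the binder
            term = term[2]
        elif tag == "app":  # push the argument together with the current environment
            stack = ((term[2], env),) + stack
            term = term[1]
        elif tag == "lvar":  # look the lambda-variable up in the environment
            term, env = env[term[1]]
        elif tag == "yvar":  # jump to the binder and discard the environment
            term, env = table[term[1]], {}
        else:               # a constant: the machine is blocked
            return term[1], stack
    return None


def kt(term, env, table, depth, fuel=1000):
    """The tree KT cut at `depth`; 'Omega' means no constant was found within `fuel` steps."""
    if depth == 0:
        return "..."
    result = run(term, env, (), table, fuel)
    if result is None:
        return "Omega"
    head, stack = result
    return (head,) + tuple(kt(n, e, table, depth - 1, fuel) for n, e in stack)


a, b, c = ("const", "a"), ("const", "b"), ("const", "c")
z, f, x = ("lvar", "z"), ("yvar", "f"), ("yvar", "x")
# M = (Y f. lam z. a z (f (b z z))) (Y x. c x x), a concrete canonical term of type 0
M = app(("Y", "f", ("lam", "z", app(a, z, app(f, app(b, z, z))))),
        ("Y", "x", app(c, x, x)))
tree = kt(M, {}, binders(M), depth=2)
```

Environments are copied rather than mutated, so closures pushed on the stack keep the environment they were created with. Reporting `"Omega"` when the budget is exhausted is only an approximation of divergence.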

Notice that the terms reduced by the Krivine machine, that are in its environment or on its stack, are in general infinite terms which are inspected with respect to their syntax. A more concrete, but equivalent, way of representing the same thing would be to build the Krivine machine with addresses in the infinite term and a call to an external mechanism so as to read its labels.

1. if started in (N, ρ, ⊥) the machine reaches a configuration $(b, \rho, C_1 C_2)$ where b is a constant and $C_i = (N_i, \rho_i)$ are closures of type 0, then KT(N, ρ, ⊥) is a tree with the root labelled by b and two subtrees: $KT(N_1, \rho_1, \bot)$ and $KT(N_2, \rho_2, \bot)$.

2. in the other case $KT(N, \rho, \bot) = \Omega^0$.

We write KT(M) for KT(M, ∅, ⊥).

The next lemma says that the Krivine machine computes the weak head normal form (that is the same as head normal form in our case).

Lemma 10 Let (N, ρ, ⊥) be a configuration of the Krivine machine. The term E(N, ρ, ⊥) has a head normal form iff the Krivine machine reduces (N, ρ, ⊥) to a configuration (b, ρ, S) for some constant b.

Proof

Recall that Lemma 6 guarantees that head reduction reaches a weak head normal form if a term has one. We also know that the Krivine machine performs head reduction. It suffices to examine in what configurations the machine gets blocked. Looking at the rules we can see that there can be only two kinds of configurations when no rule is applicable. The first is (λx.N, ρ, ⊥), but it is excluded by the third condition on configurations of the Krivine machine. The second is (h, ρ, S) where h is either a constant or a variable that is not in the domain of definition of ρ. The case of a variable is also excluded by the second condition on the form of configurations of the Krivine machine. Hence the machine stops when it reaches a constant that is the head of the normal form of E(N, ρ, ⊥).

Lemma 10 entails that the Krivine machine gives an effective way of computing the Böhm tree of an infinitary term. The second statement of the following theorem follows by a direct inspection.

Theorem 11 Fix a tree signature Σ. For every concrete canonical closed λY-term M of type 0, we have BT(M) = KT(M). All the terms appearing in configurations of the Krivine machine during the computation of KT(M) are subterms of M.

3 Transfer theorem for evaluation

our representation of terms by showing two facts: (i) the set of all terms is definable in MSOL, (ii) unless the polynomial-time hierarchy collapses, the main theorem is false when we do not fix the number of bound λ-variables.

Terms will be represented as labelled, potentially infinite, graphs. For us here such a graph is a structure of the form $\mathcal{M} = \langle V, \{E_i\}_{i=1,2}, \{P_i\}_{i=1,\dots,n} \rangle$, where V is the set of vertices, every $E_i$ is a binary relation on vertices, and every $P_i$ is a subset of vertices. We will have two edge relations representing left and right successor. In our case the predicates $P_i$ will form a partition of V. So every vertex will have a unique label given by the predicate it belongs to.

Monadic second-order logic (MSOL) is an extension of first-order logic with quantification over sets of elements and the membership predicate x ∈ Z, where x is a variable ranging over elements, and Z is a variable ranging over sets of elements. The definition of satisfiability of a formula ϕ in a structure M, denoted M ⊨ ϕ, is standard.

Let us fix a tree signature Σ with finitely many constants other than Ω. As postulated at the beginning of Section 2.1, for simplicity of notation all the constants are binary. We would like to consider terms as models of formulas of monadic second-order logic. We will work with terms over some arbitrary but finite vocabulary. We take a finite set of typed λ-variables X = $\{x_1^{\alpha_1}, \dots, x_k^{\alpha_k}\}$, and a finite set of types T. We denote by Terms(Σ, T, X) the set of infinite closed concrete terms M over the signature Σ such that M uses only λ-variables from X, and every subterm of M has a type in T. We also write CTerms(Σ, T, X) for the terms M of Terms(Σ, T, X) that are in canonical form. Observe that bound λ-variables in M should come from X. In contrast we do not put restrictions on the use of Y-variables. It will be convenient to assume that every Y-variable in M is bound at most once: for every Y-variable x there is at most one occurrence of Y x in M.

A term from Terms(Σ, T, X) can be seen as a labelled tree where the labels come from a finite alphabet, except for the Y-variables and Y-binders. We will now eliminate the possible source of infiniteness of labels related to Y-variables and Y-binders. Take a closed term M considered as a tree. For every node of this tree labelled by a Y-variable $x^\alpha$ we put an $E_1$-edge from the node to the node labelled $Y x^\alpha$. Since M is closed, such a node exists and is an ancestor of the node labelled by x; such a node is also unique since we assume that every Y-variable is bound at most once. In the next step we introduce a new symbol α and for every node labelled with a Y-variable of type α, we change its label to α. Finally, we replace all labels of the form $Y x^\alpha$ by just $Y^\alpha$. This way we have eliminated all occurrences of Y-variables from labels, but now a term is represented not as a labelled tree but as a labelled graph. Let us denote it by Graph(M).

Figure 3: M and Graph(M)

Observe that the nodes of this graph have labels from a finite set

$$\mathit{Talph}(\Sigma, T, X) = \Sigma \cup \{@^\alpha, Y^\alpha, \alpha : \alpha \in T\} \cup X \cup \{\lambda^{\alpha\to\beta} x^\alpha : \alpha \in T \wedge \alpha \to \beta \in T \wedge x^\alpha \in X\}.$$

There are two edge relations, $E_1$ and $E_2$, in Graph(M) since nodes labelled by an application symbol have both a left and a right successor. The nodes with other labels have no successor (nodes labelled with labels from Σ ∪ X) or just one successor, given by $E_1$. An example of Graph(M) is presented in Figure 3.
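The passage from a term to Graph(M) can be sketched in Python for finite terms. This is an illustrative sketch only: the tuple encoding of terms, the function name `graph_of`, and the untyped labels `"@"`, `"Y"`, `"yvar"` and `"lam x"` are assumptions made here, standing in for the typed labels of Talph(Σ, T, X).

```python
# Terms are nested tuples (an encoding assumed for this sketch only):
#   ("const", b)   ("lvar", x)   ("yvar", x)
#   ("app", M, N)  ("lam", x, N) ("Y", x, N)

def graph_of(term):
    """Return (labels, e1, e2): node i has label labels[i]; e1, e2 are sets of edges."""
    labels, e1, e2 = [], set(), set()

    def new_node(label):
        labels.append(label)
        return len(labels) - 1

    def build(t, binder_of):
        tag = t[0]
        if tag == "app":
            v = new_node("@")
            e1.add((v, build(t[1], binder_of)))   # left successor
            e2.add((v, build(t[2], binder_of)))   # right successor
        elif tag == "lam":
            v = new_node("lam " + t[1])
            e1.add((v, build(t[2], binder_of)))
        elif tag == "Y":
            v = new_node("Y")                     # the variable name is dropped from the label
            e1.add((v, build(t[2], {**binder_of, t[1]: v})))
        elif tag == "yvar":
            v = new_node("yvar")                  # the name is replaced by a back edge
            e1.add((v, binder_of[t[1]]))
        else:                                     # constant or lambda-variable: no successor
            v = new_node(t[1])
        return v

    build(term, {})
    return labels, e1, e2


a, z, x = ("const", "a"), ("lvar", "z"), ("yvar", "x")
# M = Y x. lam z. a z (x z)
M = ("Y", "x", ("lam", "z", ("app", ("app", a, z), ("app", x, z))))
labels, e1, e2 = graph_of(M)
```

In the resulting graph no label mentions the Y-variable x; its single occurrence is a node labelled `"yvar"` whose $E_1$-edge points back to the node of its binder.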

Since Graph(M) is a labelled graph over a finite alphabet it makes sense to talk about satisfiability of an MSOL formula in this graph. We will just write M ⊨ ϕ instead of Graph(M) ⊨ ϕ. The first easy, but important, observation is that for fixed Σ, T, X, there is an MSOL formula determining if a graph is of the form Graph(M) for some M ∈ Terms(Σ, T, X). Indeed, in this case we deal with models over the signature consisting of two binary relations E1, E2 and a unary relation Pb for every b ∈ Talph(Σ, T, X). The formula should say that the sets Pb form a partition of the set of vertices and then express the conditions from the definition of infinite λY-terms on page 5. These conditions are clearly expressible in MSOL as they talk about dependencies between the label of a node and the labels of its successors.

For a closed term M ∈ Terms(Σ, T, X) of type 0, its Böhm tree is a tree with nodes labelled by symbols from Σ. Hence one can talk about satisfiability of MSOL formulas in BT(M). The transfer theorem says that the MSOL-theory of BT(M) is recursive in the MSOL-theory of M.

Theorem 12 (Transfer theorem) Let Σ be a finite tree signature, X a finite set of typed variables, and T a finite set of types. For every MSOL formula ϕ one can effectively construct an MSOL formula ϕ̂ such that for every λY-term M ∈ Terms(Σ, T, X) of type 0:

BT(M) ⊨ ϕ iff M ⊨ ϕ̂.

In principle, it is possible to represent terms with an unbounded number of λ-variables by using the same trick for λ-binder as the one we have used for the Y -binder. However, we conjecture that the transfer theorem does not hold when we allow infinitely many λ-variables. Below we give a simple argument under the hypothesis that the polynomial hierarchy is strict.

It is customary to represent booleans with λ-terms of type 0 → 0 → 0: true is represented by λxy.x and false by λxy.y. This permits the definition of the boolean connectives as terms too: and = λb1b2xy.b1(b2xy)y, or = λb1b2xy.b1x(b2xy) and neg = λbxy.byx. One can also define propositional quantifiers All = λf.and (f true)(f false) and Ex = λf.or (f true)(f false). This allows us to represent in a direct manner every quantified boolean formula θ as a simply typed finite closed term Mθ such that Mθ reduces to true iff θ is true. Observe that Mθ has a linear size with respect to that of θ. So if we assume that we are given two distinct terms Ntrue, Nfalse of type 0, we get that Mθ Ntrue Nfalse reduces to Ntrue iff θ is true. Take an MSOL formula ϕ that is true exactly in Graph(Ntrue). The transfer theorem without any restriction on the number of λ-variables would give an MSOL formula ϕ̂ such that, for every quantified boolean formula θ: θ is true iff Mθ ⊨ ϕ̂. The model-checking problem for the fixed formula ϕ̂ belongs to the polynomial hierarchy: the level of the hierarchy is bounded from above by the alternation of quantifiers in ϕ̂. So the extension of the transfer theorem to an infinite number of λ-variables would imply that the QBF satisfiability problem is in the polynomial hierarchy. Since QBF is Pspace-complete, the hierarchy would then collapse, contradicting the hypothesis.
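The boolean encoding above can be tried out directly with curried Python functions. The sketch below transcribes the terms true, false, and, or, neg, All and Ex literally; the helper `decode` and the strings `"N_true"` and `"N_false"`, which stand in for the two distinct terms of type 0, are assumptions of this illustration.

```python
# Church-style booleans of type 0 -> 0 -> 0, written as curried functions.
true = lambda x: lambda y: x
false = lambda x: lambda y: y

# Connectives, transcribed from the lambda-terms in the text.
and_ = lambda b1: lambda b2: lambda x: lambda y: b1(b2(x)(y))(y)
or_ = lambda b1: lambda b2: lambda x: lambda y: b1(x)(b2(x)(y))
neg = lambda b: lambda x: lambda y: b(y)(x)

# Propositional quantifiers: try both truth values of the bound variable.
All = lambda f: and_(f(true))(f(false))
Ex = lambda f: or_(f(true))(f(false))


def decode(b):
    """Apply an encoded boolean to two distinct placeholder results."""
    return b("N_true")("N_false")


def differ(p, q):
    """Encoded formula (p and not q) or (not p and q)."""
    return or_(and_(p)(neg(q)))(and_(neg(p))(q))


# theta1 = forall p. exists q. p differs from q   (a true QBF)
theta1 = All(lambda p: Ex(lambda q: differ(p, q)))
# theta2 = exists q. forall p. p differs from q   (a false QBF)
theta2 = Ex(lambda q: All(lambda p: differ(p, q)))
```

Each quantifier duplicates its body once, so the encoded term stays linear in the size of the formula, while its evaluation explores all assignments.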

Using finite terms to prove that the transfer theorem does not hold in case an unbounded number of λ-variables is allowed cannot give a better result than the example we have given with QBF since the evaluation in a fixed finite model of finite terms that use only types from a finite set T can easily be proved to be in Pspace. Thus, so as to get rid of the hypothesis that the polynomial hierarchy is strict to prove that the transfer theorem does not hold when an unbounded number of λ-variables is allowed, it seems that we need to use an infinite term.

4 Proof of the transfer theorem

4.1 Parity automata and MSOL on infinite binary trees

Recall that Σ is a fixed set of constants of type 0 → 0 → 0. For a closed term M of type 0, these constants label nodes in BT(M). Since BT(M) is an infinite binary tree we can use standard non-deterministic parity automata to define sets of Böhm trees. Such an automaton has the form

A = ⟨Q, Σ, q0 ∈ Q, δ : Q × Σ → P(Q²), rk : Q → {1, . . . , d}⟩ (1)

where Q is a finite set of states, q0 is the initial state, δ is the transition function, and rk is a function assigning a rank (a number between 1 and d) to every state.

In general, an infinite binary tree is a function t : {0, 1}* → Σ. A run of A on t is a function r : {0, 1}* → Q such that r(ε) = q0 and for every sequence w ∈ {0, 1}*: (r(w0), r(w1)) ∈ δ(r(w), t(w)). The run is accepting if for every infinite path in the tree, the sequence of states assigned to this path satisfies the parity condition determined by rk; this means that the maximal rank of a state seen infinitely often should be even.
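The two ingredients of this definition, local consistency of a run with δ and the max-parity condition, can be sketched as follows. Trees and runs are modelled as Python functions on words over "0" and "1", the consistency check is only performed up to a finite depth, and the parity condition is evaluated on the ranks of the repeating part of an ultimately periodic path. All names and the small automaton are hypothetical and chosen for illustration.

```python
from itertools import product


def words_up_to(depth):
    """All words over {0, 1} of length at most depth, shortest first."""
    for n in range(depth + 1):
        for letters in product("01", repeat=n):
            yield "".join(letters)


def locally_consistent(tree, run, delta, q0, depth):
    """Check r(eps) = q0 and (r(w0), r(w1)) in delta(r(w), t(w)) for short w."""
    if run("") != q0:
        return False
    return all(
        (run(w + "0"), run(w + "1")) in delta[(run(w), tree(w))]
        for w in words_up_to(depth)
    )


def max_parity_ok(cycle_ranks):
    """Max-parity condition on a path whose ranks eventually repeat cycle_ranks."""
    return max(cycle_ranks) % 2 == 0


# Hypothetical automaton: the state records the letter just read.
DELTA = {
    ("q", "a"): {("q", "q")},
    ("q", "b"): {("p", "p")},
    ("p", "a"): {("q", "q")},
    ("p", "b"): {("p", "p")},
}
ALL_A = lambda w: "a"                        # the tree labelled a everywhere
GOOD_RUN = lambda w: "q"                     # stays in q on that tree
BAD_RUN = lambda w: "p" if w == "01" else "q"
```

Only the ranks occurring on the cycle matter for acceptance, since ranks in the finite prefix are seen finitely often.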

Formally, it may be the case that BT(M) also contains nodes labelled with Ω^0. We will simply assume that every tree containing Ω^0 is rejected by the automaton. This is frequently done in this context. Handling Ω^0 would not be difficult but would require adding one more case in all the constructions. The other, more difficult, solution is to convert a term to a term not generating Ω^0.

4.2 Structure of the proof


Figure 4: Schema of the proof

4.3 Game K(A, M )

We now give the definition of RT (A, M ), the runs of the automaton A on the graph of configurations of the Krivine Machine computing BT (M ). The actual runs of A on BT (M ) can easily be read off RT (A, M ).

Definition 13 For a given M ∈ CTerms(Σ, T , X) of type 0 , and a parity automaton A we define the tree of runs RT (A, M ) of A on the graph of configurations of the execution of the Krivine Machine on M :

1. The root of the tree is labeled with q0 : (M, ∅, ⊥)

2. A node labeled q : (a, ρ, S) has a successor (q0, q1) : (a, ρ, S) for every (q0, q1) ∈ δ(q, a).

3. A node labeled (q0, q1) : (a, ρ, (v0, N0, ρ0)(v1, N1, ρ1)) has two successors q0 : (N0, ρ0, ⊥) and q1 : (N1, ρ1, ⊥).

4. A node labeled q : (λx.N, ρ, CS) has a unique successor labeled q : (N, ρ[x ↦ C], S).

5. A node q : (Y x.N, ρ, S) has a unique successor q : (N, ρ, S).

6. A node v labeled q : (x, ρ, S), for x a recursive variable, has a unique successor q : (term(x), ∅, S).

7. A node v labeled q : (N K, ρ, S) has a unique successor labeled q : (N, ρ, (v, K, ρ)S). We say that here a v-closure is created.
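The non-branching rules 4 to 7 can be sketched as a single step function on configurations (term, environment, stack). This is an illustration under stated assumptions, not the construction of the paper: terms are encoded as tuples, node identifiers are simply step counters, the automaton state is omitted because these rules do not change it, and `term_of` is a hypothetical dictionary playing the role of term(x).

```python
def step(node_id, config, term_of):
    """One machine step for rules 4-7; returns None if none of them applies."""
    term, env, stack = config
    kind = term[0]
    if kind == "lam":        # rule 4: bind the top closure to x
        _, x, body = term    # the typing invariant guarantees a non-empty stack
        return (body, {**env, x: stack[0]}, stack[1:])
    if kind == "Y":          # rule 5: enter the body of Y x.N
        return (term[2], env, stack)
    if kind == "rvar":       # rule 6: unfold a recursion variable
        return (term_of[term[1]], {}, stack)
    if kind == "app":        # rule 7: create a closure tagged with the node
        _, fun, arg = term
        return (fun, env, ((node_id, arg, env),) + stack)
    return None              # constants and lambda-variables are not modelled


def run(config, term_of, max_steps):
    """Iterate step, using the step counter as the node identifier."""
    trace = [config]
    for node_id in range(max_steps):
        nxt = step(node_id, trace[-1], term_of)
        if nxt is None:
            break
        trace.append(nxt)
    return trace


# Hypothetical example: (lambda z. Y x. x) c, whose computation loops forever.
K = ("const", "c")
LOOP = ("Y", "x", ("rvar", "x"))
M = ("app", ("lam", "z", LOOP), K)
TERM_OF = {"x": LOOP}
TRACE = run((M, {}, ()), TERM_OF, 4)
```

The closure pushed by the application rule remembers the node 0 where it was created, which is the extra piece of information the definition adds to the usual Krivine machine.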


The definition is as expected but for the fact that in the rule for application we store the current node in the closure. When we use the closure in the variable rule or constant rule (rules 8 and 3), the stored node does not influence the result. The stored node allows us to detect what is exactly the closure that we are using. This will be important in the proof.

Notice also that the rules 2, 3, 4 rely on the typing properties of the Krivine machine ensured by the definition of its configurations (cf. page 13). Indeed, when the machine reaches a configuration of the form (a, ρ, S) then, since we are working with a tree signature, a is of type 0 → 0 → 0. In consequence, the stack S consists of two closures of type 0. The environment ρ plays no role in such a configuration as a is a constant. Also from the typing invariant we get that, when the machine is in a configuration like (λx.N, ρ, S), S cannot be the empty stack.

Definition 14 We use the tree RT (A, M ) to define a game between two players: Eve chooses a successor in nodes of the form q : (a, ρ, S), and Adam in nodes (q0, q1) : (a, ρ, S). We set the parity rank of nodes labeled q : (a, ρ, S) to rk (q), and the parity ranks of all the other nodes to 1. We can use max parity condition to decide who wins an infinite play. Let us call the resulting game K(A, M ).

The following is a direct consequence of the definitions and Theorem 11.

Proposition 2 For every parity automaton A and concrete canonical term M, Eve has a winning strategy from the root position in K(A, M) iff A accepts BT(M).

The only interesting point to observe is that it is important to disallow rank 0 in the definition of parity automaton since we assign rank 1 to all “intermediate” positions. This is linked to our handling of infinite sequences of reductions of the Krivine machine without reaching a head normal form. Such a sequence results in a node labeled Ω^0 in a Böhm tree, hence the tree should not be accepted by the automaton. Indeed, in the game K(A, M) this will give an infinite sequence of states of rank 1.

Hence deciding whether BT(M) is accepted by A is reduced to deciding who has a winning strategy from the root of K(A, M). We will introduce a “smaller” game G(A, M), and show that the winner in the two games is the same. While G(A, M) will still be infinite, it will be definable by an MSOL formula inside the term M itself.

4.4 Game G(A, M )


We reduce this game to G(A, M ) where we remove the second source of infiniteness.

Figure 5: Game K(A, M) on the left, and G(A, M) on the right.

The idea of the reduction is to eliminate stacks and environments using alternation. Consider the situation in Figure 5. On the top left we have a position v in the game K(A, M) where the application rule is used. This means that the new closure (K, ρ) is put on the stack of the Krivine machine (node v1). In some descendant v′ of v1 the closure may be used. This means that the machine gets to the variable x whose value is the closure in question. Let us consider the simplest case when K is of type 0. Due to the typing invariants on configurations of the Krivine machine, we know that the stack is empty in v′. So the configuration in the successor v′1 of v′ is constructed just from the closure. This observation allows us to shortcut the path from v to v′1. This is what we do in the game G(A, M).

The right part of Figure 5 represents the result of taking these shortcuts. In a set R, that we call the residual of the closure (K, ρ), we have collected all states q′ which appear when the closure (K, ρ) is used: as in the node v′1. For every such state we add directly a successor of v labeled with the corresponding configuration. So the edge from v to v′1 in the right picture simulates the path from v to v′1 in the left picture. Now the question is where we get R from. We actually just guess it and check if it is big enough. This is the task of the leftmost transition in the right picture. The gray triangle is the same as in the original game. But this time instead of a closure we have put R on the stack. When we get to v′ we just check that the state in v′ is in R. This check guarantees that we have put all uses of the closure into R.

lighter methods. In order to deal with parity conditions we need not only to remember the state in which the closure is used, but also the biggest rank on the path from creation of the closure to its use. This is symbolized by r′ in the left part of the figure. We use the same r′ as the rank of the edge in the reduced game.

The final level of complication comes from the fact that till now we have assumed that K is of type 0, but in general we need to deal with terms K of types of any order. The difference is that if K is not of type 0 then the configuration in v′ on the left will be of the form q′ : (x, ρ′, S) for some stack S whose type is determined by the type of x, that is the same as the type of K. Observe that the typing invariant of the Krivine machine tells us that the orders of types of closures on the stack are always strictly smaller than that of K. So by induction on types we can assume that S is composed of residuals and not of closures. Since there are finitely many residuals, the residual for K will now be a function from sequences of residuals representing possible stacks S to a set of states with ranks as in the case when K had type 0.

After these explanations we will proceed to define residuals, the lifting operation on residuals, and finally the game G(A, M ). The lifting operation on residuals will permit us to deal with all the book-keeping required by the parity condition in an elegant way.

Definition 15 (Residuals) Recall that Q is the set of states of A and d is the maximal value of the rank function of A. Let [d] stand for the set {1, . . . , d}. For every type τ = τ1 → · · · → τk → 0, the set of residuals Dτ is the set of functions Dτ1 → · · · → Dτk → P(Q × [d]).

For example, D0 is P(Q × [d]) and D0→0 is P(Q × [d]) → P(Q × [d]). The meaning of residuals will become clearer when we define the game.
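Since each Dτ is a finite set, its size can be computed by recursion on the type: D0 has 2^(|Q|·d) elements, and Dτ for τ = τ1 → · · · → τk → 0 consists of all functions from Dτ1 × · · · × Dτk to D0. The following sketch computes these sizes; the encoding of a type as the tuple of its argument types (so that 0 is the empty tuple) and the name `residual_count` are assumptions of this illustration.

```python
def residual_count(tau, n_states, d):
    """Number of residuals of type tau for n_states states and ranks 1..d.

    A type tau1 -> ... -> tauk -> 0 is encoded as the tuple (tau1, ..., tauk);
    the base type 0 is the empty tuple.
    """
    base = 2 ** (n_states * d)          # |D_0| = |P(Q x [d])|
    argument_tuples = 1
    for arg in tau:
        argument_tuples *= residual_count(arg, n_states, d)
    # Functions from the argument tuples into D_0.
    return base ** argument_tuples


TYPE_0 = ()
TYPE_0_TO_0 = (TYPE_0,)
TYPE_0_0_TO_0 = (TYPE_0, TYPE_0)
TYPE_ORDER_2 = (TYPE_0_TO_0,)           # (0 -> 0) -> 0
```

The sizes grow as a tower of exponentials in the order of the type, but they remain finite for a fixed finite set of types, which is what the construction relies on.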

A position of the game G(A, M ) will be of one of the forms: q : (N, ρ, S), or (q0, q1) : (N, ρ, S), or (q, R) : (N, ρ, S)

where q, q0, q1 are states of A, N is a subterm of M ; ρ is a function assigning a residual to every λ-variable that has a free occurrence in N , and S is a stack of residuals. Of course the types of residuals will agree with the types of λ-variables/arguments they are assigned to. Notice that we use the same letter ρ to denote an environment as well as an assignment of residuals. Similarly for S. It will be always clear from the context what object is denoted by these letters.


the biggest ranks guessed on the paths from the node where the closure is created to the nodes where it is used are correct. The role of the lifting operation we introduce here is to perform this verification. Of course this operation is defined for residuals of any orders.

Definition 16 A lifting of a residual R : Dτ1 → · · · → Dτk → D0 by a rank r is a residual R r of the same type as R satisfying, for every sequence of arguments S:

R r(S) = {(q1, r1) ∈ R(S) : r1 > r} ∪ {(q1, r2) : (q1, r1) ∈ R(S), r2 ≤ r1 = r}

Recall that D0 = P(Q × [d]), so what R r does is to modify the set of pairs R(S) which is the value of R on the sequence of residuals S of appropriate type. The operation leaves unchanged all pairs (q1, r1) with r1 > r. For every pair (q1, r1) with r1 = r it adds pairs (q1, r2) for all r2 ≤ r. All pairs (q1, r1) of R(S) with r1 < r do not contribute to the result.

Example Let us take the residual R = {(q1, 1); (q2, 2); (q3, 3)} of type 0. We have that

R 1 = {(q1, 0); (q1, 1); (q2, 2); (q3, 3)},
R 2 = {(q2, 0); (q2, 1); (q2, 2); (q3, 3)},
R 3 = {(q3, 0); (q3, 1); (q3, 2); (q3, 3)}, and
R 4 = ∅.

If we take a residual R of type 0 → 0 that maps {(q1, 1)} to {(q2, 2); (q3, 3)} and {(q2, 1)} to {(q1, 1); (q3, 1)}, and all other residuals to ∅, then R 2 maps {(q1, 1)} to {(q2, 0); (q2, 1); (q2, 2); (q3, 3)} and all other residuals to ∅.

Lemma 17 For every residual R and ranks r1, r2: (R r1) r2 = R max(r1,r2).

If ρ is an environment then ρ r is an environment such that for every x: (ρ r)(x) = ρ(x) r.
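For residuals of type 0 the lifting operation is a simple set manipulation, sketched below. The function name `lift` and the parameter `min_rank` are assumptions of this illustration; the default `min_rank=0` is chosen only so that the output matches the example of type 0 above, which lists pairs of rank 0.

```python
def lift(R, r, min_rank=0):
    """Lifting of a type-0 residual R (a set of (state, rank) pairs) by rank r."""
    # Pairs of rank strictly above r are kept unchanged.
    kept = {(q, r1) for (q, r1) in R if r1 > r}
    # A pair of rank exactly r contributes all ranks up to r for its state.
    added = {(q, r2)
             for (q, r1) in R if r1 == r
             for r2 in range(min_rank, r + 1)}
    # Pairs of rank below r contribute nothing.
    return frozenset(kept | added)


R_EXAMPLE = frozenset({("q1", 1), ("q2", 2), ("q3", 3)})


def composition_law_holds(R, max_rank):
    """Check lifting by r1 and then r2 against lifting by max(r1, r2)."""
    return all(
        lift(lift(R, r1), r2) == lift(R, max(r1, r2))
        for r1 in range(max_rank + 1)
        for r2 in range(max_rank + 1)
    )
```

On this example, lifting twice agrees with lifting once by the larger rank for every pair of ranks up to 4, as Lemma 17 states in general.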

We have all ingredients to define the transitions of the game G(A, M ). Most of the rules are just reformulation of the rules in K(A, M ):

q : (λx.N, ρ, R · S) → q : (N, ρ[x ↦ R], S)

q : (a, ρ, R0R1) → (q0, q1) : (a, ρ, R0R1) for (q0, q1) ∈ δ(q, a)

q : (Y x.N, ρ, S) → q : (N, ρ, S)

q : (x, ρ, S) → q : (term(x), ρ, S) for x a recursion variable

We now proceed to the rule for application (cf. Figure 6). Consider q : (N K, ρ, S) with K of type τ = τ1 → · · · → τl → 0. We have a transition

q : (N K, ρ, S) → (q, R) : (N K, ρ, S)

for every residual R : Dτ1 → · · · → Dτl → D0. From this position we have transitions

(q, R) : (N K, ρ, S) → q : (N, ρ, R rk(q) · S)

(q, R) : (N K, ρ, S) → q′ : (K, ρ r′, R1 · · · Rl) for every R1 ∈ Dτ1, . . . , Rl ∈ Dτl and (q′, r′) ∈ R rk(q)(R1, . . . , Rl).

In the last line R rk(q) is needed to “normalize” the residual, so that it satisfies the invariant described below.

Figure 6: Dealing with application in G(A, M).

Since we are defining a game, we need to say who makes a choice in which vertices. Eve chooses a successor from vertices of the form q : (N K, ρ, S), and q : (a, ρ, R0R1). It means that she can choose a residual, and a transition of the automaton. This leaves for Adam the choices in nodes of the form (q, R) : (N K, ρ, S). So he decides whether to accept (by choosing a transition of the first type) or to contest residuals proposed by Eve.

Observe that we do not have a rule for nodes with a term being a λ-variable. Also positions of the form (q0, q1) : (a, ρ, R0R1) are terminal. This means that we need to say who is the winner in nodes of these two forms.

For the case of a variable, Eve wins in a position q : (x, ρ, S) with ρ(x) = Rx and S = R1 · · · Rk if (q, rk(q)) ∈ Rx(R1, . . . , Rk).

For the case of a constant, Eve wins in a position (q0, q1) : (a, ρ, R0R1) if (q0, rk(q0)) ∈ R0 rk(q0) and (q1, rk(q1)) ∈ R1 rk(q1). Observe that in this case both R0 and R1 are necessarily residuals of type 0.

Finally, we need to define ranks. It will be much simpler to define ranks on transitions instead of nodes. All the transitions will have rank 1 except for transitions of the form (q, R) : (N K, ρ, S) → q′ : (K, ρ r′, R1 · · · Rk), which have rank r′.

4.5 Equivalence of G(A, M ) and K(A, M )

We are now going to prove the central property relating G(A, M ) and K(A, M ).

Proposition 3 For every parity automaton A and concrete canonical term M, Eve wins in K(A, M) iff Eve wins in G(A, M).

The proof of this proposition proceeds as follows. For the direction from left to right we take a winning strategy for Eve in K(A, M) and define residuals for every closure with respect to this strategy. Then we show how Eve is winning in G(A, M) using these residuals. The winning strategy in G(A, M) will simulate the one in K(A, M). For the other direction we will calculate residuals with respect to Adam’s winning strategy in K(A, M) and use them to define Adam’s winning strategy in G(A, M). As parity games are determined, we obtain Proposition 3.

4.5.1 Residuals in K(A, M )

We here introduce the key notion of the proof, the notion of residuals of nodes. Given a subtree T of K(A, M), i.e. a tree obtained from K(A, M) by pruning some of its subtrees, we calculate the residuals RT(v) and resT(v, v′) for some nodes and pairs of nodes of T. In particular, T may be taken as being a strategy of Eve or a strategy of Adam. When T is clear from the context we will simply write R(v) and res(v, v′).

Recall that a node v in K(A, M) is an application node when its label is of the form q : (N K, ρ, S). We will assign a residual R(v) to every application node v. Thanks to typing, this can be done by induction on the order of type. We also define a variation of this notion: a residual R(v) seen from a node v′, denoted res(v, v′). The two notions are the main technical tools used in the proof of the theorem.

Before giving a formal definition we will describe the assignment of residuals to nodes in concrete terms. We will need one simple abbreviation. If v is an ancestor of v′ in T then we write max(v, v′) for the maximal rank appearing on the path between v and v′, including both ends.

Consider an application node v in T . It means that v has a label of the form q : (N K, ρ, S), and its unique successor has the label q : (N, ρ, (v, K, ρ)S). That is the closure (v, K, ρ) is created in v. We will look at all the places where this closure is used and summarize the information about them in R(v). We will do this by induction on the type of K.

First, suppose that the closure, or equivalently the term K, is of type 0. The residual R(v) is a subset of Q × [d]. We have two cases

• We put (qi, max(v, v″)) ∈ R(v) when there is v″ in T labelled qi : (K, ρ, ⊥) having a parent labelled (q0, q1) : (a, ρ′, C0C1) with Ci = (v, K, ρ); for i = 0 or i = 1.

For the induction step, suppose that K is of type τ1 → · · · → τk → 0 and that we have already calculated residuals for all closures of types τ1, . . . , τk. Suppose that we have a closure (v, K, ρ) created at a node v. This time R(v) : Dτ1 → · · · → Dτk → P(Q × [d]). Consider a node v′ using the closure. Its label has the form q′ : (x, ρ′, S′) for some x, ρ′ and S′ such that ρ′(x) = (v, K, ρ). The stack S′ has the form (v1, N1, ρ1) . . . (vk, Nk, ρk) with Ni of type τi. We put

(q′, max(v, v′)) ∈ R(v)(R(v1) max(v1,v′), . . . , R(vk) max(vk,v′)).

We now give a formal definition of R(v). By structural induction on types it is easy to see that such an assignment of residuals exists and is unique for T .

Definition 18 (R(v) and res(v, v1)) Given T a subtree of K(A, M), we define a residual R(v) for every application node v of T.

For more clarity we will write res(v, v1) for R(v) max(v,v1). For a closure (v, K, ρ) we define res((v, K, ρ), v′) = res(v, v′). We then extend this operation to stacks: res(S, v′) is S where res(·, v′) is applied to every element of the stack.

Let v be a node of T labelled by q : (N K, ρ, S) with K of type τ1 → · · · → τk → 0. The residual R(v) is a function Dτ1 → · · · → Dτk → D0 such that for every sequence of residuals R⃗ of appropriate types the set R(v)(R⃗) contains

• (q′, max(v, v′)) for every node v′ of T with the label of the form q′ : (x, ρ′, S′) for some x, ρ′, S′ such that ρ′(x) = (v, K, ρ), and res(S′, v′) = R⃗.

• (qi, max(v, v′)), for every node v′ labelled qi : (K, ρ, ⊥) having a parent labelled (q0, q1) : (a, ρ′, C0C1) with Ci = (v, K, ρ); for i = 0 or i = 1. Notice that this case applies only if K is of type 0.

4.5.2 Transferring Eve’s strategy in K(A, M ) to G(A, M )

The invariant We will use positions in the game K(A, M) and the strategy σ as hints. The strategy in G(A, M) will take a pair of positions (v1, v2) with v1 in G(A, M) and v2 in K(A, M). It will then give a new pair of positions (v′1, v′2) such that v′1 is a successor of v1, and v′2 is reachable from v2 using the strategy σ. Moreover, all visited pairs (v1, v2) will satisfy the following invariant: either

• v1 is labeled by q : (N, ρ1, S1) and v2 is labeled by q : (N, ρ2, S2), where ρ1 = res(ρ2, v2) and S1 = res(S2, v2), or

• v1 is labeled by (q0, q1) : (a, ρ1, S1) and v2 is labeled by (q0, q1) : (a, ρ2, S2) with ρ1 = res(ρ2, v2) and S1 = res(S2, v2).

The strategy The initial positions in both games have the same label q0 : (M, ∅, ⊥), so the invariant is satisfied. In order to define the strategy we will consider one by one the rules defining the transitions in G(A, M ).

The two cases where Eve needs to decide which successor to choose are the nodes with a constant or with an application. A node with a constant is of the form q : (a, ρ1, R0R1). Eve should then simply take the same transition of the automaton as taken from v2. So she advances to a node labelled (q0, q1) : (a, ρ1, R0R1). It is clear that the invariant is satisfied as the environment and the stack do not change. This implies moreover that no matter what Adam’s next move is, the new position also satisfies the invariant.

The strategy and its analysis in the case of an application node are more complicated. Suppose that the term in the label of v1 is an application, say q : (N K, ρ1, S1). By our invariant we have a position v2 labeled by q : (N K, ρ2, S2), where ρ1 = res(ρ2, v2) and S1 = res(S2, v2). To satisfy the invariant, the strategy in G(A, M) needs to choose R(v2), that is the residual assigned to v2. This means that from v1 the play proceeds to the node v′1 labeled (q, R(v2)) : (N K, ρ1, S1). From this node Adam can choose either

q : (N, ρ1, (R(v2) rk(q)) · S1), or (2)

q′ : (K, ρ1 r′, R1 . . . Rl) where (q′, r′) ∈ R(v2) rk(q)(R1, . . . , Rl). (3)

Suppose Adam chooses v″1 whose label is as in (2). By definition R(v2) rk(q) = res(v2, v2). Hence the stack (R(v2) rk(q)) · S1 is just res((v2, K, ρ2)S2, v2). The unique successor v′2 of v2 is labeled by q : (N, ρ2, (v2, K, ρ2)S2). So the pair (v″1, v′2) satisfies the invariant.

v′2 is labeled by q′ : (K, ρ2, S′2). We can take it as a companion for v″1 since ρ1 r′ = res(ρ2, v2) max(v2,v″2) = res(ρ2, v″2) by Lemma 17. Hence the pair (v″1, v″2) satisfies the invariant.

Figure 7: Adam chooses a node of the form (3)

The second possibility for (q′, r′) ∈ R(v2) rk(q) is when K has type 0, and there is a node v′2 labelled qi : (K, ρ, ⊥) having a parent labelled (q0, q1) : (a, ρ′, C0C1) with Ci = (v, K, ρ); for i = 0 or i = 1. By definition of R(v2) we have that r′ = max(v2, v′2). We can take v′2 as a companion of v″1 since ρ1 r′ = res(ρ2, v2) max(v2,v′2) = res(ρ2, v′2) by the invariant and Lemma 17. Hence the pair (v″1, v′2) satisfies the invariant.

The strategy is winning We need to show that the strategy defined above is winning. Consider a sequence of nodes (v_1^1, v_2^1), (v_1^2, v_2^2), . . . consistent with the strategy. Suppose that this sequence is infinite. By construction we have that v_2^1, v_2^2, . . . is a path in Kσ, hence a play winning for Eve. We have defined the strategy in such a way that the rank of a transition from v_1^i to v_1^{i+1} is the same as the maximal rank of a node on the path between v_2^i and v_2^{i+1}. Hence v_1^1, v_1^2, . . . is winning for Eve too.

It remains to check what happens when a maximal play is finite. This means that the path ends in a pair (v1, v2) where v1 is a variable node or a constant node.

A variable node is labeled by q : (x, ρ1, S1). To show that Eve wins here we need to prove that

(q, rk(q)) ∈ Rx(S1) where Rx = ρ1(x).

By the invariant we have that the companion node v2 is labeled by q : (x, ρ2, S2) and ρ1 = res(ρ2, v2), S1 = res(S2, v2). Suppose that ρ2(x) = (v, N, ρ). We have Rx = R(v) max(v,v2), since ρ1 = res(ρ2, v2). By definition of R(v) we get (q, max(v, v2)) ∈ R(v)(res(S2, v2)). Then from the definition of the lifting operation by max(v, v2): (q, max(v, v2)) ∈ R(v)(res(S2, v2)) max(v,v2). This implies that (q, rk(q)) ∈ R(v)(res(S2, v2)) max(v,v2) since rk(q) ≤ max(v, v2).

A constant node is labeled by (q0, q1) : (a, ρ1, R0R1). We need to show that (qi, rk(qi)) ∈ Ri rk(qi), for i = 0, 1. Let i = 0; the argument is the same for i = 1. By the invariant we have that the companion node v2 is labeled by (q0, q1) : (a, ρ2, C0C1). Suppose C0 is (v, N, ρ). So v2 has a successor v′2 labelled with q0 : (N, ρ, ⊥). We have that R0 = R(v) max(v,v2), since S1 = res(S2, v2) by the invariant. By definition of R(v) we get (q0, m) ∈ R(v), where m = max(v, v′2). From the definition of the lifting operation by m: (q0, m) ∈ R(v) m. This implies (q0, rk(q0)) ∈ R(v) m as rk(q0) ≤ m. Since m = max(max(v, v2), rk(q0)) we get R(v) m = (R(v) max(v,v2)) rk(q0) = R0 rk(q0), and we are done.

4.5.3 Transferring Adam’s strategy from K(A, M) to G(A, M)

We will show how to get a winning strategy for Adam in G(A, M) from his winning strategy in K(A, M). Once again we will use residuals. Let us fix a winning strategy θ of Adam in K(A, M), and consider the tree Kθ of plays respecting this strategy. This is a subtree of K(A, M). Consider the assignment of residuals to application nodes in Kθ as in Definition 18. We will define a strategy in G(A, M) that will preserve the invariant described below.

The invariant In order to formulate the invariant for the strategy we introduce a complementarity predicate Comp(R1, R2) between a pair of residuals:

• For R1, R2 ∈ D0 we put Comp(R1, R2) if R1 ∩ R2 = ∅.

• For R1, R2 ∈ Dτ where τ = τ1 → · · · → τk → 0 we put Comp(R1, R2) if for all sequences (R1,1, . . . , R1,k), (R2,1, . . . , R2,k) ∈ Dτ1 × · · · × Dτk such that Comp(R1,i, R2,i) for all i = 1, . . . , k we get R1(R1,1, . . . , R1,k) ∩ R2(R2,1, . . . , R2,k) = ∅.

Remark: The Comp predicate is a logical relation (see [AC98]), but we have preferred to formulate the definition in a form that will be more useful for proofs.

For two closures (v, N, ρ) and (v′, N, ρ′) we will say that the predicate Comp((v, N, ρ), (v′, N, ρ′)) holds if Comp(R(v), R(v′)) is true. For two environments ρ, ρ′ we write Comp(ρ, ρ′) if the two environments have the same domain and for every x, the predicate Comp(ρ(x), ρ′(x)) holds. Finally, Comp(S, S′) holds if the two sequences are of the same length and the predicate holds for every coordinate.

It is important to observe that Comp behaves well with respect to the lifting operation.

Lemma 19 If Comp(R1, R2) holds for two residuals R1, R2 of the same type then Comp(R1 r, R2 r) holds for every rank r.

Proof Take two sequences S1 and S2 of the correct type with respect to R1 and R2 and such that Comp(S1, S2). Since Comp(R1, R2), we have R1(S1) ∩ R2(S2) = ∅. Let us suppose that (q1, r1) is in R1 r(S1); then either r1 > r and (q1, r1) is in R1(S1), so that (q1, r1) is neither in R2(S2) nor in R2 r(S2); or r1 ≤ r and (q1, r) is in R1(S1), so that (q1, r) is not in R2(S2) and (q1, r1) is not in R2 r(S2). Similarly we get that whenever (q2, r2) is in R2 r(S2) it is not in R1 r(S1). Therefore R1 r(S1) ∩ R2 r(S2) = ∅. Since S1, S2 were arbitrary, we get Comp(R1 r, R2 r).
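At type 0 the predicate Comp is plain disjointness, so the interaction with lifting proved above can be checked exhaustively on a small instance. The sketch below does this by brute force; the names `comp0`, `subsets` and `lifting_preserves_comp` are hypothetical, and `lift` repeats the type-0 lifting with the assumption that lifted ranks start at 0.

```python
from itertools import combinations


def lift(R, r, min_rank=0):
    """Lifting of a type-0 residual by rank r (ranks start at min_rank here)."""
    kept = {(q, r1) for (q, r1) in R if r1 > r}
    added = {(q, r2)
             for (q, r1) in R if r1 == r
             for r2 in range(min_rank, r + 1)}
    return frozenset(kept | added)


def comp0(R1, R2):
    """Comp at type 0: the two residuals are disjoint."""
    return not (set(R1) & set(R2))


def subsets(items):
    """All subsets of items, as frozensets."""
    items = list(items)
    for n in range(len(items) + 1):
        for chosen in combinations(items, n):
            yield frozenset(chosen)


def lifting_preserves_comp(states, d):
    """Check: comp0(R1, R2) implies comp0 of the liftings, for ranks 1..d."""
    pairs = [(q, r) for q in states for r in range(1, d + 1)]
    all_residuals = list(subsets(pairs))
    return all(
        comp0(lift(R1, r), lift(R2, r))
        for R1 in all_residuals
        for R2 in all_residuals
        if comp0(R1, R2)
        for r in range(1, d + 1)
    )
```

With two states and two ranks this enumerates all 16 residuals of type 0 and every pair of them; the check covers only the base type and is not a substitute for the proof at higher types.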

As in the case for Eve, the strategy for Adam will take a pair of vertices (v1, v2) from G(A, M) and K(A, M), respectively. It will then consult the strategy θ for Adam in K(A, M) and calculate a new pair (v′1, v′2). All the pairs will satisfy the invariant:

v1 labeled by q : (N, ρ1, S1) and v2 labeled by q : (N, ρ2, S2); where Comp(ρ1, res(ρ2, v2)) and Comp(S1, res(S2, v2));

The strategy We define the strategy by considering one by one the rules for constructing the tree K(A, M ). The only case where Adam makes a choice is the application rule.

A node labeled q : (N K, ρ1, S1) is a node of Eve and it has successors labeled (q, R) : (N K, ρ1, S1) for every residual R of appropriate type. Suppose Eve chooses some R and in consequence such a node v′1. Then Adam has a choice between the children of v′1 that have labels of the form:

q : (N, ρ1, R rk(q) · S1) (4)

q′ : (K, ρ1 r′, R1 · · · Rl) for (q′, r′) ∈ R rk(q)(R1, . . . , Rl) (5)

At the same time the node v2 of K(A, M) is an application node so it has an assigned residual R(v2). We have two cases.

Suppose Comp(R rk(q), R(v2)) holds. In this case Adam chooses for v″1 the node labeled q : (N, ρ1, R rk(q) · S1). This works since the successor v′2 of v2 is labeled by q : (N, ρ2, (v2, K, ρ2)S2); hence the pair (v″1, v′2) satisfies the invariant.

The other case is when Comp(R rk(q), R(v2)) does not hold. This means that there are (R1,1, . . . , R1,l) and (R2,1, . . . , R2,l) such that Comp(R1,i, R2,i) for all i = 1, . . . , l and R rk(q)(R1,1, . . . , R1,l) ∩ R(v2)(R2,1, . . . , R2,l) ≠ ∅. Let (q′, r′) be an element of the intersection. Examining the definition of R(v2), Definition 18, there are two reasons why (q′, r′) ∈ R(v2)(R2,1, . . . , R2,l).

v″2 of v′2 labelled by q′ : (K, ρ2, S′2). So the new position becomes (v″1, v″2). We need to show that Comp(ρ1 r′, res(ρ2, v″2)) holds. For this take an arbitrary variable y for which ρ1(y) is defined. Since Comp(ρ1, res(ρ2, v2)) we have Comp(ρ1(y), res(ρ2(y), v2)). As r′ = max(v2, v′2) = max(v2, v″2) we have res(ρ2(y), v″2) = res(ρ2(y), v2) r′, and then by Lemma 19 we get the required Comp(ρ1(y) r′, res(ρ2(y), v2) r′). And of course, by hypothesis we have Comp(R1,i, R2,i) for all i = 1, . . . , l, so that the new configuration satisfies the invariant.

Figure 8: First case when (q′, r′) ∈ R rk(q)(R1,1, . . . , R1,l) ∩ R(v2)(R2,1, . . . , R2,l)

The second case (see Figure 9) is when K has type 0 and there is in K_θ a node v₂0 labelled qi : (K, ρ2, ⊥), with qi = q0, having a parent labelled (q0, q1) : (a, ρ0, C0C1) with Ci = (v, K, ρ2); for i = 0 or i = 1 (Figure 9 represents the case where i = 0); and r0 = max(v2, v20). For v001 take the node labelled q0 : (K, ρ1r0, ⊥), and for its companion take the node v0

2. We need to show that Comp(ρ1 r0, res(ρ₂, v₂0)) holds. For this take arbitrary

variable y for which ρ1(y) is defined. Since Comp(ρ1, res(ρ2, v2)) we have Comp(ρ1(y), res(ρ2(y), v2)). As r0 = max(v2, v02) we have res(ρ2(y), v02) = res(ρ2(y), v2) r0, and then by Lemma 19 we get the required Comp(ρ₁(y) _r0

, res(ρ2(y), v2) r0). v1 v0₁ v00₁ q : (aK, ρ1, S1) (q, R rk (q)) : (aK, ρ1, S1) q0: (K, ρ1r0, ⊥) r0= max(v2, v02) v2 v₂0 q : (N K, ρ2, S2) q : (N, ρ2, (v2, K, ρ2)S2) (q0, q1) : (a, ρ2, (v2, K, ρ2)C1) q0: (K, ρ2, ⊥) with q0= q0


The strategy is winning. As in the case of the strategy for Eve, it is easy to show that every infinite play is winning. It remains to check what happens if v₁ is a variable node or a constant node.

A variable node is labelled by q : (x, ρ₁, S₁). To show that Adam wins here we need to prove that

(q, rk(q)) ∉ R_x(S₁) where R_x = ρ₁(x).

By the invariant, the companion node v₂ is labelled by q : (x, ρ₂, S₂) and satisfies

Comp(R_x, res(ρ₂, v₂)(x)), Comp(S₁, res(S₂, v₂)).

Suppose ρ₂(x) = (v, N, ρ). Then (q, max(v, v₂)) ∈ R(v)(res(S₂, v₂)) by the definition of R(v) (Definition 18). Hence we also have

(q, max(v, v₂)) ∈ R(v)(res(S₂, v₂))↾max(v, v₂),

and in consequence

(q, rk(q)) ∈ R(v)(res(S₂, v₂))↾max(v, v₂).

As R(v)↾max(v, v₂) = res(ρ₂, v₂)(x) we get Comp(R_x, R(v)↾max(v, v₂)(x)) from the invariant. As Comp(S₁, res(S₂, v₂)) we can obtain (q, rk(q)) ∉ R_x(S₁) by the definition of Comp.

A constant node is labelled by (q₀, q₁) : (a, ρ₁, R₀R₁). To show that Adam wins here we need to prove that (q_i, rk(q_i)) ∉ R_i↾rk(q_i) for i = 0 or i = 1. By the invariant, the companion node v₂ is labelled by (q₀, q₁) : (a, ρ₂, C₀C₁) and satisfies Comp(R_i, res(C_i, v₂)) for i = 0, 1. The node v₂ must have a successor in K_θ. Let it be q₀ : (N, ρ, ⊥) where C₀ = (v, N, ρ). We will show that (q₀, rk(q₀)) ∉ R₀↾rk(q₀). We have Comp(R₀, res(C₀, v₂)), so also Comp(R₀↾rk(q₀), res(C₀, v₂)↾rk(q₀)) by Lemma 19. But then res(C₀, v₂)↾rk(q₀) is (R(v)↾max(v, v₂))↾rk(q₀). Let m = max(max(v, v₂), rk(q₀)). By the definition of R(v) (Definition 18), (q₀, m) ∈ R(v). Then (q₀, m) ∈ R(v)↾max(v, v₂) and (q₀, m) ∈ (R(v)↾max(v, v₂))↾rk(q₀), so that (q₀, rk(q₀)) ∈ (R(v)↾max(v, v₂))↾rk(q₀). By the definition of the Comp predicate, (q₀, rk(q₀)) ∉ R₀↾rk(q₀).

4.6 Transductions


In order to construct the formula ϕ̂ we will examine the definition of the game and use MSOL transductions. Positions of G(A, M) (cf. Section 4.4) are subterms of M together with some information of a bounded size: a state of A, an environment ρ assigning residuals to λ-variables, and a stack of residuals. We make it precise in the following lemma.

Lemma 20 There is a number, call it maxpos, depending only on Σ, T and X such that for every subterm N of a term M ∈ Terms(Σ, T , X ), the number of positions in G(A, M ) of the form q : (N, ρ, S) is bounded by maxpos.

Proof

First observe that the number of residuals of a given type is finite. A residual of type 0 is a subset of Q × [d] where Q is the set of states of A and d is the maximal value of the rank function of A. A residual of type α → β is a function from residuals of type α to residuals of type β. Recall that ρ is a function assigning residuals to free λ-variables of N. So the number of possible ρ is bounded, since λ-variables come from a fixed finite set X, and the type of the residual assigned to a λ-variable is determined by the type of the λ-variable. For the stack S, we know that the type of N determines the type of S in the sense that it determines the length of the sequence S and the type of each element of the sequence. Since the type of N belongs to the finite set of types T, the number of possible stacks depends on T and not on N.
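The counting argument of this proof can be made concrete. The following Python sketch is only an illustration under stated assumptions: the names `Base`, `Arrow`, `count_residuals` and `count_stacks` are hypothetical, [d] is taken to be {1, …, d}, and a stack for a term of type α₁ → ⋯ → α_l → 0 is assumed to consist of one residual for each argument type α_i.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Base:
    """The base type 0."""


@dataclass(frozen=True)
class Arrow:
    """The function type arg -> res."""
    arg: "Ty"
    res: "Ty"


Ty = Union[Base, Arrow]


def count_residuals(ty: Ty, num_states: int, max_rank: int) -> int:
    """Number of residuals of type `ty` for |Q| = num_states and d = max_rank."""
    if isinstance(ty, Base):
        # A residual of type 0 is a subset of Q x [d].
        return 2 ** (num_states * max_rank)
    # A residual of type arg -> res is a function between two finite sets.
    domain = count_residuals(ty.arg, num_states, max_rank)
    codomain = count_residuals(ty.res, num_states, max_rank)
    return codomain ** domain


def count_stacks(ty: Ty, num_states: int, max_rank: int) -> int:
    """Number of stacks for a term of type `ty` (assumed: one residual per argument)."""
    total = 1
    while isinstance(ty, Arrow):
        total *= count_residuals(ty.arg, num_states, max_rank)
        ty = ty.res
    return total
```

The numbers grow as a tower of exponentials in the order of the type, but they depend only on the type and on A, which is all that the lemma requires.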

We will argue that it is possible to define G(A, M) inside Graph(M) by means of formulas of MSOL. In other words, that for a fixed A, the mapping from M to G(A, M) is a monadic second-order transduction. Having done this, the desired formula ϕ̂ will come from the inverse image, determined by this transduction, of the formula defining games where Eve has a winning strategy.

First observe that the game G(A, M) can be represented with a structure over a signature depending only on A. To represent G(A, M), we need a predicate determining the transitions in the game, a predicate distinguishing the positions for Eve, and a predicate for every rank in order to encode the parity condition. So the signature depends only on ranks and these come from A. We will write G(A, M) ⊨ ψ to mean that ψ holds in the structure representing G(A, M). Recall that for every fixed set of ranks, there is an MSOL formula defining the set of parity games where Eve has a winning strategy. Let γ_win be such a formula for the set of ranks determined by A. We have:

G(A, M) ⊨ γ_win iff Eve wins in G(A, M) (6)
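For a finite game given explicitly, the right-hand side of (6) can be checked algorithmically. The sketch below does not evaluate an MSOL formula; it uses Zielonka's recursive algorithm instead, a standard procedure for finite parity games. It rests on assumptions that are not taken from the text: every position has at least one successor, player 0 is Eve and player 1 is Adam, and Eve wins a play when the maximal rank seen infinitely often is even. The names `attractor` and `solve` are hypothetical.

```python
def attractor(nodes, edges, owner, player, target):
    """Positions of the subgame `nodes` from which `player` can force a visit to `target`."""
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for v in nodes - attr:
            succs = [w for w in edges[v] if w in nodes]
            if owner[v] == player:
                ok = any(w in attr for w in succs)
            else:
                ok = all(w in attr for w in succs)
            if ok:
                attr.add(v)
                changed = True
    return attr


def solve(nodes, edges, owner, rank):
    """Return (winning region of Eve, winning region of Adam) for the subgame `nodes`."""
    if not nodes:
        return set(), set()
    top = max(rank[v] for v in nodes)
    player = top % 2            # assumed convention: an even top rank favours Eve (0)
    opponent = 1 - player
    top_nodes = {v for v in nodes if rank[v] == top}
    a = attractor(nodes, edges, owner, player, top_nodes)
    win = solve(nodes - a, edges, owner, rank)
    result = [set(), set()]
    if not win[opponent]:
        # The opponent wins nowhere in the rest, so `player` wins everywhere.
        result[player] = set(nodes)
        return tuple(result)
    b = attractor(nodes, edges, owner, opponent, win[opponent])
    win2 = solve(nodes - b, edges, owner, rank)
    result[player] = win2[player]
    result[opponent] = win2[opponent] | b
    return tuple(result)
```

Eve wins from a position exactly when that position belongs to the first component returned by `solve`. Plays of G(A, M) that end in a variable or constant node would first have to be encoded, for instance by adding a self-loop of suitable parity.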


natural number k and let [k] stand for {1, …, k}. Consider a structure M = ⟨V, E₁, E₂, P₁, …, P_l⟩; in our case it will be a graph of a term. A k-fold duplication of M is a structure M × [k] = ⟨V × [k], E₁, E₂, eq, P₁, …, P_l, C₁, …, C_k⟩ whose elements are pairs (v, i) ∈ V × [k]; we think of (v, i) as v in the i-th copy. The relations E₁, E₂ are as in M, that is, they hold between elements of the same copy: E₁((v, i), (v′, j)) iff E₁(v, v′) and i = j. The eq predicate says that two elements are copies of the same element: eq((v, i), (v′, j)) iff v = v′. Predicates P₁, …, P_l are as in M: P_i(v, j) iff P_i(v). Finally, predicate C_i holds for all elements of copy i: C_i(v, j) iff i = j.
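For a finite structure the duplication can be computed directly. The sketch below is a minimal illustration; the function name `duplicate` and the dictionary encoding of relations are assumptions made for the example, not notation from the paper.

```python
def duplicate(vertices, binary, unary, k):
    """Return the k-fold duplication of a finite structure.

    vertices: collection of elements of the structure
    binary:   dict mapping a relation name (e.g. "E1") to a set of pairs (v, w)
    unary:    dict mapping a predicate name (e.g. "P1") to a set of elements
    """
    vertices = list(vertices)
    copies = range(1, k + 1)
    new_vertices = {(v, i) for v in vertices for i in copies}
    # Edge relations hold only between elements of the same copy.
    new_binary = {
        name: {((v, i), (w, i)) for (v, w) in rel for i in copies}
        for name, rel in binary.items()
    }
    # eq relates any two copies of the same element.
    new_binary["eq"] = {
        ((v, i), (v, j)) for v in vertices for i in copies for j in copies
    }
    # Original predicates hold in every copy.
    new_unary = {
        name: {(v, i) for v in rel for i in copies}
        for name, rel in unary.items()
    }
    # C_i marks the elements of copy i.
    for i in copies:
        new_unary[f"C{i}"] = {(v, i) for v in vertices}
    return new_vertices, new_binary, new_unary
```

The duplicated structure has k times as many elements as the original one, and each relation is computed without inspecting anything beyond the original relations and the copy index.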

It is well-known [Cou94] that for every k and MSOL formula ψ there is a formula ψ/[k] such that for every structure M:

M × [k] ⊨ ψ iff M ⊨ ψ/[k] (7)

Let M be a term and maxpos the constant from Lemma 20. We will show how to define G(A, M) in Graph(M) × [maxpos]. A configuration of G(A, M) has the form q : (N, ρ, S). We will think of it as consisting of a term N and a context q : (·, ρ, S). To every possible context q : (·, ρ, S) we can associate a number from [maxpos]. So a node (v, i) in Graph(M) × [maxpos] is a configuration q : (N_v, ρ, S) where N_v is the term rooted in v and q : (·, ρ, S) is the context number i. We write C_{q:(·,ρ,S)} for C_i where i is the number of the context. With this convention one can directly express the transitions of the game G(A, M) with MSOL formulas over M × [k]. We give two examples. For all N the transitions

q : (λx.N, ρ, R · S) → q : (N, ρ[x ↦ R], S)

are defined by

next_λ(z, z′) ≡ C_{q:(·,ρ,R·S)}(z) ∧ P_{λx}(z) ∧ ∃z″. E₁(z, z″) ∧ eq(z″, z′) ∧ C_{q:(·,ρ[x↦R],S)}(z′)

The transitions on Y-variables are defined by

next(z, z′) ≡ C_{q:(·,ρ,S)}(z) ∧ P(z) ∧ ∃z″. E₁(z, z″) ∧ eq(z″, z′) ∧ C_{q:(·,∅,S)}(z′)

Observe that the formulas depend on A but not on M. These formulas define G(A, M) inside Graph(M) × [maxpos]. We obtain that for every formula ψ there is a formula ψ_int such that for all M:

G(A, M) ⊨ ψ iff Graph(M) × [k] ⊨ ψ_int. (8)
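To see how a transition formula picks out moves inside the duplicated graph, the following sketch computes the pairs (z, z′) related by the λ-transition on a finite term graph. The encoding is an assumption made for the example: contexts are triples (q, env, stack) listed in a fixed order, env is a frozenset of (variable, residual) pairs, stack is a tuple of residuals with the top element first, and residuals are arbitrary hashable values. The name `next_lambda` is hypothetical.

```python
def next_lambda(e1, lam_label, contexts):
    """Return the set of pairs ((v, i), (w, j)) related by the lambda-transition.

    e1:        set of pairs (v, w) of the relation E1 of Graph(M)
    lam_label: dict mapping each node labelled by an abstraction to its variable
    contexts:  list of contexts (q, env, stack); the context at position n
               (counting from 1) is the context number n
    """
    index = {c: i for i, c in enumerate(contexts, start=1)}
    moves = set()
    for (v, w) in e1:
        x = lam_label.get(v)
        if x is None:
            continue  # v is not labelled by an abstraction
        for (q, env, stack), i in index.items():
            if not stack:
                continue  # the transition pops a residual, so the stack must be non-empty
            top, rest = stack[0], stack[1:]
            new_env = dict(env)
            new_env[x] = top  # rho[x -> R]
            target = (q, frozenset(new_env.items()), rest)
            j = index.get(target)
            if j is not None:
                # z = (v, i) lies in copy i; z' = (w, j) is the copy j of the E1-successor.
                moves.add(((v, i), (w, j)))
    return moves
```

As in the formula, a move never leaves the E₁-successor of the current node; only the copy index changes, which is why the whole game fits inside Graph(M) × [maxpos].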

We now have all ingredients to prove the following lemma.

Lemma 21 For fixed A, T and X, there is a formula ϕ̂_A such that for every term M ∈ CTerms(Σ, T, X)