Linear high-order deterministic tree transducers with regular look-ahead

(1)

HAL Id: hal-02902853

https://hal.archives-ouvertes.fr/hal-02902853v2

Submitted on 18 Sep 2020

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

regular look-ahead

Paul Gallot, Aurélien Lemay, Sylvain Salvati

To cite this version:

Paul Gallot, Aurélien Lemay, Sylvain Salvati. Linear high-order deterministic tree transducers with regular look-ahead. MFCS 2020 : The 45th International Symposium on Mathematical Foundations of Computer Science, Andreas Feldmann; Michal Koucky; Anna Kotesovcova, Aug 2020, Prague, Czech Republic. �10.4230/LIPIcs.MFCS.2020.34�. �hal-02902853v2�

(2)

transducers with Regular look-ahead

2

Paul D. Gallot

3

INRIA,Université de Lille

4

[email protected]

5

Aurélien Lemay

6

Université de Lille, INRIA, CNRS

7

[email protected]

8

Sylvain Salvati

9

Université de Lille, INRIA, CNRS

10

[email protected]

11

Abstract

12

We introduce the notion of high-order deterministic top-down tree transducers (HODT) whose outputs

13

correspond to simply-typed lambda-calculus formulas. These transducers are natural generalizations

14

of known models of top-tree transducers such as: Deterministic Top-Down Tree Transducers, Macro

15

Tree Transducers, Streaming Tree Transducers. . . We focus on the linear restriction of high order

16

tree transducers with look-ahead (HODTRlin), and prove this corresponds to tree to tree functional

17

transformations defined by Monadic Second Order (MSO) logic. We give a specialized procedure for

18

the composition of those transducers that uses a flow analysis based on coherence spaces and allows

19

us to preserve the linearity of transducers. This procedure has a better complexity than classical

20

algorithms for composition of other equivalent tree transducers, but raises the order of transducers.

21

However, we also indicate that the order of a HODTRlincan always be bounded by3, and give a

22

procedure that reduces the order of a HODTRlin to3. As those resulting HODTRlin can then be

23

transformed into other equivalent models, this gives an important insight on composition algorithm

24

for other classes of transducers. Finally, we prove that those results partially translate to the case of

25

almost linear HODTR: the class corresponds to the class of tree transformations performed by MSO

26

with unfolding (not closed by composition), and provide a mechanism to reduce the order to3in

27

this case.

28

2012 ACM Subject Classification Theory of computation→Transducers; Theory of computation→

29

Lambda calculus; Theory of computation→Tree languages

30

Keywords and phrases Transducers,λ-calculus,Trees

31

Digital Object Identifier 10.4230/LIPIcs.MFCS.2020.34

32

Related Version A full version of the paper is available athttps://hal.archives-ouvertes.fr/

33

hal-02902853v1.

34

Funding Paul D. Gallot: ANR-15-CE25-0001 – Colis

35

Aurélien Lemay: ANR-15-CE25-0001 – Colis

36

Sylvain Salvati: ANR-15-CE25-0001 – Colis

37

licensed under Creative Commons License CC-BY

(3)

1 Introduction

38

Tree Transducers formalize transformations of structured data such as Abstract Syntax Trees,

39

XML, JSON, or even file systems. They are based on various mechanisms that traverse tree

40

structures while computing an output: Top-Down and Bottom-Up tree transducers [17, 4]

41

which are direct generalizations of deterministic word transducers [8, 7, 3], but also more

42

complex models such as macro tree transducers [11] (MTT) or streaming tree transducers [1]

43

(STT) to cite a few.

44

Logic offers another, more descriptive, view on tree transformations. In particular,

45

Monadic Second Order (MSO) logic defines a class of tree transformations (MSOT) [5, 6] which

46

is expressive and is closed under composition. It coincides with the class of transformations

47

definable with MTT enhanced with a regular look-ahead and restricted to finite copying

48

[9, 10], and also with the class of STT [1].

49

We argue here that simply typedλ-calculus gives a uniform generalisation of all these

50

different models. Indeed, they can all be considered as classes of programs that read input

51

tree structures, and, at each step, compose tree operations which in the end produce the

52

final output. Each of these tree operations can be represented using simply typedλ-terms.

53

In this paper, we define top-down tree transducers that follow the usual definitions of such

54

machines, except that rules can produceλ-terms of arbitrary types. We call these machines,

55

High-Order Top-down tree transducers, or High-Order Deterministic Tree Transducers

56

(HODT) in the deterministic case. This class of transducers naturally contains top-down

57

tree transducers, as they are HODT of order0(the output of rules are trees), but also MTT,

58

which are HODT of order1 (outputs are tree contexts). They also contain STT, which can

59

be translated directly into HODT of order3 with some restricted continuations. Also, STT

60

traverse their input tree represented as a string in a leftmost traversal (a stream). This

61

constraint could easily be adapted to our model but would yield technical complications that

62

are not the focus of this paper. Finally, our model generalizes High Level Tree Transducers

63

defined in [12], which also produceλ-term, but restricted to the safe λ-calculus case.

64

In this paper we focus on thelinear andalmost linear restrictions of HODT. In terms of

65

expressiveness, linear HODTR (HODTRlin) corresponds to the class of MSOT. This links

66

our formalism to other equivalent classes of transducers, such as finite-copying macro-tree

67

transducers [9, 10], with an important difference: the linearity restriction is a simple syntactic

68

restriction, whereas finite-copying or the equivalent single-use-restricted condition are both

69

global conditions that are harder to enforce. For STT, the linearity condition corresponds to

70

the copyless condition described in [1] and where the authors prove that any STT can be

71

made copyless.

72

The relationship of HODTR_lin to MSOT is made via a transformation thatreduces the

73

order of transducers. We indeed prove that for any HODTRlin, there exists an equivalent

74

HODTR_lin whose order is at most 3. This transformation allows us to prove then that

75

HODTRlin are equivalent to Attribute Tree Transducers with the single use restriction

76

(ATT_sur). In turn, this shows that HODTR_lin are equivalent to MSOT [2].

77

One of the main interests of HODTRlin is thatλ-calculus also offers a simple composition

78

algorithm. This approach gives an efficient procedure for composing two HODTR_lin. In

79

general, this procedure raises the order of the produced transducer. In comparison, com-

80

position in other equivalent classes are either complex or indirect (through MSOT). In any

81

case, our procedure has a better complexity. Indeed, it benefits from higher-order which

82

permits a larger number of implementations for a given transduction. The complexity of the

83

construction is also lowered by the use of a notion of determinism slightly more liberal than

84

(4)

usual that we callweak determinism.

85

The last two results allow us to obtain a composition algorithm for other equivalent

86

classes of tree transducer, such as MTT or STT: compile into HODTRlin, compose, reduce

87

the order, and compile back into the original model. The advantage of this approach over

88

the existing ones is that the complex composition procedure is decomposed into two simpler

89

steps (the back and forth translations between the formalisms are unsurprising technical

90

procedures). We believe in fact that existing approaches [12, 1] combine in one step the two

91

elements, which is what makes them more complex.

92

The property of order reduction also applies to a wider class of HODT,almost linear

93

HODT (HODTRal). Again here, this transformation allows us to prove that this class of

94

tree transformations is equivalent to that of Attribute Tree Transducers which is known to

95

be equivalent to MSO tree transformations with unfolding [2], i.e. MSO tree transduction

96

that produce Directed Acyclic Graphs (i.e. trees with shared sub-trees) that are unfolded to

97

produce a resulting tree. We call these transductions Monadic Second Order Transductions

98

with Sharing (MSOTS). Note however that HODTRal are not closed under composition.

99

Section 2 presents the technical definitions used throughout the paper. In particular, it

100

gives the definitions of the various notions of transducers studied in the paper and also the

101

notion of weak determinism. Section 3 studies the expressivity of linear and almost linear

102

higher-order transducer by relating them to MSOT and MSOTS. It focuses more specifically

103

on the order reduction procedure that is at the core of the technical work. Section 4 presents

104

the composition algorithm for linear higher-order transducers. This algorithm is based on

105

Girard’s coherence spaces and can be interpreted as a form of partial evaluation for linear

106

higher-order programs. Finally we conclude.

107

2 Definitions

108

This section presents the main formalisms we are going to use throughout the paper, namely

109

simply typedλ-calculus, finite state automata and high-order transducers.

110

2.1 λ-calculus

111

Fix a finite set of atomic typesA, we then define the set of types overA, types(A), as the

112

types that are either an atomic type, i.e. an element ofA, or a functional type(A→B), with

113

Aand B being intypes(A). The operator→is right-associative andA1→ · · · →An→B

114

denotes the type(A₁→(· · · →(A_n→B)· · ·)). The order of a typeAis inductively defined

115

byorder(A) = 0whenA∈ A, andorder(A→B) = max(order(A) + 1,order(B)).

116

A signatureΣis a triple(C,A, τ)withC being a finite set ofconstants, Aa finite set of

117

atomic types, andτ a mapping fromC totypes(A),the typing function.

118

We allow ourselves to writetypes(Σ)to refer to the settypes(A). The order of a signature

119

is the maximal order of a type assigned to a constant (i.e.max{order(τ(c))|c∈C}). In this

120

work, we mostly deal with tree signatures which are of order1and whose set of atomic types

121

is a singleton. In such a signature with atomic typeo, the types of constants are of the form

122

o→ · · · →o→o. We writeoⁿ →o for an order-1type which usesn+ 1occurrences ofo,

123

for example,o²→odenoteso→o→o. Whencis a constant of typeA, we may writec^A

124

to make explicit thatchas typeA. Two signaturesΣ1= (C1,A1, τ1)andΣ2= (C2,A2, τ2)

125

so that for everyc inC₁∩C₂we have τ₁(c) =τ₂(c)can be summed, and we writeΣ₁+ Σ₂

126

for the signature (C1∪C2,A1∪ A2, τ) so that ifc is in C1, τ(c) =τ1(c)and ifc is in C2,

127

τ(c) =τ2(c). The sum operation over signatures being associative and commutative, we

128

writeΣ1+· · ·+ Σn to denote the sum of several signatures.

129

(5)

We assume that for every typeA, there is an infinite countable set of variables of typeA.

130

When two types are different the set of variables of those types are of course disjoint. As

131

with constants, we may writex^A to make it clear that xis a variable of typeA.

132

When Σis a signature, we define the family of simply typedλ-terms over Σ, denoted

133

Λ(Σ) = (Λ^A(Σ))_{A∈types(Σ)}, as the smallest family indexed bytypes(Σ)so that:

134

ifcÂ is inΣ, thencÂis in ΛÂ(Σ),

135

x^Ais in Λ^A(Σ),

136

ifA=B→C andM is inΛ^C(Σ), then(λx^B.M)is inΛ^A(Σ),

137

ifM is inΛ^B→A(Σ)andN is inΛ^B(Σ), then (M N)is inΛ^A(Σ).

138

The termM is apureλ-term if it does not contain any constantc^A fromΣ. When the type

139

is irrelevant we writeM ∈Λ(Σ)instead ofM ∈Λ^A(Σ). We drop parentheses when it does

140

not bring ambiguity. In particular, we writeλx1. . . xn.M for (λx1(. . .(λxn.M). . .)), and

141

M0M1. . . Mn for((. . .(M0M1). . .)Mn).

142

The setfv(M)of free variables of a termM is inductively defined on the structure ofM:

143

fv(c) =∅,

144

fv(x) ={x},

145

fv(M N) = fv(M)∪fv(N),

146

fv(λx.M) = fv(M)− {x}.

147

Terms which have no free variables are calledclosed. We writeM[x1, . . . , xk]to emphasize that

148

fv(M)is included in{x1, . . . , x_k}. When doing so, we writeM[N₁, . . . , N_k] for the capture

149

avoiding substitution of variablesx1, . . . ,xk by the termsN1, . . . ,Nk. In other contexts,

150

we simply use the usual notation M[N₁/x₁, . . . , N_k/x_k]. Moreover given a substitutionθ,

151

we writeM.θ for the result of applying this (capture avoiding) substitution and we write

152

θ[N1/x1, . . . , Nk/xk]for the substitution that maps the variablesxi to the termsNi but is

153

otherwise equal toθ. Of course, we authorize such substitutions only when theλ-termNi

154

has the same type as the variablexi.

155

We take for granted the notions of β-contraction, noted →β, β-reduction, noted →^∗β,

156

β-conversion, noted=_β, andβ-normal form for terms.

157

Consider closed terms of typeothat are inβ-normal form and that are built on a tree

158

signature, they can only be of the forma t1. . . tn whereais a constant of typeoⁿ→o and

159

t₁, . . . , t_n are closed terms of typeo inβ-normal form. This is just another notation for

160

ranked trees. So when the typeois meant to represent trees, types of order 1which have

161

the formo→ · · · → o→o represent functions from trees to trees, or more precisely tree

162

contexts. Types of order2are types of trees parametrized by contexts. The notion of order

163

captures the complexity of the operations that terms of a certain type describe.

164

A termM is saidlinear if each variable (either bound or free) inM occurs exactly once

165

inM. A termM is saidsyntactically almost linear when each variable inM of non-atomic

166

type occurs exactly once inM. Note that, throughβ-reduction, linearity is preserved but

167

not syntactic almost linearity.

168

For example, given a tree signatureΣ₁with one atomic typeoand two constantsf of type

169

o²→oandaof typeo, the termM = (λy1y2.f y1(f a y2))a(f x a)with free variablexof type

170

ois linear because each variable (y₁,y₂andx) occurs exactly once inM. The termMcontains

171

aβ-redex so: (λy1y2.f y1(f a y2))a(f x a)→β (λy2.f a(f a y2)) (f x a)→β f a(f a(f x a)).

172

The termf a(f a(f x a))has noβ-redex so it is the β-normal form ofM.

173

Another example: the termM₂ = (λy.f y y) (x a) with free variablexof typeo→o is

174

syntactically almost linear because the variableywhich occurs twice in the term is of the

175

atomic typeo. Itβ-reduces to the termM₂⁰ =f(x a) (x a)which is not syntactically almost

176

linear, soβ-reduction does not preserve syntactical almost linearity.

177

(6)

We call a term almost linear when it is β-convertible to a syntactically almost linear

178

term. Almost linear terms are characterized also by typing properties (see [15]).

179

2.2 Tree Automata

180

We present here the classical definition of deterministic bottom-up tree automaton (BOT)

181

adapted to our formalism. A BOTAis a tuple(ΣP,Σ, R)where:

182

Σ = (C,{o}, τ)is a first-order tree signature, theinput signature,

183

Σ_P = (P,{o}, τP) is the state signature, and is such that for everyp∈ P, τ_P(p) = o.

184

Constants of P are called states,

185

R is a finite set of rules of the forma p₁. . . p_n →pwhere:

186

p,p1, . . . , pn are states of P,

187

ais a constant ofΣwith typeoⁿ→o.

188

An automaton is saiddeterministic when there is at most one rule inRfor each possible

189

left hand side. It is non-deterministic otherwise.

190

Apart from the notation, our definition differs from the classical one by the fact there are no

191

final states, and hence, the automaton does not describe a language. This is due to the fact

192

that BOT will be used here purely for look-ahead purposes.

193

2.3 High-Order Deterministic top-down tree Transducers

194

From now on we assume thatΣi is a tree signature for every numberiand that its atomic

195

type iso_i.

196

A Linear High-Order Deterministic top-down Transducer with Regular look-ahead

197

(HODTRlin)T is a tuple(ΣQ,Σ1,Σ2, q0, R,A)where:

198

Σ1= (C1,{o1}, τ1)is a first-order tree signature, theinput signature,

199

Σ2= (C2,{o2}, τ2)is a first-order tree signature, theoutput signature,

200

ΣQ = (Q,{o1, o2}, τs)is thestate signature, and is such that for everyq∈Q, τs(q)is of

201

the form o1→Aq whereAq is in types(Σ2). Constants ofQare calledstates,

202

q₀∈Qis theinitial state,

203

Ais a BOT over the tree signatureΣ1, thelook-ahead automaton, with set of statesP,

204

R is a finite set of rules of the form

205

q(a−→x)h−→pi →M(q₁x₁). . .(q_nx_n)

206

207

where:

208

q, q1, . . . , qn∈Qare states ofΣQ,

209

ais a constant ofΣ1 with typeoⁿ₁ →o1,

210

−

→x =x1, . . . , xn are variables of typeo1, they are the child trees of the root labeleda,

211

−

→p =p1, . . . , pn are in P (the set of states of the look-aheadA),

212

M is a linear term of typeAq1→ · · · →Aqn→Aq built on signatureΣ2+ ΣQ.

213

there is one rule per possible left-hand side (determinism).

214

Notice that we have given states a type of the formo1→AwhereA∈types(o2). The

215

reason why we do this is to have a uniform notation. Indeed, a stateqis meant to transform,

216

thanks to the rules in R, a tree built in Σ₁ into a λ-term built on Σ₂ with type A_q. So

217

we simply writeq M N1. . . Nn when we want to transformM with the state q and pass

218

N₁,. . . ,N_n as arguments to the result of the transformation. We writeΣ_T for the signature

219

Σ1+ Σ2+ ΣQ. Notice also that the right-hand part of a rule is a term that is built only

220

with constants of Σ2, states from ΣQ and variables of type o1. Thus, in order for this

221

term to have a type intypes(Σ2), it is necessary that the variables of typeo1 only occur as

222

(7)

the first argument of a state inΣQ. Finally, remark that we did not put any requirement

223

on the type of the initial state. So as to restrict our attention to transducers as they are

224

usually understood, it suffices to add the requirement that the initial state is of typeo₁→o₂.

225

However, we consider as well that transducers may produceprograms instead of first order

226

terms.

227

The linearity constraint on M affects both bound variables and the free variables

228

x1, . . . , xn, meaning that all of the subtrees x1, . . . , xn are used in computing the out-

229

put. That will be important for the composition of two transducers because if the first

230

transducer fails in a branch of its input tree then the second transducer, applied to that tree,

231

must fail too. This restriction forcing the use of input subtrees does not reduce the model’s

232

expressivity because we can always add a stateqwhich visits the subtree but only produces

233

the identity function on typeo₂(this state then has typeA_q =o₁→o₂→o₂).

234

Almost linear high-order deterministic top-down transducer with regular look-ahead

235

(HODTR_al) are defined similarly, with the distinction that a termM appearing as a right-

236

hand side of a rule should be almost linear.

237

As we are concerned with the size of the composition of transducers, we wish to re-

238

lax a bit the notion of HODTR_lin. Indeed, when composing HODTR_lin we may have to

239

determinize the look-ahead so as to obtain a HODTRlin, which may cause an exponen-

240

tial blow-up of the look-ahead. However if we keep the look-ahead non-deterministic, the

241

transducer stays deterministic in the weaker sense that only one rule of the transducer

242

can apply when it is actually run. For this we adopt a slightly relaxed notion of determ-

243

inistic transducer that we call high-order weakly deterministic top-down transducer with

244

regular look-ahead (HOWDTRlin). They are similar to HODTRlin but they can havenon-

245

deterministic automata as look-ahead with the proviso that whenq(a x₁. . . x_n)hp1, . . . , p_ni →

246

M[x1, . . . , xn]andq(a x1. . . xn)hp⁰₁, . . . , p⁰_ni → M⁰[x1, . . . , xn] are two distinct rules of the

247

transducer then it must be the case that for someithere is no tree that is recognized by

248

bothpi andp⁰_i. This property guarantees that when transforming a term at most one rule

249

can apply for every possible state. Notice that it suffices to determinize the look-ahead so as

250

to obtain a HODTRlin from a HOWDTRlin, and therefore the two models are equivalent.

251

Given a HODTRlin, a HODTRal or a HOWDTRlin T, we write T :: Σ1−→Σ2 to mean

252

that the input signature ofT isΣ₁ and its output signature isΣ₂.

253

A transducerT induces a notion of reduction on terms. AT-redex is a term of the form

254

q(a M₁. . . M_n) if and only if q(a x₁. . . x_n)hp₁, . . . , p_ni →M[x₁, . . . , x_n] is a rule of T and

255

(theβ-normal forms of)M1, . . . ,Mnare respectively accepted byAwith the statesp1, . . . ,pn.

256

In that case, aT-contractum ofq(a M1. . . Mn)isM[M1, . . . , Mn]. Notice thatT-contracta

257

are typed terms and that they have the same type as their correspondingT-redices. The

258

relation ofT-contraction relates a termM and a termM⁰ whenM⁰ is obtained from M

259

by replacing one of itsT-redex with a correspondingT-contractum. We writeM →T M⁰

260

whenM T-contracts toM⁰. The relation ofβ-reduction is confluent, and so is the relation

261

ofT-reduction as transducers are deterministic, moreover, the union of the two relations is

262

terminating. It is not hard to prove that it is also locally confluent and thus confluent. It

263

follows that→β,T (which is the union of→β and→T) is confluent and strongly normalizing.

264

Given a termM built on Σ_T, we write|M|T to denote its normal form modulo=_β,T.

265

Then we writerel(T)for the relation:

266

{(M,|q0M|T)|M is a closed term of typeo1and|q0M|T ∈Λ(Σ2)} .

267

Notice that when|q₀M|_T contains some states ofT, as it is usual, the pair(M,|q₀M|_T)

268

is not in the relation.

269

Given a finite set of treesL1on Σ1 andL2 included inΛ^A^q⁰, we respectively writeT(L1)

270

andT⁻¹(L2)for the image ofL1 byT and the inverse image of L2byT.

271

(8)

We give an example of a HODTRlin T that computes the result of additions of numeric

272

expressions (numbers being represented in unary notation). For this we use an input tree

273

signature with typeo₁, and constantsZô¹,Sô¹ andaddô¹^→o¹^→o¹ which respectively denote

274

zero, the successor function and addition. The output signature is similar but different to

275

avoid confusion: it uses the typeo₂ and constantsO^o²,N^o²^→o² which respectively denote

276

zero and successor.

277

We do not really need the look-ahead automaton for this computation, so we omit it for

278

this example. We could have a blank look-ahead automatonAwith one state l and rules:

279

A(Z) =l,A(S l) =l,A(addl l) =l; which would not change the result of the transducer.

280

The transducer has two states: q₀ of type o₁ → o₂ (the initial state), and q_i of type

281

o1→o2→o2. The rules of the transducer are the following:

282

q0(Z)→O,q0(S x)→N(qix O),

283

q₀(addx y)→q_ix(q_iy O),

284

qi(Z)→λx.x,

285

qi(S x)→λy.N(qix y),

286

q_i(addx y)→λz.q_ix(q_iy z),

287

As an example, we perform the transduction of the following termadd(S(S Z))(S(S(S Z))):

288

q0(add(S(S Z))(S(S(S Z)))) →T (qi(S(S Z)))(qi(S(S(S Z)))O)

→∗T (λy₁.N((λy₂.N((λx.x)y₂))y₁))((λy₃.N((λy₄.N((λy₅.N((λx.x)y₅))y₄))y₃))O)

→∗β N(N(N(N(N O))))

289

The stateqi transforms a sequence ofnsymbolsS into aλ-term of the formλx.Nⁿ(x),

290

and theaddmaps both its children into such terms and composes them. The stateq₀ simply

291

appliesO to the resulting term.

292

Note that our reduction strategy here has consisted in first computing the T-redices

293

and then reducing the β-redices. This makes the computation simpler to present. As we

294

mentioned above a head-reduction strategy would lead to the same result.

295

The order of the HODTRlin T ismax{order(Aq)|q∈Q}. Before going further, we want

296

to discuss how our framework relates to other transduction models. More specifically how

297

the notion of order of transformations generalizes the DTOP and MTT transduction models:

298

if we relax the constraint of linearity of our transducers, then DTOP and MTT can be

299

seen as non-linear transducers of order0and 1respectively. In contrast of these, we chose

300

to study the constraint of linearity instead of the constraint of order and, in this paper,

301

we will explore the benefits of this approach. Firstly we will explain why increasing the

302

order beyond order3does not increase the expressivity of neither HODTR_lin nor HODTR_al.

303

Next we will show how HODTRlin and HOWDTRlin both capture the expressivity of tree

304

transformations defined by monadic second order logic. Lastly, we will prove that, contrary

305

to MTT, the class of HODTRlintransformations is closed under composition, we will give an

306

algorithm for computing the composition of HODTR_lin and HOWDTR_lin, and explain why

307

using HOWDTRlin avoids an exponential blow-up in the size of the composition transducer.

308

3 Order reduction and expressiveness

309

In this section we outline a construction that transforms a transducer of HODTR_lin or

310

HODTRal into an equivalent linear or almost linear transducer of order ≤3. These two

311

constructions are similar and central to proving that HODTR_lin and HODTR_al are respect-

312

ively equivalent to Monadic Second Order Transductions from trees to trees (MSOT) and to

313

Monadic Second Order Transductions from trees to terms (i.e. trees with sharing) (MSOTS).

314

We will later show that there are translations between HODTRlinof order3and attribute tree

315

(9)

transducers with thesingle use restriction and between HODTRal of order3 and attribute

316

tree transducers. These two models are known to be respectively equivalent to MSOT and

317

MSOTS [2].

318

The central idea in the construction consists in decomposingλ-termsM into pairshM⁰, σi

319

whereM⁰ is a pureλ-term andσis a substitution of variables with the following properties:

320

M =_βM⁰.σ,

321

the free variables ofM⁰ have at most order 1,

322

for every variablex,σ(x)is a closedλ-term,

323

the number of free variables inM⁰ is minimal.

324

In such a decomposition, we call the termM⁰ atemplate. In caseM is of typeA, linear or

325

almost linear, it can be proven thatM⁰ can be taken from a finite set [14]. The linear case is

326

rather simple, but the almost linear case requires some precaution as one needs first to put

327

M in syntactically almost linear form and then make the decomposition. Though the almost

328

linear case is more technical the finiteness argument is the same in both cases and is based

329

on proof theoretical arguments in multiplicative linear logic which involves polarities in a

330

straightforward way.

331

The linear case conveys the intuition of decompositions in a clear manner. One takes

332

the normal form ofM and then delineates the largest contexts ofM, i.e. first order terms

333

that are made only with constants and that are as large as possible. These contexts are

334

then replaced by variables and the substitutionσ is built accordingly. The fact that the

335

contexts are chosen as large as possible makes it so that no introduced variable can have

336

as argument a term of the formx M1. . . Mn wherexis another variable introduced in the

337

process. Therefore, the new variables introduced in the process bring one negative atom

338

and several (possibly 0) positive ones and all of them need to be matched with positive and

339

negative atoms in the type ofM as, under these conditions, they cannot be matched together.

340

This explains why there are only finitely many possible templates for a fixed type.

341

ITheorem 1. For all typeAbuilt on tree signature Σ, the set of templates of closed linear

342

(or almost linear) terms of typeAis finite.

343

Moreover, the templates associated with aλ-term can be computed compositionally (i.e.

344

from the templates of its parts). As a result, templates can be computed by the look-ahead

345

of HODTRlin or of HODTRal. When reducing the order, we enrich the look-ahead with

346

template information while the substitution that is needed to reconstruct the produced term

347

is outputted by the new transducer. The substitution is then performed by the initial state

348

used at the root of the input tree which then outputs the same result as the former transducer.

349

The substitution can be seen as a tuple of order 1 terms. It is represented as a tuple using

350

Church encoding, i.e. a continuation. This makes the transducer we construct be of order 3.

351

I Theorem 2. Any HODTRlin (resp. HODTRal) has an equivalent HODTRlin (resp.

352

HODTR_al) of order 3.

353

The proof of this result shows that every HODTR_lin (or HODTR_al) can be seen as mapping

354

trees to tuples of contexts and combining these contexts in a linear (resp. almost linear)

355

way. This understanding of HODTR_lin and of HODTR_al allows us to prove that they are

356

respectively equivalent to Attribute Tree Transducers with Single Use Restriction (ATTsur);

357

and to Attribute Tree Transducers (ATT). Then, using [2], we can conclude with the following

358

expressivity result:

359

ITheorem 3. HODTRlinare equivalent to MSOT and HODTRalare equivalent to MSOTS.

360