Maximal bifix decoding

(1)

Maximal bifix decoding

1

Val´ erie Berth´ e

¹

, Clelia De Felice

²

, Francesco Dolce

³

, Julien Leroy

⁴

, Dominique Perrin

³

, Christophe Reutenauer

⁵

, Giuseppina Rindone

³

1

CNRS, Universit´ e Paris 7,

²

Universit` a degli Studi di Salerno,

3

Universit´ e Paris Est, LIGM,

⁴

Universit´ e du Luxembourg,

5

Universit´ e du Qu´ ebec ` a Montr´ eal

2

September 21, 2014 22 h 39

3

Abstract

4

We consider a class of sets of words which is a natural common gen-

5

eralization of Sturmian sets and of interval exchange sets. This class of

6

sets consists of the uniformly recurrent tree sets, where the tree sets are

7

defined by a condition on the possible extensions of bispecial factors. We

8

prove that this class is closed under maximal bifix decoding. The proof

9

uses the fact that the class is also closed under decoding with respect to

10

return words.

11

1 Introduction

36

This paper studies the properties of a common generalization of Sturmian sets

37

and regular interval exchange sets. We first give some elements on the back-

38

ground of these two families of sets.

39

Sturmian words are infinite words over a binary alphabet that have exactly

40

n+ 1 factors of length n for each n ≥ 0. Their origin can be traced back

41

to the astronomer J. Bernoulli III. Their first in-depth study is by Morse and

42

Hedlund [24]. Many combinatorial properties were described in the paper by

43

Coven and Hedlund [11].

44

We understand here by Sturmian words the generalization to arbitrary al-

45

phabets, often called strict episturmian words or Arnoux-Rauzy words (see the

46

survey [20]), of the classical Sturmian words on two letters. A Sturmian sets is

47

the set of factors of one Sturmian word. For more details, see [19, 23].

48

Sturmian words are closely related to the free group. This connection is

49

one of the main points of the series of papers [2, 4, 5] and the present one. A

50

striking feature of this connection is the fact that our results do not hold only

51

for two-letter alphabets or for two generators but for any number of letters and

52

generators.

53

Interval exchange transformations were introduced by Oseledec [25] following

54

an earlier idea of Arnold [1]. These transformations form a generalization of

55

rotations of the circle. The class of regular interval exchange transformations

56

was introduced by Keane [22] who showed that they are minimal in the sense

57

of topological dynamics. The set of factors of the natural codings of a regular

58

interval exchange transformation is called an interval exchange set.

59

Even though they have the same factor complexity (that is, the same number

60

of factors of a given length), Sturmian words and codings of interval exchange

61

transformations have a priori very distinct combinatorial behaviours, whether

62

for the type of behaviour of their special factors, or for balance properties and

63

deviations of Birkhoff sums (see [9, 27]).

64

The class of tree sets, introduced in [4] contains both the Sturmian sets

65

and the regular interval exchange sets. They are defined by a condition on the

66

possible extensions of bispecial factors.

67

(3)

In a paper with part of the present list of authors on bifix codes and Sturmian

68

words [2] we proved that Sturmian sets satisfy the finite index basis property,

69

in the sense that, given a setS of words on an alphabet A, a finite bifix code

70

is S-maximal if and only if it is the basis of a subgroup of finite index of the

71

free group onA. The main statement of [5] is that uniformly recurrent tree sets

72

satisfy the finite index basis property. This generalizes the result concerning

73

Sturmian words of [2] quoted above. As an example of a consequence of this

74

result, if S is a uniformly recurrent tree set on the alphabet A, then for any

75

n≥1, the setS∩Aⁿ is a basis of the subgroup formed by the words of length

76

multiple ofn(see Theorem theoremGroupCode

5.9).

77

Our main result here is that the class of uniformly recurrent tree sets is

78

closed under maximal bifix decoding (Theorem theoremNormal

7.1). This means that ifS is a

79

uniformly recurrent tree set and f a coding morphism for a finiteS-maximal

80

bifix code, thenf⁻¹(S) is a uniformly recurrent tree set. The family of regular

81

interval exchange sets is closed under maximal bifix decoding (see [5] Corollary

82

5.22) but the family of Sturmian sets is not (see Example exampleTribonacci2

7.2 below). Thus,

83

this result shows that the family of uniformly recurrent tree sets is the natural

84

closure of the family of Sturmian sets. The proof uses the finite index basis

85

property of uniformly recurrent tree sets.

86

The proof of Theorem theoremNormal

7.1 uses the closure of uniformly recurrent tree sets

87

under decoding with respect to return words (Theorem propositionReturns

5.12). This property,

88

which is interesting in its own, generalizes the fact that the derived word of a

89

Sturmian word is Sturmian [21].

90

The paper is organized as follows. In Section sectionPreliminaries

2, we introduce the notation

91

and recall some basic results. We define the composition of prefix codes.

92

In Section sectionIntervalExchange

3, we introduce one important subclass of tree sets, namely in-

93

terval exchange sets. We recall the definitions concerning minimal and regular

94

interval exchange transformations. We state the result of Keane expressing that

95

regular interval exchange transformations are minimal (TheoremtheoremKeane

3.4). We prove

96

in [?] that the class of regular interval exchange sets is closed under maximal

97

bifix decoding.

98

In Section sectionReturn

4, we define return words, derived words and derived sets and

99

prove some elementary properties.

100

In SectionsectionTreeNormal

5, we recall the definition of tree sets. We also recall that a regular

101

interval exchange set is a tree set (PropositionpropositionExchangeTreeCondition

5.4). We prove that the family of

102

uniformly recurrent tree sets is closed under derivation (TheorempropositionReturns

5.12). We fur-

103

ther prove that all bases of the free group included in a uniformly recurrent tree

104

set are tame, that is obtained from the alphabet by composition of elementary

105

positive automorphisms (TheoremtheoremTame

5.18).

106

In Section sectionSadic

6, we turn to the notion of H-adic representation of sets, intro-

107

duced in [17], using a terminology initiated by Vershik and coined out by B. Host

108

(it is usually calledS-adic). We deduce from the previous result that uniformly

109

recurrent tree sets have a primitiveHe-adic representation (Theorem6.5) where^{base tame}

110

Heis the finite set of positive elementary automorphisms of the free group.

111

In SectionsectionBifixDecoding

7, we state and prove our main result (Theorem theoremNormal

7.1), namely the

112

closure under maximal bifix decoding of the family of uniformly recurrent tree

113

(4)

sets.

114

Finally, in Section sectionComposition

7.3, we use TheoremtheoremNormal

7.1 to prove a result concerning the

115

composition of bifix codes (TheoremtheoremCompositionBifix

7.12) showing that the degrees of the terms

116

of a composition are multiplicative.

117

2 Preliminaries

118

sectionPreliminaries

In this section, we recall some notions and definitions concerning words, codes

119

and automata. For a more detailed presentation, see [2]. We also introduce the

120

notion of composition of codes.

121

2.1 Words

122

subsectionWords

LetAbe a finite nonempty alphabet. All words considered below, unless stated

123

explicitly, are supposed to be on the alphabetA. We let A^∗ denote the set of

124

all finite words over A and A⁺ the set of finite nonempty words over A. The

125

empty word is denoted by 1 or byε. We let|w|denote the length of a wordw.

126

For a setX of words and a wordx, we denote

127

x⁻¹X ={y∈A^∗|xy∈X}, Xx⁻¹={z∈A^∗|zx∈X}.

A word v is a factor of a word x if x = uvw. A set of words is said to be

128

factorial if it contains the factors of its elements. Let S be a set of words on

129

the alphabetA. Forw∈S, we denote

130

L(w) = {a∈A|aw∈S} R(w) = {a∈A|wa∈S}

E(w) = {(a, b)∈A×A|awb∈S} and further

131

ℓ(w) = Card(L(w)), r(w) = Card(R(w)), e(w) = Card(E(w)).

These notions depend uponS but it is assumed from the context. A word w

132

is right-extendable if r(w) >0, left-extendable ifℓ(w) > 0 andbiextendable if

133

e(w)>0. A factorial setSis calledright-extendable(resp. left-extendable, resp.

134

biextendable) if every word inSis right-extendable (resp. left-extendable, resp.

135

biextendable).

136

A wordwis calledright-special ifr(w)≥2. It is calledleft-special ifℓ(w)≥

137

2. It is calledbispecial if it is both right and left-special.

138

We let Fac(x) denote the set of factors of an infinite wordx∈A^N. The set

139

Fac(x) is factorial and right-extendable. An infinite wordx∈A^ωisrecurrent if

140

for anyu∈Fac(x) there is a wordv such thatuvu∈Fac(x).

141

A factorial set of words S 6={1} is recurrent if for everyu, w∈ S there is

142

a wordv ∈ S such that uvw∈ S. For any recurrent setS there is an infinite

143

wordxsuch that Fac(x) =S.

144

(5)

For any infinite wordx, the set Fac(x) is recurrent if and only ifxis recurrent

145

(see [2]).

146

Note that any recurrent set not reduced to the empty word is biextendable.

147

A set of words S is said to be uniformly recurrent if it is right-extendable

148

and if, for any wordu∈S, there exists an integern≥1 such that uis a factor

149

of every word ofS of lengthn. A uniformly recurrent set is recurrent.

150

A morphism f from A^∗ to B^∗ is a monoid morphism fromA^∗ into B^∗. If

151

a∈Ais such that the word f(a) begins with aand if |fⁿ(a)| tends to infinity

152

withn, there is a unique infinite word denotedf^ω(a) which has all wordsfⁿ(a)

153

as prefixes. It is called afixed point of the morphismf.

154

A morphismf :A^∗→A^∗is calledprimitive if there is an integerksuch that

155

for alla, b∈A, the letterbappears inf^k(a). Iff is a primitive morphism, the

156

set of factors of any fixed point off is uniformly recurrent (see [19, Proposition

157

1.2.3] for example).

158

An infinite word isepisturmianif the set of its factors is closed under reversal

159

and contains for eachnat most one word of lengthnwhich is right-special. It is

160

astrict episturmian word if it has exactly one right-special word of each length

161

and moreover each right-special factoruis such thatr(u) = Card(A).

162

ASturmian set is a set of words which is the set of factors of a strict epistur-

163

mian word. Any Sturmian set is uniformly recurrent (see [2, Proposition 2.3.3]

164

for example).

165

exampleFibonacci¹⁶⁶ Example 2.1 Let A = {a, b}. The Fibonacci word is the fixed point x = abaababa . . .of the morphismf :A^∗ →A^∗ defined byf(a) =ab andf(b) =a.

167

It is a Sturmian word (see [23]). The set Fac(x) of factors ofxis theFibonacci

168

set.

169

exampleTribonacci¹⁷⁰ Example 2.2 Let A ={a, b, c}. The Tribonacci word is the fixed pointx = f^ω(a) = abacaba· · · of the morphism f : A^∗ → A^∗ defined by f(a) = ab,

171

f(b) =ac,f(c) = a. It is a strict episturmian word (see [21]). The set Fac(x)

172

of factors ofxis theTribonacci set.

173

2.2 Bifix codes

174

Recall that a set X ⊂A⁺ of nonempty words over an alphabetA is a code if

175

the relation

176

x1· · ·xn=y1· · ·ym

with n, m≥ 1 andx1, . . . , xn, y1, . . . , ym ∈ X implies n =m and xi =yi for

177

i= 1, . . . , n. For the general theory of codes, see [3].

178

Aprefix code is a set of nonempty words which does not contain any proper

179

prefix of its elements. A prefix code is a code.

180

A suffix code is defined symmetrically. Abifix code is a set which is both a

181

prefix code and a suffix code.

182

A coding morphism for a codeX ⊂A⁺ is a morphismf :B^∗ →A^∗ which

183

maps bijectivelyB ontoX.

184

(6)

Let S be a set of words. A prefix code X ⊂ S is S-maximal if it is not

185

properly contained in any prefix code Y ⊂ S¹. Equivalently, a prefix code

186

X ⊂S is S-maximal if any word inS is comparable for the prefix order with

187

some word ofX.

188

A set of wordsM is called right unitary ifu, uv ∈M imply v ∈ M. The

189

submonoidM generated by a prefix code is right unitary. One can show that

190

conversely, any right unitary submonoid of A^∗ is generated by a prefix code

191

(see [3]). The symmetric notion of aleft unitary set is defined by the condition

192

v, uv∈M impliesu∈M.

193

We denote by X^∗ the submonoid generated by X. A set X ⊂ S is right

194

S-complete if every word ofS is a prefix of a word in X^∗. If S is factorial, a

195

prefix code is S-maximal if and only if it is right S-complete [2, Proposition

196

3.3.2].

197

Similarly a bifix codeX ⊂S isS-maximal if it is not properly contained in

198

a bifix codeY ⊂S. For a recurrent setS, a finite bifix code isS-maximal as a

199

bifix code if and only if it is anS-maximal prefix code [2, Theorem 4.2.2]. For

200

a uniformly recurrent setS, any finite bifix codeX ⊂S is contained in a finite

201

S-maximal bifix code [2, Theorem 4.4.3].

202

A parse of a wordw∈A^∗ with respect to a setX is a triple (v, x, u) such

203

thatw=vxuwherevhas no suffix inX,uhas no prefix inX andx∈X^∗. We

204

denote bydX(w) the number of parses of w.

205

LetX be a bifix code. The number of parses of a wordwis also equal to the

206

number of suffixes ofw which have no prefix inX and the number of prefixes

207

ofwwhich have no suffix inX [3, Proposition 6.1.6].

208

By definition, theS-degree of a bifix codeX, denoteddX(S), is the maximal

209

number of parses of all words inSwith respect toX. It can be finite or infinite.

210

The set of internal factors of a set of wordsX, denoted I(X) is the set of

211

wordswsuch that there exist nonempty wordsu, vwithuwv∈X.

212

Let S be a recurrent set and letX be a finiteS-maximal bifix code of S-

213

degreed. A word w∈S is such thatdX(w)< dif and only if it is an internal

214

factor ofX, that is

215

I(X) ={w∈S|dX(w)< d} (2.1) eqInternal (Theorem 4.2.8 in [2]). Thus any word of X of maximal length has d parses.

216

This implies that theS-degreedis finite.

217

Example 2.3 LetS be a recurrent set. For any integern≥1, the setS∩Aⁿ

218

is anS-maximal bifix code ofS-degreen.

219

Thekernel of a set of wordsX is the set of words inX which are internal

220

factors of words inX. We denote byK(X) the kernel ofX. Note thatK(X) =

221

I(X)∩X.

222

For any recurrent setS, a finiteS-maximal bifix code is determined by its

223

S-degree and its kernel (see [2, Theorem 4.3.11]).

224

1Note that in this paper we use⊂to denote the inclusion allowing equality.

(7)

exampleDegree1²²⁵ Example 2.4 Let S be a recurrent set containing the alphabet A. The only S-maximal bifix code of S-degree 1 is the alphabetA. This is clear since Ais

226

the uniqueS-maximal bifix code ofS-degree 1 with empty kernel.

227

2.3 Group codes

228

subsectionautomata

We letA= (Q, i, T) denote a deterministic automaton withQas set of states,

229

i ∈ Q as initial state and T ⊂ Q as set of terminal states. For p ∈ Q and

230

w∈A^∗, we denote p·w=q if there is a path labeledw frompto the state q

231

andp·w=∅otherwise (for a general introduction to automata theory, see [16]

232

for example).

233

The setrecognized by the automaton is the set of wordsw∈A^∗ such that

234

i·w∈T. A set of words is rational if it is recognized by a finite automaton.

235

Two automata areequivalent if they recognize the same set.

236

All automata considered in this paper are deterministic and we simply call

237

them ‘automata’ to mean ‘deterministic automata’.

238

The automatonAistrim if for anyq∈Q, there is a path fromitoqand a

239

path fromqto somet∈T.

240

An automaton is calledsimple if it is trim and if it has a unique terminal

241

state which coincides with the initial state.

242

An automatonA= (Q, i, T) iscomplete if for any statep∈Qand any letter

243

a∈A, one hasp·a6=∅.

244

For a nonempty setL⊂A^∗, we denote byA(L) theminimal automaton of

245

L. The states of A(L) are the nonempty sets u⁻¹L ={v ∈ A^∗ | uv ∈ L} for

246

u∈ A^∗ (see Section subsectionWords

2.1 for the notation u⁻¹L). For u ∈A^∗ and a∈ A, one

247

defines (u⁻¹L)·a = (ua)⁻¹L. The initial state is the set L and the terminal

248

states are the setsu⁻¹Lforu∈L.

249

LetX ⊂A^∗be a prefix code. Then there is a simple automatonA= (Q,1,1)

250

that recognizesX^∗. Moreover, the minimal automaton ofX^∗ is simple.

251

Example 2.5 The automaton A = (Q,1,1) represented in Figure figureExampleAutomaton

2.1 is the

252

minimal automaton ofX^∗ withX={aa, ab, ac, ba, ca}. We haveQ={1,2,3},

3 1 2

a a, b, c b, c

a

Figure 2.1: The minimal automaton of {aa, ab, ac, ba, ca}^∗. figureExampleAutomaton

253

i= 1 andT = 1. The initial state is indicated by an incoming arrow and the

254

terminal one by an outgoing arrow.

255

Let X be a prefix code and let P be the set of proper prefixes of X. The

256

literal automaton ofX^∗ is the simple automatonA= (P,1,1) with transitions

257

(8)

defined forp∈P anda∈Aby

258

p·a=







pa ifpa∈P, 1 ifpa∈X,

∅ otherwise.

One verifies that this automaton recognizesX^∗.

259

An automatonA= (Q,1,1) is agroup automaton if for anya∈Athe map

260

ϕA(a) :p7→p·ais a permutation ofQ.

261

The following result is proved in [2, Proposition 6.1.5].

262

propositionGroupAutomaton²⁶³ Proposition 2.6 The following conditions are equivalent for a submonoid M ofA^∗.

264

(i) M is recognized by a group automaton withdstates.

265

(ii) M =ϕ⁻¹(K), whereK is a subgroup of index dof a group Gandϕis a

266

surjective morphism from A^∗ ontoG.

267

(iii) M =H∩A^∗, where H is a subgroup of index dof the free group on A.

268

If one of these conditions holds, the minimal generating set ofM is a maximal

269

bifix code of degreed.

270

A bifix code Z such that Z^∗ satisfies one of the equivalent conditions of

271

PropositionpropositionGroupAutomaton

2.6 is called agroup code of degreed.

272

2.4 Composition of codes

273

We introduce the notion of composition of codes (see [3] for a more detailed

274

presentation).

275

For a set X ⊂ A^∗, we denote by alph(X) the set of letters a ∈ A which

276

appear in the words ofX.

277

LetZ ⊂A^∗ andY ⊂B^∗ be two finite codes with B = alph(Y). Then the

278

codesY andZ arecomposable if there is a bijection fromB ontoZ. SinceZ is

279

a code, this bijection defines an injective morphismf from B^∗ intoA^∗. If f is

280

such a morphism, thenY andZ are called composablethroughf. The set

281

X =f(Y)⊂Z^∗⊂A^∗ (2.2) eq1.6.1

is obtained bycomposition ofY andZ (by means off). We denote it by

282

X=Y ◦^fZ ,

or byX =Y ◦Z when the context permits it. Sincef is injective, X and Y

283

are related by bijection, and in particular Card(X) = Card(Y). The words in

284

X are obtained just by replacing, in the words of Y, each letter bby the word

285

f(b)∈Z.

286

Example 2.7 LetA={a, b}andB ={u, v, w}. Letf :B^∗→A^∗ be the mor-

287

phism defined byf(u) =aa,f(v) =ab andf(w) =ba. LetY ={u, vu, vv, w}

288

and Z = {aa, ab, ba}. Then Y, Z are composable through f and Y ◦^f Z =

289

{aa, abaa, abab, ba}.

290

(9)

IfY andZ are two composable codes, thenX =Y ◦Z is a code [3, Proposition

291

2.6.1] and ifY andZ are prefix (suffix) codes, thenX is a prefix (suffix) code.

292

Conversely, ifX is a prefix (suffix) code, then Y is a prefix (suffix) code.

293

We extend the notation alph as follows. For two codesX, Z⊂A^∗ we denote

294

alph_Z(X) ={z∈Z| ∃u, v∈Z^∗, uzv∈X}. The following is Proposition 2.6.6 in [3].

295

prop266²⁹⁶ Proposition 2.8 Let X, Z ⊂ A^∗ be codes. There exists a code Y such that X=Y ◦Z if and only ifX ⊂Z^∗ andalph_Z(X) =Z.

297

The following statement generalizes Propositions 2.6.4 and 2.6.12 of [3] for

298

prefix codes.

299

propositionMaxPref³⁰⁰ Proposition 2.9 Let Y, Z be finite prefix codes composable through f and let X=Y ◦^fZ.

301

(i) For any set Gsuch that Y ⊂G andY is aG-maximal prefix code, X is

302

anf(G)-maximal prefix code.

303

(ii) For any setS such thatX, Z⊂S, ifX is anS-maximal prefix code,Y is

304

an f⁻¹(S)-maximal prefix code andZ is an S-maximal prefix code. The

305

converse is true if S is recurrent.

306

Proof. (i) Letw∈f(G) and setw=f(v) with v∈G. Since Y isG-maximal,

307

there is a wordy ∈Y which is prefix-comparable with v. Then f(y) is prefix-

308

comparable withw. ThusX isf(G)-maximal.

309

(ii) SinceX is an S-maximal prefix code, any word in S is prefix comparable

310

with some element of X and thus with some element of Z. Therefore, Z is

311

S-maximal. Next ifu∈f⁻¹(S),v=f(u) is in Sand is prefix-comparable with

312

a word x in X. Assume that v = xt. Then t is in Z^∗ sincev, x ∈ Z^∗. Set

313

w=f⁻¹(t) andy=f⁻¹(x). Sinceu=yw,uis prefix-comparable withywhich

314

is inY. The other case is similar.

315

Conversely, assume that S is recurrent. Let w be a word in S of length

316

strictly larger than the sum of the maximal length of the words ofX and Z.

317

SinceS is recurrent, the setZ is rightS-complete, and consequently the word

318

wis a prefix of a word in Z^∗. Thusw=upwithu∈Z^∗ andpa proper prefix

319

of a word inZ. The hypothesis onwimplies thatuis longer than any word of

320

X. Let v =f⁻¹(u). Sinceu∈S, we havev ∈f⁻¹(S). It is not possible that

321

v is a proper prefix of a word ofY since otherwiseu would be shorter than a

322

word ofX. Thusv has a prefix inY. Consequentlyu, and thusw, has a prefix

323

inX. ThusX isS-maximal.

324

Note that the converse of (ii) is not true if the hypothesis thatSis recurrent is

325

replaced by factorial. Indeed, forS ={1, a, b, aa, ab, ba},Z ={a, ba},f⁻¹(S) =

326

{1, u, uu, v},Y ={uu, v},f(u) =aandf(v) =ba, one hasX ={aa, ba}which

327

is not anS-maximal prefix code.

328

(10)

Note also that whenSis recurrent (or even uniformly recurrent),G=f⁻¹(S)

329

need not be recurrent. Indeed, letSbe the set of factors of (ab)^∗, letB ={u, v}

330

and let f : B^∗ →A^∗ be defined by f(u) =ab, f(v) =ba. Then G= u^∗∪v^∗

331

which is not recurrent.

332

3 Interval exchange sets

333

sectionIntervalExchange

In this section, we recall the definition and the basic properties of interval ex-

334

change transformations.

335

3.1 Interval exchange transformations

336

Let us recall the definition of an interval exchange transformation (see [10]

337

or [7]).

338

A semi-interval is a nonempty subset of the real line of the form [α, β) =

339

{z ∈R|α≤z < β}. Thus it is a left-closed and right-open interval. For two

340

semi-intervals ∆,Γ, we denote ∆<Γ ifx < yfor anyx∈∆ andy∈Γ.

341

Let (A, <) be an ordered set. A partition (Ia)a∈A of [0,1) in semi-intervals

342

isordered ifa < bimpliesIa < Ib.

343

LetAbe a finite set ordered by two total orders<1and<2. Let (Ia)a∈Abe

344

a partition of [0,1) in semi-intervals ordered for<1. Letλa be the length ofIa.

345

Letµa=P

b≤1aλbandνa=P

b≤2aλb. Setαa=νa−µa. Theinterval exchange

346

transformation relative to (Ia)a∈Ais the mapT : [0,1)→[0,1) defined by

347

T(z) =z+αa ifz∈Ia.

Observe that the restriction of T to Ia is a translation onto Ja =T(Ia), that

348

µa is the right boundary of Ia and that νa is the right boundary of Ja. We

349

additionally denote by γa the left boundary ofIa and by δa the left boundary

350

ofJa. Thus

351

Ia= [γa, µa), Ja= [δa, νa).

Since a <2 b implies Ja <2 Jb, the family (Ja)a∈A is a partition of [0,1)

352

ordered for <2. In particular, the transformation T defines a bijection from

353

[0,1) onto itself.

354

An interval exchange transformation relative to (Ia)a∈A is also said to be

355

on the alphabetA. The values (αa)a∈Aare called the translation values of the

356

transformationT.

357

exampleRotation³⁵⁸ Example 3.1 LetRbe the interval exchange transformation corresponding to A={a, b},a <1b,b <2a,Ia = [0,1−α),Ib = [1−α,1). The transformation

359

R is the rotation of angle αon the semi-interval [0,1) defined by R(z) =z+

360

αmod 1.

361

Since<1and<2are total orders, there exists a unique permutationπofAsuch

362

that a <1 b if and only if π(a) <2 π(b). Conversely, <2 is determined by <1

363

(11)

and π, and <1 is determined by <2 and π. The permutationπ is said to be

364

associated withT.

365

Let s ≥2 be an integer. If we set A = {a1, a2, . . . , as} with a1 <1 a2 <1

366

· · ·<1as, the pair (λ, π) formed by the familyλ= (λa)a∈Aand the permutation

367

πdetermines the mapT. We will also denoteT asTλ,π. The transformationT

368

is also said to be ans-interval exchange transformation.

369

It is easy to verify that the family ofs-interval exchange transformations is

370

closed by composition and by taking inverses.

371

Example 3.2 A 3-interval exchange transformation is represented in Figurefigure3interval

3.1.

372

One hasA={a, b, c}witha <1b <1candb <2c <2a. The associated permu-

373

tation is the cycleπ= (abc). µa µb µc

νb νc νa

Figure 3.1: A 3-interval exchange transformation figure3interval

374

3.2 Regular interval exchange transformations

375

Theorbit of a pointz∈[0,1) is the set{Tⁿ(z)|n∈Z}. The transformationT

376

is said to beminimal if for anyz∈[0,1), the orbit ofzis dense in [0,1).

377

Set A = {a1, a2, . . . , as} with a1 <1 a2 <1 . . . <1 as, µi = µai and δi =

378

δai. The points 0, µ1, . . . , µs−1 form the set ofseparation points ofT, denoted

379

Sep(T).

380

An interval exchange transformation Tλ,π is called regular if the orbits of

381

the nonzero separation pointsµ1, . . . , µs−1 are infinite and disjoint. Note that

382

the orbit of 0 cannot be disjoint of the others since one hasT(µi) = 0 for some

383

iwith 1≤i≤s.

384

There are several equivalent terms used instead of regular. A regular interval

385

exchange transformation is also said to satisfy theidoc condition (where idoc

386

stands for “infinite disjoint orbit condition”). It is also said to have the Keane

387

property or to be withoutconnection (see [8]).

388

Example 3.3 The 2-interval exchange transformationRof ExampleexampleRotation

3.1 which

389

is the rotation of angleαis regular if and only ifαis irrational.

390

The following result is due to Keane [22].

391

theoremKeane³⁹² Theorem 3.4 A regular interval exchange transformation is minimal.

The converse is not true. Indeed, consider the rotation of angle α with α

393

irrational, as a 3-interval exchange transformation with λ= (1−2α, α, α)and

394

(12)

π= (132). The transformation is minimal as any rotation of irrational angle

395

but it is not regular sinceµ1= 1−2α, µ2= 1−αand thusµ2=T(µ1).

396

3.3 Natural coding

397

LetT be an interval exchange transformation relative to (Ia)a∈A. For a given

398

real numberz∈[0,1), thenatural coding ofT relative toz is the infinite word

399

ΣT(z) =a0a1· · · on the alphabetAdefined by

400

an=a if Tⁿ(z)∈Ia. exampleFiboNatCoding Example 3.5 Letα= (3−√

5)/2 and letRbe the rotation of angleαon [0,1)

401

as in Example exampleRotation

3.1. The natural coding ofR with respect toαis the Fibonacci

402

word (see [23, Chapter 2] for example).

403

For a wordw=b0b1· · ·bm−1, letIw be the set

404

Iw=Ib⁰∩T⁻¹(Ib¹)∩ · · · ∩T^−m+1(Ibm−1). (3.1) ^eqIu Note that eachIw is a semi-interval. Indeed, this is true ifwis a letter. Next,

405

assume thatIwis a semi-interval. Then for anya∈A,T(Iaw) =T(Ia)∩Iwis a

406

semi-interval sinceT(Ia) is a semi-interval by definition of an interval exchange

407

transformation. SinceIaw⊂Ia,T(Iaw) is a translate ofIaw, which is therefore

408

also a semi-interval. This proves the property by induction on the length.

409

Then one has for anyn≥0

410

anan+1· · ·an+m−1=w⇐⇒Tⁿ(z)∈Iw (3.2) eqIw If T is minimal, one has w ∈ Fac(ΣT(z)) if and only ifIw 6= ∅. Thus the

411

set Fac(ΣT(z)) does not depend onz (as for Sturmian words, see [23]). Since it

412

depends only onT, we denote it by Fac(T). WhenT is regular (resp. minimal),

413

such a set is called a regular interval exchange set (resp. a minimal interval

414

exchange set).

415

Let T be an interval exchange transformation. The natural codings ΣT(z)

416

of T with z ∈ [0,1) are infinite words on A. The set A^ω of infinite words on

417

Ais a topological space for the topology induced by the metric defined by the

418

following distance. For x= a0a1· · ·, y =b0b1· · · ∈ A^ω with x 6=y, one sets

419

d(x, y) = 2^−n(x,y)ifn(x, y) is the leastnsuch thatan 6=bn. LetXbe the closure

420

in the spaceA^ωof the set of all ΣT(z) forz∈[0,1) and letσbe the shift onX.

421

The pair (X, σ) is asymbolic dynamical system, formed of a topological space

422

X and a continuous transformationσ. Such a system is said to be minimal if

423

the only closed subsets invariant byσare ∅or X. It is well-known that (X, σ)

424

is minimal if and only if the set Fac(X) of factors of the x ∈X is uniformly

425

recurrent (see for example [23] Theorem 1.5.9).

426

We have the commutative diagram of FigurecommutativeDiagram

3.2.

427

The map ΣT is neither continuous nor surjective. This can be corrected by

428

embedding the interval [0,1) into a larger space on whichT is a homeomophism

429

(13)

[0,1) [0,1)

X X

T

ΣT

σ

ΣT

Figure 3.2: A commutative diagram. commutativeDiagram

(see [22] or [7] page 349). However, if the transformation T is minimal, the

430

symbolic dynamical system (X, σ) is minimal (see [7] page 392). Thus, we

431

obtain the following statement.

432

propositionRegularUR⁴³³ Proposition 3.6 For any minimal interval exchange transformationT, the set Fac(T)is uniformly recurrent.

434

exampleDivision Example 3.7 Set α = (3−√

5)/2 and A = {a, b, c}. Let T be the interval

435

exchange transformation on [0,1) which is the rotation of angle 2αmod 1 on

436

the three intervals Ia = [0,1−2α), Ib = [1−2α,1−α), Ic = [1−α,1) (see

437

Figure figure3interval2

3.3). The transformationT is regular since αis irrational. The words

0 1−2α 1−α 1

a b c

0 α 2α 1

b c a

Figure 3.3: A regular 3-interval exchange transformation. figure3interval2

438

of length at most 5 of the setS = Fac(T) are represented in Figure3.4. Since^figureSetF

a b

c c a b b

c bc

c a ab b

b b bc

c c

a c

a ab b b bc c c ab c a

Figure 3.4: The words of length≤5 of the setS. ^figureSetF

439

T =R², whereRis the transformation of ExampleexampleFiboNatCoding

3.5, the natural coding ofT

440

(14)

relative toαis the infinite wordy =γ⁻¹(x) wherexis the Fibonacci word and

441

γis the morphism defined byγ(a) =aa,γ(b) =ab,γ(c) =ba. One has

442

y=baccbaccbbacbbacbbacc· · · (3.3) Eqy Actually, the wordyis the fixed-pointg^ω(b) of the morphismg:a7→baccb, b7→

443

bacc, c 7→ bacb. This follows from the fact that the cube of the Fibonacci

444

morphism f : a 7→ ab, b 7→ a sends each letter on a word of odd length and

445

thus preserves the set of words of even length.

446

4 Return words

447

sectionReturn

In this section, we introduce the notion of return and first return words. We

448

prove elementary results about return words which extendablely already appear

449

in [12].

450

LetS be a set of words. Forw∈S, let ΓS(w) ={x∈S |wx∈S∩A⁺w}be

451

the set ofright return words towand letR^S(w) = ΓS(w)\ΓS(w)A⁺ be the set

452

offirst right return words tow. By definition, the setR^S(w) is, for anyw∈S,

453

a prefix code. IfS is recurrent, it is aw⁻¹S-maximal prefix code.

454

Similarly, forw∈S, we denote Γ^′_S(w) ={x∈S|xw∈S∩wA⁺} the set of

455

left return wordstowandR^′S(w) = Γ_S^′(w)\A⁺Γ^′_S(w) the set offirst left return

456

words tow. By definition, the set R^′S(w) is, for anyw∈S, a suffix code. IfS

457

is recurrent, it is anSw⁻¹-maximal suffix code. The relation betweenR^S(w)

458

andR^′S(w) is simply

459

wRS(w) =R^′S(w)w . (4.1) ^eqAutomo

Letf :B^∗→A^∗is a coding morphism forR^S(w). The morphismf^′ :B^∗→A^∗

460

defined forb∈Bbyf^′(b)w=wf(b) is a coding morphism forR^′S(w) called the

461

coding morphismassociated withf.

462

Example 4.1 LetS be the uniformly recurrent set of Example exampleDivision

3.7. We have

463

R^S(a) = {cbba, ccba, ccbba}, R^S(b) = {acb, accb, b}, RS(c) = {bac, bbac, c}.

These sets can be read from the word y given in Equation (3.3). A coding^Eqy

464

morphismf :B^∗ →A^∗ withB =A for the setR^S(c) is given by f(a) =bac,

465

f(b) =bbac,f(c) =c.

466

Note that ΓS(w)∪ {1}is right unitary and that

467

ΓS(w)∪ {1}=R^S(w)^∗∩w⁻¹S. (4.2) eqGamma1 Indeed, if x ∈ ΓS(w) is not in R^S(w), we have x= zu with z ∈ ΓS(w) and

468

u nonempty. Since ΓS(w) is right unitary, we have u ∈ ΓS(w), whence the

469

conclusion by induction on the length ofx. The converse inclusion is obvious.

470

(15)

propReturnsFinite⁴⁷¹ Proposition 4.2 A recurrent setS is uniformly recurrent if and only if the set R^S(w)is finite for all w∈S.

472

Proof. Assume that all sets R^S(w) forw∈S are finite. Letn≥1. Let N be

473

the maximal length of the words inRS(w) for a word wof lengthn, then any

474

word of lengthN+ 2n−1 contains an occurrence ofw. Conversely, for w∈S,

475

letN be such thatwis a factor of any word inS of lengthN. Then the words

476

ofRS(w) have length at most|w|+N−1.

477

LetSbe a recurrent set and letw∈S. Letf be a coding morphism forR^S(w).

478

The setf⁻¹(w⁻¹S), denotedDf(S), is called thederived set ofS with respect

479

tof. Note that iff^′ is the coding morphism forR^′S(w) associated withf, then

480

Df(S) =f^′−1(Sw⁻¹).

481

The following result gives an equivalent definition of the derived set.

482

propositionRecurrent⁴⁸³ Proposition 4.3 Let S be a recurrent set. For w∈S, let f be a coding morphism for the setRS(w). Then

484

Df(S) =f⁻¹(ΓS(w))∪ {1}. (4.3) eqMagique Proof. Letz∈Df(S). Thenf(z)∈w⁻¹S∩RS(w)^∗and thusf(z)∈ΓS(w)∪{1}.

485

Conversely, ifx∈ΓS(w), thenx∈ R^S(w)^∗by Equation (4.2) and thus^eqGamma1 x=f(z)

486

for somez∈Df(S).

487

Let S be a recurrent set andx be an infinite word such thatS = Fac(x).

488

Letw∈Sand let f be a coding morphism for the setR^S(w). Sincewappears

489

infinitely often inx, there is a unique factorization x=vwz withz ∈ R^S(w)^ω

490

andvsuch thatvwhas no proper prefix ending withw. The infinite wordf⁻¹(z)

491

is called the derived word of xrelative to f. Iff^′ is the coding morphism for

492

R^′S(w) associated withf, we havef⁻¹(z) =f^′−1(wz) and thusf, f^′ define the

493

same derived word.

494

The following well-known result (for a proof, see [6] for example), shows in

495

particular that the derived set of a recurrent set is recurrent.

496

propositionDerived⁴⁹⁷ Proposition 4.4 LetS be a recurrent set and letxbe a recurrent infinite word such that S = Fac(x). Let w ∈S and let f be a coding morphism for the set

498

RS(w). The derived set ofS with respect tof is the set of factors of the derived

499

word ofxwith respect tof, that is Df(S) = Fac(Df(x)).

500

Example 4.5 LetSbe the uniformly recurrent set of ExampleexampleDivision

3.7. Letf be the

501

coding morphism for the setR^S(c) given by f(a) =bac,f(b) =bbac,f(c) =c.

502

Then the derived set ofS with respect to f is represented in FigurefigureDerived

4.1.

503

(16)

a b

c c bc

ab ab bc

c c b

Figure 4.1: The words of length≤3 of the derived set ofS. figureDerived

5 Uniformly recurrent tree sets

504

sectionTreeNormal

In this section, we recall the notion of tree set introduced in [?]. We recall that

505

the factor complexity of a tree set onk+ 1 letters ispn=kn+ 1.

506

We recall a result concerning the decoding of tree sets (Theorem InverseImageTree

5.7). We

507

also recall the finite index basis property of uniformly recurrent tree sets (The-

508

orems ??^theoremSIBand theoremGroupCode

5.9) that we will use in Section sectionBifixDecoding

7. We prove that the family of

509

uniformly recurrent tree sets is invariant under derivation (TheorempropositionReturns

5.12). We

510

further prove that all bases of the free group included in a uniformly recurrent

511

tree set are tame (TheoremtheoremTame

5.18).

512

5.1 Tree sets

513

Let S be a fixed factorial set. For a biextendable word w, we consider the

514

undirected graphG(w) on the set of vertices which is the disjoint union ofL(w)

515

and R(w) with edges the pairs (a, b) ∈ E(w). The graph G(w) is called the

516

extension graph ofwin S.

517

Example 5.1 Let S be the Fibonacci set. The extension graphs of ε, a, b, ab

518

respectively are shown in Figure FigureExtensionGraph

5.1.

b

a b

a

a a

a b

a

Figure 5.1: The extension graphs ofε, a, b, abin the Fibonacci set. FigureExtensionGraph

519

Recall that an undirected graph is a tree if it is connected and acyclic.

520

We say thatS is atree set (resp. an acyclic set) if it is biextendable and if

521

for every wordw∈S, the graph G(w) is a tree (resp. is acyclic).

522

It is not difficult to verify the following statement (see [4], Proposition 4.3)

523

which shows that the factor complexity of a tree set is linear.

524

(17)

propositionComplexity⁵²⁵ Proposition 5.2 Let S be a tree set on the alphabet A and let k = Card(A∩ S)−1. Then Card(S∩Aⁿ) =kn+ 1 for alln≥0

526

The following result is also easy to prove.

527

propositionSturmianisNormal⁵²⁸ Proposition 5.3 A Sturmian set S is a uniformly recurrent tree set.

Proof. We have already seen that a Sturmian set is uniformly recurrent. Let

529

us show that it is a tree set. Considerw∈S. If wis not left-special there is a

530

uniquea∈Asuch thataw∈S. ThenE(w)⊂ {a} ×Aand thusG(w) is a tree.

531

The case wherewis not right-special is symmetrical. Finally, assume thatwis

532

bispecial. Let a, b ∈A be such that aw is right-special and wb is left-special.

533

ThenE(w) = ({a} ×A)∪(A× {b}) and thusG(w) is a tree.

534

Putting together PropositionspropositionRegularUR

3.6 and Proposition 5.8 in [5], we have the similar

535

statement.

536

propositionExchangeTreeCondition⁵³⁷ Proposition 5.4 A regular interval exchange set is a uniformly recurrent tree set.

538

PropositionpropositionExchangeTreeCondition

5.4 is actually a particular case of a result of [18] which charac-

539

terizes the regular interval exchange sets.

540

We give an example of a uniformly recurrent tree set which is neither a

541

Sturmian set nor an interval exchange set.

542

exampleTribonacci21⁵⁴³ Example 5.5 Let S be the Tribonacci set on the alphabetA ={a, b, c} (see Example exampleTribonacci

2.2). LetX =A²∩S. ThenX ={aa, ab, ac, ba, ca}is anS-maximal

544

bifix code of S-degree 2. LetB = {x, y, z, t, u} and let f : B^∗ → A^∗ be the

545

morphism defined byf(x) = aa, f(y) = ab, f(z) =ac, f(t) = ba, f(u) = ca.

546

Thenf is a coding morphism for X. We will see that the set G= f⁻¹(S) is

547

a uniformly recurrent tree set (this follows from TheoremtheoremNormal

7.1 below). It is not

548

Sturmian sincey and tare two right-special words of length 1. It is not either

549

an interval exchange set. Indeed, for any right-special word w of G, one has

550

r(w) = 3. This is not possible in a regular interval exchange set T since, ΣT

551

being injective, the length of the intervalJwtends to 0 as|w| tends to infinity.

552

LetS be a set of words. Forw∈S, andU, V ⊂S, letU(w) ={ℓ∈U |ℓw∈

553

S} and let V(w) = {r ∈ V | wr ∈ S}. The generalized extension graph of w

554

relative toU, V is the following undirected graphGU,V(w). The set of vertices

555

is made of two disjoint copies ofU(w) andV(w). The edges are the pairs (ℓ, r)

556

for ℓ ∈ U(w) and r ∈ V(w) such that ℓwr ∈ S. The extension graph G(w)

557

defined previously corresponds to the case whereU, V =A.

558

The following result is proved in [4] (Proposition 4.9).

559

PropStrongTreeCondition⁵⁶⁰ Proposition 5.6 Let S be a tree set. For any w ∈ S, any finite S-maximal suffix codeU ⊂S and any finite S-maximal prefix code V ⊂S, the generalized

561

extension graphGU,V(w)is a tree.

562

(18)

Let S be a recurrent set and let f be a coding morphism for a finite S-

563

maximal bifix code. The setf⁻¹(S) is called amaximal bifix decoding ofS.

564

The following result is Theorem 4.13 in [4].

565

InverseImageTree⁵⁶⁶ Theorem 5.7 Any maximal bifix decoding of a recurrent tree set is a tree set.

We have no example of a bifix decoding of a recurrent tree set which is not

567

recurrent (in view of Theorem theoremNormal

7.1 to be proved hereafter, such a set would be

568

the decoding of a recurrent tree set which is not uniformly recurrent).

569

5.2 The finite index basis property

570

sectionNormal

Let S be a recurrent set containing the alphabet A. We say that S has the

571

finite index basis property if the following holds. A finite bifix codeX ⊂S is

572

anS-maximal bifix code ofS-degreedif and only if it is a basis of a subgroup

573

of indexdof the free group onA.

574

We recall the main result of [5] (Theorem 6.1).

575

theoremFIB⁵⁷⁶ Theorem 5.8 A uniformly recurrent tree set containing the alphabet Ahas the finite index basis property.

577

Recall from Section subsectionautomata

2.3 that a group code of degree d is a bifix code X such

578

thatX^∗=ϕ⁻¹(H) for a surjective morphismϕ:A^∗→GfromA^∗ onto a finite

579

groupGand a subgroupH of indexdofG.

580

We will use the following result. It is stated for a Sturmian set S in [2]

581

(Theorem 7.2.5) but the proof only uses the fact thatS is uniformly recurrent

582

and satisfies the finite index basis property. We reproduce the proof for the sake

583

of clarity.

584

For a set of words X, we denote by hXi the subgroup of the free group on

585

Agenerated byX. The free group on Aitself is consistently denotedhAi.

586

theoremGroupCode⁵⁸⁷ Theorem 5.9 Let Z ⊂A⁺ be a group code of degree d. For every uniformly recurrent tree set S containing the alphabet A, the set X =Z∩S is a basis of

588

a subgroup of indexdof hAi.

589

Proof. By Theorem 4.2.11 in [2], the code X is an S-maximal bifix code of

590

S-degreee≤d. SinceS is a uniformly recurrent, by Theorem 4.4.3 of [2],X is

591

finite. By Theorem5.8,^theoremFIBX is a basis of a subgroup of indexe. SincehXi ⊂ hZi,

592

the index e of the subgrouphXi is a multiple of the index d of the subgroup

593

hZi. Sincee≤d, this implies thate=d.

594

As an example of this result, if S is a uniformly recurrent tree set, then

595

S∩Aⁿ is a basis of the subgroup formed by the words of length multiple ofn

596

(where the length is not the length of the reduced word but the sum of values

597

1 for the letters inAand−1 for the letters inA⁻¹).

598

We will use the following results from [4]. The first one is Corollary 5.8 in [4].

599

(19)

theoremJulien⁶⁰⁰ Theorem 5.10 LetS be a uniformly recurrent tree set containing the alphabet A. For any wordw∈S, the setR^S(w) is a basis of the free group onA.

601

The next result is Theorem 6.2 in [4]. A submonoidM ofA^∗issaturated in

602

a setSifM ∩S=hMi ∩S.

603

propositionHcapF⁶⁰⁴ Theorem 5.11 LetS be an acyclic set. The submonoid generated by any bifix codeX ⊂S is saturated in S.

605

5.3 Derived sets of tree sets

606

We will use the following closure property of the family of uniformly recurrent

607

tree sets. It generalizes the fact that the derived word of a Sturmian word is

608

Sturmian (see [21]).

609

propositionReturns⁶¹⁰ Theorem 5.12 Any derived set of a uniformly recurrent tree set is a uniformly recurrent tree set.

611

Proof. LetS be a uniformly recurrent tree set containingA, letv ∈S and let

612

f be a coding morphism forX =R^S(v). By Theorem theoremJulien

5.10,X is a basis of the

613

free group onA. Thus f :B^∗→A^∗ extends to an isomorphism fromhBi onto

614

hAi.

615

Set H =f⁻¹(v⁻¹S). By Proposition propositionRecurrent

4.3, the set H is recurrent and H =

616

f⁻¹(ΓS(v))∪ {1}.

617

Consider x ∈ H and set y = f(x). Let f^′ be the coding morphism for

618

X^′=R^′S(v) associated withf. Fora, b∈B, we have

619

(a, b)∈G(x)⇔(f^′(a), f(b))∈GX^′,X(vy),

whereGX^′,X(vy) denotes the generalized extension graph ofvyrelative toX^′, X.

620

Indeed,

621

axb∈H⇔f(a)yf(b)∈ΓS(v)⇔vf(a)yf(b)∈S⇔f^′(a)vyf(b)∈S.

The setX^′ is anSv⁻¹-maximal suffix code and the setX is a v⁻¹S-maximal

622

prefix code. By PropositionPropStrongTreeCondition

5.6 the generalized extension graphGX^′,X(vy) is a

623

tree. Thus the graphG(x) is a tree. This shows thatH is a tree set.

624

Consider now x ∈ H \1. Set y = f(x). Let us show that ΓH(x) =

625

f⁻¹(ΓS(vy)) or equivalently f(ΓH(x)) = ΓS(vy). Consider first r ∈ ΓH(x).

626

Sets=f(r). Thenxr=ux withu, ux∈H. Thusys=wy withw=f(u).

627

Since u ∈ H \ {1}, w = f(u) is in ΓS(v), we have vw ∈ A⁺v∩S. This

628

implies that vys = vwy ∈ A⁺vy∩S and thus that s ∈ ΓS(vy). Conversely,

629

considers∈ ΓS(vy). Since y =f(x), we haves∈ΓS(v). Sets=f(r). Since

630

vys∈A⁺vy∩S, we haveys∈A⁺y∩S. Setys=wy. Thenvwy∈A⁺vyimplies

631

vw∈A⁺v and thereforew∈ΓS(v). Settingw=f(u), we obtainf(xr) =ys=

632

wy∈X⁺y∩ΓS(v). Thusr∈ΓH(x). This shows thatf(ΓH(x)) = ΓS(vy) and

633

thus thatR^H(x) =f⁻¹(R^S(vy)).

634

SinceSis uniformly recurrent, the setR^S(vy) is finite. Sincef is an isomor-

635

phism,R^H(x) is also finite, which shows thatH is uniformly recurrent.

636

Maximal bifix decoding