On some problems related to palindrome closure


DOI:10.1051/ita:2007064 www.rairo-ita.org

ON SOME PROBLEMS RELATED TO PALINDROME CLOSURE ∗

Michelangelo Bucci ¹, Aldo de Luca ¹, Alessandro De Luca ¹ and Luca Q. Zamboni ²

Abstract. In this paper, we solve some open problems related to (pseudo)palindrome closure operators and to the infinite words generated by their iteration, that is, standard episturmian and pseudostandard words. We show that if ϑ is an involutory antimorphism of A^∗, then the right and left ϑ-palindromic closures of any factor of a ϑ-standard word are also factors of some ϑ-standard word. We also introduce the class of pseudostandard words with “seed”, obtained by iterated pseudopalindrome closure starting from a nonempty word. We show that pseudostandard words with seed are morphic images of standard episturmian words. Moreover, we prove that for any given pseudostandard word s with seed, all sufficiently long left special factors of s are prefixes of it.

Mathematics Subject Classification. 68R15.

Introduction

Sturmian words are a classical subject of combinatorics on words (see for instance [3]); by definition, they are infinite words having n + 1 factors (i.e., blocks of consecutive symbols) of each length n. Sturmian words enjoy many interesting characterizations and have a wide range of applications, from discrete geometry to crystallography.

Keywords and phrases. Palindromes, palindrome closures, Sturmian and episturmian words, involutory antimorphisms, pseudopalindromes, pseudostandard words.

∗ The work for this paper has been supported by the Italian Ministry of Education under Project COFIN 2005 – Automi e Linguaggi Formali: aspetti matematici e applicativi.

¹ Dipartimento di Matematica e Applicazioni “R. Caccioppoli”, Università degli Studi di Napoli Federico II. Via Cintia, Monte S. Angelo, I-80126 Napoli, Italy; [email protected]; [email protected]; [email protected]

² Department of Mathematics, PO Box 311430, University of North Texas. Denton TX, USA; [email protected]

Article published by EDP Sciences. © EDP Sciences 2008

M. BUCCI ET AL.

Palindrome closure operators, introduced in [5], have had an important role in the study of Sturmian words. If w is a word, its right (resp. left) palindrome closure w⁽⁺⁾ (resp. w⁽⁻⁾) is the shortest palindrome having w as a prefix (resp. suffix).

Standard Sturmian words can be constructed by iterated palindrome closure, that is, by the following procedure. Start from the empty word, and successively add a letter from {a, b} and apply the right palindrome closure operator. In this way one generates a sequence of palindromes, each one being a prefix of the next one, so that a limit is naturally defined. If both a and b are used infinitely many times during this process, the infinite word obtained as a limit is aperiodic, and is exactly a standard Sturmian word. For any Sturmian word, there exists a (unique) standard Sturmian word having the same factors.

In recent years, many extensions of Sturmian words have appeared. In particular, in [7] episturmian words were introduced (see also [10]). They can be defined as words having the same set of factors as a standard episturmian word, which is just a word obtained by iterated palindrome closure over an arbitrary alphabet (and without the aperiodicity condition).

A further generalization was introduced in [6], by substituting palindrome closure with pseudopalindrome closure. A pseudopalindrome is a fixed point of some involutory antimorphism ϑ of a free monoid A^∗. Thus ordinary palindromes are a special case of pseudopalindromes where the antimorphism is simply the reversal operator. We speak of ϑ-palindromes when a particular antimorphism ϑ is chosen. It is then natural to consider ϑ-palindrome closure operators, and to look at words obtained by iterated ϑ-palindrome closure, called ϑ-standard (or generally pseudostandard) words.

In this paper, we discuss some properties related to (pseudo)palindrome closure, episturmian and ϑ-standard words. In [6] it was proven that both palindromic closures w⁽⁺⁾ and w⁽⁻⁾ of a factor w of a Sturmian word are themselves factors of Sturmian words. In Section 2, this property is proved for episturmian words.

In Section 3, the closure property is extended to factors of ϑ-standard words too. We also show that a ϑ-standard word having both closures as factors always exists. Moreover, we prove that every left special factor of a ϑ-standard word t, whose length is at least 3, is a prefix of t. Recall that a factor u of a (finite or infinite) word w over an alphabet A is left (resp. right) special if there exist at least two distinct letters a, b ∈ A such that both au and bu (resp. ua and ub) are factors of w.

In the last section we introduce the class of ϑ-standard words with seed. They are infinite words obtained by iterated ϑ-palindrome closure, starting from an arbitrary word u₀ (called seed) instead of the empty word. We show that every ϑ-standard word with seed is a morphic image of a standard episturmian word.

More precisely, if Δ = xx₁x₂ … x_n … is the infinite sequence of letters which directs the construction of a ϑ-standard word t with a seed, then t = φ_x(s), where φ_x is a morphism depending on ϑ and u₀, and s is the standard episturmian word directed by Δ′ = x₁x₂ … x_n …

Finally, we show that every sufficiently long left special factor of a ϑ-standard word with seed is a prefix of it, and give an upper bound for the minimal length from which this occurs, in terms of the length of the right ϑ-palindrome closure of u₀x. This proves a conjecture posed in [6].


1. Preliminaries

Let A be a finite alphabet and A^∗ the free monoid generated by A. The elements of A are usually called letters and those of A^∗ words. The identity element of A^∗ is called the empty word and denoted by ε. We set A⁺ = A^∗ \ {ε}. A word w ∈ A⁺ can be written uniquely as a sequence of letters w = a₁a₂ … a_n, with a_i ∈ A, i = 1, …, n. The integer n is called the length of w and is denoted by |w|. The length of ε is conventionally 0.

Let w ∈ A^∗. A word v is a factor of w if there exist words r and s such that w = rvs. If w = vs for some word s (resp. w = rv for some word r), then v is called a prefix (resp. a suffix) of w. A word which is both a prefix and a suffix of w is called a border of w. We shall denote respectively by Fact(w), Pref(w), and Suff(w) the sets of all factors, prefixes, and suffixes of the word w.

For X ⊆ A^∗ and u ∈ A^∗, u⁻¹X and Xu⁻¹ denote respectively the sets {w ∈ A^∗ | uw ∩ X ≠ ∅} and {w ∈ A^∗ | wu ∩ X ≠ ∅}.

When X is a singleton {x} and u⁻¹X ≠ ∅ (resp. Xu⁻¹ ≠ ∅), the unique word w ∈ u⁻¹{x} (resp. w ∈ {x}u⁻¹) is denoted by u⁻¹x (resp. xu⁻¹).

If w = a₁ … a_n ∈ A^∗, a_i ∈ A, i = 1, …, n, the reversal, or mirror image, of w is the word

˜w = a_n … a₁.

One sets ˜ε = ε. A word is called a palindrome if it is equal to its reversal. Any border of a palindrome is trivially a palindrome. We shall denote by PAL(A), or simply PAL, the set of all palindromes on the alphabet A.

An infinite word (from left to right) x over the alphabet A is any map x : N₊ → A, where N₊ is the set of positive integers. We can represent x as

x = x₁x₂ … x_n … ,

where for any i > 0, x_i = x(i) ∈ A. A (finite) factor of x is either the empty word or any sequence u = x_i … x_j with i ≤ j, i.e., any block of consecutive letters of x. If i = 1, then u is a prefix of x. We shall denote by Fact(x) and Pref(x) the sets of finite factors and prefixes of x respectively. The set of all infinite words over A is denoted by A^ω. Moreover, we set A^∞ = A^∗ ∪ A^ω.

The product between a finite word w and an infinite one x is naturally defined as the infinite word wx. An occurrence of the word v in w ∈ A^∞ is any pair (r, s), with r ∈ A^∗ and s ∈ A^∞, such that w = rvs. If w ∈ A^∗ and a ∈ A, |w|_a denotes the number of distinct occurrences of a in w.

If x ∈ A and vx (resp. xv) is a factor of w ∈ A^∞, then vx (resp. xv) is called a right (resp. left) extension of v in w. We recall that a factor v of a (finite or infinite) word w is called right special if it has at least two distinct right extensions in w, i.e., there exist at least two distinct letters a, b ∈ A such that both va and vb are factors of w. Left special factors are defined analogously. A factor of w is called bispecial if it is both right and left special.
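The notions of right, left, and bispecial factor can be checked mechanically on a finite word. The following Python sketch is only an illustration of the definitions: the function names `factors`, `right_special`, `left_special`, and `bispecial` are our own and do not appear in the paper, and the sample word is a short prefix of the Fibonacci word.

```python
from collections import defaultdict


def factors(w, n):
    """Set of all factors (blocks of consecutive letters) of length n of w."""
    return {w[i:i + n] for i in range(len(w) - n + 1)}


def right_special(w, n):
    """Factors v of length n with at least two distinct right extensions in w."""
    ext = defaultdict(set)
    for f in factors(w, n + 1):
        ext[f[:-1]].add(f[-1])      # f = v x, record the letter x after v
    return {v for v, letters in ext.items() if len(letters) >= 2}


def left_special(w, n):
    """Factors v of length n with at least two distinct left extensions in w."""
    ext = defaultdict(set)
    for f in factors(w, n + 1):
        ext[f[1:]].add(f[0])        # f = x v, record the letter x before v
    return {v for v, letters in ext.items() if len(letters) >= 2}


def bispecial(w, n):
    """Factors of length n that are both right and left special in w."""
    return right_special(w, n) & left_special(w, n)


w = "abaababaabaab"                 # a prefix of the Fibonacci word
print(right_special(w, 2))          # {'ba'}
print(left_special(w, 2))           # {'ab'}
print(bispecial(w, 3))              # {'aba'}
```

On a finite word, a factor occurring only at the very end has no right extension, so the sets computed here are subsets of the special factors of any infinite word having `w` as a prefix.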


We call a factor w of a word s ∈ A^∞ a first return to u if w contains exactly two distinct occurrences of u, one as a prefix and the other as a suffix, i.e.,

w = uλ = μu with λ, μ ∈ A⁺ and w ∉ A⁺uA⁺.

We observe that in such a case, wu⁻¹ = μ is usually called a return word over u in s (see [8]).

A word s ∈ A^ω is said to be closed under reversal if for any u ∈ Fact(s) one has ˜u ∈ Fact(s). In this case, a factor u of s is right special if and only if ˜u is a left special factor of s.

A word w ∈ A^ω is called episturmian if it is closed under reversal and it has at most one right (or equivalently, left) special factor of each length. We recall (see [7]) that every episturmian word is uniformly recurrent, i.e., every factor of an episturmian word occurs infinitely often, with bounded gaps.

An episturmian word w is called standard if every left special factor of w is a prefix of it. We denote by Ep(A), or simply Ep, the set of all episturmian words over A, and by SEp the set of standard ones.

Proposition 1.1 (cf. [7]). For every episturmian word w, there exists a standard episturmian word s such that Fact(s) = Fact(w).

Thus Fact(Ep) = Fact(SEp). The elements of Fact(Ep) are called finite episturmian words.

Given a word w ∈ A^∗, we denote by w⁽⁺⁾ its right palindrome closure, i.e., the shortest palindrome having w as a prefix. Similarly, w⁽⁻⁾ is the left palindrome closure of w. For instance, if w = abacbca, then w⁽⁺⁾ = abacbcaba and w⁽⁻⁾ = acbcabacbca.

For any w ∈ A^∗, w⁽⁻⁾ is the reversal of (˜w)⁽⁺⁾. Moreover, if Q is the longest palindromic suffix of w and w = sQ, then w⁽⁺⁾ = sQ˜s.
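The formula w⁽⁺⁾ = sQ˜s translates directly into a short procedure: find the longest palindromic suffix Q and append the reversal of what precedes it. The Python sketch below is a minimal illustration under that reading; the names `right_closure` and `left_closure` are ours, and the quadratic search for Q is chosen for clarity rather than efficiency.

```python
def right_closure(w):
    """Right palindrome closure: shortest palindrome having w as a prefix."""
    for i in range(len(w) + 1):
        q = w[i:]
        if q == q[::-1]:
            # The first hit is the longest palindromic suffix Q, and w = sQ
            # with s = w[:i]; the closure is sQ followed by the reversal of s.
            return w + w[:i][::-1]


def left_closure(w):
    """Left palindrome closure, obtained as the reversal of the right
    closure of the reversal of w."""
    return right_closure(w[::-1])[::-1]


print(right_closure("abacbca"))  # abacbcaba
print(left_closure("abacbca"))   # acbcabacbca
```

The two printed values reproduce the example w = abacbca given above.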

Let ψ : A^∗ → A^∗ be defined by ψ(ε) = ε and ψ(va) = (ψ(v)a)⁽⁺⁾ for any a ∈ A and v ∈ A^∗. For any u, v ∈ A^∗, one has ψ(uv) ∈ ψ(u)A^∗ ∩ A^∗ψ(u). The map ψ can then be naturally extended to A^ω by setting, for any infinite word x,

ψ(x) = lim_{n→∞} ψ(w_n), where {w_n} = Pref(x) ∩ Aⁿ for all n ≥ 0.

Proposition 1.2 (cf. [7]). Let s ∈ A^ω. The following conditions are equivalent:

(1) s is a standard episturmian word,

(2) for any prefix u of s, u⁽⁺⁾ is also a prefix of s,

(3) there exists x ∈ A^ω such that s = ψ(x).

Given a standard episturmian word s, the (unique) infinite word x such that s = ψ(x) is called the directive word of s and is denoted by Δ(s), or simply by Δ.

From the preceding proposition, one can easily derive (cf. [7]) that the set of palindromic prefixes of a standard episturmian word s coincides with

{ψ(u) | u ∈ Pref(Δ(s))}.


A standard episturmian word s over the alphabet A is called a (standard) Arnoux-Rauzy word if every symbol of A occurs infinitely often in the associated directive word Δ(s). We will denote by AR(A), or simply AR, the set of Arnoux-Rauzy words over A. In the case of a binary alphabet, an AR-word is usually called a standard Sturmian word.

Example 1.3. Let A = {a, b} and x = (ab)^ω. One has that

f = ψ(x) = abaababaabaababa …

is the famous Fibonacci word, a standard Sturmian word. On an alphabet with three letters A = {a, b, c}, if we take x = (abc)^ω as a directive word, then

τ = ψ(x) = abacabaabacababacabaabac …

is a standard Arnoux-Rauzy word, often called the Tribonacci word. The word s = cabaabacababacabaab … such that abas = τ is an example of an episturmian word which is not standard, as a is a left special factor of s but not a prefix of it.

The periodic word s = (abac)^ω is standard episturmian, but not Arnoux-Rauzy. Its directive word is Δ(s) = abc^ω.
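The three words of this example can be regenerated from their directive words by iterating the right palindrome closure on finite prefixes. The following Python sketch is a hedged illustration; `right_closure` and `psi` are names we introduce here, and only finite prefixes of the directive words are used, so the results are prefixes of the infinite words.

```python
def right_closure(w):
    """Right palindrome closure: shortest palindrome having w as a prefix."""
    for i in range(len(w) + 1):
        if w[i:] == w[i:][::-1]:        # longest palindromic suffix found
            return w + w[:i][::-1]


def psi(directive):
    """Iterated palindrome closure of a finite directive word:
    psi(empty) = empty and psi(v a) = (psi(v) a)^(+)."""
    s = ""
    for letter in directive:
        s = right_closure(s + letter)
    return s


fibonacci_prefix = psi("ababab")
tribonacci_prefix = psi("abcabc")
periodic_prefix = psi("abcccc")

print(fibonacci_prefix[:16])    # abaababaabaababa
print(tribonacci_prefix[:24])   # abacabaabacababacabaabac
print(("abac" * 10).startswith(periodic_prefix))  # True
```

Since ψ(uv) always has ψ(u) as a prefix, lengthening the directive prefix only extends the computed word.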

The following proposition can be easily proved using well-known results on episturmian words (see [7]).

Proposition 1.4. Let s be a standard episturmian word. Any bispecial factor of s is a palindromic prefix of s. If s is not periodic, the converse holds too.

Proposition 1.5. Fact(Ep) = Fact(AR).

Proof. Let u ∈ Fact(Ep) = Fact(SEp). Hence there exists s ∈ SEp such that u ∈ Fact(s). Now let s = ψ(Δ), where Δ = t₁t₂ … t_n …, with t_i ∈ A for i ≥ 1. Therefore there exists a palindromic prefix p of s such that u ∈ Fact(p). Now p = ψ(t₁ … t_i) for some i. We can consider Δ′ = t₁ … t_i t′ with t′ ∈ A^ω such that any letter of A occurs infinitely many times in t′. Hence s′ = ψ(Δ′) ∈ AR and contains p as a factor, so that u ∈ Fact(s′). Therefore, Fact(Ep) ⊆ Fact(AR). Since the inverse inclusion is trivial, the result follows.

The following proposition collects two properties of standard episturmian words (cf. Lems. 1 and 4 in [7]) which will be useful in the sequel.

Proposition 1.6 (cf. [7]). Let s be a standard episturmian word. The following hold:

(1) Any prefix p of s has a palindromic suffix which has a unique occurrence in p.

(2) The first letter of s occurs in every factor of s having length 2.

Clearly, if p is a prefix of a standard episturmian word, then the palindromic suffix of p which has a unique occurrence in p is the longest palindromic suffix of p.


2. A closure property

We want to show that if w ∈ Fact(Ep), then also its right and left palindrome closures belong to Fact(Ep); since episturmian words are closed under reversal, and w⁽⁻⁾ is the reversal of (˜w)⁽⁺⁾, it suffices to prove only the right palindrome closure case. We have the following:

Proposition 2.1. Let u be a non-palindromic finite episturmian word; let Q be the longest palindromic suffix of u and write u = saQ where a ∈ A and s ∈ A^∗ (s possibly empty). Then ua = saQa is a finite episturmian word.

Before proving the proposition we need some lemmas. The first lemma was proved in [1], Theorem 1.1. We report here a different and simpler proof.

Lemma 2.2. Let w be an episturmian word and P ∈ PAL ∩ Fact(w). Then every first return to P in w is a palindrome.

Proof. By Proposition 1.1, we may always suppose that w is a standard episturmian word. Let u ∈ Fact(w) be a first return to the palindrome P, i.e., u = Pλ = ρP, λ, ρ ∈ A^∗, and the only two occurrences of P in u are as a prefix and as a suffix of u. If |P| > |ρ|, then the prefix P of u overlaps with the suffix P in u and this implies, as is easy to verify, that u is a palindrome. Then let us suppose that u = PvP with v ∈ A^∗.

Now we consider the first occurrence of u or of ˜u in w. Without loss of generality, we may suppose that w = αuw′ and that ˜u does not occur in the prefix of w having length |αu| − 1. Let Q be the palindromic suffix of αu of maximal length.

If |Q| > |u|, then we have that ˜u occurs in αu before u, which is absurd. Then suppose |Q| ≤ |u|. If |u| > |Q| > |P|, then one contradicts the hypothesis that u is a first return to P. If |Q| = |P|, then Q = P has more than one occurrence in αu, which is absurd in view of Proposition 1.6. The only remaining possibility is Q = u, i.e., u is a palindrome.

The following lemma is well-known. We report here a proof for the sake of completeness.

Lemma 2.3. Let w ∈ AR and s be its unique right special factor of length n. If B₁, …, B_m, … are the bispecial factors of w ordered by increasing length, then s is a suffix of any B_m such that |s| ≤ |B_m| and, for any x ∈ A, sx ∈ Fact(w).

Proof. Since w is not periodic, by Proposition 1.4 the bispecial factors B_i, i > 0, are its palindromic prefixes. Moreover, if t = t₁t₂ … t_n … ∈ A^ω is the directive word of w, then B_{i+1} = (B_i t_i)⁽⁺⁾ for any i > 0. Since s is a right special factor of w, ˜s is left special and thus a prefix of w. Therefore, s is a suffix of any palindromic prefix B_m of w such that |s| ≤ |B_m|. As w ∈ AR, any letter x ∈ A occurs infinitely often in t; hence there exists k ≥ m such that x = t_k, so that B_k x is a factor of w. Since B_m is a suffix of B_k, it follows that sx ∈ Fact(w).

Lemma 2.4. Let w and w′ be Arnoux-Rauzy words on the alphabet A. If w and w′ have the same right special factor of length n, then they share the same factors up to length n + 1.


Proof. Trivial if n = 0. By induction, suppose we have proved the assertion for the integer n − 1 ≥ 0. Let Q be the common right special factor of w and w′ of length n. If we write Q = aQ′, with a ∈ A, then Q′ is the only right special factor of length n − 1 of both w and w′. Hence w and w′ have the same factors up to length n.

By symmetry, it suffices to prove that any factor v of w, of length |v| = n + 1, is also a factor of w′. Let v = v′b, b ∈ A. Suppose first that v′ = Q. By Lemma 2.3, each right extension Qx, with x ∈ A, is a factor of both w and w′; in particular, v is a factor of w′.

Now assume that v′ ≠ Q. Let v′ = cv″ with c ∈ A, and suppose that v″ = Q′. One has then c ≠ a. In this case, since v = cv″b and Qb = av″b are different factors of w, one has that v″b is left special in w. Since |v″b| = n, one derives that v″b = ˜Q is a left special factor of w′ too, so that v = cv″b is a factor of w′ as a consequence of Lemma 2.3.

If v″ ≠ Q′, then v″b is the unique right extension of v″ in w. As |v″b| = n, it is also a factor of w′, and no other letter x is such that v″x ∈ Fact(w′). Hence v = cv″b is the only right extension in w′ of the factor cv″ ≠ Q, so that v ∈ Fact(w′).

We can now proceed to prove Proposition 2.1.

Proof of Proposition 2.1. We first observe that u contains a single occurrence of Q. Indeed, if u contained other occurrences of Q, by Lemma 2.2 the suffix of u beginning with the penultimate occurrence would be a palindromic suffix of u strictly longer than Q, contradicting the hypothesis of maximality of the length of Q.

By Proposition 1.5 there exists an Arnoux-Rauzy word w such that u ∈ Fact(w). We can assume that ua ∉ Fact(w) (otherwise ua is in Fact(AR) as required); so there exists b ∈ A such that b ≠ a and ub ∈ Fact(w). Thus aQb ∈ Fact(w); since Q is a palindrome and w ∈ AR, also bQa ∈ Fact(w) and Q is a bispecial factor of w.

Then it follows that every left special factor of w longer than Q must contain Q as a prefix, and since there is only a single occurrence of Q in u, Q itself is the longest suffix of u which is left special in w. Thus every occurrence of aQ in w must be “preceded” by s, i.e., if w = λaQμ, then w = λ′saQμ, with λ = λ′s. In particular aQa is not a factor of w, for otherwise ua would be in Fact(w), contradicting our assumption.

Set Δ(w) = t₁t₂ … Let B₁ = ε, B₂, … be the sequence of all bispecial factors of w, ordered by increasing length, i.e., |B_i| < |B_{i+1}| for all i > 0. By Proposition 1.4, they are the palindromic prefixes of w, as w is not periodic. Moreover, for each i > 0 we have B_{i+1} = (B_i t_i)⁽⁺⁾, so that B_i t_i is left special and t_i B_i is right special.

Since Q is a bispecial factor of w, one has Q = B_m for some m > 1. Let |Q| = n − 1 for n ≥ 2. We then have that t_mQ is right special in w and, from Lemma 2.3, t_mQx ∈ Fact(w) for all x ∈ A. It is clear that t_m ≠ a, since aQa ∉ Fact(w) and t_mQa ∈ Fact(w); then we have that aQb and t_mQb are distinct factors of w, thus Qb is left special and bQ is the unique right special factor of w of length n. So t_m = b.

Let w′ be any Arnoux-Rauzy sequence over A, whose directive word Δ(w′) = t′₁t′₂ … satisfies t′_i = t_i for 0 < i ≤ m − 1 and t′_m = a. Since Q is the unique right special factor of w and w′ of length n − 1, from Lemma 2.4, we obtain that w and w′ have the same factors of length k for each k ≤ n. However, they differ on some factors of length n + 1. Indeed, from the definition of w′, we have that aQ is its unique right special factor of length n, so that by Lemma 2.3, for all x ∈ A we have that aQx ∈ Fact(w′). Therefore aQa ∈ Fact(w′) \ Fact(w).

Now let us prove that, as in w, each occurrence of aQ in w′ is preceded by s.

Let p ∈ A^∗ be such that |p| = |s| and paQ ∈ Fact(w′). Let then S be the largest common suffix of paQ and saQ and Q′ its prefix of length n − 1. Clearly Q′ ≠ Q since there is only one occurrence of Q in saQ. If we assume that S ≠ paQ, then there exist x, y ∈ A such that x ≠ y, xS ∈ Suff(saQ) and yS ∈ Suff(paQ); then xQ′ and yQ′ are both factors of w and w′ since these latter words have the same factors of length n. Thus Q′ is a left special factor of w and w′, and that is a contradiction, since the only left special factor of length n − 1 in w and in w′ is Q.

Thus p = s and so every occurrence of aQ in w′ is preceded by s.

Since aQa is a factor of w′, it follows that saQa = ua is a factor of w′. Hence ua is in Fact(AR) as required.

From the preceding proposition one derives the following theorem, announced without proof in [6].

Theorem 2.5. If w is a finite episturmian word, then so is each of w⁽⁺⁾ and w⁽⁻⁾.

Proof. Trivial if w ∈ PAL. Let then w = a₁ … a_nQ, where a_i ∈ A for i = 1, …, n and Q is the longest palindromic suffix of w. By Proposition 2.1, wa_n = a₁ … a_nQa_n is a finite episturmian word; since its longest palindromic suffix is a_nQa_n, also wa_na_{n−1} is episturmian. In this way, by applying Proposition 2.1 exactly n times, one eventually obtains that

a₁a₂ … a_nQa_n … a₂a₁ = w⁽⁺⁾

is episturmian. Since w⁽⁻⁾ is the reversal of (˜w)⁽⁺⁾, the assertion follows.

Corollary 2.6. Let a ∈ A and u ∈ A^∗. If au is a finite episturmian word, then so is au⁽⁺⁾.

Proof. If au is not a palindrome, then by Theorem 2.5, (au)⁽⁺⁾ = au⁽⁺⁾a is an episturmian word and therefore so is au⁽⁺⁾. Let us then suppose that au is a palindrome.

By Theorem 2.5 one has u⁽⁺⁾ ∈ Fact(s) for a suitable s ∈ AR. Since s is recurrent there exist letters x, y ∈ A such that

xu⁽⁺⁾y ∈ Fact(s).

If x ≠ y, then, since s is closed under reversal, one has also yu⁽⁺⁾x ∈ Fact(s). Hence u⁽⁺⁾ is bispecial, so that it follows au⁽⁺⁾ ∈ Fact(s). Let us now consider the case x = y. If x = a, then the assertion is trivially verified.


Suppose then x ≠ a. As au is a palindrome, we can write u = u′a with u′ ∈ PAL. Hence,

x(u′a)⁽⁺⁾x ∈ Fact(s).

Since (u′a)⁽⁺⁾ begins with u′a and ends with au′, one has that xu′a and au′x are factors of s, so that u′ is bispecial and then a palindromic prefix of s by Proposition 1.4.

Let Δ(s) = t₁t₂ … t_n … be the directive word of s. There exists an integer k such that u′ = ψ(t₁t₂ … t_k). We consider any AR word s′ whose directive word Δ(s′) has the prefix t₁t₂ … t_ka. Thus u′a = u is a prefix of s′. This implies, by Propositions 1.2 and 1.4, that u⁽⁺⁾ is a bispecial prefix of s′. From this one derives

au⁽⁺⁾ ∈ Fact(s′).

3. Pseudostandard words

An involutory antimorphism of the free monoid A^∗ is a map ϑ : A^∗ → A^∗ such that ϑ(uv) = ϑ(v)ϑ(u) for any u, v ∈ A^∗, and ϑ∘ϑ = id. The reversal operator

R : w ∈ A^∗ ↦ ˜w ∈ A^∗

is the basic example of an involutory antimorphism of A^∗. Any involutory antimorphism is the composition ϑ = τ∘R = R∘τ, where τ is an involutory permutation of the alphabet A. Thus it makes sense to call ϑ-palindromes the fixed points of an involutory antimorphism ϑ. We shall denote by PAL_ϑ the set of ϑ-palindromes over A. One can then define the ϑ-palindrome closure operators: w^{⊕_ϑ} (resp. w^{⊖_ϑ}) denotes the shortest ϑ-palindrome having w as a prefix (resp. suffix).

Some properties and results relating ϑ-palindrome closure operators with periodicity and conjugacy are in [6]. Further interesting combinatorial properties of ϑ-palindromes, motivated by problems of molecular biology, have been recently studied in [11].

In the following, we shall fix an involutory antimorphism ϑ of A^∗, and use the notation ¯w for ϑ(w). We shall also drop the subscript ϑ from the ϑ-palindrome closure operators ⊕_ϑ and ⊖_ϑ when no confusion arises. If w = sQ = Pt, where Q (resp. P) is the longest ϑ-palindromic suffix (resp. prefix) of w, then

w^⊕ = sQ¯s and w^⊖ = ¯tPt

(see [6]). Moreover, from the definition it follows that

w^⊖ = ϑ((¯w)^⊕) for any w ∈ A^∗.

For example, when A = {a, b} and ϑ = E∘R, where E is the interchange morphism defined by E(a) = b and E(b) = a, one has (aabab)^⊕ = aababb and (aabab)^⊖ = ababbaabab.
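The closure formulas w^⊕ = sQ¯s and w^⊖ = ϑ((¯w)^⊕) can be turned into a small computation. The Python sketch below is an illustration only: an antimorphism ϑ = τ∘R is represented by a dictionary `tau` giving the involutory letter permutation τ, and the names `theta`, `right_theta_closure`, and `left_theta_closure` are assumptions of ours, not notation from the paper.

```python
E = {"a": "b", "b": "a"}    # interchange morphism on {a, b}


def theta(w, tau):
    """Image of w under the antimorphism tau o R: reverse w, then apply tau."""
    return "".join(tau[c] for c in reversed(w))


def right_theta_closure(w, tau):
    """Shortest theta-palindrome having w as a prefix."""
    for i in range(len(w) + 1):
        q = w[i:]
        if q == theta(q, tau):               # longest theta-palindromic suffix Q
            return w + theta(w[:i], tau)     # w = sQ  gives  sQ theta(s)


def left_theta_closure(w, tau):
    """Shortest theta-palindrome having w as a suffix, computed as
    theta of the right closure of theta(w)."""
    return theta(right_theta_closure(theta(w, tau), tau), tau)


print(right_theta_closure("aabab", E))  # aababb
print(left_theta_closure("aabab", E))   # ababbaabab
```

Taking `tau` to be the identity permutation gives back the ordinary palindrome closures.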

The following lemma will be useful in the sequel.


Lemma 3.1. For any u ∈ PAL_ϑ \ {ε} and a ∈ A, (ua)^⊕ is a first return to u, i.e., if (ua)^⊕ = λuρ with λ, ρ ∈ A^∗, then either λ = ε or ρ = ε.

Proof. By contradiction, let λ, ρ ∈ A⁺ be such that

(ua)^⊕ = λuρ. (1)

Clearly |λ| + |u| + |ρ| = |(ua)^⊕| ≤ 2|u| + 2, which implies |λ| ≤ |u| + 2 − |ρ| ≤ |u| + 1.

Let us show that actually one has |λ| ≤ |u|. Indeed, if λ = ua then from (1) one derives |(ua)^⊕| = 2|u| + 2; this implies that a ∉ PAL_ϑ and (ua)^⊕ = ua¯au = uauρ, so that uρ = ¯au. It follows that for some k > 0, u = ¯a^k ∉ PAL_ϑ, a contradiction.

Let then v, w ∈ A^∗ be such that u = λv and (ua)^⊕ = uw = ¯wu, whence λuρ = uw = λvw. Thus uρ = vw, so that v is also a prefix of u and therefore a border of u. Since u is a ϑ-palindrome, v is a ϑ-palindrome too, so that u = λv = v¯λ. Therefore

(ua)^⊕ = λuρ = λv¯λρ.

Thus λv¯λ is a ϑ-palindrome beginning with ua and strictly shorter than (ua)^⊕, which is a contradiction.

We can naturally define a map ψ_ϑ : A^∗ → A^∗ by ψ_ϑ(ε) = ε and

ψ_ϑ(ua) = (ψ_ϑ(u)a)^⊕

for u ∈ A^∗, a ∈ A. For any u, v ∈ A^∗ one has ψ_ϑ(uv) ∈ ψ_ϑ(u)A^∗ ∩ A^∗ψ_ϑ(u), so that, as done for the iterated palindrome closure, the domain of ψ_ϑ can be extended to infinite words too. More precisely, if x ∈ A^ω, then

ψ_ϑ(x) = lim_{n→∞} ψ_ϑ(w_n),

where {w_n} = Pref(x) ∩ Aⁿ for all n ≥ 0. The word x is called the directive word of ψ_ϑ(x) and is denoted by Δ(ψ_ϑ(x)). The images of infinite words over A by ψ_ϑ have been called ϑ-standard words in [6]. If ϑ = R, then ψ_R = ψ, where ψ is the iterated palindrome closure operator introduced in Section 1, so that an R-standard word is a standard episturmian word. A ϑ-standard word, without specifying the antimorphism ϑ, has been called a pseudostandard word.

Example 3.2. Let A = {a, b} and ϑ = E∘R, so that ¯a = b. For x = (ab)^ω, we have ψ_ϑ(a) = ab, ψ_ϑ(ab) = abbaab, and

s = ψ_ϑ(x) = abbaababbaabbaab …

The word s is the ϑ-standard word having x as its directive word.
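The values in this example can be recomputed by iterating the right ϑ-palindrome closure along a finite prefix of the directive word. The sketch below assumes ϑ = E∘R on {a, b}; the helper names `theta`, `right_theta_closure`, and `psi_theta` are ours and purely illustrative.

```python
E = {"a": "b", "b": "a"}    # interchange morphism on {a, b}


def theta(w):
    """theta = E o R: reverse w and exchange the letters a and b."""
    return "".join(E[c] for c in reversed(w))


def right_theta_closure(w):
    """Shortest theta-palindrome having w as a prefix."""
    for i in range(len(w) + 1):
        if w[i:] == theta(w[i:]):       # longest theta-palindromic suffix
            return w + theta(w[:i])


def psi_theta(directive):
    """Iterated theta-palindrome closure of a finite directive word."""
    s = ""
    for letter in directive:
        s = right_theta_closure(s + letter)
    return s


print(psi_theta("a"))           # ab
print(psi_theta("ab"))          # abbaab
print(psi_theta("abab")[:16])   # abbaababbaabbaab
```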

The following theorem, proven in [6], shows that any ϑ-standard word is a morphic image of the standard episturmian word having the same directive word.

Theorem 3.3. For any w ∈ A^∞, one has ψ_ϑ(w) = μ_ϑ(ψ(w)), where μ_ϑ is the injective morphism defined as μ_ϑ(a) = a^⊕ for any letter a ∈ A.


For instance, one easily verifies that the word s of Example 3.2 is equal to μ(f), where f is the Fibonacci word and μ = μ_ϑ is the Thue-Morse morphism defined as μ(a) = ab, μ(b) = ba.
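The identity of Theorem 3.3 can be tested exhaustively on short directive words for the antimorphism ϑ = E∘R of Example 3.2. The following Python sketch is a sanity check under that assumption, not a proof; the generic helpers `closure` and `iterate` take the antimorphism as a function argument, and all names are ours.

```python
from itertools import product

E = {"a": "b", "b": "a"}


def reverse(w):
    """The reversal operator R."""
    return w[::-1]


def theta(w):
    """theta = E o R."""
    return "".join(E[c] for c in reversed(w))


def closure(w, anti):
    """Shortest fixed point of the antimorphism `anti` having w as a prefix."""
    for i in range(len(w) + 1):
        if w[i:] == anti(w[i:]):
            return w + anti(w[:i])


def iterate(directive, anti):
    """Iterated closure (psi for anti = reverse, psi_theta for anti = theta)."""
    s = ""
    for letter in directive:
        s = closure(s + letter, anti)
    return s


def mu(w):
    """Morphism mu_theta: each letter a is replaced by its theta-closure."""
    return "".join(closure(c, theta) for c in w)


def theorem_holds(max_len):
    """Check psi_theta(w) == mu(psi(w)) for all w over {a, b}, |w| <= max_len."""
    return all(
        iterate(w, theta) == mu(iterate(w, reverse))
        for n in range(max_len + 1)
        for w in product("ab", repeat=n)
    )


print(mu("a"), mu("b"))     # ab ba
print(theorem_holds(6))     # True
```

For this ϑ the morphism `mu` is the Thue-Morse morphism, so `mu(iterate("abab", reverse))` is a prefix of the word s of Example 3.2.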

A new proof of Theorem 3.3 will be given in Section 4, as a consequence of a more general result. Some general properties of ϑ-standard words have been considered in [6]. In particular, we recall that

Proposition 3.4. Let s = ψ_ϑ(x) be a ϑ-standard word. The following hold:

(1) w is a prefix of s if and only if w^⊕ is a prefix of s,

(2) the set of all ϑ-palindromic prefixes of s is given by ψ_ϑ(Pref(x)),

(3) s is closed under ϑ, i.e., if w ∈ Fact(s), then ¯w ∈ Fact(s).

Moreover, the following holds:

Proposition 3.5. If s is a ϑ-standard word over A and two letters of A occur infinitely often in Δ(s), then any prefix of s is a left special factor of s.

Proof. A prefix p of s is also a prefix of any ϑ-palindromic prefix B of s such that |p| ≤ |B|. Since B is a suffix of any ϑ-palindromic prefix of s whose length is at least |B|, and there exist two distinct letters (say a and b) which occur infinitely often in Δ(s), by Proposition 3.4 one derives Ba, Bb ∈ Fact(s). Therefore, as ¯p ∈ Suff(B), we have ¯pa, ¯pb ∈ Fact(s), i.e., ¯p is right special. Since by Proposition 3.4 s is closed under ϑ, one has ¯ap, ¯bp ∈ Fact(s); as ¯a ≠ ¯b, p is left special.

For the converse of the previous proposition, we observe that a ϑ-standard word s can have left special factors which are not prefixes of s. For instance, consider the ϑ-standard word s in Example 3.2. As one easily verifies, b and ba are two left special factors of s, which are not prefixes.
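This observation can be checked on a finite prefix of the word s of Example 3.2. The Python sketch below computes the prefix ψ_ϑ(ababab) for ϑ = E∘R and lists its left special factors; the helper names are ours. Left special factors found in a finite prefix are also left special in s, so the check is one-directional: it can confirm that b and ba are left special, and that every longer left special factor seen in the prefix is a prefix of s.

```python
from collections import defaultdict

E = {"a": "b", "b": "a"}


def theta(w):
    """theta = E o R."""
    return "".join(E[c] for c in reversed(w))


def psi_theta(directive):
    """Iterated right theta-palindrome closure of a finite directive word."""
    s = ""
    for letter in directive:
        s += letter
        # append theta(prefix) in front of the longest theta-palindromic suffix
        i = next(i for i in range(len(s) + 1) if s[i:] == theta(s[i:]))
        s = s + theta(s[:i])
    return s


def left_special(w, n):
    """Factors of length n of w preceded in w by at least two distinct letters."""
    ext = defaultdict(set)
    for i in range(len(w) - n):
        ext[w[i + 1:i + 1 + n]].add(w[i])
    return {v for v, letters in ext.items() if len(letters) >= 2}


s = psi_theta("ababab")     # a prefix of the word s of Example 3.2

print("b" in left_special(s, 1), "ba" in left_special(s, 2))   # True True
print(s.startswith("b"))                                       # False
print(all(s.startswith(v) for n in range(3, 12) for v in left_special(s, n)))  # True
```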

However, we will show that if a left special factor w of a ϑ-standard word s is not a prefix of s, then |w| ≤ 2. For a proof of this we need a couple of lemmas.

We denote by 𝒜 = A \ PAL_ϑ the set of letters of A that are not ϑ-palindromic.

Lemma 3.6. The following holds:

𝒜μ_ϑ(A^∗) ∩ μ_ϑ(A^∗) = μ_ϑ(A^∗)𝒜 ∩ μ_ϑ(A^∗) = ∅.

Proof. It is sufficient to observe that any word in μ_ϑ(A^∗) has an even number of occurrences of letters in 𝒜.

Lemma 3.7. Let b, c ∈ A \ PAL_ϑ, and let f = ¯bμ_ϑ(u) and g = μ_ϑ(v)c be factors of a ϑ-standard word t = μ_ϑ(s), with s ∈ SEp. Then:

(1) If bu, vc ∈ Fact(s) and |f| > 1, then f ≠ g.

(2) If u ∈ Fact(s) and |f| > 3, then bu ∈ Fact(s).

Proof. (1) Since |f| > 1, one has u ≠ ε. By contradiction, if f = g, one has also v ≠ ε, so that, from the definition of μ_ϑ, ¯bb is a prefix of μ_ϑ(v). Then b¯b is a prefix of μ_ϑ(u), and so on; therefore, f = ¯b(b¯b)^k = (¯bb)^k¯b for k = |u| = |v| ≥ 1. Hence c = ¯b, u = b^k, and v = ¯b^k. As k ≥ 1, by Proposition 1.6, bu = b^{k+1} and vc = ¯b^{k+1} cannot be both factors of the episturmian word s, a contradiction.


(2) Since |f| > 3, one derives |u| > 1. By contradiction, suppose bu ∉ Fact(s). By the preceding lemma and by Theorem 3.3, one derives f = μ_ϑ(v)c for some suitable v ∈ A^∗ and c ∈ A \ PAL_ϑ such that vc ∈ Fact(s). As done before, one then obtains f = (¯bb)^k¯b so that b^k, ¯b^k ∈ Fact(s), which is absurd by Proposition 1.6, as k ≥ 2.

Theorem 3.8. Let w be a left special factor of a ϑ-standard word t = μ_ϑ(s), with s ∈ SEp. If |w| ≥ 3, then w is a prefix of t.

Proof. By Theorem 3.3, w can be written in one of the following ways:

(1) w = μ_ϑ(u), with u ∈ Fact(s),

(2) w = ¯bμ_ϑ(u), with bu ∈ Fact(s) and b ∈ A \ PAL_ϑ,

(3) w = μ_ϑ(u)c, with uc ∈ Fact(s) and c ∈ A \ PAL_ϑ,

(4) w = ¯bμ_ϑ(u)c, with buc ∈ Fact(s) and b, c ∈ A \ PAL_ϑ.

In case 1, let xw, yw ∈ Fact(t) with x ≠ y letters of A. If x is ϑ-palindromic, then clearly one must have xu ∈ Fact(s). If x ∈ A \ PAL_ϑ, then by the preceding lemma one has ¯xu ∈ Fact(s), as |xw| > 3. Since the same holds for y, u is a left special factor of the episturmian word s, and therefore a prefix of it. Thus w = μ_ϑ(u) is a prefix of t.

Cases 2 and 4 are absurd; indeed, by the preceding lemma one derives that every occurrence of w is preceded by b.

Finally, in case 3, by the preceding lemma one derives that every occurrence of w is followed by ¯c. Hence μ_ϑ(uc) is a left special factor of t and one can apply the same argument as in case 1 to show that it is a prefix of t.

An infinite word t is a ϑ-word if there exists a ϑ-standard word s such that Fact(t) = Fact(s). An R-word is an episturmian word.

Proposition 2.1 and Theorem 2.5 can be extended to the class of ϑ-words, showing that if w is a factor of a ϑ-word, then w^⊕ and w^⊖ are also factors of ϑ-words. A proof can be obtained as a consequence of Theorems 2.5 and 3.3 and of Corollary 2.6. However, we need the following lemma (cf. [6]):

Lemma 3.9. Let u ∈ A^∗ and x ∈ A ∪ {ε}. Then

(μ_ϑ(u)x)^⊕ = μ_ϑ((ux)⁽⁺⁾).
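The identity of this lemma lends itself to a brute-force check on short words. The Python sketch below tests it for ϑ = E∘R on {a, b}, where μ_ϑ is the Thue-Morse morphism; this is only a sanity check on a single antimorphism, and the helper names `closure`, `mu`, and `lemma_holds` are assumptions of ours.

```python
from itertools import product

E = {"a": "b", "b": "a"}


def reverse(w):
    """The reversal operator R."""
    return w[::-1]


def theta(w):
    """theta = E o R."""
    return "".join(E[c] for c in reversed(w))


def closure(w, anti):
    """Shortest fixed point of the antimorphism `anti` having w as a prefix."""
    for i in range(len(w) + 1):
        if w[i:] == anti(w[i:]):
            return w + anti(w[:i])


def mu(w):
    """mu_theta maps each letter to its right theta-palindrome closure."""
    return "".join(closure(c, theta) for c in w)


def lemma_holds(max_len):
    """Check (mu(u) x)^(+theta) == mu((u x)^(+)) for |u| <= max_len
    and x a letter of {a, b} or the empty word."""
    for n in range(max_len + 1):
        for u in map("".join, product("ab", repeat=n)):
            for x in ("", "a", "b"):
                if closure(mu(u) + x, theta) != mu(closure(u + x, reverse)):
                    return False
    return True


print(closure(mu("aab") + "a", theta))      # ababbaabab
print(mu(closure("aaba", reverse)))         # ababbaabab
print(lemma_holds(5))                       # True
```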

Theorem 3.10. Let w be a factor of a ϑ-standard word. Then each of w^⊕ and w^⊖ is a factor of a ϑ-standard word.

Proof. We shall suppose w ∉ PAL_ϑ, otherwise the result is trivial. Since w^⊖ = ϑ((¯w)^⊕), it suffices to prove the result for w^⊕. Let 𝒜 = A \ PAL_ϑ as above. From Theorem 3.3, one derives that w can be written in one of the following ways:

(1) w = μ_ϑ(u)x, with x ∈ 𝒜 ∪ {ε} and ux ∈ Fact(Ep),

(2) w = ¯aμ_ϑ(u)b, with a, b ∈ 𝒜 and aub ∈ Fact(Ep),

(3) w = ¯aμ_ϑ(u), with a ∈ 𝒜 and au ∈ Fact(Ep).


In the first case, by Theorem 2.5 there exists a standard episturmian word s = ψ(Δ) such that (ux)⁽⁺⁾ ∈ Fact(s). Thus, by Lemma 3.9 and Theorem 3.3, w^⊕ = μ_ϑ((ux)⁽⁺⁾) is a factor of the ϑ-standard word ψ_ϑ(Δ) = μ_ϑ(s).

In the second case, by using Lemma 3.9, one has:

w^⊕ = ¯a(μ_ϑ(u)b)^⊕a = ¯aμ_ϑ((ub)⁽⁺⁾)a ∈ Fact(μ_ϑ(a(ub)⁽⁺⁾a)).

Moreover, aub is not a palindrome, since otherwise one would derive, for instance using Lemma 3.9, that w = ¯aμ_ϑ(u)b is a ϑ-palindrome, which contradicts our assumption. Thus (aub)⁽⁺⁾ = a(ub)⁽⁺⁾a and the result is a consequence of Theorem 3.3.

In the third case, since w is not a ϑ-palindrome, by Lemma 3.9 one obtains

w^⊕ = ¯a(μ_ϑ(u))^⊕a ∈ Fact(μ_ϑ(au⁽⁺⁾a)).

If u = a^k for some k ≥ 0, then au⁽⁺⁾a = a^{k+2} ∈ Fact(Ep); otherwise au⁽⁺⁾ is not a palindrome and au⁽⁺⁾a = (au⁽⁺⁾)⁽⁺⁾, so that au⁽⁺⁾a is episturmian by Corollary 2.6 and Theorem 2.5. Once again, the assertion follows from Theorem 3.3.

Corollary 3.11. Let w be a factor of a ϑ-standard word. Then there exists a ϑ-standard word having bothw^⊕ and w as factors.

Proof. Trivial if w ∈ PAL_ϑ. Let then w = P bt = saQ, where P (resp. Q) is the longestϑ-palindromic preﬁx (resp. suﬃx) of w, anda, b ∈ A. Thusw¯a and

¯bw, being respectively factors of w^⊕ = saQ¯a¯s and w = ¯t¯bP bt, are factors of ϑ-standard words by Theorem3.10.

Supposew¯a /∈ PAL_ϑ. Then (w¯a) = aw¯a, so that w¯a is a factor of some ϑ-standard word, by Theorem3.10. Consider the word

(w¯a)^⊕ = (¯t¯bP bt¯a)^⊕= (¯t¯bsaQ¯a)^⊕,

and callQthe longestϑ-palindromic suffix ofw¯a; thenQ=aQ¯a. Indeed, since aQ¯ais aϑ-palindrome, one has|Q| ≥ |aQ¯a|; but|aQ¯a|<|Q| ≤ |saQ¯a|is absurd, forQwould not be the longestϑ-palindromic suffix ofw, and|Q|>|saQ¯a|cannot happen, for otherwise there would exist aϑ-palindromic proper suffix ofwhaving was a suffix, contradicting the definition ofw. Thus

(wa)¯ ^⊕= ¯t¯bsaQ¯a¯sbt= ¯t¯bP bt¯a¯sbt

is a factor of someϑ-standard word, again by Theorem3.10, and it contains both w^⊕ andw as factors.

If wā ∈ PAL_ϑ but b̄w ∉ PAL_ϑ, one can prove by a symmetric argument that (b̄w^⊕)^⊖ is a factor of some ϑ-standard word having both w^⊕ and w^⊖ as factors.

Let then wā, b̄w ∈ PAL_ϑ, so that

w^⊕ = wā = aw̄ and w^⊖ = b̄w = w̄b. (2)


If w is a single letter, one derives w = a = b, so that w^⊕ = aā and w^⊖ = āa. Therefore w^⊕ and w^⊖ are factors of any ϑ-standard word whose directive word begins with a². Let us then suppose |w| > 1. From (2) it follows w = aRb for some R ∈ A^∗ such that aR = R̄ā = P and Rb = b̄R̄ = Q. Moreover,

w = aRb = ab̄R̄ = R̄āb, (3)

showing that R̄ is a border of w. Therefore one has either w = (ab̄)^k or w = (ab̄)^k a, for some k > 0. In the first case, from (3) one derives a = ā and b = b̄, so that any ϑ-standard word whose directive word begins with ab^{k+1} contains both w^⊕ = (ab)^k a and w^⊖ = b(ab)^k as factors. In the latter case, by (3) one obtains a = b, so that any ϑ-standard word whose directive word begins with a^{k+1} contains both w^⊕ = (aā)^k and w^⊖ = (āa)^k as factors.
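The first of the two closing cases can be checked on a small instance. The sketch below is only an illustration under stated assumptions: ϑ fixes both letters a and b (so ϑ-palindromes are ordinary palindromes), k = 3, and the helper names `theta`, `right_closure`, `left_closure`, and `psi_theta` are hypothetical, not taken from the paper.

```python
def theta(word, letter_map):
    """Involutory antimorphism: reverse the word, then map each letter."""
    return "".join(letter_map[c] for c in reversed(word))

def right_closure(word, letter_map):
    """Shortest theta-palindrome having `word` as a prefix."""
    for i in range(len(word) + 1):
        suffix = word[i:]
        if suffix == theta(suffix, letter_map):  # longest theta-palindromic suffix
            return word + theta(word[:i], letter_map)

def left_closure(word, letter_map):
    """Shortest theta-palindrome having `word` as a suffix."""
    return right_closure(theta(word, letter_map), letter_map)

def psi_theta(directive, letter_map):
    """Iterated right theta-palindrome closure starting from the empty word."""
    result = ""
    for letter in directive:
        result = right_closure(result + letter, letter_map)
    return result

identity = {"a": "a", "b": "b"}   # assumption: theta fixes a and b
k = 3                             # assumption: small illustrative exponent
w = "ab" * k                      # w = (ab)^k
standard_prefix = psi_theta("a" + "b" * (k + 1), identity)
w_right = right_closure(w, identity)   # expected to be (ab)^k a
w_left = left_closure(w, identity)     # expected to be b(ab)^k
print(w_right in standard_prefix, w_left in standard_prefix)
```

Since every ϑ-standard word whose directive word begins with ab^{k+1} has `standard_prefix` as a prefix, finding both closures inside it is enough for this instance.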

Remark 3.12. For a finite episturmian word w, the proof of the preceding result can be simplified by using Theorem 2.5 and Corollary 2.6. Indeed, if w is not a palindrome, we can write w = Pbt = saQ, where P and Q are respectively the longest palindromic prefix and suffix of w, and a, b ∈ A. By Theorem 2.5, w⁽⁺⁾ and w⁽⁻⁾ are finite episturmian words; moreover bw is a factor of w⁽⁻⁾, so that by Corollary 2.6, bw⁽⁺⁾ is a finite episturmian word. By Theorem 2.5, (bw⁽⁺⁾)⁽⁻⁾ is a finite episturmian word, which also has w⁽⁻⁾ as a factor, as one can prove similarly to the proof of Corollary 3.11.

In the case of Sturmian words, results analogous to Theorem 3.10 and Corollary 3.11 were proven in [6] with a different and simpler technique based on the structure of finite Sturmian words.

Example 3.13. Let τ be the Tribonacci word

τ = ψ((abc)^ω) = abacabaabacababacabaabacabac . . .

If w = bac ∈ Fact(τ), one has that w⁽⁺⁾ = bacab and w⁽⁻⁾ = cabac are factors of τ. However, in the case of the factor v = abacabab, one has v⁽⁺⁾ = abacababacaba ∈ Fact(τ), whereas v⁽⁻⁾ = babacabab is not a factor of τ, since otherwise v would be a left special factor of τ, which is a contradiction as v ∉ Pref(τ). Nevertheless, both v⁽⁺⁾ and v⁽⁻⁾ are factors of any episturmian word whose directive word begins with abcbb. Indeed, v = Pb where P = abacaba is the longest palindromic prefix of v, and

(bv⁽⁺⁾)⁽⁻⁾ = abacababacababacaba = ψ(abcbb).
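The computations of Example 3.13 can be reproduced with a few lines of Python. This is a minimal sketch; the function names `right_closure`, `left_closure`, and `psi` are illustrative choices, and the finite word `tau_prefix` stands in for the infinite Tribonacci word (an assumption that suffices here because the factors involved are short).

```python
def right_closure(word):
    """Right palindrome closure: shortest palindrome having `word` as a prefix."""
    for i in range(len(word) + 1):
        if word[i:] == word[i:][::-1]:      # longest palindromic suffix
            return word + word[:i][::-1]

def left_closure(word):
    """Left palindrome closure: shortest palindrome having `word` as a suffix."""
    return right_closure(word[::-1])[::-1]

def psi(directive):
    """Iterated right palindrome closure starting from the empty word."""
    result = ""
    for letter in directive:
        result = right_closure(result + letter)
    return result

tau_prefix = psi("abcabcabc")   # a finite prefix of the Tribonacci word
v = "abacabab"

print(right_closure("bac"), left_closure("bac"))
print(right_closure(v) in tau_prefix, left_closure(v) in tau_prefix)
print(left_closure("b" + right_closure(v)) == psi("abcbb"))
```

Absence of v⁽⁻⁾ from a finite prefix is of course only consistent with, not a proof of, the statement that v⁽⁻⁾ is not a factor of τ; the argument via left special factors in the example is what settles it.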

4. Words generated by nonempty seeds

We now consider a generalization of the construction of ϑ-standard words. Define the map ψ̂_ϑ : A^∗ → A^∗ by setting ψ̂_ϑ(ε) = u₀ with u₀ a fixed word of A^∗ called seed, and

ψ̂_ϑ(ua) = (ψ̂_ϑ(u)a)^⊕ for u ∈ A^∗ and a ∈ A.


As usual, we can extend this definition to infinite words t ∈ A^ω by:

ψ̂_ϑ(t) = lim_{n→∞} ψ̂_ϑ(w_n),

where {w_n} = Pref(t) ∩ A^n for all n ≥ 0. The word t is called the directive word of ψ̂_ϑ(t), and denoted by Δ(ψ̂_ϑ(t)). When the seed u₀ is empty, one has ψ̂_ϑ = ψ_ϑ so that one obtains ϑ-standard words. If u₀ ≠ ε, then any word ψ̂_ϑ(t) is called ϑ-standard with seed.

Example 4.1. Let A = {a, b, c}, ϑ be the involutory antimorphism exchanging a and b and fixing c, u₀ = acbbc, and w = abc. Then

ψ̂_ϑ(w) = (ψ̂_ϑ(ab)c)^⊕ = ((ψ̂_ϑ(a)b)^⊕c)^⊕ = (((acbbca)^⊕b)^⊕c)^⊕ = ((acbbcaacbb)^⊕c)^⊕

= acbbcaacbbcaacbcacbbcaacbbcaacb.
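The iteration in Example 4.1 can be traced step by step. The following Python sketch assumes the alphabet, antimorphism, and seed of the example; the names `SWAP`, `theta`, `right_closure`, and `psi_hat` are hypothetical helpers introduced only for this illustration.

```python
SWAP = {"a": "b", "b": "a", "c": "c"}   # theta exchanges a and b and fixes c

def theta(word):
    """Involutory antimorphism: reverse the word, then map each letter."""
    return "".join(SWAP[c] for c in reversed(word))

def right_closure(word):
    """Right theta-palindrome closure of `word`."""
    for i in range(len(word) + 1):
        if word[i:] == theta(word[i:]):     # longest theta-palindromic suffix
            return word + theta(word[:i])

def psi_hat(directive, seed):
    """Iterated right theta-palindrome closure starting from `seed`."""
    result = seed
    for letter in directive:
        result = right_closure(result + letter)
    return result

# Successive values for the prefixes of the directive word abc.
steps = [psi_hat("abc"[:n], "acbbc") for n in range(4)]
for word in steps:
    print(len(word), word)
```

Each entry of `steps` is a prefix of the next one, which is what makes the limit over prefixes of an infinite directive word meaningful.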

Let t = xt₁t₂ . . ., with x ∈ A and t_i ∈ A for i ≥ 1. We remark that the set of ϑ-palindromic prefixes of the word w = ψ̂_ϑ(t) is

(PAL_ϑ ∩ Pref(u₀)) ∪ {u_n | n ≥ 1},

where u₁ = (u₀x)^⊕ and u_{i+1} = (u_it_i)^⊕ for i ≥ 1.
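As an illustration of this remark, one can list the ϑ-palindromic prefixes of the finite word of Example 4.1 and compare them with the two sources named above. This is a sketch under the assumptions of that example (seed acbbc, directive word abc); the helper names are hypothetical.

```python
SWAP = {"a": "b", "b": "a", "c": "c"}   # theta of Example 4.1

def theta(word):
    return "".join(SWAP[c] for c in reversed(word))

def right_closure(word):
    for i in range(len(word) + 1):
        if word[i:] == theta(word[i:]):
            return word + theta(word[:i])

def psi_hat(directive, seed):
    result = seed
    for letter in directive:
        result = right_closure(result + letter)
    return result

def pal_prefix_lengths(word):
    """Lengths of all theta-palindromic prefixes of `word` (the empty word included)."""
    return [n for n in range(len(word) + 1) if word[:n] == theta(word[:n])]

seed, directive = "acbbc", "abc"
from_seed = pal_prefix_lengths(seed)                    # PAL_theta ∩ Pref(u0)
from_iteration = [len(psi_hat(directive[:n], seed))     # lengths of u_1, u_2, u_3
                  for n in range(1, len(directive) + 1)]
observed = pal_prefix_lengths(psi_hat(directive, seed))
print(from_seed, from_iteration, observed)
```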

Define the endomorphism φ_x of A^∗ by setting φ_x(a) = ψ̂_ϑ(xa) ψ̂_ϑ(x)⁻¹ for any letter a ∈ A. From the definition, one has that φ_x depends on ϑ and u₀; moreover, φ_x(a) ends with ā for all a ∈ A, so that any word of the set X = φ_x(A) is uniquely determined by its last letter. Thus X is a suffix code, and φ_x is an injective morphism.

Example 4.2. Let A, ϑ, and u₀ be defined as in Example 4.1, and let x = a. Then

φ_a(a) = ψ̂_ϑ(aa) ψ̂_ϑ(a)⁻¹ = acbbcaacb,

φ_a(b) = ψ̂_ϑ(ab) ψ̂_ϑ(a)⁻¹ = acbbca, (4)

φ_a(c) = ψ̂_ϑ(ac) ψ̂_ϑ(a)⁻¹ = acbbcaacbc.
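The three images in Example 4.2 can be recomputed directly from the definition of φ_x. The sketch below assumes the data of Example 4.1; `phi_image` is a hypothetical helper that cancels the suffix ψ̂_ϑ(x) from ψ̂_ϑ(xa).

```python
SWAP = {"a": "b", "b": "a", "c": "c"}   # theta of Example 4.1
SEED = "acbbc"                           # seed u0 of Example 4.1

def theta(word):
    return "".join(SWAP[c] for c in reversed(word))

def right_closure(word):
    for i in range(len(word) + 1):
        if word[i:] == theta(word[i:]):
            return word + theta(word[:i])

def psi_hat(directive):
    result = SEED
    for letter in directive:
        result = right_closure(result + letter)
    return result

def phi_image(x, letter):
    """phi_x(letter): psi_hat(x + letter) with the suffix psi_hat(x) removed."""
    base, longer = psi_hat(x), psi_hat(x + letter)
    if not longer.endswith(base):
        raise ValueError("psi_hat(x) is expected to be a suffix of psi_hat(x + letter)")
    return longer[: len(longer) - len(base)]

images = {letter: phi_image("a", letter) for letter in "abc"}
print(images)
```

Each image ends with the ϑ-image of its argument, in line with the observation that φ_x(A) is a suffix code.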

To simplify the notation, in the following proofs we shall often omit the subscript x from φ_x when no confusion arises.

Theorem 4.3. Fix x ∈ A and u₀ ∈ A^∗. Let ψ̂_ϑ and φ_x be defined as above. Then for any w ∈ A^∗, the following holds:

ψ̂_ϑ(xw) = φ_x(ψ(w)) ψ̂_ϑ(x).
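Before reading the proof, the identity of Theorem 4.3 can be tested exhaustively on short directive words. The following Python sketch is a numerical sanity check, not a proof; it assumes the alphabet, antimorphism, and seed of Example 4.1, and all function names are hypothetical.

```python
from itertools import product

IDENTITY = {"a": "a", "b": "b", "c": "c"}   # ordinary reversal, used for psi
SWAP = {"a": "b", "b": "a", "c": "c"}       # theta of Example 4.1
SEED = "acbbc"                               # seed u0 of Example 4.1

def theta(word, letter_map):
    return "".join(letter_map[c] for c in reversed(word))

def right_closure(word, letter_map):
    for i in range(len(word) + 1):
        if word[i:] == theta(word[i:], letter_map):
            return word + theta(word[:i], letter_map)

def iterate(directive, seed, letter_map):
    result = seed
    for letter in directive:
        result = right_closure(result + letter, letter_map)
    return result

def psi(w):
    """Iterated palindrome closure (standard episturmian prefixes)."""
    return iterate(w, "", IDENTITY)

def psi_hat(w):
    """Iterated theta-palindrome closure with seed."""
    return iterate(w, SEED, SWAP)

def phi(x, word):
    """Apply the morphism phi_x letter by letter."""
    base = psi_hat(x)
    images = {c: psi_hat(x + c)[: len(psi_hat(x + c)) - len(base)] for c in "abc"}
    return "".join(images[c] for c in word)

def theorem_holds(x, w):
    return psi_hat(x + w) == phi(x, psi(w)) + psi_hat(x)

all_ok = all(theorem_holds(x, "".join(w))
             for x in "abc"
             for n in range(4)
             for w in product("abc", repeat=n))
print(all_ok)
```

For instance, with x = a and w = bc one has ψ(bc) = bcb, and φ_a(bcb) ψ̂_ϑ(a) reproduces the word computed in Example 4.1.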


Proof. In the following we shall often use the property that if γ is an endomorphism of A^∗ and v is a suffix of u ∈ A^∗, then γ(uv⁻¹) = γ(u)γ(v)⁻¹.

We will prove the theorem by induction on |w|. For w = ε the claim is trivially true, since ψ(ε) = ε = φ(ε). Suppose that the statement holds for all words shorter than w. For |w| > 0, we set w = vy with y ∈ A.

First we consider the case |v|_y ≠ 0. We can then write v = v₁yv₂ with |v₂|_y = 0, so that

ψ̂_ϑ(xv) = ψ̂_ϑ(xv₁yv₂) = ψ̂_ϑ(xv₁)yλ = λ̄ȳψ̂_ϑ(xv₁),

for a suitable λ ∈ A^∗. Note that ψ̂_ϑ(xv₁) is the largest ϑ-palindromic prefix (resp. suffix) followed (resp. preceded) by y (resp. ȳ) in ψ̂_ϑ(xv). Therefore,

ψ̂_ϑ(xvy) = λ̄ȳψ̂_ϑ(xv₁)yλ = ψ̂_ϑ(xv) ψ̂_ϑ(xv₁)⁻¹ ψ̂_ϑ(xv). (5)

By a similar argument one has:

ψ(vy) = ψ(v)ψ(v₁)⁻¹ψ(v). (6)

By induction we have:

ψ̂_ϑ(xv) = φ(ψ(v)) ψ̂_ϑ(x), ψ̂_ϑ(xv₁) = φ(ψ(v₁)) ψ̂_ϑ(x).

Replacing in (5), and by (6), we obtain

ψ̂_ϑ(xvy) = φ(ψ(v)) φ(ψ(v₁))⁻¹ φ(ψ(v)) ψ̂_ϑ(x)

= φ(ψ(v)ψ(v₁)⁻¹ψ(v)) ψ̂_ϑ(x)

= φ(ψ(vy)) ψ̂_ϑ(x),

which was our aim.
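Formulas (5) and (6) used in this first case can be checked numerically on short words. The sketch below is an illustration only: it assumes the antimorphism and seed of Example 4.1 with x = a, reads u⁻¹ as cancellation of the suffix u from the word to its left, and uses hypothetical helper names.

```python
from itertools import product

IDENTITY = {"a": "a", "b": "b", "c": "c"}
SWAP = {"a": "b", "b": "a", "c": "c"}   # theta of Example 4.1
SEED = "acbbc"                           # seed u0 of Example 4.1
X = "a"                                  # assumption: first letter x = a

def theta(word, letter_map):
    return "".join(letter_map[c] for c in reversed(word))

def right_closure(word, letter_map):
    for i in range(len(word) + 1):
        if word[i:] == theta(word[i:], letter_map):
            return word + theta(word[:i], letter_map)

def iterate(directive, seed, letter_map):
    result = seed
    for letter in directive:
        result = right_closure(result + letter, letter_map)
    return result

def psi(w):
    return iterate(w, "", IDENTITY)

def psi_hat(w):
    return iterate(w, SEED, SWAP)

def justin_like(f, prefix, v, y):
    """Check f(prefix v y) = f(prefix v) f(prefix v1)^{-1} f(prefix v), with v = v1 y v2, |v2|_y = 0."""
    v1 = v[: v.rindex(y)]                 # part of v before the last occurrence of y
    whole, inner = f(prefix + v), f(prefix + v1)
    return f(prefix + v + y) == whole[: len(whole) - len(inner)] + whole

cases = [("".join(v), y)
         for n in range(1, 4)
         for v in product("abc", repeat=n)
         for y in set(v)]                 # only letters y occurring in v
formula_5_ok = all(justin_like(psi_hat, X, v, y) for v, y in cases)
formula_6_ok = all(justin_like(psi, "", v, y) for v, y in cases)
print(formula_5_ok, formula_6_ok)
```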

Now suppose that |v|_y = 0 and PAL_ϑ ∩ Pref(u₀x)y⁻¹ ≠ ∅. Let α_y be the longest word in PAL_ϑ ∩ Pref(u₀x)y⁻¹, that is, the longest ϑ-palindromic prefix of u₀x which is followed by y. Since |v|_y = 0, one derives that the longest ϑ-palindromic suffix of ψ̂_ϑ(xv)y is ȳα_yy, whence

ψ̂_ϑ(xvy) = (ψ̂_ϑ(xv)y)^⊕ = ψ̂_ϑ(xv) α_y⁻¹ ψ̂_ϑ(xv). (7)

By induction, this implies

ψ̂_ϑ(xvy) = φ(ψ(v)) ψ̂_ϑ(x) α_y⁻¹ φ(ψ(v)) ψ̂_ϑ(x). (8)

By using (7) for v = ε, one has ψ̂_ϑ(xy) = ψ̂_ϑ(x) α_y⁻¹ ψ̂_ϑ(x), and

φ(y) = ψ̂_ϑ(xy) (ψ̂_ϑ(x))⁻¹ = ψ̂_ϑ(x) α_y⁻¹.