Hierarchies and reducibilities on regular languages related to modulo counting∗

DOI: 10.1051/ita:2007063 www.rairo-ita.org

Victor L. Selivanov¹

Abstract. We discuss some known and introduce some new hierarchies and reducibilities on regular languages, with the emphasis on the quantifier-alternation and difference hierarchies of the quasi-aperiodic languages. The non-collapse of these hierarchies and decidability of some levels are established. Complete sets in the levels of the hierarchies under the polylogtime and some quantifier-free reducibilities are found. Some facts about the corresponding degree structures are established. As an application, we characterize the regular languages whose balanced leaf-language classes are contained in the polynomial hierarchy. For any discussed reducibility we try to give motivations and open questions, in the hope of convincing the reader that the study of these reducibilities is interesting for automata theory and computational complexity.

Mathematics Subject Classification. 03D05, 03C13, 68Q15.

1. Introduction

The notion of hierarchy appeared in descriptive set theory as a classification tool for characterizing complexity of sets studied in analysis. The notion of reducibility appeared in computability theory and plays a central role in the classification of undecidable problems.

Keywords and phrases. Regular language, quasi-aperiodic regular language, quantifier-alternation hierarchy, difference hierarchy, polylogtime reducibility, quantifier-free reducibility, forbidden pattern.

∗ Supported by the Alexander von Humboldt Foundation, by DFG Mercator program and by RFBR grant 07-01-00543a.

¹ A.P. Ershov Institute of Informatics Systems, Siberian Division of the Russian Academy of Sciences; vseliv@nspu.ru

Article published by EDP Sciences. © EDP Sciences 2008

Later, different notions of hierarchies and reducibilities were employed in different branches of computation theory and of definability theory (examples in descriptive set theory are the Borel hierarchy and the Wadge reducibility; in complexity theory, the polynomial-time hierarchy and the polynomial-time m-reducibility; in finite model theory, the logical hierarchies and reducibilities; and so on). Some of these hierarchies and reducibilities also turned out to be quite important for the corresponding fields.

Hierarchies in automata theory (e.g. the dot-depth hierarchy) were introduced long ago [7]. More recently, people began to consider reducibilities inducing non-trivial degree structures on the regular sets (i.e., on the languages recognized by finite automata) [3,12,13,38,39,43]. In particular, there exists a natural quantifier-free reducibility that fits the dot-depth hierarchy in the sense that every level is downward closed and has a complete set under this reducibility.

In this paper, we continue to discuss some known and introduce some new hierarchies and reducibilities on the regular sets, with the emphasis on the quantifier-alternation and difference hierarchies of the quasi-aperiodic languages (axiomatized by sentences of signatures containing predicates that count positions modulo a given number). The non-collapse of these hierarchies and decidability of some levels are established. Complete sets in the levels of the hierarchies under the polylogtime and quantifier-free reducibilities are found. Several facts on the corresponding degree structures are established. As an application, we characterize the regular languages whose leaf-language classes (in the balanced model of leaf language definability) are contained (uniformly on oracles) in the polynomial hierarchy.

This paper is closely related to [12–14,26,27,35,38,39], and we often refer to the results and proofs there. Reading of our paper would become much easier with these sources at hand.

In Section 2 we recall some notions and known facts and state some new facts related to the logical approach to automata theory. In Section 3 we consider hierarchies of regular languages induced by the quantifier-alternation hierarchies of first-order formulas. Sections 4–7 are devoted to the difference hierarchies over levels of the quantifier-alternation hierarchies. In Section 8 we discuss the important polylogtime reducibility closely related to the so-called leaf language approach to complexity classes, which is described rather comprehensively in [45]. In Sections 9 and 10 we consider some versions of the quantifier-free reducibility [38,39] which fit the introduced hierarchies. In Section 11 we present some results on the first-order reducibilities. We conclude in Section 12 by mentioning some other reducibilities and open questions.

2. Regular languages and logic

We use (mostly without definitions here) some standard terminology and notation from computability theory, automata theory and complexity theory, say the terminology on reducibilities and degrees, the notation of languages by regular expressions or the concept of polynomial-time non-deterministic Turing machine.

Letters $A, B$ will denote alphabets, which are always assumed to contain at least two symbols. By $A^+$ we denote the set of all non-empty words over $A$, and by $A^*$ the set of all words (including the empty word $\varepsilon$). For any $k$, let $A^k$ denote the set of words over $A$ of length $k$; the notations $A^{\le k}$ and $A^{>k}$ are defined in the same manner. Since we usually work with a fixed alphabet $A$, we normally do not mention the alphabet explicitly. The length of a word $w$ is denoted $|w|$. For every $i < |w|$, $w(i)$ denotes the $i$th letter of $w$ (thus, we start the numbering of letters in $w$ with 0). By $\#_a(w)$ we denote the number of occurrences of the letter $a$ in the word $w$.

Since we use the logical approach to regular languages [4,20,40,42] and the empty structures are not usual in logic, we work mostly with the languages of nonempty words $L \subseteq A^+$. Correspondingly, the complement $\overline{L}$ of such a language $L$ is defined by $\overline{L} = A^+ \setminus L$. As usual, $P(A^+)$ denotes the power set of $A^+$. By P(A⁺) we denote the class of all non-trivial (i.e. distinct from $\emptyset$ and $A^+$) languages over $A$. By R (R) we denote the class of all regular (respectively, regular non-trivial) languages over $A$. For a class $\mathcal{C}$ of languages, let $\mathrm{BC}(\mathcal{C})$ be the Boolean closure of $\mathcal{C}$, i.e., the closure of $\mathcal{C}$ under union and complement. By co-$\mathcal{C}$ we denote the class of complements of languages in $\mathcal{C}$.

Relate to any alphabet $A = \{a, \dots\}$ the signature $\sigma = \sigma_A = \{\le, Q_a, \dots, \bot, \top, p, s\}$, where $\le$ is a binary relation symbol, $Q_a$ (for each $a \in A$) is a unary relation symbol, $\bot$ and $\top$ are constant symbols, and $p, s$ are unary function symbols.

A word $u = u_0 \dots u_n \in A^+$ may be considered as a structure $\mathbf{u} = (\{0, \dots, n\}; \le, Q_a, \dots)$ of signature $\sigma$, where $\le$ has its usual meaning, $Q_a$ ($a \in A$) are unary predicates on $\{0, \dots, n\}$ defined by $Q_a(i) \leftrightarrow u_i = a$, the symbols $\bot$ and $\top$ denote respectively the least and the greatest elements, while $p$ and $s$ are respectively the predecessor and successor functions on $\{0, \dots, n\}$ satisfying $p(0) = 0$ and $s(n) = n$.

For a sentence $\varphi$ of $\sigma$, let $L_\varphi = \{u \in A^+ \mid \mathbf{u} \models \varphi\}$. Sentences $\varphi, \psi$ are treated as equivalent when $L_\varphi = L_\psi$. In [20] it was shown that the class of $\mathrm{FO}_\sigma$-axiomatizable languages (i.e., languages of the form $L_\varphi$, where $\varphi$ ranges through the first-order sentences of $\sigma$) coincides with the class of regular aperiodic languages (also known as star-free languages).

Remark. In this paper we use the term “axiomatizable” to denote the languages of the form $L_\varphi$, instead of the term “definable” that is more popular in the literature on automata theory, because our term corresponds better to the old tradition in logic. Note that the term “finitely axiomatizable” would be even more appropriate, but we can safely use the abbreviated form because we consider only finitely axiomatizable languages.

We will also consider some subsignatures and enrichments of the signature $\sigma$. In particular, let $\rho = \{<, Q_a, \dots\}$, and for any positive integer $d$ let $\tau_d$ be the signature $\sigma \cup \{P_d^0, \dots, P_d^{d-1}\}$, where $P_d^r$ is the unary predicate true on the positions of a word which are equivalent to $r$ modulo $d$. By an $\mathrm{FO}_{\tau_d}$-axiomatizable language we mean any language of the form $L_\varphi$, where $\varphi$ is a first-order sentence of signature $\tau_d$.


Note that the signature $\tau_1$ is essentially the same as $\sigma$ because $P_1^0$ is the valid predicate. In contrast, for $d > 1$, $\mathrm{FO}_{\tau_d}$-axiomatizable languages need not be aperiodic.

E.g., the sentence $P_2^1(\top)$ defines the language $L$ consisting of all words of even length, which is known to be non-aperiodic. It is easy to see that the predicates $P_d^1, \dots, P_d^{d-1}$ may be eliminated from the formulas of signature $\tau_d$ (e.g. $P_d^1(x)$ is essentially equivalent to $P_d^0(p(x))$). Nevertheless, for technical reasons we will not remove them from $\tau_d$. We are also interested in the infinite signature $\tau = \bigcup_d \tau_d$. Note that the signatures $\rho_d = \rho \cup \{P_d^0, \dots, P_d^{d-1}\}$ and $\rho_P = \bigcup \{\rho_d \mid d \in P\}$, for each set $P$ of positive integers, were discussed in [5,10,35].
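The interpretation of the modular predicates in a word structure can be sketched in a few lines of Python. This is only an illustration under our own naming (the helpers `P`, `p`, `top` and `satisfies_even_length_sentence` are hypothetical and not part of the paper); positions are numbered from 0 as in the text.

```python
def P(d: int, r: int, i: int) -> bool:
    """P_d^r(i): position i is congruent to r modulo d."""
    return i % d == r


def p(i: int) -> int:
    """Predecessor function on {0, ..., n}, with p(0) = 0."""
    return max(i - 1, 0)


def top(word: str) -> int:
    """Greatest position of a non-empty word (positions start at 0)."""
    if not word:
        raise ValueError("word structures are defined for non-empty words only")
    return len(word) - 1


def satisfies_even_length_sentence(word: str) -> bool:
    """Truth value of the sentence P_2^1(top) in the structure of `word`."""
    return P(2, 1, top(word))
```

For instance, `satisfies_even_length_sentence("abab")` evaluates the predicate at position 3, which is odd, so the word of even length 4 is accepted. The same helpers can be used to check that, on positions above the least one, `P(d, 1, x)` agrees with `P(d, 0, p(x))`.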

Let us formulate a precise characterization of the introduced classes of languages in terms of syntactic monoids and homomorphisms (for details see e.g. Chap. 5 of [35]). For a language $L$, let $M(L)$ be its syntactic monoid and $\eta_L : A^* \to M(L)$ the canonical syntactic homomorphism. We denote the semigroup operation on $M(L)$ by $\cdot$. As is well known, $L$ is regular iff $M(L)$ is finite. A language $L$ is called aperiodic if there is no non-trivial group $(G; \cdot) \subseteq (M(L); \cdot)$ (as usual, $\subseteq$ here means the substructure relation). A language $L$ is called quasi-aperiodic [35] if there is no non-trivial group $(G; \cdot) \subseteq (\eta_L(A^d); \cdot)$ for each $d > 0$, where $\eta_L(A^d)$ is the image of $A^d$ under $\eta_L$. We call a language $L$ $d$-quasi-aperiodic (for any fixed $d > 0$) if there is no non-trivial group $(G; \cdot) \subseteq (\eta_L((A^d)^+); \cdot)$. Note that $L$ is aperiodic iff $L$ is 1-quasi-aperiodic.

Theorem 2.1.

(1) A regular language $L$ is quasi-aperiodic iff it is $\mathrm{FO}_\tau$-axiomatizable.

(2) A regular language $L$ is $d$-quasi-aperiodic iff it is $\mathrm{FO}_{\tau_d}$-axiomatizable.

(3) A regular language $L$ is aperiodic iff it is $\mathrm{FO}_\sigma$-axiomatizable.

For the proof of (1) see Theorem VI.4.1 in [35]. The proof of (2) is implicitly contained in the proof of that Theorem VI.4.1. For a detailed proof of (2) (even for an arbitrary set $P$ of moduli) see [10]. Assertion (3) is a particular case of (2) and is a classical result of Schützenberger, McNaughton and Papert [20,25].

The last theorem implies the following important decidability result (for details see e.g. [10,35]).

Corollary 2.2. The classes of languages from the last theorem are decidable.

Remark. In [10] several additional interesting facts about the first-order axiomatizable languages were established, in particular $\mathrm{FO}_{\tau_a \cup \tau_b} = \mathrm{FO}_{\tau_c}$ where $c$ is the least common multiple of $a$ and $b$, and $\mathrm{FO}_{\tau_a} \cap \mathrm{FO}_{\tau_b} = \mathrm{FO}_{\tau_d}$ where $d$ is the greatest common divisor of $a$ and $b$.

From the interpretation of signature symbols in the word structures $\mathbf{u}$ it follows that the nonempty words correspond bijectively to (the isomorphism types of) the finite models of the theory $\mathrm{CLO}_A^{\tau_d}$ of signature $\tau_d$ (CLO stands for “colored linear orders”) with the following axioms:

– $\le$ is a linear order,

– any element satisfies exactly one of the predicates $Q_a$ ($a \in A$),

– $\forall x(\bot \le x \le \top)$,

– $\forall x(p(x) \le x \wedge \neg\exists y(p(x) < y < x))$,

– $\forall x(x \le s(x) \wedge \neg\exists y(x < y < s(x)))$,

– $\forall x(x > \bot \to p(x) < x)$,

– $\forall x(x < \top \to x < s(x))$,

– $P_d^0(\bot)$,

– any element satisfies exactly one of the predicates $P_d^0, \dots, P_d^{d-1}$,

– $\forall x < \top\,(P_d^r(x) \to P_d^{r+1}(s(x)))$ for $0 \le r < d-1$,

– $\forall x < \top\,(P_d^{d-1}(x) \to P_d^0(s(x)))$.

Sometimes it is technically more convenient to consider “relational” versions of the signature $\tau_d$ (and of the theory $\mathrm{CLO}^{\tau_d}$). E.g., one could take the relational signature $\{\le, Q_a, \dots, P_d^r, \bot, \top, S\}$, where $S(x, y)$ is a binary relation symbol interpreted as “$x$ is an immediate predecessor of $y$”. The signature $\tau_d$ and this relational version are equivalent for most of our purposes; in particular, the quantifier-alternation hierarchies over them (discussed in the next section) coincide. So we may use either of the signatures when appropriate.

Remark. Analogs of many results of this paper hold (with similar proofs) also for some signatures not discussed explicitly in what follows, in particular for the signaturesρ_d mentioned above. We present all details for the signaturesτ_d andτ because they are better related to the leaf language deﬁnability.

For any set $P$ of positive integers, let FO + MOD($P$) denote the class of languages axiomatized by $\sigma$-sentences using modulo counting quantifiers $\exists^{(q,r)}$ with moduli $q$ in $P$, along with the usual first-order quantifiers. It is known (see [37] or Chap. 7 of [35]) that the class FO + MOD = FO + MOD($\{1, 2, \dots\}$) consists exactly of the languages with solvable syntactic monoid. Any $\mathrm{FO}_{\tau_d}$-axiomatizable language is FO + MOD($\{d\}$)-axiomatizable. The language $L \subseteq \{a, b\}^+$ consisting of the words with an even number of occurrences of $a$ is FO + MOD($\{2\}$)-axiomatizable but not quasi-aperiodic. More information on the logical approach may be found in [11,23,26,35,40,41].
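The difference between aperiodicity and quasi-aperiodicity can be illustrated on small automata. The following Python sketch is not taken from the paper: it assumes that the language is given by a complete minimal DFA (so that its transition monoid is the syntactic monoid), represents each letter by a tuple mapping states to states, and uses the observation that a subset of a finite monoid contains a non-trivial group iff it contains all powers of some element generating a cyclic group of order at least 2. All names (`compose`, `images_by_length`, the three example automata) are our own.

```python
def compose(f, g):
    """Apply f first, then g; maps are tuples indexed by state."""
    return tuple(g[q] for q in f)


def images_by_length(delta):
    """Yield the distinct sets eta(A^d) for d = 1, 2, ... until they repeat."""
    letters = set(delta.values())
    current, seen = frozenset(letters), set()
    while current not in seen:
        seen.add(current)
        yield current
        current = frozenset(compose(x, a) for x in current for a in letters)


def contains_nontrivial_group(elements):
    """True iff some x generates a cyclic group of order >= 2 inside `elements`."""
    for x in elements:
        powers, power = [x], compose(x, x)
        while power != x and power not in powers:
            powers.append(power)
            power = compose(power, x)
        # power == x means x lies on its own cycle, i.e. x is a group element
        if power == x and len(powers) >= 2 and all(y in elements for y in powers):
            return True
    return False


def is_quasi_aperiodic(delta):
    return not any(contains_nontrivial_group(s) for s in images_by_length(delta))


def is_aperiodic(delta):
    n = len(next(iter(delta.values())))
    monoid = {tuple(range(n))}.union(*images_by_length(delta))
    return not contains_nontrivial_group(monoid)


# Hypothetical example automata (transition maps only, state 0 is initial).
EVEN_LENGTH = {"a": (1, 0), "b": (1, 0)}        # words of even length
EVEN_A = {"a": (1, 0), "b": (0, 1)}             # even number of letters a
STARTS_WITH_A = {"a": (1, 1, 2), "b": (2, 1, 2)}  # words beginning with a
```

With these inputs, the words of even length come out as quasi-aperiodic but not aperiodic, while the words with an even number of occurrences of a are not quasi-aperiodic, in agreement with the examples mentioned in this section.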

3. Quantifier-alternation hierarchies

In this section we consider hierarchies of regular languages induced by the quantifier-alternation hierarchy of first-order formulas in prenex normal form. For any signature $\theta$ we obtain the corresponding hierarchy called the $\theta$-hierarchy.

For $n > 0$, let $\Sigma_n^\sigma$ be the class of all languages $L_\varphi$, where $\varphi$ ranges through the $\Sigma_n$-sentences of $\sigma$. Let $\Pi_n^\sigma = \text{co-}\Sigma_n^\sigma$ and $\Delta_n^\sigma = \Sigma_n^\sigma \cap \Pi_n^\sigma$. In [40] it was shown that the $\sigma$-hierarchy (more exactly, the sequence $\{\mathrm{BC}(\Sigma_n^\sigma)\}_{n>0}$) coincides with the dot-depth hierarchy, which is a popular object of automata theory. If we take the smaller signature $\rho = \{<, Q_a, \dots\}$, we obtain the $\rho$-hierarchy $\{\Sigma_n^\rho\}$ known as the Straubing-Thérien hierarchy. The mentioned hierarchies of regular languages have natural characterizations in terms of regular expressions [23,40]; we do not recall these characterizations because we work here only with the logical definition.

Levels of the $\tau_d$-hierarchy are denoted $\Sigma_n^{\tau_d}, \Pi_n^{\tau_d}, \Delta_n^{\tau_d}$. Of course, the $\tau_d$-hierarchy exhausts the class of $\tau_d$-axiomatizable languages, and the $\tau_1$-hierarchy coincides with the $\sigma$-hierarchy. To our knowledge, the $\tau_d$-hierarchy for $d > 1$ (as well as the $\tau$-hierarchy discussed below) has not been considered in the literature so far.

One could also consider similar hierarchies of languages for signatures like $\tau_{d_1} \cup \dots \cup \tau_{d_k}$, but in fact we do not obtain new hierarchies in this way. The next result refines a related fact in [10] mentioned in the previous section.

Proposition 3.1. For all $n, c, e > 0$ there holds $\Sigma_n^{\tau_c \cup \tau_e} = \Sigma_n^{\tau_d}$, where $d$ is the least common multiple of $c$ and $e$.

Proof. For the inclusion from left to right, we have to show that the predicates $P_c^0$ and $P_e^0$ are easily (say, by quantifier-free formulas) expressible through the predicates $P_d^r$. Indeed, let $d = ck$. Then $P_c^0(x)$ is equivalent to $P_d^0(x) \vee P_d^{c}(x) \vee \dots \vee P_d^{(k-1)c}(x)$, and similarly for $e$.

For the converse inclusion, let $a$ be the greatest common divisor of $c, e$. Then $d = c'e$, where $c' = c/a$. Since, by the previous paragraph, $P_{c'}^0$ is easily expressible through the predicates of $\tau_c$, it suffices to express $P_d^0$ through $P_{c'}^0$ and $P_e^0$. Since $c'$ and $e$ have no common divisors, it holds $P_d^0(x) \leftrightarrow P_{c'}^0(x) \wedge P_e^0(x)$.
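The two arithmetic facts behind this argument can be checked numerically. The sketch below is only an illustration with hypothetical function names: the first function tests that, for $d = ck$, a position is divisible by $c$ exactly when its residue modulo $d$ is one of $0, c, \dots, (k-1)c$; the second tests the conjunction step and explicitly assumes that the two moduli are coprime.

```python
from math import gcd


def check_divisor_case(c: int, k: int, bound: int = 500) -> bool:
    """For d = c*k: x = 0 (mod c) iff x mod d lies in {0, c, ..., (k-1)c}."""
    d = c * k
    residues = {j * c for j in range(k)}
    return all((x % c == 0) == (x % d in residues) for x in range(bound))


def check_coprime_case(c1: int, e: int, bound: int = 500) -> bool:
    """For coprime c1, e and d = c1*e: x = 0 (mod d) iff x = 0 mod c1 and mod e."""
    if gcd(c1, e) != 1:
        raise ValueError("this check assumes coprime moduli")
    d = c1 * e
    return all((x % d == 0) == (x % c1 == 0 and x % e == 0) for x in range(bound))
```

Both functions only test positions below `bound`, so they are a sanity check rather than a proof.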

By $\Sigma_n^\tau, \Pi_n^\tau, \Delta_n^\tau$ we denote the levels of the $\tau$-hierarchy. From the last proposition we immediately obtain the following relation between the introduced hierarchies.

Corollary 3.2. For any $n > 0$ there hold $\Sigma_n^\tau = \bigcup_d \Sigma_n^{\tau_d}$, $\Pi_n^\tau = \bigcup_d \Pi_n^{\tau_d}$ and $\Delta_n^\tau = \bigcup_d \Delta_n^{\tau_d}$.

Obviously, any $\Sigma$-level of the quantifier-alternation hierarchies is closed under union and intersection and contains $\emptyset$ and $A^+$ as elements, while any $\Delta$-level is closed under union, intersection and complement.

The main result about the quantifier-alternation hierarchies is that they do not collapse. This can be checked by the standard method of Ehrenfeucht-Fraïssé games developed in [32,33,40,41] for the $\rho$- and $\sigma$-hierarchies and the corresponding difference hierarchies. Those proofs are generalized to the hierarchies of this section in a straightforward way. Following the referee's request, we provide some details of the proof.

Theorem 3.3. For any $n > 0$ there holds $\Sigma_n^\rho \not\subseteq \Pi_n^\tau$. In particular, $\Sigma_n^\tau \not\subseteq \Pi_n^\tau$ and $\Sigma_n^{\tau_d} \not\subseteq \Pi_n^{\tau_d}$ for all $n, d > 0$.

We start with introducing some notions and establishing some lemmas. Let $\nu$ be a finite set of variables. Recall that dealing with the Ehrenfeucht-Fraïssé games is more comfortable for signatures without function symbols, like the signature $\rho$ and the relational version of $\tau_d$ from the previous section.

As we already noted, the levels $\Sigma_k$ over $\tau_d$ and over its relational version coincide for all $d, k \ge 1$. In this section we work with formulas and structures of the relational version of $\tau_d$, which is written simply as $\tau_d$ below.

By a $\nu$-structure we mean a word from $A^+$ (interpreted as a structure of signature $\tau_d$) together with an assignment of values to the variables from $\nu$. Note [23,35] that the $\nu$-structures may be considered as nonempty words $(a_0, U_0) \cdots (a_n, U_n)$ over the bigger alphabet $A \times P(\nu)$, where $U_i$ is the set of variables assigned to the position $i$ of the “usual” word $a_0 \cdots a_n$; the sets $U_0, \dots, U_n$ are pairwise disjoint and exhaust $\nu$. Formulas of signature $\tau_d$ with free variables in $\nu$ are interpreted in the $\nu$-structures in the usual way. The $\emptyset$-structures are identified with the nonempty words over $A$. Since the modulus $d$ is fixed until the proof of Theorem 3.3, we do not mention it explicitly in technical notions like $S_{k,r}$ or $\le_{k,r}$ that we are going to introduce (the exact notation would be $S_{k,r}^d$ or $\le_{k,r}^d$, respectively).

Define for any $k \ge 0$ and $r \ge 1$ the set $S_{k,r}$ of formulas of signature $\tau_d$ by induction on $k$ as follows. Let $S_{0,r}$ be the set of quantifier-free formulas in disjunctive normal form without repetitions of the elementary conjunctions and without repetitions of members of the elementary conjunctions. For $k > 0$, let $S_{k,r}$ be the set of finite disjunctions of pairwise distinct formulas $\exists z_1^k \exists z_2^k \dots \exists z_p^k\, \neg\varphi$ where $p \le r$ and $\varphi \in S_{k-1,r}$. Thus, all bound variables of the formulas in $S_{k,r}$ are among the variables $z_1^i, z_2^i, \dots, z_r^i$, $i \in \{1, \dots, k\}$, fixed in advance. The next assertion is obvious.

Lemma 3.4.

(1) Let $\nu$ be a finite set of variables, $k \ge 0$ and $r \ge 1$. Then the set of formulas in $S_{k,r}$ with free variables among $\nu$ is finite.

(2) For any $k \ge 1$, the class of languages axiomatized by the sentences from $\bigcup_r S_{k,r}$ coincides with $\Sigma_k^{\tau_d}$.

For all $\nu$, $k \ge 0$ and $r \ge 1$, define a preorder $\le_{k,r}$ on the $\nu$-structures as follows: $u \le_{k,r} v$ iff $u \models \varphi$ implies $v \models \varphi$, for all formulas $\varphi \in S_{k,r}$ with free variables among $\nu$. As usual, let $\equiv_{k,r}$ denote the associated equivalence relation. By the previous lemma, the relation $\equiv_{k,r}$ is of finite index. By an upper set of a preorder we mean a set of elements closed upwards under the preorder relation.

Lemma 3.5. For all $L \subseteq A^+$ and $k \ge 1$, $L \in \Sigma_k^{\tau_d}$ iff $L$ is an upper set in $(A^+; \le_{k,r})$ for some $r \ge 1$.

Proof. Let $L \in \Sigma_k^{\tau_d}$, then $L = L_\varphi$ for some $\Sigma_k$-sentence $\varphi$ of signature $\tau_d$. By Lemma 3.4, $\varphi$ is equivalent to a sentence from $S_{k,r}$ for some $r$. Then $L$ is an upper set in $(A^+; \le_{k,r})$. Conversely, let $L$ be an upper set in $(A^+; \le_{k,r})$. Since $L$ is a finite union of upper “cones” $\check{u} = \{v \mid u \le_{k,r} v\}$ in $(A^+; \le_{k,r})$, it suffices to show that any such cone $\check{u}$ is in $\Sigma_k^{\tau_d}$. Clearly, $\check{u} = L_\varphi \in \Sigma_k^{\tau_d}$ where $\varphi$ is $\bigwedge \{\psi \in S_{k,r} \mid u \models \psi\}$.

Next we characterize the introduced preorder in terms of a $k$-round Ehrenfeucht-Fraïssé game $G_{k,r}(u, v)$ defined for any given $\nu$-structures $u$ and $v$ as follows.

There are two players denoted 1 and 2. Player 1 wants to show that the structures are distinct, while player 2 wants to show that they are similar. Each player has his/her copies of the $kr$ variables $z_1^j, \dots, z_r^j$, $j = 1, \dots, k$ (realized e.g. as $kr$ pebbles labeled by the variables). For $k = 0$, the players make no moves at all and simply determine the winner by the following rule: player 2 wins iff $u \equiv_{0,r} v$. Now let $k \ge 1$. In the first round, player 1 chooses a number $p \le r$ and puts his/her pebbles $z_1^k, \dots, z_p^k$ at some positions of the $\nu$-structure $u$, thus forming a $\{y_1, \dots, y_m, z_1^k, \dots, z_p^k\}$-structure $u'$, where $\nu = \{y_1, \dots, y_m\}$. Player 2 answers by putting his/her pebbles $z_1^k, \dots, z_p^k$ at some positions of $v$, forming a $\{y_1, \dots, y_m, z_1^k, \dots, z_p^k\}$-structure $v'$. If $k = 1$, the game is over and the winner is determined by the rule above for $u'$ and $v'$. Otherwise, the players continue as in the game $G_{k-1,r}(v', u')$. The notion of a winning strategy for each player is defined in the obvious way. Since the game $G_{k,r}(u, v)$ is finite, one of the players has a winning strategy.

Lemma 3.6. Let $u$ and $v$ be $\nu$-structures and let $k \ge 0$, $r \ge 1$. Then $u \le_{k,r} v$ iff player 2 has a winning strategy in $G_{k,r}(u, v)$.

Proof. Let $u \le_{k,r} v$. We check by induction on $k$ that player 2 has a winning strategy in $G_{k,r}(u, v)$. For $k = 0$ this is clear because $u$ and $v$ satisfy the same quantifier-free formulas. Let $k > 0$. Towards a contradiction, suppose that player 2 does not have a winning strategy, hence player 1 has a winning strategy. Assign the values to $z_1^k, \dots, z_p^k$ ($p \le r$) in $u$ according to the first move of player 1 by this strategy. Then we obtain a structure $u'$ such that for all values of $z_1^k, \dots, z_p^k$ in $v$ (resulting in a structure $v'$) player 1 has a winning strategy in $G_{k-1,r}(v', u')$. By the induction hypothesis, $v' \not\le_{k-1,r} u'$. Let $\varphi$ be the disjunction of all formulas from $S_{k-1,r}$ that are false in $u'$. Then $v' \not\models \neg\varphi$. Since the values of $z_1^k, \dots, z_p^k$ in $v$ were arbitrary,

$v \not\models \exists z_1^k, \dots, z_p^k\, \neg\varphi$ and $u \models \exists z_1^k, \dots, z_p^k\, \neg\varphi$,

which is a contradiction because the formula $\exists z_1^k, \dots, z_p^k\, \neg\varphi$ is equivalent to a formula in $S_{k,r}$.

The opposite implication is also proved by induction on $k$. If $k = 0$ and player 2 has a winning strategy then $u \equiv_{0,r} v$ and we are done. Now let $k > 0$ and fix a winning strategy for player 2 in $G_{k,r}(u, v)$. Towards a contradiction, suppose that $u \not\le_{k,r} v$; then there is a $\psi \in S_{k,r}$ such that $u \models \psi$ and $v \not\models \psi$. W.l.o.g. we can assume that $\psi$ is of the form $\exists z_1^k, \dots, z_r^k\, \neg\varphi$ where $\varphi \in S_{k-1,r}$. Since $u \models \psi$, player 1 can put the variables $z_1^k, \dots, z_r^k$ in $u$ in such a way that the resulting structure $u'$ satisfies $\neg\varphi$. Taking the values of $z_1^k, \dots, z_r^k$ in $v$ according to the winning strategy for player 2 we obtain a structure $v'$ that does not satisfy $\neg\varphi$. But player 2 has a winning strategy in $G_{k-1,r}(v', u')$. By induction, $v' \le_{k-1,r} u'$ and therefore $v' \not\models \varphi$. A contradiction.

The next corollary of the previous lemma states that the concatenation of $\nu$-structures respects (under a reasonable assumption) the introduced preorder. We omit the obvious proof using the game characterization of $\le_{k,r}$.

Lemma 3.7. Let $\nu$ be a finite set of variables, $k \ge 0$, $r \ge 1$, and $u_1, u_2, v_1, v_2, u_1u_2, v_1v_2$ be $\nu$-structures. Then $u_1 \le_{k,r} v_1$ and $u_2 \le_{k,r} v_2$ imply $u_1u_2 \le_{k,r} v_1v_2$.

For all $k \ge 0$ and $r \ge 1$, define $c(k, r)$ by induction on $k$ as follows: $c(0, r) = 1$ and $c(k, r) = r + (r + 1)c(k-1, r) + 1$. Until the end of this section, an upper index on a word means the corresponding power of this word under concatenation.
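The recursion for $c(k, r)$ is easy to evaluate, and the counting property for which it is designed can be tested by brute force. The Python sketch below is our own illustration (the helper names `longest_free_run` and `pigeonhole_holds` are hypothetical): if at most $r$ of the $c(k, r)$ copies of a word carry a pebble, then some block of $c(k-1, r) + 1$ consecutive copies is pebble-free, which is the property used in the proof of Lemma 3.8.

```python
from itertools import combinations


def c(k: int, r: int) -> int:
    """c(0, r) = 1 and c(k, r) = r + (r + 1) * c(k - 1, r) + 1."""
    value = 1
    for _ in range(k):
        value = r + (r + 1) * value + 1
    return value


def longest_free_run(n_copies: int, pebbled: set) -> int:
    """Length of the longest block of consecutive copies without a pebble."""
    best = run = 0
    for i in range(n_copies):
        run = 0 if i in pebbled else run + 1
        best = max(best, run)
    return best


def pigeonhole_holds(k: int, r: int) -> bool:
    """Check every way of pebbling at most r of the c(k, r) copies (k >= 1)."""
    n = c(k, r)
    need = c(k - 1, r) + 1
    return all(
        longest_free_run(n, set(chosen)) >= need
        for p in range(r + 1)
        for chosen in combinations(range(n), p)
    )
```

For example, `c(1, 1)` is 4 and `c(2, 1)` is 10; the exhaustive check is feasible only for small values of `k` and `r`.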

Lemma 3.8. Let $w \in A^+$, $|w| \equiv 0 \pmod d$, and $N_1, N_2 \ge c(k, r)$. Then $w^{N_1} \le_{k,r} w^{N_2}$.

Proof. We can assume that $N_1 \ne N_2$. It suffices to prove the assertion for $|N_1 - N_2| = 1$ because for $|N_1 - N_2| > 1$ it will then follow by induction on $|N_1 - N_2|$. We have to show that $w^{N_1} \le_{k,r} w^{N_1+z}$ where $z = N_2 - N_1 \in \{-1, 1\}$.

The proof is by induction on $k$. For $k = 0$ the inequality $w^{N_1} \le_{k,r} w^{N_1+z}$ is true because $w^{N_1}$ and $w^{N_1+z}$ satisfy the same atomic sentences of signature $\tau_d$ (the sentences are of the form $i = j$, $i < j$, $S(i, j)$, $Q_a(i)$, $P_d^0(i), \dots, P_d^{d-1}(i)$ for $i, j \in \{\bot, \top\}$).

We assume that the inequality holds for $k - 1 \ge 0$ and prove it for $k$ by describing a winning strategy for player 2 in $G_{k,r}(w^{N_1}, w^{N_2})$. In the first round, let player 1 put $p$ of his/her pebbles (for some $p \le r$) in $w^{N_1}$. Since $N_1 \ge c(k, r)$, the resulting structure will have a factor $w^{c(k-1,r)+1}$ without pebbles, i.e. the word $w^{N_1}$ may be factorized as $w_1 w^{c(k-1,r)+1} w_2$ where $w_1$ and $w_2$ are some powers of $w$ containing all the $p$ pebbles $z_1^k, \dots, z_p^k$. Hence, after the first round the left structure looks as $w_1' w^{c(k-1,r)+1} w_2'$. Note that, before the answer of player 2, the right word looks as $w^{N_1+z} = w_1 w^{c(k-1,r)+1+z} w_2$. Player 2 puts his/her pebbles $z_1^k, \dots, z_p^k$ in this word in a way to obtain $w_1' w^{c(k-1,r)+1+z} w_2'$. By the induction hypothesis,

$w^{c(k-1,r)+1+z} \le_{k-1,r} w^{c(k-1,r)+1}$,

hence, by Lemma 3.7,

$w_1' w^{c(k-1,r)+1+z} w_2' \le_{k-1,r} w_1' w^{c(k-1,r)+1} w_2'$.

Player 2 continues by the winning strategy in $G_{k-1,r}(w_1' w^{c(k-1,r)+1+z} w_2',\ w_1' w^{c(k-1,r)+1} w_2')$.

For any fixed $k$ and $r$, define operations $F$ and $G$ on $A^+$ by $F(u) = u^N$ and $G(u, v) = v^M u v^M$, where $N = r + (r + 1)(2c(k, r) + 1)$ and $M = N + c(k, r)$.
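The operations $F$ and $G$ are plain word operations, so they can be written down directly. The following Python sketch represents words as strings and passes $k$ and $r$ as explicit parameters; this parametrization and the helper `exponents` are our own choices for illustration and are not notation of the paper.

```python
def c(k: int, r: int) -> int:
    """c(0, r) = 1 and c(k, r) = r + (r + 1) * c(k - 1, r) + 1."""
    value = 1
    for _ in range(k):
        value = r + (r + 1) * value + 1
    return value


def exponents(k: int, r: int) -> tuple:
    """Return (N, M) with N = r + (r+1)(2c(k,r)+1) and M = N + c(k,r)."""
    n = r + (r + 1) * (2 * c(k, r) + 1)
    return n, n + c(k, r)


def F(u: str, k: int, r: int) -> str:
    """F(u) = u^N."""
    n, _ = exponents(k, r)
    return u * n


def G(u: str, v: str, k: int, r: int) -> str:
    """G(u, v) = v^M u v^M."""
    _, m = exponents(k, r)
    return v * m + u + v * m
```

Since $M \ge N$, the word `G(u, v, k, r)` always begins and ends with `F(v, k, r)`, and both operations preserve divisibility of the word length by $d$ when the arguments have lengths divisible by $d$.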

Lemma 3.9. Let $u, v \in A^+$, $|u| \equiv |v| \equiv 0 \pmod d$, $k \ge 0$, $r \ge 1$, and $u \le_{k,r} v$. Then $F(v) \le_{k+1,r} G(u, v)$.

Proof. We describe a winning strategy for player 2 in $G_{k+1,r}(F(v), G(u, v))$. In the first round, let player 1 put his/her $p$ pebbles ($p \le r$) in the word $w_1 = F(v) = v^N$. By definition of $N$, the resulting structure has a factor $v^{2c(k,r)+1}$ without pebbles, i.e. the word $w_1$ may be factorized as $u_1 v^{2c(k,r)+1} u_2 = u_1 v^{c(k,r)} v v^{c(k,r)} u_2$ where $u_1$ and $u_2$ are some powers of $v$ containing all the $p$ pebbles. The resulting structure then looks as $w_1' = u_1' v^{c(k,r)} v v^{c(k,r)} u_2'$. Note that the right word $w_2 = G(u, v)$ may be factorized as $w_1 v^{M-N} u v^{M-N} w_1 = u_1 v^{m_1} u v^{m_2} u_2$ where $m_1 \ge c(k, r)$ and $m_2 \ge c(k, r)$. Player 2 answers by putting his/her $p$ pebbles in $w_2$ in a way to obtain $w_2' = u_1' v^{m_1} u v^{m_2} u_2'$. By Lemma 3.8, $v^{m_1} \le_{k,r} v^{c(k,r)}$ and $v^{m_2} \le_{k,r} v^{c(k,r)}$. By Lemma 3.7, $w_2' \le_{k,r} w_1'$, hence player 2 has a winning strategy in $G_{k,r}(w_2', w_1')$. Player 2 proceeds by following this winning strategy.

Proof of Theorem 3.3. To simplify notation, we prove here the following slightly weaker alphabet-dependent version: for any $k \ge 1$ there is a language $H_k$ over the alphabet $A_k = \{0, 1, \dots, k\}$ with $H_k \in \Sigma_k^\rho \setminus \Pi_k^\tau$. For the alphabet-independent version, see remarks in the corresponding proof in Section 5 where we establish a strengthening of Theorem 3.3.

We define the set $H_k$ by induction on $k$ as follows: $H_1 = A_1^* \cdot 1 \cdot A_1^*$ and $H_{k+1} = A_{k+1}^* \cdot (k+1) \cdot (A_k^* \setminus H_k) \cdot (k+1) \cdot A_{k+1}^*$ for $k \ge 1$ (cf. Lem. 3.1 in [38,39]). Obviously, $H_k \in \Sigma_k^\rho$ for each $k \ge 1$, so it remains to show that $H_k \notin \Pi_k^\tau$. By Corollary 3.2, it suffices to show that $H_k \notin \Pi_k^{\tau_d}$ for each $d \ge 1$. By Lemma 3.5, it suffices to find for any $k, r \ge 1$ words $u_k, v_k \in A_k^+$ such that $u_k \le_{k,r} v_k$, $u_k \notin H_k$ and $v_k \in H_k$. It is easy to see (by looking at the game $G_{1,r}(u_1, v_1)$) that the words $u_1 = 0^d$ and $v_1 = 0^d 1 0^{d-1} 0^d = 0^d 1 0^{2d-1}$ have the desired properties for $k = 1$ and for all $r \ge 1$.

Fix $r \ge 1$ and suppose by induction on $k$ that we have $u_k$ and $v_k$ with the desired properties. By Lemma 3.7, $x \le_{k,r} y$ where $x = (k+1)^d u_k (k+1)^d$ and $y = (k+1)^d v_k (k+1)^d$. Set $u_{k+1} = F(y)$ and $v_{k+1} = G(x, y)$. By Lemma 3.9, $u_{k+1} \le_{k+1,r} v_{k+1}$. By definition of $H_{k+1}$, $u_{k+1} \notin H_{k+1}$ and $v_{k+1} \in H_{k+1}$.

4. Abstract difference hierarchies

In this section we make a couple of general remarks about the difference hierarchy (DH), also known as the Boolean hierarchy.

Let $S$ be any set. By a base in $S$ we mean any class $\mathcal{C}$ of subsets of $S$ which is closed under union and intersection and contains $\emptyset$ and $S$ as elements. For any $k < \omega$, let $\mathcal{C}(k)$ be the class of sets $\bigcup_i (L_{2i} \setminus L_{2i+1})$, where $L_0 \supseteq L_1 \supseteq \cdots$ is a descending sequence of sets from $\mathcal{C}$ and $L_i = \emptyset$ for $i \ge k$. The sequence $\{\mathcal{C}(k)\}_{k<\omega}$ is known as the difference hierarchy over $\mathcal{C}$. As is well known, $\mathcal{C}(k) \cup \text{co-}\mathcal{C}(k) \subseteq \mathcal{C}(k+1)$ for every $k$, and $\bigcup_k \mathcal{C}(k) = \mathrm{BC}(\mathcal{C})$.
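For a finite base the levels of the difference hierarchy can be enumerated directly from the definition. The Python sketch below is an illustration only: the function names and the example base (the upper sets of the three-element chain 0 < 1 < 2, stored as `CHAIN_BASE`) are our own choices.

```python
from itertools import product


def difference(seq):
    """Union of L_{2i} minus L_{2i+1} for a descending tuple of sets."""
    out = set()
    for i in range(0, len(seq), 2):
        following = seq[i + 1] if i + 1 < len(seq) else frozenset()
        out |= seq[i] - following
    return frozenset(out)


def level(base, k):
    """The class C(k): differences of descending sequences of length k."""
    result = set()
    for seq in product(base, repeat=k):
        if all(seq[i] >= seq[i + 1] for i in range(k - 1)):
            result.add(difference(seq))
    return result


# Hypothetical example: the base of upper sets of the chain 0 < 1 < 2.
UNIVERSE = frozenset({0, 1, 2})
CHAIN_BASE = [frozenset(), frozenset({2}), frozenset({1, 2}), UNIVERSE]
```

On this example, `level(CHAIN_BASE, 0)` contains only the empty set, `level(CHAIN_BASE, 1)` is the base itself, and `level(CHAIN_BASE, 3)` already contains all eight subsets of the universe, which is the Boolean closure of the base.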

We will need the following relation between the DH’s over diﬀerent bases.

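Proposition 4.1. Let $f : T \to S$ be a surjection, $\mathcal{C}$ a base in $S$, $\mathcal{M}$ a base in $T$, and $f(M) \in \mathcal{C}$ for all $M \in \mathcal{M}$. Then $f^{-1}(L) \in \mathcal{M}(k)$ implies $L \in \mathcal{C}(k)$ for all $L \subseteq S$ and $k < \omega$.

Proof. Let $f^{-1}(L) \in \mathcal{M}(k)$, then $f^{-1}(L) = \bigcup_i (M_{2i} \setminus M_{2i+1})$ for some $M_0 \supseteq M_1 \supseteq \cdots$, $M_k = \emptyset$, $M_i \in \mathcal{M}$. Then $f(M_0) \supseteq f(M_1) \supseteq \cdots$, $f(M_k) = \emptyset$, and $f(M_i) \in \mathcal{C}$ for all $i$. It suffices to check that $L = \bigcup_i (f(M_{2i}) \setminus f(M_{2i+1}))$ (then $L \in \mathcal{C}(k)$, as desired).

First we check the inclusion $L \subseteq \bigcup_i (f(M_{2i}) \setminus f(M_{2i+1}))$. Let $y \in L$. Since $f$ is a surjection, $y = f(x)$ for some $x \in T$. Since $x \in f^{-1}(L)$, $x \in M_0$, hence $y \in f(M_0)$. It remains to show that $y \in f(M_{2i+1})$ implies $y \in f(M_{2i+2})$. Let $y = f(x_1)$ for some $x_1 \in M_{2i+1}$. Since $x_1 \in f^{-1}(L)$, $x_1 \in M_{2i+2}$ and therefore $y \in f(M_{2i+2})$.

The inclusion $\overline{L} \subseteq \overline{\bigcup_i (f(M_{2i}) \setminus f(M_{2i+1}))}$ is checked in the same manner.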


The next notion is well-known from descriptive set theory [18].

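Definition 4.2. A base $\mathcal{C}$ is said to have the separation property if for every two disjoint sets $L, K \in \mathcal{C}$ there is a set $M \in \mathcal{C} \cap \text{co-}\mathcal{C}$ with $L \subseteq M \subseteq \overline{K}$. We say that $M$ separates $L$ from $K$ (note that it is equivalent to say that $\overline{M}$ separates $K$ from $L$).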

The following easy fact is often of use.

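Proposition 4.3. Let $\{L_i\}_{i \in I}$ and $\{K_j\}_{j \in J}$ be finite families of subsets of $S$ and let $M_{i,j}$ separate $L_i$ from $K_j$ for all $i \in I$, $j \in J$. Then the set $\bigcup_i \bigcap_j M_{i,j}$ separates $L = \bigcup_i L_i$ from $K = \bigcup_j K_j$.

Proof. We have $L_i \subseteq M_{i,j} \subseteq \overline{K_j}$ for all $i \in I$, $j \in J$. Then $L_i \subseteq \bigcap_j M_{i,j} \subseteq \bigcap_j \overline{K_j}$ for all $i \in I$. Therefore, $L = \bigcup_i L_i \subseteq \bigcup_i \bigcap_j M_{i,j} \subseteq \bigcap_j \overline{K_j} = \overline{K}$.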
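The combination of separators in Proposition 4.3 can be tried out on concrete finite sets. The Python sketch below uses hypothetical names (`separates`, `combine`) and assumes that the separators are given as a non-empty rectangular table `separators[i][j]` of Python sets.

```python
def separates(m: set, l: set, k: set) -> bool:
    """M separates L from K: L is contained in M and M is disjoint from K."""
    return l <= m and not (m & k)


def combine(separators) -> set:
    """Union over i of the intersection over j of separators[i][j]."""
    return set().union(*(set.intersection(*row) for row in separators))
```

For example, with $L_0 = \{0, 1\}$, $L_1 = \{2\}$, $K_0 = \{5, 6\}$, $K_1 = \{7\}$ and any table of pairwise separators, the combined set separates $\{0, 1, 2\}$ from $\{5, 6, 7\}$.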

We will need the following fact about the DH’s over bases with the separation property.

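Proposition 4.4. Let $\mathcal{C}$ be a base with the separation property. Then for any $k < \omega$ the class $\mathcal{C}(k+1) \cap \text{co-}\mathcal{C}(k+1)$ coincides with the class of sets of the form $(U \cap L) \cup (\overline{U} \cap K)$, where $U \in \mathcal{C} \cap \text{co-}\mathcal{C}$, $L \in \mathcal{C}(k)$ and $K \in \text{co-}\mathcal{C}(k)$.

Proof. Let $M = (U \cap L) \cup (\overline{U} \cap K)$ where $U \in \mathcal{C} \cap \text{co-}\mathcal{C}$, $L \in \mathcal{C}(k)$ and $K \in \text{co-}\mathcal{C}(k)$. One easily checks (even without using the separation property) that $M \in \mathcal{C}(k+1) \cap \text{co-}\mathcal{C}(k+1)$.

Simplifying notation, we prove the opposite inclusion only for the typical particular case $k = 2$. Let $M \in \mathcal{C}(k+1) \cap \text{co-}\mathcal{C}(k+1)$. Then

$M = \overline{L_0} \cup (L_1 \setminus L_2)$ and $\overline{M} = \overline{K_0} \cup (K_1 \setminus K_2)$, (1)

for some sets $L_0 \supseteq L_1 \supseteq L_2$ and $K_0 \supseteq K_1 \supseteq K_2$ from $\mathcal{C}$. Taking complements in (1), we obtain

$\overline{M} = (L_0 \setminus L_1) \cup L_2$ and $M = (K_0 \setminus K_1) \cup K_2$, (2)

hence $L_2$ and $K_2$ are disjoint. By the separation property, $L_2 \subseteq U \subseteq \overline{K_2}$ for some $U \in \mathcal{C} \cap \text{co-}\mathcal{C}$. Intersecting the first equality in (1) with $\overline{U}$ we obtain

$M \cap \overline{U} = (\overline{L_0} \cap \overline{U}) \cup (L_1 \cap \overline{U}) = (\overline{L_0} \cup L_1) \cap \overline{U}$.

Intersecting the second equality in (2) with $U$ we obtain $M \cap U = (K_0 \setminus K_1) \cap U$. Therefore, $M = (U \cap L) \cup (\overline{U} \cap K)$ where $L = (K_0 \setminus K_1) \in \mathcal{C}(2)$ and $K = (\overline{L_0} \cup L_1) \in \text{co-}\mathcal{C}(2)$.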

Next we formulate an obvious fact showing that the separation property survives under some operation on bases.

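Proposition 4.5. Let $\{\mathcal{C}_i\}_{i \in I}$ be a family of bases in $S$ such that every $\mathcal{C}_i$ has the separation property and for all $i, j \in I$ there exists a $k \in I$ with $\mathcal{C}_i \cup \mathcal{C}_j \subseteq \mathcal{C}_k$. Then $\mathcal{C} = \bigcup_i \mathcal{C}_i$ is a base in $S$ with the separation property.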


Relate to any partial order $P = (P; \le)$ (in fact, all the following notions and results in this section apply also to preorders) the base $\mathcal{C}$ in $P$ consisting of all upper sets of $P$ (i.e., the sets $L \subseteq P$ such that $x \in L$ and $x \le y$ imply $y \in L$), including the empty set. By an alternating chain of length $k$ for a set $K \subseteq P$ we mean a sequence $(x_0, \dots, x_k)$ of elements of $P$ such that $x_0 \le \cdots \le x_k$ and $x_i \in K \leftrightarrow x_{i+1} \notin K$ for every $i < k$. Such a chain is called a 1-alternating chain if $x_0 \in K$; otherwise it is called a 0-alternating chain. Variants of the following fact frequently appear when dealing with the DH’s.

Proposition 4.6. Let $P = (P; \le)$ be a partial order and $\mathcal{C}$ the base of upper sets in $P$. For all $K \subseteq P$ and $k < \omega$, $K \in \mathcal{C}(k)$ iff $K$ has no 1-alternating chain of length $k$.

Proof. First let $K \in \mathcal{C}(k)$; then $K = \bigcup_i (L_{2i} \setminus L_{2i+1})$, where $L_0 \supseteq L_1 \supseteq \cdots$ is a descending sequence of upper sets and $L_k = \emptyset$. Towards a contradiction, suppose that $(x_0, \dots, x_k)$ is a 1-alternating chain of length $k$ for $K$. Since $x_0 \in K$, we have $x_0 \in L_0$. Since $L_0$ is an upper set and $x_0 \le x_1$, $x_1 \in L_0$. Since $x_1 \notin K$, $x_1 \in L_1$. Continuing in this manner, we obtain $x_k \in L_k = \emptyset$, a contradiction.

In the opposite direction, let $K$ have no 1-alternating chain of length $k$. For any $i < \omega$, let $L_i$ be the set of all $x \in P$ such that there is a 1-alternating chain $(x_0, \dots, x_i)$ for $K$ with $x_i \le x$. Then $L_0 \supseteq L_1 \supseteq \cdots$ is a descending sequence of upper sets and $L_k = \emptyset$. It remains to check that $K = \bigcup_i (L_{2i} \setminus L_{2i+1})$. First we check the inclusion from left to right. Let $x \in K$. Then $x \in L_0$. It suffices to show that $x \in L_{2i+1}$ implies $x \in L_{2i+2}$. Let $x \in L_{2i+1}$, then there is a 1-alternating chain $(x_0, \dots, x_{2i+1})$ for $K$ with $x_{2i+1} \le x$. Then $(x_0, \dots, x_{2i+1}, x)$ is a 1-alternating chain for $K$, hence $x \in L_{2i+2}$.

It remains to show that if $x \in P \setminus K$ then $x \notin \bigcup_i (L_{2i} \setminus L_{2i+1})$. If $x \notin L_0$, we are done. Now let $x \in L_0$. It suffices to show that $x \in L_{2i}$ implies $x \in L_{2i+1}$, and this is done exactly as in the preceding paragraph.
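The construction of the sets $L_i$ in this proof is effective on finite partial orders, which makes it easy to experiment with. The Python sketch below is our own illustration with hypothetical names: `chain_ends` computes the last elements of 1-alternating chains of a given length, `level_set` is the upward closure used in the proof, and `check_all_subsets` verifies on every subset $K$ that, as soon as no 1-alternating chain of length $k$ exists, $K$ is the difference of the descending sequence $L_0, \dots, L_{k-1}$.

```python
from itertools import combinations


def chain_ends(elems, leq, K, i):
    """Elements x_i that end some 1-alternating chain (x_0, ..., x_i) for K."""
    ends = {x for x in elems if x in K}  # chains of length 0 start inside K
    for _ in range(i):
        ends = {y for y in elems
                if any(leq(x, y) and ((x in K) != (y in K)) for x in ends)}
    return ends


def level_set(elems, leq, K, i):
    """The set L_i from the proof: the upward closure of the chain ends."""
    ends = chain_ends(elems, leq, K, i)
    return {x for x in elems if any(leq(e, x) for e in ends)}


def difference_of(levels):
    """Union of L_{2i} minus L_{2i+1} for a descending list of sets."""
    out = set()
    for i in range(0, len(levels), 2):
        following = levels[i + 1] if i + 1 < len(levels) else set()
        out |= levels[i] - following
    return out


def check_all_subsets(elems, leq):
    """Test the construction of the proof on every subset K of `elems`."""
    for size in range(len(elems) + 1):
        for combo in combinations(elems, size):
            K = set(combo)
            for k in range(len(elems) + 1):
                if chain_ends(elems, leq, K, k):
                    continue  # a 1-alternating chain of length k exists
                levels = [level_set(elems, leq, K, i) for i in range(k)]
                descending = all(levels[i] >= levels[i + 1] for i in range(k - 1))
                if not descending or difference_of(levels) != K:
                    return False
    return True
```

A convenient test case is the set of divisors of 12 ordered by divisibility, e.g. `check_all_subsets([1, 2, 3, 4, 6, 12], lambda a, b: b % a == 0)`.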

We conclude this section with a result about the DH’s in well partial orders (wpo) that is closely related to some facts in automata theory, e.g. to results like Theorem 3.3 in [34]. Recall that a partial order $(P; \le)$ is a wpo if it has neither infinite descending chains nor infinite antichains. By an $\omega$-alternating chain for a set $K \subseteq P$ we mean an $\omega$-sequence $x_0, x_1, \dots$ of elements of $P$ such that $x_0 \le x_1 \le \cdots$ and $x_i \in K \leftrightarrow x_{i+1} \notin K$ for every $i < \omega$.

Proposition 4.7. Let $P = (P; \le)$ be a wpo and $\mathcal{C}$ the base of upper sets in $P$. For all $K \subseteq P$, $K \in \mathrm{BC}(\mathcal{C})$ iff $K$ does not have $\omega$-alternating chains.

Proof. From left to right, the assertion follows from the last proposition and the equality $\mathrm{BC}(\mathcal{C}) = \bigcup_k \mathcal{C}(k)$. It remains to show that for any $K \subseteq P$ with $K \notin \mathrm{BC}(\mathcal{C})$ there is an $\omega$-alternating chain. By the last proposition, there are alternating chains for $K$ of arbitrary finite length.

Let $\omega^*$ be the set of all finite sequences of natural numbers, including the empty sequence $\varepsilon$. Let $P \cup \{\bot\}$ be the partial order obtained by adjoining a new smallest element $\bot$ to $P$. We construct a partial function $u : \omega^* \to P \cup \{\bot\}$ as follows. Set $u(\varepsilon) = \bot$ and suppose, by induction on $|\sigma|$, $\sigma \in \omega^*$, that $u(\sigma)$ is already defined. If $|\sigma|$ is even then find $m \in \omega$ and the elements $v_0, \dots, v_{m-1} \in P$ (if any) which enumerate without repetitions the $\le$-minimal elements in $X = \{v \in K \mid u(\sigma) \le v\}$ (since $P$ is a wpo, the set of these minimal elements is finite). Then we set $u(\sigma i) = v_i$ for $i < m$, and $u(\sigma i)$ is undefined for $i \ge m$. For $|\sigma|$ odd, the definition is similar but this time we use the set $X = \{v \in P \setminus K \mid u(\sigma) \le v\}$.

From the construction it follows that $\{\sigma \in \omega^* \mid u(\sigma) \text{ is defined}\}$ is an infinite finitely branching tree (under the relation of being an initial segment). By König's lemma, there is an infinite path through this tree. The image of this path under $u$ provides a desired infinite alternating chain for $K$.

5. Difference hierarchies over $\Sigma_n^{\tau_d}$ and $\Sigma_n^{\tau}$

Here we establish the non-collapse property of the DH’s over the bases $\Sigma_k^{\tau_d}$ and $\Sigma_k^\tau$, for each $k \ge 1$. In [32,33] it was shown that the DH’s over the bases $\Sigma_k^\rho$ and $\Sigma_k^\sigma$, $k \ge 1$, do not collapse. The methods of those papers (using the Ehrenfeucht-Fraïssé games) apply in a straightforward way to obtain the following.

Theorem 5.1. For all $n, k > 0$ it holds $\Sigma_k^\rho(n) \not\subseteq \text{co-}\Sigma_k^\tau(n)$. In particular, $\Sigma_k^{\tau_d}(n) \not\subseteq \text{co-}\Sigma_k^{\tau_d}(n)$ and $\Sigma_k^\tau(n) \not\subseteq \text{co-}\Sigma_k^\tau(n)$ for all $k, n, d > 0$.

Before proving the theorem, we first note that from Corollary 3.2 it follows that $\Sigma_k^\tau(n) = \bigcup_d \Sigma_k^{\tau_d}(n)$, for all $k, n \ge 1$. Next we establish some additional facts on the preorders $\le_{k,r}$ from Section 3. Recall that those preorders actually also depend on a modulus $d$ fixed in advance.

Lemma 5.2. For all $k, n \ge 1$ and $L \subseteq A^+$, $L \in \Sigma_k^{\tau_d}(n)$ iff there is an $r \ge 1$ such that $L$ does not have 1-alternating chains of length $n$ in $(A^+; \le_{k,r})$.

Proof. Let $L \in \Sigma_k^{\tau_d}(n)$, so $L = \bigcup_i (L_{2i} \setminus L_{2i+1})$ for some sets $L_0 \supseteq L_1 \supseteq \cdots$, $L_n = \emptyset$, from $\Sigma_k^{\tau_d}$. By Lemma 3.5, there is an $r \ge 1$ such that all $L_0, L_1, \dots$ are upper sets in $(A^+; \le_{k,r})$. By Proposition 4.6, $L$ does not have 1-alternating chains of length $n$ in $(A^+; \le_{k,r})$. The opposite implication is checked in a similar fashion.

Using the operations $F, G$ from Section 3, let us define an operation $(u_0, \dots, u_n) \mapsto (\tilde{u}_0, \dots, \tilde{u}_n)$ on tuples of words in $A^+$ as follows:

$\tilde{u}_0 = F^n(u_n^0),\ \tilde{u}_1 = F^{n-1}(u_{n-1}^1),\ \dots,\ \tilde{u}_n = u_0^n$

where $u_i^0 = u_i$ for all $i \le n$, $u_i^1 = G(u_i^0, u_n^0)$ for all $i \le n-1$, $u_i^2 = G(u_i^1, u_{n-1}^1)$ for all $i \le n-2$ and so on. Note that in this definition the upper indices on words are just indices, not the powers of words, while the upper indices on $F$ mean composition.

Lemma 5.3. Let $u_0 \le_{k,r} \cdots \le_{k,r} u_n$ and $|u_i| \equiv 0 \pmod d$ for all $i \le n$. Then $\tilde{u}_0 \le_{k+1,r} \cdots \le_{k+1,r} \tilde{u}_n$.