CD-systems of stateless deterministic R(1)-automata governed by an external pushdown store^∗,^∗∗

Benedek Nagy¹ and Friedrich Otto²

Abstract. We study cooperating distributed systems (CD-systems) of stateless deterministic restarting automata with window size 1 that are equipped with an external pushdown store. In this way we obtain an automata-theoretical characterization for the class of word languages that are linearizations of context-free trace languages.

Mathematics Subject Classification. 68Q45.

Keywords and phrases. Restarting automaton, cooperating distributed system, external pushdown, context-free trace language.

∗ The results of this paper have been announced at SOFSEM 2011 in Nový Smokovec, Slovakia, January 2011. An extended abstract appeared in the proceedings of that conference [23].

∗∗ This work was supported by grants from the Balassi Intézet Magyar Ösztöndíj Bizottsága (MÖB) and the Deutscher Akademischer Austauschdienst (DAAD). The first author was also supported by the TÁMOP 4.2.1/B-09/1/KONV-2010-0007 project, which is implemented through the New Hungary Development Plan, co-financed by the European Social Fund and the European Regional Development Fund.

¹ Department of Computer Science, Faculty of Informatics, University of Debrecen, 4032 Debrecen, Egyetem tér 1., Hungary. nbenedek@inf.unideb.hu

² Fachbereich Elektrotechnik/Informatik, Universität Kassel, 34109 Kassel, Germany. otto@theory.informatik.uni-kassel.de

Article published by EDP Sciences © EDP Sciences 2011

1. Introduction

Starting with Mazurkiewicz’s seminal paper [18], the theory of traces has become an important part of the theory of concurrent systems. A trace is an equivalence class of words over a given alphabet with respect to a partial commutativity relation [6]. Informally speaking, if the letters of an alphabet Σ are interpreted as atomic actions, then a word w over Σ stands for a finite sequence of such actions. If some of these atomic actions, say a and b, are independent of each other, then it does not matter in which order they are executed, that is, the sequences of actions ab and ba yield the same result. If D is a reflexive and symmetric dependency relation on Σ, and I = (Σ × Σ) ∖ D is the corresponding independence relation, then the equivalence relation ≡_D is induced by the pairs {(xaby, xbay) | (a, b) ∈ I, x, y ∈ Σ^∗}. By collecting all words (sequences) that are equivalent to a given word w into a class [w]_D = {z ∈ Σ^∗ | z ≡_D w}, one abstracts from the order between independent actions. These equivalence classes are called traces, and the set {[w]_D | w ∈ Σ^∗} of all traces is the trace monoid M(D). A set of traces S ⊆ M(D) is called a trace language, and the set of all words w such that [w]_D belongs to S is called the linearization of this trace language. A trace language S ⊆ M(D) is called recognizable (context-free) if there exists a regular (context-free) language R ⊆ Σ^∗ such that S = {[w]_D | w ∈ R}, and it is called rational if it is empty or if it can be obtained from singleton sets by a finite number of unions, products, and star operations. However, in contrast to the situation for words (that is, free monoids), the recognizable trace languages are a proper subclass of the rational trace languages (unless I = ∅). For a detailed presentation of trace theory and a long list of references, see the monograph by Diekert and Rozenberg [9], which serves as our main reference on this topic.
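The definition of a trace can be made concrete with a small sketch. The following Python fragment is only an illustration, not part of the paper: the names `trace_class` and `linearization` are hypothetical, the independence relation is passed as a set of letter pairs, and the search simply closes a word under swaps of adjacent independent letters. It only applies to finite sets of words.

```python
from collections import deque

def trace_class(word, independent):
    """Return [word]_D as a set of words, where `independent` lists pairs of I."""
    # The independence relation is symmetric, so close the given pairs.
    indep = set(independent) | {(b, a) for (a, b) in independent}
    seen = {word}
    queue = deque([word])
    while queue:
        w = queue.popleft()
        for k in range(len(w) - 1):
            if (w[k], w[k + 1]) in indep:
                # Rewrite x·ab·y into x·ba·y for an independent pair (a, b).
                z = w[:k] + w[k + 1] + w[k] + w[k + 2:]
                if z not in seen:
                    seen.add(z)
                    queue.append(z)
    return seen

def linearization(words, independent):
    """Union of the traces of all words in a finite set R."""
    result = set()
    for w in words:
        result |= trace_class(w, independent)
    return result

print(sorted(trace_class("abc", {("a", "b")})))  # ['abc', 'bac']
```

With a and b independent and c dependent on both, the word abc is only equivalent to bac, since c can never be moved across a or b.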

In [26] Zielonka introduced asynchronous automata for accepting recognizable trace languages. Actually he presented a construction that, starting from a regular I-closed word language that is either given through a homomorphism into a finite monoid, or by an I-diamond automaton, yields a deterministic asynchronous automaton for the corresponding trace language. Interestingly, these automata process traces, not linearizations thereof, in this way reflecting the concurrent aspect of traces. In [8] finite automata are studied that work on multisets, that is, they accept recognizable sets of traces over a commutative alphabet (that is, I = (Σ × Σ) ∖ {(a, a) | a ∈ Σ}), and this approach has been extended to multiset pushdown automata in [15]. However, so far no automata-theoretical characterization has been obtained for rational or context-free trace languages.

Here we take a step in this direction by studying systems of automata that accept linearizations of context-free trace languages. In [21] it is shown that linearizations of rational trace languages are accepted by cooperating distributed systems (CD-systems) of a particular type of stateless deterministic restarting automata. Restarting automata were invented by Jančar et al. to model the linguistic technique of analysis by reduction [13] (see also [25]). These automata can be interpreted as generalizations of finite-state acceptors that work in cycles. They have a finite-state control and a read/write window of a fixed size k ≥ 1 that works on a flexible tape. In each cycle they execute a single rewrite operation that strictly shortens the actual tape. CD-systems of restarting automata have been defined in [19], and in [20] various types of deterministic CD-systems of restarting automata have been studied. In such a system the component automata cooperate in processing a given input word: at each moment exactly one component automaton is active. It executes one or more cycles, depending on the chosen mode of operation, and then another component automaton becomes active. As expected, CD-systems are much more expressive than their component automata themselves. On the other hand, stateless restarting automata, that is, restarting automata with only a single state, have been introduced and studied in [16,17]. In the monotone case and in the deterministic case, they are just as expressive as the corresponding restarting automata with states, provided that auxiliary symbols are available. Without the latter, however, stateless restarting automata are in general much less expressive than their corresponding counterparts with states.

In [21] CD-systems of stateless deterministic restarting automata that have a read/write window of size 1 only are considered. Working in mode = 1, these systems accept a class of semi-linear languages that properly contains the linearizations of all rational trace languages. In fact, even a characterization of the linearizations of rational trace languages in terms of a particular type of these CD-systems was obtained.

Here we extend these CD-systems by an external pushdown store that is used to determine the successor of the current automaton, in this way obtaining the so-called pushdown CD-systems of stateless deterministic R(1)-automata, abbreviated as PD-CD-R(1)-systems. When the active automaton of such a system performs a cycle, then its successor automaton is chosen based on both the symbol deleted in this cycle and the topmost symbol on the pushdown store. In this process also the pushdown content is modified by either erasing the topmost symbol, or by replacing it by a word of length at most 2. Essentially such a system can be interpreted as a traditional pushdown automaton, in which the operation of reading an input symbol has been replaced by a stateless deterministic R(1)-automaton. Hence, it is not necessarily the first symbol that is read, but some symbol that can be reached by this automaton by moving across a prefix of the current input word. In this way our CD-systems can be interpreted as pushdown automata with translucent letters.

Analogously, the CD-systems of stateless deterministic restarting automata with window size 1 studied in [21] can be interpreted as finite-state acceptors with translucent letters (see [24]). Also other variants of pushdown automata that do not simply read their input sequentially from left to right have been studied before. For example, in [5] pushdown automata are considered that can reverse their input.

We show that the class L(PD-CD-R(1)) of languages that are accepted by PD-CD-R(1)-systems is a proper subclass of the class of languages with semi-linear Parikh image, and that it includes the linearizations of all context-free trace languages. Actually, from a context-free word language R ⊆ Σ^∗ that is given through a context-free grammar and a dependency relation D on Σ, we construct a PD-CD-R(1)-system for the linearization of the context-free trace language S = {[w]_D | w ∈ R}. On the other hand, from a given PD-CD-R(1)-system M, we can extract a pushdown automaton A such that the language L(A) is a sublanguage of L(M) that is letter-equivalent to L(M). Finally, we present a characterization of the class of linearizations of context-free trace languages in terms of a particular type of PD-CD-R(1)-systems. In fact, from a PD-CD-R(1)-system M of this type, one can construct a pushdown automaton B such that the language L(M) accepted by M is the linearization of the context-free trace language S = {[w]_D | w ∈ L(B)}.

This paper is structured as follows. In Section 2 we restate in short the definition of the CD-systems of stateless deterministic R(1)-automata and their main properties from [21], and in Section 3, we define the PD-CD-R(1)-systems. We also consider the special case of these CD-systems where the pushdown is a counter (the so-called OC-CD-R(1)-systems), that is, there is only a single pushdown symbol in addition to the bottom marker. We illustrate these definitions by some examples and compare the resulting language classes to each other and to the class CFL of context-free languages, the class OCL of one-counter languages, and the class L₌₁(stl-det-local-CD-R(1)) of languages that are accepted by CD-systems of stateless deterministic R(1)-automata. Then in Section 4, we study the classes of linearizations of one-counter and context-free trace languages. We will see that our PD-CD-R(1)-systems accept a proper superclass of the linearizations of context-free trace languages, and the OC-CD-R(1)-systems accept a proper superclass of the linearizations of one-counter trace languages, and we present characterizations of these classes of linearizations of trace languages in terms of our CD-systems. Finally, in Section 5 we state some preliminary closure and non-closure results and several open problems. The paper closes with some concluding remarks in Section 6.

Notation. For a finite alphabet Σ, we use Σ⁺ to denote the set of all non-empty words over Σ, while Σ^∗ denotes the set of all words over Σ including the empty word ε. For a word w ∈ Σ^∗ and a letter a ∈ Σ, |w| denotes the length of w, and |w|_a denotes the a-length of w, that is, the number of occurrences of the letter a in w. Further, w^R denotes the reversal (or mirror image) of w.

If Σ = {a_1, . . . , a_n}, then the corresponding Parikh mapping is the morphism ψ : Σ^∗ → Nⁿ from the set of words over Σ into the set of vectors of dimension n over N that is defined by mapping a_i to the vector (0, . . . , 0, 1, 0, . . . , 0), in which the entry 1 is preceded by i−1 zeros and followed by n−i zeros, for all 1 ≤ i ≤ n. Two languages L_1, L_2 ⊆ Σ^∗ are called letter-equivalent if ψ(L_1) = ψ(L_2) holds. A language L ⊆ Σ^∗ is called semi-linear if its Parikh image ψ(L) is a semi-linear subset of Nⁿ, that is, if ψ(L) is the union of finitely many linear subsets of Nⁿ (see e.g., [11]).
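As a small illustration of the Parikh mapping and of letter-equivalence, consider the following Python sketch. The function names `parikh` and `letter_equivalent` are hypothetical, the alphabet is passed as an ordered string, and the comparison only makes sense for finite sets of words; it is meant as an aid to reading the definition, not as part of the paper.

```python
from collections import Counter

def parikh(word, alphabet):
    """Parikh vector of `word` with respect to the ordered alphabet a_1 ... a_n."""
    counts = Counter(word)
    return tuple(counts[a] for a in alphabet)

def letter_equivalent(lang1, lang2, alphabet):
    """Check psi(lang1) == psi(lang2) for two FINITE sets of words."""
    image1 = {parikh(w, alphabet) for w in lang1}
    image2 = {parikh(w, alphabet) for w in lang2}
    return image1 == image2

print(parikh("abcab", "abc"))                           # (2, 2, 1)
print(letter_equivalent({"ab", "ba"}, {"ab"}, "ab"))    # True
```

The second call shows that a language and a proper sublanguage of it can be letter-equivalent, which is the situation described in Proposition 2.2 for infinite languages.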

We use REG, LIN, DCFL, and CFL to denote the classes of regular, linear, deterministic context-free and context-free languages. The monographs [11,12] are our main references on formal language and automata theory.

2. Stateless deterministic R(1)-automata

Stateless types of restarting automata were introduced in [16]. Here we are only interested in the most restricted form of them, the stateless deterministic R-automaton of window size 1. A stateless deterministic R(1)-automaton is a one-tape machine that is described by a 5-tuple M = (Σ, ¢, $, 1, δ), where Σ is a finite alphabet, the symbols ¢, $ ∉ Σ serve as markers for the left and right border of the work space, respectively, the size of the read/write window is 1, and δ : Σ ∪ {¢, $} → {MVR, Accept, ε} is the (partial) transition function. There are three types of transition steps: move-right steps (MVR), which shift the window one step to the right, combined rewrite/restart steps (denoted by ε), which delete the content a of the window, thereby shortening the tape, and place the window over the left end of the tape, and accept steps (Accept), which cause the automaton to halt and accept. In addition, we use the notation δ(a) = ∅ to express the fact that the function δ is undefined for the symbol a. Some restrictions apply in that the sentinels ¢ and $ must not be deleted, and that the window must not move right on seeing the $-symbol.

A configuration of M is described by a pair (α, β), where either α = ε (the empty word) and β ∈ {¢} · Σ^∗ · {$} or α ∈ {¢} · Σ^∗ and β ∈ Σ^∗ · {$}; here αβ is the current content of the tape, and it is understood that the window contains the first symbol of β. A restarting configuration is of the form (ε, ¢w$), where w ∈ Σ^∗; to simplify the notation a restarting configuration (ε, ¢w$) is usually simply written as ¢w$. By ⊢_M we denote the single-step computation relation of M, and ⊢^∗_M denotes the reflexive transitive closure of ⊢_M.

The automaton M proceeds as follows. Starting from an initial configuration ¢w$, the window moves right until a configuration of the form (¢x, ay$) is reached such that δ(a) = ε. Now the latter configuration is transformed into the restarting configuration ¢xy$. This computation, which is called a cycle, is expressed as w ⊢^c_M xy. A computation of M now consists of a finite sequence of cycles that is followed by a tail computation, which consists of a sequence of move-right operations possibly followed by an accept step. An input word w ∈ Σ^∗ is accepted by M, if the computation of M which starts with the initial configuration ¢w$ finishes by executing an accept step. By L(M) we denote the language consisting of all words accepted by M.
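The behaviour just described can be sketched in a few lines of Python. This is only an illustrative simulator under some assumptions of ours: the transition function is a dictionary, a missing key models an undefined transition, the border markers are represented by the characters ¢ and $, and the names `run_cycle` and `accepts` are hypothetical.

```python
MVR, ACCEPT, DELETE = "MVR", "Accept", "eps"
LEFT, RIGHT = "¢", "$"   # border markers; assumed not to occur in the input

def run_cycle(delta, word):
    """Scan the tape ¢ word $ from the left end.

    Returns ("cycle", shorter_word) after a delete/restart step,
    ("accept", word) after an accept step, and ("stuck", word) otherwise.
    """
    tape = LEFT + word + RIGHT
    for pos, symbol in enumerate(tape):
        action = delta.get(symbol)           # None models an undefined transition
        if action == MVR and symbol != RIGHT:
            continue                         # move-right step
        if action == ACCEPT:
            return ("accept", word)
        if action == DELETE and symbol not in (LEFT, RIGHT):
            k = pos - 1                      # position of the symbol inside `word`
            return ("cycle", word[:k] + word[k + 1:])
        return ("stuck", word)               # undefined or forbidden step

def accepts(delta, word):
    """Run cycles until the automaton accepts or gets stuck."""
    while True:
        outcome, word = run_cycle(delta, word)
        if outcome != "cycle":
            return outcome == "accept"

# Hypothetical automaton: moves across b, deletes a, accepts on c, stuck on d.
delta = {LEFT: MVR, "b": MVR, "a": DELETE, "c": ACCEPT}
print(accepts(delta, "abcd"))   # True
print(accepts(delta, "abd"))    # False
```

The loop in `accepts` always terminates, because every cycle shortens the word by one letter.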

If M = (Σ, ¢, $, 1, δ) is a stateless deterministic R(1)-automaton, then we can partition its alphabet Σ into four disjoint subalphabets:

Σ_M = {a ∈ Σ | δ(a) = MVR}, Σ_A = {a ∈ Σ | δ(a) = Accept}, Σ_ε = {a ∈ Σ | δ(a) = ε}, Σ_∅ = {a ∈ Σ | δ(a) = ∅}.

Thus, Σ_M is the set of letters that M just moves across, Σ_ε is the set of letters that M deletes, Σ_A is the set of letters which cause M to accept, and Σ_∅ is the set of letters on which M will get stuck. It has been shown in [21] that the language L(M) can be characterized as follows:

L(M) = ∅, if δ(¢) = ∅,
L(M) = Σ^∗, if δ(¢) = Accept,
L(M) = (Σ_M ∪ Σ_ε)^∗ · Σ_A · Σ^∗, if δ(¢) = MVR and δ($) ≠ Accept,
L(M) = (Σ_M ∪ Σ_ε)^∗ · ((Σ_A · Σ^∗) ∪ {ε}), if δ(¢) = MVR and δ($) = Accept.

Let M = (Σ, ¢, $, 1, δ) be a stateless deterministic R(1)-automaton. If δ(¢) is undefined, then L(M) = ∅. Define a stateless deterministic R(1)-automaton M₋ = (Σ, ¢, $, 1, δ₋) by taking δ₋(¢) = δ₋(a) = MVR for all a ∈ Σ and δ₋($) = ∅. Then M₋ scans its tape contents completely and halts (and rejects) on the right delimiter $, that is, L(M₋) = ∅. Similarly, if δ(¢) = Accept, then L(M) = Σ^∗. Define a stateless deterministic R(1)-automaton M₊ = (Σ, ¢, $, 1, δ₊) by taking δ₊(¢) = δ₊(a) = MVR for all a ∈ Σ and δ₊($) = Accept. Then M₊ scans its tape contents completely and halts (and accepts) on the right delimiter $, that is, L(M₊) = Σ^∗. Thus, the automaton M is equivalent to M₋ (in the first case) or to M₊ (in the second case). Accordingly, we assume in the following that for all stateless deterministic R(1)-automata M = (Σ, ¢, $, 1, δ) considered, δ(¢) = MVR holds.

Cooperating distributed systems of restarting automata were introduced and studied in [19]. Here we only consider cooperating distributed systems of stateless deterministic R(1)-automata (or stl-det-local-CD-R(1)-systems for short). Such a system consists of a finite collection M = ((M_i, σ_i)_{i∈I}, I₀) of stateless deterministic R(1)-automata M_i = (Σ, ¢, $, 1, δ_i) (i ∈ I), successor relations σ_i ⊆ I (i ∈ I), and a subset I₀ ⊆ I of initial indices. Here it is required that I₀ ≠ ∅, and that σ_i ≠ ∅ for all i ∈ I. These systems are called locally deterministic in accordance with the notation coined in [20], since their computations are not deterministic as we will see below, although all their component automata M_i (i ∈ I) are deterministic.

Actually, in [21] it is required additionally that i ∉ σ_i for all i ∈ I, but as we are only interested in mode = 1 computations (see below), this requirement is irrelevant. In fact, by simply adding a copy for each component automaton M_i (i ∈ I), we could easily enforce it, but this would just make the systems larger and harder to describe.

The cooperating distributed systems of restarting automata can be seen as an adaptation of the notion of a CD-grammar system with external control (see, e.g., [7]) to restarting automata. Accordingly various modes of operation have been defined and studied for them [19], but here we concentrate on mode = 1 computations only.

A computation of M in mode = 1 on an input word w proceeds as follows. First an index i₀ ∈ I₀ is chosen nondeterministically. Then the R(1)-automaton M_{i₀} starts the computation with the initial configuration ¢w$, and executes a single cycle. Thereafter an index i₁ ∈ σ_{i₀} is chosen nondeterministically, and M_{i₁} continues the computation by executing a single cycle. This continues until, for some l ≥ 0, the automaton M_{i_l} accepts. Should at some stage the chosen automaton M_{i_l} be unable to execute a cycle or to accept, then the computation fails. By L₌₁(M) we denote the language that the system M accepts in mode = 1. It consists of all words w ∈ Σ^∗ that are accepted by M in mode = 1 as described above. By L₌₁(stl-det-local-CD-R(1)) we denote the class of languages that are accepted by mode = 1 computations of stl-det-local-CD-R(1)-systems.

Example 2.1. Let M = ((M_i, σ_i)_{i∈I}, I₀), where I = {a, b, c, +}, I₀ = {a}, σ_a = {b}, σ_b = {c}, σ_c = {a, +}, σ₊ = {a}, and M_a, M_b, M_c, and M₊ are the stateless deterministic R(1)-automata that are given by the following transition functions:

M_a: δ_a(¢) = MVR, δ_a(b) = MVR, δ_a(c) = MVR, δ_a(a) = ε,
M_b: δ_b(¢) = MVR, δ_b(a) = MVR, δ_b(c) = MVR, δ_b(b) = ε,
M_c: δ_c(¢) = MVR, δ_c(a) = MVR, δ_c(b) = MVR, δ_c(c) = ε,
M₊: δ₊(¢) = MVR, δ₊($) = Accept,

and δ_a($), δ_b($), δ_c($) and δ₊(a), δ₊(b), δ₊(c) are undefined.

The automaton M₊ accepts the empty word and rejects (that is, it gets stuck on) all other inputs. The automaton M_a simply deletes the first occurrence of the letter a from its tape, M_b simply deletes the first occurrence of the letter b, and M_c simply deletes the first occurrence of the letter c. Accordingly L₌₁(M) is the non-context-free language L_abc = {w ∈ {a, b, c}^∗ | |w|_a = |w|_b = |w|_c ≥ 1}.
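The system of Example 2.1 is small enough to be run by hand or by a short program. The Python sketch below is an illustration under our own encoding choices: each component is a dictionary from symbols to actions, the move-right step on the left border marker is left implicit (every component performs it), and the nondeterministic choice of successors is resolved by a search over all pairs (index, remaining word). The names `run_cycle` and `cd_accepts` are hypothetical.

```python
MVR, ACCEPT, DELETE = "MVR", "Accept", "eps"

def run_cycle(delta, word):
    """One cycle (or tail computation) of a component on the tape ¢ word $."""
    for k, symbol in enumerate(word):
        action = delta.get(symbol)           # None models an undefined transition
        if action == DELETE:
            return ("cycle", word[:k] + word[k + 1:])
        if action == ACCEPT:
            return ("accept", word)
        if action != MVR:
            return ("stuck", word)
    # The window has reached the right border marker $.
    return ("accept", word) if delta.get("$") == ACCEPT else ("stuck", word)

# The four component automata and successor sets of Example 2.1.
COMPONENTS = {
    "a": {"b": MVR, "c": MVR, "a": DELETE},
    "b": {"a": MVR, "c": MVR, "b": DELETE},
    "c": {"a": MVR, "b": MVR, "c": DELETE},
    "+": {"$": ACCEPT},
}
SUCCESSORS = {"a": ["b"], "b": ["c"], "c": ["a", "+"], "+": ["a"]}
INITIAL = ["a"]

def cd_accepts(word):
    """Mode = 1 acceptance: each active component executes exactly one cycle."""
    pending = [(i, word) for i in INITIAL]
    seen = set(pending)
    while pending:
        i, w = pending.pop()
        outcome, rest = run_cycle(COMPONENTS[i], w)
        if outcome == "accept":
            return True
        if outcome == "cycle":
            for j in SUCCESSORS[i]:
                if (j, rest) not in seen:
                    seen.add((j, rest))
                    pending.append((j, rest))
    return False

print(cd_accepts("cabbca"))   # True
print(cd_accepts("aabc"))     # False
```

On the input cabbca the components delete one a, one b, and one c per round, in this order, regardless of where these letters occur, which is why the accepted language is closed under permutations of letters.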

In [21] the following result was established.

Proposition 2.2. Each language L ∈ L₌₁(stl-det-local-CD-R(1)) contains a regular sublanguage E that is letter-equivalent to L. In fact, a finite-state acceptor for E can be constructed effectively from a stl-det-local-CD-R(1)-system for L.

In particular, it follows that L₌₁(stl-det-local-CD-R(1)) only contains languages that are semi-linear. Let DLIN denote the class of deterministic linear languages, which is the class of languages that are accepted by deterministic one-turn pushdown automata. Further, let DOCL and OCL denote the classes of deterministic one-counter languages and one-counter languages, which are the classes of languages that are accepted by (deterministic) one-counter automata (see below).

The language L = {aⁿbⁿ | n ≥ 0} is accepted by a deterministic one-turn pushdown automaton as well as by a deterministic one-counter automaton, that is, it belongs to the intersection DLIN ∩ DOCL. However, it does not contain a regular sublanguage that is letter-equivalent to the language itself. Thus, we see from Proposition 2.2 that this language is not accepted by any stl-det-local-CD-R(1)-system working in mode = 1. Together with Example 2.1 this implies that the language class L₌₁(stl-det-local-CD-R(1)) is incomparable to the classes DLIN, LIN, DOCL, OCL, DCFL, and CFL with respect to inclusion.

For technical reasons the following normal form was introduced in [21] for stl-det-local-CD-R(1)-systems.

Definition 2.3. A stl-det-local-CD-R(1)-system M = ((M_i, σ_i)_{i∈I}, I₀) is in normal form, if it satisfies the following three conditions for all i ∈ I, where Σ_M⁽ⁱ⁾, Σ_ε⁽ⁱ⁾, Σ_A⁽ⁱ⁾, Σ_∅⁽ⁱ⁾ is the partitioning of the alphabet Σ for the automaton M_i as described above:

(1) |Σ_ε⁽ⁱ⁾| ≤ 1,
(2) δ_i(¢) = MVR and Σ_A⁽ⁱ⁾ = ∅,
(3) Σ_ε⁽ⁱ⁾ = ∅ iff δ_i($) = Accept.


It is shown in [21] that a given stl-det-local-CD-R(1)-system M can be converted effectively into a stl-det-local-CD-R(1)-system M′ in normal form such that L₌₁(M′) = L₌₁(M). However, the system M′ can have about |Σ| + 1 times as many component automata as the given system M. In [22] closure properties and algorithmic properties are presented for the language class L₌₁(stl-det-local-CD-R(1)).

3. CD-systems with an external pushdown store

A pushdown CD-system of stateless deterministic R(1)-automata, a PD-CD-R(1)-system for short, consists of a CD-system of stateless deterministic R(1)-automata and an external pushdown store. Essentially such a system can be interpreted as a traditional pushdown automaton, in which the operation of reading an input symbol is replaced by a stateless deterministic R(1)-automaton. Hence, it is not necessarily the first symbol that is read, but some symbol that can be reached by this automaton by moving across a prefix of the current input word. Formally, a PD-CD-R(1)-system is defined as a tuple M = (I, Σ, (M_i, σ_i)_{i∈I}, Γ, ⊥, I₀, δ), where

• I is a finite set of indices,

• Σ is a finite input alphabet,

• for all i ∈ I, M_i is a stateless deterministic R(1)-automaton on Σ, and σ_i ⊆ I is a non-empty set of possible successors for M_i,

• Γ is a finite pushdown alphabet,

• ⊥ ∉ Γ is the bottom marker of the pushdown store,

• I₀ ⊆ I is the set of initial indices, and

• δ : (I × Σ × (Γ ∪ {⊥})) → 2^{I × (Γ ∪ {⊥})^∗} is the successor relation. For each i ∈ I, a ∈ Σ, and A ∈ Γ, δ(i, a, A) is a subset of σ_i × Γ^≤2, and δ(i, a, ⊥) is a subset of σ_i × (⊥ · Γ^≤2). Here Γ^≤2 denotes the set of all words over Γ of length at most 2.

A configuration of M is a triple of the form (i, ω, ⊥α), where i ∈ I, ω ∈ (¢ · Σ^∗ · $) ∪ {Accept}, and α ∈ Γ^∗. A configuration of the form (i, ¢w$, ⊥α) describes the situation that the component automaton M_i has just been activated, the word ¢w$ is the corresponding restarting configuration of M_i, and the word ⊥α is the current content of the pushdown store with the last symbol of α at the top. For w ∈ Σ^∗, an initial configuration of M on input w has the form (i₀, ¢w$, ⊥) for any i₀ ∈ I₀, and an accepting configuration has the form (i, Accept, ⊥) for any i ∈ I.

Recall from the discussion in Section 2 that we assume that each component automaton M_i (i ∈ I) performs a move-right operation on the ¢-symbol. Further, for each i ∈ I, let Σ_M⁽ⁱ⁾, Σ_ε⁽ⁱ⁾, and Σ_A⁽ⁱ⁾ denote the subsets of Σ that correspond to the automaton M_i. Then the single-step computation relation ⇒_M that M induces on the set of configurations is defined by the following three rules, where i ∈ I, w ∈ Σ^∗, α ∈ Γ^∗, and A ∈ Γ:

(1) (i, ¢w$, ⊥αA) ⇒_M (j, ¢w′$, ⊥αη) if ∃u ∈ (Σ_M⁽ⁱ⁾)^∗, a ∈ Σ_ε⁽ⁱ⁾, v ∈ Σ^∗ such that w = uav, w′ = uv, and (j, η) ∈ δ(i, a, A);

(2) (i, ¢w$, ⊥) ⇒_M (j, ¢w′$, ⊥η) if ∃u ∈ (Σ_M⁽ⁱ⁾)^∗, a ∈ Σ_ε⁽ⁱ⁾, v ∈ Σ^∗ such that w = uav, w′ = uv, and (j, ⊥η) ∈ δ(i, a, ⊥);

(3) (i, ¢w$, ⊥) ⇒_M (i, Accept, ⊥) if ∃u ∈ (Σ_M⁽ⁱ⁾)^∗, a ∈ Σ_A⁽ⁱ⁾, v ∈ Σ^∗ such that w = uav, or w ∈ (Σ_M⁽ⁱ⁾)^∗ and δ_i($) = Accept.

Notice that the content of the pushdown store is always a word of the form ⊥α for some α ∈ Γ^∗, that is, the bottom marker ⊥ cannot be removed from the pushdown store. By ⇒^∗_M we denote the computation relation of M, which is the reflexive and transitive closure of the relation ⇒_M. The language L(M) accepted by M consists of all words for which M has an accepting computation, that is,

L(M) = {w ∈ Σ^∗ | ∃i₀ ∈ I₀ ∃i ∈ I : (i₀, ¢w$, ⊥) ⇒^∗_M (i, Accept, ⊥)}.
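The three rules and the acceptance condition can be turned into a small search procedure. The following Python sketch is an illustration only, with several assumptions of ours: pushdown contents are strings with the top at the right end, pushdown symbols are single characters, the marker Accept in a configuration is encoded as `None`, every component is assumed to move right on the left border marker, and the toy system at the end (for the language of all words aⁿbⁿ) is hypothetical and not taken from the paper.

```python
MVR, ACCEPT, DELETE = "MVR", "Accept", "eps"
BOTTOM = "⊥"

def scan(delta_i, word):
    """Return ("del", a, w') for a cycle deleting a, ("acc",) or ("stuck",)."""
    for k, symbol in enumerate(word):
        action = delta_i.get(symbol)
        if action == DELETE:
            return ("del", symbol, word[:k] + word[k + 1:])
        if action == ACCEPT:
            return ("acc",)
        if action != MVR:
            return ("stuck",)
    return ("acc",) if delta_i.get("$") == ACCEPT else ("stuck",)

def step(components, delta, config):
    """All configurations reachable from `config` in one step (rules (1)-(3))."""
    i, word, store = config
    if word is None:                         # already of the form (i, Accept, ...)
        return set()
    result = scan(components[i], word)
    if result[0] == "acc":
        # Rule (3): only applicable if the store holds just the bottom marker.
        return {(i, None, store)} if store == BOTTOM else set()
    if result[0] == "stuck":
        return set()
    _, letter, rest = result
    top = store[-1]
    # Rules (1) and (2): the top symbol is replaced by eta; for top = ⊥ the
    # word eta itself starts with ⊥, so the bottom marker is never lost.
    return {(j, rest, store[:-1] + eta)
            for (j, eta) in delta.get((i, letter, top), ())}

def accepts(components, delta, initial, word):
    pending = [(i, word, BOTTOM) for i in initial]
    seen = set(pending)
    while pending:
        config = pending.pop()
        if config[1] is None:
            return True
        for nxt in step(components, delta, config):
            if nxt not in seen:
                seen.add(nxt)
                pending.append(nxt)
    return False

# Hypothetical toy system: A deletes a leading a, B deletes a leading b.
TOY_COMPONENTS = {"A": {"a": DELETE}, "B": {"b": DELETE}, "+": {"$": ACCEPT}}
TOY_DELTA = {
    ("A", "a", BOTTOM): [("A", BOTTOM + "C"), ("B", BOTTOM + "C")],
    ("A", "a", "C"): [("A", "CC"), ("B", "CC")],
    ("B", "b", "C"): [("B", ""), ("+", "")],
}
print(accepts(TOY_COMPONENTS, TOY_DELTA, ["A", "+"], "aabb"))   # True
print(accepts(TOY_COMPONENTS, TOY_DELTA, ["A", "+"], "aab"))    # False
```

The search terminates because every step of type (1) or (2) shortens the input word, so only finitely many configurations are reachable.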

Remark 3.1. The system M accepts if and when both of the following conditions are satisfied: the currently active component automaton M_i executes an accepting tail computation starting from the current restarting configuration ¢w$, and the pushdown store just contains the bottom marker ⊥. One could relax this acceptance condition by just requiring that the currently active component automaton M_i accepts starting from the current restarting configuration. It is not clear whether that would change the class of languages accepted. However, the requirement that the pushdown store must just contain the bottom marker at the end of an accepting computation can be seen as a kind of normalization. Observe that the content of the pushdown store of M is manipulated only in steps of the form (1) and (2), and that during each step of either of these forms a component automaton of M executes a cycle, that is, an input letter is being erased. Thus, there is no way that M can manipulate its pushdown store without reading (that is, deleting) input symbols, that is, if a configuration of the form (i, Accept, ⊥α) were reached for some α ≠ ε, then α could not be popped from the pushdown store.

A PD-CD-R(1)-system M = (I, Σ, (M_i, σ_i)_{i∈I}, Γ, ⊥, I₀, δ) is called a one-counter CD-system of stateless deterministic R(1)-automata, OC-CD-R(1)-system for short, if |Γ| = 1, that is, if there is only a single pushdown symbol in addition to the bottom marker ⊥. By L(PD-CD-R(1)) we denote the class of languages that are accepted by PD-CD-R(1)-systems, and L(OC-CD-R(1)) denotes the class of languages that are accepted by OC-CD-R(1)-systems.

Example 3.2. We consider the language

L = {aⁿv | v ∈ {b, c}^∗, |v|_b = |v|_c = n, n ≥ 0}.

As L ∩ (a^∗ · b^∗ · c^∗) = {aⁿbⁿcⁿ | n ≥ 0} is not context-free, we see that L itself is not context-free. Further, there is no regular sublanguage of L that is letter-equivalent to L. Hence, by Proposition 2.2, L is not accepted by any stl-det-local-CD-R(1)-system, either. However, we claim that L is accepted by the OC-CD-R(1)-system M = (I, Σ, (M_i, σ_i)_{i∈I}, Γ, ⊥, I₀, δ) that is defined as follows:

• I = {a, b, c, +}, I₀ = {a, +}, Σ = {a, b, c}, and Γ = {C},

• M_a, M_b, M_c, and M₊ are defined by the following transition functions:

(1) δ_a(¢) = MVR, (2) δ_a(a) = ε,
(3) δ₊(¢) = MVR, (4) δ₊($) = Accept,
(5) δ_b(¢) = MVR, (6) δ_b(b) = ε, (7) δ_b(c) = MVR,
(8) δ_c(¢) = MVR, (9) δ_c(c) = ε, (10) δ_c(b) = MVR,

where δ_x(y) (x ∈ I, y ∈ Σ ∪ {$}) is undefined for all other cases,

• σ_a = {a, b}, σ_b = {c}, σ_c = {b, +}, and σ₊ = {+}, and

• δ is defined as follows:

(1) δ(a, a, ⊥) = {(a, ⊥C), (b, ⊥C)},
(2) δ(a, a, C) = {(a, CC), (b, CC)},
(3) δ(b, b, C) = {(c, C)},
(4) δ(c, c, C) = {(b, ε), (+, ε)},

and for all other triples, δ yields the empty set.

The component automaton M₊ just accepts the empty word, and it gets stuck on all other words. The component M_a just deletes the first letter, if it is an a, otherwise, it gets stuck. The component M_b reads across c’s and deletes the first b it encounters, and analogously, the component M_c reads across b’s and deletes the first c it encounters. Thus, we see from the form of the successor sets that M can only accept certain words of the form a^m v such that v ∈ {b, c}^∗. However, when M_a deletes an a, then a symbol C is pushed onto the pushdown store, and when M_c deletes a c, then a symbol C is popped from the pushdown store. As M_b and M_c work alternately, this means that the same number of b’s and c’s are deleted. Thus, if M is to accept, then |v|_b = |v|_c = n holds for some n ≥ 0.

If m < n, then after deleting the first m occurrences of b and c, the pushdown store only contains the bottom marker ⊥, and then M gets stuck as seen from the definition of δ. On the other hand, if m > n, then the pushdown still contains some occurrences of the symbol C when the word a^m v has been erased completely. Hence, in this situation M does not accept, either. Finally, if m = n, then after erasing the last occurrence of c, also the last occurrence of the symbol C is popped from the pushdown store, and then M₊ can accept starting from the configuration (+, ¢$, ⊥). Hence, we see that L(M) = L holds.

Thus, already the language class L(OC-CD-R(1)) contains a language that is neither context-free nor accepted by any stl-det-local-CD-R(1)-system.
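The system of Example 3.2 can be checked mechanically on short inputs. In the Python sketch below, which is only an illustration, the pushdown content ⊥C^m is encoded by the integer m (this encoding is our choice, justified by the fact that there is only one pushdown symbol), the move-right step on the left border marker is left implicit, and the names `scan`, `accepts`, and `in_L` are hypothetical. The function `in_L` is a direct test for membership in L that serves as a reference.

```python
from itertools import product

MVR, ACCEPT, DELETE = "MVR", "Accept", "eps"

# Component automata of Example 3.2; every component moves right on ¢.
COMPONENTS = {
    "a": {"a": DELETE},
    "b": {"b": DELETE, "c": MVR},
    "c": {"c": DELETE, "b": MVR},
    "+": {"$": ACCEPT},
}
# Successor relation with the pushdown ⊥C^m encoded by the integer m:
# (index, deleted letter, counter positive?) -> [(next index, change of m)].
RULES = {
    ("a", "a", False): [("a", +1), ("b", +1)],
    ("a", "a", True): [("a", +1), ("b", +1)],
    ("b", "b", True): [("c", 0)],
    ("c", "c", True): [("b", -1), ("+", -1)],
}
INITIAL = ["a", "+"]

def scan(delta_i, word):
    """Return ("del", a, w') for a cycle deleting a, ("acc",) or ("stuck",)."""
    for k, symbol in enumerate(word):
        action = delta_i.get(symbol)
        if action == DELETE:
            return ("del", symbol, word[:k] + word[k + 1:])
        if action == ACCEPT:
            return ("acc",)
        if action != MVR:
            return ("stuck",)
    return ("acc",) if delta_i.get("$") == ACCEPT else ("stuck",)

def accepts(word):
    pending = [(i, word, 0) for i in INITIAL]
    seen = set(pending)
    while pending:
        i, w, m = pending.pop()
        result = scan(COMPONENTS[i], w)
        if result[0] == "acc" and m == 0:    # accept only with empty counter
            return True
        if result[0] == "del":
            _, letter, rest = result
            for j, change in RULES.get((i, letter, m > 0), []):
                nxt = (j, rest, m + change)
                if nxt not in seen:
                    seen.add(nxt)
                    pending.append(nxt)
    return False

def in_L(word):
    """Reference test for L = {a^n v | v in {b,c}*, |v|_b = |v|_c = n}."""
    n = len(word) - len(word.lstrip("a"))    # number of leading a's
    v = word[n:]
    return set(v) <= {"b", "c"} and v.count("b") == v.count("c") == n

# Compare the simulated system with the reference on all words up to length 6.
WORDS = ["".join(p) for n in range(7) for p in product("abc", repeat=n)]
MISMATCHES = [w for w in WORDS if accepts(w) != in_L(w)]
print(MISMATCHES)
```

Such an exhaustive comparison on short words is of course no proof of L(M) = L, but it is a convenient sanity check for the case analysis given above.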

Remark 3.1 (cont.). In the above example we exploit the fact that M accepts only when its pushdown store contains nothing but the bottom marker. However, for the language L above an OC-CD-R(1)-system can be designed that accepts with arbitrary content in its pushdown store.

Next we show that OC-CD-R(1)-systems can simulate all stl-det-local-CD-R(1)-systems.

Proposition 3.3. L₌₁(stl-det-local-CD-R(1)) ⊊ L(OC-CD-R(1)).

Proof. Let M = ((M_i, σ_i)_{i∈I}, I₀) be a stl-det-local-CD-R(1)-system, and let L = L₌₁(M). We obtain an OC-CD-R(1)-system M′ = (I, Σ, (M_i, σ_i)_{i∈I}, ∅, ⊥, I₀, δ), where Σ is the tape alphabet of M, by defining the transition function δ as follows for all i ∈ I:

δ(i, a, ⊥) = {(j, ⊥) | j ∈ σ_i} for all a ∈ Σ_ε⁽ⁱ⁾,
δ(i, a, ⊥) = ∅ for all a ∈ Σ ∖ Σ_ε⁽ⁱ⁾.

Then there is a one-to-one correspondence between the accepting computations of M and the accepting computations of M′. Thus, L(M′) = L. This yields the announced inclusion. Its properness follows from the previous example.

Further, PD-CD-R(1)-systems accept all context-free languages.

Proposition 3.4. CFL ⊊ L(PD-CD-R(1)).

Proof. Let L ⊆ Σ⁺ be a context-free language. Then there exists a context-free grammar G = (V, Σ, S, P) in quadratic Greibach normal form for L, that is, for each production (A → r) ∈ P, the right-hand side r is of the form r = aα, where a ∈ Σ and α ∈ V^≤2. In addition, we can assume that the start symbol S does not occur on the right-hand side of any production. Applied to G, the standard construction of a pushdown automaton from a context-free grammar yields a pushdown automaton A without ε-moves that, given a word w ∈ Σ⁺ as input, simulates a left-most G-derivation of w from S (see, e.g., [12]). In analogy to this construction we build a PD-CD-R(1)-system M = (I, Σ, (M_i, σ_i)_{i∈I}, V, ⊥, {S}, δ), where I = V ∪ {+}, the stateless deterministic R(1)-automata M_A (A ∈ V) and M₊ are defined as follows, where a ∈ Σ:

(1) δ_A(¢) = MVR,
(2) δ_A(a) = ε, if there exists γ ∈ V^≤2 : (A → aγ) ∈ P,
(3) δ_A(a) = ∅, otherwise,
(4) δ_A($) = ∅,
(5) δ₊(¢) = MVR,
(6) δ₊(a) = ∅,
(7) δ₊($) = Accept,

the sets of successors are defined by σ_A = σ₊ = I for all A ∈ V, and the successor relation δ is defined as follows, where A ∈ V and a ∈ Σ:

(1) δ(S, a, ⊥) = {(+, ⊥) | (S → a) ∈ P} ∪ {(B, ⊥B) | (S → aB) ∈ P} ∪ {(B, ⊥CB) | (S → aBC) ∈ P},
(2) δ(A, a, A) = {(B, ε) | B ∈ V ∖ {S} and (A → a) ∈ P} ∪ {(+, ε) | (A → a) ∈ P} ∪ {(B, B) | (A → aB) ∈ P} ∪ {(B, CB) | (A → aBC) ∈ P},

and δ yields the empty set for all other values.


We claim that L(M) = L holds. For establishing this equality, we prove the following technical result, where U denotes the set U = V ∖ {S}, and ⇒^∗_G denotes the left-most derivation relation of G.

Claim. For all w ∈ Σ⁺, all A ∈ U, and all α ∈ U^∗,

(A, ¢w$, ⊥αA) ⇒^∗_M (+, Accept, ⊥) iff Aα^R ⇒^∗_G w.

Proof. “⇒”: Assume that (A, ¢w$, ⊥αA) ⇒^∗_M (+, Accept, ⊥). We proceed by induction on |w|. If |w| = 1, then w = a ∈ Σ. We see from the definition of M that α = ε, and that the above computation of M has the form (A, ¢w$, ⊥αA) = (A, ¢a$, ⊥A) ⇒_M (+, ¢$, ⊥) ⇒_M (+, Accept, ⊥). This implies that (A → a) ∈ P, that is, A = Aα^R ⇒_G a = w holds.

If |w| = n + 1 for some n ≥ 1, then w = aw′ for some a ∈ Σ and some word w′ of length n. It follows that the above computation of M has the form

(A, ¢w$, ⊥αA) = (A, ¢aw′$, ⊥αA) ⇒_M (B, ¢w′$, ⊥γB) ⇒^∗_M (+, Accept, ⊥),

where (A → a) ∈ P, and then γB = α, or (A → aB) ∈ P, and then γ = α, or (A → aBC) ∈ P, and then γ = αC. In each case we obtain the derivation Aα^R ⇒_G aBγ^R ⇒^∗_G aw′ = w from the induction hypothesis for w′.

“⇐”: Assume that Aα^R ⇒^∗_G w holds. Again, we proceed by induction on |w|. If |w| = 1, then w = a ∈ Σ, and it follows from the form of the rules of G that α = ε. Thus, (A → a) ∈ P, and hence,

(A, ¢w$, ⊥αA) = (A, ¢a$, ⊥A) ⇒_M (+, ¢$, ⊥) ⇒_M (+, Accept, ⊥)

follows. If |w| = n + 1 for some n ≥ 1, then w = aw′ for some a ∈ Σ and some word w′ of length n. It follows that the above derivation has the form Aα^R ⇒_G aBγ^R ⇒^∗_G aw′ = w, where in the first step, (A → a) ∈ P is used, and then α = γB, or (A → aB) ∈ P is used, and then α = γ, or (A → aBC) ∈ P is used, and then αC = γ. In each case we obtain the following computation from the induction hypothesis for w′:

(A, ¢w$, ⊥αA) = (A, ¢aw′$, ⊥αA) ⇒_M (B, ¢w′$, ⊥γB) ⇒^∗_M (+, Accept, ⊥).

Let a ∈ Σ. Then a ∈ L iff (S → a) ∈ P iff (S, ¢a$, ⊥) ⇒_M (+, ¢$, ⊥) ⇒_M (+, Accept, ⊥) iff a ∈ L(M). Finally, for all w ∈ Σ⁺ and all a ∈ Σ,

aw∈LiﬀS⇒GaAα^R⇒^∗_Gaw

iﬀ (S →aAα^R)∈P andAα^R⇒^∗_Gw

iﬀ (S,caw$,⊥)⇒M(A,cw$,⊥αA)⇒^∗_M (+,Accept,⊥).

Hence, it follows thatL(M) =L.

If the given context-free language L includes the empty word, we can apply the above construction to the language L ∖ {ε}. Then the resulting PD-CD-R(1)-system will accept this language. By adding the component + to the set of initial components, we obtain a PD-CD-R(1)-system for the language L. This yields the intended inclusion, which is proper by Example 3.2.


Next we consider the so-called one-counter automata and the class of languages accepted by them. One finds several different non-equivalent definitions for one-counter automata in the literature. Here we take a definition that is equivalent to the one used by Jančar et al. in [14] (see also [3]).

A pushdown automaton A = (Q, Σ, Γ, q₀, ⊥, δ, F) is called a one-counter automaton if |Γ| = 1, and if the bottom marker ⊥ cannot be removed from the pushdown store. Thus, if C is the only symbol in Γ, then the pushdown contents ⊥C^m can be interpreted as the integer m for all m ≥ 0. Accordingly, the pop operation can be interpreted as the decrement −1. It can be assumed in addition that the only other pushdown operations leave the value m unchanged or increase it by 1, that is, the pushdown is not changed or exactly one additional C is pushed onto it. Finally, A has to read an input symbol in each step, that is, it cannot make any ε-steps.

A word w ∈ Σ^∗ is accepted by A if (q₀, w, ⊥) ⊢^∗_A (q, ε, ⊥) holds for some final state q ∈ F. Observe that A can only distinguish between two states of its pushdown store: either the topmost symbol is C, which is interpreted by saying that the counter is positive, or it is the bottom marker ⊥, which is interpreted as the counter is zero. By OCL we denote the class of languages accepted by one-counter automata. It is well-known that REG ⊊ OCL ⊊ CFL holds (see e.g. [3]).
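To make this definition concrete, here is a minimal Python sketch of a simulator for one-counter automata of this restricted kind. The encoding is an assumption made for illustration only: the pushdown ⊥C^m is stored as the integer m, and the transition relation maps (state, letter, counter positive?) to a set of pairs (next state, change) with change in {−1, 0, +1}. The names `oca_accepts`, `DELTA`, and `FINALS` and the sample automaton for {aⁿbⁿ | n ≥ 1} are hypothetical.

```python
def oca_accepts(delta, q0, finals, word):
    """Nondeterministic simulation; one input letter is read in each step."""
    configs = {(q0, 0)}                      # pairs (state, counter value)
    for letter in word:
        successors = set()
        for state, counter in configs:
            moves = delta.get((state, letter, counter > 0), ())
            for new_state, change in moves:
                if counter + change >= 0:    # ⊥ can never be removed
                    successors.add((new_state, counter + change))
        configs = successors
    # accept in a final state with the counter back at zero
    return any(state in finals and counter == 0
               for state, counter in configs)

# Hypothetical sample automaton: count the a's up, then the b's down.
DELTA = {
    ("q0", "a", False): {("q0", +1)},
    ("q0", "a", True):  {("q0", +1)},
    ("q0", "b", True):  {("q1", -1)},
    ("q1", "b", True):  {("q1", -1)},
}
FINALS = {"q1"}
```

The automaton only ever inspects whether the counter is positive, which is exactly the two-way distinction between the top symbols C and ⊥ described above.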

Proposition 3.5. OCL ⊊ L(OC-CD-R(1)).

Proof. Let A = (Q, Σ, {C}, q₀, ⊥, δ_A, F) be a one-counter automaton, and let L = L(A) ⊆ Σ^∗ be the language it accepts. We simulate A through an OC-CD-R(1)-system M = (I, Σ, (M_i, σ_i)_i∈I, {C}, ⊥, I₀, δ), where I = (Q × {=, >}) ∪ {+}, I₀ = {(q₀,=), +}, σ_(q,>) = σ_(q,=) = σ₊ = I for all q ∈ Q, and the stateless deterministic R(1)-automata M_(q,>), M_(q,=) (q ∈ Q), and M₊ are defined as follows:

(1) δ_(q,=)(c) = MVR,
(2) δ_(q,=)(a) = ε, if δ_A(q, a, ⊥) is defined,
(3) δ_(q,=)(a) = ∅, otherwise,
(4) δ_(q,=)($) = ∅,
(5) δ_(q,>)(c) = MVR,
(6) δ_(q,>)(a) = ε, if δ_A(q, a, C) is defined,
(7) δ_(q,>)(a) = ∅, otherwise,
(8) δ_(q,>)($) = ∅,
(9) δ₊(c) = MVR,
(10) δ₊(a) = ∅,
(11) δ₊($) = Accept,


and the successor relation δ is deﬁned as follows, where q ∈ Q, a ∈ Σ, and i∈ {1,2}:

(1) δ((q,=), a, ⊥) = {((q′,=), ⊥) | (q′, ⊥) ∈ δ_A(q, a, ⊥)}
        ∪ {(+, ⊥) | ∃q′ ∈ F : (q′, ⊥) ∈ δ_A(q, a, ⊥)}
        ∪ {((q′,>), ⊥C) | (q′, ⊥C) ∈ δ_A(q, a, ⊥)},

(2) δ((q,>), a, C) = {((q′,>), Cⁱ) | (q′, Cⁱ) ∈ δ_A(q, a, C)}
        ∪ {((q′,>), ε), ((q′,=), ε) | (q′, ε) ∈ δ_A(q, a, C)}
        ∪ {(+, ε) | ∃q′ ∈ F : (q′, ε) ∈ δ_A(q, a, C)},

while δ yields the empty set for all other values.
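The translation from δ_A to δ can be checked experimentally. The Python sketch below is an illustration under stated assumptions, not the construction itself: the counter is stored as an integer, pushdown operations are encoded as changes in {−1, 0, +1}, components are encoded as pairs (q, "=") or (q, ">") or the string "+", and the computation is started in the component (q₀, =) only, so the set I₀ is not modelled and only nonempty inputs are compared. All function names and the sample automaton are hypothetical.

```python
from itertools import product

def build_system(delta_A, finals):
    """Parts (1) and (2) of the successor relation, derived from delta_A."""
    delta = {}
    for (q, a, positive), moves in delta_A.items():
        key = ((q, ">"), a, "C") if positive else ((q, "="), a, "⊥")
        out = delta.setdefault(key, set())
        for p, change in moves:
            if change == -1 and positive:
                # after a decrement, guess whether the counter is now zero
                out.update({((p, ">"), -1), ((p, "="), -1)})
                if p in finals:
                    out.add(("+", -1))
            elif change in (0, 1) and positive:
                out.add(((p, ">"), change))
            elif change == 1:
                out.add(((p, ">"), 1))
            elif change == 0:
                out.add(((p, "="), 0))
                if p in finals:
                    out.add(("+", 0))
    return delta

def system_accepts(delta, q0, word):
    configs = {((q0, "="), 0)}               # pairs (component, counter)
    for letter in word:
        configs = {(succ, counter + change)
                   for comp, counter in configs
                   for succ, change in delta.get(
                       (comp, letter, "C" if counter > 0 else "⊥"), ())}
    # "+" accepts only with the input consumed and the counter at zero
    return ("+", 0) in configs

def oca_accepts(delta_A, q0, finals, word):
    configs = {(q0, 0)}
    for letter in word:
        configs = {(p, counter + change)
                   for q, counter in configs
                   for p, change in delta_A.get((q, letter, counter > 0), ())
                   if counter + change >= 0}
    return any(q in finals and counter == 0 for q, counter in configs)

# Hypothetical sample automaton for the language { a^n b^n | n >= 1 }.
DELTA_A = {
    ("q0", "a", False): {("q0", 1)},
    ("q0", "a", True):  {("q0", 1)},
    ("q0", "b", True):  {("q1", -1)},
    ("q1", "b", True):  {("q1", -1)},
}
FINALS = {"q1"}
SYSTEM = build_system(DELTA_A, FINALS)
```

A wrong guess after a decrement is harmless in this sketch: a component (q, =) finds no entry for the top symbol C, and a component (q, >) finds none for ⊥.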

Observe that each time A decreases its counter, M also decreases its counter, and in addition it has the option of activating the final component M₊ if the state entered is final. However, M₊ can only accept if at that moment the input has been processed completely, and M only accepts if, in addition, the counter is zero. It follows that there is a one-to-one correspondence between the accepting computations of the one-counter automaton A and the system M. Hence, we have L(M) = L(A) = L. This yields the intended inclusion, which is proper by Example 3.2.

Definition 3.6. A PD-CD-R(1)-system M = (I, Σ, (M_i, σ_i)_i∈I, Γ, ⊥, I₀, δ) is in strong normal form if it satisfies the following conditions, where, for all i ∈ I, Σ_M⁽ⁱ⁾, Σ_ε⁽ⁱ⁾, Σ_A⁽ⁱ⁾, Σ_∅⁽ⁱ⁾ is the partitioning of the alphabet Σ for the automaton M_i as described in Section 2:

(1) ∃i₊ ∈ I : δ_i₊(c) = MVR, δ_i₊($) = Accept, and Σ_∅⁽ⁱ⁺⁾ = Σ;

(2) ∀i ∈ I ∖ {i₊} : δ_i(c) = MVR, |Σ_ε⁽ⁱ⁾| = 1, Σ_A⁽ⁱ⁾ = ∅, and δ_i($) = ∅.
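The two conditions are easy to test on a concrete table of transition functions. The following Python sketch assumes a hypothetical encoding in which each component is a dictionary from symbols to operations, a missing entry stands for ∅, and the keys `LEFT` and `RIGHT` stand in for the two sentinels; none of these names come from the definition.

```python
LEFT, RIGHT = "LEFT", "RIGHT"          # stand-ins for the two sentinels
MVR, DEL, ACCEPT = "MVR", "DEL", "ACCEPT"

def is_final_component(d, sigma):
    """Condition (1): accepts at the right sentinel, undefined on all letters."""
    return (d.get(LEFT) == MVR and d.get(RIGHT) == ACCEPT
            and all(d.get(a) is None for a in sigma))

def is_deleting_component(d, sigma):
    """Condition (2): deletes exactly one kind of letter and never accepts."""
    return (d.get(LEFT) == MVR and d.get(RIGHT) is None
            and sum(d.get(a) == DEL for a in sigma) == 1
            and all(d.get(a) != ACCEPT for a in sigma))

def is_strong_normal_form(components, sigma):
    """components: dict mapping an index to its transition dictionary."""
    return any(
        is_final_component(components[p], sigma)
        and all(is_deleting_component(d, sigma)
                for i, d in components.items() if i != p)
        for p in components)

# Hypothetical sample systems over the alphabet {a, b}.
GOOD = {"+": {LEFT: MVR, RIGHT: ACCEPT},
        "A": {LEFT: MVR, "a": DEL, "b": MVR}}
BAD = {"+": {LEFT: MVR, RIGHT: ACCEPT},
       "A": {LEFT: MVR, "a": DEL, "b": DEL}}   # deletes two kinds of letters
```

Only the transition functions are inspected here; the successor relations and the pushdown play no role in the definition.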

Thus, if M is in strong normal form, then it has a unique component M_i₊ that can execute accept instructions, but it only accepts the empty word, while all other components each delete a single kind of letter. In particular, a word w ∈ L(M) is first erased completely by executing |w| many cycles, and then the empty word is accepted by activating component M_i₊. As OC-CD-R(1)-systems are a special type of PD-CD-R(1)-systems, this definition also applies to them. The following technical result shows that we can restrict our attention to PD-CD-R(1)-systems in strong normal form.

Lemma 3.7. From a PD-CD-R(1)-system M one can construct a PD-CD-R(1)-system M′ in strong normal form such that L(M′) = L(M). In addition, if M is an OC-CD-R(1)-system, then M′ can be constructed to be an OC-CD-R(1)-system, too.

Proof. Let M = (I, Σ, (M_i, σ_i)_i∈I, Γ, ⊥, I₀, δ) be a PD-CD-R(1)-system. As pointed out in Section 2, we can assume that each of the component automata M_i executes a move-right step on seeing the c-symbol, that is, δ_i(c) = MVR for all i ∈ I, where δ_i denotes the transition function of M_i.

First we split every component automaton M_i into |Σ_ε⁽ⁱ⁾| + 1 many parts, M_i^(a) for a ∈ Σ_ε⁽ⁱ⁾, and M_i⁽⁺⁾, where the former is responsible for executing the cycles of M_i in which an occurrence of the letter a is deleted, while the latter takes care of the accepting tail computations of M_i. In detail, for each a ∈ Σ_ε⁽ⁱ⁾,

δ_i^(a)(c) = MVR and δ_i⁽⁺⁾(c) = MVR,
δ_i^(a)(a) = ε and δ_i⁽⁺⁾(a) = ∅,
δ_i^(a)(b) = MVR and δ_i⁽⁺⁾(b) = MVR for all b ∈ Σ_M⁽ⁱ⁾,
δ_i^(a)(b) = ∅ and δ_i⁽⁺⁾(b) = Accept for all b ∈ Σ_A⁽ⁱ⁾,
δ_i^(a)(b) = ∅ and δ_i⁽⁺⁾(b) = ∅ for all b ∈ (Σ_ε⁽ⁱ⁾ ∖ {a}) ∪ Σ_∅⁽ⁱ⁾,
δ_i^(a)($) = ∅ and δ_i⁽⁺⁾($) = δ_i($).
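The splitting step amounts to a simple transformation of one transition table into several. The Python sketch below illustrates it under a hypothetical encoding: a transition function is a dictionary from symbols to operations, a missing entry stands for ∅, and `LEFT` and `RIGHT` stand in for the sentinels. The key "+" names the part M_i⁽⁺⁾, so this sketch assumes that "+" is not an input letter.

```python
LEFT, RIGHT = "LEFT", "RIGHT"          # stand-ins for the two sentinels
MVR, DEL, ACCEPT = "MVR", "DEL", "ACCEPT"

def split_component(delta_i, sigma):
    """Return the parts of M_i: one per deleted letter, plus the part '+'."""
    deleted = [a for a in sigma if delta_i.get(a) == DEL]
    movers = [b for b in sigma if delta_i.get(b) == MVR]
    accepting = [b for b in sigma if delta_i.get(b) == ACCEPT]

    parts = {}
    for a in deleted:
        # the part for a: moves right as M_i does, deletes only a, never accepts
        part = {LEFT: MVR, a: DEL}
        part.update({b: MVR for b in movers})
        parts[a] = part

    # the part '+': keeps the move-right and accept steps, deletes nothing
    plus = {LEFT: MVR}
    plus.update({b: MVR for b in movers})
    plus.update({b: ACCEPT for b in accepting})
    if delta_i.get(RIGHT) is not None:
        plus[RIGHT] = delta_i[RIGHT]
    parts["+"] = plus
    return parts

# Hypothetical sample component over the alphabet {a, b, m, x}.
SAMPLE = {LEFT: MVR, "a": DEL, "b": DEL, "m": MVR, "x": ACCEPT, RIGHT: ACCEPT}
PARTS = split_component(SAMPLE, "abmx")
```

For the sample component this yields three parts, with keys "a", "b", and "+".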

We adjust the successor relations σ_i (i ∈ I) as

σ_i^(a) = σ_i⁽⁺⁾ = {j^(b), j⁽⁺⁾ | j ∈ σ_i, b ∈ Σ_ε^(j)},

and we take

M̂ = (Î, Σ, (M_i^(a), σ_i^(a))_(i∈I, a∈Σ_ε⁽ⁱ⁾) ∪ (M_i⁽⁺⁾, σ_i⁽⁺⁾)_i∈I, Γ, ⊥, Î₀, δ̂),

where Î = {i^(a), i⁽⁺⁾ | i ∈ I, a ∈ Σ_ε⁽ⁱ⁾} and Î₀ = {i^(a), i⁽⁺⁾ | i ∈ I₀, a ∈ Σ_ε⁽ⁱ⁾}. Finally, the successor relation δ̂ : Î × Σ × (Γ ∪ {⊥}) → 2^(Î × (Γ ∪ {⊥})^∗) is defined as follows, where i ∈ I, a, b ∈ Σ, and A ∈ Γ:

(1) δ̂(i^(a), a, A) = {(j^(c), α) | (j, α) ∈ δ(i, a, A), c ∈ Σ_ε^(j)}
        ∪ {(j⁽⁺⁾, ε) | (j, ε) ∈ δ(i, a, A)},

(2) δ̂(i^(a), a, ⊥) = {(j^(c), ⊥α) | (j, ⊥α) ∈ δ(i, a, ⊥), c ∈ Σ_ε^(j)}
        ∪ {(j⁽⁺⁾, ⊥) | (j, ⊥) ∈ δ(i, a, ⊥)},

and for all other triples, δ̂ yields the empty set.

Then M̂ simply simulates the computations of M. Each time a successor automaton M_j is chosen in a computation of M, one has to guess whether another cycle will be executed, and if so, which rewrite instruction will be applied, or whether the next component automaton will accept in a tail computation. Then in the simulating computation of M̂, one must simply choose the corresponding component M_j^(a) or M_j⁽⁺⁾. Observe that a computation of M can succeed only if the pushdown contents just consists of the bottom marker ⊥ at the moment when the active component M_j executes an accepting tail computation. Accordingly, M̂ only needs to be able to choose an accepting component M_j⁽⁺⁾ when the pushdown content is just ⊥ (see (2)) or when it could just now have been reduced to ⊥ (see (1)). It follows easily that L(M̂) = L(M).

We now construct the intended system in strong normal form by modifying the system M̂. Observe that all component automata M_i^(a) of M̂ already satisfy the conditions stated in part (2) of Definition 3.6. Hence, it remains to modify the accepting component automata M_i⁽⁺⁾ (i ∈ I). First we introduce a special component M₊ that just accepts the empty word, that is, δ₊(c) = MVR, δ₊($) = Accept, and δ₊(a) = ∅ for all letters a ∈ Σ. Now we need to distinguish two cases.

Case 1: δ_i⁽⁺⁾($) = ∅. In this situation M_i⁽⁺⁾ accepts all words from the set (Σ_M⁽ⁱ⁾)^∗ · Σ_A⁽ⁱ⁾ · Σ^∗. We now replace M_i⁽⁺⁾ by the two components M′_i⁽⁺⁾ and M″_i⁽⁺⁾ that are defined by the following transition functions:

δ′_i⁽⁺⁾(c) = MVR,
δ′_i⁽⁺⁾(a) = MVR for all a ∈ Σ_M⁽ⁱ⁾,
δ′_i⁽⁺⁾(a) = ∅ for all a ∈ Σ_ε⁽ⁱ⁾ ∪ Σ_∅⁽ⁱ⁾,
δ′_i⁽⁺⁾(a) = ε for all a ∈ Σ_A⁽ⁱ⁾,
δ′_i⁽⁺⁾($) = ∅,

δ″_i⁽⁺⁾(c) = MVR,
δ″_i⁽⁺⁾(a) = ε for all a ∈ Σ,
δ″_i⁽⁺⁾($) = ∅.

Further, the successor relation δ̂ is modified as follows:

(1) On the right-hand side of δ̂, each occurrence of the component M_i⁽⁺⁾ is replaced by the component M′_i⁽⁺⁾.

(2) The following transitions are added:

δ̂(i′⁽⁺⁾, a, ⊥) = {(i″⁽⁺⁾, ⊥), (+, ⊥)} for all a ∈ Σ_A⁽ⁱ⁾,
δ̂(i″⁽⁺⁾, a, ⊥) = {(i″⁽⁺⁾, ⊥), (+, ⊥)} for all a ∈ Σ,

where + refers to the component automaton M₊ introduced above.

This modification of δ̂ ensures that in combination with M₊, the component automata M′_i⁽⁺⁾ and M″_i⁽⁺⁾ accept the words from the set (Σ_M⁽ⁱ⁾)^∗ · Σ_A⁽ⁱ⁾ · Σ^∗.

Case 2: δ_i⁽⁺⁾($) = Accept. In this situation M_i⁽⁺⁾ accepts all words from the set (Σ_M⁽ⁱ⁾)^∗ · Σ_A⁽ⁱ⁾ · Σ^∗ ∪ (Σ_M⁽ⁱ⁾)^∗. We now replace M_i⁽⁺⁾ by the two components M′_i⁽⁺⁾ and M″_i⁽⁺⁾ defined above and an additional component M̂_i⁽⁺⁾ that is defined as follows:

δ̂_i⁽⁺⁾(c) = MVR,
δ̂_i⁽⁺⁾(a) = ε for all a ∈ Σ_M⁽ⁱ⁾,
δ̂_i⁽⁺⁾(a) = ∅ for all a ∈ Σ ∖ Σ_M⁽ⁱ⁾,
δ̂_i⁽⁺⁾($) = ∅.

In this case the successor relation δ̂ is modified as follows:

(1) On the right-hand side of δ̂, each occurrence of the component M_i⁽⁺⁾ is replaced by the components M′_i⁽⁺⁾ and M̂_i⁽⁺⁾.

(2) The following transitions are added:

δ̂(i′⁽⁺⁾, a, ⊥) = {(i″⁽⁺⁾, ⊥), (+, ⊥)} for all a ∈ Σ_A⁽ⁱ⁾,
δ̂(i″⁽⁺⁾, a, ⊥) = {(i″⁽⁺⁾, ⊥), (+, ⊥)} for all a ∈ Σ,
δ̂(î⁽⁺⁾, a, ⊥) = {(î⁽⁺⁾, ⊥), (+, ⊥)} for all a ∈ Σ_M⁽ⁱ⁾.