• Aucun résultat trouvé

Duality and equational theory of regular languages

N/A
N/A
Protected

Academic year: 2022

Partager "Duality and equational theory of regular languages"

Copied!
39
0
0

Texte intégral

(1)

Duality and equational theory of regular languages

Mai Gehrke1 Serge Grigorieff2 Jean-´Eric Pin2

1Radboud Universiteit

2LIAFA, CNRS and University Paris 7

October 2008, Valencia

(2)

Summary

(1) Related results

(2) Equational theory of regular languages (3) The profinite world

(4) Examples (5) Duality

(3)

Part I

Equational theory of regular languages

(4)

A virtuous circle

First order logic Star-free

languages

xω = xω+1

Aperiodic monoids

Decidability Sch¨utzenberger

Mc Naughton

(5)

Another virtuous circle

1

formulas Piecewise testable

languages

xω =xω+1 (xy)ω = (yx)ω J-trivial

monoids

Decidability I. Simon

Thomas

(6)

A brief historical perspective

There have been several attempts to find an appropriate framework to state these results

• Eilenberg (1976) + Reiterman (1982)

• Pin (1995) + Pin-Weil (1996)

• Pippenger (1997)

• Pol´ak (2001)

• Esik (2002), Straubing (2002) + Kunc (2003)

(7)

Eilenberg-Reiterman theorem

A variety of languages is a class of regular languages closed under Boolean operations, quotients and inverse of morphisms between free monoids.

Varieties of languages

Profinite identities Varieties of

finite monoids

Decidability

Eilenberg Reiterman

In good cases

(8)

Closure properties of classes of regular languages

• Eilenberg’s varieties: Boolean operations, quotients, inverse of morphisms

• Pin: intersection, union, quotients, inverse of morphisms

• Pippenger: Boolean operations, inverse of morphisms

• Pol´ak: intersection, quotients, inverse of morphisms

• Esik, Straubing: Boolean operations, quotients, inverse of length preserving morphisms

(9)

Part II

Lattices of languages

(10)

Lattices of languages

Let A be a finite alphabet. A lattice of languages is a set of regular languages of A containing ∅ and A and closed under finite intersection and finite union.

Let u and v be words of A. A language L of A satisfies the equation u →v if

u ∈ L ⇒v ∈ L

Let E be a set of equations of the form u →v.

Then the languages of A satisfying the equations of E form a lattice of languages.

(11)

Equational description of finite lattices

Proposition

A finite set of languages of A is a lattice of

languages iff it can be defined by a set of equations of the form u →v with u, v ∈ A.

Therefore, there is an equational theory for finite lattices of languages. What about infinite lattices?

One needs the profinite world...

(12)

Part III

The profinite world

Citation (M. Stone)

A cardinal principle of modern mathematical research may be stated as a maxim: One must always topologize.

(13)

Separating words

A monoid M separates two words u and v of A if there exists a monoid morphism ϕ : A →M such that ϕ(u) 6= ϕ(v).

For instance, the morphism which maps each word onto its length modulo 2 is a morphism from {a, b} onto Z/2Z which separates abaaba and abaabab.

Let M = {(1 00 1),(1 01 0),(0 10 1)} and let

ϕ : {a, b} → M defined by ϕ(a) = (1 01 0) and ϕ(b) = (0 10 1). Then, for all u, ϕ separates ua and ub since ϕ(ua) = ϕ(a) and ϕ(ub) = ϕ(b).

(14)

The profinite metric

Let u and v be two words. Put r(u, v) = min

|M| M is a finite monoid

that separates u and v}

d(u, v) = 2−r(u,v)

Intuitively, two words are close for d if one needs a large monoid to separate them.

Then d is an ultrametric, for which the product of words is uniformly continuous.

(∗) d(x, z) 6 max {d(x, y), d(y, z)}

(15)

The free profinite monoid

The completion of the metric space (A, d) is the free profinite monoid on A and is denoted by Ac. Its elements are called profinite words. Further, if A is finite, Ac is compact.

Any morphism ϕ : A →M, where M is a

(discrete) finite monoid extends in a unique way to a uniformly continuous morphism ϕˆ: Ac → M. Further, a profinite word u is completely determined by its images ϕ(u), whereˆ ϕ runs over the class of morphisms from A onto a finite monoid.

(16)

A nonfinite profinite word

For each u ∈ A, the sequence un! is a Cauchy sequence and hence converges in Ac to a limit, denoted by uω. If ϕ is a morphism from A onto a finite monoid, ϕ(uω) is the unique idempotent xω of the semigroup generated by x = ϕ(u).

1 x x2 x3

. . .

xi+p = xi

xi+1 xi+2

xi+p−1 xω

(17)

Part IV

Equational theory

(18)

Profinite equations

Let (u, v) be a pair of profinite words of Ac. We say that a regular language L of A satisfies the

profinite equation u → v if

u ∈ L ⇒v ∈ L

Let η : A →M be the syntactic morphism of L.

Then L satisfies the profinite equation u → v iff ˆ

η(u) ∈ η(L) ⇒ηˆ(v) ∈ η(L)

(19)

Main result

Given a set E of equations of the form u →v (where u and v are profinite words), the set of all regular languages of A satisfying all the equations of E is called the set of languages defined by E.

Theorem

A set of regular languages of A is a lattice of languages iff it can be defined by a set of equations of the form u →v, where u, v ∈ Ac.

(20)

Equations of the form u 6 v

Let us say that a regular language satisfies the equation u 6v if, for all x, y ∈ Ac, it satisfies the equation xvy →xuy.

Proposition

Let L be a regular language of A, let (M,6L) be its syntactic ordered monoid and let η : A →M be its syntactic morphism. Then L satisfies the

equation u 6v iff η(u)ˆ 6L η(v)ˆ .

(21)

Quotienting algebras of languages

A lattice of languages is a quotienting algebra of languages if it is closed under the quotienting operations L → u−1L and L → Lu−1, for each word u ∈ A.

Theorem

A set of regular languages of A is a quotienting algebra of languages iff it can be defined by a set of equations of the form u 6v, where u, v ∈ Ac.

(22)

Boolean algebras

Let us write

u ↔v for u → v and v → u, u = v for u 6 v and v 6 u.

Theorem

(1) A set of regular languages of A is a Boolean algebra iff it can be defined by a set of

equations of the form u ↔ v.

(2) A set of regular languages of A is a Boolean quotienting algebra iff it can be defined by a set of equations of the form u = v.

(23)

Interpreting equations (Notation: P = η (L))

Closed under Interpretation

∪,∩ u →v η(u)ˆ ∈ P ⇒η(v) ∈ P

quotient u 6 v xvy →xuy

complement (Lc) u ↔v u →v and v → u quotient and Lc u = v xuy ↔xvy

ϕ−1 Variables = words

ϕ−1 length pres. Variables = letters

ϕ−1 length Variables =

multiplying words of the same length

(24)

Languages with zero

A language with zero is a language whose syntactic monoid has a zero. The class of regular languages with zero is closed under Boolean operations and quotients, but not under inverse of morphisms.

Therefore, this class is not a variety, but it can be defined by a set of equations of the form u = v.

Proposition

A regular language has a zero iff it satisfies the equation xρA = ρA = ρAx for all x ∈ A.

(25)

The profinite word ρ

A

Let us fix a total order on the alphabet A. Let u0, u1, . . . be the ordered sequence of all words of A in the induced shortlex order.

1, a, b, aa, ab, ba, bb, aaa, aab, aba, abb, baa, . . . Reilly and Zhang (see also Almeida-Volkov) proved that the sequence (vn)n>0 defined by

v0 = u0, vn+1 = (vnun+1vn)(n+1)!

converges to a profinite word ρA, which is

idempotent and belongs the minimal ideal of Ac.

(26)

Nondense languages

A language L of A is dense if, for each word u ∈ A, L∩AuA 6= ∅.

The languages non-dense or full are closed under finite union, finite intersection and quotients. They are not closed under inverse of morphisms, even for length preserving morphisms.

Theorem

A language of A is non-dense or full iff it satisfies the equations x 6 0 for all x ∈ A.

(27)

Slender or full languages

A regular language is slender iff it is a finite union of languages of the form xuy, where x, u, y ∈ A. Regular slender or full languages form a quotienting lattice of languages, but are not closed under

inverse of morphisms.

Theorem

Suppose that |A| > 2. A regular language of A is slender or full iff it satisfies the equations x 60 for all x ∈ A and the equation xωuyω = 0 for each x, y ∈ A+, u ∈ A such that i(uy) 6= i(x).

(28)

Automata recognizing slender languages

Theorem

A regular language is slender iff its minimal

deterministic automaton does not contai any pair of connected cycles.

u

x y

Two connected cycles, where x, y ∈ A+ and u ∈ A.

(29)

Sparse languages

A regular language is sparse iff it is a finite union of languages of the form u0v1u1· · ·vnun, where u0, v1, . . . , vn, un are words. Regular sparse or full

languages form a quotienting lattice of languages, but are not closed under inverse of morphisms.

Theorem

Suppose that |A| > 2. A regular language of A is sparse or full iff it satisfies the equations x 6 0 for all x ∈ A and the equations (xωyω)ω = 0 for each x, y ∈ A+ such that i(x) 6= i(y).

(30)

Boolean closures

Theorem

Suppose that |A| > 2. A regular language of A is slender or coslender iff its syntactic monoid has a zero and satisfies the equations xωuyω = 0 for each x, y ∈ A+, u ∈ A such that i(uy) 6= i(x).

Theorem

Suppose that |A| > 2. A regular language of A is sparse or cosparse iff its syntactic monoid has a zero and satisfies the equations (xωyω)ω = 0 for each x, y ∈ A+ such that i(x) 6= i(y).

(31)

Duality in a nutshell

The dual space of a distributive lattice is the set of its prime filters.

Elements ←→ Prime filters

Boolean algebras ←→ Topological spaces

Distributive lattices ←→ Ordered topologial spaces Sublattices ←→ Quotient spaces

n-ary operations ←→ (n+ 1)-ary relations

(32)

The dual space of Reg(A

)

Almeida (1989, implicitely) and Pippenger (1997, explicitely) proved:

Theorem

The dual space of the lattice of regular languages is the space of profinite words. Furthermore, the canonical embedding is given by the topologial closure: e(L) = L.

(33)

Some formulas

The maps L 7→ L and K 7→ K ∩A are inverse isomorphisms between the Boolean algebras Reg(A) and Clopen(Ac). In particular, for all L, L1, L2 ∈ Reg(A):

(1) Lc = (L)c,

(2) L1 ∪L2 = L1 ∪L2, (3) L1 ∩L2 = L1 ∩L2,

(4) for all x, y ∈ A, then x−1Ly−1 = x−1Ly−1. (5) If ϕ : A →B is a morphism and

L ∈ Reg(B), then ϕˆ−1(L) = ϕ−1(L).

(34)

Prime filters of Reg(A

)

• Given a prime filter p of Reg(A), there is a unique profinite word u such that, for every

morphism from A onto a finite monoid M, ϕ(u)ˆ is the unique element m ∈ M such that ϕ−1(m) ∈ p.

• If u is a profinite word, the set

pu = { L ∈ Reg(A) | ϕ−1( ˆϕ(u)) ⊆L for some morphism ϕ from A onto a finite monoid } is a prime filter of Reg(A).

(35)

A duality result

The right and left residuals of L by K are:

K\L = {u ∈ A |Ku ⊆ L}

L/K = {u ∈ A |uK ⊆ L}

Theorem

The product on profinite words is the dual of the residuation operations on regular languages.

The identity (H\L)/K = H\(L/K) in Reg(A) is equivalent to stating that the product is associative.

(36)

Back to the proof of the main result

Theorem

A set of regular languages of A is a lattice of languages iff it can be defined by a set of equations of the form u →v, where u, v ∈ Ac.

The proof is an instantiation of the duality between sublattices of Reg(A) and preorders on its dual space Ac.

Let L be a lattice of languages. The preorder

determining the quotient in the dual space is exactly the equational theory of L.

(37)

Summary

• We proved that every lattice of regular

languages is given by an equational theory, a result that subsumes Eilenberg’s variety theorem and its extensions.

• In particular, any class of regular languages defined by a fragment of logic closed under conjunctions and disjunctions (first order, monadic second order, temporal, etc.) admits an equational description.

(38)

The golden circle

Logical fragments

Lattices of languages

Profinite identities

Decidability

Stone-Priestley Duality

Hopefully

(39)

Potential extensions

• One could further extend this result to classes of regular languages only closed under finite intersection by using the syntactic semiring introduced by Pol´ak.

• Our result could also be adapted to languages of infinite words, words over ordinals or linear orders, and even perhaps to tree languages.

Our second main result reveals a strong link between relational semantics for modal logic and the algebraic theory of automata. Further duality results are to be expected. . .

Références

Documents relatifs

To emphasize that the above undecidability result is not due to the fact that our family of languages is too complicated but rather because of the properties of morphisms, we also

It is well-known that a dfa M = (Q, Σ, δ, s, F ) is minimal if all its states are reachable from the starting state and no two its different states are equivalent (states p and q

If n is the number of states of the min- imal deterministic automaton accepting the language, an O(n 3 )-time algorithm is obtained for extensible binary languages in [4], while an

Indeed, a variety of languages is, by Eilenberg theorem, recognised by a variety of finite monoids, which in turn is defined by profinite equations by Reiterman theorem.. Then

Give a context-free grammar generating all well-formed boolean expressions that evaluate to

(Alternatively: there are countably many context-free languages whereas an infinite set has an uncount- able number of subsets! Thanks to Florent Noisette for this hack!).. 7

This means that it is not possible to obtain a recursive bound relating the size of context-free grammars generating regular languages with the number of states of

Since the hypotheses depend only on the abstract representing automaton and not on a particular embedding, we deduce an estimation of the genus of regular languages (Theorem