Duality and equational theory of regular languages

(1)

LIAFA, CNRS and University Paris Diderot

Duality and equational theory of regular languages

Mai Gehrke¹ Serge Grigorieff² Jean-´Eric Pin²

1Radboud Universiteit

2LIAFA, CNRS and Universit´e Paris Diderot

ICALP 2008, Reykjavik

Summary

(1) Related results

(2) Equational theory of regular languages (3) The profinite world

(4) Examples (5) Duality

A virtuous circle

First order logic Star-free

languages

x^ω=x^ω+1

Aperiodic monoids

Decidability Sch¨utzenberger

Mc Naughton

Another virtuous circle

BΣ1

formulas Piecewise testable

languages

x^ω=x^ω+1 (xy)^ω= (yx)^ω J-trivial

monoids

Decidability I. Simon

Thomas

A brief historical perspective

There have been several attempts to find an appropriate frameworkto state these results

• Eilenberg(1976) +Reiterman(1982)

• Pin(1995) +Pin-Weil(1996)

• Pippenger(1997)

• Pol´ak(2001)

• Esik(2002),Straubing(2002) +Kunc(2003)

Eilenberg-Reiterman theorem

Avariety of languagesis a class ofregularlanguages closed underBoolean operations,quotientsand inverse of morphismsbetween free monoids.

Varieties of languages

Profinite identities Varieties of finite monoids

Decidability

Eilenberg Reiterman

In good cases

(2)

Closure properties of classes of regular languages

• Eilenberg’s varieties:Boolean operations, quotients, inverse of morphisms

• Pin:intersection, union, quotients, inverse of morphisms

• Pippenger:Boolean operations, inverse of morphisms

• Pol´ak:intersection, quotients, inverse of morphisms

• Esik, Straubing:Boolean operations, quotients, inverse oflength preservingmorphisms

Lattices of languages

LetAbe a finite alphabet. Alattice of languagesis a set ofregularlanguages ofA^∗containing∅andA^∗ and closed underfinite intersectionandfinite union.

Letuandvbe words ofA^∗. A languageLofA^∗ satisfies the equationu→vif

u∈L⇒v∈L

LetEbe a set of equations of the formu→v.

Then the languages ofA^∗satisfying the equations ofEform alattice of languages.

Equational description of finite lattices

Proposition

A finite set of languages ofA^∗is alattice of languagesiff it can be defined by a set of equations of the formu→vwithu, v∈A^∗.

Therefore, there is an equational theory forfinite latticesof languages. What about infinite lattices?

One needs theprofinite world...

Separating words

A monoidM separatestwo wordsuandvofA^∗if there exists a monoid morphismϕ:A^∗→M such thatϕ(u)6=ϕ(v).

For instance, the morphism which maps each word onto itslength modulo 2is a morphism from{a, b}^∗ ontoZ/2Zwhich separatesabaabaandabaabab.

LetM ={(^{1 0}0 1),(^{1 0}_{1 0}),(^{0 1}_{0 1})}and let ϕ:{a, b}^∗→M defined byϕ(a) = (^{1 0}_{1 0})and ϕ(b) = (^{0 1}_{0 1}). Then, for allu,ϕseparatesuaand ubsinceϕ(ua) =ϕ(a)andϕ(ub) =ϕ(b).

The profinite metric

Letuandvbe two words. Put r(u, v)= min

|M| M is a finite monoid that separatesuandv} d(u, v)=2⁻^r(u,v)

Intuitively, two words are close fordif one needs a largemonoid to separate them.

Thendis anultrametric^∗, for which theproductof words isuniformly continuous.

(∗) d(x, z)6max{d(x, y), d(y, z)}

The free profinite monoid

The completion of the metric space(A^∗, d)is the freeprofinite monoidonAand is denoted byAc^∗. Its elements are calledprofinite words. Further, ifA isfinite,cA^∗iscompact.

Any morphismϕ:A^∗→M, whereM is a (discrete) finite monoid extends in a unique way to auniformly continuousmorphismϕˆ:Ac^∗→M. Further, aprofinite worduiscompletely determined by its imagesϕ(u), whereˆ ϕruns over the class of morphisms fromA^∗onto afinitemonoid.

(3)

A nonfinite profinite word

For eachu∈A^∗, the sequenceu^n!is aCauchy sequenceand hence converges inAc^∗to a limit, denoted byu^ω. Ifϕis a morphism fromA^∗onto a finite monoid,ϕ(u^ω)is theunique idempotentx^ωof the semigroup generated byx=ϕ(u).

1 x x² x³ . . .x^i+p=xⁱ

xⁱ⁺¹ xⁱ⁺²

x^i+p⁻¹ x^ω

Profinite equations

Let(u, v)be a pair of profinite words ofcA^∗. We say that a regular languageLofA^∗satisfies the profinite equationu→vif

u∈L⇒v∈L

Letη:A^∗→M be thesyntactic morphismofL.

ThenLsatisfies the profinite equationu→viff ˆ

η(u)∈η(L)⇒η(v)ˆ ∈η(L)

Main result

Given a setEof equations of the formu→v (whereuandvare profinite words), the set of all regular languages ofA^∗satisfying all the equations ofEis called the set of languagesdefined byE.

Theorem

A set of regular languages ofA^∗is alattice of languagesiff it can be defined by aset of equations of the formu→v, whereu, v∈Ac^∗.

Equations of the form

u6v

Let us say that a regular languagesatisfies the equationu6vif, for allx, y∈Ac^∗, it satisfies the equationxvy→xuy.

Proposition

LetLbe a regular language ofA^∗, let(M,6L)be itssyntactic ordered monoidand letη:A^∗→M be itssyntactic morphism. ThenLsatisfies the equationu6viffη(u)ˆ 6Lη(v).ˆ

Quotienting algebras of languages

A lattice of languages is aquotienting algebra of languagesif it is closed under the quotienting operationsL→u⁻¹LandL→Lu⁻¹, for each wordu∈A^∗.

Theorem

A set of regular languages ofA^∗is aquotienting algebraof languages iff it can be defined by aset of equationsof the formu6v, whereu, v∈Ac^∗.

Boolean algebras

Let us write

u↔v foru→vandv→u, u=v foru6vandv6u.

Theorem

(1) A set of regular languages ofA^∗is aBoolean algebraiff it can be defined by aset of equationsof the formu↔v.

(2) A set of regular languages ofA^∗is aBoolean quotienting algebraiff it can be defined by a set of equationsof the formu=v.

(4)

Interpreting equations (Notation:

P =η(L))

Closed under Interpretation

∪,∩ u→v η(u)ˆ ∈P⇒η(v)∈P

quotient u6v xvy→xuy

complement (L^c) u↔v u→vandv→u quotient andL^c u=v xuy↔xvy

ϕ⁻¹ Variables =words

ϕ⁻¹length pres. Variables =letters

ϕ⁻¹length Variables =

multiplying words of the same length

Languages with zero

Alanguage with zerois a language whosesyntactic monoidhas a zero. The class of regularlanguages with zerois closed underBoolean operationsand quotients, butnotunderinverse of morphisms.

Therefore, this class isnotavariety, but it can be defined by a set of equations of the formu=v.

Proposition

A regular languagehas a zeroiff it satisfies the equationxρA=ρA=ρAxfor allx∈A^∗.

The profinite word

ρA

Let us fix a total order on the alphabetA. Let u0, u1, . . .be the ordered sequence of all words of A^∗in the induced shortlex order.

1, a, b, aa, ab, ba, bb, aaa, aab, aba, abb, baa, . . . Reilly and Zhang (see also Almeida-Volkov) proved that the sequence(vn)n>0defined by

v0=u0, vn+1= (vnun+1vn)^(n+1)!

converges to a profinite wordρA, which is idempotent and belongs theminimal idealofAc^∗.

Nondense languages

A languageLofA^∗isdenseif, for each word u∈A^∗,L∩A^∗uA^∗6=∅.

The languagesnon-dense or fullare closed under finite union,finite intersectionandquotients. They are not closedunderinverse of morphisms, even for length preserving morphisms.

Theorem

A language ofA^∗isnon-dense or fulliff it satisfies the equationsx60for allx∈A^∗.

Slender or full languages

A regular language isslenderiff it is a finite union of languages of the formxu^∗y, wherex, u, y∈A^∗. Regularslender or fulllanguages form a quotienting lattice of languages, but arenot closedunder inverse of morphisms.

Theorem

Suppose that|A|>2. A regular language ofA^∗is slender or fulliff it satisfies the equationsx60for allx∈A^∗and the equationx^ωuy^ω= 0for each x, y∈A⁺,u∈A^∗such thati(uy)6=i(x).

Sparse languages

A regular language issparseiff it is a finite union of languages of the formu0v₁^∗u1· · ·v_n^∗un, whereu0,v1, . . . ,vn,unare words. Regular sparse or full languages form aquotienting latticeof languages, but arenot closedunderinverse of morphisms.

Theorem

Suppose that|A|>2. A regular language ofA^∗is sparse or fulliff it satisfies the equationsx60for allx∈A^∗and the equations(x^ωy^ω)^ω= 0for each x, y∈A⁺such thati(x)6=i(y).

(5)

Boolean closures

Theorem

Suppose that|A|>2. A regular language ofA^∗is slender or coslenderiff its syntactic monoid has a zero and satisfies the equationsx^ωuy^ω= 0for each x, y∈A⁺,u∈A^∗such thati(uy)6=i(x).

Theorem

Suppose that|A|>2. A regular language ofA^∗is sparse or cosparseiff its syntactic monoid has a zero and satisfies the equations(x^ωy^ω)^ω= 0for each

x, y∈A⁺such thati(x)6=i(y). LIAFA, CNRS and University Paris Diderot

Duality in a nutshell

Thedual spaceof a distributive lattice is the set of itsprime filters.

Elements ←→ Prime filters Boolean algebras ←→ Topological spaces Distributive lattices ←→ Ordered topologial spaces

Sublattices ←→ Quotient spaces n-ary operations ←→ (n+ 1)-ary relations

The dual space of

Reg(A^∗)

Almeida(1989, implicitly) andPippenger(1997, explicitly) proved:

Theorem

Thedual spaceof thelattice of regular languagesis the space ofprofinite words. Furthermore, the canonical embedding is given by the topologial closure:e(L) =L.

Prime filters of

Reg(A^∗)

•Given aprime filterpofReg(A^∗), there is a uniqueprofinite wordusuch that, for every morphism fromA^∗onto a finite monoidM,ϕ(u)ˆ is the unique elementm∈M such thatϕ⁻¹(m)∈p.

•Ifuis aprofinite word, the set

pu={L∈Reg(A^∗)|ϕ⁻¹( ˆϕ(u))⊆Lfor some morphismϕfromA^∗onto afinitemonoid} is aprime filterofReg(A^∗).

A duality result

Therightandleft residualsofLbyKare:

K\L={u∈A^∗|Ku⊆L} L/K={u∈A^∗|uK⊆L}

Theorem

Theproducton profinite words is the dual of the residuation operationson regular languages.

The identity(H\L)/K=H\(L/K)inReg(A^∗)is equivalent to stating that the product is associative.

Back to the proof of the main result

Theorem

A set of regular languages ofA^∗is alattice of languagesiff it can be defined by aset of equations of the formu→v, whereu, v∈Ac^∗.

The proof is an instantiation of thedualitybetween sublattices ofReg(A^∗)and preorders on its dual spaceAc^∗.

LetLbe a lattice of languages. The preorder determining thequotientin the dual space is exactly theequational theoryofL.

(6)

Summary

• We proved that everylattice of regular languagesis given by anequational theory, a result that subsumesEilenberg’s variety theoremand itsextensions.

• In particular, any class of regular languages defined by afragment of logicclosed under conjunctionsanddisjunctions(first order, monadic second order, temporal, etc.) admits anequational description.

The golden circle

Logical fragments

Lattices of languages

Profinite identities

Decidability

Stone-Priestley Duality

Hopefully

The golden circle

Logical fragments

Lattices of languages

Profinite identities

Decidability

Stone-Priestley Duality

Hopefully

Potential extensions

• One could further extend this result to classes of regular languages only closed underfinite intersectionby using thesyntactic semiring introduced by Pol´ak.

• Our result could also be adapted to languages ofinfinite words, words overordinalsorlinear orders, and even perhaps totreelanguages.

Our second main result reveals a strong link between relational semantics for modal logic and the algebraic theory of automata. Furtherduality resultsare to be expected. . .