
HAL Id: hal-00620805

https://hal-upec-upem.archives-ouvertes.fr/hal-00620805

Submitted on 24 Feb 2013


Enumerative combinatorics on words

Dominique Perrin

To cite this version:

Dominique Perrin. Enumerative combinatorics on words. Crapo Henri, Rota Gian-Carlo. Algebraic Combinatorics and Computer Science, Springer-Verlag, pp. 391-430, 2001. ⟨hal-00620805⟩


Dominique Perrin

Institut Gaspard Monge, Université de Marne-la-Vallée,
77454 Marne-la-Vallée Cedex 2, France.
perrin@univ-mlv.fr

Abstract

We present the state of the art in the field of generating series for formal languages. The emphasis is on regular languages and rational series. The paper covers aspects including regular trees and the Kraft-McMillan inequality as well as necklaces and zeta functions.

Contents

1 Introduction
2 Regular sequences and automata
  2.1 Regular sequences
  2.2 Finite automata
  2.3 Beyond regular sequences
3 Enumeration on regular trees
  3.1 Graphs and trees
  3.2 Regular sequences and trees
  3.3 Approximate eigenvector
  3.4 The multiset construction
  3.5 Generating sequence of leaves
  3.6 Generating sequence of nodes
4 Generating sequences of prefix codes
  4.1 Trees and prefix codes
  4.2 Bifix codes
  5.1 Subshifts of finite type
  5.2 Circular codes
  5.3 Zeta functions

1 Introduction

Generating series, also called generating functions, play an important role in combinatorial mathematics. Many enumeration problems can be solved by transferring the basic operations on sets into algebraic operations on formal series, leading to a solution of the enumeration problem. The famous paper by Doubilet, Rota and Stanley, 'The idea of generating function' [41], places the subject in a general mathematical frame allowing one to present in a unified way the diverse sorts of generating functions, from the ordinary ones to the exponential or even Dirichlet ones.

Their place within the field of combinatorics on words is particular. It was indeed M. P. Schützenberger's point of view that sets of words can be considered as series in several non-commutative variables. The generating series of the set then appears as the image of the non-commutative series through a homomorphism. This gives rise to a rich domain in which an interplay between classical commutative algebra and combinatorics on words is present.

In these lectures, I will survey several aspects of these generating functions on words. The emphasis is on the most elementary case, corresponding to sets of words which can be defined using a finite automaton, usually called regular. The corresponding series are actually rational. Two special cases will be considered in turn. The first one is the case of sets of words corresponding to leaves in a tree, usually called prefix codes. A recent result due to Frédérique Bassino, Marie-Pierre Béal and myself [10] is presented. It completely characterizes the generating series of regular prefix codes. The second one is the case of sets of words considered up to a cyclic permutation, often called necklaces. The corresponding generating series are the zeta functions of symbolic dynamics.

A word on the terminology used here. We constantly use the term regular where a richer terminology is often used. In particular, what we call here a regular sequence is, in Eilenberg's terminology, an N-rational sequence (see [22], [42] or [18]).

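2 Regular sequences and automata

We consider the set $A^*$ of all words on a given alphabet $A$. A subset of $A^*$ is often called a formal language. For sets $X, Y \subset A^*$, we denote

$$X + Y = X \cup Y,$$
$$XY = \{xy \mid x \in X,\ y \in Y\},$$
$$X^* = \{x_1 x_2 \cdots x_n \mid x_i \in X,\ n \ge 0\}.$$

We say that the pair $(X, Y)$ is unambiguous if for each $z \in XY$ there is at most one pair $(x, y) \in X \times Y$ such that $z = xy$.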

We say that a set of nonempty words $X$ is a code if for each $x \in X^*$ there is at most one sequence $(x_1, x_2, \ldots, x_n)$ with $x_i \in X$ such that $x = x_1 x_2 \cdots x_n$ (one also says that $X$ is uniquely decipherable). A particular case of a code is a prefix code. It is a set of words $X$ such that no element of $X$ is a prefix of another one. It is easy to see that such a set is either reduced to the empty word or does not contain the empty word, and is then a code.

The length distribution of a set of words $X$ is the sequence $u_X = (u_n)_{n \ge 0}$ with

$$u_n = \mathrm{Card}(X \cap A^n).$$

We denote by $u_X(z)$ the formal series

$$u_X(z) = \sum_{n \ge 0} u_n z^n,$$

which is the ordinary generating series of the sequence $u_X$. For example, the length distribution of $X = A^*$ is $u_X(z) = \frac{1}{1 - kz}$ where $k = \mathrm{Card}(A)$.

The entropy of a formal language $X$ is

$$h(X) = \log(1/\rho),$$

where $\rho$ is the radius of convergence of the series $u_X(z)$. It is well defined provided $X$ is infinite and thus $\rho$ is finite. If the alphabet $A$ has $k$ elements, we have $h(X) \le \log k$.

The following result relates the basic operations on sets with operations on series.

Proposition 1 The following properties hold for any subsets $X, Y$ of $A^*$.

(i) If $X \cap Y = \emptyset$, then $u_{X+Y} = u_X + u_Y$.

(ii) If the pair $(X, Y)$ is unambiguous, then $u_{XY} = u_X u_Y$.

(iii) If $X$ is a code, then $u_{X^*} = 1/(1 - u_X)$.

Proof. The first two formulae are clear. If $X$ is a code, every word in $X^*$ has a unique decomposition as a product of words in $X$. This implies that $u_{X^n} = (u_X)^n$ and thus

$$u_{X^*} = 1 + u_X + \cdots + u_{X^n} + \cdots = 1/(1 - u_X).$$

Example 1 The set $X = \{b, ab\}$ is a prefix code. The series $u_{X^*}$ is

$$u_{X^*}(z) = \frac{1}{1 - z - z^2}.$$

Let $(F_n)_{n \ge 0}$ be the sequence of Fibonacci numbers defined by $F_0 = 0$, $F_1 = 1$, and $F_{n+2} = F_{n+1} + F_n$. It follows from the recurrence relation that

$$\frac{z}{1 - z - z^2} = \sum_{n \ge 0} F_n z^n.$$

Consequently, $u_{X^*}(z) = \sum_{n \ge 0} F_{n+1} z^n$. It can also be proved by a combinatorial argument that the number of words of length $n$ in $X^*$ is $F_{n+1}$.

There are several variants of the generating series considered above. One may first define

$$p_X(z) = \sum_{n \ge 0} \frac{u_n}{k^n} z^n,$$

where $k = \mathrm{Card}(A)$. The coefficient of $z^n$ in $p_X(z)$ is the probability for a word of length $n$ to be in the set $X$. The relation between $u_X$ and $p_X$ is simple since $p_X(z) = u_X(z/k)$. Another variant of the generating series is the exponential generating series of the sequence $(u_n)_{n \ge 0}$, defined as

$$e(z) = \sum_{n \ge 0} \frac{u_n}{n!} z^n.$$

We will also use the zeta function of a sequence $(u_n)_{n \ge 1}$, defined as

$$\zeta(z) = \exp \sum_{n \ge 1} \frac{u_n}{n} z^n.$$
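As a small worked instance of the last definition (an illustration added here, under the assumption that the sequence is $u_n = k^n$, the length distribution of $A^*$ over a $k$-letter alphabet), the zeta function is again rational, by the expansion $\sum_{n \ge 1} x^n/n = -\log(1 - x)$:

```latex
% Assumption for illustration: u_n = k^n for n >= 1
\zeta(z) = \exp \sum_{n \ge 1} \frac{k^n}{n} z^n
         = \exp\bigl(-\log(1 - kz)\bigr)
         = \frac{1}{1 - kz}.
```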

2.1 Regular sequences

We consider sequences of natural integers $s = (s_n)_{n \ge 0}$. We shall not distinguish between such a sequence and the formal series $s(z) = \sum_{n \ge 0} s_n z^n$.

We usually denote a vector indexed by elements of a set $Q$, also called a $Q$-vector, with boldface symbols. For $\mathbf{v} = (v_q)_{q \in Q}$ we say that $\mathbf{v}$ is nonnegative, denoted $\mathbf{v} \ge 0$ (resp. positive, denoted $\mathbf{v} > 0$), if $v_q \ge 0$ (resp. $v_q > 0$) for all $q \in Q$. The same conventions are used for matrices. A nonnegative $Q \times Q$-matrix $M$ is said to be irreducible if, for all indices $p, q$, there is an integer $m$ such that $(M^m)_{p,q} > 0$. The matrix is primitive if there is an integer $m$ such that $M^m > 0$.

The adjacency matrix of a graph $G = (Q, E)$ is the $Q \times Q$-matrix $M$ such that for each $p, q \in Q$, the integer $M_{p,q}$ is the number of edges from $p$ to $q$. The adjacency matrix of a graph $G$ is irreducible iff the graph is strongly connected. It is primitive if, moreover, the g.c.d. of the lengths of the cycles in $G$ is 1.

Let $G$ be a finite graph and let $I$, $T$ be two sets of vertices. For each $n \ge 0$, let $s_n$ be the number of distinct paths of length $n$ from a vertex of $I$ to a vertex of $T$. The sequence $s = (s_n)_{n \ge 0}$ is called the sequence recognized by $(G, I, T)$, or also by $G$ if $I$ and $T$ are already specified. When $I = \{i\}$ and $T = \{t\}$, we simply write $(G, i, t)$ instead of $(G, \{i\}, \{t\})$.

A sequence $s = (s_n)_{n \ge 0}$ of nonnegative integers is said to be regular if it is recognized by such a triple $(G, I, T)$, where $G$ is finite. We say that the triple $(G, I, T)$ is a representation of the sequence $s$. The vertices of $I$ are called initial and those of $T$ terminal. Two representations are said to be equivalent if they recognize the same sequence.

A representation $(G, I, T)$ is said to be trim if every vertex of $G$ is on some path from $I$ to $T$. It is clear that any representation is equivalent to a trim one.

A well-known result in the theory of finite automata allows one to use a particular representation of any regular sequence $s$ such that $s_0 = 0$. One can always choose in this case a representation $(G, i, t)$ of $s$ with a unique initial vertex $i$ and a unique final vertex $t \ne i$ such that no edge enters vertex $i$ and no edge leaves vertex $t$. Such a representation is called a normalized representation (see for example [37], page 14).

Let $(G, i, t)$ be a trim normalized representation. If we merge the initial vertex $i$ and the final vertex $t$ into a single vertex still denoted by $i$, we obtain a new graph denoted by $\bar{G}$, which is strongly connected. The triple $(\bar{G}, i, i)$ is called the closure of $(G, i, t)$.

Let $s$ be a regular sequence such that $s_0 = 0$. The star $s^*$ of the sequence $s$ is defined by

$$s^*(z) = \frac{1}{1 - s(z)}.$$

Proposition 2 If $(G, i, t)$ is a normalized representation of $s$, its closure $(\bar{G}, i, i)$ recognizes the sequence $s^*$.

Proof. The sequence $s$ is the length distribution of the paths of first return to vertex $i$ in $\bar{G}$, that is, of the finite paths going from $i$ to $i$ without going through vertex $i$ in between. The length distribution of the set of all returns to $i$ is thus $1 + s(z) + s^2(z) + \cdots = 1/(1 - s(z))$.

An equivalent definition of regular sequences uses vectors instead of the sets $I, T$. Let $\mathbf{i}$ be a $Q$-row vector of nonnegative integers and let $\mathbf{t}$ be a $Q$-column vector of nonnegative integers. We say that $(G, \mathbf{i}, \mathbf{t})$ recognizes the sequence $s = (s_n)_{n \ge 0}$ if for each integer $n \ge 0$

$$s_n = \mathbf{i} M^n \mathbf{t},$$

where $M$ is the adjacency matrix of $G$. The proof that both definitions are equivalent follows from the fact that the family of regular sequences is closed under addition (see [22]). A triple $(G, \mathbf{i}, \mathbf{t})$ recognizing a sequence $s$ is also called a representation of $s$, and two representations are called equivalent if they recognize the same sequence.

A sequence $s = (s_n)_{n \ge 0}$ of nonnegative integers is rational if it satisfies a recurrence relation with integral coefficients. Equivalently, $s$ is rational if there exist two polynomials $p(z), q(z)$ with integral coefficients and with $q(0) = 1$ such that

$$s(z) = \frac{p(z)}{q(z)}.$$

Figure 1: The Fibonacci graph.

For example, the sequence $s$ defined by $s(z) = \frac{z}{1 - z - z^2}$ is the sequence of Fibonacci numbers, also defined by $s_0 = 0$, $s_1 = 1$ and $s_{n+1} = s_n + s_{n-1}$. It is recognized by the graph of Figure 1 with $I = \{1\}$ and $T = \{2\}$.
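The formula $s_n = \mathbf{i} M^n \mathbf{t}$ is easy to evaluate numerically. The sketch below is an illustration only: since Figure 1 is not reproduced here, the edge set (a loop on vertex 1, an edge from 1 to 2 and an edge from 2 to 1) is an assumption chosen so that the paths from 1 to 2 are counted by the Fibonacci numbers, and the function names are hypothetical.

```python
def mat_vec(M, v):
    """Product of a square matrix (list of rows) with a column vector."""
    return [sum(M[p][q] * v[q] for q in range(len(v))) for p in range(len(M))]

def recognized_sequence(M, i, t, length):
    """Return [i M^n t for n in range(length)]."""
    seq = []
    col = list(t)  # holds M^n t at step n
    for _ in range(length):
        seq.append(sum(i[q] * col[q] for q in range(len(col))))
        col = mat_vec(M, col)
    return seq

# Assumed adjacency matrix of the Fibonacci graph on vertices 1 and 2
M = [[1, 1],
     [1, 0]]
i = [1, 0]  # characteristic row vector of I = {1}
t = [0, 1]  # characteristic column vector of T = {2}
fib = recognized_sequence(M, i, t, 10)
```

Iterating on the column vector avoids forming the powers $M^n$ explicitly.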

[...] Section 3.6).

A theorem of Soittola [42], also found independently in [27], characterizes those rational sequences which are regular. We say that a rational sequence has a dominating root either if it is a polynomial or if it has a real positive pole which is strictly smaller than the modulus of any other one. A sequence $r$ is a merge of the sequences $r_i$ if there is an integer $p$ such that

$$r(z) = \sum_{i=0}^{p-1} z^i r_i(z^p).$$

Theorem 1 (Soittola) A sequence of nonnegative integers $r = (r_n)_{n \ge 0}$ is regular if and only if it is a merge of rational sequences having a dominating root.

This result shows that it is decidable whether a rational series is regular (see [42]). In the positive case, there is an algorithm computing a representation of the sequence.

2.2 Finite automata

We present here a brief introduction to the concepts used in automata theory. For a general reference, see [38] or [22].

An automaton $\mathcal{A}$ over the alphabet $A$ is composed of a set $Q$ of states, a set $E \subset Q \times A \times Q$ of edges or transitions, and two sets $I, T \subset Q$ of initial and terminal states.

A path in the automaton $\mathcal{A}$ is a sequence

$$(p_1, a_1, p_2), (p_2, a_2, p_3), \ldots, (p_n, a_n, p_{n+1})$$

of consecutive edges. Its label is the word $x = a_1 a_2 \cdots a_n$. A path is successful if it starts in an initial state and ends in a terminal state. The set recognized by the automaton is the set of labels of its successful paths.

An automaton is deterministic if, for each state $p$ and each letter $a$, there is at most one edge which starts at $p$ and is labeled by $a$. The term right resolving is also used.

Example 2 Let $\mathcal{A}$ be the automaton given in Figure 2 with 1 as unique initial and terminal state. It recognizes the set $X^*$ where $X$ is the prefix code $X = \{b, ab\}$.

Figure 2: Golden mean automaton.

A set of words $X$ over $A$ is regular if it can be recognized by a finite automaton.

It is a classical result that a set of words is regular iff it can be obtained by a finite number of operations union, product and star, starting from the finite sets.

The following result is also classical (see [22] for example).

Proposition 3 Every regular set can be recognized by a finite deterministic automaton having a unique initial state.

The following theorem is of fundamental importance. It belongs to the early folklore of automata theory.

Theorem 2 The length distributions of regular sets are the regular sequences.

Proof. Let $X$ be a regular set. By Proposition 3, it can be recognized by a deterministic automaton $\mathcal{A}$. Since $\mathcal{A}$ is deterministic, there is at most one path with given label, origin and end. Thus the number of paths of length $n$ from the initial state to a terminal state is equal to the number $u_n$ of words of $X$ of length $n$.

Conversely, let $u$ be a regular sequence enumerating the paths in a graph $G$ from $I$ to $T$. We consider the graph $G$ as an automaton with all edges carrying distinct labels. Let $X$ be the set of labels of paths from $I$ to $T$. The sequence $u$ is the length distribution of the set $X$.

Example 3 If $X = a^* b$, then

$$u_X(z) = \frac{z}{1 - z}.$$
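The first half of the proof of Theorem 2 translates directly into a counting procedure: in a deterministic automaton, counting accepted words of length $n$ is the same as counting successful paths of length $n$. The sketch below is an illustration under assumptions: the function name is hypothetical, and the edge lists are guesses consistent with the text (Figure 2 is not reproduced here), namely edges $1 \xrightarrow{b} 1$, $1 \xrightarrow{a} 2$, $2 \xrightarrow{b} 1$ for the golden mean automaton and a two-state automaton for $a^*b$.

```python
from collections import Counter

def length_distribution(edges, initial, terminals, length):
    """Number of successful paths of each length 0..length-1.

    For a deterministic automaton with a unique initial state this is
    also the number of accepted words of each length.
    """
    weights = Counter({initial: 1})  # paths of the current length, by end state
    dist = []
    for _ in range(length):
        dist.append(sum(c for state, c in weights.items() if state in terminals))
        nxt = Counter()
        for (p, _letter, q) in edges:
            nxt[q] += weights[p]
        weights = nxt
    return dist

# Assumed edges of the golden mean automaton (Example 2)
golden = [(1, "b", 1), (1, "a", 2), (2, "b", 1)]
golden_dist = length_distribution(golden, 1, {1}, 8)

# Assumed automaton for a*b (Example 3): state 0 initial, state 1 terminal
a_star_b = [(0, "a", 0), (0, "b", 1)]
a_star_b_dist = length_distribution(a_star_b, 0, {1}, 6)
```

The second result matches the coefficients of $z/(1 - z)$, and the first those of $1/(1 - z - z^2)$.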

2.3 Beyond regular sequences

There are several natural classes of series beyond the rational ones. The algebraic series are those satisfying an algebraic equation. More generally, [...] terms is given by a rational fraction (see [26]).

The class of algebraic series is linked with the class of context-free sets (see [23]). A typical example of a context-free set is the set of words on the binary alphabet $\{a, b\}$ having as many $a$'s as $b$'s. We compute below its length distribution, which is an algebraic series.

Example 4 The set of words on $A = \{a, b\}$ having an equal number of occurrences of $a$ and $b$ is a submonoid of $A^*$ generated by a prefix code $D$. Since any word of $D^*$ of length $2n$ is obtained by choosing $n$ positions among $2n$, we have

$$u_{D^*}(z) = \sum_{n \ge 0} \binom{2n}{n} z^{2n}.$$

By a simple application of the binomial formula, we obtain

$$u_{D^*}(z) = (1 - 4z^2)^{-1/2}.$$

This follows indeed from the simple identity

$$\binom{-1/2}{n} = \frac{1}{(-4)^n} \binom{2n}{n}.$$

We have $u_D(z) = 1 - 1/u_{D^*}(z)$ and thus

$$u_D(z) = 1 - \sqrt{1 - 4z^2}.$$

Thus $u_D(z)$ is an algebraic series, solution of the equation

$$f^2 - 2f + 4z^2 = 0.$$
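The computation of Example 4 can be replayed on truncated power series with exact integer arithmetic. This is a sketch for checking purposes only: the helper names and the truncation order `ORDER` are choices made here, not part of the original text.

```python
from math import comb

def series_inverse(a, n):
    """First n coefficients of 1/a(z), assuming a[0] == 1."""
    inv = [1] + [0] * (n - 1)
    for j in range(1, n):
        inv[j] = -sum(a[i] * inv[j - i] for i in range(1, j + 1))
    return inv

def series_product(a, b, n):
    """First n coefficients of a(z) * b(z)."""
    return [sum(a[i] * b[j - i] for i in range(j + 1)) for j in range(n)]

ORDER = 12
# u_{D*}(z) = sum_n binom(2n, n) z^{2n}
u_D_star = [comb(j, j // 2) if j % 2 == 0 else 0 for j in range(ORDER)]
inverse = series_inverse(u_D_star, ORDER)
# u_D = 1 - 1/u_{D*}
u_D = [(1 if j == 0 else 0) - inverse[j] for j in range(ORDER)]

# Residual of the algebraic equation f^2 - 2f + 4z^2 = 0, coefficient by coefficient
square = series_product(u_D, u_D, ORDER)
residual = [square[j] - 2 * u_D[j] + (4 if j == 2 else 0) for j in range(ORDER)]
```

The first nonzero coefficient, 2 in degree 2, corresponds to the two words $ab$ and $ba$ of $D$.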

3 Enumeration on regular trees

We now turn to the study of generating sequences linked with trees. Actually, we do not enumerate trees but objects within a tree, like the nodes or the leaves at each level. This is actually equivalent to the enumeration of particular sets of words, namely prefix-closed sets and prefix codes, as we shall see below (Section 4).

3.1 Graphs and trees

In this paper, we use directed multigraphs, i.e. graphs with possibly several edges with the same origin and the same end. We simply call them graphs in all that follows. We denote by $G = (Q, E)$ a graph with $Q$ as set of vertices and $E$ as set of edges. We also say that $G$ is a graph on the set $Q$.

A tree $T$ on a set of nodes $N$ with a root $r \in N$ is a function $T : N - \{r\} \to N$ which associates to each node distinct from the root its father $T(n)$, in such a way that, for each node $n$, there is a nonnegative integer $h$ such that $T^h(n) = r$. The integer $h$ is the height of the node $n$.

A tree is $k$-ary if each node has at most $k$ children. A node without children is called a leaf. A node which is not a leaf is called internal. A node $n$ is a descendant of a node $m$ if $m = T^h(n)$ for some $h \ge 0$. A $k$-ary tree is complete if all internal nodes have exactly $k$ children and have at least one descendant which is a leaf.

For each node $n$ of a tree $T$, the subtree rooted at $n$, denoted $T_n$, is the tree obtained by restricting the set of nodes to the descendants of $n$.

Two trees $S, T$ are isomorphic, denoted $S \equiv T$, if there is a map which transforms $S$ into $T$ by permuting the children of each node. Equivalently, $S \equiv T$ if there is a bijective map $f : N \to M$ from the set of nodes of $S$ onto the set of nodes of $T$ such that $f \circ S = T \circ f$. Such a map $f$ is called an isomorphism.

If $T$ is a tree with $N$ as set of nodes, the quotient graph of $T$ is the graph $G = (Q, E)$ where $Q$ and $E$ are defined as follows. The set $Q$ is the quotient of $N$ by the equivalence $n \equiv m$ if $T_n \equiv T_m$. Let $\bar{m}$ denote the class of a node $m$. The number of edges from $\bar{m}$ to $\bar{n}$ is the number of children of $m$ equivalent to $n$.

Conversely, the set of paths in a graph with given origin is a tree. Indeed, let $G = (Q, E)$ be a graph. Let $r \in Q$ be a particular vertex and let $N$ be the set of paths in $G$ starting at $r$. The tree $T$ having $N$ as set of nodes and such that $T(p_0, p_1, \ldots, p_n) = (p_0, p_1, \ldots, p_{n-1})$ is called the covering tree of $G$ starting at $r$.

Both constructions are mutually inverse in the sense that any tree $T$ is isomorphic to the covering tree of its quotient graph starting at the image of the root.

Proposition 4 Let $T$ be a tree with root $r$. Let $G$ be its quotient graph and let $i$ be the vertex of $G$ which is the class of the root of $T$. For each vertex $q$ of $G$ and for each $n \ge 0$, the number of paths of length $n$ from $i$ to $q$ is equal to the number of nodes of $T$ at height $n$ in the class of $q$.

A tree is said to be regular if it admits only a finite number of non-isomorphic subtrees, i.e. if its quotient graph is finite.

Figure 3: A regular tree.

Figure 4: And its quotient graph.

For example, the infinite tree represented in Figure 3 is a regular tree. Its quotient graph is represented in Figure 4.

3.2 Regular sequences and trees

If $T$ is a tree, its generating sequence of leaves is the sequence of numbers $s = (s_n)_{n \ge 0}$, where $s_n$ is the number of leaves at height $n$. We also simply say that $s$ is the generating sequence of $T$.

The following result is a direct consequence of the definitions.

Theorem 3 The generating sequence of a regular tree is a regular sequence.

Proof. Let $T$ be a regular tree and let $G$ be its quotient graph. Since $T$ is regular, $G$ is finite. The leaves of $T$ form an equivalence class $t$. By Proposition 4, the generating sequence of $T$ is recognized by $(G, i, t)$, where $i$ is the class of the root of $T$.

We say that a sequence $s = (s_n)_{n \ge 0}$ satisfies the Kraft inequality for the integer $k$ if

$$\sum_{n \ge 0} s_n k^{-n} \le 1,$$

i.e., using the formal series $s(z) = \sum_{n \ge 0} s_n z^n$, if

$$s(1/k) \le 1.$$

We say that $s$ satisfies the strict Kraft inequality for $k$ if $s(1/k) < 1$. The following result is well known (see [4], page 35, for example).

Theorem 4 A sequence $s$ is the generating sequence of a $k$-ary tree iff it satisfies the Kraft inequality for the integer $k$.
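For a finite sequence, the "if" direction of Theorem 4 can be made concrete by building a prefix code (the set of leaves of a $k$-ary tree) with the prescribed number of words of each length. The sketch below uses the standard canonical-code construction, which is an assumption made here and not necessarily the construction used in the proof referred to in the text; the function names are hypothetical.

```python
from fractions import Fraction

def kraft_sum(s, k):
    """Exact value of sum_n s[n] * k**(-n) for a finite sequence s."""
    return sum(Fraction(c, k ** n) for n, c in enumerate(s))

def prefix_code(s, k):
    """Prefix code over the digits 0..k-1 with s[n] words of length n.

    Words are produced in increasing length; `value` is the next free
    word, read as an integer written in base k on n digits.
    """
    if kraft_sum(s, k) > 1:
        raise ValueError("Kraft inequality violated")
    words = []
    value = 0
    for n, count in enumerate(s):
        if n > 0:
            value *= k  # move to the next level of the tree
        for _ in range(count):
            digits, v = [], value
            for _position in range(n):
                digits.append(v % k)
                v //= k
            words.append(tuple(reversed(digits)))
            value += 1
    return words

# One leaf at height 1, one at height 2, two at height 3 (binary tree)
words = prefix_code([0, 1, 1, 2], 2)
```

Each word is the path from the root to a leaf, so the resulting tree has `s[n]` leaves at height `n`.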

Let us consider the case of equality in Kraft's inequality. If $s(1/k) = 1$, then any tree $T$ having $s$ as generating sequence is complete. The converse property is not true in general (see [22], p. 231). However, it is a classical result that when $T$ is a complete regular tree, its generating sequence satisfies $s(1/k) = 1$ (see Proposition 8).

For the sake of a complete description of the construction described above in the proof of Theorem 4, we have to specify the choice made at each step among the leaves at height $n$. A possible policy is to choose to give as many children as possible to the nodes which are not leaves and of maximal height.

If we start with a finite sequence $s$ satisfying Kraft's inequality, the above method builds a finite tree with generating sequence equal to $s$. It is not true that this incremental method gives a regular tree when we start with a regular sequence, as shown in the following example.

Let $s(z) = z^2/(1 - 2z^2)$. Since $s(1/2) = 1/2$, we may apply the Kraft construction to build a binary tree with length distribution $s$. The result is the tree $T(X)$ where $X$ is the set of prefixes of the set

$$Y = \bigcup_{n \ge 0} 0 1^n 0 \{0, 1\}^n,$$

which is not regular.

If $s$ is a regular sequence such that $s_0 = 0$, there exists a regular tree $T$ having $s$ as generating sequence. Indeed, let $(G, i, t)$ be a normalized representation of $s$. The generating sequence of the covering tree of $G$ starting at $i$ is $s$. It is however not true that the regular covering tree obtained is $k$-ary, as shown in the following example.

Let $s$ be the regular sequence recognized by the graph of Figure 5 on the left with $i = 1$ and $t = 4$. We have $s(z) = 3z^2/(1 - z^2)$. Furthermore $s(1/2) = 1$ and thus $s$ satisfies Kraft's equality for $k = 2$. However there are four edges going out of vertex 2 and its regular covering tree starting at 1 is 4-ary. A solution for this example is given by the graph of Figure 5 on the right. It recognizes $s$ and its covering tree starting at 1 is the regular binary tree of Figure 3.

Figure 5: Graphs recognizing $s(z) = 3z^2/(1 - z^2)$.

The aim of Section 3.5 is to build, from a regular sequence $s$ that satisfies the Kraft inequality for an integer $k$, a tree with generating sequence $s$ which is both regular and $k$-ary.

3.3 Approximate eigenvector

Let $M$ be the adjacency matrix of a graph $G$. By the Perron-Frobenius theorem (see [25] for a general presentation and [30], [28] or [11] for the link with graphs and regular sequences), the nonnegative matrix $M$ has a nonnegative real eigenvalue of maximal modulus denoted by $\lambda$, also called the spectral radius of the matrix.

When $G$ is strongly connected, the matrix is irreducible and the Perron-Frobenius theorem asserts that the dimension of the eigenspace of the matrix $M$ corresponding to $\lambda$ is equal to one, and that there is a positive eigenvector associated to $\lambda$.

Let $k$ be an integer. A $k$-approximate eigenvector of a nonnegative matrix $M$ is, by definition, an integral vector $\mathbf{v} \ge 0$ such that

$$M\mathbf{v} \le k\mathbf{v}.$$

One has the following result (see [30], p. 152).

Proposition 5 An irreducible nonnegative matrix $M$ with spectral radius $\lambda$ admits a positive $k$-approximate eigenvector iff $k \ge \lambda$.

For a proof, see [30], p. 152. When $M$ is the adjacency matrix of a graph $G$, we also say that $\mathbf{v}$ is a $k$-approximate eigenvector of $G$. The computation of an approximate eigenvector can be obtained by the use of Franaszek's algorithm (see for example [30]). It can be shown that there exists a $k$-approximate eigenvector with elements bounded above by $k^{2n}$ where $n$ is the dimension of $M$ [5]. Thus the size of the coefficients of a $k$-approximate eigenvector is bounded above by an exponential in $n$ and can be, in the worst case, of this order of magnitude.
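A positive $k$-approximate eigenvector can be searched for by a simple monotone iteration. The sketch below is not Franaszek's algorithm as presented in [30]; it is a swapped-in fixed-point iteration, stated under the assumption that a positive integral solution exists. Starting from the all-ones vector, each round raises $\mathbf{v}$ to $\max(\mathbf{v}, \lceil M\mathbf{v}/k \rceil)$; by induction the iterate stays below any positive solution, so it stops at the smallest one. The function name, the round limit and the 3-vertex test matrix are hypothetical.

```python
def approximate_eigenvector(M, k, max_rounds=10_000):
    """Smallest positive integral v with M v <= k v, or None if not found.

    M is a square nonnegative integer matrix given as a list of rows.
    """
    n = len(M)
    v = [1] * n
    for _ in range(max_rounds):
        Mv = [sum(M[p][q] * v[q] for q in range(n)) for p in range(n)]
        # -(-x // k) is the ceiling of x / k for integers
        new_v = [max(v[p], -(-Mv[p] // k)) for p in range(n)]
        if new_v == v:
            return v  # fixed point: ceil(M v / k) <= v, hence M v <= k v
        v = new_v
    return None  # no fixed point reached within the allotted rounds

# Hypothetical strongly connected graph on 3 vertices with spectral radius 2
M_example = [[0, 3, 0],
             [1, 0, 1],
             [0, 1, 0]]
v_example = approximate_eigenvector(M_example, 2)
```

When $k$ is smaller than the spectral radius of an irreducible matrix, no positive solution exists and the iteration grows without bound, which is why the round limit is needed.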

The following result is well known. It links the radius of convergence of a sequence with the spectral radius of the associated matrix.

Proposition 6 Let $s$ be a regular sequence recognized by a trim representation $(G, I, T)$. Let $M$ be the adjacency matrix of $G$. The radius of convergence of $s$ is the inverse of the maximal eigenvalue of $M$.

Proof. The maximal eigenvalue of $M$ is $\lambda = \limsup_{n \ge 0} \sqrt[n]{\|M^n\|}$, where $\|\cdot\|$ is any of the equivalent matrix norms. Let $\rho$ be the radius of convergence of $s$ and, for each $p, q \in Q$, let $\rho_{pq}$ be the radius of convergence of the sequence $u_{pq} = (M^n_{pq})_{n \ge 0}$. Then $1/\lambda = \min \rho_{pq}$. Since $(G, I, T)$ is trim, we have $\rho \le \rho_{pq}$ for all $p, q \in Q$. On the other hand, $\rho \ge \min \rho_{pq}$ since $s$ is a sum of some of the sequences $u_{pq}$. Thus $\rho = \min \rho_{pq}$, which concludes the proof.

As a consequence of this result, the radius of convergence $\rho$ of a regular sequence $s$ is a pole. Indeed, with the above notation, $s(z) = \mathbf{i}(1 - Mz)^{-1}\mathbf{t}$. Since $\det(I - Mz)$ is a denominator of the rational fraction $s$, the poles of $s$ are among the inverses of the eigenvalues of $M$. And since $1/\lambda$ is the radius of convergence of $s$, it has to be a pole of $s$. In particular, $s$ diverges for $z = \rho$.

The following result, due to Berstel, is also well known. It allows one to compute the radius of convergence of the star of a sequence.

Proposition 7 Let $s$ be a regular sequence. The radius of convergence of the series $s^*(z) = 1/(1 - s(z))$ is the unique real number $r$ such that $s(r) = 1$.

For a proof, see [22], pp. 211-214, [18], p. 82 or [11], p. 84. As a consequence, we obtain the following result.
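Proposition 7 gives a practical way to approximate the radius of convergence of $s^*$. The sketch below is restricted to a simplifying assumption not made in the text: $s$ is a polynomial with nonnegative integer coefficients, $s_0 = 0$ and at least one nonzero coefficient, so that $s$ is increasing on $[0, 1]$ with $s(0) = 0$ and $s(1) \ge 1$, and bisection applies. The function name and tolerance are hypothetical.

```python
def star_radius(coeffs, tol=1e-12):
    """Approximate the r in (0, 1] with s(r) = 1 by bisection.

    coeffs[n] is the coefficient s_n; assumes coeffs[0] == 0, all
    coefficients nonnegative integers and at least one of them nonzero.
    """
    def s(x):
        return sum(c * x ** n for n, c in enumerate(coeffs))

    lo, hi = 0.0, 1.0  # s(lo) < 1 <= s(hi) throughout
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if s(mid) < 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# s(z) = z + z^2, the length distribution of the code {b, ab} of Example 1
r = star_radius([0, 1, 1])
```

For this example $r$ is the inverse of the golden ratio, in agreement with $s^*(z) = 1/(1 - z - z^2)$.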

Proposition 8 Let $s$ be a regular sequence and let $\lambda$ be the inverse of the radius of convergence of $s^*$. The sequence $s$ satisfies the strict Kraft inequality $s(1/k) < 1$ (resp. the equality $s(1/k) = 1$) if and only if $\lambda < k$ (resp. $\lambda = k$).

We have thus proved the following result, which is the basis of the constructions of the next sections.

Proposition 9 Let $s$ be a regular sequence satisfying Kraft's inequality $s(1/k) \le 1$. Let $(G, i, t)$ be a normalized representation of $s$ and let $(\bar{G}, i, i)$ be the closure of $(G, i, t)$. The adjacency matrix $\bar{M}$ of $\bar{G}$ admits a $k$-approximate eigenvector.

Actually, under the hypothesis of Proposition 9, the graph $G$ itself also admits a $k$-approximate eigenvector. Indeed, let $\bar{\mathbf{w}} = (\bar{w}_q)_{q \in Q - \{t\}}$ be a $k$-approximate eigenvector of $\bar{G}$. Then the vector $\mathbf{w} = (w_q)_{q \in Q}$ defined by $w_q = \bar{w}_q$ for $q \ne t$ and $w_t = \bar{w}_i$ is a $k$-approximate eigenvector of $G$. This is illustrated in the following example.

Figure 6: The graphs $G$ and $\bar{G}$.

Let us for example consider again $s(z) = 3z^2/(1 - z^2)$ (see Figure 5). The sequence $s$ is recognized by the normalized representation $(G, 1, 4)$ where $G$ is the graph represented on the left of Figure 6. The graph $\bar{G}$ is represented on the right. The vectors

$$\mathbf{w} = \begin{bmatrix} 3 \\ 2 \\ 1 \\ 3 \end{bmatrix}, \qquad \bar{\mathbf{w}} = \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix}$$

are 2-approximate eigenvectors of $G$ and $\bar{G}$ respectively.

3.4 The multiset construction

In this section, we present the main construction used in this paper. It can be considered as a version with multiplicities of the subset construction used in automata theory to replace a finite automaton by an equivalent deterministic one. We use only unlabeled graphs, but the construction can easily be generalized to graphs with edges labeled by symbols from an alphabet.

Our construction is also linked with one used by D. Lind to build a positive matrix with given spectral radius (see [30], especially Lemma 11.1.9).

We use for convenience the term multiset of elements of a set $Q$ as a synonym of $Q$-vector. If $\mathbf{u} = (u_q)_{q \in Q}$ is such a multiset, the coefficient $u_q$ is also called the multiplicity of $q$. The degree of $\mathbf{u}$ is the sum $\sum_{q \in Q} u_q$ of all multiplicities.

We start with a triple $(G, \mathbf{i}, \mathbf{t})$ where $G = (Q, E)$ is a finite graph and $\mathbf{i}$ (resp. $\mathbf{t}$) is a row (resp. column) $Q$-vector. We denote by $M$ the adjacency matrix of $G$.

Let $m$ be a positive integer. We define another triple $(H, \mathbf{J}, \mathbf{X})$ which is said to be obtained by the multiset construction. The graph $H$ is called an extension of the graph $G$. The extension is not unique and depends, as we shall see, on some arbitrary choices. The set $S$ of vertices of $H$ is formed of multisets of elements of $Q$ of total degree at most $m$. Thus, an element of $S$ is a nonnegative vector $\mathbf{u} = (u_q)_{q \in Q}$ with indices in $Q$ such that $\sum_{q \in Q} u_q \le m$. This condition ensures that $H$ is a finite graph.

We now describe the set of edges of the graph $H$ by defining its adjacency matrix $N$. Let $U$ be the $S \times Q$-matrix defined by $U_{\mathbf{u},q} = u_q$. Then $N$ is any nonnegative $S \times S$-matrix which satisfies

$$NU = UM.$$

Equivalently, for all $\mathbf{u} \in S$,

$$\sum_{\mathbf{v} \in S} N_{\mathbf{u},\mathbf{v}} \mathbf{v} = \mathbf{u} M.$$

Let us comment informally on the above formula. We can describe the construction of the graph $H$ as a sequence of choices. If we reach a vertex $\mathbf{u}$ of $H$, we partition the multiset $\mathbf{u}M$ of vertices reachable from the vertices composing $\mathbf{u}$ into multisets of degree at most $m$ to define the vertices reachable from $\mathbf{u}$ in $H$. The integer $N_{\mathbf{u},\mathbf{v}}$ is the multiplicity of $\mathbf{v}$ in the partition. The formula simply expresses the fact that the result is indeed a partition. In general, there are several possible partitions. The matrix $U$ is called the transfer matrix of the extension.

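Let $\mathbf{J}$ be the $S$-row vector such that $J_{\mathbf{i}} = 1$ and $J_{\mathbf{u}} = 0$ for $\mathbf{u} \ne \mathbf{i}$. Let $\mathbf{X}$ be the $S$-column vector such that $X_{\mathbf{u}} = \mathbf{u}\mathbf{t}$. Thus

$$\mathbf{J}U = \mathbf{i}, \qquad \mathbf{X} = U\mathbf{t}.$$

To avoid unnecessary complexity, we only keep in $S$ the vertices reachable from $\mathbf{i}$. Thus, we replace the set $S$ by the set of elements $\mathbf{u}$ of $S$ such that there is a path from $\mathbf{i}$ to $\mathbf{u}$.

The number of multisets of degree at most $m$ on a set $Q$ with $n$ elements is $\binom{n+m}{m}$, which is at most $\frac{n^{m+1} - 1}{n - 1}$ for $n \ge 2$. Thus the number of vertices of a multiset extension is of order $n^m$. It is polynomial in $n$ if $m$ is taken as a constant.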

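Figure 7: The graphs $G$ (on the vertices 1, 2) and $H$ (on the vertices 1, 12).

Let for example $G$ be the graph represented in Figure 7 on the left. The graph $H$ represented on the right is a multiset extension of $G$ with

$$\mathbf{i} = \begin{bmatrix} 1 & 0 \end{bmatrix}, \qquad \mathbf{t} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.$$

The matrices $M$, $N$ and $U$ are

$$M = \begin{bmatrix} 2 & 1 \\ 0 & 1 \end{bmatrix}, \quad N = \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix}, \quad U = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}, \quad \mathbf{J} = \begin{bmatrix} 1 & 0 \end{bmatrix}, \quad \mathbf{X} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.$$

In this case, the matrix $U$ is invertible and the matrices $M$, $N$ are conjugate.

The basic property of an extension is the following one.

Proposition 10 Let $H$ be an extension of $G$. The triple $(H, \mathbf{J}, \mathbf{X})$ is equivalent to $(G, \mathbf{i}, \mathbf{t})$.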

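Proof. For each $n \ge 0$, we have

$$UM^n = N^n U.$$

Thus

$$\mathbf{J}N^n\mathbf{X} = \mathbf{J}N^nU\mathbf{t} = \mathbf{J}UM^n\mathbf{t} = \mathbf{i}M^n\mathbf{t}.$$

This shows that $(H, \mathbf{J}, \mathbf{X})$ recognizes $s$.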
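The identities used in this proof can be checked numerically on the matrices of the example of Figure 7. The sketch below is an illustration only: the helper names are hypothetical, and matrices are plain lists of rows so that the arithmetic stays exact.

```python
def mat_mul(A, B):
    """Product of two integer matrices given as lists of rows."""
    return [[sum(A[r][j] * B[j][c] for j in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

def mat_pow(A, n):
    """n-th power of a square matrix, for n >= 0."""
    result = [[int(r == c) for c in range(len(A))] for r in range(len(A))]
    for _ in range(n):
        result = mat_mul(result, A)
    return result

def recognized(row, A, col, n):
    """The scalar row * A^n * col."""
    return mat_mul(mat_mul(row, mat_pow(A, n)), col)[0][0]

# Matrices of the example of Figure 7
M = [[2, 1], [0, 1]]
N = [[1, 1], [0, 2]]
U = [[1, 0], [1, 1]]
i_vec = [[1, 0]]
t_vec = [[0], [1]]
J = [[1, 0]]
X = mat_mul(U, t_vec)  # X = U t

seq_G = [recognized(i_vec, M, t_vec, n) for n in range(8)]
seq_H = [recognized(J, N, X, n) for n in range(8)]
```

Both triples recognize the same sequence, here $s_n = 2^n - 1$.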

We will also make use of the following additional property of extensions.

Proposition 11 Let $H$ be an extension of $G$. Let $M$ (resp. $N$) be the adjacency matrix of $G$ (resp. $H$) and let $U$ be the transfer matrix. If $\mathbf{w}$ is a $k$-approximate eigenvector of $M$, the vector $\mathbf{W} = U\mathbf{w}$ is a $k$-approximate eigenvector of $N$. If $\mathbf{w}$ is positive, then $\mathbf{W}$ is also positive.

Proof. We have

$$N\mathbf{W} = NU\mathbf{w} = UM\mathbf{w} \le kU\mathbf{w} = k\mathbf{W}.$$

Since all rows of $U$ are distinct from 0, the vector $\mathbf{W}$ is positive whenever $\mathbf{w}$ is positive.

In the next section, we will choose a particular extension of the graph $G$, called admissible, which is defined as follows. Let $\mathbf{w}$ be a positive $Q$-vector and let $m$ be a positive integer. Let $H$ be an extension of $G$, let $U$ be the transfer matrix, and let $\mathbf{W} = U\mathbf{w}$. We say that $H$ is admissible with respect to $\mathbf{w}$ and $m$ if for each $\mathbf{u} \in S$, all but possibly one of the vertices $\mathbf{v}$ such that $(\mathbf{u}, \mathbf{v})$ is an edge of $H$ satisfy $W_{\mathbf{v}} \equiv 0 \bmod m$.

Theorem 5 For any graph $G$ on $Q$, any positive $Q$-vector $\mathbf{w}$ and any integer $m > 0$, the graph $G$ admits an admissible extension with respect to $\mathbf{w}$ and $m$.

The proof relies on the following combinatorial lemma. This lemma is also used in a similar context by Adler et al. and Marcus [34], [1]. It is actually presented in [3] as a nice variant of the pigeon-hole principle.

Lemma 1 Let $w_1, w_2, \ldots, w_m$ be positive integers. Then there is a nonempty subset $S \subset \{1, 2, \ldots, m\}$ such that $\sum_{q \in S} w_q$ is divisible by $m$.

Proof. The partial sums $w_1, w_1 + w_2, w_1 + w_2 + w_3, \ldots, w_1 + w_2 + \cdots + w_m$ either are all distinct (mod $m$), or two are congruent (mod $m$). In the former case, since there are $m$ of them, one of the partial sums is congruent to 0 (mod $m$). In the latter case, there are $1 \le p < r \le m$ such that

$$w_1 + w_2 + \cdots + w_p \equiv w_1 + w_2 + \cdots + w_r \pmod{m}.$$

Hence $w_{p+1} + w_{p+2} + \cdots + w_r \equiv 0 \pmod{m}$.
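The proof of Lemma 1 is constructive and yields a block of consecutive indices. A minimal sketch follows; the function name is hypothetical and indices are 1-based, as in the statement of the lemma.

```python
def block_divisible_by_m(w):
    """Given m = len(w) positive integers, return 1-based consecutive
    indices p+1, ..., r whose sum of w-values is divisible by m."""
    m = len(w)
    seen = {0: 0}  # residue of a partial sum -> number of terms summed
    total = 0
    for r, x in enumerate(w, start=1):
        total = (total + x) % m
        if total in seen:
            p = seen[total]
            return list(range(p + 1, r + 1))
        seen[total] = r
    # m + 1 partial sums (including the empty one) share m residues
    raise AssertionError("unreachable by the pigeon-hole principle")

indices = block_divisible_by_m([3, 4, 2, 5])
```

Treating the empty sum as a partial sum with residue 0 merges the two cases of the proof into a single lookup.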

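Proof of Theorem 5. We build progressively the set of edges of $H$. Let $\mathbf{u}$ be an element of $S$. We prove by induction on the degree $d(\mathbf{u}M) = \sum_{q \in Q} (\mathbf{u}M)_q$ of $\mathbf{u}M$ that there exist $\mathbf{v}_1, \ldots, \mathbf{v}_n \in S$ such that $\mathbf{u}M = \sum_{i=1}^{n} \mathbf{v}_i$ and $W_{\mathbf{v}_i} \equiv 0 \bmod m$ for $1 \le i \le n - 1$. If $\mathbf{u}M \in S$, i.e. if $d(\mathbf{u}M) \le m$, we choose $n = 1$ and $\mathbf{v}_1 = \mathbf{u}M$. Otherwise, there exists a decomposition $\mathbf{u}M = \mathbf{v} + \mathbf{u}'$ such that $d(\mathbf{v}) = m$. Let $w_1, w_2, \ldots, w_m$ be the sequence of integers formed by the $w_q$ repeated $v_q$ times. By Lemma 1 applied to the sequence of integers $w_i$, there is a decomposition $\mathbf{v} = \mathbf{v}' + \mathbf{r}$ with $\mathbf{v}' \ne 0$ such that $W_{\mathbf{v}'} \equiv 0 \bmod m$. We have $\mathbf{u}M = \mathbf{v}' + \mathbf{w}'$ with $\mathbf{w}' = \mathbf{r} + \mathbf{u}'$. Since $d(\mathbf{w}') < d(\mathbf{u}M)$, we can apply the induction hypothesis to $\mathbf{w}'$, giving the desired result.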

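For an $S$-vector $\mathbf{W}$, we denote by $\lceil \frac{\mathbf{W}}{m} \rceil$ the $S$-vector $\mathbf{Z}$ such that for each $\mathbf{u}$ in $S$,

$$Z_{\mathbf{u}} = \left\lceil \frac{W_{\mathbf{u}}}{m} \right\rceil.$$

Summing up the previous results, we obtain the following statement.

Proposition 12 Let $H$ be an admissible extension of $G$ with respect to $\mathbf{w}$ and $m$. Let $M$ (resp. $N$) be the adjacency matrix of $G$ (resp. $H$), let $U$ be the transfer matrix and let $\mathbf{W} = U\mathbf{w}$. If $\mathbf{w}$ is a positive $k$-approximate eigenvector of $M$, then $\lceil \frac{\mathbf{W}}{m} \rceil$ is a positive $k$-approximate eigenvector of $N$.

Proof. By Proposition 11, the vector $\mathbf{W}$ is a positive $k$-approximate eigenvector of $N$. Thus

$$N\mathbf{W} \le k\mathbf{W}.$$

Let $\mathbf{u}$ be an element of $S$. We have $W_{\mathbf{v}} \equiv 0 \bmod m$ for all indices $\mathbf{v}$ such that $N_{\mathbf{u},\mathbf{v}} > 0$, except possibly for an index $\mathbf{v}_0$. The previous inequality implies that

$$\sum_{\mathbf{v} \in S - \{\mathbf{v}_0\}} N_{\mathbf{u},\mathbf{v}} \frac{W_{\mathbf{v}}}{m} + N_{\mathbf{u},\mathbf{v}_0} \frac{W_{\mathbf{v}_0}}{m} \le k \frac{W_{\mathbf{u}}}{m}.$$

Since $\frac{W_{\mathbf{v}}}{m}$ is a nonnegative integer for $\mathbf{v} \in S - \{\mathbf{v}_0\}$ with $N_{\mathbf{u},\mathbf{v}} > 0$, we get

$$\sum_{\mathbf{v} \in S - \{\mathbf{v}_0\}} N_{\mathbf{u},\mathbf{v}} \frac{W_{\mathbf{v}}}{m} + N_{\mathbf{u},\mathbf{v}_0} \left\lceil \frac{W_{\mathbf{v}_0}}{m} \right\rceil \le k \left\lceil \frac{W_{\mathbf{u}}}{m} \right\rceil.$$

This proves that

$$N \left\lceil \frac{\mathbf{W}}{m} \right\rceil \le k \left\lceil \frac{\mathbf{W}}{m} \right\rceil.$$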

3.5 Generating sequence of leaves

In what follows, we show how the multiset construction allows one to prove the main result of [10] concerning the generating sequences of regular trees. We begin with the following lemma, which is also used in the next section. We use the term leaf for a vertex of a graph without outgoing edges.

Lemma 2 Let $G$ be a graph on a set $Q$ of vertices. Let $i \in Q$ and $T \subset Q$. If $G$ admits a $k$-approximate eigenvector $\mathbf{w}$, there is a graph $G'$ and a set of vertices $I'$ of $G'$ such that

1. $G'$ admits the $k$-approximate eigenvector $\mathbf{w}'$ with all components equal to 1;

2. the triple $(G, i, \mathbf{w})$ is equivalent to the triple $(G', I', \mathbf{w}')$;

3. if $w_p = 1$ for all $p \in T$, there is a set of vertices $T'$ of $G'$ such that the triple $(G, i, T)$ is equivalent to the triple $(G', I', T')$. Moreover, if $T$ is the set of leaves of $G$, we can choose for $T'$ the set of leaves of $G'$.

We nowstate the mainresultof [10 ℄.

Theorem 6 Let s=(s

n )

n0

be a regular sequene of nonnegative integers

and let k be a positive integer suh that P

n0 s

n k

n

1. Then there is a

k-ary rational treehaving sas its generating sequene.

Proof. Let us onsider a regular sequene s and an integer k suh that

P

n0 s

n k

n

1. Sine the result holds trivially for s(z) = 1, we may

suppose that s

0

= 0. Let (G;i;t) be a normalized representation of s and

let G be the losure of G as dened at the beginning of Setion 2.1. We

denotebyM (resp.M)theadjaenymatrixofG(resp.G). LetQ=Q ftg

(22)

thematrixMadmitsapositivek-approximateeigenvetorw . Bydenition,

we have Mwkw .

Let $w$ be the $Q$-vector defined by $w_q = \bar w_q$ for all $q \in \bar Q$ and $w_t = \bar w_i$. Then, since there is no edge going out of $t$ in $G$, $w$ is a positive $k$-approximate eigenvector of $M$. Let $\mathbf{t}$ be the $Q$-vector which is the characteristic vector of the vertex $t$. Let $m = w_i$.

By Theorem 5 there exists an admissible extension $H$ of $G$ with respect to $w$ and $m$. Let $U$ be the transfer matrix and let $W = Uw$. Since $w_t \equiv 0 \bmod m$, we may choose $H$ with the following additional property: for all $u \in S$, either $u_t = 0$ or $u = \mathbf{t}$.

According to Proposition 10, the sequence $s$ is recognized by $(H, J, X)$ where $J$ is the characteristic row vector of $\mathbf{i}$ and $X$ is the characteristic column vector of $\mathbf{t}$. This means that $s$ is recognized by the normalized representation consisting of the graph $H$, the initial vertex $\mathbf{i}$, which we identify with $i$, and the terminal vertex $\mathbf{t}$, which we identify with $t$.

Let $N$ be the adjacency matrix of $H$. By Proposition 12, the vector $\lceil W/m \rceil$ is a positive $k$-approximate eigenvector of $N$. Remark that $\lceil W/m \rceil_i = \lceil W/m \rceil_t = 1$.

We may now apply Lemma 2 to construct a triple $(H', I', T')$ equivalent to $(H, i, t)$. The set $T'$ is the set of leaves of $H'$. Since $\lceil W/m \rceil_i = 1$, $I'$ is reduced to one vertex $i'$. Since $H'$ admits a $k$-approximate eigenvector with all components equal to one, the graph $H'$ is of outdegree at most $k$. Finally, $s$ is the generating sequence of the covering tree of $H'$ starting at $i'$. This tree is $k$-ary and regular.

Let us consider the above constructions in the particular case of the equality in Kraft's inequality. In this case, the result is a complete $k$-ary tree. Indeed, by Proposition 8, the matrix $\bar M$ admits a positive integral eigenvector $\bar w$ for the eigenvalue $k$. We have for all $p \in \bar Q$,

$$\sum_{q \in \bar Q} \bar M_{p,q} \bar w_q = k \bar w_p.$$

As a consequence, for any $u \ne \mathbf{t}$, we have

$$\sum_{v \in S} N_{u,v} W_v = k W_u.$$

Then the graph constructed in Lemma 2 is of constant outdegree $k$. Thus the $k$-ary tree obtained is complete.

[...] of Theorem 6. Let $n$ be the number of vertices of the graph $G$ giving a normalized representation of $s$. The size of the integer $m = w_i$ is exponential in $n$ (see Section 3.3). Thus the number of vertices of the graph $H$ is bounded by a double exponential in $n$. The final regular tree is the covering tree of a graph whose set of vertices has the same size in order of magnitude.

Let for example $s$ be the sequence defined by

$$s(z) = \frac{z^2}{1 - z^2} + \frac{z^2}{1 - 5z^3}.$$

Since $s(1/2) = \frac{1}{3} + \frac{2}{3} = 1$, it satisfies the Kraft equality for $k = 2$. The sequence $s$ is recognized by $(G, i, t)$ where $G = (Q, E)$ is the graph given in Figure 8 with $Q = \{1, 2, 3, 4, 5, 6, 7\}$, $i = 1$, $t = 4$. The adjacency matrix of $G$ admits the 2-approximate eigenvector represented in Figure 8, where the coefficients of $w$ are represented in squares beside the vertices. Thus $m = 3$.

Figure 8: A normalized representation of $s$.

An admissible extension $H$ of $G$ with respect to $w$ and $m$ is given in Figure 9. In this figure, each multiset of $S$ is represented by a sequence of vertices with repetitions corresponding to the multiplicity. For example, the multiset $u = (0, 0, 1, 0, 0, 2, 0)$ is represented by $(3, 6, 6)$. The sequence $s$ is recognized by the normalized representation $(H, 1, 4)$, where the initial and final vertices are named as they appear in Figure 9. The coefficients of $W$ are represented in squares beside the vertices.

Figure 9: An admissible extension $H$.

A regular binary tree $T$ having $s$ as generating sequence of leaves is given in Figure 10. In this figure, the nodes have been renumbered, with the children of a node with a given label represented only once. The leaves of the tree are indicated by black boxes. The tree itself is obtained from the graph of Figure 9 by application of the construction of Lemma 2. For example, the vertex $(2, 5)$, which has coefficient 6 in $W$, is split into two vertices named 2 and 3 in the tree.

This example was suggested to us by Christophe Reutenauer [39]. To check directly that the length distribution is equal to $s(z)$, one may compute from the graph the following regular expression of $s(z)$ and check by an elementary computation (possibly with the help of a symbolic computation system) that it is equal to $s(z)$:

$$s(z) = (z^6)^* \left( 2z^2 + z^4 + 2z^5 + z^6 + (z^2 + 3z^5)(5z^3)^* \, 3z^3 \right). \qquad (1)$$

(Note for a reader unfamiliar with regular expressions: $x^*$ stands for the series $1/(1 - x)$. The first factor $(z^6)^*$ corresponds to the vertex labeled 1 at level 6 of the tree. The term $2z^2 + z^4 + 2z^5 + z^6$ corresponds to the leaves reached by a path which does not use a vertex labeled 5. The factor $(z^2 + 3z^5)(5z^3)^*$ corresponds to the paths from the root to a vertex labeled 5. Finally, the factor $3z^3$ corresponds to the direct paths from 5 to a leaf.)
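The elementary computation mentioned above can be carried out on truncated power series. The following is a minimal sketch in Python, not taken from the paper: the helper names (`mul`, `add`, `star`, `poly`) and the truncation order `N` are illustrative assumptions, and `star(a)` denotes the series $1/(1 - a)$.

```python
# Truncated power series with integer coefficients, stored as lists of length N.
N = 40  # truncation order (arbitrary choice for this sketch)

def mul(a, b):
    """Product of two truncated series."""
    c = [0] * N
    for i, x in enumerate(a):
        if x:
            for j, y in enumerate(b[: N - i]):
                c[i + j] += x * y
    return c

def add(a, b):
    """Sum of two truncated series."""
    return [x + y for x, y in zip(a, b)]

def star(a):
    """Series 1/(1 - a) for a series a with zero constant term."""
    assert a[0] == 0
    r = [0] * N
    r[0] = 1
    for n in range(1, N):
        r[n] = sum(a[i] * r[n - i] for i in range(1, n + 1))
    return r

def poly(terms):
    """Series built from a dict {exponent: coefficient}."""
    p = [0] * N
    for e, c in terms.items():
        p[e] = c
    return p

# s(z) = z^2/(1 - z^2) + z^2/(1 - 5 z^3)
s_def = add(mul(poly({2: 1}), star(poly({2: 1}))),
            mul(poly({2: 1}), star(poly({3: 5}))))

# Regular expression (1)
inner = add(poly({2: 2, 4: 1, 5: 2, 6: 1}),
            mul(mul(poly({2: 1, 5: 3}), star(poly({3: 5}))), poly({3: 3})))
s_regex = mul(star(poly({6: 1})), inner)

print(s_def == s_regex)   # the two series agree up to order N
print(s_def[:9])          # first coefficients s_0, ..., s_8
```

Agreement up to a fixed order is only a sanity check; the identity itself follows by clearing the denominators $1 - z^6$ and $1 - 5z^3$.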

This example shows an interesting feature of this problem. In fact, from the point of view of regular expressions, the difficult operation in this problem is the sum. It would be a simple matter to build a rational tree for each term of the sum in the expression (1) (see the example of Figure 5). The difficulty would then be to merge these trees to obtain one corresponding to the sum.

Figure 10: A regular binary tree with length distribution $s$.

A curious consequence of Theorem 6 is the following property of regular sequences.

Corollary 1 Let $k \ge 2$ be an integer and let $u$ be a regular sequence such that $u(1/k) \le 1$ and $u(0) = 0$. Then there exist $k$ regular sequences $u_1, \ldots, u_k$ such that $u_i(1/k) \le 1$ and

$$u(z) = \sum_{i=1}^{k} z \, u_i(z).$$

Proof. It is a simple consequence of Theorem 6. Indeed, if $X$ is a regular prefix code on the $k$-element alphabet $A$, then $X = \sum_{a \in A} a X_a$ where each $X_a$ is a regular prefix code on the alphabet $A$.

We do not know of a direct proof of this result.

3.6 Generating sequence of nodes

In this section, we consider the generating sequence of the set of all nodes in a tree instead of just the set of leaves. This is motivated by the fact that in search trees, the information can either be carried by the leaves or by all the nodes of the tree. We will see that the complete characterization [...] complicated than the one for leaves.

Soittola (see [42] p. 104) has characterized the series which are the generating sequences of nodes in a regular tree. We characterize the ones that correspond to $k$-ary trees (Theorem 7). We also give a more direct construction in a particular case (Theorem 8).

Let $T$ be a tree. The generating sequence of nodes of the tree $T$ is the sequence $t = (t_n)_{n \ge 0}$, where $t_n$ is the number of nodes of $T$ at height $n$. The sequence $t$ satisfies $t_0 \le 1$ and, moreover, if $T$ is a $k$-ary tree, the condition

$$t_n \le k \, t_{n-1}$$

for all $n \ge 1$. If $T$ is a regular tree, then $t$ is a regular sequence. We now completely characterize the regular sequences $t$ that are the generating sequences of nodes of a $k$-ary regular tree.

Theorem 7 Let $t = (t_n)_{n \ge 0}$ be a regular sequence and let $k$ be a positive integer. The sequence $(t_n)_{n \ge 0}$ is the generating sequence of nodes of a $k$-ary regular tree iff it satisfies the following conditions.

(i) the convergence radius of $t$ is strictly greater than $1/k$,

(ii) the sequence $s(z) = t(z)(kz - 1) + 1$ is regular.

Proof. Let us first show that the conditions are necessary. Let $\bar T$ be the complete $k$-ary tree obtained by adding $i$ new leaves to each node that has $k - i$ children. Since $T$ is a regular tree, $\bar T$ is also regular.

Let $s$ be the generating sequence of leaves of $\bar T$. Since $\bar T$ is complete, $s(1/k) = 1$. Since $k t_n = s_{n+1} + t_{n+1}$ for all $n \ge 0$, we have

$$1 - s(z) = t(z)(1 - kz).$$

Since $s$ is a regular sequence, its radius of convergence is strictly larger than $1/k$ (see Section 3.3). Since the value of the derivative of $s$ at $z = 1/k$ is $k \, t(1/k)$, the same holds for $t$. This proves the necessity of the conditions.

Conversely, if $t$ satisfies the conditions of the theorem, the regular series $s(z) = t(z)(kz - 1) + 1$ satisfies $s(1/k) = 1$. Thus, by Theorem 6, $s$ is the generating sequence of leaves of a complete $k$-ary regular tree. The internal nodes of this tree form a $k$-ary regular tree whose generating sequence of nodes is $t$.
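The relation $s(z) = t(z)(kz - 1) + 1$ reads coefficientwise $s_0 = 1 - t_0$ and $s_n = k t_{n-1} - t_n$ for $n \ge 1$. As an illustration only, here is a small Python sketch on a finite tree; the function name `leaves_of_completion` and the sample sequence `t = [1, 2, 3]` (a hypothetical binary tree with 1, 2 and 3 nodes at heights 0, 1 and 2) are assumptions chosen for the example.

```python
from fractions import Fraction

def leaves_of_completion(t, k):
    """Coefficients of s(z) = t(z)(kz - 1) + 1 for a finite node sequence t."""
    t = list(t) + [0]                 # t_n = 0 beyond the last height
    s = [1 - t[0]]                    # constant term
    s += [k * t[n - 1] - t[n] for n in range(1, len(t))]
    return s

t = [1, 2, 3]     # hypothetical node counts by height
k = 2
s = leaves_of_completion(t, k)

# Kraft sum s(1/k); it equals 1 exactly when the completed tree is complete
kraft = sum(Fraction(c, k ** n) for n, c in enumerate(s))
print(s, kraft)
```

Here $s$ counts the leaves added when every node of the tree is completed to have exactly $k$ children, and the Kraft sum being 1 matches $s(1/k) = 1$ in the proof.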

The sequence $s$ defined by condition (ii) is rational as soon as $t$ is regular and therefore rational. Given a regular sequence $t$, condition (ii) is decidable in view of the theorem of Soittola (Theorem 1).

[...] nonnegativity of the coefficients of the series $s$ and thus the inequality $t_n \le k \, t_{n-1}$ for all $n \ge 1$. It also implies that $t_0 \le 1$.

We now show that there are regular sequences $t$ satisfying $t_n \le k \, t_{n-1}$ for all $n \ge 1$ and condition (i) of the theorem, and such that the sequence $s(z) = t(z)(kz - 1) + 1$ is not regular. The example is based on an example of a rational sequence with nonnegative coefficients which is not regular (see [18] page 95). Let

$$r_n = b^{2n} \cos^2(n\theta)$$

with $\cos\theta = a/b$, where the integers $a, b$ are such that $b \ne 2a$ and $0 < a < b$. The sequence $r$ is rational, has nonnegative integer coefficients and is not regular. Its poles are $\frac{1}{b^2}$, $\frac{1}{b^2} e^{2i\theta}$ and $\frac{1}{b^2} e^{-2i\theta}$. We now define the sequence $t$ as follows:

$$t_{2h} = k^h, \qquad t_{2h+1} = k^h + r_h.$$

We also assume that $b^2 < k$. By Soittola's theorem, the sequence $t$ is regular since it is a merge of rational sequences having a dominating root. The convergence radius of $t$ is $\frac{1}{\sqrt{k}} > \frac{1}{k}$. Therefore the sequence $t$ satisfies the first condition of Theorem 7. Let $s$ be the sequence defined by $s(z) = t(z)(kz - 1) + 1$. If $h = 2p$ is even with $p \ge 1$,

$$s_h = k \, t_{h-1} - t_h = k \cdot k^{p-1} + k \, r_{p-1} - k^p = k \, r_{p-1}.$$

Thus the sequence $s$ is not regular.

The above example does not work for the small values of $k$ (the least value is $k = 10$). We do not know of similar examples for $2 \le k \le 9$.
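The smallest instance of this example can be computed with exact integers. The sketch below is illustrative and makes two assumptions: the parameter choice $a = 1$, $b = 3$, $k = 10$, and the helper name `r_sequence`. It uses the fact that $c_n = b^n \cos(n\theta)$ satisfies $c_0 = 1$, $c_1 = a$, $c_{n+1} = 2a \, c_n - b^2 c_{n-1}$, so that $r_n = c_n^2$.

```python
def r_sequence(a, b, count):
    """r_n = b^(2n) cos^2(n*theta) with cos(theta) = a/b, as exact integers.

    Uses c_n = b^n cos(n*theta): c_0 = 1, c_1 = a,
    c_{n+1} = 2*a*c_n - b^2*c_{n-1}, and r_n = c_n^2.
    """
    c = [1, a]
    while len(c) < count:
        c.append(2 * a * c[-1] - b * b * c[-2])
    return [x * x for x in c[:count]]

a, b, k = 1, 3, 10      # assumed smallest admissible parameters (b^2 < k)
H = 6
r = r_sequence(a, b, H)

# t_{2h} = k^h and t_{2h+1} = k^h + r_h
t = []
for h in range(H):
    t += [k ** h, k ** h + r[h]]

# coefficients of s(z) = t(z)(kz - 1) + 1
s = [1 - t[0]] + [k * t[n - 1] - t[n] for n in range(1, len(t))]

print(r)
print(s[2::2] == [k * x for x in r[:H - 1]])   # even-indexed s_h versus k * r_{p-1}
print(all(x >= 0 for x in s))                  # coefficients of s are nonnegative
```

The computation shows that the even-indexed coefficients of $s$ reproduce the non-regular sequence $r$ up to the factor $k$, while all coefficients of $s$ stay nonnegative.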

We finally describe a particular case of Theorem 7 in which one has a relatively simple method, based on the multiset construction, to build the regular tree with a given generating sequence of nodes. This avoids the use of Soittola's characterization, which leads to a method of higher complexity.

A primitive representation of a regular sequence $s$ is a representation $(G, i, t)$ such that the adjacency matrix of $G$ is primitive. The following result is proved in [8] with a different proof using the state-splitting method of symbolic dynamics. The proof given in [10] relies on a simpler construction.

Theorem 8 Let $t = (t_n)_{n \ge 0}$ be a regular sequence and let $k$ be a positive integer such that $t_0 = 1$, $t_n \le k \, t_{n-1}$ for all $n \ge 1$ and such that

(i) [...]

(ii) $t$ has a primitive representation.

Then $(t_n)_{n \ge 0}$ is the generating sequence of nodes by height of a $k$-ary regular tree.

The proof of this theorem given in [10] uses the multiset construction. It relies on the following lemma.

Lemma 3 Let $M$ be a primitive matrix with spectral radius $\lambda$. Let $v$ be a non-null and nonnegative integral vector and let $k$ be an integer such that $\lambda < k$. Then there is a positive integer $n$ such that $M^n v$ is a positive $k$-approximate eigenvector of $M$.

Proof. For a primitive matrix $M$ with spectral radius $\lambda$, it is known that the sequence $((M/\lambda)^n)_{n \ge 0}$ converges to $r \cdot l$ where $r$ is a positive right eigenvector and $l$ a positive left eigenvector of $M$ for the eigenvalue $\lambda$ with $l \cdot r = 1$ (see for example [30] p. 130). Thus $\left(\frac{M^n}{\lambda^n} v\right)_{n \ge 0}$ converges to $r \cdot l \cdot v$, which is equal to $\rho \, r$ where $\rho$ is a nonnegative real number. Since $Mr = \lambda r$, we get, for a large enough integer $n$,

$$M \frac{M^n}{\lambda^n} v \le k \frac{M^n}{\lambda^n} v$$

or equivalently $M M^n v \le k M^n v$. If $n$ is large enough, we moreover have $M^n v > 0$ since $M$ is primitive.

The proof of Theorem 8 uses a shift of indices of the sequence to obtain a new sequence to which a simple application of the multiset construction can be applied. We illustrate it on an example.

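Figure 11: A primitive representation $G$ of $t$.

[...]

$$i = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix} \quad \text{and} \quad t = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}.$$

The adjacency matrix $M$ of $G$ is the primitive matrix

$$M = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}.$$

Its spectral radius is less than 2. The hypotheses of Theorem 8 are thus satisfied. We have

$$M^2 t = \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} \quad \text{and} \quad M^3 t = \begin{bmatrix} 3 \\ 2 \\ 2 \end{bmatrix}.$$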

Since $M^3 t \le 2 M^2 t$, the vector $M^2 t$ is a 2-approximate eigenvector of $M$ (the existence of such a vector is asserted by Lemma 3). Let $w = M^2 t$.
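The search for the power $n$ promised by Lemma 3 is easy to carry out on this example. The following Python sketch is only an illustration: the function names and the bound `max_power` are assumptions, and the loop simply tests successive vectors $M^n t$ until one is positive and satisfies $M (M^n t) \le k \, M^n t$.

```python
def mat_vec(M, v):
    """Matrix-vector product with integer entries."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def first_approximate_eigenvector(M, v, k, max_power=50):
    """Smallest n >= 0 such that w = M^n v is positive and M w <= k w, with w.

    Returns None if no such n is found up to max_power.
    """
    w = list(v)
    for n in range(max_power + 1):
        Mw = mat_vec(M, w)
        if all(x > 0 for x in w) and all(y <= k * x for x, y in zip(w, Mw)):
            return n, w
        w = Mw
    return None

M = [[1, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]
t = [1, 1, 0]
k = 2

print(first_approximate_eigenvector(M, t, k))
```

On this input the search stops at $n = 2$ with the vector $(2, 1, 2)$, in agreement with the values of $M^2 t$ and $M^3 t$ computed in the text.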

Applying Lemma 2, we obtain from $G$ the graph $G'$ represented on the left side of Figure 12. Moreover, $(G, i, w)$ is equivalent to $(G', I', w')$ where $I'$ is the set of initial vertices indicated in Figure 12 and $w'$ is the vector with all components equal to 1. The covering trees $T_{1,1}$ and $T_{1,2}$ of $G'$ starting at the vertices of $I'$ give, with the appropriate shift of indices, the binary regular tree $T$ represented on the right side of Figure 12 (the nodes of the tree have been renumbered).

4 Generating sequences of prefix codes

There is a close connection between trees and prefix codes or prefix-closed sets of words. We present below the translation of some of the notions and results seen before in terms of prefix codes.

4.1 Trees and prefix codes

Let $R$ be a set of words on the alphabet $A = \{0, 1, \ldots, k-1\}$. The set $R$ is said to be prefix-closed if any prefix of an element of $R$ is also in $R$. The set $X$ of words of $R$ which are not a proper prefix of a word in $R$ is a prefix code, called the prefix code associated to $R$.

Figure 12: The graph $G'$ and the tree $T$.

When $R$ is prefix-closed, we can build a tree $T(R)$ as follows. The set of nodes is $R$, the root is the empty word $\epsilon$ and $T(a_1 a_2 \cdots a_n) = a_1 a_2 \cdots a_{n-1}$. The leaves of $T$ form a prefix code which is the prefix code associated to $R$. The generating sequence of $T$ is the generating sequence of $X$.

Let for example $R = \{\epsilon, 0, 1, 10, 11\}$. The tree $T(R)$ is represented in Figure 13. The associated prefix code is $X = \{0, 10, 11\}$.

Figure 13: The tree $T(X)$.
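The passage from a prefix-closed set to its associated prefix code can be sketched in a few lines of Python. This is an illustration only; the function names are assumptions, words are represented as strings and the empty word as `""`.

```python
def associated_prefix_code(R):
    """Words of the prefix-closed set R that are not a proper prefix of a word of R."""
    R = set(R)
    # sanity check: every prefix of a word of R must be in R
    assert all(w[:i] in R for w in R for i in range(len(w))), "R is not prefix-closed"
    return {w for w in R if not any(v != w and v.startswith(w) for v in R)}

def length_distribution(X):
    """List u with u[n] = number of words of length n in X."""
    u = [0] * (max(map(len, X)) + 1)
    for x in X:
        u[len(x)] += 1
    return u

R = {"", "0", "1", "10", "11"}
X = associated_prefix_code(R)
print(sorted(X), length_distribution(X))
```

The leaves of the tree $T(R)$ are exactly the words returned by `associated_prefix_code`, and `length_distribution` gives the generating sequence of leaves by height.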

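Let $X$ be a prefix code on an alphabet with $k$ symbols. It is clear that its length distribution $u = (u_n)_{n \ge 1}$ satisfies

$$\sum_{n \ge 1} u_n k^{-n} \le 1,$$

or equivalently $u(1/k) \le 1$. The number $u(1/k)$ can actually be interpreted as the probability that a long enough word has a prefix in $X$.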

There is also a connection with the notion of entropy. Actually, if $X$ is a prefix code, the entropy of $X^*$ is equal to $\log(1/\rho)$ where $\rho$ is the solution of the equation $u_X(\rho) = 1$. Thus Kraft's inequality expresses the fact that $h(X^*) \le \log k$.

Conversely, the Kraft-McMillan theorem states that for any such sequence $u = (u_n)_{n \ge 1}$, there exists a prefix code $X$ on a $k$-symbol alphabet such that $u = u_X$.

The equality case in Kraft's inequality corresponds to a particular class of prefix codes often called complete. A prefix code $X$ on the alphabet $A$ is complete if any word on $A$ has either a prefix in $X$ or is a prefix of a word of $X$.

Theorem 6 shows that the generating sequences of regular prefix codes are exactly the regular sequences satisfying Kraft's inequality.

4.2 Bifix codes

We investigate here the length distributions of a particular class of prefix codes, called bifix. Several other classes of prefix codes could give rise to a similar study (for a description of these classes, see [21]).

The definition of a suffix code is symmetric to the definition of a prefix code. It is a set of words $X$ such that no element of $X$ is a suffix of another one. The notion of a complete suffix code is also symmetric. A bifix code is a set $X$ of words which is both a prefix and a suffix code.

Any set of words of fixed length is obviously a bifix code but there are more complicated examples.

Example 5 The set

$$X = \{aaa, aaba, aabb, ab, baa, baba, babb, bba, bbb\}$$

is a complete prefix code pictured in Figure 14. It is also a complete suffix code, as one may check by reading its words backwards.
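The claims of Example 5 can be checked mechanically. The sketch below is illustrative (the function names are assumptions): it tests the prefix and suffix conditions directly and uses the fact that a finite prefix code on two letters is complete exactly when its Kraft sum equals 1.

```python
from fractions import Fraction

X = ["aaa", "aaba", "aabb", "ab", "baa", "baba", "babb", "bba", "bbb"]

def is_prefix_code(words):
    """No word is a proper prefix of another word."""
    return not any(x != y and y.startswith(x) for x in words for y in words)

def is_suffix_code(words):
    """No word is a proper suffix of another word (check on reversed words)."""
    return is_prefix_code([w[::-1] for w in words])

# Kraft sum on a 2-letter alphabet
kraft_sum = sum(Fraction(1, 2 ** len(x)) for x in X)

print(is_prefix_code(X), is_suffix_code(X), kraft_sum)
```

The same Kraft sum serves for both directions, since reversing the words does not change their lengths.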

Surprisingly, it is an open problem to characterize the length distributions of bifix codes. The following simple example shows that they are more constrained than those of prefix codes.

Figure 14: The bifix code $X$.

Example 6 The sequence $u(z) = z + 2z^2$ is not realizable as the length distribution of a bifix code on a binary alphabet although $u(1/2) = 1$. Indeed, one of the symbols has to be in $X$, say $a$. Then $bb$ is the only word of length 2 that can be added.

The following nice partial result is due to Ahlswede, Balkenhol and Khachatrian [2]. We state the result for a binary alphabet. It can be readily generalized to $k$ symbols but it presents less interest.

Theorem 9 For any integer sequence $u$ such that

$$u(1/2) \le 1/2,$$

there is a bifix code $X$ such that $u = u_X$.

Proof. The proof is by induction. We suppose that we have already built a bifix code $X$ formed of words of length at most $n - 1$ with length distribution $(u_1, u_2, \ldots, u_{n-1})$. We have

$$\sum_{i=1}^{n} u_i 2^{-i} \le 1/2,$$

or equivalently, multiplying both sides by $2^{n+1}$,

$$2 \sum_{i=1}^{n} u_i 2^{n-i} \le 2^n.$$

Finally, we obtain

$$u_n \le 2^n - 2 \sum_{i=1}^{n-1} u_i 2^{n-i}.$$

The expression on the right-hand side is at most equal to the number of elements of the set $A^n \setminus (XA^* \cup A^*X)$. Thus, we can choose $u_n$ words of length $n$ which do not have a prefix or a suffix in $X$. This proves the result by induction.
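The induction in this proof is effectively a greedy algorithm. The following Python sketch is an illustration under stated assumptions: the function name `build_bifix_code`, the choice of the first available words in lexicographic order, and the sample sequence `u` are all ours; the proof only guarantees that enough words are available when $u(1/2) \le 1/2$.

```python
from itertools import product

def build_bifix_code(u, alphabet="ab"):
    """Greedy construction following the induction of the proof.

    u[n-1] is the requested number of words of length n.
    At each length, pick words having no prefix and no suffix in the code so far.
    """
    X = []
    for n, count in enumerate(u, start=1):
        free = [w for w in map("".join, product(alphabet, repeat=n))
                if not any(w.startswith(x) or w.endswith(x) for x in X)]
        if len(free) < count:
            raise ValueError(f"not enough words of length {n}")
        X.extend(free[:count])
    return X

u = [0, 1, 1, 2]      # sample sequence: u(1/2) = 1/4 + 1/8 + 2/16 = 1/2
X = build_bifix_code(u)
print(X)
```

Words chosen at the same length cannot be prefixes or suffixes of one another, so only the words of smaller length need to be tested at each step.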

The authors of [2] formulate the interesting conjecture that Theorem 9 is still true if the hypothesis $u(1/2) \le 1/2$ is replaced by $u(1/2) \le 3/4$.

There are known additional conditions imposed on length distributions of bifix codes. For example, one has the following result, originally due to Schützenberger (see [16]).

Theorem 10 If $X$ is a finite complete bifix code on $k$ symbols, then $u_X(1/k) = 1$ and $\frac{1}{k} u'_X(1/k)$ is an integer.

The number $\frac{1}{k} u'_X(1/k)$ can be interpreted as the average length of the words of $X$. Indeed

$$z \, u'_X(z) = \sum_{x \in X} |x| \, z^{|x|}.$$

Example 7 For the bifix code of Example 5, we have

$$u_X(z) = z^2 + 4z^3 + 4z^4$$

and thus

$$u'_X(z) = 2z + 12z^2 + 16z^3.$$

Hence $\frac{1}{2} u'_X(1/2) = 3$.

The conditions of Theorem 10 show directly that the sequence of Example 6 is not realizable. Indeed, it satisfies the first condition but not the second one. The conditions of Theorem 10 are not sufficient. Indeed, if $u(z) = z + 4z^3$ we have $u(1/2) = 1$ and $u'(1/2) = 4$, although it is clearly impossible that $u = u_X$ for a bifix code $X$.
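The two numerical conditions of Theorem 10 are easy to evaluate exactly. The sketch below is illustrative; the function name `theorem10_conditions` and the dictionary encoding of a length distribution (length mapped to number of words) are assumptions. It uses $\frac{1}{k} u'(1/k) = \sum_n n \, u_n k^{-n}$.

```python
from fractions import Fraction

def theorem10_conditions(u, k):
    """Return (u(1/k), u'(1/k)/k) for a length distribution u given as {n: u_n}."""
    value = sum(Fraction(c, k ** n) for n, c in u.items())
    mean_length = sum(Fraction(n * c, k ** n) for n, c in u.items())
    return value, mean_length

examples = {
    "Example 5": {2: 1, 3: 4, 4: 4},   # z^2 + 4z^3 + 4z^4
    "Example 6": {1: 1, 2: 2},         # z + 2z^2
    "z + 4z^3":  {1: 1, 3: 4},
}
for name, u in examples.items():
    value, mean_length = theorem10_conditions(u, 2)
    print(name, value, mean_length, mean_length.denominator == 1)
```

The second sequence fails the integrality condition, while the third passes both conditions even though, as stated in the text, it is not the length distribution of a bifix code.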

Recently, Ye and Yeung [45] have made some progress on this problem. They are in particular able to prove that Theorem 9 still holds when $u(1/2) \le 5/8$.

5 [...] circular codes

In this section, we present a number of results on interrelated objects which are connected with cyclic permutation of words. The link with enumerative combinatorics was developed in Lothaire's volume [31] and later in R. Stanley's book [44]. We begin with notions classical in symbolic dynamics (see [30] or [28] for a general reference; see [15] or [24] for the link with finite automata).

5.1 Subshifts of finite type

A subshift is a set of biinfinite words on a finite alphabet $A$ which avoids a given set $F$ of forbidden words. It is a topological space as a closed subset of the space $A^{\mathbb{Z}}$ of functions from $\mathbb{Z}$ into the set $A$. The full shift on $A$ is the set of all biinfinite words on $A$. It corresponds to the case $F = \emptyset$.

A sofic subshift is the set of biinfinite labels of paths in a finite automaton. A sofic subshift is called irreducible if the automaton can be chosen strongly connected. A subshift of finite type is the set of biinfinite words avoiding a finite set of finite words. Any subshift of finite type is sofic but the converse is not true. The edge shift of a finite graph $G$ is the set $S_G$ of biinfinite paths in $G$ (viewed as biinfinite sequences of edges). It is a subshift of finite type.

The shift $\sigma$ is the function on a subshift $S$ which maps a point $x$ to the point $y = \sigma(x)$ whose $i$th coordinate is $y_i = x_{i+1}$.

A morphism from a subshift $S$ into a subshift $T$ is a function $f : S \to T$ which is continuous and invariant under the shift. A bijective morphism is called a conjugacy. Any subshift of finite type is conjugate to some edge shift.

The entropy $h(S)$ of a subshift $S$ is the entropy of the formal language formed by the finite blocks occurring in words of $S$. It can be shown that the entropy is a topological invariant, in the sense that two conjugate subshifts have the same entropy.

While the entropy is a measure of the number of forbidden words, it is possible to study the number of minimal forbidden words. It gives rise to another invariant of subshifts [13], [14].

An integer $p$ is a period of a point $x = (a_n)_{n \in \mathbb{Z}}$ if $a_{n+p} = a_n$ for all $n \in \mathbb{Z}$. Equivalently, $p$ is a period of $x$ if $\sigma^p(x) = x$. The zeta function of a subshift $S$ is the series

$$\zeta(S) = \exp \sum_{n \ge 1} \frac{p_n}{n} z^n$$

where $p_n$ is the number of points with period $n$ in $S$. It is also a topological invariant, since a point of period $n$ is mapped by a conjugacy to a point of the same period.

The following result due to Bowen and Lanford [19] is classical (see [30]).

Proposition 13 Let $G$ be a finite graph and let $M$ be the adjacency matrix of $G$. Then

$$\zeta(S_G) = \det(I - Mz)^{-1}.$$

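Proof. We first have for each $n \ge 1$

$$\mathrm{Tr}(M^n) = p_n$$

since the coefficient $(i, j)$ of $M^n$ is the number of paths of length $n$ from $i$ to $j$. Thus

$$\zeta(S_G) = \exp \sum_{n \ge 1} \frac{p_n}{n} z^n = \exp \sum_{n \ge 1} \frac{\mathrm{Tr}(M^n)}{n} z^n = \exp \mathrm{Tr}\left(\log (I - Mz)^{-1}\right) = \det (I - Mz)^{-1}$$

since, by the formula of Jacobi, $\exp \mathrm{Tr} = \det \exp$.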

Example 8 Let $S$ be the edge shift of the graph $G$ of Figure 15. We have

$$M = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}.$$

Consequently

$$\zeta(S) = \frac{1}{1 - z - z^3}.$$
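Example 8 can be checked numerically on the first coefficients. The Python sketch below is illustrative and rests on two standard facts rather than on anything specific to the paper: if $Z(z) = \exp \sum_{n \ge 1} \frac{p_n}{n} z^n$ then $n Z_n = \sum_{i=1}^{n} p_i Z_{n-i}$, and the coefficients $c_n$ of $1/(1 - z - z^3)$ satisfy $c_n = c_{n-1} + c_{n-3}$. The names `mat_mul`, `zeta` and `c` and the order `N` are assumptions.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square integer matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

M = [[1, 1, 0], [0, 0, 1], [1, 0, 0]]
N = 12   # number of coefficients compared (arbitrary choice)

# p_n = Tr(M^n) for n = 1..N (p[0] is an unused placeholder)
p, P = [0], M
for n in range(1, N + 1):
    p.append(sum(P[i][i] for i in range(3)))
    P = mat_mul(P, M)

# coefficients of exp(sum p_n z^n / n): n * zeta_n = sum_{i=1..n} p_i * zeta_{n-i}
zeta = [Fraction(1)]
for n in range(1, N + 1):
    zeta.append(sum(p[i] * zeta[n - i] for i in range(1, n + 1)) / n)

# coefficients of 1/(1 - z - z^3): c_n = c_{n-1} + c_{n-3}
c = [1]
for n in range(1, N + 1):
    c.append(c[n - 1] + (c[n - 3] if n >= 3 else 0))

print(p[1:])
print(zeta == c)
```

The comparison covers only the first `N` coefficients, so it is a consistency check and not a proof of the identity.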

Figure 15: A subshift of finite type.

Let $S$ be a subshift of finite type and let $p_n$ be the number of points with period $n$. Let $q_n$ be the number of points with least period $n$. Since $q_n$ is a multiple of $n$, we also denote $q_n = n \, l_n$. We then have the following formula, which expresses the zeta function as an infinite product using the integers $l_n$ as exponents:

$$\zeta(S) = \prod_{n \ge 1} (1 - z^n)^{-l_n},$$

as one may verify using $p_n = \sum_{d \mid n} d \, l_d$ and the definition of $\zeta(S)$.

A classical result, related to what follows, is the following statement, known as Krieger's embedding theorem.

Theorem 11 Let $S, T$ be two subshifts of finite type. There exists an injective morphism $f : S \to T$ with $f(S) \ne T$ iff

1. $h(S) < h(T)$,

2. for each $n \ge 1$, $q_n(S) \le q_n(T)$, where $q_n(S)$ (resp. $q_n(T)$) is the number of points of $S$ (resp. $T$) of least period $n$.

The following result is the basis of many applications of symbolic dynamics to coding. It is due to Adler, Coppersmith and Hassner [1].

Theorem 12 If $S$ is an irreducible subshift of finite type such that $h(S) \ge \log k$, it is conjugate to a subshift of finite type $S_G$ where the graph $G$ has outdegree at least $k$.

The proof is based on a state-splitting algorithm using approximate eigenvectors and Lemma 1. This result is part of a number of constructions leading to sliding block codes used in magnetic recording (see [35], [11] or [30]). It gives at the same time the following result.

Theorem 13 If $S$ is a subshift of finite type such that $h(S) \le \log k$, then there is a graph $G$ of outdegree at most $k$ such that $S$ is conjugate to $S_G$.

[...] $u$ be a regular sequence of integers such that $u(1/k) \le 1$. Let $G$ be a normalized graph recognizing $u$ (in the sense of Section 2.1). Let $\bar G$ be the graph obtained by merging the initial and terminal vertex. Then $h(S_{\bar G}) \le \log k$. We can apply Theorem 13 to obtain a graph $H$ with outdegree at most $k$ such that $S_{\bar G}$ and $S_H$ are conjugate. This gives the conclusion of Theorem 6 provided the initial-terminal vertex did not split in the construction. The following examples show both cases (for details, see [7] and [8]).

Example 9 Let $G$ be the graph of Figure 5. The splitting of vertex 2 gives a graph of outdegree 2. A normalization gives the automaton on the right.

Example 10 The sequence of the example given in Figure 6 is recognized by a graph $G$ such that $\bar G$ has three cycles of length 2. The solution as a binary tree has only two cycles of length 2 and thus could not be obtained by state-splitting.

5.2 Circular codes

A circular word, or necklace, is the equivalence class of a word under cyclic permutation. For a word $w$, we denote by $\bar w$ the circular word represented by $w$.

Let $X$ be a set of words and $w = x_1 x_2 \cdots x_n$ with $x_i \in X$. The set of cyclic permutations of the sequence $(x_1, x_2, \ldots, x_n)$ is called a factorization of the circular word $\bar w$.

A circular code is a set $X$ of words such that the factorization of circular words is unique.

Example 11 The set $X = \{a, aba\}$ is a circular code. Indeed, the position of the symbols $b$ determines uniquely the occurrences of $aba$.

Example 12 The set $X = \{ab, ba\}$ is not a circular code. Indeed, the circular word $\bar w$ for $w = abab$ has two factorizations, namely $(ab, ab)$ and $(ba, ba)$.

The following characterization is useful (see [16]).

Proposition 14 A set $X$ is a circular code if and only if it is a code and for all $u, v \in A^*$,

$$uv, vu \in X^* \Rightarrow u, v \in X^*.$$

[...] is not a circular code. Indeed, otherwise we would have $a, b \in X^*$, which is contradictory.

Let $X$ be a finite code. The flower automaton of $X$, denoted $\mathcal{A}_X$, is the following automaton. The set of its states is

$$Q = \{(u, v) \in A^+ \times A^+ \mid uv \in X\} \cup \{(1, 1)\}.$$

The transitions are of the form $(u, av) \xrightarrow{a} (ua, v)$ or $(1, 1) \xrightarrow{a} (a, v)$ or $(u, a) \xrightarrow{a} (1, 1)$. The unique initial and final state is $(1, 1)$.

Example 14 The flower automaton of the circular code $\{a, aba\}$ is pictured in Figure 16.

Figure 16: The flower automaton of $\{a, aba\}$.

The following result is easy to prove.

Proposition 15 The flower automaton $\mathcal{A}_X$ recognizes $X^*$. The code $X$ is circular iff for each word $w$, there is at most one cycle with label $w$.

We now study the length distributions of circular codes. Let $X$ be a circular code and let $u(z) = (u_n)_{n \ge 1}$ be its length distribution. For each $n \ge 1$, let $p_n$ be the number of words $w$ of length $n$ such that $\bar w$ has a factorization in words of $X$.

Proposition 16 The sequences $(p_n)_{n \ge 1}$ and $(u_n)_{n \ge 1}$ are related by

$$\exp \sum_{n \ge 1} \frac{p_n}{n} z^n = \frac{1}{1 - u(z)}. \qquad (2)$$

[...] It is therefore possible to suppose that the sequence $(u_n)$ is finite, i.e. that the code $X$ is finite. Let $\mathcal{A}$ be the flower automaton of $X$. Let $S$ be the subshift of finite type associated with the graph of $\mathcal{A}$. Then $p_n$ is the number of elements of period $n$ in $S$. Indeed, each word $w$ such that $\bar w$ has a factorization is counted exactly once as the label of a cycle in $\mathcal{A}$. We have also

$$\det(I - Mz) = 1 - u(z).$$

Thus, the result follows from Proposition 13.

The explicit relation between the numbers $u_n$ and $p_n$ is the following. For each $i \ge 1$, let $u^{(i)} = (u^{(i)}_n)_{n \ge 1}$ be the length distribution of $X^i$. Equivalently, $u^{(i)}_n$ is the coefficient of degree $n$ of $u(z)^i$. Then for each $n \ge 1$

$$p_n = \sum_{i=1}^{n} \frac{n}{i} u^{(i)}_n.$$

We also have for each $n \ge 1$

$$p_n = n \, u_n + \sum_{i=1}^{n-1} p_i \, u_{n-i}. \qquad (3)$$
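Formula (3) gives a direct way to compute the numbers $p_n$, which can be compared with a brute-force count on a small circular code. The sketch below is an illustration: the function names are assumptions, the code $X = \{a, aba\}$ is the one of Example 11, and the brute-force count enumerates all words of length $n$ having a conjugate in $X^*$.

```python
from itertools import product

def in_star(w, X):
    """True if w is a concatenation of words of X (dynamic programming)."""
    ok = [True] + [False] * len(w)
    for j in range(1, len(w) + 1):
        ok[j] = any(ok[j - len(x)] for x in X
                    if len(x) <= j and w[j - len(x):j] == x)
    return ok[len(w)]

def p_bruteforce(X, alphabet, n):
    """Number of words of length n having a conjugate in X*."""
    count = 0
    for w in map("".join, product(alphabet, repeat=n)):
        if any(in_star(w[i:] + w[:i], X) for i in range(n)):
            count += 1
    return count

def p_newton(u, N):
    """p_1..p_N from formula (3); u maps a length n to the number u_n of words."""
    p = [0] * (N + 1)
    for n in range(1, N + 1):
        p[n] = n * u.get(n, 0) + sum(p[i] * u.get(n - i, 0) for i in range(1, n))
    return p[1:]

X = ["a", "aba"]
u = {1: 1, 3: 1}       # length distribution of X: u(z) = z + z^3
N = 8
print(p_newton(u, N))
print([p_bruteforce(X, "ab", n) for n in range(1, N + 1)])
```

The two lists coincide for this code; for a set that is not a circular code the recurrence and the brute-force count are not expected to agree.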

This formula can be easily deduced from Formula (2) by taking the logarithmic derivative of each side of the formula. It shows directly that for any sequence $(u_n)_{n \ge 1}$ of nonnegative integers, the sequence $p_n$ defined by Formula (2) is formed of nonnegative integers.

Formula (3) is known as Newton's formula in the field of symmetric functions. Actually, the numbers $u_n$ can be considered, up to the sign, as elementary symmetric functions and the $p_n$ as the sums of powers (see [32]). The link between Witt vectors and symmetric functions was established in [43].

Let $p_n = \sum_{d \mid n} d \, l_d$. Then $l_n$ is the number of non-periodic circular words of length $n$ with a factorization. In terms of generating series, we have

$$\exp \sum_{n \ge 1} \frac{p_n}{n} z^n = \prod_{n \ge 1} (1 - z^n)^{-l_n}. \qquad (4)$$

Putting together Formulae (2) and (4), we obtain

$$\frac{1}{1 - u(z)} = \prod_{n \ge 1} (1 - z^n)^{-l_n}. \qquad (5)$$

[...] thus defined is formed of nonnegative integers. This can be proved either by a direct computation or by a combinatorial argument, since any sequence $u$ of nonnegative integers is the length distribution of a circular code on a large enough alphabet. We denote $l = \phi(u)$ and we say that $l$ is the $\phi$-transform of the sequence $u$.

We denote by $\varphi_n(k)$ the number of non-periodic circular words of length $n$ on $k$ symbols. The numbers $\varphi_n(k)$ are called the Witt numbers. It is clear that the sequence $(\varphi_n(k))_{n \ge 1}$ is the $\phi$-transform of the sequence $(k^n)_{n \ge 1}$. The corresponding particular case of Identity (5)

$$1 - kz = \prod_{n \ge 1} (1 - z^n)^{\varphi_n(k)}$$

is known as the cyclotomic identity.

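The following array displays a tabulation of the Witt numbers for small values of $n$ and $k$.

$$\begin{array}{r|rrr}
n & \varphi_n(2) & \varphi_n(3) & \varphi_n(4) \\ \hline
1 & 2 & 3 & 4 \\
2 & 1 & 3 & 6 \\
3 & 2 & 8 & 20 \\
4 & 3 & 18 & 60 \\
5 & 6 & 48 & 204 \\
6 & 9 & 116 & 670 \\
7 & 18 & 312 & 2340 \\
8 & 30 & 810 & 8160 \\
9 & 56 & 2184 & 29120 \\
10 & 99 & 5880 & 104754
\end{array}$$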

The value $\varphi_3(4) = 20$ is famous because of the genetic code: there are precisely 20 amino acids coded by words of length 3 over the 4-symbol alphabet A, C, G, U.
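The tabulated values can be recomputed from the classical closed form $\varphi_n(k) = \frac{1}{n} \sum_{d \mid n} \mu(n/d) \, k^d$, obtained by Möbius inversion of $k^n = \sum_{d \mid n} d \, \varphi_d(k)$. This closed form is standard but is not written out in the text above, so the sketch below, with its assumed function names `mobius` and `witt`, should be read as an independent check.

```python
def mobius(n):
    """Moebius function computed by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:       # a squared prime factor
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result

def witt(n, k):
    """Number of non-periodic circular words of length n on k symbols."""
    total = sum(mobius(n // d) * k ** d for d in range(1, n + 1) if n % d == 0)
    return total // n

table = [[witt(n, k) for k in (2, 3, 4)] for n in range(1, 11)]
for n, row in enumerate(table, start=1):
    print(n, *row)
```

Each row printed corresponds to one row of the table, in the order $k = 2, 3, 4$.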

For any sequence $a = (a_n)_{n \ge 1}$, let

$$p_n = \sum_{d \mid n} d \, a_d^{n/d}.$$

The pair $(a, p)$ is called a Witt vector (see [29] or [36]). The numbers $p_n$ are the ghost components. In terms of generating series, one has

$$\exp \sum_{n \ge 1} \frac{p_n}{n} z^n = \prod_{n \ge 1} (1 - a_n z^n)^{-1}.$$
