The average lengths of the factors of the standard factorization of Lyndon words


HAL Id: hal-00619865

https://hal-upec-upem.archives-ouvertes.fr/hal-00619865

Submitted on 6 Oct 2011

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


The average lengths of the factors of the standard factorization of Lyndon words

Frédérique Bassino, Julien Clément, Cyril Nicaud

To cite this version:

Frédérique Bassino, Julien Clément, Cyril Nicaud. The average lengths of the factors of the standard factorization of Lyndon words. 6th International Conference on Developments in Language Theory (DLT 2002), Sep 2003, Kyoto, Japan. pp.307-318. ⟨hal-00619865⟩

Frédérique Bassino, Julien Clément and Cyril Nicaud

Institut Gaspard Monge, Université de Marne-la-Vallée, 77454 Marne-la-Vallée Cedex 2, France

email: {bassino,clementj,nicaud}@univ-mlv.fr

Abstract. A non-empty word $w$ of $\{a,b\}^*$ is a Lyndon word if and only if it is strictly smaller for the lexicographical order than any of its proper suffixes. Such a word $w$ is either a letter or admits a standard factorization $uv$ where $v$ is its smallest proper suffix. For any Lyndon word $v$, we show that the set of Lyndon words having $v$ as right factor of the standard factorization is rational and compute explicitly the associated generating function. Next we establish that, for the uniform distribution over the Lyndon words of length $n$, the average length of the right factor $v$ of the standard factorization is asymptotically $3n/4$. Finally we present algorithms on Lyndon words derived from our work together with experimental results.

1 Introduction

Given a totally ordered alphabet $A$, a Lyndon word is a word that is strictly smaller, for the lexicographical order, than any of its conjugates (i.e., all words obtained by a circular permutation on the letters). Lyndon words were introduced by Lyndon [Lyn54] under the name of "standard lexicographic sequences" in order to give a base for the free Lie algebra over $A$; the standard factorization plays a central role in this framework (see [Lot83], [Reu93], [RSar]).

One of the basic properties of the set of Lyndon words is that every word is uniquely factorizable as a non-increasing product of Lyndon words. As there exists a bijection between Lyndon words over an alphabet of cardinality $k$ and irreducible polynomials over $F_k$ [Gol69], many results are known about this factorization: the average number of factors, the average length of the longest factor [FGP01] and of the shortest [PR01].

Several algorithms deal with Lyndon words. Duval gives in [Duv83] an algorithm that computes, in linear time, the factorization of a word into Lyndon words; he also presents in [Duv88] an algorithm for generating all Lyndon words up to a given length in lexicographical order. This algorithm runs in a constant [...]

[...]ative properties of these sets of words. Then we introduce the standard factorization of a Lyndon word $w$, which is the unique couple of Lyndon words $u$, $v$ such that $w=uv$ and $v$ is of maximal length.

In Section 3, we study the set of Lyndon words of $\{a,b\}^*$ having a given right factor in their standard factorization and prove that it is a rational language. We also compute its associated generating function. But as the set of Lyndon words is not context-free [BB97], we are not able to directly derive asymptotic properties from these generating functions. Consequently in Section 4 we use probabilistic techniques and results from analytic combinatorics (see [FS02]) in order to compute the average length of the factors of the standard factorization of Lyndon words.

Section 5 is devoted to algorithms and experimental results. We give an algorithm to generate randomly for uniform distribution a Lyndon word of a given length and another one related to the standard factorization of a Lyndon word which is based on the proof of Theorem 2 of Section 3. To the best of our knowledge these algorithms, although simple and not necessarily new, are not found elsewhere. Finally experiments are given which confirm our results and give hints of further studies.

The results contained in this paper constitute a first step in the study of the average behavior of the binary Lyndon trees obtained from Lyndon words by a recursive application of the standard factorization.

2 Preliminary

We denote by $A^*$ the free monoid over the alphabet $A=\{a,b\}$ obtained by all finite concatenations of elements of $A$. The length $|w|$ of a word $w$ is the number of letters $w$ is a product of, and $|w|_a$ is the number of occurrences of the letter $a$ in $w$. We consider the lexicographical order $<$ over all non-empty words of $A^*$ defined by the extension of the order $a<b$ over $A$.

We record two properties of this order:

(i) For any word $w$ of $A^*$, $u<v$ if and only if $wu<wv$.

(ii) Let $x,y\in A^*$ be two words such that $x<y$. If $x$ is not a prefix of $y$ then for every $x',y'\in A^*$ we have $xx'<yy'$.

By definition, a Lyndon word is a primitive word (i.e., it is not a power of another word) that is minimal, for the lexicographical order, in its conjugate class (i.e., the set of all words obtained by a circular permutation). The set of Lyndon words of length $n$ is denoted by $L_n$ and $L=\bigcup_n L_n$.

$$L=\{a,\ b,\ ab,\ aab,\ abb,\ aaab,\ aabb,\ abbb,\ aaaab,\ aaabb,\ aabab,\ aabbb,\ ababb,\ abbbb,\ \ldots\}$$

Equivalently, $w\in L$ if and only if $w\in A^+$ and $w$ is strictly smaller than any of its proper suffixes.
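The suffix characterization above lends itself to a direct test. The following is a minimal sketch, assuming words are Python strings over the letters `a` and `b` (whose built-in comparison is the lexicographical order with $a<b$); the function names `is_lyndon` and `lyndon_words` are illustrative and do not come from the paper.

```python
from itertools import product


def is_lyndon(w: str) -> bool:
    """True if w is non-empty and strictly smaller than each proper suffix."""
    return len(w) > 0 and all(w < w[i:] for i in range(1, len(w)))


def lyndon_words(n: int, alphabet: str = "ab") -> list[str]:
    """All Lyndon words of length n, listed in lexicographical order.

    Brute force over the whole of A^n, so only usable for small n.
    """
    candidates = ("".join(t) for t in product(alphabet, repeat=n))
    return [w for w in candidates if is_lyndon(w)]
```

Calling `lyndon_words(n)` for $n=1,\ldots,5$ reproduces the beginning of the list $L$ given above.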

Proposition 1. A word $w\in A^+$ is a Lyndon word if and only if either $w\in A$ or $w=uv$ with $u,v\in L$, $u<v$.

Theorem 1 (Lyndon). Any word $w\in A^+$ can be written uniquely as a non-increasing product of Lyndon words:

$$w=l_1l_2\cdots l_n,\qquad l_i\in L,\qquad l_1\ge l_2\ge\cdots\ge l_n.$$

Moreover, $l_n$ is the smallest suffix of $w$.

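The number $\mathrm{Card}(L_n)$ of Lyndon words of length $n$ over $A$ (see [Lot83]) is

$$\mathrm{Card}(L_n)=\frac{1}{n}\sum_{d\mid n}\mu(d)\,\mathrm{Card}(A)^{n/d},$$

where $\mu$ is the Moebius function defined on $\mathbb{N}\setminus\{0\}$ by $\mu(1)=1$, $\mu(n)=(-1)^i$ if $n$ is the product of $i$ distinct primes and $\mu(n)=0$ otherwise.

When $\mathrm{Card}(A)=2$, we obtain the following estimate:

$$\mathrm{Card}(L_n)=\frac{2^n}{n}\left(1+O\left(2^{-n/2}\right)\right).$$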
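The counting formula can be evaluated exactly with integer arithmetic. The sketch below is only an illustration; the helper names `mobius` and `count_lyndon` are hypothetical, and the Moebius function is computed by naive trial division.

```python
def mobius(n: int) -> int:
    """Moebius function: 0 if n has a squared prime factor,
    otherwise (-1)^i where i is the number of distinct prime factors."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result


def count_lyndon(n: int, k: int = 2) -> int:
    """Card(L_n) over an alphabet with k letters."""
    total = sum(mobius(d) * k ** (n // d)
                for d in range(1, n + 1) if n % d == 0)
    return total // n           # the sum is always divisible by n
```

For instance, `count_lyndon(20) * 20 / 2**20` is within $10^{-3}$ of 1, in line with the estimate for $\mathrm{Card}(A)=2$.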

Definition 1 (Standard factorization). For $w\in L\setminus A$ a Lyndon word not reduced to a letter, the pair $(u,v)$, $u,v\in L$, such that $w=uv$ and $v$ is of maximal length is called the standard factorization. The words $u$ and $v$ are called the left factor and right factor of the standard factorization.

Equivalently, the right factor $v$ of the standard factorization of a Lyndon word $w$ which is not reduced to a letter can be defined as the smallest proper suffix of $w$.

Examples.

$$aaabaab=aaab\cdot aab;\qquad aaababb=a\cdot aababb;\qquad aabaabb=aab\cdot aabb.$$
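The "smallest proper suffix" characterization gives an immediate, if quadratic, way to compute a standard factorization. This is a minimal sketch under that characterization; the function names are illustrative and not taken from the paper.

```python
def is_lyndon(w: str) -> bool:
    """True if w is non-empty and strictly smaller than each proper suffix."""
    return len(w) > 0 and all(w < w[i:] for i in range(1, len(w)))


def standard_factorization(w: str) -> tuple[str, str]:
    """Return (u, v) where v is the smallest proper suffix of the Lyndon word w."""
    if len(w) < 2 or not is_lyndon(w):
        raise ValueError("w must be a Lyndon word of length at least 2")
    # Proper suffixes have pairwise distinct lengths, so the minimum is unique.
    cut = min(range(1, len(w)), key=lambda i: w[i:])
    return w[:cut], w[cut:]
```

Applied to the three example words, it returns the pairs displayed above.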

3 Counting Lyndon words with a given right factor

In this section, we prove that the set of Lyndon words with a given right factor in their standard factorization is a rational language and compute its generating function. The techniques used in the following basically come from combinatorics on words.

Let $w=vab^i$ be a word containing one $a$ and ending with a sequence of $b$. The word $R(w)=vb$ is the reduced word of $w$.

For any Lyndon word $v$, we define the set

$$X_v=\{v_0=v,\ v_1=R(v),\ v_2=R^2(v),\ \ldots,\ v_k=R^k(v)\},$$

where $k=|v|_a$ is the number of occurrences of $a$ in $v$. Note that $\mathrm{Card}(X_v)=|v|_a+1$ and $v_k=b$.

Examples.

1. $v=aabab$: $X_{aabab}=\{aabab,\ aabb,\ ab,\ b\}$.

2. $v=a$: $X_a=\{a,b\}$.

3. $v=b$: $X_b=\{b\}$.
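The construction of $X_v$ by repeated reduction can be sketched as follows, assuming words are Python strings over `a` and `b`. The names `reduce_word` and `x_set` are hypothetical; the reduction $R$ simply replaces the last $a$ and the run of $b$'s that follows it by a single $b$.

```python
def reduce_word(w: str) -> str:
    """R(w): write w = v a b^i and return v b."""
    last_a = w.rfind("a")
    if last_a < 0:
        raise ValueError("w must contain the letter a")
    return w[:last_a] + "b"


def x_set(v: str) -> list[str]:
    """X_v = [v, R(v), R^2(v), ..., R^k(v)] with k = |v|_a."""
    words = [v]
    while "a" in words[-1]:
        words.append(reduce_word(words[-1]))
    return words
```

Each reduction removes exactly one occurrence of $a$, which is why the list has $|v|_a+1$ elements and ends with $b$.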

By construction, $v$ is the smallest element of $X_v^+$ for the lexicographical order.

Lemma 1. Every word $x\in X_v$ is a Lyndon word.

Proof. If $v=a$, then $X_v=\{a,b\}$, else any element of $X_v$ ends by a $b$. In this case, if $x\notin L$, there exists a decomposition $x=x_1x_2b$ such that $x_2bx_1\le x_1x_2b$ and $x_1\ne\varepsilon$. Thus $x_2a$ is not a left factor of $x_1x_2b$ and $x_2a<x_1x_2a$. By construction of $X_v$, as $x\ne v$, there exists a word $w$ such that $v=x_1x_2aw$. We get that $x_2awx_1<x_1x_2aw$. This is impossible since $v\in L$.

A code $C$ over $A^*$ is a set of non-empty words such that any word $w$ of $A^*$ can be written in at most one way as a product of elements of $C$. A set of words is prefix if none of its elements is the prefix of another one. Such a set is a code, called a prefix code. A code $C$ is said to be circular if any word of $A^*$ written along a circle admits at most one decomposition as a product of words of $C$. These codes can be characterized as the bases of very pure monoids, i.e., if $w^n\in C^*$ then $w\in C^*$. For a general reference about codes, see [BP85].

Proposition 2. The set $X_v$ is a prefix circular code.

Proof. If $x,y\in X_v$ with $|x|<|y|$, then, by construction of $X_v$, $x>y$. So $x$ is not a left factor of $y$ and $X_v$ is a prefix code.

Moreover, for every $n\ge 1$, if $w$ is a word such that $w^n\in X_v^*$ then $w\in X_v^*$. Indeed if $w\notin X_v^*$, then either $w$ is a proper prefix of a word of $X_v$ or $w$ has a prefix in $X_v$. If $w$ is a proper prefix of a word of $X_v$, it is a prefix of $v$ and it is strictly smaller than any word of $X_v$. As $w^n\in X_v^*$, $w$ or one of its prefixes is a suffix of a word of $X_v$. But all elements of $X_v$ are Lyndon words greater than $v$, so their suffixes are strictly greater than $v$ and $w$ cannot be a prefix of a word of $X_v$.

Now if $w=w_1w_2$ where $w_1$ is the longest prefix of $w$ in $X_v^+$, then $w_2$ is a non-empty prefix of a word of $X_v$, so $w_2$ is strictly smaller than any word of $X_v$. As $w^n\in X_v^*$, $w_2$ or one of its prefixes is a suffix of a word of $X_v$, but all elements of $X_v$ are Lyndon words greater than $v$, so their suffixes are strictly greater than $v$ and $w$ cannot have a prefix in $X_v^+$.

As a conclusion, since $X_v$ is a code and for every $n\ge 1$, if $w^n\in X_v^*$ then $w\in X_v^*$, the set $X_v$ is a circular code.

Proposition 3. Let $l\in L$ be a Lyndon word; then $l\ge v$ if and only if $l\in X_v^+$.

Proof. If $l\ge v$, let $l_1$ be the longest prefix of $l$ which belongs to $X_v^*$, and $l_2$ such that $l=l_1l_2$. If $l_2\ne\varepsilon$, we have the inequality $l_2l_1>l\ge v$, thus $l_2l_1>v$. The word $v$ is not a prefix of $l_2$ since $l_2$ has no prefix in $X_v$, hence we have $l_2=l_2'bl_2''$ and $v=l_2'av''$. Then, by construction of $X_v$, $l_2'b\in X_v$, which is impossible. Thus $l_2=\varepsilon$ and $l\in X_v^+$.

Conversely, if $l\in X_v^+$, as a product of words greater than $v$, $l\ge v$.

Theorem 2. Let $v\in L$ and $w\in A^*$. Then $awv$ is a Lyndon word with $aw\cdot v$ as standard factorization if and only if $w\in X_v^*\setminus(a^{-1}X_v)X_v^*$. Hence the set $F_v$ of Lyndon words having $v$ as right standard factor is a rational language.

Proof. Assume that $awv$ is a Lyndon word and its standard factorization is $aw\cdot v$. By Theorem 1, $wv$ can be written uniquely as

$$wv=l_1l_2\cdots l_n,\qquad l_i\in L,\qquad l_1\ge l_2\ge\cdots\ge l_n.$$

As $v$ is the smallest (for the lexicographical order) suffix of $awv$, and consequently of $wv$, we get $l_n=v$; if $w=\varepsilon$, then $n=1$, else $n\ge 2$ and for $1\le i\le n-1$, $l_i\ge v$. Thus, $w\in X_v^*$.

Moreover if $w\in(a^{-1}X_v)X_v^*$, then $aw\in X_v^+\cap L$. Hence $aw\ge v$, which is contradictory with the definition of the standard factorization. So $w\in X_v^*\setminus(a^{-1}X_v)X_v^*$.

Conversely, if $w\in X_v^*\setminus(a^{-1}X_v)X_v^*$, then

$$w=x_1x_2\cdots x_n,\qquad x_i\in X_v,\qquad\text{and}\qquad aw\notin X_v^+.$$

From Proposition 1, the product $ll'$ of two Lyndon words such that $l<l'$ is a Lyndon word. Replacing as much as possible $x_ix_{i+1}$ by their product when $x_i<x_{i+1}$, $w$ can be rewritten as

$$w=y_1y_2\cdots y_m,\qquad y_i\in X_v^+\cap L,\qquad y_1\ge y_2\ge\cdots\ge y_m.$$

As $aw\notin X_v^+$, for any integer $1\le i\le m$, $ay_1\cdots y_i\notin X_v^+$.

Now we prove by induction that $aw\in L$. As $y_1\in L$ and $a<y_1$, $ay_1\in L$. Suppose that $ay_1\cdots y_i\in L$. Then, as $y_{i+1}\in L\cap X_v^+$ and $ay_1\cdots y_i\in L\setminus X_v^+$, from Proposition 3 we get $ay_1\cdots y_i<v\le y_{i+1}$. Hence $ay_1\cdots y_{i+1}\in L$. So, $aw\in L$.

As $aw\in L\setminus X_v^+$, $aw<v$ and $awv\in L$. Setting $v=y_{m+1}$, we have

$$wv=y_1y_2\cdots y_my_{m+1},\qquad y_i\in X_v^+\cap L,\qquad y_1\ge y_2\ge\cdots\ge y_{m+1}.$$

Moreover any proper suffix $s$ of $awv$ is a suffix of $wv$ and can be written as $s=y_i'y_{i+1}\cdots y_{m+1}$ where $y_i'$ is a suffix of $y_i$. As $y_i\in L$, $y_i'\ge y_i$. As $y_i\in X_v^+$, $y_i\ge v$ and thus $s\ge v$. Thus, $v$ is the smallest suffix of $awv$ and $aw\cdot v$ is the standard factorization of the Lyndon word $awv$.

Finally, as the set of rational languages is closed by complementation, concatenation, Kleene star operation and left quotient, for any Lyndon word $v$, the set $F_v$ of Lyndon words having $v$ as right standard factor is a rational language.
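The characterization of Theorem 2 can be checked exhaustively on short words. The sketch below is an illustration only: all function names are hypothetical, membership in $X_v^*$ is decided by greedy left-to-right parsing (which is sound because $X_v$ is a prefix code), and the condition $w\notin(a^{-1}X_v)X_v^*$ is tested in the equivalent form $aw\notin X_v^+$.

```python
from itertools import product


def is_lyndon(w: str) -> bool:
    return len(w) > 0 and all(w < w[i:] for i in range(1, len(w)))


def right_factor(w: str) -> str:
    """Smallest proper suffix of w (the right factor when w is Lyndon)."""
    return min(w[i:] for i in range(1, len(w)))


def x_set(v: str) -> list[str]:
    """X_v, obtained by repeatedly replacing the last 'a b^i' block by 'b'."""
    words = [v]
    while "a" in words[-1]:
        words.append(words[-1][:words[-1].rfind("a")] + "b")
    return words


def in_star(w: str, code: list[str]) -> bool:
    """Membership of w in code^*; greedy parsing is enough for a prefix code."""
    while w:
        for x in code:
            if w.startswith(x):
                w = w[len(x):]
                break
        else:
            return False
    return True


def theorem2_condition(w: str, v: str) -> bool:
    """w in X_v^* and aw not in X_v^+ (aw is non-empty, so ^* equals ^+ here)."""
    code = x_set(v)
    return in_star(w, code) and not in_star("a" + w, code)


def check_theorem2(v: str, max_len: int) -> bool:
    """Compare both sides of Theorem 2 for every w with |w| <= max_len."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            word = "a" + w + v
            direct = is_lyndon(word) and right_factor(word) == v
            if direct != theorem2_condition(w, v):
                return False
    return True
```

A call such as `check_theorem2("aab", 6)` compares the direct definition with the rational description for all 127 words $w$ of length at most 6.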

Remark. The proof of Theorem 2 leads to a linear algorithm that computes the right factor of a Lyndon word, using the fact that the factorization of Theorem 1 can be achieved in linear time and space (by an algorithm of Duval [Duv83], see Section 5).

We define the generating functions $X_v(z)$ of $X_v$ and $X_v^*(z)$ of $X_v^*$:

$$X_v(z)=\sum_{w\in X_v}z^{|w|}\qquad\text{and}\qquad X_v^*(z)=\sum_{w\in X_v^*}z^{|w|}.$$

As the set $X_v$ is a code, the elements of $X_v^*$ are sequences of elements of $X_v$ (see [FS02]):

$$X_v^*(z)=\frac{1}{1-X_v(z)}.$$

Denote by $F_v(z)=\sum_{x\in F_v}z^{|x|}$ the generating function of the set

$$F_v=\{awv\in L\mid aw\cdot v\ \text{is the standard factorization}\}.$$

Theorem 3. Let $v$ be a Lyndon word. The generating function of the set $F_v$ of Lyndon words having a right standard factor $v$ can be written

$$F_v(z)=z^{|v|}\left(1+\frac{2z-1}{1-X_v(z)}\right).$$

Proof. First of all, note that any Lyndon word of $\{a,b\}^*$ which is not a letter ends with the letter $b$, so $F_a(z)=0$. And as $X_a=\{a,b\}$, the formula given for $F_v(z)$ holds for $v=a$.

Assume that $v\ne a$. From Theorem 2, $F_v(z)$ can be written as

$$F_v(z)=z^{|av|}\sum_{w\in X_v^*\setminus a^{-1}X_v^+}z^{|w|}.$$

In order to transform this combinatorial description involving $X_v^*\setminus a^{-1}X_v^+$ into an enumerative formula for the generating function $F_v(z)$, we prove first that $a^{-1}X_v^+\subset X_v^*$ and next that the set $a^{-1}X_v^+$ can be described as a disjoint union of rational sets.

If $x\in X_v\setminus\{b\}$, then $x$ is greater than $v$ and as $x$ is a Lyndon word, its proper suffixes are strictly greater than $v$; consequently, writing $a^{-1}x$ as a non-increasing sequence of Lyndon words $l_1,\ldots,l_m$, we get, since $l_m\ge v$, that for all $i$, $l_i$ is greater than $v$. Consequently from Proposition 3, for all $i$, $l_i\in X_v^+$ and, as a product of elements of $X_v^+$, $a^{-1}x\in X_v^+$. Therefore $a^{-1}(X_v\setminus\{b\})X_v^*\subset X_v^*$.

Moreover if $x_1,x_2\in X_v$ and $x_1\ne x_2$, as $X_v$ is a prefix code,

$$a^{-1}x_1X_v^*\cap a^{-1}x_2X_v^*=\emptyset.$$

Thus $a^{-1}(X_v\setminus\{b\})X_v^*$ is the disjoint union of the sets $a^{-1}x_iX_v^*$ when $x_i$ ranges over $X_v\setminus\{b\}$. Consequently the generating function of the set $F_v$ of Lyndon words having $v$ as right factor satisfies

$$F_v(z)=z^{|v|+1}\,\frac{1-\dfrac{X_v(z)-z}{z}}{1-X_v(z)}$$

and finally the announced equality.
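The closed form of Theorem 3 can be compared with a brute-force count. The following sketch expands $F_v(z)=z^{|v|}\bigl(1+(2z-1)/(1-X_v(z))\bigr)$ as a power series using the recurrence $s_0=1$, $s_n=\sum_{x\in X_v}s_{n-|x|}$ for the coefficients of $1/(1-X_v(z))$. All function names are illustrative assumptions.

```python
from itertools import product


def is_lyndon(w: str) -> bool:
    return len(w) > 0 and all(w < w[i:] for i in range(1, len(w)))


def x_set(v: str) -> list[str]:
    words = [v]
    while "a" in words[-1]:
        words.append(words[-1][:words[-1].rfind("a")] + "b")
    return words


def fv_coefficients(v: str, max_n: int) -> list[int]:
    """Coefficients [z^0 .. z^max_n] of F_v(z) from the formula of Theorem 3."""
    lengths = [len(x) for x in x_set(v)]
    # s[n] = coefficient of z^n in 1 / (1 - X_v(z)).
    s = [1] + [0] * max_n
    for n in range(1, max_n + 1):
        s[n] = sum(s[n - l] for l in lengths if l <= n)
    coeffs = [0] * (max_n + 1)
    for n in range(len(v), max_n + 1):
        m = n - len(v)
        # Coefficient of z^m in 1 + (2z - 1) / (1 - X_v(z)).
        coeffs[n] = (1 if m == 0 else 0) - s[m] + (2 * s[m - 1] if m >= 1 else 0)
    return coeffs


def fv_brute_force(v: str, max_n: int) -> list[int]:
    """Number of Lyndon words of each length whose smallest proper suffix is v."""
    counts = [0] * (max_n + 1)
    for n in range(2, max_n + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if is_lyndon(w) and min(w[i:] for i in range(1, n)) == v:
                counts[n] += 1
    return counts
```

For $v=b$ both functions give one word per length $n\ge 2$, namely $ab^{n-1}$.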

Note that the function $F_v(z)$ is rational for any Lyndon word $v$. But the right standard factor runs over the set of Lyndon words, which is not context-free [BB97]. Therefore in order to study the average length of the factors in the [...]

Making use of probabilistic techniques and of results from analytic combinatorics (see [FS02]), we establish the following result.

Theorem 4. The average length, for the uniform distribution over the Lyndon words of length $n$, of the right factor of the standard factorization is asymptotically

$$\frac{3n}{4}\left(1+O\left(\frac{\log^3 n}{n}\right)\right).$$

Remark: The error term comes from successive approximations at different steps of the proof and, for this reason, it is probably overestimated (see experiments in Section 5).

First we partition the set $L_n$ of Lyndon words of length $n$ in the two following subsets: $aL_{n-1}$ and $L_n'=L_n\setminus aL_{n-1}$.

Note that $aL_{n-1}\subset L_n$ (that is, if $w$ is a Lyndon word then $aw$ is also a Lyndon word). Moreover if $w\in aL_{n-1}$, the standard factorization is $w=a\cdot v$ with $v\in L_{n-1}$. As

$$\mathrm{Card}(L_{n-1})=\frac{2^{n-1}}{n-1}\left(1+O\left(2^{-n/2}\right)\right),$$

the contribution of the set $aL_{n-1}$ to the mean value of the length of the right factor is

$$(n-1)\,\frac{\mathrm{Card}(aL_{n-1})}{\mathrm{Card}(L_n)}=\frac{n}{2}\left(1+O\left(\frac{1}{n}\right)\right).$$
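The role of the subset $aL_{n-1}$ can be observed by exhaustive enumeration for small $n$. The sketch below is illustrative only (hypothetical function names, brute-force enumeration, exact rational arithmetic); it assumes $n\ge 3$, so that every word of $L_{n-1}$ has length at least 2.

```python
from fractions import Fraction
from itertools import product


def is_lyndon(w: str) -> bool:
    return len(w) > 0 and all(w < w[i:] for i in range(1, len(w)))


def lyndon_words(n: int) -> list[str]:
    return [w for w in ("".join(t) for t in product("ab", repeat=n)) if is_lyndon(w)]


def right_factor(w: str) -> str:
    """Smallest proper suffix of the Lyndon word w."""
    return min(w[i:] for i in range(1, len(w)))


def average_right_factor(n: int) -> Fraction:
    """Exact mean length of the right factor over L_n."""
    words = lyndon_words(n)
    return Fraction(sum(len(right_factor(w)) for w in words), len(words))


def a_prefix_contribution(n: int) -> Fraction:
    """(n - 1) * Card(a L_{n-1}) / Card(L_n), the share of the mean due to a L_{n-1}."""
    return Fraction((n - 1) * len(lyndon_words(n - 1)), len(lyndon_words(n)))


def a_prefix_is_standard(n: int) -> bool:
    """Check that for each v in L_{n-1}, av is Lyndon with factorization a . v."""
    return all(is_lyndon("a" + v) and right_factor("a" + v) == v
               for v in lyndon_words(n - 1))
```

For $n=3$ the words are $aab=a\cdot ab$ and $abb=ab\cdot b$, so the mean is $3/2$, of which $aL_2$ contributes $1$.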

The remaining part of this paper is devoted to the standard factorization of the words of $L_n'$, which requires a careful analysis.

Proposition 4. The contribution of the set $L_n'$ to the mean value of the length of the right factor is

$$\frac{n}{4}\left(1+O\left(\frac{\log^3 n}{n}\right)\right).$$

This proposition basically asserts that, on average for the uniform distribution over $L_n'$, the length of the right factor is asymptotically $n/2$.

The idea is to build a transformation $\varphi$, which is a bijection on a set $D_n\subset L_n'$, such that the sum of the lengths of the standard right factors of $w$ and $\varphi(w)$ is about $|w|$, the length of $w$. Indeed with such a relation we can compute the contribution of $D_n$ to the expectation of the parameter right. Then if the contribution of $L_n'\setminus D_n$ to the parameter right is negligible, we are able to conclude for the expectation of the parameter right.

It remains to exhibit/construct such a bijection $\varphi$ and determine a "good" set $D_n$. This is done in the following way: assume that $w$ is a Lyndon word in $L_n\setminus aL_{n-1}$. Let us denote by $k$ the length of the first run of $a$'s of the standard right [...]

Indeed the standard factorization of $w$ can only be one of the following:

$$w=a^{k+1}bu\cdot a^kbv\ \text{(first kind)},\qquad w=a^kbu\cdot a^kbv\ \text{(second kind)}.$$

This means that the left factor of a Lyndon word $w$ can only begin by $a^{k+1}b$ or $a^kb$ when we know that the right factor begins by $a^kb$ (otherwise $w$ cannot be in $L_n\setminus aL_{n-1}$). Let us fix an integer parameter $\lambda\in\mathbb{Z}^+$. Then the words $u,v$ of $X_k$ can be uniquely written as $u=u'u''$ and $v=v'v''$ where $u'$ and $v'$ are the smallest prefixes of $u$ and $v$ of length at least $\lambda$ and ending by a $b$ (there is always such a symbol $b$ if these words are not empty, since then $u$ and $v$ end with a $b$). When $|u|,|v|\ge\lambda$ we define $\varphi(w)$ for a word $w=a^kbu\cdot a^kbv$ (resp. $w=a^{k+1}bu\cdot a^kbv$) by

$$\varphi(w)=a^kbu'v''\cdot a^kbv'u''\qquad(\text{resp. } a^{k+1}bu'v''\cdot a^kbv'u'').$$

For example, considering $w=a^kbabb\cdot a^kbbaaba^kbbbb$, if we choose $\lambda=2$ and so $u'=ab$, $u''=b$, $v'=baab$, $v''=a^kbbbb$, then we get $\varphi(w)=a^kbaaba^kbbbb\cdot a^kbab\in L$. Here $|w|=|\varphi(w)|=3k+13$ and the lengths of the right standard factors are $2k+9$ and $k+3$ respectively.

Some words give hints of what we must be careful about if we want $\varphi(w)$ to be a Lyndon word.

- If we want the application $\varphi$ to be well defined, the parameter $\lambda$ must be greater than or equal to 1. So the longest runs of $a$'s have to be separated by non-empty words. If $w=a^kb\cdot a^kbb$, then $u=\varepsilon$ is the empty word. The application exchanging $u$ and $v$ gives a word which is no longer a Lyndon word.

- If $w=a^kbab\cdot a^kbabb$, then $u=ab$ and it is a prefix of $v$. For any choice of $\lambda$, $\varphi(w)$ is not a Lyndon word. So the longest runs of $a$'s have to be separated by words having distinct prefixes to ensure that $\varphi(w)$ is a Lyndon word.

- If $w=a^kbbab\cdot a^kbbbab$, then if we choose $\lambda=1$, we get $\varphi(w)=a^kbbbab\cdot a^kbbab\notin L$ (since $u'=v'$ and $u'v''=bbab>v'u''=bab$). Thus we have to take care, when we apply the transformation, that $\varphi(w)$ is still smaller than its proper suffixes (this is ensured if $u'\ne v'$).

The application $\varphi$ and the set $D_n$ are dependent, and to suit our needs they are implicitly determined by the following constraints:

1. The function $\varphi$ is an involution on $D_n$: $\varphi(\varphi(w))=w$.

2. The standard factorization of $\varphi(w)$ for $w\in D_n$ is

$$\varphi(w)=a^{k+1}bu'v''\cdot a^kbv'u''\ \text{(first kind)},\qquad \varphi(w)=a^kbu'v''\cdot a^kbv'u''\ \text{(second kind)}.$$

3. [...]

$$|\mathrm{right}(w)|+|\mathrm{right}(\varphi(w))|=|w|\,(1+o(|w|)).$$

4. The set $D_n$ "captures" most of the set $L_n'$ in an asymptotic way when $n$ grows to $\infty$, that is

$$\frac{\mathrm{Card}(D_n)}{\mathrm{Card}(L_n')}=1-o(1).$$

Most of these conditions are related to the properties of the longest runs of $a$'s. Hence, in the following parts, we study some combinatorial properties of the longest runs of $a$'s in Lyndon words to characterize $\varphi$ and $D_n$ precisely.

5 Algorithms and experimental results

In this section we give an algorithm to generate random Lyndon words of a given length $n$ and use it to establish some experimental results about the length of the right factor in the standard factorization.

[Figure 1: plot of the length of the right factor (vertical axis, 0 to 12,000) against the length of the Lyndon word (horizontal axis, 1,000 to 10,000), together with the line 3x/4.]

Fig. 1. Average length of the right factor of random Lyndon words with lengths from 1,000 to 10,000. Each plot is computed with 1,000 words. The error bars represent the standard deviation.

Our algorithms use Duval's algorithm [Duv83], which computes in linear time [...] words $l_1\ge l_2\ge\cdots\ge l_k$ such that

$$u=l_1l_2\cdots l_k.$$

Let the function Duval(string u, int k, array pos) be the function which computes the Lyndon decomposition of $u$ by storing in an array pos of size $k$ the positions of the factors.

There exists an algorithm SmallestConjugate(u), proposed by Booth [Lot03, ?], that computes the smallest conjugate of a word of length $n$ in linear time. We use it to build a rejection algorithm which efficiently generates a random Lyndon word of length $n$:

RandomLyndonWord(n)                // return a random Lyndon word

    string u, v;

    do

        u = RandomWord(n);         // u is a random word of A^n

        v = SmallestConjugate(u);  // v is the smallest conjugate of u

    until (length(v) == n);        // v is primitive

    return v;

The algorithm RandomLyndonWord computes uniformly a Lyndon word.

Lemma 2. The average complexity of RandomLyndonWord(n) is linear.

Proof. Each execution of the do ... until loop is done in linear time. The condition is not satisfied when $u$ is a conjugate of a power $v^p$ with $p>1$. This happens with probability $O\left(n/2^{n/2}\right)$. Thus the loop is executed a bounded number of times on average.

Lemma 3. Let $l=au$ be a Lyndon word of length greater than or equal to 2 starting with a letter $a$. Let $l_1\cdots l_k$ be the Lyndon factorization of $u$. The right factor of $l$ in its standard factorization is $l_k$.

Proof. By Theorem 1, $l_k$ is the smallest suffix of $u$, thus it is the smallest proper suffix of $l$.

The algorithm to compute the right factor of a Lyndon word $l$ such that $|l|\ge 2$ is the following:

RightFactor(string l[1..n])

    array pos;

    int k;

    pos = Duval(l[2..n], k, pos);  // omit the first letter a and apply Duval's algorithm

    return l[pos[k]..n];           // return the last factor

This algorithm is linear in time since Duval's algorithm is linear.

Figures 1 and 2 present some experimental results obtained with our algorithms.
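Lemma 3 translates directly into code once a Lyndon factorization routine is available. The sketch below uses the usual three-index formulation of Duval's factorization algorithm, returning the factors themselves rather than an array of positions as in the pseudocode; the function names are illustrative.

```python
def duval(s: str) -> list[str]:
    """Lyndon factorization of s as a non-increasing list of Lyndon words
    (standard formulation of Duval's linear-time algorithm)."""
    factors = []
    n = len(s)
    i = 0
    while i < n:
        j, k = i + 1, i
        # Extend the current run while it stays a prefix of a power of a Lyndon word.
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        # Output the repeated Lyndon factor of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


def right_factor(l: str) -> str:
    """Right factor of the standard factorization of a Lyndon word l, |l| >= 2:
    the last Lyndon factor of l without its first letter (Lemma 3)."""
    if len(l) < 2:
        raise ValueError("l must have length at least 2")
    return duval(l[1:])[-1]
```

For example, `right_factor("aaabaab")` factorizes `aabaab` into `aab`, `aab` and returns the last factor `aab`.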

Open problem. The results obtained in this paper are only the first step toward the average case-analysis of the Lyndon tree. The Lyndon tree $T(w)$ of a Lyndon word $w$ is defined recursively as follows:

- if $w$ is a letter, then $T(w)$ is an external node labeled by the letter;

- otherwise, $T(w)$ is an internal node having $T(u)$ and $T(v)$ as children, where the standard factorization of $w$ is $u\cdot v$.

[Figure 2: histogram of the number of hits (vertical axis, 0 to 2,000) against the size of the right factor (horizontal axis, 0 to 5,000).]

Fig. 2. Distribution of the length of the right factor. We generated 100,000 random Lyndon words of length 5,000.
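The recursive definition of $T(w)$ can be sketched with nested tuples, a leaf being a one-letter string and an internal node a pair of subtrees. This is only an illustration: the representation and the function names are assumptions, and the right factor is found by the naive smallest-proper-suffix search rather than by a linear-time method.

```python
def lyndon_tree(w: str):
    """T(w) for a Lyndon word w: a letter, or a pair (T(u), T(v)) where
    u . v is the standard factorization of w."""
    if len(w) == 1:
        return w
    # v is the smallest proper suffix of w; u is the remaining prefix.
    cut = min(range(1, len(w)), key=lambda i: w[i:])
    return (lyndon_tree(w[:cut]), lyndon_tree(w[cut:]))


def height(tree) -> int:
    """Height of a tree, with height 0 for an external node."""
    if isinstance(tree, str):
        return 0
    return 1 + max(height(tree[0]), height(tree[1]))


def leaves(tree) -> str:
    """Concatenation of the leaf labels, which spells the original word."""
    return tree if isinstance(tree, str) else leaves(tree[0]) + leaves(tree[1])
```

For instance, $T(aab)$ is the pair `("a", ("a", "b"))`, of height 2.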

This structure encodes a non-associative operation, either a commutator in the free group [CFL58], or a Lie bracketing [Lot83]; both constructions lead to bases of the free Lie algebra.

In order to study the height of the tree obtained from a Lyndon word by successive standard factorizations, it would be very interesting to get more precise information about the distribution of the right factors of words of $L_n'$. Fig. 2 hints at a very strong equi-repartition property of the length of the right factor over this set. This suggests a very particular subdivision process at each node of the factorization tree, which needs further investigation.

References

[BB97] J. Berstel and L. Boasson. The set of Lyndon words is not context-free. Bull. Eur. Assoc. Theor. Comput. Sci. EATCS, 63:139-140, 1997.

[Boo80] K. S. Booth. Lexicographically least circular substrings. Inform. Process. Lett., 10(4-5):240-242, 1980.

[BP85] J. Berstel and D. Perrin. Theory of codes. Academic Press, 1985.

[BP94] J. Berstel and M. Pocchiola. Average cost of Duval's algorithm for generating Lyndon words. Theoret. Comput. Sci., 132(1-2):415-425, 1994.

[Car85] H. Cartan. Théorie élémentaire des fonctions analytiques d'une ou plusieurs [...]

[CFL58] [...] quotient groups of the lower central series. Ann. Math., 58:81-95, 1958.

[Duv83] J.-P. Duval. Factorizing words over an ordered alphabet. Journal of Algorithms, 4:363-381, 1983.

[Duv88] J.-P. Duval. Génération d'une section des classes de conjugaison et arbre des mots de Lyndon de longueur bornée. Theoret. Comput. Sci., 4:363-381, 1988.

[FGP01] P. Flajolet, X. Gourdon, and D. Panario. The complete analysis of a polynomial factorization algorithm over finite fields. Journal of Algorithms, 40:37-81, 2001.

[FS91] P. Flajolet and M. Soria. The cycle construction. SIAM J. Disc. Math., 4:58-60, 1991.

[FS02] P. Flajolet and R. Sedgewick. Analytic combinatorics - symbolic combinatorics. Book in preparation, 2002. (Individual chapters are available as INRIA Research reports at http://www.algo.inria.fr/flajolet/publist.html).

[Gol69] S. Golomb. Irreducible polynomials, synchronizing codes, primitive necklaces and cyclotomic algebra. In Proc. Conf Combinatorial Math. and Its Appl., pages 358-370, Chapel Hill, 1969. Univ. of North Carolina Press.

[HW38] G. H. Hardy and E. M. Wright. An Introduction to the Number Theory. Oxford University Press, 1938.

[Knu78] D. E. Knuth. The average time for carry propagation. Indagationes Mathematicae, 40:238-242, 1978.

[Lot83] M. Lothaire. Combinatorics on Words, volume 17 of Encyclopedia of mathematics and its applications. Addison-Wesley, 1983.

[Lot03] M. Lothaire. Applied Combinatorics on Words. 2003. In preparation, chapters available at http://www-igm.univ-mlv.fr/~berstel/Lothaire.

[Lyn54] R. C. Lyndon. On Burnside problem I. Trans. American Math. Soc., 77:202-215, 1954.

[PR01] D. Panario and B. Richmond. Smallest components in decomposable structures: exp-log class. Algorithmica, 29:205-226, 2001.

[Reu93] C. Reutenauer. Free Lie algebras. Oxford University Press, 1993.

[RSar] F. Ruskey and J. Sawada. Generating Lyndon brackets: a basis for the n-th homogeneous component of the free Lie algebra. Journal of Algorithms, (to appear). Available at http://www.cs.uvic.ca/~fruskey/Publications/.
