HAL Id: hal-00619865
https://hal-upec-upem.archives-ouvertes.fr/hal-00619865
Submitted on 6 Oct 2011
The average lengths of the factors of the standard factorization of Lyndon words
Frédérique Bassino, Julien Clément, Cyril Nicaud
To cite this version:
Frédérique Bassino, Julien Clément, Cyril Nicaud. The average lengths of the factors of the standard
factorization of Lyndon words. 6th International Conference on Developments in Language Theory
(DLT 2002), Sep 2003, Kyoto, Japan. pp.307-318. hal-00619865
Institut Gaspard Monge
Université de Marne-la-Vallée
77454 Marne-la-Vallée Cedex 2 - France
email: {bassino,clementj,nicaud}@univ-mlv.fr
Abstract. A non-empty word $w$ of $\{a,b\}^*$ is a Lyndon word if and only if it is strictly smaller for the lexicographical order than any of its proper suffixes. Such a word $w$ is either a letter or admits a standard factorization $uv$ where $v$ is its smallest proper suffix. For any Lyndon word $v$, we show that the set of Lyndon words having $v$ as right factor of the standard factorization is rational and compute explicitly the associated generating function. Next we establish that, for the uniform distribution over the Lyndon words of length $n$, the average length of the right factor $v$ of the standard factorization is asymptotically $3n/4$. Finally we present algorithms on Lyndon words derived from our work together with experimental results.
1 Introduction
Given a totally ordered alphabet $A$, a Lyndon word is a word that is strictly smaller, for the lexicographical order, than any of its conjugates (i.e., all words obtained by a circular permutation on the letters). Lyndon words were introduced by Lyndon [Lyn54] under the name of "standard lexicographic sequences" in order to give a base for the free Lie algebra over $A$; the standard factorization plays a central role in this framework (see [Lot83], [Reu93], [RSar]).
One of the basic properties of the set of Lyndon words is that every word is uniquely factorizable as a non-increasing product of Lyndon words. As there exists a bijection between Lyndon words over an alphabet of cardinality $k$ and irreducible polynomials over $\mathbb{F}_k$ [Gol69], a lot of results are known about this factorization: the average number of factors, the average length of the longest factor [FGP01] and of the shortest [PR01].
Several algorithms deal with Lyndon words. Duval gives in [Duv83] an algorithm that computes, in linear time, the factorization of a word into Lyndon words; he also presents in [Duv88] an algorithm for generating all Lyndon words up to a given length in lexicographical order. This algorithm runs in a constant [...]
[...]ative properties of these sets of words. Then we introduce the standard factorization of a Lyndon word $w$, which is the unique couple of Lyndon words $u$, $v$ such that $w = uv$ and $v$ is of maximal length.
In Section 3, we study the set of Lyndon words of $\{a,b\}^*$ having a given right factor in their standard factorization and prove that it is a rational language. We also compute its associated generating function. But as the set of Lyndon words is not context-free [BB97], we are not able to directly derive asymptotic properties from these generating functions. Consequently in Section 4 we use probabilistic techniques and results from analytic combinatorics (see [FS02]) in order to compute the average length of the factors of the standard factorization of Lyndon words.
Section 5 is devoted to algorithms and experimental results. We give an algorithm to generate randomly, for the uniform distribution, a Lyndon word of a given length, and another one related to the standard factorization of a Lyndon word, which is based on the proof of Theorem 2 of Section 3. To the best of our knowledge these algorithms, although simple and not necessarily new, are not found elsewhere. Finally experiments are given which confirm our results and give hints of further studies.
The results contained in this paper constitute a first step in the study of the average behavior of the binary Lyndon trees obtained from Lyndon words by a recursive application of the standard factorization.
2 Preliminary
We denote by $A^*$ the free monoid over the alphabet $A = \{a,b\}$ obtained by all finite concatenations of elements of $A$. The length $|w|$ of a word $w$ is the number of letters $w$ is a product of, and $|w|_a$ is the number of occurrences of the letter $a$ in $w$. We consider the lexicographical order $<$ over all non-empty words of $A^*$ defined by the extension of the order $a < b$ over $A$.
We record two properties of this order:
(i) For any word $w$ of $A^*$, $u < v$ if and only if $wu < wv$.
(ii) Let $x, y \in A^*$ be two words such that $x < y$. If $x$ is not a prefix of $y$ then for every $x', y' \in A^*$ we have $xx' < yy'$.
By definition, a Lyndon word is a primitive word (i.e., it is not a power of another word) that is minimal, for the lexicographical order, in its conjugate class (i.e., the set of all words obtained by a circular permutation). The set of Lyndon words of length $n$ is denoted by $\mathcal{L}_n$ and $\mathcal{L} = \bigcup_n \mathcal{L}_n$.
$$\mathcal{L} = \{a, b, ab, aab, abb, aaab, aabb, abbb, aaaab, aaabb, aabab, aabbb, ababb, abbbb, \dots\}$$
Equivalently, $w \in \mathcal{L}$ if and only if $w \in A^+$ is strictly smaller than any of its proper suffixes.
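The two characterizations above (minimal primitive conjugate, and strictly smaller than every proper suffix) can be compared by brute force on short binary words. The following is only an illustrative sketch: the function names are ours, and Python's built-in string comparison is used as the lexicographical order with a < b.

```python
from itertools import product

def is_lyndon(w):
    """True if w is non-empty and strictly smaller than each of its proper suffixes."""
    return len(w) > 0 and all(w < w[i:] for i in range(1, len(w)))

def is_lyndon_by_conjugates(w):
    """True if w is strictly smaller than every other rotation of itself.

    Strictness rules out proper powers, so w is primitive and minimal
    in its conjugacy class.
    """
    return len(w) > 0 and all(w < w[i:] + w[:i] for i in range(1, len(w)))

def lyndon_words(n):
    """All Lyndon words of length n over {a, b}, in lexicographical order."""
    words = ("".join(t) for t in product("ab", repeat=n))
    return [w for w in words if is_lyndon(w)]

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, lyndon_words(n))
```

Enumerating all $2^n$ words is exponential, so this is only meant for checking small cases, not as a generation algorithm.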
Proposition 1 A word $w \in A^+$ is a Lyndon word if and only if either $w \in A$ or $w = uv$ with $u, v \in \mathcal{L}$, $u < v$.
Theorem 1 (Lyndon) Any word $w \in A^+$ can be written uniquely as a non-increasing product of Lyndon words:
$$w = l_1 l_2 \cdots l_n, \qquad l_i \in \mathcal{L}, \qquad l_1 \ge l_2 \ge \cdots \ge l_n.$$
Moreover, $l_n$ is the smallest suffix of $w$.
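The number $\mathrm{Card}(\mathcal{L}_n)$ of Lyndon words of length $n$ over $A$ (see [Lot83]) is
$$\mathrm{Card}(\mathcal{L}_n) = \frac{1}{n} \sum_{d \mid n} \mu(d)\, \mathrm{Card}(A)^{n/d},$$
where $\mu$ is the Moebius function defined on $\mathbb{N} \setminus \{0\}$ by $\mu(1) = 1$, $\mu(n) = (-1)^i$ if $n$ is the product of $i$ distinct primes and $\mu(n) = 0$ otherwise.
When $\mathrm{Card}(A) = 2$, we obtain the following estimate
$$\mathrm{Card}(\mathcal{L}_n) = \frac{2^n}{n} \left(1 + O\!\left(2^{-n/2}\right)\right).$$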
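As a small numerical illustration of the counting formula, the sketch below evaluates the Moebius sum directly. The helper names `mobius` and `card_lyndon` are ours, and the Moebius function is computed by plain trial division, which is enough for small $n$.

```python
def mobius(n):
    """Moebius function of n >= 1, computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # n has a square factor
            result = -result      # one more distinct prime factor
        p += 1
    if n > 1:                     # a remaining prime factor
        result = -result
    return result

def card_lyndon(n, alphabet_size=2):
    """Number of Lyndon words of length n over an alphabet of the given size."""
    total = sum(mobius(d) * alphabet_size ** (n // d)
                for d in range(1, n + 1) if n % d == 0)
    return total // n

if __name__ == "__main__":
    for n in range(1, 21):
        exact = card_lyndon(n)
        print(n, exact, exact / (2 ** n / n))   # the ratio tends to 1
```

For the binary alphabet the first values are 2, 1, 2, 3, 6, 9, in agreement with the list of Lyndon words given above for lengths 1 to 5.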
Definition 1 (Standard factorization). For $w \in \mathcal{L} \setminus A$ a Lyndon word not reduced to a letter, the pair $(u, v)$, $u, v \in \mathcal{L}$, such that $w = uv$ and $v$ is of maximal length is called the standard factorization. The words $u$ and $v$ are called the left factor and right factor of the standard factorization.
Equivalently, the right factor $v$ of the standard factorization of a Lyndon word $w$ which is not reduced to a letter can be defined as the smallest proper suffix of $w$.
Examples.
$$aaabaab = aaab \cdot aab; \qquad aaababb = a \cdot aababb; \qquad aabaabb = aab \cdot aabb.$$
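The characterization of the right factor as the smallest proper suffix translates directly into a few lines of code. This is a naive quadratic sketch for checking the examples, not the linear-time method discussed in Section 5; the function name is ours, and the input is assumed (not verified) to be a Lyndon word.

```python
def standard_factorization(w):
    """Return (u, v) with w = uv and v the smallest proper suffix of w.

    w is assumed to be a Lyndon word of length at least 2.
    """
    if len(w) < 2:
        raise ValueError("a single letter has no standard factorization")
    v = min(w[i:] for i in range(1, len(w)))   # smallest proper suffix
    return w[:len(w) - len(v)], v

if __name__ == "__main__":
    for w in ("aaabaab", "aaababb", "aabaabb"):
        print(w, "=", " . ".join(standard_factorization(w)))
```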
3 Counting Lyndon words with a given right factor
In this section, we prove that the set of Lyndon words with a given right factor in their standard factorization is a rational language and compute its generating function. The techniques used in the following basically come from combinatorics on words.
Let $w = v a b^i$ be a word containing at least one $a$ and ending with a (possibly empty) sequence of $b$'s. The word $R(w) = vb$ is the reduced word of $w$.
For any Lyndon word $v$, we define the set
$$\mathcal{X}_v = \{v_0 = v,\ v_1 = R(v),\ v_2 = R^2(v),\ \dots,\ v_k = R^k(v)\},$$
where $k = |v|_a$ is the number of occurrences of $a$ in $v$. Note that $\mathrm{Card}(\mathcal{X}_v) = |v|_a + 1$ and $v_k = b$.
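Examples.
1. $v = aabab$: $\mathcal{X}_{aabab} = \{aabab, aabb, ab, b\}$.
2. $v = a$: $\mathcal{X}_a = \{a, b\}$.
3. $v = b$: $\mathcal{X}_b = \{b\}$.
By construction, $v$ is the smallest element of $\mathcal{X}_v^+$ for the lexicographical order.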
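The reduction $R$ and the set built from $v$ by iterating it are easy to compute, which gives a quick way to check the examples. In this sketch the names `reduce_word` and `x_set` are ours; the set is returned as a list in the order $v, R(v), R^2(v), \dots, b$.

```python
def reduce_word(w):
    """R(w): write w = v a b^i (last a followed only by b's) and return v b."""
    i = w.rfind("a")
    if i < 0:
        raise ValueError("w must contain the letter a")
    return w[:i] + "b"

def x_set(v):
    """The words v, R(v), R^2(v), ..., b, obtained by iterating R on v."""
    words = [v]
    while "a" in words[-1]:
        words.append(reduce_word(words[-1]))
    return words

if __name__ == "__main__":
    for v in ("aabab", "a", "b"):
        print(v, x_set(v))
```

Each application of `reduce_word` removes exactly one occurrence of $a$, so the list has $|v|_a + 1$ elements and, since the last $a$ is always followed only by $b$'s, it ends with the word $b$.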
Lemma 1 Every word $x \in \mathcal{X}_v$ is a Lyndon word.
Proof. If $v = a$, then $\mathcal{X}_v = \{a, b\}$, else any element of $\mathcal{X}_v$ ends by a $b$. In this case, if $x \notin \mathcal{L}$, there exists a decomposition $x = x_1 x_2 b$ such that $x_2 b x_1 \le x_1 x_2 b$ and $x_1 \ne \varepsilon$. Thus $x_2 a$ is not a left factor of $x_1 x_2 b$ and $x_2 a < x_1 x_2 a$. By construction of $\mathcal{X}_v$, as $x \ne v$, there exists a word $w$ such that $v = x_1 x_2 a w$. We get that $x_2 a w x_1 < x_1 x_2 a w$. This is impossible since $v \in \mathcal{L}$.
A code $C$ over $A^*$ is a set of non-empty words such that any word $w$ of $A^*$ can be written in at most one way as a product of elements of $C$. A set of words is prefix if none of its elements is the prefix of another one. Such a set is a code, called a prefix code. A code $C$ is said to be circular if any word of $A^*$ written along a circle admits at most one decomposition as a product of words of $C$. These codes can be characterized as the bases of very pure monoids, i.e., if $w^n \in C^*$ then $w \in C^*$. For a general reference about codes, see [BP85].
Proposition 2 The set $\mathcal{X}_v$ is a prefix circular code.
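Proof. If $x, y \in \mathcal{X}_v$ with $|x| < |y|$, then, by construction of $\mathcal{X}_v$, $x > y$. So $x$ is not a left factor of $y$ and $\mathcal{X}_v$ is a prefix code.
Moreover, for every $n \ge 1$, if $w$ is a word such that $w^n \in \mathcal{X}_v^*$ then $w \in \mathcal{X}_v^*$. Indeed if $w \notin \mathcal{X}_v^*$, then either $w$ is a proper prefix of a word of $\mathcal{X}_v$ or $w$ has a prefix in $\mathcal{X}_v$. If $w$ is a proper prefix of a word of $\mathcal{X}_v$, it is a prefix of $v$ and it is strictly smaller than any word of $\mathcal{X}_v$. As $w^n \in \mathcal{X}_v^*$, $w$ or one of its prefixes is a suffix of a word of $\mathcal{X}_v$. But all elements of $\mathcal{X}_v$ are Lyndon words greater than or equal to $v$, so their suffixes are strictly greater than $v$ and $w$ cannot be a prefix of a word of $\mathcal{X}_v$.
Now if $w = w_1 w_2$ where $w_1$ is the longest prefix of $w$ in $\mathcal{X}_v^+$, then $w_2$ is a non-empty prefix of a word of $\mathcal{X}_v$, so $w_2$ is strictly smaller than any word of $\mathcal{X}_v$. As $w^n \in \mathcal{X}_v^*$, $w_2$ or one of its prefixes is a suffix of a word of $\mathcal{X}_v$, but all elements of $\mathcal{X}_v$ are Lyndon words greater than or equal to $v$, so their suffixes are strictly greater than $v$ and $w$ cannot have a prefix in $\mathcal{X}_v^+$.
As a conclusion, since $\mathcal{X}_v$ is a code and for every $n \ge 1$, if $w^n \in \mathcal{X}_v^*$ then $w \in \mathcal{X}_v^*$, $\mathcal{X}_v$ is a circular code.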
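Proposition 3 Let $l \in \mathcal{L}$ be a Lyndon word. Then $l \ge v$ if and only if $l \in \mathcal{X}_v^+$.
Proof. If $l \ge v$, let $l_1$ be the longest prefix of $l$ which belongs to $\mathcal{X}_v^*$, and $l_2$ such that $l = l_1 l_2$. If $l_2 \ne \varepsilon$, we have the inequality $l_2 l_1 > l \ge v$, thus $l_2 l_1 > v$. The word $v$ is not a prefix of $l_2$ since $l_2$ has no prefix in $\mathcal{X}_v$, hence we have $l_2 = l_2' b\, l_2''$ and $v = l_2' a\, v''$. Then, by construction of $\mathcal{X}_v$, $l_2' b \in \mathcal{X}_v$, which is impossible. Thus $l_2 = \varepsilon$ and $l \in \mathcal{X}_v^+$.
Conversely, if $l \in \mathcal{X}_v^+$, as a product of words greater than or equal to $v$, $l \ge v$.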
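Theorem 2 Let $v \in \mathcal{L}$ and $w \in A^*$. Then $awv$ is a Lyndon word with $aw \cdot v$ as standard factorization if and only if $w \in \mathcal{X}_v^* \setminus (a^{-1}\mathcal{X}_v)\mathcal{X}_v^*$. Hence the set $\mathcal{F}_v$ of Lyndon words having $v$ as right standard factor is a rational language.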
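Proof. Assume that $awv$ is a Lyndon word and its standard factorization is $aw \cdot v$. By Theorem 1, $wv$ can be written uniquely as
$$wv = l_1 l_2 \cdots l_n, \qquad l_i \in \mathcal{L}, \qquad l_1 \ge l_2 \ge \cdots \ge l_n.$$
As $v$ is the smallest (for the lexicographical order) suffix of $awv$, and consequently of $wv$, we get $l_n = v$; if $w = \varepsilon$, then $n = 1$, else $n \ge 2$ and for $1 \le i \le n-1$, $l_i \ge v$. Thus, by Proposition 3, $w \in \mathcal{X}_v^*$.
Moreover if $w \in (a^{-1}\mathcal{X}_v)\mathcal{X}_v^*$, then $aw \in \mathcal{X}_v^+ \cap \mathcal{L}$. Hence $aw \ge v$, which is contradictory with the definition of the standard factorization. So $w \in \mathcal{X}_v^* \setminus (a^{-1}\mathcal{X}_v)\mathcal{X}_v^*$.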
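Conversely, if $w \in \mathcal{X}_v^* \setminus (a^{-1}\mathcal{X}_v)\mathcal{X}_v^*$, then
$$w = x_1 x_2 \cdots x_n, \qquad x_i \in \mathcal{X}_v \quad \text{and} \quad aw \notin \mathcal{X}_v^+.$$
From Proposition 1, the product $ll'$ of two Lyndon words such that $l < l'$ is a Lyndon word. Replacing as much as possible $x_i x_{i+1}$ by their product when $x_i < x_{i+1}$, $w$ can be rewritten as
$$w = y_1 y_2 \cdots y_m, \qquad y_i \in \mathcal{X}_v^+ \cap \mathcal{L}, \qquad y_1 \ge y_2 \ge \cdots \ge y_m.$$
As $aw \notin \mathcal{X}_v^+$, for any integer $1 \le i \le m$, $a y_1 \cdots y_i \notin \mathcal{X}_v^+$.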
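Now we prove by induction that $aw \in \mathcal{L}$. As $y_1 \in \mathcal{L}$ and $a < y_1$, $a y_1 \in \mathcal{L}$. Suppose that $a y_1 \cdots y_i \in \mathcal{L}$. Then, as $y_{i+1} \in \mathcal{L} \cap \mathcal{X}_v^+$ and $a y_1 \cdots y_i \in \mathcal{L} \setminus \mathcal{X}_v^+$, from Proposition 3, we get $a y_1 \cdots y_i < v \le y_{i+1}$. Hence $a y_1 \cdots y_{i+1} \in \mathcal{L}$. So, $aw \in \mathcal{L}$.
As $aw \in \mathcal{L} \setminus \mathcal{X}_v^+$, $aw < v$ and $awv \in \mathcal{L}$. Setting $v = y_{m+1}$, we have
$$wv = y_1 y_2 \cdots y_m y_{m+1}, \qquad y_i \in \mathcal{X}_v^+ \cap \mathcal{L}, \qquad y_1 \ge y_2 \ge \cdots \ge y_{m+1}.$$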
Moreover any proper suffix $s$ of $awv$ is a suffix of $wv$ and can be written as $s = y_i' y_{i+1} \cdots y_{m+1}$ where $y_i'$ is a suffix of $y_i$. As $y_i \in \mathcal{L}$, $y_i' \ge y_i$. As $y_i \in \mathcal{X}_v^+$, $y_i \ge v$ and thus $s \ge v$. Thus, $v$ is the smallest suffix of $awv$ and $aw \cdot v$ is the standard factorization of the Lyndon word $awv$.
Finally, as the set of rational languages is closed by complementation, concatenation, Kleene star operation and left quotient, for any Lyndon word $v$, the set $\mathcal{F}_v$ of Lyndon words having $v$ as right standard factor is a rational language.
Remark. The proof of Theorem 2 leads to a linear algorithm that computes the right factor of a Lyndon word using the fact that the factorization of Theorem 1 can be achieved in linear time and space (by an algorithm of Duval [Duv83], see Section 5).
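Theorem 2 can be checked exhaustively on short words. The sketch below is only a sanity check under our own naming: `parses` decides membership in the star of the code by a greedy left-to-right parse (which is unambiguous because the code is prefix, by Proposition 2), and the condition of the theorem is tested as "$w$ parses and $aw$ does not", since a word starting with $a$ parses exactly when its tail lies in the left quotient by $a$ followed by a product of code words.

```python
from itertools import product

def x_set(v):
    """The words v, R(v), R^2(v), ..., b, where R replaces the last a and the b's after it by b."""
    words = [v]
    while "a" in words[-1]:
        last = words[-1]
        words.append(last[:last.rfind("a")] + "b")
    return words

def parses(w, code):
    """True if w is a (possibly empty) product of words of the prefix code."""
    while w:
        for x in code:
            if w.startswith(x):      # at most one x can match: the code is prefix
                w = w[len(x):]
                break
        else:
            return False
    return True

def is_lyndon(w):
    return len(w) > 0 and all(w < w[i:] for i in range(1, len(w)))

def has_standard_factorization(aw, v):
    """True if aw + v is a Lyndon word whose smallest proper suffix is v."""
    word = aw + v
    return is_lyndon(word) and min(word[i:] for i in range(1, len(word))) == v

def theorem2_condition(w, v):
    code = x_set(v)
    return parses(w, code) and not parses("a" + w, code)

if __name__ == "__main__":
    v = "ab"
    good = ["".join(t) for n in range(5) for t in product("ab", repeat=n)
            if theorem2_condition("".join(t), v)]
    print(good)   # words w such that a.w.ab has right factor ab
```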
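We define the generating functions $X_v(z)$ of $\mathcal{X}_v$ and $X_v^*(z)$ of $\mathcal{X}_v^*$:
$$X_v(z) = \sum_{w \in \mathcal{X}_v} z^{|w|} \qquad \text{and} \qquad X_v^*(z) = \sum_{w \in \mathcal{X}_v^*} z^{|w|}.$$
As the set $\mathcal{X}_v$ is a code, the elements of $\mathcal{X}_v^*$ are sequences of elements of $\mathcal{X}_v$ (see [FS02]):
$$X_v^*(z) = \frac{1}{1 - X_v(z)}.$$
Denote by $F_v(z) = \sum_{x \in \mathcal{F}_v} z^{|x|}$ the generating function of the set
$$\mathcal{F}_v = \{awv \in \mathcal{L} \mid aw \cdot v \text{ is the standard factorization}\}.$$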
Theorem 3 Let $v$ be a Lyndon word. The generating function of the set $\mathcal{F}_v$ of Lyndon words having a right standard factor $v$ can be written
$$F_v(z) = z^{|v|} \left(1 + \frac{2z - 1}{1 - X_v(z)}\right).$$
Proof. First of all, note that any Lyndon word of $\{a,b\}^*$ which is not a letter ends with the letter $b$, so $F_a(z) = 0$. And as $\mathcal{X}_a = \{a, b\}$, the formula given for $F_v(z)$ holds for $v = a$.
Assume that $v \ne a$. From Theorem 2, $F_v(z)$ can be written as
$$F_v(z) = z^{|av|} \sum_{w \in \mathcal{X}_v^* \setminus a^{-1}\mathcal{X}_v^+} z^{|w|}.$$
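In order to transform this combinatorial description involving $\mathcal{X}_v^* \setminus a^{-1}\mathcal{X}_v^+$ into an enumerative formula of the generating function $F_v(z)$, we prove first that $a^{-1}\mathcal{X}_v^+ \subset \mathcal{X}_v^*$ and next that the set $a^{-1}\mathcal{X}_v^+$ can be described as a disjoint union of rational sets.
If $x \in \mathcal{X}_v \setminus \{b\}$, then $x$ is greater than or equal to $v$ and, as $x$ is a Lyndon word, its proper suffixes are strictly greater than $v$; consequently, writing $a^{-1}x$ as a non-increasing sequence of Lyndon words $l_1, \dots, l_m$, we get, since $l_m \ge v$, that for all $i$, $l_i$ is greater than or equal to $v$. Consequently from Proposition 3, for all $i$, $l_i \in \mathcal{X}_v^+$ and, as a product of elements of $\mathcal{X}_v^+$, $a^{-1}x \in \mathcal{X}_v^+$. Therefore $a^{-1}(\mathcal{X}_v \setminus \{b\})\mathcal{X}_v^* \subset \mathcal{X}_v^*$.
Moreover if $x_1, x_2 \in \mathcal{X}_v$ and $x_1 \ne x_2$, as $\mathcal{X}_v$ is a prefix code,
$$a^{-1}x_1 \mathcal{X}_v^* \cap a^{-1}x_2 \mathcal{X}_v^* = \emptyset.$$
Thus $a^{-1}(\mathcal{X}_v \setminus \{b\})\mathcal{X}_v^*$ is the disjoint union of the sets $a^{-1}x_i \mathcal{X}_v^*$ when $x_i$ ranges over $\mathcal{X}_v \setminus \{b\}$. Consequently the generating function of the set $\mathcal{F}_v$ of Lyndon words having $v$ as right factor satisfies
$$F_v(z) = z^{|v|+1}\, \frac{1 - \dfrac{X_v(z) - z}{z}}{1 - X_v(z)}$$
and finally the announced equality.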
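The formula of Theorem 3 can be compared, coefficient by coefficient, with a brute-force count of the Lyndon words of each length whose smallest proper suffix is $v$. The sketch below uses the equivalent form $z^{|v|}(2z - X_v(z))/(1 - X_v(z))$, obtained by putting the expression of Theorem 3 over a common denominator, and expands it with exact integer arithmetic. All function names are ours.

```python
from itertools import product

def x_lengths(v):
    """Lengths of the words v, R(v), R^2(v), ..., b."""
    lengths, w = [len(v)], v
    while "a" in w:
        w = w[:w.rfind("a")] + "b"     # R(w)
        lengths.append(len(w))
    return lengths

def fv_series(v, n_max):
    """Coefficients of z^0..z^n_max in z^|v| (2z - X_v(z)) / (1 - X_v(z))."""
    x = [0] * (n_max + 1)              # x[j] = number of code words of length j
    for length in x_lengths(v):
        if length <= n_max:
            x[length] += 1
    num = [-c for c in x]              # numerator 2z - X_v(z)
    if n_max >= 1:
        num[1] += 2
    q = []
    for n in range(n_max + 1):
        # (1 - X_v) * Q = num  gives  q_n = num_n + sum_{j>=1} x_j q_{n-j}
        q.append(num[n] + sum(x[j] * q[n - j] for j in range(1, n + 1)))
    return ([0] * len(v) + q)[:n_max + 1]   # multiplication by z^|v|

def fv_brute_force(v, n_max):
    """counts[n] = number of Lyndon words of length n with right factor v."""
    counts = [0] * (n_max + 1)
    for n in range(2, n_max + 1):
        for t in product("ab", repeat=n):
            w = "".join(t)
            suffixes = [w[i:] for i in range(1, n)]
            if all(w < s for s in suffixes) and min(suffixes) == v:
                counts[n] += 1
    return counts

if __name__ == "__main__":
    print(fv_series("ab", 7))
    print(fv_brute_force("ab", 7))
```

For $v = ab$ both computations give the series $z^3 + z^5 + z^6 + 2z^7 + \cdots$.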
Note that the function $F_v(z)$ is rational for any Lyndon word $v$. But the right standard factor runs over the set of Lyndon words, which is not context-free [BB97]. Therefore in order to study the average length of the factors in the [...]
Making use of probabilistic techniques and of results from analytic combinatorics (see [FS02]), we establish the following result.
Theorem 4 The average length, for the uniform distribution over the Lyndon words of length $n$, of the right factor of the standard factorization is asymptotically
$$\frac{3n}{4} \left(1 + O\!\left(\frac{\log^3 n}{n}\right)\right).$$
Remark: The error term comes from successive approximations at different steps of the proof and, for this reason, it is probably overestimated (see experiments in Section 5).
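First we partition the set $\mathcal{L}_n$ of Lyndon words of length $n$ in the two following subsets: $a\mathcal{L}_{n-1}$ and $\mathcal{L}'_n = \mathcal{L}_n \setminus a\mathcal{L}_{n-1}$.
Note that $a\mathcal{L}_{n-1} \subset \mathcal{L}_n$ (that is, if $w$ is a Lyndon word then $aw$ is also a Lyndon word). Moreover if $w \in a\mathcal{L}_{n-1}$, the standard factorization is $w = a \cdot v$ with $v \in \mathcal{L}_{n-1}$. As
$$\mathrm{Card}(\mathcal{L}_{n-1}) = \frac{2^{n-1}}{n-1} \left(1 + O\!\left(2^{-n/2}\right)\right),$$
the contribution of the set $a\mathcal{L}_{n-1}$ to the mean value of the length of the right factor is
$$(n-1)\, \frac{\mathrm{Card}(a\mathcal{L}_{n-1})}{\mathrm{Card}(\mathcal{L}_n)} = \frac{n}{2} \left(1 + O\!\left(\frac{1}{n}\right)\right).$$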
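For the reader's convenience, the estimate above can be unfolded as follows. This is only a sketch of the arithmetic, using the estimate of the number of binary Lyndon words recalled in Section 2 for both lengths $n-1$ and $n$, and the fact that prepending $a$ is a bijection between the Lyndon words of length $n-1$ and the first subset of the partition.

```latex
\begin{align*}
(n-1)\,\frac{\mathrm{Card}(a\mathcal{L}_{n-1})}{\mathrm{Card}(\mathcal{L}_n)}
  &= (n-1)\,\frac{\mathrm{Card}(\mathcal{L}_{n-1})}{\mathrm{Card}(\mathcal{L}_n)} \\
  &= (n-1)\cdot\frac{2^{n-1}}{n-1}\cdot\frac{n}{2^{n}}
     \,\bigl(1+O(2^{-n/2})\bigr) \\
  &= \frac{n}{2}\,\bigl(1+O(2^{-n/2})\bigr).
\end{align*}
```

The error term obtained this way is exponentially small, hence in particular of order $O(1/n)$ as stated.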
The remaining part of this paper is devoted to the standard factorization of the words of $\mathcal{L}'_n$, which requires a careful analysis.
Proposition 4 The contribution of the set $\mathcal{L}'_n$ to the mean value of the length of the right factor is
$$\frac{n}{4} \left(1 + O\!\left(\frac{\log^3 n}{n}\right)\right).$$
This proposition basically asserts that, on average for the uniform distribution over $\mathcal{L}'_n$, the length of the right factor is asymptotically $n/2$.
The idea is to build a transformation $\varphi$, which is a bijection on a set $D_n \subset \mathcal{L}'_n$, such that the sum of the lengths of the standard right factors of $w$ and $\varphi(w)$ is about $|w|$, the length of $w$. Indeed with such a relation we can compute the contribution of $D_n$ to the expectation of the parameter right. Then if the contribution of $\mathcal{L}'_n \setminus D_n$ to the parameter right is negligible, we are able to conclude for the expectation of the parameter right.
It remains to construct such a bijection $\varphi$ and determine a "good" set $D_n$. This is done in the following way: assume that $w$ is a Lyndon word in $\mathcal{L}_n \setminus a\mathcal{L}_{n-1}$. Let us denote by $k$ the length of the first run of $a$'s of the standard right [...]
Indeed the standard factorization of $w$ can only be one of the following
$$w = a^{k+1} b u \cdot a^k b v \ \text{(first kind)}, \qquad w = a^k b u \cdot a^k b v \ \text{(second kind)}.$$
This means that the left factor of a Lyndon word $w$ can only begin by $a^{k+1}b$ or $a^k b$ when we know that the right factor begins by $a^k b$ (otherwise $w$ cannot be in $\mathcal{L}_n \setminus a\mathcal{L}_{n-1}$). Let us fix an integer parameter $\lambda \in \mathbb{Z}^+$. Then the words $u, v$ of $X_k$ can be uniquely written as $u = u'u''$ and $v = v'v''$ where $u'$ and $v'$ are the smallest prefixes of $u$ and $v$ of length at least $\lambda$ and ending by a $b$ (there is always such a symbol $b$ if these words are not empty since then $u$ and $v$ end with a $b$). When $|u|, |v| \ge \lambda$ we define $\varphi(w)$ for a word $w = a^k b u \cdot a^k b v$ (resp. $w = a^{k+1} b u \cdot a^k b v$) by
$$\varphi(w) = a^k b u' v'' \cdot a^k b v' u'' \qquad (\text{resp. } a^{k+1} b u' v'' \cdot a^k b v' u'').$$
For example, considering $w = a^k babb \cdot a^k bbaab\, a^k bbbb$, if we choose $\lambda = 2$, and so $u' = ab$, $u'' = b$, $v' = baab$, $v'' = a^k bbbb$, then we get $\varphi(w) = a^k bab\, a^k bbbb \cdot a^k bbaabb \in \mathcal{L}$. Here $|w| = |\varphi(w)| = 3k + 13$ and the lengths of the right standard factors are $2k + 9$ and $k + 6$ respectively.
Some words give hints of what we must be careful about if we want $\varphi(w)$ to be a Lyndon word.
- If we want the application $\varphi$ to be well defined, the parameter $\lambda$ must be greater than or equal to 1. So the longest runs of $a$'s have to be separated by non-empty words. If $w = a^k b \cdot a^k bb$, then $u = \varepsilon$ is the empty word. The application exchanging $u$ and $v$ gives a word which is no longer a Lyndon word.
- If $w = a^k bab \cdot a^k babb$, then $u = ab$ and it is a prefix of $v$. For any choice of $\lambda$, $\varphi(w)$ is not a Lyndon word. So the longest runs of $a$'s have to be separated by words having distinct prefixes to ensure that $\varphi(w)$ is a Lyndon word.
- If $w = a^k bbab \cdot a^k bbbab$, then if we choose $\lambda = 1$, we get $\varphi(w) = a^k bbbab\, a^k bbab \notin \mathcal{L}$ (since $u' = v'$ and $u'v'' = bbab > v'u'' = bab$). Thus we have to take care, when we apply the transformation, that $\varphi(w)$ is still smaller than its proper suffixes (this is ensured if $u' \ne v'$).
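The exchange of tails performed by the transformation can be illustrated on the last example above, with a concrete value $k = 3$. The following is a minimal sketch under our own conventions: a word of either kind is passed as four pieces (the two heads made of a run of $a$'s followed by $b$, and the two words $u$ and $v$), the integer parameter of the text is called `lam`, and no check is made that the input belongs to the set on which the transformation is meant to act.

```python
def split_at(word, lam):
    """Shortest prefix of word of length >= lam ending with b, and the remaining suffix."""
    i = word.index("b", lam - 1)      # first b at 0-based position >= lam - 1
    return word[:i + 1], word[i + 1:]

def phi(head_left, u, head_right, v, lam):
    """Exchange the tails u'' and v'' of the two factors head_left.u and head_right.v."""
    u1, u2 = split_at(u, lam)
    v1, v2 = split_at(v, lam)
    return head_left + u1 + v2, head_right + v1 + u2

def is_lyndon(w):
    return len(w) > 0 and all(w < w[i:] for i in range(1, len(w)))

if __name__ == "__main__":
    k = 3
    head = "a" * k + "b"
    # w = a^k bbab . a^k bbbab, that is u = bab and v = bbab, with parameter 1
    left, right = phi(head, "bab", head, "bbab", 1)
    print(left, right, is_lyndon(left + right))
```

Since the shortest admissible prefix of $u'v''$ is again $u'$ (and similarly for $v'u''$), applying `phi` twice with the same parameter gives back the original pieces, which is the string-level reason why the transformation can be an involution.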
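The application $\varphi$ and the set $D_n$ are dependent and, to suit our needs, they are implicitly determined by the following constraints:
1. The function $\varphi$ is an involution on $D_n$: $\varphi(\varphi(w)) = w$.
2. The standard factorization of $\varphi(w)$ for $w \in D_n$ is
$$\varphi(w) = a^{k+1} b u' v'' \cdot a^k b v' u'' \ \text{(first kind)}, \qquad \varphi(w) = a^k b u' v'' \cdot a^k b v' u'' \ \text{(second kind)}.$$
3. The lengths of the right factors of $w$ and $\varphi(w)$ satisfy
$$|\mathrm{right}(w)| + |\mathrm{right}(\varphi(w))| = |w|\,(1 + o(|w|)).$$
4. The set $D_n$ "captures" most of the set $\mathcal{L}'_n$ in an asymptotic way when $n$ grows to $\infty$, that is
$$\frac{\mathrm{Card}(D_n)}{\mathrm{Card}(\mathcal{L}'_n)} = 1 - o(1).$$
Most of these conditions are related to the properties of the longest runs of $a$'s. Hence, in the following parts, we study some combinatorial properties of the longest runs of $a$'s in Lyndon words to characterize $\varphi$ and $D_n$ precisely.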
5 Algorithms and experimental results
In this section we give an algorithm to generate random Lyndon words of a given length $n$ and use it to establish some experimental results about the length of the right factor in the standard factorization.
[Figure 1: plot of the length of the right factor against the length of the Lyndon word, together with the line 3x/4.]
Fig. 1. Average length of the right factor of random Lyndon words with lengths from 1,000 to 10,000. Each plot is computed with 1,000 words. The error bars represent the standard deviation.
Our algorithms use Duval's algorithm [Duv83], which computes in linear time Lyndon words $l_1 \ge l_2 \ge \cdots \ge l_k$ such that
$$u = l_1 l_2 \cdots l_k.$$
Let the function Duval(string u, int k, array pos) be the function which computes the Lyndon decomposition of u by storing in an array pos of size k the positions of the factors.
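For concreteness, here is a compact Python sketch of the classical formulation of Duval's factorization algorithm. It returns the list of factors rather than filling an array of positions as in the signature above, and it is our own rendering of the well-known algorithm, not code taken from [Duv83].

```python
def duval(u):
    """Lyndon factorization of u: non-increasing list of Lyndon words whose product is u."""
    factors = []
    i, n = 0, len(u)
    while i < n:
        j, k = i + 1, i
        # Extend the current block while u[i:j] stays a prefix of a power of a Lyndon word.
        while j < n and u[k] <= u[j]:
            k = i if u[k] < u[j] else k + 1
            j += 1
        # The block is a repetition of a Lyndon word of length j - k: output each copy.
        while i <= k:
            factors.append(u[i:i + j - k])
            i += j - k
    return factors

if __name__ == "__main__":
    for word in ("abaab", "bbaba", "aabab"):
        print(word, duval(word))
```

A Lyndon word is its own factorization, so calling the function on such a word returns a list with a single element.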
There exists an algorithm SmallestConjugate(u), proposed by Booth [Lot03, Boo80], that computes the smallest conjugate of a word of length $n$ in linear time. We use it to make a rejection algorithm which is efficient to generate randomly a Lyndon word of length $n$:
RandomLyndonWord(n) // return a random Lyndon word
  string u, v;
  do
    u = RandomWord(n); // u is a random word of A^n
    v = SmallestConjugate(u); // v is the smallest conjugate of u
  until (length(v) == n); // v is primitive
  return v;
The algorithm RandomLyndonWord computes uniformly a Lyndon word.
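A runnable version of this rejection scheme can be sketched as follows. Two simplifications are assumptions of ours and should be read as such: the smallest conjugate is found by a naive minimum over all rotations (quadratic time) instead of Booth's linear-time algorithm, and primitivity is tested with the classical observation that a word is a proper power exactly when it occurs inside its square at a position other than the two ends.

```python
import random

def random_lyndon_word(n, rng=random):
    """Uniform random Lyndon word of length n over {a, b}, by rejection."""
    while True:
        u = "".join(rng.choice("ab") for _ in range(n))
        if u in (u + u)[1:-1]:
            continue                  # u is a proper power: reject and retry
        # u is primitive: its n rotations are distinct and the smallest one is Lyndon.
        return min(u[i:] + u[:i] for i in range(n))

if __name__ == "__main__":
    rng = random.Random(2002)         # fixed seed, only to make runs repeatable
    print([random_lyndon_word(8, rng) for _ in range(5)])
```

Uniformity comes from the fact that every Lyndon word of length $n$ is the smallest rotation of exactly $n$ primitive words, so each Lyndon word is returned with the same probability.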
Lemma 2 The average complexity of RandomLyndonWord(n) is linear.
Proof. Each execution of the do ... until loop is done in linear time. The condition is not satisfied when $u$ is a conjugate of a power $v^p$ with $p > 1$. This happens with probability $O\!\left(\frac{n}{2^{n/2}}\right)$. Thus the loop is executed a bounded number of times on average.
Lemma 3 Let $l = au$ be a Lyndon word of length greater than or equal to 2 starting with a letter $a$. Let $l_1 \cdots l_k$ be the Lyndon factorization of $u$. The right factor of $l$ in its standard factorization is $l_k$.
Proof. By Theorem 1, $l_k$ is the smallest suffix of $u$, thus it is the smallest proper suffix of $l$.
The algorithm to compute the right factor of a Lyndon word $l$ such that $|l| \ge 2$ is the following:
RightFactor(string l[1..n])
  array pos;
  int k;
  pos = Duval(l[2..n], k, pos); // omit the first letter a and apply Duval's algorithm
  return l[pos[k]..n]; // return the last factor
This algorithm is linear in time since Duval's algorithm is linear.
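The same procedure can be written in Python along the lines of Lemma 3: drop the first letter, factorize the rest, and keep the last factor. The sketch below embeds a standard formulation of Duval's algorithm so that it is self-contained; the function names are ours and the input is assumed to be a Lyndon word of length at least 2.

```python
def duval(u):
    """Lyndon factorization of u as a list of factors (classical Duval algorithm)."""
    factors, i, n = [], 0, len(u)
    while i < n:
        j, k = i + 1, i
        while j < n and u[k] <= u[j]:
            k = i if u[k] < u[j] else k + 1
            j += 1
        while i <= k:
            factors.append(u[i:i + j - k])
            i += j - k
    return factors

def right_factor(l):
    """Right factor of the standard factorization of the Lyndon word l, |l| >= 2."""
    if len(l) < 2:
        raise ValueError("the word must have length at least 2")
    return duval(l[1:])[-1]          # last Lyndon factor of l without its first letter

if __name__ == "__main__":
    for w in ("aaabaab", "aaababb", "aabaabb"):
        v = right_factor(w)
        print(w, "=", w[:len(w) - len(v)], ".", v)
```

On the three examples of Section 2 this gives the right factors aab, aababb and aabb.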
Figures 1 and 2 present some experimental results obtained with our algorithms.
[Figure 2: histogram of the number of hits against the size of the right factor.]
Fig. 2. Distribution of the length of the right factor. We generated 100,000 random Lyndon words of length 5,000.
Open problem. The results obtained in this paper are only the first step toward the average case analysis of the Lyndon tree. The Lyndon tree $T(w)$ of a Lyndon word $w$ is recursively built in the following way:
- if $w$ is a letter, then $T(w)$ is an external node labeled by the letter;
- otherwise, $T(w)$ is an internal node having $T(u)$ and $T(v)$ as children, where the standard factorization of $w$ is $u \cdot v$.
This structure encodes a non-associative operation, either a commutator in the free group [CFL58], or a Lie bracketing [Lot83]; both constructions lead to bases of the free Lie algebra.
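The recursive definition of the Lyndon tree can be sketched in a few lines. In this illustration, which uses our own representation, an external node is the letter itself and an internal node is a pair of subtrees; the right factor is found naively as the smallest proper suffix, so the construction is not linear per node.

```python
def lyndon_tree(w):
    """Lyndon tree of the Lyndon word w: a letter, or a pair (T(u), T(v)) with w = u.v standard."""
    if len(w) == 1:
        return w
    v = min(w[i:] for i in range(1, len(w)))   # right factor: smallest proper suffix
    u = w[:len(w) - len(v)]
    return (lyndon_tree(u), lyndon_tree(v))

def height(tree):
    """Height of a tree, an external node having height 0."""
    if isinstance(tree, str):
        return 0
    return 1 + max(height(child) for child in tree)

if __name__ == "__main__":
    t = lyndon_tree("aabab")
    print(t, height(t))
```

For the word aabab the standard factorization is aab . ab, and the resulting tree has height 3.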
In order to study the height of the tree obtained from a Lyndon word by successive standard factorizations, it would be very interesting to get more precise information about the distribution of the right factors of words of $\mathcal{L}'_n$. Fig. 2 hints at a very strong equi-repartition property of the length of the right factor over this set. This suggests a very particular subdivision process at each node of the factorization tree, which needs further investigation.
References
[BB97] J. Berstel and L. Boasson. The set of Lyndon words is not context-free. Bull. Eur. Assoc. Theor. Comput. Sci. EATCS, 63:139–140, 1997.
[Boo80] K. S. Booth. Lexicographically least circular substrings. Inform. Process. Lett., 10(4-5):240–242, 1980.
[BP85] J. Berstel and D. Perrin. Theory of codes. Academic Press, 1985.
[BP94] J. Berstel and M. Pocchiola. Average cost of Duval's algorithm for generating Lyndon words. Theoret. Comput. Sci., 132(1-2):415–425, 1994.
[Car85] H. Cartan. Théorie élémentaire des fonctions analytiques d'une ou plusieurs [...]
[CFL58] [...] quotient groups of the lower central series. Ann. Math., 58:81–95, 1958.
[Duv83] J.-P. Duval. Factorizing words over an ordered alphabet. Journal of Algorithms, 4:363–381, 1983.
[Duv88] J.-P. Duval. Génération d'une section des classes de conjugaison et arbre des mots de Lyndon de longueur bornée. Theoret. Comput. Sci., 4:363–381, 1988.
[FGP01] P. Flajolet, X. Gourdon, and D. Panario. The complete analysis of a polynomial factorization algorithm over finite fields. Journal of Algorithms, 40:37–81, 2001.
[FS91] P. Flajolet and M. Soria. The cycle construction. SIAM J. Disc. Math., 4:58–60, 1991.
[FS02] P. Flajolet and R. Sedgewick. Analytic combinatorics, symbolic combinatorics. Book in preparation, 2002. (Individual chapters are available as INRIA Research reports at http://www.algo.inria.fr/flajolet/publist.html).
[Gol69] S. Golomb. Irreducible polynomials, synchronizing codes, primitive necklaces and cyclotomic algebra. In Proc. Conf Combinatorial Math. and Its Appl., pages 358–370, Chapel Hill, 1969. Univ. of North Carolina Press.
[HW38] G. H. Hardy and E. M. Wright. An Introduction to the Number Theory. Oxford University Press, 1938.
[Knu78] D. E. Knuth. The average time for carry propagation. Indagationes Mathematicae, 40:238–242, 1978.
[Lot83] M. Lothaire. Combinatorics on Words, volume 17 of Encyclopedia of mathematics and its applications. Addison-Wesley, 1983.
[Lot03] M. Lothaire. Applied Combinatorics on Words. 2003. In preparation, chapters available at http://www-igm.univ-mlv.fr/~berstel/Lothaire.
[Lyn54] R. C. Lyndon. On Burnside problem I. Trans. American Math. Soc., 77:202–215, 1954.
[PR01] D. Panario and B. Richmond. Smallest components in decomposable structures: exp-log class. Algorithmica, 29:205–226, 2001.
[Reu93] C. Reutenauer. Free Lie algebras. Oxford University Press, 1993.
[RSar] F. Ruskey and J. Sawada. Generating Lyndon brackets: a basis for the n-th homogeneous component of the free Lie algebra. Journal of Algorithms, (to appear). Available at http://www.cs.uvic.ca/~fruskey/Publications/.