
HAL Id: inria-00166333

https://hal.inria.fr/inria-00166333v2

Submitted on 8 Aug 2007


Quasi-stationary distributions as centrality measures of

reducible graphs

Konstantin Avrachenkov, Vivek Borkar, Danil Nemirovsky

To cite this version:

Konstantin Avrachenkov, Vivek Borkar, Danil Nemirovsky. Quasi-stationary distributions as centrality measures of reducible graphs. [Research Report] RR-6263, INRIA. 2007, pp.19. inria-00166333v2

Thème COM

Quasi-stationary distributions

as centrality measures of reducible graphs

Konstantin Avrachenkov — Vivek Borkar — Danil Nemirovsky

N° 6263


Konstantin Avrachenkov, Vivek Borkar, Danil Nemirovsky

Thème COM, Systèmes communicants
Projets MAESTRO

Rapport de recherche n° 6263, August 2007, 19 pages

Abstract: A random walk can be used as a centrality measure of a directed graph. However, if the graph is reducible the random walk will be absorbed in some subset of nodes and will never visit the rest of the graph. In Google PageRank the problem was solved by the introduction of uniform random jumps with some probability. Up to the present, there is no clear criterion for the choice of this parameter. We propose to use a parameter-free centrality measure which is based on the notion of quasi-stationary distribution. Specifically, we suggest four quasi-stationarity based centrality measures, analyze them and conclude that they produce approximately the same ranking. The new centrality measures can be applied in spam detection to detect link farms and in image search to find photo albums.

Key-words: centrality measure, directed graph, quasi-stationary distribution, PageRank, Web graph, link farm

This research was supported by the RIAM INRIA-Canon grant, the European research project Bionets, and by CEFIPRA grant no 2900-IT.

INRIA Sophia Antipolis, E-mail: K.Avrachenkov@sophia.inria.fr
Tata Institute of Fundamental Research, India, E-mail: borkar@tifr.res.in
INRIA Sophia Antipolis, France and St. Petersburg State University, Russia, E-mail:

Résumé: A random walk can be used as a centrality measure of a directed graph. However, if the graph is reducible, the random walk will be absorbed in some subset of nodes and will never visit the rest of the graph. In Google PageRank, the problem was solved by the introduction of uniform random jumps with some probability. Up to now, there is no clear criterion for the choice of this parameter. We propose to use a parameter-free centrality measure based on the notion of quasi-stationary distribution. We analyze four such measures and conclude that they produce nearly the same ranking of nodes. The new centrality measures can be applied in the context of spam detection to detect link farms, and in the context of image search to find photo albums.

Mots-clés: centrality measure, random walk, directed graph, quasi-stationary distribution

1 Introduction

Random walk can be used as a centrality measure of a directed graph. An example of random walk based centrality measures is PageRank [21] used by the search engine Google. PageRank is used by Google to sort the relevant answers to a user's query. We shall follow the formal definition of PageRank from [18]. Denote by $n$ the total number of pages on the Web and define the $n \times n$ hyperlink matrix $P$ such that

$$p_{ij} = \begin{cases} 1/d_i, & \text{if page } i \text{ links to } j, \\ 1/n, & \text{if page } i \text{ is dangling}, \\ 0, & \text{otherwise}, \end{cases} \qquad (1)$$

for $i, j = 1, \dots, n$, where $d_i$ is the number of outgoing links from page $i$. A page with no outgoing links is called dangling. We note that according to (1) there exist artificial links to all pages from a dangling node. In order to make the hyperlink graph connected, it is assumed that at each step, with some probability $c$, a random surfer goes to an arbitrary Web page sampled from the uniform distribution. Thus, the PageRank is defined as a stationary distribution of a Markov chain whose state space is the set of all Web pages, and the transition matrix is

$$G = cP + (1 - c)(1/n)E,$$

where $E$ is a matrix whose entries are all equal to one, and $c \in (0, 1)$ is a probability of following a hyperlink. The constant $c$ is often referred to as a damping factor. The Google matrix $G$ is stochastic, aperiodic, and irreducible, so the PageRank vector $\pi$ is the unique solution of the system

$$\pi G = \pi, \qquad \pi \mathbf{1} = 1,$$

where $\mathbf{1}$ is a column vector of ones.
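As a concrete illustration of the definitions above, here is a minimal sketch (not part of the original report; the 4-node toy graph and all function names are invented for illustration) that builds the hyperlink matrix $P$ of (1) and the Google matrix $G$, and computes the PageRank vector by power iteration.

```python
import numpy as np

def google_matrix(links, n, c=0.85):
    """Build the hyperlink matrix P of Eq. (1) and the Google matrix G = cP + (1-c)E/n."""
    P = np.zeros((n, n))
    for i, j in links:
        P[i, j] = 1.0
    for i in range(n):
        d = P[i].sum()
        # dangling node: artificial links to all pages with probability 1/n
        P[i] = P[i] / d if d > 0 else np.full(n, 1.0 / n)
    return c * P + (1.0 - c) / n * np.ones((n, n)), P

def pagerank(G, tol=1e-12, max_iter=1000):
    """Power iteration for the stationary distribution: pi G = pi, pi 1 = 1."""
    n = G.shape[0]
    pi = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new = pi @ G
        if np.abs(new - pi).sum() < tol:
            return new
        pi = new
    return pi

# toy example: node 3 is dangling
links = [(0, 1), (1, 2), (2, 0), (2, 3)]
G, P = google_matrix(links, n=4, c=0.85)
print(pagerank(G))
```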

Even though in a number of recent works, see e.g. [5, 6, 8], the choice of the damping factor $c$ has been discussed, there is still no clear criterion for the choice of its value. The goal of the present work is to explore parameter-free centrality measures.

In [5, 7, 15] the authors have studied the graph structure of the Web. In particular, in [7, 15] it was shown that the Web Graph can be divided into three principal components: the Giant Strongly Connected Component, to which we simply refer as SCC component, the IN component and the OUT component. The SCC component is the largest strongly connected component in the Web Graph. In fact, it is larger than the second largest strongly connected component by several orders of magnitude. Following hyperlinks one can come from the IN component to the SCC component but it is not possible to return back. Then, from the SCC component one can come to the OUT component and it is not possible to return to SCC from the OUT component. In [7, 15] the analysis of the structure of the Web was made assuming that dangling nodes have no outgoing links. However, according to (1) there is a probability to jump from a dangling node to an arbitrary node. This can be viewed as a link between the nodes and we call such a link the artificial link. As was shown in [5], these artificial links significantly change the graph structure of the Web. In particular, the artificial links of dangling nodes in the OUT component connect some parts of the OUT component with the IN and SCC components. Thus, the size of the Giant Strongly Connected Component increases further. If the artificial links from dangling nodes are taken into account, it is shown in [5] that the Web Graph can be divided in two disjoint components: the Extended Strongly Connected Component (ESCC) and the Pure OUT (POUT) component. The POUT component is small in size but if the damping factor $c$ is chosen equal to one, the random walk absorbs with probability one into POUT. We note that nearly all important pages are in ESCC. We also note that even if the damping factor is chosen close to one, the random walk can spend a significant amount of time in ESCC before the absorption. Therefore, for ranking Web pages from ESCC we suggest to use the quasi-stationary distributions [9, 22].

It turns out that there are several versions of quasi-stationary distribution. Here we study four of them and show that the rankings provided by them are very similar. Therefore, one can choose a version of quasi-stationary distribution which is easier for computation.

The paper is organized as follows: In the next Section 2 we discuss different notions of quasi-stationarity, the relation among them, and the relation between the quasi-stationary distribution and PageRank. Then, in Section 3 we present the results of numerical experiments on the Web Graph which confirm our theoretical findings and suggest the application of quasi-stationarity based centrality measures to link spam detection and image search. Some technical results we place in the Appendix.

2 Quasi-stationary distributions as centrality measures

As noted in [5], by renumbering the nodes the transition matrix $P$ can be transformed to the following form

$$P = \begin{bmatrix} Q & 0 \\ R & T \end{bmatrix},$$

where the block $T$ corresponds to the ESCC, the block $Q$ corresponds to the part of the OUT component without dangling nodes and their predecessors, and the block $R$ corresponds to the transitions from ESCC to the nodes in block $Q$. We refer to the set of nodes in the block $Q$ as the POUT component.
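The sketch below (illustrative only and not from the report; it assumes the POUT/ESCC node partition has already been computed by some other means) permutes a transition matrix into the block form above and extracts the blocks $Q$, $R$ and $T$.

```python
import numpy as np

def block_decompose(P, pout_nodes, escc_nodes):
    """Permute P so that POUT nodes come first, then ESCC nodes, and return
    the blocks Q (POUT -> POUT), R (ESCC -> POUT) and T (ESCC -> ESCC).
    The upper-right block is zero because there are no links from POUT back to ESCC."""
    order = list(pout_nodes) + list(escc_nodes)
    Pp = P[np.ix_(order, order)]
    k = len(pout_nodes)
    Q = Pp[:k, :k]
    R = Pp[k:, :k]
    T = Pp[k:, k:]
    return Q, R, T
```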

The POUT component is small in size but if the damping factor $c$ is chosen equal to one, the random walk absorbs with probability one into POUT. We are mostly interested in the nodes in the ESCC component. Denote by $\pi_Q$ the part of the PageRank vector corresponding to the POUT component and denote by $\pi_T$ the part of the PageRank vector corresponding to the ESCC component. Using the following formula [20]

$$\pi(c) = \frac{1 - c}{n} \mathbf{1}^T [I - cP]^{-1},$$

we conclude that

$$\pi_T(c) = \frac{1 - c}{n} \mathbf{1}^T [I - cT]^{-1},$$

where $\mathbf{1}$ is a vector of ones of appropriate dimension.

Let us define

$$\hat\pi_T(c) = \frac{\pi_T(c)}{\|\pi_T(c)\|_1}.$$

Since the matrix $T$ is substochastic, we have the next result.

Proposition 1 The following limit exists

$$\hat\pi_T(1) = \lim_{c \to 1} \frac{\pi_T(c)}{\|\pi_T(c)\|_1} = \frac{\mathbf{1}^T [I - T]^{-1}}{\mathbf{1}^T [I - T]^{-1} \mathbf{1}},$$

and the ranking of pages in ESCC provided by the PageRank vector converges to the ranking provided by $\hat\pi_T(1)$ as the damping factor goes to one. Moreover, these two rankings coincide for all values of $c$ above some value.
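A minimal numerical sketch of Proposition 1 (illustrative and not from the report; it assumes a small dense substochastic block $T$ is already available): the pseudo-stationary vector $\hat\pi_T(1)$ is computed directly from the closed-form expression above.

```python
import numpy as np

def pseudo_stationary(T):
    """hat pi_T(1) = 1^T (I - T)^{-1} / (1^T (I - T)^{-1} 1) for a substochastic matrix T.
    Dense O(n^3) version; for a large sparse graph one would solve x (I - T) = 1^T instead."""
    n = T.shape[0]
    ones = np.ones(n)
    row = ones @ np.linalg.inv(np.eye(n) - T)   # 1^T (I - T)^{-1}
    return row / row.sum()
```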

Next we denote $\hat\pi_T(1)$ simply by $\hat\pi_T$. Following [9, 12] we shall call the vector $\hat\pi_T$ the pseudo-stationary distribution. The $i$-th component of $\hat\pi_T$ can be interpreted as a fraction of time the random walk (with $c = 1$) spends in node $i$ prior to absorption. We recall that the random walk as defined in the Introduction starts from the uniform distribution. If the random walk were initiated from another distribution, the pseudo-stationary distribution would change.

Denote by $\bar T$ the hyperlink matrix associated with ESCC when the links leading outside of ESCC are neglected. Clearly, we have

$$\bar T_{ij} = \frac{T_{ij}}{[T\mathbf{1}]_i},$$

where $[T\mathbf{1}]_i$ denotes the $i$-th component of the vector $T\mathbf{1}$. In other words, $[T\mathbf{1}]_i$ is the sum of elements in row $i$ of matrix $T$. The $\bar T_{ij}$ entry of the matrix $\bar T$ can be considered as a conditional probability to jump from the node $i$ to the node $j$ under the condition that the random walk does not leave ESCC at the jump. Let $\bar\pi_T$ be a stationary distribution of $\bar T$.
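The next sketch (illustrative only and not from the report; it assumes $\bar T$ is irreducible, as in the report) forms $\bar T$ by row-normalizing $T$ and computes its stationary distribution $\bar\pi_T$ by power iteration.

```python
import numpy as np

def bar_T(T):
    """Row-normalize the substochastic matrix T: bar T_ij = T_ij / [T 1]_i.
    Assumes every row of T has a positive sum (every ESCC node keeps a link inside ESCC)."""
    row_sums = T @ np.ones(T.shape[0])
    return T / row_sums[:, None]

def stationary(Tbar, tol=1e-12, max_iter=100000):
    """Stationary distribution of the stochastic matrix bar T via power iteration."""
    n = Tbar.shape[0]
    pi = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new = pi @ Tbar
        if np.abs(new - pi).sum() < tol:
            return new
        pi = new
    return pi
```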

Let us now consider the substochastic matrix $T$ as a perturbation of the stochastic matrix $\bar T$. We introduce the perturbation term

$$\varepsilon D = \bar T - T,$$

where $\varepsilon$ is the perturbation parameter, which is typically small. The following result holds.

Proposition 2 The vector $\hat\pi_T$ is close to $\bar\pi_T$. Namely,

$$\hat\pi_T = \bar\pi_T - \bar\pi_T \frac{1}{n_T}(\bar\pi_T \varepsilon D \mathbf{1})\, \mathbf{1}^T X_0 \mathbf{1} + \mathbf{1}^T X_0 \frac{1}{n_T}(\bar\pi_T \varepsilon D \mathbf{1}) + o(\varepsilon), \qquad (2)$$

where $n_T$ is the number of nodes in ESCC and $X_0$ is given in Lemma 1 from the Appendix.

Proof: We substitute $T = \bar T - \varepsilon D$ into $[I - T]^{-1}$ and use Lemma 1, to get

$$[I - T]^{-1} = \frac{1}{\bar\pi_T \varepsilon D \mathbf{1}} \mathbf{1}\bar\pi_T + X_0 + O(\varepsilon).$$

Using the above expression, we can write

$$\hat\pi_T = \frac{\mathbf{1}^T [I - T]^{-1}}{\mathbf{1}^T [I - T]^{-1} \mathbf{1}}
= \frac{\frac{n_T}{\bar\pi_T \varepsilon D \mathbf{1}} \bar\pi_T + \mathbf{1}^T X_0 + O(\varepsilon)}{\frac{n_T}{\bar\pi_T \varepsilon D \mathbf{1}} + \mathbf{1}^T X_0 \mathbf{1} + O(\varepsilon)}
= \frac{\bar\pi_T + \frac{1}{n_T}(\bar\pi_T \varepsilon D \mathbf{1}) \mathbf{1}^T X_0 + o(\varepsilon)}{1 + \frac{1}{n_T}(\bar\pi_T \varepsilon D \mathbf{1}) \mathbf{1}^T X_0 \mathbf{1} + o(\varepsilon)}$$

$$= \left( \bar\pi_T + \frac{1}{n_T}(\bar\pi_T \varepsilon D \mathbf{1}) \mathbf{1}^T X_0 + o(\varepsilon) \right) \left( 1 - \frac{1}{n_T}(\bar\pi_T \varepsilon D \mathbf{1}) \mathbf{1}^T X_0 \mathbf{1} + o(\varepsilon) \right)$$

$$= \bar\pi_T - \bar\pi_T \frac{1}{n_T}(\bar\pi_T \varepsilon D \mathbf{1}) \mathbf{1}^T X_0 \mathbf{1} + \mathbf{1}^T X_0 \frac{1}{n_T}(\bar\pi_T \varepsilon D \mathbf{1}) + o(\varepsilon). \quad \Box$$

Since $R\mathbf{1} + T\mathbf{1} = \mathbf{1}$ and $\bar T \mathbf{1} = \mathbf{1}$, in lieu of $\bar\pi_T \varepsilon D \mathbf{1}$ we can write $\bar\pi_T R \mathbf{1}$. The latter expression has a clear probabilistic interpretation. It is the probability to exit ESCC in one step starting from the distribution $\bar\pi_T$. Later we shall demonstrate that this probability is indeed small. We note that not only $\bar\pi_T R \mathbf{1}$ is small but also the factor $1/n_T$ is small, as the number of states in ESCC is large.

In the next Proposition 3 we provide an alternative expression for the first order term of $\hat\pi_T$.

Proposition 3

$$\hat\pi_T = \bar\pi_T - \varepsilon \bar\pi_T D H + \varepsilon \frac{1}{n_T}(\bar\pi_T D \mathbf{1})\, \mathbf{1}^T H + o(\varepsilon).$$

Proof: Let us consider $\hat\pi_T$ as a power series:

$$\hat\pi_T = \hat\pi_T^{(0)} + \varepsilon \hat\pi_T^{(1)} + \varepsilon^2 \hat\pi_T^{(2)} + \dots.$$

From (2) we obtain

$$\hat\pi_T = \bar\pi_T - \bar\pi_T \frac{1}{n_T}(\bar\pi_T \varepsilon D \mathbf{1}) \mathbf{1}^T X_0 \mathbf{1} + \mathbf{1}^T X_0 \frac{1}{n_T}(\bar\pi_T \varepsilon D \mathbf{1}) + o(\varepsilon)
= \bar\pi_T + \varepsilon \left( \mathbf{1}^T X_0 \frac{1}{n_T}(\bar\pi_T D \mathbf{1}) - \bar\pi_T \frac{1}{n_T}(\bar\pi_T D \mathbf{1}) \mathbf{1}^T X_0 \mathbf{1} \right) + o(\varepsilon),$$

and hence

$$\hat\pi_T^{(1)} = \mathbf{1}^T X_0 \frac{1}{n_T}(\bar\pi_T D \mathbf{1}) - \bar\pi_T \frac{1}{n_T}(\bar\pi_T D \mathbf{1}) \mathbf{1}^T X_0 \mathbf{1}, \qquad (3)$$

where $X_0$ is given by (30). Before substituting (30) into (3), let us make the transformations

$$X_0 = (I - X_{-1} D) H (I - D X_{-1}) = H - H D X_{-1} - X_{-1} D H + X_{-1} D H D X_{-1},$$

where $X_{-1}$ is defined by (29). Pre-multiplying $X_0$ by $\mathbf{1}^T$, we obtain

$$\mathbf{1}^T X_0 = \mathbf{1}^T H - \bar\pi_T (\mathbf{1}^T H D \mathbf{1})(\bar\pi_T D \mathbf{1})^{-1} - n_T (\bar\pi_T D \mathbf{1})^{-1} \bar\pi_T D H + n_T (\bar\pi_T D H D \mathbf{1})(\bar\pi_T D \mathbf{1})^{-2} \bar\pi_T. \qquad (4)$$

Post-multiplying $X_0$ by $\mathbf{1}$, we obtain

$$X_0 \mathbf{1} = X_{-1} D H D X_{-1} \mathbf{1} - H D X_{-1} \mathbf{1},$$

and hence

$$\mathbf{1}^T X_0 \mathbf{1} = n_T (\bar\pi_T D H D \mathbf{1})(\bar\pi_T D \mathbf{1})^{-2} - \mathbf{1}^T H D \mathbf{1}\,(\bar\pi_T D \mathbf{1})^{-1}. \qquad (5)$$

Substituting (5) and (4) into (3), we get

$$\hat\pi_T^{(1)} = \mathbf{1}^T X_0 \frac{1}{n_T}(\bar\pi_T D \mathbf{1}) - \bar\pi_T \frac{1}{n_T}(\bar\pi_T D \mathbf{1}) \mathbf{1}^T X_0 \mathbf{1}$$
$$= \mathbf{1}^T H \frac{1}{n_T}(\bar\pi_T D \mathbf{1}) - \frac{1}{n_T} \bar\pi_T\, \mathbf{1}^T H D \mathbf{1} - \bar\pi_T D H + \bar\pi_T (\bar\pi_T D H D \mathbf{1})(\bar\pi_T D \mathbf{1})^{-1} - \bar\pi_T (\bar\pi_T D H D \mathbf{1})(\bar\pi_T D \mathbf{1})^{-1} + \frac{1}{n_T} \bar\pi_T\, \mathbf{1}^T H D \mathbf{1}$$
$$= \mathbf{1}^T H \frac{1}{n_T}(\bar\pi_T D \mathbf{1}) - \bar\pi_T D H.$$

Thus, we have

$$\hat\pi_T^{(1)} = \mathbf{1}^T H \frac{1}{n_T}(\bar\pi_T D \mathbf{1}) - \bar\pi_T D H. \quad \Box$$

Next, we consider a quasi-stationary distribution [9, 22] defined by the equation

$$\tilde\pi_T T = \lambda_1 \tilde\pi_T, \qquad (6)$$

and the normalization condition

$$\tilde\pi_T \mathbf{1} = 1, \qquad (7)$$

where $\lambda_1$ is the Perron-Frobenius eigenvalue of matrix $T$. The quasi-stationary distribution can be interpreted as a proper initial distribution on the non-absorbing states (states in ESCC) which is such that the distribution of the random walk, conditioned on non-absorption prior to time $t$, is independent of $t$ [11]. As in the analysis of the pseudo-stationary distribution, we consider the substochastic matrix $T$ as a perturbation of the stochastic matrix $\bar T$ and obtain the following result.
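A small computational sketch of definition (6)-(7) (illustrative and not from the report; dense linear algebra, so only meant for small examples): $\tilde\pi_T$ is the left Perron-Frobenius eigenvector of $T$, normalized to sum to one.

```python
import numpy as np

def quasi_stationary(T):
    """Left Perron-Frobenius eigenvector of the substochastic matrix T:
    tilde pi_T T = lambda_1 tilde pi_T with tilde pi_T 1 = 1 (Eqs. (6)-(7))."""
    vals, vecs = np.linalg.eig(T.T)      # left eigenvectors of T = right eigenvectors of T^T
    k = np.argmax(vals.real)             # Perron-Frobenius eigenvalue = spectral radius (real)
    pi = np.abs(vecs[:, k].real)         # Perron vector can be taken entrywise nonnegative
    return pi / pi.sum(), vals[k].real   # (tilde pi_T, lambda_1)
```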

Proposition 4 The vector $\tilde\pi_T$ is close to the vector $\bar\pi_T$. Namely,

$$\tilde\pi_T = \bar\pi_T - \varepsilon \bar\pi_T D H + o(\varepsilon).$$

Proof: We look for the quasi-stationary distribution and the Perron-Frobenius eigenvalue in the form of power series

$$\tilde\pi_T = \tilde\pi_T^{(0)} + \varepsilon \tilde\pi_T^{(1)} + \varepsilon^2 \tilde\pi_T^{(2)} + \dots, \qquad (8)$$
$$\lambda_1 = 1 + \varepsilon \lambda_1^{(1)} + \varepsilon^2 \lambda_1^{(2)} + \dots.$$

Substituting $T = \bar T - \varepsilon D$ and the above series into (6), and equating terms with the same powers of $\varepsilon$, we obtain

$$\tilde\pi_T^{(0)} \bar T = \tilde\pi_T^{(0)}, \qquad (9)$$
$$\tilde\pi_T^{(1)} \bar T - \tilde\pi_T^{(0)} D = \tilde\pi_T^{(1)} + \lambda_1^{(1)} \tilde\pi_T^{(0)}. \qquad (10)$$

Substituting (8) into the normalization condition (7), we get

$$\tilde\pi_T^{(0)} \mathbf{1} = 1, \qquad (11)$$
$$\tilde\pi_T^{(1)} \mathbf{1} = 0. \qquad (12)$$

From (9) and (11) we conclude that $\tilde\pi_T^{(0)} = \bar\pi_T$. Thus, the equation (10) takes the form

$$\tilde\pi_T^{(1)} \bar T - \bar\pi_T D = \tilde\pi_T^{(1)} + \lambda_1^{(1)} \bar\pi_T.$$

Post-multiplying this equation by $\mathbf{1}$, we get

$$\tilde\pi_T^{(1)} \bar T \mathbf{1} - \bar\pi_T D \mathbf{1} = \tilde\pi_T^{(1)} \mathbf{1} + \lambda_1^{(1)} \bar\pi_T \mathbf{1}.$$

Now using $\bar T \mathbf{1} = \mathbf{1}$, (11) and (12), we conclude that

$$\lambda_1^{(1)} = -\bar\pi_T D \mathbf{1},$$

and, consequently,

$$\lambda_1 = 1 - \varepsilon \bar\pi_T D \mathbf{1} + o(\varepsilon). \qquad (13)$$

Now the equation (10) can be rewritten as follows:

$$\tilde\pi_T^{(1)} [I - \bar T] = \bar\pi_T [(\bar\pi_T D \mathbf{1}) I - D].$$

Its general solution is given by

$$\tilde\pi_T^{(1)} = \nu \bar\pi_T + \bar\pi_T [(\bar\pi_T D \mathbf{1}) I - D] H,$$

where $\nu$ is some constant. To find the constant $\nu$, we substitute the above general solution into condition (12):

$$\tilde\pi_T^{(1)} \mathbf{1} = \nu \bar\pi_T \mathbf{1} + \bar\pi_T [(\bar\pi_T D \mathbf{1}) I - D] H \mathbf{1} = 0.$$

Since $\bar\pi_T \mathbf{1} = 1$ and $H \mathbf{1} = 0$, we get $\nu = 0$. Consequently, we have

$$\tilde\pi_T^{(1)} = \bar\pi_T [(\bar\pi_T D \mathbf{1}) I - D] H = (\bar\pi_T D \mathbf{1}) \bar\pi_T H - \bar\pi_T D H = -\bar\pi_T D H.$$

In the above, we have used the fact that $\bar\pi_T H = 0$. This completes the proof. $\Box$

Since $\lambda_1$ is very close to one, we conclude from (13) and the equality $\varepsilon \bar\pi_T D \mathbf{1} = \bar\pi_T R \mathbf{1}$ that indeed $\bar\pi_T R \mathbf{1}$ is typically very small.

Proposition 5 The Perron-Frobenius eigenvalue $\lambda_1$ of matrix $T$ is given by

$$\lambda_1 = 1 - \tilde\pi_T R \mathbf{1}. \qquad (14)$$

Proof: Post-multiplying the equation (6) by $\mathbf{1}$, we obtain

$$\lambda_1 = \tilde\pi_T T \mathbf{1}.$$

Then, using the fact that $T \mathbf{1} = \mathbf{1} - R \mathbf{1}$, we derive the formula (14). $\Box$

Proposition 5 indicates that if $\lambda_1$ is close to one then $\tilde\pi_T R \mathbf{1}$ is small.

As we mentioned above, the $\bar T_{ij}$ entry of the matrix $\bar T$ can be considered as a conditional probability to jump from the node $i$ to the node $j$ under the condition that the random walk does not leave ESCC at the jump.

Let us consider the situation when the random walk stays inside ESCC after some finite number of jumps. The probability of such an event can be expressed as follows:

$$P\left( X_1 = j \,\middle|\, X_0 = i \wedge \bigwedge_{m=1}^{N} X_m \in S \right),$$

where ESCC is denoted by $S$ for the sake of shortening notation and $N$ is the number of jumps during which the random walk stays in ESCC.

Let us denote by $T_{ij}^{(N)}$ the element of $T^N$ (the $N$-th power of $T$) and by $T_i^{(N)}$ the $i$-th row of the matrix $T^N$. Then

$$T_i^{(N)} = (T^N)_i = (T\, T^{N-1})_i = T_i T^{N-1}.$$

Proposition 6

$$P\left( X_1 = j \,\middle|\, X_0 = i \wedge \bigwedge_{m=1}^{N} X_m \in S \right) = \frac{T_{ij}\, T_j^{(N-1)} \mathbf{1}}{T_i^{(N)} \mathbf{1}}. \qquad (15)$$

Proof: see Appendix.

Then, if we denote

$$\check T_{ij}^{(N)} = P\left( X_1 = j \,\middle|\, X_0 = i \wedge \bigwedge_{m=1}^{N} X_m \in S \right),$$

we will be able to find stationary distributions of $\check T^{(N)}$, which can be viewed as a generalization of $\bar\pi_T$. Let us now consider the limiting case, when $N$ goes to infinity.

Before we continue let us analyze the principal right eigenvector $u$ of the matrix $T$:

$$T u = \lambda_1 u, \qquad (16)$$

where $\lambda_1$ is, as in the previous section, the Perron-Frobenius eigenvalue.

The vector $u$ can be normalized in different ways. Let us define the main normalization for $u$ as

$$\mathbf{1}^T u = n_T.$$

Let us also define $\bar u$ as

$$\bar u = \frac{u}{\bar\pi_T u}, \quad \text{so that} \quad \bar\pi_T \bar u = 1, \qquad (17)$$

and

$$\tilde u = \frac{u}{\tilde\pi_T u}, \quad \text{so that} \quad \tilde\pi_T \tilde u = 1. \qquad (18)$$

Proposition 7 The vector $\bar u$ is close to the vector $\mathbf{1}$. Namely,

$$\bar u = \mathbf{1} - \varepsilon H D \mathbf{1} + o(\varepsilon).$$

Proof: We look for the right eigenvector and the Perron-Frobenius eigenvalue in the form of power series

$$\bar u = \bar u^{(0)} + \varepsilon \bar u^{(1)} + \varepsilon^2 \bar u^{(2)} + \dots, \qquad (19)$$
$$\lambda_1 = 1 + \varepsilon \lambda_1^{(1)} + \varepsilon^2 \lambda_1^{(2)} + \dots.$$

Substituting $T = \bar T - \varepsilon D$ and the above series into (16), and equating terms with the same powers of $\varepsilon$, we obtain

$$\bar T \bar u^{(0)} = \bar u^{(0)}, \qquad (20)$$
$$\bar T \bar u^{(1)} - D \bar u^{(0)} = \bar u^{(1)} + \lambda_1^{(1)} \bar u^{(0)}. \qquad (21)$$

Substituting (19) into the normalization condition (17), we obtain

$$\bar\pi_T \bar u^{(0)} = 1, \qquad (22)$$
$$\bar\pi_T \bar u^{(1)} = 0. \qquad (23)$$

From (20) and (22) we conclude that $\bar u^{(0)} = \mathbf{1}$. Thus, the equation (21) takes the form

$$\bar T \bar u^{(1)} - D \mathbf{1} = \bar u^{(1)} + \lambda_1^{(1)} \mathbf{1}.$$

Pre-multiplying this equation by $\bar\pi_T$ and using $\bar\pi_T \bar T = \bar\pi_T$, we get

$$\bar\pi_T \bar u^{(1)} - \bar\pi_T D \mathbf{1} = \bar\pi_T \bar u^{(1)} + \lambda_1^{(1)} \bar\pi_T \mathbf{1}.$$

Now, since the terms $\bar\pi_T \bar u^{(1)}$ cancel and $\bar\pi_T \mathbf{1} = 1$, we conclude that

$$\lambda_1^{(1)} = -\bar\pi_T D \mathbf{1},$$

and, consequently,

$$\lambda_1 = 1 - \varepsilon \bar\pi_T D \mathbf{1} + o(\varepsilon).$$

Now the equation (21) can be rewritten as follows:

$$[I - \bar T] \bar u^{(1)} = [(\bar\pi_T D \mathbf{1}) I - D] \mathbf{1}.$$

Its general solution is given by

$$\bar u^{(1)} = \nu \mathbf{1} + H [(\bar\pi_T D \mathbf{1}) I - D] \mathbf{1},$$

where $\nu$ is some constant. To find the constant $\nu$, we substitute the above general solution into condition (23):

$$\bar\pi_T \bar u^{(1)} = \nu \bar\pi_T \mathbf{1} + \bar\pi_T H [(\bar\pi_T D \mathbf{1}) I - D] \mathbf{1} = 0.$$

Since $\bar\pi_T \mathbf{1} = 1$ and $\bar\pi_T H = 0$, we get $\nu = 0$. Consequently, we have

$$\bar u^{(1)} = H [(\bar\pi_T D \mathbf{1}) I - D] \mathbf{1} = (\bar\pi_T D \mathbf{1}) H \mathbf{1} - H D \mathbf{1} = -H D \mathbf{1}.$$

In the above, we have used the fact that $H \mathbf{1} = 0$. This completes the proof. $\Box$

Proposition 8 The following convergence takes place

$$\tilde u_i = \lim_{n \to \infty} \frac{T_i T^{n-1} e}{\lambda_1^n}, \qquad (24)$$

where $T_i$ is the $i$-th row of the matrix $T$ and $e$ is the vector of ones.

Proof: Consider the iterations

$$\tilde u_i^{(1)} = \frac{T_i e}{\tilde\pi_T T e} = \frac{T_i e}{\lambda_1},$$
$$\tilde u_i^{(2)} = \frac{T_i \tilde u^{(1)}}{\tilde\pi_T T \tilde u^{(1)}} = \frac{T_i T e}{\lambda_1 \lambda_1} = \frac{T_i T e}{\lambda_1^2},$$
$$\tilde u_i^{(3)} = \frac{T_i \tilde u^{(2)}}{\tilde\pi_T T \tilde u^{(2)}} = \frac{T_i T^2 e}{\lambda_1^3},$$
$$\dots \quad \Box$$

Let us consider the twisted kernel $\check T$ defined by

$$\check T_{ij} = \frac{T_{ij} u_j}{\lambda_1 u_i}.$$

As one can see, the twisted kernel does not depend on the normalization of $u$. Hence, we can take any normalization.

Proposition 9 The twisted kernel is the limit of (15) as $N$ goes to infinity, that is

$$\check T_{ij} = \lim_{N \to \infty} \frac{T_{ij}\, T_j^{(N-1)} \mathbf{1}}{T_i^{(N)} \mathbf{1}}.$$

Proof:

$$\frac{T_{ij}\, T_j^{(N-1)} \mathbf{1}}{T_i^{(N)} \mathbf{1}} = \frac{T_{ij}\, T_j T^{N-2} \mathbf{1}}{T_i T^{N-1} \mathbf{1}} = \frac{T_{ij}}{\lambda_1} \cdot \frac{T_j T^{N-2} \mathbf{1} / \lambda_1^{N-1}}{T_i T^{N-1} \mathbf{1} / \lambda_1^{N}}.$$

Hence

$$\lim_{N \to \infty} \frac{T_{ij}\, T_j^{(N-1)} \mathbf{1}}{T_i^{(N)} \mathbf{1}} = \frac{T_{ij}}{\lambda_1} \lim_{N \to \infty} \frac{T_j T^{N-2} \mathbf{1} / \lambda_1^{N-1}}{T_i T^{N-1} \mathbf{1} / \lambda_1^{N}} = \frac{T_{ij}}{\lambda_1} \cdot \frac{\displaystyle\lim_{N \to \infty} T_j T^{N-2} \mathbf{1} / \lambda_1^{N-1}}{\displaystyle\lim_{N \to \infty} T_i T^{N-1} \mathbf{1} / \lambda_1^{N}}.$$

Using (24), we can write

$$\lim_{N \to \infty} \frac{T_{ij}\, T_j^{(N-1)} \mathbf{1}}{T_i^{(N)} \mathbf{1}} = \frac{T_{ij}\, \tilde u_j}{\lambda_1 \tilde u_i}.$$

After renormalization, we obtain

$$\lim_{N \to \infty} \frac{T_{ij}\, T_j^{(N-1)} \mathbf{1}}{T_i^{(N)} \mathbf{1}} = \frac{T_{ij}\, u_j}{\lambda_1 u_i}. \quad \Box$$

The twisted kernel plays an important role in multiplicative ergodic theory and large deviations [14]. Note that $\check T$ is a stochastic matrix:

$$\check T_{ij} \ge 0 \ \forall i, j, \quad \text{and} \quad \sum_j \check T_{ij} = 1 \ \forall i.$$

Also, it is irreducible if there exists a path $i \to j$ under $T$ for all $i, j$, which we assume to be the case. In particular, it will have a unique stationary distribution $\check\pi_T$ associated with it:

$$\check\pi_T = \check\pi_T \check T, \qquad (25)$$
$$\check\pi_T \mathbf{1} = 1. \qquad (26)$$

If we assume aperiodicity in addition, $\check T_{ij}$ can be given the interpretation of the probability of transition from $i$ to $j$ in the ESCC for the chain, conditioned on the fact that it never leaves the ESCC. Thus, $\check\pi_T$ qualifies as an alternative definition of a quasi-stationary distribution.

Proposition 10 The following expression for $\check\pi_T$ holds:

$$\check\pi_{T,i} = \tilde\pi_{T,i}\, \tilde u_i. \qquad (27)$$

Proof: The normalization condition (26) is satisfied due to (18). Let us show that (25) holds as well, i.e.

$$\check\pi_{T,j} = \sum_{i=1}^{n_T} \check\pi_{T,i} \check T_{ij},$$

where $n_T$ is the dimension of $\check\pi_T$. For the right hand side of (27) we have

$$\sum_{i=1}^{n_T} \tilde\pi_{T,i} \tilde u_i \check T_{ij} = \sum_{i=1}^{n_T} \tilde\pi_{T,i} \tilde u_i \frac{T_{ij} \tilde u_j}{\lambda_1 \tilde u_i} = \frac{\tilde u_j}{\lambda_1} \sum_{i=1}^{n_T} \tilde\pi_{T,i} T_{ij} = \frac{\tilde u_j}{\lambda_1} \lambda_1 \tilde\pi_{T,j} = \tilde\pi_{T,j}\, \tilde u_j. \quad \Box$$
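A computational sketch of Proposition 10 (illustrative only and not from the report; it uses dense eigensolvers and is meant for small matrices): compute the left and right Perron vectors of $T$, form the twisted kernel $\check T$ and the distribution $\check\pi_{T,i} = \tilde\pi_{T,i} \tilde u_i$.

```python
import numpy as np

def twisted_quasi_stationary(T):
    """Return (check T, check pi_T) with check T_ij = T_ij u_j / (lambda_1 u_i)
    and check pi_{T,i} = tilde pi_{T,i} * tilde u_i  (Proposition 10)."""
    vals, rvecs = np.linalg.eig(T)                 # right eigenvectors
    k = np.argmax(vals.real)                       # Perron-Frobenius eigenvalue
    lam = vals[k].real
    u = np.abs(rvecs[:, k].real)                   # right Perron vector u
    vals_l, lvecs = np.linalg.eig(T.T)             # left eigenvectors
    kl = np.argmax(vals_l.real)
    pi_tilde = np.abs(lvecs[:, kl].real)
    pi_tilde = pi_tilde / pi_tilde.sum()           # tilde pi_T 1 = 1, Eq. (7)
    u_tilde = u / (pi_tilde @ u)                   # tilde pi_T tilde u = 1, Eq. (18)
    T_check = T * u[None, :] / (lam * u[:, None])  # twisted kernel (rows sum to 1)
    pi_check = pi_tilde * u_tilde                  # Proposition 10
    return T_check, pi_check

# sanity check on a random substochastic matrix:
# rng = np.random.default_rng(0); T = rng.random((5, 5)); T = 0.9 * T / T.sum(axis=1, keepdims=True)
# Tc, pc = twisted_quasi_stationary(T); print(np.abs(pc @ Tc - pc).max())   # close to 0
```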

This suggests that $\check\pi_{T,i}$, or equivalently $\tilde\pi_{T,i} \tilde u_i$, may be used as another alternative centrality measure. Since the substochastic matrix $T$ is close to stochastic, the vector $u$ will be very close to $\mathbf{1}$. Consequently, the vector $\check\pi_T$ will be close to $\tilde\pi_T$ and to $\bar\pi_T$ as well. This shows that in the case when the matrix $T$ is close to a stochastic matrix, all the alternative definitions of quasi-stationary distribution are quite close to each other. And then, from Proposition 1, we conclude that the PageRank ranking converges to the quasi-stationarity based ranking as the damping factor goes to one.

3 Numerical experiments and Applications

For our numerical experiments we have used the Web site of INRIA (http://www.inria.fr). It is a typical Web site with about 300 000 Web pages and 2 200 000 hyperlinks. Since the Web has a fractal structure [10], we expect that our dataset is sufficiently representative. Accordingly, datasets of similar or even smaller sizes have been extensively used in experimental studies of novel algorithms for PageRank computation [1, 16, 17]. To collect the Web graph data, we constructed our own Web crawler which works with the Oracle database. The crawler consists of two parts: the first part is realized in Java and is responsible for downloading pages from the Internet, parsing the pages, and inserting their hyperlinks into the database; the second part is written in PL/SQL and is responsible for the data management. For a detailed description of the crawler the reader is referred to [3].

As was shown in [7, 15], a Web graph has three major distinct components: IN, OUT and SCC. However, if one takes into account the artificial links from the dangling nodes, a Web graph has two major distinct components: POUT and ESCC [5]. In our experiments we consider the artificial links from the dangling nodes and compute $\bar\pi_T$, $\tilde\pi_T$, $\hat\pi_T$, and $\check\pi_T$ with 5 digits precision. We provide the statistics for the INRIA Web site in Table 1.

For each pair of these vectors we calculated the Kendall Tau metric (see Table 2).

    Total size                  318585
    Number of nodes in SCC      154142
    Number of nodes in IN            0
    Number of nodes in OUT      164443
    Number of nodes in ESCC     300682
    Number of nodes in POUT      17903
    Number of SCCs in OUT         1148
    Number of SCCs in POUT         631

Table 1: Component sizes in the INRIA dataset

                    $\bar\pi_T$   $\tilde\pi_T$   $\hat\pi_T$   $\check\pi_T$
    $\bar\pi_T$        1.0          0.99390        0.99498       0.98228
    $\tilde\pi_T$                   1.0            0.99770       0.98786
    $\hat\pi_T$                                    1.0           0.98597
    $\check\pi_T$                                                1.0

Table 2: Kendall Tau comparison

The Kendall Tau metric is based on the number of pairwise swaps needed to transform one ranking to the other. The Kendall Tau metric has the value of one if two rankings are identical and minus one if one ranking is the inverse of the other.

In our case, the Kendall Tau metric for all the pairs is very close to one. Thus, we can conclude that all four quasi-stationarity based centrality measures produce very similar rankings.
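For reference, Kendall Tau values like those in Table 2 can be computed with scipy (a generic sketch on toy data, not the report's own tooling or dataset).

```python
import numpy as np
from scipy.stats import kendalltau

def kendall_tau(scores_a, scores_b):
    """Kendall Tau rank correlation between two centrality score vectors:
    1.0 for identical rankings, -1.0 for exactly reversed rankings."""
    tau, _p_value = kendalltau(scores_a, scores_b)
    return tau

# toy usage with two hypothetical score vectors over the same nodes
a = np.array([0.40, 0.30, 0.20, 0.10])
b = np.array([0.35, 0.33, 0.22, 0.10])
print(kendall_tau(a, b))   # close to 1.0 for similar rankings
```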

We have also analyzed the Kendall Tau metric between $\tilde\pi_T$ and the PageRank of ESCC as a function of the damping factor (see Figure 1). As $c$ goes to one, the Kendall Tau approaches one. This is in agreement with Proposition 1.

Finally, we would like to note that in the case of quasi-stationarity based centrality measures the first ranking places were occupied by the sites with the internal structure depicted in Figure 2. Therefore, we suggest to use the quasi-stationarity based centrality measures to detect link farms and to discover photo albums. It turns out that the quasi-stationarity based centrality measures highlight the sites with a structure as in Figure 2, but at the same time the relative ranking of the other sites provided by the standard PageRank with $c = 0.85$ is preserved. To illustrate this fact, we give in Table 3 the rankings of some sites under different centrality measures. Even though the absolute value of the ranking changes, the relative ranking among these sites is the same for all centrality measures. This indicates that the quasi-stationarity based centrality measures help to discover link farms and photo albums and at the same time the ranking of sites of the other type stays consistent with the standard PageRank ranking.

                                  $\pi_T(0.85)$   $\bar\pi_T$   $\tilde\pi_T$   $\hat\pi_T$   $\check\pi_T$
    http://www.inria.fr/                1              31            189           105           200
    http://www.loria.fr/               13             310           1605           356          1633
    http://www.irisa.fr/               16             432           1696           460           757
    http://www-sop.inria.fr/           30             508           1825           532          1819
    http://www-rocq.inria.fr/          74            1333           2099          1408          2158
    http://www-futurs.inria.fr/       102            2201           2360          2206          2404

Table 3: Rankings of some sites under different centrality measures

[Figure 1: The Kendall Tau metric between $\tilde\pi_T$ and the PageRank of ESCC as a function of the damping factor. Horizontal axis: damping factor; vertical axis: Kendall Tau.]

4 Conclusion

In the paper we have proposed centrality measures which can be applied to a reducible graph to avoid the absorption problem. In Google PageRank the problem was solved by the introduction of uniform random jumps with some probability. Up to the present, there is no clear criterion for the choice of this parameter. In the paper we have suggested four quasi-stationarity based parameter-free centrality measures, analyzed them and concluded that they produce approximately the same ranking. Therefore, in practice it is sufficient to compute only one quasi-stationarity based centrality measure. All our theoretical results are confirmed by numerical experiments. The numerical experiments have also shown that the new centrality measures can be applied in spam detection to detect link farms and in image search to find photo albums.

References

[1] S. Abiteboul, M. Preda, and G. Cobena, Adaptive on-line page importance computation, in Proceedings of the 12th International World Wide Web Conference, Budapest, 2003.
[2] K. Avrachenkov, Analytic Perturbation Theory and its Applications, PhD thesis, University of South Australia, 1999.
[3] K. Avrachenkov, D. Nemirovsky, and N. Osipova, Web Graph Analyzer Tool, in Proceedings of the IEEE ValueTools Conference, 2006.
[4] K. Avrachenkov, M. Haviv and P.G. Howlett, Inversion of analytic matrix functions that are singular at the origin, SIAM Journal on Matrix Analysis and Applications, v. 22(4), pp. 1175-1189, 2001.
[5] K. Avrachenkov, N. Litvak and K.S. Pham, A singular perturbation approach for choosing PageRank damping factor, preprint, available at http://arxiv.org/abs/math.PR/0612079, 2006.
[6] P. Boldi, M. Santini, and S. Vigna, PageRank as a function of the damping factor, in Proceedings of the 14th World Wide Web Conference, New York, 2005.
[7] A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins and J. Wiener, Graph structure in the Web, Computer Networks, v. 33, pp. 309-320, 2000.
[8] P. Chen, H. Xie, S. Maslov, and S. Redner, Finding scientific gems with Google's PageRank algorithm, Journal of Informetrics, v. 1, pp. 8-15, 2007.
[9] J.N. Darroch and E. Seneta, On Quasi-Stationary Distributions in Absorbing Discrete-Time Finite Markov Chains, Journal of Applied Probability, v. 2(1), pp. 88-100, 1965.
[10] S. Dill, R. Kumar, K. McCurley, S. Rajagopalan, D. Sivakumar, and A. Tomkins, Self-similarity in the Web, ACM Trans. Internet Technol., v. 2, pp. 205-223, 2002.
[11] E.A. van Doorn, Quasi-stationary distributions and convergence to quasi-stationarity of birth-death processes, Advances in Applied Probability, v. 23(4), pp. 683-700, 1991.
[12] W.J. Ewens, The diffusion equation and pseudo-distribution in genetics, J. R. Statist. Soc. B, v. 25, pp. 405-412, 1963.
[13] J. Kleinberg, Authoritative sources in a hyperlinked environment, Journal of ACM, v. 46, pp. 604-632, 1999.
[14] I. Kontoyiannis and S.P. Meyn, Spectral theory and limit theorems for geometrically ergodic Markov processes, Annals of Applied Probability, v. 13, pp. 304-362, 2003.
[15] R. Kumar, P. Raghavan, S. Rajagopalan, D. Sivakumar, A. Tompkins and E. Upfal, The Web as a graph, PODS'00: Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pp. 1-10, 2000.
[16] A.N. Langville and C.D. Meyer, Deeper Inside PageRank, Internet Math., v. 1, pp. 335-400, 2004; also available online at http://www4.ncsu.edu/~anlangvi/.
[17] A.N. Langville and C.D. Meyer, Updating PageRank with iterative aggregation, in Proceedings of the 13th World Wide Web Conference, New York, 2004.
[18] A.N. Langville and C.D. Meyer, Google's PageRank and Beyond: The Science of Search Engine Rankings, Princeton University Press, 2006.
[19] R. Lempel and S. Moran, The stochastic approach for link-structure analysis (SALSA) and the TKC effect, Computer Networks, v. 33, pp. 387-401, 2000.
[20] C.D. Moler and K.A. Moler, Numerical Computing with MATLAB, SIAM, 2003.
[21] L. Page, S. Brin, R. Motwani and T. Winograd, The PageRank citation ranking: Bringing order to the Web, Stanford Technical Report, 1998.
[22] E. Seneta, Non-negative Matrices and Markov Chains, Springer, 1973.

Appendix

Here we present a couple of important auxiliary results.

Lemma 1 Let $\bar T$ be an irreducible stochastic matrix, and let $T(\varepsilon) = \bar T - \varepsilon D$ be a perturbation of $\bar T$ such that $T(\varepsilon)$ is a substochastic matrix. Then, for sufficiently small $\varepsilon$ the following Laurent series expansion holds

$$[I - T(\varepsilon)]^{-1} = \frac{1}{\varepsilon} X_{-1} + X_0 + \varepsilon X_1 + \dots, \qquad (28)$$

with

$$X_{-1} = \frac{1}{\bar\pi D \mathbf{1}} \mathbf{1}\bar\pi, \qquad (29)$$
$$X_0 = (I - X_{-1} D) H (I - D X_{-1}), \qquad (30)$$

where $\bar\pi$ is the stationary distribution of $\bar T$ and $H = (I - \bar T + \mathbf{1}\bar\pi)^{-1} - \mathbf{1}\bar\pi$ is the deviation matrix.

Proof: The proof of this result is based on the approach developed in [2, 4]. The existence of the Laurent series (28) is a particular case of more general results of [4]. To calculate the terms of the Laurent series, let us equate the terms with the same powers of $\varepsilon$ in the following identity

$$(I - \bar T + \varepsilon D)\left( \frac{1}{\varepsilon} X_{-1} + X_0 + \varepsilon X_1 + \dots \right) = I,$$

which results in

$$(I - \bar T) X_{-1} = 0, \qquad (31)$$
$$(I - \bar T) X_0 + D X_{-1} = I, \qquad (32)$$
$$(I - \bar T) X_1 + D X_0 = 0. \qquad (33)$$

From equation (31) we conclude that

$$X_{-1} = \mathbf{1} \mu_{-1}, \qquad (34)$$

where $\mu_{-1}$ is some vector. We find this vector from the condition that the equation (32) has a solution. In particular, equation (32) has a solution if and only if

$$\bar\pi (I - D X_{-1}) = 0.$$

By substituting into the above equation the expression (34), we obtain

$$\bar\pi - \bar\pi D \mathbf{1}\, \mu_{-1} = 0,$$

and, consequently,

$$\mu_{-1} = \frac{1}{\bar\pi D \mathbf{1}} \bar\pi,$$

which together with (34) gives (29).

Since the deviation matrix $H$ is a Moore-Penrose generalized inverse of $I - \bar T$, the general solution of equation (32) with respect to $X_0$ is given by

$$X_0 = H (I - D X_{-1}) + \mathbf{1} \mu_0, \qquad (35)$$

where $\mu_0$ is some vector. The vector $\mu_0$ can be found from the condition that the equation (33) has a solution. In particular, equation (33) has a solution if and only if

$$\bar\pi D X_0 = 0.$$

By substituting into the above equation the expression for the general solution (35), we obtain

$$\bar\pi D H (I - D X_{-1}) + \bar\pi D \mathbf{1}\, \mu_0 = 0.$$

Consequently, we have

$$\mu_0 = -\frac{1}{\bar\pi D \mathbf{1}} \bar\pi D H (I - D X_{-1})$$

and we obtain (30). $\Box$
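A numerical sketch of Lemma 1 (illustrative only and not from the report; small dense matrices, and the 2-state example below is invented): build the deviation matrix $H$ and the terms $X_{-1}$, $X_0$, and check that $\frac{1}{\varepsilon} X_{-1} + X_0$ approximates $[I - T(\varepsilon)]^{-1}$ for a small $\varepsilon$.

```python
import numpy as np

def deviation_matrix(Tbar, pi_bar):
    """H = (I - bar T + 1 bar pi)^{-1} - 1 bar pi for a stochastic bar T
    with stationary distribution bar pi (a row vector)."""
    n = Tbar.shape[0]
    one_pi = np.outer(np.ones(n), pi_bar)
    return np.linalg.inv(np.eye(n) - Tbar + one_pi) - one_pi

def laurent_terms(Tbar, D, pi_bar):
    """X_{-1} and X_0 of Eqs. (29)-(30) for the perturbation T(eps) = bar T - eps D."""
    n = Tbar.shape[0]
    H = deviation_matrix(Tbar, pi_bar)
    X_m1 = np.outer(np.ones(n), pi_bar) / (pi_bar @ D @ np.ones(n))
    X_0 = (np.eye(n) - X_m1 @ D) @ H @ (np.eye(n) - D @ X_m1)
    return X_m1, X_0

# small sanity check: two-state chain with a leak from state 0
Tbar = np.array([[0.0, 1.0], [1.0, 0.0]])
pi_bar = np.array([0.5, 0.5])
D = np.array([[0.0, 0.5], [0.0, 0.0]])
X_m1, X_0 = laurent_terms(Tbar, D, pi_bar)
eps = 1e-3
exact = np.linalg.inv(np.eye(2) - (Tbar - eps * D))
print(np.max(np.abs(exact - (X_m1 / eps + X_0))))   # of order eps or smaller
```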

Proposition 11

$$P\left( X_1 = j \,\middle|\, X_0 = i \wedge \bigwedge_{m=1}^{N} X_m \in S \right) = \frac{T_{ij}\, T_j^{(N-1)} \mathbf{1}}{T_i^{(N)} \mathbf{1}}.$$

Proof:

$$P\left( X_1 = j \,\middle|\, X_0 = i \wedge \bigwedge_{m=1}^{N} X_m \in S \right)
= \frac{P\left( X_0 = i \wedge X_1 = j \wedge \bigwedge_{m=2}^{N} X_m \in S \right)}{P\left( X_0 = i \wedge \bigwedge_{m=1}^{N} X_m \in S \right)}.$$

Denominator:

$$P\left( X_0 = i \wedge \bigwedge_{m=1}^{N} X_m \in S \right)
= P\left( X_0 = i \wedge \bigwedge_{m=1}^{N} \bigvee_{k_m \in S} X_m = k_m \right)$$
$$= P(X_0 = i) \sum_{k_1 \in S} P(X_1 = k_1 \mid X_0 = i)\, P\left( \bigwedge_{m=2}^{N} \bigvee_{k_m \in S} X_m = k_m \,\middle|\, X_1 = k_1 \right)$$
$$= P(X_0 = i) \sum_{k_1 \in S} P(X_1 = k_1 \mid X_0 = i) \sum_{k_2 \in S} P(X_2 = k_2 \mid X_1 = k_1)\, P\left( \bigwedge_{m=3}^{N} \bigvee_{k_m \in S} X_m = k_m \,\middle|\, X_2 = k_2 \right)$$
$$= P(X_0 = i) \sum_{k_2 \in S} P\left( \bigwedge_{m=3}^{N} \bigvee_{k_m \in S} X_m = k_m \,\middle|\, X_2 = k_2 \right) P(X_2 = k_2 \mid X_0 = i) = \dots$$

Here the Markov property is used at each step: once $X_m = k_m$ is fixed, the conditioning on the earlier states can be dropped, and the inner sum over $k_{m-1}$ gives the $m$-step transition probability $P(X_m = k_m \mid X_0 = i)$. Repeating the same argument for $k_3, \dots, k_{N-1}$, we obtain

$$= P(X_0 = i) \sum_{k_{N-1} \in S} P\left( \bigvee_{k_N \in S} X_N = k_N \,\middle|\, X_{N-1} = k_{N-1} \right) P(X_{N-1} = k_{N-1} \mid X_0 = i)$$
$$= P(X_0 = i) \sum_{k_N \in S} \sum_{k_{N-1} \in S} P(X_N = k_N \mid X_{N-1} = k_{N-1})\, P(X_{N-1} = k_{N-1} \mid X_0 = i)$$
$$= P(X_0 = i) \sum_{k_N \in S} P(X_N = k_N \mid X_0 = i) = P(X_0 = i) \sum_{k_N = 1}^{n_T} T_{i k_N}^{(N)} = T_i^{(N)} \mathbf{1}\, P(X_0 = i),$$

where we have used that a path from a node in $S$ to a node in $S$ cannot leave ESCC (there are no transitions from POUT back to ESCC), so $P(X_m = k_m \mid X_0 = i) = T_{i k_m}^{(m)}$ for $i, k_m \in S$. Hence

$$P\left( X_0 = i \wedge \bigwedge_{m=1}^{N} X_m \in S \right) = T_i^{(N)} \mathbf{1}\, P(X_0 = i).$$

Numerator:

$$P\left( X_0 = i \wedge X_1 = j \wedge \bigwedge_{m=2}^{N} X_m \in S \right)
= P\left( \bigwedge_{m=2}^{N} \bigvee_{k_m \in S} X_m = k_m \,\middle|\, X_1 = j \right) P(X_1 = j \mid X_0 = i)\, P(X_0 = i)
= T_{ij}\, T_j^{(N-1)} \mathbf{1}\, P(X_0 = i),$$

where, by the same argument as for the denominator, $P\left( \bigwedge_{m=2}^{N} \bigvee_{k_m \in S} X_m = k_m \,\middle|\, X_1 = j \right) = T_j^{(N-1)} \mathbf{1}$. Hence

$$P\left( X_0 = i \wedge X_1 = j \wedge \bigwedge_{m=2}^{N} X_m \in S \right) = T_{ij}\, T_j^{(N-1)} \mathbf{1}\, P(X_0 = i),$$

which gives (15). $\Box$

Contents

1 Introduction
2 Quasi-stationary distributions as centrality measures
3 Numerical experiments and Applications
4 Conclusion
References
Appendix

