A generalized model via random walks for information filtering

(1)

A

generalized

model

via

random

walks

for

information

ﬁltering

Zhuo-Ming Ren

a

,

b

,

∗

,

Yixiu Kong

a

,

Ming-Sheng Shang

b

,

∗

,

Yi-Cheng Zhang

a a_Department_of_Physics,_University_of_Fribourg,_Chemin_du_Musée_3,_CH-1700,_Fribourg,_Switzerland

b_Chongqing_Institute_of_Green_and_Intelligent_Technology,_Chinese_Academy_of_Sciences,_ChongQing,_400714,_PR_China

Therecouldexistasimplegeneralmechanismlurkingbeneathcollaborativeﬁlteringandinterdisciplinary physicsapproacheswhichhavebeensuccessfullyappliedtoonlineE-commerceplatforms.Motivatedby thisidea,weproposeageneralizedmodelemployingthedynamicsoftherandomwalkinthebipartite networks.Takingintoaccountthedegreeinformation,theproposedgeneralizedmodelcoulddeducethe collaborativeﬁltering,interdisciplinaryphysicsapproaches andeven theenormousexpansionofthem. Furthermore, weanalyzethe generalizedmodel with single and hybridof degreeinformation onthe processofrandomwalkinbipartitenetworks,andproposeapossiblestrategybyusingthehybriddegree informationfordifferentpopularobjectstotowardpromisingprecisionoftherecommendation.

1. Introduction

Over thelast decade, therapid growthofinformationinboth onlineandoffline leadstoan informationoverloadproblem[1,2]. All of surfers would have the feeling of confusion which one is the best when searchingonline, reading online, shopping online, entertaining online, oreven dating online[3,4]. Toaddress these problems, the information filtering has become a promising and effectivewaytofilterouttheirrelevant informationandprovides personalized suggestions accordingto thetrackofpast purchases ofusersaswell asother informationofproductsandusers [5,6]. Dueto itssignificancefortheeconomyandsociety,designing ef-ficient information filtering algorithms has received wide attrac-tions in manybranches ofscience such ascomputer science, in-formation science and interdisciplinary physics. One of the most promising informationfilteringalgorithms isthecollaborative fil-tering (CF) [7,8]. The CF makes work according to the database ofthe users’pasthistoryof purchasesandtheproduct searching recordstoofferthepersonalizedrecommendation.Breeseetal.[9] classified CF into two broad groups which were memory-based and model-based methods. The memory-based methods predict missinginformationandrecommendproductsbasedonsimilarity measures betweenusers and products [7,8,10]. The model-based algorithms use the collection of the user and object informa-tion to learn an information filtering model by clustering [11],

*

Corresponding authors.

E-mailaddresses:zhuomingren@gmail.com(Z.-M. Ren),msshang@cigit.ac.cn (M.-S. Shang).

bayesian [12], matrix factorization [13,14] and other machine learning techniques [15,16]. Being different from the perspec-tive computer sciences, the interdisciplinary physics approaches adaptedthecomplexnetworktheoryandvariousclassicalphysics processeshaveprovidedsome newinsights andsolutionsforthe challengesintheactive fieldoftheinformationfiltering[6,17,18], forinstance,adiffusionprocessanalogoustotheheatconduction processacrossabipartite complexnetwork[19],anetwork-based inferencemethodconsideringtheresourceallocationdynamicson bipartite complex networks [20], the opinion diffusion [21] and thegravityprinciple[22]beingextendedintheinformation filter-ing,the informationcore andinformationbackbone [23,24] shed somelightonthein-depthunderstandingofinformationfiltering. Further, the review [6] highlighted a prospect of physiciststo a comprehensiveguidetoinformationfilteringalgorithms.

In summary, the CF and interdisciplinary algorithms have already been successfully applied to many well-known online e-commerce platforms.Meanwhile,manyrecentworkshavebeen devotedtostudytheexpansionofbothalgorithms,forinstance hy-bridmethod[25,26],biased-heatconduction[27,28],multi-channel diffusion[29],preferential diffusion[30,31], hybriddiffusion[32], directrandomwalksmethodbasedonCF[33],hypergraphmodel with social tag [34,35], multi-linear interactive matrix factoriza-tion [36]. These algorithms would further improvethe eﬃciency oftheinformationﬁltering.Byreferringtothem,therecouldexist asimplegeneralformulabehindCF,interdisciplinaryphysics algo-rithmsaswell astheextension methods. Motivatedby thisidea, we propose a simplegeneral modelin which employing the dy-namicsoftherandomwalkinbipartite networksandthen derive an analytical expression fortunable parameters ofthe transition

1VCMJTIFEJO1IZTJDT-FUUFST" o

(2)

Fig. 1. (Color online.) Illustration

of a bipartite network which is constructed by the tracks of the consumer purchases, which can be described as the process of the random

walkermodel.(a)Asmallexampleofabipartitenetworkwhichkeepsthetrackoftheconsumerpurchases.(b)Therandomwalkerstartsfromtheuserside.(c)Therandom walkerstartsfromtheobjectside.

probabilitymatrix.Whentakingintoaccount thedegree informa-tion,theprocessofrandomwalkerscanbeequivalenttothe repre-sentativeinformationﬁlteringalgorithmssuchastheCF[7,8],heat conduction method [21], network-based inference method [20], hybrid method [25]. These above methods are exceptional cases from the proposed generalized model. Consequently, we analyze thegeneralizedmodel withsingle andhybrid ofdegree informa-tionontheprocessofrandomwalkinbipartitenetworks,andthe effectofdegreeinformationonthedifferentpopularlevelobjects in the process of the information ﬁltering. Finally, we suggest a possiblestrategyfordifferentpopularobjectsinthe recommenda-tion when usingthe single andhybrid of degree informationon theprocessofrandomwalkinbipartitenetworks.

2. Models

InFig. 1(a),wepresentanexampleofabipartitenetworkwhich is constituted by the records of consumer purchases. We would recommendanobjecttoauserbasedonhis/hersimilarpast pur-chases with others or the purchase records of those who have some resemblancewith him/her. In a broad view,we can make personalized recommendations for each consumer depending on the past purchases of the consumer as well as information re-lating to the similarity of other consumers or items. With this concept, many online business platforms such as Alibaba, Ama-zon,Neflix,Diggarereportedtodevelopsophisticatedinformation filtering systemsto boost their onlinesales. Therefore, the infor-mationfilteringproblemisreducedtotheproblemthatestimating thevaluationforproductsthathavenotbeenseenbyconsumers. Consideringthefactthatmillionsofproductsexistinonline busi-nessplatform,theinformationfilteringproblemcouldbe thought asidentifying onlythe highestrankedproducts foreach user in-steadofpredictingtheratingofeachproductforeachuser.

2.1. Therandomwalkermodel

Let us consider an unweighted bipartite network G

(

U

,

O

,

E

)

, where U

= {

u1

,

u2

, · · · ,

um

}

, O

= {

o1

,

o2

, · · · ,

on

}

and E

=

{

e1

,

e2

, · · · ,

el

}

arethesetofusers,productsandlinksrespectively. The networkis describedby AM∗N, adjacencymatrixwith Aiα if there is a track between user i and product

α

, and zero other-wise.The dynamicsofarandomwalker onthebipartite network isencodedbyatransitionprobabilitymatrixwithelementsofthe form,

P

(

ui

→

oα

)

or P

(

oα

→

ui

),

(1)

measuring the probability that a walker passes from ui to oα (or oα toui).

Firstly, we can assume that a random walker starts fromthe user side at the beginning, and the walker’s initial state assigns as X0

=

u. We represent the probability P

(

u

,

o

,

t

)

as the walker

passes from the user side ui to the object side oα during the

t steps and corresponding to transitionstate which interprets as

X0

,

X1

,

· · · ,

Xt−1

,

Xt. Consequently, the probability P

(

u

,

o

,

t

)

that the walker reaches the object side _{oα during} t steps under the initialstate X0

=

u is,

P

(

u

,

o

,

t

) =

P

(

Xt

=

o

|

X0

=

u

)

(2)

andthen,frombayesiantheory,wecanget,

P

(

Xt

=

o

|

X0

=

u

)

=

o∈O P

(

Xt

=

o

|

Xt−2

=

o

,

X0

=

u

)

P

(

Xt−1

=

o

|

X0

=

u

)

=

o∈O P

(

Xt

=

o

|

Xt−2

=

o

,

X0

=

u

)

P

(

u

,

o

,

t

−

1

).

(3)

Wenowdeﬁnethewalkersfromtheusersideu andalongthe objectsideo tothenextpathway.Theeachpathwayisselectedby theprobability

ψ

.

P

(

Xt

=

o

|

Xt−2

=

o

,

X0

=

u

) = ψ.

(4)

Finally,combingtheequation(2),(3),(4), wecanget

P

(

Xt

=

o

|

X0

=

u

) =

o∈O

P

(

u

,

o

,

t

−

1

)ψ.

(5)

Therecommendationistodevelopan effectivewaytoprovide personalized suggestions accordingtothetrackofpast purchases ofusers andobjects.We canassume thewalkerscan walk along with the track of past purchases of users and objects. For sim-plicity,Fig. 1(b)and(c)representtheprocessofthewalker when considering the situationt

=

3.Fig. 1(b)showsthat thewalker’s road map (users–objects–users–objects) which starting from the userside, whileasshowninFig. 1(c), thewalker’sroadmap ap-pears objects–users–objects–users which starting fromthe object side. Starting from the user side, the random walker must walk threestepsatleasttotheobjectside,and startingfromtheobject side,therandomwalkeralsomustwalkthreestepsatleasttothe userside.So,ifan objectisselectedtorecommendtoauser,the randomwalkerneedstowalkt (t

=

2

∗

j

+

1, j

=

1

,

2

,

. . .

)steps.In summary,therearetwopathwaysforrandomwalkersinonestep. 1) Thewalkerspassfromtheusersidetotheobjectsidewith theprobabilitymatrix,

P

(

u

→

o

) = (

u

)

A

.

(6)

2) Thewalkerspassfromtheobjectside totheusersidewith theprobabilitymatrix,

P

(

o

→

u

) =

A

(

o

),

(7)

where,

(

u

)

and

(

o

)

are M

∗

M and N

∗

N diagonal matrix respectively corresponding to the user and product proﬁles, for

(3)

example,degree, eigenvector, cluster coeﬃcient andso onin the bipartite networks. Apparently, we could also presentthe proba-bilitymatrixasAT

(

u

)

or

(

o

)

AT respectively.

2.2. Therandomwalkermodelwithdegreeinformation

The transition probability matrix allows us to capture many attributes of the bipartite networks. Among these attributes, the degree information which is the very simple way ofquantifying attributes of bipartite networks would be considered in this pa-per.Firstly, wedeﬁne D

(

u

,

λ)

andD

(

o

,

λ)

asthediagonaldegree matrixwhichiscalculateddetailedly,

D

(

u

, λ) =

⎧

⎨

⎩

1 ( j Ai j)λ

,

i

=

j 0

,

i

=

j (8) D

(

o

, λ) =

⎧

⎨

⎩

1 ( i Ai j)λ

,

i

=

j 0

,

i

=

j (9)

Wethen considerthesimplestsituationoftherandomwalker dynamicswhent

=

3.Aswe canseefromFig. 1(b)and(c), there aretwowaysinonestep,whichthewalkstartsfromtheuserside ortheabject siderespectively,

1)Thewalkerspassfromtheusersidetotheobjectsidewith theprobabilitymatrix,

P

(

u

→

o

) =

D

(

u

, λ)

A

.

(10)

2)Thewalkerspassfromtheobjectsidetotheusersidewith theprobabilitymatrix,

P

(

o

→

u

) =

A D

(

o

, λ).

(11)

Now,we introducethe CF,interdisciplinary physicsalgorithms from the prospect of the dynamics of random walkers by equa-tion (10) and (11). The common procedure of the collaborative filtering technique is firstly computing similarity between users (user-based)or objects(object-based) andthen making a predic-tion based on the similarity in the first step. The basic idea in the quantification of similarity between two users is based on the numberofobjects which havebeenchosen by both users in the past. It is also possible to define a similarity between two objects based on the number of users who have chosen them. Thus, takingaccount of thesetwo steps, thesimple collaborative filtering procedure can be expressed by A AT_. _In _the _HC, _the re-source isredistributed via an averagingprocedure withusers re-ceiving a level of resource equal to the mean amount possessed bytheirneighbouringobjects,andobjectsthenreceivingbackthe meanof their neighbouringusers’resource levels.By contrast,in the NBI,theinitial resource placed onobjects isfirst evenly dis-tributedamongneighbouring usersandthenevenly redistributed back tothoseusersneighboringobjects.Both HCandNBIare re-distributedresourceinamannerakintoarandomwalkerprocess. Whereas, HCemploysa row-normalizedtransitionmatrix,that of NBIisacolumn-normalized transitionmatrix.Thereupon,the hy-bridmethod(HM)canbeachievedbyincorporatingtheatunable parameter

λ

intothetransitionmatrixnormalization.Thedetailed matricesareexpressedasD

(

o

,

1

− λ)

AT_D

₍

_u

_,

₁

₎

_{A D}

₍

_o

_,

_λ)

_AT_._When

λ

=

0 theequationequalsHC,and

λ

=

1 theequation equals NBI. Asdiscussingabove,theserecommendationprocedurescanbe ex-pressed accordingtothe randomwalker processasshownin Ta-ble 1, fromwhichthe above methodsare exceptional casesfrom equation (10) and (11). When we consider three steps in ran-dom walks, we can achieve a simple recommendation. While in the three or more steps, there exists a great number of com-binations which can develop a large amountof recommendation algorithms.

Table 1

Deducingtheproposedinformationﬁlteringmethodsrecentlyintheliteratureby the random walker model.

Method First step Second step Third step

CF A AT _A

HC A AT_D₍_u_,₁₎ _{A D}₍_o_,₁₎

NBI A D(o,1)AT _D₍_u_,₁₎_A

HM D(o,1− λ)AT _D₍_u_,₁₎_A _D₍_o_{, λ)}_AT

3. Materialsandmetrics 3.1.Materials

Inthispaper,wetestthegeneralmodelinarepresentativereal networkwhich issampledfromMovieLens[37].MovieLensis an onlinevideorecommendationwebsitewhichinvitesuserstorate videos. Therating recordsin theMovieLens rangefromone (i.e., worst)to ﬁve(i.e.,best). Ifa user i selectsan video

α

andrates it,alinkbetweentheuseri andthevideo

α

wouldbeestablished asshowninFig. 1(a).In thispaper,we onlyconsider theratings largerthan3 aslinks.Basedonthoseselectedlinks, wecan con-structa user-video bipartite network. MovieLens-1 dataset (M1) containsmore than8 millionrecords,andMovieLens-2(M2)isa subset ofM1 dataset which onlycontains 800thousand records. M2datasetisusuallyusedasbenchmarkdatasetofthe informa-tionﬁltering.Inthefollowingwereporttheresultscorresponding totheM2 dataset thatis randomlydivided intothetraining set andthetest set.The trainingset contains 90%of thedata,while thetestsetconsistsoftheremaining10%ofdata.Itisnotablethat inresults, we consider that the length of a recommendationlist

L whichcanbe proposed toeachuseris 20.Thisvaluehasbeen already used in other works and can be considered a valid ref-erenceto comparethe performance ofdifferent recommendation measurements. In addition, we run 100 independent realizations ineachexperiment.

ForMovielens data sets, we first display the degree distribu-tionofobjectsinFig. 2(a).Thedistributionexhibitsrelativelybroad tailswhichindicatesmostobjects’degreeissmall, andafew ob-jects’degreeislarge.Thissuggeststhatweshouldtakethedegree longtaileffectintoaccount.Inthispaper,weclassifytheobjects’ degree into three different levels. In the high level, we consider theobjectswhosedegreerankstop1%.Inthemediallevel,the ob-jects’degreeranksfromtop1%totop10%.Thelowlevelcontains therest90%ofobjects.Foreachtargetedlevelobjectset,theuser degree distributionis shownin Fig. 2(b) and(c). We can findin thehighdegreeobjectgroup,theuserdegreedistributionshowsa stronginclination,whichsuggeststheuserswithsmalldegreeare morelikely to select the objects in the highlevel, while forthe lowdegree objectgroup,theinclination oftheuserdegreedrops down clearly. It is notable that theseobjects accountedfor 90%. Thereupon,whendesigninganefficientinformationfiltering algo-rithms,theinclinationoftheuserpreferencetoobjectsatdifferent popularitylevelshouldbewell-considered.

3.2.Metrics

Aswe known,thediscussed aboveeach methodwillgenerate arecommendationlistL for users.Inprinciple,the recommenda-tionlist froman effective recommendationmethod should be as closeas possibleto the user purchaserecords in thereal world. While,theproposed methodsallowustoemploymanyattributes ofthe bipartite networksand tunableparameters to capturethe individualinterestsofusers.Asaresult,we needtoevaluatethat howtheattributesofthebipartitenetworksandtunable parame-terseffectthe recommendationlist.So,wedeﬁne the preference

(4)

Fig. 2. (Color

online.) The

degree distribution of objects in two movielens data sets. (a) The degree distribution of objects. The two sets have the same slope −0.95. (b) and (c) Thedistributionoftheuserdegreewithobjectswhichareinhigh,medial,lowlevelrespectively.

Fig. 3. (Coloronline.) Thepreferenceofrecommendationwhenconsideringsinglefactorsoftheuserdegreeinformation.Weemploytherandomwalkerprocesswhichis describedasD(U,λ)A AT_D2₍_u_,_λ)_A._In_order_to_compare_the_prediction_with_the_realization,_we_consider_the_tunable_parameters_λ₌_0,₀_._5,_{1 and}_the_test_set.

oftherecommendation toanalyzethatforaspeciﬁcgroupof ob-jects,theusersby whichtherecommendationalgorithmstendto recommend to. Meanwhile, we use the precisionofthe recom-mendation toanalyzehow manyobjectsinthe recommendation listareactuallyinuserpurchaserecordsintherealworld.Firstly, we limit to the objects in the highlevel andassume that for a targeteduseri,thereare H

(

i

)

highlevelobjectsinthe recommen-dation list,and C

(

i

)

highlevelobjects belongto thetest andthe recommendationlistincommon.Wedeﬁnethepreferenceof rec-ommendationasfollows, P

(

k

) =

i∈UkH

(

i

)

i∈UH

(

i

)

,

(12)

whereUkisthesetofuserswhosedegreeequalsk.Theprecision is deﬁned as the ratio of the relevant objects in the real world selected to the number of objects recommended accurately. The precisionequals, Precision

=

1 mH

i∈U C

(

i

)

H

(

i

)

,

(13)

wheremH isthe totalnumberof objectsinthe highlevel.Thus, ahighertheprecision value meansamoreaccurate predictionof therecommendation.Inaddition,thecorrespondingobjectsinthe medialandlowlevelaretakenthesameprocesses.

4. Validations

4.1. Theeffectofsinglefactors

Fora better understandingof thegeneralizedmodel, we con-siderthe single degreeinformationin theprocess oftherandom walk. Firstly, we consider the process of the random walk only withtheprobabilitymatrix(D

(

u

,

λ)

A)duringthethreesteps.The equation for the three steps of random walks can be expressed as D

(

U

,

λ)

A ATD2

(

u

,

λ)

A. The preference of recommendation is showninFig. 3.With

λ

increasing,therecommendationalgorithm

islesslikelytorecommendobjectswhichareinthehighand me-dial level,butforobjectswhichare inthelow levelasshownin Fig. 3(c), with

λ

increasing, thecorresponding algorithm is more inclined to recommendtheseobjects. Itsuggeststhat the predic-tionofrecommendationforobjectswhichareinhighandmedial leveliscontrarytothatforobjectsinlowlevelwith

λ

increasing.

Meanwhile,wediscussthedynamicsoftherandomwalkwhich is only affectedby theobject degree informationwith the prob-ability matrix (D

(

o

,

λ)

AT_). _The _random _walk _process _can _be ex-pressed as A D2

₍

_o

_,

_λ)

_AT_{A D}

₍

_o

_,

_λ)

_._The _results_of_the recommenda-tionpreferenceareshowninFig. 4.Itisnotablethat when

λ

=

1, thepredictionresultsthat therecommendationalgorithmcannot recommendtheobjectswhichareinthehighandmediallevelas showninFig. 4(a)and(b).AsgiveninFig. 4(b)and(c),the recom-mendationalgorithmismoretendencytorecommendtheobjects whichareinmedialandlowlevelwith

λ

decreasing.Insummary, whenonlyconsideringtheuserdegreeinformation,objectsinthe high and mediallevel are lesslikely to be recommended to the users with the parameter

λ

increasing, while the objects in the low level, the preference of recommendation is reversed. When only considering the object degree information withthe param-eter

λ

increasing,the objectsinthelow levelarelessinclined to berecommendedtotheusers.

Fig. 5 showsthathow theprecision changes withthetunable parameter

λ

whentheprocessoftherandomwalkisonlyaffected bytheobjectoruserdegreeinformation.Thereexistdifferent opti-malparametersleadingtothehighestprecisionvalueforobjectsin thehigh,medialandlowlevelrespectively.Firstly,weanalyzethat the precision value changes with

λ

varying when only consider-ingtheuserdegreeinformation.Theprecisionabouttheobjectsin thehighleveliscloselyto1 andkeepsstably.Whileinthemedial level, theprecision declines slowly when

λ

increases, inaddition in low level, the precision value is very low, and also increases slowly.Butwhenonlyconsidertheobjectdegreeinformation,the precision can reach0.11inthe lowlevel. Inthehighandmedial level,when

λ

ismorethan0

.

75,theprecisionstaysat0.Themain reason is that when

λ

rising, the recommendation algorithm do not recommendedobjects inthehighandmediallevelasshown

(5)

Fig. 4. (Color

online.) The recommendation preference when considering single factors of object degree information. We employ the random walker process which can be

describedasA D2₍_o_,_λ)_AT_{A D}₍_o_,_λ)_._In_order_to_compare_the_prediction_with_the_realization,_we_consider_the_tunable_parameters_λ₌_0,₀_._5,_{1 and}_the_test_set.

Fig. 5. (Color

online.) Precision with the tunable parameter

λfor the prediction of the recommendation when considering the single degree information in the process of the random walk. Squares () indicate that the precision of the recommendation when the random walker process with the probability matrix D(o,λ)AT_{uses the object degree}

information. Circles (◦) report that the precision of the recommendation when the dynamics of random walk affects by user degree information with the probability matrix (D(u,λ)AT_).

Fig. 6. (Color

online.) The recommendation preference when considering the hybrid factors of the constant user degree information and the variable object degree information.

Weemploytherandomwalkerprocesswhichisdescribedas(D(o,1− λ)AT_D₍_u_,₁₎_{A D}₍_o_,_λ)_AT_)._In_order_to_compare_the_prediction_with_the_realization,_we_consider_the

tunable parameters λ=0, 0.4, 0.8, 1 and the test set.

inFig. 4(a)and(b).As aresult,inordertoimprovetheprecision oftherecommendation,weshouldtakedifferentstrategiesto rec-ommendtheobjectsatdifferentpopularitylevels.

4.2. Theeffectofhybridfactors

Inthissection,weanalyzetheeffectofthehybridfactorsabout the degree information of users andobjects. Firstly, we consider theformula(D

(

o

,

1

−λ)

AT_D

₍

_u

_,

₁

₎

_{A D}

₍

_o

_,

_λ)

_AT_),_in_which_the_object degreeinformationanduserdegreeinformationarebothtaken ac-countintheinformationﬁlteringsystems.Inthesystem,theuser degree information isconstant, while theobject degree informa-tionistunable.Wecandescribe thedynamicsofrandomwalk in detail.In theﬁrst step,therandom walkerspassfromthe object side to theuser side whoare only affectedby the objectdegree information with the probability matrix (D

(

o

,

1

− λ)

AT). In the second step, the random walkers move through the access from theusersidetotheobjectside withtheconstantprobability ma-trix (D

(

u

,

1

)

A). While in the third step, the random walkers go to the user side fromthe object side withthe tunable probabil-itymatrix(D

(

o

,

λ)

AT) whichjustincludestheobjectinformation. Inaddition,we considertheparameter

λ

=

0,0

.

4, 0

.

8 and1. We

cansee fromFig. 6(a) inthe highlevel,when

λ

=

0,the recom-mendationpreferenceﬁtswiththetestsetwell.However,when

λ

increasesto 1,thepreferenceofrecommendationtoobjectsinthe highlevelbecomesclearer,whichsuggeststhatthe recommenda-tionalgorithm ismoreinclined to recommend theobjects inthe highlevel.FromFig. 6(b)and(c)inthemedialandlow level,the preferenceofrecommendationto objectsalsobecomes more evi-dentas

λ

increasing.

After discussing the random walker dynamics with the con-stant user degree information and the tunable object degree in-formation,wethen analyzetheprocessofthe randomwalk with thetunable userdegree information andthe constant object de-gree information. In the ﬁrst step, walkers start from the user sidetoobject side,weonly considertheuserdegreeinformation withthetunableprobability matrix(D

(

u

,

1

− λ)

A).In thesecond step, the random walkers pass from the object side to the user side determined by the constant object degree informationwith the probability matrix (D

(

o

,

1

)

AT). In the last step, the walkers trackfrom the user side to the object side, the user degree in-formationisalso takenintoaccount withthetunable probability matrix(D

(

u

,

λ)

A). Theresults oftherecommendationpreference are revealedin Fig. 7, it is surprisingto usthat the

(6)

correspond-Fig. 7. (Color

online.) The recommendation preference when considering the hybrid factors of the constant object degree information and the variable user degree information.

Weemploytherandomwalkerprocesswhichisdescribedas(D(u,1− λ)A D(o,1)AT_D₍_u_,_λ)_A)._In_order_to_compare_the_prediction_with_the_realization,_we_consider_the

tunableparametersλ=0,0.4,0.8,1 andthetestset.

Fig. 8. (Color

online.) Precision with the tunable parameter

λfor the prediction of recommendation when considering the effect of the hybrid factors about the degree information of users and objects. Squares () indicate that the precision of the recommendation when the random walker dynamics with the constant user degree infor-mation and the tunable object degree inforinfor-mation. Circles (◦) report that the precision of recommendation when the process of random walk with the tunable user degree informationandtheconstantobjectdegreeinformation.

ing cannot recommend theobjects inthe highandmediallevel to the users when

λ

=

1 as shown in Fig. 7(a) and (b). When

λ

=

0, 0

.

4, 0

.

8,thepreferenceofrecommendationissimilar,which suggeststhetunableparameter

λ

could notaffectthe recommen-dationfortheobjectsinthehighandmediallevelconsideringthe constant objectdegree informationandthe tunablevariable user degree information.While inFig. 7(c)inthe low level,we could seethat with

λ

increasing,therecommendationalgorithm isless likelytorecommendobjects,whichisdifferentfromthat consider-ingtheconstantuserdegreeinformationandthetunablevariable objectdegreeinformation.

Wealsoperformtheanalysisofthepredictionprecisionwhen takingdifferent hybrid factorsinto account as revealedin Fig. 8. When the process ofrandom walk withthe tunable userdegree information and the constant object degree information as the parameter

λ

increases, the precision of recommendation always equals to 1 inthehighlevel, andincreasesslowly in themedial level,whileinthelow levelincreasesmoreslowly.Thatistosay, theprecisionofrecommendationdoesnot changemuchwiththe tunableparameter

λ

varying,which isdifferentfrom theprocess ofrandomwalkaffectedbythetunableobjectdegreeinformation andtheconstantuserdegreeinformation,inwhichasthe param-eter

λ

increases, the precision also rises to the highest value 1 andthen keepsstableinthehighlevel,whileinthemediallevel theprecision can reach thehighestvalue 0

.

49 andthen declines slowly.Inthelow level,theprecision valuealwaysdescends with the tunableparameter

λ

increasing. In general, it is easy to ﬁnd optimal strategy for objects in high andmedial level to recom-mendthemaccurately andeﬃciently,butforobjectsinlowlevel, we should consider a better method to improvethe accuracy of therecommendation.

5. Conclusionsanddiscussions

Sofar,thecollaborativeﬁlteringtechniquesandthe interdisci-plinary physicsapproachesattract awealthofempirical and

the-oretical research andproduce many successfulE-commerce plat-forms.Therecould exista simplegeneralformulabehindthe col-laborative filtering, interdisciplinary physics algorithms and even theextensionmethods.Inagreementwiththepreviousstudies,we employed thedynamicsoftherandomwalks,andtheframework ofthegeneralmodelhadthreekeysteps,whichwere(1)building the initial transition probability matrix withtunable parameters; (2)obtainingthetrackofrandomwalkers;and(3)figuringoutthe walkertrackstopredictthepreferenceofusers.Weconsideredthe simplest degree information in bipartite networks and proposed a generalizedformula, which could deduceand develop the pro-posed well-known information filtering methods recently in the literaturessuchascollaborativefiltering,network-basedinference, heat conductionandhybridmethod.Furthermore,we studiedthe generalized method with the single and hybrid degree informa-tionontheprocessoftherandomwalkinthebipartitenetworks. The results stated clearly that the user degree information af-fected the recommendationpreference which is in contrastwith onlyconsideringtheobjectdegreeinformation,furtherthe perfor-manceofrecommendationwas affectedbytheprocessofrandom walk when taking account differentsingle and hybriddegree in-formation.Asaresult,wecouldsuggestthatusingthetwohybrid factors adjusttherecommendationpreference fordifferent popu-larobjectstotowardpromisingprecisionofrecommendation.Our findings could provide severalpragmatic lessons forthe efficient recommendationstrategyintheE-businessplatformsandefficient interactive information prediction in online social networks. For instance,the differentpopularlevel objectscould develop an ap-propriate strategytomakerecommendationsforusers,aswellas considering the preferenceofuserswithdifferent activity. Mean-while, thiswork leads tomanyextension, forinstance, theother information of bipartite networkssuch as clustering, eigenvector, and assortative coefficient could also be applied in the random walkdynamicsinthefurtherresearch.Inaddition,wejustdevelop the processoftherandom walkto informationfiltering consider-ingsimplestsituationinourproposedgeneralizedmethod,andwe

(7)

willfurtherstudythemachinelearningtechniquetotrainthe op-timal road map of random walkers for offering the personalized recommendation.

Acknowledgements

We acknowledge Rui Xiao for fruitful discussion and assis-tance. This work is partially supported by the EU FP7 Grant 611272 (ProjectGROWTHCOM), theSwissNationalScience Foun-dation (Grant No. 200020-143272), andthe NationalNatural Sci-enceFoundationofChina(GrantNos.61370150,61433014). References

[1]D.J.Watts,Atwenty-ﬁrstcenturyscience,Nature445 (7127)(2007)489. [2]J.B.Schafer,J.A.Konstan,J.Riedl,E-commercerecommendationapplications,

in: Applications of Data Mining to Electronic Commerce, Springer, 2001, pp. 115–153.

[3]A. Vespignani, Predicting the behavior of techno-social systems, Science 325 (5939)(2009)425.

[4]Z.-M.Ren, Y.-Q. Shi, H. Liao, Characterizing popularity dynamics ofonline videos,PhysicaA453(2016)236–241.

[5]P.B.Kantor,L.Rokach,F.Ricci,B.Shapira,RecommenderSystemsHandbook, Springer,2011.

[6]L.Lü,M.Medo,C.H.Yeung,Y.-C.Zhang,Z.-K.Zhang,T.Zhou,Recommender systems,Phys.Rep.519 (1)(2012)1–49.

[7]P.Resnick,N.Iacovou,M.Suchak,P.Bergstrom,J.Riedl,Grouplens:anopen architecture for collaborative ﬁltering of netnews, in: Proceedings of the 1994ACMConferenceonComputerSupportedCooperativeWork,ACM,1994, pp. 175–186.

[8]B.Sarwar,G.Karypis,J.Konstan,J.Riedl,Item-basedcollaborativeﬁltering rec-ommendationalgorithms,in:Proceedingsofthe10thInternationalConference onWorldWideWeb,ACM,2001,pp. 285–295.

[9]J.S.Breese,D.Heckerman,C.Kadie,Empiricalanalysisofpredictivealgorithms for collaborativeﬁltering, in: Proceedingsof the Fourteenth Conference on UncertaintyinArtiﬁcialIntelligence,MorganKaufmannPublishersInc.,1998, pp. 43–52.

[10]D.Goldberg, D. Nichols,B.M. Oki, D. Terry,Using collaborativeﬁltering to weaveaninformationtapestry,Commun.ACM35 (12)(1992)61–70. [11]L.H.Ungar,D.P.Foster,Clusteringmethodsforcollaborativeﬁltering,in:AAAI

WorkshoponRecommendationSystems,vol.1,1998,pp. 114–129.

[12] L. Ungar,D.P. Foster, Aformal statisticalapproachto collaborativeﬁltering, CONALD’98.

[13]Y.Azar,A.Fiat,A.Karlin,F.McSherry,J.Saia,Spectralanalysisofdata,in: Pro-ceedingsoftheThirty-ThirdAnnualACMSymposiumonTheoryofComputing, ACM,2001,pp. 619–626.

[14]Y.Koren,R.Bell,C.Volinsky,Matrixfactorizationtechniquesforrecommender systems,Computer8(2009)30–37.

[15]D.M.Blei,A.Y.Ng,M.I.Jordan,LatentDirichletallocation,J.Mach.Learn.Res.3 (2003)993–1022.

[16]R.Keshavan,A.Montanari, S.Oh,Matrixcompletionfromnoisyentries,in: AdvancesinNeuralInformationProcessingSystems,2009,pp. 952–960. [17]R.Albert,A.-L.Barabási,Statisticalmechanicsofcomplexnetworks,Rev.Mod.

Phys.74 (1)(2002)47–97.

[18]C.Castellano,S.Fortunato,V.Loreto,Statisticalphysicsofsocialdynamics,Rev. Mod.Phys.81 (2)(2009)591–646.

[19]Y.-C.Zhang,M.Blattner,Y.-K.Yu,Heatconductionprocessoncommunity net-worksasarecommendationmodel,Phys.Rev.Lett.99 (15)(2007)154301. [20]T.Zhou,J.Ren,M.Medo,Y.-C.Zhang,Bipartitenetworkprojectionandpersonal

recommendation,Phys.Rev.E76 (4)(2007)046115.

[21]Y.-C.Zhang,M.Medo,J.Ren,T.Zhou,T.Li,F.Yang,Recommendationmodel basedonopiniondiffusion,Europhys.Lett.80 (6)(2007)68003.

[22]J.-H.Liu,Z.-K.Zhang,C.Yang,L.Chen,C.Liu,X.Wang,Gravityeffectson infor-mationﬁlteringandnetworkevolving,PLoSONE9 (3)(2014)e91070. [23]W.Zeng,A.Zeng,H.Liu,M.-S.Shang,T.Zhou,Uncoveringtheinformationcore

inrecommendersystems,Sci.Rep.4(2014)6140.

[24]Q.-M.Zhang,A.Zeng,M.-S.Shang,Extractingtheinformationbackbonein on-linesystem,PLoSONE8 (5)(2013)e62624.

[25]T. Zhou,Z. Kuscsik, J.-G. Liu,M. Medo, J.R. Wakeling, Y.-C. Zhang, Solving theapparentdiversity–accuracydilemmaofrecommendersystems,Proc.Natl. Acad.Sci.107 (10)(2010)4511–4515.

[26]A.Fiasconaro, M.Tumminello, V.Nicosia,V.Latora, R.N. Mantegna, Hybrid recommendation methodsincomplex networks, Phys. Rev.E 92 (1)(2015) 012811.

[27]A.Stojmirovic,Y.-K.Yu,Informationﬂowininteractionnetworks,J.Comput. Biol.14 (8)(2007)1115–1143.

[28]J.-G.Liu,T.Zhou,Q.Guo,Informationﬁlteringviabiasedheatconduction,Phys. Rev.E84 (3)(2011)037101.

[29]M.-S. Shang,C.-H.Jin, T.Zhou,Y.-C.Zhang, Collaborativeﬁlteringbased on multi-channeldiffusion,PhysicaA388 (23)(2009)4867–4871.

[30]T. Zhou,L.-L. Jiang,R.-Q. Su, Y.-C. Zhang, Effect ofinitial conﬁgurationon network-basedrecommendation,Europhys.Lett.81 (5)(2008)58004. [31]L.Lü,W.Liu,Informationﬁlteringviapreferentialdiffusion,Phys.Rev.E83 (6)

(2011)066119.

[32]F.-G.Zhang,A.Zeng,Informationﬁlteringviaheterogeneousdiffusioninonline bipartitenetworks,PLoSONE10 (6)(2015)e0129459.

[33]J.-G.Liu,K.Shi,Q.Guo,Solvingtheaccuracy–diversitydilemmaviadirected randomwalks,Phys.Rev.E85 (1)(2012)016118.

[34]Z.-K.Zhang, C.Liu,Ahypergraphmodelofsocialtaggingnetworks, J.Stat. Mech.2010 (10)(2010)P10005.

[35]Z.-K.Zhang,T.Zhou,Y.-C.Zhang,Tag-awarerecommendersystems:a state-of-the-artsurvey,J.Comput.Sci.Technol.26 (5)(2011)767–777.

[36]L. Yu, C. Liu, Z.-K. Zhang, Multi-linear interactive matrix factorization, Knowl.-BasedSyst.85(2015)307–315.

A generalized model via random walks for information filtering

A

generalized

model

via

random

walks

for

information

ﬁltering

Zhuo-Ming Ren

,

,

∗

,

Yixiu Kong

,

Ming-Sheng Shang

,

∗

,

Yi-Cheng Zhang

*

1VCMJTIFEJO1IZTJDT-FUUFST" o 

of a bipartite network which is constructed by the tracks of the consumer purchases, which can be described as the process of the random

(

,

,

)

= {

,

, · · · ,

}

= {

,

, · · · ,

}

=

{

,

, · · · ,

}

α

(

→

)

(

→

),

=

(

,

,

)

,

,

· · · ,

,

(

,

,

)

=

(

,

,

) =

(

=

|

=

)

(

=

|

=

)

=

(

=

1VCMJTIFEJO1IZTJDT-FUUFST" o