A generalized model via random walks for information filtering


Academic year: 2021

Zhuo-Ming Ren a,b,*, Yixiu Kong a, Ming-Sheng Shang b,*, Yi-Cheng Zhang a

a Department of Physics, University of Fribourg, Chemin du Musée 3, CH-1700 Fribourg, Switzerland

b Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, PR China

There could exist a simple general mechanism lurking beneath collaborative filtering and the interdisciplinary physics approaches that have been successfully applied to online E-commerce platforms. Motivated by this idea, we propose a generalized model employing the dynamics of the random walk in bipartite networks. Taking the degree information into account, the proposed generalized model can reproduce collaborative filtering, the interdisciplinary physics approaches and even their numerous extensions. Furthermore, we analyze the generalized model with single and hybrid degree information in the random walk process on bipartite networks, and propose a possible strategy that uses the hybrid degree information for objects of different popularity to improve the precision of the recommendation.

1. Introduction

Over the last decade, the rapid growth of information both online and offline has led to an information overload problem [1,2]. Users often feel confused about which option is best when searching, reading, shopping, seeking entertainment, or even dating online [3,4]. To address this problem, information filtering has become a promising and effective way to filter out irrelevant information and to provide personalized suggestions according to the record of users' past purchases as well as other information about products and users [5,6]. Due to its significance for the economy and society, the design of efficient information filtering algorithms has received wide attention in many branches of science, such as computer science, information science and interdisciplinary physics. One of the most promising information filtering algorithms is collaborative filtering (CF) [7,8]. CF works on the database of users' past purchase history and product search records to offer personalized recommendations. Breese et al. [9] classified CF into two broad groups, memory-based and model-based methods. The memory-based methods predict missing information and recommend products based on similarity measures between users and products [7,8,10]. The model-based algorithms use the collection of user and object information to learn an information filtering model by clustering [11], Bayesian methods [12], matrix factorization [13,14] and other machine learning techniques [15,16]. In contrast to the computer science perspective, the interdisciplinary physics approaches, which adapt complex network theory and various classical physics processes, have provided new insights and solutions for the challenges in the active field of information filtering [6,17,18]. Examples include a diffusion process analogous to heat conduction across a bipartite complex network [19], a network-based inference method considering the resource allocation dynamics on bipartite complex networks [20], opinion diffusion [21] and the gravity principle [22] extended to information filtering, and the information core and information backbone [23,24], which shed some light on the in-depth understanding of information filtering. Further, the review [6] offers a physicists' perspective and a comprehensive guide to information filtering algorithms.

* Corresponding authors.

E-mail addresses: zhuomingren@gmail.com (Z.-M. Ren), msshang@cigit.ac.cn (M.-S. Shang).

In summary, CF and the interdisciplinary algorithms have already been successfully applied to many well-known online e-commerce platforms. Meanwhile, many recent works have been devoted to extending both families of algorithms, for instance the hybrid method [25,26], biased heat conduction [27,28], multi-channel diffusion [29], preferential diffusion [30,31], hybrid diffusion [32], the directed random walk method based on CF [33], the hypergraph model with social tags [34,35], and multi-linear interactive matrix factorization [36]. These algorithms further improve the efficiency of information filtering. In view of them, there could exist a simple general formula behind CF, the interdisciplinary physics algorithms and their extensions. Motivated by this idea, we propose a simple general model that employs the dynamics of the random walk in bipartite networks, and then derive an analytical expression with tunable parameters for the transition probability matrix. When the degree information is taken into account, the random walker process becomes equivalent to representative information filtering algorithms such as CF [7,8], the heat conduction method [19], the network-based inference method [20] and the hybrid method [25]. These methods are special cases of the proposed generalized model. Consequently, we analyze the generalized model with single and hybrid degree information in the random walk process on bipartite networks, and the effect of the degree information on objects at different popularity levels in the process of information filtering. Finally, we suggest a possible strategy for objects of different popularity in the recommendation when using the single and hybrid degree information in the random walk process on bipartite networks.

Published in Physics Letters A.

Fig. 1. (Color online.) Illustration of a bipartite network constructed from the records of consumer purchases, which can be described as the process of the random walker model. (a) A small example of a bipartite network which keeps track of the consumer purchases. (b) The random walker starts from the user side. (c) The random walker starts from the object side.

2. Models

In Fig. 1(a), we present an example of a bipartite network constituted by the records of consumer purchases. We would recommend an object to a user based on how similar his/her past purchases are to those of others, or on the purchase records of those who resemble him/her. More broadly, we can make personalized recommendations for each consumer depending on the consumer's past purchases as well as information on the similarity of other consumers or items. With this concept, many online business platforms such as Alibaba, Amazon, Netflix and Digg are reported to have developed sophisticated information filtering systems to boost their online sales. The information filtering problem is therefore reduced to estimating the valuation of products that have not yet been seen by consumers. Considering that millions of products exist on an online business platform, the information filtering problem can be thought of as identifying only the highest-ranked products for each user instead of predicting the rating of each product for each user.

2.1. The random walker model

Let us consider an unweighted bipartite network $G(U, O, E)$, where $U = \{u_1, u_2, \cdots, u_m\}$, $O = \{o_1, o_2, \cdots, o_n\}$ and $E = \{e_1, e_2, \cdots, e_l\}$ are the sets of users, products and links respectively. The network is described by an adjacency matrix $A_{M \times N}$, with $A_{i\alpha} = 1$ if there is a link between user $i$ and product $\alpha$, and zero otherwise. The dynamics of a random walker on the bipartite network is encoded by a transition probability matrix with elements of the form

$$P(u_i \to o_\alpha) \quad \text{or} \quad P(o_\alpha \to u_i), \tag{1}$$

measuring the probability that a walker passes from $u_i$ to $o_\alpha$ (or from $o_\alpha$ to $u_i$).

Firstly, we assume that a random walker starts from the user side, so that the walker's initial state is $X_0 = u$. We denote by $P(u, o, t)$ the probability that the walker passes from the user side $u_i$ to the object side $o_\alpha$ in $t$ steps through the sequence of states $X_0, X_1, \cdots, X_{t-1}, X_t$. Consequently, the probability $P(u, o, t)$ that the walker reaches the object side $o_\alpha$ in $t$ steps from the initial state $X_0 = u$ is

$$P(u, o, t) = P(X_t = o \mid X_0 = u), \tag{2}$$

and then, from Bayesian theory, we get

$$\begin{aligned} P(X_t = o \mid X_0 = u) &= \sum_{o' \in O} P(X_t = o \mid X_{t-2} = o', X_0 = u)\, P(X_{t-1} = o' \mid X_0 = u) \\ &= \sum_{o' \in O} P(X_t = o \mid X_{t-2} = o', X_0 = u)\, P(u, o', t-1). \end{aligned} \tag{3}$$

We now consider walkers that start from the user side $u$ and move along the object side to the next pathway. Each pathway is selected with probability $\psi$,

$$P(X_t = o \mid X_{t-2} = o', X_0 = u) = \psi. \tag{4}$$

Finally, combining Eqs. (2), (3) and (4), we get

$$P(X_t = o \mid X_0 = u) = \sum_{o' \in O} P(u, o', t-1)\, \psi. \tag{5}$$

The goal of recommendation is to provide personalized suggestions according to the record of past purchases linking users and objects. We assume that the walkers move along these past purchase links. For simplicity, Fig. 1(b) and (c) show the walker process for $t = 3$. Fig. 1(b) shows the walker's road map (users–objects–users–objects) when starting from the user side, while in Fig. 1(c) the walker's road map is objects–users–objects–users when starting from the object side. Starting from the user side, the random walker must walk at least three steps to reach the object side, and starting from the object side, the random walker must likewise walk at least three steps to reach the user side. So, if an object is to be recommended to a user, the random walker needs to walk $t$ steps with $t = 2j + 1$, $j = 1, 2, \ldots$. In summary, there are two pathways for random walkers in one step.

1) The walkers pass from the user side to the object side with the probability matrix

$$P(u \to o) = \Phi(u)\, A. \tag{6}$$

2) The walkers pass from the object side to the user side with the probability matrix

$$P(o \to u) = A\, \Phi(o), \tag{7}$$

where $\Phi(u)$ and $\Phi(o)$ are $M \times M$ and $N \times N$ diagonal matrices corresponding to the user and product profiles respectively, for example the degree, eigenvector or clustering coefficient in the bipartite network. We could equally present the probability matrices as $A^{T} \Phi(u)$ or $\Phi(o) A^{T}$ respectively.

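2.2. The random walker model with degree information

The transition probability matrix allows us to capture many attributes of bipartite networks. Among these attributes, this paper considers the degree information, which is the simplest way of quantifying attributes of bipartite networks. Firstly, we define $D(u, \lambda)$ and $D(o, \lambda)$ as the diagonal degree matrices with elements

$$D(u, \lambda)_{ij} = \begin{cases} \dfrac{1}{\left(\sum_{k} A_{ik}\right)^{\lambda}}, & i = j, \\ 0, & i \neq j, \end{cases} \tag{8}$$

$$D(o, \lambda)_{ij} = \begin{cases} \dfrac{1}{\left(\sum_{k} A_{kj}\right)^{\lambda}}, & i = j, \\ 0, & i \neq j, \end{cases} \tag{9}$$

where the indices $i, j$ run over users in Eq. (8) and over objects in Eq. (9).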

We then consider the simplest situation of the random walker dynamics, $t = 3$. As we can see from Fig. 1(b) and (c), there are two pathways in one step, depending on whether the walk starts from the user side or the object side.

1) The walkers pass from the user side to the object side with the probability matrix

$$P(u \to o) = D(u, \lambda)\, A. \tag{10}$$

2) The walkers pass from the object side to the user side with the probability matrix

$$P(o \to u) = A\, D(o, \lambda). \tag{11}$$
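The one-step matrices of Eqs. (10) and (11) can be sketched numerically as follows. This is a minimal illustration on a toy adjacency matrix, not the authors' code: the helper name `degree_matrix`, the toy matrix `A`, and the choice to map zero-degree nodes to a zero diagonal entry are assumptions made here for the example.

```python
import numpy as np

def degree_matrix(degrees, lam):
    """Diagonal matrix with entries 1 / degree**lam, as in Eqs. (8) and (9).

    Assumption for this sketch: nodes with degree 0 get a 0 entry.
    """
    degrees = np.asarray(degrees, dtype=float)
    diag = np.zeros_like(degrees)
    nonzero = degrees > 0
    diag[nonzero] = degrees[nonzero] ** (-lam)
    return np.diag(diag)

# Toy adjacency matrix: 3 users (rows) x 4 objects (columns).
A = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [1, 1, 1, 1]], dtype=float)

lam = 1.0
D_u = degree_matrix(A.sum(axis=1), lam)  # user degrees are the row sums of A
D_o = degree_matrix(A.sum(axis=0), lam)  # object degrees are the column sums of A

P_uo = D_u @ A  # Eq. (10): user side -> object side
P_ou = A @ D_o  # Eq. (11): object side -> user side
```

With $\lambda = 1$ the matrix `P_uo` is row-normalized and `P_ou` is column-normalized, which is the distinction the text draws between HC and NBI.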

Now we introduce CF and the interdisciplinary physics algorithms from the perspective of the dynamics of random walkers given by Eqs. (10) and (11). The common procedure of the collaborative filtering technique is to first compute the similarity between users (user-based) or objects (object-based) and then make a prediction based on that similarity. The similarity between two users is quantified by the number of objects that both users have chosen in the past. It is also possible to define the similarity between two objects by the number of users who have chosen both of them. Thus, taking these two steps into account, the simple collaborative filtering procedure can be expressed by $A A^{T}$. In heat conduction (HC), the resource is redistributed via an averaging procedure, with users receiving a level of resource equal to the mean amount possessed by their neighbouring objects, and objects then receiving back the mean of their neighbouring users' resource levels. By contrast, in network-based inference (NBI), the initial resource placed on objects is first evenly distributed among neighbouring users and then evenly redistributed back to those users' neighbouring objects. Both HC and NBI redistribute resource in a manner akin to a random walker process, but HC employs a row-normalized transition matrix whereas NBI employs a column-normalized one. The hybrid method (HM) is then obtained by incorporating a tunable parameter $\lambda$ into the transition matrix normalization. The detailed matrix expression is $D(o, 1-\lambda)\, A^{T} D(u, 1)\, A\, D(o, \lambda)\, A^{T}$. When $\lambda = 0$ the expression reduces to HC, and when $\lambda = 1$ it reduces to NBI. As discussed above, these recommendation procedures can be expressed in terms of the random walker process as shown in Table 1, so that the above methods are special cases of Eqs. (10) and (11). When we consider three steps in the random walk, we obtain a simple recommendation. With three or more steps, there exists a great number of combinations, which can generate a large number of recommendation algorithms.

Table 1
Information filtering methods recently proposed in the literature, deduced from the random walker model.

Method   First step               Second step      Third step
CF       $A$                      $A^{T}$          $A$
HC       $A$                      $A^{T}D(u,1)$    $A\,D(o,1)$
NBI      $A$                      $D(o,1)A^{T}$    $D(u,1)A$
HM       $D(o,1-\lambda)A^{T}$    $D(u,1)A$        $D(o,\lambda)A^{T}$
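The rows of Table 1 can be checked numerically by multiplying the three steps in order. The sketch below is an illustration under stated assumptions: the toy matrix `A`, the helper `degree_matrix` and the function `hybrid` are hypothetical names introduced here, and every node in the toy network is assumed to have degree at least 1. Because the HM row starts from the object side, its product has shape objects x users, so it is compared with the transposes of the HC and NBI products.

```python
import numpy as np

def degree_matrix(A, side, lam):
    """D(u, lam) for side='u' (row sums) or D(o, lam) for side='o' (column sums).

    Assumes every node has degree >= 1.
    """
    k = A.sum(axis=1) if side == "u" else A.sum(axis=0)
    return np.diag(k ** (-lam))

# Toy adjacency matrix: 3 users (rows) x 4 objects (columns).
A = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [1, 1, 1, 1]], dtype=float)
Du1 = degree_matrix(A, "u", 1.0)
Do1 = degree_matrix(A, "o", 1.0)

# Score matrices (users x objects): first step @ second step @ third step.
cf = A @ A.T @ A
hc = A @ (A.T @ Du1) @ (A @ Do1)
nbi = A @ (Do1 @ A.T) @ (Du1 @ A)

def hybrid(A, lam):
    """HM row of Table 1: D(o,1-lam) A^T D(u,1) A D(o,lam) A^T (objects x users)."""
    return (degree_matrix(A, "o", 1.0 - lam) @ A.T
            @ degree_matrix(A, "u", 1.0) @ A
            @ degree_matrix(A, "o", lam) @ A.T)

hm_0 = hybrid(A, 0.0)  # coincides with the HC product, transposed
hm_1 = hybrid(A, 1.0)  # coincides with the NBI product, transposed
```

Since the degree matrices are diagonal and therefore symmetric, transposing the HC and NBI products reverses the order of the factors, which is why the two limits of the hybrid expression match them.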

3. Materials and metrics

3.1. Materials

In this paper, we test the general model on a representative real network sampled from MovieLens [37]. MovieLens is an online video recommendation website which invites users to rate videos. The ratings in MovieLens range from one (i.e., worst) to five (i.e., best). If a user $i$ selects a video $\alpha$ and rates it, a link between user $i$ and video $\alpha$ is established as shown in Fig. 1(a). In this paper, we only consider ratings larger than 3 as links. Based on the selected links, we construct a user-video bipartite network. The MovieLens-1 dataset (M1) contains more than 8 million records, and MovieLens-2 (M2) is a subset of M1 which contains only 800 thousand records. The M2 dataset is commonly used as a benchmark dataset for information filtering. In the following we report the results for the M2 dataset, which is randomly divided into a training set and a test set. The training set contains 90% of the data, while the test set consists of the remaining 10%. In the results, the length $L$ of the recommendation list proposed to each user is 20. This value has already been used in other works and can be considered a valid reference for comparing the performance of different recommendation measurements. In addition, we run 100 independent realizations of each experiment.
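The link selection and the 90%/10% division described above can be sketched as follows. This is only an illustration: the function names, the `(user, object, rating)` triple layout, the fixed seed and the synthetic toy ratings are assumptions made here, not the format of the MovieLens files.

```python
import random

def build_links(ratings, threshold=3):
    """Keep (user, object) pairs whose rating is strictly larger than the threshold."""
    return [(u, o) for u, o, r in ratings if r > threshold]

def split_links(links, train_fraction=0.9, seed=0):
    """Randomly divide the links into a training set and a test set."""
    shuffled = list(links)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

# Synthetic toy ratings in the range 1..5 (hypothetical data for the example).
ratings = [(u, o, 1 + (u * o) % 5) for u in range(10) for o in range(10)]
links = build_links(ratings)
train, test = split_links(links)
```

Repeating `split_links` with different seeds would give independent realizations of the division, in the spirit of the 100 realizations mentioned in the text.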

For the MovieLens data sets, we first display the degree distribution of objects in Fig. 2(a). The distribution exhibits a relatively broad tail, which indicates that most objects have a small degree and a few objects have a large degree. This suggests that we should take the long-tail effect of the degree into account. In this paper, we classify the objects into three levels by degree. The high level contains the objects whose degree ranks in the top 1%. The medial level contains the objects whose degree ranks from the top 1% to the top 10%. The low level contains the remaining 90% of objects. For each targeted level of objects, the user degree distribution is shown in Fig. 2(b) and (c). In the high-degree object group, the user degree distribution shows a strong inclination, which suggests that users with small degree are more likely to select objects in the high level, while for the low-degree object group the inclination of the user degree drops clearly. It is notable that these low-level objects account for 90% of all objects. Therefore, when designing an efficient information filtering algorithm, the inclination of user preference towards objects at different popularity levels should be carefully considered.
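The three popularity levels can be assigned by ranking objects by degree. The sketch below is one possible reading of the classification: the function name `classify_objects`, the integer cut-offs and the rule that at least one object is placed in the high level are assumptions of this example, and ties in degree are broken by the sort order.

```python
def classify_objects(object_degrees):
    """Map each object id to 'high' (top 1%), 'medial' (top 1% to 10%) or 'low' (rest)."""
    ranked = sorted(object_degrees, key=object_degrees.get, reverse=True)
    n = len(ranked)
    n_high = max(1, n // 100)        # assumption: at least one high-level object
    n_top10 = max(n_high, n // 10)   # boundary between medial and low levels
    levels = {}
    for rank, obj in enumerate(ranked):
        if rank < n_high:
            levels[obj] = "high"
        elif rank < n_top10:
            levels[obj] = "medial"
        else:
            levels[obj] = "low"
    return levels

# Toy degrees for 200 hypothetical objects; o0 is the most popular.
degrees = {f"o{j}": 200 - j for j in range(200)}
levels = classify_objects(degrees)
```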

3.2. Metrics

As we know, each of the methods discussed above generates a recommendation list of length $L$ for each user. In principle, the recommendation list from an effective recommendation method should be as close as possible to the user's purchase records in the real world. Moreover, the proposed methods allow us to employ many attributes of the bipartite network and tunable parameters to capture the individual interests of users. As a result, we need to evaluate how the attributes of the bipartite network and the tunable parameters affect the recommendation list. We therefore define the preference of the recommendation to analyze, for a specific group of objects, which users the recommendation algorithm tends to recommend those objects to. Meanwhile, we use the precision of the recommendation to analyze how many objects in the recommendation list are actually in the user's purchase records in the real world. Firstly, we restrict attention to the objects in the high level and assume that, for a targeted user $i$, there are $H(i)$ high-level objects in the recommendation list, and $C(i)$ high-level objects belong to both the test set and the recommendation list. We define the preference of recommendation as follows,

$$P(k) = \frac{\sum_{i \in U_k} H(i)}{\sum_{i \in U} H(i)}, \tag{12}$$

where $U_k$ is the set of users whose degree equals $k$. The precision is defined as the ratio of the number of recommended objects that are relevant in the real world to the number of objects recommended. The precision equals

$$\text{Precision} = \frac{1}{m_H} \sum_{i \in U} \frac{C(i)}{H(i)}, \tag{13}$$

where $m_H$ is the total number of objects in the high level. Thus, a higher precision value means a more accurate prediction of the recommendation. The objects in the medial and low levels are treated in the same way.

Fig. 2. (Color online.) The degree distribution of objects in two MovieLens data sets. (a) The degree distribution of objects. The two sets have the same slope −0.95. (b) and (c) The distribution of the user degree for objects in the high, medial and low levels respectively.

Fig. 3. (Color online.) The preference of recommendation when considering the single factor of user degree information. We employ the random walker process described by $D(u,\lambda)\, A\, A^{T} D^{2}(u,\lambda)\, A$. In order to compare the prediction with the realization, we consider the tunable parameters $\lambda = 0, 0.5, 1$ and the test set.
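Eqs. (12) and (13) can be evaluated directly once $H(i)$ and $C(i)$ are known for each user. The following sketch uses hypothetical toy dictionaries; the function names and the decision to skip users with $H(i) = 0$ (for whom the ratio in Eq. (13) is undefined) are assumptions of this example rather than details given in the text.

```python
from collections import defaultdict

def preference(user_degree, H):
    """Eq. (12): share of level-restricted recommendations going to users of degree k."""
    total = sum(H.values())
    by_degree = defaultdict(float)
    for user, count in H.items():
        by_degree[user_degree[user]] += count
    return {k: v / total for k, v in by_degree.items()}

def precision(H, C, m_H):
    """Eq. (13): (1 / m_H) * sum_i C(i) / H(i), skipping users with H(i) = 0."""
    return sum(C[u] / H[u] for u in H if H[u] > 0) / m_H

# Hypothetical toy values: degrees, recommended counts H(i), and hits C(i).
user_degree = {"u1": 2, "u2": 2, "u3": 5}
H = {"u1": 3, "u2": 1, "u3": 4}
C = {"u1": 3, "u2": 0, "u3": 2}

P = preference(user_degree, H)
prec = precision(H, C, m_H=3)
```

In this toy case users of degree 2 receive 4 of the 8 recommended objects, so `P[2]` is 0.5, and the three ratios 1, 0 and 0.5 divided by `m_H = 3` give a precision of 0.5.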

4. Validations

4.1. The effect of single factors

For a better understanding of the generalized model, we first consider a single type of degree information in the random walk process. Firstly, we consider the random walk process with only the probability matrix $D(u, \lambda)A$ during the three steps. The three steps of the random walk can be expressed as $D(u, \lambda)\, A\, A^{T} D^{2}(u, \lambda)\, A$. The preference of recommendation is shown in Fig. 3. As $\lambda$ increases, the recommendation algorithm is less likely to recommend objects in the high and medial levels, whereas for objects in the low level, as shown in Fig. 3(c), the algorithm is more inclined to recommend them as $\lambda$ increases. This suggests that, as $\lambda$ increases, the recommendation of objects in the high and medial levels changes in the opposite direction to that of objects in the low level.

Meanwhile, we discuss the dynamics of the random walk affected only by the object degree information, with the probability matrix $D(o, \lambda)A^{T}$. The random walk process can be expressed as $A\, D^{2}(o, \lambda)\, A^{T} A\, D(o, \lambda)$. The results for the recommendation preference are shown in Fig. 4. It is notable that when $\lambda = 1$, the recommendation algorithm cannot recommend objects in the high and medial levels, as shown in Fig. 4(a) and (b). As shown in Fig. 4(b) and (c), the recommendation algorithm tends more to recommend objects in the medial and low levels as $\lambda$ decreases. In summary, when only the user degree information is considered, objects in the high and medial levels are less likely to be recommended to users as $\lambda$ increases, while for objects in the low level the trend is reversed. When only the object degree information is considered, objects in the low level are less inclined to be recommended to users as $\lambda$ increases.
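The two single-factor walks can be written as short matrix products. In this sketch $D^{2}$ is read as the matrix square of the diagonal degree matrix, which is an interpretation adopted here; the function names and the toy matrix `A` are likewise assumptions, and all nodes are assumed to have degree at least 1.

```python
import numpy as np

def degree_matrix(degrees, lam):
    """Diagonal matrix with entries degree**(-lam); assumes all degrees are >= 1."""
    return np.diag(np.asarray(degrees, dtype=float) ** (-lam))

def user_only_walk(A, lam):
    """D(u,lam) A A^T D^2(u,lam) A, reading D^2 as the matrix square of D."""
    D = degree_matrix(A.sum(axis=1), lam)
    return D @ A @ A.T @ D @ D @ A

def object_only_walk(A, lam):
    """A D^2(o,lam) A^T A D(o,lam), reading D^2 as the matrix square of D."""
    D = degree_matrix(A.sum(axis=0), lam)
    return A @ D @ D @ A.T @ A @ D

# Toy adjacency matrix: 3 users (rows) x 4 objects (columns).
A = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [1, 1, 1, 1]], dtype=float)
scores_user = user_only_walk(A, 0.5)
scores_object = object_only_walk(A, 0.5)
```

At $\lambda = 0$ both degree matrices become the identity, so both walks collapse to the plain product $A A^{T} A$.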

Fig. 5 shows how the precision changes with the tunable parameter $\lambda$ when the random walk process is affected only by the object or the user degree information. Different optimal parameters lead to the highest precision for objects in the high, medial and low levels respectively. Firstly, we analyze how the precision changes with $\lambda$ when only the user degree information is considered. The precision for objects in the high level is close to 1 and remains stable. In the medial level, the precision declines slowly as $\lambda$ increases, and in the low level the precision is very low and increases slowly. When only the object degree information is considered, however, the precision can reach 0.11 in the low level. In the high and medial levels, when $\lambda$ is larger than 0.75, the precision stays at 0. The main reason is that as $\lambda$ rises, the recommendation algorithm no longer recommends objects in the high and medial levels, as shown in Fig. 4(a) and (b). As a result, in order to improve the precision of the recommendation, we should adopt different strategies for recommending objects at different popularity levels.

Fig. 4. (Color online.) The recommendation preference when considering the single factor of object degree information. We employ the random walker process described by $A\, D^{2}(o,\lambda)\, A^{T} A\, D(o,\lambda)$. In order to compare the prediction with the realization, we consider the tunable parameters $\lambda = 0, 0.5, 1$ and the test set.

Fig. 5. (Color online.) Precision as a function of the tunable parameter $\lambda$ for the prediction of the recommendation when considering the single degree information in the random walk process. Squares indicate the precision of the recommendation when the random walker process uses the object degree information with the probability matrix $D(o,\lambda)A^{T}$. Circles (◦) report the precision of the recommendation when the dynamics of the random walk is affected by the user degree information with the probability matrix $D(u,\lambda)A$.

Fig. 6. (Color online.) The recommendation preference when considering the hybrid factors of constant user degree information and variable object degree information. We employ the random walker process described by $D(o,1-\lambda)\, A^{T} D(u,1)\, A\, D(o,\lambda)\, A^{T}$. In order to compare the prediction with the realization, we consider the tunable parameters $\lambda = 0, 0.4, 0.8, 1$ and the test set.

4.2. The effect of hybrid factors

In this section, we analyze the effect of hybrid factors combining the degree information of users and objects. Firstly, we consider the formula $D(o, 1-\lambda)\, A^{T} D(u, 1)\, A\, D(o, \lambda)\, A^{T}$, in which both the object degree information and the user degree information are taken into account in the information filtering system. In this system, the user degree information is constant, while the object degree information is tunable. We can describe the dynamics of the random walk in detail. In the first step, the random walkers pass from the object side to the user side, affected only by the object degree information, with the probability matrix $D(o, 1-\lambda)A^{T}$. In the second step, the random walkers move from the user side to the object side with the constant probability matrix $D(u, 1)A$. In the third step, the random walkers go from the object side to the user side with the tunable probability matrix $D(o, \lambda)A^{T}$, which includes only the object information. We consider the parameter values $\lambda = 0$, $0.4$, $0.8$ and $1$. We can see from Fig. 6(a) that in the high level, when $\lambda = 0$, the recommendation preference fits the test set well. However, when $\lambda$ increases to 1, the preference of recommendation for objects in the high level becomes clearer, which suggests that the recommendation algorithm is more inclined to recommend objects in the high level. From Fig. 6(b) and (c), in the medial and low levels, the preference of recommendation for objects also becomes more evident as $\lambda$ increases.

After discussing the random walker dynamics with constant user degree information and tunable object degree information, we then analyze the random walk process with tunable user degree information and constant object degree information. In the first step, walkers pass from the user side to the object side, and we consider only the user degree information, with the tunable probability matrix $D(u, 1-\lambda)A$. In the second step, the random walkers pass from the object side to the user side, determined by the constant object degree information, with the probability matrix $D(o, 1)A^{T}$. In the last step, the walkers move from the user side to the object side, and the user degree information is again taken into account with the tunable probability matrix $D(u, \lambda)A$. The results for the recommendation preference are shown in Fig. 7. Surprisingly, the corresponding algorithm cannot recommend objects in the high and medial levels to users when $\lambda = 1$, as shown in Fig. 7(a) and (b). When $\lambda = 0$, $0.4$, $0.8$, the preference of recommendation is similar, which suggests that the tunable parameter $\lambda$ does not affect the recommendation of objects in the high and medial levels when the object degree information is constant and the user degree information is tunable. In Fig. 7(c), for the low level, we see that as $\lambda$ increases the recommendation algorithm is less likely to recommend objects, which differs from the case of constant user degree information and tunable object degree information.

Fig. 7. (Color online.) The recommendation preference when considering the hybrid factors of constant object degree information and variable user degree information. We employ the random walker process described by $D(u,1-\lambda)\, A\, D(o,1)\, A^{T} D(u,\lambda)\, A$. In order to compare the prediction with the realization, we consider the tunable parameters $\lambda = 0, 0.4, 0.8, 1$ and the test set.

Fig. 8. (Color online.) Precision as a function of the tunable parameter $\lambda$ for the prediction of recommendation when considering the effect of hybrid factors combining the degree information of users and objects. Squares indicate the precision of the recommendation for the random walker dynamics with constant user degree information and tunable object degree information. Circles (◦) report the precision of recommendation for the random walk process with tunable user degree information and constant object degree information.
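The hybrid walk with tunable user degree information and constant object degree information, $D(u,1-\lambda)\, A\, D(o,1)\, A^{T} D(u,\lambda)\, A$, can be sketched as below. The function names and the toy matrix are assumptions of this example, and all nodes are assumed to have degree at least 1.

```python
import numpy as np

def degree_matrix(degrees, lam):
    """Diagonal matrix with entries degree**(-lam); assumes all degrees are >= 1."""
    return np.diag(np.asarray(degrees, dtype=float) ** (-lam))

def hybrid_user_walk(A, lam):
    """D(u,1-lam) A D(o,1) A^T D(u,lam) A for a users x objects adjacency matrix A."""
    k_u = A.sum(axis=1)  # user degrees
    k_o = A.sum(axis=0)  # object degrees
    return (degree_matrix(k_u, 1.0 - lam) @ A
            @ degree_matrix(k_o, 1.0) @ A.T
            @ degree_matrix(k_u, lam) @ A)

# Toy adjacency matrix: 3 users (rows) x 4 objects (columns).
A = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [1, 1, 1, 1]], dtype=float)

# Score matrices for the parameter values used in Fig. 7.
scores = {lam: hybrid_user_walk(A, lam) for lam in (0.0, 0.4, 0.8, 1.0)}
```

Substituting $\lambda = 1$ turns the first factor into the identity, so the product becomes $A\, D(o,1)\, A^{T} D(u,1)\, A$, which is the product of the three NBI steps listed in Table 1.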

We also analyze the prediction precision when taking different hybrid factors into account, as shown in Fig. 8. For the random walk process with tunable user degree information and constant object degree information, as the parameter $\lambda$ increases the precision of recommendation always equals 1 in the high level, increases slowly in the medial level, and increases even more slowly in the low level. That is to say, the precision of recommendation does not change much as the tunable parameter $\lambda$ varies. This differs from the random walk process affected by tunable object degree information and constant user degree information, in which, as $\lambda$ increases, the precision rises to the highest value 1 and then remains stable in the high level, while in the medial level the precision reaches the highest value 0.49 and then declines slowly. In the low level, the precision always decreases as the tunable parameter $\lambda$ increases. In general, it is easy to find an optimal strategy for recommending objects in the high and medial levels accurately and efficiently, but for objects in the low level a better method is needed to improve the accuracy of the recommendation.

5. Conclusions and discussions

So far, collaborative filtering techniques and the interdisciplinary physics approaches have attracted a wealth of empirical and theoretical research and have been applied on many successful E-commerce platforms. There could exist a simple general formula behind collaborative filtering, the interdisciplinary physics algorithms and even their extensions. In agreement with previous studies, we employed the dynamics of random walks, and the framework of the general model had three key steps: (1) building the initial transition probability matrix with tunable parameters; (2) obtaining the tracks of the random walkers; and (3) using the walker tracks to predict the preference of users. We considered the simplest degree information in bipartite networks and proposed a generalized formula, from which well-known information filtering methods recently proposed in the literature, such as collaborative filtering, network-based inference, heat conduction and the hybrid method, can be deduced and developed. Furthermore, we studied the generalized method with single and hybrid degree information in the random walk process on bipartite networks. The results clearly show that the user degree information affects the recommendation preference in a way that contrasts with considering only the object degree information, and that the performance of recommendation is affected by the random walk process when different single and hybrid degree information is taken into account. As a result, we suggest using the two hybrid factors to adjust the recommendation preference for objects of different popularity in order to improve the precision of recommendation. Our findings could provide several pragmatic lessons for efficient recommendation strategies on E-business platforms and for efficient interactive information prediction in online social networks. For instance, an appropriate strategy could be developed for recommending objects at each popularity level to users, while also considering the preference of users with different activity. Meanwhile, this work leads to many extensions. For instance, other information about bipartite networks, such as clustering, eigenvector and assortativity coefficients, could also be applied to the random walk dynamics in further research. In addition, we have developed the random walk process for information filtering considering only the simplest situation in our proposed generalized method, and we will further study machine learning techniques to train the optimal road map of random walkers for offering personalized recommendations.

Acknowledgements

We acknowledge Rui Xiao for fruitful discussion and assistance. This work is partially supported by the EU FP7 Grant 611272 (Project GROWTHCOM), the Swiss National Science Foundation (Grant No. 200020-143272), and the National Natural Science Foundation of China (Grant Nos. 61370150, 61433014).

References

[1]D.J.Watts,Atwenty-firstcenturyscience,Nature445 (7127)(2007)489. [2]J.B.Schafer,J.A.Konstan,J.Riedl,E-commercerecommendationapplications,

in: Applications of Data Mining to Electronic Commerce, Springer, 2001, pp. 115–153.

[3]A. Vespignani, Predicting the behavior of techno-social systems, Science 325 (5939)(2009)425.

[4]Z.-M.Ren, Y.-Q. Shi, H. Liao, Characterizing popularity dynamics ofonline videos,PhysicaA453(2016)236–241.

[5]P.B.Kantor,L.Rokach,F.Ricci,B.Shapira,RecommenderSystemsHandbook, Springer,2011.

[6]L.Lü,M.Medo,C.H.Yeung,Y.-C.Zhang,Z.-K.Zhang,T.Zhou,Recommender systems,Phys.Rep.519 (1)(2012)1–49.

[7]P.Resnick,N.Iacovou,M.Suchak,P.Bergstrom,J.Riedl,Grouplens:anopen architecture for collaborative filtering of netnews, in: Proceedings of the 1994ACMConferenceonComputerSupportedCooperativeWork,ACM,1994, pp. 175–186.

[8]B.Sarwar,G.Karypis,J.Konstan,J.Riedl,Item-basedcollaborativefiltering rec-ommendationalgorithms,in:Proceedingsofthe10thInternationalConference onWorldWideWeb,ACM,2001,pp. 285–295.

[9]J.S.Breese,D.Heckerman,C.Kadie,Empiricalanalysisofpredictivealgorithms for collaborativefiltering, in: Proceedingsof the Fourteenth Conference on UncertaintyinArtificialIntelligence,MorganKaufmannPublishersInc.,1998, pp. 43–52.

[10]D.Goldberg, D. Nichols,B.M. Oki, D. Terry,Using collaborativefiltering to weaveaninformationtapestry,Commun.ACM35 (12)(1992)61–70. [11]L.H.Ungar,D.P.Foster,Clusteringmethodsforcollaborativefiltering,in:AAAI

WorkshoponRecommendationSystems,vol.1,1998,pp. 114–129.

[12] L. Ungar,D.P. Foster, Aformal statisticalapproachto collaborativefiltering, CONALD’98.

[13]Y.Azar,A.Fiat,A.Karlin,F.McSherry,J.Saia,Spectralanalysisofdata,in: Pro-ceedingsoftheThirty-ThirdAnnualACMSymposiumonTheoryofComputing, ACM,2001,pp. 619–626.

[14]Y.Koren,R.Bell,C.Volinsky,Matrixfactorizationtechniquesforrecommender systems,Computer8(2009)30–37.

[15]D.M.Blei,A.Y.Ng,M.I.Jordan,LatentDirichletallocation,J.Mach.Learn.Res.3 (2003)993–1022.

[16]R.Keshavan,A.Montanari, S.Oh,Matrixcompletionfromnoisyentries,in: AdvancesinNeuralInformationProcessingSystems,2009,pp. 952–960. [17]R.Albert,A.-L.Barabási,Statisticalmechanicsofcomplexnetworks,Rev.Mod.

Phys.74 (1)(2002)47–97.

[18] C. Castellano, S. Fortunato, V. Loreto, Statistical physics of social dynamics, Rev. Mod. Phys. 81 (2) (2009) 591–646.

[19] Y.-C. Zhang, M. Blattner, Y.-K. Yu, Heat conduction process on community networks as a recommendation model, Phys. Rev. Lett. 99 (15) (2007) 154301.

[20] T. Zhou, J. Ren, M. Medo, Y.-C. Zhang, Bipartite network projection and personal recommendation, Phys. Rev. E 76 (4) (2007) 046115.

[21] Y.-C. Zhang, M. Medo, J. Ren, T. Zhou, T. Li, F. Yang, Recommendation model based on opinion diffusion, Europhys. Lett. 80 (6) (2007) 68003.

[22] J.-H. Liu, Z.-K. Zhang, C. Yang, L. Chen, C. Liu, X. Wang, Gravity effects on information filtering and network evolving, PLoS ONE 9 (3) (2014) e91070.

[23] W. Zeng, A. Zeng, H. Liu, M.-S. Shang, T. Zhou, Uncovering the information core in recommender systems, Sci. Rep. 4 (2014) 6140.

[24] Q.-M. Zhang, A. Zeng, M.-S. Shang, Extracting the information backbone in online system, PLoS ONE 8 (5) (2013) e62624.

[25] T. Zhou, Z. Kuscsik, J.-G. Liu, M. Medo, J.R. Wakeling, Y.-C. Zhang, Solving the apparent diversity–accuracy dilemma of recommender systems, Proc. Natl. Acad. Sci. 107 (10) (2010) 4511–4515.

[26] A. Fiasconaro, M. Tumminello, V. Nicosia, V. Latora, R.N. Mantegna, Hybrid recommendation methods in complex networks, Phys. Rev. E 92 (1) (2015) 012811.

[27] A. Stojmirovic, Y.-K. Yu, Information flow in interaction networks, J. Comput. Biol. 14 (8) (2007) 1115–1143.

[28] J.-G. Liu, T. Zhou, Q. Guo, Information filtering via biased heat conduction, Phys. Rev. E 84 (3) (2011) 037101.

[29] M.-S. Shang, C.-H. Jin, T. Zhou, Y.-C. Zhang, Collaborative filtering based on multi-channel diffusion, Physica A 388 (23) (2009) 4867–4871.

[30] T. Zhou, L.-L. Jiang, R.-Q. Su, Y.-C. Zhang, Effect of initial configuration on network-based recommendation, Europhys. Lett. 81 (5) (2008) 58004.

[31] L. Lü, W. Liu, Information filtering via preferential diffusion, Phys. Rev. E 83 (6) (2011) 066119.

[32] F.-G. Zhang, A. Zeng, Information filtering via heterogeneous diffusion in online bipartite networks, PLoS ONE 10 (6) (2015) e0129459.

[33] J.-G. Liu, K. Shi, Q. Guo, Solving the accuracy–diversity dilemma via directed random walks, Phys. Rev. E 85 (1) (2012) 016118.

[34] Z.-K. Zhang, C. Liu, A hypergraph model of social tagging networks, J. Stat. Mech. 2010 (10) (2010) P10005.

[35] Z.-K. Zhang, T. Zhou, Y.-C. Zhang, Tag-aware recommender systems: a state-of-the-art survey, J. Comput. Sci. Technol. 26 (5) (2011) 767–777.

[36] L. Yu, C. Liu, Z.-K. Zhang, Multi-linear interactive matrix factorization, Knowl.-Based Syst. 85 (2015) 307–315.

Figure

Fig. 1. (Color online.) Illustration of a bipartite network which is constructed by the tracks of the consumer purchases, which can be described as the process of the random walker model
Fig. 2. (Color online.) The degree distribution of objects in two movielens data sets
Fig. 4. (Color online.) The recommendation preference when considering single factors of object degree information
Fig. 7. (Color online.) The recommendation preference when considering the hybrid factors of the constant object degree information and the variable user degree information.
