Quantifying and suppressing ranking bias in a large citation network

(1)

Quantifying and suppressing ranking bias in a large citation

network

Giacomo Vaccario

a

_{, Matúˇs Medo}

b,c,d

_{, Nicolas Wider}

a

_,

Manuel Sebastian Mariani

d,e,∗

a_{Chair of Systems Design, ETH Zurich, 8092 Zurich, Switzerland}

b_{Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, PR China} c_{Department of Radiation Oncology, Inselspital, Bern University Hospital and University of Bern, 3010 Bern, Switzerland}

d_{Department of Physics, University of Fribourg, 1700 Fribourg, Switzerland}

e_{Guangdong Province Key Laboratory of Popular High Performance Computers, College of Computer Science and Software Engineering,} Shenzhen University, Shenzhen 518060, PR China

Keywords: Impact indicators Ranking Network analysis Field bias Field normalization

It is widely recognized that citation counts for papers from different fields cannot be directly compared because different scientific fields adopt different citation practices. Cita-tion counts are also strongly biased by paper age since older papers had more time to attract citations. Various procedures aim at suppressing these biases and give rise to new normalized indicators, such as the relative citation count. We use a large citation dataset from Microsoft Academic Graph and a new statistical framework based on the Mahalanobis distance to show that the rankings by well known indicators, including the relative citation count and Google’s PageRank score, are significantly biased by paper field and age. Our statistical framework to assess ranking bias allows us to exactly quantify the contributions of each individual field to the overall bias of a given ranking. We propose a general nor-malization procedure motivated by the z-score which produces much less biased rankings when applied to citation count and PageRank score.

1. Introduction

Paper citation count itself and various quantities derived from it are used as influential indicators of research impact (Garfield, 2006; Hirsch, 2005). At the same time, it is well known that the cumulative number of citations received by academic publications strongly depends on paper age and field (Schubert & Braun, 1986; Vinkler, 1986). Old papers have had more time to acquire citations than recent ones, and their advantage is further enhanced by the preferential attachment mechanism (de Solla Price, 1976; Newman, 2009). While heterogeneous paper fitness and paper aging possibly attenuate the advantage of old nodes (Medo, Cimini, & Gualdi, 2011; Wang, Song, & Barabási, 2013), empirical evidence typically shows that citation count is still biased toward old nodes (seeMariani, Medo, & Zhang, 2016;Newman, 2009; Radicchi & Castellano, 2011, among others). In addition, different academic fields adopt very different citation practices (seeBornmann & Daniel, 2008for a review on the topic), which results in a strong dependence of the mean number of citations on academic field, as shown in several works (Bornmann & Daniel, 2009; Lundberg, 2007;Radicchi, Fortunato, & Castellano, 2008, among others).

∗ Corresponding author at: Department of Physics, University of Fribourg, 1700 Fribourg, Switzerland. E-mail address:manuel.mariani@unifr.ch(M.S. Mariani).

http://doc.rero.ch

Published in "Journal of Informetrics 11(3): 766–782, 2017"

which should be cited to refer to this work.

(2)

Anaturalquestionarises:howcanwe“fairly”usecitation-basedindicatorstocomparepapersfromdifferentfieldsandof differentage?Theproblemofcomparingpapersfromdifferentfieldsisusuallyreferredtoasthefield-normalizationproblem. Severalapproachestoaddressthisquestionhavebeenproposedintheliterature(seeWaltman,2016forarecentreview).A particularlysimpleapproachistodivideeachpaper’scitationcountbythemeannumberofcitationsforpapersofthesame fieldpublishedinthesameyear.TheresultsbyRadicchietal.(2008)suggestedthatthisindicator,calledrelativecitation count,producesarankingthatisstatisticallyconsistentwiththehypothesisofarankingthatisnotbiasedbyfieldandage. ThisfindinghasbeenchallengedbysubsequentworksbyAlbarrán,Crespo,Ortuno,andRuiz-Castillo(2011)andWaltman, vanEck,andvanRaan(2012),whichleavesthedebateonage-andfield-normalizationproceduresstillopen.

In thisarticle,weanalyzealargedatasetfromMicrosoftAcademicGraph(Sinhaetal.,2015)toshowthatexisting indicatorsofimpact,includingtherelativecitationcount,failtoproducerankingsthatarenotbiasedbyageandfield.To simultaneouslyassessthesebiases,wepresentanewprocedurebasedontheMahalanobisdistance(Mahalanobis,1936). Thispermitstocomparetherankingbyagivenindicatorwiththoseobtainedwithasimulatedunbiasedsampling,andhence toquantifytheoverallrankingbias.Ananalyticresultderivedinthispaperallowsustoassessthecontributionofeachfield totheoverallrankingbias.Itisworthnoticingthatwhilewefocusonthebiasesbyageandfield,ourbiasassessment procedurecanbeeasilyextendedtodetectanyotherkindofinformationbias.

WealsopresentthefirstsystematicstudyofthepossiblebiasbyfieldofthePageRankscore(Brin&Page,1998)and ofitsage-rescaledversionintroducedbyMarianietal.(2016)atarticle-level.Themotivationtoanalyzethese network-basedindicatorscomesfromthefindingthattheyoutperformothermetricsinidentifyingexpert-selectedmilestonepapers (Marianietal.,2016).However,theapplicationofPageRankanditsvariantstoacademiccitationnetworksfocusedon datasetscomposedofpapersfromasinglefield(Chen,Xie,Maslov,&Redner,2007;Marianietal.,2016;Mariani,Medo, &Zhang,2015;Walker,Xie,Yan,&Maslov,2007;Yao,Wei,Zeng,Fan,&Di,2014;Zhou,Zeng,Fan,&Di,2016).While thepossiblebiasbyscientificfieldofeigenvector-basedalgorithmshasbeenexploredbyWaltmanandvanEck(2010)at journal-level,thePageRankscore’spossiblebiasbyacademicfieldatarticle-levelis(toourbestknowledge)stillunexplored andwearethefirstauthorstoaddressit.

Weintroducetwonovelindicatorsofimpactmotivatedbythez-score:age-andﬁeld-rescaledcitationcountRAF_(c)_and

age-andﬁeld-rescaledPageRankRAF_(p)._We_ﬁnd_that_the_novel_indicators_produce_paper_rankings_that_are_much_less_biased

byageandﬁeldthantherankingsproducedbytheotheranalyzedindicators.Nevertheless,alsotheMahalanobisdistance observedforthenewindicatorsisnotstatisticallyconsistentwithonesobtainedforasimulatedunbiasedprocess.This indicatesthattheproblemofachievinganidealunbiasedrankingofthepublicationsremainsopen.

Therestofourarticleisorganizedasfollows:Section2describestheanalyzeddatasetofpublicationsobtainedfrom theMicrosoftAcademicGraph.Section3presentsexistingpaper-levelimpactindicatorsandreportstheirbiasbyscientific field.InSection4,weintroducearescalingprocedureforcitationcountandPageRankscoresmotivatedbythez-score.In Section5,weintroduceageneralproceduretotestforanykindofrankingbias,andpresentitsapplicationtoassessthefield andagebiasoftherankingsbytheindicatorsstudiedhere.InSection6,weconcludebydiscussingpossiblelimitationsof ouranalysisandfutureresearchdirections.

2. Data

WeanalyzeabibliographicdatasetwhichwasprovidedfortheKDDCup2016.1 _This_data_is_a_dump_of_the_Microsoft

AcademicGraph(MAG)andcontainsmorethan126millionsofpublicationsandmorethan467millionscitations(Sinha etal.,2015).EachpublicationisalsoendowedwithvariouspropertiessuchasuniqueID,publicationdate,titleandjournal ID.Wepre-processedthedata(detailsareprovidedinAppendixA)toremovefromtheanalysispaperswithincomplete information,endingupwithN=18193082uniquepublicationsandE=109719182citations.

TheMAGhasafieldclassificationatpaperlevel(Sinhaetal.,2015).IntheKDDcupdump,thereare19mainfieldsand numeroussubfieldsupto3hierarchicallevelsofsubsubfields.However,allthedifferentsubfieldscanbelongtoseveral mainfields,meaningthateachpublicationcanbelongtomorethanonemainfield.Weuseherethefieldclassificationat thehighesthierarchicallevel,i.e.,weonlyconsiderthe19mainfields.WhencalculatingthecitationcountandPageRank scoreofpapers(seeSection3),weconsiderthepublicationsthatbelongtomorethanonefieldonlyonce.Inthisway,wedo notmodifythenumberofcitationsthateachpaperreceivesandgives,andwedonotchangethetopologyofthenetwork onwhichthePageRankscoresarecalculated.Ontheotherhand,inagreementwithWaltmanetal.(2012),tocompute thefields’size(seeTableA.2inAppendixA)andthefield-rescaledmetrics(seeSections3and4),eachpublicationcanbe consideredmultipletimesintheanalysis,onceforeachfieldthepublicationbelongsto.Inthisway,eachfieldisrepresented byallitspublicationsevenifsomeofthesearesharedwithotherfields.

BeforemovingtothenextSections,wedevoteourattentiontotwomainassumptionsofouranalysis.First,weassume thattheMicrosoftAcademicdataprovidearepresentativesampleofthepopulationofpublicationsandoftheircitations.This assumptionismotivatedbytheﬁndingsofindependentanalysesoftheMicrosoftAcademicdataset(Harzing&Alakangas, 2013;Hug&Braendle,2017)thathaveshownthatitscoverageiscomparabletootherpopularacademicdatabases,suchas ScopusandWebofScience.

1_{https://kddcup2016.azurewebsites.net/Data}

(3)

Second,toquantifythebiasbyfieldofimpactindicators,weassumethatthefieldsaregivenbytheMicrosoftAcademic’s fieldclassificationschemeatitshighesthierarchicallevel.Intheliterature,thereisnogeneralagreementonwhichfield classificationschemeshouldbeusedtoclassifypapersandthereisanentirestreamofworksinvestigatingissuesrelatedto this(Adams,Gurney,&Jackson,2008;Colliander&Ahlgren,2011;Radicchi&Castellano,2012a;Sirtes,2012;Zitt, Ramanana-Rahary,&Bassecoulard,2005).In particular,thechoiceofa suitableaggregationlevelhasbeenshown tobedelicate: byconsideringthemostaggregatefields,heterogeneitiesinthesubfields’citationpatternsmightbehidden(Radicchi& Castellano,2011;vanLeeuwen&Medina,2012)–thiseffecthasbeenshowntobemagnifiedwheniterativeranking algorithmsareusedinsteadofcitationcount(Waltman,Yan,&vanEck,2011).Ontheotherhand,increasingtheresolution ofthefieldclassificationmayleadtolargely-overlappingfieldsortohardlyinterpretablefields.Forexample,Hug,Ochsner, andBrandle(2017)showthattheMAGfieldsatthesecondhighestlevelaretoodetailedand,forthisreason,theauthors suggestthattheyshouldnotbeusedforfield-normalizationpurposes.Weleavetofutureresearchtheimportantstudyof howdifferentclassificationschemesimpactthebiasesofrankingsandhowourresultsgeneralizetootherdatasets.

3. Shortcomingsofexistingmetrics

Wenowdeﬁnethefourexistingmetricsanalyzedinthiswork:citationcountc,relativecitationcountcf_,_PageRank_score

p,age-rescaledPageRankscoreRA_(p)_(Section_3.1)._Furthermore,_we_show_that_these_metrics_are_severely_biased_by_scientiﬁc

ﬁeld(Section3.2).

3.1. Deﬁnitionofexistingmetrics

Citationcount,c.Thecitationcountciofnodeiissimplythenumberofcitationsreceivedbypaperi.Intermsofthecitation

network’sadjacencymatrixA(inadirectednetwork,Aij=1ifnodejpointstonodei,Aij=0otherwise),wecanexpressthe

citationcountasci=

jAij.

Relativecitationcount,cf_._To_overcome_citation_count’s_bias_by_paper_age_and_academic_ﬁeld,_Radicchi_et_al.₍₂₀₀₈₎_deﬁned

therelativecitationcountc_ifofpaperiasc_if:=ci/Yi(c),whereYi(c)denotesthemeancitationcountforpaperspublished

inthesameﬁeldandyearaspaperi.Throughoutthispaper,wealwaysrefertothe19mainﬁeldsprovidedintheMAG dataset.

PageRank,p.Citationcountandmetricsbuiltonitshareanimportantlimitation:thecitationsapaperreceivesareall countedthesame,regardlessoftheimportanceofthecitingpaper.Apossiblewaytoovercomethislimitation–recognized alreadyinthe70sinthescientometricscommunity(Pinski&Narin,1976)–istotakeintoaccountthewholestructureof thepaper-papercitationnetwork.Inthisspirit,eigenvector-basedmetricstakeasinputthecitationnetwork’sadjacency matrixA.Thisclassofmetricshavebeenappliedinvariousresearchdomainsincludingscientometrics(Bergstrom,West, &Wiseman,2008;Pinski&Narin,1976),Webinformationretrieval(Brin&Page,1998;Kleinberg,1999),socialscience (Bonacich,1987;Katz,1953)–see(Ermann,Frahm,&Shepelyansky,2015;Franceschet,2011;Gleich,2015)forareview. Amongthesemetrics,wefocusonGoogle’sPageRankscore(Brin&Page,1998).Thiswasoriginallydevisedtorankwebpages intheWorldWideWebandhasattractedconsiderableinterestofthescientometricscommunity.Therationalebehindits applicationtocitationnetworksisthatcitationscomingfrominﬂuentialpapersshouldcountmorethancitationsfrom obscurearticles.

ThePageRankscoresofpapersarecontainedinavectorpdeﬁnedbythefollowingequation

p=˛Pp+(1−˛)v, (1)

where˛isaparameterofthealgorithm(calleddampingfactor),Pistherandom-walktransitionmatrixwithelements Pij=Aij/koutj ,koutj =

lAlj isthenumberofreferencesinpaperj,andvisauniformteleportationvectorwithelements

v

_i=1/Nforallpapersi.Eq.(1)canbeinterpretedasthestationaryequationofastochasticprocessonthecitationnetwork. Inthisprocess,arandomwalkerisplacedoneachpaperandhe/sheeitherfollowsacitationedgewithprobability˛,or jumpstoarandomlychosenpaperwithprobability1−˛.Whenthenumberofwalkersoneachpaperreachesastationary value,weobtainthePageRankscoreofapaperibycalculatingthefractionofwalkeronthispaper.Thereisnouniversal criteriontochoosethevalueofthedampingfactor˛.InagreementwithChenetal.(2007),weset˛=0.5whichcorresponds toarandomwalkercoveringpathsoflengthtwobeforeteleportingtoarandomnode,asopposedtopathsoflengthcloseto sevenexpectedwiththeoftenusedvalue˛=0.85.Chenetal.(2007)arguethatthechoice˛=0.5betterreﬂectstheactual surﬁngbehaviorofresearchersthan˛=0.85.

PageRankisbasedonastatic,time-aggregatedperspectiveoftheconsiderednetwork.Ingeneral,suchperspectivehas beenshowntobelimitingfortheanalysisofevolvingnetworks(Marianietal.,2015;Scholtes,Wider,Pﬁtzner,Garas,Tessone, &Schweitzer,2014b,Wider,etal.,2014).Whiletheresultingmetric’sbiastowardsoldpapershasalreadybeenstudiedin theliterature(Chenetal.,2007;Marianietal.,2015,2016;Maslov&Redner,2008),itspossiblebiasbyacademicﬁeldisstill unexploredandweaddressitinSection3.2.

Age-rescaledPageRank,RA_(p)._To_suppress_the_age_bias_of_PageRank,_Mariani_et_al.₍₂₀₁₆₎_proposed_to_rescale_the_PageRank

scorebycomparingeachpaper’sscorewiththescoresofpapersofsimilarage.Assumingthatthepapersareorderedby

(4)

Fig.1.Fieldbiasoftheanalyzedcitation-basedmetrics.Toppanelsshowhistogramsofthefractionoftop-1%publicationsforeachﬁeldintherankingby (lefttoright)citationcountandrelativecitationcount.Theblackhorizontallineisat0.01,i.e.theexpectedvalue.Bottompanelsshowforeachﬁeldthe complementarycumulativedistributionsforcitationcount(left)andrelativecitationcount(right).

oldertoyounger,onecomputesthemeanvalueA

i(p)andthestandarddeviationiA(p)ofPageRankscoresoverppapers

aroundi,i.e.j∈[i−p/2,i+p/2].Consequently,therescaledPageRankscoreRA_i(p)ofpaperiisdeﬁnedas

RA i(p)= pi−A_i(p) A i(p) . (2)

Marianietal.(2016)appliedrescaledPageRank tothenetwork ofphysicspapers publishedbytheAmericanPhysical Societyjournalstoshowthattheresultingrankingisnotbiasedbypaperageand,asaresult,itallowsustoidentifyseminal publicationmuchearlierthanrankingsbymetricsthatarebiasedagainstrecentpapers.Inthefollowing,wesetp=1000

asinMarianietal.(2016).

3.2. Fieldbiasoftheexistingmetrics

Afterhavingdescribedasetofexistingmetrics,wenowapplythemtotheMAGdatasettoshowthattherankingsthat theyproducearebiasedbyscientificfield.Forarankingthatisnotbiasedbyscientificfield,thenumberoftop-ranked publicationsfromeachfieldshouldbeproportionaltothetotalnumberofpublicationsfromthatfield.Inotherwords,for anunbiasedranking,weexpect

f =₁₀₀z Kf (3)

papersfromﬁeldfamongthetopz%papersintheranking,whereKfisthetotalnumberofpublicationsfromﬁeldf(Radicchi

etal.,2008).Inthefollowing,wedenotebyk(m)_f thenumberofpublicationsfromﬁeldfinthetop-1%oftherankingbymetric m.Werestrictouranalysistoz%=1%;resultsforothervaluesofzareavailableuponrequestfromtheauthors.

InthetoppanelsofFig.1,weillustratetheﬁeldbiasofcitationcount,c,andrelativecitationcount,cf_._The_presence_of

strongbiasesisevidentforbothmetricsbecausetherearefieldswhoseratiok(m)_f /K_f isfarawayfromtheexpectedvalue 0.01.Inparticular,EnvironmentalScienceisextremelyover-representedinthetopoftherankingbycitationcount.We arguethatthisbiascomesfromthefactthatpublicationsfromthisfieldhaveameancitationcountalmosttwiceasbig comparedtopublicationsbelongingtootherfields(seeTableA.2).Forrelativecitationcount,wefindabetteragreement withwhatwewouldexpectfromanunbiasedindicator.However,relativelylargedeviationsarestillevident,especiallyfor thefieldofPoliticalScience.

InthebottompanelsofFig.1,wereportthedistributionsofcandcf_for_each_ﬁeld._These_panels_show_that_the_bias

byfieldisnotlimitedtothetop1%papersintheranking,butitarisesfromsystematicdifferencesbetweenthescore distributionsacrossdifferentfields.Forexample,whenlookingatthedistributionofc,papersinthefieldofPoliticalScience haveconsistentlysmallerprobabilitytohavemorethanonecitationcomparedtootherfields.Foradetaileddiscussion aboutthebiasoftherankingbycf_,_we_refer_to_Appendix_C.

Fig.2reportsthesameanalysisforPageRankscores,p,andage-rescaledPageRankscores,RA_(p)._This_ﬁgure_provides_the

firststudyofthedependenceofPageRankscoreonacademicfield.ThetoppanelsofFig.2showthatthetoppositionsofboth rankingsarebiasedbyfield,andbothrankingsoverestimatetheimpactofpublicationsinthefieldofEnvironmentalScience. Again,wearguethatthishappensbecausethemeanindegreeofpublicationsfromEnvironmentalScienceisapproximately twiceasbigcomparedtopublicationsthatbelongtootherfields(seeTableA.2).Fromthebottom-leftpanelofFig.2,wefind thatthefulldistributionofscoresofPageRankhaveasimilarshape,butdifferentbroadness.Thesedifferencesareslightly

(5)

Fig.2. FieldbiasoftheanalyzedmeasuresbasedonPageRank.Toppanelsshowhistogramsofthefractionoftop-1%publicationsforeachﬁeldinthe rankingby(lefttoright):PageRankandage-rescaledPageRank.Theblackhorizontallineisat0.01,i.e.theexpectedvalue.Bottompanelsshowforeach ﬁeldthecomplementarycumulativedistributionsforPageRank(left)andage-rescaledPageRank(right).

smallerfortheage-rescaledPageRank,withtheexceptionoftheﬁeldofEnvironmentalScience(seebottomrightpanelof Fig.2).

4. Deﬁningnewage-andﬁeld-normalizedmetrics

Inthissection,weintroducetwonovelindicatorsofpaperimpact:theage-andﬁeld-rescaledcitationcount,RAF_(c),

andtheage-andﬁeld-rescaledPageRank,RAF_(p)._The_two_indicators,_RAF_(c)_and_RAF_(p),_are_obtained_from_citation_count

candPageRankscorep,respectively,througharescalingprocedure.Thisprocedureisbasedonthez-scoreandisaimed atsuppressingageandﬁeldbias.Theideaofusingthez-scoreisnotnewinscientometrics(Bornmann&Daniel,2009; Lundberg,2007;Marianietal.,2016;McAllister,Narin,&Corrigan,1983;Newman,2009;Zhang,Cheng,&Liu,2014);our newindicatorscanbeconsideredasvariantsoftheindicatorbasedonthez-scorestudiedbyZhangetal.(2014)andtheir maindifferenceisexplainedbelow.

4.1. Age-andﬁeld-rescaledcitationcount,RAF_(c)

Tocalculatetheage-andﬁeld-rescaledcitationcountRAF

i (c)ofapaperibelongingtoaﬁeldf,weﬁrstcomputethemean

AF

i (c)andthestandarddeviationiAF(c)ofthecitationcountofpapersofthesameﬁeldandofsimilarageaspaperi.In

particular,AF

i (c)andAFi (c)arecomputedoverthepapersthatbelongtothesameﬁeldfaspaperiandthatareamongthe

cclosestpaperstoiasmeasuredbythedistance|i−j|betweentheirrankbyage.Then,theage-andﬁeld-rescaledcitation

countscoreRAF i (c)isdeﬁnedas RAF i (c)= ci−AFi (c) AF i (c) . (4)

Theaveragingwindowsizecisaparameterofthemethod,whichwesettoc=1000.

DifferentlyfromZhangetal.(2014),forthecomputationofthez-score,weusetemporalwindowswiththesamenumber ofpublications,whichingeneralcorrespondstoreal-timeintervalsofdifferentduration.Thischoiceissupportedbyrecent findings(Paroloetal.,2015)thatindicatethatincitationnetworks,timeisbetterdefinedbynumberofpublicationsthan byrealtime.Furthermore,rescaledmetricsbasedonthez-scorewithfixedtemporal-windowdurationhavealreadybeen showntounderperformwithrespecttotherelativecitationcountcf_in_the_task_of_producing_an_unbiased_ranking_(Zhang

etal.,2014).Ouranalysis(notshown)confirmsthatwhenthetemporalwindowsareofafixedtemporallength,thebias removalisinferiortothatachievedwithtemporalwindowscontainingafixednumberofpublications.Forthesereasons, wedonotincludemetricsbasedonz-scorewithfixedtemporal-windowdurationinouranalysis.

Differentlyfromtherelativecitationcountcf_,_RAF_(c)_is_expected_to_have_not_only_uniform_mean_value_across_different

publicationdatesandﬁelds,butalsouniformstandarddeviation.Thisshouldleadtoamorebalancedrankingofthepapers. Weshowinthefollowingthatthisisindeedthecase.

(6)

Fig.3. Fieldbalanceoftheanalyzedcitation-countandPageRank-basedmetrics.Toppanelsshowhistogramsofthenumberoftop-1%publicationsfor eachfieldintherankingby(lefttoright):age-andfield-rescaledcitationcount,andage-andfield-rescaledPageRank.Theblackhorizontallineisat0.01, i.e.theexpectedvalue.Bottompanelsshowforeachfieldthecomplementarycumulativedistributionsforage-andfield-rescaledcitationcount(left),and age-andfield-rescaledPageRank(right).

4.2. Age-andﬁeld-rescaledPageRank,RAF_(p)

PreviousworkshaveshownthatPageRankisbiasedtowardsoldpapersinscientificcitationnetworks(Chenetal.,2007; Marianietal.,2015;Maslov&Redner,2008).Moreover,wehaveshowninSection3thatPageRankscorepisbiasedby scientificdomain.Tosimultaneouslysuppressthesetwobiases,weproposetheage-andfield-rescaledPageRankscore RAF_(p)._RAF_(p)_is_defined_similarly_as_RAF_(c):_we_compute_the_mean_valueAF

i (p)andthestandarddeviationiAF(p)ofthe

PageRankscoresofthepapersthatbelongtothesameﬁeldaspaperiandthatareamongthepclosestpaperstoias

measuredbythedistance|i−j|betweentheirrankbyage.Theage-andﬁeld-rescaledPageRankscoreisthendeﬁnedas RAF_i (p)=pi−AFi (p)

AF i (p)

. (5)

Inthefollowing,wesetp=1000.

4.3. Fieldbiasofthenewmetrics

In thetoppanelsofFig.3,weshowthatinthetop-1%oftherankingsbyRAF_(p)_and_RAF_(c)_each_ﬁeld _appears_well

represented.Infact,thedeviationsfromtheexpectedvalueareverysmallespeciallyifcomparedtothedeviationsofthe otherrankings(seetoppanelsinFigs.1and2).InthebottompanelsofFig.3,wereportthatthefullscoredistributionsfor papersfromdifferentﬁeldscollapseextremelywelltopofeachotherthankstotherescalingprocedure.

5. Quantifyingrankings’biasesbyﬁeldandage

WebeginthisSectionbyintroducinganewmethodologytoassessaranking’sbiasbasedontheMahalanobisdistance (Section5.1).Then,weusethistoquantifythebiasbyﬁeld(Section5.2)andthebiasbyageandﬁeld(Section5.3). 5.1. AgeneralframeworktoassessrankingbiasesbasedontheMahalanobisdistance

WhileFigs.1and2illustratethesubstantialﬁeldbiasoftheexistingmetrics,thebiasismuchweaker(ifany)forthe newage-andﬁeld-rescaledmetricsinFigure3.Nowwequantifythisimprovementbyextendingthestatisticaltestsofbias suppressionpresentedbyMarianietal.(2016),RadicchiandCastellano(2012b),Radicchietal.(2008),Waltmanetal.(2012). Similarlytotheseworks,weassumethatarankingisunbiasedifitspropertiesareconsistentwiththoseofanunbiased selectionprocess.

5.1.1. Assessingthebiasbyﬁeld

Letusfirstanalyzetheproblemofassessingthebiasbyfield.ConsideranurnwhichcontainsNmarbles,eachofthem correspondingtooneofthepublicationspresentinourdataset.Anunbiasedselectionprocessthencorrespondstosampling fromthisurnatrandomwithoutreplacementafixednumbern=N×0.01ofpublications.Fromtheextractedsample,we countthenumberofpublicationsthatbelongtoeachfieldf,kf,andrecordthesenumbersinthevector k=(k1,...,kF)T;here

(7)

Fig.4.Mahalanobisdistances,dM,fortheanalyzedindicatorswhenconsideringthe19mainﬁelds.Fromlefttoright:citationcount,relativecitationcount,

PageRank,age-rescaledPageRank,age-andfield-rescaledcitationcountandage-andfield-rescaledPageRank.Thehorizontalredlinerepresentstheupper boundofthe95%confidenceintervalobtainedfromthesimulations.Intheinsets,wereportthedistributionofdMcomingfrom1000000simulationsof

theunbiasedsamplingprocess.Again,theredlinerepresentstheupperboundofthe95%conﬁdenceinterval.(Forinterpretationofthereferencestocolor inthisﬁgurelegend,thereaderisreferredtothewebversionofthearticle.)

Fdenotesthenumberofﬁelds.Theprobabilitytoobserveacertainvector, k,isgivenbythemultivariatehypergeomentric distribution(MHD) P

k

=

F f

K_f kf

N n

(6)

whereKfisthetotalnumberofpublicationsinﬁeldf.Followingthisselectionprocess,amongthenextractedpublications,

theexpectednumberofpublicationsforﬁeldfisf=nKf/N.

Assumethattheactualrankingbyagivenmetricmfeaturesk_f(m)publicationsfromﬁeldfinthetop1%ofitsranking.In general,theobservedk(m)_f deviatesfromitsexpectedvaluef.Asimpleapproachtoquantifythisdeviationwouldconsist

incomputingthez-score,deﬁnedasz(m)_f :=(k(m)_f −f)/f,wherefistheexpectedstandarddeviationforﬁeldfaccording

totheMHDspecifiedbyEq.(6).Therearehowevertwoshortcomingsofthez-score.First,thez-scoreonlygivespartial informationforaMHD–howfarfromtheexpectedvaluesweareinunitsofstandarddeviations–butitdoesnotprovide informationonhowstatisticallysignificantthedeviationsare.Second,toquantifytheoverallbiasofagivenindicatorm, wewouldneedtoaggregatethez-scoresfromthedifferentfields.Forexample,wecouldtaketheaveragez-score,butthis wouldneglectthecorrelationbetweenthedifferentfieldscomingfromtheconstraintn=

_fk(m)_f .

Toovercomethesetwoproblems,wefollowadifferentapproach.Weﬁrstrunvariousnumericalsimulationsthat repro-ducetheunbiasedselectionprocess.Thesesimulationsproduceasetofrankingvectorswhicharedistributedaccording toEq.(6)aroundthevectorofexpectedvalues, =(1,...,m).Differentlyfrom(Radicchi&Castellano,2012b),wedo

notestimatetheconfidenceintervalforthedifferentfieldsseparately.WecalculateinsteadtheMahalanobisdistance(dM, Mahalanobis(1936)andAppendixB)foreachsimulatedvectorfrom,andconstructthedistributionofdM’sobtainedby thesimulatedunbiasedselectionprocess.TheinsetoftheleftpanelofFig.4reportsthedistributionofthedMfor1000000 simulations.Thedistributioniscenteredarounditsmeanvalueof4.18andtheupperboundforthe95%confidencelevelis around5.37.2_For_an_ideal_unbiased_ranking,_we_would_expect_its_d

Mtofallintothe95%conﬁdenceintervalofthedistribution ofthedMobtainedfromthesimulatedunbiasedsamplingprocess.

5.1.2. Assessingthebiasbyageandﬁeld

Themethodologypresentedaboveiseasilygeneralizedtosimultaneouslyassessaranking’sbiasbyageandfield. Toaddthetemporaldimensiontothebiasassessmentprocedure,wesplitthepublicationsintoTequally-sizedagegroups, andrepeattheaboveanalysisbyusingF×Tdifferentcategoriesofpublications.InSection5.3,wesetF=19representing thenumberoffieldsandT=40asin(Marianietal.,2016),andthusweobtain760age-fieldgroupsofdifferentsizes.

2_A_curiosity_for_the_reader._Here,_the_average_of_the_square_of_the_d

Mfortheunbiasedsamplingprocessisextremelyclosetothenumberofdegreesof

freedomofourproblem.ThisstemsfromthefactthattheMHDdeﬁnedbyEq.(6)convergestoaMultivariateGaussianDistribution(MGD)asweincrease thenumberofpublicationsNwhilekeepingn/Nﬁxedandsmall.Ourdatasetislargeenoughforthisapproximationtobeaccurate.Thed2

MofaMGDis

distributedasa2_variable_with_average_equal_to_the_number_of_degrees_of_freedom,_i.e.₁₈_since_we_have₁₉_distinct_ﬁelds_and_one_constraint.

(8)

Table1

Theindividualcontributionz2

i(1−ki/N)ofeachﬁelditothedMofthedifferentmetrics.

Field c cf _p _RA_(p) _RAF_(c) _RAF_(p) Art 1.15 1.95 3.01 0.31 2.84 0.09 Biology 36.46 15.81 6.74 0.87 0.12 0.16 Business 2.06 2.08 5.95 1.51 14.56 13.50 Chemistry 0.34 8.44 0.85 9.28 4.23 0.00 ComputerScience 0.29 23.86 0.02 3.09 0.12 13.50 Economics 3.34 6.51 4.20 5.77 32.32 7.93 Engineering 8.36 0.09 10.48 0.35 8.36 0.03 EnvironmentalScience 16.82 0.04 35.67 29.89 2.45 2.23 Geography 0.05 1.34 0.15 0.50 0.15 0.51 Geology 1.02 3.04 6.42 0.13 2.10 10.06 History 0.69 2.48 1.67 0.15 3.12 0.52 MaterialsScience 0.39 0.43 0.23 0.01 2.37 0.10 Mathematics 5.17 1.94 0.10 0.14 0.09 25.34 Medicine 4.02 7.66 0.06 1.94 2.16 19.56 Philosophy 1.49 3.23 0.12 0.36 0.15 0.07 Physics 8.30 0.21 6.00 33.67 14.34 0.01 PoliticalScience 1.47 10.22 6.82 0.51 1.47 2.44 Psychology 5.75 1.43 8.56 10.24 2.93 1.65 Sociology 2.83 9.23 2.95 1.30 6.11 2.30

TheboldvaluesmarktheﬁeldsthatgivethelargestcontributiontothedMofeachmetric.

5.2. Resultsonthebiasbyﬁeld

TherankingsproducedbydifferentmetricsdiffergreatlybytheirdM(seeFig.4).Asexpected,themetricsthatarenot ﬁeld-rescaled(c,RA_(p),_and_p)_are_far_from_being_unbiased._At_the_same_time,_relative_citation_count_that_is_rescaled_by_ﬁeld

performsonlyslightlybetterthanPageRankwhichisignorantofanyﬁeldinformation.Thebestresultsbyawidemarginare achievedbyourmetrics,RAF_(c)_and_RAF_(p),_obtained_using_the_new_rescaling._{Nevertheless,}_both_these_metrics_fail_to_meet

the95%upperboundachievedbysimulatedunbiasedrankings.Asdisappointingasitmayseem,thisﬁndingisnotentirely surprisingastheproposedrescalingproceduresfocusonequalizingtheﬁrsttwomomentsoftherespectivequantities(c andp)whereasthequantities’distributionscandifferalsobyhighermoments.

TounderstandwhichﬁeldcontributesthemosttotheresultingdMvalues,wederivedanalternativeanalyticexpression forthedM dM(k, )2= F

i z2 i

1−ki N

(7)

whereweomitthemetricsuperscript (m)fromthenotationforziand k forsimplicity.Wehave proventhisformula

analyticallyforF=3,4,5,6,andwehavenumericallytesteditforF=19and760(seeAppendixB);itremainsopentoprove itinarbitrarydimensions.

InTable1,wereporttheindividualﬁelds’contributionstothed2

McalculatedusingEq.(7).WeﬁndthatBiologyand ComputerSciencearetheﬁeldswhichgivethebiggestcontributionstothed2

Mfortherankingsbycitationcountand relativecitationcount,respectively.Thiscouldnothavebeendetectedbylookingatthedeviationsfromtheexpected values.Indeed,inFig.1weonlyseethatEnvironmentalScienceandPoliticalSciencehavethelargestdeviations.Forthe novelindicators,approximatelyonethirdofthed2

MofRAF(c)isexplainedbytheﬁeldofEconomicsandapproximatelyone fourthofthed2

MofRAF(p)isexplainedbytheﬁeldofMathematics.

Inaddition,wealsofindthatthedM’scontributionsacrossdifferentfieldsassumevaluesinarelativelybroadrange.This suggeststhatfindingsonrankings’biasbyfieldmaystronglydependonwhichdisciplinesareincludedornotintheanalysis. Wearguethatarbitrarychoicesonwhichfieldstoincludeshouldbeavoidedinfutureresearchonfield-normalizationof impactindicators.

Tosummarize,ourbiassuppressiontestallowsusnotonlytoestimatethelevelofbias(dM)ofthevariousmetrics,but alsotoquantifywhichpercentageofthetotalbias(d2

M)ofanindicatorisexplainedbyeachsingleﬁeld. 5.3. Resultsonthebiasbyageandﬁeld

Whiletheanalysisoftheprevioussectionfocusedontherankingbiasbyﬁeld,inthissectionweusethedMto simulta-neouslyassessthebiasbyageandﬁeldofagivenranking.

InFig.5,weshowthedM’sforthedifferentindicatorsandforthe95%confidenceintervalforthesimulatedunbiased selectionprocessusing40×19age-fieldtypesofpublications.Forcitationcount,PageRank,relativecitationcountand age-rescaledPageRankwehavetorejectthehypothesisthattherankingsoftheseindicatorsarenotbiasedbyageandfield.For

(9)

Fig.5.Mahalanobisdistances,dM,fortheanalyzedindicatorswhenconsideringthe760age-ﬁeldgroups.Fromlefttoright:citationcount,relativecitation

count,PageRank,age-rescaledPageRank,age-andfield-rescaledcitationcountandage-andfield-rescaledPageRank.Thehorizontalredlinerepresents theupperboundofthe95%confidenceintervalobtainedfromthesimulations.Intheinsets,wereportthedistributionofdMcomingfrom1000000

simulationsoftheunbiasedsamplingprocess.Again,theredlinerepresentstheupperboundofthe95%conﬁdenceinterval.(Forinterpretationofthe referencestocolorinthisﬁgurelegend,thereaderisreferredtothewebversionofthearticle.)

Fig.6. Heatmapsshowingthebiasbyfieldandageoftherankingsbythedifferentindicators.Eachcellrepresentsanage-fieldgroup:agegroupsare representedhorizontally,whilefieldsarerepresentedvertically.Thecolorofthecellsshowsthebiasoftheindicatorswithrespecttothatage-fieldgroup. Whitemeansthattherespectiveage-fieldgroupisfairlyrepresentedinthetop1%oftherankingbytheindicator.Whileweuseacolorscalefromwhite tointensered(blue)forage-fieldgroupwhichareunderestimated(overestimated).(Forinterpretationofthereferencestocolorinthisfigurelegend,the readerisreferredtothewebversionofthearticle.)

theimprovedindicators,age-andﬁeld-rescaledcitationcountandPageRank,wealsohavetorejectthenullhypothesis, eventhoughtheyaremuchclosertothe95%conﬁdenceinterval.

Itisworthtonoticethatage-rescaledPageRank,anindicatordevelopedtoonlyremovePageRank’sbiasbyage(Mariani etal.,2016),issignificantlylessbiasedcomparedtorelativecitationcount,anindicatorspecificallydesignedto simultane-ouslyremovebiasbyageandfield.

5.4. Simultaneouslyvisualizingthebiasbyageandﬁeld

Tovisualizethefieldandagebiasoftherankingsbytheanalyzedindicators,weuseheatmapsintheage-fieldgroupplane (seeFig.6).Intheseheatmaps,eachcellrepresentsafield-agegroup,anditscolorindicatesthelevelofbias.Awhitecell indicatesthatthenumberofpapersintherespectiveage-fieldgroupfallsintothe95%confidencelevel(C95%)determined

(10)

withthesimulations.Hence,whitemeansthatnobiasisdetectedfor thatage-fieldgroup.Whileforrepresentingthe biastowardsoragainstagroupofpapers,weuseblue(overestimation)andred(underestimation).Toobtainarangeof over/under-estimation,thebrightnessofthecolorsrangesfromwhite(nobias)tointenseblue/red.Themostintensecolors indicatethatthenumberofpapersfromthatage-fieldgroupis5standarddeviationsmaller/biggerthantheexpectedvalue. ThetoppanelofFig.6showsthat,independentlyoffield,citationcountandPageRanksystematicallyover-represent oldpapersandunder-representrecentpapers.Thisisinagreementwiththefindingsofseveralotherworks(Chenetal., 2007;Marianietal.,2015,2016;Newman,2009).TheonlyexceptionisPoliticalSciencewhichisusuallyunderestimated independentlyofpaperage.Wearguethatthishappensbecausethisisthesmallestfieldinthedataset,andithasbecome anacademicdisciplinebyitselfmuchlatercomparedtomostoftheotherfields3_._Also,_the_oldest_papers_in_most_fields_are

under-representedbycitationcount,whichreﬂectsthechangeofcitationpracticesovertime.

ThemiddlepanelsofFig.6showthattherelativecitationcountandage-rescaledPageRanksuppresslargepartofthe biasesoftheoriginalmetrics,yetspecificfieldsareconsistentlyoverestimatedorunderestimated.Forexample,both age-rescaledPageRankandrelativecitationcountunder-representpapersbelongingtothefieldofChemistry.Apeculiarityof relativecitationcountisthatitover-representsboththeoldestaswellasthemostrecentpapersatthecostoftheother papers.

ThebottompanelsofFig.6showtheheatmapsforthenewindicatorsage-andﬁeld-rescaledcitationcountRAF_(c)_and

PageRankRAF_(p)._We_find_that_the_respective_rankings_are_much_less_biased_towards_specific_fields_compared_to_all_the_other

analyzedmeasures.However,therearetwopatterns:forRAF_(c)_recent_publications_tend_to_be_{underestimated}_for_some

ﬁelds,whereasforRAF_(p)_recent_publication_tend_to_be_{overestimated}_for_almost_all_ﬁelds._These_rather_systematic_patterns

musthavetheirrootsinchangesofthecitationandPageRankscoredistributionswithtime.Sinceourrescalingprocedure wasﬁxingtheﬁrsttwomomentsofthesedistributions,theobservedpatternscomefromdifferencesinhighermoments. Thus,thedistributionsofRAF_(c)_and_RAF_(p)_are_aligned_only_partially_for_papers_of_different_age.

6. Discussionandconclusion

Tosummarize,inthispaperwehaveanalyzedalargecitationnetworkfromtheMicrosoftAcademicGraphtoshowthat therankingsofpapersbywell-knownindicatorsareextremelybiasedbyageandfield.Thelevelofbiasoftherankingshas beenquantifiedwithanewstatisticalframeworkbasedontheMahalanobisdistance.Thisframeworkhasallowedusto simultaneouslyquantifytheageandfieldbiasesoftheanalyzedrankings,andtodeterminewhichgroupsofpapersgivethe largestcontributionstotheobservedbias.Toallowotherresearchertoeasilyimplementourstatisticaltestforrankingbias, wemaketherespectivecodepubliclyavailable4_together_with_a_quick_tutorial_on_how_to_use_it.5_In_addition,_we_have_also

introducedtwonewindicatorsofpaperimpact,rescaledcitationcountRAF_(c)_and_rescaled_PageRank_RAF_(p)_that_produce

muchlessbiasedrankingsthanexistingindicators.Inparticular,therankingbyRAF_(p)_is_{approximately}_three_times_less

biasedcomparedtotheleastbiasedexistingmetrics,relativecitationcountandage-rescaledPageRank.

Thecontributionofourresultstothedebateonthevalidityoffield-normalizationproceduresisthreefold.First,our findingsareinagreementwiththeconclusionsofAlbarránetal.(2011)andWaltmanetal.(2012)whicharguedthatthe relativecitationcountintroducedbyRadicchietal.(2008)canbeinsufficienttoeffectivelyremovecitationcount’sbiasby ageandfield.Second,weshowtheimportanceoftestingindicatorsusinganaccuratestatisticalprocedure,suchtheone introducedinthispaper.Indeed,fortheleast-biasedindicatorsanalyzed,RAF_(c)_and_RAF_(p),_no_clear_indication_of_bias_is_found

atfirstglance.However,whenusingthestatisticaltestbasedontheMahalanobisdistance,wefindasignificantdiscrepancy betweentheirrankingsandthosecomingfromunbiasedsamplingprocess.Wearguethatincludinghigher-ordermomenta (suchastheskewness)intherescalingprocedurecanbeanefficientwaytofurtherreducetherankings’levelofbias.Third, byderivinganexplicitformulatocalculatethecontributionofeachfieldtothebiasofaranking,wefindthatthethese contributionsassumeabroadrangeofvalues.Weobtainsimilarfindingsalsoforthecontributionstotheage-fieldbias.This meansthatthelevelofbiasofrankingsdependsheavilyonwhichyearsorfieldsareincludedintheanalysis.Forthisreason, infutureresearchonageandfieldnormalizationofindicators,itisessentialtoclearlymotivatewhichyearsandfieldsare includedintheanalysis,avoidingarbitraryoruncriticaldecisions.

Toaddressthebiasbyageandfieldofrankingofpapers,wehavefirstdividedthepapersingroupswithsimilarageand fromthesamefield.Then,weconsideredonlythesizesofthesegroupstodefineanunbiasedselectionprocessfromwhich weobtainedastatisticalnullmodelforanunbiasedranking.Inprinciple,additionalinformationcanbeincludedintothe nullmodeltocorrectforothereffects.Forexample,includinginformationabouttheco-authorshipnetworkwouldpermit tocorrectfortheeffectofthisnetworkonthegrowthandstructureofthecitationnetwork(Sarigöl,Pfitzner,Scholtes,Garas,

3_We _notice _that_the _{classification} _of _Political_Science _as _one_of _the _{highest-level}_fields _is _not _obvious. _In _Scopus_categories, _“Political Sci-enceandInternationalRelations”isonlyasubfieldofthehigher-levelfieldSocialScience[http://www.scimagojr.com/journalrank.php?area=3300]. In the Web of Science classification scheme, “Political Science” is only a subfield of the higher-level field “Social Sciences, General” [http://ipscience-help.thomsonreuters.com/inCites2Live/8300-TRS.html].

4_{https://github.com/giava90/quantifying-ranking-bias}_.

5_{https://www.sg.ethz.ch/team/people/gvaccario/quantifying-ranking-bias/}_.

(11)

&Schweitzer,2014;Sarigol,Garcia,Scholtes,&Schweitzer,2017).Inthisway,wewouldgainabetterunderstandingofhow thesocialdimensionofsciencecontributestotheﬁeldandagebiasesofimpactindicators.

Weemphasizethatremovingthebiasesaddressedinthispaperandthosethatcomefromsocialaspectsisofprimary importancenotonlyforscholarlypublicationdatabases,butalsoforseveralotherinformationsystems,suchastheWWW oronlinesocialnetworks(Scholtes,Pﬁtzner,&Schweitzer,2014a).Asamatteroffact,everydayscholarsandon-lineusers exploreavailableknowledgeusingrecommendersystemsbasedonrankingalgorithms.Thischallengesustodesignmore sophisticatedﬁlteringandrankingprocedurestoavoidbiasesthatcansystematicallyhiderelevantcontentsoronlyshow informationtoosimilartowhattheusersalreadyknow.

Whilewe haveanalyzedindetail thepresence ofageandﬁeldbiasin theranking,it stillremainstoevaluatethe actualrankingperformanceofthenewlyproposedindicatorsinartiﬁcialdata(Medo&Cimini,2016)orinrealdatawhere thegroundtruthisprovidedbysomeexternalsource(Dunaiski,Visser,&Geldenhuys,2016;Marianietal.,2016).Another importantissueisthecomparisonbetweenmetricsbasedoncitationcountandmetricsthattakethewholecitationnetwork intoaccounttodeterminepapers’score.Ouranalysis(seeSectionD)showsthattherankingsbytheleast-biasedindicators RAF_(c)_{(citation-based)}_and_RAF_(p)_{(network-based)}_are_positively_correlated,_still_{substantially}_different._Can_we_use_the

extra-informationprovidedbythenetworktoenhanceourabilitytoidentifyhighly-significantpublications?Ourintuition andtheresultspresentedbyChenetal.(2007)andbyMarianietal.(2016)forspecificresearchfieldssuggestthatthisisthe case.Yet,weneedadditionalanalysistovalidatethisconjectureinlargerdatasetssuchastheoneanalyzedinthisarticle.

Toconclude,byreducing theageandﬁeldbiasesfromindicatorsofscientiﬁcimpactandbyextendingtheexisting statisticaltestsforbiases,wecontributetothechallengeofquantifyingandsuppressingbiasesofrankingsininformation systems.

Acknowledgements

WethankIngoScholtes,FrankSchweitzerandYi-ChengZhangforsuggestionswhichimprovedthemanuscript.In addi-tion,wealsothankEmreSarigölforhishelpinpre-processingthedata,andEliasBaumanforhisimportantcontribution totheoptimizationoftheC++codethatweusedforthenetworkanalysis.GVacknowledgessupportfromtheSwissState SecretariatforEducation,ResearchandInnovation(SERI),GrantNo.C14.0036aswellasfromEUCOSTActionTD1210 KNOWeSCAPE.MSMacknowledgessupportfromtheSwissNationalScienceFoundationGrantNo.200020-156188.

AuthorcontributionsConceivedanddesignedtheanalysis:GiacomoVaccario,MatusMedo,NicolasWider,Manuel

Sebas-tianMariani.

Collectedthedata:GiacomoVaccario.

Contributeddataoranalysistools:GiacomoVaccario,ManuelSebastianMariani. Performedtheanalysis:GiacomoVaccario,ManuelSebastianMariani.

Wrotethepaper:GiacomoVaccario,MatusMedo,NicolasWider,ManuelSebastianMariani.

AppendixA. KDDCupdata

A.1Datasource

Inthiswork,weanalyzedthedumpoftheMicrosoftAcademicGraph(MAG)releasedfortheKDDCup2016(Sinhaetal., 2015),acompetitionlinkedtoaprestigiouscomputerscienceconferenceonknowledgediscoveryanddatamining(KDD). AmongtheprimaryinterestsofthecommunityorganizingtheKDDCup,therearethetechnicalchallengesrelatedto web-scaledatacollectionandaggregation.Forthisreason,thereleaseddatafortheKDDCup2016wentthroughonlybasic processing.6_Each_publication_in_the_dataset_is_endowed_with_its_unique_identiﬁer,_paperid,_publication_date,_references_to

otherpublications,anditsﬁeldofstudy. A.2Datapre-processing

Whenanalyzingthedata,wedonotdistinguishthepublicationsbytheirtype(paper,review,book,etc.).Further,we alsodonotdifferentiatebetweendifferenttypesofjournalsandtakeintoaccountallofthem:forexample,wedonot distinguishbetweenacitationcoming(orgoing)from(orto)aletterorabook.Wearguethatitisimportanttokeepvarious typesofjournalsandpublicationsbecausedifferentﬁeldsadoptnotonlydifferentcitationnorms,butalsodifferentwaysto communicatetheirresults.Forexample,computerscienceresearcherscommonlypublishresultsinconferenceproceedings, whilephysicsauthorstendtopreferarticlesorletters.Atthesametime,weareawarethatdifferenttypesofpublications mighthavedifferentcitationcharacteristics.However,goodindicatorsshouldideallybeabletoaccountforheterogeneities amongpublicationsandcitationnormsacrossdifferentcommunitiesandproduceunbiasedrankingswithouttheneedfor

6_{https://kddcup2016.azurewebsites.net/Data}

(12)

TableA.2

Mainfields.The19parentmainfieldsidentifiedbyMicrosoftAcademicGraphwiththeirnumberofpublicationsandaveragecitations.Thesecondtolast rowreportsthetotalnumberofpublicationsconsideringmultipletimespublicationsthatbelongtomorethanonefields,whereasthelastrowreportsthe totalnumberofuniquepublicationsandtheaveragecitationcount.

Field Publicationcount Meancitationcount

Art 233251 3.90 Biology 5847554 9.67 Business 613827 4.81 Chemistry 6204531 7.28 ComputerScience 4080636 6.13 Economics 2252921 5.56 Engineering 3011763 5.10 EnvironmentalScience 315465 12.63 Geography 288338 6.92 Geology 1825707 7.88 History 390144 5.53 MaterialsScience 2063474 6.12 Mathematics 4551453 5.87 Medicine 5061990 7.90 Philosophy 787649 5.05 Physics 6976644 5.55 PoliticalScience 144473 2.51 Psychology 2861813 8.23 Sociology 1784695 5.39 Total 49296327 –

Total(nomultiple) 18193082 6.42

arbitrarychoicesaboutwhichtypesofarticlestoincludeintheanalysis.Inaddition,similarlyasWaltmanetal.(2012)and differentlyfromRadicchietal.(2008),wedonotexcludepublicationswhichdonotreceivecitations.

AsmentionedinSection2,theMAGhasafieldclassificationschemewith4hierarchicallevels.Thefieldassignmentis basedonaninternalalgorithmthatusesamachinelearningapproach(Sinhaetal.,2015).Inourwork,similarlytoRadicchi etal.(2008),weareonlyinterestedinimpactmetricnormalizationatthemostcoarse-grainedlevel.Tothisaim,inour analysis,wefocusonlyonthe19mainfieldsaslistedinTableA.2.Discussingthepossiblelimitationsoftheclassification approachbyMASandthedependenceofourresultsontheadoptedclassificationschemeisarelevantsubjectforfuture researchbutgoesbeyondthescopeofthismanuscript.Weonlyincludedintheanalysispapersforwhichthefollowing informationareavailable:(1)uniqueidentifier(ID);(2)completepublicationdate(yyyy/mm/dd)crucialforthetemporal rescalingprocedurethatisexplainedinthefollowing;(3)DOIorjournal-id,inordertobeabletoretrievethepublication; (4)assignmenttoatleastoneofthemain19fields.Wediscardfromouranalysispublicationsforwhichoneormoreof thesefourpropertiesaremissing.

Withthisﬁlteringprocedure,weobtainN=18193082uniquepublicationsandE=109719182citations. A.3Databasicstatistics

Wefirstobservethatthedistributionsofbothincomingandoutgoingedgesarebroad(seeFig.A.7).Boththesedistribution infacthavelongandheavytails.Inaddition,wealsoobservestrongvariationsofthemeancitationcountacrossfields(see TableA.2fordetails).Forexample,publicationsbelongingtothefieldofEnvironmentalScienceandofPoliticalSciencehave

Fig.A.7.In-andout-degreedistributionforthepublicationspresentinourdatasetafterpreprocessingthedata.

(13)

Fig.A.8.ScatterplotofthecitationcountsreportedinthedatareleasedfortheKDDCup2016andfromtheonlineversionoftheMAG(02/2017).

meancitationcounttwotimesbiggerandsmaller,respectively,comparedtotheaveragecitationcountacrossallfields.In agreementwiththefindingsbyRadicchietal.(2008),SchubertandBraun(1986),Vinkler(1986),thestrongvarietyinmean citationcountperpublicationforthevariousfieldsconfirmsthatthecorrespondingcommunitiesexhibitdifferentcitation behavior,whichcallsforfieldnormalizationprocedures.

A.4Dataquality

Here,weaddressthefollowingquestion:towhichextentistheKDDdumpoftheMAGinaccordancewiththemost updatedonlineversionoftheMAG?7_For_this_comparison,_we_divided_the₄₉₂₉₆₃₂₇_analyzed_paper-ﬁeld_pairs_(one

paper-fieldpairiscomposedofapaperandthefieldsthatpaperbelongsto)into760ageandfieldgroups,asdescribedinSection5. Fromeachgroup,werandomlychoose0.1%ofpapers.Followingthisprocedure,weobtainedarepresentativesamplewith respecttofieldandagecomposedofapproximately50000papers.

First,wehavematchedthepaperIDinhexadecimalformatpresentinthedatareleasedfortheKDDCuptothepaper IDinint64formatpresentintheon-lineversionoftheMAG.For50papers,wemanuallyveriﬁedthatthepaperIDswere exactlythesameinthetwodatasets,albeitrepresentedindifferentformats.Then,usingtheAcademicKnowledgeApi,8_we

havedownloadedthenumberofcitationsforeachsampledpaper.

Forthe2.3%ofthesampledpapers,wewerenotabletoﬁndtheircorrespondingpaperintheonlineMAG. For50 unmatchedpapers,wefoundoutthatthesewerepaperswithduplicatesintheKDDCupdata,i.e.thesepapersarepresent intheKDDdatawithtwodistinctIDs,andonlyoneofthetwoIDsispresentintheMAG.

Forthematchedpapers(97.7%ofthesample),wecomparedthenumberofcitationsreportedintheKDDdumpofthe MAGwiththenumberofcitationsreportedontheonlineversionoftheMAG.SincetheonlineMAGhasalongertimespan thantheKDDdata,intheabsenceofnoise,wewouldexpectthenumberofcitationsintheKDDdatatobesmallerthan orequaltothenumberofcitationsreportedintheonlineMAG.Wefindthatthetwocitationcountsarehighlycorrelated (Fig.A.8),andonlythe2.2%ofthesampledpapershavemorecitationsintheKDDdatacomparedtotheonlineversionof theMAG.Thissetsalowerboundaryfortheerrorinthepercentageofpaperswithwrongnumberofcitationstoabout2.2%. Tosummarize,afterourfilteringprocedure,wefindthatthedatareleasedforKDDCup2016hasabout2.3%ofpapers withduplicates.Inaddition,about2.2%ofthematchedpapershaveerrorsintheircitationcount.Thismeansthatwehave correctcitationinformationforabout95.6%oftheanalyzedpapers.

AppendixB. EvaluatingindividualcontributionstotheMahalanobisdistance

TheMahalanobisdistance(dM)isanestablishedmeasureinstatisticswhichgeneralizestheconceptofz-scoreto multi-variatedistributionsbytakingintoaccountalsopossiblecorrelationsbetweentherandomvariables(Mahalanobis,1936). Itsdeﬁnitionreads

dM(x, y)=

(x − y)T_S−1₍_{x − y)} _(B.1)

7_We_have_performed_this_analysis_during_February_2017.

8_{https://www.microsoft.com/cognitive-services/en-us/academic-knowledge-api}

(14)

whereS−1 _is_the_inverse_of_the_covariance_matrix,_{x and y are}_two_vectors_containing_the_random_variables._When_the

covariancematrixisdiagonal,i.e.therandomvariablesarenotcorrelated,thanthedMisequivalenttothesquarerootof thesumofthesquaresofthez-scores.

InSection5.2,wehaveusedEq.(7),anexpressionforthedMvalidwhenthecovariancematrixcomesfromaMultivariate HypergeometricDistribution(MHD),i.e.,whentheelementsofthematrixare

Sij=

ıij(Ki(N−Ki))−

1−ıij

KiKj

∀

i,j=1,...,F−1 (B.2)

whereıijistheKroneckerdelta,Kiisthenumberofpapersofcategoryi,N=

F

iKiisthetotalnumberofpapers,Fisthe

numberofpapercategories,=n(N−n)/N2_(N₋₁₎_and_n_is_the_number_of_sampled_papers._It_is_worthy_to_remember_that

eventhoughwehaveFdifferentcategories,weonlyhaveF−1degreesoffreedom.Here,wederiveEq.(7)forthecaseofa MHDinthreedimensions,i.e.,forF=3.Inthiscase,thecovariancematrixis2×2:

S=

K1(N−K1) −K1K2 −K1K2 K2(N−K2)

(B.3) andtheinverseofthecovariancematrixis

S−1= 1 det(S)

K2(N−K2) K1K2 K1K2 K1(N−K1)

(B.4)

wheredet(S)=K1(N−K1)K2(N−K2)−(K1K2)2denotesthedeterminantofthecovariancematrix,S.Then,letusconsidertwo

randomcolumnvectorsextractedfroma3-dimensionalMHD,x=(x1,x2,x3)Tandy=(y1,y2,y3)Tsuchthatn=

3_i=1xi=

3

i=1yiwherenisthenumberofsampledpapers.SubstitutingEq.(B.4)inEq.(B.1),wewritethesquareofthedMbetween xandyas dM(x, y)2= 1 det(S)

x1−y1 x2−y2

_K₂_(N₋_K₂) +K1K2 +K1K2 K1(N−K1)

x1−y1 x2−y2

= 1 det(S)

(x1−y1)2K2(N−K2)+(x2−y2)2K1(N−K1) +2(x1−y1)(x2−y2)(K1K2))

= _det(S)1

(x1−y1)2K2(K1+K3)+(x2−y2)2K1(K2+K3) +2(x1−y1)(x2−y2)(K1K2)

= 1 det(S)

(x1−y1)2K2K3+(x2−y2)2K1K3 + [(x1−y1)+(x2−y2)]2K1K2

wherewehaveusedN=

3_i=1Ki.Recallingthatn=

3

i=1xi=

3

i=1yi,weknowthat(x1−y1)+(x2−y2)=(x3−y3),sowe

write

dM(x, y)2= _det(S)1

(x1−y1)2K2K3+(x2−y2)2K1K3+(x3−y3)2K1K2

(B.5) Then,byusingtherelationdet(S)=N

3_i=1K_i,wehave:

dM(x, y)2=1 3

i=1 (xi−yi)2 NK_i ; (B.6)

noticingfromEq.(B.2)that=Si,i/(Ki(N−Ki)),weobtain

dM(x, y)2= 3

i=1 (xi−yi)2 Sii Ki(N−Ki) NKi = 3

i=1 (xi−yi)2 Sii

1−Ki N

(B.7)

Finally,ifwechooseoneofthetwovectorstocontaintheexpectedvalues,i,were-obtainEq.(7)since(xi−i)2/Sii=z_i2.

Tobeprecise,thecovariancematrixisnotdeﬁnedfori=3,howevertherelation=2

3/(K3(N−K3))holdsandtherefore

alsotheﬁnalresult.

UsingMathematicaorsimilarsoftwares,itiseasytoproveanalyticallythateq.(7)holdsforsmalldimensions.Wehave veriﬁedituntil6dimensions.Moreover,wehavenumericallytestedthisformulabycalculatingthedM’sbetweentheranking vectorsoftheindicatorsandthevectorofexpectedvalues, ,withtwodifferentalternativemethods:(1)byusingEq.(B.1),

(15)

Fig.C.9.Distributionof2_obtained_by_calculating_the_empirical_ratio,_r₌_(e2

−1),betweenthevarianceandthesquareofthemeancitationcountof eachﬁeldandyear.

i.e.,byinvertingthecovariancematrix,and(2)byusingtheeigenvaluedecompositionofthecovariancematrix.9_The_results

ofthethreemethodswereallcompatiblewitheachotherupto10decimaldigits.TheadvantageofusingEq.(7)isthatwe cancalculatethedMbetweentwoarbitraryvectorswithoutdealingwithany(computationallyslow)matrixinversionor diagonalization,andthenumberofneededcalculationsscaleslinearlywiththenumberofdimensions.Importantly,Eq.(7) allowsalsotoassesstheindividualcontributionofeachdimension(i.e.ofeachcategory)tothed2

M.Toourbestknowledge, wearetheﬁrstonestohavederivedsuchexplicitformulafordMwhenthecovariancematrixandtherandomvectorscome fromaMHD.

AppendixC. Firstmomentrescaling

AccordingtoRadicchietal.(2008),thedistributionofthecf_indicator_is_log-normal:

F(cf_)dcf ₌ 1

cf√₂e

−[log(cf_)−]2_/22

dcf (C.1)

where=−2_/2_and_is_ﬁtted_from_the_data._When_Eq._(C.1)_is_veriﬁed,_then_also_the_{distributions}_of_citation_count,_c i,for

alltheindividualﬁelds,i,arelognormal: F(ci)dci=

1 c_i√2e

−[log(ci)−log(c0)−]2/22_dc

i (C.2)

wherec0isthemeanofci.Forlognormaldistributionsthevarianceisproportionaltothesquareofthemeanandthe

constantofproportionalityis(e2

−1).FromEq.(C.2),weseethatthecitationcountsciaredistributedlognormallywith

meane+log(c0)+2/2 _and_variance_(e2−_1)e2+2log(c0)+2_._Recalling_that=−2_/2,_we_have_that_the_mean_is_c₀_,_as_it_is expected,whilethevariancebecomes(e2

−1)c2

0.Thus,wheneq.(C.1)isveriﬁed,thevarianceoftheempiricaldistribution

ofthecitationsforeachﬁeldhastobeproportionaltothesquareofthemeancitationcount.Moreover,theconstantof proportionalityhastobe(e2

−1)foreveryﬁeldandyear.

TheanalyticresultjustpresentedisinlinewithEq.(C.1)giveninAppendixCofMarianietal.(2016).Thereitisshown thatarescalingprocedurebasedondivingtheoriginalscorebytheirﬁrstmomentworksiftheratiobetweenstandard deviationandmeanisconstant.Inthecaseoftherelativecitationratio,wecancalculateanalyticallysuchconstantusing thelognormaldistributionandobtaintheﬁttingparameter2_.

InFig.C.9,wereportthedistributionof2_obtained_by_calculating_the_empirical_ratio_between_the_variance_and_the

squareofthemean,r,andbyinvertingtherelationr=(e2

−1)foreveryfieldandyear.Iftheuniversalityclaimwascorrect, wewouldexpectanarrowdistributionof2_._By_contrast,_we_find_that2_ranges_between₀_and₈_across_different_fields_and

years.Wearguethatthebroadrangeof2_is_the_reason_why_the_ﬁrst_moment_rescaling_introduced_in_(Radicchi_et_al.,₂₀₀₈₎

doesnotworkintheanalyzeddataset.

9_The_matrix_S_is_symmetric_and_it_has_maximal_rank_because_it_is_the_covariance_matrix_of_a_multivariate_{distribution.}_Therefore,_we_can_diagonalize_it, S=B−1_DB_where_the_columns_of_B_form_an_orthonormal_basis;_we_can_also_write_S−1₌_B−1_D−1_B._With_this,_we_have_d2

M(x, y)=

F−1

i ci/i,where{i}are theeigenvaluesofSand{ci}arethecoordinatesofx − y inthebasiswhichdiagonalizesS,i.e.ci=

F−1 k (xk−yk)B

−1 ki =

F−1

k (xk−yk)Bikwherethelast equalitycomesfromtheorthonormalityofBwhichimpliesB−1₌_BT_.

(16)

TableD.3

Correlationsbetweenthemetrics.Thefourcolumnsrepresent(fromlefttoright):Pearson’scorrelationcoefﬁcientr,rrestrictedtopapersthatreceivedat leasttencitations,Spearman’srankcorrelationcoefﬁcient ,and restrictedtopapersthatreceivedatleasttencitations.

Metrics r r(onlyc>10) (onlyc>10)

c,p 0.82 0.82 0.79 0.59

RAF_(c),_RAF_(p) _0.80 _0.81 _0.82 _0.74

AppendixD. Comparingtherankingsbythemetrics

InSection5,wehaveshownthattherankingsbycitationcountandPageRankcanbesubstantiallyleveledoffacross differentﬁeldsandagegroupsthrougharescalingprocedurebasedonthez-score.Thetworesultingindicators,RAF_(p)

andRAF_(c),_are_the_least_biased_among_the_indicators_considered_in_this_paper._It_is_important_to_notice_that_while_citation

counthasadirectinterpretationanditiswidelyusedinresearchevaluation(Waltman,2016),PageRankscoreisamore sophisticatedquantitywhichhasnotyetbeturnedintoastandardtoolforresearchassessment.Iftherankingsbyrescaled PageRankandrescaledcitationcountbroughtsimilarinformation,rescaledcitationcountmightbepreferredduetoits simplerinterpretationandeasiercomputation.

Differentlyfromthecitationcount,PageRankscoreusesinformationonthewholenetworktopologytocomputeeach paper’sscore.WhilethereexistsnouniversalcriteriontodecidewhetherPageRankscoreleadstoabetterrankingthanthe citationcount,theresultsbyChenetal.(2007)andMarianietal.(2016)suggestthatincitationnetworks,PageRankscore improvesourabilitytoﬁndgroundbreakingpublicationsinthedata.Ontheotherhand,anodethatreceivedmanycitations ismorelikelytoachievelargerPageRankscore–Fortunato,Boguná,Flammini,andMenczer(2006)showedthatPageRank scoreisonaverageproportionaltocitationcountforuncorrelatednetworks.

Weaddressnowthequestion:Towhatextenttherankingsby(rescaled)PageRankand(rescaled)citationcountdiffer?We focusontwocorrelations:(1)thecorrelationbetweencitationcountandPageRank,whichhasbeenofinterestforprevious studies(Chen,Gan,andSuel(2004),Fortunatoetal.(2006),Pandurangan,Raghavan,andUpfal(2002))duetotheessential roleplayedbyPageRankalgorithmindeterminingthesuccessofGoogle’sWebsearchengine;(2)thecorrelationbetween thetworescaledindicatorsRAF_(c)_and_RAF_(p)._The_measured_correlations_are_all_positive,_{signiﬁcantly}_larger_than_zero,_yet

significantlysmallerthanone(seeTableD.3).Thisisinterestingasitmaypointoutthat,inanalogywiththefindingsby Chenetal.(2007),Marianietal.(2016),Ren,Mariani,Zhang,andMedo(2017),networktopologybringsusefulinformation thatisneglectedbycitationcount.WhethertheadditionalinformationusedbyPageRankmetriccanbeusedtoidentify groupsofsignificantpaperswillbethesubjectoffutureresearch.

DifferentlythantherankingsbycitationcountandPageRankscore(notshownhere),thetoppapersidentiﬁedbyRAF_(c)

andRAF_(p)_come_from_diverse_historical_period_and_diverse_ﬁelds._Due_to_the_lack_of_time_bias,_also_very_recent_papers_can

reachthetopoftherankingsbyRAF_(c)_and_RAF_(p)._It_would_be_instructive_to_look_at_the_top_papers_as_ranked_by_rescaled

citationcountandrescaledPageRank.However,theoriginalMAGdatasetspresentsnoisyentries,asreportedbytheKDD cup2016(seeAppendixAformoredetails),anditcausesthescoresofsomerecentpaperstobeover-estimated,which makessomeoftheentriesinthetop-20oftherankingsbytherescaledscoresunreliable.Forthisreason,wedonotshow therankingshere.Atthesame,thisproblemdoesnotaffectthestatisticalresultspresentedinthepreviousSections.

References

Adams,J.,Gurney,K.,&Jackson,L.(2008).Calibratingthezoomatestofzittshypothesis.Scientometrics,75(1),81–95.

Albarrán,P.,Crespo,J.A.,Ortuno,I.,&Ruiz-Castillo,J.(2011).Theskewnessofsciencein219sub-ﬁeldsandanumberofaggregates.Scientometrics,88(2), 385–397.

Bergstrom,C.T.,West,J.D.,&Wiseman,M.A.(2008).Theeigenfactormetrics.JournalofNeuroscience,28(45),11433–11434. Bonacich,P.(1987).Powerandcentrality:Afamilyofmeasures.AmericanJournalofSociology,92(5),1170–1182.

Bornmann,L.,&Daniel,H.-D.(2008).Whatdocitationcountsmeasure?Areviewofstudiesoncitingbehavior.JournalofDocumentation,64(1),45–80. Bornmann,L.,&Daniel,H.-D.(2009).Universalityofcitationdistributions–Avalidationofradicchietal.’srelativeindicatorcf=c/c0atthemicrolevel

usingdatafromchemistry.JournaloftheAmericanSocietyforInformationScienceandTechnology,60(8),1664–1670.

Brin,S.,&Page,L.(1998).Theanatomyofalarge-scalehypertextualwebsearchengine.ComputerNetworksandISDNSystems,30(1),107–117. Chen,Y.-Y.,Gan,Q.,&Suel,T.(2004).Localmethodsforestimatingpagerankvalues.ProceedingsofthethirteenthACMinternationalconferenceon

informationandknowledgemanagement,ACM,381–389.

Chen,P.,Xie,H.,Maslov,S.,&Redner,S.(2007).FindingscientiﬁcgemswithGoogle’sPageRankalgorithm.JournalofInformetrics,1(1),8–15.

Colliander,C.,&Ahlgren,P.(2011).Theeffectsandtheirstabilityofﬁeldnormalizationbaselineonrelativeperformancewithrespecttocitationimpact: Acasestudyof20naturalsciencedepartments.JournalofInformetrics,5(1),101–113.

deSollaPrice,D.(1976).Ageneraltheoryofbibliometricandothercumulativeadvantageprocesses.JournaloftheAmericanSocietyforInformationScience, 27(5),292–306.

Dunaiski,M.,Visser,W.,&Geldenhuys,J.(2016).Evaluatingpaperandauthorrankingalgorithmsusingimpactandcontributionawards.Journalof Informetrics,10(2),392–407.

Ermann,L.,Frahm,K.M.,&Shepelyansky,D.L.(2015).Googlematrixanalysisofdirectednetworks.ReviewsofModernPhysics,87(4),1261. Fortunato,S.,Boguná,M.,Flammini,A.,&Menczer,F.(2006).ApproximatingPageRankfromin-degree.InInternationalworkshoponalgorithmsand

modelsfortheweb-graph.pp.59–71.Springer.

Franceschet,M.(2011).Pagerank:Standingontheshouldersofgiants.CommunicationsoftheACM,54(6),92–101. Garﬁeld,E.(2006).Thehistoryandmeaningofthejournalimpactfactor.JAMA,295(1),90–93.

Gleich,D.F.(2015).Pagerankbeyondtheweb.SIAMReview,57(3),321–363.

Harzing,A.-W.,&Alakangas,S.(2013).Microsoftacademic:Isthephoenixgettingwings?Scientometrics,1–13.

(17)

Hirsch,J.E.(2005).Anindextoquantifyanindividual’sscientiﬁcresearchoutput.ProceedingsoftheNationalacademyofSciences,1656,9–16572. Hug,S.E.,&Braendle,M.P.(2017).ThecoverageofMicrosoftacademic:Analyzingthepublicationoutputofauniversity.,arXivpreprintarXiv:1703.05539 Hug,S.,Ochsner,M.,&Brandle,M.P.(2017).CitationanalysiswithMicrosoftacademic.Scientometrics,111(1),371–378.

Katz,L.(1953).Anewstatusindexderivedfromsociometricanalysis.Psychometrika,18(1),39–43.

Kleinberg,J.M.(1999).Authoritativesourcesinahyperlinkedenvironment.JournaloftheACM(JACM),46(5),604–632. Lundberg,J.(2007).Liftingthecrowncitationz-score.JournalofInformetrics,1(2),145–154.

Mahalanobis,P.C.(1936).Onthegeneraliseddistanceinstatistics.ProceedingsoftheNationalInstituteofScienceofIndia,4,9–55. Mariani,M.S.,Medo,M.,&Zhang,Y.-C.(2015).Rankingnodesingrowingnetworks:WhenPageRankfails.ScientiﬁcReports,5.

Mariani,M.S.,Medo,M.,&Zhang,Y.-C.(2016).Identiﬁcationofmilestonepapersthroughtime-balancednetworkcentrality.JournalofInformetrics,10(4), 1207–1223.

Maslov,S.,&Redner,S.(2008).PromiseandpitfallsofextendingGoogle’sPageRankalgorithmtocitationnetworks.JournalofNeuroscience,28(44), 11103–11105.

McAllister,P.R.,Narin,F.,&Corrigan,J.G.(1983).Programmaticevaluationandcomparisonbasedonstandardizedcitationscores.IEEETransactionson EngineeringManagement,4,205–211.

Medo,M.,&Cimini,G.(2016).Model-basedevaluationofscientificimpactindicators.PhysicalReviewE,94(3),032312. Medo,M.,Cimini,G.,&Gualdi,S.(2011).Temporaleffectsinthegrowthofnetworks.PhysicalReviewLetters,107,238701. Newman,M.E.J.(2009).Thefirst-moveradvantageinscientificpublication.EPL(EurophysicsLetters),86(6),68001.

Pandurangan,G.,Raghavan,P.,&Upfal,E.(2002).UsingPageRanktocharacterizewebstructure.InInternationalcomputingandcombinatoricsconference. pp.330–339.Springer.

Parolo,P.D.B.,Pan,R.K.,Ghosh,R.,Huberman,B.A.,Kaski,K.,&Fortunato,S.(2015).Attentiondecayinscience.JournalofInformetrics,9(4),734–745. Pinski,G.,&Narin,F.(1976).Citationinﬂuenceforjournalaggregatesofscientiﬁcpublications:Theory,withapplicationtotheliteratureofphysics.

InformationProcessing&Management,12(5),297–312.

Radicchi,F.,&Castellano,C.(2011).Rescalingcitationsofpublicationsinphysics.PhysicalReviewE,83(4),046116.

Radicchi,F.,&Castellano,C.(2012a).WhySirtes’sclaims(Sirtes,2012)donotsquarewithreality.JournalofInformetrics,6(4),615–618.

Radicchi,F.,&Castellano,C.(2012b).Testingthefairnessofcitationindicatorsforcomparisonacrossscientiﬁcdomains:Thecaseoffractionalcitation counts.JournalofInformetrics,6(1),121–130.

Radicchi,F.,Fortunato,S.,&Castellano,C.(2008).Universalityofcitationdistributions:Towardanobjectivemeasureofscientiﬁcimpact.Proceedingsof theNationalAcademyofSciences,105(45),17268–17272.

Ren,Z.-M.,Mariani,M.S.,Zhang,Y.-C.,&Medo,M.(2017).Atime-respectingnullmodeltoexplorethestructureofgrowingnetworks.,arXivpreprint arXiv:1703.07656.

Sarigöl,E.,Pﬁtzner,R.,Scholtes,I.,Garas,A.,&Schweitzer,F.(2014).Predictingscientiﬁcsuccessbasedoncoauthorshipnetworks.EPJDataScience,3(1),1. Sarigol,E.,Garcia,D.,Scholtes,I.,&Schweitzer,F.(2017).Quantifyingtheeffectofeditor-authorrelationsonmanuscripthandlingtimes.Scientometrics,

1–25(inpress).

Scholtes,I.,Pﬁtzner,R.,&Schweitzer,F.(2014).Thesocialdimensionofinformationranking:Adiscussionofresearchchallengesandapproaches.In Socioinformatics–ThesocialimpactofinteractionsbetweenhumansandIT.pp.45–61.Springer.

Scholtes,I.,Wider,N.,Pﬁtzner,R.,Garas,A.,Tessone,C.J.,&Schweitzer,F.(2014).Causality-drivenslow-downandspeed-upofdiffusionin non-Markoviantemporalnetworks.NatureCommunications,5.

Schubert,A.,&Braun,T.(1986).Relativeindicatorsandrelationalchartsforcomparativeassessmentofpublicationoutputandcitationimpact. Scientometrics,9(5-6),281–291.

Sinha,A.,Shen,Z.,Song,Y.,Ma,H.,Eide,D.,Hsu,B.-J.P.,etal.(2015).AnoverviewofMicrosoftacademicservice(MAS)andapplications.Proceedingsofthe 24thinternationalconferenceonworldwideweb,ACM,243–246.

Sirtes,D.(2012).FindingtheEastereggshiddenbyoneself:WhyRadicchiandCastellano’s(2012)fairnesstestforcitationindicatorsisnotfair.Journalof Informetrics,6(3),448–450.

vanLeeuwen,T.N.,&Medina,C.C.(2012).Redefiningthefieldofeconomics:Improvingfieldnormalizationfortheapplicationofbibliometrictechniques inthefieldofeconomics.ResearchEvaluation,21(1),61–70.

Vinkler,P.(1986).Evaluationofsomemethodsfortherelativeassessmentofscientiﬁcpublications.Scientometrics,10(3-4),157–177.

Walker,D.,Xie,H.,Yan,K.-K.,&Maslov,S.(2007).Rankingscientiﬁcpublicationsusingamodelofnetworktrafﬁc.JournalofStatisticalMechanics:Theory andExperiment,2007(06),P06010.

Waltman,L.,&vanEck,N.J.(2010).Therelationbetweeneigenfactor,audiencefactor,andinﬂuenceweight.JournaloftheAmericanSocietyforInformation ScienceandTechnology,61(7),1476–1486.

Waltman,L.,Yan,E.,&vanEck,N.J.(2011).Arecursiveﬁeld-normalizedbibliometricperformanceindicator:Anapplicationtotheﬁeldoflibraryand informationscience.Scientometrics,89(1),301–314.

Waltman,L.,vanEck,N.J.,&vanRaan,A.F.J.(2012).Universalityofcitationdistributionsrevisited.JournaloftheAmericanSocietyforInformationScience andTechnology,63(1),72–77.

Waltman,L.(2016).Areviewoftheliteratureoncitationimpactindicators.JournalofInformetrics,10(2),365–391.

Yao,L.,Wei,T.,Zeng,A.,Fan,Y.,&Di,Z.(2014).Rankingscientiﬁcpublications:Theeffectofnonlinearity.ScientiﬁcReports,4,6663.

Zhang,Z.,Cheng,Y.,&Liu,N.C.(2014).Comparisonoftheeffectofmean-basedmethodandz-scoreforﬁeldnormalizationofcitationsatthelevelofweb ofsciencesubjectcategories.Scientometrics,101(3),1679–1693.

Zhou,J.,Zeng,A.,Fan,Y.,&Di,Z.(2016).Rankingscientiﬁcpublicationswithsimilarity-preferentialmechanism.Scientometrics,106(2),805–816. Zitt,M.,Ramanana-Rahary,S.,&Bassecoulard,E.(2005).Relativityofcitationperformanceandexcellencemeasures:Fromcross-ﬁeldtocross-scale

effectsofﬁeld-normalisation.Scientometrics,63(2),373–401.