Identification of milestone papers through time-balanced network centrality


Manuel Sebastian Mariani a,∗, Matúš Medo a, Yi-Cheng Zhang a,b

a Department of Physics, University of Fribourg, 1700 Fribourg, Switzerland

b Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, P.R. China

Citations between scientific papers and related bibliometric indices, such as the h-index for authors and the impact factor for journals, are being increasingly used – often in controversial ways – as quantitative tools for research evaluation. Yet, a fundamental research question remains open: to what extent do quantitative metrics capture the significance of scientific works? We analyze the network of citations among the 449,935 papers published by the American Physical Society (APS) journals between 1893 and 2009, and focus on the comparison of metrics built on the citation count with network-based metrics. We contrast five article-level metrics with respect to the rankings that they assign to a set of fundamental papers, called Milestone Letters, carefully selected by the APS editors for “making long-lived contributions to physics, either by announcing significant discoveries, or by initiating new areas of research”. A new metric, which combines PageRank centrality with the explicit requirement that paper score is not biased by paper age, is the best-performing metric overall in identifying the Milestone Letters. The lack of time bias in the new metric also makes it possible to use it to compare papers of different age on the same scale. We find that network-based metrics identify the Milestone Letters better than metrics based on the citation count, which suggests that the structure of the citation network contains information that can be used to improve the ranking of scientific publications. The methods and results presented here are relevant for all evolving systems where network centrality metrics are applied, for example the World Wide Web and online social networks. An interactive Web platform where it is possible to view the ranking of the APS papers by rescaled PageRank is available at the address http://www.sciencenow.info.

1. Introduction

The notion of quantitative evaluation of scientific impact builds on the basic idea that the scientific merits of papers (Narin, 1976; Radicchi, Fortunato, & Castellano, 2008), scholars (Egghe, 2006; Hirsch, 2005), journals (Bollen, Rodriquez, & Van de Sompel, 2006; Liebowitz & Palmer, 1984; Pinski & Narin, 1976), universities (Kinney, 2007; Molinari & Molinari, 2008) and countries (Cimini, Gabrielli, & Labini, 2014; King, 2004) can be gauged by metrics based on the received citations. The respective field, referred to as bibliometrics or scientometrics, is undergoing a rapid growth (Van Noorden, 2010) fueled by the increasing availability of massive citation datasets collected by both academic journals and online platforms, such as Google Scholar and Web of Science. The possible benefits, drawbacks and long-term effects of the use of bibliometric indices are being highly debated by scholars from diverse fields (Hicks, Wouters, Waltman, de Rijcke, & Rafols, 2015; Lawrence, 2008; Van Raan, 2005; Weingart, 2005; Werner, 2015).

∗ Corresponding author. E-mail address: manuel.mariani@unifr.ch (M.S. Mariani).

Published in "Journal of Informetrics 10(4): 1207–1223, 2016", which should be cited to refer to this work.

Although some effort has been devoted to contrast different metrics with respect to their ability to single out seminal papers (Dunaiski & Visser, 2012; Dunaiski, Visser, & Geldenhuys, 2016; Yao, Wei, Zeng, Fan, & Di, 2014; Zhou, Zeng, Fan, & Di, 2015), differences among the adopted benchmarking procedures and diverse conclusions of the mentioned references leave a fundamental question still open: which metric of scientific impact best agrees with expert-based perception of significance? In agreement with Wasserman, Zeng, and Amaral (2015), the significance of a scientific work is intended here as its enduring importance within the scientific community.

To address this question, we focus on a list of 87 physics papers of outstanding significance – called Milestone Letters – recently made available by the American Physical Society (APS) [http://journals.aps.org/prl/50years/milestones, accessed 25-11-2015]. According to the APS editors’ description, the Milestone Letters “have made long-lived contributions to physics, either by announcing significant discoveries, or by initiating new areas of research”. These articles have been carefully selected by the editors of the APS, and the choices are motivated in detail on the webpage; the fact that a large fraction of them led to a Nobel Prize for some of their authors is an indication of the exceptional level of the selected works.

In this work, we analyze the network of citations between the N = 449,935 papers published in APS journals from 1893 until 2009 to compare five article-level metrics with respect to the ranking position they assign to the Milestone Letters. A reliable expert-based evaluation of the significance (intended as enduring importance, as in Wasserman et al., 2015) of a paper necessarily requires a time lag between the paper’s publication date and the expert’s judgment. For example, there is a time interval of 14 years between the most recent Milestone Letter (from 2001) and the year in which the list of Milestone Letters was released (2015). However, we show that a well-designed quantitative metric offers us the opportunity to detect potentially significant papers relatively shortly after their publication – an aspect often neglected in the evaluation of bibliometric indicators. To show this, we study how the ability of the different metrics to identify the Milestone Letters changes with paper age.

A plethora of quantitative metrics exist and could be studied in principle. Our focus here is narrowed to metrics that rely on a diffusion process on the underlying network of citations between papers and their comparison with simple citation count. The five metrics considered in this work are thus: the citation count, PageRank (introduced by Brin & Page, 1998), CiteRank (introduced by Walker, Xie, Yan, & Maslov, 2007), rescaled citation count (introduced by Newman, 2009), and the novel rescaled PageRank. PageRank is a classical network centrality metric which combines a random walk along network links with a random teleportation process. The metric has been applied to a broad range of real-world problems (Ermann, Frahm, & Shepelyansky, 2015; Franceschet, 2011; see Gleich, 2015 for a review), including ranking academic papers (Chen, Xie, Maslov, & Redner, 2007; Yao et al., 2014), journals (Bollen et al., 2006; González-Pereira, Guerrero-Bote, & Moya-Anegón, 2010) and authors (Nykl, Ježek, Fiala, & Dostal, 2014; Radicchi, Fortunato, Markines, & Vespignani, 2009; Yan & Ding, 2009) (see Waltman & Yan, 2014 for a review of the applications of PageRank-related methods to bibliometric analysis).

To overcome PageRank’s well-known bias toward old nodes in citation data (studied in detail by Chen et al., 2007; Mariani, Medo, & Zhang, 2015), the CiteRank algorithm introduces exponential penalization of old nodes, resulting in a node score that captures well the future citation count increase of the papers and, for this reason, can be considered as a reasonable proxy for network traffic, as shown by Walker et al. (2007). However, we show below that CiteRank score does not allow one to fairly compare papers of different age. Rescaled citation count and rescaled PageRank are derived from citation count and PageRank score, respectively, by explicitly requiring that paper score is not biased by age – the adopted rescaling procedure is conceptually close to the methods recently developed by Radicchi et al. (2008), Newman (2009), Radicchi and Castellano (2011), Newman (2014), Radicchi and Castellano (2012b), Radicchi and Castellano (2012a), Crespo, Ortuño-Ortín, and Ruiz-Castillo (2012) and Kaur, Ferrara, Menczer, Flammini, and Radicchi (2015) to suppress biases by age and field in the evaluation of academic agents. We find that the rankings produced by the rescaled scores are indeed consistent with the hypothesis that the rankings are not biased by age.

We find that PageRank can compete with and even outperform rescaled PageRank in identifying old milestone papers, but completely fails to identify recent milestone papers due to its temporal bias. CiteRank can compete with and even outperform rescaled PageRank in identifying recent milestone papers, but markedly underperforms in identifying old milestone papers due to its built-in exponential penalization for older papers. Indicators based on simple citation count are outperformed by rescaled PageRank for papers of every age. This leads us to the conclusion that rescaled PageRank is the best-performing metric overall. With respect to previous works by Chen et al. (2007), Dunaiski and Visser (2012), Fiala (2012) and Dunaiski et al. (2016) that claimed the superiority of network-based metrics in identifying important papers, our results clarify the essential role of paper age in determining the metrics’ performance: rescaled PageRank excels and PageRank performs poorly in identifying MLs shortly after their publication, and the performance of the two methods becomes comparable only 15 years after the MLs are published. Qualitatively similar results are found for an alternative list of APS outstanding papers which only includes works that have led to a Nobel Prize for some of the authors (the list is provided in Table S2).

Our results indicate that network centrality and time-balance are two essential ingredients – though neglected by popular bibliometric indicators such as the h-index for scholars (Hirsch, 2005) and impact factor for journals (Garfield, 1972) – for an effective detection of significant papers. This sets a new benchmark for article-level metrics and quantitatively supports the paradigm that considering the whole network instead of simple citation count can bring substantial benefits to the ranking of academic agents. In a broader context, our results show that a direct rescaling of PageRank scores is an effective method to solve PageRank’s well-known bias against recent network nodes. We emphasize that while scientific papers are the focus of this work, the addressed research question is general and can emerge when estimating the importance of any creative work – such as movies (Spitz & Horvát, 2014; Wasserman et al., 2015) – for which quantitative impact metrics and expert-based significance assessments are simultaneously available. The potential broader applications and possible limitations of our results are discussed in the Discussion section.

2. Metrics

We consider five article-level metrics: citation count c, PageRank score p, CiteRank score T, rescaled PageRank score R(p), and rescaled citation count R(c).

2.1. Citation count

We denote by A the network’s adjacency matrix ($A_{ij}$ is one if node j points to node i and zero otherwise). Citation count (referred to as indegree in network science literature, see Newman, 2010) is one of the simplest metrics to infer node centrality in a network, being simply defined as $c_i = \sum_j A_{ij}$ for a node i. Citation count is the building block of the majority of metrics for assessing the impact of single papers, authors, journals (for a review of citation-based impact indicators see Waltman, 2016).
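As a minimal sketch of this definition (not code from the paper), the citation count can be obtained from a list of (citing, cited) pairs. The function name `citation_counts`, the 0-based integer paper labels and the toy network are assumptions made purely for illustration.

```python
from collections import Counter


def citation_counts(citations, n_papers):
    """Return c_i = sum_j A_ij for papers labelled 0..n_papers-1.

    `citations` is an iterable of (citing, cited) pairs, so that
    A[cited][citing] = 1. Duplicate pairs are counted once.
    """
    counts = Counter(cited for _citing, cited in set(citations))
    return [counts.get(i, 0) for i in range(n_papers)]


# Hypothetical toy network: papers 1 and 2 cite paper 0; paper 2 also cites paper 1.
toy_citations = [(1, 0), (2, 0), (2, 1)]
print(citation_counts(toy_citations, 3))  # [2, 1, 0]
```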

2.2. PageRank

The PageRank score vector was introduced by Brin and Page (1998), and can be defined as the stationary state of a process which combines a random walk along the network links and random teleportation. In a directed monopartite network composed of N nodes, the vector of PageRank scores $\{p_i\}$ can be found as the stationary solution of the following set of recursive linear equations

$$p_i^{(n+1)} = \alpha \sum_{j:\,k_j^{\mathrm{out}} > 0} A_{ij}\,\frac{p_j^{(n)}}{k_j^{\mathrm{out}}} + \alpha \sum_{j:\,k_j^{\mathrm{out}} = 0} \frac{p_j^{(n)}}{N} + \frac{1-\alpha}{N}, \tag{1}$$

where $k_j^{\mathrm{out}} := \sum_l A_{lj}$ is the outdegree of node j, $\alpha$ is the teleportation parameter, and n is the iteration number. Eq. (1) represents the master equation of a diffusion process on the network, which converges to a unique stationary state independently of the initial condition (see Berkhin, 2005 for the mathematical details). The PageRank score $p_i$ of node i can be interpreted as the average fraction of time spent on node i by a random walker who with probability $\alpha$ follows the network’s links and with probability $1-\alpha$ teleports to a random node. Throughout this paper, we set $\alpha = 0.5$ which is the usual choice for citation data (Chen et al., 2007).
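The iteration in Eq. (1) can be sketched in a few lines of plain Python. This is an illustrative implementation under stated assumptions, not the code used for the paper: the function name `pagerank`, the (citing, cited) edge-list input, and the stopping parameters `tol` and `max_iter` are all hypothetical choices.

```python
def pagerank(citations, n_papers, alpha=0.5, tol=1e-12, max_iter=1000):
    """Iterate Eq. (1) until the scores stop changing.

    `citations` holds (citing, cited) pairs; alpha=0.5 is the value used in
    the text. tol and max_iter are illustrative stopping choices.
    """
    edges = set(citations)
    out_degree = [0] * n_papers
    for citing, _cited in edges:
        out_degree[citing] += 1

    p = [1.0 / n_papers] * n_papers  # any normalized starting vector works
    for _ in range(max_iter):
        # Score held by papers without outgoing citations is spread uniformly.
        dangling = sum(p[j] for j in range(n_papers) if out_degree[j] == 0)
        new_p = [(alpha * dangling + 1.0 - alpha) / n_papers] * n_papers
        # Random-walk term: each citing paper shares its score among its references.
        for citing, cited in edges:
            new_p[cited] += alpha * p[citing] / out_degree[citing]
        change = sum(abs(a - b) for a, b in zip(new_p, p))
        p = new_p
        if change < tol:
            break
    return p


# Hypothetical toy network: papers 1 and 2 cite paper 0; paper 2 also cites paper 1.
scores = pagerank([(1, 0), (2, 0), (2, 1)], 3)
print(scores)  # paper 0, the oldest and most cited, receives the largest score
```

Each step conserves the total score, so the entries of `p` keep summing to one. For a network of the size studied here a sparse-matrix implementation would be preferable to this edge loop.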

2.3. CiteRank

To correct PageRank’s strong temporal bias in citation networks, the CiteRank algorithm (introduced by Walker et al., 2007) introduces an ad hoc penalization for older nodes. The CiteRank score T is defined similarly to PageRank; unlike PageRank, in the CiteRank equations the teleportation probability decays exponentially with paper age with a certain time scale $\tau$. According to Walker et al. (2007) and Maslov and Redner (2008), this choice of the teleportation vector is intended to favor the recent nodes and thus lead to a score that better represents papers’ relevance for the current lines of research. Using the same notation as Eq. (1), the vector of CiteRank scores $\{T_i\}$ can be found as the stationary solution of the following set of recursive linear equations

$$T_i^{(n+1)} = \alpha \sum_{j:\,k_j^{\mathrm{out}} > 0} A_{ij}\,\frac{T_j^{(n)}}{k_j^{\mathrm{out}}} + \alpha \sum_{j:\,k_j^{\mathrm{out}} = 0} \frac{T_j^{(n)}}{N} + (1-\alpha)\,\frac{\exp(-(t-t_i)/\tau)}{\sum_{j=1}^{N} \exp(-(t-t_j)/\tau)}, \tag{2}$$

where we denote by $t_i$ the publication date of paper i and t the time at which the scores are computed. Throughout this paper we set $\alpha = 0.5$ and $\tau = 2.6$ years, which are the parameters chosen by Walker et al. (2007). The performance of the algorithm for other values of the parameter $\tau$ is discussed in the caption of Fig. E.10, in Appendix E. We show below that exponential penalization of older nodes is not effective in removing PageRank’s bias, and propose instead a rescaled PageRank score R(p) whose average value and standard deviation do not depend on paper age.
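Eq. (2) differs from Eq. (1) only in the teleportation term, which the following sketch makes explicit. It is an illustration under assumptions rather than the authors' implementation: the function name `citerank`, the (citing, cited) edge-list input, publication times given in years, and the stopping parameters are hypothetical.

```python
import math


def citerank(citations, pub_times, t_now, alpha=0.5, tau=2.6,
             tol=1e-12, max_iter=1000):
    """Iterate Eq. (2); pub_times[i] is the publication date of paper i in years.

    alpha=0.5 and tau=2.6 years are the values quoted in the text;
    tol and max_iter are illustrative stopping choices.
    """
    n = len(pub_times)
    edges = set(citations)
    out_degree = [0] * n
    for citing, _cited in edges:
        out_degree[citing] += 1

    # Teleportation probability decays exponentially with paper age.
    weights = [math.exp(-(t_now - t_i) / tau) for t_i in pub_times]
    total = sum(weights)
    teleport = [w / total for w in weights]

    T = teleport[:]  # normalized starting vector
    for _ in range(max_iter):
        dangling = sum(T[j] for j in range(n) if out_degree[j] == 0)
        new_T = [alpha * dangling / n + (1.0 - alpha) * teleport[i]
                 for i in range(n)]
        for citing, cited in edges:
            new_T[cited] += alpha * T[citing] / out_degree[citing]
        change = sum(abs(a - b) for a, b in zip(new_T, T))
        T = new_T
        if change < tol:
            break
    return T


# Two hypothetical uncited papers, nine years apart: the recent one scores higher.
print(citerank([], [2000.0, 2009.0], t_now=2009.0))
```

When all papers share the same publication date the teleportation vector is uniform and the sketch reduces to the iteration of Eq. (1).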

2.4. Rescaled PageRank and rescaled citation count

To compute the rescaled PageRank score R(p) for a given paper i, we evaluate the paper’s PageRank score $p_i$ as well as the mean $\mu_i(p)$ and standard deviation $\sigma_i(p)$ of PageRank score for papers published in a similar time as i. Time is not measured in days or years, but in number n of published papers; after labeling the papers in order of decreasing age, $\mu_i(p)$ and $\sigma_i(p)$ are computed over papers $j \in [i - \Delta_p/2,\, i + \Delta_p/2]$. The parameter $\Delta_p$ represents the number of papers in the averaging window of each paper.¹ The rescaled score $R_i(p)$ of paper i is then computed as

$$R_i(p) = \frac{p_i - \mu_i(p)}{\sigma_i(p)}. \tag{3}$$

Values of R(p) larger or smaller than zero indicate whether the paper is out- or under-performing, respectively, with respect to papers of similar age. $R_i(p)$ represents the z-score (Kreyszig, 2010) of paper i within its averaging window. For the sake of completeness, we have also tested a simpler rescaled score in the form $R_i^{(\mathrm{ratio})}(p) = p_i/\mu_i(p)$; however, $R^{(\mathrm{ratio})}(p)$ fails to produce a time-balanced ranking due to the fact that $\sigma(p)/\mu(p)$ strongly depends on paper age (see Appendix C for details). In addition, we tested a rescaled score $R^{(\mathrm{year})}(p)$ based on Eq. (3) where $\mu_i(p)$ and $\sigma_i(p)$ are computed over the papers published in the same year as paper i. We found that while $R^{(\mathrm{year})}(p)$ is able to suppress large part of PageRank’s temporal bias, its ranking is much less in agreement with the hypothesis of unbiased ranking than the ranking by R(p) (see Appendix C for details). For this reason, we use an averaging window based on number of publications and not on real time. This choice is also supported by the findings by Newman (2009) and Parolo et al. (2015) which suggest that the role of time in citation networks is better captured by the number of published papers than by real time.

We define the rescaled citation count analogously as

$$R_i(c) = \frac{c_i - \mu_i(c)}{\sigma_i(c)}, \tag{4}$$

where $\mu_i(c)$ and $\sigma_i(c)$ represent the mean and the standard deviation of c computed over papers $j \in [i - \Delta_c/2,\, i + \Delta_c/2]$. Citation count rescaling was used by Newman (2009) and Newman (2014) to identify papers that accrue more citations than expected for papers of similar age under the hypothesis of pure preferential attachment.
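The rescaling of Eqs. (3) and (4) amounts to a z-score within a sliding window of papers of similar age, which can be sketched as follows. The function name `rescaled_scores` is hypothetical, the use of the population standard deviation is an assumption (the text does not specify the estimator), and for simplicity every window here holds exactly `window` papers, with boundary windows clamped to the oldest and most recent `window` papers as described in footnote 1.

```python
import statistics


def rescaled_scores(scores, window=1000):
    """Z-score of each paper within a window of papers of similar age.

    `scores` must be ordered from the oldest to the most recent paper and may
    hold either PageRank scores (Eq. 3) or citation counts (Eq. 4).
    """
    n = len(scores)
    window = min(window, n)
    half = window // 2
    rescaled = []
    for i in range(n):
        # Clamp the window at both ends so it always contains `window` papers.
        start = min(max(i - half, 0), n - window)
        block = scores[start:start + window]
        mu = statistics.fmean(block)
        sigma = statistics.pstdev(block)  # population std: an assumption
        rescaled.append((scores[i] - mu) / sigma if sigma > 0 else 0.0)
    return rescaled


# Hypothetical scores that grow steadily with the paper index.
print(rescaled_scores([1, 2, 3, 4, 5, 6], window=3))
```

Recomputing the mean and standard deviation for every paper costs on the order of N times the window size; running sums would be a natural optimization for a dataset with hundreds of thousands of papers.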

The choice of the size of the temporal window deserves some attention: if the size of the temporal window is too large, one would fall again in a time-biased ranking that is one of the issues that motivate the present paper. On the other hand, if we choose a too small averaging window, the papers would be only compared with few other papers and the resulting scores would be too volatile. Throughout this paper, we set $\Delta_c = \Delta_p = 1000$; we refer to Appendix D for further details on the dependence of ranking properties on the averaging window size. We stress that the rankings by R(c) and R(p) are only weakly dependent on $\Delta_c$ and $\Delta_p$ (see Fig. D.9), and the correlation between the rankings by R(p) obtained with different values of PageRank’s teleportation parameter $\alpha$ is close to one (Spearman’s rank correlation coefficient between the rankings obtained with $\alpha = 0.5$ and $\alpha = 0.85$ is equal to 0.98). These results indicate that the proposed rescaling metrics are robust with respect to variations of their parameters.

3. Results

We analyzed the network composed of L = 4,672,812 citations among N = 449,935 papers published in APS journals (1893–2009). The dataset was directly provided by the APS following our request at the webpage http://journals.aps.org/datasets, and was also studied by, among others, Medo, Cimini, and Gualdi (2011).

3.1. Time balance of the rankings

Before comparing the performances of the five metrics in recognizing the Milestone Letters (MLs), we want to determine whether the metrics are biased by age and, if so, to what extent. In agreement with Radicchi et al. (2008) and Radicchi and Castellano (2011), we assume that a fair ranking of scientific papers should be time-balanced in the sense that old and recent papers should be equally likely to appear at the top of the ranking by the metric. Caveats and possible weak points of this assumption are examined in the Discussion section.

To assess the degree of time balance of the five metrics, we perform a statistical test similar to those proposed by Radicchi and Castellano (2011) and Radicchi and Castellano (2012b). We divide the papers into S = 40 different groups according to their age and, for each metric, we compute the number $n_\alpha(z)$ of top-zN papers by the metric for each age group $\alpha$, and quantitatively compare the resulting histogram $\{n_\alpha(z)\}$ with the expected histogram $\{n_\alpha^{(0)}(z)\}$ under the hypothesis that the ranking is temporally unbiased. We set z = 0.01; results for other small values of z are qualitatively similar.

Fig. 1A shows that the observed values of n(0.01) for PageRank are far from their expected values under the hypothesis of unbiased ranking. For instance, $n_1(0.01)/n_1^{(0)}(0.01) = 4.62$ for the age group that contains the oldest N/40 papers, as opposed to $n_{40}(0.01) = 0$ for the age group composed of the most recent N/40 papers. To quantify the degree of time balance of a metric, we compare the standard deviation $\sigma$ of the observed histogram $\{n_\alpha(0.01)\}$ with the expected standard deviation $\sigma_0$ under the hypothesis of unbiased ranking. For a perfectly unbiased ranking, the number $n_\alpha^{(0)}$ of nodes from age group $\alpha$ in the top-z by the ranking obeys the multivariate hypergeometric distribution (Radicchi & Castellano, 2012b). Therefore, we expect on average $n^{(0)}(z) = zN/S$ top-zN papers for each set, with the standard deviation

$$\sigma_0(z) = \sqrt{\frac{zN}{S}\left(1 - \frac{1}{S}\right)(1 - z)\,\frac{N}{N-1}}. \tag{5}$$

The observed standard deviation $\sigma(z)$ is computed as

$$\sigma(z) = \sqrt{\frac{1}{S}\sum_{\alpha=1}^{S}\left(n_\alpha - n_\alpha^{(0)}\right)^2}. \tag{6}$$

The ratio $\sigma/\sigma_0$ between observed and expected standard deviation quantifies the degree of time balance of the ranking – we expect this ratio to be close to or lower than (due to fluctuations) one for an unbiased ranking, and significantly larger than one for a ranking biased by age. To quantify to which extent the observed values of $\sigma/\sigma_0 - 1$ are consistent with the hypothesis of unbiased ranking, we run a simulation where 0.01 N papers are randomly assigned to one among 40 groups, and compute the standard deviation $\sigma_{\mathrm{dev}}$ of the observed deviation $\sigma_{\mathrm{rand}}/\sigma_0 - 1$ according to Eq. (6). With $10^5$ realizations, we obtain $\sigma_{\mathrm{dev}} = 0.11$. We always express the observed values of $\sigma/\sigma_0 - 1$ as multiples of $\sigma_{\mathrm{dev}}$ in the following.

Fig. 1. Time balance of the network-based metrics. Panels (A, C, E) show the histogram of the number of papers from each paper age group in the top-1% of the ranking by PageRank score p, rescaled PageRank score R(p) and CiteRank score T, respectively (age group 1 and age group 40 contain the oldest and most recent N/40 papers, respectively). The horizontal black line represents the unbiased value $n^{(0)}(0.01) = 0.01\,N/40$; the gray-shaded area represents the interval $[n^{(0)} - \sigma_0,\, n^{(0)} + \sigma_0]$ with $\sigma_0$ given by Eq. (5). Panels (B, D, F) show the cumulative distributions of PageRank score p, rescaled PageRank score R(p) and CiteRank score T, respectively, for different age groups.

¹ In order to have the same number of papers in each averaging window, a different definition of averaging window is needed for the oldest and the most recent $\Delta_p/2$ papers, for which we compute $\mu_i$ and $\sigma_i$ over the papers $j \in [1, \Delta_p]$ and $j \in (N - \Delta_p, N]$, respectively.
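The test built on Eqs. (5) and (6) can be sketched as follows. This is an illustration under assumptions and not the authors' code: all function names are hypothetical, age groups are formed by splitting the age-ordered index range into equal parts, and the null simulation uses far fewer realizations than the $10^5$ quoted above to keep the run time short.

```python
import math
import random
import statistics


def expected_std(n_papers, z=0.01, n_groups=40):
    """sigma_0(z) of Eq. (5) for an unbiased ranking."""
    return math.sqrt(z * n_papers / n_groups * (1 - 1 / n_groups)
                     * (1 - z) * n_papers / (n_papers - 1))


def observed_std(top_counts, n_papers, z=0.01):
    """sigma(z) of Eq. (6); top_counts[a] = number of top-zN papers in age group a."""
    n_groups = len(top_counts)
    expected = z * n_papers / n_groups
    return math.sqrt(sum((n - expected) ** 2 for n in top_counts) / n_groups)


def time_imbalance(scores, z=0.01, n_groups=40):
    """sigma/sigma_0 - 1 for scores ordered from the oldest to the most recent paper."""
    n = len(scores)
    n_top = round(z * n)
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:n_top]
    counts = [0] * n_groups
    for i in top:
        counts[i * n_groups // n] += 1  # age group of paper i
    return observed_std(counts, n, z) / expected_std(n, z, n_groups) - 1


def null_spread(n_papers, z=0.01, n_groups=40, realizations=200, seed=0):
    """Spread of sigma_rand/sigma_0 - 1 when top papers fall into groups at random."""
    rng = random.Random(seed)
    n_top = round(z * n_papers)
    sigma_0 = expected_std(n_papers, z, n_groups)
    deviations = []
    for _ in range(realizations):
        counts = [0] * n_groups
        for _ in range(n_top):
            counts[rng.randrange(n_groups)] += 1
        deviations.append(observed_std(counts, n_papers, z) / sigma_0 - 1)
    return statistics.pstdev(deviations)


print(expected_std(449935))  # sigma_0 for the APS dataset size
print(null_spread(449935))   # expected to land near the 0.11 reported in the text
```

A score that decreases monotonically with the paper index, so that all top papers sit in the oldest group, yields a large positive value of `time_imbalance`, whereas randomly assigned scores yield a value close to zero.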

We obtain $\sigma/\sigma_0 - 1 = 12.91 = 117.36\,\sigma_{\mathrm{dev}}$ for PageRank, which indicates that the ranking is heavily biased. The heavy bias of PageRank score is also revealed by a comparison of its distribution for nodes from different age groups, which shows a clear advantage for old nodes (Fig. 1B). Fig. 1C shows that the ranking by the R(p) score is in good agreement with the hypothesis that the ranking is unbiased; we find $\sigma/\sigma_0 - 1 = 0.16 = 1.45\,\sigma_{\mathrm{dev}}$. The time balance of rescaled PageRank score manifests itself in the collapse of the distributions of the R(p) score for different age groups on a unique curve, which means that the R(p) score allows us to compare papers of any age on the same scale (Fig. 1D). In a similar way, the rescaling procedure suppresses the temporal bias of citation count [$\sigma/\sigma_0 - 1 = 0.10 = 0.91\,\sigma_{\mathrm{dev}}$ for R(c) as compared to $\sigma/\sigma_0 - 1 = 6.01 = 54.64\,\sigma_{\mathrm{dev}}$ for c, see Fig. 2]. We observe a qualitatively similar suppression of time bias for different choices of the number S of age groups (not shown here).

With respect to the histogram obtained with R(p), the histogram $\{n_\alpha(0.01)\}$ obtained with the CiteRank algorithm (with the parameters chosen by Maslov and Redner (2008)) presents much larger deviation from the histogram expected under the hypothesis of time-balanced ranking (see Fig. 1E). As a result, the value of $\sigma/\sigma_0$ obtained for CiteRank ($\sigma/\sigma_0 - 1 = 1.75 = 15.91\,\sigma_{\mathrm{dev}}$ with the parameters chosen by Walker et al. (2007)) is larger than the value obtained for R(p). The distributions of CiteRank score T for different age groups do not collapse on a single curve (see Fig. 1F), which is directly due to the built-in exponential decay of the teleportation term. The failure of CiteRank in producing a time-balanced ranking is well exemplified by the behavior of the score distribution for the most recent age group, whose minimum score (i.e., the smallest score value such that P(>T) deviates from one) is much larger than for the other distributions, due to a larger teleportation term. These findings show that CiteRank score does not allow us to fairly compare papers of different age.

Fig. 2. Time balance of the citation-based metrics. Panels (A, B) show the histogram of the number of top-1% papers for each paper age group in the ranking by citation count c and rescaled citation count R(c), respectively. Panels (C, D) show the cumulative distributions for different age groups of citation count c and rescaled citation count R(c), respectively.

The values of $\sigma/\sigma_0 - 1$ for the five metrics are summarized in Table 1.

3.2. Identification of the Milestone Letters

In the previous section, we have shown that the rankings by the rescaled metrics R(p) and R(c) are consistent with the hypothesis that the ranking is not biased by paper age. While different works have recently emphasized the importance of removing the bias by age of citation-performance metrics for a fair ranking of scientific publications (Radicchi & Castellano, 2011; Radicchi et al., 2008) and researchers (Kaur, Radicchi, & Menczer, 2013), the possible positive effects of time-balanced rankings with respect to biased rankings remain largely unexplored.

Chen et al. (2007) analyzed the APS dataset and found that PageRank is able to recognize old papers that are universally important for physics. They also noted that PageRank is based on a diffusion process that drifts towards old papers (see Mariani et al., 2015 for a general analysis of this aspect) and, as a consequence, it inevitably favors old papers. Since the rescaling procedure that we propose solves this issue, it is thus plausible to conjecture that with respect to the PageRank algorithm, rescaled PageRank allows us to identify seminal papers earlier.

In this section, we use the APS dataset and the list of Milestone Letters (MLs) chosen by editors of Physical Review Letters (see Supplementary Table S1 for the list of MLs) to address the two following research questions:

1. Is there a significant gap between the performance of rescaled PageRank and PageRank in identifying the MLs shortly after publication? If there is a substantial gap, does it close a certain number of years after publication?

2. Do network-based indicators outperform indicators based on simple citation count in recognizing the MLs?

Table 1
The five considered metrics and their bias by age. The difference $\sigma/\sigma_0 - 1$ quantifies how much the histogram of the number of top-1% papers by the metric deviates from the histogram expected under the hypothesis of ranking not biased by age (see the main text). The values of $\sigma/\sigma_0 - 1$ are expressed as multiples of their expected value $\sigma_{\mathrm{dev}} = 0.11$ for a random ranking of the papers (computed as explained in the main text). Values of $\sigma/\sigma_0 - 1$ smaller than $2\sigma_{\mathrm{dev}} = 0.22$ are reported in bold characters.

| Metric | Properties | $\sigma/\sigma_0 - 1$ |
|---|---|---|
| Citation count c | Local metric | 54.64 $\sigma_{\mathrm{dev}}$ |
| PageRank score p | Network-based metric | 117.36 $\sigma_{\mathrm{dev}}$ |
| CiteRank T | Network-based metric, time-aware | 15.91 $\sigma_{\mathrm{dev}}$ |
| Rescaled PageRank R(p) | Network-based metric, time-aware | **1.45 $\sigma_{\mathrm{dev}}$** |
| Rescaled citation count R(c) | Local metric, time-aware | **0.91 $\sigma_{\mathrm{dev}}$** |

Fig. 3. Metrics’ performance in ranking the milestone letters (listed in the Supplementary Table S1) as a function of paper age. (A) Dependence of the average ranking ratio $\bar{r}$ on paper age. (B) Dependence of the identification rate $f_{0.01}$ on paper age. (C) Dependence of the normalized identification rate $\tilde{f}_{0.01}$ on paper age.

To compare the ranking positions of the MLs by the five different metrics, the ranking of Milestone Letter i is computed t years after its publication. We calculate the ratio of i’s ranking position $r_i(s,t)$ by metric s and i’s best ranking position $\min_s\{r_i(s,t)\}$ among all considered metrics. To characterize the overall performance of metric s in ranking the MLs, we average the ranking ratio over i and obtain $\bar{r}(s,t)$ (see Appendix F for computation details). The resulting quantity is referred to as the average ranking ratio of metric s for the Milestone Letters t years after their publication. A good metric is expected to have as low $\bar{r}(\cdot,t)$ as possible – the minimum value $\bar{r}(\cdot,t) = 1$ is only achieved by a metric that always outperforms the others in ranking the milestone papers of age t. Note that the average ranking ratio reduces to average ranking position if we do not normalize the ranking position $r_i(s,t)$ by $\min_s\{r_i(s,t)\}$. However, the average ranking position of the target papers by a certain metric is extremely sensitive to the ranking positions of the least-cited target papers, as opposed to the robustness of the average ranking ratio with respect to removal of the least-cited papers from the set of target papers (see Appendix A for details). This property motivates the use of ranking ratio to compare the ranking positions of the MLs by the different metrics.
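For a fixed paper age t, the average ranking ratio can be sketched as below. The function name `average_ranking_ratio`, the dictionary layout and the toy ranking positions are hypothetical, and the computation details of Appendix F (for instance, which MLs enter the average at each age) are not reproduced here.

```python
def average_ranking_ratio(ranks_by_metric):
    """Average ranking ratio of each metric at a fixed paper age t.

    ranks_by_metric[s][i] is the ranking position (1 = top) of target paper i
    by metric s. Each position is divided by the best position that paper
    achieves under any metric, then averaged over the target papers.
    """
    metrics = list(ranks_by_metric)
    n_targets = len(ranks_by_metric[metrics[0]])
    best = [min(ranks_by_metric[s][i] for s in metrics) for i in range(n_targets)]
    return {
        s: sum(ranks_by_metric[s][i] / best[i] for i in range(n_targets)) / n_targets
        for s in metrics
    }


# Hypothetical ranking positions of two target papers under three metrics.
toy_ranks = {"c": [100, 50], "p": [10, 200], "R(p)": [5, 40]}
print(average_ranking_ratio(toy_ranks))  # {'c': 10.625, 'p': 3.5, 'R(p)': 1.0}
```

In this toy case R(p) attains the minimum value of one because it gives the best position to both target papers.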

The dependence of $\bar{r}(s,t)$ on paper age t measured in years after publication is shown in Fig. 3A. Due to the suppression of time bias, rescaled PageRank score R(p) has a large advantage with respect to the original PageRank score p for papers of small age. Since the PageRank algorithm is biased towards old nodes, the performance gap between R(p) and p gradually decreases with age and vanishes 18 years after publication. By contrast, the CiteRank algorithm exponentially penalizes older nodes and, as a consequence, the performance gap between R(p) and T is minimal for recent papers, and CiteRank score T can even outperform R(p) during the first six years after publication. When paper age becomes sufficiently larger than CiteRank’s temporal time scale ($\tau = 2.6$ years here, as chosen by Walker et al. (2007) and Maslov and Redner (2008)), older papers are strongly penalized by the CiteRank’s teleportation term and, as a result, CiteRank is markedly outperformed by rescaled PageRank. The same behavior is observed also for other values of CiteRank time-decay parameter $\tau$ (see Appendix E). The local metrics c and R(c) are outperformed by R(p) in ranking the MLs of every age, which indicates that network centrality brings a substantial advantage in ranking highly significant papers with respect to simple and rescaled citation count.

While the average ranking ratio $\bar{r}$ takes into account all the MLs, it is also interesting to measure the age-dependence of the identification rates of the metrics, defined as the fraction $f_x(t)$ of MLs that were ranked among the top xN papers by the metric when they were t years old² (see Fig. 3B). Rescaled PageRank R(p) and CiteRank score T markedly outperform the other metrics in identifying the milestone papers in the first years after publication. The performance gap between R(p) and the citation-based indicators c and R(c) remains significant during the whole observation lapse. Analogously to what we observed for the average ranking ratio, the performance gap between R(p) and p gradually decreases with paper age and vanishes 15 years after publication, which is similar to the crossing point at 18 years after publication observed for the average ranking ratio. CiteRank has a small advantage with respect to rescaled PageRank in the first years after publication, whereas for older papers CiteRank’s identification rate drops to the value achieved by simple citation count c.
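The identification rate is simple to compute once the ranking positions of the target papers are known, as the following sketch shows. The function name `identification_rate` and the toy positions are hypothetical; in particular, which value of N applies when the MLs are t years old (for example, the number of papers in the network at that time) is an assumption the caller must settle.

```python
def identification_rate(target_ranks, n_papers, x=0.01):
    """Fraction of target papers ranked within the top x*N positions.

    target_ranks holds the ranking positions (1 = top) of the target papers
    by one metric at a fixed paper age; n_papers is the number N of ranked papers.
    """
    cutoff = x * n_papers
    hits = sum(1 for rank in target_ranks if rank <= cutoff)
    return hits / len(target_ranks)


# Hypothetical example: 4 target papers in a network of 10,000 papers.
print(identification_rate([1, 50, 200, 5000], 10000))  # 0.5
```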

It is worth observing that the temporal bias of a certain metric affects the behavior of both $\bar{r}(t)$ and $f_{0.01}(t)$ for that metric: as we observe in Appendix B, a metric biased towards old (like PageRank) or recent papers naturally performs better in identifying old or recent MLs, respectively. One natural way to understand this effect is to consider a normalized identification rate $\tilde{f}_{0.01}(t)$ (hereafter abbreviated as NIR), such that the contribution of each identified ML i of age t (i.e., a ML ranked in the top 0.01 N of the ranking) to $\tilde{f}_{0.01}(t)$ is smaller than one if the metric favors papers that belong to the same age group as paper i (see Appendix F for the mathematical definition). In other words, when evaluating the performance of a given metric, the normalized identification rate $\tilde{f}_{0.01}(t)$ takes into account both the temporal balance and the identification power of the metric. The behavior of $\tilde{f}_{0.01}(t)$ for the five metrics is shown in Fig. 3C. After an initial increasing trend for all the metrics, the normalized identification rate of both p and c declines due to their temporal bias; by contrast, the same quantity remains relatively stable for both R(p) and R(c). According to $\tilde{f}_{0.01}(t)$, rescaled PageRank outperforms CiteRank for papers of every age. This is due to the fact that the ranking by CiteRank is not unbiased and, as a consequence, CiteRank’s performance is often penalized by the NIR for small age t due to the algorithm’s bias towards recent nodes.

² The identification rate is related to recall, a standard measure in the literature of recommendation systems (Lü et al., 2012).

Fig. 4. Metrics’ performance in ranking the APS papers that led to Nobel prize for some of the authors, listed in the Supplementary Table S2. The figure has been realized with the same procedure used for Fig. 3. (A) Dependence of the average ranking ratio $\bar{r}$ on paper age. (B) Dependence of the identification rate $f_{0.01}$ on paper age. (C) Dependence of the normalized identification rate $\tilde{f}_{0.01}$ on paper age. We observe a behavior in qualitative agreement with that observed in Fig. 3.

Our analysis assumes that a ML should be ranked as high as possible by a good metric for scientific significance. On the other hand, many outstanding contributions to physics are not included in the list of MLs. To show that our results also hold for an alternative choice of groundbreaking papers, we consider a list of 67 APS papers that led to a Nobel Prize for some of the authors (see Supplementary Table S2 for the list of papers). The results for this list of benchmark papers are shown in Fig. 4 and are qualitatively similar to those shown in Fig. 3, which indicates that our findings are robust with respect to modifications of the benchmark papers’ list.

While Fig. 3 concerns the metrics’ performance averaged over the whole set of MLs, the Supplementary Movie shows the simultaneous dynamics of the ranking positions by p and R(p) of all individual MLs for the first 15 years after publication.³ The movie shows that rescaled score R(p) has a clear advantage with respect to PageRank score p in the first years after publication for most of the MLs. As the MLs become sufficiently old, their position in the plane gradually tends to converge to the diagonal where the ranking position by p is equal to the ranking position by R(p), which is in agreement with the crossing between PageRank’s and rescaled PageRank’s performance curves observed in Fig. 3A and B.

In principle, one might consider a comparison of the final ranking positions (i.e., the ranking positions computed on the whole dataset) of the target papers by a certain metric (Dunaiski et al., 2016; Dunaiski & Visser, 2012) instead of the age-dependent evaluation of the metrics introduced above. But this kind of comparison would miss our key point – the strong dependence of metrics’ performance on paper age. In addition, the strong dependence of metrics’ performance on paper age shown in this section makes the outcome of such an evaluation strongly dependent on the age distribution of the target papers we aim to identify. This issue is discussed in Appendix B and potentially concerns any performance evaluation carried out on a fixed snapshot of the network. By contrast, the outcomes presented in this section (how well the different metrics perform as a function of paper age) are only weakly sensitive to the exact age distribution of the target papers.

3.3. Top papers by PageRank and rescaled PageRank

To get an intuitive understanding of the properties of PageRank and its rescaled version, it is instructive to look at the top-15 papers according to p and R(p) computed on the whole dataset, reported in Tables 2 and 3, respectively. Although only one ML appears in the top 15 by p (ranked 6th, see Table 2), among the non-MLs there are papers of exceptional significance, such as the letter that proposed the popular Einstein–Podolsky–Rosen experiment (ranked 7th); the paper that introduced a fundamental tool in many-body systems, Slater’s determinant (ranked 5th); the paper that presented the famous exact solution of the two-dimensional Ising model (ranked 8th). This confirms that PageRank is highly effective in finding relatively old papers of outstanding significance – referred to as “scientific gems” by Chen et al. (2007) – which has led to the interpretation of PageRank score as a “lifetime achievement award” for a paper (Maslov & Redner, 2008). Nevertheless, the most recent paper in Table 2 is from 1981 – 28 years old with respect to the dataset’s ending point in 2009.

In the top-15 by R(p), we find both old papers (the oldest is from 1964, 45 years old in 2009) and recent papers (the most recent is from 2002, 7 years old in 2009). Four out of 15 top-papers are MLs, which is an additional confirmation of the quality of the ranking by R(p). We emphasize that while both PageRank and rescaled PageRank feature prominent papers

3_Accordingly,_only_the₇₃_MLs_that_are_at_least₁₅_years_old_at_the_end_of_the_dataset_are_included_in_the_movie.


Table 2
Top-15 papers in the APS data as ranked by PageRank score p (asterisks mark the Milestone Letters).

| Rank(p) | Rank(R(p)) | p (×10^−5) | R(p) | Title | Year | Journal |
|---|---|---|---|---|---|---|
| 1 | 1 | 43.32 | 29.96 | Self-consistent equations including exchange and correlation effects (W. Kohn, L. Sham) | 1965 | Phys. Rev. |
| 2 | 36 | 40.77 | 24.57 | Theory of superconductivity (J. Bardeen, L. Cooper, J. Schrieffer) | 1957 | Phys. Rev. |
| 3 | 8 | 35.88 | 28.58 | Inhomogeneous electron gas (P. Hohenberg) | 1964 | Phys. Rev. |
| 4 | 115 | 24.74 | 18.64 | Stochastic problems in physics and astronomy (S. Chandrasekhar) | 1943 | Rev. Mod. Phys. |
| 5 | 137 | 23.57 | 17.78 | The theory of complex spectra (J. Slater) | 1929 | Phys. Rev. |
| 6 | 21 | 23.46 | 26.53 | * A model of leptons (S. Weinberg) | 1967 | Phys. Rev. Lett. |
| 7 | 130 | 22.80 | 18.05 | Can quantum-mechanical description of physical reality be considered complete? (A. Einstein, B. Podolsky, N. Rosen) | 1935 | Phys. Rev. |
| 8 | 140 | 22.67 | 17.73 | Crystal statistics. I. A two-dimensional model with an order-disorder transition (L. Onsager) | 1944 | Phys. Rev. |
| 9 | 15 | 22.64 | 27.44 | Self-interaction correction to density-functional approximations for many-electron systems (J. Perdew) | 1981 | Phys. Rev. B |
| 10 | 335 | 22.39 | 13.17 | Absence of diffusion in certain random lattices (P. Anderson) | 1958 | Phys. Rev. |
| 11 | 16 | 21.25 | 26.88 | Scaling theory of localization: absence of quantum diffusion in two dimensions (E. Abrahams) | 1979 | Phys. Rev. Lett. |
| 12 | 110 | 20.67 | 18.83 | Effects of configuration interaction on intensities and phase shifts (U. Fano) | 1961 | Phys. Rev. |
| 13 | 82 | 19.36 | 20.86 | On the constitution of metallic sodium (E. Wigner, F. Seitz) | 1933 | Phys. Rev. |
| 14 | 210 | 18.32 | 15.44 | On the interaction of electrons in metals (E. Wigner) | 1934 | Phys. Rev. |
| 15 | 315 | 18.25 | 13.53 | Cohesion in monovalent metals (J. Slater) | 1930 | Phys. Rev. |

Table 3
Top-15 papers in the APS data as ranked by rescaled PageRank score R(p) (asterisks mark the Milestone Letters).

| Rank(p) | Rank(R(p)) | p (×10^−5) | R(p) | Title | Year | Journal |
|---|---|---|---|---|---|---|
| 1 | 1 | 43.32 | 29.96 | Self-consistent equations including exchange and correlation effects (W. Kohn, L. Sham) | 1965 | Phys. Rev. |
| 63 | 2 | 11.35 | 29.63 | * Bose–Einstein condensation in a gas of sodium atoms (K. Davis et al.) | 1995 | Phys. Rev. Lett. |
| 16 | 3 | 17.74 | 29.34 | Self-organized criticality: an explanation of the 1/f noise (P. Bak, C. Tang, K. Wiesenfeld) | 1987 | Phys. Rev. Lett. |
| 115 | 4 | 8.60 | 29.16 | * Large mass hierarchy from a small extra dimension (L. Randall) | 1999 | Phys. Rev. Lett. |
| 29 | 5 | 14.99 | 29.01 | Pattern formation outside of equilibrium (M. Cross) | 1993 | Rev. Mod. Phys. |
| 112 | 6 | 8.66 | 28.97 | Statistical mechanics of complex networks (R. Albert, A.-L. Barabási) | 2002 | Rev. Mod. Phys. |
| 181 | 7 | 7.11 | 28.95 | Review of particle properties (K. Hagiwara et al.) | 2002 | Phys. Rev. D |
| 3 | 8 | 35.88 | 28.58 | Inhomogeneous electron gas (P. Hohenberg) | 1964 | Phys. Rev. |
| 99 | 9 | 9.35 | 28.58 | Evidence of Bose–Einstein condensation in an atomic gas with attractive interactions (C. Bradley et al.) | 1995 | Phys. Rev. Lett. |
| 59 | 10 | 11.65 | 28.11 | Efficient pseudopotentials for plane-wave calculations (N. Troullier, J. Martins) | 1991 | Phys. Rev. B |
| 53 | 11 | 12.11 | 27.88 | * Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels (C. Bennett et al.) | 1993 | Phys. Rev. Lett. |
| 281 | 12 | 5.99 | 27.85 | * Negative refraction makes a perfect lens (J. Pendry) | 2000 | Phys. Rev. Lett. |
| 216 | 13 | 6.59 | 27.59 | TeV scale superstring and extra dimensions (G. Shiu, S.-H. Tye) | 1998 | Phys. Rev. D |
| 17 | 14 | 17.54 | 27.47 | Diffusion-limited aggregation, a kinetic critical phenomenon (T. Witten) | 1981 | Phys. Rev. Lett. |
| 9 | 15 | 22.64 | 27.44 | Self-interaction correction to density-functional approximations for many-electron systems (J. Perdew, A. Zunger) | 1981 | Phys. Rev. B |

in their top-15, the detailed performance analysis described in the previous section is essential in order to fully understand the behavior of the two metrics.

4. Discussion

Motivated by the recent publication of the list of Milestone Letters by the Physical Review Letters editors, we performed an extensive cross-evaluation of different data-driven metrics of scientific impact of research papers with respect to their ability to identify papers of exceptional significance. We studied the network of citations between papers in the Physical Review corpus, which is recognized to be a comprehensive proxy for scientific research in physics (Radicchi & Castellano, 2011; Radicchi et al., 2009; Redner, 2005). The main assumption of our analysis is that although not all the most important papers in the Physical Review corpus are covered by the Milestone Letters list, a good paper-level metric is expected to rank the Milestone Letters as high as possible due to their outstanding significance. We find a clear performance gap between network-based metrics (p, R(p), T) and local metrics based only on the number of received citations (c, R(c)). This finding suggests that the use of citation counts to rank scientific papers is sub-optimal; additional research will be needed to assess whether network-based article-level metrics can be used to construct author-level metrics more effective than the currently used metrics – such as the popular h-index introduced by Hirsch (2005) – that are only based on citation counts and neglect network centrality.

We have shown that the proposed rescaled PageRank R(p) suppresses PageRank's well-known bias against recent papers much better than the CiteRank algorithm does. As a result, rescaled PageRank R(p) outperforms PageRank and CiteRank in ranking recent and old milestone papers, respectively. There are still two possible ranking errors – false positives and false negatives – that have not been addressed in this manuscript. Young papers at the top of the ranking by the rescaled PageRank may be false positives because the citation spurt that they have experienced may stop, which would eventually force them out of the top of the ranking as well as out of the group of possibly highly significant papers. By contrast, the so-called "sleeping beauties" that receive a large part of their citations long after they are published (Ke, Ferrara, Radicchi, & Flammini, 2015) are likely to be under-evaluated by the rescaled PageRank. Assessing the extent to which false positives and false negatives affect the ranking by rescaled PageRank, and by other relevant metrics as well, goes beyond the scope of our paper, yet it constitutes a much needed step in future research. The analysis of larger datasets which include papers from diverse fields is another natural next step for future research. As different academic disciplines adopt different citation practices (Bornmann & Daniel, 2008), the rescaling procedure proposed in this paper may need to be extended to also remove possible ranking biases by academic field.

The assumptions behind our definition of time balance and the computation of the rescaled scores deserve attention as well. In agreement with Radicchi et al. (2008) and Radicchi and Castellano (2011), the definition of time balance of a ranking adopted in this article requires that the likelihood that a paper is ranked at the top by a time-balanced metric is independent of paper age. Our definition of ranking time balance is implicitly based on the assumption that the number of highly significant papers grows linearly with system size. While this assumption seems reasonable for the Physical Review corpus whose journals apply strict acceptance criteria for submitted papers, it might need to be reconsidered when analyzing larger datasets which include recently emerging high-acceptance journals – both mega-journals (Björk, 2015) and predatory journals (Xia et al., 2015). In other words, the exponential growth of the number of published papers (Redner, 2005; Sinatra, Deville, Szell, Wang, & Barabási, 2015; Wang, Song, & Barabási, 2013) does not necessarily correspond to an exponential growth of the number of highly significant papers. The issue is delicate (see Sarewitz, 2016 for a recent insight) and will need to be addressed in future research on bibliometric indicators.

An important general question remains open: which inherent properties of a network determine if PageRank-like methods will outperform local metrics or not? We conjecture that in citation networks, the observed success of network-based metrics in identifying highly significant papers might be related to the tendency of high-impact papers to cite other high-impact papers, as found by Bornmann, de Moya Anegón, and Leydesdorff (2010). Despite recent efforts (Fortunato, Boguñá, Flammini, & Menczer, 2008; Ghoshal & Barabási, 2011; Mariani et al., 2015; Medo, Mariani, Zeng, & Zhang, 2015), which network properties make the PageRank algorithm succeed or fail remains a largely unexplored problem which we will further investigate in future research.

Our work constitutes a particular instance of a general methodology – the comparison of the outcomes of quantitative variables with a ground truth established by experts – which can be applied for metric evaluation in several kinds of systems, such as movies (Spitz & Horvát, 2014; Wasserman et al., 2015) or the network of scientific authors (Radicchi et al., 2009). In the domain of research evaluation, this methodology is particularly relevant since bibliometric indices are increasingly used in practice – often uncritically and in questionable ways (Hicks et al., 2015; Wilsdon, 2015) – and scholars from diverse fields have produced a plethora of possible impact metrics (Van Noorden, 2010), especially those aimed at assessing researchers' productivity and impact. Motivated by the results obtained in this article, we encourage the creation of lists of groundbreaking papers also for other scientific domains, which can lead to a richer understanding and more accurate benchmarking of quantitative metrics for scientific significance. Our findings constitute a benchmark for article-level metrics of scientific significance, and can be used as a baseline to assess the performance of new indicators in future research.

From a practical point of view, improving the effectiveness of paper impact metrics has the potential to improve not only the current bibliometric practices, but also our ability to discover relevant papers in online platforms that collect academic papers and use automated methods to sort them. In this respect, our findings suggest that rescaled PageRank can be used as an operational tool to identify the most significant papers on a given topic. Suppose that a researcher enters a new research field and wants to study the most important works in that field. If we provide him/her with the top papers as ranked by PageRank, the researcher will only know the oldest papers and will not be informed about recent lines of research. On the other hand, by providing him/her with the top papers as ranked by rescaled PageRank, he/she will know both old significant papers and recent works that have attracted considerable attention, leading to a more complete overview of the field. To allow researchers to experience the benefits of a time-balanced ranking method, we developed an interactive Web platform which is available at the address http://www.sciencenow.info. In this platform, users can browse the rankings of the APS papers by R(p) year by year, investigate the historical evolution of each paper's ranking position by R(p), and check the ranking positions and the scores of each researcher's publications.

5. Conclusions

We presented a detailed analysis of the performance of different quantitative metrics with respect to their ability to identify the Milestone Letters selected by the Physical Review Letters editors. Our findings indicate that: (1) a direct rescaling of citation count and PageRank score is an effective way to suppress the temporal bias of these two metrics; (2) rescaled PageRank R(p) is the best-performing metric overall, as it outperforms PageRank and CiteRank in identifying recent and old milestone papers, respectively, and it outperforms citation-based indicators for papers of every age. The presented results indicate that the combination of network centrality and time holds promise for improving some of the tools currently used to rank scientific publications, which could bring valuable benefits for quantitative research assessment and design of Web academic platforms.

Acknowledgements

We wish to thank Giulio Cimini, Matthieu Cristelli, Luciano Pietronero, Zhuo-Ming Ren, Ming-Sheng Shang, Andrea Tacchella, Giacomo Vaccario, Alexandre Vidmer and Andrea Zaccaria for inspiring discussions and useful suggestions. We are also grateful to the two anonymous referees for their insightful comments which helped us to improve the level of the discussion in some sections of the manuscript. This work was supported by the EU FET-Open Grant No. 611272 (project Growthcom). The authors declare that they have no competing financial interests.

Author contribution statement

Conceived and designed the analysis: Manuel Sebastian Mariani; Matúš Medo; Yi-Cheng Zhang

Collected the data: Manuel Sebastian Mariani; Matúš Medo

Contributed data or analysis tools: Manuel Sebastian Mariani; Matúš Medo

Performed the analysis: Manuel Sebastian Mariani

Wrote the paper: Manuel Sebastian Mariani; Matúš Medo

Appendix A. Average ranking position vs. average ranking ratio

We show here that the average ranking position of the MLs is extremely sensitive to the ranking position of the least-cited MLs, whereas the average ranking ratio is stable with respect to removal of the least-cited MLs. For simplicity, in this Appendix we consider the rankings computed on the whole dataset. In formulas, the average ranking position $\bar{r}_{\mathrm{raw}}(s)$ of the MLs by metric $s$ is defined as

$$\bar{r}_{\mathrm{raw}}(s) = \frac{1}{M} \sum_{i \in \mathcal{M}} r_i(s), \tag{A.1}$$

where $r_i(s)$ denotes the ranking position of paper $i$ by metric $s$ normalized by the total number of papers: $r_i = 1/N$ and $r_i = 1$ correspond to the best and the worst paper in the ranking, respectively.

In Section 3.2, we mention that little-cited papers can bias the average ranking position of the target papers by a certain metric. To illustrate this point, consider first the following ideal example. Consider two target papers A and B. Paper A is ranked 10th by metric $M_1$ and 1000th by metric $M_2$, whereas paper B is ranked 20,000th by metric $M_1$ and 15,000th by metric $M_2$. The average ranking position for the set of papers {A, B} is equal to 10,005 and to 8000 for metric $M_1$ and $M_2$, respectively. This means that according to average ranking position, metric $M_2$ outperforms metric $M_1$, despite not having been able to place any of the two papers in the top-100.
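The idealized example can be checked numerically. The following is a minimal sketch, not the authors' code: it takes the ranking ratio to be a paper's rank by one metric divided by its best rank among the compared metrics (the form used in Eq. (F.1)), and the helper names `average_rank` and `average_ranking_ratio` are hypothetical.

```python
# Toy comparison of two ways to aggregate ranking positions over target papers.
# Positions are those of the idealized example (papers A and B, metrics M1 and M2).
ranks = {
    "A": {"M1": 10, "M2": 1000},
    "B": {"M1": 20000, "M2": 15000},
}
metrics = ["M1", "M2"]


def average_rank(ranks, metric):
    """Plain mean of the ranking positions assigned by one metric."""
    return sum(r[metric] for r in ranks.values()) / len(ranks)


def average_ranking_ratio(ranks, metric, metrics):
    """Mean over papers of rank(metric) / best rank achieved by any compared metric."""
    total = 0.0
    for r in ranks.values():
        best = min(r[m] for m in metrics)
        total += r[metric] / best
    return total / len(ranks)


avg_rank = {m: average_rank(ranks, m) for m in metrics}
avg_ratio = {m: average_ranking_ratio(ranks, m, metrics) for m in metrics}
print(avg_rank)   # average positions: M2 looks better
print(avg_ratio)  # average ratios: M1 looks better
```

Under the average ranking position, $M_2$ wins only because of paper B, which neither metric ranks well; under the average ranking ratio, the large gap between the two metrics on paper A dominates instead.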

A qualitatively similar situation occurs also in the APS dataset, as the following example shows. The milestone letter "Element No. 102" [Phys. Rev. Lett. 1.1 (1958): 18] is cited only five times within the APS data. Its ranking position by R(p) ($r(R(p)) = 0.22$) is thus much larger than the MLs' average ranking position $\bar{r}_{\mathrm{raw}}(R(p)) = 0.016$ by R(p). Only few MLs are little cited – for instance, only four out of 87 MLs are not among the top-10% papers by citation count. To which extent do these little-cited papers affect $\bar{r}_{\mathrm{raw}}$ for the different metrics? By denoting with $\bar{r}'_{\mathrm{raw}}(R(p))$ the average computed on the subset of 83 MLs which does not include the four least-cited MLs, we obtain $\bar{r}'_{\mathrm{raw}}(R(p)) = 0.009$, which is smaller than $\bar{r}_{\mathrm{raw}}(R(p)) = 0.016$ by a factor around 1.8. The effect is even larger for citation count: we have $\bar{r}'_{\mathrm{raw}}(c) = 0.009$ against the original value $\bar{r}_{\mathrm{raw}}(c) = 0.020$ – the ratio between the two averages is larger than two.

By using the average ranking ratio, we only compare the ranking within the chosen set of metrics for each individual paper and, as a consequence, the average is stable with respect to removal of the least-cited MLs. This can be illustrated by again excluding the four least-cited MLs from the computation of $\bar{r}(R(p))$, and by comparing the corresponding values $\bar{r}'(R(p))$ of the average ranking ratio with the values computed over all the MLs. Among the five metrics, the largest variation is observed for PageRank, for which $\bar{r}'(p)/\bar{r}(p) = 1.03$ – i.e., the removal of the least-cited MLs has only a small effect on the average ranking ratios for the five metrics.

Fig. B.5. Values of the average ranking position $\bar{r}_{\mathrm{raw}}$ (panel A) and of the average ranking ratio $\bar{r}$ (panel B) of the MLs for the five metrics computed on the whole dataset (1893–2009); the error bars represent the standard error of the mean.

Appendix B. Assessing the metrics' performance on the whole dataset

Fig. B.5A shows the values of the average ranking position $\bar{r}_{\mathrm{raw}}(s)$ for the five metrics computed on the whole dataset: according to $\bar{r}_{\mathrm{raw}}(s)$, PageRank and rescaled PageRank outperform the other metrics.

While the average ranking position of the MLs is a simple quantity to evaluate the metrics, some MLs are relatively little cited and, as a result, their low ranking position can strongly bias the average ranking position. We refer to Appendix A for a detailed discussion of this issue. To solve this problem, we defined the ranking ratio in the main text. Fig. B.5B shows the measured values of the average ranking ratio $\bar{r}$ based on the rankings computed on the whole dataset. This simple measure would suggest that R(p) and, to a lesser extent, p and c outperform R(c) and CiteRank. Given the small gap between p and R(p), one might be tempted to conclude that the rescaling procedure does not bring substantial benefits in the identification of significant papers. However, the rank analysis presented in Fig. B.5 includes the contribution of both old and recent MLs, whereas a close inspection reveals that the metrics perform in a drastically different way depending on the age of the target papers, as shown in Fig. 3 and discussed in Section 3.2.

This point can also be illustrated by using the rankings computed on the whole dataset. To show this, we divide the 87 MLs into three equally-sized groups of MLs according to their age. By considering only the oldest M/3 = 29 MLs as target papers, we obtain $\bar{r}(p) = 1.1$ whereas $\bar{r}(R(p)) = 5.5$. By contrast, by considering only the M/3 most recent MLs as target papers, we obtain $\bar{r}(p) = 7.3$ whereas $\bar{r}(R(p)) = 1.7$. While this result shows a clear advantage of PageRank and rescaled PageRank for the oldest and for the most recent MLs, respectively, there exists a fundamental difference between the performance gaps observed for the oldest and the most recent MLs. The bias of PageRank towards old nodes (Fig. 1A) indeed makes it easier for the metric to find old significant papers. On the other hand, rescaled PageRank does not benefit from any bias in ranking the most recent MLs, as the ranking by the metric is not biased by paper age (Fig. 1C). It is thus crucial to realize that when we compute the rankings on the whole dataset, the value of the average ranking ratio by the metrics depends on the age distribution of the important papers that we aim to identify. Had we used the rankings computed on the whole dataset for evaluation and considered only the oldest (most recent) 29 MLs as target papers, we would have concluded that PageRank (rescaled PageRank) is by far the best-performing metric. These observations demonstrate that an evaluation of the metrics based on the whole dataset is strongly biased by the age distribution of the target items and, for this reason, unreliable as a tool to assess metrics' performance.

Appendix C. Alternative rescaling equations

Eq. (3) forces the rescaled score $R_i(p)$ of a paper $i$ to have mean value equal to zero and standard deviation equal to one, independently of its age (i.e., independently of $i$). Fig. 2C shows that this rescaling is sufficient to achieve a time-balanced ranking of the papers. We consider now a simple rescaling in the form $R_i^{(\mathrm{ratio})}(p) := p_i/\mu_i(p)$. While the mean value of this score is equal to one, one can show that its standard deviation is given by

$$\sigma\!\left[R_i^{(\mathrm{ratio})}(p)\right] = \sqrt{E_i\!\left[\left(R_i^{(\mathrm{ratio})}(p)\right)^2\right] - E_i\!\left[R_i^{(\mathrm{ratio})}(p)\right]^2} = \sqrt{\frac{E_i[p_i^2]}{\mu_i(p)^2} - 1} = \frac{\sigma_i(p)}{\mu_i(p)}, \tag{C.1}$$

where $E_i[\cdot]$ denotes the expectation value within the averaging window of paper $i$. Fig. C.6 shows that $\sigma(p)/\mu(p)$ strongly depends on node age in the APS dataset. As a result, the ranking by $R^{(\mathrm{ratio})}(p)$ is strongly biased towards old nodes ($\sigma/\sigma_0 - 1 = 79.81\,\sigma_{\mathrm{dev}}$).


Fig. C.6. Dependence of $\sigma(p)/\mu(p)$ on paper age; the values of $\mu(p)$ and $\sigma(p)$ are calculated over the papers' averaging windows.

Fig. D.7. Number of papers whose averaging window contains less than five papers that received at least $c_{\min}$ citations as a function of $\Delta$. For $\Delta \geq 1000$, each paper is compared with at least five papers cited at least five times.

We also considered a variant of our method where the rescaled scores are still computed with Eq. (3), but $\mu_i(p)$ and $\sigma_i(p)$ are computed over the papers published in the same year as paper $i$. The resulting rescaled score $R^{(\mathrm{year})}(p)$ produces a ranking that is much less in agreement with the hypothesis of unbiased ranking ($\sigma/\sigma_0 - 1 = 15.55\,\sigma_{\mathrm{dev}}$) than the ranking by R(p). For this reason, the definition of papers' averaging window adopted in the main text is based on number of publications and not on real time. However, $R^{(\mathrm{year})}(p)$ is still preferable to the original scores when the aim is to compare papers of different age. Also note that $R^{(\mathrm{year})}(p)$ might be preferable if one is interested in a ranking of the papers where each publication year is represented by the same number of papers, apart from statistical fluctuations.
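To make the difference between the two rescalings concrete, here is a minimal sketch under stated assumptions; it is not the authors' implementation. Eq. (3) is not reproduced in this part of the text, so it is assumed here to have the z-score form $(p_i - \mu_i)/\sigma_i$ implied by its zero mean and unit standard deviation. Papers are assumed to be listed in order of publication, windows near the ends of the list are shifted inwards to keep their size, and the population standard deviation is used; these choices and the function names are illustrative.

```python
from statistics import mean, pstdev


def window_bounds(i, n, delta):
    """Half-open index range [lo, hi) of about `delta` papers around paper i.

    Assumption: windows that would run past either end of the list are
    shifted inwards so that they keep their size whenever n >= delta.
    """
    half = delta // 2
    lo = max(0, min(i - half, n - delta))
    hi = min(n, lo + delta)
    return lo, hi


def rescaled_scores(scores, delta):
    """Return (z-score rescaling, ratio rescaling) for scores ordered by age."""
    n = len(scores)
    z_scores, ratios = [], []
    for i, s in enumerate(scores):
        lo, hi = window_bounds(i, n, delta)
        window = scores[lo:hi]
        mu, sigma = mean(window), pstdev(window)
        # Assumed form of Eq. (3): subtract the window mean, divide by its spread.
        z_scores.append((s - mu) / sigma if sigma > 0 else 0.0)
        # Ratio rescaling from Appendix C: divide by the window mean only.
        ratios.append(s / mu if mu > 0 else 0.0)
    return z_scores, ratios


z, ratio = rescaled_scores([1.0, 3.0], delta=2)
print(z, ratio)
```

The ratio rescaling fixes the mean of each window but leaves its spread $\sigma_i/\mu_i$ free, which is the age-dependent quantity that Eq. (C.1) identifies as the source of the residual bias.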

Appendix D. Dependence of the properties of the rankings by R(c) and R(p) on the temporal window size

As described in the main text, the rescaled scores $R_i(c)$ and $R_i(p)$ of a certain paper $i$ are obtained by comparing its score with the scores of the nodes that belong to its "averaging windows" $j \in [i - \Delta_c/2, i + \Delta_c/2]$ and $j \in [i - \Delta_p/2, i + \Delta_p/2]$, respectively. To motivate the choice $\Delta_p = \Delta_c = 1000$ adopted in the main text, we start by observing that the size of the averaging window should be neither too large nor too small. A large window would include papers of significantly different age, which would turn out to be ineffective in removing the temporal biases of the metrics (see footnote 4). On the other hand, we want $\Delta_c$ and $\Delta_p$ to be sufficiently large to avoid that some papers are only compared with little-cited papers, which is likely to happen for a small window due to the skewed shape of the citation count distribution (Medo et al., 2011).

To understand the possible drawbacks of a too small averaging window, we compute the number $N(c_{\min})$ of papers whose averaging windows contain less than five papers that received at least $c_{\min}$ citations. The results are shown in Fig. D.7. For $\Delta \leq 800$, the averaging windows of a nonzero number of papers have less than five papers with at least five received citations. We restrict our choice to the range $\Delta \geq 1000$, for which no paper's averaging window has less than five papers cited at least five times.
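The count $N(c_{\min})$ can be computed in one pass with a prefix sum. The sketch below is illustrative only: it assumes citation counts are listed in order of publication, that a window is the index range $[i - \Delta/2, i + \Delta/2]$ clipped at the ends of the list, and that paper $i$ itself counts as a member of its own window. The function name `count_poorly_covered` and the parameter `min_peers` are hypothetical.

```python
def count_poorly_covered(citations, delta, c_min=5, min_peers=5):
    """Count papers whose window holds fewer than `min_peers` papers
    with at least `c_min` citations.

    `citations` lists citation counts in order of publication.
    """
    n = len(citations)
    # prefix[k] = number of sufficiently cited papers among the first k papers
    prefix = [0]
    for c in citations:
        prefix.append(prefix[-1] + (1 if c >= c_min else 0))

    half = delta // 2
    poorly_covered = 0
    for i in range(n):
        lo = max(0, i - half)          # clipped window start (assumption)
        hi = min(n, i + half + 1)      # clipped window end, exclusive
        if prefix[hi] - prefix[lo] < min_peers:
            poorly_covered += 1
    return poorly_covered


# Scanning several window sizes mimics the kind of curve described for Fig. D.7.
toy_citations = [0, 1, 7, 0, 12, 5, 0, 0, 9, 6, 0, 3, 8, 0, 5, 11]
for delta in (2, 6, 10, 14):
    print(delta, count_poorly_covered(toy_citations, delta))
```

Because enlarging the window can only add papers to it, the count is non-increasing in $\Delta$, so the smallest window size with a zero count can be located by scanning $\Delta$ upwards.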

4 Note that the ranking by R(p) is perfectly correlated with the ranking by p for $\Delta_p = N$.


Fig. D.8. Left panel: Deviation $\sigma/\sigma_0 - 1$ for the ranking by rescaled citation count R(c) as a function of $\Delta_c$ for different values of $z$. Right panel: Deviation $\sigma/\sigma_0 - 1$ for the ranking by rescaled PageRank score R(p) as a function of $\Delta_p$ for different values of $z$. The horizontal black line marks the expected value $\sigma/\sigma_0 - 1 = 0$ for an unbiased ranking.

Fig. D.9. Spearman's ranking correlation between the rescaled score $R(\Delta)$ and the rescaled score $R(\Delta = 1000)$ used in the main text.

To evaluate the ability of the rescaling procedure to suppress the bias of the metrics, we estimate the deviation $\sigma/\sigma_0 - 1$ of the standard deviation ratio $\sigma/\sigma_0$ from the expected value (one) for an unbiased ranking (see the main text for details). Fig. D.8 reports the behavior of the deviation $\sigma/\sigma_0 - 1$ as a function of $\Delta_p$ and $\Delta_c$ for different selectivity values $z$. The upward trends of Fig. D.8 suggest that in order to reduce the ratio $\sigma/\sigma_0$, it is convenient to choose $\Delta_p$ and $\Delta_c$ as small as possible. Hence, the choice $\Delta_c = \Delta_p = 1000$ allows us to obtain a histogram close to the expected unbiased histogram – $\sigma/\sigma_0$ values are close to one for all the values of $z$ represented in the figure – and, at the same time, to avoid that some nodes are only compared with little-cited nodes, as discussed above for Fig. D.7.

An important observation is that the correlations between the rankings obtained with different values of $\Delta$ and the ranking obtained with $\Delta = 1000$ are close to one (Fig. D.9), which means that the rescaling procedure is robust against variation of the averaging window sizes $\Delta_c$ and $\Delta_p$.
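A robustness check of this kind compares two score vectors through Spearman's rank correlation. The following is a minimal, dependency-free sketch (the helper names are hypothetical, and tied scores receive their average rank, which is one common convention rather than a choice documented in the text); it computes the Pearson correlation of the two rank vectors.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the two rank vectors.

    Undefined (division by zero) if either input is constant.
    """
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical rescaled scores of the same four papers for two window sizes.
scores_small_window = [0.3, 2.1, -0.5, 1.2]
scores_large_window = [0.4, 1.9, -0.2, 1.0]
print(spearman(scores_small_window, scores_large_window))
```

A value close to one means that the two window sizes order the papers almost identically, even if the scores themselves differ.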

Appendix E. Dependence of CiteRank performance on its parameter

Fig. E.10 shows the dependence of the average ranking ratio $\bar{r}$ on paper age, for five different values of CiteRank parameter $\tau$. The figure shows that the behavior of CiteRank's performance strongly depends on the choice of its parameter. When the parameter is small (panel A, $\tau = 1$ year), CiteRank performance is optimal (lowest average ranking ratio) for very recent papers, and gradually worsens with paper age. As $\tau$ increases (moving from panel A to E), the minimum point of CiteRank's average ranking ratio gradually shifts toward older nodes. When $\tau$ is sufficiently large (panel E, $\tau = 16$ years), CiteRank behavior is qualitatively similar to that of PageRank, and its performance gradually improves with paper age – this is indeed consistent with the fact that $T \to p$ in the limit $\tau \to \infty$.

Fig. E.10. Dependence of the average ranking ratio $\bar{r}$ on paper age, for five different values of CiteRank parameter $\tau$.

Appendix F. Dependence of ranking ratio and identification rate on paper age

To assess the ranking of each Milestone Letter $t$ years after its publication, we compute the rankings each $\Delta t = 183$ days (results for different choices of $\Delta t$ are qualitatively similar). At each computation time $t^{(c)}$, only the $N(t^{(c)})$ papers (with their links) published before time $t^{(c)}$ are considered for the scores' and rankings' computation, and each ML contributes to the ranking ratio $\bar{r}(s, t)$ corresponding to its age $t$ at time $t^{(c)}$. This procedure allows us to save computational time with respect to computing the rankings of each ML exactly $t$ years after its publication, because it requires fewer ranking computations. In formulas, the average ranking ratio $\bar{r}(s, t = k\,\Delta t)$ for $t$-years old papers is defined as

$$\bar{r}(s, t = k\,\Delta t) = \frac{1}{M(t)} \sum_{t^{(c)}} \sum_{i \in \mathcal{M}} \delta\!\left(\left\lfloor (t^{(c)} - t_i)/\Delta t \right\rfloor, k\right) \times \frac{r(s, i; t^{(c)})}{\min_{s}\{r(s, i; t^{(c)})\}}, \tag{F.1}$$

where we used $k = 0.5, 1, 1.5, 2, \ldots$ for Fig. 3B; in the equation above, $r(s, i; t^{(c)})$ denotes the ranking position of ML $i$ at time $t^{(c)}$ according to metric $s$, $M(t)$ denotes the number of MLs that are at least $t$ years old at the end of the dataset, $\lfloor x \rfloor$ denotes the largest integer smaller than or equal to $x$, and $\delta(x, y)$ denotes the Kronecker delta function of $x$ and $y$. Hence, at each computation time $t^{(c)}$, each ML $i$ published before time $t^{(c)}$ gives a contribution $\hat{r}(s, i; t^{(c)})$ to the average ranking ratio $\bar{r}(s, t = k\,\Delta t)$ for papers of age $t^{(c)} - t_i$. Similarly, the identification rate $f_x(t)$ is computed as

$$f_x(s, k\,\Delta t) = \frac{1}{M(t)} \sum_{t^{(c)}} \sum_{i \in \mathcal{M}} \delta\!\left(\left\lfloor (t^{(c)} - t_i)/\Delta t \right\rfloor, k\right) \times \Theta\!\left(r(s, i; t^{(c)}) \leq x\right), \tag{F.2}$$

where $\Theta(r(s, i; t^{(c)}) \leq x)$ is equal to one if paper $i$ is among the top $x\,N(t^{(c)})$ papers in the ranking by metric $s$ at time $t^{(c)}$, equal to zero otherwise.


To define the normalized identification rate (NIR) of a metric, at each computation time $t^{(c)}$ we divide the $N(t^{(c)})$ papers into 40 groups according to their age, analogously to what we did in Section 3.1 to evaluate the temporal balance of the metrics. The NIR of metric $s$ is then defined as

$$\tilde{f}_x(s, k\,\Delta t) = \frac{1}{M(t)} \sum_{t^{(c)}} \sum_{i \in \mathcal{M}} \delta\!\left(\left\lfloor (t^{(c)} - t_i)/\Delta t \right\rfloor, k\right) \times \Theta\!\left(r(s, i; t^{(c)}) \leq x\right)\, y\!\left(n(s, i; t^{(c)})\right), \tag{F.3}$$

where $y(n(s, i; t^{(c)}))$ is a decreasing function of the fraction $n(s, i; t^{(c)})$ of nodes that belong to the same age group of node $i$ and are ranked among the top $x\,N(t^{(c)})$ by metric $s$. Denoting by $n_0(i; t^{(c)}) = 1/40$ the expected value of $n(\cdot, i; t^{(c)})$ for an unbiased ranking, we set $y(n(s, i; t^{(c)})) = (n(s, i; t^{(c)})/n_0(i; t^{(c)}))^{-1}$ if $n(s, i; t^{(c)}) > n_0(i; t^{(c)})$ (i.e., if the metric tends to favor papers that belong to the same age group as paper $i$), whereas $y(n(s, i; t^{(c)})) = 1$ if $n(s, i; t^{(c)}) \leq n_0(i; t^{(c)})$. According to Eq. (F.3), if the identified ML belongs to an age group which is over-represented in the top $x\,N(t^{(c)})$ by a factor of four, it only counts as 1/4 in the normalized identification rate.
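The weighting in Eq. (F.3) can be sketched for a single computation time as follows. This is an illustrative reading, not the authors' code: $n$ is interpreted as the share of the top $x\,N$ papers that belongs to a given age group, the age groups are assumed to be equally sized so that $n_0 = 1/(\text{number of groups})$, and the names `nir_weight` and `normalized_hits` and the toy data are hypothetical.

```python
def nir_weight(n, n0=1 / 40):
    """Weight y(n): 1 if the age group is not over-represented, else n0 / n."""
    return n0 / n if n > n0 else 1.0


def normalized_hits(ranks, age_group, milestones, x, n_groups=40):
    """Sum of NIR weights of the milestone papers found in the top x*N.

    ranks:      dict paper -> ranking position (1 = best) at one computation time
    age_group:  dict paper -> index of the paper's age group
    milestones: set of target papers
    """
    top_size = int(x * len(ranks))
    top = [paper for paper, r in ranks.items() if r <= top_size]
    if not top:
        return 0.0
    # Share of the top-ranked papers that falls into each age group.
    share = {}
    for paper in top:
        g = age_group[paper]
        share[g] = share.get(g, 0.0) + 1 / len(top)
    n0 = 1 / n_groups
    return sum(nir_weight(share[age_group[p]], n0) for p in top if p in milestones)


# Toy data: 8 papers, 4 age groups, the top half dominated by group 0.
toy_ranks = {f"p{i}": i + 1 for i in range(8)}
toy_groups = {"p0": 0, "p1": 0, "p2": 0, "p3": 1,
              "p4": 2, "p5": 2, "p6": 3, "p7": 3}
print(normalized_hits(toy_ranks, toy_groups, {"p0", "p3", "p7"}, x=0.5, n_groups=4))
```

In the toy data, the milestone in the over-represented group 0 is down-weighted to 1/3, the milestone in group 1 counts fully, and the milestone outside the top half contributes nothing. Dividing such sums by $M(t)$ and accumulating them over computation times would give the rate in Eq. (F.3).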

Appendix G. Supplementary data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.joi.2016.10.005.

References

Berkhin, P. (2005). A survey on PageRank computing. Internet Mathematics, 2(1), 73–120.
Björk, B.-C. (2015). Have the "mega-journals" reached the limits to growth? PeerJ, 3, e981.
Bollen, J., Rodriguez, M. A., & Van de Sompel, H. (2006). Journal status. Scientometrics, 69(3), 669–687.
Bornmann, L., & Daniel, H.-D. (2008). What do citation counts measure? A review of studies on citing behavior. Journal of Documentation, 64(1), 45–80.
Bornmann, L., de Moya Anegón, F., & Leydesdorff, L. (2010). Do scientific advancements lean on the shoulders of giants? A bibliometric investigation of the Ortega hypothesis. PLoS ONE, 5(10), e13327.
Brin, S., & Page, L. (1998). The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems, 30(1), 107–117.
Chen, P., Xie, H., Maslov, S., & Redner, S. (2007). Finding scientific gems with Google's PageRank algorithm. Journal of Informetrics, 1(1), 8–15.
Cimini, G., Gabrielli, A., & Labini, F. S. (2014). The scientific competitiveness of nations. PLoS ONE, 9(12), e113470.
Crespo, J. A., Ortuño-Ortín, I., & Ruiz-Castillo, J. (2012). The citation merit of scientific publications. PLoS ONE, 7(11), e49156.
Dunaiski, M., & Visser, W. (2012). Comparing paper ranking algorithms. In Proceedings of the South African Institute for Computer Scientists and Information Technologists Conference (pp. 21–30). ACM.
Dunaiski, M., Visser, W., & Geldenhuys, J. (2016). Evaluating paper and author ranking algorithms using impact and contribution awards. Journal of Informetrics, 10(2), 392–407.
Egghe, L. (2006). Theory and practice of the g-index. Scientometrics, 69(1), 131–152.
Ermann, L., Frahm, K. M., & Shepelyansky, D. L. (2015). Google matrix analysis of directed networks. Reviews of Modern Physics, 87(4), 1261.
Fiala, D. (2012). Time-aware PageRank for bibliographic networks. Journal of Informetrics, 6(3), 370–388.
Fortunato, S., Boguñá, M., Flammini, A., & Menczer, F. (2008). Approximating PageRank from in-degree. In Algorithms and models for the web-graph (pp. 59–71). Springer.
Franceschet, M. (2011). PageRank: Standing on the shoulders of giants. Communications of the ACM, 54(6), 92–101.
Garfield, E. (1972). Citation analysis as a tool in journal evaluation. Science, 178(4060), 471–479.
Ghoshal, G., & Barabási, A.-L. (2011). Ranking stability and super-stable nodes in complex networks. Nature Communications, 2, 394.
Gleich, D. F. (2015). PageRank beyond the web. SIAM Review, 57(3), 321–363.
González-Pereira, B., Guerrero-Bote, V. P., & Moya-Anegón, F. (2010). A new approach to the metric of journals' scientific prestige: The SJR indicator. Journal of Informetrics, 4(3), 379–391.
Hicks, D., Wouters, P., Waltman, L., de Rijcke, S., & Rafols, I. (2015). Bibliometrics: The Leiden Manifesto for research metrics. Nature, 520, 429–431.
Hirsch, J. E. (2005). An index to quantify an individual's scientific research output. Proceedings of the National Academy of Sciences of the United States of America, 102(46), 16569–16572.
Kaur, J., Radicchi, F., & Menczer, F. (2013). Universality of scholarly impact metrics. Journal of Informetrics, 7(4), 924–932.
Kaur, J., Ferrara, E., Menczer, F., Flammini, A., & Radicchi, F. (2015). Quality versus quantity in scientific impact. Journal of Informetrics, 9(4), 800–808.
Ke, Q., Ferrara, E., Radicchi, F., & Flammini, A. (2015). Defining and identifying sleeping beauties in science. Proceedings of the National Academy of Sciences, 112(24), 7426–7431.
King, D. A. (2004). The scientific impact of nations. Nature, 430(6997), 311–316.
Kinney, A. L. (2007). National scientific facilities and their science impact on nonbiomedical research. Proceedings of the National Academy of Sciences, 104(46), 17943–17947.
Kreyszig, E. (2010). Advanced engineering mathematics. John Wiley & Sons.
Lawrence, P. A. (2008). Lost in publication: How measurement harms science. Ethics in Science and Environmental Politics, 8, 9–11.
Liebowitz, S. J., & Palmer, J. P. (1984). Assessing the relative impacts of economics journals. Journal of Economic Literature, 22(1), 77–88.
Lü, L., Medo, M., Yeung, C. H., Zhang, Y.-C., Zhang, Z.-K., & Zhou, T. (2012). Recommender systems. Physics Reports, 519(1), 1–49.
Mariani, M. S., Medo, M., & Zhang, Y.-C. (2015). Ranking nodes in growing networks: When PageRank fails. Scientific Reports, 5.
Maslov, S., & Redner, S. (2008). Promise and pitfalls of extending Google's PageRank algorithm to citation networks. The Journal of Neuroscience, 28(44), 11103–11105.
Medo, M., Cimini, G., & Gualdi, S. (2011). Temporal effects in the growth of networks. Physical Review Letters, 107(23), 238701.
Medo, M., Mariani, M. S., Zeng, A., & Zhang, Y.-C. (2015). Identification and modeling of discoverers in online social systems. arXiv preprint arXiv:1509.01477.
Molinari, J.-F., & Molinari, A. (2008). A new methodology for ranking scientific institutions. Scientometrics, 75(1), 163–174.
Narin, F. (1976). Evaluative bibliometrics: The use of publication and citation analysis in the evaluation of scientific activity. Washington, DC: Computer Horizons.
Newman, M. (2010). Networks: An introduction. Oxford University Press.
Newman, M. E. J. (2009). The first-mover advantage in scientific publication. Europhys. Lett., 86(6), 68001.
Newman, M. E. J. (2014). Prediction of highly cited papers. Europhys. Lett., 105(2), 28002.
Nykl, M., Ježek, K., Fiala, D., & Dostal, M. (2014). PageRank variants in the evaluation of citation networks. Journal of Informetrics, 8(3), 683–692.
Parolo, P. D. B., Pan, R. K., Ghosh, R., Huberman, B. A., Kaski, K., & Fortunato, S. (2015). Attention decay in science. Journal of Informetrics, 9(4), 734–745.


Pinski, G., & Narin, F. (1976). Citation influence for journal aggregates of scientific publications: Theory, with application to the literature of physics. Information Processing & Management, 12(5), 297–312.
Radicchi, F., & Castellano, C. (2011). Rescaling citations of publications in physics. Physical Review E, 83(4), 046116.
Radicchi, F., & Castellano, C. (2012a). A reverse engineering approach to the suppression of citation biases reveals universal properties of citation distributions. PLoS ONE, 7(3), e33833.
Radicchi, F., & Castellano, C. (2012b). Testing the fairness of citation indicators for comparison across scientific domains: The case of fractional citation counts. Journal of Informetrics, 6(1), 121–130.
Radicchi, F., Fortunato, S., & Castellano, C. (2008). Universality of citation distributions: Toward an objective measure of scientific impact. Proceedings of the National Academy of Sciences, 105(45), 17268–17272.
Radicchi, F., Fortunato, S., Markines, B., & Vespignani, A. (2009). Diffusion of scientific credits and the ranking of scientists. Physical Review E, 80(5), 056103.
Redner, S. (2005). Citation statistics from 110 years of Physical Review. Physics Today, 58, 49.
Sarewitz, D. (2016). The pressure to publish pushes down quality. Nature, 533(7602), 147–147.
Sinatra, R., Deville, P., Szell, M., Wang, D., & Barabási, A.-L. (2015). A century of physics. Nature Physics, 11(10), 791–796.
Spitz, A., & Horvát, E.-Á. (2014). Measuring long-term impact based on network centrality: Unraveling cinematic citations. PLoS ONE, 9(10), e108857.
Van Noorden, R. (2010). Metrics: A profusion of measures. Nature, 465(7300), 864–866.
Van Raan, A. F. J. (2005). Fatal attraction: Conceptual and methodological problems in the ranking of universities by bibliometric methods. Scientometrics, 62(1), 133–143.
Walker, D., Xie, H., Yan, K.-K., & Maslov, S. (2007). Ranking scientific publications using a model of network traffic. Journal of Statistical Mechanics: Theory and Experiment, 2007(06), P06010.
Waltman, L. (2016). A review of the literature on citation impact indicators. Journal of Informetrics, 10(2), 365–391.
Waltman, L., & Yan, E. (2014). PageRank-related methods for analyzing citation networks. In Measuring scholarly impact (pp. 83–100). Springer.
Wang, D., Song, C., & Barabási, A.-L. (2013). Quantifying long-term scientific impact. Science, 342(6154), 127–132.
Wasserman, M., Zeng, X. H. T., & Amaral, L. A. N. (2015). Cross-evaluation of metrics to estimate the significance of creative works. Proceedings of the National Academy of Sciences, 112(5), 1281–1286.
Weingart, P. (2005). Impact of bibliometrics upon the science system: Inadvertent consequences? Scientometrics, 62(1), 117–131.
Werner, R. (2015). The focus on bibliometrics makes papers less useful. Nature, 517(7534), 245.
Wilsdon, J. (2015). We need a measured approach to metrics. Nature, 523(7559), 129–129.
Xia, J., Harmon, J. L., Connolly, K. G., Donnelly, R. M., Anderson, M. R., & Howard, H. A. (2015). Who publishes in "predatory" journals? Journal of the Association for Information Science and Technology, 66(7), 1406–1417.
Yan, E., & Ding, Y. (2009). Applying centrality measures to impact analysis: A coauthorship network analysis. Journal of the American Society for Information Science and Technology, 60(10), 2107–2118.
Yao, L., Wei, T., Zeng, A., Fan, Y., & Di, Z. (2014). Ranking scientific publications: The effect of nonlinearity. Scientific Reports, 4, 6663.
Zhou, J., Zeng, A., Fan, Y., & Di, Z. (2015). Ranking scientific publications with similarity-preferential mechanism. Scientometrics, 1–12.


Published in "Journal of Informetrics 10(4): 1207–1223, 2016"

which should be cited to refer to this work.
















