• Aucun résultat trouvé

Model-based evaluation of scientific impact indicators

N/A
N/A
Protected

Academic year: 2021

Partager "Model-based evaluation of scientific impact indicators"

Copied!
5
0
0

Texte intégral

(1)

Supporting Information

Mat´uˇs Medo1,∗ and Giulio Cimini2, 3,†

1Physics Department, University of Fribourg, CH-1700 Fribourg, Switzerland 2IMT School for Advanced Studies, 55100 Lucca, Italy

3

Istituto dei Sistemi Complessi (ISC)-CNR, 00185 Rome, Italy (Dated: September 7, 2016)

PAPER CITATION COUNT VERSUS PAPER APPEARANCE TIME

1900 1920 1940 1960 1980 2000 year 0 2 4 6 8 10 median citation coun t (a) APS data 0 2 4 6 8 10 paper age 0 2 4 6 8 10 median citation coun t (b) APS data model with Ω∞

Figure S1. (a) The dependence of the median paper citation count on the publication year in the standard American Physical Society (APS) data that cover all papers published by the APS journals and citations among them (the covered period is 1893–2009). The citation count decrease for papers published after 2000 can be attributed to these papers not having time to reach their long-term impact. The citation count decrease for papers published before 1950

can be attributed to different citations practices (fewer references per paper). Unlike the model without Ω, the data

feature no median peak for the earliest published papers.

(b) A comparison of the dependence of the median paper citation count in the model data with Ω∞(the basic model

setting; the error bars show three-fold of the standard error of the mean computed from 100 model realizations) and the APS data. Given the fact that the model was not calibrated with respect to this measurement, the two curves agree remarkably well.

matus.medo@unifr.chgiulio.cimini@imtlucca.it

(2)

VARIATIONS OF THE BASIC SETTING −0.2 −0.1 0.0 0.1 0.2 more fitness noise ? • −0.2 −0.1 0.0 0.1 0.2 ? • −0.2 −0.1 0.0 0.1 0.2 ? • • • • −0.2 −0.1 0.0 0.1 0.2 ? • • • • −0.2 −0.1 0.0 0.1 0.2 20% random links ? −0.2 −0.1 0.0 0.1 0.2 ? −0.2 −0.1 0.0 0.1 0.2 ? • • • • −0.2 −0.1 0.0 0.1 0.2 ? • • • • −0.2 −0.1 0.0 0.1 0.2 base fitness mean ? • −0.2 −0.1 0.0 0.1 0.2 ? −0.2 −0.1 0.0 0.1 0.2 ? • • • • −0.2 −0.1 0.0 0.1 0.2 ? • • • −0.2 −0.1 0.0 0.1 0.2 base fitness random

precision

difference

b

et

w

een

a

giv

en

setting

and

the

basic

setting

from

Figure

4

? −0.2 −0.1 0.0 0.1 0.2 ? • −0.2 −0.1 0.0 0.1 0.2 ? • • • • −0.2 −0.1 0.0 0.1 0.2 ? • • • −0.2 −0.1 0.0 0.1 0.2 base abilit y a0 = 0 ? −0.2 −0.1 0.0 0.1 0.2 ? • −0.2 −0.1 0.0 0.1 0.2 ? • • • • −0.2 −0.1 0.0 0.1 0.2 ? • • • • −0.3 −0.2 −0.1 0.0 0.1 0.2 0.3 5000 researc hers total cit C mean cit E max cit M x -index h -index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE ? −0.3 −0.2 −0.1 0.0 0.1 0.2 0.3 total cit C mean cit E max cit M x -index h -index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE ? • −0.3 −0.2 −0.1 0.0 0.1 0.2 0.3 total cit C mean cit E max cit M x -index h -index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE ? • • • −0.3 −0.2 −0.1 0.0 0.1 0.2 0.3 total cit C mean cit E max cit M x -index h -index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE ? •

Figure S2. Precission difference with respect to the basic setting for individual metrics when various variations of this setting are considered (from top to bottom):

(1) the fitness noise amplitude η∗is increased from 0.2 to 0.5 (Ω

∞= 1.48· 104),

(2) 20% of references made by a new paper target a random existing paper (Ω∞= 1.28· 104),

(3) the base fitness of a paper is obtained as the average of the authors’ ability values (Ω∞= 9.20· 103),

(4) the base fitness of a paper is a random sample from the authors’ ability values (Ω∞= 1.10· 104),

(5) the baseline ability of authors a0 is set to zero (Ω= 1.06· 104),

(6) the number of researchers is increased from 1000 to 5000 (Ω= 6.90· 104).

The different ground truth assumptions are, from left to right: author ability, average fitness of authored papers, author ability times activity, and total fitness of authored papers. The error bars show the three-fold of the standard error of the mean. For every setting and ground truth, the best-performing metric (in absolute terms) is marked with

a star (?), and all metrics that reach at least 90% of its precision are marked with bullets (•) whose color indicates

(3)

AUTHORS AT DIFFERENT CAREER STAGES

Here we study the case where new authors are gradually introduced to the system, and check which indicators allow to fairly evaluate young researchers with respect to their senior colleagues. We thus consider a situation where 25% of researchers are active for the whole time; for the remaining 75%, author i begins their activity in a randomly

chosen step τi between t = 1 and t = T− 36. Here 36 is subtracted to only include researchers who spent sufficient

time in the system (by doing so, we are essentially considering only researchers who finished their PhD studies). In

this setting, the productivity of author i, initially drawn fromG(k), is linearly rescaled by her appearance time in the

system by a factor (1− τi/T ): young authors are thus in disadvantage with respect to seniors by having on average

less publications at the end of the simulation, and also by having less time to accrue citations.

Figure S3 presents the results obtained with this setting. The first observation is that most metrics actually perform similarly than in the basic setting where the group of active researchers does not change over time. The only metric that considers authors’ career length, m-quotient, is surprisingly performing worse in the new setting with researchers gradually entering the system. The reason lies again in the rescaling problem mentioned in the main text: in the new setting, there are more young users who only author their papers in the last years and yet outperform venerable researchers upon the rescaling, thus lowering the resulting precision more than in the basic setting presented in

Figure 4 of the main text. Specifically, there are on average 28± 8 authors with only one publication in the top 100

positions of the ranking by the m-quotient and the average activity span of top 100 researchers is 60± 40 months (out

of 120 in total). By contrast, for the h-index ranking there are no authors with only one publication in top 100 and

the average activity span of top researchers is 96± 21 months. The bias towards very young researchers is removed

by the corrected m, which brings to no researchers with only one paper in top 100 and to an average activity span of

86± 28 months: the resulting precision is similar to that achieved with the h-index and the contemporary h-index.

Overall, the logarithm-based indices λE and λC are again the best performers against intensive and extensive ground truths, respectively. 0.0 0.1 0.2 0.3 precision P (100) total cit C mean cit E max cit M x -index h -index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE (a) 0.0 0.1 0.2 0.3 0.4 0.5 precision P (100) total cit C mean cit E max cit M x -index h -index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE (b) 0.0 0.2 0.4 0.6 0.8 1.0 precision P (100) total cit C mean cit E max cit M x -index h -index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE (c) 0.0 0.2 0.4 0.6 0.8 1.0 precision P (100) total cit C mean cit E max cit M x -index h -index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE (d)

Figure S3. For the setting when the number of active researchers grows in time (see the description in the text above), comparison of the mean precision values achieved by impact indicators with respect to different ground truth assumptions: (a) author ability, (b) average fitness of authored papers, (c) author ability times activity, (d) total fitness of authored papers. The horizontal dotted line marks the performance of the best metric (which is typed with bold letters); the error bars show three-fold of the standard error of the mean.

(4)

INTERMEDIATE GROUND TRUTH ASSUMPTIONS 0.0 0.1 0.2 0.3 0.4 0.5 precision P (100) total cit C mean cit E max cit M x -index h -index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE (a) 0.0 0.1 0.2 0.3 0.4 0.5 precision P (100) total cit C mean cit E max cit M x -index h -index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE (b) 0.0 0.2 0.4 0.6 0.8 1.0 precision P (100) total cit C mean cit E max cit M x -index h -index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE (e) 0.0 0.2 0.4 0.6 0.8 1.0 precision P (100) total cit C mean cit E max cit M x -index h -index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE (f) 0.0 0.2 0.4 0.6 0.8 1.0 precision P (100) total cit C mean cit E max cit M x -index h -index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE (c) 0.0 0.2 0.4 0.6 0.8 1.0 precision P (100) total cit C mean cit E max cit M x -index h -index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE (d)

Figure S4. Precision values achieved by individual metrics against intensive (a,b), midway (c,d) and extensive (e,f) ground truth assumptions: (a) author ability, (b) author ability times square root of activity, (c) author ability times activity, (d) average fitness of authored papers, (e) average fitness of authored papers times square root of activity, (f) average fitness of authored papers times activity—i.e., total fitness of authored papers. The first row of panels (a,c,e) refers to a benchmark obtained from the researchers’ potential, whereas, the second row (b,d,f) to a benchmark related to realized publication outputs. We assume here the basic simulation setting that was used to obtain Figure

(5)

0.0 0.1 0.2 0.3 0.4 0.5 precision P (100) total cit C mean cit E max cit M x -index h -index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE (a) 0.0 0.1 0.2 0.3 0.4 0.5 precision P (100) total cit C mean cit E max cit M x -index h-index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE (b) 0.0 0.2 0.4 0.6 0.8 1.0 precision P (100) total cit C mean cit E max cit M x -index h -index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE (e) 0.0 0.2 0.4 0.6 0.8 1.0 precision P (100) total cit C mean cit E max cit M x -index h-index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE (f) 0.0 0.2 0.4 0.6 0.8 1.0 precision P (100) total cit C mean cit E max cit M x -index h -index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE (c) 0.0 0.2 0.4 0.6 0.8 1.0 precision P (100) total cit C mean cit E max cit M x -index h-index con temp h g-index o-index m -quotien t corrected m total log cit λC mean log cit λE (d)

Figure S5. Same as Figure S4, but for the simulation setting with the number of active researchers increasing with

Références

Documents relatifs

The irrelevance of GDP for the measurement of wealth has been demonstrated; alterna- tive indicators considering sustainability, social, and environmental aspects of economic

In this 49 study, three experimental cases for evaluating a workpiece 50 localization repeatability will be considered: (1) changes in 51 geometric parameters of a fixture

Forty-five cropping systems including 19 single-, 24 double- and 2 triple-cropping systems were practiced for the croplands across 107 counties in Shaanxi (Shaanxi Technology

6 Transpose la phrase suivante avec « Maman et Manon » en utilisant les aides.. Manon achète des pâtisseries à

Dans ces conditions, il devient difficile pour les divers services du CH de récupérer les coûts engagés pour la recherche qui sont alors assumés à même les sommes prévues pour

In the present paper, we calculate the full counting statistics of charge transport through a (classical or quantum) point contact coupled to a classi- cal two-level fluctuator; we

The main effects of a maxillary lip bumper thus seem to be a widening of the dental arch across the premolars, a moderate increase in arch length due to eruption and slight

[53] Pyörälä S., Pyörälä E., Accuracy of methods using somatic cell count and milk N-acetyl-b- D-Glucosaminidase activity in milk to assess the bacteriological cure of bovine