DataCite as a novel bibliometric source: Coverage, strengths and limitations

Contents lists available at ScienceDirect

Journal of Informetrics

journal homepage: www.elsevier.com/locate/joi

Regular article

DataCite as a novel bibliometric source: Coverage, strengths and limitations

Nicolas Robinson-Garcia a,∗, Philippe Mongeon b, Wei Jeng c, Rodrigo Costas d,e

a INGENIO (CSIC-UPV), Universitat Politècnica de València, Spain

b École de bibliothéconomie et des sciences de l’information, Université de Montréal, Canada

c Department of Library and Information Science, National Taiwan University, Taiwan

d CWTS, Leiden University, The Netherlands

e Centre for Research on Evaluation, Science and Technology (CREST), Stellenbosch University, Private Bag X1, Matieland 7602, South Africa

Article info

Article history:
Received 6 March 2017
Received in revised form 17 July 2017
Accepted 17 July 2017
Available online 30 August 2017

Keywords: Data sharing; Data citations; Bibliometric sources; Open data; Data infrastructure; Data metrics; DataCite

Abstract

This paper explores the characteristics of DataCite to determine its possibilities and potential as a new bibliometric data source to analyze the scholarly production of open data. Open science and the increasing data sharing requirements from governments, funding bodies, institutions and scientific journals have led to a pressing demand for the development of data metrics. As a very first step towards reliable data metrics, we need to better comprehend the limitations and caveats of the information provided by sources of open data. In this paper, we critically examine records downloaded from DataCite’s OAI API and elaborate a series of recommendations regarding the use of this source for bibliometric analyses of open data. We highlight issues related to metadata incompleteness, lack of standardization, and ambiguous definitions of several fields. Despite these limitations, we emphasize DataCite’s value and potential to become one of the main sources for data metrics development.

© 2017 Elsevier Ltd. All rights reserved.

1. Introduction

Calls for data availability and sharing can be traced back to the beginning of the 20th century when Galton stated: “I have begun to think that no one ought to publish biometric results, without lodging a well arranged and well bound manuscript copy of all his data, in some place where it should be accessible, under reasonable restrictions, to those who desire to verify his work” (Galton, 1901, as cited in Perneger, 2011). However, it has been just a few decades since technology has made possible the development of the necessary infrastructure to make this happen (Peng, 2011). In the last decade, public funding agencies, publishers and institutions have directed their efforts towards developing such infrastructure as well as towards incentivizing data sharing and reuse within the scientific community by promoting data citations (Robinson-García et al., 2015).

“The peer review process of this paper was handled by Staša Milojević, Associate Editor of Journal of Informetrics.”

Corresponding author.

E-mail addresses: [email protected] (N. Robinson-Garcia), [email protected] (P. Mongeon), [email protected] (W. Jeng), [email protected] (R. Costas).

http://dx.doi.org/10.1016/j.joi.2017.07.003
1751-1577/© 2017 Elsevier Ltd. All rights reserved.


Initiatives such as the launch of the Data Citation Index and the DataCite consortium are examples of efforts directed at promoting data citations. However, little is known about the production of data, field-specific practices, and other basic requirements such as the format a data record should have to facilitate information retrieval and bibliometric analyses.

Previous studies focusing on Thomson Reuters’ Data Citation Index (now Clarivate Analytics) have explored disciplinary biases and data types included (Torres-Salinas, Martín-Martín, & Fuente-Gutiérrez, 2014), data citation practices between fields (Robinson-García et al., 2015), and the relation between data citations and data mentions in social media (Peters, Kraker, Lex, Gumpenberger, & Gorraiz, 2016).

In a recent report, Costas et al. (2013) highlighted the need for developing data publication standards, reducing the dispersion of data repositories, and facilitating the traceability, citation and measurement of data records. The most comprehensive source for open data currently available is DataCite, which contains more than 7 million freely accessible records, almost doubling the figures last reported for the Data Citation Index (Peters et al., 2016).

In line with the open science movement and calls for increased data sharing and reuse, we highlight the importance of data publications and citations. This paper analyzes the structure and type of metadata offered by DataCite to assess its potential to become an important source for developing data-level metrics. DataCite is an international non-profit organization formed in 2009. It is a consortium of public research institutions, funding bodies and publishers worldwide whose mission is to promote open research data accessibility and tracking. For the latter, DataCite advocates for the use of Digital Object Identifiers (DOI) by assigning DOIs to their records (DataCite Metadata Working Group, 2015).

2. Objectives

This paper aims to explore the characteristics of the data collected by DataCite to determine its potential as a new source of bibliometric data for the study of open data production. Specifically, we examine the database structure and the level of standardization of the information provided in each field, to assess the usability of the data for bibliometric purposes.

The paper is structured as follows. Firstly, we present the metadata scheme of DataCite records (2015). Then we assess the completeness of the data in each specific field and give an overview of the database coverage. Finally, we discuss the potential of DataCite as a source for tracking open data production, and we provide some recommendations for its use as a tool for studying data production and citation patterns.

3. Data and methods

This section is structured in three parts. The first describes the different points of access made available by DataCite and the advantages and limitations of using one or the other. Second, we compile and describe the information provided by DataCite as to its structure, definition of data record fields, and information requested from each repository. The aim is to give the reader a full account of what DataCite expects to receive from each data repository and how this information is expected to be presented to the final user. The last part describes the dataset downloaded from DataCite’s public OAI API. The information retrieved and its structure are compared with the information provided in the first subsection.

3.1. Points of access to DataCite

DataCite provides two APIs to the public for downloading records indexed in its database. These two points of access contain the same number of records but differ in the structure in which they are presented as well as in the detail of information provided.

DataCite Metadata Store (https://oai.datacite.org/). The DataCite Metadata Store (MDS) is a service to manage activities related to Digital Object Identifier (DOI) registration at DataCite. The MDS is used to create, register, store and manage DOIs and associated dataset metadata created by DataCite’s users and members. Here we are presented with raw data as provided by DataCite’s members, which have not yet been processed by DataCite.

DataCite REST API (https://api.datacite.org/). The DataCite REST API includes the same contents as the DataCite Metadata Store but with added layers of information by record. The DataCite team adds information to each record on funding, ORCIDs and citations that is not provided by the data centers themselves.

As well as these two points of access, DataCite allows bulk queries via two additional URLs: search SOLR (https://search.datacite.org/ui) and search (https://search.datacite.org/). In this paper, we have used the DataCite Metadata Store to retrieve all records from DataCite. Throughout the rest of the paper all references made to DataCite’s metadata structure are based on such information.
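As an illustration only (this is not the code used in this study), the following minimal sketch shows how one page of an OAI-PMH `ListRecords` response in the `oai_dc` (Dublin Core) format could be parsed. It assumes the service returns standard OAI-PMH XML; the function names, the layout of the returned dictionaries and the base URL supplied by the caller are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL (base_url is supplied by the caller)."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return (records, next_token) for one page of a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(f"{{{OAI_NS}}}record"):
        header = rec.find(f"{{{OAI_NS}}}header")
        fields = {}
        for el in rec.iter():
            # Keep only non-empty Dublin Core elements; fields may repeat.
            if el.tag.startswith(f"{{{DC_NS}}}") and el.text and el.text.strip():
                name = el.tag.split("}", 1)[1]
                fields.setdefault(name, []).append(el.text.strip())
        records.append({
            "oai_id": header.findtext(f"{{{OAI_NS}}}identifier"),
            "set_spec": header.findtext(f"{{{OAI_NS}}}setSpec"),
            "fields": fields,
        })
    token = root.findtext(f".//{{{OAI_NS}}}resumptionToken")
    return records, (token.strip() if token and token.strip() else None)
```

A harvester would call `parse_records` repeatedly, requesting the next page with the returned token until it is `None`. Keeping repeated elements as lists matters here because a single record can carry several values for the same field.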

3.2. DataCite metadata scheme v.3.1

In April 2016, we retrieved all records from DataCite using their public OAI API (https://oai.datacite.org). DataCite provides a metadata scheme which shows the record structure and defines each field (DataCite Metadata Working Group, 2015). Note that although a 4.0 version of the metadata scheme has recently been implemented, in this paper we refer to version 3.1 as it was the schema in place at the time of data collection. This version includes mandatory, recommended and optional fields. In the following sections, we briefly describe the main fields retrieved from the DataCite Metadata Store.

3.2.1. Mandatory fields

Identifier. While in principle DataCite encourages and promotes the use of DOI numbers, it also allows the inclusion of other unique identifiers (e.g. URN, CCDC, INCHI key, URL).

Creator. This field includes the name, surname or affiliation name of the creators of the data records. It is equivalent to the author field of bibliographic records.

Title. The name by which the resource is known. Sometimes it also includes a subtitle as a sub-field.

Publisher. DataCite defines publisher as “[t]he name of the entity that holds, archives, publishes prints, distributes, releases, issues, or produces the resource” (DataCite, 2015). In current practice, this definition can be interpreted in different ways, and these roles can be performed by different actors. Hence, it can result in ambiguity about the type of entities assigned as publisher, namely individual authors, institutions, or individual data repositories. We discuss this limitation in subsection 4.2.

PublicationYear. The year in which the data record was made publicly available, which may differ from the year of its creation. DataCite’s documentation acknowledges that this can be problematic in certain cases, leaving it up to the user depositing the data to choose their preferred date for citation purposes.

3.2.2. Recommended fields

Subject. This is a free text field that can include keywords, classification codes, subjects, or key phrases. It includes as a subfield the subject scheme used, if any, with a link to the subject scheme.

Contributor. This field includes the institutions and individuals involved in the collection, management, distribution or other types of contributions to the production of the data. It includes as a subfield the type of contribution (e.g., contact person, data collector).

Date. Due to the potential ambiguity of the publication year, this field allows more than one date to be specified that may be relevant for the user, such as data availability, collection, publication, etc.

ResourceType. Here, a two-level classification of data types is introduced. While the top level is a closed list of 15 data types, the second level classification is a free text field.

RelatedIdentifier. This field contains identifiers different from the DOI.

Description. This is a structured field. If used, free text can be entered but the type of content (abstract, methods, series information, table of contents, and other) must be specified.

GeoLocation. Includes the geographical location in which the data presented was collected.

3.3. General description of the retrieved database

Data were parsed and organized into an SQL database. A total of 7,440,415 records were retrieved. The API does not provide the recommended Geolocation field. This field was included in September 2016. It provides five optional fields: Relation, Format, Language, and Rights. Furthermore, the fields Identifier and RelatedIdentifier and the fields PublicationYear and Date are combined in two fields (Identifier and Date). Additionally, it indicates the Data Center providing the records to DataCite. 762 organizations were included as data centers at the time of the download. These organizations have contracted with an individual DataCite member to assign DOIs. Appendix A includes a detailed description of each field retrieved and the information they contain.

Fig. 1 shows the share of records in DataCite with information in each of the fields described in Appendix A. We see that many records contain empty fields (even mandatory ones). A total of 1,092,131 records (14.7% of all records collected) include no data at all. This appears to be caused by modifications made by DataCite in the data structure. More specifically, DataCite employs the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) and assigns an OAI id to each record. It appears that when a record needs to be modified, a new record is created with the updated information. The information in the old record is deleted (except for the OAI and the data center information), but not the record itself. Fig. 2 shows an example of an empty record. This is an important element to consider when working with DataCite’s API as these records should be removed from the sample.

Fig. 1. Distribution of metadata information by fields.

Fig. 2. Example of an empty record retrieved from DataCite’s API.

When focusing on the records that do include information (6,348,284 records), we still find that 1306 records (0.02%) do not include a title or publisher information. Resource type and language are reported in 60% and 51% of the records, respectively. The contributor (18%) and relation (25%) fields have the lowest presence in DataCite records.
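The two steps described above (discarding emptied records, then measuring how often each field is filled) can be sketched as follows. This is an illustrative sketch, not the procedure used in this study: it assumes each record is held as a dictionary of lists, and the key names `oai_id` and `datacenter` are hypothetical stand-ins for the only information that survives in an emptied record.

```python
from collections import Counter

# Keys assumed to survive in emptied records (hypothetical names).
ADMIN_KEYS = {"oai_id", "datacenter"}


def is_empty(record):
    """True when no field besides the administrative keys holds a value."""
    return not any(v for k, v in record.items() if k not in ADMIN_KEYS)


def field_coverage(records, fields):
    """Share (0-1) of non-empty records that have a value for each field."""
    valid = [r for r in records if not is_empty(r)]
    filled = Counter(f for r in valid for f in fields if r.get(f))
    return {f: (filled[f] / len(valid) if valid else 0.0) for f in fields}


# Made-up records, for illustration only.
records = [
    {"oai_id": "oai:1", "datacenter": "A", "title": ["T1"], "language": ["en"]},
    {"oai_id": "oai:2", "datacenter": "A", "title": [], "language": []},  # emptied
    {"oai_id": "oai:3", "datacenter": "B", "title": ["T3"], "language": []},
]
coverage = field_coverage(records, ["title", "language"])
print(coverage)  # {'title': 1.0, 'language': 0.5}
```

Computing coverage over the non-empty records only mirrors the distinction drawn above between the 7,440,415 records retrieved and the 6,348,284 that contain information.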

4. Results

In this section, we report our findings regarding the content of each field and the level of standardization of the data. First, we present descriptive statistics on different types of data records. Then we analyze the geographical distribution of data centers and the number of records by country. We also analyze the publisher field to disentangle the different types of entities it contains. We also present an overview of the different types of dates included in the database. Finally, we focus on the description of the relation field, which contains DOIs of related records, trying to understand the type(s) of linkages captured by DataCite.

4.1. Resource types

The ResourceType field presents a controlled list of 15 values, complemented by a free-text subtype. Table 1 reports the total number of records by resource type and the three most common subtypes. We observe that 42% of the records are categorized as datasets, followed by text (18%), image (14%), and collection (7%). As observed in Table 1, most records with a ResourceType ‘text’ are manuscripts, conference papers or journal articles. Records tagged as images are heterogeneous, ranging from academic posters to historical manuscripts or data figures. The subtype is not mandatory and is thus empty in many records. For instance, only 4.3%, 6% and 6% of records with the resource type “Model”, “Sound” and “Film”, respectively, have a subtype. Overall, we find 158,781 different variations of resource subtypes, a natural consequence of it being a free-text field, but one which reflects different understandings of what data is and what is included in each of the 15 data types.


Table 1
Records by resource type and share of the top 3 most common subtypes in DataCite. In bold-cursive, subtypes appearing in more than one data type category.

Resource type | Number of records (N) | % | Most frequent subtypes
Dataset | 1,867,627 | 41.69 | Dataset (63.5%), Metadata (5.8%), Data package (4.1%)
Text | 786,882 | 17.56 | Conference papers (15.5%), Journal articles (15.4%), Report (10.1%)
Image | 641,404 | 14.32 | Image (11.9%), Figure (11.2%), Plate (8.1%)
Collection | 303,638 | 6.78 | Collection (20.7%), Gaussian job archive (9.1%), Report (4.7%)
Software | 12,340 | 0.03 | Simulation tool (16.9%), Software (10.8%), Code (5.3%)
Audiovisual | 4470 | 0.10 | Audiovisual (43.8%), Media (23.9%), Teaching material (8.5%)
Film | 960 | 0.02 | Experiment (5.4%), Video (0.4%), Animation (0.1%)
PhysicalObject | 587 | 0.01 | Archival object (63.9%), HIAPER-HAIS airborne sensor (2.4%), Physical object (0.9%)
Event | 508 | 0.01 | Conference presentation (73.4%), Presentation (9.6%), Event (1.6%)
Model | 470 | 0.01 | Model (2.8%), Ontology (0.9%), Shapefiles (0.2%)
InteractiveResources | 287 | 0.01 | Interactive resources (12.2%), Learning object (2.1%), Sites Web (0.3%)
Sound | 234 | 0.01 | Recording, oral (4.3%), Sound (0.4%), Conference (0.4%)
Workflow | 209 | <0.01 | Taverna 2 workflow (7.2%), Workflow (1.0%), RapidMiner workflow (0.5%)
Service | 18 | <0.01 | Service (88.9%), S-map (5.6%), Data provider (5.6%)
Other | 871,549 | 19.45 | Datasheet (98.2%), Oceanographic cruise (0.7%), Field expedition (0.7%)
Total | 4,480,077 | 100 |

We also observe classification redundancies between the two levels. For example, the resource type “dataset” has a subtype also called “dataset”. There are also redundant subtypes between different resource types. For example, the subtype “report” appears as a subtype of both the resource types “collection” and “text”. A particularly problematic case is the ResourceType “other”, for which 98.2% of the records have a subtype labeled as “Datasheet”. This suggests that these records could perhaps be considered as datasets. Taking a closer look at these records, we found that they were all derived from the same repository, Data-Planet. In fact, all records from Data-Planet are classified as “Datasheet”. This variability in the distribution of records may reflect some inconsistencies in the way data centers classify records according to the scheme proposed by DataCite.

From now on, we will refer to all records in DataCite that have a resource type other than “text” as “data records” (i.e. we consider as data-related records all records that are not articles such as manuscripts or pre-prints).
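The working definition of a data record can be expressed as a simple filter. The sketch below is illustrative: the key `resource_type` is a hypothetical field name, and leaving records with an empty resource type out of the share is an assumption made here, not a rule stated in the text.

```python
def is_data_record(record):
    """Working definition: a known resource type other than 'text'."""
    rtype = (record.get("resource_type") or "").strip().lower()
    return bool(rtype) and rtype != "text"


def data_record_share(records):
    """Share of data records among records with a non-empty resource type."""
    typed = [r for r in records if (r.get("resource_type") or "").strip()]
    if not typed:
        return 0.0
    return sum(is_data_record(r) for r in typed) / len(typed)


# Made-up records, for illustration only.
sample = [
    {"resource_type": "Dataset"},
    {"resource_type": "Text"},
    {"resource_type": "Image"},
    {"resource_type": ""},       # unknown type: left out of the share
    {"resource_type": "Other"},
]
share = data_record_share(sample)  # 3 of the 4 typed records
```

Applied per country or per publisher type, the same function would yield the kind of "% data records" figures reported later in this section.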

4.2. The geographic distribution of data infrastructures

In this section, we focus on the data providers and the countries in which they are based, to provide insights on how data infrastructures are being developed in different countries. DataCite provides a closed list of 762 institutions from which records are retrieved. The distribution of records across these data centers is uneven: 15 (2%) data centers account for more than 80% of all records. Fig. 3 shows the distribution of records by resource type (excluding the records where this field is empty) for the 20 data centers that provided the most records.
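A concentration figure of this kind (how few data centers account for a given share of records) can be derived from per-center record counts. The following is a minimal sketch with made-up counts; the function name and the input layout are hypothetical.

```python
def centers_needed(counts, share=0.8):
    """Smallest number of data centers whose records reach `share` of the total."""
    total = sum(counts.values())
    if total == 0:
        return 0
    running = 0
    # Walk through centers from largest to smallest, accumulating records.
    for rank, n in enumerate(sorted(counts.values(), reverse=True), start=1):
        running += n
        if running >= share * total:
            return rank
    return len(counts)


# Made-up counts, for illustration only.
counts = {"center_a": 60, "center_b": 25, "center_c": 10, "center_d": 5}
needed = centers_needed(counts)  # the two largest centers hold 85 of 100 records
```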

The data highlight the variety of institutions providing data: from thematic data repositories (Data-Planet, PANGAEA, Digital Science), to scientific social platforms (ResearchGate) or universities (Imperial College London, ETH Zürich). Data-Planet is the largest data center in DataCite, providing 20% of all the records. As mentioned before, all records provided by Data-Planet are “datasheets”. Also, some data centers (ResearchGate, E-Periodica, Universität Zürich, Zora, and ETH E-Collection) provide only “text” records.

In Table 2 we assigned each data center to its country. This information was retrieved from DataCite Statistics (https://stats.datacite.org/). It is important to note that the classification was based on the location of their headquarters, and that some data centers were associated with more than one country if they have headquarters in different countries. The country distribution in Table 2 does not reflect the affiliation of data creators nor the geographic origin of countries, but provides an overview of countries contributing towards the development of an open data infrastructure. We find that the distribution of records by country is very skewed: the United States, Germany and the United Kingdom account for 82% of the total records. The distribution of resource types also differs by country. For instance, almost 100% of records coming from Estonia, Denmark, and Canada are data records, while this proportion is much smaller in other countries such as Hungary (0.8%), Italy (4.2%), Ireland (16.1%), Australia (19.6%), and Germany (26.5%). Moreover, no data records were found in data centers based in Austria, Russia, Iran, South Korea, Liechtenstein, Slovenia, and Japan.

The second source of information relating to open data providers is obtained from the publisher field. It is a non-standardized free-text field in which we found 118,136 different names. The distribution of records is highly skewed; hence, by manually disambiguating the most common 1148 publishers we managed to cover about 90% of all the records that include publisher information.

Fig. 3. Top 20 data centers by data types.

Table 2
The number of data centers, number of records and share of records after excluding records labeled as data type “text” by country. Countries are ordered by total number of records.

Countries | Data centers | # records | % data records (a)
USA | 217 | 2952086 | 58.6%
Germany | 185 | 1795638 | 26.5%
UK | 66 | 1382661 | 49.9%
Switzerland | 48 | 1120868 | 32.1%
Estonia | 6 | 489896 | 99.5%
Denmark | 5 | 138640 | 98.0%
Canada | 24 | 85984 | 93.5%
Thailand | 1 | 61529 | 87.9%
Italy | 35 | 50350 | 4.2%
Netherlands | 16 | 49900 | 80.8%
Austria | 7 | 36450 | 0.0%
Australia | 41 | 24122 | 19.6%
Ireland | 3 | 23181 | 16.1%
France | 32 | 13093 | 48.7%
New Zealand | 2 | 3081 | 39.4%
Sweden | 6 | 2835 | 97.9%
Unknown | 8 | 2722 | 1.8%
Hungary | 37 | 1809 | 0.8%
Poland | 4 | 1713 | 1.3%
Russia | 3 | 1388 | 0.0%
Iran | 2 | 1292 | 0.0%
Romania | 3 | 1032 | 47.2%
China | 2 | 703 | 31.2%
Czech Republic | 1 | 470 | 100.0%
South Korea | 1 | 188 | 0.0%
Belgium | 1 | 106 | 79.2%
South Africa | 1 | 105 | 93.3%
Liechtenstein | 1 | 56 | 0.0%
Ghana | 1 | 53 | 98.1%
Spain | 2 | 37 | 100.0%
Slovenia | 1 | 18 | 0.0%
Japan | 1 | 15 | 0.0%
Tanzania | 1 | 10 | 90.0%
Uruguay | 1 | 1 | 100.0%

(a) Data records are defined as all data types excluding text.
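The disambiguation in this study was done manually. As a rough aid for that kind of work, the sketch below groups publisher strings that differ only in case, punctuation or spacing, and reports how many records the k most frequent groups cover. It is a simple string normalization chosen here for illustration, not the method used in this study, and the function names are hypothetical.

```python
import re
from collections import Counter


def normalize_name(name):
    """Lowercase, replace punctuation with spaces and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", name.casefold())
    return " ".join(cleaned.split())


def coverage_of_top(names, k):
    """Share of records covered by the k most frequent normalized names."""
    counts = Counter(normalize_name(n) for n in names if n and n.strip())
    total = sum(counts.values())
    top = sum(c for _, c in counts.most_common(k))
    return top / total if total else 0.0


# One publisher string per record (made-up values, for illustration only).
names = ["PANGAEA", "Pangaea.", "pangaea", "Figshare", "Zenodo"]
top_one = coverage_of_top(names, 1)  # the three PANGAEA variants: 3 of 5 records
```

Normalization of this kind only catches trivial variants; abbreviations, translations and parent or child organizations would still need manual review.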

For each of these 1148 publishers, we assigned two variables: country and type of entity. The country information was retrieved from the publishers’ websites and corresponds to the country where the publisher is located (as with data centers, multiple countries can be assigned to a single publisher). Fig. 4 presents the number of records for each country. Only records including resource type and publisher information are represented (3,704,161 records). While the distribution of records by country is similar using either the data center or publisher information, there are notable differences. We find that the number of countries contributing to DataCite is lower when using the publisher information than when using data center location. For example, no record would be assigned to Estonia, Thailand or Ireland using this method. However, they occupy the third, eighth and twelfth positions respectively when using the data center. At the other extreme, Italy, Belgium and Spain are clearly underrepresented according to data centers’ location.

Fig. 4. Total number of data records (excluding data type “text”) by country using data center and publisher affiliation data. Y-axes are logarithmic. Countries are ordered according to the total number of records using the data center affiliation.

Fig. 5. Number of records and share of data records (after excluding text) by type of publisher. Only records with publisher information and data type are shown.

We also divided the publishers into 11 types of entity to better comprehend what users understand as “data publisher”, but also to identify the different types of institutions that publish data products. We distinguish four types of repositories (i.e., national, institutional, disciplinary, and multidisciplinary repositories), and the other entities are diverse groups (research body, professional body, and educational body), publishers, firms, conferences and individuals. Appendix B provides more details on this classification.

As shown in Fig. 5, a total of 156 distinct entities are identified from the 1148 name variants disambiguated from the publisher field. Most of the records were assigned to 18 thematic repositories (43%). Among the 156 entities, 35 are institutional repositories, followed by 33 research bodies (e.g., research centers and scientific associations), and 24 academic publishers (journals). In second and third place but with a substantially lower proportion of data records, we find institutional repositories (17%) and research bodies (15%). The proportion of data records varies substantially by publisher type. While 89% of records included in multidisciplinary repositories are data records, none of the records published by professional bodies, conferences and authors are data records. These results reflect the conceptual problem that still exists regarding the meaning of “publishing” in the data production model (Costas et al., 2013) or, at the very least, the effect of the diversity of records included in DataCite.

Fig. 6. Number of records per year using the publication year in DataCite. 1950–2020 period.

4.3. Publication year and related dates

Publication year is a key field in any bibliometric analysis intending to provide a longitudinal perspective or to frame the study period(s). DataCite requires the publication year to be presented in a four-digit format. However, an important point to consider for the development of data metrics is that data records can be subjected to different actions occurring on different dates, all of which may be included in the metadata. Thus, DataCite (2015) has two date-related fields: publication year and date. The publication year field is a mandatory field that the DataCite Metadata Working Group (2015) defines as “the year when the data was or will be made publicly available”. Still, DataCite acknowledges that this information may be unclear or unavailable, providing alternatives such as, “[if] that date cannot be determined, use the date of registration” or “[i]f an embargo period has been in effect, use the date when the embargo period ends”. It concludes that “[i]f there is no standard publication year value, use the date that would be preferred from a citation perspective”.

The date field is an optional free-text field that can refer to different dates relevant to the record. These can be related to the date when the dataset was created, uploaded to a repository, made publicly available, updated, etc. Thus, when information is provided in the date field, one of the following 9 subtypes is required: accepted, available, copyrighted, collected, created, issued, submitted, updated and valid.

As mentioned before and presented in Appendix A, the field “date” retrieved from the DataCite Metadata Store OAI API combines both the publication year and date in a single field. Hence the distinctions discussed above are not available. This means that multiple dates may be assigned to a single record and that the publication year field can only be distinguished from the date field when the latter is not in a four-digit format. Therefore, the date information retrieved with the API must be processed before being used. In this study, we define “publication year” as a date presented with a four-digit format. We identified 4,242,804 data records with this format. This cleaning process is not completely accurate, as a total of 50,679 records reported publication years above 2099 or from the early 1000s and were thus not considered.¹ Fig. 6 shows the number of records for the 1950–2020 period. We observe many records dating from 2016 onwards due to the embargo they are restricted by.
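The rule described above (treat a value of exactly four digits as the publication year and discard implausible years) can be sketched as follows. The upper bound of 2099 comes from the text; the lower bound of 1100 is an assumption made here, since the text only says "early 1000s", and the function name is hypothetical.

```python
import re

FOUR_DIGITS = re.compile(r"\d{4}")


def publication_year(date_values, min_year=1100, max_year=2099):
    """Return the first value that is exactly four digits and in range, else None.

    min_year is an assumed cutoff; the text does not give an exact one.
    """
    for value in date_values:
        text = value.strip()
        if FOUR_DIGITS.fullmatch(text):
            year = int(text)
            if min_year <= year <= max_year:
                return year
    return None


# A merged date field holding a full date and a four-digit year.
year = publication_year(["2014-05-01", "2014"])
```

Because the merged field mixes publication years with other dates, a record whose only dates are full dates (for example "2015-01-01") yields no publication year under this rule, which is one source of the inaccuracy noted above.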

The fact that there is no clear definition for the publication year field may lead to some discrepancies in the data. This is especially meaningful in the case of historical data where the user could choose to indicate the date of the historic record or the date of its retrieval. Fig. 7 provides the example of a digitized photograph which had already been published in its physical form. Here, the publication year field contains the value 1929, which is in fact the date when the photograph was taken.

Regarding records including additional dates, we identified 2,095,183 records, of which 43% reported the availability date, 25% the date of creation, 14% the collection date, 12% an update date and 3% an issue date. Less than 0.2% of the records reported the date of copyright, submission, validity or acceptance.

¹ Although there are cases of data records dating from the early 1000s, e.g., digitalized archival objects.


Fig. 7. Example of a record with a date that predates the development of data repositories. 6A. Contents of a photograph taken in 1929. 6B. Data record in DataCite. The date of publication of the record is 1929.

4.4. Related DOI numbers

The OAI DataCite API also provides a field named relation, which is equivalent to the RelatedIdentifier field in the DataCite Metadata Schema. The main difference is that here we retrieve only the information provided by the data centers, while the RelatedIdentifier field retrieved from the REST API includes additional relations provided by the DataCite team. It contains identifiers for publications (e.g., DOIs, arxiv, bibcode, handles; not necessarily in DataCite). As all records in DataCite include a DOI number along with other associated identifiers, we crossed related DOI numbers with: 1) the DataCite database itself, to find potential relations among data records within DataCite; and 2) the Web of Science, to identify potential relations with scientific publications. As shown in Fig. 8A, 23% of all DataCite records include related DOIs. The number of related DOI numbers by record varies greatly, showing a highly skewed distribution (Fig. 8B). Fig. 8C crosses DataCite related DOIs with DataCite records, with DataCite records defined as datasets, and with Web of Science records. Less than 25% of the related DOI numbers belong to other DataCite records. Approximately 15% belong to articles indexed in the Web of Science (Fig. 8C). When we focus on the data type of related DOIs contained in DataCite (Fig. 8D), we observe that 90% of these are datasets.

After a cursory check of some of these cases, we observe that occasionally the relation is formed by a container data record (i.e., a database) and its tables (i.e., datasets). For example, the database http://dx.doi.org/10.15468/dl.qnbifh included, at the time of the data collection, 5192 related datasets. This partially explains the skewed distribution observed in Fig. 8B. In other cases, the relation indicates data (re)use by linking the data with a paper. However, this field does not seem to contain the DOI of articles citing the data record, and we find no evident criteria for characterizing the types of relations reported in this field.
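Crossing related identifiers with another set of DOIs requires bringing them to a common form first, since the same DOI can appear as a URL, with a `doi:` prefix, or in different letter case. The sketch below is an illustration under stated assumptions: the regular expression is a common heuristic for DOI-like strings rather than an official definition, and the function names are hypothetical.

```python
import re

# Heuristic pattern for DOI-like strings (not an official DOI grammar).
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def normalize_doi(value):
    """Extract a bare, lowercased DOI from an identifier string, or None."""
    match = DOI_PATTERN.search(value)
    return match.group(0).rstrip(".,;").lower() if match else None


def share_matched(related_ids, known_dois):
    """Share of distinct related DOIs that appear in a reference set of DOIs."""
    dois = {d for d in map(normalize_doi, related_ids) if d}
    known = {d.lower() for d in known_dois}
    return len(dois & known) / len(dois) if dois else 0.0


related = [
    "http://dx.doi.org/10.15468/dl.qnbifh",
    "doi:10.1016/J.JOI.2017.07.003",
    "hdl:1234/5678",  # a handle, not a DOI: ignored
]
known = {"10.1016/j.joi.2017.07.003"}
matched = share_matched(related, known)  # 1 of the 2 DOIs is in the reference set
```

Running the same function once against the set of DataCite DOIs and once against a set of Web of Science DOIs would give the two kinds of shares discussed above.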

Interestingly, Robinson-García, Jiménez-Contreras, and Torres-Salinas (2016) reported a similar type of relations also recorded in Thomson Reuters’ Data Citation Index, although in that case, only relations between datasets and scientific papers were included. However, they found that the reporting of these relations depends on the repository, that is, depending on the repository we would find records with relations or not. In DataCite there is evidence suggesting that such a dependency also exists, in this case with data centers: only 226 (30%) data centers reported at least one data record with a related DOI number, and 44 (5%) of them reported related DOI numbers in all their records (see Fig. 9).

Fig. 8. Analysis of the relation field in DataCite. A. Share of records in DataCite with related DOI numbers within DataCite records. B. Distribution of the number of related DOI numbers by data record. C. Share of related DOI numbers included in DataCite by their data type. D. Share of related DOI numbers indexed in DataCite, indexed in DataCite and with data type information, and indexed in Web of Science.

5. Concluding remarks and recommendations

Research on data sharing and open data is growing, while at the same time funding bodies are encouraging greater research transparency. Terms like data-driven science, data-intensive science, and open science are becoming more and more common in policy documents and statements such as the European Union’s Horizon 2020 (European Commission, 2016). In this context, DataCite is called upon to play an important role as a source for the analysis and study of data publication and reuse. While the demand for data metrics has been a constant since the beginning of the 2010s (Costas et al., 2013), there is still a long way to go until the movement expands to broader fields of science and to more countries.

This paper presents the first large-scale data collection and analysis of DataCite to assess its potential as a bibliometric tool able to provide information and metrics about open data activities at a macro-scale. Compared with other similar products such as the Data Citation Index, the size and richness of DataCite data offer greater possibilities as a bibliometric source for developing open data metrics. Still, this richness of data comes at a price. Conceptual problems such as what data is or to which scientific field or discipline different datasets belong, along with technical problems such as the lack of standardization of many of its fields, may still give an advantage to the Data Citation Index, whose field structure follows to some extent the structure of bibliographic records. This is an advantage for the Data Citation Index because it allows bibliometric analyses without prior processing (e.g., Robinson-Garcia et al., 2016). However, this analytical simplicity of the Data Citation Index overlooks some of the key issues found when exploring the nature and heterogeneity of open data. As shown in this paper, the metadata of DataCite records is very rich and heterogeneous. Here we describe some of the important issues that need to be considered when using DataCite as a source of data for open data analytics.

5.1. Central issues regarding the metadata provided by DataCite

5.1.1. Data types and the definition of “data”

An important element that needs to be considered when working with DataCite is that not all records included in the database are strictly data-related. For example, more than 12% of the valid records in DataCite are text or articles. Therefore, in order to properly identify and analyze the production of data, diverse filters need to be applied by types of data. However, we have highlighted the important diversity of data types included in DataCite. In a way, the many types of data covered in DataCite suggest that a broader understanding of what constitutes research data is very necessary. In fact, the presence of multiple data-related types such as “Images”, “collection” or “software” reinforces the idea that we need to stop considering “data” as a homogeneous publication type.

Fig. 9. Share of records with related DOI numbers assigned to them. Blue represents records with related DOI numbers. Grey represents records with no related DOI numbers reported. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

5.1.2. DataCite metadata fields

The DataCite schema is closely aligned with Dublin Core, which allows interoperability between different platforms and record types as well as ensuring minimum levels of quality of author-generated metadata (Greenberg, Pattuelli, Parsia, & Robertson, 2002). However, the simplicity of the model (Lagoze, 2001) leaves room for ambiguity in many of the fields required in order to develop any type of bibliometric analysis. We found that a major issue with DataCite is that a lot of records are missing information in many of the fields (even mandatory ones). In addition, making some of the recommended fields mandatory (e.g., the subject, the institutional affiliation of the creator) would enhance DataCite’s potential for bibliometric analyses. It would also be useful to make mandatory a “type of relation” subfield for the “Relation” field which is one of the


arecriticalissuesregardingthestructureandcleanlinessofDataCiterecordsthatwouldneedtobeaddressedtoimprove itsusability.Inanycase,theconclusionsdrawnherearebasedontheDataCiteMetadataStoreanddonotconsiderany improvedfunctionalitiesavailablethroughtheDataCiteRESTAPI.Inthissense,theadvantagesandlimitationsofusing differentpointsofaccessshouldbemadeclearersothatuserscanchooseoneortheotherdependingontheanalysisthey wishtoconduct.

Inthissense,potentialusersofDataCiteshouldconsiderthefollowingissues:First,emptyrecordsshouldberemoved beforeattemptingtomakeanystatementregardingtheactualdatacontainedbyDataCite.Asnotedinsubsection‘General descriptionoftheretrieveddatabase’,over1millionrecordswerefoundemptyatthetimeoftheretrievalofthedata.The non-removaloftheserecordsmaymisleadthecountsoftheactualsizeofthedatabase.

Second, issues related to data completeness shrink the analyzable dataset as more filters are applied to retrieve records. For example, to focus only on data-related records (e.g. datasets) it is necessary to filter by ResourceType. However, this field is empty for a substantial share (40%) of records. In addition, the DataCite Metadata Store contains a wide variety of “resource types”. Thus, users must decide beforehand which data types are relevant for the analysis and understand the potential losses of information that the filters will impose.
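The effect of such a filter can be made explicit before any analysis is run. The following Python sketch tallies records per resource type, counting missing or blank values separately, and returns the subset that matches the types of interest. The dictionary layout, the `resource_type` key and the function name are assumptions made for illustration only, and the sample records are invented.

```python
from collections import Counter


def resource_type_report(records, wanted):
    """Tally records per resource type and return those whose type is in `wanted`.

    Records with a missing or blank type are tallied under None so that the
    loss imposed by the filter can be inspected.
    """
    wanted = {w.lower() for w in wanted}
    counts = Counter()
    selected = []
    for record in records:
        rtype = (record.get("resource_type") or "").strip().lower() or None
        counts[rtype] += 1
        if rtype in wanted:
            selected.append(record)
    return counts, selected


# Hypothetical sample records (not taken from DataCite).
sample = [
    {"id": 1, "resource_type": "Dataset"},
    {"id": 2, "resource_type": ""},
    {"id": 3, "resource_type": "Image"},
    {"id": 4},
    {"id": 5, "resource_type": "dataset "},
]
counts, datasets = resource_type_report(sample, wanted={"dataset"})
print(f"kept {len(datasets)} of {len(sample)}; {counts[None]} without a resource type")
```

Lower-casing and trimming the value is a deliberately light normalization; a real analysis would probably need a mapping table for the many free-text variants found in the field.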

Third, a considerable amount of data processing and cleaning will most likely be needed, as most fields are not standardized. Furthermore, the fact that some fields are merged (e.g. publication date and date) makes it compulsory to process and clean the data before analyzing it.
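As an illustration of the cleaning involved, the sketch below separates the merged date values into bare four-digit publication years and typed dates of the form "Type: value", following the two shapes described for the Date field in Appendix A. The regular expressions and the function name are assumptions; values that match neither shape are returned untouched for manual inspection rather than guessed at.

```python
import re

YEAR_ONLY = re.compile(r"\d{4}")
TYPED_DATE = re.compile(r"(?P<type>[A-Za-z]+)\s*:\s*(?P<value>.+)")


def split_date_field(values):
    """Separate bare four-digit years from 'Type: value' date entries."""
    publication_years = []
    typed_dates = {}
    unparsed = []
    for raw in values:
        text = raw.strip()
        if YEAR_ONLY.fullmatch(text):
            publication_years.append(int(text))
            continue
        match = TYPED_DATE.fullmatch(text)
        if match:
            typed_dates.setdefault(match.group("type"), []).append(
                match.group("value").strip()
            )
        else:
            unparsed.append(text)
    return publication_years, typed_dates, unparsed


# The second value reuses the example given in Appendix A; the others are invented.
years, typed, rest = split_date_field(["2005", "Available: 01/2/2005", "circa 2004"])
print(years, typed, rest)
```

The sketch does not interpret the typed date string itself (day and month order are ambiguous in values such as 01/2/2005), which is left as a separate decision for the analyst.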

Finally, an important issue for the potential usability of the database for metric purposes is the lack of standardization of many metadata fields. Having many free-text fields (e.g. publication year, publisher, creator) makes data retrieval more arduous and makes it necessary to disambiguate the data. Simply imposing a standard format for certain fields such as the creator field, or including a closed list for the ResourceType field and subfield or for the subject field, would greatly improve the quality of the data and facilitate its analysis.

6.1. Further research

DataCite is currently one of the main data sources available for the development of data metrics, and a great promoter of data sharing and reuse. Indeed, despite its recent creation, DataCite is probably the largest database, with a vast and heterogeneous set of data records, bringing us a step closer to an ideal of open science characterized by its transparency and its capacity to optimize the use of resources. By providing an overview of the structure and content of the DataCite records, this paper has hopefully served as a first step towards a better understanding of data production, publication and reuse by the scientific community. Further research will focus on comparisons between different points of access to DataCite records, the study of the relationships between authors of scientific publications and creators of datasets, the development of suitable classifications of data records and the presence of mentions of DOIs in the references of scientific publications to data.

Author contributions

Nicolas Robinson-Garcia: Conceived and designed the analysis, Contributed data or analysis tools, Performed the analysis, Wrote the paper.

Philippe Mongeon: Contributed data or analysis tools.

Wei Jeng: Collected the data, Contributed data or analysis tools.

Rodrigo Costas: Conceived and designed the analysis, Contributed data or analysis tools, Performed the analysis.

2 The current recommended data citation format from DataCite is the following: Creator (Publication year). Title. Publisher. Identifier (DataCite, 2015).


Acknowledgements

Preliminary results of this paper were reported at the 3:AM Conference held in Bucharest (Romania), 27–29 September, 2016. The authors would like to thank Henri de Winter from CWTS for helping in the retrieval of the data and Kristian Garza from DataCite for fruitful and helpful discussions on points of access to DataCite and structure of records. The two anonymous reviewers are also thanked for their constructive comments and recommendations. This study has been partially supported by the European Commission project RTD-B6-00964-2013 Monitoring the evolution and benefits of Responsible Research and Innovation (MoRRI). Nicolas Robinson-Garcia is currently supported by a Juan de la Cierva-Formación grant from the Spanish Ministry of Economy and Competitiveness.

Appendix A. Retrieved fields and description of their contents

| Field | Description |
|---|---|
| Identifier | Unique number identifier. DataCite assigns DOIs to all data records, although many include additional identifiers such as CCDC (Cambridge Crystallographic Data Centre) or InChI (International Chemical Identifier). |
| Creator | Author of the data record. This field is not presented in a standardized format (i.e. Surname, Initials). |
| Title | Name of the dataset or file stored in the repository. |
| Publisher | Non-standardized format which includes a great variety of different entities ranging from repositories to journals, institutions, etc. |
| Date | This field includes the mandatory field ‘PublicationYear’ as well as the ‘Date’ field, which means that each record can have more than one publication year. The format is standardized but heterogeneous. Hence ‘PublicationYear’ information appears as a four-digit number while Date appears stating the type of date and the actual year (e.g., Available: 01/2/2005). |
| Subject | Keywords assigned to each data record. While we observe that some repositories employ a fixed classification system, this is not systematized for all data records. |
| Contributor | Individuals and institutions collaborating on the creation of the data but not considered as creators. As with the ‘Creator’ field, this field is not presented in a standardized format. |
| ResourceType | This field includes both the first-level and the second-level data type classification. |
| Description | This field includes in its content the five distinct subsections described by DataCite. However, not all records include all subsections. |
| DataCenter | Institution in charge of feeding DataCite with records. Each data center has a unique identifier constructed in two parts: first the intermediary institution and second the sending institution. For instance, BL.IMPERIAL is the identifier for Imperial College London: BL stands for British Library, the intermediary institution, and IMPERIAL for the sending institution. |
| Relation | This field relates each data record to additional DOI numbers. How such a relation is established is not formally declared in the record. Although DataCite offers a controlled list of values indicating the type of relation established between records, we did not find this information in the data retrieved. More on this in subsection 3.4. |
| Format | Non-standardized field which includes a formal description of the contents of the record. Here we find information which ranges from a catalogue-style description of the contents (e.g., Zwei Teile in 1 Band; 17 cm) to the actual format of the submitted file (e.g., SPSS file). |
| Language | Non-standardized field indicating the language of the record. Language is indicated by using a two-letter code, a three-letter code or the full name. In some cases, more than one language is reported (e.g., fr-en). |
| Rights | Non-standardized format including the holder of the copyrights, if any, or the license by which the data record is protected. Information is reported here not only in English but also in other languages. |
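The two-part structure of the DataCenter identifier described above lends itself to a simple split, sketched here in Python. The function name is hypothetical, and the handling of identifiers without a dot (returning no sending institution) is an assumption, since the retrieved data may contain other shapes.

```python
def split_datacenter_id(identifier):
    """Split 'INTERMEDIARY.SENDER' into (intermediary, sender).

    The sender is None when the identifier contains no dot.
    """
    intermediary, separator, sender = identifier.strip().partition(".")
    return intermediary, (sender if separator else None)


# 'BL.IMPERIAL' is the example given in the table; 'BL' alone is invented.
print(split_datacenter_id("BL.IMPERIAL"))
print(split_datacenter_id("BL"))
```

Splitting on the first dot only keeps any further dots inside the sending institution's part, which seems the safer default when grouping records by intermediary institution.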

Appendix B. Classification of publisher types

Publishers were classified into eleven mutually exclusive categories to analyze different national data infrastructures. Below we list the eleven types of publishers identified, along with examples for each of them.

| Publisher type | Examples | # records |
|---|---|---|
| Thematic repository | Data-Planet™ Statistical Ready Reference by Conquest Systems, Inc.; Cambridge Crystallographic Data Centre | 2,205,204 |
| Institutional repository | Imperial College London, ETH-Bibliothek Zürich, Bildarchiv, University of Pittsburgh | 852,954 |
| Research body | Partnership for Interdisciplinary Studies of Coastal Oceans (PISCO), Leibniz Institut für Astrophysik Potsdam (AIP) | 764,962 |
| Multidisciplinary repository | Figshare, ZENODO | 408,355 |
| Scientific publisher | German Medical Science GMS Publishing House, Zofinger Tagblatt, PeerJ | 149,305 |
| National repository | Digital Repository of Ireland, Colchester, Essex: UK Data Archive | 40,634 |
| Firm | Huber & Co. AG, Verlegergemeinschaft Werk, Bauen + Wohnen Bauen + Wohnen GmbH | 20,704 |
| Professional body | Bund Schweizer Architekten, Freidenker-Vereinigung der Schweiz, Union syndicale Suisse | 19,215 |
| Conference | European Congress of Radiology | 18,571 |
| Individual | W. Jegher & A. Ostertag, J.F. Boscovits | 8,025 |
| Educational body | nanoHUB | 2,326 |


Association for Information Science and Technology, 68(6), 1341–1359.

Mayernik, M. S. (2012). Bridging data lifecycles: tracking data use via data citations workshop report. NCAR Technical Note NCAR/TN-494+PROC. Boulder, CO: National Center for Atmospheric Research (NCAR). http://dx.doi.org/10.5065/D6PZ56TX

Missier, P. (2016). Data trajectories: Tracking reuse of published data for transitive credit attribution. International Journal of Digital Curation, 11(1), 1–16.

Parsons, M. A., & Fox, P. A. (2013). Is data publication the right metaphor? Data Science Journal, 12, WDS32–WDS46.

Peng, R. D. (2011). Reproducible research in computational science. Science, 334(6060), 1226.

Perneger, T. V. (2011). Sharing raw data: Another of Francis Galton’s ideas. British Medical Journal, 342, d3035.

Peters, I., Kraker, P., Lex, E., Gumpenberger, C., & Gorraiz, J. (2016). Research data explored: An extended analysis of citations and altmetrics. Scientometrics, 107(2), 723–744.

Piwowar, H. A., Day, R. S., & Fridsma, D. B. (2007). Sharing detailed research data is associated with increased citation rate. PLoS ONE, 2(3), e308.

Piwowar, H. A., Becich, M. J., Bilofsky, H., & Crowley, R. S. (2008). Towards a data sharing culture: Recommendations for leadership from academic health centers. PLoS Medicine, 5(9), e183.

Robinson-García, N., Jiménez-Contreras, E., & Torres-Salinas, D. (2016). Analyzing data citation practices using the Data Citation Index. Journal of the Association for Information Science and Technology, 67(12), 2964–2975.

Torres-Salinas, D., Robinson-García, N., & Cabezas-Clavijo. (2012). Compartir los datos de investigación en ciencia: introducción al data sharing. El Profesional de la Información, 21(2), 173–184.

Torres-Salinas, D., Martín-Martín, A., & Fuente-Gutiérrez, E. (2014). Análisis de la cobertura del Data Citation Index – Thomson Reuters: disciplinas, tipologías documentales y repositorios. Revista Española de Documentación Científica, 37(1), e036.
