• Aucun résultat trouvé

Semantic Terminology Management for Applications: Contextualized SKOS-XL

N/A
N/A
Protected

Academic year: 2022

Partager "Semantic Terminology Management for Applications: Contextualized SKOS-XL"

Copied!
2
0
0

Texte intégral

(1)

Semantic Terminology Management for Applications: Contextualized SKOS-XL

Andreas Thalhammer, Martin Romacker, and Joachim Rupp

Roche Pharma Research and Early Development Informatics, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Grenzacherstrasse 124, 4070 Basel,

Switzerland

{andreas.thalhammer, martin.romacker, joachim.rupp}@roche.com

Abstract. Terminology management is an important aspect for ensur- ing data quality in large organizations. To enable expert applications the use of agreed and curated terms enhances data quality while it signifi- cantly reduces the long-term cost for data integration. In this abstract, we outline our solution for two problems that occur in the context of terminology management for applications.

1 Introduction

In a large organization, like Roche, it is essential to ensure that different stake- holders are referring to the same thing when they mean the same thing. Instead of costly (and repeated) processing and mapping steps that are typically performed when “data integration is needed”, a functioning organization-wide terminology management system preemptively supports users in selecting the right terms at the point of data creation. This leads to a situation where two independent appli- cations can refer toroche:ROX1305277804386when they mean “Non-Small Cell Lung Cancer”. Next to straight-forward integration, terminology management significantly improves data quality in the long term. For this, the Simple Knowl- edge Organization System (SKOS) not just enables to have a single identifier per concept but also enables to assign different labels with preferences (typically distinguished as preferred and alternative labels) and concept hierarchies. This supports consuming applications (that are designed for expert users) to select the acronym “NSCLC” as a label for the previously mentioned entity and group it in “Diseases→Lung→Lung Cancer”. However, there are two main problems when applications consume SKOS terminologies:

Problem 1 Domain-specific schemes, like “Diseases” (organized via domain- centric polyhierarchies), would be used for a drop-down field called “Search cell line inventory—related disease)”.1 This means that an end user would need to select a single value from more than 4000 while, in the context of the drop-down field, far less entries are actually needed.

1 Note thatskos:broader/skos:narrowercannot be used in this case as the hierarchi- cal organization of the content is domain-dependent rather than application-centric.

(2)

Problem 2 When an application uses a preferred or alternative label in its specific context, the label would need to be transferred to the application (as otherwise it would be unclear which label the application wants to use).

Therefore, the consuming application becomes the “owner” of the term and potential changes (on the application side) might not be fed back to the terminology management system.

2 Contextualized SKOS-XL

Graph: domain master Graph: application

skosxl:prefLabel

skosxl:prefLabel skosxl:altLabel

:Concept

:Label2 :Label1

Fig. 1.Schematic overview of contextualized SKOS-XL.

In the pREDi-Roche Terminology System (RTS) we use RDF and SKOS-XL for specifying domain-centric terminologies. Consuming applications can rely on the defined semantics and the stable URIs that the terminology system provides. In order to address Problem 1, applications need access to subsets. However, in the design of the RTS system we decided not to maintain these subsets on the application side. This is due to two main reasons:1) Semantic -Applications use the same concepts with slightly different semantics (e.g., a disease can also be interpreted as an adverse event, genetic background, or pheno-type): an ap- plication naturally sets a context for a term. Data curators need to be aware of such contexts (which is enabled by maintaining the subsets at the side of the terminology system). 2) Organizational - If subsets are maintained at the application side (i.e., via lists of URIs and according preferred terms), the technical infrastructure for storing and retrieving terms is readily in place. In combination with corporate structure (i.e., shortest paths) and the long-term orientation of central terminology management (i.e., no quick benefit) this can lead to unnoticed disconnection of the applications.

In RTS, we make use of named graphs to maintain application contexts/subsets.

The URIs of the concepts and labels of a domain-master graph are reused and contextualized in application graphs. In order to address Problem 2, the domain master maintains all different types of SKOS-XL labels (preferred, alternative, or hidden label) and applications can choose one of these labels as their preferred label (see Fig. 1). This enables to provide acronyms like “NSCLC” in restricted, application-centric subsets (i.e., maximum flexibility). At any point of time, data curators can see which applications consume which terms and how they refer to them.

Références

Documents relatifs

Next, we also investigate if additional information can be inferred from joint usage with SKOS in order to enrich semantic inferences through OWL alone – although SKOS was

We will show how the Pick&Place skill, as well as its composite applications, are represented in the ReApp Store, together with different types of semantic

Lysiane Hauguel, Tanguy Lallemand, Rayan Eid, Fabrice Dupuis, Sylvain Gaillard, Florian Blessing, Sandra Pelletier, Julie Bourbeillon.. To cite

Indeed, the core elements of terminol- ogy remain the concept (unit of knowledge), the term (specialised lexical item), and the relationship between these elements, in which lies

La communication et le partage de connaissances font appel à l’utilisation de signes très différents de nature. La cartographie et la réalisation de schémas, de diagrammes

La linguistique resterait ainsi la seule science qui croie que le réel est en de- hors d’elle 25 : en effet, si le langage est un instrument de représentation, elle a besoin

We have developed a set of SKOS mapping validation rules [4] [6] to automatical- ly detect problematic patterns in terminology mappings that are expressed with SKOS mapping

D’autre part, tant dans la juxtaposition que dans la coordination et dans la phrase complexe, l’expression ne coïncide pas avec le simple codage, mais résulte d’une